Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve directory tree read performance #30629

Closed
wants to merge 20 commits into from
Closed

Improve directory tree read performance #30629

wants to merge 20 commits into from

Conversation

Nazar65
Copy link
Member

@Nazar65 Nazar65 commented Oct 23, 2020

Description (*)

Scan only directories for directory tree component
The result of using glob instead of iterator with more than 500000 files in media folder

2020-10-23-225235_1920x1080_scrot

Related Pull Requests

Fixed Issues (if relevant)

  1. Fixes Magento 2.4.1 Media Gallery....not ready for prime time. Fails when you have large amounts of data (images) #30596

Manual testing scenarios (*)

  1. A lot of images in filesystem -> execute this file to generate empty txt files in media folder https://gist.github.com/Nazar65/61c756ce76afa8c6718bf670794d548c execute in CMD bash glob.sh
  2. Set memory_limit in php to 100MB
  3. Open Media Galley
  4. Media gallery directory tree component renders directory tree component

Questions or comments

Contribution checklist (*)

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • All automated tests passed successfully (all builds are green)

@m2-assistant
Copy link

m2-assistant bot commented Oct 23, 2020

Hi @Nazar65. Thank you for your contribution
Here is some useful tips how you can test your changes using Magento test environment.
Add the comment under your pull request to deploy test or vanilla Magento instance:

  • @magento give me test instance - deploy test instance based on PR changes
  • @magento give me 2.4-develop instance - deploy vanilla Magento instance

❗ Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s)
    For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names. Allowed build names are:

  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE,
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests

You can find more information about the builds here

ℹ️ Please run only needed test builds instead of all when developing. Please run all test builds before sending your PR for review.

For more details, please, review the Magento Contributor Guide documentation.

⚠️ According to the Magento Contribution requirements, all Pull Requests must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

🕙 You can find the schedule on the Magento Community Calendar page.

📞 The triage of Pull Requests happens in the queue order. If you want to speed up the delivery of your contribution, please join the Community Contributions Triage session to discuss the appropriate ticket.

🎥 You can find the recording of the previous Community Contributions Triage on the Magento Youtube Channel

✏️ Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

@perfpcs
Copy link

perfpcs commented Oct 23, 2020

This is good news!

@Nazar65
Copy link
Member Author

Nazar65 commented Oct 23, 2020

@magento run all tests

@sivaschenko sivaschenko added the Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. label Oct 23, 2020
@Nazar65
Copy link
Member Author

Nazar65 commented Oct 23, 2020

@magento run all tests

private function recursiveRead(string $pattern, int $flags = 0): array
{
// phpcs:ignore Magento2.Functions.DiscouragedFunction
$directories = glob($pattern, $flags);
Copy link
Contributor

@le0n4ik le0n4ik Oct 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the \Magento\Framework\Filesystem\DriverInterface::search method be used instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@le0n4ik this would be great, but main idea is the flag to read directories only, and search method have no ability to set this flag, and it will search only for one level.

Copy link
Member Author

@Nazar65 Nazar65 Oct 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but thanks, we can use Magento\Framework\Filesystem\Glob as well

@ghost ghost assigned le0n4ik Oct 23, 2020
$directories = glob($pattern, $flags);

// phpcs:ignore Magento2.Functions.DiscouragedFunction
foreach (glob($this->driver->getParentDirectory($pattern) . '/*', $flags) as $dir) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment

@Nazar65
Copy link
Member Author

Nazar65 commented Oct 24, 2020

@magento run all tests

@Nazar65
Copy link
Member Author

Nazar65 commented Oct 26, 2020

@magento run all tests

@Nazar65 Nazar65 changed the title [WIP] Improve directory tree read performance Improve directory tree read performance Oct 26, 2020
@ghost ghost added the Severity: S1 Affects critical data or functionality and forces users to employ a workaround. label Oct 26, 2020
@perfpcs
Copy link

perfpcs commented Feb 1, 2021

@thomas-kl1 So I spent many many hours testing things this weekend and trying every possible combination of fixes. What I finally determined was the problem for me was having both my static files and media files on 2 different AWS Cloudfront Distributions. Here is what I determined this weekend:

  1. Improve directory tree read performance #30629 this pull does actually work properly with large amounts of images, In my case over 2 million images. This is true only when using NO CDN and Magento Base URL's are set as URL's local to the server and not pointed to AWS Cloudfront CDN distributions. See settings:

image

  1. Fix performance issue in directory tree of MediaGallery #31749 this pull does actually work properly with large amounts of images, In my case over 2 million. Again this works with settings pointing Static and Media to the local server via URL. Same as above.

  2. Both Improve directory tree read performance #30629 and Fix performance issue in directory tree of MediaGallery #31749 DO NOT work when Magento Static Files and Media files are served from an AWS CDN. While the directory tree loads, the images do not always load. Clicking on folders changes nothing in views. Also the search bar does not appear and clicking on any folder shows no active filters to view or click and cancel. Also Adobe Stock Images does not work at all, you can not click and save images or previews. Clicking on anything in Adobe Stock does nothing.

image

image

4.#30629 DOES work fine with large amounts of images (in my case over 2 million images) when the Static URL is change back to local server. I WAS able to use the media Cloudfront distribution though. So, something about Magento files does not like the static files being served from a Cloudfront Distribution. Bottom line, with #30629 pull, Static URL pointed to local and media URL pointed to Cloudfront server, everything worked perfect. The search bar shows, clicking on folders adds them as the active filters that you can X out and close. Search terms work fine. Adobe Stock Images works perfectly and I can add previews and images with no problems. Basically, I can say everything works perfectly as expected with #30629 and just changing Static URL to the local server.

image

image

  1. Fix performance issue in directory tree of MediaGallery #31749 for me does not work using any CDNs. I tried both Static and Media both, individual each, and nothing works with AWS Cloudfront. So Fix performance issue in directory tree of MediaGallery #31749 ONLY works with static and media URLs pointed to local URLs.

So, I hope this is helpful to those working on these files. I would love to have a fix to be able to use my Static file CDN on AWS. I am not sure what the issue is, I can only report what is happening. Thanks

@jayshields
Copy link

Anyone else having problems with this - excluding directories from the media gallery can help a lot; https://devdocs.magento.com/guides/v2.4/ext-best-practices/tutorials/modify-image-library-permissions/

We had to exclude everything except for wysiwyg after the 2.4 upgrade to get the gallery UI to stop crashing PHP. Couldn't figure out a way to whitelist, so had to literally exclude every directory one by one in the XML.

@hostep
Copy link
Contributor

hostep commented Jul 5, 2021

@jayshields: so this is more or less the same problem as #22609, where for some reason they wanted all directories under pub/media to be viewable instead of only the wysiwyg directory. I keep wondering why they made that change, if it was on purpose or if it happened by accident...

@jayshields
Copy link

Yes - for me this is related to that issue. 2.4 opened the media gallery up to pub/media which has a huge amount of files for us (it's a legacy M1 upgraded site which has been running for over 6 years).

We created a module with a di.xml containing this;

<?xml version="1.0"?>
<config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="urn:magento:framework:ObjectManager/etc/config.xsd">
    <type name="Magento\Cms\Model\Wysiwyg\Images\Storage">
        <arguments>
            <argument name="dirs" xsi:type="array">
                <item name="exclude" xsi:type="array">
                    <item name="ajaxlayerednavigation" xsi:type="array">
                        <item name="regexp" xsi:type="boolean">true</item>
                        <item name="name" xsi:type="string">pub[/\\]+media[/\\]+ajaxlayerednavigation[/\\]*$</item>
                    </item>
                    <item name="analytics" xsi:type="array">
                        <item name="regexp" xsi:type="boolean">true</item>
                        <item name="name" xsi:type="string">pub[/\\]+media[/\\]+analytics[/\\]*$</item>
                    </item>
                    <item name="bannerslider" xsi:type="array">
                        <item name="regexp" xsi:type="boolean">true</item>
                        <item name="name" xsi:type="string">pub[/\\]+media[/\\]+bannerslider[/\\]*$</item>
                    </item>
// ...SNIP...
                </item>
            </argument>
        </arguments>
    </type>
</config>

@perfpcs
Copy link

perfpcs commented Jul 12, 2021

@jayshields when you say you made a module, what does that exactly mean? What does the module do and what files are in this module besides the di.xml file? Seems crazy we need to do this and that the Magento team still hasn't fixed this issue after almost a year! Why they wont make the wysiwyg as the root directory is beyond me. Seems like someone made a huge mistake allowing the catalog images to basically cripple the Media Gallery or pretty much make it unusable. What's it going to take for them to correct this bad judgement??

@thomas-kl1
Copy link
Member

@perfpcs at least they should've done an "include directory" list instead of an "exclude" one..

@perfpcs
Copy link

perfpcs commented Jul 12, 2021

@thomas-kl1 Yes Sir. Seems they don't even read half these posts to fix these critical errors. :(

@Nazar65
Copy link
Member Author

Nazar65 commented Jul 12, 2021

@thomas-kl1 @perfpcs I think it was bcs, the wisywig folder means this for wisywig package, and media folder means that there should be files for media right ? so it sounds logical why they have the media folder for the media gallery, but i think the core issue here is that directory tree component render's you full tree at one request, which i think is wrong, bcs we should render the first level of directory, and then the others only when user click's on parent, that how it was done on the olds media gallery, So with that implementation we will not have any performance issue. But anyway this is a new media gallery with new approach so it completely different from what we had before

@jayshields
Copy link

@jayshields when you say you made a module, what does that exactly mean? What does the module do and what files are in this module besides the di.xml file?

By module I mean a directory in app/code/<vendor> containing a basic registration.php and then a directory alongside for etc, with 2 files. A basic module.xml and then the di.xml as explained above.

@perfpcs
Copy link

perfpcs commented Jul 13, 2021

@jayshields thank you

@engcom-Charlie
Copy link
Contributor

It seems that, bot moved the PR into Review in Progress automatically, so moving back to Changes Requested.

Thank You!

@Nazar65
Copy link
Member Author

Nazar65 commented May 2, 2022

closing in favor -> #35404

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auto-Tests: Covered All changes in Pull Request is covered by auto-tests Component: MediaGalleryUi Event: MageCONF CD 2020 Priority: P1 Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing. Release Line: 2.4 Severity: S1 Affects critical data or functionality and forces users to employ a workaround.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Magento 2.4.1 Media Gallery....not ready for prime time. Fails when you have large amounts of data (images)