Fixed duplicated ids to index by mview changelog #36155

Contributor

@MateuszMesek MateuszMesek commented Sep 17, 2022

Description (*)

When a changelog table holds a large number of records (over 1000), the indexer called by the mview action can process the same ids several times.

ChangeLogBatchWalker now uses a temporary table to collect the unique ids to reindex. The list of ids is returned in batches by a Generator.
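
A minimal sketch of that approach, written in Python with SQLite rather than the PR's PHP and MySQL. The table names `changelog_cl` and `changelog_ids_tmp`, the function `walk_unique_ids`, and its parameters are hypothetical; the sketch only illustrates collecting unique ids in a temporary table and yielding them in batches from a generator, and is not the code from this PR.

```python
import sqlite3
from typing import Iterator, List


def walk_unique_ids(
    conn: sqlite3.Connection,
    from_version: int,
    to_version: int,
    batch_size: int,
) -> Iterator[List[int]]:
    """Yield de-duplicated entity ids from the changelog in batches."""
    # Hypothetical temporary table: one row per distinct entity_id,
    # no matter how many changelog rows mention that id.
    conn.execute(
        "CREATE TEMPORARY TABLE changelog_ids_tmp ("
        "id INTEGER PRIMARY KEY, entity_id INTEGER NOT NULL)"
    )
    conn.execute(
        "INSERT INTO changelog_ids_tmp (entity_id) "
        "SELECT DISTINCT entity_id FROM changelog_cl "
        "WHERE version_id > ? AND version_id <= ? ORDER BY entity_id",
        (from_version, to_version),
    )
    last_id = 0
    while True:
        # Page through the temporary table by its own primary key.
        rows = conn.execute(
            "SELECT id, entity_id FROM changelog_ids_tmp "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            break
        last_id = rows[-1][0]
        yield [entity_id for _, entity_id in rows]
    conn.execute("DROP TABLE changelog_ids_tmp")


# Usage: 2502 changelog rows that reference only three distinct ids.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE changelog_cl (version_id INTEGER PRIMARY KEY, entity_id INTEGER)"
)
conn.executemany(
    "INSERT INTO changelog_cl (entity_id) VALUES (?)",
    [(1,)] * 2500 + [(2,), (3,)],
)
batches = list(walk_unique_ids(conn, 0, 2502, 1000))
print(batches)  # [[1, 2, 3]]
```

With de-duplication done before batching, id 1 reaches the indexer once even though it fills more than two blocks of 1000 changelog rows.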

Fixed Issues (if relevant)

  1. Fixes Asynchronous indexing can break large servers #30012

Manual testing scenarios (*)

  1. Set the indexer to "On Scheduled" mode.
  2. Insert more than 1000 ids into the changelog table related to your indexer. Each block of 1000 records must contain an id that is duplicated in other blocks. This can simply be the same id inserted more than 1000 times.
  3. Run the indexer_update_all_views cron job.
  4. You will see the same ids used several times by the mview action.

For example, if your changelog table is filled with the same id (e.g. 1) 10000 times, you should see that id (1) used 10 times by the mview action.
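
The arithmetic behind that expectation can be sketched in Python. This assumes (it is not taken from the PR diff) that the unpatched walker de-duplicates ids only within each slice of 1000 changelog rows; `walk_per_batch` is a hypothetical stand-in for that behaviour.

```python
from typing import Iterator, List


def walk_per_batch(entity_ids: List[int], batch_size: int) -> Iterator[List[int]]:
    """De-duplicate inside each slice of changelog rows only."""
    for start in range(0, len(entity_ids), batch_size):
        batch = entity_ids[start:start + batch_size]
        # Duplicates are removed within this slice, but the same id can
        # still show up again in every later slice.
        yield sorted(set(batch))


changelog = [1] * 10000  # the same id written 10000 times
batches = list(walk_per_batch(changelog, 1000))
times_id_1_reindexed = sum(batch.count(1) for batch in batches)
print(len(batches), times_id_1_reindexed)  # 10 10
```

Ten slices of 1000 rows each yield `[1]`, so the single id is handed to the indexer ten times.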

Related Pull Requests

https://github.com/magento-gl/magento2-infrastructure/pull/22

Contribution checklist (*)

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • README.md files for modified modules are updated and included in the pull request if any README.md predefined sections require an update
  • All automated tests passed successfully (all builds are green)

@m2-assistant

m2-assistant bot commented Sep 17, 2022

Hi @MateuszMesek. Thank you for your contribution
Here are some useful tips how you can test your changes using Magento test environment.
Add the comment under your pull request to deploy test or vanilla Magento instance:

  • @magento give me test instance - deploy test instance based on PR changes
  • @magento give me 2.4-develop instance - deploy vanilla Magento instance

❗ Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s)
    For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names. Allowed build names are:

  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests
  13. Semantic Version Checker

You can find more information about the builds here

ℹ️ Run only required test builds during development. Run all test builds before sending your pull request for review.

For more details, review the Magento Contributor Guide documentation.

⚠️ According to the Magento Contribution requirements, all Pull Requests must go through the Community Contributions Triage process. Community Contributions Triage is a public meeting.

🕙 You can find the schedule on the Magento Community Calendar page.

📞 The triage of Pull Requests happens in the queue order. If you want to speed up the delivery of your contribution, join the Community Contributions Triage session to discuss the appropriate ticket.

✏️ Feel free to post questions/proposals/feedback related to the Community Contributions Triage process to the corresponding Slack Channel

@MateuszMesek
Contributor Author

@magento run all tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

@hostep
Contributor

hostep commented Sep 19, 2022

The following issue sounds related to this fix: #30012. Can you confirm, @MateuszMesek? If they are related, we should link this PR to that issue.

@slavvka
Member

slavvka commented Sep 19, 2022

@magento run all tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

@MateuszMesek
Contributor Author

@hostep yes, this PR is related to issue #30012. I updated the PR description.

@MateuszMesek
Contributor Author

@magento run all tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

@MateuszMesek
Contributor Author

@magento run Static Tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

@MateuszMesek
Contributor Author

@magento run Static Tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

@engcom-Hotel added the label "Priority: P2" (A defect with this priority could have functionality issues which are not to expectations.) Sep 21, 2022
@m2-community-project bot added the label "Severity: S0" (A problem that is blocking the ability to work. An immediate fix is needed.) Nov 10, 2022
@MateuszMesek force-pushed the fixed-duplicated-ids-to-index-by-mview-changelog branch from 1d88eec to 6eb1890 January 10, 2023 22:34
@m2-community-project bot added the label "Priority: P1" (Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing.) Mar 31, 2023
@hostep
Contributor

hostep commented Apr 6, 2023

@magento run all tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please re-request them if they don't show in a reasonable amount of time.

Contributor

@ihor-sviziev ihor-sviziev left a comment

Hi @MateuszMesek
Thank you very much for a great improvement! Could you please fix the failing static tests?

@m2-community-project bot added the labels "Progress: needs update" and "Priority: P0" (This generally occurs in cases when the entire functionality is blocked.) and removed the label "Progress: pending review" Apr 6, 2023
@engcom-Echo
Contributor

@magento run Static Tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Echo
Contributor

@magento run Static Tests,Unit Tests,WebAPI Tests,Functional Tests EE,Functional Tests CE,Functional Tests B2B

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Echo
Contributor

@magento run Unit Tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Echo
Contributor

The failing test does not seem to be related to the PR changes, hence moving it to Merge In Progress.

@engcom-Echo
Contributor

@magento run all tests

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Echo
Contributor

@magento run Unit Tests,Integration Tests,Functional Tests EE

@magento-automated-testing

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Echo
Contributor

Functional Tests EE failures differ between the last two runs on the same commit. The other build failures are not related to the PR changes.
Functional Tests EE
Run 1: [screenshot: 36155-ee]
Run 2: [screenshot: Screenshot 2023-08-18 at 12 36 47 PM]

@dooblem

dooblem commented Nov 24, 2023

We have 2 customers affected by this: one on m2cloud, the other on premise (community edition). Both are on version 2.4.6.

The dev teams are working to apply this PR next week.

To unlock the other indexes and get through the weekend without losing products on the front office, we are using this very quick and dirty workaround: issuing the following SQL query every minute or every 30 seconds or so, scheduled by cron or by a while-true loop in a screen session:

mysql --defaults-file=xxxx.cnf magento -e 'delete from catalog_product_price_cl;'

I also really cannot understand why there are so many duplicates in the changelog table:

MariaDB [62k7efhsipvjs]> select * from catalog_product_price_cl;
+------------+-----------+
| version_id | entity_id |
+------------+-----------+
| 1349334317 |   2135411 |
...
| 1349438813 |   2144443 |
| 1349438816 |   2144443 |
| 1349438819 |   2144443 |
| 1349438822 |   2144443 |
+------------+-----------+
34801 rows in set (0.014 sec)

@dooblem

dooblem commented Nov 28, 2023

Just for the record: in parallel to this, a few months ago our developers (@mabaud) created a much simpler patch. It has already been deployed successfully on a project.
See the patch attached there: #37367

@MateuszMesek MateuszMesek deleted the fixed-duplicated-ids-to-index-by-mview-changelog branch November 28, 2023 09:10
@Bar3nho

Bar3nho commented Dec 6, 2023

@MateuszMesek have you tested this solution with different MAGE_INDEXER_THREADS_COUNT values?

Locally I've tested it on 4 threads and it always ends with a series of DROP TEMPORARY TABLE ... that break the indexer.

@MateuszMesek
Contributor Author

> @MateuszMesek have you tested this solution with different MAGE_INDEXER_THREADS_COUNT values?
>
> Locally I've tested it on 4 threads and it always ends with a series of "drop temporary table" that break the indexer.

@Bar3nho could you describe how you call mview with multiple threads, or provide steps to reproduce your issue?

MAGE_INDEXER_THREADS_COUNT is supported by Magento\Indexer\Model\ProcessManager.
This manager is used only for full indexation, not for mview indexation.
https://github.com/search?q=repo%3Amagento%2Fmagento2%20processManager-%3Eexecute&type=code

@Bar3nho

Bar3nho commented Dec 6, 2023

@MateuszMesek I checked it once more and the problem is due to the fact that our custom mview indexer uses Magento\Indexer\Model\ProcessManager, which enables multithreading. The changes introduced in ChangelogBatchWalker mean that it is no longer possible to use multiple threads while doing just a partial reindex. Since Magento has never really taken advantage of multithreading in its partial indexers, this is not a bug but rather an inconvenience. I will point this out in my issue report, thank you for a good point.

@ilnytskyi
Contributor

ilnytskyi commented Dec 6, 2023

As @Bar3nho mentioned in #38246, using Magento\Indexer\Model\ProcessManager in a custom Mview for partial reindex leads to strange behaviour.
I think it's due to combining yield and finally in one place.
[screenshot: Selection_1132]

Our DB log shows that the tmp table is dropped multiple times. Then a SELECT runs against that table, and then it is dropped again.
[screenshot: Selection_1131]

I think the bug should be easily reproducible with any implementation of \Magento\Framework\Mview\ActionInterface that uses Magento\Indexer\Model\ProcessManager, e.g. when data can be isolated per store and indexed in parallel.

It looks like the second foreach iteration inside of

\Magento\Framework\Mview\View::executeAction

would cause this problem, because finally is called every time while there are still SELECTs pending to read data from the tmp table.
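
The general language behaviour behind that suspicion can be shown with a small Python analogy. It is not the Magento code and does not reproduce ProcessManager: `walk` and the `events` log are hypothetical, and the SQL strings are only recorded, never executed. The point is that cleanup placed in `finally` inside a generator runs whenever the generator is closed or destroyed, including while batches are still pending.

```python
from typing import Iterator, List

events: List[str] = []  # stands in for the DB query log


def walk(ids: List[int], batch_size: int) -> Iterator[List[int]]:
    events.append("CREATE TEMPORARY TABLE")
    try:
        for start in range(0, len(ids), batch_size):
            events.append("SELECT batch")
            yield ids[start:start + batch_size]
    finally:
        # Runs after the last batch, and also when the generator is
        # closed early with batches still unread.
        events.append("DROP TEMPORARY TABLE")


batches = walk([1, 2, 3, 4], batch_size=2)
first = next(batches)  # the consumer reads one batch...
batches.close()        # ...then the generator is torn down early
print(events)
# ['CREATE TEMPORARY TABLE', 'SELECT batch', 'DROP TEMPORARY TABLE']
```

Here the second batch `[3, 4]` is never read, yet the drop has already been logged. Any other holder of the same temporary table would then be selecting from a table that no longer exists.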

Labels
  • Award: bug fix
  • Award: category of expertise
  • Priority: P1 (Once P0 defects have been fixed, a defect having this priority is the next candidate for fixing.)
  • Progress: accept
  • Project: Community Picked (PRs upvoted by the community)
  • Severity: S0 (A problem that is blocking the ability to work. An immediate fix is needed.)
  • Triage: Dev.Experience (Issue related to Developer Experience and needs help with Triage to Confirm or Reject it)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Asynchronous indexing can break large servers