Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Product images are being duplicated on import #26713

Merged
merged 12 commits into from
Oct 13, 2020
Merged

Product images are being duplicated on import #26713

merged 12 commits into from
Oct 13, 2020

Conversation

PieterCappelle
Copy link
Contributor

Description

Built on top of #21146 and #21855.

This PR adds image deletion so after import product will have images specified in the CSV. New images will be added, existing will not be duplicated and images not mentioned in the CSV will be removed. This PR has been created for 2.4-develop branch and when approved I'll create backport for 2.3 branch.

Fixed Issues

#14398: Images Not Replaced via CSV Import
#21885: Product images are being duplicated on import

Manual testing scenarios (*)

  1. Import product CSV using Add/Update method multiple times
  2. Check if image duplication happens or not

@m2-assistant
Copy link

m2-assistant bot commented Feb 5, 2020

Hi @PieterCappelle. Thank you for your contribution
Here is some useful tips how you can test your changes using Magento test environment.
Add the comment under your pull request to deploy test or vanilla Magento instance:

  • @magento give me test instance - deploy test instance based on PR changes
  • @magento give me 2.4-develop instance - deploy vanilla Magento instance

For more details, please, review the Magento Contributor Guide documentation.

@AddoSolutions
Copy link

Hey @PieterCappelle thanks for putting this together. If I am not mistaken, the error here is that when an import with invalid entries goes through, it bugs out?

@PieterCappelle
Copy link
Contributor Author

@AddoSolutions No, at Magento 2.3 and 2.4 there is a issue when importing products with images. It's duplicating them instead of replacing them. If you import for the first time the image is added, if you re-run the import for the second time the image is not replaced but added (duplicated).

@AddoSolutions
Copy link

AddoSolutions commented Feb 20, 2020 via email

@erfanimani
Copy link
Contributor

Thanks @PieterCappelle . I haven't tested this, but one testing scenario I think is useful to point out is the following:

When changing image roles, i.e. Hide on product page or custom image attributes, even though the md5 hash of the image is the same, it should still update the roles. (There had been two other PRs for this same issue and one of them didn't do that.)

@PieterCappelle
Copy link
Contributor Author

@erfanimani You have links to those PR's? I can't find them immediately. Thanks!

@erfanimani
Copy link
Contributor

Ah I didn't realize you had linked them already. It's just #21146 and #21855 @PieterCappelle

@jwhitney55
Copy link

I had this same duplicate image issue and applied the changes to 2.3.3-p1. It fixed the issue, but introduced a new one. If you import the additional_images attribute column with more than one image URL, it results in only the first image being imported.

@AddoSolutions
Copy link

Checking in on this, has this issue stalled?

@sidolov sidolov added Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Severity: S2 Major restrictions or short-term circumventions are required until a fix is available. labels Aug 17, 2020
@sidolov sidolov self-assigned this Aug 24, 2020
@engcom-Charlie
Copy link
Contributor

Hi @PieterCappelle, can you please look through the requested changes and fix falling tests?
Otherwise, we can't proceed with your PR.
Thank you.

# Conflicts:
#	app/code/Magento/CatalogImportExport/Model/Import/Product.php
@engcom-Charlie
Copy link
Contributor

@PieterCappelle I'll try to continue with your PR.

@engcom-Alfa
Copy link
Contributor

✔️ QA Passed

Preconditions:

  1. Install Magento;
  2. Create directories import/images in the var folder;
  3. Add images test-1.jpg, test-2.jpg, test-3.jpg and test-4.jpg inside the path <install dir>/var/import/images;
  4. Import the product CSV data below twice
"sku","product_online","website_id","store_view_code","attribute_set_code","product_type","categories","name","description","short_description","visibility","price","special_price","special_price_from_date","special_price_to_date","url_key","meta_title","meta_keywords","meta_description","display_product_options_in","msrp_display_actual_price_type","additional_attributes","qty","out_of_stock_qty","use_config_min_qty","is_qty_decimal","allow_backorders","use_config_backorders","min_cart_qty","use_config_min_sale_qty","max_cart_qty","use_config_max_sale_qty","is_in_stock","use_config_notify_stock_qty","manage_stock","use_config_manage_stock","use_config_qty_increments","qty_increments","use_config_enable_qty_inc","enable_qty_increments","is_decimal_divided","deferred_stock_update","use_config_deferred_stock_update","related_skus","crosssell_skus","upsell_skus","custom_options","bundle_price_type","bundle_sku_type","bundle_price_view","bundle_weight_type","bundle_values","associated_skus","base_image","base_image_label","small_image","small_image_label","thumbnail_image","thumbnail_image_label","swatch_image","swatch_image_label","additional_images","additional_image_labels","configurable_variations"
"some-test-product",1,1,,"Default","simple","Default Category/Some Category/Some Test Product","Some test product","This is just some test product",,"Catalog, Search",2999,,,,"some-test-product","Some test product",,,"Block after Info Column","Use config",,100,0,1,0,0,1,1,0,0,1,1,1,0,1,1,0,1,0,0,0,1,,,,,,,,,,,"test-1.jpg",,"test-1.jpg",,"test-1.jpg",,"test-1.jpg",,"test-3.jpg",,
"another-test-product",1,1,,"Default","simple","Default Category/Some Category/Some Test Product","Some test product","This is just some test product",,"Catalog, Search",2999,,,,"another-test-product","Another test product",,,"Block after Info Column","Use config",,100,0,1,0,0,1,1,0,0,1,1,1,0,1,1,0,1,0,0,0,1,,,,,,,,,,,"test-2.jpg",,"test-2.jpg",,"test-2.jpg",,"test-2.jpg",,"test-4.jpg",,

Import settings

  • Entity Type: Products
  • Import Behavior: Add/Update
  • Stop on Error (default)
  • Allowed Errors Count: 10 (default)
  • Field separator: , (default)
  • Multiple value separator: , (default)
  • Empty attribute value constant: __EMPTY__VALUE__ (default)
  • Fields Enclosure: Not checked (default)

Before: ✖️ Product images duplicated

2020-10-07_13-09

After: ✔️ Product images are not duplicated

2020-10-07_12-52

Case 2. Tested and with external links for images and everything works as expected

Manual testing scenario:

  1. Add the values test-1.jpg,https://(some image) to the additional_images field in a CSV file;
  2. Import the product CSV data several times;

Actual Result: ✔️ Product images are not duplicated

2020-10-07_13-21

Also, tested Replace and Delete functionalities and everything works as expected.

@engcom-Alfa engcom-Alfa added the QA: Ready to add to Regression Scope Should be analyzed and added to Regression Testing Scope(if applicable) label Oct 7, 2020
@engcom-Alfa engcom-Alfa added QA: Added to Regression Scope Scenario was analysed and added to Regression Testing Scope and removed QA: Ready to add to Regression Scope Should be analyzed and added to Regression Testing Scope(if applicable) labels Oct 7, 2020
@engcom-Charlie
Copy link
Contributor

@magento run all tests

@m2-assistant
Copy link

m2-assistant bot commented Oct 13, 2020

Hi @PieterCappelle, thank you for your contribution!
Please, complete Contribution Survey, it will take less than a minute.
Your feedback will help us to improve contribution process.

@PieterCappelle PieterCappelle deleted the product-import-fix-duplicate-images branch October 13, 2020 11:48
@erfanimani
Copy link
Contributor

After two years, we can stop patching this ourselves. Well done all — cause for celebration :)

@barryvdh
Copy link
Contributor

Nice!

@PieterCappelle

This PR has been created for 2.4-develop branch and when approved I'll create backport for 2.3 branch.

Do you still have plans for this, or is a patch available?

markalston pushed a commit to markalston/magento2 that referenced this pull request Feb 11, 2021
Product images are being duplicated on import magento#26713
markalston pushed a commit to markalston/magento2 that referenced this pull request Feb 11, 2021
markalston pushed a commit to markalston/magento2 that referenced this pull request Feb 11, 2021
@barryvdh
Copy link
Contributor

Are the hashes re-calculated on each import? Or are they stored in the database?
Because otherwise this can be quite CPU-intensive right?

And how does this work with the S3 storage? hash_file() doesn't work on the S3 system right? So wouldn't you need to use the file driver to get the contents and use hash() instead? And store the result to prevent accessing S3 every import.

Or am I overlooking something?

@aurmil
Copy link

aurmil commented Nov 22, 2021

hello

Are the hashes re-calculated on each import? Or are they stored in the database?

they are re-calculated

Because otherwise this can be quite CPU-intensive right?

indeed
we experienced a "Maximum execution time" reached on overall product data import process because of hash_file (called in getFileHash) taking too long (when importing a small file with 2 columns, no image data)
we decided to add a condition on call to addImageHashes near the begining of _saveProducts > while: only when imported file contains image-related fields, based on $this->_imagesArrayKeys

And how does this work with the S3 storage? hash_file() doesn't work on the S3 system right? So wouldn't you need to use the file driver to get the contents and use hash() instead? And store the result to prevent accessing S3 every import.

we don't have the case yet but may come later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Auto-Tests: Covered All changes in Pull Request is covered by auto-tests Component: CatalogImportExport Priority: P2 A defect with this priority could have functionality issues which are not to expectations. Progress: accept QA: Added to Regression Scope Scenario was analysed and added to Regression Testing Scope Release Line: 2.4 Severity: S2 Major restrictions or short-term circumventions are required until a fix is available.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.