-
Notifications
You must be signed in to change notification settings - Fork 9.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve getCustomAttribute() performance #13308
Improve getCustomAttribute() performance #13308
Conversation
The same mechanism was used for categories, I updated the PR to reflect the changes there analogously |
Without getName(), __toString() is triggered, which creates a string representation of the whole class and apparently also can lead to errors
Hi @schmengler |
I confirm that PR has improvements in next scenarios:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a lot of BiC being introduced in the scope of performance improvement.
It looks better to introduce dedicated service for getCustomAttributesCodes
logic.
And move all the logic there.
This service could be injected into Product model which will proxy call into it.
No need to bring this logic to Product/Category resource models which contain a lot of logic now (violating Single Responsibility), breaking their contract along with that.
The more granular single responsible services we introduce in M2 codebase - the easier it would be maintain this code in future
const KEY_PRODUCT_COUNT = 'product_count'; | ||
const KEY_CHILDREN_DATA = 'children_data'; | ||
/**#@-*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
constants removal from class and moving them to interface is backward incompatible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why? TheClass::THE_CONSTANT
still works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@schmengler
Possible fatal error if there is another implementation of the interface with same constant name declaration.
But this is more like the theoretical issue.
Currently, I raised this question for voting on the Internal Architectural Council to allow moving constants preserving names and values from Implementation to Interface being implemented. And thus fix our Backward Compatibility Guideline http://devdocs.magento.com/guides/v2.2/contributor-guide/backward-compatible-development/index.html saying
Removing or renaming constants
Do not remove or rename constants.
And to adapt internal tests which break on such changes.
I will get back to you shortly as soon as get results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maghamed It wouldn't even be a problem: https://3v4l.org/H4mYd 😄
/** | ||
* @var CategoryAttributeRepositoryInterface | ||
*/ | ||
protected $metadataService; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new properties added to classes should be private, protected properties are now allowed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, I'll change it
* @param MetadataServiceInterface $metadataService | ||
* @return string[] | ||
*/ | ||
protected function getEavAttributesCodes(MetadataServiceInterface $metadataService) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inheritance based API not encouraged in Magento 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know but I have to work with what is there (which is inheritance based resource models) or introduce huge changes, probably BIC. Do you have an idea for a simple solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's basically moved from \Magento\Framework\Model\AbstractExtensibleModel::getEavAttributesCodes
(inheritance based model API) - so it's not that I am introducing something new here.
A possible non-inheritance solution I see is to to implement the method in the Category and Product resource models seperately in a non-generic way. Then, if we want to get rid of duplicated logic, add a method getCustomAttributeCodes($dataObjectClassName)
to \Magento\Framework\Api\MetadataServiceInterface
(BIC: new method in interface). IMHO this would be subject to a different PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
disregard this, I've read your top comment
/** | ||
* @var ProductAttributeRepositoryInterface | ||
*/ | ||
protected $metadataService; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
introducing of protected properties/methods are not encouraged in M2
I'm going to think this through |
@maghamed Here's my draft for such a service. I'd like to discuss it before updating the PR: <?php
/**
* Copyright © Magento, Inc. All rights reserved.
* See COPYING.txt for license details.
*/
namespace Magento\Eav\Model\Entity;
use Magento\Framework\Api\MetadataServiceInterface;
class GetCustomAttributeCodes
{
/**
* @var string[]
*/
private $customAttributesCodes;
/**
* Receive a list of custom EAV attributes using provided metadata service. The results are cached per entity type
*
* @param MetadataServiceInterface $metadataService Custom attribute metadata service to be used
* @param string[] $interfaceAttributes Attribute codes that are part of the interface and should not be
* considered custom
* @param string|null $entityType Entity type (class name), only needed if metadata service handles different
* entities
* @return string[]
*/
public function execute(
MetadataServiceInterface $metadataService,
array $interfaceAttributes,
string $entityType = null
): array {
$cacheKey = get_class($metadataService) . '|' . $entityType;
if (!isset($this->customAttributesCodes[$cacheKey])) {
$customAttributesCodes = $this->getEavAttributesCodes($metadataService, $entityType);
$this->customAttributesCodes[$cacheKey] = array_values(
array_diff($customAttributesCodes, $interfaceAttributes)
);
}
return $this->customAttributesCodes;
}
/**
* Receive a list of EAV attributes using provided metadata service.
*
* @param MetadataServiceInterface $metadataService
* @param string|null $entityType
* @return string[]
*/
private function getEavAttributesCodes(MetadataServiceInterface $metadataService, string $entityType = null)
{
$attributeCodes = [];
$customAttributesMetadata = $metadataService->getCustomAttributesMetadata($entityType);
if (is_array($customAttributesMetadata)) {
/** @var $attribute \Magento\Framework\Api\MetadataObjectInterface */
foreach ($customAttributesMetadata as $attribute) {
$attributeCodes[] = $attribute->getAttributeCode();
}
}
return $attributeCodes;
}
} This could be injected into the return $this->getCustomAttributesCodes->execute(
$this->metadataService,
ProductInterface::ATTRIBUTES
); What I don't like about it is that the caller (the product/category model) is responsible to pass the non-custom interface attributes. This is something that the custom attribute metadata service should be able to provide, don't you think? Again, adding a method to What do you think? Also just to be sure: what's the consensus on state in services? Because caching the results is absolutely necessary. |
array_diff($this->customAttributesCodes, CategoryInterface::ATTRIBUTES) | ||
); | ||
} | ||
return $this->customAttributesCodes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of products we can improve this even further for larger datasets, as it stands here it will work great for a small number of attributes (< 500) but as soon as we start to go upward of that figure the performance will fall through the floor (each attribute loaded results in Magento\Eav\Model\ResourceModel\Entity\Attribute::_afterLoad
being called which in turn runs a select query to join additional attribute data).
In my current project we have ~2200 custom attributes, but all of our attribute sets are small (< 100 custom attributes in each one), the problem is that this method loads every attribute regardless of the product's attribute set, to fix it I pulled in this PR as a patch and then changed this method to the following:
public function getCustomAttributesCodes($attributeSetId)
{
if (!isset($this->customAttributesCodes[$attributeSetId])) {
$codes = [];
$criteria = $this->searchCriteriaBuilder->addFilter('attribute_set_id', $attributeSetId, 'eq');
$attributes = $this->attributeRepository->getList($criteria->create())->getItems();
if (is_array($attributes)) {
/** @var $attribute \Magento\Framework\Api\MetadataObjectInterface */
foreach ($attributes as $attribute) {
$codes[] = $attribute->getAttributeCode();
}
}
$codes = array_values(
array_diff($codes, ProductInterface::ATTRIBUTES)
);
$this->customAttributesCodes[$attributeSetId] = $codes;
}
return $this->customAttributesCodes[$attributeSetId];
}
then had the Magento\Catalog\Model\Product
pass through the product's attribute set ID, this way we only ever load the attributes for the current attribute set
I think the memoization based on attribute set ID in the $this->customAttributesCodes[$attributeSetId]
is not required because resource model instances are not shared across products, but I might be wrong so I've left it in there for safety
whilst this would result in a few more queries (1 for each different attribute set to get the attributes relevant for that product) it means that creating a product is not affected by the total number of attributes in Magento, only by the number of attributes in the product's attribute set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because resource model instances are not shared across products,
They are, this is what my initial approach is based on. Thanks a lot for your addition, it makes sense and I'd add it to the PR!
But first waiting for @maghamed to discuss the final architecture
Hey @schmengler , Your
I agree with you that it's the responsibility of Metadata Service to know what attributes belong to Interface and what attributes are Custom ones. But as you mentioned we can't change existing interface because of BC requirements. What we can do is to introduce interface for your service like this
along with that you provide the implementation
And introduce new Composite services for Product and Category.
Now you make both Product and Category model depend on And with DI Type you change the incoming object.
Do the same for Category. So, summarizing all the above.
Do proposed solution make sense for you? |
@maghamed Thank you for the detailed response, your suggestion sounds good! Regarding the type argument:
It's used in the implementation for address attributes, although on second look there is a default value and other values are never passed. I'll leave it out then. |
@maghamed About the addition by @jameshalsall : I could add an optional |
If you make it easy for me to apply that fix later (by allowing the parameter in the interface method) then I can send a separate PR after this one is merged 👍 thanks |
Hi @maghamed, thank you for the review. |
* @var ResourceModel\Category | ||
*/ | ||
protected $_resource; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while this is not necessary and not relevant for this PR anymore, it helps with static analysis and IDE support by specifying the concrete type of $_resource
{ | ||
return parent::_getResource(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as with $_resource
. It's not super important but a minor improvement that I made while I was at it. Should it be a separate PR?
->willReturn([$customAttributeCode]); | ||
$this->category->setData($interfaceAttributeCode, "sub"); | ||
$nameAttributeCode = 'name'; | ||
$descriptionAttributeCode = 'description'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you rename the variables? For the test it's not important that it's name and description, but that one is an interface attribute and the other is custom
->willReturn([$customAttributeCode]); | ||
$this->model->setData($interfaceAttributeCode, 10); | ||
$priceCode = 'price'; | ||
$colorAttributeCode = 'color'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as with "name" and "description" above. you probably reverted both to the previous version
|
||
//The color attribute is not set, expect empty custom attribute array | ||
$this->assertEquals([], $this->model->getCustomAttributes()); | ||
|
||
//Set the color attribute; | ||
$initialCustomAttribueValue = "red"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the variables for "initial value" and "new value" wer introduced to make connection between fixture and verification explicit and clear
@jameshalsall ooops, too late, sorry! |
Hi @jameshalsall , |
Yeah I will do this at the weekend.
…Sent from my iPhone
On 3 May 2018, at 18:06, Michail Slabko ***@***.***> wrote:
Hi @jameshalsall ,
Do you plan to create PR with optimization of loading attributes?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Hi @jameshalsall , |
@mslabko I will take a look at it now and see how much effort is involved |
so it seems the option |
@jameshalsall , yes, for 2.3 it's not a problem to add second argument for GetCustomAttributeCodesInterface::execute. Also, I recommend do not use MetadataServiceInterface for retrieving attributes, due to this is not optimal from a performance perspective. Instead of MetadataServiceInterface it's better to use \Magento\Eav\Model\Config::getEntityAttributes, which already has a caching mechanism - permanent and application cache. So I suggetst refactor GetCustomAttributeCodesInterface as
|
oops, I already added - #15135 |
Description
When traversing a product collection and using
getCustomAttribute()
within the loop, it triggered a database query each time to retrieve the metadata from theeav_attributes
table. This was due to the custom attributes being cached in each product model separately.I moved it into the resource model so that only one query per entity type is necessary.
Manual testing scenarios
On an instance with sample data or more than a handful of products
Profile this code snippet:
More background and a test tool can be found in this blog by my colleague @avstudnitz who found the issue: https://www.integer-net.com/performance-trap-magento-2-getcustomattribute-function/
After the changes in this PR, database calls should been reduced by (n-1) with n = number of products.
Contribution checklist