EZP-31569: Implemented Repository filtering #54
Then maybe document find to also return
That was probably just to stay close to how it is everywhere else, however putting it on Filter is fine. Maybe we should do the same on Query then.
It's better if it is specific, so we avoid the Content vs Location mess we have with Query. With no clear type hint on what goes where, only known on run time.
Ideally everything, except:
Interesting |
About the language parameters:
Actually it should respect the siteaccess aware repository approach (in the SA aware implementation), and return items depending on the current SA's language settings. How did the repo behave before we added that again? What about query types? Can we somehow re-use those? About criteria and sort clauses, this is my prioritized list:
About What about visibility ? I'd expect that by default, invisible items aren't returned on the frontend (and can be set to be returned), and are returned on the backend (and can be set to not be returned). How could we manage that ? |
+20 but this should ideally be done Repo wide for all APIs. Either the SA aware layer injects config for you (on hidden or not, for instance), or we move this to permissions (with something like SALimitation("admin")) to get the permission system to do this for us.
Basically it works by setting value from config if |
By default, I assumed (I should not) that it follows the same architecture as the rest of the repository, with that extra siteaccess aware layer (transparent). Can you confirm @alongosz ?
Good. Let's do this for that API endpoint as well. If no language is specified, it uses the current SA's languages parameters. OK @alongosz ? |
Thanks @andrerom and @bdunogier for your comments.
This is exactly the approach now -
It's already done via: /**
* @return \eZ\Publish\API\Repository\Values\Content\Content[]|\Traversable
*/
public function getIterator(): Traversable;
+1
TBD, I might indeed do this to provide strict API
Ok, here comes the list, with questions and remarks: Criteria:
SortClauses:
Note that consumption would be a bit different, more strict. This would actually be a DX advantage, because usage with an IDE should be more intuitive. The list is huge, so we should focus on the most important ones and deliver the remaining ones as a follow-up. [1] IMHO we should have some fallback
SA layer is enabled by default now, so some language will always be defined.
It should set languages only if not set (
+0.75 - I would like to avoid distinction between
They're strictly typed for Search API.
That's exactly the point of having interface on API for strict referencing in 3rd party code w/o possibility to instantiate.
There's no way to distinguish front-end from backend. I've proposed 3 separate Criteria
The same goes for Version status, though we can probably assume that if someone asks for |
Yes there is: the siteaccess. In legacy, we had a setting for showing invisible objects, enabled for backoffice, and disabled for front. It would be a perfect task for the SA aware repository layer (add the criterion if it wasn't added by the user, as a default).
Yes, I know that :) It is even more important now that we have added built-in types. I really insist that we try to make that consistent, even though I don't know how right now. In the end, 90% of what a query type returns can be used for other find operations (filter + sort + pagination). In any case, we can NOT require users to add the visibility criteria to each and every call. One other thing: the query field should definitely be changed at some point to use that API instead of Search... (and it should also be exposed over REST, as an alternative to |
Internal sync:
use eZ\Publish\API\Repository\Values\Filter\Filter;
$filter = (new Filter())
->withCriteria(new LogicalAnd(new...))
->withSortClauses(new SortClause(...)); |
Could you try to find some time to update the description so that it matches the last decisions ? It would really make reviews and feedback easier. |
@bdunogier should be up to date now with the current state after rebasing, but I already see that the handling of languages needs to be changed to follow the known pattern used in the SiteAccess-aware services layer. Moreover, Filter vs. FilterBuilder is still unresolved. |
Internal sync: removed |
+1 for API and SPI changes (besides things mentioned on Slack). I have some suggestions about implementation but we can apply this as follow up PR 😉
Internal sync: Moved Service Tags constants to dedicated ServiceTags class (30f125b). |
SPI\Persistence\* is considered an internal implementation
Refactored EzPublishCoreExtension so registering of Symfony automatic configuration for services has a separate method.
Rebased to resolve conflicts after another merge and squashed fixups. |
Merging, QA will be done on the RC tag. |
v3.1
Summary
This PR introduces a Content Repository filtering API that allows a Developer to query the Content Repository and retrieve a list of either Content items or Locations without involving a search engine.
API endpoints
ContentService
LocationService
Notice that I've kept the $languages argument, which injects context languages and is used by the SiteAccessAware layer. Initially I planned to rely entirely on the Filter criteria, but the code required for this was too inconsistent and too complicated when compared with the current SiteAccessAware implementation and test coverage. $languages behaves the same as for the other existing API endpoints - if null, then it should be chosen from the context of a layer. This allows setting an empty array [] to force no extra languages.
API consumption
Filter is a fluent Value Object defining its own setters using grammar which encourages chained use. It does not introduce its own builder, to avoid complicating an already complex solution.
Full list of Filter fluent setters:
withCriterion - accepts a single Criterion; no Criterion may be set already
andWithCriterion - appends a Criterion using the LogicalAnd operation; if no Criterion was set prior to this operation, it simply sets it instead (like withCriterion)
orWithCriterion - appends a Criterion using the LogicalOr operation; the same rules apply as for andWithCriterion
withSortClause - appends a Sort Clause
sliceBy - sets limit and offset for pagination
reset - removes all Criteria, Sort Clauses, and pagination settings for fluent reuse.
It's worth pointing out that a Filter stores only a single Criterion. If multiple Criteria are needed, they should be nested inside one of the logical operators - LogicalAnd, LogicalOr, or LogicalNot.
On the other hand, a list of Sort Clauses is just an array.
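The setter semantics listed above can be pictured with a small, self-contained sketch. Everything in it (SketchFilter, SketchCriterion, Term, LogicalGroup, the string-based sort clauses, the exception thrown by withCriterion, and the limit-then-offset argument order of sliceBy) is a hypothetical stand-in chosen for illustration, not the kernel's Filter implementation; it only mirrors the described behaviour of storing a single Criterion and nesting further ones in a logical group.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for illustration only; these are NOT the kernel classes.
interface SketchCriterion {}

// A leaf condition, described by a plain label to keep the sketch small.
final class Term implements SketchCriterion
{
    public $label;
    public function __construct(string $label)
    {
        $this->label = $label;
    }
}

// Plays the role of LogicalAnd / LogicalOr: a Criterion nesting other Criteria.
final class LogicalGroup implements SketchCriterion
{
    public $operator;
    public $criteria;
    public function __construct(string $operator, array $criteria)
    {
        $this->operator = $operator;
        $this->criteria = $criteria;
    }
}

final class SketchFilter
{
    private $criterion;        // the single stored Criterion, or null
    private $sortClauses = []; // Sort Clauses are kept as a plain list
    private $limit = 0;        // 0 means "no limit" in this sketch
    private $offset = 0;

    public function withCriterion(SketchCriterion $criterion): self
    {
        if ($this->criterion !== null) {
            // Assumption: this sketch rejects a second call with an exception.
            throw new LogicException('Criterion already set, use andWithCriterion() or orWithCriterion()');
        }
        $this->criterion = $criterion;
        return $this;
    }

    public function andWithCriterion(SketchCriterion $criterion): self
    {
        return $this->combine('AND', $criterion);
    }

    public function orWithCriterion(SketchCriterion $criterion): self
    {
        return $this->combine('OR', $criterion);
    }

    public function withSortClause(string $sortClause): self
    {
        $this->sortClauses[] = $sortClause;
        return $this;
    }

    public function sliceBy(int $limit, int $offset): self
    {
        $this->limit = $limit;
        $this->offset = $offset;
        return $this;
    }

    public function reset(): self
    {
        $this->criterion = null;
        $this->sortClauses = [];
        $this->limit = $this->offset = 0;
        return $this;
    }

    public function getCriterion(): ?SketchCriterion
    {
        return $this->criterion;
    }

    // Sets the Criterion when none exists yet, otherwise nests old and new in a group.
    private function combine(string $operator, SketchCriterion $criterion): self
    {
        $this->criterion = $this->criterion === null
            ? $criterion
            : new LogicalGroup($operator, [$this->criterion, $criterion]);
        return $this;
    }
}

// Chained use: the second criterion ends up nested with the first in an AND group.
$filter = (new SketchFilter())
    ->withCriterion(new Term('content type is article'))
    ->andWithCriterion(new Term('section is news'))
    ->withSortClause('date published, descending')
    ->sliceBy(10, 0);
```

In this sketch, calling sliceBy is optional and the default of 0 stands for "no limit", which matches the unlimited-by-default behaviour described for Filter. How the real Filter combines an existing LogicalAnd with a further Criterion may differ from the simple nesting shown here.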
Setting pagination offset and limit is not a requirement, though it's a recommended practice. Unlike its search counterpart, a Query, Filter does not impose a default page limit, making results unlimited when not set. This is in line with database query behavior, so it should be less surprising than what search Query offers. Still, it's recommended to set a limit and use pagination.
To avoid making the solution even bigger than it is right now, all the search Criteria are reused for filtering. They all reside in the \eZ\Publish\API\Repository\Values\Content\Query\Criterion namespace, so there seems to be nothing wrong with that approach.
The same goes for Sort Clauses - the ones from the \eZ\Publish\API\Repository\Values\Content\Query\SortClause namespace are reused.
Important: Not all Criteria, nor all Sort Clauses, are supported with Repository Filtering. Some of them will appear later on; others, like the FullText Criterion, were never meant to work on a bare Repository without content indexed for search.
To avoid confusion, a Criterion which can be used for Repository Filtering needs to implement the \eZ\Publish\SPI\Repository\Values\Filter\FilteringCriterion interface. Similarly, a Sort Clause needs to implement the \eZ\Publish\SPI\Repository\Values\Filter\FilteringSortClause interface. To improve DX, with a proper IDE it will be immediately apparent that a given fluent setter does not accept a Criterion or Sort Clause that does not implement those interfaces.
Supported Criteria
Ancestor
ContentId
ContentTypeGroupId
ContentTypeId
ContentTypeIdentifier
DateMetadata
LanguageCode
ObjectStateId
ObjectStateIdentifier
ParentLocationId
RemoteId
SectionId
SectionIdentifier
Sibling
Subtree
Visibility
UserEmail
UserId
UserLogin
UserMetadata
IsUserBased
IsUserEnabled
LocationId
LocationRemoteId
Location\Depth
Location\IsMainLocation
Location\Priority
LogicalAnd
LogicalNot
LogicalOr
MatchAll
MatchNone
The difference with search here is that all Location Criteria are actually supported regardless of whether Content or Locations are being filtered. I don't see many technical reasons not to allow this (maybe performance a bit; you'll see nested DISTINCT magic in the Content gateway implementation). For instance, a Location\Depth query for Content items will return only those Content items which have any Location satisfying the given depth criteria. The same goes for Location\Priority. Even Location\IsMainLocation can be useful, because for a non-main Location requirement it will match only Content items which have more than one (so at least one non-main) Location.
Supported Sort Clauses
Content
ContentId
ContentName
DateModified
DatePublished
SectionIdentifier
SectionName
Location
Depth
Id
Path
Priority
Visibility
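Since support is signalled through marker interfaces (FilteringCriterion and FilteringSortClause), membership in the lists above effectively becomes a type check. The sketch below illustrates that idea only; BaseCriterion, FilteringMarker, SectionIdCriterion, FullTextCriterion and acceptForFiltering are made-up names that mirror the roles described in the text, not the kernel's classes.

```php
<?php

declare(strict_types=1);

// Hypothetical miniature; names are made up and only mirror the roles from the text.
abstract class BaseCriterion
{
}

// Plays the role of the FilteringCriterion marker interface.
interface FilteringMarker
{
}

// Supported by filtering: extends the shared base class and carries the marker.
final class SectionIdCriterion extends BaseCriterion implements FilteringMarker
{
}

// Search-only (like FullText): shares the base class but lacks the marker.
final class FullTextCriterion extends BaseCriterion
{
}

// A setter typed against the marker, so unsupported Criteria are rejected by the type system.
function acceptForFiltering(FilteringMarker $criterion): string
{
    return get_class($criterion);
}

$accepted = acceptForFiltering(new SectionIdCriterion());

try {
    acceptForFiltering(new FullTextCriterion());
    $rejected = false;
} catch (TypeError $e) {
    // PHP refuses the argument because FullTextCriterion does not implement the marker.
    $rejected = true;
}
```

The same mismatch that raises a TypeError at run time here is what an IDE or static analyser can flag before the code is ever executed, which is the DX benefit the description refers to.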
Service Provider Interfaces
The architecture relies on query builders, both for Criteria and for Sort Clauses, which follow the Visitor pattern.
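As a rough picture of such visitor-style query builders, the sketch below dispatches each Criterion to the first builder that accepts it and lets that builder contribute a WHERE constraint (and, where needed, a JOIN). It is a simplified assumption of the general shape only: SqlParts, ConstraintBuilder, ConstraintVisitor, the two criterion classes and every table and column name are hypothetical, and the kernel works with a Doctrine DBAL-based query builder rather than this toy container.

```php
<?php

declare(strict_types=1);

// Hypothetical miniature of an "accepts + build" contract. SqlParts and all
// table/column names are made up for this sketch.
final class SqlParts
{
    public $joins = [];
    public $params = [];

    // Registers a value and returns its named placeholder.
    public function bind($value): string
    {
        $name = ':p' . count($this->params);
        $this->params[$name] = $value;
        return $name;
    }
}

final class SectionIdIs
{
    public $sectionId;
    public function __construct(int $sectionId)
    {
        $this->sectionId = $sectionId;
    }
}

final class ParentLocationIdIs
{
    public $parentLocationId;
    public function __construct(int $parentLocationId)
    {
        $this->parentLocationId = $parentLocationId;
    }
}

interface ConstraintBuilder
{
    public function accepts(object $criterion): bool;
    public function buildConstraint(SqlParts $sql, object $criterion): string;
}

final class SectionIdBuilder implements ConstraintBuilder
{
    public function accepts(object $criterion): bool
    {
        return $criterion instanceof SectionIdIs;
    }

    public function buildConstraint(SqlParts $sql, object $criterion): string
    {
        return 'content.section_id = ' . $sql->bind($criterion->sectionId);
    }
}

final class ParentLocationIdBuilder implements ConstraintBuilder
{
    public function accepts(object $criterion): bool
    {
        return $criterion instanceof ParentLocationIdIs;
    }

    public function buildConstraint(SqlParts $sql, object $criterion): string
    {
        // Location data lives in another table here, so the builder registers a JOIN
        // (keyed by alias, so repeated use does not duplicate it).
        $sql->joins['location'] = 'INNER JOIN location ON location.content_id = content.id';
        return 'location.parent_id = ' . $sql->bind($criterion->parentLocationId);
    }
}

final class ConstraintVisitor
{
    private $builders;

    public function __construct(ConstraintBuilder ...$builders)
    {
        $this->builders = $builders;
    }

    // Hands the criterion to the first builder that accepts it.
    public function visit(SqlParts $sql, object $criterion): string
    {
        foreach ($this->builders as $builder) {
            if ($builder->accepts($criterion)) {
                return $builder->buildConstraint($sql, $criterion);
            }
        }
        throw new InvalidArgumentException('No builder accepts ' . get_class($criterion));
    }
}

$sql = new SqlParts();
$visitor = new ConstraintVisitor(new SectionIdBuilder(), new ParentLocationIdBuilder());
$where = [
    $visitor->visit($sql, new SectionIdIs(2)),
    $visitor->visit($sql, new ParentLocationIdIs(5)),
];
```

Keeping one builder per Criterion means a new Criterion can gain filtering support by adding a builder, without touching the dispatching code.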
For a Criterion to support Repository Filtering, it must:
implement the \eZ\Publish\SPI\Repository\Values\Filter\FilteringCriterion interface
have a query builder implementing the \eZ\Publish\SPI\Repository\Values\Filter\CriterionQueryBuilder interface
For a Sort Clause to support Repository filtering, it must:
implement the \eZ\Publish\SPI\Repository\Values\Filter\FilteringSortClause interface
have a query builder implementing the \eZ\Publish\SPI\Repository\Values\Filter\SortClauseQueryBuilder interface.
Implementation details
Implementation aims to be a separate module, so expect its parts mostly in dedicated namespaces prefixed either as Filter or Filtering.
One of the most notable things is probably the existence of \eZ\Publish\SPI\Persistence\Filter\Doctrine\FilteringQueryBuilder, which extends the Doctrine DBAL Query Builder. It just felt right with the current architecture to add a dedicated responsibility to the query builder rather than introducing a separate service. The Query Builder has a BC promise, because it is injected into Query Builder implementations of Criteria and Sort Clauses. While extending a 3rd party implementation is often risky, I haven't found any indication that this component is either going to change in the future or is intended to be internal.
The implementation relies on JOINs combined with WHERE constraints for filtering data, as this usually proves to be more efficient than the WHERE IN (subquery) approach. Both handlers and a mapper were written specifically for this feature, as the data returned by gateways is a bit different. It was also an opportunity to use the current stack's domain language in the data set. Still, when it was clean and possible, existing mappers were used to map data.
Changes to known patterns
\eZ\Publish\API\Repository\Values\Content\ContentList is a Value, but instead of controlling properties by the usual ValueObject magic to avoid overriding them, a proper getter has been introduced.
@internal so it's clear that is not supported outside of Kernel.
TODO
LocationService
Provide caching support // not feasible ATM, needs to be a separate Spike
Implement needed Criteria and Sort Clauses (TBD) // re-using Content Query Criteria with proper markers
Follow-ups (tracked by the EZP-31711 Epic)
Conclusion
There are still things to improve, but this seems like a reasonable MVP, covering a lot of Criteria and Sort Clauses.
Any Reviewer who knows the Repository a bit could see a lot of similarities between the Legacy Search Engine and this implementation. So the natural question would be - why didn't we reuse more of the internal structure of LSE? Or maybe even, why didn't we simply expose LSE in a search settings-independent way? The reason was an attempt to make a more robust implementation by relying more on JOINs than on the WHERE content.id IN (subquery) pattern. As a side effect, we've got rid of the LSE limitation on using some Location Criteria when searching for Content.
QA
This is a PHP API that can be used to retrieve Content or Location item lists from Repository.
Example EE project which relies on volume testing data set can be found in that diff and more specifically with that sample Filtering query.
Checklist:
$ composer fix-cs
).