Setting the IndexingPolicy #19540

Merged: 5 commits, Mar 3, 2021

@amarathavale (Contributor) commented Mar 2, 2021

This change sets the collection's indexingPolicy to match the strategy we use for our production workloads: index only the id and partitioningKey fields. (A separate use case, validating composite indexes, will come in another change.)

Validated this change by running the Java CLI command and inspecting the resulting indexingPolicy on the collection:
[Screenshot: CTL collection indexing policy]
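
For readers unfamiliar with this kind of policy, here is a minimal sketch of what "index only id and partitioningKey" could look like as an indexing-policy JSON document. It is plain Java with no Cosmos SDK dependency, and it is not the code from this PR: the class name `IndexingPolicySketch` and the method `toJson` are hypothetical. The path syntax (`/field/?` for a scalar property, `/*` for everything else) is assumed from the Cosmos DB indexing-policy format. The PR does not say whether id needs an explicit entry.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: class and method names are hypothetical, not taken from the PR.
final class IndexingPolicySketch {

    // Builds an indexing-policy JSON document that includes only the given
    // top-level scalar fields and excludes every other path.
    static String toJson(List<String> indexedFields) {
        List<String> included = new ArrayList<>();
        for (String field : indexedFields) {
            // "/field/?" addresses the scalar value stored at that property (assumed syntax).
            included.add("{\"path\":\"/" + field + "/?\"}");
        }
        return "{"
            + "\"indexingMode\":\"consistent\","
            + "\"includedPaths\":[" + String.join(",", included) + "],"
            + "\"excludedPaths\":[{\"path\":\"/*\"}]"
            + "}";
    }

    public static void main(String[] args) {
        // Field names follow the PR description: id and partitioningKey.
        System.out.println(toJson(List.of("id", "partitioningKey")));
    }
}
```

Excluding `/*` and opting specific paths back in keeps the index small for point-read workloads, which is presumably the motivation for mirroring the production strategy here.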

Additional changes in here:

  1. Refactored the existing implementation to support additional 'entities'. Entities such as UserGeneratedContent (where we use composite indexes) have a different document structure from Invitations.
  2. Switched to a random number generator that is always seeded with the same seed, so it produces a predictable sequence of numbers. The previous iteration cached all keys, which increased JVM heap utilization.
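
The idea in item 2 can be sketched as follows. This is a minimal illustration rather than the PR's implementation: `SeededKeySketch`, `nextKey`, and the `doc-` key format are hypothetical. It relies only on the documented behavior of `java.util.Random`: two instances created with the same seed return the same sequence of values for the same sequence of calls.

```java
import java.util.Random;

// Illustrative only: names and key format are hypothetical, not taken from the PR.
final class SeededKeySketch {

    // Derives the next document key from the generator's next value.
    static String nextKey(Random random) {
        return "doc-" + Long.toHexString(random.nextLong());
    }

    public static void main(String[] args) {
        long seed = 42L; // any fixed value works, as long as both sides share it

        // The write phase and the read phase each create their own generator.
        Random writer = new Random(seed);
        Random reader = new Random(seed);

        for (int i = 0; i < 5; i++) {
            String written = nextKey(writer);
            String read = nextKey(reader);
            // The reader regenerates the key instead of looking it up in a cache.
            if (!written.equals(read)) {
                throw new IllegalStateException("key mismatch at index " + i);
            }
        }
        System.out.println("regenerated keys matched written keys");
    }
}
```

Because the read phase can regenerate every key on demand, nothing has to be held in memory between phases, which is where the heap savings would come from.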

Validated these two items by running the workload locally and verifying that the NotFound count is 0.

-- Meters ----------------------------------------------------------------------
GET ctlWorkloadInvitations Document NotFound Operations
count = 0
...
GET ctlWorkloadInvitations Successful Operations
count = 132975
mean rate = 237.10 events/second
GET ctlWorkloadInvitations Unsuccessful Operations
count = 0
...
-- Timers ----------------------------------------------------------------------
GET ctlWorkloadInvitations Latency
count = 132975
...

@ghost added the Cosmos and customer-reported labels Mar 2, 2021

@ghost commented Mar 2, 2021

Thank you for your contribution amarathavale! We will review the pull request and get back to you soon.

@amarathavale (Contributor, Author) commented

Note: This PR can/should be merged before #19515 for a couple of reasons: this change is more urgent, and I can handle the merge conflict in the other one after this is committed.

@moderakh (Contributor) left a comment

LGTM. Please wait for @simplynaveen20's review and sign-off.

@simplynaveen20 (Member) left a comment

Question: "Refactored the existing implementation to support additional 'entities'. Like Invitations, entities such as UserGeneratedContent (where we use composite indexes) have a different document structure." Are we creating another set of documents for these entities alongside the invitations? That is, if we load 100 documents, will the backend hold 100 (invitations) + 100 (entities)?

@simplynaveen20 (Member) left a comment

Thanks for the memory issue fix. LGTM.

@simplynaveen20 merged commit 7b342e3 into Azure:master Mar 3, 2021
@amarathavale deleted the linkedin/ctl-collection-indexing-policy branch March 7, 2021 13:59