Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate TestPlanVersions #621

Merged
merged 14 commits into from
Jul 18, 2023
Merged

Deduplicate TestPlanVersions #621

merged 14 commits into from
Jul 18, 2023

Conversation

alflennik
Copy link
Contributor

Currently the ARIA-AT App imports all 35 test plans whenever a new commit is detected in the ARIA-AT repo. This resulted in a lot of duplicated data. This PR detects when a test plan actually changed and only imports it in that case. Also, it deletes all the previously imported duplicate test plan versions.

In production, currently there are around 2200 test plan versions. After this PR is merged, there will only be the smallest possible number needed, which is 127 test plan versions. In other words, 95% of the data in the table was entirely duplicated. The smaller size is obviously much more manageable, and improves performance quite a lot. While testing queries that required loading the complete list of test plan versions, I found that performance increased by around 10x, from more than three seconds to only a few hundred milliseconds.

Since this PR vastly reduces the noise, it exposes an existing issue: as I was looking through the test plan versions listed on the Test Queue page of the app, I noticed that a couple commit messages were referring to the wrong test plan. Upon investigation I found that these mislabelings were occurring in the case that multiple PRs were merged at roughly the same time in the ARIA-AT repo. Our system does an import every hour, and so changes imported at that time will be attributed to whatever the latest commit is, even if there were several commits.

Before releasing, we will need to make a copy of the production database. Since the migration in the PR deletes the majority of the data in the database, it is probably much more risky than any migration we've introduced in the past. That said, the migration was tested on a recent copy of the production database without issue.

@alflennik alflennik requested review from evmiguel and howard-e May 2, 2023 22:01
@alflennik
Copy link
Contributor Author

Also worth mentioning: before, the database was around 286MB and it's now 22MB, so the overall database shrunk to 7% of the previous size.

@evmiguel
Copy link
Contributor

evmiguel commented May 8, 2023

@alflennik When running the migration, I get this error:

Loaded configuration file "config/config.js".
Using environment "development".
== 20230501220810-deduplicateTestPlanVersions: migrating =======
Indexing 0 of 35
Indexed 35 of 35
Fixing 0 of 35

ERROR: syntax error at or near ")"

My database isn't based on production but rather the latest imported tests.

Copy link
Contributor

@howard-e howard-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work @alflennik! This really helps to keep our data lean and reduce a lot of clutter. Huge plus that this also helps to address #579 at the same time!

I observed a few functional issues as well listed below. Key of which, being the Migration Inconsistencies:

Migration Inconsistencies

  • (Should now be addressed by 11886a9)
  • Using production data from a April 13th, 2023 backup dump, I’ve noticed there are inconsistencies in the reported values shown for Toggle Button (JAWS) and Toggle Button (VoiceOver) on both the Candidate Tests Page and the Reports Page
    • Toggle Button (16 Tests) for JAWS
      • Previously, No assertions
      • Now, 14 assertions failed across 6 tests run with 1 browser
    • Toggle Button (8 Tests) for VoiceOver
      • Previously, 18 assertions failed across 6 tests run with 1 browser
      • Now, 23 assertions failed across 7 tests run with 1 browser
  • I also checked using our production data from a backup dump today, May 8th 2023 to see if this could be safely "sided-stepped" for now as a fluke, but I noticed there are also inconsistencies with the Modal Dialog (JAWS) and Toggle Button (JAWS) and Toggle Button (VoiceOver) which can be seen on the Candidate Tests Page and the Reports Page

Alphabetical Ordering

  • When adding a new item to the Test Queue, the Test Plans in the dropdown are no longer in an alphabetical order (Should now be addressed by 7905cad)
    • (We may also want to address the Test Plan Target sections + the Test Plans in each section also not being in alphabetical order but this can be done in a separate issue)

Testing

  • Can also confirm I experienced the issue reported by @evmiguel when not using the production data. (Should now be addressed by 5fb8aae)

Edit on Jun 22, 2023

Test plan run pages fails to properly load

  • A crashing bug I found was also when opening some TestPlanRuns, the tests would fail to load because they were attempting to reference TestPlanVersions which no longer exist. (Should now be addressed by 82f9c4c).

@howard-e howard-e force-pushed the deduplicate-test-plan-versions branch from 68f6388 to e82e10e Compare June 21, 2023 16:30
@howard-e
Copy link
Contributor

Confirming I no longer experience the following issue reported by Erika and myself with the commit, 5fb8aae:

Loaded configuration file "config/config.js".
Using environment "development".
== 20230501220810-deduplicateTestPlanVersions: migrating =======
Indexing 0 of 35
Indexed 35 of 35
Fixing 0 of 35

ERROR: syntax error at or near ")"

Copy link
Contributor

@howard-e howard-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for this @alflennik!

I've since added some commits to this PR to address some of the items found:

Migration Inconsistencies

  • Using production data from a April 13th, 2023 backup dump, I’ve noticed there are inconsistencies in the reported values shown for Toggle Button (JAWS) and Toggle Button (VoiceOver) on both the Candidate Tests Page and the Reports Page

    • Toggle Button (16 Tests) for JAWS

      • Previously, No assertions
      • Now, 14 assertions failed across 6 tests run with 1 browser
    • Toggle Button (8 Tests) for VoiceOver

      • Previously, 18 assertions failed across 6 tests run with 1 browser
      • Now, 23 assertions failed across 7 tests run with 1 browser
  • I also checked using our production data from a backup dump today, May 8th 2023 to see if this could be safely "sided-stepped" for now as a fluke, but I noticed there are also inconsistencies with the Modal Dialog (JAWS) and Toggle Button (JAWS) and Toggle Button (VoiceOver) which can be seen on the Candidate Tests Page and the Reports Page

Should now be addressed by 11886a9. Note that work being done for #950 should also eliminate the need for these checks in the future.

Alphabetical Ordering

  • When adding a new item to the Test Queue, the Test Plans in the dropdown are no longer in an alphabetical order

Should now be addressed by 7905cad. Also took the chance to alphabetically sort the list of testers which are assigned as well.

Running with Sample data

Reported by Erika: When running the migration, I get this error:

Loaded configuration file "config/config.js".
Using environment "development".
== 20230501220810-deduplicateTestPlanVersions: migrating =======
Indexing 0 of 35
Indexed 35 of 35
Fixing 0 of 35

ERROR: syntax error at or near ")"

My database isn't based on production but rather the latest imported tests.

Should now be addressed by 5fb8aae

Some Test Plan Run pages fails to properly load

  • A crashing bug I found was also when opening some TestPlanRuns, the tests would fail to load because they were attempting to reference TestPlanVersions which no longer exist.

Should now be addressed by 82f9c4c

Copy link
Contributor

@jugglinmike jugglinmike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @howard-e! I'm going to need a little more time to verify this functionally; in the mean time, I'm sharing some suggestions for the implementation.

Comment on lines 255 to 257
assertions: test.assertions.map(assertion => ({
...omit(assertion, ['id'])
})),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assertions: test.assertions.map(assertion => ({
...omit(assertion, ['id'])
})),
assertions: test.assertions.map(assertion => omit(assertion, ['id'])),

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be addressed in 2324298. It's been moved to server/util/aria.js.

@@ -244,6 +246,22 @@ const updateAtsJson = async ats => {
);
};

const hashTests = tests => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we define this function as a utility so it's not duplicated in the migration? Also could we test it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been moved and tests have been added in cd6ee71.

);
const [[{ count: testPlanVersionCount }]] = results;

const iterationsNeeded = Math.ceil(testPlanVersionCount / 10);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ten is something of a magic number right now. Can we get const batchSize = 10; or similar?

Comment on lines 124 to 126
offset + 100 < uniqueHashCount
? offset + 100
: uniqueHashCount
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
offset + 100 < uniqueHashCount
? offset + 100
: uniqueHashCount
Math.min(offset + batchCount, uniqueHashCount)

});
};

return queryInterface.sequelize.transaction(async transaction => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performing two distinct operations in the same scope makes this function harder to read and write than it needs to be. If it were instead expressed as a composition of two functions (e.g. computeHashes and removeDuplicates), this would be more self-documenting, and it would no longer need to resort to awkward binding names like iterationsNeeded2.


const keptTestPlanVersion =
await queryInterface.sequelize.query(
`SELECT id, tests FROM "TestPlanVersion" WHERE id = ? LIMIT 1`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`SELECT id, tests FROM "TestPlanVersion" WHERE id = ? LIMIT 1`,
`SELECT id, tests FROM "TestPlanVersion" WHERE id = ?`,

@howard-e
Copy link
Contributor

I'm sharing some suggestions for the implementation.

@jugglinmike thanks for the feedback and suggestions. The most recent commits should address your comments.

…erating the various result ids because the source tests.scenarios are further broken apart by their respective ATs
Copy link
Contributor

@jugglinmike jugglinmike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functional testing complete (I successfully ran the migrations on a local instance, verifying that they had the desired effect). Nice work, @howard-e!

@howard-e howard-e merged commit 382795e into main Jul 18, 2023
@howard-e howard-e deleted the deduplicate-test-plan-versions branch July 18, 2023 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants