Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(common/models): convert traversal.children method to return array 📖 #12089

Merged

Conversation

jahorton
Copy link
Contributor

@jahorton jahorton commented Aug 2, 2024

At present, our correction and prediction search methods are not leveraging the generator-based form of the children() field on the LexiconTraversal type. When we ask for it, we always utilize every entry in the array in some form. So, why continue to maintain the generator pattern if we can just return an array instead?

To support #11994, this also adds a new entry to the returned objects: p, the probability of the highest-frequency entry on the node's path. We've already been returning it on the node itself, but there can be benefits to making it available before constructing the traversal itself; we can compare against the value and avoid overhead if we decide not to take the path, rather than constructing it and only then checking.

Part of the motivation here:

/**
* Provides an iterable pattern used to search for words with a 'keyed' prefix matching
* the current traversal state's prefix when a new character is appended. Iterating
* across `children` provides 'breadth' to a lexical search.
*
* For an example with English, if the current traversal state corresponds to 'th',
* children() may return an iterator with states corresponding 'e'
* (for 'the', 'then', 'there'), 'a' (for 'than', 'that'), etc.
*
* @param char A full character (a UTF-16 code point, which may be comprised of two
* code units) corresponding to the child node, appended to the current
* node's prefix to produce the child node's prefix.
* <p>
* Example: if the current traversal's represented prefix is 'th',
* char = 'e' would indicate a child node with prefix = 'the'.
* @param traversal a LexiconTraversal starting from (or "rooted at") the child node. Use
* of the returned object provides 'depth' to a lexical search.
* <p>
* Example:
*
* - Suppose our current LexiconTraversal represents a prefix of 'th'.
* - If `char` = 'e', the child represents a prefix of 'the'.
* - Then `traversal` allows traversing the part of the lexicon prefixed by 'the'.
*/
children(): Generator<{char: USVString, traversal: () => LexiconTraversal}>;

Examining the current specification of the method (within the linked block above)... note that nowhere within that spec is a requirement or guarantee that the entries be in sorted order. Granted, our current implementation certainly does enforce this for Trie wordlist models. Interestingly, upon further examination... there's not actually a need to even require that the entries be sorted. We generally toss them into a priority queue, and they'll be "sorted" there.

Specifying p here gives us the basis to enforce a sort if and needed, without having to call the traversal-generating functions in advance. There are other cases (such as for model-blending) where the p value is needed more as a "filter" than as a sorting condition; it's useful for that as well.

@keymanapp-test-bot skip

@jahorton jahorton requested a review from mcdurdin as a code owner August 2, 2024 07:25
@@ -160,10 +160,12 @@ export class TrieTraversal implements LexiconTraversal {
}
}

*children(): Generator<{char: USVString, traversal: () => LexiconTraversal}> {
children(): {char: USVString, p: number, traversal: () => LexiconTraversal}[] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably also faster, if we are not leveraging generator for partial results!

Base automatically changed from refactor/common/models/traversal-model to epic/user-dict August 12, 2024 02:08
@jahorton jahorton merged commit 424d68a into epic/user-dict Aug 12, 2024
14 of 15 checks passed
@jahorton jahorton deleted the refactor/common/models/traversal-children-as-array branch August 12, 2024 02:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants