refactor: normalize component types with ComponentType #148

ormsbee · 2024-01-31T20:17:04Z

Prior to this commit, Components encoded their namespace + type info as
two columns on the Component model itself. This wasted a lot of space on
a very common index lookup: many rows of ('xblock.v1', 'video'),
('xblock.v1', 'problem'), and ('xblock.v1', 'text'). But worse, the lack
of a separate ComponentType model meant that there was no first-class
entity to store per-type policy data against. For example, we might want
to say "the following types are supported by libraries, while these
other types are experimental", or "these types are enabled for these
orgs", etc.

Components are required to have a ComponentType. We're rewriting the
first migration for the components app here, since this app hasn't been
added to edx-platform yet.

ormsbee · 2024-01-31T20:17:41Z

Supersedes #147

kdmccormick

Just a couple Qs, otherwise looks great.

openedx_learning/core/components/models.py

kdmccormick · 2024-02-01T18:08:20Z

openedx_learning/core/components/api.py

+
+
+@lru_cache(maxsize=128)
+def get_or_create_component_type_id(namespace: str, name: str) -> int:


Even though it returns a pk, I think I would mildly prefer get_or_create_component_type, since you're going to create a "component type", not a "component type id".

Yeah, it's not great, but I decided to keep it with _id at the end because of all the places we're returning a full model.

Honestly, I'm currently mulling over how this part works in my current PR. I realized that the way things are now, the cache is vulnerable to getting corrupted when there are errors, since we might cache a value for a ComponentType that was later rolled back. So I might have to rethink this setup entirely. But I'll do that in the next PR. 😛

Ah. Would the rolled-back id be mis-cached in the lru_cache, or somewhere else?

Yeah, in lru_cache.
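For readers following along, the failure mode described above can be reproduced outside Django. The following is only a minimal sketch, assuming an in-memory SQLite table as a stand-in for the ComponentType model; the table layout and the raw SQL are illustrative assumptions, not the code in this PR.

```python
import sqlite3
from functools import lru_cache

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN / ROLLBACK statements below fully control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE component_type ("
    "id INTEGER PRIMARY KEY, namespace TEXT, name TEXT, "
    "UNIQUE (namespace, name))"
)


@lru_cache(maxsize=128)
def get_or_create_component_type_id(namespace: str, name: str) -> int:
    """Return the row id for (namespace, name), inserting the row if needed."""
    row = conn.execute(
        "SELECT id FROM component_type WHERE namespace = ? AND name = ?",
        (namespace, name),
    ).fetchone()
    if row is not None:
        return row[0]
    cursor = conn.execute(
        "INSERT INTO component_type (namespace, name) VALUES (?, ?)",
        (namespace, name),
    )
    return cursor.lastrowid


conn.execute("BEGIN")
cached_id = get_or_create_component_type_id("xblock.v1", "video")
conn.execute("ROLLBACK")  # the new row is gone from the database...

# ...but the process-wide cache still answers with the rolled-back id.
stale_id = get_or_create_component_type_id("xblock.v1", "video")
row_count = conn.execute("SELECT COUNT(*) FROM component_type").fetchone()[0]
```

After the rollback the table is empty, yet the second call is served from the cache and returns an id that no longer refers to any row.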

I don't expect that creating ComponentType instances will occur outside of create_component. So I suggest that you just put the ComponentType.objects.get_or_create( call inline (without a cache) inside create_component. Then if it gets rolled back, no problem.

If this gets simplified to @lru_cache def get_component_type_id() then you have two nice simplifications:

If the ComponentType is not found, it won't be cached, due to the ComponentType.NotFound exception

You can optimize lookups like this:

    Component.with_publishing_relations \
        .get(
            learning_package_id=learning_package_id,
            component_type__namespace=namespace,
            component_type__name=type_name,
            local_key=local_key,
        )

becomes

    Component.with_publishing_relations \
        .get(
            learning_package_id=learning_package_id,
            component_type_id=get_component_type_id(namespace, type_name),
            local_key=local_key,
        )

which will then cause your function to throw differentiated exceptions: Component.NotFound OR ComponentType.NotFound vs. only throwing Component.NotFound. Arguably more explicit (but maybe a pain if callers need to plan to catch both).

^ Nice, I like that.

I think that I'm going to just kick the lru to the curb entirely for now. The worry I have is that the lru bleeds across transactions and probably even threads (?), and we don't know what combination of reads and writes of Components people are going to do in a big transaction (e.g. import). So it's possible that a process will add a bunch of stuff (including new ComponentTypes), then do some reads of Components because they're grouping them somehow (thus loading things into the LRU cache). Then their process dies, and gets rolled back.

I guess the semantics I'm really looking for are transaction-local caches. Which actually sounds really cool now that I think about it... but I'm definitely not going there this week. A much simpler version of that would be a local object that proxies the requests through and goes out of scope at the end of whatever import operation happens. But things will probably be fine if I just punt on caching for a while.
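One way to read the "local object that proxies the requests" idea is a small cache whose lifetime matches a single import call, so nothing cached survives that operation. This is a rough sketch under that assumption only; ComponentTypeIdCache, import_components, and the fetch_id callable are all hypothetical names, not part of this PR.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class ComponentTypeIdCache:
    """Hypothetical per-operation cache for (namespace, name) -> id lookups."""

    def __init__(self, fetch_id: Callable[[str, str], int]) -> None:
        self._fetch_id = fetch_id
        self._ids: Dict[Tuple[str, str], int] = {}

    def get_id(self, namespace: str, name: str) -> int:
        key = (namespace, name)
        if key not in self._ids:
            # Proxy the request through only on the first lookup of this key.
            self._ids[key] = self._fetch_id(namespace, name)
        return self._ids[key]


def import_components(
    rows: Iterable[Tuple[str, str, str]],
    fetch_id: Callable[[str, str], int],
) -> List[Tuple[int, str]]:
    """Hypothetical import loop: the cache lives only for this call."""
    cache = ComponentTypeIdCache(fetch_id)
    return [
        (cache.get_id(namespace, name), local_key)
        for namespace, name, local_key in rows
    ]


# Usage with a fake lookup that counts how often it is called.
fetch_calls = []


def fake_fetch_id(namespace: str, name: str) -> int:
    fetch_calls.append((namespace, name))
    return len(fetch_calls)


result = import_components(
    [
        ("xblock.v1", "video", "a"),
        ("xblock.v1", "video", "b"),
        ("xblock.v1", "problem", "c"),
    ],
    fake_fetch_id,
)
```

Because the cache object is created inside import_components, it goes out of scope when the call returns, whether the surrounding transaction commits or rolls back.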

ormsbee requested review from bradenmacdonald and kdmccormick January 31, 2024 20:17

fix: remove incomplete comment thought

fc33d66

kdmccormick approved these changes Feb 1, 2024

ormsbee merged commit 8d03e22 into openedx:main Feb 1, 2024
7 checks passed

ormsbee deleted the component-types-4 branch February 1, 2024 20:26

This was referenced Feb 2, 2024

Discovery: Determine how v2 Content Libraries can make use of Learning Core #30

Closed

Switch v2 libraries to Learning Core data models openedx/edx-platform#34066

Merged

ormsbee self-assigned this Feb 8, 2024
