Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Complex jexl issues (refactor(forms): rewrite structure and jexl evaluator) #2356

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

winged
Copy link
Contributor

@winged winged commented Jan 14, 2025

refactor(forms): rewrite structure and jexl evaluator

The new structure / jexl evaluator works a bit differently: Instead of
trying to replace evaluation contexts during recursive evaluation (for
example is_hidden checks), we now have a local JEXL runtime for each
field. Also, the JEXL expressions (or their results, rather) are heavily
cached and should speed things up significantly.

Regarding the test cases:

We're trying to keep the test cases' meaning 100% unchanged - the only
modifications currently are some improved assertion messages, so
debugging becomes easier, as well as refactoring some for better readability.

Some tests are extended, and some are now better documented, to cover
more aspects and explain in more detail what our assumptions and
expectations actually are.

BREAKING CHANGE: Code that uses the form jexl and / or structure code
most likely will need to be rewritten. The changes are small-ish, but
still semantically not exactly equal.

refactor: rewrite the calculated-question code to use the new structure

The whole updating code for calculated fields was rather complex and
had quite a few subtle bugs. With the new structure, we have infrastructure
in place to build the same behaviour in a much better, more reliable way.

@winged winged force-pushed the complex_jexl_issues branch 3 times, most recently from 54e63bd to 2007067 Compare January 24, 2025 14:56
@winged winged force-pushed the complex_jexl_issues branch 14 times, most recently from 15ea4c2 to 931f0f7 Compare February 18, 2025 15:15
@winged winged requested a review from open-dynaMIX February 18, 2025 15:17
@winged winged changed the title Fix: Complex jexl issues Fix: Complex jexl issues (refactor(forms): rewrite structure and jexl evaluator) Feb 18, 2025
@winged winged requested review from czosel and nlzet February 18, 2025 15:17
@winged
Copy link
Contributor Author

winged commented Feb 21, 2025

Note to the reviewers: This is a complete rewrite of caluma_form/structure.py and caluma_form/jexl.py, so I'd suggest not looking at the diff, and just look at the code "as it were new"

Comment on lines 497 to 498
# TODO: That 1 should probably be 0 - there are no answers around
# in the "bare" document fixture, right?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed.

Comment on lines 368 to 369
# TODO: This test fails because our new _extend_context() likely doesn't properly
# update the chainmaps as expected
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed.

# Root field is always visible
return False

# do_raise = self.all_dependencies_hidden(self.question.is_hidden)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed

Comment on lines 170 to 174
# log.info(
# "JEXL: evaluating expression <<< %s >>> in context: %s",
# str(expression),
# str(dict(context)),
# )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed

@winged winged force-pushed the complex_jexl_issues branch from 931f0f7 to 6c03ffe Compare February 21, 2025 14:02
Copy link

@nlzet nlzet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@winged as mentioned mostly some minor typo changes and a few questions to double check

"""
result = []
for formfield in self.get_all_fields():
if formfield.question and formfield.slug() == slug:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formfield.slug() already checks for formfield.question presence, so check here might be redundant ?

@@ -103,7 +103,10 @@ def remove_calc_dependents(sender, instance, **kwargs):
@filter_events(lambda instance: instance.type == models.Question.TYPE_CALCULATED_FLOAT)
@filter_events(lambda instance: getattr(instance, "calc_expression_changed", False))
def update_calc_from_question(sender, instance, created, update_fields, **kwargs):
for document in models.Document.objects.filter(form__questions=instance):
# TODO: we need to find documents that contain this form as a subform
# as well. Tis would only find documents where the question is attached
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo Tis should be This ?

@@ -113,5 +116,8 @@ def update_calc_from_question(sender, instance, created, update_fields, **kwargs
lambda instance: instance.question.type == models.Question.TYPE_CALCULATED_FLOAT
)
def update_calc_from_form_question(sender, instance, created, **kwargs):
for document in instance.form.documents.all():
# TODO: we need to find documents that contain this form as a subform
# as well. Tis would only find documents where the question is attached
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, minor typo Tis should be This ?

else:
parent_data = None
return {
# "_type": type(self).__qualname__,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commented line could be removed ?


# no value, no special handling
return None
In JEXL, the parent refers to th next field up that represents another
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo th should be the ?

):
"""Test saving a document via the Python API.

For detailled explanation about the expected behaviour, see the docs for
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo detailled should be detailed ?

if not hasattr(self, "_memoise"):
self._memoise = {}
self._memoise_hit_count = 0
self._memoise_miss_count = 0

key = str([args, kwargs, method])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be some extra sorting of arguments before generating the key here, in order to prevent getting a miss with the same values with a different order ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it were fully generic, absolutely. But it's intended for a relatively small scope, so I think it's actually fine.

Comment on lines 374 to 559
if self.answer.date:
return self.answer.date
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to check for the date answer without checking the question type? Also see similar remark below


@object_local_memoise
def get_value(self):
if self.is_hidden() or self.is_empty():
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it makes sense to either:

  • switch the is_hidden and is_empty around
  • or remove is_hidden altogether

Since is_empty will already check for is_hidden

yield formfield
if isinstance(formfield, FieldSet):
yield from formfield.get_all_fields()
if isinstance(formfield, RowSet):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

elif instead of if since it cannot be both I guess ?

Comment on lines +62 to +66
else:
# No default arg, so we must raise an exception
raise QuestionMissing(
f"Question `{question_slug}` could not be found in form {self.field.get_form()}"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the behavior has slightly changed (exception raised, instead of continuing to is_hidden check), I think it would be good to document this for the release (some jexls will need to be updated).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leads to issues in the current forms, and it is kind of difficult to find and adapt all affected question configs.

Comment on lines 176 to 177
# TODO how is "root" expected to behave if we're *already* on root?
"root": self._get_root().get_local_info_context() if self.parent else None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, in other places the root points to itself / contains itself, if we're already on root

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean the form family? Yes - but this is a bit different. I'd keep it as is here.

Comment on lines +306 to +474
# TODO: update / save answer
# TODO: reset caches in all dependents (calc dependents are easy, but what
# about the rest? Like visibility dependents etc?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these still open TODOs?

self._own_fields = {}

if parent:
# TODO This likely causes a circular dependency - Verify
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still an open TODO?

field = self.fields.get(question_slug)
def is_required(self) -> bool:
# Fieldsets (in other words - subforms) should never be required.
# TODO: Verify this assumption
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open TODO?

answer=answers.get(fq.question.slug),
parent=self,
# This should already be sorted, as the context buildup
# is doing that for us. TODO: Verify this claim
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open TODO?

Comment on lines +105 to +109
validation_context = validation_context or structure.FieldSet(document.family)

jexl = QuestionJexl(validation_context)
with jexl.use_field_context(
validation_context["structure"].get_field(question.slug)
):
return [o.slug for o in options if not jexl.evaluate(o.is_hidden)]
my_field = validation_context.find_field_by_document_and_question(
document.pk, question.pk
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During manual testing, the provided validation_context doesn't seem to always be a FieldSet. For example when passing a ValueField, the method find_field_by_document_and_question isn't found.

@@ -312,165 +311,110 @@ def validate(
if not validation_context:
validation_context = self._validation_context(document)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we removed _validation_context(...), would it be get_validation_context(...)?

@winged winged force-pushed the complex_jexl_issues branch from 7fff11b to 78e60e8 Compare February 25, 2025 16:39
Table rows were sorted, but backwards; questions were not sorted at all,
and thus might have lead to unpredictable behaviour. We noe explicitly
sort this correctly, therefore making things a bit more testable.
Calculated questions do not work correctly when located inside a table row:
The recalculation is currently triggered on the root document, which will
only find one of the rows, and update that - while likely ignoring the row
where the actual dependency is located.

This test is intended to demonstrate the problem, and thus will currently
fail.
@winged winged force-pushed the complex_jexl_issues branch 2 times, most recently from 8291232 to e039193 Compare February 26, 2025 10:02
The new structure / jexl evaluator works a bit differently: Instead of
trying to replace evaluation contexts during recursive evaluation (for
example `is_hidden` checks), we now have a local JEXL runtime for each
field. Also, the JEXL expressions (or their results, rather) are heavily
cached and should speed things up significantly.

Test cases:

We're trying to keep the test cases' meaning 100% unchanged - the only
modifications currently are some improved assertion messages, so
debugging becomes easier, as well as refactoring some for better readability.

Some tests are extended, and some are now better documented, to cover
more aspects and explain in more detail what our assumptions and
expectations actually are.

BREAKING CHANGE: Code that uses the form jexl and / or structure code
most likely will need to be rewritten. The changes are small-ish, but
still semantically not exactly equal.
The whole updating code for calculated fields was rather complex and
had quite a few subtle bugs. With the new structure, we have infrastructure
in place to build the same behaviour in a much better, more reliable way.

TODO: This is currently not yet fully optimized, and we're doing quite a few
more queries than before. Also TODO: A few issues were discovered that still
need to be addressed - namely calculated questions not attached to a root form.
Introduce a FastLoader class that is able to preload a full
document/form structure into memory, with as few and simple queries
as possible.

This reduces the number of DB hits the code needs to perform during
document validation.

Note some tests had to be fixed as well - adding family attributes
to some row documents, as the fast-loader is even more picky than the
structure code was about it.
This allows you, during debugging, to get the exact location of a field
within a form / document structure.
@winged winged force-pushed the complex_jexl_issues branch from e039193 to 0c11183 Compare February 26, 2025 10:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants