Add pagination to HttpOperator and make it more modular #34669
CI still fails, but as I reviewed and commented on the first PR, I also want to provide some early feedback on the follow-up, knowing it is in DRAFT.
- I assume there are some typing glitches, but I'll leave this feedback to the CI. MyPy does a better job than me at hunting for problems. Some signatures do not seem to match to me; I can do a code-level review later. Otherwise, you might save time by running `breeze static-checks` locally.
- Thanks for separating this extension out into a dedicated extended operator. This is better than adding more complexity to the `SimpleHttpOperator`.
- The current implementation names the operator "extended", which is kind of abstract, while the extension is very specific to your "pagination" use case. I understand that pagination is your use case, but for me it is a bit of a naming conflict to just call it "extended". I would propose either naming the operator `PaginatedHttpOperator` (as the function signatures of the extension are specific to this case) or renaming the `pagination_function` to a `post_call_hook`, as other extended or generic things might be implemented after the call through a generic hook.
- Speaking of "generic" and the `post_call_hook`: I have not tried it and therefore don't know if there is a gap and what it is, but the existing `SimpleHttpOperator` already carries a parameter called `response_filter`, which can be used to inject any callable, in this case to post-process the call. Have you tried, or thought about, using the `response_filter` as a generic hook to call further HTTP pages and merge the results into the overall response? Is a separate operator required for the pagination use case?
Thanks for the early review! Mypy is indeed still complaining, and the DiscordWebHook too... Will fix that tomorrow :) About your last point: Yes, I considered the
On the name: it's totally up to you (and other reviewers) how that should be named. The reasoning is the following: we have the SimpleHttpOperator, which is a nice toolset for simple calls. Let's have another toolset for complex calls. It has to start somewhere, so, for now, it only has a pagination feature. With a "PaginatedOperator", we don't have a toolset anymore, but the intention is indeed clearer. Will rename! About renaming to On the other side, I agree with adding another generic
@hussein-awala Could you review this one? (as you reviewed #34606)
I wasn't part of the discussion, but I'm not sure why we need two operators. It has the potential to confuse users, and nowhere else in the code do we have something like this.
I'm not blocking the idea; I just want to make sure we are doing it this way because we want to, and not because some old decision binds us. If this is what we want, then let's proceed. If it is not, and we were driven into it by some old convention or decision, then my recommendation is to challenge that first.
Everything in one single Http Operator is indeed simpler! From a user point of view, I get the docstring in my IDE and can opt in to the pagination feature just by providing the required parameter. This was the idea of the first PR. But shouldn't the Or
Somehow this PR seems to be in an intermediate state. The docs, tests and operator in core do not match each other. Is it WIP, or am I misunderstanding something?
I implemented everything back into the SimpleHttpOperator and changed the title. This PR is no longer about an extra operator, but about extending the SimpleHttpOperator. Tests should be OK too.
If we decide to implement the feature in the existing operator, I suggest renaming it to |
For me this contribution looks good, but as I am not a committer, more eyes are needed for review. I propose to get this into 2.8.0.
Anyone else? It looks pretty good for merging :)
* feat: Make SimpleHttpOperator extendable
* feat: Implement ExtendedHttpOperator
* feat: Add sync and async tests for `pagination_function`
* feat: Add example and documentation
* fix: Add missing return statements
* fix: typo in class docstring

Co-authored-by: Jens Scheffler <[email protected]>

* fix: make use of hook property in DiscordWebhookHook
* fix: rename to PaginatedHttpOperator
* fix: Correctly route reference link to PaginatedHttpOperator docs
* fix: makes SimpleHttpOperator types customizable for mypy
* fix: add missing dashes in docs + add missing reference to paginated operator
* fix: add missing reference to `PaginatedHttpOperator`
* feat: implement hook retrieval based on connection id
* feat: Merge PaginatedOperator to SimpleHttpOperator
* fix: Removes mention of PaginatedHttpOperator
* fix: Apply static checks code quality
* fix: Reformulate docs
* feat: Deprecate `SimpleHttpOperator` and rename to `HttpOperator`
* fix: Remove 'HttpOperator' from `__deprecated_classes`

---------

Co-authored-by: Jens Scheffler <[email protected]>
Hello, this PR is an alternative to #34606.
This PR:
- Adds a `pagination_function` to the SimpleHttpOperator: the operator supports a customizable pagination feature. The user injects a callable as `pagination_function`. As long as this callable returns the parameters for a next call, the operator keeps calling the API (with the new parameters).
- Renames the SimpleHttpOperator to HttpOperator and adds a deprecated SimpleHttpOperator class.

Use case
I do a lot of data pulling from APIs. The SimpleHttpOperator is a great tool for that, but:
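To make the `pagination_function` contract described in this PR concrete, here is a minimal, standalone sketch of the loop: keep calling while the callable returns parameters for a next call, and stop when it returns nothing. It does not use Airflow at all. `FAKE_PAGES`, `call_api`, `run_with_pagination` and the cursor-based paging scheme are hypothetical names and assumptions made for illustration, not the operator's actual implementation.

```python
from __future__ import annotations

from typing import Any, Callable, Optional

# Hypothetical in-memory "API": three pages, keyed by the cursor that requests them.
FAKE_PAGES: dict[Optional[str], dict[str, Any]] = {
    None: {"items": [1, 2], "next_cursor": "a"},
    "a": {"items": [3, 4], "next_cursor": "b"},
    "b": {"items": [5], "next_cursor": None},
}


def call_api(params: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for a single HTTP call; returns the decoded response body."""
    return FAKE_PAGES[params.get("cursor")]


def pagination_function(response: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the parameters for the next call, or None when there is no next page."""
    cursor = response["next_cursor"]
    return {"cursor": cursor} if cursor is not None else None


def run_with_pagination(
    params: dict[str, Any],
    paginate: Callable[[dict[str, Any]], Optional[dict[str, Any]]],
) -> list[dict[str, Any]]:
    """Call the API repeatedly for as long as `paginate` returns new parameters."""
    responses = []
    while True:
        response = call_api(params)
        responses.append(response)
        next_params = paginate(response)
        if not next_params:
            break
        # Merge the returned parameters into those of the previous call.
        params = {**params, **next_params}
    return responses


all_responses = run_with_pagination({}, pagination_function)
items = [item for page in all_responses for item in page["items"]]
print(items)
```

In this sketch the callable only sees the previous response, so the paging logic (cursor, offset, next-link header, and so on) stays entirely in user code, which is the opt-in behaviour the description aims for.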