Add OpenAI Provider #35023

Merged
merged 24 commits into from
Nov 7, 2023

utkarsharma2
Contributor

This PR is part of our larger effort, presented at Airflow Summit, to add first-class integrations that support LLMOps.

This PR explicitly adds the OpenAI provider. OpenAI is an American artificial intelligence organization that offers ChatGPT, one of the most widely used LLMs, as well as embedding models.

The primary objective of this provider is to give users an alternative embedding model. This lets them generate vectors for their proprietary data, a pivotal step toward building integrations with LLMs like ChatGPT.

Example DAG:
The OpenAIEmbeddingOperator can accept either a string or a callable returning a list of strings.

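from airflow.providers.openai.operators.openai import OpenAIEmbeddingOperator

# xcom_text is assumed to be the output of an upstream task.
OpenAIEmbeddingOperator(
    task_id="embedding_using_xcom_data",
    conn_id="openai_default",
    input_text=xcom_text["input_text"],
    model="text-embedding-ada-002",
)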
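
The input handling described above (a string, or a callable returning a list of strings) could be normalized roughly as follows before the texts are sent to the embeddings API. This is only a sketch under assumptions: `resolve_input_text` and the `InputText` alias are hypothetical names used for illustration and are not part of the provider's code.

```python
from typing import Callable, List, Union

# Hypothetical alias for the kinds of input the description mentions.
InputText = Union[str, List[str], Callable[[], List[str]]]


def resolve_input_text(input_text: InputText) -> List[str]:
    """Hypothetical helper: normalize the input into a list of strings."""
    # A callable is invoked lazily so the texts are produced at execution time.
    if callable(input_text):
        input_text = input_text()
    # A bare string is treated as a single text to embed.
    if isinstance(input_text, str):
        input_text = [input_text]
    texts = list(input_text)
    # Reject empty input early instead of sending it to the API.
    if not texts or not all(isinstance(t, str) and t.strip() for t in texts):
        raise ValueError("input_text must resolve to one or more non-empty strings")
    return texts
```

Resolving the callable inside the helper keeps DAG parsing cheap, since the texts are only computed when the task actually runs.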

The mailing-list discussion related to this effort can be found here: https://lists.apache.org/thread/0d669fmy4hn29h5c0wj0ottdskd77ktp

@utkarsharma2 utkarsharma2 marked this pull request as ready for review October 26, 2023 13:50
Member

@pankajastro pankajastro left a comment


@pankajastro pankajastro requested a review from eladkal November 4, 2023 11:52
@pankajastro pankajastro merged commit cca4aa4 into apache:main Nov 7, 2023
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Nov 10, 2023
* Add OpenAI Provider

* Apply suggestions from code review

Co-authored-by: Phani Kumar <[email protected]>

* Remove create_completions method from hook

* Change type of input_text param

Since the upstream API accepts a str or a list of tokens, we accept similar inputs from the user.

* Updated min-airflow version to 2.5.0

* Updated the interface and fix docs and static files

* Fix tests

* Fix tests

* Change the version

Because the OpenAI SDK is not production-ready

* Add embedding_kwargs as a param to operator

* Update tests/providers/openai/hooks/test_openai.py

Co-authored-by: Pankaj Singh <[email protected]>

* Remove unwanted params in docstring

* Update Changelog

* Add security.rst file

* Update docs/apache-airflow-providers-openai/index.rst

Co-authored-by: Pankaj Singh <[email protected]>

* Add host field for connections

* Update docs/apache-airflow-providers-openai/index.rst

Co-authored-by: Pankaj Singh <[email protected]>

* Add changelog.rst file to docs

* Change version to 1.0.0

* Resolve conflicts

* Fix tests

* Fixed tests

* Fix test

* Resolve Conflict

---------

Co-authored-by: Pankaj Koti <[email protected]>
Co-authored-by: Phani Kumar <[email protected]>
Co-authored-by: Pankaj Singh <[email protected]>
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Nov 20, 2023
Labels
area:dev-tools area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) kind:documentation