Create VDK Confluence Data Source #3048
Labels
initiative: VDK for Private AI
Initiative covering the effort to support VMware's Private AI use cases with VDK
The goal of this issue is to create a Confluence data source for the Versatile Data Kit (VDK). This data source will enable VDK to fetch and ingest data from Confluence spaces and documents.
A VDK Data Source encapsulates how data from a given source is ingested.
Requirements
Generate a new plugin, vdk-confluence, using https://github.com/versatile-data-kit-dev/new-vdk-data-source
Data Source Implementation
Establish a connection to Confluence using provided URL and tokens.
Fetch specific documents or entire spaces based on provided IDs or keys.
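The connection and fetch requirements above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the plugin's actual design: ConfluenceConfig, build_request, page_request, space_pages_request and fetch_json are hypothetical names. It assumes a Confluence Server / Data Center style REST API under /rest/api/content and a personal access token sent as a Bearer token; Confluence Cloud typically prefixes the path with /wiki and uses different authentication, so the details would need adjusting.

```python
import json
import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ConfluenceConfig:
    """Hypothetical connection settings; not part of the VDK API."""
    base_url: str  # e.g. "https://confluence.example.com" (assumed)
    token: str     # personal access token, sent as a Bearer token (assumed)


def build_request(config: ConfluenceConfig, path: str,
                  params: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build an authenticated GET request without sending it."""
    url = config.base_url.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {
        "Authorization": f"Bearer {config.token}",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def page_request(config: ConfluenceConfig, page_id: str) -> urllib.request.Request:
    """Request for a single document, identified by its content ID."""
    return build_request(config, f"/rest/api/content/{page_id}",
                         {"expand": "body.storage,version"})


def space_pages_request(config: ConfluenceConfig, space_key: str,
                        start: int = 0, limit: int = 25) -> urllib.request.Request:
    """Request for one batch of pages in a space, identified by its key."""
    return build_request(config, "/rest/api/content", {
        "spaceKey": space_key,
        "type": "page",
        "start": start,
        "limit": limit,
        "expand": "body.storage,version",
    })


def fetch_json(request: urllib.request.Request) -> Dict[str, Any]:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from the network call means the URL and header logic can be covered by tests without a live Confluence instance.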
Data Source Stream
Implement streams to handle subsets of Confluence data. Options:
Data Source Payload
Structure payload with data, metadata, and state.
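To make the stream and payload requirements more concrete, here is a minimal sketch of one possible shape: one stream per Confluence space, emitting one payload per page with data, metadata, and state. Payload and SpaceStream are hypothetical stand-ins written for this illustration, not the classes provided by VDK; the real plugin would implement the VDK data source interfaces instead. The sketch also assumes that pages carry a version.when timestamp in a single ISO 8601 format, so that plain string comparison orders them correctly.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, Iterator, Optional


@dataclass
class Payload:
    """Hypothetical stand-in for a data source payload."""
    data: Dict[str, Any]
    metadata: Dict[str, Any] = field(default_factory=dict)
    state: Dict[str, Any] = field(default_factory=dict)


class SpaceStream:
    """Hypothetical stream covering the pages of a single Confluence space."""

    def __init__(self, space_key: str,
                 fetch_pages: Callable[[str], Iterable[Dict[str, Any]]],
                 last_modified: Optional[str] = None):
        self.space_key = space_key
        self._fetch_pages = fetch_pages      # injected so tests need no network
        self._last_modified = last_modified  # state restored from a previous run

    def name(self) -> str:
        return f"space:{self.space_key}"

    def read(self) -> Iterator[Payload]:
        since = self._last_modified
        newest = since or ""
        for page in self._fetch_pages(self.space_key):
            modified = page["version"]["when"]
            # Skip pages already ingested in an earlier run.
            if since is not None and modified <= since:
                continue
            newest = max(newest, modified)
            yield Payload(
                data={
                    "id": page["id"],
                    "title": page["title"],
                    "body": page["body"]["storage"]["value"],
                },
                metadata={
                    "space_key": self.space_key,
                    "version": page["version"]["number"],
                },
                # State lets the next run resume after the newest page seen so far.
                state={"last_modified": newest},
            )
```

Passing the state of the last payload back in as last_modified on the next run would make ingestion incremental. Whether streams should map to spaces, individual documents, or something else is left open by the issue.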
Implementation notes
To create the new plugin, use the cookiecutter template at https://github.com/versatile-data-kit-dev/new-vdk-data-source
There's an example in https://github.com/vmware/versatile-data-kit/tree/main/events/data-sources
See example data job: https://github.com/vmware/versatile-data-kit/tree/main/examples/confluence-data-retrieval-example
Write functional tests for the new data source as well.