Open source data ingestion for RAGs with dlt

In this hands-on workshop, we’ll learn how to build a data ingestion pipeline with dlt that loads data from a REST API into LanceDB, so that your RAG always stays up to date.

We’ll cover the following steps:

  • Extracting data from REST APIs
  • Loading and vectorizing into LanceDB, which, unlike other vector DBs, stores both the data and the embeddings
  • Incremental loading
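
To make the incremental loading step concrete, here is a minimal, library-agnostic sketch in plain Python. It does not use dlt, which ships its own incremental-loading support; the record layout, the `updated_at` cursor field, and the `select_new_records` helper are all assumptions made for illustration. The idea is to remember the highest cursor value seen so far and pass on only records newer than it.

```python
from typing import Any, Dict, List

# Hypothetical records standing in for a REST API response.
SAMPLE_RECORDS: List[Dict[str, Any]] = [
    {"id": 1, "text": "intro to RAG", "updated_at": "2024-07-01T09:00:00Z"},
    {"id": 2, "text": "vector search", "updated_at": "2024-07-02T09:00:00Z"},
]


def select_new_records(
    records: List[Dict[str, Any]],
    state: Dict[str, Any],
    cursor_field: str = "updated_at",
) -> List[Dict[str, Any]]:
    """Return only records newer than the cursor stored in `state`.

    Assumes cursor values are ISO 8601 UTC strings in one fixed format,
    so plain string comparison matches chronological order.
    """
    last_seen = state.get("last_value")
    new_records = [
        record
        for record in records
        if last_seen is None or record[cursor_field] > last_seen
    ]
    if new_records:
        # Advance the cursor so the next run skips everything loaded now.
        state["last_value"] = max(r[cursor_field] for r in new_records)
    return new_records


# The state dict would normally be persisted between pipeline runs.
state: Dict[str, Any] = {}
first_run = select_new_records(SAMPLE_RECORDS, state)   # loads both records
second_run = select_new_records(SAMPLE_RECORDS, state)  # nothing new to load
```

In a real pipeline the selected records would then be embedded and written to the vector store, and the state would be saved somewhere durable so that each run picks up where the previous one stopped.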

By the end of this workshop, you’ll be able to write a portable, open source data pipeline for your RAG that you can deploy anywhere: in Python notebooks, on virtual machines, or with orchestrators such as Airflow, Dagster, or Mage.

If you aren't taking the course and want to sign up for this workshop only, use this link: https://lu.ma/cnpdoc5n