DocumentAgent (Phase 1) #438

marklysze · 2025-01-10T19:05:57Z

--FEEDBACK WELCOME--

DocumentAgent will be a document-based agent, able to ingest documents/sources of information and have that knowledge accessible to achieve its given task.

Examples of use-cases:

Document classification
Document/Page summarisation
Question Answering
Identify missing information
Invoice handling

The objective for this Phase is to provide a quick-start agent that developers can incorporate easily.

This DocumentAgent will include RAG capabilities, so it will be built progressively. This Phase 1 implementation contains basic RAG capabilities, such as ingesting documents and embedding them into a vector database. Future implementations will include more advanced RAG capabilities and engines, as well as additional capabilities for document transformation.

Capabilities include:

Input: Read one or more TXT, CSV, PDF, HTML, Markdown, PPTX, or JSON sources
Extract and store data, including into an intermediate format (such as Docling's DoclingDocument format)
Developer-determined handling (put in prompt, use vector database, use third-party query engine)
Query data, including support for third-party querying
Support for Structured Outputs to control output format

Example code (not final API):

# Most basic
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=...,
    sources="my_file.txt")

# Multiple sources, supporting different types
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=...,
    sources=[my_file_name_with_path, "https://my.url.com"],
)

# Storage and Retrieval from a Vector database
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=...,
    sources=[my_file_name_with_path, my_file_name_with_path],
    handling_config=DocumentHandlingConfig(
        document_types=[DocType.Text, DocType.XLSX],
        storage=DocumentStore.Weaviate,
        settings={...},
    ),
)

# 3rd-party query engine (or this could be an agent built on DocumentAgent, e.g. DocumentAgentAgentQL)
my_document_agent = DocumentAgent(
    name="docagent",
    llm_config=None,
    sources="https://my.url.com",
    handling_config=None,
    query_config=DocumentQueryConfig(
        document_types=[DocType.URL],
        provider=DocQueryProvider.AgentQL,
        settings={...},
    ),
)

Internal agent workflow:

Load/convert the document through the handling configuration (defaulted for ease of use)
Use the query configuration to respond to queries (e.g. inject the full source into the system message, query the vector store and inject the results into the system message, or run an external provider)
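
To make the two workflow steps concrete, here is a minimal, library-free sketch. Every name in it (`Handling`, `SketchDocumentAgent`, `load`, `build_context`) is hypothetical and is not the proposed DocumentAgent API. The keyword-overlap ranking is only a stand-in for a real vector store lookup, which would compare embeddings instead.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Handling(Enum):
    """Hypothetical handling modes mirroring the three options listed above."""
    PROMPT = "prompt"              # inject the full source into the system message
    VECTOR_STORE = "vector_store"  # retrieve relevant chunks, then inject them
    EXTERNAL = "external"          # delegate to a third-party provider


@dataclass
class SketchDocumentAgent:
    handling: Handling
    documents: List[str] = field(default_factory=list)
    external_provider: Optional[Callable[[str], str]] = None

    def load(self, texts: List[str]) -> None:
        """Step 1: load/convert sources (here, just whitespace-stripped strings)."""
        self.documents.extend(t.strip() for t in texts)

    def _retrieve(self, query: str, top_k: int = 1) -> List[str]:
        # Assumption: word overlap stands in for vector similarity.
        words = set(query.lower().split())
        ranked = sorted(
            self.documents,
            key=lambda d: len(words & set(d.lower().split())),
            reverse=True,
        )
        return ranked[:top_k]

    def build_context(self, query: str) -> str:
        """Step 2: produce the text to inject into the system message."""
        if self.handling is Handling.PROMPT:
            return "\n\n".join(self.documents)
        if self.handling is Handling.VECTOR_STORE:
            return "\n\n".join(self._retrieve(query))
        if self.external_provider is None:
            raise ValueError("EXTERNAL handling needs an external_provider")
        return self.external_provider(query)


agent = SketchDocumentAgent(Handling.VECTOR_STORE)
agent.load(["Invoice 17 is due in March.", "The cat sat on the mat."])
print(agent.build_context("when is the invoice due"))
```

Keeping the handling choice in a single field is one way to let the defaults cover the quick-start case while still leaving the other two paths open to the developer.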

Notes:

The use of a common intermediate format may be important, such as using Docling for document parsing and its DoclingDocument format for local storage. This could provide a good basis for standardised tools for this agent.

Deliverables:

DocumentAgent code
Documentation
Blog
Notebook
Video script


AgentGenie · 2025-01-14T18:57:08Z

Use JSON as the intermediate data format and use a converter to integrate with parsing frameworks.
DocumentAgent is going to replace RetrieveUserProxyAgent.
Provide clear interfaces for both user-facing and internal use, e.g. a structured user config class and internal interfaces such as a query tool to integrate with third-party query frameworks.
Refactor the Vector DB interface into a query engine.
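
As a rough illustration of these suggestions, the sketch below shows a JSON round trip for an intermediate document and a small query-engine interface that a vector database or third-party framework could implement. All names here (`IntermediateDoc`, `QueryEngine`, `InMemoryQueryEngine`, `to_json`, `from_json`) are hypothetical and are not taken from the ag2 codebase. The in-memory engine uses plain word matching purely as a placeholder.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Protocol


@dataclass
class IntermediateDoc:
    """Hypothetical JSON-serialisable intermediate representation."""
    source: str
    chunks: List[str]


def to_json(doc: IntermediateDoc) -> str:
    return json.dumps(asdict(doc))


def from_json(payload: str) -> IntermediateDoc:
    return IntermediateDoc(**json.loads(payload))


class QueryEngine(Protocol):
    """Hypothetical internal interface for any query backend."""

    def add_docs(self, docs: List[IntermediateDoc]) -> None: ...

    def query(self, question: str) -> str: ...


class InMemoryQueryEngine:
    """Toy backend: returns the first chunk sharing a word with the question."""

    def __init__(self) -> None:
        self._chunks: List[str] = []

    def add_docs(self, docs: List[IntermediateDoc]) -> None:
        for doc in docs:
            self._chunks.extend(doc.chunks)

    def query(self, question: str) -> str:
        words = set(question.lower().split())
        for chunk in self._chunks:
            if words & set(chunk.lower().split()):
                return chunk
        return ""  # no matching chunk


doc = IntermediateDoc(source="a.txt", chunks=["alpha beta", "gamma delta"])
engine: QueryEngine = InMemoryQueryEngine()
engine.add_docs([from_json(to_json(doc))])
print(engine.query("what is gamma"))
```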

marklysze · 2025-02-20T04:06:29Z

Thanks everyone! First phase in 0.7.5!

marklysze added this to ag2 Jan 10, 2025

marklysze converted this from a draft issue Jan 10, 2025

marklysze added the enhancement (New feature or request) and RAG labels Jan 10, 2025

marklysze moved this to Todo in ag2 Jan 10, 2025

davorrunje assigned marklysze Jan 14, 2025

davorrunje added the roadmap label Jan 14, 2025

marklysze added the agents:docagent (Document Agent) label Jan 14, 2025

qingyun-wu assigned AgentGenie Jan 15, 2025

This was referenced Jan 17, 2025

Tools capability #526

Merged

Add default document loader and parser for RAG #624

Merged

marklysze moved this from Todo to In Progress in ag2 Feb 7, 2025

sitloboi2012 mentioned this issue Feb 13, 2025

[Feature Request]: Enhance Docling Query Engine: Add PGVector, MongoDB, and Qdrant Support via VectorDBFactory Wrapper #950

Open

This was referenced Feb 13, 2025

Document agent phase 1 #957

Closed

Document agent phase 1 #978

Merged

marklysze closed this as completed Feb 20, 2025

github-project-automation bot moved this from In Progress to Done in ag2 Feb 20, 2025
