gpt_read is a versatile tool that leverages GPT models to answer questions based on document content. It is designed to process documents, extract key information, and generate responses tailored to user queries.
The repository implements several strategies for document processing, including direct text retrieval, chunk-based processing, semantic segmentation, hierarchical summarization, and multi-pass refinement. These approaches allow users to choose a method that best fits the document type and question complexity.
- Multiple Processing Methods: Supports strategies such as retrieval, chunking, hierarchical summarization, and combined multi-pass approaches, all callable from main.R, giving users the flexibility to handle diverse documents.
- Broad Format Support: Works with PDF, DOCX, TXT, and image files (with OCR), making it applicable across a wide range of document formats.
- Efficiency and Accuracy: Automates the extraction of relevant content, reducing manual review time while letting users compare methods to achieve the best results.
- Scalability: Incorporates parallel processing and dynamic chunking, so it can manage documents of varying sizes and complexities.
The system reads and processes documents by splitting them into manageable segments. It then applies the selected processing strategy to extract and synthesize the information needed to answer the user's query. The result is an answer that reflects the key content of the document, with options for further refinement.
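As a rough illustration of that flow, the sketch below splits raw text into segments, queries each segment with the question, and consolidates the partial answers in a final call. Everything here is an assumption for illustration: `ask_gpt()`, `split_into_chunks()`, `answer_question()`, the chunk size, and the model name are not the repository's actual functions or settings, which live in main.R and the 'R' folder and add dynamic chunk sizing, parallelism, and method-specific prompts.

```r
library(httr)
library(jsonlite)

# ask_gpt() is a hypothetical helper, not a function from this repository.
ask_gpt <- function(prompt, api_key = Sys.getenv("OPENAI_API_KEY")) {
  resp <- POST(
    url = "https://api.openai.com/v1/chat/completions",
    add_headers(Authorization = paste("Bearer", api_key)),
    content_type_json(),
    body = toJSON(list(
      model    = "gpt-4o-mini",                               # example model name
      messages = list(list(role = "user", content = prompt))
    ), auto_unbox = TRUE)
  )
  content(resp)$choices[[1]]$message$content
}

# Split extracted text into fixed-size segments (a simplification of the
# dynamic chunking described above).
split_into_chunks <- function(text, chunk_chars = 8000) {
  starts <- seq(1L, nchar(text), by = chunk_chars)
  ends   <- pmin(starts + chunk_chars - 1L, nchar(text))
  mapply(substr, text, starts, ends, USE.NAMES = FALSE)
}

# Query each segment, then consolidate the partial answers in one final call.
answer_question <- function(text, question) {
  chunks  <- split_into_chunks(text)
  partial <- vapply(chunks, function(ch) {
    ask_gpt(paste0("Context:\n", ch, "\n\nQuestion: ", question))
  }, character(1))
  ask_gpt(paste0(
    "Combine these partial answers into one response.\n\n",
    paste(partial, collapse = "\n---\n"),
    "\n\nQuestion: ", question
  ))
}
```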
This system requires a valid OpenAI API key. To obtain one:
- Visit OpenAI's API page.
- Register for an account and follow the instructions to generate your API key.
- Configure your environment to use this key as required by the system.
- The Shiny application will automatically prompt you for the key.
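For scripted (non-Shiny) use, one common convention is to expose the key through an environment variable. The variable name and setup below are assumptions for illustration, not something this repository mandates.

```r
# Store the key in an environment variable so scripts can read it without
# hard-coding it; placing the same assignment in ~/.Renviron makes it persist
# across sessions. OPENAI_API_KEY is a conventional name, assumed here.
Sys.setenv(OPENAI_API_KEY = "sk-your-key-here")

key <- Sys.getenv("OPENAI_API_KEY")
stopifnot(nzchar(key))   # fail early if the key was not set
```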
- Select Document(s): Place your document in the designated folder. Supported formats include PDF, DOCX, TXT, and images (with OCR).
- Select Processing Method(s): Choose one of the available strategies based on your document and the nature of your question. See the 'R' folder README for more details about each method.
- Ask Your Question: Use the provided functions to ask a question. The system will process the document and return a generated answer; a hypothetical call is sketched below.
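The sketch below shows what an end-to-end call might look like. The function name, arguments, and method labels are illustrative assumptions; the actual interface is defined in main.R and the 'R' folder and may differ.

```r
source("main.R")   # load the processing functions

# All names and arguments below are illustrative, not the repository's API.
answer <- gpt_read(
  file     = "documents/report.pdf",               # hypothetical path
  question = "What are the report's key findings?",
  method   = "retrieval"                           # e.g. "chunked", "hierarchical"
)
cat(answer)
```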
This framework offers a clear, systematic approach to document analysis, enabling efficient extraction of relevant information from a variety of sources.
This repository expands on the original gpt_read function. Initially, gpt_read split the text into chunks, queried GPT with each chunk plus the user's question, and then consolidated the responses. That approach, designed to cope with smaller context windows and higher hallucination rates, was slow and API-intensive.
Inspired by projects like LangChain, I sought more control over how GPT ingests, extracts, and parses information. As API costs fell and context windows grew, I experimented with methods that mimic human reading: first skimming to identify key points, then revisiting details. For example, while chunked_mode.R
refines the original approach, the retrieval method first narrows the text down to the relevant passages before querying GPT, aiming for greater accuracy.
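As a toy illustration of that filtering idea (not the repository's retrieval code), chunks could be scored by simple keyword overlap with the question and only the top matches passed on to the model:

```r
# Score a chunk by how many of its words also appear in the question.
score_chunk <- function(chunk, question) {
  q_terms <- unique(tolower(unlist(strsplit(question, "\\W+"))))
  c_terms <- tolower(unlist(strsplit(chunk, "\\W+")))
  sum(c_terms %in% q_terms)
}

# Keep only the top_n highest-scoring chunks before any GPT call is made.
filter_relevant <- function(chunks, question, top_n = 3) {
  scores <- vapply(chunks, score_chunk, numeric(1), question = question)
  chunks[order(scores, decreasing = TRUE)][seq_len(min(top_n, length(chunks)))]
}
```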
This may be too niche for many people, but I am sharing it in case someone would like to experiment with these sorts of methods too.