GenIAL is a document analysis tool developed during a hackathon for Gide, a leading international law firm. The application streamlines the audit process by automatically analyzing PDF documents with Large Language Models (LLMs) and generating structured audit reports in Word format from predefined templates.
The application simplifies the document review process through:
- PDF Upload: Users can easily upload PDF documents for analysis
- Intelligent Processing: Uses Retrieval-Augmented Generation (RAG) to locate and extract relevant information
- Automated Reporting: Generates professional Word documents following Gide's audit template format
genial/
├── app/                    # Frontend React application
│   ├── src/
│   │   ├── components/     # React components
│   │   ├── contexts/       # React contexts
│   │   ├── routes/         # Application routes
│   │   └── main.tsx        # Application entry point
│   └── package.json        # Frontend dependencies
│
├── api/                    # Backend AWS Lambda functions
│   └── functions/
│       ├── generate/       # RAG generation Lambda
│       └── export/         # Word document export Lambda
│
└── venv/                   # Python virtual environment
- Navigate to the app directory:
cd app
- Install dependencies using pnpm:
pnpm install
- Start the development server:
pnpm dev
The application will be available at http://localhost:5173
- Create a Python virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install Python dependencies:
cd api/functions/<function_name>
pip install -r requirements.txt
The project uses two main Lambda functions:
- Generate Function (/api/functions/generate)
  - Processes uploaded PDFs using RAG
  - Extracts and analyzes relevant information using an LLM
  - Structures data for audit report generation
  - Memory: 1024 MB (recommended)
  - Timeout: 10 minutes (recommended)
- Export Function (/api/functions/export)
  - Generates Word documents based on templates
  - Formats extracted data into structured audit reports
  - Creates professional-grade documentation
  - Memory: 512 MB (recommended)
  - Timeout: 30 seconds (recommended)
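As a rough illustration of what the Generate function's entry point might look like, here is a minimal sketch. It assumes the function is invoked through an API Gateway Lambda proxy integration and that the PDF arrives in the request body; the name `handler` and the `received_bytes` field are hypothetical, and the RAG step itself is left as a placeholder.

```python
import base64
import json


def handler(event, context):
    """Hypothetical entry point for the Generate function (proxy-style event)."""
    body = event.get("body") or ""

    # API Gateway base64-encodes binary payloads and sets this flag.
    if event.get("isBase64Encoded"):
        pdf_bytes = base64.b64decode(body)
    else:
        pdf_bytes = body.encode("utf-8")

    if not pdf_bytes:
        return {
            "statusCode": 400,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"error": "empty request body"}),
        }

    # Placeholder: the real function would run the RAG pipeline here.
    result = {"received_bytes": len(pdf_bytes)}

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }
```

The Export function could follow the same shape, taking the structured data as JSON and returning the generated Word document.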
- Create a new REST API in API Gateway
- Create the following endpoints:
  - POST /generate → Generate Lambda (PDF processing)
  - POST /export → Export Lambda (Word document generation)
- Enable CORS for the frontend domain
- Deploy the API to a stage (e.g., 'prod')
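If the endpoints use a Lambda proxy integration (an assumption, since the integration type is not stated here), the CORS headers have to come from the Lambda response itself. The sketch below shows one way to build such a response; `cors_response` and `ALLOWED_ORIGINS` are hypothetical names, and the allowed origin is simply the local dev server address used for the frontend.

```python
import json

# Assumption: the local Vite dev server; replace with the deployed frontend domain.
ALLOWED_ORIGINS = {"http://localhost:5173"}


def cors_response(status_code, payload, origin):
    """Build a proxy-style response, adding CORS headers only for known origins."""
    headers = {"Content-Type": "application/json"}
    if origin in ALLOWED_ORIGINS:
        headers["Access-Control-Allow-Origin"] = origin
        headers["Access-Control-Allow-Methods"] = "POST,OPTIONS"
        headers["Access-Control-Allow-Headers"] = "Content-Type"
    return {
        "statusCode": status_code,
        "headers": headers,
        "body": json.dumps(payload),
    }
```

Echoing back only origins from an explicit allow-list avoids the wildcard `*`, which is usually too permissive for an API that handles confidential documents.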
- Built with React 18, TypeScript, and Vite
- Interactive PDF viewer with @react-pdf-viewer
- Form handling with react-hook-form and Zod validation
- Modern UI components using Radix UI
- Responsive design with Tailwind CSS
The backend implements a sophisticated document processing pipeline:
- Document Analysis
  - PDF document upload and text extraction
  - Document segmentation for detailed analysis
  - Intelligent content processing using RAG technology
- LLM Processing
  - Context-aware information extraction
  - Structured data generation for audit reports
  - Quality assurance checks on extracted information
- Report Generation
  - Template-based Word document creation
  - Professional formatting and styling
  - Consistent with Gide's documentation standards
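The document segmentation step in the pipeline above can be pictured as splitting the extracted text into overlapping chunks before retrieval. The following is only a sketch of that idea: the function name, the character-based splitting, and the default sizes are assumptions, not the project's actual implementation.

```python
def split_into_chunks(text, chunk_size=1000, overlap=200):
    """Split text into fixed-size character chunks that overlap by `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk reaches the end of the text to avoid redundant tails.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a clause that straddles a chunk boundary still appears whole in at least one chunk, which matters for contract language where a single sentence can carry the relevant obligation.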
VITE_API_URL=your_api_gateway_url
Configure through AWS Lambda environment variables:
- OPENAI_API_KEY: Your OpenAI API key
- Other necessary API keys and configuration
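Inside a Lambda function, these variables are read from the process environment. A minimal sketch, assuming the function should fail fast when the key is missing (the helper name `load_openai_key` is hypothetical):

```python
import os


def load_openai_key(environ=os.environ):
    """Return the OpenAI API key from the environment, or raise if it is not set."""
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set in the Lambda environment")
    return key
```

Accepting the environment as a parameter keeps the helper easy to test without touching real credentials.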
- Secure document handling and processing
- API keys stored securely in AWS Lambda environment variables
- CORS configuration for protected endpoints
- Input validation for all file uploads
- Secure document storage and transmission
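For the input validation point, a basic server-side check might confirm that an upload is non-empty, within a size limit, and starts with the PDF signature. This is a sketch under assumptions: `validate_pdf_upload` is a hypothetical helper and the 5 MB limit is a placeholder, not a documented project setting.

```python
# Hypothetical limit; choose a value that fits the API's payload constraints.
MAX_UPLOAD_BYTES = 5 * 1024 * 1024


def validate_pdf_upload(data, max_bytes=MAX_UPLOAD_BYTES):
    """Return a list of validation problems; an empty list means the upload looks acceptable."""
    problems = []
    if len(data) == 0:
        problems.append("file is empty")
    if len(data) > max_bytes:
        problems.append("file exceeds the size limit")
    # PDF files begin with the signature "%PDF-" followed by a version number.
    if not data.startswith(b"%PDF-"):
        problems.append("file does not start with a PDF signature")
    return problems
```

A signature check like this only rejects obviously wrong files; it does not prove that the PDF is well-formed or safe to parse.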
- Make changes to the frontend code
- Test locally using pnpm dev
- Build for production using pnpm build
- Deploy Lambda functions through AWS Console or CLI
- Update API Gateway configuration if needed
This project was developed during a hackathon in collaboration with Gide. It addresses the specific challenge of automating and streamlining the document audit process, demonstrating the potential of AI-powered solutions in legal document processing.
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.