This project is a web application that lets users upload a photo of a bill and automatically parses its contents using OCR (Tesseract.js) and a LLaMA model. The extracted data is then structured as JSON and stored in a PostgreSQL database.
- OCR with Tesseract.js: Extracts text, rows, and columns from uploaded bill images.
- LLaMA Model Integration: Converts the OCR output into structured JSON format.
- Database Storage: Saves parsed data into a PostgreSQL database for future reference.
- User-Friendly Interface: Allows users to easily upload and process bills through a web interface built with Next.js.
- NestJS - Backend framework for building efficient, scalable Node.js server-side applications.
- TypeScript - Typed superset of JavaScript that compiles to plain JavaScript.
- Next.js - React framework for server-rendered or statically-exported React apps.
- PostgreSQL - Open-source relational database management system.
The user uploads a bill image via the front-end interface built with Next.js. The image is then sent to the backend server built with NestJS.
The uploaded image is processed by Tesseract.js, an OCR library that extracts the text, rows, and columns from the bill.
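As a rough illustration of how rows can be recovered from OCR output, the sketch below groups recognized words by vertical position. It assumes each word carries a `text` string and a `bbox` with `x0`/`y0`/`x1`/`y1` pixel coordinates, similar to what Tesseract.js reports; the `OcrWord` type, the `groupWordsIntoRows` helper, and the `tolerance` parameter are hypothetical names and not part of this project or of Tesseract.js.

```typescript
// Hypothetical shape, modeled on the bounding-box fields OCR engines report per word.
interface OcrWord {
  text: string;
  bbox: { x0: number; y0: number; x1: number; y1: number };
}

// Group words into rows by their top edge, then order each row left to right.
// `tolerance` (in pixels) is an assumed tuning value, not a Tesseract.js option.
function groupWordsIntoRows(words: OcrWord[], tolerance = 10): string[][] {
  const sorted = [...words].sort((a, b) => a.bbox.y0 - b.bbox.y0);
  const rows: { top: number; words: OcrWord[] }[] = [];

  for (const word of sorted) {
    const current = rows[rows.length - 1];
    // Start a new row when the word sits clearly below the top of the current row.
    if (current === undefined || word.bbox.y0 - current.top > tolerance) {
      rows.push({ top: word.bbox.y0, words: [word] });
    } else {
      current.words.push(word);
    }
  }

  return rows.map((row) =>
    [...row.words].sort((a, b) => a.bbox.x0 - b.bbox.x0).map((w) => w.text)
  );
}
```

A fixed pixel tolerance is the simplest possible heuristic; skewed or densely printed bills would likely need something more robust.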
The extracted data from the OCR process is passed to a LLaMA model, which converts the unstructured text into a structured JSON format.
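Because a language model can return malformed or incomplete JSON, it is usually worth validating the output before storing it. The following is a minimal sketch under assumed names: the `ParsedBill` schema (`vendor`, `total`, `items`) and the `parseBillJson` helper are hypothetical, and the project's actual JSON fields may differ.

```typescript
// Hypothetical target schema; the real project may use different fields.
interface BillItem {
  description: string;
  amount: number;
}

interface ParsedBill {
  vendor: string;
  total: number;
  items: BillItem[];
}

// Extract and validate the JSON object embedded in raw model output.
function parseBillJson(modelOutput: string): ParsedBill {
  // Models often wrap JSON in prose, so take the outermost {...} span.
  const start = modelOutput.indexOf("{");
  const end = modelOutput.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }

  const data: unknown = JSON.parse(modelOutput.slice(start, end + 1));
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }

  const { vendor, total, items } = data as Record<string, unknown>;
  if (typeof vendor !== "string") throw new Error("vendor must be a string");
  if (typeof total !== "number" || !Number.isFinite(total)) {
    throw new Error("total must be a finite number");
  }
  if (!Array.isArray(items)) throw new Error("items must be an array");

  const parsedItems: BillItem[] = items.map((item: unknown, index: number) => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`Invalid item at index ${index}`);
    }
    const { description, amount } = item as Record<string, unknown>;
    if (typeof description !== "string" || typeof amount !== "number") {
      throw new Error(`Invalid item at index ${index}`);
    }
    return { description, amount };
  });

  return { vendor, total, items: parsedItems };
}
```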
The resulting JSON is stored in a PostgreSQL database, allowing for easy retrieval and analysis in the future.
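One way to store the JSON safely is a parameterized query. The sketch below only builds the query; it assumes a hypothetical `bills` table with `file_name` and `data` (JSONB) columns, and the `buildInsertBillQuery` helper is illustrative rather than taken from this project. The `{ text, values }` shape is the query config object that the node-postgres client accepts, should the project use that client.

```typescript
// Query config in the `{ text, values }` form used by node-postgres.
interface QueryConfig {
  text: string;
  values: unknown[];
}

// Build a parameterized INSERT for a hypothetical `bills` table.
// Placeholders ($1, $2) keep user-supplied data out of the SQL text.
function buildInsertBillQuery(fileName: string, bill: object): QueryConfig {
  return {
    text: "INSERT INTO bills (file_name, data) VALUES ($1, $2::jsonb) RETURNING id",
    values: [fileName, JSON.stringify(bill)],
  };
}
```

With node-postgres, the result could be passed straight to `pool.query(...)`; other clients or an ORM would need a different call.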
- Node.js (v14 or later)
- PostgreSQL (v12 or later)
- Tesseract.js
- LLaMA Model (pre-trained)
- Clone the repository:
    git clone https://github.com/yourusername/bill-parser-app.git
    cd bill-parser-app
- Install dependencies:
    npm install
- Set up the database:
  - Ensure PostgreSQL is installed and running.
  - Create a database and configure your .env file with the database credentials.
- Run the application:
    npm run dev
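The database credentials in the .env file could be read and checked at startup along these lines. This is only a sketch: the variable names (`DB_HOST`, `DB_PORT`, `DB_USER`, `DB_PASSWORD`, `DB_NAME`) and the `loadDbConfig` helper are assumptions, so check the project's configuration code for the names it actually expects.

```typescript
interface DbConfig {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

// Read database settings from an environment-like object.
// All variable names here are hypothetical.
function loadDbConfig(env: Record<string, string | undefined>): DbConfig {
  const required = (key: string): string => {
    const value = env[key];
    if (value === undefined || value === "") {
      throw new Error(`Missing environment variable: ${key}`);
    }
    return value;
  };

  // 5432 is PostgreSQL's default port.
  const port = Number(env["DB_PORT"] ?? "5432");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error("DB_PORT must be a positive integer");
  }

  return {
    host: required("DB_HOST"),
    port,
    user: required("DB_USER"),
    password: required("DB_PASSWORD"),
    database: required("DB_NAME"),
  };
}
```

Once the .env file has been loaded into the process environment, the call would typically be `loadDbConfig(process.env)`.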
- Open your browser and navigate to http://localhost:3000.
- Upload an image of a bill.
- The system will process the image and display the extracted data in JSON format.
- The JSON data is saved to the database for future use.
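The flow described in these steps (OCR, structuring, storage, then returning the JSON for display) can be sketched as a small orchestration function. The stages are injected as plain functions so the flow can be exercised without Tesseract.js, a model, or a database; `PipelineStages` and `processBill` are hypothetical names, not the project's actual API.

```typescript
// Each stage is supplied by the caller; real implementations would wrap
// Tesseract.js, the LLaMA model, and the PostgreSQL client respectively.
interface PipelineStages<TBill> {
  extractText: (image: Uint8Array) => Promise<string>; // OCR stage
  structure: (ocrText: string) => Promise<TBill>; // LLaMA stage
  save: (bill: TBill) => Promise<number>; // resolves to the stored row id
}

// Run the stages in order and return what the UI needs to display.
async function processBill<TBill>(
  image: Uint8Array,
  stages: PipelineStages<TBill>
): Promise<{ id: number; bill: TBill }> {
  if (image.length === 0) {
    throw new Error("Uploaded image is empty");
  }
  const ocrText = await stages.extractText(image);
  const bill = await stages.structure(ocrText);
  const id = await stages.save(bill);
  return { id, bill };
}
```

Keeping the stages separate also makes it straightforward to swap the OCR engine or model without touching the rest of the flow.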
We welcome contributions! Please follow our guidelines for contributing and make sure to run tests before submitting a pull request.
This project is licensed under the MIT License. See the LICENSE file for more details.
For any inquiries, please reach out to [email protected].