This repository contains the code for a Content-Based Image Retrieval (CBIR) system, as described in the research paper "Content-Based Image Retrieval System Utilizing Vision Transformer" by Aniket Mahajan, Nirmiti Rane, and Pranav Patil.
The CBIR system presented here allows users to search for images using natural language queries. It combines several key techniques, each illustrated with a short example sketch after the list:
- Vision Transformer (ViT): A pre-trained ViT (specifically `google/vit-base-patch16-224` from Hugging Face Transformers) is used for feature extraction, capturing fine-grained image details. A quadtree-based approach divides images into quadrants for more granular feature extraction.
- Natural Language Processing (NLP): NLP is used extensively for:
  - Query processing: Tokenization, stemming, stop word removal, and negation handling are performed using libraries like `spaCy` and `NLTK`.
  - Synonym generation: WordNet is used to expand the query with synonyms, improving recall.
  - Textual feedback processing: User-provided sentences are analyzed using TF-IDF to extract keywords and update image feature probabilities.
- Vector Space Model (VSM): A VSM represents images and queries as vectors in a high-dimensional space. Cosine similarity is used to rank images based on their relevance to the query.
- User Relevance Feedback:
  - Binary Feedback: Users can provide "like" or "dislike" feedback on retrieved images. This directly adjusts the probabilities of associated image features.
  - Textual Feedback: Users can input sentences describing images. These sentences are processed using TF-IDF, and the extracted keywords are added as new features with their TF-IDF scores as probabilities. This allows the system to learn from user input and improve over time.
- Database Integration (SQLite): The system uses SQLite to store image features, metadata, user feedback, and probabilities. The `SQLmethods` directory contains the database interaction logic.
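A minimal sketch of the quadtree-style ViT feature extraction, assuming the whole image and each of its four quadrants are embedded separately via the checkpoint's `[CLS]` token (the helper names are illustrative, not the repository's actual API):

```python
# Quadtree ViT feature extraction sketch: embed the whole image plus its
# four quadrants with google/vit-base-patch16-224.
import torch
from PIL import Image
from transformers import ViTImageProcessor, ViTModel

MODEL_NAME = "google/vit-base-patch16-224"
processor = ViTImageProcessor.from_pretrained(MODEL_NAME)
model = ViTModel.from_pretrained(MODEL_NAME)
model.eval()

def extract_features(image: Image.Image) -> torch.Tensor:
    """Return the 768-d [CLS] embedding for one image."""
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state[:, 0, :].squeeze(0)  # [CLS] token

def quadtree_features(image: Image.Image) -> list:
    """Whole image plus its four quadrants -> five feature vectors."""
    w, h = image.size
    quadrants = [
        image.crop((0, 0, w // 2, h // 2)),   # top-left
        image.crop((w // 2, 0, w, h // 2)),   # top-right
        image.crop((0, h // 2, w // 2, h)),   # bottom-left
        image.crop((w // 2, h // 2, w, h)),   # bottom-right
    ]
    return [extract_features(image)] + [extract_features(q) for q in quadrants]
```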
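A minimal sketch of the query-processing pipeline using NLTK (the repository may use spaCy for some of these steps, and the negation lexicon below is an assumption):

```python
# Query processing sketch: tokenize, drop stop words, stem, expand with
# WordNet synonyms, and route terms following a negation word separately.
from nltk.corpus import stopwords, wordnet
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

STOP_WORDS = set(stopwords.words("english"))
NEGATIONS = {"not", "no", "never"}  # assumed negation lexicon
stemmer = PorterStemmer()

def synonyms(word: str) -> set:
    """All single-word WordNet lemmas for `word`."""
    return {
        lemma.name().lower()
        for synset in wordnet.synsets(word)
        for lemma in synset.lemmas()
        if "_" not in lemma.name()
    }

def process_query(query: str):
    """Return (positive_terms, negated_terms), stemmed and synonym-expanded."""
    positive, negated = set(), set()
    negate_next = False
    for tok in word_tokenize(query.lower()):
        if tok in NEGATIONS:
            negate_next = True
            continue
        if not tok.isalpha() or tok in STOP_WORDS:
            continue
        target = negated if negate_next else positive
        target.update(stemmer.stem(t) for t in {tok} | synonyms(tok))
        negate_next = False
    return positive, negated

print(process_query("a dog that is not running"))
```

This relies on the same NLTK corpora (`stopwords`, `punkt`, `wordnet`) downloaded during setup below.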
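A minimal sketch of vector-space ranking with cosine similarity, assuming one feature vector per image (illustrative, not the repository's implementation):

```python
# Rank images by cosine similarity between a query vector and stored
# image feature vectors.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def rank_images(query_vec: np.ndarray, image_vecs: dict) -> list:
    """image_vecs maps image_id -> vector; returns ids, most similar first."""
    scores = {i: cosine_similarity(query_vec, v) for i, v in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)
```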
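A minimal sketch of both feedback modes; the fixed step size for binary feedback and the way TF-IDF scores are merged into existing features are assumptions, not the paper's exact update rules:

```python
# Relevance feedback sketch: binary feedback nudges feature probabilities,
# textual feedback adds TF-IDF keywords as new weighted features.
from sklearn.feature_extraction.text import TfidfVectorizer

STEP = 0.1  # assumed step size for binary feedback

def binary_feedback(features: dict, liked: bool) -> None:
    """Raise (like) or lower (dislike) each feature probability, clamped to [0, 1]."""
    delta = STEP if liked else -STEP
    for name in features:
        features[name] = min(1.0, max(0.0, features[name] + delta))

def textual_feedback(features: dict, sentence: str, corpus: list) -> None:
    """Score the new sentence with TF-IDF over past feedback sentences and
    add its keywords as features weighted by their TF-IDF scores."""
    docs = corpus + [sentence]
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(docs).toarray()
    for term, score in zip(vectorizer.get_feature_names_out(), matrix[-1]):
        if score > 0:
            features[term] = max(features.get(term, 0.0), float(score))
```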
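A minimal sketch of the SQLite layer; the schema below is an assumption for illustration, and the actual interaction logic lives in the `SQLmethods` directory:

```python
# SQLite storage sketch: one table mapping (image, feature) pairs to
# probabilities, with an upsert for feedback-driven updates.
import sqlite3

conn = sqlite3.connect("cbir.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS image_features (
           image_id    TEXT NOT NULL,
           feature     TEXT NOT NULL,
           probability REAL NOT NULL DEFAULT 0.5,
           PRIMARY KEY (image_id, feature)
       )"""
)

def upsert_feature(image_id: str, feature: str, probability: float) -> None:
    """Insert a feature or update its probability if it already exists."""
    conn.execute(
        "INSERT INTO image_features (image_id, feature, probability) "
        "VALUES (?, ?, ?) "
        "ON CONFLICT(image_id, feature) DO UPDATE SET probability = excluded.probability",
        (image_id, feature, probability),
    )
    conn.commit()
```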
To set up and run the backend:

1. Clone the repository:

   ```bash
   git clone <repository_url>
   cd <repository_name>
   ```

2. Create and activate a virtual environment (recommended):

   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Linux/macOS
   venv\Scripts\activate     # On Windows
   ```

3. Install backend dependencies:

   ```bash
   pip install -r backend/requirements.txt
   ```

4. Add your model path in `backend/system/methods/getFeaturesCNN`.

5. Download NLTK data:

   ```bash
   python -c "import nltk; nltk.download('stopwords'); nltk.download('punkt'); nltk.download('wordnet')"
   ```

6. Run the backend server:

   ```bash
   cd backend/system
   python main.py
   ```

   The backend server will run on `http://localhost:5000`.
To set up and run the frontend:

1. Navigate to the frontend directory:

   ```bash
   cd frontend/CBIR_frontend
   ```

2. Install frontend dependencies:

   ```bash
   npm install
   ```

3. Run the frontend development server:

   ```bash
   npm start
   ```

   The frontend will be accessible at `http://localhost:3000` (or a different port if 3000 is in use).
- Upload Images: Use the drag-and-drop area or click to upload images. The backend will process the images, extract features, and store them in the database.
- Search Images: Enter a natural language query in the search bar and click "Search." The system will process the query, retrieve relevant images, and display them in the grid.
- Provide Feedback:
  - Binary Feedback: Click the thumbs-up (like) or thumbs-down (dislike) button below each image to indicate its relevance to the query.
  - Textual Feedback: Enter a sentence describing an image in the text input field below the image and click "Submit."
- View All Images: Click the "Home" button (house icon) in the header to view all uploaded images.
- Switch Theme: Click the theme toggle button in the header to switch between light and dark mode.
This project implements the system described in the following research paper:
Aniket Mahajan, Nirmiti Rane, Pranav Patil. "Content-Based Image Retrieval System Utilizing Vision Transformer".