
Automation using Hand Gestures

This project recognizes hand gestures in real time and automates simple real-life tasks such as turning appliances on/off, interacting with a television, and controlling other electronic items. It implements home automation with hand gestures using background elimination and a Convolutional Neural Network based classifier. The recognized gestures act as an actuation mechanism for various household items. Click here to see the output.

About the Project

  • The project involves recognizing real-time gestures and automating simple real-life tasks. There are 12 recognized gestures.
  • The gestures include Direction_left, Direction_right, Fist, Five-palm, OK, Stop, Thumbs_up, Thumbs_down, One, Two, Three, Four and Five; see Dataset.
  • The dataset is a custom dataset of 12,000 images covering the 12 gestures mentioned above, with 1,000 images per gesture, all captured with the right hand. Each image is resized to 89x100.

Requirements

  • Python 3
  • tensorflow
  • tflearn
  • keras
  • numpy
  • sklearn
  • cv2 (Camera/Video capture)
  • imutils and pillow (image transforms)
  • pyfirmata (interacting with the Arduino)
  • turtle and tkinter (for simulation)
  • PubNub account with a publisher and subscriber key (to transmit messages over the internet).

File description

Directories:

Files:

Refer to the HOW TO section to get started.

Some key insights

Background Elimination Algorithm

The background elimination algorithm uses the concept of a running average. Here it is used to detect a hand and isolate it from the background. The algorithm averages the first 30 frames of the video feed to model the 'still' background. After those 30 frames, any object that enters the region differs from the learned background and is therefore detected as foreground. This way, when a hand is brought into the region, it is properly detected.
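
Below is a minimal sketch of this running-average idea, assuming OpenCV. The ROI coordinates, accumulation weight, and threshold value are illustrative assumptions; only the 30-frame calibration window comes from the description above.

```python
# Sketch: running-average background elimination (assumptions noted above).
import cv2

cap = cv2.VideoCapture(0)
background = None

for frame_count in range(1000):
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[10:310, 350:650]                      # assumed hand region
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (7, 7), 0)

    if frame_count < 30:
        # accumulate the 'still' background over the first 30 frames
        if background is None:
            background = gray.astype("float")
        else:
            cv2.accumulateWeighted(gray, background, 0.5)
        continue

    # anything that differs from the learned background is treated as the hand
    diff = cv2.absdiff(cv2.convertScaleAbs(background), gray)
    thresholded = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)[1]
    cv2.imshow("Thresholded", thresholded)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```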

The Convolution Neural Network

The network consists of 7 hidden layers with ReLU as the activation function, each followed by a max-pooling layer. The input shape to the network is (89, 100, 1).
The network also has 1 fully-connected layer with sigmoid as the activation function.
The network is trained for 10 epochs with a batch size of 50 and a learning rate of 0.001, using the Adam optimizer.
The accuracy obtained on the validation set is 99.87%.
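
A minimal sketch of such a classifier is shown below, written with Keras. The filter counts, kernel sizes, dense-layer width, and softmax output layer are assumptions; only the input shape, the 7 conv + max-pool blocks, the sigmoid fully-connected layer, and the Adam/10-epoch/batch-50 settings come from the description above.

```python
# Sketch: CNN classifier for 12 gestures (layer sizes are assumptions).
from tensorflow.keras import layers, models, optimizers

def build_model(num_classes=12):
    model = models.Sequential([layers.Input(shape=(89, 100, 1))])
    # 7 hidden conv layers, each followed by a max-pooling layer
    for filters in (32, 32, 64, 64, 128, 128, 256):
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(2, padding="same"))
    model.add(layers.Flatten())
    model.add(layers.Dense(128, activation="sigmoid"))   # fully-connected layer
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(optimizer=optimizers.Adam(learning_rate=0.001),
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Training as described above:
# model.fit(x_train, y_train, epochs=10, batch_size=50, validation_split=0.1)
```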

Real-Time Detection

The input image is fed into the network after thresholding and background elimination; background elimination is done using the concept of a running average. The CNN model is a 7-hidden-layer architecture trained on a custom dataset of 12 gestures with 1,000 images per gesture. The automation is simulated using a GUI, and some parts of the actuation (the lights in the house) are also demonstrated on hardware using an Arduino.

The recognized gesture is transmitted to another device, and the corresponding actuation occurs on the receiver's side. The gesture classification is transmitted across devices over the internet using the PubNub API. Refer to this article to get started with the PubNub API.
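
The publishing side might look roughly like the sketch below, using the PubNub Python SDK; the channel name, keys, and message format are placeholders, not the project's actual values.

```python
# Sketch: publishing a recognized gesture over PubNub (placeholder keys/channel).
from pubnub.pnconfiguration import PNConfiguration
from pubnub.pubnub import PubNub

pnconfig = PNConfiguration()
pnconfig.publish_key = "your-publish-key"       # from your PubNub account
pnconfig.subscribe_key = "your-subscribe-key"
pnconfig.uuid = "gesture-publisher"

pubnub = PubNub(pnconfig)
# publish the latest classification result to the shared channel
pubnub.publish().channel("gesture-channel").message({"gesture": "Thumbs_up"}).sync()
```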

HOW TO

Procedure for creating a custom dataset

Note: All the files are to be executed from the root of the project directory only.

  • fill in gesture names in preprocessing/gestures.txt.

  • run preprocessing/mk_dirs_for_dataset.py: creates a Dataset directory with sub-directories for various gestures, as shown below.

    • Dataset (Dataset directory created)
      • Gesture1 (directory 1)
      • Gesture2 (directory 2)
      • ..
      • .. (similarly for all gestures)
  • run PalmTracker.py for each of the gestures in preprocessing/gestures.txt.

    • input the gesture name (exactly as given in preprocessing/gestures.txt; case sensitive).
    • input the number of images to capture.
    • input the number from which to continue capturing (images in the dataset are numbered).
      (This opens a window with the video feed. Wait for the background elimination algorithm to register the background before performing any gestures; you will see the numbers 0 to 29 printed in the console. Once that is done, start performing the gestures; another window named "Thresholded" opens. To start recording/capturing the gestures, press 's' on your keyboard ('q' to quit).)
      (Images with appropriate labels are generated in each of the sub-directories.)
  • run preprocessing/create_final_dataset.py: run only after creating the entire dataset. Resizes all dataset images to the target size of 89x100, roughly as sketched below.
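
The resizing step might look like the sketch below, assuming the Dataset/<gesture>/ layout created earlier; the actual preprocessing/create_final_dataset.py may differ in details such as the width/height ordering.

```python
# Sketch: resize every captured image to the network's input size.
import os
import cv2

DATASET_DIR = "Dataset"
# cv2.resize takes (width, height); (100, 89) matches an (89, 100) rows x cols
# input shape -- this ordering is an assumption.
TARGET_SIZE = (100, 89)

for gesture in os.listdir(DATASET_DIR):
    gesture_dir = os.path.join(DATASET_DIR, gesture)
    if not os.path.isdir(gesture_dir):
        continue
    for name in os.listdir(gesture_dir):
        path = os.path.join(gesture_dir, name)
        img = cv2.imread(path)
        if img is None:
            continue
        cv2.imwrite(path, cv2.resize(img, TARGET_SIZE))
```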

How to run the Real time prediction

Run ContinuousGesturePredictor.py and wait for a window with the video feed to open. Once it opens, wait for the background elimination algorithm to register the background before performing any gestures (you will see the numbers 0 to 29 printed in the console). Once that is done, start performing the gestures; another window named "Thresholded" opens. To start recognizing the gestures, press 's' on your keyboard ('q' to quit). A window named "Statistics" will open and print the results. The same file also transmits the recognized gesture over the internet using the PubNub API; it acts as the publisher.

Run Actuator.py on the same or another computer (connect the Arduino for hardware actuation if required); a window named "Home" opens with various appliances represented in a simple way. This file acts as the subscriber and receives any message transmitted by the publisher (ContinuousGesturePredictor.py). As the messages (gestures) are received, the corresponding actuations occur in the simulation as well as on the hardware (see the sketch below). The actions corresponding to each gesture are listed in the table that follows.
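
A minimal sketch of the subscriber side is shown below: it receives gestures over PubNub and drives an Arduino pin with pyfirmata. The keys, channel name, serial port, and pin mapping are assumptions, and the real Actuator.py additionally drives the GUI simulation.

```python
# Sketch: subscriber that actuates an Arduino pin on received gestures.
from pubnub.callbacks import SubscribeCallback
from pubnub.pnconfiguration import PNConfiguration
from pubnub.pubnub import PubNub
from pyfirmata import Arduino

board = Arduino("/dev/ttyUSB0")   # assumed serial port of the Arduino
RED_LIGHT_PIN = 13                # assumed pin wired to the red light

class GestureListener(SubscribeCallback):
    def message(self, pubnub, event):
        gesture = event.message.get("gesture")
        if gesture == "Thumbs_up":
            board.digital[RED_LIGHT_PIN].write(1)   # Red light ON
        elif gesture == "Fist":
            board.digital[RED_LIGHT_PIN].write(0)   # Red light OFF

pnconfig = PNConfiguration()
pnconfig.publish_key = "your-publish-key"
pnconfig.subscribe_key = "your-subscribe-key"
pnconfig.uuid = "gesture-subscriber"

pubnub = PubNub(pnconfig)
pubnub.add_listener(GestureListener())
pubnub.subscribe().channels("gesture-channel").execute()
```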

Various gestures mapped to their actuations

| Gesture | Actuation |
| --- | --- |
| Thumbs up | Red light ON |
| Thumbs down | Lighter Red ON |
| Fist | Red light OFF |
| One | Fan ON |
| Two | Green light ON |
| Three | Lighter Green ON |
| Four | Green OFF |
| Palm-Five | Floor clean |
| Stop | Fan OFF |
| OK | Predict rain |
| Direction right | T.V channel change |

Output

:octocat: Here is the final output: output

Other links

Click here to check out my other projects
