- Install conda
- Run the following:
conda env create -f requirements.yml
conda activate Image-Project
python3 main.py <input directory path> <output directory path>
e.g. python3 main.py test-cases test-out --> this parses the input folder test-cases and, for each image in it, produces an output file that is placed in the test-out folder
-
The input image is converted to binary using Otsu's method, then inverted so that the symbols of interest have the higher pixel values, which makes them easier to work with.
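A minimal sketch of this step, written directly in NumPy so that it is self-contained. The function names `otsu_threshold` and `binarize_and_invert` are hypothetical and are not taken from the repo, which may well call a library routine instead. It assumes an 8-bit grayscale image with dark ink on a light background.

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold of an 8-bit grayscale image (dtype uint8)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of class "<= t"
    mu = np.cumsum(prob * np.arange(256))      # cumulative mean up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom  # between-class variance
    sigma_b[denom == 0] = 0.0                  # thresholds that leave one class empty
    return int(np.argmax(sigma_b))

def binarize_and_invert(gray: np.ndarray) -> np.ndarray:
    """Binarize with Otsu, then invert so that ink pixels are 1 and paper is 0."""
    t = otsu_threshold(gray)
    binary = gray > t                          # True for the light background
    return (~binary).astype(np.uint8)          # inverted: symbols get the high value
```

After this step the staff lines and symbols are the nonzero pixels, which is what the projection and morphology steps operate on.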
-
The inverted image is then segmented into a number of staves (3 in this case) using histogram projection analysis.
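One way projection-based segmentation can work is sketched below. The names `staff_line_rows` and `split_staves`, the `min_fraction` cutoff, and the choice to cut halfway between neighbouring staves are all assumptions made for illustration, not the repo's actual code. The sketch also assumes five lines per staff.

```python
import numpy as np

def staff_line_rows(inverted: np.ndarray, min_fraction: float = 0.5) -> list:
    """Return one row index per staff line, found from the horizontal projection."""
    profile = inverted.sum(axis=1)                       # ink pixels per row
    rows = np.flatnonzero(profile >= min_fraction * inverted.shape[1])
    if rows.size == 0:
        return []
    # Consecutive rows belong to the same (thick) line; keep the centre of each run.
    runs = np.split(rows, np.flatnonzero(np.diff(rows) > 1) + 1)
    return [int(run.mean()) for run in runs]

def split_staves(inverted: np.ndarray, lines_per_staff: int = 5) -> list:
    """Return (top, bottom) row ranges, one per staff, cut between neighbouring staves."""
    centers = staff_line_rows(inverted)
    staves = [centers[i:i + lines_per_staff]
              for i in range(0, len(centers) - lines_per_staff + 1, lines_per_staff)]
    boxes = []
    for i, staff in enumerate(staves):
        top = 0 if i == 0 else (staves[i - 1][-1] + staff[0]) // 2
        last = i == len(staves) - 1
        bottom = inverted.shape[0] if last else (staff[-1] + staves[i + 1][0]) // 2
        boxes.append((top, bottom))
    return boxes
```

Each `(top, bottom)` pair can then be used to crop one staff with `inverted[top:bottom]`.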
-
We then loop over the boxes produced by the previous step. For each box we remove the horizontal staff lines using morphological operations and contour finding, then reconstruct the damaged symbols using column dilation and an AND operation with the original image.
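The reconstruction idea (column dilation followed by an AND with the original) can be sketched as follows. Note the swap: to stay self-contained, this sketch finds the staff lines with a simple row projection instead of the morphological operations and contour finding described above. The names `dilate_columns` and `remove_staff_lines` and the `min_fraction` and `radius` parameters are hypothetical.

```python
import numpy as np

def dilate_columns(img: np.ndarray, radius: int) -> np.ndarray:
    """Vertical (column-wise) dilation of a boolean image by `radius` pixels."""
    out = img.copy()
    for k in range(1, radius + 1):
        out[k:] |= img[:-k]      # spread each pixel downwards by k rows
        out[:-k] |= img[k:]      # spread each pixel upwards by k rows
    return out

def remove_staff_lines(staff_img: np.ndarray, min_fraction: float = 0.5,
                       radius: int = 1) -> np.ndarray:
    """Erase staff-line rows, then restore symbol pixels that crossed them."""
    img = staff_img.astype(bool)
    line_rows = img.sum(axis=1) >= min_fraction * img.shape[1]
    no_lines = img.copy()
    no_lines[line_rows, :] = False               # this also damages the symbols
    # Grow the surviving symbol pixels vertically, then keep only pixels that
    # were ink in the original image: this fills the cuts left by the lines.
    restored = dilate_columns(no_lines, radius) & img
    return restored.astype(np.uint8)
```

In this sketch `radius` should be at least half the staff-line thickness (rounded up), otherwise the dilation cannot bridge the gap left by an erased line.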
-
Using find contours, we segment the resulting no_staff_line image of each staff into a number of subimages, each containing one symbol.
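As a rough illustration of cutting the staff into one box per symbol, the sketch below uses connected-component labelling with a breadth-first search. This is a stand-in for contour finding, not the repo's implementation, and `symbol_boxes` is a hypothetical name. It assumes 8-connectivity and returns the boxes ordered left to right.

```python
from collections import deque
import numpy as np

def symbol_boxes(no_staff: np.ndarray) -> list:
    """Return (left, top, right, bottom) boxes, one per connected blob of ink."""
    h, w = no_staff.shape
    seen = np.zeros((h, w), dtype=bool)
    boxes = []
    for y0, x0 in zip(*np.nonzero(no_staff)):
        if seen[y0, x0]:
            continue
        seen[y0, x0] = True
        queue = deque([(int(y0), int(x0))])
        top = bottom = int(y0)
        left = right = int(x0)
        while queue:
            y, x = queue.popleft()
            top, bottom = min(top, y), max(bottom, y)
            left, right = min(left, x), max(right, x)
            for dy in (-1, 0, 1):                 # visit the 8 neighbours
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and no_staff[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
        boxes.append((left, top, right + 1, bottom + 1))  # right/bottom exclusive
    return sorted(boxes)                          # left-to-right reading order
```

A subimage for one symbol is then `no_staff[top:bottom, left:right]`.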
-
The segmented images from the previous step are classified into 3 groups of symbols based on the length of the array returned by filled_holes_centers, which gives the number of ovals in the symbol image.
-
If the number of ovals is 0, the image is either segmentation noise or a symbol with no ovals, such as an accidental or a number. We distinguish between them using a support vector machine model (model[0]) trained on a dataset that we built ourselves.
-
If the number of ovals is 1, the symbol is a quarter, 8th or 32nd note, or sometimes a G-clef. We distinguish between these using another model (model[1]) trained on these symbols.
-
If the number of ovals is greater than 1, the symbol is a beam or a chord.
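The three-way split above amounts to a small dispatch on the oval count. The sketch below only illustrates that routing. `SymbolGroup` and `symbol_group` are hypothetical names, and the argument is assumed to be a list of centres like the one filled_holes_centers returns.

```python
from enum import Enum

class SymbolGroup(Enum):
    NO_OVAL = "noise, accidental or number: classify with model[0]"
    ONE_OVAL = "quarter, 8th or 32nd note, or G-clef: classify with model[1]"
    MULTI_OVAL = "beam or chord"

def symbol_group(oval_centers) -> SymbolGroup:
    """Pick the symbol group from the number of detected oval centres."""
    count = len(oval_centers)
    if count == 0:
        return SymbolGroup.NO_OVAL
    if count == 1:
        return SymbolGroup.ONE_OVAL
    return SymbolGroup.MULTI_OVAL
```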
-
By comparing the oval centres returned by filled_holes_centers with the y coordinates of the staff lines, we can determine the pitch of each note.
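A sketch of how a y coordinate can be mapped to a pitch. It assumes a treble clef (bottom line E4), five evenly spaced staff lines, and output names such as "E4". These are illustrative assumptions, and `pitch_from_y` is a hypothetical helper, not a function from the repo.

```python
NOTE_NAMES = "CDEFGAB"

def pitch_from_y(center_y: float, line_ys) -> str:
    """Map an oval centre's y coordinate to a pitch name, assuming a treble clef."""
    lines = sorted(line_ys)                      # image y grows downwards
    half_space = (lines[-1] - lines[0]) / 8.0    # 4 spaces = 8 line/space steps
    # Diatonic steps above the bottom line; negative means below the staff.
    steps = round((lines[-1] - center_y) / half_space)
    index = 4 * 7 + 2 + steps                    # bottom line is E4 (octave 4, E = 2)
    return f"{NOTE_NAMES[index % 7]}{index // 7}"
```

For example, with staff lines at y = 10, 20, 30, 40, 50, a centre at y = 50 sits on the bottom line (E4) and a centre at y = 45 sits in the first space (F4).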
-
Note that we treat any image whose length or width is less than a certain threshold as a dot symbol.
This is the repo for the image processing project where we turn images of musical notes
-
How to read a note (Khan Academy) -> very important
-
How to deal with music in Python (Arabic) -> very important