This repository presents a methodology for determining the positions of points on raster images of scatter plots.
The methodology is semi-automatic and requires user interaction at several stages of the algorithm.
First, the user selects a target marker. Then, in interactive mode, the user varies the algorithm's parameters and chooses the best result. The output is the coordinates of the points for the selected marker.
The algorithm treats scatter plot digitization as a pattern detection problem. Marker detection is based on the Template Matching algorithm, the Generalized Hough Transform, and Non-Maximum Suppression.
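To illustrate two of the building blocks named above, here is a minimal sketch of template matching followed by non-maximum suppression on a synthetic image. It is not the repository's implementation: the function names `match_template_ncc` and `non_maximum_suppression`, the normalized cross-correlation score, and the threshold values are assumptions chosen for illustration, and the Generalized Hough Transform step is omitted.

```python
import numpy as np


def match_template_ncc(image: np.ndarray, template: np.ndarray) -> np.ndarray:
    """Normalized cross-correlation of a 2D template over a 2D image (hypothetical helper)."""
    th, tw = template.shape
    # All template-sized windows of the image, shape (H-th+1, W-tw+1, th, tw).
    windows = np.lib.stride_tricks.sliding_window_view(image, (th, tw)).astype(float)
    t = template.astype(float) - template.mean()
    w = windows - windows.mean(axis=(2, 3), keepdims=True)
    numerator = (w * t).sum(axis=(2, 3))
    denominator = np.sqrt((w ** 2).sum(axis=(2, 3)) * (t ** 2).sum())
    # Flat windows have zero variance; score them 0 instead of dividing by zero.
    return np.divide(numerator, denominator,
                     out=np.zeros_like(numerator), where=denominator > 0)


def non_maximum_suppression(scores: np.ndarray, threshold: float, min_distance: int):
    """Greedily keep the best-scoring positions that are at least min_distance apart."""
    rows, cols = np.nonzero(scores >= threshold)
    order = np.argsort(scores[rows, cols])[::-1]  # best candidates first
    kept = []
    for r, c in zip(rows[order], cols[order]):
        if all(max(abs(r - kr), abs(c - kc)) >= min_distance for kr, kc in kept):
            kept.append((int(r), int(c)))
    return kept


# Synthetic 40x60 "plot": white background with three dark plus-shaped 5x5 markers.
marker = np.ones((5, 5))
marker[2, :] = 0.0
marker[:, 2] = 0.0
image = np.ones((40, 60))
for r, c in [(5, 10), (20, 30), (30, 50)]:  # top-left corners of the markers
    image[r:r + 5, c:c + 5] = marker

scores = match_template_ncc(image, marker)
detections = non_maximum_suppression(scores, threshold=0.9, min_distance=5)
# Convert top-left corners of the matched windows to marker centers.
centers = sorted((r + 2, c + 2) for r, c in detections)
print(centers)
```

In this sketch, `threshold` and `min_distance` play the role of the parameters a user would tune interactively: a lower threshold admits weaker matches, and a larger distance merges nearby detections into one.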
The algorithm supports both grayscale and RGB images. A scatter plot may contain several marker types of any shape and color.
Input data that is not supported:
- markers with a color gradient
- semi-transparent markers
- markers of the same type but with different sizes
Detailed information about installation and usage can be found in the user documentation.
- Clone the repository:
git clone https://github.com/adusachev/scanplot.git <REPO>
cd <REPO>
- Add your data (plot images) to <REPO>/datasets/ so that the images can be accessed from inside the Docker container.
- Start the Docker container:
docker compose up -d
- Go to http://localhost:8888 and run the notebook main.ipynb
- Stop the Docker container:
docker compose down
The algorithm is ready to use, but some features are still under development, so keep an eye out for updates.
The proposed algorithm was compared with existing plot digitization tools:
- WebPlotDigitizer: https://automeris.io/
- DigitizeIt: https://www.digitizeit.xyz/
- PlotDigitizer: https://plotdigitizer.com/
Planned improvements:
- Adding the ability to select a region of interest
- Adding the ability to manually edit detected data points
- Improving the accuracy of the algorithm on black-and-white images
- Implementing a graphical user interface for the algorithm
Plot image sources:
[1] Meister M, Schall E, Dziak R, Spiesecke S, Thomisch K (2024) A multi-year analysis of acoustic occurrence and habitat use of blue and fin whales in eastern and central Fram Strait. PLoS ONE 19(11): e0314369.
[2] Darcel, C., Davy, P., Le Goc, R., de Dreuzy, J. R., & Bour, O. (2009). Statistical methodology for discrete fracture model-including fracture size, orientation uncertainty together with intensity uncertainty and variability.
[3] Bonnet, E., Bour, O., Odling, N. E., Davy, P., Main, I., Cowie, P., & Berkowitz, B. (2001). Scaling of fracture systems in geological media. Reviews of geophysics, 39(3), 347-383.
[4] Kim J, Woo HK, Vimalajeewa D, Vidakovic B (2023) Analysis and classification of 1H-NMR spectra by multifractal analysis. PLoS ONE 18(6): e0286205.
[5] Diaferio, A. (1999). Mass estimation in the outer regions of galaxy clusters. Monthly Notices of the Royal Astronomical Society, 309(3), 610-622.