Instagram Fake Account Detection

This project is a deliverable for the "Artificial Intelligence" exam of the Master's Degree course in Computer Science at Alma Mater Studiorum.

Introduction

Instagram is one of the most widely used social networks today, especially among young people, companies, and content creators. However, the large number of fake accounts degrades the experience of all its users by spreading spam, fraud, and fake engagement.

The aim of this project is therefore to provide an in-depth analysis of the features that are useful for discriminating these accounts with Machine Learning techniques, with a particular interest in explainability. The work starts from two existing proposals from the literature and refines the attribute set of each, both by proposing new features derived from the existing ones and by removing features considered unnecessary. Various Machine Learning techniques are then compared to give a better understanding of the solution, with positive results.
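
As an illustration of the two kinds of change described above (deriving a new feature from existing ones and removing an unneeded one), here is a minimal sketch. The feature names `follower_count`, `following_count`, and `has_profile_picture`, as well as the function `engineer_features`, are hypothetical and chosen only for the example; they are not taken from the project's datasets or code.

```python
def engineer_features(account, drop=("has_profile_picture",)):
    """Return a copy of `account` with one derived feature added and
    the features listed in `drop` removed.

    All feature names here are hypothetical, for illustration only.
    """
    features = dict(account)
    # Derived feature: ratio of followers to followed accounts.
    # max(..., 1) avoids a division by zero for accounts following nobody.
    features["follower_following_ratio"] = (
        account["follower_count"] / max(account["following_count"], 1)
    )
    # Remove the features considered unnecessary (ignored if absent).
    for name in drop:
        features.pop(name, None)
    return features


example = {"follower_count": 50, "following_count": 1000, "has_profile_picture": 1}
print(engineer_features(example))
```

Which features are actually derived or removed in the project is documented in the report, not in this sketch.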

Installing and running the project

This section contains instructions to configure and run the project. Depending on your installation, you may need to invoke python as python3. Python 3 is required to run this project, and so is the venv module.
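
If you are unsure whether your interpreter meets these requirements, a small check along the following lines may help. This is only a sketch: `check_prerequisites` is a hypothetical helper, not part of the project, and it only verifies that venv can be imported, which does not guarantee that creating an environment will succeed on every system.

```python
import sys


def check_prerequisites():
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if sys.version_info[0] < 3:
        problems.append("Python 3 is required")
    try:
        # venv is part of the standard library since Python 3.3.
        import venv  # noqa: F401
    except ImportError:
        problems.append("the venv module is missing")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("OK" if not issues else "; ".join(issues))
```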

1. Download or clone the project from GitLab.
2. Open a terminal inside the project directory.
3. Create a virtual environment by running python -m venv venv. The project's libraries will be installed into it.
4. Install the libraries with ./venv/bin/pip install -r ./requirements.txt.
5. Run the script main.py with ./venv/bin/python ./main.py.
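
The commands in the steps above use the venv/bin layout found on Linux and macOS; on Windows, venv places the executables in venv\Scripts instead. The following sketch builds the same three commands for either layout. `setup_commands` is a hypothetical helper written for this README, not a script shipped with the project.

```python
import os
from pathlib import Path


def setup_commands(project_dir=".", windows=(os.name == "nt")):
    """Return the three setup commands as argument lists."""
    project = Path(project_dir)
    venv_dir = project / "venv"
    # venv puts executables in Scripts/ on Windows and bin/ elsewhere.
    bin_dir = venv_dir / ("Scripts" if windows else "bin")
    return [
        ["python", "-m", "venv", str(venv_dir)],
        [str(bin_dir / "pip"), "install", "-r", str(project / "requirements.txt")],
        [str(bin_dir / "python"), str(project / "main.py")],
    ]


if __name__ == "__main__":
    # Only print the commands; running them is left to the user.
    for command in setup_commands():
        print(" ".join(command))
```

Each list could be passed to `subprocess.run(command, check=True)` to execute it, keeping in mind that main.py is interactive and expects input from the terminal.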

Once the script is running, follow the on-screen instructions. Do not run generate_dl_dataset.py: it is not needed for the demo, and because it resets the deep learning datasets it will invalidate all the work done on the MLP experiments.

Codebase structure

main.py is the executable script that runs the experiments.
generate_dl_dataset.py is the executable script that instantiates or resets the datasets used by the deep learning models.
utils/utils.py contains helper functions, such as those that run the experiments and compute the metric scores.
dataset/
- normalizer.py contains a script that creates a single dataset from two different ones and exports it in .json format.
- utils.py contains helper functions for working with the datasets, such as shuffling, splitting, and building combined datasets.
- deep/ contains all the fixed datasets for multilayer perceptron experiments.
- sources/ contains the datasets that are being used for the experiments.
  - automatedAccountData.json contains the fake accounts of the InstaFake dataset.
  - nonautomatedAccountData.json contains the real accounts of the InstaFake dataset.
  - user_fake_authentic_2class.csv contains the IJECE dataset.
deep/ contains all the functions and models related to the multilayer perceptron.
- common.py contains several utility functions for the multilayer perceptron.
- experiment.py contains the main experiment runner for multilayer perceptron.
- combined/ contains training scripts, model definitions and models for the "combined" datasets.
- compatible/ contains training scripts, model definitions and models for the "compatible" datasets.
- IJECE/ contains training scripts, model definitions and models for the "IJECE" datasets.
- InstaFake/ contains training scripts, model definitions and models for the "InstaFake" datasets.
visualization/
- plotter.py contains helper functions to plot data and results.
- plots/ contains the plots for the data analysis on the original datasets.
- plots_results/ contains the plots showing the results of the experiments.
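
To give an idea of the kind of operation the dataset helpers are described as providing (combining sources, shuffling, and splitting), here is a minimal sketch. It is not the code in dataset/utils.py: the function `combine_and_split`, the `fake` label key, and the toy records are all assumptions made for the example, and the real files in dataset/sources/ may have a different layout.

```python
import random


def combine_and_split(fake, real, test_fraction=0.2, seed=0):
    """Label, merge, shuffle, and split two lists of account records.

    Returns a (train, test) pair. The "fake" key is a hypothetical label.
    """
    # Attach a label to a copy of each record: 1 for fake, 0 for real.
    labeled = [dict(a, fake=1) for a in fake] + [dict(a, fake=0) for a in real]
    # A seeded generator keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(labeled)
    n_test = int(len(labeled) * test_fraction)
    return labeled[n_test:], labeled[:n_test]


# Toy records standing in for the real datasets.
fake_accounts = [{"id": i} for i in range(5)]
real_accounts = [{"id": i} for i in range(5, 10)]
train, test = combine_and_split(fake_accounts, real_accounts)
print(len(train), len(test))  # 8 2
```

Fixing the seed matters for comparing models fairly: every technique is then evaluated on the same train/test partition.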

alberto-paparella/InstagramFakeAccountDetection
