This is an analytics engineering project where I use my Spotify data to create a dashboard that visualises my listening habits for each year. The project was inspired by Spotify Wrapped, and by how lackluster I think the product was in 2024. I request my data from Spotify, which comes back as JSON files. I then use dltHub to build a pipeline that extracts track, artist, and album details from the Spotify API, duckdb to store the data locally, and dbt to transform the data into a star schema. Finally, I feed this data into Tableau to create the dashboard.
Tech stack:

- Python
- SQL
- dbt
- dltHub
- duckdb
- Tableau
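As a rough illustration of the extract-and-load step described above, a minimal dlt-to-duckdb sketch might look like the following. The file locations, resource name, and dataset name are placeholders rather than the project's actual code, which lives in `spotify_extract_load.py` and also enriches the plays with track, artist, and album details from the Spotify API:

```python
import json
from pathlib import Path

import dlt


@dlt.resource(name="streaming_history", write_disposition="replace")
def streaming_history(history_dir: str = "data/raw"):
    # Spotify's extended streaming history export is a set of JSON files,
    # each containing an array of individual play records. The directory
    # and filename pattern here are assumptions, not the project's layout.
    for path in sorted(Path(history_dir).glob("Streaming_History*.json")):
        with open(path, encoding="utf-8") as f:
            yield from json.load(f)


# Load the raw plays into a local duckdb file; dlt infers the table schema.
pipeline = dlt.pipeline(
    pipeline_name="spotify",
    destination="duckdb",
    dataset_name="raw",
)
print(pipeline.run(streaming_history()))
```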
The dashboard is split into two distinct sections:

- Insights at a glance, on the left-hand side of the dashboard. These are one-liners that give you an overview of your listening habits across tracks, artists, albums, and genres, much like the quick summary Spotify Wrapped offers.
- Detailed insights, on the right-hand side of the dashboard. This is a closer look at the top tracks, artists, and albums you've listened to, as well as your top genres, similar to Spotify Wrapped's deeper dives.

There is also a year filter on the dashboard, so you can see how your listening habits have changed over time.
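Under the hood these views are driven by the mart tables that dbt produces. As a purely illustrative sketch (the database path and the table and column names below are placeholders, not the project's actual mart schema), a "top tracks for a year" insight maps to a duckdb query of roughly this shape:

```python
import duckdb

# Hypothetical example: spotify.duckdb, fct_streams, dim_track, played_at,
# and ms_played are assumed names, not the project's real schema.
con = duckdb.connect("spotify.duckdb", read_only=True)
print(
    con.sql("""
        select
            t.track_name,
            t.artist_name,
            round(sum(f.ms_played) / 60000.0, 1) as minutes_played
        from fct_streams f
        join dim_track t using (track_id)
        where year(f.played_at) = 2024   -- the dashboard's year filter
        group by all
        order by minutes_played desc
        limit 5
    """)
)
```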
To run this yourself:

- Request your extended streaming history data from Spotify
- Clone this repository to your local PC
- Follow the instructions on creating an app to use the Spotify API, get your `client_id` and `client_secret`, and store these in a `.env` file in the root directory of the project as `SPOTIFY_CLIENT_ID="your_client_id"` and `SPOTIFY_CLIENT_SECRET="your_client_secret"` (see the first sketch after this list for a quick way to confirm they are picked up)
- Create a virtual environment by running `python -m venv venv` in the root directory of the project
- Activate the virtual environment by running `source venv/bin/activate` on macOS/Linux or `venv\Scripts\activate` on Windows
- Install the necessary packages by running `pip install -r requirements.txt`
- Run the `spotify_extract_load.py` file with `python spotify_extract_load.py` to extract the data from the JSON files and load it into a duckdb database
- Once the pipeline has run, navigate to the `spotify_transform` directory and run `dbt run` to transform the data into a star schema
- Once this is complete, run `dbt test` to check that the data is correct (a quick way to inspect the resulting duckdb database is sketched after this list)
- Now save the files locally by running `save_local.py` in the root directory. This saves files for all layers of the data transformation process; we need the files from the `data/mart` directory
- Open the `Spotify Yearbook.twbx` file in Tableau Public and connect to the files in the `data/mart` directory. You will need to refresh the data source to ensure the data is up to date
- The Tableau dashboard should now be populated with your data!
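Two optional sanity checks referenced in the steps above. First, to confirm the credentials in your `.env` file are being picked up, something like this works, assuming the `python-dotenv` package is available (the project's own loading code may differ):

```python
import os

from dotenv import load_dotenv

# Pull SPOTIFY_CLIENT_ID / SPOTIFY_CLIENT_SECRET from the .env file in the
# project root into the process environment.
load_dotenv()

for var in ("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET"):
    print(f"{var}: {'set' if os.getenv(var) else 'MISSING'}")
```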
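Second, after the pipeline and dbt steps you can inspect the duckdb database before moving on to Tableau. The database filename below is a guess, so adjust it to whatever `spotify_extract_load.py` actually creates:

```python
import duckdb

# Open the database produced by the pipeline (filename is an assumption).
con = duckdb.connect("spotify.duckdb", read_only=True)

# List every schema and table to confirm the raw and mart layers exist.
print(con.sql("""
    select table_schema, table_name
    from information_schema.tables
    order by table_schema, table_name
"""))
```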