Repository of public datasets and examples provided by Neo4j

IMDB

This dataset originates from https://github.com/seongjunyun/Graph_Transformer_Networks which in turn obtained the dataset from earlier sources. The dataset has been processed to be compatible with the Snowflake application Neo4j Graph Analytics.

The graph contains node labels 'ACTOR', 'MOVIE' and 'DIRECTOR' and relationship types 'ACTED_IN' and 'DIRECTED_IN'. All nodes contain 1256 dimensional feature vectors via the property "plot_keywords". Some of the MOVIE nodes have a 'GENRE' value of 0, 1 or 2.

A method is provided in imdb/load_to_pandas.py to load the dataset into dataframes describing nodes and relationships for the IMDB graph.

Ingesting the IMDB dataset into Snowflake

A script imdb/upload_imdb_to_snowflake.py uploads the data into a snowflake account. For usage of this script run python upload_imdb_to_snowflake.py from the imdb folder. The script takes command line arguments about snowflake configuration. The snowflake host depends on region but can be for example, snowflake.eu-west-2.aws.snowflakecomputing.com.

Running GraphSAGE on IMDB in Snowflake

The file imdb/graph_sage.sql is provided that you can run for example from inside the SnowsightUI after installing the Snowflake application Neo4j Graph Analytics inside your Snowflake account. This sql file is a template in which you need to replace the placeholders.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Repository of public datasets and examples provided by Neo4j

IMDB

Ingesting the IMDB dataset into Snowflake

Running GraphSAGE on IMDB in Snowflake

Files

README.md

Latest commit

History

README.md

File metadata and controls

Repository of public datasets and examples provided by Neo4j

IMDB

Ingesting the IMDB dataset into Snowflake

Running GraphSAGE on IMDB in Snowflake