ADGC is a collection of state-of-the-art (SOTA) and novel deep graph clustering methods (papers, code, and datasets). Contributions of other interesting papers and code are welcome. If you have any problems, please contact [email protected]. If you find this repository useful for your research or work, we would really appreciate a star. ❤️
Deep graph clustering, which aims to reveal the underlying graph structure and divide the nodes into different groups, has attracted intensive attention in recent years.
- K-Means: "Algorithm AS 136: A k-means clustering algorithm" [paper|code]
- DEC (ICML16): "Unsupervised Deep Embedding for Clustering Analysis" [paper|code]
- DCN (ICML17): "Towards k-means-friendly spaces: Simultaneous deep learning and clustering" [paper|code]
- IDEC (IJCAI17): "Improved Deep Embedded Clustering with Local Structure Preservation" [paper|code]
- GAE/VGAE: "Variational Graph Auto-Encoders" [paper|code]
- DAEGC (IJCAI19): "Attributed Graph Clustering: A Deep Attentional Embedding Approach" [paper|code]
- ARGA/ARVGA (TCYB19): "Learning Graph Embedding with Adversarial Training Methods" [paper|code]
- MCGC (TIP19): "Multiview Consensus Graph Clustering" [paper|code]
- SDCN/SDCN_Q (WWW20): "Structural Deep Clustering Network" [paper|code]
- AGE (SIGKDD20): "Adaptive Graph Encoder for Attributed Graph Embedding" [paper|code]
- MVGRL (ICML20): "Contrastive Multi-View Representation Learning on Graphs" [paper|code]
- DFCN (AAAI21): "Deep Fusion Clustering Network" [paper|code]
- MCGC (NIPS21): "Multi-view Contrastive Graph Clustering" [paper|code]
- GCC (ICCV21): "Graph Contrastive Clustering" [paper|code]
- DCRN (AAAI22): "Deep Graph Clustering via Dual Correlation Reduction" [paper|code]
We divide the datasets into two categories: graph datasets and non-graph datasets. Graph datasets are real-world graphs, such as citation networks and social networks. Non-graph datasets have no graph structure. If necessary, an "adjacency matrix" can be constructed for them with the K-Nearest Neighbors (KNN) algorithm.
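As a rough illustration of how an adjacency matrix can be built for a non-graph dataset, the sketch below connects every sample to its `k` most similar samples. The function name `knn_adjacency`, the use of cosine similarity, and the symmetrization rule are assumptions made for this example. They are not taken from this repository's `construct_graph`, which may differ.

```python
import numpy as np


def knn_adjacency(features: np.ndarray, k: int = 5) -> np.ndarray:
    """Build a symmetric, unweighted KNN adjacency matrix (hypothetical helper).

    features: array of shape (n_samples, n_dims), one sample per row.
    k: number of neighbors per sample; must be smaller than n_samples.
    """
    # Cosine similarity between all pairs of samples (an assumed choice of metric).
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T

    # Exclude self-matches so that a node is never its own neighbor.
    np.fill_diagonal(sim, -np.inf)

    # Indices of the k most similar samples for every row.
    neighbors = np.argpartition(-sim, k - 1, axis=1)[:, :k]

    n = features.shape[0]
    adj = np.zeros((n, n), dtype=np.float32)
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbors.ravel()] = 1.0

    # Symmetrize: keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)
```

The result is a dense `n x n` matrix, which is fine for a few thousand samples. For larger datasets a sparse representation would likely be preferable.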
- Step 1: Download all datasets from [Google Drive] | [Baidu Netdisk, code: 1234]. Optionally, download individual datasets from the URLs in the tables below (Google Drive).
- Step 2: Unzip them to ./dataset/
- Step 3: Change the type and the name of the dataset in main.py
- Step 4: Run main.py
- utils.py
  - load_graph_data: load graph datasets
  - load_data: load non-graph datasets
  - normalize_adj: normalize the adjacency matrix
  - diffusion_adj: calculate the graph diffusion
  - construct_graph: construct the KNN graph for non-graph datasets
  - numpy_to_torch: convert numpy to torch
  - torch_to_numpy: convert torch to numpy
- clustering.py
  - setup_seed: fix the random seed
  - evaluation: evaluate the performance of clustering
  - k_means: K-means algorithm
- visualization.py
  - t_sne: t-SNE algorithm
  - similarity_plot: visualize the cosine similarity matrix of the embedding or feature
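To make two of the utilities listed above more concrete, here is a minimal sketch of adjacency normalization and graph diffusion. The signatures, default values, and the choice of personalized PageRank (PPR) diffusion are assumptions for illustration. The actual `normalize_adj` and `diffusion_adj` in utils.py may use different arguments or a different diffusion kernel.

```python
import numpy as np


def normalize_adj(adj, self_loop=True, symmetric=True):
    """Normalize an adjacency matrix (illustrative version, not the repository's code).

    symmetric=True  -> D^{-1/2} (A + I) D^{-1/2}
    symmetric=False -> D^{-1} (A + I), i.e. row normalization
    """
    adj = np.asarray(adj, dtype=np.float64)
    if self_loop:
        adj = adj + np.eye(adj.shape[0])

    deg = adj.sum(axis=1)
    mask = deg > 0  # guard against isolated nodes (zero degree)

    if symmetric:
        d_inv_sqrt = np.zeros_like(deg)
        d_inv_sqrt[mask] = deg[mask] ** -0.5
        return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

    d_inv = np.zeros_like(deg)
    d_inv[mask] = 1.0 / deg[mask]
    return d_inv[:, None] * adj


def diffusion_adj(adj, alpha=0.2, self_loop=True, symmetric=True):
    """PPR-style diffusion: alpha * (I - (1 - alpha) * A_norm)^{-1}.

    alpha is the teleport probability; 0.2 is an assumed default.
    """
    norm = normalize_adj(adj, self_loop=self_loop, symmetric=symmetric)
    n = norm.shape[0]
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * norm)
```

The matrix inverse makes this sketch practical only for small and medium graphs. With row normalization (`symmetric=False`), every row of the diffusion matrix sums to 1, which is a convenient sanity check.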
Graph Datasets
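| Dataset | Samples | Dimension | Edges | Classes | URL |
| --- | --- | --- | --- | --- | --- |
| CORA | 2708 | 1433 | 6632 | 7 | cora.zip |
| CITESEER | 3327 | 3703 | 6215 | 6 | citeseer.zip |
| PUBMED | 19717 | 500 | 44325 | 3 | pubmed.zip |
| DBLP | 4057 | 334 | 3528 | 4 | dblp.zip |
| CITE | 3327 | 3703 | 4552 | 6 | cite.zip |
| ACM | 3025 | 1870 | 13128 | 3 | acm.zip |
| AMAP | 7650 | 745 | 119081 | 8 | amap.zip |
| AMAC | 13752 | 767 | 245861 | 10 | amac.zip |
| CORAFULL | 19793 | 8710 | 63421 | 70 | corafull.zip |
| COCS | - | | | | |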
Non-graph Datasets
| Dataset | Samples | Dimension | Type | Classes | URL |
| --- | --- | --- | --- | --- | --- |
| USPS | 9298 | 256 | Image | 10 | usps.zip |
| HHAR | 10299 | 561 | Record | 6 | hhar.zip |
| REUT | 10000 | 2000 | Text | 4 | reut.zip |
If you use code or datasets in this repository for your research, please cite our paper.
@inproceedings{
}