The original feature preprocess on datasets #33

Open
khan-yin opened this issue May 21, 2023 · 1 comment

Comments

@khan-yin

Hello author, I am a student currently focusing on GNNs. I am curious about how the original features were preprocessed in the datasets for node classification. I want to know whether the original features carry heterogeneity or not. For example, are they generated by metapath2vec, TransE, etc., or perhaps by random walks? I ask because I noticed that on some datasets the original features (feats-type 0) do not even work better than using only the target node features with zero features for the other node types (feats-type 1). Thanks a lot. Looking forward to your reply. 😆

@1049451037
Member

Hi. Thank you for your interest. The original features depend on the dataset. For example, the paper node features in ACM and DBLP are paper keyword n-grams. The author node features are aggregated from the paper features, as suggested in HAN and MAGNN. This early aggregation may be what causes the worse performance. For more information, you can refer to the dataset preprocessing scripts:
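
To make the aggregation idea mentioned above concrete, here is a minimal sketch. It is not taken from the preprocessing scripts. The function name `aggregate_author_features`, the toy matrices, and the choice between sum and mean aggregation are all assumptions made for illustration; the actual scripts may aggregate differently.

```python
import numpy as np


def aggregate_author_features(paper_feats, author_paper, mean=False):
    """Build author features by aggregating the features of their papers.

    paper_feats:  (num_papers, dim) array, e.g. keyword n-gram counts (assumed).
    author_paper: (num_authors, num_papers) 0/1 authorship matrix (assumed).
    mean:         if True, average over papers instead of summing.
    """
    # Each author row is the sum of the feature rows of that author's papers.
    agg = author_paper @ paper_feats
    if mean:
        # Divide by the number of papers per author, guarding against zero.
        counts = author_paper.sum(axis=1, keepdims=True)
        agg = agg / np.maximum(counts, 1)
    return agg


# Toy data: 3 papers with 3 keyword features, 2 authors.
paper_feats = np.array([[1, 0, 1],
                        [0, 1, 1],
                        [1, 1, 0]], dtype=float)
author_paper = np.array([[1, 1, 0],   # author 0 wrote papers 0 and 1
                         [0, 0, 1]],  # author 1 wrote paper 2
                        dtype=float)

author_sum = aggregate_author_features(paper_feats, author_paper)
author_mean = aggregate_author_features(paper_feats, author_paper, mean=True)
```

Under this sketch the author features are a linear function of the paper features, so they carry no information beyond what one round of neighbor aggregation over the paper features could already recover. That fits the suggestion above that early aggregation may contribute to the weaker results.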
