Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

how to calculate the average path length? #511

Closed
YuhuYang opened this issue Feb 16, 2024 · 9 comments · Fixed by #513
Closed

how to calculate the average path length? #511

YuhuYang opened this issue Feb 16, 2024 · 9 comments · Fixed by #513

Comments

@YuhuYang
Copy link

YuhuYang commented Feb 16, 2024

how to calculate global indicators of hypergraphs? Like the average path length, diameter

@YuhuYang YuhuYang changed the title how to calculate the average shortest path length? how to calculate the average path length? Feb 16, 2024
@maximelucas
Copy link
Collaborator

Thanks for the question @YuhuYang.

For a hypergraph H, you can compute all shortest path lenghts with spl = xgi.shortest_path_length(H).
Note: this gives and infinite length for disconnected nodes.

You can then just collect the lengths and take the average like so:

lens = []
for tup in spl:
    lens += tup[1].values()

# remove lengths 0 for self-loops
lens = [i for i in lens if i!=0] 

# replace inf by 0 for disconnected nodes
lens = [0 if i==np.inf else i for i in lens] 

avg_shortest_path = np.sum(lens) / (N * (N-1)) # = np.mean(lens)

You can find other global measures by exploring this page. You can find assortativity measures, clustering coefficient, or the density for example.

@YuhuYang
Copy link
Author

Thanks for the question @YuhuYang.

For a hypergraph H, you can compute all shortest path lenghts with spl = xgi.shortest_path_length(H). Note: this gives and infinite length for disconnected nodes.

You can then just collect the lengths and take the average like so:

lens = []
for tup in spl:
    lens += tup[1].values()

# remove lengths 0 for self-loops
lens = [i for i in lens if i!=0] 

# replace inf by 0 for disconnected nodes
lens = [0 if i==np.inf else i for i in lens] 

avg_shortest_path = np.sum(lens) / (N * (N-1)) # = np.mean(lens)

You can find other global measures by exploring this page. You can find assortativity measures, clustering coefficient, or the density for example.

Thank you very much!

@maximelucas
Copy link
Collaborator

Note for self: not closing because we could put this a recipe or new function.

@YuhuYang
Copy link
Author

However, the code costs much time (now, about 30 minutes). The num of nodes is around 6000. Is this situation normal? (I have used the largest connected component)
image

@maximelucas
Copy link
Collaborator

maximelucas commented Feb 16, 2024

Can you check what part is taking time? I expect it is the shortest_path_length() function. Computing shortest paths is expensive in general. Optimization is on our list for the future.

@YuhuYang
Copy link
Author

Yes, its is the shortest_path_length(). Are there some methods can decrease the time cost?

@thomasrobiglio
Copy link
Collaborator

Hi @YuhuYang! Computing the shortest past between two nodes is a computationally complex task, so it's normal for it to take some time.
In XGI I think we have implemented a version of Dijkstra's algorithm which has a computational complexity of O(N^2) --N here is the number of nodes-- for a single node.
shortest_path_length() does this for every node in your structure so you will have something O(N^3) which is quite slow.
In addition to this polynomial cost of the procedure, XGI is implemented in python which is notably slower than other programming languages.

My suggestions to decrease the computational cost are:

  1. Parallelize the operation by calling single_source_shortest_path_length() for every node in your structure but using different cores of your machine. You can do this using for example the multiprocessing package.
  2. Use libraries for networks which are implemented in C++ or other faster languages and then wrapped in python, notable examples are graph_tool (see here for the corresponding function) or reticula (here).

I hope this helps! Happy coding!

@YuhuYang
Copy link
Author

Thanks again! I'll try them!

@thomasrobiglio
Copy link
Collaborator

thomasrobiglio commented Feb 16, 2024

2. Use libraries for networks which are implemented in C++ or other faster languages and then wrapped in python, notable examples are graph_tool (see here for the corresponding function) or reticula (here).

graph_tool is not a library for higher-order structures (I am less familiar with reticula), so you would need to take the skeleton of your hypergraph/simplicial complex and compute the paths on the corresponding graph.

thomasrobiglio added a commit to thomasrobiglio/xgi that referenced this issue Feb 29, 2024
thomasrobiglio added a commit that referenced this issue Mar 3, 2024
…length (#513)

* Update recipes.ipynb

* Added recipe for average path lenght, related to #511

* fixed displayment of links

* resp. to review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants