WIP Updates to modularity PR #358

Merged
merged 19 commits into feature/clustering-algos
Mar 12, 2024

Conversation

stefan-apollo
Collaborator

@stefan-apollo stefan-apollo commented Mar 11, 2024

Improve modularity scripts

Running the mod add and toy models, and fixing modularity bugs along the way.

Description

  • Update mod add

Motivation and Context

Make modularity tools useful for mod add/

How Has This Been Tested?

Todo

Does this PR introduce a breaking change?

We'll see

@stefan-apollo stefan-apollo changed the base branch from main to feature/clustering-algos March 11, 2024 17:16
@stefan-apollo stefan-apollo changed the title from "[Not a PR] Play/mod add march" to "WIP Updates to modularity PR" Mar 12, 2024
@stefan-apollo stefan-apollo merged commit e76050c into feature/clustering-algos Mar 12, 2024
nix-apollo added a commit that referenced this pull request Apr 22, 2024
* add modularity.py

* small improvements

* modularity script

* add tests

* update requirements

* small fixes

* remove tabulate dependency

* pytest mark slow

* fix return sets

* plot modular RIB graph

* improvements from self-review

* ignore0 fix

* ablation fixes

* plotting and script improvements

* stefan comment fixes

* pytest ignore fix

* more PR review fixes

* Apply suggestions from code review

Co-authored-by: Stefan Heimersheim <[email protected]>

* fix bisect ablation

* docstring improvements

* comment fixes

* fix none for leiden iters

* Add ablation path arg

* Improve docs / don't run plot_edge_dist for other usecases than it was designed for

* Fix ablation node layers for output node layer

* Fix incorrect if in _make_graph

* Rename run_modularity.py to be for MLPs and add 2nd WIP script for toy models

* Fix typo

* Move LLM specific plotting function into right file

* Move to_results into rib_builder

* Updates to modularity PR (#358)

* Updated mod add yaml

* Allow for different types

* Allow for node layers without .

* Add ablation path arg

* Run w/o mlp

* Rename file to be specific to LLMs

* WIP run script for mod add & toy mlp

* Undo changes

* Fix typo

* Make interface of plot_by_layer more like plot_rib_graph

* Make plot_rib_graph and plot_by_layer interfaces more similar

* Finished plotting interface change (mostly)

Remaining todo: Cluster docstrings

* Move plot_graph_by_layer to plotting.py

* Made line scaling automatic

* Change default norm to Sqrt

* Typo

* Split up gpt2 and Pythia scripts

* Show RIB indices as labels

* Only keep one run script in RIB, move others to nterp/interp/modularity

* Fix labels

* Plot 0 dimension in piano plot

* Move modularity run-script to rib_build and delete modularity folder

* Update docstring

* Get threshold from ablation file

* Allow for labels and non-by-layer plot

* Update docstring

* WIP Rewrote part of plotting code; though contains bug at the moment!

* Hotfix fixed node keys

* Hotfix: Fix color to hex conversion

* Updated configs

* Allow for nodes_per_layer cmdline

* Change nodes_per_layer behaviour

* Allow for an nk seed

* Improve code, better indexing, delete old code

* Allow to specify plot norm

* Print seed

* Improve docstring

* WIP Better sorting method

* Rename piano to paino

* Docstrings

* Set all kwargs

* Fix colors and labelling

* Improved docs

* We decided to enable checks on all branches

* Fix tests

* Rename nodes_per_layer to max_nodes_per_layer

* Fix distributed CI

* Fixed the failing distributed test

* Update sample config yamls

* Add seed to block diagonal DNN config

* Add more edge norms & add option to reproduce legacy normalization

* Rename plotting labels from RIB_ to D (for direction/dimension) because it also plots PCA dims

* Fixed a bug where hide_const_edges was ignored

* Format plots properly

* Set default max_nodes_per_layer high so it doesn't trigger by default in non-LMs

* Allow manually setting lognorm plotting scale

* A bunch of small extensions and fixes of plotting and modularity tools

* Add mod add config files

* Introduce get_norm to simplify code, and add some kwargs

* Add seed sample configs

* More plotting customization

* Fixed ablation config to use test set

* Plotting adjustments. Not super happy with spacing but better than prev situation

* Implement option to specify number of clusters for leiden to find

* Minor logging improvement

* Minor filename changes

* fix typos

* fix outdated doc

---------

Co-authored-by: Stefan Heimersheim <[email protected]>