Commit:
* Add README to run scripts
* Added zenodo

# Installation

To reproduce our results, clone the repository and check out the specific tag to get the state at which the experiments were done:

```
git clone https://github.com/dobraczka/klinker.git
cd klinker
git checkout paper
```
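To verify you are on the intended revision, a quick check (not part of the original instructions):

```
# Should print "paper" after checking out the tag.
git describe --tags
```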

Create a virtual environment with micromamba and install the dependencies:

```
micromamba env create -n klinker-conda --file=klinker-conda.yaml
micromamba activate klinker-conda
pip install -e ".[all]"
```
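As a quick sanity check that the installation worked (assuming the package is importable as `klinker`, matching the repository name):

```
# Exits with an error if the package or its dependencies are broken.
python -c "import klinker"
```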

# Running the experiments

We originally used SLURM job arrays to run our experiments. We adapted the code so it can be run without SLURM, but kept the array structure.
For each embedding-based method, entries 0-15 use sentence transformer embeddings and entries 16-31 rely on SIF-aggregated fastText embeddings.
For entries 24-31, you are expected to have the dimensionality-reduced fastText embeddings in `~/.data/klinker/word_embeddings/100wiki.en.bin`.
For methods without embeddings (`non_relational/run_token.sh` and `relational/run_relational_token.sh`), only entries 0-15 exist.
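To make that layout concrete, here is a minimal illustrative sketch of how an entry index maps to an embedding type; the variable names are hypothetical and do not reflect the actual script internals:

```
# Illustrative only: entries 0-15 -> sentence transformers,
# entries 16-31 -> SIF-aggregated fastText.
ENTRY=$1
if [ "$ENTRY" -lt 16 ]; then
    EMBEDDING="sentence-transformer"
else
    EMBEDDING="SIF-fastText"   # 24-31 additionally need 100wiki.en.bin
fi
echo "Entry $ENTRY uses $EMBEDDING embeddings"
```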

You can reduce the dimensionality of the fastText embeddings like this:
```
import os

import fasttext
import fasttext.util

ft = fasttext.load_model('wiki.en.bin')
# Reduce the embedding dimension from 300 to 100 (PCA-based).
fasttext.util.reduce_model(ft, 100)
# fastText's save_model does not expand "~", so resolve it explicitly.
ft.save_model(os.path.expanduser("~/.data/klinker/word_embeddings/100wiki.en.bin"))
```
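If you do not yet have the pretrained `wiki.en.bin` model, it is distributed on the fastText "Wiki word vectors" page; for example (URL as published by fastText, assumed still available):

```
# Download and unpack the pretrained English Wikipedia model
# (the archive is several GB).
wget https://dl.fbaipublicfiles.com/fasttext/vectors-wiki/wiki.en.zip
unzip wiki.en.zip wiki.en.bin
```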

The experiments can then be run individually by supplying the desired entry as the first argument, e.g.:
```
bash run_scripts/relational/run_token_attribute.sh 16 | ||
```
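To sweep a whole range of entries for one method, a simple shell loop works (illustrative, not part of the provided scripts):

```
# Run all sentence transformer entries (0-15) for one method.
for entry in $(seq 0 15); do
    bash run_scripts/relational/run_token_attribute.sh "$entry"
done
```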