Commit d3eb9b2: add arxiv link and citation

Committed by ibigoulaeva on Jan 16, 2025 (parent commit: 08bc3d0).

Showing 4 changed files with 37 additions and 17 deletions.
21 changes: 10 additions & 11 deletions README.md
@@ -1,5 +1,5 @@
# The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities
-[![Arxiv](https://img.shields.io/badge/Arxiv-YYMM.NNNNN-red?style=flat-square&logo=arxiv&logoColor=white)](https://put-here-your-paper.com)
+[![Arxiv](https://img.shields.io/badge/Arxiv-2501.08716-red?style=flat-square&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2501.08716)
[![License](https://img.shields.io/github/license/UKPLab/arxiv2025-inherent-limits-plms)](https://github.com/UKPLab/arxiv2025-inherent-limits-plms/blob/main/LICENSE)
[![Python Versions](https://img.shields.io/badge/Python-3.9-blue.svg?style=flat&logo=python&logoColor=white)](https://www.python.org/)

@@ -171,17 +171,16 @@ The results are written to `eval_logs.csv` and `bertscore_evals.csv`
If you found this repository helpful, please cite our paper:

```
-@InProceedings{smith:20xx:CONFERENCE_TITLE,
-author = {},
-title = {},
-booktitle = {},
-month = mmm,
-year = {20xx},
-address = {},
-publisher = {},
-pages = {XXXX--XXXX},
-url = {http://xxxx.xxx}
+@misc{bigoulaeva2025inherentlimitspretrainedllms,
+title={The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities},
+author={Irina Bigoulaeva and Harish Tayyar Madabushi and Iryna Gurevych},
+year={2025},
+eprint={2501.08716},
+archivePrefix={arXiv},
+primaryClass={cs.CL},
+url={https://arxiv.org/abs/2501.08716},
+}
```

## Disclaimer
19 changes: 19 additions & 0 deletions data/manual_downloads/README.md
@@ -0,0 +1,19 @@
# Manually-Downloaded Datasets
Some datasets contained within FLAN must be downloaded manually from their creators.

These are:

* Newsroom: https://lil.nlp.cornell.edu/newsroom/
* Fill out the form provided by the authors to get access to the dataset

Paper: [Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies](https://aclanthology.org/N18-1065/) (Grusky et al., 2018)

* Opin iDebate & Opin Movie: http://www.ccs.neu.edu/home/luwang/

Paper: [Neural Network-Based Abstract Generation for Opinions and Arguments](https://aclanthology.org/N16-1007/) (Wang & Ling, 2016)

* Story Cloze: https://cs.rochester.edu/nlp/rocstories/
* Fill out the form provided by the authors to get access to the test dataset
* Following the original FLAN, we use the 2016 version.

Paper: [A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories](https://aclanthology.org/N16-1098/) (Mostafazadeh et al., 2016)
6 changes: 3 additions & 3 deletions data_utils.py
@@ -277,7 +277,7 @@ def make_label_field(example):

def load_hf_dataset(dataset_name, split=None, do_partitioning=False):
print("LOADING DATA SPLIT:", split)
-cutoff_num = 100000
+cutoff_num = 100000 # Set this to avoid fully loading massive datasets
args_dict = {"split": split,
"trust_remote_code": True}

@@ -305,7 +305,7 @@ def load_hf_dataset(dataset_name, split=None, do_partitioning=False):

elif dataset_name == "cnn_dailymail":
args_dict["path"] = dataset_name
-args_dict["name"] = "3.0.0" # FLAN uses 3.1.0??
+args_dict["name"] = "3.0.0" # FLAN uses 3.1.0, but this is unavailable on HF

elif dataset_name == "web_nlg":
args_dict["path"] = dataset_name
@@ -537,7 +537,7 @@ def enumerate_lines(line_list):

# Some datasets require post-processing and filtering
if dataset_name == "snli":
-# Remove "unsure" samples. Source: (TODO: CITE)
+# Remove samples without a consensus label. Source: https://huggingface.co/datasets/stanfordnlp/snli
dataset = dataset.filter(lambda item: item["label"] != -1)
dataset = dataset.add_column("options", [""] * len(dataset))
elif dataset_name == "fix_punct":
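The two annotated changes in `data_utils.py` — capping dataset size to avoid fully loading massive datasets, and dropping SNLI samples without a consensus label — can be illustrated with a minimal, library-free sketch. The helper names below are hypothetical, not the repository's actual API; the one factual anchor is that SNLI marks samples lacking annotator consensus with `label == -1`:

```python
# Illustrative sketch of the two behaviors annotated in the diff above.
# Helper names are hypothetical; only the filtering rule mirrors the repo.

cutoff_num = 100000  # cap to avoid materializing massive datasets in full

def build_split_arg(split: str, cutoff: int = cutoff_num) -> str:
    # Hugging Face `datasets` accepts slice syntax inside split names,
    # e.g. load_dataset(..., split="train[:100000]")
    return f"{split}[:{cutoff}]"

def drop_no_consensus(samples: list) -> list:
    # SNLI marks samples without annotator consensus with label == -1
    return [s for s in samples if s["label"] != -1]

samples = [
    {"premise": "a", "label": 0},
    {"premise": "b", "label": -1},  # no consensus: filtered out
    {"premise": "c", "label": 2},
]

print(build_split_arg("train"))         # train[:100000]
print(len(drop_no_consensus(samples)))  # 2
```

The slice-string form is one way to apply such a cutoff at load time; the diff itself applies `cutoff_num` inside the loading function instead.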
8 changes: 5 additions & 3 deletions test.py
@@ -135,6 +135,8 @@ def format_ic_bigbench(sample, inner_template, task_name, task_dataset, ic_examp
return sample

def filter_bad_samples(example):
+# These samples caused the regex search to hang, since they had too many matches.
+# We can simply remove these.
val = True
if "Applied for full membership" in example["text"]:
val = False
@@ -605,11 +607,11 @@ def run_test(model_name,
}

# Load a trained or base model
-saved_model = "/storage/ukp/work/bigoulaeva/CoT_Recovery/src/saved_models/" + config.args.run_name
+saved_model = config.path + "saved_models/" + config.args.run_name
if config.args.run_name == "base":
saved_model = None
if not from_samplegen:
-out_file = "/storage/ukp/work/bigoulaeva/CoT_Recovery/src/saved_models/base_model_evals/" + eval_name + ".csv"
+out_file = config.path + "saved_models/base_model_evals/" + eval_name + ".csv"
elif from_samplegen:
target_folder = config.path + "saved_models/" + config.samplegen_model + "/"
samplegen_eval_path = "samplegen_pipeline_evals"
@@ -861,7 +863,7 @@ def run_test(model_name,
responses["gold_options"].append(orig_options[idx])
elif from_samplegen:
if config.sample_source == "model":
-if orig_options[0] != "": # Otherwise it breaks for tasks without options (batch_start+idx too high by 1).....
+if orig_options[0] != "": # Otherwise it breaks for tasks without options (batch_start+idx too high by 1)
responses["gold_options"].append(orig_options[batch_start+idx][0])
else:
responses["gold_options"].append("")
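The path changes in `test.py` above replace a hard-coded absolute prefix with the configurable `config.path`, so evaluations can run outside the original author's storage layout. A minimal sketch of the idea, using a stand-in object in place of the repository's real `config` module (which is not shown in this diff):

```python
from types import SimpleNamespace

# Hypothetical stand-in for the repository's config module.
# config.path is assumed to end with a slash, matching the string
# concatenation used in the diff.
config = SimpleNamespace(
    path="/some/project/root/",
    args=SimpleNamespace(run_name="my_run"),
)

# Mirrors the updated line:
# saved_model = config.path + "saved_models/" + config.args.run_name
saved_model = config.path + "saved_models/" + config.args.run_name
print(saved_model)  # /some/project/root/saved_models/my_run
```

Centralizing the root path in one config attribute means a single edit adapts every derived path (saved models, base-model evals, samplegen outputs) to a new machine.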
