This repository contains the data release for the paper "Delving into Qualitative Implications of Synthetic Data for Hate Speech Detection" at EMNLP 2024, by Camilla Casula, Sebastiano Vecellio Salto, Alan Ramponi, and Sara Tonelli.
In this repository, we release the set of data that was manually annotated (3,500 examples in total). This subset does not include the original texts from the Measuring Hate Speech (MHS) Corpus (Kennedy et al., 2020; Sachdeva et al., 2022), but only their comment IDs from the original dataset and the aggregated hate speech label we calculated for them. To pair the source texts with their synthetic versions, the source MHS corpus must therefore be retrieved first. Here is a link to the Measuring Hate Speech Corpus.
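As a minimal sketch of that pairing step, the snippet below joins released rows to source texts on `comment_id`. It assumes the MHS corpus has been loaded as a list of records with `comment_id` and `text` fields, and that a `comment_id` may repeat there (one record per annotation); both points are assumptions about the source corpus, not something this release specifies. The helper name `pair_with_source`, the `source_text` key, and all sample values are hypothetical placeholders.

```python
import csv
import io


def pair_with_source(release_rows, mhs_rows):
    """Attach the original MHS text to each released row via comment_id."""
    # Assumption: a comment_id can occur several times in the MHS records,
    # so only the first text seen for each id is kept.
    source_text = {}
    for row in mhs_rows:
        source_text.setdefault(str(row["comment_id"]), row["text"])

    paired = []
    for row in release_rows:
        merged = dict(row)
        # None signals that the id was not found in the MHS records.
        merged["source_text"] = source_text.get(str(row["comment_id"]))
        paired.append(merged)
    return paired


# Made-up stand-in for one of the released TSV files (placeholder values only).
release_tsv = (
    "comment_id\tlabel_x\tsynth_text\n"
    "101\t0\tplaceholder paraphrase A\n"
    "102\t1\tplaceholder paraphrase B\n"
)
release_rows = list(csv.DictReader(io.StringIO(release_tsv), delimiter="\t"))

# Made-up stand-in for the MHS corpus records.
mhs_rows = [
    {"comment_id": 101, "text": "placeholder original A"},
    {"comment_id": 101, "text": "placeholder original A"},
    {"comment_id": 102, "text": "placeholder original B"},
]

paired = pair_with_source(release_rows, mhs_rows)
for row in paired:
    print(row["comment_id"], "|", row["source_text"], "|", row["synth_text"])
```

Comparing ids as strings avoids mismatches when one side stores them as integers and the other reads them from a text file.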
Warning: These files contain hateful and upsetting language.
The annotation was divided into two sets:
- 500 examples annotated as human or LLM-written. These are contained in the file `human-vs-llm.tsv`, which contains 5 columns:
  - `comment_id`: the original comment id that can be used to retrieve the original text from the MHS corpus.
  - `label_x`: the hate speech label we calculated after the aggregation process for the text corresponding to `comment_id`, based on the annotations of the MHS corpus (`0` means no hate speech, `1` means hate speech).
  - `author`: the source of the `text`. It is a string with the name of the LLM if the text was LLM-written (`llama`, `mistral`, or `mixtral`) or `gold` if it is an original real-world text.
  - `text`: the paraphrased or original text annotated by our annotators.
  - `LLM?`: the annotation by our annotators. It is `TRUE` if the annotator marked the text as LLM-written, `FALSE` if they thought it was a human.
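The columns above can be combined to check how often annotators identified the source of a text correctly. The following is a sketch only: it assumes the file has a header row with exactly the column names listed, the function name `llm_guess_accuracy` is hypothetical, and the sample rows are made-up placeholders rather than data from the release.

```python
import csv
import io
from collections import defaultdict


def llm_guess_accuracy(rows):
    """Per-author share of texts whose `LLM?` annotation matches the true source."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for row in rows:
        # `gold` marks an original real-world text; any other value is an LLM name.
        is_llm = row["author"] != "gold"
        guessed_llm = row["LLM?"].strip().upper() == "TRUE"
        totals[row["author"]] += 1
        correct[row["author"]] += int(is_llm == guessed_llm)
    return {author: correct[author] / totals[author] for author in totals}


# Made-up rows that mimic the five documented columns.
sample_tsv = (
    "comment_id\tlabel_x\tauthor\ttext\tLLM?\n"
    "1\t0\tgold\tplaceholder text\tFALSE\n"
    "2\t1\tllama\tplaceholder text\tTRUE\n"
    "3\t0\tllama\tplaceholder text\tFALSE\n"
    "4\t1\tmixtral\tplaceholder text\tTRUE\n"
)
# QUOTE_NONE keeps any quote characters inside the texts as literal characters.
reader = csv.DictReader(
    io.StringIO(sample_tsv), delimiter="\t", quoting=csv.QUOTE_NONE
)
accuracy = llm_guess_accuracy(reader)
for author, share in sorted(accuracy.items()):
    print(f"{author}: {share:.2f}")
```

To run this on the released file, the `io.StringIO(sample_tsv)` argument would be replaced with a handle such as `open("human-vs-llm.tsv", newline="", encoding="utf-8")`; the encoding is an assumption.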
- 3,000 examples (1,000 per model) annotated according to the other aspects we analyzed in the paper. These annotations are found in the files `annotations-llama2-chat-7b.tsv`, `annotations-mistral-7b.tsv`, and `annotations-mixtral-8x7b.tsv`. These files each contain 1,000 lines with 14 columns:
  - `comment_id`: the original comment id that can be used to retrieve the original text from the MHS corpus.
  - `label_x`: the hate speech label we calculated after the aggregation process for the text corresponding to `comment_id`, based on the annotations of the MHS corpus (`0` means no hate speech, `1` means hate speech).
  - `synth_text`: a string containing the LLM-generated paraphrase of the original text.
  - `prompt_failure`: whether the annotators deemed that the model was not able to correctly fulfill the instructions, and if so, the type of failure. Can be `FALSE`, `Prompt failure`, or `Description of original gold`.
  - `hate_speech`: whether our annotators found the synthetic text to contain hate speech or not. Can be `No` (no hate), `Yes` (hateful), or `Unclear`.
  - `grammar_ok`: whether the synthetic text was deemed grammatically correct/realistic or not. Can be `Yes` or `No`.
  - `world_knowledge_correct`: whether the world knowledge present in the synthetic text is ok/realistic. Can be `Yes` or `No`.
  - target information: for each target category t in [origin, race, religion, gender, sexuality, age, and disability], there is a column `target_`[t], which is `FALSE` if that target is not present in the synthetic text, or a string detailing which target type it is if there is one under that category. Where possible, we use the same targets as the original MHS corpus for all categories except origin: since origin targets can get extremely sparse, we only use the `TRUE` value for that category if relevant.
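As an illustration of how the seven `target_[t]` columns can be read, the sketch below counts how many synthetic texts mention each target category, treating any value other than `FALSE` as "target present". It assumes a header row with the column names described above; the function name `target_counts`, the constant `TARGET_CATEGORIES`, and the sample values (including the target strings) are hypothetical placeholders, not values taken from the release.

```python
import csv
import io
from collections import Counter

# The seven target categories listed in the column description.
TARGET_CATEGORIES = [
    "origin", "race", "religion", "gender", "sexuality", "age", "disability",
]


def target_counts(rows):
    """Count, per category, the rows whose target_[t] column is not FALSE."""
    counts = Counter()
    for row in rows:
        for category in TARGET_CATEGORIES:
            value = row[f"target_{category}"].strip()
            # Anything other than FALSE (TRUE for origin, a target string
            # for the other categories) counts as "target present".
            if value.upper() != "FALSE":
                counts[category] += 1
    return counts


# Build a made-up two-row file with the 14 documented columns.
header = [
    "comment_id", "label_x", "synth_text", "prompt_failure",
    "hate_speech", "grammar_ok", "world_knowledge_correct",
] + [f"target_{category}" for category in TARGET_CATEGORIES]
sample_rows = [
    ["1", "0", "placeholder", "FALSE", "No", "Yes", "Yes",
     "FALSE", "FALSE", "some_religion_target", "FALSE", "FALSE", "FALSE", "FALSE"],
    ["2", "1", "placeholder", "FALSE", "Yes", "Yes", "No",
     "TRUE", "FALSE", "FALSE", "some_gender_target", "FALSE", "FALSE", "FALSE"],
]
sample_tsv = "\n".join("\t".join(fields) for fields in [header] + sample_rows) + "\n"

reader = csv.DictReader(
    io.StringIO(sample_tsv), delimiter="\t", quoting=csv.QUOTE_NONE
)
counts = target_counts(reader)
for category in TARGET_CATEGORIES:
    print(f"{category}: {counts[category]}")
```

The same loop could first skip rows whose `prompt_failure` value is not `FALSE` if only successful paraphrases are of interest; whether that filter is appropriate depends on the analysis.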
Please note: for easier parsing and visualization of the files, we have changed all double inverted commas into single inverted commas in the texts.
The paper is set to appear in the proceedings of the EMNLP 2024 conference. It can be cited as:
Camilla Casula, Sebastiano Vecellio Salto, Alan Ramponi, and Sara Tonelli. 2024. Delving into Qualitative Implications of Synthetic Data for Hate Speech Detection. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. To appear. Association for Computational Linguistics.
@inproceedings{casula-etal-2024-delving,
title = "Delving into Qualitative Implications of Synthetic Data for Hate Speech Detection",
author = "Casula, Camilla and
Vecellio Salto, Sebastiano and
Ramponi, Alan and
Tonelli, Sara",
booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
year = "2024",
publisher = "Association for Computational Linguistics"
}