This is the official code repository for Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models
- Main experiment

```shell
conda create -n Silent_Guardian python=3.9
conda activate Silent_Guardian
pip install -r requirements.txt
```
- Test result (PyTorch and TensorFlow may conflict, so you can set up a separate environment for testing.)

```shell
conda create -n Silent_Guardian_Test python=3.9
conda activate Silent_Guardian_Test
pip install -r requirements_test.txt
```
- The Vicuna dataset is located in `dataset/target.json`.
- The Novel dataset is located in `dataset/novel.json`.
- The five rewrite-related prefixes used in the paper are located in `dataset/instructions.json`.
- The 100 rewrite-related prefixes generated by ChatGPT-3.5 are located in `dataset/rewrite_instruction.json`.
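For reference, the dataset files can be read with Python's standard `json` module. This is a minimal sketch; it only assumes the files are valid JSON and does not depend on their internal schema:

```python
import json

def load_dataset(path):
    """Load a JSON dataset file from the repository, e.g. dataset/target.json."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

For example, `targets = load_dataset("dataset/target.json")` returns the decoded contents of the Vicuna dataset file.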
```shell
python create.py --STP "STP" --path "vicuna" --bert_path "bert" --agg_path "llama" --target_file "target.json" --instructions_file "instructions.json" --epoch 15 --batch_size 128 --topk 5 --topk_semanteme 10
```
- `--STP`: One of four STP modes: "STP", "STP_bert", "STP_agg", or "STP_instructions". "STP" is the standard STP algorithm; "STP_bert" uses BERT to select synonymous tokens; "STP_agg" aggregates the loss functions of two models to construct the TPE; "STP_instructions" constructs the STP from prepared prompts on the same theme.
- `--path`: The path to the target model.
- `--bert_path`: The path to the BERT model.
- `--agg_path`: The path to the second model.
- `--target_file`: The text file for which the STP is to be constructed.
- `--instructions_file`: Prompts on the same theme.
- `--epoch`: The number of STP iterations.
- `--batch_size`: The number of candidates constructed in a single iteration.
- `--topk`: The size of the final replacement set.
- `--topk_semanteme`: The size of the synonym set.
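To make `--topk` and `--topk_semanteme` concrete, here is a minimal sketch of the candidate-selection idea for a single token position: restrict to the `topk_semanteme` tokens closest to the original token in embedding space, then keep the `topk` best-scoring replacements. This is an illustration, not the repository's implementation; the function names, the cosine-similarity synonym filter, and the scoring array are all assumptions made for the example.

```python
import numpy as np

def select_replacements(embeddings, token_id, scores, topk_semanteme=10, topk=5):
    """Illustrative candidate selection for one token position.

    embeddings: (vocab, dim) token-embedding matrix
    token_id:   index of the token being replaced
    scores:     (vocab,) per-candidate objective values (higher = better)
    """
    # 1. Keep only the topk_semanteme tokens closest to the original token
    #    in embedding space (cosine similarity), i.e. the synonym set.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = emb @ emb[token_id]
    sims[token_id] = -np.inf                       # exclude the token itself
    synonyms = np.argsort(sims)[-topk_semanteme:]
    # 2. Among those synonyms, keep the topk with the best objective score,
    #    i.e. the final replacement set.
    return synonyms[np.argsort(scores[synonyms])[-topk:]]
```

With the defaults above, each position contributes at most 5 replacement candidates drawn from its 10 nearest synonyms.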
```shell
python test_result.py --encoder_path "universal encoder" --target_path "result_of_target.json"
```
You can use `test_result.py` to calculate the Character Replacement Ratio and Semantic Preservation.

- `--encoder_path`: The path to the universal sentence encoder.
- `--target_path`: The result file produced for the target texts.
For the universal sentence encoder, refer to the official Universal Sentence Encoder code repository.
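As a rough illustration of the two metrics, the sketch below computes a character-level replacement ratio and a cosine-similarity score between two sentence embeddings (such as those produced by the Universal Sentence Encoder). The exact metric definitions used by `test_result.py` are given in the paper; the definitions here are simplified assumptions for illustration only.

```python
import numpy as np

def char_replacement_ratio(original, protected):
    """Fraction of character positions that differ between the two texts
    (a simplified, illustrative definition)."""
    n = max(len(original), len(protected))
    diffs = sum(a != b for a, b in zip(original, protected))
    diffs += abs(len(original) - len(protected))   # count the length mismatch
    return diffs / n

def semantic_preservation(emb_a, emb_b):
    """Cosine similarity between two sentence embeddings, e.g. from the
    Universal Sentence Encoder."""
    a, b = np.asarray(emb_a, dtype=float), np.asarray(emb_b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A low replacement ratio together with a semantic-preservation score near 1.0 indicates the protected text stays close to the original in both form and meaning.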