This repository contains the code for our senior project "Watermarking Machine-Generated Text".
Final Notebook |
---|
Notebook |
Description | Notebooks | Datasets |
---|---|---|
1. Create a dataset of Unwatermarked Text | Notebook | Dataset |
2. Test Watermarking and Detection Algorithms | Logits Deviation with Green-Red List Sampling With Randomized Numbers using TinyLlama Logits Deviation with Randomized Numbers using OPT-350M | N/A |
3. Automate Watermarking Using the Sunbird Dataset | Part 1, Part 2, Part 3, Merge | Part 1, Part 2, Part 3, Merged |
4. Merge Unwatermarked and Watermarked Datasets | Merge, Truncate | Merged, Truncated |
5. Run the Detection Algorithm on the first 1200 rows of the Truncated Dataset | Part1, Part2, Part3, Merge | Part1, Part2, Part3, Merged |
6. Evaluate the Accuracy, Precision, Recall and F1-score of the Detection Algorithm | Notebook | N/A |
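
The evaluation step above reports accuracy, precision, recall and F1-score. As a minimal sketch of how these four metrics relate to each other, the function below computes them from binary labels. It assumes `1` means "watermarked" and `0` means "unwatermarked"; the function name and the sample labels are hypothetical and are not taken from the project notebooks.

```python
def detection_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = watermarked)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # watermarked, detected
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # unwatermarked, not flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # unwatermarked, wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # watermarked, missed

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only; these are not results from the project.
labels = [1, 1, 1, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 1]
metrics = detection_metrics(labels, predictions)
print(metrics)
```
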
Description | Notebooks | Datasets |
---|---|---|
1. Experiment with paraphrasing using Dipper | Notebook | N/A |
2. Experiment with paraphrasing using T5 on the Watermarked Dataset | Notebook | Dataset |
3. Run the Detection Algorithm on the Paraphrased Dataset | Part1, Part2, Part3, Merge | Part1, Part2, Part3, Merged |
4. Evaluate the Accuracy, Precision, Recall and F1-score of the Detection Algorithm after Paraphrasing | Notebook | N/A |
5. Experiment with round-trip (RT) paraphrasing (English to French to English) on the Watermarked Dataset | Notebook | Dataset |
6. Run the Detection Algorithm on the Paraphrased Dataset | Part1, Part2, Part3, Merge | Part1, Part2, Part3, Merged |
7. Evaluate the Accuracy, Precision, Recall and F1-score of the Detection Algorithm after RT Paraphrasing | Notebook | N/A |
Description | Notebooks | Datasets |
---|---|---|
1. Simulate Homoglyph Attack on the Watermarked Dataset | Notebook | Dataset |
2. Run the Detection Algorithm on the Homoglyph Text | Part1, Part2, Part3, Merge | Part1, Part2, Part3, Merged |
3. Evaluate the Accuracy, Precision, Recall and F1-score of the Detection Algorithm after applying Homoglyph Attack | Notebook | N/A |
4. Counteract the Homoglyph Attack | Notebook | Dataset |
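
A homoglyph attack replaces characters with visually similar ones from another script, so the text looks unchanged to a reader but is tokenized differently. The sketch below illustrates the idea and a simple reversal. The mapping is a small hypothetical one (five Latin letters mapped to Cyrillic lookalikes), and the function names are assumptions; the project notebooks may use a different table or approach.

```python
# Hypothetical mapping: Latin letter -> visually similar Cyrillic letter.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}
# Reverse table used to counteract the attack.
REVERSE = {fake: real for real, fake in HOMOGLYPHS.items()}


def apply_homoglyphs(text):
    """Replace every mapped Latin letter with its lookalike."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def undo_homoglyphs(text):
    """Map known lookalikes back to the Latin letters they imitate."""
    return "".join(REVERSE.get(ch, ch) for ch in text)


attacked = apply_homoglyphs("peace")
print(attacked == "peace")            # False: code points differ
print(undo_homoglyphs(attacked))      # peace
```

Note that this reversal would also rewrite genuine Cyrillic text, so it is only reasonable when the input is expected to be in a Latin-script language.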
Description | Notebooks | Datasets |
---|---|---|
1. Simulate Zero-width attack on watermarked text | Notebook | Dataset |
2. Run the Detection Algorithm on the first 100 samples from the Zero-width Attacked Dataset | Part 1, Part 2, Part 3, Merge | Part 1, Part 2, Part 3, Merged |
3. Evaluate the Accuracy, Precision, Recall and F1-score of the Detection Algorithm after Zero-Width Attack without counteracting | Notebook | N/A |
4. Counteract the Zero-Width Attack | Notebook | Dataset |
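
A zero-width attack inserts invisible characters into the text, and a straightforward countermeasure is to strip them before detection. The following is a minimal sketch under stated assumptions: the set of characters, the insertion interval and the function names are all hypothetical and may differ from what the notebooks do.

```python
# Assumed set of invisible characters: zero-width space, non-joiner,
# joiner, word joiner, and zero-width no-break space (BOM).
ZERO_WIDTH = "\u200b\u200c\u200d\u2060\ufeff"


def insert_zero_width(text, every=3, mark="\u200b"):
    """Insert an invisible mark after every `every`-th character."""
    out = []
    for i, ch in enumerate(text, start=1):
        out.append(ch)
        if i % every == 0:
            out.append(mark)
    return "".join(out)


def strip_zero_width(text):
    """Remove all characters listed in ZERO_WIDTH."""
    return text.translate({ord(c): None for c in ZERO_WIDTH})


attacked = insert_zero_width("watermark")
print(len("watermark"), len(attacked))  # 9 12
print(strip_zero_width(attacked))       # watermark
```

Stripping U+200C and U+200D unconditionally can alter text in scripts that use them legitimately, as well as some emoji sequences, so this is a trade-off rather than a lossless fix.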
Description | Notebooks | Datasets |
---|---|---|
1. Simulate Bidirectional Reordering attack on watermarked text | Notebook | Dataset |
2. Run the Detection Algorithm on the first few samples from the Bidirectional Reordering Attack Dataset | Part 1 | Part 1 |
3. Use an RTL language detector to evaluate whether bidi characters are unnecessary | Notebook | Dataset |
Note: The proposed countermeasure is to detect whether the text is in a right-to-left (RTL) language and then check it for bidi reordering characters. If the text is in a left-to-right (LTR) language and contains bidi characters, it is flagged as manipulated.
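
The rule in the note can be sketched as follows. This is an approximation, not the project's detector: instead of a language-identification model, it assumes that the presence of any strongly RTL character (Unicode bidi class `R` or `AL`) indicates an RTL language. The control-character set and the function name are likewise assumptions.

```python
import unicodedata

# Explicit bidi formatting characters (embeddings, overrides, isolates, marks).
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
    "\u200e", "\u200f", "\u061c",                      # LRM, RLM, ALM
}


def is_flagged(text):
    """Flag text that uses bidi controls but contains no RTL letters."""
    has_controls = any(ch in BIDI_CONTROLS for ch in text)
    # Exclude the controls themselves: RLM and ALM carry RTL bidi classes.
    has_rtl_letters = any(
        ch not in BIDI_CONTROLS
        and unicodedata.bidirectional(ch) in ("R", "AL")
        for ch in text
    )
    return has_controls and not has_rtl_letters


print(is_flagged("abc\u202edef\u202c"))                  # True
print(is_flagged("plain text"))                          # False
print(is_flagged("\u0645\u0631\u062d\u0628\u0627\u200f"))  # False
```
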
Description | Notebooks | Datasets |
---|---|---|
1. Simulate Spelling Mistakes Attack on watermarked text | Notebook | Dataset |
2. Use a pretrained spellcheck model on the Misspelled Dataset | Notebook | |
3. Run detection algorithm on the dataset before and after spellcheck | Before Spellcheck, After Spellcheck | N/A |
Note: Accuracy was 83% before spellcheck and dropped to 54% after spellcheck. This may be due to the method we used to introduce spelling mistakes (we simulated typos such as omitting a letter or swapping two letters in a word), or to small hallucinations from the language model used for spellcheck.
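
The two typo types mentioned in the note (omitting a letter, swapping two letters) can be simulated roughly as below. This is a sketch, not the notebook code: the function names, the 50/50 split between the two typo types, the `rate` parameter and the fixed seed are all assumptions made for illustration.

```python
import random


def add_typo(word, rng):
    """Either drop one letter or swap two letters of the word."""
    if len(word) < 2:
        return word  # too short to modify meaningfully
    if rng.random() < 0.5:
        i = rng.randrange(len(word))
        return word[:i] + word[i + 1:]          # omit a letter
    i, j = rng.sample(range(len(word)), 2)
    chars = list(word)
    chars[i], chars[j] = chars[j], chars[i]     # swap two letters
    return "".join(chars)


def misspell(text, rate=0.3, seed=0):
    """Apply a typo to roughly `rate` of the space-separated words."""
    rng = random.Random(seed)  # seeded so the attack is reproducible
    words = text.split(" ")
    return " ".join(
        add_typo(w, rng) if rng.random() < rate else w for w in words
    )


print(misspell("the quick brown fox jumps over the lazy dog"))
```
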
Description | Notebooks | Datasets |
---|---|---|
1. Simulate Unnecessary Whitespace Attack on watermarked text and Undo it (Disadvantages: Removes all newlines) | Notebook | Dataset |
2. Remove unnecessary whitespace from dataset (watermarked and unwatermarked) | Notebook | Dataset |
3. Run detection algorithm on the modified dataset | Before, After | N/A |
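
As an illustration of the whitespace attack and its reversal, the sketch below pads every space with extra spaces and then collapses all whitespace runs back to a single space. The function names and the padding amount are hypothetical. Because the normalization splits on any whitespace, it also removes newlines, which matches the disadvantage noted in the table.

```python
def add_whitespace(text, extra="  "):
    """Pad every single space with additional spaces."""
    return text.replace(" ", " " + extra)


def normalize_whitespace(text):
    """Collapse every run of whitespace (including newlines) to one space."""
    return " ".join(text.split())


attacked = add_whitespace("hello big world")
print(repr(attacked))                          # 'hello   big   world'
print(normalize_whitespace(attacked))          # hello big world
print(normalize_whitespace("line one\n\nline two"))  # line one line two
```
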
For any inquiries, please reach out to us at
[email protected]