
NL-Augmenter 🦎 → 🐍

The NL-Augmenter is a collaborative effort intended to add transformations of datasets dealing with natural language. Transformations augment text datasets in diverse ways, including: randomizing names and numbers, changing style/syntax, paraphrasing, KB-based paraphrasing ... and whatever creative augmentation you contribute. We invite submissions of transformations to this framework by way of GitHub pull request.

The paper was accepted at NEJLT 2023 (here).

The framework organizers can be contacted at [email protected].

Colab notebook

To quickly see transformations and filters in action, run through our Colab notebook (Open In Colab).

Some Ideas for Transformations

If you need inspiration for what transformations to implement, check out #75, where some ideas and previous papers are discussed. So far, contributions have focused on morphological inflections, character level changes, and random noise. The best new pull requests will be dissimilar from these existing contributions.

Installation

Requirements

  • Python 3.7

Instructions

# When creating a new transformation, replace this with your forked repository (see below)
git clone https://github.com/GEM-benchmark/NL-Augmenter.git
cd NL-Augmenter
python setup.py sdist
pip install -e .
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz

How do I create a transformation?

Setup

First, fork the repository in GitHub! 🍴

fork button

Your fork will have its own location, which we will call PATH_TO_YOUR_FORK. Next, clone the forked repository and create a branch for your transformation, which here we will call my_awesome_transformation:

git clone $PATH_TO_YOUR_FORK
cd NL-Augmenter
git checkout -b my_awesome_transformation

We will base our transformation on an existing example. Create a new transformation directory by copying over an existing transformation. You can choose to copy from other transformation directories depending on the task you wish to create a transformation for. Check some of the existing pull requests and merged transformations first to avoid duplicating efforts or creating transformations too similar to previous ones.

cd nlaugmenter/transformations/
cp -r butter_fingers_perturbation my_awesome_transformation
cd my_awesome_transformation

Creating a transformation

  1. In the file transformation.py, rename the class ButterFingersPerturbation to MyAwesomeTransformation and choose one of the interfaces from the interfaces/ folder. See the full list of options here.
  2. Now put all your creativity into implementing the generate method (see the sketch below). If you intend to use external libraries, add them with their version numbers in requirements.txt.
  3. Update my_awesome_transformation/README.md to describe your transformation.
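For orientation, here is a minimal sketch of what such a class might look like, assuming the SentenceOperation interface (the import paths and constructor signature may differ slightly between repo versions, so check the interface file you actually chose):

import random
from typing import List

from nlaugmenter.interfaces.SentenceOperation import SentenceOperation
from nlaugmenter.tasks.TaskTypes import TaskType


class MyAwesomeTransformation(SentenceOperation):
    # Declare the tasks and languages this transformation supports.
    tasks = [TaskType.TEXT_CLASSIFICATION, TaskType.TEXT_TO_TEXT_GENERATION]
    languages = ["en"]

    def __init__(self, seed=0, max_outputs=1):
        super().__init__(seed, max_outputs=max_outputs)

    def generate(self, sentence: str) -> List[str]:
        # Toy perturbation for illustration only: shuffle the words.
        random.seed(self.seed)
        words = sentence.split()
        random.shuffle(words)
        return [" ".join(words)]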

Testing and evaluating (Optional)

Once you are done, add at least 5 example pairs as test cases in the file test.json so that no one breaks your code inadvertently.
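The file mirrors the test.json of the transformation you copied; a single entry looks roughly like this (the sentences below are purely illustrative):

{
    "type": "my_awesome_transformation",
    "test_cases": [
        {
            "class": "MyAwesomeTransformation",
            "inputs": {
                "sentence": "Andrew finally returned the book to Chris"
            },
            "outputs": [
                {
                    "sentence": "book the returned finally Andrew Chris to"
                }
            ]
        }
    ]
}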

Once the transformation is ready, test it:

pytest -s --t=my_awesome_transformation

If you would like to evaluate your transformation against a common 🤗HuggingFace model, we encourage you to check the evaluation instructions.

Code Styling

To standardize the code, we use the black code formatter, which runs as a pre-commit hook. To use the hook, install pre-commit with pip install pre-commit (it should already be installed if you followed the instructions above). Then run pre-commit install to install the hook. On future commits, the black formatter will run on all Python files you have staged for commit.

Submitting

Once the tests pass and you are happy with the transformation, submit it for review. First, commit and push your changes:

git add nlaugmenter/transformations/my_awesome_transformation/*
git commit -m "Added my_awesome_transformation"
git push --set-upstream origin my_awesome_transformation

Finally, submit a pull request. The last git push command prints a URL that can be copied into a browser to initiate such a pull request. Alternatively, you can do so from the GitHub website.

pull request button

✨ Congratulations, you've submitted a transformation to NL-Augmenter! ✨

How do I create a filter?

We also accept pull requests for creating filters, which identify interesting subpopulations of a dataset. The process for adding a new filter is the same as above. All filter implementations implement .filter instead of .generate and need to be placed in the filters folder. So, just as transformations can transform examples of text, filters can identify whether an example follows some pattern of text! The only difference is that while transformations return another example of the same input format, filters simply return True or False! For step-by-step instructions, follow these steps; a minimal sketch is shown below.
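As a rough sketch (again assuming the SentenceOperation-style interfaces; the class name, keyword, and threshold below are illustrative):

from nlaugmenter.interfaces.SentenceOperation import SentenceOperation
from nlaugmenter.tasks.TaskTypes import TaskType


class TextLengthFilter(SentenceOperation):
    # Keeps only examples whose sentence has at least `min_words` tokens.
    tasks = [TaskType.TEXT_CLASSIFICATION]
    languages = ["en"]

    def __init__(self, min_words: int = 5):
        super().__init__()
        self.min_words = min_words

    def filter(self, sentence: str) -> bool:
        return len(sentence.split()) >= self.min_words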

BIG-Bench 🪑

If you are interested in NL-Augmenter, you may also be interested in BIG-bench, the large-scale collaborative benchmark for language models.

Most Creative Implementations 🏆

After all pull requests have been merged, 3 of the most creative implementations will be selected and featured on this README page and on the NL-Augmenter webpage.

Paper

@article{dhole2021nlaugmenter,
  title={NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation},
  author={Kaustubh D. Dhole and Varun Gangal and Sebastian Gehrmann and Aadesh Gupta and Zhenhao Li and Saad Mahamood and Abinaya Mahendiran and Simon Mille and Ashish Srivastava and Samson Tan and Tongshuang Wu and Jascha Sohl-Dickstein and Jinho D. Choi and Eduard Hovy and Ondrej Dusek and Sebastian Ruder and Sajant Anand and Nagender Aneja and Rabin Banjade and Lisa Barthe and Hanna Behnke and Ian Berlot-Attwell and Connor Boyle and Caroline Brun and Marco Antonio Sobrevilla Cabezudo and Samuel Cahyawijaya and Emile Chapuis and Wanxiang Che and Mukund Choudhary and Christian Clauss and Pierre Colombo and Filip Cornell and Gautier Dagan and Mayukh Das and Tanay Dixit and Thomas Dopierre and Paul-Alexis Dray and Suchitra Dubey and Tatiana Ekeinhor and Marco Di Giovanni and Rishabh Gupta and Rishabh Gupta and Louanes Hamla and Sang Han and Fabrice Harel-Canada and Antoine Honore and Ishan Jindal and Przemyslaw K. Joniak and Denis Kleyko and Venelin Kovatchev and Kalpesh Krishna and Ashutosh Kumar and Stefan Langer and Seungjae Ryan Lee and Corey James Levinson and Hualou Liang and Kaizhao Liang and Zhexiong Liu and Andrey Lukyanenko and Vukosi Marivate and Gerard de Melo and Simon Meoni and Maxime Meyer and Afnan Mir and Nafise Sadat Moosavi and Niklas Muennighoff and Timothy Sum Hon Mun and Kenton Murray and Marcin Namysl and Maria Obedkova and Priti Oli and Nivranshu Pasricha and Jan Pfister and Richard Plant and Vinay Prabhu and Vasile Pais and Libo Qin and Shahab Raji and Pawan Kumar Rajpoot and Vikas Raunak and Roy Rinberg and Nicolas Roberts and Juan Diego Rodriguez and Claude Roux and Vasconcellos P. H. S. and Ananya B. Sai and Robin M. Schmidt and Thomas Scialom and Tshephisho Sefara and Saqib N. Shamsi and Xudong Shen and Haoyue Shi and Yiwen Shi and Anna Shvets and Nick Siegel and Damien Sileo and Jamie Simon and Chandan Singh and Roman Sitelew and Priyank Soni and Taylor Sorensen and William Soto and Aman Srivastava and KV Aditya Srivatsa and Tony Sun and Mukund Varma T and A Tabassum and Fiona Anting Tan and Ryan Teehan and Mo Tiwari and Marie Tolkiehn and Athena Wang and Zijian Wang and Gloria Wang and Zijie J. Wang and Fuxuan Wei and Bryan Wilie and Genta Indra Winata and Xinyi Wu and Witold WydmaΕ„ski and Tianbao Xie and Usama Yaseen and M. Yee and Jing Zhang and Yue Zhang},
  journal={Northern European Journal of Language Technology},
  volume={9},
  number={1},
  year={2023}
}

License

NL-Augmenter is released under the MIT License. Some transformations include components released under a different (permissive, open source) license. For license details, refer to the README.md and any license files in the transformation's or filter's directory.

nl-augmenter's People

Contributors

aadesh11, abinayam02, ashish3586, asnota, boyleconnor, bryanwilie, filco306, gentaiscool, gxywang, juand-r, jzcs2018, kaustubhdhole, kvadityasrivatsa, marco-digio, mnamysl, nickeilf, raft001, saad-mahamood, samuelcahyawijaya, sirrob1997, sotwi, tanay2001, tanfiona, timothy22000, uyaseen, vyraun, wwydmanski, xudongolivershen, zhexiongliu, zijwang


nl-augmenter's Issues

Loading of Filter Tests

I think there might be something broken with the filter tests, at least when I extended the test.json of the TextContainsKeywordsFilter to contain another test case:

{
    "type": "keywords",
    "test_cases": [
        {
            "class": "TextContainsKeywordsFilter",
            "args": {
                "keywords": ["in", "at"]
            },
            "inputs": {
                "sentence": "Andrew played cricket in India"
            },
            "outputs": true
        },
        {
            "class": "TextContainsKeywordsFilter",
            "args": {
                "keywords": ["sad"]
            },
            "inputs": {
                "sentence": "Andrew played cricket in India"
            },
            "outputs": false
        }
    ]
}

And then ran: pytest -s --f=keywords

It fails the test, although from my understanding it should still work properly. In particular, after printing self.keywords in the filter method, it seems that no new instance is created for the new test case and the old keywords are still used, which causes the second test case to fail.

Am I misusing something here? I ran into this when writing the tests for my addition of a filter.

Style paraphrasers work best in a two-stage pipeline, can re-use HuggingFace `generate(...)` APIs

Hi everyone, I'm the original author of the STRAP paraphrasers (paper link) which were recently accepted to NL-Augmenter (#227), an effort led by @Filco306. Excited to see these models in NL-Augmenter!

After discussing with @Filco306 and seeing the PR, I saw that 6 different variants of the paraphraser have been provided: a "Basic" style-agnostic paraphraser as well as five style-specific paraphrasers (link). While the "Basic" paraphraser is implemented fine, for the style-specific paraphrasers it's recommended to use a two-step pipelined process:

(1) normalize the text using the "Basic" paraphraser;
(2) pass the output from (1) through the style-specific paraphraser.

This is important since all style-specific paraphrasers were trained on the outputs of "Basic", so any other text is technically out-of-distribution. In an ablation study (-Inf PP. in Table 3 of the paper) we saw a significant drop in style transfer performance without this step. Moreover, the two-step process helps boost output diversity since the "Basic" paraphraser strips input style. This should be fairly simple to implement.
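In pseudocode, the intended flow is simply the following (the object and method names are only illustrative; the actual classes in the PR may be named differently):

# (1) strip the input style with the "Basic" paraphraser,
# (2) feed its output to the style-specific paraphraser.
neutral_text = basic_paraphraser.generate(input_text)[0]
styled_text = shakespeare_paraphraser.generate(neutral_text)[0]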

Another minor point is that the models are fully compatible with the new HuggingFace generate(...) APIs, which provide additional functionality compared to what was originally implemented in my repository (in other words, this import can be avoided). Here's an example of how to do it:

out = gpt2.generate(
    input_ids=gpt2_sentences[:, 0:init_context_size],
    max_length=gpt2_sentences.shape[1],
    return_dict_in_generate=True,
    eos_token_id=eos_token_id,
    output_scores=True,
    do_sample=top_k > 0 or top_p > 0.0,
    top_k=top_k,
    top_p=top_p,
    temperature=temperature,
    num_beams=beam_size,
    token_type_ids=segments[:, 0:init_context_size]
)

Also CCing the NL-Augmenter reviewers for the style paraphraser to keep them in the loop --- @sebastianGehrmann @Nickeilf @juand-r @kaustubhdhole

PR Filter label

There should probably be another label called "filter" so one can quickly check in the PRs which transformations/filters have already been implemented. Both of my PRs are filters and should therefore not have a transformation label.

Change batch size and number of visible devices for text-style-transfer

Hi @Filco306

Thank you for your great work to make the powerful paraphrasing model easily accessible through HuggingFace! Now it is much easier for me to work with it without the hassle of handling complicated dependencies!

But is there any way for us to use a larger batch size and more GPUs to accelerate the paraphrasing process? Right now I can use only one GPU and a small batch size. I read your implementation here but there does not seem to be an easy way to do either of them.

Thank you. I am looking forward to your reply.

Spacy upgrade to 3.0+

Hi there,
Just wondering - is there any reason spacy is locked with the old version spacy==2.2.4 in the main requirements.txt?

Spacy 3.0 was quite a big upgrade from 2.2.4, and 3.1.0 was just released today, so it might make sense to look forward and make that a requirement instead.

I don't think any current implementations would break by this upgrade but I'm happy to make a PR for it and fix things if needed.

`summarization_transformation` has unresolved reference to spaCy

When running this transformation there are several unresolved references to the spacy_nlp variable. In particular on line:

  • Line 21: self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm", disable=['ner','textcat'])

Please use from initialize import spacy_nlp to get a handle on the global spacy instance.

Cannot Run `evaluate.py` Script

I've tried running the evaluate.py script in this Colab notebook. I get the following error:

OSError: /usr/local/lib/python3.7/dist-packages/torchtext/_torchtext.so: undefined symbol: _ZNK3c104Type14isSubtypeOfExtESt10shared_ptrIS0_EPSo

`sentiment_emoji_augmenter` throws SyntaxWarning messages

When run, it throws the following warning messages:

/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/sentiment_emoji_augmenter/transformation.py:103: SyntaxWarning: "is" with a literal. Did you mean "=="?
  if sentiment is "pos":
/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/sentiment_emoji_augmenter/transformation.py:106: SyntaxWarning: "is" with a literal. Did you mean "=="?
  elif sentiment is "neg":

`correct_common_misspellings` throws FileNotFoundError and incorrectly assumes resources are relative to transformation directory

These issues appear when trying to use this transformation outside of the root NL-Augmenter directory, for example in another sub-directory off the root directory. The fixes needed are the following:

  • Remove:
spell_corrections = os.path.join(
        "transformations", "correct_common_misspellings", "spell_corrections.json"
    )
  • Use file = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'spell_corrections.json') to get a handle on the current path relative to transformation.py script file.

Standardize loading of different spacy models

Some of the transformations/filters use different spacy models (en, es, zh, de). The way these models are loaded needs to be standardized. The function initialize_models in initialize.py needs to be re-written to accept a language parameter, and the following transformations/filters should be updated.
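A rough sketch of what a language-aware loader could look like (the model names and the caching dict are illustrative, not the current initialize.py code):

import spacy

# Illustrative mapping of language codes to spaCy pipelines.
SPACY_MODEL_NAMES = {
    "en": "en_core_web_sm",
    "es": "es_core_news_sm",
    "zh": "zh_core_web_sm",
    "de": "de_core_news_sm",
}
_loaded_models = {}


def initialize_models(lang="en"):
    """Load the spaCy pipeline for `lang` once and reuse it on later calls."""
    if lang not in _loaded_models:
        _loaded_models[lang] = spacy.load(SPACY_MODEL_NAMES[lang])
    return _loaded_models[lang]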

Once the changes are done, test each module individually with pytest using the command below:

pytest -s --t=<module_name>

Transformations:

  • grapheme_to_phoneme_transformation
  • city_names_transformation
  • synonym_substitution
  • ocr_perturbation
  • change_person_named_entities
  • antonyms_substitute
  • emojify
  • sentence_reordering
  • transformer_fill
  • auxiliary_negation_removal
  • correct_common_misspellings
  • word_noise
  • yes_no_question
  • subject_object_switch
  • dyslexia_words_swap
  • close_homophones_swap
  • gender_neutral_rewrite
  • tense
  • adjectives_antonyms_switch
  • abbreviation_transformation
  • hashtagify
  • token_replacement
  • mr_value_replacement
  • urban_dict_swap
  • syntactically_diverse_paraphrase
  • yoda_transform
  • disability_transformation
  • replace_numerical_values
  • unit_converter
  • suspecting_paraphraser
  • change_date_format
  • negate_strengthen
  • gender_culture_diverse_name
  • lexical_counterfactual_generator
  • change_two_way_ne
  • gender_culture_diverse_name_two_way
  • replace_abbreviation_and_acronyms
  • replace_financial_amounts
  • slangificator
  • summarization_transformation
  • pinyin
  • gender_neopronouns
  • spanish_gender_swap
  • add_hashtags

Filters:

  • question_filter
  • length
  • polarity
  • yesno_question
  • keywords
  • soundex
  • numeric
  • code_mixing
  • speech_tag
  • quantitative_ques
  • group_inequity
  • token_amount

Is the first test case skipped?

When adjusting the tests for #146 I noticed that I almost never needed to adjust the first test case in each test.json, but all the others. It almost felt as if the first one was being skipped, since it seems unlikely that all other test cases needed slight adjustments while the first one always matched perfectly. Can someone quickly check whether everything works as intended there? It could very well be chance, but just to make sure.

`ocr_perturbation` requirements issues

The ocr_perturbation package requires trdg==1.6.0. However, under macOS 11.6 with Python 3.9 it will not install due to a dependency on pillow==7.0.0, which generates a RequiredDependencyException: zlib error.

Installing pillow==8.3.2 works fine but is too new for trdg==1.6.0.

Installing trdg==1.7.0 causes a dependency conflict with opencv-python:

ERROR: Cannot install opencv-python==4.5.3.56, trdg and trdg==1.7.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.3.56 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.2.54 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.2.52 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.5.1.48 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.4.0.46 depends on numpy>=1.19.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.4.0.42 depends on numpy>=1.17.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.4.0.40 depends on numpy>=1.17.3
    trdg 1.7.0 depends on numpy<1.17 and >=1.16.4
    opencv-python 4.3.0.38 depends on numpy>=1.17.3

Should we add a global seed for all transformations?

Almost all transformations, such as butter_fingers_perturbation or replace_numerical_values, use a seed in their constructor that is set to some value. How are we going to handle the global seed? We could easily set one in initialize.py that gets imported in each transformation and used as the default, similar to what is currently done for spacy_nlp. Otherwise, we can also set it during evaluation; as far as I can tell that is not currently done, but I think having a global default is a little cleaner.
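A minimal sketch of the idea (names are illustrative):

# initialize.py
GLOBAL_SEED = 42

# in a transformation's transformation.py
from initialize import GLOBAL_SEED


class SomeTransformation(SentenceOperation):
    def __init__(self, seed=GLOBAL_SEED, max_outputs=1):
        super().__init__(seed, max_outputs=max_outputs)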

Happy to make the required changes if that's something we'd want.

`gender_neutral_rewrite` Unresolved references to spaCy and Unresolved List reference

When running the gender_neutral_rewrite there are several unresolved references to the spacy_nlp variable. In particular on line:

  • Line 27: self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm")

Please use from initialize import spacy_nlp to get a handle on the global spacy instance.

There is also an unresolved reference on Line 495: def generate(self, sentence: str) -> List[str]. List[str] is not resolvable. Should this be lower case? e.g. list[str]

`english_inflectional_variation` throws ValueError when called

Here is the stack trace when the EnglishInflectionalVariation class is initialised:

File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/english_inflectional_variation/__init__.py", line 1, in <module>
    from .transformation import *
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/english_inflectional_variation/transformation.py", line 1, in <module>
    import random, lemminflect
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/lemminflect/__init__.py", line 49, in <module>
    spacy.tokens.Token.set_extension('inflect', method=Inflections().spacyGetInfl)
  File "spacy/tokens/token.pyx", line 47, in spacy.tokens.token.Token.set_extension
ValueError: [E090] Extension 'inflect' already exists on Token. To overwrite the existing extension, set `force=True` on `Token.set_extension`.

Typos discovered by codespell

codespell --ignore-words-list="fro,ist,oder"

./dataset.py:122: relavent ==> relevant
./dataset.py:143: hierachy ==> hierarchy
./notebooks/Write_a_sample_transformation.ipynb:1442: tht ==> the, that
./notebooks/Write_a_sample_transformation.ipynb:1718: exisiting ==> existing
./evaluation/evaluate_text_generation.py:84: upto ==> up to
./transformations/change_two_way_ne/README.md:11: implemetation ==> implementation

`insert_abbreviation` incorrectly imports python file and incorrectly assumes resources are relative to transformation directory

These issues appear when trying to use this transformation outside of the root NL-Augmenter directory, for example in another sub-directory off the root directory. The fixes needed are the following:

  • import grammaire.py using the full import path: import transformations.insert_abbreviation.grammaire as grammaire
  • Remove sys.path.append("./transformations/insert_abbreviation")
  • Use file = os.path.join(os.path.dirname(os.path.abspath(__file__)), '<file_name>.txt') to get a handle on the current path relative to transformation.py script file. This will allow easy access to the two .txt resource files.

Add CUDA argument in evaluate.py to set the "is_cuda" flag in evaluate methods to False. (for non-Nvidia GPUs to use CPU)

Hi All,

I am using macOS for my project, so I am running into an issue when trying to evaluate my transformations. As I do not have Nvidia GPUs, I would like to use the CPU when working with PyTorch; otherwise I get an "AssertionError: Torch not compiled with CUDA enabled".

Mac OS users that do not have Nvidia GPU will have to set device = -1 to not use GPU:
MacOS: "AssertionError: Torch not compiled with CUDA enabled"
allenai/allennlp#877

This seems to stem from the fact that there is currently no way to change the is_cuda flag, which is set to True by default in the evaluate() method inside evaluate_text_classification.py, to False. (There is code to set the device to 0 or -1 based on the is_cuda flag.)

I am able to run my evaluations by changing the is_cuda flag in the code. It would probably be better to make it an argument so that future users who want to use CPUs instead of GPUs can do so when running python evaluate.py -t [transformation] -task [task_type].
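Something along these lines could work (a sketch only; the flag and variable names are illustrative, not the actual evaluate.py code):

parser.add_argument(
    "--cpu",
    action="store_true",
    help="Force evaluation on CPU even if CUDA is not available.",
)
args = parser.parse_args()

# HuggingFace pipelines use device=0 for the first GPU and device=-1 for CPU.
device = -1 if args.cpu else 0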

I will be happy to make the required changes if that's something we'd want.

Thanks,
Tim

Language Detection

How can we detect which language is used for the evaluation on the fly?
We want to apply the correct transformation in "generate" on the fly according to the current language...

Thanks in advance

Error when evaluating TEXT_TO_TEXT_GENERATION

When running python evaluate.py -t ButterFingersPerturbation -task "TEXT_TO_TEXT_GENERATION" -p 1, the following error occurs:

Here is the performance of the model on the transformed set
Length of Evaluation dataset is 226
Traceback (most recent call last):
  File "evaluate.py", line 67, in <module>
    if_filter
  File "./NL-Augmenter/evaluation/evaluation_engine.py", line 41, in evaluate
    percentage_of_examples=percentage_of_examples,
  File "./NL-Augmenter/evaluation/evaluation_engine.py", line 115, in execute_model
    split=f"test[:{percentage_of_examples}%]",
  File "./NL-Augmenter/evaluation/evaluate_text_generation.py", line 44, in evaluate
    dataset, summarization_pipeline, transformation=operation
  File "./NL-Augmenter/evaluation/evaluate_text_generation.py", line 70, in transformation_performance
    pt_dataset, summarization_pipeline
  File "./NL-Augmenter/evaluation/evaluate_text_generation.py", line 81, in performance_on_dataset
    article, gold_summary = example
  File "./NL-Augmenter/dataset.py", line 301, in <genexpr>
    yield (datapoint[field] for field in self.fields)
TypeError: string indices must be integers


Data augmentation methods and filters that require the entire dataset

Hello!

First of all, thanks for the effort to build such a collaborative framework!

At the moment, the augmentation methods and filters are only provided with a single example per call. Since there are many techniques that need the whole dataset with the class information (to be conditioned on the class, to interpolate instances, etc.), I wanted to ask if there are plans to add this to this framework?

OSError in the PR Workflow Test

Hi, I just found an OS error in the PRs' workflow.

Collecting huggingface-hub<0.1.0
  Downloading huggingface_hub-0.0.8-py3-none-any.whl (34 kB)
Collecting sacremoses
  Downloading sacremoses-0.0.45-py3-none-any.whl (895 kB)
Collecting tokenizers<0.11,>=0.10.1
  Downloading tokenizers-0.10.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (3.3 MB)
Collecting filelock
  Downloading filelock-3.0.12-py3-none-any.whl (7.6 kB)
ERROR: Could not install packages due to an OSError: [Errno 2] No such file or directory: '/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/importlib_metadata-4.6.0.dist-info/METADATA'

Probably, somebody has some idea about this error that occurred in many PRs recently.

Thanks!

The default performance evaluation shows strange results

Hi all,

If one runs the evaluate.py script against our transformation (#230), the results are very strange. The performance is too good, considering the dramatic changes made by our transformation.

Here is the performance of the model aychang/roberta-base-imdb on the test[:20%] split of the imdb dataset
The accuracy on this subset which has 1000 examples = 96.0
Applying transformation:
100%|██████████| 1000/1000 [00:19<00:00, 51.83it/s]
Finished transformation! 1000 examples generated from 1000 original examples, with 1000 successfully transformed and 0 unchanged (1.0 perturb rate)
Here is the performance of the model on the transformed set
The accuracy on this subset which has 1000 examples = 100.0

On the other hand, if we use non-default models, they produce reasonable results (kudos to @sotwi):

roberta-base-SST-2: 94.0 -> 51.0
bert-base-uncased-QQP: 92.0 -> 67.0
roberta-large-mnli: 91.0 -> 43.0

I speculate that the problem in the default test could be caused by some deficiency in the model aychang/roberta-base-imdb and / or the imdb dataset. But I'm not knowledgeable enough in the inner workings of the model to identify the source of the problem.


How to reproduce the strange results:

Get the writing_system_replacement transformation from #230.

cd to the NL-Augmenter dir.

Run this:

python3 evaluate.py -t WritingSystemReplacement


Expected results:

a massive drop in accuracy, similar to the results by @sotwi on non-default models, as mentioned above.

Observed results:

a perfect accuracy of 100.0.

Informal & Untested Suggestions for Possible Transformations

Here are some informal ideas that could be used for perturbations & augmentations. @vgtomahawk is making a formal list in this branch.

Meanwhile here is an informal list for the benefit of the participants.

  1. Interchange positions of SRL AM arguments for non-overlapping AM arguments:

    • Alex left for Delhi with his wife at 5 pm. --> Alex left for Delhi at 5 pm with his wife.
    • "at 5 pm" (AM-TMP) and "with his wife" (AM-COM) can be exchanged: This is safe to do only with non-core arguments and non-overlapping arguments. Check what SRL is here.
  2. The ButterFingersPerturbation could be implemented for keyboard layouts other than English - like Devanagari (Hindi, Marathi, Nepali), Shahmukhi (Urdu, Persian), South Indian languages (Tamil, Telugu, Kannada, Malayalam) or Chinese, etc.

  3. Style transfer approaches could be interesting to look at - Changing formal to informal and vice versa. Check this model.

  • What the heck is going on? --> What is going on?
  • What you upto? --> What are you doing?
  4. Word Order Changes: Active to Passive & vice versa, Topicalisation, Extraposition, Wh-fronting (& vice versa) & others used in constituency tests.
    Scrambling (for German, Turkic languages)
    John went to the store to buy bread. --> To buy bread, John went to the store.

The above are only related to SentenceOperation. There are other transformation types too which could be looked at.

`re.sub` method error during the evaluation

Hi,
While running the evaluate method (for #246), I get an error in my re.sub method for one of the tests, most likely due to a problem with the escape characters. I could replace it with string.replace to solve the problem. However, this branch is already merged. Do you suggest creating a new branch, or leaving the corresponding eval columns empty?
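If the pattern is meant to be matched literally, escaping it may already be enough (a sketch; pattern, replacement, and text stand in for whatever the transformation uses):

import re

# re.escape() neutralizes regex metacharacters such as $, ., ( and ),
# so the pattern is treated as a literal string.
result = re.sub(re.escape(pattern), replacement, text)

# If the replacement itself contains backslashes, pass it via a function
# so it is not interpreted as a backreference template.
result = re.sub(re.escape(pattern), lambda m: replacement, text)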

`p1_noun_transformation` wptools dependency issues

The p1_noun_transformation relies on wptools as a dependency. However, wptools depends on pycurl. Unfortunately, pycurl keeps throwing the following message when used:

  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/p1_noun_transformation/__init__.py", line 1, in <module>
    from .transformation import *
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/transformations/p1_noun_transformation/transformation.py", line 9, in <module>
    import wptools
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/wptools/__init__.py", line 23, in <module>
    from . import core
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/wptools/core.py", line 14, in <module>
    from . import request
  File "/Users/saad/Documents/Research Work/GEM/NL-Augmenter/venv/lib/python3.9/site-packages/wptools/request.py", line 17, in <module>
    import pycurl
ImportError: pycurl: libcurl link-time ssl backends (secure-transport) do not include compile-time ssl backend (openssl)

Spacy behaves differently when testing one case vs testing all cases

It seems Spacy's tokenizer behaves differently when I run pytest -s --t=emojify and pytest -s --t=light --f=light.

For example, I added the following snippet in my generate() function:

print([str(t) for t in self.nlp(sentence)])

With input sentence "Apple is looking at buying U.K. startup for $132 billion."

pytest -s --t=emojify gives:

['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$', '132', 'billion', '.']

However, pytest -s --t=light --f=light gives:

['Apple', 'is', 'looking', 'at', 'buying', 'U.K.', 'startup', 'for', '$1', '32', 'billion.']

I use the following code to load spacy:

import spacy
from initialize import spacy_nlp
self.nlp = spacy_nlp if spacy_nlp else spacy.load("en_core_web_sm")

It looks very strange. Am I overlooking something?

Swap Transformations

Thank you for your great work! It's super useful!

I have a suggestion for improvement -
Some transformations work on a "swap" principle. For example, in GenderSwap, if we had "sister" in the original sentence then it would be transformed to "brother" and vice versa.
There are scenarios when it's important to know what direction the transformation went, female to male or male to female. In my case for example, I want to compare the performances of my model on female/male sentences on inference time.

I really liked the way TenseTransformation works. You need to specify in the constructor what tense (past/present/future) you want to transform to.
Maybe that could be applicable for other swap transformations?

Thanks again!

Tests do not Check that Expected and Generated Outputs have Same Number of Sentences

This issue concerns the following line in the main test script:

for pred_output, output in zip(perturbs, outputs):

The zip() builtin (which is used in the above-mentioned line to pair up expected sentences with generated sentences) clips the longer of its two inputted iterables to the length of the shorter iterable. E.g.:

>>> list(zip([1,2,3], [6,7,8,9,10]))
[(1, 6), (2, 7), (3, 8)]

This means that even if a transformation generates fewer sentences (e.g. 0) than the expected number of sentences, it will still pass and the later expected sentences will not get evaluated. This also makes it impossible to test affirmatively that a transformation does not generate any outputs for a given input.

I would recommend either asserting that the two iterables are of equal length, or replacing zip() with zip_longest().
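A sketch of either fix, using the variable names from the test script:

from itertools import zip_longest

# Option 1: fail fast when the number of generated and expected outputs differs.
assert len(perturbs) == len(outputs), (
    f"expected {len(outputs)} outputs, got {len(perturbs)}"
)

# Option 2: zip_longest pads the shorter side with None,
# so a missing or extra sentence fails the comparison below.
for pred_output, output in zip_longest(perturbs, outputs):
    assert pred_output == output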

`Formal2Casual` fails to load due to unavailable huggingface model

from nlaugmenter.transformations.formality_change.transformation import Formal2Casual
OSError: prithivida/parrot_adequacy_on_BART is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.

The model (prithivida/parrot_adequacy_on_BART) is indeed not available on huggingface anymore. Perhaps an acceptable alternative is to use prithivida/parrot_adequacy_model instead?

Spacy Loading can be done once

Many transformations load spacy multiple times and reparse the same utterance. We will need a mechanism to load spacy once and parse once, or at least cache the parse for a given string, so that when running all transformations together there is no repetition of parsing.
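One lightweight way to do this (a sketch, not the framework's current code) is to memoize the parse:

from functools import lru_cache

import spacy

_NLP = spacy.load("en_core_web_sm")  # loaded exactly once per process


@lru_cache(maxsize=4096)
def cached_parse(text: str):
    """Return the spaCy Doc for `text`, reusing it across transformations."""
    return _NLP(text)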

Standardize module names - Transformation

The module number-to-word should be changed to number_to_word.

Solution:

  • Rename the folder from number-to-word to number_to_word
  • Add an entry number_to_word in the test/mapper.py file in the appropriate dictionary (either heavy or light transformation depending on the flag heavy)
  • Once added, test the module by executing
pytest -s --t=number_to_word

`GermanGenderSwap` missing `noun_pairs.json` file and incorrectly assumes the resources are on the script path

Hi @raft001,

It seems that in addition to issue #310 there are two other issues that need addressing:

  • noun_pairs.json is missing. It is needed on line 17.
  • The script assumes that the resource *.json files will always be on the script path. Please instead do the following to resolve the path:
    file = os.path.join(os.path.dirname(os.path.abspath(__file__)), '<file_name>.json')

The resulting path can then be used as the absolute path to your resource files.
