creat's Introduction

Contextualized representation-Adversarial Training

This repo is for the ICLR 2023 paper Toward Adversarial Training on Contextualized Language Representation.

Trainer

We implement a number of adversarial training algorithms in trainer, e.g. FreeLBTrainer, SMARTTrainer.

The current version supports Hugging Face models such as BERT, RoBERTa, DeBERTa, and ALBERT.
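As a rough illustration of what these trainers have in common, the sketch below runs one embedding-level adversarial step on a toy classifier. It is not the code in trainer: `ToyClassifier`, `adversarial_step`, `adv_init_var` and `adv_lr` are hypothetical names chosen for this example, and the single normalised gradient-ascent step is a simplification of what the individual algorithms do.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyClassifier(nn.Module):
    """Hypothetical stand-in for a transformer: embed, mean-pool, classify."""

    def __init__(self, vocab_size=50, hidden=16, num_labels=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, num_labels)

    def forward(self, inputs_embeds):
        return self.head(inputs_embeds.mean(dim=1))


def adversarial_step(model, input_ids, labels, adv_init_var=1e-2, adv_lr=1e-1):
    """Take one ascent step on an embedding perturbation, return the loss on it."""
    inputs_embeds = model.embed(input_ids)
    delta = (torch.randn_like(inputs_embeds) * adv_init_var).requires_grad_()

    # Inner step: move delta in the direction that increases the loss.
    loss = F.cross_entropy(model(inputs_embeds.detach() + delta), labels)
    (grad,) = torch.autograd.grad(loss, delta)
    norm = grad.flatten(1).norm(dim=1).clamp_min(1e-8).view(-1, 1, 1)
    delta = (delta + adv_lr * grad / norm).detach()

    # Outer step: the loss on the perturbed embeddings drives the model update.
    return F.cross_entropy(model(inputs_embeds + delta), labels)


torch.manual_seed(0)
model = ToyClassifier()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
input_ids = torch.randint(0, 50, (4, 8))
labels = torch.tensor([0, 1, 0, 1])

adv_loss = adversarial_step(model, input_ids, labels)
adv_loss.backward()
optimizer.step()
optimizer.zero_grad()
```

The trainers listed below wrap this kind of inner/outer loop behind a single `trainer.step(train_dataloader)` call.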

CreAT:

from tqdm import trange
from trainer.creat import CreATTrainer

trainer = CreATTrainer(model, optimizer, scheduler, max_train_steps=10000, fp16=True)

for epoch in trange(3):
  train_loss, train_step = trainer.step(train_dataloader)
  global_step = trainer.global_step

SMART:

from tqdm import trange
from trainer.smart import SMARTTrainer

trainer = SMARTTrainer(model, optimizer, scheduler, max_train_steps=10000, fp16=True)

for epoch in trange(3):
  train_loss, train_step = trainer.step(train_dataloader)
  global_step = trainer.global_step

R3F:

from tqdm import trange
from trainer.r3f import R3FTrainer

trainer = R3FTrainer(model, optimizer, scheduler, max_train_steps=10000, fp16=True)

for epoch in trange(3):
  train_loss, train_step = trainer.step(train_dataloader)
  global_step = trainer.global_step

FreeLB:

from tqdm import trange
from trainer.freelb import FreeLBTrainer

trainer = FreeLBTrainer(model, optimizer, scheduler, max_train_steps=10000, fp16=True)

for epoch in trange(3):
  train_loss, train_step = trainer.step(train_dataloader)
  global_step = trainer.global_step

Standard training:

from tqdm import trange
from trainer.base import Trainer

trainer = Trainer(model, optimizer, scheduler, max_train_steps=10000, fp16=True)

for epoch in trange(3):
  train_loss, train_step = trainer.step(train_dataloader)
  global_step = trainer.global_step

creat's Issues

How to use it with an encoder-decoder model?

Here is my code:
logits = modeling_outputs.logits
input_ids = self.batch["input_ids"]
attention_mask = self.batch["attention_mask"]
decoder_attention_mask = self.batch["decoder_attention_mask"]
inputs_embeds = self.encoder.embed_tokens(input_ids)
extended_input_mask = decoder_attention_mask.view(-1, decoder_attention_mask.size(-1)).unsqueeze(-1)
ctxr = modeling_outputs.decoder_hidden_states[-1] * extended_input_mask
delta = torch.randn_like(inputs_embeds, requires_grad=True) * self.adv_init_var

for adv_step in range(self.adv_step):
    inputs_embeds = inputs_embeds + delta
    with autocast(enabled=self.config.fp16):
        batch = {
            "inputs_embeds": inputs_embeds,
            "attention_mask": attention_mask,
            "input_ids": None,
            "labels": self.batch["labels"],
            "output_hidden_states": True,
        }

        adv_modeling_outputs = self.model(**batch)
        loss_ptb = adv_modeling_outputs.loss
        logits_ptb = adv_modeling_outputs.logits
        ctxr_ptb = adv_modeling_outputs.decoder_hidden_states[-1] * extended_input_mask

    if adv_step == self.adv_step - 1:
        break

    loss_ptb = loss_ptb - cos_loss(ctxr_ptb, ctxr.detach()) * self.adv_temp
    delta = self._inner_update(delta, loss_ptb)
    delta = delta.requires_grad_()
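The snippet above calls `self._inner_update` without showing it, and it reassigns `inputs_embeds` inside the loop, so each iteration adds `delta` on top of the previous sum. A minimal sketch of one way to structure this, assuming a normalised gradient-ascent step followed by projection onto an L2 ball: `inner_update`, `adv_lr` and `adv_max_norm` are hypothetical names, the toy linear score stands in for the real model, and none of this is taken from the repository.

```python
import torch


def inner_update(delta, loss, adv_lr=1e-1, adv_max_norm=1e-1):
    """Hypothetical stand-in for `_inner_update`: ascend, then project.

    Takes one normalised gradient-ascent step on `delta` and rescales any
    per-example perturbation whose L2 norm exceeds `adv_max_norm`.
    """
    (grad,) = torch.autograd.grad(loss, delta)
    grad_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-8).view(-1, 1, 1)
    delta = delta.detach() + adv_lr * grad / grad_norm

    delta_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-8).view(-1, 1, 1)
    scale = (adv_max_norm / delta_norm).clamp(max=1.0)
    return (delta * scale).detach()


torch.manual_seed(0)
clean_embeds = torch.randn(2, 5, 8)   # stands in for embed_tokens(input_ids)
weight = torch.randn(8)               # toy "model": a fixed linear score
delta = (torch.randn_like(clean_embeds) * 1e-2).requires_grad_()

for adv_step in range(3):
    # Add delta to the clean embeddings each time; the sum is not carried over.
    perturbed = clean_embeds + delta
    loss = (perturbed @ weight).pow(2).mean()
    delta = inner_update(delta, loss).requires_grad_()
```

Keeping `clean_embeds` fixed and rebuilding `perturbed` on every iteration means the perturbation seen by the model is always the current `delta`, bounded by `adv_max_norm`.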
