
hil-mt's Introduction

Human-in-the-loop Machine Translation with Large Language Model (MT Summit 2023)

Paper

1. Overview

In this study, we propose a human-in-the-loop pipeline that guides LLMs to produce customized outputs with revision instructions. The pipeline begins by prompting the LLM to produce a draft translation, then uses automatic retrieval or human feedback as supervision signals to improve the LLM’s translation through in-context learning. The human-machine interactions generated in this pipeline are also stored in an external database to expand the in-context retrieval database, enabling us to leverage human supervision in an offline setting. We evaluate the proposed pipeline using the GPT-3.5-turbo API on five domain-specific benchmarks for German-English translation. The results demonstrate the effectiveness of the pipeline in tailoring in-domain translations and improving translation performance compared to direct translation instructions. This work was featured in MT Summit 2023.
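The loop described above (draft, feedback, refinement, storage) can be illustrated with a minimal sketch. It is not the repository's implementation: the `llm` callable, the prompt wording, the `Interaction` record, and the exact-match `lookup` are all assumptions made for illustration, and a real setup would call an actual LLM API and use similarity-based retrieval.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Interaction:
    """One stored human-machine interaction (hypothetical record layout)."""
    source: str
    draft: str
    feedback: str
    revised: str


@dataclass
class HilPipeline:
    # Hypothetical interface: any function mapping a prompt to a completion.
    llm: Callable[[str], str]
    store: List[Interaction] = field(default_factory=list)

    def lookup(self, source: str) -> Optional[str]:
        # Simplification: exact-match lookup of the most recent feedback.
        for item in reversed(self.store):
            if item.source == source:
                return item.feedback
        return None

    def translate(self, source: str, human_feedback: Optional[str] = None) -> str:
        # Step 1: draft translation from a direct instruction.
        draft = self.llm(f"Translate German to English: {source}")
        # Step 2: prefer live human feedback, else fall back to stored feedback.
        feedback = human_feedback or self.lookup(source)
        if feedback is None:
            return draft
        # Step 3: refine the draft with the feedback placed in the prompt.
        revised = self.llm(
            f"Source: {source}\nDraft: {draft}\n"
            f"Feedback: {feedback}\nRevised translation:"
        )
        # Step 4: store new human interactions for later offline reuse.
        if human_feedback:
            self.store.append(Interaction(source, draft, feedback, revised))
        return revised
```

Passing the model as a plain callable keeps the sketch testable without network access; swapping in a real client only requires a function that takes a prompt string and returns text.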

2. Feedback Collection

  • Initial translation results are obtained via translation_base.py.
  • TER-based feedback is generated with the patched sacrebleu (sacrebleu_patch/sacrebleu.py): sacrebleu [ref] -m ter --ter-trace-file op.json < [hypo]
  • In-context demonstrations are retrieved by running retrieval.py [DATA_STORE_PATH] [TEST_SET_PATH].
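To make the two feedback sources above more concrete, here is a rough sketch using only the Python standard library. It does not reproduce the patched sacrebleu trace or retrieval.py: `edit_feedback` uses `difflib` word-level edit operations, which are not true TER because block shifts are not modeled, and `retrieve_demos` ranks datastore entries by a simple word-overlap ratio. Both function names and the datastore layout (a list of source/target pairs) are assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def edit_feedback(hypothesis: str, reference: str) -> List[str]:
    """Turn word-level edits from hypothesis to reference into instructions.

    Approximation only: difflib reports replace/delete/insert operations,
    whereas TER additionally allows shifting blocks of words.
    """
    hyp, ref = hypothesis.split(), reference.split()
    matcher = SequenceMatcher(a=hyp, b=ref, autojunk=False)
    notes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old, new = " ".join(hyp[i1:i2]), " ".join(ref[j1:j2])
        if tag == "replace":
            notes.append(f"replace '{old}' with '{new}'")
        elif tag == "delete":
            notes.append(f"delete '{old}'")
        elif tag == "insert":
            notes.append(f"insert '{new}'")
    return notes


def retrieve_demos(
    query: str, datastore: List[Tuple[str, str]], k: int = 2
) -> List[Tuple[str, str]]:
    """Return the k (source, target) pairs whose source is closest to query."""
    def similarity(pair: Tuple[str, str]) -> float:
        return SequenceMatcher(
            a=query.split(), b=pair[0].split(), autojunk=False
        ).ratio()

    return sorted(datastore, key=similarity, reverse=True)[:k]
```

The edit instructions can be passed to the model as revision feedback, while the retrieved pairs serve as in-context demonstrations when no human feedback is available.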

3. In-context Refinement Translation Pipeline

  • Stage 1: run translation_hil.py.
  • Stage 2: run compare_hil.py.

*All experimental results are stored in data.
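As an illustration of what an in-context refinement prompt might look like, the sketch below assembles demonstrations and optional feedback into a single prompt string. The prompt wording, the `build_refinement_prompt` name, and the demonstration keys (`source`, `draft`, `feedback`, `revised`) are hypothetical and are not taken from translation_hil.py.

```python
from typing import Dict, List, Optional


def build_refinement_prompt(
    source: str,
    draft: str,
    demos: List[Dict[str, str]],
    feedback: Optional[str] = None,
) -> str:
    """Assemble a few-shot prompt asking the model to revise a draft."""
    lines = ["Revise the draft English translation of the German source."]
    # Each demonstration shows a past draft, its feedback, and the revision.
    for demo in demos:
        lines += [
            f"Source: {demo['source']}",
            f"Draft: {demo['draft']}",
            f"Feedback: {demo['feedback']}",
            f"Revised: {demo['revised']}",
            "",
        ]
    # The test instance comes last, leaving the revision for the model.
    lines += [f"Source: {source}", f"Draft: {draft}"]
    if feedback:
        lines.append(f"Feedback: {feedback}")
    lines.append("Revised:")
    return "\n".join(lines)
```

Ending the prompt with an open "Revised:" slot mirrors the demonstrations, so the model's completion can be read directly as the refined translation.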

Citation

@inproceedings{yang-etal-2023-hilmt,
    title = "Human-in-the-loop Machine Translation with Large Language Model",
    author = "Yang, Xinyi  and
      Zhan, Runzhe  and
      Wong, Derek F.  and
      Wu, Junchao  and
      Chao, Lidia S.",
    booktitle = "Proceedings of Machine Translation Summit XIX Vol. 2: Users Track",
    month = sep,
    year = "2023",
    address = "Macau SAR, China",
    publisher = "Machine Translation Summit",
    url = "https://files.sciconf.cn/upload/file/20230827/20230827195133_32318.pdf",
    pages = "88--98",
}
