CNN/DM about control-length (closed)

YoonaX commented on May 27, 2024
CNN/DM

from control-length.

Comments (6)

takase commented on May 27, 2024

We haven't applied this output length control model to the CNN/DM dataset because, in my understanding, the dataset doesn't provide a desired length for each summary.

YoonaX commented on May 27, 2024

What exactly does it mean that each summary has the desired length?
My understanding is that the DUC 04 data set has 4 references, right?
Also, can you provide a download link to the Gigaword test set?
The link you provided earlier contains only the training and validation sets.
Thank you very much.

takase commented on May 27, 2024

Yes, the DUC 2004 task 1 dataset contains 4 manual summaries.
The organizers of DUC 2004 instructed people to write summaries shorter than 75 characters.
Thus, we evaluate on the DUC 2004 task 1 dataset with summaries truncated to 75 characters.
Please read the descriptions of the datasets and the previous studies.
https://duc.nist.gov/duc2004/
https://www.aclweb.org/anthology/D15-1044.pdf

For the Gigaword dataset, I can't distribute it due to the LDC license, as described in the previous comments.
But, in my understanding, the linked directory contains the test data used in https://www.aclweb.org/anthology/D15-1044.pdf. The test set contains 1951 lines.

YoonaX commented on May 27, 2024

That's a good answer. I have another question for you.
Are the outputs generated for different target lengths each scored against all of the reference summaries?
Or is the output of length 10 scored with ROUGE against the summary of length 10, and the output of length 20 against the summary of length 20?

For example:
There are two source sentences and two reference summaries, whose lengths are 10 and 20, respectively. I set the target length to 10. Is the output for the first sentence scored only against the first summary, the one of length 10? Or are the two outputs scored separately, each against its corresponding summary?

takase commented on May 27, 2024

I take your question to be about the case where we have multiple reference summaries.
The ROUGE score uses all reference summaries, but if you set a max length in the ROUGE script, it truncates the characters beyond the max length from both the system and the reference summaries.
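
To make the truncation behaviour concrete, here is a minimal Python sketch. It is not the official ROUGE script: the function names are hypothetical, it computes only a simplified unigram recall on whitespace tokens, and averaging over references is an assumption made for illustration. It only shows that the same character limit is applied to the system summary and to every reference before scoring.

```python
from collections import Counter


def truncate(text: str, max_chars: int) -> str:
    """Keep only the first max_chars characters (may cut a word in half)."""
    return text[:max_chars]


def unigram_recall(system: str, reference: str) -> float:
    """Simplified ROUGE-1 recall: overlapping unigrams / reference unigrams."""
    sys_counts = Counter(system.split())
    ref_counts = Counter(reference.split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(n, sys_counts[tok]) for tok, n in ref_counts.items())
    return overlap / total


def truncated_multi_ref_recall(system: str, references: list[str], max_chars: int) -> float:
    """Truncate system and all references, then average recall over references.

    Averaging is an assumption of this sketch, not a claim about the ROUGE script.
    """
    if not references:
        raise ValueError("at least one reference summary is required")
    sys_cut = truncate(system, max_chars)
    scores = [unigram_recall(sys_cut, truncate(ref, max_chars)) for ref in references]
    return sum(scores) / len(scores)
```

Calling `truncated_multi_ref_recall(output, four_references, 75)` would mirror the DUC 2004 setting described earlier in this thread, where every summary is cut at 75 characters before comparison.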

YoonaX commented on May 27, 2024

Oh, I see. Thank you very much for your patience~
