bert-transformer-for-summarization's People

Contributors

fangpings

bert-transformer-for-summarization's Issues

The generated summaries are all [pad] characters?

tensor([[ 101, 6825, 5330, 124, 1921, 4638, 7433, 7434, 1921, 3698, 1400, 8024,
8111, 3299, 8122, 3189, 8024, 7270, 3217, 2356, 4638, 7360, 7434, 5303,
754, 1121, 2483, 511, 6381, 5442, 1355, 4385, 8024, 7481, 2190, 1331,
1331, 4638, 4916, 7434, 8024, 6387, 1914, 7270, 3217, 2356, 3696, 2458,
1993, 1831, 7434, 782, 1357, 727, 511, 671, 3198, 7313, 8024, 1762,
7270, 3217, 2356, 1277, 1139, 4385, 1392, 2466, 1392, 3416, 4638, 7434,
782, 511, 1745, 711, 2356, 3696, 1762, 100, 7881, 1356, 100, 7434,
782, 1184, 1920, 6663, 3736, 1298, 8969, 1469, 1762, 100, 7987, 4344,
100, 1184, 1394, 2512, 511, 102, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
pred: [[[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.62939453125e-06], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.62939453125e-06, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]]
Summ: [PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD][PAD]
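
For reference when reading the output above: the second tensor appears to be an attention mask over the first, with 1 at every real token and 0 at every padding position (ID 0). A minimal sketch of that relationship in plain Python, assuming padding ID 0 as the trailing zeros in the printed tensor suggest; the helper name attention_mask and the shortened ID list are illustrative and not taken from the repository:

```python
PAD_ID = 0  # assumption: padding ID is 0, as the trailing zeros in the printed tensor suggest


def attention_mask(token_ids, pad_id=PAD_ID):
    """Return 1 for real tokens and 0 for padding positions."""
    return [0 if tok == pad_id else 1 for tok in token_ids]


# Shortened, illustrative sequence: 101 ([CLS]) ... 102 ([SEP]), then padding.
ids = [101, 6825, 5330, 511, 102, 0, 0, 0]
mask = attention_mask(ids)
real_length = sum(mask)  # number of non-padding tokens
```

In the printed tensors, the mask has 102 ones followed by 28 zeros, which matches the 102 non-zero IDs in the input, so the input encoding itself looks consistent.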

Hello, I really need your help

Could you share the files train.csv, train_big.csv, and train_full.csv, and explain what each of these .csv files contains?
Thank you very much. I really need your help.

Results

Could you please tell me what results you achieved? My own results are not good. Thanks!

Thank you for your good work; some questions

1. Which version of the pretrained BERT model did you use? Could you share the link?
2. In what order should the steps in data.py be run? I don't know how to produce train.csv and eval.csv. Is there also a valid.csv?
3. When I use the files in bert_model, I get an error: No such file or directory: 'bert_model\pytorch_model.bin'

Thank you for your reply.

Code problem

I made one change:
On line 277 of train.py, I changed pred, _ = model.beam_decode(batch[0], batch[1]) to pred, _ = model.beam_decode(batch[0], batch[1], 3, 3)
It raises this error:
Traceback (most recent call last):
  File "train.py", line 277, in <module>
    pred, _ = model.beam_decode(batch[0], batch[1], 3, 3)
  File "/root/_project/summarization-bert-transformer/model.py", line 200, in beam_decode
    active_inst_idx_list = beam_decode_step(
  File "/root/_project/summarization-bert-transformer/model.py", line 162, in beam_decode_step
    dec_seq = prepare_beam_dec_seq(inst_dec_beams, len_dec_seq)
  File "/root/_project/summarization-bert-transformer/model.py", line 138, in prepare_beam_dec_seq
    dec_partial_seq = [b.get_current_state() for b in inst_dec_beams if not b.done]
  File "/root/_project/summarization-bert-transformer/model.py", line 138, in <listcomp>
    dec_partial_seq = [b.get_current_state() for b in inst_dec_beams if not b.done]
  File "/root/_project/summarization-bert-transformer/transformer/Beam.py", line 33, in get_current_state
    return self.get_tentative_hypothesis()
  File "/root/_project/summarization-bert-transformer/transformer/Beam.py", line 90, in get_tentative_hypothesis
    hyps = [self.get_hypothesis(k) for k in keys]
  File "/root/_project/summarization-bert-transformer/transformer/Beam.py", line 90, in <listcomp>
    hyps = [self.get_hypothesis(k) for k in keys]
  File "/root/_project/summarization-bert-transformer/transformer/Beam.py", line 100, in get_hypothesis
    hyp.append(self.next_ys[j+1][k])
IndexError: tensors used as indices must be long, byte or bool tensors

The version is torch 1.6.

Question about optimizer.

Why is weight_decay set to 0.0 for parameters whose names include bias, LayerNorm.bias, or LayerNorm.weight? Thanks!

# optimizer
param_optimizer = list(model.named_parameters())
no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
optimizer_grouped_parameters = [
    {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
    {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)], 'weight_decay': 0.0}]
optimizer = BertAdam(optimizer_grouped_parameters,
                        lr=args.learning_rate,
                        warmup=0.1,
                        t_total=num_train_optimization_steps)
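
The grouping in the snippet above is decided purely by substring matching on parameter names. A minimal sketch of that matching in plain Python, using hypothetical parameter names chosen for illustration (they are not taken from the repository's model):

```python
no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']

# Hypothetical parameter names, for illustration only.
names = [
    'encoder.layer.0.attention.self.query.weight',
    'encoder.layer.0.attention.self.query.bias',
    'encoder.layer.0.attention.output.LayerNorm.weight',
    'encoder.layer.0.attention.output.LayerNorm.bias',
]


def skips_decay(name, patterns=no_decay):
    """True if any pattern occurs as a substring of the parameter name."""
    return any(nd in name for nd in patterns)


# Names that would land in the weight_decay 0.01 group.
decayed = [n for n in names if not skips_decay(n)]
# Names that would land in the weight_decay 0.0 group.
not_decayed = [n for n in names if skips_decay(n)]
```

Because 'bias' is already a substring of 'LayerNorm.bias', the second pattern is redundant for this matching. With these names, only the ordinary weight matrix ends up in the decayed group.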
