In https://derwen.ai/docs/


Is handling of singular / plural forms ('sentence' and 'sentences') correct / consistent? about pytextrank HOT 3 CLOSED

0dB commented on June 27, 2024 1

Is handling of singular / plural forms ('sentence' and 'sentences') correct / consistent?

from pytextrank.

Comments (3)

0dB commented on June 27, 2024 3

Thanks, let me try that out and see what effect it has overall; then I will update the sample output, too. I can do this sometime soon.

Update: I think I am more pleased with the results. I am getting better summaries this way, since singular and plural forms of a word are now "equal" to the algorithm and together carry more weight, instead of each carrying a separate, weaker weight. I will test some more and then propose a few updates to the sample page.


Ankush-Chander commented on June 27, 2024 2

Hi @0dB
Thanks for bringing this to our attention.
The grouping of occurrences is working as per the scrubber code.
Since the scrubber function in the example code returns span.text, occurrences of 'sentences' are grouped as one phrase, while occurrences of 'sentence' are grouped separately as another.

We can get the desired behaviour by changing the example code from

return span.text

to

return span.lemma_

This will group all occurrences of sentence and sentences together.
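To make the difference concrete, here is a minimal sketch of how the scrubber's return value decides which occurrences are counted together. It does not call pytextrank or spaCy: `ToySpan` is a hypothetical stand-in that carries only the `text` and `lemma_` attributes a spaCy span exposes, `group_counts` is an illustrative helper, and the lemma values are written by hand as an assumption about what a lemmatizer would produce.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class ToySpan:
    """Hypothetical stand-in for a spaCy span (not the real class)."""
    text: str
    lemma_: str  # hand-written here; a real pipeline would compute it


def scrub_text(span: ToySpan) -> str:
    # Mirrors the current example: the surface form is the grouping key.
    return span.text


def scrub_lemma(span: ToySpan) -> str:
    # Mirrors the proposed change: the lemma is the grouping key.
    return span.lemma_


def group_counts(spans: Iterable[ToySpan],
                 scrubber: Callable[[ToySpan], str]) -> Counter:
    # Occurrences whose scrubbed key is equal are counted as one phrase.
    return Counter(scrubber(span) for span in spans)


spans = [
    ToySpan(text="sentence", lemma_="sentence"),
    ToySpan(text="sentences", lemma_="sentence"),
    ToySpan(text="sentences", lemma_="sentence"),
]

by_text = group_counts(spans, scrub_text)
by_lemma = group_counts(spans, scrub_lemma)

print(dict(by_text))   # {'sentence': 1, 'sentences': 2}
print(dict(by_lemma))  # {'sentence': 3}
```

With the surface form as the key, the singular and plural forms are two separate entries; with the lemma as the key, they collapse into a single entry with the combined count.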

Please feel free to make this change in the example notebook in your existing PR #233.


ceteri commented on June 27, 2024

Many thanks @0dB and @Ankush-Chander!

It would help to have examples/sample.ipynb updated to illustrate the behaviors discussed here.

@0dB, the changes in your PR #233 look good.

We're having issues with our CI pipeline (see #235), and as soon as I get that cleared (hopefully tonight) I'll accept/merge the PR.

I also noticed the typo "toekn" in that same notebook :) FWIW, these notebooks get rendered as Markdown to build portions of our docs, so the docs will be updated by the same fix.
