Very useful and great work. How do I use a different ontology from a

How to apply to a different ontology/domain? about cso-classifier HOT 6 OPEN

innerop commented on May 30, 2024

How to apply to a different ontology/domain?

from cso-classifier.

Comments (6)

angelosalatino commented on May 30, 2024 3

Hi,
we wrote an article explaining how you can adopt the CSO Classifier in other fields: https://infernusweb.altervista.org/wp/how-to-use-the-cso-classifier-in-other-domains/

Please do let us know if you need further information.

angelosalatino commented on May 30, 2024 1

Hi, these are very good questions. I will soon write an article/tutorial/guide on my blog about how to apply the classifier to other domains of science. Stay tuned.

innerop commented on May 30, 2024 1

@angelosalatino

I looked at the code for generating the file which you shared in the article.

I'd like to point out the divergence I see with respect to the description given in the article.

The description says:

"To generate this dictionary/file, we collected all the different words available within the vocabulary of the model. Then iterating on each word, we retrieved its top 10 similar words from the model, and we computed their Levenshtein similarity against all CSO topics. If the similarity was above 0.7, we created a record which stored all CSO topics triggered by the initial word."

But I believe the code does this instead:

"To generate this dictionary/file, we collected all the different words available within the vocabulary of the model. Then, iterating on each word, we retrieved its top 10 similar words from the model and put them in a list, which we iterated over. If the cosine similarity for a word in the list was equal to or greater than 0.7, we computed its Levenshtein similarity against all CSO topics, and where that was equal to or above 0.94 we added the topic to a record (creating it if it didn't exist), which stored all CSO topics triggered by the initial word from our model."

from cso-classifier.
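The procedure described in the comment above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the project's actual script: `most_similar` is a hypothetical callable standing in for the embedding model's nearest-neighbour lookup (for example, something wrapping gensim's `most_similar`), the Levenshtein similarity is assumed to be `1 - distance / max(len)`, which may differ from the ratio used in the real code, and the record field names are made up for illustration.

```python
from collections import defaultdict


def levenshtein_distance(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a, b):
    """Assumed normalisation: 1.0 for identical strings, 0.0 for fully different."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def build_token_to_topics(vocabulary, most_similar, topics,
                          min_cosine=0.7, min_levenshtein=0.94, top_n=10):
    """Map each vocabulary word to the ontology topics it triggers."""
    token_to_topics = defaultdict(list)
    for word in vocabulary:
        # most_similar is assumed to return (similar_word, cosine) pairs.
        for similar_word, cosine in most_similar(word, top_n):
            if cosine < min_cosine:
                continue
            for topic in topics:
                score = levenshtein_similarity(similar_word, topic)
                if score >= min_levenshtein:
                    # Hypothetical record layout, for illustration only.
                    token_to_topics[word].append({
                        "topic": topic,
                        "similar_word": similar_word,
                        "cosine": cosine,
                        "levenshtein": score,
                    })
    return dict(token_to_topics)


# Toy stand-in for a trained embedding model (made-up numbers).
toy_neighbours = {
    "ontologies": [("ontology", 0.91), ("taxonomy", 0.65)],
    "semantic": [("semantics", 0.75), ("semantical", 0.72)],
}


def toy_most_similar(word, top_n):
    return toy_neighbours.get(word, [])[:top_n]


topics = ["ontology", "semantics", "semantic web"]
result = build_token_to_topics(list(toy_neighbours), toy_most_similar, topics)
for word, records in result.items():
    print(word, [record["topic"] for record in records])
```

To try a different ontology, the same loop should apply with `topics` replaced by that ontology's topic labels and `most_similar` backed by a model trained on text from the target domain.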

innerop commented on May 30, 2024

@angelosalatino

That would help greatly in adopting and adapting this work.

For now, however, could you please provide the script that generates the token-to-cso-combined file?

The README is clear on what is involved, but looking at the CSO I have no clue what constitutes a "topic". The "words" (1-, 2-, and 3-gram entities) show up in so many places. I have no idea how to even query the CSO properly. Do I use SPARQL? Is this RDF? RDFS? I'm completely new to the format.

Referring to this passage in README.MD:

To generate this file, we collected all the set of words available within the vocabulary of the model. Then iterating on each word, we retrieved its top 10 similar words from the model, and we computed their Levenshtein similarity against all CSO topics. If the similarity was above 0.7, we created a record which stored all CSO topics triggered by the initial word.

innerop commented on May 30, 2024

Thank you and I’ll keep you in the loop on how I’m using it and any improvements I can think of or further questions.

I managed to find an older version, from before you added the cache, and I could see how you’re doing the matching against the ontology with the embeddings, so that was very educational. One note, however: the older version only works on Python 3.6, not 3.7 or later. It throws a StopIteration exception from NLTK util. That’s an issue with Python and NLTK, not your codebase.

Thank you 🙏.
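One plausible explanation for the Python 3.7 breakage, offered as an assumption rather than a diagnosis of the NLTK code: since Python 3.7 (PEP 479), a `StopIteration` that escapes inside a generator body is converted to a `RuntimeError` instead of silently ending the generator. The sketch below reproduces that behaviour with made-up functions; it does not use or mirror NLTK.

```python
def first_items(sequences):
    """Yield the first element of each sequence (deliberately fragile)."""
    for seq in sequences:
        # next() raises StopIteration on an empty sequence. Inside a
        # generator on Python 3.7+, that becomes RuntimeError (PEP 479).
        yield next(iter(seq))


def first_items_safe(sequences):
    """Same idea, but ends the generator explicitly on an empty sequence."""
    for seq in sequences:
        try:
            item = next(iter(seq))
        except StopIteration:
            return  # explicit, PEP 479-friendly way to stop
        yield item


try:
    list(first_items([[1, 2], [], [3]]))
    outcome = "no error"
except RuntimeError as exc:
    outcome = "RuntimeError: {}".format(exc)

safe_result = list(first_items_safe([[1, 2], [], [3]]))
print(outcome)
print(safe_result)
```

Code written for Python 3.6 and earlier that relied on the old behaviour needs the explicit `return` pattern shown in `first_items_safe`, or a library version that has already made that change.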

angelosalatino commented on May 30, 2024

Hi, yes. Your explanation is very detailed. We left some details out for the sake of the narrative and referred the reader to the code for further details. But your description definitely fits the actual process 100%.

Thanks
