Comments (7)
@MarketingPip Sorry to intrude on your conversation, but I feel a bit compelled to jump in.
I have been using Compromise.js for over 3 years, am a huge fan and I have previously had the pleasure of working with @spencermountain. I'm subscribed to this feed and I make sure to read every issue raised, every release note etc as Compromise is a critical component in our projects.
So, I have read all of the (many) tickets you are raising. It's becoming a frustration for me as it's getting really spammy.
I have to agree with Spencer: a lot of your suggestions probably belong in your own project, not in Compromise itself. The whole point of Compromise is to be (very) lightweight, run (very) fast, and help developers solve NLP problems by providing a generic toolkit that can serve as a foundation for many, many problem domains.
Our own company has a large implementation with Compromise at its core and we would not dream of polluting the library with our domain-specific patterns. When we find a bug in Compromise that is blocking us from proceeding, we raise an issue. Or when we have developed a Compromise plugin, verified it works through tests and UAT... then sometimes we consider integrating it with the core library and raising a PR for the benefit of others.
As a matter of GitHub etiquette, if you have something to contribute, you should read the documentation in depth, investigate the source code in depth, and learn how the unit testing library works. With all this in mind, you could simply add features to the library (complete with unit test updates, running all existing unit tests, documentation updates, etc.). If the features are rejected, no worries, you're free to use them in your own fork and/or projects.
from compromise.
I think you should build on top of it. That's what compromise was made for.
Sounds like you've got a lot of ambitious ideas for Named-Entity disambiguation that exceed the scope of this project. I can add fixes for some of the holes you've found in the #organization tag, like schools and banks, in an upcoming release.
I say, put a neural net on top, do really aggressive classification of topics, and open-source it. That sounds like a good time. I think you may get frustrated by the slow and increasingly tedious parts of maintaining a generic library for everybody.
cheers
from compromise.
Please do not create so many issues Jared. This is not an issue with compromise, or something that requires my time.
from compromise.
@spencermountain - my apologies, Spencer. I just want to run by you anything that I could see as being beneficial not just for my project but for the Compromise project. But again - a second pair of eyes is more than useful, and I'd prefer you hating me for blowing up your notifications over creating a dumpster fire by submitting a PR that makes ground-breaking changes to the project.
Again - I don't mean to be majorly annoying by blowing your notifications up! lol
from compromise.
Yeah, this is becoming a problem. I'm glad you're excited about the project, and you're encouraged to work on these things in your own projects. Creating 100 issues is not productive or helpful. Maintaining an open source project is hard, and I don't have a lot of time in my day.
from compromise.
I guess a better question is how to solve this: how can I propose solutions / changes / updates to the lexicon without wasting your time or opening issues, while still helping with maintenance and letting the community as a whole benefit from my changes...?
from compromise.
Thank you for your comments. That said - I guess I should be opening a discussion for potential rules, etc.
Most of the rules I suggested will help keep Compromise lightweight and help with POS tagging. So I don't understand how that is not beneficial...?
You know that things like "bank of #Country" are going to be an organization, or "the #word corporation". So I don't understand how these rules aren't critical to keeping things lightweight, nor why this should just be for "my project", as this will benefit your company's POS tagger as well, rather than populating the lexicon full of data and making it non-lightweight.
Though some of the match syntax obviously needs improving.
PS: if your company has patterns that can help with POS tagging / identifying places etc., I don't know why you're not contributing them, thinking it would not help the Compromise project. 🤷♂️
Also, not trying to sh*t on your party, but your PR here is literally putting false positives in the lexicon here, and the function you added didn't work properly and needed to be fixed and improved here - these things DO/CAN happen.... Maybe you made an older commit, though, with a feature that you added and verified through tests that didn't need fixing....?
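To make the kind of rule being discussed concrete ("bank of #Country", "the #word corporation"), here is a minimal, dependency-free sketch of pattern-based tagging over tokens that already carry part-of-speech tags. It is an illustration only and does not use Compromise's API: the `Token` and `Rule` types, the `applyRules` function, and the use of `.` as an "any word" slot are all assumptions made up for this example.

```typescript
// Hypothetical sketch, NOT Compromise's API: tokens arrive already tagged.
type Token = { text: string; tags: string[] };

// A rule is a list of slots: a literal word, "#Tag", or "." for any word.
type Rule = { pattern: string[]; tag: string };

const rules: Rule[] = [
  { pattern: ["bank", "of", "#Country"], tag: "Organization" },
  { pattern: ["the", ".", "corporation"], tag: "Organization" },
];

// Check one slot of a pattern against one token.
function slotMatches(slot: string, token: Token): boolean {
  if (slot === ".") return true;
  if (slot.charAt(0) === "#") return token.tags.indexOf(slot.slice(1)) !== -1;
  return token.text.toLowerCase() === slot;
}

// Slide each rule over the sentence and tag every token in a matching window.
function applyRules(tokens: Token[], ruleList: Rule[]): Token[] {
  // Copy the tokens so the caller's input is not mutated.
  const out: Token[] = tokens.map((t) => ({ text: t.text, tags: t.tags.slice() }));
  for (const rule of ruleList) {
    const len = rule.pattern.length;
    for (let i = 0; i + len <= out.length; i++) {
      const hit = rule.pattern.every((slot, j) => {
        const tok = out[i + j];
        return tok !== undefined && slotMatches(slot, tok);
      });
      if (!hit) continue;
      out.slice(i, i + len).forEach((tok) => {
        if (tok.tags.indexOf(rule.tag) === -1) tok.tags.push(rule.tag);
      });
    }
  }
  return out;
}

const sentence: Token[] = [
  { text: "She", tags: ["Pronoun"] },
  { text: "joined", tags: ["Verb"] },
  { text: "Bank", tags: ["Noun"] },
  { text: "of", tags: ["Preposition"] },
  { text: "Canada", tags: ["Country"] },
];

const tagged = applyRules(sentence, rules);
const orgWords = tagged
  .filter((t) => t.tags.indexOf("Organization") !== -1)
  .map((t) => t.text);
console.log(orgWords.join(" "));
```

Note that in this sketch the second rule would also tag a phrase such as "the entire corporation", which is the sort of false positive that makes broad patterns hard to accept into a generic library and easier to keep in a project-specific layer.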
from compromise.