juliatext / textmodels.jl
Neural Network based models for Natural Language Processing
License: Other
Update TextAnalysis to [email protected] with Zygote.
Without compromising on performance, Zygote supports the full flexibility and dynamism of the Julia language, including control flow, recursion, closures, structs, dictionaries, and more.
DiffEqFlux and NeuralNetDiffEq have already moved to Zygote.
Tracker, on the other hand, is still alive but no longer heavily maintained, so it is better to use Zygote.
Multiple issues with this one. Line 34 references SentimentClassifier, which isn't defined anywhere and should be BinSentimentClassifier.
Moreover, TextModels does not include the sentiment.jl file where BinSentimentClassifier is defined.
And then running this:
julia> using TextModels.ULMFiT
julia> c = ULMFiT.BinSentimentClassifier()
ERROR: Expected param size (1150, 1), got (1150,)
Stacktrace:
[1] error(s::String)
@ Base .\error.jl:33
[2] loadparams!(m::TextModels.ULMFiT.BinSentimentClassifier, xs::Vector{Any})
@ Flux C:\Users\Admin\.julia\packages\Flux\BPPNj\src\functor.jl:59
[3] TextModels.ULMFiT.BinSentimentClassifier()
@ TextModels.ULMFiT C:\Users\Admin\Desktop\work\TweetClassification\dev\TextModels\src\ULMFiT\sentiment.jl:75
[4] top-level scope
@ REPL[5]:1
[5] top-level scope
@ C:\Users\Admin\.julia\packages\CUDA\DfvRa\src\initialization.jl:52
Running the following below line 49 of ULMFiT/sentiment.jl:
i = 0
for (p, x) in zip(Flux.params(sc), weights)
    println("Iteration ", i)
    println("size(p): ", size(p))
    println("size(x): ", size(x))
    size(p) == size(x) ||
        error("Expected param size $(size(p)), got $(size(x))")
    i += 1
    println()
end
I get
Any workaround/solution?
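One possible workaround, sketched here under assumptions rather than taken from TextModels or Flux: since the error reports (1150, 1) against (1150,), the saved weight has the same number of elements as the parameter and differs only in shape. The helper load_with_reshape! below is hypothetical (it is not a Flux or TextModels function), and plain arrays stand in for Flux.params(sc) and the loaded weights. It reshapes a weight only when the element counts agree and still raises the original error otherwise.

```julia
# Hypothetical helper, not part of Flux or TextModels.
# Copies saved weights into parameter arrays, reshaping a weight
# when its shape differs but its element count matches.
function load_with_reshape!(params, weights)
    length(params) == length(weights) ||
        error("Expected $(length(params)) weight arrays, got $(length(weights))")
    for (p, x) in zip(params, weights)
        if size(p) != size(x)
            # Reshape only when no data would be lost, e.g. (1150,) -> (1150, 1).
            length(p) == length(x) ||
                error("Expected param size $(size(p)), got $(size(x))")
            x = reshape(x, size(p))
        end
        copyto!(p, x)
    end
    return params
end

# Stand-ins for Flux.params(sc) and the weights read from disk (assumed shapes).
params  = [zeros(Float32, 1150, 1), zeros(Float32, 3, 2)]
weights = Any[ones(Float32, 1150), ones(Float32, 3, 2)]
load_with_reshape!(params, weights)
```

Whether reshaping is actually the right fix depends on why the saved weights and the model disagree; if the layer definition changed between Flux versions, the weights may need to be regenerated instead.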
When using PoSTagger on text in a language other than English (but one supported in the default Penn Treebank), everything gets tagged as 'NNP'.
Here is how to reproduce the problem:
using TextModels
using TextAnalysis
using Languages
str = "André trouve cette maison très belle"
sd = StringDocument(str)
pos = PoSTagger()
language!(sd,Languages.French())
pos(sd)
TextModels v0.1.0
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.6.1 (2021-04-23)
 _/ |\__'_|_|_|\__'_|  |  Built by Homebrew (v1.6.1)
|__/                   |
(@v1.6) pkg> add TextModels
Installing known registries into `~/.julia`
Added registry `General` to `~/.julia/registries/General`
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package Flux [587475ba]:
 Flux [587475ba] log:
 ├─possible versions are: 0.4.1-0.12.4 or uninstalled
 ├─restricted by compatibility requirements with TextModels [77b9cbda] to versions: 0.9.0
 │ └─TextModels [77b9cbda] log:
 │   ├─possible versions are: 0.1.0 or uninstalled
 │   └─restricted to versions * by an explicit requirement, leaving only versions 0.1.0
 └─restricted by compatibility requirements with CuArrays [3a865a2d] to versions: [0.4.1-0.8.3, 0.11.0-0.12.4] or uninstalled — no versions left
   └─CuArrays [3a865a2d] log:
     ├─possible versions are: 0.2.1-2.2.2 or uninstalled
     └─restricted by julia compatibility requirements to versions: uninstalled
Since a lot was updated in the last PR, can we have a new release of TextModels?
cc @aviks @tejasvaidhyadev
Fairly complex version compatibility issues due to the Flux 0.9.0 dependency.
(test) pkg> dev TextModels
Path `/home/sambit/.julia/dev/TextModels` exists and looks like the correct package. Using existing path.
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package CUDAapi [3895d2a7]:
 CUDAapi [3895d2a7] log:
 ├─possible versions are: 0.5.0-4.0.0 or uninstalled
 ├─restricted by compatibility requirements with Flux [587475ba] to versions: 1.1.0-1.2.0
 │ └─Flux [587475ba] log:
 │   ├─possible versions are: 0.4.1-0.12.1 or uninstalled
 │   └─restricted to versions 0.9 by TextModels [77b9cbda], leaving only versions 0.9.0
 │     └─TextModels [77b9cbda] log:
 │       ├─possible versions are: 0.1.0 or uninstalled
 │       └─TextModels [77b9cbda] is fixed to version 0.1.0
 └─restricted by julia compatibility requirements to versions: uninstalled — no versions left
Can be fixed with #1 as well. Flux's internal dependencies have changed significantly.
While testing #32, I noticed this triggered in the log: https://github.com/FluxML/Flux.jl/blob/0a215462ad8e0ba795205c9e94864403207d63fa/src/deprecations.jl#L13.
On a side note, should the badges in the README point to GHA now?
Transformers.Bert with a tiny embeddings tweak.
(BPE.jl) as a tokenizer (same as GPT-2) and uses a different pre-training scheme.
We can also wrap CamemBERT (the French version of BERT) around RoBERTa.
This issue is used to trigger TagBot; feel free to unsubscribe.
If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.
During training on the CoNLL 2003 dataset (around 203k POS-tagged words), the program gets terminated, probably because of excessive memory allocation.
Working on a better implementation.
Training code can be found here