olegtarasov / fasttext.netwrapper
.NET Standard wrapper for fastText library. Now works on Windows, Linux and macOS!
License: MIT License
The original fastText library provides a test command that measures the accuracy of a classification model by calculating its precision and recall.
Command example:
fasttext test model_cooking.bin cooking.valid
Output:
N 3000
P@1 0.164
R@1 0.0717
Number of examples: 3000
This very important method is missing from FastText.NetWrapper. Would it be possible to implement it?
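To make the P@1 and R@1 figures above concrete, here is a minimal sketch of how precision and recall at k are commonly computed: precision is the share of predicted labels that are correct, and recall is the share of true labels that were predicted. It does not call the wrapper or fastText. The class `PrecisionRecall`, the method `AtK`, and the array-based input shape are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;

public static class PrecisionRecall
{
    // predictions[i]: the top-k labels predicted for example i (hypothetical input shape).
    // gold[i]: the true labels of example i.
    public static (double Precision, double Recall) AtK(string[][] predictions, string[][] gold)
    {
        if (predictions.Length != gold.Length)
            throw new ArgumentException("predictions and gold must have the same length.");

        long correct = 0, predicted = 0, relevant = 0;
        for (int i = 0; i < predictions.Length; i++)
        {
            // Count predicted labels that also appear among the true labels.
            correct += predictions[i].Count(label => gold[i].Contains(label));
            predicted += predictions[i].Length;
            relevant += gold[i].Length;
        }

        // Guard against division by zero for empty inputs.
        double precision = predicted == 0 ? 0.0 : (double)correct / predicted;
        double recall = relevant == 0 ? 0.0 : (double)correct / relevant;
        return (precision, recall);
    }
}
```

With two examples, one prediction each, one of them correct, and three true labels in total, this yields a precision of 1/2 and a recall of 1/3.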
Hello, I am trying to use the FastText.NetWrapper library on Mac. I saved the FastText file to usr/local/bin. I cloned the FastText.NetWrapper repository from GitHub and tried to run it from the console, but I got the following error:
Unhandled exception. System.EntryPointNotFoundException: Unable to find an entry point named 'CreateFastText' in shared library 'fasttext'.
If anyone has an idea that can help I would be very grateful.
Can this support .NET Core 2.0?
Hello,
first thanks a lot for the new features implemented, especially for the test method.
Unfortunately, the NuGet package does not work. As soon as I install the update, I get the error:
"CS0246 The type or namespace name 'FastText' could not be found (are you missing a using directive or an assembly reference?)"
The package also can no longer be referenced with a using statement (using FastText.NetWrapper;).
Reverting to package version 1.0.37 resolves the errors.
I tried with .NET Framework, .NET Standard and .NET Core, and none of them worked.
Could you test the package and check for any issues?
Getting this exception while using this code:
FastTextWrapper fastText = new FastTextWrapper();
fastText.LoadModel("lid.176.ftz");
Prediction[] prediction = fastText.PredictMultiple(File.ReadAllText("TextDoc.txt"), 4);
Message=Unable to load DLL 'fasttext' or one of its dependencies: The specified module could not be found. (0x8007007E).
What am I missing? I installed the FastText.NetWrapper nuget & have the latest Visual C++ Runtime installed. Thanks!
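A quick first check for "Unable to load DLL" errors like this one is whether a native fastText file is actually present next to the application. The sketch below uses the file names that appear in the MSBuild workaround further down this page (`fasttext.dll`, `libfasttext.so`, `libfasttext.dylib`). Both the per-OS mapping and the idea that the file should sit in the application base directory are assumptions, and `NativeLibraryCheck` is a hypothetical helper, not part of the wrapper.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class NativeLibraryCheck
{
    // Assumed per-OS file name, following the usual naming conventions.
    public static string ExpectedFileName()
    {
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows)) return "fasttext.dll";
        if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX)) return "libfasttext.dylib";
        return "libfasttext.so";
    }

    // True when the expected native file exists in the given directory.
    public static bool ExistsIn(string directory) =>
        File.Exists(Path.Combine(directory, ExpectedFileName()));

    // Prints where the application runs from and whether the file is there.
    public static void Report()
    {
        Console.WriteLine($"Base directory: {AppContext.BaseDirectory}");
        Console.WriteLine($"{ExpectedFileName()} present: {ExistsIn(AppContext.BaseDirectory)}");
    }
}
```

Calling `NativeLibraryCheck.Report()` at startup only narrows the problem down: a present file can still fail to load if one of its own dependencies is missing.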
Hi!
I would like to display the current training status of the model in an interface other than the console.
In particular, I need to get the estimated time to finish.
Is this functionality planned?
Hello,
What is the license for this project?
Could you add a LICENSE file to the repo and nuget package?
Thanks!
I have a trained model; however, when I run a prediction in C#, the resulting probability is negative, for example -0.681497.
It looks like a conversion overflow issue.
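One possible explanation, offered only as an assumption since nothing on this page confirms it, is that the value is a natural-log probability rather than an overflowed number. Log-probabilities are always zero or negative, and exponentiating one gives back an ordinary probability. `LogProbability.ToProbability` is a hypothetical helper for illustration.

```csharp
using System;

public static class LogProbability
{
    // Assumption: the input is a natural-log probability (always <= 0).
    // Exponentiating maps it back into the range (0, 1].
    public static double ToProbability(double logProbability) => Math.Exp(logProbability);
}
```

Under that assumption, `LogProbability.ToProbability(-0.681497)` is roughly 0.506, which is a plausible score for a top prediction.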
Hi and thanks for the effort.
I have a model, trained using the unsupervised method purely for querying the nearest neighbors.
It seems that the unsupervised models cannot be used with this library at all.
Am I doing something wrong, or is this really not supported?
When I make a call from a WCF project (.NET Framework 4.7.1) using Bitness.x64, I get the error below at manager.LoadNativeLibrary();
There is no supported native library for platform 'Windows' and bitness 'x32'
Thanks.
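The error message mentions bitness 'x32', which suggests (as an assumption, not a confirmed diagnosis) that the hosting process itself runs as 32-bit even on a 64-bit machine. The standard .NET properties below report what the current process actually is. `BitnessCheck` is a hypothetical helper.

```csharp
using System;

public static class BitnessCheck
{
    // Summarizes OS and process bitness using standard .NET properties.
    public static string Describe() =>
        $"OS 64-bit: {Environment.Is64BitOperatingSystem}, " +
        $"process 64-bit: {Environment.Is64BitProcess}, " +
        $"pointer size: {IntPtr.Size} bytes";

    // Fails early with a clear message if the process is not 64-bit.
    public static void Ensure64BitProcess()
    {
        if (!Environment.Is64BitProcess)
            throw new PlatformNotSupportedException(
                "This process is 32-bit; a 64-bit native library cannot be loaded into it.");
    }
}
```

If `Describe()` reports a 32-bit process, the host (for example the web server process or a "Prefer 32-bit" project setting) is the place to look, rather than the wrapper.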
To get a vector from a word, use the function
float[] GetWordVector(string word)
How can I convert a vector back to a word?
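A vector generally has no exact inverse, so the usual approach is to search a vocabulary for the word whose vector is closest to the query, typically by cosine similarity. The sketch below assumes you have already collected word vectors into a dictionary yourself (for instance by calling GetWordVector for each word of interest). `VectorLookup` and its methods are hypothetical and not part of the wrapper.

```csharp
using System;
using System.Collections.Generic;

public static class VectorLookup
{
    // Cosine similarity between two vectors of equal length.
    public static double Cosine(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        // A zero vector has no direction; treat its similarity as 0.
        if (normA == 0 || normB == 0) return 0.0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the vocabulary word most similar to the query, or null if the vocabulary is empty.
    public static string NearestWord(float[] query, IReadOnlyDictionary<string, float[]> vocabulary)
    {
        string best = null;
        double bestScore = double.NegativeInfinity;
        foreach (var entry in vocabulary)
        {
            double score = Cosine(query, entry.Value);
            if (score > bestScore)
            {
                bestScore = score;
                best = entry.Key;
            }
        }
        return best;
    }
}
```

This linear scan is fine for small vocabularies; for a full model vocabulary, a nearest-neighbor query on the model itself would be the more practical route.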
Hi, I'm trying to use the library with a .NET Framework Web API project, but it fails with the following error: System.DllNotFoundException: 'Failed to load DLL 'fasttext': The specified module cannot be found. (Exception from HRESULT: 0x8007007E).'
Hi!
In the .csproj file I changed the header from
<Project Sdk="Microsoft.NET.Sdk.Web">
to
<Project Sdk="Microsoft.NET.Sdk">
and it all worked.
I have no explanation for this.
Originally posted by @VZMDeadAngel in #26 (comment)
Did you modify the source code for the fastText binary? When I compile my own version of the .dll and include it as a resource, the call to CreateFastText() seems to fail.
If you modified any of the C++ classes or headers for the import, could you share the source files?
Thanks.
Hi.
I created a simple .NET 5.0 console application.
using FastText.NetWrapper;

namespace ConsoleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // Creating the wrapper is enough to trigger loading of the native library.
            var fasttext = new FastTextWrapper();
        }
    }
}
I created a Dockerfile (the default one generated by Visual Studio):
FROM mcr.microsoft.com/dotnet/runtime:5.0 AS base
WORKDIR /app
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
WORKDIR /src
COPY ["ConsoleApp/ConsoleApp.csproj", "ConsoleApp/"]
RUN dotnet restore "ConsoleApp/ConsoleApp.csproj"
COPY . .
WORKDIR "/src/ConsoleApp"
RUN dotnet build "ConsoleApp.csproj" -c Release -o /app/build
FROM build AS publish
RUN dotnet publish "ConsoleApp.csproj" -c Release -o /app/publish
FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "ConsoleApp.dll"]
As a result, I see an error.
Unable to load shared library 'fasttext' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: libfasttext: cannot open shared object file: No such file or directory
I'm using Docker Desktop for Windows with a Linux environment.
What am I doing wrong? Thank you!
Using the code below with a pretrained model throws "Model not loaded!".
static void Main(string[] args)
{
    FileInfo fileInfo = new FileInfo("D:\\models\\wiki.sv.bin");
    if (fileInfo.Exists)
    {
        using (FastTextWrapper wrapper = new FastTextWrapper())
        {
            wrapper.LoadModel(fileInfo.FullName);
            // Exception here
            Prediction predictSingle = wrapper.PredictSingle("test");
            if (!string.IsNullOrWhiteSpace(predictSingle.Label))
            {
                Console.WriteLine(predictSingle.Label);
            }
        }
    }
}
Training with pretrained vectors is not possible. No matter if the method is SkipGram or Supervised, the wrapper throws an exception:
"External component has thrown an exception."
StackTrace:
at FastText.NetWrapper.FastTextWrapper.Train(IntPtr hPtr, String input, String output, TrainingArgsStruct args, String labelPrefix, String pretrainedVectors)
In addition, the original Fasttext library allows supervised training with pretrained vectors, whereas in the wrapper this is only possible for low-level training (missing property in the supervised hyperparameters).
For small learning sets, however, it is necessary that the training can process a pretrained vector file with both methods.
Is it possible a.) to fix the error and b.) to implement pretrained vectors in supervised training as well?
I want to use this library and run it on 32 bit windows machines. In the embedded resources, there is only windows x64. (+mac and linux)
Would it be possible to have also a 32 bit embedded resource for windows?
For version 1.1 we need to implement a new API that closely follows the original FastText command line interface. It will provide a much better experience, since people who are proficient with command line FastText will be able to use the wrapper immediately. This approach will also minimize manual argument mapping and boilerplate code, decreasing the risk of introducing nasty bugs.
Tasks:
Supervised function instead of the old Train and TrainSupervised.
Test function.
Unsupervised function for cbow and skipgram training.
I built a model from a Wikipedia dump. When I run the nearest neighbor call with the fasttext-0.9.1 command line, I get different results than with the FastText.NetWrapper NuGet package. For example, these are the results for the query "primeobsession" from the command line:
whatsyourobsession 0.79625
thrillersacd 0.769539
shiftinaction 0.764332
andydehnart 0.764045
secretstory 0.761035
digitallyobsessed 0.76042
confrontmagazine 0.758854
comicaddiction 0.756743
thrilljockey 0.756244
marchmadness 0.754537
Whereas the following are the results from the NetWrapper:
Label | Probability (float) |
---|---|
francisrimbert | 0.767542958 |
dbernard | 0.762311339 |
urgentessays | 0.762221634 |
danielravennest | 0.7606867 |
netknowledgenow | 0.753572464 |
wilkpedia | 0.7532526 |
sorfernando | 0.745988 |
orlandoferrer | 0.7453338 |
esperanzadirector | 0.7446482 |
universaltennis | 0.74298507 |
musewiki | 0.742397249 |
Is the FastText NetWrapper using a different version of FastText?
Where is the cooking.train.txt sample file which is mentioned in the text?
Also, are you open to pull requests for a few minor tweaks?
Hi!
Thank you so much for adding a callback to track the learning process. However, this feature doesn't work for me.
I use the "Unsupervised" function to train a "skipgram" model, and the callback is never invoked.
Below is an example of my code.
Am I doing something wrong?
var ftArgs = new UnsupervisedArgs()
{
    dim = 300,
    ws = 15,
    minCount = 3,
    minn = 3,
    maxn = 6,
    neg = 5,
    wordNgrams = 1,
    loss = LossName.HierarchicalSoftmax,
    thread = Environment.ProcessorCount,
    verbose = 2,
    lr = 0.05,
    lrUpdateRate = 100,
    epoch = 1000,
    TrainProgressCallback = (progress, loss, wst, lr, eta) =>
    {
        ...
    }
};
fastText.Unsupervised(UnsupervisedModel.SkipGram, "text.txt", "model.bin", ftArgs);
I would appreciate knowing how to train a model incrementally.
I tried the approach from the link below, but it is not working.
Link:
facebookresearch/fastText#423
Command:
./fasttext [supervised | skipgram | cbow] -input train.data -inputModel trained.model.bin -output re-trained [other options] -incr
Thanks in advance.
Hi,
I'm trying to use the FastText wrapper. I create a new console app (.NET Core 3.1 or .NET Standard) in VS2019, install FastText.NetWrapper, and I always get this error on the first line:
var fastText = new FastTextWrapper();
Could you please advise how to fix it?
First of all, congratulations on your library. It's very nice.
We consume your NuGet package from one of our microservices, which is responsible for doing some AI work. We have a problem when this service is published as portable: the FastText native files are not copied to the corresponding runtime folder. The same can happen if the microservice is published for a specific runtime identifier, such as win-x64.
For now, as a workaround, we have added the following MSBuild targets, which copy the files after the service is published.
<!-- This is a workaround to copy the FastText native files to the runtimes folder when the service is published as Portable. -->
<Target Name="PostPublishPortableCopyFastTextNativeFiles" AfterTargets="AfterPublish" Condition="'$(RuntimeIdentifier)' == ''">
<Warning Text="This project's platform is set to '$(PlatformTarget)', but it requires a native DLLs. Defaulting to 'x64' binaries. Explicitly choose a platform in the project's Build properties to remove this warning."
Condition="'$(PlatformTarget)' != 'x86' And '$(PlatformTarget)' != 'x64'" />
<PropertyGroup>
<FastTextPlatformTarget>$(PlatformTarget)</FastTextPlatformTarget>
<FastTextPlatformTarget Condition="'$(PlatformTarget)' != 'x86' And '$(PlatformTarget)' != 'x64'">x64</FastTextPlatformTarget>
</PropertyGroup>
<PropertyGroup>
<FastTextWinFilesOutputPath>$(PublishDir)runtimes\win-$(FastTextPlatformTarget)\native\</FastTextWinFilesOutputPath>
<FastTextLinuxFilesOutputPath>$(PublishDir)runtimes\linux-$(FastTextPlatformTarget)\native\</FastTextLinuxFilesOutputPath>
<FastTextOsxFilesOutputPath>$(PublishDir)runtimes\osx-$(FastTextPlatformTarget)\native\</FastTextOsxFilesOutputPath>
</PropertyGroup>
<Message Text="FastText.NetWrapper Post Publish Portable Message" Importance="high"/>
<Message Text="RuntimeIdentifier: Portable" Importance="high"/>
<Message Text="PlatformTarget: $(FastTextPlatformTarget)" Importance="high"/>
<Message Text="FastTextWinFilesOutputPath: $(FastTextWinFilesOutputPath)" Importance="high"/>
<Message Text="FastTextLinuxFilesOutputPath: $(FastTextLinuxFilesOutputPath)" Importance="high"/>
<Message Text="FastTextOsxFilesOutputPath: $(FastTextOsxFilesOutputPath)" Importance="high"/>
<ItemGroup>
<FastTextWinFiles Include="$(MSBuildThisFileDirectory)**\fasttext.dll" />
<FastTextLinuxFiles Include="$(MSBuildThisFileDirectory)**\libfasttext.so" />
<FastTextOsxFiles Include="$(MSBuildThisFileDirectory)**\libfasttext.dylib" />
</ItemGroup>
<Message Text="Copying FastText native files to the runtimes folder..." Importance="high"/>
<Copy SourceFiles="@(FastTextWinFiles)" DestinationFolder="$(FastTextWinFilesOutputPath)" />
<Copy SourceFiles="@(FastTextLinuxFiles)" DestinationFolder="$(FastTextLinuxFilesOutputPath)" />
<Copy SourceFiles="@(FastTextOsxFiles)" DestinationFolder="$(FastTextOsxFilesOutputPath)" />
<ItemGroup>
<FastTextWinFilesToDelete Include="$(PublishDir)fasttext.dll" />
<FastTextLinuxFilesToDelete Include="$(PublishDir)libfasttext.so" />
<FastTextOsxFilesToDelete Include="$(PublishDir)libfasttext.dylib" />
</ItemGroup>
<Message Text="Deleting FastText native files from directory that contains only Portable libraries..." Importance="high"/>
<Delete Files="@(FastTextWinFilesToDelete)" />
<Delete Files="@(FastTextLinuxFilesToDelete)" />
<Delete Files="@(FastTextOsxFilesToDelete)" />
</Target>
<!-- This is a workaround to delete the FastText native files that not match the specified runtime. -->
<Target Name="PostPublishRuntimeDeleteFastTextNativeFilesThatNotMatchSpecifiedRuntime" AfterTargets="AfterPublish" Condition="'$(RuntimeIdentifier)' != ''">
<Message Text="FastText.NetWrapper Post Publish Runtime Message" Importance="high"/>
<Message Text="RuntimeIdentifier: $(RuntimeIdentifier)" Importance="high"/>
<ItemGroup>
<FastTextWinFilesToDelete Include="$(PublishDir)fasttext.dll" />
<FastTextLinuxFilesToDelete Include="$(PublishDir)libfasttext.so" />
<FastTextOsxFilesToDelete Include="$(PublishDir)libfasttext.dylib" />
</ItemGroup>
<Message Text="Deleting FastText native files that not match the specified runtime..." Importance="high"/>
<Delete Files="@(FastTextWinFilesToDelete)" Condition="!$(RuntimeIdentifier.StartsWith('win'))" />
<Delete Files="@(FastTextLinuxFilesToDelete)" Condition="!$(RuntimeIdentifier.StartsWith('linux'))" />
<Delete Files="@(FastTextOsxFilesToDelete)" Condition="!$(RuntimeIdentifier.StartsWith('osx'))" />
</Target>
We created this issue only to report our solution. Perhaps our fix will be useful to you.
Thank you in advance.
I need to use this in a CentOS 7 environment.
The bundled Linux .so is built against a newer version of glibc.
I had to build my own, but when the app launches it overwrites my custom library, so I am forced to do some trickery with LD_PRELOAD to make it work.
Hi Oleg, great work with the wrapper. I am looking at integrating it into another project that is .NET Core, so I took the code and updated it to .NET Standard 2.
It was working great until I put it into a Linux Docker container, where I hit this error:
Unhandled Exception: System.DllNotFoundException: Unable to load shared library 'kernel32.dll' or one of its dependencies.
I can understand that it won't work with the [DllImport("kernel32.dll")] LoadLibraryEx... invoke because that is Windows-specific. I'm researching what's involved in modifying it to run on Linux. Any idea if there's a solution to that?
Cheers!
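One way to avoid the Windows-only kernel32 call is to let the runtime pick the platform loader. The sketch below uses `System.Runtime.InteropServices.NativeLibrary`, which exists in .NET Core 3.0 and later but is not part of .NET Standard 2.0, so it only applies if the project can target a newer runtime. `CrossPlatformLoader` is a hypothetical helper and is not how the wrapper itself is implemented.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CrossPlatformLoader
{
    // Asks the runtime to load the library with the OS-appropriate loader
    // (LoadLibrary on Windows, dlopen on Linux and macOS).
    public static bool TryLoad(string path, out IntPtr handle) =>
        NativeLibrary.TryLoad(path, out handle);

    // Same as TryLoad, but throws a descriptive exception on failure.
    public static IntPtr LoadOrThrow(string path)
    {
        if (NativeLibrary.TryLoad(path, out IntPtr handle))
            return handle;
        throw new DllNotFoundException($"Could not load native library '{path}'.");
    }
}
```

On targets where `NativeLibrary` is unavailable, the usual alternative is a per-platform P/Invoke (kernel32 on Windows, dlopen elsewhere) selected at run time, which is not shown here.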