gocld3's Introduction

cld3

Package cld3 implements language detection using the Compact Language Detector v3.

This package includes the relevant sources from the cld3 project, so it doesn't require any external dependencies. For more information on CLD3, see https://github.com/google/cld3/.

Install with go get github.com/jmhodges/gocld3/cld3.

Documentation is available on GoDoc.

Example

	langId, err := cld3.NewLanguageIdentifier(0, 512)
	if err != nil {
		fmt.Println("whoops, couldn't create a new LanguageIdentifier:", err)
		return // don't use langId after a failed construction
	}
	defer cld3.FreeLanguageIdentifier(langId)
	res := langId.FindLanguage("Hey, this is an english sentence")
	if res.IsReliable {
		fmt.Println("pretty sure we've got text written in", res.Language)
	}
	res = langId.FindLanguage("Muy bien, gracias.")
	if res.IsReliable {
		fmt.Println("ah, and this one is", res.Language)
	}

gocld3's People

Contributors

agrimprasad, jmhodges, lpla


gocld3's Issues

Make gocld3 compilable with protobuf 3.17.3

Hello Jeff,

I'm on macOS 10.15 and am trying to install gocld3. I've installed pkg-config and protobuf using Homebrew for this purpose. The current protobuf release, 3.17.3, is not compatible with some of the generated files in the package. I tried to build the compatible version, protobuf 3.11.3, from source but failed, and I'm not a C++ person either. Could you please make the project compilable with the latest version of protobuf? Thanks a lot. Below is the compiler output I got:

% go get github.com/jmhodges/gocld3/cld3
# github.com/jmhodges/gocld3/cld3
In file included from cld3.cc:3:
In file included from ../../../go/pkg/mod/github.com/jmhodges/[email protected]/cld3/nnet_language_identifier.h:22:
In file included from ../../../go/pkg/mod/github.com/jmhodges/[email protected]/cld3/embedding_feature_extractor.h:23:
In file included from ../../../go/pkg/mod/github.com/jmhodges/[email protected]/cld3/feature_extractor.h:45:
../../../go/pkg/mod/github.com/jmhodges/[email protected]/cld3/feature_extractor.pb.h:17:2: error: This file was generated by an older version of protoc which is
../../../go/pkg/mod/github.com/jmhodges/[email protected]/cld3/feature_extractor.pb.h:18:2: error: incompatible with your Protocol Buffer headers. Please
../../../go/pkg/mod/github.com/jmhodges/[email protected]/cld3/feature_extractor.pb.h:19:2: error: regenerate this file with a newer version of protoc.
../../../go/pkg/mod/github.com/jmhodges/[email protected]/cld3/feature_extractor.pb.h:28:10: fatal error: 'google/protobuf/inlined_string_field.h' file not found

Issues during go get

Hi,
I am trying to use gocld3 but I ran into some issues during the install.
Are there any prerequisites to install before getting the project?

What version of Go are you using (go version)?
go version go1.9.3 darwin/amd64

What operating system and processor architecture are you using (go env)?
macOS 10.13.6

Version of protoc (protoc --version)?
libprotoc 2.6.1

What did you do?
go get github.com/jmhodges/gocld3/cld3

What did you expect to see?
A successful install

What did you see instead?

$ go get github.com/jmhodges/gocld3/cld3
# github.com/jmhodges/gocld3/cld3
In file included from cld3.cc:3:
In file included from gocld3/cld3/nnet_language_identifier.h:22:
In file included from gocld3/cld3/embedding_feature_extractor.h:23:
In file included from gocld3/cld3/feature_extractor.h:45:
gocld3/cld3/feature_extractor.pb.h:12:2: error: This file was generated by a newer version of protoc which is
gocld3/cld3/feature_extractor.pb.h:13:2: error: incompatible with your Protocol Buffer headers.  Please update
gocld3/cld3/feature_extractor.pb.h:14:2: error: your headers.
gocld3/cld3/feature_extractor.pb.h:23:10: fatal error: 'google/protobuf/arena.h' file not found

Can't build. Protobuf problem

$ go get github.com/jmhodges/gocld3/cld3
# github.com/jmhodges/gocld3/cld3
In file included from feature_extractor.h:45:0,
                 from embedding_feature_extractor.h:23,
                 from nnet_language_identifier.h:22,
                 from cld3.cc:3:
feature_extractor.pb.h:10:40: fatal error: google/protobuf/port_def.inc: No such file or directory
compilation terminated.
$ uname -a
Linux acc-MS-7A15 4.15.0-62-generic #69~16.04.1-Ubuntu SMP Fri Sep 6 02:43:35 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ go version
go version go1.10.4 linux/amd64

Any suggestions?

Invalid memory access

The code in https://github.com/jmhodges/gocld3/blob/master/cld3/cld3.cc#L23-L25 has a bug: the address it hands back is not stable.

The comment there says:

"These strings are statically allocated, so we can do this c_str() without worrying about them going off the stack."

However, although the underlying const char* strings are indeed static, cld3 returns a new std::string every single time. The result therefore has to be copied; otherwise the caller can end up reading garbage.
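A minimal sketch of the copy-based fix, under stated assumptions: `DetectorResult`, `DetectStub`, `CResult`, and `FindLanguageCopy` are hypothetical stand-ins invented for illustration, not the actual cld3 or gocld3 types. The point is only that a pointer obtained from `c_str()` on a function-local `std::string` dangles once the function returns, whereas bytes copied into caller-owned storage stay valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for the detector's result: the language code lives in
// a std::string owned by the result object, not in static storage.
struct DetectorResult {
  std::string language;
};

// Hypothetical stand-in for the detector call. It builds a fresh std::string
// on every call, which is the behavior the issue describes.
DetectorResult DetectStub(const std::string& text) {
  DetectorResult r;
  r.language = text.empty() ? "und" : "en";
  return r;
}

// Hypothetical C-style result passed back to the caller. The buffer is part
// of the struct, so it remains valid after the producing function returns.
struct CResult {
  char language[16];
  std::size_t len_language;
};

// Unsafe pattern (shown only as a comment): storing res.language.c_str() in
// the returned struct would hand out a pointer into `res`, which is destroyed
// at the end of this function.
//
// Safer pattern: copy the bytes into storage the caller owns.
CResult FindLanguageCopy(const std::string& text) {
  DetectorResult res = DetectStub(text);
  CResult out{};
  std::size_t n = res.language.size();
  if (n >= sizeof(out.language)) {
    n = sizeof(out.language) - 1;  // clamp, leaving room for the terminator
  }
  assert(n < sizeof(out.language));
  std::memcpy(out.language, res.language.data(), n);
  out.language[n] = '\0';
  out.len_language = n;
  return out;
}
```

A fixed-size buffer keeps ownership simple across a C boundary; a heap copy that the caller later frees would be another option, at the cost of an explicit free call.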

Cannot compile gocld3 with latest Protobuf 3.11.3

Seems like Protobuf 3.11 breaks compatibility with 3.10:

In file included from feature_extractor.h:45:0,
                 from embedding_feature_extractor.h:23,
                 from nnet_language_identifier.h:22,
                 from cld3.cc:3:
feature_extractor.pb.h:17:2: error: #error This file was generated by an older version of protoc which is
 #error This file was generated by an older version of protoc which is
  ^~~~~
feature_extractor.pb.h:18:2: error: #error incompatible with your Protocol Buffer headers. Please
 #error incompatible with your Protocol Buffer headers. Please
  ^~~~~
feature_extractor.pb.h:19:2: error: #error regenerate this file with a newer version of protoc.
 #error regenerate this file with a newer version of protoc.
  ^~~~~
In file included from task_context.h:23:0,
                 from feature_extractor.h:49,
                 from embedding_feature_extractor.h:23,
                 from nnet_language_identifier.h:22,
                 from cld3.cc:3:
task_spec.pb.h:17:2: error: #error This file was generated by an older version of protoc which is
 #error This file was generated by an older version of protoc which is
  ^~~~~
task_spec.pb.h:18:2: error: #error incompatible with your Protocol Buffer headers. Please
 #error incompatible with your Protocol Buffer headers. Please
  ^~~~~
task_spec.pb.h:19:2: error: #error regenerate this file with a newer version of protoc.
 #error regenerate this file with a newer version of protoc.
  ^~~~~
In file included from language_identifier_features.h:24:0,
                 from nnet_language_identifier.h:25,
                 from cld3.cc:3:
sentence.pb.h:17:2: error: #error This file was generated by an older version of protoc which is
 #error This file was generated by an older version of protoc which is
  ^~~~~
sentence.pb.h:18:2: error: #error incompatible with your Protocol Buffer headers. Please
 #error incompatible with your Protocol Buffer headers. Please
  ^~~~~
sentence.pb.h:19:2: error: #error regenerate this file with a newer version of protoc.
 #error regenerate this file with a newer version of protoc.
  ^~~~~
