Coder Social home page Coder Social logo

gse's Introduction

gse

Go中文分词

CircleCI Status codecov Build Status Go Report Card GoDoc Release Join the chat at https://gitter.im/go-ego/ego

词典用双数组trie(Double-Array Trie)实现, 分词器算法为基于词频的最短路径加动态规划。

支持普通和搜索引擎两种分词模式,支持用户词典、词性标注,可运行JSON RPC服务

分词速度单线程9MB/s,goroutines并发42MB/s(8核Macbook Pro)。

安装/更新

go get -u github.com/go-ego/gse
go get -u github.com/go-ego/re 

re gse

To create a new gse application

$ re gse my-gse

re run

To run the application we just created, you can navigate to the application folder and execute:

$ cd my-gse && re run

使用

package main

import (
	"fmt"

	"github.com/go-ego/gse"
)

func main() {
	// 载入词典
	var segmenter gse.Segmenter
	segmenter.LoadDict()
	// segmenter.LoadDict("your gopath"+"/src/github.com/go-ego/gse/data/dict/dictionary.txt")

	// 分词
	text := []byte("中华人民共和国**人民政府")
	segments := segmenter.Segment(text)
  
	// 处理分词结果
	// 支持普通模式和搜索模式两种分词,见代码中ToString函数的注释。
	fmt.Println(gse.ToString(segments, false)) 
}

License

Gse is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), base on sego

gse's People

Contributors

vcaesar avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.