
ocab's Introduction

What is this?

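Ocab is a Python library for preprocessing text before it is analyzed with MeCab, the Japanese morphological analyzer.
The name treats MeCab as the female plant (mekabu), so its male counterpart (okabu) became Ocab.
For detailed usage, see here.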

How to use

As a single program

To run it as a standalone program, pass the text as a command-line argument:

$ python Ocab.py 南アルプスの天然水-Sparking*Lemon+レモン一絞り
input     : 南アルプスの天然水-Sparking*Lemon+レモン一絞り
normalized: 南アルプスの天然水-Sparking*Lemon+レモン一絞り
wakati    : 南アルプスの天然水 Sparking Lemon レモン 一 絞る
rmv st wds: 南アルプスの天然水 Sparking Lemon レモン 絞る
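
The normalization and stop-word steps in the output above can be illustrated with a minimal sketch. This is not Ocab's actual implementation: the function names nfkc_normalize and drop_stop_words are hypothetical, NFKC folding is only one common normalization rule (the rules Ocab applies are not documented here), and the stop-word set {"一"} is chosen just to reproduce the last line of the sample output.

```python
import unicodedata


def nfkc_normalize(text):
    """Fold full-width alphanumerics and half-width katakana using NFKC.

    Hypothetical helper: Ocab's own normalization rules may differ.
    """
    return unicodedata.normalize("NFKC", text).strip()


def drop_stop_words(wakati, stop_words):
    """Remove stop words from a space-separated token string."""
    return " ".join(tok for tok in wakati.split() if tok not in stop_words)


# Full-width Latin letters and half-width katakana are folded to their usual forms.
print(nfkc_normalize("Ｓｐａｒｋｉｎｇ ﾚﾓﾝ"))  # Sparking レモン

# Dropping the token "一" reproduces the "rmv st wds" line of the sample output.
print(drop_stop_words("南アルプスの天然水 Sparking Lemon レモン 一 絞る", {"一"}))
# 南アルプスの天然水 Sparking Lemon レモン 絞る
```

Keeping the stop words in a set makes each membership check constant time, which matters once a full Japanese stop-word list is loaded.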

As a library in Python code

To use it as a library, call it like this:

$ python
from Ocab import Ocab, Regexp
c = Regexp()
text1 = c.normalize("南アルプスの天然水-Sparking*Lemon+レモン一絞り")
print(text1) # 南アルプスの天然水-Sparking*Lemon+レモン一絞り
m = Ocab(target=["名詞","動詞","形容詞","副詞"])
text2 = m.wakati(text1)
print(text2) # 南アルプスの天然水 Sparking Lemon レモン 一 絞る
text3 = m.removeStoplist(text2, [])
print(text3) # 南アルプスの天然水 Sparking Lemon レモン 絞る

The m = Ocab(target=["名詞","動詞","形容詞","副詞"]) call (nouns, verbs, adjectives, adverbs) accepts more options than shown here; read the source code for the details.
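
To make the effect of the target list concrete, here is a rough sketch of part-of-speech filtering over MeCab-style output. It is an assumption-laden illustration, not Ocab's code: the function filter_by_pos is hypothetical, and SAMPLE is hand-written text in the IPAdic layout (surface form, a tab, then comma-separated features with the part of speech first and the base form seventh) rather than real MeCab output, so the block runs without MeCab installed.

```python
# Hand-written sample in the IPAdic layout; NOT produced by running MeCab.
SAMPLE = """\
レモン\t名詞,一般,*,*,*,*,レモン,レモン,レモン
を\t助詞,格助詞,一般,*,*,*,を,ヲ,ヲ
絞り\t動詞,自立,*,*,五段・ラ行,連用形,絞る,シボリ,シボリ
EOS
"""


def filter_by_pos(mecab_output, target):
    """Keep tokens whose part of speech is in target, as base forms."""
    tokens = []
    for line in mecab_output.splitlines():
        # Skip the end-of-sentence marker and anything without a feature column.
        if line == "EOS" or "\t" not in line:
            continue
        surface, feature = line.split("\t", 1)
        fields = feature.split(",")
        if fields[0] not in target:
            continue
        # Prefer the base form (7th field); fall back to the surface form.
        base = fields[6] if len(fields) > 6 and fields[6] != "*" else surface
        tokens.append(base)
    return " ".join(tokens)


print(filter_by_pos(SAMPLE, ["名詞", "動詞", "形容詞", "副詞"]))  # レモン 絞る
```

Using the base form is what turns 絞り into 絞る, matching the wakati line in the examples above; the particle を is dropped because 助詞 is not in the target list.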

License

This program is released under the MIT License.

Reference

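  1. String normalization that should be performed before analysis (解析前に行うことが望ましい文字列の正規化処理)
  2. Word segmentation with MeCab and Python while selecting parts of speech (MeCabとPythonで品詞を選びつつ分かち書きをしたよ)
  3. A list of Japanese stop words (日本語のストップワードのリスト)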

ocab's People

Contributors

boomin614
