letuananh / chirptext

ChirpText is a collection of text processing tools for Python.

Home Page: https://chirptext.readthedocs.io

License: MIT License

Languages: Python 99.85%, Shell 0.15%

Topics: chinese, japanese, linguistics, mecab, nlp, python, vietnamese

chirptext's Introduction

Hi there 👋

Le Tuan Anh is a semanticist who is deeply interested in well-being, languages, and free software. His main research interests are in computational meaning and theory of language. He is the author of the coolisf software, which provides computational deep semantic analysis by combining structural semantics from construction grammars and lexical semantics from ontologies in a single representation. The dissertation (2019), source code, and data are available on the Open Science Framework.

In his free time, Tuan Anh develops jamdict - a free and open-source Python 3 library for manipulating Japanese linguistic resources: Jim Breen's JMdict, KanjiDic2, JMnedict, and kanji-radical mappings. There is an online demo available at: https://jamdict.herokuapp.com/

From Mar 2019 to Apr 2022, Tuan Anh was a research fellow at Nanyang Technological University (NTU), Singapore and contributed to the development of BELA - a pathway for creating, checking, visualizing, and analysing multilingual corpuses of natural languages.

chirptext's People

Contributors

letuananh, saihtaungkham

chirptext's Issues

Revamp TTL APIs for more complex use cases

  • Simplify multi-tag handling (e.g. sense candidates, chunk languages, annotators)
  • Add built-in support for CoNLL
  • Use the first tag slot for scalar tags (e.g. POS, lemma, surface, languages)
  • Redesign TTL JSON

FileNotFoundError [WinError 2] The system cannot find the file specified

>>> from chirptext import deko
>>> sent = deko.parse('猫が好きです。')
>>> sent.tokens
===========================
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-15-bd2eed5e1672> in <module>
      1 from chirptext import deko
----> 2 sent = deko.parse('猫が好きです。')
      3 sent.tokens

~\Anaconda3\lib\site-packages\chirptext\deko.py in txt2mecab(text, **kwargs)
    250 def txt2mecab(text, **kwargs):
    251     ''' Use mecab to parse one sentence '''
--> 252     mecab_out = _internal_mecab_parse(text, **kwargs).splitlines()
    253     tokens = [MeCabToken.parse(x) for x in mecab_out]
    254     return MeCabSent(text, tokens)

~\Anaconda3\lib\site-packages\chirptext\dekomecab.py in parse(content, *args, **kwargs)
     65         return MeCab.Tagger(*args).parse(content)
     66     else:
---> 67         return run_mecab_process(content, *args, **kwargs)
     68 
     69 

~\Anaconda3\lib\site-packages\chirptext\dekomecab.py in run_mecab_process(content, *args, **kwargs)
     55     output = subprocess.run(proc_args,
     56                             input=content.encode(encoding),
---> 57                             stdout=subprocess.PIPE)
     58     output_string = os.linesep.join(output.stdout.decode(encoding).splitlines())
     59     return output_string

~\Anaconda3\lib\subprocess.py in run(input, capture_output, timeout, check, *popenargs, **kwargs)
    464         kwargs['stderr'] = PIPE
    465 
--> 466     with Popen(*popenargs, **kwargs) as process:
    467         try:
    468             stdout, stderr = process.communicate(input, timeout=timeout)

~\Anaconda3\lib\subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
    767                                 c2pread, c2pwrite,
    768                                 errread, errwrite,
--> 769                                 restore_signals, start_new_session)
    770         except:
    771             # Cleanup if the child failed starting.

~\Anaconda3\lib\subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, unused_restore_signals, unused_start_new_session)
   1170                                          env,
   1171                                          os.fspath(cwd) if cwd is not None else None,
-> 1172                                          startupinfo)
   1173             finally:
   1174                 # Child is launched. Close the parent's copy of those pipe

FileNotFoundError: [WinError 2] The system cannot find the file specified
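
The traceback ends inside `subprocess.run`, which raises `FileNotFoundError` when the program it is asked to launch cannot be located. One likely cause, though the report does not confirm it, is that the MeCab executable is not installed or not on `PATH`. Below is a minimal sketch for checking that before calling an external tool. The helper names `find_executable` and `run_tool` are hypothetical and not part of chirptext, and the binary name `mecab` is an assumption, since the traceback does not show the actual command line.

```python
import shutil
import subprocess


def find_executable(name="mecab"):
    """Return the full path of `name` if it can be found on PATH, else None."""
    return shutil.which(name)


def run_tool(name, text, encoding="utf-8"):
    """Pipe `text` to an external tool, failing early with a clearer message."""
    path = find_executable(name)
    if path is None:
        # Same exception type as in the traceback, but it names the missing tool.
        raise FileNotFoundError(
            f"{name!r} was not found on PATH; install it or add its folder to PATH"
        )
    result = subprocess.run([path], input=text.encode(encoding),
                            stdout=subprocess.PIPE)
    return result.stdout.decode(encoding)


if __name__ == "__main__":
    # Prints the resolved path, or None when the binary cannot be found.
    print(find_executable("mecab"))
```

If this prints `None`, installing MeCab or adding its install folder to `PATH` is the first thing to try.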

Asking for a new release on PyPI

Hi,

Version 0.1a18 is a bit outdated. Could you upload a newer version to PyPI?

I only need this commit, but since a long time has passed, I think most of the changes on master are stable.
