Comments (15)
Hi @ai-nikolai, @takuto0515 and everyone who participated in this chat! Thank you for your patience.
Nagisa is now available on macOS M1/M2. It is compatible with Python versions 3.9 to 3.12. This problem could not have been solved without your installation method; it was really helpful. Thank you, @denvazh!
Please install nagisa using the following command.
pip install nagisa
Here is the basic usage.
import nagisa
text = 'Pythonで簡単に使えるツールです'
words = nagisa.tagging(text)
print(words)
#=> Python/名詞 で/助詞 簡単/形状詞 に/助動詞 使える/動詞 ツール/名詞 です/助動詞
# Get a list of words
print(words.words)
#=> ['Python', 'で', '簡単', 'に', '使える', 'ツール', 'です']
# Get a list of POS-tags
print(words.postags)
#=> ['名詞', '助詞', '形状詞', '助動詞', '動詞', '名詞', '助動詞']
If you encounter any installation errors, please comment again. I apologize for any inconvenience caused. Thank you for considering the use of nagisa. I hope this tool will be useful to you.
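For anyone unsure whether their setup falls in the range described above (macOS on Apple silicon, Python 3.9 to 3.12), here is a minimal sketch that checks the running interpreter. The helper name `is_supported_env` is hypothetical and not part of nagisa; it only encodes the range stated in this comment.

```python
import platform
import sys


def is_supported_env(system, machine, version):
    """Return True for macOS on arm64 with Python 3.9 through 3.12.

    Hypothetical helper: the range comes from this thread, not from nagisa.
    """
    major, minor = version[0], version[1]
    return (
        system == "Darwin"
        and machine == "arm64"
        and major == 3
        and 9 <= minor <= 12
    )


if __name__ == "__main__":
    ok = is_supported_env(platform.system(), platform.machine(), sys.version_info)
    print("inside the range described here" if ok else "outside the range described here")
```

A result of "outside the range" does not mean installation will fail; it only means this comment does not cover that environment.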
from nagisa.
I was able to get it working for myself, however this required building both nagisa and the related DyNet dependency from source, directly from the git repositories. This was fine for my problem, because I was experimenting and only cared about getting it working. It might be a bit more challenging if it has to be installed automatically as part of some bigger project.
Hopefully one day both Nagisa and DyNet will publish wheels for both OS X on M1 and linux/arm64 🙄
First was DyNet, since it was the one causing the install error on M1. It seems M1 support had not been released yet (clab/dynet#1648), so it had to be built from source regardless.
Normally, it should be possible to install directly from the git repository (pip install git+https://github.com/clab/dynet#egg=dynet), however this didn't work:
Copying dyNET.egg-info to build/bdist.macosx-12.4-arm64/wheel/dyNET-0.0.0-py3.10.egg-info
running install_scripts
error: [Errno 2] No such file or directory: 'LICENSE.txt'
----------------------------------------
ERROR: Failed building wheel for dynet
Failed to build dynet
ERROR: Could not build wheels for dynet which use PEP 517 and cannot be installed directly
Instead I had to clone the dynet project, disable the license copy, build the wheel and install it locally:
git clone git@github.com:clab/dynet.git
cd dynet
echo "license_files =" >> setup.cfg
brew install eigen
pip install wheel
python setup.py bdist_wheel
pip install build/py3.10-64bit/python/dist/dyNET-0.0.0-cp310-cp310-macosx_12_0_arm64.whl
Then I could install nagisa with pip install nagisa, however it didn't work when I actually used it:
Python 3.10.2 (main, Jun 13 2022, 19:02:38) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nagisa
[dynet] random seed: 1234
[dynet] allocating memory: 32MB
[dynet] memory allocation done.
>>> text = 'ペニーは鮮やかな青い魚を買った。'
>>> doc = nagisa.tagging(text)
>>> doc.words
[1] 42608 segmentation fault python
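A crash like this takes the whole interpreter down, which makes it awkward to probe from a script. One way to check whether a given install segfaults, sketched here with only the standard library, is to run the import in a child process and inspect the exit code. `run_isolated` is a hypothetical helper name, and the negative-return-code convention applies to POSIX systems such as macOS and Linux.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run a snippet in a child interpreter and report how it exited."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # On POSIX, a negative return code means the child was killed by that
    # signal number (for example -11 for SIGSEGV).
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    # Assumes nagisa may or may not be installed; a missing package simply
    # shows up as a non-zero exit code rather than an exception here.
    snippet = "import nagisa; print(len(nagisa.tagging('Python').words))"
    returncode, output = run_isolated(snippet)
    print("exit code:", returncode, "output:", output)
```

An exit code of 0 means the snippet ran cleanly, a positive code usually means a Python exception such as ImportError, and a negative code points to a crash like the one shown above.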
I used the same method and installed nagisa from local repository:
git clone git@github.com:taishi-i/nagisa.git
cd nagisa
I had to patch setup.py and force DyNet, because it was expecting the DyNet38 project fork:
diff --git a/setup.py b/setup.py
index 83f8da6..9cc1693 100644
--- a/setup.py
+++ b/setup.py
@@ -73,6 +73,8 @@ def extensions():
 def switch_install_requires():
     major = sys.version_info.major
     minor = sys.version_info.minor
+    return ['six', 'numpy', 'DyNet']
+
     if os.name == 'posix' and major == 3 and minor > 7:
         return ['six', 'numpy', 'DyNet38']
     else:
With that I was able to build the project:
python setup.py bdist_wheel
pip install dist/nagisa-0.2.8-cp310-cp310-macosx_12_0_arm64.whl
and could see that it's finally working:
Python 3.10.2 (main, Jun 13 2022, 19:02:38) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nagisa
[dynet] random seed: 1234
[dynet] allocating memory: 32MB
[dynet] memory allocation done.
>>> text = 'ペニーは鮮やかな青い魚を買った。'
>>> doc = nagisa.tagging(text)
>>> doc.words
['ペニー', 'は', '鮮やか', 'な', '青い', '魚', 'を', '買っ', 'た', '。']
>>>
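The patch above returns the DyNet requirement unconditionally, which is fine for a one-off local build. A less invasive variant, sketched below under assumptions, would switch to the locally built DyNet only on Apple silicon and leave the other cases alone. `pick_requirements` and its `fallback` argument are hypothetical names; the `else` branch of the real setup.py is not shown in the diff, so the sketch takes it as a parameter rather than guessing it. This reflects the situation at the time of this comment.

```python
import os
import platform
import sys


def pick_requirements(os_name, system, machine, version, fallback):
    """Choose install requirements, preferring a local DyNet on Apple silicon.

    `fallback` stands in for whatever the unshown else branch returns.
    """
    major, minor = version[0], version[1]
    if system == "Darwin" and machine == "arm64":
        # Assumption: DyNet was built and installed from source beforehand.
        return ["six", "numpy", "DyNet"]
    if os_name == "posix" and major == 3 and minor > 7:
        # Same condition as the unpatched code in the diff.
        return ["six", "numpy", "DyNet38"]
    return fallback


if __name__ == "__main__":
    print(
        pick_requirements(
            os.name,
            platform.system(),
            platform.machine(),
            sys.version_info,
            fallback=None,
        )
    )
```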
Hi @dataf3l. I'm sorry for not being able to reply until now. Thank you for your comment. I am very grateful to receive such a comment.
To ensure that other Mac M1 users do not face difficulties, please keep this issue open. I cannot promise an immediate solution, but I would like to resume efforts to solve this problem. Until then, I recommend M1 Mac users use the alternative methods mentioned above or consider using Ubuntu. Thank you.
Hi @takuto0515! Thank you for your message. I am aware that this library is not available on Mac OS and it is an issue that I intend to resolve urgently.
To address this problem, I am currently experimenting with creating dynet wheels using the latest Python on Windows (Not Ubuntu OS). Concurrently, I plan to set up a Mac OS environment and aim to create a wheel there as well. However, I cannot promise immediate availability, and if you need to use it on Mac OS, I would suggest the above alternatives or running it on an Ubuntu environment with an Intel CPU. I apologize for any inconvenience. Thank you.
Thank you everyone for your comments. I apologize for the inconvenience.
This error is not caused by nagisa itself, but by the dependent library dynet, which does not provide a wheel for M1 Mac. Therefore, I tried to build a dynet wheel on my own, but I didn't have an M1 Mac environment at hand, and even using GitHub Actions, I couldn't build it successfully.
It is difficult to solve this problem immediately, so I recommend using alternative methods such as Janome, Fugashi or Sudachi for M1 Mac.
Finally, thank you for considering nagisa. I'm sorry I couldn't help you.
I can confirm that fugashi works fine (I recommend using it with ipadic).
import fugashi
import ipadic

text = 'Pythonで簡単に使えるツールです'
# -Owakati makes MeCab return the words separated by spaces
tagger = fugashi.GenericTagger(ipadic.MECAB_ARGS + ' -Owakati')
print(tagger.parse(text))
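Since several alternatives are suggested in this thread, a project that must run on machines with and without nagisa could pick whichever tokenizer is installed. The following is a minimal sketch using only the standard library; `first_available` is a hypothetical helper, and the preference order in `CANDIDATES` is an assumption, not a recommendation from the author.

```python
import importlib.util

# Import names of the tokenizers mentioned in this thread.
# The order is an assumption made for illustration.
CANDIDATES = ["nagisa", "fugashi", "janome", "sudachipy"]


def first_available(candidates=CANDIDATES):
    """Return the first module name that can be located, or None."""
    for name in candidates:
        # find_spec only checks that the module can be found; it does not
        # import it, so a broken binary would not be detected here.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print(first_available())
```

Because `find_spec` does not execute the module, this will not catch a package that installs but crashes at import or at first use.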
thank you
@denvazh thank you for the write-up
@taishi-i any updates on whether this could become available on ARM-based Macs?
I for one vote that we all pitch in 10 bucks so the author(s?) can have a nice, shiny new M1.
That aside,
I don't remember this task, what I was trying to accomplish, or even what the project was about; I guess I was just testing things. Guys, today we have Llama and GPT, so maybe let's use those if the use case is not industrial?
So here is the question: can GPT do the same thing as nagisa?
And if so, do we need nagisa?
Those are my humble questions. I am not by any means trying to diminish the value of the authors' contribution, just pointing out that an alternative may exist for whoever has this problem, namely ChatGPT.
In this context, maybe we don't really need to fix this? Or maybe it's low priority?
Having said that, maybe we can use ChatGPT itself to fix whatever issue was present back in Nov 2022, a few days before ChatGPT came out.
Thank you everyone for your comments. I apologize for the inconvenience.
This error is not caused by nagisa itself, but by the dependent library dynet, which does not provide a wheel for M1 Mac. Therefore, I tried to build a dynet wheel on my own, but I didn't have an M1 Mac environment at hand, and even using GitHub Actions, I couldn't build it successfully.
It is difficult to solve this problem immediately, so I recommend using alternative methods such as Janome, Fugashi or Sudachi for M1 Mac.
Finally, thank you for considering nagisa. I'm sorry I couldn't help you.
Hey man, don't worry, you didn't inconvenience anybody. Quite the contrary, you helped immensely by making awesome open source software.
I'll file an issue on dynet so they can get the thing done, and then we just wait for a solution.
Just because you posted some code online doesn't mean you were under any obligation to help anybody!
So don't worry too much, things will eventually work out.
@taishi-i Hi, I am facing the same DyNet setup problem with M2 Pro Max.
I think this project is much simpler than other existing Japanese word segmentation tools and seems very easy to use.
I am looking forward to seeing the problem resolved!
I have confirmed that it works on macOS M1/M2 using GitHub Actions. As this issue has been resolved, I will close it. If you are unable to install, please reopen the issue and add a comment. Thank you, everyone!
The issue seems to be resolved for me; installation works.
➜ study mkdir nagisa
➜ study cd nagisa
➜ nagisa python3 -m venv venv
➜ nagisa source venv/bin/activate
(venv) ➜ nagisa pip install nagisa
Collecting nagisa
Downloading nagisa-0.2.11-cp312-cp312-macosx_11_0_arm64.whl.metadata (6.6 kB)
Collecting six (from nagisa)
Using cached six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Collecting numpy (from nagisa)
Downloading numpy-2.0.0-cp312-cp312-macosx_14_0_arm64.whl.metadata (60 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.9/60.9 kB 1.2 MB/s eta 0:00:00
Collecting DyNet38 (from nagisa)
Downloading dyNET38-2.2-cp312-cp312-macosx_11_0_arm64.whl.metadata (6.5 kB)
Collecting cython (from DyNet38->nagisa)
Using cached Cython-3.0.10-py2.py3-none-any.whl.metadata (3.2 kB)
Downloading nagisa-0.2.11-cp312-cp312-macosx_11_0_arm64.whl (21.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.3/21.3 MB 6.3 MB/s eta 0:00:00
Downloading dyNET38-2.2-cp312-cp312-macosx_11_0_arm64.whl (2.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.9/2.9 MB 6.6 MB/s eta 0:00:00
Downloading numpy-2.0.0-cp312-cp312-macosx_14_0_arm64.whl (5.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 7.8 MB/s eta 0:00:00
Using cached six-1.16.0-py2.py3-none-any.whl (11 kB)
Using cached Cython-3.0.10-py2.py3-none-any.whl (1.2 MB)
Installing collected packages: six, numpy, cython, DyNet38, nagisa
Successfully installed DyNet38-2.2 cython-3.0.10 nagisa-0.2.11 numpy-2.0.0 six-1.16.0
(venv) ➜ nagisa python
Python 3.12.3 (main, Apr 9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nagisa
>>> nagisa.tagging("これは何ですか")
<nagisa.tagger.Tagger._Token object at 0x100cf2f30>
>>> doc = nagisa.tagging("これは何ですか")
>>> doc.words
['これ', 'は', '何', 'です', 'か']
my sincere thanks to the team.
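The pip log above shows why the install now succeeds: the wheel filenames carry tags that match the interpreter (cp312) and the platform (macosx_11_0_arm64). As a rough illustration, the sketch below splits such filenames into their fields following the PEP 427 naming scheme. This `parse_wheel_filename` is a simplified, hypothetical helper that ignores optional build tags; the `packaging` library provides a more robust function for real use.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into PEP 427 fields (no build-tag support)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected number of fields: " + filename)
    name, version, python_tag, abi_tag, platform_tag = parts
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Filenames copied from the pip output above.
for wheel in [
    "nagisa-0.2.11-cp312-cp312-macosx_11_0_arm64.whl",
    "dyNET38-2.2-cp312-cp312-macosx_11_0_arm64.whl",
    "six-1.16.0-py2.py3-none-any.whl",
]:
    info = parse_wheel_filename(wheel)
    print(info["name"], info["python"], info["platform"])
```

A platform field of `any` (as for six) means the wheel is pure Python, while `macosx_11_0_arm64` marks a binary built for Apple silicon.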
As things come to mind: perhaps using ChatGPT could also solve the problem of POS tagging?
Hi @dataf3l. Thank you for checking. I'm glad to hear it worked without any problems. I'm not sure if this answers your question, but you can get part-of-speech tags without using ChatGPT by accessing doc.postags. If you have any questions about retrieving part-of-speech tags, feel free to ask.
import nagisa
doc = nagisa.tagging("これは何ですか")
doc.words
# ['これ', 'は', '何', 'です', 'か']
doc.postags
# ['代名詞', '助詞', '代名詞', '助動詞', '助詞']
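Because `words` and `postags` are parallel lists, they can be paired with plain Python. The sketch below uses the output quoted above as literal data, so it runs without nagisa installed; with nagisa available, `doc.words` and `doc.postags` would take the place of the two lists.

```python
# Output copied from the example above, used as literal data.
words = ['これ', 'は', '何', 'です', 'か']
postags = ['代名詞', '助詞', '代名詞', '助動詞', '助詞']

# Pair each word with its part-of-speech tag.
pairs = list(zip(words, postags))

# Keep only the words tagged as pronouns (代名詞).
pronouns = [word for word, tag in pairs if tag == '代名詞']

print(pairs)
print(pronouns)
```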