dominhhai / captcha-breaker
High Accuracy Captcha Breaker with Tensorflow and Node.js
License: MIT License
It would be great if src/create_train_data.py picked up all files under data/captcha by itself, using each filename as the captcha solution.
I find it very tedious to create data/captcha.json when the filename itself could store the captcha solution (this is also a lot easier if you are creating test data manually).
Thoughts?
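The proposal above could be sketched roughly as follows. This is only an illustration: labels_from_filenames and IMAGE_SUFFIXES are hypothetical names, not part of the repository, and the filename-to-solution mapping it returns is an assumption about what the script needs, not the actual layout of data/captcha.json.

```python
from pathlib import Path

# Assumption: captcha images use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def labels_from_filenames(captcha_dir):
    """Map each image filename to its solution, taken from the filename stem."""
    labels = {}
    for path in sorted(Path(captcha_dir).iterdir()):
        # Skip sub-directories and non-image files such as notes or JSON.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            labels[path.name] = path.stem
    return labels
```

Calling labels_from_filenames("data/captcha") would then replace the hand-written JSON. One caveat: on case-insensitive filesystems, two case-sensitive solutions that differ only in letter case cannot coexist as filenames.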
Hey @dominhhai, I hope you're doing well!
I'm getting the error below while running create_train_data:
➜ captcha-breaker (master) ✗ python src/create_train_data.py
1 1otutm.jpg 1otutm
ERROR: number of letters is NOT valid 0
[]
DEPRECATION WARNING: The system version of Tk is deprecated and may be removed in a future release. Please don't rely on it. Set TK_SILENCE_DEPRECATION=1 to suppress this warning.
It ends up opening a Python application window that shows a blank black screen.
My guess is that the application cannot read the captcha image properly.
Here is the captcha for 1otutm.jpg
Can you let me know how to fix it?
Your help will be much appreciated!
Thanks,
After a long struggle resolving dependency issues:
(py3) C:\Users\rodri\Workspace\captcha-breaker>python src\create_train_data.py
C:\Users\rodri\Workspace\captcha-breaker\src\data\captcha.json
C:\Users\rodri\Workspace\captcha-breaker\src\data\captcha
C:\Users\rodri\Workspace\captcha-breaker\src\data\train
C:\Users\rodri\Workspace\captcha-breaker\src\data\captcha
1 0.png Td1Rc9
Traceback (most recent call last):
  File "C:\Users\rodri\.conda\envs\py3\lib\site-packages\numpy\lib\arraypad.py", line 1036, in _normalize_shape
    shape_arr = np.broadcast_to(shape_arr, (ndims, 2))
  File "C:\Users\rodri\.conda\envs\py3\lib\site-packages\numpy\lib\stride_tricks.py", line 173, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File "C:\Users\rodri\.conda\envs\py3\lib\site-packages\numpy\lib\stride_tricks.py", line 128, in _broadcast_to
    op_flags=[op_flag], itershape=shape, order='C').itviews[0]
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,2) and requested shape (3,2)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src\create_train_data.py", line 36, in <module>
    letters = split_letters(image, debug=True)
  File "C:\Users\rodri\Workspace\captcha-breaker\src\img.py", line 20, in split_letters
    left = crop(image, ((0, 0), (0, image.shape[1]-SHIFT_PIXEL)), copy=True)
  File "C:\Users\rodri\.conda\envs\py3\lib\site-packages\skimage\util\arraycrop.py", line 172, in crop
    crops = _validate_lengths(ar, crop_width)
  File "C:\Users\rodri\.conda\envs\py3\lib\site-packages\numpy\lib\arraypad.py", line 1080, in _validate_lengths
    normshp = _normalize_shape(narray, number_elements)
  File "C:\Users\rodri\.conda\envs\py3\lib\site-packages\numpy\lib\arraypad.py", line 1039, in _normalize_shape
    raise ValueError(fmt % (shape,))
ValueError: Unable to create correctly shaped tuple from ((0, 0), (0, 170))
Any idea how to fix that?
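A possible reading of the traceback above: the first ValueError says a (2,2) crop specification could not be broadcast to (3,2), which suggests the image array has three axes (most likely height, width and colour channels) while img.py passes crop widths for only two. The sketch below assumes that diagnosis is right. pad_crop_width is a hypothetical helper, not part of the repository or of scikit-image; it only appends (0, 0) pairs so that there is one pair per axis.

```python
def pad_crop_width(ndim, crop_width):
    """Extend a per-axis crop spec with (0, 0) pairs, one pair per array axis."""
    crop_width = tuple(tuple(pair) for pair in crop_width)
    if len(crop_width) > ndim:
        raise ValueError("crop_width has more pairs than the array has axes")
    # Trailing axes (e.g. colour channels) are left uncropped.
    return crop_width + ((0, 0),) * (ndim - len(crop_width))


# A colour image loaded as (height, width, channels) has three axes.
print(pad_crop_width(3, ((0, 0), (0, 170))))  # ((0, 0), (0, 170), (0, 0))
# A grayscale image (height, width) needs no padding.
print(pad_crop_width(2, ((0, 0), (0, 170))))  # ((0, 0), (0, 170))
```

In img.py the padded tuple would be passed as the second argument to crop, using image.ndim for the first parameter. Alternatively, converting the image to a two-dimensional grayscale array before calling split_letters would avoid the mismatch; the traceback alone does not settle which of the two fits the rest of the pipeline.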
Hi,
thanks for providing this awesome module! :-)
Can you share how many captcha examples you used for the training stage? I got an accuracy of 0.3 with 5 examples. How many should I use for training?
Best
Willy
Could you also share the model or dataset?
H:\Private#2019\>python src/create_train_data.py
1 4.jpg 892851
Traceback (most recent call last):
  File "C:\Users\hm\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\arraypad.py", line 1036, in _normalize_shape
    shape_arr = np.broadcast_to(shape_arr, (ndims, 2))
  File "C:\Users\hm\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\stride_tricks.py", line 173, in broadcast_to
    return _broadcast_to(array, shape, subok=subok, readonly=True)
  File "C:\Users\hm\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\stride_tricks.py", line 128, in _broadcast_to
    op_flags=[op_flag], itershape=shape, order='C').itviews[0]
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (2,2) and requested shape (3,2)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "src/create_train_data.py", line 30, in <module>
    letters = split_letters(image, debug=True)
  File "H:\Private#2019\Project\article-glow\captcha-breaker\src\img.py", line 20, in split_letters
    left = crop(image, ((0, 0), (0, image.shape[1]-SHIFT_PIXEL)), copy=True)
  File "C:\Users\hm\AppData\Local\Programs\Python\Python35\lib\site-packages\skimage\util\arraycrop.py", line 172, in crop
    crops = _validate_lengths(ar, crop_width)
  File "C:\Users\hm\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\arraypad.py", line 1080, in _validate_lengths
    normshp = _normalize_shape(narray, number_elements)
  File "C:\Users\hm\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\arraypad.py", line 1039, in _normalize_shape
    raise ValueError(fmt % (shape,))
ValueError: Unable to create correctly shaped tuple from ((0, 0), (0, 110))
Python 3.5.2
Error: ModuleNotFoundError: No module named 'skimage'
Not sure if I am doing something wrong!
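For context on the error above: the skimage module is provided by the pip package named scikit-image, so the import name and the install name differ. A small sketch for checking whether the module is visible to the current interpreter; install_hint and PIP_NAMES are hypothetical names used only for illustration.

```python
import importlib.util

# Import name -> pip package name. The two differ for scikit-image.
PIP_NAMES = {"skimage": "scikit-image"}


def install_hint(module_name):
    """Return None if the module is importable, else a pip command to try."""
    if importlib.util.find_spec(module_name) is not None:
        return None
    package = PIP_NAMES.get(module_name, module_name)
    return "python -m pip install " + package


# Prints None when skimage is installed, otherwise the suggested command.
print(install_hint("skimage"))
```

If this prints an install command even after installing the package, pip may have installed into a different Python environment than the one running the script.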
Could we have a clearer distinction and explanation of what needs to be done exclusively by Node.js users?