captcha-mca's Introduction

Objective :

This is an attempt to solve the captcha served by www.mca.gov.in. Note: a few of the steps are described in Data/KarzaTest.pdf.

crawler.py :

The script uses the following tech stack. It involves no machine learning and very little OpenCV.

Technologies :

  • Python
  • Selenium
  • Tesseract-OCR

Script Logic :

  1. The script drives a Chrome browser through Selenium and opens the link http://www.mca.gov.in/
  2. Once the page has loaded, it clicks on "View Company or LLP Master Data". This opens a new tab, and the script switches to it.
  3. The script then takes a screenshot of the captcha and attempts to solve it using Tesseract-OCR.
  4. If it succeeds, the script downloads the data loaded by the website using "export to excel". If it fails, it retries solving the captcha. If the second attempt also fails, the script closes the browser and starts again from Step 1.
  5. This is repeated until data has been collected for all the company CINs of interest.
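
The retry behaviour in steps 3 to 5 can be sketched roughly as below. This is only an illustration of the control flow, not the code in crawler.py. The function names (`fetch_one`, `fetch_all`), the four callback parameters, and the `max_restarts` cap are all assumptions made for this example. The README itself describes restarting until every CIN succeeds, without naming a limit.

```python
MAX_CAPTCHA_ATTEMPTS = 2  # one try plus one retry, as described in step 4


def fetch_one(cin, open_session, solve_captcha, export_data, close_session,
              max_restarts=5):
    """Try to download the data for one CIN.

    All four callbacks are hypothetical hooks:
      open_session()            -> opens the browser and the master-data tab
      solve_captcha(session, cin) -> True if the OCR'd captcha was accepted
      export_data(session, cin)   -> triggers the "export to excel" download
      close_session(session)    -> closes the browser
    max_restarts is an assumed safety cap, not taken from the README.
    """
    for _ in range(max_restarts):
        session = open_session()                  # steps 1 and 2
        try:
            for _ in range(MAX_CAPTCHA_ATTEMPTS):
                if solve_captcha(session, cin):   # step 3
                    export_data(session, cin)     # step 4, success path
                    return True
        finally:
            close_session(session)                # browser is always closed
        # Both attempts failed: loop around and start again from step 1.
    return False


def fetch_all(cins, **hooks):
    """Step 5: repeat for every CIN and report which ones succeeded."""
    return {cin: fetch_one(cin, **hooks) for cin in cins}
```

In a real run, `open_session` would presumably create the Selenium driver, and `solve_captcha` would screenshot the captcha element and pass the image to Tesseract (for instance through `pytesseract.image_to_string`, if that wrapper is used). Passing these in as callbacks keeps the retry logic testable without a browser.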
