
Attempt at Solving Captcha given by mca.gov.in using ML, OpenCV methods


sushantMoon/Captcha-Solver


Objective :

This is an attempt to solve the captcha presented on www.mca.gov.in.

Note : A few of the steps are described in Data/KarzaTest.pdf

crawler.py :

This script uses the tech stack listed below. It involves no machine learning and very little OpenCV.

Technologies :

  • Python
  • Selenium
  • Tesseract-OCR

Script Logic :

  1. The script drives a Chrome browser through Selenium and opens http://www.mca.gov.in/
  2. Once the page has loaded, it clicks "View Company or LLP Master Data". This opens a new tab, and the script switches to that tab.
  3. The script then takes a screenshot of the captcha and attempts to solve it using tesseract-ocr.
  4. If the attempt succeeds, the script downloads the data loaded by the website using the export-to-Excel option. If it fails, the script retries solving the captcha. If the second attempt also fails, the script closes the browser and starts again from Step 1.
  5. This is repeated until data has been collected for all the company CINs of interest.
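
The retry behaviour in Steps 4 and 5 can be sketched as a small control loop. This is a minimal illustration and not the code in crawler.py: the function `fetch_all`, its three callback parameters, and the `max_restarts` cap are all hypothetical names introduced here, and the browser and OCR work is left to the callbacks. It also assumes, for simplicity, that a fresh browser session is opened for each CIN, which the description above does not state.

```python
from typing import Any, Callable, Iterable, List

# Step 4 describes one retry before the browser is closed: two attempts in total.
MAX_CAPTCHA_ATTEMPTS = 2


def fetch_all(
    cins: Iterable[str],
    open_session: Callable[[], Any],               # hypothetical: Steps 1-2 (open site, switch tab)
    solve_and_export: Callable[[Any, str], bool],  # hypothetical: Steps 3-4 (OCR captcha, export)
    close_session: Callable[[Any], None],          # hypothetical: close the browser
    max_restarts: int = 5,                         # assumed safety cap, not in the original text
) -> List[str]:
    """Return the CINs whose data was exported successfully."""
    done: List[str] = []
    for cin in cins:
        for _ in range(max_restarts):
            session = open_session()
            try:
                # any() stops at the first True, so a success on the
                # first attempt does not trigger a second one.
                success = any(
                    solve_and_export(session, cin)
                    for _ in range(MAX_CAPTCHA_ATTEMPTS)
                )
            finally:
                close_session(session)
            if success:
                done.append(cin)
                break  # move on to the next CIN
            # Both attempts failed: loop again, i.e. restart from Step 1.
    return done
```

In a real run, `open_session` would create the Selenium driver and switch to the new tab, and `solve_and_export` would screenshot the captcha, pass it to Tesseract, submit the result, and report whether the export went through. The cap on restarts is there only so that a CIN whose captcha is never solved cannot loop forever.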
