Free ReCaptcha v2 and v3 breaker
rastarr
Thailand
Caveats First:
- The only coding I've done is assembly language on a Z80 Amstrad a great many years ago
- I've never touched Python or used FastAPI
- I was prompted to do this when some Facebook guy said he'd integrated Recaptcha into GSA CaptchaBreaker, said he'd give it away and then didn't
- I'm also a Mac user but have GSA SER & CB etc running on my MacMini to try and get some juice for my sites. I'm no Windows expert but I can get around
1. install Python from https://www.python.org/downloads/
click the box on the installer to 'Add python.exe to PATH'
at the end of the installation, also click the 'Disable path length limit'
then close button
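2. Now to install ffmpeg via the command line. The 'choco' command in the next step comes from Chocolatey, so that has to be installed first.
On the install instructions page, scroll down to the 'Install Using Windows Cmd Shell' section and paste the command you see there into a Windows CMD shell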
3. next run 'choco install ffmpeg' and when prompted, type 'yes' to accept all further install prompts
4. also run ' pip install fastapi playwright playwright-recaptcha python-multipart uvicorn ' - this installs the wrapper which will act as an internal webserver as well as the GitHub code that does the captcha solving
5. also ' pip install msvc-runtime ' which I found early on is needed to avoid some missing code
6. from Windows command line, ' playwright install ' - this preps and installs the needed web browsers so captcha solving can run in the background
7. To avoid Windows permission problems, I chose my user directory and Documents to store stuff.
Make a 'recaptcha' directory, and then another one inside it - I used '_capv2v3' since I prefer directories to be listed at the top. These names can be anything that makes better sense for you.
Unzip the attachment from this post and copy the attached _capv2v3.py into the '_capv2v3' directory
NOTE: You will need to edit the _capv2v3.py file to fill in rotating proxy details in the solve_captcha2 function at the top. I've been told ReCaptchav2 solving has a higher success rate with proxies but no idea really. Mine are currently webshare proxies.
8. This script/FastAPI webserver creates small txt files that it uses to track each captcha's status and final result. It creates lots of these files, so there is an internal function in the script that deletes any .txt file older than 30 minutes. Be careful not to store any other files with a .txt extension inside these 2 directories.
Caveat: There is room for improvement to use some lightweight database obviously but that's too complicated for me, at this stage.
9. Open a Windows CMD shell (run as Administrator) and navigate to the '_capv2v3' directory.
In the Windows CMD shell, run this command 'uvicorn _capv2v3:app --port 8000 --workers 8' - adjust number of workers to your system.
The Windows CMD shell should now be running a webserver awaiting input for captcha solving
10. Now it is time to activate them into GSA SER. You can add this to GSA CaptchaBreaker if it is running in webserver mode too.
11. Click GSA SER options, and the left 'Captcha' tab.
Click add and choose 2Captcha API With IP from the top of the list
Host - 127.0.0.1:8000
API-Key - GSA (or anything else but fill something in)
Usage types - tick both ReCaptchav2 and ReCaptchav3
Other options are up to your situation and choice.
12. Click OK and you should soon see activity in the Windows CMD shell.
13. Be very careful when editing the .py Python file too. Python is not tolerant of incorrect code formatting (indentation especially). Keep the formatting as it is in the original file and you should be OK.
14. My experience with XEvil was an awful lot of failures, timeouts etc. It was very heavy on my MacMini, crashed a lot, its developers were always price-gouging for upgrades and they never replied to any emails, leaving me without access to their forums for help. Plus their hidden monthly subscription fee.
Anyway, I hope I've covered the installation. I've spent a bit of time optimising as best I can with my experience and all seems to be OK.
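If you want to double-check steps 1 to 6 before starting the webserver, something like the sketch below should do it. It is not part of my script. The import names are my assumption based on the pip package names in step 4 (python-multipart is left out because its import name differs between versions), and the function name find_missing is made up for this example.

```python
import importlib.util
import shutil

# Import names assumed from the pip packages in step 4 (not taken from the script).
REQUIRED_MODULES = ["fastapi", "playwright", "playwright_recaptcha", "uvicorn"]
# ffmpeg must be reachable on PATH after 'choco install ffmpeg'.
REQUIRED_TOOLS = ["ffmpeg"]


def find_missing(modules=REQUIRED_MODULES, tools=REQUIRED_TOOLS):
    """Return (missing_modules, missing_tools) without importing anything."""
    missing_modules = [m for m in modules if importlib.util.find_spec(m) is None]
    missing_tools = [t for t in tools if shutil.which(t) is None]
    return missing_modules, missing_tools


if __name__ == "__main__":
    mods, tools = find_missing()
    if mods:
        print("Missing Python packages:", ", ".join(mods))
    if tools:
        print("Not found on PATH:", ", ".join(tools))
    if not mods and not tools:
        print("All checks passed")
```

Save it as any .py file and run it with 'python yourfile.py' from the CMD shell. It does not check that 'playwright install' has downloaded the browsers.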
Big thanks go out to the guy/s at https://github.com/Xewdy444/Playwright-reCAPTCHA and special thanks to ChatGPT who I chatted with for many long hours in Python help
Comments welcome and certainly if you have any Python skills then don't be afraid to jump in on improvements.
Edit: script updated Mar 3, 2023 and threads set to 0 for testing
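For anyone curious about the 30 minute .txt cleanup mentioned in step 8, here is a minimal sketch of how that kind of cleanup can be written. It is not copied from _capv2v3.py. The function name delete_stale_txt and its parameters are made up for this example.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 30 * 60  # 30 minutes, as described in step 8


def delete_stale_txt(directory, max_age_seconds=MAX_AGE_SECONDS, now=None):
    """Delete *.txt files in `directory` older than `max_age_seconds`.

    Returns the sorted names of the files that were removed.
    Only the given directory is scanned, not its subdirectories.
    """
    now = time.time() if now is None else now
    removed = []
    for path in Path(directory).glob("*.txt"):
        try:
            if now - path.stat().st_mtime > max_age_seconds:
                path.unlink()
                removed.append(path.name)
        except FileNotFoundError:
            # Another worker may have deleted the file first; skip it.
            continue
    return sorted(removed)
```

This also shows why other .txt files should not live in these directories: anything matching *.txt and older than the cut-off gets deleted, whatever it contains.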
Comments
Why not add this natively @Sven
Let's not get ahead of ourselves - it works for me. no idea about others and what my 64 year old brain has forgotten
I just implemented this and it's working great. Thanks for sharing this.
I'm however seeing a lot of errors mainly saying
Error solving v2 captcha: 'PlainTextResponse' object has no attribute 'startswith'
So my solving ratio is at 3%. Have you faced this? any ideas mate?
Run the CMD shell as Administrator too. I think there's a heap of webserver traffic which seems to halt a response back to GSA, or something like that.
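As for the error text itself: it means some code called .startswith() on a response object instead of on a plain string. The sketch below shows one way to guard against that. It is only a guess at the cause, not the actual code path in the script. PlainTextResponseStandIn is a stand-in I made up so the example runs without FastAPI installed (the real Starlette response keeps its text in .body as bytes), and as_text and is_solved are hypothetical helper names.

```python
class PlainTextResponseStandIn:
    """Stand-in for Starlette's PlainTextResponse: text is stored in .body as bytes."""

    def __init__(self, content):
        self.body = content.encode("utf-8")


def as_text(result):
    """Return a plain str whether `result` is a str or a response-like object."""
    if isinstance(result, str):
        return result
    body = getattr(result, "body", None)
    if isinstance(body, bytes):
        return body.decode("utf-8")
    raise TypeError(f"cannot read text from {type(result).__name__}")


def is_solved(result):
    """True when the reply looks like a 2Captcha-style success ('OK|...')."""
    return as_text(result).startswith("OK|")
```

Calling .startswith() directly on the stand-in raises the same kind of AttributeError that shows up in the CMD shell, while is_solved() accepts either form.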
Yeah, solving success is going to be low.
When I was using Xevil, I saw a great many sites where their Captcha did not function or had some key issue.
Also, bear in mind, the solving ratio is way way wrong. The success count may be right but that high number of failures seems to come from each GET request that polls the status. That polling happens about every 5 or so seconds and, I think, each time it gets a CAPCHA_NOT_READY message it's counting that as a failure.
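To make the polling point concrete, here is a rough sketch of a 2Captcha-style polling loop. It is not GSA's code, just an illustration. The fetch argument is a hypothetical callable that performs one status request and returns the reply text. The function reports how many polls were made, so you can see that several CAPCHA_NOT_READY replies belong to a single captcha.

```python
import time


def poll_result(fetch, captcha_id, interval=5.0, max_polls=24, sleep=time.sleep):
    """Poll until the captcha is ready.

    Returns (token_or_None, polls_made). A 'CAPCHA_NOT_READY' reply means
    'ask again later', not a failed solve.
    """
    for attempt in range(1, max_polls + 1):
        reply = fetch(captcha_id)
        if reply == "CAPCHA_NOT_READY":
            sleep(interval)
            continue
        if reply.startswith("OK|"):
            # Success replies look like 'OK|<token>'.
            return reply.split("|", 1)[1], attempt
        # Anything else is treated as an error code for this captcha.
        return None, attempt
    return None, max_polls
```

If every "not ready" reply is counted as a failure, a captcha that needs three polls shows up as two failures and one success, which would drag the ratio down just like I'm seeing.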
I'm running my 2capv2v3 solving instance with 40 workers on my more powerful iMac actually and connecting from my Windows 10 Mac Mini machine. Running more threads may help too. Still early days yet.
Personally, I think I'll be dropping Recap v2 v3 solving - very low returns which don't appear to be worth the effort.
I've also added an updated .py file (zipped) so I'm running Firefox in headless mode for an improvement in speed and memory usage
And a big thanks for giving it a try-out too.
Does your recaptcha solver need proxies ?
NOTE: You will need to edit the _capv2v3.py file to fill in rotating proxy details in the solve_captcha2 function at the top. I've been told ReCaptchav2 solving has a higher success rate with proxies but no idea really. Mine are currently webshare proxies.
Great stuff, thx! I salute you.
Did your LpM increase significantly?
Could you tell me more about your experience with Webshare proxies? I've checked their pricing and it's quite confusing. I use 50 private proxies from BuyProxies.org and they're pretty bad for GSA (they work well on Scrapebox though...)
I'd like to get it for 2-3 GSA projects with 50 threads running. Any recommendations on Webshare packages?
I'm not the right person to talk to about Webshare since my proxy experiences are limited, sorry.
I think there would be better options but I chose Webshare simply as a starting point.
Until I start seeing some benefit from using GSA SER, I'm on a limited budget/expenditure. Hopefully I'll start seeing something soon though.
Hi @rastarr
Thank you for sharing your detailed instructions on how to create a free ReCaptcha v2 and v3 breaker. Your post is incredibly helpful and informative for those who want to use ReCaptcha without incurring additional costs.
As you noted in your post, the script could benefit from the use of a lightweight database to improve its functionality. Therefore, I would suggest exploring the use of SQLite, a self-contained, serverless, zero-configuration, transactional SQL database engine. It is a perfect choice for small to medium-sized web applications, and it can easily be integrated into Python code.
Moreover, to enhance the script's performance, it could be helpful to consider the use of multi-threading. Multi-threading will allow the script to execute multiple tasks simultaneously, thereby reducing the time it takes to solve the captchas.
Overall, your post is informative and helpful, and your efforts are commendable. I hope my suggestions will help improve the functionality of your script and make it even more user-friendly.
The below can help with the Multithreading:
And from all my research, SQLite is a poor choice for instances of multiple transactions. I'd have to think along the lines of mySQL or similar
Thanks, you are absolutely correct.
While SQLite supports transaction management, it uses a locking mechanism to ensure data integrity, which can lead to performance issues in cases of concurrent transactions. As a result, if your application requires high levels of concurrency or heavy write operations, using a more robust database management system such as MySQL or PostgreSQL may be a better choice.
Thanks for correcting me.
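For anyone who still wants to try the SQLite idea in place of the .txt files before reaching for MySQL, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is hypothetical: the table layout and the names open_store, set_status, get_status and purge_older_than are made up for this example. It turns on WAL journaling and a 10 second lock timeout, which are the usual ways to soften SQLite's write locking. Whether that holds up with 8 to 40 workers writing at once is untested here.

```python
import sqlite3
import time


def open_store(path):
    """Open (or create) the status database."""
    conn = sqlite3.connect(path, timeout=10)  # wait up to 10 s if the DB is locked
    conn.execute("PRAGMA journal_mode=WAL")   # readers no longer block the writer
    conn.execute(
        "CREATE TABLE IF NOT EXISTS captcha ("
        "id TEXT PRIMARY KEY, status TEXT NOT NULL, result TEXT, updated REAL NOT NULL)"
    )
    conn.commit()
    return conn


def set_status(conn, captcha_id, status, result=None, now=None):
    """Insert or overwrite the row for one captcha."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO captcha (id, status, result, updated) VALUES (?, ?, ?, ?)",
            (captcha_id, status, result, now),
        )


def get_status(conn, captcha_id):
    """Return (status, result), or None if the id is unknown."""
    return conn.execute(
        "SELECT status, result FROM captcha WHERE id = ?", (captcha_id,)
    ).fetchone()


def purge_older_than(conn, max_age_seconds, now=None):
    """Delete rows not updated within `max_age_seconds`; return how many were removed."""
    now = time.time() if now is None else now
    with conn:
        cur = conn.execute("DELETE FROM captcha WHERE updated < ?", (now - max_age_seconds,))
    return cur.rowcount
```

Each uvicorn worker would open its own connection to the same file, and purge_older_than would take over the job of the 30 minute .txt cleanup.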
When I do a test "check balance" it works and says 100. But when I try to test a recaptcha2 or 3 it gives error :
Missing data for Next-URL (Internal Server Error)
can you help? Maybe you've seen this before and know what it is. I don't really have a clue at this point.
I put a proxy url and port in the py script, it doesn't have a username or password so I left those empty, I'm not sure if that affects it.
Thanks so much in advance!
Hugh
But I'm now getting the same error that was mentioned by dp001 or similar -
Error solving v2 captcha: 'PlainTextResponse' object has no attribute 'startswith'
Error solving v3 captcha: 'PlainTextResponse' object has no attribute 'startswith'
that shows in the command line. In the GUI it says CAPTCHA UNSOLVABLE.
This is happening when I do the recaptcha test in GSA Captcha Breaker for v2 and v3 recaptcha. It seems like those should be easily solved since they're the basic demos. It also seems like everything else is running perfectly so this is perplexing.
Any ideas? Many thanks in advance
For a v3 test, use https://antcpt.com/score_detector/ as the URL which usually worked for me
v2 I was testing with https://recaptcha-demo.appspot.com/recaptcha-v2-checkbox.php
The site key is irrelevant as the script does its own thing and doesn't need these parameters to be passed.
Apologies as my internet is limited at the moment
I've updated my install instructions too - must have missed that dependency
"username": "PROXY-USERNAME",
"password": "PROXY-PASSWORD"
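On the question of a proxy without a username or password: Playwright's proxy setting is a dict with a "server" key and optional "username" and "password" keys. A small sketch of building that dict so the credentials are simply left out when they are empty is below. The helper name build_proxy is made up, and I'm assuming the script hands a dict like this to the Playwright browser launch.

```python
def build_proxy(server, username="", password=""):
    """Build a Playwright-style proxy dict, omitting credentials when not given."""
    proxy = {"server": server}
    if username:
        proxy["username"] = username
        proxy["password"] = password
    return proxy


# Example with placeholder values; replace with your own proxy details.
open_proxy = build_proxy("http://PROXY-HOST:PROXY-PORT")
auth_proxy = build_proxy("http://PROXY-HOST:PROXY-PORT", "PROXY-USERNAME", "PROXY-PASSWORD")
```

The resulting dict is the shape that would be passed as the proxy argument when launching the browser, e.g. launch(proxy=open_proxy).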