SDK for Dummies - How to improve existing captcha definitions
Captcha Breaker has some neat tools on board to improve the solving rate of a captcha.
Basically there are two ways to improve a definition:
1) Fine tune an existing definition.
2) Add another solving process for captchas that CB wasn't able to answer. This means that if 'Process 01' couldn't solve the captcha, you can define 'Process 02', which will then try to solve it. You can define up to 10 processes for each captcha.
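The idea behind several processes can be pictured as a simple fallback chain. The following is only a minimal sketch in Python and not how Captcha Breaker is implemented internally: the names solve_with_fallback, process_01 and process_02 are made up for illustration, and each "process" is assumed to be a function that returns an answer or nothing.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a process takes the captcha image and returns an
# answer, or None / "" when it could not produce one.
Process = Callable[[bytes], Optional[str]]

MAX_PROCESSES = 10  # the limit per captcha mentioned in the text


def solve_with_fallback(image: bytes, processes: Sequence[Process]) -> Optional[str]:
    """Try each process in order and return the first non-empty answer."""
    if len(processes) > MAX_PROCESSES:
        raise ValueError("at most 10 processes can be defined per captcha")
    for process in processes:
        answer = process(image)
        if answer:  # an empty result means: hand over to the next process
            return answer
    return None


# Illustrative stand-ins for two processes (not real definitions).
def process_01(image: bytes) -> Optional[str]:
    return None  # pretend the first process found no answer


def process_02(image: bytes) -> Optional[str]:
    return "x7k9"  # pretend the second process found one


answer = solve_with_fallback(b"fake image bytes", [process_01, process_02])
print(answer)
```

Note that a later process only gets a chance when the earlier ones return nothing at all, which is why wrong answers from 'Process 01' are not corrected by 'Process 02'.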
It often helps a lot to change some parameters of your existing definition. To name the most important parameters you need to take a closer look at
Other parameters are also worth investigating, but the two mentioned above are the most important ones in my opinion. Because the two often correlate, you will usually see an effect only when you fine tune one of them.
- Threshold 50 %
- Scale 200 %
If we fine tune 'Threshold' and get a value of 45, then this value is optimised for a captcha that was scaled up to 200%. Most likely you won't notice any change if you then try to fine tune the 'Scale' parameter as well.
- Threshold 50%
The definition has no 'Scale' parameter up to this point. Once we have fine tuned 'Threshold' to 45% you can add the 'Scale' parameter afterwards (suppose the best result you get is with a scale of 200%). In this case it is worth adding 'Scale' to the definition and fine tuning it afterwards.
Whether parameters correlate with each other depends on which parameters they are and in which order they appear. Because of that it is often worth trying all different kinds of tests.
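Why a threshold value is tied to the scale it was tuned for can be shown with a toy example. This is a sketch under my own assumptions, not CB's filter code: a captcha is reduced to one row of brightness values in percent, 'Scale' is modelled as plain linear interpolation, and 'Threshold' as a simple cut-off. The function names are invented.

```python
from typing import List


def upscale_linear(row: List[float], factor: int) -> List[float]:
    """Resample a row of brightness values to factor times its length."""
    n_in, n_out = len(row), len(row) * factor
    out = []
    for i in range(n_out):
        pos = i * (n_in - 1) / (n_out - 1)  # position in the original row
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        frac = pos - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out


def threshold(row: List[float], percent: float) -> List[int]:
    """Return 1 for every value at or above the threshold, else 0."""
    return [1 if value >= percent else 0 for value in row]


row = [20.0, 80.0, 20.0]  # one bright pixel between two dark ones

# Without scaling, a threshold of 40 or 50 makes no difference.
unscaled_50 = threshold(row, 50)
unscaled_40 = threshold(row, 40)

# After scaling to 200 % there are in-between values, and now it matters.
scaled = upscale_linear(row, 2)
scaled_50 = threshold(scaled, 50)
scaled_40 = threshold(scaled, 40)

print(unscaled_50 == unscaled_40, scaled_50 == scaled_40)
```

On the unscaled row both thresholds give the same result, while on the scaled row they do not. A threshold found for one scale is therefore not automatically the best one for another scale.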
This is how 'Fine Tuning' works with the help of 'Test focused Filter Parameter' (-> right click parameter -> click 'Test focused...')
In the first run we test a wide range of values. In this case I used, for instance, a minimum of 20 and a maximum of 80.
The first run gave us just a tiny improvement. I need to remind you, though, that the more sample captchas you have, the better your definitions can become.
The next thing we try is to optimise the value down to its decimals. Once again we click 'Test filters...' but this time the minimum value is 40 and the maximum value 42, with an increase of just 0.1.
Sadly enough I got no further captchas solved this time, but sometimes this works wonders if you test with a sizable captcha database.
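The two runs above amount to a coarse sweep followed by a fine sweep around the best coarse value. Here is a minimal Python sketch of that idea. It is not the SDK's own test routine: sweep and toy_solve_rate are hypothetical names, and the scoring function is a made-up stand-in for "percentage of sample captchas solved with this threshold".

```python
from typing import Callable, Tuple


def sweep(score: Callable[[float], float],
          minimum: float, maximum: float, step: float) -> Tuple[float, float]:
    """Test every value from minimum to maximum and return (best value, best score)."""
    steps = round((maximum - minimum) / step)
    best_value, best_score = minimum, score(minimum)
    for i in range(1, steps + 1):
        # Compute from the index to avoid accumulating rounding errors.
        value = round(minimum + i * step, 10)
        current = score(value)
        if current > best_score:
            best_value, best_score = value, current
    return best_value, best_score


def toy_solve_rate(threshold: float) -> float:
    """Made-up solve rate that peaks at a threshold of 41.3 (assumption)."""
    return 30.0 - abs(threshold - 41.3)


# First run: wide range, whole numbers.
coarse_value, _ = sweep(toy_solve_rate, 20, 80, 1)

# Second run: narrow range around the coarse result, steps of 0.1.
fine_value, fine_score = sweep(toy_solve_rate, coarse_value - 1, coarse_value + 1, 0.1)

print(coarse_value, fine_value, fine_score)
```

With real captchas the score is a step function of the number of solved samples, so a fine sweep over a small sample set often changes nothing, which matches what happened in my second run.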
However, I tweaked the scale percentage and got a result of 30% solved captchas with a sample size of 50 captchas of this type.
This time it was worth testing the 'scale' parameter because the 'remove-objects' filter as well as the 'remove-dust' filter were correlated with the scale size.
We leave that definition for now and try adding another solving process for further improvement.
Add another Process
Captcha Breaker has the unique feature of adding another solving process for captchas it wasn't able to answer in its first process.
Before we start with 'Process 2' we test all captchas with the original definition again and afterwards delete all captchas that got an answer (regardless of whether the answer was correct or not).
To do that you need to click 'Add / Edit / Delete' -> 'Delete' -> 'All with none empty result'.
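In other words, only the captchas with an empty result stay in the sample set for 'Process 2'. As a small sketch of that selection (the function name, the file names and the dictionary layout are my own assumptions, not something the SDK exposes):

```python
from typing import Dict


def keep_unanswered(results: Dict[str, str]) -> Dict[str, str]:
    """Keep only samples whose result is empty, whether other answers were right or wrong."""
    return {name: answer for name, answer in results.items() if not answer.strip()}


# Hypothetical results of a test run with 'Process 1'.
results = {
    "sample_001.png": "x7k9",  # answered (correctness does not matter here)
    "sample_002.png": "",      # no answer: stays for 'Process 2'
    "sample_003.png": "ab12",
    "sample_004.png": "  ",    # whitespace only is treated as empty here
}

remaining = keep_unanswered(results)
print(sorted(remaining))
```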
The next thing we need to do is to increase the 'Process' value to 2.
You can either 'Brute Force' for 'Process 2' with the remaining captchas again or try something different. In this case I added a simple filter for demonstration purposes, and this filter has to be used when brute forcing.
When asked if it should always use the current filter when brute forcing we say 'Yes' (or 'Ja' if you are using a German OS).
This run will take some time again but not as long as in the very first 'Brute Force' run because we test it with a smaller sample size of captchas this time.
After the extra run is done and this process has been fine tuned, we get the following result for the remaining captchas.
You'll see that some captchas could be solved that weren't solvable in 'Process 1'. Now we add all the captchas that were deleted before ('Add / Edit / Delete' -> 'Add' -> sample folder) and click 'Test' to get the results with both processes.
As you can see, we improved our captcha definition from 24% to a solving rate of 40% by fine tuning the original definition and by adding an additional process. You can even add more processes (up to 10 overall), but for that you need a very large database of sample captchas.
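To see what each step contributed, the percentages can be turned back into captcha counts. This small calculation assumes that all three rates (24%, 30% and 40%) were measured on the same 50 sample captchas, which is my reading of the numbers above and not something the tool reports.

```python
SAMPLES = 50  # sample size mentioned earlier (assumed to apply to all three rates)


def solved_count(rate_percent: float, total: int = SAMPLES) -> int:
    """Convert a solving rate in percent into a number of solved captchas."""
    return round(total * rate_percent / 100)


initial = solved_count(24)  # original definition
tuned = solved_count(30)    # after fine tuning 'Process 1'
final = solved_count(40)    # after adding 'Process 2'

gained_by_tuning = tuned - initial
gained_by_process_2 = final - tuned

# Captchas with wrong answers were deleted too, so at most this many
# captchas were left over for 'Process 2'.
max_remaining = SAMPLES - tuned
min_process_2_rate = gained_by_process_2 / max_remaining

print(gained_by_tuning, gained_by_process_2, max_remaining,
      round(min_process_2_rate * 100, 1))
```

Under that assumption fine tuning brought 3 more solved captchas and the second process another 5, so 'Process 2' solved at least about 14% of the captchas that were left for it.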
This example has shown that it is worth the time to optimise the captcha definitions to maximize the results. Hopefully many users will participate in creating a large captcha database for testing each type, and will try to improve existing definitions or add new ones for the benefit of all.
One last word: you can use the SDK on every PC you own. After the trial of CB has expired the SDK remains fully functional, so you can play with it on your home PC while your license of Captcha Breaker is running on your VPS!