GSA with only image filters

Hello there.

Since GSA's filters are pretty good, I am interested in using GSA's capabilities to clean images.
Is it possible to use just this functionality, without recognition?

Best regards


  • Sven
    Sorry, no. I could add something to the SDK where you load the images, hit a button, and the filtered images are saved.
  • Well, no problem then.

    A 'save' button in the SDK would not help, since I'm planning to automate the process for lots of images.

    No problem.

  • Here is the full story:

    I have to recognize a captcha with a fixed number of characters at fixed positions, but with lots and lots of noise.
    The filters that GSA applies are very good, but the recognition step does not work well for this image: I get only 10-15% accuracy.

    I'm running some tests with tesseract. When I grab the cleaned image from GSA's SDK (via a screenshot), crop it by hand in an image editor, and pass each segment to tesseract (configured to recognize single chars), I get almost 100% accuracy.

    Is there any way to use GSA to recognize the image char by char? Or some way to emulate what I'm doing by hand?

    Best regards
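
The manual workflow described above (crop fixed segments, then run tesseract on each one as a single character) could be automated roughly as follows. This is only a sketch under assumptions: it presumes the cleaned image has already been saved to disk somehow (for example from a screenshot, since the thread does not show a way to export it from GSA), that Pillow is installed, and that the `tesseract` command-line tool is on the PATH. The function names and the example geometry are hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path


def fixed_char_boxes(left, top, char_width, char_height, count, gap=0):
    """Return (left, upper, right, lower) boxes for `count` equally spaced chars."""
    boxes = []
    for i in range(count):
        x0 = left + i * (char_width + gap)
        boxes.append((x0, top, x0 + char_width, top + char_height))
    return boxes


def recognize_by_boxes(image_path, boxes, whitelist=None):
    """Crop each box from an already cleaned image and OCR it as one character."""
    # Pillow is imported lazily so the box helper above works without it.
    from PIL import Image

    chars = []
    with Image.open(image_path) as img, tempfile.TemporaryDirectory() as tmp:
        for i, box in enumerate(boxes):
            segment = Path(tmp) / f"char_{i}.png"
            img.crop(box).save(segment)
            # --psm 10 tells tesseract to treat the image as a single character.
            cmd = ["tesseract", str(segment), "stdout", "--psm", "10"]
            if whitelist:
                cmd += ["-c", f"tessedit_char_whitelist={whitelist}"]
            result = subprocess.run(cmd, capture_output=True, text=True, check=True)
            chars.append(result.stdout.strip())
    return "".join(chars)
```

For a captcha with, say, six characters that are each 20 pixels wide, something like `recognize_by_boxes("cleaned.png", fixed_char_boxes(10, 5, 20, 30, 6))` would process one image; the actual offsets have to be measured on the real captcha.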
  • Sven
    I need to see a captcha sample to give any options here.
  • Sven
    Isn't that type already included?
  • Yes, but the accuracy isn't so good.

    If there were a way to recognize the chars one by one, specifying each position, the accuracy would be much better.

  • Sven
    But this is the holy grail of OCR anyway: finding a way to separate merged/joined chars. If this were easy, then recaptcha, kcaptcha and so on would all be broken.
  • I know, Sven. 
    But in this particular case, a simple crop of specific fixed areas could resolve the issue. That's what I did manually, and it works well.

    If there is a way on GSA to specify rectangle coordinates and tell the OCRs to look only at the rectangles, recognizing each rectangle as a single char...

    Best regards
  • Sven
    Well, you can cut that off using shape/add border. This way you can at least cut off all the dust, provided each char really is at a fixed position. The OCR is usually clever enough to do the rest then (except char separation).
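
The idea behind cutting off the dust around a fixed position can be illustrated outside GSA as well. The sketch below is not GSA's shape/add border filter, whose exact behaviour is not described in this thread; it is a hypothetical helper that assumes a grayscale image stored as rows of integers and paints everything outside a known rectangle with the background colour.

```python
def blank_outside(pixels, box, background=255):
    """Return a copy of `pixels` with everything outside `box` set to background.

    `pixels` is a list of rows of grayscale values; `box` is
    (left, upper, right, lower) with the right and lower edges exclusive.
    """
    left, upper, right, lower = box
    return [
        [
            value if (upper <= y < lower and left <= x < right) else background
            for x, value in enumerate(row)
        ]
        for y, row in enumerate(pixels)
    ]
```

Noise inside the rectangle is left untouched, so this only helps with dust in the margins, which matches the caveat that char separation still has to be handled separately.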