
GSA SER posting works incorrectly - "already parsed"

rysioslaw2 (hitman.agency)
Hi,
I use GSA SER.
I made 5 million links with a comment form on the pages. When I posted, I saw many 'already parsed' entries in the log window.
After investigating, I think I understand how it works: when posting fails on one link, SER never again tries to post on the same domain.
I saved the log to a .txt file and made a new project. The new project, with the 'already parsed' links imported, gave me a lot of good, valuable and verified links, but also new 'already parsed' entries.
In effect I got about 80% verified links from the links that were flagged as 'already parsed' in the first project.
In my opinion, this is a logical error in how GSA SER operates. The fact that the first or second link on the same domain failed does not mean that the third and fourth are bad!

My question: Can I disable this feature and never see the "already parsed" message again?


Comments

  • Sven (www.GSA-Online.de)
    SER will set a whole domain as "already parsed" if it failed to submit or if the engine is not URL based (e.g. WordPress article). For comments it will use every URL and only add it as "already parsed" if it submitted there before (and 'no duplicate posting' is enabled), if it does not detect it as a comment platform, or if it fails to submit there.
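    In rough Python pseudocode, the rule described above looks something like this (a sketch only; every name here is illustrative, not SER's internals):

        # Sketch of the flagging rules described above; illustrative only.
        already_parsed = set()

        def mark_already_parsed(domain, url, url_based_engine, result,
                                avoid_duplicates):
            if not url_based_engine:
                # Non-URL-based engines (e.g. WordPress article): a failed
                # submission flags the whole domain.
                if result == "submit failed":
                    already_parsed.add(domain)
            else:
                # Comment engines check every URL and flag only that URL:
                # when it was posted to before (and duplicate posting is
                # disabled), when no engine matched it, or when the submit
                # failed.
                if (result == "submitted" and avoid_duplicates) or \
                        result in ("no engine matches", "submit failed"):
                    already_parsed.add(url)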
  • I'm not sure if it helps with what you're saying, but you can modify the project and clear the cache, saying yes to clearing the "already parsed" URLs - just maybe say no to deleting accounts.

    SER will then go over the "already parsed" links again. I do this sometimes if I think the project is hanging or confused; then it starts ripping through them again.

    Helps for me in certain situations....
  • rysioslaw2 (hitman.agency)
    I have no duplicate URLs in my list.
    I think this logic is wrong because the engine file (in this case General Blogs.ini) can't really recognize the engine; it only recognizes whether it is possible to post on the exact page. So SER does not recognize URLs of pages like contact-me, about-me etc. as WordPress, and flags the whole domain as "already parsed".

    To explain, I made a test: I made 4 links to my page - 3 links to blog posts where I have a comment form, and a 4th to a contact form (a bad link, but still the WordPress engine).
    I tried to post with Blog Comment / General Blogs. When my list has the order 1,2,3,4, I get 3 comments; but when I change the order to 4,1,2,3, I get "no engine matches" and 3 links marked "already parsed".
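    A toy simulation of that order dependence under a "flag the whole domain on the first failure" rule (plain Python, purely illustrative):

        # 3 postable blog URLs plus 1 contact page on the same domain.
        def run(urls):
            flagged, comments = set(), 0
            for url in urls:
                domain = "mysite.com"
                if domain in flagged:
                    print(url, "-> already parsed")
                elif "contact" in url:        # no comment form on this page
                    print(url, "-> no engine matches")
                    flagged.add(domain)       # whole domain gets flagged
                else:
                    comments += 1
                    print(url, "-> submitted")
            return comments

        posts = ["/post-1", "/post-2", "/post-3"]
        print(run(posts + ["/contact-me"]))   # order 1,2,3,4 -> 3 comments
        print(run(["/contact-me"] + posts))   # order 4,1,2,3 -> 0 comments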

    The problem appears in another situation too. When I have 20 proxies and one of them is blacklisted, websites with antibot protection (Cloudflare, SiteGround's sg-captcha) don't open when SER tries to use the blacklisted proxy. In that case a good proxy can post on the website, the blacklisted one gets "no engine matches", and after that attempt the entire domain is marked as "already parsed".
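    One way to catch a blacklisted proxy before a run is to fetch a page you know is reachable through each proxy and look for challenge responses; a minimal sketch with Python's requests library (the URL, proxy list and patterns are made up for illustration):

        import requests

        PROXIES = ["http://user:pass@1.2.3.4:8080",
                   "http://user:pass@5.6.7.8:8080"]   # your proxy list
        TEST_URL = "https://example.com/"             # a known-reachable page

        for proxy in PROXIES:
            try:
                r = requests.get(TEST_URL,
                                 proxies={"http": proxy, "https": proxy},
                                 timeout=15)
                # Antibot walls tend to answer 403/429 or serve a challenge
                # page instead of the real content.
                blocked = (r.status_code in (403, 429)
                           or "sgcaptcha" in r.text.lower())
                print(proxy, "BLOCKED" if blocked else "ok")
            except requests.RequestException as exc:
                print(proxy, "failed:", exc)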

  • Maybe you need to add or adjust something in the script, like adding [Extra_Step*] and dealing with it somehow there.

    I notice with scripting that to get higher quality now, it seems you have to make a script for each site.

    Most default targets with metrics and traffic have been changed in some way, if they have decent metrics.

    Like:

        fixed url=https://authoritydomain.com

    One that can be identified but not posted to in the way you might want.

    Usually, the identified and failed lists are ripe opportunities to find sites worth making fixed-url scripts for, granted the metrics match the quality you're after.

    Then create a script/bot for your purposes on that domain.
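    As a very rough illustration of the idea (the fixed url key is the one quoted above; the section layout here is simplified, so compare against a real engine .ini before copying anything):

        [setup]
        ; pin this script to one authority site instead of searching for targets
        fixed url=https://authoritydomain.com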

    Here's an example (screenshot of search results). . .

    This was a search for phpbiolinks. While there is a larger list, only a few are above Moz Rank 2.

    I wouldn't find posting a bunch of links on sites with zero metrics helpful in my campaigns.

    And of these. . .

    Some are banned by country.

    Some have hCaptcha.

    Some have reCAPTCHA.

    Some have image captchas.

    Some just need different consent boxes ticked.

    Some seem down... etc.

    So it can take some time to find a good url to script for.

    If I'm hearing you correctly, the "already parsed" message probably has to be dealt with differently for what you're trying to do, maybe?

    I frequently stop the software, clear some caches, even hit refresh. Then I start again and usually get better results if I'm having an issue with a project.
  • rysioslaw2 (hitman.agency)
    @backlinkaddict Thank you for your help, I appreciate it.
    In my case I have a huge list, many millions of links to WordPress pages with a comment form on the page. Because of that, I was very surprised when I saw more than 200 "already parsed" logs. I think it appears because of proxy blacklisting.
    Of course the best idea is to make a script or find another way to solve this situation. Other programs, when they find something like that, try another proxy.

    GSA SER knows if there is an antibot script; I think @Sven knows that very well.
    When the response is similar to this example:

    response:
    <html><head><link rel="icon" href="data:;"><meta http-equiv="refresh" content="0;/.well-known/sgcaptcha/?r=%2F%D9%8A%D8%A7-%D8%A3%D9%85%D8%A7%D9%87-%D8%A3
    %D8%BA%D9%84%D9%82%D9%8A-%D8%A7%D9%84%D8%A8%D8%A7%D8%A8-%D9%88%D9%86%D8%A7%D9%85%D9%8A%2F&y=ipr:xxx.xxx.xxx.xxx:1710612134.417"></meta></head></html>

    it is certain that an antibot is installed. Of course, this is just an example.
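    Checking for that kind of challenge page is simple; a minimal sketch in Python (the pattern list only covers the SiteGround example above, and all names are mine, not SER's):

        import re

        # A near-empty page whose only job is an instant meta-refresh into
        # a captcha endpoint is almost certainly an antibot challenge.
        ANTIBOT_PATTERNS = [r'http-equiv="refresh"[^>]*sgcaptcha']

        def looks_like_antibot(html):
            return any(re.search(p, html, re.IGNORECASE)
                       for p in ANTIBOT_PATTERNS)

        sample = ('<html><head><meta http-equiv="refresh" '
                  'content="0;/.well-known/sgcaptcha/?r=..."></head></html>')
        print(looks_like_antibot(sample))   # True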

    If SER resolves that problem, it will be very good; if SER tries another proxy, it will be good; if SER does nothing, it will be enough; but when SER flags the whole domain as "already parsed" and will never try posting there again, it is VERY BAD.

    GSA SER has a proxy scraper and should attempt to post to URLs even if it failed to post there before.
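    The retry behaviour being asked for could look roughly like this (a sketch with Python's requests library; the function and the challenge check are illustrative, not SER's API):

        import requests

        def fetch_with_fallback(url, proxies, timeout=15):
            # Try each proxy in turn; give up on this one URL only after
            # every proxy has failed - never flag the whole domain.
            for proxy in proxies:
                try:
                    r = requests.get(url,
                                     proxies={"http": proxy, "https": proxy},
                                     timeout=timeout)
                    if r.ok and "sgcaptcha" not in r.text.lower():
                        return r          # usable response, go ahead and post
                except requests.RequestException:
                    continue              # this proxy failed, try the next one
            return None                   # requeue the URL for a later attempt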

  • Do you have it set to allow posting to the same domain in your 'how to submit' settings?

    Or maybe check the script and have a look at. . .     posted domain check=1

    Maybe you need to change it to allow posting to the same site, but only if the URL is different, as in the example. . .

    posted domain check

    Overwrites project settings: Avoid posting URL on same domain twice
    0 = do not post any link if anything has been posted before
    1 = allow to post a link again on the same domain (but only if the URL is different)
    2 = special setting for tier projects that would then allow posting several URLs on the same site.
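    For a case like the one above (many comment URLs on one domain), the line to look for in the engine file the project uses (e.g. General Blogs.ini) would then be, as a fragment only:

        ; allow posting again on a domain as long as the URL differs
        posted domain check=1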

    Just wondering if that would help; I see what you're saying, I think.
  • There is a URL you can visit (well, more than a few now) and it will tell you if you pass a bot test.

    One was antoli or something, and there's a newer, more advanced one that tells you everything about your browser fingerprint.

    I have scripts that go take a screenshot and come back with a picture as the result.

    I'll have to dig it out.
  • There are ways like this too. . .

    antibot_test=1

    Basically, it's just flipping a switch in the script to make a false variable true.

    Of course you'd have to find it, and it's not working everywhere.