
SER v15.85 slows to a crawl on large submissions, older versions do not

edited January 11 in Bugs
Hi @Sven,

With version 15.85, using 150 threads and 30 html timeout, after a few minutes, it slows from 150 threads... down to a crawl, only 10-20 threads being used.

At first, I thought this was due to "already parsed," so I enabled "Retry to submit to already failed sites," but while this removed "already parsed," it didn't fix the issue.

So as a last resort, I tested this exact same setup (same PC, same Internet connection, same project) on v14.70. It doesn't slow down at all. I've been running at 150 threads consistently for the past 30 minutes.

- - - - - - - - - -

Recreating this is quite simple. Take a list of 50-100k+ blog comments, import them, set to 150 threads and press go. After a few minutes, SER will slow down to a crawl and only submit at 10-20 threads.

I've tried a ton of different setups / options, and this happens every single time on the latest version, but not on older versions. Note: the server / proxies I'm using are more than capable of running well over 150 threads.

Comments

  • SvenSven www.GSA-Online.de
    Testing this against a build that's 100+ versions old and telling me the latest update is the problem is just not a valid test. Sorry, but tons of things have happened since that very old build. I can not tell what might have made a change.
  • edited January 11
    The only difference I see on a surface level is the way the imported URLs are parsed. The older version seems to parse them in groups of ~300, while the new version parses them in groups of ~3000.

    But this bug is easy to reproduce, even if it's not comparable to an older version. All you have to do is set to 150 threads and import a large list of blog comments (into a single project), within minutes SER will slow to a crawl and only submit at 10-20 threads.

    If you stop the project then start it again, the same thing happens. It will start at 150 threads, then after a few minutes, go down to 10-20 threads. The only way to realistically submit with 150 threads (using a single project) is to stop and start it repeatedly every 10 minutes or so.

    I have a programming background and usually have some idea of what could be causing issues, but on this one, I'm stumped. The best way I can describe it is that SER is capping itself: it's unable (or unwilling) to open more connections. I even started to wonder if this could be OS-related (Windows 10), but since the old versions don't have this issue, it seems to be an SER issue.
  • Also, I am only using CB / Text Captcha Solver for captcha solving, nothing else.
  • SvenSven www.GSA-Online.de
    Run one project, log to file and when you think it drops on threads, write down the time and send me the log file.
    Also if you think it should start more threads, please click on help->create bugreport.
    Also please send the project backup in question.
  • I will do this now, where can I find "log to file"?
  • Nvm found it
  • edited January 11
    Watching it closely, this issue is almost certainly related to the way SER is loading / parsing batches of links.

    - On older versions, groups of ~300-600 were being constantly loaded, which resulted in a constant 150 threads.

    - On newer versions, groups of ~3000-6000 are not added until the existing batch is almost complete. Even though there are plenty of links remaining in the batch, SER caps the threads (progressively lower, see below) until the batch is processed... instead of using 150 threads all the way to the end or loading a new batch.

    The next batches are loaded far too slowly, which causes SER to dip down to 10-30 threads for significant stretches.

    2600/6000 batch processed -> 60 threads
    3400/6000 -> 40 threads
    4000/6000 -> 35 threads
    4300/6000 -> 25 threads
    4500/6000 -> 20 threads
    4700/6000 -> 15 threads
    5100/6000 -> 10 threads

    (Next batch STILL not loaded at this point)

    Next batch was loaded around ~5500/6000, and now I'm back up to 150 threads for a few minutes.

    Next batch:

    1000/3200 -> 70 threads
    1400/3200 -> 50 threads
    1700/3200 -> 35 threads
    1900/3200 -> 25 threads
    2000/3200 -> 10-15 threads

    Next batch (2100/3200):

    Next batch (2000/3100):

    Next batch (2000/3200):

    Last batch (4300/6100):

    Old version - SER stays at a consistent 150 threads.

    Notice the drastic increase in LpM on the old version because it avoids this issue - currently a 300+ LpM increase (+100%) and still rising!



    As the number of parsed links in a batch increases, the thread count is decreased until the next batch is added (which takes far too long). Once the next batch is added, the threads temporarily boost back to 150... only to drop again a few minutes later.

    In the end, SER spends far more time at 10-60 threads than it does at 150. Either the batch adding needs to be sped up, or SER needs to be able to use the full amount of threads until the next batch is added.
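    The "speed up the batch adding" idea above can be sketched as a producer/consumer with a refill watermark: the loader tops up the work queue while workers are still busy, instead of waiting for the current batch to finish. This is a simplified illustration in Python, not SER's actual (closed-source) code; the function name `run`, the `watermark` parameter, and all sizes are hypothetical.

```python
import queue
import threading
import time

def run(batches, num_workers=8, watermark=4):
    """Process URLs from successive batches without letting workers starve."""
    q = queue.Queue()
    done = object()                       # sentinel telling workers to exit
    processed = []

    def loader():
        for batch in batches:             # e.g. groups of ~3000 imported URLs
            for url in batch:
                q.put(url)
            # Top up as soon as the queue drains to the watermark; a naive
            # loader would instead wait for the whole batch to finish, which
            # is exactly when thread counts collapse.
            while q.qsize() > watermark:
                time.sleep(0.001)
        for _ in range(num_workers):
            q.put(done)

    def worker():
        while True:
            item = q.get()
            if item is done:
                break
            processed.append(item)        # stand-in for "submit to this URL"

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    load_thread = threading.Thread(target=loader)
    load_thread.start()
    for w in workers:
        w.start()
    load_thread.join()
    for w in workers:
        w.join()
    return processed
```

    With a low watermark the queue is refilled while workers are still busy, so concurrency stays at `num_workers` the whole time instead of tapering off as each batch drains.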
  • edited January 11
    Upon further testing, the old version does in fact sometimes split the batches into 3000-6000. However, its scheduling (or thread-management) system seems to be far more effective... even with the larger batches, it never dips below 150 threads.



    If this was in the new version, I'd already be down at 10-30 threads, as you can see from the images above.
  • SvenSven www.GSA-Online.de
    OK, thanks for the research... I might have found something that should improve results in the next update.
  • edited January 12
    Sadly the issue still persists. It's maybe... 5% better, but that's about it. I hope there is some way to fix this issue. During my testing last night, I found that the older version was able to achieve ~100% higher LpM than the newer version. I believe it's because of the way the newer version(s) slow down the threads so much, whereas the older versions keep pumping at high threads the entire time.



    Edit: I have actually found some instances where the update does help out quite a bit. Whatever you did, it seems to be on the right track. I will continue testing to see if I can find any other factors that may be contributing to this (e.g. targets having problems).
  • edited January 14
    @Sven

    On the latest update, the speed does seem to be increased, but there's a big problem. SER is trying to re-submit to the same accounts over and over again, even though "Retry to submit to previously failed sites" is unchecked, and "Allow posting on same site again" is unchecked.

    It loads the targets via:

    07:05:51: [+] 922 possible new Target URLs from present accounts.

    Even though it was not instructed to do this. This is especially bad when you're using SER for scraping search engines and posting (to URLs that require accounts), because it doesn't allow any scraping to happen. It keeps re-loading the same target URLs (ones that have accounts) and trying to post to them again.

    I have tested and confirmed this does not happen on v15.86.
  • I'm also getting the problem with SER trying to use present accounts when I don't want it to. Reverting to previous version.
  • SvenSven www.GSA-Online.de
    It will use present accounts and your project settings to submit more articles or create new accounts. Nothing has changed on that logic really.
  • SvenSven www.GSA-Online.de
    found one problem in previous build that was fixed in latest update.
  • edited January 15
    Sven said:
    It will use present accounts and your project settings to submit more articles or create new accounts. Nothing has changed on that logic really.
    Something has definitely changed because it didn't do that in any version (ever) before the most recent 2... Why would anyone want SER to automatically submit more articles when they didn't select "Allow posting on same site again" or "Retry to submit to previously failed sites"?

    There's no way that anyone who uses contextual engines (that require accounts) can use this most recent version for submission.
  • SvenSven www.GSA-Online.de
    that latest update should have fixed this!?
  • edited January 16
    Sven said:
    that latest update should have fixed this!?
    Sadly, it did not fix it... it's still the same thing happening. It's particularly noticeable on contextual targets where an account is created. This didn't happen on 15.86 and all versions beforehand.

    I noticed it because I use SER for scraping and posting to contextual engines. What happens is that SER keeps trying to post to existing accounts, which leaves zero time for scraping. So basically, it keeps importing the same list of targets over and over again, and no scraping ever occurs. The only way I can scrape new targets (using SER) is by using 15.86 or earlier.
  • SvenSven www.GSA-Online.de
    but the code is the same as before...just that it's optimized :/
    please send me the project backup for a closer review.