
Low Thread Count

AlexR Cape Town
I would like to start a discussion or list of causes of a low thread count.

I have:
  • 80 projects on my VPS
  • Lots of CPU headroom (running at 10%)
  • Lots of RAM available
  • Lots of projects
  • A mixture of T0 and T1 projects
  • Proxies that are all private and good (over 40 private proxies available)

I have set threads to 100, but SER often sits at between 2 and 10 threads (sometimes it climbs to 22).

I am not seeing "Download Failed" in the logs. 

What I would like to know is:
1) What causes a low thread count?
2) If there are resources available, wouldn't it be a good thing if SER used them in the background to:
a) clean out duplicate URLs
b) maybe globally verify URLs again? (not the per-project reverify, but a global one)

Would love some input here!


Comments

  • - No target URLs left, or only a few? Check the results of the search queries.
    - Delete the target history once in a while.
  • AlexR Cape Town
    - delete target history once in a while - recently done.
    - not getting any "No target URL's left" errors in log either. 
  • I get this problem myself from time to time. The only way to counter it is to create multiple campaigns, which will then run simultaneously with a higher overall thread count.

    The problem is not missing targets or bandwidth or something like that. It even says so in the log window below:
    - 150/690 matches engine ...
    - 391/432 matches engine ...
  • AlexR Cape Town
    I have 80 projects on the server that are active as stated in original post. How many do you mean when you say "Create multiple campaigns" to solve it?
  • AlexR Cape Town
    @sven - do you throttle number of threads based on SE's selected?

    Let's say I only select:
    Google
    Yahoo
    Lycos

    and use only these 3 across all my projects on a VPS, does SER throttle the number of connections per SE? This is the only reason I can think of for the low thread count.
  • Sven www.GSA-Online.de
    Of course the program still queries the SEs according to the set time (time to wait between queries). Otherwise, if you had 100 projects and each project just waited 60 seconds between queries, you would end up being banned within a minute.
  • I just figured out why the thread count goes down so rapidly sometimes.

    When the project in question is getting closer to the submissions per day limit, the thread count goes down. At this point SER seems to lower the threads with every successful submission, until it hits the cap.

    I guess that makes sense, not having 300 threads running when the limit will be reached in the next couple of minutes. :D The +/- option might be helpful here.
  • Sven www.GSA-Online.de
    Yes, that's indeed the case. To make sure it stays within the limit, it may only start as many threads as there are submissions left under the limit.
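
The cap described here could be sketched roughly as follows. This is only an illustration of the stated rule, not SER's actual code: the function `threads_to_start` and its parameter names are made up for the example, and it looks at a single project in isolation.

```python
def threads_to_start(max_threads: int, daily_limit: int, submitted_today: int) -> int:
    """Hypothetical helper: cap new threads at the submissions still allowed today."""
    # Submissions left before the per-day limit is reached (never negative).
    remaining = max(daily_limit - submitted_today, 0)
    # Never start more threads than the global setting or the remaining budget.
    return min(max_threads, remaining)


# Example: threads set to 100, project limited to 20 submissions per day.
for done in (0, 15, 19, 20):
    print(done, threads_to_start(max_threads=100, daily_limit=20, submitted_today=done))
```

Under these assumptions the thread count for a limited project shrinks with every successful submission, which matches the drop observed near the daily cap.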
  • AlexR Cape Town
    @Sven - "To make sure it stays within the limit, it may only start as many threads as there are submissions left under the limit."

    Question 1:

    So if you use the scheduler with 15 projects and each project has a limit of 20 submissions per day, then it's 15x20=300 threads limit at start, and as you get some submissions, thread count = max of total submissions for all selected projects left?

    Also - can you help explain this a little to me (from your first statement).

    I have 50 private proxies.
    I have the time between SE queries set to 4s.
    I have scheduler running at 15 projects.
    I have 3 SE's selected (Yahoo, Google, Lycos)

    Does the program use 1 thread per SE, per proxy, and then do a search with a wait of 4s before the next search?

    Question 2:

    If so, the max the program will ever use (with the setup above) is 3 (SE's) x 50 (proxies) = 150 threads? These 150 threads get spread out among the selected projects on the scheduler?
    (i.e. 150 / 15 = max of 10 threads per project) 

    Question 3:


    Does SER see different Google SE's as different SE's? I.e. if I select 10 different Google SE's, will it see all 10 as 1 SE when allocating threads? So with 50 private proxies and 10 Google SE's selected, it will still limit it to 50 threads max?

    (Sorry for all questions, but working on a way to increase thread count!) 
  • edited May 2013
    i got my LpM from 10 to 160 just by prescraping with SB and changing 1 setting (see below). the LpM dropped from 160 to 40 after a few days, but it's staying there now. i only have 12 projects and no pingers/indexers selected.

    i like SB way more for fetching targets, because that's what it's made for. GSA SER is what it is: a submitter and not a scraper.

    what i do now is run SER on 10 private proxies and only let the "shit-tiers" look for targets via search engines; the top tiers only use the global site list. SB is scraping 24/7 with 17 cheap fiverr proxies and produces approx. 1 million URLs/day, which i then transfer to SER.

    maybe it's worth a try for you.

    best regards
  • Sven www.GSA-Online.de

    Bahh you can be sure if I answer one question...AlexR has much more to ask :( OK once more...

    Q1: Each project starts as many threads as it can. Of course a project with limits can only request the max. number of threads to be started as explained in my previous post. It does not influence any project with no limits.

    - It also does the 4 sec wait time on each SE: projects request an SE query and the global control thread allows it or not. Proxies are not counted here.

    Q2: Come on, it's all getting too complicated. It is already handled as well as it can be. DO NOT CARE ABOUT IT!

    Q3: It will see all Google SEs as one, since in the end the queries all go to the same IP. The same goes for all Bing SEs.
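
A minimal sketch of how such a global gate might behave, based only on what is said above (one wait time per SE, shared by all projects, proxies not counted, all Google variants treated as one). The class `GlobalQueryGate`, the `ENGINE_GROUP` table and the engine names in it are assumptions for illustration, not anything taken from SER.

```python
from typing import Dict

# Hypothetical table: each selected engine entry maps to the service
# that actually receives the query in the end.
ENGINE_GROUP: Dict[str, str] = {
    "google.com": "google",
    "google.co.za": "google",
    "bing.com": "bing",
    "yahoo.com": "yahoo",
    "lycos.com": "lycos",
}


class GlobalQueryGate:
    """One gate shared by all projects; it ignores how many proxies exist."""

    def __init__(self, wait_seconds: float) -> None:
        self.wait_seconds = wait_seconds
        self._last_query: Dict[str, float] = {}  # group -> time of last allowed query

    def request(self, engine: str, now: float) -> bool:
        """Return True if a project may query `engine` at time `now` (seconds)."""
        group = ENGINE_GROUP.get(engine, engine)
        last = self._last_query.get(group)
        if last is not None and now - last < self.wait_seconds:
            return False  # too soon for this engine group, the project has to wait
        self._last_query[group] = now
        return True


gate = GlobalQueryGate(wait_seconds=4.0)
print(gate.request("google.com", now=0.0))    # first Google query
print(gate.request("google.co.za", now=1.0))  # same group, inside the 4 s window
print(gate.request("yahoo.com", now=1.0))     # different engine group
```

If the gate really works along these lines, the number of search queries per minute depends on the wait time and the number of distinct engine groups, not on the number of projects or proxies.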

  • AlexR Cape Town
    @sven - What if I asked questions like "how do I increase my LPM??" ;-)

    An update for all reading, I have stopped using the scheduler and just press start and it's now using full threads. From 3 to 80 threads! Quick fix! Has anybody had an issue with the scheduler? - could the scheduler not be allocating resources somehow? It's just the scheduler was set to 15 projects max and it never increased the threads past 10, even when there was lots of work for it to do (even if it didn't have anywhere near enough links per project)! 

    Q3 - that is very very interesting!




  • I tried the scheduler once too, but abandoned that feature. It was not related to performance though. It was just that the feature, which should save memory, ended up using more memory than if I press "Run" without scheduling.

    Perhaps it was just bad luck. I should look at that feature again these days. Frankly, I never really liked this feature. As I'm building tiered links, if a campaign is important enough to run, I want it to get all the time it can get to finish the job.

    With scheduling, if you choose "x10 submissions per URL per day", it may give you the sense that those projects are run every day, but perhaps it is just 1x per tier 2. That means at a 10% verification rate, only 1 link every 10 days. No wonder some people get no results or non-optimal ones.

    I suggest you try to increase the number of projects running at the same time. If that is the issue, setting it to 20, 30 and so on should increase the thread count gradually.
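
The "1 link every 10 days" figure above can be checked with a small calculation. This sketch assumes that "1x per tier 2" means one submission per day actually reaching the tier 2 project; the function name is made up for the example.

```python
def days_per_verified_link(submissions_per_day: float, verification_rate: float) -> float:
    """Hypothetical helper: average days needed to get one verified link."""
    verified_per_day = submissions_per_day * verification_rate
    return 1 / verified_per_day


# One submission per day at a 10% verification rate.
print(days_per_verified_link(submissions_per_day=1, verification_rate=0.10))
```

With ten submissions per day at the same rate, the same formula gives roughly one verified link per day, which is the gap the post is pointing at.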