
[HELP!] Number of Threads


Comments

  • Sven www.GSA-Online.de

    The network connection itself doesn't have that much influence, but the hardware does. A good router is the top priority here. Many threads mean a lot of open connections to handle. That's done by the router, and if it's a bad one it can't cope and breaks down (loses connection).

    Another issue is the software. A normal system setup without firewall/antivirus is optimal here. Everything in between slows things down, as those programs monitor the traffic and eventually can't keep up.

    The last thing you should worry about is network speed. Of course a faster connection helps, but I don't see it as such a big issue.

    But one thing is correct: the more threads you use, the higher the timeout should be (see the sketch below).
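
    A rough way to see why: all running threads share the same line, so each individual download slows as the thread count rises, and pages that load fine on an idle connection start tripping the timeout. A minimal back-of-envelope sketch in Python (the idle fetch time, utilization fraction, floor, and cap are invented for illustration; this is not SER's actual logic):

    ```python
    # Back-of-envelope: with N threads sharing one line, each transfer gets
    # roughly 1/N of the bandwidth, so per-page download time scales up.
    def suggested_timeout(threads, idle_fetch_secs=2.0, utilization=0.25,
                          floor_secs=30, cap_secs=180):
        """Scale the per-request timeout with the number of concurrent threads.

        utilization: assumed fraction of threads actively transferring at once.
        """
        concurrent_transfers = max(1, int(threads * utilization))
        estimate = idle_fetch_secs * concurrent_transfers
        return min(max(estimate, floor_secs), cap_secs)

    for threads in (50, 100, 300, 1000):
        print(threads, "threads ->", suggested_timeout(threads), "s timeout")
    ```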

  • Thanks @Sven! Could you please tell me what HTML timeout I should use? I'm running 1000 threads. Thanks!
  • Sven www.GSA-Online.de
    1000 threads... well, that's a lot... too much if you ask me, but well... use the maximum timeout here, at least 120 sec.
  • @fakenickahl - I alluded to this privately with someone else. SER handles processing of freshly scraped lists better than it does verified link lists. I put this down to the URLs on a verified list being hit by everyone else, and thus the sites being larger, the pages being larger, and the hosts' bandwidth processing all of the different users' SER connections, etc. This would obviously not apply to your own verified list.
  • @Justin that really makes a lot of sense! It never occurred to me that this could be the reason, but doesn't this really only make sense for image comments, blog comments, and guestbooks? I'll try it again soon with these platforms unticked anyway. I'm thinking that an article site, for example, being visited a lot by SER would only make the load time longer rather than increasing CPU and RAM usage.
  • gooner SERLists.com
    Are you totally sure about that, guys? I have identical servers; some run scrapes and some run verified lists.
    The ones running verified lists always have more submissions and a better submitted-to-verified percentage.
    I mean every single day, on every single server. It has never once been the reverse.
  • @gooner, I'm sure that a larger percentage of my verified lists results in submissions than my scraped lists, and also that my verified lists produce more verified links. My issue is that SER uses way more resources posting to a verified list compared to a freshly scraped list, even though it's doing a lot more submissions per minute on my scraped list.

    I just wanted to point out that the verified list takes up a lot more resources than my scraped lists, even though SER does more submissions per minute on the scraped list. I guess the fewer submissions per minute on my verified lists are due to the huge resource usage.
  • goonergooner SERLists.com
    @fakenickahl - Gotcha. For me, generally the higher the LPM the more resources are used, no matter what list I'm running. I'm not saying you are wrong, just that I don't see that on my servers.

    I usually use dedis, but I will be running performance tests on a VPS next week, so I'll see if I can reproduce what you are seeing, and hopefully we can all make comparisons to improve performance.
  • @gooner, I just thought you hadn't properly caught what my issue is, and that's why I tried clarifying. It's interesting that you're not seeing the same as I am; it might be due to a difference in how we are setting up projects. I've experienced the same problem on two dedicated servers now.
  • goonergooner SERLists.com
    @fakenickahl - No worries brother, thanks for clarifying. Yeah, I'm thinking it could be a setting. We are putting together a PDF with the settings we use, so when it's done you can check it out and maybe let me know if you do anything differently, and hopefully we'll find the cause.
  • @gooner I have the same issue, please help.

  • gooner SERLists.com
    Hi @adystanley - I don't know the cause of the problem, but if we compare settings then maybe we can find the answer.
  • That sounds great man, I'll definitely go through your recommendations and see if I'm doing anything differently, and if so, what.
  • I'm having crazy CPU usage when importing verified lists too. I never thought it could be a verified list problem, but looking back at when I just had raw scrapes, I had much, much lower CPU usage.

    I was thinking the problem was either VPS issues or recent SER version issues. Interesting thread though. When I have burnt through this list, I will retry using raw scrapes and see if that solves the CPU problem.
  • Now that is odd... On one of my servers I didn't wait to burn through the verified list; I just cleared the targets and added 400k raw scrapes, and I am now at 10-50% CPU instead of 99% CPU... Very strange.

    Just to clarify, I was using the verified list previously, and I imported it 3 different ways over the last week:

    1) Import from site list (right-click menu)
    2) Pull from folder (options tab)
    3) Merged, split, and imported as 25k lists (see the sketch below)

    I even tried to rotate just 3 projects at a time and still had the CPU problem.
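
    For what it's worth, the merge/split step in 3) can be reproduced outside SER with a few lines of Python; a minimal sketch, assuming plain one-URL-per-line .txt site-list exports (the folder name and output file pattern are placeholders):

    ```python
    # Merge exported site-list .txt files (one URL per line), dedupe,
    # and split into 25k-line chunks for importing one batch at a time.
    import glob

    CHUNK = 25_000
    seen = set()
    urls = []
    for path in glob.glob("sitelists/*.txt"):   # placeholder folder
        with open(path, encoding="utf-8", errors="ignore") as f:
            for line in f:
                url = line.strip()
                if url and url not in seen:
                    seen.add(url)
                    urls.append(url)

    for i in range(0, len(urls), CHUNK):
        with open(f"split_{i // CHUNK:03d}.txt", "w", encoding="utf-8") as out:
            out.write("\n".join(urls[i:i + CHUNK]) + "\n")
    ```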
  • That's the same thing I've been seeing, @Brumnick, but my RAM usage is also going through the roof on verified lists. It'd be great if we looked further into this so we can give Sven something to work with, if there is indeed something to fix.
  • Hinkys SEOSpartans.com - Catchalls for SER - 30 Day Free Trial
    As of yesterday I'm seeing something similar. 

    I had some very bad CPU issues a few weeks ago (which I thought were due to the VPS), but they seemed to fix themselves without me doing anything special. (Though I did untick the "skip for identification" proxy setting, which seemed to help a lot.)

    Everything was back to normal; SER was running at 300 threads and 60-90% CPU without ever topping out at 100%. Until yesterday.

    When I logged in to the VPS, it was basically frozen with the CPU topped out at 100%. For some reason it now runs REALLY sluggishly, and I can't go over 80-100 threads without hitting 100% CPU.

    Anyway, here's what I noticed:
    When I run even 1 single project (it doesn't matter if it's 1 or 10 projects, the result is the same) using ONLY a verified list on over 100 threads, it uses 100% CPU all the time (while posting). It doesn't seem to matter which project it is; as long as it uses a verified list, it's sluggish.

    However, when I run my 3 scraping projects (which are scraping with the "Use URLs linking on same verified URLs" option), they run on 300 threads more or less as expected (around 90% CPU).

    RAM, on the other hand, is as normal as it gets; it doesn't go over 300 MB tbh (I'm looking at it right now and it's at 160 MB with 99% CPU).

    The only thing I don't understand is why this started happening basically overnight. :S
  • @Hinkys: Thanks for that!

    Could someone please explain how "Skip for identification" affects your project? Thanks!
  • "Skip for identification" means you don't use proxies to identify the website's engine type before posting to it; SER will use your own IP to visit the page and check it out.
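
    Conceptually it looks something like this (a sketch only, using Python's requests library to stand in for what SER does internally; the proxy address and the engine check are placeholders):

    ```python
    import requests

    # Placeholder proxy; SER rotates through your configured proxy list.
    PROXIES = {"http": "http://127.0.0.1:8080",
               "https": "http://127.0.0.1:8080"}

    def identify_engine(url):
        # With "skip for identification" ticked, this request goes out
        # directly from your own IP; no proxy is used for the check.
        html = requests.get(url, timeout=30).text
        return "wordpress" if "wp-content" in html else "unknown"

    def submit(url, data):
        # The actual posting still runs through your proxies as usual.
        return requests.post(url, data=data, proxies=PROXIES, timeout=30)
    ```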
