
Is the HTML timeout proportional to the total number of threads?

In GSA SER it says to increase the HTML timeout if you run a lot of threads. Should you use a proportionally larger timeout if you're running a lot of threads? For example, at 100 threads I would normally use about a 120-second HTML timeout; does that mean at 300 threads I should be running about a 320-second HTML timeout? Should the timeout always be just a little bigger than the number of threads I'm running? Thanks!

Comments

  • goonergooner SERLists.com
    Not necessarily, I use 500 threads and a 180-second timeout.
    I don't think you would need to go any higher than 180.
  • I tried doing the math but I came up blank. 50 proxies, 400 threads, 3-minute timeout. That would mean that at any given time 50 of those connections could each be waiting up to 3 minutes, right? @gooner (rough sketch of the arithmetic below)
  • And this would bottleneck the whole system?
  • goonergooner SERLists.com
    Hmm, I'm not sure about the maths, but yes, a max waiting time of 3 minutes per connection.
    I don't seem to have bottlenecks using that setting.
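
A minimal back-of-envelope sketch of the arithmetic asked about above, using the numbers quoted in the thread (400 threads, 50 proxies, a 3-minute timeout). This is illustrative Python, not SER's internals:

    # Worst case: every open connection has stalled and is waiting out the timeout.
    threads = 400          # concurrent connections SER may have open
    proxies = 50           # proxies shared between those threads
    html_timeout_s = 180   # 3-minute HTML timeout

    # A stalled thread is blocked for up to html_timeout_s before it gives up,
    # so throughput collapses to roughly one attempt per thread per timeout window.
    worst_case_attempts_per_minute = threads / (html_timeout_s / 60)
    print(f"worst case: ~{worst_case_attempts_per_minute:.0f} page attempts per minute")
    # ~133/minute with 400 threads, versus thousands/minute when pages respond quickly.
    # The proxy count does not change how long a stalled thread waits; it mainly
    # limits how fast you can query search engines without bans.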
  • ronron SERLists.com
    I think there are four factors that all come into play: threads, HTML timeout, wait time between search engines, and the number of search engines you use. I should also mention proxies, because they are part of the equation as well.

    I'm not recommending this, but I am at 300 threads, a 120-second timeout, a 5-second wait, 190 search engines, and 30 proxies.

    It's best to experiment. I was at 225-250 threads for a long time with these settings, but then added threads and search engines, and it all worked out great.


  • SvenSven www.GSA-Online.de

    @ron A 5-second wait time is not good. It means a query is sent through the same proxy every 5 seconds for searches. This setting changed in one of the last versions: it is now per proxy, not global (where the proxy IPs were not relevant). If you set it to 120 seconds, each proxy searches the same search engine for new targets every 120 seconds, and that happens for every proxy. Meaning 30 proxies = 30 queries to the same search engine every 120 seconds.

    I hope I could explain it a bit with my limited English.

    Also, one note on the timeout: this timeout is not for the whole page. Each time a new byte is received from the server, the counter is reset and the download again has the whole time set in the timeout to proceed.
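
To make that concrete, here is a minimal sketch of what such a per-read "inactivity" timeout looks like, as opposed to a deadline for the whole page. This is plain Python sockets, not SER's actual code; the host and the timeout value are placeholders:

    import socket

    HOST, PORT = "example.com", 80   # placeholder target
    HTML_TIMEOUT = 120               # seconds of allowed silence between chunks

    sock = socket.create_connection((HOST, PORT), timeout=HTML_TIMEOUT)
    sock.settimeout(HTML_TIMEOUT)    # applies to each recv() call separately
    sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")

    page = b""
    while True:
        try:
            chunk = sock.recv(4096)  # the timeout clock effectively restarts here
        except socket.timeout:
            break                    # only fires after 120s with no data at all
        if not chunk:                # server finished sending
            break
        page += chunk
    sock.close()
    # A very slow server that sends one byte every 119 seconds would keep this
    # connection (and the thread behind it) busy almost indefinitely.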

  • ronron SERLists.com
    Yikes! I bet this explains why, a while back, everybody suddenly started seeing their proxies banned by Google.

    Although I have gotten good results, I am going to need to rethink this. Thanks @sven for the heads-up.
  • So basically, if you set the timeout too long, you could theoretically tie up all 300 of those possible proxy connections you have and end up slowing down SER a lot.
  • ronron SERLists.com
    Yes, you could slow it down a ton. There is a balance between LPM and getting your proxies banned.

    I took a few months off and wasn't paying attention. This reminds me of when the forum first started: the question and debate always centered on the best choices for each of those settings. For now I've bumped it to 30 seconds, and I'm going to test various settings.
  • Well, I mean that's even if you're not letting SER scrape targets. I could see it slowing SER down tenfold if you are letting SER scrape with such a small custom wait time between search engine queries (proxies getting banned) and a large HTML timeout (pages taking forever to load). There is definitely a point of balance somewhere; some rough numbers are sketched below.
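
For a rough sense of that balance, here are toy numbers using settings quoted earlier in the thread (30 proxies, a 5-second wait, a 120-second timeout), combined with the per-proxy behaviour Sven described. This is just arithmetic, not a recommendation:

    proxies = 30
    wait_between_queries_s = 5   # per proxy, per search engine (the old 5-second wait)
    html_timeout_s = 120

    # Scraping side: with the wait applied per proxy, each engine can see up to
    # one query per proxy every wait interval.
    queries_per_engine_per_minute = proxies * (60 / wait_between_queries_s)
    print(f"up to {queries_per_engine_per_minute:.0f} queries/minute per engine")  # 360 at a 5s wait

    # Posting side: every stalled connection can hold a thread for up to the
    # full HTML timeout before it is freed again.
    print(f"a stalled thread can be tied up for {html_timeout_s}s")

    # Raising the wait time calms the scraping rate (fewer proxy bans); lowering
    # the timeout frees stalled threads sooner. The balance point depends on
    # your proxies and your targets.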
  • If you're not using proxies, then you wouldn't need to go any higher than 30s.
  • SvenSven www.GSA-Online.de
    @useruser1 Wrong... it does not matter whether you use proxies or not. That setting applies to each outgoing IP used, so either your own IP or each proxy.
  • Yeah, especially if the HTML timeout timer is resetting with each byte of information received. This could really lock up and occupy a lot of your threads at once if your targets are slow-loading and spammed out. @sven, does the clean site lists function in the global options remove all dead sites from your global site lists?
  • SvenSven www.GSA-Online.de
    yes
  • @sven, not wrong, but: many people use lots of threads (300+) and set a higher timeout (120+), and that's the wrong way! The best solution is to reduce the thread count. It doesn't matter which software you are using; more than 300 threads is the wrong way on any VPS, and especially on a home Windows machine.
  • goonergooner SERLists.com
    @useruser1 - 200,000 verified per day from my VPS with 500 threads says you are wrong.
    More so when you consider the only junk engine I use is URL shorteners.
    No blog comments, image comments, guestbooks, etc.

    There are no hard and fast rules; experiment to find the optimum setup for you.