How does HTML Timeout affect processing?

edited July 17 in Need Help
Quick question.

I'm trying to lower the number of connections my router is handling so I can pump more threads into SB/GSA. So far I've lowered the TIME_WAIT timeout in DD-WRT from 120s to 60s, and the connections drop out A LOT quicker. I usually run my GSA HTML timeout at 60s as well.

I'm curious though: are most of the GSA timeouts happening when the TCP connection is in TIME_WAIT, or in, for example, SYN_SENT/SYN_RECV? If I could lower the TIME_WAIT timeout to 30s, it would dramatically increase the number of threads I could use. If you're not sure, I understand, but I figured I'd ask.

- - - - - - - -

Also, do more threads lower GSA's ability to function properly? Currently I use the same number of threads as I have proxies, so right now I have 100 dedicated proxies and use 75-100 threads. Let's say I acquire another 100 proxies and use 200 threads. Would this lower the functionality / verified links that GSA can produce, or is the software able to function properly at a high thread count (so long as you have the proxies to match it)?

Comments

  • SvenSven www.GSA-Online.de
    The timeout in all GSA products is on the TCP/IP stack; it is not an HTML timeout. A timeout of e.g. 60 sec. means that when no packet is received from the server within that time frame, the connection is dropped. So an HTML timeout does not apply here: a 1MB download from a slow server can take many minutes, as long as the server keeps sending packets within that time frame.
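
To make the difference concrete, here is a minimal Python sketch that models the two kinds of timeout on a list of packet arrival times. It is only an illustration of the behaviour described above, not GSA's actual code; the function names and the sample arrival times are made up for the example.

```python
def survives_idle_timeout(packet_times, idle_timeout):
    """True if no silence between consecutive packets exceeds idle_timeout.

    packet_times: seconds since connection start at which each packet arrives.
    This models a per-packet (inactivity) timeout, as described above.
    """
    last = 0.0
    for t in packet_times:
        if t - last > idle_timeout:
            return False  # server was silent too long: connection dropped
        last = t
    return True


def survives_total_timeout(packet_times, total_timeout):
    """True if the whole transfer finishes within total_timeout.

    This models a whole-download deadline, which is NOT what is described above.
    """
    return not packet_times or packet_times[-1] <= total_timeout


# Hypothetical slow server: one packet every 30 s for 10 minutes.
slow_but_steady = [30.0 * i for i in range(1, 21)]
# Hypothetical stalled server: one packet at 5 s, the next at 95 s.
stalled = [5.0, 95.0]

print(survives_idle_timeout(slow_but_steady, 60))   # steady trickle is kept alive
print(survives_total_timeout(slow_but_steady, 60))  # a total deadline would kill it
print(survives_idle_timeout(stalled, 60))           # a 90 s silence is dropped
```

With a 60 s inactivity timeout the slow but steady download completes even though it takes 10 minutes, while the stalled one is dropped.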
    ---
    Too many threads are indeed not really helpful when your memory/CPU is already on the edge. Judging by your specific question about the timeout, I think you can already fine-tune your network activity very well.
    Maybe it's a good idea to turn on the "failed downloads" stats on the status bar and decrease threads if that value increases dramatically.
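
The "decrease threads if failed downloads increases" advice can be sketched as a simple rule. This is a hypothetical helper for reasoning about the adjustment by hand; GSA does not expose such a function, and the 20% failure threshold, the 80% step and the floor of 10 threads are arbitrary assumptions.

```python
def next_thread_count(threads, failed_before, failed_now, attempts,
                      max_fail_percent=20, step_percent=80, min_threads=10):
    """Suggest a thread count for the next interval.

    failed_before / failed_now: "failed downloads" counter at the start and
    end of the interval. attempts: downloads attempted in that interval.
    All thresholds are illustrative assumptions, not GSA settings.
    """
    if attempts <= 0:
        return threads  # nothing measured, leave the setting alone
    new_failures = failed_now - failed_before
    if new_failures * 100 > max_fail_percent * attempts:
        # Failure rate jumped: back off, but never below the floor.
        return max(min_threads, threads * step_percent // 100)
    return threads


# 80 new failures out of 200 attempts (40%) at 200 threads: back off.
print(next_thread_count(200, 100, 180, 200))
# 10 new failures out of 200 attempts (5%): keep the current setting.
print(next_thread_count(200, 100, 110, 200))
```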
  • edited July 17
    My CPU/RAM are no problem. You may think I'm crazy, but one of my "servers" running SER/SB is a brand new i9-7940X (14 core/28 thread) with 32 GB DDR4. (I do a lot of SB Link Extract sessions, which are CPU/core-heavy, hence the need for such a processor.) Realistically, no amount of SER threads (within reason at least) could top out the CPU/RAM on this PC; my worry is more that GSA won't complete the tasks properly. I.e. I harvest / scrape links 24/7/365 with SB and SER search engines, but only end up using maybe 5-10% of those links after analysis. I'm just trying to make sure that the targets I'm submitting to have the resources to complete the submission properly.

    Is the general rule of thumb that you should have 1 proxy per thread (I only use dedicated proxies for submission / scraping on SER, public for SB scraping)? Or can "good" (as in dedicated) proxies handle more than 1 thread per proxy?

    - - - - - - - - -

    The problem (with the timeouts) is that my router cannot handle more than 15,000 simultaneous connections without slowing down, so I'm adjusting my timeouts (SYN, TIME_WAIT, etc.) to drop the connections faster. It's working for the most part, but I'm still somewhat held back by my router. Internet isn't a problem: 150 Mbps/unlimited data. The most I see between GSA/SB on high threads is ~20-50 Mbps, which is a ton, but I want to push the limits even more.
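
As a rough back-of-the-envelope check on why the TIME_WAIT timeout matters so much here, the steady-state size of the router's connection table can be estimated as new connections per second times how long each entry lives (Little's law). In the sketch below, the 15,000 limit and the 120s/60s/30s values come from this thread; the 100 connections per second and the 10 s of active lifetime are made-up assumptions, and a real router will not behave this cleanly.

```python
def tracked_connections(new_conn_per_sec, active_secs, time_wait_secs):
    """Estimated steady-state table entries: arrival rate x entry lifetime."""
    return new_conn_per_sec * (active_secs + time_wait_secs)


def max_conn_rate(table_limit, active_secs, time_wait_secs):
    """Highest new-connection rate that stays within table_limit entries."""
    return table_limit / (active_secs + time_wait_secs)


TABLE_LIMIT = 15_000   # from the thread: where the router starts slowing down
ACTIVE_SECS = 10       # assumption: average time a connection is really in use
RATE = 100             # assumption: new connections opened per second

for time_wait in (120, 60, 30):
    entries = tracked_connections(RATE, ACTIVE_SECS, time_wait)
    ceiling = max_conn_rate(TABLE_LIMIT, ACTIVE_SECS, time_wait)
    print(f"TIME_WAIT={time_wait:>3}s  entries={entries:>6}  "
          f"max rate={ceiling:.0f} conn/s")
```

Under these assumptions, most of each entry's lifetime is spent waiting to expire, so shortening TIME_WAIT raises the sustainable connection rate much more than shaving time off the active phase would.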

    The only other solution is to build my own router and figure out pfSense. From talking to networking experts, it seems doing this would allow me to have many thousands more connections open at once. That's not only because of the superior hardware (way better CPU/RAM than a standard router), but also because pfSense's sorting/etc. algorithms are enterprise-level stuff, far better than DD-WRT, which I use now.
  • SvenSven www.GSA-Online.de
    Accepted Answer
    Usually 1 proxy is good for 10 threads.