Number of threads in relation to Internet Speed
Most tutorials and FAQs suggest using something between 5-10 threads per proxy for link submission and then working your way up. However, these tutorials always assume that you're running SER on a VPS with an insanely fast connection (1 Gbps).
I'm actually running SER on my home computer. But since my home line isn't nearly as fast as a VPS connection would be, I'm wondering if I should reduce the number of threads to avoid timeouts/download fails/etc. I'm not getting a whole lot of those, but occasionally I do.
Right now I'm running with 200-300 threads + simultaneous scraping with SER's built in scraper (on public proxies). CPU + memory isn't nearly at max. My home line speed is ~17Mbps.
- Should I consider reducing the threads? And additionally, is my internet speed creating a tight bottleneck for the whole system, or is it fine as long as I only run a few projects (which is the case right now)?
- Would renting a VPS with much higher internet speed than my home computer increase my LPM by a lot (even if the hardware is much worse)?
I'm kinda trying to figure out the dominant factors when using SER: how important internet speed is compared to the number of threads.
Thanks in advance guys.
Comments
Timeouts and download failed errors can be caused by a lot of factors out of your control. Personally, I'd just keep increasing my threads until I was using as much bandwidth as possible.
@fakenickahl The 5-10 per proxy 'rule' is a recommendation I've read a couple of times. The guys from serlists.com recommend starting with 10 per proxy and then working your way up (which is exactly what I've said in my first post. I never said you should stay at 10T per proxy).
You've actually stated something which has been running around my head for a while now. TimeOut/DL failed can be caused by so many factors. I've had an awful lot of those while running at 100 threads. So I guess it's not the one and only factor you should take into consideration.
"The amount of threads you're running doesn't impact the bandwidth usage, only your amount of threads and timeout setting."
Would you mind elaborating? I don't quite follow. Isn't the timeout setting very dependent on the bandwidth? Say you're running with 1000 threads and a very small timeout setting of, let's say, 10 seconds. If you ran these settings on a dedi server with 1 Gbps speed and then on a home computer, you'd get completely different results, because the more threads are running, the less bandwidth is available per thread, and therefore the timeout setting needs to be chosen much higher. Or am I misunderstanding the concept of threads/bandwidth/timeout?
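The reasoning above can be sketched as a back-of-envelope calculation. A minimal sketch, assuming an evenly shared line and an illustrative 200 KB average page size (both are simplifications, not measured SER values):

```python
# Rough per-thread download time when the line is split across all active
# threads. Real bandwidth sharing is uneven; this is only an illustration.

def seconds_to_download(page_kb, line_mbps, threads):
    """Estimate download time for one page when the line is shared
    evenly across all active threads."""
    per_thread_kbps = (line_mbps * 1000) / 8 / threads  # kilobytes per second
    return page_kb / per_thread_kbps

# Same 200 KB page, same 1000 threads, very different lines:
home = seconds_to_download(200, 17, 1000)    # ~17 Mbps home line
dedi = seconds_to_download(200, 1000, 1000)  # ~1 Gbps dedicated server

print(f"home: {home:.1f}s per page, dedi: {dedi:.1f}s per page")
```

On this model the home line needs roughly 94 seconds per page, blowing straight past a 10-second timeout, while the 1 Gbps box finishes in under 2 seconds with the same thread count.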
I've tested ~800-1000 threads now and this is pretty much where the limit for my bandwidth is, I guess. At 1000 threads my connection was maxed out. So I guess 800-900 is within the limits of my bandwidth.
But I just can't believe that running 700-800 threads on a 17Mbps connection is realistic. It just sounds way too much. That'd mean that people with dedicated servers and high end hardware are able to run 3000-5000 threads or even more and I haven't read anything about numbers that high.
@fakenickahl What do you consider a reasonable timeout setting? I'm at 150-180.
Monitoring the bandwidth usage is something I've considered before, but it's surprisingly hard to find a reliable tool for that. But I finally found one and this was quite helpful, to be honest. I increased threads by 50 every time I hit a stable plateau and now I'm at 400 threads and this seems to work just fine. My connection keeps dancing around 80-90% usage with occasional spikes to 100-110%. I think I'll keep it that way to avoid too many spikes which may cause timeouts.
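The percentage figures above are just raw byte counters (which any network monitor reports) converted against the line speed. A minimal sketch of that conversion, stdlib only:

```python
# Convert raw byte counters from a network monitor into the
# percentage-of-line figures discussed above. Pure arithmetic.

def usage_pct(bytes_transferred, seconds, line_mbps):
    """Share of the line used during the sample window, in percent."""
    mbps = bytes_transferred * 8 / 1_000_000 / seconds
    return mbps * 100 / line_mbps

# 2 MB moved in one second on a 17 Mbps line:
print(round(usage_pct(2_000_000, 1, 17)))  # prints 94, i.e. close to saturation
```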
Regarding the timeout settings - yes, I'm doing kinda the same here haha. I'm keeping it at 180 since I'm still convinced that my bandwidth is the weak link in the system and therefore I don't feel like putting too much pressure on it. If I'm not mistaken, lowering this setting would only increase LPM, which is only necessary if you're running a shit load of projects or a churn & burn campaign and need as many links as fast as possible. Since neither is the case for me, I'll keep it like that.
This has been a great discussion, mate. I'm very grateful for that. Thanks for taking the time to help me out. I really do appreciate it.
Explained:
- Proxies: most providers allow up to 100 connections per proxy, some don't have a limit
- Threads: the amount of simultaneous tasks your VPS/computer can handle
- Bandwidth: maximum data throughput of your line (KB/s)
So what I did is play around with those factors a little. My result was that unless one of these factors had reached its personal limit, there was no reason for me to decrease any of the others.
Example 1:
- 10 proxies
- 17.5mbps bandwidth limit
I then started increasing the amount of threads until my bandwidth was maxed. This was now the limit for my setup.
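This first example can be sketched as a quick saturation estimate. The per-thread throughput figure below is an assumption for illustration, not a measured value:

```python
# Quick limit estimate for Example 1 (10 proxies, 17.5 Mbps line).

PROXIES = 10
CONNS_PER_PROXY = 100       # typical provider cap mentioned above
LINE_MBPS = 17.5
KBPS_PER_THREAD = 2.5       # assumed average per active thread, in KB/s

proxy_cap = PROXIES * CONNS_PER_PROXY                   # connection-limited
bandwidth_cap = LINE_MBPS * 1000 / 8 / KBPS_PER_THREAD  # bandwidth-limited

limit = min(proxy_cap, bandwidth_cap)
print(f"proxy cap: {proxy_cap}, bandwidth cap: {bandwidth_cap:.0f}, "
      f"system limit: {limit:.0f} threads")
```

Under these assumptions the proxies would allow 1000 connections, but the line saturates around 875 threads, so bandwidth is the binding limit, matching the earlier observation that 800-900 threads maxed out the connection.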
Example 2:
- 10 proxies
- 1Gbps bandwidth limit (VPS)
- Again, steadily increasing the threads. This time my VPS gave in at a certain point (due to the high number of threads). In this scenario the hardware is the limit.
If you think about it, it's quite logical. There's no reason not to increase your threads until you either hit your hardware limit, or your bandwidth limit.
So my conclusion is: do not rely on fixed numbers like xy threads for setup abc. Start increasing your threads until one of your factors hits its limit. That is then the limit for your whole system. Additionally, you automatically know what you'd have to change about your system if you wanted to increase overall performance.
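The ramp-up procedure described above can be sketched as a simple loop: raise the thread count in steps and stop when any resource gets close to its limit. The measurement functions below are placeholders you would wire to a real monitor, and the toy models are assumptions chosen only to illustrate a bandwidth-limited home line:

```python
# Sketch of the ramp-up procedure: increase threads in steps until any
# resource (bandwidth, CPU, ...) approaches its limit.

def find_thread_limit(measure_bandwidth_pct, measure_cpu_pct,
                      start=100, step=50, ceiling=2000, max_pct=90):
    """Return the highest tested thread count at which no measured
    resource exceeds max_pct of its capacity."""
    threads = start
    while threads + step <= ceiling:
        candidate = threads + step
        if (measure_bandwidth_pct(candidate) > max_pct
                or measure_cpu_pct(candidate) > max_pct):
            break  # next step would overload something; stop here
        threads = candidate
    return threads

# Toy models: on a slow home line, bandwidth saturates long before CPU.
bw = lambda t: t / 4.5   # bandwidth usage %, ~90% around 405 threads
cpu = lambda t: t / 20   # CPU usage %, barely loaded

print(find_thread_limit(bw, cpu))  # prints 400 on this toy model
```

On this model the loop settles at 400 threads because the next step (450) would push bandwidth past 90%, which mirrors the plateau-by-plateau approach described earlier in the thread.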
Just my 2cents.