'Add new URLs to Random' is too slow?
Hello,
I am running GSA Email Spider on a very fast server with a 1 Gbps connection. With 'add new URLs to the bottom' it can easily run 100 or even 300 threads, but the sites I scrape get HAMMERED and I get DoS complaints, so I need 'add new URLs to Random'.
When I use 'add new URLs to Random' it works nicely and doesn't hammer any sites, but it runs VERY slowly, only 5-10 active threads, no matter what I set the thread count to.
1. Is there any way to keep the URLs random and still make it run at 100 threads? It's only using 20% CPU, so I'm not sure why it's slow. (See the sketch after these questions for the kind of behavior I mean.)
2. Is there any way to run more than one instance of GSA at the same time?
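To make question 1 concrete, here is a rough Python sketch of the behavior I am hoping for. This is purely hypothetical, not GSA's actual code; the class name, the per_host_delay parameter, and the retry limit are all my own inventions. The idea is that workers pick URLs at random but skip any host that was hit within a cooldown window, so all 100 threads can stay busy without hammering any single site.

import random
import threading
import time
from urllib.parse import urlparse

class PoliteRandomQueue:
    # Hypothetical sketch: random URL picks with a per-host cooldown.
    def __init__(self, urls, per_host_delay=2.0):
        self._urls = list(urls)
        self._last_hit = {}            # host -> monotonic time of last pick
        self._delay = per_host_delay   # assumed per-host cooldown in seconds
        self._lock = threading.Lock()

    def done(self):
        with self._lock:
            return not self._urls

    def pop(self):
        # Return a random URL on a 'cool' host, or None if none is ready yet.
        with self._lock:
            now = time.monotonic()
            # Try a handful of random indices, then give up for this round.
            for _ in range(min(len(self._urls), 20)):
                i = random.randrange(len(self._urls))
                host = urlparse(self._urls[i]).netloc
                if now - self._last_hit.get(host, 0.0) >= self._delay:
                    self._last_hit[host] = now
                    url = self._urls[i]
                    self._urls[i] = self._urls[-1]   # swap-remove in O(1)
                    self._urls.pop()
                    return url
            return None

def worker(queue):
    while not queue.done():
        url = queue.pop()
        if url is None:
            time.sleep(0.1)   # every eligible host is cooling down; wait
            continue
        time.sleep(0.05)      # a real fetch of the URL would go here

if __name__ == '__main__':
    urls = ['http://site%d.example/page%d' % (i, j)
            for i in range(50) for j in range(10)]
    queue = PoliteRandomQueue(urls, per_host_delay=1.0)
    threads = [threading.Thread(target=worker, args=(queue,))
               for _ in range(100)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

With a design like this, throughput scales with the number of distinct hosts in the pool rather than with any single site's tolerance, which is why I'd expect the thread count to stay near the maximum.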
Overall the software is very solid, with excellent code. I highly recommend it to everyone.
Comments
Very interesting, you were right. I just upgraded from v7.0.1 to v7.0.2 and it's MUCH faster. Threads were bouncing between 20 and 70 at first; now they hold at 100, which is the max I set.
CPU usage is still only around 20%. I think I can push it to 300 threads.
1. Why do you recommend not going over 100 threads?
2. What was the limitation that you fixed? Something with the parser?
Again, this is one of the best programs I've seen. I appreciate the beauty of this code.
1. Because most users don't have a good network connection, and 100 threads often breaks it.
2. I don't know what you mean by that.
2. I mean, what change did you make between 7.0.1 and 7.0.2? Because it's SO much faster now.
I thought you made some changes to how it parses links.
I was just curious.