Suggestion to increase overall performance
Kaine
thebestindexer.com
Hello @Sven,
I am testing GSA PI because it looks promising for the future.
Why not download all the HTML from the URLs at full bandwidth first, and then process the pages in a second pass afterwards? (The pages could be kept in RAM if there is room, possibly compressed.)
This would avoid juggling threads and the bandwidth limit at the same time. Mixing the two causes performance issues. If the work were done in two passes, each one fully using its own resource (1: download, 2: analysis), it would probably run much faster. Currently, the network connection is underutilized.
There could be one setting for the number of download threads and another for the number of threads doing the heavy analysis.
I'm sure there is a lot of time to be gained.
EDIT
During the download phase, the CPU can compress the HTML directly in RAM. Postponing the analysis also makes it possible to choose the order in which the pool is processed (for example, by the size of the compressed file).
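To make the idea concrete, here is a rough sketch in Python of the two passes. It is only an illustration of the suggestion, not how GSA PI works internally. The names `fetch_html`, `download_phase`, `analysis_phase`, the thread counts and the 10-second timeout are all my own assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_html(url):
    """Download one page. Hypothetical helper; the timeout is an assumption."""
    with urlopen(url, timeout=10) as resp:
        return resp.read()


def download_phase(urls, fetch=fetch_html, download_threads=64):
    """Pass 1: fetch every URL with many threads and keep the
    zlib-compressed HTML in RAM, keyed by URL."""
    def fetch_and_compress(url):
        return url, zlib.compress(fetch(url))

    with ThreadPoolExecutor(max_workers=download_threads) as pool:
        return dict(pool.map(fetch_and_compress, urls))


def analysis_phase(pool_in_ram, analyze, analysis_threads=4, smallest_first=True):
    """Pass 2: decompress and analyze the stored pages, ordered by
    compressed size, with a separate thread count."""
    ordered = sorted(
        pool_in_ram.items(),
        key=lambda item: len(item[1]),
        reverse=not smallest_first,
    )

    def run(item):
        url, blob = item
        return url, analyze(zlib.decompress(blob))

    with ThreadPoolExecutor(max_workers=analysis_threads) as pool:
        # map() returns results in the same order as `ordered`
        return list(pool.map(run, ordered))
```

Usage would be something like `results = analysis_phase(download_phase(urls), analyze=my_detector)`, where `my_detector` stands for whatever heavy analysis is run on the page. In this sketch a failed download raises and stops the pass, so real code would need error handling. In Python a process pool would suit CPU-heavy analysis better than threads, but the point is only that the two passes get their own independent settings.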