Indexer speed improvement suggestion
Watching how the indexer submits each URL from the queue and how it uses multithreading, I see room for improvement. Indexing speed doesn't improve much as the number of threads goes up, and I think that's because it doesn't make full use of the multithreading support. It goes through the URLs in the queue one by one and uses the configured number of threads to submit each one. So let's say I want to index a URL by submitting it to 100 pages while running 1000 threads. Obviously 900 threads are wasted and do nothing. Those threads could be working on the next URLs in the queue instead of waiting while the URLs are processed one at a time. If you could change the system to work like the platform identifier, where each thread can work on a different URL, it would improve the speed many times over (in this example about ten URLs could be in progress at once), and I wouldn't have to watch my queue of 100k+ URLs sit there forever.
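To make the idea concrete, here is a rough sketch in Python of what I mean. It is not how the tool is actually written. The names `submit`, `index_all` and `max_threads` are made up for illustration, and `submit` is only a placeholder for the real HTTP submission. The point is that every (URL, indexer page) pair becomes its own task in one shared pool, so no thread sits idle while there is still work in the queue.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def submit(url, indexer):
    # Hypothetical placeholder: the real tool would send an HTTP request here.
    return (url, indexer)


def index_all(urls, indexers, max_threads=1000, submit_fn=submit):
    # Flatten the work: each (url, indexer) pair is an independent task,
    # instead of finishing one URL completely before starting the next.
    tasks = product(urls, indexers)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # Any free thread picks up the next pair, whichever URL it belongs to.
        # Results come back in the same order as the tasks.
        return list(pool.map(lambda pair: submit_fn(*pair), tasks))


if __name__ == "__main__":
    done = index_all(["url-1", "url-2", "url-3"], ["page-a", "page-b"], max_threads=4)
    print(len(done), "submissions finished")
```

With 100 pages per URL and 1000 threads, roughly ten URLs would be in progress at the same time. For a really big queue the tasks would probably need to be fed in batches, since this sketch hands all pairs to the pool at once.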
If flooding the indexer sites with URLs at such a fast rate wouldn't be a problem, why not change the system to make indexing faster?
Comments