
Barely Working To Identify "Extended Engines" [99% CPU Usage]

GSA Platform Identifier flies through sorting SER engines, but as soon as I enable "Extended Engines" sorting, CPU usage goes to 99%, my VPS slows right down, and it works at 1/20th the pace or slower.

I'm pretty sure there's a bug in the code, unless these extended engines really are that much harder to identify?


  • Sven
    You don't need that for SER at least.
  • No, you're right. I'm not that big on SER submissions, however. I use GSA Platform Identifier primarily for market research, competitive analysis, and organizing crawled links.
    Is it possible to get this bug fixed, if it's easy enough?
    If it's a lot of work where you'd need to re-code a lot, I can shrink my sample size and continue at this slow pace...
  • Sven
    Well, I can't find any bug in the identification part. Maybe one of your URLs has a very large content size, and that is causing this?
  • No, it wasn't the content size (I cap that at 5 MB or 15 MB).

    I've narrowed it down:
    The more engines I select, the more CPU it uses.

    With all engines selected, the CPU usage gets so high that it is almost unusable.
    When I select fewer engines, the system can process in a timely manner.

    I can work around this. I don't need every detail about each website, only a minimal subset.

    It might be helpful to add a note for users: [more engines = more CPU intensive]
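
For anyone wondering why the engine count would matter this much, here is a rough sketch of the scaling described above. It assumes identification works by testing each page against the signature patterns of every selected engine; that is only a guess at the general approach, not GSA's actual code. The engine names, patterns, and the `identify` helper are all made up for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical signatures; the real engine definitions are not shown in this thread.
ENGINES: Dict[str, List[str]] = {
    "WordPress": [r"wp-content", r"wp-includes"],
    "Drupal": [r"Drupal\.settings", r"sites/default/files"],
    "phpBB": [r"Powered by .{0,40}phpBB"],
}


def compile_engines(engines: Dict[str, List[str]]) -> Dict[str, List[re.Pattern]]:
    """Compile every pattern once so the per-page work is matching only."""
    return {
        name: [re.compile(p, re.IGNORECASE) for p in patterns]
        for name, patterns in engines.items()
    }


def identify(
    html: str, compiled: Dict[str, List[re.Pattern]], selected: List[str]
) -> Tuple[List[str], int]:
    """Return (matched engine names, number of pattern checks performed)."""
    matches: List[str] = []
    checks = 0
    for name in selected:
        for pattern in compiled[name]:
            checks += 1
            if pattern.search(html):
                matches.append(name)
                break  # one hit is enough for this engine
    return matches, checks


compiled = compile_engines(ENGINES)
page = '<link href="/wp-content/themes/x/style.css">'

one_engine = identify(page, compiled, ["WordPress"])
all_engines = identify(page, compiled, list(ENGINES))
print(one_engine)   # (['WordPress'], 1)
print(all_engines)  # (['WordPress'], 4)
```

Under this assumption the result is the same either way, but the number of pattern scans per page grows with every engine selected, and each scan walks the whole page content. With hundreds of engines and pages of several MB, that alone could account for the CPU load without any bug being involved.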