
Checking metrics after successful identification?

muchcontent Europe
edited October 2016 in GSA Platform Identifier
I just bought Platform Identifier and Proxy Scraper. I'm trying to emulate PR filtering when identifying a list in Platform Identifier by using Majestic metrics. I started with 300 proxies that were working for Majestic checks, but with the default thread count, 100 of them burned out in 3 minutes. So I set the thread count for this project in Platform Identifier to 10, which means the project of about 1 million URLs will take about a month to complete. The proxies are still slowly dying off, though. So I have a few questions about this:

1. Does Platform Identifier check Majestic metrics before or after the URL has been identified with a platform? If the answer is before, is there a way to make the metric filtering start only after the URL has been identified, to save proxies from burning out? There is no point in getting metrics for URLs that will not be used anyway.

2. I'm getting a lot of error messages in Proxy Scraper during the metric checking. What happens when it fails to retrieve metrics for a URL and a PR filter is set in Platform Identifier? Will the URL be discarded?

3. Is there any way to increase the speed of this whole process without buying 500 private proxies?

4. Is 15% a normal amount of recognized/identified links from a footprint scraped list?
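For reference, the figures above work out to a very low request rate. A rough back-of-the-envelope sketch in Python, assuming a 30-day month and that the 10 threads share the work evenly (both are assumptions, and the function names are made up for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_rate(total_urls: int, days: float) -> float:
    """URLs per second needed to process total_urls within the given days."""
    return total_urls / (days * SECONDS_PER_DAY)


def days_to_finish(total_urls: int, urls_per_second: float) -> float:
    """Days needed to process total_urls at a steady rate."""
    return total_urls / urls_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = required_rate(1_000_000, 30)  # about 1 million URLs in about a month
    print(f"overall:    {rate:.3f} URLs/s")
    print(f"per thread: {rate / 10:.4f} URLs/s at 10 threads")
```

So at 10 threads the project is moving at well under one URL per second overall.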

Comments

  • s4nt0s Houston, Texas
    edited October 2016
    1) It runs the identification process first, before it starts checking Majestic, Moz, etc.

    2) The URL will be unrecognized.

    3) The Metrics Scanner in the Proxy Scraper relies on proxies to work properly, so adding more proxies is the only thing I can think of. :/

    4) It depends on how targeted the footprints are and which ones you're using, but it can be normal. You can try enabling the "deep matching" option if you want slightly improved detection. It will use more CPU resources, though.
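To illustrate answers 1 and 2, here is a minimal Python sketch of the order of operations described above: identify first, look up metrics only for identified URLs, and do not keep a URL whose lookup fails. This is not Platform Identifier's actual code. `filter_urls`, `identify`, `fetch_metric`, `min_metric` and `MetricLookupError` are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class MetricLookupError(Exception):
    """Hypothetical error raised when a metric lookup fails (e.g. a dead proxy)."""


def filter_urls(
    urls: Iterable[str],
    identify: Callable[[str], Optional[str]],  # returns a platform name or None
    fetch_metric: Callable[[str], float],      # may raise MetricLookupError
    min_metric: float,
) -> Tuple[List[Tuple[str, str]], int]:
    """Return the kept (url, platform) pairs and the number of metric lookups made."""
    kept: List[Tuple[str, str]] = []
    lookups = 0
    for url in urls:
        platform = identify(url)
        if platform is None:
            continue  # not identified: no metric lookup, so no proxy is used
        lookups += 1
        try:
            value = fetch_metric(url)
        except MetricLookupError:
            continue  # lookup failed: URL is not kept (assumed reading of answer 2)
        if value >= min_metric:
            kept.append((url, platform))
    return kept, lookups
```

Under this ordering, a list with roughly 15% identified URLs would only need metric lookups for about 150,000 of 1 million URLs, since unidentified URLs never reach the metrics step.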