
Can GSA Email Spider hide this?

Hi guys,

I have an important question.

I want to run the program to scrape emails, but I do not want those websites to know my identity, IP address, and so on.

Is there any way to run GSA anonymously? I don't have a VPN.

Comments

  • You can use proxies to stay anonymous
  • Would you mind guiding me on how to use proxies with GSA?

    I am very much a newbie.

    Thanks.
  • @chuck When you open the Email Spider, select the options tab below the start button, then go to the proxy tab there and configure your proxies.
  • How reliable are automatic email scrapers such as GSA?

    I have two websites to crawl: one is a discussion forum and the other is an online marketplace.

    Will GSA crawl every single link on them? When I crawl manually (which takes 12 hours), I end up with more emails than the automated (GSA) programs give me, even after removing all the duplicates.

    Why is that? Will GSA crawl the whole website to make sure not a single email is missed?
  • Sven www.GSA-Online.de

    1. GSA Email Spider (GSA = company) does not add any duplicates to the list.

    2. It crawls depending on your settings. If you crawl all sublinks it can take a long time, so you should limit it to something like 3 sublinks.

    3. You can improve things dramatically if you use %nr% placeholders in the URL to stand for e.g. database IDs or the like.


    In any case, GSA Email Spider can crawl the whole website. It all depends on your settings.

  • So you mean GSA Email Spider CAN crawl the whole website to make sure not a single email is missed?

    I will be impressed if it really does.

    About the sublinks, what does that mean? If I set it higher, wouldn't it crawl more?

    Example: www.email.com/

    I set it to crawl only email.com, that is, the entire website starting from email.com.

    Is that correct?
  • Sven www.GSA-Online.de

    How deep you crawl a site is set by "How deep to parse this site"

    http://docu.gsa-online.de/email_spider/methods_to_collect_emails#how_deep_to_parse_this_site


    When defining a custom level, count the clicks you need to make in a browser to reach the page where the email is.
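
To make the "count the clicks" rule concrete, here is a minimal Python sketch of a depth-limited crawl over a made-up link map. It is only an illustration of the idea, not how GSA Email Spider works internally; the page names in `LINKS` and the function `pages_within_depth` are hypothetical.

```python
from collections import deque

# Hypothetical site map: each page and the pages it links to.
LINKS = {
    "home": ["category", "about"],
    "category": ["listing"],
    "listing": ["contact-popup"],
    "about": [],
    "contact-popup": [],
}


def pages_within_depth(start, max_depth):
    """Breadth-first walk that stops following links after max_depth clicks."""
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depth_of[page] == max_depth:
            continue  # pages at the depth limit are visited but not expanded
        for target in LINKS.get(page, []):
            if target not in depth_of:
                depth_of[target] = depth_of[page] + 1
                queue.append(target)
    return depth_of


shallow = pages_within_depth("home", 2)
deep = pages_within_depth("home", 3)
print(sorted(shallow))
print(sorted(deep))
```

In this toy map the email sits on "contact-popup", three clicks from "home", so a depth of 2 never reaches it while a depth of 3 does.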

  • Hmmm...

    Okay, let me give a clearer explanation of what I am facing.

    There is a main page with many links on it, and each linked page has emails on it.

    I am sure it can crawl that easily without any problems.

    BUT the layout of the website is such that at the bottom of the main page there is: Page 1 | Page 2 >>

    After clicking Page 2 >>, you see Page 2 | Page 3 >> and so on. Total pages: 1000

    How do I make it crawl that far? I don't know whether you understand what I am saying... hehe
  • These websites also have pop-up pages: you only see the email after clicking a link that opens a pop-up.

    Here are some examples (not the actual site, but something similar):

    http://www.freeads.co.uk/
    http://www.jobstreet.co.in/
    http://www.vivastreet.co.uk/

    Can GSA crawl this kind of complicated site?
  • And the URLs look like this:

    http://www.website.com/?classifieds:food&PAGE=1
    http://www.website.com/?classifieds:food&PAGE=2
    http://www.website.com/?classifieds:food&PAGE=3

    There are 1000 pages.

    How do I configure the settings to make sure I grab 100%, and I mean 100%, of the emails?
  • OK, good guide, thanks Sven.

    ANOTHER thing I need help with:

    http://www.website.com/?classifieds:food&PAGE=%nr%

    http://www.website.com/?classifieds:car&PAGE=%nr%

    http://www.website.com/?classifieds:jobs&PAGE=%nr%

    Notice that on the same domain, website.com, there are many categories such as jobs, cars and so on.

    If I set the URL to http://www.website.com/?classifieds:food&PAGE=%nr%

    I notice that the email spider crawls over into the other categories as well.

    Is there any solution to this?
  • Sven www.GSA-Online.de
    Yes, set it to only parse that internal link and no externals at all.
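
As a rough illustration of what "stay inside one category" means for the example URLs in this thread, the Python sketch below keeps only links that share the start URL's host, path, and first query item. This is an assumption about the URL layout shown above, not a description of GSA's setting; `same_listing` and `www.other-site.com` are hypothetical.

```python
from urllib.parse import urlsplit


def same_listing(start_url, link):
    """True if link has the same host, path, and first query item as start_url."""
    a, b = urlsplit(start_url), urlsplit(link)
    first_a = a.query.split("&")[0]  # e.g. "classifieds:food"
    first_b = b.query.split("&")[0]
    return (a.netloc, a.path, first_a) == (b.netloc, b.path, first_b)


start = "http://www.website.com/?classifieds:food&PAGE=1"
found = [
    "http://www.website.com/?classifieds:food&PAGE=2",     # same category
    "http://www.website.com/?classifieds:car&PAGE=1",      # other category
    "http://www.other-site.com/?classifieds:food&PAGE=1",  # external site
]
kept = [u for u in found if same_listing(start, u)]
print(kept)
```

Only the first link in `found` passes the filter; the other category and the external site are dropped.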
  • Is there a shortcut?

    I mean for http://www.website.com/?classifieds:food&PAGE=%nr%

    If I have 1000 pages, does that mean I have to type it 1000 times?


    http://www.website.com/?classifieds:food&PAGE=%1%......
    ............................

    http://www.website.com/?classifieds:food&PAGE=%1000%......

    Can I enter 1 to 1000 or something?

    Please guide me if you understand my question.
  • Can you show me an example of the URL I need to type into WordPad?
  • Sven www.GSA-Online.de

    Please just try it. If you had entered "http://www.website.com/?classifieds:food&PAGE=%nr%" in "Use URL as Start" even once and hit START, you would have seen a popup asking you how to fill %nr%.

    So no, you don't have to write 10000 URLs into a text file that you then import. That's the whole point of this.
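
For readers who want to see what a numeric placeholder amounts to, here is a small Python sketch that expands one %nr% template into the full list of page URLs. It only mimics the idea; how GSA Email Spider fills the placeholder internally is not described in this thread, and `expand_nr` with its `start`, `stop`, and `step` parameters is a hypothetical helper.

```python
def expand_nr(template, start, stop, step=1):
    """Return one URL per number from start to stop (inclusive), replacing %nr%."""
    if "%nr%" not in template:
        return [template]  # nothing to expand
    return [template.replace("%nr%", str(n)) for n in range(start, stop + 1, step)]


urls = expand_nr("http://www.website.com/?classifieds:food&PAGE=%nr%", 1, 1000)
print(len(urls))
print(urls[0])
print(urls[-1])
```

One template plus a range of 1 to 1000 yields all 1000 page URLs, so nothing has to be typed by hand.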

  • OK. I have spent an hour or so figuring out all the settings, and this program is VERY GOOD and really does crawl all the emails.

    I have no problems now.

    But my main concerns now are:

    1. What do the websites see us as? Can they tell that I am using GSA Email Spider, or do they see me as a normal internet browser? Can I stay anonymous? Please tell me more about this.

    2. Some websites have hidden emails; if we scan them, we end up on a blacklist.
    Does GSA Email Spider have something to get around this?

    Please answer the 2 questions above. Thank you.
  • Sven www.GSA-Online.de

    1. They see you as a normal browser. You can set in the options which browser they see (it can be randomized or a fixed browser). Being anonymous is only possible when you use proxies, because otherwise your IP is always visible to them. The proxy options are there for that.

    2. Any examples? Of course the program finds even hidden emails. It can also run quick tests to check whether an email is real or fake (see options).
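
Point 1 describes two separate things a website can see: the browser string (User-Agent) and the IP address. The sketch below shows the same two knobs in plain Python with the standard library, as a general illustration rather than anything taken from GSA Email Spider. The proxy address and the User-Agent strings are placeholder assumptions, and `make_opener` is a hypothetical helper; no request is actually sent.

```python
import random
import urllib.request

# Placeholder values (assumptions): substitute a proxy you are allowed to use.
PROXY = "http://127.0.0.1:8080"
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def make_opener(proxy, user_agent=None):
    """Build an opener that routes traffic via proxy and sends a browser-like User-Agent."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    # Fixed browser if user_agent is given, otherwise a random one from the list.
    opener.addheaders = [("User-Agent", user_agent or random.choice(USER_AGENTS))]
    return opener


opener = make_opener(PROXY, USER_AGENTS[0])
print(opener.addheaders)
# opener.open("http://www.website.com/") would now be sent through PROXY.
```

The User-Agent only changes what browser the site thinks it is talking to; it is the proxy that keeps your own IP address out of the site's logs.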
