Archive.org article scraper

I know the Content Generator can already go to expireddomains.net, look up domains from the keywords provided, and then go to archive.org and attempt to scrape them.

But say I already have a list of expired domains. Can I feed this list into GSA Content Generator and have it search archive.org for them?
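To make the request concrete, here is a minimal Python sketch of what "feed a list of domains and search archive.org" could look like outside of the Content Generator. It is not a GSA feature and not how the tool works internally. It assumes the public Wayback Machine availability endpoint (`https://archive.org/wayback/available?url=...`) and its JSON layout with `archived_snapshots` and `closest`; the function names `load_domains`, `availability_url`, `closest_snapshot` and `lookup` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public Wayback Machine availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def load_domains(lines):
    """Turn raw lines (e.g. from a txt file) into a de-duplicated domain list."""
    seen, domains = set(), []
    for line in lines:
        domain = line.strip().lower()
        if not domain or domain.startswith("#"):
            continue  # skip blank lines and comments
        # Drop an optional scheme and anything after the host name.
        domain = domain.split("://", 1)[-1].split("/", 1)[0]
        if domain not in seen:
            seen.add(domain)
            domains.append(domain)
    return domains


def availability_url(domain):
    """Build the availability query URL for one domain."""
    return WAYBACK_API + "?" + urlencode({"url": domain})


def closest_snapshot(payload):
    """Return the closest snapshot URL from a decoded response, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(domain, timeout=10):
    """Query archive.org for one domain (performs a network request)."""
    with urlopen(availability_url(domain), timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))
```

A list could then be processed with something like `for d in load_domains(open("domains.txt")): print(d, lookup(d))`, where `domains.txt` is a placeholder file name. Re-reading the file on each pass would approximate the "dynamic txt file" idea.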

Comments

  • 0
    Sven www.GSA-Online.de
    No, that's not possible right now.
  • 0
    Would this be hard to add as a new feature? It could even run in parallel with the expired domain crawlers if it could read from a dynamic txt file, etc.
  • 0
    I am also interested in this feature.
  • 0
    I have a problem with expireddomains.net. When I have a German-language project with German keywords, I get this error message for all articles:

    Unwanted language "en" detected for http://web.archive.org/web/20171030052420...
  • 0
    Sven www.GSA-Online.de
    You also wrote me an email about this. The lang="en" is part of the websites in the log you sent. Some have lang="de". If no detection is possible from the page, CG detects the language by domain/IP instead, which is of course a problem on this site.
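The behaviour described above can be illustrated with a small Python sketch. It is only an approximation under stated assumptions, not CG's actual code: it reads the `lang` attribute of the `<html>` tag and, if that is missing, falls back to a guess from the host's top-level domain. The `TLD_LANG` table and the names `_LangFinder` and `detect_language` are hypothetical, and the IP-based part of the fallback is not modelled.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny TLD-to-language table for illustration.
TLD_LANG = {"de": "de", "at": "de", "uk": "en", "us": "en"}


class _LangFinder(HTMLParser):
    """Records the lang attribute of the first <html> tag, if any."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        if tag == "html" and self.lang is None:
            value = dict(attrs).get("lang")
            if value:
                # "de-DE" and "de" both count as "de".
                self.lang = value.split("-")[0].strip().lower()


def detect_language(html, host):
    """Prefer the page's own lang attribute, else guess from the host's TLD."""
    finder = _LangFinder()
    finder.feed(html)
    if finder.lang:
        return finder.lang
    tld = host.rsplit(".", 1)[-1].lower()
    return TLD_LANG.get(tld)  # None when the TLD says nothing useful
```

With this sketch, a German page that declares `lang="en"` is reported as English regardless of its text, and a page without a `lang` attribute served from `web.archive.org` gets no useful hint from the host, since the host is the archive rather than the original domain.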