
expired domains

Hi, is it possible to add a feature to scrape articles from expired domains?

The expired domains would be specified by the user.


For example, the method used by the Expired Article Hunter program.

Comments

  • SvenSven www.GSA-Online.de
    That can be done already. You have 3 scrapers that would do this task for you, though it requires good proxies.
  • How can I do this?

    Where do I put the expired domains?
  • SvenSven www.GSA-Online.de
    Hmm no, it works differently. It queries databases of expired domains with your keywords, gets the data, and parses the Web Archive for the lost content.

    You can try importing your expired domains with...
    https://web.archive.org/web/20170101202654if_/http://DOMAIN
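
For anyone wanting to build those URLs in bulk, here is a minimal Python sketch. It assumes a plain-text domains.txt with one bare domain per line (both file names are placeholders, not anything GSA requires); the if_ modifier in the timestamp tells the Wayback Machine to serve the raw archived page without its toolbar.

    # Build Wayback Machine "if_" URLs from a plain list of expired domains.
    # Assumes domains.txt holds one bare domain per line, e.g. example.com
    TIMESTAMP = "20170101202654"  # snapshot date reused from the example URL above

    with open("domains.txt") as f:
        domains = [line.strip() for line in f if line.strip()]

    with open("wayback_urls.txt", "w") as out:
        for domain in domains:
            # if_ serves the archived page as-is, without the Wayback toolbar
            out.write(f"https://web.archive.org/web/{TIMESTAMP}if_/http://{domain}\n")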


  • I mean adding a new method: you add an expired domain and the program extracts the articles from it.

  • I use Scrapebox for that, but lately I find the reliability of web.archive.org terrible to useless.
  • SvenSven www.GSA-Online.de
    The latest update allows you to import a list of expired domains as custom source.
  • I tried this domain:

    https://web.archive.org/web/20181231000000if_/internetguruhub.com/


    But no article was extracted:

    [18:09:17] Starting "Scraping Articles"...
    [18:09:18] Starting "Removing duplicate content"...
    [18:09:18] Starting "Filtering Content"...
    [18:09:18] Starting "Generating Articles"...
    [18:09:18] SameArticle: extracting paragraphs and titles...
    [18:09:18] Amount of words from all data sets: 0
    [18:09:18] Sorry, not enough content to create any article. Try to select more sources or use more keywords.
    [18:09:18] Finished.
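
One way to rule out an empty snapshot before importing a domain is archive.org's public Wayback availability API. A rough Python sketch, with the domain and timestamp taken from the post above:

    # Check whether the Wayback Machine actually holds a snapshot for a
    # domain before feeding it to the scraper.
    import json
    import urllib.request

    def closest_snapshot(domain, timestamp="20181231"):
        url = (f"https://archive.org/wayback/available"
               f"?url={domain}&timestamp={timestamp}")
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        snap = data.get("archived_snapshots", {}).get("closest")
        return snap["url"] if snap and snap.get("available") else None

    print(closest_snapshot("internetguruhub.com"))  # snapshot URL, or None

If this prints None, the Wayback Machine simply has nothing usable for that domain and date, and no scraper setting will change that.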
  • SvenSven www.GSA-Online.de
    it was not even downloading here!?
  • SvenSven www.GSA-Online.de
    I found the problem and will optimize it for next update.
  • I've updated, but still the same problem:

    [22:03:03] Starting "Scraping Articles"...
    [22:03:03] Starting "Removing duplicate content"...
    [22:03:03] Starting "Filtering Content"...
    [22:03:03] Starting "Generating Articles"...
    [22:03:03] MixSentence: extracting sentences and titles from 0 data sets...
    [22:03:03] Amount of words from all data sets: 0
    [22:03:03] Sorry, not enough content to create any article. Try to select more sources or use more keywords.
    [22:03:03] Starting "Inserting Spin syntax"...
    [22:03:03] Finished.
  • SvenSven www.GSA-Online.de
    is that custom source actually enabled? Can you show a screenshot?
  • SvenSven www.GSA-Online.de
    please send me the project backup. I don't get why it is not even scraping for you.
  • SvenSven www.GSA-Online.de
    You need to have keywords added to the project... you didn't have that, so of course it was unable to extract anything based on your keywords.
  • Thank you

    Can articles be scraped without keywords?
  • SvenSven www.GSA-Online.de
    no, because the text to extract should have some relation to the keyword.
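
To see why, here is a toy illustration of keyword-relevance filtering; it is purely hypothetical and not GSA's actual code. Paragraphs with no keyword match are dropped, so an empty keyword list produces exactly the "0 data sets" seen in the logs above.

    # Toy sketch (not GSA's implementation): keep only paragraphs that
    # mention at least one of the project's keywords.
    def filter_paragraphs(paragraphs, keywords):
        keywords = [k.lower() for k in keywords]
        return [p for p in paragraphs
                if any(k in p.lower() for k in keywords)]

    paragraphs = ["Guru tips for internet marketing.", "Unrelated footer text."]
    print(filter_paragraphs(paragraphs, []))            # [] -- nothing matches
    print(filter_paragraphs(paragraphs, ["internet"]))  # keeps the first paragraph

This is also why very common keywords such as "the" match nearly every paragraph, which is the workaround suggested below.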
  • I am not targeting keywords, because I want the largest possible number of articles to use for building backlinks.
  • SvenSven www.GSA-Online.de
    Well, use some common words then, like "the" or "a".
  • Sven said:
    Hmm no, it works differently. It queries databases of expired domains with your keywords, gets the data, and parses the Web Archive for the lost content.

    You can try importing your expired domains with...
    https://web.archive.org/web/20170101202654if_/http://DOMAIN



    So, to get the articles from expired domains, I have to insert "https://web.archive.org/web/20170101202654if_/http://DOMAIN".

    Would that be it, or is there a more accurate way?

