Expired article hunter alternative?

Veronique89 · December 2019

Hi! I have been looking for months for a good replacement for Expired Article Hunter. This bot scrapes articles from expired domains. However, the dev stopped updating it. Does anyone know another way or bot to scrape articles from expired domains?

Thanks in advance!
X

Sven · December 2019

GSA Content Generator can do it.

Veronique89 · December 2019

Oh, I didn't know that. In Expired Article Hunter you can upload expired domains and then it scrapes all the articles from them. How does that work with GSA Content Generator, please? Is there a trial? Thanks for your reply!

Sven · December 2019

Simply choose ExpiredDomains as a source and that's it.

Veronique89 · December 2019

Thanks so much! Can I also import my own domains or is that impossible (last question - apologies!)?

Sven · December 2019

No, but you can import the URLs from archive.org.
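One way to collect such URLs is the Wayback Machine's CDX index. The sketch below only builds the query URL for one domain and turns a JSON response into snapshot links. The parameter names follow the public CDX server documentation as far as I know them, `example.com` is a placeholder, and how the resulting list is then imported into GSA Content Generator is not covered in this thread.

```python
import json
from urllib.parse import urlencode

# Public Wayback Machine CDX endpoint (assumed unchanged; check before relying on it).
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, limit=1000):
    """Return a CDX query URL listing archived pages of `domain`."""
    params = {
        "url": domain,
        "matchType": "domain",        # include subdomains as well
        "output": "json",
        "fl": "timestamp,original",   # only the fields needed below
        "filter": "statuscode:200",   # skip redirects and errors
        "collapse": "urlkey",         # one row per distinct URL
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(cdx_json_text):
    """Turn a CDX JSON response (first row is the header) into snapshot links."""
    if not cdx_json_text.strip():
        return []
    rows = json.loads(cdx_json_text)
    if not rows:
        return []
    header, *data = rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [f"https://web.archive.org/web/{row[ts]}/{row[orig]}" for row in data]


if __name__ == "__main__":
    # Fetch this URL with any HTTP client, then pass the body to snapshot_urls().
    print(build_cdx_query("example.com", limit=50))
```

Saving the output of `snapshot_urls` one link per line gives a plain URL list that can be imported afterwards.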

Veronique89 · December 2019

Sven said:

Simply choose ExpiredDomains as a source and that's it.

Seriously last question - so sorry! Do they come from expireddomains.net and are the imports unlimited? Because in Expired Article Hunter the maximum number of domains is 9999, and some keywords on expireddomains.net have more than 50,000 domains. Thanks so much for your reply!

Sven · December 2019

It's limited to 10 pages... however, that can be changed in the script if you edit it manually.

Other sources for expired domains are also included, like SEDO or PEEW.

Veronique89 · December 2019

Could you please share how I can manually edit the limit?

Sven · December 2019

Edit the following file with e.g. Notepad...

c:\users\<login>\appdata\roaming\gsa content generator\scraper\article\<script>.ini

You will see comments for each entry as well.
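Before editing, it can help to see which script files exist and to keep a backup. The sketch below assumes `%APPDATA%` resolves to `c:\users\<login>\appdata\roaming`, which is the Windows default, and it does not guess the name of the page-limit entry: read the comments inside the file for that. The function names are made up for this example.

```python
import os
import shutil
from pathlib import Path


def scraper_script_dir(appdata=None):
    """Folder holding the article scraper scripts, per the path given above."""
    base = Path(appdata or os.environ.get("APPDATA", ""))
    return base / "gsa content generator" / "scraper" / "article"


def list_scripts(folder):
    """Return the names of all .ini scripts in `folder`, sorted."""
    folder = Path(folder)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.ini"))


def backup_script(path):
    """Copy a script to '<name>.ini.bak' before editing and return the copy's path."""
    path = Path(path)
    target = path.with_name(path.name + ".bak")
    shutil.copy2(path, target)
    return target


if __name__ == "__main__":
    folder = scraper_script_dir()
    for name in list_scripts(folder):
        print(folder / name)
```

With a backup in place, a bad edit can be undone by copying the `.bak` file back over the script.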

Veronique89 · December 2019

Hi Sven, I did everything as you suggested. However, when I select only expired domains and Wayback as sources, the content generator says there is not enough content to create articles. I know that on expireddomains.net my keyword has more than 10,000 domains. Can you maybe tell me what I am doing wrong? Thanks in advance!

Sven · December 2019

Do you use proxies? They seem to ban you very quickly if you don't.

Veronique89 · December 2019

Apologies for the late reply!

Yes, I am using private proxies, and when multiple sources are checked, it creates articles without any problem. However, when I only check expireddomains, it says there is not enough content to create any article. Do you maybe have any idea what I am doing wrong?

What I actually want is the following. For example: my main keyword is curtains. I want to scrape any article from the Wayback archive that has the word curtains in it (no matter if it is in the URL or not). So simply said: as much expired content for the keyword as possible.

Would there be a way to help me out? If necessary: paid.

Thanks so much for your help! It is so much appreciated!

Sven · December 2019

Well, what does the current log look like when you let it search only the mentioned sources?

Veronique89 · December 2019

Hereby the log:

[12:31:29] Starting "Scraping Articles"...

[12:31:37] Last page: 1 | Results: 0 | URL: https://www.expireddomains.net/domainnamesearch/?q=xanax&start=0

[12:31:37] Starting "Removing duplicate content"...

[12:31:37] Starting "Filtering Content"...

[12:31:37] Starting "Generating Articles"...

[12:31:37] MixSentence: extracting sentences and titles...

[12:31:37] Sorry, not enough content to create any article. Try to select more sources or use more keywords.

[12:31:37] Starting "Inserting Spin syntax"...

[12:31:37] Finished.
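A log like this can be checked automatically for sources that returned nothing. The sketch below is matched only against the `Last page: ... | Results: ... | URL: ...` line shown above; other log line shapes from the tool are unknown here and are simply ignored.

```python
import re

# Pattern derived from the single "Last page" line in the log above.
LOG_LINE = re.compile(
    r"^\[(?P<time>\d{2}:\d{2}:\d{2})\] "
    r"Last page: (?P<page>\d+) \| "
    r"Results: (?P<results>\d+) \| "
    r"URL: (?P<url>\S+)$"
)


def empty_sources(log_text):
    """Return the URLs of all scraped pages that reported zero results."""
    urls = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and int(match.group("results")) == 0:
            urls.append(match.group("url"))
    return urls


if __name__ == "__main__":
    sample = (
        '[12:31:29] Starting "Scraping Articles"...\n'
        "[12:31:37] Last page: 1 | Results: 0 | URL: "
        "https://www.expireddomains.net/domainnamesearch/?q=xanax&start=0\n"
    )
    for url in empty_sources(sample):
        print("no results from:", url)
```

Zero results on the very first page usually points at the source itself (a block or an outage) rather than at the keyword, so it is worth opening the reported URL in a browser.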

Sven · December 2019

Thanks. Looks like they are blocking you... I will try to debug it, but it might have to wait until the holiday season is over since I'm not in the office.

Veronique89 · December 2019

Sure, not a problem.

710fla · December 2019

Have you tried changing proxies or scraping without proxies?

You might need rotating proxies to avoid an IP ban.

Veronique89 · December 2019

Yes, I tried both without proxies and even with rotating proxies, but without any luck. Thanks for your help!

Veronique89 · December 2019

Please don't forget me.

Best wishes for 2020!

Sven · December 2019

I have it on my to-do list, don't worry... though it has to wait until I'm back in the office.

Veronique89 · December 2019

Sure, not a problem. All the best!

Sven · January 2020

I tried it now that I'm back in the office...

It seems I can't get a single result in the browser now either (with or without proxies). However, I'm online over a VPN, so I can't tell for sure whether it's a ban or the site is malfunctioning.

Veronique89 · January 2020

Yes, I noticed this too. I think they are working on expireddomains.net. It happens from time to time, and it usually works again the next day.

Thanks for your help! Appreciated.

Veronique89 · January 2020

Hi Sven. It's me again (apologies)! I am aware now that GSA Content Generator scrapes from expireddomains and other sites, but do you know if there is a scraper that can search archive.org by keyword only (i.e. without the keyword having to be in the URL)? Thanks for helping me out. Please let me know if I need to pay extra for your services.
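Part of this can be done after downloading: once archived pages have been fetched, each one can be kept or dropped depending on whether the keyword occurs in its visible text, regardless of the URL. The sketch below is only that filter step, using the Python standard library. It does not search archive.org itself, and the helper names are made up for this example.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def visible_text(html):
    """Return the page text with markup removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def mentions_keyword(html, keyword):
    """True if `keyword` occurs as a whole word in the visible text (case-insensitive)."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, visible_text(html), re.IGNORECASE) is not None


if __name__ == "__main__":
    page = "<html><body><h1>Blackout Curtains</h1><p>Buying guide</p></body></html>"
    print(mentions_keyword(page, "curtains"))
```

Whole-word matching is a deliberate choice here: it avoids hits inside longer words, at the cost of missing other forms of the keyword such as the singular "curtain".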

Sven · January 2020

I can try adding a separate version for archive.org... though you won't know whether a found site is from an expired domain or not.

Veronique89 · January 2020

Sven said:

I can try adding a separate version for archive.org... though you won't know whether a found site is from an expired domain or not.

I understand the issue, but I don't mind. What would be the charge for that?

wolfvanween · January 2020

I'll jump in here, if you allow, apologies @Veronique89 - but I believe it's better to have only one discussion about expired articles.
@Sven is it possible that there is some error in the expired article sources when you switch language?
I have just had the program run for a few hours searching for German expired articles, with a long list of finance-related (German) keywords. While I watched the log I saw a lot of web.archive requests being fulfilled, yet the end result is 0 articles.

wolfvanween · January 2020

And if I switch the language to English (with obviously German keywords), it finds a lot of articles, which however do seem to be mostly English...
But now web.archive rejects all my requests (despite me using over 1000 private proxies). Do they have a way to know?

Sven · January 2020

@wolfvanween please send me a log for this.
