Correct URL structure to scrape for success/verify

September 2012

I've been scraping guestbooks by their post page, but then I noticed that some engines, Basti for instance, seem to modify the URL like this:

modify url=%targethost%%targetpath%?new_message=1

and then verify here:

verify url=%targethost%%targetpath%

If the URLs I scrape already have new_message=1 in them, is that OK? Also, what exactly defines targetpath? (I assume targethost is the raw domain name.)

Thanks.

September 2012

You don't need to scrape for the message page, but it won't hurt either.

www.example.com/guestbook?new_message=1

%targethost%/%targetpath%?new_message=1
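
To make the mapping in the example above concrete, here is a rough Python sketch of how a scraped URL could be split into a host and a path and then turned back into the modify and verify URLs. It assumes, going only by the example in this thread, that %targethost% is the bare host and %targetpath% is the path without its leading slash or query string; the actual definitions in the engine may differ. The function names split_target and build_urls are hypothetical and not part of any tool.

```python
from urllib.parse import urlsplit


def split_target(url):
    """Split a scraped URL into (host, path), dropping any query string.

    Assumption: host and path stand in for %targethost% and %targetpath%
    as implied by the example in the thread, not by engine documentation.
    """
    if "://" not in url:
        # urlsplit only recognises the host when a scheme is present
        url = "http://" + url
    parts = urlsplit(url)
    return parts.netloc, parts.path.lstrip("/")


def build_urls(url):
    """Return the modify and verify URLs for one scraped guestbook URL."""
    host, path = split_target(url)
    base = f"{host}/{path}"
    return {
        "modify": f"{base}?new_message=1",  # page with the posting form
        "verify": base,                     # page checked after posting
    }


if __name__ == "__main__":
    # A URL scraped with or without new_message=1 gives the same result
    print(build_urls("www.example.com/guestbook?new_message=1"))
    print(build_urls("http://www.example.com/guestbook"))
```

Under these assumptions, a scraped URL ends up at the same modify and verify URLs whether or not it already carries new_message=1, which fits the point that scraping for the message page is unnecessary but harmless.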

September 2012

OK, thanks. What I'm seeing, though, is that the target path differs between sites on the same platform, since there are variations. I do see a few variations handled in the engines, but can they really account for every type that might be found?
