Scraping my own set of targets

Comments

  • loopline autoapprovemarketplace.com
    @Olve1954
      You could also use the custom harvester and do any time span you want. 
  • @squawk1200 and @cherub, thank you for the video information.
  • Was scraping an 80k keyword list with SB for a couple of hours today on an extra server I have. I decided to stop the harvester and boom! SB crashed, as it usually does when handling large keyword lists. Now I don't even know which keywords have been processed. Back to square one, what a waste of time.

    This is exactly why for scraping I recommend Gscraper over Scrapebox.

    /rant
  • Hmm, just look in your harvested links folder and check the files, maybe that helps.
  • jpvr90
    exactly
    I can't believe that the Scrapebox devs have not been listening to hardcore users for the past 2-3 years.

    1- Being able to import proxies in a timely fashion. (Do not tell me custom harvester. There is literally no way the custom harvester can keep up with a million keywords. It crashes at a 100% rate.)
    2- Show us which keywords have/have not been processed, or better yet just auto export them.

    They are too stubborn to implement these 2 very important features.

    Sorry, but unless you have 10k private proxies, scraping with private proxies is gone. And it seems the Scrapebox devs are not hardcore scrapers, because they still don't let us import new proxies during scraping.

    The good news is that a month ago I opened the update add-ons tab and saw 4-5 add-ons greyed out with 64-bit notes next to them.

    I am hoping/believing the 64-bit version will have most of the features we request.
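
One way to avoid losing track after a crash, sketched in Python rather than in anything Scrapebox itself offers: log each keyword to a file as soon as it has been processed, and on restart filter the original list against that log. The file name `done.txt` and the helper names are made up for illustration.

```python
import os


def load_done(done_path):
    """Return the set of keywords already logged as processed."""
    if not os.path.exists(done_path):
        return set()
    with open(done_path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def pending_keywords(keywords, done):
    """Keep keywords not yet processed, in order, dropping blanks and repeats."""
    seen = set()
    pending = []
    for kw in keywords:
        kw = kw.strip()
        if kw and kw not in done and kw not in seen:
            seen.add(kw)
            pending.append(kw)
    return pending


def mark_done(done_path, keyword):
    """Append one keyword to the log and push it to disk right away,
    so a crash loses at most the keyword currently in flight."""
    with open(done_path, "a", encoding="utf-8") as fh:
        fh.write(keyword + "\n")
        fh.flush()
        os.fsync(fh.fileno())
```

On restart you would call `pending_keywords(all_keywords, load_done("done.txt"))` and carry on from there, calling `mark_done` after each keyword finishes.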



  • 1linklist FREE TRIAL Linklists - VPM of 150+ - http://1linklist.com
    Gscraper is the way to go if you really want to get into scraping your own lists.

    A quick, easy, less time-intensive way to throw some together, however, is an ahrefs subscription :)
  • "I guess I should restate in that I used to use scraping for raw urls, now days I have become adept at "tracing" other peoples work and I just find people who are doing great things and utilize scraping to find and examine what they do and dial things in.  I don't need to process raw  billions of urls, I just go see what everyone else accomplished with their hard work and "borrow" the process, combine it with what already know, bounce it off people in the know and dial in. "
    ...This piqued my interest...
    It doesn't have to be an either/or situation. I have both programs, and SB's crashing does infuriate me, but they both have their place.
    @loopline Some of your videos have been of tremendous benefit. Many thanks.
  • @loopline May I ask what proxies you are using for the scraping? Thanks
  • The original poster asked this specific question (which no one seems to have answered):
    "Is there a place in SER where a list like this can be inserted? Is doing this a benefit, or is it just as good to let SER find the sites based on the keywords I gave it? Are people out there using Scrapebox?"

    I have the same question - do I really need to use ScrapeBox to get target sites? It seems GSA does this already, or have I missed something?
  • You don't need a separate scraper. Many of the more experienced users here do scrape the targets themselves because using a different scraper is more efficient.

    It is much more hands-on and time-consuming though. Loopline has a good YouTube channel with lots of information on how to scrape with Scrapebox. A member here, @DonaldBeck, has a good video series on scraping too.

    Do you need to scrape yourself? No, you don't.
  • loopline autoapprovemarketplace.com
    @Don
    I have used all sorts of proxy providers. Right now I am enjoying proxy rack, as well as back connect proxies from reverse proxies (I have these primarily for other purposes, but they work well enough for scraping), and I have a lot of buyproxies.org proxies as well.

    If you set a delay you can use your own IP. I use my own IP on every server I have with a delay of roughly 100 seconds and can get a surprising amount of results.

    @bluenun
    Sven has shown how you can use the "search online for urls" feature under tools to scrape in SER, but the built-in scraper in the projects is pretty slow, IMHO. With Scrapebox you can get things done faster and, I believe, with more flexibility.

    Once Scrapebox 2.0 comes out it will have several distinct advantages over SER for scraping: it can be exponentially faster, and being 64-bit lets you utilize the speed while still working with massive footprint/keyword lists.

    I use Scrapebox to scrape for my SER, but I do plan on testing out SER with some very specific footprints, like a 24 hour scrape for specific things. I think that having that in SER would be great, and you're only scraping it 1 time every 24 hours, so speed is not crucial, accuracy is.

     
  • @loopline That's interesting, thank you for your reply. If you don't mind, can you explain more about how I could use my own IP to scrape with the delay you mention? I don't want to get big lists, just a decent-sized set of relevant targets. I've tried most proxies and some are good, but I'm excited to try the naked route :) Thanks
  • @Don there is a full tutorial on YouTube. Search scrapebox + loopline
  • Is there? OK, will check that. Thanks @icarusVN :)
  • loopline autoapprovemarketplace.com
    @don
    I've lost track of what I cover in all the videos, so it's probably there, but basically in Scrapebox 1.x you choose a delay from the delay drop-down in the lower right-hand quadrant. Choose RND (random).

    Then under settings at the top go to adjust rnd delay range. Then set min to something like 50 and max to 60.

    Then go to settings and uncheck both "use custom harvester" and "use multi threaded harvester".

    Then just load up keywords as normal and scrape Google. It's generally slow enough that you can get a lot done. The only caveat is that the delay only kicks in on pages with results. So if you load in a ton of keywords/footprints that have 0 results it can still wind up going too fast, although generally this is not an issue.

    In 2.0 there is a delay box right in the detailed harvester when it pops up, and I just pick a number like 103 seconds and go with it and it works well. 
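
For anyone scripting this instead of using Scrapebox, the random-delay idea could look roughly like this in Python. This is only a sketch under assumptions: `fetch` is a hypothetical callable that runs one query and returns its results, and the 50-60 second range simply mirrors the settings mentioned for 1.x. Unlike the Scrapebox 1.x behaviour, this version waits after every query, including ones that return 0 results.

```python
import random
import time


def random_delay(min_s=50.0, max_s=60.0):
    """Pick a delay in seconds, uniformly between min_s and max_s."""
    if min_s > max_s:
        raise ValueError("min_s must not exceed max_s")
    return random.uniform(min_s, max_s)


def scrape_with_delay(keywords, fetch, min_s=50.0, max_s=60.0, sleep=time.sleep):
    """Run fetch(keyword) for each keyword in a single thread,
    pausing a random interval between consecutive queries."""
    keywords = list(keywords)
    results = {}
    for i, kw in enumerate(keywords):
        results[kw] = fetch(kw)
        # No need to wait after the final query.
        if i < len(keywords) - 1:
            sleep(random_delay(min_s, max_s))
    return results
```

The `sleep` parameter is only there so the pause can be swapped out when trying the function without actually waiting a minute per keyword.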
  • @loopline will the new 64-bit Scrapebox be a paid update?
  • Don
    edited October 2014
    @icarusVN Scrapebox 2 is a free upgrade
    @loopline Thanks. I ended up watching a lot of your videos last night. Very helpful. I bought SB years ago but never knew the power of it, and I guess a lot of others are like that. Of course there's SB 2.0 soon. Can't wait for that...
  • loopline autoapprovemarketplace.com
    @icarusVN
    The new 64-bit and 32-bit versions of Scrapebox will be free updates.

    @Don
    Yes, 2.0 will bring even more power. It has some really cool features and it's not even done, so I'm sure it will bring more. Plus it lays the framework for future updates, so they can build in more things. Before, it was often messy to add new things; now it's clean and laid out with the future in mind.
  • Can't wait to see what the new version will do
  • satyr85  hi, after removing duplicates, how many links can you get per day?