Importing target URLs

Hi @Sven

I just want to ask: when we import target URLs from a file, does SER save those URLs to the project, or does it just save a "link" to the file?

If SER only saves the link to the file, I'm afraid that if I accidentally rename the target URLs file, SER won't recognize it.

Regards,
Alex

Comments

  • Sven www.GSA-Online.de
    It does not save a link; it physically reads the file and adds its content to the project.
  • @Sven so after importing target URLs from file, I can delete the file, right?
  • Ozz
    edited June 2013
    yes, unless you click "clear url cache".
  • Thank you guys ;)
  • @Ozz, where's the "clear url cache" function?
  • right click project -> modify -> delete target URL cache

    that function deletes all urls that SER has cached and wants to post to.
  • edited June 2013
    @Ozz Thanks

    Would that be used if I change the search engines or update the global site list? Or what's the purpose of this function?

    Since I plan to outsource the scraping to SB I have two additional questions:

    What happens when I import URLs? They get identified and added to the following folder, correct?
    C:\Users\mattcutts\AppData\Roaming\GSA Search Engine Ranker\site_list-identified

    What does "Add URLs from Projects" do? Don't those URLs get added anyway?

  • @komakino If you use Options - Advanced - Import URLs and Identify platforms, those will be saved to the folder you stated above.
    "Add URLs from Projects" is for users who don't check the automatic save option at Options - Advanced.
  • edited June 2013
    @Alex Now I understand, thanks!

    Any chance to have different identified or verified lists? E.g. for a German project I don't want links from websites hosted in USA, China or Japan.

  • edited June 2013
    @komakino Country filter inside Project Option might help you.
  • @Alex Got it, thanks!
  • edited June 2013
    Got a new question and thought I wouldn't open a new thread for it.

    When importing URLs I had a Coppermine gallery site going into the Datso list.

    Therefore I'd like to know: based on what information is the platform/engine identified?
    Does it look at the "page must have" variables?

    And while we're at it, how does this variable work?
    page must have1=Next Image|<form name='commentform'|com_datsogallery
    page must have2=SlideShow:|<form name='commentform'|com_datsogallery
    page must have3=Your Comment|<form name='commentform'|com_datsogallery
    page must have4=Your Name|<form name='commentform'|com_datsogallery
    page must have5=!Next Image+SlideShow
    page must have6=!powered by WordPress
    What does the number increment of the variable mean? "AND"?
    And what does the pipe symbol mean? "AND" or "OR"?

  • Ozz
    edited June 2013
    As some of those identifiers are somewhat general, more than one identifier needs to match to get the submission started.

    IF any of the alternatives in have1 matches, THEN have2, have3 AND have4 need to match as well.
    BUT have5 and have6 must not match.
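
One reading of the rules above, as a minimal Python sketch rather than SER's actual matcher: each numbered line must hold (AND), the pipe separates alternatives within a line (OR), and a leading "!" means the text must not appear. Plain case-sensitive substring search and taking the "+" in have5 as literal text are assumptions; the thread does not spell them out. The function names and the sample page are hypothetical.

```python
def line_matches(rule: str, html: str) -> bool:
    """Evaluate one 'page must have' line against the page source."""
    negated = rule.startswith("!")
    if negated:
        rule = rule[1:]
    # '|' separates alternatives: any single one is enough (OR).
    # ASSUMPTION: '+' is not interpreted, it is searched for as literal text.
    found = any(alternative in html for alternative in rule.split("|"))
    return not found if negated else found


def page_matches(rules: list[str], html: str) -> bool:
    """Every numbered line has to hold (AND across have1..haveN)."""
    return all(line_matches(rule, html) for rule in rules)


DATSO_RULES = [
    "Next Image|<form name='commentform'|com_datsogallery",
    "SlideShow:|<form name='commentform'|com_datsogallery",
    "Your Comment|<form name='commentform'|com_datsogallery",
    "Your Name|<form name='commentform'|com_datsogallery",
    "!Next Image+SlideShow",
    "!powered by WordPress",
]

# Hypothetical page that merely links to a Datso gallery.
page = '<a href="http://example.com/index.php?option=com_datsogallery">link</a>'

print(page_matches(DATSO_RULES, page))                            # True
print(page_matches(DATSO_RULES, page + " powered by WordPress"))  # False
```

Under this reading, the single string com_datsogallery satisfies have1 through have4 on its own, because it is an alternative in every one of those lines.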
  • edited June 2013
    Thanks for clarifying!

    In this case it was bad luck: http://www.tc-thalwil.ch/fotoalbum/displayimage.php?album=32&pos=20
    made it into the Datso list because someone posted a link to a Datso gallery image which contains the string com_datsogallery (search for "related web site" on the page).

    The engine is somewhat suboptimal, because any webpage containing a link to a Datso gallery image will be identified as a Datso gallery page.
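
The false positive described here can be reproduced with a few lines of Python. This is only an illustration of the problem and of one possible mitigation (ignoring href values when searching for the marker); it is not how SER identifies engines, and the sample pages and function names are made up.

```python
import re

MARKER = "com_datsogallery"

# Hypothetical Coppermine page: the marker only occurs inside a visitor's link.
coppermine_page = (
    "<html><body><h1>Coppermine Photo Gallery</h1>"
    '<a href="http://example.com/index.php?option=com_datsogallery&id=7">'
    "related web site</a></body></html>"
)

# Hypothetical Datso page: the marker occurs in the comment form itself.
datso_page = (
    "<form name='commentform' "
    "action='index.php?option=com_datsogallery'>Your Comment</form>"
)


def naive_match(html: str) -> bool:
    """Plain substring search over the whole page source."""
    return MARKER in html


def match_ignoring_links(html: str) -> bool:
    """Blank out href values first, so outbound links cannot trigger a match."""
    without_hrefs = re.sub(
        r"""href\s*=\s*("[^"]*"|'[^']*')""", 'href=""', html, flags=re.IGNORECASE
    )
    return MARKER in without_hrefs


print(naive_match(coppermine_page))           # True  (false positive)
print(match_ignoring_links(coppermine_page))  # False
print(match_ignoring_links(datso_page))       # True
```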

  • Going through my Datso list I see plenty of other engines falsely identified as Datso galleries:

    Zenphoto: http://physicianforfree.org/zenphoto/Album1/Picture 1202.jpg.php
    Phoca GB: http://www.rivieralipno.de/index.php?option=com_phocaguestbook&view=phocaguestbook&id=1&Itemid=91&lang=nl
    Shownews: http://www.jinyanzhiye.com/Shownews.asp?id=3267
    RSGallery2: http://www.orchidgarden.sk/component/rsgallery2/gallery/1/itemPage/45/asInline
    ...and the list goes on!

    I'll try to come up with a better engine match.
  • Sven www.GSA-Online.de
    Datso will be fixed in the next version.
  • edited June 2013
    Thanks @Sven

    What's the correct input for the "page must have" variable (what's in the HTML or what's on the page?):
    1. page must have1=Log in</a> to post your comment
    2. page must have1=Log in to post your comment
    in case the 'Log in' string is a link?

  • Ok, I found the scripting manual and according to the manual both should work.

    I need some guidance with the posting process. This is from the Coppermine engine:
    [STEP1]
    link type=Comment-Contextual

    form name=OK
    form url=*db_input.php

    msg_author=%spinfile-names.dat%
    msg_body=%Image_Comment%
    confirmCode=%captcha% captcha.php
    captcha=%captcha%
    email=%random_email%
    url=%url%

    My questions:
    • "url=" is supposed to fill a form element with "name=url", correct? I'm asking b/c I haven't seen a single Coppermine page with a URL text field.
    • I want to post my link in the text field defined by "msg_body", how do I define that in the engine?
    • In general, how can I define the way my link gets posted, e.g. html, bbcode or plain text?
    • I'd like to define within the engine that my URL gets posted as plain text without the "http://" portion. How can I achieve that?

    Thanks!


  • Sven www.GSA-Online.de

    1) Correct, I don't know why I left it there. If it's not used, fine; it doesn't hurt.

    2) Then define Image_Comment with "allow html=1" and "auto_add_anchor=2". The 2 makes sure that a URL is always added to it; 1 will only add it if the url variable has not been used before. See the scripting manual for more. If you just want to place a naked URL there, use something like:

    html to custom link format=1

    custom link format=" %url% "
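
To picture what a link format such as " %url% " does, here is a small Python sketch of placeholder substitution. It is an illustration only, not SER's code: `render_link` and `strip_scheme` are hypothetical helpers, and the thread does not say whether SER itself offers a way to drop the "http://" portion.

```python
def strip_scheme(url: str) -> str:
    """Hypothetical helper: drop a leading http:// or https:// from a URL."""
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            return url[len(prefix):]
    return url


def render_link(link_format: str, url: str) -> str:
    """Replace the %url% placeholder in a link format with the given URL."""
    return link_format.replace("%url%", url)


target = "http://www.example.com/page"

print(repr(render_link(" %url% ", target)))                # ' http://www.example.com/page '
print(repr(render_link(" %url% ", strip_scheme(target))))  # ' www.example.com/page '
```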


  • Hi, I'm new here. Please guide me, thanks.

    After importing (MIXED) target URLs into a project, if I just want to post to Article, Doc Sharing & Web2, do I just check "Article", "Doc Sharing" & "Web2" and SER will do the rest?

    If I check the global lists, will it simply save the URLs into the lists (separated by platform?) while processing submissions?

    Thanks
  • Sven www.GSA-Online.de
    yes
  • Hi, I have a question and need your help.

    Say I import a target URL list that includes 3 types of engines (Article, Blog Comment, Directory) into a project, but I only want to post to Article & Blog Comment, so I checked those 2 engines and left Directory unchecked.
    Will GSA still post to all 3 platforms?
    Or will it only post to Article & Blog Comment, as I indicated?

    Thanks!
  • Sven www.GSA-Online.de
    It will show the directory URLs in the log with "no engine matches". So no, it will not post to them.
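
The behaviour described in this answer can be pictured with a short Python sketch. It is a simplified model under stated assumptions, not SER's implementation: the engine type of each URL is given up front (SER identifies it from the page), and `plan_submissions`, the sample URLs and the log line layout are hypothetical. Only the phrase "no engine matches" comes from the thread.

```python
def plan_submissions(targets, selected_types):
    """Split imported targets into URLs to post to and log lines for the rest."""
    to_post, log = [], []
    for url, engine_type in targets:
        if engine_type in selected_types:
            to_post.append(url)
        else:
            # Unchecked engine types are skipped and only show up in the log.
            log.append(f"{url} - no engine matches")
    return to_post, log


# Hypothetical mixed import: (URL, identified engine type).
imported = [
    ("http://example.com/article/1", "Article"),
    ("http://example.org/blog/post", "Blog Comment"),
    ("http://example.net/directory", "Directory"),
]

to_post, log = plan_submissions(imported, {"Article", "Blog Comment"})
print(to_post)  # ['http://example.com/article/1', 'http://example.org/blog/post']
print(log)      # ['http://example.net/directory - no engine matches']
```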