Try to locate new URL on "no engine match" - does this feature have a depth limit?
Ok, let's say I have imported 100 root URLs into my project,
with no search engines etc. checked.
I also ticked the "try to locate new URL" feature.
Now let's say 50 of these root URLs are engine matched, so no new URLs are extracted from them.
For the remaining 50 (depth 0), from my understanding GSA extracts the URLs of the crawled pages,
and let's say it extracted 500 URLs at depth 1.
Now, does it continue extracting at depth 2 and so on?
Or does it stop at depth 1?
@sven
Thank you very much for the answer.
Comments
Because if it continued to crawl the links extracted from each root URL, it would keep going until it had crawled the entire website.
It must have a simple explanation, like:
1: crawl the root URL
2: get its source
3: extract the URLs found in the source of the root URL
4: crawl them, but do not extract new URLs from those crawled leaf URLs
end
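To make the question concrete, here is a minimal sketch of the depth-1 behaviour described in that list. This is hypothetical illustration only, not GSA's actual code; the function names and the simple href regex are assumptions.

```python
# Hypothetical sketch of a depth-1 crawl: roots are depth 0, links found in their
# source are depth 1, and nothing is extracted from those depth-1 pages.
import re
import urllib.request
from urllib.parse import urljoin

HREF_RE = re.compile(r'href=["\'](.*?)["\']', re.IGNORECASE)

def fetch(url):
    """Download a page's HTML, returning None on any error (404 etc.)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except Exception:
        return None

def crawl(root_urls):
    for root in root_urls:                                           # 1: crawl the root URL
        html = fetch(root)                                           # 2: get its source
        if html is None:
            continue
        depth1 = [urljoin(root, h) for h in HREF_RE.findall(html)]   # 3: extract found URLs
        for leaf in depth1:                                          # 4: crawl them, but do NOT
            fetch(leaf)                                              #    extract further -> stops at depth 1
```

The open question is whether the real feature stops after step 4 or repeats steps 2-4 on the depth-1 pages as well.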
Sorry, but I don't understand what the problem here is.
With this option turned on it means:
1. On a 404 error it will go to the root URL (if a deep link was used as the starting URL) and try again from there to identify the engine
2. On a "no engine match" it will try to find a deeper link with something like a date in the URL or a ?p=<number> to locate a blog entry to post to (in case Blog Comments are used)
3. It will also try to locate iframes to post to
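A rough sketch of the kind of heuristics points 2 and 3 describe follows. The regex patterns below are assumptions for illustration, not GSA's own matching rules.

```python
# Hypothetical "no engine match" fallback: scan the page source for deeper links that
# look like blog entries (a date in the path or a ?p=<number> query) and for iframe targets.
import re

DATE_LINK_RE = re.compile(r'href=["\']([^"\']*/\d{4}/\d{2}/[^"\']*)["\']')   # e.g. /2015/07/some-post
P_PARAM_RE   = re.compile(r'href=["\']([^"\']*\?p=\d+[^"\']*)["\']')          # e.g. ?p=123
IFRAME_RE    = re.compile(r'<iframe[^>]+src=["\']([^"\']+)["\']', re.IGNORECASE)

def locate_new_urls(html):
    """Return candidate blog-entry links and iframe targets to retry posting on."""
    candidates = DATE_LINK_RE.findall(html) + P_PARAM_RE.findall(html)
    iframes = IFRAME_RE.findall(html)
    return candidates, iframes
```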
That is a good approach, however.
I think an option to set the depth of the inner crawl would be great.
>In that case, if a blog does not use such structured URLs for its posts, GSA would fail, right?
Yes