URL/meta crawler should respect webapps and the fragment meta tag
myhq
usa
hi @sven,
We work with webapps, and since these cannot be crawled, we cache HTML content to serve to search engines.
For this we place the <meta name="fragment" content="!"> tag in the content, following Google's best practice.
Content for that URL is then available to crawlers at: www.mydomain.com?_escaped_fragment_=#
So add ?_escaped_fragment_=# after the URLs to get the cached HTML version, which includes all the meta data GSA SER needs.
This would save us a whole lot of time, is probably easy to implement, and I guess it's a feature that will be more and more appreciated.
Thanks!
Comments
Note: on https://getmeinside.com?_escaped_fragment_=# the <meta name="fragment" content="!"> tag can be ignored, so the crawler doesn't go in circles.
That's already working in SER.
Append ?_escaped_fragment_= at the end of every scraped URL.

With <meta name="fragment" content="!"> in the <head> of the HTML, the crawler will transform the URL of this page from domain[:port]/path to domain[:port]/path?_escaped_fragment_= (or domain[:port]/path?queryparams to domain[:port]/path?queryparams&_escaped_fragment_=) and will then access the transformed URL. For example, if www.example.com contains <meta name="fragment" content="!"> in the head, the crawler will transform this URL into www.example.com?_escaped_fragment_= and fetch www.example.com?_escaped_fragment_= from the web server.
Source: https://developers.google.com/webmasters/ajax-crawling/docs/specification?hl=en
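The transformation the spec describes can be sketched in Python. This is only an illustration of the mapping rules above; the helper names are hypothetical and not part of SER:

```python
import re

def transform_ajax_url(url: str) -> str:
    """Append _escaped_fragment_= per Google's AJAX crawling scheme,
    so the server returns the pre-rendered HTML snapshot."""
    if "_escaped_fragment_=" in url:
        return url  # already transformed; avoid going in circles
    if "?" in url:
        # path?queryparams -> path?queryparams&_escaped_fragment_=
        return url + "&_escaped_fragment_="
    # path -> path?_escaped_fragment_=
    return url + "?_escaped_fragment_="

def has_fragment_meta(html: str) -> bool:
    """Detect <meta name="fragment" content="!"> in the page source."""
    return re.search(
        r'<meta\s+name=["\']fragment["\']\s+content=["\']!["\']',
        html, re.IGNORECASE) is not None
```

For example, `transform_ajax_url("http://www.example.com/path")` yields `http://www.example.com/path?_escaped_fragment_=`, matching the spec's first case.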
So which function are you talking about, where should this be added? Project edit -> URLs -> edit -> "Crawl URLs"?
Either way: when your META crawler crawls the page and sees this meta name in the header, it can instead crawl the alternate URL for the keywords; whatever is easiest.
Also note that currently the #! adds ?_escaped_fragment_= to the target URLs for link building, and I don't think you or anybody wants this. You only want to use this parameter for scraping the meta data; links should be built to the clean URL, without ?_escaped_fragment_=: http://screencast.com/t/H498Htx5
So although we use domain.com/targeturl/?_escaped_fragment_= for getting the keywords, we want to build backlinks to: domain.com/targeturl/
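Stripping the parameter back out for link building can be sketched like this (a hypothetical helper, not SER's actual code):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def clean_target_url(url: str) -> str:
    """Remove the _escaped_fragment_ parameter so backlinks point at
    the clean URL rather than the cached HTML snapshot."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    params = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True)
              if k != "_escaped_fragment_"]
    return urlunsplit((scheme, netloc, path, urlencode(params), fragment))
```

So `clean_target_url("http://domain.com/targeturl/?_escaped_fragment_=")` returns `http://domain.com/targeturl/`, the URL links should be built to, while any other query parameters are preserved.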
http://ES.domain.com/maincategory/subcategory/postname
http://ES.domain.com/maincategory/subcategory/
And then set to how many levels deep we want to build links, e.g.: 1
levels: 1
I assume these links are updated every day before campaigns resume, so updates on the site would be picked up the next day and included in the project?
*.domain.com/
levels: 10