
UTF-8 characters in search footprints

Hi, I have a problem with footprints that contain Russian characters. I modified an engine file and added the footprint, after first changing the file encoding from ASCII to UTF-8 so that the file displays it correctly. But when I run gsa->options->advanced->search for new urls, the Russian characters display wrong, so I can't parse Google for those footprints. Is there any solution, or will you add UTF-8 support in a future version? Part of my footprint:
Количество публикаций

Comments

  • Sven www.GSA-Online.de

    Do the following:

    Search for the footprint in Google, then copy the resulting part of the URL into the ini, as seen in the mediawiki engine.

    In your sample:

    "%D0%9A%D0%BE%D0%BB%D0%B8%D1%87%D0%B5%D1%81%D1%82%D0%B2%D0%BE %D0%BF%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D0%B9"

  • good advice, thanks. always wondered about a good way to get those characters :)
  • @sven,

    When I do this I get two different sets of results.

    Using the example above:

    I paste that into Google and I get 15.6MM results. Then when I encode it and paste it into Google, I get: 1.19MM results.

    Am I missing something...
  • Sven www.GSA-Online.de
    Did you paste that into the URL or into the search box? You have to use it in the URL as a parameter.
  • to me it seems that you forgot to put your search query in quotation marks for an exact match, like "Количество публикаций"

    if you leave out the quotation marks you get around 15.5m search results, with "" you get 1.33m.
  • Just check:
    %D0%9A%D0%BE%D0%BB%D0%B8%D1%87%D0%B5%D1%81%D1%82%D0%B2%D0%BE %D0%BF%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D0%B9 
    gives 15mln results when pasted into the URL and 1.2mln when pasted into the search box.

    I expect that if I paste this into the SER footprints window or directly into the engine ini files it will give 15mln too ;)
  • gotcha... that makes sense now... I was searching for it, not appending to the query string

    Thanks!
  • Ozz
    edited June 2013
    15.5m =
    %D0%9A%D0%BE%D0%BB%D0%B8%D1%87%D0%B5%D1%81%D1%82%D0%B2%D0%BE+%D0%BF%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D0%B9%22&oq=%22%D0%9A%D0%BE%D0%BB%D0%B8%D1%87%D0%B5%D1%81%D1%82%D0%B2%D0%BE+%D0%BF%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D0%B9

    1.3m =
    %22%D0%9A%D0%BE%D0%BB%D0%B8%D1%87%D0%B5%D1%81%D1%82%D0%B2%D0%BE+%D0%BF%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D0%B9%22&oq=%22%D0%9A%D0%BE%D0%BB%D0%B8%D1%87%D0%B5%D1%81%D1%82%D0%B2%D0%BE+%D0%BF%D1%83%D0%B1%D0%BB%D0%B8%D0%BA%D0%B0%D1%86%D0%B8%D0%B9%22

    the %22 at the beginning and the end is the code for: "
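
For anyone who would rather generate these strings than copy them out of the browser address bar, here is a minimal Python sketch of the encoding discussed in this thread. The helper name `encode_footprint` and its `exact` and `plus_spaces` parameters are made up for illustration and are not part of GSA SER. It assumes the footprint should be encoded as UTF-8 bytes, which matches the sample strings above.

```python
from urllib.parse import quote, quote_plus, unquote_plus


def encode_footprint(footprint: str, exact: bool = False, plus_spaces: bool = False) -> str:
    """Percent-encode the UTF-8 bytes of a footprint (hypothetical helper).

    exact=True wraps the footprint in double quotes first, so the result
    starts and ends with %22 (exact-match search).
    plus_spaces=True writes spaces as '+', as seen in a browser URL;
    otherwise spaces are left as literal spaces, as in Sven's sample.
    """
    if exact:
        footprint = f'"{footprint}"'
    if plus_spaces:
        return quote_plus(footprint)
    # safe=" " keeps the space unescaped; everything non-ASCII becomes %XX
    return quote(footprint, safe=" ")


footprint = "Количество публикаций"

# Broad match, literal space
print(encode_footprint(footprint))

# Exact match, URL form with %22 around the phrase and '+' for the space
exact_url_form = encode_footprint(footprint, exact=True, plus_spaces=True)
print(exact_url_form)

# Decoding gives back the quoted footprint, which is a quick sanity check
print(unquote_plus(exact_url_form))
```

The exact-match output should correspond to the part before `&oq=` in the 1.3m string above. Whether the ini file wants a literal space or `+` is something to check against an existing engine file such as the mediawiki one.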