
Urgent bug - Anchor with UTF-8 character is removed

Hi @sven,

All anchors with UTF-8 characters are having the said character removed upon posting.

I'm using the new custom import url/anchor match to post my links.

I just noticed this today, and it happens 100% of the time.

Example anchor:
Glücksspiel

Outputs as:
Glcksspiel

Even the Euro symbol € is completely removed from all anchors.
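
For anyone who wants to see the same symptom outside SER, here is a minimal Python sketch. It only illustrates one common way non-ASCII characters get dropped (a lossy ASCII conversion). That this resembles what happens inside SER is an assumption, and `strip_non_ascii` is a hypothetical name, not a SER function.

```python
def strip_non_ascii(text: str) -> str:
    """Hypothetical helper: drop every character that has no ASCII encoding."""
    # errors="ignore" silently discards anything outside ASCII,
    # which is exactly the kind of loss described above.
    return text.encode("ascii", errors="ignore").decode("ascii")


print(strip_non_ascii("Glücksspiel"))  # Glcksspiel
print(repr(strip_non_ascii("€")))      # ''
```

The "ü" disappears instead of being replaced by "u" or "ue", and a lone "€" leaves an empty string, which matches the behaviour reported here.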

I just caught this after posting dozens of links, and am now having to go back and fix all of them...

Please let me know if you're able to identify this bug...



Comments

  • SvenSven www.GSA-Online.de
    seoaddict said:
    Hi @sven,

    I'm using the new custom import url/anchor match to post my links.
    Sorry but what/where is that function?
  • edited March 2020
    The new custom URL/anchor import feature that you added a while back.

    https://www.import-domain.com/wp-admin/[https://www.moneysite.com/#{anchor|text}]

    The “anchor text” part is what removes all UTF-8 characters. Also, if you use the “secondary anchor” for titles as I do, all UTF-8 characters are also removed from the titles.

    It seems anything inside this particular import method has its UTF-8 characters removed (anchor text and titles).

    Note: Other than UTF-8 characters being removed, the import feature itself works fantastically.

    Please let me know if you need anything else, thanks.
  • SvenSven www.GSA-Online.de
    Well, yes, that's true. The encoding correction is indeed done before the special anchor/format syntax is even noticed. I have to think about how to skip it for some of the URLs.
  • Just sent another donation your way @sven, hopefully this helps pay for some of the time if you manage to get this figured out. Thanks again
  • SvenSven www.GSA-Online.de
    Thanks for the donation...but since it's a bug...you shouldn't have done that. I feel guilty for bugs and would fix them anyway.

    It's just that times are hard for me right now (COVID-19). With two kids and a country shutting down, there are so many things to take care of that there is little time for work tasks.

    Though I hope to debug this later today.
  • I hope you are all well and this thing passes without so much as a sneeze or cough.


    What about adding a checkbox to SER to ask if you want encoding correction or not? Wouldn't that be the easiest?
  • SvenSven www.GSA-Online.de
    Yes, but more confusing for people ;) I will detect that format and skip it then...don't worry.
  • SvenSven www.GSA-Online.de
    I think the latest update should fix it. I guess it was just a non-UTF-8 encoding on the URL import. At least it is working for me now.
  • I'm a little bit confused now.  Are you talking about this update:

    new: [2020-01-15] secondary anchor can be used in target syntax...
                              target-url[overwrite-url#overwrite-anchor#overwrite-sanchor]

    Is it documented somewhere what this actually does?

  • SvenSven www.GSA-Online.de
    Yes, there is a bug, but only in the encoding of the import and its auto-fix.
    Not documented...it's a special import format discussed here.
  • Hey, so far in my testing it seems you did indeed fix the issue. Thanks again! I will continue to test and let you know if I see any more characters being removed.
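
Since the special import format is not documented, here is a rough Python sketch of the idea discussed in this thread: recognise a line in the `target-url[overwrite-url#overwrite-anchor#overwrite-sanchor]` syntax first, and only apply the encoding correction to lines that do not match. It is not SER's code. The names `parse_special_line`, `legacy_fix` and `import_line` are hypothetical, the lossy ASCII step merely stands in for whatever correction SER really applies, and the sketch assumes the overwrite URL itself contains no `#`.

```python
import re
from typing import NamedTuple, Optional, Union

# Assumed shape: target-url[overwrite-url#overwrite-anchor#overwrite-sanchor]
SPECIAL_LINE = re.compile(r"^(?P<target>[^\[\]]+)\[(?P<inner>[^\[\]]*)\]$")


class ImportEntry(NamedTuple):
    target_url: str
    overwrite_url: str
    anchor: Optional[str]
    secondary_anchor: Optional[str]


def parse_special_line(line: str) -> Optional[ImportEntry]:
    """Return the parsed entry, or None if the line is not in the special format."""
    match = SPECIAL_LINE.match(line.strip())
    if match is None:
        return None
    # Split into at most three fields; missing trailing fields become None.
    parts = match.group("inner").split("#", 2)
    parts += [None] * (3 - len(parts))
    return ImportEntry(match.group("target"), parts[0], parts[1], parts[2])


def legacy_fix(line: str) -> str:
    """Stand-in for a lossy encoding correction (assumption, not SER's logic)."""
    return line.encode("ascii", errors="ignore").decode("ascii")


def import_line(line: str) -> Union[ImportEntry, str]:
    """Detect the special format first; only plain lines get the correction."""
    entry = parse_special_line(line)
    if entry is not None:
        return entry  # anchors and titles stay untouched
    return legacy_fix(line)


entry = import_line(
    "https://www.import-domain.com/wp-admin/"
    "[https://www.moneysite.com/#{Glücksspiel|€ Bonus}]"
)
print(entry.anchor)  # {Glücksspiel|€ Bonus}
```

Doing the format check before any correction keeps the behaviour automatic, so no extra checkbox is needed, which is the trade-off raised earlier in the thread.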