Redirection rules with special characters

I want to use redirect 301 rules (I would like to avoid rewrite rules if possible) to redirect URLs that contain special characters (such as é, à, ...), for instance:

redirect 301 /éxàmple http://mydomain.com/example

However, simply adding this doesn't work. Any suggestions?


<strong>How to troubleshoot this on a Windows system</strong>

On Windows, you can use Notepad++ to enter Unicode characters correctly. After launching Notepad++, select 'Encoding in UTF-8 without BOM' from the 'Encoding' menu, then type your Unicode characters and save the file.

To make sure that the characters have been saved properly, open the file in a hex editor for Windows and check that é is saved as c3 a9 and à is saved as c3 a0.

<strong>Previous response where I assumed that you are on a Linux system</strong>

Most likely the Unicode characters have not been saved properly in the .htaccess file.

What do you get when you try this command:

grep -o .x.mple .htaccess | od -t x1 -c

You should get this if your Unicode characters are saved correctly.

0000000  c3  a9  78  c3  a0  6d  70  6c  65  0a  65  78  61  6d  70  6c
        303 251   x 303 240   m   p   l   e  \n   e   x   a   m   p   l
0000020  65  0a
          e  \n
0000022

If you have xxd or hd installed, you can get a neater output to do your troubleshooting:

$ grep -o .x.mple .htaccess | xxd -g1
0000000: c3 a9 78 c3 a0 6d 70 6c 65 0a 65 78 61 6d 70 6c  ..x..mple.exampl
0000010: 65 0a                                            e.

In all the outputs you can see that é is saved as the bytes c3 a9. You can see from http://www.fileformat.info/info/unicode/char/e9/index.htm that é, when encoded in UTF-8, is indeed two bytes: 0xC3 and 0xA9.

Similarly, à in UTF-8 format is: 0xC3 0xA0. See http://www.fileformat.info/info/unicode/char/e0/index.htm. You can see these codes in the output as well.
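If the editor cannot be trusted to save the right bytes, one option is to generate the line with the bytes spelled out. The following is only a sketch under that assumption: redirect_line is a hypothetical helper name, and the octal escapes \303\251 and \303\240 are the UTF-8 bytes c3 a9 and c3 a0 from above.

```shell
# Sketch: emit the redirect rule with the non-ASCII bytes written as octal
# escapes, so the result does not depend on the editor's encoding.
# \303\251 is é (c3 a9) and \303\240 is à (c3 a0) in UTF-8.
# redirect_line is a hypothetical helper name, not a standard tool.
redirect_line() {
    printf 'Redirect 301 /\303\251x\303\240mple http://mydomain.com/example\n'
}

# Print the rule, then dump its bytes so they can be compared with the
# od output shown earlier.
redirect_line
redirect_line | od -An -v -tx1
```

To use it, append the output to the file with redirect_line >> .htaccess and re-run the grep check.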


This should work, but it depends on a few things, so go through this checklist:

<ul>
<li>Is mod_alias enabled? If not, enable it with a2enmod alias and restart Apache.</li>
<li>Is there another redirection that already applies to your example page? (Redirects are processed before aliases.)</li>
</ul>

Then, instead of relying on raw UTF-8 bytes in the file, you can try writing the characters the way browsers encode them in URLs, for example %C3%A9 for é and %C3%A0 for à.
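The percent-encoded form can be derived from the same UTF-8 bytes. This is a minimal sketch: pct_bytes is a hypothetical helper (not part of Apache or any standard tool) that percent-encodes every byte of its argument, so it is applied only to the special characters. Whether mod_alias matches the encoded path is something to test on your own setup; I have not verified it.

```shell
# Hypothetical helper: percent-encode every byte of the argument.
# od prints the bytes as hex, sed prefixes each pair with '%',
# and tr upper-cases the hex digits.
pct_bytes() {
    printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n' | sed 's/\(..\)/%\1/g' | tr 'a-f' 'A-F'
}

e_acute=$(printf '\303\251')   # é as UTF-8 bytes c3 a9
a_grave=$(printf '\303\240')   # à as UTF-8 bytes c3 a0

# Build the encoded path; the plain ASCII letters are left as they are.
encoded_path="/$(pct_bytes "$e_acute")x$(pct_bytes "$a_grave")mple"

# Print a candidate rule using the encoded path.
printf 'Redirect 301 %s http://mydomain.com/example\n' "$encoded_path"
```

The printed line contains only ASCII characters, so it can be pasted into .htaccess without worrying about the file's encoding.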


