Regular expression to match French and German characters


I am parsing request parameters to detect potentially dangerous characters and prevent XSS attacks. Our web application supports French and German in addition to English. I am using the following regular expression, but it rejects French and German characters:

^[a-zA-Z0-9\r\n\\-=\\*\\.\\?;,+\\/:&_ %@#]*$

Any suggestions would be highly appreciated.


<a href="http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html" rel="nofollow">\p{L} will match any Unicode character that is a letter</a>, which covers accented French and German letters such as é, ü, and ß.
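As a minimal sketch of this suggestion in Java, the `a-zA-Z` range in the question's pattern can be swapped for `\p{L}` while the rest of the whitelist stays as it was. The class name `LetterWhitelist` and the method `isAllowed` are made up for illustration, and the sample strings use `\u` escapes so the source stays ASCII-only.

```java
import java.util.regex.Pattern;

final class LetterWhitelist {
    // \p{L} replaces a-zA-Z and matches a letter from any script.
    // Digits and punctuation are the same set the question already allowed.
    private static final Pattern ALLOWED =
            Pattern.compile("^[\\p{L}0-9\\r\\n\\-=*.?;,+/:&_ %@#]*$");

    // Hypothetical helper: true when every character is on the whitelist.
    static boolean isAllowed(String value) {
        return ALLOWED.matcher(value).matches();
    }

    public static void main(String[] args) {
        // German "Groesse" with o-umlaut and sharp s, French "tres" with e-grave
        System.out.println(isAllowed("Gr\u00f6\u00dfe: tr\u00e8s bien")); // true
        // Angle brackets are not on the whitelist
        System.out.println(isAllowed("<script>"));                        // false
    }
}
```

Note that `\p{L}` accepts letters from every script, not only Latin, and it does not match combining marks. Text in decomposed form (a base letter followed by a combining accent) would be rejected unless `\p{M}` is added to the class or the input is normalized first.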


Try [\p{Latin}\p{Punctuation}\p{Math_Symbol}], or add more character classes as needed. Property names vary between regex flavors, so check the spelling your engine expects. Have a look <a href="http://www.regular-expressions.info/unicode.html#prop" rel="nofollow">here</a> for other Unicode character classes.
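In Java's `java.util.regex` the long names above are spelled differently: the Latin script is `\p{IsLatin}` (Java 7 and later), punctuation is `\p{P}`, and math symbols are `\p{Sm}`. The sketch below uses those spellings. The class name `LatinScriptWhitelist`, the helper `matches`, and the added digits and whitespace are assumptions for illustration. It also shows a caveat worth knowing for XSS filtering: `<` and `>` belong to the math-symbol category and quotes to punctuation, so the broad class lets markup through unless those characters are subtracted.

```java
import java.util.regex.Pattern;

final class LatinScriptWhitelist {
    // Java spellings of the suggested classes, plus digits and whitespace
    // (added here as an assumption so ordinary sentences pass).
    static final Pattern BROAD =
            Pattern.compile("^[\\p{IsLatin}\\p{P}\\p{Sm}0-9\\s]*$");

    // Same set with < > " ' removed through character-class intersection.
    static final Pattern STRICT =
            Pattern.compile("^[\\p{IsLatin}\\p{P}\\p{Sm}0-9\\s&&[^<>\"']]*$");

    // Hypothetical helper: true when the whole value matches the pattern.
    static boolean matches(Pattern p, String value) {
        return p.matcher(value).matches();
    }

    public static void main(String[] args) {
        String french = "tr\u00e8s bien !";   // e-grave is in the Latin script
        String markup = "<b>";                // '<' and '>' are math symbols
        String cyrillic = "\u0414\u0430";     // not Latin script

        System.out.println(matches(BROAD, french));    // true
        System.out.println(matches(BROAD, markup));    // true, so BROAD is too permissive
        System.out.println(matches(STRICT, markup));   // false
        System.out.println(matches(STRICT, cyrillic)); // false
    }
}
```

Which characters to subtract depends on where the value ends up (HTML body, attribute, URL), so treat the excluded set here as a placeholder rather than a complete list.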


I know this is an old question, but I hope this helps someone out there. You can try this regex:


Basically, it should match all the Latin and extended Latin characters, including numbers. Feel free to remove Unicode characters you do not need. I would say this is the surest way of getting it right for all your scenarios. The relevant code charts are:


<ul><li><a href="http://unicode.org/charts/PDF/U0000.pdf" rel="nofollow">http://unicode.org/charts/PDF/U0000.pdf</a></li> <li><a href="http://unicode.org/charts/PDF/U0080.pdf" rel="nofollow">http://unicode.org/charts/PDF/U0080.pdf</a></li> <li><a href="http://unicode.org/charts/PDF/U0100.pdf" rel="nofollow">http://unicode.org/charts/PDF/U0100.pdf</a></li> </ul>
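The regex from this answer is not shown above, so the following is not a reconstruction of it. It is only a hypothetical sketch of the explicit-range approach, built from the three charts linked here: Basic Latin (U+0000 to U+007F), Latin-1 Supplement (U+0080 to U+00FF), and Latin Extended-A (U+0100 to U+017F). The class name `LatinRangeWhitelist` and the method `isLatinAlphanumeric` are made up for illustration, and the choice to skip the multiplication and division signs (U+00D7, U+00F7) is an assumption.

```java
import java.util.regex.Pattern;

final class LatinRangeWhitelist {
    // A-Z a-z 0-9 from Basic Latin, the letters of Latin-1 Supplement
    // (U+00C0..U+00FF without U+00D7 and U+00F7), and Latin Extended-A
    // (U+0100..U+017F). The regex engine itself interprets the \\uXXXX escapes.
    private static final Pattern LATIN_ALNUM = Pattern.compile(
            "^[A-Za-z0-9\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u017F]*$");

    // Hypothetical helper: true when the value holds only those characters.
    static boolean isLatinAlphanumeric(String value) {
        return LATIN_ALNUM.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLatinAlphanumeric("Stra\u00dfe"));  // sharp s: true
        System.out.println(isLatinAlphanumeric("\u0153uvre"));   // oe ligature: true
        System.out.println(isLatinAlphanumeric("a\u00d7b"));     // multiplication sign: false
        System.out.println(isLatinAlphanumeric("\u0394"));       // Greek delta: false
    }
}
```

Explicit ranges make it easy to see exactly which code points are admitted, at the cost of having to extend the pattern by hand for any punctuation or further blocks the application needs.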


