
0x202A in filename: Why?

I recently needed to do an ISNULL in SQL on a varbinary image. So far so (ab)normal. I quickly wrote a C# program to read the file no_image.png from my desktop and output its bytes as a hex string.

That program started like this:

byte[] ba = System.IO.File.ReadAllBytes(@"‪D:\UserName\Desktop\no_image.png");
Console.WriteLine(ba.Length);
// From here, change ba to hex string

And as I had used ReadAllBytes countless times before, I figured it was no big deal. To my surprise, I got a NotSupportedException on ReadAllBytes.

I found that the problem occurs when I right-click the file, go to the "Security" tab, and copy-paste the object name (starting the selection at the right and dragging imprecisely to the left).

And it happens only on Windows 8.1 (and perhaps 8), but not on Windows 7.

<img src="https://i.stack.imgur.com/ulwNn.png" alt="202A">

When I output the string in question:

public static string ToHexString(string input)
{
    string strRetVal = null;
    System.Text.StringBuilder sb = new System.Text.StringBuilder();

    // Append each UTF-16 code unit as uppercase hex (at least two digits)
    foreach (char c in input)
    {
        sb.Append(((int)c).ToString("X2"));
    }

    strRetVal = sb.ToString();
    sb.Length = 0;
    sb = null;
    return strRetVal;
} // End Function ToHexString

string str = ToHexString(@"‪D:\UserName\Desktop\cookie.png");
string strRight = " (" + ToHexString(@"D:\UserName\Desktop\cookie.png") + ")"; // Correct value, for comparison
string msg = str + Environment.NewLine + " " + strRight;
Console.WriteLine(msg);

I get this:

202A443A5C557365724E616D655C4465736B746F705C636F6F6B69652E706E67
  (443A5C557365724E616D655C4465736B746F705C636F6F6B69652E706E67)

First thing: when I look up 20 2A in ASCII, it's [space] + *

Since I see neither a space nor a star, I googled 20 2A, and the first result was paragraph 202a of the German penal code: http://dejure.org/gesetze/StGB/202a.html

But I suppose that is just an unfortunate coincidence, and it is actually the Unicode control character 'LEFT-TO-RIGHT EMBEDDING' (U+202A): http://www.fileformat.info/info/unicode/char/202a/index.htm
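
Rather than reading the hex dump by eye, the invisible character can be located programmatically. The following is a minimal sketch, not part of the original program: the class name InvisibleCharFinder and the method FindFormatChars are made up for illustration. It relies on U+202A having the Unicode general category "Format" (Cf), which .NET exposes through char.GetUnicodeCategory.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, named here for illustration only.
public static class InvisibleCharFinder
{
    // Returns one "index: U+XXXX" entry for every character whose Unicode
    // general category is Format (Cf). U+202A falls into this category.
    public static List<string> FindFormatChars(string input)
    {
        var hits = new List<string>();
        for (int i = 0; i < input.Length; i++)
        {
            if (char.GetUnicodeCategory(input[i]) == UnicodeCategory.Format)
            {
                hits.Add(i + ": U+" + ((int)input[i]).ToString("X4"));
            }
        }
        return hits;
    }
}
```

For a path that starts with U+202A, such as the pasted one above, FindFormatChars returns the single entry "0: U+202A"; for the hand-typed path it returns an empty list.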

Is that a bug, or is that a feature? My guess is, it's a buggy feature.

Answer1:

The issue is that the string does not begin with a letter D at all - it just looks like it does.

It appears that the string is hard-coded in your source file.

If that's the case, then you have pasted the string from the security dialog. Unbeknownst to you, the string you pasted begins with the LRE character (U+202A, LEFT-TO-RIGHT EMBEDDING). This is an invisible character which takes no space, but tells the renderer to lay out the text that follows from left to right.

You just need to delete the character.

To do this, position the cursor AFTER the D in the string. Use the Backspace (delete-to-left) key <x] to delete the D. Press the key again to delete the invisible LRE character (U+202A), and once more to delete the ". Now retype the " and the D.

A similar problem could occur wherever the string came from - e.g. from user input, command line, script file etc.
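
For paths that arrive from such external sources, one option is to strip the bidirectional control characters before using the string. This is only a sketch under assumptions: PathSanitizer, IsBidiControl and StripBidiControls are hypothetical names, and the character set is the list of Unicode characters with the Bidi_Control property. NTFS file names may legitimately contain these characters, so stripping them could change which file the path refers to.

```csharp
using System;
using System.Text;

// Hypothetical helper, named here for illustration only.
public static class PathSanitizer
{
    // True for the Unicode bidirectional formatting characters:
    // U+061C (ALM), U+200E/U+200F (LRM, RLM),
    // U+202A..U+202E (LRE, RLE, PDF, LRO, RLO),
    // U+2066..U+2069 (LRI, RLI, FSI, PDI).
    public static bool IsBidiControl(char c)
    {
        return c == '\u061C'
            || c == '\u200E' || c == '\u200F'
            || (c >= '\u202A' && c <= '\u202E')
            || (c >= '\u2066' && c <= '\u2069');
    }

    // Returns a copy of input with every bidi control character removed.
    public static string StripBidiControls(string input)
    {
        if (input == null) throw new ArgumentNullException("input");

        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (!IsBidiControl(c)) sb.Append(c);
        }
        return sb.ToString();
    }
}
```

A call such as System.IO.File.ReadAllBytes(PathSanitizer.StripBidiControls(path)) would then see the path without the leading U+202A.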

Note: The security dialog shows the filename beginning with the LRE character (U+202A) to ensure that characters are displayed in left-to-right order, which is necessary to ensure that the hierarchy is correctly understood when using RTL characters. E.g. a filename c:\folder\path\to\file in Arabic might be c:\folder\مسار/إلى/ملف. The "gotcha" is that the Arabic parts read in the other direction, so the word "path" (according to Google Translate) is مسار, and that is the rightmost word, making it appear as if it were the last element of the path, when in fact it is the element immediately after "c:\folder\".

Because security object paths have a hierarchy that conflicts with the RTL text layout rules, the security dialog always displays RTL text in LTR mode. That means that the Arabic words will be mangled (letters in the wrong order) on the security tab. (Imagine it as if it said "elif ot htap".) So the meaning is only just discernible, but from a security point of view the semantics of the path are preserved.

Answer2:

Filenames that contain RLO/LRO overrides are commonly created by malware. E.g., “exe” read backwards spells “malware”. You probably have an infected host, or the origin of the .png is infected.

Answer3:

This question bothered me a lot: how could a deterministic function give two different results for identical input? After some testing, it turns out that the answer is simple.

If you look through it in your debugger, you will see that the 'D' char in your @"‪D:\UserName\Desktop\cookie.png" (first use of Hex function) is NOT the same char as in @"D:\UserName\Desktop\cookie.png" (second use).

You must have used some other 'D'-like character, probably via an unintended keyboard shortcut or by messing with your Visual Studio character encoding.

It looks exactly the same, but in reality it's not even a single char (try watching the c variable in your ToHexString function).

If you change it to the normal 'D' in your first example, it will work fine.
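
Instead of stepping through in the debugger, the two literals can also be compared in code. A minimal sketch, with StringDiff and FirstDifference as made-up names for illustration:

```csharp
using System;

// Hypothetical helper, named here for illustration only.
public static class StringDiff
{
    // Returns the index of the first position at which a and b differ,
    // or -1 if the strings are identical. If one string is a prefix of
    // the other, the length of the shorter string is returned.
    public static int FirstDifference(string a, string b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i]) return i;
        }
        return a.Length == b.Length ? -1 : shared;
    }
}
```

Applied to the two path literals from the question, FirstDifference returns 0, and the pasted string is one char longer than the typed one: its first char is U+202A, and the 'D' only comes second.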
