
Question:
I'm trying to create a relative link in a PDF file created with iTextSharp.
Everything works fine with ASCII letters, but if I add other Unicode symbols to the path, they are just skipped.
This works fine:
Chunk chunk = new Chunk(text, font);
chunk.SetAnchor("./Attachments/1.jpg");
This creates an incorrect link (the link comes out as //1.jpg; the <strong>Вложения</strong> part is missing):
Chunk chunk = new Chunk(text, font);
chunk.SetAnchor("./Вложения/1.jpg");
Is there any way to create a correct link with Unicode symbols? Thanks
Answer1: By using Chunk.SetAnchor in iText 5 you effectively generate a <strong>URI Action</strong>. The URI parameter thereof is specified as
<blockquote><strong>URI</strong> ASCII string <em>(Required)</em> The uniform resource identifier to resolve, encoded in 7-bit ASCII.
</blockquote><em>(ISO 32000-1, Table 206 – Additional entries specific to a URI action)</em>
Thus, it can be considered ok that non-ASCII characters like your Cyrillic ones are not accepted by Chunk.SetAnchor. (It is not ok, though, that they are simply dropped; if the method does not accept its input, it should throw an exception.)
But by no means does that mean you cannot reference a file in a path that uses non-ASCII characters. Instead, you can make use of the fact that the path is considered a URI: in particular, this means you can apply the URL encoding scheme for special characters!
Thus, simply replace
chunk.SetAnchor("./Вложения/1.jpg");
by
chunk.SetAnchor(WebUtility.UrlEncode("./Вложения/1.jpg"));
and your link works again! (At least it did in my tests.)
<hr />PS: In .NET you actually have quite a choice of classes to do the URL encoding, cf. for example <a href="https://stackoverflow.com/a/8451941/1729265" rel="nofollow">this answer</a>. WebUtility.UrlEncode worked for me in the case at hand, but depending on your use case one of the others might be more appropriate.
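One thing to keep in mind when choosing among them: WebUtility.UrlEncode also percent-encodes the "/" separators (as %2F) and turns spaces into "+". If a viewer turns out to be picky about that, a possible alternative is to encode each path segment separately with Uri.EscapeDataString and keep the slashes as they are. The following is only a minimal sketch of that idea; the class AnchorPaths and the method EncodeRelativePath are hypothetical names made up for illustration, not part of iTextSharp, and I have not tested this variant against actual PDF viewers.

```csharp
using System;
using System.Linq;

public static class AnchorPaths
{
    // Hypothetical helper (not part of iTextSharp): percent-encodes every
    // path segment as UTF-8 and leaves the '/' separators untouched.
    public static string EncodeRelativePath(string path)
    {
        return string.Join("/", path.Split('/').Select(s => Uri.EscapeDataString(s)));
    }
}
```

With this helper, chunk.SetAnchor(AnchorPaths.EncodeRelativePath("./Вложения/1.jpg")) would pass the pure-ASCII string ./%D0%92%D0%BB%D0%BE%D0%B6%D0%B5%D0%BD%D0%B8%D1%8F/1.jpg to SetAnchor, while a plain ASCII path such as ./Attachments/1.jpg stays unchanged.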
PPS: The situation changes a bit in the newer PDF specification:
<blockquote><strong>URI</strong> ASCII string <em>(Required)</em> The uniform resource identifier to resolve, encoded in UTF8.
</blockquote><em>(ISO 32000-2, Table 210 — Additional entries specific to a URI action)</em>
(I think the "ASCII" in the <em>type</em> column is a specification error and the UTF8 in the <em>value</em> column is to be taken seriously.)
But iText 5 has no PDF 2.0 support and therefore does not support UTF-8 encoding here. One should probably test with iText 7, which claims PDF 2.0 support.