
When does conversion between unsigned and signed character pointers become unsafe in C?

Question:

If I do this in both clang and Visual Studio:

unsigned char *a = 0; char *b = 0; char x = '3'; a = &x; b = (unsigned char *)a;

I get a warning that I am converting between signed and unsigned character pointers, but the code works. The compiler must be warning for a reason, though. Can you point out a situation where this can turn into a problem?

Answer1:

To put it simply, the warning exists because char can represent:

<ul><li>A single character (char, whether signed or not). When you assign a character like 'A', you write the ASCII code of <kbd>A</kbd> (65) to that memory location.</li> <li>A string (when used as an array or as a pointer to a char buffer).</li> <li>An eight-bit <strong>number</strong> (with or without sign).</li> </ul>

So when you convert a signed byte like -1 to an unsigned byte you lose information (at least the sign, and the numeric value changes too). That is why you get a warning:

signed char a = -1; unsigned char b = (unsigned char)a; if ((int)b == -1) ; // No! Now b is 255!

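With a value conversion like this one, b is always UCHAR_MAX (255 for an eight-bit char), because C defines conversion to an unsigned type modulo 2^N. When the same byte is read through an unsigned char pointer instead, the bits are reinterpreted, so the value <em>may</em> not be 255 if your system doesn't represent negative numbers with 2's complement (254 with ones' complement, 129 with sign-magnitude). I have never worked with such a system, but they exist. Either way the concept is the same: <em>a signed/unsigned conversion may discard information</em>. Whether it happens through an explicit cast or through a pointer cast, the bits will represent something else (and the result depends on the implementation, the environment and the actual value).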
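The value conversion in the snippet above can be checked directly. The following is a minimal sketch, not taken from the original answer: the function names are hypothetical, and it assumes UCHAR_MAX is smaller than INT_MAX (true on mainstream platforms). It shows that converting a negative signed char to unsigned char adds UCHAR_MAX + 1, whatever the underlying representation.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: plain value conversion, as in the cast above. */
unsigned char to_unsigned_by_value(signed char v)
{
    return (unsigned char)v;
}

/* The same result spelled out: negative values have UCHAR_MAX + 1 added.
   Assumes UCHAR_MAX < INT_MAX so the addition cannot overflow. */
int expected_after_conversion(signed char v)
{
    return v < 0 ? v + (UCHAR_MAX + 1) : v;
}

/* Self-check over every signed char value. */
void check_value_conversion(void)
{
    for (int v = SCHAR_MIN; v <= SCHAR_MAX; ++v)
        assert(to_unsigned_by_value((signed char)v)
               == expected_after_conversion((signed char)v));
    assert(to_unsigned_by_value(-1) == UCHAR_MAX);
}
```

Calling check_value_conversion() from main should pass silently on a conforming compiler; the sign is gone after the conversion, which is the information loss the warning hints at.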

Note that in the C standard char, signed char and unsigned char are formally three distinct types. Most of the time you won't care (VS makes plain char signed or unsigned according to a compiler option, but relying on that isn't portable), though you may need a cast when mixing them.

Answer2:

Your code is correct (any type can be aliased by unsigned char). Also, on 2's complement systems, this alias is the same as the result of a value conversion.

The reverse operation, aliasing unsigned char by char, is only a problem on esoteric systems that have trap representations for plain char.

I don't know of any such systems ever existing, although the C standard provides for their existence. Unfortunately a cast is required because of this possibility, which is more annoying than useful IMHO.

The aliasing of unsigned char by char is the same as the value conversion on every modern system that I know of (technically implementation-defined, but everyone implements it so that the value conversion retains the same representation).

NB. Definition of terms, taking unsigned char x = 250; as an example:

<ul><li><em>alias</em>: char y = *(char *)&x;</li> <li><em>conversion</em>: char y = x;</li> </ul>
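The two terms can be compared side by side. This is only a sketch with hypothetical function names: it reads every possible byte value both ways and reports whether the alias and the conversion agree. The conversion is implementation-defined for values that do not fit in char, so the check is expected to hold on common compilers rather than guaranteed by the standard.

```c
#include <assert.h>
#include <limits.h>

/* Alias: reinterpret the stored byte as a plain char. */
char read_as_char_alias(const unsigned char *p)
{
    return *(const char *)p;
}

/* Conversion: implementation-defined when the value does not fit in char. */
char convert_to_char(unsigned char v)
{
    return (char)v;
}

/* Returns 1 when both routes give the same char for every byte value. */
int alias_matches_conversion(void)
{
    for (unsigned v = 0; v <= UCHAR_MAX; ++v) {
        unsigned char x = (unsigned char)v;
        if (read_as_char_alias(&x) != convert_to_char(x))
            return 0;
    }
    return 1;
}

/* Self-check for the x = 250 example used above. */
void check_alias_vs_conversion(void)
{
    unsigned char x = 250;
    assert(read_as_char_alias(&x) == convert_to_char(x));
    assert(alias_matches_conversion());
}
```

On a 2's complement machine with signed char both routes yield -6 for x = 250; where plain char is unsigned both yield 250.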

Answer3:

The char type can be either signed or unsigned depending on the platform. Code that casts a char to unsigned char or signed char might work fine on one platform, but not when the data is transferred across operating systems, etc. See this URL:
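Whether plain char is signed can be detected from limits.h. The sketch below uses hypothetical helper names and assumes an eight-bit char; it shows that the raw promotion of a char to int differs between platforms, while going through unsigned char first gives the same byte value everywhere.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 when plain char is signed on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Hypothetical helper: map a char to 0..UCHAR_MAX regardless of signedness. */
int byte_value(char c)
{
    return (unsigned char)c;
}

/* Self-check, assuming CHAR_BIT == 8. */
void check_byte_value(void)
{
    char c = '\xE9';
    /* Portable: always 0xE9. */
    assert(byte_value(c) == 0xE9);
    /* Not portable: -23 where char is signed, 233 where it is unsigned. */
    assert((int)c == (plain_char_is_signed() ? 0xE9 - (UCHAR_MAX + 1) : 0xE9));
}
```

Data that leaves the machine (files, network packets) is usually best handled as unsigned char for this reason.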

<a href="http://www.trilithium.com/johan/2005/01/char-types/" rel="nofollow">http://www.trilithium.com/johan/2005/01/char-types/</a>

Answer4:

Because you can lose some values. Look at this:

unsigned char *a = 0; char b = -3; a = &b; printf("%d", *a);

Result: 253

Let me explain this. The byte that holds -3 is read back as 253 (-3 + 256) because the two types cover different ranges:

unsigned char: from 0 to 255<br /> signed char: from -128 to 127
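The difference in ranges becomes a real bug as soon as byte values above 127 are compared. As an illustrative sketch (the function names and the sample data are hypothetical, not from the answer), the two functions below count bytes that are 0x80 or larger in the same buffer. The version that reads through a plain char pointer finds none of them on platforms where char is signed.

```c
#include <assert.h>
#include <stddef.h>

/* Reads through plain char: bytes 0x80..0xFF become negative
   when char is signed, so the comparison never succeeds there. */
size_t count_high_bytes_via_char(const char *s, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        int v = s[i];
        if (v >= 0x80)
            ++count;
    }
    return count;
}

/* Reads through unsigned char: every byte is in 0..255. */
size_t count_high_bytes_via_uchar(const char *s, size_t n)
{
    const unsigned char *u = (const unsigned char *)s;
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        if (u[i] >= 0x80)
            ++count;
    return count;
}

/* Self-check on a four-byte sample with two high bytes. */
void check_high_bytes(void)
{
    const char data[] = "a\xE9" "b\xFC";
    size_t naive = count_high_bytes_via_char(data, 4);
    assert(count_high_bytes_via_uchar(data, 4) == 2);
    /* 0 where plain char is signed, 2 where it is unsigned. */
    assert(naive == (((char)-1 < 0) ? 0u : 2u));
}
```

The same pattern shows up when a char is used as an array index or passed to a function that expects a value in the unsigned char range.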

<em>Edited: sorry for mistake, too hot today ;)</em>
