The signedness of `char` is implementation-defined. A cleaner solution to the problem you're describing would be to mandate that plain `char` must be unsigned.
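One easy way to see which choice a particular implementation made is to look at `CHAR_MIN` from `<limits.h>`, which is 0 when plain `char` is unsigned and negative when it is signed. A small, purely illustrative check:

```c
#include <limits.h>
#include <stdio.h>

int main(void) {
    /* CHAR_MIN is 0 if plain char is unsigned,
       and negative (equal to SCHAR_MIN) if it is signed. */
    if (CHAR_MIN < 0)
        printf("plain char is signed here\n");
    else
        printf("plain char is unsigned here\n");
    return 0;
}
```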
The reason plain `char` may be either signed or unsigned is partly historical and partly related to performance.

Very early versions of C didn't have unsigned types. Since ASCII only covers the range 0 to 127, it was assumed that there was no particular disadvantage in making `char` a signed type. Once that decision was made, some programmers might have written code that depends on that choice, and later compilers kept `char` as a signed type to avoid breaking such code.
Quoting a C Reference Manual from 1975, 3 years before the publication of K&R1:

> Characters (declared, and hereinafter called, `char`) are chosen from the ASCII set; they occupy the right-most seven bits of an 8-bit byte. It is also possible to interpret `char`s as signed, 2’s complement 8-bit numbers.
EBCDIC requires 8-bit unsigned `char`, but apparently EBCDIC-based machines weren't yet supported at that time.
As for performance, values of type `char` are implicitly converted, in many contexts, to `int` (assuming that `int` can represent all values of type `char`, which is usually the case). This is done via the "integer promotions". For example, this:
```c
char ch = '0';
ch++;
```
doesn't just perform an 8-bit increment. It converts the value of `ch` from `char` to `int`, adds 1 to the result, and converts the sum back from `int` to `char` to store it in `ch`. (The compiler can generate any code that provably achieves the same effect.)
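You can observe the promotion directly: in an expression like `ch + 1`, the operand `ch` is promoted to `int`, so the result of the expression has type `int`. A small demonstration (the exact sizes printed depend on the platform):

```c
#include <stdio.h>

int main(void) {
    char ch = '0';

    /* sizeof ch is 1 by definition; in ch + 1 the operand ch is
       promoted to int, so the expression's type (and size) is int's. */
    printf("sizeof ch       = %zu\n", sizeof ch);
    printf("sizeof (ch + 1) = %zu\n", sizeof (ch + 1));  /* typically 4 */
    return 0;
}
```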
Converting an 8-bit signed `char` to a 32-bit signed `int` requires sign extension. Converting an 8-bit unsigned `char` to a 32-bit signed `int` requires zero-filling the high-order 24 bits of the target. (The actual widths of these types may vary.) Depending on the CPU, one of these operations may be faster than the other. On some CPUs, making plain `char` signed might result in faster generated code. (I don't know what the magnitude of this effect is.)
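Independently of which machine instructions are involved, the value-level difference between the two conversions is easy to see. A minimal sketch, assuming 8-bit `char` and two's complement representation:

```c
#include <stdio.h>

int main(void) {
    /* Both objects hold the bit pattern 0xFF on a typical
       8-bit, two's-complement machine. */
    signed char   sc = -1;
    unsigned char uc = 0xFF;

    int from_signed   = sc;   /* sign-extended: -1  */
    int from_unsigned = uc;   /* zero-filled:   255 */

    printf("from signed char:   %d\n", from_signed);
    printf("from unsigned char: %d\n", from_unsigned);
    return 0;
}
```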