Thursday, March 9, 2017

Multi-character literal (multiple characters inside single quotes) vs. char literal in C and C++

A single character within single quotes ('a') is a character literal (character constant), and a string literal is a sequence of characters within double quotes ("Hello").
If there are multiple characters inside the single quotes ('aa'), it is a multi-character literal.

Before looking at multi-character literals, let us check how the compiler treats these literals (their value and type). In C, the type of a character literal is int, so sizeof('a') equals sizeof(int), which is 4 bytes on most 32-bit and 64-bit platforms. In C++, the type of a character literal is char and its size is 1 byte.

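Save the following program once with a .c extension and once with a .cpp extension, then build it with a C compiler and a C++ compiler respectively. The C build typically prints 4 and the C++ build prints 1.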

 #include <stdio.h>

 int main()
 {
     /* sizeof yields a size_t, so print it with %zu rather than %d */
     printf("Size of char literal %zu\n", sizeof('a'));
     return 0;
 }

What is the value of multiple characters inside single quotes?

For a multi-character literal, however, the type is int in both C and C++, and the value is implementation-defined.
The behavior that is left implementation-defined is the value of an integer character constant containing more than one character, or containing a character or escape sequence that does not map to a single-byte execution character (C90 6.1.3.4, C99 and C11 6.4.4.4). Reference LINK
For the GNU compiler, the value of a multi-character literal is determined as follows.
The compiler evaluates a multi-character character constant a character at a time, shifting the previous value left by the number of bits per target character, and then or-ing in the bit-pattern of the new character truncated to the width of a target character. The final bit-pattern is given type int, and is therefore signed, regardless of whether single characters are signed or not. If there are more characters in the constant than would fit in the target int the compiler issues a warning, and the excess leading characters are ignored.
For example, 'ab' for a target with an 8-bit char would be interpreted as ‘(int) ((unsigned char) 'a' * 256 + (unsigned char) 'b')’, and '\234a' as ‘(int) ((unsigned char) '\234' * 256 + (unsigned char) 'a')’. 
Reference LINK
Example calculation (GCC, 8-bit char, 'a' = 97 in ASCII):
  •  For the multi-character literal 'aa', the value is calculated as
(int) 'a' * 256 + (int) 'a'  (here 256 = 2^8, a shift by one 8-bit character)
(97*256) + 97 = 24929
  •  'aaa' is evaluated as
(int) 'a' * 65536 + (int) 'a' * 256 + (int) 'a'  (here 65536 = 2^16, a shift by two 8-bit characters)
(97*65536) + (97*256) + 97 = 6381921

Example Program:
 #include <iostream>
 using namespace std;
 int main()
 {
     // 'aaa' has type int, so cout prints its numeric value
     cout << 'aaa' << endl;
     return 0;
 }

Output (with GCC, which also warns about the multi-character character constant):
6381921



