character constant

Syntax

' c-char ' (1)
u ' c-char ' (since C11) (2)
U ' c-char ' (since C11) (3)
L ' c-char ' (4)
' c-char-sequence ' (5)

where

  • c-char is one of the following:
    • a character from the basic source character set other than the single quote ('), the backslash (\), or the newline character;
    • an escape sequence: one of the special character escapes \' \" \? \\ \a \b \f \n \r \t \v, a hex escape \x..., or an octal escape \..., as defined in escape sequences;
    • a universal character name, \u... or \U..., as defined in escape sequences (since C99).
  • c-char-sequence is a sequence of two or more c-chars.
      1) single-byte integer character constant, e.g. 'a' or '\n' or '\13'. Such a constant has type int and a value equal to the representation of c-char in the execution character set as a value of type char mapped to int. If c-char is not representable as a single byte in the execution character set, the value is implementation-defined.
      2) 16-bit wide character constant, e.g. u'貓', but not u'🍌' (u'\U0001f34c'). Such a constant has type char16_t and a value equal to the value of c-char in the 16-bit encoding produced by mbrtoc16 (normally UTF-16). If c-char is not representable or maps to more than one 16-bit character, the behavior is implementation-defined.
      3) 32-bit wide character constant, e.g. U'貓' or U'🍌'. Such a constant has type char32_t and a value equal to the value of c-char in the 32-bit encoding produced by mbrtoc32 (normally UTF-32). If c-char is not representable or maps to more than one 32-bit character, the behavior is implementation-defined.
      4) wide character constant, e.g. L'β' or L'貓'. Such a constant has type wchar_t and a value equal to the value of c-char in the execution wide character set (that is, the value that would be produced by mbtowc). If c-char is not representable or maps to more than one wide character (e.g. a non-BMP value on Windows, where wchar_t is 16-bit), the behavior is implementation-defined.
      5) multicharacter constant, e.g. 'AB'. Such a constant has type int and an implementation-defined value.

      Notes

      Many implementations compute the value of a multicharacter constant by using the value of each char in the constant to initialize successive bytes of the resulting integer, in big-endian order; e.g. the value of '\1\2\3\4' is 0x01020304.

      In C++, ordinary character constants have type char, rather than int.

      Unlike integer constants, a character constant may have a negative value if char is signed: on such implementations '\xFF' is an int with the value -1.

      When used in a controlling expression of #if or #elif, character constants may be interpreted in terms of the source character set, the execution character set, or some other implementation-defined character set.

      Example

      #include <stddef.h>
      #include <stdio.h>
      #include <uchar.h>
       
      int main (void)
      {
          printf("constant    value     \n");
          printf("--------    ----------\n");
       
          // integer character constants
          int c1='a'; printf("'a':        %#010x\n", c1);
          int c2='🍌'; printf("'🍌':       %#010x\n\n", c2); // implementation-defined
       
          // multicharacter constant
          int c3='ab'; printf("'ab':       %#010x\n\n", c3); // implementation-defined
       
          // 16-bit wide character constants
          char16_t uc1 = u'a'; printf("'a':        %#010x\n", (int)uc1);
          char16_t uc2 = u'¢'; printf("'¢':        %#010x\n", (int)uc2);
          char16_t uc3 = u'猫'; printf("'猫':       %#010x\n", (int)uc3);
          // implementation-defined (🍌 maps to two 16-bit characters)
          char16_t uc4 = u'🍌'; printf("'🍌':       %#010x\n\n", (int)uc4);
       
          // 32-bit wide character constants
          char32_t Uc1 = U'a'; printf("'a':        %#010x\n", (int)Uc1);
          char32_t Uc2 = U'¢'; printf("'¢':        %#010x\n", (int)Uc2);
          char32_t Uc3 = U'猫'; printf("'猫':       %#010x\n", (int)Uc3);
          char32_t Uc4 = U'🍌'; printf("'🍌':       %#010x\n\n", (int)Uc4);
       
          // wide character constants
          wchar_t wc1 = L'a'; printf("'a':        %#010x\n", (int)wc1);
          wchar_t wc2 = L'¢'; printf("'¢':        %#010x\n", (int)wc2);
          wchar_t wc3 = L'猫'; printf("'猫':       %#010x\n", (int)wc3);
          wchar_t wc4 = L'🍌'; printf("'🍌':       %#010x\n\n", (int)wc4);
      }

      Possible output:

      constant    value     
      --------    ----------
      'a':        0x00000061
      '🍌':       0xf09f8d8c
       
      'ab':       0x00006162
       
      'a':        0x00000061
      '¢':        0x000000a2
      '猫':       0x0000732b
      '🍌':       0x0000df4c
       
      'a':        0x00000061
      '¢':        0x000000a2
      '猫':       0x0000732b
      '🍌':       0x0001f34c
       
      'a':        0x00000061
      '¢':        0x000000a2
      '猫':       0x0000732b
      '🍌':       0x0001f34c

      References

      • C11 standard (ISO/IEC 9899:2011):
        • 6.4.4.4 Character constants (p: 67-70)
      • C99 standard (ISO/IEC 9899:1999):
        • 6.4.4.4 Character constants (p: 59-61)
      • C89/C90 standard (ISO/IEC 9899:1990):
        • 3.1.3.4 Character constants

      See also

      C++ documentation for character literal

© cppreference.com
Licensed under the Creative Commons Attribution-ShareAlike Unported License v3.0.
http://en.cppreference.com/w/c/language/character_constant