BlackWasp
C# Programming
.NET 1.1

C# Character Data Type

The fifteenth part of the C# Fundamentals tutorial returns to the topic of data types. This article examines the character data type, which lets us begin to cross the boundary between numeric values and representations of textual information.

ASCII

The American Standard Code for Information Interchange (ASCII) was first published in 1963 and substantially revised in 1967. The ASCII standard assigned a number to each English letter, each digit and various punctuation symbols. The code also included a series of non-printable control characters to allow control of devices such as printers. Examples of control characters include character 13, which represents a carriage return, and character 10, which represents a line feed.

The ASCII standard used the range of numbers from zero to one hundred and twenty-seven to represent both its printing and non-printing characters. Each character could therefore be stored in seven binary digits of information. In most programming languages that supported a character data type, a single byte of information could be used to represent a single ASCII character.
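The numbering described above can be checked with a short sketch. It uses char values and casts, both of which are described later in this article. The class name AsciiSketch and the helper IsAscii are illustrative names only and are not part of the .NET Framework.

```csharp
using System;

public static class AsciiSketch
{
    // Hypothetical helper: true when the character's code fits in
    // seven bits, i.e. lies within the ASCII range of 0 to 127.
    public static bool IsAscii(char c)
    {
        return c <= 127;
    }

    public static void Main()
    {
        Console.WriteLine((int)'A');           // Outputs "65"
        Console.WriteLine((int)'\r');          // Outputs "13" (carriage return)
        Console.WriteLine((int)'\n');          // Outputs "10" (line feed)
        Console.WriteLine(IsAscii('~'));       // Outputs "True" (code 126)
        Console.WriteLine(IsAscii('\u00E9'));  // Outputs "False" (code 233)
    }
}
```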

Unicode

Unicode is a more recent industry standard for encoding characters. The Unicode character set was created to allow the representation of many more characters than ASCII provided, including international letters and symbols not used in the English language. With some languages requiring many thousands of characters, a single byte per character was not sufficient for encoding. Unicode encodings can therefore use two or more bytes per character to allow for the larger character sets.
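As a rough illustration of the storage involved, the sketch below counts the bytes needed for a few characters. It assumes the UTF-16 encoding, which the framework exposes as Encoding.Unicode; other Unicode encodings give different counts. The class name EncodingSketch and the helper Utf16ByteCount are illustrative names only.

```csharp
using System;
using System.Text;

public static class EncodingSketch
{
    // Hypothetical helper: number of bytes needed to store the text as UTF-16.
    public static int Utf16ByteCount(string text)
    {
        return Encoding.Unicode.GetByteCount(text);
    }

    public static void Main()
    {
        Console.WriteLine(Utf16ByteCount("A"));           // Outputs "2"
        Console.WriteLine(Utf16ByteCount("\u20AC"));      // Outputs "2" (euro sign)
        Console.WriteLine(Utf16ByteCount("\U0001D11E"));  // Outputs "4" (musical G clef)
    }
}
```

The last character lies outside the range that sixteen bits can address, so UTF-16 stores it as a pair of two-byte values.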

The Character Data Type

The character or char data type defined within C# is used to hold a single Unicode character. Character variables hold a sixteen-bit numeric representation of a letter, number, symbol or control character, so a character whose code lies beyond the sixteen-bit range cannot be held in a single char. A char can also be considered as a numeric data type with similar properties to an unsigned short integer value.
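The numeric nature of the type can be seen in a minimal sketch such as the following. The class name CharRangeSketch and the helper Next are illustrative names only.

```csharp
using System;

public static class CharRangeSketch
{
    // Hypothetical helper: returns the character whose code is one higher.
    // The addition yields an int, so the result must be cast back to a char.
    public static char Next(char c)
    {
        return (char)(c + 1);
    }

    public static void Main()
    {
        Console.WriteLine((int)char.MinValue);                // Outputs "0"
        Console.WriteLine((int)char.MaxValue);                // Outputs "65535"
        Console.WriteLine(char.MaxValue == ushort.MaxValue);  // Outputs "True"
        Console.WriteLine(Next('A'));                         // Outputs "B"
    }
}
```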

Assignment

Values can be assigned to a character variable using the normal assignment operator (=). As the data type provides a crossover between numeric and textual information, a value can be assigned in two ways. To assign the character directly, a single letter, number or symbol can be used, surrounded by apostrophes ('). An integer value may also be used but must be cast to a char.

char letterA;

letterA = 'A';                      // Assign a character directly
letterA = (char)65;                 // Assign a number cast to a char

The example above demonstrates the two methods by which a value can be assigned to a character variable. In fact, the two assignment operations provided effectively perform the same task; the Unicode character represented by the number sixty-five is the capital letter "A". This can be demonstrated further by converting the resultant character back to a numeric or character representation and displaying it using the Console.WriteLine method.

char letterA;

letterA = 'A';                      // Assign a character directly
Console.WriteLine(letterA);         // Outputs "A"
Console.WriteLine((int)letterA);    // Outputs "65"
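The equivalence of the two assignment methods can also be tested directly by comparing the results. The sketch below additionally uses a Unicode escape sequence ('\u0041'), which is a third way of writing the same character and is not covered above. The class name AssignmentSketch and the helper HasCode are illustrative names only.

```csharp
using System;

public static class AssignmentSketch
{
    // Hypothetical helper: true when the character has the given numeric code.
    public static bool HasCode(char c, int code)
    {
        return c == (char)code;
    }

    public static void Main()
    {
        char fromLiteral = 'A';          // Assign a character directly
        char fromNumber = (char)65;      // Assign a number cast to a char
        char fromEscape = '\u0041';      // Assign using a Unicode escape (hex 41 = 65)

        Console.WriteLine(fromLiteral == fromNumber);   // Outputs "True"
        Console.WriteLine(fromLiteral == fromEscape);   // Outputs "True"

        int code = fromLiteral;          // A char converts to an int without a cast
        Console.WriteLine(code);         // Outputs "65"
    }
}
```

Note that the conversion from char to int needs no cast, whereas the conversion from int to char does, because not every integer fits within the character range.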

C# 2.0 Nullable Character

Earlier in the C# Fundamentals tutorial we examined the nullable numeric data types that were introduced as part of the .NET Framework 2.0. The character data type has a nullable equivalent with similar functionality.

char? c;

c = 'A';                            // Assign a non-null character
c = null;                           // Assign an undefined, or null, value
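A nullable character can be examined in the same way as the other nullable types. The following is a minimal sketch that assumes C# 2.0 or later; the class name NullableCharSketch and the helper OrPlaceholder are illustrative names only.

```csharp
using System;

public static class NullableCharSketch
{
    // Hypothetical helper: returns the stored character, or '?' when
    // the value is null, using the null-coalescing operator (??).
    public static char OrPlaceholder(char? c)
    {
        return c ?? '?';
    }

    public static void Main()
    {
        char? c = 'A';
        Console.WriteLine(c.HasValue);          // Outputs "True"
        Console.WriteLine(c.Value);             // Outputs "A"

        c = null;
        Console.WriteLine(c.HasValue);          // Outputs "False"
        Console.WriteLine(OrPlaceholder(c));    // Outputs "?"
    }
}
```

Reading the Value property while the variable is null throws an InvalidOperationException, so checking HasValue or supplying a fallback first is the safer approach.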
10 October 2006