Data.Char

Copyright	(c) The University of Glasgow 2001
License	BSD-style (see the file libraries/base/LICENSE)
Maintainer	libraries@haskell.org
Stability	stable
Portability	portable
Safe Haskell	Trustworthy
Language	Haskell2010

Character classification
- Subranges
- Unicode general categories
Case conversion
Single digit characters
Numeric representations
String representations

Description

The Char type and associated operations.

data Char :: *

The character type Char is an enumeration whose values represent Unicode (or equivalently ISO/IEC 10646) characters (see http://www.unicode.org/ for details). This set extends the ISO 8859-1 (Latin-1) character set (the first 256 characters), which is itself an extension of the ASCII character set (the first 128 characters). A character literal in Haskell has type Char.

To convert a Char to or from the corresponding Int value defined by Unicode, use toEnum and fromEnum from the Enum class respectively (or equivalently ord and chr).
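
As a minimal sketch of the conversions just described, the helpers below (codePoints, codePointsEnum and fromCodePoints are illustrative names, not part of Data.Char) map a string to its Unicode code points and back. fromCodePoints assumes every Int is a valid code point.

```haskell
import Data.Char (chr, ord)

-- | Code points of a string, via 'ord'.
codePoints :: String -> [Int]
codePoints = map ord

-- | The same conversion through the 'Enum' class method.
codePointsEnum :: String -> [Int]
codePointsEnum = map fromEnum

-- | Rebuild a string from code points, via 'chr'.
-- Assumption: every Int is a valid code point (0 .. 0x10FFFF).
fromCodePoints :: [Int] -> String
fromCodePoints = map chr
```

For a string such as "Az", codePoints and codePointsEnum should both give [65,122].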

Instances

Bounded Char	Since: 2.1
Methods minBound :: Char Source maxBound :: Char Source
Enum Char	Since: 2.1
Methods succ :: Char -> Char Source pred :: Char -> Char Source toEnum :: Int -> Char Source fromEnum :: Char -> Int Source enumFrom :: Char -> [Char] Source enumFromThen :: Char -> Char -> [Char] Source enumFromTo :: Char -> Char -> [Char] Source enumFromThenTo :: Char -> Char -> Char -> [Char] Source
Eq Char
Methods (==) :: Char -> Char -> Bool Source (/=) :: Char -> Char -> Bool Source
Data Char	Since: 4.0.0.0
Methods gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Char -> c Char Source gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c Char Source toConstr :: Char -> Constr Source dataTypeOf :: Char -> DataType Source dataCast1 :: Typeable (* -> *) t => (forall d. Data d => c (t d)) -> Maybe (c Char) Source dataCast2 :: Typeable (* -> * -> *) t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c Char) Source gmapT :: (forall b. Data b => b -> b) -> Char -> Char Source gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Char -> r Source gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Char -> r Source gmapQ :: (forall d. Data d => d -> u) -> Char -> [u] Source gmapQi :: Int -> (forall d. Data d => d -> u) -> Char -> u Source gmapM :: Monad m => (forall d. Data d => d -> m d) -> Char -> m Char Source gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Char -> m Char Source gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Char -> m Char Source
Ord Char
Methods compare :: Char -> Char -> Ordering Source (<) :: Char -> Char -> Bool Source (<=) :: Char -> Char -> Bool Source (>) :: Char -> Char -> Bool Source (>=) :: Char -> Char -> Bool Source max :: Char -> Char -> Char Source min :: Char -> Char -> Char Source
Read Char	Since: 2.1
Methods readsPrec :: Int -> ReadS Char Source readList :: ReadS [Char] Source readPrec :: ReadPrec Char Source readListPrec :: ReadPrec [Char] Source
Show Char	Since: 2.1
Methods showsPrec :: Int -> Char -> ShowS Source show :: Char -> String Source showList :: [Char] -> ShowS Source
Ix Char	Since: 2.1
Methods range :: (Char, Char) -> [Char] Source index :: (Char, Char) -> Char -> Int Source unsafeIndex :: (Char, Char) -> Char -> Int inRange :: (Char, Char) -> Char -> Bool Source rangeSize :: (Char, Char) -> Int Source unsafeRangeSize :: (Char, Char) -> Int
Storable Char	Since: 2.1
Methods sizeOf :: Char -> Int Source alignment :: Char -> Int Source peekElemOff :: Ptr Char -> Int -> IO Char Source pokeElemOff :: Ptr Char -> Int -> Char -> IO () Source peekByteOff :: Ptr b -> Int -> IO Char Source pokeByteOff :: Ptr b -> Int -> Char -> IO () Source peek :: Ptr Char -> IO Char Source poke :: Ptr Char -> Char -> IO () Source
IsChar Char	Since: 2.1
Methods toChar :: Char -> Char Source fromChar :: Char -> Char Source
PrintfArg Char	Since: 2.1
Methods formatArg :: Char -> FieldFormatter Source parseFormat :: Char -> ModifierParser Source
Generic1 k (URec k Char)
Associated Types type Rep1 (URec k Char) (f :: URec k Char -> *) :: k -> * Source Methods from1 :: f a -> Rep1 (URec k Char) f a Source to1 :: Rep1 (URec k Char) f a -> f a Source
Functor (URec * Char)
Methods fmap :: (a -> b) -> URec * Char a -> URec * Char b Source (<$) :: a -> URec * Char b -> URec * Char a Source
Foldable (URec * Char)
Methods fold :: Monoid m => URec * Char m -> m Source foldMap :: Monoid m => (a -> m) -> URec * Char a -> m Source foldr :: (a -> b -> b) -> b -> URec * Char a -> b Source foldr' :: (a -> b -> b) -> b -> URec * Char a -> b Source foldl :: (b -> a -> b) -> b -> URec * Char a -> b Source foldl' :: (b -> a -> b) -> b -> URec * Char a -> b Source foldr1 :: (a -> a -> a) -> URec * Char a -> a Source foldl1 :: (a -> a -> a) -> URec * Char a -> a Source toList :: URec * Char a -> [a] Source null :: URec * Char a -> Bool Source length :: URec * Char a -> Int Source elem :: Eq a => a -> URec * Char a -> Bool Source maximum :: Ord a => URec * Char a -> a Source minimum :: Ord a => URec * Char a -> a Source sum :: Num a => URec * Char a -> a Source product :: Num a => URec * Char a -> a Source
Traversable (URec * Char)
Methods traverse :: Applicative f => (a -> f b) -> URec * Char a -> f (URec * Char b) Source sequenceA :: Applicative f => URec * Char (f a) -> f (URec * Char a) Source mapM :: Monad m => (a -> m b) -> URec * Char a -> m (URec * Char b) Source sequence :: Monad m => URec * Char (m a) -> m (URec * Char a) Source
Eq (URec k Char p)
Methods (==) :: URec k Char p -> URec k Char p -> Bool Source (/=) :: URec k Char p -> URec k Char p -> Bool Source
Ord (URec k Char p)
Methods compare :: URec k Char p -> URec k Char p -> Ordering Source (<) :: URec k Char p -> URec k Char p -> Bool Source (<=) :: URec k Char p -> URec k Char p -> Bool Source (>) :: URec k Char p -> URec k Char p -> Bool Source (>=) :: URec k Char p -> URec k Char p -> Bool Source max :: URec k Char p -> URec k Char p -> URec k Char p Source min :: URec k Char p -> URec k Char p -> URec k Char p Source
Show (URec k Char p)
Methods showsPrec :: Int -> URec k Char p -> ShowS Source show :: URec k Char p -> String Source showList :: [URec k Char p] -> ShowS Source
Generic (URec k Char p)
Associated Types type Rep (URec k Char p) :: * -> * Source Methods from :: URec k Char p -> Rep (URec k Char p) x Source to :: Rep (URec k Char p) x -> URec k Char p Source
data URec k Char	Used for marking occurrences of `Char#` Since: 4.9.0.0
data URec k Char = UChar { uChar# :: Char# }
type Rep1 k (URec k Char)
type Rep1 k (URec k Char) = D1 k (MetaData "URec" "GHC.Generics" "base" False) (C1 k (MetaCons "UChar" PrefixI True) (S1 k (MetaSel (Just Symbol "uChar#") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (UChar k)))
type Rep (URec k Char p)
type Rep (URec k Char p) = D1 * (MetaData "URec" "GHC.Generics" "base" False) (C1 * (MetaCons "UChar" PrefixI True) (S1 * (MetaSel (Just Symbol "uChar#") NoSourceUnpackedness NoSourceStrictness DecidedLazy) (UChar *)))

Character classification

Unicode characters are divided into letters, numbers, marks, punctuation, symbols, separators (including spaces) and others (including control characters).

isControl :: Char -> Bool

Selects control characters, which are the non-printing characters of the Latin-1 subset of Unicode.

isSpace :: Char -> Bool

Returns True for any Unicode space character, and the control characters \t, \n, \r, \f, \v.
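
A common use of isSpace is stripping surrounding whitespace. The sketch below defines a hypothetical trim helper (not part of Data.Char) and assumes dropWhileEnd from Data.List is available.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Remove leading and trailing whitespace, as classified by 'isSpace'.
-- Interior whitespace is left untouched.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace
```

Because isSpace also accepts the listed control characters, trim removes tabs and newlines at either end as well as ordinary spaces.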

isLower :: Char -> Bool

Selects lower-case alphabetic Unicode characters (letters).

isUpper :: Char -> Bool

Selects upper-case or title-case alphabetic Unicode characters (letters). Title case is used by a small number of letter ligatures like the single-character form of Lj.

isAlpha :: Char -> Bool

Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isLetter.

isAlphaNum :: Char -> Bool

Selects alphabetic or numeric digit Unicode characters.

Note that numeric digits outside the ASCII range are selected by this function but not by isDigit. Such digits may be part of identifiers but are not used by the printer and reader to represent numbers.
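
The difference between the digit predicates can be seen with a non-ASCII digit. In this sketch, digitProfile and parseAsciiNat are illustrative names, and the sample character used in the note that follows is ARABIC-INDIC DIGIT FIVE (code point 1637).

```haskell
import Data.Char (isAlphaNum, isDigit, isNumber)

-- | The results of (isDigit, isNumber, isAlphaNum) for one character.
digitProfile :: Char -> (Bool, Bool, Bool)
digitProfile c = (isDigit c, isNumber c, isAlphaNum c)

-- | Parse a non-negative decimal number, accepting ASCII digits only.
-- Guarding with 'isDigit' keeps 'read' from seeing anything it cannot parse.
parseAsciiNat :: String -> Maybe Integer
parseAsciiNat s
  | not (null s) && all isDigit s = Just (read s)
  | otherwise = Nothing
```

digitProfile '7' should be (True,True,True), whereas digitProfile '\1637' should be (False,True,True), so parseAsciiNat rejects strings written with such digits.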

isPrint :: Char -> Bool

Selects printable Unicode characters (letters, numbers, marks, punctuation, symbols and spaces).

isDigit :: Char -> Bool

Selects ASCII digits, i.e. '0'..'9'.

isOctDigit :: Char -> Bool

Selects ASCII octal digits, i.e. '0'..'7'.

isHexDigit :: Char -> Bool

Selects ASCII hexadecimal digits, i.e. '0'..'9', 'a'..'f', 'A'..'F'.

isLetter :: Char -> Bool

Selects alphabetic Unicode characters (lower-case, upper-case and title-case letters, plus letters of caseless scripts and modifier letters). This function is equivalent to isAlpha.

This function returns True if its argument has one of the following GeneralCategory values, or False otherwise:

- UppercaseLetter
- LowercaseLetter
- TitlecaseLetter
- ModifierLetter
- OtherLetter

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Letter".

Examples

Basic usage:

>>> isLetter 'a'
True
>>> isLetter 'A'
True
>>> isLetter '0'
False
>>> isLetter '%'
False
>>> isLetter '♥'
False
>>> isLetter '\31'
False

Ensure that isLetter and isAlpha are equivalent.

>>> let chars = [(chr 0)..]
>>> let letters = map isLetter chars
>>> let alphas = map isAlpha chars
>>> letters == alphas
True

isMark :: Char -> Bool

Selects Unicode mark characters, for example accents and the like, which combine with preceding characters.

This function returns True if its argument has one of the following GeneralCategory values, or False otherwise:

- NonSpacingMark
- SpacingCombiningMark
- EnclosingMark

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Mark".

Examples

Basic usage:

>>> isMark 'a'
False
>>> isMark '0'
False

Combining marks such as accent characters usually need to follow another character before they become printable:

>>> map isMark "ò"
[False,True]

Puns are not necessarily supported:

>>> isMark '✓'
False

isNumber :: Char -> Bool

Selects Unicode numeric characters, including digits from various scripts, Roman numerals, et cetera.

This function returns True if its argument has one of the following GeneralCategory values, or False otherwise:

- DecimalNumber
- LetterNumber
- OtherNumber

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Number".

Examples

Basic usage:

>>> isNumber 'a'
False
>>> isNumber '%'
False
>>> isNumber '3'
True

ASCII '0' through '9' are all numbers:

>>> and $ map isNumber ['0'..'9']
True

Unicode Roman numerals are "numbers" as well:

>>> isNumber 'Ⅸ'
True

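isPunctuation :: Char -> Bool

Selects Unicode punctuation characters, including various kinds of connectors, brackets and quotes.

This function returns True if its argument has one of the following GeneralCategory values, or False otherwise:

- ConnectorPunctuation
- DashPunctuation
- OpenPunctuation
- ClosePunctuation
- InitialQuote
- FinalQuote
- OtherPunctuation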

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Punctuation".

Examples

Basic usage:

>>> isPunctuation 'a'
False
>>> isPunctuation '7'
False
>>> isPunctuation '♥'
False
>>> isPunctuation '"'
True
>>> isPunctuation '?'
True
>>> isPunctuation '—'
True

isSymbol :: Char -> Bool

Selects Unicode symbol characters, including mathematical and currency symbols.

This function returns True if its argument has one of the following GeneralCategory values, or False otherwise:

- MathSymbol
- CurrencySymbol
- ModifierSymbol
- OtherSymbol

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Symbol".

Examples

Basic usage:

>>> isSymbol 'a'
False
>>> isSymbol '6'
False
>>> isSymbol '='
True

The definition of "math symbol" may be a little counter-intuitive depending on one's background:

>>> isSymbol '+'
True
>>> isSymbol '-'
False

isSeparator :: Char -> Bool

Selects Unicode space and separator characters.

This function returns True if its argument has one of the following GeneralCategory values, or False otherwise:

- Space
- LineSeparator
- ParagraphSeparator

These classes are defined in the Unicode Character Database, part of the Unicode standard. The same document defines what is and is not a "Separator".

Examples

Basic usage:

>>> isSeparator 'a'
False
>>> isSeparator '6'
False
>>> isSeparator ' '
True

Warning: newlines and tab characters are not considered separators.

>>> isSeparator '\n'
False
>>> isSeparator '\t'
False

But some more exotic characters are (like HTML's &nbsp;):

>>> isSeparator '\160'
True

Subranges

isAscii :: Char -> Bool

Selects the first 128 characters of the Unicode character set, corresponding to the ASCII character set.

isLatin1 :: Char -> Bool

Selects the first 256 characters of the Unicode character set, corresponding to the ISO 8859-1 (Latin-1) character set.

isAsciiUpper :: Char -> Bool

Selects ASCII upper-case letters, i.e. characters satisfying both isAscii and isUpper.

isAsciiLower :: Char -> Bool

Selects ASCII lower-case letters, i.e. characters satisfying both isAscii and isLower.
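
The subrange predicates are handy when input must stay within ASCII. The sketch below checks a made-up identifier rule (an ASCII letter or underscore, followed by ASCII letters, digits or underscores). The rule and the names isAsciiLetter and isAsciiIdentifier are assumptions for illustration only.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- | ASCII letters only: 'a'..'z' and 'A'..'Z'.
isAsciiLetter :: Char -> Bool
isAsciiLetter c = isAsciiLower c || isAsciiUpper c

-- | Hypothetical identifier rule: the first character is an ASCII letter
-- or underscore; the rest are ASCII letters, ASCII digits or underscores.
isAsciiIdentifier :: String -> Bool
isAsciiIdentifier [] = False
isAsciiIdentifier (c : cs) = start c && all rest cs
  where
    start x = isAsciiLetter x || x == '_'
    rest x = start x || isDigit x
```

Using isAsciiLower and isAsciiUpper rather than isLower and isUpper means accented letters such as '\233' are rejected.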

Unicode general categories

data GeneralCategory

Unicode General Categories (column 2 of the UnicodeData table) in the order they are listed in the Unicode standard (the Unicode Character Database, in particular).

Examples

Basic usage:

>>> :t OtherLetter
OtherLetter :: GeneralCategory

Eq instance:

>>> UppercaseLetter == UppercaseLetter
True
>>> UppercaseLetter == LowercaseLetter
False

Ord instance:

>>> NonSpacingMark <= MathSymbol
True

Enum instance:

>>> enumFromTo ModifierLetter SpacingCombiningMark
[ModifierLetter,OtherLetter,NonSpacingMark,SpacingCombiningMark]

Read instance:

>>> read "DashPunctuation" :: GeneralCategory
DashPunctuation
>>> read "17" :: GeneralCategory
*** Exception: Prelude.read: no parse

Show instance:

>>> show EnclosingMark
"EnclosingMark"

Bounded instance:

>>> minBound :: GeneralCategory
UppercaseLetter
>>> maxBound :: GeneralCategory
NotAssigned

Ix instance:

>>> import Data.Ix ( index )
>>> index (OtherLetter,Control) FinalQuote
12
>>> index (OtherLetter,Control) Format
*** Exception: Error in array index

Constructors

UppercaseLetter	Lu: Letter, Uppercase
LowercaseLetter	Ll: Letter, Lowercase
TitlecaseLetter	Lt: Letter, Titlecase
ModifierLetter	Lm: Letter, Modifier
OtherLetter	Lo: Letter, Other
NonSpacingMark	Mn: Mark, Non-Spacing
SpacingCombiningMark	Mc: Mark, Spacing Combining
EnclosingMark	Me: Mark, Enclosing
DecimalNumber	Nd: Number, Decimal
LetterNumber	Nl: Number, Letter
OtherNumber	No: Number, Other
ConnectorPunctuation	Pc: Punctuation, Connector
DashPunctuation	Pd: Punctuation, Dash
OpenPunctuation	Ps: Punctuation, Open
ClosePunctuation	Pe: Punctuation, Close
InitialQuote	Pi: Punctuation, Initial quote
FinalQuote	Pf: Punctuation, Final quote
OtherPunctuation	Po: Punctuation, Other
MathSymbol	Sm: Symbol, Math
CurrencySymbol	Sc: Symbol, Currency
ModifierSymbol	Sk: Symbol, Modifier
OtherSymbol	So: Symbol, Other
Space	Zs: Separator, Space
LineSeparator	Zl: Separator, Line
ParagraphSeparator	Zp: Separator, Paragraph
Control	Cc: Other, Control
Format	Cf: Other, Format
Surrogate	Cs: Other, Surrogate
PrivateUse	Co: Other, Private Use
NotAssigned	Cn: Other, Not Assigned

Instances

Bounded GeneralCategory
Methods minBound :: GeneralCategory Source maxBound :: GeneralCategory Source
Enum GeneralCategory
Methods succ :: GeneralCategory -> GeneralCategory Source pred :: GeneralCategory -> GeneralCategory Source toEnum :: Int -> GeneralCategory Source fromEnum :: GeneralCategory -> Int Source enumFrom :: GeneralCategory -> [GeneralCategory] Source enumFromThen :: GeneralCategory -> GeneralCategory -> [GeneralCategory] Source enumFromTo :: GeneralCategory -> GeneralCategory -> [GeneralCategory] Source enumFromThenTo :: GeneralCategory -> GeneralCategory -> GeneralCategory -> [GeneralCategory] Source
Eq GeneralCategory
Methods (==) :: GeneralCategory -> GeneralCategory -> Bool Source (/=) :: GeneralCategory -> GeneralCategory -> Bool Source
Ord GeneralCategory
Methods compare :: GeneralCategory -> GeneralCategory -> Ordering Source (<) :: GeneralCategory -> GeneralCategory -> Bool Source (<=) :: GeneralCategory -> GeneralCategory -> Bool Source (>) :: GeneralCategory -> GeneralCategory -> Bool Source (>=) :: GeneralCategory -> GeneralCategory -> Bool Source max :: GeneralCategory -> GeneralCategory -> GeneralCategory Source min :: GeneralCategory -> GeneralCategory -> GeneralCategory Source
Read GeneralCategory
Methods readsPrec :: Int -> ReadS GeneralCategory Source readList :: ReadS [GeneralCategory] Source readPrec :: ReadPrec GeneralCategory Source readListPrec :: ReadPrec [GeneralCategory] Source
Show GeneralCategory
Methods showsPrec :: Int -> GeneralCategory -> ShowS Source show :: GeneralCategory -> String Source showList :: [GeneralCategory] -> ShowS Source
Ix GeneralCategory
Methods range :: (GeneralCategory, GeneralCategory) -> [GeneralCategory] Source index :: (GeneralCategory, GeneralCategory) -> GeneralCategory -> Int Source unsafeIndex :: (GeneralCategory, GeneralCategory) -> GeneralCategory -> Int inRange :: (GeneralCategory, GeneralCategory) -> GeneralCategory -> Bool Source rangeSize :: (GeneralCategory, GeneralCategory) -> Int Source unsafeRangeSize :: (GeneralCategory, GeneralCategory) -> Int

generalCategory :: Char -> GeneralCategory

The Unicode general category of the character. This relies on the Enum instance of GeneralCategory, which must remain in the same order as the categories are presented in the Unicode standard.

Examples

Basic usage:

>>> generalCategory 'a'
LowercaseLetter
>>> generalCategory 'A'
UppercaseLetter
>>> generalCategory '0'
DecimalNumber
>>> generalCategory '%'
OtherPunctuation
>>> generalCategory '♥'
OtherSymbol
>>> generalCategory '\31'
Control
>>> generalCategory ' '
Space
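
Because GeneralCategory has Eq, Ord and Enum instances, category ranges can be used directly. The sketch below re-expresses a letter test as a range check and tallies the categories in a string. isLetterByCategory and categoryHistogram are illustrative names, not library functions.

```haskell
import Data.Char (GeneralCategory (..), generalCategory)
import Data.List (group, sort)

-- | Letter test written against the category range Lu .. Lo.
isLetterByCategory :: Char -> Bool
isLetterByCategory c =
  generalCategory c `elem` [UppercaseLetter .. OtherLetter]

-- | Count how many characters of a string fall into each category,
-- listed in the constructor order of 'GeneralCategory'.
categoryHistogram :: String -> [(GeneralCategory, Int)]
categoryHistogram s =
  [ (cat, length g)
  | g@(cat : _) <- group (sort (map generalCategory s))
  ]
```

For the input "ab 1", categoryHistogram should give [(LowercaseLetter,2),(DecimalNumber,1),(Space,1)].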

Case conversion

toUpper :: Char -> Char

Convert a letter to the corresponding upper-case letter, if any. Any other character is returned unchanged.

toLower :: Char -> Char

Convert a letter to the corresponding lower-case letter, if any. Any other character is returned unchanged.

toTitle :: Char -> Char

Convert a letter to the corresponding title-case or upper-case letter, if any. (Title case differs from upper case only for a small number of ligature letters.) Any other character is returned unchanged.
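
These functions work one character at a time, so whole-string helpers are usually built with map. A small sketch follows. capitalize and shout are hypothetical helpers.

```haskell
import Data.Char (toLower, toTitle, toUpper)

-- | Title-case the first character and lower-case the rest.
capitalize :: String -> String
capitalize [] = []
capitalize (c : cs) = toTitle c : map toLower cs

-- | Upper-case every character that has an upper-case counterpart.
shout :: String -> String
shout = map toUpper
```

Since each function returns exactly one Char, such helpers never change the length of a string, so case mappings that would expand one character into several are out of their reach.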

Single digit characters

digitToInt :: Char -> Int

Convert a single digit Char to the corresponding Int. This function fails unless its argument satisfies isHexDigit, but recognises both upper- and lower-case hexadecimal digits (that is, '0'..'9', 'a'..'f', 'A'..'F').

Examples

Characters '0' through '9' are converted properly to 0..9:

>>> map digitToInt ['0'..'9']
[0,1,2,3,4,5,6,7,8,9]

Both upper- and lower-case 'A' through 'F' are converted as well, to 10..15.

>>> map digitToInt ['a'..'f']
[10,11,12,13,14,15]
>>> map digitToInt ['A'..'F']
[10,11,12,13,14,15]

Anything else throws an exception:

>>> digitToInt 'G'
*** Exception: Char.digitToInt: not a digit 'G'
>>> digitToInt '♥'
*** Exception: Char.digitToInt: not a digit '\9829'

intToDigit :: Int -> Char

Convert an Int in the range 0..15 to the corresponding single digit Char. This function fails on other inputs, and generates lower-case hexadecimal digits.
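
Since both functions fail on out-of-range input, it can help to guard them. The sketch below builds a hexadecimal printer and a total parser. toHex and fromHex are illustrative names, and toHex deliberately calls error on negative input.

```haskell
import Data.Char (digitToInt, intToDigit, isHexDigit)

-- | Render a non-negative Int in lower-case hexadecimal.
toHex :: Int -> String
toHex n
  | n < 0 = error "toHex: negative input"
  | n < 16 = [intToDigit n]
  | otherwise = toHex (n `div` 16) ++ [intToDigit (n `mod` 16)]

-- | Parse hexadecimal text; Nothing on empty input or any non-hex character.
-- The 'isHexDigit' guard ensures 'digitToInt' is never called on bad input.
fromHex :: String -> Maybe Int
fromHex s
  | null s || not (all isHexDigit s) = Nothing
  | otherwise = Just (foldl (\acc c -> acc * 16 + digitToInt c) 0 s)
```

With these definitions, toHex 255 should be "ff" and fromHex "FF" should be Just 255.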

Numeric representations

ord :: Char -> Int

The fromEnum method restricted to the type Char.

chr :: Int -> Char

The toEnum method restricted to the type Char.
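
chr raises an error when the Int lies outside the range of Char. A total wrapper can be sketched as follows. safeChr is a hypothetical helper, not part of Data.Char.

```haskell
import Data.Char (chr, ord)

-- | Total variant of 'chr': Nothing when the Int is outside
-- the range ord minBound .. ord maxBound.
safeChr :: Int -> Maybe Char
safeChr n
  | n >= ord minBound && n <= ord maxBound = Just (chr n)
  | otherwise = Nothing
```

The bounds come from the Bounded Char instance, so the check does not hard-code the largest code point.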

String representations

showLitChar :: Char -> ShowS

Convert a character to a string using only printable characters, using Haskell source-language escape conventions. For example:

showLitChar '\n' s  =  "\\n" ++ s

lexLitChar :: ReadS String

Read a string representation of a character, using Haskell source-language escape conventions. For example:

lexLitChar  "\\nHello"  =  [("\\n", "Hello")]

readLitChar :: ReadS Char

Read a string representation of a character, using Haskell source-language escape conventions, and convert it to the character that it encodes. For example:

readLitChar "\\nHello"  =  [('\n', "Hello")]
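
These functions can be composed to escape and unescape whole strings. The following is a sketch under stated assumptions: escape and unescape are illustrative names, and unescape makes no attempt to handle the empty escape \& that showLitChar may insert between certain adjacent characters.

```haskell
import Data.Char (readLitChar, showLitChar)

-- | Escape every character of a string using Haskell source conventions.
-- No surrounding quotes are added.
escape :: String -> String
escape = foldr showLitChar ""

-- | Decode a string of literal characters and escapes.
-- Nothing when some prefix cannot be read as a character.
unescape :: String -> Maybe String
unescape "" = Just ""
unescape s = case readLitChar s of
  (c, rest) : _ -> fmap (c :) (unescape rest)
  [] -> Nothing
```

For example, escape "a\nb" should yield the four-character string a\nb, and unescape applied to that result should recover the original.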

© The University of Glasgow and others
Licensed under a BSD-style license (see top of the page).
https://downloads.haskell.org/~ghc/8.2.1/docs/html/libraries/base-4.10.0.0/Data-Char.html
