About
White-space characters are a set of characters that contains:
- spaces,
- tabs,
- and line breaks
They are part of the non-printing characters.
Space set
There may be a difference between the classes:
- space
- and whitespace
as seen below between the regular expression definition and the Unicode definition.
Regular expression
In regular expressions, the POSIX [:space:] class contains the following characters.
| Name | Code (decimal, hex) | Regexp Escape |
|---|---|---|
| Horizontal Tabulation | HT (9, 0x09) | \t |
| Linefeed | LF (10, 0x0A) | \n |
| Vertical Tabulation | VT (11, 0x0B) | \x0B |
| Formfeed | FF (12, 0x0C) | \f |
| Carriage Return | CR (13, 0x0D) | \r |
| Space | space (32, 0x20) | (literal space) |
Note that the shorthand:
- \S matches any character that is not a whitespace character
- \s or [:space:] (written [[:space:]] inside a bracket expression) matches any whitespace character
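As a minimal sketch of the two shorthands, the Python example below splits a sample string into its whitespace and non-whitespace characters. It uses \s and \S only, because Python's `re` module does not support the POSIX [:space:] class; the sample string and variable names are made up for illustration.

```python
import re

# Hypothetical sample: letters separated by a space, a tab and a linefeed
text = "a b\tc\nd"

# \s matches any whitespace character
whitespace = re.findall(r"\s", text)

# \S matches any character that is not a whitespace character
non_whitespace = re.findall(r"\S", text)

print(whitespace)      # [' ', '\t', '\n']
print(non_whitespace)  # ['a', 'b', 'c', 'd']
```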
Unicode code
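Unicode defines a larger set (the White_Space property), which adds characters such as the no-break space (U+00A0). The whole set can be listed in a Unicode set tool by giving the class [:whitespace:].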
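The difference between the two definitions can be observed in an engine that supports both. The sketch below assumes Python, where \s follows Unicode for text patterns by default and is restricted to the ASCII whitespace characters with the `re.ASCII` flag; the no-break space is used as the test character.

```python
import re

# NO-BREAK SPACE: whitespace for Unicode, but outside the ASCII set
nbsp = "\u00a0"

# Default for str patterns: \s is Unicode-aware
unicode_match = re.fullmatch(r"\s", nbsp) is not None

# With re.ASCII, \s only matches space, \t, \n, \r, \f and \v
ascii_match = re.fullmatch(r"\s", nbsp, flags=re.ASCII) is not None

print(unicode_match)  # True
print(ascii_match)    # False
```

Other engines may draw the line differently, so it is worth checking which definition of \s a given tool applies.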
How to
Show them
To make them visible, tools generally replace them with visible characters (for example, a dot for a space).
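A minimal sketch of such a replacement in Python is given below. The function name `show_whitespace` and the chosen symbols are illustrative assumptions, not a standard; each tool picks its own visible characters.

```python
# Hypothetical mapping from whitespace characters to visible symbols
VISIBLE = {
    " ": "·",     # space
    "\t": "→",    # horizontal tabulation
    "\n": "¶\n",  # linefeed: mark it, then keep the line break
    "\r": "␍",    # carriage return
}

def show_whitespace(text: str) -> str:
    """Return text with whitespace characters replaced by visible symbols."""
    return text.translate(str.maketrans(VISIBLE))

print(show_whitespace("a b\tc\n"))  # a·b→c¶ followed by a line break
```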
