Unicode - Encodings
Encodings
- Python provides the following string encodings (among others)
'ascii' # 7-bit ASCII (0-127)
'latin-1', 'iso-8859-1' # 8-bit extended ASCII (0-255)
'utf-8' # 8-bit variable length encoding
'utf-16' # 16-bit variable length encoding (output starts with a byte-order mark)
'utf-16-le' # 16-bit little endian
'utf-16-be' # 16-bit big endian
'unicode-escape' # Format used in u"xxxxx" literals
'raw-unicode-escape' # Format used in ur"xxxxx" literals
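A minimal sketch of how the choice of encoding changes the bytes produced. It assumes Python 3 (where every str is Unicode and encode() returns bytes) rather than the u"..." syntax used on this slide; the sample string and the names text, ENCODINGS, and sizes are illustrative only.

```python
# Python 3 sketch: compare the bytes several encodings produce for one string.
text = "caf\u00e9"  # 'cafe' with an acute accent: 4 code points, 1 outside ASCII

ENCODINGS = ("latin-1", "utf-8", "utf-16", "utf-16-le", "utf-16-be")

# Number of bytes each encoding needs for the same text.
sizes = {name: len(text.encode(name)) for name in ENCODINGS}

for name in ENCODINGS:
    data = text.encode(name)
    # plain 'utf-16' is 2 bytes longer than the -le/-be forms because it
    # prepends a byte-order mark
    print(f"{name:10} {len(data):2d} bytes  {data!r}")
```

The accented character takes one byte in latin-1, two in utf-8, and two in each utf-16 variant.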
- To encode: s.encode([encoding [,errors]])
>>> s = u"Hello"
>>> s.encode('utf-8')
'Hello'
>>> s.encode('utf-16-le')
'H\x00e\x00l\x00l\x00o\x00'
>>> s.encode('utf-16-be')
'\x00H\x00e\x00l\x00l\x00o'
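The optional errors argument in the signature above controls what happens when a character has no representation in the target encoding. The following is a sketch under the assumption of Python 3 (str is Unicode, encode() returns bytes); the names text, strict_failed, and results are illustrative.

```python
# Python 3 sketch: effect of the `errors` argument to str.encode().
text = "caf\u00e9"  # the last character cannot be represented in ASCII

# The default policy, 'strict', raises an exception on unencodable characters.
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The other standard policies substitute or drop the offending character.
results = {
    policy: text.encode("ascii", policy)
    for policy in ("ignore", "replace", "xmlcharrefreplace", "backslashreplace")
}

print(strict_failed)
for policy, data in results.items():
    print(f"{policy:18} {data!r}")
```

'ignore' and 'replace' lose information, so they fit best where an approximate result is acceptable; 'xmlcharrefreplace' and 'backslashreplace' keep the code point recoverable.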