Unicode - Decoding
Decoding
- Conversion of raw byte-streams back into Unicode strings
- unicode(s [, encoding [, errors]])
>>> e = 'H\000e\000l\000l\000o\000'
>>> unicode(e,'utf-16-le')
u'Hello'
>>> unicode('hello', 'utf-8')
u'hello'
>>>
- Of course, to properly decode a string, you need to know what encoding was used
- Usually, this is obtained elsewhere (e.g., MIME header)
Content-Type: text/plain; charset=utf-8
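A minimal sketch of how the two points above might fit together, assuming the encoding arrives as the charset parameter of a MIME Content-Type header. The helper name decode_body and its default/errors arguments are hypothetical, chosen for illustration. The bytes method .decode() is used as the counterpart of unicode(s, encoding, errors).

```python
from email import message_from_string

def decode_body(raw_headers, body_bytes, default='ascii', errors='strict'):
    """Decode body_bytes using the charset named in the MIME headers.

    Hypothetical helper: falls back to `default` when the headers
    carry no charset parameter.
    """
    msg = message_from_string(raw_headers)
    # get_content_charset() returns the lowercased charset, or None if absent
    charset = msg.get_content_charset() or default
    # errors may be 'strict' (raise), 'ignore' (drop), or 'replace' (U+FFFD)
    return body_bytes.decode(charset, errors)

headers = "Content-Type: text/plain; charset=utf-8\n\n"
text = decode_body(headers, b'caf\xc3\xa9')   # u'caf\xe9'
```

With the default errors='strict', bytes that are invalid for the chosen encoding raise UnicodeDecodeError; passing errors='replace' substitutes U+FFFD for them instead.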