Unicode

Error handling

  • Encoding/decoding errors raise a UnicodeError exception.
  • The optional errors parameter controls this behavior:
     'strict'         # Errors raise a UnicodeError exception (the default)
     'ignore'         # Bad characters are silently dropped
     'replace'        # Bad characters are replaced by a replacement character
                      # ('?' when encoding, U+FFFD when decoding)
     
     >>> a = u"M\u00fcller"
     >>> a.encode('ascii')
     UnicodeError: ASCII encoding error: ordinal not in range(128)
     >>> a.encode('ascii','ignore')
     'Mller'
     >>> a.encode('ascii','replace')
     'M?ller'
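
The session above could be wrapped in a small helper along these lines. This is only a sketch: to_ascii is a hypothetical name, not part of Python, and falling back to 'replace' is just one possible policy. The decode call at the end illustrates the same errors parameter in the other direction. Note that on Python 3 the encoded results are bytes objects (b'...'), whereas the Python 2 interpreter shown on this slide returns plain strings.

```python
def to_ascii(text, errors='strict'):
    """Hypothetical helper: encode text as ASCII.

    If the requested error mode raises UnicodeError, fall back to
    'replace' so the caller always gets a result back.
    """
    try:
        return text.encode('ascii', errors)
    except UnicodeError:
        # Only reachable in 'strict' mode; 'ignore' and 'replace' never raise here.
        return text.encode('ascii', 'replace')

name = u"M\u00fcller"

strict_fallback = to_ascii(name)        # strict fails, so 'replace' is used
ignored = to_ascii(name, 'ignore')      # the u-umlaut is dropped

# Decoding accepts the same errors parameter. The byte 0xfc is not valid
# ASCII, so 'replace' substitutes U+FFFD (the Unicode replacement character).
decoded = b"M\xfcller".decode('ascii', 'replace')
```

Catching UnicodeError, as the slide names it, also covers the more specific UnicodeEncodeError and UnicodeDecodeError subclasses found in later Python versions.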
<<< O'Reilly OSCON 2001, New Features in Python 2, Slide 54
July 26, 2001, beazley@cs.uchicago.edu
>>>