Unicode and Standard Strings

Default Encoding

  • Default encoding/decoding is determined at interpreter startup
  • Can be obtained from sys.getdefaultencoding()
  • Default is usually 'ascii'
  • Can be changed in site.py or sitecustomize.py by calling sys.setdefaultencoding()
  • However, this is a good way to get strange program behavior.
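
A minimal sketch of the lookup named above. The variable name `encoding` is only illustrative, and no expected output is shown because the value depends on the interpreter: the slide gives 'ascii' as the usual Python 2 value, and other builds or versions may report something else.

```python
import sys

# Ask the running interpreter which encoding it uses for implicit
# conversions between standard strings and Unicode strings.
encoding = sys.getdefaultencoding()
print(encoding)
```

Checking this value at startup is a cheap way to notice a site.py or sitecustomize.py that has changed it.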

Comments

  • Mixing of Unicode and standard strings mostly works like you expect
  • If both strings contain the same ASCII characters, they compare equal, have the same hash value, etc.
  • Conversion may raise a UnicodeError when a string contains characters the default encoding cannot handle.
  • Performance is obviously worse if many conversions are performed.
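
The points above can be sketched as follows. This is an illustration under stated assumptions, not the interpreter's own code: Python 2 performs the decode implicitly when the two string types are mixed, while the sketch spells it out with a hypothetical helper, `to_unicode`, and assumes the 'ascii' default so that the steps are visible. The `b` prefix requires Python 2.6 or later.

```python
def to_unicode(raw, encoding="ascii"):
    """Hypothetical helper: decode a standard (byte) string explicitly,
    mirroring the conversion that mixing the two types triggers."""
    return raw.decode(encoding)

plain = b"hello"    # standard string
text = u"hello"     # Unicode string

# Identical ASCII characters: equal after conversion, same hash value.
same = to_unicode(plain) == text
same_hash = hash(to_unicode(plain)) == hash(text)

# A byte outside the ASCII range cannot be converted with the default codec.
try:
    to_unicode(b"caf\xe9")
    failed = False
except UnicodeError:    # UnicodeDecodeError is a subclass of UnicodeError
    failed = True

print(same, same_hash, failed)
```

Decoding once at the program boundary, rather than on every comparison or concatenation, also avoids the repeated conversion cost noted in the last bullet.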

O'Reilly OSCON 2001, New Features in Python 2, Slide 59
July 26, 2001, beazley@cs.uchicago.edu