Unicode - I/O

I/O in a nutshell

  • To write data, it must first be encoded in some external format
     u = u"..."   # Big Unicode string
     f = open("foo","w")
     f.write(u.encode('utf-8'))
     f.close()
  • To read data, it must be decoded after reading
     f = open("foo")
     u = unicode(f.read(),'utf-8')   # Raw bytes -> Unicode string
     f.close()
  • Unfortunately, explicit encoding/decoding on every read and write is awkward and error-prone

Solution: use the codecs module
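
A minimal sketch of what that can look like, assuming the codecs.open() function, which returns a file object that encodes on write and decodes on read. The sample text and the temporary file path are made up for illustration and are not from the slides.

```python
import codecs
import os
import tempfile

# Hypothetical sample text and file location, chosen only for illustration
text = u"caf\u00e9 \u4e16\u754c"
path = os.path.join(tempfile.mkdtemp(), "foo")

# Writing: the file object encodes Unicode strings to UTF-8 itself
f = codecs.open(path, "w", "utf-8")
f.write(text)
f.close()

# Reading: the file object decodes UTF-8 bytes back into a Unicode string
f = codecs.open(path, "r", "utf-8")
result = f.read()
f.close()
```

The encoding is named once, when the file is opened, so the explicit encode() and unicode() calls from the earlier snippets are no longer needed.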

<<< O'Reilly OSCON 2001, New Features in Python 2, Slide 55
July 26, 2001, beazley@cs.uchicago.edu
>>>