unicode(str, "utf-8") and str.encode("utf-8")

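The key is that unicode in Python is an object. unicode(str, "utf-8") builds that object from a UTF-8 encoded str, and str.encode("utf-8") encodes a string to UTF-8. In Python 2, call encode on the unicode object: a byte str would first be implicitly decoded as ASCII.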
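
A minimal sketch of that round trip. It uses the decode method, which does the same job as unicode(raw, "utf-8") on Python 2 and also runs on Python 3, where the unicode builtin no longer exists. The sample bytes are only an illustration.

```python
# UTF-8 bytes for "cafe" with an acute accent on the e:
# the accented letter takes two bytes (0xC3 0xA9).
raw = b"caf\xc3\xa9"

# Decode: bytes in a known encoding -> unicode object.
# On Python 2 this is equivalent to unicode(raw, "utf-8").
text = raw.decode("utf-8")

# Encode: unicode object -> bytes in the chosen encoding.
encoded = text.encode("utf-8")

print(len(raw))        # number of bytes
print(len(text))       # number of characters
print(encoded == raw)  # the round trip preserves the original bytes
```

The byte string is one element longer than the decoded text, which is the practical difference between "encoded bytes" and "unicode object".
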
To write unicode-aware Python code, I’ll need to:

* when getting data, use unicode(str, "the_encoding") to get a unicode object
* use unicode objects inside my program; all internal strings should be […]
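
The steps above can be sketched roughly as follows. The function name shout and the Latin-1 input are made up for illustration, and the final encode step is my assumption, based on the encode call described earlier. The point is only the shape: decode on the way in, work on unicode, encode on the way out.

```python
def shout(raw, encoding):
    """Hypothetical helper: upper-case text that arrives as encoded bytes."""
    text = raw.decode(encoding)   # boundary in: bytes -> unicode
    text = text.upper()           # internal work happens on unicode only
    return text.encode("utf-8")   # boundary out: unicode -> UTF-8 bytes (assumed step)

# Input encoded as Latin-1, where e-acute is the single byte 0xE9.
latin1_input = b"caf\xe9"
result = shout(latin1_input, "latin-1")
print(repr(result))
```

Because the upper-casing runs on the decoded text, the accented letter is handled as one character regardless of how many bytes it occupied in the input encoding.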

Reading a UTF-8 file in Python

import codecs

# codecs.open decodes the file's bytes as UTF-8, so read() returns unicode text
fp = codecs.open(fileName, "r", "utf-8")
text = fp.read()
fp.close()
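
For a version that can be run as-is, here is a small sketch that writes a scratch file first and then reads it back with codecs.open. The temporary file and its contents are assumptions made only to keep the example self-contained.

```python
import codecs
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# Write raw UTF-8 bytes, as another program might have done.
with open(path, "wb") as out:
    out.write(b"caf\xc3\xa9\n")

# codecs.open decodes while reading, so read() returns unicode text.
with codecs.open(path, "r", "utf-8") as fp:
    text = fp.read()

os.remove(path)
print(repr(text))
```

On Python 3 the built-in open(path, encoding="utf-8") does the same decoding.
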
* http://evanjones.ca/python-utf8.html
* http://www.jorendorff.com/articles/unicode/python.html
