There is an interesting thread on the python.list about comparing two strings while disregarding whitespace, without using re. As the discussion went on, two different requirements emerged:
Normalize whitespace: that is, “a\n b c” == “a b \n c” but “ab c” <> “a bc”.
Totally ignore whitespace: both “a\n b […]
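The two requirements can be sketched with plain string methods and no re. This is only a minimal illustration, not the solution from the thread: the function names are hypothetical, and it assumes that "normalizing" also trims leading and trailing whitespace, which is what str.split() with no arguments does.

```python
def normalize_ws(s):
    """Collapse every run of whitespace to one space and trim both ends."""
    # split() with no arguments splits on any whitespace run
    # and discards empty pieces at the start and end.
    return " ".join(s.split())


def strip_ws(s):
    """Remove every whitespace character from the string."""
    return "".join(s.split())


def equal_normalized(a, b):
    """Requirement 1: equal once whitespace runs are normalized."""
    return normalize_ws(a) == normalize_ws(b)


def equal_ignoring_ws(a, b):
    """Requirement 2: equal once all whitespace is dropped."""
    return strip_ws(a) == strip_ws(b)
```

Under the first rule “ab c” and “a bc” still differ, because the word boundaries are kept; under the second rule they compare equal, since both reduce to “abc”.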
¶
Posted 21 March 2006
§
python
‡
The key is that unicode in Python is an object: unicode(str, “utf-8”) makes that object from a UTF-8 encoded str, and str.encode(“utf-8”) encodes a string to the UTF-8 encoding.
To write unicode-aware Python code, I’ll need to:
when getting data, use unicode(str, “the_encoding”) to get a unicode object
use unicode objects inside my program, like all internal strings should be […]
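A rough sketch of the decode-on-input, encode-on-output idea follows. It is an assumption-laden illustration rather than code from the post: the helper names to_text and to_bytes are hypothetical, and it calls .decode() on the raw bytes instead of unicode(str, “utf-8”) so that the same lines also run on Python versions where the unicode() built-in no longer exists.

```python
def to_text(raw, encoding="utf-8"):
    """Decode raw input bytes into a text (unicode) object."""
    # Does the same job as unicode(raw, encoding) in the post.
    return raw.decode(encoding)


def to_bytes(text, encoding="utf-8"):
    """Encode a text object back into bytes for output."""
    return text.encode(encoding)


# Sample input: the word "café" as UTF-8 bytes (5 bytes, 4 characters).
raw = b"caf\xc3\xa9"

# Decode once, at the boundary where data enters the program.
text = to_text(raw)

# Internally, work only with the decoded text object.
char_count = len(text)

# Encode again only when the data leaves the program.
utf8_out = to_bytes(text)
latin1_out = to_bytes(text, "latin-1")
```

Keeping the decode and encode calls at the edges means the rest of the program never has to know which encoding the data arrived in.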
¶
Posted 09 March 2006
§
python
‡