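According to the XML spec (http://www.w3.org/TR/REC-xml/#NT-EncodingDecl), whitespace is allowed around the equals sign in the encoding declaration, so encoding = "utf-8" is as valid as encoding="utf-8". The regular expression in feedparser only matches the form without whitespace. Here is a simple patch: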
--- /usr/ports/textproc/py-feedparser/work/feedparser/feedparser.py.old Sat Jul 2 16:17:11 2005
+++ /usr/ports/textproc/py-feedparser/work/feedparser/feedparser.py Sat Jul 2 16:18:25 2005
@@ -2101,7 +2101,7 @@
         else:
             # ASCII-compatible
             pass
-        xml_encoding_match = re.compile('^<\?.*encoding=[\'"](.*?)[\'"].*\?>').match(xml_data)
+        xml_encoding_match = re.compile('^<\?.*encoding\s*=\s*[\'"](.*?)[\'"].*\?>').match(xml_data)
     except:
         xml_encoding_match = None
     if xml_encoding_match:
I’ve sent this patch to the author via sf.net: https://sourceforge.net/tracker/index.php?func=detail&aid=1231408&group_id=112328&atid=661939
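To see what the change does in practice, here is a minimal standalone sketch that runs both patterns against two XML declarations. The helper name declared_encoding and the two sample strings are made up for illustration and are not part of feedparser; the pattern with optional whitespace assumes that the patch is meant to allow any amount of whitespace (including none) around the equals sign, as the spec does.

```python
import re

# Pattern before the patch: no whitespace permitted around "=".
OLD_PATTERN = re.compile(r'^<\?.*encoding=[\'"](.*?)[\'"].*\?>')
# Pattern after the patch (assumed form): optional whitespace on either side of "=".
NEW_PATTERN = re.compile(r'^<\?.*encoding\s*=\s*[\'"](.*?)[\'"].*\?>')


def declared_encoding(pattern, xml_data):
    """Return the encoding name captured by pattern, or None if it does not match."""
    match = pattern.match(xml_data)
    return match.group(1) if match else None


# Hypothetical sample declarations, not taken from a real feed.
compact = '<?xml version="1.0" encoding="utf-8"?>'
spaced = '<?xml version="1.0" encoding = "utf-8"?>'

for label, sample in (("compact", compact), ("spaced", spaced)):
    print(label, "old:", declared_encoding(OLD_PATTERN, sample))
    print(label, "new:", declared_encoding(NEW_PATTERN, sample))
```

Both patterns find utf-8 in the compact declaration, but only the patched one finds it in the spaced declaration; with the old pattern the declared encoding is silently ignored.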
Tags: python, feedparser, rss