Python Tips
- Indent with 4 spaces, per PEP 8: Style Guide for Python Code
- MoriMoin: Python
Quick parsing of XML whose DTD cannot be resolved properly
Use minidom instead of Sax2.
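from xml.dom import minidom
import urllib

# args is the URL to fetch; urllib.urlopen with proxies is the Python 2 API
stream = urllib.urlopen(args, proxies={'http': 'http://localhost:8080/'})
dom = minidom.parseString(stream.read())
# collect elements by tag name, anywhere in the document
elms = dom.getElementsByTagName("item")
title = dom.getElementsByTagName("title")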
- http://mail.python.org/pipermail/python-list/2004-February/209444.html
- http://www.python.jp/doc/nightly/lib/module-xml.dom.minidom.html
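As a rough, network-free sketch of the same idea: the snippet below parses an inline document whose DOCTYPE points at a DTD that is never fetched (as far as I know, minidom's expat-based parser does not load external DTDs by default). The feed content, the `example.invalid` URL, and the helper name `item_titles` are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical feed; the DTD URL below is not expected to be reachable.
SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE rss SYSTEM "http://example.invalid/missing.dtd">
<rss>
  <channel>
    <title>Example feed</title>
    <item><title>First entry</title></item>
    <item><title>Second entry</title></item>
  </channel>
</rss>"""


def item_titles(xml_text):
    """Return the text of every <title> that sits inside an <item>."""
    dom = minidom.parseString(xml_text)
    titles = []
    for item in dom.getElementsByTagName("item"):
        for node in item.getElementsByTagName("title"):
            # Join the text nodes directly under <title>.
            text = "".join(
                child.data
                for child in node.childNodes
                if child.nodeType == child.TEXT_NODE
            )
            titles.append(text)
    return titles


titles = item_titles(SAMPLE)
print(titles)
```

Searching for "title" from each item, and not from the whole document, keeps the channel-level title out of the result.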
Unescaping HTML entities
The cgi module only has escape.
def unescape(s):
    s = s.replace("&lt;", "<")
    s = s.replace("&gt;", ">")
    # this has to be last:
    s = s.replace("&amp;", "&")
    return s
- http://wiki.python.org/moin/EscapingHtml
- http://www.python.org/doc/2.3/lib/module-xml.sax.saxutils.html#l2h-4194 - an unescape function exists in the sax library (xml.sax.saxutils)
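A minimal sketch of using the library version mentioned above, `xml.sax.saxutils.unescape`. It handles `&lt;`, `&gt;` and `&amp;` on its own; other entities such as `&quot;` have to be passed in through the optional dictionary. The sample strings are only illustrations.

```python
from xml.sax.saxutils import escape, unescape

# Entities beyond &lt;, &gt;, &amp; must be supplied explicitly.
EXTRA = {"&quot;": '"'}

link = unescape("&lt;a href=&quot;x&quot;&gt;", EXTRA)
print(link)

# "&amp;" is replaced last, so an escaped "&lt;" is unescaped only one level.
one_level = unescape("&amp;lt;")
print(one_level)

# Round trip through escape() for the three default characters.
original = "1 < 2 & 3 > 2"
round_trip = unescape(escape(original))
print(round_trip == original)
```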
A format-string pitfall
>>> print("%d/%d" % 3, 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: not enough arguments for format string
>>> print("%d/%d" % (3, 7))
3/7
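The failure above comes from operator precedence: `%` is applied to `3` alone before the comma is considered, so the format string receives one argument instead of two. The sketch below shows the failing and working forms side by side; the variable names are illustrative only.

```python
numerator, denominator = 3, 7

# Wrong: only one value is handed to a two-placeholder format string.
try:
    "%d/%d" % numerator
except TypeError as exc:
    error = str(exc)

# Right: wrap the values in a tuple.
as_tuple = "%d/%d" % (numerator, denominator)

# A single value is safest as a one-element tuple as well.
single = "%d items" % (numerator,)

# str.format takes the values as ordinary arguments, which avoids the issue.
formatted = "{}/{}".format(numerator, denominator)

print(error)
print(as_tuple, single, formatted)
```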
