pyquery

Installation

# pacman -S python-pyquery

Sample

pick_up_pyquery.py

Run command

./pick_up_pyquery.py list_inp1763_1.html list_1763.json

Example that raises an error

$ ./pick_up_pyquery.py list_inp112_1.html list_112.json
*** 開始 ***
Traceback (most recent call last):
  File "./pick_up_pyquery.py", line 21, in <module>
    doc = PyQuery(filename=file_in)
  File "/usr/lib/python3.4/site-packages/pyquery/pyquery.py", line 201, in __init__
    elements = fromstring(html, self.parser)
  File "/usr/lib/python3.4/site-packages/pyquery/pyquery.py", line 66, in fromstring
    result = getattr(etree, meth)(context)
  File "lxml.etree.pyx", line 3310, in lxml.etree.parse (src/lxml/lxml.etree.c:72517)
  File "parser.pxi", line 1812, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:106204)
  File "parser.pxi", line 1832, in lxml.etree._parseFilelikeDocument (src/lxml/lxml.etree.c:106464)
  File "parser.pxi", line 1727, in lxml.etree._parseDocFromFilelike (src/lxml/lxml.etree.c:105354)
  File "parser.pxi", line 1146, in lxml.etree._BaseParser._parseDocFromFilelike (src/lxml/lxml.etree.c:100481)
  File "parser.pxi", line 580, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:94350)
  File "parser.pxi", line 686, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:95751)
  File "lxml.etree.pyx", line 316, in lxml.etree._ExceptionContext._raise_if_stored (src/lxml/lxml.etree.c:10323)
  File "parser.pxi", line 369, in lxml.etree._FileReaderContext.copyToBuffer (src/lxml/lxml.etree.c:92111)
  File "/usr/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xba in position 21: invalid start byte
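
The traceback shows that PyQuery(filename=file_in) ends up decoding the file as UTF-8, and the input contains a byte (0xba) that is not valid UTF-8 at that position. One possible workaround, sketched below, is to read the file as bytes, decode it explicitly, and pass the resulting string to PyQuery. The helper names read_html_text and load_document and the list of candidate encodings are assumptions made for illustration; the actual encoding of list_inp112_1.html is not known from this page.

```python
# Hypothetical helpers: not part of pick_up_pyquery.py or of pyquery itself.

# Candidate encodings are a guess for Japanese pages; adjust them as needed.
# The order matters, because a wrong encoding can sometimes decode without error.
CANDIDATE_ENCODINGS = ("utf-8", "euc_jp", "cp932")


def read_html_text(path, encodings=CANDIDATE_ENCODINGS):
    """Return the file content decoded with the first encoding that succeeds."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Last resort: decode as UTF-8 and replace the bytes that cannot be decoded.
    return raw.decode("utf-8", errors="replace")


def load_document(path):
    """Build a PyQuery document from an explicitly decoded file."""
    # Imported here so that read_html_text works even without pyquery installed.
    from pyquery import PyQuery
    return PyQuery(read_html_text(path), parser="html")
```

With these helpers, doc = load_document(file_in) would take the place of doc = PyQuery(filename=file_in) in the script. If the encoding of the input is known in advance, passing only that encoding is safer than guessing.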

Jun/26/2015 AM 08:15