Hpricot is a fast, flexible HTML parser written in C. It's designed
to be very accommodating (like Tanaka Akira's HTree) and to have a
very helpful library (like those some JavaScript libs -- jQuery,
Prototype -- give you). The XPath and CSS parser, in fact, is based on
John Resig's jQuery.

Also, Hpricot can be handy for reading broken XML files, since many of
the same techniques can be used. If a quote is missing, Hpricot tries
to figure it out. If tags overlap, Hpricot works on sorting them out.
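
To give a feel for what "sorting out" overlapping tags can involve, here
is a minimal sketch in plain Ruby. It is not Hpricot's code and does not
describe its internals; the helper name close_overlapping_tags is
hypothetical, and the approach (a stack of open tag names) is just one
common way to re-nest tags. It ignores attributes on close, void
elements like <br>, and missing quotes.

```ruby
# Hypothetical helper for illustration only; NOT part of Hpricot's API
# and not a description of how Hpricot works internally.
def close_overlapping_tags(markup)
  open_tags = []   # names of tags opened but not yet closed
  out = +""        # unfrozen buffer for the repaired markup

  # Tokens are: a whole tag, a run of text, or a stray "<".
  markup.scan(/<[^>]*>|[^<]+|</) do |token|
    if (m = token.match(/\A<\/\s*([A-Za-z][\w:-]*)/))
      name = m[1].downcase
      idx = open_tags.rindex(name)
      next if idx.nil?   # close tag with no matching open tag: drop it
      # Close `name` and everything opened inside it, innermost first.
      open_tags.slice!(idx..).reverse_each { |t| out << "</#{t}>" }
    elsif (m = token.match(/\A<([A-Za-z][\w:-]*)/)) && !token.end_with?("/>")
      open_tags << m[1].downcase
      out << token
    else
      out << token       # text, comments, self-closing tags pass through
    end
  end

  open_tags.reverse_each { |t| out << "</#{t}>" }  # close any leftovers
  out
end

puts close_overlapping_tags("<b><i>text</b></i>")
```

With the overlapping input above, the inner <i> is closed before </b>,
and the now-unmatched </i> is dropped. A real parser has to make many
more judgment calls than this.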