m1gin 112

To read xml document with terminal we can use xmllint tool.

Lets say we have a file named 1.xml with the below content:

<?xml version="1.0" encoding="utf-8"?>
<item>
<definition test="abc">v. to provide a place to sleep or live</definition>
<definition>def2</definition>
<word>accommodates</word>
</item>

Get elements:

xmllint --xpath "/item/definition" 1.xml

Output: <definition test="abc">v. to provide a place to sleep or live</definition><definition>def2</definition>

Get text of elements:

xmllint --xpath "//item/word/text()" 1.xml

Output: accommodates

Get attributes:

xmllint --xpath "/item/definition/@test" 1.xml

Output: test="abc"

Select elements by its attribute:

xmllint --xpath "/item/definition[@test='abc']" 1.xml

Output: <definition test="abc">v. to provide a place to sleep or live</definition>


Search element by part of text:

xmllint --xpath "//*[contains(*,'sleep')]" 1.xml

Search element by part of attribute:

xmllint --xpath "//*[contains(@test,'bc')]" 1.xml

Parse HTML URL:

xmllint --html --xpath //div/a "http://blog.mbirgin.com"

Parse HTML and ignore error messages:

curl -s http://www.mbirgin.com | xmllint --html --xpath '(//td[contains(@class,"lastpost") and contains(@class,"windowbg2")])[1]/text()' 2>/dev/null - ​​​​​​​


#from blog.mbirgin.com, archive, xml parsing, terminal xml, select xml node, read xml document, bash, command line, xmllint, xmllint examples, parse html cli, cli parse html url

Add to: