]> git.immae.eu Git - github/wallabag/wallabag.git/blame - inc/3rdparty/site_config/standard/fnal.gov.txt
Merge branch 'dev' of github.com:wallabag/wallabag into dev
[github/wallabag/wallabag.git] / inc / 3rdparty / site_config / standard / fnal.gov.txt
CommitLineData
4e067cea
NL
1title: normalize(//h1)
2
3author: //td/p[position()=last()]/em
4
5# I swear, this is really the best way to do this
6date: normalize(//td[contains(@style, "color: #ffffff")])
7
8# my god, it's full of tables
9body: /table/tbody/tr[5]//table/tbody//table/tbody/tr/td
10strip: //h1
11
12# the following two lines strip the byline at the end of the article (the byline is a <p> that consists of an em dash and then some text in an <em>). I have no idea why I can't just strip //p[position()=last()], but trying to do so includes a bunch of other crap in the output.
13strip: //p[position()=last()]/em
ac4d1142
NL
14strip: //p[position()=last()]/child::text()
15test_url: http://www.fnal.gov/pub/today/archive_2011/today11-11-09_MuonDepartmentReadMore.html