author | Nicolas Lœuillet <nicolas@loeuillet.org> | 2014-07-23 13:44:48 +0200 |
---|---|---|
committer | Nicolas Lœuillet <nicolas@loeuillet.org> | 2014-07-23 13:44:48 +0200 |
commit | 887b015def3098f1e898e7bf3338fa2d093b6d95 (patch) | |
tree | 41206132200aa9390e11d600ad2b84ffa23242e4 /inc/3rdparty/site_config/standard/fnal.gov.txt | |
parent | ebd6bf6007e0fad4c3e11dac0e79f687e1d195a2 (diff) | |
parent | 505a74ad1de7cf2cd3605e793233365501f03d87 (diff) | |
Merge branch 'refactor' into dev
Diffstat (limited to 'inc/3rdparty/site_config/standard/fnal.gov.txt')
-rwxr-xr-x[-rw-r--r--] | inc/3rdparty/site_config/standard/fnal.gov.txt | 26 |
1 files changed, 13 insertions, 13 deletions
diff --git a/inc/3rdparty/site_config/standard/fnal.gov.txt b/inc/3rdparty/site_config/standard/fnal.gov.txt
index 7faa6bfc..e404ccb8 100644..100755
--- a/inc/3rdparty/site_config/standard/fnal.gov.txt
+++ b/inc/3rdparty/site_config/standard/fnal.gov.txt
@@ -1,15 +1,15 @@ | |||
1 | title: normalize(//h1) | 1 | title: normalize(//h1) |
2 | 2 | ||
3 | author: //td/p[position()=last()]/em | 3 | author: //td/p[position()=last()]/em |
4 | 4 | ||
5 | # I swear, this is really the best way to do this | 5 | # I swear, this is really the best way to do this |
6 | date: normalize(//td[contains(@style, "color: #ffffff")]) | 6 | date: normalize(//td[contains(@style, "color: #ffffff")]) |
7 | 7 | ||
8 | # my god, it's full of tables | 8 | # my god, it's full of tables |
9 | body: /table/tbody/tr[5]//table/tbody//table/tbody/tr/td | 9 | body: /table/tbody/tr[5]//table/tbody//table/tbody/tr/td |
10 | strip: //h1 | 10 | strip: //h1 |
11 | 11 | ||
12 | # the following two lines strip the byline at the end of the article (the byline is a <p> that consists of an em dash and then some text in an <em>). I have no idea why I can't just strip //p[position()=last()], but trying to do so includes a bunch of other crap in the output. | 12 | # the following two lines strip the byline at the end of the article (the byline is a <p> that consists of an em dash and then some text in an <em>). I have no idea why I can't just strip //p[position()=last()], but trying to do so includes a bunch of other crap in the output. |
13 | strip: //p[position()=last()]/em | 13 | strip: //p[position()=last()]/em |
14 | strip: //p[position()=last()]/child::text() | 14 | strip: //p[position()=last()]/child::text() |
15 | test_url: http://www.fnal.gov/pub/today/archive_2011/today11-11-09_MuonDepartmentReadMore.html \ No newline at end of file | 15 | test_url: http://www.fnal.gov/pub/today/archive_2011/today11-11-09_MuonDepartmentReadMore.html \ No newline at end of file |
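
The config comment about `//p[position()=last()]` may have a simple explanation: in XPath 1.0 that predicate is evaluated per parent, so it selects the last `<p>` inside every element that contains paragraphs, not the last `<p>` in the document. The sketch below illustrates this on a small made-up fragment. The markup, the `id` values, and the helper names `last_paragraphs` and `describe` are assumptions for illustration only, not the real fnal.gov page or anything from wallabag. Python's standard `xml.etree.ElementTree` accepts only a subset of XPath, so the predicate is written `[last()]`, which has the same meaning as `[position()=last()]`.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, loosely modelled on a table-based layout.
# This is NOT the actual fnal.gov HTML; a plain hyphen stands in for the em dash.
SAMPLE = """
<table>
  <tr>
    <td id="nav"><p>Menu</p><p>Search</p></td>
    <td id="article">
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
      <p>- <em>Byline Author</em></p>
    </td>
  </tr>
</table>
"""


def last_paragraphs(xml_text):
    """Return every <p> that is the last <p> child of its own parent."""
    root = ET.fromstring(xml_text)
    # ElementTree spells the predicate [last()]; in full XPath 1.0 this is
    # equivalent to //p[position()=last()].
    return root.findall(".//p[last()]")


def describe(p):
    """Flatten an element's text content for display."""
    return "".join(p.itertext()).strip()


matches = last_paragraphs(SAMPLE)
for p in matches:
    print(describe(p))
```

In this fragment the expression matches one paragraph in each `<td>`, so stripping on it would also remove the unrelated "Search" paragraph. Whether this is what happened on the real page is an assumption, but it is consistent with the config narrowing the rule to the `<em>` and text children of the last `<p>`.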