author | Nicolas Lœuillet <nicolas@loeuillet.org> | 2014-07-13 10:15:40 +0200 |
---|---|---|
committer | Nicolas Lœuillet <nicolas@loeuillet.org> | 2014-07-13 10:15:40 +0200 |
commit | 4e067ceabd705201a16b4c92cf4b23f3b990326c | |
tree | 939f3a8e5ff3ab9ee414a57a895d3e78e1d46ce3 /inc/3rdparty/site_config/standard/allthingsd.com.txt | |
parent | 58dbe103889148def78b0fc8744d3f94c56a1561 | |
updated specific configuration for parsing
Diffstat (limited to 'inc/3rdparty/site_config/standard/allthingsd.com.txt')
-rwxr-xr-x[-rw-r--r--] | inc/3rdparty/site_config/standard/allthingsd.com.txt | 21 |
1 files changed, 12 insertions, 9 deletions
diff --git a/inc/3rdparty/site_config/standard/allthingsd.com.txt b/inc/3rdparty/site_config/standard/allthingsd.com.txt
index cd52498f..f8c67d02 100644..100755
--- a/inc/3rdparty/site_config/standard/allthingsd.com.txt
+++ b/inc/3rdparty/site_config/standard/allthingsd.com.txt
@@ -1,10 +1,13 @@
1 | title://div[@class="article-title"]/h1[@class="title"] | 1 | title://div[@class="article-title"]/h1[@class="title"] |
2 | date: //p[@class="article-date"] | 2 | date: //p[@class="article-date"] |
3 | body://*[@class="article-body article-text"] | 3 | body://div[contains(@class, "article-body")] |
4 | # Trim out related posts at bottom of article | 4 | # Trim out related posts at bottom of article |
5 | strip://blockquote[@class="memo"] | 5 | strip://blockquote[@class="memo"] |
6 | | 6 | |
7 | # Yup, no idea why author won't work... | 7 | tidy: no |
8 | author://div[@class="page-header article-header clearfix"]/p[@class="title"] | 8 | |
| | | 9 | # Yup, no idea why author won't work... |
| | | 10 | author://div[@class="page-header article-header clearfix"]/p[@class="title"] |
9 | # [Marco:] Author won't work here because the page defines the "home" link under the author's name as rel="author", which always gets priority if the page has defined it. | 11 | # [Marco:] Author won't work here because the page defines the "home" link under the author's name as rel="author", which always gets priority if the page has defined it. |
10 | test_url: http://allthingsd.com/20120513/exclusive-yahoos-thompson-out-levinsohn-in-board-settlement-with-loeb-nears-completion/ \ No newline at end of file | 12 | test_url: http://allthingsd.com/20120513/exclusive-yahoos-thompson-out-levinsohn-in-board-settlement-with-loeb-nears-completion/ |
| | | 13 | test_url: http://allthingsd.com/20131010/google-cio-ben-fried-on-how-google-works/ \ No newline at end of file |
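
The body rule on line 3 is the substantive change in this patch: the old selector compared the whole class attribute against "article-body article-text", while the new one only requires that the attribute contain "article-body". The commit message does not say why, so the sketch below is an illustration, not the project's code. It assumes a hypothetical page whose class list has gained an extra token (`entry`), and it uses Python's standard-library ElementTree, which has no `contains()`, so the substring test is emulated in plain Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (an assumption, not taken from allthingsd.com):
# the class attribute carries one more token than the old rule expected.
PAGE = """
<html><body>
  <div class="article-body article-text entry">
    <p>Story text.</p>
  </div>
  <div class="sidebar">
    <p>Unrelated.</p>
  </div>
</body></html>
"""


def match_exact(root, class_value):
    """Old rule, //*[@class="..."]: compares the whole attribute string."""
    return root.findall(f".//*[@class='{class_value}']")


def match_contains(root, tag, needle):
    """New rule, //div[contains(@class, "...")]: a substring test.

    ElementTree's XPath subset lacks contains(), so the check is done here.
    """
    return [el for el in root.iter(tag) if needle in el.get("class", "")]


root = ET.fromstring(PAGE)
old_hits = match_exact(root, "article-body article-text")
new_hits = match_contains(root, "div", "article-body")
print(len(old_hits), len(new_hits))
```

On this sample the exact rule finds nothing and the substring rule finds the article div. Because `contains()` is a plain substring test, it would also match a class such as `article-body-footer`, which is the usual trade-off of the looser selector.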