author | Nicolas Lœuillet <nicolas.loeuillet@smile.fr> | 2014-10-10 13:33:54 +0200 |
---|---|---|
committer | Nicolas Lœuillet <nicolas.loeuillet@smile.fr> | 2014-10-10 13:33:54 +0200 |
commit | 44d35257e805856b4913c63fcbed3c0acb64bae8 (patch) | |
tree | 11e9d276c34b1b287706cb61182bdc71729661e2 /inc/3rdparty/site_config/standard/bostonglobe.com.txt | |
parent | af8292c1de1886cd975d79f0f42df40e0bd1c5bd (diff) | |
parent | cf8a5e1eedbed484dbcb1ddc9f7a13fc19b7a27b (diff) | |
Merge branch 'dev' 1.8.0
Diffstat (limited to 'inc/3rdparty/site_config/standard/bostonglobe.com.txt')
-rwxr-xr-x[-rw-r--r--] | inc/3rdparty/site_config/standard/bostonglobe.com.txt | 28 |
1 files changed, 14 insertions, 14 deletions
diff --git a/inc/3rdparty/site_config/standard/bostonglobe.com.txt b/inc/3rdparty/site_config/standard/bostonglobe.com.txt
index d3e6f43f..4c74a34e 100644..100755
--- a/inc/3rdparty/site_config/standard/bostonglobe.com.txt
+++ b/inc/3rdparty/site_config/standard/bostonglobe.com.txt
@@ -1,16 +1,16 @@
1 | # NOTE: If testing this configuration yields bad results, including junk text like "Try BostonGlobe.com today" and "THIS STORY APPEARED IN", please replace the Test URL with a current-day headline link from bostonglobe.com. | 1 | # NOTE: If testing this configuration yields bad results, including junk text like "Try BostonGlobe.com today" and "THIS STORY APPEARED IN", please replace the Test URL with a current-day headline link from bostonglobe.com. |
2 | 2 | ||
3 | title: //div[@class="header"]/h1 | 3 | title: //div[@class="header"]/h1 |
4 | author: substring-after(//div[@class="byline"]/h2[@class="author"],"By ") | 4 | author: substring-after(//div[@class="byline"]/h2[@class="author"],"By ") |
5 | date: //div[@class="byline"]/p[last()] | 5 | date: //div[@class="byline"]/p[last()] |
6 | body: //div[@class="article-body"] | 6 | body: //div[@class="article-body"] |
7 | 7 | ||
8 | strip_id_or_class: aside | 8 | strip_id_or_class: aside |
9 | strip_id_or_class: promo | 9 | strip_id_or_class: promo |
10 | strip_id_or_class: skip-nav | 10 | strip_id_or_class: skip-nav |
11 | strip_id_or_class: article-more | 11 | strip_id_or_class: article-more |
12 | strip_id_or_class: article-bar | 12 | strip_id_or_class: article-bar |
13 | 13 | ||
14 | # This removes image captions. If the parser starts saving images from bostonglobe.com (currently, it does not), then this directive should be removed. | 14 | # This removes image captions. If the parser starts saving images from bostonglobe.com (currently, it does not), then this directive should be removed. |
15 | strip_id_or_class: figure | 15 | strip_id_or_class: figure |
16 | test_url: http://bostonglobe.com/news/nation/2012/03/17/illinois-primary-could-pivotal/PsDzFZqvhEYyXbOcF9FOkO/story.html \ No newline at end of file | 16 | test_url: http://bostonglobe.com/news/nation/2012/03/17/illinois-primary-could-pivotal/PsDzFZqvhEYyXbOcF9FOkO/story.html \ No newline at end of file |
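
The file shown in this diff is made of plain `name: value` directives, with `#` comment lines and directives such as `strip_id_or_class` that may repeat. As a rough illustration of that layout only, here is a minimal Python sketch that reads such text into a dictionary of lists. The names `parse_site_config` and `SAMPLE` are hypothetical and introduced for this example; this is not the parser wallabag itself uses, and it does not evaluate the XPath expressions.

```python
from collections import defaultdict


def parse_site_config(text):
    """Collect 'name: value' directives, skipping blank lines and '#' comments.

    Hypothetical helper for illustration; repeated directives are kept in order.
    """
    directives = defaultdict(list)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only: XPath values and URLs contain colons too.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a directive line
        directives[name.strip()].append(value.strip())
    return dict(directives)


# A shortened copy of the directives from the diff above.
SAMPLE = '''\
# This removes image captions.
title: //div[@class="header"]/h1
author: substring-after(//div[@class="byline"]/h2[@class="author"],"By ")
strip_id_or_class: aside
strip_id_or_class: promo
test_url: http://bostonglobe.com/news/nation/2012/03/17/illinois-primary-could-pivotal/PsDzFZqvhEYyXbOcF9FOkO/story.html
'''

config = parse_site_config(SAMPLE)
for name, values in config.items():
    print(name, values)
```

Keeping every value in a list, even for single-use directives like `title`, is one simple way to preserve the order of the repeated `strip_id_or_class` entries.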