# github/wallabag/wallabag.git: inc/3rdparty/site_config/standard/nytimes.com.txt
title://h1[@class="articleHeadline"]
body://div[@id="article"]
strip_id_or_class:articleTools
strip_id_or_class:readerscomment
#strip://div[contains(@class, "articleInline runaroundLeft")]
strip: //div[contains(@class, "doubleRule")]
# strip image credit - appears as a bold heading
strip: //div[contains(@class, "articleInline")]//h6
strip_id_or_class:enlargeThis
strip_id_or_class:pageLinks
strip_id_or_class:memberTools
strip_id_or_class:articleExtras
strip_id_or_class:singleAd
strip_id_or_class:byline
strip_id_or_class:dateline
strip_id_or_class:articleheadline
strip_id_or_class:articleBottomExtra
strip://a[contains(@href, 'nytimes.com/adx/')]
strip: //nyt_byline
strip: //span[contains(@class, 'slideshow') or contains(@class, 'video')]
strip: //p[@class='caption']//a[contains(., 'More Photos')]

prune: no
tidy: no

date: substring-after(//*[contains(@class, 'dateline')], 'Published:')

single_page_link: //link[contains(@href, 'pagewanted=all')]
#single_page_link: //a[contains(@href, 'pagewanted=all') and not(contains(@href, 'login'))]

strip://ul[@id = 'toolsList']
strip://h6[@class = 'kicker']
author:substring-after(//h6[@class='byline'],'By ')

test_url: http://www.nytimes.com/2011/07/24/books/review/an-academic-authors-unintentional-masterpiece.html
test_url: http://www.nytimes.com/2012/06/10/arts/television/the-newsroom-aaron-sorkins-return-to-tv.html