| Commit | Line | Data |
|---|---|---|
| ac4d1142 NL | 1 | author: substring-before(substring-after(//div[@class='post-byline'], 'By '), ', on') |
| | 2 | date: substring-after(//div[@class='post-byline'], ', on') |
| | 3 | |
| | 4 | # for some reason, the following is producing a "no text [48]" error |
| | 5 | #title: //div[@class='post-headline'] |
| | 6 | |
| | 7 | # for some reason, the following doesn't appear to isolate just the body copy |
| | 8 | body: //div[@class='post-bodycopy'] |
| | 9 | |
| | 10 | # we solve the above issue by stripping out everything else we don't want |
| | 11 | # these can probably all be removed if the body: command above worked |
| | 12 | strip_id_or_class: reply |
| | 13 | strip_id_or_class: left |
| | 14 | strip_id_or_class: post-headline |
| | 15 | strip_id_or_class: post-byline |
| | 16 | strip_id_or_class: footer |
| | 17 | test_url: http://www.uni-watch.com/2011/10/18/the-curious-case-of-steve-debergs-microphone-and-speaker/ |
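
The `author:` and `date:` rules above rely on the XPath 1.0 string functions `substring-before` and `substring-after`. The following is a minimal Python sketch of how those two expressions split a byline. The sample byline text is hypothetical (the real page markup is not shown here), and the helper functions only mimic the XPath semantics; they are not part of any parser.

```python
def substring_before(s: str, t: str) -> str:
    """Mimic XPath 1.0 substring-before: text before the first t, '' if t is absent."""
    if t == "":
        return ""
    head, sep, _ = s.partition(t)
    return head if sep else ""


def substring_after(s: str, t: str) -> str:
    """Mimic XPath 1.0 substring-after: text after the first t, '' if t is absent."""
    if t == "":
        return s
    _, sep, tail = s.partition(t)
    return tail if sep else ""


# Hypothetical byline text, assumed to follow the "By <name>, on <date>" pattern.
byline = "By Jane Doe, on October 18th, 2011"

# Equivalent of the author: rule.
author = substring_before(substring_after(byline, "By "), ", on")

# Equivalent of the date: rule. The delimiter ', on' has no trailing space,
# so the result keeps a leading space.
date = substring_after(byline, ", on")

print(repr(author))
print(repr(date))
```

Under this assumed byline format, the date value starts with a space because the delimiter is `', on'` rather than `', on '`. Whether that matters depends on how the consuming parser trims and interprets the date string.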