Write config files
==================

wallabag can use site-specific config files to parse website articles.
These files are stored in the
`inc/3rdparty/site_config/standard <https://github.com/wallabag/wallabag/tree/master/inc/3rdparty/site_config/standard>`__
folder.

Each file uses `XPath <http://www.w3.org/TR/xpath20/>`__ expressions to
select the parts of a page to keep. The existing files in that folder
are good examples to start from.

Automatic config file generation
--------------------------------

Fivefilters has created a `very useful
tool <http://siteconfig.fivefilters.org/>`__ to create config files.
Enter the address of an article, then select the area containing the
content you want.

.. figure:: https://lut.im/RNaO7gGe/l9vRnO1b
   :alt: siteconfig

   siteconfig

| Confirm this area by trying it with other articles.
| When you have the right area, click *Download Full-Text RSS
  site config* to download your file.

Manual config file generation
-----------------------------

If the Fivefilters tool doesn't work correctly, look at the page source
(Ctrl + U in Firefox and Chromium). Search for your content and note the
``class`` or ``id`` attribute of the element that contains it.

Once you have the id or class, you can write, for example, one of these
lines:

::

    body: //div[@class='myclass']
    body: //div[@id='myid']

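Note that ``@class='myclass'`` only matches when the attribute value is
exactly ``myclass``. If the element carries several classes (for example
``class="myclass wide"``), a ``contains()`` test is a common workaround.
A minimal sketch, where ``myclass`` is a placeholder name:

::

    # Loose match: also matches class names such as 'myclass-footer'
    body: //div[contains(@class, 'myclass')]
    # Stricter match: 'myclass' as a whole word in the class list
    body: //div[contains(concat(' ', normalize-space(@class), ' '), ' myclass ')]

Only one of the two ``body`` lines is needed. The stricter form is
longer but avoids picking up unrelated elements.
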
Then test your file. If you get the right content but want to strip
unnecessary parts, add:

::

    strip: //div[@class='hidden']

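Put together, a complete file might look like the sketch below. The
hostname, URL, and class names are placeholders. ``title``,
``strip_id_or_class`` and ``test_url`` are directives from the
Fivefilters site config format that are not covered on this page, so
check them against the Fivefilters documentation before relying on them.

::

    # Hypothetical example.com.txt (all names below are placeholders)
    title: //h1
    body: //div[@id='myid']
    strip: //div[@class='hidden']
    strip_id_or_class: sidebar
    test_url: http://example.com/some-article
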
Other options for site config files are described in the
`Fivefilters documentation <http://help.fivefilters.org/customer/portal/articles/223153-site-patterns>`__.