Write config files
==================

wallabag can use site-specific config files to parse website articles.
These files are stored in the
`inc/3rdparty/site_config/standard <https://github.com/wallabag/wallabag/tree/master/inc/3rdparty/site_config/standard>`__
folder.

These files select content with
`XPath <http://www.w3.org/TR/xpath20/>`__ expressions. Look at the
existing files in that folder for examples.

Automatic config file generation
--------------------------------

Fivefilters has created a `very useful
tool <http://siteconfig.fivefilters.org/>`__ to create config files.
Type in the address of the article you want to work on, then select
the area containing the content you want.

.. figure:: https://lut.im/RNaO7gGe/l9vRnO1b
   :alt: siteconfig

   siteconfig

| Confirm this area by trying it with other articles.
| Once you have the right area, click *Download Full-Text RSS
  site config* to download your file.

Manual config file generation
-----------------------------

If the Fivefilters tool doesn't work correctly, look at the page source
(Ctrl + U in Firefox and Chromium). Search for your content and note the
``class`` or ``id`` attribute of the element containing it.

Once you have the id or class, you can write, for example, one of
these lines:

::

    body: //div[@class='myclass']
    body: //div[@id='myid']

Then test your file. If you get the right content but want to strip
unnecessary parts, add:

::

    strip: //div[@class='hidden']

You can look at other options for siteconfig files
`here <http://help.fivefilters.org/customer/portal/articles/223153-site-patterns>`__.
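To get a feel for what the ``body`` and ``strip`` expressions above
select, here is a minimal sketch using Python's standard
``xml.etree.ElementTree``. It is only an illustration and not how
wallabag itself parses pages: ElementTree supports a small subset of
XPath and needs well-formed markup, and the ``PAGE`` sample and the
``extract`` helper are hypothetical names made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed article markup used only for this illustration.
PAGE = """
<html><body>
  <div id="nav">Menu</div>
  <div class="myclass">
    <p>Article text.</p>
    <div class="hidden">Share buttons</div>
  </div>
</body></html>
"""


def extract(page, body_xpath, strip_xpaths=()):
    """Return the text of the first node matching body_xpath, minus stripped nodes."""
    root = ET.fromstring(page)
    # ElementTree expects relative paths, so "//div[...]" becomes ".//div[...]".
    body = root.find("." + body_xpath)
    if body is None:
        return None
    for xpath in strip_xpaths:
        # ElementTree has no parent pointer, so build a child -> parent map.
        parents = {child: parent for parent in body.iter() for child in parent}
        for node in body.findall("." + xpath):
            parents[node].remove(node)
    # Collapse whitespace so the result is easy to compare by eye.
    return " ".join("".join(body.itertext()).split())


text = extract(PAGE, "//div[@class='myclass']", ["//div[@class='hidden']"])
print(text)
```

If ``extract`` returns ``None``, the ``body`` expression matched nothing
in the sample, which usually means the ``class`` or ``id`` value was
copied incorrectly.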