Write config files
==================

wallabag can use site-specific config files to parse website articles.
These files are stored in the
`inc/3rdparty/site_config/standard <https://github.com/wallabag/wallabag/tree/master/inc/3rdparty/site_config/standard>`__
folder.

The rules in these files are written as
`XPath <http://www.w3.org/TR/xpath20/>`__ expressions. The existing files
in that folder are good examples to start from.

Automatic config files generation
---------------------------------

Fivefilters has created a `very useful
tool <http://siteconfig.fivefilters.org/>`__ to create config files. Just
type in the address of the article you want to work on, then select the
area containing the content you want.

.. figure:: https://lut.im/RNaO7gGe/l9vRnO1b
   :alt: siteconfig

   siteconfig

| Confirm this area by trying it with other articles.
| Once you have the right area, click *Download Full-Text RSS
  site config* to download your file.

Manual config file generation
-----------------------------

If the Fivefilters tool doesn't work correctly, look at the page source
(Ctrl + U in Firefox and Chromium). Search for your content and note the
``class`` or ``id`` attribute of the element containing it.

Once you have the id or class, you can write, for example, one of these
lines:

::

    body: //div[@class='myclass']
    body: //div[@id='myid']
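
Note that ``@class='myclass'`` only matches when the attribute value is
exactly ``myclass``. If the element carries several classes, a
``contains()`` test may work better. The lines below are a sketch using
the same placeholder class name; pick one or the other. The ``#`` lines
are explanatory comments.

::

    # also matches class="post myclass wide", but class="myclass-footer" too
    body: //div[contains(@class, 'myclass')]

    # stricter: matches myclass only as a whole, space-separated class name
    body: //div[contains(concat(' ', normalize-space(@class), ' '), ' myclass ')]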

Then test your file. If you get the right content but want to strip
unnecessary parts, add:

::

    strip: //div[@class='hidden']

You can look at other options for siteconfig files
`here <http://help.fivefilters.org/customer/portal/articles/223153-site-patterns>`__.
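
Putting the pieces together, a complete file might look like the sketch
below. The ``title`` and ``test_url`` directives come from the Fivefilters
site pattern documentation linked above rather than from this page. The
site name, XPath values and URL are hypothetical placeholders to adapt.

::

    # hypothetical config for example.com
    title: //h1[@class='mytitle']
    body: //div[@id='myid']
    strip: //div[@class='hidden']
    test_url: http://example.com/an-article.html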