Diffstat (limited to 'docs/discover.html')
-rw-r--r-- docs/discover.html 127
1 files changed, 127 insertions, 0 deletions
diff --git a/docs/discover.html b/docs/discover.html
new file mode 100644
index 0000000..aecc2b6
--- /dev/null
+++ b/docs/discover.html
@@ -0,0 +1,127 @@
1<!DOCTYPE html>
2<html>
3<head>
4 <meta http-equiv="content-type" content="text/html;charset=utf-8">
5 <title>discover.py</title>
6 <link rel="stylesheet" href="pycco.css">
7</head>
8<body>
9<div id='container'>
10 <div id="background"></div>
11 <div class='section'>
12 <div class='docs'><h1>discover.py</h1></div>
13 </div>
14 <div class='clearall'>
15 <div class='section' id='section-0'>
16 <div class='docs'>
17 <div class='octowrap'>
18 <a class='octothorpe' href='#section-0'>#</a>
19 </div>
20
21 </div>
22 <div class='code'>
23 <div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">singer.catalog</span> <span class="kn">import</span> <span class="n">Catalog</span><span class="p">,</span> <span class="n">CatalogEntry</span><span class="p">,</span> <span class="n">Schema</span>
24<span class="kn">from</span> <span class="nn">tap_google_sheets.schema</span> <span class="kn">import</span> <span class="n">get_schemas</span><span class="p">,</span> <span class="n">STREAMS</span></pre></div>
25 </div>
26 </div>
27 <div class='clearall'></div>
28 <div class='section' id='section-1'>
29 <div class='docs'>
30 <div class='octowrap'>
31 <a class='octothorpe' href='#section-1'>#</a>
32 </div>
33 <p>Construct a Catalog Entry for each stream</p>
34<p>Inputs:</p>
35<ul>
36<li>client: A <code>GoogleClient</code> object</li>
37<li>spreadsheet_id: the ID of a Google Sheets spreadsheet</li>
38</ul>
39<p>Returns:</p>
40<ul>
41<li>A singer.Catalog object</li>
42</ul>
43 </div>
44 <div class='code'>
45 <div class="highlight"><pre><span class="k">def</span> <span class="nf">discover</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">):</span></pre></div>
46 </div>
47 </div>
48 <div class='clearall'></div>
49 <div class='section' id='section-2'>
50 <div class='docs'>
51 <div class='octowrap'>
52 <a class='octothorpe' href='#section-2'>#</a>
53 </div>
54 <p>It&rsquo;s typical for taps in this style to call <code>schema.py:get_schemas()</code> to get <code>schemas</code> and
55<code>field_metadata</code>.</p>
56 </div>
57 <div class='code'>
58 <div class="highlight"><pre></pre></div>
59 </div>
60 </div>
61 <div class='clearall'></div>
62 <div class='section' id='section-3'>
63 <div class='docs'>
64 <div class='octowrap'>
65 <a class='octothorpe' href='#section-3'>#</a>
66 </div>
67 <p>Here <code>schemas</code> is a dictionary of stream name to JSON schema, and <code>field_metadata</code> is a dictionary
68of stream name to a list of metadata entries, each a dictionary with a <code>metadata</code> key. In this tap, the only value
69that <code>discover.py:discover()</code> pulls out of <code>field_metadata</code> is <code>table-key-properties</code>, and only when an entry provides it.</p>
70 </div>
71 <div class='code'>
72 <div class="highlight"><pre></pre></div>
73 </div>
74 </div>
75 <div class='clearall'></div>
76 <div class='section' id='section-4'>
77 <div class='docs'>
78 <div class='octowrap'>
79 <a class='octothorpe' href='#section-4'>#</a>
80 </div>
81 <ul>
82<li>This can be confusing because <code>table-key-properties</code> is stream-level (table-level)
83metadata, which you might not expect to find in a variable named <code>field_metadata</code>.</li>
84</ul>
85 </div>
86 <div class='code'>
87 <div class="highlight"><pre> <span class="n">schemas</span><span class="p">,</span> <span class="n">field_metadata</span> <span class="o">=</span> <span class="n">get_schemas</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">)</span>
88 <span class="n">catalog</span> <span class="o">=</span> <span class="n">Catalog</span><span class="p">([])</span>
89
90 <span class="k">for</span> <span class="n">stream_name</span><span class="p">,</span> <span class="n">schema_dict</span> <span class="ow">in</span> <span class="n">schemas</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
91 <span class="n">schema</span> <span class="o">=</span> <span class="n">Schema</span><span class="o">.</span><span class="n">from_dict</span><span class="p">(</span><span class="n">schema_dict</span><span class="p">)</span>
92 <span class="n">mdata</span> <span class="o">=</span> <span class="n">field_metadata</span><span class="p">[</span><span class="n">stream_name</span><span class="p">]</span>
93 <span class="n">key_properties</span> <span class="o">=</span> <span class="kc">None</span>
94 <span class="k">for</span> <span class="n">mdt</span> <span class="ow">in</span> <span class="n">mdata</span><span class="p">:</span>
95 <span class="n">table_key_properties</span> <span class="o">=</span> <span class="n">mdt</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;metadata&#39;</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;table-key-properties&#39;</span><span class="p">)</span>
96 <span class="k">if</span> <span class="n">table_key_properties</span><span class="p">:</span>
97 <span class="n">key_properties</span> <span class="o">=</span> <span class="n">table_key_properties</span></pre></div>
98 </div>
99 </div>
100 <div class='clearall'></div>
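<p>The loop above can be tried out without a Google client. The following is a minimal sketch, assuming <code>field_metadata</code> maps each stream name to a list of entries shaped like <code>{"breadcrumb": [...], "metadata": {...}}</code>. The helper name <code>find_table_key_properties</code> and the sample streams, columns, and key are hypothetical and do not come from the tap.</p>

```python
def find_table_key_properties(mdata):
    """Return the last truthy 'table-key-properties' value in mdata, or None."""
    key_properties = None
    for entry in mdata:
        # Entries without a 'metadata' key are treated as empty.
        table_key_properties = entry.get("metadata", {}).get("table-key-properties")
        if table_key_properties:
            key_properties = table_key_properties
    return key_properties


# Hypothetical sample data, for illustration only.
field_metadata = {
    "Sheet1": [
        {"breadcrumb": [], "metadata": {"table-key-properties": ["row_id"]}},
        {"breadcrumb": ["properties", "row_id"], "metadata": {"inclusion": "automatic"}},
        {"breadcrumb": ["properties", "name"], "metadata": {"inclusion": "available"}},
    ],
    "NoKeys": [
        {"breadcrumb": [], "metadata": {}},
    ],
}

sheet1_keys = find_table_key_properties(field_metadata["Sheet1"])
no_keys = find_table_key_properties(field_metadata["NoKeys"])
```

<p>Because the loop does not stop at the first match, the last entry carrying a non-empty <code>table-key-properties</code> value would win if several entries had one.</p>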
101 <div class='section' id='section-5'>
102 <div class='docs'>
103 <div class='octowrap'>
104 <a class='octothorpe' href='#section-5'>#</a>
105 </div>
106 <p>Once we have the <code>stream_name</code>, the value of <code>table-key-properties</code>, the schema, and the
107metadata for a stream, we pass all of that to the <code>singer.CatalogEntry</code> constructor
108and append the resulting entry to the <code>singer.Catalog</code> object initialized at the start of
109<code>discover.py:discover()</code>.</p>
110 </div>
111 <div class='code'>
112 <div class="highlight"><pre> <span class="n">catalog</span><span class="o">.</span><span class="n">streams</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">CatalogEntry</span><span class="p">(</span>
113 <span class="n">stream</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
114 <span class="n">tap_stream_id</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
115 <span class="n">key_properties</span><span class="o">=</span><span class="n">STREAMS</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">&#39;key_properties&#39;</span><span class="p">,</span> <span class="n">key_properties</span><span class="p">),</span>
116 <span class="n">schema</span><span class="o">=</span><span class="n">schema</span><span class="p">,</span>
117 <span class="n">metadata</span><span class="o">=</span><span class="n">mdata</span>
118 <span class="p">))</span>
119
120 <span class="k">return</span> <span class="n">catalog</span>
121
122</pre></div>
123 </div>
124 </div>
125 <div class='clearall'></div>
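<p>The <code>key_properties</code> argument in the call above looks in <code>STREAMS</code> first and falls back to the value found in the metadata. Here is a small sketch of that lookup using plain dictionaries. The contents of the real <code>STREAMS</code> are not shown on this page, so <code>example_streams</code>, its stream names, and the helper <code>resolve_key_properties</code> are hypothetical stand-ins.</p>

```python
def resolve_key_properties(stream_name, streams, discovered_key_properties):
    """Prefer a 'key_properties' value declared in streams; otherwise use the discovered one."""
    return streams.get(stream_name, {}).get("key_properties", discovered_key_properties)


# Hypothetical stand-in for STREAMS, for illustration only.
example_streams = {
    "static_stream": {"key_properties": ["id"]},
    "explicit_none": {"key_properties": None},
}

declared = resolve_key_properties("static_stream", example_streams, ["row_id"])
fallback = resolve_key_properties("Sheet1", example_streams, ["row_id"])
stored_none = resolve_key_properties("explicit_none", example_streams, ["row_id"])
```

<p>One consequence of using <code>dict.get</code> with a default: the fallback applies only when the <code>key_properties</code> key is absent, so a stored value of <code>None</code> is returned as-is rather than being replaced by the discovered value.</p>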
126</div>
127</body>