author | Andy Lu <luandy64@gmail.com> | 2021-02-21 21:09:11 -0600 |
---|---|---|
committer | Andy Lu <luandy64@gmail.com> | 2021-02-21 21:09:11 -0600 |
commit | 55ac5a86eae5ef9079e3312fd589f1bda20e43f2 (patch) | |
tree | f12426516309b481c44282c1d5c0741b35a05676 /docs/streams.html | |
parent | 0a0f2e89de6cde25ba6ef104c64e30f92091e007 (diff) | |
Add files from pycco
Diffstat (limited to 'docs/streams.html')
-rw-r--r-- | docs/streams.html | 185 |
1 files changed, 185 insertions, 0 deletions
diff --git a/docs/streams.html b/docs/streams.html new file mode 100644 index 0000000..8c9b6d9 --- /dev/null +++ b/docs/streams.html | |||
@@ -0,0 +1,185 @@ | |||
1 | <!DOCTYPE html> | ||
2 | <html> | ||
3 | <head> | ||
4 | <meta http-equiv="content-type" content="text/html;charset=utf-8"> | ||
5 | <title>streams.py</title> | ||
6 | <link rel="stylesheet" href="pycco.css"> | ||
7 | </head> | ||
8 | <body> | ||
9 | <div id='container'> | ||
10 | <div id="background"></div> | ||
11 | <div class='section'> | ||
12 | <div class='docs'><h1>streams.py</h1></div> | ||
13 | </div> | ||
14 | <div class='clearall'> | ||
15 | <div class='section' id='section-0'> | ||
16 | <div class='docs'> | ||
17 | <div class='octowrap'> | ||
18 | <a class='octothorpe' href='#section-0'>#</a> | ||
19 | </div> | ||
20 | <p><code>streams.py:STREAMS</code> is an <code>OrderedDict</code>, only because we want to loop over it in the same order | |
21 | every time.</p> | ||
22 | <p>It’s still the same global variable found in taps of this style. It maps stream names to a | ||
23 | dictionary describing the stream.</p> | ||
24 | <p>Some notable things we learn in this file:</p> | ||
25 | <ul> | ||
26 | <li> | ||
27 | <p><code>api</code> is either <code>"files"</code> or <code>"sheets"</code></p> | ||
28 | </li> | ||
29 | <li> | ||
30 | <p>We saw this used in <code>client.py:GoogleClient.request()</code> to switch the base URL of the request</p> | |
31 | </li> | ||
32 | <li> | ||
33 | <p><code>"file_metadata"</code> is the only incremental stream</p> | ||
34 | </li> | ||
35 | <li> | ||
36 | <p>Full table streams include:</p> | ||
37 | </li> | ||
38 | <li><code>"spreadsheet_metadata"</code></li> | ||
39 | <li><code>"sheet_metadata"</code></li> | ||
40 | <li> | ||
41 | <p><code>"sheets_loaded"</code></p> | ||
42 | </li> | ||
43 | <li> | ||
44 | <p><code>"sheets_loaded"</code> is the only stream with a <code>"data_key"</code></p> | ||
45 | </li> | ||
46 | <li><code>data_key</code> is typically the name of the key that holds the data in “envelope” responses</li> | |
47 | </ul> | ||
48 | </div> | ||
49 | <div class='code'> | ||
50 | <div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">OrderedDict</span></pre></div> | ||
51 | </div> | ||
52 | </div> | ||
53 | <div class='clearall'></div> | ||
54 | <div class='section' id='section-1'> | ||
55 | <div class='docs'> | ||
56 | <div class='octowrap'> | ||
57 | <a class='octothorpe' href='#section-1'>#</a> | ||
58 | </div> | ||
59 | <p>streams: API URL endpoints to be called | ||
60 | properties: | ||
61 | <root node>: Plural stream name for the endpoint | ||
62 | path: API endpoint relative path, when added to the base URL, creates the full path, | ||
63 | default = stream_name | ||
64 | key_properties: Primary key fields for identifying an endpoint record. | ||
65 | replication_method: INCREMENTAL or FULL_TABLE | ||
66 | replication_keys: bookmark_field(s), typically a date-time, used for filtering the results | ||
67 | and setting the state | ||
68 | params: Query, sort, and other endpoint specific parameters; default = {} | ||
69 | data_key: JSON element containing the results list for the endpoint; | ||
70 | default = root (no data_key)</p> | ||
71 | </div> | ||
72 | <div class='code'> | ||
73 | <div class="highlight"><pre></pre></div> | ||
74 | </div> | ||
75 | </div> | ||
76 | <div class='clearall'></div> | ||
77 | <div class='section' id='section-2'> | ||
78 | <div class='docs'> | ||
79 | <div class='octowrap'> | ||
80 | <a class='octothorpe' href='#section-2'>#</a> | ||
81 | </div> | ||
82 | <p>file_metadata: Queries the Google Drive API to get file information and see whether the file has been modified. | |
83 | Provides audit info about who last changed the file and when.</p> | |
84 | </div> | ||
85 | <div class='code'> | ||
86 | <div class="highlight"><pre><span class="n">FILE_METADATA</span> <span class="o">=</span> <span class="p">{</span> | ||
87 | <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"files"</span><span class="p">,</span> | ||
88 | <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"files/</span><span class="si">{spreadsheet_id}</span><span class="s2">"</span><span class="p">,</span> | ||
89 | <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"id"</span><span class="p">],</span> | ||
90 | <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"INCREMENTAL"</span><span class="p">,</span> | ||
91 | <span class="s2">"replication_keys"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"modifiedTime"</span><span class="p">],</span> | ||
92 | <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span> | ||
93 | <span class="s2">"fields"</span><span class="p">:</span> <span class="s2">"id,name,createdTime,modifiedTime,version,teamDriveId,driveId,lastModifyingUser"</span> | ||
94 | <span class="p">}</span> | ||
95 | <span class="p">}</span></pre></div> | ||
96 | </div> | ||
97 | </div> | ||
98 | <div class='clearall'></div> | ||
99 | <div class='section' id='section-3'> | ||
100 | <div class='docs'> | ||
101 | <div class='octowrap'> | ||
102 | <a class='octothorpe' href='#section-3'>#</a> | ||
103 | </div> | ||
104 | <p>spreadsheet_metadata: Queries the spreadsheet to get basic information on the spreadsheet and its sheets</p> | |
105 | </div> | ||
106 | <div class='code'> | ||
107 | <div class="highlight"><pre><span class="n">SPREADSHEET_METADATA</span> <span class="o">=</span> <span class="p">{</span> | ||
108 | <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"sheets"</span><span class="p">,</span> | ||
109 | <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"spreadsheets/</span><span class="si">{spreadsheet_id}</span><span class="s2">"</span><span class="p">,</span> | ||
110 | <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"spreadsheetId"</span><span class="p">],</span> | ||
111 | <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"FULL_TABLE"</span><span class="p">,</span> | ||
112 | <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span> | ||
113 | <span class="s2">"includeGridData"</span><span class="p">:</span> <span class="s2">"false"</span> | ||
114 | <span class="p">}</span> | ||
115 | <span class="p">}</span></pre></div> | ||
116 | </div> | ||
117 | </div> | ||
118 | <div class='clearall'></div> | ||
119 | <div class='section' id='section-4'> | ||
120 | <div class='docs'> | ||
121 | <div class='octowrap'> | ||
122 | <a class='octothorpe' href='#section-4'>#</a> | ||
123 | </div> | ||
124 | <p>sheet_metadata: Get Header Row and 1st data row (Rows 1 & 2) from a Sheet on Spreadsheet. | ||
125 | This endpoint includes detailed metadata about each cell in the header and first data row | ||
126 | incl. data type, formatting, etc.</p> | ||
127 | </div> | ||
128 | <div class='code'> | ||
129 | <div class="highlight"><pre><span class="n">SHEET_METADATA</span> <span class="o">=</span> <span class="p">{</span> | ||
130 | <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"sheets"</span><span class="p">,</span> | ||
131 | <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"spreadsheets/</span><span class="si">{spreadsheet_id}</span><span class="s2">"</span><span class="p">,</span> | ||
132 | <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"sheetId"</span><span class="p">],</span> | ||
133 | <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"FULL_TABLE"</span><span class="p">,</span> | ||
134 | <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span> | ||
135 | <span class="s2">"includeGridData"</span><span class="p">:</span> <span class="s2">"true"</span><span class="p">,</span> | ||
136 | <span class="s2">"ranges"</span><span class="p">:</span> <span class="s2">"'</span><span class="si">{sheet_title}</span><span class="s2">'!1:2"</span> | ||
137 | <span class="p">}</span> | ||
138 | <span class="p">}</span></pre></div> | ||
139 | </div> | ||
140 | </div> | ||
141 | <div class='clearall'></div> | ||
142 | <div class='section' id='section-5'> | ||
143 | <div class='docs'> | ||
144 | <div class='octowrap'> | ||
145 | <a class='octothorpe' href='#section-5'>#</a> | ||
146 | </div> | ||
147 | <p>sheets_loaded: Queries a batch of rows for each sheet in the spreadsheet. | |
148 | Each query uses the <code>values</code> endpoint to get data only, without the formatting/type metadata.</p> | |
149 | </div> | ||
150 | <div class='code'> | ||
151 | <div class="highlight"><pre><span class="n">SHEETS_LOADED</span> <span class="o">=</span> <span class="p">{</span> | ||
152 | <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"sheets"</span><span class="p">,</span> | ||
153 | <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"spreadsheets/</span><span class="si">{spreadsheet_id}</span><span class="s2">/values/'</span><span class="si">{sheet_title}</span><span class="s2">'!</span><span class="si">{range_rows}</span><span class="s2">"</span><span class="p">,</span> | ||
154 | <span class="s2">"data_key"</span><span class="p">:</span> <span class="s2">"values"</span><span class="p">,</span> | ||
155 | <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"spreadsheetId"</span><span class="p">,</span> <span class="s2">"sheetId"</span><span class="p">,</span> <span class="s2">"loadDate"</span><span class="p">],</span> | ||
156 | <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"FULL_TABLE"</span><span class="p">,</span> | ||
157 | <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span> | ||
158 | <span class="s2">"dateTimeRenderOption"</span><span class="p">:</span> <span class="s2">"SERIAL_NUMBER"</span><span class="p">,</span> | ||
159 | <span class="s2">"valueRenderOption"</span><span class="p">:</span> <span class="s2">"UNFORMATTED_VALUE"</span><span class="p">,</span> | ||
160 | <span class="s2">"majorDimension"</span><span class="p">:</span> <span class="s2">"ROWS"</span> | ||
161 | <span class="p">}</span> | ||
162 | <span class="p">}</span></pre></div> | ||
163 | </div> | ||
164 | </div> | ||
165 | <div class='clearall'></div> | ||
166 | <div class='section' id='section-6'> | ||
167 | <div class='docs'> | ||
168 | <div class='octowrap'> | ||
169 | <a class='octothorpe' href='#section-6'>#</a> | ||
170 | </div> | ||
171 | <p>Ensure streams are ordered sequentially, logically.</p> | ||
172 | </div> | ||
173 | <div class='code'> | ||
174 | <div class="highlight"><pre><span class="n">STREAMS</span> <span class="o">=</span> <span class="n">OrderedDict</span><span class="p">()</span> | ||
175 | <span class="n">STREAMS</span><span class="p">[</span><span class="s1">'file_metadata'</span><span class="p">]</span> <span class="o">=</span> <span class="n">FILE_METADATA</span> | ||
176 | <span class="n">STREAMS</span><span class="p">[</span><span class="s1">'spreadsheet_metadata'</span><span class="p">]</span> <span class="o">=</span> <span class="n">SPREADSHEET_METADATA</span> | ||
177 | <span class="n">STREAMS</span><span class="p">[</span><span class="s1">'sheet_metadata'</span><span class="p">]</span> <span class="o">=</span> <span class="n">SHEET_METADATA</span> | ||
178 | <span class="n">STREAMS</span><span class="p">[</span><span class="s1">'sheets_loaded'</span><span class="p">]</span> <span class="o">=</span> <span class="n">SHEETS_LOADED</span> | ||
179 | |||
180 | </pre></div> | ||
181 | </div> | ||
182 | </div> | ||
183 | <div class='clearall'></div> | ||
184 | </div> | ||
185 | </body> | ||
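
To make the stream definitions above more concrete, here is a minimal sketch of how a consumer might walk `STREAMS` in order, fill in a `path` template, and unwrap a response using `data_key`. The stream dictionaries are trimmed copies of the ones in the diff. The helpers `build_path` and `extract_records` are hypothetical names introduced only for illustration; they are not taken from the tap's code, and the sample response is made up.

```python
from collections import OrderedDict

# Trimmed copies of two stream definitions from streams.py.
FILE_METADATA = {
    "api": "files",
    "path": "files/{spreadsheet_id}",
    "replication_method": "INCREMENTAL",
    "replication_keys": ["modifiedTime"],
}
SHEETS_LOADED = {
    "api": "sheets",
    "path": "spreadsheets/{spreadsheet_id}/values/'{sheet_title}'!{range_rows}",
    "data_key": "values",
    "replication_method": "FULL_TABLE",
}

# Insertion order is the iteration order, as in the original file.
STREAMS = OrderedDict()
STREAMS["file_metadata"] = FILE_METADATA
STREAMS["sheets_loaded"] = SHEETS_LOADED


def build_path(stream, **values):
    """Hypothetical helper: fill the placeholders in a stream's path template.

    Extra keyword arguments that the template does not use are ignored.
    """
    return stream["path"].format(**values)


def extract_records(stream, response):
    """Hypothetical helper: unwrap an "envelope" response when data_key is set.

    Streams without a data_key return the response unchanged (the root).
    """
    data_key = stream.get("data_key")
    if data_key is None:
        return response
    return response.get(data_key, [])


if __name__ == "__main__":
    for name, stream in STREAMS.items():
        path = build_path(
            stream,
            spreadsheet_id="abc123",   # made-up ID
            sheet_title="Sheet1",      # made-up sheet title
            range_rows="A2:C10",       # made-up range
        )
        print(name, stream["api"], path)
```

Keeping the unwrap logic keyed on the presence of `data_key` means streams that return a bare object (such as `file_metadata`) and streams that return an envelope (such as `sheets_loaded`) can share one code path.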