--- /dev/null
+<!DOCTYPE html>
+<html>
+<head>
+ <meta http-equiv="content-type" content="text/html;charset=utf-8">
+ <title>__init__.py</title>
+ <link rel="stylesheet" href="pycco.css">
+</head>
+<body>
+<div id='container'>
+ <div id="background"></div>
+ <div class='section'>
+ <div class='docs'><h1>__init__.py</h1></div>
+ </div>
+ <div class='clearall'>
+ <div class='section' id='section-0'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-0'>#</a>
+ </div>
+ <p>This project syncs data from the v4 Google Sheets API.</p>
+<h1>Discovery Mode</h1>
+<p>There are four static streams (<code>"file_metadata"</code>, <code>"spreadsheet_metadata"</code>, <code>"sheet_metadata"</code>,
+<code>"sheets_loaded"</code>) and any number of dynamic streams: one dynamic stream per sheet in the
+single Google Sheets document named by <code>spreadsheet_id</code>.</p>
+<h1>Sync Mode</h1>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">sys</span>
+<span class="kn">import</span> <span class="nn">json</span>
+<span class="kn">import</span> <span class="nn">argparse</span> <span class="c1"># unused import</span>
+<span class="kn">import</span> <span class="nn">singer</span>
+<span class="kn">from</span> <span class="nn">singer</span> <span class="kn">import</span> <span class="n">metadata</span><span class="p">,</span> <span class="n">utils</span>
+<span class="kn">from</span> <span class="nn">tap_google_sheets.client</span> <span class="kn">import</span> <span class="n">GoogleClient</span>
+<span class="kn">from</span> <span class="nn">tap_google_sheets.discover</span> <span class="kn">import</span> <span class="n">discover</span>
+<span class="kn">from</span> <span class="nn">tap_google_sheets.sync</span> <span class="kn">import</span> <span class="n">sync</span>
+
+<span class="n">LOGGER</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">get_logger</span><span class="p">()</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-1'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-1'>#</a>
+ </div>
+ <h1>Configuration</h1>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-2'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-2'>#</a>
+ </div>
+ <p>This is a typical OAuth2 tap, so the config file must contain the following keys.</p>
+<ul>
+<li>
+<p>OAuth Related:</p>
+<ul>
+<li><code>client_id</code></li>
+<li><code>client_secret</code></li>
+<li><code>refresh_token</code></li>
+</ul>
+</li>
+<li>
+<p>Tap related:</p>
+<ul>
+<li><code>spreadsheet_id</code></li>
+<li><code>start_date</code></li>
+<li><code>user_agent</code></li>
+</ul>
+</li>
+</ul>
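+<p>As a rough illustration of the keys above, the sketch below checks a config dictionary for missing entries. All values are placeholders (the <code>start_date</code> format in particular is an assumption), and <code>missing_config_keys</code> is a hypothetical helper written for this example; the tap itself passes <code>REQUIRED_CONFIG_KEYS</code> to <code>singer.utils.parse_args</code>.</p>

```python
REQUIRED_CONFIG_KEYS = [
    'client_id',
    'client_secret',
    'refresh_token',
    'spreadsheet_id',
    'start_date',
    'user_agent',
]

# Placeholder values only; real credentials come from your own OAuth2 setup.
EXAMPLE_CONFIG = {
    'client_id': 'YOUR_CLIENT_ID',
    'client_secret': 'YOUR_CLIENT_SECRET',
    'refresh_token': 'YOUR_REFRESH_TOKEN',
    'spreadsheet_id': 'YOUR_SPREADSHEET_ID',
    'start_date': '2019-01-01T00:00:00Z',  # assumed ISO 8601 timestamp
    'user_agent': 'tap-google-sheets <you@example.com>',
}


def missing_config_keys(config, required=REQUIRED_CONFIG_KEYS):
    """Return the required keys that are absent from config, in order."""
    return [key for key in required if key not in config]
```

+<p>An empty list from <code>missing_config_keys(EXAMPLE_CONFIG)</code> means every required key is present.</p>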
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="n">REQUIRED_CONFIG_KEYS</span> <span class="o">=</span> <span class="p">[</span>
+ <span class="s1">'client_id'</span><span class="p">,</span>
+ <span class="s1">'client_secret'</span><span class="p">,</span>
+ <span class="s1">'refresh_token'</span><span class="p">,</span>
+ <span class="s1">'spreadsheet_id'</span><span class="p">,</span>
+ <span class="s1">'start_date'</span><span class="p">,</span>
+ <span class="s1">'user_agent'</span>
+<span class="p">]</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-3'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-3'>#</a>
+ </div>
+ <h1>Discovery Mode</h1>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-4'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-4'>#</a>
+ </div>
+ <p>Creates a Singer Catalog and writes it to STDOUT</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">do_discover</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-5'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-5'>#</a>
+ </div>
+ <p>Inputs:</p>
+<ul>
+<li><code>client</code>: An instance of the GoogleClient class</li>
+<li><code>spreadsheet_id</code>: The id of the Google Sheet</li>
+</ul>
+<p>Returns:</p>
+<ul>
+<li>None</li>
+</ul>
+<p>Side Effects:</p>
+<ul>
+<li>Writes to STDOUT</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Starting discover'</span><span class="p">)</span>
+ <span class="n">catalog</span> <span class="o">=</span> <span class="n">discover</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">)</span>
+ <span class="n">json</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">catalog</span><span class="o">.</span><span class="n">to_dict</span><span class="p">(),</span> <span class="n">sys</span><span class="o">.</span><span class="n">stdout</span><span class="p">,</span> <span class="n">indent</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Finished discover'</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-6'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-6'>#</a>
+ </div>
+ <h1>Entrypoint</h1>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-7'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-7'>#</a>
+ </div>
+ <p>Read a config, then run discovery mode or sync mode</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="nd">@singer</span><span class="o">.</span><span class="n">utils</span><span class="o">.</span><span class="n">handle_top_exception</span><span class="p">(</span><span class="n">LOGGER</span><span class="p">)</span>
+<span class="k">def</span> <span class="nf">main</span><span class="p">():</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-8'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-8'>#</a>
+ </div>
+ <p>Inputs:</p>
+<ul>
+<li>None</li>
+</ul>
+<p>Returns:</p>
+<ul>
+<li>None</li>
+</ul>
+<p>Side Effects:</p>
+<ul>
+<li>Writes to STDOUT</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">parsed_args</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">utils</span><span class="o">.</span><span class="n">parse_args</span><span class="p">(</span><span class="n">REQUIRED_CONFIG_KEYS</span><span class="p">)</span>
+
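+    <span class="k">with</span> <span class="n">GoogleClient</span><span class="p">(</span><span class="n">parsed_args</span><span class="o">.</span><span class="n">config</span><span class="p">[</span><span class="s1">'client_id'</span><span class="p">],</span>
+                      <span class="n">parsed_args</span><span class="o">.</span><span class="n">config</span><span class="p">[</span><span class="s1">'client_secret'</span><span class="p">],</span>
+                      <span class="n">parsed_args</span><span class="o">.</span><span class="n">config</span><span class="p">[</span><span class="s1">'refresh_token'</span><span class="p">],</span>
+                      <span class="n">parsed_args</span><span class="o">.</span><span class="n">config</span><span class="p">[</span><span class="s1">'user_agent'</span><span class="p">])</span> <span class="k">as</span> <span class="n">client</span><span class="p">:</span>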
+
+ <span class="n">state</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="k">if</span> <span class="n">parsed_args</span><span class="o">.</span><span class="n">state</span><span class="p">:</span>
+ <span class="n">state</span> <span class="o">=</span> <span class="n">parsed_args</span><span class="o">.</span><span class="n">state</span>
+
+ <span class="n">config</span> <span class="o">=</span> <span class="n">parsed_args</span><span class="o">.</span><span class="n">config</span>
+ <span class="n">spreadsheet_id</span> <span class="o">=</span> <span class="n">config</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'spreadsheet_id'</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="n">parsed_args</span><span class="o">.</span><span class="n">discover</span><span class="p">:</span>
+ <span class="n">do_discover</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">)</span>
+ <span class="k">elif</span> <span class="n">parsed_args</span><span class="o">.</span><span class="n">catalog</span><span class="p">:</span>
+ <span class="n">sync</span><span class="p">(</span><span class="n">client</span><span class="o">=</span><span class="n">client</span><span class="p">,</span>
+ <span class="n">config</span><span class="o">=</span><span class="n">config</span><span class="p">,</span>
+ <span class="n">catalog</span><span class="o">=</span><span class="n">parsed_args</span><span class="o">.</span><span class="n">catalog</span><span class="p">,</span>
+ <span class="n">state</span><span class="o">=</span><span class="n">state</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-9'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-9'>#</a>
+ </div>
+ <p>Unused</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s1">'__main__'</span><span class="p">:</span>
+ <span class="n">main</span><span class="p">()</span>
+
+</pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+</div>
+</body>
--- /dev/null
+<!DOCTYPE html>
+<html>
+<head>
+ <meta http-equiv="content-type" content="text/html;charset=utf-8">
+ <title>client.py</title>
+ <link rel="stylesheet" href="pycco.css">
+</head>
+<body>
+<div id='container'>
+ <div id="background"></div>
+ <div class='section'>
+ <div class='docs'><h1>client.py</h1></div>
+ </div>
+ <div class='clearall'>
+ <div class='section' id='section-0'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-0'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span><span class="p">,</span> <span class="n">timedelta</span>
+<span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">OrderedDict</span>
+<span class="kn">import</span> <span class="nn">backoff</span>
+<span class="kn">import</span> <span class="nn">requests</span>
+<span class="kn">import</span> <span class="nn">singer</span>
+<span class="kn">from</span> <span class="nn">singer</span> <span class="kn">import</span> <span class="n">metrics</span>
+<span class="kn">from</span> <span class="nn">singer</span> <span class="kn">import</span> <span class="n">utils</span>
+
+<span class="n">BASE_URL</span> <span class="o">=</span> <span class="s1">'https://www.googleapis.com'</span>
+<span class="n">GOOGLE_TOKEN_URI</span> <span class="o">=</span> <span class="s1">'https://oauth2.googleapis.com/token'</span>
+<span class="n">LOGGER</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">get_logger</span><span class="p">()</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-1'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-1'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">Server5xxError</span><span class="p">(</span><span class="ne">Exception</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-2'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-2'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">Server429Error</span><span class="p">(</span><span class="ne">Exception</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-3'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-3'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleError</span><span class="p">(</span><span class="ne">Exception</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-4'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-4'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleBadRequestError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-5'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-5'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleUnauthorizedError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-6'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-6'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GooglePaymentRequiredError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-7'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-7'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleNotFoundError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-8'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-8'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleMethodNotAllowedError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-9'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-9'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleConflictError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-10'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-10'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleGoneError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-11'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-11'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GooglePreconditionFailedError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-12'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-12'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleRequestEntityTooLargeError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-13'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-13'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleRequestedRangeNotSatisfiableError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-14'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-14'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleExpectationFailedError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-15'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-15'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleForbiddenError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-16'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-16'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleUnprocessableEntityError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-17'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-17'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GooglePreconditionRequiredError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-18'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-18'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleInternalServiceError</span><span class="p">(</span><span class="n">GoogleError</span><span class="p">):</span>
+ <span class="k">pass</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-19'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-19'>#</a>
+ </div>
+ <p>Error Codes: https://developers.google.com/webmaster-tools/search-console-api-original/v3/errors</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="n">ERROR_CODE_EXCEPTION_MAPPING</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="mi">400</span><span class="p">:</span> <span class="n">GoogleBadRequestError</span><span class="p">,</span>
+ <span class="mi">401</span><span class="p">:</span> <span class="n">GoogleUnauthorizedError</span><span class="p">,</span>
+ <span class="mi">402</span><span class="p">:</span> <span class="n">GooglePaymentRequiredError</span><span class="p">,</span>
+ <span class="mi">403</span><span class="p">:</span> <span class="n">GoogleForbiddenError</span><span class="p">,</span>
+ <span class="mi">404</span><span class="p">:</span> <span class="n">GoogleNotFoundError</span><span class="p">,</span>
+ <span class="mi">405</span><span class="p">:</span> <span class="n">GoogleMethodNotAllowedError</span><span class="p">,</span>
+ <span class="mi">409</span><span class="p">:</span> <span class="n">GoogleConflictError</span><span class="p">,</span>
+ <span class="mi">410</span><span class="p">:</span> <span class="n">GoogleGoneError</span><span class="p">,</span>
+ <span class="mi">412</span><span class="p">:</span> <span class="n">GooglePreconditionFailedError</span><span class="p">,</span>
+ <span class="mi">413</span><span class="p">:</span> <span class="n">GoogleRequestEntityTooLargeError</span><span class="p">,</span>
+ <span class="mi">416</span><span class="p">:</span> <span class="n">GoogleRequestedRangeNotSatisfiableError</span><span class="p">,</span>
+ <span class="mi">417</span><span class="p">:</span> <span class="n">GoogleExpectationFailedError</span><span class="p">,</span>
+ <span class="mi">422</span><span class="p">:</span> <span class="n">GoogleUnprocessableEntityError</span><span class="p">,</span>
+ <span class="mi">428</span><span class="p">:</span> <span class="n">GooglePreconditionRequiredError</span><span class="p">,</span>
+ <span class="mi">500</span><span class="p">:</span> <span class="n">GoogleInternalServiceError</span><span class="p">}</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-20'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-20'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_exception_for_error_code</span><span class="p">(</span><span class="n">error_code</span><span class="p">):</span>
+ <span class="k">return</span> <span class="n">ERROR_CODE_EXCEPTION_MAPPING</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">error_code</span><span class="p">,</span> <span class="n">GoogleError</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-21'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-21'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-22'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-22'>#</a>
+ </div>
+ <p><code>client.py:raise_for_error()</code> calls the <code>raise_for_status()</code> method from the <code>requests</code> library
+and catches any <code>requests.HTTPError</code> or <code>requests.ConnectionError</code>. Note the name difference: <code>raise_for_error()</code> belongs to the tap, <code>raise_for_status()</code> to <code>requests</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-23'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-23'>#</a>
+ </div>
+ <h5>Thoughts</h5>
+<p>On paper there are 5 ways to leave this function, and it’s worth skimming them just to understand the
+structure. As noted below, though, I think only two of them can actually happen.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-24'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-24'>#</a>
+ </div>
+ <ol>
+<li>If the length of the response content is 0, then we just leave<ul>
+<li>I believe this results in us swallowing the <code>requests.HTTPError</code> and successfully returns to
+the calling function</li>
+<li>I believe it’s possible to leave the function this way</li>
+</ul>
+</li>
+<li>If you can call <code>response.json()</code>, then we attempt to create a specific error message via
+ <code>client.py:get_exception_for_error_code()</code>, which just looks up a code found in
+ <code>response.json()</code><ul>
+<li>I am not convinced this ever works for this tap because my understanding of
+<code>raise_for_status()</code> is if you <code>raise_for_status()</code> is unsuccessful then <code>response.json()</code>
+will also be unsuccessful. So, because we are in the exception handling for
+<code>raise_for_status()</code> I think we never make it past <code>response = response.json()</code> on
+<code>client.py:118</code></li>
+<li>I believe it’s possible to leave the function this way</li>
+</ul>
+</li>
+<li>Assuming <code>response.json()</code> does fail, then that function will raise a
+ <code>simplejson.scanner.JSONDecodeError</code> with an error message like <code>"Expecting value: line 1
+ column 1 (char 0)"</code></li>
+<li>Assuming <code>response.json()</code> worked, but it’s lacks an <code>error</code> key and lacks an <code>errorCode</code> key,
+ we re-raise whatever was caught from <code>raise_for_status()</code></li>
+<li>We also re-raise whatever was caught from <code>raise_for_status()</code> if a <code>ValueError</code> or <code>TypeError</code>
+ occurs in trying to handle the <code>raise_for_status()</code> error</li>
+</ol>
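+<p>The exit paths listed above can be exercised without a network call. The sketch below is a stand-in under stated assumptions, not the tap’s code: <code>FakeResponse</code> is a hypothetical class that mimics only the <code>content</code>, <code>json()</code>, and <code>raise_for_status()</code> members of <code>requests.Response</code>, <code>HTTPError</code> is a local placeholder for <code>requests.HTTPError</code>, and the exception mapping is trimmed to a single entry.</p>

```python
import json


class HTTPError(Exception):
    """Local placeholder for requests.HTTPError."""


class GoogleError(Exception):
    pass


class GoogleNotFoundError(GoogleError):
    pass


# Trimmed to one entry for the sketch.
ERROR_CODE_EXCEPTION_MAPPING = {404: GoogleNotFoundError}


class FakeResponse:
    """Hypothetical stand-in for the parts of requests.Response used here."""

    def __init__(self, status_code, content=b''):
        self.status_code = status_code
        self.content = content

    def json(self):
        # json.JSONDecodeError is a ValueError subclass.
        return json.loads(self.content)

    def raise_for_status(self):
        if self.status_code >= 400:
            raise HTTPError('%s Error' % self.status_code)


def raise_for_error(response):
    try:
        response.raise_for_status()
    except HTTPError as error:
        try:
            if len(response.content) == 0:
                return  # path 1: empty body, the HTTP error is swallowed
            body = response.json()
            if ('error' in body) or ('errorCode' in body):
                message = '%s: %s' % (body.get('error', str(error)),
                                      body.get('message', 'Unknown Error'))
                code = body.get('error', {}).get('code')
                # path 2: a specific exception looked up by error code
                raise ERROR_CODE_EXCEPTION_MAPPING.get(code, GoogleError)(message)
            raise GoogleError(error)  # path 4: JSON body without an error key
        except (ValueError, TypeError):
            raise GoogleError(error)  # paths 3 and 5: body could not be handled
```

+<p>With this stand-in, a 404 whose body is <code>{"error": {"code": 404}}</code> raises <code>GoogleNotFoundError</code>, a 404 with an empty body returns <code>None</code>, and a body that is not JSON raises a plain <code>GoogleError</code>. Whether the real API sends a JSON body on failure is a separate question that this sketch does not answer.</p>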
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-25'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-25'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-26'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-26'>#</a>
+ </div>
+ <p>Try to catch API errors and rethrow them as tap-specific errors</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">raise_for_error</span><span class="p">(</span><span class="n">response</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-27'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-27'>#</a>
+ </div>
+ <p>Inputs:</p>
+<ul>
+<li><code>response</code>: A requests.Response object</li>
+</ul>
+<p>Returns:</p>
+<ul>
+<li>None</li>
+</ul>
+<p>Side Effects:</p>
+<ul>
+<li>Raises a GoogleError</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre>    <span class="k">try</span><span class="p">:</span>
+        <span class="n">response</span><span class="o">.</span><span class="n">raise_for_status</span><span class="p">()</span>
+    <span class="k">except</span> <span class="p">(</span><span class="n">requests</span><span class="o">.</span><span class="n">HTTPError</span><span class="p">,</span> <span class="n">requests</span><span class="o">.</span><span class="n">ConnectionError</span><span class="p">)</span> <span class="k">as</span> <span class="n">error</span><span class="p">:</span>
+        <span class="k">try</span><span class="p">:</span>
+            <span class="n">content_length</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">response</span><span class="o">.</span><span class="n">content</span><span class="p">)</span>
+            <span class="k">if</span> <span class="n">content_length</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
+                <span class="k">return</span>
+            <span class="n">response</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">json</span><span class="p">()</span>
+            <span class="k">if</span> <span class="p">(</span><span class="s1">'error'</span> <span class="ow">in</span> <span class="n">response</span><span class="p">)</span> <span class="ow">or</span> <span class="p">(</span><span class="s1">'errorCode'</span> <span class="ow">in</span> <span class="n">response</span><span class="p">):</span>
+                <span class="n">message</span> <span class="o">=</span> <span class="s1">'</span><span class="si">%s</span><span class="s1">: </span><span class="si">%s</span><span class="s1">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">response</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'error'</span><span class="p">,</span> <span class="nb">str</span><span class="p">(</span><span class="n">error</span><span class="p">)),</span>
+                                      <span class="n">response</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'message'</span><span class="p">,</span> <span class="s1">'Unknown Error'</span><span class="p">))</span>
+                <span class="n">error_code</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'error'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'code'</span><span class="p">)</span>
+                <span class="n">ex</span> <span class="o">=</span> <span class="n">get_exception_for_error_code</span><span class="p">(</span><span class="n">error_code</span><span class="p">)</span>
+                <span class="k">raise</span> <span class="n">ex</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
+            <span class="k">raise</span> <span class="n">GoogleError</span><span class="p">(</span><span class="n">error</span><span class="p">)</span>
+        <span class="k">except</span> <span class="p">(</span><span class="ne">ValueError</span><span class="p">,</span> <span class="ne">TypeError</span><span class="p">):</span>
+            <span class="k">raise</span> <span class="n">GoogleError</span><span class="p">(</span><span class="n">error</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-28'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-28'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-29'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-29'>#</a>
+ </div>
+ <h3>Handling a successful response</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-30'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-30'>#</a>
+ </div>
+ <p>A successful response is defined as anything that returns an <code>HTTP 200</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-31'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-31'>#</a>
+ </div>
+ <p>On a successful response, we store the returned <code>access_token</code> in a private field,
+<code>GoogleClient.__access_token</code>, and we update <code>GoogleClient.__expires</code> to the time this
+<code>access_token</code> expires. <code>GoogleClient.__expires</code> is a <code>datetime</code> object in UTC.</p>
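+<p>A minimal sketch of that expiry arithmetic, assuming the token response carries an <code>expires_in</code>
+lifetime in seconds (as the code reads it). The helper name <code>compute_expiry</code> is hypothetical and is
+not part of the tap, and the sketch uses a timezone-aware <code>datetime</code> for illustration:</p>
+<pre><code class="language-Python">from datetime import datetime, timedelta, timezone
+
+
+def compute_expiry(expires_in, now=None):
+    """Return the UTC datetime at which a token issued at `now` expires."""
+    # Hypothetical helper: `now` defaults to the current UTC time.
+    if now is None:
+        now = datetime.now(timezone.utc)
+    return now + timedelta(seconds=expires_in)
+
+
+issued = datetime(2020, 1, 1, tzinfo=timezone.utc)
+print(compute_expiry(3600, now=issued).isoformat())
+</code></pre>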
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-32'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-32'>#</a>
+ </div>
+ <h3>Handling an unsuccessful response</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-33'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-33'>#</a>
+ </div>
+ <p>To handle unsuccessful requests, the tap uses the following pattern:</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-34'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-34'>#</a>
+ </div>
+ <pre><code class="language-Python">if response.status_code >= 500:
+ raise Server5xxError()
+
+if response.status_code != 200:
+ raise_for_error(response)
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-35'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-35'>#</a>
+ </div>
+ <p><code>client.py:Server5xxError</code> is caught by the <code>backoff</code> decorator, which retries the request with exponential backoff.</p>
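+<p>A rough sketch of what retrying with exponential backoff means, using only the standard library. The
+function <code>retry_with_backoff</code> is a hypothetical stand-in, not the <code>backoff</code> library's
+implementation, and it omits jitter. The exception class here is a local placeholder for
+<code>client.py:Server5xxError</code>:</p>
+<pre><code class="language-Python">import time
+
+
+class Server5xxError(Exception):
+    """Local placeholder for client.py:Server5xxError."""
+
+
+def retry_with_backoff(func, max_tries=5, factor=2, sleep=time.sleep):
+    """Call func, retrying on Server5xxError with exponentially growing waits."""
+    for attempt in range(max_tries):
+        try:
+            return func()
+        except Server5xxError:
+            # Give up once the final attempt has failed.
+            if attempt == max_tries - 1:
+                raise
+            # Wait factor * 2**attempt seconds before the next try.
+            sleep(factor * 2 ** attempt)
+
+
+# Simulated status codes: two server errors, then a success.
+statuses = [500, 503, 200]
+waits = []
+
+
+def flaky_request():
+    status = statuses.pop(0)
+    if status >= 500:
+        raise Server5xxError()
+    return status
+
+
+# Record the waits instead of sleeping so the example runs instantly.
+result = retry_with_backoff(flaky_request, sleep=waits.append)
+print(result, waits)
+</code></pre>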
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-36'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-36'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-37'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-37'>#</a>
+ </div>
+ <p><code>GoogleClient</code> is a class implemented in the tap in <code>client.py</code>. We initialize it once, as a
+context manager, in <code>__init__.py:main()</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">class</span> <span class="nc">GoogleClient</span><span class="p">:</span> <span class="c1"># pylint: disable=too-many-instance-attributes</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-38'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-38'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-39'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-39'>#</a>
+ </div>
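+ <p>To create the <code>GoogleClient</code> object, we have to pass in the three OAuth2 variables
+(<code>client_id</code>, <code>client_secret</code>, <code>refresh_token</code>) and an <code>access_token</code>. Optionally, we
+can include the <code>user_agent</code>.</p>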
+<p>Side Effects:</p>
+<ul>
+<li>All of this gets stored in private fields by the constructor.</li>
+<li>The constructor also initializes a <code>requests.Session</code>.</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">def</span> <span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">client_id</span><span class="p">,</span> <span class="n">client_secret</span><span class="p">,</span> <span class="n">refresh_token</span><span class="p">,</span> <span class="n">access_token</span><span class="p">,</span> <span class="n">user_agent</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-40'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-40'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="bp">self</span><span class="o">.</span><span class="n">__client_id</span> <span class="o">=</span> <span class="n">client_id</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">__client_secret</span> <span class="o">=</span> <span class="n">client_secret</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">__refresh_token</span> <span class="o">=</span> <span class="n">refresh_token</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">__user_agent</span> <span class="o">=</span> <span class="n">user_agent</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">__access_token</span> <span class="o">=</span> <span class="n">access_token</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">__session</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">Session</span><span class="p">()</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">base_url</span> <span class="o">=</span> <span class="kc">None</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-41'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-41'>#</a>
+ </div>
+ <p>On enter, get a new access token</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">def</span> <span class="fm">__enter__</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-42'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-42'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="bp">self</span><span class="o">.</span><span class="n">get_access_token</span><span class="p">()</span>
+ <span class="k">return</span> <span class="bp">self</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-43'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-43'>#</a>
+ </div>
+ <p>On exit, close the Requests Session</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">def</span> <span class="fm">__exit__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">exception_type</span><span class="p">,</span> <span class="n">exception_value</span><span class="p">,</span> <span class="n">traceback</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-44'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-44'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="bp">self</span><span class="o">.</span><span class="n">__session</span><span class="o">.</span><span class="n">close</span><span class="p">()</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-45'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-45'>#</a>
+ </div>
+ <p><code>get_access_token()</code> will <code>POST</code> to <code>client.py:GOOGLE_TOKEN_URI</code>, which is
+<code>https://oauth2.googleapis.com/token</code>. The fields in the body of the <code>POST</code> look like this:</p>
+<pre><code class="language-JSON">{
+ "grant_type": "refresh_token",
+ "client_id": my_client_id,
+ "client_secret": my_client_secret,
+ "refresh_token": my_refresh_token
+}
+</code></pre>
+<p>Side Effects:</p>
+<ul>
+<li>Store the access token and time it expires in private fields on the Client object</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="nd">@backoff</span><span class="o">.</span><span class="n">on_exception</span><span class="p">(</span><span class="n">backoff</span><span class="o">.</span><span class="n">expo</span><span class="p">,</span>
+ <span class="n">Server5xxError</span><span class="p">,</span>
+ <span class="n">max_tries</span><span class="o">=</span><span class="mi">5</span><span class="p">,</span>
+ <span class="n">factor</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
+ <span class="k">def</span> <span class="nf">get_access_token</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-46'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-46'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">__access_token</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span>
+ <span class="k">return</span>
+
+ <span class="n">headers</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">__user_agent</span><span class="p">:</span>
+ <span class="n">headers</span><span class="p">[</span><span class="s1">'User-Agent'</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">__user_agent</span>
+
+ <span class="n">response</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">__session</span><span class="o">.</span><span class="n">post</span><span class="p">(</span>
+ <span class="n">url</span><span class="o">=</span><span class="n">GOOGLE_TOKEN_URI</span><span class="p">,</span>
+ <span class="n">headers</span><span class="o">=</span><span class="n">headers</span><span class="p">,</span>
+ <span class="n">data</span><span class="o">=</span><span class="p">{</span>
+ <span class="s1">'grant_type'</span><span class="p">:</span> <span class="s1">'refresh_token'</span><span class="p">,</span>
+ <span class="s1">'client_id'</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">__client_id</span><span class="p">,</span>
+ <span class="s1">'client_secret'</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">__client_secret</span><span class="p">,</span>
+ <span class="s1">'refresh_token'</span><span class="p">:</span> <span class="bp">self</span><span class="o">.</span><span class="n">__refresh_token</span><span class="p">,</span>
+ <span class="p">})</span>
+
+ <span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">status_code</span> <span class="o">>=</span> <span class="mi">500</span><span class="p">:</span>
+ <span class="k">raise</span> <span class="n">Server5xxError</span><span class="p">()</span>
+
+ <span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">status_code</span> <span class="o">!=</span> <span class="mi">200</span><span class="p">:</span>
+ <span class="n">raise_for_error</span><span class="p">(</span><span class="n">response</span><span class="p">)</span>
+
+ <span class="n">data</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">json</span><span class="p">()</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">__access_token</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s1">'access_token'</span><span class="p">]</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">__expires</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">utcnow</span><span class="p">()</span> <span class="o">+</span> <span class="n">timedelta</span><span class="p">(</span><span class="n">seconds</span><span class="o">=</span><span class="n">data</span><span class="p">[</span><span class="s1">'expires_in'</span><span class="p">])</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Authorized, token expires = </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">__expires</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-47'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-47'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-48'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-48'>#</a>
+ </div>
+ <p>This function starts with a call to <code>GoogleClient.get_access_token()</code>, which returns
+immediately when an access token is already stored.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-49'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-49'>#</a>
+ </div>
+ <p>Then we decide which URL to send the request to. It is <code>https://www.googleapis.com/drive/v3</code>
+when <code>api</code> is <code>"files"</code>, and <code>https://sheets.googleapis.com/v4</code> otherwise.</p>
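+<p>As a sketch, the URL choice boils down to the following. The function name <code>build_url</code>, the
+constant names, and the example paths are hypothetical; only the two base URLs and the
+<code>"files"</code> check come from the tap:</p>
+<pre><code class="language-Python">SHEETS_BASE_URL = 'https://sheets.googleapis.com/v4'
+DRIVE_BASE_URL = 'https://www.googleapis.com/drive/v3'
+
+
+def build_url(path, api=None):
+    """Join path onto the Drive base URL for api == 'files', else onto Sheets."""
+    base_url = DRIVE_BASE_URL if api == 'files' else SHEETS_BASE_URL
+    return '{}/{}'.format(base_url, path)
+
+
+# Example paths are made up for illustration.
+print(build_url('spreadsheets/abc123', api='sheets'))
+print(build_url('files/abc123', api='files'))
+</code></pre>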
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-50'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-50'>#</a>
+ </div>
+ <ul>
+<li>It seems like a mistake to decide this so deep into the code. Why doesn’t the caller decide
+ where the request goes?</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-51'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-51'>#</a>
+ </div>
+ <p>Then we set up the request headers. The <code>Authorization</code>, <code>User-Agent</code>, and <code>Content-Type</code> keys come
+into play here.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-52'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-52'>#</a>
+ </div>
+ <ul>
+<li>One benefit of a <code>requests.Session</code> is that you can set the headers for the session. I’m not
+sure why we don’t do that here.</li>
+<li>If we did that, we wouldn’t have to think about the <code>access_token</code> making it into the headers
+here. It would already be there.</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-53'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-53'>#</a>
+ </div>
+ <p>Then we make the request, timing how long it takes with a <code>singer.metrics.http_request_timer</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-54'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-54'>#</a>
+ </div>
+ <p>The chunk of code after making the request handles an unsuccessful response. We retry
+<code>HTTP 5xx</code> and <code>HTTP 429</code> errors, and call <code>client.py:raise_for_error</code> for every other non-200 response.</p>
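+<p>The branch order can be sketched as a small classifier. The function name <code>classify_status</code> and its
+string labels are hypothetical; the tap raises exceptions instead of returning labels:</p>
+<pre><code class="language-Python">def classify_status(status_code):
+    """Map an HTTP status code to the action the request logic takes."""
+    # Any 5xx and 429 are retried by the backoff decorator.
+    if status_code >= 500 or status_code == 429:
+        return 'retry'
+    # Every other non-200 status is handed to raise_for_error.
+    if status_code != 200:
+        return 'raise'
+    return 'ok'
+
+
+for code in (200, 404, 429, 503):
+    print(code, classify_status(code))
+</code></pre>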
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-55'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-55'>#</a>
+ </div>
+ <p>The most unusual thing about this tap happens here: we return the response as an <code>OrderedDict</code> with this
+line:</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-56'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-56'>#</a>
+ </div>
+ <pre><code class="language-Python">return response.json(object_pairs_hook=OrderedDict)
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-57'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-57'>#</a>
+ </div>
+ <p>where <code>object_pairs_hook</code> is a <code>kwarg</code> passed to the JSON parser used by <code>requests</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-58'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-58'>#</a>
+ </div>
+ <p>This turns every JSON object in the response, including nested ones, into an <code>OrderedDict</code> that keeps the keys in the order received.</p>
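+<p>The effect can be reproduced without a network call, since the standard library's <code>json.loads</code>
+accepts the same <code>object_pairs_hook</code> argument. This is only an illustration; the sample payload is
+made up. On Python 3.7 and later a plain <code>dict</code> also preserves insertion order, so the two results
+compare equal:</p>
+<pre><code class="language-Python">import json
+from collections import OrderedDict
+
+# Made-up payload with a nested object.
+raw = '{"b": 1, "a": {"d": 2, "c": 3}}'
+
+plain = json.loads(raw)
+ordered = json.loads(raw, object_pairs_hook=OrderedDict)
+
+# The hook is applied to every JSON object, including nested ones.
+print(type(plain).__name__)
+print(type(ordered).__name__, type(ordered["a"]).__name__)
+print(list(ordered), list(ordered["a"]))
+</code></pre>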
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-59'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-59'>#</a>
+ </div>
+ <p>Why do we do this? I don’t know. See the footnotes for code examples.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-60'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-60'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-61'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-61'>#</a>
+ </div>
+ <p>Rate Limit: https://developers.google.com/sheets/api/limits
+ 100 requests per 100 seconds per user</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="nd">@backoff</span><span class="o">.</span><span class="n">on_exception</span><span class="p">(</span><span class="n">backoff</span><span class="o">.</span><span class="n">expo</span><span class="p">,</span>
+ <span class="p">(</span><span class="n">Server5xxError</span><span class="p">,</span> <span class="ne">ConnectionError</span><span class="p">,</span> <span class="n">Server429Error</span><span class="p">),</span>
+ <span class="n">max_tries</span><span class="o">=</span><span class="mi">7</span><span class="p">,</span>
+ <span class="n">factor</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
+ <span class="nd">@utils</span><span class="o">.</span><span class="n">ratelimit</span><span class="p">(</span><span class="mi">100</span><span class="p">,</span> <span class="mi">100</span><span class="p">)</span>
+ <span class="k">def</span> <span class="nf">request</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">method</span><span class="p">,</span> <span class="n">path</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">url</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">api</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-62'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-62'>#</a>
+ </div>
+ <p>Make a request to the API</p>
+<p>Inputs:</p>
+<ul>
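+<li>method: “GET” or “POST”</li>
+<li>url: The full URL to request. When omitted, it is built from the base URL and <code>path</code></li>
+<li>path: The path appended to the base URL when <code>url</code> is not given</li>
+<li>api: <code>"files"</code> selects the Drive v3 base URL; anything else selects Sheets v4</li>
+</ul>
+<p>Returns:</p>
+<ul>
+<li>The JSON body of the response, parsed into an <code>OrderedDict</code></li>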
+</ul>
+<p>Side Effects:</p>
+<ul>
+<li>Might store a new access token</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="bp">self</span><span class="o">.</span><span class="n">get_access_token</span><span class="p">()</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-63'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-63'>#</a>
+ </div>
+ <p>Construct the URL to make a request to</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="bp">self</span><span class="o">.</span><span class="n">base_url</span> <span class="o">=</span> <span class="s1">'https://sheets.googleapis.com/v4'</span>
+ <span class="k">if</span> <span class="n">api</span> <span class="o">==</span> <span class="s1">'files'</span><span class="p">:</span>
+ <span class="bp">self</span><span class="o">.</span><span class="n">base_url</span> <span class="o">=</span> <span class="s1">'https://www.googleapis.com/drive/v3'</span>
+
+ <span class="k">if</span> <span class="ow">not</span> <span class="n">url</span> <span class="ow">and</span> <span class="n">path</span><span class="p">:</span>
+ <span class="n">url</span> <span class="o">=</span> <span class="s1">'</span><span class="si">{}</span><span class="s1">/</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">base_url</span><span class="p">,</span> <span class="n">path</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="s1">'endpoint'</span> <span class="ow">in</span> <span class="n">kwargs</span><span class="p">:</span>
+ <span class="n">endpoint</span> <span class="o">=</span> <span class="n">kwargs</span><span class="p">[</span><span class="s1">'endpoint'</span><span class="p">]</span>
+ <span class="k">del</span> <span class="n">kwargs</span><span class="p">[</span><span class="s1">'endpoint'</span><span class="p">]</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">endpoint</span> <span class="o">=</span> <span class="kc">None</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'</span><span class="si">{}</span><span class="s1"> URL = </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">endpoint</span><span class="p">,</span> <span class="n">url</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-64'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-64'>#</a>
+ </div>
+ <p>Construct the <code>headers</code> arg for <code>requests.request()</code></p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="s1">'headers'</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">kwargs</span><span class="p">:</span>
+ <span class="n">kwargs</span><span class="p">[</span><span class="s1">'headers'</span><span class="p">]</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">kwargs</span><span class="p">[</span><span class="s1">'headers'</span><span class="p">][</span><span class="s1">'Authorization'</span><span class="p">]</span> <span class="o">=</span> <span class="s1">'Bearer </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">__access_token</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">__user_agent</span><span class="p">:</span>
+ <span class="n">kwargs</span><span class="p">[</span><span class="s1">'headers'</span><span class="p">][</span><span class="s1">'User-Agent'</span><span class="p">]</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">__user_agent</span>
+
+ <span class="k">if</span> <span class="n">method</span> <span class="o">==</span> <span class="s1">'POST'</span><span class="p">:</span>
+ <span class="n">kwargs</span><span class="p">[</span><span class="s1">'headers'</span><span class="p">][</span><span class="s1">'Content-Type'</span><span class="p">]</span> <span class="o">=</span> <span class="s1">'application/json'</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-65'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-65'>#</a>
+ </div>
+ <p>Make request</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">with</span> <span class="n">metrics</span><span class="o">.</span><span class="n">http_request_timer</span><span class="p">(</span><span class="n">endpoint</span><span class="p">)</span> <span class="k">as</span> <span class="n">timer</span><span class="p">:</span>
+ <span class="n">response</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">__session</span><span class="o">.</span><span class="n">request</span><span class="p">(</span><span class="n">method</span><span class="p">,</span> <span class="n">url</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span>
+ <span class="n">timer</span><span class="o">.</span><span class="n">tags</span><span class="p">[</span><span class="n">metrics</span><span class="o">.</span><span class="n">Tag</span><span class="o">.</span><span class="n">http_status_code</span><span class="p">]</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">status_code</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-66'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-66'>#</a>
+ </div>
+ <p>Start backoff logic</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">status_code</span> <span class="o">>=</span> <span class="mi">500</span><span class="p">:</span>
+ <span class="k">raise</span> <span class="n">Server5xxError</span><span class="p">()</span>
+
+ <span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">status_code</span> <span class="o">==</span> <span class="mi">429</span><span class="p">:</span>
+ <span class="k">raise</span> <span class="n">Server429Error</span><span class="p">()</span>
+
+ <span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">status_code</span> <span class="o">!=</span> <span class="mi">200</span><span class="p">:</span>
+ <span class="n">raise_for_error</span><span class="p">(</span><span class="n">response</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-67'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-67'>#</a>
+ </div>
+ <p>Ensure keys and rows are ordered as received from the API.
+QUESTION: But why?</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">return</span> <span class="n">response</span><span class="o">.</span><span class="n">json</span><span class="p">(</span><span class="n">object_pairs_hook</span><span class="o">=</span><span class="n">OrderedDict</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-68'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-68'>#</a>
+ </div>
+ <h3>Syntactic Sugar</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">def</span> <span class="nf">get</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="n">api</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
+ <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">request</span><span class="p">(</span><span class="n">method</span><span class="o">=</span><span class="s1">'GET'</span><span class="p">,</span> <span class="n">path</span><span class="o">=</span><span class="n">path</span><span class="p">,</span> <span class="n">api</span><span class="o">=</span><span class="n">api</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-69'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-69'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">def</span> <span class="nf">post</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="n">api</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
+ <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">request</span><span class="p">(</span><span class="n">method</span><span class="o">=</span><span class="s1">'POST'</span><span class="p">,</span> <span class="n">path</span><span class="o">=</span><span class="n">path</span><span class="p">,</span> <span class="n">api</span><span class="o">=</span><span class="n">api</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-70'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-70'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-71'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-71'>#</a>
+ </div>
+ <h1>Footnotes</h1>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-72'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-72'>#</a>
+ </div>
+ <p>Here’s the output of a normal <code>.json()</code> call:</p>
+<pre><code class="language-Python">{"file": "this is my file"}
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-73'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-73'>#</a>
+ </div>
+ <p>Here’s the output of the weird one, <code>.json(object_pairs_hook=OrderedDict)</code>:</p>
+<pre><code class="language-Python">OrderedDict([('file', 'this is my file')])
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-74'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-74'>#</a>
+ </div>
+ <p>Here’s a more complex example:</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-75'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-75'>#</a>
+ </div>
+ <pre><code class="language-python">{ "deleted": false,
+ "__v": 0,
+ "_id": "5887e1d85c873e0011036889",
+ "text": "Cats make about 100 different sounds. Dogs make only about 10.",
+ "createdAt": "2018-01-15T21:20:00.003Z",
+ "updatedAt": "2020-09-03T16:39:39.578Z",
+ "used": true,
+ "status": {
+ "sentCount": 1,
+ "feedback": "",
+ "verified": true
+ },
+ "type": "cat",
+ "user": "5a9ac18c7478810ea6c06381",
+ "source": "user"}
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-76'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-76'>#</a>
+ </div>
+ <p>Versus <code>.json(object_pairs_hook=OrderedDict)</code></p>
+<pre><code class="language-python">OrderedDict([('status', OrderedDict([('verified', True),
+ ('sentCount', 1),
+ ('feedback', '')])),
+ ('type', 'cat'),
+ ('deleted', False),
+ ('_id', '5887e1d85c873e0011036889'),
+ ('user', '5a9ac18c7478810ea6c06381'),
+ ('text', 'Cats make about 100 different sounds. Dogs make only about 10.'),
+ ('__v', 0),
+ ('source', 'user'),
+ ('updatedAt', '2020-09-03T16:39:39.578Z'),
+ ('createdAt', '2018-01-15T21:20:00.003Z'),
+ ('used', True)])
+</code></pre>
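+<p>A minimal sketch of the same difference, assuming the standard library <code>json.loads</code> as a stand-in for the response’s <code>.json()</code> method. The sample payload is made up for illustration:</p>

```python
import json
from collections import OrderedDict

# Hypothetical payload, not taken from a real API response.
raw = '{"file": "this is my file", "status": {"sentCount": 1, "verified": true}}'

plain = json.loads(raw)
ordered = json.loads(raw, object_pairs_hook=OrderedDict)

# The hook is applied to every JSON object, including nested ones.
print(type(plain).__name__)
print(type(ordered["status"]).__name__)

# An OrderedDict compares equal to a plain dict holding the same items.
print(plain == ordered)
```

+<p>In this sketch the hook changes only the container type; the keys and values are the same.</p>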
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+</div>
+</body>
--- /dev/null
+<!DOCTYPE html>
+<html>
+<head>
+ <meta http-equiv="content-type" content="text/html;charset=utf-8">
+ <title>discover.py</title>
+ <link rel="stylesheet" href="pycco.css">
+</head>
+<body>
+<div id='container'>
+ <div id="background"></div>
+ <div class='section'>
+ <div class='docs'><h1>discover.py</h1></div>
+ </div>
+ <div class='clearall'>
+ <div class='section' id='section-0'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-0'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">singer.catalog</span> <span class="kn">import</span> <span class="n">Catalog</span><span class="p">,</span> <span class="n">CatalogEntry</span><span class="p">,</span> <span class="n">Schema</span>
+<span class="kn">from</span> <span class="nn">tap_google_sheets.schema</span> <span class="kn">import</span> <span class="n">get_schemas</span><span class="p">,</span> <span class="n">STREAMS</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-1'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-1'>#</a>
+ </div>
+ <p>Construct a Catalog Entry for each stream</p>
+<p>Inputs:</p>
+<ul>
+<li>client: A <code>GoogleClient</code> object</li>
+<li>spreadsheet_id: the ID of a Google Sheet Doc</li>
+</ul>
+<p>Returns:</p>
+<ul>
+<li>A singer.Catalog object</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">discover</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-2'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-2'>#</a>
+ </div>
+ <p>It’s typical for taps in this style to call <code>schema.py:get_schemas()</code> to get <code>schemas</code> and
+<code>field_metadata</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-3'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-3'>#</a>
+ </div>
+ <p>Here <code>schemas</code> is a dictionary of stream name to JSON schema, and <code>field_metadata</code> is a dictionary
+of stream name to that stream’s metadata entries. In this tap, it seems that <code>discover.py:discover()</code>
+only reads one thing from <code>field_metadata</code>: the <code>table-key-properties</code> value, when an entry has one.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-4'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-4'>#</a>
+ </div>
+ <ul>
+<li>This could be a point of confusion because <code>table-key-properties</code> is a stream / table level
+metadata, which you may or may not expect to be returned and stored in <code>field_metadata</code>.</li>
+</ul>
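+<p>The lookup can be sketched on its own. The helper name <code>find_key_properties</code> and the sample entries are hypothetical; the entry shape (a <code>breadcrumb</code> plus a <code>metadata</code> dictionary) is an assumption based on how the loop reads each entry:</p>

```python
def find_key_properties(mdata):
    """Return the last non-empty 'table-key-properties' found, else None."""
    key_properties = None
    for entry in mdata:
        table_key_properties = entry.get("metadata", {}).get("table-key-properties")
        if table_key_properties:
            key_properties = table_key_properties
    return key_properties


# Hypothetical metadata entries for one stream.
sample_mdata = [
    # Stream-level entry (empty breadcrumb) carrying the key properties.
    {"breadcrumb": [], "metadata": {"table-key-properties": ["__sdc_row"]}},
    # Field-level entry; it has no 'table-key-properties' key.
    {"breadcrumb": ["properties", "__sdc_row"], "metadata": {"inclusion": "automatic"}},
]

print(find_key_properties(sample_mdata))
print(find_key_properties([]))
```

+<p>Only the stream-level entry contributes a value here; the field-level entry is skipped because it has no such key.</p>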
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">schemas</span><span class="p">,</span> <span class="n">field_metadata</span> <span class="o">=</span> <span class="n">get_schemas</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">)</span>
+ <span class="n">catalog</span> <span class="o">=</span> <span class="n">Catalog</span><span class="p">([])</span>
+
+ <span class="k">for</span> <span class="n">stream_name</span><span class="p">,</span> <span class="n">schema_dict</span> <span class="ow">in</span> <span class="n">schemas</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
+ <span class="n">schema</span> <span class="o">=</span> <span class="n">Schema</span><span class="o">.</span><span class="n">from_dict</span><span class="p">(</span><span class="n">schema_dict</span><span class="p">)</span>
+ <span class="n">mdata</span> <span class="o">=</span> <span class="n">field_metadata</span><span class="p">[</span><span class="n">stream_name</span><span class="p">]</span>
+ <span class="n">key_properties</span> <span class="o">=</span> <span class="kc">None</span>
+ <span class="k">for</span> <span class="n">mdt</span> <span class="ow">in</span> <span class="n">mdata</span><span class="p">:</span>
+ <span class="n">table_key_properties</span> <span class="o">=</span> <span class="n">mdt</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'metadata'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'table-key-properties'</span><span class="p">)</span>
+ <span class="k">if</span> <span class="n">table_key_properties</span><span class="p">:</span>
+ <span class="n">key_properties</span> <span class="o">=</span> <span class="n">table_key_properties</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-5'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-5'>#</a>
+ </div>
+ <p>Once we have the <code>stream_name</code>, the value of <code>table-key-properties</code>, the schema, and the
+metadata for a stream, we pass all of that to the <code>singer.CatalogEntry</code> constructor
+and append the result to the <code>singer.Catalog</code> object initialized at the start of
+<code>discover.py:discover()</code>. Note that a <code>key_properties</code> value defined in <code>STREAMS</code> takes
+precedence; the <code>table-key-properties</code> value is only the fallback.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">catalog</span><span class="o">.</span><span class="n">streams</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">CatalogEntry</span><span class="p">(</span>
+ <span class="n">stream</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">tap_stream_id</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">key_properties</span><span class="o">=</span><span class="n">STREAMS</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'key_properties'</span><span class="p">,</span> <span class="n">key_properties</span><span class="p">),</span>
+ <span class="n">schema</span><span class="o">=</span><span class="n">schema</span><span class="p">,</span>
+ <span class="n">metadata</span><span class="o">=</span><span class="n">mdata</span>
+ <span class="p">))</span>
+
+ <span class="k">return</span> <span class="n">catalog</span>
+
+</pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+</div>
+</body>
--- /dev/null
+/*--------------------- Layout and Typography ----------------------------*/
+body {
+ font-family: 'Palatino Linotype', 'Book Antiqua', Palatino, FreeSerif, serif;
+ font-size: 16px;
+ line-height: 24px;
+ color: #252519;
+ margin: 0; padding: 0;
+ background: #f5f5ff;
+}
+a {
+ color: #261a3b;
+}
+ a:visited {
+ color: #261a3b;
+ }
+p {
+ margin: 0 0 15px 0;
+}
+h1, h2, h3, h4, h5, h6 {
+ margin: 40px 0 15px 0;
+}
+h2, h3, h4, h5, h6 {
+ margin-top: 0;
+ }
+#container {
+ background: white;
+ }
+#container, div.section {
+ position: relative;
+}
+#background {
+ position: absolute;
+ top: 0; left: 580px; right: 0; bottom: 0;
+ background: #f5f5ff;
+ border-left: 1px solid #e5e5ee;
+ z-index: 0;
+}
+#jump_to, #jump_page {
+ background: white;
+ -webkit-box-shadow: 0 0 25px #777; -moz-box-shadow: 0 0 25px #777;
+ -webkit-border-bottom-left-radius: 5px; -moz-border-radius-bottomleft: 5px;
+ font: 10px Arial;
+ text-transform: uppercase;
+ cursor: pointer;
+ text-align: right;
+}
+#jump_to, #jump_wrapper {
+ position: fixed;
+ right: 0; top: 0;
+ padding: 5px 10px;
+}
+ #jump_wrapper {
+ padding: 0;
+ display: none;
+ }
+ #jump_to:hover #jump_wrapper {
+ display: block;
+ }
+ #jump_page {
+ padding: 5px 0 3px;
+ margin: 0 0 25px 25px;
+ }
+ #jump_page .source {
+ display: block;
+ padding: 5px 10px;
+ text-decoration: none;
+ border-top: 1px solid #eee;
+ }
+ #jump_page .source:hover {
+ background: #f5f5ff;
+ }
+ #jump_page .source:first-child {
+ }
+div.docs {
+ float: left;
+ max-width: 500px;
+ min-width: 500px;
+ min-height: 5px;
+ padding: 10px 25px 1px 50px;
+ vertical-align: top;
+ text-align: left;
+}
+ .docs pre {
+ margin: 15px 0 15px;
+ padding-left: 15px;
+ overflow-y: scroll;
+ }
+ .docs p tt, .docs p code {
+ background: #f8f8ff;
+ border: 1px solid #dedede;
+ font-size: 12px;
+ padding: 0 0.2em;
+ }
+ .octowrap {
+ position: relative;
+ }
+ .octothorpe {
+ font: 12px Arial;
+ text-decoration: none;
+ color: #454545;
+ position: absolute;
+ top: 3px; left: -20px;
+ padding: 1px 2px;
+ opacity: 0;
+ -webkit-transition: opacity 0.2s linear;
+ }
+ div.docs:hover .octothorpe {
+ opacity: 1;
+ }
+div.code {
+ margin-left: 580px;
+ padding: 14px 15px 16px 50px;
+ vertical-align: top;
+}
+ .code pre, .docs p code {
+ font-size: 12px;
+ }
+ pre, tt, code {
+ line-height: 18px;
+ font-family: Monaco, Consolas, "Lucida Console", monospace;
+ margin: 0; padding: 0;
+ }
+div.clearall {
+ clear: both;
+}
+
+
+/*---------------------- Syntax Highlighting -----------------------------*/
+td.linenos { background-color: #f0f0f0; padding-right: 10px; }
+span.lineno { background-color: #f0f0f0; padding: 0 5px 0 5px; }
+body .hll { background-color: #ffffcc }
+body .c { color: #408080; font-style: italic } /* Comment */
+body .err { border: 1px solid #FF0000 } /* Error */
+body .k { color: #954121 } /* Keyword */
+body .o { color: #666666 } /* Operator */
+body .cm { color: #408080; font-style: italic } /* Comment.Multiline */
+body .cp { color: #BC7A00 } /* Comment.Preproc */
+body .c1 { color: #408080; font-style: italic } /* Comment.Single */
+body .cs { color: #408080; font-style: italic } /* Comment.Special */
+body .gd { color: #A00000 } /* Generic.Deleted */
+body .ge { font-style: italic } /* Generic.Emph */
+body .gr { color: #FF0000 } /* Generic.Error */
+body .gh { color: #000080; font-weight: bold } /* Generic.Heading */
+body .gi { color: #00A000 } /* Generic.Inserted */
+body .go { color: #808080 } /* Generic.Output */
+body .gp { color: #000080; font-weight: bold } /* Generic.Prompt */
+body .gs { font-weight: bold } /* Generic.Strong */
+body .gu { color: #800080; font-weight: bold } /* Generic.Subheading */
+body .gt { color: #0040D0 } /* Generic.Traceback */
+body .kc { color: #954121 } /* Keyword.Constant */
+body .kd { color: #954121; font-weight: bold } /* Keyword.Declaration */
+body .kn { color: #954121; font-weight: bold } /* Keyword.Namespace */
+body .kp { color: #954121 } /* Keyword.Pseudo */
+body .kr { color: #954121; font-weight: bold } /* Keyword.Reserved */
+body .kt { color: #B00040 } /* Keyword.Type */
+body .m { color: #666666 } /* Literal.Number */
+body .s { color: #219161 } /* Literal.String */
+body .na { color: #7D9029 } /* Name.Attribute */
+body .nb { color: #954121 } /* Name.Builtin */
+body .nc { color: #0000FF; font-weight: bold } /* Name.Class */
+body .no { color: #880000 } /* Name.Constant */
+body .nd { color: #AA22FF } /* Name.Decorator */
+body .ni { color: #999999; font-weight: bold } /* Name.Entity */
+body .ne { color: #D2413A; font-weight: bold } /* Name.Exception */
+body .nf { color: #0000FF } /* Name.Function */
+body .nl { color: #A0A000 } /* Name.Label */
+body .nn { color: #0000FF; font-weight: bold } /* Name.Namespace */
+body .nt { color: #954121; font-weight: bold } /* Name.Tag */
+body .nv { color: #19469D } /* Name.Variable */
+body .ow { color: #AA22FF; font-weight: bold } /* Operator.Word */
+body .w { color: #bbbbbb } /* Text.Whitespace */
+body .mf { color: #666666 } /* Literal.Number.Float */
+body .mh { color: #666666 } /* Literal.Number.Hex */
+body .mi { color: #666666 } /* Literal.Number.Integer */
+body .mo { color: #666666 } /* Literal.Number.Oct */
+body .sb { color: #219161 } /* Literal.String.Backtick */
+body .sc { color: #219161 } /* Literal.String.Char */
+body .sd { color: #219161; font-style: italic } /* Literal.String.Doc */
+body .s2 { color: #219161 } /* Literal.String.Double */
+body .se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */
+body .sh { color: #219161 } /* Literal.String.Heredoc */
+body .si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */
+body .sx { color: #954121 } /* Literal.String.Other */
+body .sr { color: #BB6688 } /* Literal.String.Regex */
+body .s1 { color: #219161 } /* Literal.String.Single */
+body .ss { color: #19469D } /* Literal.String.Symbol */
+body .bp { color: #954121 } /* Name.Builtin.Pseudo */
+body .vc { color: #19469D } /* Name.Variable.Class */
+body .vg { color: #19469D } /* Name.Variable.Global */
+body .vi { color: #19469D } /* Name.Variable.Instance */
+body .il { color: #666666 } /* Literal.Number.Integer.Long */
--- /dev/null
+<!DOCTYPE html>
+<html>
+<head>
+ <meta http-equiv="content-type" content="text/html;charset=utf-8">
+ <title>schema.py</title>
+ <link rel="stylesheet" href="pycco.css">
+</head>
+<body>
+<div id='container'>
+ <div id="background"></div>
+ <div class='section'>
+ <div class='docs'><h1>schema.py</h1></div>
+ </div>
+ <div class='clearall'>
+ <div class='section' id='section-0'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-0'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">os</span>
+<span class="kn">import</span> <span class="nn">json</span>
+<span class="kn">import</span> <span class="nn">re</span>
+<span class="kn">import</span> <span class="nn">urllib.parse</span>
+<span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">OrderedDict</span>
+<span class="kn">import</span> <span class="nn">singer</span>
+<span class="kn">from</span> <span class="nn">singer</span> <span class="kn">import</span> <span class="n">metadata</span>
+<span class="kn">from</span> <span class="nn">tap_google_sheets.streams</span> <span class="kn">import</span> <span class="n">STREAMS</span>
+
+<span class="n">LOGGER</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">get_logger</span><span class="p">()</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-1'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-1'>#</a>
+ </div>
+ <p>Convert column index to column letter</p>
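+<p>A few worked values may help here. The sketch below repeats the function body shown alongside and prints the letters for some sample indexes, assuming 1-based column indexes (which is how the caller in this file counts them):</p>

```python
def colnum_string(num):
    """Convert a 1-based column index to its spreadsheet letter(s)."""
    string = ""
    while num > 0:
        # Subtract 1 first because there is no zero digit: A=1 ... Z=26.
        num, remainder = divmod(num - 1, 26)
        string = chr(65 + remainder) + string
    return string


for index in (1, 26, 27, 52, 702, 703):
    print(index, colnum_string(index))
```

+<p>After Z (26) the letters roll over to AA (27), and after ZZ (702) to AAA (703), the way spreadsheet column labels do.</p>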
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">colnum_string</span><span class="p">(</span><span class="n">num</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-2'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-2'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">string</span> <span class="o">=</span> <span class="s2">""</span>
+ <span class="k">while</span> <span class="n">num</span> <span class="o">></span> <span class="mi">0</span><span class="p">:</span>
+ <span class="n">num</span><span class="p">,</span> <span class="n">remainder</span> <span class="o">=</span> <span class="nb">divmod</span><span class="p">(</span><span class="n">num</span> <span class="o">-</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">26</span><span class="p">)</span>
+ <span class="n">string</span> <span class="o">=</span> <span class="nb">chr</span><span class="p">(</span><span class="mi">65</span> <span class="o">+</span> <span class="n">remainder</span><span class="p">)</span> <span class="o">+</span> <span class="n">string</span>
+ <span class="k">return</span> <span class="n">string</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-3'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-3'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-4'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-4'>#</a>
+ </div>
+ <p>The goal of this function is to get the JSON schema of the sheet you pass in. Our return values here
+are <code>sheet_json_schema</code> and <code>columns</code>, an <code>OrderedDict</code> and a list respectively.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-5'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-5'>#</a>
+ </div>
+ <p>This function is massive and we will discuss it in the following parts:</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-6'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-6'>#</a>
+ </div>
+ <ul>
+<li>Part 1</li>
+<li>Part 2<ul>
+<li>Part 2A</li>
+<li>Part 2B<ul>
+<li>Part 3</li>
+<li>Part 4</li>
+</ul>
+</li>
+</ul>
+</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-7'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-7'>#</a>
+ </div>
+ <p>Part 1 is just setting up constants and variables. We can skim through this part.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-8'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-8'>#</a>
+ </div>
+ <p>Part 2 is split into two parts because it’s a loop over the columns and there are two ways to handle a
+column.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-9'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-9'>#</a>
+ </div>
+ <p>We’ll consider 2A to be the “skip this column” case.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-10'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-10'>#</a>
+ </div>
+ <p>We’ll consider 2B to be the “not skipped” case, in which we determine a field’s type (Part 3) and then
+use the type to decide the JSON Schema (Part 4).</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-11'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-11'>#</a>
+ </div>
+ <hr />
+<p>Create sheet_metadata_json with columns from sheet</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_sheet_schema_columns</span><span class="p">(</span><span class="n">sheet</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-12'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-12'>#</a>
+ </div>
+ <p>The input to this function is shaped like</p>
+<pre><code class="language-JSON">{
+ "data" : [
+ {
+ "rowData": [
+ {"values": <thing 1>},
+ {"values": <thing 2>}
+ ]
+ }
+ ]
+}
+</code></pre>
+<p>Return Values</p>
+<ul>
+<li>
+<p>columns</p>
+<ul>
+<li>A <code>column</code> that goes into <code>columns</code> is a dictionary with keys <code>"columnIndex"</code>,
+<code>"columnLetter"</code>, <code>"columnName"</code>, <code>"columnType"</code>, and <code>"columnSkipped"</code>.</li>
+</ul>
+</li>
+<li>
+<p>sheet_json_schema</p>
+<ul>
+<li>A <code>col_properties</code> that goes into <code>sheet_json_schema['properties'][column_name]</code> is the JSON
+schema of <code>column_name</code>.</li>
+</ul>
+</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_title</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'title'</span><span class="p">)</span>
+ <span class="n">sheet_json_schema</span> <span class="o">=</span> <span class="n">OrderedDict</span><span class="p">()</span>
+ <span class="n">data</span> <span class="o">=</span> <span class="nb">next</span><span class="p">(</span><span class="nb">iter</span><span class="p">(</span><span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'data'</span><span class="p">,</span> <span class="p">[])),</span> <span class="p">{})</span>
+ <span class="n">row_data</span> <span class="o">=</span> <span class="n">data</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'rowData'</span><span class="p">,</span> <span class="p">[])</span>
+ <span class="k">if</span> <span class="n">row_data</span> <span class="o">==</span> <span class="p">[]:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'SKIPPING Empty Sheet: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">))</span>
+ <span class="k">return</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-13'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-13'>#</a>
+ </div>
+ <p>So this function starts by unpacking the input into two lists, <code>headers</code> and <code>first_values</code>, which are
+“thing 1” and “thing 2” respectively.</p>
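+<p>As a rough sketch of that unpacking, here is the same sequence of lookups run against a made-up <code>sheet</code> dictionary. The cell values are hypothetical and each cell is trimmed down to <code>formattedValue</code> only:</p>

```python
# Hypothetical sheet payload with a header row and one row of data.
sheet = {
    "properties": {"title": "Sheet1"},
    "data": [
        {
            "rowData": [
                {"values": [{"formattedValue": "time1"}, {"formattedValue": "amount"}]},
                {"values": [{"formattedValue": "10:30"}, {"formattedValue": "4.5"}]},
            ]
        }
    ],
}

# Take the first element of "data", or an empty dict if there is none.
data = next(iter(sheet.get("data", [])), {})
row_data = data.get("rowData", [])

headers = row_data[0].get("values", [])       # "thing 1"
first_values = row_data[1].get("values", [])  # "thing 2"

print([cell.get("formattedValue") for cell in headers])
print([cell.get("formattedValue") for cell in first_values])
```

+<p>The <code>next(iter(...), {})</code> idiom is what lets a sheet with no <code>data</code> fall through to an empty <code>row_data</code> list instead of raising.</p>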
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">headers</span> <span class="o">=</span> <span class="n">row_data</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'values'</span><span class="p">,</span> <span class="p">[])</span>
+ <span class="n">first_values</span> <span class="o">=</span> <span class="n">row_data</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'values'</span><span class="p">,</span> <span class="p">[])</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-14'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-14'>#</a>
+ </div>
+ <p>All of the objects in <code>headers</code> and <code>first_values</code> have the following shape:</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-15'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-15'>#</a>
+ </div>
+ <pre><code class="language-JSON">{
+ "userEnteredValue": {"stringValue": "time1"},
+ "effectiveValue": {"stringValue": "time1"},
+ "formattedValue": "time1",
+ "userEnteredFormat": {...},
+ "effectiveFormat": {}
+}
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-16'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-16'>#</a>
+ </div>
+ <p>The base Sheet schema</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_json_schema</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s1">'type'</span><span class="p">:</span> <span class="s1">'object'</span><span class="p">,</span>
+ <span class="s1">'additionalProperties'</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
+ <span class="s1">'properties'</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s1">'__sdc_spreadsheet_id'</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]</span>
+ <span class="p">},</span>
+ <span class="s1">'__sdc_sheet_id'</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'integer'</span><span class="p">]</span>
+ <span class="p">},</span>
+ <span class="s1">'__sdc_row'</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'integer'</span><span class="p">]</span>
+ <span class="p">}</span>
+ <span class="p">}</span>
+ <span class="p">}</span>
+
+ <span class="n">header_list</span> <span class="o">=</span> <span class="p">[]</span> <span class="c1"># used for checking uniqueness</span>
+ <span class="n">columns</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="n">prior_header</span> <span class="o">=</span> <span class="kc">None</span>
+ <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span>
+ <span class="n">skipped</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-17'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-17'>#</a>
+ </div>
+ <p>We loop over the columns in the <code>headers</code> list and accumulate one entry per column in each return
+value.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">for</span> <span class="n">header</span> <span class="ow">in</span> <span class="n">headers</span><span class="p">:</span>
+ <span class="n">column_index</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span>
+ <span class="n">column_letter</span> <span class="o">=</span> <span class="n">colnum_string</span><span class="p">(</span><span class="n">column_index</span><span class="p">)</span>
+ <span class="n">header_value</span> <span class="o">=</span> <span class="n">header</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'formattedValue'</span><span class="p">)</span>
+ <span class="k">if</span> <span class="n">header_value</span><span class="p">:</span> <span class="c1"># NOT skipped</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-18'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-18'>#</a>
+ </div>
+ <p>Assuming the column we are looking at does not get skipped, we have to figure out the
+schema.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">column_is_skipped</span> <span class="o">=</span> <span class="kc">False</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-19'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-19'>#</a>
+ </div>
+ <p>First we reset the counter for consecutive skipped columns.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">skipped</span> <span class="o">=</span> <span class="mi">0</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-20'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-20'>#</a>
+ </div>
+ <p>Then we let the name of this column be the value of <code>formattedValue</code> from the <code>header</code>
+object we are looking at. This seems to be the value rendered in Google Sheets in the
+cell.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">column_name</span> <span class="o">=</span> <span class="s1">'</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">header_value</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-21'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-21'>#</a>
+ </div>
+ <p>We assert that this column name is unique or else we raise a “Duplicate Header Error”.</p>
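+<p>A minimal, standalone sketch of this uniqueness check, assuming the header values arrive as a plain
+list of strings. The helper name <code>check_unique_headers</code> and the default sheet title are
+hypothetical; the tap performs the check inline rather than in a function.</p>

```python
def check_unique_headers(header_values, sheet_title="Sheet1"):
    """Return the headers in order, raising on the first duplicate.

    Hypothetical helper for illustration only; not part of the tap.
    """
    header_list = []
    for column_name in header_values:
        if column_name in header_list:
            # Same idea as the tap's "DUPLICATE HEADER ERROR" exception.
            raise ValueError(
                "DUPLICATE HEADER ERROR: SHEET: {}, COL: {}".format(
                    sheet_title, column_name))
        header_list.append(column_name)
    return header_list
```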
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="n">column_name</span> <span class="ow">in</span> <span class="n">header_list</span><span class="p">:</span>
+ <span class="k">raise</span> <span class="ne">Exception</span><span class="p">(</span><span class="s1">'DUPLICATE HEADER ERROR: SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}</span><span class="s1">1'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">))</span>
+ <span class="n">header_list</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">column_name</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-22'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-22'>#</a>
+ </div>
+ <p>We attempt to grab the value in the second row of the sheet (the first row of data)
+associated with this column. Remember this row we are looking at is stored in
+<code>first_values</code>. Note again that <code>headers</code> and <code>first_values</code> have the same shape.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">first_value</span> <span class="o">=</span> <span class="kc">None</span>
+ <span class="k">try</span><span class="p">:</span>
+ <span class="n">first_value</span> <span class="o">=</span> <span class="n">first_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
+ <span class="k">except</span> <span class="ne">IndexError</span> <span class="k">as</span> <span class="n">err</span><span class="p">:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'NO VALUE IN 2ND ROW FOR HEADER. SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}</span><span class="s1">2. </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">,</span> <span class="n">err</span><span class="p">))</span>
+ <span class="n">first_value</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">first_values</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">first_value</span><span class="p">)</span>
+ <span class="k">pass</span>
+
+ <span class="n">column_effective_value</span> <span class="o">=</span> <span class="n">first_value</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'effectiveValue'</span><span class="p">,</span> <span class="p">{})</span>
+
+ <span class="n">col_val</span> <span class="o">=</span> <span class="kc">None</span>
+ <span class="k">if</span> <span class="n">column_effective_value</span> <span class="o">==</span> <span class="p">{}:</span>
+ <span class="n">column_effective_value_type</span> <span class="o">=</span> <span class="s1">'stringValue'</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: NO VALUE IN 2ND ROW FOR HEADER. SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}</span><span class="s1">2.'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">' Setting column datatype to STRING'</span><span class="p">)</span>
+ <span class="k">else</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-23'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-23'>#</a>
+ </div>
+ <p>The tap calls the value of <code>"effectiveValue"</code> the <code>column_effective_value</code>. This
+dictionary is either empty or holds a single key, which we call <code>key1</code>: one of <code>"numberValue"</code>,
+<code>"stringValue"</code>, or <code>"boolValue"</code>. If the dictionary is empty, we force <code>key1</code> to
+be <code>"stringValue"</code>.</p>
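+<p>The lookup described here can be sketched as a small function. This is an illustration under
+assumptions, not the tap's code: <code>effective_value_type</code> and <code>SUPPORTED_KEYS</code> are
+hypothetical names, and the input is assumed to be one cell dictionary from the second row.</p>

```python
SUPPORTED_KEYS = ("numberValue", "stringValue", "boolValue")  # hypothetical constant


def effective_value_type(cell):
    """Return (type_key, value_as_str) for one second-row cell dictionary."""
    effective_value = cell.get("effectiveValue", {})
    if effective_value == {}:
        # An empty cell gives no type information, so fall back to string.
        return "stringValue", None
    for key, val in effective_value.items():
        if key in SUPPORTED_KEYS:
            return key, str(val)
    # Any other key is left for the caller to handle.
    return None, None
```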
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">for</span> <span class="n">key</span><span class="p">,</span> <span class="n">val</span> <span class="ow">in</span> <span class="n">column_effective_value</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
+ <span class="k">if</span> <span class="n">key</span> <span class="ow">in</span> <span class="p">(</span><span class="s1">'numberValue'</span><span class="p">,</span> <span class="s1">'stringValue'</span><span class="p">,</span> <span class="s1">'boolValue'</span><span class="p">):</span>
+ <span class="n">column_effective_value_type</span> <span class="o">=</span> <span class="n">key</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">val</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-24'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-24'>#</a>
+ </div>
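+ <p>Sometimes <code>key1</code> is instead <code>"errorType"</code> or <code>"formulaType"</code>. In
+these cases we raise a “Data Type Error” immediately. Note that the <code>ExtendedValue</code> reference
+names these fields <code>errorValue</code> and <code>formulaValue</code>, so this branch may not match real
+responses.</p>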
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">key</span> <span class="ow">in</span> <span class="p">(</span><span class="s1">'errorType'</span><span class="p">,</span> <span class="s1">'formulaType'</span><span class="p">):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">val</span><span class="p">)</span>
+ <span class="k">raise</span> <span class="ne">Exception</span><span class="p">(</span><span class="s1">'DATA TYPE ERROR 2ND ROW VALUE: SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}</span><span class="s1">2, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">,</span> <span class="n">key</span><span class="p">,</span> <span class="n">col_val</span><span class="p">))</span>
+
+ <span class="n">column_number_format</span> <span class="o">=</span> <span class="n">first_values</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'effectiveFormat'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span>
+ <span class="s1">'numberFormat'</span><span class="p">,</span> <span class="p">{})</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-25'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-25'>#</a>
+ </div>
+ <p><code>column_number_format_type</code> is one of UNSPECIFIED, TEXT, NUMBER, PERCENT, CURRENCY, DATE,
+TIME, DATE_TIME, or SCIENTIFIC. See
+https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/cells#NumberFormatType</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">column_number_format_type</span> <span class="o">=</span> <span class="n">column_number_format</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'type'</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-26'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-26'>#</a>
+ </div>
+ <p>The giant if-elif-else block: all it does is set two variables, <code>col_properties</code> and
+<code>column_gs_type</code>, based on the values of <code>column_effective_value_type</code> and
+<code>column_number_format_type</code>.</p>
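+<p>One way to read the block is as a pure mapping from the two type variables to a schema fragment. The
+sketch below mirrors the branches shown in the code on this page; the function name
+<code>json_schema_for</code> is hypothetical, and the blank-cell and skipped-column cases are left out.</p>

```python
def json_schema_for(value_type, number_format_type=None):
    """Return (col_properties, column_gs_type) for one column.

    Hypothetical helper; the tap sets these two variables inline.
    """
    if value_type == "stringValue":
        return {"type": ["null", "string"]}, "stringValue"
    if value_type == "boolValue":
        return {"type": ["null", "boolean", "string"]}, "boolValue"
    if value_type == "numberValue":
        # Dates and times are numbers in Sheets; the number format tells them apart.
        date_like = {"DATE_TIME": "date-time", "DATE": "date", "TIME": "time"}
        if number_format_type in date_like:
            props = {"type": ["null", "string"],
                     "format": date_like[number_format_type]}
            return props, "numberType.{}".format(number_format_type)
        if number_format_type == "TEXT":
            return {"type": ["null", "string"]}, "stringValue"
        return {"type": "number", "multipleOf": 1e-15}, "numberType"
    # Anything else is converted to a string.
    return {"type": ["null", "string"]}, "unsupportedValue"
```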
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">column_format</span> <span class="o">=</span> <span class="kc">None</span>
+ <span class="k">if</span> <span class="n">column_effective_value</span> <span class="o">==</span> <span class="p">{}:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'stringValue'</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: 2ND ROW VALUE IS BLANK: SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}</span><span class="s1">2'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">' Setting column datatype to STRING'</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-27'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-27'>#</a>
+ </div>
+ <p><code>column_effective_value_type</code> is one of numberValue, stringValue, or boolValue.
+The invalid types are errorType and formulaType. See
+https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/other#ExtendedValue</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">column_effective_value_type</span> <span class="o">==</span> <span class="s1">'stringValue'</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'stringValue'</span>
+ <span class="k">elif</span> <span class="n">column_effective_value_type</span> <span class="o">==</span> <span class="s1">'boolValue'</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'boolean'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'boolValue'</span>
+ <span class="k">elif</span> <span class="n">column_effective_value_type</span> <span class="o">==</span> <span class="s1">'numberValue'</span><span class="p">:</span>
+ <span class="k">if</span> <span class="n">column_number_format_type</span> <span class="o">==</span> <span class="s1">'DATE_TIME'</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">],</span>
+ <span class="s1">'format'</span><span class="p">:</span> <span class="s1">'date-time'</span>
+ <span class="p">}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'numberType.DATE_TIME'</span>
+ <span class="k">elif</span> <span class="n">column_number_format_type</span> <span class="o">==</span> <span class="s1">'DATE'</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">],</span>
+ <span class="s1">'format'</span><span class="p">:</span> <span class="s1">'date'</span>
+ <span class="p">}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'numberType.DATE'</span>
+ <span class="k">elif</span> <span class="n">column_number_format_type</span> <span class="o">==</span> <span class="s1">'TIME'</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">],</span>
+ <span class="s1">'format'</span><span class="p">:</span> <span class="s1">'time'</span>
+ <span class="p">}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'numberType.TIME'</span>
+ <span class="k">elif</span> <span class="n">column_number_format_type</span> <span class="o">==</span> <span class="s1">'TEXT'</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'stringValue'</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="s1">'number'</span><span class="p">,</span> <span class="s1">'multipleOf'</span><span class="p">:</span> <span class="mf">1e-15</span><span class="p">}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'numberType'</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'unsupportedValue'</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: UNSUPPORTED 2ND ROW VALUE: SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}</span><span class="s1">2, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">,</span> <span class="n">column_effective_value_type</span><span class="p">,</span> <span class="n">col_val</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Converting to string.'</span><span class="p">)</span>
+ <span class="k">else</span><span class="p">:</span> <span class="c1"># skipped</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-28'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-28'>#</a>
+ </div>
+ <p>We note that we are skipping this column. It still gets added to the schema, though, as
+a string field. The only other notable thing about a skipped column is that we create
+the field name for it: <code>"__sdc_skip_col_XY"</code>, where <code>XY</code> is the column index
+zero-padded to two digits, such as <code>"01"</code>. If two consecutive columns are skipped, the scan
+stops and the previous skipped column is removed from the schema properties.</p>
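+<p>A small sketch of how the placeholder name is built, using the same <code>str.zfill</code> call as the
+code on this page. The function name <code>skipped_column_name</code> is hypothetical. Note that
+<code>zfill(2)</code> only pads; it does not truncate indexes of three or more digits.</p>

```python
def skipped_column_name(column_index):
    """Build the placeholder field name for a column with no header."""
    # zfill(2) left-pads with zeros to at least two characters.
    column_index_str = str(column_index).zfill(2)
    return "__sdc_skip_col_{}".format(column_index_str)
```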
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">column_is_skipped</span> <span class="o">=</span> <span class="kc">True</span>
+ <span class="n">skipped</span> <span class="o">=</span> <span class="n">skipped</span> <span class="o">+</span> <span class="mi">1</span>
+ <span class="n">column_index_str</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">column_index</span><span class="p">)</span><span class="o">.</span><span class="n">zfill</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span>
+ <span class="n">column_name</span> <span class="o">=</span> <span class="s1">'__sdc_skip_col_</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">column_index_str</span><span class="p">)</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]}</span>
+ <span class="n">column_gs_type</span> <span class="o">=</span> <span class="s1">'stringValue'</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: SKIPPED COLUMN; NO COLUMN HEADER. SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}</span><span class="s1">1'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">' This column will be skipped during data loading.'</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="n">skipped</span> <span class="o">>=</span> <span class="mi">2</span><span class="p">:</span>
+ <span class="n">sheet_json_schema</span><span class="p">[</span><span class="s1">'properties'</span><span class="p">]</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="n">prior_header</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'TWO CONSECUTIVE SKIPPED COLUMNS. STOPPING SCAN AT: SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL </span><span class="si">{}</span><span class="s1">1'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">column_name</span><span class="p">,</span> <span class="n">column_letter</span><span class="p">))</span>
+ <span class="k">break</span>
+
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">column</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">column</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s1">'columnIndex'</span><span class="p">:</span> <span class="n">column_index</span><span class="p">,</span>
+ <span class="s1">'columnLetter'</span><span class="p">:</span> <span class="n">column_letter</span><span class="p">,</span>
+ <span class="s1">'columnName'</span><span class="p">:</span> <span class="n">column_name</span><span class="p">,</span>
+ <span class="s1">'columnType'</span><span class="p">:</span> <span class="n">column_gs_type</span><span class="p">,</span>
+ <span class="s1">'columnSkipped'</span><span class="p">:</span> <span class="n">column_is_skipped</span>
+ <span class="p">}</span>
+ <span class="n">columns</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">column</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="n">column_gs_type</span> <span class="ow">in</span> <span class="p">{</span><span class="s1">'numberType.DATE_TIME'</span><span class="p">,</span> <span class="s1">'numberType.DATE'</span><span class="p">,</span> <span class="s1">'numberType.TIME'</span><span class="p">,</span> <span class="s1">'numberType'</span><span class="p">}:</span>
+ <span class="n">col_properties</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s1">'anyOf'</span><span class="p">:</span> <span class="p">[</span>
+ <span class="n">col_properties</span><span class="p">,</span>
+ <span class="p">{</span><span class="s1">'type'</span><span class="p">:</span> <span class="p">[</span><span class="s1">'null'</span><span class="p">,</span> <span class="s1">'string'</span><span class="p">]}</span>
+ <span class="p">]</span>
+ <span class="p">}</span>
+
+ <span class="n">sheet_json_schema</span><span class="p">[</span><span class="s1">'properties'</span><span class="p">][</span><span class="n">column_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">col_properties</span>
+
+ <span class="n">prior_header</span> <span class="o">=</span> <span class="n">column_name</span>
+ <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span>
+
+ <span class="k">return</span> <span class="n">sheet_json_schema</span><span class="p">,</span> <span class="n">columns</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-29'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-29'>#</a>
+ </div>
+ <p>This function does two things: (1) it requests the first two rows of a sheet, and (2) it returns the
+schema and columns generated for that sheet by <code>schema.py:get_sheet_schema_columns</code>.</p>
+<p><code>get_sheet_metadata()</code> sets up several variables and ultimately makes a request to</p>
+<pre><code class="language-Text">https://sheets.googleapis.com/v4/spreadsheets/my-spreadsheet-id?includeGridData=true&ranges='my-sheet-title'!1:2
+</code></pre>
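+<p>A hedged sketch of how such a URL could be assembled. It follows the same steps as the function body
+(URL-encode the title, join the params, substitute the placeholder), but <code>build_sheet_metadata_url</code>,
+<code>BASE_URL</code>, and the literal <code>params</code> dictionary are assumptions made for illustration;
+the tap reads its params from <code>STREAMS</code> and sends the request through its client.</p>

```python
import urllib.parse

BASE_URL = "https://sheets.googleapis.com/v4"  # assumed from the example URL


def build_sheet_metadata_url(spreadsheet_id, sheet_title):
    """Build the request URL for rows 1 and 2 of one sheet (illustrative)."""
    # quote_plus percent-encodes reserved characters and turns spaces into '+'.
    sheet_title_encoded = urllib.parse.quote_plus(sheet_title)
    # Assumed shape of the stream params, with a placeholder for the title.
    params = {"includeGridData": "true", "ranges": "'{sheet_title}'!1:2"}
    querystring = "&".join(
        "{}={}".format(key, value) for key, value in params.items()
    ).replace("{sheet_title}", sheet_title_encoded)
    return "{}/spreadsheets/{}?{}".format(BASE_URL, spreadsheet_id, querystring)
```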
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-30'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-30'>#</a>
+ </div>
+ <p>Let’s dissect the query params here a bit.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-31'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-31'>#</a>
+ </div>
+ <p><code>includeGridData</code> is false by default; setting it to true makes the response include “Grid data”,
+that is, the cell contents. With <code>includeGridData=false</code>, the same request returns a relatively small
+response that contains metadata but no cell data.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-32'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-32'>#</a>
+ </div>
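+ <p><code>ranges</code> restricts the rows returned. It is written in A1 notation, so
+<code>'my-sheet-title'!1:2</code> asks for rows 1 and 2 of that sheet.</p>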
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_sheet_metadata</span><span class="p">(</span><span class="n">sheet</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">client</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-33'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-33'>#</a>
+ </div>
+ <p>Get the header row and the first data row (rows 1 and 2) from a sheet in the spreadsheet, using the
+<code>sheet_metadata</code> query.</p>
+<ul>
+<li>endpoint: spreadsheets/{spreadsheet_id}</li>
+<li>params: includeGridData = true, ranges = ‘{sheet_title}’!1:2</li>
+</ul>
+<p>This endpoint includes detailed metadata about each cell, including data type and formatting.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_id</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'sheetId'</span><span class="p">)</span>
+ <span class="n">sheet_title</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'title'</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'sheet_id = </span><span class="si">{}</span><span class="s1">, sheet_title = </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_id</span><span class="p">,</span> <span class="n">sheet_title</span><span class="p">))</span>
+
+ <span class="n">stream_name</span> <span class="o">=</span> <span class="s1">'sheet_metadata'</span>
+ <span class="n">stream_metadata</span> <span class="o">=</span> <span class="n">STREAMS</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span>
+ <span class="n">api</span> <span class="o">=</span> <span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'api'</span><span class="p">,</span> <span class="s1">'sheets'</span><span class="p">)</span>
+ <span class="n">params</span> <span class="o">=</span> <span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'params'</span><span class="p">,</span> <span class="p">{})</span>
+ <span class="n">sheet_title_encoded</span> <span class="o">=</span> <span class="n">urllib</span><span class="o">.</span><span class="n">parse</span><span class="o">.</span><span class="n">quote_plus</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">)</span>
+ <span class="n">sheet_title_escaped</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">escape</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">)</span>
+ <span class="n">querystring</span> <span class="o">=</span> <span class="s1">'&'</span><span class="o">.</span><span class="n">join</span><span class="p">(</span>
+ <span class="p">[</span><span class="s1">'</span><span class="si">%s</span><span class="s1">=</span><span class="si">%s</span><span class="s1">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span> <span class="k">for</span> <span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span> <span class="ow">in</span> <span class="n">params</span><span class="o">.</span><span class="n">items</span><span class="p">()]</span>
+ <span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s1">'</span><span class="si">{sheet_title}</span><span class="s1">'</span><span class="p">,</span> <span class="n">sheet_title_encoded</span><span class="p">)</span>
+ <span class="n">path</span> <span class="o">=</span> <span class="s1">'</span><span class="si">{}</span><span class="s1">?</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'path'</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s1">'</span><span class="si">{spreadsheet_id}</span><span class="s1">'</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">),</span>
+ <span class="n">querystring</span>
+ <span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-34'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-34'>#</a>
+ </div>
+ <p>See the Footnotes for this response shape.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_md_results</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">path</span><span class="o">=</span><span class="n">path</span><span class="p">,</span> <span class="n">api</span><span class="o">=</span><span class="n">api</span><span class="p">,</span> <span class="n">endpoint</span><span class="o">=</span><span class="n">sheet_title_escaped</span><span class="p">)</span>
+ <span class="n">sheet_metadata</span> <span class="o">=</span> <span class="n">sheet_md_results</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'sheets'</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
+
+
+ <span class="k">try</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-35'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-35'>#</a>
+ </div>
+ <p>Create sheet_json_schema (for discovery/catalog) and columns (for sheet_metadata results)</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_json_schema</span><span class="p">,</span> <span class="n">columns</span> <span class="o">=</span> <span class="n">get_sheet_schema_columns</span><span class="p">(</span><span class="n">sheet_metadata</span><span class="p">)</span>
+ <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">err</span><span class="p">:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s1">'</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">err</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s1">'SKIPPING Malformed sheet: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">))</span>
+ <span class="n">sheet_json_schema</span><span class="p">,</span> <span class="n">columns</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="kc">None</span>
+
+ <span class="k">return</span> <span class="n">sheet_json_schema</span><span class="p">,</span> <span class="n">columns</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-36'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-36'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_abs_path</span><span class="p">(</span><span class="n">path</span><span class="p">):</span>
+ <span class="k">return</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">dirname</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">realpath</span><span class="p">(</span><span class="vm">__file__</span><span class="p">)),</span> <span class="n">path</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-37'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-37'>#</a>
+ </div>
+ <p>We initialize our return variables, <code>schemas</code> and <code>field_metadata</code>, to empty dictionaries.</p>
+<p>We loop over each stream in <code>streams.py:STREAMS</code>. We load the static JSON file into memory - all
+four streams currently have some static schema. We store this on our return variable <code>schemas</code>
+under the stream name.</p>
+<p>We then call <code>singer.metadata.get_standard_metadata()</code> passing in whatever metadata we do have
+(key properties, valid replication keys, the replication method). The return value here is
+stored on our return variable <code>field_metadata</code> under the stream name.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_schemas</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-38'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-38'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">schemas</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">field_metadata</span> <span class="o">=</span> <span class="p">{}</span>
+
+ <span class="k">for</span> <span class="n">stream_name</span><span class="p">,</span> <span class="n">stream_metadata</span> <span class="ow">in</span> <span class="n">STREAMS</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
+ <span class="n">schema_path</span> <span class="o">=</span> <span class="n">get_abs_path</span><span class="p">(</span><span class="s1">'schemas/</span><span class="si">{}</span><span class="s1">.json'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream_name</span><span class="p">))</span>
+ <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">schema_path</span><span class="p">)</span> <span class="k">as</span> <span class="n">file</span><span class="p">:</span>
+ <span class="n">schema</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">file</span><span class="p">)</span>
+ <span class="n">schemas</span><span class="p">[</span><span class="n">stream_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">schema</span>
+ <span class="n">mdata</span> <span class="o">=</span> <span class="n">metadata</span><span class="o">.</span><span class="n">new</span><span class="p">()</span>
+
+ <span class="n">mdata</span> <span class="o">=</span> <span class="n">metadata</span><span class="o">.</span><span class="n">get_standard_metadata</span><span class="p">(</span>
+ <span class="n">schema</span><span class="o">=</span><span class="n">schema</span><span class="p">,</span>
+ <span class="n">key_properties</span><span class="o">=</span><span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'key_properties'</span><span class="p">,</span> <span class="kc">None</span><span class="p">),</span>
+ <span class="n">valid_replication_keys</span><span class="o">=</span><span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'replication_keys'</span><span class="p">,</span> <span class="kc">None</span><span class="p">),</span>
+ <span class="n">replication_method</span><span class="o">=</span><span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'replication_method'</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
+ <span class="p">)</span>
+ <span class="n">field_metadata</span><span class="p">[</span><span class="n">stream_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">mdata</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-39'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-39'>#</a>
+ </div>
+ <p>If we are handling the <code>"spreadsheet_metadata"</code> stream, we do some extra work to build the
+dynamic schemas of each Sheet we want to sync. Otherwise, that’s it.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="n">stream_name</span> <span class="o">==</span> <span class="s1">'spreadsheet_metadata'</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-40'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-40'>#</a>
+ </div>
+ <p>We ultimately end up making a <code>GET</code> to</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-41'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-41'>#</a>
+ </div>
+ <pre><code class="language-Text">https://sheets.googleapis.com/v4/spreadsheets/my-spreadsheet-id?includeGridData=false
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-42'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-42'>#</a>
+ </div>
+ <p>Notice this is <code>base_url + path + query_string</code>. There’s code here to figure out and
+properly format <code>path</code> and <code>query_string</code>. I’m not sure why we don’t let <code>requests</code>
+handle this.</p>
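+<p>For comparison, here is a minimal sketch of building the same <code>path</code> with the standard library’s <code>urllib.parse.urlencode</code> instead of joining <code>%s=%s</code> pairs by hand. <code>build_path</code> is a hypothetical helper, not part of the tap; it assumes the path template uses the <code>{spreadsheet_id}</code> placeholder seen in this module.</p>
+<pre><code class="language-Python">from urllib.parse import urlencode
+
+
+def build_path(path_template, spreadsheet_id, params):
+    """Hypothetical helper: fill in the spreadsheet id and append an encoded query string."""
+    path = path_template.replace('{spreadsheet_id}', spreadsheet_id)
+    # urlencode percent-encodes keys and values, which the manual join does not do.
+    querystring = urlencode(params)
+    return '{}?{}'.format(path, querystring) if querystring else path
+
+
+print(build_path('spreadsheets/{spreadsheet_id}', 'my-spreadsheet-id', {'includeGridData': 'false'}))
+</code></pre>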
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-43'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-43'>#</a>
+ </div>
+ <p>We assume this request is successful and we store the <code>OrderedDict</code> return value as
+<code>spreadsheet_md_results</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">api</span> <span class="o">=</span> <span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'api'</span><span class="p">,</span> <span class="s1">'sheets'</span><span class="p">)</span>
+ <span class="n">params</span> <span class="o">=</span> <span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'params'</span><span class="p">,</span> <span class="p">{})</span>
+ <span class="n">querystring</span> <span class="o">=</span> <span class="s1">'&'</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="s1">'</span><span class="si">%s</span><span class="s1">=</span><span class="si">%s</span><span class="s1">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span> <span class="k">for</span> <span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span> <span class="ow">in</span> <span class="n">params</span><span class="o">.</span><span class="n">items</span><span class="p">()])</span>
+ <span class="n">path</span> <span class="o">=</span> <span class="s1">'</span><span class="si">{}</span><span class="s1">?</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">stream_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'path'</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s1">'</span><span class="si">{spreadsheet_id}</span><span class="s1">'</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">),</span>
+ <span class="n">querystring</span>
+ <span class="p">)</span>
+
+ <span class="n">spreadsheet_md_results</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">get</span><span class="p">(</span>
+ <span class="n">path</span><span class="o">=</span><span class="n">path</span><span class="p">,</span>
+ <span class="n">params</span><span class="o">=</span><span class="n">querystring</span><span class="p">,</span>
+ <span class="n">api</span><span class="o">=</span><span class="n">api</span><span class="p">,</span>
+ <span class="n">endpoint</span><span class="o">=</span><span class="n">stream_name</span>
+ <span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-44'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-44'>#</a>
+ </div>
+ <p>The response here is one of those “envelope” kinds. The data we care about is under
+the <code>"sheets"</code> key.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheets</span> <span class="o">=</span> <span class="n">spreadsheet_md_results</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'sheets'</span><span class="p">)</span>
+ <span class="k">if</span> <span class="n">sheets</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-45'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-45'>#</a>
+ </div>
+ <p>Looping over this array, we call <code>schema.py:get_sheet_metadata</code>. This gets the
+JSON schema of each sheet found in this Google Doc. We use the sheet’s title as
+the stream name here.</p>
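+<p>To make the envelope shape concrete, here is a small sketch using plain dictionaries. <code>sheet_titles</code> and the sample response are made up for illustration; only the <code>"sheets"</code>, <code>"properties"</code>, and <code>"title"</code> keys come from the code in this module.</p>
+<pre><code class="language-Python">def sheet_titles(spreadsheet_md_results):
+    """Hypothetical helper: list the title of every sheet in an envelope-style response."""
+    # A missing or empty "sheets" key yields no dynamic streams.
+    sheets = spreadsheet_md_results.get('sheets') or []
+    return [sheet.get('properties', {}).get('title') for sheet in sheets]
+
+
+# Made-up sample response; each title would become a stream name.
+sample = {
+    'spreadsheetId': 'my-id',
+    'sheets': [
+        {'properties': {'title': 'Sheet1'}},
+        {'properties': {'title': 'Budget'}},
+    ],
+}
+print(sheet_titles(sample))
+</code></pre>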
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">for</span> <span class="n">sheet</span> <span class="ow">in</span> <span class="n">sheets</span><span class="p">:</span>
+ <span class="n">sheet_json_schema</span><span class="p">,</span> <span class="n">columns</span> <span class="o">=</span> <span class="n">get_sheet_metadata</span><span class="p">(</span><span class="n">sheet</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">client</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="n">sheet_json_schema</span> <span class="ow">and</span> <span class="n">columns</span><span class="p">:</span>
+ <span class="n">sheet_title</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'title'</span><span class="p">)</span>
+ <span class="n">schemas</span><span class="p">[</span><span class="n">sheet_title</span><span class="p">]</span> <span class="o">=</span> <span class="n">sheet_json_schema</span>
+ <span class="n">sheet_mdata</span> <span class="o">=</span> <span class="n">metadata</span><span class="o">.</span><span class="n">new</span><span class="p">()</span>
+ <span class="n">sheet_mdata</span> <span class="o">=</span> <span class="n">metadata</span><span class="o">.</span><span class="n">get_standard_metadata</span><span class="p">(</span>
+ <span class="n">schema</span><span class="o">=</span><span class="n">sheet_json_schema</span><span class="p">,</span>
+ <span class="n">key_properties</span><span class="o">=</span><span class="p">[</span><span class="s1">'__sdc_row'</span><span class="p">],</span>
+ <span class="n">valid_replication_keys</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
+ <span class="n">replication_method</span><span class="o">=</span><span class="s1">'FULL_TABLE'</span>
+ <span class="p">)</span>
+ <span class="n">field_metadata</span><span class="p">[</span><span class="n">sheet_title</span><span class="p">]</span> <span class="o">=</span> <span class="n">sheet_mdata</span>
+
+ <span class="k">return</span> <span class="n">schemas</span><span class="p">,</span> <span class="n">field_metadata</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-46'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-46'>#</a>
+ </div>
+ <h1>Footnotes</h1>
+<p>The response has the shape shown below; note that the tap stores it in a recursive <code>OrderedDict</code> structure.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-47'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-47'>#</a>
+ </div>
+ <pre><code class="language-JSON">{
+ "spreadsheetId": "my-id",
+ "properties": {...},
+ "sheets": [
+ {
+ "properties": {},
+ "data": [
+ {
+ "rowData": [
+ {
+ "values": [
+ {
+ "userEnteredValue": {"stringValue": "time1"},
+ "effectiveValue": {"stringValue": "time1"},
+ "formattedValue": "time1",
+ "userEnteredFormat": {...},
+ "effectiveFormat": {}
+ },
+ ...
+ ],
+ },
+ ...
+ ]
+ }
+ ]
+ },
+ ]
+}
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+</div>
+</body>
--- /dev/null
+<!DOCTYPE html>
+<html>
+<head>
+ <meta http-equiv="content-type" content="text/html;charset=utf-8">
+ <title>streams.py</title>
+ <link rel="stylesheet" href="pycco.css">
+</head>
+<body>
+<div id='container'>
+ <div id="background"></div>
+ <div class='section'>
+ <div class='docs'><h1>streams.py</h1></div>
+ </div>
+ <div class='clearall'>
+ <div class='section' id='section-0'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-0'>#</a>
+ </div>
+ <p><code>streams.py:STREAMS</code> is an <code>OrderedDict</code>, only because we want to loop over it in the same order
+every time.</p>
+<p>It’s still the same global variable found in taps of this style. It maps stream names to a
+dictionary describing the stream.</p>
+<p>Some notable things we learn in this file:</p>
+<ul>
+<li>
+<p><code>api</code> is either <code>"files"</code> or <code>"sheets"</code></p>
+<ul>
+<li>We saw this used in <code>client.py:GoogleClient.request()</code> to switch the base url of the request</li>
+</ul>
+</li>
+<li>
+<p><code>"file_metadata"</code> is the only incremental stream</p>
+</li>
+<li>
+<p>Full table streams include:</p>
+<ul>
+<li><code>"spreadsheet_metadata"</code></li>
+<li><code>"sheet_metadata"</code></li>
+<li><code>"sheets_loaded"</code></li>
+</ul>
+</li>
+<li>
+<p><code>"sheets_loaded"</code> is the only stream with a <code>"data_key"</code></p>
+<ul>
+<li>We typically see <code>data_key</code> be the name of the key to get data out of “envelope” responses</li>
+</ul>
+</li>
+</ul>
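+<p>As a rough illustration of how a <code>data_key</code> is typically used, the sketch below unwraps an envelope response. <code>extract_records</code> and the sample payloads are hypothetical and are not taken from the tap.</p>
+<pre><code class="language-Python">def extract_records(response, data_key=None):
+    """Hypothetical helper: pull the payload out of a response, honoring an optional data_key."""
+    if data_key is None:
+        # No data_key: the root of the response is the data.
+        return response
+    # With a data_key, the data lives under that key of the envelope.
+    return response.get(data_key, [])
+
+
+# Made-up payloads: one with a "values" envelope, one without any envelope.
+print(extract_records({'range': 'Sheet1!2:3', 'values': [['a', 1], ['b', 2]]}, data_key='values'))
+print(extract_records({'id': 'my-id', 'name': 'My Doc'}))
+</code></pre>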
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">collections</span> <span class="kn">import</span> <span class="n">OrderedDict</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-1'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-1'>#</a>
+ </div>
+ <p>streams: API URL endpoints to be called</p>
+<p>properties:</p>
+<ul>
+<li><code>&lt;root node&gt;</code>: Plural stream name for the endpoint</li>
+<li><code>path</code>: API endpoint relative path; when added to the base URL, creates the full path; default = stream_name</li>
+<li><code>key_properties</code>: Primary key fields for identifying an endpoint record</li>
+<li><code>replication_method</code>: INCREMENTAL or FULL_TABLE</li>
+<li><code>replication_keys</code>: bookmark_field(s), typically a date-time, used for filtering the results and setting the state</li>
+<li><code>params</code>: Query, sort, and other endpoint-specific parameters; default = {}</li>
+<li><code>data_key</code>: JSON element containing the results list for the endpoint; default = root (no data_key)</li>
+</ul>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-2'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-2'>#</a>
+ </div>
+ <p>file_metadata: Queries the Google Drive API to get file information and see if the file has been modified.
+ Provides audit info about who last changed the file and when.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="n">FILE_METADATA</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"files"</span><span class="p">,</span>
+ <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"files/</span><span class="si">{spreadsheet_id}</span><span class="s2">"</span><span class="p">,</span>
+ <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"id"</span><span class="p">],</span>
+ <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"INCREMENTAL"</span><span class="p">,</span>
+ <span class="s2">"replication_keys"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"modifiedTime"</span><span class="p">],</span>
+ <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s2">"fields"</span><span class="p">:</span> <span class="s2">"id,name,createdTime,modifiedTime,version,teamDriveId,driveId,lastModifyingUser"</span>
+ <span class="p">}</span>
+<span class="p">}</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-3'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-3'>#</a>
+ </div>
+ <p>spreadsheet_metadata: Queries the spreadsheet to get basic information on the spreadsheet and its sheets.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="n">SPREADSHEET_METADATA</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"sheets"</span><span class="p">,</span>
+ <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"spreadsheets/</span><span class="si">{spreadsheet_id}</span><span class="s2">"</span><span class="p">,</span>
+ <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"spreadsheetId"</span><span class="p">],</span>
+ <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"FULL_TABLE"</span><span class="p">,</span>
+ <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s2">"includeGridData"</span><span class="p">:</span> <span class="s2">"false"</span>
+ <span class="p">}</span>
+<span class="p">}</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-4'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-4'>#</a>
+ </div>
+ <p>sheet_metadata: Gets the header row and first data row (rows 1 and 2) from a Sheet in the Spreadsheet.
+This endpoint includes detailed metadata about each cell in the header and first data row,
+ including data type, formatting, etc.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="n">SHEET_METADATA</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"sheets"</span><span class="p">,</span>
+ <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"spreadsheets/</span><span class="si">{spreadsheet_id}</span><span class="s2">"</span><span class="p">,</span>
+ <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"sheetId"</span><span class="p">],</span>
+ <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"FULL_TABLE"</span><span class="p">,</span>
+ <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s2">"includeGridData"</span><span class="p">:</span> <span class="s2">"true"</span><span class="p">,</span>
+ <span class="s2">"ranges"</span><span class="p">:</span> <span class="s2">"'</span><span class="si">{sheet_title}</span><span class="s2">'!1:2"</span>
+ <span class="p">}</span>
+<span class="p">}</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-5'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-5'>#</a>
+ </div>
+ <p>sheets_loaded: Queries a batch of rows for each Sheet in the Spreadsheet.
+Each query uses the <code>values</code> endpoint to get data only, without the formatting/type metadata.</p>
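+<p>A sketch of how the <code>path</code> template below might be filled in for one batch. <code>batch_path</code> is a hypothetical helper; percent-encoding the title and the row range <code>2:201</code> are assumptions made for illustration, not something this file specifies.</p>
+<pre><code class="language-Python">from urllib.parse import quote
+
+PATH_TEMPLATE = "spreadsheets/{spreadsheet_id}/values/'{sheet_title}'!{range_rows}"
+
+
+def batch_path(spreadsheet_id, sheet_title, range_rows):
+    """Hypothetical helper: fill the sheets_loaded path template for one batch of rows."""
+    return PATH_TEMPLATE.format(
+        spreadsheet_id=spreadsheet_id,
+        # Percent-encode the title so spaces and punctuation are URL-safe (an assumption).
+        sheet_title=quote(sheet_title, safe=''),
+        range_rows=range_rows,
+    )
+
+
+print(batch_path('my-spreadsheet-id', 'My Sheet', '2:201'))
+</code></pre>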
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="n">SHEETS_LOADED</span> <span class="o">=</span> <span class="p">{</span>
+ <span class="s2">"api"</span><span class="p">:</span> <span class="s2">"sheets"</span><span class="p">,</span>
+ <span class="s2">"path"</span><span class="p">:</span> <span class="s2">"spreadsheets/</span><span class="si">{spreadsheet_id}</span><span class="s2">/values/'</span><span class="si">{sheet_title}</span><span class="s2">'!</span><span class="si">{range_rows}</span><span class="s2">"</span><span class="p">,</span>
+ <span class="s2">"data_key"</span><span class="p">:</span> <span class="s2">"values"</span><span class="p">,</span>
+ <span class="s2">"key_properties"</span><span class="p">:</span> <span class="p">[</span><span class="s2">"spreadsheetId"</span><span class="p">,</span> <span class="s2">"sheetId"</span><span class="p">,</span> <span class="s2">"loadDate"</span><span class="p">],</span>
+ <span class="s2">"replication_method"</span><span class="p">:</span> <span class="s2">"FULL_TABLE"</span><span class="p">,</span>
+ <span class="s2">"params"</span><span class="p">:</span> <span class="p">{</span>
+ <span class="s2">"dateTimeRenderOption"</span><span class="p">:</span> <span class="s2">"SERIAL_NUMBER"</span><span class="p">,</span>
+ <span class="s2">"valueRenderOption"</span><span class="p">:</span> <span class="s2">"UNFORMATTED_VALUE"</span><span class="p">,</span>
+ <span class="s2">"majorDimension"</span><span class="p">:</span> <span class="s2">"ROWS"</span>
+ <span class="p">}</span>
+<span class="p">}</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-6'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-6'>#</a>
+ </div>
+ <p>Ensure streams are ordered sequentially, logically.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="n">STREAMS</span> <span class="o">=</span> <span class="n">OrderedDict</span><span class="p">()</span>
+<span class="n">STREAMS</span><span class="p">[</span><span class="s1">'file_metadata'</span><span class="p">]</span> <span class="o">=</span> <span class="n">FILE_METADATA</span>
+<span class="n">STREAMS</span><span class="p">[</span><span class="s1">'spreadsheet_metadata'</span><span class="p">]</span> <span class="o">=</span> <span class="n">SPREADSHEET_METADATA</span>
+<span class="n">STREAMS</span><span class="p">[</span><span class="s1">'sheet_metadata'</span><span class="p">]</span> <span class="o">=</span> <span class="n">SHEET_METADATA</span>
+<span class="n">STREAMS</span><span class="p">[</span><span class="s1">'sheets_loaded'</span><span class="p">]</span> <span class="o">=</span> <span class="n">SHEETS_LOADED</span>
+
+</pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+</div>
+</body>
--- /dev/null
+<!DOCTYPE html>
+<html>
+<head>
+ <meta http-equiv="content-type" content="text/html;charset=utf-8">
+ <title>sync.py</title>
+ <link rel="stylesheet" href="pycco.css">
+</head>
+<body>
+<div id='container'>
+ <div id="background"></div>
+ <div class='section'>
+ <div class='docs'><h1>sync.py</h1></div>
+ </div>
+ <div class='clearall'>
+ <div class='section' id='section-0'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-0'>#</a>
+ </div>
+ <p>This module contains the logic to sync data from the API.</p>
+<hr />
+<p>Syncable streams: The tap seems to care about syncing the streams in this order.</p>
+<ol>
+<li><code>file_metadata</code></li>
+<li><code>spreadsheet_metadata</code></li>
+<li><em>N</em> Sheets</li>
+<li><code>sheet_metadata</code></li>
+<li><code>sheets_loaded</code></li>
+</ol>
+<hr />
+<p>The flow through this module is:</p>
+<ol>
+<li>Entrypoint: <code>sync()</code></li>
+<li>Sync <code>file_metadata</code><ol>
+<li><code>get_data()</code></li>
+<li><code>transform_file_metadata()</code></li>
+<li>Maybe exit the sync</li>
+<li><code>sync_stream()</code></li>
+</ol>
+</li>
+<li>Sync <code>spreadsheet_metadata</code><ol>
+<li><code>get_data()</code></li>
+<li><code>transform_spreadsheet_metadata()</code></li>
+<li><code>sync_stream()</code></li>
+</ol>
+</li>
+<li>Sync all of the Sheets. Here’s the process for a single Sheet<ol>
+<li><code>get_sheet_metadata()</code></li>
+<li><code>transform_sheet_metadata()</code></li>
+<li><code>get_data()</code></li>
+<li><code>transform_sheet_data()</code></li>
+<li><code>process_records()</code></li>
+</ol>
+</li>
+<li>Sync <code>sheet_metadata</code><ol>
+<li><code>sync_stream()</code></li>
+</ol>
+</li>
+<li>Sync <code>sheets_loaded</code><ol>
+<li><code>sync_stream()</code></li>
+</ol>
+</li>
+</ol>
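+<p>The “Maybe exit the sync” step presumably compares the file’s <code>modifiedTime</code> with the last bookmark; that reading is an assumption. A minimal sketch of such a check follows, where <code>file_changed</code>, <code>parse_utc</code>, and the timestamp format are hypothetical and not taken from the tap.</p>
+<pre><code class="language-Python">from datetime import datetime, timezone
+
+# Assumed timestamp layout, e.g. 2020-01-02T03:04:05.678Z
+TIMESTAMP_FORMAT = '%Y-%m-%dT%H:%M:%S.%fZ'
+
+
+def parse_utc(value):
+    """Parse a timestamp string in the assumed format into an aware UTC datetime."""
+    return datetime.strptime(value, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)
+
+
+def file_changed(modified_time, bookmark=None):
+    """Hypothetical check: True when there is no bookmark or the file is newer than it."""
+    if bookmark is None:
+        return True
+    return parse_utc(modified_time) > parse_utc(bookmark)
+
+
+print(file_changed('2020-01-02T00:00:00.000Z', '2020-01-01T00:00:00.000Z'))
+</code></pre>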
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">time</span>
+<span class="kn">import</span> <span class="nn">math</span>
+<span class="kn">import</span> <span class="nn">json</span>
+<span class="kn">import</span> <span class="nn">re</span>
+<span class="kn">import</span> <span class="nn">urllib.parse</span>
+<span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span><span class="p">,</span> <span class="n">timedelta</span>
+<span class="kn">import</span> <span class="nn">pytz</span>
+<span class="kn">import</span> <span class="nn">singer</span>
+<span class="kn">from</span> <span class="nn">singer</span> <span class="kn">import</span> <span class="n">metrics</span><span class="p">,</span> <span class="n">metadata</span><span class="p">,</span> <span class="n">Transformer</span><span class="p">,</span> <span class="n">utils</span>
+<span class="kn">from</span> <span class="nn">singer.utils</span> <span class="kn">import</span> <span class="n">strptime_to_utc</span><span class="p">,</span> <span class="n">strftime</span>
+<span class="kn">from</span> <span class="nn">singer.messages</span> <span class="kn">import</span> <span class="n">RecordMessage</span>
+<span class="kn">from</span> <span class="nn">tap_google_sheets.streams</span> <span class="kn">import</span> <span class="n">STREAMS</span>
+<span class="kn">from</span> <span class="nn">tap_google_sheets.schema</span> <span class="kn">import</span> <span class="n">get_sheet_metadata</span>
+
+<span class="n">LOGGER</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">get_logger</span><span class="p">()</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-1'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-1'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-2'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-2'>#</a>
+ </div>
+ <h1>Helper Functions</h1>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-3'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-3'>#</a>
+ </div>
+ <hr />
+<p>Write the schema via <code>singer.write_schema</code> and log that we did so.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">write_schema</span><span class="p">(</span><span class="n">catalog</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-4'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-4'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">stream</span> <span class="o">=</span> <span class="n">catalog</span><span class="o">.</span><span class="n">get_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span>
+ <span class="n">schema</span> <span class="o">=</span> <span class="n">stream</span><span class="o">.</span><span class="n">schema</span><span class="o">.</span><span class="n">to_dict</span><span class="p">()</span>
+ <span class="k">try</span><span class="p">:</span>
+ <span class="n">singer</span><span class="o">.</span><span class="n">write_schema</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">schema</span><span class="p">,</span> <span class="n">stream</span><span class="o">.</span><span class="n">key_properties</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Writing schema for: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream_name</span><span class="p">))</span>
+ <span class="k">except</span> <span class="ne">OSError</span> <span class="k">as</span> <span class="n">err</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-5'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-5'>#</a>
+ </div>
+ <p>QUESTION: When do we encounter an OSError?</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'OS Error writing schema for: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream_name</span><span class="p">))</span>
+ <span class="k">raise</span> <span class="n">err</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-6'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-6'>#</a>
+ </div>
+ <p>Write a RecordMessage, with the given version if it was passed in</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">write_record</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">record</span><span class="p">,</span> <span class="n">time_extracted</span><span class="p">,</span> <span class="n">version</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-7'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-7'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">try</span><span class="p">:</span>
+ <span class="k">if</span> <span class="n">version</span><span class="p">:</span>
+ <span class="n">singer</span><span class="o">.</span><span class="n">messages</span><span class="o">.</span><span class="n">write_message</span><span class="p">(</span>
+ <span class="n">RecordMessage</span><span class="p">(</span>
+ <span class="n">stream</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">record</span><span class="o">=</span><span class="n">record</span><span class="p">,</span>
+ <span class="n">version</span><span class="o">=</span><span class="n">version</span><span class="p">,</span>
+ <span class="n">time_extracted</span><span class="o">=</span><span class="n">time_extracted</span><span class="p">))</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">singer</span><span class="o">.</span><span class="n">messages</span><span class="o">.</span><span class="n">write_record</span><span class="p">(</span>
+ <span class="n">stream_name</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">record</span><span class="o">=</span><span class="n">record</span><span class="p">,</span>
+ <span class="n">time_extracted</span><span class="o">=</span><span class="n">time_extracted</span><span class="p">)</span>
+ <span class="k">except</span> <span class="ne">OSError</span> <span class="k">as</span> <span class="n">err</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-8'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-8'>#</a>
+ </div>
+ <p>QUESTION: When do we encounter an OSError?</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'OS Error writing record for: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream_name</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'record: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">record</span><span class="p">))</span>
+ <span class="k">raise</span> <span class="n">err</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-9'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-9'>#</a>
+ </div>
+ <p>Safely get a bookmark from <code>state</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_bookmark</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">stream</span><span class="p">,</span> <span class="n">default</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-10'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-10'>#</a>
+ </div>
+ <p>Note that this hides an error if <code>state</code> turns out to be <code>None</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="p">(</span><span class="n">state</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">)</span> <span class="ow">or</span> <span class="p">(</span><span class="s1">'bookmarks'</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">state</span><span class="p">):</span>
+ <span class="k">return</span> <span class="n">default</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-11'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-11'>#</a>
+ </div>
+ <p>This is also short enough for one line. Is the multi-line form supposed to be more readable?</p>
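+ <p>A minimal sketch of a one-line form, assuming the only goal is to tolerate a missing <code>state</code> or a missing <code>'bookmarks'</code> key. The name <code>get_bookmark_one_line</code> and the sample state are hypothetical and not part of the tap.</p>

```python
def get_bookmark(state, stream, default):
    # The guarded form shown in the surrounding code.
    if (state is None) or ('bookmarks' not in state):
        return default
    return state.get('bookmarks', {}).get(stream, default)


def get_bookmark_one_line(state, stream, default):
    # Hypothetical one-line equivalent: a None state is treated as an empty dict.
    return (state or {}).get('bookmarks', {}).get(stream, default)


sample_state = {'bookmarks': {'file_metadata': '2020-01-01T00:00:00Z'}}
for state in (None, {}, sample_state):
    for stream in ('file_metadata', 'Sheet1'):
        expected = get_bookmark(state, stream, 'n/a')
        assert get_bookmark_one_line(state, stream, 'n/a') == expected
```

+ <p>Both forms return the default when <code>state</code> is <code>None</code>, so neither surfaces that condition to the caller.</p>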
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">return</span> <span class="p">(</span>
+ <span class="n">state</span>
+ <span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'bookmarks'</span><span class="p">,</span> <span class="p">{})</span>
+ <span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">stream</span><span class="p">,</span> <span class="n">default</span><span class="p">)</span>
+ <span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-12'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-12'>#</a>
+ </div>
+ <p>Update the bookmark for <code>stream</code> and write the state.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">write_bookmark</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">stream</span><span class="p">,</span> <span class="n">value</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-13'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-13'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="s1">'bookmarks'</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">state</span><span class="p">:</span>
+ <span class="n">state</span><span class="p">[</span><span class="s1">'bookmarks'</span><span class="p">]</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">state</span><span class="p">[</span><span class="s1">'bookmarks'</span><span class="p">][</span><span class="n">stream</span><span class="p">]</span> <span class="o">=</span> <span class="n">value</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Write state for stream: </span><span class="si">{}</span><span class="s1">, value: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
+ <span class="n">singer</span><span class="o">.</span><span class="n">write_state</span><span class="p">(</span><span class="n">state</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-14'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-14'>#</a>
+ </div>
+ <p>Upserts or deletes the <code>currently_syncing</code> key in <code>state</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">update_currently_syncing</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-15'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-15'>#</a>
+ </div>
+ <p>Why do we care if <code>stream_name</code> is passed in to delete <code>currently_syncing</code>?</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="p">(</span><span class="n">stream_name</span> <span class="ow">is</span> <span class="kc">None</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="s1">'currently_syncing'</span> <span class="ow">in</span> <span class="n">state</span><span class="p">):</span>
+ <span class="k">del</span> <span class="n">state</span><span class="p">[</span><span class="s1">'currently_syncing'</span><span class="p">]</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">singer</span><span class="o">.</span><span class="n">set_currently_syncing</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">)</span>
+ <span class="n">singer</span><span class="o">.</span><span class="n">write_state</span><span class="p">(</span><span class="n">state</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-16'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-16'>#</a>
+ </div>
+ <p>Get a list of selected, top-level fields for <code>stream_name</code></p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_selected_fields</span><span class="p">(</span><span class="n">catalog</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-17'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-17'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">stream</span> <span class="o">=</span> <span class="n">catalog</span><span class="o">.</span><span class="n">get_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span>
+ <span class="n">mdata</span> <span class="o">=</span> <span class="n">metadata</span><span class="o">.</span><span class="n">to_map</span><span class="p">(</span><span class="n">stream</span><span class="o">.</span><span class="n">metadata</span><span class="p">)</span>
+ <span class="n">mdata_list</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">metadata</span><span class="o">.</span><span class="n">to_list</span><span class="p">(</span><span class="n">mdata</span><span class="p">)</span>
+ <span class="n">selected_fields</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="k">for</span> <span class="n">entry</span> <span class="ow">in</span> <span class="n">mdata_list</span><span class="p">:</span>
+ <span class="n">field</span> <span class="o">=</span> <span class="kc">None</span>
+ <span class="k">try</span><span class="p">:</span>
+ <span class="n">field</span> <span class="o">=</span> <span class="n">entry</span><span class="p">[</span><span class="s1">'breadcrumb'</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span>
+ <span class="k">if</span> <span class="n">entry</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'metadata'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'selected'</span><span class="p">,</span> <span class="kc">False</span><span class="p">):</span>
+ <span class="n">selected_fields</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">field</span><span class="p">)</span>
+ <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-18'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-18'>#</a>
+ </div>
+ <p>Swallow the error for stream-level metadata, whose breadcrumb is empty and so has no index 1.</p>
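+ <p>A sketch of the same selection logic without the <code>try</code>/<code>except</code>, assuming top-level field breadcrumbs always have exactly two elements (<code>'properties'</code> and the field name). The function name and the sample metadata list are hypothetical.</p>

```python
def selected_top_level_fields(mdata_list):
    selected = []
    for entry in mdata_list:
        breadcrumb = entry['breadcrumb']
        # Stream-level entries have an empty breadcrumb, so skip them explicitly
        # instead of relying on an IndexError.
        if len(breadcrumb) != 2:
            continue
        if entry.get('metadata', {}).get('selected', False):
            selected.append(breadcrumb[1])
    return selected


# Hypothetical metadata list: one stream-level entry and two field-level entries.
sample_mdata_list = [
    {'breadcrumb': [], 'metadata': {'selected': True}},
    {'breadcrumb': ['properties', 'id'], 'metadata': {'selected': True}},
    {'breadcrumb': ['properties', 'name'], 'metadata': {'selected': False}},
]

print(selected_top_level_fields(sample_mdata_list))  # ['id']
```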
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">pass</span>
+ <span class="k">return</span> <span class="n">selected_fields</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-19'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-19'>#</a>
+ </div>
+ <p>Construct the request we want to make, make the request, and return the Response</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">get_data</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">endpoint_config</span><span class="p">,</span> <span class="n">client</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">range_rows</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-20'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-20'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-21'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-21'>#</a>
+ </div>
+ <h3>Build the query</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">stream_name_escaped</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">escape</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-22'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-22'>#</a>
+ </div>
+ <p>URL-encode <code>stream_name</code> so that special characters in it do not break the request path.
+QUESTION: If there are special characters here, how do databases handle them?</p>
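+ <p>For reference, a small sketch of what <code>urllib.parse.quote_plus</code> does to a sheet title. The title used here is a hypothetical example. This only illustrates the URL encoding and says nothing about how a downstream database treats the name.</p>

```python
import urllib.parse

sheet_title = 'Budget 2020/21'  # hypothetical sheet title

# quote_plus turns spaces into '+' and percent-encodes '/' as well.
encoded_plus = urllib.parse.quote_plus(sheet_title)
# quote leaves '/' alone by default and encodes spaces as %20.
encoded = urllib.parse.quote(sheet_title)

print(encoded_plus)  # Budget+2020%2F21
print(encoded)       # Budget%202020/21

# The encoding is reversible, so the original title can be recovered.
assert urllib.parse.unquote_plus(encoded_plus) == sheet_title
```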
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">stream_name_encoded</span> <span class="o">=</span> <span class="n">urllib</span><span class="o">.</span><span class="n">parse</span><span class="o">.</span><span class="n">quote_plus</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="ow">not</span> <span class="n">range_rows</span><span class="p">:</span>
+ <span class="n">range_rows</span> <span class="o">=</span> <span class="s1">''</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-23'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-23'>#</a>
+ </div>
+ <p>QUESTION: Why is this not a <code>str.format()</code> call with keyword arguments?</p>
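+ <p>A sketch comparing the two approaches, assuming a path template with the three placeholders used in the surrounding code. The template string and both function names are hypothetical. One difference that may matter: chained <code>replace</code> leaves unknown placeholders alone, while <code>format</code> raises <code>KeyError</code> for any placeholder it is not given.</p>

```python
# Hypothetical path template; the real one lives in the endpoint config.
PATH_TEMPLATE = 'spreadsheets/{spreadsheet_id}/values/{sheet_title}!{range_rows}'


def build_path_replace(template, spreadsheet_id, sheet_title, range_rows):
    # Chained replace, as in the surrounding code: unknown placeholders are left alone.
    return (template
            .replace('{spreadsheet_id}', spreadsheet_id)
            .replace('{sheet_title}', sheet_title)
            .replace('{range_rows}', range_rows))


def build_path_format(template, spreadsheet_id, sheet_title, range_rows):
    # Keyword form: raises KeyError if the template has a placeholder we do not pass.
    return template.format(
        spreadsheet_id=spreadsheet_id,
        sheet_title=sheet_title,
        range_rows=range_rows)


args = ('abc123', 'Sheet1', 'A1:B2')
assert build_path_replace(PATH_TEMPLATE, *args) == build_path_format(PATH_TEMPLATE, *args)
print(build_path_format(PATH_TEMPLATE, *args))  # spreadsheets/abc123/values/Sheet1!A1:B2
```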
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">path</span> <span class="o">=</span> <span class="n">endpoint_config</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'path'</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span>
+ <span class="s1">'</span><span class="si">{spreadsheet_id}</span><span class="s1">'</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s1">'</span><span class="si">{sheet_title}</span><span class="s1">'</span><span class="p">,</span> <span class="n">stream_name_encoded</span><span class="p">)</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span>
+ <span class="s1">'</span><span class="si">{range_rows}</span><span class="s1">'</span><span class="p">,</span> <span class="n">range_rows</span><span class="p">)</span>
+ <span class="n">params</span> <span class="o">=</span> <span class="n">endpoint_config</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'params'</span><span class="p">,</span> <span class="p">{})</span>
+ <span class="n">api</span> <span class="o">=</span> <span class="n">endpoint_config</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'api'</span><span class="p">,</span> <span class="s1">'sheets'</span><span class="p">)</span>
+ <span class="n">querystring</span> <span class="o">=</span> <span class="s1">'&'</span><span class="o">.</span><span class="n">join</span><span class="p">([</span><span class="s1">'</span><span class="si">%s</span><span class="s1">=</span><span class="si">%s</span><span class="s1">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span> <span class="k">for</span> <span class="p">(</span><span class="n">key</span><span class="p">,</span> <span class="n">value</span><span class="p">)</span> <span class="ow">in</span> <span class="n">params</span><span class="o">.</span><span class="n">items</span><span class="p">()])</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span>
+ <span class="s1">'</span><span class="si">{sheet_title}</span><span class="s1">'</span><span class="p">,</span> <span class="n">stream_name_encoded</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'URL: </span><span class="si">{}</span><span class="s1">/</span><span class="si">{}</span><span class="s1">?</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">client</span><span class="o">.</span><span class="n">base_url</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="n">querystring</span><span class="p">))</span>
+ <span class="n">data</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">time_extracted</span> <span class="o">=</span> <span class="n">utils</span><span class="o">.</span><span class="n">now</span><span class="p">()</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-24'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-24'>#</a>
+ </div>
+ <h3>Make the query</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">data</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="n">get</span><span class="p">(</span>
+ <span class="n">path</span><span class="o">=</span><span class="n">path</span><span class="p">,</span>
+ <span class="n">api</span><span class="o">=</span><span class="n">api</span><span class="p">,</span>
+ <span class="n">params</span><span class="o">=</span><span class="n">querystring</span><span class="p">,</span>
+ <span class="n">endpoint</span><span class="o">=</span><span class="n">stream_name_escaped</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-25'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-25'>#</a>
+ </div>
+ <h3>Return the Response.json()</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">return</span> <span class="n">data</span><span class="p">,</span> <span class="n">time_extracted</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-26'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-26'>#</a>
+ </div>
+ <hr />
+<h1>Transform Functions</h1>
+<p>One line of code recurs in these functions and is a bit confusing:</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-27'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-27'>#</a>
+ </div>
+ <pre><code class="language-python">json.loads(json.dumps(some_object))
+</code></pre>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-28'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-28'>#</a>
+ </div>
+ <p>I don’t see the use here. We turn a Python object into a JSON string and back again.
+The only difference I could see in the REPL is that integer keys get stringified.</p>
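+ <p>A small sketch of what the round trip changes, using a hypothetical input object. One plausible reason for the pattern, which the source does not confirm, is that the result is a deep copy, so later <code>pop</code> calls do not mutate the caller's object.</p>

```python
import json

# Hypothetical input with an integer key, a tuple, and a nested dict.
original = {1: 'one', 'pair': (1, 2), 'user': {'me': True}}
round_tripped = json.loads(json.dumps(original))

# Integer keys come back as strings and tuples come back as lists.
print(round_tripped)  # {'1': 'one', 'pair': [1, 2], 'user': {'me': True}}

# The result is a deep copy, so popping from it leaves the input untouched.
round_tripped['user'].pop('me', None)
print(original['user'])  # {'me': True}
```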
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-29'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-29'>#</a>
+ </div>
+ <p>In general, the transform functions just look like “maybe pop some
+stuff”, “maybe add some stuff”, and return the input in a list</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-30'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-30'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-31'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-31'>#</a>
+ </div>
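+ <p>Remove the <code>photoLink</code>, <code>me</code>, and <code>permissionId</code> nodes from <code>lastModifyingUser</code>, and return the result in a list.</p>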
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">transform_file_metadata</span><span class="p">(</span><span class="n">file_metadata</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-32'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-32'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">file_metadata_tf</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">file_metadata</span><span class="p">))</span>
+
+ <span class="k">if</span> <span class="n">file_metadata_tf</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'lastModifyingUser'</span><span class="p">):</span>
+ <span class="n">file_metadata_tf</span><span class="p">[</span><span class="s1">'lastModifyingUser'</span><span class="p">]</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="s1">'photoLink'</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
+ <span class="n">file_metadata_tf</span><span class="p">[</span><span class="s1">'lastModifyingUser'</span><span class="p">]</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="s1">'me'</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
+ <span class="n">file_metadata_tf</span><span class="p">[</span><span class="s1">'lastModifyingUser'</span><span class="p">]</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="s1">'permissionId'</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
+
+ <span class="n">file_metadata_arr</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="n">file_metadata_arr</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">file_metadata_tf</span><span class="p">)</span>
+ <span class="k">return</span> <span class="n">file_metadata_arr</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-33'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-33'>#</a>
+ </div>
+ <p>Remove the <code>defaultFormat</code> and <code>sheets</code> nodes, and return the result in a list.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">transform_spreadsheet_metadata</span><span class="p">(</span><span class="n">spreadsheet_metadata</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-34'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-34'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">spreadsheet_metadata_tf</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">spreadsheet_metadata</span><span class="p">))</span>
+
+ <span class="k">if</span> <span class="n">spreadsheet_metadata_tf</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">):</span>
+ <span class="n">spreadsheet_metadata_tf</span><span class="p">[</span><span class="s1">'properties'</span><span class="p">]</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="s1">'defaultFormat'</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
+ <span class="n">spreadsheet_metadata_tf</span><span class="o">.</span><span class="n">pop</span><span class="p">(</span><span class="s1">'sheets'</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span>
+
+ <span class="n">spreadsheet_metadata_arr</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="n">spreadsheet_metadata_arr</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">spreadsheet_metadata_tf</span><span class="p">)</span>
+ <span class="k">return</span> <span class="n">spreadsheet_metadata_arr</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-35'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-35'>#</a>
+ </div>
+ <p>Add <code>spreadsheetId</code>, <code>sheetUrl</code>, and <code>columns</code> to the sheet's properties.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">transform_sheet_metadata</span><span class="p">(</span><span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">sheet</span><span class="p">,</span> <span class="n">columns</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-36'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-36'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_metadata</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">)</span>
+ <span class="n">sheet_metadata_tf</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">loads</span><span class="p">(</span><span class="n">json</span><span class="o">.</span><span class="n">dumps</span><span class="p">(</span><span class="n">sheet_metadata</span><span class="p">))</span>
+ <span class="n">sheet_id</span> <span class="o">=</span> <span class="n">sheet_metadata_tf</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'sheetId'</span><span class="p">)</span>
+ <span class="n">sheet_url</span> <span class="o">=</span> <span class="s1">'https://docs.google.com/spreadsheets/d/</span><span class="si">{}</span><span class="s1">/edit#gid=</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">sheet_id</span><span class="p">)</span>
+ <span class="n">sheet_metadata_tf</span><span class="p">[</span><span class="s1">'spreadsheetId'</span><span class="p">]</span> <span class="o">=</span> <span class="n">spreadsheet_id</span>
+ <span class="n">sheet_metadata_tf</span><span class="p">[</span><span class="s1">'sheetUrl'</span><span class="p">]</span> <span class="o">=</span> <span class="n">sheet_url</span>
+ <span class="n">sheet_metadata_tf</span><span class="p">[</span><span class="s1">'columns'</span><span class="p">]</span> <span class="o">=</span> <span class="n">columns</span>
+ <span class="k">return</span> <span class="n">sheet_metadata_tf</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-37'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-37'>#</a>
+ </div>
+ <p>Convert an Excel date serial number (<code>excel_date_sn</code>) to a datetime string. <code>timezone_str</code> defaults to UTC.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">excel_to_dttm_str</span><span class="p">(</span><span class="n">excel_date_sn</span><span class="p">,</span> <span class="n">timezone_str</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-38'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-38'>#</a>
+ </div>
+ <p>Default to UTC (which we assume is the timezone for ALL datetimes)</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="ow">not</span> <span class="n">timezone_str</span><span class="p">:</span>
+ <span class="n">timezone_str</span> <span class="o">=</span> <span class="s1">'UTC'</span>
+ <span class="n">tzn</span> <span class="o">=</span> <span class="n">pytz</span><span class="o">.</span><span class="n">timezone</span><span class="p">(</span><span class="n">timezone_str</span><span class="p">)</span>
+ <span class="n">epoch_dttm</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">1970</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
+
+ <span class="n">sec_per_day</span> <span class="o">=</span> <span class="mi">86400</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-39'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-39'>#</a>
+ </div>
+ <p>Lotus 1-2-3 serial number for the epoch start date, 1970-01-01T00:00:00Z</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">excel_epoch</span> <span class="o">=</span> <span class="mi">25569</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-40'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-40'>#</a>
+ </div>
+ <p>Days since the epoch, times the seconds per day, gives seconds since the epoch</p>
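+ <p>A minimal sketch of the serial-number arithmetic, using only the standard library. The names
+<code>EXCEL_EPOCH</code>, <code>SEC_PER_DAY</code>, and <code>serial_to_datetime</code> are illustrative and not part of the tap;
+the sketch returns a naive <code>datetime</code> and leaves out the timezone handling and string formatting.</p>

```python
import math
from datetime import datetime, timedelta

EXCEL_EPOCH = 25569   # serial number of 1970-01-01T00:00:00Z
SEC_PER_DAY = 86400


def serial_to_datetime(serial):
    """Convert a spreadsheet date serial number to a naive datetime."""
    # (serial - EXCEL_EPOCH) is days since the Unix epoch; scale to seconds.
    epoch_sec = math.floor((serial - EXCEL_EPOCH) * SEC_PER_DAY)
    return datetime(1970, 1, 1) + timedelta(seconds=epoch_sec)


print(serial_to_datetime(25569))     # 1970-01-01 00:00:00
print(serial_to_datetime(25569.5))   # 1970-01-01 12:00:00
print(serial_to_datetime(43831))     # 2020-01-01 00:00:00
```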
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">epoch_sec</span> <span class="o">=</span> <span class="n">math</span><span class="o">.</span><span class="n">floor</span><span class="p">((</span><span class="n">excel_date_sn</span> <span class="o">-</span> <span class="n">excel_epoch</span><span class="p">)</span> <span class="o">*</span> <span class="n">sec_per_day</span><span class="p">)</span>
+
+ <span class="n">excel_dttm</span> <span class="o">=</span> <span class="n">epoch_dttm</span> <span class="o">+</span> <span class="n">timedelta</span><span class="p">(</span><span class="n">seconds</span><span class="o">=</span><span class="n">epoch_sec</span><span class="p">)</span>
+ <span class="n">utc_dttm</span> <span class="o">=</span> <span class="n">tzn</span><span class="o">.</span><span class="n">localize</span><span class="p">(</span><span class="n">excel_dttm</span><span class="p">)</span><span class="o">.</span><span class="n">astimezone</span><span class="p">(</span><span class="n">pytz</span><span class="o">.</span><span class="n">utc</span><span class="p">)</span>
+ <span class="n">utc_dttm_str</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">utils</span><span class="o">.</span><span class="n">strftime</span><span class="p">(</span><span class="n">utc_dttm</span><span class="p">)</span>
+ <span class="k">return</span> <span class="n">utc_dttm_str</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-41'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-41'>#</a>
+ </div>
+ <hr />
+<h3>WARNING: This next function is confusing</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-42'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-42'>#</a>
+ </div>
+ <p>In general, the point of the function is to transform each field based on the data type that the
+API reports for it. It loops over every row and then every column in the row.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-43'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-43'>#</a>
+ </div>
+ <p>For the <code>TIME</code> fields, there’s no obvious reason the conversion should work, and in some cases the value
+returned is just wrong.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-44'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-44'>#</a>
+ </div>
+ <p>You can look at the code for <code>timedelta</code> and you would see that its constructor
+normalizes seven input units into three: you can create the object with <code>weeks</code>, <code>days</code>,
+<code>hours</code>, <code>minutes</code>, <code>seconds</code>, <code>milliseconds</code>, and <code>microseconds</code>, but it stores
+only <code>days</code>, <code>seconds</code>, and <code>microseconds</code>.</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-45'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-45'>#</a>
+ </div>
+ <p><em>Disclaimer: the unit list is from memory, but the spirit of
+the idea holds.</em></p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-46'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-46'>#</a>
+ </div>
+ <p>We pass in <code>seconds</code> as the value we get from the API times the number of seconds in a
+day, and how <code>timedelta</code> does its normalization can give us an unexpected value. It takes the input to
+<code>seconds</code> and passes that to <code>divmod()</code>, which returns a 2-tuple as the result. The first element is
+our input floor-divided by the number of seconds in a day. The second element is our input mod
+the number of seconds in a day. These results are then added to the rest of the normalization, and
+the time-of-day part comes back out correctly. It’s easy to imagine that since we don’t pass in a <code>days</code>
+argument, our <code>divmod()</code>’s days output is just added to zero. The <code>__str__()</code> for <code>timedelta</code> must
+be something like <code>"{my_days} days, {time_since_midnight(my_seconds)}"</code>, which is essentially what
+we get after this transform function whenever the value is one day or more.</p>
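+ <p>A small sketch of that behavior, assuming the <code>TIME</code> value arrives as a fraction of a day. The helper
+<code>time_serial_to_str</code> is a hypothetical stand-in for the <code>TIME</code> branch, not a function in the tap.</p>

```python
from datetime import timedelta

SEC_PER_DAY = 86400  # the same factor the TIME branch multiplies by


def time_serial_to_str(value):
    """Fraction-of-a-day value -> str(timedelta), as the TIME branch does."""
    return str(timedelta(seconds=value * SEC_PER_DAY))


# divmod splits the seconds into whole days and the remainder within a day.
whole_days, secs_in_day = divmod(1.5 * SEC_PER_DAY, SEC_PER_DAY)
print(whole_days, secs_in_day)    # 1.0 43200.0

print(time_serial_to_str(0.25))   # 6:00:00
print(time_serial_to_str(0.5))    # 12:00:00
print(time_serial_to_str(1.5))    # 1 day, 12:00:00
```

+ <p>Values below one day come out as a plain time string, while values of one day or more pick up the day-count prefix.</p>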
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-47'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-47'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-48'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-48'>#</a>
+ </div>
+ <p>Add spreadsheet_id, sheet_id, and row number; convert dates/times; and convert each row from an array of values to a JSON object</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">transform_sheet_data</span><span class="p">(</span><span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">sheet_id</span><span class="p">,</span> <span class="n">sheet_title</span><span class="p">,</span> <span class="n">from_row</span><span class="p">,</span> <span class="n">columns</span><span class="p">,</span> <span class="n">sheet_data_rows</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-49'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-49'>#</a>
+ </div>
+ <p>Output rows are JSON objects with column names as keys</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_data_tf</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="n">row_num</span> <span class="o">=</span> <span class="n">from_row</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-50'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-50'>#</a>
+ </div>
+ <p>Create sorted list of columns based on columnIndex</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">cols</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">columns</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">i</span><span class="p">:</span> <span class="n">i</span><span class="p">[</span><span class="s1">'columnIndex'</span><span class="p">])</span>
+
+ <span class="k">for</span> <span class="n">row</span> <span class="ow">in</span> <span class="n">sheet_data_rows</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-51'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-51'>#</a>
+ </div>
+ <p>If empty row, SKIP</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="n">row</span> <span class="o">==</span> <span class="p">[]:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'EMPTY ROW: </span><span class="si">{}</span><span class="s1">, SKIPPING'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">row_num</span><span class="p">))</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">sheet_data_row_tf</span> <span class="o">=</span> <span class="p">{}</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-52'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-52'>#</a>
+ </div>
+ <p>Add spreadsheet_id, sheet_id, and row</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_data_row_tf</span><span class="p">[</span><span class="s1">'__sdc_spreadsheet_id'</span><span class="p">]</span> <span class="o">=</span> <span class="n">spreadsheet_id</span>
+ <span class="n">sheet_data_row_tf</span><span class="p">[</span><span class="s1">'__sdc_sheet_id'</span><span class="p">]</span> <span class="o">=</span> <span class="n">sheet_id</span>
+ <span class="n">sheet_data_row_tf</span><span class="p">[</span><span class="s1">'__sdc_row'</span><span class="p">]</span> <span class="o">=</span> <span class="n">row_num</span>
+ <span class="n">col_num</span> <span class="o">=</span> <span class="mi">1</span>
+ <span class="k">for</span> <span class="n">value</span> <span class="ow">in</span> <span class="n">row</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-53'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-53'>#</a>
+ </div>
+ <p>Select column metadata based on column index</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">col</span> <span class="o">=</span> <span class="n">cols</span><span class="p">[</span><span class="n">col_num</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span>
+ <span class="n">col_skipped</span> <span class="o">=</span> <span class="n">col</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'columnSkipped'</span><span class="p">)</span>
+ <span class="k">if</span> <span class="ow">not</span> <span class="n">col_skipped</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-54'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-54'>#</a>
+ </div>
+ <p>Get column metadata</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">col_name</span> <span class="o">=</span> <span class="n">col</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'columnName'</span><span class="p">)</span>
+ <span class="n">col_type</span> <span class="o">=</span> <span class="n">col</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'columnType'</span><span class="p">)</span>
+ <span class="n">col_letter</span> <span class="o">=</span> <span class="n">col</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'columnLetter'</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-55'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-55'>#</a>
+ </div>
+ <p>NULL values</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="n">value</span> <span class="ow">is</span> <span class="kc">None</span> <span class="ow">or</span> <span class="n">value</span> <span class="o">==</span> <span class="s1">''</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="kc">None</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-56'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-56'>#</a>
+ </div>
+ <p>DATE-TIME: convert dates/times from Lotus 1-2-3 serial numbers</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">col_type</span> <span class="o">==</span> <span class="s1">'numberType.DATE_TIME'</span><span class="p">:</span>
+ <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="p">(</span><span class="nb">int</span><span class="p">,</span> <span class="nb">float</span><span class="p">)):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="n">excel_to_dttm_str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR; SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-57'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-57'>#</a>
+ </div>
+ <p>DATE</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">col_type</span> <span class="o">==</span> <span class="s1">'numberType.DATE'</span><span class="p">:</span>
+ <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="p">(</span><span class="nb">int</span><span class="p">,</span> <span class="nb">float</span><span class="p">)):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="n">excel_to_dttm_str</span><span class="p">(</span><span class="n">value</span><span class="p">)[:</span><span class="mi">10</span><span class="p">]</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR; SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-58'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-58'>#</a>
+ </div>
+ <p>TIME ONLY (NO DATE)</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">col_type</span> <span class="o">==</span> <span class="s1">'numberType.TIME'</span><span class="p">:</span>
+
+ <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="p">(</span><span class="nb">int</span><span class="p">,</span> <span class="nb">float</span><span class="p">)):</span>
+ <span class="k">try</span><span class="p">:</span>
+ <span class="n">total_secs</span> <span class="o">=</span> <span class="n">value</span> <span class="o">*</span> <span class="mi">86400</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-59'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-59'>#</a>
+ </div>
+ <p>Create a string formatted like <code>H:MM:SS</code>; values of one day or more get a leading day count such as <code>1 day, </code></p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">timedelta</span><span class="p">(</span><span class="n">seconds</span><span class="o">=</span><span class="n">total_secs</span><span class="p">))</span>
+ <span class="k">except</span> <span class="ne">ValueError</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR; SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-60'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-60'>#</a>
+ </div>
+ <p>NUMBER (INTEGER AND FLOAT)</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">col_type</span> <span class="o">==</span> <span class="s1">'numberType'</span><span class="p">:</span>
+ <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="nb">int</span><span class="p">):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="nb">float</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-61'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-61'>#</a>
+ </div>
+ <p>Determine float decimal digits</p>
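+ <p>A minimal sketch of the digit-counting trick: reversing the string form of the float and finding the first
+<code>'.'</code> gives the number of digits after the decimal point. The helper names <code>count_decimal_digits</code> and
+<code>round_if_needed</code> are illustrative, not part of the tap.</p>

```python
def count_decimal_digits(value):
    """Digits after the decimal point in str(value); -1 if there is no '.'."""
    return str(value)[::-1].find('.')


def round_if_needed(value, max_digits=15):
    """Round only when the float carries more than max_digits decimals."""
    if count_decimal_digits(value) > max_digits:
        return float(round(value, max_digits))
    return float(value)


print(count_decimal_digits(3.14159))     # 5
print(count_decimal_digits(0.1 + 0.2))   # 17, from '0.30000000000000004'
print(count_decimal_digits(1e-07))       # -1, since '1e-07' has no '.'
print(round_if_needed(0.1 + 0.2))        # 0.3
```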
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">decimal_digits</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)[::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="s1">'.'</span><span class="p">)</span>
+ <span class="k">if</span> <span class="n">decimal_digits</span> <span class="o">></span> <span class="mi">15</span><span class="p">:</span>
+ <span class="k">try</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-62'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-62'>#</a>
+ </div>
+ <p>ROUND to multipleOf: 1e-15</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">col_val</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="nb">round</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="mi">15</span><span class="p">))</span>
+ <span class="k">except</span> <span class="ne">ValueError</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR; SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
+ <span class="k">else</span><span class="p">:</span> <span class="c1"># decimal_digits <= 15, no rounding</span>
+ <span class="k">try</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">float</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="k">except</span> <span class="ne">ValueError</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR: SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR: SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-63'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-63'>#</a>
+ </div>
+ <p>STRING</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">col_type</span> <span class="o">==</span> <span class="s1">'stringValue'</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-64'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-64'>#</a>
+ </div>
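+ <p>BOOLEAN: accept booleans, the strings true/t/yes/y and false/f/no/n (case-insensitive), and the integers 1, -1, and 0</p>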
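+ <p>The coercion rules in this branch can be sketched as a standalone function. <code>coerce_bool</code> is a hypothetical
+helper, and its final fallback (stringifying any other type, such as a float) is an assumption added for the sketch;
+the branch shown here does not handle that case.</p>

```python
TRUE_STRINGS = ('true', 't', 'yes', 'y')
FALSE_STRINGS = ('false', 'f', 'no', 'n')


def coerce_bool(value):
    """Coerce a cell value to bool where possible, else return it as a string."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.lower()
        if lowered in TRUE_STRINGS:
            return True
        if lowered in FALSE_STRINGS:
            return False
        return str(value)
    if isinstance(value, int):
        if value in (1, -1):
            return True
        if value == 0:
            return False
        return str(value)
    # Assumption for this sketch only: stringify any other type.
    return str(value)


print(coerce_bool('Yes'))   # True
print(coerce_bool(0))       # False
print(coerce_bool(7))       # 7 (as a string)
```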
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">elif</span> <span class="n">col_type</span> <span class="o">==</span> <span class="s1">'boolValue'</span><span class="p">:</span>
+ <span class="k">if</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="nb">bool</span><span class="p">):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="n">value</span>
+ <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="nb">str</span><span class="p">):</span>
+ <span class="k">if</span> <span class="n">value</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span> <span class="ow">in</span> <span class="p">(</span><span class="s1">'true'</span><span class="p">,</span> <span class="s1">'t'</span><span class="p">,</span> <span class="s1">'yes'</span><span class="p">,</span> <span class="s1">'y'</span><span class="p">):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="kc">True</span>
+ <span class="k">elif</span> <span class="n">value</span><span class="o">.</span><span class="n">lower</span><span class="p">()</span> <span class="ow">in</span> <span class="p">(</span><span class="s1">'false'</span><span class="p">,</span> <span class="s1">'f'</span><span class="p">,</span> <span class="s1">'no'</span><span class="p">,</span> <span class="s1">'n'</span><span class="p">):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="kc">False</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR; SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
+ <span class="k">elif</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">value</span><span class="p">,</span> <span class="nb">int</span><span class="p">):</span>
+ <span class="k">if</span> <span class="n">value</span> <span class="ow">in</span> <span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">):</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="kc">True</span>
+ <span class="k">elif</span> <span class="n">value</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="kc">False</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR; SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row_num</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-65'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-65'>#</a>
+ </div>
+ <p>OTHER: Convert everything else to a string</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">else</span><span class="p">:</span>
+ <span class="n">col_val</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">value</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'WARNING: POSSIBLE DATA TYPE ERROR; SHEET: </span><span class="si">{}</span><span class="s1">, COL: </span><span class="si">{}</span><span class="s1">, CELL: </span><span class="si">{}{}</span><span class="s1">, TYPE: </span><span class="si">{}</span><span class="s1">, VALUE: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">col_name</span><span class="p">,</span> <span class="n">col_letter</span><span class="p">,</span> <span class="n">row</span><span class="p">,</span> <span class="n">col_type</span><span class="p">,</span> <span class="n">value</span><span class="p">))</span>
+ <span class="n">sheet_data_row_tf</span><span class="p">[</span><span class="n">col_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">col_val</span>
+ <span class="n">col_num</span> <span class="o">=</span> <span class="n">col_num</span> <span class="o">+</span> <span class="mi">1</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-66'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-66'>#</a>
+ </div>
+ <p>APPEND non-empty row</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_data_tf</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">sheet_data_row_tf</span><span class="p">)</span>
+ <span class="n">row_num</span> <span class="o">=</span> <span class="n">row_num</span> <span class="o">+</span> <span class="mi">1</span>
+ <span class="k">return</span> <span class="n">sheet_data_tf</span><span class="p">,</span> <span class="n">row_num</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
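The integer branch of the boolean conversion above can be read in isolation. The following is a minimal sketch, not the project's own helper: the function name `coerce_int_to_boolean` and the `suspect` flag are assumptions introduced here for illustration, standing in for the `LOGGER.info` warning in the original.

```python
def coerce_int_to_boolean(value):
    """Map an integer cell value onto a boolean.

    Returns a (col_val, suspect) pair. `suspect` is True when the value is
    not a recognised boolean encoding and had to be converted to a string,
    which is the case where the original code logs a data type warning.
    """
    if value in (1, -1):
        # 1 and -1 are both treated as True
        return True, False
    if value == 0:
        return False, False
    # Anything else is kept, but as a string, and flagged for review
    return str(value), True
```

Callers would store the first element in the transformed row and log a warning whenever the second element is True.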
+ <div class='section' id='section-67'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-67'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-68'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-68'>#</a>
+ </div>
+ <h1>Main Functions</h1>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-69'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-69'>#</a>
+ </div>
+ <hr />
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-70'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-70'>#</a>
+ </div>
+ <p>Transform and validate a batch of records against the schema, then send them to the target</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">process_records</span><span class="p">(</span><span class="n">catalog</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">,</span> <span class="n">records</span><span class="p">,</span> <span class="n">time_extracted</span><span class="p">,</span> <span class="n">version</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-71'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-71'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">stream</span> <span class="o">=</span> <span class="n">catalog</span><span class="o">.</span><span class="n">get_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span>
+ <span class="n">schema</span> <span class="o">=</span> <span class="n">stream</span><span class="o">.</span><span class="n">schema</span><span class="o">.</span><span class="n">to_dict</span><span class="p">()</span>
+ <span class="n">stream_metadata</span> <span class="o">=</span> <span class="n">metadata</span><span class="o">.</span><span class="n">to_map</span><span class="p">(</span><span class="n">stream</span><span class="o">.</span><span class="n">metadata</span><span class="p">)</span>
+ <span class="k">with</span> <span class="n">metrics</span><span class="o">.</span><span class="n">record_counter</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span> <span class="k">as</span> <span class="n">counter</span><span class="p">:</span>
+ <span class="k">for</span> <span class="n">record</span> <span class="ow">in</span> <span class="n">records</span><span class="p">:</span>
+ <span class="k">with</span> <span class="n">Transformer</span><span class="p">()</span> <span class="k">as</span> <span class="n">transformer</span><span class="p">:</span>
+ <span class="k">try</span><span class="p">:</span>
+ <span class="n">transformed_record</span> <span class="o">=</span> <span class="n">transformer</span><span class="o">.</span><span class="n">transform</span><span class="p">(</span><span class="n">record</span><span class="p">,</span> <span class="n">schema</span><span class="p">,</span> <span class="n">stream_metadata</span><span class="p">)</span>
+ <span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">err</span><span class="p">:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">error</span><span class="p">(</span><span class="s1">'</span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">err</span><span class="p">))</span>
+ <span class="k">raise</span> <span class="ne">RuntimeError</span><span class="p">(</span><span class="n">err</span><span class="p">)</span>
+ <span class="n">write_record</span><span class="p">(</span>
+ <span class="n">stream_name</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">record</span><span class="o">=</span><span class="n">transformed_record</span><span class="p">,</span>
+ <span class="n">time_extracted</span><span class="o">=</span><span class="n">time_extracted</span><span class="p">,</span>
+ <span class="n">version</span><span class="o">=</span><span class="n">version</span><span class="p">)</span>
+ <span class="n">counter</span><span class="o">.</span><span class="n">increment</span><span class="p">()</span>
+ <span class="k">return</span> <span class="n">counter</span><span class="o">.</span><span class="n">value</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-72'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-72'>#</a>
+ </div>
+ <p>Sync one stream if it is selected: write its schema, then pass the records through to <code>process_records()</code></p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">sync_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">selected_streams</span><span class="p">,</span> <span class="n">catalog</span><span class="p">,</span> <span class="n">state</span><span class="p">,</span> <span class="n">records</span><span class="p">,</span> <span class="n">time_extracted</span><span class="o">=</span><span class="kc">None</span><span class="p">):</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-73'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-73'>#</a>
+ </div>
+
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="n">stream_name</span> <span class="ow">in</span> <span class="n">selected_streams</span><span class="p">:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'STARTED Syncing </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream_name</span><span class="p">))</span>
+ <span class="n">update_currently_syncing</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">)</span>
+ <span class="n">selected_fields</span> <span class="o">=</span> <span class="n">get_selected_fields</span><span class="p">(</span><span class="n">catalog</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Stream: </span><span class="si">{}</span><span class="s1">, selected_fields: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">selected_fields</span><span class="p">))</span>
+ <span class="n">write_schema</span><span class="p">(</span><span class="n">catalog</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">)</span>
+ <span class="k">if</span> <span class="ow">not</span> <span class="n">time_extracted</span><span class="p">:</span>
+ <span class="n">time_extracted</span> <span class="o">=</span> <span class="n">utils</span><span class="o">.</span><span class="n">now</span><span class="p">()</span>
+ <span class="n">record_count</span> <span class="o">=</span> <span class="n">process_records</span><span class="p">(</span>
+ <span class="n">catalog</span><span class="o">=</span><span class="n">catalog</span><span class="p">,</span>
+ <span class="n">stream_name</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">records</span><span class="o">=</span><span class="n">records</span><span class="p">,</span>
+ <span class="n">time_extracted</span><span class="o">=</span><span class="n">time_extracted</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'FINISHED Syncing </span><span class="si">{}</span><span class="s1">, Total Records: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">record_count</span><span class="p">))</span>
+ <span class="n">update_currently_syncing</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-74'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-74'>#</a>
+ </div>
+ <p>See top of file for notes</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre><span class="k">def</span> <span class="nf">sync</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">config</span><span class="p">,</span> <span class="n">catalog</span><span class="p">,</span> <span class="n">state</span><span class="p">):</span>
+ <span class="n">start_date</span> <span class="o">=</span> <span class="n">config</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'start_date'</span><span class="p">)</span>
+ <span class="n">spreadsheet_id</span> <span class="o">=</span> <span class="n">config</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'spreadsheet_id'</span><span class="p">)</span>
+
+ <span class="n">last_stream</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">get_currently_syncing</span><span class="p">(</span><span class="n">state</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'last/currently syncing stream: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">last_stream</span><span class="p">))</span>
+
+ <span class="n">selected_streams</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="k">for</span> <span class="n">stream</span> <span class="ow">in</span> <span class="n">catalog</span><span class="o">.</span><span class="n">get_selected_streams</span><span class="p">(</span><span class="n">state</span><span class="p">):</span>
+ <span class="n">selected_streams</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">stream</span><span class="o">.</span><span class="n">stream</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'selected_streams: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">selected_streams</span><span class="p">))</span>
+
+ <span class="k">if</span> <span class="ow">not</span> <span class="n">selected_streams</span><span class="p">:</span>
+ <span class="k">return</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-75'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-75'>#</a>
+ </div>
+ <h2>FILE_METADATA</h2>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">file_metadata</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">stream_name</span> <span class="o">=</span> <span class="s1">'file_metadata'</span>
+ <span class="n">file_metadata_config</span> <span class="o">=</span> <span class="n">STREAMS</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-76'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-76'>#</a>
+ </div>
+ <p>GET file_metadata</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'GET file_metadata'</span><span class="p">)</span>
+ <span class="n">file_metadata</span><span class="p">,</span> <span class="n">time_extracted</span> <span class="o">=</span> <span class="n">get_data</span><span class="p">(</span><span class="n">stream_name</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">endpoint_config</span><span class="o">=</span><span class="n">file_metadata_config</span><span class="p">,</span>
+ <span class="n">client</span><span class="o">=</span><span class="n">client</span><span class="p">,</span>
+ <span class="n">spreadsheet_id</span><span class="o">=</span><span class="n">spreadsheet_id</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-77'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-77'>#</a>
+ </div>
+ <p>Transform file_metadata</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Transform file_metadata'</span><span class="p">)</span>
+ <span class="n">file_metadata_tf</span> <span class="o">=</span> <span class="n">transform_file_metadata</span><span class="p">(</span><span class="n">file_metadata</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-78'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-78'>#</a>
+ </div>
+ <p>Check whether the file has changed; if not, exit</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">last_datetime</span> <span class="o">=</span> <span class="n">strptime_to_utc</span><span class="p">(</span><span class="n">get_bookmark</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">stream_name</span><span class="p">,</span> <span class="n">start_date</span><span class="p">))</span>
+ <span class="n">this_datetime</span> <span class="o">=</span> <span class="n">strptime_to_utc</span><span class="p">(</span><span class="n">file_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'modifiedTime'</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'last_datetime = </span><span class="si">{}</span><span class="s1">, this_datetime = </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">last_datetime</span><span class="p">,</span> <span class="n">this_datetime</span><span class="p">))</span>
+ <span class="k">if</span> <span class="n">this_datetime</span> <span class="o">&lt;=</span> <span class="n">last_datetime</span><span class="p">:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'this_datetime &lt;= last_datetime, FILE NOT CHANGED. EXITING.'</span><span class="p">)</span>
+ <span class="n">write_bookmark</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="s1">'file_metadata'</span><span class="p">,</span> <span class="n">strftime</span><span class="p">(</span><span class="n">this_datetime</span><span class="p">))</span>
+ <span class="k">return</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
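The change check above compares the file's `modifiedTime` with the stored bookmark. Below is a minimal standard-library sketch of that comparison. The names `parse_utc` and `file_has_changed` are hypothetical, and the two accepted timestamp layouts are an assumption about what the bookmark and the API value look like; the original relies on `strptime_to_utc` instead.

```python
from datetime import datetime, timezone

# Assumed layouts: UTC timestamps with or without fractional seconds
_FORMATS = ('%Y-%m-%dT%H:%M:%S.%fZ', '%Y-%m-%dT%H:%M:%SZ')


def parse_utc(timestamp):
    """Parse a 'Z'-suffixed timestamp string into an aware UTC datetime."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(timestamp, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError('Unsupported timestamp: {}'.format(timestamp))


def file_has_changed(modified_time, bookmark):
    """Return True only when the file was modified after the bookmark."""
    return parse_utc(modified_time) > parse_utc(bookmark)
```

An equal timestamp counts as unchanged, matching the `<=` comparison in the original code.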
+ <div class='section' id='section-79'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-79'>#</a>
+ </div>
+ <p>Write file_metadata records if selected</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sync_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">selected_streams</span><span class="p">,</span> <span class="n">catalog</span><span class="p">,</span> <span class="n">state</span><span class="p">,</span> <span class="n">file_metadata_tf</span><span class="p">,</span> <span class="n">time_extracted</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-80'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-80'>#</a>
+ </div>
+ <h2>SPREADSHEET_METADATA</h2>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">spreadsheet_metadata</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">stream_name</span> <span class="o">=</span> <span class="s1">'spreadsheet_metadata'</span>
+ <span class="n">spreadsheet_metadata_config</span> <span class="o">=</span> <span class="n">STREAMS</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">stream_name</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-81'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-81'>#</a>
+ </div>
+ <p>GET spreadsheet_metadata</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'GET spreadsheet_metadata'</span><span class="p">)</span>
+ <span class="n">spreadsheet_metadata</span><span class="p">,</span> <span class="n">ss_time_extracted</span> <span class="o">=</span> <span class="n">get_data</span><span class="p">(</span>
+ <span class="n">stream_name</span><span class="o">=</span><span class="n">stream_name</span><span class="p">,</span>
+ <span class="n">endpoint_config</span><span class="o">=</span><span class="n">spreadsheet_metadata_config</span><span class="p">,</span>
+ <span class="n">client</span><span class="o">=</span><span class="n">client</span><span class="p">,</span>
+ <span class="n">spreadsheet_id</span><span class="o">=</span><span class="n">spreadsheet_id</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-82'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-82'>#</a>
+ </div>
+ <p>Transform spreadsheet_metadata</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Transform spreadsheet_metadata'</span><span class="p">)</span>
+ <span class="n">spreadsheet_metadata_tf</span> <span class="o">=</span> <span class="n">transform_spreadsheet_metadata</span><span class="p">(</span><span class="n">spreadsheet_metadata</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-83'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-83'>#</a>
+ </div>
+ <p>Write spreadsheet_metadata records if selected</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sync_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">selected_streams</span><span class="p">,</span> <span class="n">catalog</span><span class="p">,</span> <span class="n">state</span><span class="p">,</span> <span class="n">spreadsheet_metadata_tf</span><span class="p">,</span>
+ <span class="n">ss_time_extracted</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-84'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-84'>#</a>
+ </div>
+ <h2>SHEET_METADATA and SHEET_DATA</h2>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheets</span> <span class="o">=</span> <span class="n">spreadsheet_metadata</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'sheets'</span><span class="p">)</span>
+ <span class="n">sheet_metadata</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="n">sheets_loaded</span> <span class="o">=</span> <span class="p">[]</span>
+ <span class="n">sheets_loaded_config</span> <span class="o">=</span> <span class="n">STREAMS</span><span class="p">[</span><span class="s1">'sheets_loaded'</span><span class="p">]</span>
+ <span class="k">if</span> <span class="n">sheets</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-85'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-85'>#</a>
+ </div>
+ <p>Loop through the sheets (worksheet tabs) in the spreadsheet</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">for</span> <span class="n">sheet</span> <span class="ow">in</span> <span class="n">sheets</span><span class="p">:</span>
+ <span class="n">sheet_title</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'title'</span><span class="p">)</span>
+ <span class="n">sheet_id</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'sheetId'</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-86'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-86'>#</a>
+ </div>
+ <h3>SHEET_METADATA</h3>
+<p>GET sheet_metadata and columns</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_schema</span><span class="p">,</span> <span class="n">columns</span> <span class="o">=</span> <span class="n">get_sheet_metadata</span><span class="p">(</span><span class="n">sheet</span><span class="p">,</span> <span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">client</span><span class="p">)</span>
+
+ <span class="k">if</span> <span class="ow">not</span> <span class="n">sheet_schema</span> <span class="ow">or</span> <span class="ow">not</span> <span class="n">columns</span><span class="p">:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'SKIPPING Empty Sheet: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">))</span>
+ <span class="k">else</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-87'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-87'>#</a>
+ </div>
+ <p>Transform sheet_metadata</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_metadata_tf</span> <span class="o">=</span> <span class="n">transform_sheet_metadata</span><span class="p">(</span><span class="n">spreadsheet_id</span><span class="p">,</span> <span class="n">sheet</span><span class="p">,</span> <span class="n">columns</span><span class="p">)</span>
+ <span class="n">sheet_metadata</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">sheet_metadata_tf</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-88'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-88'>#</a>
+ </div>
+ <h3>SHEET_DATA</h3>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">if</span> <span class="n">sheet_title</span> <span class="ow">in</span> <span class="n">selected_streams</span><span class="p">:</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'STARTED Syncing Sheet </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">))</span>
+ <span class="n">update_currently_syncing</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">sheet_title</span><span class="p">)</span>
+ <span class="n">selected_fields</span> <span class="o">=</span> <span class="n">get_selected_fields</span><span class="p">(</span><span class="n">catalog</span><span class="p">,</span> <span class="n">sheet_title</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Stream: </span><span class="si">{}</span><span class="s1">, selected_fields: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">,</span> <span class="n">selected_fields</span><span class="p">))</span>
+ <span class="n">write_schema</span><span class="p">(</span><span class="n">catalog</span><span class="p">,</span> <span class="n">sheet_title</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-89'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-89'>#</a>
+ </div>
+ <p>Emit a Singer ACTIVATE_VERSION message before the initial sync (but not before subsequent syncs)
+and after every sheet sync completes.
+This forces hard deletes on the data downstream if fewer records are sent.
+https://github.com/singer-io/singer-python/blob/master/singer/messages.py#L137</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">last_integer</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">get_bookmark</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">sheet_title</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
+ <span class="n">activate_version</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="n">time</span><span class="o">.</span><span class="n">time</span><span class="p">()</span> <span class="o">*</span> <span class="mi">1000</span><span class="p">)</span>
+ <span class="n">activate_version_message</span> <span class="o">=</span> <span class="n">singer</span><span class="o">.</span><span class="n">ActivateVersionMessage</span><span class="p">(</span>
+ <span class="n">stream</span><span class="o">=</span><span class="n">sheet_title</span><span class="p">,</span>
+ <span class="n">version</span><span class="o">=</span><span class="n">activate_version</span><span class="p">)</span>
+ <span class="k">if</span> <span class="n">last_integer</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-90'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-90'>#</a>
+ </div>
+ <p>Initial load: send activate_version before AND after data sync</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">singer</span><span class="o">.</span><span class="n">write_message</span><span class="p">(</span><span class="n">activate_version_message</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'INITIAL SYNC, Stream: </span><span class="si">{}</span><span class="s1">, Activate Version: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">,</span> <span class="n">activate_version</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
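The versioning rule described above (a millisecond timestamp as the version, emitted up front only on the first sync) can be sketched without the Singer library. The helper names below are hypothetical, and the plain dict is assumed to mirror the serialized form of an ACTIVATE_VERSION message; the original builds a `singer.ActivateVersionMessage` and writes it with `singer.write_message`.

```python
import time


def new_activate_version():
    """Use the current epoch time in milliseconds as the table version."""
    return int(time.time() * 1000)


def build_activate_version(stream, version):
    """Build a dict in the assumed shape of a serialized ACTIVATE_VERSION message."""
    return {'type': 'ACTIVATE_VERSION', 'stream': stream, 'version': version}


def emit_before_sync(last_integer):
    """A bookmark of 0 means the sheet has never been synced.

    Only in that case is the message sent before the data as well as after.
    """
    return last_integer == 0
```

Because each run generates a larger version number than the last, a target that honours the message can drop rows belonging to older versions once the new version is activated.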
+ <div class='section' id='section-91'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-91'>#</a>
+ </div>
+ <p>Determine max range of columns and rows for “paging” through the data</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_last_col_index</span> <span class="o">=</span> <span class="mi">1</span>
+ <span class="n">sheet_last_col_letter</span> <span class="o">=</span> <span class="s1">'A'</span>
+ <span class="k">for</span> <span class="n">col</span> <span class="ow">in</span> <span class="n">columns</span><span class="p">:</span>
+ <span class="n">col_index</span> <span class="o">=</span> <span class="n">col</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'columnIndex'</span><span class="p">)</span>
+ <span class="n">col_letter</span> <span class="o">=</span> <span class="n">col</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'columnLetter'</span><span class="p">)</span>
+ <span class="k">if</span> <span class="n">col_index</span> <span class="o">></span> <span class="n">sheet_last_col_index</span><span class="p">:</span>
+ <span class="n">sheet_last_col_index</span> <span class="o">=</span> <span class="n">col_index</span>
+ <span class="n">sheet_last_col_letter</span> <span class="o">=</span> <span class="n">col_letter</span>
+ <span class="n">sheet_max_row</span> <span class="o">=</span> <span class="n">sheet</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'properties'</span><span class="p">)</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'gridProperties'</span><span class="p">,</span> <span class="p">{})</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'rowCount'</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
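+ <p>The column scan above can also be written with <code>max</code>. This is only a sketch: <code>last_column</code> is a hypothetical helper, and it assumes every column dict carries both <code>columnIndex</code> and <code>columnLetter</code>, as the loop above reads them.</p>

```python
def last_column(columns):
    """Return (index, letter) of the right-most column.

    Falls back to (1, 'A') for an empty list, matching the defaults
    the loop above starts from.
    """
    if not columns:
        return 1, "A"
    last = max(columns, key=lambda col: col["columnIndex"])
    return last["columnIndex"], last["columnLetter"]


if __name__ == "__main__":
    cols = [
        {"columnIndex": 1, "columnLetter": "A"},
        {"columnIndex": 3, "columnLetter": "C"},
        {"columnIndex": 2, "columnLetter": "B"},
    ]
    print(last_column(cols))
```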
+ <div class='section' id='section-92'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-92'>#</a>
+ </div>
+ <p>Initialize paging for the first batch</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">is_last_row</span> <span class="o">=</span> <span class="kc">False</span>
+ <span class="n">batch_rows</span> <span class="o">=</span> <span class="mi">200</span>
+ <span class="n">from_row</span> <span class="o">=</span> <span class="mi">2</span>
+ <span class="k">if</span> <span class="n">sheet_max_row</span> <span class="o"><</span> <span class="n">batch_rows</span><span class="p">:</span>
+ <span class="n">to_row</span> <span class="o">=</span> <span class="n">sheet_max_row</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">to_row</span> <span class="o">=</span> <span class="n">batch_rows</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-93'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-93'>#</a>
+ </div>
+ <p>Loop through batches (up to 200 rows of data each)</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="k">while</span> <span class="ow">not</span> <span class="n">is_last_row</span> <span class="ow">and</span> <span class="n">from_row</span> <span class="o"><</span> <span class="n">sheet_max_row</span> <span class="ow">and</span> <span class="n">to_row</span> <span class="o"><=</span> <span class="n">sheet_max_row</span><span class="p">:</span>
+ <span class="n">range_rows</span> <span class="o">=</span> <span class="s1">'A</span><span class="si">{}</span><span class="s1">:</span><span class="si">{}{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">from_row</span><span class="p">,</span> <span class="n">sheet_last_col_letter</span><span class="p">,</span> <span class="n">to_row</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-94'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-94'>#</a>
+ </div>
+ <p>GET sheet_data for a worksheet tab</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_data</span><span class="p">,</span> <span class="n">time_extracted</span> <span class="o">=</span> <span class="n">get_data</span><span class="p">(</span>
+ <span class="n">stream_name</span><span class="o">=</span><span class="n">sheet_title</span><span class="p">,</span>
+ <span class="n">endpoint_config</span><span class="o">=</span><span class="n">sheets_loaded_config</span><span class="p">,</span>
+ <span class="n">client</span><span class="o">=</span><span class="n">client</span><span class="p">,</span>
+ <span class="n">spreadsheet_id</span><span class="o">=</span><span class="n">spreadsheet_id</span><span class="p">,</span>
+ <span class="n">range_rows</span><span class="o">=</span><span class="n">range_rows</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-95'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-95'>#</a>
+ </div>
+ <p>Data is returned as a list of arrays: one array of values per row</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_data_rows</span> <span class="o">=</span> <span class="n">sheet_data</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'values'</span><span class="p">,</span> <span class="p">[])</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-96'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-96'>#</a>
+ </div>
+ <p>Transform batch of rows to JSON with keys for each column</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_data_tf</span><span class="p">,</span> <span class="n">row_num</span> <span class="o">=</span> <span class="n">transform_sheet_data</span><span class="p">(</span>
+ <span class="n">spreadsheet_id</span><span class="o">=</span><span class="n">spreadsheet_id</span><span class="p">,</span>
+ <span class="n">sheet_id</span><span class="o">=</span><span class="n">sheet_id</span><span class="p">,</span>
+ <span class="n">sheet_title</span><span class="o">=</span><span class="n">sheet_title</span><span class="p">,</span>
+ <span class="n">from_row</span><span class="o">=</span><span class="n">from_row</span><span class="p">,</span>
+ <span class="n">columns</span><span class="o">=</span><span class="n">columns</span><span class="p">,</span>
+ <span class="n">sheet_data_rows</span><span class="o">=</span><span class="n">sheet_data_rows</span><span class="p">)</span>
+ <span class="k">if</span> <span class="n">row_num</span> <span class="o"><</span> <span class="n">to_row</span><span class="p">:</span>
+ <span class="n">is_last_row</span> <span class="o">=</span> <span class="kc">True</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
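+ <p>To make the transform step more concrete, here is a minimal sketch of turning a batch of value arrays into keyed records. It is not the body of <code>transform_sheet_data</code>: the helper name <code>rows_to_records</code>, the <code>columnName</code> key, and the <code>row_number</code> field are assumptions for illustration, and padding short rows with <code>None</code> is one possible way to handle rows that come back with fewer values than there are columns.</p>

```python
def rows_to_records(columns, rows, from_row):
    """Turn a batch of value arrays into dicts keyed by column name.

    Assumes each column dict has a 1-based 'columnIndex' and a
    'columnName'. Returns the records and the row number that
    follows the batch.
    """
    cols = sorted(columns, key=lambda col: col["columnIndex"])
    records = []
    row_num = from_row
    for row in rows:
        if row:  # completely empty rows are skipped but still counted
            record = {"row_number": row_num}
            for col in cols:
                idx = col["columnIndex"] - 1
                # Pad missing trailing cells with None
                record[col["columnName"]] = row[idx] if idx < len(row) else None
            records.append(record)
        row_num += 1
    return records, row_num


if __name__ == "__main__":
    columns = [
        {"columnIndex": 1, "columnName": "name"},
        {"columnIndex": 2, "columnName": "qty"},
    ]
    print(rows_to_records(columns, [["apple", "3"], [], ["pear"]], from_row=2))
```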
+ <div class='section' id='section-97'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-97'>#</a>
+ </div>
+ <p>Process records, send batch of records to target</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">record_count</span> <span class="o">=</span> <span class="n">process_records</span><span class="p">(</span>
+ <span class="n">catalog</span><span class="o">=</span><span class="n">catalog</span><span class="p">,</span>
+ <span class="n">stream_name</span><span class="o">=</span><span class="n">sheet_title</span><span class="p">,</span>
+ <span class="n">records</span><span class="o">=</span><span class="n">sheet_data_tf</span><span class="p">,</span>
+ <span class="n">time_extracted</span><span class="o">=</span><span class="n">ss_time_extracted</span><span class="p">,</span>
+ <span class="n">version</span><span class="o">=</span><span class="n">activate_version</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'Sheet: </span><span class="si">{}</span><span class="s1">, records processed: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">record_count</span><span class="p">))</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-98'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-98'>#</a>
+ </div>
+ <p>Update paging from_row/to_row for the next batch</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">from_row</span> <span class="o">=</span> <span class="n">to_row</span> <span class="o">+</span> <span class="mi">1</span>
+ <span class="k">if</span> <span class="n">to_row</span> <span class="o">+</span> <span class="n">batch_rows</span> <span class="o">></span> <span class="n">sheet_max_row</span><span class="p">:</span>
+ <span class="n">to_row</span> <span class="o">=</span> <span class="n">sheet_max_row</span>
+ <span class="k">else</span><span class="p">:</span>
+ <span class="n">to_row</span> <span class="o">=</span> <span class="n">to_row</span> <span class="o">+</span> <span class="n">batch_rows</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
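+ <p>The paging arithmetic above can be sketched as a generator of A1-notation ranges. This is a simplified variant, not the exact loop in this file: every batch here spans a full <code>batch_rows</code> rows, and there is no early exit when a batch comes back short. The name <code>batch_ranges</code> and its parameters are hypothetical.</p>

```python
def batch_ranges(sheet_max_row, last_col_letter, batch_rows=200, first_data_row=2):
    """Yield A1-notation ranges covering first_data_row..sheet_max_row.

    Row 1 is assumed to be the header row, so data starts at row 2.
    """
    from_row = first_data_row
    while from_row <= sheet_max_row:
        # Clamp the end of the batch to the last row of the sheet
        to_row = min(from_row + batch_rows - 1, sheet_max_row)
        yield "A{}:{}{}".format(from_row, last_col_letter, to_row)
        from_row = to_row + 1


if __name__ == "__main__":
    for cell_range in batch_ranges(450, "D"):
        print(cell_range)
```

+ <p>Each yielded string could be passed as <code>range_rows</code> to a fetch call such as the <code>get_data</code> call shown earlier.</p>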
+ <div class='section' id='section-99'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-99'>#</a>
+ </div>
+ <p>End of Stream: Send Activate Version and update State</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">singer</span><span class="o">.</span><span class="n">write_message</span><span class="p">(</span><span class="n">activate_version_message</span><span class="p">)</span>
+ <span class="n">write_bookmark</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">sheet_title</span><span class="p">,</span> <span class="n">activate_version</span><span class="p">)</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'COMPLETE SYNC, Stream: </span><span class="si">{}</span><span class="s1">, Activate Version: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">sheet_title</span><span class="p">,</span> <span class="n">activate_version</span><span class="p">))</span>
+ <span class="n">LOGGER</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'FINISHED Syncing Sheet </span><span class="si">{}</span><span class="s1">, Total Rows: </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span>
+ <span class="n">sheet_title</span><span class="p">,</span> <span class="n">row_num</span> <span class="o">-</span> <span class="mi">2</span><span class="p">))</span> <span class="c1"># subtract 2: the header row and row_num's final increment past the last row</span>
+ <span class="n">update_currently_syncing</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="kc">None</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-100'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-100'>#</a>
+ </div>
+ <p>SHEETS_LOADED: add this sheet to sheets_loaded</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sheet_loaded</span> <span class="o">=</span> <span class="p">{}</span>
+ <span class="n">sheet_loaded</span><span class="p">[</span><span class="s1">'spreadsheetId'</span><span class="p">]</span> <span class="o">=</span> <span class="n">spreadsheet_id</span>
+ <span class="n">sheet_loaded</span><span class="p">[</span><span class="s1">'sheetId'</span><span class="p">]</span> <span class="o">=</span> <span class="n">sheet_id</span>
+ <span class="n">sheet_loaded</span><span class="p">[</span><span class="s1">'title'</span><span class="p">]</span> <span class="o">=</span> <span class="n">sheet_title</span>
+ <span class="n">sheet_loaded</span><span class="p">[</span><span class="s1">'loadDate'</span><span class="p">]</span> <span class="o">=</span> <span class="n">strftime</span><span class="p">(</span><span class="n">utils</span><span class="o">.</span><span class="n">now</span><span class="p">())</span>
+ <span class="n">sheet_loaded</span><span class="p">[</span><span class="s1">'lastRowNumber'</span><span class="p">]</span> <span class="o">=</span> <span class="n">row_num</span>
+ <span class="n">sheets_loaded</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">sheet_loaded</span><span class="p">)</span>
+
+ <span class="n">stream_name</span> <span class="o">=</span> <span class="s1">'sheet_metadata'</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-101'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-101'>#</a>
+ </div>
+ <p>Write sheet_metadata records if selected</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sync_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">selected_streams</span><span class="p">,</span> <span class="n">catalog</span><span class="p">,</span> <span class="n">state</span><span class="p">,</span> <span class="n">sheet_metadata</span><span class="p">)</span>
+
+ <span class="n">stream_name</span> <span class="o">=</span> <span class="s1">'sheets_loaded'</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-102'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-102'>#</a>
+ </div>
+ <p>Write sheets_loaded records if selected</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">sync_stream</span><span class="p">(</span><span class="n">stream_name</span><span class="p">,</span> <span class="n">selected_streams</span><span class="p">,</span> <span class="n">catalog</span><span class="p">,</span> <span class="n">state</span><span class="p">,</span> <span class="n">sheets_loaded</span><span class="p">)</span></pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+ <div class='section' id='section-103'>
+ <div class='docs'>
+ <div class='octowrap'>
+ <a class='octothorpe' href='#section-103'>#</a>
+ </div>
+ <p>Update file_metadata bookmark</p>
+ </div>
+ <div class='code'>
+ <div class="highlight"><pre> <span class="n">write_bookmark</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="s1">'file_metadata'</span><span class="p">,</span> <span class="n">strftime</span><span class="p">(</span><span class="n">this_datetime</span><span class="p">))</span>
+
+ <span class="k">return</span>
+
+</pre></div>
+ </div>
+ </div>
+ <div class='clearall'></div>
+</div>
+</body>