id,node_id,name,full_name,private,owner,html_url,description,fork,created_at,updated_at,pushed_at,homepage,size,stargazers_count,watchers_count,language,has_issues,has_projects,has_downloads,has_wiki,has_pages,forks_count,archived,disabled,open_issues_count,license,topics,forks,open_issues,watchers,default_branch,permissions,temp_clone_token,organization,network_count,subscribers_count,readme,readme_html,allow_forking,visibility,is_template,template_repository,web_commit_signoff_required,has_discussions 110509816,MDEwOlJlcG9zaXRvcnkxMTA1MDk4MTY=,csvs-to-sqlite,simonw/csvs-to-sqlite,0,9599,https://github.com/simonw/csvs-to-sqlite,Convert CSV files into a SQLite database,0,2017-11-13T06:38:21Z,2021-11-18T16:33:39Z,2021-11-18T16:35:33Z,,138,655,655,Python,1,1,1,1,0,50,0,0,34,apache-2.0,"[""click"", ""csv"", ""datasette"", ""datasette-io"", ""datasette-tool"", ""pandas"", ""python"", ""sqlite""]",50,34,655,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,,50,17,"# csvs-to-sqlite [![PyPI](https://img.shields.io/pypi/v/csvs-to-sqlite.svg)](https://pypi.org/project/csvs-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/simonw/csvs-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/csvs-to-sqlite/releases) [![Tests](https://github.com/simonw/csvs-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/csvs-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/csvs-to-sqlite/blob/main/LICENSE) Convert CSV files into a SQLite database. Browse and publish that SQLite database with [Datasette](https://github.com/simonw/datasette). Basic usage: csvs-to-sqlite myfile.csv mydatabase.db This will create a new SQLite database called `mydatabase.db` containing a single table, `myfile`, containing the CSV content. You can provide multiple CSV files: csvs-to-sqlite one.csv two.csv bundle.db The `bundle.db` database will contain two tables, `one` and `two`. This means you can use wildcards: csvs-to-sqlite ~/Downloads/*.csv my-downloads.db If you pass a path to one or more directories, the script will recursively search those directories for CSV files and create tables for each one. csvs-to-sqlite ~/path/to/directory all-my-csvs.db ## Handling TSV (tab-separated values) You can use the `-s` option to specify a different delimiter. If you want to use a tab character you'll need to apply shell escaping like so: csvs-to-sqlite my-file.tsv my-file.db -s $'\t' ## Refactoring columns into separate lookup tables Let's say you have a CSV file that looks like this: county,precinct,office,district,party,candidate,votes Clark,1,President,,REP,John R. Kasich,5 Clark,2,President,,REP,John R. Kasich,0 Clark,3,President,,REP,John R. Kasich,7 ([Real example taken from the Open Elections project](https://github.com/openelections/openelections-data-sd/blob/master/2016/20160607__sd__primary__clark__precinct.csv)) You can now convert selected columns into separate lookup tables using the new `--extract-column` option (shortname: `-c`) - for example: csvs-to-sqlite openelections-data-*/*.csv \ -c county:County:name \ -c precinct:Precinct:name \ -c office -c district -c party -c candidate \ openelections.db The format is as follows: column_name:optional_table_name:optional_table_value_column_name If you just specify the column name e.g. 
`-c office`, the following table will be created: CREATE TABLE ""office"" ( ""id"" INTEGER PRIMARY KEY, ""value"" TEXT ); If you specify all three options, e.g. `-c precinct:Precinct:name` the table will look like this: CREATE TABLE ""Precinct"" ( ""id"" INTEGER PRIMARY KEY, ""name"" TEXT ); The original tables will be created like this: CREATE TABLE ""ca__primary__san_francisco__precinct"" ( ""county"" INTEGER, ""precinct"" INTEGER, ""office"" INTEGER, ""district"" INTEGER, ""party"" INTEGER, ""candidate"" INTEGER, ""votes"" INTEGER, FOREIGN KEY (county) REFERENCES County(id), FOREIGN KEY (party) REFERENCES party(id), FOREIGN KEY (precinct) REFERENCES Precinct(id), FOREIGN KEY (office) REFERENCES office(id), FOREIGN KEY (candidate) REFERENCES candidate(id) ); They will be populated with IDs that reference the new derived tables. ## Installation $ pip install csvs-to-sqlite `csvs-to-sqlite` now requires Python 3. If you are running Python 2 you can install the last version to support Python 2: $ pip install csvs-to-sqlite==0.9.2 ## csvs-to-sqlite --help ``` Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME PATHS: paths to individual .csv files or to directories containing .csvs DBNAME: name of the SQLite database file to create Options: -s, --separator TEXT Field separator in input .csv -q, --quoting INTEGER Control field quoting behavior per csv.QUOTE_* constants. Use one of QUOTE_MINIMAL (0), QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or QUOTE_NONE (3). --skip-errors Skip lines with too many fields instead of stopping the import --replace-tables Replace tables if they already exist -t, --table TEXT Table to use (instead of using CSV filename) -c, --extract-column TEXT One or more columns to 'extract' into a separate lookup table. If you pass a simple column name that column will be replaced with integer foreign key references to a new table of that name. You can customize the name of the table like so: state:States:state_name This will pull unique values from the 'state' column and use them to populate a new 'States' table, with an id column primary key and a state_name column containing the strings from the original column. -d, --date TEXT One or more columns to parse into ISO formatted dates -dt, --datetime TEXT One or more columns to parse into ISO formatted datetimes -df, --datetime-format TEXT One or more custom date format strings to try when parsing dates/datetimes -pk, --primary-key TEXT One or more columns to use as the primary key -f, --fts TEXT One or more columns to use to populate a full- text index -i, --index TEXT Add index on this column (or a compound index with -i col1,col2) --shape TEXT Custom shape for the DB table - format is csvcol:dbcol(TYPE),... --filename-column TEXT Add a column with this name and populate with CSV file name --fixed-column ... Populate column with a fixed string --fixed-column-int ... Populate column with a fixed integer --fixed-column-float ... Populate column with a fixed float --no-index-fks Skip adding index to foreign key columns created using --extract-column (default is to add them) --no-fulltext-fks Skip adding full-text index on values extracted using --extract-column (default is to add them) --just-strings Import all columns as text strings by default (and, if specified, still obey --shape, --date/datetime, and --datetime-format) --version Show the version and exit. --help Show this message and exit. ``` ","

csvs-to-sqlite

Convert CSV files into a SQLite database. Browse and publish that SQLite database with Datasette.

Basic usage:

csvs-to-sqlite myfile.csv mydatabase.db

This will create a new SQLite database called mydatabase.db with a single table, myfile, containing the CSV content.
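
A quick way to sanity-check the result is a short Python sketch using the standard sqlite3 module (the myfile table name comes from the CSV filename, as described above):

import sqlite3

# Count the rows that csvs-to-sqlite imported into the myfile table
conn = sqlite3.connect('mydatabase.db')
print(conn.execute('select count(*) from myfile').fetchone()[0])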

You can provide multiple CSV files:

csvs-to-sqlite one.csv two.csv bundle.db

The bundle.db database will contain two tables, one and two.

This means you can use wildcards:

csvs-to-sqlite ~/Downloads/*.csv my-downloads.db

If you pass a path to one or more directories, the script will recursively search those directories for CSV files and create tables for each one.

csvs-to-sqlite ~/path/to/directory all-my-csvs.db

Handling TSV (tab-separated values)

You can use the -s option to specify a different delimiter. If you want to use a tab character you'll need to apply shell escaping like so:

csvs-to-sqlite my-file.tsv my-file.db -s $'\t'

Refactoring columns into separate lookup tables

Let's say you have a CSV file that looks like this:

county,precinct,office,district,party,candidate,votes
Clark,1,President,,REP,John R. Kasich,5
Clark,2,President,,REP,John R. Kasich,0
Clark,3,President,,REP,John R. Kasich,7

(Real example taken from the Open Elections project)

You can convert selected columns into separate lookup tables using the --extract-column option (short form: -c). For example:

csvs-to-sqlite openelections-data-*/*.csv \
    -c county:County:name \
    -c precinct:Precinct:name \
    -c office -c district -c party -c candidate \
    openelections.db

The format is as follows:

column_name:optional_table_name:optional_table_value_column_name

If you just specify the column name, e.g. -c office, the following table will be created:

CREATE TABLE ""office"" (
    ""id"" INTEGER PRIMARY KEY,
    ""value"" TEXT
);

If you specify all three options, e.g. -c precinct:Precinct:name, the table will look like this:

CREATE TABLE ""Precinct"" (
    ""id"" INTEGER PRIMARY KEY,
    ""name"" TEXT
);

The original tables will be created like this:

CREATE TABLE ""ca__primary__san_francisco__precinct"" (
    ""county"" INTEGER,
    ""precinct"" INTEGER,
    ""office"" INTEGER,
    ""district"" INTEGER,
    ""party"" INTEGER,
    ""candidate"" INTEGER,
    ""votes"" INTEGER,
    FOREIGN KEY (county) REFERENCES County(id),
    FOREIGN KEY (party) REFERENCES party(id),
    FOREIGN KEY (precinct) REFERENCES Precinct(id),
    FOREIGN KEY (office) REFERENCES office(id),
    FOREIGN KEY (candidate) REFERENCES candidate(id)
);

They will be populated with IDs that reference the new derived tables.
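
To recover the original string values you can join the lookup tables back in. Here is a minimal Python sketch, assuming openelections.db was built with the command above and using the table and column names from the schema shown:

import sqlite3

# Join the extracted lookup tables back onto one of the per-file tables
conn = sqlite3.connect('openelections.db')
sql = '''
    select County.name, Precinct.name, office.value, p.votes
    from [ca__primary__san_francisco__precinct] as p
    join County on p.county = County.id
    join Precinct on p.precinct = Precinct.id
    join office on p.office = office.id
    limit 5
'''
for row in conn.execute(sql):
    print(row)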

Installation

$ pip install csvs-to-sqlite

csvs-to-sqlite now requires Python 3. If you are running Python 2 you can install the last version to support Python 2:

$ pip install csvs-to-sqlite==0.9.2

csvs-to-sqlite --help

Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME

  PATHS: paths to individual .csv files or to directories containing .csvs

  DBNAME: name of the SQLite database file to create

Options:
  -s, --separator TEXT            Field separator in input .csv
  -q, --quoting INTEGER           Control field quoting behavior per csv.QUOTE_*
                                  constants. Use one of QUOTE_MINIMAL (0),
                                  QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or
                                  QUOTE_NONE (3).

  --skip-errors                   Skip lines with too many fields instead of
                                  stopping the import

  --replace-tables                Replace tables if they already exist
  -t, --table TEXT                Table to use (instead of using CSV filename)
  -c, --extract-column TEXT       One or more columns to 'extract' into a
                                  separate lookup table. If you pass a simple
                                  column name that column will be replaced with
                                  integer foreign key references to a new table
                                  of that name. You can customize the name of
                                  the table like so:     state:States:state_name
                                  
                                  This will pull unique values from the 'state'
                                  column and use them to populate a new 'States'
                                  table, with an id column primary key and a
                                  state_name column containing the strings from
                                  the original column.

  -d, --date TEXT                 One or more columns to parse into ISO
                                  formatted dates

  -dt, --datetime TEXT            One or more columns to parse into ISO
                                  formatted datetimes

  -df, --datetime-format TEXT     One or more custom date format strings to try
                                  when parsing dates/datetimes

  -pk, --primary-key TEXT         One or more columns to use as the primary key
  -f, --fts TEXT                  One or more columns to use to populate a full-
                                  text index

  -i, --index TEXT                Add index on this column (or a compound index
                                  with -i col1,col2)

  --shape TEXT                    Custom shape for the DB table - format is
                                  csvcol:dbcol(TYPE),...

  --filename-column TEXT          Add a column with this name and populate with
                                  CSV file name

  --fixed-column <TEXT TEXT>...   Populate column with a fixed string
  --fixed-column-int <TEXT INTEGER>...
                                  Populate column with a fixed integer
  --fixed-column-float <TEXT FLOAT>...
                                  Populate column with a fixed float
  --no-index-fks                  Skip adding index to foreign key columns
                                  created using --extract-column (default is to
                                  add them)

  --no-fulltext-fks               Skip adding full-text index on values
                                  extracted using --extract-column (default is
                                  to add them)

  --just-strings                  Import all columns as text strings by default
                                  (and, if specified, still obey --shape,
                                  --date/datetime, and --datetime-format)

  --version                       Show the version and exit.
  --help                          Show this message and exit.

",1,public,0,,, 140912432,MDEwOlJlcG9zaXRvcnkxNDA5MTI0MzI=,sqlite-utils,simonw/sqlite-utils,0,9599,https://github.com/simonw/sqlite-utils,Python CLI utility and library for manipulating SQLite databases,0,2018-07-14T03:21:46Z,2022-11-15T18:12:16Z,2022-11-15T15:53:38Z,https://sqlite-utils.datasette.io,1437,1029,1029,Python,1,1,1,1,0,79,0,0,72,apache-2.0,"[""cli"", ""click"", ""datasette"", ""datasette-io"", ""datasette-tool"", ""python"", ""sqlite"", ""sqlite-database""]",79,72,1029,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,,79,16,"# sqlite-utils [![PyPI](https://img.shields.io/pypi/v/sqlite-utils.svg)](https://pypi.org/project/sqlite-utils/) [![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-utils?include_prereleases&label=changelog)](https://sqlite-utils.datasette.io/en/stable/changelog.html) [![Python 3.x](https://img.shields.io/pypi/pyversions/sqlite-utils.svg?logo=python&logoColor=white)](https://pypi.org/project/sqlite-utils/) [![Tests](https://github.com/simonw/sqlite-utils/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-utils/actions?query=workflow%3ATest) [![Documentation Status](https://readthedocs.org/projects/sqlite-utils/badge/?version=stable)](http://sqlite-utils.datasette.io/en/stable/?badge=stable) [![codecov](https://codecov.io/gh/simonw/sqlite-utils/branch/main/graph/badge.svg)](https://codecov.io/gh/simonw/sqlite-utils) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-utils/blob/main/LICENSE) [![discord](https://img.shields.io/discord/823971286308356157?label=discord)](https://discord.gg/Ass7bCAMDw) Python CLI utility and library for manipulating SQLite databases. ## Some feature highlights - [Pipe JSON](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-json-data) (or [CSV or TSV](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-csv-or-tsv-data)) directly into a new SQLite database file, automatically creating a table with the appropriate schema - [Run in-memory SQL queries](https://sqlite-utils.datasette.io/en/stable/cli.html#querying-data-directly-using-an-in-memory-database), including joins, directly against data in CSV, TSV or JSON files and view the results - [Configure SQLite full-text search](https://sqlite-utils.datasette.io/en/stable/cli.html#configuring-full-text-search) against your database tables and run search queries against them, ordered by relevance - Run [transformations against your tables](https://sqlite-utils.datasette.io/en/stable/cli.html#transforming-tables) to make schema changes that SQLite `ALTER TABLE` does not directly support, such as changing the type of a column - [Extract columns](https://sqlite-utils.datasette.io/en/stable/cli.html#extracting-columns-into-a-separate-table) into separate tables to better normalize your existing data Read more on my blog, in this series of posts on [New features in sqlite-utils](https://simonwillison.net/series/sqlite-utils-features/) and other [entries tagged sqliteutils](https://simonwillison.net/tags/sqliteutils/). 
## Installation pip install sqlite-utils Or if you use [Homebrew](https://brew.sh/) for macOS: brew install sqlite-utils ## Using as a CLI tool Now you can do things with the CLI utility like this: $ sqlite-utils memory dogs.csv ""select * from t"" [{""id"": 1, ""age"": 4, ""name"": ""Cleo""}, {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}] $ sqlite-utils insert dogs.db dogs dogs.csv --csv [####################################] 100% $ sqlite-utils tables dogs.db --counts [{""table"": ""dogs"", ""count"": 2}] $ sqlite-utils dogs.db ""select id, name from dogs"" [{""id"": 1, ""name"": ""Cleo""}, {""id"": 2, ""name"": ""Pancakes""}] $ sqlite-utils dogs.db ""select * from dogs"" --csv id,age,name 1,4,Cleo 2,2,Pancakes $ sqlite-utils dogs.db ""select * from dogs"" --table id age name ---- ----- -------- 1 4 Cleo 2 2 Pancakes You can import JSON data into a new database table like this: $ curl https://api.github.com/repos/simonw/sqlite-utils/releases \ | sqlite-utils insert releases.db releases - --pk id Or for data in a CSV file: $ sqlite-utils insert dogs.db dogs dogs.csv --csv `sqlite-utils memory` lets you import CSV or JSON data into an in-memory database and run SQL queries against it in a single command: $ cat dogs.csv | sqlite-utils memory - ""select name, age from stdin"" See the [full CLI documentation](https://sqlite-utils.datasette.io/en/stable/cli.html) for comprehensive coverage of many more commands. ## Using as a library You can also `import sqlite_utils` and use it as a Python library like this: ```python import sqlite_utils db = sqlite_utils.Database(""demo_database.db"") # This line creates a ""dogs"" table if one does not already exist: db[""dogs""].insert_all([ {""id"": 1, ""age"": 4, ""name"": ""Cleo""}, {""id"": 2, ""age"": 2, ""name"": ""Pancakes""} ], pk=""id"") ``` Check out the [full library documentation](https://sqlite-utils.datasette.io/en/stable/python-api.html) for everything else you can do with the Python library. ## Related projects * [Datasette](https://datasette.io/): A tool for exploring and publishing data * [csvs-to-sqlite](https://github.com/simonw/csvs-to-sqlite): Convert CSV files into a SQLite database * [db-to-sqlite](https://github.com/simonw/db-to-sqlite): CLI tool for exporting a MySQL or PostgreSQL database as a SQLite file * [dogsheep](https://dogsheep.github.io/): A family of tools for personal analytics, built on top of `sqlite-utils` ","

sqlite-utils

Python CLI utility and library for manipulating SQLite databases.

Some feature highlights

  • Pipe JSON (or CSV or TSV) directly into a new SQLite database file, automatically creating a table with the appropriate schema
  • Run in-memory SQL queries, including joins, directly against data in CSV, TSV or JSON files and view the results
  • Configure SQLite full-text search against your database tables and run search queries against them, ordered by relevance
  • Run transformations against your tables to make schema changes that SQLite ALTER TABLE does not directly support, such as changing the type of a column
  • Extract columns into separate tables to better normalize your existing data

Read more on my blog, in this series of posts on New features in sqlite-utils and other entries tagged sqliteutils.

Installation

pip install sqlite-utils

Or if you use Homebrew for macOS:

brew install sqlite-utils

Using as a CLI tool

Now you can do things with the CLI utility like this:

$ sqlite-utils memory dogs.csv ""select * from t""
[{""id"": 1, ""age"": 4, ""name"": ""Cleo""},
 {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}]

$ sqlite-utils insert dogs.db dogs dogs.csv --csv
[####################################]  100%

$ sqlite-utils tables dogs.db --counts
[{""table"": ""dogs"", ""count"": 2}]

$ sqlite-utils dogs.db ""select id, name from dogs""
[{""id"": 1, ""name"": ""Cleo""},
 {""id"": 2, ""name"": ""Pancakes""}]

$ sqlite-utils dogs.db ""select * from dogs"" --csv
id,age,name
1,4,Cleo
2,2,Pancakes

$ sqlite-utils dogs.db ""select * from dogs"" --table
  id    age  name
----  -----  --------
   1      4  Cleo
   2      2  Pancakes

You can import JSON data into a new database table like this:

$ curl https://api.github.com/repos/simonw/sqlite-utils/releases \
    | sqlite-utils insert releases.db releases - --pk id

Or for data in a CSV file:

$ sqlite-utils insert dogs.db dogs dogs.csv --csv

sqlite-utils memory lets you import CSV or JSON data into an in-memory database and run SQL queries against it in a single command:

$ cat dogs.csv | sqlite-utils memory - ""select name, age from stdin""

See the full CLI documentation for comprehensive coverage of many more commands.

Using as a library

You can also import sqlite_utils and use it as a Python library like this:

import sqlite_utils
db = sqlite_utils.Database(""demo_database.db"")
# This line creates a ""dogs"" table if one does not already exist:
db[""dogs""].insert_all([
    {""id"": 1, ""age"": 4, ""name"": ""Cleo""},
    {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}
], pk=""id"")
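
Reading the data back out is just as simple. A minimal sketch continuing from the snippet above:

# Iterate over the rows in the dogs table as dictionaries
for row in db['dogs'].rows:
    print(row)

# Or run a SQL query against the same database and get dictionaries back
for row in db.query('select name from dogs where age > 3'):
    print(row)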

Check out the full library documentation for everything else you can do with the Python library.

Related projects

  • Datasette: A tool for exploring and publishing data
  • csvs-to-sqlite: Convert CSV files into a SQLite database
  • db-to-sqlite: CLI tool for exporting a MySQL or PostgreSQL database as a SQLite file
  • dogsheep: A family of tools for personal analytics, built on top of sqlite-utils
",1,public,0,,0,0 166159072,MDEwOlJlcG9zaXRvcnkxNjYxNTkwNzI=,db-to-sqlite,simonw/db-to-sqlite,0,9599,https://github.com/simonw/db-to-sqlite,CLI tool for exporting tables or queries from any SQL database to a SQLite file,0,2019-01-17T04:16:48Z,2021-06-11T22:52:12Z,2021-06-11T22:55:56Z,,77,226,226,Python,1,1,1,1,0,12,0,0,2,apache-2.0,"[""sqlalchemy"", ""sqlite"", ""datasette"", ""datasette-io"", ""datasette-tool""]",12,2,226,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,12,4,"# db-to-sqlite [![PyPI](https://img.shields.io/pypi/v/db-to-sqlite.svg)](https://pypi.python.org/pypi/db-to-sqlite) [![Changelog](https://img.shields.io/github/v/release/simonw/db-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/db-to-sqlite/releases) [![Tests](https://github.com/simonw/db-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/db-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/db-to-sqlite/blob/main/LICENSE) CLI tool for exporting tables or queries from any SQL database to a SQLite file. ## Installation Install from PyPI like so: pip install db-to-sqlite If you want to use it with MySQL, you can install the extra dependency like this: pip install 'db-to-sqlite[mysql]' Installing the `mysqlclient` library on OS X can be tricky - I've found [this recipe](https://gist.github.com/simonw/90ac0afd204cd0d6d9c3135c3888d116) to work (run that before installing `db-to-sqlite`). For PostgreSQL, use this: pip install 'db-to-sqlite[postgresql]' ## Usage Usage: db-to-sqlite [OPTIONS] CONNECTION PATH Load data from any database into SQLite. PATH is a path to the SQLite file to create, e.c. /tmp/my_database.db CONNECTION is a SQLAlchemy connection string, for example: postgresql://localhost/my_database postgresql://username:passwd@localhost/my_database mysql://root@localhost/my_database mysql://username:passwd@localhost/my_database More: https://docs.sqlalchemy.org/en/13/core/engines.html#database-urls Options: --version Show the version and exit. --all Detect and copy all tables --table TEXT Specific tables to copy --skip TEXT When using --all skip these tables --redact TEXT... (table, column) pairs to redact with *** --sql TEXT Optional SQL query to run --output TEXT Table in which to save --sql query results --pk TEXT Optional column to use as a primary key --index-fks / --no-index-fks Should foreign keys have indexes? Default on -p, --progress Show progress bar --postgres-schema TEXT PostgreSQL schema to use --help Show this message and exit. For example, to save the content of the `blog_entry` table from a PostgreSQL database to a local file called `blog.db` you could do this: db-to-sqlite ""postgresql://localhost/myblog"" blog.db \ --table=blog_entry You can specify `--table` more than once. You can also save the data from all of your tables, effectively creating a SQLite copy of your entire database. Any foreign key relationships will be detected and added to the SQLite database. 
For example: db-to-sqlite ""postgresql://localhost/myblog"" blog.db \ --all When running `--all` you can specify tables to skip using `--skip`: db-to-sqlite ""postgresql://localhost/myblog"" blog.db \ --all \ --skip=django_migrations If you want to save the results of a custom SQL query, do this: db-to-sqlite ""postgresql://localhost/myblog"" output.db \ --output=query_results \ --sql=""select id, title, created from blog_entry"" \ --pk=id The `--output` option specifies the table that should contain the results of the query. ## Using db-to-sqlite with PostgreSQL schemas If the tables you want to copy from your PostgreSQL database aren't in the default schema, you can specify an alternate one with the `--postgres-schema` option: db-to-sqlite ""postgresql://localhost/myblog"" blog.db \ --all \ --postgres-schema my_schema ## Using db-to-sqlite with Heroku Postgres If you run an application on [Heroku](https://www.heroku.com/) using their [Postgres database product](https://www.heroku.com/postgres), you can use the `heroku config` command to access a compatible connection string: $ heroku config --app myappname | grep HEROKU_POSTG HEROKU_POSTGRESQL_OLIVE_URL: postgres://username:password@ec2-xxx-xxx-xxx-x.compute-1.amazonaws.com:5432/dbname You can pass this to `db-to-sqlite` to create a local SQLite database with the data from your Heroku instance. You can even do this using a bash one-liner: $ db-to-sqlite $(heroku config --app myappname | grep HEROKU_POSTG | cut -d: -f 2-) \ /tmp/heroku.db --all -p 1/23: django_migrations ... 17/23: blog_blogmark [####################################] 100% ... ## Related projects * [Datasette](https://github.com/simonw/datasette): A tool for exploring and publishing data. Works great with SQLite files generated using `db-to-sqlite`. * [sqlite-utils](https://github.com/simonw/sqlite-utils): Python CLI utility and library for manipulating SQLite databases. * [csvs-to-sqlite](https://github.com/simonw/csvs-to-sqlite): Convert CSV files into a SQLite database. ## Development To set up this tool locally, first checkout the code. Then create a new virtual environment: cd db-to-sqlite python3 -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and test dependencies: pip install -e '.[test]' To run the tests: pytest This will skip tests against MySQL or PostgreSQL if you do not have their additional dependencies installed. You can install those extra dependencies like so: pip install -e '.[test_mysql,test_postgresql]' You can alternative use `pip install psycopg2-binary` if you cannot install the `psycopg2` dependency used by the `test_postgresql` extra. See [Running a MySQL server using Homebrew](https://til.simonwillison.net/homebrew/mysql-homebrew) for tips on running the tests against MySQL on macOS, including how to install the `mysqlclient` dependency. The PostgreSQL and MySQL tests default to expecting to run against servers on localhost. You can use environment variables to point them at different test database servers: - `MYSQL_TEST_DB_CONNECTION` - defaults to `mysql://root@localhost/test_db_to_sqlite` - `POSTGRESQL_TEST_DB_CONNECTION` - defaults to `postgresql://localhost/test_db_to_sqlite` The database you indicate in the environment variable - `test_db_to_sqlite` by default - will be deleted and recreated on every test run. ","

db-to-sqlite

CLI tool for exporting tables or queries from any SQL database to a SQLite file.

Installation

Install from PyPI like so:

pip install db-to-sqlite

If you want to use it with MySQL, you can install the extra dependency like this:

pip install 'db-to-sqlite[mysql]'

Installing the mysqlclient library on OS X can be tricky - I've found this recipe to work (run that before installing db-to-sqlite).

For PostgreSQL, use this:

pip install 'db-to-sqlite[postgresql]'

Usage

Usage: db-to-sqlite [OPTIONS] CONNECTION PATH

  Load data from any database into SQLite.

  PATH is a path to the SQLite file to create, e.g. /tmp/my_database.db

  CONNECTION is a SQLAlchemy connection string, for example:

      postgresql://localhost/my_database
      postgresql://username:passwd@localhost/my_database

      mysql://root@localhost/my_database
      mysql://username:passwd@localhost/my_database

  More: https://docs.sqlalchemy.org/en/13/core/engines.html#database-urls

Options:
  --version                     Show the version and exit.
  --all                         Detect and copy all tables
  --table TEXT                  Specific tables to copy
  --skip TEXT                   When using --all skip these tables
  --redact TEXT...              (table, column) pairs to redact with ***
  --sql TEXT                    Optional SQL query to run
  --output TEXT                 Table in which to save --sql query results
  --pk TEXT                     Optional column to use as a primary key
  --index-fks / --no-index-fks  Should foreign keys have indexes? Default on
  -p, --progress                Show progress bar
  --postgres-schema TEXT        PostgreSQL schema to use
  --help                        Show this message and exit.

For example, to save the content of the blog_entry table from a PostgreSQL database to a local file called blog.db you could do this:

db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
    --table=blog_entry

You can specify --table more than once.

You can also save the data from all of your tables, effectively creating a SQLite copy of your entire database. Any foreign key relationships will be detected and added to the SQLite database. For example:

db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
    --all

When running --all you can specify tables to skip using --skip:

db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
    --all \
    --skip=django_migrations

If you want to save the results of a custom SQL query, do this:

db-to-sqlite ""postgresql://localhost/myblog"" output.db \
    --output=query_results \
    --sql=""select id, title, created from blog_entry"" \
    --pk=id

The --output option specifies the table that should contain the results of the query.
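
As a quick check, here is a minimal Python sketch (assuming output.db was created by the command above) that reads the saved query results back:

import sqlite3

# Inspect the table that --output created for the --sql query results
conn = sqlite3.connect('output.db')
for row in conn.execute('select id, title, created from query_results limit 3'):
    print(row)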

Using db-to-sqlite with PostgreSQL schemas

If the tables you want to copy from your PostgreSQL database aren't in the default schema, you can specify an alternate one with the --postgres-schema option:

db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
    --all \
    --postgres-schema my_schema

Using db-to-sqlite with Heroku Postgres

If you run an application on Heroku using their Postgres database product, you can use the heroku config command to access a compatible connection string:

$ heroku config --app myappname | grep HEROKU_POSTG
HEROKU_POSTGRESQL_OLIVE_URL: postgres://username:password@ec2-xxx-xxx-xxx-x.compute-1.amazonaws.com:5432/dbname

You can pass this to db-to-sqlite to create a local SQLite database with the data from your Heroku instance.

You can even do this using a bash one-liner:

$ db-to-sqlite $(heroku config --app myappname | grep HEROKU_POSTG | cut -d: -f 2-) \
    /tmp/heroku.db --all -p
1/23: django_migrations
...
17/23: blog_blogmark
[####################################]  100%
...

Related projects

  • Datasette: A tool for exploring and publishing data. Works great with SQLite files generated using db-to-sqlite.
  • sqlite-utils: Python CLI utility and library for manipulating SQLite databases.
  • csvs-to-sqlite: Convert CSV files into a SQLite database.

Development

To set up this tool locally, first check out the code. Then create a new virtual environment:

cd db-to-sqlite
python3 -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest

This will skip tests against MySQL or PostgreSQL if you do not have their additional dependencies installed.

You can install those extra dependencies like so:

pip install -e '.[test_mysql,test_postgresql]'

You can alternatively use pip install psycopg2-binary if you cannot install the psycopg2 dependency used by the test_postgresql extra.

See Running a MySQL server using Homebrew for tips on running the tests against MySQL on macOS, including how to install the mysqlclient dependency.

By default the PostgreSQL and MySQL tests expect to run against servers on localhost. You can use environment variables to point them at different test database servers:

  • MYSQL_TEST_DB_CONNECTION - defaults to mysql://root@localhost/test_db_to_sqlite
  • POSTGRESQL_TEST_DB_CONNECTION - defaults to postgresql://localhost/test_db_to_sqlite

The database you indicate in the environment variable - test_db_to_sqlite by default - will be deleted and recreated on every test run.

",,,,,, 167759846,MDEwOlJlcG9zaXRvcnkxNjc3NTk4NDY=,markdown-to-sqlite,simonw/markdown-to-sqlite,0,9599,https://github.com/simonw/markdown-to-sqlite,CLI tool for loading markdown files into a SQLite database,0,2019-01-27T02:04:54Z,2022-05-13T18:09:26Z,2022-05-13T18:09:22Z,,13,49,49,Python,1,1,1,1,0,2,0,0,2,apache-2.0,"[""datasette-io"", ""datasette-tool"", ""markdown"", ""sqlite"", ""yaml""]",2,2,49,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,,2,3,"# markdown-to-sqlite [![PyPI](https://img.shields.io/pypi/v/markdown-to-sqlite.svg)](https://pypi.python.org/pypi/markdown-to-sqlite) [![Changelog](https://img.shields.io/github/v/release/simonw/markdown-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/markdown-to-sqlite/releases) [![Tests](https://github.com/simonw/markdown-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/markdown-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/markdown-to-sqlite/blob/main/LICENSE) CLI tool for loading markdown files into a SQLite database. YAML embedded in the markdown files will be used to populate additional columns. Usage: markdown-to-sqlite [OPTIONS] DBNAME TABLE PATHS... For example: $ markdown-to-sqlite docs.db documents file1.md file2.md ## Breaking change Prior to version 1.0 this argument order was different - markdown files were listed before the database and table. ","

markdown-to-sqlite

CLI tool for loading markdown files into a SQLite database.

YAML embedded in the markdown files will be used to populate additional columns.

Usage: markdown-to-sqlite [OPTIONS] DBNAME TABLE PATHS...

For example:

$ markdown-to-sqlite docs.db documents file1.md file2.md
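
The README does not show the YAML format itself. As an illustration only, the sketch below assumes the YAML is supplied as front matter between --- markers at the top of each markdown file, with each key becoming an extra column; the file contents here are hypothetical:

from pathlib import Path

# Write a hypothetical markdown file whose front matter keys (title, author)
# would become extra columns alongside the markdown body
Path('file1.md').write_text('''---
title: Hello world
author: Simon
---
The markdown body of the document goes here.
''')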

Breaking change

Prior to version 1.0 the argument order was different: markdown files were listed before the database and table.

",1,public,0,,, 168474970,MDEwOlJlcG9zaXRvcnkxNjg0NzQ5NzA=,dbf-to-sqlite,simonw/dbf-to-sqlite,0,9599,https://github.com/simonw/dbf-to-sqlite,"CLI tool for converting DBF files (dBase, FoxPro etc) to SQLite",0,2019-01-31T06:30:46Z,2021-03-23T01:29:41Z,2020-02-16T00:41:20Z,,8,25,25,Python,1,1,1,1,0,8,0,0,3,apache-2.0,"[""sqlite"", ""foxpro"", ""dbf"", ""dbase"", ""datasette-io"", ""datasette-tool""]",8,3,25,master,"{""admin"": false, ""push"": false, ""pull"": false}",,,8,2,"# dbf-to-sqlite [![PyPI](https://img.shields.io/pypi/v/dbf-to-sqlite.svg)](https://pypi.python.org/pypi/dbf-to-sqlite) [![Travis CI](https://travis-ci.com/simonw/dbf-to-sqlite.svg?branch=master)](https://travis-ci.com/simonw/dbf-to-sqlite) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/dbf-to-sqlite/blob/master/LICENSE) CLI tool for converting DBF files (dBase, FoxPro etc) to SQLite. ## Installation pip install dbf-to-sqlite ## Usage $ dbf-to-sqlite --help Usage: dbf-to-sqlite [OPTIONS] DBF_PATHS... SQLITE_DB Convert DBF files (dBase, FoxPro etc) to SQLite https://github.com/simonw/dbf-to-sqlite Options: --version Show the version and exit. --table TEXT Table name to use (only valid for single files) -v, --verbose Show what's going on --help Show this message and exit. Example usage: $ dbf-to-sqlite *.DBF database.db This will create a new SQLite database called `database.db` containing one table for each of the `DBF` files in the current directory. Looking for DBF files to try this out on? Try downloading the [Himalayan Database](http://himalayandatabase.com/) of all expeditions that have climbed in the Nepal Himalaya. ","

dbf-to-sqlite

CLI tool for converting DBF files (dBase, FoxPro etc) to SQLite.

Installation

pip install dbf-to-sqlite

Usage

$ dbf-to-sqlite --help
Usage: dbf-to-sqlite [OPTIONS] DBF_PATHS... SQLITE_DB

  Convert DBF files (dBase, FoxPro etc) to SQLite

  https://github.com/simonw/dbf-to-sqlite

Options:
  --version      Show the version and exit.
  --table TEXT   Table name to use (only valid for single files)
  -v, --verbose  Show what's going on
  --help         Show this message and exit.

Example usage:

$ dbf-to-sqlite *.DBF database.db

This will create a new SQLite database called database.db containing one table for each of the DBF files in the current directory.
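
To list the tables that were created, a minimal sketch using Python's standard sqlite3 module:

import sqlite3

# List the tables dbf-to-sqlite created, one per DBF file
conn = sqlite3.connect('database.db')
print([name for (name,) in conn.execute(
    'select name from sqlite_master where type = ?', ('table',))])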

Looking for DBF files to try this out on? Try downloading the Himalayan Database of all expeditions that have climbed in the Nepal Himalaya.

",,,,,, 175321497,MDEwOlJlcG9zaXRvcnkxNzUzMjE0OTc=,csv-diff,simonw/csv-diff,0,9599,https://github.com/simonw/csv-diff,Python CLI tool and library for diffing CSV and JSON files,0,2019-03-13T01:11:26Z,2022-07-29T20:01:02Z,2022-07-29T20:00:59Z,,34,198,198,Python,1,1,1,1,0,29,0,0,18,apache-2.0,"[""click"", ""csv"", ""datasette-io"", ""datasette-tool"", ""diff"", ""git-scraping""]",29,18,198,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,,29,7,"# csv-diff [![PyPI](https://img.shields.io/pypi/v/csv-diff.svg)](https://pypi.org/project/csv-diff/) [![Changelog](https://img.shields.io/github/v/release/simonw/csv-diff?include_prereleases&label=changelog)](https://github.com/simonw/csv-diff/releases) [![Tests](https://github.com/simonw/csv-diff/workflows/Test/badge.svg)](https://github.com/simonw/csv-diff/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/csv-diff/blob/main/LICENSE) Tool for viewing the difference between two CSV, TSV or JSON files. See [Generating a commit log for San Francisco’s official list of trees](https://simonwillison.net/2019/Mar/13/tree-history/) (and the [sf-tree-history repo commit log](https://github.com/simonw/sf-tree-history/commits)) for background information on this project. ## Installation pip install csv-diff ## Usage Consider two CSV files: `one.csv` id,name,age 1,Cleo,4 2,Pancakes,2 `two.csv` id,name,age 1,Cleo,5 3,Bailey,1 `csv-diff` can show a human-readable summary of differences between the files: $ csv-diff one.csv two.csv --key=id 1 row changed, 1 row added, 1 row removed 1 row changed Row 1 age: ""4"" => ""5"" 1 row added id: 3 name: Bailey age: 1 1 row removed id: 2 name: Pancakes age: 2 The `--key=id` option means that the `id` column should be treated as the unique key, to identify which records have changed. The tool will automatically detect if your files are comma- or tab-separated. You can over-ride this automatic detection and force the tool to use a specific format using `--format=tsv` or `--format=csv`. You can also feed it JSON files, provided they are a JSON array of objects where each object has the same keys. Use `--format=json` if your input files are JSON. Use `--show-unchanged` to include full details of the unchanged values for rows with at least one change in the diff output: % csv-diff one.csv two.csv --key=id --show-unchanged 1 row changed id: 1 age: ""4"" => ""5"" Unchanged: name: ""Cleo"" You can use the `--json` option to get a machine-readable difference: $ csv-diff one.csv two.csv --key=id --json { ""added"": [ { ""id"": ""3"", ""name"": ""Bailey"", ""age"": ""1"" } ], ""removed"": [ { ""id"": ""2"", ""name"": ""Pancakes"", ""age"": ""2"" } ], ""changed"": [ { ""key"": ""1"", ""changes"": { ""age"": [ ""4"", ""5"" ] } } ], ""columns_added"": [], ""columns_removed"": [] } ## As a Python library You can also import the Python library into your own code like so: from csv_diff import load_csv, compare diff = compare( load_csv(open(""one.csv""), key=""id""), load_csv(open(""two.csv""), key=""id"") ) `diff` will now contain the same data structure as the output in the `--json` example above. If the columns in the CSV have changed, those added or removed columns will be ignored when calculating changes made to specific rows. ## As a Docker container ### Build the image $ docker build -t csvdiff . 
### Run the container $ docker run --rm -v $(pwd):/files csvdiff Suppose current directory contains two csv files : one.csv two.csv $ docker run --rm -v $(pwd):/files csvdiff one.csv two.csv ## Alternatives - [csvdiff](https://github.com/aswinkarthik/csvdiff) is a ""fast diff tool for comparing CSV files"" - you may get better results from this than from `csv-diff` against larger files. ","

csv-diff

Tool for viewing the difference between two CSV, TSV or JSON files. See Generating a commit log for San Francisco’s official list of trees (and the sf-tree-history repo commit log) for background information on this project.

Installation

pip install csv-diff

Usage

Consider two CSV files:

one.csv

id,name,age
1,Cleo,4
2,Pancakes,2

two.csv

id,name,age
1,Cleo,5
3,Bailey,1

csv-diff can show a human-readable summary of differences between the files:

$ csv-diff one.csv two.csv --key=id
1 row changed, 1 row added, 1 row removed

1 row changed

  Row 1
    age: ""4"" => ""5""

1 row added

  id: 3
  name: Bailey
  age: 1

1 row removed

  id: 2
  name: Pancakes
  age: 2

The --key=id option means that the id column should be treated as the unique key, to identify which records have changed.

The tool will automatically detect if your files are comma- or tab-separated. You can override this automatic detection and force the tool to use a specific format using --format=tsv or --format=csv.

You can also feed it JSON files, provided they are a JSON array of objects where each object has the same keys. Use --format=json if your input files are JSON.

Use --show-unchanged to include full details of the unchanged values for rows with at least one change in the diff output:

% csv-diff one.csv two.csv --key=id --show-unchanged
1 row changed

  id: 1
    age: ""4"" => ""5""

    Unchanged:
      name: ""Cleo""

You can use the --json option to get a machine-readable difference:

$ csv-diff one.csv two.csv --key=id --json
{
    ""added"": [
        {
            ""id"": ""3"",
            ""name"": ""Bailey"",
            ""age"": ""1""
        }
    ],
    ""removed"": [
        {
            ""id"": ""2"",
            ""name"": ""Pancakes"",
            ""age"": ""2""
        }
    ],
    ""changed"": [
        {
            ""key"": ""1"",
            ""changes"": {
                ""age"": [
                    ""4"",
                    ""5""
                ]
            }
        }
    ],
    ""columns_added"": [],
    ""columns_removed"": []
}

As a Python library

You can also import the Python library into your own code like so:

from csv_diff import load_csv, compare
diff = compare(
    load_csv(open(""one.csv""), key=""id""),
    load_csv(open(""two.csv""), key=""id"")
)

diff will now contain the same data structure as the output in the --json example above.
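
Because diff is a plain dictionary you can post-process it however you like. A minimal sketch that prints a summary, relying only on the structure shown in the --json example above:

# Walk the diff dictionary directly - its shape matches the --json output
for row in diff['added']:
    print('added:', row)
for row in diff['removed']:
    print('removed:', row)
for change in diff['changed']:
    for column, (old, new) in change['changes'].items():
        print('row', change['key'], column + ':', old, '=>', new)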

If the columns in the CSV have changed, those added or removed columns will be ignored when calculating changes made to specific rows.

As a Docker container

Build the image

$ docker build -t csvdiff .

Run the container

$ docker run --rm -v $(pwd):/files csvdiff

Suppose the current directory contains two CSV files: one.csv and two.csv

$ docker run --rm -v $(pwd):/files csvdiff one.csv two.csv

Alternatives

  • csvdiff is a ""fast diff tool for comparing CSV files"" - you may get better results from this than from csv-diff against larger files.
",1,public,0,,0, 175550127,MDEwOlJlcG9zaXRvcnkxNzU1NTAxMjc=,yaml-to-sqlite,simonw/yaml-to-sqlite,0,9599,https://github.com/simonw/yaml-to-sqlite,Utility for converting YAML files to SQLite,0,2019-03-14T04:49:08Z,2021-06-13T09:04:40Z,2021-06-13T04:45:52Z,,19,36,36,Python,1,1,1,1,0,2,0,0,0,apache-2.0,"[""yaml"", ""sqlite"", ""datasette-io"", ""datasette-tool""]",2,0,36,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,2,1,"# yaml-to-sqlite [![PyPI](https://img.shields.io/pypi/v/yaml-to-sqlite.svg)](https://pypi.org/project/yaml-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/simonw/yaml-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/yaml-to-sqlite/releases) [![Tests](https://github.com/simonw/yaml-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/yaml-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/yaml-to-sqlite/blob/main/LICENSE) Load the contents of a YAML file into a SQLite database table. ``` $ yaml-to-sqlite --help Usage: yaml-to-sqlite [OPTIONS] DB_PATH TABLE YAML_FILE Convert YAML files to SQLite Options: --version Show the version and exit. --pk TEXT Column to use as a primary key --single-column TEXT If YAML file is a list of values, populate this column --help Show this message and exit. ``` ## Usage Given a `news.yml` file containing the following: ```yaml - date: 2021-06-05 body: |- [Datasette 0.57](https://docs.datasette.io/en/stable/changelog.html#v0-57) is out with an important security patch. - date: 2021-05-10 body: |- [Django SQL Dashboard](https://simonwillison.net/2021/May/10/django-sql-dashboard/) is a new tool that brings a useful authenticated subset of Datasette to Django projects that are built on top of PostgreSQL. ``` Running this command: ```bash $ yaml-to-sqlite news.db stories news.yml ``` Will create a database file with this schema: ```bash $ sqlite-utils schema news.db CREATE TABLE [stories] ( [date] TEXT, [body] TEXT ); ``` The `--pk` option can be used to set a column as the primary key for the table: ```bash $ yaml-to-sqlite news.db stories news.yml --pk date $ sqlite-utils schema news.db CREATE TABLE [stories] ( [date] TEXT PRIMARY KEY, [body] TEXT ); ``` ## Single column YAML lists The `--single-column` option can be used when the YAML file is a list of values, for example a file called `dogs.yml` containing the following: ```yaml - Cleo - Pancakes - Nixie ``` Running this command: ```bash $ yaml-to-sqlite dogs.db dogs.yaml --single-column=name ``` Will create a single `dogs` table with a single `name` column that is the primary key: ```bash $ sqlite-utils schema dogs.db CREATE TABLE [dogs] ( [name] TEXT PRIMARY KEY ); $ sqlite-utils dogs.db 'select * from dogs' -t name -------- Cleo Pancakes Nixie ``` ","

yaml-to-sqlite

Load the contents of a YAML file into a SQLite database table.

$ yaml-to-sqlite --help
Usage: yaml-to-sqlite [OPTIONS] DB_PATH TABLE YAML_FILE

  Convert YAML files to SQLite

Options:
  --version             Show the version and exit.
  --pk TEXT             Column to use as a primary key
  --single-column TEXT  If YAML file is a list of values, populate this column
  --help                Show this message and exit.

Usage

Given a news.yml file containing the following:

- date: 2021-06-05
  body: |-
    [Datasette 0.57](https://docs.datasette.io/en/stable/changelog.html#v0-57) is out with an important security patch.
- date: 2021-05-10
  body: |-
    [Django SQL Dashboard](https://simonwillison.net/2021/May/10/django-sql-dashboard/) is a new tool that brings a useful authenticated subset of Datasette to Django projects that are built on top of PostgreSQL.

Running this command:

$ yaml-to-sqlite news.db stories news.yml

Will create a database file with this schema:

$ sqlite-utils schema news.db
CREATE TABLE [stories] (
   [date] TEXT,
   [body] TEXT
);

The --pk option can be used to set a column as the primary key for the table:

$ yaml-to-sqlite news.db stories news.yml --pk date
$ sqlite-utils schema news.db
CREATE TABLE [stories] (
   [date] TEXT PRIMARY KEY,
   [body] TEXT
);
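
A minimal Python sketch for reading the imported rows back, using the schema shown above:

import sqlite3

# Read the stories back, newest first
conn = sqlite3.connect('news.db')
for date, body in conn.execute('select date, body from stories order by date desc'):
    print(date, '-', body[:60])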

Single column YAML lists

The --single-column option can be used when the YAML file is a list of values, for example a file called dogs.yml containing the following:

- Cleo
- Pancakes
- Nixie

Running this command:

$ yaml-to-sqlite dogs.db dogs.yaml --single-column=name

Will create a single dogs table with a single name column that is the primary key:

$ sqlite-utils schema dogs.db
CREATE TABLE [dogs] (
   [name] TEXT PRIMARY KEY
);
$ sqlite-utils dogs.db 'select * from dogs' -t
name
--------
Cleo
Pancakes
Nixie
",,,,,, 195145678,MDEwOlJlcG9zaXRvcnkxOTUxNDU2Nzg=,sqlite-diffable,simonw/sqlite-diffable,0,9599,https://github.com/simonw/sqlite-diffable,Tools for dumping/loading a SQLite database to diffable directory structure,0,2019-07-04T00:58:46Z,2022-07-12T17:00:19Z,2022-08-18T22:49:29Z,,30,42,42,Python,1,1,1,1,0,3,0,0,3,apache-2.0,"[""datasette-io"", ""datasette-tool"", ""sqlite""]",3,3,42,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,,3,1,"# sqlite-diffable [![PyPI](https://img.shields.io/pypi/v/sqlite-diffable.svg)](https://pypi.org/project/sqlite-diffable/) [![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-diffable?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-diffable/releases) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-diffable/blob/main/LICENSE) Tools for dumping/loading a SQLite database to diffable directory structure ## Installation pip install sqlite-diffable ## Demo The repository at [simonw/simonwillisonblog-backup](https://github.com/simonw/simonwillisonblog-backup) contains a backup of the database on my blog, https://simonwillison.net/ - created using this tool. ## Dumping a database Given a SQLite database called `fixtures.db` containing a table `facetable`, the following will dump out that table to the `dump/` directory: sqlite-diffable dump fixtures.db dump/ facetable To dump out every table in that database, use `--all`: sqlite-diffable dump fixtures.db dump/ --all ## Loading a database To load a previously dumped database, run the following: sqlite-diffable load restored.db dump/ This will show an error if any of the tables that are being restored already exist in the database file. You can replace those tables (dropping them before restoring them) using the `--replace` option: sqlite-diffable load restored.db dump/ --replace ## Converting to JSON objects Table rows are stored in the `.ndjson` files as newline-delimited JSON arrays, like this: ``` [""a"", ""a"", ""a-a"", 63, null, 0.7364712141640124, ""$null""] [""a"", ""b"", ""a-b"", 51, null, 0.6020187290499803, ""$null""] ``` Sometimes it can be more convenient to work with a list of JSON objects. 
The `sqlite-diffable objects` command can read a `.ndjson` file and its accompanying `.metadata.json` file and output JSON objects to standard output: sqlite-diffable objects fixtures.db dump/sortable.ndjson The output of that command looks something like this: ``` {""pk1"": ""a"", ""pk2"": ""a"", ""content"": ""a-a"", ""sortable"": 63, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.7364712141640124, ""text"": ""$null""} {""pk1"": ""a"", ""pk2"": ""b"", ""content"": ""a-b"", ""sortable"": 51, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.6020187290499803, ""text"": ""$null""} ``` Add `-o` to write that output to a file: sqlite-diffable objects fixtures.db dump/sortable.ndjson -o output.txt Add `--array` to output a JSON array of objects, as opposed to a newline-delimited file: sqlite-diffable objects fixtures.db dump/sortable.ndjson --array Output: ``` [ {""pk1"": ""a"", ""pk2"": ""a"", ""content"": ""a-a"", ""sortable"": 63, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.7364712141640124, ""text"": ""$null""}, {""pk1"": ""a"", ""pk2"": ""b"", ""content"": ""a-b"", ""sortable"": 51, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.6020187290499803, ""text"": ""$null""} ] ``` ## Storage format Each table is represented as two files. The first, `table_name.metadata.json`, contains metadata describing the structure of the table. For a table called `redirects_redirect` that file might look like this: ```json { ""name"": ""redirects_redirect"", ""columns"": [ ""id"", ""domain"", ""path"", ""target"", ""created"" ], ""schema"": ""CREATE TABLE [redirects_redirect] (\n [id] INTEGER PRIMARY KEY,\n [domain] TEXT,\n [path] TEXT,\n [target] TEXT,\n [created] TEXT\n)"" } ``` It is an object with three keys: `name` is the name of the table, `columns` is an array of column strings and `schema` is the SQL schema text used for tha table. The second file, `table_name.ndjson`, contains [newline-delimited JSON](http://ndjson.org/) for every row in the table. Each row is represented as a JSON array with items corresponding to each of the columns defined in the metadata. That file for the `redirects_redirect.ndjson` table might look like this: ``` [1, ""feeds.simonwillison.net"", ""swn-everything"", ""https://simonwillison.net/atom/everything/"", ""2017-10-01T21:11:36.440537+00:00""] [2, ""feeds.simonwillison.net"", ""swn-entries"", ""https://simonwillison.net/atom/entries/"", ""2017-10-01T21:12:32.478849+00:00""] [3, ""feeds.simonwillison.net"", ""swn-links"", ""https://simonwillison.net/atom/links/"", ""2017-10-01T21:12:54.820729+00:00""] ``` ","

sqlite-diffable

Tools for dumping/loading a SQLite database to a diffable directory structure

Installation

pip install sqlite-diffable

Demo

The repository at simonw/simonwillisonblog-backup contains a backup of the database on my blog, https://simonwillison.net/ - created using this tool.

Dumping a database

Given a SQLite database called fixtures.db containing a table facetable, the following will dump out that table to the dump/ directory:

sqlite-diffable dump fixtures.db dump/ facetable

To dump out every table in that database, use --all:

sqlite-diffable dump fixtures.db dump/ --all

Loading a database

To load a previously dumped database, run the following:

sqlite-diffable load restored.db dump/

This will show an error if any of the tables that are being restored already exist in the database file.

You can replace those tables (dropping them before restoring them) using the --replace option:

sqlite-diffable load restored.db dump/ --replace

Converting to JSON objects

Table rows are stored in the .ndjson files as newline-delimited JSON arrays, like this:

[""a"", ""a"", ""a-a"", 63, null, 0.7364712141640124, ""$null""]
[""a"", ""b"", ""a-b"", 51, null, 0.6020187290499803, ""$null""]

Sometimes it can be more convenient to work with a list of JSON objects.

The sqlite-diffable objects command can read a .ndjson file and its accompanying .metadata.json file and output JSON objects to standard output:

sqlite-diffable objects fixtures.db dump/sortable.ndjson

The output of that command looks something like this:

{""pk1"": ""a"", ""pk2"": ""a"", ""content"": ""a-a"", ""sortable"": 63, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.7364712141640124, ""text"": ""$null""}
{""pk1"": ""a"", ""pk2"": ""b"", ""content"": ""a-b"", ""sortable"": 51, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.6020187290499803, ""text"": ""$null""}

Add -o to write that output to a file:

sqlite-diffable objects fixtures.db dump/sortable.ndjson -o output.txt

Add --array to output a JSON array of objects, as opposed to a newline-delimited file:

sqlite-diffable objects fixtures.db dump/sortable.ndjson --array

Output:

[
{""pk1"": ""a"", ""pk2"": ""a"", ""content"": ""a-a"", ""sortable"": 63, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.7364712141640124, ""text"": ""$null""},
{""pk1"": ""a"", ""pk2"": ""b"", ""content"": ""a-b"", ""sortable"": 51, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.6020187290499803, ""text"": ""$null""}
]

Storage format

Each table is represented as two files. The first, table_name.metadata.json, contains metadata describing the structure of the table. For a table called redirects_redirect that file might look like this:

{
    ""name"": ""redirects_redirect"",
    ""columns"": [
        ""id"",
        ""domain"",
        ""path"",
        ""target"",
        ""created""
    ],
    ""schema"": ""CREATE TABLE [redirects_redirect] (\n   [id] INTEGER PRIMARY KEY,\n   [domain] TEXT,\n   [path] TEXT,\n   [target] TEXT,\n   [created] TEXT\n)""
}

It is an object with three keys: name is the name of the table, columns is an array of column names and schema is the SQL schema text used for that table.

The second file, table_name.ndjson, contains newline-delimited JSON for every row in the table. Each row is represented as a JSON array with items corresponding to each of the columns defined in the metadata.

That file for the redirects_redirect.ndjson table might look like this:

[1, ""feeds.simonwillison.net"", ""swn-everything"", ""https://simonwillison.net/atom/everything/"", ""2017-10-01T21:11:36.440537+00:00""]
[2, ""feeds.simonwillison.net"", ""swn-entries"", ""https://simonwillison.net/atom/entries/"", ""2017-10-01T21:12:32.478849+00:00""]
[3, ""feeds.simonwillison.net"", ""swn-links"", ""https://simonwillison.net/atom/links/"", ""2017-10-01T21:12:54.820729+00:00""]
",1,public,0,,0, 197431109,MDEwOlJlcG9zaXRvcnkxOTc0MzExMDk=,dogsheep-beta,dogsheep/dogsheep-beta,0,53015001,https://github.com/dogsheep/dogsheep-beta,Build a search index across content from multiple SQLite database tables and run faceted searches against it using Datasette,0,2019-07-17T17:07:26Z,2021-06-13T14:39:01Z,2021-06-13T14:38:59Z,https://dogsheep.github.io/,61,78,78,Python,1,0,1,0,0,0,0,0,11,,"[""search"", ""datasette"", ""datasette-plugin"", ""dogsheep"", ""datasette-io"", ""datasette-tool""]",0,11,78,main,"{""admin"": false, ""push"": false, ""pull"": false}",,53015001,0,4,"# dogsheep-beta [![PyPI](https://img.shields.io/pypi/v/dogsheep-beta.svg)](https://pypi.org/project/dogsheep-beta/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/beta?include_prereleases&label=changelog)](https://github.com/dogsheep/beta/releases) [![Tests](https://github.com/dogsheep/beta/workflows/Test/badge.svg)](https://github.com/dogsheep/beta/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/beta/blob/main/LICENSE) Build a search index across content from multiple SQLite database tables and run faceted searches against it using Datasette ## Example A live example of this plugin is running at https://datasette.io/-/beta - configured using [this YAML file](https://github.com/simonw/datasette.io/blob/main/templates/dogsheep-beta.yml). Read more about how this example works in [Building a search engine for datasette.io](https://simonwillison.net/2020/Dec/19/dogsheep-beta/). ## Installation Install this tool like so: $ pip install dogsheep-beta ## Usage Run the indexer using the `dogsheep-beta` command-line tool: $ dogsheep-beta index dogsheep.db config.yml The `config.yml` file contains details of the databases and document types that should be indexed: ```yaml twitter.db: tweets: sql: |- select tweets.id as key, 'Tweet by @' || users.screen_name as title, tweets.created_at as timestamp, tweets.full_text as search_1 from tweets join users on tweets.user = users.id users: sql: |- select id as key, name || ' @' || screen_name as title, created_at as timestamp, description as search_1 from users ``` This will create a `search_index` table in the `dogsheep.db` database populated by data from those SQL queries. By default the search index that this tool creates will be configured for Porter stemming. This means that searches for words like `run` will match documents containing `runs` or `running`. If you don't want to use Porter stemming, use the `--tokenize none` option: $ dogsheep-beta index dogsheep.db config.yml --tokenize none You can pass other SQLite tokenize argumenst here, see [the SQLite FTS tokenizers documentation](https://www.sqlite.org/fts5.html#tokenizers). ## Columns The columns that can be returned by our query are: - `key` - a unique (within that type) primary key - `title` - the title for the item - `timestamp` - an ISO8601 timestamp, e.g. `2020-09-02T21:00:21` - `search_1` - a larger chunk of text to be included in the search index - `category` - an integer category ID, see below - `is_public` - an integer (0 or 1, defaults to 0 if not set) specifying if this is public or not Public records are things like your public tweets, blog posts and GitHub commits. ## Categories Indexed items can be assigned a category. 
Categories are integers that correspond to records in the `categories` table, which defaults to containing the following: | id | name | |------|------------| | 1 | created | | 2 | saved | | 3 | received | `created` is for items that have been created by the Dogsheep instance owner. `saved` is for items that they have saved, liked or favourited. `received` is for items that have been specifically sent to them by other people - incoming emails or direct messages for example. ## Datasette plugin Run `datasette install dogsheep-beta` (or use `pip install dogsheep-beta` in the same environment as Datasette) to install the Dogsheep Beta Datasette plugin. Once installed, a custom search interface will be made available at `/-/beta`. You can use this interface to execute searches. The Datasette plugin has some configuration options. You can set these by adding the following to your `metadata.json` configuration file: ```json { ""plugins"": { ""dogsheep-beta"": { ""database"": ""beta"", ""config_file"": ""dogsheep-beta.yml"", ""template_debug"": true } } } ``` The configuration settings for the plugin are: - `database` - the database file that contains your search index. If the file is `beta.db` you should set `database` to `beta`. - `config_file` - the YAML file containing your Dogsheep Beta configuration. - `template_debug` - set this to `true` to enable debugging output if errors occur in your custom templates, see below. ## Custom results display Each indexed item type can define custom display HTML as part of the `config.yml` file. It can do this using a `display` key containing a fragment of Jinja template, and optionally a `display_sql` key with extra SQL to execute to fetch the data to display. Here's how to define a custom display template for a tweet: ```yaml twitter.db: tweets: sql: |- select tweets.id as key, 'Tweet by @' || users.screen_name as title, tweets.created_at as timestamp, tweets.full_text as search_1 from tweets join users on tweets.user = users.id display: |-

{{ title }} - tweeted at {{ timestamp }}

{{ search_1 }}
``` This example reuses the value that were stored in the `search_index` table when the indexing query was run. To load in extra values to display in the template, use a `display_sql` query like this: ```yaml twitter.db: tweets: sql: |- select tweets.id as key, 'Tweet by @' || users.screen_name as title, tweets.created_at as timestamp, tweets.full_text as search_1 from tweets join users on tweets.user = users.id display_sql: |- select users.screen_name, tweets.full_text, tweets.created_at from tweets join users on tweets.user = users.id where tweets.id = :key display: |-

{{ display.screen_name }} - tweeted at {{ display.created_at }}

{{ display.full_text }}
``` The `display_sql` query will be executed for every search result, passing the key value from the `search_index` table as the `:key` parameter and the user's search term as the `:q` parameter. This performs well because [many small queries are efficient in SQLite](https://www.sqlite.org/np1queryprob.html). If an error occurs while rendering one of your templates the search results page will return a 500 error. You can use the `template_debug` configuration setting described above to instead output debugging information for the search results item that experienced the error. ## Displaying maps This plugin will eventually include a number of useful shortcuts for rendering interesting content. The first available shortcut is for displaying maps. Make your custom content output something like this: ```html
``` JavaScript on the page will look for any elements with `data-map-latitude` and `data-map-longitude` and, if it finds any, will load Leaflet and convert those elements into maps centered on that location. The default zoom level will be 12, or you can set a `data-map-zoom` attribute to customize this. ## Development To set up this plugin locally, first checkout the code. Then create a new virtual environment: cd dogsheep-beta python3 -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and tests: pip install -e '.[test]' To run the tests: pytest ","

dogsheep-beta

Build a search index across content from multiple SQLite database tables and run faceted searches against it using Datasette

Example

A live example of this plugin is running at https://datasette.io/-/beta - configured using this YAML file.

Read more about how this example works in Building a search engine for datasette.io.

Installation

Install this tool like so:

$ pip install dogsheep-beta

Usage

Run the indexer using the dogsheep-beta command-line tool:

$ dogsheep-beta index dogsheep.db config.yml

The config.yml file contains details of the databases and document types that should be indexed:

twitter.db:
    tweets:
        sql: |-
            select
                tweets.id as key,
                'Tweet by @' || users.screen_name as title,
                tweets.created_at as timestamp,
                tweets.full_text as search_1
            from tweets join users on tweets.user = users.id
    users:
        sql: |-
            select
                id as key,
                name || ' @' || screen_name as title,
                created_at as timestamp,
                description as search_1
            from users

This will create a search_index table in the dogsheep.db database populated by data from those SQL queries.
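
Conceptually, the indexer runs each configured SQL query against its source database and copies the resulting rows into that search_index table. The following Python sketch illustrates the idea only - the real implementation differs, and the exact search_index schema used here is an assumption:

import sqlite3
import yaml  # requires PyYAML

config = yaml.safe_load(open('config.yml'))
index_db = sqlite3.connect('dogsheep.db')

for database, tables in config.items():
    source = sqlite3.connect(database)
    source.row_factory = sqlite3.Row
    for table, rules in tables.items():
        for row in source.execute(rules['sql']):
            # Assumed, simplified search_index schema - creation of the real
            # table is handled by dogsheep-beta itself
            index_db.execute(
                'insert into search_index (type, key, title, timestamp, search_1)'
                ' values (?, ?, ?, ?, ?)',
                (table, row['key'], row['title'], row['timestamp'], row['search_1']),
            )
index_db.commit()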

By default the search index that this tool creates will be configured for Porter stemming. This means that searches for words like run will match documents containing runs or running.

If you don't want to use Porter stemming, use the --tokenize none option:

$ dogsheep-beta index dogsheep.db config.yml --tokenize none

You can pass other SQLite tokenize arguments here - see the SQLite FTS tokenizers documentation.

Columns

The columns that can be returned by our query are:

  • key - a unique (within that type) primary key
  • title - the title for the item
  • timestamp - an ISO8601 timestamp, e.g. 2020-09-02T21:00:21
  • search_1 - a larger chunk of text to be included in the search index
  • category - an integer category ID, see below
  • is_public - an integer (0 or 1, defaults to 0 if not set) specifying if this is public or not

Public records are things like your public tweets, blog posts and GitHub commits.
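
Because category and is_public are ordinary columns on the search_index table, you can slice the index with plain SQL. A small, hedged example that only relies on the documented category and is_public columns:

import sqlite3

conn = sqlite3.connect('dogsheep.db')
# Count public items per category in the search index
for category, count in conn.execute(
    'select category, count(*) from search_index'
    ' where is_public = 1 group by category'
):
    print(category, count)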

Categories

Indexed items can be assigned a category. Categories are integers that correspond to records in the categories table, which defaults to containing the following:

id   name
1    created
2    saved
3    received

created is for items that have been created by the Dogsheep instance owner.

saved is for items that they have saved, liked or favourited.

received is for items that have been specifically sent to them by other people - incoming emails or direct messages for example.

Datasette plugin

Run datasette install dogsheep-beta (or use pip install dogsheep-beta in the same environment as Datasette) to install the Dogsheep Beta Datasette plugin.

Once installed, a custom search interface will be made available at /-/beta. You can use this interface to execute searches.

The Datasette plugin has some configuration options. You can set these by adding the following to your metadata.json configuration file:

{
    ""plugins"": {
        ""dogsheep-beta"": {
            ""database"": ""beta"",
            ""config_file"": ""dogsheep-beta.yml"",
            ""template_debug"": true
        }
    }
}

The configuration settings for the plugin are:

  • database - the database file that contains your search index. If the file is beta.db you should set database to beta.
  • config_file - the YAML file containing your Dogsheep Beta configuration.
  • template_debug - set this to true to enable debugging output if errors occur in your custom templates, see below.

Custom results display

Each indexed item type can define custom display HTML as part of the config.yml file. It can do this using a display key containing a fragment of Jinja template, and optionally a display_sql key with extra SQL to execute to fetch the data to display.

Here's how to define a custom display template for a tweet:

twitter.db:
    tweets:
        sql: |-
            select
                tweets.id as key,
                'Tweet by @' || users.screen_name as title,
                tweets.created_at as timestamp,
                tweets.full_text as search_1
            from tweets join users on tweets.user = users.id
        display: |-
            <p>{{ title }} - tweeted at {{ timestamp }}</p>
            <blockquote>{{ search_1 }}</blockquote>

This example reuses the values that were stored in the search_index table when the indexing query was run.

To load in extra values to display in the template, use a display_sql query like this:

twitter.db:
    tweets:
        sql: |-
            select
                tweets.id as key,
                'Tweet by @' || users.screen_name as title,
                tweets.created_at as timestamp,
                tweets.full_text as search_1
            from tweets join users on tweets.user = users.id
        display_sql: |-
            select
                users.screen_name,
                tweets.full_text,
                tweets.created_at
            from
                tweets join users on tweets.user = users.id
            where
                tweets.id = :key
        display: |-
            <p>{{ display.screen_name }} - tweeted at {{ display.created_at }}</p>
            <blockquote>{{ display.full_text }}</blockquote>

The display_sql query will be executed for every search result, passing the key value from the search_index table as the :key parameter and the user's search term as the :q parameter.

This performs well because many small queries are efficient in SQLite.
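
Put another way, rendering a page of results involves one extra query per result, each receiving that result's key and the search term as named parameters. A simplified Python sketch of the pattern (the variable names here are illustrative, not the plugin's actual code):

import sqlite3

conn = sqlite3.connect('twitter.db')
conn.row_factory = sqlite3.Row

# The display_sql from the example above
display_sql = '''
    select users.screen_name, tweets.full_text, tweets.created_at
    from tweets join users on tweets.user = users.id
    where tweets.id = :key
'''

# search_results stands in for the rows returned by the search query
search_results = [{'key': '1122154819815239680'}]
q = 'dogsheep'

for result in search_results:
    display = conn.execute(display_sql, {'key': result['key'], 'q': q}).fetchone()
    # display is what the Jinja template sees as the display variable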

If an error occurs while rendering one of your templates the search results page will return a 500 error. You can use the template_debug configuration setting described above to instead output debugging information for the search results item that experienced the error.

Displaying maps

This plugin will eventually include a number of useful shortcuts for rendering interesting content.

The first available shortcut is for displaying maps. Make your custom content output something like this:

<div
    data-map-latitude=""{{ display.latitude }}""
    data-map-longitude=""{{ display.longitude }}""
    style=""display: none; float: right; width: 250px; height: 200px; background-color: #ccc;""
></div>

JavaScript on the page will look for any elements with data-map-latitude and data-map-longitude and, if it finds any, will load Leaflet and convert those elements into maps centered on that location. The default zoom level will be 12, or you can set a data-map-zoom attribute to customize this.

Development

To set up this plugin locally, first check out the code. Then create a new virtual environment:

cd dogsheep-beta
python3 -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
",,,,,, 197882382,MDEwOlJlcG9zaXRvcnkxOTc4ODIzODI=,healthkit-to-sqlite,dogsheep/healthkit-to-sqlite,0,53015001,https://github.com/dogsheep/healthkit-to-sqlite,Convert an Apple Healthkit export zip to a SQLite database,0,2019-07-20T05:03:12Z,2021-08-20T00:55:34Z,2021-08-20T00:56:17Z,https://datasette.io/tools/healthkit-to-sqlite,29,91,91,Python,1,1,1,1,0,4,0,0,8,apache-2.0,"[""datasette"", ""datasette-io"", ""datasette-tool"", ""dogsheep"", ""healthkit"", ""sqlite""]",4,8,91,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,53015001,4,3,"# healthkit-to-sqlite [![PyPI](https://img.shields.io/pypi/v/healthkit-to-sqlite.svg)](https://pypi.org/project/healthkit-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/healthkit-to-sqlite?include_prereleases&label=changelog)](https://github.com/dogsheep/healthkit-to-sqlite/releases) [![Tests](https://github.com/dogsheep/healthkit-to-sqlite/workflows/Test/badge.svg)](https://github.com/dogsheep/healthkit-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/healthkit-to-sqlite/blob/main/LICENSE) Convert an Apple Healthkit export zip to a SQLite database ## How to install $ pip install healthkit-to-sqlite ## How to use First you need to export your Apple HealthKit data. 1. On your iPhone, open the ""Health"" app 2. Click the profile icon in the top right 3. Click ""Export Health Data"" at the bottom of that page 4. Save the resulting file somewhere you can access it, or AirDrop it directly to your laptop. Now you can convert the resulting `export.zip` file to SQLite like so: $ healthkit-to-sqlite export.zip healthkit.db A progress bar will be displayed. You can disable this using `--silent`. ``` Importing from HealthKit [#-------------] 5% 00:01:33 ``` You can explore the resulting data using [Datasette](https://datasette.readthedocs.io/) like this: $ datasette healthkit.db ","

healthkit-to-sqlite

Convert an Apple Healthkit export zip to a SQLite database

How to install

$ pip install healthkit-to-sqlite

How to use

First you need to export your Apple HealthKit data.

  1. On your iPhone, open the ""Health"" app
  2. Click the profile icon in the top right
  3. Click ""Export Health Data"" at the bottom of that page
  4. Save the resulting file somewhere you can access it, or AirDrop it directly to your laptop.

Now you can convert the resulting export.zip file to SQLite like so:

$ healthkit-to-sqlite export.zip healthkit.db

A progress bar will be displayed. You can disable this using --silent.

Importing from HealthKit  [#-------------]    5%  00:01:33

You can explore the resulting data using Datasette like this:

$ datasette healthkit.db
",,,,,, 205429375,MDEwOlJlcG9zaXRvcnkyMDU0MjkzNzU=,swarm-to-sqlite,dogsheep/swarm-to-sqlite,0,53015001,https://github.com/dogsheep/swarm-to-sqlite,Create a SQLite database containing your checkin history from Foursquare Swarm,0,2019-08-30T17:37:29Z,2021-02-22T07:58:39Z,2021-01-18T04:36:03Z,,49,37,37,Python,1,1,1,1,0,1,0,0,1,apache-2.0,"[""sqlite"", ""foursquare"", ""swarm"", ""foursquare-api"", ""datasette"", ""dogsheep"", ""datasette-io"", ""datasette-tool""]",1,1,37,main,"{""admin"": false, ""push"": false, ""pull"": false}",,53015001,1,3,"# swarm-to-sqlite [![PyPI](https://img.shields.io/pypi/v/swarm-to-sqlite.svg)](https://pypi.org/project/swarm-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/swarm-to-sqlite?include_prereleases&label=changelog)](https://github.com/dogsheep/swarm-to-sqlite/releases) [![Tests](https://github.com/dogsheep/swarm-to-sqlite/workflows/Test/badge.svg)](https://github.com/dogsheep/swarm-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/swarm-to-sqlite/blob/main/LICENSE) Create a SQLite database containing your checkin history from Foursquare Swarm. ## How to install $ pip install swarm-to-sqlite ## Usage You will need to first obtain a valid OAuth token for your Foursquare account. You can do so using this tool: https://your-foursquare-oauth-token.glitch.me/ Simplest usage is to simply provide the name of the database file you wish to write to. The tool will prompt you to paste in your token, and will then download your checkins and store them in the specified database file. $ swarm-to-sqlite checkins.db Please provide your Foursquare OAuth token: Importing 3699 checkins [#########-----------------------] 27% 00:02:31 You can also pass the token as a command-line option: $ swarm-to-sqlite checkins.db --token=XXX Or as an environment variable: $ export FOURSQUARE_TOKEN=XXX $ swarm-to-sqlite checkins.db To retrieve just checkins within the past X hours, days or weeks, use the `--since=` option. For example, to pull only checkins that happened within the last 10 days use: $ swarm-to-sqlite checkins.db --token=XXX --since=10d Use `2w` for two weeks, `10h` for ten hours, `3d` for three days. In addition to saving the checkins to a database, you can also write them to a JSON file using the `--save` option: $ swarm-to-sqlite checkins.db --save=checkins.json Having done this, you can re-import checkins directly from that file (rather than making API calls to fetch data from Foursquare) like this: $ swarm-to-sqlite checkins.db --load=checkins.json ## Using with Datasette The SQLite database produced by this tool is designed to be browsed using [Datasette](https://datasette.io/). You can install the [datasette-cluster-map](https://datasette.io/plugins/datasette-cluster-map) plugin to view your checkins on a map. ","

swarm-to-sqlite

Create a SQLite database containing your checkin history from Foursquare Swarm.

How to install

$ pip install swarm-to-sqlite

Usage

You will need to first obtain a valid OAuth token for your Foursquare account. You can do so using this tool: https://your-foursquare-oauth-token.glitch.me/

The simplest usage is to provide just the name of the database file you wish to write to. The tool will prompt you to paste in your token, and will then download your checkins and store them in the specified database file.

$ swarm-to-sqlite checkins.db
Please provide your Foursquare OAuth token:
Importing 3699 checkins  [#########-----------------------] 27% 00:02:31

You can also pass the token as a command-line option:

$ swarm-to-sqlite checkins.db --token=XXX

Or as an environment variable:

$ export FOURSQUARE_TOKEN=XXX
$ swarm-to-sqlite checkins.db

To retrieve just checkins within the past X hours, days or weeks, use the --since= option. For example, to pull only checkins that happened within the last 10 days use:

$ swarm-to-sqlite checkins.db --token=XXX --since=10d

Use 2w for two weeks, 10h for ten hours, 3d for three days.
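
Those shorthand values boil down to a cut-off time in the past. A rough Python sketch of how such a value could be converted (an illustration only, not swarm-to-sqlite's actual parsing code):

from datetime import datetime, timedelta

UNITS = {'h': 'hours', 'd': 'days', 'w': 'weeks'}

def since_to_datetime(value):
    # '10d' becomes the datetime 10 days ago, '2w' two weeks ago, '10h' ten hours ago
    amount, unit = int(value[:-1]), value[-1]
    return datetime.utcnow() - timedelta(**{UNITS[unit]: amount})

print(since_to_datetime('10d'))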

In addition to saving the checkins to a database, you can also write them to a JSON file using the --save option:

$ swarm-to-sqlite checkins.db --save=checkins.json

Having done this, you can re-import checkins directly from that file (rather than making API calls to fetch data from Foursquare) like this:

$ swarm-to-sqlite checkins.db --load=checkins.json

Using with Datasette

The SQLite database produced by this tool is designed to be browsed using Datasette.

You can install the datasette-cluster-map plugin to view your checkins on a map.

",,,,,, 206156866,MDEwOlJlcG9zaXRvcnkyMDYxNTY4NjY=,twitter-to-sqlite,dogsheep/twitter-to-sqlite,0,53015001,https://github.com/dogsheep/twitter-to-sqlite,Save data from Twitter to a SQLite database,0,2019-09-03T19:30:08Z,2021-12-26T18:08:43Z,2021-12-26T18:08:40Z,,298,269,269,Python,1,1,1,1,0,13,0,0,10,apache-2.0,"[""datasette"", ""datasette-io"", ""datasette-tool"", ""dogsheep"", ""sqlite"", ""twitter"", ""twitter-api""]",13,10,269,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,53015001,13,5,"# twitter-to-sqlite [![PyPI](https://img.shields.io/pypi/v/twitter-to-sqlite.svg)](https://pypi.org/project/twitter-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/twitter-to-sqlite?include_prereleases&label=changelog)](https://github.com/dogsheep/twitter-to-sqlite/releases) [![Tests](https://github.com/dogsheep/twitter-to-sqlite/workflows/Test/badge.svg)](https://github.com/dogsheep/twitter-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/twitter-to-sqlite/blob/main/LICENSE) Save data from Twitter to a SQLite database. **This tool currently uses Twitter API v1**. You may be unable to use it if you do not have an API key for that version of the API. - [How to install](#how-to-install) - [Authentication](#authentication) - [Retrieving tweets by specific accounts](#retrieving-tweets-by-specific-accounts) - [Retrieve user profiles in bulk](#retrieve-user-profiles-in-bulk) - [Retrieve tweets in bulk](#retrieve-tweets-in-bulk) - [Retrieving Twitter followers](#retrieving-twitter-followers) - [Retrieving friends](#retrieving-friends) - [Retrieving favorited tweets](#retrieving-favorited-tweets) - [Retrieving Twitter lists](#retrieving-twitter-lists) - [Retrieving Twitter list memberships](#retrieving-twitter-list-memberships) - [Retrieving just follower and friend IDs](#retrieving-just-follower-and-friend-ids) - [Retrieving tweets from your home timeline](#retrieving-tweets-from-your-home-timeline) - [Retrieving your mentions](#retrieving-your-mentions) - [Providing input from a SQL query with --sql and --attach](#providing-input-from-a-sql-query-with---sql-and---attach) - [Running searches](#running-searches) - [Capturing tweets in real-time with track and follow](#capturing-tweets-in-real-time-with-track-and-follow) * [track](#track) * [follow](#follow) - [Importing data from your Twitter archive](#importing-data-from-your-twitter-archive) - [Design notes](#design-notes) ## How to install $ pip install twitter-to-sqlite ## Authentication First, you will need to create a Twitter application at https://developer.twitter.com/en/apps. You may need to apply for a Twitter developer account - if so, you may find this [example of an email application](https://raw.githubusercontent.com/dogsheep/twitter-to-sqlite/main/email.png) useful that has been approved in the past. Once you have created your application, navigate to the ""Keys and tokens"" page and make note of the following: * Your API key * Your API secret key * Your access token * Your access token secret You will need to save all four of these values to a JSON file in order to use this tool. 
You can create that JSON file by running the following command and pasting in the values at the prompts: $ twitter-to-sqlite auth Create an app here: https://developer.twitter.com/en/apps Then navigate to 'Keys and tokens' and paste in the following: API key: xxx API secret key: xxx Access token: xxx Access token secret: xxx This will create a file called `auth.json` in your current directory containing the required values. To save the file at a different path or filename, use the `--auth=myauth.json` option. ## Retrieving tweets by specific accounts The `user-timeline` command retrieves all of the tweets posted by the specified user accounts. It defaults to the account belonging to the authenticated user: $ twitter-to-sqlite user-timeline twitter.db Importing tweets [#####-------------------------------] 2799/17780 00:01:39 All of these commands assume that there is an `auth.json` file in the current directory. You can provide the path to your `auth.json` file using `-a`: $ twitter-to-sqlite user-timeline twitter.db -a /path/to/auth.json To load tweets for other users, pass their screen names as arguments: $ twitter-to-sqlite user-timeline twitter.db cleopaws nichemuseums Twitter's API only returns up to around 3,200 tweets for most user accounts, but you may find that it returns all available tweets for your own user account. You can pass numeric Twitter user IDs instead of screen names using the `--ids` parameter. You can use `--since` to retrieve every tweet since the last time you imported for that user, or `--since_id=xxx` to retrieve every tweet since a specific tweet ID. This command also accepts `--sql` and `--attach` options, documented below. ## Retrieve user profiles in bulk If you have a list of Twitter screen names (or user IDs) you can bulk fetch their fully inflated Twitter profiles using the `users-lookup` command: $ twitter-to-sqlite users-lookup users.db simonw cleopaws You can pass user IDs instead using the `--ids` option: $ twitter-to-sqlite users-lookup users.db 12497 3166449535 --ids This command also accepts `--sql` and `--attach` options, documented below. ## Retrieve tweets in bulk If you have a list of tweet IDS you can bulk fetch them using the `statuses-lookup` command: $ twitter-to-sqlite statuses-lookup tweets.db 1122154819815239680 1122154178493575169 The `--sql` and `--attach` options are supported. Here's a recipe to retrieve any tweets that existing tweets are in-reply-to which have not yet been stored in your database: $ twitter-to-sqlite statuses-lookup tweets.db \ --sql=' select in_reply_to_status_id from tweets where in_reply_to_status_id is not null' \ --skip-existing The `--skip-existing` option means that tweets that have already been stored in the database will not be fetched again. ## Retrieving Twitter followers The `followers` command retrieves details of every follower of the specified accounts. You can use it to retrieve your own followers, or you can pass one or more screen names to pull the followers for other accounts. The following command pulls your followers and saves them in a SQLite database file called `twitter.db`: $ twitter-to-sqlite followers twitter.db This command is **extremely slow**, because Twitter impose a rate limit of no more than one request per minute to this endpoint! If you are running it against an account with thousands of followers you should expect this to take several hours. 
To retrieve followers for another account, use: $ twitter-to-sqlite followers twitter.db cleopaws This command also accepts the `--ids`, `--sql` and `--attach` options. See [Analyzing my Twitter followers with Datasette](https://simonwillison.net/2018/Jan/28/analyzing-my-twitter-followers/) for the original inspiration for this command. ## Retrieving friends The `friends` command works like the `followers` command, but retrieves the specified (or currently authenticated) user's friends - defined as accounts that the user is following. $ twitter-to-sqlite friends twitter.db It takes the same options as the `followers` command. ## Retrieving favorited tweets The `favorites` command retrieves tweets that have been favorited by a specified user. Called without any extra arguments it retrieves tweets favorited by the currently authenticated user: $ twitter-to-sqlite favorites faves.db You can also use the `--screen_name` or `--user_id` arguments to retrieve favorite tweets for another user: $ twitter-to-sqlite favorites faves-obama.db --screen_name=BarackObama Use the `--stop_after=xxx` argument to retrieve only the most recent number of favorites, e.g. to get the authenticated user's 50 most recent favorites: $ twitter-to-sqlite favorites faves.db --stop_after=50 ## Retrieving Twitter lists The `lists` command retrieves all of the lists belonging to one or more users. $ twitter-to-sqlite lists lists.db simonw dogsheep This command also accepts the `--sql` and `--attach` and `--ids` options. To additionally fetch the list of members for each list, use `--members`. ## Retrieving Twitter list memberships The `list-members` command can be used to retrieve details of one or more Twitter lists, including all of their members. $ twitter-to-sqlite list-members members.db simonw/the-good-place You can pass multiple `screen_name/list_slug` identifiers. If you know the numeric IDs of the lists instead, you can use `--ids`: $ twitter-to-sqlite list-members members.db 927913322841653248 --ids ## Retrieving just follower and friend IDs It's also possible to retrieve just the numeric Twitter IDs of the accounts that specific users are following (""friends"" in Twitter's API terminology) or followed-by: $ twitter-to-sqlite followers-ids members.db simonw cleopaws This will populate the `following` table with `followed_id`/`follower_id` pairs for the two specified accounts, listing every account ID that is following either of those two accounts. $ twitter-to-sqlite friends-ids members.db simonw cleopaws This will do the same thing but pull the IDs that those accounts are following. Both of these commands also support `--sql` and `--attach` as an alternative to passing screen names as direct command-line arguments. You can use `--ids` to process the inputs as user IDs rather than screen names. The underlying Twitter APIs have a rate limit of 15 requests every 15 minutes - though they do return up to 5,000 IDs in each call. By default both of these subcommands will wait for 61 seconds between API calls in order to stay within the rate limit - you can adjust this behaviour down to just one second delay if you know you will not be making many calls using `--sleep=1`. ## Retrieving tweets from your home timeline The `home-timeline` command retrieves up to 800 tweets from the home timeline of the authenticated user - generally this means tweets from people you follow. 
$ twitter-to-sqlite home-timeline twitter.db Importing timeline [#################--------] 591/800 00:01:14 The tweets are stored in the `tweets` table, and a record is added to the `timeline_tweets` table noting that this tweet came in due to being spotted in the timeline of your user. You can use `--since` to retrieve just tweets that have been posted since the last time this command was run, or `--since_id=xxx` to explicitly pass in a tweet ID to use as the last position. You can then view your timeline in Datasette using the following URL: `/tweets/tweets?_where=id+in+(select+tweet+from+[timeline_tweets])&_sort_desc=id&_facet=user` This will filter your tweets table to just tweets that appear in your timeline, ordered by most recent first and use faceting to show you which users are responsible for the most tweets. ## Retrieving your mentions The `mentions-timeline` command works like `home-timeline` except it retrieves tweets that mention the authenticated user's account. It records the user account that was mentioned in a `mentions_tweets` table. It supports `--since` and `--since_id` in the same was as `home-timeline` does. ## Providing input from a SQL query with --sql and --attach This option is available for some subcommands - run `twitter-to-sqlite command-name --help` to check. You can provide Twitter screen names (or user IDs or tweet IDs) directly as command-line arguments, or you can provide those screen names or IDs by executing a SQL query. For example: consider a SQLite database with an `attendees` table listing names and Twitter accounts - something like this: | First | Last | Twitter | |---------|------------|--------------| | Simon | Willison | simonw | | Avril | Lavigne | AvrilLavigne | You can run the `users-lookup` command to pull the Twitter profile of every user listed in that database by loading the screen names using a `--sql` query: $ twitter-to-sqlite users-lookup my.db --sql=""select Twitter from attendees"" If your database table contains Twitter IDs, you can select those IDs and pass the `--ids` argument. For example, to fetch the profiles of users who have had their user IDs inserted into the `following` table using the `twitter-to-sqlite friends-ids` command: $ twitter-to-sqlite users-lookup my.db --sql=""select follower_id from following"" --ids Or to avoid re-fetching users that have already been fetched: $ twitter-to-sqlite users-lookup my.db \ --sql=""select followed_id from following where followed_id not in ( select id from users)"" --ids If your data lives in a separate database file you can attach it using `--attach`. For example, consider the attendees example above but the data lives in an `attendees.db` file, and you want to fetch the user profiles into a `tweets.db` file. You could do that like this: $ twitter-to-sqlite users-lookup tweets.db \ --attach=attendees.db \ --sql=""select Twitter from attendees.attendees"" The filename (without the extension) will be used as the database alias within SQLite. If you want a different alias for some reason you can specify that with a colon like this: $ twitter-to-sqlite users-lookup tweets.db \ --attach=foo:attendees.db \ --sql=""select Twitter from foo.attendees"" ## Running searches The `search` command runs a search against the Twitter [standard search API](https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets). $ twitter-to-sqlite search tweets.db ""dogsheep"" This will import up to around 320 tweets that match that search term into the `tweets` table. 
It will also create a record in the `search_runs` table recording that the search took place, and many-to-many records in the `search_runs_tweets` table recording which tweets were seen for that search at that time. You can use the `--since` parameter to check for previous search runs with the same arguments and only retrieve tweets that were posted since the last retrieved matching tweet. The following additional options for `search` are supported: * `--geocode`: `latitude,longitude,radius` where radius is a number followed by mi or km * `--lang`: ISO 639-1 language code e.g. `en` or `es` * `--locale`: Locale: only `ja` is currently effective * `--result_type`: `mixed`, `recent` or `popular`. Defaults to `mixed` * `--count`: Number of results per page, defaults to the maximum of 100 * `--stop_after`: Stop after this many results * `--since_id`: Pull tweets since this Tweet ID. You probably want to use `--since` instead of this. ## Capturing tweets in real-time with track and follow This functionality is **experimental**. Please [file bug reports](https://github.com/dogsheep/twitter-to-sqlite/issues) if you find any! Twitter provides a real-time API which can be used to subscribe to tweets as they happen. `twitter-to-sqlite` can use this API to continually update a SQLite database with tweets matching certain keywords, or referencing specific users. ### track To track keywords, use the `track` command: $ twitter-to-sqlite track tweets.db kakapo This command will continue to run until you hit Ctrl+C. It will capture any tweets mentioning the keyword [kakapo](https://en.wikipedia.org/wiki/Kakapo) and store them in the `tweets.db` database file. You can pass multiple keywords as a space separated list. This will capture tweets matching either of those keywords: $ twitter-to-sqlite track tweets.db kakapo raccoon You can enclose phrases in quotes to search for tweets matching both of those keywords: $ twitter-to-sqlite track tweets.db 'trash panda' See [the Twitter track documentation](https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters#track) for advanced tips on using this command. Add the `--verbose` option to see matching tweets (in their verbose JSON form) displayed to the terminal as they are captured: $ twitter-to-sqlite track tweets.db raccoon --verbose ### follow The `follow` command will capture all tweets that are relevant to one or more specific Twitter users. $ twitter-to-sqlite follow tweets.db nytimes This includes tweets by those users, tweets that reply to or quote those users and retweets by that user. See [the Twitter follow documentation](https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters#follow) for full details. The command accepts one or more screen names. You can feed it numeric Twitter user IDs instead of screen names by using the `--ids` flag. The command also supports the `--sql` and `--attach` options, and the `--verbose` option for displaying tweets as they are captured. Here's how to start following tweets from every user ID currently represented as being followed in the `following` table (populated using the `friends-ids` command): $ twitter-to-sqlite follow tweets.db \ --sql=""select distinct followed_id from following"" \ --ids ## Importing data from your Twitter archive You can request an archive of your Twitter data by [following these instructions](https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive). 
Twitter will send you a link to download a `.zip` file. You can import the contents of that file into a set of tables in a new database file called `archive.db` (each table beginning with the `archive_` prefix) using the `import` command: $ twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2.zip This command does not populate any of the regular tables, since Twitter's export data does not exactly match the schema returned by the Twitter API. It will delete and recreate the corresponding `archive_*` tables every time you run it. If this is not what you want, run the command against a new SQLite database file name rather than running it against one that already exists. If you have already decompressed your archive, you can run this against the directory that you decompressed it to: $ twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2/ You can also run it against one or more specific files within that folder. For example, to import just the follower.js and following.js files: $ twitter-to-sqlite import archive.db \ ~/Downloads/twitter-2019-06-25-b31f2/follower.js \ ~/Downloads/twitter-2019-06-25-b31f2/following.js You may want to use other commands to populate tables based on data from the archive. For example, to retrieve full API versions of each of the tweets you have favourited in your archive, you could run the following: $ twitter-to-sqlite statuses-lookup archive.db \ --sql='select tweetId from archive_like' \ --skip-existing If you want these imported tweets to then be reflected in the `favorited_by` table, you can do so by applying the following SQL query: $ sqlite3 archive.db SQLite version 3.22.0 2018-01-22 18:45:57 Enter "".help"" for usage hints. sqlite> INSERT OR IGNORE INTO favorited_by (tweet, user) ...> SELECT tweetId, 'YOUR_TWITTER_ID' FROM archive_like; Replace YOUR_TWITTER_ID with your numeric Twitter ID. If you don't know that ID you can find it out by running the following: $ twitter-to-sqlite fetch \ ""https://api.twitter.com/1.1/account/verify_credentials.json"" \ | grep '""id""' | head -n 1 ## Design notes * Tweet IDs are stored as integers, to afford sorting by ID in a sensible way * While we configure foreign key relationships between tables, we do not ask SQLite to enforce them. This is used by the `following` table to allow the `followers-ids` and `friends-ids` commands to populate it with user IDs even if the user accounts themselves are not yet present in the `users` table. ","

twitter-to-sqlite

Save data from Twitter to a SQLite database.

This tool currently uses Twitter API v1. You may be unable to use it if you do not have an API key for that version of the API.

How to install

$ pip install twitter-to-sqlite

Authentication

First, you will need to create a Twitter application at https://developer.twitter.com/en/apps. You may need to apply for a Twitter developer account - if so, this example of an email application that was approved in the past may be useful.

Once you have created your application, navigate to the ""Keys and tokens"" page and make note of the following:

  • Your API key
  • Your API secret key
  • Your access token
  • Your access token secret

You will need to save all four of these values to a JSON file in order to use this tool.

You can create that JSON file by running the following command and pasting in the values at the prompts:

$ twitter-to-sqlite auth
Create an app here: https://developer.twitter.com/en/apps
Then navigate to 'Keys and tokens' and paste in the following:

API key: xxx
API secret key: xxx
Access token: xxx
Access token secret: xxx

This will create a file called auth.json in your current directory containing the required values. To save the file at a different path or filename, use the --auth=myauth.json option.

Retrieving tweets by specific accounts

The user-timeline command retrieves all of the tweets posted by the specified user accounts. It defaults to the account belonging to the authenticated user:

$ twitter-to-sqlite user-timeline twitter.db
Importing tweets  [#####-------------------------------]  2799/17780  00:01:39

All of these commands assume that there is an auth.json file in the current directory. You can provide the path to your auth.json file using -a:

$ twitter-to-sqlite user-timeline twitter.db -a /path/to/auth.json

To load tweets for other users, pass their screen names as arguments:

$ twitter-to-sqlite user-timeline twitter.db cleopaws nichemuseums

Twitter's API only returns up to around 3,200 tweets for most user accounts, but you may find that it returns all available tweets for your own user account.

You can pass numeric Twitter user IDs instead of screen names using the --ids parameter.

You can use --since to retrieve every tweet since the last time you imported for that user, or --since_id=xxx to retrieve every tweet since a specific tweet ID.

This command also accepts --sql and --attach options, documented below.

Retrieve user profiles in bulk

If you have a list of Twitter screen names (or user IDs) you can bulk fetch their fully inflated Twitter profiles using the users-lookup command:

$ twitter-to-sqlite users-lookup users.db simonw cleopaws

You can pass user IDs instead using the --ids option:

$ twitter-to-sqlite users-lookup users.db 12497 3166449535 --ids

This command also accepts --sql and --attach options, documented below.

Retrieve tweets in bulk

If you have a list of tweet IDs you can bulk fetch them using the statuses-lookup command:

$ twitter-to-sqlite statuses-lookup tweets.db 1122154819815239680 1122154178493575169

The --sql and --attach options are supported.

Here's a recipe to retrieve any tweets that existing tweets are in-reply-to which have not yet been stored in your database:

$ twitter-to-sqlite statuses-lookup tweets.db \
    --sql='
        select in_reply_to_status_id
        from tweets
        where in_reply_to_status_id is not null' \
    --skip-existing

The --skip-existing option means that tweets that have already been stored in the database will not be fetched again.
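
The pattern behind --skip-existing is to check which IDs are already present before making any API calls. A simplified Python sketch of that idea (not the tool's actual implementation):

import sqlite3

conn = sqlite3.connect('tweets.db')

# IDs produced by the --sql query (or passed on the command line)
requested_ids = [1122154819815239680, 1122154178493575169]

# IDs already stored in the tweets table
existing = {row[0] for row in conn.execute('select id from tweets')}

# Only the IDs not yet in the database would be fetched from the API
to_fetch = [tweet_id for tweet_id in requested_ids if tweet_id not in existing]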

Retrieving Twitter followers

The followers command retrieves details of every follower of the specified accounts. You can use it to retrieve your own followers, or you can pass one or more screen names to pull the followers for other accounts.

The following command pulls your followers and saves them in a SQLite database file called twitter.db:

$ twitter-to-sqlite followers twitter.db

This command is extremely slow, because Twitter imposes a rate limit of no more than one request per minute on this endpoint! If you are running it against an account with thousands of followers you should expect this to take several hours.

To retrieve followers for another account, use:

$ twitter-to-sqlite followers twitter.db cleopaws

This command also accepts the --ids, --sql and --attach options.

See Analyzing my Twitter followers with Datasette for the original inspiration for this command.

Retrieving friends

The friends command works like the followers command, but retrieves the specified (or currently authenticated) user's friends - defined as accounts that the user is following.

$ twitter-to-sqlite friends twitter.db

It takes the same options as the followers command.

Retrieving favorited tweets

The favorites command retrieves tweets that have been favorited by a specified user. Called without any extra arguments it retrieves tweets favorited by the currently authenticated user:

$ twitter-to-sqlite favorites faves.db

You can also use the --screen_name or --user_id arguments to retrieve favorite tweets for another user:

$ twitter-to-sqlite favorites faves-obama.db --screen_name=BarackObama

Use the --stop_after=xxx argument to retrieve only the most recent number of favorites, e.g. to get the authenticated user's 50 most recent favorites:

$ twitter-to-sqlite favorites faves.db --stop_after=50

Retrieving Twitter lists

The lists command retrieves all of the lists belonging to one or more users.

$ twitter-to-sqlite lists lists.db simonw dogsheep

This command also accepts the --sql and --attach and --ids options.

To additionally fetch the list of members for each list, use --members.

Retrieving Twitter list memberships

The list-members command can be used to retrieve details of one or more Twitter lists, including all of their members.

$ twitter-to-sqlite list-members members.db simonw/the-good-place

You can pass multiple screen_name/list_slug identifiers.

If you know the numeric IDs of the lists instead, you can use --ids:

$ twitter-to-sqlite list-members members.db 927913322841653248 --ids

Retrieving just follower and friend IDs

It's also possible to retrieve just the numeric Twitter IDs of the accounts that specific users are following (""friends"" in Twitter's API terminology) or followed-by:

$ twitter-to-sqlite followers-ids members.db simonw cleopaws

This will populate the following table with followed_id/follower_id pairs for the two specified accounts, listing every account ID that is following either of those two accounts.

$ twitter-to-sqlite friends-ids members.db simonw cleopaws

This will do the same thing but pull the IDs that those accounts are following.

Both of these commands also support --sql and --attach as an alternative to passing screen names as direct command-line arguments. You can use --ids to process the inputs as user IDs rather than screen names.

The underlying Twitter APIs have a rate limit of 15 requests every 15 minutes - though they do return up to 5,000 IDs in each call. By default both of these subcommands will wait for 61 seconds between API calls in order to stay within the rate limit - if you know you will not be making many calls, you can reduce that delay to just one second using --sleep=1.
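
The pattern behind that behaviour is a plain fetch-then-sleep loop over the paginated API. A hedged Python sketch (fetch_page here is a hypothetical placeholder for the actual API call):

import time

def fetch_page(cursor):
    # Placeholder for a single call to the followers/ids or friends/ids API -
    # each call can return up to 5,000 IDs plus a cursor for the next page
    raise NotImplementedError

def fetch_all_ids(sleep=61):
    cursor = -1
    while cursor:
        ids, cursor = fetch_page(cursor)
        yield from ids
        if cursor:
            # Pause between calls to stay inside the 15 requests / 15 minutes limit
            time.sleep(sleep)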

Retrieving tweets from your home timeline

The home-timeline command retrieves up to 800 tweets from the home timeline of the authenticated user - generally this means tweets from people you follow.

$ twitter-to-sqlite home-timeline twitter.db
Importing timeline  [#################--------]  591/800  00:01:14

The tweets are stored in the tweets table, and a record is added to the timeline_tweets table noting that this tweet came in due to being spotted in the timeline of your user.

You can use --since to retrieve just tweets that have been posted since the last time this command was run, or --since_id=xxx to explicitly pass in a tweet ID to use as the last position.

You can then view your timeline in Datasette using the following URL:

/tweets/tweets?_where=id+in+(select+tweet+from+[timeline_tweets])&_sort_desc=id&_facet=user

This will filter your tweets table to just tweets that appear in your timeline, ordered by most recent first and use faceting to show you which users are responsible for the most tweets.

Retrieving your mentions

The mentions-timeline command works like home-timeline except it retrieves tweets that mention the authenticated user's account. It records the user account that was mentioned in a mentions_tweets table.

It supports --since and --since_id in the same way as home-timeline does.

Providing input from a SQL query with --sql and --attach

This option is available for some subcommands - run twitter-to-sqlite command-name --help to check.

You can provide Twitter screen names (or user IDs or tweet IDs) directly as command-line arguments, or you can provide those screen names or IDs by executing a SQL query.

For example: consider a SQLite database with an attendees table listing names and Twitter accounts - something like this:

First   Last       Twitter
Simon   Willison   simonw
Avril   Lavigne    AvrilLavigne

You can run the users-lookup command to pull the Twitter profile of every user listed in that database by loading the screen names using a --sql query:

$ twitter-to-sqlite users-lookup my.db --sql=""select Twitter from attendees""

If your database table contains Twitter IDs, you can select those IDs and pass the --ids argument. For example, to fetch the profiles of users who have had their user IDs inserted into the following table using the twitter-to-sqlite friends-ids command:

$ twitter-to-sqlite users-lookup my.db --sql=""select follower_id from following"" --ids

Or to avoid re-fetching users that have already been fetched:

$ twitter-to-sqlite users-lookup my.db \
    --sql=""select followed_id from following where followed_id not in (
        select id from users)"" --ids

If your data lives in a separate database file you can attach it using --attach. For example, consider the attendees example above but the data lives in an attendees.db file, and you want to fetch the user profiles into a tweets.db file. You could do that like this:

$ twitter-to-sqlite users-lookup tweets.db \
    --attach=attendees.db \
    --sql=""select Twitter from attendees.attendees""

The filename (without the extension) will be used as the database alias within SQLite. If you want a different alias for some reason you can specify that with a colon like this:

$ twitter-to-sqlite users-lookup tweets.db \
    --attach=foo:attendees.db \
    --sql=""select Twitter from foo.attendees""

Running searches

The search command runs a search against the Twitter standard search API.

$ twitter-to-sqlite search tweets.db ""dogsheep""

This will import up to around 320 tweets that match that search term into the tweets table. It will also create a record in the search_runs table recording that the search took place, and many-to-many records in the search_runs_tweets table recording which tweets were seen for that search at that time.

You can use the --since parameter to check for previous search runs with the same arguments and only retrieve tweets that were posted since the last retrieved matching tweet.

The following additional options for search are supported:

  • --geocode: latitude,longitude,radius where radius is a number followed by mi or km
  • --lang: ISO 639-1 language code e.g. en or es
  • --locale: Locale: only ja is currently effective
  • --result_type: mixed, recent or popular. Defaults to mixed
  • --count: Number of results per page, defaults to the maximum of 100
  • --stop_after: Stop after this many results
  • --since_id: Pull tweets since this Tweet ID. You probably want to use --since instead of this.

Capturing tweets in real-time with track and follow

This functionality is experimental. Please file bug reports if you find any!

Twitter provides a real-time API which can be used to subscribe to tweets as they happen. twitter-to-sqlite can use this API to continually update a SQLite database with tweets matching certain keywords, or referencing specific users.

track

To track keywords, use the track command:

$ twitter-to-sqlite track tweets.db kakapo

This command will continue to run until you hit Ctrl+C. It will capture any tweets mentioning the keyword kakapo and store them in the tweets.db database file.

You can pass multiple keywords as a space separated list. This will capture tweets matching either of those keywords:

$ twitter-to-sqlite track tweets.db kakapo raccoon

You can enclose phrases in quotes to search for tweets matching both of those keywords:

$ twitter-to-sqlite track tweets.db 'trash panda'

See the Twitter track documentation for advanced tips on using this command.

Add the --verbose option to see matching tweets (in their verbose JSON form) displayed to the terminal as they are captured:

$ twitter-to-sqlite track tweets.db raccoon --verbose

follow

The follow command will capture all tweets that are relevant to one or more specific Twitter users.

$ twitter-to-sqlite follow tweets.db nytimes

This includes tweets by those users, tweets that reply to or quote those users, and retweets by those users. See the Twitter follow documentation for full details.

The command accepts one or more screen names.

You can feed it numeric Twitter user IDs instead of screen names by using the --ids flag.

The command also supports the --sql and --attach options, and the --verbose option for displaying tweets as they are captured.

Here's how to start following tweets from every user ID currently represented as being followed in the following table (populated using the friends-ids command):

$ twitter-to-sqlite follow tweets.db \
    --sql=""select distinct followed_id from following"" \
    --ids

Importing data from your Twitter archive

You can request an archive of your Twitter data by following these instructions.

Twitter will send you a link to download a .zip file. You can import the contents of that file into a set of tables in a new database file called archive.db (each table beginning with the archive_ prefix) using the import command:

$ twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2.zip

This command does not populate any of the regular tables, since Twitter's export data does not exactly match the schema returned by the Twitter API.

It will delete and recreate the corresponding archive_* tables every time you run it. If this is not what you want, run the command against a new SQLite database file name rather than running it against one that already exists.

If you have already decompressed your archive, you can run this against the directory that you decompressed it to:

$ twitter-to-sqlite import archive.db ~/Downloads/twitter-2019-06-25-b31f2/

You can also run it against one or more specific files within that folder. For example, to import just the follower.js and following.js files:

$ twitter-to-sqlite import archive.db \
    ~/Downloads/twitter-2019-06-25-b31f2/follower.js \
    ~/Downloads/twitter-2019-06-25-b31f2/following.js

You may want to use other commands to populate tables based on data from the archive. For example, to retrieve full API versions of each of the tweets you have favourited in your archive, you could run the following:

$ twitter-to-sqlite statuses-lookup archive.db \
    --sql='select tweetId from archive_like' \
    --skip-existing

If you want these imported tweets to then be reflected in the favorited_by table, you can do so by running the following SQL:

$ sqlite3 archive.db
SQLite version 3.22.0 2018-01-22 18:45:57
Enter "".help"" for usage hints.
sqlite> INSERT OR IGNORE INTO favorited_by (tweet, user)
   ...>     SELECT tweetId, 'YOUR_TWITTER_ID' FROM archive_like;
<Ctrl+D>

Replace YOUR_TWITTER_ID with your numeric Twitter ID. If you don't know that ID you can find it out by running the following:

$ twitter-to-sqlite fetch \
    ""https://api.twitter.com/1.1/account/verify_credentials.json"" \
    | grep '""id""' | head -n 1
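
If you prefer to do this from Python instead of the sqlite3 shell, here is a rough equivalent of the insert above using the standard library sqlite3 module; 12345 below is a placeholder you should replace with your own numeric Twitter ID.

import sqlite3

conn = sqlite3.connect('archive.db')
my_twitter_id = 12345  # placeholder - replace with your numeric Twitter ID
conn.execute(
    'INSERT OR IGNORE INTO favorited_by (tweet, user) '
    'SELECT tweetId, ? FROM archive_like',
    (my_twitter_id,),
)
conn.commit()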

Design notes

  • Tweet IDs are stored as integers, to afford sorting by ID in a sensible way
  • While we configure foreign key relationships between tables, we do not ask SQLite to enforce them. This allows the followers-ids and friends-ids commands to populate the following table with user IDs even if those user accounts are not yet present in the users table.
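
The second point works because SQLite leaves foreign key enforcement switched off for a connection unless you explicitly enable it. A quick standard-library illustration:

import sqlite3

conn = sqlite3.connect('tweets.db')
# Enforcement is off by default, so rows referencing users that do not
# exist yet can still be inserted into the following table
print(conn.execute('PRAGMA foreign_keys').fetchone())  # (0,)
# Enabling it would make SQLite reject such rows for this connection
conn.execute('PRAGMA foreign_keys = ON')
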
",1,public,0,,, 206202864,MDEwOlJlcG9zaXRvcnkyMDYyMDI4NjQ=,inaturalist-to-sqlite,dogsheep/inaturalist-to-sqlite,0,53015001,https://github.com/dogsheep/inaturalist-to-sqlite,Create a SQLite database containing your observation history from iNaturalist,0,2019-09-04T01:21:21Z,2020-12-19T05:18:38Z,2020-10-22T00:08:58Z,,17,2,2,Python,1,1,1,1,0,0,0,0,0,apache-2.0,"[""sqlite"", ""inaturalist"", ""datasette"", ""dogsheep"", ""datasette-io"", ""datasette-tool""]",0,0,2,master,"{""admin"": false, ""push"": false, ""pull"": false}",,53015001,0,1,"# inaturalist-to-sqlite [![PyPI](https://img.shields.io/pypi/v/inaturalist-to-sqlite.svg)](https://pypi.org/project/inaturalist-to-sqlite/) [![CircleCI](https://circleci.com/gh/dogsheep/inaturalist-to-sqlite.svg?style=svg)](https://circleci.com/gh/dogsheep/inaturalist-to-sqlite) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/inaturalist-to-sqlite/blob/master/LICENSE) Create a SQLite database containing your observation history from [iNaturalist](https://www.inaturalist.org/). ## How to install $ pip install inaturalist-to-sqlite ## Usage $ inaturalist-to-sqlite inaturalist.db yourusername (Or try `simonw` if you don't yet have an iNaturalist account) This will import all of your iNaturalist observations into a SQLite database called `inaturalist.db`.","

inaturalist-to-sqlite

Create a SQLite database containing your observation history from iNaturalist.

How to install

$ pip install inaturalist-to-sqlite

Usage

$ inaturalist-to-sqlite inaturalist.db yourusername

(Or try simonw if you don't yet have an iNaturalist account)

This will import all of your iNaturalist observations into a SQLite database called inaturalist.db.

",,,,,, 206649770,MDEwOlJlcG9zaXRvcnkyMDY2NDk3NzA=,google-takeout-to-sqlite,dogsheep/google-takeout-to-sqlite,0,53015001,https://github.com/dogsheep/google-takeout-to-sqlite,Save data from Google Takeout to a SQLite database,0,2019-09-05T20:15:15Z,2021-06-08T15:31:47Z,2021-02-24T00:34:55Z,,14,51,51,Python,1,1,1,1,0,4,0,0,6,apache-2.0,"[""google"", ""sqlite"", ""datasette"", ""dogsheep"", ""datasette-io"", ""datasette-tool""]",4,6,51,master,"{""admin"": false, ""push"": false, ""pull"": false}",,53015001,4,3,"# google-takeout-to-sqlite [![PyPI](https://img.shields.io/pypi/v/google-takeout-to-sqlite.svg)](https://pypi.org/project/google-takeout-to-sqlite/) [![CircleCI](https://circleci.com/gh/dogsheep/google-takeout-to-sqlite.svg?style=svg)](https://circleci.com/gh/dogsheep/google-takeout-to-sqlite) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/google-takeout-to-sqlite/blob/master/LICENSE) Save data from google-takeout to a SQLite database. ## How to install $ pip install google-takeout-to-sqlite Request your Google data from https://takeout.google.com/ - wait for the email and download the zip file. This tool only supports a subset of the available options. More will be added over time. ## My Activity You can request the ""My Activity"" export and then import it with the following command: $ google-takeout-to-sqlite my-activity takeout.db ~/Downloads/takeout-20190530.zip This will create a database file called `takeout.db` if one does not already exist. ## Location History Your location history records latitude, longitude and timestame for where Google has tracked your location. You can import it using this command: $ google-takeout-to-sqlite location-history takeout.db ~/Downloads/takeout-20190530.zip ## Browsing your data with Datasette Once you have imported Google data into a SQLite database file you can browse your data using [Datasette](https://github.com/simonw/datasette). Install Datasette like so: $ pip install datasette Now browse your data by running this and then visiting `http://localhost:8001/` $ datasette takeout.db Install the [datasette-cluster-map](https://github.com/simonw/datasette-cluster-map) plugin to see your location history on a map: $ pip install datasette-cluster-map ","

google-takeout-to-sqlite

Save data from google-takeout to a SQLite database.

How to install

$ pip install google-takeout-to-sqlite

Request your Google data from https://takeout.google.com/ - wait for the email and download the zip file.

This tool only supports a subset of the available options. More will be added over time.

My Activity

You can request the ""My Activity"" export and then import it with the following command:

$ google-takeout-to-sqlite my-activity takeout.db ~/Downloads/takeout-20190530.zip

This will create a database file called takeout.db if one does not already exist.

Location History

Your location history records latitude, longitude and timestamp for where Google has tracked your location. You can import it using this command:

$ google-takeout-to-sqlite location-history takeout.db ~/Downloads/takeout-20190530.zip
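
As a rough sketch only (not part of the tool), you can read the imported rows back out with Python's sqlite3 module. The table and column names below (location_history, latitude, longitude) are assumptions - check the schema of your takeout.db after importing.

import sqlite3

conn = sqlite3.connect('takeout.db')
# Hypothetical table and column names - verify against the imported schema
for latitude, longitude in conn.execute(
    'select latitude, longitude from location_history limit 5'
):
    print(latitude, longitude)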

Browsing your data with Datasette

Once you have imported Google data into a SQLite database file you can browse your data using Datasette. Install Datasette like so:

$ pip install datasette

Now browse your data by running this and then visiting http://localhost:8001/

$ datasette takeout.db

Install the datasette-cluster-map plugin to see your location history on a map:

$ pip install datasette-cluster-map
",,,,,, 207052882,MDEwOlJlcG9zaXRvcnkyMDcwNTI4ODI=,github-to-sqlite,dogsheep/github-to-sqlite,0,53015001,https://github.com/dogsheep/github-to-sqlite,Save data from GitHub to a SQLite database,0,2019-09-08T02:50:28Z,2022-09-20T04:36:37Z,2022-09-28T21:07:54Z,https://github-to-sqlite.dogsheep.net/,143,235,235,Python,1,1,1,1,0,32,0,0,20,apache-2.0,"[""datasette"", ""datasette-io"", ""datasette-tool"", ""dogsheep"", ""github-api"", ""sqlite""]",32,20,235,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,53015001,32,6,"# github-to-sqlite [![PyPI](https://img.shields.io/pypi/v/github-to-sqlite.svg)](https://pypi.org/project/github-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/github-to-sqlite?include_prereleases&label=changelog)](https://github.com/dogsheep/github-to-sqlite/releases) [![Tests](https://github.com/dogsheep/github-to-sqlite/workflows/Test/badge.svg)](https://github.com/dogsheep/github-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/github-to-sqlite/blob/main/LICENSE) Save data from GitHub to a SQLite database. - [Demo](#demo) - [How to install](#how-to-install) - [Authentication](#authentication) - [Fetching issues for a repository](#fetching-issues-for-a-repository) - [Fetching pull requests for a repository](#fetching-pull-requests-for-a-repository) - [Fetching issue comments for a repository](#fetching-issue-comments-for-a-repository) - [Fetching commits for a repository](#fetching-commits-for-a-repository) - [Fetching releases for a repository](#fetching-releases-for-a-repository) - [Fetching tags for a repository](#fetching-tags-for-a-repository) - [Fetching contributors to a repository](#fetching-contributors-to-a-repository) - [Fetching repos belonging to a user or organization](#fetching-repos-belonging-to-a-user-or-organization) - [Fetching specific repositories](#fetching-specific-repositories) - [Fetching repos that have been starred by a user](#fetching-repos-that-have-been-starred-by-a-user) - [Fetching users that have starred specific repos](#fetching-users-that-have-starred-specific-repos) - [Fetching GitHub Actions workflows](#fetching-github-actions-workflows) - [Scraping dependents for a repository](#scraping-dependents-for-a-repository) - [Fetching emojis](#fetching-emojis) - [Making authenticated API calls](#making-authenticated-api-calls) ## Demo https://github-to-sqlite.dogsheep.net/ hosts a [Datasette](https://datasette.io/) demo of a database created by [running this tool](https://github.com/dogsheep/github-to-sqlite/blob/main/.github/workflows/deploy-demo.yml#L40-L60) against all of the repositories in the [Dogsheep GitHub organization](https://github.com/dogsheep), plus the [datasette](https://github.com/simonw/datasette) and [sqlite-utils](https://github.com/simonw/sqlite-utils) repositories. ## How to install $ pip install github-to-sqlite ## Authentication Create a GitHub personal access token: https://github.com/settings/tokens Run this command and paste in your new token: $ github-to-sqlite auth This will create a file called `auth.json` in your current directory containing the required value. To save the file at a different path or filename, use the `--auth=myauth.json` option. As an alternative to using an `auth.json` file you can add your access token to an environment variable called `GITHUB_TOKEN`. 
## Fetching issues for a repository The `issues` command retrieves all of the issues belonging to a specified repository. $ github-to-sqlite issues github.db simonw/datasette If an `auth.json` file is present it will use the token from that file. It works without authentication for public repositories but you should be aware that GitHub have strict IP-based rate limits for unauthenticated requests. You can point to a different location of `auth.json` using `-a`: $ github-to-sqlite issues github.db simonw/datasette -a /path/to/auth.json You can use the `--issue` option one or more times to load specific issues: $ github-to-sqlite issues github.db simonw/datasette --issue=1 Example: [issues table](https://github-to-sqlite.dogsheep.net/github/issues) ## Fetching pull requests for a repository While pull requests are a type of issue, you will get more information on pull requests by pulling them separately. For example, whether a pull request has been merged and when. Following the API of issues, the `pull-requests` command retrieves all of the pull requests belonging to a specified repository. $ github-to-sqlite pull-requests github.db simonw/datasette You can use the `--pull-request` option one or more times to load specific pull request: $ github-to-sqlite pull-requests github.db simonw/datasette --pull-request=81 Note that the `merged_by` column on the `pull_requests` table will only be populated for pull requests that are loaded using the `--pull-request` option - the GitHub API does not return this field for pull requests that are loaded in bulk. Example: [pull_requests table](https://github-to-sqlite.dogsheep.net/github/pull_requests) ## Fetching issue comments for a repository The `issue-comments` command retrieves all of the comments on all of the issues in a repository. It is recommended you run `issues` first, so that each imported comment can have a foreign key poining to its issue. $ github-to-sqlite issues github.db simonw/datasette $ github-to-sqlite issue-comments github.db simonw/datasette You can use the `--issue` option to only load comments for a specific issue within that repository, for example: $ github-to-sqlite issue-comments github.db simonw/datasette --issue=1 Example: [issue_comments table](https://github-to-sqlite.dogsheep.net/github/issue_comments) ## Fetching commits for a repository The `commits` command retrieves details of all of the commits for one or more repositories. It currently fetches the sha, commit message and author and committer details - it does no retrieve the full commit body. $ github-to-sqlite commits github.db simonw/datasette simonw/sqlite-utils The command accepts one or more repositories. By default it will stop as soon as it sees a commit that has previously been retrieved. You can force it to retrieve all commits (including those that have been previously inserted) using `--all`. Example: [commits table](https://github-to-sqlite.dogsheep.net/github/commits) ## Fetching releases for a repository The `releases` command retrieves the releases for one or more repositories. $ github-to-sqlite releases github.db simonw/datasette simonw/sqlite-utils The command accepts one or more repositories. Example: [releases table](https://github-to-sqlite.dogsheep.net/github/releases) ## Fetching tags for a repository The `tags` command retrieves all of the tags for one or more repositories. 
$ github-to-sqlite tags github.db simonw/datasette simonw/sqlite-utils Example: [tags table](https://github-to-sqlite.dogsheep.net/github/tags) ## Fetching contributors to a repository The `contributors` command retrieves details of all of the contributors for one or more repositories. $ github-to-sqlite contributors github.db simonw/datasette simonw/sqlite-utils The command accepts one or more repositories. It populates a `contributors` table, with foreign keys to `repos` and `users` and a `contributions` table listing the number of commits to that repository for each contributor. Example: [contributors table](https://github-to-sqlite.dogsheep.net/github/contributors) ## Fetching repos belonging to a user or organization The `repos` command fetches repos belonging to a user or organization. Without any other arguments, this command will fetch all repos that the currently authenticated user owns, collaborates on or can access via one of their organizations: $ github-to-sqlite repos github.db To fetch repos belonging to a specific user or organization, provide their username as an argument: $ github-to-sqlite repos github.db dogsheep # organization $ github-to-sqlite repos github.db simonw # user You can pass more than one username to fetch for multiple users or organizations at once: $ github-to-sqlite repos github.db simonw dogsheep Add the `--readme` option to save the README for the repo in a column called `readme`. Add `--readme-html` to save the HTML rendered version of the README into a collumn called `readme_html`. Example: [repos table](https://github-to-sqlite.dogsheep.net/github/repos) ## Fetching specific repositories You can use `-r` with the `repos` command one or more times to fetch just specific repositories. $ github-to-sqlite repos github.db -r simonw/datasette -r dogsheep/github-to-sqlite ## Fetching repos that have been starred by a user The `starred` command fetches the repos that have been starred by a user. $ github-to-sqlite starred github.db simonw If you are using an `auth.json` file you can omit the username to retrieve the starred repos for the authenticated user. Example: [stars table](https://github-to-sqlite.dogsheep.net/github/stars) ## Fetching users that have starred specific repos The `stargazers` command fetches the users that have starred the specified repos. $ github-to-sqlite stargazers github.db simonw/datasette dogsheep/github-to-sqlite You can specify one or more repository using `owner/repo` syntax. Users fetched using this command will be inserted into the `users` table. Many-to-many records showing which repository they starred will be added to the `stars` table. ## Fetching GitHub Actions workflows The `workflows` command fetches the YAML workflow configurations from each repository's `.github/workflows` directory and parses them to populate `workflows`, `jobs` and `steps` tables. $ github-to-sqlite workflows github.db simonw/datasette dogsheep/github-to-sqlite You can specify one or more repository using `owner/repo` syntax. Example: [workflows table](https://github-to-sqlite.dogsheep.net/github/workflows), [jobs table](https://github-to-sqlite.dogsheep.net/github/jobs), [steps table](https://github-to-sqlite.dogsheep.net/github/steps) ## Scraping dependents for a repository The GitHub dependency graph can show other GitHub projects that depend on a specific repo, for example [simonw/datasette/network/dependents](https://github.com/simonw/datasette/network/dependents). This data is not yet available through the GitHub API. 
The `scrape-dependents` command scrapes those pages and uses the GitHub API to load full versions of the dependent repositories. $ github-to-sqlite scrape-dependents github.db simonw/datasette The command accepts one or more repositories. Add `-v` for verbose output. Example: [dependents table](https://github-to-sqlite.dogsheep.net/github/dependents?_sort_desc=first_seen_utc) ## Fetching emojis You can fetch a list of every emoji supported by GitHub using the `emojis` command: $ github-to-sqlite emojis github.db This will create a table callad `emojis` with a primary key `name` and a `url` column. If you add the `--fetch` option the command will also fetch the binary content of the images and place them in an `image` column: $ github-to-sqlite emojis emojis.db -f [########----------------------------] 397/1799 22% 00:03:43 You can then use the [datasette-render-images](https://github.com/simonw/datasette-render-images) plugin to browse them visually. Example: [emojis table](https://github-to-sqlite.dogsheep.net/github/emojis) ## Making authenticated API calls The `github-to-sqlite get` command provides a convenient shortcut for making authenticated calls to the API. Once you have created your `auth.json` file (or set a `GITHUB_TOKEN` environment variable) you can use it like this: $ github-to-sqlite get https://api.github.com/gists This will make an authenticated call to the URL you provide and pretty-print the resulting JSON to the console. You can ommit the `https://api.github.com/` prefix, for example: $ github-to-sqlite get /gists Many GitHub APIs are [paginated using the HTTP Link header](https://docs.github.com/en/rest/guides/traversing-with-pagination). You can follow this pagination and output a list of all of the resulting items using `--paginate`: $ github-to-sqlite get /users/simonw/repos --paginate You can outline newline-delimited JSON for each item using `--nl`. This can be useful for streaming items into another tool. $ github-to-sqlite get /users/simonw/repos --nl ","

github-to-sqlite

Save data from GitHub to a SQLite database.

Demo

https://github-to-sqlite.dogsheep.net/ hosts a Datasette demo of a database created by running this tool against all of the repositories in the Dogsheep GitHub organization, plus the datasette and sqlite-utils repositories.

How to install

$ pip install github-to-sqlite

Authentication

Create a GitHub personal access token: https://github.com/settings/tokens

Run this command and paste in your new token:

$ github-to-sqlite auth

This will create a file called auth.json in your current directory containing the required value. To save the file at a different path or filename, use the --auth=myauth.json option.

As an alternative to using an auth.json file you can add your access token to an environment variable called GITHUB_TOKEN.

Fetching issues for a repository

The issues command retrieves all of the issues belonging to a specified repository.

$ github-to-sqlite issues github.db simonw/datasette

If an auth.json file is present it will use the token from that file. It works without authentication for public repositories but you should be aware that GitHub have strict IP-based rate limits for unauthenticated requests.

You can point to a different location of auth.json using -a:

$ github-to-sqlite issues github.db simonw/datasette -a /path/to/auth.json

You can use the --issue option one or more times to load specific issues:

$ github-to-sqlite issues github.db simonw/datasette --issue=1

Example: issues table

Fetching pull requests for a repository

While pull requests are a type of issue, you will get more information on pull requests by pulling them separately. For example, whether a pull request has been merged and when.

Following the API of issues, the pull-requests command retrieves all of the pull requests belonging to a specified repository.

$ github-to-sqlite pull-requests github.db simonw/datasette

You can use the --pull-request option one or more times to load specific pull requests:

$ github-to-sqlite pull-requests github.db simonw/datasette --pull-request=81

Note that the merged_by column on the pull_requests table will only be populated for pull requests that are loaded using the --pull-request option - the GitHub API does not return this field for pull requests that are loaded in bulk.
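
For example, here is a minimal Python check (standard library sqlite3 module) for how many pull requests in your database have merged_by populated - remember this is only filled in for pull requests loaded individually with --pull-request:

import sqlite3

conn = sqlite3.connect('github.db')
count, = conn.execute(
    'select count(*) from pull_requests where merged_by is not null'
).fetchone()
print(count, 'pull requests have merged_by recorded')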

Example: pull_requests table

Fetching issue comments for a repository

The issue-comments command retrieves all of the comments on all of the issues in a repository.

It is recommended you run issues first, so that each imported comment can have a foreign key pointing to its issue.

$ github-to-sqlite issues github.db simonw/datasette
$ github-to-sqlite issue-comments github.db simonw/datasette

You can use the --issue option to only load comments for a specific issue within that repository, for example:

$ github-to-sqlite issue-comments github.db simonw/datasette --issue=1

Example: issue_comments table

Fetching commits for a repository

The commits command retrieves details of all of the commits for one or more repositories. It currently fetches the sha, commit message and author and committer details - it does not retrieve the full commit body.

$ github-to-sqlite commits github.db simonw/datasette simonw/sqlite-utils

The command accepts one or more repositories.

By default it will stop as soon as it sees a commit that has previously been retrieved. You can force it to retrieve all commits (including those that have been previously inserted) using --all.

Example: commits table

Fetching releases for a repository

The releases command retrieves the releases for one or more repositories.

$ github-to-sqlite releases github.db simonw/datasette simonw/sqlite-utils

The command accepts one or more repositories.

Example: releases table

Fetching tags for a repository

The tags command retrieves all of the tags for one or more repositories.

$ github-to-sqlite tags github.db simonw/datasette simonw/sqlite-utils

Example: tags table

Fetching contributors to a repository

The contributors command retrieves details of all of the contributors for one or more repositories.

$ github-to-sqlite contributors github.db simonw/datasette simonw/sqlite-utils

The command accepts one or more repositories. It populates a contributors table, with foreign keys to repos and users and a contributions table listing the number of commits to that repository for each contributor.

Example: contributors table

Fetching repos belonging to a user or organization

The repos command fetches repos belonging to a user or organization.

Without any other arguments, this command will fetch all repos that the currently authenticated user owns, collaborates on or can access via one of their organizations:

$ github-to-sqlite repos github.db

To fetch repos belonging to a specific user or organization, provide their username as an argument:

$ github-to-sqlite repos github.db dogsheep # organization
$ github-to-sqlite repos github.db simonw # user

You can pass more than one username to fetch for multiple users or organizations at once:

$ github-to-sqlite repos github.db simonw dogsheep

Add the --readme option to save the README for the repo in a column called readme. Add --readme-html to save the HTML rendered version of the README into a column called readme_html.

Example: repos table

Fetching specific repositories

You can use -r with the repos command one or more times to fetch just specific repositories.

$ github-to-sqlite repos github.db -r simonw/datasette -r dogsheep/github-to-sqlite

Fetching repos that have been starred by a user

The starred command fetches the repos that have been starred by a user.

$ github-to-sqlite starred github.db simonw

If you are using an auth.json file you can omit the username to retrieve the starred repos for the authenticated user.

Example: stars table

Fetching users that have starred specific repos

The stargazers command fetches the users that have starred the specified repos.

$ github-to-sqlite stargazers github.db simonw/datasette dogsheep/github-to-sqlite

You can specify one or more repositories using owner/repo syntax.

Users fetched using this command will be inserted into the users table. Many-to-many records showing which repository they starred will be added to the stars table.

Fetching GitHub Actions workflows

The workflows command fetches the YAML workflow configurations from each repository's .github/workflows directory and parses them to populate workflows, jobs and steps tables.

$ github-to-sqlite workflows github.db simonw/datasette dogsheep/github-to-sqlite

You can specify one or more repositories using owner/repo syntax.

Example: workflows table, jobs table, steps table

Scraping dependents for a repository

The GitHub dependency graph can show other GitHub projects that depend on a specific repo, for example simonw/datasette/network/dependents.

This data is not yet available through the GitHub API. The scrape-dependents command scrapes those pages and uses the GitHub API to load full versions of the dependent repositories.

$ github-to-sqlite scrape-dependents github.db simonw/datasette

The command accepts one or more repositories.

Add -v for verbose output.

Example: dependents table

Fetching emojis

You can fetch a list of every emoji supported by GitHub using the emojis command:

$ github-to-sqlite emojis github.db

This will create a table called emojis with a primary key name and a url column.

If you add the --fetch option the command will also fetch the binary content of the images and place them in an image column:

$ github-to-sqlite emojis emojis.db -f
[########----------------------------]  397/1799   22%  00:03:43
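
Once the binary content has been fetched you can write an image back out to disk with a few lines of Python; this sketch assumes the name, url and image columns described above and that the fetched images are PNGs (GitHub serves its emoji images as PNG files):

import sqlite3

conn = sqlite3.connect('emojis.db')
row = conn.execute(
    'select name, image from emojis where image is not null limit 1'
).fetchone()
if row is not None:
    name, image = row
    with open(name + '.png', 'wb') as fp:
        fp.write(image)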

You can then use the datasette-render-images plugin to browse them visually.

Example: emojis table

Making authenticated API calls

The github-to-sqlite get command provides a convenient shortcut for making authenticated calls to the API. Once you have created your auth.json file (or set a GITHUB_TOKEN environment variable) you can use it like this:

$ github-to-sqlite get https://api.github.com/gists

This will make an authenticated call to the URL you provide and pretty-print the resulting JSON to the console.

You can omit the https://api.github.com/ prefix, for example:

$ github-to-sqlite get /gists

Many GitHub APIs are paginated using the HTTP Link header. You can follow this pagination and output a list of all of the resulting items using --paginate:

$ github-to-sqlite get /users/simonw/repos --paginate

You can output newline-delimited JSON for each item using --nl. This can be useful for streaming items into another tool.

$ github-to-sqlite get /users/simonw/repos --nl
",1,public,0,,0, 209590345,MDEwOlJlcG9zaXRvcnkyMDk1OTAzNDU=,genome-to-sqlite,dogsheep/genome-to-sqlite,0,53015001,https://github.com/dogsheep/genome-to-sqlite,Import your genome into a SQLite database,0,2019-09-19T15:38:39Z,2021-01-18T19:39:48Z,2019-09-19T15:41:17Z,,9,13,13,Python,1,1,1,1,0,0,0,0,2,apache-2.0,"[""genetics"", ""sqlite"", ""23andme"", ""personal-analytics"", ""datasette"", ""dogsheep"", ""datasette-io"", ""datasette-tool""]",0,2,13,master,"{""admin"": false, ""push"": false, ""pull"": false}",,53015001,0,2,"# genome-to-sqlite [![PyPI](https://img.shields.io/pypi/v/genome-to-sqlite.svg)](https://pypi.org/project/genome-to-sqlite/) [![CircleCI](https://circleci.com/gh/dogsheep/genome-to-sqlite.svg?style=svg)](https://circleci.com/gh/dogsheep/genome-to-sqlite) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/genome-to-sqlite/blob/master/LICENSE) Import your genome into a SQLite database. ## How to install $ pip install genome-to-sqlite ## How to use First, export your genome. This tool has only been tested against 23andMe so far. You can request an export of your genome from https://you.23andme.com/tools/data/download/ Now you can convert the resulting `export.zip` file to SQLite like so: $ genome-to-sqlite export.zip genome.db A progress bar will be displayed. You can disable this using `--silent`. ``` Importing genome [#----------------] 5% 00:01:33 ``` You can explore the resulting data using [Datasette](https://datasette.readthedocs.io/) like this: $ datasette genome.db --config facet_time_limit_ms:1000 Bumping up the facet time limit is useful in order to enable faceting by chromosome: http://127.0.0.1:8001/genome/genome?_facet=chromosome&_sort=position ","

genome-to-sqlite

Import your genome into a SQLite database.

How to install

$ pip install genome-to-sqlite

How to use

First, export your genome. This tool has only been tested against 23andMe so far. You can request an export of your genome from https://you.23andme.com/tools/data/download/

Now you can convert the resulting export.zip file to SQLite like so:

$ genome-to-sqlite export.zip genome.db

A progress bar will be displayed. You can disable this using --silent.

Importing genome  [#----------------]    5%  00:01:33

You can explore the resulting data using Datasette like this:

$ datasette genome.db --config facet_time_limit_ms:1000

Bumping up the facet time limit is useful in order to enable faceting by chromosome:

http://127.0.0.1:8001/genome/genome?_facet=chromosome&_sort=position
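
You can run the same kind of facet query directly in Python with the sqlite3 module - the genome table and its chromosome column are the ones referenced in the URL above:

import sqlite3

conn = sqlite3.connect('genome.db')
for chromosome, count in conn.execute(
    'select chromosome, count(*) from genome '
    'group by chromosome order by count(*) desc'
):
    print(chromosome, count)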

",,,,,, 213286752,MDEwOlJlcG9zaXRvcnkyMTMyODY3NTI=,pocket-to-sqlite,dogsheep/pocket-to-sqlite,0,53015001,https://github.com/dogsheep/pocket-to-sqlite,Create a SQLite database containing data from your Pocket account,0,2019-10-07T03:24:14Z,2022-08-21T21:11:59Z,2022-08-22T16:21:34Z,,20,63,63,Python,1,1,1,1,0,3,0,0,5,apache-2.0,"[""datasette"", ""datasette-io"", ""datasette-tool"", ""dogsheep"", ""pocket"", ""pocket-api"", ""sqlite""]",3,5,63,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,53015001,3,4,"# pocket-to-sqlite [![PyPI](https://img.shields.io/pypi/v/pocket-to-sqlite.svg)](https://pypi.org/project/pocket-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/pocket-to-sqlite?include_prereleases&label=changelog)](https://github.com/dogsheep/pocket-to-sqlite/releases) [![Tests](https://github.com/dogsheep/pocket-to-sqlite/workflows/Test/badge.svg)](https://github.com/dogsheep/pocket-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/pocket-to-sqlite/blob/main/LICENSE) Create a SQLite database containing data from your [Pocket](https://getpocket.com/) account. ## How to install $ pip install pocket-to-sqlite ## Usage You will need to first obtain a valid OAuth token for your Pocket account. You can do this by running the `auth` command and following the prompts: $ pocket-to-sqlite auth Visit this page and sign in with your Pocket account: https://getpocket.com/auth/author... Once you have signed in there, hit to continue Authentication tokens written to auth.json Now you can fetch all of your items from Pocket like this: $ pocket-to-sqlite fetch pocket.db The first time you run this command it will fetch all of your items, and display a progress bar while it does it. On subsequent runs it will only fetch new items. You can force it to fetch everything from the beginning again using `--all`. Use `--silent` to disable the progress bar. ## Using with Datasette The SQLite database produced by this tool is designed to be browsed using [Datasette](https://datasette.readthedocs.io/). Use the [datasette-render-timestamps](https://github.com/simonw/datasette-render-timestamps) plugin to improve the display of the timestamp values. ","

pocket-to-sqlite

Create a SQLite database containing data from your Pocket account.

How to install

$ pip install pocket-to-sqlite

Usage

You will need to first obtain a valid OAuth token for your Pocket account. You can do this by running the auth command and following the prompts:

$ pocket-to-sqlite auth
Visit this page and sign in with your Pocket account:

https://getpocket.com/auth/author...

Once you have signed in there, hit <enter> to continue
Authentication tokens written to auth.json

Now you can fetch all of your items from Pocket like this:

$ pocket-to-sqlite fetch pocket.db

The first time you run this command it will fetch all of your items, and display a progress bar while it does it.

On subsequent runs it will only fetch new items.

You can force it to fetch everything from the beginning again using --all. Use --silent to disable the progress bar.

Using with Datasette

The SQLite database produced by this tool is designed to be browsed using Datasette. Use the datasette-render-timestamps plugin to improve the display of the timestamp values.

",1,public,0,,0, 219372133,MDEwOlJlcG9zaXRvcnkyMTkzNzIxMzM=,sqlite-transform,simonw/sqlite-transform,0,9599,https://github.com/simonw/sqlite-transform,Tool for running transformations on columns in a SQLite database,0,2019-11-03T22:07:53Z,2021-08-02T22:06:23Z,2021-08-02T22:07:57Z,,64,29,29,Python,1,1,1,1,0,1,0,0,0,apache-2.0,"[""sqlite"", ""datasette-io"", ""datasette-tool""]",1,0,29,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,1,1,"# sqlite-transform ![No longer maintained](https://img.shields.io/badge/no%20longer-maintained-red) [![PyPI](https://img.shields.io/pypi/v/sqlite-transform.svg)](https://pypi.org/project/sqlite-transform/) [![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-transform?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-transform/releases) [![Tests](https://github.com/simonw/sqlite-transform/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-transform/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/sqlite-transform/blob/main/LICENSE) Tool for running transformations on columns in a SQLite database. > **:warning: This tool is no longer maintained** > > I added a new tool to [sqlite-utils](https://sqlite-utils.datasette.io/) called [sqlite-utils convert](https://sqlite-utils.datasette.io/en/stable/cli.html#converting-data-in-columns) which provides a super-set of the functionality originally provided here. `sqlite-transform` is no longer maintained, and I recommend switching to using `sqlite-utils convert` instead. ## How to install pip install sqlite-transform ## parsedate and parsedatetime These subcommands will run all values in the specified column through `dateutils.parser.parse()` and replace them with the result, formatted as an ISO timestamp or ISO date. For example, if a row in the database has an `opened` column which contains `10/10/2019 08:10:00 PM`, running the following command: sqlite-transform parsedatetime my.db mytable opened Will result in that value being replaced by `2019-10-10T20:10:00`. Using the `parsedate` subcommand here would result in `2019-10-10` instead. In the case of ambiguous dates such as `03/04/05` these commands both default to assuming American-style `mm/dd/yy` format. You can pass `--dayfirst` to specify that the day should be assumed to be first, or `--yearfirst` for the year. ## jsonsplit The `jsonsplit` subcommand takes columns that contain a comma-separated list, for example a `tags` column containing records like `""trees,park,dogs""` and converts it into a JSON array `[""trees"", ""park"", ""dogs""]`. This is useful for taking advantage of Datasette's [Facet by JSON array](https://docs.datasette.io/en/stable/facets.html#facet-by-json-array) feature. sqlite-transform jsonsplit my.db mytable tags It defaults to splitting on commas, but you can specify a different delimiter character using the `--delimiter` option, for example: sqlite-transform jsonsplit \ my.db mytable tags --delimiter ';' Values within the array will be treated as strings, so a column containing `123,552,775` will be converted into the JSON array `[""123"", ""552"", ""775""]`. You can specify a different type for these values using `--type int` or `--type float`, for example: sqlite-transform jsonsplit \ my.db mytable tags --type int This will result in that column being converted into `[123, 552, 775]`. 
## lambda for executing your own code The `lambda` subcommand lets you specify Python code which will be executed against the column. Here's how to convert a column to uppercase: sqlite-transform lambda my.db mytable mycolumn --code='str(value).upper()' The code you provide will be compiled into a function that takes `value` as a single argument. You can break your function body into multiple lines, provided the last line is a `return` statement: sqlite-transform lambda my.db mytable mycolumn --code='value = str(value) return value.upper()' You can also specify Python modules that should be imported and made available to your code using one or more `--import` options: sqlite-transform lambda my.db mytable mycolumn \ --code='""\n"".join(textwrap.wrap(value, 10))' \ --import=textwrap The `--dry-run` option will output a preview of the transformation against the first ten rows, without modifying the database. ## Saving the result to a separate column Each of these commands accepts optional `--output` and `--output-type` options. These can be used to save the result of the transformation to a separate column, which will be created if the column does not already exist. To save the result of `jsonsplit` to a new column called `json_tags`, use the following: sqlite-transform jsonsplit my.db mytable tags \ --output json_tags The type of the created column defaults to `text`, but a different column type can be specified using `--output-type`. This example will create a new floating point column called `float_id` with a copy of each item's ID increased by 0.5: sqlite-transform lambda my.db mytable id \ --code 'float(value) + 0.5' \ --output float_id \ --output-type float You can drop the original column at the end of the operation by adding `--drop`. ## Splitting a column into multiple columns Sometimes you may wish to convert a single column into multiple derived columns. For example, you may have a `location` column containing `latitude,longitude` values which you wish to split out into separate `latitude` and `longitude` columns. You can achieve this using the `--multi` option to `sqlite-transform lambda`. This option expects your `--code` function to return a Python dictionary: new columns well be created and populated for each of the keys in that dictionary. For the `latitude,longitude` example you would use the following: sqlite-transform lambda demo.db places location \ --code 'return { ""latitude"": float(value.split("","")[0]), ""longitude"": float(value.split("","")[1]), }' --multi The type of the returned values will be taken into account when creating the new columns. In this example, the resulting database schema will look like this: ```sql CREATE TABLE [places] ( [location] TEXT, [latitude] FLOAT, [longitude] FLOAT ); ``` The code function can also return `None`, in which case its output will be ignored. You can drop the original column at the end of the operation by adding `--drop`. ## Disabling the progress bar By default each command will show a progress bar. Pass `-s` or `--silent` to hide that progress bar. ","

sqlite-transform

Tool for running transformations on columns in a SQLite database.

⚠️ This tool is no longer maintained

I added a new tool to sqlite-utils called sqlite-utils convert which provides a super-set of the functionality originally provided here. sqlite-transform is no longer maintained, and I recommend switching to using sqlite-utils convert instead.

How to install

pip install sqlite-transform

parsedate and parsedatetime

These subcommands will run all values in the specified column through dateutil.parser.parse() and replace them with the result, formatted as an ISO timestamp or ISO date.

For example, if a row in the database has an opened column which contains 10/10/2019 08:10:00 PM, running the following command:

sqlite-transform parsedatetime my.db mytable opened

Will result in that value being replaced by 2019-10-10T20:10:00.

Using the parsedate subcommand here would result in 2019-10-10 instead.

In the case of ambiguous dates such as 03/04/05 these commands both default to assuming American-style mm/dd/yy format. You can pass --dayfirst to specify that the day should be assumed to be first, or --yearfirst for the year.
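
To see how those flags change the result you can experiment with dateutil directly (it is the python-dateutil package on PyPI):

from dateutil import parser

print(parser.parse('03/04/05'))                  # 2005-03-04 00:00:00 (month first)
print(parser.parse('03/04/05', dayfirst=True))   # 2005-04-03 00:00:00
print(parser.parse('03/04/05', yearfirst=True))  # 2003-04-05 00:00:00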

jsonsplit

The jsonsplit subcommand takes columns that contain a comma-separated list, for example a tags column containing records like ""trees,park,dogs"" and converts it into a JSON array [""trees"", ""park"", ""dogs""].

This is useful for taking advantage of Datasette's Facet by JSON array feature.

sqlite-transform jsonsplit my.db mytable tags

It defaults to splitting on commas, but you can specify a different delimiter character using the --delimiter option, for example:

sqlite-transform jsonsplit \
    my.db mytable tags --delimiter ';'

Values within the array will be treated as strings, so a column containing 123,552,775 will be converted into the JSON array [""123"", ""552"", ""775""].

You can specify a different type for these values using --type int or --type float, for example:

sqlite-transform jsonsplit \
    my.db mytable tags --type int

This will result in that column being converted into [123, 552, 775].
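
For illustration only (this is not the tool's internal code), the transformation jsonsplit applies to each value is roughly the following, shown here with the --type int behaviour:

import json

def jsonsplit(value, delimiter=',', coerce=int):
    # Split the stored string and re-encode it as a JSON array
    return json.dumps([coerce(item) for item in value.split(delimiter)])

print(jsonsplit('123,552,775'))  # [123, 552, 775]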

lambda for executing your own code

The lambda subcommand lets you specify Python code which will be executed against the column.

Here's how to convert a column to uppercase:

sqlite-transform lambda my.db mytable mycolumn --code='str(value).upper()'

The code you provide will be compiled into a function that takes value as a single argument. You can break your function body into multiple lines, provided the last line is a return statement:

sqlite-transform lambda my.db mytable mycolumn --code='value = str(value)
return value.upper()'

You can also specify Python modules that should be imported and made available to your code using one or more --import options:

sqlite-transform lambda my.db mytable mycolumn \
    --code='""\n"".join(textwrap.wrap(value, 10))' \
    --import=textwrap

The --dry-run option will output a preview of the transformation against the first ten rows, without modifying the database.

Saving the result to a separate column

Each of these commands accepts optional --output and --output-type options. These can be used to save the result of the transformation to a separate column, which will be created if the column does not already exist.

To save the result of jsonsplit to a new column called json_tags, use the following:

sqlite-transform jsonsplit my.db mytable tags \
  --output json_tags

The type of the created column defaults to text, but a different column type can be specified using --output-type. This example will create a new floating point column called float_id with a copy of each item's ID increased by 0.5:

sqlite-transform lambda my.db mytable id \
  --code 'float(value) + 0.5' \
  --output float_id \
  --output-type float

You can drop the original column at the end of the operation by adding --drop.

Splitting a column into multiple columns

Sometimes you may wish to convert a single column into multiple derived columns. For example, you may have a location column containing latitude,longitude values which you wish to split out into separate latitude and longitude columns.

You can achieve this using the --multi option to sqlite-transform lambda. This option expects your --code function to return a Python dictionary: new columns will be created and populated for each of the keys in that dictionary.

For the latitude,longitude example you would use the following:

sqlite-transform lambda demo.db places location \
  --code 'return {
    ""latitude"": float(value.split("","")[0]),
    ""longitude"": float(value.split("","")[1]),
  }' --multi

The type of the returned values will be taken into account when creating the new columns. In this example, the resulting database schema will look like this:

CREATE TABLE [places] (
    [location] TEXT,
    [latitude] FLOAT,
    [longitude] FLOAT
);

The code function can also return None, in which case its output will be ignored.

You can drop the original column at the end of the operation by adding --drop.

Disabling the progress bar

By default each command will show a progress bar. Pass -s or --silent to hide that progress bar.

",,,,,, 237321267,MDEwOlJlcG9zaXRvcnkyMzczMjEyNjc=,geojson-to-sqlite,simonw/geojson-to-sqlite,0,9599,https://github.com/simonw/geojson-to-sqlite,CLI tool for converting GeoJSON files to SQLite (with SpatiaLite),0,2020-01-30T22:51:05Z,2022-03-05T00:40:56Z,2022-04-13T23:39:25Z,,117,34,34,Python,1,1,1,1,0,3,0,0,4,apache-2.0,"[""datasette-io"", ""datasette-tool"", ""geojson"", ""gis"", ""sqlite""]",3,4,34,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,,3,3,"# geojson-to-sqlite [![PyPI](https://img.shields.io/pypi/v/geojson-to-sqlite.svg)](https://pypi.org/project/geojson-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/simonw/geojson-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/geojson-to-sqlite/releases) [![Tests](https://github.com/simonw/geojson-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/geojson-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/geojson-to-sqlite/blob/main/LICENSE) CLI tool for converting GeoJSON to SQLite (optionally with SpatiaLite) [RFC 7946: The GeoJSON Format](https://tools.ietf.org/html/rfc7946) ## How to install $ pip install geojson-to-sqlite ## How to use You can run this tool against a GeoJSON file like so: $ geojson-to-sqlite my.db features features.geojson This will load all of the features from the `features.geojson` file into a table called `features`. Each row will have a `geometry` column containing the feature geometry, and columns for each of the keys found in any `properties` attached to those features. (To bundle all properties into a single JSON object, use the `--properties` flag.) The table will be created the first time you run the command. On subsequent runs you can use the `--alter` option to add any new columns that are missing from the table. You can pass more than one GeoJSON file, in which case the contents of all of the files will be inserted into the same table. If your features have an `""id""` property it will be used as the primary key for the table. You can also use `--pk=PROPERTY` with the name of a different property to use that as the primary key instead. If you don't want to use the `""id""` as the primary key (maybe it contains duplicate values) you can use `--pk ''` to specify no primary key. Specifying a primary key also will allow you to upsert data into the rows instead of insert data into new rows. If no primary key is specified, a SQLite `rowid` column will be used. You can use `-` as the filename to import from standard input. For example: $ curl https://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_040_00_20m.json \ | geojson-to-sqlite my.db states - --pk GEO_ID ## Using with SpatiaLite By default, the `geometry` column will contain JSON. If you have installed the [SpatiaLite](https://www.gaia-gis.it/fossil/libspatialite/index) module for SQLite you can instead import the geometry into a geospatially indexed column. 
You can do this using the `--spatialite` option, like so: $ geojson-to-sqlite my.db features features.geojson --spatialite The tool will search for the SpatiaLite module in the following locations: - `/usr/lib/x86_64-linux-gnu/mod_spatialite.so` - `/usr/local/lib/mod_spatialite.dylib` If you have installed the module in another location, you can use the `--spatialite_mod=xxx` option to specify where: $ geojson-to-sqlite my.db features features.geojson \ --spatialite_mod=/usr/lib/mod_spatialite.dylib You can create a SpatiaLite spatial index on the `geometry` column using the `--spatial-index` option: $ geojson-to-sqlite my.db features features.geojson --spatial-index Using this option implies `--spatialite` so you do not need to add that. ## Streaming large datasets For large datasets, consider using newline-delimited JSON to stream features into the database without loading the entire feature collection into memory. For example, to load a day of earthquake reports from USGS: $ geojson-to-sqlite quakes.db quakes tests/quakes.ndjson \ --nl --pk=id --spatialite When using newline-delimited JSON, tables will also be created from the first feature, instead of guessing types based on the first 100 features. If you want to use a larger subset of your data to guess column types (for example, if some fields are inconsistent) you can use [fiona](https://fiona.readthedocs.io/en/latest/cli.html) to collect features into a single collection. $ head tests/quakes.ndjson | fio collect | \ geojson-to-sqlite quakes.db quakes - --spatialite This will take the first 10 lines from `tests/quakes.ndjson`, pass them to `fio collect`, which turns them into a single feature collection, and pass that, in turn, to `geojson-to-sqlite`. ## Using this with Datasette Databases created using this tool can be explored and published using [Datasette](https://datasette.readthedocs.io/). The Datasette documentation includes a section on [how to use it to browse SpatiaLite databases](https://datasette.readthedocs.io/en/stable/spatialite.html). The [datasette-leaflet-geojson](https://datasette.io/plugins/datasette-leaflet-geojson) plugin can be used to visualize columns containing GeoJSON geometries on a [Leaflet](https://leafletjs.com/) map. If you are using SpatiaLite you will need to output the geometry as GeoJSON in order for that plugin to work. You can do that using the SpaitaLite `AsGeoJSON()` function - something like this: ```sql select rowid, AsGeoJSON(geometry) from mytable limit 10 ``` The [datasette-geojson-map](https://datasette.io/plugins/datasette-geojson-map) is an alternative plugin which will automatically render SpatiaLite geometries as a Leaflet map on the corresponding table page, without needing you to call `AsGeoJSON(geometry)`. ","

geojson-to-sqlite

CLI tool for converting GeoJSON to SQLite (optionally with SpatiaLite)

RFC 7946: The GeoJSON Format

How to install

$ pip install geojson-to-sqlite

How to use

You can run this tool against a GeoJSON file like so:

$ geojson-to-sqlite my.db features features.geojson

This will load all of the features from the features.geojson file into a table called features.

Each row will have a geometry column containing the feature geometry, and columns for each of the keys found in any properties attached to those features. (To bundle all properties into a single JSON object, use the --properties flag.)

The table will be created the first time you run the command.

On subsequent runs you can use the --alter option to add any new columns that are missing from the table.

You can pass more than one GeoJSON file, in which case the contents of all of the files will be inserted into the same table.

If your features have an ""id"" property it will be used as the primary key for the table. You can also use --pk=PROPERTY with the name of a different property to use that as the primary key instead. If you don't want to use the ""id"" as the primary key (maybe it contains duplicate values) you can use --pk '' to specify no primary key.

Specifying a primary key will also allow you to upsert data into existing rows instead of inserting new rows.

If no primary key is specified, a SQLite rowid column will be used.
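
Because the geometry column holds GeoJSON text by default (unless you use --spatialite, covered below), you can read it back out and parse it with a few lines of Python using the standard library sqlite3 and json modules:

import json
import sqlite3

conn = sqlite3.connect('my.db')
for rowid, geometry in conn.execute('select rowid, geometry from features limit 3'):
    parsed = json.loads(geometry)
    print(rowid, parsed['type'])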

You can use - as the filename to import from standard input. For example:

$ curl https://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_040_00_20m.json \
    | geojson-to-sqlite my.db states - --pk GEO_ID

Using with SpatiaLite

By default, the geometry column will contain JSON.

If you have installed the SpatiaLite module for SQLite you can instead import the geometry into a geospatially indexed column.

You can do this using the --spatialite option, like so:

$ geojson-to-sqlite my.db features features.geojson --spatialite

The tool will search for the SpatiaLite module in the following locations:

  • /usr/lib/x86_64-linux-gnu/mod_spatialite.so
  • /usr/local/lib/mod_spatialite.dylib

If you have installed the module in another location, you can use the --spatialite_mod=xxx option to specify where:

$ geojson-to-sqlite my.db features features.geojson \
    --spatialite_mod=/usr/lib/mod_spatialite.dylib

You can create a SpatiaLite spatial index on the geometry column using the --spatial-index option:

$ geojson-to-sqlite my.db features features.geojson --spatial-index

Using this option implies --spatialite so you do not need to add that.

Streaming large datasets

For large datasets, consider using newline-delimited JSON to stream features into the database without loading the entire feature collection into memory.

For example, to load a day of earthquake reports from USGS:

$ geojson-to-sqlite quakes.db quakes tests/quakes.ndjson \
  --nl --pk=id --spatialite

When using newline-delimited JSON, tables will also be created from the first feature, instead of guessing types based on the first 100 features.

If you want to use a larger subset of your data to guess column types (for example, if some fields are inconsistent) you can use fiona to collect features into a single collection.

$ head tests/quakes.ndjson | fio collect | \
  geojson-to-sqlite quakes.db quakes - --spatialite

This will take the first 10 lines from tests/quakes.ndjson, pass them to fio collect, which turns them into a single feature collection, and pass that, in turn, to geojson-to-sqlite.

Using this with Datasette

Databases created using this tool can be explored and published using Datasette.

The Datasette documentation includes a section on how to use it to browse SpatiaLite databases.

The datasette-leaflet-geojson plugin can be used to visualize columns containing GeoJSON geometries on a Leaflet map.

If you are using SpatiaLite you will need to output the geometry as GeoJSON in order for that plugin to work. You can do that using the SpatiaLite AsGeoJSON() function - something like this:

select rowid, AsGeoJSON(geometry) from mytable limit 10
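
The same query can be run from Python, provided your Python build allows loading SQLite extensions; adjust the mod_spatialite path to match one of the locations listed earlier:

import sqlite3

conn = sqlite3.connect('my.db')
conn.enable_load_extension(True)
# Use the path where mod_spatialite is installed on your system
conn.load_extension('/usr/local/lib/mod_spatialite.dylib')
for rowid, geojson in conn.execute(
    'select rowid, AsGeoJSON(geometry) from mytable limit 10'
):
    print(rowid, geojson)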

The datasette-geojson-map plugin is an alternative which will automatically render SpatiaLite geometries as a Leaflet map on the corresponding table page, without needing you to call AsGeoJSON(geometry).

",1,public,0,,, 240815938,MDEwOlJlcG9zaXRvcnkyNDA4MTU5Mzg=,shapefile-to-sqlite,simonw/shapefile-to-sqlite,0,9599,https://github.com/simonw/shapefile-to-sqlite,Load shapefiles into a SQLite (optionally SpatiaLite) database,0,2020-02-16T01:55:29Z,2021-03-26T08:39:43Z,2020-08-23T06:00:41Z,,54,15,15,Python,1,1,1,1,0,0,0,0,3,apache-2.0,"[""sqlite"", ""gis"", ""spatialite"", ""shapefiles"", ""datasette"", ""datasette-io"", ""datasette-tool""]",0,3,15,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,0,1,"# shapefile-to-sqlite [![PyPI](https://img.shields.io/pypi/v/shapefile-to-sqlite.svg)](https://pypi.org/project/shapefile-to-sqlite/) [![CircleCI](https://circleci.com/gh/simonw/shapefile-to-sqlite.svg?style=svg)](https://circleci.com/gh/simonw/shapefile-to-sqlite) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/shapefile-to-sqlite/blob/main/LICENSE) Load shapefiles into a SQLite (optionally SpatiaLite) database. Project background: [Things I learned about shapefiles building shapefile-to-sqlite](https://simonwillison.net/2020/Feb/19/shapefile-to-sqlite/) ## How to install $ pip install shapefile-to-sqlite ## How to use You can run this tool against a shapefile file like so: $ shapefile-to-sqlite my.db features.shp This will load the geometries as GeoJSON in a text column. ## Using with SpatiaLite If you have [SpatiaLite](https://www.gaia-gis.it/fossil/libspatialite/index) available you can load them as SpatiaLite geometries like this: $ shapefile-to-sqlite my.db features.shp --spatialite The data will be loaded into a table called `features` - based on the name of the shapefile. You can specify an alternative table name using `--table`: $ shapefile-to-sqlite my.db features.shp --table=places --spatialite The tool will search for the SpatiaLite module in the following locations: - `/usr/lib/x86_64-linux-gnu/mod_spatialite.so` - `/usr/local/lib/mod_spatialite.dylib` If you have installed the module in another location, you can use the `--spatialite_mod=xxx` option to specify where: $ shapefile-to-sqlite my.db features.shp \ --spatialite_mod=/usr/lib/mod_spatialite.dylib You can use the `--spatial-index` option to create a spatial index on the `geometry` column: $ shapefile-to-sqlite my.db features.shp --spatial-index You can omit `--spatialite` if you use either `--spatialite-mod` or `--spatial-index`. ## Projections By default, this tool will attempt to convert geometries in the shapefile to the WGS 84 projection, for best conformance with the [GeoJSON specification](https://tools.ietf.org/html/rfc7946). If you want it to leave the data in whatever projection was used by the shapefile, use the `--crs=keep` option. You can convert the data to another output projection by passing it to the `--crs` option. For example, to convert to [EPSG:2227](https://epsg.io/2227) (California zone 3) use `--crs=espg:2227`. The full list of formats accepted by the `--crs` option is [documented here](https://pyproj4.github.io/pyproj/stable/api/crs.html#pyproj.crs.CRS.__init__). 
## Extracting columns If your data contains columns with a small number of heavily duplicated values - the names of specific agencies responsible for parcels of land for example - you can extract those columns into separate lookup tables referenced by foreign keys using the `-c` option: $ shapefile-to-sqlite my.db features.shp -c agency This will create a `agency` table with `id` and `name` columns, and will create the `agency` column in your main table as an integer foreign key reference to that table. The `-c` option can be used multiple times. [CPAD_2020a_Units](https://calands.datasettes.com/calands/CPAD_2020a_Units) is an example of a table created using the `-c` option. ","

shapefile-to-sqlite

Load shapefiles into a SQLite (optionally SpatiaLite) database.

Project background: Things I learned about shapefiles building shapefile-to-sqlite

How to install

$ pip install shapefile-to-sqlite

How to use

You can run this tool against a shapefile file like so:

$ shapefile-to-sqlite my.db features.shp

This will load the geometries as GeoJSON in a text column.

Using with SpatiaLite

If you have SpatiaLite available you can load them as SpatiaLite geometries like this:

$ shapefile-to-sqlite my.db features.shp --spatialite

The data will be loaded into a table called features - based on the name of the shapefile. You can specify an alternative table name using --table:

$ shapefile-to-sqlite my.db features.shp --table=places --spatialite

The tool will search for the SpatiaLite module in the following locations:

  • /usr/lib/x86_64-linux-gnu/mod_spatialite.so
  • /usr/local/lib/mod_spatialite.dylib

If you have installed the module in another location, you can use the --spatialite_mod=xxx option to specify where:

$ shapefile-to-sqlite my.db features.shp \
    --spatialite_mod=/usr/lib/mod_spatialite.dylib

You can use the --spatial-index option to create a spatial index on the geometry column:

$ shapefile-to-sqlite my.db features.shp --spatial-index

You can omit --spatialite if you use either --spatialite-mod or --spatial-index.

Projections

By default, this tool will attempt to convert geometries in the shapefile to the WGS 84 projection, for best conformance with the GeoJSON specification.

If you want it to leave the data in whatever projection was used by the shapefile, use the --crs=keep option.

You can convert the data to another output projection by passing it to the --crs option. For example, to convert to EPSG:2227 (California zone 3) use --crs=epsg:2227.
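
For example, a full command might look something like this (the database and shapefile names are illustrative):

$ shapefile-to-sqlite my.db features.shp --crs=epsg:2227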

The full list of formats accepted by the --crs option is documented here.

Extracting columns

If your data contains columns with a small number of heavily duplicated values - the names of specific agencies responsible for parcels of land for example - you can extract those columns into separate lookup tables referenced by foreign keys using the -c option:

$ shapefile-to-sqlite my.db features.shp -c agency

This will create an agency table with id and name columns, and will create the agency column in your main table as an integer foreign key reference to that table.

The -c option can be used multiple times.
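
For example, extracting two columns at once might look like this (the second column name, manager, is hypothetical):

$ shapefile-to-sqlite my.db features.shp -c agency -c manager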

CPAD_2020a_Units is an example of a table created using the -c option.

",,,,,, 245670670,MDEwOlJlcG9zaXRvcnkyNDU2NzA2NzA=,fec-to-sqlite,simonw/fec-to-sqlite,0,9599,https://github.com/simonw/fec-to-sqlite,Save FEC campaign finance data to a SQLite database,0,2020-03-07T16:52:49Z,2020-12-19T05:09:05Z,2020-03-07T18:21:48Z,,16,8,8,Python,1,1,1,1,0,0,0,0,1,apache-2.0,"[""sqlite"", ""fec"", ""datasette"", ""datasette-io"", ""datasette-tool""]",0,1,8,master,"{""admin"": false, ""push"": false, ""pull"": false}",,,0,2,"# fec-to-sqlite [![PyPI](https://img.shields.io/pypi/v/fec-to-sqlite.svg)](https://pypi.org/project/fec-to-sqlite/) [![CircleCI](https://circleci.com/gh/simonw/fec-to-sqlite.svg?style=svg)](https://circleci.com/gh/simonw/fec-to-sqlite) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/fec-to-sqlite/blob/master/LICENSE) Create a SQLite database using FEC campaign contributions data. This tool builds on [fecfile](https://github.com/esonderegger/) by Evan Sonderegger. ## How to install $ pip install fec-to-sqlite ## Usage $ fec-to-sqlite filings filings.db 1146148 This fetches the filing with ID `1146148` and stores it in tables in a SQLite database called `filings.db`. It will create any tables it needs. You can pass more than one filing ID, separated by spaces. ","

fec-to-sqlite

Create a SQLite database using FEC campaign contributions data.

This tool builds on fecfile by Evan Sonderegger.

How to install

$ pip install fec-to-sqlite

Usage

$ fec-to-sqlite filings filings.db 1146148

This fetches the filing with ID 1146148 and stores it in tables in a SQLite database called filings.db. It will create any tables it needs.

You can pass more than one filing ID, separated by spaces.
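
For example (the second filing ID below is a made-up placeholder, not a real filing):

$ fec-to-sqlite filings filings.db 1146148 2345678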

",,,,,, 248903544,MDEwOlJlcG9zaXRvcnkyNDg5MDM1NDQ=,hacker-news-to-sqlite,dogsheep/hacker-news-to-sqlite,0,53015001,https://github.com/dogsheep/hacker-news-to-sqlite,Create a SQLite database containing data pulled from Hacker News,0,2020-03-21T04:02:05Z,2021-06-06T22:42:00Z,2021-03-13T19:15:06Z,,19,25,25,Python,1,1,1,1,0,2,0,0,0,apache-2.0,"[""hacker-news"", ""datasette"", ""dogsheep"", ""datasette-io"", ""datasette-tool""]",2,0,25,main,"{""admin"": false, ""push"": false, ""pull"": false}",,53015001,2,1,"# hacker-news-to-sqlite [![PyPI](https://img.shields.io/pypi/v/hacker-news-to-sqlite.svg)](https://pypi.org/project/hacker-news-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/hacker-news-to-sqlite?include_prereleases&label=changelog)](https://github.com/dogsheep/hacker-news-to-sqlite/releases) [![Tests](https://github.com/dogsheep/hacker-news-to-sqlite/workflows/Test/badge.svg)](https://github.com/dogsheep/hacker-news-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/hacker-news-to-sqlite/blob/main/LICENSE) Create a SQLite database containing data fetched from [Hacker News](https://news.ycombinator.com/). ## How to install $ pip install hacker-news-to-sqlite ## Usage $ hacker-news-to-sqlite user hacker-news.db your-username Importing items: 37%|███████████ | 845/2297 [05:09<11:02, 2.19it/s] Imports all of your Hacker News submissions and comments into a SQLite database called `hacker-news.db`. $ hacker-news-to-sqlite trees hacker-news.db 22640038 22643218 Fetches the entire comments tree in which any of those content IDs appears. ## Browsing your data with Datasette You can use [Datasette](https://datasette.readthedocs.org/) to browse your data. Install Datasette like this: $ pip install datasette Now run it against your `hacker-news.db` file like so: $ datasette hacker-news.db Visit `https://localhost:8001/` to search and explore your data. You can improve the display of your data usinng the [datasette-render-timestamps](https://github.com/simonw/datasette-render-timestamps) and [datasette-render-html](https://github.com/simonw/datasette-render-html) plugins. Install them like this: $ pip install datasette-render-timestamps datasette-render-html Now save the following configuration in a file called `metadata.json`: ```json { ""databases"": { ""hacker-news"": { ""tables"": { ""items"": { ""plugins"": { ""datasette-render-html"": { ""columns"": [ ""text"" ] }, ""datasette-render-timestamps"": { ""columns"": [ ""time"" ] } } }, ""users"": { ""plugins"": { ""datasette-render-timestamps"": { ""columns"": [ ""created"" ] } } } } } } } ``` Run Datasette like this: $ datasette -m metadata.json hacker-news.db The timestamp columns will now be rendered as human-readable dates, and any HTML in your posts will be displayed as rendered HTML. ","

hacker-news-to-sqlite

Create a SQLite database containing data fetched from Hacker News.

How to install

$ pip install hacker-news-to-sqlite

Usage

$ hacker-news-to-sqlite user hacker-news.db your-username
Importing items:  37%|███████████                        | 845/2297 [05:09<11:02,  2.19it/s]

Imports all of your Hacker News submissions and comments into a SQLite database called hacker-news.db.

$ hacker-news-to-sqlite trees hacker-news.db 22640038 22643218

Fetches the entire comments tree in which any of those content IDs appears.

Browsing your data with Datasette

You can use Datasette to browse your data. Install Datasette like this:

$ pip install datasette

Now run it against your hacker-news.db file like so:

$ datasette hacker-news.db

Visit http://localhost:8001/ to search and explore your data.

You can improve the display of your data using the datasette-render-timestamps and datasette-render-html plugins. Install them like this:

$ pip install datasette-render-timestamps datasette-render-html

Now save the following configuration in a file called metadata.json:

{
    ""databases"": {
        ""hacker-news"": {
            ""tables"": {
                ""items"": {
                    ""plugins"": {
                        ""datasette-render-html"": {
                            ""columns"": [
                                ""text""
                            ]
                        },
                        ""datasette-render-timestamps"": {
                            ""columns"": [
                                ""time""
                            ]
                        }
                    }
                },
                ""users"": {
                    ""plugins"": {
                        ""datasette-render-timestamps"": {
                            ""columns"": [
                                ""created""
                            ]
                        }
                    }
                }
            }
        }
    }
}

Run Datasette like this:

$ datasette -m metadata.json hacker-news.db

The timestamp columns will now be rendered as human-readable dates, and any HTML in your posts will be displayed as rendered HTML.

",,,,,, 255460347,MDEwOlJlcG9zaXRvcnkyNTU0NjAzNDc=,datasette-clone,simonw/datasette-clone,0,9599,https://github.com/simonw/datasette-clone,Create a local copy of database files from a Datasette instance,0,2020-04-13T23:05:41Z,2021-06-08T15:33:21Z,2021-02-22T19:32:36Z,,20,2,2,Python,1,1,1,1,0,0,0,0,0,apache-2.0,"[""datasette"", ""datasette-io"", ""datasette-tool""]",0,0,2,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,0,1,"# datasette-clone [![PyPI](https://img.shields.io/pypi/v/datasette-clone.svg)](https://pypi.org/project/datasette-clone/) [![Changelog](https://img.shields.io/github/v/release/simonw/datasette-clone?include_prereleases&label=changelog)](https://github.com/simonw/datasette-clone/releases) [![Tests](https://github.com/simonw/datasette-clone/workflows/Test/badge.svg)](https://github.com/simonw/datasette-clone/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/datasette-clone/blob/main/LICENSE) Create a local copy of database files from a Datasette instance. See [datasette-clone](https://simonwillison.net/2020/Apr/14/datasette-clone/) on my blog for background on this project. ## How to install $ pip install datasette-clone ## Usage This only works against Datasette instances running immutable databases (with the `-i` option). Databases published using the `datasette publish` command should be compatible with this tool. To download copies of all `.db` files from an instance, run: datasette-clone https://latest.datasette.io You can provide an optional second argument to specify a directory: datasette-clone https://latest.datasette.io /tmp/here-please The command stores its own copy of a `databases.json` manifest and uses it to only download databases that have changed the next time you run the command. It also stores a copy of the instance's `metadata.json` to ensure you have a copy of any source and licensing information for the downloaded databases. If your instance is protected by an API token, you can use `--token` to provide it: datasette-clone https://latest.datasette.io --token=xyz For verbose output showing what the tool is doing, use `-v`. ","

datasette-clone

Create a local copy of database files from a Datasette instance.

See datasette-clone on my blog for background on this project.

How to install

$ pip install datasette-clone

Usage

This only works against Datasette instances running immutable databases (with the -i option). Databases published using the datasette publish command should be compatible with this tool.
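
For reference, this is how an instance serves a database in immutable mode (the database name here is illustrative):

datasette -i data.db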

To download copies of all .db files from an instance, run:

datasette-clone https://latest.datasette.io

You can provide an optional second argument to specify a directory:

datasette-clone https://latest.datasette.io /tmp/here-please

The command stores its own copy of a databases.json manifest and uses it to only download databases that have changed the next time you run the command.

It also stores a copy of the instance's metadata.json to ensure you have a copy of any source and licensing information for the downloaded databases.

If your instance is protected by an API token, you can use --token to provide it:

datasette-clone https://latest.datasette.io --token=xyz

For verbose output showing what the tool is doing, use -v.

",,,,,, 256834907,MDEwOlJlcG9zaXRvcnkyNTY4MzQ5MDc=,dogsheep-photos,dogsheep/dogsheep-photos,0,53015001,https://github.com/dogsheep/dogsheep-photos,Upload your photos to S3 and import metadata about them into a SQLite database,0,2020-04-18T19:22:13Z,2021-11-04T20:45:03Z,2021-11-04T20:45:00Z,,68,124,124,Python,1,1,1,1,0,7,0,0,19,apache-2.0,"[""datasette"", ""datasette-io"", ""datasette-tool"", ""dogsheep"", ""sqlite""]",7,19,124,master,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,53015001,7,10,"# dogsheep-photos [![PyPI](https://img.shields.io/pypi/v/dogsheep-photos.svg)](https://pypi.org/project/dogsheep-photos/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/dogsheep-photos?include_prereleases&label=changelog)](https://github.com/dogsheep/dogsheep-photos/releases) [![CircleCI](https://circleci.com/gh/dogsheep/dogsheep-photos.svg?style=svg)](https://circleci.com/gh/dogsheep/dogsheep-photos) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/dogsheep-photos/blob/master/LICENSE) Save details of your photos to a SQLite database and upload them to S3. See [Using SQL to find my best photo of a pelican according to Apple Photos](https://simonwillison.net/2020/May/21/apple-photos-sqlite/) for background information on this project. ## What these tools do These tools are a work-in-progress mechanism for taking full ownership of your photos. The core idea is to help implement the following: * Every photo you have taken lives in a single, private Amazon S3 bucket * You have a single SQLite database file which stores metadata about those photos - potentially pulled from multiple different places. This may include EXIF data, Apple Photos, the results of running machine learning APIs against photos and much more besides. * You can then use [Datasette](https://github.com/simonw/datasette) to explore your own photos. I'm a heavy user of Apple Photos so the initial releases of this tool will have a bias towards that, but ideally I would like a subset of these tools to be useful to people no matter which core photo solution they are using. ## Installation $ pip install dogsheep-photos ## Authentication (if using S3) If you want to use S3 to store your photos, you will need to first create S3 credentials for a new, dedicated bucket. You may find the [s3-credentials tool](https://github.com/simonw/s3-credentials) useful for this. Run this command and paste in your credentials. You will need three values: the name of your S3 bucket, your Access key ID and your Secret access key. $ dogsheep-photos s3-auth This will create a file called `auth.json` in your current directory containing the required values. To save the file at a different path or filename, use the `--auth=myauth.json` option. ## Uploading photos Run this command to upload every photo in a specific directory to your S3 bucket: $ dogsheep-photos upload photos.db \ ~/Pictures/Photos\ Library.photoslibrary/original The command will only upload photos that have not yet been uploaded, based on their sha256 hash. `photos.db` will be created with an `uploads` table containing details of which files were uploaded. To see what the command would do without uploading any files, use the `--dry-run` option. The sha256 hash of the photo contents will be used as the name of the file in the bucket, with an extension matching the type of file. 
This is an implementation of the [Content addressable storage](https://en.wikipedia.org/wiki/Content-addressable_storage) pattern. ## Importing Apple Photos metadata The `apple-photos` command imports metadata from your Apple Photos library. $ photo-to-sqlite apple-photos photos.db Imported metadata includes places, people, albums, quality scores and machine learning labels for the photo contents. ## Creating a subset database You can create a new, subset database of photos using the `create-subset` command. This is useful for creating a shareable SQLite database that only contains metadata for a selected set of photos. Since photo metadata contains latitude and longitude you may not want to share a database that includes photos taken at your home address. `create-subset` takes three arguments: an existing database file created using the `apple-photos` command, the name of the new, shareable database file you would like to create and a SQL query that returns the `sha256` hash values of the photos you would like to include in that database. For example, here's how to create a shareable database of just the photos that have been added to albums containing the word ""Public"": $ dogsheep-photos create-subset \ photos.db \ public.db \ ""select sha256 from apple_photos where albums like '%Public%'"" ## Serving photos locally with datasette-media If you don't want to upload your photos to S3 but you still want to browse them using Datasette you can do so using the [datasette-media](https://github.com/simonw/datasette-media) plugin. This plugin adds the ability to serve images and other static files directly from disk, configured using a SQL query. To use it, first install Datasette and the plugin: $ pip install datasette datasette-media If any of your photos are `.HEIC` images taken by an iPhone you should also install the optional `pyheif` dependency: $ pip install pyheif Now create a `metadata.yaml` file configuring the plugin: ```yaml plugins: datasette-media: thumbnail: sql: |- select path as filepath, 200 as resize_height from apple_photos where uuid = :key large: sql: |- select path as filepath, 1024 as resize_height from apple_photos where uuid = :key ``` This will configure two URL endpoints - one for 200 pixel high thumbnails and one for 1024 pixel high larger images. Create your `photos.db` database using the `apple-photos` command, then run Datasette like this: $ datasette -m metadata.yaml Your photos will be served on URLs that look like this: http://127.0.0.1:8001/-/media/thumbnail/F4469918-13F3-43D8-9EC1-734C0E6B60AD http://127.0.0.1:8001/-/media/large/F4469918-13F3-43D8-9EC1-734C0E6B60AD You can find the UUIDs for use in these URLs by running `select uuid from photos_with_apple_metadata`. ### Displaying images using datasette-json-html If you are using `datasette-media` to serve photos you can include images directly in Datasette query results using the [datasette-json-html](https://github.com/simonw/datasette-json-html) plugin. Run `pip install datasette-json-html` to install the plugin, then use queries like this to view your images: ```sql select json_object( 'img_src', '/-/media/thumbnail/' || uuid ) as photo, uuid, date from apple_photos order by date desc limit 10; ``` The `photo` column returned by this query should render as image tags that display the correct images. 
### Displaying images using custom template pages Datasette's [custom pages](https://datasette.readthedocs.io/en/stable/custom_templates.html#custom-pages) feature lets you create custom pages for a Datasette instance by dropping HTML templates into a `templates/pages` directory and then running Datasette using `datasette --template-dir=templates/`. You can combine that ability with the [datasette-template-sql](https://github.com/simonw/datasette-template-sql) plugin to create custom template pages that directly display photos served by `datasette-media`. Install the plugin using `pip install datasette-template-sql`. Create a `templates/pages` folder and add the following files: `recent-photos.html` ```html+jinja

<h1>Recent photos</h1>

<div>
{% for photo in sql(""select * from apple_photos order by date desc limit 20"") %}
    <img src=""/-/media/photo/{{ photo['uuid'] }}"">
{% endfor %}
</div>
``` `random-photos.html` ```html+jinja

<h1>Random photos</h1>

<div>
{% for photo in sql(""with foo as (select * from apple_photos order by date desc limit 5000) select * from foo order by random() limit 20"") %}
    <img src=""/-/media/photo/{{ photo['uuid'] }}"">
{% endfor %}
</div>
``` Now run Datasette like this: $ datasette photos.db -m metadata.yaml --template-dir=templates/ Visiting `http://localhost:8001/recent-photos` will display 20 recent photos. Visiting `http://localhost:8001/random-photos` will display 20 photos randomly selected from your 5,000 most recent. ","

dogsheep-photos

Save details of your photos to a SQLite database and upload them to S3.

See Using SQL to find my best photo of a pelican according to Apple Photos for background information on this project.

What these tools do

These tools are a work-in-progress mechanism for taking full ownership of your photos. The core idea is to help implement the following:

  • Every photo you have taken lives in a single, private Amazon S3 bucket
  • You have a single SQLite database file which stores metadata about those photos - potentially pulled from multiple different places. This may include EXIF data, Apple Photos, the results of running machine learning APIs against photos and much more besides.
  • You can then use Datasette to explore your own photos.

I'm a heavy user of Apple Photos so the initial releases of this tool will have a bias towards that, but ideally I would like a subset of these tools to be useful to people no matter which core photo solution they are using.

Installation

$ pip install dogsheep-photos

Authentication (if using S3)

If you want to use S3 to store your photos, you will need to first create S3 credentials for a new, dedicated bucket.

You may find the s3-credentials tool useful for this.

Run this command and paste in your credentials. You will need three values: the name of your S3 bucket, your Access key ID and your Secret access key.

$ dogsheep-photos s3-auth

This will create a file called auth.json in your current directory containing the required values. To save the file at a different path or filename, use the --auth=myauth.json option.
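
For example, to write the credentials to a file called myauth.json instead of auth.json:

$ dogsheep-photos s3-auth --auth=myauth.json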

Uploading photos

Run this command to upload every photo in a specific directory to your S3 bucket:

$ dogsheep-photos upload photos.db \
    ~/Pictures/Photos\ Library.photoslibrary/original

The command will only upload photos that have not yet been uploaded, based on their sha256 hash.

photos.db will be created with an uploads table containing details of which files were uploaded.

To see what the command would do without uploading any files, use the --dry-run option.
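
For example, to preview the upload described above without transferring anything:

$ dogsheep-photos upload photos.db \
    ~/Pictures/Photos\ Library.photoslibrary/original --dry-run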

The sha256 hash of the photo contents will be used as the name of the file in the bucket, with an extension matching the type of file. This is an implementation of the Content addressable storage pattern.

Importing Apple Photos metadata

The apple-photos command imports metadata from your Apple Photos library.

$ dogsheep-photos apple-photos photos.db

Imported metadata includes places, people, albums, quality scores and machine learning labels for the photo contents.

Creating a subset database

You can create a new, subset database of photos using the create-subset command.

This is useful for creating a shareable SQLite database that only contains metadata for a selected set of photos.

Since photo metadata contains latitude and longitude you may not want to share a database that includes photos taken at your home address.

create-subset takes three arguments: an existing database file created using the apple-photos command, the name of the new, shareable database file you would like to create and a SQL query that returns the sha256 hash values of the photos you would like to include in that database.

For example, here's how to create a shareable database of just the photos that have been added to albums containing the word ""Public"":

$ dogsheep-photos create-subset \
    photos.db \
    public.db \
    ""select sha256 from apple_photos where albums like '%Public%'""

Serving photos locally with datasette-media

If you don't want to upload your photos to S3 but you still want to browse them using Datasette you can do so using the datasette-media plugin. This plugin adds the ability to serve images and other static files directly from disk, configured using a SQL query.

To use it, first install Datasette and the plugin:

$ pip install datasette datasette-media

If any of your photos are .HEIC images taken by an iPhone you should also install the optional pyheif dependency:

$ pip install pyheif

Now create a metadata.yaml file configuring the plugin:

plugins:
  datasette-media:
    thumbnail:
      sql: |-
        select path as filepath, 200 as resize_height from apple_photos where uuid = :key
    large:
      sql: |-
        select path as filepath, 1024 as resize_height from apple_photos where uuid = :key

This will configure two URL endpoints - one for 200 pixel high thumbnails and one for 1024 pixel high larger images.

Create your photos.db database using the apple-photos command, then run Datasette like this:

$ datasette photos.db -m metadata.yaml

Your photos will be served on URLs that look like this:

http://127.0.0.1:8001/-/media/thumbnail/F4469918-13F3-43D8-9EC1-734C0E6B60AD
http://127.0.0.1:8001/-/media/large/F4469918-13F3-43D8-9EC1-734C0E6B60AD

You can find the UUIDs for use in these URLs by running select uuid from photos_with_apple_metadata.
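
One way to run that query from the command line is with the separate sqlite-utils tool - shown here as an illustrative sketch; any SQLite client will do:

$ sqlite-utils query photos.db 'select uuid from photos_with_apple_metadata limit 10'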

Displaying images using datasette-json-html

If you are using datasette-media to serve photos you can include images directly in Datasette query results using the datasette-json-html plugin.

Run pip install datasette-json-html to install the plugin, then use queries like this to view your images:

select
    json_object(
        'img_src',
        '/-/media/thumbnail/' || uuid
    ) as photo,
    uuid,
    date
from
    apple_photos
order by
    date desc
limit 10;

The photo column returned by this query should render as image tags that display the correct images.

Displaying images using custom template pages

Datasette's custom pages feature lets you create custom pages for a Datasette instance by dropping HTML templates into a templates/pages directory and then running Datasette using datasette --template-dir=templates/.

You can combine that ability with the datasette-template-sql plugin to create custom template pages that directly display photos served by datasette-media.

Install the plugin using pip install datasette-template-sql.

Create a templates/pages folder and add the following files:

recent-photos.html

<h1>Recent photos</h1>

<div>
{% for photo in sql(""select * from apple_photos order by date desc limit 20"") %}
    <img src=""/-/media/photo/{{ photo['uuid'] }}"">
{% endfor %}
</div>

random-photos.html

<h1>Random photos</h1>

<div>
{% for photo in sql(""with foo as (select * from apple_photos order by date desc limit 5000) select * from foo order by random() limit 20"") %}
    <img src=""/-/media/photo/{{ photo['uuid'] }}"">
{% endfor %}
</div>

Now run Datasette like this:

$ datasette photos.db -m metadata.yaml --template-dir=templates/

Visiting http://localhost:8001/recent-photos will display 20 recent photos. Visiting http://localhost:8001/random-photos will display 20 photos randomly selected from your 5,000 most recent.

",1,public,0,,, 274264484,MDEwOlJlcG9zaXRvcnkyNzQyNjQ0ODQ=,sqlite-generate,simonw/sqlite-generate,0,9599,https://github.com/simonw/sqlite-generate,Tool for generating demo SQLite databases,0,2020-06-22T23:36:44Z,2021-02-27T15:25:26Z,2021-02-27T15:25:24Z,https://sqlite-generate-demo.datasette.io/,56,17,17,Python,1,1,1,1,0,0,0,0,0,apache-2.0,"[""sqlite"", ""datasette-io"", ""datasette-tool""]",0,0,17,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,0,2,"# sqlite-generate [![PyPI](https://img.shields.io/pypi/v/sqlite-generate.svg)](https://pypi.org/project/sqlite-generate/) [![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-generate?label=changelog)](https://github.com/simonw/sqlite-generate/releases) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-generate/blob/master/LICENSE) Tool for generating demo SQLite databases ## Installation Install this plugin using `pip`: $ pip install sqlite-generate ## Demo You can see a demo of the database generated using this command running in [Datasette](https://github.com/simonw/datasette) at https://sqlite-generate-demo.datasette.io/ The demo is generated using the following command: sqlite-generate demo.db --seed seed --fts --columns=10 --fks=0,3 --pks=0,2 ## Usage To generate a SQLite database file called `data.db` with 10 randomly named tables in it, run the following: sqlite-generate data.db You can use the `--tables` option to generate a different number of tables: sqlite-generate data.db --tables 20 You can run the command against the same database file multiple times to keep adding new tables, using different settings for each batch of generated tables. By default each table will contain a random number of rows between 0 and 200. You can customize this with the `--rows` option: sqlite-generate data.db --rows 20 This will insert 20 rows into each table. sqlite-generate data.db --rows 500,2000 This inserts a random number of rows between 500 and 2000 into each table. Each table will have 5 columns. You can change this using `--columns`: sqlite-generate data.db --columns 10 `--columns` can also accept a range: sqlite-generate data.db --columns 5,15 You can control the random number seed used with the `--seed` option. This will result in the exact same database file being created by multiple runs of the tool: sqlite-generate data.db --seed=myseed By default each table will contain between 0 and 2 foreign key columns to other tables. You can control this using the `--fks` option, with either a single number or a range: sqlite-generate data.db --columns=20 --fks=5,15 Each table will have a single primary key column called `id`. You can use the `--pks=` option to change the number of primary key columns on each table. Drop it to 0 to generate [rowid tables](https://www.sqlite.org/rowidtable.html). Increase it above 1 to generate tables with compound primary keys. Or use a range to get a random selection of different primary key layouts: sqlite-generate data.db --pks=0,2 To configure [SQLite full-text search](https://www.sqlite.org/fts5.html) for all columns of type text, use `--fts`: sqlite-generate data.db --fts This will use FTS5 by default. To use [FTS4](https://www.sqlite.org/fts3.html) instead, use `--fts4`. ## Development To contribute to this tool, first checkout the code. 
Then create a new virtual environment: cd sqlite-generate python -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and tests: pip install -e '.[test]' To run the tests: pytest ","

sqlite-generate

Tool for generating demo SQLite databases

Installation

Install this plugin using pip:

$ pip install sqlite-generate

Demo

You can see a demo of the database generated using this command running in Datasette at https://sqlite-generate-demo.datasette.io/

The demo is generated using the following command:

sqlite-generate demo.db --seed seed --fts --columns=10 --fks=0,3 --pks=0,2

Usage

To generate a SQLite database file called data.db with 10 randomly named tables in it, run the following:

sqlite-generate data.db

You can use the --tables option to generate a different number of tables:

sqlite-generate data.db --tables 20

You can run the command against the same database file multiple times to keep adding new tables, using different settings for each batch of generated tables.
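
For example, these two runs would add two differently-shaped batches of tables to the same file (the option values are illustrative):

sqlite-generate data.db --tables 5 --rows 10
sqlite-generate data.db --tables 3 --columns 8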

By default each table will contain a random number of rows between 0 and 200. You can customize this with the --rows option:

sqlite-generate data.db --rows 20

This will insert 20 rows into each table.

sqlite-generate data.db --rows 500,2000

This inserts a random number of rows between 500 and 2000 into each table.

Each table will have 5 columns. You can change this using --columns:

sqlite-generate data.db --columns 10

--columns can also accept a range:

sqlite-generate data.db --columns 5,15

You can control the random number seed used with the --seed option. This will result in the exact same database file being created by multiple runs of the tool:

sqlite-generate data.db --seed=myseed

By default each table will contain between 0 and 2 foreign key columns to other tables. You can control this using the --fks option, with either a single number or a range:

sqlite-generate data.db --columns=20 --fks=5,15

Each table will have a single primary key column called id. You can use the --pks= option to change the number of primary key columns on each table. Drop it to 0 to generate rowid tables. Increase it above 1 to generate tables with compound primary keys. Or use a range to get a random selection of different primary key layouts:

sqlite-generate data.db --pks=0,2

To configure SQLite full-text search for all columns of type text, use --fts:

sqlite-generate data.db --fts

This will use FTS5 by default. To use FTS4 instead, use --fts4.

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd sqlite-generate
python -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
",,,,,, 291339086,MDEwOlJlcG9zaXRvcnkyOTEzMzkwODY=,airtable-export,simonw/airtable-export,0,9599,https://github.com/simonw/airtable-export,"Export Airtable data to YAML, JSON or SQLite files on disk",0,2020-08-29T19:51:37Z,2021-06-08T17:30:30Z,2021-04-09T23:41:52Z,https://datasette.io/tools/airtable-export,41,33,33,Python,1,1,1,1,0,5,0,0,6,apache-2.0,"[""yaml"", ""airtable"", ""airtable-api"", ""datasette-io"", ""datasette-tool""]",5,6,33,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,5,3,"# airtable-export [![PyPI](https://img.shields.io/pypi/v/airtable-export.svg)](https://pypi.org/project/airtable-export/) [![Changelog](https://img.shields.io/github/v/release/simonw/airtable-export?include_prereleases&label=changelog)](https://github.com/simonw/airtable-export/releases) [![Tests](https://github.com/simonw/airtable-export/workflows/Test/badge.svg)](https://github.com/simonw/airtable-export/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/airtable-export/blob/master/LICENSE) Export Airtable data to files on disk ## Installation Install this tool using `pip`: $ pip install airtable-export ## Usage You will need to know the following information: - Your Airtable base ID - this is a string starting with `app...` - Your Airtable API key - this is a string starting with `key...` - The names of each of the tables that you wish to export You can export all of your data to a folder called `export/` by running the following: airtable-export export base_id table1 table2 --key=key This example would create two files: `export/table1.yml` and `export/table2.yml`. Rather than passing the API key using the `--key` option you can set it as an environment variable called `AIRTABLE_KEY`. ## Export options By default the tool exports your data as YAML. You can also export as JSON or as [newline delimited JSON](http://ndjson.org/) using the `--json` or `--ndjson` options: airtable-export export base_id table1 table2 --key=key --ndjson You can pass multiple format options at once. This command will create a `.json`, `.yml` and `.ndjson` file for each exported table: airtable-export export base_id table1 table2 \ --key=key --ndjson --yaml --json ### SQLite database export You can export tables to a SQLite database file using the `--sqlite database.db` option: airtable-export export base_id table1 table2 \ --key=key --sqlite database.db This can be combined with other format options. If you only specify `--sqlite` the export directory argument will be ignored. The SQLite database will have a table created for each table you export. Those tables will have a primary key column called `airtable_id`. If you run this command against an existing SQLite database records with matching primary keys will be over-written by new records from the export. ## Request options By default the tool uses [python-httpx](https://www.python-httpx.org)'s default configurations. You can override the `user-agent` using the `--user-agent` option: airtable-export export base_id table1 table2 --key=key --user-agent ""Airtable Export Robot"" You can override the [timeout during a network read operation](https://www.python-httpx.org/advanced/#fine-tuning-the-configuration) using the `--http-read-timeout` option. If not set, this defaults to 5s. 
airtable-export export base_id table1 table2 --key=key --http-read-timeout 60 ## Running this using GitHub Actions [GitHub Actions](https://github.com/features/actions) is GitHub's workflow automation product. You can use it to run `airtable-export` in order to back up your Airtable data to a GitHub repository. Doing this gives you a visible commit history of changes you make to your Airtable data - like [this one](https://github.com/natbat/rockybeaches/commits/main/airtable). To run this for your own Airtable database you'll first need to add the following secrets to your GitHub repository:
AIRTABLE_BASE_ID
The base ID, a string beginning `app...`
AIRTABLE_KEY
Your Airtable API key
AIRTABLE_TABLES
A space separated list of the Airtable tables that you want to backup. If any of these contain spaces you will need to enclose them in single quotes, e.g. 'My table with spaces in the name' OtherTableWithNoSpaces
Once you have set those secrets, add the following as a file called `.github/workflows/backup-airtable.yml`: ```yaml name: Backup Airtable on: workflow_dispatch: schedule: - cron: '32 0 * * *' jobs: build: runs-on: ubuntu-latest steps: - name: Check out repo uses: actions/checkout@v2 - name: Set up Python uses: actions/setup-python@v2 with: python-version: 3.8 - uses: actions/cache@v2 name: Configure pip caching with: path: ~/.cache/pip key: ${{ runner.os }}-pip- restore-keys: | ${{ runner.os }}-pip- - name: Install airtable-export run: | pip install airtable-export - name: Backup Airtable to backups/ env: AIRTABLE_BASE_ID: ${{ secrets.AIRTABLE_BASE_ID }} AIRTABLE_KEY: ${{ secrets.AIRTABLE_KEY }} AIRTABLE_TABLES: ${{ secrets.AIRTABLE_TABLES }} run: |- airtable-export backups $AIRTABLE_BASE_ID $AIRTABLE_TABLES -v - name: Commit and push if it changed run: |- git config user.name ""Automated"" git config user.email ""actions@users.noreply.github.com"" git add -A timestamp=$(date -u) git commit -m ""Latest data: ${timestamp}"" || exit 0 git push ``` This will run once a day (at 32 minutes past midnight UTC) and will also run if you manually click the ""Run workflow"" button, see [GitHub Actions: Manual triggers with workflow_dispatch](https://github.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/). ## Development To contribute to this tool, first checkout the code. Then create a new virtual environment: cd airtable-export python -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and tests: pip install -e '.[test]' To run the tests: pytest ","

airtable-export

Export Airtable data to files on disk

Installation

Install this tool using pip:

$ pip install airtable-export

Usage

You will need to know the following information:

  • Your Airtable base ID - this is a string starting with app...
  • Your Airtable API key - this is a string starting with key...
  • The names of each of the tables that you wish to export

You can export all of your data to a folder called export/ by running the following:

airtable-export export base_id table1 table2 --key=key

This example would create two files: export/table1.yml and export/table2.yml.

Rather than passing the API key using the --key option you can set it as an environment variable called AIRTABLE_KEY.
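
For example (the key value below is a placeholder):

export AIRTABLE_KEY=keyXXXXXXXXXXXXXX
airtable-export export base_id table1 table2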

Export options

By default the tool exports your data as YAML.

You can also export as JSON or as newline delimited JSON using the --json or --ndjson options:

airtable-export export base_id table1 table2 --key=key --ndjson

You can pass multiple format options at once. This command will create a .json, .yml and .ndjson file for each exported table:

airtable-export export base_id table1 table2 \
    --key=key --ndjson --yaml --json

SQLite database export

You can export tables to a SQLite database file using the --sqlite database.db option:

airtable-export export base_id table1 table2 \
    --key=key --sqlite database.db

This can be combined with other format options. If you only specify --sqlite the export directory argument will be ignored.

The SQLite database will have a table created for each table you export. Those tables will have a primary key column called airtable_id.

If you run this command against an existing SQLite database records with matching primary keys will be over-written by new records from the export.

Request options

By default the tool uses python-httpx's default configurations.

You can override the user-agent using the --user-agent option:

airtable-export export base_id table1 table2 --key=key --user-agent ""Airtable Export Robot""

You can override the timeout during a network read operation using the --http-read-timeout option. If not set, this defaults to 5s.

airtable-export export base_id table1 table2 --key=key --http-read-timeout 60

Running this using GitHub Actions

GitHub Actions is GitHub's workflow automation product. You can use it to run airtable-export in order to back up your Airtable data to a GitHub repository. Doing this gives you a visible commit history of changes you make to your Airtable data - like this one.

To run this for your own Airtable database you'll first need to add the following secrets to your GitHub repository:

  • AIRTABLE_BASE_ID - the base ID, a string beginning app...
  • AIRTABLE_KEY - your Airtable API key
  • AIRTABLE_TABLES - a space-separated list of the Airtable tables that you want to back up. If any of these contain spaces you will need to enclose them in single quotes, e.g. 'My table with spaces in the name' OtherTableWithNoSpaces

Once you have set those secrets, add the following as a file called .github/workflows/backup-airtable.yml:

name: Backup Airtable

on:
  workflow_dispatch:
  schedule:
  - cron: '32 0 * * *'

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - name: Check out repo
      uses: actions/checkout@v2
    - name: Set up Python
      uses: actions/setup-python@v2
      with:
        python-version: 3.8
    - uses: actions/cache@v2
      name: Configure pip caching
      with:
        path: ~/.cache/pip
        key: ${{ runner.os }}-pip-
        restore-keys: |
          ${{ runner.os }}-pip-
    - name: Install airtable-export
      run: |
        pip install airtable-export
    - name: Backup Airtable to backups/
      env:
        AIRTABLE_BASE_ID: ${{ secrets.AIRTABLE_BASE_ID }}
        AIRTABLE_KEY: ${{ secrets.AIRTABLE_KEY }}
        AIRTABLE_TABLES: ${{ secrets.AIRTABLE_TABLES }}
      run: |-
        airtable-export backups $AIRTABLE_BASE_ID $AIRTABLE_TABLES -v
    - name: Commit and push if it changed
      run: |-
        git config user.name ""Automated""
        git config user.email ""actions@users.noreply.github.com""
        git add -A
        timestamp=$(date -u)
        git commit -m ""Latest data: ${timestamp}"" || exit 0
        git push

This will run once a day (at 32 minutes past midnight UTC) and will also run if you manually click the ""Run workflow"" button, see GitHub Actions: Manual triggers with workflow_dispatch.

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd airtable-export
python -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
",,,,,, 303218369,MDEwOlJlcG9zaXRvcnkzMDMyMTgzNjk=,evernote-to-sqlite,dogsheep/evernote-to-sqlite,0,53015001,https://github.com/dogsheep/evernote-to-sqlite,Tools for converting Evernote content to SQLite,0,2020-10-11T21:45:49Z,2021-08-26T19:01:54Z,2021-08-26T19:02:47Z,,51,24,24,Python,1,1,1,1,0,4,0,0,3,apache-2.0,"[""datasette-io"", ""datasette-tool"", ""dogsheep"", ""evernote"", ""sqlite""]",4,3,24,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,53015001,4,4,"# evernote-to-sqlite [![PyPI](https://img.shields.io/pypi/v/evernote-to-sqlite.svg)](https://pypi.org/project/evernote-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/dogsheep/evernote-to-sqlite?include_prereleases&label=changelog)](https://github.com/dogsheep/evernote-to-sqlite/releases) [![Tests](https://github.com/dogsheep/evernote-to-sqlite/workflows/Test/badge.svg)](https://github.com/dogsheep/evernote-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/dogsheep/evernote-to-sqlite/blob/master/LICENSE) Tools for converting Evernote content to SQLite. See [Building an Evernote to SQLite exporter](https://simonwillison.net/2020/Oct/16/building-evernote-sqlite-exporter/) for background on this project. ## Installation Install this tool using `pip`: $ pip install evernote-to-sqlite ## Usage Currently the only available command is `evernote-to-sqlite enex`, which converts Evernote's ENEX export files into a SQLite database. You can create [an ENEX export](https://help.evernote.com/hc/en-us/articles/209005557-Export-notes-and-notebooks-as-ENEX-or-HTML) in the Evernote desktop application by selecting some notes (or all of your notes) and using the `File -> Export Notes...` menu option. This used to be able to export everything in one go, but it looks like more recent Evernote versions only allow exporting up to fifty notes at a time, or let you export an entire notebook by right-clicking on the notebook and selecting ""Export notebook..."". You can convert that file to SQLite like so: $ evernote-to-sqlite enex evernote.db MyNotes.enex This will display a progress bar and create a SQLite database file called `evernote.db`. ### Limitations Unfortunately the ENEX export format does not include a unique identifier for each note. This means you cannot use this tool to re-import notes after they have been updated - you should consider this tool to be a one-time transformation of an ENEX file into an equivalent SQLite database. ENEX exports also do not include details of which notebook a note belongs to. ## Development To contribute to this tool, first checkout the code. Then create a new virtual environment: cd evernote-to-sqlite python -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and tests: pip install -e '.[test]' To run the tests: pytest ","

evernote-to-sqlite

Tools for converting Evernote content to SQLite. See Building an Evernote to SQLite exporter for background on this project.

Installation

Install this tool using pip:

$ pip install evernote-to-sqlite

Usage

Currently the only available command is evernote-to-sqlite enex, which converts Evernote's ENEX export files into a SQLite database.

You can create an ENEX export in the Evernote desktop application by selecting some notes (or all of your notes) and using the File -> Export Notes... menu option.

This used to be able to export everything in one go, but it looks like more recent Evernote versions only allow exporting up to fifty notes at a time, or let you export an entire notebook by right-clicking on the notebook and selecting ""Export notebook..."".

You can convert that file to SQLite like so:

$ evernote-to-sqlite enex evernote.db MyNotes.enex

This will display a progress bar and create a SQLite database file called evernote.db.

Limitations

Unfortunately the ENEX export format does not include a unique identifier for each note. This means you cannot use this tool to re-import notes after they have been updated - you should consider this tool to be a one-time transformation of an ENEX file into an equivalent SQLite database.

ENEX exports also do not include details of which notebook a note belongs to.

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd evernote-to-sqlite
python -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
",,,,,, 305199661,MDEwOlJlcG9zaXRvcnkzMDUxOTk2NjE=,sphinx-to-sqlite,simonw/sphinx-to-sqlite,0,9599,https://github.com/simonw/sphinx-to-sqlite,Create a SQLite database from Sphinx documentation,0,2020-10-18T21:26:55Z,2020-12-19T05:08:12Z,2020-10-22T04:55:45Z,,9,2,2,Python,1,1,1,1,0,0,0,0,2,apache-2.0,"[""sqlite"", ""sphinx"", ""datasette-io"", ""datasette-tool""]",0,2,2,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,0,2,"# sphinx-to-sqlite [![PyPI](https://img.shields.io/pypi/v/sphinx-to-sqlite.svg)](https://pypi.org/project/sphinx-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/simonw/sphinx-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/sphinx-to-sqlite/releases) [![Tests](https://github.com/simonw/sphinx-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/sphinx-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sphinx-to-sqlite/blob/master/LICENSE) Create a SQLite database from Sphinx documentation. ## Demo You can see the results of running this tool against the [Datasette documentation](https://docs.datasette.io/) at https://latest-docs.datasette.io/docs/sections ## Installation Install this tool using `pip`: $ pip install sphinx-to-sqlite ## Usage First run `sphinx-build` with the `-b xml` option to create XML files in your `_build/` directory. Then run: $ sphinx-to-sqlite docs.db path/to/_build To build the SQLite database. ## Development To contribute to this tool, first checkout the code. Then create a new virtual environment: cd sphinx-to-sqlite python -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and tests: pip install -e '.[test]' To run the tests: pytest ","

sphinx-to-sqlite

Create a SQLite database from Sphinx documentation.

Demo

You can see the results of running this tool against the Datasette documentation at https://latest-docs.datasette.io/docs/sections

Installation

Install this tool using pip:

$ pip install sphinx-to-sqlite

Usage

First run sphinx-build with the -b xml option to create XML files in your _build/ directory.
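
For example (paths are illustrative - point sphinx-build at your documentation source directory):

$ sphinx-build -b xml path/to/docs path/to/_build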

Then run:

$ sphinx-to-sqlite docs.db path/to/_build

To build the SQLite database.

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd sphinx-to-sqlite
python -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
",,,,,, 335372050,MDEwOlJlcG9zaXRvcnkzMzUzNzIwNTA=,download-tiles,simonw/download-tiles,0,9599,https://github.com/simonw/download-tiles,Download map tiles and store them in an MBTiles database,0,2021-02-02T17:37:49Z,2021-05-29T07:22:58Z,2021-02-16T04:19:59Z,https://datasette.io/tools/download-tiles,26,9,9,Python,1,1,1,1,0,0,0,0,0,apache-2.0,"[""openstreetmap"", ""mbtiles"", ""datasette-io"", ""datasette-tool""]",0,0,9,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,0,1,"# download-tiles [![PyPI](https://img.shields.io/pypi/v/download-tiles.svg)](https://pypi.org/project/download-tiles/) [![Changelog](https://img.shields.io/github/v/release/simonw/download-tiles?include_prereleases&label=changelog)](https://github.com/simonw/download-tiles/releases) [![Tests](https://github.com/simonw/download-tiles/workflows/Test/badge.svg)](https://github.com/simonw/download-tiles/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/download-tiles/blob/master/LICENSE) Download map tiles and store them in an MBTiles database ## Installation Install this tool using `pip`: $ pip install download-tiles ## Usage This tool downloads tiles from a specified [TMS (Tile Map Server)](https://wiki.openstreetmap.org/wiki/TMS) server for a specified bounding box and range of zoom levels and stores those tiles in a MBTiles SQLite database. It is a command-line wrapper around the [Landez](https://github.com/makinacorpus/landez) Python libary. **Please use this tool responsibly**. Consult the usage policies of the tile servers you are interacting with, for example the [OpenStreetMap Tile Usage Policy](https://operations.osmfoundation.org/policies/tiles/). Running the following will download zoom levels 0-3 of OpenStreetMap, 85 tiles total, and store them in a SQLite database called `world.mbtiles`: download-tiles world.mbtiles You can customize which tile and zoom levels are downloaded using command options: `--zoom-levels=0-3` or `-z=0-3` The different zoom levels to download. Specify a single number, e.g. `15`, or a range of numbers e.g. `0-4`. Be careful with this setting as you can easily go over the limits requested by the underlying tile server. `--bbox=3.9,-6.3,14.5,10.2` or `-b=3.9,-6.3,14.5,10.2` The bounding box to fetch. Should be specified as `min-lon,min-lat,max-lon,max-lat`. You can use [bboxfinder.com](http://bboxfinder.com/) to find these for different areas. `--city=london` or `--country=madagascar` These options can be used instead of `--bbox`. The city or country specified will be looked up using the [Nominatum API](https://nominatim.org/release-docs/latest/api/Search/) and used to derive a bounding box. `--show-bbox` Use this option to output the bounding box that was retrieved for the `--city` or `--country` without downloading any tiles. `--name=Name` A name for this tile collection, used for the `name` field in the `metadata` table. If not specified a UUID will be used, or if you used `--city` or `--country` the name will be set to the full name of that place. `--attribution=""Attribution string""` Attribution string to bake into the `metadata` table. This will default to `© OpenStreetMap contributors` unless you use `--tiles-url` to specify an alternative tile server, in which case you should specify a custom attribution string. You can use the `--attribution=osm` shortcut to specify the `© OpenStreetMap contributors` value without having to type it out in full. 
`--tiles-url=https://...` The tile server URL to use. This should include `{z}` and `{x}` and `{y}` specifiers, and can optionally include `{s}` for subdomains. The default URL used here is for OpenStreetMap, `http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png` `--tiles-subdomains=a,b,c` A comma-separated list of subdomains to use for the `{s}` parameter. `--verbose` Use this option to turn on verbose logging. `--cache-dir=/tmp/tiles` Provide a directory to cache downloaded tiles between runs. This can be useful if you are worried you might not have used the correct options for the bounding box or zoom levels. ## Development To contribute to this tool, first checkout the code. Then create a new virtual environment: cd download-tiles python -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and tests: pip install -e '.[test]' To run the tests: pytest ","

download-tiles

Download map tiles and store them in an MBTiles database

Installation

Install this tool using pip:

$ pip install download-tiles

Usage

This tool downloads tiles from a specified TMS (Tile Map Server) for a specified bounding box and range of zoom levels and stores those tiles in an MBTiles SQLite database. It is a command-line wrapper around the Landez Python library.

Please use this tool responsibly. Consult the usage policies of the tile servers you are interacting with, for example the OpenStreetMap Tile Usage Policy.

Running the following will download zoom levels 0-3 of OpenStreetMap, 85 tiles total, and store them in a SQLite database called world.mbtiles:

download-tiles world.mbtiles
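
That 85-tile figure follows from the tile pyramid arithmetic: zoom level z contains 4^z tiles, so levels 0 through 3 give 1 + 4 + 16 + 64 = 85.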

You can customize which tile and zoom levels are downloaded using command options:

--zoom-levels=0-3 or -z=0-3

The zoom levels to download. Specify a single number, e.g. 15, or a range of numbers, e.g. 0-4. Be careful with this setting, as you can easily exceed the usage limits of the underlying tile server.
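
For example, the following sketch (the output filename is arbitrary) should fetch only zoom levels 0 through 4 from the default OpenStreetMap server:

download-tiles low-zoom.mbtiles --zoom-levels=0-4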

--bbox=3.9,-6.3,14.5,10.2 or -b=3.9,-6.3,14.5,10.2

The bounding box to fetch. Should be specified as min-lon,min-lat,max-lon,max-lat. You can use bboxfinder.com to find these for different areas.
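
For example, the following sketch uses the bounding box shown above together with a conservative zoom range to keep the tile count small (the output filename is arbitrary):

download-tiles region.mbtiles --zoom-levels=0-6 --bbox=3.9,-6.3,14.5,10.2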

--city=london or --country=madagascar

These options can be used instead of --bbox. The city or country specified will be looked up using the Nominatim API and used to derive a bounding box.
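
For example, the following sketch should derive the bounding box for Madagascar automatically (the output filename is arbitrary):

download-tiles madagascar.mbtiles --country=madagascar --zoom-levels=0-6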

--show-bbox

Use this option to output the bounding box that was retrieved for the --city or --country without downloading any tiles.
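
For example, the following sketch should print the bounding box looked up for London; the database argument is still supplied, but no tiles should be downloaded:

download-tiles london.mbtiles --city=london --show-bbox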

--name=Name

A name for this tile collection, used for the name field in the metadata table. If not specified, a UUID will be used; if you used --city or --country, the name will be set to the full name of that place.

--attribution=""Attribution string""

Attribution string to bake into the metadata table. This will default to © OpenStreetMap contributors unless you use --tiles-url to specify an alternative tile server, in which case you should specify a custom attribution string.

You can use the --attribution=osm shortcut to specify the © OpenStreetMap contributors value without having to type it out in full.

--tiles-url=https://...

The tile server URL to use. This should include {z} and {x} and {y} specifiers, and can optionally include {s} for subdomains.

The default URL used here is for OpenStreetMap, http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png

--tiles-subdomains=a,b,c

A comma-separated list of subdomains to use for the {s} parameter.
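
As a combined sketch, the following points at a hypothetical self-hosted tile server (tiles.example.com is a placeholder, not a real service) with two subdomains and a custom attribution string:

download-tiles custom.mbtiles --zoom-levels=0-3 \
  --tiles-url=https://{s}.tiles.example.com/{z}/{x}/{y}.png \
  --tiles-subdomains=a,b \
  --attribution='Tiles by Example Maps'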

--verbose

Use this option to turn on verbose logging.

--cache-dir=/tmp/tiles

Provide a directory to cache downloaded tiles between runs. This can be useful if you are worried you might not have used the correct options for the bounding box or zoom levels.
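
For example, re-running the earlier world.mbtiles command with a cache directory should let a second run reuse tiles already downloaded to disk:

download-tiles world.mbtiles --cache-dir=/tmp/tiles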

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd download-tiles
python -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
",,,,,, 346597557,MDEwOlJlcG9zaXRvcnkzNDY1OTc1NTc=,tableau-to-sqlite,simonw/tableau-to-sqlite,0,9599,https://github.com/simonw/tableau-to-sqlite,Fetch data from Tableau into a SQLite database,0,2021-03-11T06:12:02Z,2021-06-10T04:40:44Z,2021-04-29T16:11:03Z,,212,8,8,Python,1,1,1,1,0,2,0,0,2,apache-2.0,"[""datasette-io"", ""datasette-tool""]",2,2,8,main,"{""admin"": false, ""push"": false, ""pull"": false}",,,2,1,"# tableau-to-sqlite [![PyPI](https://img.shields.io/pypi/v/tableau-to-sqlite.svg)](https://pypi.org/project/tableau-to-sqlite/) [![Changelog](https://img.shields.io/github/v/release/simonw/tableau-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/tableau-to-sqlite/releases) [![Tests](https://github.com/simonw/tableau-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/tableau-to-sqlite/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/tableau-to-sqlite/blob/master/LICENSE) Fetch data from Tableau into a SQLite database. A wrapper around [TableauScraper](https://github.com/bertrandmartel/tableau-scraping/). ## Installation Install this tool using `pip`: $ pip install tableau-to-sqlite ## Usage If you have the URL to a Tableau dashboard like this: https://results.mo.gov/t/COVID19/views/VaccinationsDashboard/Vaccinations You can pass that directly to the tool: tableau-to-sqlite tableau.db \ https://results.mo.gov/t/COVID19/views/VaccinationsDashboard/Vaccinations This will create a SQLite database called `tableau.db` containing one table for each of the worksheets in that dashboard. If the dashboard is hosted on https://public.tableau.com/ you can instead provide the view name. This will be two strings separated by a `/` symbol - something like this: OregonCOVID-19VaccineProviderEnrollment/COVID-19VaccineProviderEnrollment Now run the tool like this: tableau-to-sqlite tableau.db \ OregonCOVID-19VaccineProviderEnrollment/COVID-19VaccineProviderEnrollment ## Get the data as JSON or CSV If you're building a [git scraper](https://simonwillison.net/2020/Oct/9/git-scraping/) you may want to convert the data gathered by this tool to CSV or JSON to check into your repository. You can do that using [sqlite-utils](https://sqlite-utils.datasette.io/). Install it using `pip`: pip install sqlite-utils You can dump out a table as JSON like so: sqlite-utils rows tableau.db \ 'Admin Site and County Map Site No Info' > tableau.json Or as CSV like this: sqlite-utils rows tableau.db --csv \ 'Admin Site and County Map Site No Info' > tableau.csv ## Development To contribute to this tool, first checkout the code. Then create a new virtual environment: cd tableau-to-sqlite python -mvenv venv source venv/bin/activate Or if you are using `pipenv`: pipenv shell Now install the dependencies and tests: pip install -e '.[test]' To run the tests: pytest ","

tableau-to-sqlite

Fetch data from Tableau into a SQLite database. A wrapper around TableauScraper.

Installation

Install this tool using pip:

$ pip install tableau-to-sqlite

Usage

If you have the URL to a Tableau dashboard like this:

https://results.mo.gov/t/COVID19/views/VaccinationsDashboard/Vaccinations

You can pass that directly to the tool:

tableau-to-sqlite tableau.db \
  https://results.mo.gov/t/COVID19/views/VaccinationsDashboard/Vaccinations

This will create a SQLite database called tableau.db containing one table for each of the worksheets in that dashboard.

If the dashboard is hosted on https://public.tableau.com/ you can instead provide the view name. This will be two strings separated by a / symbol - something like this:

OregonCOVID-19VaccineProviderEnrollment/COVID-19VaccineProviderEnrollment

Now run the tool like this:

tableau-to-sqlite tableau.db \
    OregonCOVID-19VaccineProviderEnrollment/COVID-19VaccineProviderEnrollment

Get the data as JSON or CSV

If you're building a git scraper, you may want to convert the data gathered by this tool to CSV or JSON to check into your repository.

You can do that using sqlite-utils. Install it using pip:

pip install sqlite-utils

You can dump out a table as JSON like so:

sqlite-utils rows tableau.db \
   'Admin Site and County Map Site No Info' > tableau.json

Or as CSV like this:

sqlite-utils rows tableau.db --csv \
   'Admin Site and County Map Site No Info' > tableau.csv
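
The worksheet table names are not always obvious in advance, so one way to discover what to pass to sqlite-utils rows (assuming a recent sqlite-utils release) is to list the tables first:

sqlite-utils tables tableau.db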

Development

To contribute to this tool, first check out the code. Then create a new virtual environment:

cd tableau-to-sqlite
python -mvenv venv
source venv/bin/activate

Or if you are using pipenv:

pipenv shell

Now install the dependencies and test dependencies:

pip install -e '.[test]'

To run the tests:

pytest
",,,,,, 347263722,MDEwOlJlcG9zaXRvcnkzNDcyNjM3MjI=,django-sql-dashboard,simonw/django-sql-dashboard,0,9599,https://github.com/simonw/django-sql-dashboard,Django app for building dashboards using raw SQL queries,0,2021-03-13T03:38:23Z,2022-04-19T01:13:12Z,2022-04-20T00:27:39Z,https://django-sql-dashboard.datasette.io/,513,335,335,Python,1,1,1,1,0,28,0,0,25,apache-2.0,"[""dashboards"", ""datasette-io"", ""datasette-tool"", ""django"", ""sql""]",28,25,335,main,"{""admin"": false, ""maintain"": false, ""push"": false, ""triage"": false, ""pull"": false}",,,28,9,"# django-sql-dashboard [![PyPI](https://img.shields.io/pypi/v/django-sql-dashboard.svg)](https://pypi.org/project/django-sql-dashboard/) [![Changelog](https://img.shields.io/github/v/release/simonw/django-sql-dashboard?include_prereleases&label=changelog)](https://github.com/simonw/django-sql-dashboard/releases) [![Tests](https://github.com/simonw/django-sql-dashboard/workflows/Test/badge.svg)](https://github.com/simonw/django-sql-dashboard/actions?query=workflow%3ATest) [![Documentation Status](https://readthedocs.org/projects/django-sql-dashboard/badge/?version=latest)](http://django-sql-dashboard.datasette.io/en/latest/?badge=latest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/django-sql-dashboard/blob/main/LICENSE) Django SQL Dashboard provides an authenticated interface for executing read-only SQL queries directly against your PostgreSQL database, bringing a useful subset of [Datasette](https://datasette.io/) to Django. Applications include ad-hoc analysis and debugging, plus the creation of reporting dashboards that can be shared with team members or published online. See my blog for [more about this project](https://simonwillison.net/2021/May/10/django-sql-dashboard/), including [a video demo](https://www.youtube.com/watch?v=ausrmMZkPEY). 
Features include: - Safely run read-only one or more SQL queries against your database and view the results in your browser - Bookmark queries and share those links with other members of your team - Create [saved dashboards](https://django-sql-dashboard.datasette.io/en/latest/saved-dashboards.html) from your queries, with full control over who can view and edit them - [Named parameters](https://django-sql-dashboard.datasette.io/en/latest/sql.html#sql-parameters) such as `select * from entries where id = %(id)s` will be turned into form fields, allowing quick creation of interactive dashboards - Produce [bar charts](https://django-sql-dashboard.datasette.io/en/latest/widgets.html#bar-label-bar-quantity), [progress bars](https://django-sql-dashboard.datasette.io/en/latest/widgets.html#total-count-completed-count) and more from SQL queries, with the ability to easily create new [custom dashboard widgets](https://django-sql-dashboard.datasette.io/en/latest/widgets.html#custom-widgets) using the Django template system - Write SQL queries that safely construct and render [markdown](https://django-sql-dashboard.datasette.io/en/latest/widgets.html#markdown) and [HTML](https://django-sql-dashboard.datasette.io/en/latest/widgets.html#html) - Export the full results of a SQL query as a downloadable CSV or TSV file, using a combination of Django's [streaming HTTP response](https://docs.djangoproject.com/en/3.2/ref/request-response/#django.http.StreamingHttpResponse) mechanism and PostgreSQL [server-side cursors](https://www.psycopg.org/docs/usage.html#server-side-cursors) to efficiently stream large amounts of data without running out of resources - Copy and paste the results of SQL queries directly into tools such as Google Sheets or Excel - Uses Django's authentication system, so dashboard accounts can be granted using Django's Admin tools ## Documentation Full documentation is at [django-sql-dashboard.datasette.io](https://django-sql-dashboard.datasette.io/) ## Screenshot ## Alternatives - [django-sql-explorer](https://github.com/groveco/django-sql-explorer) provides a related set of functionality that also works against database backends other than PostgreSQL ","

django-sql-dashboard

Django SQL Dashboard provides an authenticated interface for executing read-only SQL queries directly against your PostgreSQL database, bringing a useful subset of Datasette to Django.

Applications include ad-hoc analysis and debugging, plus the creation of reporting dashboards that can be shared with team members or published online.

See my blog for more about this project, including a video demo.

Features include:

  • Safely run one or more read-only SQL queries against your database and view the results in your browser
  • Bookmark queries and share those links with other members of your team
  • Create saved dashboards from your queries, with full control over who can view and edit them
  • Named parameters such as select * from entries where id = %(id)s will be turned into form fields, allowing quick creation of interactive dashboards (see the example query after this list)
  • Produce bar charts, progress bars and more from SQL queries, with the ability to easily create new custom dashboard widgets using the Django template system
  • Write SQL queries that safely construct and render markdown and HTML
  • Export the full results of a SQL query as a downloadable CSV or TSV file, using a combination of Django's streaming HTTP response mechanism and PostgreSQL server-side cursors to efficiently stream large amounts of data without running out of resources
  • Copy and paste the results of SQL queries directly into tools such as Google Sheets or Excel
  • Uses Django's authentication system, so dashboard accounts can be granted using Django's Admin tools
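
As an example of the named parameter syntax mentioned in the list above (entries, created and author are hypothetical table and column names), a query like this should produce two form fields, start and author:

select * from entries where created >= %(start)s and author = %(author)s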

Documentation

Full documentation is at django-sql-dashboard.datasette.io

Screenshot

Alternatives

  • django-sql-explorer provides a related set of functionality that also works against database backends other than PostgreSQL
",1,public,0,,,