name,summary,classifiers,description,author,author_email,description_content_type,home_page,keywords,license,maintainer,maintainer_email,package_url,platform,project_url,project_urls,release_url,requires_dist,requires_python,version,yanked,yanked_reason
csv-diff,Python CLI tool and library for diffing CSV and JSON files,"[""Development Status :: 4 - Beta"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7""]","# csv-diff

[![PyPI](https://img.shields.io/pypi/v/csv-diff.svg)](https://pypi.org/project/csv-diff/)
[![Changelog](https://img.shields.io/github/v/release/simonw/csv-diff?include_prereleases&label=changelog)](https://github.com/simonw/csv-diff/releases)
[![Tests](https://github.com/simonw/csv-diff/workflows/Test/badge.svg)](https://github.com/simonw/csv-diff/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/csv-diff/blob/main/LICENSE)

Tool for viewing the difference between two CSV, TSV or JSON files. See [Generating a commit log for San Francisco’s official list of trees](https://simonwillison.net/2019/Mar/13/tree-history/) (and the [sf-tree-history repo commit log](https://github.com/simonw/sf-tree-history/commits)) for background information on this project.

## Installation

    pip install csv-diff

## Usage

Consider two CSV files:

`one.csv`

    id,name,age
    1,Cleo,4
    2,Pancakes,2

`two.csv`

    id,name,age
    1,Cleo,5
    3,Bailey,1

`csv-diff` can show a human-readable summary of differences between the files:

    $ csv-diff one.csv two.csv --key=id
    1 row changed, 1 row added, 1 row removed

    1 row changed

      Row 1
        age: ""4"" => ""5""

    1 row added

      id: 3
      name: Bailey
      age: 1

    1 row removed

      id: 2
      name: Pancakes
      age: 2

The `--key=id` option means that the `id` column should be treated as the unique key, to identify which records have changed.
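
To illustrate what treating `id` as the unique key means, here is a minimal standard-library sketch of a keyed comparison. It is not the implementation used by `csv-diff`; the helper names `index_by_key` and `keyed_diff` are hypothetical, and the inline data mirrors the two example files above.

```python
import csv
import io

ONE = 'id,name,age\n1,Cleo,4\n2,Pancakes,2\n'
TWO = 'id,name,age\n1,Cleo,5\n3,Bailey,1\n'


def index_by_key(text, key):
    # Map each row's key value to the full row (column name -> string value).
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def keyed_diff(old, new):
    # Rows are matched by key, so their order in the files does not matter.
    added = [new[k] for k in new if k not in old]
    removed = [old[k] for k in old if k not in new]
    changed = {}
    for k in old:
        if k not in new:
            continue
        # Record [before, after] for each shared column whose value differs.
        cols = {
            col: [old[k][col], new[k][col]]
            for col in old[k]
            if col in new[k] and old[k][col] != new[k][col]
        }
        if cols:
            changed[k] = cols
    return added, removed, changed


added, removed, changed = keyed_diff(index_by_key(ONE, 'id'), index_by_key(TWO, 'id'))
print(changed)  # {'1': {'age': ['4', '5']}}
```

Without a key, a row-by-row comparison would report every row after an insertion or deletion as changed, which is why the key column matters.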

The tool will automatically detect whether your files are comma- or tab-separated. You can override this automatic detection and force a specific format using `--format=tsv` or `--format=csv`.
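
One way to perform this kind of detection in Python is the standard library `csv.Sniffer`. The sketch below is an assumption about how delimiter detection can be done, not a description of `csv-diff` internals; `detect_delimiter` is a hypothetical helper.

```python
import csv


def detect_delimiter(sample):
    # Restrict the sniffer to the two candidate delimiters: comma and tab.
    return csv.Sniffer().sniff(sample, delimiters=',\t').delimiter


tsv_sample = 'id\tname\tage\n1\tCleo\t4\n2\tPancakes\t2\n'
csv_sample = 'id,name,age\n1,Cleo,4\n2,Pancakes,2\n'
print(repr(detect_delimiter(tsv_sample)))
print(repr(detect_delimiter(csv_sample)))
```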

You can also feed it JSON files, provided they are a JSON array of objects where each object has the same keys. Use `--format=json` if your input files are JSON.

Use `--show-unchanged` to include full details of the unchanged values for rows with at least one change in the diff output:

    $ csv-diff one.csv two.csv --key=id --show-unchanged
    1 row changed

      id: 1
        age: ""4"" => ""5""

        Unchanged:
          name: ""Cleo""

You can use the `--json` option to get a machine-readable difference:

    $ csv-diff one.csv two.csv --key=id --json
    {
        ""added"": [
            {
                ""id"": ""3"",
                ""name"": ""Bailey"",
                ""age"": ""1""
            }
        ],
        ""removed"": [
            {
                ""id"": ""2"",
                ""name"": ""Pancakes"",
                ""age"": ""2""
            }
        ],
        ""changed"": [
            {
                ""key"": ""1"",
                ""changes"": {
                    ""age"": [
                        ""4"",
                        ""5""
                    ]
                }
            }
        ],
        ""columns_added"": [],
        ""columns_removed"": []
    }

## As a Python library

You can also import the Python library into your own code like so:

    from csv_diff import load_csv, compare
    diff = compare(
        load_csv(open(""one.csv""), key=""id""),
        load_csv(open(""two.csv""), key=""id"")
    )

`diff` will now contain the same data structure as the output in the `--json` example above.

If the columns in the CSV have changed, those added or removed columns will be ignored when calculating changes made to specific rows.
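
As a sketch of working with that structure, the snippet below rebuilds the one-line summary from a dictionary shaped like the `--json` output. The dictionary is written out by hand here, and `summarize` is a hypothetical helper rather than part of the `csv_diff` API.

```python
def summarize(diff):
    # Build phrases such as '1 row changed' for each non-empty category.
    parts = []
    for key in ('changed', 'added', 'removed'):
        count = len(diff[key])
        if count:
            noun = 'row' if count == 1 else 'rows'
            parts.append('{} {} {}'.format(count, noun, key))
    return ', '.join(parts)


diff = {
    'added': [{'id': '3', 'name': 'Bailey', 'age': '1'}],
    'removed': [{'id': '2', 'name': 'Pancakes', 'age': '2'}],
    'changed': [{'key': '1', 'changes': {'age': ['4', '5']}}],
    'columns_added': [],
    'columns_removed': [],
}
print(summarize(diff))  # 1 row changed, 1 row added, 1 row removed
```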


",Simon Willison,,text/markdown,https://github.com/simonw/csv-diff,,"Apache License, Version 2.0",,,https://pypi.org/project/csv-diff/,,https://pypi.org/project/csv-diff/,"{""Homepage"": ""https://github.com/simonw/csv-diff""}",https://pypi.org/project/csv-diff/1.1/,"[""click"", ""dictdiffer"", ""pytest ; extra == 'test'""]",,1.1,0,
csvs-to-sqlite,Convert CSV files into a SQLite database,"[""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Programming Language :: Python :: 3.8"", ""Programming Language :: Python :: 3.9"", ""Topic :: Database""]","# csvs-to-sqlite

[![PyPI](https://img.shields.io/pypi/v/csvs-to-sqlite.svg)](https://pypi.org/project/csvs-to-sqlite/)
[![Changelog](https://img.shields.io/github/v/release/simonw/csvs-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/csvs-to-sqlite/releases)
[![Tests](https://github.com/simonw/csvs-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/csvs-to-sqlite/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/csvs-to-sqlite/blob/main/LICENSE)

Convert CSV files into a SQLite database. Browse and publish that SQLite database with [Datasette](https://github.com/simonw/datasette).

Basic usage:

    csvs-to-sqlite myfile.csv mydatabase.db

This will create a new SQLite database called `mydatabase.db` containing a
single table, `myfile`, containing the CSV content.

You can provide multiple CSV files:

    csvs-to-sqlite one.csv two.csv bundle.db

The `bundle.db` database will contain two tables, `one` and `two`.

This means you can use wildcards:

    csvs-to-sqlite ~/Downloads/*.csv my-downloads.db

If you pass a path to one or more directories, the script will recursively
search those directories for CSV files and create tables for each one.

    csvs-to-sqlite ~/path/to/directory all-my-csvs.db

## Handling TSV (tab-separated values)

You can use the `-s` option to specify a different delimiter. If you want
to use a tab character you'll need to apply shell escaping like so:

    csvs-to-sqlite my-file.tsv my-file.db -s $'\t'

## Refactoring columns into separate lookup tables

Let's say you have a CSV file that looks like this:

    county,precinct,office,district,party,candidate,votes
    Clark,1,President,,REP,John R. Kasich,5
    Clark,2,President,,REP,John R. Kasich,0
    Clark,3,President,,REP,John R. Kasich,7

([Real example taken from the Open Elections project](https://github.com/openelections/openelections-data-sd/blob/master/2016/20160607__sd__primary__clark__precinct.csv))

You can now convert selected columns into separate lookup tables using the new
`--extract-column` option (shortname: `-c`) - for example:

    csvs-to-sqlite openelections-data-*/*.csv \
        -c county:County:name \
        -c precinct:Precinct:name \
        -c office -c district -c party -c candidate \
        openelections.db

The format is as follows:

    column_name:optional_table_name:optional_table_value_column_name

If you just specify the column name e.g. `-c office`, the following table will
be created:

    CREATE TABLE ""office"" (
        ""id"" INTEGER PRIMARY KEY,
        ""value"" TEXT
    );

If you specify all three options, e.g. `-c precinct:Precinct:name` the table
will look like this:

    CREATE TABLE ""Precinct"" (
        ""id"" INTEGER PRIMARY KEY,
        ""name"" TEXT
    );

The original tables will be created like this:

    CREATE TABLE ""ca__primary__san_francisco__precinct"" (
        ""county"" INTEGER,
        ""precinct"" INTEGER,
        ""office"" INTEGER,
        ""district"" INTEGER,
        ""party"" INTEGER,
        ""candidate"" INTEGER,
        ""votes"" INTEGER,
        FOREIGN KEY (county) REFERENCES County(id),
        FOREIGN KEY (party) REFERENCES party(id),
        FOREIGN KEY (precinct) REFERENCES Precinct(id),
        FOREIGN KEY (office) REFERENCES office(id),
        FOREIGN KEY (candidate) REFERENCES candidate(id)
    );

They will be populated with IDs that reference the new derived tables.
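
The idea behind a lookup table can be sketched with the standard library `sqlite3` module. This is an illustration of the technique only, not the code `csvs-to-sqlite` runs; the `results` table name, the reduced column set and the `UNIQUE` constraint are assumptions made for the example.

```python
import sqlite3

rows = [
    ('Clark', 1, 'President', 'REP', 5),
    ('Clark', 2, 'President', 'REP', 0),
    ('Clark', 3, 'President', 'REP', 7),
]

db = sqlite3.connect(':memory:')
# Lookup table: one row per distinct value of the extracted column.
db.execute('CREATE TABLE office (id INTEGER PRIMARY KEY, value TEXT UNIQUE)')
db.execute(
    'CREATE TABLE results (county TEXT, precinct INTEGER, office INTEGER, '
    'party TEXT, votes INTEGER, FOREIGN KEY (office) REFERENCES office(id))'
)

for county, precinct, office, party, votes in rows:
    # Insert the distinct value once, then look up its integer id.
    db.execute('INSERT OR IGNORE INTO office (value) VALUES (?)', (office,))
    office_id = db.execute(
        'SELECT id FROM office WHERE value = ?', (office,)
    ).fetchone()[0]
    # The main table stores the integer id instead of the original string.
    db.execute(
        'INSERT INTO results VALUES (?, ?, ?, ?, ?)',
        (county, precinct, office_id, party, votes),
    )

print(db.execute('SELECT * FROM office').fetchall())  # [(1, 'President')]
```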

## Installation

    $ pip install csvs-to-sqlite

`csvs-to-sqlite` now requires Python 3. If you are running Python 2 you can install the last version to support Python 2:

    $ pip install csvs-to-sqlite==0.9.2

## csvs-to-sqlite --help

<!-- [[[cog
import cog
from csvs_to_sqlite import cli
from click.testing import CliRunner
runner = CliRunner()
result = runner.invoke(cli.cli, [""--help""])
help = result.output.replace(""Usage: cli"", ""Usage: csvs-to-sqlite"")
cog.out(
    ""```\n{}\n```"".format(help)
)
]]] -->
```
Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME

  PATHS: paths to individual .csv files or to directories containing .csvs

  DBNAME: name of the SQLite database file to create

Options:
  -s, --separator TEXT            Field separator in input .csv
  -q, --quoting INTEGER           Control field quoting behavior per csv.QUOTE_*
                                  constants. Use one of QUOTE_MINIMAL (0),
                                  QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or
                                  QUOTE_NONE (3).

  --skip-errors                   Skip lines with too many fields instead of
                                  stopping the import

  --replace-tables                Replace tables if they already exist
  -t, --table TEXT                Table to use (instead of using CSV filename)
  -c, --extract-column TEXT       One or more columns to 'extract' into a
                                  separate lookup table. If you pass a simple
                                  column name that column will be replaced with
                                  integer foreign key references to a new table
                                  of that name. You can customize the name of
                                  the table like so:     state:States:state_name
                                  
                                  This will pull unique values from the 'state'
                                  column and use them to populate a new 'States'
                                  table, with an id column primary key and a
                                  state_name column containing the strings from
                                  the original column.

  -d, --date TEXT                 One or more columns to parse into ISO
                                  formatted dates

  -dt, --datetime TEXT            One or more columns to parse into ISO
                                  formatted datetimes

  -df, --datetime-format TEXT     One or more custom date format strings to try
                                  when parsing dates/datetimes

  -pk, --primary-key TEXT         One or more columns to use as the primary key
  -f, --fts TEXT                  One or more columns to use to populate a full-
                                  text index

  -i, --index TEXT                Add index on this column (or a compound index
                                  with -i col1,col2)

  --shape TEXT                    Custom shape for the DB table - format is
                                  csvcol:dbcol(TYPE),...

  --filename-column TEXT          Add a column with this name and populate with
                                  CSV file name

  --fixed-column <TEXT TEXT>...   Populate column with a fixed string
  --fixed-column-int <TEXT INTEGER>...
                                  Populate column with a fixed integer
  --fixed-column-float <TEXT FLOAT>...
                                  Populate column with a fixed float
  --no-index-fks                  Skip adding index to foreign key columns
                                  created using --extract-column (default is to
                                  add them)

  --no-fulltext-fks               Skip adding full-text index on values
                                  extracted using --extract-column (default is
                                  to add them)

  --just-strings                  Import all columns as text strings by default
                                  (and, if specified, still obey --shape,
                                  --date/datetime, and --datetime-format)

  --version                       Show the version and exit.
  --help                          Show this message and exit.

```
<!-- [[[end]]] -->


",Simon Willison,,text/markdown,https://github.com/simonw/csvs-to-sqlite,,"Apache License, Version 2.0",,,https://pypi.org/project/csvs-to-sqlite/,,https://pypi.org/project/csvs-to-sqlite/,"{""Homepage"": ""https://github.com/simonw/csvs-to-sqlite""}",https://pypi.org/project/csvs-to-sqlite/1.3/,"[""click (~=7.0)"", ""dateparser (>=1.0)"", ""pandas (>=1.0)"", ""py-lru-cache (~=0.1.4)"", ""six"", ""pytest ; extra == 'test'"", ""cogapp ; extra == 'test'""]",,1.3,0,
db-to-sqlite,CLI tool for exporting tables or queries from any SQL database to a SQLite file,"[""Development Status :: 3 - Alpha"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Topic :: Database""]","# db-to-sqlite

[![PyPI](https://img.shields.io/pypi/v/db-to-sqlite.svg)](https://pypi.python.org/pypi/db-to-sqlite)
[![Changelog](https://img.shields.io/github/v/release/simonw/db-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/db-to-sqlite/releases)
[![Tests](https://github.com/simonw/db-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/db-to-sqlite/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/db-to-sqlite/blob/main/LICENSE)

CLI tool for exporting tables or queries from any SQL database to a SQLite file.

## Installation

Install from PyPI like so:

    pip install db-to-sqlite

If you want to use it with MySQL, you can install the extra dependency like this:

    pip install 'db-to-sqlite[mysql]'

Installing the `mysqlclient` library on OS X can be tricky - I've found [this recipe](https://gist.github.com/simonw/90ac0afd204cd0d6d9c3135c3888d116) to work (run that before installing `db-to-sqlite`).

For PostgreSQL, use this:

    pip install 'db-to-sqlite[postgresql]'

## Usage

    Usage: db-to-sqlite [OPTIONS] CONNECTION PATH

      Load data from any database into SQLite.

      PATH is a path to the SQLite file to create, e.g. /tmp/my_database.db

      CONNECTION is a SQLAlchemy connection string, for example:

          postgresql://localhost/my_database
          postgresql://username:passwd@localhost/my_database

          mysql://root@localhost/my_database
          mysql://username:passwd@localhost/my_database

      More: https://docs.sqlalchemy.org/en/13/core/engines.html#database-urls

    Options:
      --version                     Show the version and exit.
      --all                         Detect and copy all tables
      --table TEXT                  Specific tables to copy
      --skip TEXT                   When using --all skip these tables
      --redact TEXT...              (table, column) pairs to redact with ***
      --sql TEXT                    Optional SQL query to run
      --output TEXT                 Table in which to save --sql query results
      --pk TEXT                     Optional column to use as a primary key
      --index-fks / --no-index-fks  Should foreign keys have indexes? Default on
      -p, --progress                Show progress bar
      --postgres-schema TEXT        PostgreSQL schema to use
      --help                        Show this message and exit.

For example, to save the content of the `blog_entry` table from a PostgreSQL database to a local file called `blog.db` you could do this:

    db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
        --table=blog_entry

You can specify `--table` more than once.

You can also save the data from all of your tables, effectively creating a SQLite copy of your entire database. Any foreign key relationships will be detected and added to the SQLite database. For example:

    db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
        --all

When running `--all` you can specify tables to skip using `--skip`:

    db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
        --all \
        --skip=django_migrations

If you want to save the results of a custom SQL query, do this:

    db-to-sqlite ""postgresql://localhost/myblog"" output.db \
        --output=query_results \
        --sql=""select id, title, created from blog_entry"" \
        --pk=id

The `--output` option specifies the table that should contain the results of the query.

## Using db-to-sqlite with PostgreSQL schemas

If the tables you want to copy from your PostgreSQL database aren't in the default schema, you can specify an alternate one with the `--postgres-schema` option:

    db-to-sqlite ""postgresql://localhost/myblog"" blog.db \
        --all \
        --postgres-schema my_schema

## Using db-to-sqlite with Heroku Postgres

If you run an application on [Heroku](https://www.heroku.com/) using their [Postgres database product](https://www.heroku.com/postgres), you can use the `heroku config` command to access a compatible connection string:

    $ heroku config --app myappname | grep HEROKU_POSTG
    HEROKU_POSTGRESQL_OLIVE_URL: postgres://username:password@ec2-xxx-xxx-xxx-x.compute-1.amazonaws.com:5432/dbname

You can pass this to `db-to-sqlite` to create a local SQLite database with the data from your Heroku instance.

You can even do this using a bash one-liner:

    $ db-to-sqlite $(heroku config --app myappname | grep HEROKU_POSTG | cut -d: -f 2-) \
        /tmp/heroku.db --all -p
    1/23: django_migrations
    ...
    17/23: blog_blogmark
    [####################################]  100%
    ...
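
The `cut -d: -f 2-` step keeps everything after the first colon of the `heroku config` line. A rough Python equivalent is sketched below; `connection_from_config_line` is a hypothetical helper, and the host in the sample line is a shortened placeholder.

```python
def connection_from_config_line(line):
    # Split on the first colon only, because the URL itself contains colons.
    _name, value = line.split(':', 1)
    return value.strip()


line = 'HEROKU_POSTGRESQL_OLIVE_URL: postgres://username:password@host:5432/dbname'
print(connection_from_config_line(line))
```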

## Related projects

* [Datasette](https://github.com/simonw/datasette): A tool for exploring and publishing data. Works great with SQLite files generated using `db-to-sqlite`.
* [sqlite-utils](https://github.com/simonw/sqlite-utils): Python CLI utility and library for manipulating SQLite databases.
* [csvs-to-sqlite](https://github.com/simonw/csvs-to-sqlite): Convert CSV files into a SQLite database.

## Development

To set up this tool locally, first check out the code. Then create a new virtual environment:

    cd db-to-sqlite
    python3 -mvenv venv
    source venv/bin/activate

Or if you are using `pipenv`:

    pipenv shell

Now install the dependencies and test dependencies:

    pip install -e '.[test]'

To run the tests:

    pytest

This will skip tests against MySQL or PostgreSQL if you do not have their additional dependencies installed.

You can install those extra dependencies like so:

    pip install -e '.[test_mysql,test_postgresql]'

You can alternatively use `pip install psycopg2-binary` if you cannot install the `psycopg2` dependency used by the `test_postgresql` extra.

See [Running a MySQL server using Homebrew](https://til.simonwillison.net/homebrew/mysql-homebrew) for tips on running the tests against MySQL on macOS, including how to install the `mysqlclient` dependency.

The PostgreSQL and MySQL tests default to expecting to run against servers on localhost. You can use environment variables to point them at different test database servers:

- `MYSQL_TEST_DB_CONNECTION` - defaults to `mysql://root@localhost/test_db_to_sqlite`
- `POSTGRESQL_TEST_DB_CONNECTION` - defaults to `postgresql://localhost/test_db_to_sqlite`

The database you indicate in the environment variable - `test_db_to_sqlite` by default - will be deleted and recreated on every test run.


",Simon Willison,,text/markdown,https://github.com/simonw/db-to-sqlite,,"Apache License, Version 2.0",,,https://pypi.org/project/db-to-sqlite/,,https://pypi.org/project/db-to-sqlite/,"{""CI"": ""https://travis-ci.com/simonw/db-to-sqlite"", ""Changelog"": ""https://github.com/simonw/db-to-sqlite/releases"", ""Documentation"": ""https://github.com/simonw/db-to-sqlite/blob/main/README.md"", ""Homepage"": ""https://github.com/simonw/db-to-sqlite"", ""Issues"": ""https://github.com/simonw/db-to-sqlite/issues"", ""Source code"": ""https://github.com/simonw/db-to-sqlite""}",https://pypi.org/project/db-to-sqlite/1.4/,"[""sqlalchemy"", ""sqlite-utils (>=2.9.1)"", ""click"", ""mysqlclient ; extra == 'mysql'"", ""psycopg2 ; extra == 'postgresql'"", ""pytest ; extra == 'test'"", ""pytest ; extra == 'test_mysql'"", ""mysqlclient ; extra == 'test_mysql'"", ""pytest ; extra == 'test_postgresql'"", ""psycopg2 ; extra == 'test_postgresql'""]",,1.4,0,
dbf-to-sqlite,"CLI tool for converting DBF files (dBase, FoxPro etc) to SQLite","[""Development Status :: 3 - Alpha"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Topic :: Database""]","# dbf-to-sqlite

[![PyPI](https://img.shields.io/pypi/v/dbf-to-sqlite.svg)](https://pypi.python.org/pypi/dbf-to-sqlite)
[![Travis CI](https://travis-ci.com/simonw/dbf-to-sqlite.svg?branch=master)](https://travis-ci.com/simonw/dbf-to-sqlite)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/dbf-to-sqlite/blob/master/LICENSE)


CLI tool for converting DBF files (dBase, FoxPro etc) to SQLite.

    $ dbf-to-sqlite --help
    Usage: dbf-to-sqlite [OPTIONS] DBF_PATHS... SQLITE_DB

      Convert DBF files (dBase, FoxPro etc) to SQLite

      https://github.com/simonw/dbf-to-sqlite

    Options:
      --version      Show the version and exit.
      --table TEXT   Table name to use (only valid for single files)
      -v, --verbose  Show what's going on
      --help         Show this message and exit.

Example usage:

    $ dbf-to-sqlite *.DBF database.db

This will create a new SQLite database called `database.db` containing one table for each of the `DBF` files in the current directory.

Looking for DBF files to try this out on? Try downloading the [Himalayan Database](http://himalayandatabase.com/) of all expeditions that have climbed in the Nepal Himalaya.


",Simon Willison,,text/markdown,https://github.com/simonw/dbf-to-sqlite,,"Apache License, Version 2.0",,,https://pypi.org/project/dbf-to-sqlite/,,https://pypi.org/project/dbf-to-sqlite/,"{""Homepage"": ""https://github.com/simonw/dbf-to-sqlite""}",https://pypi.org/project/dbf-to-sqlite/0.1/,"[""dbf (==0.97.11)"", ""click"", ""sqlite-utils""]",,0.1,0,
markdown-to-sqlite,CLI tool for loading markdown files into a SQLite database,"[""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Topic :: Database""]","# markdown-to-sqlite

[![PyPI](https://img.shields.io/pypi/v/markdown-to-sqlite.svg)](https://pypi.python.org/pypi/markdown-to-sqlite)
[![Changelog](https://img.shields.io/github/v/release/simonw/markdown-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/markdown-to-sqlite/releases)
[![Tests](https://github.com/simonw/markdown-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/markdown-to-sqlite/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/markdown-to-sqlite/blob/main/LICENSE)

CLI tool for loading markdown files into a SQLite database.

YAML embedded in the markdown files will be used to populate additional columns.

    Usage: markdown-to-sqlite [OPTIONS] DBNAME TABLE PATHS...

For example:

    $ markdown-to-sqlite docs.db documents file1.md file2.md

## Breaking change

Prior to version 1.0 this argument order was different - markdown files were listed before the database and table.


",Simon Willison,,text/markdown,https://github.com/simonw/markdown-to-sqlite,,"Apache License, Version 2.0",,,https://pypi.org/project/markdown-to-sqlite/,,https://pypi.org/project/markdown-to-sqlite/,"{""CI"": ""https://github.com/simonw/markdown-to-sqlite/actions"", ""Changelog"": ""https://github.com/simonw/markdown-to-sqlite/releases"", ""Homepage"": ""https://github.com/simonw/markdown-to-sqlite"", ""Issues"": ""https://github.com/simonw/markdown-to-sqlite/issues""}",https://pypi.org/project/markdown-to-sqlite/1.0/,"[""yamldown"", ""markdown"", ""sqlite-utils"", ""click"", ""pytest ; extra == 'test'""]",>=3.6,1.0,0,
sqlite-diffable,Tools for dumping/loading a SQLite database to diffable directory structure,"[""Development Status :: 3 - Alpha"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Topic :: Database""]","# sqlite-diffable

[![PyPI](https://img.shields.io/pypi/v/sqlite-diffable.svg)](https://pypi.org/project/sqlite-diffable/)
[![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-diffable?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-diffable/releases)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-diffable/blob/main/LICENSE)

Tools for dumping/loading a SQLite database to a diffable directory structure.

## Installation

    pip install sqlite-diffable

## Demo

The repository at [simonw/simonwillisonblog-backup](https://github.com/simonw/simonwillisonblog-backup) contains a backup of the database on my blog, https://simonwillison.net/ - created using this tool.

## Dumping a database

Given a SQLite database called `fixtures.db` containing a table `facetable`, the following will dump out that table to the `dump/` directory:

    sqlite-diffable dump fixtures.db dump/ facetable

To dump out every table in that database, use `--all`:

    sqlite-diffable dump fixtures.db dump/ --all

## Loading a database

To load a previously dumped database, run the following:

    sqlite-diffable load restored.db dump/

This will show an error if any of the tables that are being restored already exist in the database file.

You can replace those tables (dropping them before restoring them) using the `--replace` option:

    sqlite-diffable load restored.db dump/ --replace

## Converting to JSON objects

Table rows are stored in the `.ndjson` files as newline-delimited JSON arrays, like this:

```
[""a"", ""a"", ""a-a"", 63, null, 0.7364712141640124, ""$null""]
[""a"", ""b"", ""a-b"", 51, null, 0.6020187290499803, ""$null""]
```

Sometimes it can be more convenient to work with a list of JSON objects.

The `sqlite-diffable objects` command can read a `.ndjson` file and its accompanying `.metadata.json` file and output JSON objects to standard output:

    sqlite-diffable objects fixtures.db dump/sortable.ndjson

The output of that command looks something like this:
```
{""pk1"": ""a"", ""pk2"": ""a"", ""content"": ""a-a"", ""sortable"": 63, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.7364712141640124, ""text"": ""$null""}
{""pk1"": ""a"", ""pk2"": ""b"", ""content"": ""a-b"", ""sortable"": 51, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.6020187290499803, ""text"": ""$null""}
```

Add `-o` to write that output to a file:

    sqlite-diffable objects fixtures.db dump/sortable.ndjson -o output.txt

Add `--array` to output a JSON array of objects, as opposed to a newline-delimited file:

    sqlite-diffable objects fixtures.db dump/sortable.ndjson --array
Output:
```
[
{""pk1"": ""a"", ""pk2"": ""a"", ""content"": ""a-a"", ""sortable"": 63, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.7364712141640124, ""text"": ""$null""},
{""pk1"": ""a"", ""pk2"": ""b"", ""content"": ""a-b"", ""sortable"": 51, ""sortable_with_nulls"": null, ""sortable_with_nulls_2"": 0.6020187290499803, ""text"": ""$null""}
]
```

## Storage format

Each table is represented as two files. The first, `table_name.metadata.json`, contains metadata describing the structure of the table. For a table called `redirects_redirect` that file might look like this:

```json
{
    ""name"": ""redirects_redirect"",
    ""columns"": [
        ""id"",
        ""domain"",
        ""path"",
        ""target"",
        ""created""
    ],
    ""schema"": ""CREATE TABLE [redirects_redirect] (\n   [id] INTEGER PRIMARY KEY,\n   [domain] TEXT,\n   [path] TEXT,\n   [target] TEXT,\n   [created] TEXT\n)""
}
```

It is an object with three keys: `name` is the name of the table, `columns` is an array of column names and `schema` is the SQL schema text used for the table.

The second file, `table_name.ndjson`, contains [newline-delimited JSON](http://ndjson.org/) for every row in the table. Each row is represented as a JSON array with items corresponding to each of the columns defined in the metadata.

For the `redirects_redirect` table, that file (`redirects_redirect.ndjson`) might look like this:

```
[1, ""feeds.simonwillison.net"", ""swn-everything"", ""https://simonwillison.net/atom/everything/"", ""2017-10-01T21:11:36.440537+00:00""]
[2, ""feeds.simonwillison.net"", ""swn-entries"", ""https://simonwillison.net/atom/entries/"", ""2017-10-01T21:12:32.478849+00:00""]
[3, ""feeds.simonwillison.net"", ""swn-links"", ""https://simonwillison.net/atom/links/"", ""2017-10-01T21:12:54.820729+00:00""]
```
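
Reading this format back needs only the two files. A minimal sketch, using inline sample data in place of the files on disk; `rows_as_objects` is a hypothetical helper and not part of `sqlite-diffable`:

```python
import json

# Stand-ins for the contents of table_name.metadata.json and table_name.ndjson.
metadata = {
    'name': 'redirects_redirect',
    'columns': ['id', 'domain', 'path', 'target', 'created'],
}
ndjson_text = '\n'.join(
    json.dumps(row)
    for row in [
        [1, 'feeds.simonwillison.net', 'swn-everything',
         'https://simonwillison.net/atom/everything/', '2017-10-01T21:11:36.440537+00:00'],
        [2, 'feeds.simonwillison.net', 'swn-entries',
         'https://simonwillison.net/atom/entries/', '2017-10-01T21:12:32.478849+00:00'],
    ]
)


def rows_as_objects(columns, text):
    # Each non-empty line is one JSON array; pair its items with the column names.
    for line in text.splitlines():
        if line.strip():
            yield dict(zip(columns, json.loads(line)))


objects = list(rows_as_objects(metadata['columns'], ndjson_text))
print(objects[0]['path'])  # swn-everything
```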
",Simon Willison,,text/markdown,https://github.com/simonw/sqlite-diffable,,"Apache License, Version 2.0",,,https://pypi.org/project/sqlite-diffable/,,https://pypi.org/project/sqlite-diffable/,"{""CI"": ""https://github.com/simonw/sqlite-diffable/actions"", ""Changelog"": ""https://github.com/simonw/sqlite-diffable/releases"", ""Homepage"": ""https://github.com/simonw/sqlite-diffable"", ""Issues"": ""https://github.com/simonw/sqlite-diffable/issues""}",https://pypi.org/project/sqlite-diffable/0.5/,"[""click"", ""sqlite-utils"", ""pytest ; extra == 'test'"", ""black ; extra == 'test'""]",,0.5,0,
sqlite-utils,CLI tool and Python utility functions for manipulating SQLite databases,"[""Development Status :: 5 - Production/Stable"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.10"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Programming Language :: Python :: 3.8"", ""Programming Language :: Python :: 3.9"", ""Topic :: Database""]","# sqlite-utils

[![PyPI](https://img.shields.io/pypi/v/sqlite-utils.svg)](https://pypi.org/project/sqlite-utils/)
[![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-utils?include_prereleases&label=changelog)](https://sqlite-utils.datasette.io/en/stable/changelog.html)
[![Python 3.x](https://img.shields.io/pypi/pyversions/sqlite-utils.svg?logo=python&logoColor=white)](https://pypi.org/project/sqlite-utils/)
[![Tests](https://github.com/simonw/sqlite-utils/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-utils/actions?query=workflow%3ATest)
[![Documentation Status](https://readthedocs.org/projects/sqlite-utils/badge/?version=stable)](http://sqlite-utils.datasette.io/en/stable/?badge=stable)
[![codecov](https://codecov.io/gh/simonw/sqlite-utils/branch/main/graph/badge.svg)](https://codecov.io/gh/simonw/sqlite-utils)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-utils/blob/main/LICENSE)
[![discord](https://img.shields.io/discord/823971286308356157?label=discord)](https://discord.gg/Ass7bCAMDw)

Python CLI utility and library for manipulating SQLite databases.

## Some feature highlights

- [Pipe JSON](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-json-data) (or [CSV or TSV](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-csv-or-tsv-data)) directly into a new SQLite database file, automatically creating a table with the appropriate schema
- [Run in-memory SQL queries](https://sqlite-utils.datasette.io/en/stable/cli.html#querying-data-directly-using-an-in-memory-database), including joins, directly against data in CSV, TSV or JSON files and view the results
- [Configure SQLite full-text search](https://sqlite-utils.datasette.io/en/stable/cli.html#configuring-full-text-search) against your database tables and run search queries against them, ordered by relevance
- Run [transformations against your tables](https://sqlite-utils.datasette.io/en/stable/cli.html#transforming-tables) to make schema changes that SQLite `ALTER TABLE` does not directly support, such as changing the type of a column
- [Extract columns](https://sqlite-utils.datasette.io/en/stable/cli.html#extracting-columns-into-a-separate-table) into separate tables to better normalize your existing data

Read more on my blog, in this series of posts on [New features in sqlite-utils](https://simonwillison.net/series/sqlite-utils-features/) and other [entries tagged sqliteutils](https://simonwillison.net/tags/sqliteutils/).

## Installation

    pip install sqlite-utils

Or if you use [Homebrew](https://brew.sh/) for macOS:

    brew install sqlite-utils

## Using as a CLI tool

Now you can do things with the CLI utility like this:

    $ sqlite-utils memory dogs.csv ""select * from t""
    [{""id"": 1, ""age"": 4, ""name"": ""Cleo""},
     {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}]

    $ sqlite-utils insert dogs.db dogs dogs.csv --csv
    [####################################]  100%

    $ sqlite-utils tables dogs.db --counts
    [{""table"": ""dogs"", ""count"": 2}]

    $ sqlite-utils dogs.db ""select id, name from dogs""
    [{""id"": 1, ""name"": ""Cleo""},
     {""id"": 2, ""name"": ""Pancakes""}]

    $ sqlite-utils dogs.db ""select * from dogs"" --csv
    id,age,name
    1,4,Cleo
    2,2,Pancakes

    $ sqlite-utils dogs.db ""select * from dogs"" --table
      id    age  name
    ----  -----  --------
       1      4  Cleo
       2      2  Pancakes

You can import JSON data into a new database table like this:

    $ curl https://api.github.com/repos/simonw/sqlite-utils/releases \
        | sqlite-utils insert releases.db releases - --pk id

Or for data in a CSV file:

    $ sqlite-utils insert dogs.db dogs dogs.csv --csv

`sqlite-utils memory` lets you import CSV or JSON data into an in-memory database and run SQL queries against it in a single command:

    $ cat dogs.csv | sqlite-utils memory - ""select name, age from stdin""
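
To make the idea behind `sqlite-utils memory` concrete, here is a rough standard-library sketch of the same pattern: load CSV rows into an in-memory SQLite table, then run a query against it. This is not how `sqlite-utils` itself is implemented. The `query_csv` helper and the default table name `t` are made up for illustration, and every column is stored as TEXT, whereas the example output earlier in this section shows `sqlite-utils` returning integers for numeric columns.

```python
import csv
import io
import sqlite3


def query_csv(csv_text, sql, table='t'):
    # Hypothetical helper: load CSV text into an in-memory table, run one query.
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    conn = sqlite3.connect(':memory:')
    conn.row_factory = sqlite3.Row
    # Simplification: every column is TEXT, no type detection is attempted.
    columns = ', '.join('[{}] TEXT'.format(name) for name in header)
    conn.execute('CREATE TABLE [{}] ({})'.format(table, columns))
    placeholders = ', '.join('?' for _ in header)
    conn.executemany(
        'INSERT INTO [{}] VALUES ({})'.format(table, placeholders), reader
    )
    rows = [dict(row) for row in conn.execute(sql)]
    conn.close()
    return rows


dogs_csv = 'id,name,age\n1,Cleo,4\n2,Pancakes,2\n'
print(query_csv(dogs_csv, 'select name, age from t'))
```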

See the [full CLI documentation](https://sqlite-utils.datasette.io/en/stable/cli.html) for comprehensive coverage of many more commands.

## Using as a library

You can also `import sqlite_utils` and use it as a Python library like this:

```python
import sqlite_utils
db = sqlite_utils.Database(""demo_database.db"")
# This line creates a ""dogs"" table if one does not already exist:
db[""dogs""].insert_all([
    {""id"": 1, ""age"": 4, ""name"": ""Cleo""},
    {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}
], pk=""id"")
```

Check out the [full library documentation](https://sqlite-utils.datasette.io/en/stable/python-api.html) for everything else you can do with the Python library.

## Related projects

* [Datasette](https://datasette.io/): A tool for exploring and publishing data
* [csvs-to-sqlite](https://github.com/simonw/csvs-to-sqlite): Convert CSV files into a SQLite database
* [db-to-sqlite](https://github.com/simonw/db-to-sqlite): CLI tool for exporting a MySQL or PostgreSQL database as a SQLite file
* [dogsheep](https://dogsheep.github.io/): A family of tools for personal analytics, built on top of `sqlite-utils`
",Simon Willison,,text/markdown,https://github.com/simonw/sqlite-utils,,"Apache License, Version 2.0",,,https://pypi.org/project/sqlite-utils/,,https://pypi.org/project/sqlite-utils/,"{""CI"": ""https://github.com/simonw/sqlite-utils/actions"", ""Changelog"": ""https://sqlite-utils.datasette.io/en/stable/changelog.html"", ""Documentation"": ""https://sqlite-utils.datasette.io/en/stable/"", ""Homepage"": ""https://github.com/simonw/sqlite-utils"", ""Issues"": ""https://github.com/simonw/sqlite-utils/issues"", ""Source code"": ""https://github.com/simonw/sqlite-utils""}",https://pypi.org/project/sqlite-utils/3.30/,"[""sqlite-fts4"", ""click"", ""click-default-group-wheel"", ""tabulate"", ""python-dateutil"", ""furo ; extra == 'docs'"", ""sphinx-autobuild ; extra == 'docs'"", ""codespell ; extra == 'docs'"", ""sphinx-copybutton ; extra == 'docs'"", ""beanbag-docutils (>=2.0) ; extra == 'docs'"", ""flake8 ; extra == 'flake8'"", ""mypy ; extra == 'mypy'"", ""types-click ; extra == 'mypy'"", ""types-tabulate ; extra == 'mypy'"", ""types-python-dateutil ; extra == 'mypy'"", ""data-science-types ; extra == 'mypy'"", ""pytest ; extra == 'test'"", ""black ; extra == 'test'"", ""hypothesis ; extra == 'test'"", ""cogapp ; extra == 'test'""]",>=3.6,3.30,0,
yaml-to-sqlite,Utility for converting YAML files to SQLite,"[""Development Status :: 3 - Alpha"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7""]","# yaml-to-sqlite

[![PyPI](https://img.shields.io/pypi/v/yaml-to-sqlite.svg)](https://pypi.org/project/yaml-to-sqlite/)
[![Changelog](https://img.shields.io/github/v/release/simonw/yaml-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/yaml-to-sqlite/releases)
[![Tests](https://github.com/simonw/yaml-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/yaml-to-sqlite/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/yaml-to-sqlite/blob/main/LICENSE)

Load the contents of a YAML file into a SQLite database table.

```
$ yaml-to-sqlite --help
Usage: yaml-to-sqlite [OPTIONS] DB_PATH TABLE YAML_FILE

  Convert YAML files to SQLite

Options:
  --version             Show the version and exit.
  --pk TEXT             Column to use as a primary key
  --single-column TEXT  If YAML file is a list of values, populate this column
  --help                Show this message and exit.
```
## Usage

Given a `news.yml` file containing the following:
```yaml
- date: 2021-06-05
  body: |-
    [Datasette 0.57](https://docs.datasette.io/en/stable/changelog.html#v0-57) is out with an important security patch.
- date: 2021-05-10
  body: |-
    [Django SQL Dashboard](https://simonwillison.net/2021/May/10/django-sql-dashboard/) is a new tool that brings a useful authenticated subset of Datasette to Django projects that are built on top of PostgreSQL.
```
Running this command:
```bash
$ yaml-to-sqlite news.db stories news.yml
```
Will create a database file with this schema:
```bash
$ sqlite-utils schema news.db
CREATE TABLE [stories] (
   [date] TEXT,
   [body] TEXT
);
```
The `--pk` option can be used to set a column as the primary key for the table:

```bash
$ yaml-to-sqlite news.db stories news.yml --pk date
$ sqlite-utils schema news.db
CREATE TABLE [stories] (
   [date] TEXT PRIMARY KEY,
   [body] TEXT
);
```
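Conceptually, the conversion takes a list of mappings and turns each key into a column. The sketch below illustrates that with the standard library only. It assumes the YAML has already been parsed into a list of dictionaries (for example with `yaml.safe_load` from PyYAML), keeps dates as plain strings for simplicity, and uses a hypothetical `rows_to_table` helper that is not part of `yaml-to-sqlite`.

```python
import sqlite3


def rows_to_table(conn, table, rows, pk=None):
    # Hypothetical helper: one TEXT column per key, in first-seen order.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    defs = ', '.join(
        '[{}] TEXT{}'.format(col, ' PRIMARY KEY' if col == pk else '')
        for col in columns
    )
    conn.execute('CREATE TABLE [{}] ({})'.format(table, defs))
    placeholders = ', '.join('?' for _ in columns)
    conn.executemany(
        'INSERT INTO [{}] VALUES ({})'.format(table, placeholders),
        [[row.get(col) for col in columns] for row in rows],
    )


# Stand-in for the parsed contents of news.yml (bodies shortened).
stories = [
    {'date': '2021-06-05', 'body': 'Datasette 0.57 is out.'},
    {'date': '2021-05-10', 'body': 'Django SQL Dashboard is a new tool.'},
]
conn = sqlite3.connect(':memory:')
rows_to_table(conn, 'stories', stories, pk='date')
schema = conn.execute(
    'select sql from sqlite_master where name = ?', ['stories']
).fetchone()[0]
print(schema)
```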
## Single column YAML lists

The `--single-column` option can be used when the YAML file is a list of values, for example a file called `dogs.yml` containing the following:

```yaml
- Cleo
- Pancakes
- Nixie
```
Running this command:
```bash
$ yaml-to-sqlite dogs.db dogs dogs.yml --single-column=name
```
Will create a single `dogs` table with a single `name` column that is the primary key:

```bash
$ sqlite-utils schema dogs.db
CREATE TABLE [dogs] (
   [name] TEXT PRIMARY KEY
);
$ sqlite-utils dogs.db 'select * from dogs' -t
name
--------
Cleo
Pancakes
Nixie
```


",Simon Willison,,text/markdown,https://github.com/simonw/yaml-to-sqlite,,"Apache License, Version 2.0",,,https://pypi.org/project/yaml-to-sqlite/,,https://pypi.org/project/yaml-to-sqlite/,"{""Homepage"": ""https://github.com/simonw/yaml-to-sqlite""}",https://pypi.org/project/yaml-to-sqlite/1.0/,"[""click"", ""PyYAML"", ""sqlite-utils (>=3.9.1)"", ""pytest ; extra == 'test'""]",,1.0,0,