name,summary,classifiers,description,author,author_email,description_content_type,home_page,keywords,license,maintainer,maintainer_email,package_url,platform,project_url,project_urls,release_url,requires_dist,requires_python,version,yanked,yanked_reason
csvs-to-sqlite,Convert CSV files into a SQLite database,"[""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Programming Language :: Python :: 3.8"", ""Programming Language :: Python :: 3.9"", ""Topic :: Database""]","# csvs-to-sqlite

[![PyPI](https://img.shields.io/pypi/v/csvs-to-sqlite.svg)](https://pypi.org/project/csvs-to-sqlite/)
[![Changelog](https://img.shields.io/github/v/release/simonw/csvs-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/csvs-to-sqlite/releases)
[![Tests](https://github.com/simonw/csvs-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/csvs-to-sqlite/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/csvs-to-sqlite/blob/main/LICENSE)

Convert CSV files into a SQLite database. Browse and publish that SQLite database with [Datasette](https://github.com/simonw/datasette).

Basic usage:

    csvs-to-sqlite myfile.csv mydatabase.db

This will create a new SQLite database called `mydatabase.db` with a
single table, `myfile`, holding the CSV content.

You can provide multiple CSV files:

    csvs-to-sqlite one.csv two.csv bundle.db

The `bundle.db` database will contain two tables, `one` and `two`.

This means you can use wildcards:

    csvs-to-sqlite ~/Downloads/*.csv my-downloads.db

If you pass a path to one or more directories, the script will recursively
search those directories for CSV files and create a table for each one.

    csvs-to-sqlite ~/path/to/directory all-my-csvs.db
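
The file and directory handling described above can be pictured with the following minimal sketch. It is not the tool's actual implementation: `collect_csvs` is a hypothetical helper, and the rule that the table name is the file name without its extension is an assumption based on the `myfile.csv` and `one.csv` examples.

```python
import tempfile
from pathlib import Path


def collect_csvs(paths):
    # Hypothetical helper: expand files and directories into (table_name, path) pairs
    found = []
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            # Search the directory recursively for .csv files
            candidates = sorted(path.rglob('*.csv'))
        else:
            candidates = [path]
        for csv_path in candidates:
            # Assumption: the table name is the file name without its extension
            found.append((csv_path.stem, csv_path))
    return found


# Build a throwaway directory tree to try it on
root = Path(tempfile.mkdtemp())
(root / 'nested').mkdir()
(root / 'one.csv').write_text('id\n1\n')
(root / 'nested' / 'two.csv').write_text('id\n2\n')
(root / 'notes.txt').write_text('ignored')

tables = [name for name, _ in collect_csvs([root])]
print(tables)
```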

## Handling TSV (tab-separated values)

You can use the `-s` option to specify a different delimiter. If you want
to use a tab character you'll need to apply shell escaping like so:

    csvs-to-sqlite my-file.tsv my-file.db -s $'\t'
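
The `$'\t'` syntax is shell quoting that turns the escape into a real tab character. If you launch the tool from Python instead, no shell is involved, so a sketch like the following (assuming `csvs-to-sqlite` is installed and on your `PATH`) can pass the tab directly in the argument list:

```python
# Argument list for subprocess.run(); no shell is involved,
# so the tab character needs no extra escaping
argv = ['csvs-to-sqlite', 'my-file.tsv', 'my-file.db', '-s', '\t']

# To actually run it (requires csvs-to-sqlite to be installed):
# import subprocess
# subprocess.run(argv, check=True)

print(argv)
```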

## Refactoring columns into separate lookup tables

Let's say you have a CSV file that looks like this:

    county,precinct,office,district,party,candidate,votes
    Clark,1,President,,REP,John R. Kasich,5
    Clark,2,President,,REP,John R. Kasich,0
    Clark,3,President,,REP,John R. Kasich,7

([Real example taken from the Open Elections project](https://github.com/openelections/openelections-data-sd/blob/master/2016/20160607__sd__primary__clark__precinct.csv))

You can convert selected columns into separate lookup tables using the
`--extract-column` option (short name: `-c`). For example:

    csvs-to-sqlite openelections-data-*/*.csv \
        -c county:County:name \
        -c precinct:Precinct:name \
        -c office -c district -c party -c candidate \
        openelections.db

The format is as follows:

    column_name:optional_table_name:optional_table_value_column_name

If you just specify the column name e.g. `-c office`, the following table will
be created:

    CREATE TABLE ""office"" (
        ""id"" INTEGER PRIMARY KEY,
        ""value"" TEXT
    );

If you specify all three options, e.g. `-c precinct:Precinct:name`, the table
will look like this:

    CREATE TABLE ""Precinct"" (
        ""id"" INTEGER PRIMARY KEY,
        ""name"" TEXT
    );
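
To make the defaults concrete, here is a small sketch of how such a specification could be split into its three parts. `parse_extract_spec` is a hypothetical function, not the tool's own parser, and the behavior for a two-part value such as `county:County` (falling back to a `value` column) is an assumption.

```python
def parse_extract_spec(spec):
    # Hypothetical parser for column:table:value_column specifications
    parts = spec.split(':')
    if not 1 <= len(parts) <= 3 or not parts[0]:
        raise ValueError('expected column[:table[:value_column]], got %r' % spec)
    column = parts[0]
    # Defaults follow the examples above: the table is named after the
    # column, and the extracted strings are stored in a column called value
    table = parts[1] if len(parts) > 1 and parts[1] else column
    value_column = parts[2] if len(parts) > 2 and parts[2] else 'value'
    return column, table, value_column


print(parse_extract_spec('office'))
print(parse_extract_spec('precinct:Precinct:name'))
```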

The original tables will be created like this:

    CREATE TABLE ""ca__primary__san_francisco__precinct"" (
        ""county"" INTEGER,
        ""precinct"" INTEGER,
        ""office"" INTEGER,
        ""district"" INTEGER,
        ""party"" INTEGER,
        ""candidate"" INTEGER,
        ""votes"" INTEGER,
        FOREIGN KEY (county) REFERENCES County(id),
        FOREIGN KEY (party) REFERENCES party(id),
        FOREIGN KEY (precinct) REFERENCES Precinct(id),
        FOREIGN KEY (office) REFERENCES office(id),
        FOREIGN KEY (candidate) REFERENCES candidate(id)
    );

They will be populated with IDs that reference the new derived tables.
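
The idea can be illustrated with Python's built-in `sqlite3` module. This is a sketch of the general technique rather than what `csvs-to-sqlite` does internally: only the `candidate` column is extracted, the `results` table name is made up for the example, and the `UNIQUE` constraint on the lookup table is an assumption added to keep the deduplication short.

```python
import sqlite3

rows = [
    ('Clark', 1, 'President', 'REP', 'John R. Kasich', 5),
    ('Clark', 2, 'President', 'REP', 'John R. Kasich', 0),
    ('Clark', 3, 'President', 'REP', 'John R. Kasich', 7),
]

conn = sqlite3.connect(':memory:')
# Lookup table: UNIQUE is an assumption of this sketch
conn.execute('CREATE TABLE candidate (id INTEGER PRIMARY KEY, value TEXT UNIQUE)')
conn.execute(
    'CREATE TABLE results ('
    ' county TEXT, precinct INTEGER, office TEXT, party TEXT,'
    ' candidate INTEGER REFERENCES candidate(id), votes INTEGER)'
)

for county, precinct, office, party, candidate, votes in rows:
    # Store each distinct candidate once, then look up its id
    conn.execute('INSERT OR IGNORE INTO candidate (value) VALUES (?)', (candidate,))
    candidate_id = conn.execute(
        'SELECT id FROM candidate WHERE value = ?', (candidate,)
    ).fetchone()[0]
    conn.execute(
        'INSERT INTO results VALUES (?, ?, ?, ?, ?, ?)',
        (county, precinct, office, party, candidate_id, votes),
    )

# A join recovers the original text alongside the vote counts
total = conn.execute(
    'SELECT candidate.value, SUM(results.votes) FROM results'
    ' JOIN candidate ON results.candidate = candidate.id'
    ' GROUP BY candidate.value'
).fetchall()
print(total)
```

The three input rows share one candidate, so the lookup table ends up with a single row and the main table stores only its integer ID.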

## Installation

    $ pip install csvs-to-sqlite

`csvs-to-sqlite` now requires Python 3. If you are running Python 2 you can install the last version to support Python 2:

    $ pip install csvs-to-sqlite==0.9.2

## csvs-to-sqlite --help

<!-- [[[cog
import cog
from csvs_to_sqlite import cli
from click.testing import CliRunner
runner = CliRunner()
result = runner.invoke(cli.cli, [""--help""])
help = result.output.replace(""Usage: cli"", ""Usage: csvs-to-sqlite"")
cog.out(
    ""```\n{}\n```"".format(help)
)
]]] -->
```
Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME

  PATHS: paths to individual .csv files or to directories containing .csvs

  DBNAME: name of the SQLite database file to create

Options:
  -s, --separator TEXT            Field separator in input .csv
  -q, --quoting INTEGER           Control field quoting behavior per csv.QUOTE_*
                                  constants. Use one of QUOTE_MINIMAL (0),
                                  QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or
                                  QUOTE_NONE (3).

  --skip-errors                   Skip lines with too many fields instead of
                                  stopping the import

  --replace-tables                Replace tables if they already exist
  -t, --table TEXT                Table to use (instead of using CSV filename)
  -c, --extract-column TEXT       One or more columns to 'extract' into a
                                  separate lookup table. If you pass a simple
                                  column name that column will be replaced with
                                  integer foreign key references to a new table
                                  of that name. You can customize the name of
                                  the table like so:     state:States:state_name
                                  
                                  This will pull unique values from the 'state'
                                  column and use them to populate a new 'States'
                                  table, with an id column primary key and a
                                  state_name column containing the strings from
                                  the original column.

  -d, --date TEXT                 One or more columns to parse into ISO
                                  formatted dates

  -dt, --datetime TEXT            One or more columns to parse into ISO
                                  formatted datetimes

  -df, --datetime-format TEXT     One or more custom date format strings to try
                                  when parsing dates/datetimes

  -pk, --primary-key TEXT         One or more columns to use as the primary key
  -f, --fts TEXT                  One or more columns to use to populate a full-
                                  text index

  -i, --index TEXT                Add index on this column (or a compound index
                                  with -i col1,col2)

  --shape TEXT                    Custom shape for the DB table - format is
                                  csvcol:dbcol(TYPE),...

  --filename-column TEXT          Add a column with this name and populate with
                                  CSV file name

  --fixed-column <TEXT TEXT>...   Populate column with a fixed string
  --fixed-column-int <TEXT INTEGER>...
                                  Populate column with a fixed integer
  --fixed-column-float <TEXT FLOAT>...
                                  Populate column with a fixed float
  --no-index-fks                  Skip adding index to foreign key columns
                                  created using --extract-column (default is to
                                  add them)

  --no-fulltext-fks               Skip adding full-text index on values
                                  extracted using --extract-column (default is
                                  to add them)

  --just-strings                  Import all columns as text strings by default
                                  (and, if specified, still obey --shape,
                                  --date/datetime, and --datetime-format)

  --version                       Show the version and exit.
  --help                          Show this message and exit.

```
<!-- [[[end]]] -->


",Simon Willison,,text/markdown,https://github.com/simonw/csvs-to-sqlite,,"Apache License, Version 2.0",,,https://pypi.org/project/csvs-to-sqlite/,,https://pypi.org/project/csvs-to-sqlite/,"{""Homepage"": ""https://github.com/simonw/csvs-to-sqlite""}",https://pypi.org/project/csvs-to-sqlite/1.3/,"[""click (~=7.0)"", ""dateparser (>=1.0)"", ""pandas (>=1.0)"", ""py-lru-cache (~=0.1.4)"", ""six"", ""pytest ; extra == 'test'"", ""cogapp ; extra == 'test'""]",,1.3,0,
sqlite-utils,CLI tool and Python utility functions for manipulating SQLite databases,"[""Development Status :: 5 - Production/Stable"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.10"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Programming Language :: Python :: 3.8"", ""Programming Language :: Python :: 3.9"", ""Topic :: Database""]","# sqlite-utils

[![PyPI](https://img.shields.io/pypi/v/sqlite-utils.svg)](https://pypi.org/project/sqlite-utils/)
[![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-utils?include_prereleases&label=changelog)](https://sqlite-utils.datasette.io/en/stable/changelog.html)
[![Python 3.x](https://img.shields.io/pypi/pyversions/sqlite-utils.svg?logo=python&logoColor=white)](https://pypi.org/project/sqlite-utils/)
[![Tests](https://github.com/simonw/sqlite-utils/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-utils/actions?query=workflow%3ATest)
[![Documentation Status](https://readthedocs.org/projects/sqlite-utils/badge/?version=stable)](http://sqlite-utils.datasette.io/en/stable/?badge=stable)
[![codecov](https://codecov.io/gh/simonw/sqlite-utils/branch/main/graph/badge.svg)](https://codecov.io/gh/simonw/sqlite-utils)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-utils/blob/main/LICENSE)
[![discord](https://img.shields.io/discord/823971286308356157?label=discord)](https://discord.gg/Ass7bCAMDw)

Python CLI utility and library for manipulating SQLite databases.

## Some feature highlights

- [Pipe JSON](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-json-data) (or [CSV or TSV](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-csv-or-tsv-data)) directly into a new SQLite database file, automatically creating a table with the appropriate schema
- [Run in-memory SQL queries](https://sqlite-utils.datasette.io/en/stable/cli.html#querying-data-directly-using-an-in-memory-database), including joins, directly against data in CSV, TSV or JSON files and view the results
- [Configure SQLite full-text search](https://sqlite-utils.datasette.io/en/stable/cli.html#configuring-full-text-search) against your database tables and run search queries against them, ordered by relevance
- Run [transformations against your tables](https://sqlite-utils.datasette.io/en/stable/cli.html#transforming-tables) to make schema changes that SQLite `ALTER TABLE` does not directly support, such as changing the type of a column
- [Extract columns](https://sqlite-utils.datasette.io/en/stable/cli.html#extracting-columns-into-a-separate-table) into separate tables to better normalize your existing data
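
As background for the table transformation highlight, the usual way to change a column type in SQLite is to rebuild the table: create a new table with the desired schema, copy the rows across, drop the old table and rename the new one. The sketch below shows that general pattern with the standard `sqlite3` module; it is not taken from the `sqlite-utils` source, and the `dogs_new` table name is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE dogs (id INTEGER PRIMARY KEY, age TEXT, name TEXT)')
conn.executemany(
    'INSERT INTO dogs VALUES (?, ?, ?)',
    [(1, '4', 'Cleo'), (2, '2', 'Pancakes')],
)

# Change the type of age from TEXT to INTEGER by rebuilding the table
with conn:
    conn.execute('CREATE TABLE dogs_new (id INTEGER PRIMARY KEY, age INTEGER, name TEXT)')
    conn.execute('INSERT INTO dogs_new SELECT id, CAST(age AS INTEGER), name FROM dogs')
    conn.execute('DROP TABLE dogs')
    conn.execute('ALTER TABLE dogs_new RENAME TO dogs')

ages = [row[0] for row in conn.execute('SELECT typeof(age) FROM dogs')]
print(ages)
```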

Read more on my blog, in this series of posts on [New features in sqlite-utils](https://simonwillison.net/series/sqlite-utils-features/) and other [entries tagged sqliteutils](https://simonwillison.net/tags/sqliteutils/).

## Installation

    pip install sqlite-utils

Or if you use [Homebrew](https://brew.sh/) for macOS:

    brew install sqlite-utils

## Using as a CLI tool

Now you can do things with the CLI utility like this:

    $ sqlite-utils memory dogs.csv ""select * from t""
    [{""id"": 1, ""age"": 4, ""name"": ""Cleo""},
     {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}]

    $ sqlite-utils insert dogs.db dogs dogs.csv --csv
    [####################################]  100%

    $ sqlite-utils tables dogs.db --counts
    [{""table"": ""dogs"", ""count"": 2}]

    $ sqlite-utils dogs.db ""select id, name from dogs""
    [{""id"": 1, ""name"": ""Cleo""},
     {""id"": 2, ""name"": ""Pancakes""}]

    $ sqlite-utils dogs.db ""select * from dogs"" --csv
    id,age,name
    1,4,Cleo
    2,2,Pancakes

    $ sqlite-utils dogs.db ""select * from dogs"" --table
      id    age  name
    ----  -----  --------
       1      4  Cleo
       2      2  Pancakes

You can import JSON data into a new database table like this:

    $ curl https://api.github.com/repos/simonw/sqlite-utils/releases \
        | sqlite-utils insert releases.db releases - --pk id

Or for data in a CSV file:

    $ sqlite-utils insert dogs.db dogs dogs.csv --csv

`sqlite-utils memory` lets you import CSV or JSON data into an in-memory database and run SQL queries against it in a single command:

    $ cat dogs.csv | sqlite-utils memory - ""select name, age from stdin""
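
Conceptually, this loads the CSV rows into a temporary table and then queries it. The following standard-library sketch shows that idea under simplifying assumptions: it is not how `sqlite-utils` is implemented, the CSV text is inlined rather than read from standard input, and every value is kept as a text string instead of having its type detected.

```python
import csv
import io
import sqlite3

csv_text = 'id,age,name\n1,4,Cleo\n2,2,Pancakes\n'

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)

conn = sqlite3.connect(':memory:')
# Column names come from the CSV header; brackets quote them as identifiers
columns = ', '.join('[%s]' % name for name in header)
conn.execute('CREATE TABLE stdin (%s)' % columns)
placeholders = ', '.join('?' for _ in header)
# The remaining rows of the reader are inserted as plain strings
conn.executemany('INSERT INTO stdin VALUES (%s)' % placeholders, reader)

result = conn.execute('SELECT name, age FROM stdin').fetchall()
print(result)
```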

See the [full CLI documentation](https://sqlite-utils.datasette.io/en/stable/cli.html) for comprehensive coverage of many more commands.

## Using as a library

You can also `import sqlite_utils` and use it as a Python library like this:

```python
import sqlite_utils
db = sqlite_utils.Database(""demo_database.db"")
# This line creates a ""dogs"" table if one does not already exist:
db[""dogs""].insert_all([
    {""id"": 1, ""age"": 4, ""name"": ""Cleo""},
    {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}
], pk=""id"")
```

Check out the [full library documentation](https://sqlite-utils.datasette.io/en/stable/python-api.html) for everything else you can do with the Python library.

## Related projects

* [Datasette](https://datasette.io/): A tool for exploring and publishing data
* [csvs-to-sqlite](https://github.com/simonw/csvs-to-sqlite): Convert CSV files into a SQLite database
* [db-to-sqlite](https://github.com/simonw/db-to-sqlite): CLI tool for exporting a MySQL or PostgreSQL database as a SQLite file
* [dogsheep](https://dogsheep.github.io/): A family of tools for personal analytics, built on top of `sqlite-utils`
",Simon Willison,,text/markdown,https://github.com/simonw/sqlite-utils,,"Apache License, Version 2.0",,,https://pypi.org/project/sqlite-utils/,,https://pypi.org/project/sqlite-utils/,"{""CI"": ""https://github.com/simonw/sqlite-utils/actions"", ""Changelog"": ""https://sqlite-utils.datasette.io/en/stable/changelog.html"", ""Documentation"": ""https://sqlite-utils.datasette.io/en/stable/"", ""Homepage"": ""https://github.com/simonw/sqlite-utils"", ""Issues"": ""https://github.com/simonw/sqlite-utils/issues"", ""Source code"": ""https://github.com/simonw/sqlite-utils""}",https://pypi.org/project/sqlite-utils/3.30/,"[""sqlite-fts4"", ""click"", ""click-default-group-wheel"", ""tabulate"", ""python-dateutil"", ""furo ; extra == 'docs'"", ""sphinx-autobuild ; extra == 'docs'"", ""codespell ; extra == 'docs'"", ""sphinx-copybutton ; extra == 'docs'"", ""beanbag-docutils (>=2.0) ; extra == 'docs'"", ""flake8 ; extra == 'flake8'"", ""mypy ; extra == 'mypy'"", ""types-click ; extra == 'mypy'"", ""types-tabulate ; extra == 'mypy'"", ""types-python-dateutil ; extra == 'mypy'"", ""data-science-types ; extra == 'mypy'"", ""pytest ; extra == 'test'"", ""black ; extra == 'test'"", ""hypothesis ; extra == 'test'"", ""cogapp ; extra == 'test'""]",>=3.6,3.30,0,