name,summary,classifiers,description,author,author_email,description_content_type,home_page,keywords,license,maintainer,maintainer_email,package_url,platform,project_url,project_urls,release_url,requires_dist,requires_python,version,yanked,yanked_reason
csvs-to-sqlite,Convert CSV files into a SQLite database,"[""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Programming Language :: Python :: 3.8"", ""Programming Language :: Python :: 3.9"", ""Topic :: Database""]","# csvs-to-sqlite

[![PyPI](https://img.shields.io/pypi/v/csvs-to-sqlite.svg)](https://pypi.org/project/csvs-to-sqlite/)
[![Changelog](https://img.shields.io/github/v/release/simonw/csvs-to-sqlite?include_prereleases&label=changelog)](https://github.com/simonw/csvs-to-sqlite/releases)
[![Tests](https://github.com/simonw/csvs-to-sqlite/workflows/Test/badge.svg)](https://github.com/simonw/csvs-to-sqlite/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/csvs-to-sqlite/blob/main/LICENSE)

Convert CSV files into a SQLite database. Browse and publish that SQLite database with [Datasette](https://github.com/simonw/datasette).

Basic usage:

    csvs-to-sqlite myfile.csv mydatabase.db

This will create a new SQLite database called `mydatabase.db` with a single table, `myfile`, containing the CSV content.

You can provide multiple CSV files:

    csvs-to-sqlite one.csv two.csv bundle.db

The `bundle.db` database will contain two tables, `one` and `two`.

This means you can use wildcards:

    csvs-to-sqlite ~/Downloads/*.csv my-downloads.db

If you pass a path to one or more directories, the script will recursively search those directories for CSV files and create tables for each one.
    csvs-to-sqlite ~/path/to/directory all-my-csvs.db

## Handling TSV (tab-separated values)

You can use the `-s` option to specify a different delimiter. If you want to use a tab character you'll need to apply shell escaping like so:

    csvs-to-sqlite my-file.tsv my-file.db -s $'\t'

## Refactoring columns into separate lookup tables

Let's say you have a CSV file that looks like this:

    county,precinct,office,district,party,candidate,votes
    Clark,1,President,,REP,John R. Kasich,5
    Clark,2,President,,REP,John R. Kasich,0
    Clark,3,President,,REP,John R. Kasich,7

([Real example taken from the Open Elections project](https://github.com/openelections/openelections-data-sd/blob/master/2016/20160607__sd__primary__clark__precinct.csv))

You can convert selected columns into separate lookup tables using the `--extract-column` option (shortname: `-c`) - for example:

    csvs-to-sqlite openelections-data-*/*.csv \
        -c county:County:name \
        -c precinct:Precinct:name \
        -c office -c district -c party -c candidate \
        openelections.db

The format is as follows:

    column_name:optional_table_name:optional_table_value_column_name

If you just specify the column name e.g. `-c office`, the following table will be created:

    CREATE TABLE ""office"" (
        ""id"" INTEGER PRIMARY KEY,
        ""value"" TEXT
    );

If you specify all three options, e.g. `-c precinct:Precinct:name` the table will look like this:

    CREATE TABLE ""Precinct"" (
        ""id"" INTEGER PRIMARY KEY,
        ""name"" TEXT
    );

The original tables will be created like this:

    CREATE TABLE ""ca__primary__san_francisco__precinct"" (
        ""county"" INTEGER,
        ""precinct"" INTEGER,
        ""office"" INTEGER,
        ""district"" INTEGER,
        ""party"" INTEGER,
        ""candidate"" INTEGER,
        ""votes"" INTEGER,
        FOREIGN KEY (county) REFERENCES County(id),
        FOREIGN KEY (party) REFERENCES party(id),
        FOREIGN KEY (precinct) REFERENCES Precinct(id),
        FOREIGN KEY (office) REFERENCES office(id),
        FOREIGN KEY (candidate) REFERENCES candidate(id)
    );

They will be populated with IDs that reference the new derived tables.
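
The work `--extract-column` does can be sketched in a few lines of Python. This is not csvs-to-sqlite's implementation (internally the tool is built on pandas) - just a minimal stdlib illustration of the lookup-table idea, using made-up sample rows and a reduced set of columns:

```python
import sqlite3

# Sketch of the --extract-column idea: collect the distinct values from a
# column, assign each an integer id in a lookup table, then store those
# ids in place of the original strings. Sample rows are illustrative only.
rows = [
    ('Clark', 'REP', 'John R. Kasich', 5),
    ('Clark', 'REP', 'John R. Kasich', 0),
    ('Day', 'REP', 'John R. Kasich', 7),
]

db = sqlite3.connect(':memory:')
db.execute('CREATE TABLE county (id INTEGER PRIMARY KEY, value TEXT)')
db.execute(
    'CREATE TABLE results (county INTEGER, party TEXT, candidate TEXT, '
    'votes INTEGER, FOREIGN KEY (county) REFERENCES county(id))'
)

ids = {}  # distinct county name -> lookup table id
for county, party, candidate, votes in rows:
    if county not in ids:
        cur = db.execute('INSERT INTO county (value) VALUES (?)', (county,))
        ids[county] = cur.lastrowid
    db.execute(
        'INSERT INTO results VALUES (?, ?, ?, ?)',
        (ids[county], party, candidate, votes),
    )

print(db.execute('SELECT * FROM county').fetchall())
# [(1, 'Clark'), (2, 'Day')]
```

The real tool additionally indexes the new foreign key columns and builds a full-text index on the extracted values, unless you opt out with `--no-index-fks` / `--no-fulltext-fks`.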
## Installation

    $ pip install csvs-to-sqlite

`csvs-to-sqlite` now requires Python 3. If you are running Python 2 you can install the last version to support Python 2:

    $ pip install csvs-to-sqlite==0.9.2

## csvs-to-sqlite --help

```
Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME

  PATHS: paths to individual .csv files or to directories containing .csvs

  DBNAME: name of the SQLite database file to create

Options:
  -s, --separator TEXT         Field separator in input .csv
  -q, --quoting INTEGER        Control field quoting behavior per csv.QUOTE_*
                               constants. Use one of QUOTE_MINIMAL (0),
                               QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or
                               QUOTE_NONE (3).
  --skip-errors                Skip lines with too many fields instead of
                               stopping the import
  --replace-tables             Replace tables if they already exist
  -t, --table TEXT             Table to use (instead of using CSV filename)
  -c, --extract-column TEXT    One or more columns to 'extract' into a
                               separate lookup table. If you pass a simple
                               column name that column will be replaced with
                               integer foreign key references to a new table
                               of that name. You can customize the name of
                               the table like so: state:States:state_name

                               This will pull unique values from the 'state'
                               column and use them to populate a new 'States'
                               table, with an id column primary key and a
                               state_name column containing the strings from
                               the original column.
  -d, --date TEXT              One or more columns to parse into ISO
                               formatted dates
  -dt, --datetime TEXT         One or more columns to parse into ISO
                               formatted datetimes
  -df, --datetime-format TEXT  One or more custom date format strings to try
                               when parsing dates/datetimes
  -pk, --primary-key TEXT      One or more columns to use as the primary key
  -f, --fts TEXT               One or more columns to use to populate a full-
                               text index
  -i, --index TEXT             Add index on this column (or a compound index
                               with -i col1,col2)
  --shape TEXT                 Custom shape for the DB table - format is
                               csvcol:dbcol(TYPE),...
  --filename-column TEXT       Add a column with this name and populate with
                               CSV file name
  --fixed-column ...
                               Populate column with a fixed string
  --fixed-column-int ...       Populate column with a fixed integer
  --fixed-column-float ...     Populate column with a fixed float
  --no-index-fks               Skip adding index to foreign key columns
                               created using --extract-column (default is to
                               add them)
  --no-fulltext-fks            Skip adding full-text index on values
                               extracted using --extract-column (default is
                               to add them)
  --just-strings               Import all columns as text strings by default
                               (and, if specified, still obey --shape,
                               --date/datetime, and --datetime-format)
  --version                    Show the version and exit.
  --help                       Show this message and exit.
```
",Simon Willison,,text/markdown,https://github.com/simonw/csvs-to-sqlite,,"Apache License, Version 2.0",,,https://pypi.org/project/csvs-to-sqlite/,,https://pypi.org/project/csvs-to-sqlite/,"{""Homepage"": ""https://github.com/simonw/csvs-to-sqlite""}",https://pypi.org/project/csvs-to-sqlite/1.3/,"[""click (~=7.0)"", ""dateparser (>=1.0)"", ""pandas (>=1.0)"", ""py-lru-cache (~=0.1.4)"", ""six"", ""pytest ; extra == 'test'"", ""cogapp ; extra == 'test'""]",,1.3,0,
datasette-gunicorn,Run a Datasette server using Gunicorn,"[""Framework :: Datasette"", ""License :: OSI Approved :: Apache Software License""]","# datasette-gunicorn

[![PyPI](https://img.shields.io/pypi/v/datasette-gunicorn.svg)](https://pypi.org/project/datasette-gunicorn/)
[![Changelog](https://img.shields.io/github/v/release/simonw/datasette-gunicorn?include_prereleases&label=changelog)](https://github.com/simonw/datasette-gunicorn/releases)
[![Tests](https://github.com/simonw/datasette-gunicorn/workflows/Test/badge.svg)](https://github.com/simonw/datasette-gunicorn/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/datasette-gunicorn/blob/main/LICENSE)

Run a [Datasette](https://datasette.io/) server using [Gunicorn](https://gunicorn.org/)

## Installation

Install this plugin in the same environment as Datasette.
    datasette install datasette-gunicorn

## Usage

The plugin adds a new `datasette gunicorn` command. This takes most of the same options as `datasette serve`, plus one more option for setting the number of Gunicorn workers to start:

`-w/--workers X` - set the number of workers. Defaults to 1.

To start serving a database using 4 workers, run the following:

    datasette gunicorn fixtures.db -w 4

It is advisable to switch your database [into WAL mode](https://til.simonwillison.net/sqlite/enabling-wal-mode) to get the best performance out of this configuration:

    sqlite3 fixtures.db 'PRAGMA journal_mode=WAL;'

Run `datasette gunicorn --help` for a full list of options (which are the same as `datasette serve --help`, with the addition of the new `-w` option).

## datasette gunicorn --help

Not all of the options to `datasette serve` are supported. Here's the full list of available options:

```
Usage: datasette gunicorn [OPTIONS] [FILES]...

  Start a Gunicorn server running to serve Datasette

Options:
  -i, --immutable PATH      Database files to open in immutable mode
  -h, --host TEXT           Host for server. Defaults to 127.0.0.1 which
                            means only connections from the local machine
                            will be allowed. Use 0.0.0.0 to listen to all
                            IPs and allow access from other machines.
  -p, --port INTEGER RANGE  Port for server, defaults to 8001. Use -p 0 to
                            automatically assign an available port.
                            [0<=x<=65535]
  --cors                    Enable CORS by serving Access-Control-Allow-
                            Origin: *
  --load-extension TEXT     Path to a SQLite extension to load
  --inspect-file TEXT       Path to JSON file created using ""datasette
                            inspect""
  -m, --metadata FILENAME   Path to JSON/YAML file containing license/source
                            metadata
  --template-dir DIRECTORY  Path to directory containing custom templates
  --plugins-dir DIRECTORY   Path to directory containing custom plugins
  --static MOUNT:DIRECTORY  Serve static files from this directory at
                            /MOUNT/...
  --memory                  Make /_memory database available
  --config CONFIG           Deprecated: set config option using
                            configname:value.
                            Use --setting instead.
  --setting SETTING...      Setting, see
                            docs.datasette.io/en/stable/settings.html
  --secret TEXT             Secret used for signing secure values, such as
                            signed cookies
  --version-note TEXT       Additional note to show on /-/versions
  --help-settings           Show available settings
  --create                  Create database files if they do not exist
  --crossdb                 Enable cross-database joins using the /_memory
                            database
  --nolock                  Ignore locking, open locked files in read-only
                            mode
  -w, --workers INTEGER     Number of Gunicorn workers  [default: 1]
  --help                    Show this message and exit.
```

## Development

To set up this plugin locally, first checkout the code. Then create a new virtual environment:

    cd datasette-gunicorn
    python3 -m venv venv
    source venv/bin/activate

Now install the dependencies and test dependencies:

    pip install -e '.[test]'

To run the tests:

    pytest
",Simon Willison,,text/markdown,https://github.com/simonw/datasette-gunicorn,,"Apache License, Version 2.0",,,https://pypi.org/project/datasette-gunicorn/,,https://pypi.org/project/datasette-gunicorn/,"{""CI"": ""https://github.com/simonw/datasette-gunicorn/actions"", ""Changelog"": ""https://github.com/simonw/datasette-gunicorn/releases"", ""Homepage"": ""https://github.com/simonw/datasette-gunicorn"", ""Issues"": ""https://github.com/simonw/datasette-gunicorn/issues""}",https://pypi.org/project/datasette-gunicorn/0.1/,"[""datasette"", ""gunicorn"", ""pytest ; extra == 'test'"", ""pytest-asyncio ; extra == 'test'"", ""cogapp ; extra == 'test'""]",>=3.7,0.1,0,
sqlite-utils,CLI tool and Python utility functions for manipulating SQLite databases,"[""Development Status :: 5 - Production/Stable"", ""Intended Audience :: Developers"", ""Intended Audience :: End Users/Desktop"", ""Intended Audience :: Science/Research"", ""License :: OSI Approved :: Apache Software License"", ""Programming Language :: Python :: 3.10"", ""Programming Language :: Python :: 3.6"", ""Programming Language :: Python :: 3.7"", ""Programming Language :: Python :: 3.8"",
""Programming Language :: Python :: 3.9"", ""Topic :: Database""]","# sqlite-utils

[![PyPI](https://img.shields.io/pypi/v/sqlite-utils.svg)](https://pypi.org/project/sqlite-utils/)
[![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-utils?include_prereleases&label=changelog)](https://sqlite-utils.datasette.io/en/stable/changelog.html)
[![Python 3.x](https://img.shields.io/pypi/pyversions/sqlite-utils.svg?logo=python&logoColor=white)](https://pypi.org/project/sqlite-utils/)
[![Tests](https://github.com/simonw/sqlite-utils/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-utils/actions?query=workflow%3ATest)
[![Documentation Status](https://readthedocs.org/projects/sqlite-utils/badge/?version=stable)](http://sqlite-utils.datasette.io/en/stable/?badge=stable)
[![codecov](https://codecov.io/gh/simonw/sqlite-utils/branch/main/graph/badge.svg)](https://codecov.io/gh/simonw/sqlite-utils)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-utils/blob/main/LICENSE)
[![discord](https://img.shields.io/discord/823971286308356157?label=discord)](https://discord.gg/Ass7bCAMDw)

Python CLI utility and library for manipulating SQLite databases.
## Some feature highlights

- [Pipe JSON](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-json-data) (or [CSV or TSV](https://sqlite-utils.datasette.io/en/stable/cli.html#inserting-csv-or-tsv-data)) directly into a new SQLite database file, automatically creating a table with the appropriate schema
- [Run in-memory SQL queries](https://sqlite-utils.datasette.io/en/stable/cli.html#querying-data-directly-using-an-in-memory-database), including joins, directly against data in CSV, TSV or JSON files and view the results
- [Configure SQLite full-text search](https://sqlite-utils.datasette.io/en/stable/cli.html#configuring-full-text-search) against your database tables and run search queries against them, ordered by relevance
- Run [transformations against your tables](https://sqlite-utils.datasette.io/en/stable/cli.html#transforming-tables) to make schema changes that SQLite `ALTER TABLE` does not directly support, such as changing the type of a column
- [Extract columns](https://sqlite-utils.datasette.io/en/stable/cli.html#extracting-columns-into-a-separate-table) into separate tables to better normalize your existing data

Read more on my blog, in this series of posts on [New features in sqlite-utils](https://simonwillison.net/series/sqlite-utils-features/) and other [entries tagged sqliteutils](https://simonwillison.net/tags/sqliteutils/).
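
The first highlight above - creating a table with the appropriate schema automatically - comes down to inferring SQLite column types from the values being inserted. Here is a rough stdlib-only sketch of that idea (this is not sqlite-utils' actual implementation, and the `create_table` helper name is made up):

```python
import sqlite3

def create_table(db, name, records):
    # Infer SQLite column types from the first record - a rough sketch of
    # automatic schema detection, not how sqlite-utils does it internally.
    types = {int: 'INTEGER', float: 'FLOAT', str: 'TEXT'}
    cols = ', '.join(
        '[%s] %s' % (key, types.get(type(value), 'TEXT'))
        for key, value in records[0].items()
    )
    db.execute('CREATE TABLE [%s] (%s)' % (name, cols))
    placeholders = ', '.join('?' * len(records[0]))
    db.executemany(
        'INSERT INTO [%s] VALUES (%s)' % (name, placeholders),
        [tuple(record.values()) for record in records],
    )

db = sqlite3.connect(':memory:')
create_table(db, 'dogs', [
    {'id': 1, 'age': 4, 'name': 'Cleo'},
    {'id': 2, 'age': 2, 'name': 'Pancakes'},
])
print(db.execute('SELECT sql FROM sqlite_master').fetchone()[0])
# CREATE TABLE [dogs] ([id] INTEGER, [age] INTEGER, [name] TEXT)
```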
## Installation

    pip install sqlite-utils

Or if you use [Homebrew](https://brew.sh/) for macOS:

    brew install sqlite-utils

## Using as a CLI tool

Now you can do things with the CLI utility like this:

    $ sqlite-utils memory dogs.csv ""select * from t""
    [{""id"": 1, ""age"": 4, ""name"": ""Cleo""}, {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}]

    $ sqlite-utils insert dogs.db dogs dogs.csv --csv
    [####################################]  100%

    $ sqlite-utils tables dogs.db --counts
    [{""table"": ""dogs"", ""count"": 2}]

    $ sqlite-utils dogs.db ""select id, name from dogs""
    [{""id"": 1, ""name"": ""Cleo""}, {""id"": 2, ""name"": ""Pancakes""}]

    $ sqlite-utils dogs.db ""select * from dogs"" --csv
    id,age,name
    1,4,Cleo
    2,2,Pancakes

    $ sqlite-utils dogs.db ""select * from dogs"" --table
      id    age  name
    ----  -----  --------
       1      4  Cleo
       2      2  Pancakes

You can import JSON data into a new database table like this:

    $ curl https://api.github.com/repos/simonw/sqlite-utils/releases \
        | sqlite-utils insert releases.db releases - --pk id

Or for data in a CSV file:

    $ sqlite-utils insert dogs.db dogs dogs.csv --csv

`sqlite-utils memory` lets you import CSV or JSON data into an in-memory database and run SQL queries against it in a single command:

    $ cat dogs.csv | sqlite-utils memory - ""select name, age from stdin""

See the [full CLI documentation](https://sqlite-utils.datasette.io/en/stable/cli.html) for comprehensive coverage of many more commands.

## Using as a library

You can also `import sqlite_utils` and use it as a Python library like this:

```python
import sqlite_utils

db = sqlite_utils.Database(""demo_database.db"")
# This line creates a ""dogs"" table if one does not already exist:
db[""dogs""].insert_all([
    {""id"": 1, ""age"": 4, ""name"": ""Cleo""},
    {""id"": 2, ""age"": 2, ""name"": ""Pancakes""}
], pk=""id"")
```

Check out the [full library documentation](https://sqlite-utils.datasette.io/en/stable/python-api.html) for everything else you can do with the Python library.
## Related projects

* [Datasette](https://datasette.io/): A tool for exploring and publishing data
* [csvs-to-sqlite](https://github.com/simonw/csvs-to-sqlite): Convert CSV files into a SQLite database
* [db-to-sqlite](https://github.com/simonw/db-to-sqlite): CLI tool for exporting a MySQL or PostgreSQL database as a SQLite file
* [dogsheep](https://dogsheep.github.io/): A family of tools for personal analytics, built on top of `sqlite-utils`
",Simon Willison,,text/markdown,https://github.com/simonw/sqlite-utils,,"Apache License, Version 2.0",,,https://pypi.org/project/sqlite-utils/,,https://pypi.org/project/sqlite-utils/,"{""CI"": ""https://github.com/simonw/sqlite-utils/actions"", ""Changelog"": ""https://sqlite-utils.datasette.io/en/stable/changelog.html"", ""Documentation"": ""https://sqlite-utils.datasette.io/en/stable/"", ""Homepage"": ""https://github.com/simonw/sqlite-utils"", ""Issues"": ""https://github.com/simonw/sqlite-utils/issues"", ""Source code"": ""https://github.com/simonw/sqlite-utils""}",https://pypi.org/project/sqlite-utils/3.30/,"[""sqlite-fts4"", ""click"", ""click-default-group-wheel"", ""tabulate"", ""python-dateutil"", ""furo ; extra == 'docs'"", ""sphinx-autobuild ; extra == 'docs'"", ""codespell ; extra == 'docs'"", ""sphinx-copybutton ; extra == 'docs'"", ""beanbag-docutils (>=2.0) ; extra == 'docs'"", ""flake8 ; extra == 'flake8'"", ""mypy ; extra == 'mypy'"", ""types-click ; extra == 'mypy'"", ""types-tabulate ; extra == 'mypy'"", ""types-python-dateutil ; extra == 'mypy'"", ""data-science-types ; extra == 'mypy'"", ""pytest ; extra == 'test'"", ""black ; extra == 'test'"", ""hypothesis ; extra == 'test'"", ""cogapp ; extra == 'test'""]",>=3.6,3.30,0,