pypi_packages: csv-diff
name | summary | classifiers | description | author | author_email | description_content_type | home_page | keywords | license | maintainer | maintainer_email | package_url | platform | project_url | project_urls | release_url | requires_dist | requires_python | version | yanked | yanked_reason |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
csv-diff | Python CLI tool and library for diffing CSV and JSON files | ["Development Status :: 4 - Beta", "Intended Audience :: Developers", "Intended Audience :: End Users/Desktop", "Intended Audience :: Science/Research", "License :: OSI Approved :: Apache Software License", "Programming Language :: Python :: 3.6", "Programming Language :: Python :: 3.7"] | # csv-diff Tool for viewing the difference between two CSV, TSV or JSON files. See [Generating a commit log for San Francisco’s official list of trees](https://simonwillison.net/2019/Mar/13/tree-history/) (and the [sf-tree-history repo commit log](https://github.com/simonw/sf-tree-history/commits)) for background information on this project. ## Installation pip install csv-diff ## Usage Consider two CSV files: `one.csv` id,name,age 1,Cleo,4 2,Pancakes,2 `two.csv` id,name,age 1,Cleo,5 3,Bailey,1 `csv-diff` can show a human-readable summary of differences between the files: $ csv-diff one.csv two.csv --key=id 1 row changed, 1 row added, 1 row removed 1 row changed Row 1 age: "4" => "5" 1 row added id: 3 name: Bailey age: 1 1 row removed id: 2 name: Pancakes age: 2 The `--key=id` option means that the `id` column should be treated as the unique key, to identify which records have changed. The tool will automatically detect if your files are comma- or tab-separated. You can override this automatic detection and force the tool to use a specific format using `--format=tsv` or `--format=csv`. You can also feed it JSON files, provided they are a JSON array of objects where each object has the same keys. Use `--format=json` if your input files are JSON. 
Use `--show-unchanged` to include full details of the unchanged values for rows with at least one change in the diff output: % csv-diff one.csv two.csv --key=id --show-unchanged 1 row changed id: 1 age: "4" => "5" Unchanged: name: "Cleo" You can use the `--json` option to get a machine-readable difference: $ csv-diff one.csv two.csv --key=id --json { "added": [ { "id": "3", "name": "Bailey", "age": "1" } ], "removed": [ { "id": "2", "name": "Pancakes", "age": "2" } ], "changed": [ { "key": "1", "changes": { "age": [ "4", "5" ] } } ], "columns_added": [], "columns_removed": [] } ## As a Python library You can also import the Python library into your own code like so: from csv_diff import load_csv, compare diff = compare( load_csv(open("one.csv"), key="id"), load_csv(open("two.csv"), key="id") ) `diff` will now contain the same data structure as the output in the `--json` example above. If the columns in the CSV have changed, those added or removed columns will be ignored when calculating changes made to specific rows. | Simon Willison | text/markdown | https://github.com/simonw/csv-diff | Apache License, Version 2.0 | https://pypi.org/project/csv-diff/ | https://pypi.org/project/csv-diff/ | {"Homepage": "https://github.com/simonw/csv-diff"} | https://pypi.org/project/csv-diff/1.1/ | ["click", "dictdiffer", "pytest ; extra == 'test'"] | 1.1 | 0 |
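The keyed comparison described in the README above can be illustrated with a minimal, standard-library-only sketch. This is not csv-diff's own implementation: the helper names `load_rows` and `keyed_diff` are hypothetical, and the sketch assumes both files share the same columns, so it omits the `columns_added` and `columns_removed` entries that the `--json` output includes.

```python
import csv
import io


def load_rows(fp, key):
    """Read CSV rows from a file-like object into a dict keyed by the `key` column."""
    # Hypothetical helper for illustration; all values stay as strings.
    return {row[key]: dict(row) for row in csv.DictReader(fp)}


def keyed_diff(previous, current):
    """Compare two {key: row} dicts and report added, removed, and changed rows."""
    # Rows whose key appears only in the newer file.
    added = [current[k] for k in current if k not in previous]
    # Rows whose key appears only in the older file.
    removed = [previous[k] for k in previous if k not in current]
    changed = []
    for k in previous:
        if k not in current:
            continue
        # Record [old, new] for every shared field whose value differs.
        changes = {
            field: [previous[k][field], current[k][field]]
            for field in previous[k]
            if field in current[k] and previous[k][field] != current[k][field]
        }
        if changes:
            changed.append({"key": k, "changes": changes})
    return {"added": added, "removed": removed, "changed": changed}


# The two example files from the README, held in memory.
one = io.StringIO("id,name,age\n1,Cleo,4\n2,Pancakes,2\n")
two = io.StringIO("id,name,age\n1,Cleo,5\n3,Bailey,1\n")

result = keyed_diff(load_rows(one, "id"), load_rows(two, "id"))
print(result)
```

With the `id` column as the key, row 1 is reported as changed, row 3 as added, and row 2 as removed, matching the shape of the `added`, `removed`, and `changed` entries in the `--json` example.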