repos: 175321497

id node_id name full_name private owner html_url description fork created_at updated_at pushed_at homepage size stargazers_count watchers_count language has_issues has_projects has_downloads has_wiki has_pages forks_count archived disabled open_issues_count license topics forks open_issues watchers default_branch permissions temp_clone_token organization network_count subscribers_count readme readme_html allow_forking visibility is_template template_repository web_commit_signoff_required has_discussions
175321497 MDEwOlJlcG9zaXRvcnkxNzUzMjE0OTc= csv-diff simonw/csv-diff 0 9599 https://github.com/simonw/csv-diff Python CLI tool and library for diffing CSV and JSON files 0 2019-03-13T01:11:26Z 2022-07-29T20:01:02Z 2022-07-29T20:00:59Z   34 198 198 Python 1 1 1 1 0 29 0 0 18 apache-2.0 ["click", "csv", "datasette-io", "datasette-tool", "diff", "git-scraping"] 29 18 198 main {"admin": false, "maintain": false, "push": false, "triage": false, "pull": false}     29 7

# csv-diff

[![PyPI](https://img.shields.io/pypi/v/csv-diff.svg)](https://pypi.org/project/csv-diff/) [![Changelog](https://img.shields.io/github/v/release/simonw/csv-diff?include_prereleases&label=changelog)](https://github.com/simonw/csv-diff/releases) [![Tests](https://github.com/simonw/csv-diff/workflows/Test/badge.svg)](https://github.com/simonw/csv-diff/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/csv-diff/blob/main/LICENSE)

Tool for viewing the difference between two CSV, TSV or JSON files.

See [Generating a commit log for San Francisco’s official list of trees](https://simonwillison.net/2019/Mar/13/tree-history/) (and the [sf-tree-history repo commit log](https://github.com/simonw/sf-tree-history/commits)) for background information on this project.

## Installation

    pip install csv-diff

## Usage

Consider two CSV files:

`one.csv`

    id,name,age
    1,Cleo,4
    2,Pancakes,2

`two.csv`

    id,name,age
    1,Cleo,5
    3,Bailey,1

`csv-diff` can show a human-readable summary of differences between the files:

    $ csv-diff one.csv two.csv --key=id
    1 row changed, 1 row added, 1 row removed

    1 row changed

      Row 1
        age: "4" => "5"

    1 row added

      id: 3
      name: Bailey
      age: 1

    1 row removed

      id: 2
      name: Pancakes
      age: 2

The `--key=id` option means that the `id` column should be treated as the unique key, to identify which records have changed.

The tool will automatically detect whether your files are comma- or tab-separated. You can override this automatic detection and force the tool to use a specific format using `--format=tsv` or `--format=csv`.

You can also feed it JSON files, provided they are a JSON array of objects where each object has the same keys. Use `--format=json` if your input files are JSON.

Use `--show-unchanged` to include full details of the unchanged values for rows with at least one change in the diff output:

    $ csv-diff one.csv two.csv --key=id --show-unchanged
    1 row changed

      id: 1
        age: "4" => "5"

        Unchanged:
          name: "Cleo"

You can use the `--json` option to get a machine-readable difference:

    $ csv-diff one.csv two.csv --key=id --json
    {
        "added": [
            {
                "id": "3",
                "name": "Bailey",
                "age": "1"
            }
        ],
        "removed": [
            {
                "id": "2",
                "name": "Pancakes",
                "age": "2"
            }
        ],
        "changed": [
            {
                "key": "1",
                "changes": {
                    "age": [
                        "4",
                        "5"
                    ]
                }
            }
        ],
        "columns_added": [],
        "columns_removed": []
    }

## As a Python library

You can also import the Python library into your own code like so:

    from csv_diff import load_csv, compare
    diff = compare(
        load_csv(open("one.csv"), key="id"),
        load_csv(open("two.csv"), key="id")
    )

`diff` will now contain the same data structure as the output in the `--json` example above.

If the columns in the CSV have changed, those added or removed columns will be ignored when calculating changes made to specific rows.

## As a Docker container

### Build the image

    $ docker build -t csvdiff .

### Run the container

    $ docker run --rm -v $(pwd):/files csvdiff

Suppose the current directory contains two CSV files, `one.csv` and `two.csv`:

    $ docker run --rm -v $(pwd):/files csvdiff one.csv two.csv

## Alternatives

- [csvdiff](https://github.com/aswinkarthik/csvdiff) is a "fast diff tool for comparing CSV files" - you may get better results from this than from `csv-diff` against larger files.
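
As a rough illustration of how the machine-readable structure could be consumed, the sketch below walks a dictionary shaped like the `--json` output shown in the Usage section and turns it into one line per change. The `summarize_diff` helper and the `SAMPLE` constant are hypothetical names introduced here for illustration; they are not part of `csv-diff`. The sample data is copied from the `--json` example, and the code assumes only the keys visible in that example.

```python
import json

# Sample data copied from the `--json` example in the README.
SAMPLE = json.loads("""
{
    "added": [{"id": "3", "name": "Bailey", "age": "1"}],
    "removed": [{"id": "2", "name": "Pancakes", "age": "2"}],
    "changed": [{"key": "1", "changes": {"age": ["4", "5"]}}],
    "columns_added": [],
    "columns_removed": []
}
""")


def summarize_diff(diff):
    """Hypothetical helper: return one text line per added, removed or changed row."""
    lines = []
    for row in diff.get("added", []):
        lines.append(f"added: {row}")
    for row in diff.get("removed", []):
        lines.append(f"removed: {row}")
    for item in diff.get("changed", []):
        # Each entry in "changes" maps a column name to an [old, new] pair.
        for column, (old, new) in item["changes"].items():
            lines.append(f"changed: row {item['key']} {column}: {old!r} -> {new!r}")
    return lines


if __name__ == "__main__":
    for line in summarize_diff(SAMPLE):
        print(line)
```

Since the README states that `compare()` returns the same data structure as the `--json` output, the same helper should in principle accept that return value directly, though this has not been checked against every release.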
1 public 0   0  

Links from other tables

  • 9 rows from repo in releases