
repos: 511787166


id node_id name full_name private owner html_url description fork created_at updated_at pushed_at homepage size stargazers_count watchers_count language has_issues has_projects has_downloads has_wiki has_pages forks_count archived disabled open_issues_count license topics forks open_issues watchers default_branch permissions temp_clone_token organization network_count subscribers_count readme readme_html allow_forking visibility is_template template_repository web_commit_signoff_required has_discussions
511787166 R_kgDOHoFAng sqlite-comprehend simonw/sqlite-comprehend 0 9599 https://github.com/simonw/sqlite-comprehend Tools for running data in a SQLite database through AWS Comprehend 0 2022-07-08T06:26:15Z 2022-07-11T21:44:34Z 2022-07-12T14:21:42Z   77 6 6 Python 1 1 1 1 0 0 0 0 1 apache-2.0 [] 0 1 6 main {"admin": false, "maintain": false, "push": false, "triage": false, "pull": false}     0 1
# sqlite-comprehend

[![PyPI](https://img.shields.io/pypi/v/sqlite-comprehend.svg)](https://pypi.org/project/sqlite-comprehend/)
[![Changelog](https://img.shields.io/github/v/release/simonw/sqlite-comprehend?include_prereleases&label=changelog)](https://github.com/simonw/sqlite-comprehend/releases)
[![Tests](https://github.com/simonw/sqlite-comprehend/workflows/Test/badge.svg)](https://github.com/simonw/sqlite-comprehend/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/sqlite-comprehend/blob/master/LICENSE)

Tools for running data in a SQLite database through [AWS Comprehend](https://aws.amazon.com/comprehend/)

See [sqlite-comprehend: run AWS entity extraction against content in a SQLite database](https://simonwillison.net/2022/Jul/11/sqlite-comprehend/) for background on this project.

## Installation

Install this tool using `pip`:

    pip install sqlite-comprehend

## Demo

You can see examples of tables generated using this command here:

- [comprehend_entities](https://datasette.simonwillison.net/simonwillisonblog/comprehend_entities) - the extracted entities, classified by type
- [blog_entry_comprehend_entities](https://datasette.simonwillison.net/simonwillisonblog/blog_entry_comprehend_entities) - a table relating entities to the entries that they appear in
- [comprehend_entity_types](https://datasette.simonwillison.net/simonwillisonblog/comprehend_entity_types) - a small lookup table of entity types

## Configuration

You will need AWS credentials with the `comprehend:BatchDetectEntities` [IAM permission](https://docs.aws.amazon.com/comprehend/latest/dg/access-control-managing-permissions.html).

You can configure credentials [using these instructions](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html). You can also save them to a JSON or INI configuration file and pass them to the command using `-a credentials.ini`, or pass them using the `--access-key` and `--secret-key` options.

## Entity extraction

The `sqlite-comprehend entities` command runs entity extraction against every row in the specified table and saves the results to your database.

Specify the database, the table and one or more columns containing text in that table. The following runs against the `text` column in the `pages` table of the `sfms.db` SQLite database:

    sqlite-comprehend entities sfms.db pages text

Results will be written into a `pages_comprehend_entities` table. Change the name of the output table by passing `-o other_table_name`.

You can run against a subset of rows by adding a `--where` clause:

    sqlite-comprehend entities sfms.db pages text --where 'id < 10'

You can also use named parameters in your `--where` clause:

    sqlite-comprehend entities sfms.db pages text --where 'id < :maxid' -p maxid 10

Only the first 5,000 characters for each row will be considered.
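
Before running against a large table, it can help to estimate how much text a given `--where` filter would send. The following is a rough sketch using only Python's standard `sqlite3` module, not the tool's own implementation: the `estimate_cost` helper is hypothetical, it handles a single text column, it ignores `--strip-tags`, and it uses the starting price quoted in this README, which may not match your actual AWS bill.

```python
import sqlite3

MAX_CHARS = 5000  # per-row cap stated in this README
PRICE_PER_100_CHARS = 0.0001  # starting price quoted in this README (assumption)


def estimate_cost(conn, table, column, where=None, params=None):
    """Return (row_count, chars_sent, estimated_usd) for the selected rows.

    Hypothetical helper: it mirrors the --where / -p options with a plain
    SQL query and caps each row at MAX_CHARS characters.
    """
    sql = f"select [{column}] from [{table}]"
    if where:
        sql += f" where {where}"
    rows = conn.execute(sql, params or {}).fetchall()
    chars = sum(min(len(text or ""), MAX_CHARS) for (text,) in rows)
    return len(rows), chars, chars / 100 * PRICE_PER_100_CHARS


# Small in-memory stand-in for the pages table used in the examples.
conn = sqlite3.connect(":memory:")
conn.execute("create table pages (id integer primary key, text text)")
conn.executemany(
    "insert into pages (id, text) values (?, ?)",
    [(1, "a" * 200), (2, "b" * 6000), (3, "c" * 100)],
)

# Equivalent in spirit to: --where 'id < :maxid' -p maxid 3
count, chars, usd = estimate_cost(
    conn, "pages", "text", where="id < :maxid", params={"maxid": 3}
)
print(count, chars, round(usd, 6))
```

Here the 6,000 character row only contributes 5,000 characters to the total, reflecting the per-row cap described above.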
Be sure to review [Comprehend's pricing](https://aws.amazon.com/comprehend/pricing/), which starts at $0.0001 per hundred characters.

If your content includes HTML tags, you can strip them out before extracting entities by adding `--strip-tags`:

    sqlite-comprehend entities sfms.db pages text --strip-tags

Rows that have been processed are recorded in the `pages_comprehend_entities_done` table. If you run the command more than once, it will only process rows that have been newly added.

You can delete records from that `_done` table to run them again.

### sqlite-comprehend entities --help

<!-- [[[cog
from click.testing import CliRunner
from sqlite_comprehend import cli
runner = CliRunner()
result = runner.invoke(cli.cli, ["entities", "--help"])
help = result.output.replace("Usage: cli", "Usage: sqlite-comprehend")
cog.out(
    "```\n{}\n```".format(help)
)
]]] -->
```
Usage: sqlite-comprehend entities [OPTIONS] DATABASE TABLE COLUMNS...

  Detect entities in columns in a table

  To extract entities from columns text1 and text2 in mytable:

      sqlite-comprehend entities my.db mytable text1 text2

  To run against just a subset of the rows in the table, add:

      --where "id < :max_id" -p max_id 50

  Results will be written to a table called mytable_comprehend_entities

  To specify a different output table, use -o custom_table_name

Options:
  --where TEXT                WHERE clause to filter table
  -p, --param <TEXT TEXT>...  Named :parameters for SQL query
  -o, --output TEXT           Custom output table
  -r, --reset                 Start from scratch, deleting previous results
  --strip-tags                Strip HTML tags before extracting entities
  --access-key TEXT           AWS access key ID
  --secret-key TEXT           AWS secret access key
  --session-token TEXT        AWS session token
  --endpoint-url TEXT         Custom endpoint URL
  -a, --auth FILENAME         Path to JSON/INI file containing credentials
  --help                      Show this message and exit.

```
<!-- [[[end]]] -->

## Schema

Assuming an input table called `pages`, the tables created by this tool will have the following schema:

<!-- [[[cog
import cog, json
from sqlite_comprehend import cli
from unittest.mock import patch
from click.testing import CliRunner
import sqlite_utils
import tempfile, pathlib
tmpdir = pathlib.Path(tempfile.mkdtemp())
db_path = str(tmpdir / "data.db")
db = sqlite_utils.Database(db_path)
db["pages"].insert_all(
    [
        {
            "id": 1,
            "text": "John Bob",
        },
        {
            "id": 2,
            "text": "Sandra X",
        },
    ],
    pk="id",
)
with patch('boto3.client') as client:
    client.return_value.batch_detect_entities.return_value = {
        "ResultList": [
            {
                "Index": 0,
                "Entities": [
                    {
                        "Score": 0.8,
                        "Type": "PERSON",
                        "Text": "John Bob",
                        "BeginOffset": 0,
                        "EndOffset": 5,
                    },
                ],
            },
            {
                "Index": 1,
                "Entities": [
                    {
                        "Score": 0.8,
                        "Type": "PERSON",
                        "Text": "Sandra X",
                        "BeginOffset": 0,
                        "EndOffset": 5,
                    },
                ],
            },
        ],
        "ErrorList": [],
    }
    runner = CliRunner()
    result = runner.invoke(cli.cli, [
        "entities", db_path, "pages", "text"
    ])
cog.out("```sql\n")
cog.out(db.schema)
cog.out("\n```")
]]] -->
```sql
CREATE TABLE [pages] (
   [id] INTEGER PRIMARY KEY,
   [text] TEXT
);
CREATE TABLE [comprehend_entity_types] (
   [id] INTEGER PRIMARY KEY,
   [value] TEXT
);
CREATE TABLE [comprehend_entities] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [type] INTEGER REFERENCES [comprehend_entity_types]([id])
);
CREATE TABLE [pages_comprehend_entities] (
   [id] INTEGER REFERENCES [pages]([id]),
   [score] FLOAT,
   [entity] INTEGER REFERENCES [comprehend_entities]([id]),
   [begin_offset] INTEGER,
   [end_offset] INTEGER
);
CREATE UNIQUE INDEX [idx_comprehend_entity_types_value]
    ON [comprehend_entity_types] ([value]);
CREATE UNIQUE INDEX [idx_comprehend_entities_type_name]
    ON [comprehend_entities] ([type], [name]);
CREATE TABLE [pages_comprehend_entities_done] (
   [id] INTEGER PRIMARY KEY REFERENCES [pages]([id])
);
```
<!-- [[[end]]] -->

## Development

To contribute to this tool, first check out the code.
Then create a new virtual environment:

    cd sqlite-comprehend
    python -m venv venv
    source venv/bin/activate

Now install the dependencies and test dependencies:

    pip install -e '.[test]'

To run the tests:

    pytest

1 public 0   0
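
Once the tool has run, the tables described in the Schema section can be joined to list each entity alongside its type and the row it came from. The sketch below is an illustration only: it rebuilds a trimmed copy of that schema in an in-memory database with Python's standard `sqlite3` module and fills it with hand-written sample rows that mirror the mocked data in the README, so nothing here calls AWS or the tool itself.

```python
import sqlite3

# Trimmed copy of the schema shown in the README (indexes and _done table omitted).
SCHEMA = """
CREATE TABLE [pages] ([id] INTEGER PRIMARY KEY, [text] TEXT);
CREATE TABLE [comprehend_entity_types] ([id] INTEGER PRIMARY KEY, [value] TEXT);
CREATE TABLE [comprehend_entities] (
   [id] INTEGER PRIMARY KEY,
   [name] TEXT,
   [type] INTEGER REFERENCES [comprehend_entity_types]([id])
);
CREATE TABLE [pages_comprehend_entities] (
   [id] INTEGER REFERENCES [pages]([id]),
   [score] FLOAT,
   [entity] INTEGER REFERENCES [comprehend_entities]([id]),
   [begin_offset] INTEGER,
   [end_offset] INTEGER
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hand-written sample rows (assumption: they stand in for real tool output).
conn.executemany(
    "insert into pages (id, text) values (?, ?)",
    [(1, "John Bob"), (2, "Sandra X")],
)
conn.execute("insert into comprehend_entity_types (id, value) values (1, 'PERSON')")
conn.executemany(
    "insert into comprehend_entities (id, name, type) values (?, ?, ?)",
    [(1, "John Bob", 1), (2, "Sandra X", 1)],
)
conn.executemany(
    "insert into pages_comprehend_entities "
    "(id, score, entity, begin_offset, end_offset) values (?, ?, ?, ?, ?)",
    [(1, 0.8, 1, 0, 5), (2, 0.8, 2, 0, 5)],
)

# Resolve each extracted entity to its name, type and source row.
QUERY = """
select
  pages.id as page_id,
  comprehend_entities.name as entity_name,
  comprehend_entity_types.value as entity_type,
  pages_comprehend_entities.score as score
from pages_comprehend_entities
join pages on pages.id = pages_comprehend_entities.id
join comprehend_entities
  on comprehend_entities.id = pages_comprehend_entities.entity
join comprehend_entity_types
  on comprehend_entity_types.id = comprehend_entities.[type]
order by pages.id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The same query should work against a real database by replacing the in-memory connection with `sqlite3.connect("sfms.db")` and swapping `pages` for your own input table name.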

Links from other tables

  • 5 rows from repo in releases