Datasette

https://datasette.io/ Jan 23, 2023 21:59

Excerpt

Datasette is a tool for exploring and publishing data. It helps people take data of any shape, analyze and explore it, and publish it as an interactive website and accompanying API.

Content

Latest news

13th January 2023

9th January 2023

15th December 2022

2nd December 2022

27th October 2022

8th September 2022

21st August 2022

14th August 2022

Datasette 0.62 introduces compatibility with Pyodide for Datasette Lite, and incorporates a number of bug fixes, plugin hook upgrades and other improvements.

31st July 2022

30th June 2022

s3-ocr is a new tool which can run OCR (via Amazon Textract) against every PDF file in an S3 bucket and write the results to a searchable SQLite database, ready to use with Datasette. Read more about it in s3-ocr: Extract text from PDF files stored in an S3 bucket.
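
As a rough illustration of what "a searchable SQLite database" of OCR results can look like, here is a minimal sketch using only Python's standard library. The pages table, its columns, and the save_pages() and search() helpers are hypothetical names chosen for this example; they are not taken from s3-ocr, and a real tool would more likely use full-text search than LIKE.

```python
import sqlite3


def save_pages(conn, pages):
    """Store (path, page_number, text) rows in a hypothetical pages table."""
    conn.execute(
        "create table if not exists pages ("
        "path text, page integer, text text, primary key (path, page))"
    )
    conn.executemany(
        "insert or replace into pages (path, page, text) values (?, ?, ?)",
        pages,
    )
    conn.commit()


def search(conn, term):
    """Return (path, page) pairs whose text contains term (simple LIKE match)."""
    return conn.execute(
        "select path, page from pages where text like ? order by path, page",
        (f"%{term}%",),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    save_pages(
        conn,
        [
            ("reports/a.pdf", 1, "Invoice total: 120"),
            ("reports/a.pdf", 2, "Thank you"),
            ("reports/b.pdf", 1, "Copy of invoice"),
        ],
    )
    print(search(conn, "invoice"))
```

Writing the same rows to a file instead of ":memory:" would give a database file that could then be opened with Datasette.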

5th May 2022

Datasette Lite is a new way to run Datasette: entirely in your browser, thanks to the Pyodide project which provides a full Python environment compiled to WebAssembly. You can use it to explore any SQLite database file hosted on a CORS-enabled static hosting provider, which includes GitHub and GitHub Pages. Read more about this project in Datasette Lite: a server-side Python web application running in a browser.

12th April 2022

Datasette for geospatial analysis describes how Datasette can be used in conjunction with SpatiaLite to work with geospatial data, including details of several geospatial plugins and tools from the Datasette ecosystem.

23rd March 2022

Datasette 0.61 introduces two potentially backwards-incompatible changes in preparation for the forthcoming 1.0 release: hashed URL mode has been moved to a new plugin, and the way URLs are generated to tables or databases containing special characters such as . or / has changed. Datasette 0.61.1 fixes a small bug in that release. See also the annotated release notes for these two versions.
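
To show why characters such as . or / need special handling in URLs (a / would split the path into extra segments, and a . can be mistaken for a format extension), here is a small sketch of one possible escaping scheme. The escape_name() and unescape_name() functions and the ~XX notation are assumptions made for this illustration; the announcement above does not describe the scheme Datasette actually adopted.

```python
import string

# Characters passed through unchanged in this hypothetical scheme.
SAFE = frozenset(string.ascii_letters + string.digits + "_-")


def escape_name(name):
    """Escape every byte outside SAFE as ~XX (uppercase hex)."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in SAFE else f"~{byte:02X}")
    return "".join(out)


def unescape_name(escaped):
    """Reverse escape_name(), decoding ~XX sequences back to bytes."""
    result = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == "~":
            result.append(int(escaped[i + 1 : i + 3], 16))
            i += 3
        else:
            result.append(ord(escaped[i]))
            i += 1
    return result.decode("utf-8")


if __name__ == "__main__":
    table = "data.v2/raw"
    escaped = escape_name(table)
    print(escaped)                 # data~2Ev2~2Fraw
    print(unescape_name(escaped))  # data.v2/raw
```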

27th February 2022

The first two of an ongoing series of official Datasette tutorials are now available: Exploring a database with Datasette introduces the Datasette web interface and shows how it can be used to explore a new database, and Learn SQL with Datasette provides an introduction to SQL using Datasette as a learning environment.

13th January 2022

Datasette 0.60 adds a new filters_from_request plugin hook, new internal methods for writing to the database, better performance and various faceting improvements. See also the annotated release notes.


Latest releases

22nd January 2023

datasette-scraper 0.5 - Adds website scraping abilities to Datasette.

  • feature: generic support for extracting json+ld data
  • feature: specific support for extracting json+ld Product data
  • feature: add discover-allow to specify an allowlist of patterns to crawl
  • enhancement: seed-sitemaps only activates for seeds that are at the top-level of the domain
  • enhancement: extract_from_response can delete existing entries
  • enhancement: extract_from_response can add indexed entries with @ sigil
  • enhancement: extract_from_response skips doing writes that wouldn't change the database
  • enhancement: prune pages that exceed max depth/max page limit earlier

19th January 2023

datasette-faiss 0.2 - Maintain a FAISS index for specified Datasette tables

  • New faiss_agg() and faiss_agg_with_scores() aggregate functions. #3

14th January 2023

datasette-openai 0.2 - SQL functions for calling OpenAI APIs

13th January 2023

openai-to-sqlite 0.2 - Save OpenAI API results to a SQLite database

  • openai-to-sqlite embeddings command can read JSON, CSV or TSV from a file or from standard input and fetch and store embeddings for that data. #1
  • openai-to-sqlite embeddings --sql command can read the data to be embedded from a SQL query. #2
  • Data is now sent to the OpenAI API in batches, defaulting to 100 and with a size that can be specified using --batch-size up to 2048. #5
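
The batching behaviour described in the last item can be sketched in a few lines. This is a generic illustration, not code from openai-to-sqlite: the batches() helper and the constant names are hypothetical, and only the default of 100 and the limit of 2048 come from the release note.

```python
from itertools import islice

DEFAULT_BATCH_SIZE = 100  # default mentioned in the release note
MAX_BATCH_SIZE = 2048     # upper limit mentioned in the release note


def batches(items, batch_size=DEFAULT_BATCH_SIZE):
    """Yield successive lists of at most batch_size items."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    rows = [f"text {i}" for i in range(250)]
    # Each batch would be sent to the API in one request.
    print([len(batch) for batch in batches(rows)])  # [100, 100, 50]
```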

12th January 2023

datasette-cookies-for-magic-parameters 0.1.2 - UI for setting cookies to populate magic parameters

  • Fix for a cookie parsing bug. #3

datasette-openai 0.1a2 - SQL functions for calling OpenAI APIs

  • Returns more detailed error messages if a completion fails. #6
  • openai_strip_tags() function. #5
  • openai_tokenize() and openai_count_tokens() functions. #7
  • A not-yet-documented openai_build_prompt() aggregate function. #4

11th January 2023

datasette-cookies-for-magic-parameters 0.1.1 - UI for setting cookies to populate magic parameters

  • Fixed bug where duplicate mentions of the same parameter name resulted in duplicate form fields. #2

datasette-cookies-for-magic-parameters 0.1

  • Initial release. Adds a form to any canned query that uses :_cookie_x parameters allowing the user to set that cookie. #1
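
The idea behind these releases can be illustrated with a short sketch: find the :_cookie_x style parameters in a query, list each cookie name once (so the form has no duplicate fields), and bind the cookie values when running the query. The cookie_names() and run_with_cookies() helpers are hypothetical and use a simple regular expression; the plugin's actual implementation may differ.

```python
import re
import sqlite3

# Matches named parameters such as :_cookie_x and captures "x".
COOKIE_PARAM = re.compile(r":_cookie_(\w+)")


def cookie_names(sql):
    """Return the cookie names a query refers to, without duplicates."""
    seen = []
    for name in COOKIE_PARAM.findall(sql):
        if name not in seen:
            seen.append(name)
    return seen


def run_with_cookies(conn, sql, cookies):
    """Run sql, binding each :_cookie_<name> parameter from the cookies dict."""
    params = {f"_cookie_{name}": cookies.get(name) for name in cookie_names(sql)}
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    sql = "select :_cookie_x as first, :_cookie_x as again, :_cookie_y as other"
    print(cookie_names(sql))  # ['x', 'y']
    print(run_with_cookies(conn, sql, {"x": "hello", "y": "world"}))
    # [('hello', 'hello', 'world')]
```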

git-history 0.7a0 - Tools for analyzing Git history using SQLite

  • Fixed bug where the item table did not correctly link to the commits using a foreign key. #59
  • Fixed bug where some repositories would not process correctly due to data from older versions not being successfully loaded from the commit history. #64

datasette 0.64.1 - An open source multi-tool for exploring and publishing data

  • Documentation now links to a current source of information for installing Python 3. (#1987)
  • Incorrectly calling the Datasette constructor using Datasette("path/to/data.db") instead of Datasette(["path/to/data.db"]) now returns a useful error message. (#1985)

datasette-faiss 0.1a0 - Maintain a FAISS index for specified Datasette tables

10th January 2023

json-to-files 0.1 - Create separate files on disk based on a JSON object

datasette-openai 0.1a1 - SQL functions for calling OpenAI APIs

  • Calls to GPT-3 now have a 15s timeout, increased from 5s.

9th January 2023

datasette-auth-passwords 1.1 - Datasette plugin for authenticating access using passwords

  • Now adds a "Log in" item to the Datasette navigation menu if the user is logged out. #23
  • Increased the password hash iteration count to 480,000. #21
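
To give a sense of what an iteration count of 480,000 means in practice, here is a minimal sketch of iterated password hashing with Python's standard library. It assumes PBKDF2 with SHA-256 and a "pbkdf2_sha256$iterations$salt$hash" storage format; both are assumptions for this illustration, and the hash_password() and verify_password() helpers are hypothetical rather than the plugin's own functions.

```python
import base64
import hashlib
import hmac
import secrets

ITERATIONS = 480_000  # iteration count mentioned in the release note


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return an 'algorithm$iterations$salt$hash' string (assumed format)."""
    if salt is None:
        salt = secrets.token_hex(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt.encode("utf-8"), iterations
    )
    encoded = base64.b64encode(digest).decode("ascii")
    return f"pbkdf2_sha256${iterations}${salt}${encoded}"


def verify_password(password, stored):
    """Recompute the hash with the stored salt and iterations, then compare."""
    algorithm, iterations, salt, _ = stored.split("$", 3)
    if algorithm != "pbkdf2_sha256":
        return False
    candidate = hash_password(password, salt, int(iterations))
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate, stored)


if __name__ == "__main__":
    stored = hash_password("correct horse")
    print(verify_password("correct horse", stored))  # True
    print(verify_password("wrong guess", stored))    # False
```

Because the iteration count is stored alongside the salt, hashes created with an older, lower count can still be verified after the default is raised.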

datasette 0.64 - An open source multi-tool for exploring and publishing data
