Absortio

Email → Summary → Bookmark → Email

Using SQLite in production with Laravel

Extracto

Handling concurrency

Resumen

Resumen Principal

El contenido examina a SQLite como una alternativa viable y potente a MySQL para muchas aplicaciones Laravel, enfatizando sus ventajas y limitaciones. Su principal beneficio reside en ser una base de datos de archivo único, lo que simplifica enormemente las tareas de respaldo y compartición al eliminar la necesidad de un daemon dedicado. En escenarios específicos, como aquellos sin comunicación interprocesos o sobrecarga de red, SQLite puede incluso superar a MySQL en rendimiento al ser utilizada directamente por el mismo proceso que maneja la solicitud web. No obstante, su uso está desaconsejado para aplicaciones que requieren escalabilidad horizontal a través de múltiples servidores o aquellas con un alto volumen de escrituras concurrentes, ya que bloquea la base de datos completa en lugar de filas individuales. Para el vasto segmento de aplicaciones que no exigen una escala extrema —como herramientas internas o plataformas con menos tráfico—, SQLite ofrece una solución robusta, especialmente cuando se configura correctamente para aprovechar su potencial, siendo la habilitación del modo WAL (write-ahead log) y el busy_timeout consideraciones prácticas fundamentales para optimizar su rendimiento.

Elementos Clave

  • Ventajas y Casos de Uso de SQLite: SQLite destaca por su simplicidad al ser una base de datos de archivo único, lo que facilita su respaldo y compartición sin requerir un daemon de base de datos como MySQL. Esta característica la hace ideal para muchas aplicaciones Laravel, especialmente aquellas que no demandan escalabilidad a través de múltiples servidores, ofreciendo un rendimiento competitivo e incluso superior en casos donde se evita la comunicación interprocesos y la sobrecarga de red.
  • Limitaciones de Escalabilidad y Concurrencia: A pesar de sus ventajas, SQLite no es adecuada para todos los escenarios. No soporta la escalabilidad horizontal en múltiples servidores (salvo excepciones como Turso) y presenta limitaciones significativas con altas escrituras concurrentes, ya que bloquea la base de datos entera para cada escritura, a diferencia de MySQL que puede bloquear filas individuales. Esto la hace inadecuada para aplicaciones distribuidas y de muy alta concurrencia.
  • Configuración Esencial para el Rendimiento: Modo WAL y Busy Timeout: Para desbloquear el potencial de rendimiento de SQLite, es crucial habilitar el modo WAL (pragma journal_mode = wal;). Este modo mejora la concurrencia al permitir que los lectores no bloqueen a los escritores (y viceversa), manejando las escrituras en un archivo separado. Adicionalmente, configurar el busy_timeout (pragma busy_timeout = 10000;) permite que las nuevas transacciones esperen un tiempo determinado antes de fallar si la base de datos está bloqueada, mejorando la robustez.
  • Manejo de Tipado Dinámico y Casts en Eloquent: SQLite utiliza tipado dinámico, lo que significa que el motor de la base de datos no impone estrictamente los tipos de datos declarados en las tablas. Esto exige precaución a nivel de aplicación, particularmente el uso riguroso de casts en modelos Eloquent para garantizar la consistencia de los datos. El ejemplo proporcionado subraya la importancia de especificar formatos precisos, como datetime:U para Unix timestamps, para evitar problemas en consultas where() y asegurar que los datos se almacenen y recuperen según lo esperado.

Análisis e Implicaciones

La elección de SQLite para aplicaciones Laravel puede optimizar significativamente el desarrollo y despliegue para una amplia gama de proyectos, liberando recursos al evitar la gestión de un servidor de base de datos completo. Sin embargo, esta elección implica una comprensión profunda de sus limitaciones inherentes y la necesidad de una configuración y manejo de datos meticulosos a nivel de aplicación para asegurar la consistencia y el rendimiento.

Contexto Adicional

El análisis sugiere que SQLite es una opción sorprendentemente potente para la "mayoría" de las aplicaciones, desafiando la noción de que solo las grandes bases de datos son aptas para entornos de producción, siempre que se comprendan sus requisitos específicos.

Contenido

SQLite is a great database for many, if not most Laravel applications. When well-configured (we’ll get to that), it can provide great performance even if your app has a lot of concurrent requests.

Still, why use it over something like MySQL? Well, asked in reverse, why use MySQL if SQLite works for your use case. The main benefit of SQLite is that your database is a single file that can be easily backed up and shared. A “proper” database like MySQL on the other hand requires an entire daemon to run. If you want to argue performance, for some use cases SQLite can be faster, especially compared to a database on a separate server since it’s used directly by the same process that is handling the web request — no inter-process communication and no networking overhead.

That said, you obviously shouldn’t use SQLite for everything. If you expect to scale your app across multiple servers, you can’t. SQLite databases are files that must be on the same server as your webserver. That is unless you use something like Turso but that’s beyond the scope of this article.

If your DB is complex and you have a lot of concurrent writes, MySQL also wins. It can lock individual rows while SQLite locks the entire database. So in practice, you shouldn’t use SQLite for very highly concurrent, distributed, scalable applications.

But most applications honestly are not that. Most applications are internal tools that don’t get that much activity, or even customer facing applications that just don’t have that many customers. If you simply vertically scale such applications (= moving to a larger server, really take a look at dedicated Hetzner servers and see just how much hardware you can get for only $50/mo) you can get really good performance. But again, not a great fit for every app. Even if you don’t need insane scale you may prefer having a load balancer and at least 2 web servers for redundancy. Every app is different.

If you do choose to use SQLite though, here are some practical considerations.

Enabling WAL mode is what lets SQLite unlock its potential performance-wise. In short, by default SQLite is in rollback mode, which means readers block writers and vice versa. WAL (write-ahead log) mode puts writes into a separate file (periodically merged with the main database file; new readers read up to a safe point in the WAL in addition to the main DB file), meaning readers don’t block writers anymore (and vice versa), but writers still block each other. That isn’t a huge deal though, since most operations are pretty fast, you just don’t want the nightmare of reads blocking writes (and vice versa…). For more details I’d highly recommend Aaron Francis’ High Performance SQLite course if you’re the type of person that enjoys courses. If not, the SQLite docs are great.

To enable WAL mode, all you need to do is:

pragma journal_mode = wal;

It’s a persistent pragma so it only needs to be set once. In addition to the journal mode, you also want to set busy_timeout. That way, if the database is locked for writing, new transactions will wait for a bit instead of immediately failing:

pragma busy_timeout = 10000;

To do this in Laravel:

// config/database.php
'sqlite' => [
    // ...
    'busy_timeout' => 10000,
    'journal_mode' => 'wal',
],

SQLite uses dynamic typing12 which requires a little more care on the application level.

You could use strict tables but as far as I know Laravel doesn’t have a way of creating those in its migrations, and you cannot turn an already-created table into a strict table.

The main thing to keep in mind is that the database engine will not complain about you storing different data than the table’s type so you should take precautions on your application level.

Additionally, though this isn’t strictly just a SQLite thing, you should make sure your Eloquent model casts enforce consistency.

For instance, I have a table where I store timestamps as unix timestamps. These are used as the public ID of those records (basically version numbers). SQLite cannot normalize these into a consistent format (though you can use some functions to turn datetime data into the format you want3) so you should make sure your Eloquent casts use the precise format you want.

At first I had some data created directly using the DB facade (simple low-level statements) but then I created some additional records using the Eloquent model. In that model I was using the datetime cast for the timestamp column, but all logic expected the data to be stored as unix timestamps. Eloquent casts make you not notice this after the models have been fetched, where all these columns are properly turned into Carbon instances, but where this falls apart is queries with where() clauses:

// Old record
Version::where('timestamp', $unixTimestamp)->first(); // works

// Record created with Eloquent
Version::where('timestamp', $unixTimestamp)->first(); // null

The entire fix here was using datetime:U casts instead of just datetime, so this was more of an application-level mistake on my end than an inherent problem with SQLite, but this is something you may need to think about more since SQLite will just store the data as you (or your ORM) pass it, so basically just make sure your casts are perfectly clear about what format you expect the data to be in.

This is the main reason I’m writing this article. SQLite has this particular behavior you need to be aware of: by default, transactions don’t acquire locks until you need them, and if you start with a read lock, and later try to upgrade to a write lock in that same transaction, it will immediately fail regardless of your busy_timeout setting.

That’s a lot of words — what does that mean, in code terms?

DB::transaction(function () use ($attributes) {
  $user = User::firstWhere('github_id', $attributes['github_id']);

  // No user with the github_id, but found a user with
  // the same email address, let's link them together
  if (! $user && isset($attributes['email']))
    $user = User::firstWhere('email', $attributes['email']);

  if ($user) {
    $user->update(collect($attributes)->only([
      'github_id',
      'github_username',
      'github_token',
      'github_refresh_token',
    ])->all());
  } else {
    $user = User::create(array_merge($attributes, [
      'email_verified_at' => now(),
      'password' => bin2hex(random_bytes(32)),
    ]));
  }
});

A bit simplified example from some OAuth logic. There are a few more branches in the real code but this works for our example. We first try to fetch some user (this acquires a read lock) and then try to write a user (this tries to upgrade to a write lock). In some cases you could have a single upsert operation, but the point is that in many cases you won’t, there are tons of transactions that first read and only then try to write.

What happens here is that if there’s a different active write lock while we’re trying to upgrade our lock to a write one we get an immediate SQLITE_BUSY. Our busy_timeout does nothing to prevent this.

This is because SQLite defaults to deferred transactions. There are other transaction types that let us use a write lock from the start: IMMEDIATE and EXCLUSIVE. They differ slightly in some journal modes, but are equivalent in WAL mode which is what we’re using.

So if we start our transaction using BEGIN IMMEDIATE TRANSACTION instead of BEGIN DEFERRED TRANSACTION (or just BEGIN which does the same thing), we will not run into the issue of getting a SQLITE_BUSY when trying to upgrade to a write lock, we start with a write lock (namely a RESERVED lock, see the 4 footnote) from the start.

The issue is, we can’t do this in PHP. I mean we can do:

$pdo->exec('begin immediate transaction');

But we can’t really do this in Laravel. Laravel has its own abstraction for DB transactions that adds a bit of extra logic, which we still want to use, but ultimately it uses PDO::beginTransaction() under the hood which means that in Laravel, we only ever get deferred transactions.

Upon learning this, I thought that surely there must be a setting for this. Source diving Laravel only yielded the beginTransaction() call, no luck there, so I checked if PHP has any settings for this (normally this would be PDO attributes). It does not. I found a GitHub issue from 2022 requesting this feature. First I contributed with a simple reproduction:

<?php

// This repro needs the pcntl extension

$immediateTx = isset($argv[1]) && $argv[1] === 'immediate';

if (file_exists($path = '/tmp/sqlite_tx_mode_demo.sqlite')) unlink($path);

$pdo = PDO::connect('sqlite:/tmp/sqlite_tx_mode_demo.sqlite');
$pdo->exec('pragma busy_timeout=5000');
assert($pdo->query('pragma busy_timeout', PDO::FETCH_ASSOC)->fetch()['timeout'] === 5000);
$pdo->exec('create table foo (bar)');
$pdo->exec('insert into foo (bar) values ("baz")');

// Two processes
$pid = pcntl_fork();

if ($immediateTx) $pdo->exec('begin immediate transaction');
else $pdo->beginTransaction();

// Start with a shared lock. If this is not present, the code seems to work fine since the first action in each
// transaction is a write so when it acquires a proper lock it's an exclusive one (?). Whereas if we start by
// acquiring a shared lock, the transaction will only succeed in one process and fail in the other one.
$pdo->query('select * from foo')->fetch();

sleep(1);

// MAIN ISSUE: At this point we have two concurrent writers, but they both started with deferred transactions
// which causes SQLite to throw a SQLITE_BUSY. If both transactions are started as immediate, this code
// is perfectly fine as long as the sleep above doesn't exceed the 5 second timeout set at the top.
$pdo->exec('update foo set bar = "xyz"');
assert($pdo->query('select * from foo')->fetch(PDO::FETCH_ASSOC)['bar'] === 'xyz');

if ($immediateTx) $pdo->exec('commit');
else $pdo->commit();

assert($pdo->query('select * from foo')->fetch(PDO::FETCH_ASSOC)['bar'] === 'xyz');

printf("[%s] Success\n", $pid);

// Wait on child before exiting
if ($pid) pcntl_waitpid($pid, $status);

The code is pretty simple. We just create two processes, both using the same SQLite database, both first reading, waiting 1 second, and then attempting a write.

If you run this as php script.php, one transaction succeeds and the other fails. If you run this as php script.php immediate, which causes the script to use exec(‘begin immediate transaction’) instead of beginTransaction(), both transactions succeed.

This was causing me a lot of pain in production, so in the meantime I thought of a few possible solutions, and used perhaps the dumbest one: a wrapper around DB::transaction() that first writes to a dummy table before executing the transaction callback. It actually worked perfectly well:

/**
 * A DB::transaction() wrapper that ensures we work with an exclusive lock from the start
 * and as such busy_timeout works as one would expect. This should be used instead of
 * DB::transaction() when the transaction BEGINS WITH A READ and is likely concurrent.
 *
 * @template T
 * @param Closure(): T $callback
 * @return T
 */
function immediate_transaction(Closure $callback, int $attempts = 1) {
  return DB::transaction(function () use (&$callback) {
    // Start with a write so we don't get stuck with a read lock
    DB::statement('update dummy_table set foo = "bar"');

    return $callback();
  }, $attempts);
}

Exceptions from my own code disappeared. My bug tracker was serene. Then days later I get an exception from Laravel’s internal rate limiting logic (which uses cache, which uses the DB driver with DB::transaction()) with the exact same issue:

General error: 5 database is locked

So this needed to be addressed at a lower level. I revisited the GitHub issue and decided to give it a try — a pull request to php-src. It’s an old C codebase with tons of macros, but after an hour or so you can make pretty good sense of it. I ended up submitting a PR adding PDO\Sqlite attributes for configuring the transaction mode, which I’m hoping will make it into PHP 8.5 (coming this November). PDO attributes are a perfect solution here since Laravel lets you set those in the database connection config. No logic needs to be changed, it’s just how the PDO connection gets configured. Any beginTransaction() calls (and as such DB::transaction()) will use the configured transaction mode.

But I cannot wait until PHP 8.5, I have exceptions in production right now. So I looked into the internals of Laravel’s database logic a bit and found a way of overriding this in Laravel. All database connections (SQLiteConnection, MySQLConnection, …) extend the base Connection class which uses the ManagesTransactions trait. There we can find the logic that calls beginTransaction() and we just need to change it to exec(‘begin immediate transaction’).

Laravel lets us override the “resolver” for a database driver (like ‘sqlite’):

Connection::resolverFor('sqlite', function (...$args) {
    return new SQLiteConnectionOverride(...$args);
});

This just sets a static property so it can be placed into AppServiceProvider::register() to run as early as possible.

Then in our override class we just do this:

class SQLiteConnectionOverride extends SQLiteConnection
{
  protected function createTransaction()
  {
    if ($this->transactions == 0) {
      $this->reconnectIfMissingConnection();

      try {
        // OVERRIDE here:
        $this->getPdo()->exec('begin immediate transaction');
        // Instead of:
        // $this->getPdo()->beginTransaction();
      } catch (Throwable $e) {
        $this->handleBeginTransactionException($e);
      }
    } elseif ($this->transactions >= 1 && $this->queryGrammar->supportsSavepoints()) {
      $this->createSavepoint();
    }
  }
}

I don’t love it, but it works great, and covers any DB::transaction() calls, even in any internal Laravel logic like cache or rate limiting, where I cannot use my dumb wrapper.

  1. Use WAL mode and a reasonable busy_timeout. Both of these can be set directly in config/database.php

  2. Be mindful of data types, take extra care to make your Eloquent casts precise

  3. Probably use the override from above? It seems wrong and that it shouldn’t be necessary, but from all information I could find, as well as various reproduction scripts I wrote, this is a real limitation in PHP (hopefully only until PHP 8.5) and by extension Laravel.

SQLite is a fantastic database and I love how simple my server setup is. Backups are as easy as a scheduled Laravel command VACUUM’ing the database into a new file that’s uploaded to a safe storage. But at least right now, it requires a bit of understanding and discipline with your application code.

Discussion about this post

Fuente: Samuel Štancl