About

microrm is a simple object relational manager (ORM) for sqlite that requires no external tooling, keeping all specifications in natural Rust types with no DSLs and few annotations. It supports zero-allocation queries, the standard battery of entity relationships, and migrations, and it leverages the Rust type system for type-safe entity access and querying.

Unlike heavier-weight ORM systems, microrm is designed to be lightweight both in terms of boilerplate and runtime speed. By necessity, microrm sacrifices some flexibility in favour of these goals, and so can be thought of as more opinionated than SeaORM or Diesel — both of which are fine libraries in their own right, but they solve a different problem than microrm. In particular, you might want microrm if:

  • You don’t have to deal with an existing schema from a different ORM.
  • You want your server/application to run with a minimal number of external runtime dependencies, such as database servers.
  • You care about having a minimal crate dependency tree.
  • Your application lives on the same server that stores its database.
  • You want type-safety to sit front and centre for your database interactions.

As of the writing of this book, the latest stable release is 0.7.0. Details may change as the library itself evolves; double-check what version you’re using if something here doesn’t match up with what your compiler or API documentation reports.

Interested? Take a look at the quickstart, or start reading about the data model.

Quickstart

Let’s get started with a barebones authentication example. First, we’ll define a few simple types.

use microrm::prelude::*;

#[derive(Entity)]
struct User {
    #[key]
    username: String,
    password_hash: String,
    authz: microrm::Map<Action>,
}

#[derive(Entity)]
struct Action {
    #[key]
    title: String,
}

#[derive(Schema)]
struct AuthSchema {
    users: microrm::Table<User>,
    actions: microrm::Table<Action>,
}

That’s it; this defines a database schema of users and actions, where users can be configured to be authorized for a set of actions. Now let’s open the database and insert some example data:

let (cpool, schema) =
    microrm::ConnectionPool::open::<AuthSchema>("auth.db")
        .expect("couldn't connect to database");

cpool.run_transaction(1, |txn| {
    // add some users
    let alice = schema.users.insert_and_return(txn, User {
        username: "alice".into(),
        password_hash: "somehash".into(),
        authz: Default::default(),
    })?;
    let barbara = schema.users.insert_and_return(txn, User {
        username: "barbara".into(),
        password_hash: "someotherhash".into(),
        authz: Default::default(),
    })?;

    // now a few different actions as well
    let front_door_id = schema.actions.insert(txn, Action {
        title: "unlock front door".into()
    })?;

    let oven_id = schema.actions.insert(txn, Action {
        title: "turn on the oven".into()
    })?;

    let feed_cat_id = schema.actions.insert(txn, Action {
        title: "feed the cat".into()
    })?;

    // and now let's assign some simple authz
    alice.authz.connect(txn, front_door_id)?;
    alice.authz.connect(txn, feed_cat_id)?;
    barbara.authz.connect(txn, front_door_id)?;
    barbara.authz.connect(txn, oven_id)?;

    Ok(())
}).expect("couldn't insert data into database");

Now we can run queries against the database! For example, is ‘alice’ allowed to ‘turn on the oven’?

let allowed: bool = cpool.run_transaction(1, |txn| {
    Ok(schema
        .users
        .keyed("alice")
        .join(User::Authz)
        .keyed("turn on the oven")
        .count(txn)? > 0
    )
}).expect("couldn't run query");

Barbara forgot to turn off the oven again, so let’s revoke her privilege:

cpool.run_transaction(1, |txn| {
    let barbara = schema
        .users
        .keyed("barbara")
        .get(txn)?
        .unwrap();

    // look up the action through the `actions` table defined in AuthSchema
    let oven_action = schema
        .actions
        .keyed("turn on the oven")
        .get(txn)?
        .unwrap();

    barbara.authz.disconnect(txn, oven_action)?;

    Ok(())
}).expect("couldn't run query");

For more possible queries, check out the API documentation.

Congrats! You’ve defined a schema, added some data, and run some queries. For more information on what exactly was happening here, see the next chapter, Entities and Schemas.

Entities and Schemas

While there are others for more specialized tasks, interacting with the microrm API starts with two derive macros: Entity and Schema. The first defines structured data types, called entities, and the second defines collections of entities and some of their relations, producing a comprehensive data type that describes the entire database. The schema type, together with microrm’s generic query interface, forms what is sometimes called a “data access object”, or DAO.

For a simple example, suppose you want to store a set of configuration key-value pairs; the resulting specification could be something like the following:

#[derive(Entity)]
struct ConfigPair {
    #[key]
    key: String,
    value: String,
}

#[derive(Schema)]
struct ConfigSchema {
    config: microrm::Table<ConfigPair>,
}

The optional #[key] attribute, when applied to one or more fields, defines the primary key for the entity — a set of fields that collectively represent unique entities. You can define more than one search key for an entity, but everything beyond the primary key must be defined externally to the entity; see the section on indices for more information. Relatedly, the Table type specifies a standard database table for the entity parameter, and is the usual way to access entity instances. Finally, note that all fields must be of a type that implements Datum; see the chapter on Datums for more information.

With the above schema, you could then immediately begin writing and reading data to the database, as we’ll see in a minute. microrm maintains multiple handles to the database and uses a connection pool to allow tasks to borrow one of the database handles for a transaction. Transactions have a specific meaning in the context of databases; if you aren’t familiar with them, a good way to think of it is as an atomic unit of ‘work’ for the database, a set of reads and writes. Importantly, transactions have no success guarantees and can be aborted if the database engine detects a conflict. microrm provides an API that will retry transactions automatically if you so choose, or you may wish for the transaction abort error to bubble up if you have a more specific context.
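To make the retry behaviour concrete, here is a minimal standard-library sketch of a retry loop around an abortable unit of work. It does not use microrm: the names `TxnError` and `run_with_retries` are hypothetical, and the sketch only illustrates the idea of retrying on abort while letting other errors bubble up.

```rust
/// Hypothetical error type: either the engine aborted the transaction
/// because of a conflict, or something unrecoverable happened.
#[derive(Debug, PartialEq)]
pub enum TxnError {
    Aborted,
    Fatal(String),
}

/// Runs `body` until it succeeds, retrying up to `max_retries` times
/// when it reports `TxnError::Aborted`. Any other outcome, including
/// an abort once the retries are used up, is returned to the caller.
pub fn run_with_retries<T>(
    max_retries: usize,
    mut body: impl FnMut() -> Result<T, TxnError>,
) -> Result<T, TxnError> {
    let mut retries_left = max_retries;
    loop {
        match body() {
            // A conflict was detected and we may still retry: go around again.
            Err(TxnError::Aborted) if retries_left > 0 => retries_left -= 1,
            // Success, a fatal error, or an abort with no retries left.
            other => return other,
        }
    }
}
```

Passing `0` as `max_retries` corresponds to the case where the abort error should bubble up immediately.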

Opening a connection to the database and doing some simple queries is now very straightforward:

use microrm::prelude::*;

// connect...
let (cpool, schema) = microrm::ConnectionPool::open::<ConfigSchema>("path-to-db").expect("failed to connect to database");

// write some data
cpool.run_transaction(1, |txn| {
    schema.config.insert(
        txn,
        ConfigPair {
            key: "cache_path".into(),
            value: "$HOME/.cache/...".into(),
        }
    )?;
    Ok(())
}).expect("failed to write to database");

// read some data
let lang = cpool.run_transaction(
    1,
    // access via the primary key
    |txn| schema.config.keyed("lang").get(txn)
).expect("failed to read from database");
// lang is of type Option<microrm::Stored<ConfigPair>>

// read all the data
let config_pairs = cpool.run_transaction(
    1,
    |txn| schema.config.get(txn)
).expect("failed to read from database");
// config_pairs is of type Vec<microrm::Stored<ConfigPair>>

In general, queries are done by selecting a schema item (such as a Table) from the schema object and then applying qualifying clauses. Here keyed() uses the primary key to select a unique entity, and thus the return type of the query is an Option<>. The Stored type is a transparent wrapper that also contains the entity’s database ID, and can be used to synchronize local changes to an entity back to the database via sync().
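The shape of these return types can be illustrated with a small in-memory model. This sketch is not microrm’s implementation: `StoredRow` and `ConfigTable` are hypothetical stand-ins, assuming only what is described above (a wrapper carrying a row ID, a keyed lookup that yields at most one row, and an unqualified query that yields all rows).

```rust
use std::collections::BTreeMap;
use std::ops::Deref;

/// Hypothetical stand-in for a stored entity: the value plus its row ID.
#[derive(Debug, Clone, PartialEq)]
pub struct StoredRow<T> {
    id: i64,
    entity: T,
}

impl<T> StoredRow<T> {
    pub fn id(&self) -> i64 {
        self.id
    }
}

// Dereferencing to the inner entity is what makes the wrapper "transparent".
impl<T> Deref for StoredRow<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.entity
    }
}

#[derive(Debug, Clone, PartialEq)]
pub struct ConfigPair {
    pub key: String,
    pub value: String,
}

/// Toy in-memory table whose primary key is `ConfigPair::key`.
#[derive(Default)]
pub struct ConfigTable {
    next_id: i64,
    rows: BTreeMap<String, StoredRow<ConfigPair>>,
}

impl ConfigTable {
    /// Inserts a row and returns its ID, or `None` if the key already exists.
    pub fn insert(&mut self, pair: ConfigPair) -> Option<i64> {
        if self.rows.contains_key(&pair.key) {
            return None;
        }
        self.next_id += 1;
        let id = self.next_id;
        self.rows.insert(pair.key.clone(), StoredRow { id, entity: pair });
        Some(id)
    }

    /// A primary-key lookup matches at most one row, hence `Option`.
    pub fn keyed(&self, key: &str) -> Option<&StoredRow<ConfigPair>> {
        self.rows.get(key)
    }

    /// An unqualified query returns every row.
    pub fn all(&self) -> Vec<&StoredRow<ConfigPair>> {
        self.rows.values().collect()
    }
}
```

Because of the `Deref` implementation, a field such as `value` can be read directly through the wrapper, while `id()` remains available alongside it.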

Relationships

Of course, not all entities are ‘leaf’ objects; one of the main strengths of a database is the ability to maintain relationships between table rows, and that strength carries over to ORMs. microrm supports most standard relationship types, and all are closely integrated into the query interface.

Many-to-one (unidirectional)

The simplest relationship type is many-to-one, somewhat equivalent to a pointer where one object references a second object. These are accessible in microrm by simply storing the database ID of an entity inside another entity; this link can then be followed with the query interface, or the ID can be used directly in a following query.[1] Here’s a simple example schema where every LogEntry is authored by a Person:

#[derive(Entity)]
struct Person {
    #[key]
    name: String
}

#[derive(Entity)]
struct LogEntry {
    contents: String,
    author: microrm::ID<Person>,
}

#[derive(Schema)]
struct Schema {
    person: microrm::Table<Person>,
    log_entry: microrm::Table<LogEntry>,
}

Assuming one starts with the ID of a specific LogEntry, the author can be queried as:

let log_entry_author =
    schema.log_entry.with_id(log_entry_id).foreign(LogEntry::Author).get(txn)?;

One downside of this approach is that, given a pointed-to entity, there is no efficient way to determine what entities reference it. If you require this to be a fully bidirectional relationship, you may wish to use a many-to-many map with uniqueness constraints instead.
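The asymmetry can be seen in a plain in-memory model. The following is a sketch using only the standard library, with hypothetical helper names `author_name` and `entries_by`; it is not how microrm executes queries, only an illustration of why the forward direction is a keyed lookup while the reverse direction is a scan.

```rust
use std::collections::HashMap;

pub struct LogEntry {
    pub contents: String,
    /// ID of the authoring person: the "pointer" side of the relationship.
    pub author: u64,
}

/// Forward direction: a single keyed lookup from entry to author.
pub fn author_name<'a>(
    people: &'a HashMap<u64, String>,
    entry: &LogEntry,
) -> Option<&'a str> {
    people.get(&entry.author).map(String::as_str)
}

/// Reverse direction: every entry has to be inspected, because nothing
/// on the person side records which entries point at it.
pub fn entries_by<'a>(entries: &'a [LogEntry], person_id: u64) -> Vec<&'a LogEntry> {
    entries.iter().filter(|e| e.author == person_id).collect()
}
```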

Many-to-many (bidirectional)

A many-to-many relationship is an arbitrary mapping between sets of two different entities. With a many-to-many relationship, microrm supports following connections in both directions; microrm implements both one-to-one and one-to-many relationships as a special case of many-to-many relationships, as described below.

Many-to-many relationships are represented as fields in both entities, along with a tag struct to allow microrm to correctly match the two sides. The Relation trait must be implemented for this tag struct. Here’s a simple schema example of a many-to-many map of Readers to Books, storing e.g. which books a given reader has read:

struct ReadBooks;
impl microrm::Relation for ReadBooks {
    type Domain = Reader;
    type Codomain = Book;
    const NAME: &'static str = "ReadBooks";
}
// alternatively, can be written as:
// microrm::define_relation_tag!(ReadBooks, Reader, Book);

#[derive(Entity)]
struct Reader {
    #[key]
    username: String,
    books_read: microrm::Domain<ReadBooks>,
}

#[derive(Entity)]
struct Book {
    #[key]
    isbn: String,
    title: String,
    read_by: microrm::Codomain<ReadBooks>,
}

(All many-to-many entity relationships in microrm have a domain and a codomain. These are generally interchangeable, but must be consistent: the domain entity contains a Domain field, and the codomain entity contains a Codomain field.[2])

Actually using a relation as part of a query is done through two interfaces. To modify the relationships between existing entities, the RelationInterface trait offers two functions: connect_to() and disconnect_from(). Both Domain and Codomain also implement the full Queryable interface, so the rest of the query API can be used as well, including the insert() and delete() methods, which will correctly update relationships as relevant.

Many-to-many (unidirectional)

As a special case of many-to-many relationships, microrm provides the Map type, which:

  • Is unidirectional, thus not allowing efficient ‘what references me’ queries;
  • Does not require an explicit Relation tag struct to be defined, nor does it require a field in the codomain entity;
  • Requires one less index, and thus is more efficient.

Note that only one Map<T> field may be present in an entity for a given T.
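The trade-off between the bidirectional and unidirectional forms can be sketched with ordinary collections. This is an in-memory analogy under the assumption that a bidirectional relation keeps an index per direction; `BidirectionalRelation` and `UnidirectionalMap` are hypothetical names, not microrm types.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy bidirectional relation: two indices kept in step.
#[derive(Default)]
pub struct BidirectionalRelation {
    forward: BTreeMap<u64, BTreeSet<u64>>, // domain ID -> codomain IDs
    reverse: BTreeMap<u64, BTreeSet<u64>>, // codomain ID -> domain IDs
}

impl BidirectionalRelation {
    pub fn connect(&mut self, domain: u64, codomain: u64) {
        self.forward.entry(domain).or_default().insert(codomain);
        self.reverse.entry(codomain).or_default().insert(domain);
    }

    pub fn disconnect(&mut self, domain: u64, codomain: u64) {
        if let Some(set) = self.forward.get_mut(&domain) {
            set.remove(&codomain);
        }
        if let Some(set) = self.reverse.get_mut(&codomain) {
            set.remove(&domain);
        }
    }

    /// "Which books has this reader read?"
    pub fn codomain_of(&self, domain: u64) -> Vec<u64> {
        self.forward
            .get(&domain)
            .map(|s| s.iter().copied().collect::<Vec<u64>>())
            .unwrap_or_default()
    }

    /// "Who has read this book?" Only cheap because `reverse` exists.
    pub fn domain_of(&self, codomain: u64) -> Vec<u64> {
        self.reverse
            .get(&codomain)
            .map(|s| s.iter().copied().collect::<Vec<u64>>())
            .unwrap_or_default()
    }
}

/// Toy unidirectional map: the reverse index is simply absent.
#[derive(Default)]
pub struct UnidirectionalMap {
    forward: BTreeMap<u64, BTreeSet<u64>>,
}

impl UnidirectionalMap {
    pub fn connect(&mut self, source: u64, target: u64) {
        self.forward.entry(source).or_default().insert(target);
    }

    /// Reverse lookups fall back to scanning every source.
    pub fn sources_of(&self, target: u64) -> Vec<u64> {
        self.forward
            .iter()
            .filter(|(_, targets)| targets.contains(&target))
            .map(|(source, _)| *source)
            .collect()
    }
}
```

The unidirectional form does less bookkeeping on every write, at the cost of a full scan for the occasional reverse query.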

Many-to-many with uniqueness constraints (bidirectional)

By default, many-to-many relationships in microrm have no uniqueness constraints: every domain entity references an arbitrary set of codomain entities. There are two uniqueness constraints that may optionally be imposed: domain uniqueness and codomain uniqueness; these restrict the relationship to unique domain or codomain entities, meaning that one or both of the following is true:

  • Every domain entity can reference at most one codomain entity (domain uniqueness);
  • Every codomain entity is referenced by at most one domain entity (codomain uniqueness);

This can be useful in certain scenarios. In particular, these allow construction of the following relationship types:

Domain uniqueness   Codomain uniqueness   Relationship type
Yes                 No                    Many-to-one
No                  Yes                   One-to-many
Yes                 Yes                   One-to-at-most-one

Note that one-to-one relationships are a pending feature.
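As a rough illustration of what the two constraints enforce, here is a standard-library sketch of a relation that rejects connections violating them. `ConstrainedRelation` and `ConnectError` are hypothetical; in a real database the checks would be enforced by unique indices rather than by scanning.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq)]
pub enum ConnectError {
    DomainNotUnique,
    CodomainNotUnique,
}

/// Toy relation with optional uniqueness constraints on each side.
pub struct ConstrainedRelation {
    unique_domain: bool,
    unique_codomain: bool,
    pairs: BTreeSet<(u64, u64)>, // (domain ID, codomain ID)
}

impl ConstrainedRelation {
    pub fn new(unique_domain: bool, unique_codomain: bool) -> Self {
        ConstrainedRelation {
            unique_domain,
            unique_codomain,
            pairs: BTreeSet::new(),
        }
    }

    pub fn connect(&mut self, domain: u64, codomain: u64) -> Result<(), ConnectError> {
        // Domain uniqueness: a domain entity references at most one codomain entity.
        if self.unique_domain
            && self.pairs.iter().any(|&(d, c)| d == domain && c != codomain)
        {
            return Err(ConnectError::DomainNotUnique);
        }
        // Codomain uniqueness: a codomain entity is referenced by at most one domain entity.
        if self.unique_codomain
            && self.pairs.iter().any(|&(d, c)| c == codomain && d != domain)
        {
            return Err(ConnectError::CodomainNotUnique);
        }
        self.pairs.insert((domain, codomain));
        Ok(())
    }
}
```

With only codomain uniqueness enabled, one domain entity may be connected to several codomain entities, but a second domain entity cannot claim one that is already taken, which is the one-to-many row of the table.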


  1. Specifically, storing an EntityID as a field in an Entity will make microrm define a foreign key relationship with that table.

  2. The terms domain and codomain are taken from mathematics.

Query construction

The previous two chapters have touched on queries somewhat, but it’s worth spending some time exploring the full breadth of the query API.

In microrm, queries generally begin in the schema object, referencing a single entity’s Table. To retrieve the entities in the current query context, the methods get(), first(), count(), and get_ids() are available.

TODO: the rest of this chapter

Indices

Indices can be used to speed up query operations by pre-computing orderings on entity fields, and can optionally impose a uniqueness constraint. The #[key] attribute is a shortcut to define a single unique index on a table, but sometimes more indices are needed.

Unlike the key index, other indices in microrm are not declared on the entity type; instead, they are defined in the schema. The relevant types are SearchIndex and UniqueIndex; each takes two generic type parameters: the first is the entity to declare the index over, and the second is the list of fields that make up the index.

Due to limitations of the Rust type system[1], specifying the fields for the index must be done via the index_cols! proc macro, leading to definitions such as the following:

#[derive(Clone, Entity)]
struct IndexedExample {
    #[key]
    pub main_key: String,
    pub secondary_key: String,
    pub data: String,
    // ...
}

#[derive(Schema)]
struct Schema {
    // normal accessor
    example_entities: microrm::Table<IndexedExample>,
    // secondary search key
    supplementary_index: microrm::SearchIndex<
        IndexedExample,
        // note use of index_cols! and the automatically-defined SecondaryKey constant
        index_cols![IndexedExample::SecondaryKey]
    >,
}

Actually making use of the defined index requires no special effort; any query against the entity using the exact same set of columns as is specified in the index will make use of the index internally.
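For intuition about what an index buys, the following standard-library sketch models a table with a unique index on one field and a non-unique search index on another. `ExampleTable` and its methods are hypothetical and only mirror the field names from the example above; the actual indices are maintained by the database engine.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub struct IndexedExample {
    pub main_key: String,
    pub secondary_key: String,
    pub data: String,
}

/// Toy table with a unique index on `main_key` and a non-unique
/// search index on `secondary_key`.
#[derive(Default)]
pub struct ExampleTable {
    rows: Vec<IndexedExample>,
    by_main_key: BTreeMap<String, usize>, // unique: key -> row position
    by_secondary_key: BTreeMap<String, Vec<usize>>, // search: key -> row positions
}

impl ExampleTable {
    /// Returns `false` (and stores nothing) if `main_key` is already present.
    pub fn insert(&mut self, row: IndexedExample) -> bool {
        if self.by_main_key.contains_key(&row.main_key) {
            return false;
        }
        let pos = self.rows.len();
        self.by_main_key.insert(row.main_key.clone(), pos);
        self.by_secondary_key
            .entry(row.secondary_key.clone())
            .or_default()
            .push(pos);
        self.rows.push(row);
        true
    }

    /// Answers the query from the search index instead of scanning `rows`.
    pub fn with_secondary_key(&self, key: &str) -> Vec<&IndexedExample> {
        self.by_secondary_key
            .get(key)
            .map(|positions| positions.iter().map(|&p| &self.rows[p]).collect::<Vec<_>>())
            .unwrap_or_default()
    }
}
```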


  1. Specifically, the lack of inherent associated types.

Schema versioning

So far, all examples in this book have focused on schemas that are one-and-done, with no eye towards future maintenance requirements. Sadly, this is rarely the case for production databases: schemas change over time, and the usual way to handle this is via migrations, which define versions of the database schema and how to migrate the database contents between said versions. This chapter outlines the intended method of managing database schema iterations and migrations with microrm.[1]

The core implementation detail that microrm migrations hinge upon is that the database table name for an entity is determined by the name of the struct that has the Entity derive macro applied to it. Thus, you can define multiple versions of an entity by placing them in separate Rust modules; this leads to a module structure such as:

schema/
    mod.rs
    v1.rs
    v2.rs
    v3.rs
    ...

Here, each vX.rs file contains the database schema for that particular version, and can reference earlier versions of the schema for tables that have not changed, and mod.rs simply re-exports the contents of the most recent schema module. For example, one might have something like the following:

  • v1.rs:
use microrm::prelude::*;

/// Initial Foo definition
#[derive(Clone, Entity)]
pub struct Foo {
    #[key]
    pub bar: String,
    pub baz: String,
}

#[derive(Schema)]
pub struct Schema {
    pub foo: microrm::Table<Foo>,
}
  • v2.rs:
use microrm::prelude::*;

/// New Foo definition with linked Aleph entity
#[derive(Clone, Entity)]
pub struct Foo {
    #[key]
    pub bar: String,
    pub baz: String,
    pub alephs: microrm::RelationMap<Aleph>
}

/// Aleph record
#[derive(Clone, Entity)]
pub struct Aleph {
    pub bet: u64,
    pub dalet: Vec<u8>,
}

#[derive(Schema)]
pub struct Schema {
    pub foo: microrm::Table<Foo>,
}
  • v3.rs:
use microrm::prelude::*;

pub use super::v2::Aleph;

#[derive(Clone, Entity)]
pub struct Foo {
   #[key]
   pub bar: String,
   pub baz: String,
   /// We're adding in an extra (nullable) field
   pub quux: Option<f32>,
   /// We aren't changing Aleph at all, so we can reuse the previous definition here.
   pub alephs: microrm::RelationMap<Aleph>,
}

#[derive(Schema)]
pub struct Schema {
    pub foo: microrm::Table<Foo>,
}
  • mod.rs:
mod v1;
mod v2;
mod v3;

pub use v3::*;

type SchemaList = (v1::Schema, v2::Schema, v3::Schema, );

pub fn apply_migrations(pool: &microrm::ConnectionPool) -> microrm::DBResult<Schema> {
    microrm::migration::run_migration::<SchemaList>(pool)
}

As the definition of the entities changes, consumer code only ever sees the most recent types (functionally the same as changing the definition in-place), but the previous schemas remain available to microrm for the purposes of migrating between different versions. Sharp readers will note that a crucial piece is missing: the description of how to convert the data between the schemas; indeed, the Rust compiler will complain if you try to compile the above. We will discuss migration logic after a brief segue regarding the handling of bidirectional relationships.

Bidirectional relationships

Some care is required when handling bidirectional relationships, due to the Relation tag struct holding a reference to both types. In particular, both types must have updated versions in the new schema, along with a new Relation-implementing tag struct. This can unfortunately require ‘chained’ updates if you have a web of bidirectional relationships among a group of entities.

Migration logic

The primary trait for migrations is MigratableItem; broadly speaking, a schema S that implements MigratableItem<T> can be migrated from the schema T. The implementation of MigratableItem controls the overarching data flow between the two schemas, farming out to other helpers as needed.

One such built-in helper is the MigratableEntity trait. Rather than describing a table-to-table transformation, MigratableEntity allows describing a row-to-row (i.e. entity-to-entity) transformation. As with MigratableItem<T>, an entity E : MigratableEntity<T> allows an instance of E to be constructed from an instance of T.

A helper attribute is built-in to the Entity derive macro to aid with some common implementations of MigratableEntity, specifically migratable_to. This will provide a stub MigratableEntity<T> implementation that performs no data transforms, and is intended for use when the only changes are to relation tags.
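The split between table-level and row-level migration can be sketched without microrm. In the following, `MigrateFrom` and `migrate_table` are hypothetical names chosen for illustration and do not reproduce the signatures of MigratableItem or MigratableEntity; the sketch only shows a row-to-row conversion being lifted to a whole table, using same-named structs in separate version modules.

```rust
pub mod v1 {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Foo {
        pub bar: String,
        pub baz: String,
    }
}

pub mod v2 {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Foo {
        pub bar: String,
        pub baz: String,
        /// Newly added nullable field.
        pub quux: Option<f32>,
    }
}

/// Hypothetical row-to-row conversion: build the new entity from the old one.
pub trait MigrateFrom<Old>: Sized {
    fn migrate_from(old: Old) -> Self;
}

impl MigrateFrom<v1::Foo> for v2::Foo {
    fn migrate_from(old: v1::Foo) -> Self {
        v2::Foo {
            bar: old.bar,
            baz: old.baz,
            // Existing rows have no value for the new field.
            quux: None,
        }
    }
}

/// Table-level migration expressed in terms of the row-level one.
pub fn migrate_table<Old, New: MigrateFrom<Old>>(rows: Vec<Old>) -> Vec<New> {
    rows.into_iter().map(New::migrate_from).collect()
}
```

A migration that only renames or re-tags things would have a `migrate_from` that copies every field unchanged, which is the kind of implementation a stub can provide.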

Nested schemas

Schemas can be nested to aid in organization. This can help reduce the number of duplicated lines between schema versions if only one sub-schema of a group is untouched.

Complete worked example

TODO


  1. microrm is sufficiently flexible that other management strategies may also work. If you find something useful that is materially different, please let us know!

Entity manipulation CLI

One optional feature that microrm supports is autogeneration of a CLI for inspecting and manipulating database entities, enabled by the cli feature.

use microrm::prelude::*;

// start by defining a simple schema
struct PetOwners;
impl microrm::Relation for PetOwners {
    type Domain = Owner;
    type Codomain = Pet;
    const NAME: &'static str = "PetOwners";
    const UNIQUE_CODOMAIN: bool = true;
}
#[derive(Entity)]
struct Pet {
    #[key]
    name: String,
    species: String,
    owner: microrm::Codomain<PetOwners>,
}
#[derive(Entity)]
struct Owner {
    #[key]
    name: String,
    address: String,
    pets: microrm::Domain<PetOwners>,
}
#[derive(Schema)]
struct Schema {
    owners: microrm::Table<Owner>,
    pets: microrm::Table<Pet>,
}

// make a very simple clap parser via the derive interface:
#[derive(clap::Parser)]
struct Invocation {
    #[clap(subcommand)]
    cmd: InvocationSubcommand,
}
#[derive(clap::Subcommand)]
enum InvocationSubcommand {
    Pet {
        #[clap(subcommand)]
        cmd: microrm::cli::Autogenerate<PetInterface>,
    },
    Owner {
        #[clap(subcommand)]
        cmd: microrm::cli::Autogenerate<OwnerInterface>,
    },
}

// now we need to define two helper types: PetInterface and OwnerInterface. these tell the CLI
// generation how to deal with these two entities.
struct PetInterface;
impl microrm::cli::EntityInterface for PetInterface {
    type Entity = Pet;
    type Error = microrm::Error;
    // no custom context needed
    type Context = ();
    // by default, microrm does not generate a "new" or "create" command, so we need to add it
    // here.
    type CustomCommand = PetCustom;

    // and a handler for it
    fn run_custom(
        _ctx: &Self::Context,
        cmd: Self::CustomCommand,
        txn: &mut microrm::Transaction,
        query_ctx: impl Queryable<EntityOutput = Self::Entity> + Insertable<Self::Entity>,
    ) -> Result<(), Self::Error> {
        match cmd {
            PetCustom::Create { name, species } => {
                query_ctx.insert(
                    txn,
                    Pet {
                        name,
                        species,
                        owner: Default::default(),
                    },
                )?;
            },
        }
        Ok(())
    }
}
#[derive(Debug, clap::Subcommand)]
enum PetCustom {
    Create { name: String, species: String },
}

struct OwnerInterface;
impl microrm::cli::EntityInterface for OwnerInterface {
    type Entity = Owner;
    type Error = microrm::Error;
    // no custom context needed
    type Context = ();
    // as before, add a 'create' command
    type CustomCommand = OwnerCustom;

    // and again, a handler for it...
    fn run_custom(
        _ctx: &Self::Context,
        cmd: Self::CustomCommand,
        txn: &mut microrm::Transaction,
        query_ctx: impl Queryable<EntityOutput = Self::Entity> + Insertable<Self::Entity>,
    ) -> Result<(), Self::Error> {
        match cmd {
            OwnerCustom::Create { name, address } => {
                query_ctx.insert(
                    txn,
                    Owner {
                        name,
                        address,
                        pets: Default::default(),
                    },
                )?;
            },
        }
        Ok(())
    }
}
#[derive(Debug, clap::Subcommand)]
enum OwnerCustom {
    Create { name: String, address: String },
}

fn main() -> microrm::DBResult<()> {
    use clap::Parser;
    let inv = Invocation::parse();
    // open the database
    let (pool, schema) = microrm::ConnectionPool::open::<Schema>("simple_cli.db")?;
    let mut txn = pool.start()?;
    match inv.cmd {
        InvocationSubcommand::Pet { cmd } => cmd.perform(&(), &mut txn, &schema.pets)?,
        InvocationSubcommand::Owner { cmd } => cmd.perform(&(), &mut txn, &schema.owners)?,
    }
    txn.commit()?;

    Ok(())
}

/* annotated interaction transcript:
    $ alias simple_cli='cargo run -qF cli --example simple_cli'
Create some pets, who must have unique names:
    $ simple_cli pet create louie cat
    $ simple_cli pet create louie dog
    Error: ConstraintViolation("UNIQUE constraint failed: pet.name")
    $ simple_cli pet create archibald dog
Create an owner:
    $ simple_cli owner create alison "123 Some Street"
Attach louie to alison:
    $ simple_cli owner attach alison pets louie
    $ simple_cli owner inspect alison
    Owner {
        name: "alison",
        address: "123 Some Street",
    }
    pets: (1)
    [#  1]: Pet { name: "louie", species: "cat" }
Verify that the connection goes both ways:
    $ simple_cli pet inspect louie
    Pet {
        name: "louie",
        species: "cat",
    }
    owner: (1)
    [#  1]: Owner { name: "alison", address: "123 Some Street" }
Add a second owner and try to give them louie as well:
    $ simple_cli owner create barbara "234 Some Street"
    $ simple_cli owner attach barbara pets louie
    Error: ConstraintViolation("UNIQUE constraint failed: owner_pet_relation_PetOwners.range")
But we can, of course, give barbara a different pet:
    $ simple_cli owner attach barbara pets archibald
*/

Datum definitions

The Datum trait holds much of the core serialization/deserialization logic that powers entity storage. While generally you will not need to implement Datum yourself for any types, this chapter provides an overview of how the various datum-related traits interact. Most interfaces only require a Datum type, but some will also use the subtraits OwnedDatum ('static + Clone datums) and BorrowedDatum (Copy datums). In particular, all fields of a type that derives Entity must satisfy an OwnedDatum constraint.

There are three manners in which Datum types are used in the microrm API:

  1. Returned from queries with static lifetime,
  2. Returned from queries with local lifetime,
  3. Passed to queries for updates/inserts.

All Datum types can be used in the third manner, but only some can be used in the first or second. The following table lists, for each type that microrm supports out of the box, which lifetimes it can be returned with ('static for the first manner, 'l for the second). Here S is a type satisfying the bound serde::Serialize + serde::DeserializeOwned and T is a type satisfying the bound Datum:

Type                        'static    'l
{u,i}{8,16,32,64}           Yes        Yes [1]
bool                        Yes        Yes [1]
String                      Yes        No
&str                        No         Yes
Vec<u8>                     Yes        No
&[u8]                       No         Yes
OsString                    Yes        No
&OsStr                      No         Yes
PathBuf                     Yes        No
&Path                       No         Yes
Serialized<S>               Yes [2]    No
Vec<S>                      Yes [2]    No
&S                          No         No
time::OffsetDateTime [3]    Yes        Yes [1]
time::UtcDateTime [3]       Yes        Yes [1]
Option<T>                   Yes        based on T
&T                          No         No

Note that isize and usize explicitly do not have Datum implementations, because their size is platform-dependent: writing to a database on a 64-bit system and then reading it on a 32-bit system could produce incorrect results.
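The hazard is easy to reproduce with fixed-width types. This sketch uses `u32` to stand in for a 32-bit `usize`; the function names are hypothetical and exist only for illustration.

```rust
// Needed before the 2021 edition, where `TryFrom` is not yet in the prelude.
use std::convert::TryFrom;

/// Mimics reading a stored 64-bit value on a target whose pointer-sized
/// integer is only 32 bits wide: values that do not fit are rejected.
pub fn read_as_32_bit(stored: u64) -> Option<u32> {
    u32::try_from(stored).ok()
}

/// What an unchecked cast would do instead: silently keep the low 32 bits.
pub fn truncate_to_32_bit(stored: u64) -> u32 {
    stored as u32
}
```

A value such as 5_000_000_000 fits comfortably in 64 bits but cannot be represented in 32, so a platform-sized integer would either fail to load or be silently truncated.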

OwnedDatum and BorrowedDatum

Briefly, OwnedDatum is a Datum that has a 'static lifetime, i.e. it owns its data, and BorrowedDatum is one that has a non-'static lifetime. The relationship is slightly more complicated than just that, however: each OwnedDatum is required to define a BorrowedDatum that acts as its ‘canonical’ reference type, and each BorrowedDatum defines what OwnedDatum it represents. There may be multiple such BorrowedDatum types for a single OwnedDatum; consider for example &[u8] and &Vec<u8> as borrowed versions of Vec<u8>.
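The many-borrowed-to-one-owned half of this relationship can be sketched with a small trait. `BorrowedValue` and `store` are hypothetical names and deliberately simpler than the real traits: the sketch omits the ‘canonical’ reference type that each owned type designates, and shows only that several borrowed types may name the same owned type.

```rust
/// Hypothetical stand-in for a borrowed datum: cheap to copy, and it
/// knows which owned type it represents.
pub trait BorrowedValue: Copy {
    type Owned: 'static + Clone;
    fn to_owned_value(self) -> Self::Owned;
}

impl<'a> BorrowedValue for &'a [u8] {
    type Owned = Vec<u8>;
    fn to_owned_value(self) -> Vec<u8> {
        self.to_vec()
    }
}

// A second borrowed form for the same owned type.
impl<'a> BorrowedValue for &'a Vec<u8> {
    type Owned = Vec<u8>;
    fn to_owned_value(self) -> Vec<u8> {
        self.as_slice().to_vec()
    }
}

// An inherently `Copy` type can serve as its own reference type.
impl BorrowedValue for u32 {
    type Owned = u32;
    fn to_owned_value(self) -> u32 {
        self
    }
}

/// Accepts any borrowed form and stores the corresponding owned form.
pub fn store<B: BorrowedValue>(sink: &mut Vec<B::Owned>, value: B) {
    sink.push(value.to_owned_value());
}
```

Both `&[u8]` and `&Vec<u8>` can be passed to `store` with a `Vec<Vec<u8>>` sink, since they agree on the owned type.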


  1. For types that are inherently Copy, the associated reference type is exactly the same as the owned type.

  2. Requires the serialized feature.

  3. Requires the time feature.