Implementing ULIDs in Ruby on Rails with PostgreSQL
ULIDs offer a robust alternative to traditional UUIDs by providing unique identifiers that are also sortable by creation time. This tutorial guides you through integrating ULIDs into a Ruby on Rails application using PostgreSQL.
Introduction
As applications grow and spread their data across distributed systems, the traditional ways of identifying records, such as auto-incrementing integers or universally unique identifiers (UUIDs), begin to show their limitations. This is where ULIDs, or Universally Unique Lexicographically Sortable Identifiers, come into play: they keep the 128-bit size of a UUID but place a timestamp first, so the identifiers sort by creation time.
ULIDs are designed to address several key scenarios where traditional identifiers fall short:
- Distributed Systems: In environments where data is spread across multiple databases or servers, ensuring uniqueness of identifiers can be challenging. ULIDs can be generated on any node without coordination, and the 80 random bits in each one make collisions extremely unlikely.
- Time-Ordered Records: Applications that sort records by creation time, such as logging systems, social media platforms, or any system tracking temporal events, benefit from ULIDs. Because the timestamp comes first and the encoding is fixed-width, sorting ULIDs alphabetically also sorts them chronologically (to millisecond precision), an advantage over random UUIDs.
- High-Performance Requirements: ULIDs combine a timestamp with a random component. Since new values are always greater than older ones, inserts land near the end of the primary key index rather than at random positions, which helps in high-throughput systems where insertion speed and index performance are crucial.
- Scalability and Future-Proofing: As applications scale, the overhead of managing and indexing identifiers can become a bottleneck. ULIDs have a fixed size (128 bits, 26 characters when encoded) and a sortable layout, so they remain efficient to index and to query by time range.
- Human-Friendly: A ULID is 26 characters long, compared with 36 for the canonical text form of a UUID, and its alphabet omits the easily confused letters I, L, O, and U. This makes debugging and manual inspection of records slightly more straightforward without sacrificing uniqueness.
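To make the time-ordering point concrete, here is a minimal Ruby sketch, separate from the setup in this tutorial. It encodes millisecond timestamps as the 10-character Crockford Base32 prefix that a ULID starts with. The helper name `encode_time` is hypothetical and exists only for this illustration.

```ruby
# Crockford's Base32 alphabet: digits and uppercase letters without I, L, O, U.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Encode a millisecond timestamp as a fixed-width, 10-character string.
# (Hypothetical helper for illustration only.)
def encode_time(ms)
  chars = Array.new(10) do
    digit = ms % 32      # lowest 5 bits become one character
    ms /= 32
    CROCKFORD[digit]
  end
  chars.reverse.join     # most significant character first
end

timestamps = [1_709_478_721_608, 1_609_459_200_000, 1_709_478_721_609]
encoded = timestamps.map { |ms| encode_time(ms) }

# Sorting the strings gives the same order as sorting the numbers.
puts encoded.sort == timestamps.sort.map { |ms| encode_time(ms) }
```

The property holds because every value is padded to the same width and the alphabet is in ascending ASCII order, so string comparison and numeric comparison agree.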
Prerequisites
- Ruby on Rails (version 7 or later; the examples use Migration[7.1], so adjust that version to match your app)
- PostgreSQL database
- Basic understanding of Rails migrations
Step 1: Creating a ULID Domain in PostgreSQL
First, we'll define a custom domain in PostgreSQL to represent the ULID format. The domain gives ULID columns a dedicated type with a fixed width of 26 characters. Note that it constrains the length only, not which characters are used.
Generate the migration file:
rails generate migration CreateUlidDomain
Edit the generated migration file to include the CREATE DOMAIN and DROP DOMAIN SQL commands:
class CreateUlidDomain < ActiveRecord::Migration[7.1]
  def up
    execute <<-SQL.squish
      CREATE DOMAIN public.ulid AS character(26);
    SQL
  end

  def down
    execute <<-SQL.squish
      DROP DOMAIN public.ulid;
    SQL
  end
end
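Since character(26) checks only the length, a stricter format check can be useful on the application side. The following is a minimal Ruby sketch of one; the constant `ULID_FORMAT` and the method `valid_ulid?` are assumptions made for this example and are not defined anywhere in the setup above.

```ruby
# A canonical ULID is 26 Crockford Base32 characters. The first character
# is at most 7 because 26 * 5 = 130 bits, while a ULID holds only 128.
ULID_FORMAT = /\A[0-7][0-9A-HJKMNP-TV-Z]{25}\z/

# Hypothetical helper: true when the value looks like a canonical ULID.
def valid_ulid?(value)
  value.is_a?(String) && ULID_FORMAT.match?(value)
end

puts valid_ulid?("01HR2DYE28BCSVAR5ENXK6AWDF")  # well-formed ULID
puts valid_ulid?("01HR2DYE28BCSVAR5ENXK6AWDU")  # contains U, not in the alphabet
puts valid_ulid?("01HR2DYE28")                  # too short
```

A similar pattern could also be enforced in PostgreSQL by adding a CHECK constraint to the domain, if database-level validation is preferred.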
Step 2: Creating a ULID Generation Function
Next, we'll create a PostgreSQL function to generate ULIDs. This function utilizes the pgcrypto extension for generating random bytes and encodes the timestamp and randomness into a ULID format.
Generate the migration file:
rails generate migration CreateUlidFunction
Edit the generated migration file to include the CREATE FUNCTION SQL command:
class CreateUlidFunction < ActiveRecord::Migration[7.1]
  def up
    execute <<-'SQL'
      CREATE EXTENSION IF NOT EXISTS pgcrypto WITH SCHEMA public;
      CREATE OR REPLACE FUNCTION public.gen_ulid() RETURNS public.ulid
      LANGUAGE plpgsql
      AS $$
      declare
        -- crockford's base32
        encoding bytea = '0123456789ABCDEFGHJKMNPQRSTVWXYZ';
        timestamp bytea = E'\\000\\000\\000\\000\\000\\000';
        output ulid = '';
        unix_time bigint;
        ulid bytea;
      begin
        -- 6 timestamp bytes
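        -- use the timestamptz value directly: casting to timestamp first would
        -- shift the epoch by the session time zone offset
        unix_time = (extract(epoch from clock_timestamp()) * 1000)::bigint;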
        timestamp = set_byte(timestamp, 0, (unix_time >> 40)::bit(8)::integer);
        timestamp = set_byte(timestamp, 1, (unix_time >> 32)::bit(8)::integer);
        timestamp = set_byte(timestamp, 2, (unix_time >> 24)::bit(8)::integer);
        timestamp = set_byte(timestamp, 3, (unix_time >> 16)::bit(8)::integer);
        timestamp = set_byte(timestamp, 4, (unix_time >> 8)::bit(8)::integer);
        timestamp = set_byte(timestamp, 5, unix_time::bit(8)::integer);
        -- 10 entropy bytes
        ulid = timestamp || gen_random_bytes(10);
        -- encode the timestamp
        output = output || chr(get_byte(encoding, (get_byte(ulid, 0) & 224) >> 5));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 0) & 31)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 1) & 248) >> 3));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 1) & 7) << 2) | ((get_byte(ulid, 2) & 192) >> 6)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 2) & 62) >> 1));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 2) & 1) << 4) | ((get_byte(ulid, 3) & 240) >> 4)));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 3) & 15) << 1) | ((get_byte(ulid, 4) & 128) >> 7)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 4) & 124) >> 2));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 4) & 3) << 3) | ((get_byte(ulid, 5) & 224) >> 5)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 5) & 31)));
        -- encode the entropy
        output = output || chr(get_byte(encoding, (get_byte(ulid, 6) & 248) >> 3));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 6) & 7) << 2) | ((get_byte(ulid, 7) & 192) >> 6)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 7) & 62) >> 1));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 7) & 1) << 4) | ((get_byte(ulid, 8) & 240) >> 4)));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 8) & 15) << 1) | ((get_byte(ulid, 9) & 128) >> 7)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 9) & 124) >> 2));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 9) & 3) << 3) | ((get_byte(ulid, 10) & 224) >> 5)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 10) & 31)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 11) & 248) >> 3));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 11) & 7) << 2) | ((get_byte(ulid, 12) & 192) >> 6)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 12) & 62) >> 1));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 12) & 1) << 4) | ((get_byte(ulid, 13) & 240) >> 4)));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 13) & 15) << 1) | ((get_byte(ulid, 14) & 128) >> 7)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 14) & 124) >> 2));
        output = output || chr(get_byte(encoding, ((get_byte(ulid, 14) & 3) << 3) | ((get_byte(ulid, 15) & 224) >> 5)));
        output = output || chr(get_byte(encoding, (get_byte(ulid, 15) & 31)));
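        return output;
      end $$;
    SQL
  end

  # Drop the function on rollback so the ulid domain can be dropped afterwards.
  # pgcrypto is left installed because other objects may depend on it.
  def down
    execute <<-SQL.squish
      DROP FUNCTION IF EXISTS public.gen_ulid();
    SQL
  end
end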
The gen_ulid() function we've defined in PostgreSQL generates ULIDs, which are 26-character, URL-safe, time-sortable identifiers. The function operates as follows:
- Timestamp Generation: The function reads the current time (clock_timestamp()) and multiplies the epoch value by 1000 to convert seconds to milliseconds. This timestamp is then split into 6 bytes that form the most significant part of the ULID, which is what makes ULIDs sortable by time.
- Randomness: After the timestamp, 10 bytes of cryptographic randomness are generated using PostgreSQL's pgcrypto extension (gen_random_bytes(10)). This randomness makes collisions extremely unlikely even when several ULIDs are generated within the same millisecond, although the relative order of those ULIDs is then random.
- Encoding: The function then encodes both the timestamp and the random bytes into Crockford's Base32, chosen for its readability and URL-friendliness. Each group of 5 bits is mapped to one of the 32 characters in the alphabet, so the 48-bit timestamp becomes 10 characters and the 80 random bits become 16. Because the alphabet is in ascending order, the encoded string keeps the lexicographical sortability of the underlying bytes.
- Concatenation: The encoded timestamp and random parts are concatenated to form the final ULID. The result is a 26-character string that uniquely identifies a record, with the added benefit of being sortable by creation time.
This approach to generating ULIDs directly in the database layer has several advantages, including consistency across application instances, reduced application complexity, and leveraging the database's robustness and scalability. By offloading the ID generation to PostgreSQL, applications can efficiently handle ID creation even under high load, ensuring uniqueness and sortability without the overhead of additional application logic.
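For comparison, the same steps can be sketched in plain Ruby. This is an illustration of the algorithm under stated assumptions, not a replacement for gen_ulid(): the method name `generate_ulid` is hypothetical, and it builds one 128-bit integer instead of working byte by byte as the SQL function does.

```ruby
require "securerandom"

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Build a ULID from a millisecond timestamp (48 bits) and 10 random bytes (80 bits).
# Hypothetical helper that mirrors the steps of gen_ulid().
def generate_ulid(time = Time.now, random_bytes = SecureRandom.random_bytes(10))
  ms = (time.to_r * 1000).to_i                   # milliseconds since the Unix epoch
  entropy = random_bytes.unpack1("H*").to_i(16)  # 80-bit integer from the 10 bytes
  value = (ms << 80) | entropy                   # 128-bit value, timestamp first

  # Integer#to_s(32) emits the digits 0-9 and a-v; remap them onto Crockford's alphabet.
  value.to_s(32).rjust(26, "0").tr("0-9a-v", CROCKFORD)
end

older = generate_ulid(Time.at(1_600_000_000))
newer = generate_ulid(Time.at(1_700_000_000))
puts older.length   # 26
puts older < newer  # true, because the timestamp occupies the leading characters
```

Generating IDs in the application like this is possible, but it gives up the consistency benefit described above, since every writer would need to use the same implementation.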
Step 3: Configuring Rails to Use ULIDs for Primary Keys
In your Rails application, configure the generators to use ULIDs for primary keys by default. Edit the config/application.rb file:
module YourAppName
  class Application < Rails::Application
    # ...
    config.generators do |generate|
      generate.orm :active_record, primary_key_type: :ulid
    end
    # ...
  end
end
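One caveat, offered as an assumption about a typical setup rather than something this tutorial covers: the default Ruby schema dump (db/schema.rb) does not record custom domains or functions, so a project that relies on the ulid domain and gen_ulid() will likely need the SQL schema format instead. A minimal sketch of that setting, placed inside the same Application class:

```ruby
# config/application.rb (inside the Application class)
# Assumption: dump the schema to db/structure.sql instead of db/schema.rb so
# that the ulid domain and the gen_ulid() function are part of the dump.
config.active_record.schema_format = :sql
```

With this in place, loading the schema into a fresh database (for example in the test environment) should recreate the domain and function along with the tables.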
Step 4: Generating a Model with a ULID Primary Key
When you generate a new model, Rails will automatically use a ULID as the primary key type.
rails generate model User name:string
Edit the generated migration to set the default value of the ULID id using the gen_ulid() function:
class CreateUsers < ActiveRecord::Migration[7.1]
  def change
    # Add `default: -> { 'gen_ulid()' }` to the line below
    create_table :users, id: :ulid, default: -> { 'gen_ulid()' } do |t|
      t.string :name
      t.timestamps
    end
  end
end
Step 5: Adding ULID References in Other Tables
Because the application is configured to use ULIDs as the default primary key type, Rails also creates reference columns with the matching type. The config.generators settings in config/application.rb tell the generators to use ULID for primary keys and, by extension, for foreign keys in associated models.
Here's how Rails manages the generation of models with references when ULIDs are set as the default primary key type:
rails g model Post user:references title:string
This command generates a migration file for the posts table with a user_id column that uses ULID as its type, thanks to the global configuration you've applied. As in Step 4, add the gen_ulid() default to the create_table line. Here's what the migration looks like after that change:
class CreatePosts < ActiveRecord::Migration[7.1]
  def change
    # Add `default: -> { 'gen_ulid()' }` to the line below
    create_table :posts, id: :ulid, default: -> { 'gen_ulid()' } do |t|
      t.references :user, null: false, foreign_key: true, type: :ulid
      t.string :title
      t.timestamps
    end
  end
end
Testing it out
We can now test the ULID generation by creating a new user and a post, then inspecting the generated ULIDs.
After applying the migrations, run the rails console in --sandbox mode:
rails console --sandbox
And then in the console:
user = User.create(name: "Loboz")
user.id
# => "01HR2DYE28BCSVAR5ENXK6AWDF"
post = Post.create(user_id: user.id, title: "ULIDs on Rails")
post.user_id
# => "01HR2DYE28BCSVAR5ENXK6AWDF"
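Because the first 10 characters of a ULID hold the creation time, the timestamp can be read back from an ID like the one above. The following is a minimal Ruby sketch; the helper name `ulid_time` is hypothetical and not part of Rails or of the setup in this tutorial.

```ruby
require "time"

CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Hypothetical helper: recover the creation time embedded in a ULID.
# Assumes a well-formed ULID; an unknown character would raise an error.
def ulid_time(ulid)
  # The first 10 characters hold the 48-bit millisecond timestamp.
  ms = ulid[0, 10].each_char.reduce(0) do |acc, char|
    acc * 32 + CROCKFORD.index(char)
  end
  Time.at(ms / 1000, ms % 1000, :millisecond).utc
end

puts ulid_time("01HR2DYE28BCSVAR5ENXK6AWDF").iso8601(3)
# prints 2024-03-03T15:12:01.608Z
```

This can be handy when debugging, since the ID alone tells you roughly when a record was created.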
Conclusion
You've successfully integrated ULID generation into your Ruby on Rails application with PostgreSQL. This setup enhances your application's scalability and performance by providing unique, time-sortable identifiers that are ideal for distributed systems.