Simulate Intel NUCs with VirtualBox

Since yesterday's update, I've made the repo! It has a minimal (read: shitty) README, but the code is all there. Just git clone the repo on a Linux machine and run make.

I need to add a simple way to check all dependencies. Maybe a command like make check-deps would work.

I also need to add a user-friendly way of letting people know that they should add a config/authorized_hosts file.
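A minimal sketch of both ideas, assuming the dependencies are command-line tools like cpio, gzip, and isoinfo (the real list lives in the repo's Makefile; recipe lines are tab-indented):

# Hypothetical check-deps target -- adjust DEPS to the tools the repo actually uses
DEPS := cpio gzip isoinfo

.PHONY: check-deps
check-deps:
	@for dep in $(DEPS); do \
		command -v $$dep >/dev/null 2>&1 || { echo "missing dependency: $$dep"; exit 1; }; \
	done
	@test -f config/authorized_hosts || \
		{ echo "please create config/authorized_hosts (see README)"; exit 1; }
	@echo "all good"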

Simulating Intel NUCs with VirtualBox

I was thinking about writing some tests for this new repo. I'd like to run my testing on my Mac before making any physical moves. Here are some commands I used to help manage the stormlight VirtualBox VM.

# Turn on and off the VM
VBoxManage startvm stormlight
VBoxManage controlvm stormlight poweroff

# Attach stormlight.iso as a DVD
VBoxManage storageattach stormlight --storagectl IDE --port 1 --device 0 --type dvddrive --medium ~/Downloads/stormlight.iso

# Setting medium to "none" removes the drive. 
VBoxManage storageattach stormlight --storagectl IDE --port 1 --device 0 --medium "none"

Creating a VM to Simulate an Intel NUC with VirtualBox

# making a virtualbox for stormlight
alias vb=VBoxManage

mkdir '/Users/pbui/VirtualBox VMs/test/'
vb createvm --name test --register --ostype "Ubuntu_64"
vb modifyvm test --firmware efi --rtcuseutc on --ioapic on --memory 1024 --vram 128 --nic1 nat 

vb createmedium disk --filename '/Users/pbui/VirtualBox VMs/test/test.vdi' --size 10240 --format VDI --variant Standard

vb storagectl test --name nvme --add pcie --controller NVMe
vb storagectl test --name ide --add ide --controller PIIX4

vb storageattach test --storagectl nvme --port 0 --device 0 --type hdd --medium '/Users/pbui/VirtualBox VMs/test/test.vdi' --nonrotational on

vb storageattach test --storagectl ide --port 1 --device 0 --type dvddrive --medium ~/Downloads/stormlight.iso

vb startvm test

Ubuntu Unattended Installs, VirtualBox, Manipulating initrd.gz, and Makefiles

I have three bare-bones Intel NUCs. I want to install Ubuntu Server 18.04 and then install Kubernetes using kubeadm.

But, installing Ubuntu Server three times is a chore. I've misconfigured the servers multiple times now. I'm honestly tired of it.

So I'll automate it.

Thinking out loud, this is what I'd like to see:

  • Stick a USB into the machine and power on
  • Wait for the install to complete
  • Reboot the machine
  • SSH into the machine from my mac to automate configuration with <insert some tool I've yet to decide on>.

Luckily, I came across the Automatic Installation docs and the lovely InstallCDCustomization wiki.

But, the installation is still physical. I have to create the ISO, load it on a USB stick, walk over to my Intel NUCs, stick in the USB, and boot the instance. Then I have to wait. And if I didn't set up the preseed.cfg correctly, I am prompted with a dialog box that stops the installation.

I need a way to simulate the installation on a new system without moving from my chair.

As I edit this post, I realize how lazy I am reading that last sentence.

Luckily, VirtualBox is a handy tool for this.

Virtualbox command line

Here are the commands I found useful for simulating the installation.

VBoxManage startvm stormlight
VBoxManage controlvm stormlight poweroff

# Attach stormlight.iso as a DVD
VBoxManage storageattach stormlight --storagectl IDE --port 1 --device 0 --type dvddrive --medium ~/Downloads/stormlight.iso

# Setting medium to "none" removes the drive. 
VBoxManage storageattach stormlight --storagectl IDE --port 1 --device 0 --medium "none"

Working with initrd.gz

Through a lot of trial and error, I realized that the best way to automate the installation is to modify the initrd.gz archive used by the Ubuntu installer. Sadly, it took a few days of tinkering with a preseed.cfg file in the root directory of the ISO and with command line flags to get here. Neither approach worked, so modifying the initrd.gz archive it is. Here are a few helpful commands for working with the archive (and I only just realized that cpio was a precursor to tar -- cool!).

# Lists files in archive just like ls -l
zcat initrd.gz | cpio -itv

# Extract a file with isoinfo
isoinfo -i mini.iso -x '/INITRD.GZ;1' > initrd.gz
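And here's roughly how I'd bake a preseed.cfg into the archive (a sketch; run from a scratch directory containing initrd.gz and preseed.cfg):

# Unpack, append preseed.cfg to the cpio archive, and repack
gunzip initrd.gz
echo preseed.cfg | cpio -H newc -o -A -F initrd
gzip initrd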

Makefile on lightweaver

lightweaver is the name of my Kubernetes master -- I name my machines after orders of the Knights Radiant. This section probably doesn't matter to you, reader, since you don't have access to this machine, but hey, these are my development notes ...

On lightweaver (my Intel NUC with Ubuntu installed by hand), I set up a new working directory with a Makefile to build the ISO. It's at ~/stormlight-iso on lightweaver.

ssh lightweaver
cd stormlight-iso
make

I also created a script locally on my Mac that builds the stormlight ISO, downloads it to my Mac, boots the image in VirtualBox, and finally shuts the VM down. It's at:

~/code/stormlight/test-stormlight.sh
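Roughly, the script does something like this (the ISO path and filename on lightweaver are assumptions):

# rough shape of test-stormlight.sh
ssh lightweaver 'cd ~/stormlight-iso && make'
scp lightweaver:stormlight-iso/stormlight.iso ~/Downloads/stormlight.iso
VBoxManage storageattach stormlight --storagectl IDE --port 1 --device 0 \
    --type dvddrive --medium ~/Downloads/stormlight.iso
VBoxManage startvm stormlight
# ... wait for the unattended install to finish ...
VBoxManage controlvm stormlight poweroff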

Update (later in the day)

I finally got everything working! Even the SSH setup and the disabled password login.

Next: make a private repo and store it on GitHub so I don't forget. The repository is designed to run on Ubuntu machines (for now). I will add some form of testing as well.

GOTO Postgres Configuration for RDS and Self-managed Postgres

I've updated my previous postgres configuration so that it works for both self-managed postgres databases and RDS postgres databases.

I want to address a shortcoming in my prior post on my GOTO postgres configuration: it fails for RDS postgres databases. Luckily, this change works for both RDS postgres and self-managed installations. Woohoo!

RDS's most privileged user is rds_superuser. rds_superuser is akin to a regular user that has privileges to create postgres roles. And, unlike a postgres superuser, rds_superuser does not have privileges to create and manage objects for another postgres role. This is the crux of the problem with my previous configuration, and the reason you see errors like ERROR: must be a member of the role "owner".

Since we cannot create objects for other users with rds_superuser, we separate role management and database object management. To manage roles, we rely on rds_superuser. To create database objects and control access, we rely on the owner. So the configuration from my previous post needs a few changes. Here's the gist:
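A minimal sketch of the change, assuming the RDS master user is named admin (the role and database names follow the original guide):

-- As the RDS master user (assumed here to be named "admin"):
CREATE ROLE owner LOGIN ENCRYPTED PASSWORD 'secret' CONNECTION LIMIT 3;

-- rds_superuser cannot act on another role's behalf unless it is a member
-- of that role, so grant membership explicitly.
GRANT owner TO admin;

CREATE DATABASE exampledb WITH OWNER owner;

-- From here on, create the schema, tables, and default privileges as owner:
\c exampledb
SET ROLE owner;
-- CREATE SCHEMA exampledb; ALTER DEFAULT PRIVILEGES ...; etc.
RESET ROLE;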

My GOTO Postgres Configuration for Web Services

My guide to setting up postgres for web services.


UPDATE Jan 6, 2020: I've updated this guide's configuration in another post to also include RDS databases. This configuration still works, but only for self-managed postgres installations that have a real superuser.

I've painfully learned how to configure postgres one incident at a time, and this guide captures those learnings for web services. Also, I'm tired of debating this with myself, and I want to share the guide with others. It presents my default postgres database configuration for web services.


Assumptions

This guide is a starter configuration for postgres. It makes several assumptions about your web service and development team, so it may or may not fit your situation. This guide assumes:

  • A running postgres database to which you have superuser or superuser-like access (e.g., rds_superuser; see the update above for RDS)
  • You're developing a web service with a team and both will (hopefully) grow rapidly (think: startups)
  • Schema changes are executed as an admin process -- you use rake db:migrate, manage.py migrate, sequelize db:migrate, or similar to create tables prior to app deployment
  • One application postgres database (i.e., when you run \l to show all databases on the server)
  • One schema named after the database

Last but not least, this configuration is designed to evolve as your web service and team grow. If the configuration doesn't work, change it!

The End Result

  • One user (i.e., a postgres role that can log in) owns all objects (database, schema, tables, indexes, etc.). This user is called the owner.
  • A role readwrite_users that enables read and write access to all objects created by the owner.
  • A role readonly_users that enables read-only access to all objects created by the owner.
  • An app user that belongs to readwrite_users and has connection settings to guard against scaling problems (statement_timeout, lock_timeout, idle_in_transaction_session_timeout, connection limit, and search_path)

The Upsides

  • Enables good separation for maintenance operations (schema changes, data backfills, monitoring) from application operations (CRUD, data management, etc.)
  • Enables server-side defaults for user settings like statement_timeout and lock_timeout -- the application can still override these if needed
  • Sets up roles and allows you to never deal with postgres GRANTS, REVOKES, and the unfortunately confusing ALTER DEFAULT PRIVILEGES ever again (as long as you stay within the use-cases of this configuration of course)
  • Groups users by their access privileges of read-and-write or read-only
  • Prevents TRUNCATE of an entire table from the app user

The Downsides

This configuration doesn't come without its downsides. The biggest downside I experience is related to local development. Local development requires knowing two sets of credentials -- one credential is used for schema changes (migrations), and one is used by the app to read and write data. For the developer, this knowledge can be codified to reduce confusion, but it's still knowledge a developer should know with the configuration presented in this guide.

Another downside of this configuration is the inability to prevent an unqualified DELETE FROM table (deleting every row). The configuration does prevent TRUNCATE statements for application developers, but preventing an unqualified DELETE is not easily accomplished without bigger trade-offs.

Finally, setting guard rails like statement_timeout at the database layer does not prevent abuse from the application code. Postgres does not prevent SET statements. Also, it's important that the application has the ability to override settings when needed. As mentioned, the default settings are guard rails and not jail bars.

Walkthrough: Configuring Postgres

Let's walk through the steps required to create the configuration described above. Connect to your database as a superuser; every statement in this walkthrough is executed as that superuser, and that's all you will need.

Create the owner User

The owner user owns all objects for your application's database. Every schema, table, index, and <insert favorite postgres object> is owned by this user. owner is the only user used for schema changes.

CREATE ROLE owner 
	LOGIN 
	ENCRYPTED PASSWORD 'secret' 
	CONNECTION LIMIT 3;
ALTER ROLE owner SET statement_timeout = 20000;
ALTER ROLE owner SET lock_timeout = 3000;

-- v9.6+
ALTER ROLE owner SET idle_in_transaction_session_timeout = 3000; 

  • Change the password to something secure. :)
  • CONNECTION LIMIT 3: This allows owner to connect when we have accidental long-running queries. Which leads us to ...
  • statement_timeout: 20 seconds is long, but it provides an upper limit for long-running queries. You will likely set this value higher in migrations to accommodate schema changes that rewrite tables on a larger data set. And you can enjoy the error message the day your data is large enough that postgres cancels your schema change.
  • lock_timeout: 3 seconds is what I've found to be long enough, but not so long that an outage occurs due to connection exhaustion.
  • idle_in_transaction_session_timeout: 3 seconds, similar to lock_timeout, is what I've found useful for my applications. Change the value to suit your needs. This setting is only available in postgres 9.6+.

Create the readwrite_users Role

The readwrite_users role groups users that require read and write access (CRUD) to objects owned by owner.

CREATE ROLE readwrite_users NOLOGIN;

We need a database to alter privileges, so this is as much as we can configure for this role at this point.

Create the readonly_users Role

The readonly_users role groups users that require read access (SELECT) to objects owned by owner.

CREATE ROLE readonly_users NOLOGIN;

We need a database to alter privileges, so this is as much as we can configure for this role at this point.

Create the Database

Create the exampledb database.

CREATE DATABASE exampledb 
WITH OWNER owner 
ENCODING UTF8 
LC_COLLATE 'en_US.UTF-8' 
LC_CTYPE 'en_US.UTF-8';

Postgres enables connection privileges to new databases for all users (PUBLIC) by default. Postgres also creates the public schema, enabling usage and creation of tables within that schema for all users. Instead of relying on implicit privileges from the default installation, we want to declare privileges to roles explicitly. This puts us in a state of least privilege first, and then we add new privileges afterward. I generally find the least-privilege-first strategy much easier to follow mentally and in code. I'm a simpleton, so starting with nothing and adding is easier than starting with everything and then removing.

Therefore, with new databases, I revoke all privileges for the database and drop the public schema.

\c exampledb
REVOKE ALL ON DATABASE exampledb FROM PUBLIC;
DROP SCHEMA public;

Create the Schema and Configure search_path

Now create the schema. The schema is named after the database to keep things simple.

SET ROLE owner;
CREATE SCHEMA exampledb;
RESET ROLE;

Now we default the owner user to the exampledb schema on connection since we dropped the public schema. If this were not set, owner's search_path would default to "$user" -- i.e., an owner schema that does not exist.

ALTER ROLE owner SET search_path TO exampledb;

We also need to set our session's search_path to exampledb so that we, as a superuser, properly create objects in the exampledb schema.

SET search_path TO exampledb; 

GRANTs for Resources

With the basic resources provisioned, we grant privileges for the owner user, the readwrite_users role, and the readonly_users role, starting with the exampledb database.

GRANT CONNECT ON DATABASE exampledb TO readwrite_users;
GRANT TEMPORARY ON DATABASE exampledb TO readwrite_users;

GRANT CONNECT ON DATABASE exampledb TO readonly_users;
GRANT TEMPORARY ON DATABASE exampledb TO readonly_users;

Now grant privileges to the exampledb schema.

GRANT USAGE ON SCHEMA exampledb TO readwrite_users;
GRANT USAGE ON SCHEMA exampledb TO readonly_users;

-- Not needed, but being explicit helps with \dn+
GRANT CREATE, USAGE ON SCHEMA exampledb TO owner;

Future-proofing

GRANT (and REVOKE) applies to existing objects but not to new objects. Covering future objects is accomplished with ALTER DEFAULT PRIVILEGES.

Let's configure privileges for the readwrite_users role. We want to ensure new objects like tables or indexes allow, well, read and write privileges for the users in the readwrite_users role.

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT SELECT, INSERT, UPDATE, DELETE
  ON TABLES
  TO readwrite_users;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT USAGE, SELECT, UPDATE
  ON SEQUENCES
  TO readwrite_users;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT EXECUTE
  ON FUNCTIONS
  TO readwrite_users;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT USAGE
  ON TYPES
  TO readwrite_users;

As stated earlier, these statements rely on owner to create all objects in the database. Therefore, these statements do not include privileges that would allow a user in readwrite_users to create objects. Refer to the ALTER DEFAULT PRIVILEGES and Privileges documentation for details on the other privileges.

Additionally, I did not include a statement for future schemas because this guide doesn't support multiple schemas.

Now let's move on to the readonly_users role.

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT SELECT
  ON TABLES
  TO readonly_users;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT USAGE, SELECT
  ON SEQUENCES
  TO readonly_users;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT EXECUTE
  ON FUNCTIONS
  TO readonly_users;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  IN SCHEMA exampledb
  GRANT USAGE
  ON TYPES
  TO readonly_users;

Finally, we want to ensure PUBLIC (a special role that means all users) does not have access to objects in our database. This forces users to belong to either the readwrite_users role or readonly_users role to access the database.

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  REVOKE ALL PRIVILEGES
  ON TABLES
  FROM PUBLIC;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  REVOKE ALL PRIVILEGES
  ON SEQUENCES
  FROM PUBLIC;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  REVOKE ALL PRIVILEGES
  ON FUNCTIONS
  FROM PUBLIC;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  REVOKE ALL PRIVILEGES
  ON TYPES
  FROM PUBLIC;

ALTER DEFAULT PRIVILEGES
  FOR ROLE owner
  REVOKE ALL PRIVILEGES
  ON SCHEMAS
  FROM PUBLIC;

Once again, I favor explicitness for readability (and my sanity) because not all of the statements above are required. The default privileges for PUBLIC are documented, and a few of the statements above are redundant.

With these last statements, you will not need to work with GRANTS and ALTER DEFAULT PRIVILEGES again as long as you create users in the readwrite_users or readonly_users roles. WOOHOO!

Create the app User

The app user is what your web service uses to connect to the database. app (usually) requires read and write permissions, so this user belongs to the readwrite_users role.

CREATE ROLE app WITH
	LOGIN 
	ENCRYPTED PASSWORD 'secret' 
	CONNECTION LIMIT 90 
	IN ROLE readwrite_users;
ALTER ROLE app SET statement_timeout = 1000;
ALTER ROLE app SET lock_timeout = 750;

-- v9.6+
ALTER ROLE app SET idle_in_transaction_session_timeout = 1000; 

ALTER ROLE app SET search_path = exampledb;

  • Change the password. :)
  • CONNECTION LIMIT 90: if you're using the default installation of postgres, the installer usually defaults max_connections to 100. Setting this to 90 ensures there are a few more connections available for administrators to hop in and wrangle queries in the event of an incident. Feel free to change this to match your database setup. If you need more connections, follow postgres' best practice of adding a connection pooler in front of the database.
  • statement_timeout: 1 second is a long time for a web request. 1 second for one query of many queries in a web request feels like an eternity to a user. Feel free to change this to suit your needs. I like pushing this as low as possible for my services. Keeping this low is important at scale: Twitch has a 1-second statement_timeout.
  • lock_timeout: 750ms might not seem like much, but with the 1-second statement_timeout, a worst-case lock wait still leaves your query about 250ms of execution time. Change as required.
  • idle_in_transaction_session_timeout: for postgres v9.6+, this ensures you do not have zombie transactions that maintain active connections. This guards the database against webservers that terminate requests and leave a connection open.

The GIST of It

Alright, alright! That was a lot of text to set up a database. Here's the gist, using the same defaults and secrets as above -- please change them!
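Condensed, it boils down to roughly this:

-- Roles
CREATE ROLE owner LOGIN ENCRYPTED PASSWORD 'secret' CONNECTION LIMIT 3;
ALTER ROLE owner SET statement_timeout = 20000;
ALTER ROLE owner SET lock_timeout = 3000;
ALTER ROLE owner SET idle_in_transaction_session_timeout = 3000; -- v9.6+
CREATE ROLE readwrite_users NOLOGIN;
CREATE ROLE readonly_users NOLOGIN;

-- Database and schema
CREATE DATABASE exampledb WITH OWNER owner ENCODING UTF8
  LC_COLLATE 'en_US.UTF-8' LC_CTYPE 'en_US.UTF-8';
\c exampledb
REVOKE ALL ON DATABASE exampledb FROM PUBLIC;
DROP SCHEMA public;
SET ROLE owner;
CREATE SCHEMA exampledb;
RESET ROLE;
ALTER ROLE owner SET search_path TO exampledb;
SET search_path TO exampledb;

-- Grants on existing resources
GRANT CONNECT, TEMPORARY ON DATABASE exampledb TO readwrite_users, readonly_users;
GRANT USAGE ON SCHEMA exampledb TO readwrite_users, readonly_users;
GRANT CREATE, USAGE ON SCHEMA exampledb TO owner;

-- Default privileges for future objects
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO readwrite_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT USAGE, SELECT, UPDATE ON SEQUENCES TO readwrite_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT EXECUTE ON FUNCTIONS TO readwrite_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT USAGE ON TYPES TO readwrite_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT SELECT ON TABLES TO readonly_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT USAGE, SELECT ON SEQUENCES TO readonly_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT EXECUTE ON FUNCTIONS TO readonly_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner IN SCHEMA exampledb
  GRANT USAGE ON TYPES TO readonly_users;
ALTER DEFAULT PRIVILEGES FOR ROLE owner REVOKE ALL PRIVILEGES ON TABLES FROM PUBLIC;
ALTER DEFAULT PRIVILEGES FOR ROLE owner REVOKE ALL PRIVILEGES ON SEQUENCES FROM PUBLIC;
ALTER DEFAULT PRIVILEGES FOR ROLE owner REVOKE ALL PRIVILEGES ON FUNCTIONS FROM PUBLIC;
ALTER DEFAULT PRIVILEGES FOR ROLE owner REVOKE ALL PRIVILEGES ON TYPES FROM PUBLIC;
ALTER DEFAULT PRIVILEGES FOR ROLE owner REVOKE ALL PRIVILEGES ON SCHEMAS FROM PUBLIC;

-- App user
CREATE ROLE app WITH LOGIN ENCRYPTED PASSWORD 'secret'
  CONNECTION LIMIT 90 IN ROLE readwrite_users;
ALTER ROLE app SET statement_timeout = 1000;
ALTER ROLE app SET lock_timeout = 750;
ALTER ROLE app SET idle_in_transaction_session_timeout = 1000; -- v9.6+
ALTER ROLE app SET search_path = exampledb;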


Developer Basics

This guide's postgres configuration provides better defaults for scaling web services with databases than the defaults provided by a base installation of postgres. That said, this configuration may seem harsh and punishing if you're unfamiliar with writing efficient SQL. But, this is okay. Learn to write SQL efficiently while your web service and team scale up. While you're doing this, the configuration will help protect you from painful incidents like a tipped database. Nevertheless, let's discuss a few points to keep in mind.

The app user has SELECT, INSERT, UPDATE, and DELETE privileges for all tables created by the owner user. app is the only user your web service connects with to handle web requests. SQL queries sent with the app user should be fast, or the settings for the app user will terminate the statement. If your application code attempts to create tables, indexes, or other postgres objects, you will be on the receiving end of a permission failure. To solve this, use the owner user.

The owner user manages all tables and indexes for your application's database. The management of the database schema should happen outside of the webserver process -- in an admin process like rake db:migrate or alembic upgrade head. The owner user is not designed for web service requests. You want to create the structure of the database with owner, but manage the data in that structure with app.

Finally, you can override setting values if needed. The defaults are guard rails, not jail bars. You can jump over the rail if you know what you're doing.

Schema Changes as a superuser

Hopefully, you don't need to do this as schema migrations with owner should suffice. But, incidents happen and maybe the quickest way to stop the bleeding is to hop onto the database and wrangle some code.

SET ROLE owner;
-- ALTER TABLE ...
-- ...
RESET ROLE;

Schema Changes (via Migrations) and Local Development

Use the owner's credentials to run migrations. If your application follows the Twelve-Factor App's Config Factor, then the owner credentials can be passed in via environment variables in the form of a postgres connection URI. Here's an example of what the URI would look like for a local database:

postgres://owner:secret@127.0.0.1:5432/exampledb 

If you're using rake db:migrate, set the DATABASE_URL environment variable to the owner credentials.
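For example:

DATABASE_URL=postgres://owner:secret@127.0.0.1:5432/exampledb rake db:migrate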

If you're using alembic upgrade, ensure EnvironmentContext.configure receives the database URI from the environment so you can swap database URIs as mentioned. E.g.,

# env.py
import os
from alembic import context

context.configure(url=os.getenv("DATABASE_URL", "localhost"))

Similar strategies are available in your migration tool of choice.

List Users

List users and roles:

\du+

List users and their settings (only as a superuser):

SELECT usename, useconfig FROM pg_shadow;

Add User

We only add users to readwrite_users or readonly_users.

CREATE ROLE newuser WITH
	LOGIN
	ENCRYPTED PASSWORD 'secret'
	CONNECTION LIMIT 3
	IN ROLE readwrite_users;
ALTER ROLE newuser SET search_path = exampledb;

-- Recommended connection settings
ALTER ROLE newuser SET statement_timeout = 1000;
ALTER ROLE newuser SET lock_timeout = 750;
-- v9.6+
ALTER ROLE newuser SET idle_in_transaction_session_timeout = 1000; 

Tweak the IN ROLE, CONNECTION LIMIT, and settings to fit your needs.

Drop/Remove User

Connect as the superuser and execute the following:

-- Change password
ALTER ROLE myuser ENCRYPTED PASSWORD 'somethingbogus';

-- Terminate existing queries/connections
SELECT pg_terminate_backend(pg_stat_activity.pid) 
FROM pg_stat_activity
WHERE usename = 'myuser';

-- Now remove the user
DROP USER myuser;

Changing the password stops new connections from clients. Terminating existing queries ensures the user you're about to drop is no longer connected to the database.

Update User Settings

Connect as the superuser and execute the following:

ALTER ROLE myuser SET statement_timeout = 5000;

-- Optional, terminate existing queries
SELECT pg_terminate_backend(pg_stat_activity.pid) 
FROM pg_stat_activity
WHERE usename = 'myuser';

Settings are applied at connection time. Therefore, you need to have your application or developer reconnect after the change.

app Password Rotation

Connect as the superuser and follow these steps:

  1. Change the app password: ALTER ROLE app ENCRYPTED PASSWORD 'newpassword'
  2. Update the credentials wherever you store them (e.g., heroku settings, chef data bags, etc)
  3. Restart your application to use new credentials.

This strategy does have a flaw. The time between Step 1 and Step 3 may raise exceptions from the application if you're using webservers that cancel requests (Rack::Timeout, g/Unicorn, etc) and reconnect to the database. You'll want to execute the above steps rapidly.

Dynamically Set Settings From Your Application

The database is configured with default settings like statement_timeout for app, but these can be overridden from the application code. A quick first pass might look like:

-- WARNING! Leaky settings! WARNING!
SET statement_timeout TO 5000; 
-- ... execute query ...
RESET statement_timeout;

This is not a foolproof method because you may leak setting values. If you're using a connection pooler like pgbouncer, executing SET from the application enables the setting for the life of the connection. The connection pooler shares this connection with other requests, so you must call RESET statement_timeout after your query completes. Otherwise, the setting lasts until the connection is closed, a new SET statement_timeout is executed, or RESET statement_timeout is executed. And even if you do call RESET, the value may still leak to other requests when a webserver terminates a long request before the RESET is sent.

A better method is to change the setting in a transaction using SET LOCAL:

BEGIN;
SET LOCAL statement_timeout = 5000;
-- execute queries
COMMIT; -- or ROLLBACK

Now, when the transaction is complete (COMMIT or ROLLBACK), statement_timeout is reset. Additionally, if your webserver terminates long requests during a transaction, then idle_in_transaction_session_timeout will terminate the connection and clean up the setting set by the application.

Query Errors

With all this talk of query cancellations, what are the errors that occur? Let's have a look.

ERROR:  canceling statement due to statement timeout

You see this error message when a query surpasses the statement_timeout value.

If possible, rework the query (maybe it's missing an index?) and then try again. If this strategy doesn't work, you can increase statement_timeout for the duration of this query.

ERROR:  canceling statement due to lock timeout

You see this error message when a query attempts to acquire a lock and surpasses the lock_timeout value.

This one isn't as easy (or difficult) as fixing statement_timeout. For this error, you'll need to make sure that your application queries hold locks for as little as possible. Or you can delay the queries. You can rearchitect web requests so that the locks happen out of the web request cycle -- in an asynchronous job.

FATAL:  terminating connection due to idle-in-transaction timeout

You see this error message when you open a transaction and idle for longer than idle_in_transaction_session_timeout.

You can fix this by rearranging your queries to ensure there are no long pauses in between queries sent to the database.


Outro

The configuration described in this guide is what I use for web services now. I even use it for personal projects that don't have the data size or team size I deal with at work! This is my default setup for web services, and it has served (and is still serving) me well. I hope it works for you.

Postgres Failures At Scale

A discussion of failures with postgres' default settings.

I love working at startups. My career up until this point has been exclusively in startups. Startups move fast to claim victory for the day because they know there might not be a tomorrow. As a software engineer in startups, I have to be quick with my solutions, nimble when the company pivots, and extremely humble to learn from my mistakes. And with this guide, I present postgres knowledge earned through failures.

Over the past four years, I experienced downtimes, late-night wake-up calls, "good-enough" designs, data growth, and organizational growth while managing postgres databases for my medium-size engineering department in a startup. I learned how postgres handles replication, how to quickly back up and restore databases, and how to lock up the database with schema changes. When I required help, a Google search would present solutions in the detailed documentation, in the community wiki, in the public mailing lists, and in StackOverflow. With the abundance of readily available information, postgres makes an excellent database for a beginner. I believe postgres is a delightful database, and I would recommend the database to anyone working in startups building web services.

And yet, the default settings for postgres are terrible for growing web services. Postgres is older than many of the millennials that now use the database in startups. Postgres uses range from banking software and data warehouses to web services like Instagram and Twitch, and of course, my super-awesome-badass-perfect pet projects. My guess is that the defaults we see today are a side effect of managing sophisticated software over decades (!). Nevertheless, developers in startups move fast to ship product, and it's easy to overlook the default settings when you're trying to get shit done.

So for this article, I'd like to focus on two settings that affect web services and postgres in the context of startup growth. The two settings are:

statement_timeout

Abort any statement that takes more than the specified number of milliseconds, starting from the time the command arrives at the server from the client. If log_min_error_statement is set to ERROR or lower, the statement that timed out will also be logged. A value of zero (the default) turns this off.

lock_timeout (postgres 9.3+)

Abort any statement that waits longer than the specified number of milliseconds while attempting to acquire a lock on a table, index, row, or other database object. The time limit applies separately to each lock acquisition attempt. The limit applies both to explicit locking requests (such as LOCK TABLE, or SELECT FOR UPDATE without NOWAIT) and to implicitly acquired locks. A value of zero (the default) turns this off.
Unlike statement_timeout, this timeout can only occur while waiting for locks. Note that if statement_timeout is nonzero, it is rather pointless to set lock_timeout to the same or larger value, since the statement timeout would always trigger first. If log_min_error_statement is set to ERROR or lower, the statement that timed out will be logged.

These settings, when enabled, limit how long a query executes or waits for locks. Both settings default to off, allowing SQL queries and transactions to execute without interruption from postgres. As we'll see, these defaults cause more significant problems in the future.

In the early stages of a web service, allowing queries to execute for as long as necessary isn't a problem. There isn't much traffic to the service, and there's little data. Every query running on the database is screaming fast -- even the problematic queries like pagination and counts. If you're building a mobile application to interface with the web service, the phone screen is likely to refresh without the user noticing. If you're on the web, user actions are snappy and responsive. If the webpage didn't flicker, you might not have seen a request. And with a bit of luck, your startup is attracting users. But this is like a new relationship — you're in a honeymoon phase.

Once the web service grows in users and data, query latencies climb. Snappy queries slow down as the database filters larger numbers of rows. The total number of queries sent to the database climbs, increasing overall resource contention. Your new developer Luke may forget to use the index, causing sequential scans that hog disk IO. New features may increase the complexity of queries, claiming more CPU and memory. No matter the reason, the honeymoon phase fades as database usage increases. In addition, software engineers wear so many hats as the startup grows that database tuning may be left behind or forgotten. The situation may lead to the database's tipping point.

A Tipped Database

The tipping point of a database is when the database server's resources are oversubscribed due to the number and complexity of queries running concurrently. When the database reaches the tipping point, it falls over and is extremely difficult to set upright again. Every query sees longer latencies, and the entire system slows to a crawl. Normally fast queries queue behind slower queries, so the web service responds more slowly, and slow web requests may lead to a loss of users. The events that follow a tipped database spiral out of control, and the solutions to right a tipped database are, at best, short-term fixes.

Pushing a database to its tipping point starts with long-running queries.

Long-Running Queries

Long-running queries (LRQs) hog database resources like disk IO, reducing the capacity for other queries on the system. The more concurrent LRQs are executing, the more time the database spends scheduling access to resources like CPU, memory, and disk IO. If enough LRQs saturate the available resources, all other queries, whether they are LRQs or not, wait on those resources. This, in turn, makes all queries long-running queries. And since statement_timeout is disabled, long-running queries beget more long-running queries.

There exist two strategies to bring a tipped database back to normal. The first strategy is to stop sending new LRQs to the database and allow the database to complete the current LRQs. This will slowly bring the database back to expected latencies as the existing LRQs complete. The second strategy is to cancel the LRQs with pg_cancel_backend() or pg_terminate_backend(). Canceled queries release shared resources, allowing other queries to execute.
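For the second strategy, something like this sketch (tune the interval) finds the long runners in pg_stat_activity, and then you feed the offending pid to pg_cancel_backend() or pg_terminate_backend():

-- Find queries that have been running for more than 5 minutes
SELECT pid, now() - query_start AS runtime, state, query
FROM pg_stat_activity
WHERE state <> 'idle'
  AND now() - query_start > interval '5 minutes'
ORDER BY runtime DESC;

-- Cancel just the query (12345 is a placeholder pid) ...
SELECT pg_cancel_backend(12345);

-- ... or terminate the whole connection
SELECT pg_terminate_backend(12345);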

Both strategies are at best band-aids to the situation. Attempting the first strategy of disabling LRQs may mean disabling a feature and running at a degraded service level. Worse, if the feature is critical to the web service (like a payment transaction), then a downtime event might be more appropriate.

The second strategy of canceling the LRQs is a never-ending battle with a tipped database. If the web service is online while you terminate queries, then the subset of users with canceled LRQs will see error messages. But, like the first strategy, if the canceled LRQs are critical to the web service, a downtime event might be more appropriate.

Both strategies are short-term bandaids that stop the bleeding; the problem is systemic. Apply the bandaid and the service returns to normal (hopefully your users too). But when traffic on the web service increases, the database reaches the tipping point again, and once tipped, you're bleeding again. A long-term fix is required to bring the service back to normal levels of operation.

But, what about your team during the incident?

Your team is likely in a meeting room, now dubbed "TEH War Room," debugging the tipped database. They scramble to find the off switch that disables the beautiful, yet fear-inducing, automated deploy system. At the same time, your manager yells at your teammate to stop committing code, a teammate with the propensity to ship code as soon as the happy path is complete. Another teammate is dashing across an office of densely packed desks to call over the infrastructure/DBA/SRE/DevOps folks to help. Yet another teammate is downloading slow query logs to hunt for the one slow query that ruined them all. All the while, customer support, business development, marketing, product management, non-infrastructure operations, and sales are asking, "Are we up yet?" Sigh.

Hopefully, a solution is found. Because if LRQs delay other queries, the database eventually reaches the next error.

FATAL: sorry, too many clients already.

I hate this error. The apparent solution to a beginner leads to problems at scale: you cannot simply increase max_connections on a postgres database and move on like you might with other databases. Instead, the postgres wiki suggests using a connection pooler to solve this error. This lets you properly scale your web servers horizontally without exhausting database connections.

So now you've scaled up in traffic, and the connection pooler hums along smoothly, handling a larger number of queries. But then the database tips. Your team is frantically debugging the issue while your service is in a degraded state. Suddenly, FATAL: sorry, too many clients already. messages appear in your logs, and now your service is throwing 500s. What happened?

LRQs force the connection pooler to maintain an active connection to the database. And in a tipped database, the more LRQs are running, the fewer connections are available. When every connection is held by an LRQ, new queries sent to the database fail with FATAL: sorry, too many clients already. You've maxed out the available connections to the database.

This causes two problems for the web service. First, web requests holding an active LRQ connection respond slowly -- well, because it's an LRQ. Lucky (unlucky) you if you sent one of those requests. Second, all other web requests that need to execute new queries hit database connection errors because the connection pooler cannot connect to the database. With the postgres default of 100 connections, 100 web requests will be slow while every other web request gets an immediate 500 response, assuming your developers didn't account for a failed database in their web service.

Unlike LRQs, where the web service is responding, maxing out connections is an immediate downtime event. Luckily, your development team has the opportunity to manage this, just like the previous situation! Joy!

Zombie Long Running Queries

In the previous section, we discussed how the default statement_timeout affects tipped databases with LRQs by maxing available connections. Unfortunately, I conveniently failed to mention the best practice of timing out web requests, so let's discuss this now.

A slow web request hogs application server resources, reducing the capacity for other requests. Load balancers queue new requests behind slow requests, like you do when you line up behind a person with 500 grocery items at your local supermarket. This leads to slow responses for all users of your web service. Therefore, it's best practice to terminate long web requests. For instance, the popular deployment service Heroku terminates requests after 30 seconds. If you're not on Heroku, there are similar settings for application servers: unicorn or Rack::Timeout for ruby, gunicorn for python, and net/http timeouts for go.
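For example, gunicorn's worker timeout does this for python apps (a sketch; myapp:app is a placeholder WSGI entry point):

# workers silent (e.g., stuck in a long request) for more than 30s are killed and restarted
gunicorn --timeout 30 myapp:app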

Unfortunately, terminated web requests cause zombie queries. A zombie query is a query still running on the postgres database even though the client is no longer waiting for the results. This happens because postgres only cancels a query when it fails to deliver results back to the client. In other words, only after the query has finished executing (loading data into memory, filtering results, sorting, etc.) does postgres notice the client has disconnected. So up until that point, postgres blindly uses whatever resources the query needs. Knowing this, zombie LRQs artificially reduce available connections and starve web handlers that still need to execute queries.

If the tipped database stays tipped long enough, the web service will face another downtime event. Luckily, your development team should now have the experience of handling a downtime, right?

Locks

statement_timeout applies to all queries in a database, and it overlaps with lock_timeout, the next setting to discuss that impacts web services. lock_timeout applies when a query attempts to acquire a lock. Let's dig into what locks are and the effects of waiting on too many of them.

Locks enable postgres to serialize access to database objects, just like the lock on the door serializes entrance to the bathroom in the bar across the street from my work. When a postgres client acquires a lock on an object, all other clients that want to acquire a conflicting lock on that object wait. The time a client waits for a lock is controlled by lock_timeout -- the default allows indefinite waiting. And waiting is problematic at scale.

Lock Mechanics 101

Let's zoom into the mechanics of locks. Let's say we have transaction A that holds a lock on the users table. Now, let's say transaction B attempts to acquire a lock on the users table. In this scenario, transaction B is forced to wait until transaction A releases the lock on the users table. Pretty basic mechanics, right?

Zooming out, if a web service typically creates B-like transactions at high concurrency (not uncommon if you have code that uses SELECT FOR UPDATE), then when a slow A-like transaction comes along that acquires a lock, this causes all the B-like transactions to wait until the A-like transaction completes. The longer B-like transactions are waiting, the more active connections the connection pooler has to maintain to the database. The more active connections used, the faster the web service sees FATAL: sorry, too many clients already.

A real-life example of this is running this statement on postgres (< 11):

ALTER TABLE users
ADD COLUMN status TEXT DEFAULT 'pending';

For this statement, we're adding a new column status with a default value of 'pending'. The ALTER TABLE statement acquires a lock on the users table and then rewrites the whole table to ensure the status column is set with the default. If you have a large users table, this statement holds the lock on users for a long time. This is an A-like transaction. If there are B-like transactions from your web service executing after the ALTER statement, then your web service rapidly marches (in seconds) towards connection exhaustion. Prep TEH War Room!
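When this happens, a quick way to see the pile-up (postgres 9.6+) is to ask pg_stat_activity who is blocked and by whom:

SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;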

Lock Mechanics 102

Let's zoom back in at a different angle to look at the mechanics of locks. The previous example used a simplistic definition of locks -- when an object is locked, no other access to that object is allowed (like the single bathroom analogy). But, postgres has different modes of locks for various purposes. If I'm keeping with the bathroom analogy, and hell yes, I'm keeping to it, then different lock modes are like a larger bathroom that has a common area to wash your hands and a stall to do your business. These bathrooms handle more concurrency -- one person can use the stall while another person uses the sink. Postgres locks enable a similar concurrency with access to data.

Postgres lock modes range from read-only access to data (e.g., a simple SELECT * FROM table LIMIT 1) to exclusive access to data. The practical difference between the lock modes is whether one mode conflicts with another. For instance, the query SELECT * FROM users LIMIT 1 acquires an ACCESS SHARE lock. ACCESS SHARE locks do not conflict with one another, which allows multiple SELECT statements to acquire this lock on the same table. But an ALTER TABLE query acquires an ACCESS EXCLUSIVE lock, which conflicts with every other lock mode available to postgres. With an ACCESS EXCLUSIVE lock pending, a SELECT query that uses the ACCESS SHARE lock is forced to wait until the ALTER TABLE query completes. Conflicting locks block. You can learn about the conflicts in the postgres documentation on explicit locking.

Now we can discuss the mechanics of locks with this new information. Going back to our example, transaction A holds the SHARE UPDATE EXCLUSIVE lock on the users table. Transaction B attempts to acquire an ACCESS EXCLUSIVE lock on users. Since ACCESS EXCLUSIVE conflicts with all other locks, transaction B is now blocked and waiting.

Now let's add a transaction C that attempts to acquire the ACCESS SHARE lock on the users table. Transaction C does not conflict with transaction A's SHARE UPDATE EXCLUSIVE lock, but it does conflict with transaction B's requested ACCESS EXCLUSIVE lock mode.

Does transaction C gain access?

Nope. Transaction C is blocked by transaction B, which is blocked by transaction A.

Typically, the ACCESS SHARE lock mode and the SHARE UPDATE EXCLUSIVE lock mode do not conflict with one another, and postgres allows concurrent access to the same object (the users table). But postgres implements lock acquisition with a queue in front of the conflict check (at least, that's how I think it's implemented based on my testing -- I haven't read the source code to confirm). The queue ensures access is serialized (i.e., transaction A, then B, then C). So if the front of the queue is a transaction waiting for a lock (i.e., transaction B), then every transaction behind it must also wait (i.e., transaction C).

In Lock Mechanics 101, we discussed the mechanics of locks at the front of the queue (A holds the lock, B waits). But as we see in this deep dive, locks gain access to database objects serially. I bring this up because, if locking were as simple as described in Lock Mechanics 101, it would be easy to skirt around it: you could implement everything without high-conflict lock modes, and maybe your web service does work without them. But there is a real-life scenario where the example described in this lesson shows up: postgres autovacuums and schema changes.

The postgres autovacuum daemon uses the SHARE UPDATE EXCLUSIVE lock mode. One crucial task the autovacuum daemon handles is cleaning up rows to prevent transaction ID wraparound failures, which means you should not cancel this job while it is running. If you then execute a schema change, say the same ALTER TABLE statement from Lock Mechanics 101, the database will rapidly exhaust connections because of the queueing mechanics described in this lesson.

Ask me if I have lock_timeout set for all my schema changes. Go ahead. Ask me!

Fuck yea I have lock_timeout set! You think I love getting screamed at by every department at my company?
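In practice, that looks something like this wrapped around each schema change (the timeout values are illustrative):

BEGIN;
SET LOCAL lock_timeout = 3000;        -- give up quickly instead of queueing everyone behind us
SET LOCAL statement_timeout = 60000;  -- give the table rewrite itself more room
ALTER TABLE users ADD COLUMN status TEXT DEFAULT 'pending';
COMMIT;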


Early Warning System

I end this discussion of postgres failures at scale with a strategy that has worked for me and the startups I've worked in. We set up an early warning system. This strategy surfaces issues before the database reaches the tipping point, but is not a guarantee to prevent tipping entirely. If I knew of a quick and easy solution to avoid tipping, I'd start a company …

We start by setting statement_timeout and lock_timeout for the application so that the web service has query boundaries. I like setting these to double or triple the average response time for the web service (your mileage will vary depending on your workloads). Then, we use monitoring systems to track when the web service errors due to statement timeouts or lock timeouts. This works because most startups in the past 15 years have grown to use monitoring religiously, so it's a matter of adding one more alert. When the query timeout errors increase, we debug the issue like any other bug -- make a ticket and assign it to a developer. This way, we know there's an issue before the database has tipped, and we can focus our attention early.

This strategy allows a startup to concentrate on delivering product while keeping tabs on the web service and database as the company grows. Aside from the rocket growth startups that the media loves to pay attention to, startup growth tends to look tedious and slow at the beginning and the early warning system strategy banks on this. As the startup gains users and data, the query timeout errors increase and warn us of problems with the system. When the startup scales out the organization, a culture of debugging and fixing these errors (hopefully) has permeated through the team, and old developers would (hopefully) train new developers to behave similarly.

New Defaults and Usage Guide

Time to put my money where my mouth is. What follows is how I would configure the statement_timeout and lock_timeout settings for a web service.

(Re)setting statement_timeout

The postgres setting statement_timeout can be configured in the postgresql.conf configuration file, for a user, or by the client (web service) after connecting to the database. Setting it in postgresql.conf makes all statements subject to the value -- this is too aggressive and will hurt your poor operations/devops/SRE folks when they need to run maintenance or monitoring tasks. Setting the value in application code (i.e., client-side) doesn't ensure that every connection made by the postgres user has a statement_timeout, either -- developers could forget it or override it by accident. We want every connection made by the web service to default to a statement_timeout, so instead we configure the setting on the postgres user the web service connects with, ensuring blanket coverage at the database.

For example, to set statement_timeout to 5 seconds for the user webservice, execute the following statement on the database as a privileged user:

ALTER ROLE webservice SET statement_timeout = 5000;

With this configuration, any query made by the webservice user will be terminated if it executes for longer than 5 seconds.

Client Overrides for statement_timeout

It may not be possible to have every query run faster than the postgres user's statement_timeout. For instance, if there's a web request designed to return a large amount of data a few times a day, the statement_timeout might terminate the query prematurely. For those cases, we can set statement_timeout before the query is sent to the server, then reset the setting after the query completes.

BEGIN;
SET LOCAL statement_timeout = 20000; 
SET LOCAL idle_in_transaction_session_timeout = 20000; -- postgres 9.6+
SELECT * FROM mylargetable INNER JOIN mylargertable USING (id);
COMMIT; -- ROLLBACK is fine too

The override process described uses a transaction (BEGIN) and overrides the setting with SET LOCAL. SET LOCAL ensures the statement_timeout is reset after the transaction completes (whether COMMIT or ROLLBACK). The reason for the transaction is to guard against leaking the increased statement_timeout to other queries.

If you’re using postgres 9.6+, then you can also guard against transactions that are idle and open by configuring the idle_in_transaction_session_timeout as described above. If you’re not on 9.6+, then you need to resort to a cron or cron-like tool that cancels idle in transaction queries every few minutes. Or you can upgrade to postgres 9.6+.
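The cron job's query could be as simple as this sketch (the 5-minute threshold is arbitrary):

-- Terminate sessions that have sat idle in a transaction for over 5 minutes
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND state_change < now() - interval '5 minutes';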

There is an alternative method that I don't recommend. This method uses SET and RESET without a transaction. For instance, instead of sending the five statements above, you could send three:

SET statement_timeout = 20000;
SELECT * FROM mylargetable INNER JOIN mylargertable USING (id);
RESET statement_timeout;

This is likely to leak configuration if you're connecting through a connection pooler and the web service terminates a request before the web handler can issue the RESET statement. In that situation, the connection pooler now holds a connection to the database with statement_timeout set to 20 seconds.

(Re)setting lock_timeout

If you're on postgres 9.3 or greater, lock_timeout protects against queries waiting for locks indefinitely. lock_timeout should be shorter than statement_timeout because statement_timeout applies to all queries and will cancel queries that wait for locks longer than statement_timeout. Also, as discussed, lock waits are notorious for exhausting database connections rapidly, so a smaller value for lock_timeout is better.

Set lock_timeout, like statement_timeout, on the user the web service connects with.

ALTER ROLE webservice SET lock_timeout = 1500;

This setting really depends on your application, and you should revisit it often. The lower you push the value, the better.

Client Overrides for lock_timeout

Overriding on the client is similar to statement_timeout: use a transaction and SET LOCAL.

BEGIN;
SET LOCAL lock_timeout = 500; 
SET LOCAL idle_in_transaction_session_timeout = 20000; -- postgres 9.6+
LOCK mytable;
-- ... more queries ... 
COMMIT; -- ROLLBACK is fine too