Skip to main content

Database Standards

PostgreSQL Best Practices

We use PostgreSQL as our primary database for all projects. This document outlines our standards and best practices for database design, schema management, and query optimization.

Database Design Principles

Normalization

  • Aim for 3rd Normal Form (3NF) for most data models
  • Consider denormalization only when performance benefits are significant and clearly documented
  • Document any intentional denormalization with rationale

Naming Conventions

  • Tables: Use plural, snake_case names (e.g., users, order_items)
  • Columns: Use singular, snake_case names (e.g., first_name, created_at)
  • Primary Keys: Use id as the standard primary key name
  • Foreign Keys: Use singular table name with _id suffix (e.g., user_id, product_id)
  • Indexes: Format as idx_[table]_[column(s)] (e.g., idx_users_email)
  • Constraints: Use descriptive prefixes:
    • Primary key: pk_[table] (e.g., pk_users)
    • Foreign key: fk_[table]_[referenced_table] (e.g., fk_orders_users)
    • Unique: uq_[table]_[column(s)] (e.g., uq_users_email)
    • Check: ck_[table]_[column]_[check] (e.g., ck_products_price_positive)

Common Fields

Include these standard fields in all tables:

-- Standard fields for tracking and auditing
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,

For soft deletes (when needed):

deleted_at TIMESTAMP WITH TIME ZONE,

For user tracking (when needed):

created_by UUID REFERENCES users(id),
updated_by UUID REFERENCES users(id),

Schema Design Examples

User Schema Example

CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) UNIQUE NOT NULL,
first_name VARCHAR(100) NOT NULL,
last_name VARCHAR(100) NOT NULL,
password_hash VARCHAR(255) NOT NULL,
role VARCHAR(50) NOT NULL DEFAULT 'user',
last_login_at TIMESTAMP WITH TIME ZONE,
email_verified BOOLEAN NOT NULL DEFAULT FALSE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
deleted_at TIMESTAMP WITH TIME ZONE
);

-- Index for email lookups
CREATE INDEX idx_users_email ON users(email);

-- Index for filtered queries that exclude deleted records
CREATE INDEX idx_users_deleted_at ON users(deleted_at) WHERE deleted_at IS NULL;

Relationship Example

For a one-to-many relationship:

CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id),
status VARCHAR(50) NOT NULL DEFAULT 'pending',
total_amount DECIMAL(10, 2) NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,

CONSTRAINT ck_orders_total_amount_positive CHECK (total_amount >= 0)
);

CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);

For a many-to-many relationship:

CREATE TABLE products (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(255) NOT NULL,
description TEXT,
price DECIMAL(10, 2) NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,

CONSTRAINT ck_products_price_positive CHECK (price >= 0)
);

CREATE TABLE categories (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(100) NOT NULL UNIQUE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL
);

-- Junction table for many-to-many relationship
CREATE TABLE product_categories (
product_id UUID NOT NULL REFERENCES products(id) ON DELETE CASCADE,
category_id UUID NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,

PRIMARY KEY (product_id, category_id)
);

Indexing Strategy

When to Create Indexes

Create indexes on:

  • All primary keys (automatic with PRIMARY KEY constraint)
  • All foreign keys
  • Columns frequently used in WHERE clauses
  • Columns used in ORDER BY or GROUP BY
  • Columns used in JOIN conditions

Index Types

  • B-tree indexes (default): For most equality and range queries
  • Hash indexes: For exact equality comparisons
  • GIN indexes: For full-text search and JSON data
  • GiST indexes: For geospatial data
  • Partial indexes: For filtering on a subset of data

Index Example

-- Regular index
CREATE INDEX idx_products_name ON products(name);

-- Composite index for queries that filter by both columns
CREATE INDEX idx_orders_user_id_status ON orders(user_id, status);

-- Partial index for active users only
CREATE INDEX idx_users_active ON users(email, role) WHERE deleted_at IS NULL;

-- Index for case-insensitive searches
CREATE INDEX idx_products_name_lower ON products(LOWER(name));

-- Index for JSON data
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);

-- Index for full-text search
CREATE INDEX idx_products_search ON products USING GIN (to_tsvector('english', name || ' ' || description));

Query Best Practices

General Guidelines

  • Always use parameterized queries to prevent SQL injection
  • Use specific column names instead of SELECT *
  • Add proper LIMIT and OFFSET for pagination
  • Include appropriate WHERE clauses for filtered queries
  • Use JOINs instead of nested queries when possible
  • Use CTEs (WITH queries) for complex queries to improve readability

Example Queries

Efficient SELECT with filtering:

-- With proper filtering and selection
SELECT
u.id,
u.email,
u.first_name,
u.last_name,
u.role
FROM
users u
WHERE
u.deleted_at IS NULL
AND u.role = 'admin'
ORDER BY
u.last_name,
u.first_name
LIMIT
20 OFFSET 0;

Using JOIN:

-- Efficient join with specific columns
SELECT
o.id AS order_id,
o.total_amount,
u.email AS user_email,
u.first_name,
u.last_name
FROM
orders o
JOIN
users u ON o.user_id = u.id
WHERE
o.status = 'completed'
AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY
o.created_at DESC
LIMIT
50;

Using Common Table Expressions (CTEs):

-- CTE for improved readability
WITH recent_orders AS (
SELECT
user_id,
COUNT(*) AS order_count,
SUM(total_amount) AS total_spent
FROM
orders
WHERE
created_at >= NOW() - INTERVAL '90 days'
AND status = 'completed'
GROUP BY
user_id
)
SELECT
u.id,
u.email,
u.first_name,
u.last_name,
COALESCE(ro.order_count, 0) AS recent_order_count,
COALESCE(ro.total_spent, 0) AS recent_total_spent
FROM
users u
LEFT JOIN
recent_orders ro ON u.id = ro.user_id
WHERE
u.deleted_at IS NULL
ORDER BY
recent_total_spent DESC
LIMIT
100;

Migration and Schema Management

Migration Process

For projects using PayloadCMS:

  1. Make schema changes in your collection configuration
  2. Run migrations through PayloadCMS migration system

For Django projects:

  1. Update models in your Django files
  2. Generate migrations with python manage.py makemigrations
  3. Apply migrations with python manage.py migrate

For custom SQL migrations:

  1. Create a versioned migration file
  2. Include both UP and DOWN migration scripts
  3. Document what the migration does

Migration Principles

  • Migrations should be incremental, not massive schema changes
  • Always test migrations on a staging environment before production
  • Include rollback procedures for all migrations
  • Schedule migrations during low-traffic periods
  • Document all schema changes

Performance Optimization

Query Optimization

  • Use EXPLAIN ANALYZE to identify slow queries
  • Ensure indexes are being used effectively
  • Consider partitioning large tables
  • Use appropriate JOIN types (INNER, LEFT, etc.)
  • Consider materialized views for complex, frequently-run reports

Connection Management

  • Use connection pooling (PgBouncer or built-in pool in frameworks)
  • Properly close database connections when done
  • Monitor and limit max connections
  • Set appropriate timeouts

Integration with NextJS and PayloadCMS

PayloadCMS Collections

When defining PayloadCMS collections:

// Example PayloadCMS collection with best practices
import { CollectionConfig } from 'payload/types';

const Products: CollectionConfig = {
slug: 'products',
admin: {
useAsTitle: 'name',
defaultColumns: ['name', 'price', 'category', 'createdAt'],
},
access: {
// Define proper access control
read: ({ req }) => true,
update: ({ req }) => req.user.role === 'admin',
// ...
},
fields: [
{
name: 'name',
type: 'text',
required: true,
index: true, // Add index for frequently queried fields
},
{
name: 'description',
type: 'textarea',
},
{
name: 'price',
type: 'number',
required: true,
min: 0,
},
{
name: 'category',
type: 'relationship',
relationTo: 'categories',
required: true,
},
// Add audit fields automatically
],
hooks: {
// Add validation or transformation hooks
beforeChange: [
({ data }) => {
// Validate or transform data
return data;
},
],
},
};

export default Products;

Setting Up with NextJS

// Example of database connection in NextJS
// lib/db.ts
import { Pool } from 'pg';

// Use environment variables for connection details
const pool = new Pool({
user: process.env.DB_USER,
host: process.env.DB_HOST,
database: process.env.DB_NAME,
password: process.env.DB_PASSWORD,
port: parseInt(process.env.DB_PORT || '5432', 10),
max: 20, // Maximum number of clients in the pool
idleTimeoutMillis: 30000,
});

// Helper function for queries
export async function query(text, params) {
try {
const start = Date.now();
const res = await pool.query(text, params);
const duration = Date.now() - start;

// Log slow queries in development
if (duration > 100) {
console.log('Slow query:', { text, duration, rows: res.rowCount });
}

return res;
} catch (error) {
console.error('Database query error:', error);
throw error;
}
}

// Example of a data access function
export async function getUserById(id) {
const { rows } = await query(
'SELECT id, email, first_name, last_name, role FROM users WHERE id = $1 AND deleted_at IS NULL',
[id]
);
return rows[0] || null;
}

Security Best Practices

Access Control

  • Implement row-level security for multi-tenant applications
  • Use roles and permissions to restrict data access
  • Never expose database credentials in client-side code

Data Protection

  • Hash passwords with strong algorithms (bcrypt/Argon2)
  • Encrypt sensitive data at rest
  • Use TLS/SSL for database connections
  • Implement proper backups and disaster recovery

Example of Row-Level Security

-- Create a policy to restrict users to seeing only their data
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;

CREATE POLICY orders_user_policy ON orders
USING (user_id = current_user_id());

-- Function to get the current user ID from application context
CREATE OR REPLACE FUNCTION current_user_id()
RETURNS UUID AS $$
BEGIN
RETURN current_setting('app.current_user_id', true)::UUID;
EXCEPTION
WHEN OTHERS THEN
RETURN NULL;
END;
$$ LANGUAGE plpgsql;

Monitoring and Maintenance

  • Regularly analyze and optimize tables with VACUUM and ANALYZE
  • Monitor query performance and database size
  • Set up alerting for slow queries and high resource usage
  • Schedule routine maintenance during off-peak hours
  • Keep PostgreSQL version updated

Conclusion

Following these database standards ensures that our PostgreSQL databases are well-designed, performant, and secure. Always consider the long-term maintainability of your database schema and optimize for the specific query patterns of your application.

For project-specific database questions, consult with your Tech Lead or the database specialist on your team.