Database Standards
PostgreSQL Best Practices
We use PostgreSQL as our primary database for all projects. This document outlines our standards and best practices for database design, schema management, and query optimization.
Database Design Principles
Normalization
- Aim for 3rd Normal Form (3NF) for most data models
- Consider denormalization only when performance benefits are significant and clearly documented
- Document any intentional denormalization with rationale
Naming Conventions
- Tables: Use plural, snake_case names (e.g., users,order_items)
- Columns: Use singular, snake_case names (e.g., first_name,created_at)
- Primary Keys: Use idas the standard primary key name
- Foreign Keys: Use singular table name with _idsuffix (e.g.,user_id,product_id)
- Indexes: Format as idx_[table]_[column(s)](e.g.,idx_users_email)
- Constraints: Use descriptive prefixes:
- Primary key: pk_[table](e.g.,pk_users)
- Foreign key: fk_[table]_[referenced_table](e.g.,fk_orders_users)
- Unique: uq_[table]_[column(s)](e.g.,uq_users_email)
- Check: ck_[table]_[column]_[check](e.g.,ck_products_price_positive)
 
- Primary key: 
Common Fields
Include these standard fields in all tables:
-- Standard fields for tracking and auditing
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
For soft deletes (when needed):
deleted_at TIMESTAMP WITH TIME ZONE,
For user tracking (when needed):
created_by UUID REFERENCES users(id),
updated_by UUID REFERENCES users(id),
Schema Design Examples
User Schema Example
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    first_name VARCHAR(100) NOT NULL,
    last_name VARCHAR(100) NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    role VARCHAR(50) NOT NULL DEFAULT 'user',
    last_login_at TIMESTAMP WITH TIME ZONE,
    email_verified BOOLEAN NOT NULL DEFAULT FALSE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    deleted_at TIMESTAMP WITH TIME ZONE
);
-- Index for email lookups
CREATE INDEX idx_users_email ON users(email);
-- Index for filtered queries that exclude deleted records
CREATE INDEX idx_users_deleted_at ON users(deleted_at) WHERE deleted_at IS NULL;
Relationship Example
For a one-to-many relationship:
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id),
    status VARCHAR(50) NOT NULL DEFAULT 'pending',
    total_amount DECIMAL(10, 2) NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    
    CONSTRAINT ck_orders_total_amount_positive CHECK (total_amount >= 0)
);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_status ON orders(status);
For a many-to-many relationship:
CREATE TABLE products (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    price DECIMAL(10, 2) NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    
    CONSTRAINT ck_products_price_positive CHECK (price >= 0)
);
CREATE TABLE categories (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(100) NOT NULL UNIQUE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL
);
-- Junction table for many-to-many relationship
CREATE TABLE product_categories (
    product_id UUID NOT NULL REFERENCES products(id) ON DELETE CASCADE,
    category_id UUID NOT NULL REFERENCES categories(id) ON DELETE CASCADE,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    
    PRIMARY KEY (product_id, category_id)
);
Indexing Strategy
When to Create Indexes
Create indexes on:
- All primary keys (automatic with PRIMARY KEY constraint)
- All foreign keys
- Columns frequently used in WHERE clauses
- Columns used in ORDER BY or GROUP BY
- Columns used in JOIN conditions
Index Types
- B-tree indexes (default): For most equality and range queries
- Hash indexes: For exact equality comparisons
- GIN indexes: For full-text search and JSON data
- GiST indexes: For geospatial data
- Partial indexes: For filtering on a subset of data
Index Example
-- Regular index
CREATE INDEX idx_products_name ON products(name);
-- Composite index for queries that filter by both columns
CREATE INDEX idx_orders_user_id_status ON orders(user_id, status);
-- Partial index for active users only
CREATE INDEX idx_users_active ON users(email, role) WHERE deleted_at IS NULL;
-- Index for case-insensitive searches
CREATE INDEX idx_products_name_lower ON products(LOWER(name));
-- Index for JSON data
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);
-- Index for full-text search
CREATE INDEX idx_products_search ON products USING GIN (to_tsvector('english', name || ' ' || description));
Query Best Practices
General Guidelines
- Always use parameterized queries to prevent SQL injection
- Use specific column names instead of SELECT *
- Add proper LIMIT and OFFSET for pagination
- Include appropriate WHERE clauses for filtered queries
- Use JOINs instead of nested queries when possible
- Use CTEs (WITH queries) for complex queries to improve readability
Example Queries
Efficient SELECT with filtering:
-- With proper filtering and selection
SELECT 
    u.id, 
    u.email, 
    u.first_name, 
    u.last_name, 
    u.role
FROM 
    users u
WHERE 
    u.deleted_at IS NULL 
    AND u.role = 'admin'
ORDER BY 
    u.last_name, 
    u.first_name
LIMIT 
    20 OFFSET 0;
Using JOIN:
-- Efficient join with specific columns
SELECT 
    o.id AS order_id,
    o.total_amount,
    u.email AS user_email,
    u.first_name,
    u.last_name
FROM 
    orders o
JOIN 
    users u ON o.user_id = u.id
WHERE 
    o.status = 'completed'
    AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY 
    o.created_at DESC
LIMIT 
    50;
Using Common Table Expressions (CTEs):
-- CTE for improved readability
WITH recent_orders AS (
    SELECT 
        user_id, 
        COUNT(*) AS order_count,
        SUM(total_amount) AS total_spent
    FROM 
        orders
    WHERE 
        created_at >= NOW() - INTERVAL '90 days'
        AND status = 'completed'
    GROUP BY 
        user_id
)
SELECT 
    u.id,
    u.email,
    u.first_name,
    u.last_name,
    COALESCE(ro.order_count, 0) AS recent_order_count,
    COALESCE(ro.total_spent, 0) AS recent_total_spent
FROM 
    users u
LEFT JOIN 
    recent_orders ro ON u.id = ro.user_id
WHERE 
    u.deleted_at IS NULL
ORDER BY 
    recent_total_spent DESC
LIMIT 
    100;
Migration and Schema Management
Migration Process
For projects using PayloadCMS:
- Make schema changes in your collection configuration
- Run migrations through PayloadCMS migration system
For Django projects:
- Update models in your Django files
- Generate migrations with python manage.py makemigrations
- Apply migrations with python manage.py migrate
For custom SQL migrations:
- Create a versioned migration file
- Include both UP and DOWN migration scripts
- Document what the migration does
Migration Principles
- Migrations should be incremental, not massive schema changes
- Always test migrations on a staging environment before production
- Include rollback procedures for all migrations
- Schedule migrations during low-traffic periods
- Document all schema changes
Performance Optimization
Query Optimization
- Use EXPLAIN ANALYZE to identify slow queries
- Ensure indexes are being used effectively
- Consider partitioning large tables
- Use appropriate JOIN types (INNER, LEFT, etc.)
- Consider materialized views for complex, frequently-run reports
Connection Management
- Use connection pooling (PgBouncer or built-in pool in frameworks)
- Properly close database connections when done
- Monitor and limit max connections
- Set appropriate timeouts
Integration with NextJS and PayloadCMS
PayloadCMS Collections
When defining PayloadCMS collections:
// Example PayloadCMS collection with best practices
import { CollectionConfig } from 'payload/types';
const Products: CollectionConfig = {
  slug: 'products',
  admin: {
    useAsTitle: 'name',
    defaultColumns: ['name', 'price', 'category', 'createdAt'],
  },
  access: {
    // Define proper access control
    read: ({ req }) => true,
    update: ({ req }) => req.user.role === 'admin',
    // ...
  },
  fields: [
    {
      name: 'name',
      type: 'text',
      required: true,
      index: true, // Add index for frequently queried fields
    },
    {
      name: 'description',
      type: 'textarea',
    },
    {
      name: 'price',
      type: 'number',
      required: true,
      min: 0,
    },
    {
      name: 'category',
      type: 'relationship',
      relationTo: 'categories',
      required: true,
    },
    // Add audit fields automatically
  ],
  hooks: {
    // Add validation or transformation hooks
    beforeChange: [
      ({ data }) => {
        // Validate or transform data
        return data;
      },
    ],
  },
};
export default Products;
Setting Up with NextJS
// Example of database connection in NextJS
// lib/db.ts
import { Pool } from 'pg';
// Use environment variables for connection details
const pool = new Pool({
  user: process.env.DB_USER,
  host: process.env.DB_HOST,
  database: process.env.DB_NAME,
  password: process.env.DB_PASSWORD,
  port: parseInt(process.env.DB_PORT || '5432', 10),
  max: 20, // Maximum number of clients in the pool
  idleTimeoutMillis: 30000,
});
// Helper function for queries
export async function query(text, params) {
  try {
    const start = Date.now();
    const res = await pool.query(text, params);
    const duration = Date.now() - start;
    
    // Log slow queries in development
    if (duration > 100) {
      console.log('Slow query:', { text, duration, rows: res.rowCount });
    }
    
    return res;
  } catch (error) {
    console.error('Database query error:', error);
    throw error;
  }
}
// Example of a data access function
export async function getUserById(id) {
  const { rows } = await query(
    'SELECT id, email, first_name, last_name, role FROM users WHERE id = $1 AND deleted_at IS NULL',
    [id]
  );
  return rows[0] || null;
}
Security Best Practices
Access Control
- Implement row-level security for multi-tenant applications
- Use roles and permissions to restrict data access
- Never expose database credentials in client-side code
Data Protection
- Hash passwords with strong algorithms (bcrypt/Argon2)
- Encrypt sensitive data at rest
- Use TLS/SSL for database connections
- Implement proper backups and disaster recovery
Example of Row-Level Security
-- Create a policy to restrict users to seeing only their data
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY orders_user_policy ON orders
    USING (user_id = current_user_id());
-- Function to get the current user ID from application context
CREATE OR REPLACE FUNCTION current_user_id()
RETURNS UUID AS $$
BEGIN
    RETURN current_setting('app.current_user_id', true)::UUID;
EXCEPTION
    WHEN OTHERS THEN
        RETURN NULL;
END;
$$ LANGUAGE plpgsql;
Monitoring and Maintenance
- Regularly analyze and optimize tables with VACUUM and ANALYZE
- Monitor query performance and database size
- Set up alerting for slow queries and high resource usage
- Schedule routine maintenance during off-peak hours
- Keep PostgreSQL version updated
Conclusion
Following these database standards ensures that our PostgreSQL databases are well-designed, performant, and secure. Always consider the long-term maintainability of your database schema and optimize for the specific query patterns of your application.
For project-specific database questions, consult with your Tech Lead or the database specialist on your team.