Things you should definitely do in a greenfield application

Many of these are Rails-specific but can be adapted to other stacks.

Implement multi-tenancy correctly from day one: https://www.flightcontrol.dev/blog/ultimate-guide-to-multi-tenant-saas-data-modeling
Steal some things from https://github.com/discourse/discourse
Puma (taken from Nate’s Four Line Friday from 2025-06-20)
- Fewer containers with more processes per container for web. Try this simulation.
- Friends don’t let friends autoscale Puma, Unicorn or Sidekiq based on CPU utilization. Or response time. Or requests per second.
- The optimal Puma configuration for 80% of apps is probably 4 workers, 5 threads on a 4vCPU machine with ~8GB of memory.
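As a rough sketch, the 4-worker, 5-thread setup above might look like this in config/puma.rb. The WEB_CONCURRENCY and RAILS_MAX_THREADS variable names are common conventions assumed here, not something these notes prescribe.

```ruby
# config/puma.rb (sketch for a 4 vCPU / ~8 GB container)

# One worker process per vCPU; override via env if the machine differs.
workers Integer(ENV.fetch("WEB_CONCURRENCY", 4))

# Fixed-size thread pool: min and max are both 5 threads per worker.
max_threads = Integer(ENV.fetch("RAILS_MAX_THREADS", 5))
threads max_threads, max_threads

# Load the app before forking so workers can share memory via copy-on-write.
preload_app!
```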
Postgres
- QueryTags in query logs, including the code namespace and function (Job, Controller+Action, etc) as well as the trace id
- Have automation in place to reindex bloated B-tree indexes
  - https://github.com/vitabaks/pg_auto_reindexer
- Periodically clean up table bloat with pg_squeeze
- Set a low statement_timeout (e.g. 10s)
- Set a low idle_in_transaction_session_timeout (e.g. 3s)
- Set a low lock_timeout (e.g. 3s)
- Set a low transaction_timeout
- Set an idle_session_timeout (e.g. 30s), and set a slightly higher timeout on the client (application or middleware connection pooler)
- Lock as little as possible; lock contention is a performance killer
- Hold locks for the shortest possible time. This goes hand in hand with keeping transactions as short as possible, which you should also do.
- Follow pganalyze’s performance checklist
- Use https://pganalyze.com/ to track performance, load, and help discover bottlenecks
- See other Postgres-specific notes
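A minimal sketch of the timeout settings above as per-connection SET statements. The constant and method names are made up for illustration. transaction_timeout is left out because the list gives no value for it.

```ruby
# Timeouts from the list above, in milliseconds.
# transaction_timeout is omitted: the notes do not give a value for it.
PG_TIMEOUTS_MS = {
  "statement_timeout" => 10_000,
  "idle_in_transaction_session_timeout" => 3_000,
  "lock_timeout" => 3_000,
  "idle_session_timeout" => 30_000
}.freeze

# Builds one SET statement per timeout, e.g. to run right after connecting.
def timeout_statements(timeouts = PG_TIMEOUTS_MS)
  timeouts.map { |name, ms| "SET #{name} = '#{ms}ms'" }
end

puts timeout_statements
```

In a Rails app the same values could probably live under the `variables:` key of config/database.yml instead, which the PostgreSQL adapter applies when it opens a connection.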
Tests
- Use minitest + https://github.com/grosser/maxitest
- Use fixtures, not factories
- Postgres
  - Use tablespaces and put tables on tmpfs (RAM)
    - https://gajus.com/blog/setting-up-postgre-sql-for-running-integration-tests
  - Use unlogged tables
  - fsync=off
  - full_page_writes=off
  - synchronous_commit=off
  - autovacuum=off
  - checkpoint_timeout=60min
  - wal_level=minimal
  - max_wal_senders=0
- Enforce a maximum test runtime
- Fix or delete flaky tests
- Use https://github.com/basecamp/gh-signoff and skip hosted CI
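The Postgres test settings above can be turned into server flags with a small helper. This is only a sketch with hypothetical names. These settings trade durability for speed, so they are only suitable for throwaway test databases.

```ruby
# Durability-related settings from the list above.
# Only for disposable test databases: a crash can lose or corrupt data.
TEST_PG_SETTINGS = {
  "fsync" => "off",
  "full_page_writes" => "off",
  "synchronous_commit" => "off",
  "autovacuum" => "off",
  "checkpoint_timeout" => "60min",
  "wal_level" => "minimal",
  "max_wal_senders" => "0"
}.freeze

# Turns the settings into `-c name=value` arguments for the `postgres` binary.
def postgres_args(settings = TEST_PG_SETTINGS)
  settings.flat_map { |name, value| ["-c", "#{name}=#{value}"] }
end

puts((["postgres"] + postgres_args).join(" "))
```

The resulting argument list could be passed as the command of a test database container.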
Preconnect cross-origin domains
Compress assets + use a CDN
Compress response payloads (HTML/JSON)
Sidekiq jobs
- Leverage https://github.com/sidekiq/sidekiq/wiki/Iteration
- SLO based queue naming (within_6_hours, within_0_seconds, etc.)
  - Idea: expand sidekiq_options to accept slo and weight to pick/generate a queue name which is automatically used
    - slo: 5.minutes, weight: 0.5
- Manual load shedding
  - Short-circuit all jobs of class X with argument[0] = Y
- Implement Sidekiq batch invalidation by default for every job
- Provide fairness between tenants
  - TODO: How? Sidekiq Limiter?
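Two of the ideas above, SLO-based queue names and manual load shedding, sketched in plain Ruby. The names queue_for_slo and LoadShedder are hypothetical. The weight option is not modeled because its semantics are not spelled out here. The middleware follows the shape of a Sidekiq server middleware (call with job instance, payload, queue, then yield), but the rule storage is an assumption.

```ruby
# Seconds per unit, largest first.
SLO_UNITS = { "hours" => 3600, "minutes" => 60, "seconds" => 1 }.freeze

# Maps an SLO in whole seconds to a queue name such as "within_5_minutes".
# Uses the largest unit that divides the SLO evenly; units stay plural.
def queue_for_slo(seconds)
  raise ArgumentError, "SLO must be >= 0" if seconds.negative?
  return "within_0_seconds" if seconds.zero?

  unit, size = SLO_UNITS.find { |_, unit_seconds| (seconds % unit_seconds).zero? }
  "within_#{seconds / size}_#{unit}"
end

# Skips any job whose class name and first argument match a shed rule.
class LoadShedder
  # rules: array of [job_class_name, first_argument] pairs to drop.
  def initialize(rules)
    @rules = rules
  end

  def call(_job_instance, payload, _queue)
    # Returning without yielding means the job body never runs.
    return if @rules.include?([payload["class"], payload["args"]&.first])

    yield
  end
end

puts queue_for_slo(5 * 60)
```

With Sidekiq, the shedder would presumably be registered via `chain.add LoadShedder, rules` inside `config.server_middleware`. In practice the rules would need to come from somewhere changeable at runtime.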
Avoid using UUIDs at all
- Use bigint primary keys; for public identifiers, use sqids with a secret
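One possible reading of "sqids with a secret" is a secret-derived alphabet. The sketch below only derives such an alphabet with the standard library. The alphabet_for helper is hypothetical, and handing the result to the sqids gem is an assumption noted in the comments.

```ruby
require "digest"

DEFAULT_ALPHABET = [*"a".."z", *"A".."Z", *"0".."9"].freeze

# Derives a secret-specific alphabet by shuffling the default one with a
# random generator seeded from the secret.
def alphabet_for(secret)
  seed = Digest::SHA256.hexdigest(secret).to_i(16)
  DEFAULT_ALPHABET.shuffle(random: Random.new(seed)).join
end

# ASSUMPTION: with the sqids gem installed, the alphabet would be used as
#   Sqids.new(alphabet: alphabet_for(secret))
puts alphabet_for("example-secret").length
```

It may be safer to generate the alphabet once and store it as the secret, since shuffle output is not guaranteed to stay identical across Ruby versions. A shuffled alphabet obscures sequential IDs; it is not encryption.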
rack-mini-profiler (RMP) in all environments
RUM in browser
Autoscale web and worker
Instrument request queue time
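A minimal sketch of queue-time instrumentation as a Rack-style middleware. It assumes the proxy or load balancer sets an X-Request-Start header in the form "t=<epoch seconds>"; other setups use different formats. The env key is hypothetical.

```ruby
# Measures how long a request waited between the proxy and the app.
class RequestQueueTime
  # clock is injectable so the middleware can be tested deterministically.
  def initialize(app, clock: -> { Time.now.to_f })
    @app = app
    @clock = clock
  end

  def call(env)
    started = parse_start(env["HTTP_X_REQUEST_START"])
    if started
      waited_ms = ((@clock.call - started) * 1000).round
      # Clamp at zero in case of clock skew between proxy and app host.
      env["request.queue_time_ms"] = [waited_ms, 0].max
    end
    @app.call(env)
  end

  private

  # ASSUMPTION: header looks like "t=1718870400.123" (epoch seconds).
  def parse_start(value)
    match = /\At=(\d+(?:\.\d+)?)\z/.match(value.to_s)
    match && match[1].to_f
  end
end
```

The recorded value can then be reported as a metric and used as an autoscaling signal.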
Automated alerts/SLOs in Terraform
Prosopite/strict_loading in tests
jemalloc
Turbo/DataStar/hx-boost
Mise Tasks
Code formatting
- Standardrb
- Enforce with git hooks https://github.com/sds/overcommit
https://evilmartians.com/chronicles/gemfile-of-dreams-libraries-we-use-to-build-rails-apps
Enforce a zero-bug policy
All “Repository” operations must be batched
Set up a Content Security Policy (CSP) from day one
Track performance regressions
- Metric A: What I want to monitor, e.g., # of requests that took longer than 5 seconds, grouped by controller action
- Metric B: The same metric, time-shifted by one week
- Alert: A/B > 2
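A small sketch of the comparison, reading the alert as "the current count is more than twice the count from one week earlier". The method name and input shape are hypothetical. Actions with no baseline data are skipped, which is an assumption.

```ruby
# current:  { "controller#action" => count of slow requests now }
# baseline: the same metric, one week earlier
# Returns the actions whose current count exceeds threshold x baseline.
def regressions(current, baseline, threshold: 2.0)
  current.select do |action, count|
    previous = baseline.fetch(action, 0)
    # Skip actions without a baseline: the ratio is undefined.
    previous.positive? && count.to_f / previous > threshold
  end.keys
end

current  = { "orders#index" => 30, "users#show" => 5 }
baseline = { "orders#index" => 10, "users#show" => 5 }
puts regressions(current, baseline)
```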
Feature flags
- Support a tree-like structure, since some feature flags are nested by nature. Enabling a child node should enable all parent nodes
  - This can be done simply with a parent_feature_ids array column, but you may also leverage the ltree extension
- Notify when feature flags are fully rolled out or fully disabled, every week
  - As a reminder to delete dead code branches
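A minimal in-memory sketch of "enabling a child enables its parents", assuming parent_feature_ids lists each flag's direct parents. Flag names and helper names are made up for illustration.

```ruby
require "set"

# flag id => direct parent ids, mirroring a parent_feature_ids array column.
PARENTS = {
  "billing" => [],
  "billing.invoices" => ["billing"],
  "billing.invoices.pdf" => ["billing.invoices"]
}.freeze

# Returns the flag plus every ancestor. The seen set also guards against cycles.
def with_ancestors(flag, parents = PARENTS, seen = Set.new)
  return seen unless seen.add?(flag)

  parents.fetch(flag, []).each { |parent| with_ancestors(parent, parents, seen) }
  seen
end

# Returns a new set of enabled flags with the flag and its ancestors added.
def enable(enabled, flag, parents = PARENTS)
  enabled | with_ancestors(flag, parents)
end

puts enable(Set.new, "billing.invoices.pdf").to_a.sort
```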
Set up span/slog fields for metrics/signals/monitoring, per request/job:
- Tenant ID
- User ID
- Feature Flags
- Error code
- Number of DB queries
- Number of DB tables queried
- Number of DB services hit
- Number of Elastic queries
- Number of Elastic services hit
- Number of cache reads
- Number of cache writes
- Number of cache services hit
- Number of object allocations
- Number of HTTP requests
- Number of HTTP request retries
- Number of HTTP requests uniqued by domain
- Time spent in DB
- Time spent reading from DB
- Time spent writing to DB
- Time spent in cache
- Time spent reading from cache
- Time spent writing to cache
- Time spent in view layer
- Time spent in HTTP requests
- Time spent
- CPU wall time spent
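A rough sketch of how a few of these fields could be accumulated per request or job and flattened into one hash for a span or log line. The RequestStats class and its field names are hypothetical, and only the DB-related fields are shown.

```ruby
require "set"

# Collects counters during one request/job, then emits a flat hash.
class RequestStats
  def initialize(tenant_id:, user_id:)
    @fields = { tenant_id: tenant_id, user_id: user_id }
    @counts = Hash.new(0)
    @durations_ms = Hash.new(0.0)
    @tables = Set.new
  end

  # Call once per executed query.
  def record_db_query(table:, duration_ms:, write: false)
    @counts[:db_queries] += 1
    @tables << table
    @durations_ms[:db] += duration_ms
    @durations_ms[write ? :db_write : :db_read] += duration_ms
  end

  # Flat hash suitable for span attributes or a structured log line.
  def to_h
    @fields
      .merge(@counts)
      .merge(db_tables_queried: @tables.size)
      .merge(@durations_ms.transform_keys { |key| :"#{key}_ms" })
  end
end

stats = RequestStats.new(tenant_id: 1, user_id: 2)
stats.record_db_query(table: "orders", duration_ms: 1.5)
puts stats.to_h
```

In a Rails app, record_db_query would likely be fed from an ActiveSupport::Notifications subscriber on "sql.active_record".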