postgresql-database-engineering▌
manutej/luxor-claude-marketplace · updated Apr 8, 2026
MDX-style export adds YAML metadata + attribution linking explainx.ai and this canonical listing URL.
Comprehensive PostgreSQL database engineering covering indexing, query optimization, replication, and production management.
- ›Covers 6 index types (B-tree, Hash, GiST, GIN, BRIN, SP-GiST) with decision matrices for choosing the right index for specific query patterns
- ›Includes streaming and logical replication setup, failover procedures, and high-availability configuration with synchronous/asynchronous modes
- ›Provides partitioning strategies (range, list, hash) with automation examples
PostgreSQL Database Engineering
A comprehensive skill for professional PostgreSQL database engineering, covering everything from query optimization and indexing strategies to high availability, replication, and production database management. This skill enables you to design, optimize, and maintain high-performance PostgreSQL databases at scale.
When to Use This Skill
Use this skill when:
- Designing database schemas for high-performance applications
- Optimizing slow queries and improving database performance
- Implementing indexing strategies for complex query patterns
- Setting up partitioning for large tables (100M+ rows)
- Configuring streaming replication and high availability
- Tuning PostgreSQL configuration for production workloads
- Implementing backup and recovery procedures
- Debugging performance issues and query bottlenecks
- Setting up connection pooling with pgBouncer or PgPool
- Monitoring database health and performance metrics
- Planning database migrations and schema changes
- Implementing database security and access controls
- Scaling PostgreSQL databases horizontally or vertically
- Managing VACUUM operations and database maintenance
- Setting up logical replication for data distribution
Core Concepts
PostgreSQL Architecture
PostgreSQL uses a process-based architecture with several key components:
- Postmaster Process: Main server process that manages connections
- Backend Processes: One per client connection, handles queries
- Shared Memory: Shared buffers, WAL buffers, lock tables
- Background Workers: Autovacuum, checkpointer, WAL writer, statistics collector
- Write-Ahead Log (WAL): Transaction log for durability and replication
- Storage Layer: TOAST for large values, FSM for free space, VM for visibility
MVCC (Multi-Version Concurrency Control)
PostgreSQL's foundational concurrency mechanism:
- Snapshots: Each transaction sees a consistent snapshot of data
- Tuple Versions: Multiple row versions coexist for concurrent access
- Transaction IDs: xmin (creating transaction), xmax (deleting transaction)
- Visibility Rules: Determines which row versions are visible to transactions
- VACUUM: Reclaims space from dead tuples and prevents transaction wraparound
- FREEZE: Marks old rows as visible to all transactions
Key Implications:
- No read locks - readers never block writers
- Writers never block readers
- Updates create new row versions
- Regular VACUUM is essential
- Dead tuples accumulate until vacuumed
Transaction Isolation Levels
PostgreSQL supports four isolation levels:
- Read Uncommitted: Treated as Read Committed in PostgreSQL
- Read Committed (default): Sees committed data at statement start
- Repeatable Read: Sees snapshot from transaction start
- Serializable: True serializable isolation with SSI
Choosing Isolation:
- Read Committed: Most applications, best performance
- Repeatable Read: Reports, analytics needing consistency
- Serializable: Financial transactions, critical consistency needs
Index Types
PostgreSQL offers multiple index types for different use cases:
1. B-Tree (Default)
- Use for: Equality, range queries, sorting
- Supports: <, <=, =, >=, >, BETWEEN, IN, IS NULL
- Best for: Most general-purpose indexing
- Example: Primary keys, foreign keys, timestamps
2. Hash
- Use for: Equality comparisons only
- Supports: = operator
- Best for: Large tables with equality lookups
- Limitation: Not WAL-logged before PG 10, no range queries
3. GiST (Generalized Search Tree)
- Use for: Geometric data, full-text search, custom types
- Supports: Overlaps, contains, nearest neighbor
- Best for: Spatial data, ranges, full-text search
- Example: PostGIS geometries, tsvector, ranges
4. GIN (Generalized Inverted Index)
- Use for: Multi-valued columns (arrays, JSONB, full-text)
- Supports: Contains, exists operators
- Best for: JSONB queries, array operations, full-text search
- Tradeoff: Slower updates, faster queries
5. BRIN (Block Range Index)
- Use for: Very large tables with natural ordering
- Supports: Range queries on sorted data
- Best for: Time-series data, append-only tables
- Advantage: Tiny index size, scales to billions of rows
6. SP-GiST (Space-Partitioned GiST)
- Use for: Non-balanced data structures
- Supports: Points, ranges, IP addresses
- Best for: Quadtrees, k-d trees, radix trees
Query Planning and Optimization
PostgreSQL's query planner determines execution strategies:
Planner Components:
- Statistics: Table and column statistics for cardinality estimation
- Cost Model: CPU, I/O, and memory cost estimation
- Plan Types: Sequential scan, index scan, bitmap scan, joins
- Join Methods: Nested loop, hash join, merge join
- Optimization: Query rewriting, predicate pushdown, join reordering
Key Statistics:
n_distinct: Number of distinct values (for selectivity)correlation: Physical row ordering correlationmost_common_vals: MCV list for skewed distributionshistogram_bounds: Value distribution histogram
Understanding EXPLAIN:
- Cost: Startup cost .. total cost (arbitrary units)
- Rows: Estimated row count
- Width: Average row size in bytes
- Actual Time: Real execution time (with ANALYZE)
- Loops: Number of times node executed
Partitioning Strategies
Table partitioning for managing large datasets:
Range Partitioning
- Use for: Time-series data, sequential values
- Example: Partition by date ranges (daily, monthly, yearly)
- Benefit: Easy data lifecycle management, faster queries
List Partitioning
- Use for: Discrete categorical values
- Example: Partition by country, region, status
- Benefit: Logical data separation, partition pruning
Hash Partitioning
- Use for: Even data distribution
- Example: Partition by hash(user_id)
- Benefit: Balanced partition sizes, parallel queries
Partition Pruning:
- Planner eliminates irrelevant partitions
- Drastically reduces query scope
- Essential for partition performance
Partition-Wise Operations:
- Partition-wise joins: Join matching partitions directly
- Partition-wise aggregation: Aggregate within partitions
- Parallel partition processing
Replication and High Availability
PostgreSQL replication options:
Streaming Replication (Physical)
- Type: Binary WAL streaming to standby servers
- Modes: Asynchronous, synchronous, quorum-based
- Use for: High availability, read scalability
- Failover: Automatic with tools like Patroni, repmgr
Synchronous vs Asynchronous:
- Synchronous: Zero data loss, higher latency
- Asynchronous: Low latency, potential data loss
- Quorum: Balance between safety and performance
Logical Replication
- Type: Row-level change stream
- Use for: Selective replication, upgrades, multi-master
- Benefit: Replicate specific tables, cross-version
- Limitation: No DDL replication, overhead
Cascading Replication
- Standbys replicate from other standbys
- Reduces load on primary
- Geographic distribution
Connection Pooling
Managing database connections efficiently:
pgBouncer
- Type: Lightweight connection pooler
- Modes: Session, transaction, statement pooling
- Use for: High connection count applications
- Benefit: Reduced connection overhead, resource limits
Pooling Modes:
- Session: Client connects for entire session
- Transaction: Connection per transaction
- Statement: Connection per statement (rarely used)
PgPool-II
- Type: Feature-rich middleware
- Features: Connection pooling, load balancing, query caching
- Use for: Read/write splitting, connection management
- Benefit: Advanced routing, in-memory cache
VACUUM and Maintenance
Critical maintenance operations:
VACUUM
- Purpose: Reclaim dead tuple space, update statistics
- Types: Regular VACUUM, VACUUM FULL
- When: After large updates/deletes, regularly via autovacuum
- Impact: Regular VACUUM is non-blocking
ANALYZE
- Purpose: Update planner statistics
- When: After data changes, schema modifications
- Impact: Minimal, fast on most tables
REINDEX
- Purpose: Rebuild indexes, fix bloat
- When: Index corruption, significant bloat
- Impact: Locks table, use REINDEX CONCURRENTLY (PG 12+)
Autovacuum
- Purpose: Automated VACUUM and ANALYZE
- Configuration: Threshold-based triggering
- Tuning: Balance resource usage vs. responsiveness
- Monitoring: Track autovacuum runs, prevent wraparound
Performance Tuning
Key configuration parameters:
Memory Settings
shared_buffers: 25% of RAM (start point)
effective_cache_size: 50-75% of RAM
work_mem: Per-operation memory (sort, hash)
maintenance_work_mem: VACUUM, CREATE INDEX memory
Checkpoint and WAL
checkpoint_timeout: How often to checkpoint
max_wal_size: WAL size before checkpoint
checkpoint_completion_target: Spread checkpoint I/O
wal_buffers: WAL write buffer size
Query Planner
random_page_cost: Relative cost of random I/O
effective_io_concurrency: Concurrent I/O operations
default_statistics_target: Histogram detail level
Connection Settings
max_connections: Maximum client connections
connection_limit: Per-database/user limits
Index Strategies
Choosing the Right Index
Decision Matrix:
| Query Pattern | Index Type | Reason |
|---|---|---|
WHERE id = 5 |
B-tree | Equality lookup |
WHERE created_at > '2024-01-01' |
B-tree | Range query |
ORDER BY name |
B-tree | Sorting support |
WHERE tags @> ARRAY['sql'] |
GIN | Array containment |
WHERE data->>'status' = 'active' |
GIN (jsonb_path_ops) | JSONB query |
WHERE to_tsvector(content) @@ query |
GIN | Full-text search |
WHERE location <-> point(0,0) |
GiST | Nearest neighbor |
WHERE timestamp BETWEEN ... (large table) |
BRIN | Sequential time-series |
WHERE ip_address << '192.168.0.0/16' |
GiST or SP-GiST | IP range query |
Composite Indexes
Multi-column indexes for complex queries:
Column Ordering Rules:
- Equality columns first
- Sort/range columns last
- High-selectivity columns first
- Match query patterns exactly
Example:
-- Query: WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY created_at
-- Optimal index: (status, created_at)
CREATE INDEX idx_users_status_created ON users(status, created_at);
Partial Indexes
Index subset of rows:
Benefits:
- Smaller index size
- Faster updates on non-indexed rows
- Targeted query optimization
Use Cases:
- Index only active records:
WHERE deleted_at IS NULL - Index recent data:
WHERE created_at > NOW() - INTERVAL '90 days' - Index specific states:
WHERE status IN ('pending', 'processing')
Expression Indexes
Index computed values:
Examples:
-- Case-insensitive search
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Date truncation
CREATE INDEX idx_events_date ON events(DATE(created_at));
-- JSONB field
CREATE INDEX idx_data_status ON documents((data->>'status'));
Covering Indexes (INCLUDE)
Include non-key columns for index-only scans:
CREATE INDEX idx_users_email_include
ON users(email)
INCLUDE (first_name, last_name, created_at);
Benefit: Query satisfied entirely from index, no table lookup
Index Maintenance
Monitoring Index Usage:
-- Unused indexes
SELECT schemaname, tablename, indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
Detecting Bloat:
-- Index bloat estimation
SELECT schemaname, tablename, indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
Query Optimization
Using EXPLAIN ANALYZE
Understanding query execution:
-- Basic EXPLAIN
EXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- EXPLAIN ANALYZE (actually runs query)
EXPLAIN ANALYZE SELECT * FROM users WHERE created_at > '2024-01-01';
-- Detailed output
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT u.*, o.total
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01';
Key Metrics:
- Planning Time: Time to generate plan
- Execution Time: Actual query runtime
- Shared Hit vs Read: Buffer cache hits vs disk reads
- Rows: Estimated vs actual row counts
- Filter vs Index Cond: Post-scan filtering vs index usage
Common Query Anti-Patterns
1. N+1 Queries
Problem: One query per row in a loop Solution: JOIN or batch queries
2. SELECT *
Problem: Fetches unnecessary columns Solution: Select only needed columns
3. Implicit Type Conversions
How to use postgresql-database-engineering on Cursor
AI-first code editor with Composer
Prerequisites
Before installing skills in Cursor, ensure your development environment meets these requirements:
- ›Cursor installed and configured on your development machine
- ›Node.js version 16.0+ with npm package manager (verify with
node --version) - ›Active project directory or workspace where you want to add postgresql-database-engineering
Execute installation command
Execute the skills CLI command in your project's root directory to begin installation:
The skills CLI fetches postgresql-database-engineering from GitHub repository manutej/luxor-claude-marketplace and configures it for Cursor.
Select Cursor when prompted
The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:
Verify installation
Confirm successful installation by checking the skill directory location:
Reload or restart Cursor to activate postgresql-database-engineering. Access the skill through slash commands (e.g., /postgresql-database-engineering) or your agent's skill management interface.
Security & Verification Notice
We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.
Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.
List & Monetize Your Skill
Submit your Claude Code skill and start earning
Use Cases▌
User Story & Requirements Generation
Create detailed user stories, acceptance criteria, and feature specs
Example
Generate user stories for 'password reset feature' with acceptance criteria, edge cases, and test scenarios
Reduce spec writing time by 50%, ensure comprehensive coverage
Competitive Analysis
Research competitors, compare features, identify gaps
Example
Analyze 5 competitor products, create feature comparison matrix, suggest differentiation opportunities
Complete competitive research in 2 hours instead of 2 days
Roadmap Prioritization
Evaluate features using frameworks (RICE, ICE, Kano) and create prioritized backlogs
Example
Score 20 feature ideas using RICE framework, generate prioritized roadmap with rationale
Make data-driven prioritization decisions faster
Stakeholder Communication
Draft PRDs, status updates, and stakeholder presentations
Example
Create executive summary of Q3 roadmap, monthly progress report, feature launch announcement
Save 3-5 hours/week on communication overhead
Implementation Guide▌
Prerequisites
- ›Claude Desktop or compatible AI client
- ›Access to product documentation and roadmap tools (Jira, Notion, etc.)
- ›Understanding of product management frameworks (RICE, Jobs-to-be-Done, etc.)
- ›Stakeholder contact information and communication channels
Time Estimate
30-60 minutes to see productivity improvements
Installation Steps
- 1.Install product management skill
- 2.Start with user story generation for known feature
- 3.Progress to competitive analysis: research 2-3 competitors
- 4.Use for roadmap prioritization: apply RICE/ICE scoring
- 5.Draft stakeholder communications and refine based on feedback
- 6.Build template library for recurring PM tasks
- 7.Share effective prompts with product team
Common Pitfalls
- ⚠Not validating competitive research—verify facts before sharing
- ⚠Accepting user stories without involving engineering team
- ⚠Over-relying on frameworks without qualitative judgment
- ⚠Not customizing outputs to company culture and communication style
- ⚠Skipping stakeholder validation of generated requirements
Best Practices▌
✓ Do
- +Validate research and competitive analysis with real data
- +Collaborate with engineering when generating technical requirements
- +Customize frameworks and templates to your company context
- +Use skill for first drafts, refine with stakeholder input
- +Document successful prompt patterns for PM tasks
- +Combine AI efficiency with human judgment and intuition
✗ Don't
- −Don't publish competitive analysis without fact-checking
- −Don't finalize user stories without engineering review
- −Don't make prioritization decisions solely on AI scoring
- −Don't skip customer validation of generated requirements
- −Don't ignore company-specific context and culture
💡 Pro Tips
- ★Provide context: company goals, constraints, customer feedback
- ★Ask for alternatives: 'Show 3 ways to prioritize this roadmap'
- ★Request stakeholder-specific formatting: 'Executive summary vs. engineering spec'
- ★Use skill for 70% generation + 30% customization to company needs
When to Use This▌
✓ Use When
Use for user story writing, competitive research, roadmap prioritization, stakeholder communication, and PRD drafting. Best for reducing repetitive documentation and research work.
✗ Avoid When
Avoid for strategic product vision (requires deep customer empathy), pricing decisions (needs market and financial expertise), or when face-to-face customer discovery is more valuable than speed.
Learning Path▌
- 1Basic: user stories, feature specs, status updates
- 2Intermediate: competitive analysis, prioritization frameworks, PRDs
- 3Advanced: product strategy, go-to-market planning, OKR setting
- 4Expert: product vision, market positioning, business model innovation
Discussion
Product Hunt–style comments (not star reviews)- No comments yet — start the thread.
Ratings
4.5★★★★★26 reviews- ★★★★★Min Desai· Dec 16, 2024
postgresql-database-engineering fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
- ★★★★★Arjun Yang· Dec 16, 2024
Keeps context tight: postgresql-database-engineering is the kind of skill you can hand to a new teammate without a long onboarding doc.
- ★★★★★Chaitanya Patil· Dec 12, 2024
We added postgresql-database-engineering from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
- ★★★★★Arjun Martin· Nov 7, 2024
postgresql-database-engineering is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
- ★★★★★Piyush G· Nov 3, 2024
Useful defaults in postgresql-database-engineering — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- ★★★★★Xiao Gupta· Oct 26, 2024
Solid pick for teams standardizing on skills: postgresql-database-engineering is focused, and the summary matches what you get after install.
- ★★★★★Shikha Mishra· Oct 22, 2024
Registry listing for postgresql-database-engineering matched our evaluation — installs cleanly and behaves as described in the markdown.
- ★★★★★Mateo Bansal· Sep 21, 2024
Useful defaults in postgresql-database-engineering — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
- ★★★★★Yash Thakker· Sep 1, 2024
Solid pick for teams standardizing on skills: postgresql-database-engineering is focused, and the summary matches what you get after install.
- ★★★★★Dhruvi Jain· Aug 20, 2024
postgresql-database-engineering is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
showing 1-10 of 26