Every app that stores anything β user accounts, blog posts, orders, messages β uses a database. But "database" is one of those words that beginners hear constantly without anyone explaining what it actually is or how it works.
This guide covers the basics with real examples: what a database is, how SQL works, what primary keys and indexes do, and how to try it all right now without installing anything.
What a database is (and how it differs from a file)
A database is software that stores, organises, and retrieves data. You could also store data in a plain file β a JSON file, a CSV, a text file β so what makes a database different?
Three things:
1. Concurrent access. A plain file has no coordination mechanism. If two users try to write at the same time, you get corrupted data. A database handles thousands of simultaneous readers and writers safely.
2. Structure and constraints. A database enforces rules. You can say "this column must not be empty" or "this value must be unique" or "this value must refer to a row that exists in another table." A plain file has no enforcement β bad data sneaks in.
3. Efficient queries. Finding all users who signed up after January 1st in a JSON file means reading every record and filtering in code. A database with the right index finds those rows in milliseconds even if you have ten million of them.
Think of a spreadsheet: it stores data in rows and columns, but it has no concurrency control, no constraints, and no query language. A database adds all three.
Two main types: SQL and NoSQL
Relational databases (SQL)
Data is stored in tables β like spreadsheets, but with enforced schemas and relationships between tables. You query them using SQL (Structured Query Language).
Popular options: PostgreSQL, MySQL, SQLite.
Best for: structured data with clear relationships. A users table, an orders table, a products table β and queries that join them together.
NoSQL / Document databases
Data is stored in flexible documents, usually JSON. There is no fixed schema β different documents in the same collection can have different fields.
Popular options: MongoDB, Firestore (from Google).
Best for: data with a flexible or rapidly changing shape. A social media post might have zero or ten photos β a document database handles that naturally without schema migrations.
For most backend projects, a relational database (PostgreSQL) is the right default. NoSQL is a deliberate choice for specific use cases, not a simpler alternative.
SQL basics β real queries explained
SQL is the language you use to talk to relational databases. The core commands are CREATE, INSERT, SELECT, UPDATE, and DELETE.
Let's use a real example: a users table.
Creating a table
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
What each line does:
id SERIAL PRIMARY KEYβ an auto-incrementing integer that uniquely identifies each rowname TEXT NOT NULLβ a text column that cannot be emptyemail TEXT UNIQUE NOT NULLβ a text column that cannot be empty and must be unique across all rowscreated_at TIMESTAMP DEFAULT NOW()β a timestamp column that defaults to the current time if you don't supply one
Inserting and reading rows
-- Add a user
INSERT INTO users (name, email) VALUES ('Alice', '[email protected]');
INSERT INTO users (name, email) VALUES ('Bob', '[email protected]');
-- Read all users
SELECT * FROM users;
-- Read only users who joined in 2026
SELECT name, email FROM users WHERE created_at > '2026-01-01';
SELECT * means "all columns." Replacing * with specific column names makes queries faster and output easier to read.
Updating and deleting
-- Change Alice's last name
UPDATE users SET name = 'Alice Smith' WHERE id = 1;
-- Delete a user
DELETE FROM users WHERE id = 1;
The WHERE clause is critical β omit it from DELETE and you delete every row in the table.
Primary keys and foreign keys
A primary key uniquely identifies each row. In our table above, id is the primary key. No two users can have the same id, and it cannot be NULL. Every table should have one.
A foreign key links a row in one table to a row in another. This is how you model relationships.
Example: an orders table where each order belongs to a user:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER NOT NULL REFERENCES users(id),
product TEXT NOT NULL,
total_cents INTEGER NOT NULL,
created_at TIMESTAMP DEFAULT NOW()
);
REFERENCES users(id) is the foreign key. The database will reject any attempt to insert an order with a user_id that doesn't exist in the users table. This protects data integrity at the database level β you can't end up with orphaned orders pointing to deleted users (unless you explicitly allow it).
To find all orders placed by Alice (id = 1):
SELECT orders.product, orders.total_cents
FROM orders
WHERE orders.user_id = 1;
Or join both tables together for richer results:
SELECT users.name, orders.product, orders.total_cents
FROM orders
JOIN users ON orders.user_id = users.id
WHERE users.name = 'Alice Smith';
Indexes: finding rows without scanning the whole table
Imagine a 500-page book with no index. Finding every mention of "PostgreSQL" means reading every page. A book index sends you directly to the right pages.
Database indexes work the same way. Without an index, a query like WHERE email = '[email protected]' scans every row. With an index on the email column, the database jumps directly to that row.
CREATE INDEX idx_users_email ON users(email);
When to add one: Add an index on any column you filter or sort by frequently β especially columns used in WHERE, ORDER BY, or JOIN conditions. Foreign key columns are almost always worth indexing.
The cost tradeoff: Indexes make reads faster but writes slightly slower (the index must be updated on every insert/update/delete). They also consume disk space. Don't index every column β only the ones you actually query on.
Transactions: all-or-nothing operations
A transaction groups multiple operations so they either all succeed or all fail together. The canonical example is a bank transfer:
- Subtract $100 from Alice's account
- Add $100 to Bob's account
If your app crashes after step 1 but before step 2, Alice loses $100 with no corresponding credit to Bob. A transaction prevents this β both steps are committed together, or neither is.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
If anything goes wrong between BEGIN and COMMIT, you run ROLLBACK and neither update is applied.
The acronym ACID describes the guarantees a transactional database provides:
- Atomicity β all steps succeed or none do
- Consistency β the database is always in a valid state
- Isolation β concurrent transactions don't interfere with each other
- Durability β once committed, data survives crashes
ORMs: query your database in your own language
Raw SQL works fine, but most applications use an ORM (Object-Relational Mapper) β a library that lets you work with your database using the objects and functions of your programming language rather than writing SQL strings.
Prisma is the most popular ORM for Node.js and Next.js. You define your schema in a schema.prisma file:
model User {
id Int @id @default(autoincrement())
name String
email String @unique
createdAt DateTime @default(now())
orders Order[]
}
model Order {
id Int @id @default(autoincrement())
userId Int
user User @relation(fields: [userId], references: [id])
product String
totalCents Int
createdAt DateTime @default(now())
}
Then query in TypeScript β no SQL strings:
// Find all orders by Alice
const orders = await prisma.order.findMany({
where: { user: { name: 'Alice Smith' } },
include: { user: true },
});
Prisma generates type-safe queries, catches schema mismatches at build time, and handles migrations. For Python, SQLAlchemy is the equivalent standard.
SQLite vs PostgreSQL: which one to use
SQLite is a database stored in a single file. No server to run, no connection strings β just point your app at a file. Perfect for local development, small desktop apps, and single-user tools. It comes bundled with Python and is widely used in mobile apps.
PostgreSQL is a full database server. It handles concurrent users, enforces constraints rigorously, and has features that SQLite lacks (like full-text search and advanced JSON operations). It's the right choice for any production app with multiple users.
A common pattern: use SQLite during local development (zero setup), switch to PostgreSQL in production.
Try SQL right now β no install needed
You don't need to install anything to practice SQL today.
- db-fiddle.com β choose PostgreSQL or MySQL, paste a schema and queries, run them instantly in the browser
- SQLiteOnline.com β SQLite in the browser; good for table creation and basic queries
Try pasting the CREATE TABLE and INSERT examples above and running a few SELECT queries.
Hosted databases for side projects
When you're ready to deploy, you don't need to manage a database server yourself. These services offer free tiers that are good for learning and small projects:
- Supabase β hosted PostgreSQL with a generous free tier, plus built-in auth and storage
- Neon β serverless PostgreSQL; scales to zero when not in use (great for projects that get infrequent traffic)
- PlanetScale β MySQL-compatible, built for high-scale apps
For most beginners, Supabase is the easiest starting point β you get a real PostgreSQL database, a visual table editor, and auth out of the box.
When you don't need a database
Not every project needs a database. Consider a simpler alternative when:
- Your site is static β a blog or marketing site where content changes rarely. Markdown files (like those used by this blog) or a JSON file are plenty.
- You have one user β a personal tool only you use doesn't need concurrency control.
- Your data is configuration β environment variables or a
.envfile beat a database for config values. - You're prototyping β during the first hours of a project, start with a JSON file. Swap in a database once you know what data you actually need to store.