NoSQL Injection Deep Dive
NoSQL Injection happens when an application builds a NoSQL query from untrusted input in a way that lets the attacker change the meaning of the query. Instead of input being treated as simple data (like a username), it becomes part of the query structure (like operators, filters, or conditions).
In the wild: the scary part isn't the obvious payload; it's the one query nobody reviewed because it lives in a background job or a report export.
Why it exists (root cause)
- Query-as-data: many NoSQL drivers accept query objects (JSON-like structures) that can include operators and nested conditions.
- Unsafe merging: developers sometimes merge user JSON directly into a query filter ("flexible search", "advanced filter", "admin query").
- Type confusion: fields expected to be strings become objects/arrays, changing how the database interprets them.
- Operator exposure: allowing operator keys (like "$something") from untrusted input lets attackers alter logic.
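To make the type-confusion point concrete, here is a minimal sketch in plain JavaScript with no database involved. The request bodies and the `requireString` helper are hypothetical; the only claim is that a JSON parser will hand the server an object wherever the client chooses to send one.

```javascript
// Two JSON bodies for the same hypothetical endpoint.
const expectedBody = JSON.parse('{"email": "a@example.com"}');
const confusedBody = JSON.parse('{"email": {"$ne": null}}');

// Hypothetical guard: accept only primitive strings for a field.
function requireString(value, fieldName) {
  if (typeof value !== "string") {
    throw new TypeError(`${fieldName} must be a string`);
  }
  return value;
}

console.log(typeof expectedBody.email); // prints: string
console.log(typeof confusedBody.email); // prints: object
```

A guard like this, applied before the value is placed into any filter, keeps the field in the "data" category regardless of what the parser produced.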
Mental model: value control vs query-shape control
| What the user controls | Risk | Why |
|---|---|---|
| Only a primitive value (string/number) assigned to a fixed field | Lower | Input stays "data" if typed/validated |
| A JSON object merged into the filter | High | User can influence operators, nesting, and logic |
| Field names or sort/where clauses from user input | High | User can redirect query to unintended fields/paths |
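The first two rows of the table can be contrasted in a few lines. This is an illustrative sketch only: `filterFromValue` and `filterFromMerge` are hypothetical names, and the field names are placeholders.

```javascript
// Value control: the server fixes the field and accepts only a primitive.
function filterFromValue(status) {
  if (typeof status !== "string") {
    throw new TypeError("status must be a string");
  }
  return { status };
}

// Query-shape control: user JSON is spread into the filter (the risky pattern).
function filterFromMerge(serverFilter, userFilter) {
  return { ...serverFilter, ...userFilter };
}

const merged = filterFromMerge({ tenantId: "t1" }, { tenantId: "t2", role: "admin" });
console.log(merged); // prints: { tenantId: 't2', role: 'admin' }
```

In the merged case the client-supplied key silently replaces the server's constraint, which is why the table rates object merging as high risk.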
Where NoSQL Injection commonly appears
- Login/search endpoints: "find user by email/password" patterns that build filters from request bodies.
- Advanced filtering: endpoints that accept filter JSON ("/search?filter=...").
- Admin tools: "run a query" or "preview results" features.
- Multi-tenant apps: shared "query builder" components reused across services.
- GraphQL resolvers: filters passed directly into Mongo queries without validation.
Node.js examples (Express + Mongo-style)
Vulnerable pattern (unsafe query object merge)
    // ❌ Vulnerable: merges attacker-controlled JSON into the query filter
    import express from "express";
    import bodyParser from "body-parser";
    import { MongoClient } from "mongodb";

    const app = express();
    app.use(bodyParser.json());
    // Assumes middleware elsewhere sets req.db to a connected Db handle (e.g. from MongoClient)

    app.post("/api/users/search", async (req, res) => {
      const filter = req.body.filter; // attacker-controlled object
      const sort = req.body.sort || { createdAt: -1 }; // attacker-controlled as well
      // Danger: filter may contain operators/paths the server didn't intend
      const users = await req.db.collection("users")
        .find(filter)
        .sort(sort)
        .limit(50)
        .toArray();
      res.json({ users });
    });
Fixed pattern (server-controlled shape + validation + typing)
    // ✅ Secure: allow-list fields + coerce types + block operator keys
    function isPlainObject(x) {
      return x !== null && typeof x === "object" && !Array.isArray(x);
    }

    // Returns true only when no key (at any object depth) starts with "$" or contains "."
    function rejectOperatorKeys(obj) {
      for (const k of Object.keys(obj)) {
        if (k.startsWith("$") || k.includes(".")) return false;
        const v = obj[k];
        if (isPlainObject(v) && !rejectOperatorKeys(v)) return false;
      }
      return true;
    }

    app.post("/api/users/search", async (req, res) => {
      const q = typeof req.body.q === "string" ? req.body.q.trim() : "";
      // Fall back to defaults when page/pageSize are missing, non-numeric, or below 1
      const page = Math.max(1, Math.floor(Number(req.body.page)) || 1);
      const pageSize = Math.min(Math.max(1, Math.floor(Number(req.body.pageSize)) || 20), 50);

      // Server controls the query SHAPE; user controls only values
      const filter = {};
      if (q) {
        // Example: allow searching by email prefix only (business-defined)
        filter.email = { $regex: "^" + q.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"), $options: "i" };
      }

      // If you accept structured filters, validate strictly:
      const raw = req.body.filter;
      if (raw !== undefined) {
        if (!isPlainObject(raw) || !rejectOperatorKeys(raw)) {
          return res.status(400).json({ error: "Invalid filter" });
        }
        // Allow-list only specific fields, and keep values typed as primitives
        const allowed = ["status", "role"];
        for (const f of allowed) {
          if (raw[f] === undefined) continue;
          // Objects, arrays, and null are rejected rather than stringified
          if (typeof raw[f] === "object") return res.status(400).json({ error: "Invalid filter" });
          filter[f] = String(raw[f]);
        }
      }

      const users = await req.db.collection("users")
        .find(filter)
        .sort({ createdAt: -1 }) // server-controlled sort keeps paging stable
        .skip((page - 1) * pageSize)
        .limit(pageSize)
        .toArray();
      res.json({ users });
    });
Exploitation progression (attacker mindset)
This describes how attackers think at a high level. It avoids step-by-step exploit instructions.
Phase 1: Look for query flexibility
- Endpoints that accept "filter", "where", "query", "conditions", or "search JSON".
- Login or search endpoints that accept JSON bodies and respond differently based on crafted structure.
Phase 2: Identify query-shape control
- Can attacker-controlled input become an object instead of a string/number?
- Does the server merge user JSON directly into the database filter?
Phase 3: Expand impact conceptually
- Attackers aim to change authorization-related filters (tenantId/ownerId), bypass checks, or broaden result sets.
- They also look for performance impact (expensive queries) and data exposure (over-broad searches).
Tricky edge cases & bypass logic (conceptual)
- Type confusion: strings becoming objects/arrays in JSON bodies can change query semantics.
- Operator keys: keys starting with "$" introduce operator behavior; block them from untrusted input.
- Dot-path abuse: keys containing "." can target nested fields; treat as dangerous unless explicitly allowed.
- Sort/projection injection: user-controlled sort/projection can expose unexpected fields or stress indexes.
- Second-order filters: stored filter JSON rendered/executed later by jobs.
- Authorization coupling: security filters (tenantId/ownerId) must be enforced server-side and never overrideable by client filters.
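For the sort/projection item in the list above, one possible shape is to let the client name a field and a direction, and have the server build the actual sort object. This is a sketch under assumptions: `SORTABLE_FIELDS`, `PUBLIC_PROJECTION`, and the `{ field, direction }` request format are hypothetical choices, not something the examples earlier in this article define.

```javascript
// Hypothetical allow-list of fields that are indexed and safe to sort on.
const SORTABLE_FIELDS = new Set(["createdAt", "email"]);

// Hypothetical fixed projection: the client never chooses returned fields.
const PUBLIC_PROJECTION = { email: 1, role: 1, createdAt: 1, _id: 0 };

// Build a sort object from { field, direction }; anything unexpected
// falls back to the server default instead of reaching the database.
function buildSort(input) {
  const fallback = { createdAt: -1 };
  if (input === null || typeof input !== "object" || Array.isArray(input)) {
    return fallback;
  }
  const { field, direction } = input;
  if (typeof field !== "string" || !SORTABLE_FIELDS.has(field)) {
    return fallback;
  }
  return { [field]: direction === "asc" ? 1 : -1 };
}

console.log(buildSort({ field: "email", direction: "asc" })); // prints: { email: 1 }
console.log(buildSort({ field: "passwordHash" }));            // prints: { createdAt: -1 }
```

Falling back silently is one design choice; returning a 400 for unknown sort fields would be equally defensible and makes client bugs more visible.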
Safe validation & testing guidance (defensive verification)
- Confirm input typing: does the server treat a field as a primitive, or will it accept an object?
- Confirm query merge: check code paths where request JSON is used directly as a DB filter.
- Observe bounded effects: prove that structured input changes query behavior without attempting destructive actions.
- Check auth filters: ensure tenant/owner scoping is added server-side and cannot be widened by client filters.
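The "confirm input typing" and "bounded effects" checks above can be automated against a validator without sending operator payloads at all. The sketch below assumes a hypothetical `validate(body)` function that returns true when a body is accepted; the probes are an empty object and an empty array, which carry the wrong type but no query operators.

```javascript
// Report which fields still pass validation when a harmless non-primitive
// value is substituted for them.
function findLooselyTypedFields(validate, baseBody, fieldNames) {
  const probes = [{}, []];
  return fieldNames.filter((name) =>
    probes.some((probe) => validate({ ...baseBody, [name]: probe }))
  );
}

// Two hypothetical validators for a login body, for comparison.
const strictValidate = (body) =>
  typeof body.email === "string" && typeof body.password === "string";
const laxValidate = (body) => Boolean(body.email) && Boolean(body.password);

const base = { email: "a@example.com", password: "example-password" };
console.log(findLooselyTypedFields(strictValidate, base, ["email", "password"])); // prints: []
console.log(findLooselyTypedFields(laxValidate, base, ["email", "password"]));
// prints: [ 'email', 'password' ]
```

Any field reported by the helper is a candidate for query-shape control and deserves a code-level trace to the database call.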
Fixes that hold in production
1) Keep query shape server-controlled
- Accept primitives, not arbitrary query objects, for most endpoints.
- Use explicit mapping from request fields to filter fields.
2) Block operators and path keys
- Reject keys that start with "$" or contain "." unless strictly needed and validated.
- Disallow arrays/objects where primitives are expected.
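A key scanner for this rule might look like the following. It is a variant of the `rejectOperatorKeys` idea from the fixed pattern, written as a sketch that also descends into arrays; the name `hasUnsafeKeys` is hypothetical.

```javascript
// Returns true if any key, at any depth (including inside arrays),
// starts with "$" or contains ".".
function hasUnsafeKeys(value) {
  if (Array.isArray(value)) {
    return value.some((item) => hasUnsafeKeys(item));
  }
  if (value === null || typeof value !== "object") {
    return false; // primitives carry no keys
  }
  return Object.entries(value).some(
    ([key, child]) => key.startsWith("$") || key.includes(".") || hasUnsafeKeys(child)
  );
}

console.log(hasUnsafeKeys({ status: "active" }));          // prints: false
console.log(hasUnsafeKeys({ age: { $gt: 1 } }));           // prints: true
console.log(hasUnsafeKeys({ "profile.role": "admin" }));   // prints: true
console.log(hasUnsafeKeys({ tags: [{ $where: "true" }] })); // prints: true
```

A scanner like this is a backstop, not a replacement for allow-listing: a body with no unsafe keys can still carry unexpected fields or types.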
3) Validate and type inputs
- Schema validation (Zod/JSON Schema/Joi) with strict mode, length limits, and type coercion rules.
- Normalize/escape for any regex-like searches, and limit patterns to safe use cases.
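As a dependency-free stand-in for what a schema library such as Zod or Joi would enforce, the sketch below validates a search body strictly and escapes the search term before it is used in a prefix regex. The field names, the 64-character limit, and the `ALLOWED_STATUS` values are assumptions made for illustration.

```javascript
const ALLOWED_STATUS = new Set(["active", "disabled"]); // hypothetical values
const KNOWN_FIELDS = new Set(["q", "status"]);

// Strict parse: unknown fields, wrong types, and over-long strings are rejected.
function parseSearchBody(body) {
  if (body === null || typeof body !== "object" || Array.isArray(body)) {
    return { ok: false, error: "body must be an object" };
  }
  for (const key of Object.keys(body)) {
    if (!KNOWN_FIELDS.has(key)) return { ok: false, error: `unknown field: ${key}` };
  }
  const { q = "", status } = body;
  if (typeof q !== "string" || q.length > 64) {
    return { ok: false, error: "q must be a string of at most 64 characters" };
  }
  if (status !== undefined && (typeof status !== "string" || !ALLOWED_STATUS.has(status))) {
    return { ok: false, error: "status is not allowed" };
  }
  return { ok: true, value: { q: q.trim(), status } };
}

// Escape regex metacharacters so the term is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const parsed = parseSearchBody({ q: " a.b ", status: "active" });
const prefixPattern = new RegExp("^" + escapeRegex(parsed.value.q), "i");
console.log(prefixPattern.test("A.Bcd")); // prints: true
console.log(prefixPattern.test("axbcd")); // prints: false
```

In a real service the hand-written checks would usually be replaced by a schema definition in strict mode; the behavior to preserve is that unknown keys and non-primitive values fail closed.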
4) Secure authorization filters
- Always inject tenant/owner constraints on the server and keep them immutable.
- Never allow clients to supply "tenantId" or "ownerId" filters unless verified against session context.
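One way to keep the scoping immutable is to apply it in a single helper that every query passes through. The sketch below assumes the incoming filter has already been validated (no operator or dot-path keys) and that the session object exposes a `tenantId` string; both the helper name `scopeToTenant` and the session shape are hypothetical.

```javascript
// Drop any client-supplied scoping keys, then write the server's tenant last
// so it cannot be overridden by the validated filter.
function scopeToTenant(validatedFilter = {}, session) {
  if (!session || typeof session.tenantId !== "string" || session.tenantId === "") {
    throw new Error("missing tenant in session");
  }
  const { tenantId: _clientTenant, ownerId: _clientOwner, ...rest } = validatedFilter;
  return { ...rest, tenantId: session.tenantId };
}

const scoped = scopeToTenant({ status: "active", tenantId: "other" }, { tenantId: "t1" });
console.log(scoped); // prints: { status: 'active', tenantId: 't1' }
```

Failing with an error when the session carries no tenant is deliberate: an unscoped query is a worse outcome than a rejected request.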
Confidence levels (low / medium / high)
- Low: NoSQL is used, but you can't show user input affecting query shape.
- Medium: user input reaches query building, but operator/path control is not confirmed.
- High: confirmed user-controlled query shape (operators/paths/types) with repeatable, safe evidence and clear impact scope.
Interview-ready summaries (60-second + 2-minute)
60-second answer
NoSQL Injection is when untrusted input changes the structure or meaning of a NoSQL query. It often happens when apps merge user JSON directly into a query filter or allow operator keys. I validate by tracing input into the database filter and confirming query-shape control. Fixes are strict schema validation, allow-listing fields, blocking operator/path keys, and enforcing server-side authorization scoping.
2-minute answer
I treat NoSQL Injection as the same class as Injection: untrusted input reaching an interpreter, in this case the NoSQL query parser. The key difference is that NoSQL drivers often accept rich query objects, so the main risk is user-controlled query shape: operators, nesting, types, and paths. This often leads to broadening result sets, bypassing business checks, or breaking tenant isolation. I look for unsafe merges of request JSON into filters, type confusion, and operator/path keys. The durable remediation is to keep query shape server-controlled, validate and type all inputs, block operator and dot-path keys, and always enforce immutable server-side auth filters like tenantId/ownerId.
Checklist (quick review)
- Search code for direct use of request JSON in DB filters (find(req.body), find(req.body.filter)).
- Block operator keys ("$") and dot-path keys (".") from untrusted input.
- Schema-validate request bodies; enforce primitives where expected.
- Allow-list fields for filtering/sorting/projection; default-deny unknown fields.
- Enforce tenant/owner scoping server-side; never accept client overrides.
- Add query limits: max page size, timeouts, and indexed fields only.
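For the query-limits item, a small helper can centralize paging bounds so individual handlers cannot forget them. The defaults (page size 20, maximum 50) mirror the numbers used in the fixed pattern; the helper name `clampPaging` is hypothetical.

```javascript
// Normalize paging input: non-integers and values below 1 fall back to
// defaults, and page size is capped at maxPageSize.
function clampPaging(rawPage, rawPageSize, maxPageSize = 50) {
  const page = Number.isInteger(rawPage) && rawPage > 0 ? rawPage : 1;
  const pageSize =
    Number.isInteger(rawPageSize) && rawPageSize > 0
      ? Math.min(rawPageSize, maxPageSize)
      : 20;
  return { page, pageSize, skip: (page - 1) * pageSize };
}

console.log(clampPaging(3, 500));    // prints: { page: 3, pageSize: 50, skip: 100 }
console.log(clampPaging(-1, "abc")); // prints: { page: 1, pageSize: 20, skip: 0 }
```

Timeouts are a separate control: with the MongoDB Node.js driver, a find cursor's maxTimeMS() method can be used to bound server-side execution time, with the limit chosen per endpoint.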
Remediation playbook
- Contain: disable "advanced filter JSON" or restrict it to trusted admins temporarily.
- Inventory: identify all endpoints accepting filter objects, sort/projection objects, and query builders.
- Fix boundary: replace raw merge patterns with server-constructed filters.
- Validate: apply strict schemas; reject operator keys and dot-paths; enforce primitive types.
- Authorize: inject immutable scoping (tenant/owner) based on the session and keep it non-overridable.
- Guardrails: enforce page size limits, query timeouts, and safe regex usage; add audit logs for filter usage.
- Prevent regressions: add tests and CI checks that forbid passing request objects directly to DB query APIs.
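The regression-prevention step can start as a simple text scan run in CI. The sketch below is a heuristic only: the pattern list is an assumption about how request objects are typically named (`req.body`, `req.query`, `req.params`), and it will miss aliased variables, so it complements code review rather than replacing it.

```javascript
// Heuristic patterns: a query method called directly with a request object.
const RISKY_PATTERNS = [
  /\.(find|findOne|updateOne|updateMany|deleteOne|deleteMany|countDocuments)\(\s*req\.(body|query|params)\b/,
];

// Return the 1-based line numbers and text of lines matching any pattern.
function findRiskyLines(sourceText) {
  return sourceText
    .split("\n")
    .map((text, index) => ({ line: index + 1, text: text.trim() }))
    .filter(({ text }) => RISKY_PATTERNS.some((pattern) => pattern.test(text)));
}

const sample = [
  "const users = db.collection('users');",
  "await users.find(req.body.filter).toArray();",
  "await users.find({ email }).toArray();",
].join("\n");

console.log(findRiskyLines(sample));
// prints: [ { line: 2, text: 'await users.find(req.body.filter).toArray();' } ]
```

A CI job could read each source file, call `findRiskyLines`, and fail the build when the result is non-empty.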
Interview Questions & Answers (Easy → Hard)
Easy
- What is NoSQL Injection?
A: Plain-English: it's when user input changes the meaning of a NoSQL query. Deeply, it's untrusted input influencing query shape (operators/types/paths) rather than only values.
- Is NoSQL "safe from injection" compared to SQL?
A: Plainly, no. Deeply, NoSQL often uses query objects, so unsafe merges or operator exposure create injection risk.
- What's the most common developer mistake?
A: Plainly, using request JSON as the filter. Deeply, merging attacker-controlled objects into find()/where() so operators or nested conditions become attacker-controlled.
- What's the single best defense?
A: Plainly, keep query shape server-controlled. Deeply, allow-list fields, block operator/path keys, and validate types strictly.
- Why do operator keys matter?
A: Plainly, they change logic. Deeply, operators let the query behave differently than "field equals value," which can broaden matches or bypass checks.
- What's a safe way to accept filters?
A: Plainly, accept only known fields. Deeply, map request inputs to an allow-listed server-built filter and reject unknown structures.
Medium
- Scenario: A search endpoint accepts {"filter": {...}}. What do you check first?
A: Plainly, whether it's merged into DB queries. Deeply, trace if filter goes directly to find() and whether operator/path keys are blocked and types are enforced.
- Scenario: A login endpoint uses a Mongo query built from JSON body fields.
A: Plainly, that can allow logic manipulation. Deeply, ensure the code enforces primitives and never accepts objects for credential fields; use explicit comparisons and strict schema validation.
- Follow-up: What evidence makes it "high confidence"?
A: Plainly, proof input changes query semantics. Deeply, code-level trace to unsafe query merge plus repeatable behavior change attributable to query-shape control.
- Scenario: User controls sort or projection objects.
A: Plainly, that can leak data. Deeply, projection can expose sensitive fields; sort can cause heavy scans. Allow-list both and keep defaults safe.
- Follow-up: How do you secure tenant isolation in queries?
A: Plainly, enforce it on the server. Deeply, inject tenantId/ownerId constraints from session context and prevent clients from widening or overriding them.
- Scenario: The app supports regex search.
A: Plainly, regex can be abused. Deeply, only allow safe patterns (e.g., prefix match), escape user input, and enforce limits to avoid expensive queries.
- Follow-up: How do you avoid performance-based abuse?
A: Plainly, limit results. Deeply, enforce max page size, require indexes, apply timeouts, and restrict unbounded sorts/projections/regex.
Hard
- Scenario: You fixed operator keys but still accept arbitrary field names.
A: Plainly, still risky. Deeply, field-name control can target sensitive nested fields or bypass intended constraints; allow-list fields and block dot-paths.
- Scenario: Filters are stored in DB and executed later by a worker.
A: Plainly, that's second-order injection. Deeply, stored untrusted query objects can bypass front-end validation; enforce schema validation at execution time too.
- Follow-up: What's the trade-off between flexibility and security for filtering?
A: Plainly, more flexibility increases risk. Deeply, safe design uses a restricted query DSL with allow-listed operators and fields, plus strong typing and limits.
- Scenario: GraphQL resolver passes args directly into Mongo queries.
A: Plainly, it can become NoSQL injection. Deeply, treat GraphQL args as untrusted; map them to allow-listed fields/operators and validate types and limits.
- Follow-up: How would you create regression prevention?
A: Plainly, CI + tests. Deeply, ban passing request objects into query APIs, add lint/grep checks, and write tests ensuring operator/path keys are rejected.
- Scenario: "We sanitize by deleting keys that start with $." Is that enough?
A: Plainly, not always. Deeply, dot-paths and type confusion can still alter semantics; you need a strict schema and allow-list design, not ad-hoc sanitization.
- Follow-up: Where does NoSQL Injection map in OWASP?
A: Plainly, Injection. Deeply, it maps to OWASP Top 10 (2021) A03 Injection, often enabled by insecure design (A04) and, where permissive defaults are left in place, security misconfiguration (A05).
- Scenario: Client can pass tenantId as part of filter.
A: Plainly, that's dangerous. Deeply, tenant scoping must be derived from the authenticated session and be immutable; never trust client-provided tenantId for access control.
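The "restricted query DSL" mentioned in the Hard answers can be sketched as a small compiler from client clauses to a server-built filter. Everything here is illustrative: the clause format `{ field, op, value }`, the `FIELD_RULES` table, and the size limits are assumptions, and only the MongoDB operator names ($eq, $in, $gte, $lte) are real.

```javascript
// Hypothetical rules: which fields are filterable, with which ops and value type.
const FIELD_RULES = new Map([
  ["status", { type: "string", ops: new Set(["eq", "in"]) }],
  ["age", { type: "number", ops: new Set(["eq", "gte", "lte"]) }],
]);
const MONGO_OPERATOR = new Map([
  ["eq", "$eq"], ["in", "$in"], ["gte", "$gte"], ["lte", "$lte"],
]);

// Compile client clauses into a filter; the client never writes operator keys.
function compileFilter(clauses) {
  if (!Array.isArray(clauses) || clauses.length > 10) {
    throw new Error("invalid clause list");
  }
  const filter = {};
  for (const clause of clauses) {
    if (clause === null || typeof clause !== "object") throw new Error("invalid clause");
    const { field, op, value } = clause;
    const rule = FIELD_RULES.get(field);
    if (!rule || !rule.ops.has(op)) throw new Error("field or operator not allowed");
    // "in" takes a bounded array of primitives; every other op takes one primitive.
    const values = op === "in" ? value : [value];
    const valid =
      Array.isArray(values) &&
      values.length > 0 &&
      values.length <= 20 &&
      values.every((v) => typeof v === rule.type);
    if (!valid) throw new Error("invalid value");
    filter[field] = { ...filter[field], [MONGO_OPERATOR.get(op)]: value };
  }
  return filter;
}

console.log(JSON.stringify(compileFilter([
  { field: "status", op: "in", value: ["active", "trial"] },
  { field: "age", op: "gte", value: 18 },
])));
// prints: {"status":{"$in":["active","trial"]},"age":{"$gte":18}}
```

Because operators are looked up from a fixed map and values must be primitives of the declared type, a clause can neither introduce a new operator nor smuggle an object into a value position. Tenant or owner scoping would still be added by the server after compilation.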