The economic failures of penetration testing
Offensive security seems to be getting its time in the sun again with the rise of AI-powered tools that promise fully autonomous penetration testing. But is it really all that different from the old days?
The persistent failure of the penetration testing market is often framed as a technical problem: not enough talent, insufficient automation, immature tooling, or attackers moving faster than defenders. This framing is comforting because it implies that incremental technical progress will eventually close the gap. In reality, the core problem is economic: the market systematically rewards the appearance of security over the reduction of risk, and it does so through a predictable set of incentive failures that any A-Level Economics student could explain.
The core problem: it is not a market for outcomes, it is a market for signals.
Information Asymmetry and Adverse Selection
Akerlof's market for lemons is a classic example of how information asymmetry can lead to market failure. In the used car market, the seller knows more about the quality of the car than the buyer. Because the buyer cannot know the true quality of a car before buying it, they pay for the expected (average) quality. High-quality sellers then leave the market, because the average price falls below what their cars are worth, which drags the average quality of the remaining market down further.
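To make the unraveling concrete, here is a minimal simulation sketch. It assumes quality is uniformly distributed on [0, 1] and that each seller's reservation price equals their car's quality; the numbers are purely illustrative, not drawn from any real market.

```python
# A minimal sketch of Akerlof-style unraveling. Assumptions (not from
# any real dataset): quality is uniform on [0, 1], buyers offer the
# average quality of cars still for sale, and a seller withdraws
# whenever the offer is below their car's quality.

def simulate_lemons(n_rounds: int = 10) -> None:
    ceiling = 1.0  # highest quality still offered for sale
    for round_no in range(1, n_rounds + 1):
        # Buyers cannot observe quality, so they offer the expected
        # (average) quality of the remaining pool: uniform on
        # [0, ceiling] has mean ceiling / 2.
        offer = ceiling / 2
        # Every seller whose quality exceeds the offer exits,
        # lowering the quality ceiling for the next round.
        ceiling = offer
        print(f"round {round_no}: offer {offer:.4f}, "
              f"best remaining quality {ceiling:.4f}")

if __name__ == "__main__":
    simulate_lemons()
```

Each round the offer halves and the best remaining sellers leave, so the market converges towards trading only the worst lemons. Replace "car quality" with "pentest quality" and the same dynamic applies.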
In a similar vein, the penetration testing market adversely selects against the best pentesters. Because buyers cannot observe quality before purchase, they fall back on proxy signals: certifications, report length, vulnerability counts, or compliance acceptance. None of these correlate tightly with real attacker cost or the likelihood of a breach. Plenty of firms hold the same CREST certifications, so the buyer prices for the average. High-quality providers, whose work is slower, more expensive, and harder to explain in a procurement meeting, are crowded out. Lower-quality providers, who can produce predictable findings cheaply and repeatedly, come to dominate the market.
Over time, this drives a race to the bottom. The equilibrium product becomes the "acceptable" pentest: one that produces familiar, low-hanging fruit, fits procurement expectations, and minimizes organizational disruption, regardless of whether it meaningfully models attacker sophistication.
Moral Hazard and Perverse Feedback Loops
Moral hazard arises when one party to a transaction has an incentive to act in ways that are not in the other party's best interest. In the pentesting market, the firm's incentive is to find issues, not to actually reduce risk. There is no upside to uncovering deep systemic flaws that are politically difficult or expensive to fix.
Security teams, meanwhile, are often evaluated on audit success rather than security posture. Their rational incentive is to commission work that passes compliance checks with minimal internal friction. A pentest that uncovers uncomfortable architectural weaknesses creates political cost without guaranteeing budget or remediation authority. A shallow pentest that confirms existing narratives is safer.
This moral hazard produces predictable behavior. Pentests focus on low-hanging fruit. Reports emphasize volume over depth. The same vulnerability classes reappear annually, not because they are hard to fix, but because no one is economically rewarded for ensuring that they are fixed.
Modern application-security tooling exacerbates the problem. Better tools increase visibility into real risk, but visibility generates work. Findings require triage, prioritization, fixes, regression testing, and cross-team coordination. For engineering organizations already under delivery pressure, security improvements often appear as a pure tax.
From a narrow organizational perspective, it is rational to underinvest in tools or tests that surface actionable truth without providing remediation capacity. A weak pentest that produces a small, predictable list of issues is less disruptive than a rigorous one that reveals hundreds of real weaknesses. In this sense, insecurity becomes a form of organizational equilibrium. Unless a tool makes remediation low-friction, it will not be readily adopted.
The Compliance Demand Proxy
Compliance frameworks like SOC 2 and ISO 27001 further distort incentives by acting as demand proxies for security. Pentests are purchased not to find meaningful security issues, but because an external checklist requires them. This moves the goalposts away from realistic attacker behavior and towards helping the organization pass the audit.
Fixed-scope, time-boxed engagements are imposed regardless of system complexity or change velocity. Success is defined by the presence of a report, not by the absence of exploitable paths. In this environment, pentesting becomes performative: existing to satisfy third parties, not to inform internal decision-making.
This segment of demand selects almost entirely for security theater.
Pricing Failures and the Race to the Bottom
The dominant pricing models in pentesting — flat fees and hourly rates — are poorly aligned with outcomes. They decouple compensation from exploit difficulty, business impact, or remediation success. Competitive pressure pushes firms to reduce costs through junior staffing, templated methodologies, and superficial checklist-based testing.
Because quality remains largely unobservable, price becomes the primary axis of competition. The market clears not at the price of risk reduction, but at the price of plausible deniability.
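One way to formalize this (a stylized sketch with generic variables, not a model of any particular firm): under a flat fee $p$, a firm choosing unobservable effort $e$ at increasing cost $c(e)$ solves

$$\max_{e}\ \pi(e) = p - c(e), \qquad c'(e) > 0 \ \Rightarrow\ e^{*} = e_{\min},$$

so profit is maximized at the minimum effort the buyer will accept. Only if compensation depends on verified outcomes, say $\pi(e) = p(v(e)) - c(e)$ with the fee increasing in delivered value $v$, does effort above the minimum become profitable.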
Positive Externalities
Goods with positive externalities tend to be under-consumed, because the benefits that spill over to the rest of the ecosystem are not felt directly by the buyer and so are not priced in.
Security improvements generate positive externalities. A company that fixes systemic vulnerabilities reduces risk not only for itself, but for customers, partners, and the broader ecosystem. However, the cost of remediation is borne immediately and internally, while much of the benefit is diffuse, delayed, and external.
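The textbook treatment makes the wedge explicit (a sketch with generic benefit and cost curves, not measured security data): a firm invests in remediation $x$ until its private marginal benefit $B'(x)$ equals marginal cost $C'(x)$, while the social optimum also counts the marginal external benefit $E'(x)$ to customers and partners:

$$B'(x_{p}) = C'(x_{p}) \qquad \text{versus} \qquad B'(x_{s}) + E'(x_{s}) = C'(x_{s}).$$

With $E'(x) > 0$ and diminishing marginal benefits, $x_{p} < x_{s}$: left to its own incentives, the firm stops remediating before the socially optimal point.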
This is why companies often ship cheap fixes that leave fundamental architectural weaknesses in place, and why a bug fixed after one pentest tends to resurface in a slightly different form in the next.
Re-aligning Incentives
The failures described above persist not because organizations misunderstand security, but because the current market clears at an equilibrium optimized for compliance, cost containment, and plausible deniability. Any attempt to "fix" the penetration-testing market must therefore focus on changing incentives rather than improving technical capabilities in isolation.
Looking ahead over the next five years, how do we improve the market? There is no perfect answer, but a few directions come to mind:
- Re-internalize external responsibilities: move from one-off pentests that hand all long-term responsibility to the buyer towards a continuous approach that expects long-term, consistent outcomes from the seller. Security products should be designed to stand the test of time, and be viewed as a way to reduce the security team's responsibility.
- Align pricing with outcomes: sell actionable outcomes that include both discovery and triage/remediation. This reduces noise and prevents information that cannot be acted on from being marketed as value. Security tooling that creates more work for engineering teams, with no corresponding budget or authority, is not sustainable. Tools should make remediation low-friction, and organizations should view them as reducing work, not increasing it.
- Reframe compliance as a lagging indicator: rather than being a primary driver of demand, compliance artifacts should be seen as byproducts of a secure system (instead of its objective!). This is a hard one because it requires shifting internal metrics.
Attempts to improve security by increasing technical capability alone will fail. More accurate testing, better tools, and deeper analysis increase the volume of information without changing who bears the cost of acting on it. The result is a market that systematically undervalues high-signal work and selects against those who produce it.
In this sense, the failures of the pentesting market are not unique. They mirror the broader economics of prevention: costs are immediate, benefits are invisible, and success is defined by the absence of events that cannot be proven to have been avoided. Markets built on such foundations require deliberate incentive design to function at all.
Absent that design, penetration testing will remain what it largely is today: not a mechanism for reducing risk, but a ritual that allows organizations to proceed as if they had.