In 2025, “vibe coding” became a real workflow, for better or for worse: I described what I wanted, accepted what the model gave me, pasted the next error message — rinse and repeat. The code “worked,” but I often hadn’t actually read or understood it (particularly when it was in a language I was unfamiliar with).

The term “vibe coding” was coined by Andrej Karpathy in a viral post describing a mode where you “forget that the code even exists.” Simon Willison later offered the cleanest definition that engineering teams could use in policy and code review: Vibe coding is “generating code with AI without caring about the code that is produced.”

That attitude can be perfectly rational for prototypes and one-off tools. Think small black-box modules that perform simple tasks and appear to perform them correctly — the kind of “I just need it to do this one thing for me” that’s fine for makers and hobbyists.

The hangover arrives when those “throwaway” artifacts become dependencies, revenue features, or embedded firmware that must be maintained, audited, and secured for years. If 2025 was the year everyone shipped faster, 2026 is the year many teams discover what they shipped.

What Is Meant by “Vibe Coding” (and What Isn’t)

Plenty of engineers used AI in 2025 without vibe coding: generating boilerplate code, drafting tests, explaining unfamiliar APIs, or producing a first pass that still went through normal review. That is simply AI-assisted development.

Vibe coding is the “no one is accountable for the internals” mode. The distinction lies not in the tool but in the posture: Do we treat AI output as untrusted code that must survive review, tests, and security scrutiny like any external contribution? Or do we treat it as a shortcut around those constraints?

In practice, teams drift into vibe coding via small, reasonable steps: a quick script here, a glue layer there, a “temporary” automation, a generated migration, a generated CI workflow, a generated Terraform module. Each piece is individually plausible. The system-level result is a codebase with more surface area, more dependencies, and fewer people who actually understand why it works.

Speed Went Up, Stability Did Not

The most sobering public data point I’ve seen is DORA’s research on generative AI adoption. In the 2024 DORA report summary, Google reported that increased AI adoption was accompanied by an estimated 7.2% reduction in delivery stability and an estimated 1.5% reduction in delivery throughput.

DORA’s dedicated 2025 report framed those numbers as estimated change per 25% increase in AI adoption, and emphasized the gap between how fast developers feel and what delivery performance reflects.

That pattern maps cleanly onto what vibe coding optimizes for: local speed (getting a thing to run) rather than global reliability (keeping a system stable as it evolves). AI makes it cheaper to make changes, and that encourages bigger diffs and more frequent changes. If your release hygiene is marginal to begin with, AI will happily amplify the consequences.

There is also a second, less comfortable possibility: Sometimes AI does not actually save time for experienced developers working in familiar codebases, even when it feels like it should. A randomized controlled trial by METR found that experienced open-source developers took about 19% longer with early-2025 AI tools than without them, while still believing they were faster. The takeaway is not that “AI is bad,” but that “your gut is not an instrument.” If a workflow reduces understanding or increases verification overhead, the speed story can flip.

Quality Failure Modes That Look Fine in a Demo

Vibed code often passes a superficial test because it is optimized for plausibility (“so plausible I can’t believe it!”). It tends to fail later in ways that are expensive to diagnose:

  • Overfitted glue: brittle code that assumes the exact shape of today’s inputs, APIs, or HTML, with no contracts or versioning.
  • Hidden complexity: small “one-file” scripts that quietly accrete retry logic, caching, concurrency, and edge-case handling until these scripts become complete systems.
  • Non-obvious performance debt: N+1 queries, quadratic loops, unbounded queues, eager loading, or “just add caching” patterns that mask root causes.
  • Configuration hazards: CI/CD workflows, cloud IAM rules, Kubernetes manifests, and Terraform modules that “work” while embedding dangerous defaults.
  • Testing theatre: generated tests that assert the current behavior rather than the intended behavior, or that mock away the actual risk.

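To make the “non-obvious performance debt” item concrete, here is a minimal Python sketch (using a hypothetical authors/posts schema) of the N+1 query pattern next to its single-query equivalent. Both return the same answer; only one of them scales.

```python
import sqlite3

# In-memory demo schema; the table and column names are hypothetical.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO posts VALUES (1, 1, 'Engines'), (2, 1, 'Notes'), (3, 2, 'Compilers');
""")

def titles_n_plus_one():
    # One query for the authors, then one query *per author*:
    # invisible in a demo, expensive once there are thousands of rows.
    result = {}
    for author_id, name in db.execute("SELECT id, name FROM authors ORDER BY id"):
        rows = db.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result

def titles_single_join():
    # The same answer in a single round trip via a JOIN.
    result = {}
    for name, title in db.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ):
        result.setdefault(name, []).append(title)
    return result

assert titles_n_plus_one() == titles_single_join()
```

In a demo with two authors, both versions look instant; the debt only surfaces in production, which is exactly why it slips through review.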
The consistent theme is missing intent. Human-written systems usually have some trace of why a decision was made. Vibe-coded systems often have only what: It does this now, so tests assert that it does this now.
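The “testing theatre” failure mode can be sketched in a few lines. Assume, purely for illustration, a spec that says orders of 100.00 or more get a 10% discount; the generated implementation uses `>` where the spec implies `>=`, and tests derived from current behavior never notice:

```python
# Assumed spec for illustration: orders of 10_000 cents (100.00) or more
# get a 10% discount. The generated code has an off-by-one: > instead of >=.
def discounted_total(cents: int) -> int:
    return cents * 9 // 10 if cents > 10_000 else cents

# "Testing theatre": tests generated from current behavior. They pass,
# because they merely snapshot what the code does today.
assert discounted_total(15_000) == 13_500
assert discounted_total(5_000) == 5_000

# A test written from *intent* probes the boundary the spec names and
# would fail, exposing the bug:
#   assert discounted_total(10_000) == 9_000   # actual result: 10_000
```

The snapshot tests are not wrong about what the code does; they are silent about what it should do, which is the point.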

Security: “It Compiles” Is Not a Security Property

Vibe coding tends to optimize for “works on my machine” rather than “safe under adversarial conditions.” That distinction is not theoretical. An empirical study of AI-generated code snippets found a “high likelihood of security weaknesses,” including a large fraction of Python and JavaScript snippets flagged with Common-Weakness-Enumeration-aligned issues in real GitHub projects.

Studies such as this are not a verdict on any one tool; static analysis has false positives, and “flagged weakness” is not always “exploitable in your context.” But the direction is clear: Plausible code is not necessarily safe code, and security is exactly the kind of non-functional requirement that vibe coding ignores.

The security risks that show up most often in AI-heavy codebases are also the boring classics:

  • Injection and unsafe construction: SQL injection, command injection, unsafe deserialization, templating problems, and weak input validation.
  • Secrets leakage: keys committed in “temporary” scripts, logs containing tokens, and debug endpoints that escape into production.
  • Auth and session mistakes: DIY authentication flows, missing authorization checks, and misunderstanding of identity boundaries across services.
  • Crypto footguns: “roll your own” randomness, wrong modes, wrong key handling, and insecure defaults that still pass tests.

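The injection item is the classic case. A minimal Python/sqlite3 sketch with a hypothetical `users` table shows how string-built SQL leaks every row, while the parameterized version keeps attacker input out of the query grammar:

```python
import sqlite3

# Hypothetical demo table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
db.execute("INSERT INTO users VALUES ('alice', 0), ('root', 1)")

def find_user_unsafe(name: str):
    # String concatenation: attacker-controlled input becomes SQL.
    return db.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(name: str):
    # Parameterized query: the driver treats the input strictly as data.
    return db.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

payload = "' OR '1'='1"
assert find_user_unsafe(payload) == [("alice",), ("root",)]  # every row leaks
assert find_user_safe(payload) == []                         # no such user
```

Both functions pass a happy-path test with a normal username, which is why this bug survives the demo.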
Vibe coding is not uniquely insecure; it is insecure in a very specific way: It makes it easy to produce a lot of code without the understanding needed to spot the security assumptions you are accidentally making.

Supply-Chain Risk: Package Hallucinations and “Dependency-Shaped” Attacks

Once you accept AI output without reading it, dependencies become a primary attack surface. One risk is “package hallucination”: The model suggests a dependency that does not exist. An attacker can publish a package with that name and wait for someone to install it. Researchers presented an analysis of the phenomenon at USENIX Security, framing it as a form of package confusion risk in the software supply chain.

The deeper point is that AI makes “try this library” a default move. In a vibe-coding loop, adding dependencies feels cheaper than understanding the code you already have. That increases:

  • the number of transitive dependencies you are implicitly trusting,
  • the speed at which unknown packages enter your build, and
  • the difficulty of auditing and patching when a supply-chain incident hits.
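One cheap mitigation is to refuse unreviewed names before they ever reach the package manager. The sketch below is a minimal allowlist gate, not a complete supply-chain control; all of the package names, including the “hallucinated” one, are hypothetical:

```python
# Dependencies that a human has already reviewed and approved (hypothetical).
APPROVED = {"requests", "numpy", "pyserial"}

def gate_install(packages):
    """Split a proposed dependency list into (approved, blocked)."""
    approved = [p for p in packages if p.lower() in APPROVED]
    blocked = [p for p in packages if p.lower() not in APPROVED]
    return approved, blocked

# An AI suggestion mixing a real package with a hallucinated one:
ok, rejected = gate_install(["requests", "fastjson-utils"])
assert ok == ["requests"]
assert rejected == ["fastjson-utils"]  # goes to human review, not to pip
```

The gate is deliberately dumb: its job is to convert “the model suggested it” into “a person looked at it,” which is the step vibe coding skips.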

For Embedded and Industrial Code, Failure Modes Get Less Forgiving

Web software can often fail “softly,” meaning it can degrade without causing immediate physical harm or permanent damage: an endpoint returns 500s, a feature is temporarily disabled, traffic is rerouted, a rollback happens, users retry later. Embedded systems, on the other hand, tend to fail as intermittent field faults, timing bugs, safety incidents, or unrecoverable bricking. Vibe-coded firmware is especially prone to:

  • Confidently wrong hardware details: register names, bitfields, timing values, and “magic numbers” that are plausible but incorrect.
  • Concurrency and interrupt hazards: race conditions around ISRs, DMA, RTOS primitives, and shared buffers that only show up under load.
  • HAL misuse: mixing vendor HAL calls with direct register writes in ways that violate assumptions (and create non-reproducible bugs).
  • Protocol edge cases: I2C/SPI/UART drivers that pass smoke tests but fail with clock stretching, noise, bus contention, or when simply using longer cables.
  • Watchdog and recovery gaps: code that “works” in the lab without robust brownout handling, safe boot, or recovery paths.

None of this is new. What changes with vibe coding is how easily low-understanding code can enter the codebase and how long it can survive if it “mostly works.”

The Human Factor: Distrust, Overtrust, and Skills Debt

Even at peak hype, developers were not uniformly confident in AI output. Stack Overflow’s 2025 survey data also showed that more developers distrusted AI accuracy than trusted it, with only a small fraction “highly” trusting it.

The trap is that low trust does not automatically produce careful review. In practice it often produces “just keep prompting until the tests go green,” which can become cargo-cult verification — especially if tests are thin, missing, or themselves generated from the current behavior.

The longer-term cost stretches from technical debt to skills debt. When code becomes something you steer rather than write, organizations can end up with fewer people who can debug at the metal, reason about edge cases, simplify systems, and do the “boring” work of building reliable interfaces. You don’t notice until the system breaks outside the small slice of reality your tests cover.

Governance Is Catching Up: Treat AI as an Untrusted Contributor

The practical response emerging across 2025 to 2026 is “policy + pipeline,” not “ban the tool.” NIST’s SSDF community profile for generative AI is explicit about the stance engineering teams should default to: It assumes all source code should be evaluated for vulnerabilities and other issues before use, without distinguishing between human-written and AI-generated code.

That is the only posture that scales. The question is not “did a model write this?” but “would we accept this from an external contributor?”

A Minimum Viable Safety Net for AI-Heavy Teams

If you want the speed benefits of AI without the vibe-coding hangover, you need a hard floor under the process. The exact tooling varies, but the principles are stable.

Keep diffs small. AI makes it easy to ship big changes. Reliability punishes that. Use pull request size limits, staged rollouts, and feature flags.
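Staged rollouts do not require heavy machinery. The sketch below shows one common technique — deterministic, hash-based percentage bucketing — not any specific product’s API; the feature and user names are hypothetical:

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically bucket a user into a staged rollout.

    The same user always lands in the same bucket for a given feature,
    so raising `percent` only ever adds users, never flips existing ones.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:2], "big") % 100  # stable value in 0..99
    return bucket < percent

# 0% exposes nobody, 100% exposes everybody; anything in between is stable
# across releases because the bucket depends only on the inputs.
assert not in_rollout("user-42", "new-parser", 0)
assert in_rollout("user-42", "new-parser", 100)
```

Hashing the feature name into the bucket keeps rollouts independent: a user who is early for one feature is not automatically early for all of them.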

Require tests that fail meaningfully. Unit tests are not enough if integration boundaries are where you bleed. Add contract tests, fuzzing where relevant, and regression tests tied to incidents.
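A contract test can be very small and still catch the integration failures that unit tests miss. The sketch below checks the shape of a payload at a boundary using only the standard library; the field names and types are hypothetical:

```python
# Hypothetical contract for a response at an integration boundary:
# each field name maps to the type the consumer relies on.
CONTRACT = {"id": int, "name": str, "roles": list}

def violations(payload: dict) -> list:
    """Return a list of human-readable contract violations (empty = OK)."""
    problems = []
    for field, expected_type in CONTRACT.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems

assert violations({"id": 7, "name": "ada", "roles": ["admin"]}) == []
assert violations({"id": "7", "name": "ada"}) == [
    "wrong type for id",
    "missing field: roles",
]
```

Run against recorded or live responses in CI, a check like this fails the moment a producer silently changes a field — exactly the kind of drift that generated glue code tends to paper over.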

Turn on the scanners and make them blocking. Static analysis, secret scanning, dependency scanning, and container/IaC scanning should run by default, and “warn-only” should be temporary.

Lock dependencies and track provenance. Use lockfiles, pinned versions, reproducible builds, SBOMs, and signed artifacts. “Just install the suggested package” is not a workflow.

Make review about intent, not syntax. Require a short design note or PR description that explains why the change exists, its risks, and how it was tested.

Instrument production. AI often shifts effort from writing code to debugging systems. Observability is not optional: logs, metrics, tracing, and explicit SLOs are how you keep speed without chaos.

None of that is novel. The point is that AI increases the pressure on those fundamentals because it lowers the marginal cost of change.

Legal and Compliance: Provenance Questions Do Not Go Away

Vibe coding also collides with compliance because it blurs provenance: What data influenced the output, what licensing constraints exist, and who is accountable for what shipped? Regulators are moving, even if unevenly. The European Commission published a General-Purpose AI Code of Practice on July 10, 2025, positioned as a voluntary tool to help providers demonstrate compliance with the AI Act’s general-purpose AI obligations, which apply from August 2, 2025.

In the UK, government consultation explicitly framed copyright and AI training as a live policy problem; the consultation ran from December 17, 2024 to February 25, 2025.

The pragmatic engineering implication is simple: If your organization has licensing or compliance constraints, AI-generated code does not magically bypass them. Treat provenance as an operational requirement, not a philosophical debate that can be postponed until legal shows up at your desk.

The Actual Lesson of 2025

2025 made it cheaper to create software. It did not make it cheaper to own software. Teams that kept vibe coding in the prototype lane got a burst of speed. Teams that let it leak into production bought a three-part hangover:

  • More instability, because change velocity outpaced release discipline.
  • More security exposure, because plausible code is not secure code, and dependency mistakes become easier to introduce at scale.
  • More accountability ambiguity, because provenance, review, and licensing questions arrive later—usually when money is on the line.

Obviously, none of this argues for abandoning AI tools. It argues for the boring, correct stance: AI accelerates engineering, but it does not replace engineering.


Questions or Comments?

If you have any questions or comments, I’d love to hear your experiences. Write to me at brian.williams@elektor.com or find me on X @briantw.


Editor's Note: This article (230181-R-01) appears in Elektor March/April 2026.