How to Get Your AI-Generated, Vibe-Coded Medical Device FDA-Cleared

April 27, 2026

AI/ML · Cybersecurity · Regulatory · Software

Vibe-coded medical devices and the path to FDA clearance.

In early 2025, the software engineering world was introduced to a new term that quickly moved from a viral social media post to the Merriam-Webster dictionary: vibe coding. Coined by AI researcher Andrej Karpathy, the term describes a paradigm where the engineer specifies intent in natural language and accepts the AI's generated code with light or no manual editing [1]. The human acts as the director; the Large Language Model (LLM) acts as the author. Tools like Cursor, Claude Code, ChatGPT Codex, Lovable, Replit, Antigravity, GitHub Copilot, and Windsurf have put this paradigm in reach of anyone who can describe what they want — from solo clinicians prototyping algorithms to enterprise teams shipping production software.

But the paradigm has already accelerated beyond vibe coding. We are now entering the era of Agentic Code, where autonomous AI agents — Devin, Aider, the agentic modes inside Claude Code and Cursor, Bolt, v0, and increasingly capable IDE-native agents like Antigravity — write, test, and deploy software iteratively without human prompting. The byproduct is Dark Code: any code that no human eye has actually read. Every modern codebase contains some. Vibe coding raises the share. Agentic workflows raise it further. The question is never whether a codebase has dark code, but how much, and what surrounds it.

For developers building consumer web apps, this autonomous generation is a revelation. But for engineers and regulatory affairs professionals in the medical device industry, the concept of "Dark Code" triggers immediate alarm bells. Medical device software must comply with rigorous standards: IEC 62304 for software life-cycle processes, ISO 14971 for risk management, and ISO 13485 for quality management systems [2] [3].

The prevailing sentiment among many regulatory consultants is that a vibe-coded codebase is inherently incompatible with FDA clearance. They argue that AI-generated code is too opaque, too prone to unverified dependencies, and too detached from traditional requirements traceability to ever pass regulatory muster.

I disagree.

The vibe coding paradigm itself is perfectly acceptable for medical devices. The code generated by an LLM does not need to be ripped out and rewritten from scratch by a human engineer. Instead, it needs to be adapted, verified, and wrapped in the appropriate regulatory evidence. Look at the FDA's history of absorbing paradigm shifts in software development, from the Command-Line Interface (CLI) to the Software Development Kit (SDK) to containerization. The agency's frameworks are already equipped to handle vibe code.

In this article, I'll explore the data behind AI/ML Software as a Medical Device (SaMD) clearances, examine historical precedents that prove the FDA's adaptability, and outline the exact steps required to bring a vibe-coded prototype to full regulatory specification.

The SaMD Landscape in Numbers

Before discussing how to clear a vibe-coded or autonomously generated device, understand the current state of Software as a Medical Device (SaMD) clearances. Using the Innolitics FDA Browser, I analyzed every SaMD clearance from 2016 through April 2026. The data reveals a regulatory environment that is not only highly active but also increasingly comfortable with complex software architectures.

The Exponential Rise of SaMD Clearances

The FDA has cleared 2,589 SaMD devices since 2016. Of those, 963 (37.2%) are AI/ML-enabled. As Figure 1 illustrates, the volume of SaMD clearances is climbing rapidly, with 2025 setting a new record of 390 clearances, and AI/ML now making up 59% of that annual total.

Figure 1: Annual SaMD clearances since 2016. 2025 set a record at 390, with AI/ML-enabled devices now 59% of the total.

This growth is not limited to a single medical specialty. While Radiology remains the gravitational center of SaMD, accounting for roughly 63.6% of all clearances, other panels such as Cardiovascular, Clinical Chemistry, and Neurology are steadily gaining ground (Figure 2).

Figure 2: SaMD clearances by medical specialty panel. Radiology leads at 63.6%, with Cardiovascular, Clinical Chemistry, and Neurology gaining ground.

The Fast Lane is Real

One of the most persistent myths in the medical device industry is that clearing a software device takes years. My data shows otherwise. The median review time (from the date the FDA receives the submission to the decision date) for a SaMD is just 146 days.

Figure 5: Distribution of FDA review times for SaMD submissions. The median is 146 days, and over 26% are cleared in under 90 days.

As Figure 5 demonstrates, over 26% of SaMD submissions are cleared in under 90 days. This "fast lane" is accessible to manufacturers who submit well-organized, highly traceable documentation that clearly maps software requirements to risk controls and verification testing. The speed of clearance is determined by the quality of the evidence, not the origin of the code.

Historical Precedent: The FDA Has Seen This Before

The skepticism surrounding AI-generated code echoes the skepticism that accompanied previous shifts in software engineering. Time and again, the industry has assumed that a new development paradigm would be rejected by the FDA, only to find that the agency's existing frameworks were flexible enough to absorb the change.

Figure 4: Cleared SaMD devices by software paradigm — CLIs, SDKs, containerized apps, and cloud-based systems are all well represented.

As Figure 4 shows, the FDA has already cleared hundreds of devices utilizing paradigms that were once considered "too modern" or "too abstract" for medical use. Let's examine a few of these precedents.

Command-Line Interface as a Medical Device (CLaMD)

Historically, medical device software was expected to have a Graphical User Interface (GUI). The GUI was where the clinician interacted with the device, and therefore, it was the focus of extensive human factors engineering and UI hazard analysis.

When developers began submitting "headless" algorithms that operated entirely via a Command-Line Interface (CLI), taking an input file, processing it, and returning an output file without any user interaction, many assumed the FDA would balk. How could you clear a device that the clinician never actually sees?

As I detailed in my previous article on Command-Line as a Medical Device, the FDA accepted the CLI paradigm readily. The regulatory logic is elegant: a CLI shifts the risk-control burden for display, navigation, and confirmation steps to another device (such as a cleared PACS or EHR system) that is already authorized for those tasks.

Recent clearances from the FDA Browser confirm that the CLI paradigm is thriving:

  • Thirona LungQ 4 (K250766, Oct 2025): "LungQ is a docker image with a standalone command-line software which must be run from a command-line interpreter and does not have a GUI."
  • Nurea PRAEVAorta®2 (K243859, Aug 2025): "Input of patient data: Command line interface (API)."
  • Brain Electrophysiology NEAT 001 (K250058, Apr 2025): "Sleep stages are scored by the containerized neat-cli software on the FLOW server."

The CLI precedent proves that the FDA does not require software to look or behave like a traditional desktop application. It only requires that the software's inputs, outputs, and risks are strictly defined and verified.

Software Development Kit (SDK) as a Medical Device

An even more abstract paradigm is the Software Development Kit (SDK) or software library. An SDK is not a standalone application; it is a collection of code intended to be integrated into another developer's application.

The FDA Browser reveals that across all SaMD, 20 clearances explicitly mention "SDK," 11 mention "Software Development Kit," and 239 mention "library."

  • Measure Labs / Preemptive AI Clinical SDK (K250233, Feb 2026): "SaMD SDK for integration into third-party mobile apps; operates on Android/iOS."
  • Nobel Biocare DTX Studio Assist (K252086, Nov 2025): An SDK with no UI, bundled with host dental imaging software.
  • Deepwell DTx ABS (K233580, Aug 2024): An SDK for Android breathing biofeedback.

The fact that SDKs have been cleared for over a decade demonstrates a crucial point: the FDA's frameworks already accommodate non-end-user, "developer-only" software artifacts. The agency is comfortable clearing raw code, provided that the code is accompanied by a rigorous integration guide and comprehensive verification evidence.

Containerization and Modern Stacks

When Docker and containerization revolutionized software deployment, the medical device industry was hesitant. Could a containerized microservice be cleared, or did the FDA require monolithic, bare-metal installations?

The data is clear: across all SaMD, 32 clearances mention "Docker," 14 mention "containerized," and 597 mention "cloud."

  • Clouds of Care PreOp v3 (K252565, Feb 2026): Cleared with a "modular, containerized software architecture that replaces the monolithic design of PreOp V1."
  • TeraRecon Cardiovascular.Calcification.CT (K250288, Oct 2025): Described as a "containerized application (e.g., Docker)."

Similarly, the shift from legacy languages like LabVIEW to modern stacks like Python has been seamlessly absorbed. The AccurECG Analysis System v2.0 (K252361, Dec 2025) successfully used a LabVIEW predicate to clear a Python rewrite.

Engineering-stack modernization is itself a kind of paradigm shift that the FDA accepts via 510(k) substantial-equivalence reasoning. AI-assisted coding is simply the next iteration of this modernization.

The accepted progression of software paradigms in SaMD — from CLI and SDK to containerization, modern stacks, and now vibe coding.

The Innolitics Thesis: Wrap, Don't Rewrite

The core argument against AI-generated code is that an LLM cannot produce code that is inherently compliant with IEC 62304. This is true. An LLM cannot generate a complete, traceable risk file or a verified software architecture document out of thin air.

However, this argument misses the point. No code is inherently compliant.

A human engineer writing Python in a text editor is not producing compliant code either. Compliance is not a property of the code itself; it is a property of the evidence and provenance surrounding the code.

A vibe-coded prototype is to a cleared medical device what:

  • A whiteboard sketch is to a finalized engineering drawing.
  • An MVP is to a productionized application.
  • A research notebook is to a deployable algorithm.
  • A LabVIEW prototype is to a Python rewrite.
  • A monolithic system is to a modular container.

Each of these transitions has already been accepted by FDA reviewers via substantial-equivalence reasoning, Predetermined Change Control Plan (PCCP) flexibility, and the standard documentation backbone of IEC 62304, ISO 14971, ISO 13485, and Good Machine Learning Practice (GMLP). None required throwing away the prior code; each required wrapping it in the right artifacts.

AI-generated code is no different. It does not replace the regulatory backbone; it changes who (or what) holds the pen during the coding activity, while leaving the verification, validation, risk management, and labeling activities essentially intact.

Wrap, don't rewrite: prototype code encased in IEC 62304-compliant verification, risk controls, and traceability artifacts.

The Final Frontier: Clearing "Dark Code"

If AI-generated code makes regulatory professionals nervous, Dark Code terrifies them. Dark Code is the inevitable result of autonomous AI agents: code that is generated, tested, and deployed entirely machine-to-machine. No human engineer ever opens the file. No human ever reads the logic.

Dark Code: machine-generated logic no human has read, made safe by an impenetrable, human-verified wrapper.

How can you possibly clear a medical device powered by code no human has ever seen?

The answer is that the FDA does not clear code readability; the FDA clears safety and efficacy. If a manufacturer can prove that a Dark Code module is strictly bounded, that its inputs are sanitized, its outputs are verified against a deterministic oracle, and its failure modes are mitigated by independent risk controls, the fact that a human hasn't read the source code becomes a business risk, not a patient safety risk.

The "Wrap, Don't Rewrite" philosophy applies just as strongly to Dark Code. The wrapper must be written, reviewed, and verified by humans (or at least held to the highest standard of deterministic verification). But the autonomous logic inside the wrapper can remain dark, provided the wrapper is impenetrable.

The Vibe-to-Clearance Pipeline

If you have an AI-generated prototype that you want to bring to market, you cannot simply submit the raw code to the FDA. You must adapt it. However, this adaptation is not a complete rewrite, as some consultants with less software development experience might tell you.

Instead, it is a structured process of wrapping the prototype in the necessary regulatory evidence.

Deep Dive: The Anatomy of an AI-Generated Code Refactor

Let's walk through a hypothetical case study to illustrate exactly how a vibe-coded prototype is transformed into a compliant medical device.

Imagine a team of cardiologists who have developed a novel algorithm for detecting early signs of heart failure from wearable ECG data. They are not software engineers, but they used an LLM to "vibe code" a functional prototype in Python. The prototype works beautifully on their test dataset, but the code is a mess. It lacks error handling, uses unverified open-source libraries, and has zero documentation.

If they take this prototype to a traditional regulatory consultant, they will likely be told to throw it away and hire a team of software engineers to rewrite it from scratch in C++ or Java. This would delay their time-to-market by a year or more — and every month of delay is a month where the predicate field shifts, competitors clear adjacent indications, and the clinical evidence base they were trying to advance moves on without them.

If they bring it to me at Innolitics, I take a different approach. I wrap the prototype.

Step 1: Establishing the Boundary

I start by treating the vibe-coded algorithm as a "black box." I don't care (yet) what the code looks like inside. I only care about its inputs and outputs. I work with the cardiologists to formally document the design intent in a Software Requirements Specification (SRS). What exact data format does the algorithm expect? What exact output does it produce? What are the performance requirements (e.g., sensitivity, specificity, latency)?
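A boundary documented this way can also be expressed directly in typed Python. The sketch below is purely illustrative: the field names, the 250 Hz sample rate, and the segment-length limits are hypothetical stand-ins for whatever the real SRS would specify.

```python
from dataclasses import dataclass

# Hypothetical I/O contract for the wearable-ECG example above.
# All names, units, and limits are illustrative, not from any real SRS.

@dataclass(frozen=True)
class EcgSegment:
    sample_rate_hz: int   # e.g., 250 Hz single-lead wearable ECG
    samples_mv: tuple     # raw voltages in millivolts
    lead: str             # e.g., "I"

@dataclass(frozen=True)
class RiskResult:
    risk_score: float     # must lie in [0.0, 1.0]
    model_version: str    # pinned algorithm version, for traceability

def validate_input(segment: EcgSegment) -> None:
    """Reject anything outside the documented input space (the SRS boundary)."""
    if segment.sample_rate_hz != 250:
        raise ValueError("unsupported sample rate")
    # 10 seconds to 5 minutes of data at 250 Hz
    if not 2500 <= len(segment.samples_mv) <= 75000:
        raise ValueError("segment length out of documented range")
```

The point is not these particular checks; it is that once the boundary lives in code, every later step (hazard analysis, wrapper design, verification) has a precise surface to attach to.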

Step 2: Risk Management (ISO 14971)

Next, I conduct a rigorous hazard analysis. What happens if the algorithm receives corrupted data? What happens if it produces a false positive? What happens if it crashes? For each hazard, I define risk controls. Some of these controls will be implemented outside the algorithm (e.g., the wearable device must validate the signal quality before sending it). Other controls must be implemented within the software wrapper.

Step 3: The Wrapper Architecture

Instead of rewriting the algorithm, I build a robust, IEC 62304-compliant "wrapper" around it. This wrapper handles all the critical safety functions:

  • Input Validation: The wrapper checks every piece of incoming data to ensure it meets the required format and quality standards. If the data is bad, the wrapper rejects it before it ever reaches the vibe-coded algorithm.
  • Error Handling: The wrapper monitors the execution of the algorithm. If the algorithm crashes or throws an exception, the wrapper catches it and fails gracefully, ensuring the system remains in a safe state.
  • Output Verification: The wrapper checks the output of the algorithm to ensure it is within expected bounds before passing it on to the clinician.
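The three bullets above can be sketched as a single function. In this minimal Python sketch, `run_algorithm` stands in for the vibe-coded black box, and the score bounds and "no result" safe state are illustrative choices, not requirements from any real device:

```python
# Illustrative wrapper around a black-box algorithm. `run_algorithm` is a
# placeholder for the AI-generated core; everything else is the
# human-engineered, verified shell described above.

def run_algorithm(ecg_samples):
    # Placeholder logic standing in for the vibe-coded module.
    return sum(abs(s) for s in ecg_samples) / len(ecg_samples)

SAFE_FALLBACK = None  # "no result" is the defined safe state in this sketch

def wrapped_inference(ecg_samples):
    # 1. Input validation: bad data never reaches the black box.
    if not ecg_samples or any(not isinstance(s, (int, float)) for s in ecg_samples):
        return SAFE_FALLBACK
    # 2. Error handling: a crash inside the black box degrades gracefully.
    try:
        score = run_algorithm(ecg_samples)
    except Exception:
        return SAFE_FALLBACK
    # 3. Output verification: out-of-bounds results never reach the clinician.
    if not (0.0 <= score <= 10.0):
        return SAFE_FALLBACK
    return score
```

Only the wrapper needs to be read, reviewed, and verified line by line; the black box is exercised through this interface.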

Step 4: SOUP and SBOM

I then analyze the vibe-coded codebase to identify all third-party dependencies (Software of Unknown Provenance, or SOUP). I create a comprehensive Software Bill of Materials (SBOM) and assess each dependency for known cybersecurity vulnerabilities. If the LLM used an insecure or outdated library, I replace it with a secure alternative.
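As a toy illustration of the SBOM step, Python's standard library alone can enumerate an environment's installed distributions. A real submission would use a dedicated SCA tool emitting a standard format such as CycloneDX or SPDX; this sketch only shows the shape of the inventory:

```python
# A toy SBOM using only the standard library: every installed distribution
# with its installed version. Illustrative only; real SBOMs come from SCA
# tooling in CycloneDX or SPDX format.
from importlib import metadata

def build_sbom():
    entries = {
        (dist.metadata["Name"], dist.version)
        for dist in metadata.distributions()
        if dist.metadata["Name"]  # skip broken .dist-info directories
    }
    return sorted(entries)

sbom = build_sbom()
```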

Step 5: Exhaustive Verification

Finally, I prove that the system works. I write an extensive suite of automated tests that exercise the wrapper and the algorithm under every conceivable condition. I test edge cases, boundary conditions, and failure modes. I generate the traceability matrix that links every requirement to a risk control and a passing test result.
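One way to keep that traceability matrix honest is to generate it from the tests themselves rather than maintain it by hand. A framework-free Python sketch, with hypothetical requirement IDs and a stand-in `classify` function:

```python
# Requirement-to-test traceability: each test declares the (hypothetical)
# SRS requirement it verifies, and the matrix is derived from those
# declarations instead of being curated manually.

TRACE = {}

def verifies(req_id):
    def wrap(test_fn):
        TRACE.setdefault(req_id, []).append(test_fn.__name__)
        return test_fn
    return wrap

def classify(samples):
    # Stand-in for the wrapped algorithm under test.
    if any(not isinstance(s, (int, float)) for s in samples):
        return None
    return min(1.0, sum(abs(s) for s in samples))

@verifies("SRS-042")  # hypothetical: "reject segments with non-numeric samples"
def test_rejects_non_numeric():
    assert classify(["x"]) is None

@verifies("SRS-043")  # hypothetical: "risk score lies in [0, 1]"
def test_score_in_bounds():
    assert 0.0 <= classify([0.1, 0.2]) <= 1.0

# The traceability matrix falls out of the registry.
matrix = {req: tests for req, tests in sorted(TRACE.items())}
```

In a real project the same idea is usually implemented with test-framework markers, but the principle is identical: the matrix is evidence generated by the build, not a document that drifts.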

The result is a fully compliant medical device. The core logic is still the original vibe-coded Python, but it is now encased in a fortress of verified, traceable engineering. The code stayed. The evidence was added.

AI-Generated SaMD and Good Machine Learning Practice (GMLP)

The FDA's AI/ML Action Plan also heavily emphasized the development of Good Machine Learning Practice (GMLP). In 2021, the FDA, Health Canada, and the UK's MHRA jointly published 10 guiding principles for GMLP [5]. These principles were later formalized by the International Medical Device Regulators Forum (IMDRF) in a document finalized in January 2025 [6].

GMLP is essentially the AI/ML equivalent of Good Manufacturing Practice (GMP) or Good Clinical Practice (GCP). It provides a framework for ensuring that AI models are developed, validated, and maintained in a safe and effective manner.

How does AI-generated code intersect with GMLP? At first glance, the two might seem fundamentally opposed. GMLP demands rigorous data governance, clear model architectures, and comprehensive performance testing. AI-assisted coding, by its very nature, is often exploratory and unstructured.

However, the two can be reconciled. GMLP does not dictate how the code is written; it dictates how the model is managed. If a vibe-coded prototype is used to train a machine learning model, the resulting model must still adhere to GMLP principles. This means:

  1. Multi-Disciplinary Expertise: The team developing the model must include domain experts (e.g., clinicians) and data scientists. AI-assisted coding actually facilitates this by allowing clinicians to directly interact with the code generation process.
  2. Good Software Engineering and Security Practices: This is where the "wrapper" approach I discussed earlier becomes critical. The vibe-coded model must be integrated into a secure, well-engineered software system.
  3. Clinical Study Participants and Data Sets: The data used to train and test the model must be representative of the intended patient population. AI-generated code does not change this requirement.
  4. Training Data Sets Are Independent of Test Sets: This is a fundamental principle of machine learning that must be strictly enforced, regardless of how the code was generated.
  5. Selected Reference Datasets Are Based Upon Best Available Methods: The ground truth used to evaluate the model must be reliable.
  6. Model Design Is Tailored to the Available Data and Reflects the Intended Use: The architecture of the model must be appropriate for the task.
  7. Focus Is Placed on the Performance of the Human-AI Team: If the model is intended to assist a clinician, the performance of the combined human-AI system must be evaluated.
  8. Testing Demonstrates Device Performance During Clinically Relevant Conditions: The model must be tested in realistic scenarios.
  9. Users Are Provided Clear, Essential Information: The labeling and instructions for use must clearly explain the model's capabilities and limitations.
  10. Deployed Models Are Monitored for Performance and Re-training Risks Are Managed: The manufacturer must have a plan for monitoring the model's performance in the real world and managing the risks associated with retraining.

By adhering to these principles, a manufacturer can ensure that a vibe-coded AI/ML model is safe and effective, even if the underlying code was generated by an LLM.

1. Capture Intent and Outputs

The first step is to formalize the natural language prompts that generated the code. These prompts are the foundation of your Software Requirements Specification (SRS). You must document what the software is intended to do, what inputs it accepts, and what outputs it produces.

2. Hazard and Risk Triage

Next, you must map the software's functions to potential hazards. This is where ISO 14971 comes into play. You must identify what could go wrong if the LLM-generated code fails, and you must implement risk controls to mitigate those hazards. For example, if the code is responsible for displaying a critical alert, you must ensure that the alert is displayed correctly and reliably.

3. Refactor for Traceability

This is the most technical step. You must review the vibe-coded codebase and ensure that it is traceable. This means identifying all Software of Unknown Provenance (SOUP) and creating a comprehensive Software Bill of Materials (SBOM). You must also ensure that the code is modular enough to be tested effectively. This may require some refactoring, but it does not require a complete rewrite.

4. Verification and Validation

Once the code is traceable, you must prove that it works. This requires independent test evidence. You must write unit tests, integration tests, and system tests that verify the software meets its requirements and mitigates its risks. This is where the "vibe" ends and the engineering begins.

5. PCCP-Ready Submission

Finally, you must prepare your 510(k) submission. If your device is an AI/ML SaMD, you should strongly consider including a Predetermined Change Control Plan (PCCP). As Figure 3 shows, PCCPs entered the SaMD mainstream in 2024 (jumping from 6 in 2023 to 42 in 2025), and they are essential for devices that will evolve over time. A PCCP allows you to update your vibe-coded algorithms without requiring a new 510(k) clearance for every change.

The Cybersecurity Imperative

One of the most valid criticisms of AI-generated code is its potential impact on cybersecurity. LLMs are trained on vast amounts of open-source code, much of which contains known vulnerabilities or insecure coding patterns. If an engineer blindly accepts LLM-generated code, they may inadvertently introduce these vulnerabilities into their medical device.

The FDA has become increasingly stringent regarding cybersecurity in recent years. In 2023, the agency gained new statutory authority to require cybersecurity information in premarket submissions, including a Software Bill of Materials (SBOM) and a plan for monitoring and addressing postmarket cybersecurity vulnerabilities [7].

How can an AI-generated medical device meet these stringent requirements?

The answer lies in rigorous, automated security testing and dependency management. When an AI-generated codebase is transitioned to a compliant medical device, the following cybersecurity practices must be implemented:

  1. Static Application Security Testing (SAST): The vibe-coded codebase must be scanned using SAST tools to identify insecure coding patterns, hardcoded secrets, and other vulnerabilities. Any issues identified must be remediated before the code is deployed.
  2. Dynamic Application Security Testing (DAST): The running application must be tested using DAST tools to identify vulnerabilities that only manifest during execution, such as injection flaws or cross-site scripting (XSS).
  3. Software Composition Analysis (SCA): As mentioned earlier, a comprehensive SBOM must be generated using SCA tools. This SBOM must list all third-party dependencies (SOUP) used by the vibe-coded application, along with their versions and known vulnerabilities.
  4. Vulnerability Management: The manufacturer must establish a process for continuously monitoring the SBOM for new vulnerabilities and deploying patches or mitigations as needed.
  5. Threat Modeling: A formal threat model must be developed to identify potential attack vectors and ensure that appropriate security controls are in place.

By implementing these practices, a manufacturer can mitigate the cybersecurity risks associated with AI-generated code and demonstrate to the FDA that their device is secure. The LLM may have written the code, but the manufacturer is responsible for securing it.

Agentic AI Best Practices

The fundamental limitation of today's AI coding agents comes down to limited memory: every new session starts from scratch, as if the agent were waking up from cryostasis.

If an LLM is holding the pen — whether through Cursor, Claude Code, ChatGPT Codex, Lovable, Replit, Antigravity, GitHub Copilot, Windsurf, Devin, Aider, or whatever tool ships next quarter — the discipline shifts. The tool changes; the regulatory constraints do not. What used to be enforced through code review now has to be enforced through specification, structure, and acceptance criteria. The agent will write whatever you let it write. Your job is to make "whatever" small, predictable, and verifiable.

Six practices we use at Innolitics when building FDA-regulated software with agentic AI:

1. Specification-Driven Development

The specification is the source of truth. Not the code. Not the prompt. The spec.

Write the requirements, interfaces, invariants, and acceptance tests before the agent generates a line of code. Prompt against the spec. When the agent drifts, correct against the spec, not against the latest output. This is how a vibe-coded module stays aligned with its IEC 62304 SRS instead of drifting into whatever pattern the model happened to favor that day.

This is exquisitely important given the memory limitations of AI agents. They forget everything about the codebase between sessions, so they must be quickly brought back up to speed on the product, design patterns, requirements, and everything else that human engineers carry around in their heads every day.

2. Clean, Simple, Concise Code With No Surprises

Boring code is auditable code. An agent will gladly produce clever code. Reject it.

Every function should do exactly what its name says — no hidden side effects, no implicit state, no exception handling that swallows context. If a reviewer has to read the body to understand the behavior, the function failed. Surprises are the failure mode hazard analysis cannot catch. They only surface during V&V, or worse, post-market.

3. Self-Documenting Interfaces

Interfaces should not need a manual. Type signatures, parameter names, and return contracts encode the intent. If you need a comment to explain what a function does, the function is wrong.

This matters more in agentic workflows than in human-authored code. The next agent that touches the codebase reads interfaces, not your design rationale doc. A self-documenting interface is the contract that survives across iterations and across reviewers.
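A before-and-after sketch of the same idea in Python. Every name, unit, and default here is invented for illustration:

```python
# Contrast: the first signature needs a manual; the second encodes its
# contract in its names and types. All identifiers below are hypothetical.

def process(data, flag=True):  # opaque: what data? which flag? what return?
    ...

def detect_afib_episodes(
    rr_intervals_ms: list[float],   # RR intervals, in milliseconds
    min_episode_beats: int = 5,     # shortest run reported as an episode
) -> list[tuple[int, int]]:         # (start_beat, end_beat) index pairs
    ...
```

An agent (or a reviewer) picking up the second function knows its inputs, units, and output shape without opening the body.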

4. Keep It Stupidly Simple

KISS is not a slogan. It is a regulatory strategy.

Every additional abstraction is another layer the reviewer has to trace, another place a hazard can hide, another piece of code the verification suite has to cover. Push complexity out of the critical path. Into the wrapper, into the configuration, into the documentation — anywhere except the regulated logic. The smaller the regulated surface, the tighter the clearance argument.

5. Lines-of-Code and Complexity Budgets in the Specification

Set hard ceilings in the spec. Maximum function length. Maximum cyclomatic complexity. Maximum module size.

Budgets do two things. First, they force the agent to factor the problem before generating code, which is where structural mistakes are cheapest to fix. Second, they give reviewers a quantitative gate during code review — when a module breaches its budget, the conversation is structural, not stylistic. Treat budgets as design inputs, not aspirational targets.
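Budgets like these are mechanically checkable. A sketch using Python's standard `ast` module, with a hypothetical 20-line ceiling standing in for whatever the spec sets:

```python
# Enforce a lines-of-code budget from the spec. The ceiling is an
# illustrative design input, not a recommendation.
import ast

MAX_FUNCTION_LINES = 20  # hypothetical budget from the specification

def over_budget(source):
    """Return the names of functions whose definitions exceed the budget."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                offenders.append(node.name)
    return offenders
```

Wired into continuous integration, a check like this turns the budget into a hard gate rather than a review-time suggestion.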

6. Acceptance Criteria That Include Code Review

Generated code is not done when it passes tests. It is done when a qualified human has read it and signed off.

Bake review into the acceptance criteria. Every story, every pull request, every agent-generated module needs an explicit human-review gate before it merges into the regulated codebase. This is how Dark Code stays out of the wrapper. It is also how you produce the design history file evidence an FDA reviewer expects to see — not "the AI generated this" but "the AI generated this and a qualified engineer reviewed and approved it."

Together, these practices keep the regulated surface small, predictable, and reviewable. The agent generates fast. The discipline makes the output clearable.

Speed and Clearance Certainty

The real advantage of AI-assisted coding in MedTech is not cheaper code. It is faster iteration on the questions that decide whether a device clears.

Traditional SaMD development front-loads engineering. Months of building before the first clinical question gets interrogated against real code. More months before the first hazard analysis hits a real architecture. By the time the team discovers the edge case a reviewer will fixate on, the codebase has calcified around the wrong design — and the submission inherits that calcification.

AI-assisted development inverts the loop. A small team stands up a functional prototype in days, exposes it to clinical reviewers and regulatory strategists, and iterates against real signals before any architecture is locked. This compresses the riskiest phase of SaMD development: the phase where requirements ambiguity becomes architectural debt.

Speed alone is not the point. Speed in service of clearance certainty is.

Every iteration before design freeze is a chance to surface a failure mode that would otherwise emerge during V&V — or worse, during FDA review. Each catch upstream is one fewer surprise in the submission. The teams that compress the prototyping loop do not just ship faster. They ship submissions that hold together under reviewer scrutiny.

The "wrap, don't rewrite" philosophy compounds the effect. The wrapper — input validation, error handling, output verification, traceability — is engineered once, deliberately, by humans. The logic inside iterates freely without disturbing the regulatory backbone. Requirements shift? The wrapper absorbs it. Algorithm tweaks? The verification suite re-runs against a stable interface. The submission package stays coherent across iterations instead of fracturing under change.

The Role of the Regulatory Consultant in the Age of AI-Generated Code

As the software engineering paradigm shifts toward AI-generated code, the role of the regulatory consultant must also evolve. Historically, regulatory consultants have often acted as gatekeepers, enforcing strict adherence to traditional software development life cycles (SDLCs). They have been trained to look for comprehensive design documents, detailed code reviews, and manual traceability matrices.

When presented with an AI-generated codebase, a traditional consultant's instinct is often to reject it. They see the lack of human-authored code and the reliance on probabilistic LLMs as insurmountable regulatory hurdles.

This approach is no longer viable. Regulatory consultants must adapt to the new reality of AI-assisted software development. They must become facilitators rather than gatekeepers.

A modern regulatory consultant must understand how to apply the principles of IEC 62304 and ISO 14971 to an AI-generated codebase. They must be able to guide manufacturers through the process of building the "wrapper": establishing the boundaries, conducting the hazard analysis, and implementing the necessary risk controls.

They must also be well-versed in the FDA's latest guidance on AI/ML, including the use of PCCPs and the application of Good Machine Learning Practice (GMLP). They must be able to help manufacturers navigate the complex intersection of AI, cybersecurity, and medical device regulation.

At Innolitics, we've embraced this evolving role. We don't tell clients to throw away their vibe-coded prototypes. We help them understand the regulatory landscape, identify the gaps in their evidence, and build the engineering pipelines necessary to achieve compliance. Regulatory expertise should enable innovation, not stifle it.

The regulatory consultant's evolving role: from gatekeeper to facilitator of AI-assisted medical device development.

The Future of Medical Device Software Engineering 🔗

The medical device industry is notoriously slow to adopt new technologies. This caution is understandable; when software fails in a medical device, the consequences can be catastrophic. However, this caution can also stifle innovation and delay the delivery of life-saving technologies to patients.

AI-assisted coding represents a massive leap forward in software engineering productivity. It has the potential to democratize the development of medical devices, allowing clinicians, researchers, and domain experts to rapidly prototype and iterate on new ideas.

If the industry rejects AI-generated code outright, it will miss out on this incredible potential. Instead, the industry must embrace AI-assisted development while simultaneously adapting its regulatory and engineering practices to manage the associated risks.

This means shifting the focus from how the code is written to how the code is verified and validated. It means embracing automated testing, rigorous risk management, and comprehensive cybersecurity practices. It means recognizing that compliance is not a property of the code itself, but a property of the evidence surrounding the code.
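The idea that compliance lives in the evidence rather than the code can be made concrete. In this minimal sketch, each automated test is tagged with the requirement it verifies, so a traceability matrix can be generated mechanically instead of maintained by hand. The requirement ID "SRS-17" and the `classify` function are illustrative placeholders, not a real project's artifacts.

```python
# Traceability matrix, populated automatically as tests are defined.
TRACE: dict[str, list[str]] = {}

def verifies(req_id: str):
    """Decorator recording which software requirement a test verifies."""
    def wrap(fn):
        TRACE.setdefault(req_id, []).append(fn.__name__)
        return fn
    return wrap

def classify(values: list[float]) -> str:
    # Stand-in for AI-generated logic under test.
    if not values:
        raise ValueError("empty input")
    return "normal" if max(values) < 1.0 else "flag"

@verifies("SRS-17")
def test_rejects_empty_input():
    try:
        classify([])
        assert False, "expected ValueError on empty input"
    except ValueError:
        pass

test_rejects_empty_input()
# TRACE now maps "SRS-17" -> ["test_rejects_empty_input"]
```

The same pattern scales up with a real test runner; the point is that the requirement-to-test linkage is data the pipeline emits, not a document a human reconciles after the fact.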

The FDA has already demonstrated its willingness to adapt to new software paradigms, from the CLI to the SDK to containerization. The agency's frameworks, including substantial equivalence, the PCCP, and GMLP, are flexible enough to accommodate AI-generated code.

The challenge now lies not with the regulators, but with the manufacturers. Manufacturers must learn how to effectively "wrap" vibe-coded prototypes in the necessary regulatory evidence. They must build the engineering pipelines and quality management systems required to ensure that LLM-generated code is safe, effective, and secure.

Those who succeed will be able to bring innovative medical devices to market faster and more efficiently than ever before. Those who fail will be left behind, clinging to outdated development paradigms while their competitors embrace the future.

Bring Your AI-Generated Code to Specification 🔗

The FDA is ready to accept AI-generated medical devices, provided they are created with a compliant development process in mind. The regulatory frameworks that absorbed the CLI, the SDK, Docker, Python (and countless others), and the cloud are fully capable of absorbing LLM-generated code. The paradigm itself is acceptable; only the documentation, traceability, and verification must be added on top.

If you have a vibe-coded prototype that you want to bring to market, do not let regulatory skepticism convince you to throw it away. The code stays. The evidence is added.

At Innolitics, we specialize in bridging the gap between modern software engineering and FDA compliance. Our team of engineers and regulatory experts can help you adapt your vibe-coded prototype into an IEC 62304-compliant, traceable codebase with full verification.

  • Software Development Services: We can refactor your prototype, implement robust testing, and generate the necessary software documentation. Learn more about our End-to-End SaMD Development.
  • FDA Regulatory Consulting: We can guide you through risk classification, predicate selection, and 510(k) authoring.
  • Guaranteed Clearance: We are so confident in our process that we offer Guaranteed AI/ML SaMD 510(k) Clearance.
  • Cybersecurity: We can secure your vibe-coded codebase, ensuring SBOM compliance and dependency hygiene.

Agentic AI development is the future of software engineering. Let Innolitics help you make it the future of medical devices.

References 🔗

[1] Karpathy, A. (2025, February 6). "There's a new kind of coding I call 'vibe coding'..." X (formerly Twitter).

[2] International Electrotechnical Commission. (2006). IEC 62304: Medical device software - Software life cycle processes.

[3] International Organization for Standardization. (2019). ISO 14971: Medical devices - Application of risk management to medical devices.

[4] U.S. Food and Drug Administration. (2021). Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan.

[5] U.S. Food and Drug Administration, Health Canada, and Medicines and Healthcare products Regulatory Agency. (2021). Good Machine Learning Practice for Medical Device Development: Guiding Principles.

[6] International Medical Device Regulators Forum. (2025). Good Machine Learning Practice for Medical Device Development: Guiding Principles.

[7] U.S. Food and Drug Administration. (2023). Cybersecurity in Medical Devices: Quality System Considerations and Content of Premarket Submissions.
