FDA’s Guiding Principles of Good AI Practice: Is Your Vendor in Compliance?
By Kathie Clark, Industry Expert; Technology & Innovation Partner, Just in Time GCP
As artificial intelligence becomes increasingly embedded in drug development and clinical operations, regulators are making one thing clear: AI cannot be evaluated like traditional software. In January 2026, the FDA and EMA jointly published the Guiding Principles of Good AI Practice, outlining expectations across ten specific areas that define how AI technologies should be designed, deployed, governed, and overseen throughout their lifecycle. While the guidance is not prescriptive, it sets a clear bar for responsibility, transparency, and risk awareness – especially for tools used in regulated environments.
For life science companies, this creates a practical challenge. Many AI vendors market impressive functionality, but far fewer can demonstrate that their tools align with the FDA’s principles in a way that stands up to inspection, audit, or internal quality review. Asking the right questions upfront is essential.
Why AI vendors require a different evaluation lens
Traditional software vendor assessments tend to focus on deterministic behavior: defined inputs, expected outputs, validation scripts, and change control. AI systems, particularly those based on non-deterministic models such as large language models, behave differently. Outputs may vary, learning may occur over time, and risk depends heavily on context of use, not solely on technical performance.
Deterministic vs. Non-Deterministic Systems (at a glance)
| Aspect | Deterministic Software | Non-Deterministic AI |
| --- | --- | --- |
| Output behavior | Same input produces the same output every time | Same input may produce similar, but not identical, outputs |
| Predictability | Fully predictable and repeatable | Predictable within an expected range, not a single fixed result |
| Generated results | Based on fixed rules and logic | Based on probabilistic models and likelihoods |
| Response to repeated runs | Identical outputs across runs | Outputs may vary slightly across runs |
| Typical validation approach | Pass/fail testing against predefined expected outputs | Assess whether outputs are fit for purpose using a credibility-based approach |
| Primary risk consideration | Correctness of implementation | Appropriateness of use, impact of variability, and context of use |
| Change sensitivity | Changes are usually explicit and controlled | Performance may change due to data, prompts, or environment, even without code changes |
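The last two rows have direct testing implications. As a rough illustration only (the toy functions and the 0.25 tolerance below are assumptions for the sketch, not any vendor's actual model or acceptance criterion), the exact-match assertion that works for deterministic software becomes a range-based, fit-for-purpose check for a non-deterministic tool:

```python
import random
import statistics

def deterministic_tool(x: float) -> float:
    """Fixed rules: the same input always yields the same output."""
    return round(x * 1.2, 2)

def nondeterministic_tool(x: float) -> float:
    """Toy stand-in for a probabilistic model: the output varies from run to run."""
    return round(x * 1.2 + random.gauss(0, 0.05), 2)

# Deterministic software: a classic pass/fail assertion is meaningful.
assert deterministic_tool(10.0) == 12.0

# Non-deterministic AI: the identical assertion would fail intermittently, so the
# question becomes whether repeated outputs stay within a fit-for-purpose range.
runs = [nondeterministic_tool(10.0) for _ in range(100)]
print(f"mean={statistics.mean(runs):.3f}  spread={max(runs) - min(runs):.3f}")
assert all(abs(r - 12.0) <= 0.25 for r in runs), "output variability outside the acceptable range"
```

The point is not the specific numbers but that acceptability is defined as a range tied to the intended use, rather than a single expected value.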
As a result, evaluating an AI vendor requires going beyond standard validation checklists. You need to understand how the vendor defines risk, how trust in outputs is established, how data is protected, and how accountability is maintained when behavior deviates from expectations. The questions below are designed to surface whether a vendor has truly internalized the FDA’s principles – or is simply applying traditional software thinking to a fundamentally different technology.
10 Questions to Ask Your AI Vendor
1. How does the vendor clearly define the tool's context of use, and where is it documented?
Ask the vendor to describe, in plain language, exactly what the AI is intended to support and what it is explicitly not intended to do. If the vendor cannot point to a documented context of use that clearly limits scope, users, and impact, risk is likely undefined or misunderstood. Without a clearly defined context of use, an AI tool remains closer to an experiment than a controlled system.
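As one illustration, a documented context of use can be captured as a structured record rather than scattered prose. The field names, SOP reference, and example values below are hypothetical, included only to show the kind of scope, user, and impact boundaries worth asking for:

```python
from dataclasses import dataclass

@dataclass
class ContextOfUse:
    """Illustrative structure for a documented context of use; not a regulatory template."""
    intended_use: str
    explicitly_out_of_scope: list[str]
    intended_users: list[str]
    decision_impact: str       # e.g. whether human review is required before any action
    document_reference: str    # where the statement lives in the vendor's quality system

cou = ContextOfUse(
    intended_use="Draft plain-language summaries of protocol deviations for reviewer sign-off",
    explicitly_out_of_scope=["Final regulatory submissions", "Autonomous deviation classification"],
    intended_users=["Clinical QA reviewers"],
    decision_impact="Advisory only; a qualified reviewer approves every output",
    document_reference="Vendor SOP-014, Context of Use Statement v2 (hypothetical reference)",
)
print(cou.intended_use)
```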
2. How has the vendor assessed model risk in relation to that context of use?
A credible assessment includes an explanation of what could go wrong, how severe the impact would be, and why it matters for the intended use. Generic claims of “low risk” without a structured assessment are a red flag.
3. Does the vendor apply traditional software validation, or an AI-appropriate credibility assessment?
For non-deterministic AI, traditional validation methods are often insufficient. Ask whether the vendor applies a credibility-based approach to assess whether outputs are fit for purpose, including how discrepancies are reviewed and adjudicated. The FDA’s Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products provides guidance on determining the credibility of AI models.
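A minimal sketch of what such a credibility-style evaluation harness might look like is shown below; the acceptance criteria, the placeholder generate() function, and the reference cases are illustrative assumptions, not requirements drawn from the FDA guidance:

```python
# Outputs are scored against fit-for-purpose acceptance criteria rather than
# exact-match oracles, and discrepancies are queued for human adjudication.

def generate(prompt: str) -> str:
    """Placeholder for the vendor's AI tool."""
    return f"Summary of: {prompt}"

def meets_criteria(output: str, required_terms: list[str], max_words: int) -> bool:
    """Fit-for-purpose check: required content present, length within bounds."""
    has_terms = all(t.lower() in output.lower() for t in required_terms)
    return has_terms and len(output.split()) <= max_words

reference_set = [
    {"prompt": "Adverse event narrative for subject 1001", "required_terms": ["adverse event", "1001"], "max_words": 150},
    {"prompt": "Protocol deviation at site 12", "required_terms": ["deviation", "site 12"], "max_words": 150},
]

for_adjudication = []
passed = 0
for case in reference_set:
    output = generate(case["prompt"])
    if meets_criteria(output, case["required_terms"], case["max_words"]):
        passed += 1
    else:
        for_adjudication.append({"case": case, "output": output})

print(f"{passed}/{len(reference_set)} outputs met acceptance criteria; {len(for_adjudication)} sent for review")
```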
4. How does the vendor define and test “errors that matter”?
Not all errors are equal. Vendors should distinguish between acceptable variability and errors that have regulatory or compliance impact, and show how testing focuses on issues that materially affect outcomes.
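One hedged way to picture this is a triage step that separates cosmetic variability from material errors before a release decision; the field names and severity rules below are assumptions for the example, not a prescribed taxonomy:

```python
# Illustrative sketch: a vendor would derive the actual severity rules from the
# documented context of use and risk assessment, not from this hard-coded set.

CRITICAL_FIELDS = {"subject_id", "dose", "adverse_event_term"}

def classify_discrepancy(field: str) -> str:
    """Material errors carry regulatory or compliance impact; the rest is acceptable variability."""
    return "material" if field in CRITICAL_FIELDS else "acceptable_variability"

observed_discrepancies = ["phrasing", "dose", "word_order", "subject_id"]
material = [d for d in observed_discrepancies if classify_discrepancy(d) == "material"]

# Release gating focuses on errors that materially affect outcomes,
# not on cosmetic differences between runs.
release_blocked = bool(material)
print(f"material errors: {material}; release blocked: {release_blocked}")
```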
5. What evidence demonstrates that the model performs reliably over time?
Ask how the vendor detects performance drift: are assessments repeated after changes, and how does the vendor ensure the tool has not degraded as data, environments, or dependencies evolve?
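A simple illustration of drift monitoring, assuming a hypothetical baseline score and tolerance recorded at qualification, might look like this:

```python
# Re-run the same reference set on a schedule and compare against the metric
# recorded at qualification. The baseline value and 5-point tolerance are
# illustrative assumptions, not recommended thresholds.

def evaluate_reference_set() -> float:
    """Placeholder for re-scoring the qualified reference set; returns % of outputs fit for purpose."""
    return 91.0

BASELINE_SCORE = 94.0   # recorded when the tool was originally assessed
DRIFT_TOLERANCE = 5.0   # maximum acceptable drop before reassessment is triggered

current = evaluate_reference_set()
drift = BASELINE_SCORE - current
print(f"baseline={BASELINE_SCORE}  current={current}  drift={drift:.1f}")
if drift > DRIFT_TOLERANCE:
    print("Drift exceeds tolerance: trigger reassessment and change-control review")
else:
    print("Within tolerance: log result and continue periodic monitoring")
```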
6. How does the vendor classify and control changes to the model, prompts, or environment?
Vendors should clearly explain what constitutes a minor versus material change, how prompt updates are handled, what triggers reassessment, and how changes are documented, reviewed, and approved. Uncontrolled change is a major compliance risk.
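As a sketch only, change classification can be expressed as a simple rule set that decides whether a logged change triggers reassessment; the change types and example entries below are assumptions, not a vendor's actual SOP:

```python
# Classify a change and decide whether it triggers reassessment. A vendor's own
# change-control procedure would define the real criteria and approval flow.

MATERIAL_CHANGE_TYPES = {"model_version", "prompt_template", "training_data", "decision_threshold"}

def requires_reassessment(change_type: str) -> bool:
    return change_type in MATERIAL_CHANGE_TYPES

change_log = [
    {"id": "CHG-101", "type": "ui_copy", "description": "Reworded a tooltip"},
    {"id": "CHG-102", "type": "prompt_template", "description": "Added new summarization instructions"},
]

for change in change_log:
    action = "reassess and re-document credibility" if requires_reassessment(change["type"]) else "standard review and release"
    print(f'{change["id"]} ({change["type"]}): {action}')
```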
7. How does the vendor protect, isolate, and ultimately destroy client data?
Request specifics: where data is stored, how environments are isolated, how access is controlled, whether data is retained if it is sent to an externally provided LLM, whether data is encrypted, and how and when data is deleted. Look for a documented data stewardship approach, not just assurances.
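To illustrate one piece of that stewardship, here is a minimal sketch of retention enforcement, assuming a hypothetical 30-day retention window and record structure; a real implementation would also cover encryption, isolation, and audit logging:

```python
# Records older than the documented retention window are purged, and the
# deletion is recorded so it can be evidenced during an audit.

from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 30  # illustrative retention window

records = [
    {"id": "doc-001", "stored_at": datetime.now(timezone.utc) - timedelta(days=45)},
    {"id": "doc-002", "stored_at": datetime.now(timezone.utc) - timedelta(days=3)},
]

cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)
retained, purged = [], []
for record in records:
    (purged if record["stored_at"] < cutoff else retained).append(record["id"])

# In a real system the purge would be an actual deletion plus an audit-trail entry.
print(f"purged: {purged}  retained: {retained}")
```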
8. Does the vendor use any customer data in ways that could influence model behavior outside the customer's environment?
Ask whether customer data is ever used for training, testing, fine-tuning, prompt optimization, feedback loops, or performance improvement – and whether those activities could affect other customers or future deployments. Vendors should demonstrate that your data cannot “escape into the wild,” directly or indirectly.
9. How does the vendor ensure users understand the tool’s outputs, limitations, and appropriate use?
Ask whether users receive plain-language training or guidance explaining what the tool does, what it does not do, its known limitations, and how results should be interpreted. If the output is expected to “speak for itself,” risk is effectively being pushed onto the customer.
10. Who is accountable when the AI gets it wrong?
Finally, ask about governance. Who reviews discrepancies? How is acceptability determined? Who approves releases and risk responses? If accountability is unclear, responsibility will fall to the sponsor when issues arise.
Bottom Line
The FDA’s Guiding Principles of Good AI Practice make it clear that responsible AI is not just about performance – it also depends on governance, transparency, and lifecycle control. For clinical trial sponsors, vendor selection is one of the most important control points when adopting AI-enabled solutions.
Asking these questions helps organizations move beyond traditional software evaluation models and build confidence that vendors are prepared to operate in regulated, rapidly evolving environments. When applied consistently, this approach enables life science organizations to adopt new AI technologies with greater clarity, trust, and readiness for what comes next.
Kathie is an integral part of Just in Time GCP’s growing innovation team. As Founder & CEO Donna Dorozinsky has said, “Kathie offers a strategic vision for innovative technology application that embraces AI and clinical data analytics, allowing Just in Time GCP to translate increased quality and efficiencies to our partners.” Click here to learn more about how Kathie and Just in Time GCP are applying AI with purpose, not hype, and realizing results.