TL;DR: AI can accelerate autonomous vehicle development through scenario generation, sensor simulation, and smarter testing, but it cannot reliably validate its own outputs. Learn how Foretellix combines AI capabilities with rule-based, formal validation and structured coverage workflows to create trustworthy, physically accurate scenarios, close testing gaps, and support safer, production-ready AV development.
Artificial intelligence is changing the game everywhere, and autonomous vehicles (AVs) are no exception. Instead of relying on manually coded decision-making, the AI-Powered Autonomy stack is trained to predict, plan, and respond to traffic situations. For AV developers, AI is unlocking faster and more affordable ways to build and test driving stacks. Think photo-realistic sensor data, smarter data curation and labeling, spotting unsafe behaviors, trained traffic models, and a lot more.
The big win? Development moves faster, the systems adapt better, and teams can cover way more traffic scenarios than ever before. All of this is key for getting closer to true driver-out autonomy.
At Foretellix, we are strong supporters of an open AV partner ecosystem. We’ve integrated our solutions with tools and technologies from NVIDIA, Parallel Domain, InvertedAI, Voxel51, and others to deliver cutting-edge AI-powered toolchains.
But here’s the catch: while AI brings a ton of benefits, it can’t be trusted to fully check its own work. It’s a bit like a student trying to grade their own exam, or an artist trying to be the sole judge of their own painting. The AV stack built by AI is still vulnerable to blind spots, hallucinations, hidden biases, and subtle mistakes if it’s left to self-validate.
In this blog, we’ll dig into why relying on AI alone is risky and insufficient for autonomous vehicles – and more importantly, how combining it with rule-based validation builds a workflow that’s not only safer but also structured and systematic. Think of it as putting guardrails on AI: they keep it from veering off course while also guiding it along a clear, reliable path.
The AI Hope for Self-Driving Cars
When self-driving cars first hit the headlines, the story almost sounded like science fiction. Articles painted a future of safer roads and easier commutes, and companies jumped in, racing to be the first to roll out driverless vehicles. But the journey turned out to be tougher than expected, and many projects missed their deadlines.
That’s when artificial intelligence started to take center stage. Traditionally, AI was mainly used to detect and classify objects around the car. But a new promising approach granted AI a larger role: now it helps predict what other road users might do and decide how the car should respond. Instead of hand-coding every single traffic rule, engineers began training neural networks to imitate safe driver behavior (imitation learning) or to learn through trial and error with feedback (reinforcement learning).
AI was also applied to another game-changing path: simulation. Gathering real-world driving data is slow, expensive, and sometimes dangerous. But AI can generate endless virtual traffic scenarios, letting AVs practice in a safe digital environment before facing the real world. In many ways, AI evolved from being just one piece of the puzzle to becoming the engine pushing the whole field forward.
So, is the problem solved? Not quite. The complexity remains. But the hardest part has shifted from writing software rules to training and validating AI models with enough diverse driving scenarios.
So Why Can’t We Just Rely on AI for Stack Development?
Imagine you’re using an LLM to dig deep into a topic. It’s quick, smart, and can give you tons of information, but there are two big issues:
- Accuracy isn’t guaranteed: Not every answer is reliable. You might get hallucinations, flattering or over-agreeable responses (sycophancy), or even broken links.
- Research takes structure: Answers often come scattered across multiple queries. To make sense of them, you have to gather, filter, and organize the results into something coherent and complete.
Now, swap out “LLM research” for “training and validating an autonomous vehicle.” AI can generate traffic scenarios or estimate AV behaviors on request, but the same challenges pop up – can it be trusted to make real life-or-death decisions?
- Accuracy: AI can’t reliably grade its own work. You still need rule-based checks to make sure a scenario is safe. Plus, AI-generated scenes might look super realistic, but they’re not always physically correct, and tweaking small details can be tricky.
- Structure and completeness: If a scenario isn’t in the training set, AI may stumble. A structured workflow is essential to guide AI across the full Operational Design Domain (ODD), covering both known challenges and hidden edge cases.
And unlike research notes, the cost of mistakes here is huge – we’re talking about real people’s lives on the road. Add to that the massive scale of traffic situations and evolving regulatory requirements such as SOTIF, and you can see why a systematic process is non-negotiable.

Making AI Production Ready: The Foretellix Approach
AI is powerful for generating diverse, naturally behaving scenarios, but transforming that capability into a production-ready solution presents a completely different challenge. At Foretellix, we’ve developed our own AI engines, yet we know that relying on AI alone isn’t enough. That’s why we combine them with a formal, accurate, and scalable pipeline to complement and strengthen AI capabilities.
Here’s how it works: when a user requests a specific scenario, our scenario generator creates it, adjusting the behaviors of each vehicle and object in real time to meet the user’s intent. Unlike AI alone, these scenarios are guaranteed to perform as designed and remain physically accurate – no surprises, no shortcuts.
Then comes the synergy: once we have a validated trajectory-level scenario, we hand it over to AI tools to generate realistic sensor data. This combination gives you the best of both worlds: formally guaranteed scenarios paired with the richness and photo-realism of AI-generated inputs. On top of that, our pipeline validates the AI outputs to ensure every test is trustworthy and aligned with system requirements.
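As a rough illustration of the kind of check such a validation pipeline might apply, here is a minimal Python sketch that flags trajectories whose speeds or implied accelerations fall outside configured bounds. The class, function, and bounds are illustrative assumptions for this post, not Foretellix's actual API.

```python
from dataclasses import dataclass

@dataclass
class TrajectoryPoint:
    t: float  # timestamp, seconds
    x: float  # position along the lane, meters
    v: float  # speed, meters/second

MAX_ACCEL = 4.0   # m/s^2 -- illustrative physical/comfort bound
MAX_SPEED = 40.0  # m/s   -- illustrative ODD speed cap

def is_physically_plausible(traj: list[TrajectoryPoint]) -> bool:
    """Reject trajectories whose speeds or implied accelerations
    exceed the configured bounds, or whose timestamps go backwards."""
    for prev, curr in zip(traj, traj[1:]):
        dt = curr.t - prev.t
        if dt <= 0 or curr.v > MAX_SPEED:
            return False
        if abs(curr.v - prev.v) / dt > MAX_ACCEL:
            return False
    return True
```

A real pipeline would of course check far more (lateral dynamics, road geometry, actor interactions, intent satisfaction), but the principle is the same: every AI-generated artifact passes through deterministic, rule-based gates before it counts as a test.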
To make this process thorough and manageable, we built an automation and management layer. It translates project goals into actionable coverage goals – defined as the measurable criteria that ensure all relevant driving behaviors, conditions, and edge cases are adequately tested. The system then selects the right engines to fill gaps and provides a high-level dashboard so teams can track progress and apply expert judgment where it matters most.
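The idea of translating project goals into measurable coverage and then detecting gaps can be sketched as follows. The dimensions, buckets, and function here are hypothetical illustrations of the concept, not the product's actual coverage model.

```python
# Hypothetical coverage model: each goal names a scenario dimension
# and the buckets that must all be hit before it counts as covered.
coverage_goals = {
    "cut_in_distance_m": {"0-10", "10-20", "20-40"},
    "road_surface": {"dry", "wet", "icy"},
}

def coverage_gaps(observed: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per dimension, the buckets no test has reached yet."""
    return {
        dim: buckets - observed.get(dim, set())
        for dim, buckets in coverage_goals.items()
        if buckets - observed.get(dim, set())
    }

# Buckets hit so far across all completed runs:
observed = {"cut_in_distance_m": {"10-20", "20-40"},
            "road_surface": {"dry", "wet"}}
print(coverage_gaps(observed))
# -> {'cut_in_distance_m': {'0-10'}, 'road_surface': {'icy'}}
```

The remaining gaps are exactly what the automation layer would route to the engine best suited to fill them.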
The result is a seamless workflow that leverages both our formal technology and AI in perfect harmony, accelerating development while maintaining rigorous safety and reliability standards.
Delivering Trustworthy Scenarios and Results
So what does this toolchain actually deliver? Here’s a closer look:
Formal, accurate, and scalable scenario generation and vehicle performance assessment
- Abstract scenarios, clear intent: While AI models often generate scenarios based on statistical patterns, our constraint-based scenario generator uses the formal ASAM OpenSCENARIO DSL to create precise and reproducible results.
- Its semantic foundation ensures that every scenario is both meaningful and directly aligned with user-defined goals, not just plausible. Teams start with a clear behavioral description of their goal (like “a car cuts in aggressively on a wet road”) and then generate thousands of parameterized, physics-compliant variations.
- Our pre-validated libraries make it easy to expand coverage without losing control of what’s being tested or why.
- Controlled variations: With Foretellix, you can replay real-world drives while adding intelligent environmental and behavioral variations. This enables end-to-end testing of AI systems with inputs that are both diverse and physically plausible. It’s especially useful when real-world data is scarce or when edge cases are too risky to capture on public roads.
- Built-in validation: Whether a scenario is created manually, generated through AI, or derived from logs, it goes through built-in validation. Every test must meet its intent, respect physics, and align with safety standards. AI-generated tests that fail to meet their intent are automatically filtered out, preventing bias in validation or training results. The outcome: failures provide meaningful insights, and passes accurately reflect true system performance under well-defined conditions.
Process automation and management
- Structured workflows: Define project goals, automatically launch tasks, and detect coverage gaps while using the best engine for the job.
- Unified testing: Combine on-road testing and virtual simulation into a single seamless workflow. This lets you tune simulation based on real-world observations and get a complete picture of your AV stack.
- Best-in-class integrations: Our system plays nicely with top industry tools, so you always get the most capable solution available.
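To make the parameterized-variation idea above concrete, here is a minimal Python sketch: given illustrative ranges for a hypothetical cut-in scenario, it samples range-respecting variations. The parameter names and ranges are invented for this post; the actual ASAM OpenSCENARIO DSL expresses such constraints declaratively, and the generator additionally enforces physical consistency across parameters.

```python
import random

# Hypothetical parameter space for a "cut-in on a wet road" scenario.
PARAM_SPACE = {
    "ego_speed_mps": (15.0, 30.0),    # ego vehicle speed
    "cut_in_gap_m": (5.0, 25.0),      # gap at start of the cut-in
    "cut_in_duration_s": (1.0, 4.0),  # time to complete the lane change
}

def sample_variation(rng: random.Random) -> dict[str, float]:
    """Draw one scenario variation, with every parameter inside its range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_SPACE.items()}

rng = random.Random(42)  # seeded for reproducible test sets
variations = [sample_variation(rng) for _ in range(1000)]
```

Seeding the generator is what makes results reproducible: the same seed regenerates the same thousand variations, so a failure found today can be replayed exactly tomorrow.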
Want to Learn More?
Whether you’re building an end-to-end stack that uses sensor data or a rule-based stack that works with object lists and sensor inputs, our approach helps make development and testing more reliable without compromising safety or rigor. It also supports seamless integration with your own AI engines, so you can leverage internal models or toolchains within a structured, validated framework. As AI toolchains evolve, having a future-proof validation approach that can incorporate advanced AI tools while maintaining structure and safety is essential.
By working with leading AV developers and partnering with top AI engine providers, we continue to refine and expand our approach. We’d be glad to share ideas and explore how it could fit your workflow.
For more details, contact us.