Creating “Right-of-Way Violator” Scenarios for Synthetic Sensor Data Generation

Realistic and reliable driving data is imperative for training AI-powered AV stacks with end-to-end planning models, yet real-world data is limited in diversity and scale. The solution is to generate synthetic data that is both scalable and controllable.

This blog outlines the process our team followed using Foretellix’s Foretify development toolchain, integrated with CARLA and NVIDIA Cosmos, to create, select and render high-quality “right-of-way violator” scenarios for the purpose of generating synthetic sensor data. This data feeds into imitation learning pipelines for end-to-end autonomous driving stacks, ensuring diverse, intent-preserving, and scalable training scenarios for improved AV stack performance and safety.

Motivation: Addressing a Critical Data Gap

The process originated from a performance gap identified in an AV stack during validation. The system struggled with a specific traffic scenario: “Driving through a junction with right-of-way violators”.

Upon analysis, AV engineers discovered that:

  • Real-world driving logs lacked sufficient examples of this scenario, especially those where the autonomous vehicle under test (EGO) responds correctly
  • This data scarcity, specifically for safety-critical scenarios, made it impossible to train or evaluate models on this behavior using recorded fleet data alone

To address this, the team adopted the following process to generate synthetic sensor data representing such interactions. The goal was to systematically simulate plausible yet rare violations, ensuring the AV stack learns to handle them through imitation learning with high-quality, diverse, and intent-aligned synthetic data.

1. Requirement Analysis

We began with a natural language requirement document that described the “right-of-way violator” scenario class. This document detailed:

  • Functional objectives of the scenario
  • Expected behaviors of the EGO vehicle (e.g., slowing, stopping, maneuvering)
  • Actions of violating agents, such as a vehicle ignoring or violating stop/yield rules

Our team parsed this information to extract core scenario intents, actor roles and behaviors, and desired outcomes. This human-readable requirement formed the foundation for formal scenario modelling.

2. Abstract Scenario Formalization and Implementation with Foretify Developer and V-Suite Libraries

Using the OpenSCENARIO Domain-Specific Language (DSL), we translated the English-language requirements into a map-agnostic, formalized scenario description. This included:

  • Declarative agent roles (e.g., “violator”, “victim”, “other agents”)
  • Temporal relationships and constraints (e.g., “violator enters intersection 0.5s before EGO”)
  • Desired EGO reactions (e.g., “apply brakes with X deceleration if Time-to-Collision < threshold”)

Foretify V-Suites, a comprehensive library of pre-configured, reusable scenario components, gave us the baseline to jumpstart our scenario definition. Specifically:

  • Junction scenarios from the library served as a starting point
  • These base scenarios were modified and combined to reflect “right-of-way violator” situations
  • This reuse allowed us to accelerate development and ensure consistency with validated scenario patterns

To guarantee correctness and physical plausibility, we applied our domain model:

  • It provides a set of foundational constraints (e.g., traffic rules, geometry constraints, actor capabilities)
  • These constraints ensure that all generated scenario variants are both valid and realistic, regardless of the underlying map

This abstract representation allowed us to separate scenario logic from geographical layout, enabling flexible and scalable reuse across different maps.
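To make the idea concrete, the abstract parameters and domain-model constraints described above could be captured in a small, map-agnostic structure like the following Python sketch. The class and field names are hypothetical illustrations, not Foretify or OpenSCENARIO DSL syntax:

```python
from dataclasses import dataclass

# Hypothetical, map-agnostic sketch of the scenario class described above.
# Field names and bounds are illustrative assumptions, not the actual DSL.

@dataclass
class RightOfWayViolation:
    violator_role: str = "violator"   # agent that ignores the junction rule
    ego_role: str = "victim"          # EGO, which must evaluate and react
    entry_lead_s: float = 0.5         # violator enters the junction this long before EGO
    ttc_threshold_s: float = 2.0      # EGO brakes below this time-to-collision
    ego_decel_mps2: float = 4.0       # target deceleration when reacting

    def is_valid(self) -> bool:
        """Domain-model style sanity constraints on any sampled variant."""
        return (0.0 < self.entry_lead_s <= 2.0
                and self.ttc_threshold_s > 0.0
                and 0.0 < self.ego_decel_mps2 <= 9.0)  # within plausible braking

scenario = RightOfWayViolation()
assert scenario.is_valid()
```

Because nothing in the structure references a specific map, any concrete junction can later be bound to it, which is what enables reuse across layouts.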

Our team then used Foretify Developer tools to implement the scenario. One of the key enablers for this task was Foretify’s controllable driver models. We defined a configurable EGO behavior model that “does the right thing” to avoid collisions (e.g., slowing down or yielding). In contrast, the violating actors were configured to ignore junction rules (e.g., blowing past a yield or stop sign). This dual-driver setup ensured functional fidelity and expressiveness in scenario execution.
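A minimal sketch of the dual-driver idea, with hypothetical policy functions standing in for the configurable driver models (the real models are far richer; this only illustrates the contrast between the two behaviors):

```python
def ego_policy(ttc_s: float, threshold_s: float = 2.0) -> str:
    # Compliant driver model: brake/yield when time-to-collision becomes critical.
    return "brake" if ttc_s < threshold_s else "proceed"

def violator_policy(junction_control: str) -> str:
    # Rule-ignoring driver model: proceeds regardless of stop/yield control.
    return "proceed"

assert ego_policy(1.2) == "brake"          # EGO reacts under time pressure
assert violator_policy("stop") == "proceed"  # violator blows past the sign
```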


3. Runtime Execution with CARLA and Foretify

During runtime, the scenario execution was integrated with CARLA, the open-source autonomous vehicle simulator, to add vehicle dynamics to the co-simulation loop. The architecture included:

  • Foretify orchestrating scenario execution during the co-simulation runtime
  • CARLA simulating vehicle physics, road traction, inertia, and fine-grained movement

This hybrid runtime setup preserved intent, while embedding the richness of realistic dynamics and physical constraints. It ensured that the synthetic sensor data matched real-world driving behaviors under the specified scenario logic.
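The shape of such a lockstep co-simulation loop can be sketched as follows. The orchestrator function stands in for the scenario logic and the physics step stands in for the dynamics engine; both are toy stand-ins, not the Foretify or CARLA APIs:

```python
# Illustrative lockstep co-simulation loop: an orchestrator (standing in for
# the scenario engine) issues target speeds, while a stub physics step
# (standing in for the dynamics simulator) integrates position with inertia.

DT = 0.05  # 20 Hz fixed time step

def orchestrator(t: float) -> float:
    """Scenario logic: EGO targets 10 m/s, then commands a stop at t >= 2 s."""
    return 10.0 if t < 2.0 else 0.0

def physics_step(pos: float, speed: float, target: float):
    # First-order lag toward the commanded speed crudely models inertia.
    speed += (target - speed) * 0.1
    return pos + speed * DT, speed

pos, speed = 0.0, 0.0
for i in range(100):  # 5 simulated seconds
    target = orchestrator(i * DT)
    pos, speed = physics_step(pos, speed, target)

assert speed < 1.0  # EGO has nearly stopped after the braking command
```

The key property this preserves is the one the section describes: scenario intent lives in the orchestrator, while physical plausibility lives in the dynamics step.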

4. Constraint Solving and Large-Scale Generation

We leveraged Foretellix’s constraint solver technology to generate 6,000+ valid scenario instances that preserved the core intent:

  1. Violator agent always challenges right of way
  2. EGO must always be forced to evaluate and act under time pressure
  3. Other traffic conditions, map elements (differing junction layouts), and timing are randomized within defined bounds

This ensured diversity without compromising the semantic consistency of the scenario class.
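The principle behind this intent-preserving randomization can be illustrated with simple rejection sampling in Python. A real constraint solver is far more efficient, and the parameter names and bounds here are assumptions for illustration only:

```python
import random

random.seed(0)

def sample_variant():
    """Randomize timing and conditions within bounds, then keep only variants
    that satisfy the intent constraints (a toy stand-in for a constraint solver)."""
    v = {
        "violator_speed_mps": random.uniform(5.0, 20.0),
        "entry_lead_s": random.uniform(0.1, 2.0),  # violator always enters first
        "junction_type": random.choice(["4-way", "T", "roundabout"]),
    }
    # Intent constraint: EGO must be forced to evaluate and act under time pressure.
    time_pressure = v["entry_lead_s"] < 1.5 and v["violator_speed_mps"] > 7.0
    return v if time_pressure else None

variants = [s for s in (sample_variant() for _ in range(10_000)) if s]
assert all(v["entry_lead_s"] < 1.5 for v in variants)
```

Every accepted variant challenges right of way under time pressure, while speeds, timing, and junction layout vary freely within bounds.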


5. Diversity Analysis and Run Selection

Using our intuitive big data analytics platform, we conducted an in-depth analysis of the generated scenarios:

  • Examined distributions across key metrics (e.g., violator speed, EGO reaction time, collision rate)
  • Ensured broad coverage across corner cases and edge conditions
  • Selected the most representative and diverse runs that optimally meet the scenario intent as the basis for imitation learning

This data-driven approach allowed us to avoid overfitting while maximizing generalization in downstream models.
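One simple way to select a diverse subset over per-run metrics is greedy max-min (farthest-point) selection, sketched below. This is an illustrative stand-in for the analytics platform's run selection, using hypothetical metric vectors of (violator speed, EGO reaction time):

```python
import random

random.seed(1)

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def select_diverse(runs, k):
    """Greedy max-min selection: repeatedly pick the run farthest from the
    set chosen so far, spreading coverage across the metric space."""
    chosen = [runs[0]]
    while len(chosen) < k:
        chosen.append(max(runs, key=lambda r: min(dist(r, c) for c in chosen)))
    return chosen

# Hypothetical metric vectors: (violator speed in m/s, EGO reaction time in s)
runs = [(random.uniform(5, 20), random.uniform(0.2, 2.0)) for _ in range(200)]
picked = select_diverse(runs, 10)
assert len(picked) == 10
```

Selecting for spread rather than frequency is what helps avoid the overfitting risk mentioned above: the densest clusters of near-identical runs contribute only a few representatives.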

6. Sensor Simulation with NVIDIA Cosmos

Selected runs were processed through Cosmos Transfer, a multicontrol world foundation model (WFM), to generate hyper-realistic, physically based sensor simulation scenarios. We used prompt upsampling techniques to expand the dataset across:

  • Weather conditions (e.g., fog, rain, glare)
  • Geographic locations (e.g., urban grids, suburban roads, highway ramps)
  • Lighting variations (e.g., dusk, dawn, night-time)
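The combinatorial effect of expanding each selected run across these axes can be sketched as a Cartesian product of prompt attributes. The prompt template is an assumption for illustration, not the actual Cosmos Transfer interface:

```python
from itertools import product

# Illustrative "prompt upsampling": expand one base prompt per selected run
# across the environmental axes listed above (template is hypothetical).
weather = ["fog", "rain", "glare"]
location = ["urban grid", "suburban road", "highway ramp"]
lighting = ["dusk", "dawn", "night-time"]

base = "right-of-way violator at a junction"
prompts = [f"{base}, {w}, {loc}, {lit}"
           for w, loc, lit in product(weather, location, lighting)]

assert len(prompts) == 27  # 3 x 3 x 3 environmental variants per selected run
```

Each selected run thus fans out into dozens of visually distinct renderings while its underlying scenario logic stays fixed.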


This process demonstrates how, using the Foretify development toolchain, we generated the high-fidelity “right-of-way violator” sensor simulation scenarios required for training the end-to-end AI-powered AV stack, with a solution that is both scalable and controllable.
