
The promise of Safety-Driven V&V
‘A method of solution is perfect if we can foresee from the start, and even prove, that following that method we shall attain our aim.’ — Gottfried Wilhelm Leibniz
We would like to thank all attendees. We hope that this was informative and valuable for you and appreciate the positive feedback that we have received so far. Our goal with these webinars is to visit key topics and jointly find solutions for the benefit of the industry.
Many users requested a transcript of the webinar Q&A segment, and this blog post is dedicated to that. Note that it is more detailed than our typical blogs. The questions range from conceptual and methodology topics to standard implementation, so you can choose to read all of it or jump to the specific answers.
Feel free to bring up other topics of interest in the comments below or send an email to info@foretellix.com. We will do our best to respond to any requests within the next couple of sessions and maybe even hold full sessions around the deeper or more interesting topics.
We have already published the date of our next webinar on the much-requested topic of requirements-based testing and how to efficiently deploy this with an automated V&V process. The webinar will take place on the 24th of March at 5 pm CET. You can register to join and ask questions.
Before addressing the question, it is important to clarify that OpenSCENARIO 2.0 is a standard input format, and the ability to capture scenario intent can fuel multiple tools and purposes. The standard itself neither limits nor dictates a specific purpose, flow, or tool. In regard to training, we agree with the question’s suggested direction. The steerable random flow allows users to create multiple surprising scenarios that can be used for both training and validation purposes. There is much more to be said on the training flow and the mix of surprises with real-life probabilities.
As the question suggests, for a lot of practical applications you need to use constraints. Logical scenarios allow random selection (aka Monte-Carlo) within ranges. These ranges define a space of scenarios to be tested but this isn’t enough for most practical purposes. Attributes such as speed, distance, time, acceleration, and even the road, are tightly correlated with implied physical constraints. For example, driving at a certain speed for a certain amount of time determines the distance traveled and requires a road of sufficient length. Constraints also provide abstraction. They allow capturing a high-level intent in a formal description. This is necessary for technology to replace the tedious cognitive work by humans that is currently the norm. For example, the tool will select a proper segment for a cut-in scenario, adjusting the scenario’s physical and intent considerations accordingly.
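The interplay between random selection within ranges and an implied physical constraint can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the parameter ranges, and the use of simple rejection sampling are all hypothetical, and a real OSC2 tool would use a constraint solver rather than retrying draws.

```python
import random

def sample_concrete_scenario(rng, speed_range_kph, duration_range_s,
                             road_length_m, max_tries=1000):
    """Draw speed and duration uniformly from their ranges, keeping only
    a draw whose implied travel distance fits on the road segment."""
    for _ in range(max_tries):
        speed_kph = rng.uniform(*speed_range_kph)
        duration_s = rng.uniform(*duration_range_s)
        # Implied physical constraint: distance = speed * time (kph -> m/s).
        distance_m = speed_kph / 3.6 * duration_s
        if distance_m <= road_length_m:
            return {
                "speed_kph": speed_kph,
                "duration_s": duration_s,
                "distance_m": distance_m,
            }
    raise ValueError("no feasible scenario found within max_tries")

# Hypothetical ranges and road length, chosen only for illustration.
rng = random.Random(0)
scenario = sample_concrete_scenario(rng, (30, 120), (5, 20), road_length_m=400)
print(scenario)
```

The ranges alone would allow a 120 kph drive lasting 20 seconds, which needs far more than 400 m of road; the constraint is what removes such combinations from the space.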
Roads and intersections are structs composed of multiple segments. See the domain model subsection in the LRM.
We agree that this is needed. In general, the more you can standardize the better, but the creation of 2.0 was a huge cooperative effort. One reason this version of OpenSCENARIO is called 2.0 is that we will eventually have more than just two versions. There was a lot of excellent work done to standardize things, yet some solid ideas just couldn’t fit into our schedule for 2.0. For instance, there are basic methods of reporting errors and warnings as well as more sophisticated controls that were postponed to a future release. Foretellix has standards that we will be happy to contribute to subsequent versions, but we’ll see how that pans out.
The OSC2.0 domain model includes many actions and modifiers in addition to lanes, such as movable objects that can move in free space, and a car that can drive along paths. See the LRM for more details.
There is no single solution that ensures 100% full coverage of the entire scenario space. Coverage Driven Verification allows you to simply and effectively define and achieve goals. The use of automation enables scale and measurability and reduces room for human errors. As such, it is more effective than the manual approach which often suffers from major holes. At the same time, there are important points to make about CDV and the possibility of missing holes:
CDV enables an automated flow full of checks and balances to identify coverage holes or missing requirements.
OSC2.0 coverage features facilitate organizing infinity with a gradable coverage model. As a basis, each coverage item is calculated according to its collected type expressions:
OSC2 lets you further shape the grades by ignoring values that are not interesting (for example, I do not care about trucks in my project), setting a different target for some values (for example, I want to try at least 10 cut-ins from the right and 5 are enough from the left), and more. Users may build different dashboards and analytics tools around these language concepts.
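To make the idea of shaping grades concrete, here is a minimal Python sketch. The grading formula (average fill ratio over the buckets that are not ignored, capped at 1.0 per bucket) is an assumption made for illustration; it is not taken from the OSC2 standard or from any tool, and the names `item_grade`, `targets`, and `ignore` are hypothetical.

```python
def item_grade(hits, targets, ignore=()):
    """Grade one coverage item as the average fill ratio of its buckets.

    hits    -- bucket -> number of times the bucket was observed
    targets -- bucket -> number of hits needed to consider it covered
    ignore  -- buckets excluded from the grade altogether
    """
    buckets = [b for b in targets if b not in ignore]
    if not buckets:
        return 1.0  # nothing left to cover
    filled = sum(min(hits.get(b, 0) / targets[b], 1.0) for b in buckets)
    return filled / len(buckets)

# Different targets per value: 10 cut-ins from the right, 5 from the left.
side_grade = item_grade(hits={"left": 5, "right": 4},
                        targets={"left": 5, "right": 10})

# Ignoring a value: trucks are not interesting in this project.
kind_grade = item_grade(hits={"car": 3},
                        targets={"car": 1, "truck": 1, "bus": 1},
                        ignore={"truck"})
print(side_grade, kind_grade)
```

In this sketch the left side is fully covered and the right side is 40% covered, so the item is graded at roughly 0.7; ignoring trucks leaves two buckets, of which one is hit.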
This is the core issue of taming infinity. Scenarios have parameters, each with a certain range of values. Also, because you can always mix scenarios, each with multiple parameters, the possibilities will never end, right? Even thinking about that makes your head spin. And one of the things we try and show — and I think you got at least a first glimpse of it — is that there are ways to use coverage definitions and related technology to make this a kind of optimization problem in the following sense: if you just have one night and, say, a hundred machines, what is the best way to use those resources to collect a first pass coverage?
If you have a weekend and a thousand machines, then here is your next step, but you’re going to cover all your scenarios, breadth-first at least to some degree. Then if you have more time, you can collect more and more coverage. And obviously, you will never do all the combinations. In fact, even with just a single speed parameter, you will never get to all the combinations. So technology and methodology are required to make a practical solution.
So as I think the writer of this question already implied, when you define your coverage, you should be thinking through what are the dangerous things and what are the various interactions. In a sense, a coverage model should be a map of your fears. And so if you are assuming that in this ODD there will be multiple car types, and there will be trucks, busses, and so on, then you had better define a coverage model that goes through them all. And if you think there will be snow and rain as well as nice days and fog, you had better go through all of these and then make sure that you cross them all and combine them all. And if you don’t have the assets, or if there’s a specific simulator you’re connected to that does not support snow, for instance, then you should do something about it, such as using a better simulator.
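Crossing coverage dimensions and looking for the combinations that were never exercised can be sketched as follows. The dimension names and the observed runs are made up for illustration; only the idea of crossing vehicle types with weather comes from the answer above.

```python
from itertools import product

def cross_holes(axes, observed):
    """Return every combination of the crossed axes that no run has hit."""
    all_combos = set(product(*axes.values()))
    return sorted(all_combos - set(observed))

# Hypothetical coverage axes for an ODD.
axes = {
    "vehicle": ["car", "truck", "bus"],
    "weather": ["clear", "rain", "snow", "fog"],
}
# Hypothetical (vehicle, weather) pairs seen in executed runs.
observed = [("car", "clear"), ("car", "rain"), ("truck", "clear"), ("bus", "fog")]

holes = cross_holes(axes, observed)
for combo in holes:
    print("not yet covered:", combo)
```

If a whole column of holes shares one value, for instance every combination involving snow, that usually points to a missing asset or an unsupported simulator feature rather than to bad luck in random selection.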
Ranking is one of the optimization capabilities that Foretellix built on top of the OSC2 standard input format. Specifically, ranking optimization helps you create an optimized test suite for the entire V&V effort or for selected V&V needs.
The fact is that executed scenarios cover their own specified intent but might also cover much more in a by-the-way manner. For example, while doing a cut-in in a random location I may also drive under a bridge. Ranking can take a test suite with 500 tests and identify 50 tests that produce the same coverage result. In subsequent test regressions, I may run 50 scenarios that have minimal redundancy, shorten the test suite execution time, and save the expensive simulation cycles for other goals.
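One common way to approach this kind of test-suite reduction is a greedy set-cover heuristic, sketched below in Python. This is an illustrative stand-in, not a description of the Foretellix ranking algorithm; the test names and coverage buckets are hypothetical, and a greedy pass does not guarantee the smallest possible subset.

```python
def rank_tests(coverage_by_test):
    """Greedily pick tests until the selection hits every coverage bucket
    that the full suite hits."""
    remaining = set().union(*coverage_by_test.values())
    selected = []
    while remaining:
        # Pick the test that adds the most still-uncovered buckets.
        best = max(coverage_by_test,
                   key=lambda t: len(coverage_by_test[t] & remaining))
        selected.append(best)
        remaining -= coverage_by_test[best]
    return selected

# Hypothetical suite: each test maps to the buckets it happened to hit,
# including by-the-way coverage such as driving under a bridge.
suite = {
    "cut_in_1": {"cut_in.left", "under_bridge"},
    "cut_in_2": {"cut_in.left"},
    "cut_in_3": {"cut_in.right", "under_bridge"},
    "cut_in_4": {"cut_in.right"},
}
selected = rank_tests(suite)
print(selected)
```

Here two of the four tests reproduce the coverage of the whole suite, so the other two can be dropped from a regression run without losing any bucket.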
The answer is yes. Please note that these webinars are really about the approach. CDV efficiently spans the desired space and identifies unknown scenarios that can later be sent to a KPI optimizer. Specifically, Foretellix technology uses machine learning algorithms to identify feature importance, which helps with both debugging and efficient scenario exploration.
Yes. By default, a uniform distribution is applied for value selections. This means that all legal values have the same unbiased chance of being selected. You can use OSC2 default constraints to capture the desired probabilities for value selection. As the name suggests, default constraints apply default values. If a test constraint further steers the needed distribution (for example, for an edge case scenario), the default constraints are ignored. Note that in edge case scenarios, even though one behavior is pushed to the extreme, it is important that the other behaviors remain normal, to avoid a scenario that is just too crazy. Note also that in some cases where statistical reports are needed, a joint distribution needs to be applied.
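The relationship between uniform selection, default probabilities, and an overriding test constraint can be modeled roughly as below. This is a simplified Python sketch, not OSC2 syntax: the function `pick_value`, the weights, and the way a test constraint is passed as a predicate are all assumptions made for illustration.

```python
import random
from collections import Counter

def pick_value(rng, legal_values, default_weights=None, test_constraint=None):
    """Select one value.

    With no arguments beyond the legal values, selection is uniform.
    default_weights biases the selection toward desired probabilities.
    A test_constraint, when given, overrides the defaults entirely.
    """
    if test_constraint is not None:
        candidates = [v for v in legal_values if test_constraint(v)]
        weights = None  # defaults are ignored once the test steers the value
    else:
        candidates = list(legal_values)
        weights = ([default_weights[v] for v in candidates]
                   if default_weights else None)
    if not candidates:
        raise ValueError("test constraint excludes every legal value")
    return rng.choices(candidates, weights=weights, k=1)[0]

rng = random.Random(1)
weather = ["clear", "rain", "snow"]
realistic = {"clear": 0.8, "rain": 0.15, "snow": 0.05}  # hypothetical probabilities

typical = Counter(pick_value(rng, weather, realistic) for _ in range(1000))
edge = [pick_value(rng, weather, realistic, test_constraint=lambda v: v == "snow")
        for _ in range(10)]
print(typical, edge)
```

The typical runs follow the default probabilities, while the edge-case test gets snow every time regardless of how rare the defaults make it.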
OSC2 provides success criteria mechanisms to enable self-checking tests. Such mechanisms are critical to enabling scale. Two main checker categories are:
This is exactly true. When you go from thousands of runs per night to a hundred thousand runs per night, you can no longer think about checking things manually. Not every accident and not every collision is necessarily a SUT error. If you create a challenging scenario where, for example, 50% of the instances of that scenario are supposed to end in a collision, then obviously not all of those collisions are errors of the SUT. You need a pretty sophisticated way to define what is clearly an error, what is clearly okay, and what are the gray areas that you may want to check later. And it is a combination of technology, methodology, and even art.
You also need a standard format for errors, and you need tools that help you cluster the errors, help you analyze them. We didn’t get to show that, but I think that’s an interesting thing to see.
The constraints include both scenario constraints, such as vehicle category, and implied physical constraints, such as those that connect distance to speed, to acceleration, and so on. An OSC2 tool should adjust the scenario to meet both your scenario and implied physical constraints, or it should report an error if no such scenario can be achieved. Many times, the constraint solver may report constraint contradictions even before simulation starts. In the case of partially controlled actors (for example, smart actors driven by behavioral models) a runtime contradiction may be reported. As was said previously, Foretellix tools load the physical constraints by default, so users can focus on the abstract scenario constraints.
One of the advantages of a standard is that the multiple vendors involved will push the industry forward. PMSF offers a parser and CLI for OSC2.0 syntax checking.
Scenario parameter ranges are just another type of constraint. As discussed, abstract scenarios can include any kind of constraint to capture dependencies. A parameter range can be between any two expressions: not just a literally specified range such as keep(speed in [10..100]KPH) but also keep(speed in [attr_a..attr_b]). Note also that the final calculated values will be a resolution of the range and the context scenario. For example, if a truck is used as part of the cut-in scenario, the eventually randomized vehicle speeds will conform to both the range requirement and the truck speed capabilities.
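For simple numeric ranges, resolving the scenario range against the actor's capabilities amounts to intersecting intervals, and an empty intersection is a contradiction that can be reported before any simulation runs. The Python sketch below illustrates that idea only; the truck speed limits are invented, and a real constraint solver handles far more than interval intersection.

```python
def resolve_range(*ranges):
    """Intersect closed (low, high) ranges; raise if they contradict."""
    low = max(lo for lo, _ in ranges)
    high = min(hi for _, hi in ranges)
    if low > high:
        raise ValueError(f"contradicting constraints: {ranges}")
    return (low, high)

scenario_speed_kph = (10, 100)  # range taken from the keep() example above
truck_speed_kph = (0, 90)       # hypothetical truck capability

resolved = resolve_range(scenario_speed_kph, truck_speed_kph)
print(resolved)  # the speeds a truck may be given in this scenario

try:
    # A test asking for 120-150 kph cannot be met by this truck.
    resolve_range((120, 150), truck_speed_kph)
    contradiction_reported = False
except ValueError:
    contradiction_reported = True
```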
By default, generated scenarios and dynamics are both feasible and physically possible. Note that virtual platforms may benefit from running scenarios that are not physically possible. (For example, you can save simulation time by allowing infinite acceleration.) The OSC2 LRM and domain model allow executing actions that are not physically possible, as well as those that are. There are methodology guidelines on how to write reusable scenarios that are portable between virtual and physical platforms.
The definition of edge-case scenarios can be wide. It could be non-typical parameter values, the desired traffic density, error injections, road conditions, a specific mix of scenarios, and more. There are efficient ways to come up with the right set of edge-cases in the planning process. Once the plan is set, OSC2 facilitates translating the functional goals into abstract scenarios and a coverage model to efficiently meet these goals.
Not at this point. In terms of effort, it would be extremely challenging to create a reference implementation. Also, past experience has shown that a reference implementation can hold back industry progress by forcing a specific implementation scheme on vendors, with the result that, for example, constraint solvers did not progress.
As usual, drive safe,