For all the diligence required to solve this sort of problem, you’d think that would start pushing programming more toward, y’know, engineering as a way of building things. But at least it’s a cool story!
As it was a hardware issue in this case, I’m not sure I understand what you’re saying. Do you mean that if, say, the software was verified and guaranteed not to crash, they would have immediately diagnosed the crash as a hardware issue, thus saving a lot of time?
In general, you can do that sort of thing in Design-by-Contract with a certified and a non-certified compiler. In debug mode, the contracts can all become runtime checks showing you exactly which module fed bad input into the system. That lets you probe around to localize exactly which bad box accepted input that satisfied its preconditions, did something with it, and produced output that broke the system. When looking at that module, it will normally be a software bug. However, you can run all failing tests through both a certified and regular binary to see if the failure disappears in one. If it does, it’s probably a compiler error. You can run a similar check sequentially vs. concurrently in case it’s a concurrency bug. Likewise, if the logic makes sense, it’s not concurrency, and it passes on the certified compiler, it’s probably an error involving something else reaching into your memory or CPU to corrupt it. That’s going to be either a hardware fault or a privileged software component. With some R&D, I think we could develop techniques for those components that do something similar to DbC in software for quickly isolating hardware faults.
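To make that concrete, here’s a minimal sketch of debug-mode contract checks in C. The REQUIRE/ENSURE macros and the isqrt example are hypothetical, not from any particular framework; they just show how a failing contract points at the exact box that broke:

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical DbC macros: runtime checks in debug builds,
     * compiled away in release builds (NDEBUG). */
    #ifdef NDEBUG
    #define REQUIRE(cond, msg) ((void)0)
    #define ENSURE(cond, msg)  ((void)0)
    #else
    #define REQUIRE(cond, msg) \
        do { if (!(cond)) { fprintf(stderr, "precondition failed: %s\n", msg); abort(); } } while (0)
    #define ENSURE(cond, msg) \
        do { if (!(cond)) { fprintf(stderr, "postcondition failed: %s\n", msg); abort(); } } while (0)
    #endif

    /* Example module: integer square root. A violated contract tells you
     * whether the caller fed in bad input or this module produced bad output. */
    unsigned isqrt(unsigned n)
    {
        REQUIRE(n <= 0x7FFFFFFFu, "isqrt: n within supported range");
        unsigned r = 0;
        while ((r + 1) * (r + 1) <= n)
            r++;
        ENSURE(r * r <= n && (r + 1) * (r + 1) > n, "isqrt: r == floor(sqrt(n))");
        return r;
    }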
That said, I don’t think it was applicable in this specific case. They wouldn’t have seen that problem coming unless they were highly experienced embedded engineers. I’ve read articles from such engineers where they look into things like the effects of temperature, the bootloader starting before PLLs sync up, etc. Although I can’t find the link, this isn’t the first time sunlight has killed off a box or piece of software. I’ve definitely seen this before; I think a friend might have experienced it with a laptop, too. We might add to the best practices for hardware/software development to make sure the box isn’t in sunlight or another situation that can throw off its operating temperature. PC builders have always watched that a bit, but maybe developers of new hardware should ensure it’s true by default. The hardware builders should also test the effects of direct sunlight and other heat to make sure their boxes don’t crash. Some do already.
However, you can run all failing tests through both a certified and regular binary to see if the failure disappears in one. If it does, it’s probably a compiler error.
I don’t think that’s true, at least in C. I know CompCert at least takes “certified” to mean “guarantees well-defined output for well-defined input”, so it’s free to make a hash of any UB-containing code, the same as Clang.
That said, if your test results change between any two C compilers, it’s a strong suggestion you have a UB issue.
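For example, here’s a small sketch (assuming a typical 32-bit int) of code where two conforming compilers can legitimately produce different answers because of UB:

    #include <limits.h>
    #include <stdio.h>

    /* Signed overflow is undefined behaviour. An optimizer that assumes
     * "x + 1 > x" always holds for signed x may fold this to 1, while an
     * unoptimized build may wrap to INT_MIN and print 0. Output differing
     * across compilers (or optimization levels) is the tell. */
    static int overflows(int x)
    {
        return x + 1 > x;   /* UB when x == INT_MAX */
    }

    int main(void)
    {
        printf("%d\n", overflows(INT_MAX));
        return 0;
    }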
True, too. There are teams out there that test with multiple compilers to catch stuff like that. The OpenBSD folks have mentioned it as a side benefit of their cross-platform support.
That said, if your test results change between any two C compilers, it’s a strong suggestion you have a UB issue.
In C, this can also mean that you depend on implementation-defined behaviour or unspecified behaviour, which are not the same as undefined behaviour (though relying on them will often also be a bad thing ;)).
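A quick illustrative sketch of the three categories in C, since they’re easy to mix up:

    #include <stdio.h>

    static int set(int *p, int v) { *p = v; return v; }

    int main(void)
    {
        /* Implementation-defined: right-shifting a negative signed value.
         * The compiler must document its choice and apply it consistently. */
        int a = -8 >> 1;

        /* Unspecified: the order in which the operands of + are evaluated.
         * Either order is conforming, so x may end up as 1 or 2. */
        int x = 0;
        int b = set(&x, 1) + set(&x, 2);

        /* Undefined: e.g. signed overflow (INT_MAX + 1) - anything goes. */

        printf("a=%d b=%d x=%d\n", a, b, x);
        return 0;
    }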
Are you proposing gamedev companies should adopt those kinds of techniques?

I’m always pushing all programmers to adopt anything that helps them and fits within their constraints. Aside from Design-by-Contract, I’d hold off on recommending game developers do what I described until the tooling and workflow are ready for easy adoption. I’ve got probably a dozen high-level designs for it turning around in my head while I try to find the simplest route.
One thing I’m sure about is that using C++ is a drawback, since it’s so hard to analyze. Just about everything I do runs into walls of complexity if the code starts in C++. I’m still working on ideas for that, like C++-to-C compilers or converting it into an equivalent, easier-to-analyze language that compiles to C (e.g., ZL in Scheme). So, I recommend avoiding anything as complex as C++ if one wants the benefits of future work on analyzing either C or intermediate languages.
Edit: Here was a case study that found DbC fit game dev requirements.
The other day there was a link to a project that does source-to-source transformation on C++ code to reduce the level of indirection: https://cppinsights.io
Try putting together an exhaustive list of FOSS tools doing this stuff for C and for C++. You’ll find there are several times more for C, and they collectively get more done. There are also several certifying compilers for C subsets, while formal semantics for C++ are still barely there despite the language being around for a long time.
So, that’s neat, but it’s one anecdote going against a general trend.
I’m always pushing all programmers to adopt anything that helps them and fits within their constraints.
This is meaningless - is there someone who doesn’t?
Aside from Design-by-Contract, I’d hold off on recommending game developers do what I described until the tooling and workflow are ready for easy adoption.
So the one thing you do propose would not have helped at all with the overheating console or with the performance regression from that post ;)
One thing I’m sure about is that using C++ is a drawback, since it’s so hard to analyze.
Sure, it’s hard, but there are tools that can do some sort of static analysis for it (for example, Coverity or Klocwork). Either way, there are no alternatives to C++ today as the language for an engine that can be used for AAA games.
Here was a case study that found DbC fit game dev requirements.
Have you actually read it? I have nothing against DbC, but as far as I can see that study doesn’t really show any great benefits of DbC nor is it realistic. They do show that writing assertions for pre/post conditions and class invariants helps in diagnosing bugs (which is obvious), but not much more.
They don’t show that really hard bugs are clearly easier to diagnose and fix with DbC, nor do they show that the cost/benefit ratio is favourable for DbC.
Finally, that paper fails to describe in detail how the experiment was conducted - all I could gather is this:
Implementation took approximately 45 days full-time and led to source code consisting of 400 files.
code was predominantly implemented by one person
Even if it were an interesting paper (which imo it is not), it’s impossible to try to replicate it independently.
“This is meaningless - is there someone who doesn’t?”
Yes, most developers don’t, if you look at the QA of both proprietary and FOSS codebases. I mean, every study done on formal specifications said developers found them helpful. Do you and most developers you see use them? Cuz I swear I’ve been fighting an uphill battle for years even getting adoption of consistent interface checks and code inspection for common problems.
“but as far as I can see that study doesn’t really show any great benefits of DbC nor is it realistic. They do show that writing assertions for pre/post conditions and class invariants helps in diagnosing bugs (which is obvious)”
It’s so “obvious” that most developers aren’t doing DbC. What it says is that DbC fits the requirements of game developers. Most formal methods don’t. It also helped find errors quickly. It’s not mere assertions in the common way they’re used: contracts can range from simple Booleans to more complex properties. One embedded project used a whole Prolog. Shen does something similar to model arbitrary type systems so you can check whatever you want to. Finally, you can generate tests directly from the contracts, and the runtime checks can be combined with fuzzers to take you right to the failures. Is that standard practice among C developers like it has been in Eiffel for quite a while? Again, you must be working with some unusually QA-focused developers.
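As a rough sketch of that last point, here’s hypothetical contract-driven random testing in C, reusing the REQUIRE/ENSURE-checked isqrt from the earlier sketch: generate fuzzer-style inputs, filter them through the precondition, and let the postcondition abort right at any failure:

    #include <stdio.h>
    #include <stdlib.h>

    unsigned isqrt(unsigned n);   /* the contract-checked module above */

    int main(void)
    {
        srand(42);   /* fixed seed so any failure reproduces */
        for (long i = 0; i < 1000000; i++) {
            unsigned n = ((unsigned)rand() << 16) ^ (unsigned)rand();
            if (n > 0x7FFFFFFFu)
                continue;        /* skip inputs violating the precondition */
            (void)isqrt(n);      /* ENSURE() inside flags any bug directly */
        }
        puts("all generated tests passed");
        return 0;
    }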
“would not have helped at all with the overheating console”
The first part of my comment was about the general case. The second paragraph said exactly what you just asked me about. Did you read it?
“Even if it were an interesting paper (which imo it is not), it’s impossible to try to replicate it independently.”
The fact that it used evidence at all puts it ahead of many programming resources that are more like opinion pieces. The good news is you don’t have to replicate it: you can create a better study that tries the same method against the same criteria. Then replicate that, if that’s the kind of thing you want to do.
I mean, every study done on formal specifications said developers found them helpful. Do you and most developers you see use them?
Academic studies show one thing, while practitioners for unknown reasons do not adopt practices recommended by academics. Maybe the studies are somehow flawed. Maybe the gains reported in the studies don’t have a good ROI for most of the gamedev industry?
What it says is that DbC fits the requirements of game developers. Most formal methods don’t. It also helped find errors quickly. It’s not mere assertions in the common way they’re used
The study was using “mere assertions in the common way they’re used”, so I don’t know what the point of the rest of that paragraph is - the techniques you mention there aren’t even mentioned in that paper, so there is no proof of their applicability to gamedev.
The first part of my comment was about the general case. The second paragraph said exactly what you just asked me about.
I asked you about your recommendations for gamedevs, not about some unconstrained general case, and just pointed out that your particular recommendation would not have helped in the case from the story - nothing more :)
Did you read it?
Sure I have, but to make it easier in future try to make your posts more succinct ;)
Academic studies show one thing, while practitioners for unknown reasons do not adopt practices recommended by academics.
I agree in the general case. Except that most practitioners who try some of these methods get good results; then most other practitioners ignore them. Just as CompSci has its irrelevant stuff, practitioners have their own cultures of chasing some things that work and some that don’t. However, DbC was deployed to industrial practice via Eiffel and EiffelStudio. The assertions that are a subset of it have obvious value. SPARK used them for proving the absence of errors. Now Ada 2012 programmers are using it as I described, with contract-based testing. Lots of people are also praising property-based testing and fuzzing based on real bugs they’re finding.
So, this isn’t just a CompSci recommendation or study: it’s a combo of techniques that each have lots of industrial evidence and supporters, that work well together, and that have low cost. With that, either mainstream programmers don’t know about them or they’re ignoring effective techniques despite evidence. The latter is all too common.
“I asked you about your recommendations for gamedevs, not about some unconstrained general case”
What helps programming in general often helps them, too. That’s true in this case.
“Sure I have, but to make it easier in future try to make your posts more succinct ;)”

Oh, another one of you… Haha.
Lots of people are also praising property-based testing and fuzzing based on real bugs they’re finding.
Those techniques are not what I would call “formal specifications” any more than the simplest unit tests are, but if you consider them as such, then…
either mainstream programmers don’t know about them or they’re ignoring effective techniques despite evidence. The latter is all too common.
… I have no studies to back this up, but my experience is different. DbC (as shown in that study you linked), property-based testing, and fuzzing-based testing are techniques that are used by working programmers. Not for all the code, and not all the time, but they are used.
When I wrote about studies showing one thing and real life showing something opposite, I was thinking about methods like Promela, SPIN, or Coq.
That makes more sense. I see where you’re going with Coq, but SPIN/Promela have lots of industrial success, similar to TLA+ usage now. They found protocol and hardware errors easily that other methods missed. Check them out.