1. 3

    Required reading for this should be Taleb’s Black Swan book. The statistical consequences of Fat Tails

    A couple of observations….

    • I’m sure the distribution of software project durations is a mixture of several underlying distributions, so saying it’s just “log normal” understates things. E.g. some projects have high levels of technical risk, i.e. some tasks are just not feasible, and the time taken to prove they’re not feasible is not finite.

    • I’m sure the distribution is truncated fat-tailed, i.e. the only reason the mean time is finite is that a lot of projects get truncated at the far right due to budget limitations.

    • The whole field suffers from the Ludic Fallacy. Projects aren’t a game operating to strict rules. Shit of every conceivable flavour happens in real life all the time.
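    As an illustrative sketch of that truncation point (the tail index and budget cap below are made-up numbers, not fitted to any real project data), a fat-tailed distribution with an infinite theoretical mean still yields a finite observed mean once a budget cap cancels the far right tail:

```python
import random

random.seed(42)

# A Pareto tail with index alpha <= 1 has an infinite theoretical mean:
# sample means never settle down, no matter how many projects you observe.
alpha = 0.9

def pareto_sample(n):
    # Inverse-transform sampling: x = u**(-1/alpha) with u in (0, 1], so x >= 1
    return [(1.0 - random.random()) ** (-1.0 / alpha) for _ in range(n)]

durations = pareto_sample(100_000)

# "Budget truncation": any project exceeding the cap is cancelled and
# recorded at the cap, which forces the observed mean to be finite.
budget_cap = 100.0
observed = [min(d, budget_cap) for d in durations]

mean_observed = sum(observed) / len(observed)
print(mean_observed)  # finite, bounded above by budget_cap
```

    The raw sample maximum, by contrast, keeps growing with the sample size, which is why an untruncated mean estimate is meaningless here.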

    There is a fundamental disconnect between contracts, “I promise to deliver X by hard deadline Y in return for payment Z” and estimating “If I do X, how long will it take me and how much will it cost?”

    Business is fundamentally a betting game because “how long will it take me” and “how much will it cost” are fundamentally random variables.

    There is irreducible risk in collapsing a probabilistic estimate into a hard contract.

    You can reduce that risk, but you cannot remove it.

    If you’re uncomfortable with that you need to be in a different game.

    1. 1

      Thanks for the link.

      I’ve started the Black swan and Antifragile both a couple of times but didn’t manage to get through either one of them. One problem was Taleb’s writing style, but also a lack of background knowledge in statistics. (I don’t have a very good understanding of statistics outside of what they teach during engineering programs in a university.) Trying to get even the part 2.1 of the linked doc proved somewhat difficult.

      You seem to be quite knowledgeable in these topics. Would you have any pointers, where one could possibly even get started, if the goal would be to self-study the required background knowledge for this?

        1. 2

          One problem was Taleb’s writing style.

          Sigh. Yes.

          His core point is that most of the “background knowledge in statistics” you’d get in the average stats course at uni is focused on the small subset of problems that are amenable to analysis.

          Preferably those with closed-form solutions, preferably teachable in a course that has to be endured.

          (Typically by engineering or economics or psychology students or whatever, who actually want to be studying something else, but this damn stats course is a prerequisite.)

          Alas, as he hammers home almost to the point of tedium, most of reality does not satisfy the requirements for those classical statistical methods to be appropriate.

          Which leaves us somewhat empty-handed….

          Fortunately Taleb does try to hand us some digestible takeaways. I’d focus on the section “2.3 The Main Consequences” and google around to try to make sense of each point.

          Certainly I’d also keep an eye on what Derek Jones is doing (add his blog to your RSS feed (I have)) https://lobste.rs/s/srgwbv/why_software_projects_take_longer_than#c_u9ia1a

          Alas, science traditionally carries the “all other things being equal” caveat…. which in an industrial software project is never ever true, and that makes drawing valid conclusions very hard. https://leanpub.com/leprechauns/read

          The assumption in all traditional stats is that the data isn’t truncated. Real-world projects are always truncated in several dimensions. (Insufficient money, CPU power, project becomes irrelevant, …..)
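          A small sketch of that bias (the distribution and numbers are invented for illustration): averaging truncated records silently underestimates the untruncated mean:

```python
import random

random.seed(0)

# Hypothetical "true" effort distribution: exponential with mean 10
# (units arbitrary; chosen purely for illustration).
true_mean = 10.0
efforts = [random.expovariate(1.0 / true_mean) for _ in range(100_000)]

# Projects are cut off once they hit the budget, so the recorded data
# is truncated and the naive mean computed from it is biased low.
budget = 15.0
recorded = [min(e, budget) for e in efforts]

naive_mean = sum(recorded) / len(recorded)
print(naive_mean)  # systematically below the true mean of 10
```

          Any analysis that treats the recorded values as the full distribution inherits this bias, which is exactly the problem with drawing conclusions from real-world project datasets.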

      1. 6

        “They could have designed a two-channel system. Or they could have tested the value of angle of attack on the ground,” said Lemme. “I don’t know why they didn’t.”

        Partially for the same reason the FAA deferred certification to Boeing: deadlines and management pressure. I highly doubt this is the end of the story; there are deeper issues here.

        I look forward to the final STAMP investigation.

        1. 1

          I think they also touched on another point in the article (disclaimer: I’m not an expert in safety regulation for aviation, but I assume the principles are rather similar to those in industrial systems). The safety analysis of the MCAS system classified the risk associated with the system too low. The single-channel system fulfilled the regulatory requirements for the lower estimated risk level; had the risk analysis classified the associated risk higher, they would’ve had to build this particular system with a two-channel architecture. To be honest, I was surprised when I read that it was a single-channel system; I was under the impression that mostly everything in aeroplanes is two- or three-channel. :)

          But this was also just a single problem in the whole engineering process. I spotted this article on the System Safety mailing list; some people there have rightly noted that the problem in this case is with the overall systems engineering.

          The point about outsourcing the certification to Boeing was also a surprise to me. I’ve heard that major corporations can pull such tricks off, where they argue that a separate organisation within the company is sufficiently independent of the development team to achieve the required independence when executing the risk analysis and “third-party” review. But somehow the way it was done here seemed quite “unclean”.

          A STAMP investigation would surely be interesting; we’ll see if that happens. I have the feeling that it’s still not that widely used, even in this particular industry.

        1. 4

          Thanks for posting! The article was interesting, even if at some points it felt a bit overly dramatic.

          If someone wants to know a bit more about the technical side, there’s a report from Dragos available and a short talk from Schneider Electric on Youtube.

          There are some interesting details related to the incident: the whole thing wouldn’t have been possible if a physical switch on the safety PLC had been used to set the device to ‘run’ mode (which prevents software changes), as instructed by the manufacturer, instead of it being left in ‘program’ mode.

          1. 13

            Selfishly, I’d recommend my talk on YouTube over Schneider’s. I was on site for the incident response and led the investigation of the controls systems side of the breach. Our team took over from Julian’s after they identified that there was malware present on the engineering workstation.

          1. 2

            I’m always conflicted about Systems Theory….

            Part of me says cool…. part of me says it’s so fuzzy and glosses over so many details as to be provably useless. (Look at any system which produces chaotic behaviour.)

            This seems like it might be a good use for it….

            ie. Use systems theory to identify all contributory factors and take action on all of them instead of on a single scapegoated “root cause”.

            But then there’s that niggle about vagueness and fuzziness, and how it doesn’t fit with any software-intensive system I work with….

            1. 2

              What’s good about the presentation in this handbook is that the STPA method is presented as a step-by-step process, without much of the fuzzy stuff. Leveson & co. claim to have reached pretty good results with the methodology, usually exceeding what was reached by FMEAs or HAZOPs. Leveson typically argues for replacing the old methodology completely with STPA. A couple of presentations I’ve seen about it in an industrial setting usually suggest using it for a high-level analysis in combination with e.g. FMEAs for the low-level stuff. The hope there is that you catch the more complicated causal scenarios (possibly involving humans with “fuzzy” behaviour) but remain with the traditional methods for circuit-board-level stuff.

              You might be interested in a recent PhD thesis from Uni Stuttgart, where they incorporated STPA into a systems engineering approach for software-intensive systems in an automotive setting. These slides provide a nice, concise summary of that work. The approach uses STPA to identify system specifications, which are then translated into something more formal for model checking / software testing.

              I’ve had a similar feeling when reading Leveson’s stuff. I think, however, that the analysis method they’ve built is powerful, also outside the domain of safety engineering, maybe especially when incorporated in a similar way as suggested in the thesis above, where it’s used to formulate specifications early on, as writing specs might often be the fuzzier part of the systems engineering process?

            1. 5

              This was an interesting read. Thanks!

              One thing which would’ve been a cool addition is a control graph of the system being analysed. There’s a lot of literature with examples of STPA/CAST analyses for safety, but very few for security, and all of them are related to analysing a system around some physical process. It would’ve been interesting to see your take on it here. There’s a bit about extending STPA for security e.g. here and here.

              Another nice and a bit more practical resource for learning about STPA (The hazard analysis method based on STAMP) is the STPA Handbook (pdf) Leveson and Thomas published last year. Leveson’s 2012 book, at least for me, was at points a bit long winded (nonetheless a very good book), but I’ve found the handbook a better reference for analyses. It also covers a lot of typical common mistakes etc.

              If I may nitpick a bit, STAMP is an accident model, STPA and CAST are analysis methods based on it :) Leveson extended on ideas from Jens Rasmussen, as she talks about e.g. in this publication. If you’re interested, check e.g. this influential paper from Rasmussen.

              PS. The link to Leveson’s homepage at the end of the article is not working.

              1. 1

                Wow, that is a really good handbook. You should submit it as a lobsters link!

                Also, link fixed =)

                (I know I should include a control graph but I’m kinda procrastinating on making one :/)

              1. 2

                Interesting paper, thanks for posting.

                I’ll try to take a more detailed look later, but it would be interesting to see how their approach scales to something actually useful. The system they are using as an example is rather simple, and they mention having used a 150 s time limit for the solver. I guess this is only for the marginal/unstable cases with gain >= 1, for which the solver apparently timed out? I wonder how much time the solver needed for the proven cases.

                While searching, I found a follow-up paper from 2015 with an interesting comparison of different model checkers for similar stuff. Hopefully you weren’t planning to post this next. :)

                1. 2

                  I didn’t have it! So, thanks. I actually wasn’t looking for Simulink/Stateflow stuff this year. I just stumble on it periodically because of how I word searches. I might spend more time looking for them next year due to importance in embedded. Are there any open tools that are comparable to those and high quality enough for professionals?

                  EDIT: Also interesting: it’s two women. I tried to find out what their other work was to see what’s industrially applicable. Both had pages at Bristol, here and here, with Araiza-Illan now at ARTC. It’s heavy on industrial application. Araiza-Illan also posted four hours of verification proceedings to YouTube. Interestingly, Eder worked at XMOS on power-aware design. As Bristol is Cadence’s lead university per that page, she’s probably also feeding ideas into their verification groups. If not, she at least started with real experience per this interview, which I’m still watching right now.

                  So, you found not only a paper but two women doing interesting work with strong ties to industry and/or real-world experience. Extra interesting. Also, now on my list to potentially hire if I get into verification. :)

                  1. 2

                    Re: Open alternatives - this is a topic I’ve wanted to look into lately. Probably a large portion of the normal data analysis people might do in plain old Matlab would be easily replaceable with Python and the common analysis libraries for it.

                    As for Simulink/Stateflow: I’ve been meaning to play around with OpenModelica, as I would really like to have an open alternative available if I don’t have access to Simulink in the future. The tool has been in development for a long time (still very active, with EU-funded research projects developing it further), has a consortium backing it with major players involved, and has been used as a basis for industrial tools. It’s the only thing I would seriously consider as a realistic open source alternative. There’s also JModelica, another Modelica-language-based open source tool from a Swedish company (apparently they base their commercial tools on the open core developed with a couple of universities).

                    Simulink was always a nice tool for building models of dynamic systems, continuous or discrete, as the graphical notation makes sense for the sort of stuff in control engineering or signal processing where you would draw block diagrams or similar charts anyway to clarify things. However, at least in my experience, building any sort of non-trivial software-like logic with the graphical blocks always resulted in a horrible mess. The Stateflow extension with hierarchical state machines has helped a lot with that aspect (before moving to software-in-the-loop with generated code or custom code through the C interface). I think OpenModelica should have an equally good system/solver for the typical control/signal processing stuff (I read somewhere that the solver should be even better/faster than the Simulink solvers due to a different internal approach); I’m not sure how well supported the FSM stuff is. At least there should also be a C interface, so it should cover most of my use cases. For use with hardware-in-the-loop simulators, both systems support the FMI/FMU standards, so ideally that should also be covered, but I’ve only worked with dSpace, which is rather heavily integrated with Matlab/Simulink.

                    I don’t know how the usability of the graphical editor with OpenModelica is. You could always just write Modelica directly, but somehow I was always put off by the syntax of the language. :)

                    1. 2

                      I submitted OpenModelica here before. I don’t know much about it past the description, which sounded really cool and general-purpose. Yeah, expanding it to have parity with or better methods than Simulink/Stateflow makes sense. I didn’t know about FMI/FMU. Looking at this article, it says FMI is a Modelica project. It has a nice pros/cons section comparing it to Simulink, where improvements might be made.

                      I’m not sure about FMUs. It looks like they’re modules or executables produced as a result of doing FMI. Like class or jar files in Java, maybe. Or am I pretty far off?

                      re syntax. Add it to list of Better Language extracts to and integrates with Supported Language. :)

                      1. 2

                        I also have pretty much no experience actually using the FMI standard, as I’ve been stuck with Simulink (and the S-functions also mentioned in that Wikipedia article). My understanding is that the FMI standard specifies a way to describe the interfaces (IO points) and the internal logic, and FMUs are then models/modules implemented according to this standard. Modelica is going for an object-oriented approach, so thinking of FMUs as classes is probably a good analogy.

                        Typically, at least in the S-function world, and I think it’s the same here, one defines the time derivative of each continuous state within the model; these are fed back to the solver/simulator executing the model, which handles the numerical integration and feeds the integrals back to the FMU (see p. 69 here). For discrete stuff in the S-function world one defines a step function, which is executed by the simulator every x seconds to solve the next outputs; the FMI standard seems to have an event-based approach, but “time-based events” probably boil down to the same thing.
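                        That derivative/solver split can be sketched in a few lines of plain Python (the function names and the toy first-order lag model are invented for illustration; this is not the actual S-function or FMI API):

```python
def model_derivatives(t, state, u):
    """The "model" side: report dx/dt for each continuous state.
    Toy first-order lag: dx/dt = (u - x) / tau."""
    tau = 2.0
    x = state[0]
    return [(u - x) / tau]

def euler_solve(model, state, u, t_end, dt):
    """The "solver" side: numerically integrate the derivatives the
    model reports, feeding the integrated state back into the model."""
    t = 0.0
    while t < t_end:
        derivs = model(t, state, u)
        state = [x + dx * dt for x, dx in zip(state, derivs)]
        t += dt
    return state

# Step response: the state converges toward the constant input u = 1.0.
final = euler_solve(model_derivatives, [0.0], 1.0, t_end=20.0, dt=0.01)
print(final[0])  # close to 1.0
```

                        The key point is that the model never integrates anything itself; it only reports derivatives, and the simulator owns the clock and the integration scheme.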

                        I tried to dig up some meaningful examples outside the actual spec, but they were a bit hard to find.

                        This paper has an example of an FMU-description in Modelica: http://www.ep.liu.se/ecp/056/003/ecp1105603.pdf (I assume then this could be used to generate the XML/C-interface described in the FMI-standard?)

                        There are some tutorial slides here.

                        1. 1

                          It looks pretty complicated, haha. I’ll probably have to try OpenModelica or Simulink some time in the future on small, realistic examples to learn how they work, especially the thinking process behind them, which is probably different from what I’m used to. Thanks for the examples, though. The transmission example in particular shows me how they use these tools.

                1. 2

                  It’s now happened a couple of times that I’m surprised by texts claiming that developers haven’t heard of state machines. At least for me the concept was introduced in an embedded systems course at uni, and it has been quite a common thing during my working life in that area (I’ve worked mainly with different embedded/PLC-based systems for machine controls), both in the control logic as well as in the fieldbus communication (e.g. the CANopen state machine comes to mind). It’s often been a rather natural representation when working with anything physical which has different modes and moving parts etc., just like the demo here.

                  Currently I’m working sometimes with the Stateflow extension of Matlab Simulink, which offers a nice utility for hierarchical state machines, though behind a rather expensive paywall. I had not heard of this language before, and it looks like they have code generators for different languages, nice. I’ll try to give it a go at some point. Thanks for posting!
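                  As a rough sketch of the hierarchical idea (all state and event names below are invented; this is not Stateflow syntax), a superstate with sub-states lets one transition, such as a fault, exit the whole group at once:

```python
class Machine:
    """Toy hierarchical state machine: state is a (mode, sub-state) pair."""

    def __init__(self):
        self.state = ("stopped", None)

    def handle(self, event):
        mode, sub = self.state
        if mode == "stopped" and event == "start":
            # Entering the "running" superstate enters its initial sub-state.
            self.state = ("running", "homing")
        elif mode == "running":
            if event == "fault":
                # A fault exits the whole "running" superstate,
                # whichever sub-state we happened to be in.
                self.state = ("error", None)
            elif sub == "homing" and event == "homed":
                self.state = ("running", "producing")
        elif mode == "error" and event == "reset":
            self.state = ("stopped", None)
        return self.state

m = Machine()
m.handle("start")   # enters running/homing
m.handle("homed")   # running/producing
m.handle("fault")   # fault exits the whole superstate
print(m.state)
```

                  A flat state machine would need a separate fault transition out of every sub-state, which is exactly the duplication hierarchical charts avoid.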

                  Now if there only would be nice integration with something like OpenModelica to enable easily sketching out control logic against a process model..

                  1. 2

                    This is an interesting piece of work. I had the chance to see a presentation from AbsInt at a conference earlier this year (for industry in safety-critical application areas, more industry people than academics). They presented mainly the same results as discussed in this paper. The company is doing the commercialisation of the tool.

                    For the discussion about which parts are formally proven or part of the “Trusted Computing Base” and which not, I think slide 34 from this presentation summarises it quite nicely.

                    If I understood it right, the main argument here is that for industrial usage, especially in x-critical applications, CompCert can provide one trusted part in the whole toolchain to produce an executable which is correct with high confidence. The key benefits are a reduced need for testing/validation (esp. of the compiler itself) as well as the possibility to use compiler optimisations, which are usually not allowed, or only at low levels, as part of compiler certification/validation for safety-critical use. Of course the toolchain doesn’t remove the need for testing the end result, but at least they can trust the compiler itself.

                    They use several additional tools together with CompCert to achieve the high confidence in the end. As they discuss in the paper, CompCert doesn’t do preprocessing itself. They use a “proven-in-use” gcc version for that, which they additionally test themselves for the subset allowed by their coding rules (derived from MISRA C). The coding rules are then enforced by another tool, so they can have high confidence that the preprocessing is correct. They use yet another tool to check for runtime errors and undefined behaviour. Lastly, they again use “proven-in-use” gcc for assembling and linking, but the translation is checked once more with another tool.

                    1. 1

                      CompCert uses a “proven-in-use” tool (CIL) to map the C source output from the “proven-in-use” gcc to their lower-level language Clight; then they generate assembler, followed by “proven-in-use” gcc to assemble and link the code.

                      What is wrong with the gcc’s “proven-in-use” code generator?

                      1. 2

                        The paper is not just about CompCert; it’s about how the main authors used the compiler in a specific safety-critical application as part of the toolchain, what benefits it had in that case, and how they managed the non-proven parts of the toolchain, where CompCert is just a single building block.

                        The problem in safety-critical applications, where the produced system is certified by a third party (e.g. TÜV or Exida) for some specific use according to some standard (e.g. IEC 61508), is that the tools used need to be validated somehow. The problem with gcc is that it was not developed with such a use case in mind, so the user of the compiler needs to somehow validate that the compiler is doing what they expect it to do. The other option is to purchase an expensive compiler which is developed and certified for safety-related use. Some manufacturers also offer certificates for using their libraries with a specific piece of their hardware, a specific compiler, and a specific language subset, all of which they have pre-validated for the user.

                        In this case, by using CompCert as part of the toolchain, they had to do less validation for the compiler and were able to use higher compiler optimisation settings, as they had the proofs to show that, with reasonably high confidence, it can be trusted. Since the other parts of the toolchain were not formally verified, they used different measures to qualify them for the use case, which was equivalent to SIL 3 acc. to IEC 61508 if I’m not mistaken. As a reference point, that criticality level usually means that if the system fails, it’s possible that several people are injured or die. Normally for such a use case one has to put a lot of verification effort into the tools used, to prove to the third party that the tools are suitable for the job and with a high level of confidence don’t introduce errors into the end product.

                        Again, at least that’s my understanding of the matter.

                        Edit: couple of typos.

                        Edit 2: Forgot to answer your first remark. In my understanding, the gcc at the beginning is used only for the preprocessing (.c files -> .i files), so the first entry point to CompCert is still plain old C. If I understand right, the first step CompCert itself does is to translate the C to Clight, but I’m not really an expert on the internal workings; I’m more of a potential user of the tool.

                        1. 2

                          I understand about compiler validation. As far as I know CompCert has not passed a compiler validation suite.

                          The C semantics pass of CompCert is based on CIL, which last I checked fails the C validation suites.

                          What does this claim mean: “The CompCert front-end and back-end compilation passes are all formally proved to be free of miscompilation errors”? What is a miscompilation error? Does it mean the source of CompCert does not contain any compilation errors? Since the semantic analysis of CompCert is based on a tool that has not been proved correct in any way, I don’t see how this claim relates to C source.

                          Typo in my earlier response Clight -> Cminor

                          1. 2

                            Interesting text about compiler validation. I wonder if and how the validation you write about differs from the tool qualification done today for e.g. IEC 61508.

                            I have to admit I didn’t completely understand your previous comment, as I wasn’t aware of CIL. As mentioned, I’m not really an expert, but I tried to do some digging.

                            There’s a paper from 2012 where it’s mentioned that CompCert used an adapted CIL library for parsing. In this paper they don’t mention CIL and only discuss using gcc as a preprocessor. There seems to be some documentation about the current compiler structure in this paper from 2017 (check esp. part 3), and in more detail about the steps in the current docs. The 2017 paper also discusses the semantic preservation. At least I don’t find any references to CIL there. Maybe there’s an answer to your question there; I don’t feel qualified to answer it. :)

                      2. 1

                        The slide has a nice breakdown of steps but is a little misleading. The formal methodists sometimes count anything proven as not being in the TCB, or as untrusted. They had me in that habit a while back before I realized how much they screwed up. The truth is all of that is in the TCB; they’ve just applied formal proof to the stuff they’re not counting. It can still be wrong due to spec errors, like the 3 or so found in Csmith testing. There’s also stuff it doesn’t model, esp. during runtime, that can cause failures. Bit flips in RAM or bit rot on HDs are examples. The best they can say is that certain parts inspire higher confidence that they work. Whereas this is an example of qualifying verification claims the right way.

                        “CompCert can provide one trusted part in the whole toolchain to produce an executable which is correct with high confidence. “

                        Good descriptions of its implementation aspects. Yeah, it’s supposed to be one link in the chain. There was another slideshow I saw showing everything that might be used in a DO-178B evaluation. They had something for requirements, a modeling tool for high-level design, something like Astrée Analyzer for checking code, a test generator, and then maybe CompCert kicks in at the last part. One tool in a whole, assurance-producing chain.

                        1. 1

                          Yes, I realised it was actually a bit backwards in that slide after I posted the comment. Here is a version of the same thing they used in the presentation at the conference where I participated. If you’re interested in the slide set, I can share it over PM somehow; it’s in principle under copyright, so I don’t really want to post it somewhere openly.

                          They also used Astrée in this paper; the whole toolchain is summarised in Figure 6. In the presentation I saw, they also briefly discussed the modeling tools etc. Funnily enough, what’s discussed in this paper is mostly the part of the toolchain where AbsInt can sell you solutions for the problem. :)

                          Edit: Re: HW Failures during runtime

                          Yes, they didn’t mention that much in the paper. I would assume that in this case they also use redundancy, ECC memory, and all manner of self-tests implemented in software to combat and self-diagnose all the failures that could happen in the electronics during operation.

                          1. 2

                            Here is a version of the same thing they used in the presentation at the conference where I participated.

                            That’s much better. It just says what parts they did a proof on without any false claims. Yeah, send it to me in case I need it for someone.

                            “What’s discussed in this paper is funnily mostly the toolchain, where AbsInt can sell you solutions for the problem. :)”

                            You’d think they’re motivated primarily by money instead of public good. This is the thing I hate most about CompCert: it was assigned to a private company as proprietary software. INRIA’s brightest built one of the most amazing tools in the field of verification, only to have it partly locked up. Then all the academics contributing to CompCert, which is free for them and maybe GPL stuff, are contributing to a proprietary product instead of an open one. Had they given it to C0 or something totally free, then all the improvements in that slide would be totally free, with companies optionally selling them. CompCert should’ve been Apache, LGPL, or something. It’s too important.

                            After one company did one in weeks, I got it in my head to eventually attempt an open project using ASMs as an intermediate form, since they have a ton of potential. They’ve already been used as IRs in both hardware and software, but not in free or well-developed tools. If using them, one can target all the analyses and transforms at them, with code generated afterwards or hand-optimized with an equivalence check. As prior work shows, the equivalence check can happen at the assembly level.

                      1. 0

                        Last time I checked, CompCert did not do any formal analysis comparing its behavior to the semantics of the C language. The formal verification stuff is all back end, i.e., code generation.

                        More CompCert soap powder advertising.

                        Using the CompCert proof ‘methodology’, I could translate Visual Basic to the CompCert low level code (from which machine code is generated) and claim to have a formally verified Visual Basic compiler (i.e., no connection to the formal semantics of any language is required).

                        1. 2

                          I thought I told you, or you told me, that CompCert uses its own dialect of C that they formalized. It’s a subset with specific interpretations of anything with wiggle room. Most of their work centers on Clight as a starting point, and most efforts targeting CompCert target it. One could check the MISRA-C-to-Clight translation by eye and with equivalence tests if concerned.

                          I do think they should be more clear on exactly what they support at various times.

                          1. 1

                            Incorrect: a subset is not a dialect. CompCert implements a very large subset…

                            1. 1

                              It’s possible my wording is wrong. I’m not sure how different something has to be from the original before it qualifies as its own version or dialect. Usually, once a non-standard change is introduced, people refer to it as something different, sometimes using “dialect.”

                              CompCert C has never been C itself. It was always its own easier-to-verify thing. For example, its memory model has evolved over time. These verifiable subsets of C not only have a subset of functionality: they expect it to be used in specific ways/semantics that facilitate their analyses and algorithms.

                              Don’t these subsets that have specific, unusual interpretations and usages become new dialects? Or is my own interpretation of what a dialect is off?

                              Note: that was the first link I could find about this on DDG. The original was on Lobsters, but its search isn’t giving it up.

                              1. 2

                                The CompCert memory model is shared with C, from what you linked. Also, from the CompCert front page:

                                The main result of the project is the CompCert C verified compiler, a high-assurance compiler for almost all of the C language (ISO C99), generating efficient code for the PowerPC, ARM, RISC-V and x86 processors.

                                A dialect is different; a subset is a subset. Also see the words “almost all”.

                                1. 2

                                  There is a bit more information available here.

                                  CompCert C supports all of ISO C 99, with the following exceptions:

                                  - switch statements must be structured as in MISRA-C; unstructured switch, as in Duff’s device, is not supported.

                                  - longjmp and setjmp are not guaranteed to work.

                                  - Variable-length array types are not supported.

                                  There is a specific mention that they cover all of MISRA-C 2004, which is already quite good, considering that this tool might be especially interesting for industrial application in safety-critical contexts, like in the linked paper.
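                                  For concreteness, here’s the classic Duff’s device pattern that the structured-switch restriction rules out: a switch that jumps into the middle of a loop body. It’s valid C99 (ordinary compilers accept it, and count must be positive), just not structured the way MISRA-C and CompCert require. This is my own sketch, not taken from the CompCert docs:

                                  ```c
                                  #include <assert.h>

                                  /* Duff's device: copy count bytes (count > 0), unrolled by 8,
                                     with the switch jumping into the middle of the do-while body.
                                     Ordinary C compilers accept this; a structured-switch rule
                                     like CompCert's rejects it. */
                                  void duff_copy(char *to, const char *from, int count) {
                                      int n = (count + 7) / 8;
                                      switch (count % 8) {
                                      case 0: do { *to++ = *from++;
                                      case 7:      *to++ = *from++;
                                      case 6:      *to++ = *from++;
                                      case 5:      *to++ = *from++;
                                      case 4:      *to++ = *from++;
                                      case 3:      *to++ = *from++;
                                      case 2:      *to++ = *from++;
                                      case 1:      *to++ = *from++;
                                              } while (--n > 0);
                                      }
                                  }
                                  ```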

                                  1. 1

                                    I could be recalling incorrectly what made it a dialect or something about C which I’m far from an expert on. I’ll say subset for now unless I can cite a difference deserving the word dialect.

                              2. 1

                                Clight looks like it might be a very small subset of C (I have not checked). It might also qualify as a very small subset of Java and perhaps other languages.

                                Since Clight is defined by its creators, whatever it does is, by default, correct. What exactly does a proof of the correctness of Clight mean, then?

                                1. 2

                                  The proofs in CompCert typically mean:

                                  1. The specs possess a specific property.

                                  2. The implementation does what the spec says on all inputs.

                                  In CompCert’s case, they mainly want to specify the meaning of the languages and the operations (esp. transformations) they perform on them, and prove that programs have equal meaning after those operations. The Csmith testing empirically confirmed high quality, at least for those types of tests: it found a few spec errors, while non-verified compilers had piles of failures.

                                  Personally, I’d like to see a testing suite covering every feature, common combinations (esp. w/ nesting and aliasing), and random combos, then a comparison of how each compiler handles programs generated from those templates.
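                                  As a toy illustration of point 2 (my own sketch, nothing from CompCert): a tiny expression language with a defined meaning (eval), a compiler to a stack machine, and an executable check that meaning is preserved. CompCert proves this kind of semantic-preservation property once and for all in Coq; here it can only be spot-checked per input.

                                  ```c
                                  #include <assert.h>

                                  /* Source language and its semantics. */
                                  typedef enum { CONST, PLUS } Kind;
                                  typedef struct Expr { Kind kind; int value; struct Expr *l, *r; } Expr;

                                  int eval(const Expr *e) {
                                      return e->kind == CONST ? e->value : eval(e->l) + eval(e->r);
                                  }

                                  /* Target: a little stack machine. */
                                  typedef enum { PUSH, ADD } Op;
                                  typedef struct { Op op; int arg; } Instr;

                                  /* Compile in postorder: operands first, then ADD. Returns new length. */
                                  int compile(const Expr *e, Instr *out, int n) {
                                      if (e->kind == CONST) { out[n] = (Instr){ PUSH, e->value }; return n + 1; }
                                      n = compile(e->l, out, n);
                                      n = compile(e->r, out, n);
                                      out[n] = (Instr){ ADD, 0 };
                                      return n + 1;
                                  }

                                  int run(const Instr *code, int n) {
                                      int stack[64], sp = 0;
                                      for (int i = 0; i < n; i++) {
                                          if (code[i].op == PUSH) stack[sp++] = code[i].arg;
                                          else { sp--; stack[sp - 1] += stack[sp]; }
                                      }
                                      return stack[sp - 1];
                                  }

                                  /* Semantic preservation, checked empirically for one expression. */
                                  int agrees(const Expr *e) {
                                      Instr code[64];
                                      return run(code, compile(e, code, 0)) == eval(e);
                                  }
                                  ```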

                              3. 1

                                You can formally verify C code with things like Frama-C or VeriFast… But once you have verified code, you need a verified compiler to convert it to assembly; that is where CompCert comes in. If you compile verified code with some buggy compiler, it might do you little good.

                                1. 1

                                  Frama-C and VeriFast are essentially static checkers of source code. They have no means of formally verifying C source against the requirements contained in the C Standard.

                                  Where is the proof showing that CompCert implements the semantics specified in the C Standard? There could be a fault in CompCert’s handling of C semantics (just like in other compilers) which causes it to generate incorrect code.

                                  1. 1

                                    The Frama-C WP plugin absolutely allows you to write formal specifications of your code, and it checks that the C code matches that specification in relation to the C standard. https://frama-c.com/wp.html

                                    The whole point of CompCert is that it is a formally verified translator of standard C to assembly… Obviously there could be bugs in the CompCert proofs, but they are probably less likely than the many bugs in other compilers.

                                    Where is the proof showing that CompCert implements the semantics specified in the C Standard?

                                    The proofs are in the git repository…

                                    1. 1

                                      I don’t see anything in the Frama-C link that makes any claim about a “relation to the C Standard”. In fact, the word ‘standard’ does not appear in the link.

                                      In Frama-C, you write your own specifications and the tool checks the source code against what you have written. No connection to the C Standard is provided or required.

                                      Please provide a link to the files involving proofs against the C Standard, I cannot find them.

                                      1. 1

                                        What language do you think you are checking against? ACSL (what WP checks the C code against) stands for ANSI/ISO C Specification Language.

                                        What language do you think it checks against if not standard C? Here is the wiki page - https://en.wikipedia.org/wiki/ANSI/ISO_C_Specification_Language

                                        Are you claiming they invented their own language that is not C to check the spec against?

                                        1. 1

                                          The “ANSI/ISO C” just comes from the fact that it targets ANSI C. ACSL itself is just a language for formally specifying behavioral properties of code written in ANSI C. The book is here, with tons of examples.
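                                          To make that concrete, here’s roughly what an ACSL contract looks like (my own small example; whether WP discharges it automatically can depend on prover settings). The annotations live in special comments, so a plain C compiler ignores them, while Frama-C/WP tries to prove the code meets them:

                                          ```c
                                          #include <assert.h>

                                          /*@ requires n > 0;
                                              requires \valid_read(a + (0 .. n-1));
                                              ensures \forall integer i; 0 <= i < n ==> \result >= a[i];
                                              ensures \exists integer i; 0 <= i < n && \result == a[i];
                                          */
                                          int max_array(const int *a, int n) {
                                              int best = a[0];
                                              /*@ loop invariant 1 <= i <= n;
                                                  loop invariant \forall integer j; 0 <= j < i ==> best >= a[j];
                                                  loop invariant \exists integer j; 0 <= j < i && best == a[j];
                                                  loop assigns i, best;
                                                  loop variant n - i;
                                              */
                                              for (int i = 1; i < n; i++)
                                                  if (a[i] > best) best = a[i];
                                              return best;
                                          }
                                          ```

                                          As ordinary C it compiles and runs as-is; the point is that WP checks the contract statically rather than at runtime.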

                                          The only one I know of that can check C code against the Standard’s semantics is KCC, which RV-Match is built on. That’s because it’s pretty much the only complete, open specification of C that can catch undefined behavior and be used as a push-button tool (if their docs are honest). It also passed the parts of the GCC Torture Test that their semantics support, and RV-Match is getting good results on benchmarks. So, you could say it’s the only formal semantics with significant empirical evidence backing the claims in its formal specs/proofs.

                              1. 1

                                This was hard to tag since it is about safety, uses pseudocode, and the formal method was done semi-formally w/ regular conditionals. That, plus being a quick slide deck, means it’s quite approachable. Most interesting was the use of fault trees. You could also say this uses the pattern from proof assistants and secure compilers of Untrusted Generation of Data/Action followed by a Trusted Checker. In such designs, you put as much of the complexity as possible into the Untrusted part, keeping the Trusted part as simple and rigorous as possible. The idea is that the Trusted part will never fail and will limit the damage the Untrusted part can do.
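                                A minimal sketch of that pattern (all names and limits here are invented for illustration, not taken from the slide deck): the untrusted part can be arbitrarily complex, while the trusted part is just a few range checks plus a fail-safe default.

                                ```c
                                #include <assert.h>
                                #include <stdbool.h>

                                typedef struct { double speed; double torque; } Command;

                                /* Illustrative safety envelope -- invented values. */
                                #define MAX_SAFE_SPEED  100.0
                                #define MAX_SAFE_TORQUE  50.0

                                /* Untrusted: could be a complex planner or ML model; may overshoot. */
                                Command untrusted_plan(double target) {
                                    Command c = { target * 1.5, target };
                                    return c;
                                }

                                /* Trusted: as simple and rigorous as possible -- only range checks. */
                                bool trusted_check(Command c) {
                                    return c.speed  >= 0.0 && c.speed  <= MAX_SAFE_SPEED
                                        && c.torque >= 0.0 && c.torque <= MAX_SAFE_TORQUE;
                                }

                                /* Only vetted commands reach the actuator; otherwise fail safe (stop). */
                                Command issue(double target) {
                                    Command c = untrusted_plan(target);
                                    if (!trusted_check(c)) {
                                        Command stop = { 0.0, 0.0 };
                                        return stop;
                                    }
                                    return c;
                                }
                                ```

                                The asymmetry is the point: bugs in untrusted_plan can at worst trigger the fail-safe, because nothing reaches the actuator without passing trusted_check.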

                                1. 3

                                  The system structure, with a safety-related, separated controller supervising simple safety conditions while another controller does the actual process/machine control, is rather common in industrial control systems. The functional safety standards (IEC 61508 and its industry-specific derivatives) basically dictate the separation of safety-related functionality from “standard” functionality. If a system were to combine standard and safety functionality on the same controller, the whole thing would need to be verified as if it were safety-critical.

                                  The second author, Phil Koopman, has done a lot of cool work on dependable systems, for example some early work on selecting good polynomials for CRCs in communication systems.
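                                  For flavor, here’s a bit-at-a-time CRC-32C using the Castagnoli polynomial (0x1EDC6F41, reflected form 0x82F63B38), one of the polynomials whose error-detection properties that line of work evaluates. This is a generic textbook implementation, not Koopman’s code:

                                  ```c
                                  #include <assert.h>
                                  #include <stdint.h>
                                  #include <stddef.h>

                                  /* Reflected bitwise CRC-32C: init and final XOR are 0xFFFFFFFF,
                                     polynomial 0x1EDC6F41 in reversed (reflected) form 0x82F63B38. */
                                  uint32_t crc32c(const uint8_t *data, size_t len) {
                                      uint32_t crc = 0xFFFFFFFFu;
                                      for (size_t i = 0; i < len; i++) {
                                          crc ^= data[i];
                                          for (int k = 0; k < 8; k++)
                                              crc = (crc & 1) ? (crc >> 1) ^ 0x82F63B38u : crc >> 1;
                                      }
                                      return crc ^ 0xFFFFFFFFu;
                                  }
                                  ```

                                  The standard check value for this parameterization is crc32c("123456789") == 0xE3069283.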

                                  Nice presentation, thanks for posting!

                                  1. 1

                                    I appreciate the tip. The only problem: Phil Koopman was on my list of stuff to submit this week. You done beat me to posting him! Although I’ll probably still do it, to experiment with submitting his whole publication list. We normally do it one paper or project at a time, but the work he does is pretty similar and focused, so it might be justifiable. I’ll still do it later in the week.

                                    While I’m at it, @pushcx, what do you think about this as a submission? Most of it is stuff you wouldn’t want to submit individually. A few things are submission worthy. My concept was to submit it tagged appropriately, maybe add “person”, and summarize his research in the text field.

                                1. 5

                                  Hi both, first time posting, long-time lurker; I should thank both Hillel and Nick for all the great content over the last couple of years. :)

                                  I think the main contribution of the book to the area of safety engineering is the STAMP accident model and the derived analysis methods, STPA and CAST, of which STPA is the interesting one for doing risk analyses/hazard identification for new and existing systems. The book is a bit long-winded; a more condensed description of STPA can be found in the STPA Handbook, which they published just this year.

                                  A short lecture by Leveson from last year’s STAMP-EU Workshop can be found here, where she introduces the contents of the book.

                                  STPA was originally developed for safety analysis, but there’s been some interesting work in recent years to extend it to other aspects. Security analysis: STPA-Sec.

                                  STPA for Privacy

                                  It has been a focus of the safety community in recent years to include security aspects in risk analyses; this was also done for STPA: STPA-SafeSec

                                  Other interesting stuff lately has been combining STPA with formal methods, which might be interesting for you guys:

                                  Asim Abdulkhaleq combined STPA with SPIN+NuSMV in his PhD thesis on autonomous driving; the tool they developed is available on GitHub: XSTAMPP.

                                  Sam Procter created his own approach based on STPA called SAFE for medical applications, combined with modeling in the AADL language in his PhD Thesis.

                                  Edit: Added one more link