I’m a primary contributor to c2rust and I may be the person “stolen” away from corrode. I’d like to apologize if it feels like we ripped off ideas without giving due credit - the project wasn’t really supposed to “discovered” so soon. The website was a throwaway idea, a means of easily sharing our work in a limited circle while avoiding both the DARPA approval and the sub-par build process (you have to build a huge chunk of Clang).
So here is me acknowledging Jamey’s work: I personally did take inspiration from Corrode, and I was expecting to work on Corrode proper when I joined Galois (I even wrote a pure Haskell library to parse/pretty-print Rust code - with Corrode in mind). I’ve re-read the CFG module of Corrode several times (as well as the Mozilla paper, and some older literature).
All that said, I also want to point out that Corrode hasn’t had any activity at all since last April - and that’s not for want of PRs piling up. I’m not criticizing here, since I understand that managing an open source can be quite time-consuming and stressful, but I feel like this also does need to be mentioned. Also, c2rust can be freely forked. Once the DARPA funding runs out, it is my hope that the Rust community will become the maintainers.
Finally, regarding the many improvements that can be automated: that is next up on our plate!
Hi, nice to see you here!
Regarding idiomatic improvements, in case you missed, C2Eif is the most impressive research in this area in my opinion. It can create classes and methods based on function signature analysis. It can also mark methods as private based on call graph analysis.
It is also impressive in coverage. It handles Vim, it handles inline assembly, it translates long jumps to exceptions, it handles variadic functions (Eiffel doesn’t have them like Rust, they are translated to library calls), it translates printf format strings to Eiffel formatting, etc.
Marco Tudel, the author of C2Eif, has a company called mtSystems that sells a C to Java translator.
Very cool! This is the sort of tool I wish was open source, but probably won’t be because nobody has any financial incentive to do so.
Interesting example given in haskell about type system complexity:
length (1, 2) --> 1 wut?
length (1, 2, 3) --> *incomprehensible error*
FWIW, this is all caused by a Foldable ((,) a) instance that is already quite controversial in the Haskell community1. It isn’t the only controversial Foldable instance either - did you know there is a Foldable (Either a)2?
The main friction is that removing instances that were previously there may cause code that currently compiles to stop compiling. One suggestion I personally like is to have a compiler warning for the pathological cases3.
<interactive>:6:1: error:
• No instance for (Foldable ((,,) t0 t1))
arising from a use of ‘length’
— what’s incomprehensible about this?
Hmm well this is what I got, which is pretty incomprehensible to someone starting out with haskell I think.
<interactive>:4:1: error:
• No instance for (Foldable ((,,) t0 t1))
arising from a use of ‘length’
• In the expression: length (1, 2, 3)
In an equation for ‘it’: it = length (1, 2, 3)
<interactive>:4:9: error:
• Ambiguous type variable ‘t0’ arising from the literal ‘1’
prevents the constraint ‘(Num t0)’ from being solved.
Probable fix: use a type annotation to specify what ‘t0’ should be.
These potential instances exist:
instance Num Integer -- Defined in ‘GHC.Num’
instance Num Double -- Defined in ‘GHC.Float’
instance Num Float -- Defined in ‘GHC.Float’
...plus two others
...plus three instances involving out-of-scope types
(use -fprint-potential-instances to see them all)
• In the expression: 1
In the first argument of ‘length’, namely ‘(1, 2, 3)’
In the expression: length (1, 2, 3)
<interactive>:4:11: error:
• Ambiguous type variable ‘t1’ arising from the literal ‘2’
prevents the constraint ‘(Num t1)’ from being solved.
Probable fix: use a type annotation to specify what ‘t1’ should be.
These potential instances exist:
instance Num Integer -- Defined in ‘GHC.Num’
instance Num Double -- Defined in ‘GHC.Float’
instance Num Float -- Defined in ‘GHC.Float’
...plus two others
...plus three instances involving out-of-scope types
(use -fprint-potential-instances to see them all)
• In the expression: 2
In the first argument of ‘length’, namely ‘(1, 2, 3)’
In the expression: length (1, 2, 3)
Yeah, sadly GHC error messages are pointlessly hard to read.
The first should just say “There is no instance of Foldable for (a,b,c)”.
The other two are very standard messages you’ll see all the time. You don’t even need to read them. Actually, GHC should be taught to simply not produce them in cases like this. They’re a consequence of the previous error. GHC should be printing something like “I don’t know what type to assign to literal ‘1’ because there are no constraints on it. If there are other type errors fixing them may add additional constraints. If not, annotate the literal with a type like (1::Int)”.
Basically, 1, doesn’t mean much. It could be an integer, a double, a km/s, a price, the unit vector, etc. As long as the type has a Num instance available 1 can be converted to it. Since the type controls the behavior of that object you need to know what it is before you can run the code.
I agree that the amount of information that GHC outputs is overwhelming (the GHC typechecker looks to me like a complicated solver environment that might be better served by its own interactive mode and type-level debugger, Coq-style). On the other hand, the source of the error is clearly written in two lines at the start of the message, that’s why it’s hardly “incomprehensible”.
To me this looks like the unreadable messages from g++. You just have to learn to read through it!
For someone new to Haskell, the notion that – were you to invest the time to learn, this would be a readable message – is hard to fathom. I think that sort of imagination barrier is why generally things with steep learning curves are less popular.
There’s nothing Haskell-specific about bad error messages. And it’s nothing to do with imagination or the learning curve of Haskell. It’s just the obtuse way that GHC error messages are written and the lack of interest to make them better.
If this message said “(,,) is not an instance of Foldable” no one would find it difficult to comprehend.
That being said. The tuple instance of Foldable is really horrible and confusing. Either length shouldn’t be in foldable and we should use some other concept (slotness or something) or that instance shouldn’t be there by default. But this has nothing to do with Haskell or type systems. It’s just as if Java had created a terrible misnamed class that gave you the wrong answer to an obvious query.
Ok, someone replied to me and deleted his reply yesterday about this. I’m guessing he wants privacy where only I saw it through email system. I looked at his publications and affiliations to find he’s smart as hell, worked with DARPA (among others), and maybe even did a project with Galois. So, the source is probably reliable on what can or can’t work with DARPA. Here’s a paraphrasing of his claims:
Program managers can set up programs so that pre-pub review isn’t required for the whole program or for specific projects.
He did that himself with the entire project not subject to pre-pub review. He had the whole thing in open repositories that people did the work in.
Another project had subject areas defined as “designated, unclassified, subject areas.” Anything in them didn’t have to be reviewed. This is where you’d put the FOSS part.
He believes the prepub requirement by default is program managers just being lazy. Defaulting on review is the safest approach to go. It requires putting no thought or effort into whether something needs secrecy in the first place. So, they just do it even though they don’t have to. It wastes a lot of people’s time.
So, in his view, Galois could definitely get exemption for whole projects or pieces of them that could be in a live repo. He also claimed to have done this at least two times: one a whole project, one pieces of a project. Maybe the Galois people interacting with DARPA should try to default on doing that for anything that will be published to begin with. If it’s possible at all at current time. And I thank them for all the neat stuff they FOSS’d for us up to this point since they didn’t even have to do that. Still a great company. :)
@harpocrates can you pass this along to them so they at least consider it? I’d appreciate it.
I don’t currently write proposals, but I’ll pass the message on.
Don’t hold your breath though - this sounds like something Galois would already have done (perhaps we have? I don’t know…) if possible. We really do make an effort to open-source when possible.
Thanks!