1. 2

    Is there a way to print a string in Rust that generates simpler assembly? Would avoiding the println! macro achieve that?

    1. 2

      I looked into it and it seems that there is. It’s still a little more convoluted than the C example because it has to allocate a handle for stdout, but I think that it’s an improvement over what’s there right now. I’ll make an edit, thank you for the suggestion :)
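      For reference, a minimal sketch of the kind of thing I mean, assuming the goal is just to bypass println!'s formatting machinery by writing raw bytes through a locked stdout handle:

      ```rust
      use std::io::Write;

      fn main() {
          // Acquire the stdout handle once and lock it; write_all sends the
          // bytes directly, with no formatting machinery involved.
          let stdout = std::io::stdout();
          let mut handle = stdout.lock();
          handle.write_all(b"Hello, world!\n").unwrap();
      }
      ```

      Still a little more convoluted than the C version because of the handle, but the generated code should be much simpler than the println! expansion.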

    1. 3

      A message veiled in a personal learning story to make it more palatable. I would not care much, but it scratches an itch.

      There is a slight tests-vs.-type-checking debate going on between the lines. These subjects should not be lumped together, because you can also do Python without any tooling and get along just fine.

      My opinion about types and type checking has changed as well, but it has grown in a very different direction than the poster's. I have also had some sort of enlightenment along the way. I am a die-hard dynamic-typing proponent. I did not need a static type system, and I did not need tests either. The author had to work with something other than Python to form his opinion. I had to go in the other direction: deeper into Python, and finally into my own improved variation of Python.

      If type checking is sound, and if it is decidable, then it must be incomplete. I’ve realized this is really important, as the set of correct and useful programs that do not pass a type checker is large. Worse, these are often the most important sort of programs: the kind that spare a person from stupid or menial work.

      “If it compiles, it works” and “type systems make unit tests unnecessary” are hogwash. It doesn’t really matter how much you repeat them or whether you clothe them in a learning story. There was a recent post pointing out how difficult it is to construct a proof that even a small program is actually correct. This means you cannot expect that a program works or is correct just because it type-checks, in any language.

      There is an aspect that is required for making computation and logical reasoning possible in the first place: recognizing the variant and invariant parts of the program. I’ve realized that spamming variants is the issue in modern dynamically typed languages. That cannot be solved by adding type annotations, because you still have tons of variables in your code that could theoretically change. And you really have to check whether they do; otherwise you have not verified that your program is correct.

      Statically typed languages commonly do better at keeping the variances smaller, but they are also stupid in the sense that they introduce additional false invariants that you are required to satisfy in order to make the type checking succeed. And you cannot evaluate the program before the type checker is satisfied. This is an arbitrary limitation, and I think people defending it for any reason are just dumb. A type checker shouldn’t be a straitjacket for your language. It should be a tool, required only when you’re going to formally verify or optimize something.

      While working on software I’ve realized the best workflow is to make the software work first, then validate and optimize later. Languages like Python are good for the first stage, some ML variants are good for the middle, and C and the like are good for optimization. So our programming languages have been built orthogonal to the workflow that makes the most sense, cutting across it.

      1. 19

        the set of correct and useful programs that do not pass a type checker is large

        If it’s large then you should be able to give a few convincing examples.

        1. 5

          I haven’t had the problem the quote implies. Basic type systems were about enforcing specific properties throughout the codebase and/or catching specific kinds of errors. They seem to do that fine in any well-designed language. When designers slip up, users notice, and avoiding whatever causes the protection scheme to fail becomes a best practice.

          Essentially, the type system blocks some of the most damaging kinds of errors so I can focus on the verification conditions or errors it can’t prevent. It reduces my mental burden by letting me track less stuff. One can design incredibly difficult type systems that try to do everything under the sun, which can add as many problems as they subtract. That’s a different conversation, though.

          1. 1

            This set includes programs that could be made to pass a type checker, given that you put extra work into them, or use a specific type checker for them. Otherwise that set is empty: for every program you can construct a variation where the parts that do not type-check are lifted outside the realm of the type checker. For example, stringly typed code.

            The recipe for constructing a program that does not pass a type checker is to vary things that have to be invariant for the type checker. For example, if you have a function that loads a function, the type of the function that is produced cannot be determined. If the loaded function behaves like an ordinary function, you end up with a dilemma that you have to resolve either by giving it some weird different type that encodes the idea that you do not know the call signature, or by not type-checking the program.
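            As a sketch of that dilemma in a statically typed language (the `load` function here is hypothetical): once the signature of a loaded function is unknown at compile time, the best the static type can say is "we don't know", and the caller must recover the concrete signature with a runtime check.

            ```rust
            use std::any::Any;

            // Hypothetical "loader": the concrete signature of the returned
            // function is only known at runtime, so the static type can only
            // say "unknown" -- the "weird different type" branch of the dilemma.
            fn load(name: &str) -> Box<dyn Any> {
                match name {
                    "double" => Box::new((|x: i32| x * 2) as fn(i32) -> i32),
                    _ => Box::new(()),
                }
            }

            fn main() {
                let f = load("double");
                // The caller must guess the signature and check it at runtime.
                match f.downcast_ref::<fn(i32) -> i32>() {
                    Some(g) => println!("{}", g(21)), // prints 42
                    None => println!("unknown signature"),
                }
            }
            ```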

            Analogous to the function example: if you treat the creation of an abstract datatype as a program, then you likewise have a situation where the abstract datatype may exist, but you cannot type the program that creates it, and you will know the type information for the datatype only after the program has finished.

            And also consider this: when you write software, you are yourself making an effort to verify that it does what you want. People are not very formal, though, and you will likely find your own ways to convince yourself that the program works, but those do not necessarily align with the way the type system thinks about your program. And you are likely to vary the ways you conclude that the thing works, because you are not restricted to just one way of thinking about code. This is also visible in type systems themselves, which can be wildly different from each other, such that the same form of the same program does not type-check in another type system.

            I think in the future I’ll try to collect examples of these kinds of tricky situations. I am going to encounter them, because in my newest language I’ll have type inference and checking integrated into the language, even though the language is very dynamic by nature.

            There is some work involved in putting together proper examples, and I suspect people will have moved on to reading something else by the time I finish, but we’ll come back to this subject eventually anyway.

            1. 6

              Looking forward to seeing your examples, but until then we don’t have any way to evaluate your claims.

              About your function loading example, that may or may not be typeable, depending on the deserialisation mechanism. Again, can’t really say without seeing an example.

              1. 6

                When you write software, you are yourself doing effort to verify that it does what you want.

                That’s exactly why I find type systems so useful. I’m doing the effort when writing the code either way; types give me a way to write down why it works. If I don’t write it down, I have to figure out why it works all over again every time I come back to the code.

            2. 6

              A message veiled in a personal learning story to make it more palatable.

              Why do you think this is a veiled message instead of an actual personal story?

              1. 5

                If type checking is sound, and if it is decidable, then it must be incomplete.

                Only if you assume that some large set of programs must be valuable. In my experience useful programs are constructive, based on human-friendly constructions, and so we can use a much more restricted language than something Turing-complete.

                1. 1

                  If type checking is sound, and if it is decidable, then it must be incomplete.

                  That’s not a bug. That’s a feature.

                  If you can implement a particular code feature in a language subset that is restricted from Turing completeness, then you should. It makes the code less likely to have a security vulnerability or bug. (See LangSec)

                1. 4

                  One way to look at it is to ask what life features are useful in remote work and then work backwards from there. For example, as a remote worker I wish devices didn’t matter: I could sit down at any device, it would be set up exactly how I like it, I could just start using it, and it would magically be on a network that lets me do my work. Right now I lug around a laptop and need VPN software installed, and if my laptop breaks I have to get a new one and set it up again (which can be easy, but only if you prepare for it).

                  In that sense, I want an OS that supports this kind of work. If anyone’s seen The Expanse: people take their little pocket device, flick it towards a screen in their area, and now they’re using that screen (MSFT was trying to do something like this with Continuum, which I think is dead, and Motorola with the Atrix). For all of that to work, I think you need a safe language that makes writing robust network code easy but also makes interfacing with the OS easy. Maybe something like Mozart/Oz, which supports this idea of moving data around.

                  Of course, the above is all a cool fantasy. In reality, I don’t find that being a remote worker changes how I work compared to an office. When I’m writing code I either run it locally or on a cloud machine, just as if I were in an office. The networking is the same, except that I need a VPN rather than just being on the office’s network. The actual mechanics of working remotely are the same; it’s just that talking to my coworkers is harder. I guess what I’m saying is that I don’t think the technology really changes here; it’s social.

                  1. 2

                    Seems to me like you just want plan9 (with a network of terminal servers in public spaces).

                    In an ideal world you could carry your filesystem server in your pocket and use public CPU servers for computations the public terminal can’t handle.

                    1. 1

                      It’s even more than that: being able to move running applications across hardware, for example, not just the display. Or, instead of firewalls, having authentication everywhere, fractally, so I can connect to anywhere from anywhere. There are a lot of structures, and a lot of how we do things, that would need to change to hit this ideal.

                      1. 1

                        Hmm, the closest thing I can think of to moving running applications across hardware is the Acme Dump/Load command, which is really just dumping a snapshot of acme’s state to a file and loading the file at a later date. I use it so I can switch projects and still keep the context I had last time. It’s especially useful given acme’s way of writing commands in tag bars. I’ll sometimes share those files over the network and use them on another machine where the file hierarchy for that project is the same.

                        But then, that’s a lot less powerful, and it only makes sense in plan9, where you can import a hierarchy of files from one machine to another.

                        1. 1

                          There is a bunch of academic work on this. For example, Mozart/Oz lets programs be moved across machines. I believe Tom7’s dissertation offers some ideas on this as well:

                          http://www.cs.cmu.edu/~tom7/papers/modal-types-for-mobile-code.pdf

                    2. 1

                      The X Window System has been capable of this since I think the ’80s. The X Server has a message protocol and one server can serve multiple clients at the same time. A while back Joe Armstrong of Erlang fame even hacked together a simple client library that can talk to X: https://3e8.org/pub/scheme/doc/p1-armstrong.pdf

                      Thanks to Erlang’s concurrency model, talking to the X Server turns out to be really easy:

                      Label1 = swLabel:make(Win,10,10,220,30,0,?cornsilk,"First name:"),
                      Entry1 = swEntry:make(Win,140,10,120, "Hello"),
                      Label2 = swLabel:make(Win,10,60,220,30,?cornsilk, "Last name:"),
                      Entry2 = swEntry:make(Win,140,60,120,"World"),
                      Button = swButton:make(Win,10,100,120,30,?grey88, "Swap"),
                      
                      Button ! {onClick, fun(X) ->
                        Val1 = Entry1 !! read,
                        Val2 = Entry2 !! read,
                        Entry1 ! {set, Val2},
                        Entry2 ! {set, Val1}
                      end},
                      
                      loop().
                      

                      Under this model, you send messages not just to the server but to the individual widgets, e.g. telling them what to do to handle events.

                      1. 2

                        Yes, I know about Erlang and X11. I know the X Window System is capable of doing the client-server thing (very insecurely, mind you), but that is only a small part of what I’m talking about. I still need to be on a network that can connect to the X server and on a network that my work is on, etc. My data needs to move around the world with me (securely). There are a lot of components missing.

                    1. 8

                      For me, as a browser security engineer, it’s striking that security is only mentioned once, and it’s about the server side, not the client. Rust shows its benefits just in the amount of time not wasted debugging C++’s various forms of unsafety.

                      I wonder if this is quantifiable. Conventional wisdom is that Rust can be relatively difficult to learn compared to other languages, but if you can demonstrate that it saves you the time otherwise spent on debugging and dealing with security issues, that’d be a powerful argument.

                      1. 9

                        It’s a whitepaper, so it isn’t intended to cover the whole gamut. I’m giving a talk on the security aspects of Rust next week, though, which will be taped. I may ping you if the recording goes up and I remember.

                        conventional wisdom is that Rust can be relatively difficult to learn

                        Depends on what your baseline and your goal are. It’s a language built for a medium pace, resulting in stable software.

                        I teach Rust professionally and at a learners’ group. The general takeaway is that strict enforcement of single ownership is something people really have to get used to, although it’s often a useful line of thinking in general programming, too. I don’t find Rust hard, but it took some time for the community to get used to it. It isn’t Go, which is built completely around being easy to pick up. For example, a lot of early Rust 1.0 code put a heavy emphasis on borrowing; now, three years in, people are moving towards ownership everywhere, and things get a lot easier. There’s now a lot of code to look at that can be considered idiomatic. We have a lot of people around who are competent with the language and can steer newcomers the right way. People became hyper-focused on having to understand lifetimes; now I give a 30-minute lecture in my courses on how you are often semantically and computationally better off avoiding them. That makes the whole language much easier.
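                        A tiny sketch of that point, with two hypothetical Parser types: the borrowed variant drags a lifetime parameter into every signature that touches it, while the owned variant stays lifetime-free.

                        ```rust
                        // Borrowed style: the lifetime parameter spreads to every
                        // user of this type.
                        #[allow(dead_code)]
                        struct ParserBorrowed<'a> {
                            input: &'a str,
                        }

                        // Owned style: owning the data keeps signatures simple.
                        struct ParserOwned {
                            input: String,
                        }

                        impl ParserOwned {
                            fn new(input: impl Into<String>) -> Self {
                                ParserOwned { input: input.into() }
                            }
                        }

                        fn main() {
                            let p = ParserOwned::new("hello");
                            println!("{}", p.input.len()); // prints 5
                        }
                        ```

                        The owned version costs an allocation, but nobody reading it ever has to reason about how long the input outlives the parser.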

                        Sooo, the whole thing became kind of a meme, and its foundations are questionable. People learn hard languages all the time, especially in a space where C++ is dominant.

                        1. 2

                          Do you have a link handy for your lecture about how it’s better to avoid lifetimes? I’m interested to know since the borrow checker is one of Rust’s most famous capabilities.

                          1. 2

                            I’d be interested in that, too, given that I looked at Rust back when people were talking about borrowing and lifetimes a lot.

                        2. 3

                          They’re doing game development, which means most of the time security is their last priority.

                          1. 2

                            Well, crashes were often how consoles got rooted in the end. The game developers might not care, though the companies making the consoles probably do.

                            1. 14

                              In that case, we should encourage them all to use C/C++ to ensure the eventual freedom of our devices. The good news is they all switched to the very CPUs that have the most attacks and the most experienced attackers. Probably not going to be necessary. ;)

                              1. 3

                                Yeah, I for one hope that we continue to write games in unsafe languages so that consoles can be rooted with Long Horse Names

                          2. 2

                            “…but if you can demonstrate that you save the time on debugging and not dealing with security issues, that’d be a powerful argument.”

                            That’s the exact argument used by the quality- or correctness-improving methodologies I often mention, like Cleanroom. Older ones like the Fagan Inspection Process said the same thing. The reason is that problems are easier and cheaper to prevent or fix earlier in the lifecycle in most cases. They can actually be several times cheaper to prevent than to fix. There’s usually an upfront cost, but the savings in debugging or maintenance often wiped it out in industry usage. Not always, though: sometimes the quality did cost something extra by the end of the project. Even then it came with other benefits, making it an investment with ROI rather than just a pure cost.

                            So, there’s a lot of evidence to support your argument.

                          1. 7

                            I think there’s blame to go around. For example, part of it is web designers trying to treat the web like a print medium and forcing devs to reproduce designs with pixel-perfect fidelity. If you care that much about pixels, you simply haven’t understood what a web page is.

                            1. 1

                              I once saw a designer who gave advice such as “internal margins always 5px”, “everything clickable this colour”, “no space between these kinds of elements”, along with illustrations and mockups to give an idea.

                              One could just about put these into a CSS stylesheet, attach it to any kind of HTML content, and it would work!

                              1. 2

                                It would be excellent if we came up with guidelines like these and they became standard.

                            1. 18

                              Correctness is not why I prefer static types; I find code with static types much easier to read.

                              1. 11

                                I’m going through an episode of this right now. I’ve been hacking on a popular package manager written in Ruby. I love the app, its ecosystem, and its governance and consider it to be one of the most successful software projects written in Ruby and one of the most successful open source projects for its target platform. You’ve probably guessed what it is by now.

                                Ruby wasn’t my first language (PHP) or my second (Java) or my third (Smalltalk). However, Ruby was what I used outside of work for everything for about five years between when I stopped doing PHP for work and started doing… XSL, HTML, and JavaScript. While I don’t know Rails very well because my tasks with it were rarely web-oriented, I do know Ruby pretty well.

                                Or at least did.

                                You see, I got into Scala at the end of 2013 when I transferred departments temporarily and then eventually permanently. To my Ruby mind, Scala was the marriage of Java and Ruby with all of that Scheme goodness I’d enjoyed in college after learning Smalltalk.

                                So, for the last 4+ years, my brain has been Scala. Sure, I’ve used Ruby here and there, but the tracing skills atrophied as my needs for debugging Ruby dried up. I’ve gotten really used to debugging with IntelliJ, using code completions, not needing to keep a documentation browser tab open all of the time, etc.

                                These days, just tracing down what’s coming from where in Ruby is… a slog.

                                I feel like a lot of the Ruby code I encounter works on assumptions, not assurances. It makes me uncomfortable now. These days, I’m doing Scala and Rust at work. Encoding those assurances in Ruby would, I fear, tremendously slow down the execution of the interpreted code and greatly reduce its simplicity by introducing far more repeated incantation: the kind of complexity handled at compile time in compiled languages.

                                1. 7

                                  So weird that you’re saying this, because in another post ( http://www.drmaciver.com/2016/07/it-might-be-worth-learning-an-ml-family-language/ ), MacIver says:

                                  I’ve been noticing something recently when teaching Python programmers to use Hypothesis that has made me reconsider somewhat.

                                  That skill is this: Keeping track of what the type of the value in a variable is.

                                  In particular this seems to come up when the types are related but distinct.

                                  … this is by far the second most common basic usage error people make with it….

                                  … it’s definitely a common pattern. It also appears to be almost entirely absent from people who have used Haskell (and presumably other ML-family languages – any statically typed language with type-inference and a bias towards functional programming really)

                                  … in an ML family language, where the types are static but inferred, you are constantly playing a learning game with the compiler as your teacher (literally a typing tutor): Whenever you get this wrong, the compiler tells you immediately that you’ve done so and localises it to the place where you’ve made the mistake.

                                  This is probably a game worth playing.

                                  1. 3

                                    To me, this is the game played when not using an IDE with completions! To each their own – some folks prefer to memorize methods or keep documentation open – but completions allow me to work faster and not have to keep as much in my head from session to session.

                                2. 9

                                  Yes, types actually communicate a great deal about the intentions and expected inputs and outputs of functions. If a function returns a Weight, that is a lot different than something returning a Velocity and might cause you to question the name of the function “GetVelocity”.
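                                  For instance, in Rust a pair of hypothetical newtypes puts the unit right into the signature:

                                  ```rust
                                  // Hypothetical newtypes: wrapping the raw f64 makes the unit
                                  // part of the type rather than a naming convention.
                                  #[allow(dead_code)]
                                  struct Weight(f64);   // kilograms
                                  struct Velocity(f64); // metres per second

                                  // A `get_velocity` that returned `Weight` would now read as an
                                  // obvious mistake: the signature would contradict the name.
                                  fn get_velocity(distance_m: f64, time_s: f64) -> Velocity {
                                      Velocity(distance_m / time_s)
                                  }

                                  fn main() {
                                      let Velocity(v) = get_velocity(100.0, 8.0);
                                      println!("{v}"); // prints 12.5
                                  }
                                  ```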

                                  1. 7

                                    Which is also an important part of correctness in the maintenance phase: the most expensive phase for both defects and changes.

                                  1. 2

                                    Technical: From Mathematics to Generic Programming by Alexander Stepanov (author of the C++ Standard Template Library) and Daniel Rose (Senior Manager at A9.com). I’m targeting it for this year because many of the techniques of type-safe programming and optimisation are deeply tied to mathematics and generic programming.

                                    Non-technical: there will probably be quite a few, but I do want to finish Harari’s Sapiens and Homo Deus at least.

                                    1. 4

                                      Guy Steele’s invited talk at Clojure/conj 2017, on the meta-language of computer science: https://youtu.be/dCuZkaaou0Q

                                      1. 7

                                        One thing that is rarely mentioned in the static vs dynamic debate is whether the programming language compiler is actually the right place for static typing.

                                        I have been doing a lot of work lately that makes me believe that schemas and types are much more useful in databases and wire formats than they are in programming languages themselves. Examples for me include data modeling in Elasticsearch (where types determine query and aggregation capabilities) and Parquet (where types determine storage costs and read/parse speeds). I would contrast these two with, say, MongoDB and JSON, where the lack of types and schemas in these layers leads to massive loss of “effectiveness”, to borrow Hickey’s term.

                                        Meanwhile, in programming language compilers themselves, types are often a source of “wrestling with the compiler” for even the well-trained senior programmer and are rarely the source of order of magnitude performance gains, at least not at cluster scale and petabyte data scale. The loss of programmer time is also simply a bad opportunity cost bet. Even the single-core speedups come more from “low-level optimizations” than from type models, e.g. in Python they come from writing code in C and Cython to strip out interpreter overhead.

                                        I realize here I am talking much more about speed than program correctness. But for me, correctness comes at such a higher level than the code itself. It comes from users.

                                        1. 3

                                          Even the single-core speedups come more from “low-level optimizations” than from type models, e.g. in Python they come from writing code in C and Cython to strip out interpreter overhead.

                                          Non sequitur. Python is dynamically typed and the speedups come from implementing it in C … which is statically typed (to an extent, anyway).

                                          1. 5

                                            I’d say C is “machine typed”, rather than statically typed in the way Haskell programmers or Java/Scala programmers think about the world.

                                            1. 3

                                              It’s not machine-typed, and that’s on purpose. BCPL was machine-typed:

                                              https://en.wikipedia.org/wiki/BCPL

                                              Thompson and Ritchie changed that. See this vid:

                                              https://vimeo.com/132192250

                                        1. 1

                                          Obligatory upvote because I used to work with the creator of tig and he’s a wizard.

                                          1. 2

                                            To construct a list of bugs we could uniformly sample, we took a snapshot of all publicly available JavaScript projects on GitHub, with their closed issue reports. We uniformly selected a closed and linked issue, using the procedure described above and stopped sampling when we reached 400 bugs. The resulting corpus contains bugs from 398 projects, because two projects happened to have two bugs included in the corpus.

                                            One threat to validity they didn’t cover is that they didn’t control for testing practices, so we don’t know if the bugs were from well-tested projects or not.

                                            1. 3

                                              That is irrelevant to the method they used: take actual bugfix commits and see whether the bug would have been caught by static typing. Whether the project has tests or not doesn’t matter; it had the bug, and it needed a commit to fix it. Static analysis runs before one commits.

                                              1. 2

                                                It’s irrelevant to their method but not their thesis.

                                                Evaluating static type systems against public bugs, which have survived testing and review, is conservative: it understates their effectiveness at detecting bugs during private development, not to mention their other benefits such as facilitating code search/completion and serving as documentation. Despite this uneven playing field, our central finding is that both static type systems find an important percentage of public bugs: both Flow 0.30 and TypeScript 2.0 successfully detect 15%!

                                                They haven’t demonstrated that these bugs survived testing, because they haven’t controlled for the prevalence of unit testing Javascript projects. Consider the extreme case where 0% of their 400 bugs came from tested projects. Then a viable objection would be “You don’t know whether or not static typing is that effective, because for all we know, all of those bugs and more would have been caught with the same investment of unit testing!”

                                                Don’t get me wrong, I was incredibly excited about this study until I noticed this. I want to finally have an answer to this debate, and I think studying gradual typing is the best way to find it. But I’m also a huge curmudgeon about these things and want to keep things absolutely watertight.

                                                Fortunately, the reviewers were wonderful enough to put their full list online. I clicked on four projects at random: two had some form of unit tests, two did not. Thinking of digging in deeper later.

                                                EDIT: just emailed them to get their cleaned-up data.

                                                1. 3

                                                  … all of those bugs and more would have been caught with the same investment of unit testing!

                                                  Surely not the same investment? These are all bugs they fixed with a typechecker plus within ten minutes of coding each. It’s fair to say that adding those few judicious annotations and typechecking in the first place would have prevented those bugs in less than half the time each. In fact, it’s hard to believe a single unit test could provide even the same level of guarantee that one static type would provide. So I can’t bring myself to believe that this basic typechecking would have been the same or a greater investment.

                                                  1. 2

                                                    To be clear, I personally find this pretty exciting evidence, but there’s been a lot of research on this topic that misses some critical objection that muddles it. And if there’s one thing physics has taught me, it’s that common sense is a lying bastard ;) That’s why I like research as hardened as possible.

                                                    Re testing, there’s a couple of arguments you could make here. One is that unit tests can catch logic problems in addition to type errors, so maybe you could get the type-checking “free” in your unit tests. Another is that while unit tests are pretty high investment per test, they’re also among the least powerful testing tools you have! But what about something like Quickcheck?

                                                    That’s why I think it’s worth, at the very least, reviewing the testing infra these projects had. If it turns out that they all had extensive testing, that’s a slam-dunk for typechecking. If none of them did, it’d be worth doing another experiment to see how much time you need to cover the same bugs with tests.

                                                    There are a couple of tests (haha) I’m planning on running:

                                                    1. Of the files with bugs, how many were imported into files with ‘test’ or ‘spec’ in their name? This should give us a rough lower bound of test coverage.
                                                    2. Of the commits that fixed bugs, how many also added a covering unit test? This should give us a rough upper bound of test coverage.

                                                    Another fun idea might be to look at projects with low and high test coverage and see how many intra-project issues could have been fixed with a type-checker.

                                            1. 8

                                              I’m speculating that closing their source code is the result of this ruling about ownership of content on social network sites. After all, if your source code is open and you don’t own your site content, what assets do you actually have? reddit has always argued that it holds an exclusive license to its users’ submissions, but if this ruling weakens that argument then they probably have to take some steps to mollify their investors.

                                              1. 3

                                                Interesting ruling w.r.t. LinkedIn, happy to see it! I did notice that they recently seem to have stopped blocking my unauthenticated access as much as they used to in the past.

                                                I think what sites effectively own is the brand and domain, the trust and goodwill of their users, and the private information that is not shared publicly, including emails and passwords. It makes little sense that some sites block anonymous access when all of that information is still available to anyone with a human-like bot behind a simple free account, or who just knows their Google-fu well enough.

                                                1. 3

                                                  If your source code is open source, you still own it as long as you require contributor license agreements.

                                                1. 1

                                                  I tried OCaml for a bit but the weirdness got to me after a while. There was a ton of magic around project setup and compilation that I didn’t understand and couldn’t find properly explained, and the fact that there is more than one “standard” library bugged the heck out of me. I’m hoping that once the Linux story solidifies a bit more around .NET I’ll be able to reasonably give F# a shot.

                                                  1. 3

                                                    I’ve been using F# on Linux for a few years now using Mono. It’s a bit more manual than .NET Core, but it’s stable.

                                                    1. 3

                                                      If you’re interested in trying again, I created a build system (yes, yet another one) specifically designed for getting going fast in most cases. I have a blog post here:

                                                      http://blog.appliedcompscilab.com/2016-Q4/index.html

                                                      Short version: all you need is a pds.conf, which is in TOML so fairly straightforward, a specific directory structure (src/<project>), and GNU Make. Then you run pds && make -f pds.mk and you’re done. Supports tests as well as debug builds.

                                                      1. 5

                                                        I’m not sure it is worth pushing yet another build system that seemingly nobody uses (at least I haven’t yet run across a package which uses it) when jbuilder seems to be gaining so much momentum in the OCaml world lately.

                                                        1. 3

                                                          Maybe, but pds is pretty easy to port away from for most builds and it’s so trivial to get started and much less confusing than jbuilder’s config, IMO. My personal view is that jbuilder is a mistake but I’ll wait to switch over to it once it’s gained enough momentum. At that point, I can just switch pds over to producing jbuilder configs instead. But I’m a symptom of the problem rather than the solution unfortunately. I also use @c-cube’s containers, so yet another stdlib replacement/extension :)

                                                          1. 4

                                                            My personal view is that jbuilder is a mistake

                                                            Could you elaborate on why? IMO jbuilder is not perfect either, but if we get a modern, documented build system which is hopefully easy to set up, it would be a massive win over all the other solutions we currently use.

                                                      2. 1

                                                        I agree, the different choices in tooling are sort of disorienting and can lead to analysis paralysis. For a toy compiler project I started working on, I tried to find the most basic tooling that would work: whatever OCaml compiler came with my distro, ocamlbuild, make, and then eventually extlib, ocp-indent, and after some more time, opam, ocamlfind, and utop. It may make sense to use the tooling outlined in this article if future maintainability is a big concern, but to get started and to learn OCaml, I don’t find it necessary (and definitely not appealing). Having done this, I don’t pine so much for standardization (;

                                                        1. 1

                                                          There’s more than one standard library in a lot of languages, though. Why does that bother you?

                                                          1. 4

                                                            It bothers me because it makes the language more difficult to learn. It also wasn’t always clear to me that an alternative was in use because, IIRC, they’re not (always) clearly namespaced. I have run into this in Haskell as well, FWIW.

                                                            1. 2

                                                              Typically it’s visible when you use an alternative stdlib because you start your files with open Batteries or open Core or open Containers. I agree it’s annoying that the stdlib is not richer, and it’s a bit slow to accept contributions, but in a way the existence of alternative stdlibs/extensions shows how easy it is to roll your own :-)

                                                            2. 4

                                                              You can’t have two standards, that’s a double standard!

                                                              1. 1

                                                                Which languages?

                                                                1. 1

                                                                  Haskell, C, and D come to mind. You could also argue that Python has multiple standard libraries because it has different implementations that effectively can’t use some aspects of the normal stdlib (PyPy). Then there’s Java: SE, EE, and ME are the same language with different sets of functionality in the standard libraries.

                                                              2. 1

                                                                Out of curiosity, have you tried OP’s project setup?

                                                                Also, there is only one OCaml standard library–the one that comes bundled with OCaml. The other ‘standard libraries’, Batteries and Jane Street’s Core, are optional add-ons made for specific purposes.

                                                                1. 2

                                                                  I haven’t tried OP’s setup, but honestly it seems even worse than what I had. I pretty much followed this: https://ocaml.org/learn/tutorials/get_up_and_running.html. I ended up using Oasis, which was just awful: every time I added a file or dependency I had to fiddle with the config until everything would build again, but at least there wasn’t an entirely separate language.

                                                                  From OP:

                                                                  (jbuild_version 1)
                                                                  
                                                                  (executable
                                                                    ((name main)                 ; The name of your entry file, minus the .ml
                                                                     (public_name OcamlTestProj) ; Whatever you like, as far as I can tell
                                                                     (libraries (lib))))         ; Express a dependency on the "lib" module
                                                                  

                                                                  Note the comment, “as far as I can tell”. To me, that’s a terrible sign. A person who has gone to a reasonable amount of effort to explain how to set up a project can’t even figure out the tooling completely.

                                                                  1. 2

                                                                    Jbuilder is quite nicely documented (see http://jbuilder.readthedocs.io/en/latest/). The public_name field defines the name of the produced executable in the install context. It does not take much effort to read it from there.

                                                                    1. 2

                                                                      Of course you still have to find out that Jbuilder exists, which the official site doesn’t seem to mention… I am lazy, I don’t like choices, I just want one, blessed tool that works more or less out-of-the-box if you follow a set of relatively simple rules (I’m even OK with wrapping the tool in a simple, handwritten Makefile, which is what I do in Go). I’m not arrogant enough to think that the way I prefer is the “right” way, in fact in some cases it would be dead wrong (like for extremely complex, multi-language software projects), but that explains why I dropped OCaml for hobby stuff.

                                                                      1. 1

                                                                        OK, but your criticism is that you have to find out that JBuilder exists, commenting on a post that tells you about JBuilder.

                                                                        1. 1

                                                                          To be fair, jbuilder is very young (not even 1.0 yet actually) but it might become the “standard” build tool the OCaml community has been waiting for for years (decades?). Then clearly there will be more doc and pointers towards it.

                                                                          1. 1

                                                                            Well obviously I know about it now, but it still isn’t terribly discoverable for someone new to the language. My actual point, and I probably didn’t make this as clear as I should have, sorry, is that in my experience OCaml isn’t very friendly to beginners, in part because its tooling story is kind of weak and fragmented.

                                                                            1. 2

                                                                              Yeah. This is true. Especially on Windows. People are working on it but it’s slow and it’s taking time to consolidate all the disparate efforts. I myself am not getting terribly excited about OCaml native but funnily enough I am about BuckleScript (OCaml->JS compiler) because of its easy setup (npm i -g bs-platform) and incredible interop story.

                                                                              Others are getting equally into ReasonML (https://reasonml.github.io/) because it’s coming from a single source (Facebook) and is starting to build a compelling tooling/documentation story.

                                                                              1. 2

                                                                                I didn’t know about either of these, thanks!

                                                                      2. 1

                                                                        OP here: I didn’t really make any effort to pursue documentation re: the public_name field, and I have really almost no production experience with OCaml whatsoever. I certainly have complaints about OCaml’s tooling, but I can assure you that any argument against it appealing to my authority is certainly flawed.

                                                                        1. 1

                                                                          I wasn’t really appealing to your authority, in fact kind of the opposite. I don’t like using systems that converge to copy-paste magic, and that seems to be what you did, and is likely what I would do. I don’t want to use a weird programming language to configure my project, I want something simple, with happy defaults, that can be understood easily.

                                                                          I guess I generally prefer convention over configuration in this case, and that doesn’t seem to be what the OCaml community values, which is why I gave up on it. I’m not saying anyone is right or wrong, it’s just not a good fit for me, particularly for hobby projects.

                                                                  1. 3

                                                                    Typechecking the entire stack is a very powerful concept. Richard Feldman and the Elm community are at it too, e.g. elm-css is trying to capture the syntax of valid CSS in the type system so that you literally can’t compile invalid CSS in Elm. As we find ways to move more and more declarative components into the type system, the future looks bright for languages which have great tooling support for easy autocompletion, jump to definition, and the other features you describe.

                                                                    1. 2

                                                                      How is his conclusion not a point for Python as well? Because you can do it impurely in Python as well as purely? I’m not particularly impressed by this.

                                                                      1. 15

                                                                        His point is that Haskell forced him to think about separation of concerns. With Python it was easy for him to mix them together.

                                                                        1. 2

                                                                          In a sense, yet his case is so trivial he could risk invoking Zalgo and use grep for the correct result.

                                                                          Python still gets the job done, when purity isn’t always the main concern.

                                                                          Which does not exclude writing more pure code in Python or refactoring hacks to higher quality if need be.

                                                                          1. 3

                                                                            I’m not sure where this reply fits into the topic of how Haskell forced them to think about separating concerns, where Python doesn’t.

                                                                            1. 3

                                                                              In inane examples like this, the better programmer in the real world delivers and may not care about separation of concerns. At the risk of sounding like someone like Zed Shaw.

                                                                              If a Python hack grows and requires proper tests or anything else, the coder will learn about separation of concerns. That’s what’s required in the long run, right? Even if it isn’t the language enforcing it.

                                                                              A lot of issues that pop around these topics are non-issues in practical life, and I’d be very surprised if this article series ends up giving Python a single point. But I’d expect there to be mainly ties and maybe an occasional contestable point for Haskell, like in this case, where the better programmer would have delivered already but with less prowess in purity.

                                                                              1. 2

                                                                                If a Python hack grows, requires proper tests, anything, the coder will learn about separation of concerns. That’s what’s required in the long run, right?

                                                                                This is a very hand-wavy argument. Just because separation of concerns (SoC) is the “right way” doesn’t mean that a competent programmer will discover it given enough time. And even if they do discover SoC, humans are fallible and they will forget to do it properly at times.

                                                                                Haskell just automates this whole process for you.

                                                                                1. 2

                                                                                  Learning new languages is, of course, a fine thing, but something disturbs me here.

                                                                                  This example is just too simple and/or whatever. Maybe there’s deeper stuff coming, but so far learning a new language is an overkill solution.

                                                                                  The concepts aren’t unique to Haskell or whatever, either. A coder who cares can learn about these practices without switching languages: learn how things are done in other languages and see how they apply.

                                                                                  Maybe it’s just this example that feels completely silly to me.

                                                                      1. 1

                                                                        Working on a pet project to teach myself Cycle.js as well as the ins and outs of programming in OCaml+BuckleScript. Also using this pet project to explore a new design for threaded conversations.

                                                                        1. 7

                                                                          Please, don’t conflate “functional” (e.g. Clojure) with “typeful” (e.g. Rust). They are different, though not incompatible: Functional style simplifies structure, typeful style makes structure explicit.

                                                                          1. 1

                                                                            Firstly, “typeful” is more ML than Rust. While it’s unfortunate that “functional style” has different meanings in different contexts, that’s the reality we live in. Statically typed languages manifest functional style differently than dynamically typed languages, and they should.

                                                                            1. 6

                                                                              ML is both typeful and functional. I wanted to give examples of a language that is functional but not typeful (Clojure) and another that is typeful but not functional (Rust), to emphasize that both qualities needn’t come hand in hand in a language design.

                                                                              Just to be clear: By “typeful”, I don’t mean “statically typed”, but rather “statically typed and the types tell you a significant part of what’s going on in the program”. It happens to be the case that statically typed functional languages are typeful, because it happens to be the case that the type structure of functional programming is well understood: the Brouwer-Heyting-Kolmogorov and Curry-Howard correspondences predate computers and programming languages themselves! What makes Rust an interesting case is that it’s one of the first attempts at capturing the type structure of imperative programming (mutation, aliasing, concurrency control).

                                                                              1. 1

                                                                                I see, I had misinterpreted your intent. I agree that Rust has aspects of ADTs but isn’t functional-first, whereas Clojure is functional-first but is missing ADTs. But in the context of the language they’re talking about, I’d definitely say the type system is intertwined with its functional style. So I don’t think there’s any conflation, but I can see now why you felt it was important to point that out.

                                                                                1. 3

                                                                                  The type systems of Scala, Haskell, ML, etc. happen not to have any means to accurately describe the structure of imperative programs beyond I/O segregation and reference cells, as a way to say “here be dragons!” That’s why typeful programming in those languages exerts pressure towards using a functional style. This is a limitation of those particular languages, not typeful style, though.

                                                                                  1. 3

                                                                                    Weird claim, the monad bind operation is a perfect way to model imperative operations. Also, ML and Scala don’t enforce that segregation at all.

                                                                                    1. 2

                                                                                      Monads allow you to embed an effectful object language inside a pure metalanguage, but they don’t let you actually tame the effects that happen in the effectful language. For instance, Haskell’s IO is actually unsafe in the presence of concurrency: you can mutate the same IORef from two different threads at the same time, and the results will be just as disastrous as if you did the same thing in Java or C++ or whatever.
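                                                                                      For concreteness, here’s a hedged sketch of that point using only standard base-library APIs (the function name is illustrative, not from the thread). atomicModifyIORef' makes each read-modify-write step indivisible; plain modifyIORef' does a separate read and write, and the IO type says nothing to distinguish the two:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM, forM_, replicateM_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- Four threads each bump a shared IORef 10,000 times. Because
-- atomicModifyIORef' makes every read-modify-write indivisible,
-- the final count is exactly 40,000. Swapping in plain modifyIORef'
-- still type-checks, but interleaved threads can silently lose
-- increments -- exactly the unsafety being described.
countWithAtomics :: IO Int
countWithAtomics = do
  ref <- newIORef (0 :: Int)
  dones <- forM [1 .. 4 :: Int] $ \_ -> do
    done <- newEmptyMVar
    _ <- forkIO $ do
      replicateM_ 10000 (atomicModifyIORef' ref (\n -> (n + 1, ())))
      putMVar done ()
    pure done
  forM_ dones takeMVar
  readIORef ref

main :: IO ()
main = countWithAtomics >>= print  -- 40000
```

                                                                                      Nothing in the types forces you to pick the atomic variant, which is the sense in which IO alone doesn’t tame concurrent mutation.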

                                                                                    2. 2

                                                                                      The type systems of Scala, Haskell, ML, etc. happen not to have any means to accurately describe the structure of imperative programs beyond I/O segregation and reference cells, as a way to say “here be dragons!”

                                                                                      It sounds like people’s use of the free monad over algebras representing their chosen effects is one of many solutions to this problem. Do you mean something else?

                                                                                      1. 2

                                                                                        Can you use monads to statically enforce correct concurrency control? Or to prevent use-after-free for resources that need manual cleanup?

                                                                                        As far as I can tell, the only effects that can be adequately managed using monads alone are those that can be “desugared” back to purely functional programs: exceptions (EitherT), nondeterminism (ListT), etc. For “intrinsically imperative” stuff, you need something else.
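                                                                                        A minimal illustration of the “desugaring” point, using ExceptT (the modern name for EitherT; the helper names here are mine, not the poster’s): once the transformer is run, the exception effect evaporates into an ordinary pure Either value:

```haskell
import Control.Monad.Trans.Except (ExceptT, runExceptT, throwE)
import Data.Functor.Identity (Identity, runIdentity)

-- Division in an "exception effect" over the trivial Identity monad.
safeDiv :: Int -> Int -> ExceptT String Identity Int
safeDiv _ 0 = throwE "division by zero"
safeDiv x y = pure (x `div` y)

-- Running the transformer leaves a plain, pure Either: the effect
-- desugars away entirely, which is the sense in which such effects
-- are adequately managed by monads alone.
run :: ExceptT String Identity a -> Either String a
run = runIdentity . runExceptT

main :: IO ()
main = do
  print (run (safeDiv 10 2))  -- Right 5
  print (run (safeDiv 1 0))   -- Left "division by zero"
```

                                                                                        Use-after-free or lock discipline has no such pure residue, which is why it needs something beyond this pattern.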

                                                                                        1. 2

                                                                                          I believe you can use indexed monads to model substructural types, but nobody, as far as I can see, actually does this in real code. Nobody even understands what indexed monads are from a mathematical perspective! In any case, Rust definitely makes it much more practical to write safe imperative code that handles resources concurrently and efficiently. I’d still prefer the Haskell way though, since it’s better at everything else :) when efficiency is a concern, I drop down to C++ and am really, really careful.

                                                                                          1. 1

                                                                                            I don’t think it’s fundamentally impossible to design a single language that’s good at both what ML and Haskell do well (manipulate “eternal” values) and what Rust does well (manipulate “ephemeral” resources). This is a very rough sketch of what I think such a language could look like:

                                                                                            • Use call-by-value evaluation.

                                                                                            • Distinguish destruction (which is deterministic and controlled by the programmer, as in Rust) from deallocation (which is performed by a garbage collector, as in ML and Haskell).

                                                                                            • Distinguish a special class of types, let’s call them cptypes, that are guaranteed to have trivial destructors. This works in a similar fashion to Standard ML’s eqtypes, which are guaranteed to have decidable equality. See here and here for details.

                                                                                            • The basic type formers are sum, tuple, function and object types.

                                                                                              • Sum and tuple types are cptypes iff their type parameters and constructor payloads are cptypes. This mirrors the rule for eqtypes in Standard ML.

                                                                                              • Function types come in two flavors: functions (-o) and cpfunctions (->). foo -> bar is always a cptype, and foo -o bar never is (regardless of what foo and bar are), and the former is a subtype of the latter.

                                                                                              • Object types are never cptypes, since objects (and only objects!) can have programmer-defined destructors.

                                                                                            • A value can be used more than once iff its type is a cptype.

                                                                                            Of course, this glosses over many details, and figuring out those details is precisely why it took Rust so many years to go from 0.x to 1.x. But a new language has the benefit of hindsight, and, in any case, when exploring the design space, “do as Rust does” is a sensible starting point.


                                                                                            As for indexed monads, I object to the name, unless someone can show that they are actual monads. In any case, they don’t seem to be elegant.

                                                                                          2. 2

                                                                                            Haskell’s ST data type is a way to prevent leaking of state. It’s possible to use ST to ensure resources stay within a scope and are cleaned up.

                                                                                            https://hackage.haskell.org/package/base/docs/Control-Monad-ST.html

                                                                                            I’m not completely sure about concurrency control. You mean something like Haskell’s acid-state?

                                                                                            1. 2

                                                                                              Haskell’s ST data type is a way to prevent leaking of state. It’s possible to use ST to ensure resources stay within a scope and are cleaned up.

                                                                                              ST is effectively a single monadic region: all resources are deallocated all-at-once at the end of the ST action that you pass to runST. But, even considering nested monadic regions in their full generality, you couldn’t possibly express the following pattern:

                                                                                              • Allocate A
                                                                                              • Allocate B, process A-B, deallocate A
                                                                                              • Allocate C, process B-C, deallocate B
                                                                                              • etc.
                                                                                              • Deallocate Z.

                                                                                              This pattern arises, for example, in the implementation of concurrent B+ tree variants internally used by DBMSs.
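                                                                                              For reference, here is the single-region behavior in miniature (a standard ST sketch with an illustrative function name): every STRef lives exactly as long as the runST call, with no way to free one early:

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum 1..n with mutable state confined to one ST region: the STRef
-- is allocated inside runST's action and the whole region is
-- discarded when it returns. The rank-2 type
-- runST :: (forall s. ST s a) -> a stops the reference escaping,
-- but offers no way to deallocate it early, or to interleave the
-- lifetimes of two regions as in the A/B/C pattern above.
sumFirstN :: Int -> Int
sumFirstN n = runST $ do
  acc <- newSTRef 0
  mapM_ (\i -> modifySTRef' acc (+ i)) [1 .. n]
  readSTRef acc

main :: IO ()
main = print (sumFirstN 10)  -- 55
```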

                                                                                              I’m not completely sure about concurrency control. You mean something like Haskell’s acid-state?

                                                                                              I mean the ability to:

                                                                                              • Disallow manipulation of a shared resource before a lock is taken. (The easy part.)
                                                                                              • Disallow manipulation of a shared resource after the lock has been released. (The hard part.)
                                                                                              1. 2

                                                                                                But, even considering nested monadic regions in their full generality, you couldn’t possibly express the following pattern:

                                                                                                You can’t? Can’t you use free coproduct style to express that kind of chain with ST-style regions? (Or indeed just an ST-like monad that uses a type-level list of the currently open regions). A/B/… would each be indexed by their own type.

                                                                                                1. 2

                                                                                                  The fundamental problem is that monadic regions are monad transformers, and monad transformers impose a stack discipline. Every step of the pattern I described, other than the first and last ones, requires the ability to deallocate the second topmost element of the stack while retaining the topmost one. You need substructural types.

                                                                                                  1. 3

                                                                                                    Monad transformers are one possible way to encode monadic regions, and have the nesting limitation you describe. But they’re not the only way to write them, and other encodings do not have that limitation (e.g. on http://okmij.org/ftp/Haskell/regions.html I believe the “heavy-weight implementation” allows your interleaved A/B/C/… scenario - the nesting restriction is explicitly noted as a limitation of the lightweight implementation only).

                                                                                                    1. 1

                                                                                                      Why should I trust that the type system used in the “heavy-weight implementation” is sound?

                                                                                                      1. 1

                                                                                                        Why do you trust that the Rust borrow checker is sound? Or any other means of enforcing this kind of constraint?

                                                                                                        The comments at the top sketch an argument that the use of OverlappingInstances is safe and explain that IncoherentInstances is only for compatibility with older versions (though I would certainly understand confirming that oneself). AIUI, UndecidableInstances does not present soundness issues, in the sense that anything that compiles successfully with that extension is sound; it’s just that compilation may not terminate, or in practical terms, that compilation of some sound programs may time out.

                                                                                                        1. 2

                                                                                                          I trust substructural types because they tackle the problem of resource management directly, rather than by means of hacky workarounds. Using rank-2 types (ST) to get local mutation is completely ridiculous.

                                                                                                          I trust the borrow checker because I’ve read the ATTAPL chapter on region-based memory management, as well as the papers by Tofte and Talpin on region-based memory management. In this regard, Rust’s improvements are primarily a matter of ergonomics (elision), but the fundamental system is the same. That covers immutable borrows.

                                                                                                          As for mutable borrows, I trust them because they’re effectively syntactic sugar for a pattern where a caller transfers ownership to a callee, with a promise that the callee will transfer ownership back to the caller when it’s done. Since I already trust both substructural types and regions, it’s not hard to trust mutable borrows as well.
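That desugaring can be sketched in Rust; the function names below are hypothetical, but the two definitions express the same protocol, once with explicit ownership transfer and return, and once with a mutable borrow:

```rust
// Explicit version: the callee takes ownership of the vector and
// promises to hand it back to the caller when it's done.
fn push_one_owned(mut v: Vec<u32>) -> Vec<u32> {
    v.push(1);
    v // ownership is transferred back to the caller
}

// Sugared version: the mutable borrow plays the same role,
// without threading the value through the return type.
fn push_one_borrowed(v: &mut Vec<u32>) {
    v.push(1);
}

fn main() {
    let v = Vec::new();
    let mut v = push_one_owned(v); // explicit transfer and return
    push_one_borrowed(&mut v);     // same protocol, nicer syntax
    assert_eq!(v, vec![1, 1]);
}
```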

                                                                                                          1. 2

                                                                                                            I trust a complex implementation written as a library more than I trust a simple implementation that’s only available as a language-level builtin, just based on past experience. Using rank-2 types doesn’t particularly bother me because I’m already used to them and would want them in my programming language for other reasons, so the implementation of other features may as well use them. I would prefer direct support for affine or linear types (I’m not familiar with substructural types), sure, but I see that as an implementation detail. It’s a hack in the sense of being an ugly workaround, but not in a sense that impairs correctness.

                                                                                                            Monadic regions are also an implementation of regions so I don’t see the academic backing for Rust’s borrow checker as being any stronger or weaker than that for monadic region libraries?

                                                                                                            1. 2

                                                                                                              Monadic regions are also an implementation of regions so I don’t see the academic backing for Rust’s borrow checker as being any stronger or weaker than that for monadic region libraries?

                                                                                                              Monadic regions only account for immutable borrows. Most interesting types that can be nicely implemented in Rust but not in Haskell have operations that require mutable borrows.

                                                                                            2. 2

                                                                                              As far as I can tell, the only effects that can be adequately managed using monads alone are those that can be “desugared” back to purely functional programs

                                                                                              They have precisely the same power as delimited continuations in a call-by-value language

                                                                                              1. 2

                                                                                                Right. And, if I recall correctly, delimited continuations are just particular higher-order functions, which can be no more effectful than the host language is.

                                                                                1. 4

                                                                                  I have long been aware of C# as a strictly-better Java (imo), but I only recently learned of F# which appears to be an OCaml running on .NET if I understand properly.

                                                                                  I must say, I keep seeing very cool things for F# that really interest me.

                                                                                  Actually, I think this may just be OCaml that interests me as I have very little experience with OCaml but I keep seeing very cool projects crop up in various dialects of it.

                                                                                  Can someone with more experience with OCaml chime in and offer some opinions/articles on a few things for me?

                                                                                  • What is your preferred OCaml dialect and why?
                                                                                  • How does F# stack up against other OCaml dialects?
                                                                                  • What is a good place to start to get a feel for how OCaml compares to Haskell or Agda?
                                                                                  1. 5

                                                                                    I’ve used both OCaml and F# in anger, in production, and I’m honestly mediocre at both, but I can at least partially answer your questions.

                                                                                    F# is an ML, but not really an OCaml variant in any meaningful sense. Yes, it’s true that early versions of F# aimed to be syntax-compatible with OCaml, but that was always a bit of a stretch. Even if you ignore the syntactic differences, F# and OCaml have meaningfully different semantics. F# has .NET-style classes; OCaml has its own object system (which AFAICT no one uses) and its own take on the whole concept in the form of a really neat module system. F# has workflows; OCaml has nothing quite equivalent. F# has units of measure; OCaml lacks an equivalent. OCaml has abstract data types, F# does not, but F# has operator overloading and reified generics, which OCaml does not. And so on. In nearly all respects, OCaml has vastly more in common with SML/NJ than it does with F#.

                                                                                    The runtimes are also very different. By virtue of running on the CLR, F# has real, full-blown multithreading with a rich async system. In contrast, OCaml has a stop-the-world GC that can feel quite a bit like the Python GIL in practice. On the other hand, OCaml compiles directly to native binaries and has a simple, very easy-to-understand runtime that I feel confident I could explain to you in about ten minutes, whereas .NET in most situations is still really a VM-like affair and requires that you understand a vastly more complicated runtime situation. (Yes, .NET Core changes at least the native compilation aspect, but that cuts out a lot of libraries most .NET devs want to use today.) And again, because F# runs on the CLR, you’ve got access to the .NET ecosystem, whereas OCaml is limited to a mixture of the OCaml ecosystem (which is significantly smaller) and calling out to C libraries (which has gotten very pleasant, in my experience, but is still more overhead than just consuming C# libs from F#).

                                                                                    I personally found my experience with F# more pleasant than OCaml due to a mixture of superficial issues (e.g. lighter syntax, better IDE support) and what I’d call “real” issues (the multithreading support and library situation), but, again, I don’t consider myself an expert in either language based on my limited experience. And I cannot possibly compare this to Haskell or Agda. But I hope this helps answer your question at least a bit.

                                                                                    1. 2

                                                                                      Two things:

                                                                                      F# does have abstract data types, if you have a type 'a t = T of 'a and hide the constructor in the corresponding .fsi file, outside modules can’t see it.

                                                                                      OCaml has an incremental, not stop-the-world, GC.

                                                                                    2. 3

                                                                                      Reason might be of interest to you, as well.

                                                                                      1. 2

                                                                                        I’m a big fan of the ML family, though I’ll stick with Scala (which is a great option if you’re already on the JVM) until other options get HKT. My concern with F# would be whether there’s the level of library/tool ecosystem that you’d get in OCaml (or Scala). .NET doesn’t have the same open-source tradition or established package-management infrastructure and corresponding ecosystem yet, AIUI. (There’s NuGet or some such, up to a point.)

                                                                                        The only advice I’m ever able to give people asking where to start with a language is: write some code. Solve some problems.

                                                                                      1. 4

                                                                                        I should emphasise that it’s pragmatic: you’re not forced to deal with the added mental load of monads and other type system rabbit holes

                                                                                        Except that purity reduces the number of “type system rabbit holes”. OCaml gives up principal types (the “value restriction”) in order to remain sound in the presence of effects.

                                                                                        1. 6

                                                                                          ‘Rabbit holes’ isn’t a soundness judgment, it’s a description of the steep learning curve.

                                                                                        1. 3

                                                                                          With all due respect, I get the sense that the author has very little understanding of how compilers for functional languages work, or of the execution of programs in general.

                                                                                          How does BuckleScript compile closures? Does running a closure result in a sequence of pointer dereferences? The generated code is going to be JIT-ed at runtime by the browser anyway.

                                                                                          At points in the article, it feels like the author is advocating more for the speed of compilation rather than the speed of the generated code.

                                                                                          If I’m wrong about any of the above points, I’ll happily be corrected. Also, in general I’m pleased to see the adoption of functional languages; I just felt that the argument in this article was poorly formed. Sorry to be so negative :)

                                                                                          1. 9

                                                                                            Hi, I am the author of the blog post. I like to think I know a little bit about functional compilers, languages and runtimes from my day job and other studies; of course I could be totally wrong. But I’ll try to answer any specific questions you may have.

                                                                                            BuckleScript compiles OCaml closures simply to ES5 closures. So the following:

                                                                                            let add x y = let add_x y = x + y in add_x y
                                                                                            

                                                                                            is compiled to:

                                                                                            function add(x, y) {
                                                                                              var add_x = function(y) { return x + y; }
                                                                                              return add_x(y);
                                                                                            }
                                                                                            

                                                                                            The generated closure, as you can see, is about as performant as a hand-written closure would have been.

                                                                                            I definitely advocate for speed of compilation quite a lot because it’s such a transformative experience when the compiler doesn’t waste your time; but I think I also mention that the OCaml compiler and the community are obsessed with performance in general. The thing is that the language has a dead-simple runtime model; it doesn’t do fancy just-in-time compilation or other tricks. As a result, the JavaScript output looks and performs very much like idiomatic hand-written JS.

                                                                                            1. 5

                                                                                              At points in the article, it feels like the author is advocating more for the speed of compilation rather than the speed of the generated code.

                                                                                              You are not wrong about this. If you go to BuckleScript homepage, the very first thing it wants to tell you is compilation speed. Personally I think optimizing for compilation speed makes sense in this case.