1. 13

    You might want to read up on control characters before deciding what control characters you might want to redefine. I know that DC1 and DC3 are still in active use on Unix systems (^S will stop output in a terminal window, ^Q will start output going).

    As far as the “reveal codes” feature of WordPerfect goes, an HTML editor could do the same; HTML tags are a type of “code” after all. In fact, in the 90s we had just such a program for HTML: Dreamweaver. It worked pretty much like WordPerfect except it used HTML instead of control characters.

    1. 2

      That is a gem. Thank you for finding it for me! I’m going to look at it a bit and see if there’s some selection of control characters that could be reused without drama.

      1.  

        I read up on the control characters yesterday, paying particular attention to RFC 20, because the codes seem to have remained mostly the same since then and that document is the easiest to comprehend.

        The transmission and device control characters seem to be the safest to use in text streams. They are used as control signals in terminal software, but otherwise they don’t seem to cause anything in file output. Therefore I can probably use the following block of characters:

        [01-06]     SOH STX ETX EOT ENQ ACK
        [10-16] DLE DC1 DC2 DC3 DC4 NAK SYN
        

        Emacs, nano, vim and gedit all seem to display these characters happily and still encode the file as UTF-8. I also opened a Linux console and ran them through ‘cat’. There were no visible effects, so I guess it’s safe to reuse these codes.

        Most transmissions seem to assume that the stream is binary and that anything goes over, so I doubt this would have too many negative effects. Except that of course some other software could not reuse them anymore. Maybe it doesn’t hurt to provide some small file header that’s easy to type with a text editor.

        I don’t need this many, so I will probably leave DC1-DC4 without interpretation and only reuse the transmission control characters in this range.
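
        A quick way to double-check the UTF-8 observation is a short Python sketch. The code points are the hex values from the table above; the helper names (`survives_utf8`, `wrap`) are made up for illustration, and dropping DC1-DC4 follows the plan of only reusing the transmission control characters.

```python
# Hex code points from the table above, with their standard ASCII names.
CANDIDATES = {
    0x01: "SOH", 0x02: "STX", 0x03: "ETX", 0x04: "EOT", 0x05: "ENQ", 0x06: "ACK",
    0x10: "DLE", 0x11: "DC1", 0x12: "DC2", 0x13: "DC3", 0x14: "DC4",
    0x15: "NAK", 0x16: "SYN",
}

# Leave the device controls (DC1-DC4) alone; terminals still use DC1/DC3.
REUSABLE = {code: name for code, name in CANDIDATES.items()
            if not name.startswith("DC")}


def survives_utf8(code: int) -> bool:
    """True if the code point encodes to one byte and decodes back unchanged."""
    encoded = chr(code).encode("utf-8")
    return len(encoded) == 1 and encoded.decode("utf-8") == chr(code)


def wrap(text: str, code: int) -> bytes:
    """Surround text with the given control character and encode as UTF-8."""
    return (chr(code) + text + chr(code)).encode("utf-8")
```

        Every code point below 0x80 encodes to a single byte in UTF-8, which is consistent with the editors keeping the file as UTF-8.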

      1. 17

        I’m missing some context: what are the shortcomings of Markdown and/or LaTeX that this proposal will fix? Thanks.

        1. 4

          Same here, it was not really clear to me what problem the new format is trying to solve.

          It looks like a literate programming language specialized to output HTML. Does it mean that a new interpreter / compiler / stdlib has to be written?

          It looks like a prerequisite to understand this text is to read up on all the bullet points presented in the “Summary of inspiration” section.

          “The control character codes are reused because they are hardly used with web and HTML”

          this was my main clue and it’s quite late in the document

          1. 2

            It’s better to turn it upside down and think that we can improve upon Markdown. We definitely have to support and work with it for quite a while, just like other common formats that we have now.

            I’ve used Markdown every week, so I am well aware of how it works for and against me when writing text.

            Markdown assigns meanings to character sequences: '#', '=', '-', '**', '_', '~~', '*', '1.', '+', '[', ']', '(', ')', ']:', '![', '>', ...

            The implementations parse these by block processing over the text with regular expressions. You can escape those sequences when you use them in the text, but it means that you have to stay on your toes when introducing these elements. Usually you catch it when previewing your message. There are small differences in how different Markdown implementations parse text, which cause inconvenience when switching among them.

            Markdown’s grammar of things it understands is fairly small. You get headers, links, a few ways to mark text, images, quotes, lists, tables. You fall back to HTML when it doesn’t understand things, but that results in a weird, HTML-specific document. Extensions to Markdown are implementation-specific and do not translate between implementations. This can make texts monotonous, as you tend to stick to what Markdown offers.

            Formatting-characters trip ordinary users and frustrate them. This happens no matter how many there are of them. The proposal removes the need for this, as well as for the use of escape-characters. The idea would be to produce a form of rich-text editing that augments, but doesn’t obscure the existing structure of a text document.
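
            To illustrate the point about escaping, here is a minimal Python sketch. It assumes, purely for illustration, that STX opens a marked span and ETX closes it; the actual assignment of codes in the proposal may differ, and `split_spans` is a hypothetical helper that ignores nesting.

```python
STX, ETX = "\x02", "\x03"  # assumed delimiters, not the proposal's real assignment


def split_spans(text):
    """Split text into (is_marked, content) pieces; nesting is not handled."""
    pieces = []
    buffer = []
    marked = False
    for ch in text:
        if ch == STX and not marked:
            # Flush the plain text collected so far and start a marked span.
            if buffer:
                pieces.append((False, "".join(buffer)))
            buffer = []
            marked = True
        elif ch == ETX and marked:
            # Close the marked span.
            pieces.append((True, "".join(buffer)))
            buffer = []
            marked = False
        else:
            buffer.append(ch)
    if buffer:
        pieces.append((marked, "".join(buffer)))
    return pieces
```

            Because the delimiters are not printable characters, text such as `2 * 3 * 4` or `[ok]` passes through untouched and needs no escape sequences.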

            I’ve left open the choice of a meta-language, thinking that you’d use this to interleave existing programming languages and plain text. On the web you’d restrict this to some nice, constrained, standardized declarative language that has the amount of annotations you prefer to allow.

            1. 7

              To me the attraction of Markdown (and LaTeX) is that it’s really just plain text. What I understand is that documents in your proposal are binary documents, readable in the same way Microsoft Word documents are readable: islands of text interspersed with encoded formatting directives.

              1. 1

                I cleaned up the demo that I did and pasted the files into a GitHub gist. I used it in gvim, but also tried it in vim. In plain vim it probably doesn’t paste cleanly without some help from the terminal.

                https://gist.github.com/cheery/2a34769a2398a345ad77235e8d1c3693

                I guess Microsoft Word documents are harder to read as plain text without a dedicated editor.

                1. 2

                  Some thoughts that come to mind

                  1. Is the formatting going to be restricted to one or a few characters, or could we have strings of characters representing particular formats, like highlighting, colors, styles etc.?
                  2. Will there be complexity like macros, with variables (i.e. a DSL behind it)?

                  Depending on how this is set up, I fear you will end up reinventing one of the document formats from the 1990s and 2000s and on (e.g. Microsoft doc format). You’ll need a particular piece of software to view the document as intended. Without this software, depending on the complexity of the intent of the formatting code, the text part of the document could become unreadable. Pretty soon there will be viruses masquerading as such documents and so on and so forth.

                  I guess I haven’t understood enough of this scheme to be convinced that we haven’t visited this before and found it as unsatisfactory as anything else, if not more.

                  1. 2

                    I guess that if you stretch it, you could have something like [i color=red:This text is red], but it’s probably something I wouldn’t like to add. I would prefer that the occurrences of structures may be indexed and then referenced in the program text portion to attribute into them.

                    For example, if you used a Prolog-like language in the program portion, it might look something like this in the text editor:

                    Hello [span.marked:Word]
                    $ color(span.marked, "red"). ¤
                    

                    It’d probably take some time for people to hack the support for the format into their languages.

                    There’s real concern for what you’re saying though. I can imagine how this could turn into something that can be no longer properly read with a text editor. I’m convinced that an eager enough person, not ready to sit down and think, would easily turn this into yet another wysiwyg. We’ve got numerous records of similar disasters. :)

                    I did look at the .doc format from the 1990s. It’s been divided into 128-byte sized blocks with the text dropped somewhere in the middle. It looked like you’d be perhaps able to leak information with it. But also you couldn’t pop it open in a text editor and expect to be able to edit it without corrupting the file.

              2. 5

                Markdown was never meant to be something like {SG/HT/X}ML, it was meant to be a lightweight markup language for stuff like comments and blog posts. The fact that you can fall back to HTML is great in my opinion, it means I can integrate those features into my writing quite seamlessly (mostly anchor links and tables).

            1. 7

              In my view, this can be interesting if looked at as a new binary format for documents. I don’t think it is feasible as a text markup language. I believe the latter category has, by definition (?), an important characteristic of being built with “in band” characters from the set of those available in common text editors. And somewhat counter-intuitively, the full ASCII set is actually not easily available in common text editors, I think.

              But I may well be wrong. It’s always easiest to criticise. I think it mostly depends on what your goal is with this idea. It wasn’t clear to me from skimming the report. And I think it may be easier or harder to achieve the goal depending on what it is. But for sure interesting as an experiment/exploration, i.e. if the goal is still vague :)

              1. 4

                Mostly agree with you, but I think it’s less about being available in common editors (typically editors support UTF-8, and are limited only by what fonts you have installed), and more about being available on common keyboards. If your would-be users can’t see where your control characters are printed on their keyboards, it’ll be hard to get adoption, since they won’t know how to type out their documents. The available characters generally are:

                1. A-Z
                2. 0-9
                3. ~`!@#$%^&*()-_=+[{]}|;:'",<.>/?

                Even international keyboards are often double-printed with the local character set and arrangement, alongside a standard QWERTY character set that you can swap to via your OS.

                1. 2

                  The Vim editor has a :digraph table that allows you to enter control characters into the text document.

                  I think an ordinary user would most likely use a rich-text editor with this format. I have a proposal on how to make that work, but I guess it’d be easiest to just cobble the thing together in JavaScript and then bring it here. I guess I’ll go check if someone has prototyped a piece-table editor in JS and start from there.

              1. -2

                Untyped programs do exist. The author is just looking in the wrong direction. He’s looking down at the hardware when he should look upwards toward abstract programming languages.

                Here’s an example of an “untyped” program, it’s the ‘id’ function from Haskell: (λx.x) : (∀a. a -> a)

                It works for every variable you pass in and the variable ranges over an infinitely large domain. The type in the program says that for every variable, you can get a variable and return the same type of a variable.

                If you let go of the idea that a program must represent a valid proof for some logical formula, then you can describe a large variety of programs. The validity of the program is determined when the domain is selected. That’s how you get runtime errors and dynamic typing.

                The original post is parroting of Robert Harper’s dumbest ideas. It’s equivalent to taking my “Dynamic typing is good” -post from the year 2014 and using it as the basis to refute the use of type systems.

                It appears that the motive for writing the post came from a short discussion with somebody else. He didn’t reach agreement on the subject, so he decided to write a very assertive blog post to drive his point home.

                1. 7

                  The Haskell id function is typed (it is polymorphic, but it has a universal type).

                  1. 4

                    I’m afraid I don’t understand what definition of “typed” you are using here, could you clarify a little?

                    I’m not sure what connection the proofs-as-programs idea has here, this merely states that some interesting type systems happen to mirror some interesting proof theories. You can trivially go from a type system to a logic, though the result is rarely worth studying. Going in the opposite direction is usually more interesting but seems irrelevant.

                    1. 2

                      Type systems not only mirror proof theories; they are isomorphic. And the isomorphism is very useful because theorems about programs can be stated in the same language as the programs themselves, as seen in languages such as Coq, Idris, Agda, Epigram… It gives verification powers to languages, and also gives implementation laboratories for logics.

                      1. 1

                        I avoided the use of the word “isomorphic” because it’s not clear in what category such an isomorphism should live. But yes, it is a very important symmetry!

                    2. 4
                      GHCi, version 8.0.1
                      λ :t id
                      id :: a -> a
                      

                      Looks like a type to me… but that’s just Haskell’s type system. Are you trying to make a more general claim?

                      1. 3

                        The more general claim is that it’s idiotic to continue the static vs. dynamic typing debate because it is a false dilemma. And it’s harmful to propagate it.

                        If you take the role of a dynamic typing zealot, it means you’ve discarded the study of logic and proof theory that could enhance your usual ways of reasoning about programming. You don’t need to stop using dynamic languages to reap the benefits, but once you’ve figured it out you want to use something like Prolog, Coq or Haskell more often.

                        If you go and be a static typing zealot, then you commit a fallacy of sorts. It involves putting some particular set of deduction rules on a pedestal and discarding all methods of reasoning outside of whatever framework you’ve decided to use. Just like the author asserts that everything, including assembly language and C, has to have a well-defined type, even if the type is useless because it fails to convey what the program is trying to solve. There are tons of logical systems and frameworks and nobody has decided that we should stick to intuitionistic logic.

                        Effectively both stances discard the study of logic in one form or another. It literally harms scientific progress and the motion from theory into practice.

                    1. 2

                      Wow. I’m happily surprised about Hickey recognizing that his optional record-attributes correspond to Maybe. Then he goes on and gets the idea through that naming of attributes should be also subject to module scoping, and that we have generic need to work with partial information.

                      These ideas would be quite compatible with Prolog’s partial data structures and Haskell’s profunctor optics.

                      I wouldn’t want it with recursive types though, but otherwise it’s cool. Also handling this kind of set-objects would likely require MLsub’s subtyping inference.

                      1. 5

                        I’m really glad she acknowledged that many people had come up with this stuff and that her efforts saved the great work from irresponsible scum going after bonus money while screwing everyone else.

                        1. 4

                          Such an anticlimactic post.

                          It’s naïve to think that the language is responsible for the quality of code and that by adding some bells and whistles (or removing some bells and whistles), we can automatically make everything better. We were not happy with Fortran and COBOL, so we invented C++ and Java only to be unhappy with them too in some 20–30 years.

                          Take the structured programming debate, when they abolished goto. It was a pretty good improvement to the languages overall. It was only spoiled by the fact that “practical” people didn’t understand shit. The point was to make the code easier to understand by giving it regular structure. The complex structures could have been conveyed by tail recursion.

                          Well people found optimal tail recursion hard to implement, so instead they introduced ways to violate the rules that made structured programming work in the first place, the idea that you cannot haphazardly jump around in the code was weakened. They introduced break/continue into loops, and of course, the return statement. Well of course the structured programming was not as useful afterwards.

                          Likewise, object-oriented programming was about processes and messages, very important ideas for keeping concurrent programming under control. People diluted it into a dogma that dominated the industry for a few years until a few figured out they had been chasing red herrings there.

                          Of course if you fuck up with the theory part of your work, you’re certified to come up with tools that aren’t better than the things they replace. Jonathan Blow didn’t get this, Go completely missed it, Rust strike force almost figured it out. Whole C++ committee is out on the field and didn’t get the memo. Surprisingly it’s hard to grok that the theory is important.

                          So if I were to invent a programming language for the 21st century, I would reinvent being responsible instead. I would reinvent learning your tools; I would reinvent being attentive to essential details and being merciless to accidental complexity.

                          If you’re going to try this the same way that you try to reinvent programming languages, the outcome won’t be different. For an example of this you can just look at Bret Victor’s stuff or the EVE text editor. Also, you’re going to need a better language to come up with better tools. The language is your primary tool for being attentive to detail and explaining complexity.

                          1. 5

                            Your material should teach how Prolog works, but I think it’s not enough to appreciate Prolog. After all, it’s just a programming language with backtracking, partial data structures, sequential evaluation and no need for control flow structures.

                            To understand why that combination of things makes sense and is useful, you have to understand the Curry-Howard correspondence to the point that you can apply it in programming. For people who have been working with computers for a long time, with advanced languages, it really doesn’t click easily why Prolog is the way it is.

                            1. 3

                              Doesn’t look like bad advice, but doesn’t look like good advice either.

                              First of all, there isn’t an absolute set of “4 different functions” your documentation should fill. You’d better decide the structure by what is appropriate for the software, not follow a guide that tells you what you should do.

                              Also, I don’t think the post covers what I think is important in good documentation. What I consider important is that the internal documentation and the code are either tightly linked together, or that the documentation flows upwards from the code. E.g. you document internals, then you document the features they provide, and finally you provide examples and everything else. This guide seems to start with the things I’d leave for last.

                              1. 8

                                I think the post addresses that by distinguishing between documentation for the project and documentation about the project. Some of the worst documentation I’ve dealt with tried to repurpose the former as the latter.

                              1. 1

                                Just teach kids how to prove things, then teach how their proofs can be interpreted as programs. You can combine this with the study of mathematics. For example, the “completing the square” algorithm is encoded in the quadratic formula, and studying that formula is studying programs.
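
                                As a rough illustration (a Python sketch with made-up function names, real roots only), the steps of completing the square can be written out as a program and checked against the quadratic formula:

```python
import math


def solve_by_completing_square(a, b, c):
    """Real roots of a*x**2 + b*x + c = 0, following the proof step by step."""
    if a == 0:
        raise ValueError("not a quadratic equation")
    # Divide by a:            x^2 + 2*half_p*x = -c/a, with half_p = b/(2a)
    half_p = b / (2 * a)
    # Add half_p^2 to both sides: (x + half_p)^2 = half_p^2 - c/a
    rhs = half_p ** 2 - c / a
    if rhs < 0:
        return ()  # no real roots
    root = math.sqrt(rhs)
    # Take square roots:       x = -half_p -/+ sqrt(rhs)
    return (-half_p - root, -half_p + root)


def quadratic_formula(a, b, c):
    """The same computation in its usual closed form."""
    if a == 0:
        raise ValueError("not a quadratic equation")
    disc = b * b - 4 * a * c
    if disc < 0:
        return ()
    root = math.sqrt(disc)
    return ((-b - root) / (2 * a), (-b + root) / (2 * a))
```

                                For positive `a` the two functions return the roots in the same order, so they can be compared directly.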

                                1. 3

                                  Completing the square is boring. This example is fun!

                                1. 4

                                  There would be more things for me to pick on in this post, but I have only a few minutes so I have to choose one.

                                  The problem with OOP and SOLID principles the author promotes, is that they’re superstition. For SOLID, there are several unrelated conclusions and premises aren’t exactly clear or agreed upon. People who use these rules have to take them for belief, because there is no precise explanation why they work, or what the basis is for them to work in the first place.

                                  The relational model, on the other hand, has some basis in logic. It’s also in some sense a minimal setting.

                                  I think OOP feels like you’d create a new abstract algebra for every possible data structure you work with. It’s bulky, slow and nonsensical. Meanwhile you also conflate the ideas with actors and processes, while all the time trying to pretend you gain something in the process.

                                  1. 6

                                    “The problem with OOP and SOLID principles the author promotes, is that they’re superstition. For SOLID, there are several unrelated conclusions and premises aren’t exactly clear or agreed upon. People who use these rules have to take them for belief, because there is no precise explanation why they work, or what the basis is for them to work in the first place.”

                                    If you trace the history of SOLID it was pretty much presented whole cloth by Bob Martin, and while he justified it as a synthesis of prior work SOLID is ultimately one man’s opinions on OOP that somehow became gospel.

                                    1. 4

                                      I mean there are some things in ‘SOLID’ that make decent sense. The ‘single responsibility principle’ is basically ‘do one thing and do it well’, the Liskov substitution principle is really ‘what it means to be type-safe in a language with subtyping’ and I can’t remember the names of the others but don’t they basically come down to ‘program to well-defined interfaces’ and ‘separate policy and mechanism’?

                                      The problem is that they’re interpreted far too narrowly, in the context of ‘solving problems using OOP’, and they’re treated as some kind of gospel. Sometimes it’s okay to do two things. Sometimes it’s okay to put some policy in amongst mechanism. etc.

                                      1. 2

                                        In other words, it’s a synthesis of prior work, but the selection of principles is kind of arbitrary. Why should subtyping be in a list of abstract design goals that are supposed to be applicable to any programming language? Why isn’t “tell don’t ask” in there? There’s nothing about testing, either, or about the proper use of any of the Gang of Four patterns, except, for some reason, Dependency Injection.

                                    1. 5

                                      Category-theoretic thinking of products/sums is a good logical model, but I think it’s awful if your physical memory layout is the same thing as your logical model.

                                      For example, let’s take a list [1,2,3]. In a product/sum design your representation for this structure is: 1:(2:(3:nil)).

                                      Imagine it costs you 1 byte to store a number, and 2 bytes to store a structure. If you take the literal in-memory interpretation for this structure, it is formed from pairs of references (total cost=10): 01 *> 02 *> 03 *> 00

                                      If you’re dealing with packed representations and you terminate with an empty structure, you end up with: 01 02 03 00. But if you didn’t treat physical memory layout as logical memory layout, you could also give it a different representation where the sequence is annotated with its length first: 03 01 02 03.
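
                                      The two packed layouts can be sketched in Python as follows. This is only an illustration under simplifying assumptions: every element is a number in 1..255 stored in one byte, 00 stands in for the empty structure, and the function names are made up.

```python
def encode_terminated(values):
    """Packed layout ended by an empty-structure marker, e.g. 01 02 03 00."""
    if any(not 1 <= v <= 255 for v in values):
        raise ValueError("values must fit in one byte; 0 is the terminator")
    return bytes(values) + b"\x00"


def decode_terminated(data):
    """Read elements until the 00 marker is reached."""
    items = []
    for byte in data:
        if byte == 0:
            return items
        items.append(byte)
    raise ValueError("missing terminator")


def encode_length_prefixed(values):
    """Layout annotated with the element count first, e.g. 03 01 02 03."""
    if len(values) > 255 or any(not 0 <= v <= 255 for v in values):
        raise ValueError("count and values must each fit in one byte")
    return bytes([len(values)]) + bytes(values)


def decode_length_prefixed(data):
    """Read the count, then exactly that many elements."""
    if not data:
        raise ValueError("missing length byte")
    count = data[0]
    items = list(data[1:1 + count])
    if len(items) != count:
        raise ValueError("truncated data")
    return items
```

                                      Both layouts decode to the same logical list, which is the point: the logical model stays 1:(2:(3:nil)) while the bytes on disk are free to differ.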

                                      I think protobuffer sucks because the schemas are directly compiled into the user language. They would have a much better system if they first converted protobuffer schemas into protobuffer files, then have an interpreter for such files in each dynamic language, and compilers from the schema for the compiled languages.

                                      I also think that the post illustrates the common error that people tend to make, that is, not recognizing that implementation details come and go. You really should not let your language be influenced by them, and if you force implementation detail with your language then you open the barn door for that.

                                      1. 1

                                        “I think protobuffer sucks because the schemas are directly compiled into the user language. They would have a much better system if they first converted protobuffer schemas into protobuffer files, then have an interpreter for such files in each dynamic language, and compilers from the schema for the compiled languages.”

                                        Just from a pragmatism perspective, that sounds like significantly more work for every language that wants to have a protobuf library. As it stands, having a straightforward translation from the object in memory to the wire format greatly assists implementation across all of the potential languages that need implementing. I think this is the key reason Lua, for example, has seen such broad adoption as a scripting language. It’s easy to embed because it has a very natural layout for interoperability (all calls just push and pop stuff on a stack). It’s very easy to write a Lua FFI.

                                        1. 1

                                          It’d be a bit more work in each dynamically typed language that you need to support. You’d need a wire format decoder and a script that decodes the schema file and uses it to translate between wire format objects and their legible counterparts in the client language. But that’d be nice to use when you need to read from or write into a protobuffer file, because you could just do the equivalent of pip install protobuf for your scripting language and then start rolling:

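                                          # Hypothetical API, sketching what I'd like to have; not an existing library.
                                          schema = protobuf.load("api_schema.pb")    # load the compiled schema file
                                          dataset_0 = schema.open("dataset_0.pb")   # decode a data file against it
                                          print(dataset_0[0].user_data)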
                                          

                                          It’s quite involved to get the proto3 compiler to work. In complexity it’s almost like compiling a C project. It produces plain code that takes up its own directory in your project.

                                          1. 4

                                            “I think protobuffer sucks because the schemas are directly compiled into the user language.”

                                            IMO, this is an example of a tooling problem being perceived as a problem with protobuf because the prevailing implementations do it that way. If you want an interpreter-style proto library for C, check out nanopb. protoc will produce data and interfaces (struct definitions) instead of a full C implementation.

                                      1. 4

                                        Every once in a while someone is trying these scum tactics on their customers. Looking at the track record of Apple this is just natural progression for them.

                                        And I think the Apple users deserve what they get, if they keep buying Apple stuff despite all the effort Apple takes to abuse them.

                                        1. 5

                                          It’s a widespread problem across entire industries. In my experience most large manufacturers are pretty much equally hateful of end-users. Look at the whole tractor debacle, where John Deere and others are making tractors un-repairable by anyone other than John Deere, so they can get you coming and going.

                                        1. 20

                                          On the one hand, the article has moments that make it seem dubiously sourced.

                                          On the other hand, I’m reminded of a conference I went to years ago where some genius from the NSA was haranguing me about the security dangers of non-US born programmers working in US firms or on open source projects. I asked him why he was not more worried about Chinese built motherboards and he refused to believe me that the USA depended on imports of Chinese motherboards.

                                          1. 21

                                            “The dangers of non-US born programmers working on open source projects”?

                                            The spook mentality is something to behold.

                                            1. 3

                                              It is well known that Linus is a Russian spy and goes by the name Linyos Torovoltos. :)

                                              The link above was submitted recently here.

                                              1. 2

                                                I recall, from Linus’s autobiography, that he claimed that his parents were fans of the soviet union & until their divorce he was raised as a red-diaper baby. Obviously, that hasn’t made him into a stalinist as an adult, but if somebody wanted to spin it that way more seriously I’m sure they could. (I recall back in the naughts some microsoft fanboys trying to make those claims & paint the whole open source movement with that brush, but I don’t think they were very successful.)

                                              2. 1

                                                It was kind of jarring - I was not born in the USA either!

                                                1. 1

                                                  And you come over here with that foreign thinking devising things like RTLinux that jeopardize the profits of domestic, closed-source, RTOS vendors. That’s exactly the kind of thing our non-corrupt, capitalist government was worried about! ;)

                                              3. 3

                                                I don’t think I trust this article at face value at all. Bloomberg could be telling a fake story at the demand of someone who wants to further his agenda against Chinese hardware. They might also be taking part in a stock manipulation scheme; it was very effective if that’s the case.

                                                Going to wait and see what’s happening before I conclude anything from here.

                                                1. 6

                                                  I really really really doubt it. Bloomberg in particular is financial news, its reporters are constantly seeing how (to quote Matt Levine) Everything Is Shareholder Fraud. Publishing something like this with willful negligence would open them up to soooooo many lawsuits.

                                                  Not to mention that Bloomberg is beholden to basically no one, as an organization. They make huge amounts of money selling their stuff. While Businessweek is being pushed to be more self-sufficient, there’s still a lot of value in them being trustworthy.

                                                  Also making up a story, publishing it in a major outlet, and profiting off of a stock trade afterwards. Oh my god that is a “go directly to jail do not pass go do not collect $200” move, especially if you’re just a journalist and not a multi-billionaire. And these people know it, because they’re the ones reporting on other people doing this kind of thing!

                                                  I’m not saying the story is most definitely right, but it’s a serious outfit.

                                                  1. 4

                                                    Yeah, the idea that either Businessweek itself or the author is fabricating this story is hard to swallow – if they did, then somebody’s making incredibly poor decisions.

                                                    On the other hand, I could absolutely buy the idea that they’ve been fed fabricated evidence. This story exists at the intersection of international relations, espionage, and big business. I can imagine some Angleton-esque character whose paranoia only became pathological after they got in a position of power who suddenly decided that supply chain meddling by Chinese intelligence is inevitable & decided to try to trigger an outright ban by orchestrating a high-profile story. (After all, once upon a time our president campaigned on heavily limiting Chinese imports, so it’s possible that somebody in intelligence capable of faking convincing-looking Apple & Amazon documentation thought it’d be an easy sell.)

                                                      1. 1

                                                        Author doesn’t seem to mention particular Bloomberg stories that those authors wrote that turned out to be false.

                                                        It’s not outside the realm of possibility but I find it hard to believe that there’s a pattern of a couple authors making stuff up in multiple stories for that outfit – or even reporting stories that end up being wrong due to misleading sources, unless they’ve got damned good excuses. I’ll believe it when I see the stories he’s talking about.

                                                        (BadBIOS is getting mentioned in that thread, but BadBIOS was broken by Ars Technica, right? Anyhow, the whole BadBIOS story was – accurately – reported as “this one researcher thinks this is happening, and other researchers think it’s possible but probably bullshit” in all the coverage I saw. While it was questionably newsworthy, that coverage wasn’t wrong or misleading, unless you only read the headlines – which are almost always wrong & misleading, even in good articles.)

                                                1. 0

                                                  When you’re doing a termination proof, you disregard the fact that the computation might not terminate due to an external reason. Similarly you can prove that a language is Turing-complete. The model is supposed to disregard external factors such as the limited amount of memory.

                                                  1. 2

                                                    The limited amount of memory allowed by the C specification is not external to the question of whether C is Turing-complete. The problem is not ‘there’s limited memory in practice’ but instead ‘C forbids an implementation from having unbounded memory’.

                                                    1. 1

                                                      But that doesn’t necessarily change things, because with an infinite amount of memory you could run simultaneous instances of the C program. The first one runs with pointer size 1, the second with 2, the third with 3, and so on up to infinity. All machines get the same input, and machines are discarded until the remaining ones all produce the same result without an overflow error in calculating the addresses.

                                                      1. 1

                                                        Hmm, that is an interesting point indeed. Is it allowable to say ‘if we would run out of memory, restart with twice as much memory and sizeof(void*)++’?

                                                        1. 1

                                                          If you pick up the Peano numbers, 0, s0, ss0, sss0… There’s infinity included in there because you don’t have a restriction to how many times you can increment a number other than running out of the paper or patience.

                                                          Since Peano numbers do not have an end, what exactly is the definition of ‘infinity’ in these terms? There may be several, but one plausible approach might be to treat infinity as the set of all Peano numbers that you can construct, giving you a set that is infinitely large.

                                                          So sizeof(void*) = infinite makes sense if you think of it as sizeof(void*) ∈ N, where N stands for the set of natural numbers.

                                                          I think there’s a subtle difference in running infinite amount of things in parallel versus discarding the work and restarting the whole thing with a next option. If you run them one-by-one, then you don’t know from the results whether the program produces the same output with larger pointer sizes.

                                                          1. 1

                                                            “If you pick up the Peano numbers, 0, s0, ss0, sss0… There’s infinity included in there because you don’t have a restriction to how many times you can increment a number other than running out of the paper or patience.”

                                                            No, there isn’t. There are infinitely many natural numbers, but none of them are infinite. Every natural number is finite.

                                                            That’s not why sizeof(void*) cannot be infinite though. It cannot be infinite because the C standard explicitly requires pointers to have a finite constant size.

                                                            “I think there’s a subtle difference in running infinite amount of things in parallel versus discarding the work and restarting the whole thing with a next option. If you run them one-by-one, then you don’t know from the results whether the program produces the same output with larger pointer sizes.”

                                                            If program behaviour changes as a result of the integer representations of pointers changing then that program is (to my understanding) relying on undefined behaviour anyway. So it shouldn’t be a problem. Any program uses only a certain fixed amount of memory at any one time. Any program that halts on a given input uses only finitely much memory on that input. I think the scheme might work for C. It’s worth thinking about further.
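
                                                            A minimal sketch of that restart scheme, assuming a toy memory model rather than real C. BoundedMemory, OutOfAddresses, count_up and run_with_growing_pointers are names made up for illustration; the "pointer width" just caps how many cells one run may address, and an overflowing run is discarded and restarted with one more bit:

```python
class OutOfAddresses(Exception):
    """Raised when a run needs more cells than its pointer width can address."""


class BoundedMemory:
    """Toy memory whose size is fixed by a (hypothetical) pointer width in bits."""

    def __init__(self, pointer_bits):
        self.limit = 2 ** pointer_bits  # number of addressable cells
        self.cells = []

    def push(self, value):
        # Refuse to grow past what the current pointer width can address.
        if len(self.cells) >= self.limit:
            raise OutOfAddresses(self.limit)
        self.cells.append(value)


def count_up(memory, n):
    """Illustrative program: stores 0..n-1 in memory, then returns their sum."""
    for i in range(n):
        memory.push(i)
    return sum(memory.cells)


def run_with_growing_pointers(program, *args):
    """Run program with 1-bit pointers, restarting with wider ones on overflow."""
    bits = 1
    while True:
        try:
            return program(BoundedMemory(bits), *args), bits
        except OutOfAddresses:
            bits += 1  # discard the failed run and start over from scratch
```

                                                            With these definitions, run_with_growing_pointers(count_up, 10) returns (45, 4): the runs with 1, 2 and 3 bits overflow, and the 4-bit run (16 cells) finishes. The sketch only terminates when the program halts using finitely many cells, and it assumes the result does not depend on the pointer width, which is the undefined-behaviour caveat above.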

                                                    2. 2

                                                      Did you read the memo? The point is that the amount of memory you can address in C is finite, therefore there are Turing machines you cannot implement in it (eg, the TM which always moves the read head left).

                                                      You can assume as much memory as you want, but it’s always going to be bounded.

                                                      1. 0

                                                        Yup. But if you have infinite memory, then you can also afford sizeof(void*) that is infinite.

                                                        1. 3

                                                          No you can’t, because pointers (and all types) are required to have a statically known size:

                                                          “Values […] consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes.”

                                                          There is no n such that n × CHAR_BIT is infinity.

                                                          I don’t mean to be dismissive, but this is brought up in the first two paragraphs.

                                                          1. -1

                                                            Sure you can. n must be a natural number that extends to infinity. Or do you know what would prevent a mathematically perfect machine stacking Peano numbers forever?

                                                            So what’s exactly hard about this?

                                                            struct thing { int a; void* b; int c; };
                                                            sizeof(int) = 4
                                                            sizeof(void*) = infinite
                                                            sizeof(thing) = 4 + infinite + 4
                                                            

                                                            Likewise offset(.b) would be 4, and offset(.c) would be 4 + infinite.

                                                            Basically. You can fit infinitely large things into infinite memory. If you couldn’t fit an infinitely large object there, then it wouldn’t be infinite.

                                                            1. 1

                                                              sizeof(void*) can’t be infinite.

                                                    1. 3

                                                      I ended up concluding that the fault is in structured programming and that we don’t need indentation-sensitive syntax, because we don’t need deeply nested structures that much either. The code can be written “1 block, 2 block, 3 block,” and so on…

                                                      1. 2

                                                        There’s a reason people adopted structured programming

                                                      1. 5

                                                        If you want something like this for your own language, it may be worthwhile to check out my cffi-gen. It generates json files from C headers.

                                                        Also I got a C-parser implemented in Lever. I dunno if anyone would use it, especially when I’m ripe to stop its development and I didn’t get to finish it properly, but it may help if you are planning to do something like this yourself: c.lc, cffigen.lc. This project ultimately suffered from a kind of inverse-second-system syndrome. I still ended up using the original cffi-gen to generate most of the stuff because it still works.

                                                        1. 2

                                                          “It generates json files from C headers.”

                                                          That’s a great idea! Makes it more reusable across projects.

                                                        1. 3

                                                          I smelled that the author is pushing his own, so I went to see what’s going on.

                                                          This is actually post-rationalization: although it gives a good rationale, this is not really what’s going on. What is actually going on is that the modulus and division are connected.

                                                          The way they are connected can be described as: a = (a / b) * b + a % b.

                                                          Division gives a different result depending on rounding. The C99 spec says that the rounding goes toward zero, but we have also had floor division implementations and systems where you can decide on the rounding mode.

                                                          If you have floor division, 19 / -12 gives you -2. That is correct when the modulo operator gives you -5. If you do round-toward-zero division, 19 / -12 gives you -1, and 19 mod -12 must give you 7.

                                                          On positive numbers, the rounding to zero and floor rounding give the same results.
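
                                                          A small sketch of the two conventions, assuming Python only as a calculator here (its // operator floors). floor_divmod and trunc_divmod are illustrative helper names, not standard functions; in both, the remainder is derived from the quotient through a = q * b + r:

```python
def floor_divmod(a, b):
    """Quotient rounded toward negative infinity, plus the matching remainder."""
    q = a // b  # Python's integer division floors
    r = a - q * b  # remainder forced by the identity a = q * b + r
    return q, r


def trunc_divmod(a, b):
    """Quotient rounded toward zero (the C99 rule), plus the matching remainder."""
    q = abs(a) // abs(b)  # magnitude of the quotient
    if (a < 0) != (b < 0):
        q = -q  # operands with different signs give a negative quotient
    r = a - q * b  # same identity, different quotient
    return q, r
```

                                                          Here floor_divmod(19, -12) gives (-2, -5) and trunc_divmod(19, -12) gives (-1, 7), while for 19 and 12 both give (1, 7). With floor division the remainder takes the sign of the divisor; with truncation it takes the sign of the dividend.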

                                                          Also checked the x86 spec. It’s super confusing about this. The online debugger I tried seemed to show the x86 idiv instruction doing floor division, but the Intel manual specifies that idiv truncates toward zero, like C99.

                                                          1. 1

                                                            Forgive my extreme mathematical naivety, but a = (a / b) * b + a % b doesn’t make much sense to me. Given (a / b) * b will always equal a, doesn’t this imply that a % b is always 0?

                                                            1. 4

                                                              / in this context is integer division, not rational division, so e.g. 7 / 3 = 2.

                                                              1. 2

                                                                The division operator in this case is not division in the algebraic sense, and it does not cancel with the multiplication, so (a / b) * b = a does not hold in general even for b != 0. Otherwise your reasoning would be correct.

                                                                To make this super clear, let’s look at 19 / -12. Real number division would give you -1.58... But what we actually have is division rounded toward negative infinity (floor division) or division rounded toward zero, and it’s not necessarily clear which one it is. Floor division returns -2 and division rounding toward zero returns -1.

                                                                The modulus is connected by the rule that I gave earlier. Therefore 19 = q*-12 + (19 % -12). If you plug in -2 here, you’ll get -5 = 19 % -12, but if you plug in -1 then you get 7 = 19 % -12.

                                                                Whatever intuition there was here is lost due to the constraint of sticking to integers or approximate numbers. Therefore it’s preferable to always treat it as if modulus was connected with floor division, because the floor division modulus contains more information than the remainder. But this is not true on every system, because hardware and language designers are fallible just like everybody else.

                                                            1. 5

                                                              Computer science clocksteps at the rate of algorithms and discoveries. Languages are always going to come and go, unless the language springs up from a good theory.

                                                              If you want to understand why this would be true, just look at the history of mathematics. Read about algebraic representations, which kinds of abacuses have been used, slide rules, mechanical calculators. You will find out that what we have today is a small fragment of what used to be, and the stuff that still exists was lugged to today because there aren’t many obviously better ways to do the same thing.

                                                              On this basis, I’d propose that the current “top 20” by Redmonk cannot form any kind of a long-running status quo. It’s a large list of programming languages rooted in the same theory (Javascript, Java, Python, PHP, C#, C++, Ruby, C, Objective-C, Swift, Scala, Go, TypeScript, Perl, Lua).

                                                              There is going to be only one in 30 years, and I think it’ll fall on the C or Javascript axis. They are syntactically near each other, and a lot of software was and gets written with these languages. Although there is even more written with C++, it’s way too contrived to survive without reducing back to something like C.

                                                              CSS may have some chance of surviving, but it’s pretty much different from the rest. About Haskell I’m not sure. I think typed lambda calculus will appear or reappear in a better format elsewhere. The language will be similar to Haskell though, and may bear the same name.

                                                              Unix shell and its commands will probably survive, while Powershell and DOS will wither. Windows seems to have its days numbered already. Sadly it was not because of the open source movement. Microsoft again just botched themselves up.

                                                              R seems like a write-and-forget language. But it roots to Iverson’s notation.. Though perhaps the notation itself will be around, but not the current instances of it.

                                                              I think that hardware getting more concurrent and diverging from the linear execution model will permanently shake up this list in the short term. The plethora of programming languages that describe a rigid evaluation strategy will simply not survive. Though I have a bit of a bias to think this way, so I may not be a reliable source for looking into the future.

                                                              But I think this is better than looking at programming language rankings.

                                                              1. 8

                                                                I think, most importantly, we haven’t even seen anything like the one language to rule them all. I expect that language to be in the direction of Conal Elliott’s work compiling to categories.

                                                                A language that is built around category theory from the start: you have many different syntactic constructs, and the ones you use in a given expression determine the properties of the category that the expression lives in. Such a language could locally have the properties of all the current languages and could provide optimal interoperation.

                                                                BTW, I think we won’t be calling the ultimate language a “programming language” because it’ll be as good for describing electrical circuits, mechanical designs and biological systems as for describing programs. So I guess it’ll be called something like a specification language.

                                                                1. 4

                                                                  “we haven’t even seen anything like the one language to rule them all.”

                                                                  That’s exactly what the LISPers always said they had. Their language could be extended to do anything. New paradigms and styles were regularly backported to it as libraries. It’s also used for hardware development and verification (ACL2).

                                                                  1. 3

                                                                    Well, it’s hard to say anything about LISPs in general since the span is so vast and academic, and especially for me, since my contact with any LISP is quite limited. But, from my understanding of the common usage of LISP, it doesn’t qualify.

                                                                    First of all, I think dropping static analysis is cheating, but I don’t intend to tap into an eternal flame war here. What I mean when I say “the properties of the current languages” is no implicit allocations, borrow-checking and inline assembly like in Rust, purity and parametricity like in Haskell, capabilities-security like in Pony etc., and not only the semantics of these, but also compilers taking advantage of these semantics to provide static assistance and optimizations (like using the stack instead of the heap, laziness & strictness analysis etc.).

                                                                    And I’m also not just talking about being able to embed these into a given language; you should also be able to write code such that if it’s simple enough, it should be usable in many of them. For instance, it’d be hard to come up with some language semantics in which the identity function cannot be defined, so the identifier id x = x should be usable under any local semantics (after all every category needs to have identity morphisms). You should also be able to write code that interfaces between these local semantics without leaving the language and the static analysis.

                                                                    I know you can embed these things in LISP, expose enough structure from your LISP code to perform static analysis, get LISP to emit x86 assembly etc. etc. But, IMHO, this doesn’t make LISP the language I’m talking about. It makes it a substrate to build that language on.

                                                                2. 2

                                                                  I think one major difference between math and computer science, and why we’re not going to see a lot of consolidation for a while (not even in 30 years, I don’t think), is that code that’s on the internet has a way of sticking around, since it’s doing more than just sitting in research papers, or providing a tool for a single person.

                                                                  I doubt we’ll see 100% consolidation any time soon, if for no reason than that it’s too easy to create a new programming language for that to be the case.

                                                                  Hardware changes might shake up this list, but I think it’ll take 30 years for that to be realized, but there will be a lot of programming languages that fall out of that.

                                                                  We’re definitely still going to have COBOL in 30 years, and Java, and C. The rest, I’m unsure of, but I’ll bet that we’ll be able to recognize the lineage of a lot of the top 30 when we look in 30 years.

                                                                  1. 1

                                                                    “R seems like a write-and-forget language. But it roots to Iverson’s notation.”

                                                                    Did you mean to write J or APL? I understand R as the statistics language.

                                                                  1. 2

                                                                    I’m disappointed to read the negative comments on TFA complaining that the author has “merely” identified a problem and called on us to fix it, without also implementing the solution. That is an established pattern; a revolution has four classes of actor:

                                                                    • theoreticians
                                                                    • propagandists
                                                                    • agitators
                                                                    • organisers

                                                                    Identifying a problem is a necessary prerequisite to popularising the solution, but the four roles do not all need to be filled by the same person. RMS wrote the GNU Manifesto, but did not write all of GNU. Martin Luther wrote the ninety-five theses, but did not undertake all of Protestant reform. Karl Marx and Friedrich Engels wrote the Communist Manifesto but did not lead a revolution in Russia or China. The Agile Manifesto signatories wrote the manifesto for agile software development but did not all personally tell your team lead to transform your development processes.

                                                                    Understandably, there are people who do not react well to theory, and who need the other three activities to be completed before they can see their role in the change. My disappointment is that the noise caused by so many people saying “you have not told me what to do” drowns out the few asking themselves “what is to be done?”

                                                                    1. 2

                                                                      I read through and concluded that the author says nothing. Cannot exactly identify the problem he raises up. And by observing the first lines, I think he’s an idiot.

                                                                      That computing is so complex is not a fault of nerdy young 50 years old men. If nerdy 50 years young men had designed that stuff we’d be using Plan9 with Prolog and not have as many problems as now.

                                                                      The current computing platforms are created by multiple-body companies and committees with commercial interests. They’ve provided all the great and nice specs such as COBOL, ALGOL, HDMI, USB, UEFI, XML and ACHI, just a few to start the list with. All of the bullshit is the handwriting of the ignorant, not of those playing Dungeons and Dragons or solving Rubik’s cubes.