1. 4

    I am working on a proof of concept for GDPR using a graph database and Vue.js. On Wednesday I will be speaking about API-first CMSs at the WHO in Copenhagen.

    1. 2

      GDPR is going to be a hot topic next year. Is your idea to demonstrate links between data points?

      1. 3

        Yes it is! I am preparing a GitHub repo and a few blog posts. I will share it all when it is ready.

        1. 1

          Please do, I’m interested in this matter!

          1. 1

            Hello, as promised I have published the first part here: https://blog.grakn.ai/gdpr-threat-or-opportunity-4cdcc8802f22 and the second part here: https://medium.com/@samuelpouyt/grakn-ai-to-manage-gdpr-f10cd36539b9 and I have yet to publish the API example. Code is available here https://github.com/idealley/grakn-gdpr

      2. 1

        Are you talking about GDPR at the WHO? Or an actual CMS?

        1. 3

          At the WHO I am speaking about Cloud CMS, an actual CMS we have implemented where I work, but I am speaking generally about API-first CMSs and the benefits they can bring to a company, especially if you need to publish to different channels.

          1. 1

            Have you spoken at any other humanitarian agencies yet or worked at an NGO in a technical capacity before?

            1. 1

              I am working at an NGO. And we have implemented it. I agree it requires some technical knowledge, but the benefits are huge!

              I did not speak at humanitarian agencies on this topic, but I have in other digital circles.

              1. 1

                Cool, well good luck! I haven’t been to the Copenhagen office before; I’ve been to GVA and in-country offices, but they only let me out of my cage to see the outside world once in a blue moon.

                1. 1

                  I was also in a cage. One day I was invited, but my boss said no. I took the days off from my extra hours and financed myself, like this trip to Copenhagen. :( But all the rest is fun!

      1. 1

        The site linked seems to be for an older course and contains only static content.

        The right site for the Pharo MOOC (with exercises) is this one.

        1. 22

          First time I’ve seen a good synthesis of:

          • “Managed” languages’ closed-world top-down design assumption makes interoperability difficult and slow.
          • C’s main advantage is not performance, as often assumed, but its assumption of flatter, open-world interoperability in the context of a larger system.
          • C’s safety disadvantages are implementation artifacts that can be addressed without changing languages—but because of the emphasis on numerical performance, implementations are going in the opposite direction.
          • Without an equally good open-world language (Rust?) you couldn’t rewrite a lot of C anyway.
          1. 14

            C’s main advantage is not performance, as often assumed, but its assumption of flatter, open-world interoperability in the context of a larger system.

            “Interoperability?!” Ha! At least I remember battling calling conventions.

            For better or worse, C won; that’s the difference.

            The greatest trick C ever pulled was convincing the world it doesn’t have a VM or runtime.

            1. 6

              For better or worse, C won; that’s the difference.

              I would say for better. Not because the C model was better, but because one of the risks was not having a common calling convention, and that would have been a nightmare.

              The greatest trick C ever pulled was convincing the world it doesn’t have a VM or runtime.

              This is rarely debated but completely right. The lack of a VM or heavy runtime was so influential that almost no one remembers other ways of working, except in almost-dead or niche languages.

              Because of this, almost all new languages are boringly close to C, and this has created a huge gap in language innovation.

              1. 5

                Due to this, almost all new languages are boringly close to C and this has created a huge gap in language innovation.

                How so? I’m trying to follow this comment here but am not sure what you mean.

            2. 10

              “C’s safety disadvantages are implementation artifacts that can be addressed without changing languages—but because of the emphasis on numerical performance, implementations are going in the opposite direction.”

              I countered the concept of just implementing C safer elsewhere with the following:

              The problem is that it isn’t a new idea. People keep trying it, as the links below show. Unfortunately, C wasn’t so much designed as it was a modified version of something (BCPL) that was the only collection of features Richards could get to compile on his crappy hardware. It’s not designed for easy analysis or safety. So all the attempts are going to hit problems in what legacy code they can support, in their performance, or in their effectiveness at reliability/security in a pointer-heavy language. Compare that to Ada, Wirth’s stuff, or Modula-3 to find they don’t have that problem, or have much less of it, because they were carefully designed balancing the various tradeoffs. Ada even meets the author’s criteria for a safe language with explicit memory representation, despite him saying safe languages don’t have that.

              To back that up with references: the first is a bunch of attempts at safer C’s or C-like languages with performance issues. The next two are among the most recent and practical attempts at memory safety for C apps, as far as CompSci goes. The last one is an Ada book that lists, chapter by chapter, each technique its designer used to systematically mitigate bugs or vulnerabilities in systems code.

              https://pdfs.semanticscholar.org/a890/a850dc78e65e26f8f4def435b17094ce08cf.pdf

              https://llvm.org/pubs/2006-06-12-PLDI-SAFECode.html

              https://www.cs.rutgers.edu/~santosh.nagarakatte/softbound/

              http://www.adacore.com/uploads/technical-papers/SafeSecureAdav2015-covered.pdf

              EDIT: Removed this comment from main thread and posted it here as a reply. Clicked the wrong one at first apparently.

              1. 9

                Ada basically proves his point that you can have a safe, unmanaged, integrative systems language, if you provide good enough facilities for describing the “alien” memory you got from other modules.

                I think he’s mainly responding to the widespread assumption that that’s not possible: that only a language that assumes its runtime “owns” the entire address space and allows direct memory access only through a narrow API can ever be safe. Presumably Ada users know that’s not true, but they’re too busy to write blogs…

                1. 4

                  I like that perspective. With that one, we start with what can be done with current languages then can argue tradeoffs each method offers. Far as Ada users, they seem to be apathetic to or suck at community efforts more than most groups using a language. I think the ones doing solid writing almost exclusively do it in Ada-specific forums, conferences, etc. It might be a pragmatic choice based on how many people in general forums ignore such languages. Nonetheless, it hurts them in adoption since the benefits aren’t widely seen.

                2. 5

                  Link to CCured paper: https://www.cs.virginia.edu/~weimer/p/p477-necula.pdf

                  If I had a pressing need for some high assurance C code, I’d probably start with something like that. Their results were pretty good with minimal annotations. If you commit to using something like that, you’d increase annotations over time to improve performance.

                3. 4

                  I agree completely, and reading this paper echoes a lot of my own thoughts about not just the C ecosystem but the surrounds of UNIX in which many of us are constructing the building blocks of our even broader distributed systems. A variety of more-constrained runtime environments with various degrees of memory safety exist, most or all of which are to some degree willfully ignorant of, or even actively hostile to, existing operating system primitives and libraries of code written in other languages.

                  One example is how difficult it is to use something like doors when your language implies and enforces a model of parallelism that precludes direct control of operating system threads. Another is SQLite – how likely is it that a de novo implementation, built to fit into a language-specific world view, will have seen as much real-world use and extensive testing?

                  1. 2

                    It was hard wading through that… but D looks like a closer fit for addressing all the concerns listed in that doc.

                  1. 3

                    I think this conclusion is dubiously supported:

                    The conclusion is the one you probably arrived at too by now: In CSS, we repeat ourselves too much.

                    Repeat ourselves too much for what? What consequences are there of this repetition that make it so bad that we should spend effort on getting rid of it?

                    CSS with this much repetition, this much bloat, is slow.

                    I don’t think the evidence supports this assertion.

                    • Is there evidence that testing all these redundant rules against DOM nodes at runtime is unacceptably expensive? As far as I can tell, no: big complicated websites like BBC News or Tumblr seem to spend about 2 to 4 times as much time computing layout as style, and another 4 or 5 times longer than that on running JS. I don’t think deduplicating CSS declarations would help with this anyway; you still get the same (or larger) set of selectors.
                    • Is there evidence that parsing these redundant declarations takes an unacceptably long time? Doesn’t seem so to me; the time that I see Chrome’s dev tools attribute to CSS parsing seems to be a rounding error on complicated websites.
                    • Is there evidence that transmitting these redundant declarations over the wire burns unacceptably much bandwidth? I’m not very convinced; redundant text gzips very easily. Right now, the people worrying loudly about page weight are mostly talking about the sizes of images which completely dunk on CSS by a factor of dozens.

                    I don’t think there’s evidence here to support the notion that doing things to your source code that will make it more confusing, such as deduplicating declarations in it that are only identical by coincidence¹, is going to net you an appreciable performance gain on any metric you care about. At the very least you have the Amdahl’s Law problem: if you try to optimise a part of a process that only takes a tiny proportion of the total time you’re spending, then even if you optimise it so that it’s infinitely faster, you will still at best have removed only the entirety of the tiny proportion of your runtime that it accounted for in the first place.

                    (¹):

                    Q: Does it still make sense from a maintenance perspective to avoid repetition when declarations are only coincidentally the same? A: Yes, …
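
                    To put rough numbers on that Amdahl’s Law point (purely illustrative figures, not measurements): suppose style calculation is p = 10% of total page time and deduplication somehow made it s = 2 times faster. Then the overall speedup is

                        S = 1 / ((1 - p) + p/s) = 1 / (0.90 + 0.05) ≈ 1.05,

                    i.e. about a 5% win overall; even an infinitely fast style pass (s → ∞) tops out at 1 / 0.90 ≈ 1.11.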

                    1. 4

                      Solid agree. My understanding is that CSS (unlike a lot of HTML/JS/etc) was specifically designed to allow efficient implementation. Which isn’t an argument to be wasteful, but you have to try really hard to make it a problem. Unlike certain other web techs which spontaneously turn into performance sinkholes unless you maintain constant vigil.

                      Nightmare scenario: someone replaces 50K of static CSS with another megabyte of javascript that dynamically generates the “optimal” style for each page.

                      1. 2

                        Nightmare scenario: someone replaces 50K of static CSS with another megabyte of javascript that dynamically generates the “optimal” style for each page.

                        Nightmare? This has been going on for the past couple of years.

                        1. 1

                          Where did you get your understanding that CSS was designed to have efficient implementations?

                          1. 1

                            I’m not him, but I know CSS doesn’t have a parent selector (e.g. “select all divs which have a child element matching this selector”), mainly for performance reasons, which probably means performance was a consideration when they designed other parts of the spec too.

                            1. 1

                              Yeah, that’s what I’m thinking of.

                          2. 1

                            “Nightmare scenario: someone replaces 50K of static CSS with another megabyte of javascript that dynamically generates the “optimal” style for each page.”

                            I was thinking more along the lines of them implementing a whole OS in a browser that then renders the CSS in its browser.

                            http://copy.sh/v86/?profile=windows98

                            Note: IE crashed on me when I typed a URL to then tell me I’ll need to install a modem. So, they’re not there but getting close…

                          3. 2

                            Changes in CSS trigger recalculations of layout. Having duplicated CSS may be contributing to the layout recalculations that you see in complicated sites. Especially if you haven’t loaded all of the CSS up front. Ignoring the layout numbers is ignoring half the effect that CSS has on the page.

                            I used to work on a tool that did CSS optimization built into the templating language. The reasons for the tool were that CSS was difficult to deduplicate for very large, complicated sites, but duplicated CSS did cause expensive and unnecessary layout recalculations, among other things.

                            Add to this the fact that duplicated CSS can sometimes make it difficult to manage global changes on complicated sites. It is possibly more expensive than you might think.

                          1. -1

                            XOR also set upon us thousands of cost-cutting MBAs.

                            Marvin Minsky showed that a single perceptron could not compute the XOR function. This was never doubted. It’s a trivial result. Minsky probably didn’t think much of it. It was an example under which a very simple algorithm (guaranteed to converge if the data were linearly separable) would not work.

                            However, it brought on the AI winter when cost-cutting MBAs started using it to argue that “AI” and “neural networks” were a failure. “Neural networks” couldn’t even learn XOR! That’s not true, of course. A single perceptron can’t learn XOR and no one thought much of the fact.

                            Of course, we know a lot about neural nets that we didn’t know in the ‘60s. We know where gradient descent (back-propagation) works well and what its failure modes are. We know about ReLU units. We know about the importance of validation data. We know that neural nets generally aren’t the best approach to exact binary operations.
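
                            (For the record, the reason a single perceptron can’t compute XOR is a two-line linear-separability argument; this is the standard textbook sketch, not a claim from this thread. A perceptron outputs 1 exactly when w1*x1 + w2*x2 + b > 0, so XOR would require

                                (0,0) → 0 :  b ≤ 0
                                (1,0) → 1 :  w1 + b > 0
                                (0,1) → 1 :  w2 + b > 0
                                (1,1) → 0 :  w1 + w2 + b ≤ 0

                            but adding the middle two lines gives w1 + w2 + 2b > 0, hence w1 + w2 + b > -b ≥ 0, which contradicts the last line.)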

                            1. 14

                              However, it brought on the AI winter when cost-cutting MBAs started using it to argue that “AI” and “neural networks” were a failure. “Neural networks” couldn’t even learn XOR!

                              I really, really doubt that. I read lots of books from that era when I studied AI. I played with some of the tools, too. The AI Winter came from a cycle of over-promising what an AI method/company would do, large-to-massive investment of time/money into it, and then not getting those results, on top of new problems. A big example would be Japan’s Fifth Generation project, which was probably under-performing rules engines or stochastic search on dumb, fast processors, despite them putting huge amounts of money and talent into it. This problem led to the AI Winter that took down LISP and Prolog in industry in general. Rules engines lived on in languages such as Java under the banner of Business Process Management.

                              1. 9

                                You confuse the decline in perceptron studies of the 1970s (which did follow the Minsky book) with the ‘AI Winter’ of the 1990s. The latter was mostly due to the failure of expert systems and knowledge engineering.

                                The ‘cost-cutting MBAs’ didn’t have anything to do with either.

                                1. 3

                                  I don’t know why this is so down-voted without explanation!

                                  Minsky killed neural networks (not the AI winter) and he did it on purpose to get more ARPA funding. When he and Papert published their book they were right that single-layer perceptrons could not manage XOR and that multi-layer perceptrons were almost impossible to train, even though they already knew multi-layer networks were able to compute XOR.

                                  This book was so influential it made it impossible for Rosenblatt (the father of the perceptron) to succeed in obtaining any military funding. Minsky & Papert never wanted to debate the merits of neural networks, but concentrated on the limitations of the Perceptron (a small part of the neural network idea) in solving what they called a ‘group of interesting problems’.

                                  The book destroyed Rosenblatt’s work in the AI world, delaying the use of neural networks for 20-30 years.

                                1. 4

                                  My takeaway from the whole Itanium debacle was that it was maybe the strongest example of “No, really, there is no such thing as a sufficiently advanced compiler that will save your ass.” The Cell chip being the second-strongest, but one that worked because it had so much consumer force behind it.

                                  The Itanium chip was awesome, but man what a mess for devs and compiler engineers.

                                  There’s a lot to be learned from that hubris.

                                  (hint hint)

                                  1. 2

                                    This is also what I remember from that time. The super-compiler was a technical bet that did not materialise.

                                    And also, it did not help that the marketing budget of x86 was much bigger.

                                    At that time I was involved in some big HW pre-sales processes and Itanium never was cost-effective in the business case. It was more expensive, and the only thing we could explain to our prospects was that the roadmap was better and suited for the really high-end market, a place which, according to Itanium marketing, x86 would eventually abandon. But of course the x86 marketers repeated that this would never happen.

                                    But clients felt this was a gamble, and they preferred to stay with x86, where higher frequency meant higher processing power, rather than wait for the appearance of a magical compiler. This was a key point because the people making the Itanium decision were HW people who did not feel comfortable changing the application landscape to introduce a new compiler.

                                    1. 1

                                      Idk. Both the Itanium and Cell sales were mostly driven by platform appeal (or lock-in for HP) that would’ve happened just as easily with an Intel/AMD CPU, as we’ve seen recently in both markets. As far as people buying them directly, both appealed to tiny niches of customers that liked their unique capabilities enough to forgo the benefits of mainstream platforms. And then they both died, with Itanium outlasting the Cell in iteration count (IIRC).

                                      By died, I mean market demand and real investment into them. I think IBM and Mercury still sell Cells, but most demand went to clusters of multicore CPUs and GPUs.

                                    1. 3

                                        Yes, you’re right: waterfall, the way it is described today to promote other methodologies, never existed.

                                        Even in the old days, when you had highly paid consultants of Company A doing the analysis and Company B doing the programming, updates (feedback) to the analysis document created in the previous phase were usual and expected.

                                      1. 4

                                          Yes, you’re right: waterfall, the way it is described today to promote other methodologies, never existed.

                                        I was taught The Waterfall Method in high-school (around 2006) and tested on it. It existed, to unfortunate Australian school students :)

                                        1. 1

                                            It’s like the “UNIX and C are a hoax” gag. Except it’s real, done in businesses, mandated in many schools, and surviving the Agile wave through sheer inertia and control-freak managers. Human nature in action…

                                      1. 2

                                        Are we including things like Cleanroom Software Engineering in this tag?

                                        1. 2

                                              I’d just call it Programming, since it’s a methodology for programming that doesn’t actually require formal verification. It’s certainly a formal method, but designed for people who wouldn’t get that. It overlaps considerably with good programming style. I’d avoid putting the formal tag on it just so someone wouldn’t filter it out, thinking it would be useless to them like a “Programming with Proofs” paper requiring a strong math/logic background.

                                        1. 5

                                              Am I the only one that thinks that Mastodon/GNU Social is going to be a huge mess?

                                          Even if the federated code works flawlessly, it’s going to be almost impossible to have recognisable identities without some big players setting up their nodes and providing some assurance of who is who.

                                          1. 27

                                            Not great for celebrities/brands. Just fine for me and my friends, and people they vouch for.

                                            1. 6

                                              Yeah, I’m nearly certain that the people complaining about this aren’t the people using it.

                                            2. 10

                                              No more of a mess than email.

                                              1. 2

                                                And look how “uncool” email is now, and Slack et al are in.

                                                edit: For how federation can turn into a big mess, see Usenet, which is now just pirates and spammers.

                                                (FWIW, the IRC model of federation is also interesting. More smaller scale.)

                                                1. 2

                                                  see Usenet, which is now just pirates and spammers.

                                                  Usenet got that way through an evaporative cooling process when all the real users skipped town to the web. That is the eventual fate of any platform, federated or not.

                                                  1. 2

                                                    Email and Usenet don’t have investors spending millions of dollars on marketing (or on feature development, to be fair).

                                                    1. 1

                                                      email is uncool because it’s push not pull.

                                                    2. 2

                                                      Email is a gigantic mess.

                                                      1. 1

                                                        I love email (as a protocol and as a communication medium), but email is by definition something private and sender to receiver (cc: never worked really well). It’s not a publication protocol.

                                                          A Twitter-like protocol is something public, closer to Usenet and completely unrelated to email.

                                                      2. 10

                                                        I’ve said (roughly) this before, but I’ll put it here because it applies. I genuinely don’t believe that any of these federated services will ever take off so long as “federation” is viewed by the developers as the “killer feature”.

                                                        Virtually no one wants to run their own instance. And virtually everyone just wants to set up an account and use it without worrying about whether their server will be fast enough, or be kept up-to-date, or even still be online in a year. As an aside, yes, I know Twitter could disappear tomorrow, but I can’t control that risk so I don’t stress about it.

                                                        In fact, virtually no one even wants to choose their server (or node, or whatever you want to call it). I certainly don’t, because there’s no way for me to make a reasonable decision based on a giant list of servers and basic stats about them. How do I know which ones are run by trustworthy people? How do I know which ones are run by some kid who doesn’t know the first thing about securing a server? That’s too much stress, and most people will respond by giving up. I signed up for a Mastodon account only after it appeared that mastodon.social was something akin to an “official” instance. No idea if that’s true or not, but that was what got me to sign up.

                                                        So do I think that federation is undesirable? Not at all. But federation is a safety valve, not a core feature. It forces (in theory) providers to put their users first, because their users can leave and take their data and connections with them. It also allows nerds and the paranoid (whether justifiably or not) to self-host, which should make for a richer, more inclusive, and more pleasant experience for everyone. But again, the people self-hosting are going to be a tiny, tiny minority if the system ever becomes widely popular and we should acknowledge this and act accordingly.

                                                        1. 2

                                                          I agree not many people want to host their own server, but many people do want to choose their instance. Probably not a majority, but a pretty large minority. So far this is mostly because some instances act kind of like mini-BBS/forums, not merely a place to connect to the larger federated network from. The “local timeline” tab shows a firehose feed of all posts from people on your own instance, and people on instances with some kind of shared community use that as a general chat (it’s less useful on huge instances like mastodon.social). There are other ways this could be built that don’t tie the community to a specific server, e.g. GNU Social has a concept of “groups”, which Mastodon doesn’t interoperate with. But some things are hard to implement if not tied to a server; different servers have different moderation/content policies, for example, which is implementable because the admin of the server can enforce them.

                                                        2. 9

                                                          I love how many other instances are also named “mastodon”.

                                                          “No, I’m not user@mastodon.social, I’m user@mastodon.network, you followed the wrong person”

                                                          1. 1

                                                            I was sad to see that gay.crime.team does not have open registration.

                                                        1. 1

                                                          Happy to see it deployed.

                                                          One of the things I really miss from usenet was the killfile. Banning is too harsh, and hiding is a solution that would be extremely useful.

                                                          1. 16

                                                            Don’t use macros.

                                                             Why not? Macros can be very useful. For example, say I have a dispatch table to call functions with a common signature and set of local variables. If there are 30 different functions, a macro defining the function and declaring the common variables means that if something changes I only have to change it in one place. This is more than just an ease-of-coding thing: if I change from signed to unsigned or change the width of an integer and forget to change it in one place, there can be serious and hard-to-find consequences.
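
                                                             A minimal sketch of that kind of setup (all names here are made up for illustration): the shared signature and the common locals each live in exactly one place, and the handlers plus the dispatch table simply reuse them.

                                                             #include <stddef.h>
                                                             #include <stdint.h>

                                                             /* The shared return type and signature: change width or signedness here only. */
                                                             typedef int32_t handler_ret_t;
                                                             #define HANDLER_ARGS   const uint8_t *buf, size_t len

                                                             /* The common local variables every handler starts with. */
                                                             #define HANDLER_LOCALS          \
                                                                 handler_ret_t rc  = 0;      \
                                                                 size_t        off = 0;      \
                                                                 (void)off;

                                                             static handler_ret_t handle_ping(HANDLER_ARGS)
                                                             {
                                                                 HANDLER_LOCALS
                                                                 rc = (len > 0 && buf[0] == 0x01);
                                                                 return rc;
                                                             }

                                                             static handler_ret_t handle_echo(HANDLER_ARGS)
                                                             {
                                                                 HANDLER_LOCALS
                                                                 rc = (handler_ret_t)len;
                                                                 return rc;
                                                             }

                                                             /* Every entry in the dispatch table is guaranteed to have the same signature. */
                                                             static handler_ret_t (*const dispatch[])(HANDLER_ARGS) = { handle_ping, handle_echo };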

                                                            Don’t use fixed-size buffers.

                                                            Always use static, fixed-sized buffers allocated in the bss, if you can get away with it (that is, you know the maximum size at compile time). Allocation can fail at runtime, and adding checks everywhere for this is error-prone. If you’re allocating and freeing chunks of memory at runtime, you run the risk of use-after-free, reference miscounts, etc.

                                                            If the size of a block isn’t known until runtime, but is known at startup, allocate the necessary memory at startup and free it at shutdown.

                                                            Only as a last resort should you be doing allocation and freeing repeatedly during runtime, when the set of objects and their sizes depends on data only accessible while running.
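
                                                             A minimal sketch of the first two cases (hypothetical names, trimmed for illustration): compile-time maximums live in the bss, startup-time sizes get one calloc whose failure is handled in exactly one place, and nothing is allocated or freed after that.

                                                             #include <stdio.h>
                                                             #include <stdlib.h>
                                                             #include <string.h>

                                                             #define MAX_LINE 4096
                                                             static char line_buf[MAX_LINE];   /* maximum known at compile time: static, in the bss */

                                                             static double *samples;           /* size only known at startup */
                                                             static size_t  n_samples;

                                                             static int startup(size_t n)
                                                             {
                                                                 samples = calloc(n, sizeof *samples);   /* the one place allocation can fail */
                                                                 if (samples == NULL)
                                                                     return -1;
                                                                 n_samples = n;
                                                                 return 0;
                                                             }

                                                             static void teardown(void)
                                                             {
                                                                 free(samples);
                                                                 samples   = NULL;
                                                                 n_samples = 0;
                                                             }

                                                             int main(void)
                                                             {
                                                                 if (startup(1024) != 0) {
                                                                     fputs("allocation failed at startup\n", stderr);
                                                                     return 1;
                                                                 }
                                                                 strcpy(line_buf, "no malloc/free from here on");
                                                                 puts(line_buf);
                                                                 teardown();
                                                                 return 0;
                                                             }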

                                                            1. 11

                                                              I feel the writer is not so experienced with C.

                                                               Not only are there generic recommendations like Prefer maintainability (when should we not prefer maintainability?) or Use a disciplined workflow (yes, but what kind of workflow?); some of them go against common C best practices, like Do not use a typedef to hide a pointer or avoid writing “struct”.

                                                               Taking into account that opaque pointers are standard in the stdlib and highly recommended to hide complexity and allow code to change, I don’t know where he got these ideas.

                                                              1. 3

                                                                Opaque pointers hidden behind typedefs are something I’ve never been totally comfortable with, though I guess I’ve been using them without knowing! Where in libc are they used?

                                                                1. 4

                                                                   typedef void* lobster_handle_t; is probably the most common way I’m aware of to expose types and structs for public consumption without giving away internal implementation details to users. This is doubly useful if you have, for example, the same interface implemented differently on different platforms: your _win32.c and _posix.c variants are chosen based on #ifdefs, but user code including your headers only ever sees the opaque pointer.
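
                                                                   A small sketch of that layout (the names are invented; the typedef void* version above works the same way, this variant just uses a struct pointer so handles stay type-checked):

                                                                   /* lobster.h -- public header: callers only ever see the opaque handle. */
                                                                   typedef struct lobster *lobster_handle_t;   /* struct lobster is never defined here */

                                                                   lobster_handle_t lobster_open(const char *name);
                                                                   void             lobster_close(lobster_handle_t h);

                                                                   /* lobster_posix.c -- one per-platform implementation; a _win32.c variant
                                                                      chosen by the build (or #ifdefs) would define the struct its own way,
                                                                      and user code never notices. */
                                                                   #include <stdlib.h>
                                                                   #include <string.h>

                                                                   struct lobster {                 /* full definition stays private to this file */
                                                                       char name[32];
                                                                       int  fd;
                                                                   };

                                                                   lobster_handle_t lobster_open(const char *name)
                                                                   {
                                                                       struct lobster *l = calloc(1, sizeof *l);
                                                                       if (l != NULL)
                                                                           strncpy(l->name, name, sizeof l->name - 1);
                                                                       return l;
                                                                   }

                                                                   void lobster_close(lobster_handle_t h)
                                                                   {
                                                                       free(h);
                                                                   }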

                                                                  1. 5

                                                                    Wouldn’t a lobster handle just be a claw?

                                                                    1. 3

                                                                      Or the tail

                                                                    2. 4

                                                                      Forward declaration is the new hotness:

                                                                      struct T;
                                                                      void f( T * x ); // feel free to pass around T*, but you don't get to see inside
                                                                      

                                                                      It brings no benefits to C code because all pointer types implicitly cast to each other, but in C++ they don’t and it’s definitely preferred there.

                                                                      1. 4

                                                                        It brings no benefits to C code because all pointer types implicitly cast to each other

                                                                        Whoa, no they don’t. void * implicitly converts to any other type of (non-function) pointer, and vice-versa, but that’s it.

                                                                        (many compilers do allow for function pointer <-> void * conversions, even implicitly, but I think that’s an extension for POSIX compatibility.)
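
                                                                         A tiny demo of the rule (this compiles as C; the commented-out line is the conversion a compiler must diagnose):

                                                                         #include <stdio.h>

                                                                         int main(void)
                                                                         {
                                                                             int   n  = 42;
                                                                             void *vp = &n;    /* int *  -> void *: implicit, fine */
                                                                             int  *ip = vp;    /* void * -> int *:  implicit in C (not in C++) */

                                                                             /* float *fp = &n;   int * -> float *: constraint violation, diagnosed */

                                                                             printf("%d\n", *ip);
                                                                             return 0;
                                                                         }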

                                                                        1. 1

                                                                          MSVC/GCC/clang all allow it, but they do warn about it by default.

                                                                        2. 3

                                                                          T isn’t a valid type name in C. You have to use struct T unless you supply a typedef.

                                                                      2. 2

                                                                        FILE, for example.

                                                                        1. 3

                                                                          Correct me if I’m wrong, but doesn’t one usually use a FILE * rather than working with a raw FILE?

                                                                          1. 1

                                                                            Sorry I was thinking just “opaque pointer” not one hidden behind a typedef. An example of a completely opaque type (from the perspective of the standard library) is va_list. Extending beyond the C standard library, you have things like pthread_t in POSIX (which could be “the standard library” depending on your definition), which is of unspecified type.

                                                                            1. 1

                                                                              Keep in mind, va_list is not necessarily a pointer, and it’s only opaque in the sense that its contents are undefined and unportable. On x86-64 Linux, for example, it’s a 24 byte struct, and may be defined (depending on your compiler, headers, and phase of moon) as:

                                                                              struct __va_list_struct {
                                                                                  unsigned int gp_offset;
                                                                                  unsigned int fp_offset;
                                                                                  union {
                                                                                      unsigned int overflow_offset;
                                                                                      char *overflow_arg_area;
                                                                                  };
                                                                                  char *reg_save_area;
                                                                              };
                                                                              
                                                                              1. 1

                                                                                 Right, I was trying to think of an example that is an explicitly opaque type hiding behind a typedef. It’s always interesting to see how POSIX and/or C sometimes mandates some things as completely undefined by type, but not others. jmp_buf has to be an array type, for example, but is not specified beyond that, and va_list is explicitly of any type at all.

                                                                                1. 1

                                                                                   For time_t, Standard C does not mandate a definition at all (it could be an integer, could be a float, could be a structure). POSIX defines it, though.

                                                                                  1. 2

                                                                                    Time is an illusion. Lunchtime doubly so.

                                                                        2. 1

                                                                          FILE * is the more visible example.

                                                                      3. 4

                                                                        if I change from signed to unsigned or change the width of an integer and forget to change it in one place, there can be serious and hard-to-find consequences.

                                                                        Agree, which is why using typedefs to make maximal use of C’s sad type system is a better move than a mere macro. Also, macros can do weird things when expanded in code, and it’s easy to end up with a codebase that is unreadable and ungreppable because of having to continually expand non-intuitive macros. They’re handy, in moderation, but overuse is not so great.

                                                                        Only as a last resort should you be doing allocation and freeing repeatedly during runtime, when the set of objects and their sizes depends on data only accessible while running.

                                                                        Spoken like a true Fortran programmer! ;)

                                                                        More seriously, anything that is actually interactive and of any real practical use is easier coded with dynamic allocation. Also, the number of people that properly write fixed-size allocation code without leaving gigantic security holes and undefined behavior open is small. Better just to use malloc and free and know that you have problems than to hope somebody didn’t mismatch a buffer size with a differently-spec'ed memmove call.

                                                                        That said, in a library, if you don’t allow users to specify their own allocation routines you are bad and you should feel bad.

                                                                        ~

                                                                        Overall, I agree that this advice is not so great, probably because the author hasn’t had to deal with producing libraries for others to consume. That very much colors how these things are evaluated.

                                                                        1. 6

                                                                          Fortran

                                                                          curls up in a ball, rocks back and forth, crying

                                                                          They’re handy, in moderation, but overuse is not so great.

                                                                          That’s true of just about anything, but yes, macros are a sharp tool. It’s very easy to hurt yourself if not used very carefully, but like any sharp tool sometimes there’s a good use case. Never say never. :)

                                                                          More seriously, anything that is actually interactive and of any real practical use is easier coded with dynamic allocation.

                                                                          True, but not everything need be interactive. The most critical code I work on right now is highly dynamic at runtime, but does no memory allocation after startup. We calculate the sizes of various structures based on parameters provided by the system at startup, and allocate memory once. This is necessary for various reasons, but most importantly because of performance; we deal with tens-of-thousands of work units a second, of varying size. Repeatedly allocating and freeing blocks would rapidly result in fragmentation.

                                                                           We originally thought about allocating fixed-size blocks, since most modern allocators would handle that well so long as there weren’t any other allocations happening. Things like tcmalloc would still probably be okay, but at the end of the day we decided to use a static allocation scheme with what amounts to a large array with chase pointers in each slot, making allocation an O(1) operation with zero fragmentation (basically a slab allocator). Additionally, we can use mlock to keep those pages in memory to avoid any indeterminacy with swapping.

                                                                          Variable-sized data is fed into a ring buffer with chase pointers and we keep pointers to things in the ring in the slab-allocated structures; we never copy out of the ring. We track the ring pointers and invalidate any data in a block that gets overwritten while in use (which is surprisingly cheap if you do it right).
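
                                                                           Not the actual code, but a minimal sketch of that slot-array-plus-chase-pointer idea (sizes hard-coded here for brevity; the real thing computes them from startup parameters):

                                                                           #include <stddef.h>

                                                                           typedef struct work_unit {
                                                                               struct work_unit *next_free;   /* "chase pointer": threads the free list */
                                                                               unsigned char     payload[240];
                                                                           } work_unit;

                                                                           #define SLAB_SLOTS 4096

                                                                           static work_unit  slab[SLAB_SLOTS];   /* allocated once, lives in the bss, never freed */
                                                                           static work_unit *free_head;

                                                                           static void slab_init(void)
                                                                           {
                                                                               for (size_t i = 0; i + 1 < SLAB_SLOTS; i++)
                                                                                   slab[i].next_free = &slab[i + 1];
                                                                               slab[SLAB_SLOTS - 1].next_free = NULL;
                                                                               free_head = &slab[0];
                                                                           }

                                                                           static work_unit *slab_alloc(void)    /* O(1): pop the free-list head */
                                                                           {
                                                                               work_unit *w = free_head;
                                                                               if (w != NULL)
                                                                                   free_head = w->next_free;
                                                                               return w;                         /* NULL means the slab is exhausted */
                                                                           }

                                                                           static void slab_free(work_unit *w)   /* O(1): push it back; no fragmentation */
                                                                           {
                                                                               w->next_free = free_head;
                                                                               free_head = w;
                                                                           }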

                                                                          (Sorry, that was a big digression, but I really like working on that code.)

                                                                          Also, the number of people that properly write fixed-size allocation code without leaving gigantic security holes and undefined behavior open is small.

                                                                          I would argue that writing strncpy(foo, bar, BUFSIZE) is less error-prone than strncpy(foo, bar, dynamically_allocated_size_that_changes). (I admit that’s a contrived example.)

                                                                          Again, obviously, not everything can work this way. There are times when you have to use dynamic allocation, but, at least in my experience, people have a bigger problem tracking reference counts and avoiding use-after-free than they do dealing with fixed-size buffers.

                                                                          1. 4

                                                                            it’s easy to end up with a codebase that is unreadable and ungreppable because of having to continually expand non-intuitive macros

                                                                            That’s true, although macros are also sometimes used to fix the problem that C codebases are often hard to grep in the first place. The Linux kernel uses a whole series of WARN macros partly for that reason. Lots easier to grep for WARN_ONCE in a big source tree than have to pore through every inline use of printk.

                                                                        1. 2

                                                                          How dangerous are Go’s http stack, Phoenix, or Twisted in these regards? Some of them seem like they are easier to avoid in Go (due to its bounded channels), but I’ve not been working at large enough of a scale to witness bugs from this.

                                                                          Addendum: I suppose I’d like to understand how much of this is due to node.js being node.js, and how much of this is inherent to the nature of concurrent servers.

                                                                          1. 1

                                                                             The event-based programming model is the main reason for these issues.

                                                                             The current programming languages and tools are based around the assumption that the code can be read as a novel and not as a ‘choose your own adventure book’. There are some solutions to make things simpler (like implementing channels, typed events, continuations…) but every one has its own issues, especially when you have to handle complex I/O events.

                                                                             Possibly the easiest solution is to keep it as simple as possible and keep the computation in a single loop with non-blocking I/O. This has been the way many games work, and it is straightforward enough if you have good tools to check for loop blocking.
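
                                                                             A bare-bones sketch of that single-loop, non-blocking style in C (poll() on stdin stands in for whatever descriptors a real program watches; the timeout doubles as the tick):

                                                                             #include <poll.h>
                                                                             #include <stdio.h>
                                                                             #include <unistd.h>

                                                                             int main(void)
                                                                             {
                                                                                 struct pollfd pfd = { .fd = STDIN_FILENO, .events = POLLIN };

                                                                                 for (;;) {
                                                                                     int ready = poll(&pfd, 1, 16 /* ms: also the frame/tick timer */);
                                                                                     if (ready < 0)
                                                                                         break;                       /* real code would check errno */

                                                                                     if (ready > 0 && (pfd.revents & POLLIN)) {
                                                                                         char    buf[256];
                                                                                         ssize_t n = read(pfd.fd, buf, sizeof buf);
                                                                                         if (n <= 0)
                                                                                             break;                   /* EOF or error */
                                                                                         /* "get data" event: hand buf/n to the processing step here */
                                                                                     }

                                                                                     /* tick: run the per-iteration computation here, never block */
                                                                                 }
                                                                                 return 0;
                                                                             }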

                                                                             Unfortunately, that greatly limits the kinds of projects you can develop, especially if you don’t have clever primitives for I/O.

                                                                             Personally, after working a lot with node.js in server-side programming, I prefer to use the node.js event model directly, defining a few events in the system (get data / process data / reply with data, plus a timer). Unfortunately, no editor has good tools for working with events as first-class elements (like functions, variables, classes). That means that manually finding out why an event is not triggering a function is a really common (and exasperating) task.

                                                                            1. 1

                                                                              One thing that is nice about Go and Elixir/Erlang is that both of them, via CSP/Actors, make it very easy to create your own event loops, but treat I/O as effectively blocking. But I wouldn’t say that either Go or Erlang are event oriented as such.

                                                                              What kind of complicated I/O events cause problems with channels, if I may ask?

                                                                          1. 2

                                                                             Any idea how easy it is to incorporate it into a system currently using the Boehm GC?

                                                                            1. 5

                                                                              This is mostly regarding the original Spolsky post, but if any piece of writing requires explication to its intended audience, the author has failed his task.

                                                                              1. 6

                                                                                Considering programming, I expect the audience to evolve. What’s informative for the next maintainer when they’re coming up to speed is likely to get in their way as they maintain the code longer.

                                                                                1. 2

                                                                                  Suppose that I accept that:

                                                                                  if any piece of writing requires explication to its intended audience, the author has failed his task.

                                                                                  but this doesn’t help us to solve the actual problem.

                                                                                   Today almost all the code written requires additional comments or documents to reach its intended audience, so… what should we do to solve the current mess that is reading other people’s code?

                                                                                1. 10

                                                                                  A part of me feels that this might bring something interesting to the literate programming table. The nice thing about this layout is it assumes you’re here to deeply study something. The definitions are off to the side—there for you if you need them, not taking up space in the exposition if you don’t. The discussion, especially in Talmud, tends to be much much larger than the text. That seems likely to be the case in software too, but whenever I think about doing literate programming I find a lot of dull code that really requires no elaboration.

                                                                                  For a while, everybody seemed to be using something to present Javascript and commentary side-by-side; I don’t recall whether this is the thing or not but Groc certainly looks like it. I liked this style and it seemed to be taking off a few years ago but nobody seems to be doing this anymore.

                                                                                  I wind up with the same problem here I always have with literate programming, which is that most code just isn’t that interesting. What code do I want to study linearly? Usually I’m either writing new code from whole cloth or else debugging someone else’s edifice. I’m not likely to brew a pot of coffee and light some candles, sit down and really get romantic with someone else’s codebase. But maybe I should? Maybe that would make us better programmers…?

                                                                                  1. 14

                                                                                    The nice thing about this layout is it assumes you’re here to deeply study something. The definitions are off to the side—there for you if you need them, not taking up space in the exposition if you don’t….I wind up with the same problem here I always have with literate programming, which is that most code just isn’t that interesting. What code do I want to study linearly? Usually I’m either writing new code from whole cloth or else debugging someone else’s edifice.

                                                                                    It’s not that “most code just isn’t that interesting”; it’s rather that most code isn’t that interesting to the author, and that happens because they are lugging around tons of context in their head.

                                                                                    For the last week or so, I’ve been on-again, off-again trying to get Factor rendering on Windows at arbitrary DPIs. Over the course of this effort, Factor’s text rendering has gone from being a black box to being boring, and I am getting there with Windows' Uniscribe API. And, honestly, both turn out to be quite straightforward; they just involve a ton of knowledge I happened to not to have originally. So when I eventually finish getting this quite right, the resulting code will be boring to me, and it’ll also be boring to people who know Factor’s text system and Uniscribe, but it’ll be completely fascinating to people who don’t know either.

                                                                                    One thing I think we’re not great at in programming is leveraging the fact that the screen is dynamic. I think you’re really close to hitting exactly why I think a format like this could be absolutely amazing for code. What I’d love, conceptually, is to be able to have comments at various interest levels, and to allow people of different skill levels to see different doc levels. For example, for true newbies, it’d honestly be great to show probably literally every function call’s documentation off to the side preemptively, so you can cross-reference them while you look at the code. If you know core Factor, but not the ui.text library, maybe only those functions would be shown. And if you know all of that, you probably just want any inline comments to be shown for things that are weird or unexpected (e.g., GetDpiForMonitor inexplicably returns both a DPI for the X axis and a DPI for the Y axis, even though the documentation guarantees they must be identical, which is the kind of thing I need a quick note on even if I’m an expert in a system).

                                                                                    That’s all doable today, but if we really thought about a Tosafot section, it could be useful in ways that would even apply for “boring” code. Part of the reason I’m able to spool up on Uniscribe so quickly is that it’s an iterative improvement on other text layout systems (QuickDraw, LaTeX, etc.), and that I am already familiar with tons of Unicode issues (normalization, CJK issues, etc.), so the docs casually gloss over tons of major design decisions that “just make sense” to me. But it’d be wonderful if I could see that information wrapping the code also. Maybe LaTeX does something neat here that I could integrate into the system, or maybe there’s this cool trick from Uniscribe that I could borrow and put into the Pango-based libraries. That might be really useful to someone who was looking to improve Factor’s text rendering systems in general, but are “boring” to me because they’re not actually relevant to what I’m doing.

                                                                                    And further updates on that documentation might be great, too. Maybe I note here that I’m making assumptions based on Unicode 9.0, but Unicode 10.0 changes something up (like breaking up the CJK plane because reasons). It’d be great to link to the original rationale, the new standard, and how to do the migration. (A lower-key version of this has been happening with country flags and certain emoji that display radically differently on iOS v. Windows v. Android, despite all three devices properly and reasonably honoring those code points.) Or my note on Windows' GetDpiForMonitor function might change in a world of VR where our monitors take advantage of our eyes tending to scan more side-to-side than up-and-down. I dunno. But it’d be great to be able to see how things change, and what the rationale for the changes were. If that’s why I’m there. Because otherwise, it’s boring.

                                                                                    So yes, all code is boring to someone. But most documentation is fascinating to the right reader, and a programming environment that seamlessly mixed the two could be phenomenal.

                                                                                    1. 7

                                                                                      It’s worth noting that Talmud wasn’t written by one person. On any given page you’re seeing the original author of the biblical text, plus Rashi, plus other rabbis. What may be too much is expecting one programmer to write a literate program. It may only work if one programmer writes the center text and other programmers supply the surrounding text.

                                                                                      1. 4

                                                                                        Oh, absolutely. I deleted this part of my comment, but I was going to note that you really need a networked, live, bidirectionally linked system—something kind of like Project Xanadu, but focused on code—to do what I was describing in a sane way. But maybe one day. :)

                                                                                        1. [Comment removed by author]

                                                                                          1. 4

                                                                                            You’re raising a really valid point, but there are fixes, and I ironically think that they come from completely separating where comments are stored from where code is stored. In a system as dynamic as what I’m describing, doing so would let you say, “This comment was written for v9 of this code, but you’re looking at v15; here’s what the code looked like at v9. Feel free to update this comment or mark it as inaccurate.” We’re honestly heading that direction already with commit messages and hg annotate/git blame; it’d make sense to just own that more completely.

                                                                                            1. [Comment removed by author]

                                                                                              1. 2

                                                                                                I’d love to hear about your experiences with TeX. I took a couple months last year to work with plain TeX and found it quite enjoyable.

                                                                                                1. [Comment removed by author]

                                                                                                  1. 2

                                                                                                    I came into this in the early 2000s, so my first exposure was to LaTeX, and I didn’t think it looked all that nice. Later I came upon ConTeXt, which I found much easier to make look the way I wanted. It’s sort of assumed with ConTeXt that if you need to do something really fancy, you’ll head to either plain TeX or Metafun (or TikZ). So I had the impression that ConTeXt was a lot closer to plain TeX than LaTeX. And I think that planted the seed for me to learn plain TeX. In between, I picked up troff (Heirloom Troff is quite nice actually) and looked at a few other things (DocBook, Lout). I haven’t had much occasion to do typesetting lately, but if I were going to, I’d probably be choosing plain TeX absent other information.

                                                                                            2. 3

                                                                                              Another reply mentions displaying age markers with comments; these look like obsolete-page markers in NethackWiki (which sometimes has source code discussion, too…) Detailed game wikis also seem to mark clearly whether a paragraph is about game mechanics, game strategy, in-universe background, outside cultural references, or technical limitations of implementation.

                                                                                              Maybe in some kinds of code it is a good idea to have a convention of separating generic domain knowledge, local organizational practices, technical approaches used across the codebase, and local technical decisions; sometimes even the rate of change can be predicted.

                                                                                              I guess a separate difference between program code and biblical text is «value density» — those who study the Talmud care about the details of each sentence, but there is a lot of code where the details are just not worth caring about; we value not needing to read code in order, and not needing to pay attention to everything to find the things we are looking for.

                                                                                        2. 3

                                                                                          For a while, everybody seemed to be using something to present JavaScript and commentary side-by-side; I don’t recall whether this is the thing or not, but Groc certainly looks like it. I liked this style and it seemed to be taking off a few years ago, but nobody seems to be doing it anymore.

                                                                                          You’re probably talking about Docco and its spinoffs: Pycco (I’m the maintainer), Marginalia, and others.

                                                                                          1. 2

                                                                                            Thanks!

                                                                                          2. 4

                                                                                            Literate programming is one of the biggest failed memes in the last half century of software engineering.

                                                                                            Consider: if the point of programming is to leverage abstraction, what is the bloody point of knowing the details of how something was done? By construction, we all do not care.

                                                                                            1. 7

                                                                                              My issue with literate programming is that it makes refactoring much harder, and the weaving of LaTeX (ugh) and whatever language you’re using can get very annoying to deal with.

                                                                                              1. 4

                                                                                                Well, it depends on what you consider literate programming. If we consider it as defined by Knuth, you’re right about the complexity, as the original tools’ functionality (CWEB and noweb) was never incorporated into IDEs.

                                                                                                But you can use Markdown to do literate programming, and it works nicely (see codedown). Obviously, it is not straightforward to move a million lines of code to this, but starting a new program with Markdown as the literate programming tool is easy and a good way to get started with literate programming.
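
                                                                                                For example (the file below is made up, and the exact codedown invocation may differ), a literate source is just an ordinary Markdown document:

                                                                                                    # Counting words

                                                                                                    We keep a running tally of each word in a hash table:

                                                                                                    ```lisp
                                                                                                    (defun count-words (words)
                                                                                                      (let ((counts (make-hash-table :test 'equal)))
                                                                                                        (dolist (w words counts)
                                                                                                          (incf (gethash w counts 0)))))
                                                                                                    ```

                                                                                                Extracting the compilable code is then a single pipeline step, something like codedown lisp < counting.md > counting.lisp, so the prose stays the primary artifact.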

                                                                                          1. 2

                                                                                            Setting up continuous integration in a GitLab open-source project. Not the best tool in the world for compiled software, but good enough for our needs.

                                                                                            1. 2

                                                                                              Other than technical or philosophical issues, the main change of the git/GitHub revolution has been this:

                                                                                              1. Burden of VCS maintenance pushed to contributors

                                                                                              I no longer contribute to a lot of projects, but I still remember when sending a patch was as easy as downloading the zip sources (or doing a cvs checkout), copying the files into a new directory, editing, and running diff, then sending the diff file to the maintainer.

                                                                                              Now things are hairier and I fail to see the benefit. Don’t get me wrong, GitHub has greatly extended the visibility of software projects and added a ticketing system, but they did it in spite of git.

                                                                                              1. 3

                                                                                                I agree with you about the GitHub pull request model (which Bitbucket, GitLab, etc. have copied), but that is not git’s fault. Applying a diff is easy in git [git apply <foo.diff>]. Git provides a way to prepare a mail with the patch (git format-patch) and to send it (git send-email), to make the contributor’s life easier.

                                                                                                Too bad this approach is not more common for FLOSS projects these days. Not only is the GitHub pull-request model more complex, but in practice it locks valuable information about the project’s history on GitHub, as most people these days explain the changes in the pull request comment, not in the commit message.

                                                                                              1. 2

                                                                                                On the technical side:

                                                                                                On the non-technical side:

                                                                                                1. 4

                                                                                                  Pardon my French, but holy #@$%! I’ve heard of macros but I never realized/thought they could change the SYNTAX of the language. I always thought they were some kind of improved version of functions/classes, but this is major.

                                                                                                  I have a question though: how much runtime overhead is this? Say, for this particular example, is the compiled code equivalent to simply generating a hash table (i.e. no run-time overhead), or does the string get parsed (i.e. everything in set-macro-character) at runtime?

                                                                                                  1. 3

                                                                                                    Well, that’s the objective of macros: not only to change the syntax but also to create new data structures. Reader macros are the simplest ones.

                                                                                                    One of the classical uses of macros is to define automata (and parsers) at compile time, skipping tools like lex/yacc: one simple example in Lisp, and something more interesting in Scheme.
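
                                                                                                    As a rough illustration of the idea (a toy macro of my own, far simpler than the linked examples), the transition table can disappear entirely at macro-expansion time:

                                                                                                        (defmacro defautomaton (name initial &body transitions)
                                                                                                          ;; TRANSITIONS are (STATE INPUT NEXT-STATE) triples; the expansion is
                                                                                                          ;; plain conditional code, so no table is consulted at run time.
                                                                                                          `(defun ,name (inputs)
                                                                                                             (let ((state ',initial))
                                                                                                               (dolist (input inputs state)
                                                                                                                 (setf state
                                                                                                                       (cond ,@(loop for (from on to) in transitions
                                                                                                                                     collect `((and (eq state ',from) (eql input ',on))
                                                                                                                                               ',to))
                                                                                                                             (t (error "No transition from ~S on ~S" state input))))))))

                                                                                                        ;; (defautomaton recognize-ab start (start a saw-a) (saw-a b start))
                                                                                                        ;; (recognize-ab '(a b a b)) => START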

                                                                                                    As usual, C2 wiki is always an interesting read.

                                                                                                    1. 3

                                                                                                      I’ve heard of macros but I never realized/thought they could change the SYNTAX

                                                                                                      That is because the OP is about reader macros, not ordinary macros. Macros are, more or less, functions that run at compile time and operate on code (which is made of Lisp’s own objects), or on syntax objects in the case of Racket.

                                                                                                      1. 2

                                                                                                        Macros are expanded at compile time, so there’s only a difference when the file’s loaded; at runtime, the code will be identical to you constructing the hash table yourself.

                                                                                                        The code here is actually using a reader macro, which means a) the code’s actually more involved here than in most macros, and b) it’s expanded at read time, rather than at compile time. This is necessary because {a => 1} isn’t something Lisp knows how to read in by default, so we need to tell it how.
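
                                                                                                        To the overhead question upthread: a rough sketch of how such a reader macro can work (my own version, not the article’s exact code) makes the timing clear. The brace text is parsed exactly once, when the source is read (for a compiled file, that means at compile time), and what survives into the compiled code is just the hash-table construction you would otherwise have typed by hand:

                                                                                                            (defun read-brace-table (stream char)
                                                                                                              (declare (ignore char))
                                                                                                              ;; Read everything up to the closing brace: (a => 1 b => 2 ...).
                                                                                                              (let ((items (read-delimited-list #\} stream t))
                                                                                                                    (table (gensym "TABLE")))
                                                                                                                (loop for (key arrow value) on items by #'cdddr
                                                                                                                      do (assert (eq arrow '=>))
                                                                                                                      collect `(setf (gethash ',key ,table) ,value) into sets
                                                                                                                      finally (return
                                                                                                                                `(let ((,table (make-hash-table :test 'equal)))
                                                                                                                                   ,@sets
                                                                                                                                   ,table)))))

                                                                                                            (set-macro-character #\{ #'read-brace-table)
                                                                                                            (set-macro-character #\} (get-macro-character #\)))

                                                                                                        So nothing inside the brace handler runs when your program runs; by then it has already been turned into ordinary code.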

                                                                                                      1. 1

                                                                                                        What is functional programming? It is monads, functors, Haskell, and elegant code.

                                                                                                        Candid question: who is he? Some kind of rock star or prophet? I suppose he wanted to get people’s attention, but…