1. 6

    I am currently reading the Git Pro book hosted on git-scm.com. It is freely available as an ebook in many formats.

    Having read about 1/3 so far I can really recommend it. It is well written, and even after using git for years there is plenty of stuff to learn from it, be it internals that become clearer or different approaches on how to use it.

    1. 1

      I think you meant “Pro Git”. :) (I agree, it’s worth reading.)

      1. 1

        Yeah I botched that one up, thanks for the correction. :)

      2. 1

        The “Git Internals” chapter is pretty eye-opening. While not strictly necessary for using git, I found that understanding how the underlying content-addressable store fits into how a repo is put together pulls away nearly all the magic.
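
        If you want to poke at that store directly, the plumbing commands from that chapter can be replayed in a throwaway repo (the “test content” example is the one the book itself uses; the temp-dir setup here is just to keep things self-contained):

        ```shell
        #!/bin/sh
        # Sketch: the content-addressable store in action. Every object
        # lives under the SHA-1 of its content, so the same bytes always
        # get the same name. Run in a throwaway repo.
        tmp=$(mktemp -d) && cd "$tmp" && git init -q

        # -w writes the object into .git/objects and prints its hash:
        hash=$(echo 'test content' | git hash-object -w --stdin)
        echo "$hash"                 # d670460b4b4aece5915caf5c68d12f560a9fe3e4

        # cat-file reads it back by hash: first its type, then its content.
        git cat-file -t "$hash"      # blob
        git cat-file -p "$hash"      # test content
        ```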

      1. 2

        Given that this is by someone at Google I wonder whether this is related to Go’s famous:

        Function should have comment or be unexported

        1. 3

          You can tell they aren’t go programmers because they recommend replacing

          int width = ...; // Width in pixels.
          

          with

          int widthInPixels = ...;
          

          while any competent go programmer would recommend replacing it with

          int w = ...;
          

          and perhaps even a zero-width space if go2 will allow it.

          (yes this is a joke)

          1. 1

            Famous it might be, but that error message is new to me, so thanks for posting it. I think it’s rather lovely that the compiler enforces that.

            1. 2

              I agree. It’s one thing I really like. Especially when it makes people reflect and describe it in a different way than the implementation. Code comments are also great to explain not what you do (the steps), but also why you do them and what the goal is and sometimes writing the documentation alone, which makes one think about a function is enough to create a clearer and “better” solution.

          1. 24

            Has anyone else found that making many small helper methods tends to make reading the code much more difficult though? This is something that I spend a lot of time musing about, and writing half-assed tools about (I have a bad scheme clone of imatix’s gsl).

            When I see a function that is just a list of helper functions, I have to zip around looking up the implementations of all of them! Especially when they’re not re-used across the code base, you’re learning a whole set of semantics that are unnecessary for your problem domain.

            If the top level function was fully documented, then I’d only need to do that when I change something. Which is an improvement, but then it means that changing something is the slowest part, which is the opposite of maintainable code.

            A function name is only better than a line comment in that it delineates the start and the end of the code it’s referring to, unless it’s reused, but YAGNI if it’s not reused already. You could mark sections within comments like pre-function programming…

            Ah shit. I think I’m reverse engineering literate programming again.

            1. 10

              Has anyone else found that making many small helper methods tends to make reading the code much more difficult though?

              Depends on the length and complexity of the function.

              I think it’s much cleaner to have a function calling a list of well named functions each with a specific purpose than many lines of if else switch for loops etc.

              As far as maintenance, I suspect you know what you’re trying to modify and can go directly to which function is requiring the modification. Why would you need to zip around looking at all of them?

                1. 8

                  I agree. My understanding is that this is partly down to people applying generally good rules in a far too draconian way. So while it’s great that code quality is something people look at nowadays and build tools to improve, I think they’ve arrived at a point where those tools can be either counterproductive or too limited.

                  Yeah, it’s good to have short functions that aren’t too complex, but I think everyone has seen code that became clearer when its size increased.

                  My favorite example is cyclomatic complexity, which is a great concept for quantifying complexity and giving a clue about when it’s worth separating things out; however, splitting that complexity up just to make the code-quality tool happy can sometimes make it even more of a nuisance.

                  I don’t know what the solution is, and I have seen different bad outcomes of bringing code quality (tools) into companies. Next to justifying insanity by throwing around some quote or some tool, I’ve seen the other side as well: starting to use such a tool and, instead of taking its input seriously, simply configuring it so that the code just passes the checks. Beyond that, I have seen extremely smart ways, as well as silly patterns, of working around code quality checkers, leaving the code harder to read rather than easier.

                  Something these tools do very badly is that they are often not made with the programming language in mind, and for obvious reasons not with the programming task at hand. So the outcome is usually that writing something like parsers or converters in a readable and efficient way makes most of these tools think the code is horribly complex.

                  I think general statements are hard to make. Sometimes I prefer to split stuff up into helpers; sometimes it is way clearer to write “paragraphs” of code with a comment above each, but only when the code cannot be reused because it depends on what happens before and after it, and the context/the number of variables on the stack is both big and specific. Even then I wouldn’t consider that a general rule.

                  A while ago I read an article about testing. If I remember correctly it was linked here, but I can’t find it right now. It argued that not every test needs to be written, nor every single line of code tested, even though testing is extremely important and something people should do. It is still nonsense not to test code at all, just like it is nonsense to cram big programs into only a few functions.

                  In other words: In my opinion, part of what make a programmer a good/experienced programmer is knowing what the right thing to do is.

                  1. 3

                    My favorite example is cyclomatic complexity

                    So I decided to play with oclint and wrote essentially one function in three different styles. I also snuck a typo in one of them to see if the tool would give a shit (and it wouldn’t). oclint complains about fun2, which is arguably the most straightforward and a reasonably idiomatic way to write this kind of code in C. It does not complain about complexity in the other two functions…

                    It also nags about all the variable names because they are too short. How useless. Here’s the code:

                    int
                    fun1(int op, int l, int r)
                    {
                        return
                          op == 1 ? l + r
                        : op == 2 ? l - r
                        : op == 3 ? l * r
                        : op == 4 ? l / r
                        : op == 4 ? l % r
                        : op == 6 ? l << r
                        : op == 7 ? l >> r
                        : op == 8 ? l == r
                        : op == 9 ? l != r
                        : 0;
                    }
                    
                    int /* line 17 */
                    fun2(int op, int l, int r)
                    {
                        if (op == 1) return l + r;
                        if (op == 2) return l - r;
                        if (op == 3) return l * r;
                        if (op == 4) return l / r;
                        if (op == 5) return l % r;
                        if (op == 6) return l << r;
                        if (op == 7) return l >> r;
                        if (op == 8) return l == r;
                        return 0;
                    }
                    
                    int
                    fun3(int op, int l, int r)
                    {
                        switch (op) {
                        case 1: return l + r;
                        case 2: return l - r;
                        case 3: return l * r;
                        case 4: return l / r;
                        case 5: return l % r;
                        case 6: return l << r;
                        case 7: return l >> r;
                        case 8: return l == r;
                        default: return 0;
                        }
                    }
                    

                    Summary: TotalFiles=1 FilesWithViolations=1 P1=0 P2=1 P3=9

                    /tmp/ocl/oclint-0.12/bin/cat.c:17:1: high npath complexity [size|P2] NPath Complexity Number 256 exceeds limit of 200

                    1. 1

                      Now suppose our poor programmer had started with fun2 and thought that there was no way to cheat the tool and reduce complexity without breaking the function down into smaller pieces… tada, more code, a layer of indirection, and an extra pair of checks to maintain behavior with the old numbering scheme! And now the tool will happily accept it (well it still nags about names that are shorter than three letters).

                      Only now it’s much harder for somebody reading the code to determine which number matches which operation. Of course one could introduce yet another layer of indirection by replacing the numbers with identifiers, and then you also need to make sure the table & defines stay in sync. If that’s too much work, we can introduce yet more complexity by having another tool generate the table at build time.. anything goes as long as the tool is happy!

                      int op_add(int l, int r) { return l + r; }
                      int op_sub(int l, int r) { return l - r; }
                      int op_mul(int l, int r) { return l * r; }
                      int op_div(int l, int r) { return l / r; }
                      int op_mod(int l, int r) { return l % r; }
                      int op_shl(int l, int r) { return l << r; }
                      int op_shr(int l, int r) { return l >> r; }
                      int op_eq(int l, int r) { return l == r; }
                      
                      int (* const optab[])(int, int) = {
                          NULL,
                          op_add,
                          op_sub,
                          op_mul,
                          op_div,
                          op_mod,
                          op_shl,
                          op_shr,
                          op_eq,
                      };
                      
                      int
                      fun4(int op, int l, int r)
                      {
                          if (op < 1 || op >= sizeof optab / sizeof *optab )
                                  return 0;
                          return optab[op](l, r);
                      }
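
                      If one did take the identifier route, C99 enums plus designated initializers can at least keep the table and the numbering in sync without a separate set of defines. A sketch (fun5 and the OP_* names are made up here):

                      ```c
                      #include <stdio.h>

                      /* Sketch: an enum plus C99 designated initializers, so the
                         dispatch table can't silently drift out of sync with the
                         operation numbers. */
                      enum op { OP_NONE, OP_ADD, OP_SUB, OP_COUNT };

                      static int op_add(int l, int r) { return l + r; }
                      static int op_sub(int l, int r) { return l - r; }

                      static int (* const optab[OP_COUNT])(int, int) = {
                          [OP_ADD] = op_add,
                          [OP_SUB] = op_sub,     /* unlisted slots stay NULL */
                      };

                      int
                      fun5(int op, int l, int r)
                      {
                          if (op <= OP_NONE || op >= OP_COUNT || !optab[op])
                              return 0;
                          return optab[op](l, r);
                      }

                      int
                      main(void)
                      {
                          /* prints "8 2 0": add, sub, and an out-of-range op */
                          printf("%d %d %d\n", fun5(OP_ADD, 5, 3),
                                 fun5(OP_SUB, 5, 3), fun5(42, 5, 3));
                          return 0;
                      }
                      ```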
                      
                      1. 1

                        I think the tool’s right. With fun2 the flow is much less obvious; in fun1 and fun3 it’s immediately clear that only one path will be taken.

                        1. 2

                          What in fun2 would even suggest that more than one path may be taken?

                          And then why does that not apply to fun3? Both are reliant on returns, nicely lined up.

                          fun1 is the least idiomatic of them all, and while the way I lay it out should give a strong clue about its behavior, it might still take a moment longer for the average C programmer to determine that it does indeed do what it should (the sneaky typo aside).

                          1. 2

                            What in fun2 would even suggest that more than one path may be taken?

                            There are a bunch of different if statements. A priori, any combination of them is possible.

                            And then why does that not apply to fun3? Both are reliant on returns, nicely lined up.

                            It does to a certain extent, but less so, because a switch is an immediate indication that only one path will be taken - fall-through is so rarely desirable that the reader is inclined to assume it won’t happen.

                            fun1 is the least idiomatic of them all

                            Interesting, because I found it the clearest / most readable.

                    2. 4

                      I like style C with blocks

                      var x int
                      { // minor function 1
                          ...code..
                          x = whatever
                      }
                      

                      my coworker hates it though

                      1. 4

                        I like to use the same pattern. It’s nice because you don’t have to go around looking for tiny helper functions that only get called once, while still properly scoping the lifetime of your variables so readers can forget about them when leaving a scope.
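
                        For what it’s worth, plain C allows the same thing with bare blocks; a small sketch (all names invented):

                        ```c
                        #include <stdio.h>

                        /* Sketch of the bare-block style: each "paragraph" of work
                           gets its own scope, so its temporaries are invisible to
                           everything that follows. */
                        static int
                        area_minus_border(void)
                        {
                            int area;
                            { /* minor step 1: compute the area */
                                int w = 10, h = 5;
                                area = w * h;
                            }

                            int border;
                            { /* minor step 2: w and h above are already gone */
                                int thickness = 1;
                                border = 2 * thickness * (10 + 5);
                            }

                            return area - border;
                        }

                        int
                        main(void)
                        {
                            printf("%d\n", area_minus_border());  /* prints 20 */
                            return 0;
                        }
                        ```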

                    3. 8

                      I find the “many small helpers” style unbearable when any of the helpers mutate shared state.

                      Having many short pure functions can still be confusing, but it’s vastly more manageable for me.

                      1. 7

                        Has anyone else found that making many small helper methods tends to make reading the code much more difficult though?

                        Yes, I’ve been saying this for a while.

                        Show me your half-assed tools and I’ll show you mine.

                        1. 3

                          NOW THAT’S A WHOLE ASS, not half. That’s awesome!

                          Notes on yours: Reminds me a lot of the tanglec program that the Axiom guy, Daly, came up with. I like your notation a lot, but I’m not sure I like the before/after notation… I’ll take a deeper look at this and try to think more clearly about it. The testing part is fascinating, totally different ideas than I would ever think of.

                          The iMatix guys take it a different way and say that your literate programs have “models” within them that you can pull out and give some meaning to, by encoding them in well documented XML, which is then expanded into code.

                          Mine doesn’t support looping correctly yet in the template language, because I don’t want to re-invent monads. But here it is. It doesn’t quite cover what you’re doing, as I’m trying to go with a higher-order concept of “models” that the gsl folks came up with.

                          I have some work locally that’s about joining these two ideas. Scheme as the meta-language, SXML/XML “models” to contain program models, and tangle/weave (with TeX or HTML, not sure yet) for source style. It’s extremely half-baked as there are a lot of missing parts.

                          1. 4

                            Thanks! Yes, I’m aware of Tim Daly’s work and we’ve exchanged comments in various places on the internet. I tend to think his emphasis on typography – shared with classic Literate Programming – is a tarpit. But it’s working for him, and I don’t really have complete confidence in my approach yet ^_^

                            I hadn’t come across the iMatix/gsl approach, do you have a pointer to learn more about it? That’ll help me gain context on your tool. Thanks.

                            1. 2

                              I tend to think his emphasis on typography – shared with classic Literate Programming – is a tarpit.

                              Having tried it, I am inclined to agree – at least for myself, who is susceptible to typography twiddling. On the other hand, are you familiar with Docco, the “quick-and-dirty, hundred-line-long, literate-programming-style documentation generator”? I have had good experiences with it.

                              Docco (and its clones) take as input ordinary source files containing markdown-formatted comments, and render them to two-column HTML files with comments to the left, and the code blocks that follow them to the right. That encourages writing the comments as a narrative, but limits the temptation to twiddle with typography – it basically lets you keep a nicely read- & skimmable version of the source code at hand at all times.

                              All of this is far more modest than what you are working on, of course. I’ll be keeping an eye on it!

                              1. 2

                                I’m a little familiar with Docco. As you said, it’s a little better because you don’t tweak the presentation endlessly. However, I think it also throws the baby out with the bathwater by dropping the flexibility to reorder parts of a program to present them in the most comprehensible way. There’s a few “quasi-literate” systems like that out there. They’re certainly better than nothing! But I think they have trouble scaling beyond short single-file examples.

                              2. 2

                                The best place to see a good example is: https://imatix-legacy.github.io/mop/index.html They are the ones that wrote ZMQ.

                                The basic idea is simple. Put some data in a serialized hierarchical format that is also very easy for humans to read (for gsl, this is XML). It’s best if it has a CDATA-like feature, for unquoted/raw data.

                                Next, there is what is called the “template”. This is another format of data that you apply to your xml files to generate output.

                                Finally, there is what is called the “script”. This lets you do things like include xml files as subtrees of others, run them through multiple templates, and execute pretty basic code.

                                Through these pieces, you can generate what gsl calls the model. With a given set of scripts and templates, you can hypothetically reduce the “uniqueness” of your implementation to a series of XML files. zproto and other tools use this to actually generate entire server & client implementations from XML descriptions of your format. You can see a description of how zproto kind of works here: http://hintjens.com/blog:75

                                So it’s the happy combination of a templating language, some xml / hierarchy manipulation tooling, and a simple scripting language. I then took it into whacko land by saying all of that just sounds like lisp macros with some custom reader settings, so why not implement each of these three pieces as scheme, and add on top my desire for literate programming as part of the “generation” work it does. It’s not very far off the ground yet, but I do think there is something in here. We’ll see.

                                1. 3

                                  Most interesting. I see the connection you’re drawing with Lisp macros. It sounds like GSL is basically lisp macros that get a tree of data rather than just individual variables. So like lots of nested pattern matching on the s-expression you pass into the macro, so that you can name different parts of it to access them more conveniently. Does this seem right? Now I wish I could read more about it, and that the original series had more posts :)

                            2. 3

                              That looks like it ends up equivalent to AOP, which is a nightmare to debug when done badly (and any widely-used programming tool ends up being used badly). I read of someone doing something like this in Ruby that was allegedly highly maintainable - writing the simple version of a function, thoroughly testing it, then locking it down and using monkeypatching to handle cross-cutting concerns rather than complicating the original function - but it’s so contrary to my experience that I struggle to believe it.

                              What I prefer is something like the “final tagless” style - just enough inversion of control, write your function in terms of commands and you can invoke the same function in either a simple or a complex context, provided it offers those commands.

                              1. 3

                                AOP is very different from MOP in terms of actual day-to-day usage and thought pattern, but I think you’re entirely right that “how will it be used badly?” is a great pattern to try and compare approaches.

                                One part of AOP that makes things more difficult is there is no way to see the steps/expansions visually anywhere other than with tooling. With MOP (largely, I’m only talking about gsl), it’s something explicit in your templates, and you can run iterations of gsl to see what is generated very easily. Hypothetically you could use gsl to implement AOP-style programming, but most of the culture around MOP is to focus on the “model” that is repeated in your business logic (think state machines) and making them easier to see. AOP is much more about taking large cross cutting things like logging or error handling and rolling them up into these aspects.

                                As I’m thinking more about this though, I’m seeing more of your point. It’s a slippery slope to just munging crap everywhere.

                              2. 3

                                This is extremely interesting! I am a big fan of literate programming because it allows me to create a narrative for understanding my program, but LP is not without its problems: you often don’t get access to your regular development tools (syntax highlighting and auto-indenting for your language, integration with IDE-like utils such as Racer for Rust) and this makes development more difficult and painful.

                                Your idea allows the creation of a similar narrative—order the fragments in an order that tells the story to the reader—but also gives the programmer access to his usual coding environment, save perhaps for a quick fix to allow for the directive at the top.

                              3. 4

                                A function name is better in another way: it doesn’t become outdated as easily as a comment.

                                Has anyone else found that making many small helper methods tends to make reading the code much more difficult though?

                                I have found it makes reading the code somewhat more difficult, on the first read through, iff I need to thoroughly understand or review all the code. Once I’ve done that, it makes reading the code easier, because I don’t need to parse every line to understand what it does: it tells me what it does and I know I can trust the name to be accurate.

                                If I only need to fix a bug and know I can trust the author, I know I don’t need to check all the implementations.

                                If you constantly need to check all the implementations, then it indeed only hurts to make many small well named methods. But I assert that means you are doing something wrong.

                                1. 2

                                  Look up the implementation? I’m confused: why wouldn’t the implementation be in the function which contains the helper functions?

                                  1. 2

                                    Look at the link that mikejsavage posted above for the “helper” functions I’m thinking of.

                                    Putting the functions inline inside a function is only useful in languages that support that style (which c++ does wonderfully now), but I’m thinking about a slightly higher point around documentation.

                                    1. 1

                                      Oh, so our definitions of helper functions are entirely different things o__o. Cool. I guess the author did say helper methods, not helper functions.

                                  2. 1

                                    Nope. Small helpers make it easier to read, provided they’re well-named and well-typed. As @danielrheath says, if they mutate state they’re very hard to understand - but don’t do that, thread state through explicitly instead.
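
                                    A minimal sketch of what “thread state through explicitly” can look like in C; the helpers take the state and return the new value instead of touching anything shared (all names here are invented):

                                    ```c
                                    #include <assert.h>

                                    /* Pure helpers: each takes the state and returns the
                                       updated copy, so the data flow is visible at every
                                       call site and nothing is mutated behind your back. */
                                    struct acct { int balance; };

                                    static struct acct deposit(struct acct a, int amount)
                                    {
                                        a.balance += amount;
                                        return a;
                                    }

                                    static struct acct withdraw(struct acct a, int amount)
                                    {
                                        a.balance -= amount;
                                        return a;
                                    }

                                    int
                                    main(void)
                                    {
                                        struct acct a = { .balance = 100 };
                                        a = deposit(a, 50);   /* state threaded explicitly */
                                        a = withdraw(a, 30);
                                        assert(a.balance == 120);
                                        return 0;
                                    }
                                    ```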

                                  1. 10

                                      I like using inoremap jk <Esc> instead of inoremap jj <Esc>. It’s quicker to hit two keys in quick succession, and it has the benefit of being mostly a no-op in normal mode since you just go down a line and then up a line, which is nice if you have a nervous habit of returning to normal mode even though you might already be in it.

                                    1. 2

                                      I am a big fan of using jk, and like you have never run into any issues where I need to type “jk” in insert mode. I recently switched to spacemacs, where they introduce a default of “fd” which I have found similarly ergonomic.

                                      1. 1

                                        So doing that would cause typing ‘jk’ in insert mode to return you to normal mode? What would happen if you actually wanted to type ‘jk’?

                                        1. 4

                                          In three years that I’ve used jk, it’s never been an issue.

                                          That would change when someone invents texting integration into vim.

                                          1. 4

                                            You need a small delay so that the chord doesn’t register; it’s about one second. So you type j, wait a second, then k.

                                            1. 3

                                              Please keep in mind this timeout is configurable via timeout and ttimeout.
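
                                              For reference, a sketch of the relevant vimrc settings (the values are just examples; see :help timeoutlen):

                                              ```vim
                                              set timeout          " time out on mappings such as inoremap jk <Esc>
                                              set timeoutlen=500   " wait 500 ms for the next key of a mapping (default 1000)
                                              set ttimeout         " also time out on terminal key codes
                                              set ttimeoutlen=10   " ...but resolve a bare <Esc> almost immediately
                                              ```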

                                            2. 1

                                              Hasn’t ever happened to me either. If it did, you would just have to hit j, then wait a second for the multi-key timeout to expire and the j to actually appear, then hit k.

                                            3. 1

                                              Me too! IMO, jj just doesn’t feel right… It will probably become more of a habit when I start using a new mbp.

                                              1. 1

                                                Some people remap jk and kj to esc, so all you need to do is press j and k at roughly the same time to get esc. I am too used to jj to do that, but you might want to try it.

                                              1. 10

                                                I can’t get over how incredibly good the erosion pass makes it look. Super cool!

                                                1. 12

                                                  Agreed. That is incredibly impressive. Long ago, out of frustration with fractal-based continent outline generation, I studied and extended a detailed (if inaccurate) tectonic simulation that someone else had written. What I’d hoped to get out of it was coastlines that had clearly developed from natural processes, and it never delivered anything remotely like that.

                                                  The river and erosion algorithms used here must have been a lot of work for the inventor, and I had no idea anything this good was being done.

                                                  The part that renders it as line art so it feels like a map is also really cool, and I think does a lot to impress people even though it’s “just” presentation.

                                                  1. 3

                                                    tectonic simulation

                                                    Sounds interesting! I jumped to google and found something with cute results here, paper here

                                                1. 5

                                                  I really appreciate the comment on linear algebra. I don’t think I fully understood the power of composing matrices until after college when I started to get into computer graphics.

                                                  1. [Comment removed by author]

                                                    1. 19

                                                      Perhaps to prevent attacks where someone can be MitM’d, given some data marked as immutable (maybe even data that the actual site never would mark as immutable) and have that persist long after the fact.

                                                      1. [Comment removed by author]

                                                        1. 1

                                                          Exactly. This response can only be trusted over HTTPS.

                                                          1. [Comment removed by author]

                                                            1. 3

                                                              A browser trusting the response to really mean what it says might never ever check to see whether the resource has changed upstream. Because that’s its intended purpose. That would mean that the malicious content from a single MitM would get kept and used forever, even later when browsing from a secure network.

                                                      2. 4

                                                        I remember reading somewhere that browsers were discussing only adding new features over HTTPS as an incentive to get sites to upgrade

                                                      1. 1

                                                        Completely agree with the author on Expert C Programming, it’s a wonderful read.

                                                        1. 11

                                                          It’s unfortunate the blog post doesn’t mention this, but if you’re already routing all your dns queries through dnsmasq, you can mitigate the exploit until all your packages and such are updated by setting dnsmasq’s edns-packet-max to 1024.
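
                                                          For anyone who wants to try it, the option goes in dnsmasq.conf; the 1024 value is the mitigation suggested above, not a dnsmasq default:

                                                          ```conf
                                                          # /etc/dnsmasq.conf
                                                          # Limit the EDNS.0 UDP packet size dnsmasq is willing to
                                                          # handle (its default is 4096), keeping oversized DNS
                                                          # replies from reaching vulnerable resolvers behind it.
                                                          edns-packet-max=1024
                                                          ```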

                                                          1. 1

                                                            Does that only apply to udp packets? The man page seems to imply that is the case, while the CVE says large tcp responses are also a problem.

                                                          1. 4

                                                              Controlling the mouse sounds neat, but given the option I’d opt for just using a window manager designed for mouse-free movement from the start.

                                                            1. 7

                                                                in another thread we mentioned being more accepting of “progressive” c. so i thought i’d share this (sorry if self-links are not accepted - i checked the guidelines and it seemed ok), which was my attempt at pushing the limits of how c might be programmed.

                                                                the example code at https://bitbucket.org/isti/c-orm/wiki/FlowPhonebook shows the full flavour. error handling (return codes) is systematized with macros; functions are “namespaced” (in structs); the ORM itself includes dynamically generated code (from a python script).

                                                              i’m not saying all c programs should be like this. it was an experiment…

                                                              1. 4

                                                                  Looks neat! If you’re interested in taking C to its absolute limits along those same lines you might want to check out libCello.

                                                                1. 1

                                                                  oh, that’s interesting. one thing i tried to avoid was anything that forced changes globally in the user’s program. so, for example, even though there’s a string type, the interfaces are mainly char*. i wonder how “invasive” or “polluting” cello is (especially exceptions). but i need to look in more detail. thanks.

                                                                  [edit: it’s making me feel guilty that i needed code gen for so little…!]