1. 2

    I feel like the first example can be solved more elegantly by doing something like

    l = [0] * 10
    i = 0

    def rate_limit():
        global i
        if now() > l[i]:
            l[i] = now() + 60  # one minute
            i = (i + 1) % len(l)
            return False  # not limited; proceed
        return True  # limited
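
    For reference, a runnable version of that sketch (my own sketch, not from the post: time.monotonic() stands in for now(), and module-level state is kept for brevity):

```python
import time

SLOTS = 10       # up to 10 calls...
WINDOW = 60.0    # ...per minute

expiry = [0.0] * SLOTS  # circular array of per-slot expiry times
slot = 0

def rate_limit():
    """Return False when the call may proceed, True when it is
    limited (keeping the sketch's convention)."""
    global slot
    if time.monotonic() > expiry[slot]:
        expiry[slot] = time.monotonic() + WINDOW
        slot = (slot + 1) % SLOTS
        return False
    return True
```

    The first ten calls claim the ten slots; the eleventh finds its slot still unexpired and is limited.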
    
    1. 2

      It’s roughly the same as in the post. The main difference is using a circular array to keep track of the times instead of a linked list:

      l = new Queue()

      def rate_limit():
          while not l.empty() and l.peek() < now():
              l.pop()

          if l.len() >= 10:
              return false
          else:
              l.push(now() + 10 minutes)
              return true
      

      In fact, you can replace the while loop with an if statement and it will still work.
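
      For reference, a runnable version of the sketch (my own, not from the post: collections.deque for the queue, time.monotonic() for now(), and the 10-call / 10-minute numbers from the pseudocode above):

```python
import time
from collections import deque

MAX_CALLS = 10
WINDOW = 10 * 60  # ten minutes, in seconds

expiries = deque()  # expiry timestamps of recent allowed calls

def rate_limit():
    """Return True if the call is allowed, False if rate-limited."""
    now = time.monotonic()
    # Drop entries whose window has passed (an `if` also works,
    # as noted above, since each allowed call retires at most one
    # stale entry in steady state).
    while expiries and expiries[0] < now:
        expiries.popleft()
    if len(expiries) >= MAX_CALLS:
        return False
    expiries.append(now + WINDOW)
    return True
```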

      1. 1

        You’re only allowing one call per minute; you need to allow N calls per minute, but the spirit is the same.

        1. 1

          There are n calls per minute.

          1. 1

            Oh I see, n = 10 in this case.

      1. 9

        if you’re the author, don’t you have a non-twitter link?

        1. 3

          Not at the moment - I just made it and tweeted about it. However, I plan on putting it under “X in One Picture” series on https://roadmap.sh

          1. 2

            I posted it elsewhere, for posterity, or something:

            https://i.postimg.cc/k9m1CkpF/bigO.jpg

            1. 2

              I also wondered whether tweets are endorsed here…

              1. 4

                I think most of the backlash against Twitter is that it’s a poor format for reading long-form writing, the content is usually light or unsubstantiated (rumors), or it’s just a redirect to a better primary source.

                For the most part, Twitter links do pretty well on Lobste.rs: https://lobste.rs/domain/twitter.com

            1. 2

              Do we really want the map to be screaming bright red? Red is a very emotive colour. It has meaning. It can connote danger and death, which is still statistically extremely rare for coronavirus.

              I remember hearing/reading that associating red with danger and violence is a western thing and that other cultures have their own associations. Maybe you could show a different colored map based on the geographic location of the viewer?

              1. 3

                A common example: in China, Japan, and Korea, red is associated with fortune, so when a stock increases in price, that increase is shown in red and a decrease in green. This is the opposite of how it’s done in the US.

              1. 3

                I did a facepalm at the “It turns shadowing from a frequent cause of bugs into something that prevents bugs!”

                1. 5

                  I think it’s really not quite as bad because you have the semantic information through the variable name and a strong type system backing you. There are definitely times where I’m taking the same piece of data but transmuting it over a sequence of types for different reasons like encodings and such and it’s a pain to perform that ceremony of Hungarian notation. If the semantics remain the same and the type encoding is the only difference, maybe shadowing is the right thing, especially with lifetimes.

                  1. 1

                    I think the mentality is that a variable isn’t anIdentifier, but a kind of pair (anIdentifier, aType), so repeating the information contained in the type in the identifier is redundant and sometimes even bad, because after a refactoring the type might have changed (from a vector to an iterator, for example).

                    Of course this shouldn’t be the opportunity to name all local variables x, but it can make sense for a chain of transformations, filtering down a result list for example.

                1. 1

                  Can SIGSTOP be used with similar effect?

                  1. 2

                    PTRACE_ATTACH sends SIGSTOP to this thread.

                    http://man7.org/linux/man-pages/man2/ptrace.2.html

                    I’m not clear on what the benefits of using PTRACE_ATTACH over SIGSTOP are, but I think it’s that you actually get a tracer connection to the thread/process?

                  1. 12

                    What is this? A “news” website that quoted a twitter post with an inline screenshot of an email from a google groups list?

                    Here’s a direct link to the email: https://groups.google.com/d/msg/lisalist/aIo6cNu54xM/_Ck_CsmSBgAJ

                    1. 5

                      Wow, this is going to be cool! One of the biggest gripes I have with both Ruby and Python is that it’s so hard to understand what is happening when your program is no longer making forward progress and your logs just stop. When I try to sample the process, I just get code from the interpreter executing some generic bytecode operations. If I’m lucky, the code is calling into some C library or implementation and I can infer where in the code the program has halted, is looping, or is spending most of its time.

                      1. 3

                        Have you tried DTrace? Ruby has been instrumented since 2.0. Most of the popular language runtimes are!

                        1. 2

                          My vague impression is that Python code “ought” to be easy to pull a stack trace out of, without needing to modify your program, provided you have debug symbols for the interpreter, because you can walk the Python stack automatically in gdb with a script that knows which C functions correspond to bytecode evaluation and which of their local variables correspond to line numbers and function names - see py-bt on https://wiki.python.org/moin/DebuggingWithGdb

                          1. 1

                            Visual Studio made some really impressive stuff for seamlessly switching between C++ and Python call stacks. https://docs.microsoft.com/en-us/visualstudio/python/debugging-mixed-mode

                          2. 2

                            Agreed! One of our Ruby apps has a section that seems far too slow for no apparent reason. Last time I worked on it, I spent a day or so fiddling with profilers, to no avail. Anything that makes that easier is a good thing for the Ruby ecosystem.

                          1. 2

                            Usually Excellent but sometimes Average. But this is not true for a large number of people I work with. I work at a very large and successful brand name company and I’m 2.5 years out of college.

                            Most days I call into standup from my car at 10:15. Standup is my cue to start driving to work. I get to one of my work’s parking lots at 10:45 and take 15 minutes to walk the half-mile to my building/office. I work from 11 to noon, then get lunch until 1. I work until 4, when I get a bit tired and start studying Japanese and Mandarin at the lunch table. I study until 6, when I either go to my language class or the gym or a sports game. I get home at around 9, chill out, and go to bed.

                            Some days I work at home a bit at night or the weekends whenever I feel like I can get some productive work going on something interesting or when the load on our infrastructure is a bit lower and I can take over more of our device fleets.

                            Once every three or six months there’s some massive fire where I actually work from home from 9 to 11, go to work, get lunch for an hour, work until about 6, and do my regular after-work debauchery. Then I work a bit from home at night: check my tests, fix some stuff, submit for more tests, and go to bed to check them the next day.

                            I’ve been very successful in my career working like this, due to a combination of being the most senior person on my team at 2.5 years, being an extremely efficient engineer because I only allocate a few hours to work a day, and having very clean development habits. I also hate doing work, so I push back on anything I find dumb, a waste of time, or likely to lead to more work later. I notice a lot of other people, even more senior than me, fail to communicate issues like this and usually fail to push back on things that make no sense.

                            Many people I work with end up doing 9-5, but many also do 8-6 or 8-7. I try to get them to stop overworking, but they always retort that there’s so much work left. I hate that mentality because there’s always more work to do. I’ve historically had very lenient managers, and my newest manager seems to accept the fact with a disgruntled look on his face. But hey, he loves that he doesn’t actually have to manage me and that I’m a free spirit who provides strong feedback.

                            1. 39

                              The argument seems to rely on the cost of static typing, which is stated in the following four points that I challenge:

                              It requires more upfront investment in thinking about the correct types.

                              I don’t buy this argument at all. If you don’t think about correct types in a dynamic language, you will run into trouble during testing (and production, when your tests aren’t perfect). You really have to get your types right in a dynamic language too. Arguably, with a statically typed language, you have to think less because the compiler will catch your error. I think that’s the whole point of statically typed languages (performance concerns aside).

                              It increases compile times and thus the change-compile-test-repeat cycle.

                              I’d have to see some proof that static typing plays a significant role in compile time. I’ll buy that you can make a very complicated (perhaps Turing-complete) type system and that would have a big impact. But there are statically typed languages that compile really fast, and most of the compile time is probably not spent on types. I’d argue that the compiler is likely to catch your error with types faster than you could compile, run, and test to find the same error with no types.

                              It makes for a steeper learning curve.

                              That may or may not be true. Sure, a type system can be very complicated. It doesn’t have to be. On the other hand, a dynamic language will still have types, which you need to learn and understand. Then, instead of learning to annotate the types in code, you learn to figure out type errors at run time. Is that so much easier?

                              Either way, I don’t believe type systems are an insurmountable barrier. And I think some people give the learning curve way too much weight. Maybe they are working on throwaway software on a constantly changing faddy tech stack, in a place with high employee turnover. It’ll matter more. I suppose there’s a niche for that kind of software. But I’m more into tech that is designed for software that is developed, used, and maintained for years if not decades. A little bit of extra learning up front is no big deal and the professionals working on it will reap the benefits ever after.

                              And more often than we like to admit, the error messages a compiler will give us will decline in usefulness as the power of a type system increases.

                              That might be the case with the current crop of languages with clever type systems, though I don’t know if it’s inherent. Do static type systems need to be so powerful (read: complicated), however? A simpler system can get you just as much type safety, at the expense of some repetition in code.

                              I think there are diminishing returns, but not due to the cost of static typing as such; rather, types just don’t catch all errors. Once the low-hanging fruit is out, there’ll be proportionally more and more logic errors and other problems that aren’t generally prevented with types. Or you could catch these with extensive type annotation, but the likelihood of preventing a real problem becomes small compared to the amount of annotation required. And then there’s the usual question: who checks the proof?

                              There have been some famous bugs that resulted from systems using different units. So if these numeric quantities were properly typed, these bugs would have been prevented. However, what if we change the scenario a little, and suppose we’re measuring fuel or pressure, in the right units. But we read the wrong quantity – spent fuel instead of stored fuel, or exterior pressure instead of interior pressure? Sure you can add more types to prevent such misuse, but it gets more and more verbose, and then you’re moving closer and closer to re-expressing (and thus enforcing) the program logic in the language of the type system; we could consider that to be a language of its own.

                              Now you have two programs, and one can prevent bugs in the other, but both could still be buggy. And the other program starts to grow because you start needing explicit conversions to enable the code to actually perform a computation on internal-pressure-in-pascal. Of course you are subverting the type system when you say you really want to convert internal-pressure-in-pascal to just pressure-in-pascal or whatever. Bugs ahoy?
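
                              To make that trade-off concrete, a small Python sketch (hypothetical names; typing.NewType wrappers are enforced only by a static checker such as mypy, not at runtime): each added distinction rules out one more class of mix-up, but doing arithmetic forces explicit unwrapping, exactly the conversion points where bugs can sneak back in.

```python
from typing import NewType

# Level 1: unit safety. A static checker flags mixing Pascal
# with some other unit wrapper.
Pascal = NewType("Pascal", float)

# Level 2: which quantity it is. More safety, more ceremony.
InteriorPressure = NewType("InteriorPressure", Pascal)
ExteriorPressure = NewType("ExteriorPressure", Pascal)

def vent_if_overpressured(interior: InteriorPressure,
                          exterior: ExteriorPressure) -> bool:
    # To compute anything we must drop back to plain floats; the
    # type system no longer guards what happens inside this line.
    return float(interior) - float(exterior) > 50_000.0
```

                              A call with the arguments swapped is what the second layer of types exists to reject, but only under a static checker; at runtime the wrappers do nothing.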

                              1. 18

                                A simpler system can get you just as much type safety, at the expense of some repetition in code.

                                I agree with most of the rest of your comment, but this part is untrue. Stronger type systems do allow you to enforce more powerful laws at compile time. At one end of the curve we have a type system like C’s, which barely buys you anything, and then at the other end we have full dependent types where you can prove arbitrary invariants about your code (this function always terminates, this value is always even, etc.) that you cannot prove in a weaker type system. In between is a huge spectrum of safety checking power.

                                1. 8

                                  The C type system can actually be quite powerful if you wrap basic types in one-element structs. struct meter { int v; } and struct foot { int v; } can’t be added by mistake, but can still be worked with using one-line inline functions with no performance penalty. It’s just work (which nobody likes).
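
                                  The same one-field-wrapper idea, sketched in Python rather than C (hypothetical Meter/Foot classes; here the mix-up is caught at runtime, whereas the C structs catch it at compile time):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Meter:
    v: float
    def __add__(self, other):
        # Only Meter + Meter is defined; Meter + Foot is a TypeError,
        # since Foot defines no __radd__ to fall back on.
        if not isinstance(other, Meter):
            return NotImplemented
        return Meter(self.v + other.v)

@dataclass(frozen=True)
class Foot:
    v: float
```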

                                  1. 5

                                    I would not describe that as “quite powerful” at all. That’s one of the most basic things a type system can give you.

                                    You can’t really prove any interesting properties until you at least have proper polymorphism. Java doesn’t, for example, because every object can be inspected at runtime in certain ways. In a sensible type system, there are no properties of objects except those which are explicitly stated in the type of the object.

                                    In such a type system, you can prove interesting properties like that a data structure does not “depend” in any way on the objects in contains. For example, if you could implement a function

                                    fmap :: (a -> b) -> f a -> f b
                                    

                                    Which “mapped over” the contents of your object with some function, this would prove that your object never inspects its contents and therefore its structure does not depend on the values of its contents (because this function is universally quantified over ‘b’, and therefore you could map every ‘a’ to a constructed type which cannot be inspected in any way).

                                    You can prove all sorts of useful properties like this (often without even realizing you’re doing it) once you have proper quantification in your type system. One of the coolest quantification-based proofs I know of is that Haskell’s ST monad is extrinsically pure.

                                    As you add more power to your type system (up to full dependent types, linear types, etc.) you can prove more and more useful things.

                                    1. 2

                                      As long as you like all your types disjoint, sure. But I’ll pass.

                                      1. 2

                                        So what’s wrong with disjoint types?

                                        1. 2

                                          It doesn’t let you have rationals and floats that are both numbers, for example.

                                          1. 2

                                            In OCaml, ints and floats are different types and operators like (+) only apply to ints; one has to use (+.) for floats. It’s not a problem IME.

                                            1. 1

                                              I think automatic type conversion of ints to reals was the original sin of FORTRAN.

                                            2. 1

                                              In mathematics the system Z of integers and the system R of reals are different. The number 3 has different properties depending on system context - for example 3x = 1 has a solution in the second context.

                                        2. 0

                                          But it lacks a keyword connection to category theory.

                                      2. 12

                                        I don’t buy this argument at all. If you don’t think about correct types in a dynamic language, you will run into trouble during testing (and production, when your tests aren’t perfect). You really have to get your types right in a dynamic language too. Arguably, with a statically typed language, you have to think less because the compiler will catch your error. I think that’s the whole point of statically typed languages (performance concerns aside).

                                        That’s a good point and one that took me a long time to learn: if a concept cannot be expressed in a language, it doesn’t magically disappear and absolve the programmer from thinking about it. Types are one example, as you mention; similarly, in many discussions about Rust, some people mention that the borrow checker is an impediment to writing code. It’s true that some programs are rejected by the compiler, but lifetime and ownership are concerns in C programs as well. The main difference is that in Rust you have rules and language constructs to talk about those concerns, while in C it’s left to documentation and convention.

                                        1. 7

                                          But there are statically typed languages that compile really fast, and most of the compile time is probably not spent on types.

                                          OCaml is a good example of this.

                                          1. 3

                                            I don’t buy this argument at all. If you don’t think about correct types in a dynamic language, you will run into trouble during testing (and production, when your tests aren’t perfect). You really have to get your types right in a dynamic language too. Arguably, with a statically typed language, you have to think less because the compiler will catch your error.

                                            It’s true that you have to get the types right in a dynamic language, but the appeal of dynamic languages isn’t that you don’t have to think about types. It’s that you don’t have to think about the shape of types. For example:

                                            def make_horror_array(depth: int):
                                                arr = []
                                                deepest_arr = arr
                                                
                                                for i in range(depth):
                                                    deepest_arr.append([])
                                                    deepest_arr = deepest_arr[0]
                                                deepest_arr.append(depth)
                                                return arr
                                            

                                            What type should that return? Contrived, but it’s not the gnarliest type problem I’ve run into. Sometimes it’s nice to have a language where I can give up on getting the types right and rely on tests and contracts to check it.

                                            1. 4

                                              It’s that you don’t have to think about the shape of types.

                                              How do you use a value without thinking about its type or shape?

                                              In your example, you can’t just blindly apply numerical operation to first element of that horror array since it might be another array. So, if you wanted to get to the value inside of those nested arrays, you’d need to think about how you would “strip” them off, wouldn’t you? And wouldn’t it mean that layers of nesting have some special meaning for us?
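
                                              Concretely, in Python that “stripping” step might look like this (a sketch; strip_layers is a made-up helper name):

```python
def strip_layers(x):
    """Peel off nested single-element lists until a bare value remains."""
    while isinstance(x, list):
        x = x[0]
    return x
```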

                                               

                                              Taking your implementation as reference:

                                              >>> make_horror_array(0)
                                              [0]
                                              >>> make_horror_array(1)
                                              [[1]]
                                              >>> make_horror_array(5)
                                              [[[[[[5]]]]]]
                                              >>> make_horror_array(10)
                                              [[[[[[[[[[[10]]]]]]]]]]]
                                              

                                              we can write a Haskell version that distinguishes between a value nested in a “layer” and a value by itself:

                                              λ> :{
                                              λ> data Nested a = Value a | Layer (Nested a)
                                              λ>
                                              λ> -- just for presentation purposes
                                              λ> instance Show a => Show (Nested a) where
                                              λ>   show (Value a) = "[" ++ show a ++ "]"
                                              λ>   show (Layer a) = "[" ++ show a ++ "]"
                                              λ> :}
                                              λ>
                                              λ> mkHorror n = foldr (.) id (replicate n Layer) $ Value n
                                              λ> :type mkHorror
                                              mkHorror :: Int -> Nested Int
                                              λ>
                                              λ> mkHorror 0
                                              [0]
                                              λ> mkHorror 1
                                              [[1]]
                                              λ> mkHorror 5
                                              [[[[[[5]]]]]]
                                              λ> mkHorror 10
                                              [[[[[[[[[[[10]]]]]]]]]]]
                                              

                                              and if we don’t need layers anymore, we can get value out pretty easily:

                                              λ> :{
                                              λ> fromNested :: Nested a -> a
                                              λ> fromNested (Value a) = a
                                              λ> fromNested (Layer a) = fromNested a
                                              λ> :}
                                              λ>
                                              λ> fromNested (mkHorror 0)
                                              0
                                              λ> fromNested (mkHorror 5)
                                              5
                                              
                                              1. 4

                                                Assuming it’s correct, it should return whatever the type inference engine chooses for you :)

                                                1. 1

                                                  This is because type theory is confused about what programs do - which is to operate on bit sequences (or, these days, byte sequences). These sequences may be representations of mathematical objects or representations of things that are not mathematical objects, but they remain representations, not actual ideal mathematical objects.

                                                2. 1

                                                  Now you have two programs, and one can prevent bugs in the other, but both could still be buggy.

                                                  Aren’t a lot of type systems proven type-correct nowadays?

                                                  1. 3

                                                    The type system can be fine but the rules you define with the types could be flawed. Thus, you can still write flawed programs that the type system can’t prevent because the types were defined incorrectly.

                                                    1. 1

                                                      Can you give an example? I’m not sure exactly what breed of incorrect types you’re referring to.

                                                  2. 1

                                                    It requires more upfront investment in thinking about the correct types.

                                                    I don’t buy this argument at all. If you don’t think about correct types in a dynamic language, you will run into trouble during testing (and production, when your tests aren’t perfect).

                                                    One thing I am learning from working with inexperienced developers is that even thinking about which container type you are using is a challenge. Should your function return a Seq? An Array? An Iterator? A Generator? And what if your new library returns a Generator and the old one returned an Iterator and now you have to rewrite all your declarations for seemingly no reason at all? Some kind of “most general type” restriction/requirement/tool would help with this…

                                                    1. 2

                                                      This is one of the things I think Go does really well (in spite of having a generally quite weak type system) - thanks to implicit interfaces, you can just return the concrete type you’re using and the caller will automatically pick up that it ‘fits’.

                                                      1. 1

                                                        This sort of works – but even with that system, it’s easy to declare one’s types too tightly.

                                                        It depends in part on how granular the collection library’s interfaces are (ditto for numeric tower, effects tracking, monad wizard tool).

                                                      2. 1

                                                        I’m unclear what you mean. Many languages offer two solutions. You can declare the variable IEnumerable or whatever as appropriate. Or you declare the variable as “whatever type the initializer has”.

                                                        1. 3

                                                          When in doubt, use the inference!

                                                          1. 1

                                                            It is sometimes easy to choose wrong. Iterable vs Iterator vs Enumerable

                                                      1. 5

                                                        Is this interesting if it doesn’t have any features to process English and extract semantic understanding, rather than just a new, completely fixed syntax of its own that you have to learn? You may as well just learn the thing you’re trying to do directly instead of the betty syntax…

                                                        Here is how it matches if you want to count something.

                                                        match = command.match(/^count\s+(?:the\s+)?(?:total\s+)?(?:number\s+of\s+)?(words?|lines?|char(?:acter)?s?)\s+in\s+(.+)$/i) ||
                                                                command.match(/^how\s+many\s+(words?|lines?|char(?:acter)?s?)\s+are(?:\s+there)?\s+in\s+(.+)$/i)
                                                        

                                                        .. so with any other natural English syntax other than the fixed ones it’s programmed for, it won’t work.

                                                        1. 4

                                                          Makes me think of old adventure games.

                                                          “Get ye flask.” “You cannot get ye flask”

                                                          ..but I agree. It would make more sense to try and use something like NLTK or ParseyMcParseFace. Maybe an idea for future versions?

                                                          1. 0

                                                            Regex and yacc are the new NLP.

                                                          2. 2

                                                            That code snippet immediately disappointed me in Betty. I thought it was using a service like https://api.ai/ to match patterns. Could’ve beaten looking up a command on stackoverflow for one-off jobs.

                                                            1. 1

                                                              Uhm… Would you like to have a tool send all your commands to a remote host? I agree that the simple text approach is not much different from a stackoverflow.com search, but sending everything I type to some web service would be a lot worse for me.

                                                              1. 2

                                                                sending everything I type to some web service

                                                                That is what you’re doing when you’re searching on stackoverflow.

                                                                1. 2

                                                                  The other key is “everything I type”. I sanitize my queries to the outside internet pretty thoroughly at my work.

                                                                  1. 1

                                                                    The key is “some web service”. I trust stackoverflow.com with programming questions, and even if I don’t there is a dump of all answers available.

                                                              2. 1

                                                                Even if it did use some kind of NLP, why would I want this? “OK, Google” misinterprets what I want maybe 25% of the time. Why would I want that kind of ambiguity on the command line? That’s literally why I use a command line: there is no ambiguity. When I type X, Y happens. Every. Single. Time.

                                                              1. 9

                                                                I strongly disagree with this from a business standpoint. Sure, MacOS might be better documented by some collaborative effort but it very well may not be supported. Collaborative documentation ends up describing the current state of systems and processes instead of the intended flow. Documentation is a pseudo-contract between consumers and producers where consumers know what to hold producers to and producers know what they must provide to consumers. By pushing the documentation to only the consumer side, you start seeing documentation that solidifies bugs into something people start depending on because that is how it was documented. When that bug gets fixed, all of a sudden, you have nothing to point to that says you were allowed to use it in that way. Fundamentally, it doesn’t solve the problem that there is no support provided by the producing side and we are just pushing the problem a little further down the investment pipeline.

                                                                1. 4

The first ‘demo’ crashed my laptop. I haven’t had to restart my computer in a long time.

                                                                  1. 1

                                                                    Worked for me on my MacBook Pro 2017 in Safari. Pretty impressed that it can take down your system. Probably worth a bug report.

                                                                  1. 1

                                                                    Is it a present-day 0-day that’s been fixed in iOS/Android?

                                                                    (For those who haven’t the time to read through all the details of the post?)

                                                                    1. 1

                                                                      Should be fixed in July updates.

                                                                      1. 1

                                                                        Should be fixed in July updates.

                                                                        What does that mean? July is almost over. Apple did release some security updates recently, are you saying those included fixes to this?

                                                                        EDIT: Ah yes, appears they did at the bottom here, CVE-2017-9417.

                                                                        1. 2

                                                                          https://source.android.com/security/bulletin/2017-07-01

                                                                          Also for android. I just double checked my Pixel has this update so it’s distributed and available.

                                                                      1. 10

                                                                        Choose your hashes and your designs carefully!

                                                                        Isn’t the moral here that you should try to plan for hash alg changes? Because “Choose your hashes” is just hindsight, really. When SVN was designed, SHA1 was probably still “safe”, right?

                                                                        1. 13

                                                                          Plan for changes. And then start making them. SVN isn’t alone here, but there was a solid ten year lead time between “SHA1 can have collisions” and “I told you so”. The state of RC4 is somewhat similar, with people refusing to move because it wasn’t broken enough. (And a lack of clear direction forward in some cases.)

                                                                          Google went to considerable effort to create a collision. If they hadn’t, people would still say it’s only a theoretical concern. Be thankful it’s still just a warning shot.

                                                                          1. 2

There was a submission a long while ago about various strange behaviors of crypto material. One described a scheme that ran two hash functions over the input and XORed the results together to produce the final hash. This let them survive the deprecation of MD5, even though they never moved away from MD5. I wonder how useful this is as a way of transitioning away from aging, weakly vulnerable hashes.

                                                                            1. 3

                                                                              XOR is worse than concatenation for combining hashes if you’re looking for collision resistance. Here’s a paper describing the safest way to combine two hashes: https://eprint.iacr.org/2013/210

                                                                              Also, see this answer on crypto StackExchange describing various failures of combined hashes.

                                                                              1. 2

                                                                                Yes, that is strange. :) I think the crypto community usually frowns on things like that. Consider that at the time you had something better than MD5 to stir into the result, you could have just used that something better. MD5 ^ SHA1 isn’t notably superior to just SHA1. SHA1 ^ SHA2 isn’t really better than SHA2.

                                                                                1. 1

                                                                                  I think the crypto community usually frowns on things like that.

                                                                                  Is it just “that’s pointless” or could it really hurt? Is it likely or inevitable that the output of two different hash functions on the same input would have coincidental correlations that cancel out with the xor, creating a subtly biased composite function that is worse than the sum of its parts?

                                                                                  1. 1

                                                                                    Indeed: It used to be frowned on because people were concerned that there might be some subtle interaction between the algorithms, although I’ve never seen any evidence of such interactions “in the wild”.

                                                                                    With modern cryptographically strong hash functions, which have a much stronger theoretical base for their security, it’s just a pointless waste of time. I guess you gain a slight ‘security through obscurity’ benefit: an algorithm for generating collisions in SHA-2 (if such a thing were to be discovered) might not work on your custom SHA1^SHA2 implementation, but the same background theory could probably be used to break your implementation - it would just take a little more time.

                                                                                    1. 1

                                                                                      The contrast between crypto and os/application level approach to security is striking.

                                                                                      In the former, an algorithm once accepted is assumed secure until someone publishes a paper or poc that demonstrates a weakness. Furthermore it is assumed that these first discoveries are always more theoretical than practical, so there is no “zero day” rush to change algorithms.

                                                                                      In the latter case, we assume there are undiscovered bugs that could be weaponized in a short timeframe, and defense in depth is the norm.

I do not find it surprising that some developers reach for tricks like mixing the output of two different algorithms. After all, if mixing predictable data (message) with pseudo-random noise (keystream) works for encryption, and mixing potentially predictable data (time, io events, etc.) with other such events works for entropy pool mixing, why wouldn’t it work for mixing hashes (which can be thought of as pseudo-random noise seeded with the input)?

If there were no undesirable interactions between two different hash algorithms, intuition says that mixing one with the other is safe as long as one of the algorithms remains secure. And again assuming no interaction, intuition might say that to break the composite, it is inevitable that you break its components.

                                                                                    2. 1

                                                                                      Mostly pointless I believe. There’s a proof that given two 128 bit hashes, the work to find a collision in both is proportional to 2^128 + 2^128, or 2^129, and not the 2^256 you might hope for. But also, any time you color outside the lines, you run the risk of making things worse.
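Spelling out the arithmetic in that bound (a restatement of the comment’s numbers, not a new result):

```latex
\underbrace{2^{128}}_{\text{collide } H_1} + \underbrace{2^{128}}_{\text{collide } H_2}
  = 2 \cdot 2^{128}
  = 2^{129}
  \qquad\text{not}\qquad
  2^{128} \cdot 2^{128} = 2^{256}.
```

That is, the attacker’s work against the two hashes adds rather than multiplies, so the combined function buys roughly one extra bit of collision resistance.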

                                                                                      1. 1

                                                                                        Right, so I would assume the reasoning people have for mixing is not that it doubles the number of bits of search space, but that it saves your ass the day someone finds one of these algorithms is broken and the work to find a collision is proportional to 2^49 or whatever.

                                                                                        Making things worse is a thought that eludes these people.

                                                                                      2. 1

                                                                                        Is it likely or inevitable that the output of two different hash functions on the same input would have coincidental correlations that cancel out with the xor, creating a subtly biased composite function that is worse than the sum of its parts?

                                                                                        It’s likely there’ll be nonzero correlation, because hash functions aren’t written in isolation and use similar techniques. But probably not significant enough to make a difference in practice.

                                                                                  2. 4

You need to pick a sufficiently strong hash for your application. SHA1 is still in this territory for SVN’s purposes, since collisions remain negligible during typical use as a version control system. Which is great, because otherwise the fix about to be released would break the system for many users.

SVN’s problems are that, apart from discussing the issue years ago, nobody bothered to check what actually happens in the implementation when a collision occurs (that’s a process problem), and that we found ourselves incredibly constrained while trying to come up with the best possible fix for the “webkit” problem. Today, we cannot change the hash without breaking important parts of the system or adding (yet more) backwards compat boilerplate code. We must prevent SHA1 collisions from entering the system to prevent (perhaps accidental) DoS attacks on the system. At the core, this is a design problem. Some features have tightly embraced SHA1, and now replacing it involves a lot of work.

                                                                                    Edit: Another factor that complicated things was that API, protocol, and on-disk format changes are off-limits for SVN’s patch releases, but we had to patch both the 1.9 and 1.8 release series.

                                                                                    1. 2

                                                                                      The iron law of cryptography is that crypto schemes always get weaker over time.

                                                                                      But when this is a problem which isn’t going to manifest itself for fifteen years or more, punting it to the long grass is always going to be very tempting!

                                                                                      1. 2

                                                                                        SHA2 hadn’t even been published when SVN was first released (SVN was released in 2000, SHA2 was first published as a draft in 2001 and finalised in 2002).

                                                                                      1. 2

I’m fascinated by the way they added the test case to the makefile. I’m not much of a C person, so my question is: is this a common convention, since there is no auto-discovery of tests?

                                                                                        1. 3

                                                                                          For library code, generally. You need to write a little program that calls the function you want to test.

                                                                                          1. 1

                                                                                            It’s relatively common in C, but I personally can’t stand it, since it puts a high barrier on adding new tests (automating things shouldn’t be inconvenient!). This is one of the reasons I like gtest for C/C++ applications (https://github.com/google/googletest).

                                                                                          1. 6

Long story short? Poverty. When your chances of marrying into la dolce vita are slim to none, you start studying the hard stuff so you get a shot at getting into a well-paid industry.

                                                                                            Based on interviews with 11,500 girls and young women across Europe, it finds their interest in these subjects drops dramatically at 15, with gender stereotypes, few female role models, peer pressure and a lack of encouragement from parents and teachers largely to blame.

Role models are largely a US cultural construct. It’s unusual to be driven by emulation in Europe.

The “lack of encouragement from parents and teachers” is a joke. What you see in rich countries is a lack of economic pressure. While in Italy you can get by just fine with a high-school degree, in eastern Europe you need a university degree to get a real shot at extracting yourself from poverty.

                                                                                            1. 3

Role models are largely a US cultural construct. It’s unusual to be driven by emulation in Europe.

Hmm… I don’t know much about Europe, but in the US and Japan many people’s actions are pretty well defined by what the mob or people in the spotlight believe should be done. It’s not necessarily that you have a single role model that you attach yourself to, but seeing something like Elon Musk making a bunch of money and then creating a bunch of successful moonshot companies is something people aspire towards. Does something like this not happen in Europe? I feel like it’s human nature to be inspired by the people around you, whether they’re close to you or far and significant. Kind of like how America has this culture where apps can solve problems, because they’ve seen apps solve problems in their day-to-day lives and they’ve lived through apps becoming hit sensations overnight.

                                                                                              1. 2

Role models are largely a US cultural construct. It’s unusual to be driven by emulation in Europe.

                                                                                                ‘Role models’ is a wider psychological concept; it’s not just about copying famous people you admire. Here’s a short blog post introducing the concept; it applies to more than just children, of course. Excerpt:

Individuals that are observed are called models. In society, children are surrounded by many influential models, such as parents within the family, characters on children’s TV, friends within their peer group and teachers at school. These models provide examples of behavior to observe and imitate, e.g. masculine and feminine, pro and anti-social etc.

                                                                                                Children pay attention to some of these people (models) and encode their behavior. At a later time they may imitate (i.e. copy) the behavior they have observed. They may do this regardless of whether the behavior is ‘gender appropriate’ or not, but there are a number of processes that make it more likely that a child will reproduce the behavior that its society deems appropriate for its gender.

                                                                                                First, the child is more likely to attend to and imitate those people it perceives as similar to itself. Consequently, it is more likely to imitate behavior modeled by people of the same gender.

Second, the people around the child will respond to the behavior it imitates with either reinforcement or punishment. If a child imitates a model’s behavior and the consequences are rewarding, the child is likely to continue performing the behavior. If a parent sees a little girl consoling her teddy bear and says “what a kind girl you are”, this is rewarding for the child and makes it more likely that she will repeat the behavior. Her behavior has been reinforced (i.e. strengthened).

                                                                                              1. 4

                                                                                                I found this blog post about red team practices at Facebook to be relevant.

                                                                                                https://medium.com/starting-up-security/red-teams-6faa8d95f602

They seem to focus more on assuming that someone has already compromised many things. It’s generally a waste of time to actually use social engineering or sniffing against the real company. Someone could be digging through the garbage for months before they find something useful. Thus, you assume someone has been phishing for months, or digging through garbage or old computers that the company throws out, or that someone planted a laptop somewhere a few months ago. Or maybe assume someone quit and is using a device that’s still connected to the internet somewhere, one that was left in a hallway before they were escorted out.

After assuming these things, your team’s goal is to have systems and experience in place to fight against attacks if they were to happen on the inside.

                                                                                                1. 18

Taking time to have fun coding with stuff like this is a bit underrated, in my opinion. When I worked at Fog Creek, I once redirected joelonsoftware.com to my computer (but only when viewed from inside the office), where I had carefully added a new entry, entirely in Joel’s voice, about how the company had not been successful as-was, and how FogBugz was, effective immediately, completely open-sourced—with download link. To do that required working with sysadmins to alter DNS, learning how routing works, figuring out how to set up IIS properly on my computer, hacking CityDesk, and more—all skills I wouldn’t have picked up for a while otherwise, and all in the name of a joke. Doing stuff for fun can be a great motivator.

                                                                                                  1. 5

At my work, there is a tendency for people to joke about some “absurd” idea to solve some crazy problem we have. Every time, I kind of wonder if that idea really is absurd. There is always pushback saying it’s not really an efficient use of our time, but often the underlying issue is a tremendous waste of time for us. We are just thinking so incredibly short term, especially with agile development pipelines, that we never really spend the day or two to flesh out an arbitrary moonshot. These are the dinky little tasks that I find the most interesting, because they try to solve hard problems or at least clarify the problem space, but it seems to be too high a risk of wasted time, money, and effort for my team.

                                                                                                    1. 6

                                                                                                      Not quite that, but I have done what amounts to Enterprise FizzBuzz in this capacity to learn stuff, and it does indeed sometimes end up demonstrating some crazy idea actually does work in practice. For example, Miniredis (which you should not use in 2017 under any circumstances whatsoever) began as a joke about how we could totally use Redis on Windows back in 2009; we just needed to write our own version from scratch. Which I then did…and which was such a short and tight code base that we actually did use it, shipping it to several thousand customers.