1. 3

    Kubernetes uses ConfigMaps: https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/

    They are basically YAML/JSON properties which can be sent to containers in various ways:

    env:
      - name: LOG_LEVEL
        valueFrom:
          configMapKeyRef:
            name: env-config
            key: log_level

    Kubernetes handles the rollout of changes, and since a lot of infrastructure tasks are pre-defined (like routing from one service to another, à la Istio), there are far fewer one-off config changes that you need to make. ConfigMaps can be created from literals, files and directories. You can also do secrets: https://kubernetes.io/docs/concepts/configuration/secret/
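
    As a rough sketch of the consuming side, assuming the ConfigMap value reaches the container as the LOG_LEVEL environment variable as in the snippet above. The helper name levelFrom and the "info" fallback are made up for illustration and are not part of any Kubernetes API.

    ```go
    package main

    import (
    	"fmt"
    	"os"
    )

    // levelFrom returns the LOG_LEVEL value reported by lookup, or fallback
    // when the variable is unset or empty. lookup is injectable so the logic
    // can be exercised without touching the real environment.
    // (Hypothetical helper, for illustration only.)
    func levelFrom(lookup func(string) (string, bool), fallback string) string {
    	if v, ok := lookup("LOG_LEVEL"); ok && v != "" {
    		return v
    	}
    	return fallback
    }

    func main() {
    	// os.LookupEnv reads whatever the pod spec injected into the environment.
    	fmt.Println("log level:", levelFrom(os.LookupEnv, "info"))
    }
    ```

    The application only sees an ordinary environment variable, so the same binary runs unchanged outside Kubernetes.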

    1. 2

      Go 1.9 introduced type aliases:

      package main

      import "fmt"

      type callback = func(int) bool

      func filter(xs []int, f callback) []int {
      	var filtered []int
      	for _, x := range xs {
      		if f(x) {
      			filtered = append(filtered, x)
      		}
      	}
      	return filtered
      }
      func main() {
      	fmt.Println(filter([]int{1, 2, 3, 4}, func(x int) bool {
      		return x < 3
      	}))
      }

      C# has delegate types:

      using System;
      using System.Collections.Generic;
      public static class Program {
          public delegate bool callback(int x);
          public static int[] filter(int[] xs, callback cb) {
              var filtered = new List<int>();
              foreach (int x in xs) {
                  if (cb(x)) {
                      filtered.Add(x);
                  }
              }
              return filtered.ToArray();
          }
          public static void Main() {
              int[] xs = {1,2,3,4};
              foreach (int x in filter(xs, n => n < 3)) {
                  Console.Write(x + " ");
              }
          }
      }
      1. 2

        Excellent point, but the actual parameters still end up being structurally typed. The formal parameters get named as instances of the type, but the actual values when constructed are not declared to be of that type.

        That is, in your first example, I could do something like this:

        func foo(i int) bool {
        	return i > 0 // placeholder body; only the signature matters here
        }
        filter(int_array, foo)

        The function foo was never explicitly declared to be of type callback, but rather assignment/passing was allowed because foo met the structural requirements of the callback type.

        I think the answer to my question may be “no, it’s not possible to reasonably have function types in a purely nominative type system” though that just rubs me the wrong way.
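
        A small sketch of where Go draws that line, assuming a defined type rather than an alias. The names Callback, Predicate and small are invented for this example. A plain function is accepted without ever being declared a Callback, while a value of another defined function type is rejected until it is converted explicitly.

        ```go
        package main

        import "fmt"

        // Callback and Predicate are two distinct defined types that share the
        // underlying type func(int) bool.
        type Callback func(int) bool
        type Predicate func(int) bool

        func filter(xs []int, f Callback) []int {
        	var out []int
        	for _, x := range xs {
        		if f(x) {
        			out = append(out, x)
        		}
        	}
        	return out
        }

        // small is never declared to be a Callback; its type is the unnamed
        // func(int) bool.
        func small(i int) bool { return i < 3 }

        func main() {
        	xs := []int{1, 2, 3, 4}

        	// Accepted: a value of an unnamed type is assignable to a defined
        	// type with an identical underlying type.
        	fmt.Println(filter(xs, small))

        	// Not accepted implicitly: filter(xs, p) fails to compile because
        	// Predicate and Callback are different defined types. An explicit
        	// conversion is required.
        	var p Predicate = small
        	fmt.Println(filter(xs, Callback(p)))
        }
        ```

        So the check is nominal between two defined function types, but structural as soon as one side is an unnamed function type, which is the case for every ordinary function declaration.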

      1. 2

        I’m trying to understand what you are after with the “single executable” part?

        1. 2

          Self-contained. For the most part controversy I guess? :-)

          1. 5

            Right. That controversy you may have. I guess we have rather differing interpretations of self-contained.

            $ file start.sh

            start.sh: POSIX shell script, ASCII text executable

            $ file target/hprotostuffdb-rjre

            target/hprotostuffdb-rjre: ELF 64-bit LSB executable

            $ grep JAR start.sh

            $BIN $PORT comments-ts/g/user/UserServices.json $ARGS\
              $PUBSUB $ASSETS -Djava.class.path=$JAR comments.all.Main

            $ objdump -p target/hprotostuffdb-rjre |grep RPATH

            RPATH                $ORIGIN:$ORIGIN/jre/lib/amd64/server

            $ objdump -p target/hprotostuffdb-rjre |grep NEEDED

            NEEDED               libpthread.so.0
            NEEDED               libjvm.so
            NEEDED               libcrypto.so.1.0.0
            NEEDED               libssl.so.1.0.0
            NEEDED               libz.so.1
            NEEDED               libstdc++.so.6
            NEEDED               libgcc_s.so.1
            NEEDED               libc.so.6

            $ find . -name '*so'


            I’m not even going into the rest of the jre scaffolding. I guess you could argue the stuff under comments-ts is not part of the “comment-engine”, but it’s there, and it (or something equivalent) is needed anyway. Admittedly, only two of the files in the entire package have the ‘executable’ flag set, so you can have half your cake if that’s the criteria for being self-contained :-)

            1. 4

              Thanks for the detailed response.
              It was my way of showing people that jvm apps can have “golang-style” deployments, where you ship a binary and run it, at only 12MB (my production nginx binary is 14MB).

              But realistically, if you have the jvm installed, the only things that need to be shipped are the 455kb jar, the 92kb js and the 7.1kb css. That is how I deploy internally.

              With golang, you do not have this choice.

              1. 4

                Ah, so now I am starting to see the points that you are really trying to make.

                1. Bundling of dependencies. I don’t think there’s much novelty to it; proprietary and floss applications alike have distributed binaries with bundled dependencies for a long long time. This includes many applications that bundle a jvm.

                2. A jvm bundle can be reasonably small. Admittedly I haven’t paid attention to it, but I’ve had jvm bundles before, and I don’t recall them being outrageously large.

                Calling it a “single executable” or self-contained might not be the best terminology to get the point across. Even more so when you consider that the executable also depends on many libraries that are not bundled; see objdump output above and compare to the list of bundled shared objects. Any one of these system libraries could go through an ABI change (in worst case one where existing symbols are removed or modified to work in an incompatible way, without symbol versioning…), and when that happens, your uhm.. self-contained single executable thingy won’t run right, or at all. It’s not just a theoretical concern, it’s something people using binary distributions of proprietary software (e.g. games) may have to feck with rather often.

                I can’t comment on how this compares to golang deployments, which I’ve never used.

                1. 1
                  1. Pretty much agree.
                  2. A lot of ppl dismiss the jvm as bloated (in terms of size and runtime memory). I guess it all depends how one uses it (along with the knobs to tune). I run mine at 128mb memory max, and that could handle 130k req/s. My usage of the jvm is like a stored-procedure language though. All the heavy lifting is on the C++ libs that I’m depending on.

                  I understand your points and appreciate your comments. Cheers

                2. 2

                  Recent versions of go have additional build modes: https://golang.org/cmd/go/#hdr-Description_of_build_modes

                  Theoretically you could deploy your code as a plug-in.

          1. 3

            Generally agree but there are operational costs to vertical scaling. That single DB is also a single point of failure and achieving high availability is often just as hard as scaling horizontally. (master / slave failover and backups may seem mundane but there are plenty of examples of companies screwing it up)

            Something like Cassandra, Elasticsearch or Kafka has redundancy built in, which is a big win. I think spanner style sql databases could hit a real sweet spot.

            As for SOA I think it depends on what you’re working on. Sometimes breaking up applications into separate processes with well defined interfaces can make them easier to reason about and work on.

            As an application evolves over time the complexity can grow out of control until any time someone touches the code they break it. How often have new developers thrown up their hands, scrapped the product, started over and wasted 6 rebuilding what they had in the first place?

            Maybe SOA could help with that by limiting the scope? (Though maybe better code discipline would achieve the same result?)

            I guess all I’m saying is that good engineering practices can help smaller software too.

            1. 2

              Makefiles are great for small Unix projects, not so much for something that needs to be built for Windows too… Windows developers live in a parallel universe of build tools, and that, coupled with the sheer size of chromium, helps to explain why they felt it necessary to make ninja.

              The complexity of the project is pretty crazy though.

              1. 2

                I’ve used nmake with some success to build things on Windows.

              1. 3

                Why do folks have such a hard time looking for 3rd party packages? An immutable, sorted map doesn’t come up all that often (I don’t think I’ve ever “needed” the immutability bit), but there are packages out there that can do it. For example: https://github.com/Workiva/go-datastructures#dtrie. *

                It’s not type safe, but it fits the requirements he wanted.

                • caution: I’ve never actually used this package
                1. 4

                  One of the chief dangers of excessively verbose and inflexible code is not just that its implementer has to do a lot of typing; it’s that all that typing provides a high surface area for bugs and generally difficult-to-reason-about implementations. This sort of issue affects whoever has to use the library, not just whoever has to do the implementing.

                  It’s also, by the way, generally true that code which somebody else wrote is going to be more general and, you know, not written by you, and therefore magnify the verbosity and difficult-to-reason-about issues.

                  1. 5

                    Why do folks have such a hard time looking for 3rd party packages?

                    Sometimes people don’t want to add external dependencies for things that, in some cases, are (relatively speaking) straightforward to implement.

                  1. 2

                    So, what are the current Best Practices for dealing with these sorts of markets? What are reasonable and secure alternatives to Coinbase?

                    1. 4

                      Xapo takes security seriously: https://support.xapo.com/xapo-security

                      • keys are stored offline
                      • multiple signatures are used
                      • data is encrypted and distributed across multiple regions
                      • they use 2fa, a pin and a password to access the vault
                      • it requires 48 hours to move bitcoins out of the vault

                      You can still use sites like coinbase, just don’t leave a lot of bitcoins in them

                      1. [Comment removed by author]

                        1. 2

                          They do provide insurance: https://xapo.com/terms/, though only for things related to the company, not for failures that are the user’s fault:

                          The Xapo Bitcoin Reserve is designed to cover direct and effective losses suffered by users as a result of attacks of hackers to our systems, theft by any third party and/or Xapo employee from our systems or facilities, break-ins at a physical location of our vaults, and/or our bankruptcy, which are not due to or related to your acts, omissions or errors (“Qualifying Losses”).

                          What exactly are you looking for? FDIC insurance? I don’t think the US government is ever going to insure a competing currency.

                          There are risks to action and risks to inaction. Keeping your bitcoin on a piece of paper under your mattress is also dangerous - the primary danger, at least with bitcoin, being you’d lose the private key…

                          For this particular case Xapo requires 24 hours to reset a password and sends a bunch of warnings before they do it.

                      2. 3

                        Some alternatives to Bitcoin allow electronic transfers to be reversed. :)

                        1. 3

                          Given the history of Bitcoin markets, I wouldn’t keep any money in any of them that you aren’t planning on trading right now. So many of them have been hacked or have had their founders mysteriously disappear with the money.

                          1. 1

                            One alternative is Blockchain wallet (https://blockchain.info/) This site cannot reset your password because your password ultimately secures your wallet’s private key. This means that the service is as secure as your password, which could be considered a “secure alternative to Coinbase”. Note that this does not include ethereum or litecoin wallets, nor an exchange though.

                            Coinbase also claims to have FDIC insured deposits for its US customers (but only for the USD balance), so that can be a real advantage over other exchanges at least. They’re also insured against theft/security failures (their policy would probably be detailed enough to exclude OP’s problem).

                          1. 3

                            I appreciate the way you’ve written this as an exploration of the problem space rather than a polemic (“10 reasons you’re a dumb-dumb if you don’t count with HyperLogLog”). It feels far more authentic and useful.

                            Having said that, I find a disturbing story here: how much time was spent standing up Cassandra, operationalizing it, validating the new process, adjusting the existing billing processes, (…) when you’re sitting on an embarrassingly parallel problem? Do you really want to trust a complex datastore when wc is the perfect solution here?

                            1. 3

                              wc has to count the unique lines. Like many companies, we faced the problem where certain customers received an extreme amount of data compared to their peers. This meant that we had to count billions of items.

                              I’m curious how the problem is embarrassingly parallel… I was never happy with the Cassandra solution, so if there’s something obvious here I missed, I’d definitely be interested…

                              Cassandra is quite simple, though expensive, if you’ve already paid down the operational cost (which we had). It solves the high availability problem.

                              1. 1

                                Careful re-reading indicates the problem isn’t quite embarrassingly parallel; the roll-up has to occur monthly, and I misread the gzipped log date format as monthly logs; oops, my bad :)

                                There are still a handful of problems in that pipeline, and you’d see some very strong gains if you spent a day parallelizing it:

                                find *.gz | xargs -I{} echo "<(gzcat {} | sort | uniq)" | xargs echo sort -m | bash | uniq | wc -l

                                Right now you’re sorting each log file’s output and eliminating duplicates, merging the results together, and then stripping duplicates again (gah!)

                                Instead, if you did

                                find *.gz | xargs -I{} echo "<(gzcat {} | LANG=C sort )" | xargs echo LANG=C sort -mu | bash | wc -l

                                you eliminate duplicates at merge time (-u), which cuts out n+1 full-file iterations and just does it at the k-way merge time. It’s also critical to set LANG to C or else you’ll get eaten alive with multibyte comparisons.
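
                                To make the merge-time dedup concrete, here is a minimal in-memory sketch of the idea behind sort -mu, not the coreutils implementation. It assumes each input is already sorted byte-wise (what LANG=C gives you) and may still contain duplicates. The names cursor, minHeap and countUniqueMerged are invented for this example.

                                ```go
                                package main

                                import (
                                	"container/heap"
                                	"fmt"
                                )

                                // cursor tracks the read position within one sorted input.
                                type cursor struct {
                                	lines []string
                                	pos   int
                                }

                                // minHeap orders cursors by their current line. Go compares strings
                                // byte-wise, which matches the LANG=C collation used by the shell version.
                                type minHeap []*cursor

                                func (h minHeap) Len() int            { return len(h) }
                                func (h minHeap) Less(i, j int) bool  { return h[i].lines[h[i].pos] < h[j].lines[h[j].pos] }
                                func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
                                func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
                                func (h *minHeap) Pop() interface{} {
                                	old := *h
                                	n := len(old)
                                	c := old[n-1]
                                	*h = old[:n-1]
                                	return c
                                }

                                // countUniqueMerged does a k-way merge over the sorted inputs and counts
                                // distinct lines in the same pass, so no separate uniq step is needed.
                                func countUniqueMerged(inputs [][]string) int {
                                	h := &minHeap{}
                                	for _, in := range inputs {
                                		if len(in) > 0 {
                                			*h = append(*h, &cursor{lines: in})
                                		}
                                	}
                                	heap.Init(h)

                                	count := 0
                                	last := ""
                                	first := true
                                	for h.Len() > 0 {
                                		// Take the cursor holding the smallest current line.
                                		c := heap.Pop(h).(*cursor)
                                		line := c.lines[c.pos]
                                		// Lines arrive in sorted order, so a duplicate can only equal
                                		// the line emitted immediately before it.
                                		if first || line != last {
                                			count++
                                			last = line
                                			first = false
                                		}
                                		c.pos++
                                		if c.pos < len(c.lines) {
                                			heap.Push(h, c)
                                		}
                                	}
                                	return count
                                }

                                func main() {
                                	inputs := [][]string{
                                		{"a", "b", "b"},
                                		{"b", "c"},
                                		{"a", "d"},
                                	}
                                	fmt.Println(countUniqueMerged(inputs))
                                }
                                ```

                                Each line is touched once during the merge, and only the previous line has to be remembered to detect a duplicate.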

                                If those two changes are still too slow (and seriously, don’t underestimate the boost you could get there), the next likely step would be to fan out the download, gzcat and sort onto multiple servers. GNU parallel can help do that with very little administration:

                                aws s3 ls | parallel --sshloginfile ssh-keys-go-here --return {.}.sort --cleanup "aws s3 cp {} . && gzcat {} | sort > {.}.sort" && sort -mu *.sort | wc -l

                                (this one is from memory, so the syntax may need a tweak!)

                                Even better, drop the --return option, mount the output directories from your counter nodes to your invoking node, and be sure your network connection is fast enough and you’ve got a very easy to parallelize counter.

                            1. 1

                              So, to go full circle, if I were to do it again, I’d probably spend most of my time building something clever with a HyperLogLog, only to eventually cave-in and resort to something inefficient, bland and boring.

                              Why not HyperLogLog when it’s the most efficient solution?

                              1. 3

                                It’s not 100% accurate, and in the end I don’t think that’d fly with a billing system.

                              1. 2

                                Since we wanted better than linear performance (ie a billion tweets shouldn’t take a billion operations to count), we explored the indexed option.

                                Can someone explain what this means? I don’t quite understand how you could have sublinear time for counting items. Maybe this is referring to the uniqueness/deduplication part?

                                1. 4

                                  Sorry for the confusion. It’s sub-linear at query time. You still have to pay the cost of indexing, but you can do that over the whole month.
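
                                  A toy illustration of that trade-off, using an in-memory set as a stand-in for the real index (this is not how the production system was built; UniqueIndex and its methods are invented names). The per-item cost is paid on ingest, and the count at query time does not rescan the data.

                                  ```go
                                  package main

                                  import "fmt"

                                  // UniqueIndex records each ID as it arrives, so counting later does not
                                  // require another pass over the raw data. (Hypothetical type.)
                                  type UniqueIndex struct {
                                  	seen map[string]struct{}
                                  }

                                  func NewUniqueIndex() *UniqueIndex {
                                  	return &UniqueIndex{seen: make(map[string]struct{})}
                                  }

                                  // Add pays the indexing cost once per item, at ingest time.
                                  func (u *UniqueIndex) Add(id string) {
                                  	u.seen[id] = struct{}{}
                                  }

                                  // Count answers the query without iterating: len on a map is constant time.
                                  func (u *UniqueIndex) Count() int {
                                  	return len(u.seen)
                                  }

                                  func main() {
                                  	idx := NewUniqueIndex()
                                  	for _, id := range []string{"t1", "t2", "t1", "t3"} {
                                  		idx.Add(id) // spread over the month as items arrive
                                  	}
                                  	fmt.Println(idx.Count()) // end-of-month query
                                  }
                                  ```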

                                1. 2

                                  Fantastic. Systemd makes it so much easier to build init scripts and up till now there hasn’t been a great distro option for servers.

                                  1. 1

                                    An unfortunate side effect of closures is that they make it trivially easy to leak memory in Internet Explorer

                                    Even newer versions?

                                    1. 2

                                      No. (http://msdn.microsoft.com/en-us/library/dd361842.aspx)

                                      I think the article was written in 2006 and is pretty out of date.

                                      1. 1

                                        Assuming it’s still true, I’d say it’s mitigated by per-process tabs. I’m surprised though that each page doesn’t have its own arena. I can understand how DOM and JS objects get tangled together and can’t be collected, but it seems easy (always does) to just blow away both the DOM and JS heaps for a page after you navigate sufficiently far away. (After reading the msdn link, it seems even IE 7 does just that. It’s only a leak while on the page.)

                                        (What’s the longest running web page? gmail? Would evil google deliberately add lots of DOM closure leaks to gmail to convince people to use chrome? :))

                                        As a side note, closures/lambdas/anonymous functions are where lots of languages' GCs break down. luajit will permanently pin any function passed as a function pointer to C, and the jvm has similar issues with permgen.

                                      1. 2

                                        The issue is a bit overblown. It depends on using a hidden pop-under:

                                        When you click the button to start or stop the speech recognition on the site, what you won’t notice is that the site may have also opened another hidden popunder window.

                                        You also have to grant it permission the first time you use it. It’s not clear to me how they could fix this without just getting rid of the feature. (Maybe only allow it in active windows?)

                                        1. 2

                                          There are plenty of sites I might grant permission to use the microphone for a short while, but that doesn’t mean I want them listening in forever after.

                                          I read about this story before and somebody pointed out that the html spec even calls out this possibility and clearly says the user agent must disallow access after the tab/window is closed to prevent background recording.

                                          1. 1

                                            the user agent must disallow access after the tab/window is closed to prevent background recording.

                                            It does. The issue is the site opens another window, transfers the recording to that one and then closes the original. However apparently that shouldn’t work either:

                                            To minimize the chance of users unwittingly allowing web pages to record speech without their knowledge, implementations must abort an active speech input session if the web page lost input focus to another window or to another tab within the same user agent.

                                            Even just switching to a new tab should stop recording audio.

                                        1. 1

                                          Is this feature easily disabled or is it hidden away in about:flags or somesuch?

                                          1. 3

                                            Settings -> Advanced -> Privacy -> Content Settings -> Media -> “Do not allow sites to access my camera and microphone”

                                          1. 2

                                            Actually this tendency to transform arbitrary decisions into moral categories and then hang them as a millstone around the neck of others is a much broader phenomena.

                                            Does this actually describe the way that anybody evangelises REST?

                                            I think that the application ought to drive the discussion about architecture. If REST works for you application, then by all means use it, but if it doesn’t don’t be afraid to use something else.

                                            I don’t think anybody disagrees with this.

                                            1. 1

                                              Does this actually describe the way that anybody evangelises REST?

                                              That’s a fair question. All I have is anecdotal evidence from my own experience… I did point to that clean URL blog post, but maybe I should’ve spent more time finding examples.

                                              I don’t think anybody disagrees with this.

                                              Lots of developers are afraid to try something different. Read pashields above.

                                              1. 2

                                                That blog post has nothing to do with REST. Whether a system has user-friendly URLs or a big mess of UUIDs doesn’t tell you anything about how RESTful it is.

                                                pashields seems to be in agreement with you:

                                                In fewer words, not using REST (or whatever) is fine, but swap it for another well thought out architecture (even your own), not something ad-hoc.

                                                1. 1

                                                  That blog post has nothing to do with REST. Whether a system has user-friendly URLs or a big mess of UUIDs doesn’t tell you anything about how RESTful it is.

                                                  It’s an example of making a big deal out of something that doesn’t matter.