1. 3

    My initial reading of this was: “christ, it’s been demonstrated to be airborne?”

    That’s an interesting topic: if you read things published by CDC virologists it’s always “not a chance” but the USAMRIID virologists aren’t so sure.

    Glad to hear I misparsed the title so badly. Airborne Ebola would be a very bad thing indeed.

    1. 2

      if you read things published by CDC virologists it’s always “not a chance” but the USAMRIID virologists aren’t so sure

      Is this a case of politics (fear) overriding study data?

    1. 3

      Spoiler: Stick with stdlib until you really, really need to go elsewhere. Happy to see I’m not the only one who came to this conclusion.

      1. 1

        That approach could be applied to most scenarios, correct?

      1. 3

        As a big user of and advocate for MongoDB, I initially upvoted this before reading it, because I expected it to be an article about how MongoDB has stagnated and will get its lunch eaten by more interesting and reliable alternatives.

        Instead, it was a pretty empty “don’t worry they’ll all come crawling back to PostgreSQL someday” article. Of course PostgreSQL will keep going and support more document-store like features. SQL isn’t dying – it’s just losing its role as the Hammer for everyone’s Nail. Now we have more tools. People are not being duped by MongoDB or other NoSQL databases, they’re just finding that it’s a helpful tool in their arsenal.

        1. 1

          People are not being duped by MongoDB or other NoSQL databases

          There are some, but the term “duped” might be a bit harsh. It’s easy to find posts about projects that jump completely into the NoSQL pool and find that it does not solve all their problems. The resolution is often a mix of data storage platforms that fit different workloads.

        1. 2

          Has the idea of a minimum rep to be able to downvote been considered?

          Downvotes are a way for the community to self-maintain. A minimum rep would keep the mechanism of “cleaning by community” in place, while requiring some level of constructive participation before being able to affect other posts.

          1. 2

            It was by account age before (and still is for comment downvoting), but from what I recall, a lot of the users who downvoted things as “poor quality” when they probably shouldn’t have were users with enough karma to do so anyway.

          1. [Comment removed by author]

            1. 6

              Hadoop isn’t a database. Hadoop at its core is a MapReduce framework. It’s not about “scaling to 10 TB”; it’s about processing data.

              Here’s a simple example:

              You generate 100GB of log data a day. For some people, that is a ton of data; for others, not very much. I need to be able to find information in that 100GB within a couple of hours. A Python script or a Java app on a single box won’t cut it. They won’t get me the answer I need in the amount of time that I need it. So I spread the load across many machines. I’m getting the answer that I need, but I’m in a painful scenario of dealing with my homegrown cluster of machines.

              Hadoop is just a standard framework for handling clusters of MapReduce jobs. A nice chunk of hard work has been done for you. Lots of rough edges have been shaved off.

              It doesn’t matter what you think as an outsider about someone’s choice to use Hadoop or other technologies unless you understand their problem. I have data, I need to get an answer in X period of time, and to do that I need a cluster. You don’t get to decide what someone else’s X is. Are there people who could use something other than Hadoop? Sure. Maybe, as some “look at me” blog posts and articles like to point out, “that problem could be solved in Excel”. Maybe it could; maybe a single Python script could handle it. But if that one script dies? What then? No answer. Maybe that person went with Hadoop after looking at the problem and deciding that a job tracker as a SPOF was way more likely to fail them than just a single python script.

              How about, we stop assuming our colleagues are idiots and try to understand why people make the tradeoffs that they do. Yes, Hadoop is a beast in many ways. Yeah, operationally, it can be a giant pain. But there are plenty of reasons people want/need to use it that cynics sniping from the sidelines simply won’t see.
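
              To make the map/reduce shape concrete, here is a minimal single-machine sketch in Haskell. It is not Hadoop’s API: the log format (`<status> <path>`) and the names `mapChunk`, `shuffle`, `reduceCounts`, and `countByStatus` are assumptions made up for illustration, and the “cluster” is just a list of chunks processed one after another.

              ```haskell
              import Data.Function (on)
              import Data.List (groupBy, sortOn)

              -- Hypothetical log format: "<status> <path>", e.g. "500 /checkout".
              type LogLine = String

              -- Map phase: emit one (status, 1) pair per non-empty line in a chunk.
              mapChunk :: [LogLine] -> [(String, Int)]
              mapChunk chunk = [(status, 1) | line <- chunk, (status : _) <- [words line]]

              -- Shuffle phase: bring all pairs that share a key together.
              shuffle :: [(String, Int)] -> [(String, [Int])]
              shuffle pairs =
                [ (key, map snd grp)
                | grp@((key, _) : _) <- groupBy ((==) `on` fst) (sortOn fst pairs)
                ]

              -- Reduce phase: collapse the values for each key into a single count.
              reduceCounts :: [(String, [Int])] -> [(String, Int)]
              reduceCounts grouped = [(key, sum counts) | (key, counts) <- grouped]

              -- On a real cluster each chunk would be mapped on a different machine;
              -- here the chunks are simply processed in sequence.
              countByStatus :: [[LogLine]] -> [(String, Int)]
              countByStatus = reduceCounts . shuffle . concatMap mapChunk

              main :: IO ()
              main = do
                let chunks =
                      [ ["200 /index", "500 /checkout", "200 /about"]
                      , ["404 /missing", "200 /index", "500 /checkout"]
                      ]
                print (countByStatus chunks)
              ```

              In a real deployment the framework, not your own code, handles distributing the chunks, the shuffle, retries, and scheduling; that is the “hard work done for you” part.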

              1. 3

                How about, we stop assuming our colleagues are idiots and try to understand why people make the tradeoffs that they do. Yes, Hadoop is a beast in many ways. Yeah, operationally, it can be a giant pain.

                I have met cases where people want to go with Hadoop based on hype (or to gain career experience) when they have a dataset < 10GB. These people are not “idiots” per se, but they don’t always consider that traditional single-node technologies are most suitable for many (if not most) projects. Articles like this can help keep perspectives in the right place, where sometimes a “boring old-fashioned RDBMS” is a valid approach.

                You generate 100GB of log data a day. For some people, that is a ton of data; for others, not very much. I need to be able to find information in that 100GB within a couple of hours. A Python script or a Java app on a single box won’t cut it. They won’t get me the answer I need in the amount of time that I need it. So I spread the load across many machines. I’m getting the answer that I need, but I’m in a painful scenario of dealing with my homegrown cluster of machines.

                Curious what type of operations you perform that take that much time for < 5GB / hour of logging?

                1. 1

                  I had a hard time listening to a lot of the sessions at Strata in Santa Clara this year. The enterprise vendors definitely smell money in the Hadoop ecosystem, and this has created a feedback loop which has resulted in a lot of “you need a Hadoop” type cargo cult behavior.

                  I use a small Hortonworks cluster to process about 200 Gb/day of video viewing and ad data from log files. We had a home-brewed distributed log processing system that was less functional than what we got for free (development-wise) by using Hadoop. Using Hadoop as a big dumb parallel hammer to apply functions to large data sets without having to write custom code is the sweet spot for my needs.

                  1. 1

                    Maybe that person went with Hadoop after looking at the problem and deciding that a job tracker as a SPOF was way more likely to fail them than just a single python script.

                    Did you mean “less likely”?

                    1. 1

                      I did.

                      Sadly can’t edit now. :/

                1. 1

                  “One question begged of Big Data has been – is anybody actually handling data big enough to merit a change to NoSQL architectures?”

                  1. 1

                    I think part of the issue is that the volume (aka size) is only one of the 4 Vs. I would think that the velocity will end up having more of an impact on the architectures because of how some of the consensus algorithms end up working (well…velocity in combination with distribution (think transatlantic / transpacific) in combination with volume (larger clusters)).

                  1. 2

                    We used Erlang at my workplace until recently. I have to say I don’t really mind it. I pretty much agree with most of the criticisms that have been levelled at it (strings as lists of ints, et al.), but overall I was happy with it. The things it does well, it does really well.

                    1. 1

                      The things it does well, it does really well.

                      Is there a list of these things somewhere?

                    1. 1

                      The first thing I thought that tag might refer to was http://rstatd.sourceforge.net.

                      Why not “statistics”?

                      1. 1

                        Perhaps, but it is worth considering that R and statistics have partial overlap. Not all statistics discussions are about R, and in some cases R is used for things like graphing and data parsing.

                        Perhaps there is a need for both tags.

                      1. [Comment removed by author]

                        1. 1

                          It would seem that most (if not all) CxOs become salespeople, regardless of background. The path for someone who bases their decisions on practical issues with a primarily technical / functional / quality focus is limited. This is not to say that a business focus is not also required - budget, revenue, and expenses are all important aspects of a successful business. But it does seem like the focus eventually becomes one-sided, which may be why innovation seems to stall after too many “managers” get involved.

                        1. 2

                          Oh, a challenge! :) Even the C-like C++ version was still using std::string, which is slow. I attached a faster version to a GitHub issue if anybody is morbidly curious.

                          1. 1

                            I am. Which github issue?

                              1. 1

                                Ah! Part of the article I posted. facepalm

                            1. 7

                              Why does everyone keep saying Go has C-level performance? The latest Alioth benchmarks indicate that it is only about half as fast as C. http://benchmarksgame.alioth.debian.org/u32/go.php

                              And the part about Go being faster than Java is certainly not true. Java blows Go out of the water in certain benchmarks. http://benchmarksgame.alioth.debian.org/u32/benchmark.php?test=all&lang=java&lang2=go&data=u32

                              There are a lot of reasons to like Go, and it is pretty fast, but it’s still not “within a close margin” of C’s performance.

                              1. 2

                                We’re on a logarithmic scale, and there are clusters. Looking at the Mandelbrot results, you can see one cluster with most members at around 50× slower than C (Smalltalk, Lua, JRuby, and PHP, with Python and MRI Ruby as outliers on the slow side, and HiPE as an outlier on the fast side) and another cluster between 0.9× and 3× as slow as C (C, Ada, Java, C#, Haskell, Dart, and barely SBCL). Go is definitely in the “C-level” cluster and not the “PHP-level” cluster.

                                1. 2

                                  Why does everyone keep saying Go has C-level performance?

                                  Probably because we each have a different perspective. For example, I can understand the “within a close margin of C’s performance” moniker if you’re moving from a language that is 40x slower to a language that is 2-4x slower. You may think that 2-4x slower is not a close margin, but someone else might. Depends on what you’re doing I guess.

                                  1. 2

                                    Exactly. Coming from Python, Go gives me a set of performance options with lower effort than I would have to invest if I was to use C instead.

                                    1. 2

                                      I suppose the “close margin” is open to interpretation, but saying that Go is “closer to the performance of native languages than other enterprise level languages such as Java, Scala, Erlang et al.” is clearly untrue. Erlang is slower than Go, but Java and Scala are both faster than Go in certain benchmarks. On that note, Haskell, OCaml, and SBCL are also faster than Go at some benchmarks. That’s why I think saying “Go has C-level performance” is a bit disingenuous, since a lot of other languages have similar or better performance.

                                      1. 2

                                        The phrase “native languages” doesn’t even make sense. How is Go not a native language itself?

                                        1. 1

                                          You know, it’s possible to disagree with people without accusing them of dishonesty, which is what you are doing when you call them “disingenuous”.

                                          Unfortunately, I can’t downvote you for extreme rudeness; the closest options are “troll” (which would imply that you’re being disingenuous, which I have no reason to believe) and “spam”. So I’ve chosen “spam”.

                                    1. 1

                                      I’m seeing a github 404 at this address.

                                      1. 1

                                        URL changed, and I can’t seem to edit now. Posted in a top level comment.

                                        1. 1

                                          I’m not sure what the etiquette is, but you may want to delete and resubmit.

                                          1. 1

                                            It’s been too long, it seems; I could not delete or edit. I would resubmit otherwise.

                                        1. 2

                                          Site appears to be having difficulties at the moment. Maybe too many people Trying Julia.

                                          1. 2

                                            I really love the idea of ggplot and the aesthetic of the charts, but the dependencies make it somewhat impractical to use for lightweight applications. It needs a Fortran compiler, Numpy/Scipy, Matplotlib, and whatever else in between; installing it can be quite a lengthy pain.

                                            Is anyone aware of any maintained Python charting libraries that are less dependency-heavy? Searching yields many results with untouched code from 2009 and older.

                                            It appears much of the advancement in charting libraries is going towards JavaScript/SVG client-side rendered toolkits.

                                            1. 1

                                              I am particularly partial to jgraph, which is a small C program with basically no dependencies.

                                              I have this in my ~/bin to convert jgr files to trimmed PDFs (requires pdfcrop from texlive and ps2pdf from ghostscript, I think):

                                              jgraph -P "$1" | ps2pdf - - | pdfcrop - "$2"
                                              

                                              I’m not aware of any language bindings for jgraph, but if you spend an hour with its man page, you should be able to write simple jgr files in a jiffy. (Maybe skim the lecture notes on the author’s web page first.)

                                              Fun fact: jgraph was written in 1992 but didn’t stop compiling until ca. October 2012. At that point, the author fixed it and released a new version.

                                              As an Arch Linux user, I’m obligated to inform you that I maintain a package for it in the AUR.

                                              1. 1

                                                Is anyone aware of any maintained Python charting libraries that are less dependency-heavy?

                                                Seaborn perhaps? http://stanford.edu/~mwaskom/software/seaborn/index.html

                                                1. 1

                                                  Ah, seaborn looks very nice too, but alas has the exact same dependencies as ggplot. :)

                                                  1. 1

                                                    I did not notice it uses the same base libs as ggplot.py. I have been taking the easy route by installing via Anaconda, which makes most of these dependency installs simple on OS X.

                                              1. 1

                                                Proposed: R, Julia

                                                1. 3

                                                  I was surprised to see D mentioned. I’ve never met anyone who codes in it, for work or play. Can someone with experience in that language give me a quick rundown on the features?

                                                  1. 2

                                                    I am seeing D mentioned more often in discussions relating to Go. I know nothing about D but it appears to have something that invites comparisons between the two.

                                                    1. 1

                                                      They are both successors to C++, though in the case of Go I think they dropped back closer to C before moving forward again. D has basically every C++ feature, but with different syntax and some other relaxations to make it easier to use.

                                                      You see Ruby and Python mentioned a lot in the same circles, too, though I would argue the difference between Go and D is larger than that between Ruby and Python.

                                                  1. 8

                                                    I don’t particularly like Go, for probably-subjective reasons, but I would still like to read technical articles about it. It seems obviously on-topic and relevant, so the request to stop posting about Go comes across as rude.

                                                    1. 0

                                                      I believe Go is actively harming the software industry by providing a very incapable tool for building working software, yet it’s presented as a real alternative. I do not think it’s responsible to let others get introduced to such silliness.

                                                      1. 7

                                                        I would have to disagree, as I find it quite suitable for building useful real world systems. I believe the CoreOS people would also disagree with you, as would many other people and organisations.

                                                        1. 0

                                                          I will have to disagree and claim that the suitability is an illusion.

                                                          1. 10

                                                            I, like many others, have built useful real-world systems in other languages. I didn’t start using Go because I was forced to. I’ll leave it at that.

                                                            1. 4

                                                              Go, like many languages, is suitable for some tasks and not for others. What about it is actually harming the industry, other than cases where developers misuse it (something that can happen with any tool)?

                                                              1. 9

                                                                No parametric polymorphism (i.e. generics) - let’s not even go past this point. It’s so fundamental that giving it up is accepting defeat by your programming language.

                                                                Types are propositions and programs are their proofs (see the Curry-Howard isomorphism) - this has many consequences, but one is that we can use types to prove properties about values and functions. Easy, right? People will say “I don’t care about proving that an int is an int” but they’re missing a huge implication - one being: parametric functions create cases where there’s only a limited number of implementations.

                                                                Given a function with the type:

                                                                a -> a
                                                                

                                                                (i.e. a implies a)

                                                                What are the possible implementation(s)? Let’s try to implement it:

                                                                parametric :: a -> a
                                                                parametric a = _
                                                                

                                                                What can we put in the underscore hole? We need to produce an a; what are the possible a values which we have in scope? The only reasonable answer we have is a. This means that, given the signature a -> a, there is only a single implementation.

                                                                We don’t need anything else from this function, no tests, no names, no comments - nothing but the type signature to know exactly what this function does.
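
                                                                Filling the hole as described gives the minimal sketch below. The `main` driver is an assumption added only so the snippet runs on its own, and the “single implementation” reading holds as long as we ignore non-terminating definitions and `undefined`.

                                                                ```haskell
                                                                -- The hole from above, filled with the only value of type a in scope.
                                                                parametric :: a -> a
                                                                parametric a = a

                                                                -- Illustrative driver (not part of the original comment).
                                                                main :: IO ()
                                                                main = do
                                                                  print (parametric (42 :: Int))
                                                                  putStrLn (parametric "unchanged")
                                                                ```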

                                                                That’s a pretty simple example, right? Let’s take a real world problem that I had yesterday:

                                                                I wanted a function which would take a list of disjunctive types and give me a list of just the right sides:

                                                                [Either3 a b c] -> [c]
                                                                

                                                                Now, I could write this function directly against the list and Either3 types but there’s a chance I could do something like:

                                                                getRights :: [Either3 a b c] -> [c]
                                                                getRights xs = []
                                                                

                                                                So, getting all the rights from my list returns an empty list - not what I want! If we make this type signature very generic, then we get to a place where we only have a single possible implementation:

                                                                unite :: (Monad m, Plus m, Foldable t) => m (t a) -> m a
                                                                unite value = value >>= foldMapPlus return
                                                                

                                                                Here are example usages:

                                                                unite [Right 1, Right 2, Left "Whoops", Middle False, Right 3]
                                                                -- [1, 2, 3]
                                                                
                                                                unite [Just 1, Nothing, Just 2]
                                                                -- [1, 2]
                                                                

                                                                Not only have we specified exactly what should happen in the type, but we’ve gotten a function which works for infinitely many types.
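
                                                                The `unite` definition above relies on `Plus` and `foldMapPlus`, which are not in GHC’s base library. A runnable approximation using only base might look like the sketch below. Swapping in `MonadPlus` and `foldr`, and the `Either3` type with its constructors `Left3`, `Middle3`, and `Right3`, are assumptions made for illustration rather than the original definitions.

                                                                ```haskell
                                                                import Control.Monad (MonadPlus (..))

                                                                -- Hypothetical three-way sum type standing in for the comment's Either3.
                                                                data Either3 a b c = Left3 a | Middle3 b | Right3 c

                                                                -- Folding an Either3 visits only the value in the right-hand case.
                                                                instance Foldable (Either3 a b) where
                                                                  foldr f z (Right3 c) = f c z
                                                                  foldr _ z _ = z

                                                                -- Keep every element found inside each inner container, drop the rest.
                                                                unite :: (MonadPlus m, Foldable t) => m (t a) -> m a
                                                                unite value = value >>= foldr (mplus . return) mzero

                                                                main :: IO ()
                                                                main = do
                                                                  print (unite [Right3 1, Right3 2, Left3 "Whoops", Middle3 False, Right3 3] :: [Int])
                                                                  print (unite [Just 1, Nothing, Just 2] :: [Int])
                                                                ```

                                                                Because `unite` only asks for `Foldable` on the inner type, the same definition covers `Maybe`, the hypothetical `Either3`, and any other container with a `Foldable` instance.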

                                                                I consider writing programs without free theorems to be absolutely archaic. Recommending for people to do exactly that (by advocating Go, for example) is harmful.

                                                                1. 7

                                                                  You didn’t actually say how it was harming the industry. You just stated that you prefer programming languages with certain properties.

                                                                  Also, it’s clear that you care very deeply about advocating for the functional paradigm. It would help your case if you didn’t include flame bait in your comments. (For instance, I’ve used Go effectively. But your comments would have me believe that I’m just deluded. But I’ve also used Haskell effectively. Am I deluded about that, too?)

                                                                  1. 1

                                                                    It’s not a preference; advocating programming without free theorems is harmful. I did say that:

                                                                    I consider writing programs without free theorems to be absolutely archaic. Recommending for people to do exactly that (by advocating Go, for example) is harmful.

                                                                    And yes, I believe it’s an illusion that Go is effective at solving problems.

                                                          2. 6

                                                            You could argue the same about PHP, Java or $whatever_language_you_dislike.

                                                            1. 3

                                                              Yes and I will, keeping in mind that Java is the most capable.

                                                            2. 3

                                                              That’s exactly like saying “Let’s not talk about drugs because they’re bad.” Well, how do you tell others that drugs are bad without talking about them, discussing them, etc?

                                                              The same argument has been made about sex-ed and abstinence-only education. Guess what, when persons who’ve only been told “just don’t do it” do it and don’t know the first thing about the consequences, bad things happen.

                                                              So let the topic come up here. And if Go really is so terrible, then let it be discussed. Give others the opportunity to learn and understand what makes Go the “literally Hitler” of programming languages (or whatever might be good about it).

                                                              1. 1

                                                                Let’s talk about drugs. Let’s not present drugs to people without experience, telling them that it’s a great idea.

                                                              2. 2

                                                                Go does exactly what it sets out to do: make solving Google’s problems using code easier. Anything else is a nice side effect of Go making it easier for Google to do things.