
  2. 31

    I disagree with this article. Selecting the right programming language is all about understanding the problem at hand and picking the best tool for the job. Part of that is understanding the long-term effects of your choice and planning ahead. If you later need to change the entire structure because you chose an ill-suited programming language or platform, you’re in for a world of hurt when the whole thing has to be redone in something else. It’s not premature optimization.

    1. 14

      The best way to understand the problem at hand is to try solving it. Selecting a language that is good enough for most cases is a great way to get started and find out which parts of the problem are hard and which are not. This doesn’t apply to every problem, since some are well studied and have many resources for you to look at, but in those cases you would hopefully be using solutions that have already been made.

      If you choose a language believing that performance will be a major problem, and the real problem ends up being maintainability, then you have made a premature optimization. Worse, some people will be unwilling to rectify it by moving to another language, because they are convinced that their language choice “solved” the performance problem and that is why they have never seen it. Solutions to problems that never existed in the first place are hard to get rid of, because people believe the solution is the reason they haven’t seen the problem.

      Also, it’s worth noting that the majority of software written nowadays that people actually get to choose a language for (web servers) is not particularly performance sensitive, as long as you can throw enough money at it. And as the article said, you’d often need additional manpower to create the same software in a more performant language, and the crossover point for that is far beyond what most companies experience. You have to know that the problem you are solving cannot be scaled horizontally to justify the extra cost of developing it in a performant language, and that, once again, requires knowledge of the problem, which you don’t necessarily have.

    2. 23

      “Selecting a programming language can be a form of premature optimization, so select Python because it is the optimal choice.”

      I do think people (especially on the junior end of the experience spectrum) spend way more time and energy on language choice than is useful. And sometimes it is due to concerns about performance that don’t matter in context. There is a good article to be written about that.

      But this article is just a “Python is good and you should definitely use it” advocacy piece.

      1. 9

        This is unhelpfully driven, in part, by our industry’s stubborn clinginess when it comes to technology. Every recruiter and every hiring manager I’ve ever met has not only asked me what languages I prefer, but has been taken aback, to the point of hostility, by questions like “for what?” and answers like “it depends”. If choosing the right language is indeed a form of premature optimisation, it’s hard to blame the people who do it: oftentimes it’s a career choice.

        That, in turn, is driven by many other factors, not least the fact that language complexity these days is mind-boggling. Even Python, which has the reputation of a simple language, is not that easy – your average codebase doesn’t use just Python “the language”, but also a myriad of conventions about what is and isn’t Pythonic, all sorts of decorators of questionable usefulness, and so on.

        It’s like every general-purpose language out there is slowly evolving to encompass several DSLs, to the point where you can’t just “know” Python, or C++, or Rust – you have to use all of it, full time, on a permanent basis, and follow all the important community blogs, and watch the conference talks, because the way you wrote code two years ago is no longer idiomatic.

        I, for one, am pretty hesitant to say I know Python, even though I’ve used it for a very long time – since before 2.x, in fact. Truth is, even though it’s the scripting language I am most familiar with (I buried Perl more than 10 years ago), if you stick me in front of a Python source file picked at random from a major project, there’s about a 50% chance that it’ll be basically incomprehensible to me unless I google the hell out of it.

        1. 1

          I kinda feel that keeping up with the shifting package managers is more troublesome than keeping up with shifting idioms. However, I have been working in Python and JS shops, so I recognise that I may be something of an extreme case.

      2. 21

        Strong disagreement.

        First, “premature optimization” has been twisted very far from its original meaning. Most people only know the bit “premature optimization is the root of all evil in computer science”, but they don’t know the part that comes before. In particular:

        The improvement in speed from Example 2 to Example 2a is only about 12%, and many people would pronounce that insignificant. The conventional wisdom shared by many of today’s software engineers calls for ignoring efficiency in the small; but I believe this is simply an overreaction to the abuses they see being practiced by penny-wise-and-pound-foolish programmers, who can’t debug or maintain their “optimized” programs. In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering.

        (NB: I didn’t know about this part until very recently, unfortunately.)

        Knuth argues that we should not overlook the easily obtained 12% extra performance. By selecting a different language – for example Go instead of Python – we can have gains of 10x, even 100x! What does it say about our profession that we would turn up our noses at such gains? (Fun exercise: write a program that does rot13 translation from stdin to stdout. I don’t know Go very well, but I was able to write a program that processes a 1 GB file in ~0.5s; my best Python version takes 82s, 160x slower.)
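
        For reference, a minimal sketch of one reasonably fast pure-Python take on that rot13 exercise (a hypothetical illustration, not the commenter’s actual program): build a 256-byte translation table once and let `bytes.translate` do the per-byte work in C.

        ```python
        # rot13 via a precomputed 256-byte table; bytes.translate runs in C,
        # so Python-level overhead is paid once per chunk, not once per byte.
        TABLE = bytearray(range(256))
        for base in (ord('a'), ord('A')):
            for i in range(26):
                TABLE[base + i] = base + (i + 13) % 26
        TABLE = bytes(TABLE)

        def rot13(data: bytes) -> bytes:
            # Non-letter bytes map to themselves, so binary-safe passthrough.
            return data.translate(TABLE)
        ```

        Streaming stdin to stdout in large chunks (e.g. `sys.stdin.buffer.read(1 << 20)`) amortizes the call overhead; how close that gets to the Go version would need measuring, but it should be far better than a byte-at-a-time loop.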

        Second, this advice makes an assumption—one that I used to subscribe to—that is not borne out in practice: that you can wring performance out of a program that was not at all designed with performance in mind. There’s this idea that we can change a couple of algorithms, run a profiler, address the worst outliers, and end up with a fast program. But it’s rarely that simple: we often have a uniform mud ball of slowness and nothing in the profile really stands out. Making one part of the program faster has little effect, because all the rest is slow. There is a very good quote at the beginning of the io_uring paper:

        the initial design was centered around efficiency. Efficiency isn’t something that can be an afterthought, it has to be designed in from the start - you can’t wring it out of something later on once the interface is fixed.

        Python does have a place – I use it pretty much every week – but I don’t think it’s fair to say that it doesn’t have performance issues, and it’s really disingenuous to suggest that taking those issues into consideration in our initial design is a form of malpractice.

        1. 2

          In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal

          He never specifies what kind of improvement it is. It might be a manufacturing improvement, a maintainability improvement, a performance improvement, etc. The same goes for computer programs. Many improvements are actually trade-offs: one way of writing code may result in more performant code, but it will often result in slower time to completion and/or harder maintainability. It’s almost always a trade-off, and I feel like Knuth was ignoring that.

          1. 2

            his advice makes an assumption—one that I used to subscribe to—that is not borne by practice: that you can wring out performance out of a program that was not at all designed with performance in mind . . . Efficiency isn’t something that can be an afterthought, it has to be designed in from the start

            Ah, but what do you mean by efficiency? :) If it’s akin to “non-pathological” — e.g. use well-defined components, or a connection pool for your DB rather than connecting with each request — then I agree totally. But I frequently see penny-wise-and-pound-foolish premature optimizations made in the name of “efficiency” — or, more precisely, presumed yet unvalidated assumptions about efficiency. One common example is that service-to-service communication should be over gRPC rather than e.g. JSON-over-HTTP because it’s faster. But gRPC carries enormous costs to complexity, often isn’t actually faster than alternatives in any significant way, and it’s rare that service-to-service communication costs are a performance bottleneck in the first place!
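
            To make the “non-pathological” connection-pool example concrete, here is a toy sketch (hypothetical, names invented for illustration): pay the connection setup cost up front, then hand connections out per request instead of reconnecting each time.

            ```python
            import queue

            class ConnectionPool:
                """Toy pool: build connections once, reuse them per request."""
                def __init__(self, factory, size=4):
                    self._idle = queue.Queue()
                    for _ in range(size):
                        self._idle.put(factory())  # setup cost paid up front

                def acquire(self):
                    # Blocks when all connections are busy, rather than
                    # opening yet another one (the pathological behaviour).
                    return self._idle.get()

                def release(self, conn):
                    self._idle.put(conn)
            ```

            A per-request `connect()` would pay TCP/TLS/auth setup on every call; the pool is efficiency designed in from the start, and nothing about it is premature.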

            1. 4

              Ah, but what do you mean by efficiency? :)

              In the past, I’ve defined it a bit nebulously as making “reasonable” use of the computer’s resources. Now, we could argue about what constitutes “reasonable”: if a program is processing data at 10% the maximum theoretical speed of your memory, is that reasonable? Would 5% still be reasonable? Would 1%? I get that different people and different problems will have different thresholds, but basically it’s about not wasting too much machine time when solving the problem.

              One common problem of not thinking about performance upfront—and one that still is present in a lot of my code—is to start with “individual-level thinking” (to quote Casey Muratori): building classes that represent a single object and implement methods that work on a single object. If, for example, a method foo does a dynamic memory allocation and we call foo on 100,000 objects, then our program (and its users!) will pay the cost of 100,000 allocation syscalls. If instead we design our system around groups of objects (and your DB connection pool is an example of that), then foo can allocate once and use that memory to process a large number of objects, and thus amortize the cost of the allocations.

              If a program is not initially designed around groups of objects and the public API works in terms of single objects, it’s going to be a long, difficult slog to change the design to use batches instead.
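
              As a hypothetical sketch of that contrast (class and method names invented for illustration): a per-object method versus a group-level one.

              ```python
              class Point:
                  # Individual-level design: every operation allocates per object.
                  def __init__(self, x, y):
                      self.x, self.y = x, y

                  def scaled(self, k):
                      return Point(self.x * k, self.y * k)  # one new object per call

              class Points:
                  # Group-level design: one structure holds the whole batch.
                  def __init__(self, xs, ys):
                      self.xs, self.ys = list(xs), list(ys)

                  def scale(self, k):
                      # One pass, a couple of list allocations total,
                      # regardless of how many points are in the batch.
                      self.xs = [x * k for x in self.xs]
                      self.ys = [y * k for y in self.ys]
              ```

              Scaling 100,000 `Point`s allocates 100,000 objects; `Points.scale` touches the same data with two list allocations, and retrofitting the second shape onto a public API built around the first is exactly the hard migration described above.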

              As to your comparison of gRPC vs JSON-over-HTTP, I don’t know enough about gRPC to argue in the favour of either, but I think the point of batches applies: it doesn’t matter which one we pick if we’re going to be doing 100,000 service-to-service calls instead of, say, 1000.

              1. 1

                I agree that an API which deals in high-cardinality entities but doesn’t offer batching capabilities is a problem that’s always worth solving. I’d file this under the “non-pathological” umbrella, though admittedly knowing the difference between pathological architectural issues and deferrable optimizations is more art than science.

          2. 11

            Python is oriented towards productivity

            I think this is not invalid, but if you are looking for productive languages in 2021 you could do much better than python. The place where python does still have a competitive edge, of course, is hiring. But then again, you may come to rue the ease of hiring: there are quite a few footguns, and being easy to hire increases the likelihood that you wind up with someone amateurish (even though they are ‘seniors’), and if you get a junior, there are no guardrails around those footguns.

            1. 7

              I think this is not invalid, but if you are looking for productive languages in 2021 you could do much better than python.

              This seems incredibly subjective to me. It’s great that you feel that way, but when you word it this way you make it sound like an absolute which it most assuredly is not.

              1. 6

                If you don’t agree that Python is optimizing for developer speed (and I certainly don’t) then the whole article falls apart.

                You say hiring might be easier, but I decline anything from recruiters if it’s Python. I’m so done trying to make sense of code bases with no typing. Mypy helps, but real static types it is not.

                As far as I’ve seen, Python is picked because it’s the language everyone on the team knows.

                1. 4

                  Python productivity

                  There are two types of programmer productivity, addressing different problems:

                  1. I don’t know much about the underlying tech. Can I get this script running in 30 minutes?
                  2. I don’t know much about the problem domain. Can I change this multi-million line behemoth without breaking everything and getting fired?

                  Python excels at (1) and fails miserably at (2). Yes, I’m aware of mypy and its attempts at solving the second problem. It’s not there yet.
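
                  For what it’s worth, the kind of guardrail mypy does provide today looks like this (a hypothetical sketch, names invented): annotations a checker can verify across a codebase, even though nothing is enforced at runtime.

                  ```python
                  from dataclasses import dataclass

                  @dataclass
                  class Order:
                      quantity: int
                      unit_price: float

                  def order_total(orders: list[Order]) -> float:
                      # mypy flags callers that pass, say, list[dict] here;
                      # untyped Python would only fail at runtime, deep in the call.
                      return sum(o.quantity * o.unit_price for o in orders)
                  ```

                  Running `mypy` over code like this catches a mis-typed caller before deployment; whether that scales to the multi-million-line case in (2) is the open question.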

                  1. 2

                    think this is not invalid, but if you are looking for productive languages in 2021 you could do much better than python

                    Such as? And why?

                    1. 4

                      I’m a plain old C/asm (or Lisp) guy, but it seems to me that if you’re thinking about using Python, then modern JavaScript (e.g. TypeScript if you like type annotations, as I do) running in Node.js does pretty much everything Python does but fifty times faster, if you’re actually writing algorithms and not just gluing together other people’s C libraries. There’s a similarly extensive library of modules ready to use. There’s a slightly different set of footguns, but are they worse? Probably not.

                      1. 5

                        GP was claiming that there were options “much better than python” out there, not “roughly comparable” ones.

                        But your specific claim is not even true. Node.js doesn’t do everything python does, at all. It doesn’t do sync IO, nor does it have support for multithreading or multiprocessing. It does allow you to start your application in multiple processes, but it doesn’t offer you a way to control them like you do in C, for example. This is a huge deal: you have no way to control the basic scalability of your application. It will queue up all IO calls until it exhausts the machine’s resources, and it has no concept of concurrency other than essentially doing everything in parallel. It’s a memory leak by design.

                        It has an old-fashioned and less ergonomic syntax with many more corner cases and quirks than python. And to my knowledge, it has no well-established (or possibly any) GPU library.

                        The point is that python is versatile. Node.js is not even primarily a programming language implementation. It is a single-threaded event machine that ships with a JavaScript API, provided by an implementation extracted from a browser. It still puzzles me that people don’t find this weird: a library is the main piece, and the programming language comes as an addition.

                        1. 1

                          No two things that are distinct do all the same things as each other. You choose the features that matter to you.

                          “It doesn’t do sync IO, nor does it have support for multithreading”

                          That’s a contradiction. If you’re programming in plain old JS then it doesn’t do sync IO because while you’re waiting it goes off and does things for other threads. But if you’re programming in Iced CoffeeScript or TypeScript then you can write your program as if the IO were synchronous, without explicitly writing the underlying callback yourself – exactly the same as happens in C, where when you do sync IO your thread library (or the OS) saves all your registers, loads someone else’s registers, and runs another thread until your IO is done.

                          1. 1

                            If you’re programming in plain old JS then it doesn’t do sync IO because while you’re waiting it goes off and does things for other threads.

                            What makes you think that is how it works? It is not. Browser JavaScript and Node.js are single-threaded. In fact, the whole reason Node.js was created was to provide a build of a single-threaded JS implementation with async IO. Everything is performed on the same thread. Perform an operation that takes perceptible time and your whole application freezes; your browser becomes unresponsive. Put a for loop with a few million iterations inside a callback and see what happens.

                            Node.js is a standalone build of V8, Chrome’s JavaScript engine. Early Node.js builds for Windows did execute IO by relying on worker threads, but that is beside the point, as you don’t have access to them. It was a workaround not intended for production use, but rather a way for people to use their Windows machines for development. In the end you only have access to one thread and you have to do everything there, by design. You have to trust it to use whatever resources it needs to complete IO (essentially starving the machine of resources if you push it), and you have no API to control what gets done simultaneously or when to wait for what. Notice that this is a perfectly acceptable value proposition for a script executed in the context of a web page. But it is absurd in simple cases such as writing to a text file line by line, sequentially: you have no way of doing that without letting Node.js open as many IO descriptors as quickly as possible, sort of like fork-bombing your own machine. Node.js did include synchronous IO, but deprecated it because people put it inside callbacks and flooded GitHub with tickets claiming their applications would ‘randomly’ freeze.

                            To this day, I still haven’t heard a reason why one would choose to write a server application, or even a command-line utility, using this IO model. From what I’ve gathered, the rationale includes people not being familiar with multithreading and synchronisation APIs and creating race conditions.

                            I am not sure what you are referring to when you mention TypeScript. Last time I checked, it was simply a compiler targeting JavaScript, with no runtime functionality of its own. But that was a few years ago; I don’t know about the present day. Iced CoffeeScript does provide alternative IO APIs, though.

                  2. 9

                    Two remarks:

                    • As the blog post points out, there are large companies with massive codebases in scripting-league languages (Python, PHP, Ruby, etc.; Javascript!) out there. But a surprising number of these companies are investing millions in trying to (1) implement some static typing on top of the language for better maintainability (performance is usually not the cited concern), or (2) build faster implementations than the mainstream one. (Companies seem to have more success with (1) than with (2), because optimizing a dynamic language is surprisingly difficult.) This could be taken positively, as in the blog post: “there are tools to make your Python codebase more maintainable / faster anyway”; but also negatively: “companies following this advice are now stuck in a very expensive hole they dug for themselves”.

                    • “Computation is cheap, developers are expensive” is an argument, but I get uneasy thinking about the environmental costs of order-of-magnitude-slower-than-necessary programs running in datacenters right now. (I’m not suggesting that they are all Python programs; I’m sure there are tons of slow-as-hell Java or C++ or whatever programs running in the cloud.) I wish that we would collectively settle on tools that are productive and energy-efficient, and try to think a bit more about the environment than “not at all” as in the linked post.

                    1. 5

                      We once did an estimate: if one of our embedded products consumed 3 W more per unit, we’d have burned nearly 500 MWh of extra energy over the deployed units’ then-lifetime.

                      Inefficient code in prod is irresponsible, and unlike crypto mining it is not shamed enough. You might think your slow script that just scratches your itch is nbd, but before you know it, it’s cloned and running on ten thousand instances…

                      1. 5

                        “Computation is cheap, developers are expensive” is an argument, but I get uneasy thinking about the environmental costs of order-of-magnitude-slower-than-necessary programs running in datacenters right now.

                        Not to mention the poor users who have to wait ten times longer for a command to finish. Their time is also expensive, and they usually outnumber the developers.

                        1. 2

                          “companies following this advice are now stuck in a very expensive hole they dug for themselves”.

                          One could argue that this is “a good problem to have.” I mean, it didn’t become a problem for Facebook or Dropbox or Stripe or Shopify until they were already wildly successful, right?

                          1. 4

                            There is a strong selection bias here as we hear much less about which technical issues plagued less-successful organisations. It would be dangerous to consider any choice of a large company to be a good choice based on similar reasoning.

                        2. 7

                          The real tragedy is that this choice is still such a big deal, because calling code in one language from another has improved only slightly in decades.

                          1. 7

                            Watching the responses to this thread has been an illuminating look at a sizable chunk of the lobsters user base.

                            While there are a sizable number of crustaceans who also self identify as Pythonistas, I suspect there is a much larger group who do not and who very strongly favor statically typed languages like Rust, Zig, the C family or Go.

                            The folks from the statically typed set seem to be very dismissive of Python as a programming language and, perhaps without intending to be, the people who favor it.

                            There’s nothing wrong with that per se, but I would just ask those highly opinionated folks to consider recognizing that very few things in the universe of computer science and programming are absolutes, and it would be helpful to remember that when posting.

                            1. 3

                              I do see that a lot of people are raising their concerns, but I don’t see (most of) them as being dismissive of Python. That article is very dismissive of non-Python languages though.

                              1. 3

                                That article is very dismissive of non-Python languages though.

                                Would you mind elaborating on why you say this please?

                                My take:

                                The article was written by a member of the Python Steering Council, based on his experiences with teams for whom the primary programming language is Python and would PREFER to keep using Python but who probably aren’t aware of the potential optimization paths that exist so they can keep using their high velocity language of choice.

                                In short, saying “Hey you have lots of options to try before you completely rewrite your code base and give up on Python” isn’t being dismissive of other programming languages at all, it’s making people aware of the tools that the Python ecosystem has to offer.

                                That’s my $.02 anyway :)

                                1. 1

                                  Have you ever been told that Python couldn’t be used for a project because it wouldn’t be fast enough? I have, and I find it a bit frustrating as big banks, YouTube, Instagram, and plenty of other places that are performance-sensitive still manage to select Python and be happy. […]

                                  I think this is what you focus on, but I see this instead:

                                  […] And so this blog post is going to argue that Python makes sense to select even for projects with performance concerns and how to work towards better performance in an iterative fashion if your first attempt isn’t fast enough. […]

                                  My point is just that there is a significant number of problems that simply can’t be solved properly with Python.

                                  If you are implementing some database system or something like that, you know that Python won’t do it. You can still write a prototype in Python but you really shouldn’t settle with it.

                                  I feel like this is an important thing to point out when making that argument. To omit that feels dismissive to me.

                                  1. 2

                                    My point is just that there is a significant number of problems that simply can’t be solved properly with Python.

                                    If you are implementing some database system or something like that, you know that Python won’t do it. You can still write a prototype in Python but you really shouldn’t settle with it.

                                    I feel like this is an important thing to point out when making that argument. To omit that feels dismissive to me.

                                    ORLY? :) Some investment mega-banks would beg to differ

                                    Mind you, I’m not suggesting that Python is an ideal programming language with which to implement a highly scalable ACID-compliant SQL database – I don’t know what the answer to that particular question is, and I suspect neither do you.

                                    Everyone has tools they prefer to others. Different tools have different performance characteristics under different kinds of load. There are no absolutes. Acting as if there are is a mistake in my book.

                                    1. 2

                                      That article is really weird. On the one hand, the author clearly seems to be competent in both programming and writing. However, most of the stuff discussed in the article seems ludicrous to me.

                                      Everyone has tools they prefer to others. Different tools have different performance characteristics under different kinds of load. There are no absolutes. Acting as if there are is a mistake in my book.

                                      +1

                                      I think that puts it in a nutshell.

                            2. 6

                              More general principle: making any decision can be a form of premature optimisation. Compared to any point in time that comes later, the point when you make the decision is the point in time when you know the least about the problem you’re solving.

                              Keep options open and explore parallel tracks for as long as economically feasible.

                              1. 2

                                So I just got a new split keyboard thing so I’m going to write out an opinion, which is inevitably along the lines of what has already been shared.

                                Python is a terrible language with a super great library ecosystem.

                                It’s a terrible language because it’s stuck with syntax and design choices due to backwards compatibility but it’s also fractured by having broken that backwards compatibility… Actually from this pov any language without a very measured approach to updates sucks, yes I am a lisp believer…

                                With that unfair opinion out of the way I’d like to dig into the “selecting a programming language” part.

                                First off: as someone said, the right idea is to start trying to solve the problem and be prepared to throw away the solution and rewrite it in a more appropriate language once the problem is better understood.

                                Secondly: someone also mentioned that juniors spend an unnecessary amount of cognitive capacity on choosing a language, and the point about it being a career choice (/ religious choice) was raised.

                                What I’d like to contribute to the discussion is to expand on the “you use what you know” angle.

                                Most people want to solve problems not learn tools. When you finally know a few tools you just use them, you don’t debate whether you should use something else (unless you are navel-gazing and/or studying computer science).

                                Therefore the initial choice is actually super important in some restricted sense but if you don’t take that stance then you usually actually don’t have a choice. You are just going to use whatever is in front of you, i.e. default into js/python/R/java/C.

                                If you actually bikeshed the initial choice then you probably never actually commit enough to anything to get trapped by your expertise in using a familiar tool (the one you use “in anger”).

                                The thing is: describing computation is just not as important as describing data (assumptions!). Computations proceed from the assumptions; our failure to coordinate on an acceptable data interchange format is a tragedy (and ever since JSON there has been real progress in infrastructure development).

                                I feel like the old assumption was to build on top of a database and the new assumption is to build underneath a data interchange format (the database is a smaller component of the whole system than it used to be, coordination takes up a much larger part than before).

                                So yeah, think about the philosophy of your tools so you won’t end up switching when they inevitably turn in your hands (unless they are under your control). If you are expert enough to know many languages then you know the purposes to which they are suited and can make judgment calls on where to prototype and where to rewrite. If you are a noob then you should learn the five true programming languages (I love putting an opinion like this here where everyone is a bigger PLT nerd than I am):

                                • Forth :: Most low level
                                • Lisp :: Best man-meets-machine intersection
                                • ML :: Best man-meets-other-mans-code intersection
                                • Prolog :: Most high level
                                • C :: Most similar model to turing machines taught in school

                                I claim Forth leads to Prolog which leads to ML which leads to Lisp which leads to C which leads you back to Forth (https://en.wikipedia.org/wiki/Wuxing_(Chinese_philosophy)).

                                1. 2

                                  In my experience this is rarely an issue. Most places (that I worked for or otherwise know) went for Python, Ruby and the like first, until they realized that bottlenecks cannot all be “solved” with horizontal scaling.

                                  1. 1

                                    Did I miss something in this essay? Seems like the title says not to select programming languages too soon and then it’s Python, Python, Python.

                                    Is Python not a programming language?

                                    I feel like with this title there might be something interesting to bring out, but this essay didn’t do it, at least for me. I feel like the author is arguing against their own thesis.

                                    1. 2

                                      It’s written as though the default language for all products is Python, i.e. Python’s not a “choice” in any meaningful sense. Note the choice to “Prototype in Python” is basically just waved away as “you have to start somewhere.” Overall, this reads like a rehash of the language choice bits of Paul Graham’s Beating the Averages but with Python rather than Lisp. Given the popularity of Python though, it doesn’t give you the edge Graham claims he gained from using Lisp.

                                      To be a bit more charitable to the author, I think the argument may be something along the lines of “use the language you know best and worry about optimization later” with the assumption that a great many people know Python best. Depending upon the project, that might not be bad advice but it certainly seems over-generalized here.

                                      1. 1

                                        To be a bit more charitable to the author, I think the argument may be something along the lines of “use the language you know best and worry about optimization later”

                                        I substituted that in and kinda made it all work, but hell, that sure made it a weak essay.

                                        Thanks.