1. 8

    You should seriously look at the napoleon / google doc: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/

    This is an already implemented and supported standard.

    1.  

      Yes indeed. For those unfamiliar, here are (from the link) examples of the two docstring styles that Napoleon (a Sphinx extension) parses and renders. PyCharm, too, parses Numpy and Google docstrings and uses the information for tooltips, static analysis, etc.

      Google style:

          def func(arg1, arg2):
              """Summary line.
      
              Extended description of function.
      
              Args:
                  arg1 (int): Description of arg1
                  arg2 (str): Description of arg2
      
              Returns:
                  bool: Description of return value
      
              """
              return True
      

      NumPy style:

          def func(arg1, arg2):
              """Summary line.
      
              Extended description of function.
      
              Parameters
              ----------
              arg1 : int
                  Description of arg1
              arg2 : str
                  Description of arg2
      
              Returns
              -------
              bool
                  Description of return value
      
              """
              return True
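
      To turn this on in a Sphinx project, something like the following in conf.py should be enough. This is a minimal sketch, not a complete configuration: Napoleon has shipped with Sphinx as sphinx.ext.napoleon since Sphinx 1.3, and the two napoleon_* settings already default to True, so they are spelled out here only to show where each style is toggled.

      ```python
      # conf.py (Sphinx configuration): minimal sketch, other settings omitted.
      extensions = [
          "sphinx.ext.autodoc",   # pulls docstrings out of the code
          "sphinx.ext.napoleon",  # converts Google/NumPy sections to reST
      ]

      # Both default to True; listed only to make the toggles visible.
      napoleon_google_docstring = True
      napoleon_numpy_docstring = True
      ```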
      
      1.  

        Hmm, Google’s style + napoleon extension does seem quite good. I wonder if I should update my style guide. I suggested there that you should just bite the bullet and use Sphinx style due to the doc auto-gen benefits, but this seems like the best of both worlds.

    1. 2

      Learning Racket & Scheme.

        1. 3

          Used to be http://pixelmonkey.org, but I just migrated it to http://amontalenti.com. Not 100% technical content, but a lot of technical stuff there.

          Here’s my tech category: https://amontalenti.com/category/technology

          Also, some technical posts at: https://blog.parse.ly/post/author/andrew-montalenti/

          1. 3

            Playing with GatsbyJS and Zappa/Lambda.

            1. 5

              I haven’t read the book, but noticed that the author presented the book and the topic at Google and it was published a few days ago. On YouTube here:

              https://www.youtube.com/watch?v=bmSAYlu0NcY

              1. 2

                The tl;dr is that Facebook has released, via open source, a new tool for doing single-file deployments of Python projects, a problem which is also partially solved by pex and Python3’s native zipapp.

                The architecture of XARs is novel, though. It’s not a zip file that gets decompressed on the fly; instead, it’s more a disk image which gets mounted on the fly. The details:

                XARs are slightly modified squashfs files that mount themselves when executed and unmount after an idle timeout. They could almost be thought of as a self-executing container without the virtualization. By using the squashfs format, we not only distribute data in a far more compressed format than with zip file, but we also decompress on demand only the portions we need. Thanks to this architecture, XARs have nearly zero overhead in production and can be used just as native scripts or executables would be. […] XARs have advantages for interpreted languages like Python. By collecting a Python script, associated data, and all native and Python dependencies, we achieve a hermetic binary that can run anywhere in our infrastructure, regardless of operating system or packages already installed. In fact, this works for many Python tools as well as for JavaScript (Node.js), Lua tooling, and bundling multiple C++ executables and data files together, yielding a single archive that is smaller and can be moved as a single unit.

                So, though it solves this problem for Python, Facebook suggests that XARs might be useful as a generic packaging tool for production code, without the overhead of full machine pre-baked images (e.g. packer AMIs), and also without the complexity of Linux system package managers (e.g. deb, rpm).
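
                For comparison, here is roughly what the zip-based route looks like with Python 3’s zipapp. This is a minimal sketch using only the standard library; the "app" directory, the "app.pyz" name, and the printed message are made up for illustration. It shows the approach XAR is contrasted with above, not XAR itself.

                ```python
                import subprocess
                import sys
                import tempfile
                import zipapp
                from pathlib import Path


                def build_and_run_pyz() -> str:
                    """Bundle a tiny app into one .pyz file, run it, return its output."""
                    with tempfile.TemporaryDirectory() as tmp:
                        src = Path(tmp) / "app"
                        src.mkdir()
                        # zipapp uses __main__.py as the archive's entry point.
                        (src / "__main__.py").write_text('print("hello from a zipapp")\n')

                        target = Path(tmp) / "app.pyz"
                        zipapp.create_archive(src, target)

                        # The interpreter runs the archive straight from the zip file.
                        result = subprocess.run(
                            [sys.executable, str(target)],
                            capture_output=True,
                            text=True,
                            check=True,
                        )
                        return result.stdout.strip()


                if __name__ == "__main__":
                    print(build_and_run_pyz())
                ```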

                1. -1

                  As one insignificant user of this language, please stop adding these tiny edge case syntax variations and do something about performance. But I am one small insignificant user …

                  1. 56

                    This is exactly the attitude that leads to maintainer burnout.

                    Do realize this:

                    • Adding syntax variations is not done at the expense of performance, different volunteers are working on what’s more interesting to them.
                    • Python is not a product, and you’re not a paying customer, you don’t get to say “do this instead of that” because none of the volunteer maintainers owes you to produce a language for you. Just walking by and telling people what to do with their project is at the very least impolite.
                    • If you want something to happen in an open source project, volunteer to do it.

                    (None of this is aimed at you personally, I don’t know who you are. I’m dissecting an attitude that you’ve voiced, it’s just all too common.)

                    1. 23

                      Python is not a product, and you’re not a paying customer, you don’t get to say “do this instead of that” because none of the volunteer maintainers owes you to produce a language for you. Just walking by and telling people what to do with their project is at the very least impolite.

                      I agree with the general direction of your post, but Python is a product and it is marketed to people, through the foundation and advocacy. It’s not a commercial product (though, given the widespread industry usage, you could argue it somewhat is). It’s reasonable of users to form expectations.

                      Where it goes wrong is when individual users claim that this also means that they need to be consulted or their consultation will steer the project to the better. http://www.ftrain.com/wwic.html has an interesting investigation of that.

                      1. 2

                        Where it goes wrong is when users claim that this also means that they need to be consulted or their consultation will steer the project to the better.

                        Wait, who is the product being built for, if not the user? You can say I am not a significant user, so my opinion is not important, as opposed to say Google which drove Python development for a while before they focused on other things, but as a collective, users’ opinions should matter. Otherwise, it’s just a hobby.

                        1. 5

                          Sorry, I clarified the post: “individual users”. There must be a consultation process and some way of participation. RFCs or PEPs provide that.

                          Yet, what we regularly see is people claiming that the product would be a better place if we listened to them (that one person we never met). Or, alternatively, people who just don’t want to accept a loss in a long-running debate.

                          I don’t know if that helps clarify; it’s a topic for huge articles.

                          1. 3

                            I often find what people end up focusing on - like this PEP - is bike shedding. It’s what folks can have an opinion on after not enough sleep and a zillion other things to do and not enough in depth knowledge. Heck I could have an opinion on it. As opposed to hard problems like performance where I would not know where to start, much less contribute any code, but which would actually help me and, I suspect, many other folks, who are with some sighing, migrating their code to Julia, or, like me, gnashing their teeth at the ugliness of Cython.

                            1. 4

                              Yeah, it’s that kind of thing. I’ll take a harsh but well-structured opinion any time, and those people are extremely important. What annoys me is people following a tweet-sized mantra to the end, showing along the way that they have not looked at everything that is involved or who would benefit, or not knowing when to let go of a debate.

                      2. 17

                        Adding syntax variations is not done at the expense of performance, different volunteers are working on what’s more interesting to them.

                        Regrettably, a lot of languages and ecosystems suffer greatly from the incoherence that this sort of permissive attitude creates.

                        Software is just as much about what gets left out as what gets put in, and just because Jane Smith and John Doe have a pet feature they are excited about doesn’t mean they should automatically be embraced when there are more important things on fire.

                        1. 8

                          the incoherence that this sort of permissive attitude creates

                          The Haskell community would’ve just thrown PEP 572 behind {-# LANGUAGE Colonoscopy #-} and been done with it.

                          Sure, this doesn’t get us out of jail free with regard to incoherence, but it kicks down the problem from the language to the projects that choose to opt-in.
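
                          For anyone who hasn’t looked at the PEP itself: PEP 572 adds the := assignment expression, which needs Python 3.8 or later. A minimal sketch of the before and after; the function names and the regex are made up for illustration.

                          ```python
                          import re


                          def first_number_classic(text):
                              # Before PEP 572: bind on one line, test on the next.
                              match = re.search(r"\d+", text)
                              if match is not None:
                                  return match.group()
                              return None


                          def first_number_walrus(text):
                              # With PEP 572: bind and test in a single expression.
                              if (match := re.search(r"\d+", text)) is not None:
                                  return match.group()
                              return None


                          if __name__ == "__main__":
                              print(first_number_classic("order 66"))
                              print(first_number_walrus("order 66"))
                          ```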

                          1. 2

                            I find it hard to see this as a good thing. For me, it mostly highlights why Haskell is a one-implementation language… er, 2 ^ 227 languages, if ghc --supported-extensions | wc -l is to be taken literally. Of course, some of those extensions are much more popular than others, but it really slows down someone trying to learn “real world” Haskell by reading library code.

                            1. 2

                              Of course, some of those extensions are much more popular than others

                              Yeah, this is a pretty interesting question! I threw some plots together that might help explore it, but it’s not super conclusive. As with most things here, I think a lot of this boils down to personal preference. Have a look:

                              https://gist.github.com/atondwal/ee869b951b5cf9b6653f7deda0b7dbd8

                          2. 4

                            Yes. Exactly this. One of the things I value about Python is its syntactic clarity. It is the most decidedly un-clever programming language I’ve yet to encounter.

                            It is that way at the expense of performance, syntactic compactness, and probably some powerful features that could make me levitate and fly through the air unaided if I learned them, but I build infrastructure and day in, day out, Python gets me there secure in the knowledge that I can pick up anyone’s code and at the VERY LEAST understand what the language is doing 99% of the time.

                          3. 4

                            I find that “people working on what interests them” as opposed to taking a systematic survey of what use cases are most needed and prioritizing those is a hard problem in software projects, and I find it curious that people think this is not a problem to be solved for open source projects that are not single writer/single user hobby projects.

                            Python is interesting because it forms core infrastructure for many companies, so presumably they would be working on issues related to real use cases. Projects like numpy and Cython are examples of how people see an important need (performance) and go outside the official language to get something done.

                            “If you want something to happen in an open source project, volunteer to do it.” is also one of those hostile attitudes that I find curious. In a company with a paid product of course that attitude won’t fly, but I suspect that if an open source project had that attitude as a default, it would gradually lose users to a more responsive one.

                            I want to use this response from a library author as an example of a positive response that I value. This is a library I use often for a hobby. I raised an issue and the author put it in the backlog after understanding the use case. They may not get to it immediately. They may not get to it ever based on prioritization, but they listened and put it on the list.

                            Oddly enough, I see this kind of decent behavior more in the smaller projects (where I would not expect it) than in the larger ones. I think the larger ones with multiple vendors contributing turn into a “pay to play” situation. I don’t know if this is the ideal of open source, but it is an understandable outcome. I do wish the hostility would decrease though.

                            1. 12

                              Performance has never been a priority for Python and this probably won’t change, because as you said, there are alternatives if you want Python’s syntax with performance. Also, its interoperability with C is okay-ish, which means that the small niche of Python users who use it for performance-critical operations not already supported by NumPy, Numba, and so on will always be free to go that extra mile to optimize their code without much trouble compared to stuff like JNI.

                              If you want raw performance, stick to C/C++ or Rust.

                              1. 3

                                I also observe the same tendency of smaller projects being more responsive, but I think the issue is scale, not “pay to play”. Big projects get so many more issue reports, but their “customer services” are not proportionally large, so I think big projects actually have fewer resources per issue.

                              2. 0

                                He did say “please”.

                              3. 7

                                please stop adding these tiny edge case syntax variations and do something about performance.

                                There’s a better forum, and approach, to raise this point.

                                1. 2

                                  I guess you are saying my grass roots campaign to displace “Should Python have :=” with “gradual typing leading to improved performance” as a higher priority in the Python world is failing here. I guess you are right :)

                                2. 1

                                  Have you tried Pypy? Have you tried running your code through Cython?

                                  Have you read any of the zillion and one articles on improving your Python’s performance?

                                  If the answer to any of these is “no” then IMO you lose the right to kvetch about Python’s performance.

                                  And if Python really isn’t performant enough for you, why not use a language that’s closer to the metal like Rust or Go or C/C++?

                                  1. 6

                                    Yes to all of the above. But not understanding where all the personal hostility is coming from. Apparently having the opinion that “Should := be part of Python” is much less important than “Let’s put our energies towards getting rid of the GIL and creating a kickass implementation that rivals C++” raises hackles. I am amused, entertained but still puzzled at all the energy.

                                    1. 5

                                      There was annoyance in my tone, and that’s because I’m a Python fan, and listening to people kvetch endlessly about how Python should be something it isn’t gets Ooooold when you’ve been listening to it for year upon year.

                                      I’d argue that in order to achieve perf that rivals C++ Python would need to become something it’s not. I’d argue that if you need C++ perf you should use C++ or better Rust. Python operates at a very high level of abstraction which incurs some performance penalties. Full stop.

                                      1. 5

                                        This is an interesting, and puzzling, attitude.

                                        One of the fun things about Cython was watching how the C++ code generated approaches “bare metal” as you add more and more type hints. Not clear at all to me why Python can not become something like Typed Racket, or LISP with types (I forget what that is called) that elegantly sheds dynamism and gets closer to the metal the more type information it gets.

                                        Haskell is a high level language that compiles down to very efficient code (barring laziness and thunks and so on).

                                        Yes, I find this defense of the slowness of Python (not just you but by all commentators here) and the articulation that I, as one simple, humble user, should just shut up and go away kind of interesting.

                                        I suspect that it is a biased sample, based on who visits this post after seeing the words “Guido van Rossum”

                                        1. 8

                                          My hypothesis is that people who want performance are a minority among Python users. I contributed to both PyPy and Pyston. Most Python users don’t seem interested in either.

                                          1. 3

                                            For me that has been the most insightful comment here. I guess the vast majority of users employ it as glue code for fast components, or many other things that don’t need performance. Thanks for working on pypy. Pyston I never checked out.

                                          2. 5

                                            Not clear at all to me why Python can not become something like Typed Racket, or LISP with types (I forget what that is called) that elegantly sheds dynamism and gets closer to the metal the more type information it gets.

                                            Isn’t that what mypy is attempting to do? I’ve not been following Python for years now, so really have no horse in this race. However, I will say that the number of people, and domains represented in the Python community is staggering. Evolving the language, while keeping everyone happy enough to continue investing in it is a pretty amazing endeavor.

                                            I’ll also point out that Python has a process for suggesting improvements, and many of the core contributors are approachable. You might be better off expressing your (valid as far as I can see) concerns with them, but you might also approach this (if you care deeply about it) by taking on some of the work to improve performance yourself. There’s no better way to convince people that an idea is good, or valid than to show them results.

                                            1. 4

                                              Not really. Mypy’s goal is to promote type safety as a way to increase program correctness and reduce complexity in large systems.

                                              It doesn’t benefit performance at all near as I can tell, at least not in its current incarnation.

                                              Cython DOES in fact do this, but the types you hint with there are C types.
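
                                              A quick way to see that the hints are only metadata as far as CPython is concerned; a minimal sketch, with a made-up function name:

                                              ```python
                                              def double(x: int) -> int:
                                                  # The annotations are stored on the function; CPython does not
                                                  # check them when the function is called.
                                                  return x * 2


                                              if __name__ == "__main__":
                                                  # A checker such as mypy would flag this call, but it still runs.
                                                  print(double("ab"))  # abab
                                                  print(double.__annotations__)
                                              ```

                                              The interpreter does the same work with or without the annotations, which is why hints alone buy no speed.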

                                              1. 2

                                                Ah, I thought maybe MyPy actually could do some transformation of the code, based on its understanding, but it appears to describe itself as a “linter on steroids,” implying that it only looks at your code in a separate phase before you run it.

                                                Typed Racket has some ability to optimize code, but it’s not nearly as sophisticated as other statically typed languages.

                                              2. 3

                                                Be aware that even Typed Racket still has performance and usability issues in certain use cases. The larger your codebase, the larger the chance you will run into them. The ultimate viability of gradual typing is still an open question.

                                              3. 3

                                                In no way did I imply that you should “shut up and go away”.

                                                What I want is for people who make comments about Python’s speed to be aware of the alternatives, understand the trade-offs, and generally be mindful of what they’re asking for.

                                                I may have made some false assumptions in your case, and for that I apologize. I should have known that this community generally attracts people who have more going on than is the norm (and the norm is unthinking end users posting WHY MY CODE SO SLOW?).

                                                1. 2

                                                  Hey, no problem! I’m just amused at the whole tone of this set of threads set by the original response (not yours) to my comment, lecturing me on a variety of things. I had no idea that (and can’t fathom why) my brief comment regarding prioritization decisions of a project would be taken so personally and raise so much bile. What I’m saying is also not so controversial - big public projects have a tendency to veer into big arguments over little details while huge gaps in use cases remain. I saw this particular PEP argument as a hilarious illustration of this phenomenon in how Python is being run.

                                                  1. 3

                                                    Thinking about this a little more - sometimes, when languages ‘evolve’ I feel like they forget themselves. What makes this language compelling for vast numbers of programmers? What’s the appeal?

                                                    In Python’s case, there are several, but two for sure are a super shallow learning curve, and its tendency towards ‘un-clever’ syntax.

                                                    I worry that by morphing into something else that’s more to your liking for performance reasons, those first two tenets will get lost in the shuffle, and Python will lose its appeal for the vast majority of us who are just fine with Python’s speed as is.

                                                    1. 1

                                                      Yes, though we must also remember that as users of Python, invested in it as a user interface for our code ideas, we are resistant to any change. Languages may lose themselves, but changes are sometimes hugely for the better. And it can be hard to predict.

                                                      In Python’s 2.x period, what we now consider key features of the language, like list comprehensions and generator expressions and generators, were “evolved” over a base language that lacked those features altogether, and conservatives in the community were doubtful they’d get much use or have much positive impact on code. Likewise for the class/type system “unification” before that. Python has had a remarkable evolutionary approach over its long 3-decade life, and will continue to do so even post-GvR. That may be his true legacy.

                                          3. 1

                                            Heh. I think this is an example of the Lobste.rs rating system working as it should :) I posted an immoderate comment born of an emotional response to a perfectly reasonable reply, and ended up with a +1: +4 -2 troll, -1 incorrect :)

                                        1. 3

                                          A good list. I put together a list in 2012 with three types of software teams: vertically scaled, horizontally scaled, and fully distributed. I prefer fully distributed for a number of reasons, though vertically scaled can also work (with the right group). I think horizontally scaled is the anti-pattern, despite it being common.

                                          http://www.pixelmonkey.org/2012/05/14/distributed-teams

                                          1. 10

                                            I spent a few years of my professional life writing a set of Eclipse plugins and coding an Eclipse “RCP”, or, Rich Client Platform app. This was 2006-2008, so the community was active and Eclipse was winning.

                                            The author’s observation about P2 is astute. It seemed to me that all Eclipse subprojects started to become over engineered in this time period. I worked with EMF and GMF, which became massive projects (in terms of LoC) with lots of complexity. I know there were even efforts to extend the editor into lots of “meta” directions. I suspect all of this meant the project became A Big Pile, and eventually the users could feel it. It’s a pity.

                                            More broadly, my big problem with Java is how damn much IDEs need to know about the language and environment to make the programmer productive in them. When you contrast to environments like Python, Go, and Elixir, the difference is stark.

                                            1. 5

                                              This is my favorite programming talk that one can find online. I rewatch it frequently.

                                              1. 1

                                                Same

                                              1. 4

                                                I have an odd take on the “10x programmer” theory.

                                                I think a 10x programmer is just a solid programmer with 10+ years of programming experience, and with a certain ambitious/creative mindset. The reason it seems “innate” is because, often, by age 25, these programmers have already gotten their 10 years of experience. So few professions have this option to start the experience-building process so early that we assume it must be innate. For example, it is quite hard to be an electrical engineer or a physicist with 10 years of experience by age 25, but it is not far-fetched in programming. Take a look at concert violinists who started their training as young teens (or even younger) and it makes some more sense.

                                                I started programming at age 15. I programmed a whole lot – and it was for fun. By the time I was 25, I felt really productive as a programmer. But people who start at age 25 may take until their mid 30s to feel as confident. I don’t have more innate ability; I just had a whole lot more practice and time!

                                                That all said, there are definitely “bozo programmers” in the world. That is, people who picked the career for the wrong reasons (e.g. to sneak into a cushy job at a bank) and who are just getting by on good communication/political skills and the occasional snippet of StackOverflow code. But, I do think these people are being weeded out by the best hiring practices of the best tech companies.

                                                The more interesting question for me is whether 20 years of experience makes any difference vs 10. I suspect it only makes a marginal difference, which is why our industry also suffers from ageism.

                                                1. 1

                                                  The more interesting question for me is whether 20 years of experience makes any difference vs 10.

                                                  In my limited experience it makes quite a lot of difference.

                                                  Ten years ago I was a brilliant web developer (a full stack developer in current parlance).

                                                  Now I’ve faced so many fields and so many challenges that the best way I can describe myself is as a guy “able to identify the independent variables that govern a complex system and to find the simplest path to reach a certain point”.

                                                  This is so abstract that job interviewers usually stare at me as if I were suddenly speaking Klingon, so I have replaced it with a list of things that I did. Unfortunately, these do not give even a remote idea of what I’m actually able to do about complexity.

                                                  However, today my colleagues made me notice that I’m simply aging differently: for example, according to them, my ability to switch context rapidly is visibly reduced, just like my will to fight against a solution I know is broken or my ability to cope with workplace bureaucracy and politics.

                                                  1. 1

                                                    Very interesting. Thanks for sharing!

                                                1. 11

                                                  Public reminder that Python 3 was released 10 years ago, while Python 2 was only 8 years old when Python 3 was released. That’s some spectacular mishandling of a new software release.

                                                  1. 2

                                                    Huh?

                                                    1. 4

                                                      I think the point they’re trying to get at is that Python 3 is older than Python 2 was, when Python 3 first came out. Python 2 code is still very prolific, and anecdotally I’m seeing people still starting new projects with Python 2.

                                                  1. 2

                                                    The film “Revolution OS”, though a bit dated, has quite a lot of information regarding the “free software” vs “open source” terminology debate.

                                                    1. 1

                                                      cf. “Worse is Better.”

                                                      1. 4

                                                        At my undergrad CS program (NYU, 2002-2006) they taught Java for intro programming courses, but then expected you to know C for the next level CS courses (especially computer architecture and operating systems). Originally, they taught C in the intro courses, but found that too many beginning programmers dropped out – and, to be honest, I don’t blame them. C isn’t the gentlest introduction to programming. But this created a terrible situation where professors just expected you to know C at the next level, while they were teaching other concepts from computing.

                                                        But, as others have stated, knowing C is an invaluable (and durable) skill – especially for understanding low-level code like operating systems, compilers, and so on. I do think a good programming education involves “peeling back the layers of the onion”, from highest level to lowest level. So, start programming with something like Python or JavaScript. Then, learn how e.g. the Python interpreter is implemented in C. And then learn how C relates to operating systems and hardware and assembler. And, finally, understand computer architecture. As Norvig says, it takes 10 years :-)

                                                        The way I learned C:

                                                        • K&R;
                                                        • followed by some self-instruction on GTK+ and GObject to edit/recompile open source programs I used on the Linux desktop;
                                                        • read the source code of the Python interpreter;
                                                        • finally, I ended up writing C code for an advanced operating systems course, still archived/accessible here, which solidified it all for me.

                                                        Then I didn’t really write C programs for a decade (writing Python, mostly, instead) until I had to crack C back open to write a production nginx module just last year, which was really fun. I still remembered how to do it!

                                                        1. 3

                                                          One of the things I loved about my WSU CS undergrad program 20 years ago is that in addition to teaching C for the intro class, it was run out of the EE department so basic electronics courses were also required. Digital logic and simple circuit simulations went a long way towards understanding things like “this is how RAM works, this is why CPUs have so much gate count, this is why you can’t simply make up pointer addresses”

                                                          1. 2

                                                            they taught Java for intro programming courses, but then expected you to know C for the next level CS courses (especially computer architecture and operating systems).

                                                            It’s exactly like this at my university today. I don’t think there’s any good replacement for C for this purpose. You can’t teach Unix system calls with Java where everything is abstracted into classes. Although most “C replacement” languages allow easier OS interfacing, they similarly abstract away the system calls for standard tasks. I also don’t think it’s unreasonable to expect students to learn about C as course preparation in their spare time. It’s a pretty simple language with few new concepts to learn about if you already know Java. Writing good C in a complex project obviously requires a lot more learning, but that’s not required for the programming exercises you usually see in OS and computer architecture courses.

                                                            1. 1

                                                              I think starting from the bottom and going up the layers is better. Rather than being frustrated as things get harder, you will be grateful for and know the limitations of the abstractions as they are added.

                                                            1. 12

                                                              Related to work, I enjoyed the Google SRE Book, available for free here:

                                                              https://landing.google.com/sre/book/index.html

                                                              Site Reliability Engineering is perhaps the correct term for what everyone, for a while, was (apparently incorrectly) describing as “DevOps”. That is, SRE is about having large scale software systems operate well, cheaply, and reliably even in the face of changing codebases, unreliable hardware, and shifting usage patterns. If you run a large scale production system, this book serves as a good “book club” subject for your engineering team.

                                                              One more work related title, I enjoyed going through Python Data Science Handbook with my team of both expert Python programmers and some novices. Also freely available here:

                                                              https://jakevdp.github.io/PythonDataScienceHandbook/

                                                              It runs through the “PyData” stack, things like numpy, matplotlib, seaborn, pandas, scikit-learn, and so on. This includes a challenging final chapter on doing machine learning with Python.

                                                              A quick warning on this one: I think the Wes McKinney Python for Data Analysis book does a better job “warming up” these subjects for novices. It was recently updated for a 3rd edition.

                                                              In November, I decided to take a break from programming books for a few months. I have been working through the 3rd edition of “Managing Humans”, the engineering management book by Rands. I’m halfway through an interesting book on the adverse effects of goal setting, entitled “Why Greatness Cannot Be Planned”. I am late to the scene in cracking open “Thinking, Fast and Slow”. I will probably complete all three by end of January, since I’ve found all of them helpful in early chapters.

                                                              1. 1

                                                                I’m curious about athena’s support for ET-styled PDF and LaTeX output. Can you give a PDF example of one of these essays in the demo site?

                                                                1. 1

                                                                  athena doesn’t support PDF output; it only converts to HTML. My personal Pandoc Markdown to PDF via LaTeX script produces documents like this (.md source) when I export to an ET template.

                                                                1. 7

                                                                  One thing that is rarely mentioned in the static vs dynamic debate is whether the programming language compiler is actually the right place for static typing.

                                                                  I have been doing a lot of work lately that makes me believe that schemas and types are much more useful in databases and wire formats than they are in programming languages themselves. Examples for me include data modeling in Elasticsearch (where types determine query and aggregation capabilities) and Parquet (where types determine storage costs and read/parse speeds). I would contrast these two with, say, MongoDB and JSON, where the lack of types and schemas in these layers leads to massive loss of “effectiveness”, to borrow Hickey’s term.
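
                                                                  To make the storage-cost point concrete, here is a rough sketch using only the Python standard library (not Parquet or Elasticsearch themselves). The sample data and variable names are made up for illustration; the idea is just that a declared type lets the encoder use a fixed-width binary layout, while an untyped text format has to spell every value out.

```python
import json
import struct

# Hypothetical sample data: 1,000 integer counters (not from any real system).
counts = [1_000_000 + i for i in range(1000)]

# Untyped text encoding: each value is written as decimal digits and must be
# re-parsed from text on every read.
as_json = json.dumps(counts).encode("utf-8")

# Typed binary encoding: the "schema" (little-endian int32) fixes every value
# at exactly 4 bytes, so size and parse cost are known up front.
as_int32 = struct.pack(f"<{len(counts)}i", *counts)

# Both encodings round-trip to the same data.
assert json.loads(as_json) == counts
assert list(struct.unpack(f"<{len(counts)}i", as_int32)) == counts

print(len(as_json), len(as_int32))
```

                                                                  Real columnar formats add compression and encoding tricks on top of this, so treat the byte counts as a toy comparison only.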

                                                                  Meanwhile, in programming language compilers themselves, types are often a source of “wrestling with the compiler” for even the well-trained senior programmer and are rarely the source of order of magnitude performance gains, at least not at cluster scale and petabyte data scale. The loss of programmer time is also simply a bad opportunity cost bet. Even the single-core speedups come more from “low-level optimizations” than from type models, e.g. in Python they come from writing code in C and Cython to strip out interpreter overhead.

                                                                  I realize here I am talking much more about speed than program correctness. But for me, correctness comes at a much higher level than the code itself. It comes from users.

                                                                  1. 3

                                                                    Even the single-core speedups come more from “low-level optimizations” than from type models, e.g. in Python they come from writing code in C and Cython to strip out interpreter overhead.

                                                                    Non sequitur. Python is dynamically typed and the speedups come from implementing it in C … which is statically typed (to an extent, anyway).

                                                                    1. 5

                                                                      I’d say C is “machine typed”, rather than statically typed in the way Haskell programmers or Java/Scala programmers think about the world.

                                                                      1. 3

                                                                        It’s not machine typed, and that’s on purpose. BCPL was machine-typed:

                                                                        https://en.wikipedia.org/wiki/BCPL

                                                                        Thompson and Ritchie changed that. See this vid:

                                                                        https://vimeo.com/132192250

                                                                  1. 3

                                                                    I used to work a lot on DSLs for data modeling. It is useful to think about “Internal” vs “External” DSLs.

                                                                    In the former, you borrow programming language features to create code patterns that “read as” a DSL. The classic examples here are “fluent APIs” in Java, the builder pattern, or the Django/SQLAlchemy ORMs in Python.
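
                                                                    As a rough illustration of the “fluent API” flavor of internal DSL, here is a minimal sketch in Python. The `Query` class and its methods are hypothetical, invented for this example; it is not how Django or SQLAlchemy work internally, and it does no escaping, so it is not safe for real SQL.

```python
class Query:
    """A tiny, hypothetical fluent query builder hosted in plain Python."""

    def __init__(self, table):
        self._table = table
        self._filters = []
        self._limit = None

    def where(self, condition):
        self._filters.append(condition)
        return self  # returning self is what makes the calls chainable

    def limit(self, n):
        self._limit = n
        return self

    def to_sql(self):
        sql = f"SELECT * FROM {self._table}"
        if self._filters:
            sql += " WHERE " + " AND ".join(self._filters)
        if self._limit is not None:
            sql += f" LIMIT {self._limit}"
        return sql


# The chained calls "read as" a small query language, but it is all host-language syntax.
query = Query("users").where("age >= 18").where("active = 1").limit(10)
print(query.to_sql())
# prints: SELECT * FROM users WHERE age >= 18 AND active = 1 LIMIT 10
```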

                                                                    An “External” DSL, by contrast, is a full-blown language that has its own syntax and semantics, and is not hosted by some other language. A good example here is SQL; a smaller example might be Dockerfiles.

                                                                    Of course, internal/external can also form a continuum of sorts. You can have “escape hatches” in the external DSL that allow for arbitrary code; for example, many SQL implementations allow for UDFs written in some other language. Likewise, in some languages, you can have an internal DSL that is quite restricted and starts to veer in the direction of being an external DSL; some Clojure macro-based APIs start to have this feel, for example.

                                                                    I spent nearly 2 years on DSLs professionally, and my conclusion at the end of that exercise is that you almost always want an internal DSL. Not only are they easier to build, but they are also easier to use. Every attempt I have seen at an external DSL written specifically for one project has ended up as “A Big Pile”.

                                                                    I think this is because the best external language designers are programming language designers, and these are generally Herculean efforts. It turns out parsing and compiling code is hard. Thus the best hosts for domain specific programming end up being flexible programming languages that already have well understood syntax and semantics.

                                                                    Looking back to this article, Python’s support for internal DSLs is quite limited. Especially compared to a language like Clojure.

                                                                    Some of the metaprogramming features (descriptors, decorators, context managers, metaclasses) let you write somewhat declarative code, but for the most part, true internal DSLs are rare. Instead, I like the take in the O’Reilly book, Fluent Python, that these features are just hallmarks of “Pythonic” API design, not meant to be abused to support non-obvious syntax for odd business domains.
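
                                                                    For a sense of what “somewhat declarative” looks like with one of these features, here is a small sketch using a descriptor. The `Typed` and `Article` names are hypothetical, made up for this example; it only shows the general pattern, not any particular library’s API.

```python
class Typed:
    """Descriptor that type-checks values assigned to an attribute."""

    def __init__(self, expected_type):
        self.expected_type = expected_type

    def __set_name__(self, owner, name):
        # Called once at class creation; records the attribute's own name.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError(
                f"{self.name} must be {self.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
        instance.__dict__[self.name] = value


class Article:
    # The class body reads like a small schema declaration.
    title = Typed(str)
    word_count = Typed(int)

    def __init__(self, title, word_count):
        self.title = title
        self.word_count = word_count


article = Article("Internal DSLs", 1200)
print(article.title, article.word_count)
```

                                                                    Assigning `article.word_count = "many"` raises a `TypeError`, but nothing here changes Python’s syntax, which is roughly the line between Pythonic API design and a true internal DSL.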

                                                                    I think this lack of full-blown metaprogramming support in Python may be a feature, and not a bug, of the language, when it comes to long term maintainability of code. DSLs often feel magical within a language, and Python’s community frowns on too much magic. That said, as this article details, there are some exceptions (like Django ORM and numpy) where it feels like the magic was well-utilized.

                                                                    1. 2

                                                                      As far as “internal” DSLs go, you also have some choice as to the “embedding depth” of the DSL. A shallow embedding means that a DSL expression is immediately evaluated to a value in the host language, whereas a deep embedding means that a DSL expression is translated to an AST which can then be manipulated before evaluation.

                                                                      For example, if you wanted to make a math DSL that supported differentiation of expressions, you could maybe use dual numbers for doing this with a shallow embedding, or you could just use AST-based differentiation with a deep embedding.
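
                                                                      Here is a rough sketch of the shallow, dual-number option in Python. The `Dual` class and `dual_sin` helper are names invented for this example. Each DSL expression is evaluated immediately to a (value, derivative) pair, so there is never an AST to inspect.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A dual number: a value together with its derivative."""
    value: float
    deriv: float

    def __add__(self, other):
        # Sum rule: (f + g)' = f' + g'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        # Product rule: (f * g)' = f' * g + f * g'
        return Dual(
            self.value * other.value,
            self.deriv * other.value + self.value * other.deriv,
        )


def dual_sin(a):
    # Chain rule: sin(f)' = cos(f) * f'
    return Dual(math.sin(a.value), math.cos(a.value) * a.deriv)


# Differentiate sin(x + y) with respect to x at x = 1, y = 2.
x = Dual(1.0, 1.0)  # derivative seed 1: this is the variable we differentiate by
y = Dual(2.0, 0.0)  # derivative seed 0: treated as a constant
result = dual_sin(x + y)
print(result.value, result.deriv)  # sin(3.0) and cos(3.0), up to float rounding
```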

                                                                      Using the finally tagless style, we can actually make our DSL expressions parametric over the level and manner of embedding. The expression “sin(x+y)” could optionally be interpreted as having type e.g. Double, in which case the code would compile to an extremely fast low-overhead shallow embedding, or it could be interpreted as having some recursive type like

                                                                          data Exp = Const Double | Sin Exp | Sum Exp Exp | ...
                                                                      

                                                                      in which case we could inspect the structure of the expression, a la:

                                                                      d x (Const a) = Const 0
                                                                      d x (Sin a) = Prod (Cos a) (d x a)
                                                                      d x (Var v) = if x == v then Const 1 else Const 0
                                                                      ...