1. 29

  2. 20

    This is a good article but it seems to me that it totally ignores the security implications of leaving old code in place because it works for your use case.

    If your application is such that the machine it’s running on isn’t internet connected then hey, good on you, but how many of us can actually say that?

    1. 8

      It’s a milter. I don’t think the author can actually say that.

    2. 19

      My milter implementation has been completely stable since written in Python 2 in 2011. Now I have to destabilize it because people are taking Python 2 away.

      (I do not have tests. Tests would require another milter implementation that was known to be correct.)

      Sounds like the Python 3 tests have a completely stable Python 2 implementation to be tested against.

      1. 4

        My thought exactly.

        He can test against live input samples. And some randomly generated ones / edge cases which he should have done anyways.

        It is work, of course. But if his current version really is rock solid, he could just continue running it on Python 2 or buy Red Hat Enterprise Linux.
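
One way to act on this suggestion is differential testing: replay recorded and randomly generated inputs through both the old implementation and the port, and flag every input where the two disagree. The sketch below is only an illustration. `legacy_encode` and `ported_encode` are hypothetical stand-ins (in practice the legacy side would be the existing Python 2 program, probably run as a separate process), and the sample inputs are made up.

```python
import random


def legacy_encode(payload: bytes) -> bytes:
    """Hypothetical stand-in for the trusted old implementation."""
    return len(payload).to_bytes(4, "big") + payload


def ported_encode(payload: bytes) -> bytes:
    """Hypothetical stand-in for the new port under test."""
    return len(payload).to_bytes(4, "big") + payload


def random_payloads(count, max_len=64, seed=0):
    """Yield reproducible pseudo-random byte strings, including empty ones."""
    rng = random.Random(seed)
    for _ in range(count):
        length = rng.randrange(max_len + 1)
        yield bytes(rng.randrange(256) for _ in range(length))


def find_mismatches(samples, reference, candidate):
    """Return every sample on which the two implementations disagree."""
    return [s for s in samples if reference(s) != candidate(s)]


# Made-up "recorded" inputs plus generated ones.
recorded = [b"", b"MAIL FROM:<a@example.org>", b"\x00\xff" * 8]
samples = recorded + list(random_payloads(200))
mismatches = find_mismatches(samples, legacy_encode, ported_encode)
print(f"{len(samples)} inputs replayed, {len(mismatches)} mismatches")
```

Any non-empty `mismatches` list is a concrete input to debug against the known-good behaviour.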

      2. 15

        Isn’t this a complaint about the lack of free support from distros? Am I misunderstanding?

        Perhaps a group of people would like to start a paid support service for Python 2?

        1. 8

          RHEL will be supporting a Python 2 interpreter until at least June of 2024. Potentially longer if they think there’s enough money in offering another “extended lifecycle” (which got RHEL 6 up to a whopping 14 years of total support from its initial release date).

          1. 2

            Alternately something can be “done” and never need to be touched again. Expiring toolchains breaks a lot of “done” code.

            1. 20

              In the current security landscape? Are you serious? No code is perfect. New flaws in old code are being found and exploited all the time.

              1. 1

                Obviously python is large enough to be a security problem, but take e.g. boltdb in the golang world. It doesn’t need more commits, unless golang shifts under it. I believe it’s possible to have code that’s useful and not a possible security problem.

                1. 15

                  I don’t understand where you’re coming from here. I’m not a Golang fan, but looking at the boltdb repo on github I see that it’s explicitly unmaintained.

                  You’re saying that you don’t think boltdb will ever have any serious security flaws that need addressing?

                  I don’t mean to be combative here, but I have a hard time swallowing this notion. Complex software requires maintenance in a world where the ingenuity of threat actors is ever on the increase.

                  1. 3

                    Maybe it wouldn’t need any more commits in terms of features, which is most likely true in the case of Bolt, as the README states that it focuses on simplicity and doing one thing. But there’s no way to prove that something is secure; you can’t know whether some edge case will result in a security vulnerability. In that sense, it does require maintenance, because we can’t prove that something is secure, only that it is insecure.

                    1. 1

                      In fact, the most recent Go 1.14 release adds a -d=checkptr compiler flag that looks for invalid uses of unsafe and is enabled by default for -race and -msan builds. Because boltdb does invalid unsafe pointer casts all over the place, this causes fatal errors if you like to run with -race in CI, for example.

                      So yeah, Go indeed did shift from under it very recently.

                  2. 6

                    Some things might be able to. I do not personally believe a sendmail milter is one of those things that can be “done” and never need to be touched again. Unless email itself becomes “done”, I suppose.

                2. 24

                  In some cases, I have a great deal of sympathy for the author’s point.

                  In the specific case of the software that triggered this post? Not so much. The author IS TALKING ABOUT A SENDMAIL MILTER when they say that

                  Python 2 is only legacy through fiat

                  No. Not in this case. An unmaintained language/runtime/standard library is an absolute environmental hazard in the case of a sendmail milter that runs on the internet. This is practically the exact use case that it should absolutely be deprecated for, unless you’re prepared to expend the effort to maintain the language, runtime and libraries you use.

                  This isn’t some little tool reading sensor data for an experiment in a closed environment. It’s processing arbitrary binary data from untrusted people on the internet. Sticking with this would be dangerous for the ecosystem and I’m glad both python and linux distro maintainers are making it painful for someone who wants to.

                  1. 2

                    A milter client doesn’t actually process arbitrary binary data from the Internet in a sensible deployment; it encapsulates somewhat arbitrary binary data (email messages and associated SMTP protocol information that have already passed some inspection from your MTA), passes it to a milter server, and then possibly receives more encapsulated binary data and passes it to the MTA again. The complex binary milter protocol is spoken only between your milter client and your milter server, in a friendly environment. To break security in this usage in any language with safe buffer handling for arbitrary data, there would have to be a deep bug that breaks that fundamental buffer safety (possibly directly, possibly by corrupting buffer contents so that things are then mis-parsed at the protocol level and expose dangerous operations). Such a deep break is very unlikely in practice because safe buffer handling is at the core of all modern languages (not just Python but also eg normal Rust) and it’s very thoroughly tested.

                    (I’m the author of the linked-to blog entry.)

                    1. 2

                      I guess I haven’t thought about one where it would be safe… the last one I worked on was absolutely processing arbitrary binary data from the internet, by necessity. It was used for encrypting/decrypting messages, and on the inbound side, it was getting encrypted message streams forwarded through from arbitrary remote endpoints. The server could do some inspection, but that was very limited. Pinning it to some arbitrary library version for processing the message structures would’ve been a disaster.

                      That’s my default frame of reference when I think of a milter… it processes information either on the way in or way out that sendmail doesn’t know how to and therefore can’t really sanitize.

                      1. 1

                        For us, our (Python) milter client sits between the MTA and a commercial anti-spam system that talks the milter protocol, so it gets a message blob and some metadata from the MTA, passes it off to the milter server, then passes whatever the milter server says about the email’s virus-ness and spam-ness back to the MTA. This is probably a bit unusual; most Sendmail milter clients are embedded directly into an MTA.

                        If our milter client had to parse information out of the message headers and used the Python standard library for it, we would be exposed to any bugs in the email header parsing code there. If we were making security related decisions based on header contents (even things like ‘who gets how much spam and virus checking’), we could have a security issue, not just a correctness or DoS/crash one (and crashes can lead to security issues too).

                        (We may be using ‘milter client’ and ‘milter server’ backward from each other, too. In my usage I think of the milter server as the thing that accepts connections, takes in email, and provides decisions through the protocol; the milter clients are MTAs or whatever that call up that milter server to consult it (and thus may be eg email servers themselves). What I’m calling a milter server has a complicated job involving message parsing and so on, but a standalone client doesn’t necessarily.)
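
To make the exposure described above concrete, here is a minimal sketch of a security-relevant decision that depends on standard-library header parsing. The `TRUSTED_DOMAINS` set and the `checking_level` function are hypothetical and are not taken from anyone's milter; only the `email` package calls are real.

```python
from email import message_from_bytes, policy
from email.utils import parseaddr

# Hypothetical policy: senders in these domains get lighter checking.
TRUSTED_DOMAINS = {"example.org"}


def checking_level(raw_message: bytes) -> str:
    """Pick a checking level from the From: header of a raw message."""
    # Any bug in this parsing step now sits on a security decision path.
    msg = message_from_bytes(raw_message, policy=policy.default)
    _name, address = parseaddr(str(msg.get("From", "")))
    domain = address.rpartition("@")[2].lower()
    return "relaxed" if domain in TRUSTED_DOMAINS else "full"
```

If the parser mis-reads a crafted From: header, the wrong branch is taken, which is why an unmaintained parsing stack matters more here than in a pass-through client.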

                        1. 2

                          Mine was definitely in-process to the MTA. (I read “milter” and drew no client/server distinction, FWIW. I had to go read up just now to see what that distinction might even be.) Such a distinction definitely wasn’t a thing I had to touch in the late 2000s when I wrote the milter I was thinking about as I responded.

                          The more restricted role makes me think about it a little differently, but it’d still take some more thinking to be comfortable sitting on a parsing stack that was no longer maintained, regardless of whether my distro chose to continue shipping the interpreter and runtime.

                          Good luck to you. I don’t envy your maintenance task here. Doubly so considering that’s most certainly not your “main” job.

                    2. 1

                      Yeah, it’s a good thing they do; it’s not the distro maintainers’ fault that Python 2 became deprecated.

                    3. 38

                      Are people really still whining about this?!?

                      Python 2 is open source free software and you’re a software developer. Grab the code, build it yourself, and keep running Python 2 as long as you want. Nobody is stopping you.

                      This is even more silly because Python2 was ALREADY DEPRECATED in 2011 when the author started his project.

                      1. 5

                        /Are people really still using this argument?!?/ Just because software, packages and distributions don’t cost money doesn’t mean that people don’t use them and have expectations from them. In fact, that is exactly why they were provided in the first place. This “you should have known better” attitude is totally counterproductive because it implies that if you want any kind of stability or support with some QoS you should not use free/open-source software. I don’t think any of us want to suggest that. It would certainly not do most open source software justice.

                        1. 9

                          This “you should have known better” attitude is totally counterproductive because it implies that if you want any kind of stability or support with some QoS you should not use free/open-source software.

                          My comment doesn’t imply that, though. In fact, as I pointed out, the author can still download Python2 and use it if he wants to. Free to use does not imply free support, and I think it’s a good thing for people to keep in mind.

                          Furthermore, I don’t think a “you should have known better” attitude is out of line towards somebody who ignored 10 years of deprecation warnings. What did he think was going to happen? He had 10 years of warning - he really should have known better…

                          1. 1

                            If you argue with the 10 years of warning, you’re missing the point.

                            The point is not that there was no time to change it. The point is that it shouldn’t need change at all.

                          2. 13

                            Just because software, packages and distributions don’t cost money doesn’t mean that people don’t use them and have expectations from them

                            Haven’t there been a few articles recently about people being burnt out from maintaining open source projects? This seems like the exact kind of entitled attitude that I think many of the authors were complaining about. I’m sure there would be plenty of people to maintain it for you if you paid them, but these people are donating their time. Expecting some developer to maintain software deprecated in 2011 for you is absurd.

                            1. 1

                              Yeah, I’ve read a few of those articles, too. And don’t get me wrong I’m not trying to say that things should be this way. A lot of open source work deserves to be paid work!

                              But I also don’t think there is anything entitled about this point of view. It’s simply pragmatic: people make open source software, want others to use it, and that is why they support and maintain it. Then the users become dependent. Trouble ensues when visions diverge or no more time can be allocated for maintenance.

                              1. 9

                                At the same time, it’s not like a proprietary software vendor that you staked your entire business on. The source code to Python 2 isn’t going anywhere. Just because the PSF and your Linux distribution decided to stop maintaining and packaging an ancient version doesn’t mean you can’t continue to rely on some company (or yourself!) to maintain it for you. For instance, Red Hat will keep updating Python 2 for RHEL until June 2024.

                                And as crazy as it might seem to have to support software yourself, consider that the FreeBSD people kept a 2007 version of GCC in their build process until literally this week. That’s 13 years where they kept it working themselves. It’s not like it’s hard to build and package obsolete userspace software; nothing is going to change in the way Linux works that would prevent you from running Python 2 on it in five years (unlike most system software which might make more assumptions about the system it’s running on).

                                Some amount of gratuitous change is worth getting worked up about. For example, it’s a well-known issue in fast-moving ecosystems like JavaScript that you might not be able to get your old project to build with new dependency versions if you step away for a year. That’s a problem.

                                I, for one, am extremely glad that it’s now okay for library authors to stop maintaining Python 2 compatibility. The alternative would have been maintaining backwards compatibility using something like a strict mode (JavaScript, Perl) or heavily encouraging only using a modern subset of the language (C++). The clean break that Python made may have alienated some people with legacy software to keep running, but it moved the entire ecosystem forwards.

                                1. 1

                                  The source code to Python 2 isn’t going anywhere. Just because the PSF and your Linux distribution decided to stop maintaining and packaging an ancient version doesn’t mean you can’t continue to rely on some company (or yourself!) to maintain it for you.

                                  1. Some distros are eager to make python launch python3. This action is vanity-based and hostile to having Python 2 and 3 side-by-side (with 2 coming from a non-distro source).
                                  2. By not keeping Python 2 open to maintenance by willing parties in the obvious place (at the PSF) and by being naming-hostile to people doing it elsewhere in a way that not only maintains but adds features, the PSF is making pooling effort for continued maintenance of Python 2 harder than it has to be.
                                  1. 2

                                    It’s arguably more irresponsible to continue implicitly pushing Python 2.x as the “default” Python by continuing to refer to it by the python name, out of deference to “not breaking things”, when it is explicitly unmaintained.

                            2. 7

                              it implies that if you want any kind of stability or support with some QoS you should not use free/open-source software

                              If you want support with guarantees attached you shouldn’t expect to get that for free. If you are fine with community/developer-provided support with no guarantees attached, then free software is fine.

                              I think being deprecated for a decade before support being ended is pretty amazing for free community-provided support, to be honest.

                          3. 11

                            Otherwise it is perfectly functional and almost certainly completely secure, and would keep running fine for a great deal longer.

                            That’s the catch. Python 2 is officially no longer supported, even for security patches. I think that’s why there’s such a heavy-handed push to remove it.

                            (Yes there’s other unsupported stuff still packaged, but I’d argue none come anywhere close to the centrality of Python, and thus the consequences of a security flaw don’t measure up.)

                            1. 10

                              I do not have tests. Tests would require another milter implementation that was known to be correct.

                              There are no invariants or expected behaviors to test??

                              1. 5

                                Yes, this seems like a ridiculous statement. Maybe it makes sense in context, but even so… I would expect some kind of tests about binary string manipulation…
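
Tests of this kind do not need a second reference implementation; they can assert invariants such as "decoding an encoded packet returns the original" and "malformed input is rejected". The sketch below assumes a simple framing of a 4-byte big-endian length followed by a one-byte command and the data. That layout and the function names are assumptions for illustration, not the author's code.

```python
import struct


def encode_packet(command: bytes, data: bytes) -> bytes:
    """Frame a one-byte command and its data with a 4-byte length prefix."""
    if len(command) != 1:
        raise ValueError("command must be exactly one byte")
    return struct.pack("!I", len(data) + 1) + command + data


def decode_packet(packet: bytes):
    """Split a framed packet into (command, data), validating the length."""
    if len(packet) < 5:
        raise ValueError("packet too short")
    (length,) = struct.unpack("!I", packet[:4])
    body = packet[4:]
    if length != len(body):
        raise ValueError("length prefix does not match body")
    return body[:1], body[1:]


def check_invariants():
    # Round trip: decode(encode(x)) == x, including awkward byte values.
    for data in (b"", b"\x00", b"caf\xc3\xa9", bytes(range(256))):
        packet = encode_packet(b"B", data)
        assert isinstance(packet, bytes)
        assert decode_packet(packet) == (b"B", data)
    # Malformed packets must be rejected rather than silently accepted.
    for bad in (b"", b"\x00\x00\x00\x05B", b"\x00\x00\x00\x01Bextra"):
        try:
            decode_packet(bad)
        except ValueError:
            continue
        raise AssertionError(f"accepted malformed packet {bad!r}")


check_invariants()
```

Checks like these are exactly where a Python 2 to 3 port tends to go wrong, because they pin down where values must be bytes rather than text.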

                              2. 10

                                Oh, come on. This transition has been 12 years in the making, and 11 years of that was spent saying “nah, Python 2 is fine, and nobody else has migrated anything yet, so no rush”.

                                1. 4

                                  Are there any other significant examples of languages/runtimes that are still used without updates and security patches? Even the small communities I follow, like J, seem to get at least occasional updates.

                                  1. 4

                                    It seems like the greater problem here is the lack of static typing.

                                    EDIT: To elaborate: splitting something that represents a single concept (here, string -> string or bytes) is one of the hardest things to do in a large, dynamically-typed codebase. If some Xs are now Ys, this forces you to find every X and ask yourself “should this now be a Y?”, and even finding all the Xs is difficult because the lack of type information makes it hard for tools to help you.

                                    In a statically-typed language, you have a greater chance of finding all the Xs, and when you determine that some X is in fact a Y, you can explore the consequences of that fact with the compiler’s help.

                                    1. 4

                                      Python 3 has optional type annotations and several static checkers available. So this code could be written in a statically-typed (or close to it) way on Python 3, or gradually have static type information added as part of a program of porting and testing.

                                      There are also static analysis tools like Bandit (which operates at the level of the Python AST, and can do quite a bit more than just type checking).
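
As a small illustration of the point about annotations, the sketch below marks the text/bytes boundary explicitly. The function names are hypothetical, and the claim about the checker assumes a tool such as mypy run over the file.

```python
def decode_header(raw: bytes) -> str:
    """Cross the boundary from wire bytes to text in one explicit place."""
    return raw.decode("utf-8", errors="replace")


def encode_header(text: str) -> bytes:
    """Cross the boundary from text back to wire bytes."""
    return text.encode("utf-8")


def header_length(raw: bytes) -> int:
    """Length in bytes on the wire, not in characters."""
    return len(raw)


# A static checker such as mypy is expected to flag the call below, because
# a str is passed where bytes is declared. At runtime Python would not
# complain and would quietly count characters instead of bytes.
# header_length("Subject: caf\u00e9")

print(header_length(encode_header("Subject: caf\u00e9")))
```

Adding annotations like these gradually, one module at a time, lets the checker find the places where "some Xs are now Ys" during a port.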

                                      1. 2

                                        That’s great news for maintenance and porting legacy code. For new projects, good type systems give us ergonomics and safety and it’s no longer necessary to compromise.

                                        1. 2

                                          My personal stance differs from yours – I generally strongly prefer dynamically-typed languages! – but I just was pointing out that if you want to do more or less “statically-typed Python” you can now.

                                    2. 3

                                      Functioning code that you don’t have to maintain and that just works is an asset; it sits there, doing a valuable job, and requires no work. Code that you have to do significant work on just so that it doesn’t break (not to add any features) is a liability

                                      I disagree with that notion. Show me a piece of code that sits there forever unchanged, requiring no work? In my experience the code that requires no work is then propped up by a ton of work to e.g. keep the environment it is running in stable, by using virtualization or strange hardware, just to avoid changing that code. Therefore I do not think code is an asset; it is always a liability, and the author has just discovered this with his own code now.

                                      1. 3

                                        If I move to a distro without Python 2 (which seems likely because I assume new versions of Debian/Ubuntu won’t have it), I’ll probably start using StaticPython for some things.


                                        Note that there’s a huge difference between maintaining Python 2, which isn’t too hard, and maintaining a tangle of Python 2 packages. Luckily I only need the former as most of my programs only depend on the stdlib, or have trivial dependencies that I install with tarballs.