1.  

    NP-hard is a worst-case bound on your algorithm.

    Please do not do this. You are conflating several things here. You probably mean that exponential behavior is a worst-case bound. NP-hard is a complexity class to which a problem belongs, not a bound on an algorithm.

    1. 2

      From what I can see, a significant chunk of the problems on the Unix command line come from the fact that arbitrary filenames are allowed. I wonder if we should look at restricting filenames at the filesystem level to specific sets of characters (or at least excluding the specific ASCII characters that make life difficult in the shell world).

      1. 1

        The else clause is nice, but what a non-intuitive use of that keyword! They could easily have named it default or normal or any other word. Instead, they used else, which is the complete opposite of what else means in the context of if.

        1. 2

          I see your paper. Have you looked at Array theory? And Nial? (In your informal paper, it would be nice if you included references so that we can easily tell whether you know about the above or are unfamiliar with previous approaches.)

          1. 8

            Functional Programming and Object-Oriented Programming are dual, in a very precise way.

            Data types and constructing inhabitant terms is dual to interface types and observation methods. Functions are interfaces with a single method that takes an argument and returns an argument. Isomorphic data types can be understood as existence of particular functions that map A to B and back from B to A, preserving identities. Dual to functions are so-called differences: these are data that witness an output argument belonging to an input argument. A basic argument could show that two classes are behaviorally equivalent whenever there is no difference between them (the differences between A and B, and the differences between B and A, are empty).

            In Functional Programming, one is interested in the termination of a program. In Object-Oriented Programming, one is interested in the dual of termination: productivity of a process. Consider that the difficulty of preventing dead-locks is similar to the difficulty of ensuring termination.

            In terms of logic, many advancements in recent years have brought us: constructive and computational interpretations of classical logics; languages that allow focussing on both the output of a program and the input of a process; polarization of types to model both functional aspects and objective aspects within one system; and a better understanding of paradoxical mathematical foundations, which led to the theory of non-wellfounded relations and brings us the dual of recursion/induction, called corecursion/coinduction.
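
            To make that last pair concrete, here is a minimal Python sketch. The names total, unfold and naturals are made up for illustration, and Python generators only approximate coinductive streams. Recursion tears down finite data and has to terminate; corecursion builds up a stream that never ends but has to stay productive, answering every request for the next element in finite time.

            ```python
            from itertools import islice
            from typing import Callable, Iterator, List, Tuple


            def total(xs: List[int]) -> int:
                """Recursion: consumes finite data, so it has to terminate."""
                if not xs:
                    return 0
                return xs[0] + total(xs[1:])


            def unfold(step: Callable[[int], Tuple[int, int]], seed: int) -> Iterator[int]:
                """Corecursion as an unfold: each step observes the seed, emits one
                value and hands back the next seed. The stream never ends."""
                while True:
                    value, seed = step(seed)
                    yield value


            naturals = unfold(lambda n: (n, n + 1), 0)

            finite_sum = total([1, 2, 3, 4])
            first_five = list(islice(naturals, 5))  # only ever observe a finite prefix
            ```

            The consumer decides how much of the stream to observe, which is the dual of the producer deciding how much data to construct.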

            In terms of abstract algebra, we now know that the notion of quotient types of languages by certain equivalence/congruence relations is dual to the notion of subtypes of abstract objects by certain bisimulations/bisimilarities of behaviors.

            In terms of philosophy, it is understood that rationality and irrationality can also be applied to software systems. “The network is unpredictable,” is just another way of saying that the computer system consists of an a priori irrational component. Elementary problems in Computing Science, such as the Halting Problem, witness the theoretical existence of irrational components. Those who assume every system is completely rational, or can always be considered rational, suffer from a “closed world syndrome.”

            1. 3

              From a layman’s perspective, I think it is dual in another way too: in terms of how mutability is handled. The functional way of organizing programs often strives to push the state out, with a set of pure functions inside, which are stitched together on some skeleton that handles state or IO. OO programming, on the other hand, encapsulates the state, striving to provide as simple an interface to the outside world as possible, i.e. the mutability is hidden inside rather than pushed out.
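
              A rough Python sketch of what I mean, with a made-up bank-balance example (deposit and Account are hypothetical names, not from any library):

              ```python
              # Functional style: the state is pushed out. The function is pure and
              # the caller (the "skeleton") threads the state through explicitly.
              def deposit(balance: int, amount: int) -> int:
                  return balance + amount


              balance = 0
              for amount in (10, 20, 5):
                  balance = deposit(balance, amount)  # state lives outside the function


              # OO style: the state is hidden inside and only a small interface is exposed.
              class Account:
                  def __init__(self) -> None:
                      self._balance = 0  # mutable state, kept private by convention

                  def deposit(self, amount: int) -> None:
                      self._balance += amount

                  def balance(self) -> int:
                      return self._balance


              account = Account()
              for amount in (10, 20, 5):
                  account.deposit(amount)
              ```

              Both versions end up with the same balance; the difference is only in who owns the mutable variable.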

            1. 8

              I think of it this way: an iterable is the equivalent of a file that you can ‘cat’ and consume the output of, line by line, in a typical Unix pipeline. An iterator, on the other hand, is the equivalent of a file socket. You read from it, and the data is consumed – never to return.
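
              A minimal Python sketch of the difference (the list contents are made up for illustration):

              ```python
              lines = ["alpha", "beta", "gamma"]  # an iterable: like a file you can `cat` repeatedly

              first_pass = list(lines)
              second_pass = list(lines)  # same data again; nothing was used up

              stream = iter(lines)       # an iterator: like reading from a socket
              consumed = list(stream)    # drains the iterator
              leftover = list(stream)    # nothing left to read the second time
              ```

              Calling iter() on the list again gives a fresh iterator, which is the analogue of reopening the file.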

              1. 3

                Interesting to see some backlash over this; Dave Winer’s objections have caught my eye in particular.

                On the one hand I’m not sure Google should be punishing sites for being HTTP-only.

                On the other hand, what is the open web if your ISP can inject ads into a page where there are none?

                1. 2

                  I didn’t see a link to Winer’s objection in the linked article. Do you have a reference?

                    1. 4

                      He sounds a bit, well

                      HTTPS is going to burn huge portions of the open web

                      His entire shtick seems to be that he thinks HTTPS is a conspiracy by Google to control the web, somehow.

                      1. 3

                        He seems to be confounding Google’s motives, which in fairness are probably not altruistic, with the technology itself which is obviously pretty sound.

                        1. 2

                          I’ve literally never seen so much FUD in my life. He must have some fundamental misconception about how HTTPS works. I just don’t see how he could be arguing these points otherwise.

                          I mean, I would be mad if Google really was doing what he thinks they’re doing. But they’re not. He’s also totally missing (ignoring?) the fact that Mozilla is also taking steps matching Google’s.

                          1. 4

                            I hate to say it because I have a lot of respect for his work, but I think basically he’s got a lot of domains and can’t be bothered converting them. I totally get the objections against the way Google are approaching this, but going after https itself is dumb.

                            Why would you think it’s a bad thing that you can guarantee that the site you are viewing has not been tampered with?

                            I’ve seen him call out Mozilla too in fairness.

                            1. 1

                              Meh. Honestly I have no issues with the way Google is approaching this. They (and Mozilla) give plenty of time before making even the tiniest changes, and in the end really all they’re doing is changing the UI to reflect reality.

                              And without them doing that, people exactly like Winer just wouldn’t care.

                            2. 3

                              I’m skimming through, trying to understand it, and he never really states an objection anywhere that I can see. I am familiar with several reasonable objections to the concentration of power created by the CA system and to the burden it imposes on content creators; I just don’t see Winer actually expressing any of them.

                        2. 1

                          On the other hand, what is the open web if your ISP can inject ads into a page where there are none?

                          Maybe this would be better served by adding signatures to basic HTTP rather than forcing HTTPS everywhere?

                          1. 2

                            Wouldn’t that involve the same trust infrastructure but without actually encrypting the traffic?

                            1. 4

                              Not completely. The benefit is that intermediaries can cache it if required, and clients can verify the signature only when needed. With the forcing of HTTPS everywhere, a lot of caching infrastructure that existed previously has become useless without any alternatives. These are especially important in low bandwidth countries or communities relying on low bandwidth gateways.
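
                              A rough sketch of the idea in Python. Big assumption: to stay inside the standard library it uses an HMAC with a made-up shared key (DEMO_KEY) as a stand-in for a real public-key signature; an actual scheme would let clients verify with the origin’s public key and no secret. It only shows that an intermediary can store and replay the page plus its tag, and that a modified copy fails verification.

                              ```python
                              import hashlib
                              import hmac

                              # ASSUMPTION: a shared demo key. A real scheme would use a public-key
                              # signature so that clients need no secret; HMAC keeps this stdlib-only.
                              DEMO_KEY = b"hypothetical-origin-key"


                              def sign(body: bytes) -> str:
                                  return hmac.new(DEMO_KEY, body, hashlib.sha256).hexdigest()


                              def verify(body: bytes, tag: str) -> bool:
                                  return hmac.compare_digest(sign(body), tag)


                              # The origin publishes the page in the clear together with its tag.
                              page = b"<html>hello</html>"
                              tag = sign(page)

                              # An intermediary can cache and replay both without any key material.
                              cache = {"/index.html": (page, tag)}
                              cached_page, cached_tag = cache["/index.html"]
                              untouched_ok = verify(cached_page, cached_tag)

                              # An injected ad changes the bytes, so verification fails.
                              tampered_ok = verify(cached_page + b"<script>ad()</script>", cached_tag)
                              ```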

                        1. 6

                          The word secure is somewhat meaningless without enough context. Also, HTTPS doesn’t immediately translate to secure, and adding “not secure” to the URL bar doesn’t achieve much either. AFAIR Chrome still mistreats the target="_blank" attribute…

                          1. 15

                            This is a common argument that I never understood the utility of. HTTPS is table stakes of online security, as there’s no security to be had if anyone on the network path can modify the origin contents.

                            There’s plenty of actual research and roadmaps on indicators like Not Secure, and the eventual goal is indeed to mark the insecure option Not Secure instead of marking HTTPS as Secure. The web is a complex slow moving beast, but this is exactly a step in that direction!

                            Anyway, if there’s one thing experience showed us is that trying to convey “context” on the security status of a TLS connection to users is a losing proposition.

                            1. 4

                              There’s plenty of actual research and roadmaps on indicators like Not Secure, and the eventual goal is indeed to mark the insecure option Not Secure instead of marking HTTPS as Secure. The web is a complex slow moving beast, but this is exactly a step in that direction!

                              Not that I don’t believe you, but mind pointing me at this research?

                              Anyway, if there’s one thing experience showed us is that trying to convey “context” on the security status of a TLS connection to users is a losing proposition.

                              This is exactly my concern: it seems that sprinkling “security” hints to non-technical users usually leads to them making the wrong assumptions.

                              1. 1

                                I am focusing on a specific point in your post

                                there’s no security to be had if anyone on the network path can modify the origin contents.

                                This can be addressed by adding signatures rather than encrypting the whole page. There are useful applications such as page caching especially in low bandwidth situations which are defeated by encryption everywhere.

                            1. 1

                              Noting a similar project: byterun, by Ned Batchelder, which implements a Python VM on top of Python (and is used in the book Architecture of Open Source Applications – implementations in 500 lines or less). Here is a skeleton of another Python interpreter over Python that I developed for a class.

                              1. 1

                                Another interesting make flavor is BSD bmake which allows monitoring and updating of dependencies. It would be nice to see that also in the list.

                                1. 3

                                  It actually looks nicer than Makefiles. I found the source for it here for anyone interested.

                                  If anyone has used it, does it check the timestamp of the artifact before building?

                                  1. 1

                                    Could you please clarify whether your document considers just plain Make or also the GNU Make extensions? If GNU Make is considered, a dependency on a directory is certainly possible with an order-only prerequisite (the | syntax).

                                    Also, could you please add Mk (from Plan9) to the list? It is supposed to be a better Make.

                                    1. 1

                                      Mk is Make reduced to its essentials. That removes a lot of cruft but also some useful features. People probably debate which things are cruft or useful, though.

                                    1. 4

                                      McIlroy’s critique seems intellectually dishonest. One would not accept his solution in a take home coding test – it’s like when we ask someone to implement line splitting and they use .split().

                                      Of course what he does is shorter; but not all programs can be that short, merely because this toy example can. Such a program would be a poor example of literate programming, but is also a poor example of how to handle complexity in general. When you actually write complicated things in shell you quickly find out that these components don’t always reuse, and the need to author, explain and organize your own components rapidly outstrips the facilities available in shell.

                                      1. 5

                                        “Intellectually dishonest”? I’d give points for using .split unless the assignment spelled out what could be used.

                                        1. 7

                                          That is another way of clarifying the distinction. Knuth didn’t set out to write the shortest or most production-worthy program, but rather to demo WEB on a simple example program. McIlroy’s answer is to a different question, and his sleight of hand is in equating the two.

                                        2. 3

                                          not all programs can be that short

                                          Why not?

                                          I’ve seen small databases, small web servers, small programming languages…

                                          1. 3

                                            Just because one iteration of an idea can be small, doesn’t mean all useful iterations will be.

                                            1. 1

                                              Of course not, but that’s stupid. Who cares if “all useful iterations” will be: One could iterate on an idea and produce a massive steaming pile of dogshit simply because they’re a shit programmer.

                                              What I really care about is whether all business problems can be solved with small programs, and given that the smallest database is also the fastest and most featureful I’m inclined to believe that they can.

                                              1. 3

                                                I obviously meant that not all business problems can be solved with small programs.

                                                What database is the smallest, fastest, and most full featured?

                                                1. 2

                                                  I obviously meant that not all business problems can be solved with small programs.

                                                  Right. I understand this is a prevailing thought, but I’m not convinced.

                                                  What database is the smallest, fastest, and most full featured?

                                                  kdb

                                                  1. 1

                                                    Sigh. I thought you were going to say kdb, but I thought I’d ask in case you had a novel or interesting answer. It’s incredibly niche; anyone familiar with its feature set would plainly know that it’s not general purpose.

                                                    1. 0

                                                      I’m using Kdb as a (CRM) database.

                                                      I’m also using it for time series (yes), and as an application server for a real-time bidding system.

                                                      I’ve got an unstructured data ingestion system on Kdb.

                                                      I’ve even got a full text search running on Kdb.

                                                      I know people doing GIS with Kdb.

                                                      Not sure what your definition of “general purpose” is, but it certainly meets mine.

                                                      1. 3

                                                        For the most part discussions I’ve had about kdb have been overly religious for my taste, and I’m not about to get in another holy war. I’m glad kdb works for you, but implementing features like FTS and GIS on top of kdb yourself doesn’t mean kdb has those features.

                                                        1. 1

                                                          I don’t know about that. I think that the ability to implement them in KDB does – I’m just writing queries here, and that’s important.

                                                          You can call this a potato if you want, but I’d say postgresql can do GIS queries as well even though someone had to write them in C and link them in as an externally shipped tool.

                                                          Today I wanted to index a table by a cuckoo hash. I can’t imagine the mess of SQL needed to do such a thing, and it’s about five lines long in kdb. Doing that in postgresql would be very invasive.

                                                          1. 2

                                                            That’s neat. Would you care to share those 5 lines?

                                                            1. 2

                                                              Sure. Unoptimised version follows:

                                                              pos:{[t;n] k:count[t] div 2;raze (0,k)+\:((n mod 16),(n div 16)) mod k}
                                                              hash:{last (md5 "c"$ -18!x) except 0x00}
                                                              add:{ [table;text] n:hash text; if[-11h=type table; :table set ins[get table;n]; ]; :ins[table;n]; };
                                                              ins:{ [t;n] i:pos[t;n]; j:i rand where t[i]=0x00; if[j<>0N; :@[t;j;:;n]; ]; j:i@rand where t[i]<>n; if[j<>0N; m:t j; p:pos[t;m]; q:p@rand where t[p]<>m; if[q<>0N; :@[t;j,q;:;n,m]; ]; :@[t;j;:;n]; ]; :t; };
                                                              check:{ [table;text] n:hash text; t:$[-11h=type table;get table;table]; :any t[pos[t;n]]=n; };
                                                              

                                                              … but it’s still fast enough to check things out.
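
                                                              For anyone who doesn’t read q, here is a rough Python sketch of the general cuckoo-filter idea. It is not a translation of the code above: the class name, the 16-bit fingerprints, the use of SHA-256 and the slot count are all assumptions made for illustration. Each item gets a short fingerprint and two candidate slots; inserting into a full pair of slots evicts an occupant to its own alternate slot.

                                                              ```python
                                                              import hashlib
                                                              import random
                                                              from typing import Tuple


                                                              class CuckooFilter:
                                                                  """Approximate set: rare false positives, one fingerprint per slot."""

                                                                  def __init__(self, slots: int = 1024, max_kicks: int = 500) -> None:
                                                                      assert slots & (slots - 1) == 0, "slots must be a power of two"
                                                                      self.slots = slots
                                                                      self.max_kicks = max_kicks
                                                                      self.table = [0] * slots  # 0 marks an empty slot

                                                                  def _digest(self, data: bytes) -> int:
                                                                      return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

                                                                  def _fingerprint(self, item: str) -> int:
                                                                      # 16-bit fingerprint, never 0 so it cannot be confused with "empty"
                                                                      return self._digest(b"fp:" + item.encode()) % 0xFFFF + 1

                                                                  def _alt(self, index: int, fp: int) -> int:
                                                                      # XOR trick: applying it twice returns the original index
                                                                      return (index ^ self._digest(fp.to_bytes(2, "big"))) % self.slots

                                                                  def _slots_for(self, item: str) -> Tuple[int, int, int]:
                                                                      fp = self._fingerprint(item)
                                                                      i1 = self._digest(item.encode()) % self.slots
                                                                      return fp, i1, self._alt(i1, fp)

                                                                  def add(self, item: str) -> bool:
                                                                      fp, i1, i2 = self._slots_for(item)
                                                                      for i in (i1, i2):
                                                                          if self.table[i] in (0, fp):
                                                                              self.table[i] = fp
                                                                              return True
                                                                      i = random.choice((i1, i2))
                                                                      for _ in range(self.max_kicks):
                                                                          fp, self.table[i] = self.table[i], fp  # evict the occupant
                                                                          i = self._alt(i, fp)                   # try its other slot
                                                                          if self.table[i] == 0:
                                                                              self.table[i] = fp
                                                                              return True
                                                                      return False  # too full: the last evicted fingerprint is dropped

                                                                  def __contains__(self, item: str) -> bool:
                                                                      fp, i1, i2 = self._slots_for(item)
                                                                      return fp in (self.table[i1], self.table[i2])


                                                              seen = CuckooFilter()
                                                              added = [seen.add(name) for name in ("acct-1", "acct-2", "acct-3")]  # hypothetical ids
                                                              known = "acct-2" in seen  # stored items are found as long as every add() succeeded
                                                              ```

                                                              A lookup for something that was never added can occasionally answer yes when an unrelated fingerprint happens to sit in one of its two slots, which is the price of storing only a couple of bytes per entry.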

                                                              1. 2

                                                                I think I’ve got you beat on brevity:

                                                                create table "table" ( "text" varchar);
                                                                create index hash_index on "table" using hash("text");
                                                                

                                                                May not be a cuckoo hash, but I’m not convinced that matters. Except possibly in a highly specialized, niche application. I doubt you wrote this for your CRM, for example.

                                                                1. 2

                                                                  I think I’ve got you beat on brevity:

                                                                  Oh, if I just want a regular hash/index I can use the g or s attributes (sorted is fine). Indeed that’s what I benchmark this problem with. The exact syntax would be:

                                                                  table:([] id:`s#`sym$(); date:`s#`date$(); acct:`acct$())
                                                                  

                                                                  I doubt you wrote this for your CRM, for example.

                                                                  A CRM is a component of this application.

                                                                  So there’s an attribution component where I’ve got ~2m accounts that I want to connect to some set of ~1m ids. You might imagine it’s:

                                                                  id date -> account[]
                                                                  

                                                                  and indeed the trivial

                                                                  create table (id varchar, date date, acct varchar);
                                                                  

                                                                  is fine, but it’s a chunky index – around 10-16GB per day – that I’d have to build and for this use case I don’t need exact answers: it’s okay to select a few acct that don’t have the id, and given the number of processes that need this data built, maybe something that uses around 20MB might be worth experimenting with.

                                                                  1. 2

                                                                    I don’t think I have adequate information about the problem to discuss more in depth. Only, I could store a year of that index (about 6 TB) on the cloud for less than 2 engineer-weeks of USD. So if your optimization took longer than 2 weeks cumulative to design, implement, test, and be used by other engineers effectively, per year, then it’d be a waste. Moreover, I suspect there is a trivial strategy to represent that index in a smaller way. You have string identifiers for a few million rows, so you could easily use an int32 id and save space. Or maybe you can’t. I don’t know enough about your application. I also don’t know what you mean by the number of processes that need the data built. If a distinct 16 GB index is built for each process running, I can see how that would explode the size. But I don’t see anything else in what you said to indicate whether that’s the case. Again, not enough info.

                                                                    I don’t really think it’s productive to continue this discussion. You like kdb and I’m happy for you. But not many people can just implement, e.g., a reasonable GIS system for their application. And though I could, I don’t want to. I’ll use PostGIS or something else off the shelf unless I have a compelling reason to go out of my way to make a custom solution. Because all that bloat you talk about, that’s functionality I don’t even know I need yet, but will a year down the line when the scope of my application expands. It’s subtle logic handling edge cases that might have otherwise wasted a lot of my time.

                                                                    That’s why I say kdb is niche, it’s for people who actually will derive financial value out of doing stuff like that themselves. That’s tremendously uncommon.

                                                                    1. 1

                                                                      That’s why I say kdb is niche, it’s for people who actually will derive financial value out of doing stuff like that themselves. That’s tremendously uncommon.

                                                                      Okay, but I’m not arguing it’s not niche. I said that “all business problems can be solved with small programs” and you said they can’t.

                                                                      That people are okay with big programs is a (possibly) unrelated issue.

                                                                      1. 2

                                                                        Solving a problem with limited time and developer resources is a business problem.

                                                                        1. -1

                                                                          I don’t agree “limited time and developer resources” is a business problem that could be solved better with a big program than a small program.

                                                                          That actually sounds absolutely absurd to me, so I assume you must mean something else, but I can’t imagine what it might be.

                                                                          1. 3

                                                                            I do mean that. Using software that supports doing what you need saves time and developer resources. Suppose a team needs a few different unrelated features. In the database world, there are many large general purpose databases that support a ton of features, including the features the hypothetical team needs. But they aren’t terribly likely to find a database that supports ONLY those few unrelated features. And in the interest of saving time and developer resources, they could reasonably choose that large general purpose database over implementing those features themselves on top of a small but highly customizable database.

                                                                            And on a meta level, building a large general purpose database program solves a business problem: build a database that a ton of different teams can use, so you can sell it a ton of times. The fact that you technically can implement your own GIS on kdb isn’t all that compelling if I’m looking to buy a GIS database. I can implement my own GIS on a lot of things, the point is I don’t want to.

                                                                            I said that “all business problems can be solved with small programs” and you said they can’t.

                                                                            Perhaps “can’t in practice with realistic constraints” is more accurate. Can all business problems be solved with small programs given unlimited resources and top developers? Who fucking cares? That’s not how real life works. No one chooses Walgreens vs CVS based on how many lines of code those companies execute to conduct business. If refining, specializing, and minimizing their code size made them more money somehow, then maybe their business problems would be better solved with small programs. But they probably get more value per dollar out of mixing and matching large generic programs. More value per dollar is better in a business context.

                                                                            There is a point where specializing becomes more effective than mixing and matching large general purpose code, but that point isn’t “always every time for any business problem.”

                                                                            1. 1

                                                                              This is all over the place and I’m not sure how to respond. I’m not even really sure what you’re saying.

                                                                              Why exactly do you think that a problem like GIS requires a large program, when we can clearly see a solution with a small one?

                                                                              there are many large general purpose databases that support a ton of features

                                                                              There are also small general purpose databases that support a ton of features.

                                                                              Not sure what your point is.

                                                                              Can all business problems be solved with small programs … Who fucking cares?

                                                                              I do. There is significant value in small programs: they have fewer bugs, they are easier to read and write, and they run faster. I find programs that are correct and fast to be more valuable than programs that aren’t, and I can’t imagine there’s a business that thinks otherwise that will last very long.

                                                                              There are other things in your post where I don’t really understand your point. It’s not clear if you disagree with me or where you disagree. It almost seems like you’re angry about something – maybe this religious point you mentioned earlier – that has nothing to do with me.

                                                                              1. 2

                                                                                Why exactly do you think that a problem like GIS requires a large program, when we can clearly see a solution with a small one?

                                                                                That’s not what I’m saying. I’m saying a program that supports GIS, and a bunch of other unrelated features so as to be general purpose, will be large. One such program is PostgreSQL.

                                                                                It’s not clear if you disagree with me or where you disagree.

                                                                                I agree small programs are good. I disagree that every problem can be solved with small programs.

                                                                                It almost seems like you’re angry about something

                                                                                No, just frustrated that this discussion is going exactly the way I expected, and that I should have known better and not gotten involved.

                                                                                that has nothing to do with me.

                                                                                It has everything to do with you. I feel I have dangled my point in front of your face, and you are perfectly capable of understanding but have refused to do so.

                                                                                So here it is laid out:

                                                                                Thesis: not all problems can be solved with small programs.

                                                                                Example: I do not believe the problem solved by PostgreSQL could be solved by a small program.

                                                                                Problem solved by PostgreSQL: saving time and developer resources by providing a general purpose, many featured solution usable immediately by a wide variety of teams. Contrast with kdb, which requires implementing desired features on top of it.

                                                                                Make sense?

                                                                                1. 1

                                                                                  I’m saying a program that supports GIS, and a bunch of other unrelated features so as to be general purpose, will be large. One such program is PostgreSQL.

                                                                                  PostgreSQL ships GIS as an add-on.

                                                                                  Same as with kdb.

                                                                                  Thesis: not all problems can be solved with small programs.

                                                                                  “All problems” isn’t important.

                                                                                  You can always invent a problem that cannot be solved by a small program, such as “needs to be a big program.”

                                                                                  “All business problems” is a little better, and while still open to a certain amount of shenanigans, if you’re not intellectually dishonest you’ll get something out of the argument.

                                                                                  Shit like this:

                                                                                  Example … Problem solved by PostgreSQL: saving time and developer resources by providing a general purpose, many featured solution usable immediately by a wide variety of teams. Contrast with kdb, which requires implementing desired features on top of it.

                                                                                  is counterproductive. “general purpose” is met:

                                                                                  • having a range of potential uses or functions; not specialized in design.

                                                                                  however:

                                                                                  • many featured solution usable immediately by a wide variety of teams

                                                                                  is weasel words. Define exactly what you mean by this. How many varieties of team is “wide enough”? How many features is “many featured enough”? I’m certain whatever number you choose we can simply implement that many with kdb and close this point off.

                                                                                  finally:

                                                                                  • Contrast with kdb, which requires implementing desired features on top of it.

                                                                                  … like GIS using PostGIS.

                                                                                  1. 1

                                                                                    PostgreSQL ships GIS as an add-on.

                                                                                    Same as with kdb.

                                                                                    I was not aware; you made it sound like your friends implemented GIS on kdb. If kdb has a fully featured GIS plugin that ships with the distribution, then I stand corrected—for this feature. To define fully featured for you, let’s go with this, in particular sections 8.8. Operators, 8.9. Spatial Relationships and Measurements, and 8.11. Geometry Processing.

                                                                                    many featured solution usable immediately by a wide variety of teams

                                                                                    Define exactly what you mean by this.

                                                                                    It defines itself.

                                                                                    • has many features
                                                                                    • usable immediately by a wide variety of teams

                                                                                    And I was implying that it’s usable immediately by a wide variety of teams because it has many features, GIS being one example. For another example, generalized inverted indexes on hierarchical document values like JSON. Although perhaps there is also a kdb plugin that ships with the distribution and provides generalized inverted indexes?

                                                                                    I’m certain whatever number you choose we can simply implement that many with kdb and close this point off.

                                                                                    But they aren’t there already, which is the entire point.

                                                                                    Contrast with kdb, which requires implementing desired features on top of it.

                                                                                    … like GIS using PostGIS.

                                                                                    PostGIS ships with PostgreSQL in nearly every distribution channel. And the end user of the database certainly does not have to implement PostGIS, since it’s already written. Which is my entire point.

                                                                                    1. 2

                                                                                      I just got to this thread, and it’s immensely interesting to me in spite of occasionally falling off the tightrope into flames.

                                                                                      Excluding specifics of different tools, there are two aesthetics at war here between you and @geocar:

                                                                                      • Using conventional tools provided by others, because they have an incentive to serve as many users as possible, and so could potentially anticipate features you may need in the future. This is a really nice steel-man of the conventional approach that most people cargo-cult as “reuse”.

                                                                                      • Using small programs, minimalist tools and as few dependencies as possible, because every new dependency introduces new degrees of freedom where things can go wrong, where somebody else’s agenda may conflict with your own, where you pay for complexity that you don’t need.

                                                                                      If only we could magically unbundle the benefits of other people’s code from their limitations, have features magically appear when we need them, and be magically robust to security holes in features we don’t use.

                                                                                      The synthesis I’ve come up with to these two poles is to use libraries by copying, and then gradually rip out what I don’t need. This obviously makes things more minimal, so moves me closer to @geocar (whose biases I share). But it also moves me closer to your side, because when I need a new feature from upstream next year I know enough about the internals of a library to actually bring it back into the fold.

                                                                                      It’s hard to imagine a better synthesis than this. The only way to get the benefits without the limitations is to get on a path to understanding your dependencies more deeply.

                                                                                      Edit: http://arclanguage.org/item?id=20221 provides deeplinks inside the evolution of a project of mine, where you can see a library being ingested and assimilated, periodically exchanging DNA with “upstream”.

                                                                                      1. 1

                                                                                        I agree that minimal software is generally better too. I just don’t think it’s practical or valuable to make all software minimal. Using an HTTP wrapper library for literally one request in an app? Kill that dependency. But wait, the app isn’t consumer facing, just needs to get done in as little time as possible, and probably won’t be substantially extended? Screw it, who cares? Adding a dependency in a situation that matters so little is totally worth it if the wrapper library saves the developer 20 minutes of learning a more low level API.

                                                                                        1. 1

                                                                                          But you haven’t addressed my comment at all. Copying the HTTP wrapper library is a reasonable option, right? At worst, it adds minimal overhead for upgrading and so on. At best, it reduces your exposure to a fracas like befell left-pad.

                                                                                          1. 2

                                                                                            If the app matters then copying the HTTP wrapper, or any other library, could be valuable. If the app doesn’t matter, it’s still a waste of time. It’s all about tradeoffs.

                                                                                            Something like an HTTP wrapper, I might just drop it entirely. A lot of those libraries are just reinterpretations of how the author feels APIs should look. Something like ncurses though? I’m not touching it, no way. Or postgres? Forking a database is a huge commitment. But a json parser with a few hokey features I’ll never need, that slow down the parser? I’ve forked that. A password hashing library that bizarrely had waaaay more functions than hash, and check_hash? Forked.

                                                                                            For C++ it’s especially valuable to fork and strip, because monster headers increase compile times. In big projects, adding a header that increases compile time by 200ms can add minutes to build time. Yikes.

                                                                                            So yeah, I agree with you that forking and stripping is a good strategy. It doesn’t apply to everything, but in situations where it’s the best choice, I find it’s usually the best choice by a long shot.

                                                                                            1. 2

                                                                                              It sounds like you’re already practicing what I struggled to figure out. That’s great! I’ll suggest that your narrative of “big programs” is too blunt, and doesn’t adequately emphasize the challenges of dealing with their fallout.

                                                                                              Forking a database is a huge commitment.

                                                                                              All you’re doing is copying it. How is that a commitment?

                                                                                              There’s a certain amount of learned helplessness that rears its head whenever the word “fork” comes up. Let’s just say “copy” to get past that. That’ll help us realize that there’s no dependency we can’t copy into our project, just to allow for future opportunities to rip out code. Start with the damn OS! OpenBSD has a userland you can keep on your system and recompile with a single command. Why can’t everyone do this?

                                                                                              1. 3

                                                                                                It’s not learned helplessness, it’s that maintaining database software is actually hard. If you copy it but don’t change it, you’re pretty much just taking the peripheral burden upon yourself. Now if you want to deploy it you’re on the hook for builds, packaging, package testing, patches for your distro and so on. All this stuff normally done by actual domain experts. Not only is it a huge waste of time, it’s something you’re really likely to screw up at least once.

                                                                                                I work on database engines and I don’t even host my own databases when I can afford it. Setting up replication, failover, backups, etc., that’s a ton of work, especially since you have to test all of it thoroughly and regularly. If it were for a business application, I’d happily pay for Heroku Postgres all the way up to premium-8 tier ($8500 / month). At $102,000 / year, that’s still lower than the salary I’d pay for an engineer I’d actually trust to manage a HA postgres setup.

                                          2. 2

                                            On the other hand, with just a moment’s thought to the command line, McIlroy’s version will quickly show problems with the definition of “word” where you end up with “isn”, “wouldn” and “t” as “words,” among other problems. McIlroy can then spend time on replacing the first line with a more specialized program to break words out of a text stream. Knuth can do the same, but how much time has been spent writing the rest of the code to deal with counting and sorting words?

                                            1. 4

                                              I’ve way more often been in the position of replacing huge shell/Python/&c agglomerations with a single well defined and modular program than the opposite. Perhaps there is ultimately good reason that people build large systems in languages with module systems and interfaces, instead of in shell.

                                              Most languages have libraries for the kind of stuff you’re talking about — modules don’t have to be literally separate programs.

                                              1. 1

                                                Also busted: words with accents, like café, Montréal, née, Québec, and résumé. He even used the word “Fabergé” in his review, which would become “faberg” in the output!
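
A minimal sketch of the tokenization problem raised in this subthread, assuming the first stage of the pipeline behaves like `tr -cs A-Za-z '\n' | tr A-Z a-z` (ASCII letters only). The function names and the "letters joined by an apostrophe" rule are illustrative assumptions, not anyone's actual fix.

```python
import re
import unicodedata
from collections import Counter

# Unicode letters only (word characters minus digits and underscore),
# optionally joined by an apostrophe, so "isn't" stays one word.
WORD = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")

def ascii_words(text):
    # Roughly what an ASCII-only letter filter yields: everything that is
    # not a-z acts as a separator, including apostrophes and accented letters.
    return re.findall(r"[a-z]+", text.lower())

def unicode_words(text):
    # Normalize first so accented letters are single code points.
    text = unicodedata.normalize("NFC", text)
    return WORD.findall(text.lower())

def top_words(text, n, tokenize=unicode_words):
    # Count words and return the n most frequent as (word, count) pairs.
    return Counter(tokenize(text)).most_common(n)

sample = "It isn't Fabergé, and it wouldn't be."
print(ascii_words(sample))
print(unicode_words(sample))
```

Swapping only the tokenizer leaves the counting and sorting untouched, which is the point made above about replacing just the first stage.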

                                              2. 2

                                                Why would you not accept his solution? He doesn’t use a ready-made frequency algorithm, but shows his knowledge of the problem and the tools at hand to implement exactly the algorithm required. Exactly what I want a candidate to do.

                                                1. 2
                                                  1. 1

                                                    If we imagine a second student, who is Knuth, who gives the expected answer, then McIlroy is like the clever student, calling the other student’s answer unimaginative or dull — but to be especially imaginative was never the purpose to begin with.

                                                    1. 2

                                                      Indeed, you are right about that.

                                                1. 2

                                                  Does anyone here remember the bug in Solaris 8 (I think the initial releases; it was certainly patched later), where rm did not leave out . and ..? As you would expect, rm -rf .* on any directory had amusing consequences!

                                                  PS: Having been bitten by it a few times while testing our product, I know it existed, but I can’t find the information on the bug any more. If anyone remembers, I would be really glad!

                                                  1. 1

                                                    Shouldn’t it be the shell’s responsibility to expand the glob pattern .*?

                                                    1. 2

                                                      The shell expands it to “..”. It’s rm’s responsibility to skip “..”.

                                                      That said, while there have been rumors that this or that unix would delete “..”, the earliest sources I have available to me check for that case and ignore it. Other than systemd, I’m not aware of a system that actually had this bug. Solaris 8 would seem to be at least a decade too late.

                                                      Here’s a rather old copy of rm. As you can see, it checks for “..” and won’t remove it.

                                                      https://github.com/dspinellis/unix-history-repo/blob/Research-V7-Snapshot-Development/usr/src/cmd/rm.c
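
To make the division of labour concrete, here is a small simulation, assuming a shell that matches `.*` against every directory entry including `.` and `..` (some shells omit them). The directory listing and both function names are hypothetical; this is not the code of any real shell or rm.

```python
from fnmatch import fnmatchcase

def expand_glob(pattern, entries):
    # The shell's job: match the pattern against each directory entry.
    # fnmatchcase gives a leading dot no special treatment, so ".*"
    # matches "." and ".." as well as ordinary dot files.
    return sorted(name for name in entries if fnmatchcase(name, pattern))

def removable(args):
    # rm's job: refuse to operate on "." and "..", whatever the shell passed.
    return [arg for arg in args if arg not in (".", "..")]

entries = [".", "..", ".bashrc", ".profile", "notes.txt"]  # made-up listing
expanded = expand_glob(".*", entries)
print(expanded)
print(removable(expanded))
```

If the second step is missing, a recursive operation on `..` climbs into the parent directory, which is the failure mode being discussed.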

                                                      1. 1

                                                        I did make this mistake recently with, I think, chmod. I wanted to make root’s dot files and directories world inaccessible and proceeded to make the entire system starting with /root/.. more secure.

                                                        EDIT: And probably group inaccessible.

                                                        1. 1

                                                          This was a bug introduced later (from what I remember). Unfortunately, while I can access the OpenSolaris source, the history stops at the OpenSolaris launch. As you can see, by comparison to the rm.c from the unix-history-repo, the file has been refactored and reworked quite extensively.

                                                        2. 1

                                                          Yes it is.

                                                          From glob(7):

                                                             Long ago, in UNIX V6, there was a program /etc/glob that would expand
                                                             wildcard patterns.  Soon afterward this became a shell built-in.
                                                          
                                                             These days there is also a library routine glob(3) that will perform
                                                             this function for a user program.
                                                          

                                                          I don’t know if this played a role in the alleged bug.

                                                      1. 4

                                                        Interestingly, this is one of the questions that has a precise mathematical solution (when it is idealized of course). Essentially, if there is a fixed cost associated with owning a thing vs a cost for using it, each time period, you can break even (on average) if you buy the thing just after you have spent sufficient money on rent to have bought it in the first place.
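
A rough sketch of that rule, assuming a fixed rent per period and a purchase price that is a whole multiple of it (both numbers below are made up). In this idealized setting, which resembles the classic "ski rental" problem, the rule is usually stated as a worst-case guarantee: it never costs more than twice the best choice in hindsight.

```python
def rent_then_buy_cost(periods, rent, price):
    # Rent until the rent paid so far equals the purchase price, then buy.
    # Assumes price is a whole multiple of rent (illustrative simplification).
    rent_periods = price // rent
    if periods <= rent_periods:
        return periods * rent
    return rent_periods * rent + price

def hindsight_cost(periods, rent, price):
    # Best possible cost if the number of periods were known in advance:
    # either rent the whole time or buy on day one.
    return min(periods * rent, price)

for periods in (1, 5, 10, 11, 100):
    paid = rent_then_buy_cost(periods, rent=10, price=100)
    best = hindsight_cost(periods, rent=10, price=100)
    print(periods, paid, best, paid / best)
```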

                                                        1. 4

                                                          The problem you linked is trading off buying versus renting when the future use is uncertain, but most people expect to either own a home or pay rent every month until they die. The NYT calculator is mainly about calculating the NPV of two streams of payments and trading off the opposing opportunity costs (investing your down payment vs missing out on rising home values).
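
For comparison, the net present value of a stream of payments can be sketched in a few lines. This is the textbook formula only, not the NYT calculator's actual model; the discount rate and both cash-flow lists are invented for illustration.

```python
def npv(cash_flows, rate):
    # cash_flows[t] is the net amount at period t (t = 0 is today);
    # each amount is discounted back to today at the given per-period rate.
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical yearly figures: pure rent versus a down payment, upkeep,
# and a sale at the end of the horizon.
rent_stream = [-12_000, -12_000, -12_000, -12_000, -12_000]
buy_stream = [-50_000, -3_000, -3_000, -3_000, 57_000]

print(npv(rent_stream, rate=0.05))
print(npv(buy_stream, rate=0.05))
```

Whichever stream has the higher (less negative) value is the cheaper option under these assumptions; the opportunity cost of the down payment enters through the discount rate.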

                                                        1. 1

                                                          Strangely familiar, yet so different to such a shell implemented in C.

                                                          I noticed Gary calling * squat and @ spiral - is this a Ruby thing? It reminded me of INTERCAL’s names for characters, but they aren’t the same.

                                                          1. 1

                                                            I think he called * a splat operator, which comes from Perl. The name comes from what the operator does – flatten a list like a bug.

                                                          1. 8

                                                            It’s interesting to see folks respond so positively to the Touch Bar after Lenovo was unanimously criticized for their Adaptive Keyboard. I think the difference is with Apple vertically integrating hardware, OS, common builtin apps, and having a good relationship with 3rd party vendors, so the Touch Bar will have excellent support almost immediately.

                                                            1. 15

                                                              I have yet to see a picture of Lenovo’s adaptive keyboard showing anything other than the standard volume up/down etc. keys. Apple at least has the good sense to showcase the touch bar displaying something that can’t be done with normal keys.

                                                              1. 1

                                                                That critique is so unfair. He bemoans the lack of caps lock (which is replaced by home and end), and spends a pretty large portion whining about it. I think he is probably the only person in the world who misses it.

                                                                Honestly, I think the Lenovo Thinkpad layout shown is very sane, and is a developer’s laptop unlike the MBP.

                                                                1. 1

                                                                  Honestly, I am pretty neutral about the F keys being replaced. It doesn’t help me (although Lenovo can’t take advantage of it in the way Apple can, because Lenovo doesn’t control all the software).

                                                                  It’s all the other things they changed that made me go “nope”.

                                                                  1. 1

                                                                    Did you ever use the Gen2 X1’s “adaptive keyboard”? It was atrocious. Insanely slow response time, hit targets of 1x1 pixel, and basically never being in the right context ever.

                                                                    At least from what I saw in Apple’s demo today and the early reports, it looks like Apple has figured out this scenario considerably better.

                                                                    1. 1

                                                                      Both seem to be a sad step down from the Lebedev designs of 2007/2011.

                                                                      It’s so sad to see that keyboards these days are considered a cost center, where saving a few cents is more important than user experience.

                                                                    1. 2

                                                                      A problem with command lines is that most operating systems have some limit on the length of the command line. Hence, I think it would be great if you provided the ability to read from stdin, which is expected in a unix program. Keep the help to --help or -h. You are not modifying the system (such as by writing to a file or modifying one). Hence there is little need for caution (that is, for printing usage when invoked without flags) when reading from stdin is a valid use case.
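
A minimal sketch of that convention, assuming a hypothetical tool that takes a list of items: use the arguments when given, fall back to stdin otherwise, and leave usage text to `-h`/`--help` (which `argparse` adds automatically). The tool description and function names are invented for illustration.

```python
import argparse
import sys

def build_parser():
    # argparse registers -h/--help on its own; no extra code is needed.
    parser = argparse.ArgumentParser(description="Hypothetical tool: echo its items.")
    parser.add_argument("items", nargs="*",
                        help="items to process; read from stdin when omitted")
    return parser

def read_items(argv, stdin):
    # Prefer command-line arguments; otherwise take one item per
    # non-blank line of stdin, which has no argument-length limit.
    args = build_parser().parse_args(argv)
    if args.items:
        return args.items
    return [line.rstrip("\n") for line in stdin if line.strip()]

# Typical wiring in a script:
#     for item in read_items(sys.argv[1:], sys.stdin):
#         print(item)
```

Passing `argv` and `stdin` in explicitly keeps the fallback easy to test without touching the real process streams.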

                                                                      1. 1

                                                                        So, what’s cool about Red?

                                                                        1. 4

                                                                          since red is heavily inspired by rebol, much of the “what’s cool” carries over from what was (is) so cool about rebol. my personal favourite feature is the small, portable and extremely capable runtime. think about the way applications are now being shoehorned into a web browser, and imagine if someone had made a browser equivalent that was optimised for that use case. now imagine further that that someone really cared about executable size, and managed to pack a language interpreter, gui engine, network stack, etc. into a < 1 MB binary, and that it furthermore made common applications easy to write because there is so much functionality in the runtime that they get for free.

                                                                          i think rebol lost out on a lot of mindshare by not being open source until it was too late; i would have liked to see it become the default quick scripting language for small gui apps (visual basic, tcl/tk and python all tried, but none of them quite succeeded in that either; as far as i can see, the niche is yet to be satisfactorily filled)

                                                                          how rebol is different is a good starting point, followed by red’s own about page.

                                                                          1. 2

                                                                            Wow, this sounds very interesting. Thanks for explaining! I did not know much about Rebol, and especially like the slogan on Rebol’s homepage:

                                                                            Most software systems have become needlessly complex. We rebel against that complexity, fighting it with the most powerful tool available, language itself.

                                                                            Red’s feature list sounds amazing:

                                                                            • Functional, imperative and symbolic programming
                                                                            • Prototype-based object support
                                                                            • Homoiconic (Red is its own meta-language and own data-format)
                                                                            • Optionally typed, rich set of datatypes (50+)
                                                                            • Both statically and JIT-compiled to native code
                                                                            • Concurrency and parallelism strong support (actors, parallel collections)
                                                                            • Low-level system programming abilities through the built-in Red/System DSL
                                                                            • High-level scripting and REPL console support
                                                                            • Highly embeddable
                                                                            • Low memory footprint, garbage collected
                                                                            • Low disk footprint (< 1MB)
                                                                            1. 2

                                                                              by not being open source until it was too late;

                                                                              Even now, it is not fully open source. Only the core is open source; the GUI part is not. One question I have is, is the core (the open-sourced part) sufficient for building Red?

                                                                              1. 1

                                                                                Seeing as how it builds itself when you download it for the first time, I’ll assume yes. :)

                                                                          1. 1

                                                                            Doesn’t Red still require Rebol to compile?

                                                                            1. 1

                                                                              yep, they’ve been concentrating on other stuff from what i gather