1. 2

This is starting to get really interesting. I think this deserves a boost every few days. Day 3 was hard, but on day 4 the author used a meta-approach; pretty much the same thing as used by https://github.com/max-sixty/aoc-gpt, who is currently 35th on the global leaderboard.

I’m really curious to see how this scales to the more difficult problems. So far comprehensible manual solutions were less than 10 lines of code for me (after golfing them a bit).

1. 14

I’m sorry, I’ve only gone through the first few steps of the tutorial but the initial claims have already bothered me a lot. I have to be pedantic because this is the biggest pet peeve of mine:

Turing famously showed that computers can’t decide whether your code halts.

I would say this is actually false, even though almost everybody thinks it’s true. At least, if you’re interested in actually solving problems rather than being interested in what is mostly a mathematical/theoretical curiosity.

The Halting Problem is famously only undecidable if the code you’re analyzing is being modeled on a “Turing machine” or something equally “powerful” to it. The problem is, computers are not as powerful as Turing machines, and furthermore, Turing machines (or something equivalent to them) neither exist nor can ever exist in practice, because it’s impossible to build a “machine” with truly infinite memory.

In fact, if the machine where the code is running has finite memory, then the Halting problem is actually very easily solved (well, in theory). And there already exists various known algorithms that detect cycles of values (where the “value” in this context is the state of the program):

https://en.wikipedia.org/wiki/Cycle_detection

As you can see, if you use one of these cycle detection algorithms, then the Halting Problem is not undecidable. It’s easily solvable. The problem is that, while these known algorithms are able to solve the problem in constant memory, there are no known algorithms to solve it in a time-efficient way yet. But again, it is decidable, not undecidable.

Now, I understand how Turing machines as a model can be useful, as the simplifying assumption of never having to worry about a memory limit might allow you to more easily do types of analysis that could be more difficult to do otherwise. And sure, the way the Halting Problem was “solved” was very clever (and I actually suspect that’s a major reason why it has reached its current legendary status).

But this simplifying assumption should have never allowed a result like the Halting Problem which is known to be wrong for the type of computing that we actually care about to be elevated to the status that it currently has, and by that I mean, to mislead everyone into thinking that the result applies to actual computers.

So again, I would say, be careful whenever you see a Turing machine being mentioned. I would say it would be better to view them as mostly a mathematical/theoretical curiosity. And yes, they can also be a powerful and useful tool IF you understand its limitations. But you should never confuse this model for what actually is the real world.

Henry Rice proved a much more devastating result: “computers can’t decide anything interesting about your code’s input-output!”

Since Rice’s theorem (and many other computer science results!) depends on the Halting Problem being true, while it is actually false for real-world machines, my understanding is that Rice’s theorem is also false for real-world machines.

But Rice’s theorem (and sentences like the article is using) is absolutely another example of a huge number of cases where the Halting Problem has seemingly mislead almost everyone into thinking that something can’t be solved when it actually can be solved.

I wonder how much more research could have advanced (like for example, in SMT solvers, formal methods, etc) if everyone actually believed that it’s possible for an algorithm to decide whether your program has a bug in it.

How many good people were discouraged into going into this area of research because of this widespread belief that this pursuit is proven to be unachievable in the general case?

Now, I’m not a computer scientist nor a mathematician or logician. So I could be wrong. If so, I would be interested in knowing how I’m wrong, as I’ve never seen this issue being discussed in a thorough way.

Edit: another (according to my understanding) completely misleading phrase in the tutorial exemplifying what I’m describing:

we learned Turing’s proof that halts is actually impossible to implement: any implementation you write will have a bug!

Now, to be clear I’m not blaming the author of the tutorial, as this is just one very tiny example of what actually is an extremely widespread belief across the whole industry.

1. 14

The halting problem is a theorem. When the author writes “Turing famously showed that computers can’t decide whether your code halts.” they are referring to that theorem, even if the words they wrote are not precise and formal enough to state the actual halting problem.

Yes it’s a good point that in reality our programs are finite and halting can be determined for this set of programs.

mislead everyone into thinking that the result applies to actual computers.

The bridge between theory and practice is a really big topic. There’s a lot that could be said on that topic. But I think the key take-away from the halting problem and rices theorem for writing programs that analyze and process programs is this:

Your algorithm analysis system is going to need to output one of 3 things {yes, unknown, no} and not one of 2 things {yes, no}.

I wonder how much more research could have advanced if everyone actually believed that it’s possible for an algorithm to decide whether your program has a bug in it.

I definitely do not think we would be further ahead if Turing’s theorem had not been discovered. It’s not something that held work back, it’s something that informed more precise and correct directions for analysis of programs. So I believe we are better off for it.

1. 2

Your algorithm analysis system is going to need to output one of 3 things {yes, unknown, no} and not one of 2 things {yes, no}.

Well, strictly speaking the system would still only tell you “yes” or “no”. But since Rice’s theorem is false for real-world computers (as far as I know), it would be possible to determine whether your program will halt because it ran out of memory, because some error was encountered or because it completed successfully, which is immensely useful to know for the vast majority of programs (but not for certain programs that are trying to solve strictly mathematical questions with infinite bounds).

I definitely do not think we would be further ahead if Turing’s theorem had not been discovered.

Of course not, I wasn’t trying to make that point.

It’s not something that held work back, it’s something that informed more precise and correct directions for analysis of programs. So I believe we are better off for it.

I think it’s not the theorem itself that held (and is holding) work back. It’s the widespread belief that it applies to real-world computers that holds work back.

Imagine how many people have been discouraged from pursuing work in formal methods because they believe it’s impossible to figure out if a program has a bug in the general case. I see engineers saying that it’s impossible to solve it all the freaking time.

1. 3

Well, strictly speaking the system would still only tell you “yes” or “no”.

I don’t understand this. What I was meaning was that, for example, I cannot write a program that terminates and correctly tells if an arbitrary program halts, but I can write one that tries set of checks and either reports that the input program does halt, does not halt or that it cannot tell.

Imagine how many people have been discouraged from pursuing work in formal methods because they believe it’s impossible to figure out if a program has a bug in the general case. I see engineers saying that it’s impossible to solve it all the freaking time.

It is impossible in the general case (assuming unbounded memory). If you restrict to finite memory like real computers have, the memory is usually so large that it doesn’t make a practice difference. You can’t just note down the entire system state including all 16GB of memory every iterator and try to detect loops. This works in theory but not in practice.

1. 2

I cannot write a program that terminates and correctly tells if an arbitrary program halts.

Yes, you can, because programs run on machines with finite memory. The cycle detection algorithms I mentioned are examples of such programs that always terminate and correctly tell you whether an arbitrary program halts.

It is impossible in the general case (assuming unbounded memory).

But you shouldn’t assume unbounded memory, because that doesn’t exist.

If you restrict to finite memory like real computers have, the memory is usually so large that it doesn’t make a practice difference. You can’t just note down the entire system state including all 16GB of memory every iterator and try to detect loops. This works in theory but not in practice.

Sure, currently it’s impractical, but as far as I know there’s no proof that it will always be impractical. I think you’re assuming that to determine whether a program halts, you have to simulate running the program step by step while it runs through all the states.

But I haven’t seen a convincing argument that you necessarily have to do this to solve the problem in general and that you can’t just analyze the program description to determine whether it will halt.

1. 3

A cycle detection algorithm has to detect, by definition, that a Turing machine is in a state that it was in earlier. That is what it means for there to be a cycle. Detecting cycles in other ways requires program analysis and gets into the territory of Rice’s theorem. I’m not assuming anything about how the algorithm does this. Direct inspection of the physical implementation of the state representation is just one way. It could be programmatic interrogation via an API.

The number of possible states of a Turing machine is 2^n bits. This exceeds the number of particles in the universe for even a modest amount of bits. That means you can make a small program that just cycles through all possible states, for which you cannot physically have a second Turing machine that enumerates which states have already been seen. That’s an impossibility theorem for all I care

1. 1

Detecting cycles in other ways requires program analysis and gets into the territory of Rice’s theorem.

Didn’t I just prove that Rice’s theorem is not valid for programs that use a finite amount of state?

1. 1

But didn’t I just show, here and in my other comment about the finiteness of space, that the Halting Problem, and thus Rice’s theorem, are still valid in practice?

This entire argument doesn’t change anything about what we can do in practice.

1. 1

didn’t I just show (…) that the Halting Problem, and thus Rice’s theorem, are still valid in practice?

But they’re not valid in practice! In practice, there are already known algorithms that truly solve the Halting problem for finite state machines. Like the Tortoise and Hare algorithm. They truly solve this problem, in practice. Which means the Halting theorem’s proof is invalid for finite-state machines. The problem is decidable and there are known solutions.

What you’re talking about, perhaps, is that the time complexity of these algorithms, currently, is too great to be useful in practice. Which, if that is what you are arguing, then I agree. But that’s not what I was arguing about:

The number of possible states of a Turing machine is 2^n bits. This exceeds the number of particles in the universe for even a modest amount of bits.

If you really were arguing about the time complexity of the currently-known algorithms, then sure, that is relevant.

But I was very clear about this: I was not arguing about what we can currently do, I was arguing about what might be possible to do in the future.

And my point is that you don’t have to necessarily iterate through all of these states to detect a cycle. It might be possible to construct an algorithm that solves the exact same issue without always running the program step by step, but instead, actually looks at the program and exploits how the program works to do the analysis much faster.

That means you can make a small program that just cycles through all possible states, for which you cannot physically have a second Turing machine that enumerates which states have already been seen.

This also seems to be irrelevant to what I was discussing, because again, you don’t have to necessarily enumerate all states which have already be seen in order to detect when a cycle of states starts to repeat forever.

I mean, sure, in the vast majority of cases, currently, you’d have no other choice but to do this. But again, I was very clear about this: I’m not arguing about what we can currently do, I’m arguing about what might be possible to do in the future, especially given that this problem is decidable.

2. 3

Yes, you can, because programs run on machines with finite memory. The cycling detection algorithms I mentioned are examples of such programs that terminate and correctly tell you whether an arbitrary program halts.

I think you’re assuming that to determine whether a program halts, you have to simulate running the program step by step while it runs through all the states.

Cycle detection algorithms work by running the program and watching the states. They don’t work by analyzing the program description.

1. 1

The proof for the Halting Problem does not prevent the machine that is to determine the halting status from simulating the program.

1. 1

Cycle detection algorithms work by running the program and watching the states. They don’t work by analyzing the program description.

No. You mean that the currently known cycle detection algorithms work that way.

Or do you know of a proof that this is the only way to detect cycles (for an arbitrary program)?

1. 4

I do, it’s called the halting problem.

1. 1

I do, it’s called the halting problem.

So you’re saying that the Halting theorem is valid for programs that run on machines with finite memory?

1. 3

You were speculating that there might be practical, efficient algorithms to do this. This seems unlikely to me even in the general (finite) case, as I’d guess an efficient algorithm for this would be equivalent to proving P = NP.

1. 1

I agree with you. I think that it’s unlikely, but I also think that it’s possible.

As an example, primality testing was not believed to be polynomial-time for many decades, until finally someone discovered a deterministic polynomial-time algorithm in 2002, quite unexpectedly at the time.

As additional evidence that this is not a possibility that we can easily dismiss, I also present the following: Donald Knuth believes that P = NP :-)

1. 2

I was thinking it might be the case that it could be shown to be an NP problem, perhaps without much effort, for someone more knowledgeable than I am in this area.

I’m aware of Knuth’s speculation, though I’ve seen him discuss it somewhere and he said something like, “Powers can be really, really big,” implying that even if P = NP, it might not change the practical tractability of NP problems.

2. 1

Yes, you can, because programs run on machines with finite memory

It doesn’t make sense that you would try to clarify this to me, I obviously know this for the finite memory case - as we been discussing. What you’ve misunderstood (on purpose?) is that in my paragraph I am referring to unbounded memory. Maybe try a little bit to map out what you think the person you are talking to knows and does not know for better communication or something.

I think you’re assuming that to determine whether a program halts, you have to simulate running the program step by step while it runs through all the states.

This is the cycle detection algorithm you discussed earlier. There are also others way to construct proofs that programs halt.

in general

there is no general way to decide if programs (i.e. Turing machines) halt.

1. 1

I’m sorry if I misunderstood you. I certainly did not do it on purpose.

in my paragraph I am referring to unbounded memory.

I can write one that tries set of checks and either reports that the input program does halt, does not halt or that it cannot tell

Ok, given the context, then yes, I agree with you. You could write such a program. But if you assume the program you’re trying to analyze runs with finite memory, then you can even write a program that always tells you “yes, the program halts” or “no, the program does not halt”, for any arbitrary program.

And in fact, it would even be possible to tell why it halted!

I cannot write a program that terminates and correctly tells if an arbitrary program halts

Given the context of unbounded memory, then I agree with this statement. But I would definitely like it much more if people always preceded such statements with the “unbounded memory” assumption that they are making, to avoid such common misunderstandings. I’m not criticizing you specifically, rather, just the common discourse in general :)

1. 2

Ok, given the context, then yes, I agree with you. You could write such a program. But if you assume the program you’re trying to analyze runs with finite memory, then you can even write a program that always tells you “yes, the program halts” or “no, the program does not halt”, for any arbitrary program.

A turing machine can tell if a finite memory program halts, but a finite memory program cannot determine if all finite memory programs halt - because it may run out of memory itself.

1. 1

A turing machine can tell if a finite memory program halts, but a finite memory program cannot determine if all finite memory programs halt - because it may run out of memory itself.

Yes, it can. The tortoise and hare algorithm is a finite memory program and it can determine if all finite memory programs halt.

1. 1

tortoise and hare algorithm

so suppose you have 16 GB of memory to run tortoise and hare algorithm,

there is a program large enough that your 16 Gigs will run out and the algorithm wont be able to complete.

1. 1

suppose you have 16 GB of memory to run tortoise and hare algorithm,

That means it can only analyze finite state machines with up to 8 GB of memory

there is a program large enough that your 16 Gigs will run out and the algorithm wont be able to complete.

No, the algorithm will be able to complete.

If the program tries to use more than 8 GB of memory, presumably it would crash with an out-of-memory error, which the tortoise and hare algorithm would interpret as a halting state and then terminate. Or it would loop, which the tortoise and hare algorithm would detect as a cycle and then terminate.

The program is not allowed to use an unbounded amount of memory.

2. 1

there is no general way to decide if programs (i.e. Turing machines) halt.

Yes, but why is that interesting outside of trying to analyze a program that is trying to solve some kind of mathematical problem (i.e. something with infinite bounds)?

I think the more interesting statement is that there is a general way to decide if programs, when ran on deterministic finite-state machines, will halt, even if they were written in Turing-complete languages.

And I think the even more interesting question is: how can we reduce the time complexity of such algorithms that can decide that?

1. 2

I think the more interesting statement is that there is a general way to decide if programs, when ran on deterministic finite-state machines, will halt, even if they were written in Turing-complete languages.

Well yes, but you need to run the decision program on a wachine with exponentially more memory, which is impossible in practice.

1. 1

Well yes, but you need to run the decision program on a wachine with exponentially more memory, which is impossible in practice.

No, you don’t.

The tortoise and hare algorithm uses only a constant amount of memory to determine whether a finite-state machine halts. Concretely, it uses 2*N bytes of memory if the underlying finite state machine uses N bytes of memory.

3. 9

In fact, if the machine where the code is running has finite memory, then the Halting problem is actually very easily solved (well, in theory). And there already exists various known algorithms that detect cycles of values (where the “value” in this context is the state of the program):

As you can see, if you use one of these cycle detection algorithms, then the Halting Problem is not undecidable. It’s easily solvable. The problem is that, while these known algorithms are able to solve the problem in constant memory, there are no known algorithms to solve it in a time-efficient way yet. But again, it is decidable, not undecidable.

This is wrong. Cycle detection works for Turing machines too. The equivalent halting problem for a finite machine is “there’s no algorithm that can tell if an arbitrary program, ran with sufficient memory, will halt.”

ie, take this program:

``````def find_odd_perfect_number:
i = 3
while True:
if is_perfect(i):
return i
i += 2
``````

That will eventually halt on any computer due to hitting memory limits. But hitting a memory limit does not tell you if it it would have halted if you had more memory!

I wonder how much more research could have advanced (like for example, in SMT solvers, formal methods, etc) if everyone actually believed that it’s possible for an algorithm to decide whether your program has a bug in it.

This is also wrong. We know it’s possible for an algorithm to find bugs, we’ve done that for decades. Rice’s Theorem says that there’s no algorithm that always finds bugs in any buggy program. It’s why SMT solving is, in reality, Very Hard.

1. 1

This is wrong. Cycle detection works for Turing machines too.

It’s not wrong. Cycle detection doesn’t work for Turing machines precisely because of the Halting problem. The program may run forever without repeating states (including the state of the Turing machine tape), so for certain programs you will never find a cycle using a cycle detection algorithm on a Turing machine, therefore these algorithms would never terminate.

Which to me, is only of interest as a theoretical exercise or as a means to explore the consequences of true infinity, but not for solving most real-world problems.

The equivalent halting problem for a finite machine is “there’s no algorithm that can tell if an arbitrary program, ran with sufficient memory, will halt.”

What is “running with sufficient memory”? Isn’t that the same as running on a Turing machine?

But hitting a memory limit does not tell you if it it would have halted if you had more memory!

Sure, but that problem is not very interesting for the vast majority of cases, because in the vast majority of cases you’re actually more interested in knowing whether the program you’ve built is going to crash even if you give it a huge amount of memory than knowing whether it will work correctly on an imaginary machine with infinite memory which nobody will ever have.

Generally speaking, knowing whether a program would halt on a Turing machine is useful for solving mathematical problems related to unbounded / infinite limits rather than solving real-world problems.

This is also wrong. We know it’s possible for an algorithm to find bugs, we’ve done that for decades. Rice’s Theorem says that there’s no algorithm that always finds bugs in any buggy program.

No, it’s not wrong. Rice’s theorem only says that for imaginary Turing machines, which is my entire point! You’re making the same mistake everyone usually makes. Rice’s theorem doesn’t say that for machines with finite memory, such as… computers.

It’s why SMT solving is, in reality, Very Hard.

That’s not because of Rice’s theorem! You just proved my point. You also believed that SMT solving is hard because of Rice’s theorem which means that these widespread beliefs about the Halting problem and Rice’s theorem are making people believe things which are not true and are discouraging people from pursuing these avenues of research.

1. 4

It’s not wrong. Cycle detection doesn’t work for Turing machines precisely because of the Halting problem. The program may run forever without repeating states (including the state of the Turing machine tape), so for certain programs you will never find a cycle using a cycle detection algorithm on a Turing machine, because these algorithms would never terminate.

It actually works better on Turing machines because in a real computer, cycle detection doesn’t work if the cycle is larger than your available memory. But in both abstract and real machines, it can’t tell you if a cycle is larger than your available memory.

You just proved my point. You also believed that SMT solving is hard because of Rice’s theorem which means that these widespread beliefs about the Halting problem and Rice’s theorem are making people believe things which are not true and are discouraging people from pursuing these avenues of research.

I believe SMT solving is hard because I’ve worked with people who research SMT solving, and I assure you that anyone developing SMT solvers will tell you the exact same thing.

1. 1

It actually works better on Turing machines because in a real computer, cycle detection doesn’t work if the cycle is larger than your available memory.

So your point is that when:

1. The program being analyzed is assumed to run on a finite-state machine
2. The program that detects whether the above program halts is running on a Turing machine

… then it works just as well as (or even better than) if the second program was running on a finite-state machine.

If this is what you are arguing, then I agree but it’s irrelevant to what we were discussing. You said:

This is wrong. Cycle detection works for Turing machines too.

Sure, it works if the first program is assumed to run on a finite-state machine. But that’s not how the Halting problem is defined. The Halting problem is actually assuming that the first program runs on a machine with infinite memory. This is why people say the Halting problem is undecidable rather than decidable. And it’s also why it doesn’t apply to real-world computers. So what I said is correct.

But in both abstract and real machines, it can’t tell you if a cycle is larger than your available memory.

Cycles can’t be larger than the number of states in the available memory, by the pigeonhole principle.

You also believed that SMT solving is hard because of Rice’s theorem … I believe SMT solving is hard because I’ve worked with people who research SMT solving, and I assure you that anyone developing SMT solvers will tell you the exact same thing.

I know SMT solving is hard, but as far as I understand it’s not because of Rice’s theorem.

Or are you assuring me that those people also have mistaken beliefs, given that Rice’s theorem is false for finite state machines? If so, that’s even worse than I thought, then.

1. 4

Or are you assuring me that those people also have mistaken beliefs, given that Rice’s theorem is false for finite state machines? If so, that’s even worse than I thought, then.

Maybe it’s not the computer scientists who are mistaken here?

1. 1

Maybe it’s not the computer scientists who are mistaken here?

It is a known result in computer science that Rice’s theorem says the problem is undecidable for Turing machines, because the proof is based on the Halting problem and the Halting problem is undecidable for Turing machines.

However, it is also a known result in computer science that the Halting problem is decidable for finite-state machines, which makes the Rice’s theorem proof invalid for finite-state machines.

2. 2

Cycles can’t be larger than the number of states in the available memory, by the pigeonhole principle.

If there are `n` bits of memory available, then a cycle can have length up to `2^n`, so you need `n * 2^n` bits of memory to store the table of visited states.

1. 1

If there are n bits of memory available, then a cycle can have length up to 2^n, so you need n * 2^n bits of memory to store the table of visited states.

No, you don’t. The tortoise and hare algorithm detects such cycles and doesn’t store n * 2^n bits of memory, it only stores 2*N bits of memory.

1. 2

Alright, so it’s not limited by memory, but it still needs to walk through the entire cycle, which might take `2^n` steps.

1. 1

The currently known algorithms for detecting cycles might indeed take 2^n steps to detect a cycle.

I’m arguing that it’s possible to devise an algorithm that looks at the program description and can determine whether there are cycles or not, without actually having to simulate the program running step-by-step.

1. 2

So, to be clear, you’re claiming that there is an algorithm `H(P,I,n)` that takes a program `P` and determines whether it halts in fewer than `2^n` steps when run on input `I`, and the time complexity of this algorithm is polynomial in `n`?

1. 1

So, to be clear, you’re claiming that there is an algorithm H(P,I,n) that takes a program P and determines whether it halts in fewer than 2^n steps when run on input I, and the time complexity of this algorithm is polynomial in n?

Polynomial? I don’t know. There are reasons to believe this is not possible, but some of these reasons I find them a bit suspicious, especially because all complexity analysis is done with Turing machines. So it might be possible, I’m not sure. Or perhaps we can achieve quasi-polynomial time (i.e. QP).

SAT solving is NP-complete, so if P=NP perhaps it might be possible? Perhaps you just need to find a satisfying set of assignments in things like loop conditions to determine whether a program loops or not?

Of course, this is all assuming the program P uses a finite amount of memory.

But even if the time complexity of H is exponential, that says nothing about the efficiency of the algorithm, i..e how long it actually takes to solve the problem for reasonable and useful instances of programs.

1. 3
1. 1

EXP can still be quasi-polynomial (QP), I think. So far we only know that P is not EXP, as far as I know.

QP means that this could be something like O(2^(log(N)^1.1)), right? Which means for a state size of 1 TiB, it could take on the order of… 2^17 = 130k times more steps than a state size of 1 bit to calculate in the worst case, which is pretty awesome :)

My optimism is unprecedented…

2. 3

1. 1

How so?

1. 3

You are fixated on Turing machines. Here are some starting points for leaving that worldview behind:

• Untyped lambda calculus is Turing-complete. Specifically, it is not decidable whether a given term has a normal form. This formalism does not have a Turing-machine tape, and so does not suffer from running out of memory.
• Descriptive computational complexity can be used to describe `P`, `NP`, `EXP`, and several other common complexity classes without Turing machines. I have a diagram which shows this in detail. We can immediately see that `P` is contained in `EXP` by an analogue of the time-hierarchy theorem, for example; `EXP` is described by the same logic as `P`, but with second-order quantifiers instead of first-order quantifiers.
• Rice’s theorem does not depend on Turing machines. It can be proven directly from Tarski’s undefinability of truth, which shows that semantic deciders are generally too powerful to exist, and which itself is proven from the first incompleteness theorem.

You may find this book interesting. It explains all of these results, starting with Gödel and ending with Turing.

1. 1

This formalism does not have a Turing-machine tape, and so does not suffer from running out of memory.

I’m not entirely sure what you mean by this phrase. If lambda calculus is Turing-complete, then it means lambda calculus is as powerful as Turing machines, so it is able to use infinite amounts of state, just like a Turing-machine.

Which is the entire problem that I’m pointing out. If we restrict ourselves to analyzing programs that can actually run on a computer (i.e. those that use a finite amount of state), then all theories that were considered undecidable suddenly become decidable.

Descriptive computational complexity can be used to describe P, NP, EXP, and several other common complexity classes without Turing machines.

OK. I don’t quite understand your point here?

Rice’s theorem does not depend on Turing machines.

It depends on the program being analyzed being able to use an infinite amount of state, otherwise Rice’s theorem is false. Or are you saying Rice’s theorem is true for programs that use a finite amount of state?

1. 1

Your focus on state is specific to Turing machines. Lambda calculus does not have intermediate state; any expansion or reduction of terms is going to yield another term.

I think that Wikipedia puts it best in their page on Rice’s theorem:

It is important to note that Rice’s theorem does not concern the properties of machines or programs; it concerns properties of functions and languages.

I also like the phrasing from Yanofsky, p19:

Every nontrivial property of computable functions is not decidable.

Because we are discussing properties of functions, and not just properties of various encodings of functions (various programs), it doesn’t really matter which encoding we choose; all that is required is that anything computable is expressible with our encoding.

1. 1

Ok, I think I understand that now, although I’m not entirely sure (I’ve only really began to understand Rice’s theorem in the past couple of days).

But to me, that seems only interesting and important if you are interested in analyzing a type of computation that cannot be performed in our universe. Which I understand can be useful in certain scenarios, but I’m not sure that it is as useful as it seems for most problems.

I mean, we can theorize about turing-equivalent computation, analyze it, speculate about it and even extract conclusions about it. We can even perform a limited subset of this type of computation in the real world. But we can never truly perform this type of computation, which is a much stronger type of computation that what is possible to do in our universe.

You can still write programs that express such types of computation (e.g. programs written in Turing-complete languages that have no concept of memory allocation failure), with the understanding that if your program uses too much memory, it will crash/halt in the real world. Which means that when you are going to analyze how these programs will actually execute in the real world, then you should to that analysis using a model of computation that represents what computers can do, functionally, not what some theoretical super-powerful imaginary computation device would do in an imaginary universe.

If we limit ourselves to analyzing the type of computation that is possible to do in our universe, then there are problems which were undecidable in the previous model that actually become decidable.

I am not entirely sure what the analogous of Rice’s theorem would be for computations that are limited to using finite state. I mean, Rice’s theorem can be a little difficult to understand :) So I don’t know, perhaps you can clarify that for me:

It seems that Rice’s theorem is defined over the “computable functions”, where “computable functions” actually seems to be a misnomer, because some of these functions are theoretically impossible to compute in our universe, regardless of how much space and energy could be accessible to us: “computable functions are exactly the functions that can be calculated using a mechanical calculation device given unlimited amounts of time and storage space.”

If you were to define “computable functions” as “the set of functions that can be calculated using a mechanical calculation device given unlimited amounts of time but limited amounts of storage space”, then how would that look like? Because this set of functions are exactly the ones that deterministic finite state machines can calculate, and are also the ones that are theoretically possible to compute in our universe.

Given the above, I know Rice’s theorem proof relies on the Halting theorem, and the latter’s proof is not only invalid for deterministic finite-state machines, but in fact we can prove the opposite theorem (i.e. that such computation is decidable).

So if we were to define such a theorem analogous to Rice’s theorem but for computations that can be performed in our universe, then my intuition says that such theorem would be decidable given that you can always fully analyze this type of computation (e.g. since it is possible to simulate this type of computation step by step and guarantee that this simulation always terminates, because the simulation either terminates by itself or we can detect that it starts to repeat forever).

1. 2

There are relatively small Turing machines whose behaviors include halting after taking more steps than our observable universe can represent. For a summary of what we currently know, see this table of busy-beaver values for Turing machines. This contradicts your implicit claim that we can only prove facts about programs which are running on physical machines.

You are close to reinventing computational complexity theory. You are not the first person to wonder what happens when we limit the amount of resources available to a Turing machine.

I would strongly recommend working through Peter Smith’s book linked above, “An Introduction to Gödel’s Theorems”. This book introduces Turing machines after building up the rest of the machinery of computability theory; the work of Gödel, Tarski, and Turing is put into that particular ordering, and we can pause after Tarski’s undefinability of truth and use Yanofsky’s example to prove Rice’s theorem without Turing machines.

1. 1

So here’s another fun one for you.

Let me extract this excerpt from Wikipedia: “The busy beaver function quantifies the maximum score attainable by a Busy Beaver on a given measure. This is a noncomputable function.

The busy beaver function, Σ : N → N, is defined so that Σ(n) is the maximum attainable score (the maximum number of 1s finally on the tape) among all halting 2-symbol n-state Turing machines of the above-described type, when started on a blank tape. “

Ok, good. Now let’s say we use my definition of computable function (well, I’m sure it’s not literally my definition and that someone from many decades ago has already used an identical definition), which is the following: “computable functions are exactly the functions that can be calculated using a mechanical calculation device given unlimited amounts of time but limited amounts of storage space”.

You could say the following: “But @wizeman, if your computable functions are even more limited than the usual definition of super-powerful computable functions, then it’s even more impossible to compute the busy beaver function!”.

Yes, that’s true. But, the busy beaver is using a Turing machine as a model, which means it also uses the absurdly super-powerful definition of computable functions!

So how about we define the busy beaver function as such: “The busy beaver function, Σ : N x N → N, is defined so that Σ(n, k) is the maximum attainable score (the maximum number of 1s finally on the tape) among all halting 2-symbol (4 symbols including the endmarkers) n-state deterministic linear bounded automatons of the above-described type, when started on a blank tape of length k.”

What have we gained by doing this?

Well, suddenly, the busy beaver function actually becomes computable, even when using my non-super-powerful definition of computable functions!

Note that, just like Wikipedia says, a computable function does not imply that they can be efficiently computed (i.e. computed within a reasonable amount of time).

But it does mean that something that wasn’t even theoretically possible to do before (not to mention in practice), now is theoretically possible.

1. 1

There are relatively small Turing machines whose behaviors include halting after taking more steps than our observable universe can represent. For a summary of what we currently know, see this table of busy-beaver values for Turing machines.

Yeah. If by “small” you mean a Turing machine with 1.3×10^7036 bits of tape, sure. Remember, the busy beaver function calculates how many bits of state the tape ends up with (well, not bits, symbols, but it’s almost the same thing).

I can create a deterministic finite state machine that takes even longer to run: start with a 1.3 x 10^7036-bit integer initialized to zero and increment it on each step until it wraps around.

But I don’t know how that proves anything about what I said.

This contradicts your implicit claim that we can only prove facts about programs which are running on physical machines.

I think there’s two things wrong with this statement of yours:

1. I don’t think I claimed it’s impossible to prove facts about imaginary machines, even infinite ones. Which seems obvious to me, otherwise induction wouldn’t work.

2. I don’t think I claimed that it’s always practical to prove things about physical machines, which also seems obvious to me, as you can imagine building the biggest machine possible but then you don’t have enough energy to start it, let alone do the entire computation.

But I did mean that restricting ourselves to finite-state machines makes it theoretically possible to prove statements that are impossible to prove for those with infinite state.

prove Rice’s theorem without Turing machines.

You can prove Rice’s theorem without Turing machines but I don’t think you can prove it if you say that a computable function is one that can be computed with finite storage.

In fact, I believe I can prove the opposite of Rice’s theorem if we use that definition of computable function.

1. 1

You’ve misunderstood. A Turing machine’s description does not include its tape; it includes only the state transitions and symbols. I strongly recommend taking a crack at that PDF of Smith’s book.

1. 1

The description does not include its tape, but the Turing machine’s tape is part of its state.

That’s why a busy beaver can be a small program, terminate and yet take such a long time: it’s because it can access so much state. Remember: a program can never take longer to run than 2^N steps (where N=state size in bits), unless the program starts to loop forever. So if a busy beaver only had access to 10 bits worth of tape (+ K states in its description), it could never take longer than 2^10 = 1024 (* 2^K) steps to complete.

The finite-state program that increments a 1.3*10^7036-bit integer until it wraps around is also an extremely small program that takes even longer to terminate than a busy beaver that uses a tape as long as that integer.

2. 3

In fact, if the machine where the code is running has finite memory, then the Halting problem is actually very easily solved (well, in theory). And there already exists various known algorithms that detect cycles of values (where the “value” in this context is the state of the program):

That cycle detection algorithms work has nothing to do with the finiteness of the machine memory and everything with the finiteness of the memory used by the program. It is well known that you can solve the Halting Problem for specific subsets of all possible machines, where you can exploit some property of the program. Those algorithms also work on theoretical machines with infinite memory running the same programs.

That they wouldn’t work for programs that use infinite memory is a red herring: they also don’t work, in practice, for many programs that use a finite amount of memory. The same is true for any other subset for which the halting problem can be solved: there are practical limits that prevent them from being solved in general.

The proof for the Halting Problem does not really depend on the infiniteness of machine memory: you can always devise the intermediate machine such that it loops for an amount of time that makes the Halting Problem only solvable given more time than the lifetime of the universe. That is equivalent to not solving it.

(I am in general sympathetic to your point. I find it baffling that CS books don’t clearly say “yes, Rice’s theorem, but don’t let that stop you from proving many interesting properties about programs. I am a physicist by training; I have little patience for mathematical infinities. However, often results remain true when replacing infinities with the finite limits of the universe. This is such a case. For many practical purposes that is irrelevant though)

1. 1

That cycle detection algorithms work has nothing to do with the finiteness of the machine memory and everything with the finiteness of the memory used by the program.

Yes, that’s what I meant, actually. I was saying that as long as you assume that the program is going to run on a computer (which has finite memory), then you can always determine whether it halts or not.

It is well known that you can solve the Halting Problem for specific subsets of all possible machines, where you can exploit some property of the program.

And I was pointing out that the property of the program that we can exploit in this case is “the program can only use a finite amount of memory”, which is the subset of all programs that can run on a computer. An immensely useful subset if you can solve the Halting problem for these, I tell you! :-)

The proof for the Halting Problem does not really depend on the infiniteness of machine memory

Actually, it does subtly rely on the program being analyzed being able to use an infinite amount of memory.

How do I know this? Because the Halting problem is decidable if the program is only allowed to use finite memory, which means that the undecidability proof is invalid in this case.

you can always devise the intermediate machine such that it loops for an amount of time that makes the Halting Problem only solvable given more time than the lifetime of the universe. That is equivalent to not solving it.

No, that still makes it solvable. Your intermediate machine may very well loop for an infinite or almost infinite amount of time. That still makes the problem decidable. It is a separate issue how long an algorithm takes to solve the problem. And also, you don’t know how long this algorithm takes. It may very well be that such an algorithm can determine whether a program loops or not, much more quickly than how long the program loops for.

2. 2

There is a different problem: Can you tell me whether a program is malicious ? If so how ?

Let’s say it’s running something in a loop, and you’re trying to watch it to determine if it is malicious (looping file deletion). That is already a variation of the halting problem, because you cannot tell if the loop will halt (not deleting everything) and simply terminating it after X cycles isn’t enough. Because in this time it could have a) deleted too much or b) not actually finished the user intended work. The same goes for trying to find out if a program will behave differently for antivirus-scanning vs running normally.

1. 1

Can you tell me whether a program is malicious? If so how?

If you have a formal (i.e. mathematically precise) specification of what a malicious program is, then it is possible to solve it, yes.

I think the issue with your example is that you don’t have a specification, i.e. there isn’t enough information to solve the problem.

But you could specify, for example: a benign program never deletes any files that it hasn’t created before, or whose paths the user hasn’t passed as a (specific) command-line parameter, or which the user hasn’t selected in a GUI file-manager-like window.

Then it would be possible to analyze a program and determine whether it does that or not.

2. 2

This isn’t pedantry, this is just a misunderstanding of math and logic. Mathematical statements have to be precise. You can’t make a general statement about computers without having a precise, mathematically representable model of what a computer is. A Turing machine is one such model, and it’s believed to be a very good one, in part because of its correspondence to other similar models like the lambda calculus. But most notable is the register machine that is much closer to a real computer than a Turing machine, but still equivalent. So the halting problem result still holds.

Also, the halting problem has never discouraged anyone ever from research. I obviously don’t know every researcher, but it’s extremely well understood that the halting problem only applies to a general algorithm about all possible programs, and there are obviously classes of programs where you can easily know whether or not they halt. The biggest example there is the simply typed lambda calculus, of which programs are proven to always terminate.

The STLC is not a secret, it’s the basis of a huge amount of CS research. So the class of programs that the halting problem applies or doesn’t apply to is very well known.

1. 1

You can’t make a general statement about computers without having a precise, mathematically representable model of what a computer is.

How about we use a deterministic finite-state machine to model a computer? Would you be OK with that?

A Turing machine is one such model

No, it is not. That is my point.

It’s not one such model because Turing machines can run programs which no computer will ever be able to run, and I really mean ever. It doesn’t matter which computer you construct, it will never be able to run such programs like a Turing machine does.

Specifically, I am talking about those programs which use infinite amounts of state (which in a Turing machine, includes the tape).

That’s why the Wikipedia page about Turing machines says “(…) the computation of a real computer is based on finite states and thus not capable to simulate a Turing machine (…)”.

And it’s also why the Wikipedia page about finite-state machines says: “The finite-state machine has less computational power than some other models of computation such as the Turing machine.”.

So a Turing machine is not a model of a computer, it is a purely theoretical model of a super-powerful (due to its infinite tape) abstract machine which is impossible to construct in the real world and which no computer can ever emulate.

But most notable is the register machine that is much closer to a real computer than a Turing machine, but still equivalent.

Still not a model of a computer. Register machines are turing-equivalent, which means register machines can use infinite amounts of state as well, and thus they are a purely theoretical construction which is much more powerful than a computer will ever be.

For the sake of argument, imagine I create a model of a computer and tell you: hey, my computer model has an instruction which, when executed, can instantly tell tomorrow’s lottery numbers and put them in a register. And another instruction which kills everyone on the planet.

Would you say this is a model of a computer? Do you think it is really a model of what computers do? It does seem absurd, but a model of a computer with a truly (i.e. mathematically) infinite tape can be argued to be similarly absurd.

Note that I’m not saying that models have to be a perfect simulation of what a computer does, atom by atom.

But they should at least functionally represent what a computer does. And it’s OK to have unfaithful models, i.e. with certain assumptions or simplifications (like what Turing machines do, which assume the program can use infinite amounts of memory), but if you do so, you have to be careful about the implications of that and not assume that just because something is true in your model, then it is also true for computers. Otherwise you’re just misleading everyone.

So the halting problem result still holds.

It doesn’t, because of what I mentioned above. The Halting problem is decidable on finite-state machines.

and there are obviously classes of programs where you can easily know whether or not they halt.

Sure. But there is a particular class of programs, which is those that are only allowed to use a finite amount of state (i.e. those that can run on computers), for which problems such as the Halting problem and Rice’s theorem which are undecidable on Turing machines, suddenly become decidable.

So ironically, these Turing machines which are more powerful than computers, end up having limitations that finite-state machines like computers don’t have!

And not only these problems become decidable on computers, but suddenly it also becomes “easy” to prove statements about these programs, as the Tortoise and Hare algorithm (and other similar ones) demonstrate. However, there is still a huge practical issue in that the time complexity of these known algorithms is way too high to be useful, so we’d need more efficient algorithms for them to be useful for real-world programs.

1. 3

How about we use a deterministic finite-state machine to model a computer? Would you be OK with that?

I’m totally ok with that. That’s why I brought up the simply typed lambda calculus, which does not suffer from the halting problem and which has programs that are guaranteed to terminate. There are many models of computation that each have different properties and tradeoffs.

But infinite models still have immense practical utility. I have a good example - integers. The set of integers in math is infinite. Yet we can still use a finite subset of integers in our daily lives, and we know many things about finite subsets of integers because we’ve proven them on infinite sets.

Another occurrence of infinity is with interactive programs, like anything with a user interface or an event loop. While technically they don’t execute forever, or at least we don’t know if they do or not because we as humans can’t experience “forever,” you still have to reason about an interactive system as if it were infinite. It can be thought of as an infinite program that you can choose to halt, or as a finite program where you choose an arbitrary number of iterations beforehand. The distinction is totally moot though, because you have to use the same kind of logic in either case. Or in the case of FSMs, the number of finite states is so large that it being finite has no relevance to the real world, because you could never check anything about those states directly in any number of lifetimes.

You are arguing against the usage of infinite reasoning for real-world objects, and that we’re going to disagree on. That’s just how math works. You use infinity sometimes.

1. 1

But infinite models still have immense practical utility. I have a good example - integers. The set of integers in math is infinite.

Yes, that’s why I argue that Turing machines are ideal if you want to model a computation about mathematical statements, e.g. for figuring out the Collatz conjecture or whatever, i.e. statements which are about truly infinitely-large things, which practically only occur in math.

But for modeling computers and the execution of real-world programs, finite-state machines seem a lot more appropriate and representative.

Another occurrence of infinity is with interactive programs, like anything with a user interface or an event loop. While technically they don’t execute forever, or at least we don’t know if they do or not because we as humans can’t experience “forever,” you still have to reason about an interactive system as if it were infinite. It can be thought of as an infinite program that you can choose to halt, or as a finite program where you choose an arbitrary number of iterations beforehand

Actually, this type of infinity (in terms of time) is not necessarily an issue. As I mentioned earlier, the Tortoise and Hare algorithm (and others like it) can always deterministically detect whether programs halt or whether they loop forever, assuming the program is running on a finite-state machine. This means that it is possible to reason about these programs, i.e. Rice’s theorem is decidable for these programs.

So it doesn’t necessarily matter if they are designed to execute forever.

BTW, this is another example where the Halting problem, being usually defined for and studied on Turing machines, is misleading people into believing that reasoning about always-looping programs is undecidable.

The distinction is totally moot though, because you have to use the same kind of logic in either case.

No, this is not true. You can model programs that run forever as a deterministic finite-state machine, which don’t have the same undecidability problems as Turing machines.

Or in the case of FSMs, the number of finite states is so large that it being finite has no relevance to the real world, because you could never check anything about those states directly in any number of lifetimes.

How do you know this? Are you saying that to check statements about these programs you can only simulate them step-by-step while they run through all states, i.e. that you cannot answer these statements by looking at the description of the program?

You are arguing against the usage of infinite reasoning for real-world objects, and that we’re going to disagree on. That’s just how math works. You use infinity sometimes.

Well, math is math and the real world is the real world. They are not the same thing. In the first one, truly infinite things exist, in the second one, they do not (except time perhaps), or at least, they’ll never be accessible to us.

But if you disagree with me, that’s fine :-)

1. 4

But for modeling computers and the execution of real-world programs, finite-state machines seem a lot more appropriate and representative.

What problems are you working on that you’re thinking FSMs will be more effective?

You can model programs that run forever as a deterministic finite-state machine

You can. How would you prove or check a property about such a program? (see the next response for what you would do)

Are you saying that to check statements about these programs you can only simulate them step-by-step while they run through all states, i.e. that you cannot answer these statements by looking at the description of the program?

No, I’m saying you have two choices to prove or check properties about large state machines:

1. Run them step by step and see if the property is ever broken. We have plenty of experience with this in model checking, and anyone who has actually done model checking on a real model will tell you that you have to immensely abstract your system and keep your bounds very small for this to be useful in any practical way. This is possible in very small, controlled cases, but you are arguing that FSMs should replace all other forms of reasoning about programs. And experience shows that this is a very bad idea.

2. Use infinite reasoning such as induction to show the property holds on an infinite SM. This is the better idea, but the one that you are opposed to.

But if you disagree with me, that’s fine :-)

The debate isn’t between you and me, it’s between you and the CS community, I’m just relaying what is known about state machines in CS.

1. 1

What problems are you working on that you’re thinking FSMs will be more effective?

Writing computer programs and proving things about them.

To be clear, I’m not saying FSMs will be more effective right now. It’s just that it’s theoretically possible to solve problems for them that aren’t possible to solve for Turing machines.

You can. How would you prove or check a property about such a program? (see the next response for what you would do)

If you have an algorithm that does exactly what the Tortoise and Hare algorithm does, but works in a much more practically time-efficient manner (while still not using excessive amounts of memory), then it would be possible to use that algorithm to prove and check properties about programs that aren’t allowed to consume infinite memory.

Note that the Tortoise and Hare algorithm (and others like it) work without knowing anything about the program, they simply run the program step-by-step.

I’m conjecturing that it might be possible to greatly improve the performance of such an algorithm by looking at the program description and exploiting properties about how the program works.

No, I’m saying you have two choices to prove or check properties about large state machines: (…)

OK, then I think we misunderstood each other. You are describing the currently-known ways of solving this issue. But when you said this:

because you could never check anything about those states directly in any number of lifetimes.

I thought you were talking about something that is impossible to do, i.e. that can never be improved upon.

So yes, I agree with you, currently those are the most feasible ways to solve this problem. And of course, I use induction myself when I want to prove properties about my programs, because this is one of the currently most feasible ways to do this. So I know that they are useful right now and I also know about the limitations of both of them.

But I wasn’t talking about what is currently feasible, I’m talking about what might be possible to do.

In theory, as far as I know there’s nothing that says that to prove properties about a program running on a finite-state machine, that you have to 1) run this program step-by-step like in a model checker, or 2) do things like induction.

In other words, I don’t think that those will always be the only options, especially because it’s possible to solve problems in FSMs that aren’t possible to solve for Turing machines. But I’d be happy to be proven otherwise!

1. 2

Writing computer programs and proving things about them.

A worthy goal :)

I think we’ve arrived at mutual understanding. I won’t try and discourage optimism, because we should always be looking for improvements over what we know today. It would be great if we could reduce any part of computation to deterministic, finite, and manageable structures.

For today, I stick with Leslie Lamport’s advice and point of view, and focus on the techniques of infinite state machines.

2. 1

This is wrong. In particular, Brainfuck is Turing-complete, implementable on physical computers, and simulates a tape.

1. 1

This is wrong. In particular, Brainfuck is Turing-complete, implementable on physical computers, and simulates a tape.

What do you mean exactly?

If it’s Turing-complete, by definition it can’t be simulated on a physical computer.

That’s why the Wikipedia page about Turing machines says “(…) the computation of a real computer is based on finite states and thus not capable to simulate a Turing machine (…)”.

And it’s also why the Wikipedia page about finite-state machines says: “The finite-state machine has less computational power than some other models of computation such as the Turing machine.”.

1. 29

Why Twitter didn’t go down … yet

I was hoping for some insights into the failure modes and timelines to expect from losing so many staff.

This thread https://twitter.com/atax1a/status/1594880931042824192 has some interesting peeks into some of the infrastructure underneath Mesos / Aurora.

1. 12

And yesterday it was possible to post entire movies (in few-minute snippets) in Twitter, because the copyright enforcement systems were broken.

1. 5

That tweet got deleted. At this point it’s probably better to archive them and post links of that.

1. 11

It wasn’t deleted - there’s an ongoing problem over the last few days where the first tweet of a thread doesn’t load on the thread view page. The original text of the linked tweet is this:

I’ve seen a lot of people asking “why does everyone think Twitter is doomed?”

As an SRE and sysadmin with 10+ years of industry experience, I wanted to write up a few scenarios that are real threats to the integrity of the bird site over the coming weeks.

1. 12

It wasn’t deleted - there’s an ongoing problem over the last few days where the first tweet of a thread doesn’t load on the thread view page.

It’s been a problem over the last few weeks at least. Just refresh the page a few times and you should eventually see the tweet. Rather than the whole site going down at once, I expect these kinds of weird problems will start to appear and degrade Twitter slowly over time. Major props to their former infrastructure engineers/SREs for making the site resilient to the layoffs/firings though!

1. 2

Not only to the infra/SREs but also to the backend engineers. Much of the built-in fault-tolerance of the stack was created by them.

2. 2

I have this URL archived too, but it seems to still be working.

1. 1

hm, most likely someone would have a mastodon bridge following these accounts RT-ing :-)

2. 2

FWIW, I just tried to get my Twitter archive downloaded and I never received an SMS from the SMS verifier. I switched to verify by email and it went instantly. I also still haven’t received the archive itself. God knows how long that queue is…

1. 2

I think it took about 2 or 3 days for my archive to arrive last week.

3. 2

oh, so they still run mesos? thought everyone had by now switched to k8s…

1. 13

I used to help run a fairly decent sized Mesos cluster – I think at our pre-AWS peak we were around 90-130 physical nodes.

It was great! It was the definition of infrastructure that “just ticked along”. So it got neglected, and people forgot about how to properly manage it. It just kept on keeping on with minimal to almost no oversight for many months while we got distracted with “business priorities”, and we all kinda forgot it was a thing.

Then one day one of our aggregator switches flaked out and all of a sudden our nice cluster ended up partitioned … two, or three ways? It’s been years, so the details are fuzzy, but I do remember

• some stuff that was running still ran – but if you had dependencies on the other end of the partition there was lots of systems failing health checks & trying to get replacements to spin up
• Zookeeper couldn’t establish a quorum and refused to elect a new leader so Mesos master went unavailable, meaning you didn’t get to schedule new jobs
• a whole bunch of business critical batch processes wouldn’t start

It was a very painful lesson. As someone on one of these twitter threads posted, “asking ‘why hasn’t Twitter gone down yet?’ is like shooting the pilot and then saying they weren’t needed because the plane hasn’t crashed yet”.

1. 8

Twitter is well beyond the scale where k8s is a plausible option.

1. 2

I wonder what is the largest company that primarily runs on k8s. The biggest I can think of is Target.

1. 3

There’s no limit to the size of company that can run on kube if you can run things across multiple clusters. The problem comes if you routinely have clusters get big rather than staying small.

1. 1

Alibaba, probably.

1. 1

Oh, I didn’t realize that was their main platform.

2. 1
1. 2

I was thinking about that too, but I’m guessing that CFA has a fraction of the traffic of Target (especially this time of year). Love those sandwiches though…

2. 2

1. 1

I work at a shop with about 1k containers being managed by mesos and it is a breath of fresh air after having been forced to use k8s. There is so much less cognitive overhead to diagnosing operational issues. That said, I think any mesos ecosystem will be only as good as the tooling written around it. Setting up load balancing, for instance . . . just as easy to get wrong as right.

1. 2

Annoyingly, blocking these requests did not surface any error – it seems that whatever that Stripe is scraping is non-essential

Seems like a lot of the value add for Stripe is their magic sauce differentiating fraud from non-fraud. And, much like people, they try to widen their “line-of-sight” on their counter party. I assume they do this both to check for red flags, and probably also to check for “green” flags too. Stripe scraping the website might just be part of their fraud differentiation, and that seems to me to be consistent with the blocked scraping not surfacing an error.

1. 1

Extremely advanced anti-fraud systems are fairy standard across the American payment card industry, from what I’ve heard. As a developer, Stripe’s SDK and API, especially around subscription management and comprehensive story for suspended payments, is the primary reason to choose them.

1. 1

They really don’t care as much about fraud as we be optimal for society, because their customers, the ones receiving the fraudulent payments, are the ones that bear the risk. And those customers don’t have much choice due to the creditcard duopoly.

2. 1

It being anti fraud would also explain why it claims to be Chrome rather than explicitly saying “Stripe” in the user agent string.

It’s slightly worrying that it says outdated Chrome. In the unlikely but not totally implausible event that they really are using headless Chrome to scrape pages, they could have all sorts of unpatched old vulnerabilities. ;)

1. 5

I tinkered a bit more with it after posting this, specifically because I was wondering the same thing, and yeah… it really looks like they are using a very old version of headless Chrome.

1. 2

So we’re finally at the point where you don’t target normal visitors, but towards scarper bots.

1. 1

Oh eek

I hope they sandbox that well

1. 3

Its not as fun but im pretty sure 16gb ram on hetzner cloud is the same price as 4 at digitalocean. For like 40 a month you could get a dedicated ryzen 6 core 12 thread server with 64gb ram.

1. 5

maybe, but I personally don’t like the “just throw more hardware at it” approach to solving (most) problems

1. 4

Fair; but “throw more hardware at it temporarily while we come up with a fix for our suddenly not fit for purpose architecture” is IMO a reasonable approach..

1. 3

yeah, but changing hosting providers and moving your entire infrastructure over usually isn’t a simple temporary thing, right? I know it can be in some situations, but given how many moving parts there are to hosting mastodon…

1. 2

I don’t want to host it because of that.

Setting up the web process (es), worker process (es), proxy process per host, object storage, elasticsearch and postgresql -and monitoring for each, resource limits (so a runaway process doesn’t prevent you SSH-ing in), backups… Sooner or later it seems like real work.

By contrast, `honk` offers fewer features, but running it requires a proxy process (for TLS), a honk process, and a directory to put data in (which you can back up using `rsync`). Was a bit too bare-bones for me, though.

Somewhere in between, `plemora` does nearly everything that `mastodon` does (including API-compatibility for client apps), but you don’t need elasticsearch or separate worker processes.

2. 2

Well, they threw a raspi 4 at it… 😀

1. 4

It shows confidence in your product to say “here’s all the ways we can be broken.”

1. 1

Well, all the ways they know of, with a few catch-alls such as ‘undiscovered bugs in sqlite, the kernel, fs drivers, …’

1. 7

What can I use this for? Neither this explanation nor the linked pages help me understand where when can I use Jevko (nor how to pronounce it).

1. 4

What can I use this for?

For the same things you could use XML/JSON/TOML for.

Neither this explanation nor the linked pages help me understand where when can I use Jevko

According to the website:

It has no data types, no semantics, no underlying model of cons cells or anything similar. It’s as close to pure generic syntax as it gets.

So at the lowest level Jevko is a minimal formal specification for flexible trees of text.

For its simplicity it seems quite powerful to me. I don’t see it replacing JSON/markdown or even TOML. But is seems trivial to implement on very low-level, older, less bloated, or less powerful systems.

Note: I am not the author of this tool, I merely posted its link because I find it an elegant and versatile solution.

1. 3

@eterps’ answer points in the right direction.

To expand on that, perhaps a better starting point is the home page: https://jevko.org/ Which includes a formal specification: https://jevko.org/spec.html

Pronunciation instruction are there as well ;)

The name Jevko /ˈdʒef.kɔ/ is derived from Polish drzewko (/ˈdʐɛf.kɔ/), meaning small tree.

There is also https://jevko.github.io/ which has a lousy demo that I haven’t updated in a while: https://jevko.github.io/interjevko.bundle.html

Finally, the repos on GitHub may give a vague idea on what this can be used for.

1. 4

So is it sorta like Rebol without a defined execution semantic behind it (e.g. no do/parse dialect)?

1. 1

The things in common with Rebol are the spirit of Lisp and the square brackets.

Important differences are that Jevko is only a syntax, like S-expressions or whatever-the-name-of-Rebol-syntax-is.

Indeed there is no semantics attached to it at all.

You could make up some semantics and build a language. A toy example: https://github.com/jevko/jevkalk

Another important syntactic difference from S-exp/Rebol-syntax is that whitespaces are not a separator in Jevko.

The toy language above exploits that to allow spaces in identifiers, so instead of `frobCount` or `frob_count` you can write `frob count`. It’s pretty fun.

Another difference is that the syntax is designed in such a way that it should parse into things that look like name-value pairs, which makes it easier to process a lot of things, e.g.:

``````c [d]
a [b]
``````

would parse to something like

``````[[["c ",[[],"d"]],["\na ",[[],"b"]]],""]
``````

which has `c` paired with `[d]` and `a` paired with `[b]`.

This structure is useful for further processing in many applications.

1. 2

I think that for many practical purposes those trailing spaces will become very annoying very fast.

1. 2

For such purposes, you’d trim them. E.g. https://github.com/jevko/easyjevko.js and https://github.com/jevko/easyjevko.lua implement a very simple format on top of Jevko which takes care of trimming leading and trailing spaces automatically and converts Jevko parse trees into JS/Lua objects/tables/arrays/strings.

1. 3

Of course, but for many practical purposes splitting on whitespace is what you want to do anyway and for all those purposes Jevko is at a disadvantage relative to s-expressions. I’d say simplicity includes how simple it is to use for your purposes and an additional trimming step is less simple than a format that doesn’t require that trimming step. So I don’t think Jevko is simpler than s-expressions for many purposes.

1. 2

What I should’ve written is ~ that as a someone who is interested in a specific behavior like the one you are describing, you would want to use not plain Jevko, but a Jevko format that does the right thing for you.

The simplest one right now is the one I linked, although formal spec for that is still TODO.

However it might still turn out to do the wrong thing for you (there is no splitting on spaces, just trimming). In such case you’d indeed need to create your own or wait for one being available.

Or just use S-expressions, of course! ;)

Although there are several reasons why you might still pick Jevko, especially if a formally specified format that meets your constraints shall become available.

Authoritative formal specification being one advantage.

Another might be convenient properties of Jevko, such as being closed under concatenation by design (not true for S-exp).

2. 2

Important differences are that Jevko is only a syntax, like S-expressions or whatever-the-name-of-Rebol-syntax-is.

That’s actually what Rebol is as well; the dialects interpret the syntax, that’s why you can have overloaded values between say Rebol/View, Rebol/Do, Rebol/Secure, Rebol/Load, &c.

The toy language above exploits that to allow spaces in identifiers, so instead of frobCount or frob_count you can write frob count. It’s pretty fun.

ah, there are a few languages like that, various Algol dialects didn’t treat space as a separator, so you could have identifiers with spaces in them. TCL has some interesting processing sorta like jevkalk has.

The processing of structures is quite nice, esp without additional syntax. How do you delineate that from those explicitly wrapped in `[ ]` pairs?

1. 1

That’s actually what Rebol is as well; the dialects interpret the syntax, that’s why you can have overloaded values between say Rebol/View, Rebol/Do, Rebol/Secure, Rebol/Load, &c.

I don’t know Rebol below the very surface, but I presume that there is a central language that binds all these dialects, isn’t there? I.e. the basic semantics. The point is that it is useful to separate Rebol-the-language (syntax+semantics) from the syntax alone in this context.

ah, there are a few languages like that, various Algol dialects didn’t treat space as a separator, so you could have identifiers with spaces in them. TCL has some interesting processing sorta like jevkalk has.

These early Algols and COBOLs didn’t really do the same thing – they simply ignored the whitespace, so `abc` is the same as `ab c` or `a b c`, etc.

The one notable language that did a similar thing is actually very early Lisp. Initially S-expressions had the form `(process, frob count, another argument)` rather than `(process frob-count another-argument)`, i.e. comma- rather than space-separated.

Tcl is very nice in its minimalism. And it does have a similar syntax to Jevko, namely the Tcl braces. However, it’s still a bit different: https://www.reddit.com/r/sexpr/comments/xplmh7/tcl_braces_are_similar_to_jevko_only_3_special/iq5nt1j/

The processing of structures is quite nice, esp without additional syntax. How do you delineate that from those explicitly wrapped in [ ] pairs?

Not sure if I understand the question. Could you give an example?

1. 2

I don’t know Rebol below the very surface, but I presume that there is a central language that binds all these dialects, isn’t there? I.e. the basic semantics. The point is that it is useful to separate Rebol-the-language (syntax+semantics) from the syntax alone in this context.

No no, that’s not the case; the interpretive context applies what the symbols within the data means, otherwise it’s just data without any other meaning, quite like what you’re doing here really.

These early Algols and COBOLs didn’t really do the same thing – they simply ignored the whitespace, so abc is the same as ab c or a b c, etc.

Algol68 and PL/I actually did support whitespace in identifiers, as did ABC. It’s not very commonly used, but it is different from what Algol60, early Fortran, &c did by being insensitive to whitespace.

Additionally, early Lisp used M-Expressions, similar to Mathematica, but you also had LOGO which could do all sorts of things with it’s quoting conventions.

Not sure if I understand the question. Could you give an example?

I apologize, what I meant was what happens with

``````c [d]
a [b]
``````

when we surround them by brackets; is it the same parse tree as:

``````[
c [d]
a [b]
]
``````

or

``````[c [d]]
[a [b]]
``````

or

``````[
[c [d]]
[a [b]]
]
``````

?

1. 1

No no, that’s not the case; the interpretive context applies what the symbols within the data means, otherwise it’s just data without any other meaning, quite like what you’re doing here really.

Interesting. It makes sense: it’s merely a context-dependent evaluation strategy.

But still, for this discussion that’s the irrelevant part.

What is relevant is the uninterpreted data, i.e. the syntax tree, the result of parsing Rebol syntax. I don’t know if it is formally specified somewhere, but if it was, then this is the thing to compare to Jevko. Jevko itself is just a simple syntax.

Just want to be very clear about that. :)

Algol68 and PL/I actually did support whitespace in identifiers, as did ABC. It’s not very commonly used, but it is different from what Algol60, early Fortran, &c did by being insensitive to whitespace.

I did some research on this not that long ago, which involved looking at reference manuals and reports on various old languages.

AFAICT Algol 68 and PL/I didn’t support it in the same way I have implemented here or the way early Lisp did[0].

Algol 68 was space-insensitive, according to this report:

https://www.softwarepreservation.org/projects/ALGOL/report/Algol68_revised_report-AB.pdf

In particular see:

• Page 20, section 1.1.3.1.g).(v)

• Page 117, section 9.4.d)

This means space-insensitivity.

In any case there was a lot of rules involved, stropping, etc.

All this probably made sense when punched cards were still used. I don’t know.

PL/I spec here:

http://www.bitsavers.org/pdf/honeywell/multics/AG94-02_PLI_langSpec_Mar81.pdf

Section 2.6.1. does not allow identifiers in spaces. Section 2.6.4. explicitly defines space as a delimiter.

I can’t find any formal description of ABC, only this: https://homepages.cwi.nl/~steven/abc/programmers/handbook.html

A quick look suggests that it indeed supported identifiers with spaces. That’s interesting! I’m curious about the details.

Additionally, early Lisp used M-Expressions, similar to Mathematica, but you also had LOGO which could do all sorts of things with it’s quoting conventions.

Yes, Jevko is similar to M-expressions in that it also uses square brackets and a Lisp written in it would prefer `f[x]` over `[[f][x]]`.

I think M-expressions were generally a good idea. S-expressions won likely because they were were clearly enough defined to be implemented and were sufficient to express everything. M-expressions weren’t as clearly defined and used S-expressions as a subset, so there was no point in bothering.

If Lisp was specified only using simplified M-expressions from the beginning, maybe something like Jevko would’ve evolved long ago.

Ad. quoting conventions and similar ways to specify identifiers with spaces (present in some lanugages today) – the problem with them is that it’s extra work and it’s ugly compared to regular identifiers. Identifiers with spaces are kind of tacked on, are not first-class, so to speak. So nobody uses that.

On the other hand if it was natural and simple maybe it would catch on.

Or maybe it wouldn’t. Clearly it didn’t in this timeline. But it’s fun to speculate and design.

Ah, now I see, thank you for the examples.

Each of these examples would parse to a different parse tree. Brackets always build a subtree.

In fact nothing is really lost in a Jevko parse tree. The special characters may or may not be implied, but otherwise it’s a concrete syntax tree.

Below are the parse trees for your examples in order, starting with:

``````c [d]
a [b]
``````
``````{"subjevkos":[{"prefix":"c ","jevko":{"subjevkos":[],"suffix":"d"}},{"prefix":"\na ","jevko":{"subjevkos":[],"suffix":"b"}}],"suffix":"","opener":"[","closer":"]","escaper":"`"}
``````
``````{"subjevkos":[{"prefix":"","jevko":{"subjevkos":[{"prefix":"\n  c ","jevko":{"subjevkos":[],"suffix":"d"}},{"prefix":"\n  a ","jevko":{"subjevkos":[],"suffix":"b"}}],"suffix":"\n  "}}],"suffix":"","opener":"[","closer":"]","escaper":"`"}
``````
``````{"subjevkos":[{"prefix":"","jevko":{"subjevkos":[{"prefix":"\n  c ","jevko":{"subjevkos":[],"suffix":"d"}},{"prefix":"\n  a ","jevko":{"subjevkos":[],"suffix":"b"}}],"suffix":"\n  "}}],"suffix":"","opener":"[","closer":"]","escaper":"`"}
``````
``````{"subjevkos":[{"prefix":"","jevko":{"subjevkos":[{"prefix":"\n","jevko":{"subjevkos":[{"prefix":"c ","jevko":{"subjevkos":[],"suffix":"d"}}],"suffix":""}},{"prefix":"\n","jevko":{"subjevkos":[{"prefix":"a ","jevko":{"subjevkos":[],"suffix":"b"}}],"suffix":""}}],"suffix":"\n"}}],"suffix":"","opener":"[","closer":"]","escaper":"`"}
``````

These were produced by https://github.com/jevko/parsejevko.js

Notice that they all end in:

``````"opener":"[","closer":"]","escaper":"`"}
``````

These are the three special characters in Jevko. They are stored at the top level of the parse tree, so you can recover the original text from it, even if you configure the parser to use nonstandard special characters.

[0] In fact early Lisp also didn’t do the exact same thing as my implementation, as it allowed only single embedded blanks. That may be a good idea.

1. 4

It’s a parable about doing something for the first time without having a measured overview over the landscape in which you are operating.

It chronicles a self-directed learning experience with false starts and dead ends.

The only way to prevent it is to have someone with relevant experience at hand (and actually consult them, more often than seems necessary, and actually listen to them).

So, yes.

1. 2

The main trouble with these kinds of checklists is: when to apply which rules. If you go through this list for every little code change, you get death by procedure/bureaucracy. But not doing something consistently usually means it is only haphazardly applied or not at all. So how do you operationalize such checklists?

1. 4

I’d argue it isn’t so much a “checklist” as it is “these are the sorts of questions/continuing internal critiques/principles that a good engineer should have at all times when working”.

At some point, we’ve gotta earn the salaries we’re getting paid.

1. 2

It doesn’t work that way. You could say the same for doctors, yet explicit checklists improve healthcare outcomes. You need to explicitly think about it; implicit judgments miss a lot.

Moreover, many of these principles aren’t things you ‘just know’ as a good engineer. Take number 3: that requires explicitly having a colleague explain something to you, so they can verify they’ve understood.

1. 2

It’s always surprising which commerce sites are good and which aren’t. It seems that the larger the brand, the worse their site. I was recently delighted by Siemens though: searching for my 14 year old fridge, I found it, with complete drawings and the ability to find and order every single part.

1. 3

Doesn’t seem any different from coming up with good search terms to get Google to find what you’re looking for? I notice I can often find things others fail to find. The only difference is better (in the sense that it gets the desired results; not that they are better in some more objective sense) search terms.

1. 1

Maybe a difference in magnitude, rather than kind? I’ve never had to spend more than a minute or two maximum finagling search engine results, but scrying good looking images out of Stable Diffusion can be an hours long process very easily.

1. 6

Better title: a minor Windows program is a lot slower than it should be, for this user.

1. 35

Given his track record of identifying performance problems in Windows apps, and not just minor ones. which mostly boil down to “the app is doing way to much, often unnecessary, work”, I think he is entitled to use this title.

1. 1

That may well be, but this specific article in isolation (which is what most of us are judging by) does not substantiate the claim. The author even acknowledges it in the first paragraph:

I apologize for this title because there are many things that can make modern software slow. Blindly applying one explanation without a bit of investigation is the software equivalent of a cargo cult. That said, this post describes one example of why modern software can be painfully slow.

1. 21

I don’t understand the apology and think the title is fine. I read it like: “Why criminals are caught (part 38) – (the case of) the Hound of Baskervilles”. Where the parenthetical parts are optional. You wouldn’t think a blog post with that title would claim all criminals are caught because of the specific reason in that case.

2. 11

for this user.

Are there users for whom waiting an extra 20 seconds before they can start using a program is acceptable?

1. 7

No, but there are likely users with fewer than 40,000 files in their Documents directory.

1. 2

I think you should rarely focus on the state machine and make it a central concept in your program, like you should rarely use a ‘fold’. In the case of fold, there are often higher level functions like map and filter you want to use. In the case of state machine, there are higher level patterns that should be the central concepts.

1. 1

My advice: if you can make your history append-only, do so. So e.g. you correct when someone’s new salary became valid, but you don’t change any of the records about payments owed (and made of course); rather you just add correction records. You can’t recalculate what someone was owed at a historical point in time given the knowledge at the time… but do you really need that information, given that you do have a record of what the system calculated at that time? That you can recalculate what they are actually owed and how large the correction should be is often enough.

1. 4

Because the fundamental misunderstanding here is that programming is hard because the tools make it hard. Programming is hard because it’s just a hard thing to do. It’s several layers of abstraction away from making some physical you can hold in your hands. There’s no Magic Programming Tool that is suddenly going to take programming from hard -> easy.

1. 7

I guess I kind of agree with this but there’s a major clarification needed. Programming complex things will never be easy, but it can be less frustrating if you use languages that are better than the status quo languages most people use nowadays. But plenty of those languages already exist today, the reason they’re not getting used isn’t that we haven’t invented enough new languages yet.

1. 2

Most things are hard and we use tools to make them easier. In ways programming is easier than making a physical thing with your hands, because you get easy modifications and do-overs.

1. 1

I believe the same, but would be very happy be proven wrong here.

1. 2

There’s clearly programmer pain—and pain that some of these new languages [..] could solve [..]

Well, that’s an assertion I’d like to see some evidence for.

I snipped “and tools” in the first [..], for which I consider this statement more likely to be true, but PL’s themselves?

1. 2

Yes! This guy gets it! I love this post.

1. 2

I think this quote alone is a devastating critique of the kind of arguments they take issue with:

system dynamics mechanizes software teams [..]: it models software teams as ruly phenomena that obey equations

1. 3

The thing is; when you connect your model to the reality then there’s bound to be some sweeping of assumptions under some rug… and rather than stop there for a while and looking at the “why?” we skip straight to the “how?” and I think that is rather polite in the micro but absolutely horrendously rude in the macro.

What he calls “mystical” (speaking statements as fact) I call “rainbow speak” - it’s a very common source of conflict online (some wires get crossed and you start getting fed propaganda for the wrong echo chamber => circular arguments of stating “facts” at each other)

1. 2

This can be taken too far in the other direction, too. We’re complex folks that work in teams, but sometimes simple models can still be rather accurate and important… I’m particularly thinking of Conway’s Law – it’s a red flag, to me, if an organization doesn’t respect Conway’s Law in their planning.

1. 9

On Rust’s “diversity numbers being terrible” and worse than the industry – is it worse than open source? For example, I think the last time anyone checked, there were many fewer women open source programmers vs. paid career programmers.

That is, I think all open source projects have this issue (and lobste.rs and HN seem to as well :-/ )

1. 4

My impression is that Rust’s gender ratio number is in fact worse than open source. One possible reason is that compiler is a field gender-coded to be masculine. That is, I think Rust-the-project’s number is worse than Rust ecosystem (e.g. bodil, the maintainer of im crate, is a woman) but similar to, say, LLVM.

1. 1

Seems like diffusion models could be used to hack “blurring”.

When people take screenshots of text, it’s common to obfuscate sensitive text like passwords via painting or blurring. It’s recommended to not use “blurring” because the original text can be brute-force calculated by blurring & comparing the images. i.e., blurring leaves a lot of information in tact, and that residual information can be used to reconstruct the original form.

It seems like you could train a diffusion model to “un-blur” text or even go a step further and un-blur faces too.

1. 1

You can do that without these models. Reversing such effects applied to photos has been used to restore the faces of child abusers that subsequently went to jail.