I see a lot of negativity in the answers here and I want to say that if you are looking to understand Rice’s theorem, this is an excellent, and succinct, tutorial.
@wizeman’s post is actually an interesting tangent, but it takes nothing away from the quality of this tutorial – it is raising a much larger question and the critique would apply to any tutorial, book, or video that treats these subjects of theoretical computer science.
I also really enjoyed it, except some of the last bits (around step 26) asked me about things from earlier and I was too lazy to scroll up and remind myself which of the 20+ program snippets had that particular name and so just button mashed to the end.
The reason that I find this kind of theorem unsatisfying is the universal quantification. Yes, it is impossible to determine whether a program halts, for all programs over all inputs. It is possible to show whether some large subset of all finite programs halt over unbounded inputs (@wizeman is somewhat missing the fact that we do have a practically infinite tape connected to our computers, we call it a network). The really interesting question, to me, is: what are the limits over which that computation is not just possible, but feasible. The same applies to Rice’s theorem. In practice, for a lot of problems, a proof assistant that returns true, false, or don’t know is useful, especially if it can provide some hints about how to modify the input to make the proof possible.
I agree totally on the computation question. The classic example is primitive recursion: if each case leads to “smaller” input then you have simple induction to prove that your program terminates.
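As a minimal sketch of that idea in Python (add is just an illustrative name; the recursion is on the structure of the first argument):

def add(n, m):
    # every recursive call is on the strictly smaller n - 1, so simple
    # induction on n proves termination for all natural-number inputs
    if n == 0:
        return m
    return add(n - 1, m) + 1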
I would be pretty interested in working through those kinds of formalisms, and investigating the bounds of interesting programs within there (after all, lots of programs we write in practice do terminate, so why?)
@wizeman is somewhat missing the fact that we do have a practically infinite tape connected to our computers, we call it a network
That’s a good point!
Strictly speaking, I don’t think a network is a truly infinite tape. But yes, for practical purposes, currently we’d have to model it as such, especially if programs are allowed to run indefinitely.
So I guess, now that I think about it, perhaps the bounded Halting problem is a more interesting Halting problem to me than the Halting problem itself, since it ensures by construction that the program can only use a finite amount of memory, regardless of whether it uses things like a network or not.
I wonder if we can always answer any questions about programs that only run for a bounded amount of time.
The really interesting question, to me, is: what are the limits over which that computation is not just possible, but feasible. The same applies to Rice’s theorem.
Yes, indeed! I completely agree that is the more interesting question. I just get bothered when people say it’s not even possible.
Edit: sorry, I substantially changed my comment as more thoughts occurred to me.
Strictly speaking, I don’t think a network is a truly infinite tape.
A Turing machine doesn’t actually require an infinite tape because it can only move the tape by one unit per time step. This is the basis for all of the proofs about space vs time complexity on Turing machines. The ‘infinite’ tape needs to be one entry longer than the number of time steps that the program runs for. This leads to an infinite tape only for programs that run for infinite time. One of the simplest non-terminating programs on a Turing machine advances the tape one step and repeats. This requires an infinite tape, but never revisits any entry and so is equivalent to a program that reads one byte from a socket, discards it, and loops.
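To illustrate, here is a toy simulator sketch (delta is a hypothetical transition function delta(state, symbol) -> (new_state, written_symbol, move), with move being -1 or +1): the tape can be allocated lazily, and after t steps at most t + 1 cells have ever been touched.

from collections import defaultdict

def run(delta, steps):
    tape = defaultdict(int)      # blank cells spring into existence when first touched
    head, state = 0, "start"
    for _ in range(steps):
        state, tape[head], move = delta(state, tape[head])
        head += move             # the head moves one cell per step
    return len(tape)             # never exceeds steps + 1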
Given the rate of growth in cloud storage versus the speed of a computer, the ‘tape’ length can be increased faster than it can be consumed by a single machine, which makes it equivalent to a Turing machine tape.
This then leads to more interesting economic questions than the pure theoretic computer science termination questions: can this computation be completed before I run out of money (or, more often, before I exhaust my users’ attention span).
A Turing machine doesn’t actually require an infinite tape because it can only move the tape by one unit per time step.
I don’t understand this argument. If the Turing machine had an unbounded but finite tape, rather than a truly mathematically infinite tape, then many proofs would not work and instead, we would believe the opposite of what we believe. Proofs such as the halting problem being undecidable, Rice’s theorem, the busy beaver function being uncomputable, etc…
If a Turing machine was what you said, it’s very likely that I would absolutely have no problem with it.
This leads to an infinite tape only for programs that run for infinite time.
Yes, but this is crucially important for all the proofs that I mentioned, otherwise it would be possible to prove the opposite of what they prove.
One of the simplest non-terminating programs on a Turing machine advances the tape one step and repeats. This requires an infinite tape, but never revisits any entry and so is equivalent to a program that reads one byte from a socket, discards it, and loops.
Right, but this is only one program out of an infinite number of programs that can actually use the entire length of the infinite tape.
For example, on a Turing machine, I can write a non-terminating program that advances the tape one step, then goes back to the beginning of the tape, reads or writes something for the entire existing length of the tape, then advances the tape one more step, and loops forever doing this. This program requires truly infinite storage (in the mathematical sense) and it would not be possible for this program to succeed executing forever unless the tape length is truly infinite.
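In Python-like pseudocode, that program has roughly this shape:

tape = [0]
while True:
    for cell in tape:        # rescan the entire tape written so far
        pass
    tape.append(0)           # then extend it by one cell, forever

Its memory use grows without bound, so no finite tape suffices.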
Of course, in the real world nothing would execute forever (actually, it could if the program runs on multiple computers), but when we prove things about programs we are only concerned with modeling the behavior of a program in an ideal (and hopefully, functionally representative) model.
The problem I have is that the Turing machine is not functionally representative of what computers can do. A finite-sized model such as a deterministic finite-state machine or a deterministic linear bounded automaton, on the other hand, is functionally representative of what computers can do, even if it has an incredibly huge but finite tape length.
This is why Turing equivalent computation is considered more powerful than that which finite-sized models can do (although ironically, it has more limitations) and it’s why I argue that Turing equivalent computation or anything more powerful should be considered hypercomputation.
Note that this does not prevent us from creating Turing-complete programming languages and creating programs written in those languages as a simplification, which allows us to express programs with Turing-equivalent computations. But the model we use to analyze the execution of those programs should represent what computers can do.
If you want, you can still use a Turing machine to analyze the execution of programs, but if you do this, you should be aware that the conclusions you extract may be absurd, so don’t mislead everyone into thinking that these conclusions apply to realistic computation devices (i.e. those with a finite size).
Given the rate of growth in cloud storage versus the speed of a computer, the ‘tape’ length can be increased faster than it can be consumed by a single machine, which makes it equivalent to a Turing machine tape.
No, it does not, because there is one very crucially important difference: cloud storage cannot keep increasing forever (and I mean forever in its true sense). That is the difference between “something grows so big that I cannot even imagine its size, but it will reach a point where it doesn’t grow anymore so it is still finite” and “truly mathematically infinite”.
If we actually used a machine model with a hugely, hugely big tape (say, with a tape length as large as the number of atoms in the universe), but it was still finite, rather than a Turing machine where the tape is mathematically infinite, then all of the problems I mentioned would disappear! The halting problem would be considered decidable, Rice’s theorem would be considered false, we could always compute the busy beaver function, etc.
Note, however, that there is a difference between computable functions and efficiently computable functions (i.e. whether they complete in a reasonable amount of time), and this is true whether a model has an infinite tape or a finite one. There is also a difference between efficiently computable functions and their time complexity, which can be (though not always) very different. So strictly speaking these are all different issues and they would have to be analyzed separately.
This then leads to more interesting economic questions than the pure theoretic computer science termination questions: can this computation be completed before I run out of money (or, more often, before I exhaust my users’ attention span).
Sure, but this also assumes a model with finite size (even if it can be very large), which leads to the answer being computable.
It seems like you are arguing for the same thing I am: that we should use a model with finite bounds, even if we use a bound so large that it is hard to comprehend. Unfortunately, this is not what we are doing currently, which leads to all sorts of paradoxes and mistaken conclusions about the kind of computation that is possible to do in our universe.
You have confused an endomorphism (a function which has the same type for domain and codomain, like a function from the natural numbers to the natural numbers) with the repeated application of that endomorphism to a starting value. The endomorphism is like a description of how a Turing machine takes a single step, while repeated application is like evaluation of that Turing machine on a given input. The Halting Problem, Rice’s theorem, etc. prove properties of the endomorphism, not properties of the repeated application.
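In code, the distinction looks roughly like this sketch:

from typing import Callable, TypeVar

S = TypeVar("S")
Step = Callable[[S], S]          # the endomorphism: one step, State -> State

def evaluate(step: Step, state: S, halted: Callable[[S], bool]) -> S:
    # the repeated application of the endomorphism to a starting value;
    # this is the part that may never return
    while not halted(state):
        state = step(state)
    return state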
You can argue that is true (although I look at it from another perspective) but that’s a bit of a pedantic argument: you are missing the forest for the trees.
Even if you see it from that perspective, you are missing the fact that the Halting Problem, Rice’s theorem, etc. can only be proved if the domain and codomain are infinite, which is what leads to those absurd conclusions.
There are an infinite number of functions that have infinite domains and codomains and which a real computer can never implement.
If the domains and codomains were finite, you could prove the opposite of those theorems.
Edit: I’d also like to say that your perspective is actually a very interesting way to think about the problem, I hadn’t really thought about it like that before. Thanks!
I’m sorry, I’ve only gone through the first few steps of the tutorial but the initial claims have already bothered me a lot. I have to be pedantic because this is the biggest pet peeve of mine:
Turing famously showed that computers can’t decide whether your code halts.
I would say this is actually false, even though almost everybody thinks it’s true. At least, if you’re interested in actually solving problems rather than being interested in what is mostly a mathematical/theoretical curiosity.
The Halting Problem is famously only undecidable if the code you’re analyzing is being modeled on a “Turing machine” or something equally “powerful” to it. The problem is, computers are not as powerful as Turing machines, and furthermore, Turing machines (or something equivalent to them) neither exist nor can ever exist in practice, because it’s impossible to build a “machine” with truly infinite memory.
In fact, if the machine where the code is running has finite memory, then the Halting problem is actually very easily solved (well, in theory). And there already exist various known algorithms that detect cycles of values (where the “value” in this context is the state of the program), such as Floyd’s tortoise-and-hare algorithm.
As you can see, if you use one of these cycle detection algorithms, then the Halting Problem is not undecidable. It’s easily solvable. The problem is that, while these known algorithms are able to solve the problem in constant memory, there are no known algorithms to solve it in a time-efficient way yet. But again, it is decidable, not undecidable.
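To make the idea concrete, here is a minimal sketch of how Floyd’s tortoise-and-hare algorithm decides halting when the analyzed program’s entire state fits in finite memory (step and halted are hypothetical helpers: one advances the program’s state by one instruction, the other tests for a final state):

def decides_halting(step, halted, initial):
    tortoise = hare = initial
    while True:
        if halted(hare):
            return True              # reached a final state: the program halts
        hare = step(hare)
        if halted(hare):
            return True
        hare = step(hare)            # the hare advances two steps per iteration
        tortoise = step(tortoise)    # the tortoise advances one
        if tortoise == hare:
            return False             # a state repeated: the program loops forever

Since the state space is finite, the run either reaches a halting state or revisits a state, and in the latter case the two pointers must eventually meet, so this always terminates with an answer.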
Now, I understand how Turing machines as a model can be useful, as the simplifying assumption of never having to worry about a memory limit might allow you to more easily do types of analysis that could be more difficult to do otherwise. And sure, the way the Halting Problem was “solved” was very clever (and I actually suspect that’s a major reason why it has reached its current legendary status).
But this simplifying assumption should never have allowed a result like the Halting Problem, which is known to be wrong for the type of computing that we actually care about, to be elevated to the status that it currently has, and by that I mean, to mislead everyone into thinking that the result applies to actual computers.
So again, I would say, be careful whenever you see a Turing machine being mentioned. I would say it would be better to view them as mostly a mathematical/theoretical curiosity. And yes, they can also be a powerful and useful tool IF you understand their limitations. But you should never confuse this model with the real world.
Henry Rice proved a much more devastating result: “computers can’t decide anything interesting about your code’s input-output!”
Since Rice’s theorem (and many other computer science results!) depends on the Halting Problem being undecidable, while it is actually decidable for real-world machines, my understanding is that Rice’s theorem is also false for real-world machines.
But Rice’s theorem (and sentences like the one the article is using) is absolutely another example of the huge number of cases where the Halting Problem has seemingly misled almost everyone into thinking that something can’t be solved when it actually can be solved.
I wonder how much more research could have advanced (like for example, in SMT solvers, formal methods, etc) if everyone actually believed that it’s possible for an algorithm to decide whether your program has a bug in it.
How many good people were discouraged into going into this area of research because of this widespread belief that this pursuit is proven to be unachievable in the general case?
Now, I’m not a computer scientist nor a mathematician or logician. So I could be wrong. If so, I would be interested in knowing how I’m wrong, as I’ve never seen this issue being discussed in a thorough way.
Edit: another (according to my understanding) completely misleading phrase in the tutorial exemplifying what I’m describing:
we learned Turing’s proof that halts is actually impossible to implement: any implementation you write will have a bug!
Now, to be clear I’m not blaming the author of the tutorial, as this is just one very tiny example of what actually is an extremely widespread belief across the whole industry.
The halting problem is a theorem. When the author writes “Turing famously showed that computers can’t decide whether your code halts.” they are referring to that theorem, even if the words they wrote are not precise and formal enough to state the actual halting problem.
Yes, it’s a good point that in reality our programs are finite and halting can be determined for this set of programs.
mislead everyone into thinking that the result applies to actual computers.
The bridge between theory and practice is a really big topic. There’s a lot that could be said on that topic. But I think the key take-away from the halting problem and Rice’s theorem for writing programs that analyze and process programs is this:
Your algorithm analysis system is going to need to output one of 3 things {yes, unknown, no} and not one of 2 things {yes, no}.
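Sketched in Python (the names are illustrative, not any particular tool’s API):

from enum import Enum

Verdict = Enum("Verdict", ["YES", "UNKNOWN", "NO"])

def analyze(program, checks):
    # each check is sound but incomplete: it may prove the property (YES),
    # refute it (NO), or give up (UNKNOWN)
    for check in checks:
        verdict = check(program)
        if verdict is not Verdict.UNKNOWN:
            return verdict
    return Verdict.UNKNOWN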
I wonder how much more research could have advanced if everyone actually believed that it’s possible for an algorithm to decide whether your program has a bug in it.
I definitely do not think we would be further ahead if Turing’s theorem had not been discovered. It’s not something that held work back, it’s something that informed more precise and correct directions for analysis of programs. So I believe we are better off for it.
Your algorithm analysis system is going to need to output one of 3 things {yes, unknown, no} and not one of 2 things {yes, no}.
Well, strictly speaking the system would still only tell you “yes” or “no”. But since Rice’s theorem is false for real-world computers (as far as I know), it would be possible to determine whether your program will halt because it ran out of memory, because some error was encountered or because it completed successfully, which is immensely useful to know for the vast majority of programs (but not for certain programs that are trying to solve strictly mathematical questions with infinite bounds).
I definitely do not think we would be further ahead if Turing’s theorem had not been discovered.
Of course not, I wasn’t trying to make that point.
It’s not something that held work back, it’s something that informed more precise and correct directions for analysis of programs. So I believe we are better off for it.
I think it’s not the theorem itself that held (and is holding) work back. It’s the widespread belief that it applies to real-world computers that holds work back.
Imagine how many people have been discouraged from pursuing work in formal methods because they believe it’s impossible to figure out if a program has a bug in the general case. I see engineers saying that it’s impossible to solve it all the freaking time.
Well, strictly speaking the system would still only tell you “yes” or “no”.
I don’t understand this. What I meant was that, for example, I cannot write a program that terminates and correctly tells if an arbitrary program halts, but I can write one that tries a set of checks and either reports that the input program does halt, does not halt, or that it cannot tell.
Imagine how many people have been discouraged from pursuing work in formal methods because they believe it’s impossible to figure out if a program has a bug in the general case. I see engineers saying that it’s impossible to solve it all the freaking time.
It is impossible in the general case (assuming unbounded memory). If you restrict to finite memory like real computers have, the memory is usually so large that it doesn’t make a practical difference. You can’t just note down the entire system state including all 16GB of memory every iteration and try to detect loops. This works in theory but not in practice.
I cannot write a program that terminates and correctly tells if an arbitrary program halts.
Yes, you can, because programs run on machines with finite memory. The cycle detection algorithms I mentioned are examples of such programs that always terminate and correctly tell you whether an arbitrary program halts.
It is impossible in the general case (assuming unbounded memory).
But you shouldn’t assume unbounded memory, because that doesn’t exist.
If you restrict to finite memory like real computers have, the memory is usually so large that it doesn’t make a practical difference. You can’t just note down the entire system state including all 16GB of memory every iteration and try to detect loops. This works in theory but not in practice.
Sure, currently it’s impractical, but as far as I know there’s no proof that it will always be impractical. I think you’re assuming that to determine whether a program halts, you have to simulate running the program step by step while it runs through all the states.
But I haven’t seen a convincing argument that you necessarily have to do this to solve the problem in general and that you can’t just analyze the program description to determine whether it will halt.
A cycle detection algorithm has to detect, by definition, that a Turing machine is in a state that it was in earlier. That is what it means for there to be a cycle. Detecting cycles in other ways requires program analysis and gets into the territory of Rice’s theorem. I’m not assuming anything about how the algorithm does this. Direct inspection of the physical implementation of the state representation is just one way. It could be programmatic interrogation via an API.
The number of possible states of a machine with n bits of memory is 2^n. This exceeds the number of particles in the universe for even a modest number of bits. That means you can make a small program that just cycles through all possible states, for which you cannot physically have a second Turing machine that enumerates which states have already been seen. That’s an impossibility theorem for all I care.
But didn’t I just show, here and in my other comment about the finiteness of space, that the Halting Problem, and thus Rice’s theorem, are still valid in practice?
This entire argument doesn’t change anything about what we can do in practice.
didn’t I just show (…) that the Halting Problem, and thus Rice’s theorem, are still valid in practice?
But they’re not valid in practice! In practice, there are already known algorithms that truly solve the Halting problem for finite state machines. Like the Tortoise and Hare algorithm. They truly solve this problem, in practice. Which means the Halting theorem’s proof is invalid for finite-state machines. The problem is decidable and there are known solutions.
What you’re talking about, perhaps, is that the time complexity of these algorithms, currently, is too great to be useful in practice. Which, if that is what you are arguing, then I agree. But that’s not what I was arguing about:
The number of possible states of a machine with n bits of memory is 2^n. This exceeds the number of particles in the universe for even a modest number of bits.
If you really were arguing about the time complexity of the currently-known algorithms, then sure, that is relevant.
But I was very clear about this: I was not arguing about what we can currently do, I was arguing about what might be possible to do in the future.
And my point is that you don’t have to necessarily iterate through all of these states to detect a cycle. It might be possible to construct an algorithm that solves the exact same issue without always running the program step by step, but instead, actually looks at the program and exploits how the program works to do the analysis much faster.
That means you can make a small program that just cycles through all possible states, for which you cannot physically have a second Turing machine that enumerates which states have already been seen.
This also seems to be irrelevant to what I was discussing, because again, you don’t have to necessarily enumerate all states which have already been seen in order to detect when a cycle of states starts to repeat forever.
I mean, sure, in the vast majority of cases, currently, you’d have no other choice but to do this. But again, I was very clear about this: I’m not arguing about what we can currently do, I’m arguing about what might be possible to do in the future, especially given that this problem is decidable.
Yes, you can, because programs run on machines with finite memory. The cycle detection algorithms I mentioned are examples of such programs that always terminate and correctly tell you whether an arbitrary program halts.
I think you’re assuming that to determine whether a program halts, you have to simulate running the program step by step while it runs through all the states.
Cycle detection algorithms work by running the program and watching the states. They don’t work by analyzing the program description.
You were speculating that there might be practical, efficient algorithms to do this. This seems unlikely to me even in the general (finite) case, as I’d guess an efficient algorithm for this would be equivalent to proving P = NP.
I agree with you. I think that it’s unlikely, but I also think that it’s possible.
As an example, primality testing was not believed to be polynomial-time for many decades, until finally someone discovered a deterministic polynomial-time algorithm in 2002, quite unexpectedly at the time.
As additional evidence that this is not a possibility that we can easily dismiss, I also present the following: Donald Knuth believes that P = NP :-)
I was thinking it might be the case that it could be shown to be an NP problem, perhaps without much effort, for someone more knowledgeable than I am in this area.
I’m aware of Knuth’s speculation, though I’ve seen him discuss it somewhere and he said something like, “Powers can be really, really big,” implying that even if P = NP, it might not change the practical tractability of NP problems.
Yes, you can, because programs run on machines with finite memory
It doesn’t make sense that you would try to clarify this to me, I obviously know this for the finite memory case - as we have been discussing. What you’ve misunderstood (on purpose?) is that in my paragraph I am referring to unbounded memory. Maybe try a little bit to map out what you think the person you are talking to knows and does not know for better communication or something.
I think you’re assuming that to determine whether a program halts, you have to simulate running the program step by step while it runs through all the states.
This is the cycle detection algorithm you discussed earlier. There are also other ways to construct proofs that programs halt.
in general
there is no general way to decide if programs (i.e. Turing machines) halt.
I’m sorry if I misunderstood you. I certainly did not do it on purpose.
in my paragraph I am referring to unbounded memory.
I can write one that tries a set of checks and either reports that the input program does halt, does not halt, or that it cannot tell
Ok, given the context, then yes, I agree with you. You could write such a program. But if you assume the program you’re trying to analyze runs with finite memory, then you can even write a program that always tells you “yes, the program halts” or “no, the program does not halt”, for any arbitrary program.
And in fact, it would even be possible to tell why it halted!
I cannot write a program that terminates and correctly tells if an arbitrary program halts
Given the context of unbounded memory, then I agree with this statement. But I would definitely like it much more if people always preceded such statements with the “unbounded memory” assumption that they are making, to avoid such common misunderstandings. I’m not criticizing you specifically, rather, just the common discourse in general :)
Ok, given the context, then yes, I agree with you. You could write such a program. But if you assume the program you’re trying to analyze runs with finite memory, then you can even write a program that always tells you “yes, the program halts” or “no, the program does not halt”, for any arbitrary program.
A Turing machine can tell if a finite memory program halts, but a finite memory program cannot determine if all finite memory programs halt - because it may run out of memory itself.
A Turing machine can tell if a finite memory program halts, but a finite memory program cannot determine if all finite memory programs halt - because it may run out of memory itself.
Yes, it can. The tortoise and hare algorithm is a finite memory program and it can determine if all finite memory programs halt.
suppose you have 16 GB of memory to run the tortoise and hare algorithm,
That means it can only analyze finite state machines with up to 8 GB of memory
there is a program large enough that your 16 Gigs will run out and the algorithm won’t be able to complete.
No, the algorithm will be able to complete.
If the program tries to use more than 8 GB of memory, presumably it would crash with an out-of-memory error, which the tortoise and hare algorithm would interpret as a halting state and then terminate. Or it would loop, which the tortoise and hare algorithm would detect as a cycle and then terminate.
The program is not allowed to use an unbounded amount of memory.
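A sketch of what that means, wrapping the analyzed program’s hypothetical step function (size_of is an assumed helper measuring the state’s memory footprint):

def step_with_memory_limit(step, size_of, limit, state):
    next_state = step(state)            # None signals a normal halt
    if next_state is not None and size_of(next_state) > limit:
        return None                     # out-of-memory crash: treated as just another halt
    return next_state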
there is no general way to decide if programs (i.e. Turing machines) halt.
Yes, but why is that interesting outside of trying to analyze a program that is trying to solve some kind of mathematical problem (i.e. something with infinite bounds)?
I think the more interesting statement is that there is a general way to decide if programs, when run on deterministic finite-state machines, will halt, even if they were written in Turing-complete languages.
And I think the even more interesting question is: how can we reduce the time complexity of such algorithms that can decide that?
I think the more interesting statement is that there is a general way to decide if programs, when run on deterministic finite-state machines, will halt, even if they were written in Turing-complete languages.
Well yes, but you need to run the decision program on a machine with exponentially more memory, which is impossible in practice.
Well yes, but you need to run the decision program on a machine with exponentially more memory, which is impossible in practice.
No, you don’t.
The tortoise and hare algorithm uses only a constant multiple of the analyzed machine’s memory to determine whether a finite-state machine halts. Concretely, it uses 2*N bytes of memory if the underlying finite state machine uses N bytes of memory.
In fact, if the machine where the code is running has finite memory, then the Halting problem is actually very easily solved (well, in theory). And there already exist various known algorithms that detect cycles of values (where the “value” in this context is the state of the program), such as Floyd’s tortoise-and-hare algorithm.
As you can see, if you use one of these cycle detection algorithms, then the Halting Problem is not undecidable. It’s easily solvable. The problem is that, while these known algorithms are able to solve the problem in constant memory, there are no known algorithms to solve it in a time-efficient way yet. But again, it is decidable, not undecidable.
This is wrong. Cycle detection works for Turing machines too. The equivalent halting problem for a finite machine is “there’s no algorithm that can tell if an arbitrary program, ran with sufficient memory, will halt.”
i.e., take this program:

def is_perfect(n):
    # n is perfect if it equals the sum of its proper divisors
    return n == sum(d for d in range(1, n) if n % d == 0)

def find_odd_perfect_number():
    i = 3
    while True:
        if is_perfect(i):
            return i
        i += 2
That will eventually halt on any computer due to hitting memory limits. But hitting a memory limit does not tell you if it would have halted if you had more memory!
I wonder how much more research could have advanced (like for example, in SMT solvers, formal methods, etc) if everyone actually believed that it’s possible for an algorithm to decide whether your program has a bug in it.
This is also wrong. We know it’s possible for an algorithm to find bugs, we’ve done that for decades. Rice’s Theorem says that there’s no algorithm that always finds bugs in any buggy program. It’s why SMT solving is, in reality, Very Hard.
This is wrong. Cycle detection works for Turing machines too.
It’s not wrong. Cycle detection doesn’t work for Turing machines precisely because of the Halting problem. The program may run forever without repeating states (including the state of the Turing machine tape), so for certain programs you will never find a cycle using a cycle detection algorithm on a Turing machine, therefore these algorithms would never terminate.
Which, to me, is only of interest as a theoretical exercise or as a means to explore the consequences of true infinity, but not for solving most real-world problems.
The equivalent halting problem for a finite machine is “there’s no algorithm that can tell if an arbitrary program, ran with sufficient memory, will halt.”
What is “running with sufficient memory”? Isn’t that the same as running on a Turing machine?
But hitting a memory limit does not tell you if it would have halted if you had more memory!
Sure, but that problem is not very interesting for the vast majority of cases, because in the vast majority of cases you’re actually more interested in knowing whether the program you’ve built is going to crash even if you give it a huge amount of memory than knowing whether it will work correctly on an imaginary machine with infinite memory which nobody will ever have.
Generally speaking, knowing whether a program would halt on a Turing machine is useful for solving mathematical problems related to unbounded / infinite limits rather than solving real-world problems.
This is also wrong. We know it’s possible for an algorithm to find bugs, we’ve done that for decades. Rice’s Theorem says that there’s no algorithm that always finds bugs in any buggy program.
No, it’s not wrong. Rice’s theorem only says that for imaginary Turing machines, which is my entire point! You’re making the same mistake everyone usually makes. Rice’s theorem doesn’t say that for machines with finite memory, such as… computers.
It’s why SMT solving is, in reality, Very Hard.
That’s not because of Rice’s theorem! You just proved my point. You also believed that SMT solving is hard because of Rice’s theorem which means that these widespread beliefs about the Halting problem and Rice’s theorem are making people believe things which are not true and are discouraging people from pursuing these avenues of research.
It’s not wrong. Cycle detection doesn’t work for Turing machines precisely because of the Halting problem. The program may run forever without repeating states (including the state of the Turing machine tape), so for certain programs you will never find a cycle using a cycle detection algorithm on a Turing machine, because these algorithms would never terminate.
It actually works better on Turing machines because in a real computer, cycle detection doesn’t work if the cycle is larger than your available memory. But in both abstract and real machines, it can’t tell you if a cycle is larger than your available memory.
You just proved my point. You also believed that SMT solving is hard because of Rice’s theorem which means that these widespread beliefs about the Halting problem and Rice’s theorem are making people believe things which are not true and are discouraging people from pursuing these avenues of research.
I believe SMT solving is hard because I’ve worked with people who research SMT solving, and I assure you that anyone developing SMT solvers will tell you the exact same thing.
It actually works better on Turing machines because in a real computer, cycle detection doesn’t work if the cycle is larger than your available memory.
So your point is that when:
The program being analyzed is assumed to run on a finite-state machine
The program that detects whether the above program halts is running on a Turing machine
… then it works just as well as (or even better than) if the second program was running on a finite-state machine.
If this is what you are arguing, then I agree but it’s irrelevant to what we were discussing. You said:
This is wrong. Cycle detection works for Turing machines too.
Sure, it works if the first program is assumed to run on a finite-state machine. But that’s not how the Halting problem is defined. The Halting problem is actually assuming that the first program runs on a machine with infinite memory. This is why people say the Halting problem is undecidable rather than decidable. And it’s also why it doesn’t apply to real-world computers. So what I said is correct.
But in both abstract and real machines, it can’t tell you if a cycle is larger than your available memory.
Cycles can’t be larger than the number of states in the available memory, by the pigeonhole principle.
You also believed that SMT solving is hard because of Rice’s theorem …
I believe SMT solving is hard because I’ve worked with people who research SMT solving, and I assure you that anyone developing SMT solvers will tell you the exact same thing.
I know SMT solving is hard, but as far as I understand it’s not because of Rice’s theorem.
Or are you assuring me that those people also have mistaken beliefs, given that Rice’s theorem is false for finite state machines? If so, that’s even worse than I thought, then.
Or are you assuring me that those people also have mistaken beliefs, given that Rice’s theorem is false for finite state machines? If so, that’s even worse than I thought, then.
Maybe it’s not the computer scientists who are mistaken here?
Maybe it’s not the computer scientists who are mistaken here?
It is a known result in computer science that Rice’s theorem says the problem is undecidable for Turing machines, because the proof is based on the Halting problem and the Halting problem is undecidable for Turing machines.
However, it is also a known result in computer science that the Halting problem is decidable for finite-state machines, which makes the proof of Rice’s theorem invalid for finite-state machines.
Cycles can’t be larger than the number of states in the available memory, by the pigeonhole principle.
If there are n bits of memory available, then a cycle can have length up to 2^n, so you need n * 2^n bits of memory to store the table of visited states.
If there are n bits of memory available, then a cycle can have length up to 2^n, so you need n * 2^n bits of memory to store the table of visited states.
No, you don’t. The tortoise and hare algorithm detects such cycles without storing n * 2^n bits of memory; it only stores 2*n bits.
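For comparison, the table-based detector that the n * 2^n figure refers to would look like this sketch (same hypothetical step/halted helpers as before):

def halts_naive(step, halted, state):
    seen = set()                 # worst case: all 2^n states, each n bits wide
    while not halted(state):
        if state in seen:
            return False         # a repeated state means an infinite loop
        seen.add(state)
        state = step(state)
    return True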
The currently known algorithms for detecting cycles might indeed take 2^n steps to detect a cycle.
I’m arguing that it’s possible to devise an algorithm that looks at the program description and can determine whether there are cycles or not, without actually having to simulate the program running step-by-step.
So, to be clear, you’re claiming that there is an algorithm H(P,I,n) that takes a program P and determines whether it halts in fewer than 2^n steps when run on input I, and the time complexity of this algorithm is polynomial in n?
So, to be clear, you’re claiming that there is an algorithm H(P,I,n) that takes a program P and determines whether it halts in fewer than 2^n steps when run on input I, and the time complexity of this algorithm is polynomial in n?
Polynomial? I don’t know. There are reasons to believe this is not possible, but I find some of these reasons a bit suspicious, especially because all complexity analysis is done with Turing machines. So it might be possible, I’m not sure. Or perhaps we can achieve quasi-polynomial time (i.e. QP).
SAT solving is NP-complete, so if P=NP perhaps it might be possible? Perhaps you just need to find a satisfying set of assignments in things like loop conditions to determine whether a program loops or not?
Of course, this is all assuming the program P uses a finite amount of memory.
But even if the time complexity of H is exponential, that says nothing about the efficiency of the algorithm, i.e. how long it actually takes to solve the problem for reasonable and useful instances of programs.
EXP can still be quasi-polynomial (QP), I think. So far we only know that P is not EXP, as far as I know.
QP means that this could be something like O(2^(log(N)^1.1)), right? Which means for a state size of 1 TiB (N = 2^40 bytes), it could take on the order of… 2^18 ≈ 250k times more steps than a simple linear scan of the state to calculate in the worst case, which is pretty awesome :)
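A quick back-of-the-envelope check of that estimate (assuming the O(2^(log(N)^1.1)) bound with base-2 logs and N = 2^40, i.e. counting the terabyte in bytes):

qp_steps = 2 ** (40 ** 1.1)      # about 2^57.9 steps
linear = 2 ** 40                 # one linear pass over the state
print(qp_steps / linear)         # ~2.4e5, i.e. roughly 2^18 times more steps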
You are fixated on Turing machines. Here are some starting points for leaving that worldview behind:
Untyped lambda calculus is Turing-complete. Specifically, it is not decidable whether a given term has a normal form. This formalism does not have a Turing-machine tape, and so does not suffer from running out of memory.
Descriptive computational complexity can be used to describe P, NP, EXP, and several other common complexity classes without Turing machines. I have a diagram which shows this in detail. We can immediately see that P is contained in EXP by an analogue of the time-hierarchy theorem, for example; EXP is described by the same logic as P, but with second-order quantifiers instead of first-order quantifiers.
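For a concrete instance of the first point, the classic term with no normal form is Ω:

Ω = (λx. x x)(λx. x x) → (λx. x x)(λx. x x) → …

Every β-reduction step reproduces Ω itself, so reduction never reaches a normal form, and no tape or memory bound appears anywhere in the argument.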
This formalism does not have a Turing-machine tape, and so does not suffer from running out of memory.
I’m not entirely sure what you mean by this phrase. If lambda calculus is Turing-complete, then it means lambda calculus is as powerful as Turing machines, so it is able to use infinite amounts of state, just like a Turing machine.
Which is the entire problem that I’m pointing out. If we restrict ourselves to analyzing programs that can actually run on a computer (i.e. those that use a finite amount of state), then all these problems that were considered undecidable suddenly become decidable.
Descriptive computational complexity can be used to describe P, NP, EXP, and several other common complexity classes without Turing machines.
OK. I don’t quite understand your point here?
Rice’s theorem does not depend on Turing machines.
It depends on the program being analyzed being able to use an infinite amount of state, otherwise Rice’s theorem is false. Or are you saying Rice’s theorem is true for programs that use a finite amount of state?
Your focus on state is specific to Turing machines. Lambda calculus does not have intermediate state; any expansion or reduction of terms is going to yield another term.
I think that Wikipedia puts it best in their page on Rice’s theorem:
It is important to note that Rice’s theorem does not concern the properties of machines or programs; it concerns properties of functions and languages.
Every nontrivial property of computable functions is not decidable.
Because we are discussing properties of functions, and not just properties of various encodings of functions (various programs), it doesn’t really matter which encoding we choose; all that is required is that anything computable is expressible with our encoding.
Ok, I think I understand that now, although I’m not entirely sure (I’ve only really begun to understand Rice’s theorem in the past couple of days).
But to me, that seems only interesting and important if you are interested in analyzing a type of computation that cannot be performed in our universe. Which I understand can be useful in certain scenarios, but I’m not sure that it is as useful as it seems for most problems.
I mean, we can theorize about turing-equivalent computation, analyze it, speculate about it and even extract conclusions about it. We can even perform a limited subset of this type of computation in the real world. But we can never truly perform this type of computation, which is a much stronger type of computation than what is possible to do in our universe.
You can still write programs that express such types of computation (e.g. programs written in Turing-complete languages that have no concept of memory allocation failure), with the understanding that if your program uses too much memory, it will crash/halt in the real world. Which means that when you are going to analyze how these programs will actually execute in the real world, then you should do that analysis using a model of computation that represents what computers can do, functionally, not what some theoretical super-powerful imaginary computation device would do in an imaginary universe.
If we limit ourselves to analyzing the type of computation that is possible to do in our universe, then there are problems which were undecidable in the previous model that actually become decidable.
I am not entirely sure what the analogue of Rice’s theorem would be for computations that are limited to using finite state. I mean, Rice’s theorem can be a little difficult to understand :) So I don’t know, perhaps you can clarify that for me:
It seems that Rice’s theorem is defined over the “computable functions”, where “computable functions” actually seems to be a misnomer, because some of these functions are theoretically impossible to compute in our universe, regardless of how much space and energy could be accessible to us: “computable functions are exactly the functions that can be calculated using a mechanical calculation device given unlimited amounts of time and storage space.”
If you were to define “computable functions” as “the set of functions that can be calculated using a mechanical calculation device given unlimited amounts of time but limited amounts of storage space”, then what would that look like? Because these functions are exactly the ones that deterministic finite state machines can calculate, and they are also the ones that are theoretically possible to compute in our universe.
Given the above, I know that the proof of Rice’s theorem relies on the Halting theorem, and the latter’s proof is not only invalid for deterministic finite-state machines, but in fact we can prove the opposite theorem (i.e. that halting for such computation is decidable).
So if we were to define a theorem analogous to Rice’s theorem but for computations that can be performed in our universe, then my intuition says that such a theorem would state that these properties are decidable, given that you can always fully analyze this type of computation (e.g. since it is possible to simulate this type of computation step by step and guarantee that this simulation always terminates, because the simulation either terminates by itself or we can detect that it starts to repeat forever).
There are relatively small Turing machines whose behaviors include halting after taking more steps than our observable universe can represent. For a summary of what we currently know, see this table of busy-beaver values for Turing machines. This contradicts your implicit claim that we can only prove facts about programs which are running on physical machines.
You are close to reinventing computational complexity theory. You are not the first person to wonder what happens when we limit the amount of resources available to a Turing machine.
I would strongly recommend working through Peter Smith’s book linked above, “An Introduction to Gödel’s Theorems”. This book introduces Turing machines after building up the rest of the machinery of computability theory; the work of Gödel, Tarski, and Turing is put into that particular ordering, and we can pause after Tarski’s undefinability of truth and use Yanofsky’s example to prove Rice’s theorem without Turing machines.
Let me extract this excerpt from Wikipedia: “The busy beaver function quantifies the maximum score attainable by a Busy Beaver on a given measure. This is a noncomputable function.
The busy beaver function, Σ : N → N, is defined so that Σ(n) is the maximum attainable score (the maximum number of 1s finally on the tape) among all halting 2-symbol n-state Turing machines of the above-described type, when started on a blank tape.”
Ok, good. Now let’s say we use my definition of computable function (well, I’m sure it’s not literally my definition and that someone from many decades ago has already used an identical definition), which is the following: “computable functions are exactly the functions that can be calculated using a mechanical calculation device given unlimited amounts of time but limited amounts of storage space”.
You could say the following: “But @wizeman, if your computable functions are even more limited than the usual definition of super-powerful computable functions, then it’s even more impossible to compute the busy beaver function!”.
Yes, that’s true. But, the busy beaver is using a Turing machine as a model, which means it also uses the absurdly super-powerful definition of computable functions!
So how about we define the busy beaver function as such: “The busy beaver function, Σ : N x N → N, is defined so that Σ(n, k) is the maximum attainable score (the maximum number of 1s finally on the tape) among all halting 2-symbol (4 symbols including the endmarkers) n-state deterministic linear bounded automatons of the above-described type, when started on a blank tape of length k.”
What have we gained by doing this?
Well, suddenly, the busy beaver function actually becomes computable, even when using my non-super-powerful definition of computable functions!
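A sketch of why (the enumeration of n-state transition tables is elided; step is a hypothetical function advancing one configuration and returning None on halt):

def halts_on_bounded_tape(step, initial, num_configs):
    # a machine confined to k tape cells has finitely many configurations
    # (control state x head position x tape contents), so any run longer
    # than that count must have repeated one, i.e. it loops forever
    config = initial
    for _ in range(num_configs):
        config = step(config)
        if config is None:
            return True
    return False

Running this over every n-state table and taking the maximum score among the halting ones computes Σ(n, k): slowly, but it terminates by construction.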
Note that, just like Wikipedia says, a computable function does not imply that they can be efficiently computed (i.e. computed within a reasonable amount of time).
But it does mean that something that wasn’t even theoretically possible to do before (not to mention in practice), now is theoretically possible.
There are relatively small Turing machines whose behaviors include halting after taking more steps than our observable universe can represent. For a summary of what we currently know, see this table of busy-beaver values for Turing machines.
Yeah. If by “small” you mean a Turing machine with 1.3×10^7036 bits of tape, sure. Remember, the busy beaver function calculates how many bits of state the tape ends up with (well, not bits, symbols, but it’s almost the same thing).
I can create a deterministic finite state machine that takes even longer to run: start with a 1.3 x 10^7036-bit integer initialized to zero and increment it on each step until it wraps around.
But I don’t know how that proves anything about what I said.
This contradicts your implicit claim that we can only prove facts about programs which are running on physical machines.
I think there’s two things wrong with this statement of yours:
I don’t think I claimed it’s impossible to prove facts about imaginary machines, even infinite ones. Which seems obvious to me, otherwise induction wouldn’t work.
I don’t think I claimed that it’s always practical to prove things about physical machines, which also seems obvious to me, as you can imagine building the biggest machine possible but then you don’t have enough energy to start it, let alone do the entire computation.
But I did mean that restricting ourselves to finite-state machines makes it theoretically possible to prove statements that are impossible to prove for those with infinite state.
prove Rice’s theorem without Turing machines.
You can prove Rice’s theorem without Turing machines but I don’t think you can prove it if you say that a computable function is one that can be computed with finite storage.
In fact, I believe I can prove the opposite of Rice’s theorem if we use that definition of computable function.
You’ve misunderstood. A Turing machine’s description does not include its tape; it includes only the state transitions and symbols. I strongly recommend taking a crack at that PDF of Smith’s book.
The description does not include its tape, but the Turing machine’s tape is part of its state.
That’s why a busy beaver can be a small program, terminate and yet take such a long time: it’s because it can access so much state. Remember: a program can never take longer to run than 2^N steps (where N = state size in bits), unless the program starts to loop forever. So if a busy beaver only had access to 10 bits worth of tape (plus K bits encoding its control state and head position), it could never take longer than 2^10 * 2^K (i.e. 1024 * 2^K) steps to complete.
The finite-state program that increments a 1.3*10^7036-bit integer until it wraps around is also an extremely small program that takes even longer to terminate than a busy beaver that uses a tape as long as that integer.
In fact, if the machine where the code is running has finite memory, then the Halting problem is actually very easily solved (well, in theory). And there already exist various known algorithms that detect cycles of values (where the “value” in this context is the state of the program), such as Floyd’s tortoise-and-hare algorithm.
That cycle detection algorithms work has nothing to do with the finiteness of the machine memory and everything to do with the finiteness of the memory used by the program. It is well known that you can solve the Halting Problem for specific subsets of all possible machines, where you can exploit some property of the program. Those algorithms also work on theoretical machines with infinite memory running the same programs.
That they wouldn’t work for programs that use infinite memory is a red herring: they also don’t work, in practice, for many programs that use a finite amount of memory. The same is true for any other subset for which the halting problem can be solved: there are practical limits that prevent them from being solved in general.
The proof for the Halting Problem does not really depend on the infiniteness of machine memory: you can always devise the intermediate machine such that it loops for an amount of time that makes the Halting Problem only solvable given more time than the lifetime of the universe. That is equivalent to not solving it.
(I am in general sympathetic to your point. I find it baffling that CS books don’t clearly say “yes, Rice’s theorem, but don’t let that stop you from proving many interesting properties about programs”. I am a physicist by training; I have little patience for mathematical infinities. However, often results remain true when replacing infinities with the finite limits of the universe. This is such a case. For many practical purposes that is irrelevant though.)
That cycle detection algorithms work has nothing to do with the finiteness of the machine memory and everything to do with the finiteness of the memory used by the program.
Yes, that’s what I meant, actually. I was saying that as long as you assume that the program is going to run on a computer (which has finite memory), then you can always determine whether it halts or not.
It is well known that you can solve the Halting Problem for specific subsets of all possible machines, where you can exploit some property of the program.
And I was pointing out that the property of the program that we can exploit in this case is “the program can only use a finite amount of memory”, which is the subset of all programs that can run on a computer. An immensely useful subset if you can solve the Halting problem for these, I tell you! :-)
The proof for the Halting Problem does not really depend on the infiniteness of machine memory
Actually, it does subtly rely on the program being analyzed being able to use an infinite amount of memory.
How do I know this? Because the Halting problem is decidable if the program is only allowed to use finite memory, which means that the undecidability proof is invalid in this case.
you can always devise the intermediate machine such that it loops for an amount of time that makes the Halting Problem only solvable given more time than the lifetime of the universe. That is equivalent to not solving it.
No, that still makes it solvable. Your intermediate machine may very well loop for an infinite or almost infinite amount of time. That still makes the problem decidable. It is a separate issue how long an algorithm takes to solve the problem. And also, you don’t know how long this algorithm takes. It may very well be that such an algorithm can determine whether a program loops or not much more quickly than the time the program loops for.
There is a different problem: Can you tell me whether a program is malicious? If so, how?
Let’s say it’s running something in a loop, and you’re trying to watch it to determine if it is malicious (looping file deletion). That is already a variation of the halting problem, because you cannot tell if the loop will halt (not deleting everything) and simply terminating it after X cycles isn’t enough. Because in this time it could have a) deleted too much or b) not actually finished the user’s intended work. The same goes for trying to find out if a program will behave differently for antivirus-scanning vs running normally.
Can you tell me whether a program is malicious? If so, how?
If you have a formal (i.e. mathematically precise) specification of what a malicious program is, then it is possible to solve it, yes.
I think the issue with your example is that you don’t have a specification, i.e. there isn’t enough information to solve the problem.
But you could specify, for example: a benign program never deletes any files that it hasn’t created before, or whose paths the user hasn’t passed as a (specific) command-line parameter, or which the user hasn’t selected in a GUI file-manager-like window.
Then it would be possible to analyze a program and determine whether it does that or not.
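As a toy sketch of checking such a specification against a trace of file operations (the event format here is made up for illustration):

def violates_spec(events, user_allowed):
    created = set()
    for op, path in events:          # events: ("create" | "delete", path)
        if op == "create":
            created.add(path)
        elif op == "delete" and path not in created and path not in user_allowed:
            return True              # deleted a file outside the benign set
    return False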
This isn’t pedantry, this is just a misunderstanding of math and logic. Mathematical statements have to be precise. You can’t make a general statement about computers without having a precise, mathematically representable model of what a computer is. A Turing machine is one such model, and it’s believed to be a very good one, in part because of its correspondence to other similar models like the lambda calculus. But most notable is the register machine that is much closer to a real computer than a Turing machine, but still equivalent. So the halting problem result still holds.
Also, the halting problem has never discouraged anyone from research. I obviously don’t know every researcher, but it’s extremely well understood that the halting problem only applies to a general algorithm over all possible programs, and there are obviously classes of programs where you can easily know whether or not they halt. The biggest example is the simply typed lambda calculus, whose programs are proven to always terminate.
The STLC is not a secret, it’s the basis of a huge amount of CS research. So the class of programs that the halting problem applies or doesn’t apply to is very well known.
You can’t make a general statement about computers without having a precise, mathematically representable model of what a computer is.
How about we use a deterministic finite-state machine to model a computer? Would you be OK with that?
A Turing machine is one such model
No, it is not. That is my point.
It’s not one such model because Turing machines can run programs which no computer will ever be able to run, and I really mean ever. It doesn’t matter which computer you construct, it will never be able to run such programs like a Turing machine does.
Specifically, I am talking about those programs which use infinite amounts of state (which in a Turing machine, includes the tape).
That’s why the Wikipedia page about Turing machines says “(…) the computation of a real computer is based on finite states and thus not capable to simulate a Turing machine (…)”.
And it’s also why the Wikipedia page about finite-state machines says: “The finite-state machine has less computational power than some other models of computation such as the Turing machine.”.
So a Turing machine is not a model of a computer, it is a purely theoretical model of a super-powerful (due to its infinite tape) abstract machine which is impossible to construct in the real world and which no computer can ever emulate.
But most notable is the register machine that is much closer to a real computer than a Turing machine, but still equivalent.
Still not a model of a computer. Register machines are Turing-equivalent, which means register machines can use infinite amounts of state as well, and thus they are a purely theoretical construction which is much more powerful than a computer will ever be.
For the sake of argument, imagine I create a model of a computer and tell you: hey, my computer model has an instruction which, when executed, can instantly tell tomorrow’s lottery numbers and put them in a register. And another instruction which kills everyone on the planet.
Would you say this is a model of a computer? Do you think it is really a model of what computers do? It does seem absurd, but a model of a computer with a truly (i.e. mathematically) infinite tape can be argued to be similarly absurd.
Note that I’m not saying that models have to be a perfect simulation of what a computer does, atom by atom.
But they should at least functionally represent what a computer does. And it’s OK to have unfaithful models, i.e. with certain assumptions or simplifications (like Turing machines, which assume the program can use infinite amounts of memory), but if you do so, you have to be careful about the implications and not assume that just because something is true in your model, it is also true for computers. Otherwise you’re just misleading everyone.
So the halting problem result still holds.
It doesn’t, because of what I mentioned above. The Halting problem is decidable on finite-state machines.
and there are obviously classes of programs where you can easily know whether or not they halt.
Sure. But there is a particular class of programs, namely those that are only allowed to use a finite amount of state (i.e. those that can run on computers), for which problems that are undecidable on Turing machines, such as the Halting problem and Rice’s theorem, suddenly become decidable.
So ironically, these Turing machines, which are more powerful than computers, end up having limitations that finite-state machines like computers don’t have!
And not only do these problems become decidable on computers; it suddenly also becomes “easy” to prove statements about these programs, as the Tortoise and Hare algorithm (and other similar ones) demonstrate. However, there is still a huge practical issue in that the time complexity of the known algorithms is way too high to be useful, so we’d need more efficient algorithms for them to be useful for real-world programs.
How about we use a deterministic finite-state machine to model a computer? Would you be OK with that?
I’m totally ok with that. That’s why I brought up the simply typed lambda calculus, which does not suffer from the halting problem and which has programs that are guaranteed to terminate. There are many models of computation that each have different properties and tradeoffs.
But infinite models still have immense practical utility. I have a good example - integers. The set of integers in math is infinite. Yet we can still use a finite subset of integers in our daily lives, and we know many things about finite subsets of integers because we’ve proven them on infinite sets.
Another occurrence of infinity is with interactive programs, like anything with a user interface or an event loop. While technically they don’t execute forever, or at least we don’t know if they do or not because we as humans can’t experience “forever,” you still have to reason about an interactive system as if it were infinite. It can be thought of as an infinite program that you can choose to halt, or as a finite program where you choose an arbitrary number of iterations beforehand. The distinction is totally moot though, because you have to use the same kind of logic in either case. Or in the case of FSMs, the number of finite states is so large that it being finite has no relevance to the real world, because you could never check anything about those states directly in any number of lifetimes.
You are arguing against the usage of infinite reasoning for real-world objects, and that we’re going to disagree on. That’s just how math works. You use infinity sometimes.
But infinite models still have immense practical utility. I have a good example - integers. The set of integers in math is infinite.
Yes, that’s why I argue that Turing machines are ideal if you want to model a computation about mathematical statements, e.g. for figuring out the Collatz conjecture or whatever, i.e. statements which are about truly infinitely-large things, which practically only occur in math.
But for modeling computers and the execution of real-world programs, finite-state machines seem a lot more appropriate and representative.
Another occurrence of infinity is with interactive programs, like anything with a user interface or an event loop. While technically they don’t execute forever, or at least we don’t know if they do or not because we as humans can’t experience “forever,” you still have to reason about an interactive system as if it were infinite. It can be thought of as an infinite program that you can choose to halt, or as a finite program where you choose an arbitrary number of iterations beforehand
Actually, this type of infinity (in terms of time) is not necessarily an issue. As I mentioned earlier, the Tortoise and Hare algorithm (and others like it) can always deterministically detect whether programs halt or whether they loop forever, assuming the program is running on a finite-state machine. This means that it is possible to reason about these programs, i.e. Rice’s theorem is decidable for these programs.
So it doesn’t necessarily matter if they are designed to execute forever.
BTW, this is another example where the Halting problem, being usually defined for and studied on Turing machines, is misleading people into believing that reasoning about always-looping programs is undecidable.
The distinction is totally moot though, because you have to use the same kind of logic in either case.
No, this is not true. You can model programs that run forever as a deterministic finite-state machine, which don’t have the same undecidability problems as Turing machines.
Or in the case of FSMs, the number of finite states is so large that it being finite has no relevance to the real world, because you could never check anything about those states directly in any number of lifetimes.
How do you know this? Are you saying that to check statements about these programs you can only simulate them step-by-step while they run through all states, i.e. that you cannot answer these statements by looking at the description of the program?
You are arguing against the usage of infinite reasoning for real-world objects, and that we’re going to disagree on. That’s just how math works. You use infinity sometimes.
Well, math is math and the real world is the real world. They are not the same thing. In the first one, truly infinite things exist, in the second one, they do not (except time perhaps), or at least, they’ll never be accessible to us.
But for modeling computers and the execution of real-world programs, finite-state machines seem a lot more appropriate and representative.
What problems are you working on where you think FSMs will be more effective?
You can model programs that run forever as a deterministic finite-state machine
You can. How would you prove or check a property about such a program? (see the next response for what you would do)
Are you saying that to check statements about these programs you can only simulate them step-by-step while they run through all states, i.e. that you cannot answer these statements by looking at the description of the program?
No, I’m saying you have two choices to prove or check properties about large state machines:
1. Run them step by step and see if the property is ever broken (see the sketch after this list). We have plenty of experience with this in model checking, and anyone who has actually done model checking on a real model will tell you that you have to abstract your system immensely and keep your bounds very small for this to be useful in any practical way. This is possible in very small, controlled cases, but you are arguing that FSMs should replace all other forms of reasoning about programs, and experience shows that this is a very bad idea.
2. Use infinite reasoning such as induction to show the property holds on an infinite state machine. This is the better idea, but the one that you are opposed to.
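To make choice 1 concrete, here is a minimal sketch (Python, names mine) of explicit-state checking. Note the seen set: it records every visited state, which is exactly why the bounds have to stay small for this to be practical:

    from collections import deque

    def check_invariant(initial_states, successors, invariant):
        # Breadth-first search over every reachable state, remembering each
        # state visited; returns a violating state, or None if none exists.
        seen = set(initial_states)
        frontier = deque(seen)
        while frontier:
            s = frontier.popleft()
            if not invariant(s):
                return s                      # counterexample found
            for t in successors(s):
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
        return None

    # Example: a counter that wraps at 6 can never reach 7.
    print(check_invariant({0}, lambda s: [(s + 1) % 6], lambda s: s != 7))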
What problems are you working on where you think FSMs will be more effective?
Writing computer programs and proving things about them.
To be clear, I’m not saying FSMs will be more effective right now. It’s just that it’s theoretically possible to solve problems for them that aren’t possible to solve for Turing machines.
You can. How would you prove or check a property about such a program? (see the next response for what you would do)
If you have an algorithm that does exactly what the Tortoise and Hare algorithm does, but works in a much more time-efficient manner in practice (while still not using excessive amounts of memory), then it would be possible to use that algorithm to prove and check properties about programs that aren’t allowed to consume infinite memory.
Note that the Tortoise and Hare algorithm (and others like it) work without knowing anything about the program, they simply run the program step-by-step.
I’m conjecturing that it might be possible to greatly improve the performance of such an algorithm by looking at the program description and exploiting properties about how the program works.
No, I’m saying you have two choices to prove or check properties about large state machines:
(…)
OK, then I think we misunderstood each other. You are describing the currently-known ways of solving this issue.
But when you said this:
because you could never check anything about those states directly in any number of lifetimes.
I thought you were talking about something that is impossible to do, i.e. that can never be improved upon.
So yes, I agree with you, currently those are the most feasible ways to solve this problem. And of course, I use induction myself when I want to prove properties about my programs, because this is one of the currently most feasible ways to do this. So I know that they are useful right now and I also know about the limitations of both of them.
But I wasn’t talking about what is currently feasible, I’m talking about what might be possible to do.
In theory, as far as I know, there’s nothing that says that to prove properties about a program running on a finite-state machine you have to either 1) run the program step-by-step as in a model checker, or 2) use techniques like induction.
In other words, I don’t think that those will always be the only options, especially because it’s possible to solve problems in FSMs that aren’t possible to solve for Turing machines. But I’d be happy to be proven otherwise!
Writing computer programs and proving things about them.
A worthy goal :)
I think we’ve arrived at mutual understanding. I won’t try and discourage optimism, because we should always be looking for improvements over what we know today. It would be great if we could reduce any part of computation to deterministic, finite, and manageable structures.
For today, I stick with Leslie Lamport’s advice and point of view, and focus on the techniques of infinite state machines.
This is wrong. In particular, Brainfuck is Turing-complete, implementable on physical computers, and simulates a tape.
What do you mean exactly?
If it’s Turing-complete, by definition it can’t be simulated on a physical computer.
That’s why the Wikipedia page about Turing machines says “(…) the computation of a real computer is based on finite states and thus not capable to simulate a Turing machine (…)”.
And it’s also why the Wikipedia page about finite-state machines says: “The finite-state machine has less computational power than some other models of computation such as the Turing machine.”.
You’re replacing the logic of the function with matching it against a regex. The method you use to match it against a regex is an implementation detail.
I disliked it because of the question “But wouldn’t it be cool if your IDE gave you a refactoring hint?” No. No, that wouldn’t be cool, because it would imply that that fast, concise array manipulation or length checking is for some reason not as good as using a regular expression. For simple things like that, I call it RE abuse. Not using an RE is almost always better. The tutorial lost me at this point.
Strictly speaking, I don’t think a network is a truly infinite tape.
A Turing machine doesn’t actually require an infinite tape because it can only move the tape by one unit per time step. This is the basis for all of the proofs about space vs time complexity on Turing machines. The ‘infinite’ tape needs to be one entry longer than the number of time steps that the program runs for. This leads to an infinite tape only for programs that run for infinite time. One of the simplest non-terminating programs on a Turing machine advances the tape one step and repeats. This requires an infinite tape, but never revisits any entry and so is equivalent to a program that reads one byte from a socket, discards it, and loops.
Given the rate of growth in cloud storage versus the speed of a computer, the ‘tape’ length can be increased faster than it can be consumed by a single machine, which makes it equivalent to a Turing machine tape.
This then leads to more interesting economic questions than the pure theoretic computer science termination questions: can this computation be completed before I run out of money (or, more often, before I exhaust my users’ attention span).
I don’t understand this argument. If the Turing machine had an unbounded but finite tape, rather than a truly mathematically infinite tape, then many proofs would not work and instead, we would believe the opposite of what we believe. Proofs such as the halting problem being undecidable, Rice’s theorem, the busy beaver function being uncomputable, etc…
If a Turing machine was what you said, it’s very likely that I would absolutely have no problem with it.
Yes, but this is crucially important for all the proofs that I mentioned, otherwise it would be possible to prove the opposite of what they prove.
Right, but this is only one program out of an infinite number of programs that actually can use the entire length of the infinite tape.
For example, on a Turing machine, I can write a non-terminating program that advances the tape one step, then goes back to the beginning of the tape, reads or writes something for the entire existing length of the tape, then advances the tape one more step, and loops forever doing this. This program requires truly infinite storage (in the mathematical sense) and it would not be possible for this program to succeed executing forever unless the tape length is truly infinite.
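In Python-like pseudocode (assuming an idealized machine where the list really can grow forever), that program looks something like this:

    # Each pass rereads and rewrites every cell written so far, then extends
    # the tape by one more cell, so the storage required grows without bound.
    tape = [0]
    while True:
        for i in range(len(tape)):   # go back over the entire existing tape
            tape[i] ^= 1             # read and rewrite each cell
        tape.append(0)               # then advance the tape one more step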
Of course, in the real world nothing would execute forever (actually, it could if the program runs on multiple computers), but when we prove things about programs we are only concerned with modeling the behavior of a program in an ideal (and hopefully, functionally representative) model.
The problem I have is that the Turing machine is not functionally representative of what computers can do. A finite-sized model such as a deterministic finite-state machine or a deterministic linear bounded automaton, on the other hand, is functionally representative of what computers can do, even if it has an incredibly huge but finite tape length.
This is why Turing-equivalent computation is considered more powerful than what finite-sized models can do (although ironically, it has more limitations), and it’s why I argue that Turing-equivalent computation, or anything more powerful, should be considered hypercomputation.
Note that this does not prevent us from creating Turing-complete programming languages and creating programs written in those languages as a simplification, which allows us to express programs with Turing-equivalent computations. But the model we use to analyze the execution of those programs should represent what computers can do.
If you want, you can still use a Turing machine to analyze the execution of programs, but if you do this, you should be aware that the conclusions you extract may be absurd, so don’t mislead everyone into thinking that these conclusions apply to realistic computation devices (i.e. those with a finite size).
No, it does not, because there is one very crucially important difference: cloud storage cannot keep increasing forever (and I mean forever in its true sense). That is the difference between “something grows so big that I cannot even imagine its size, but it will reach a point where it doesn’t grow anymore so it is still finite” and “truly mathematically infinite”.
If we actually used a machine model with a hugely, hugely big tape (say, with a tape length as large as the number of atoms in the universe), but it was still finite, rather than a Turing machine where the tape is mathematically infinite, then all of the problems I mentioned would disappear! The halting problem would be considered decidable, Rice’s theorem would be considered false, we could always compute the busy beaver function, etc.
Note, however, that there is a difference between computable functions and efficiently computable functions (i.e. whether they complete in a reasonable amount of time), and this is true whether a model has an infinite tape or a finite one. There is also a difference between a function being efficiently computable in practice and its formal time complexity, which can be (but are not always) very different things. So strictly speaking these are all different issues and they would have to be analyzed separately.
Sure, but this also assumes a model with finite size (even if it can be very large), which leads to the answer being computable.
It seems like you are arguing for the same thing I am: that we should use a model with finite bounds, even if we use a bound so large that it is hard to comprehend. Unfortunately, this is not what we are doing currently, which leads to all sorts of paradoxes and mistaken conclusions about the kind of computation that is possible to do in our universe.
You have confused an endomorphism (a function which has the same type for domain and codomain, like a function from the natural numbers to the natural numbers) with the repeated application of that endomorphism to a starting value. The endomorphism is like a description of how a Turing machine takes a single step, while repeated application is like evaluation of that Turing machine on a given input. The Halting Problem, Rice’s theorem, etc. prove properties of the endomorphism, not properties of the repeated application.
You can argue that is true (although I look at it from another perspective) but that’s a bit of a pedantic argument: you are missing the forest for the trees.
Even if you see it from that perspective, you are missing the fact that the Halting Problem, Rice’s theorem, etc. can only be proved if the domain and codomain are infinite, which is what leads to those absurd conclusions.
There are an infinite number of functions that have infinite domains and codomains and which a real computer can never implement.
If the domains and codomains were finite, you could prove the opposite of those theorems.
Edit: I’d also like to say that your perspective is actually a very interesting way to think about the problem, I hadn’t really thought about it like that before. Thanks!
I already knew Rice’s theorem going in and thought it did a great job of explaining it.
Can you prove this statement? 😉
I’m sorry, I’ve only gone through the first few steps of the tutorial but the initial claims have already bothered me a lot. I have to be pedantic because this is the biggest pet peeve of mine:
I would say this is actually false, even though almost everybody thinks it’s true. At least, if you’re interested in actually solving problems rather than being interested in what is mostly a mathematical/theoretical curiosity.
The Halting Problem is famously only undecidable if the code you’re analyzing is being modeled on a “Turing machine” or something equally “powerful” to it. The problem is, computers are not as powerful as Turing machines, and furthermore, Turing machines (or something equivalent to them) neither exist nor can ever exist in practice, because it’s impossible to build a “machine” with truly infinite memory.
In fact, if the machine where the code is running has finite memory, then the Halting problem is actually very easily solved (well, in theory). And there already exist various known algorithms that detect cycles of values (where the “value” in this context is the state of the program):
https://en.wikipedia.org/wiki/Cycle_detection
As you can see, if you use one of these cycle detection algorithms, then the Halting Problem is not undecidable. It’s easily solvable. The problem is that, while these known algorithms are able to solve the problem in constant memory, there are no known algorithms to solve it in a time-efficient way yet. But again, it is decidable, not undecidable.
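To sketch the idea (a minimal sketch in Python; the names are mine): treat the program as a deterministic step function over its finite state and run Floyd’s tortoise-and-hare on it. An out-of-memory crash can simply be modeled as one more halting state.

    HALT = None  # successor of a halted (or crashed) program state

    def halts(step, start):
        # step(s) returns the unique successor of state s, or HALT. With
        # finitely many states, a run either reaches HALT or repeats a state,
        # and determinism makes any repeated state a cycle forever. Floyd's
        # algorithm finds the repeat while storing only two states, i.e.
        # about twice the memory of the program under analysis.
        tortoise, hare = start, start
        while True:
            for _ in range(2):              # the hare takes two steps...
                hare = step(hare)
                if hare is HALT:
                    return True             # reached a halting state
            tortoise = step(tortoise)       # ...the tortoise takes one
            if tortoise == hare:
                return False                # a state repeated: loops forever

    # Example: x -> 3x+1 (odd) or x/2 (even) over 8-bit values, halting at 1.
    def collatz_step(x):
        return HALT if x == 1 else ((3 * x + 1) % 256 if x % 2 else x // 2)

    print(halts(collatz_step, 27))          # True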
Now, I understand how Turing machines as a model can be useful, as the simplifying assumption of never having to worry about a memory limit might allow you to more easily do types of analysis that could be more difficult to do otherwise. And sure, the way the Halting Problem was “solved” was very clever (and I actually suspect that’s a major reason why it has reached its current legendary status).
But this simplifying assumption should never have allowed a result like the Halting Problem, which is known to be wrong for the type of computing we actually care about, to be elevated to the status that it currently has, by which I mean misleading everyone into thinking that the result applies to actual computers.
So again, I would say: be careful whenever you see a Turing machine being mentioned. I would say it’s better to view them as mostly a mathematical/theoretical curiosity. And yes, they can also be a powerful and useful tool IF you understand their limitations. But you should never confuse this model with the real world.
Since Rice’s theorem (and many other computer science results!) depends on the undecidability of the Halting Problem, which does not hold for real-world machines, my understanding is that Rice’s theorem is also false for real-world machines.
But Rice’s theorem (and sentences like the one the article uses) is absolutely another example of the huge number of cases where the Halting Problem has seemingly misled almost everyone into thinking that something can’t be solved when it actually can be.
I wonder how much more research could have advanced (for example, in SMT solvers, formal methods, etc.) if everyone actually believed that it’s possible for an algorithm to decide whether your program has a bug in it.
How many good people were discouraged from going into this area of research because of this widespread belief that the pursuit is proven to be unachievable in the general case?
Now, I’m not a computer scientist nor a mathematician or logician. So I could be wrong. If so, I would be interested in knowing how I’m wrong, as I’ve never seen this issue being discussed in a thorough way.
Edit: another (according to my understanding) completely misleading phrase in the tutorial exemplifying what I’m describing:
Now, to be clear I’m not blaming the author of the tutorial, as this is just one very tiny example of what actually is an extremely widespread belief across the whole industry.
The halting problem is a theorem. When the author writes “Turing famously showed that computers can’t decide whether your code halts.” they are referring to that theorem, even if the words they wrote are not precise and formal enough to state the actual halting problem.
Yes it’s a good point that in reality our programs are finite and halting can be determined for this set of programs.
The bridge between theory and practice is a really big topic, and there’s a lot that could be said about it. But I think the key takeaway from the halting problem and Rice’s theorem for writing programs that analyze and process programs is this:
Your algorithm analysis system is going to need to output one of 3 things {yes, unknown, no} and not one of 2 things {yes, no}.
I definitely do not think we would be further ahead if Turing’s theorem had not been discovered. It’s not something that held work back, it’s something that informed more precise and correct directions for analysis of programs. So I believe we are better off for it.
Well, strictly speaking the system would still only tell you “yes” or “no”. But since Rice’s theorem is false for real-world computers (as far as I know), it would be possible to determine whether your program will halt because it ran out of memory, because some error was encountered or because it completed successfully, which is immensely useful to know for the vast majority of programs (but not for certain programs that are trying to solve strictly mathematical questions with infinite bounds).
Of course not, I wasn’t trying to make that point.
I think it’s not the theorem itself that held (and is holding) work back. It’s the widespread belief that it applies to real-world computers that holds work back.
Imagine how many people have been discouraged from pursuing work in formal methods because they believe it’s impossible to figure out if a program has a bug in the general case. I see engineers saying that it’s impossible to solve it all the freaking time.
I don’t understand this. What I meant was that, for example, I cannot write a program that terminates and correctly tells whether an arbitrary program halts, but I can write one that tries a set of checks and reports either that the input program halts, that it does not halt, or that it cannot tell.
It is impossible in the general case (assuming unbounded memory). If you restrict yourself to finite memory like real computers have, the memory is usually so large that it doesn’t make a practical difference. You can’t just note down the entire system state, including all 16 GB of memory, on every iteration and try to detect loops. This works in theory but not in practice.
Yes, you can, because programs run on machines with finite memory. The cycle detection algorithms I mentioned are examples of such programs that always terminate and correctly tell you whether an arbitrary program halts.
But you shouldn’t assume unbounded memory, because that doesn’t exist.
Sure, currently it’s impractical, but as far as I know there’s no proof that it will always be impractical. I think you’re assuming that to determine whether a program halts, you have to simulate running the program step by step while it runs through all the states.
But I haven’t seen a convincing argument that you necessarily have to do this to solve the problem in general and that you can’t just analyze the program description to determine whether it will halt.
A cycle detection algorithm has to detect, by definition, that a Turing machine is in a state that it was in earlier. That is what it means for there to be a cycle. Detecting cycles in other ways requires program analysis and gets into the territory of Rice’s theorem. I’m not assuming anything about how the algorithm does this. Direct inspection of the physical implementation of the state representation is just one way. It could be programmatic interrogation via an API.
The number of possible states of a machine with n bits of memory is 2^n. This exceeds the number of particles in the universe for even a modest number of bits. That means you can make a small program that just cycles through all possible states, for which you cannot physically have a second Turing machine that enumerates which states have already been seen. That’s an impossibility theorem for all I care.
Didn’t I just prove that Rice’s theorem is not valid for programs that use a finite amount of state?
But didn’t I just show, here and in my other comment about the finiteness of space, that the Halting Problem, and thus Rice’s theorem, are still valid in practice?
This entire argument doesn’t change anything about what we can do in practice.
But they’re not valid in practice! In practice, there are already known algorithms that truly solve the Halting problem for finite state machines. Like the Tortoise and Hare algorithm. They truly solve this problem, in practice. Which means the Halting theorem’s proof is invalid for finite-state machines. The problem is decidable and there are known solutions.
What you’re talking about, perhaps, is that the time complexity of these algorithms, currently, is too great to be useful in practice. Which, if that is what you are arguing, then I agree. But that’s not what I was arguing about:
If you really were arguing about the time complexity of the currently-known algorithms, then sure, that is relevant.
But I was very clear about this: I was not arguing about what we can currently do, I was arguing about what might be possible to do in the future.
And my point is that you don’t have to necessarily iterate through all of these states to detect a cycle. It might be possible to construct an algorithm that solves the exact same issue without always running the program step by step, but instead, actually looks at the program and exploits how the program works to do the analysis much faster.
This also seems to be irrelevant to what I was discussing, because again, you don’t necessarily have to enumerate all states which have already been seen in order to detect when a cycle of states starts to repeat forever.
I mean, sure, in the vast majority of cases, currently, you’d have no other choice but to do this. But again, I was very clear about this: I’m not arguing about what we can currently do, I’m arguing about what might be possible to do in the future, especially given that this problem is decidable.
Cycle detection algorithms work by running the program and watching the states. They don’t work by analyzing the program description.
The proof for the Halting Problem does not prevent the machine that is to determine the halting status from simulating the program.
No. You mean that the currently known cycle detection algorithms work that way.
Or do you know of a proof that this is the only way to detect cycles (for an arbitrary program)?
I do, it’s called the halting problem.
So you’re saying that the Halting theorem is valid for programs that run on machines with finite memory?
You were speculating that there might be practical, efficient algorithms to do this. This seems unlikely to me even in the general (finite) case, as I’d guess an efficient algorithm for this would be equivalent to proving P = NP.
I agree with you. I think that it’s unlikely, but I also think that it’s possible.
As an example, primality testing was not believed to be polynomial-time for many decades, until finally someone discovered a deterministic polynomial-time algorithm in 2002, quite unexpectedly at the time.
As additional evidence that this is not a possibility that we can easily dismiss, I also present the following: Donald Knuth believes that P = NP :-)
I was thinking it might be the case that it could be shown to be an NP problem, perhaps without much effort, for someone more knowledgeable than I am in this area.
I’m aware of Knuth’s speculation, though I’ve seen him discuss it somewhere and he said something like, “Powers can be really, really big,” implying that even if P = NP, it might not change the practical tractability of NP problems.
[Comment removed by author]
It doesn’t make sense that you would try to clarify this to me; I obviously know this for the finite memory case, as we’ve been discussing. What you’ve misunderstood (on purpose?) is that in my paragraph I am referring to unbounded memory. Maybe try a little to map out what you think the person you are talking to knows and does not know, for better communication or something.
This is the cycle detection algorithm you discussed earlier. There are also other ways to construct proofs that programs halt.
there is no general way to decide if programs (i.e. Turing machines) halt.
I’m sorry if I misunderstood you. I certainly did not do it on purpose.
Ok, given the context, then yes, I agree with you. You could write such a program. But if you assume the program you’re trying to analyze runs with finite memory, then you can even write a program that always tells you “yes, the program halts” or “no, the program does not halt”, for any arbitrary program.
And in fact, it would even be possible to tell why it halted!
Given the context of unbounded memory, then I agree with this statement. But I would definitely like it much more if people always preceded such statements with the “unbounded memory” assumption that they are making, to avoid such common misunderstandings. I’m not criticizing you specifically, rather, just the common discourse in general :)
A Turing machine can tell if a finite memory program halts, but a finite memory program cannot determine whether all finite memory programs halt, because it may run out of memory itself.
Yes, it can. The tortoise and hare algorithm is a finite memory program and it can determine if all finite memory programs halt.
So suppose you have 16 GB of memory to run the tortoise and hare algorithm: there is a program large enough that your 16 gigs will run out and the algorithm won’t be able to complete. That means it can only analyze finite state machines with up to 8 GB of memory.
No, the algorithm will be able to complete.
If the program tries to use more than 8 GB of memory, presumably it would crash with an out-of-memory error, which the tortoise and hare algorithm would interpret as a halting state and then terminate. Or it would loop, which the tortoise and hare algorithm would detect as a cycle and then terminate.
The program is not allowed to use an unbounded amount of memory.
Yes, but why is that interesting outside of trying to analyze a program that is trying to solve some kind of mathematical problem (i.e. something with infinite bounds)?
I think the more interesting statement is that there is a general way to decide if programs, when ran on deterministic finite-state machines, will halt, even if they were written in Turing-complete languages.
And I think the even more interesting question is: how can we reduce the time complexity of such algorithms that can decide that?
Well yes, but you need to run the decision program on a machine with exponentially more memory, which is impossible in practice.
No, you don’t.
The tortoise and hare algorithm uses only a constant amount of memory to determine whether a finite-state machine halts. Concretely, it uses 2*N bytes of memory if the underlying finite state machine uses N bytes of memory.
This is wrong. Cycle detection works for Turing machines too. The equivalent halting problem for a finite machine is “there’s no algorithm that can tell if an arbitrary program, ran with sufficient memory, will halt.”
i.e., take this program:
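(Any program whose memory use grows without bound will do; as a stand-in, assume something like a memoized hunt for a Collatz counterexample:)

    from itertools import count

    # Stand-in example: search for a cycle in the Collatz map, memoizing every
    # number already verified to reach 1. The memo table grows without bound,
    # so a real computer eventually kills this with an out-of-memory error.
    verified = set()
    for n in count(2):
        trail, x = set(), n
        while x not in verified and x != 1:
            if x in trail:                 # found a Collatz cycle: halt!
                raise SystemExit(n)
            trail.add(x)
            x = 3 * x + 1 if x % 2 else x // 2
        verified |= trail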
That will eventually halt on any computer due to hitting memory limits. But hitting a memory limit does not tell you if it would have halted if you had more memory!
This is also wrong. We know it’s possible for an algorithm to find bugs, we’ve done that for decades. Rice’s Theorem says that there’s no algorithm that always finds bugs in any buggy program. It’s why SMT solving is, in reality, Very Hard.
It’s not wrong. Cycle detection doesn’t work for Turing machines precisely because of the Halting problem. The program may run forever without repeating states (including the state of the Turing machine tape), so for certain programs you will never find a cycle using a cycle detection algorithm on a Turing machine, therefore these algorithms would never terminate.
Which to me, is only of interest as a theoretical exercise or as a means to explore the consequences of true infinity, but not for solving most real-world problems.
What is “running with sufficient memory”? Isn’t that the same as running on a Turing machine?
Sure, but that problem is not very interesting for the vast majority of cases, because in the vast majority of cases you’re actually more interested in knowing whether the program you’ve built is going to crash even if you give it a huge amount of memory than knowing whether it will work correctly on an imaginary machine with infinite memory which nobody will ever have.
Generally speaking, knowing whether a program would halt on a Turing machine is useful for solving mathematical problems related to unbounded / infinite limits rather than solving real-world problems.
No, it’s not wrong. Rice’s theorem only says that for imaginary Turing machines, which is my entire point! You’re making the same mistake everyone usually makes. Rice’s theorem doesn’t say that for machines with finite memory, such as… computers.
That’s not because of Rice’s theorem! You just proved my point. You also believed that SMT solving is hard because of Rice’s theorem which means that these widespread beliefs about the Halting problem and Rice’s theorem are making people believe things which are not true and are discouraging people from pursuing these avenues of research.
It actually works better on Turing machines because in a real computer, cycle detection doesn’t work if the cycle is larger than your available memory. But in both abstract and real machines, it can’t tell you if a cycle is larger than your available memory.
I believe SMT solving is hard because I’ve worked with people who research SMT solving, and I assure you that anyone developing SMT solvers will tell you the exact same thing.
So your point is that when:
… then it works just as well as (or even better than) if the second program was running on a finite-state machine.
If this is what you are arguing, then I agree but it’s irrelevant to what we were discussing. You said:
Sure, it works if the first program is assumed to run on a finite-state machine. But that’s not how the Halting problem is defined. The Halting problem is actually assuming that the first program runs on a machine with infinite memory. This is why people say the Halting problem is undecidable rather than decidable. And it’s also why it doesn’t apply to real-world computers. So what I said is correct.
Cycles can’t be larger than the number of states in the available memory, by the pigeonhole principle.
I know SMT solving is hard, but as far as I understand it’s not because of Rice’s theorem.
Or are you assuring me that those people also have mistaken beliefs, given that Rice’s theorem is false for finite state machines? If so, that’s even worse than I thought, then.
Maybe it’s not the computer scientists who are mistaken here?
It is a known result in computer science that Rice’s theorem says the problem is undecidable for Turing machines, because the proof is based on the Halting problem and the Halting problem is undecidable for Turing machines.
However, it is also a known result in computer science that the Halting problem is decidable for finite-state machines, which makes the Rice’s theorem proof invalid for finite-state machines.
If there are n bits of memory available, then a cycle can have length up to 2^n, so you need n * 2^n bits of memory to store the table of visited states.
No, you don’t. The tortoise and hare algorithm detects such cycles and doesn’t store n * 2^n bits of memory; it only stores 2*n bits of memory.
Alright, so it’s not limited by memory, but it still needs to walk through the entire cycle, which might take 2^n steps.
The currently known algorithms for detecting cycles might indeed take 2^n steps to detect a cycle.
I’m arguing that it’s possible to devise an algorithm that looks at the program description and can determine whether there are cycles or not, without actually having to simulate the program running step-by-step.
So, to be clear, you’re claiming that there is an algorithm H(P, I, n) that takes a program P and determines whether it halts in fewer than 2^n steps when run on input I, and the time complexity of this algorithm is polynomial in n?
Polynomial? I don’t know. There are reasons to believe this is not possible, but some of these reasons I find a bit suspicious, especially because all complexity analysis is done with Turing machines. So it might be possible, I’m not sure. Or perhaps we can achieve quasi-polynomial time (i.e. QP).
SAT solving is NP-complete, so if P=NP perhaps it might be possible? Perhaps you just need to find a satisfying set of assignments in things like loop conditions to determine whether a program loops or not?
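As a toy illustration of what I mean (using the Z3 SMT solver’s Python bindings; the loop and the bound here are made up): unroll the loop “while x != 5: x += 3” over 4-bit state and ask, SAT-style, whether the guard ever fails. Since a deterministic 4-bit sequence must cycle within 16 steps, 16 unrollings settle the question outright.

    # Bounded halting as satisfiability, using the z3-solver package.
    from z3 import BitVec, Solver, Or, sat

    STEPS = 16                            # 2**4 steps cover every 4-bit state
    x = [BitVec(f"x_{t}", 4) for t in range(STEPS + 1)]
    s = Solver()
    s.add(x[0] == 0)                      # initial state
    for t in range(STEPS):
        s.add(x[t + 1] == x[t] + 3)       # loop body: x += 3 (wraps mod 16)
    # Satisfiable iff the guard x != 5 fails at some step, i.e. the loop halts.
    s.add(Or([x[t] == 5 for t in range(STEPS + 1)]))
    print(s.check() == sat)               # True: the loop exits at step 7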
Of course, this is all assuming the program P uses a finite amount of memory.
But even if the time complexity of H is exponential, that says nothing about the efficiency of the algorithm, i.e. how long it actually takes to solve the problem for reasonable and useful instances of programs.
Well, it seems that the problem is EXP-complete: https://cs.stackexchange.com/questions/112684/halting-problem-in-exp-complete
EXP can still be quasi-polynomial (QP), I think. So far we only know that P is not EXP, as far as I know.
QP means that this could be something like O(2^(log(N)^1.1)), right? Which means for a state size of 1 TiB, it could take on the order of… 2^17 = 130k times more steps than a state size of 1 bit to calculate in the worst case, which is pretty awesome :)
My optimism is unprecedented…
Your understanding is wrong.
How so?
You are fixated on Turing machines. Here are some starting points for leaving that worldview behind:
P, NP, EXP, and several other common complexity classes can be characterized without Turing machines. I have a diagram which shows this in detail. We can immediately see that P is contained in EXP by an analogue of the time-hierarchy theorem, for example; EXP is described by the same logic as P, but with second-order quantifiers instead of first-order quantifiers.
You may find this book interesting. It explains all of these results, starting with Gödel and ending with Turing.
I’m not entirely sure what you mean by this phrase. If lambda calculus is Turing-complete, then it means lambda calculus is as powerful as Turing machines, so it is able to use infinite amounts of state, just like a Turing-machine.
Which is the entire problem that I’m pointing out. If we restrict ourselves to analyzing programs that can actually run on a computer (i.e. those that use a finite amount of state), then all these problems that were considered undecidable suddenly become decidable.
OK. I don’t quite understand your point here?
It depends on the program being analyzed being able to use an infinite amount of state, otherwise Rice’s theorem is false. Or are you saying Rice’s theorem is true for programs that use a finite amount of state?
Your focus on state is specific to Turing machines. Lambda calculus does not have intermediate state; any expansion or reduction of terms is going to yield another term.
I think that Wikipedia puts it best in their page on Rice’s theorem:
I also like the phrasing from Yanofsky, p19:
Because we are discussing properties of functions, and not just properties of various encodings of functions (various programs), it doesn’t really matter which encoding we choose; all that is required is that anything computable is expressible with our encoding.
[Comment removed by author]
Ok, I think I understand that now, although I’m not entirely sure (I’ve only really begun to understand Rice’s theorem in the past couple of days).
But to me, that seems only interesting and important if you are interested in analyzing a type of computation that cannot be performed in our universe. Which I understand can be useful in certain scenarios, but I’m not sure that it is as useful as it seems for most problems.
I mean, we can theorize about Turing-equivalent computation, analyze it, speculate about it and even extract conclusions about it. We can even perform a limited subset of this type of computation in the real world. But we can never truly perform this type of computation, which is a much stronger type of computation than what is possible in our universe.
You can still write programs that express such types of computation (e.g. programs written in Turing-complete languages that have no concept of memory allocation failure), with the understanding that if your program uses too much memory, it will crash/halt in the real world. Which means that when you analyze how these programs will actually execute in the real world, you should do that analysis using a model of computation that represents what computers can do, functionally, not what some theoretical super-powerful imaginary computation device would do in an imaginary universe.
If we limit ourselves to analyzing the type of computation that is possible to do in our universe, then there are problems which were undecidable in the previous model that actually become decidable.
I am not entirely sure what the analogous of Rice’s theorem would be for computations that are limited to using finite state. I mean, Rice’s theorem can be a little difficult to understand :) So I don’t know, perhaps you can clarify that for me:
It seems that Rice’s theorem is defined over the “computable functions”, where “computable functions” actually seems to be a misnomer, because some of these functions are theoretically impossible to compute in our universe, regardless of how much space and energy could be accessible to us: “computable functions are exactly the functions that can be calculated using a mechanical calculation device given unlimited amounts of time and storage space.”
If you were to define “computable functions” as “the set of functions that can be calculated using a mechanical calculation device given unlimited amounts of time but limited amounts of storage space”, then what would that look like? Because this set of functions is exactly the set that deterministic finite state machines can calculate, and also exactly the set that is theoretically possible to compute in our universe.
Given the above, I know Rice’s theorem’s proof relies on the Halting theorem, and the latter’s proof is not only invalid for deterministic finite-state machines; in fact, we can prove the opposite (i.e. that halting for such machines is decidable).
So if we were to define a theorem analogous to Rice’s theorem but for computations that can be performed in our universe, then my intuition says that the properties it covers would be decidable, given that you can always fully analyze this type of computation (e.g. since it is possible to simulate this type of computation step by step and guarantee that the simulation always terminates, because the simulation either terminates by itself or we can detect that it starts to repeat forever).
There are relatively small Turing machines whose behaviors include halting after taking more steps than our observable universe can represent. For a summary of what we currently know, see this table of busy-beaver values for Turing machines. This contradicts your implicit claim that we can only prove facts about programs which are running on physical machines.
You are close to reinventing computational complexity theory. You are not the first person to wonder what happens when we limit the amount of resources available to a Turing machine.
I would strongly recommend working through Peter Smith’s book linked above, “An Introduction to Gödel’s Theorems”. This book introduces Turing machines after building up the rest of the machinery of computability theory; the work of Gödel, Tarski, and Turing is put into that particular ordering, and we can pause after Tarski’s undefinability of truth and use Yanofsky’s example to prove Rice’s theorem without Turing machines.
So here’s another fun one for you.
Let me extract this excerpt from Wikipedia: “The busy beaver function quantifies the maximum score attainable by a Busy Beaver on a given measure. This is a noncomputable function.
The busy beaver function, Σ : N → N, is defined so that Σ(n) is the maximum attainable score (the maximum number of 1s finally on the tape) among all halting 2-symbol n-state Turing machines of the above-described type, when started on a blank tape.”
Ok, good. Now let’s say we use my definition of computable function (well, I’m sure it’s not literally my definition and that someone from many decades ago has already used an identical definition), which is the following: “computable functions are exactly the functions that can be calculated using a mechanical calculation device given unlimited amounts of time but limited amounts of storage space”.
You could say the following: “But @wizeman, if your computable functions are even more limited than the usual definition of super-powerful computable functions, then it’s even more impossible to compute the busy beaver function!”.
Yes, that’s true. But, the busy beaver is using a Turing machine as a model, which means it also uses the absurdly super-powerful definition of computable functions!
So how about we define the busy beaver function as such: “The busy beaver function, Σ : N x N → N, is defined so that Σ(n, k) is the maximum attainable score (the maximum number of 1s finally on the tape) among all halting 2-symbol (4 symbols including the endmarkers) n-state deterministic linear bounded automatons of the above-described type, when started on a blank tape of length k.”
What have we gained by doing this?
Well, suddenly, the busy beaver function actually becomes computable, even when using my non-super-powerful definition of computable functions!
Note that, just like Wikipedia says, a computable function does not imply that they can be efficiently computed (i.e. computed within a reasonable amount of time).
But it does mean that something that wasn’t even theoretically possible to do before (not to mention in practice), now is theoretically possible.
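To make that concrete, here is a brute-force sketch (Python; as a simplification, the head is clamped to the tape ends rather than using endmarkers). A machine with n control states on a length-k binary tape has at most n * k * 2^k configurations, so any run lasting longer has provably looped; that bound lets us classify every machine and compute Σ(n, k) by exhaustive search: hopelessly slow, but computable.

    from itertools import product

    def bounded_busy_beaver(n_states, k):
        # Sigma(n, k): the most 1s any halting n-state, 2-symbol machine
        # leaves on a length-k tape (head clamped to the tape ends).
        # A configuration is (control state, head position, tape contents);
        # there are only n_states * k * 2**k of them, so any run that lasts
        # longer has repeated one and will therefore never halt.
        step_cap = n_states * k * 2 ** k
        # A transition maps (state, symbol) -> (write, move, next state),
        # where next state == n_states encodes "halt".
        actions = [(w, m, q) for w in (0, 1) for m in (-1, 1)
                   for q in range(n_states + 1)]
        keys = [(q, s) for q in range(n_states) for s in (0, 1)]
        best = 0
        for table in product(actions, repeat=len(keys)):
            delta = dict(zip(keys, table))
            tape, pos, state = [0] * k, 0, 0
            for _ in range(step_cap):
                write, move, state = delta[(state, tape[pos])]
                tape[pos] = write
                pos = min(max(pos + move, 0), k - 1)  # clamp head to the tape
                if state == n_states:                 # the machine halted
                    best = max(best, sum(tape))
                    break
        return best

    print(bounded_busy_beaver(1, 2))  # tiny instance; larger ones explode fast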
Yeah. If by “small” you mean a Turing machine with 1.3×10^7036 bits of tape, sure. Remember, the busy beaver function calculates how many bits of state the tape ends up with (well, not bits, symbols, but it’s almost the same thing).
I can create a deterministic finite state machine that takes even longer to run: start with a 1.3 x 10^7036-bit integer initialized to zero and increment it on each step until it wraps around.
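Concretely (with the width shrunk so it actually finishes):

    N = 16                     # stand-in for the 1.3 x 10^7036-bit width above
    i = 0
    while True:
        i = (i + 1) % 2 ** N   # increment an N-bit counter, wrapping on overflow
        if i == 0:             # wrapped around: every N-bit value was visited
            break              # halts after exactly 2**N steps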
But I don’t know how that proves anything about what I said.
I think there are two things wrong with this statement of yours:
1. I don’t think I claimed it’s impossible to prove facts about imaginary machines, even infinite ones. That seems obvious to me; otherwise induction wouldn’t work.
2. I don’t think I claimed that it’s always practical to prove things about physical machines, which also seems obvious to me: you can imagine building the biggest machine possible and then not having enough energy to start it, let alone complete the computation.
But I did mean that restricting ourselves to finite-state machines makes it theoretically possible to prove statements that are impossible to prove for those with infinite state.
You can prove Rice’s theorem without Turing machines but I don’t think you can prove it if you say that a computable function is one that can be computed with finite storage.
In fact, I believe I can prove the opposite of Rice’s theorem if we use that definition of computable function.
You’ve misunderstood. A Turing machine’s description does not include its tape; it includes only the state transitions and symbols. I strongly recommend taking a crack at that PDF of Smith’s book.
The description does not include its tape, but the Turing machine’s tape is part of its state.
That’s why a busy beaver can be a small program, terminate, and yet take such a long time: it’s because it can access so much state. Remember: a program can never take longer to run than 2^N steps (where N = state size in bits), unless the program starts to loop forever. So if a busy beaver only had access to 10 bits worth of tape (plus K control states in its description and a head position), it could never take more than about 2^10 * K * 10 steps to complete.
The finite-state program that increments a 1.3*10^7036-bit integer until it wraps around is also an extremely small program that takes even longer to terminate than a busy beaver that uses a tape as long as that integer.
That cycle detection algorithms work has nothing to do with the finiteness of the machine memory and everything to do with the finiteness of the memory used by the program. It is well known that you can solve the Halting Problem for specific subsets of all possible machines, where you can exploit some property of the program. Those algorithms also work on theoretical machines with infinite memory running the same programs.
That they wouldn’t work for programs that use infinite memory is a red herring: they also don’t work, in practice, for many programs that use a finite amount of memory. The same is true for any other subset for which the halting problem can be solved: there are practical limits that prevent them from being solved in general.
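For reference, here is a minimal sketch of the idea, assuming the program is given as a deterministic step function (the names are mine): Floyd’s tortoise-and-hare either reaches a halting state or detects that the run has entered a cycle, using O(1) extra memory, and it works regardless of the surrounding machine’s memory so long as the program itself only reaches finitely many states:

    def halts(step, start, is_halted):
        # Floyd's tortoise-and-hare over a deterministic transition
        # function: returns True if the run reaches a halting state,
        # False if it enters a cycle (i.e. loops forever). It terminates
        # whenever the set of reachable states is finite.
        slow = fast = start
        while True:
            if is_halted(fast):
                return True
            fast = step(fast)
            if is_halted(fast):
                return True
            fast = step(fast)
            slow = step(slow)
            if slow == fast:            # the run has revisited a state
                return False

    # A counter that wraps around modulo 8 never halts...
    print(halts(lambda s: (s + 1) % 8, 0, lambda s: False))  # False
    # ...but one that stops at 7 does.
    print(halts(lambda s: s + 1, 0, lambda s: s == 7))       # True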
The proof for the Halting Problem does not really depend on the infiniteness of machine memory: you can always devise the intermediate machine such that it loops for an amount of time that makes the Halting Problem only solvable given more time than the lifetime of the universe. That is equivalent to not solving it.
(I am in general sympathetic to your point. I find it baffling that CS books don’t clearly say “yes, Rice’s theorem, but don’t let that stop you from proving many interesting properties about programs.” I am a physicist by training; I have little patience for mathematical infinities. However, results often remain true when the infinities are replaced with the finite limits of the universe. This is such a case. For many practical purposes that is irrelevant, though.)
Yes, that’s what I meant, actually. I was saying that as long as you assume that the program is going to run on a computer (which has finite memory), then you can always determine whether it halts or not.
And I was pointing out that the property of the program that we can exploit in this case is “the program can only use a finite amount of memory”, which is the subset of all programs that can run on a computer. An immensely useful subset if you can solve the Halting problem for these, I tell you! :-)
Actually, it does subtly rely on the program being analyzed being able to use an infinite amount of memory.
How do I know this? Because the Halting problem is decidable if the program is only allowed to use finite memory, which means that the undecidability proof is invalid in this case.
No, that still makes it solvable. Your intermediate machine may very well loop for an infinite or almost infinite amount of time; the problem remains decidable. How long an algorithm takes to solve the problem is a separate issue. Besides, you don’t know how long this algorithm takes: it may very well be that such an algorithm can determine whether a program loops much more quickly than the program itself loops for.
There is a different problem: can you tell me whether a program is malicious? If so, how?
Let’s say it’s running something in a loop, and you’re trying to watch it to determine whether it is malicious (say, a looping file deletion). That is already a variation of the halting problem, because you cannot tell whether the loop will halt (before deleting everything), and simply terminating it after X cycles isn’t enough: in that time it could have a) deleted too much or b) not actually finished the work the user intended. The same goes for trying to find out whether a program will behave differently under antivirus scanning than when running normally.
If you have a formal (i.e. mathematically precise) specification of what a malicious program is, then it is possible to solve it, yes.
I think the issue with your example is that you don’t have a specification, i.e. there isn’t enough information to solve the problem.
But you could specify, for example: a benign program never deletes any files that it hasn’t created before, or whose paths the user hasn’t passed as a (specific) command-line parameter, or which the user hasn’t selected in a GUI file-manager-like window.
Then it would be possible to analyze a program and determine whether it does that or not.
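As a toy illustration of what a precise specification buys you (the event format and the helper below are entirely made up for this example), once the policy is formal you can at least check a concrete execution trace against it mechanically, and for a program restricted to finite state you could in principle check all of its behaviors the same way:

    def trace_respects_policy(events, cli_paths, selected_paths):
        # Hypothetical trace format: ("create", path) or ("delete", path).
        # Policy from above: a benign program only deletes files it has
        # created itself, or whose paths the user supplied (command-line
        # parameter or selection in a file-manager-like window).
        created = set()
        for op, path in events:
            if op == "create":
                created.add(path)
            elif op == "delete":
                if path not in created | cli_paths | selected_paths:
                    return False    # a deletion the policy doesn't allow
        return True

    print(trace_respects_policy(
        [("create", "/tmp/scratch"), ("delete", "/tmp/scratch")],
        cli_paths=set(), selected_paths=set()))              # True
    print(trace_respects_policy(
        [("delete", "/home/user/thesis.tex")],
        cli_paths=set(), selected_paths=set()))              # False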
This isn’t pedantry, this is just a misunderstanding of math and logic. Mathematical statements have to be precise. You can’t make a general statement about computers without having a precise, mathematically representable model of what a computer is. A Turing machine is one such model, and it’s believed to be a very good one, in part because of its correspondence to other similar models like the lambda calculus. Most notable is the register machine, which is much closer to a real computer than a Turing machine but still equivalent. So the halting problem result still holds.
Also, the halting problem has never discouraged anyone from research. I obviously don’t know every researcher, but it’s extremely well understood that the halting problem only applies to a general algorithm over all possible programs, and there are obviously classes of programs where you can easily know whether or not they halt. The biggest example is the simply typed lambda calculus, whose programs are proven to always terminate.
The STLC is not a secret, it’s the basis of a huge amount of CS research. So the class of programs that the halting problem applies or doesn’t apply to is very well known.
How about we use a deterministic finite-state machine to model a computer? Would you be OK with that?
No, it is not. That is my point.
It’s not one such model because Turing machines can run programs which no computer will ever be able to run, and I really mean ever. It doesn’t matter which computer you construct, it will never be able to run such programs like a Turing machine does.
Specifically, I am talking about those programs which use infinite amounts of state (which in a Turing machine, includes the tape).
That’s why the Wikipedia page about Turing machines says “(…) the computation of a real computer is based on finite states and thus not capable to simulate a Turing machine (…)”.
And it’s also why the Wikipedia page about finite-state machines says: “The finite-state machine has less computational power than some other models of computation such as the Turing machine.”.
So a Turing machine is not a model of a computer, it is a purely theoretical model of a super-powerful (due to its infinite tape) abstract machine which is impossible to construct in the real world and which no computer can ever emulate.
Still not a model of a computer. Register machines are Turing-equivalent, which means register machines can use infinite amounts of state as well, and thus they are a purely theoretical construction which is much more powerful than any computer will ever be.
For the sake of argument, imagine I create a model of a computer and tell you: hey, my computer model has an instruction which, when executed, can instantly tell tomorrow’s lottery numbers and put them in a register. And another instruction which kills everyone on the planet.
Would you say this is a model of a computer? Do you think it is really a model of what computers do? It does seem absurd, but a model of a computer with a truly (i.e. mathematically) infinite tape can be argued to be similarly absurd.
Note that I’m not saying that models have to be a perfect simulation of what a computer does, atom by atom.
But they should at least functionally represent what a computer does. And it’s OK to have unfaithful models, i.e. models with certain assumptions or simplifications (as Turing machines have, in assuming the program can use infinite amounts of memory), but if you do so, you have to be careful about the implications, and not assume that just because something is true in your model, it is also true for computers. Otherwise you’re just misleading everyone.
It doesn’t, because of what I mentioned above. The Halting problem is decidable on finite-state machines.
Sure. But there is a particular class of programs, namely those that are only allowed to use a finite amount of state (i.e. those that can run on computers), for which problems such as the Halting problem and Rice’s theorem, which are undecidable on Turing machines, suddenly become decidable.
So ironically, these Turing machines which are more powerful than computers, end up having limitations that finite-state machines like computers don’t have!
And not only do these problems become decidable on computers, but it also becomes “easy” to prove statements about these programs, as the Tortoise and Hare algorithm (and other similar ones) demonstrates. However, there is still a huge practical issue: the time complexity of these known algorithms is way too high to be useful, so we’d need more efficient algorithms for them to be practical on real-world programs.
I’m totally ok with that. That’s why I brought up the simply typed lambda calculus, which does not suffer from the halting problem and which has programs that are guaranteed to terminate. There are many models of computation that each have different properties and tradeoffs.
But infinite models still have immense practical utility. I have a good example - integers. The set of integers in math is infinite. Yet we can still use a finite subset of integers in our daily lives, and we know many things about finite subsets of integers because we’ve proven them on infinite sets.
Another occurrence of infinity is with interactive programs, like anything with a user interface or an event loop. While technically they don’t execute forever, or at least we don’t know if they do or not because we as humans can’t experience “forever,” you still have to reason about an interactive system as if it were infinite. It can be thought of as an infinite program that you can choose to halt, or as a finite program where you choose an arbitrary number of iterations beforehand. The distinction is totally moot though, because you have to use the same kind of logic in either case. Or in the case of FSMs, the number of finite states is so large that it being finite has no relevance to the real world, because you could never check anything about those states directly in any number of lifetimes.
You are arguing against the usage of infinite reasoning for real-world objects, and that we’re going to disagree on. That’s just how math works. You use infinity sometimes.
Yes, that’s why I argue that Turing machines are ideal if you want to model a computation about mathematical statements, e.g. for figuring out the Collatz conjecture or whatever, i.e. statements which are about truly infinitely-large things, which practically only occur in math.
But for modeling computers and the execution of real-world programs, finite-state machines seem a lot more appropriate and representative.
Actually, this type of infinity (in terms of time) is not necessarily an issue. As I mentioned earlier, the Tortoise and Hare algorithm (and others like it) can always deterministically detect whether programs halt or whether they loop forever, assuming the program is running on a finite-state machine. This means that it is possible to reason about these programs, i.e. Rice’s theorem is decidable for these programs.
So it doesn’t necessarily matter if they are designed to execute forever.
BTW, this is another example where the Halting problem, being usually defined for and studied on Turing machines, is misleading people into believing that reasoning about always-looping programs is undecidable.
No, this is not true. You can model programs that run forever as deterministic finite-state machines, which don’t have the same undecidability problems as Turing machines.
How do you know this? Are you saying that to check statements about these programs you can only simulate them step-by-step while they run through all states, i.e. that you cannot answer these statements by looking at the description of the program?
Well, math is math and the real world is the real world. They are not the same thing. In the first one, truly infinite things exist, in the second one, they do not (except time perhaps), or at least, they’ll never be accessible to us.
But if you disagree with me, that’s fine :-)
What problems are you working on for which you think FSMs will be more effective?
You can. How would you prove or check a property about such a program? (see the next response for what you would do)
No, I’m saying you have two choices to prove or check properties about large state machines:
1) Run them step by step and see if the property is ever broken (see the sketch after this list). We have plenty of experience with this in model checking, and anyone who has actually done model checking on a real model will tell you that you have to abstract your system immensely and keep your bounds very small for this to be useful in any practical way. This is possible in very small, controlled cases, but you are arguing that FSMs should replace all other forms of reasoning about programs, and experience shows that this is a very bad idea.
2) Use infinite reasoning, such as induction, to show the property holds on an infinite state machine. This is the better idea, but the one that you are opposed to.
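Here is roughly what option 1 looks like in miniature (my own toy formulation, not any particular tool): breadth-first exploration of every reachable state, testing an invariant on each. The visited set is exactly why this only scales to small, heavily abstracted models:

    from collections import deque

    def check_invariant(initial, successors, invariant):
        # Explicit-state model checking in miniature: enumerate every
        # reachable state and test the invariant on each one. Returns a
        # counterexample state, or None if the invariant always holds.
        seen, frontier = {initial}, deque([initial])
        while frontier:
            state = frontier.popleft()
            if not invariant(state):
                return state
            for nxt in successors(state):
                if nxt not in seen:     # the visited set = state explosion
                    seen.add(nxt)
                    frontier.append(nxt)
        return None

    # Toy model: a counter mod 100 that can increment or reset;
    # the invariant "counter stays below 100" holds on all 100 states.
    print(check_invariant(0, lambda s: [(s + 1) % 100, 0],
                          lambda s: s < 100))    # None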
The debate isn’t between you and me, it’s between you and the CS community, I’m just relaying what is known about state machines in CS.
Writing computer programs and proving things about them.
To be clear, I’m not saying FSMs will be more effective right now. It’s just that it’s theoretically possible to solve problems for them that aren’t possible to solve for Turing machines.
If you have an algorithm that does exactly what the Tortoise and Hare algorithm does, but works in a much more practically time-efficient manner (while still not using excessive amounts of memory), then it would be possible to use that algorithm to prove and check properties about programs that aren’t allowed to consume infinite memory.
Note that the Tortoise and Hare algorithm (and others like it) work without knowing anything about the program, they simply run the program step-by-step.
I’m conjecturing that it might be possible to greatly improve the performance of such an algorithm by looking at the program description and exploiting properties about how the program works.
OK, then I think we misunderstood each other. You are describing the currently-known ways of solving this issue. But when you said this:
I thought you were talking about something that is impossible to do, i.e. that can never be improved upon.
So yes, I agree with you, currently those are the most feasible ways to solve this problem. And of course, I use induction myself when I want to prove properties about my programs, because this is one of the currently most feasible ways to do this. So I know that they are useful right now and I also know about the limitations of both of them.
But I wasn’t talking about what is currently feasible; I was talking about what might be possible to do.
In theory, as far as I know, there’s nothing that says that to prove properties about a program running on a finite-state machine, you have to either 1) run the program step-by-step like in a model checker, or 2) do things like induction.
In other words, I don’t think that those will always be the only options, especially because it’s possible to solve problems in FSMs that aren’t possible to solve for Turing machines. But I’d be happy to be proven otherwise!
A worthy goal :)
I think we’ve arrived at mutual understanding. I won’t try and discourage optimism, because we should always be looking for improvements over what we know today. It would be great if we could reduce any part of computation to deterministic, finite, and manageable structures.
For today, I stick with Leslie Lamport’s advice and point of view, and focus on the techniques of infinite state machines.
This is wrong. In particular, Brainfuck is Turing-complete, implementable on physical computers, and simulates a tape.
What do you mean exactly?
If it’s Turing-complete, by definition it can’t be fully simulated on a physical computer; a computer can only run a bounded approximation of it.
That’s why the Wikipedia page about Turing machines says “(…) the computation of a real computer is based on finite states and thus not capable to simulate a Turing machine (…)”.
And it’s also why the Wikipedia page about finite-state machines says: “The finite-state machine has less computational power than some other models of computation such as the Turing machine.”.
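To make the disagreement concrete, here is a minimal Brainfuck interpreter sketch (my own; input via ‘,’ is omitted). The language is Turing-complete only on paper, where the tape is unbounded; any implementation that actually runs, like this one, has to bound the tape somewhere:

    def run_bf(code, tape_cells=30000):
        # Minimal Brainfuck interpreter with a fixed-size tape, which is
        # exactly the point of contention: Turing-completeness assumes an
        # unbounded tape, while any real run gets a finite one.
        tape, ptr, pc, out = [0] * tape_cells, 0, 0, []
        jumps, stack = {}, []
        for i, c in enumerate(code):        # match brackets up front
            if c == '[':
                stack.append(i)
            elif c == ']':
                j = stack.pop()
                jumps[i], jumps[j] = j, i
        while pc < len(code):
            c = code[pc]
            if c == '>': ptr += 1
            elif c == '<': ptr -= 1
            elif c == '+': tape[ptr] = (tape[ptr] + 1) % 256
            elif c == '-': tape[ptr] = (tape[ptr] - 1) % 256
            elif c == '.': out.append(chr(tape[ptr]))
            elif c == '[' and tape[ptr] == 0: pc = jumps[pc]
            elif c == ']' and tape[ptr] != 0: pc = jumps[pc]
            pc += 1
        return ''.join(out)

    print(run_bf('++++++++[>++++++++<-]>+.'))  # prints 'A'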
I disliked this from the first example, because you aren’t replacing the function with a regex; you are replacing it with .test(s).
You’re replacing the logic of the function with matching it against a regex. The method you use to match it against a regex is an implementation detail.
I disliked it because of the question “But wouldn’t it be cool if your IDE gave you a refactoring hint?” No, that wouldn’t be cool, because it would imply that the fast, concise array manipulation or length checking is for some reason not as good as a regular expression. For simple things like that, I call it RE abuse. Not using an RE is almost always better. The tutorial lost me at this point.
This is actually posting your answers to a server. It would be interesting to see the stats.
Oh, whoops. I spammed the responses to get through it all and see if it got more interesting.
Sometimes switching to another option actually gave me a better explanation.