The program doesn’t stop from the point of view of the standard model of the natural numbers, but in a suitable nonstandard model it stops (after a nonstandard amount of time) if it was given parameter from the desired set.

This can actually be illustrated by a slightly simpler example.

The basic idea of non-standard models is: we have some ideas how natural numbers behave, we write a theory to formalize these ideas, but this theory cannot be complete and consistent at once. So if it has no contradictions, there are things that can be neither proven nor disproven. And a theory without contradictions describes multiple different models. We can use these models as models of time, and computations/proofs will behave differently.

As an example: let’s describe natural numbers. 0,1,2,… We can add, multiply them; a+b=b+a, ab=ba, (a+b)c=ac+bc. We know that a+b+c and ab*c are well-defined even without specifying the order of operations. Natural numbers behave like that, but so do reals. We can say «every x except 0 can be represented as x=y+1»; this excludes reals, positive reals, and integers. We can also add that for two different natural numbers x and y either x-y or y-x is natural, but not both. This defines comparison: x>y if x-y is natural.

We can add the notion of finite sequences (something numbered by natural numbers up to a given number). A proof is a finite sequence of finite sequences of natural numbers, where we actually encode the characters in the proof as numbers.

This is a sensible description of natural numbers, and you can go on and formally define the finite sequences as a new kind of object. But the description is not complete, and there is an alternative model. This model is composed of integer polynomials with a natural most-significant coefficient. 0, x-1, 8x^5-3x+17… (the model does satisfy the conditions I have written above)

What happens if we used these «extended natural numbers» for numbering statements in a proof? We can have the following proof: 0:«True», 1:«True», 2:«True», …, x-2: «False», x-1: «False», x: «False».

This is a sequence of length «x», and every statement is either obviously true or easily follows from the previous one. So it is a proof. Of something false.

Our initial theory did not fully describe the natural numbers. So we could find a different model described by the same theory; but translating the notion of proof made some things provable even though they shouldn’t be.

A similar thing happens with more interesting nonstandard models of natural numbers and set theory. Of course, both the theory and the model are more complicated there; but again, we try to describe the natural numbers, cannot do it in complete detail, and so there is an alternative model of the same theory which can interpret our notion of proof or computation, but the results are different. The difference is also more complicated, of course…

The article is one example of such a situation: the standard formalization of the properties of natural numbers is not enough to prove that the program in question never stops.

The program doesn’t stop from the point of view of the standard model of the natural numbers, but in a suitable nonstandard model it stops (after a nonstandard amount of time) if it was given parameter from the desired set.

[Comment removed by author]

This can actually be illustrated by a slightly simpler example.

The basic idea of non-standard models is: we have some ideas how natural numbers behave, we write a theory to formalize these ideas, but this theory cannot be complete and consistent at once. So if it has no contradictions, there are things that can be neither proven nor disproven. And a theory without contradictions describes multiple different models. We can use these models as models of time, and computations/proofs will behave differently.

As an example: let’s describe natural numbers. 0,1,2,… We can add, multiply them; a+b=b+a, a

b=ba, (a+b)c=ac+bc. We know that a+b+c and ab*c are well-defined even without specifying the order of operations. Natural numbers behave like that, but so do reals. We can say «every x except 0 can be represented as x=y+1»; this excludes reals, positive reals, and integers. We can also add that for two different natural numbers x and y either x-y or y-x is natural, but not both. This defines comparison: x>y if x-y is natural.We can add the notion of finite sequences (something numbered by natural numbers up to a given number). A proof is a finite sequence of finite sequences of natural numbers, where we actually encode the characters in the proof as numbers.

This is a sensible description of natural numbers, and you can go on and formally define the finite sequences as a new kind of object. But the description is not complete, and there is an alternative model. This model is composed of integer polynomials with a natural most-significant coefficient. 0, x-1, 8

x^5-3x+17… (the model does satisfy the conditions I have written above)What happens if we used these «extended natural numbers» for numbering statements in a proof? We can have the following proof: 0:«True», 1:«True», 2:«True», …, x-2: «False», x-1: «False», x: «False».

This is a sequence of length «x», and every statement is either obviously true or easily follows from the previous one. So it is a proof. Of something false.

Our initial theory did not fully describe the natural numbers. So we could find a different model described by the same theory; but translating the notion of proof made some things provable even though they shouldn’t be.

A similar thing happens with more interesting nonstandard models of natural numbers and set theory. Of course, both the theory and the model are more complicated there; but again, we try to describe the natural numbers, cannot do it in complete detail, and so there is an alternative model of the same theory which can interpret our notion of proof or computation, but the results are different. The difference is also more complicated, of course…

The article is one example of such a situation: the standard formalization of the properties of natural numbers is not enough to prove that the program in question never stops.