The strided representation of multidimensional arrays allows a lot of really nice properties to just sort of naturally fall out. I find the representation particularly beautiful.

I do too! Phil Abrams’ 1970 thesis on An APL Machine (pdf) has a really interesting discussion of ‘beating’, a process introduced to use the strided representation to carry out various operations (transpose, rotate, take, drop, …) efficiently.

It also introduces ‘drag-along’ which is a sort of lazy evaluation of APL expressions - I actually don’t know if this is used widely/at all in APL implementations. It for example allows you to calculate 3↑2×-V by only negating and doubling the first three elements of V rather than working on the whole vector V then taking the first three elements at the end (example from An APL Machine). I’d like to try implementing drag-along one day.

It also introduces ‘drag-along’ which is a sort of lazy evaluation of APL expressions - I actually don’t know if this is used widely/at all in APL implementations. It for example allows you to calculate 3↑2×-V by only negating and doubling the first three elements of V rather than working on the whole vector V then taking the first three elements at the end (example from An APL Machine).

Interesting. I have some tentative plans to implement something like that in my apl interpreter, but I didn’t know there was any interest in it already. In that particular example, I don’t think it gains you very much; you can just rewrite as 2×-3↑V, which is probably more clear in context. (Or even ¯2×3↑V.) I’m much more interested in it in for reductions. If you have +/⍳1e9, there’s no reason that should take up 8gb of memory. So you can make ⍳ return a generator, which is basically a bit of state plus a function that can return the next number in the sequence. (Actually, you probably want to make it produce 1000 or so numbers at once, but that’s a separate refinement.) A related paper is Extending APL to Infinity, which wants to implement those ideas as part of the language itself, rather than just the implementation. Personally, I don’t think the method presented in that paper is the right way to do it, but it’s an interesting proposal.

Thanks that looks like an interesting paper! Will read it slowly.

There’s a cool language FAC (A Functional APL Language) designed by H. Tu and Alan Perlis which adds infinite arrays and laziness to APL too, e.g. from the IEEE paper:

A FAC array is a lazy data-structure where any array element is computed by demand. It means that each index access A[I] behaves as a function call A(I), not as a memory access.

From my skim it seems to be similar in spirit to the infinite arrays in the Extending APL paper (representing them as functions on indices).

Also in one of the more recent versions of k ⍳/! really did return an unevaluated range which I think acted like a generator - but it looks like in the version I just downloaded this has been removed…

Also in one of the more recent versions of k ⍳/! really did return an unevaluated range which I think acted like a generator - but it looks like in the version I just downloaded this has been removed…

It was probably k7. I don’t think you can download it anymore (and I can’t find my old copy…), but there’s a web version, where !1000000000 evaluates correctly. The capabilities of that version seem to be pretty limited, though; 1+!1000000000 works, but +/!1000000000 and 2*!1000000000 give a ws error.

From my skim it seems to be similar in spirit to the infinite arrays in the Extending APL paper (representing them as functions on indices).

Right. There’s a big problem with this. Without the ability to make semantic assertions about these functions, it’s difficult to use them effectively. They give an example which computes e; here is another for π:

π ←→ 4×+/1,(¯1*⍳_)÷(1+2×⍳_) ⍝remove '1,' when ⎕io≡0

The problem being, of course, that you can’t actually apply reductions to arbitrary unbounded sequences. First off, reduction is right-associative, so something like -/⍳_ won’t work, since you’d have to start at the end of the sequence, and there is no end. So, fine; restrict it to associative functions. Maybe even just + and ×. Well, then you need the sequence to converge; either to 0 in the case of +, or to ±1 in the case of ×. (Or, for ×, it can have a 0 anywhere, but that’s not very interesting.) An actual APL implementation doesn’t know that, though; it will just keep applying the reduction until the difference between successive elements of the scan is less than some epsilon. So what do you do in the case of a sequence where the first 6000 elements look like they’re going to converge, but then the sequence diverges wildly? It’s not a particularly likely event, but it is possible, and it will happen at some point, and then the language will have failed its users.

My proposed solution is a full-on computer algebra system based on APL. So you can have infinite sequences, but the language knows about them rather than just blindly querying a function for their elements. If you try to apply a reduction to one, the implementation has to prove that the scan converges, and then returns a symbolic result, rather than just one that happens to be within an epsilon of the previous result.

Incidentally, though, I think modeling arrays as functions that map indices to elements is a great idea. (As a part of the language, not the implementation.) I’m currently working on an informal paper showing that, with forks and another almost inconsequential extension, arrays are isomorphic to a subset of functions. (It’s even better: we get prefix agreement with scalar functions.) It also leads to a generalised version of the rank operator, and with another small extension it can replace outer product too.

Thanks this is very interesting! For what it’s worth I would certainly use an APL-ish computer algebra system. I used to use Mathematica a bit and something capturing its symbolic capability in a small number of primitives would be really fun to use I think (could be fun choosing the primitives too).

Also your informal paper sounds intriguing - I’d certainly be very interested to read it when it’s done!

If I may ask, is your interpreter available anywhere? I’d love to see it.

I’m much more interested in it in for reductions. If you have +/⍳1e9, there’s no reason that should take up 8gb of memory

So after my first quick read of the paper in question, it defines what are called “J-vectors”. Basically, a triple of (direction,start,count). I don’t know how that compares to the “Extending APL to Infinity” mechanism; thanks for the link there!

Thank you for that, I’ll look over that this weekend!

It predates APL2/nested arrays; I wonder if it would be difficult to add those in to the design (without having read any of the paper it’s going to be either a non-issue and trivial, or impossible).

Tangentially: Sometimes I imagine a parallel universe in which APL didn’t add nested arrays but instead implemented the notation for trees in the ‘A Programming Language’ book. Trees seem more natural to me than nested arrays, so since trees are in the book anyway it seems curious to me that nested arrays were chosen. Some of the tree operations from the book were added to J.

The tree notation seems nice for some things - e.g. if you have a probability tree T then the probability to reach each leaf is ×/T and the probability to reach each node is ×\T (if / and \ are taken as acting along paths). And writing // for folding across levels, +//×\T gives a vector of ones (total probability at each level is one).

Yeah I thought that was cool too. It certainly helps to explain why APL-ish langauge REPLs are so blazing fast. I wonder what other tricks they utilize to minimize data movement…

The strided representation of multidimensional arrays allows a lot of really nice properties to just sort of naturally fall out. I find the representation particularly beautiful.

I do too! Phil Abrams’ 1970 thesis on An APL Machine (pdf) has a really interesting discussion of ‘beating’, a process introduced to use the strided representation to carry out various operations (transpose, rotate, take, drop, …) efficiently.

It also introduces ‘drag-along’ which is a sort of lazy evaluation of APL expressions - I actually don’t know if this is used widely/at all in APL implementations. It for example allows you to calculate 3↑2×-V by only negating and doubling the first three elements of V rather than working on the whole vector V then taking the first three elements at the end (example from An APL Machine). I’d like to try implementing drag-along one day.

Interesting. I have some tentative plans to implement something like that in my apl interpreter, but I didn’t know there was any interest in it already. In that particular example, I don’t think it gains you very much; you can just rewrite as

`2×-3↑V`

, which is probably more clear in context. (Or even`¯2×3↑V`

.) I’m much more interested in it in for reductions. If you have`+/⍳1e9`

, there’s no reason that should take up 8gb of memory. So you can make`⍳`

return a generator, which is basically a bit of state plus a function that can return the next number in the sequence. (Actually, you probably want to make it produce 1000 or so numbers at once, but that’s a separate refinement.) A related paper is Extending APL to Infinity, which wants to implement those ideas as part of the language itself, rather than just the implementation. Personally, I don’t think the method presented in that paper is the right way to do it, but it’s an interesting proposal.Thanks that looks like an interesting paper! Will read it slowly.

There’s a cool language FAC (A Functional APL Language) designed by H. Tu and Alan Perlis which adds infinite arrays and laziness to APL too, e.g. from the IEEE paper:

From my skim it seems to be similar in spirit to the infinite arrays in the Extending APL paper (representing them as functions on indices).

Also in one of the more recent versions of k ⍳/! really did return an unevaluated range which I think acted like a generator - but it looks like in the version I just downloaded this has been removed…

It was probably k7. I don’t think you can download it anymore (and I can’t find my old copy…), but there’s a web version, where

`!1000000000`

evaluates correctly. The capabilities of that version seem to be pretty limited, though;`1+!1000000000`

works, but`+/!1000000000`

and`2*!1000000000`

give a ws error.Right. There’s a big problem with this. Without the ability to make semantic assertions about these functions, it’s difficult to use them effectively. They give an example which computes e; here is another for π:

The problem being, of course, that you can’t

actuallyapply reductions to arbitrary unbounded sequences. First off, reduction is right-associative, so something like`-/⍳_`

won’t work, since you’d have to start at the end of the sequence, and thereisno end. So, fine; restrict it to associative functions. Maybe even just`+`

and`×`

. Well, then you need the sequence to converge; either to 0 in the case of`+`

, or to ±1 in the case of`×`

. (Or, for`×`

, it can have a 0 anywhere, but that’s not very interesting.) An actual APLimplementationdoesn’t know that, though; it will just keep applying the reduction until the difference between successive elements of the scan is less than some epsilon. So what do you do in the case of a sequence where the first 6000 elements look like they’re going to converge, but then the sequence diverges wildly? It’s not a particularlylikelyevent, but it is possible, and it will happen atsomepoint, and then the language will have failed its users.My proposed solution is a full-on computer algebra system based on APL. So you can have infinite sequences, but the language

knowsabout them rather than just blindly querying a function for their elements. If you try to apply a reduction to one, the implementation has to prove that the scan converges, and then returns a symbolic result, rather than just one that happens to be within an epsilon of the previous result.Incidentally, though, I think modeling arrays as functions that map indices to elements is a great idea. (As a part of the language, not the implementation.) I’m currently working on an informal paper showing that, with forks and another almost inconsequential extension, arrays are isomorphic to a subset of functions. (It’s even better: we get prefix agreement with scalar functions.) It also leads to a generalised version of the rank operator, and with another small extension it can replace outer product too.

Thanks this is very interesting! For what it’s worth I would certainly use an APL-ish computer algebra system. I used to use Mathematica a bit and something capturing its symbolic capability in a small number of primitives would be really fun to use I think (could be fun choosing the primitives too).

Also your informal paper sounds intriguing - I’d certainly be very interested to read it when it’s done!

If I may ask, is your interpreter available anywhere? I’d love to see it.

So after my first quick read of the paper in question, it defines what are called “J-vectors”. Basically, a triple of

`(direction,start,count)`

. I don’t know how that compares to the “Extending APL to Infinity” mechanism; thanks for the link there!Thank you for that, I’ll look over that this weekend!

It predates APL2/nested arrays; I wonder if it would be difficult to add those in to the design (without having read any of the paper it’s going to be either a non-issue and trivial, or impossible).

Tangentially: Sometimes I imagine a parallel universe in which APL didn’t add nested arrays but instead implemented the notation for trees in the ‘A Programming Language’ book. Trees seem more natural to me than nested arrays, so since trees are in the book anyway it seems curious to me that nested arrays were chosen. Some of the tree operations from the book were added to J.

The tree notation seems nice for some things - e.g. if you have a probability tree T then the probability to reach each leaf is ×/T and the probability to reach each node is ×\T (if / and \ are taken as acting along paths). And writing // for folding across levels, +//×\T gives a vector of ones (total probability at each level is one).

Yeah I thought that was cool too. It certainly helps to explain why APL-ish langauge REPLs are so blazing fast. I wonder what other tricks they utilize to minimize data movement…