The syntax does 1:1 map into Python. Python has long placed code being beautiful (clean; readable out loud) as a top design constraint. Zig does away with the range() and zip() constructions.
Zig: for (elems) |x| vs Python: for x in elems.
Zig: for (a..b) |n| vs Python: for n in range(a, b), plus Python's extension for n in range(a, b, stepvalue).
Zig: for (elems, nats) |e, n| vs Python: for e, n in zip(elems, nats).
Zig: for (elems, nats, 0..) |e, n, idx| is messy in Python. Usually Python: for idx, (e, n) in enumerate(zip(elems, nats)). One could do Python: for e, n, idx in zip(elems, nats, itertools.count()), but it seems less clear.
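For reference, the Python forms listed above can be run side by side. This is a minimal sketch: elems and nats reuse the names from the Zig examples, and the values are made up for illustration.

```python
import itertools

# Placeholder data; the names mirror the Zig examples, the values are invented.
elems = ["water", "earth", "fire"]
nats = ["tribes", "kingdom", "nation"]

# for (elems) |x|
single = [x for x in elems]

# for (a..b) |n|  (half-open, like Python's range)
ranged = [n for n in range(2, 5)]

# for (elems, nats) |e, n|
paired = [(e, n) for e, n in zip(elems, nats)]

# for (elems, nats, 0..) |e, n, idx|, written both ways
with_enumerate = [(e, n, idx) for idx, (e, n) in enumerate(zip(elems, nats))]
with_count = [(e, n, idx) for e, n, idx in zip(elems, nats, itertools.count())]
```

Both spellings of the indexed loop build the same triples; the difference is only in how the index is threaded through.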
There are some languages, like GDScript, that back away from these constructs because they encourage a style of programming that creates intermediate lists. This is perceived as slower, though that perception can be wrong. For example, for a in [i for i in thing if rare_condition(i)] has an intermediate list and is marginally faster than using an intermediate iterator. At the least, it's complicated.
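Whether the intermediate list or the iterator wins depends on the data and the interpreter, so it is worth measuring rather than assuming. Below is a small sketch using timeit. The names thing and rare_condition are hypothetical stand-ins for the ones in the comment, and no particular timing result is implied.

```python
import timeit

# Hypothetical stand-ins for the names used in the comment above.
thing = list(range(100_000))

def rare_condition(i):
    return i % 1000 == 0

def with_list():
    # Builds the full intermediate list before the loop starts.
    total = 0
    for a in [i for i in thing if rare_condition(i)]:
        total += a
    return total

def with_generator():
    # Filters lazily; no intermediate list is created.
    total = 0
    for a in (i for i in thing if rare_condition(i)):
        total += a
    return total

if __name__ == "__main__":
    for fn in (with_list, with_generator):
        seconds = timeit.timeit(fn, number=20)
        print(f"{fn.__name__}: {seconds:.3f}s")
```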
It would be good for a simple nomenclature to cover these so that all languages would provide this functionality.
This looks pretty cohesive with the rest of Zig, definitely a welcome addition. I assume inline for gets this new behavior as well? That isn’t made clear in the post (granted Zig newcomers wouldn’t know what that is).
There’s only one simple rule when it comes to the length of the sequences: all lengths must match. Passing arrays of different length is safety-checked UB (i.e. you will get a panic in safe release modes).
Does this mean you have to check the sizes of the sequences before you use them in non-safe release modes?
As an example, here is how this new syntax affected std.mem.copy:
--- a/lib/std/mem.zig
+++ b/lib/std/mem.zig
@@ -196,13 +196,8 @@ test "Allocator.resize" {
 /// Copy all of source into dest at position 0.
 /// dest.len must be >= source.len.
 /// If the slices overlap, dest.ptr must be <= src.ptr.
 pub fn copy(comptime T: type, dest: []T, source: []const T) void {
-    // TODO instead of manually doing this check for the whole array
-    // and turning off runtime safety, the compiler should detect loops like
-    // this and automatically omit safety checks for loops
-    @setRuntimeSafety(false);
-    assert(dest.len >= source.len);
-    for (source, 0..) |s, i|
-        dest[i] = s;
+    for (dest[0..source.len], source) |*d, s|
+        d.* = s;
 }
The TODO is solved, the @setRuntimeSafety is no longer needed, and the assertion is performed by the slice of dest to match the source length.
This was not a motivating source code example when deciding on the new syntax - it was a happy side effect of the change.
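A rough Python analogue of the new copy, for readers more at home with zip. This is only a sketch: Python slices clamp silently instead of checking bounds, so the length check that Zig's dest[0..source.len] performs is written out by hand here, and the function name copy_into is made up for this example.

```python
def copy_into(dest, source):
    """Copy all of source into dest starting at position 0."""
    # Zig's slice dest[0..source.len] would trap here in safe builds;
    # Python slicing clamps instead, so the check is explicit.
    if len(source) > len(dest):
        raise IndexError("source is longer than dest")
    # Walk the positions of dest and the elements of source in lockstep.
    for i, s in zip(range(len(source)), source):
        dest[i] = s

buf = [0, 0, 0, 0, 0]
copy_into(buf, [7, 8, 9])
```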
To my understanding, yes. Though I expect in practice this wouldn’t be an issue. One of the motivating examples for this feature is StructOfArrays. Those arrays are all the same length so there wouldn’t be a concern in that case.
And in the case of iterating over multiple arrays/slices with potentially differing lengths: without this for loop syntax, you would need to do bounds checking before or during the loop anyway, so I think the behavior of the for loop makes sense. Rather than have the for loop truncate any longer slices, the surrounding code handles mismatched lengths, which is more explicit.
Thanks. I see how this feature would make SOA code neater.
What still makes me uneasy about this is that if you don’t check your bounds (because you assume the arrays are the same length), and you run a non-safe release this would be UB.
Unless I got this wrong, there’s no bounds checking in that case, right?
That is correct. I just tested. In a safe mode there is a runtime panic “for loop over objects with non-equal lengths” and in an unsafe release mode it seems to iterate for the length of the first array entry, with no bounds checking.
I think this fits well with Zig’s approach to safety. In a more critical piece of code you could use ReleaseSafe with bounds checking. Then a block of code that needs particular optimizations could have @setRuntimeSafety(false) to remove bounds checks for example.
Common Lisp’s ‘loop’ has had this forever, and it’s more flexible, with cleaner syntax:
(let ((my-string "this is a test of looping")
      (my-list '(this is a list for looping multiple-times)))
  (loop
    ;; Initial value expression, with a "then" expression for later values
    :for offset = 0
      :then (+ 1 offset (length word))
    ;; Same expression every time through
    :for word = (ju:get-word my-string :at offset)
    ;; Successive overlapping pairs from my-list (stepping one cons at a time)
    :for (sym1 sym2) :on my-list :by #'cdr
    ;; Just loop over the elements
    :for sym3 :in my-list
    ;; From 0 up to, but not including, 4
    :for num :below 4
    :collect (list num word sym1 sym2 sym3)))
And it produces:
((0 "this" THIS IS THIS) (1 "is" IS A IS) (2 "a" A LIST A) (3 "test" LIST FOR LIST))
Some people hate it, but I’ll sometimes have an entire algorithm or function boil down to a tidy (loop).
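For comparison with the Python forms discussed earlier in the thread, here is an approximate zip-based rendering of the same traversal. It is a sketch, not a translation: ju:get-word is the commenter's own helper, so str.split() stands in for it as an assumption, and the symbols become plain strings.

```python
my_string = "this is a test of looping"
my_list = ["this", "is", "a", "list", "for", "looping", "multiple-times"]

# Assumption: splitting on whitespace approximates the offset-based word lookup.
words = my_string.split()

# Overlapping pairs (1st, 2nd), (2nd, 3rd), ... like ":on my-list :by #'cdr".
pairs = zip(my_list, my_list[1:])

# zip stops at the shortest input, here range(4), much as the loop ends at ":below 4".
rows = [
    (num, word, sym1, sym2, sym3)
    for num, word, (sym1, sym2), sym3 in zip(range(4), words, pairs, my_list)
]
```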
(loop :for el :in '("water" "earth" "fire" "air")
      :for n :in '("tribes" "kingdom" "nation" "nomads")
      :do
         (print `(,el ,n)))
I think the (loop) is clearer because I can read it like a sentence and it’s clearer which variables get elements from which list, and it easily extends to any number of lists and other types of iteration. In the Zig code it’s not immediately clear how “elems”, “nats”, “e”, and “n” are related, and really not clear how I’d extend it for more lists or other types of iteration.
This looks very productive and reminds me of Python. If somebody from the Go core team reads this: please, please take inspiration from this!
Python’s popularity is a sign of the end times. Prove me wrong.
That’s a neat use of slices. So slicing checks the bounds, and aborts if they are wrong, e.g. if the source is longer than the dest?
Correct.
this also functions as a nice reminder that postfix deref is pretty neat
Thanks for the detailed explanation :)
So you could have a whole app compiled as safe release, and disable safety selectively. Nice.
You’re welcome, and that’s correct
Looks like this is basically a syntactic version of zip. (Presumably with some performance improvement over zip due to not having to build tuples.)
From the name I was expecting it might be like Clojure’s for macro, which is more like nested loops (but with filtering and binding).
The syntax seems a little clunky.
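The distinction drawn above can be seen in Python: zip walks the inputs in lockstep, whereas a comprehension with several for clauses behaves more like the nested loops with filtering described for Clojure's for. A minimal sketch with invented data:

```python
xs = [1, 2, 3]
ys = [10, 20, 30]

# Lockstep iteration: one pair per position, like the Zig multi-object for.
lockstep = [(x, y) for x, y in zip(xs, ys)]

# Nested iteration with a filter: every combination of x and y where x is odd.
nested = [(x, y) for x in xs for y in ys if x % 2 == 1]
```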
By what metric is this cleaner? It seems pretty cluttered to me. More flexible undoubtedly though.
Maybe I overdid it with the extras. The shorter two-list (loop) above is a simple example from the article rewritten the same way.
This looks pretty cool. Is the design of this feature related to the design decision of putting the element after the iterator?
I don’t know Zig, but from what I’ve seen it seems like Zig goes to great lengths to make things read fluently from left to right. Putting the iterator before the element binding falls in line with that. See also the dereference syntax in this post, x.*, which I think is brilliant.
That is neat. I think a postfix dereference makes a lot of sense, too.
That was also the case before the recent changes to for loops.