Threads for jmiven

  1. 3

    How far does one have to navigate the docs before seeing some code?

    1. 9

      You only have to click on the well-named “Introductory example”, in the upper-left corner.

    1. 1

      I suggest adding the date (2012) to the title.

      1. 1

        The talk mentioned in the title starts at 4:21:45.

        1. 3

          Note that despite the tag, this is not a mac-only app.

          1. 1

            Thank you, I hadn’t realized that and didn’t even bother to read the page!

            I’ve suggested the removal of the tag.

          1. 0

            I hated this. In my opinion nothing was gained from this exercise that couldn’t have been gleaned with a simple simulation of surface deformation for the curves of the font.

            Masking this as an exploration of learning “on nature’s terms” is just a pretentious way of carving kyle+karen=love surrounded by a little heart in the middle of the forest. I was taught that only assholes do that.

            1. 28

              You could have made the same (correct, imo) point without adding an insult.

              1. 8

                I do feel strongly about a guy putting words like these on the internet:

                The project challenges how we humans are terraforming and controlling nature to their own desires, which has become problematic to an almost un-reversible state. Here the roles have been flipped, as nature is given agency to lead the process, and the designer is invited to let go of control and have nature take over.

                And then terraforms nature by carving into a tree and pretending nature has now somehow gained “agency” by his self-interpreted benevolent act. Give me a break.

                1. 20

                  Give me a break.

                  maybe take one. If you get angry at some random artsy website, you may need some time away from computers. I know for sure that I need that occasionally.

              2. 24

                There are a few points in the writeup that separate this from the random infatuation carving in the forest:

                • The tree was planted by his parents.
                • The right type and age of tree were required before a carving could take place.
                • The author is aware of how to carve tree bark without harming it.
                • “No trees were harmed in this experiment.”

                It’s clear that, five years after the carving, it has healed successfully and the tree is healthy.

                1. 2

                  The author is aware of how to carve tree bark without harming it.

                  This claim is not backed by evidence. The author is aware of how not to kill a tree on a 1-2 year timescale via girdling, but you can’t reasonably claim that it wasn’t harmed.

                2. 4

                  Do you think a simple simulation of surface deformation is somehow free? Or that other forms of artistic expression should be condemned for their slight environmental impact because they don’t produce a quantifiable value?

                  Like I just can’t wrap my head around any kind of internally consistent worldview where this response makes sense. It feels like, instead of being a principled stand in defense of nature, it’s a backlash against what you see as pretentiousness, but actively setting out to shut down projects that make other people happy for no other reason than that you don’t like them is a lot more asshole behavior than carving things on trees.

                1. 3

                  You won’t find it yet in the release notes, but GCC 12.1 is finally able to compile recent D code, thanks to a big bump of the version of the D frontend integrated in GCC.

                  The maintainer, Iain Buclaw, said that it will be documented more officially later.

                  1. 1

                    Oh, thank you for pointing this out. This is a huge accomplishment for D!

                  1. 5

                    I always cringe a bit when I read things like:

                    However, the most recent major update of text changed its internal string representation from UTF-16 to UTF-8.

                    One of the biggest mistakes that a language can make is to have a string representation. Objective-C / OpenStep managed to get this right and I’ve seen large-scale systems doubling their transaction throughput rate by having different string representations for different purposes.

                    This is particularly odd for a language such as Haskell, which excels at building abstract data types. This post is odd in that it demonstrates an example of the benefits of choosing a string representation for your workload (most of their data is ASCII, stored as UTF-8 to handle the cases where some bits aren’t), yet the entire post is about moving from one global representation to another.

                    For their use, if most of their data is ASCII, then they could likely get a big performance boost from having two string representations (see the sketch below):

                    • A unicode string stored as UTF-8, with a small (lazily-built - this is Haskell, after all) look-aside structure to identify code points that span multiple code units.
                    • A unicode string stored as ASCII, where every code point is exactly one byte.
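
                    A rough sketch of that split (hypothetical names of my own, written against the real bytestring/text APIs; this is not how the text library itself works):

                      import qualified Data.ByteString          as B
                      import qualified Data.ByteString.Char8    as BC
                      import qualified Data.Text                as T
                      import qualified Data.Text.Encoding       as TE
                      import qualified Data.Text.Encoding.Error as TEE

                      -- One abstract string, two concrete representations.
                      data Str
                        = Ascii B.ByteString  -- every code point is exactly one byte
                        | Uni   T.Text        -- general Unicode (UTF-8 internally in text >= 2.0)

                      -- Pick the representation once, when the data enters the system.
                      fromBytes :: B.ByteString -> Either TEE.UnicodeException Str
                      fromBytes bs
                        | B.all (< 0x80) bs = Right (Ascii bs)
                        | otherwise         = Uni <$> TE.decodeUtf8' bs

                      -- O(1) on the ASCII representation, a linear scan on the general one.
                      charAt :: Str -> Int -> Maybe Char
                      charAt (Ascii bs) i
                        | i >= 0 && i < B.length bs = Just (BC.index bs i)
                        | otherwise                 = Nothing
                      charAt (Uni t) i
                        | i >= 0 && T.compareLength t i == GT = Just (T.index t i)
                        | otherwise                           = Nothing

                    Callers only ever see Str; which branch they hit is a representation detail, which is the point being made above.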
                    1. 6

                      One of the biggest mistakes that a language can make is to have a string representation.

                      By this optic, we are in luck! Haskell has ~6 commonly used string types: String, Text, lazy Text, ByteString, lazy ByteString, ShortByteString and multiple commonly used string builders! /i

                      I am very happy with the text transition to UTF-8. Conversions from ByteString are now just a UTF-8 validity check and buffer copy and in the other direction a zero-copy wrapper change.

                      1. 4

                        I think what David is saying is that ObjC has one string type (NSString/NSMutableString) with several underlying storage representations, including ones that pack short strings into pointers. That fact does not bubble up into several types at the surface layer.

                        1. 3

                          Exactly as @idrougge says: a good string API decouples the abstract data type of a string (a sequence of unicode code points) from the representation of a string and allows you to write efficient code that operates over the abstraction.

                          NSString (OpenStep’s immutable string type) requires you to implement two methods:

                          • length returns the number of UTF-16 code units in the string (this is a bit unfortunate, but OpenStep was standardised just before UCS-2 stopped being able to store all of unicode. This was originally the number of unicode characters.)
                          • characterAtIndex: returns the UTF-16 code unit at a specific index (again, designing this now, it would be the unicode character).

                          There is also an optional -copyCharacters:inRange:, which amortises Objective-C’s dynamic dispatch cost and bounds checking costs by performing a batched sequence of -characterAtIndex: calls. You don’t have to provide this, but things are a lot faster if you do (the default implementation calls -characterAtIndex: in a loop). You can also provide custom implementations of various other generic methods if you can do them more efficiently in your implementation (for example, searching may be more efficient if you convert the needle to your internal encoding and then search).
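
                          In Haskell terms (a hypothetical class of my own, not any real library), the same shape is a couple of required primitives plus bulk operations that have defaults but can be overridden:

                            import Data.Word (Word16)

                            -- Two required primitives; everything else gets a default that a
                            -- representation can override when it can do better (e.g. a contiguous
                            -- buffer would replace the loop below with a slice copy).
                            class StringRep s where
                              unitLength :: s -> Int               -- number of UTF-16 code units
                              unitAt     :: s -> Int -> Word16     -- the code unit at a given index

                              unitsInRange :: s -> Int -> Int -> [Word16]
                              unitsInRange s from to = map (unitAt s) [from .. to - 1]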

                          There are a couple of lessons that ICU learned from this when it introduced UText. The most important is that it’s often useful to be able to elide a copy. The ICU version (and, indeed, the Objective-C fast enumeration protocol, which sadly doesn’t work on strings) provides a buffer and allows you to either copy characters to this buffer, or provide an internal pointer, when asked for a particular range and allows you to return fewer characters than are asked for. If your internal representation is a linked list (or skip list, or tree, or whatever) of arrays of unicode characters then you can return each buffer in turn while iterating over the string.

                          The amount of performance that most languages leave on the floor from mandating that text is either stored in contiguous memory (or users must write their entire set of text-manipulation routines without being able to take advantage of any optimised algorithms in the standard library) is quite staggering.

                          1. 4

                            a good string API decouples the abstract data type of a string (a sequence of unicode code points) from the representation of a string and allows you to write efficient code that operates over the abstraction.

                            How, when different abstractions have different tradeoffs? ASCII is single-byte, UTF-8 and UTF-16 are not, and so indexing into them at random character boundaries is O(1) vs. O(n). The only solution to that I know of is to… write all your code as if it were a variable-length string encoding, at which point your abstract data type can’t do as well as a specialized data type in certain cases.

                            1. 3

                              Tangentially, you can find the start of the next (or previous) valid codepoint from a byte index into a UTF8 or UTF16 string with O(1) work. In UTF8, look for the next byte that doesn’t start with “0b10” in the upper two bits. In a known valid UTF-8 string it’ll occur within at most 6 bytes. :)
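
                              A small sketch of that scan over a strict ByteString (my own function; it assumes the input is valid UTF-8):

                                import           Data.Bits       ((.&.))
                                import qualified Data.ByteString as B

                                -- From a byte offset, find the offset of the next byte that is not a
                                -- continuation byte (0b10xxxxxx). For valid UTF-8 the loop runs only a
                                -- bounded number of times, so this is O(1) work.
                                nextBoundary :: B.ByteString -> Int -> Int
                                nextBoundary bs i
                                  | i >= B.length bs              = B.length bs
                                  | B.index bs i .&. 0xC0 /= 0x80 = i                      -- a lead byte (or ASCII)
                                  | otherwise                     = nextBoundary bs (i + 1)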

                              (Indexing into a unicode string at random codepoint indices is not a great thing to do because it’s blind to grapheme cluster boundaries.)

                              Serious question, have you ever actually indexed randomly into ASCII strings as opposed to consuming them with a parser? I can’t personally think of any cases in my career where fixed-width ASCII formats have come up.

                              1. 2

                                Serious question, have you ever actually indexed randomly into ASCII strings as opposed to consuming them with a parser? I can’t personally think of any cases in my career where fixed-width ASCII formats have come up.

                                I have, yes, but only once for arbitrary strings. I was writing a simple mostly-greedy line-breaking algorithm for fixed-width fonts, which started at character {line length} and then walked forwards and backwards to find word breaks and to find a hyphenation point. Doing this properly with the dynamic programming algorithm from TeX, in contrast, requires iterating over the string, finding potential hyphenation points, assigning a cost to each one, and finally walking the matrix to find the minimal cost for the entire paragraph.

                                I’ve also worked with serialised formats that used fixed-width text records. For these, you want to split each line on fixed character boundaries. These are far less common today, when using something like JSON adds a small amount of size (too much in the ’80s, negligible today) and adds a lot more flexibility.

                                For parallel searching, it’s quite useful to be able to jump to approximately half (/ quarter / eighth / …) of the way along a string, but that can be fuzzy: you don’t need to hit the exact middle, if you can ask for an iterator about half way along then the implementation can pick a point half way along and then scan forwards to find a character boundary.

                                More commonly, I’ve done ‘random access’ into a string because integers were the representation that the string exposed for iterators. It’s very common to iterate over a string, and then want to backtrack to some previous point. The TeX line breaking case is an example of this: For every possible hyphenation point, you capture a location in the string when you do the forward scan. You then need to jump to those points later on. For printed output, you probably then do a linear scan to convert the code points to glyphs and display them, so you can just use an integer (and insert the hyphen / line break when you reach it), but if you’re displaying on the screen then you want to lay out the whole paragraph and then skip to the start of the first line that is partially visible.

                                ICU’s UText abstraction is probably the best abstract type that I’ve seen for abstracting over text storage representations. It even differentiates between ‘native’ offsets and code unit offsets, so that you can cache the right thing. The one thing I think NSString does better is to have a notion of the cheapest encoding to access. I’d drop support for anything except the unicode serialisations in this, but allow 7-bit ASCII (in 8-bit integers), UTF-8, UTF-16, UTF-32 (and, in a language that has native U24 support, raw unicode code points in 24-bit integers) so that it’s easy to specialise your algorithm for a small number of cases that should cover any vaguely modern data and just impose a conversion penalty on people bringing data in from legacy encodings. There are good reasons to prefer three of the encodings from that list:

                                • ASCII covers most text from English-speaking countries and is fixed-width, so cheap to index.
                                • UTF-8 is the densest encoding for any alphabetic language (important for cache usage).
                                • UTF-16 is the densest encoding for CJK languages (important for cache usage).

                                UTF-32 and U24 unicode characters are both fixed-width encodings (where accessing a 32-bit integer may be very slightly cheaper than a 24-bit one on modern hardware), though it’s still something of an open question to me why you’d want to be able to jump to a specific unicode code point in a string, even though it might be in the middle of a grapheme cluster.

                                Apple’s NSString implementation has a 6-bit encoding for values stored in a single pointer, which is an index into a tiny table of the 64 most commonly used characters based on some large profiling thing that they’ve run. That gives you a dense fixed-width encoding for a large number of strings. When I added support for hiding small (7-bit ASCII) strings in pointers, I reduced the number of heap allocations in the desktop apps I profiled by over 10% (over 20% of string allocations), I imagine that Apple’s version does even better.

                              2. 1

                                I’ve written code in Julia that uses the generic string functions and then have passed in an ASCIIStr instead of a normal (utf8) string and got speedups for free (i.e. without changing my original code).

                                Obviously if your algorithm’s performance critically depends on e.g. constant time random character access then you’re not going to be able to just ignore the string type, but lots of the time you can.

                                1. 1

                                  indexing into them at random character boundaries is O(1) vs. O(n).

                                  Raku creates synthetic codepoints for any grapheme that’s represented by multiple codepoints, and so has O(1) indexing. So that’s another option/tradeoff.

                                  1. 1

                                    Julia similarly allows O(1) indexing into its utf8 strings, but will throw an error if you give an index that is not the start of a codepoint.

                                    1. 3

                                      But that’s just UTF-8 code units, i.e. bytes; you can do that with C “strings”. :)

                                      Not grapheme clusters, not graphemes, not even code points, and not what a human would consider a character.

                                      If you have the string "þú getur slegið inn leitarorð eða hakað við ákveðinn valmöguleika" and want to get the [42]nd letter, ð, indexing into bytes isn’t that helpful.

                                      1. 1

                                        Oh, I see I misunderstood. So Raku is storing vectors of graphemes with multi-codepoint graphemes treated as a codepoint. Do you know how it does that? A vector of 32-bit codepoints with the non-codepoint numbers given over to graphemes + maybe an atlas of synthetic codepoint to grapheme string?

                                  2. 1

                                    How, when different abstractions have different tradeoffs? ASCII is single-byte, UTF-8 and UTF-16 are not, and so indexing into them at random character boundaries is O(1) vs. O(n).

                                    Assuming that your data structure is an array, true. For non-trivial uses, that’s rarely the optimal storage format. If you are using an algorithm that wants to do random indexing (rather than small displacements from an iterator), you can build an indexing table. I’ve seen string representations that store a small skip list so that they can rapidly get within a cache line of the boundary and then can do a linear scan (bounded to 64 bytes, so O(1)) to find the indexing point.

                                    If you want to be able to handle insertion into the string then a contiguous array is one of the worst data structures because inserting a single character is an O(n) operation in the length of the string. It’s usually better to provide a tree of bounded-length contiguous ranges and split them on insert. This also makes random indexing O(log(n)) because you’re walking down a tree, rather than doing a linear scan.
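
                                    A toy version of that shape (my own sketch; a real implementation would cap chunk sizes and rebalance):

                                      import qualified Data.Text as T

                                      -- A tree of Text chunks, caching the length of the left subtree at each node.
                                      data Rope
                                        = Leaf T.Text
                                        | Node Int Rope Rope   -- Int = number of Chars in the left subtree

                                      -- Indexing walks the tree: O(log n) when the tree is balanced.
                                      charAt :: Rope -> Int -> Char
                                      charAt (Leaf t)     i = T.index t i
                                      charAt (Node l a b) i
                                        | i < l     = charAt a i
                                        | otherwise = charAt b (i - l)

                                      -- Insertion splits at most one leaf instead of shifting the whole string.
                                      insertAt :: Int -> T.Text -> Rope -> Rope
                                      insertAt i new (Leaf t) =
                                        let (before, after) = T.splitAt i t
                                        in Node (T.length before) (Leaf before)
                                                (Node (T.length new) (Leaf new) (Leaf after))
                                      insertAt i new (Node l a b)
                                        | i <= l    = Node (l + T.length new) (insertAt i new a) b
                                        | otherwise = Node l a (insertAt (i - l) new b)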

                                  3. 1

                                    I really miss working in the NS* world.

                                  4. 2

                                    ByteString isn’t a string type though, it’s a binary sequence type. You should never use it for text.

                                    1. 3

                                      ByteString is the type you read UTF-8 encoded data into, and then validate that it is properly encoded before converting it into a Text - it is widely used in places where people use “Strings” in other languages, such as IO, because it is the intermediate representation of specific bytes. It fits very well with the now-common Haskell mantra of [parse, don’t validate](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) - we know we have some data, and we need a type to represent it; we parse it into a Text which we then know is definitely valid (which these days is just a zero-copy validation from a UTF-8 encoded ByteString). It’s all semantics, but we’re quite happy talking about bytestrings as one of the string types, because it represents a point in the process of dealing with textual data. Not all ByteStrings are text, but all texts can be ByteStrings.
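
                                      Concretely, that boundary is decodeUtf8' (a real function from Data.Text.Encoding; the wrapper names here are mine):

                                        import qualified Data.ByteString          as B
                                        import qualified Data.Text                as T
                                        import           Data.Text.Encoding       (decodeUtf8', encodeUtf8)
                                        import           Data.Text.Encoding.Error (UnicodeException)

                                        -- Parse, don't validate: the only way from raw bytes to Text is through
                                        -- a UTF-8 validity check, so a Text in hand is known to be well formed.
                                        parseText :: B.ByteString -> Either UnicodeException T.Text
                                        parseText = decodeUtf8'

                                        -- The other direction never fails.
                                        renderBytes :: T.Text -> B.ByteString
                                        renderBytes = encodeUtf8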

                                  5. 2

                                    This comment reads very much like you’re quite ignorant of the actual state of strings in Haskell, particularly given how many people complain that we have too many representations.

                                    Also, this article is specifically about code which relies on internal details of a type, so I’m not sure how your suggestions help at all - this algorithm would need to be written for the specific representations actually used to be efficient.

                                    One thing I have wanted to do for a while is add succinct structures to UTF-8 strings which allow actual O(1) indexing into the data, but that’s something that can be built on top of both the Text and ByteString types.

                                    1. 1

                                      It sounds like you missed the /i in the parent post. I know, it’s subtle ;)

                                      1. 1

                                        That is not the parent post. Axman6 was replying to David. :)

                                        1. 1

                                          argh, thread’s too too long :)

                                      2. 1

                                        This comment reads very much like you’re quite ignorant of the actual state of strings in Haskell, particularly given how many people complain that we have too many representations.

                                        I don’t use Haskell but the complaints that I hear from folks that do are nothing to do with the number of representations, they are to do with the number of abstract data types that you have for strings and the fact that each one is tied to a specific representation.

                                        Whether text is stored as a contiguous array of UTF-{8,16,32} or ASCII characters, as a tree of runs of characters in some encoding, embedded in an integer, or in some custom representation specifically tailored to a specific use should affect performance but not semantics of any of the algorithms that are built on top. You can then specialise some of the algorithms for a specific concrete representation if you determine that they are a performance bottleneck in your program.

                                        One thing I have wanted to do for a while is add succinct structures to UTF-8 strings which allow actual O(1) indexing into the data, but that’s something that can be built on top of both the Text and ByteString types.

                                        It’s something that can be built on top of any string abstract data type but cannot be easily retrofitted to a concrete type that exposes the implementation details without affecting the callers.

                                        1. 1

                                          number of abstract data types that you have for strings and the fact that each one is tied to a specific representation

                                          The types are the representations.

                                          You can write algorithms that would work with any of String and Text and Lazy.Text in Haskell using the mono-traversable package.

                                          However, that whole bunch of complexity is only justified if you’re writing a library of complex reusable text algorithms without any advanced perf optimizations. Otherwise in practice there just doesn’t seem to be that much demand for indirection over string representations. Usually a manual rewrite of an algorithm for another string type is faster than adding that whole package.
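
                                        For what it’s worth, a small example of that indirection (olength and oall are real functions from Data.MonoTraversable; the predicate itself is made up):

                                          {-# LANGUAGE FlexibleContexts #-}
                                          {-# LANGUAGE TypeFamilies     #-}
                                          import           Data.Char            (isAscii)
                                          import           Data.MonoTraversable (Element, MonoFoldable, oall, olength)
                                          import qualified Data.Text            as T
                                          import qualified Data.Text.Lazy       as TL

                                          -- One definition that works for String, strict Text and lazy Text alike.
                                          isShortAscii :: (MonoFoldable s, Element s ~ Char) => s -> Bool
                                          isShortAscii s = olength s <= 80 && oall isAscii s

                                          main :: IO ()
                                          main = print ( isShortAscii ("hello" :: String)
                                                       , isShortAscii (T.pack "hello")
                                                       , isShortAscii (TL.pack "héllo") )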

                                    1. 4

                                      Is there a particular thing on this page you wanted to draw attention to? I’m just seeing a basic quora page.

                                      1. 6

                                        Given the initial title, I believe srid meant to share this specific answer by Conal Elliott.

                                        1. 9

                                          Eh, don’t think this is really lobsters-quality. The specific answer is “Read this, also imperative programming isn’t rigorous because it isn’t.” Not much to discuss there.

                                          A few years back I ran a challenge to see if this was really true and the general results were that verifying FP programs wasn’t actually much easier than verifying imperative programs.

                                          1. 2

                                            Is there a repository that collects all the solutions to your challenge? I’d love to be able to compare them.

                                            1. 3
                                          2. 1

                                                  Thanks, this is very interesting. I believe the differences between imperative and functional programming can mostly be reconciled through the concatenative calculus, though. That’s essentially what I’m trying to do with Dawn. It requires a relatively complex type system, though. And it’s not clear how to integrate dependent types, which might be preferable for proofs. But I hope to eventually add proofs as a meta-language.

                                        1. 20

                                          Warning: this is not supposed to be taken very seriously. It’s not a joke, but I won’t bet 2 cents that I’m right about any of it.

                                                    Pretty much all widely used languages today have a thing. Having a thing is not, by far, the only determining factor in whether a language succeeds, and you can even question whether wide adoption is such a good measure of success. But the fact is, pretty much all languages we know and use professionally have a thing, or indeed, multiple things:

                                          • Python has simplicity, and later, Django, and later even, data science
                                          • Ruby has Rails and developer happiness (whatever that means)
                                          • Go had simplicity (same name, but a different thing than Python’s) and concurrency (and Google, but I don’t think that it counts as a thing)
                                          • PHP had web, and, arguably, Apache and cheap hosts
                                          • JavaScript has the browser
                                          • Typescript has the browser, but with types
                                          • Java had the JVM (write once, run everywhere), and then enterprise
                                          • C# had Java, but Microsoft, and then Java, but better
                                          • Rust has memory safety even in the presence of threads

                                                    Even older languages like SQL, Fortran, Cobol, they all had a thing. I can’t see what Hare’s thing might be. And to be fair, it’s not a problem exclusively with, or especially represented by, Hare. 9/10 times, when there’s a post anywhere about a new language, it has no thing. None. It’s not even that it’s not actually particularly well suited for its thing; it can’t even articulate what its thing is.

                                                    “Well, Hare’s thing is systems programming”: that’s like saying that McDonald’s thing is hamburgers. A thing is more than a niche. It’s … well, it’s a thing.

                                                    It might well be the case that you can only see a thing in retrospect (I feel like that might be the case with Python, for instance), but still, it feels like it’s missing, and not only here.

                                          1. 3

                                            It might well be the case that you can only see a thing in retrospect

                                            Considering how many false starts Java had, there was an obvious and error-ridden search process to locate the thing—first delivering portability, mainly for the benefit of Sun installations nobody actually had, then delivering web applets, which ran intolerably poorly on the machines people needed them to run on, and then as a mobile device framework that was, again, a very poor match for the limited hardware of the era, before finding a niche in enterprise web platforms. Ironically, I think Sun did an excellent job of identifying platforms in need of a thing, seemingly without realizing that their thing was a piss-poor contender for being the thing in that niche. If it weren’t for Sun relentlessly searching for something for Java to do, I don’t think it would have gotten anywhere simply on its merits.

                                            feels like it’s missing

                                            I agree, but I also think it’s a day old, and Ruby was around for years before Rails. Although I would say that Ruby’s creator did so out of a desire for certain affordances that were kind of unavailable from other systems of the time—a Smalltalk placed solidly in the Perl-Unix universe rather than isolated in a Smalltalk image. What we seem to have here is a very small itch (Zig with a simpler compiler?) being scratched very intensely.

                                            1. 2

                                              Ruby and Python were in the back of my mind the whole time I was writing the thing about things (hehe), and you have a point about Java, that thing flailed around A LOT before settling down. Very small itch is a good summary.

                                              Time will tell, but I ain’t betting on it.

                                              1. 1

                                                I’m with you. But we’ll see, I guess.

                                            2. 3

                                              Pretty much all widely used languages today have a thing. […] Even older languages like SQL, Fortran, Cobol, they all had a thing

                                              An obvious language you do not mention is C. What’s C’s thing in that framework? And why couldn’t Hare’s thing be “C, but better”, like C# is to Java? (Or arguably C++ is to C, or Zig is to C)

                                              1. 12

                                                C’s thing was Unix.

                                                1. 4

                                                        Incorrect… C’s thing was being a portable, less terrible macroassembler-ish tool.

                                                2. 3

                                                  Well, I did say a thing is not the only determinant for widespread adoption. I don’t think C had a thing when it became widely used. Maybe portability? It was the wild wild west days, though.

                                                        Hare could very well eat C’s lunch and become big. But being possible is far away from being likely.

                                                  1. 2

                                                    C’s thing is that it’s a human-friendly assembly.

                                                          strcpy is rep movsb, va_list is a stack parser, etc.

                                                    1. 5

                                                      But it’s not. At least not once you turn on optimizations. This is a belief people have that makes C seem friendlier and lower level, but there have been any number of articles posted here about the complex transformations between C and assembly.

                                                      (Heck, even assembly isn’t really describing what the CPU actually does, not when there’s pipelining and multiprocessing going on.)

                                                      1. 2

                                                        But it is. Sure, you can tell the compiler to optimize, in which case all bets are obviously off, but it doesn’t negate the fact that C is the only mainstream high-level language that gets you as close to the machine language as it gets.

                                                        That’s not a belief, it’s a fact.

                                                        1. 4

                                                          you can tell the compiler to optimize, in which case all bets are obviously off

                                                          …and since all deployed code is optimized, I’m not sure what your point is.

                                                          Any modern C compiler is basically humoring you, taking your code as a rough guideline of what to do, but reordering and inlining and unrolling and constant-folding, etc.

                                                          And then the CPU chip gets involved, and even the machine instructions aren’t the truth of what’s really going on in the hardware. Especially in x86, where the instruction set is just an interpreted language that gets heavily transformed into micro-ops you never see.

                                                          If you really want to feel like your C code tells the machine exactly what to do, consider getting one of those cute retro boards based on a Z-80 or 8086 and run some 1980s C compiler on it.

                                                          1. -1

                                                            No need to lecture and patronize if you don’t get the point.

                                                            C was built around machine code, with literally every language construct derived from a subset of the latter and nothing else. It still remains true to that spirit. If you see a piece of C code, you can still make a reasonable guess as to what it roughly translates to, even if it’s unrolled, inlined or even trimmed. In comparison with other languages, where “a += b” or “x = y” may translate into pages of binary.

                                                            Do you understand the point?

                                                            1. 2

                                                              C Is Not a Low-level Language

                                                              The post you’re replying to isn’t patronizing you, it’s telling the truth.

                                                              1. 2

                                                                You are missing the point just the same.

                                                              It’s not that C generates the exact assembly you’d expect, it’s that there’s a cap on what it can generate from a given piece of code you are currently looking at. “x = y” is a memcpy at worst, and dereferencing a pointer does more or less just that. Not the case with C++, let alone Go, D, etc.

                                                                1. 1

                                                                I suggest reading an intro to compilers class textbook. Compilers do basic optimizations like liveness analysis, dead store elimination, etc. Just because you write down “x = y” doesn’t mean the compiler will respect it and keep the load/store in your binary.

                                                                  1. -1

                                                                    I suggest trying to make a rudimentary effort to understand what others are saying before dispensing advice that implies they are dolts.

                                                              2. 2

                                                                If you see a piece of C code, you can still make a reasonable guess to what it roughly translates to.

                                                                As someone who works on a C compiler for their day job and deals with customer support around this sort of thing, I can assure you this is not true.

                                                                1. 2

                                                                  See my reply to epilys.

                                                                  Can you share an example of resulting code not being even roughly what one was expecting?

                                                                  1. 4

                                                                    Some general observations. I don’t have specific examples handy and I’m not going to spend the time to conjure them up for what is already a thread that is too deep.

                                                                    • At -O0 there are many loads and stores generated that are not expected. This is because the compiler is playing it safe and accessing everything from the stack. Customers generally don’t expect that and some complain that the code is “stupid”.
                                                                    • At -O1 and above, lots of code gets moved around, especially when inlining and hoisting code out of loops. Non-obvious loop invariants and loops that have no effect on memory (because the user forgot a volatile) regularly result in bug reports saying the compiler is broken. In nearly every case, the user expects all the code they wrote to be there in the order they wrote it, with all the function calls in the same order. This is rarely the case.
                                                                    • Interrupt code will be removed sometimes because it is not called anywhere. The user often forgets to tag a function as an interrupt and just assumes everything they write will be in the binary.
                                                                    • Our customers program microcontrollers. They sometimes need timing requirements on functions, but make the assumption that the compiler will generate the code they expect to get the exact timing requirements they need. This is a bad assumption. They may think that a series of reads and writes from memory will result in a nearly 1-1 correspondence of load/stores, but the compiler may realize that because things are aligned properly, it can be done in a single load-multiple/store-multiple instruction pair.
                                                                    • People often expect one statement to map to a contiguous region of instructions. When optimization is turned on, that’s not true in many cases. The start and end of something as “simple” as x = y can be very far apart.

                                                                    This is just from recent memory. There is really no end to it. I won’t get into the “this instruction sequence takes too many cycles” reports, as those don’t seem to match your desired criteria.

                                                                    1. 1

                                                                      Thanks. These are still more or less in the ballpark of what’s expected with optimizations on.

                                                                      I’ve run into at least a couple of these, but I can remember only one case when it was critical and required switching optimizations off to get what we needed from the compiler (it had to do with handling very small packets in the NIC driver). Did kill a day on that, though.

                                                  1. 23

                                                                      The thing is that systemd is not just an init system, given that it wants to cover a lot of areas and “seeps” into userspace. There is understandably a big concern about this, and not just one of a political nature. Many have seen the problems the pulseaudio monoculture has brought, which is a comparable case. It goes without saying that ALSA has its problems, but pulseaudio is very bloated, and other programs that do a much better job (sndio, pipewire (!)) now have a lot of trouble gaining traction (and even outright have to camouflage themselves as libpulse.so).

                                                    Runit, sinit, etc. have shown that you can rethink an init system without turning it into a monoculture.

                                                    1. 4

                                                      In theory, having all (or at least most) Linux distros on a single audio subsystem seems like a good idea. Bugs should get fixed faster, compatibility should be better, it should be easier for developers to target the platform. But I also see a lot of negativity toward PulseAudio and people seem to feel “stuck” with it now.

                                                      So where’s the line between undesirable monoculture and undesirable fragmentation?

                                                      1. 22

                                                        The Linux ecosystem is happy with some monocultures, the most obvious one is the Linux kernel. Debian has dropped support for other kernels entirely, most other distros never tried. Similarly, with a few exceptions such as Alpine, most are happy with the GNU libc and coreutils. The important thing is quality and long-term maintenance. PulseAudio was worse than some of the alternatives but was pushed on the ecosystem because Poettering’s employer wanted to control more of the stack. It’s now finally being replaced by PipeWire, which seems to be a much better design and implementation. Systemd followed the same path: an overengineered design, a poor implementation (seriously, who in the 2010s, thought that writing a huge pile of new C code to run in the TCB for your system was a good idea?) and, again, pushed because Poettering’s employer wanted to control more of the ecosystem. The fact that the problems it identifies with existing service management systems are real does not mean that it is a good solution, yet all technical criticism is overridden and discounted as coming from ‘haters’.

                                                        1. 5

                                                          seriously, who in the 2010s, thought that writing a huge pile of new C code to run in the TCB for your system was a good idea?

                                                                        I really want to agree with you here, but looking back at 2010, what other choice did he realistically have? Now it’s easy: everyone will just shout Rust, but according to Wikipedia, Rust didn’t have its first release till June, while systemd had its first release in March.

                                                          There were obviously other languages that were much safer than C/C++ around then but I can’t think of any that people would have been okay with. If he had picked D, for example, people would have flipped over the garbage collection. Using a language like python probably wasn’t a realistic option either. C was, and still is, ubiquitous just like he wanted systemd to be.

                                                          1. 3

                                                            I really want to agree with you here, but looking back at 2010 what other choice did he realistically have?

                                                            C++11 was a year away (though was mostly supported by clang and gcc in 2010), but honestly my choice for something like this would be 90% Lua, 10% modern C++. Use C++ to provide some useful abstractions over OS functionality (process creation, monitoring) and write everything else in Lua. Nothing in something like systemd is even remotely performance critical and so there’s no reason that it can’t be written in a fully garbage collected language. Lua coroutines are a great abstraction for writing a service monitor.

                                                            Rust wouldn’t even be on my radar for something like this. It’s a mixture of things that can’t be written in safe Rust (so C++ is a better option because the static analysis tools are better than they are for the unsafe dialect of Rust) and all of the bits that can could be written more easily in a GC’d language (and don’t need the performance of a systems language). I might have been tempted to use DukTape’s JavaScript interpreter instead of Lua but I’d have picked an interpreted, easily embedded, GC’d language (quickjs might be a better option than DukTape now but it wasn’t around back then).

                                                            C was, and still is, ubiquitous just like he wanted systemd to be.

                                                            Something tied aggressively to a single kernel and libc implementation (the maintainers won’t even accept patches for musl on Linux, let alone other operating systems) is a long way away from being ubiquitous.

                                                          2. 4

                                                                        In what way is PipeWire any kind of improvement on the situation? It’s >gstreamer< being rewritten by, checking notes, the same gstreamer developers - with the sole improvement over the previous design being the use of dma-buf as a primitive, and with the same problems we have with dma-buf being worse than (at least) its iOS and Android counterparts. Poettering’s employer is the same as Wim Taymans’. It is still vastly inferior to what DirectShow had with GraphEdit.

                                                          3. 14

                                                            I’ve been using Linux sound since the bad old days of selecting IRQs with dipswitches. Anyone who says things are worse under PulseAudio is hilariously wrong. Sound today is so much better on Linux. It was a bumpy transition, but that was more than a decade ago. Let it go.

                                                            1. 6

                                                              Sound today is so much better on Linux.

                                                              Mostly because of improvements to ALSA despite pulseaudio, not because of it.

                                                              1. 4

                                                                Yep! Pulseaudio routinely forgot my sound card existed and made arbitrary un-requested changes to my volume. Uninstalling it was the single best choice I’ve made with the software on my laptop in the last half decade.

                                                            2. -2

                                                              It’s no accident that PulseAudio and SystemD have the same vector, Poettering.

                                                              1. 17

                                                                The word you’re looking for is “developer”, or “creator”. More friendlysock experiment, less name-calling, please :)

                                                                1. 3

                                                                  Was Poettering not largely responsible for the virulent spread of those technologies? If so, I think he qualifies as a vector. I stand by my original wording.

                                                                  1. 7

                                                                    It’s definitely an interesting word choice. To quote Merriam-Webster: vector (noun), \ˈvek-tər,

                                                                    1. […]
                                                                      1. an organism (such as an insect) that transmits a pathogen from one organism or source to another
                                                                      2. […]
                                                                    2. an agent (such as a plasmid or virus) that contains or carries modified genetic material (such as recombinant DNA) and can be used to introduce exogenous genes into the genome of an organism

                                                                        To be frank, I mostly see RedHat’s power hunger at fault here. Mr. Poettering was merely an employee whose projects, which without doubt follow a certain ideology, fit into this monopolistic endeavour. No one is to blame for promoting their own projects, though, and many distributions quickly followed suit in adopting the RedHat technologies which we are now more or less stuck with.

                                                                        Maybe we can settle on RedHat being the vector for this, because without their publicity no one would probably have picked up any of Poettering’s projects at a large scale. To give just one argument for this, consider the fact that PulseAudio’s addition to Fedora (which is heavily funded by RedHat) at the end of 2007 coincides with Poettering’s latest-assumed start of employment at RedHat in 2008 (probably earlier), while PulseAudio wasn’t given much attention beforehand.

                                                                    Let’s not attack the person but discuss the idea though. We don’t need a strawman to deconstruct systemd/pulseaudio/avahi/etc., because they already offer way more than enough attack surface themselves. :)

                                                                    1. 5

                                                                      Let’s not attack the person but discuss the idea though. We don’t need a strawman to deconstruct systemd/pulseaudio/avahi/etc., because they already offer way more than enough attack surface themselves. :)

                                                                      This is why this topic shouldn’t be discussed on this site.

                                                          1. 3

                                                            Take off every Zig for great glory!

                                                            1. 1

                                                              I don’t get it. :D Is this a reference to something?

                                                              1. 1
                                                            1. 4

                                                              After I saw @jcs’s video about amend, I thought about something like src. On the one hand, something far simpler than git would handle probably 90% of my needs for my solo projects. And my solo projects are most of what I do. (I’m a teacher who programs as a hobby, not a programmer.) But on the other hand, I contribute to FOSS projects, so I’d have to use git in those contexts whether I like it or not. With that in mind, I decided it was easier to stick with one thing. Switching back and forth seems like more trouble than it’s worth.

                                                              All of which to say, I wonder if this (or anything like it), however well designed and robust, can get much traction.

                                                              1. 3

                                                                All of which to say, I wonder if this (or anything like it), however well designed and robust, can get much traction.

                                                                I’m not sure I will ever use src, for the reasons you mention. But I think that it actually doesn’t need traction. It’s less than 3000 lines of Python, documentation and help strings included.

                                                                1. 4

                                                                  Sure, there’s no need for a small project to be popular. I was just thinking that it’s a shame that people (myself included) may ignore something good because context switching is a pain. (Disclaimer: I haven’t tried src, but let’s assume it’s great for the sake of discussion.)

                                                                  1. 2

                                                                    It’s less than 3000 lines of Python, documentation and help strings included.

                                                                    But it uses either SCCS or RCS as a backend. For a simple wrapper around a separate source control system, 3000 lines seems like a lot.

                                                                    For comparison, 9front’s git does not depend on an external version control system, includes a git server, and is only 3 times bigger – 8921 lines of code – as counted by:

                                                                     wc -l `{git/walk -c /sys/src/cmd/git /sys/lib/git}
                                                                    

                                                                    I think that for a minimalist version control system, there’s likely a simpler approach that could work well.

                                                                1. 1

                                                                  I don’t think it is good to keep inventing new languages. Wouldn’t guix or nix have worked better here?

                                                                  1. 4

                                                                        I see where you’re coming from, but I’m not sure that would work given the (at least implicit) constraints. One of the goals mentioned in the original pull request (discussion) was for it to be small and self contained, thus minimising the overhead (both in terms of machine resources and brainspace) of working on the build itself.

                                                                        Both Nix and Guix depend on an underlying runtime system (the Nix language and Guile Scheme, respectively). Not only is it yet another dependency to manage, but it also means there’s more context-switching overhead when moving between hacking on the build vs. hacking on the source code. Whereas Zuo is designed to be very similar to Racket, and should make it easier to switch tracks.

                                                                    Besides, build systems are fun!

                                                                    1. 4

                                                                      To add to crstry’s post, Nix and Guix wouldn’t work for building Racket on Windows, anyway.

                                                                      1. 4

                                                                        It is always good to keep inventing new languages.

                                                                      1. 2

                                                                        this is a really nice article and i appreciate the youtube links, will definitely watch later. i was wondering if you planned on having a demo up and running soon or a repo to plug? sometime later this year i want to build a stream on my own personal site and am planning on building it with phoenix/elixir. it’s a rather complicated task but i like biking and would be interested in creating a client that i can stream and play my music on so that others can listen while i ride without getting copystruck + it seems like a fun side-project!

                                                                        edit: i found the link! should’ve probably waited before sending this comment in but i was a little hyped to see someone building a similar project out in the open!

                                                                        1. 3

                                                                          Thank you for the kind words! Sure thing: it’s up and running at http://lofi.limo/

                                                                          Sounds like a cool project; hope you’ll share it here :)

                                                                          1. 2

                                                                            for sure, this sort of thing is exactly what excites me about the web!! i’m probably younger than a lot of the people on here but i think it’s probably the same feelings other generations talk about when they reminisce on the old web. it’s cool that even though there are tools out there for people to bootstrap projects like that instantly (i.e. twitch and yt), building it yourself is part of the fun, in my opinion. i was mostly inspired by the creator of this site who has his own personal stream set up.

                                                                            1. 2

                                                                              I agree! This is the fun stuff. I love to pop open a beer and watch Joshua hack away on his old Mac. So mellow and relaxing…

                                                                              1. 2

                                                                                Sorry but who is Joshua?

                                                                                Brilliant website btw. Have been listening to it intermittently while at work. Much better than listening on youtube and watching the uBlock counter go up every second.

                                                                                1. 2

                                                                                  Sorry but who is Joshua?

                                                                                    jcs, whose stream was mentioned by flbn.

                                                                        1. 5

                                                                          Related story and discussion from some days ago. :)

                                                                          Vale seems promising, hopefully it will find its niche.

                                                                          1. 12

                                                                            PSA: before proposing your favorite language as a better alternative, make sure to read the previous story to avoid rehashing the same arguments. Thanks :)

                                                                            1. 2

                                                                              hard agree with this comment, and also it would be great if we could apply the ML tag

                                                                              1. 2

                                                                                you can use the suggest link under a submission to change which tags are applied to it

                                                                                1. 2

                                                                                  I don’t give feedback on posts often enough to remember this, so thank you! will comply

                                                                                  edit: actually I don’t have a suggest link for this one…

                                                                            1. 2

                                                                              Given an unsorted array (sequence) of non-negative integers, find a continuous sub-array (sub-sequence) which adds to a given number S. Implement the solution as a generic algorithm, but visualize the results with indices that are 1-based.

                                                                              Despite the fact that I don’t like those artificial interview questions, I was curious to see what my naïve solution would look like. I was using a sum variable inside the outer loop, which results in an O(N^2) algorithm: Go.

                                                                              This version isn’t very useful for the final solution that will be presented below, but allows us to calculate the complexity of this solution, O(N3), as we have 3 nested loops (with N being the number of elements in the sequence)

                                                                              Where is the third inner loop, is it the call to std::accumulate?
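
                                                                              For reference, the running-sum version I mean looks roughly like this; it’s an illustrative Go sketch, not the exact code behind the link above, and the function name is made up:

                                                                                  package main

                                                                                  import "fmt"

                                                                                  // findSubarray looks for a contiguous run of xs that sums to s and returns
                                                                                  // its 1-based (start, end) indices, or (0, 0) if there is none. Keeping a
                                                                                  // running sum in the outer loop avoids re-summing every candidate range,
                                                                                  // which is what brings the brute force down from O(N^3) to O(N^2).
                                                                                  func findSubarray(xs []int, s int) (int, int) {
                                                                                      for i := range xs {
                                                                                          sum := 0
                                                                                          for j := i; j < len(xs); j++ {
                                                                                              sum += xs[j]
                                                                                              if sum == s {
                                                                                                  return i + 1, j + 1 // 1-based, as the article asks for
                                                                                              }
                                                                                          }
                                                                                      }
                                                                                      return 0, 0
                                                                                  }

                                                                                  func main() {
                                                                                      fmt.Println(findSubarray([]int{1, 3, 2, 5, 7}, 10)) // prints "2 4" (3+2+5)
                                                                                  }

                                                                              Since the elements are non-negative, the inner loop could also stop as soon as the sum exceeds S, though that doesn’t change the O(N^2) worst case.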

                                                                              1. 2

                                                                                Where is the third inner loop, is it the call to std::accumulate?

                                                                                I don’t understand singpolyma’s reply, so I might be missing something, but yes std::accumulate has linear time complexity.
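
                                                                                In other words, the article’s brute force presumably has two loops choosing the endpoints plus a linear summation over the chosen range, which is where the three nested linear passes come from. A rough sketch (illustrative Go, with the explicit innermost loop standing in for the std::accumulate call):

                                                                                    package main

                                                                                    import "fmt"

                                                                                    // findSubarrayBrute tries every pair of endpoints and re-sums the range each
                                                                                    // time. The innermost loop plays the role of std::accumulate in the C++
                                                                                    // version; since that summation is itself linear, the overall cost is O(N^3).
                                                                                    func findSubarrayBrute(xs []int, s int) (int, int) {
                                                                                        for i := 0; i < len(xs); i++ {
                                                                                            for j := i; j < len(xs); j++ {
                                                                                                sum := 0
                                                                                                for k := i; k <= j; k++ { // the "third loop": what std::accumulate does
                                                                                                    sum += xs[k]
                                                                                                }
                                                                                                if sum == s {
                                                                                                    return i + 1, j + 1 // 1-based indices
                                                                                                }
                                                                                            }
                                                                                        }
                                                                                        return 0, 0
                                                                                    }

                                                                                    func main() {
                                                                                        fmt.Println(findSubarrayBrute([]int{1, 3, 2, 5, 7}, 10)) // prints "2 4"
                                                                                    }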

                                                                                1. 1

                                                                                  O(N3)

                                                                                  … I think they mean N3 also known as O(N)

                                                                                  1. 1

                                                                                    The loops are nested, so it’s not O(N). That said, I only see two nested loops, and the naive solution is to test every possible pair of endpoints, which is nC2, which is O(N^2).

                                                                                    1. 1

                                                                                      I wasn’t doing analysis, only commenting that O(N3) would be nonsense. If they mean O(N^3) that’s fine of course.

                                                                                      1. 6

                                                                                        You can tell it’s a copypaste error because nobody would write N3 to mean N×3, they’d write 3N.

                                                                                        1. 2

                                                                                          That’s just how it looks in the cut and paste. In the article the 3 is super-scripted.

                                                                                  1. 2

                                                                                    Emacs shows matching parentheses by default

                                                                                    It seems that show-paren-mode no longer has to be enabled?

                                                                                    If so, that’s great. I’m trying to use most tools with as many defaults as possible.

                                                                                    After 20 years on Emacs, my .emacs / init.el is very small, just a few use-package invocations.

                                                                                      I wish make-backup-files defaulted to nil, or that Emacs shipped with a mode that flipped the few variables many people alter (such as that one) to more modern defaults.

                                                                                    1. 1

                                                                                      I agree that the tilde backup files are annoying, but they have saved my ass a few times. I think they are turned off when using a version-control mode and the file is under version control.

                                                                                      1. 7

                                                                                        I used to turn them off as well, until I discovered backup-directory-alist.

                                                                                        (setq backup-directory-alist '(("." . "~/.cache/emacs/backup")))
                                                                                        

                                                                                        Now they don’t clutter my directories, but they’re still available when I need them.

                                                                                        1. 1

                                                                                          I’m not on 28 yet, but Emacs (with vc-mode, 99% defaults) does create the backup files even for files in version control for me. I’ve sort of learned to ignore them mentally.

                                                                                          1. 1

                                                                                            In my Emacs 27.2, vc.el has this commented-out code:

                                                                                               ;; (unless vc-make-backup-files
                                                                                               ;;   (make-local-variable 'backup-inhibited)
                                                                                               ;;   (setq backup-inhibited t))
                                                                                            

                                                                                            …with a preceding comment claiming that this is somehow wrong. I must have learned this behavior on an earlier version of Emacs where that code was still active. Now I’m using Magit and I do have backup-inhibited set in version-controlled files, so I presume Magit is doing that.

                                                                                            1. 1

                                                                                              I just checked on emacs -Q (27.2) and backup-inhibited is still t on my version-controlled files. A quick (rip)grep didn’t find anything relevant though, but something must be setting it, and it’s not Magit.

                                                                                        2. 1

                                                                                          I honestly expect that a lot of old…experienced Emacs users will complain about this, as with most other changes that have been made to the old defaults.

                                                                                          But I agree, it’s great that it’s on by default now! One less thing to have to enable.

                                                                                          1. 10

                                                                                            As someone whose .emacs file is older than some of my junior devs, I say attack those defaults with a chainsaw. I can fix the defaults I don’t like, but new devs can’t, so the opinions of my fellow fogies shouldn’t count when making these changes.

                                                                                            1. 2

                                                                                              I thought that the guiding philosophy for improvements to the out-of-the-box defaults is that the old settings are still available and experienced users will know how to revert to them if they wish (whereas new users may have more trouble going the other way). And as an Emacsen user for nearly [mumble!?] years now, I’m perfectly fine with that!