1. 49
    1. 19

      Julia actually pursues many of those things we wonder about, but don’t always take the time to figure out. We know this is Unix, so it’ll make sense when it’s explained to us, but we don’t often take the time to find out why things are the way they are.

      It’s always educational to read what Julia’s been wondering about :)

      1. 7

        Also whether output is buffered or not might depend on what print function you use, for example in Rust print! buffers when writing to a pipe but println! will flush its output.

        As far as I know this is not correct. Rather, Rust’s stdout is always line-buffered (for now). Since println! always adds a newline it kinda sorta flushes implicitly, but you will get the same effect if you output a newline from print!; strictly speaking, flushing is not an explicit operation of either macro. The behaviour is really the interplay of the data being printed out and the stream’s configuration.

        The same occurs in most other systems, e.g. in Python if you print(…, end="") you will not see anything printed out until a newline is written or the stream is flushed, because of line buffering.

        1. 13

          Oops, that’s what I get for believing what someone told me on social media without testing it for myself – I tried to check most things in this post but obviously I missed this one. I ran a test and like you say they’re both line buffered.

          Updated the post to say something that I’ve actually tested is true.

          1. 1

            With Python (last I checked), you can request unbuffered I/O using the -u option to the python command.

          2. 3

            Why doesn’t it use some version of Nagle’s algorithm?

            1. 3

              Nagle’s algorithm requires you to be able to see acknowledgements of your sent data.

              1. 3

                You can time the write() syscall that your buffer flush performs in order to tell how much backpressure there is from slow pipeline stages after you?

                1. 1

                  I wonder how much error you might get from preëmption (especially under load); even if you do get control returned to you immediately*, taking the time on the other end might end up being another syscall anyway*?

                  *I have no idea if there are any guarantees or anything around these.

                  1. 1

                    The problem is not that write() is slow, the problem is that write() is never called because the buffer isn’t full.

                    You would need to do something like set a timer in print() for 10ms (for example) and flush the buffer when it fires.

                    1. 2

                      Yeah so if you look at the context of the thread you’re replying to, it’s full of people talking about heuristics for buffered writers to flush more often.

                2. 1

                  Or something like frame pacing? Perhaps every time a line is produced, add it to the buffer then flush the buffer if it is full or if more than say 16ms has passed since the last flush.

                3. 2

                  I was trying to make my site maker’s hot-build pipeline reactive (to inotifywait events [1]) and immediately ran into the stuck pipeline problem. Luckily, I instantly recognised what was going on because Archmage McIlroy had recently schooled me [2] about the dangers of stdio buffering…

                  [1] It’s terrible and great. I love it. stdbuf fixed it just fine on my machine https://github.com/search?q=repo%3Aadityaathalye%2Fshite+stdbuf&type=code

                  [2] Spake he: https://www.evalapply.org/posts/shell-aint-a-bad-place-to-fp-part-1-doug-mcilroys-pipeline/index.html#appendix-an-unexpected-masterclass

                    Sidelight. Buffering by C's stdio package can cause
                    deadlock in a feedback loop. A process that buffers its
                    output will starve if it needs feedback from stuff that's
                    waiting in its output buffer. stdio's buffering is evil!
                  
                  1. 2

                    Perl (disable [stdout buffering] with $| = 1)

                    As usual Perl has its own distinct style!

                    1. 5

                      Yeah, tho you can write STDOUT->autoflush; or use English; $OUTPUT_AUTOFLUSH = 1; if you prefer a style that’s more Perl 5 than Perl 4 :-)

                      1. 2

                        The mnemonic is that you can use $| to make sure your pipes are piping hot :-)

                        https://perldoc.perl.org/perlvar#$%7C

                        There was often-quoted advice when people had issues - e.g. writing to a child process and reading back - to check if you were “suffering from buffering”.

                        (Which google tells me is also a broader writeup by mjd: https://perl.plover.com/FAQs/Buffering.html)

                      2. 1

                        A useful use of cat, perhaps: avoiding grep block buffering at the start of a pipeline

                        1. 9

                          cat doesn’t change anything in that example.

                          1. 2

                            It wouldn’t help force unbuffering, but if for some reason you wanted buffering, you could tack | cat onto the end to force it to think it wasn’t going to a TTY, like tail -f log | grep xxx | cat.
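
For anyone wanting to try the flush heuristics floated upthread (a timer armed by print() that fires after roughly 10ms, or flushing once the buffer is full or more than about 16ms has passed since the last flush), here is a minimal Python sketch. The class name `PacedWriter` and the parameters `max_chars` and `max_delay` are made up for illustration; no stdio implementation mentioned in the thread is claimed to work this way.

```python
import threading
import time


class PacedWriter:
    """Hypothetical line writer: flushes when the buffer is full or stale."""

    def __init__(self, stream, max_chars=4096, max_delay=0.016):
        self._stream = stream
        self._max_chars = max_chars      # flush once this much text is buffered
        self._max_delay = max_delay      # ...or once this many seconds have passed
        self._buf = []
        self._size = 0
        self._last_flush = time.monotonic()
        self._timer = None
        self._lock = threading.Lock()

    def write_line(self, line):
        with self._lock:
            self._buf.append(line + "\n")
            self._size += len(line) + 1
            overdue = time.monotonic() - self._last_flush >= self._max_delay
            if self._size >= self._max_chars or overdue:
                self._flush_locked()
            elif self._timer is None:
                # Nothing forces a flush yet, so arm a timer; otherwise the
                # last line could sit in the buffer forever if no more
                # writes arrive.
                self._timer = threading.Timer(self._max_delay, self.flush)
                self._timer.daemon = True
                self._timer.start()

    def flush(self):
        with self._lock:
            self._flush_locked()

    def _flush_locked(self):
        # Caller must hold self._lock.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        if self._buf:
            self._stream.write("".join(self._buf))
            self._stream.flush()
            self._buf.clear()
            self._size = 0
        self._last_flush = time.monotonic()
```

Something like `w = PacedWriter(sys.stdout)` followed by `w.write_line(...)` in the producing loop would use it. The timer covers the case raised above where write() is never called because the buffer isn’t full; the elapsed-time check on each write keeps a steady producer from spawning a timer per line.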