1. 18

    I no longer believe that daemons should fork into the background. Most Unix systems now have better service control, and code that doesn’t call fork() is easier to deal with: it is easier to test (you no longer have to provide an option to fork or not to fork), and less code is always better.

    1. 6

      Not forking also allows logging to be an external concern and the process should just write to stdout and stderr as normal.

      1. 1

        This is not so much about the forking per se, but rather the other behaviour that generally goes with it: closing any file descriptors that might be connected to a controlling terminal.

      2. 4

        OpenBSD’s rc system seems to expect that processes fork. I don’t see an obvious workaround for processes that don’t fork.

        1. 3

          It’s not that hard to write a program to do the daemonization (call umask(), setsid(), chdir(), set up any redirection of stdin, stdout and stderr, then exec() the non-forking daemon).
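
          Sketching that wrapper in C (my own illustration, not from the thread; error handling abbreviated, helper names mine):

          ```c
          #include <fcntl.h>
          #include <stdlib.h>
          #include <sys/stat.h>
          #include <unistd.h>

          /* Point stdin/stdout/stderr at /dev/null; returns 0 on success. */
          int redirect_stdio(void)
          {
              int fd = open("/dev/null", O_RDWR);
              if (fd < 0)
                  return -1;
              if (dup2(fd, STDIN_FILENO) < 0 ||
                  dup2(fd, STDOUT_FILENO) < 0 ||
                  dup2(fd, STDERR_FILENO) < 0)
                  return -1;
              if (fd > STDERR_FILENO)
                  close(fd);
              return 0;
          }

          /* Detach from the terminal, then exec the non-forking daemon.
             argv must be NULL-terminated, with argv[0] the program path. */
          void daemonize(char *const argv[])
          {
              umask(0);
              if (fork() > 0)    /* parent exits; child is not a group leader */
                  _exit(0);
              if (setsid() < 0)  /* new session, no controlling terminal */
                  _exit(1);
              if (chdir("/") < 0 || redirect_stdio() < 0)
                  _exit(1);
              execv(argv[0], argv);  /* replace ourselves with the daemon */
              _exit(127);            /* only reached if exec failed */
          }
          ```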

          1. 2

            It’s even simpler when you have daemon(3): http://man7.org/linux/man-pages/man3/daemon.3.html

            1. 1

              Which you do on OpenBSD, actually.

              Note that daemon(3) is a non-standard extension so it should be avoided for portable code. The implementation is simple enough, though.

          2. 2

            I’m not sure this is accurate, at least on -current. There are several Go “daemons” that, as far as I understand, don’t support fork(2). These can still be managed by OpenBSD’s rc system:

            # cd /etc/rc.d
            # cat grafana
            #!/bin/ksh
            #
            # $OpenBSD: grafana.rc,v 1.2 2018/01/11 19:27:10 rpe Exp $
            
            daemon="/usr/local/bin/grafana-server"
            daemon_user="_grafana"
            daemon_flags="-homepath /usr/local/share/grafana -config /etc/grafana/config.ini"
            
            . /etc/rc.d/rc.subr
            
            rc_bg=YES
            rc_reload=NO
            
            rc_cmd $1
            

            I’m not sure if there’s more to it that I don’t understand, I don’t write many daemons!

            1. 1

              Well, it turns out, I can’t read! The key to this is rc_bg, see https://man.openbsd.org/rc.subr#ENVIRONMENT

          3. 1

            For those that don’t know, daemontools is a nice service system that explicitly wants programs to not try to daemonize themselves. For services I build and run I try to use that.

          1. 4

            This whole area has been exercising my brain recently.

            As much as I hate the C standard committee’s lack of courage in defining behaviour, often a simple decision, even a controversial one, will resolve the issue.

            However, here is one that is sort of unresolvable.

            What behaviour should a program have that indexes beyond the bounds of an array?

            There is no way a standard can prescribe what the result will be.

            It must be undefined.

            So the obvious thing to do, as Pascal did, is to do bounds checking and have a defined behaviour.

            That imposes substantial runtime costs in CPU and memory, so users do switch it off…..

            So what should the behaviour be?

            One reasonable assumption a compiler writer can make is that there is no way the programmer can intend to index out of bounds, so I can assume that the index is less than the bound and generate machine code accordingly.
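
            The classic illustration of that reasoning (a hypothetical off-by-one, not from the thread):

            ```c
            int table[4];

            int exists_in_table(int v)
            {
                /* Bug: the condition should be i < 4.  Because reading
                   table[4] is undefined, the compiler may reason that the
                   loop must return before i reaches 4, and compile the
                   whole function down to "return 1". */
                for (int i = 0; i <= 4; i++)
                    if (table[i] == v)
                        return 1;
                return 0;
            }
            ```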

            You might say, these newfangled optimizations are the problem… no they aren’t.

            Compilers have been re-laying out data in memory according to what they think best for decades.

            Where this whole thing is driving me nuts is around asserts. See this thread I started here… https://readlist.com/lists/gcc.gnu.org/gcc-help/7/39051.html

            Asserts, if they are compiled in, tell the compiler (without getting involved in UB optimizations) that if the expression is false, then everything downstream is not reachable… so it analyzes under the assumption that the expression is true.

            However it doesn’t attempt to warn you if it finds a code path where the expression is false, and it completely redoes its optimization without that assumption if you compile the assert out.
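
            A tiny example of the effect (mine, not from the thread): with the assert compiled in, GCC may lower the signed division to a plain shift, since downstream it can assume n is non-negative; with -DNDEBUG it has to emit the sign-fixup code as well.

            ```c
            #include <assert.h>

            int half(int n)
            {
                assert(n >= 0);  /* downstream, the compiler may assume n >= 0 */
                return n / 2;    /* ...so this can become a single right shift */
            }
            ```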

            1. 4

              What behaviour should a program have that indexes beyond the bounds of an array?

              There is no way a standard can prescribe what the result will be.

              It must be undefined.

              This gets to the heart of the matter, I think. Part of the issue is people confuse “the language standard” with “what compilers do”. The language says it is undefined behaviour for an out-of-bounds array access to occur, or for signed integers to have their value range exceeded, but there’s no reason why compilers can’t generate code which will explicitly detect these situations and throw out an error message (and terminate).

              So why don’t compilers generally do that by default? Because C is used in performance critical code where these checks have a cost which is considered significant. And, despite the claims in this article, there are cases where trivial optimisations such as assuming that signed integer arithmetic operations won’t overflow can lead to significant speedups (it’s just that these cases are not something as trivial as a single isolated loop).

              If you do value deterministic behaviour on program error and are willing to sacrifice some performance to get it, the obvious solution is to use a language which provides that, i.e. don’t use C. But that’s not a justification to criticise the whole concept of undefined behaviour in C.

              1. 4

                There is a false choice between inefficient code with run time bounds checking and compiler “optimizations” that break working code. I love the example in http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf where the GCC developers introduce a really stupid UB based “optimization” that broke working code and then found, to their horror, that it broke a benchmark. So they disabled it for the benchmark.

                And, despite the claims in this article, there are cases where trivial optimisations such as assuming that signed integer arithmetic operations won’t overflow can lead to significant speedups (it’s just that these cases are not something as trivial as a single isolated loop).

                Great. Let’s see an example.

                But that’s not a justification to criticise the whole concept of undefined behaviour in C.

                I think this attitude comes from a fundamental antipathy to the design of C or a basic misunderstanding of how it is used. C is not Java or Swift - and not because its designers were stupid or mired in archaic technology.

                1. 4

                  There is a false choice between inefficient code with run time bounds checking and compiler “optimizations” that break working code

                  Optimisations don’t break working code. They cause broken code to have different observable behaviour.

                  And, despite the claims in this article, there are cases where trivial optimisations such as assuming that signed integer arithmetic operations won’t overflow can lead to significant speedups (it’s just that these cases are not something as trivial as a single isolated loop).

                  Great. Let’s see an example.

                  I don’t have a code example to hand, and as I said they’re not trivial, but that doesn’t mean it’s not true. Since it can eliminate whole code paths, it can affect the efficacy of, for example, value range propagation, affect inlining decisions, and have other flow-on effects.

                  I think this attitude comes from a fundamental antipathy to the design of C or a basic misunderstanding of how it is used

                  I disagree.

                  1. -1

                    Optimisations don’t break working code. They cause broken code to have different observable behaviour.

                    That’s a legalistic answer. The code worked as expected and produced the correct result. The “optimization” caused it to malfunction.

                    I don’t have a code example to hand

                    Apparently nobody does. So the claimed benefit is just hand waving.

                    I disagree.

                    The thinking of Lattner is indicative. He agrees that compiler behavior using the UB loophole makes C a minefield. His solution is to advocate Swift. People who are hostile to the use of C should not be making these decisions.

                    1. 5

                      That’s a legalistic answer.

                      Alas, in absence of “legalistic answers”, the only definition of C is either…

                      • An implementation of C is a valid implementation iff every program in a Blessed Set of programs compile and runs successfully and outputs exactly the same values.

                        or

                      • An implementation of C is a valid implementation iff, every program that compiles and runs successfully on The One True Blessed C Compiler, compiles and runs and outputs exactly the same values AND every program that fails to compile on The One True Blessed Compiler, fails to compile on the candidate compiler.

                      What sort of C are you envisioning?

                      Those may be appropriate ways to progress, but that is a different language and probably should be called something other than C.

                      1. 3

                        Apparently nobody does. So the claimed benefit is just hand waving

                        Again, I don’t agree.

                        1. 1

                          You can disagree all you want, but you also seem to be unable to produce any evidence.

                          1. 3

                            You can disagree all you want, but you also seem to be unable to produce any evidence.

                            I have high confidence that I could produce, given some time, an example of code which compiled to say 20 instructions if integer overflow were defined and just 1 or 2 otherwise, and probably more by abusing the same technique repeatedly, but you might then claim it wasn’t representative of “real code”. And then if I really wanted to satisfy you I would have to find some way to trawl through repositories to identify some piece of code that exhibited similar properties. It’s more work than I care to undertake to prove my point here, and so I suppose you have a right to remain skeptical.

                            On the other hand, I have at least explained (even if only very briefly) how small optimisations such as assuming that integer arithmetic operations won’t overflow could lead to significant differences in code generation, beyond simple exchanging of instructions. You’ve given no argument as to why this couldn’t be the case. So, I don’t think there’s any clearly stronger argument on either side.
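
                            For a flavour of the kind of folding being discussed (my sketch, not the poster’s withheld example): because signed overflow is undefined, a compiler may reduce each function below to a constant or a plain move, and such folded conditions then feed value range propagation and inlining decisions.

                            ```c
                            /* May compile to "return 1": x + 1 > x can only be
                               false if x + 1 overflows, which is undefined. */
                            int plus_one_is_bigger(int x)
                            {
                                return x + 1 > x;
                            }

                            /* May compile to "return x" for the same reason. */
                            int times2_div2(int x)
                            {
                                return (x * 2) / 2;
                            }
                            ```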

                            1. 0

                              I have high confidence that I could produce, given some time, an example of code which compiled to say 20 instructions if integer overflow were defined and just 1 or 2 otherwise

                              I have no confidence of this, and it would be a completely uninteresting optimization in any case.

                              On the other hand, I have at least explained (even if only very briefly) how small optimisations such as assuming that integer arithmetic operations won’t overflow could lead to significant differences in code generation, beyond simple exchanging of instructions.

                              Not really. You are omitting a single instruction that almost certainly costs no cycles at all in a modern pipelined processor. Balance that against putting minefields into the code - and note there is no way in C to check for this condition. The tradeoff is super unappealing.

                              1. 2

                                Not really. You are omitting a single instruction

                                No, I was not talking about omitting a single instruction.

                2. 2

                  With assert(), you are telling the compiler that at this point, this is true. The compiler is trusting your assertion of the truth at that point.

                  Also, if you compile with -DNDEBUG -O3 you will get the warning:

                  [spc]saltmine:/tmp>gcc -std=c99 -Wall -Wextra -pedantic -DNDEBUG -O3 c.c
                  c.c: In function ‘main’:
                  c.c:7:20: warning: ‘*((void *)&a+10)’ is used uninitialized in this function [-Wuninitialized]
                  c.c:13:8: note: ‘a’ was declared here
                  [spc]saltmine:/tmp>gcc -std=c99 -Wall -Wextra -pedantic -O3 c.c
                  [spc]saltmine:/tmp>
                  
                  1. 2

                    No, that is a meaningless statement.

                    The compiler doesn’t even see an assert statement, let alone “trust it”.

                    It is a macro that gets expanded to “plain old code” at preprocessor time, so depending on NDEBUG settings it expands either to something like if(!(exp))abort() or nothing.

                    What the compiler does trust is the __attribute__((__noreturn__)) on the abort() function.
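
                    A sketch of the usual shape (illustrative names, loosely glibc-flavoured):

                    ```c
                    #include <stdio.h>
                    #include <stdlib.h>

                    /* Roughly what the assert() machinery looks like when
                       NDEBUG is not defined.  The noreturn on the failure
                       handler is what the optimiser actually trusts. */
                    _Noreturn void assert_fail(const char *expr,
                                               const char *file, int line)
                    {
                        fprintf(stderr, "%s:%d: assertion `%s' failed\n",
                                file, line, expr);
                        abort();
                    }

                    #define my_assert(expr) \
                        ((expr) ? (void)0 \
                                : assert_fail(#expr, __FILE__, __LINE__))
                    ```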

                    1. 1

                      My file:

                      #include <assert.h>
                      
                      int foo(int x)
                      {
                        assert(x >= 0);
                        return x + 5;
                      }
                      

                      My file after running it through the C preprocessor:

                      # 1 "x.c"
                      # 1 "/usr/include/assert.h"
                       
                       
                      
                       
                       
                       
                      
                       
                      # 12
                      
                      # 15
                      
                      #ident      "@(#)assert.h   1.10    04/05/18 SMI"
                      
                      # 21
                      
                      # 26
                      extern void __assert(const char *, const char *, int);
                      # 31
                      
                      # 35
                      
                      # 37
                      
                       
                      # 44
                      
                      # 46
                      
                      # 52
                      
                      # 63
                      
                      # 2 "x.c"
                      
                      int foo(int x)
                      {
                         ( void ) ( ( x >= 0 ) || ( __assert ( "x >= 0" , "x.c" , 5 ) , 0 ) );
                        return x + 5;
                      }
                      #ident "acomp: Sun C 5.12 SunOS_sparc 2011/11/16"
                      

                      Not an __attribute__ to be found. This C compiler can now generate code as if x is never a negative value.

                      1. 1

                        #include <assert.h>
                        int foo(int x)
                        {
                          assert(x >= 0);
                          return x + 5;
                        }

                        Can you copy paste the assembly output? (I read sparc asm as part of my day job….)

                        I’d be interested to see is it is treating __assert() as anything other than a common or garden function.

                        1. 1

                          I’m not sure what it’ll prove, but okay:

                              cmp     %i0,0
                              bge     .L18
                              nop
                          
                          .L19:
                              sethi   %hi(.L20),%o0
                              or      %o0,%lo(.L20),%o0
                              add     %o0,8,%o1
                              call    __assert
                              mov     6,%o2
                              ba      .L17
                              nop
                          
                              ! block 3
                          .L18:
                              ba      .L22
                              mov     1,%i5
                          
                              ! block 4
                          .L17:
                              mov     %g0,%i5
                          
                              ! block 5
                          .L22:
                          
                          !    7    return x + 5;
                          
                              add     %i0,5,%l0
                              st      %l0,[%fp-4]
                              mov     %l0,%i0
                              jmp     %i7+8
                              restore
                          
                              ! block 6
                          .L12:
                              mov     %l0,%i0
                              jmp     %i7+8
                              restore
                          

                          I did not specify any optimizations, and from what I can tell, it calls a function called __assert().

                          1. 1

                            TL;DR: The optimiser for this compiler is crap, and it isn’t treating __assert() as special / noreturn.

                            int foo(int x)
                            {
                               ( void ) ( ( x >= 0 ) || ( __assert ( "x >= 0" , "x.c" , 5 ) , 0 ) );
                              return x + 5;
                            }
                            
                                ;; x is register %i0
                                cmp     %i0,0               ; Compare x with 0
                                bge     .L18                ; If it is large branch to .L18
                                nop                         ; Delay slot. Sigh sparc pipelining is makes debugging hard.
                            
                            ;;; This is the "call assert" branch. gcc has function __attribute__((cold)) or
                            ;;; __builtin_expect() to mark this as the unlikely path.
                            .L19:
                                sethi   %hi(.L20),%o0       
                                or      %o0,%lo(.L20),%o0
                                add     %o0,8,%o1
                                call    __assert
                                mov     6,%o2               ;Delay slot again
                                ba      .L17                ; Branch absolute to .L17
                                nop                         ;Delay slot
                            
                                ;; Really? Is this optimized at all?
                                ! block 3
                            .L18:
                                ba      .L22                ; Branch absolute to .L22!
                                mov     1,%i5               ; put 1 in %i5
                            
                            ;;; Seriously? Is this thing trying to do it the hard way?
                            ;;; The assert branch sets %i5 to zero.
                                ! block 4
                                .L17:
                                ;; Fun fact. %g0 is the sparc "bit bucket" reads as zero, ignores anything written to it.
                                mov     %g0,%i5             
                            
                                ! block 5
                                ;; Falls through. ie. Expected to come 
                                ;; out of __assert() *hasn't treated __assert as noreturn!*
                            
                                ;; Joins with the x>=0 branch
                            .L22:
                            
                            !    7    return x + 5;
                                ;; Local register %l0 is x + 5
                                add     %i0,5,%l0
                                st      %l0,[%fp-4]         ;WTF? Has this been inlined into a larger block of code?
                                mov     %l0,%i0             ;WTF? as above?
                                jmp     %i7+8               ;Return to calling addres.
                                restore                     ;Unwind sparc register windowing.
                            
                                ;; WTF? No reference to label .L12
                                ! block 6
                            .L12:
                                mov     %l0,%i0
                                jmp     %i7+8
                                restore
                            
                    2. 1

                      Actually, that isn’t quite what happens….

                      It’s “Just Another Macro” which, very approximately, expands to …

                       if( !(exp)) abort();
                      

                      …where abort() is marked __attribute__((noreturn));

                      Which is almost, but not quite what one would want….

                      The compiler uses the noreturn attribute to infer that if !exp, the rest of the code is unreachable; therefore, for the rest of the code, exp is true.

                      Alas, I have found that it doesn’t, if it finds a path for which exp is false, warn you that you will abort!

                      I certainly feel there is room for compiler and optimizer writers to work with design-by-contract style programmers and have a “mutually beneficial” two-way conversation when they write asserts.

                      1. 0

                        Which is almost, but not quite what one would want….

                        I’m not sure I understand you. assert() will abort if the expression given is false. That’s what it does. It also prints where the expression was (it’s part of the standard). If you don’t want to abort, don’t call assert(). If you expect that assert() is a compile-time check, well, it’s not.

                        I certainly feel there is room for compiler and optimizer writers to work with design-by-contract style programmers and have a “mutually beneficial” two-way conversation when they write asserts.

                        There’s only so far that can go though. Put your foo() function in another file, and no C compiler can warn you.

                        assert() is also defined by the C standard, which means the compiler can have built-in knowledge of its semantics (much like a C compiler can replace a call to memmove() with inline assembly). The fact that GCC uses its __attribute__ extension for this doesn’t apply to all other compilers.

                        1. 2

                          That’s the other bit of Joy about C.

                          There are two entirely separate things….

                          The compiler.

                          And the standard library.

                          gcc works quite happily with several entirely different libc’s.

                          assert.h is part of libc, not the compiler.

                          How assert() is implemented is the domain of the libc implementer not the compiler writer.

                          I have poked at quite a few different compilers and even more libc’s… what I have summarised is how all the ones I have looked at do things. (Although some don’t have a concept of “noreturn”, so they can’t optimize based on that.)

                          Which compiler / libc are you looking at?

                          1. 3

                            The compiler that comes with Solaris right now.

                            You can’t have a standard C compiler without the standard C library. I can get a compiler that understands, say, C99 syntax, but unless it comes with the standard C library, it can’t be called a compliant C99 compiler. The standard covers both the language and the library. I’m reading the C99 standard right now, and here’s an interesting bit:

                            Each library function is declared, with a type that includes a prototype, in a header, (182) whose contents are made available by the #include preprocessing directive.

                            And footnote 182 states:

                            (182) A header is not necessarily a source file, nor are the < and > delimited sequences in header names necessarily valid source file names.

                            To me, that says the compiler can have knowledge of the standard functions. Furthermore:

                            Any function declared in a header may be additionally implemented as a function-like macro defined in the header … Likewise, those function-like macros described in the following subclauses may be invoked in an expression anywhere a function with a compatible return type could be called.(187)

                            (187) Because external identifiers and some macro names beginning with an underscore are reserved, implementations may provide special semantics for such names. For example, the identifier _BUILTIN_abs could be used to indicate generation of in-line code for the abs function. Thus, the appropriate header could specify #define abs(x) _BUILTIN_abs(x) for a compiler whose code generator will accept it. In this manner, a user desiring to guarantee that a given library function such as abs will be a genuine function may write #undef abs whether the implementation’s header provides a macro implementation of abs or a built-in implementation. The prototype for the function, which precedes and is hidden by any macro definition, is thereby revealed also.

                            So the compiler can absolutely understand the semantics of standard C calls and treat them specially. Whether a C compiler does so is implementation defined. And good luck writing offsetof() or setjmp()/longjmp() portably (spoiler: you can’t—they’re tied to both the compiler and architecture).
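
                            For instance, the traditional offsetof() definition that many libcs use is formally undefined behaviour, and only works because the header and the compiler are developed in tandem (my_offsetof is an illustrative name):

                            ```c
                            #include <stddef.h>  /* the real, blessed offsetof */

                            /* The classic libc trick: take the address of a
                               member of an object "at" address 0.  Formally UB,
                               but folded to a constant by the implementation. */
                            #define my_offsetof(type, member) \
                                ((size_t)&((type *)0)->member)

                            struct example { char a; double b; };
                            ```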

                            So, getting back to assert() and your issues with it. Like I said, the compiler knows (whether it’s via GCC’s __attribute__((__noreturn__)) or because the compiler has built-in knowledge of the semantics of assert()) that the expression used must be true, and can thus optimize based on that information, much like it can remove the if statement and related code:

                            const int debug = 0;
                            
                            {
                              int x = debug;
                            
                              if (x)
                              {
                                fprintf(stderr,"here we are!\n");
                                exit(33);
                              }
                              // ...
                            }
                            

                            even through x, because debug is constant, x is loaded with a constant, and not modified prior to the if statement. Your wanting a warning about an invalid index to an array whose index is used in assert() is laudable, but to the compiler, you are telling it “yes, this is fine. No complaints please.” Compile the same code with NDEBUG defined, the assert() goes away (from the point of view of the compiler phase) and the diagnostic can be issued.

                            Yes, it sucks. But that’s the rationale.

                            The intent is that you run the code, you get the assert, and you fix the code (otherwise, why use assert() in the first place?) or remove the assert() because the assumption made is no longer valid (this has happened to me, though not often, and usually after the code has changed, which is something you want, no?).

                    3. 2

                      You can do #define assert(p) if (!(p)) __builtin_unreachable() to keep the optimisation benefits! And MSVC has __assume which behaves similarly.

                      1. 1

                        Hmm… Interesting….

                        Does it then elide the expression !(p)?

                        Or does it impose the run time cost of evaluating !(p) and not the benefit of invoking abort()?

                        1. 1

                          Since __builtin_unreachable only exists to guide the compiler, p has to be an expression with no side effects, and then the compiler can optimise it out because you don’t use its result.

                      2. 1

                        I think this is an incorrectly framed question. C says that it’s not the compiler’s problem. You index past an array bound; perhaps you know what you are doing, or perhaps not. The compiler is just supposed to do what you said. If you have indexed into another data structure by mistake, or past the bound of allocated memory, that’s on the programmer. (BTW: I think opt-in bounds-checked arrays would be great.) It is unreasonable for the compiler to assume things that may be false. For example, if the programmer cautiously adds a check for overflow, I don’t want the compiler to assume that the index must be in bounds so that the check can be discarded.
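
                        The canonical shape of that worry (my sketch, not from the thread): a guard written in terms of signed overflow, which the optimiser is entitled to delete.

                        ```c
                        /* Intent: reject len values that would wrap base + len.
                           But signed overflow is undefined, so the compiler may
                           assume "base + len < base" is false for len >= 0 and
                           remove the guard entirely. */
                        int range_ok(int base, int len)
                        {
                            if (len < 0)
                                return 0;
                            if (base + len < base)  /* may be optimised away */
                                return 0;
                            return 1;
                        }
                        ```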

                        1. 6

                          C says that it’s not the compiler’s problem

                          Actually the C standard says it’s not the compiler’s problem; it’s undefined behaviour and completely your problem.

                          If you want it to have some weird arsed, but well defined behaviour, you need a different language standard.

                          In C standardese, things that are “the compiler’s problem” are labelled clearly as “implementation defined”; things that are your problem are labelled “undefined behaviour”.

                          perhaps you know what you are doing or perhaps not.

                          Well, actually, you provably don’t know what you’re doing… as the compiler and linker lay out the data structures in RAM pretty much as they damn well feel like.

                          Part of that, for example the struct padding and alignment, is part of the ABI for that particular system, which is not part of the C standard, and most of it will change as you add or remove other data items and/or change their types. If you need to rely on such things, there are other (some non-standard) mechanisms, e.g. union types and packing pragmas.

                          BTW: I think opt-in bounds checked arrays would be great

                          gcc and clang now have sanitizers to check that.

                          However the C standard is sufficiently wishy-washy on a number of fronts, there are several corner cases that are uncheckable, and valgrind is then your best hope. Valgrind won’t help you, for example, if you index into another valid memory region or alignment padding.

                          For example, if the programmer cautiously adds a check for overflow, I don’t want the compiler to assume that the index must be in bounds so the check can be discarded.

                          However, if the compiler can prove that the check always succeeds, then the check is useless: the programmer has written useless code, and the compiler rightly elides it.

                          Modern versions of gcc (if you have the warnings dialled up high enough, and have annotated function attributes correctly) will warn you about tautologies and unreachable code.
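
                          For example (hypothetical): an unsigned value can never be negative, so the guard below is a tautology; gcc’s -Wtype-limits flags it, and the comparison disappears from the generated code.

                          ```c
                          unsigned int nonneg(unsigned int u)
                          {
                              /* gcc -Wtype-limits: comparison of unsigned
                                 expression >= 0 is always true. */
                              if (u >= 0)
                                  return u;
                              return 0;  /* dead code, typically removed */
                          }
                          ```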

                          1. 1

                            The C standard is not the C language. It is a committee report attempting to codify the language. It is not made up of laws of physics - it can be wrong and can change. My argument is that the standard is wrong. Feel free to disagree, but please don’t treat the current instance of the standard as if it were beyond discussion.

                            In fact, I do want a different standard: one that is closer to my idea what the rules of the language should be in order to make the language useful, beautiful, and closer to the spirit of the design.

                            The compiler and linker don’t have total freedom to change layouts even in the current standard - otherwise, for example, memcpy would not work. Note: “Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.”

                            struct a{ int x[100];}; char *b = malloc(sizeof(int)*101; struct a *y = (struct a *)b; if(sizeof(struct a)) != sizeof(int)*100 ) panic(“this implementation of C won’t work for us\n”); …. do stuff … y->x[100] = checksum(y);

                            But worse,

                            message = readsocket(); for(i = 0; i < message->numberbytes; i++) if( i > MAX)use(m->payload[i]))

                            if the compiler can assume the index is never greater than the array size and MAX is greater than array size, according to you it should be able to “optimize” away the check.

                            However, if the compiler can prove that the check always succeeds, then the check is useless and the programmer has written useless code and the compiler rightly elides it.

                            This is one of the key problems with UB. The compiler can assume there is no UB. Therefore the check is assumed unnecessary. Compilers don’t do this right now, but that’s the interpretation that is claimed to be correct. In fact, in many cases the compiler assumes that the code will not behave the way that the generated code does behave. This is nutty.
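                            A compilable version of that loop sketch (the struct and helper names are my hypothetical stand-ins): clamp the length before indexing, so every access stays in bounds and there is no UB for the optimizer to reason from - the guard is semantically load-bearing, not dead code.

                            ```c
                            #include <assert.h>
                            #include <stddef.h>

                            #define MAX 16                        /* hypothetical protocol limit */

                            struct message {
                                size_t numberbytes;               /* length claimed by the sender */
                                unsigned char payload[MAX];       /* fixed-size receive buffer */
                            };

                            /* Clamp before indexing: no access with i >= MAX can happen,
                             * so eliding the bound would change observable behaviour. */
                            size_t use_all(const struct message *m, unsigned *sum) {
                                size_t n = m->numberbytes < MAX ? m->numberbytes : MAX;
                                for (size_t i = 0; i < n; i++)
                                    *sum += m->payload[i];        /* stand-in for use() */
                                return n;
                            }

                            int main(void) {
                                struct message m = { 100, {0} };  /* sender lies about its length */
                                unsigned sum = 0;
                                assert(use_all(&m, &sum) == MAX); /* clamped to the real bound */
                                return 0;
                            }
                            ```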

                            1. 8

                              The C standard is not the C language.

                              Hmm. So what is The C Language?

                              In the absence of the standard, there is no “C Language”, merely a collection of competing implementations of different languages, confusingly all named “C”.

                              I don’t think calling a standard “wrong” is very helpful, as that would imply there exists some definition of Right.

                              I’d rather call it “differing from all known implementations” or “unimplementable” or “undesirable” or “just plain bloody silly”.

                              There is no One True Pure Abstract Universal C out there like an ancient Greek concept of Numbers.

                              There are only the standard(s) and the implementations.

                              In fact, I do want a different standard: one that is closer to my idea what the rules of the language should be in order to make the language useful, beautiful, and closer to the spirit of the design.

                              Ah, the Joys of Standards! They are so useful, everybody wants their own one! ;-)

                              Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.”

                              Umm. So, yes, an array is a contiguous sequence, but we’re talking about indexing out of bounds of an array. So what is contiguous beyond that array?

                              Answer 1 : Possibly empty padding to align the next object at the appropriate alignment boundary.

                              Answer 2: Which is the next object? That is determined by the field order within a struct… (with the alignment padding determined by the ABI), but if the array is not in a struct… all bets are off as to which object the compiler/linker chooses to place next.

                              Hmm. Your example didn’t format nicely (nor was it valid syntax: missing parenthesis), so let me see if I can unravel it to see what you mean…

                              struct a { 
                                 int x[100];
                              }; 
                              char *b = malloc(sizeof(int)*101); 
                              struct a *y = (struct a *)b; 
                              
                              if( sizeof(struct a) != sizeof(int)*100 ) 
                                  panic("this implementation of C won't work for us\n"); 
                              
                              …. do stuff … 
                              
                               y->x[100] = checksum(y);
                              

                              Hmm. Not sure what you’re trying to say, but try this one….

                              #include <stdio.h>
                              #include <stdint.h>
                              
                              struct {
                                  char c[5];
                                  uint32_t i;
                              } s;
                              
                              uint64_t l;
                              
                              int main(void)
                              {
                                 printf( "sizeof(s)=%lu\n sizeof(c)=%lu\n sizeof(i)=%lu\n", sizeof(s),sizeof(s.c),sizeof(s.i));
                                 printf( "address of s=%08lx\n address of l=%08lx\n diff = %ld\n", (uintptr_t)&s, (uintptr_t)&l, ((intptr_t)&s-(intptr_t)&l));
                                 return 0;
                              }
                              

                              Outputs…

                              sizeof(s)=12                                                                                                                                                                                 
                              sizeof(c)=5                                                                                                                                                                                 
                              sizeof(i)=4                                                                                                                                                                                 
                              address of s=00601050                                                                                                                                                                        
                              address of l=00601048                                                                                                                                                                       
                              diff = 8                                                                                                                                                                                    
                              

                              https://onlinegdb.com/B1Ut3031m

                              1. 3

                                In the absence of the standard, there is no “C Language”, merely a collection of competing implementations of different languages, confusingly all named “C”.

                                What a weird idea. So much of the core code of the internet and infrastructure was written in an impossible language prior to the first ANSI standard. And it even worked!

                                Ah, the Joys of Standards! They are so useful, everybody wants their own one! ;-)

                                There is a standards process. It involves people presenting proposals for modifications and discussing their merits. There are changes! That’s totally normal and expected.

                                The moment C compilers began padding, every compiler added a “packed” attribute. The reason is that many C applications require that capability. Imagine Ethernet packets laid out with each compiler’s own artisanal, innovative ordering. And those attributes are not in the standard - yet they exist all the same.

                                Your example is not my example.

                                1. 2

                                  So much of the core code of the internet and infrastructure was written in an impossible language prior to the first ANSI standard. And it even worked!

                                  They were actually written in whatever dialect was available and worked for that machine and that compiler on that day.

                                  And porting to a different machine, different compiler, different version of the compiler, was a huge pain in the ass.

                                  I know.

                                  I’ve done a hell of a lot of that over the decades.

                                  Roll on tighter standards please.

                                  Yup, and most of them added a subtly different packed attribute, and since it was not terribly well documented and defined, I’ve had a fair amount of pain from libraries (LWIP comes to mind) that at various points in their history got the packed attribute wrong, so the code wasn’t portable from 32 to 64 bit.

                                  Your example is not my example.

                                  Your example wasn’t formatted, and I didn’t quite understand the point of it. Can you format it properly and expand a bit on what you were saying with that example?

                      1. 4

                        All other protocols are dead. ALL HAIL HTTPS!

                        And no, I am not happy about this. We will eventually have IP over IP. The only port that will be available is 443. True peer-to-peer communication is dead. The promise of the Internet in the 90s is dead.

                        All that’s left is HTTPS.

                        As controlled by Google.

                        1. 6

                          Here is a project that allows you to use APL from within Lua.

                          1. 5

                            Last week

                            • Fiddled with different ways of attaching to processes and viewing their states.
                            • Some other technical stuff that went well

                            This was for the low level debugger I’m trying to make.

                            So, from what I’ve read and seen, tools that attach to and inspect other processes tend to just use gdb under the hood. I was hoping for a more minimal debugger to read and copy.

                            lldb almost does what I need because of its existing external Python interface, but documentation for writing a stand-alone tool (started from outside the debugger rather than inside) is scattered. I haven’t managed to make it single-step.

                            Using raw ptrace and trying to read the right memory locations seems difficult because of things like address randomization. And getting more information involves working with even more memory mapping and other conventions.

                            I wish all these conventions were written in some machine-readable, language-agnostic way so I don’t have to human-read each one and try to implement it. Right now this is all implicit in the source code of something like gdb. This is a lot of extra complexity which has nothing to do with what I’m actually trying to accomplish.

                            The raw ptrace approach would also likely only work for Linux. And it is possibly strongly tied to C or assembly.
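                            For what it’s worth, the raw ptrace core is fairly small. A Linux-only sketch (error handling elided, the helper name is mine) that forks a child and single-steps it to completion:

                            ```c
                            #include <assert.h>
                            #include <stdio.h>
                            #include <sys/ptrace.h>
                            #include <sys/wait.h>
                            #include <unistd.h>

                            /* Fork and exec `path`, then single-step the child until it
                             * exits. Returns the number of steps taken, or -1 on error. */
                            long run_and_count_steps(const char *path) {
                                pid_t pid = fork();
                                if (pid < 0) return -1;
                                if (pid == 0) {                          /* child: become a tracee */
                                    ptrace(PTRACE_TRACEME, 0, NULL, NULL);
                                    execl(path, path, (char *)NULL);     /* stops at exec */
                                    _exit(127);
                                }
                                int status;
                                waitpid(pid, &status, 0);                /* child stopped at exec */
                                long steps = 0;
                                while (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == 0) {
                                    waitpid(pid, &status, 0);
                                    if (WIFEXITED(status)) return steps; /* ran to completion */
                                    steps++;
                                }
                                return -1;
                            }

                            int main(void) {
                                long steps = run_and_count_steps("/bin/true");
                                printf("/bin/true took %ld single-steps\n", steps);
                                assert(steps > 0);
                                return 0;
                            }
                            ```

                            The hard part is everything this sketch leaves out: symbol lookup, DWARF, and the memory-layout conventions discussed above.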

                            The problem with the latter is that eventually I will want to do this to interpreters written in C or even interpreters written in interpreters written in C. Seems like even more incidental complexity in that way.

                            An alternative is to log everything and have a much fancier log viewer after the fact. This way the debugged program only needs to emit the right things to a file or stdout. But this loses the possibility of any interactivity.

                            Plus, all of this would only be worth it if I can get some state visualization customizable to that specific program (because usually it will be an interpreter).

                            Other questions: How to avoid duplicating the work when performing operations from “inside the program” and from “outside” through the eventual debugger?

                            Other ideas: Try to do this with a simpler toy language/system to get an idea of how well using such a workflow would work in the first place.

                            Some references

                            This week

                            • Well, now that I have a better idea of how deep this rabbit hole is, I need to decide what to do. Deciding is much harder than programming…
                            • Or maybe I should do one of the other thousand things I want to and have this bit of indecision linger some more.
                            1. 5

                              I wrote a very simple PoC debugger in Rust if you are interested in the very basics: https://github.com/levex/debugger-talk

                              It uses ptrace(2) under the hood, as you would expect.

                              1. 1

                                Thanks! I’ve had a look at your slides and skimmed some of your code (I don’t have Rust installed, or running it would be the first thing I’d do).

                                I see that you’re setting breakpoints by address. How do you figure out the address at which you want to set a breakpoint though?

                                How long did it take to make this? And can you comment on how hard it would be to continue from this point on? For example reading C variables and arrays? Or getting line numbers from the call stack?

                                1. 2

                                  Hey, sorry for the late reply!

                                  In the talk I was setting breakpoints by address indeed. This is because the talk focused on the lower-level parts. To translate line numbers into addresses and vice versa you need access to the “debug information”. This is usually stored in the executable (as described by the DWARF format). There are libraries that can help you with this (just as the disassembly is done by an excellent library instead of my own code).

                                  This project took about a week of preparation and work. I was familiar with the underlying concepts, however Rust and its ecosystem was a new frontier for me.

                                  Reading C variables is already done :-), reading arrays is just a matter of a new command and reading variables sequentially.

                                  1. 1

                                    Thanks for coming back to answer! Thanks to examples from yourself and others I did get some stuff working (at least on the examples I tried) like breakpoint setting/clearing, variable read/write and simple function calls.

                                    Some things from the standards/formats are still unclear, like why I only need to add the start of the memory region extracted from /proc/pid/maps if it’s not 0x400000.

                                    This project took about a week of preparation and work. I was familiar with the underlying concepts, however Rust and its ecosystem was a new frontier for me.

                                    A week doesn’t sound too bad. Unfortunately, I’m in the opposite situation using a familiar system to do something unfamiliar.

                                    1. 2

                                      I think that may have to do with whether the executable you are “tracing” is a PIE (Position-Independent Executable) or not.
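                                      A sketch of checking that from the tracer side (Linux-specific, the helper name is mine): the first line of /proc/&lt;pid&gt;/maps gives the load base, which for a classic non-PIE x86-64 binary is conventionally 0x400000, and for a PIE is randomized on every run - hence the offset you sometimes need and sometimes don’t.

                                      ```c
                                      #include <assert.h>
                                      #include <stdio.h>

                                      /* Parse the start address of the first mapping from a maps
                                       * file, e.g. "55c0d9a2b000-55c0... /usr/bin/foo".
                                       * Returns 0 on failure. */
                                      unsigned long first_map_base(const char *maps_path) {
                                          FILE *f = fopen(maps_path, "r");
                                          if (!f) return 0;
                                          unsigned long base = 0;
                                          if (fscanf(f, "%lx", &base) != 1) base = 0;
                                          fclose(f);
                                          return base;
                                      }

                                      int main(void) {
                                          unsigned long base = first_map_base("/proc/self/maps");
                                          printf("first mapping starts at 0x%lx\n", base);
                                          assert(base != 0);  /* for a PIE this differs per run */
                                          return 0;
                                      }
                                      ```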

                                      Good luck with your project, learning how debuggers work by writing a simple one teaches you a lot.

                                  2. 2

                                    For C/assembly (and I’ll assume a modern Unix system) you’ll need to read up on ELF (object and executable formats) and DWARF (debugging records in an ELF file) that contain all that information. You might also want to look into the GDB remote serial protocol (I know it exists, but I haven’t looked much into it).
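                                    A small sketch of the ELF side (Linux/glibc `elf.h`; the helper name is mine): just reading the file header already tells you whether you are dealing with a fixed-address executable or a position-independent one, which matters when translating the addresses `nm` prints into runtime addresses.

                                    ```c
                                    #include <assert.h>
                                    #include <elf.h>
                                    #include <stdio.h>
                                    #include <string.h>

                                    /* Read the ELF header of `path` and return its e_type:
                                     * ET_EXEC (fixed address) or ET_DYN (PIE/shared object).
                                     * Returns -1 on error or if the file is not ELF. */
                                    int elf_type(const char *path) {
                                        FILE *f = fopen(path, "rb");
                                        if (!f) return -1;
                                        Elf64_Ehdr hdr;
                                        size_t n = fread(&hdr, 1, sizeof hdr, f);
                                        fclose(f);
                                        if (n != sizeof hdr ||
                                            memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
                                            return -1;
                                        return hdr.e_type;
                                    }

                                    int main(void) {
                                        int t = elf_type("/bin/sh");
                                        printf("/bin/sh e_type = %d\n", t);
                                        assert(t == ET_EXEC || t == ET_DYN);
                                        return 0;
                                    }
                                    ```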

                                    1. 1

                                      Well, I got some addresses out of nm ./name-of-executable but can’t peek at those directly. Probably need an offset of some sort?

                                      There’s also dwarfdump I haven’t tried yet. I’ll worry about how to get this info from inside my tracer a bit later.

                                      Edit: Nevermind, it might have just been the library I’m using. Seems like I don’t need an offset at all.

                                      1. 2

                                        I might have missed some other post, but is there a bigger writeup on this project of yours? As to the specifics of digging up such information, take a look at ECFS - https://github.com/elfmaster/ecfs

                                        1. 1

                                          I might have missed some other post, but is there a bigger writeup on this project of yours?

                                          I’m afraid not, at least for the debugger subproject. This is the context. The debugger would fit in two ways:

                                          • Since I have a GUI maker, I can try to use it to make a graphical debugger. (Ideally, allowing custom visualizations created for each new debugging task.)
                                          • A debugger/editor would be useful for making and editing [Flpc](github.com/asrp/flpc) or similar. I want to be able to quickly customize the debugger to also be usable as an external Flpc debugger (instead of just a C debugger). In fact, it’d be nice if I could evolve the debugger and target (=interpreter) simultaneously.

                                          Although I’m mostly thinking of using it for the earlier stages of development. Even though I should already be past that stage, if I can (re)make that quickly, I’ll be more inclined to try out major architectural changes. And also add more functionality in C more easily.

                                          Ideally, the debugger would also be an editor (write a few instructions, set SIGTRAP, run, write a few more instructions, etc; write some other values to memory here and there). But maybe this is much more trouble than it’s worth.

                                          Your senseye program might be relevant depending on how customizable (or live customizable) the UI is. The stack on which it’s built is completely unknown to me. Do you have videos/posts where you use it to debug and/or find some particular piece of information?

                                          As to the specifics of digging up such information, take a look at ECFS - https://github.com/elfmaster/ecfs

                                          I have to say, this looks really cool. Although in my case, I’m expecting cooperation from the target being debugged.

                                          Hopefully I will remember this link if I need something like that later on.

                                          1. 2

                                            I have to say, this looks really cool. Although in my case, I’m expecting cooperation from the target being debugged.

                                            My recommendation, coolness aside, for the ECFS part is that Ryan is pretty darn good with the ugly details of ELF and his code and texts are valuable sources of information on otherwise undocumented quirks.

                                            Your senseye program might be relevant depending on how customizable (or live customizable) the UI is. The stack on which its built is completely unknown to me. Do you have videos/posts where you use it to debug and/or find some particular piece of information?

                                            I think the only public trace of that is https://arcan-fe.com/2015/05/24/digging-for-pixels/ but it only uses a fraction of the features. The cases I use it for on about a weekly basis touch upon materials that are NDAd.

                                            I have a blogpost coming up on how the full stack itself map into debugging and what the full stack is building towards, but the short short (yet long, sorry for that, the best I could do at the moment) version:

                                            Ingredients:

                                            Arcan is a display server - a poor word for output control, rendering and desktop IPC subsystem. The IPC subsystem is referred to as SHMIF. It also comes with a mid-level client API, TUI, which roughly corresponds to ncurses but with a more desktop-y feature set, and which sidesteps terminal protocols for better window manager integration.

                                            The SHMIF IPC part that is similar to a ‘Window’ in X is referred to as a segment. It is a typed container comprised of one big block (video frame), a number of small chunked blocks (audio frames), two ring buffers as input/output queue that carry events and file descriptors.

                                            Durden acts as a window manager (meta-UI). This mostly means input mapping, configuration tracking, interactive data routing and window layout.

                                            Senseye comes in three parts. The data providers, sensors, have some means of sampling data along with basic statistics (memory, file, …), which gets forwarded over SHMIF to Durden. The second part is analysis and visualization scripts built on the scripting API in Arcan. Lastly there are translators: one-off parsers that take some incoming data from SHMIF, parse it and render some human-useful output, optionally annotated with parsing-state metadata.

                                            Recipe:

                                            A client gets a segment on connection, and can request additional ones. But the more interesting scenario is that the WM (durden in this case) can push a segment as a means of saying ‘take this, I want you to do something with it’ and the type is a mapping to whatever UI policy that the WM cares about.

                                            One such type is Debug. If a client maps this segment, it is expected to populate it with whatever debugging/troubleshooting information that the developer deemed relevant. This is the cooperative stage, it can be activated and deactivated at runtime without messing with STDERR and we can stop with the printf() crap.

                                            The thing that ties it all together - if a client doesn’t map a segment that was pushed on it, because it doesn’t want to or already have one, the shmif-api library can sneakily map it and do something with it instead. Like provide a default debug interface preparing the process to attach a debugger, or activate one of those senseye sensors, or …

                                            Hierarchical dynamic debugging, both cooperative and non-cooperative, bootstrapped by the display server connection - retaining chain of trust without a sudo ptrace side channel.

                                            Here’s a quick PoC recording: https://youtu.be/yBWeQRMvsPc where a terminal emulator (written using TUI) exposes state machine and parsing errors when it receives a “pushed” debug window.

                                            So what I’m looking into right now is writing the “fallback” debug interface, with some nice basics, like stderr redirect, file descriptor interception and buffer editing, and a TUI for lldb to go with it ;-)

                                            The long term goal for all this is “every byte explained”, be able to take something large (web browser or so) and have the tools to sample, analyse, visualise and intercept everything - show that the executing runtime is much more interesting than trivial artefacts like source code.

                                            1. 1

                                              Thanks! After reading this reply, I’ve skimmed your latest post submitted here and on HN. I’ve added it to my reading list to consider more carefully later.

                                              I don’t fully understand everything yet but get the gist of it for a number of pieces.

                                              I think the only public trace of that is https://arcan-fe.com/2015/05/24/digging-for-pixels/ but it only uses a fraction of the features.

                                              Thanks, this gives me a better understanding. I wouldn’t mind seeing more examples like this, even if contrived.

                                              In my case I’m not (usually) manipulating (literal) images or video/audio streams though. Do you think your project would be very helpful for program state and execution visualization? I’m thinking of something like Online Python Tutor. (Its source is available but unfortunately everything is mixed together and it’s not easy to just extract the visualization portion. Plus, I need it to be more extensible.)

                                              For example, could you make it so that you could manually view the result for a given user-input width, then display the edges found (either overlaid or separately) and finally, after playing around with it a bit (and possibly with other objective functions than edges), automatically find the best width as shown in the video? (And would this be something that’s easy to do?) Basically, a more interactive workflow.

                                              The thing that ties it all together - if a client doesn’t map a segment that was pushed on it, because it doesn’t want to or already have one, the shmif-api library can sneakily map it and do something with it instead.

                                              Maybe this is what you already meant here and by your “fallback debug interface”, but how about having a separate process for “sneaky mapping”? So SHMIF remains a “purer” IPC but you can add an extra process in the pipeline to do this kind of mapping. (And some separate default/automation can be toggled to have it happen automatically.)

                                              Hierarchical dynamic debugging, both cooperative and non-cooperative, bootstrapped by the display server connection - retaining chain of trust without a sudo ptrace side channel.

                                              Here’s a quick PoC recording: https://youtu.be/yBWeQRMvsPc where a terminal emulator (written using TUI) exposes state machine and parsing errors when it receives a “pushed” debug window.

                                              Very nice! Assuming I understood correctly, this takes care of the extraction (or in your architecture, push) portion of the debugging.

                                              1. 3

                                                Just poke me if you need further clarification.

                                                For example, could you make it so that you could manually view the result for a given user-input width, then display the edges found (either overlayed or separately) and finally after playing around with it a bit (and possibly other objective functions than edges), automatically find the best width as show in the video? (And would this be something that’s easy to do?) Basically, a more interactive workflow.

                                                The real tool is highly interactive, it’s the basic mode of operation; it’s just the UI that sucks, and that’s why it’s being replaced with Durden, which has been my desktop for a while now. This video shows a more interactive side: https://www.youtube.com/watch?v=WBsv9IJpkDw including live sampling of memory pages (somewhere around 3 minutes in).

                                                Maybe this is what you already meant here and by your “fallback debug interface” but how about having a separate process for “sneaky mapping”? So SHMIF remains a “purer” IPC but you can an extra process in the pipeline to do this kind of mapping. (And some separate default/automation can be toggled to have it happen automatically.)

                                                It needs both, I have a big bag of tricks for the ‘in process’ part, and with YAMA and other restrictions on ptrace these days the process needs some massage to be ‘external debugger’ ready. Though some default of “immediately do this” will likely be possible.

                                                I’ve so far just thought about it interactively, with the sort-of goal that it should be at most 2-3 keypresses from having a window selected to digging around inside its related process, no matter what you want to measure or observe. The code (https://github.com/letoram/arcan/blob/master/src/shmif/arcan_shmif_debugif.c - not finished by any stretch) binds the debug window to the TUI API and will present a menu.

                                                Assuming I understood correctly, this takes care of the extraction (or in your architecture, push) portion of the debugging

                                                Exactly.

                                                1. 2

                                                  Thanks. So I looked a bit more into this.

                                                  I think the most interesting part for me at the moment is the disassembly.

                                                  I tried to build it just to see. I eventually followed these instructions but can’t find any Senseye-related commands in any menu in Durden (global or target).

                                                  I think I managed to build senseye/senses correctly.

                                                  Nothing obvious stands out in tools. I tried both symlinks

                                                  /path/to/durden/durden/tools/senseye/senseye
                                                  /path/to/durden/durden/tools/senseye/senseye.lua
                                                  

                                                  and

                                                  /path/to/durden/durden/tools/senseye
                                                  /path/to/durden/durden/tools/senseye.lua
                                                  

                                                  Here are some other notes on the build process

                                                  Libdrm

                                                  On my system, the include -I/usr/include/libdrm and linker flag -ldrm are needed. I don’t know cmake so don’t know where to add them. (I manually edited and ran the commands make VERBOSE=1 was running to get around this.)

                                                  I had to replace some CODEC_* with AV_CODEC_*

                                                  Durden

                                                  Initially, Durden would not start without -p /path/to/resources, saying some things were broken. I can’t reproduce it anymore.

                                                  Senseye
                                                  cmake -DARCAN_SOURCE_DIR=/path/to/src ../senses
                                                  

                                                  complains about ARCAN_TUI_INCLUDE_DIR and ARCAN_TUI_LIBRARY being not found:

                                                  CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
                                                  Please set them or make sure they are set and tested correctly in the CMake files:
                                                  ARCAN_TUI_INCLUDE_DIR
                                                  
                                                  Capstone

                                                  I eventually installed Arcan instead of just having it built and reached this error

                                                  No rule to make target 'capstone/lib/libcapstone.a', needed by 'xlt_capstone'.
                                                  

                                                  I symlinked capstone/lib64 to capstone/lib to get around this.

                                                  Odd crashes

                                                  Sometimes, Durden crashed (or at least exited without notice) like when I tried changing resolution from inside.

                                                  Here’s an example:

                                                  Improper API use from Lua script:
                                                  	target_disphint(798, -2147483648), display dimensions must be >= 0
                                                  stack traceback:
                                                  	[C]: in function 'target_displayhint'
                                                  	/path/to/durden/durden/menus/global/open.lua:80: in function </path/to/durden/durden/menus/global/open.lua:65>
                                                  
                                                  
                                                  Handing over to recovery script (or shutdown if none present).
                                                  Lua VM failed with no fallback defined, (see -b arg).
                                                  
                                                  Debug window

                                                  I did get target->video->advanced->debug window to run though.

                                                  1. 2

                                                    I’d give it about two weeks before running senseye as a Durden extension is in a usable shape (with most, but not all features from the original demos).

                                                    A CMake FYI - normally you can patch the CMakeCache.txt and just make. Weird that it doesn’t find the header though, src/platform/cmake/FindGBMKMS.cmake quite explicitly looks there, hmm…

                                                    The old videos represent the state where senseye could run standalone and did its own window management. For running senseye in the state it was before I started breaking/refactoring things the setup is a bit different and you won’t need durden at all. Just tested this on OSX:

                                                    1. Revert to an old arcan build ( 0.5.2 tag) and senseye to the tag in the readme.
                                                    2. Build arcan with -DVIDEO_PLATFORM=sdl (so you can run inside your normal desktop) and -DNO_FSRV=On so the recent ffmpeg breakage doesn’t hit (the AV_CODEC stuff).
                                                    3. Build the senseye senses like normal, then arcan /path/to/senseye/senseye

                                                    Think I’ve found the scripting error, testing when I’m back home - thanks.

                                                    The default behavior on a scripting error is to shut down forcibly, even when it could recover, in order to preserve state in the log output. The -b argument lets you set a new app (or the same one) to switch to, migrating any living clients: arcan -b /path/to/durden /path/to/durden would recover “to itself”. Surprisingly enough, this can be so fast that you don’t notice it has happened :-)

                                                    1. 1

                                                      Thanks, with these instructions I got it compiled and running. I had read the warning in senseye’s readme but forgot about it after compiling the other parts.

                                                      I’m still stumbling around a bit, though that’s what I intended to do.

                                                      So it looks like the default for sense_mem is to not interrupt the process. I’m guessing the intended method is to use ECFS to snapshot the process and view later. But I’m actually trying to live view and edit a process.

                                                      Is there a way to view/send things through the IPC?

                                                      From the wiki:

                                                      The delta distance feature is primarily useful for polling sources, like the mem-sense with a refresh clock. The screenshot below shows the alpha window picking up on a changing byte sequence that would be hard to spot with other settings.

                                                      Didn’t quite understand this example. Mem diff seems interesting in general.

                                                      For example, I have a program that changes a C variable’s value every second. Assuming we don’t go read the ELF header, how can senseye be used to find where that’s happening?

                                                      From another part of the wiki

                                                      and the distinct pattern in the point cloud hints that we are dealing with some ASCII text.

                                                      This could use some more explanation. How can you tell it’s ASCII from just a point cloud?

                                                      Minor questions/remark

                                                      Not urgent in any way

                                                      • Is there a way to start the process as a child so ./sense_mem needs less permissions?
                                                      • Is there a way to view registers?
                                                      Compiling

                                                      Compiling senseye without installing Arcan with cmake -DARCAN_SOURCE_DIR= still gives errors.

                                                      I think the first error was about undefined symbols that were in platform/platform.h (arcan_aobj_id and arcan_vobj_id).

                                                      I can try to get the actual error message again if that’s useful.

                                                      1. 2

                                                        Thanks, with these instructions I got it compiled and running. I had read the warning in senseye’s readme but forgot about it after compiling the other parts. I’m still stumbling around a bit, though that’s what I intended to do.

                                                        From the state you’re seeing it in, it is very much a research project hacked together while waiting at airports :-) I’ve accumulated enough of an idea to distill it into something more practically put together - but I’m not quite there yet.

                                                        Is there a way to view/send things through the IPC?

                                                        At the time it was written, I had just started to play with that (if you see the presentation slides, that’s the fuzzing bit; the actual sending works very much like a clipboard paste operation). The features are in the IPC system now, though not yet mapped into the sensors.

                                                        So it looks like the default for sense_mem is to not interrupt the process. I’m guessing the intended method is to use ECFS to snapshot the process and view later. But I’m actually trying to live view and edit a process.

                                                        Yeah, sense_mem was just exploring the whole “what does it take to sample / observe process memory without poking it with ptrace” question. Those controls and some other techniques are intended to be bootstrapped via the whole IPC system in the way I talked about earlier. That should kill the privilege problem as well.

                                                        Didn’t quite understand this example. Mem diff seems interesting in general.

                                                        The context menu for a data window should have a refresh clock option. If that’s activated, it will re-sample the current page and mark which bytes changed. Then the UI/shader for alpha window should show which bytes those are.
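The byte-level diff behind that marking is conceptually simple. Here is a toy Python sketch of the idea (not senseye's actual implementation):

```python
def changed_bytes(prev: bytes, curr: bytes) -> list:
    """Offsets at which two samples of the same memory page differ."""
    return [i for i, (a, b) in enumerate(zip(prev, curr)) if a != b]

# Two samples of a tiny 4-byte "page": one byte changed between ticks.
print(changed_bytes(b"\x00\x41\x42\x43", b"\x00\x41\xff\x43"))  # → [2]
```

A shader or alpha-window overlay would then highlight exactly those offsets on each refresh tick.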

                                                        For example, I have a program that changes a C variable’s value every second. Assuming we don’t go read the ELF header, how can senseye be used to find where that’s happening?

                                                        The intended workflow was something like “dig around in memory, look at projections or use the other searching tools to find data of interest” -> attach translators -> get a symbolic/metadata overview.

                                                        and the distinct pattern in the point cloud hints that we are dealing with some ASCII text. This could use some more explanation. How can you tell its ASCII from just a point cloud??

                                                        See the linked videos on “voyage of the reverser” and the Recon 2014 video of “cantor dust”, i.e. a feedback loop of projections + training + experimentation. The translators were the tools intended to make the latter stage easier.

                                                    2. 1

                                                      I’d give it about two weeks before running senseye as a Durden extension is in a usable shape (with most, but not all features from the original demos).

                                                      A CMake FYI - normally you can patch the CMakeCache.txt and just make. Weird that it doesn’t find the header though, src/platform/cmake/FindGBMKMS.cmake quite explicitly looks there, hmm…

                                                      The old videos represent the state where senseye could run standalone and did its own window management. For running senseye in the state it was before I started breaking/refactoring things the setup is a bit different and you won’t need durden at all. Just tested this on OSX:

                                                      1. Revert to an old arcan build ( 0.5.2 tag) and senseye to the tag in the readme.
                                                      2. Build arcan with -DVIDEO_PLATFORM=sdl (so you can run inside your normal desktop) and -DNO_FSRV=On so the recent ffmpeg breakage doesn’t hit (the AV_CODEC stuff).
                                                      3. Build the senseye senses like normal, then arcan /path/to/senseye/senseye

                                                      Think I’ve found the scripting error, testing when I’m back home - thanks.

                                                      The default behavior on a scripting error is to shut down forcibly, even when it could recover, in order to preserve state in the log output. The -b argument lets you set a new app (or the same one) to switch to, migrating any living clients: arcan -b /path/to/durden /path/to/durden would recover “to itself”. Surprisingly enough, this can be so fast that you don’t notice it has happened :-)

                                  3. 3

                                    If you are looking for references on debuggers then the book How Debuggers Work may be helpful.

                                  1. 10

                                    From the readme:

                                    You probably noticed the peculiar default line length. Black defaults to 88 characters per line, which happens to be 10% over 80. This number was found to produce significantly shorter files than sticking with 80 (the most popular), or even 79 (used by the standard library). In general, 90-ish seems like the wise choice.

                                    This is a table stakes deal breaker for me. I know, I know, I’m likely old fashioned. I prefer old school. :-)

                                    1. 5

                                      It is a default though, you can pass --line-length to it.

                                      1. [Comment removed by author]

                                        1. 4

                                          Honestly though, is your terminal window really 80 columns wide? And should outdated defaults matter?

                                          1. 3

                                            Yes, my terminal window is really 80 columns wide.

                                            I also have a source file where a single line of code is 250 characters (and no, it really can’t be broken up due to semantic constraints).

                                            So, what should be the minimum width of a terminal window?

                                            1. 1

                                              I actually prefer to code with a 80-wide terminal most of the time, because it tends to remind me to simplify my code more than I would otherwise :o

                                            2. 1

                                              I think 79 is better than 80, because 79 allows for a single-column ruler on the side of the window and stuff

                                              1. 1

                                                This is about the size of your code “viewport”, not of your terminal.

                                                3 columns are already used by my line length indicator in vim, but that number is really arbitrary too.

                                              2. 1

                                                Departing from established standards just because you feel like it is a pretty bad sign in general. As are --long-gnu-style-options, but that’s a different issue.

                                              3. 2

                                                I just counted the length of lines over 266 Lua files [1], calculating the 95th percentile [2] of line length. Only 4% had a 95th percentile of 89 or higher; 11% had a 95th percentile of 80 or higher. And just because, 4% had 95th percentiles of 80 and 81. For maximum line lengths:

                                                • 42% with longest line of 79 characters or less
                                                • 46% with longest line of 80 characters or less
                                                • 56% with longest line of 88 characters or less

                                                Longest line found: 204 characters

                                                [1] I don’t use Python, so I’m using what I have. And what I have are multiple Lua modules I’ve downloaded.

                                                [2] That is, out of all the lines of code, 95% of line lengths are less than this length.
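For anyone who wants to reproduce this kind of measurement on their own code, a rough sketch of the counting (my own quick script using a nearest-rank percentile, not the exact one used above):

```python
import glob

def percentile(lengths, p):
    # Nearest-rank percentile: the smallest value such that at least
    # p percent of the data is less than or equal to it.
    ordered = sorted(lengths)
    rank = max(1, round(p / 100 * len(ordered)))
    return ordered[rank - 1]

stats = []
for path in glob.glob("**/*.lua", recursive=True):
    with open(path, encoding="utf-8", errors="replace") as f:
        lengths = [len(line.rstrip("\n")) for line in f]
    if lengths:
        stats.append((path, percentile(lengths, 95), max(lengths)))

if stats:
    over = sum(1 for _, p95, _ in stats if p95 >= 89)
    print(f"{over}/{len(stats)} files have a 95th-percentile line length of 89+")
```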

                                                1. 1

                                                  https://en.wikipedia.org/wiki/88_(number)

                                                  I can’t help but read it as a fascist code, or at least I start thinking about whether it could be one every time I see this number (not totally unfounded, because where I live it is used this way every now and then). I don’t think they meant to use it this way, so I think it’s fine (more than that, good, because it devalues the code by not treating the number as taboo).

                                                  1. 1

                                                    Personally I don’t like breaking lines at 80 or 90 in Python, as the 4-space indents quickly use up a lot of horizontal space. For example, if you write unittest-style unit tests, before you write any assignment or call you have already lost 8 columns.

                                                    class TestMyFunctionality(unittest.TestCase):
                                                        def setUp(self):
                                                            pass  # ...
                                                        def test_foo(self):
                                                            x = SomeModule.SomeClass(self.initial_x, get_dummy_data(), self.database_handle)
                                                    

                                                    Of course you could start introducing line breaks, but that quickly leads to debates on “hanging indent”, lisp-style-indent, etc. or you end up with a lot of vanity variables for local variables.

                                                    By lisp-style indent I mean the following snippet; if the autoformatter did it this way, it would convince me to accept an 80-character line length limit:

                                                    class TestMyFunctionality(unittest.TestCase):
                                                        def setUp(self):
                                                            pass  # ...
                                                        def test_foo(self):
                                                            x = SomeModule.SomeClass(self.initial_x,
                                                                                     get_dummy_data(),
                                                                                     self.database_handle)
                                                    

                                                    Whereas I find the “hanging indent” style makes understanding the structure of the syntax tree so much more difficult.

                                                    class TestMyFunctionality(unittest.TestCase):
                                                        def setUp(self):
                                                            pass  # ...
                                                        def test_foo(self):
                                                            x = SomeModule.SomeClass(
                                                                    self.initial_x,
                                                                    get_dummy_data(),
                                                                    self.database_handle)
                                                    
                                                  1. 6

                                                    No, no, no… Defaulting to double-quotes over apostrophes sends it to hell right away. I’m not fond of squeezing my left pinky over Shift the entire time I’m typing. Also, apostrophes are obviously more classy. Double-quotes smells too much of C and other C-inspired syntaxes.

                                                    On a slightly more serious note, just because a line fits into the length limit doesn’t mean it should necessarily be reformatted this way. I prefer this:

                                                    return {
                                                        'AND': eval_and,
                                                        'OR': eval_or,
                                                    }[op](some, more, args, here)
                                                    

                                                    to not be turned into a one-liner. But black does.
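For what it’s worth, Black does honor `# fmt: off` / `# fmt: on` comments (at least in current versions), so a hand-formatted region like that dispatch table can be fenced off from reformatting. A sketch with made-up names:

```python
def eval_and(a, b): return a and b
def eval_or(a, b): return a or b

def evaluate(op, *args):
    # fmt: off
    return {
        'AND': eval_and,
        'OR': eval_or,
    }[op](*args)
    # fmt: on

print(evaluate('OR', False, True))  # → True
```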

                                                    1. 8

                                                      But that’s the point of Black—to remove all thought about formatting so no one can bikeshed coding styles. There are no aesthetic concerns taken into account period. If it’s ugly, it’s ugly. Get over your artistic tendencies and program—that’s what you are paid for.

                                                      Would I use this? No (that is, if I programmed in Python—I don’t). I’ve built up a coding style that works for me over the past 30 years, and yes, I am concerned with the aesthetics of code. Then again, I’ve been fortunate to work at places where I can use my personal coding style.

                                                      1. 1

                                                        Code style affects readability though, it’s not just about making it look pretty (I like pretty code too though). So the choices Black makes in that regard are important. Personally I don’t think automatic formatting tools should be too concerned with line length (except maybe in some very specific contexts) and they should just work with the lines they’re fed. The rules this tool uses for splitting lines seem fairly arbitrary and it’s one of the few areas where I think a human is better off making the call.

                                                        I’m not a Go programmer, but I think gofmt handles this better?

                                                    1. 2

                                                      And the source code is available from Don Hopkin’s webpage.

                                                      1. 5

                                                        Alas, that’s only the binary distribution of the HyperLook runtime and SimCity. The PostScript source is unfortunately obscured, tokenized into binary and stripped of its comments. And I deeply regret that the actual HyperLook source code is lost to me in the sands of time (unless Dug or Arthur has it on a tape somewhere), although I still have the SimCity sources.

                                                        I scanned all the manuals for older versions of HyperNeWS, the entire HyperLook manual (it was pretty big), and the SimCity manual, which we made in the NeWS version of FrameMaker (aka PainMaker: it’s RIDDLED with FEATURES!) Links to those are at the end of the article.

                                                        I’d love to run it again in an emulator (it’d run faster than ever, I bet) and make some screencasts of demos, but my SS2 is in storage in the US. If somebody could please give me some help getting X11/NeWS to run on a SparcStation emulator, I’d buy them a lot of beer or whatever they needed to tolerate raw, unshielded doses of early Solaris. I have tried, but haven’t been able to find the right images and get them to work.

                                                        Best yet would be to run a SparcStation emulator in the browser, then anybody could actually run the original version of HyperLook and SimCity! Has anybody done that and configured it to boot up a version of Solaris that runs OpenWindows on a dumb color framebuffer? That would be fine!

                                                        1. 1

                                                          Good to know!

                                                        1. 2

                                                          In one study, Harvard professor Latanya Sweeney looked at the Google AdSense ads that came up during searches of names associated with white babies (Geoffrey, Jill, Emma) and names associated with black babies (DeShawn, Darnell, Jermaine). She found that ads containing the word “arrest” were shown next to more than 80 percent of “black” name searches but fewer than 30 percent of “white” name searches.

                                                          Is this a problem? Presumably ads were placed based on click-through rate, and while CTR is an imperfect measure of relevancy, I don’t see problems with delivering relevant ads. As I understand even TV ads are racially targeted based on channels and programs.

                                                          1. 8

                                                            She found that ads containing the word “arrest” were shown next to more than 80 percent of “black” name searches but fewer than 30 percent of “white” name searches.

                                                            […]
                                                            I don’t see problems with delivering relevant ads.

                                                            I’m not sure I understand your objection. Actually, I hope I did not understand it at all.

                                                            Can you elaborate?

                                                            1. 4

                                                              It could be that African-American names like DeShawn or Darnell might show up more in articles about arrests, since, sadly, a lot of African-Americans are arrested in the US vs. non-African-Americans. Google is doing what Google does well—making correlations, which ends up showing biases in our culture. It’s similar to the time Target sent baby-related coupons to an address because it had data that showed a high likelihood of a pregnancy and yes, the daughter was pregnant but her father did not know (she was still a teen if I remember correctly).

                                                              The right answer is to address this as a society and let Google do what it does (with respect to searching). The wrong answer is to blame the data (or the data collector) for what it’s showing.

                                                              1. 3

                                                                The right answer is to address this as a society…

                                                                For sure!

                                                                …and let Google do what it does (with respect to searching).

                                                                The problem is that Google (but not only Google!) replicates, reinforces and spreads the bias we should fix.

                                                                So, to fix our culture we should “fix IT business” too.

                                                                Instead of trying to build AIs that look ethical, we should ethically regulate businesses, so that people can’t profit from unethical behaviours (not even through undebuggable software proxies).

                                                              2. 2

                                                                Google is placing “X arrest records” ads more often for search “X” if “X arrest records” ads are clicked more often for search “X”. I don’t see why Google should place ads that are clicked less often.

                                                                1. 3

                                                                  Or maybe “X arrest records” ads are clicked more often for search “X” because Google is placing “X arrest records” ads for search “X” 3 times more often?
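That feedback loop is easy to demonstrate: if a greedy ranker both learns from clicks and decides what gets shown, an initial exposure bias compounds even when the two ads are equally “relevant”. A toy sketch, not Google’s actual system:

```python
import random

random.seed(0)

# Two ads with the SAME true click probability.
true_ctr = {"arrest records": 0.05, "contact info": 0.05}
shows = {"arrest records": 3, "contact info": 1}   # initial exposure bias
clicks = {"arrest records": 0, "contact info": 0}

for _ in range(10_000):
    # Greedy ranker: always show the ad with the best observed CTR so far.
    ad = max(shows, key=lambda a: clicks[a] / shows[a])
    shows[ad] += 1
    if random.random() < true_ctr[ad]:
        clicks[ad] += 1

# The initially-favored ad ends up with virtually all impressions,
# purely because it had a head start at collecting clicks.
print(shows)
```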

                                                                  But actually this is not the real cognitive issue.

                                                                  The problem is that if you systematically propose a correlation between X and “arrest record”, people’s subconscious registers the association even if they don’t click the ads.

                                                                  Now, even if Google proposes such ads just to maximise profit, the long-term effects of such an association are an externality that affects the beliefs and the social and political behaviour of many people.

                                                                  The fact that such externalities are hard to measure protects Google from a fair taxation that could cover the social costs, but it also makes it important to forbid such externalities by law.

                                                                  1. 1

                                                                    I agree an externality is plausible in this case. I am in favor of taxing externalities in principle: a carbon tax, for example.

                                                                    In practice, good tax policy is a hard problem, and this problem is probably beyond our taxation skill by far. I may reconsider when we get much better at taxing externalities, say, after a successful implementation of a carbon tax.

                                                                    1. 1

                                                                      In practice, a good tax policy is a hard problem

                                                                      So we have a single viable solution: forbid the techniques that produce the externalities.

                                                                      1. 1

                                                                        What? Another viable solution is to live with the externalities without doing anything. This is actually what we do about most externalities.

                                                                        1. 1

                                                                          Well, why not?

                                                                          I mean, the profit of a few companies is much more important than the lives of a few billion people, isn’t it?

                                                                          No.

                                                                          You can call this “viable” only because you have been lucky enough not to be systematically discriminated against, systematically associated with arrests, systematically underpaid.
                                                                          You know another externality we all live with? The suicides at Foxconn.
                                                                          Another one? Democracy manipulations.

                                                                          In practice, you can accept externalities that poison the roots of democracy only because you profit from them. Or, more probably, because you have been convinced that you do.

                                                            1. 2

                                                              Bear with me, this might sound dumb, but I find it super confusing when you have some object reference, which might be null itself, and it’s also got object values/references inside, which could also be null. So a value can be more-or-less null/unusable in multiple ways, but sometimes it will(!) be usable with almost nothing non-null, depending on context. Each time I step into the code I’ve got to re-establish which things are going to be present and why, depending on context. And add null-checks everywhere. I wish I knew the name for this pattern (the errorless data structure, the bag of holding, &c). I’m totally down with “make illegal states unrepresentable”, but it’s hard to refactor once the code is already written, inherited-from, corner-case’d, and passed around everywhere.

                                                              It’s the same with functions. I swear I saw a line of code today that was like below (paraphrasing). I mean sure, I can get used to anything, but it just looks to me like a failure mode.

                                                              return service.Generate(data, null, null, null, null, null, null);
                                                              
                                                              1. 1

                                                                I wonder what would happen if, say, 64K of data was mapped to virtual address 0 and made read-only [1]. That way, NULL pointers wouldn’t crap out so much. A NULL pointer would point to an address of NUL bytes, so it’s a valid C string. In IEEE-754, all zeros represents 0. All pointers lead right back to this area. If you use “size/pointer” type strings, then you get 0-byte (or 0-character) strings. It just seems to work out fine.

                                                                It’s probably a horrible idea, but it would be fun to try.

                                                                [1] It would be nice if this “read-only” memory acted like ROM—it could be written to, but nothing actually changes.

                                                                1. 2

                                                                  I’ve had some fun thoughts about this before :D

                                                                  I was sketching out ideas of a microcontroller design that could potentially “not have registers” and also try to avoid lots of arbitrary hardcoded memory addresses. In practice it always ended up having registers in the form of a couple of internal busses and some flags, but it would look like it mostly didn’t have registers as far as programmers were concerned.

                                                                  I wanted to make the “program counter” a value stored at memory address zero. This would also mean the ‘default’ value of memory address zero set in the ROM would be the entry point in the code, which I thought was pretty.

                                                                  This also simplified a few things from the circuitry point of view:

                                                                  • No JMP instruction, just MOV 0,value
                                                                  • No halt instruction, just MOV 0,0

                                                                  Some thought later made me realise that using low memory addresses for critical things was a bad idea. When a program wigs out it can start writing to arbitrary random addresses, and address zero is a very common target in many bugs. Overwriting address zero would make the CPU jump to new code and potentially make things harder to debug.

                                                                  In the end I thought it best to set up the first 64 bytes or so of memory as an intentional trap instead, i.e. any read or write to those bytes would immediately halt the processor. A lot less elegant, but a lot more practical.


                                                                  Back to your idea.

                                                                  Letting the first 64K of memory be usable would allow a lot of programs to keep running, a lot like the old “Abort/retry/ignore” allowed us to do in the DOS days. For some bugs this would be brilliant and let you try and gracefully recover (eg finish saving a document).

                                                                  Alas there would also be a chance of data being damaged (eg files getting overwritten) if you continue into unknown territory; so I think it would still be worth bringing up an A/R/I style dialog. Even if only so we can blame the users if something goes wrong :P

                                                              1. 8

                                                                A couple notes on the article (specifically, the one it links to at the beginning, The Logical Disaster of Null).

                                                                Null is a crutch. It’s a placeholder for I don’t know and didn’t want to think about it further

                                                                I disagree. In C, at least, NULL is a preprocessor macro, not a special object, “which expands to an implementation-defined null pointer constant”. In most cases, it’s either 0, or ((void *)0). It has a very specific definition and that definition is used in many places with specific meaning (e.g., malloc returns NULL on an allocation failure). The phrase, “It’s a placeholder for I don’t know and didn’t want to think about it further”, seems to imply that it’s used by programmers who don’t understand their own code, which is a different problem altogether.

                                                                People make up what they think Null means, which is the problem.

                                                                I agree. However, again in C, this problem doesn’t really exist, since there are no objects, only primitive types. structs, for example, are just logical groupings of zero or more primitive types. I can imagine that, in object-oriented languages, the desire to create some sort of NULL object can result in an object that acts differently than non-NULL objects in exceptional cases, which would lead to inconsistency in the language.

                                                                Another article linked to in The Logical Disaster of Null talks about how using NULL-terminated character arrays to represent strings was a mistake.

                                                                Should the C language represent strings as an address + length tuple or just as the address with a magic character (NUL) marking the end?

                                                                I would certainly choose the NULL-terminated character array representation. Why? Because I can easily just make a struct that has a non-NULL-terminated character array, and a value representing length. This way, I can choose my own way to represent strings. In other words, the NULL-terminated representation just provides flexibility.

                                                                1. 4

                                                                  “On Multics C on the Honeywell DPS-8/M and 6180, the pointer value NULL is not 0, but -1|1.”

                                                                  1. 3

                                                                    The C Standard allows that. It basically states that, in the source code, a value of 0 in a pointer context is a null pointer and shall be converted to whatever value that represents in the local architecture. So that means on a Honeywell DPS-8/M, the code:

                                                                    char *p = 0;
                                                                    

                                                                    is valid, and will set the value of p to be -1. This is done by the compiler. The name NULL is defined so that it stands out in source code. C++ has rejected NULL and you are expected to use the value 0 (I do not agree with this, but I don’t do C++ coding).

                                                                    1. 2

                                                                      I believe C++11 introduced the nullptr keyword which can mostly be used like NULL in C.

                                                                      1. 1

                                                                        Correct. Just for reference, from the 1989 standard:

                                                                        “An integral constant expression with the value 0, or such an expression cast to type void * , is called a null pointer constant.”

                                                                    2. 3

                                                                      I would certainly choose the NULL-terminated character array representation. Why? Because I can easily just make a struct that has a non-NULL-terminated character array, and a value representing length. This way, I can choose my own way to represent strings. In other words, the NULL-terminated representation just provides flexibility.

                                                                      That’s not a very convincing argument IMO since you can implement either of the options yourself no matter which one is supported by the stdlib, the choice of one doesn’t in any way impact the potential flexibility. On the other hand NULL-terminated strings are much more likely to cause major problems due to how extremely easy it is to accidentally clobber the NULL byte, which happens all the time in real-world code.

                                                                      And the language not supporting Pascal-style strings means that people would need to reach for one of a multitude of different and incompatible third-party libraries and then convince other people on the project that the extra dependency is worth it, and even then you need to be very careful when passing the strings to any other third-party functions that expect them.

                                                                      1. 1

                                                                        You make a good point. Both options for strings can be implemented. As for Pascal strings, it is nice that a string can contain a NULL character somewhere in the middle. I guess back in the day when C was being developed, Ritchie chose NULL-terminated strings because the traditional Pascal string used the first byte to hold the length, capping it at 255 characters. Nowadays, since computers have more memory, you could just use the first 4 bytes (for example) to represent string length, in which case, in C it could just be written as struct string { int length; char *letters; }; or something like that.

                                                                        From Ritchie: “C treats strings as arrays of characters conventionally terminated by a marker. Aside from one special rule about initialization by string literals, the semantics of strings are fully subsumed by more general rules governing all arrays, and as a result the language is simpler to describe and to translate than one incorporating the string as a unique data type.”

                                                                    1. 3

                                                                      It would be instructive to look at other federated services to see how it might work.

                                                                      An organization or individual can run their own SMTP (email) server. There are multiple SMTP programs to select (Postfix, Sendmail, Exim, OpenSMTPD) and they all interoperate [1], allowing one SMTP server to exchange messages with another SMTP server. The organization running the server can dictate which other SMTP servers they will accept messages from and send to, and any individual that does not agree can either find an organization they find tolerable, or run their own SMTP instance.

                                                                      An individual user of SMTP, using a variety of client programs (elm, mutt, pine—yes, I’m old school here) can further filter incoming messages, rejecting based on the sending server (“I reject anything from example.net”) or an individual sender (“I reject anything from fred@example.com”). Furthermore, if done correctly, one can move their email address from one organization to another (if they have their own domain name and the organization will accept mail for said domain).

                                                                      Another federated service is NNTP (Usenet [2]). It’s similar to SMTP in that there are several server implementations to choose from [3] and many clients to choose from (rn, nn, tin). Also, an organization can select not only which NNTP servers they talk to (this is almost mandatory) but what messages they accept (news groups—one organization can say, accept everything, while others might accept everything but the binary groups (messages that contain non-text information like pictures or movies)). Again, if a user does not agree with the organization, they can move to one that receives what the user wants, or the user can run their own NNTP server and find an organization (or organizations) that will send them wanted messages. The user can then do further filtering, again based on sending server or individual (or message group).

                                                                      The major difference between SMTP and NNTP (besides the base protocol) is the distribution method. SMTP is (more or less) point-to-point [4] while NNTP is more fan-out method [5].

                                                                      So the point of federation is for servers to exchange messages amongst themselves, according to organizational and individual requirements. SMTP is usually more direct but locating a recipient is out of scope for SMTP. NNTP is more broadcast, but finding a recipient (or “channel” if you want) is easier as that’s part of the definition of NNTP (want to talk about C? comp.lang.c. Want to talk about old computers? alt.folklore.computers. You can always get a list of groups that are available).

                                                                      What does this mean for lobste.rs? Does it federate with Hacker News or Reddit? Or does it only federate with other sites that run the lobste.rs codebase? (IMNSHO this defeats the point of federation as it should be based upon protocol and not implementation; also, different clients that handle things like filtering and presentation).

                                                                      [1] More or less. Aside from installing and running, say, Postfix, there are other steps generally required to achieve proper interoperability. Sad, but true.

                                                                      [2] There is very little difference between a mailing list and a Usenet group. The major difference is how subscriptions and distribution works. In fact, there have been several SMTP clients that are also NNTP clients.

                                                                      [3] It’s been too long for me to remember names of these. I only had to deal with NNTP server software for like six months in the early 90s.

                                                                      [4] In that it talks to an endpoint that will accept email on behalf of the recipient.

                                                                      [5] The NNTP server will only exchange messages with a configured set of NNTP servers, which will then distribute the message to their configured set of NNTP servers.

                                                                      1. 8

                                                                        I’m running the following:

                                                                        • SMTP, using Postfix with my own greylist software
                                                                        • HTTP, using Apache
                                                                        • GOPHER, wrote my own gopher server (source code available via said server)
                                                                        • QOTD, again, wrote my own
                                                                        • DNS, running bind but it’s not visible to the outside world. It’s authoritative for all my domains; the company serving up my zones slaves off my DNS server.
                                                                        1. 1

                                                                          I don’t hear much about it but they claim to have won high rankings in several competitions.

                                                                          1. 2

                                                                            How?

                                                                            I tried using a simple function I wrote and oh look! It doesn’t like assert(). Even if I include #include <assert.h> it still complains. It even complains if I provide a prototype for assert(). I fish around some and oh look, they use their own syntax for asserts. How cute. I convert the existing calls to assert() to their special snowflake syntax and … it finds a non-existent problem.

                                                                            1. 2

                                                              I saved it for Saturday since I had no info on it. Maybe Ultimate refers to the migraine you get trying to reproduce their successes.

                                                                          1. 14

                                                                            Hi, dabmancer.

                                                                            I want to tell you a story… I skimmed your laptop.txt and found no pictures. I went to back to the parent… menu, still didn’t find any pictures.

                                                                            So I decided to contact you and ask for pics! I was just about to ssh into a tilde and weechat into the local ircd to ask who knows much about gopher when I realized that whoever responded would just browse your whole hole to find contact information–and I can do that, the floodgap proxy works fine from work.

                                                                            AND, your guestbook works. :) My message was delivered already, well before I tapped out this rambling, pointless message. Cheers! p.s. send laptop pics

                                                                            1. 6

                                                                I didn’t realize I was reading this through a Gopher proxy until I read this comment. I just thought I was on a mailing list reader.

                                                                I really should set up a gopher server to serve up all the content on my website, in a Docker container, just because I can.

                                                                              1. 6

                                                                                I wrote my own gopher server mainly to mirror my blog to gopherspace. It wasn’t that hard.

                                                                                1. 4

                                                                                  Oh shit, it was a Gopher! Given a prior thread, I guess this one should be on list for coolest, modern Gophersites. The FloodGap homepage is itself really neat, too.

                                                                                  1. 3

                                                                                    Running a gopher hole is pretty easy. I run mine off pygopherd, which is nice in that it will turn directories into gophermaps with type hinting, but if you plan to write your own maps a gopher server is only a handful of lines of code.

                                                                                    1. 2

                                                                                      Making your own gopher server is an afternoon of work or so. That’s what I did for the server.

                                                                                  2. 6

                                                                                    Now that you mention it, I do need to take pictures. My email is dabmancer@dread.life, for anyone interested (I did not get your email if you sent one already). I’ll try to respond to every email that I get (and also be helpful). I’m glad the stuff works. The whole point of gopher is that it’s too simple to go wrong.

                                                                                    1. 3

                                                                                      I don’t really understand but I still suspect this is the most awesome thing I’ll read all week.

                                                                                    1. 7

                                                                                      Folks - take notes!

                                                                                      I am shocked at the number of developers/engineers I work with that are debugging an extremely complex problem, and force themselves to keep so much state in their head. If you can write out the debugging steps more like a journal / record of every action you took, it’s much easier to reinflate your subconscious state.

                                                                                      Make it refined enough that someone else could reasonably follow along, and you’ll be able to as well. Lots of coworkers in other functions take detailed daily notes as a habit to show their progress to management; software is lucky in that there is an “output” at a small granularity of work.

                                                                                      As I get more and more reprioritizations & interruptions in my work, I’ve found it’s helpful to have confidence that all but maybe the last 30 mins of work are recorded in a decent fashion (org-mode!).

                                                                                      1. 2

                                                                                        I took notes on running a specific regression test at work. It’s something like 50 steps just to set it up [1]. And even then, others that have tried running it have had to fill in information I’ve neglected. It is hard to know at times what should be written down and what doesn’t have to be written down. And that changes over time, unfortunately.

                                                                                        [1] Why not automate it? Not that easy when you have to set up the programs and associated data across four machines. And then when it’s automated, it’s an even harder issue to debug if the automation breaks down [2].

                                                                                        [2] About half the time the test fails anyway because the clocks are out of sync. I had to code a specific test for that in the regression test, and yes, for some reason, ntpd isn’t working right. The other times the test fails is because the Protocol Stack From Hell [3] fell over because someone looked at it funny.

                                                                                        [3] Six figures to license a proprietary SS7 stack that isn’t worth the magnetic flux used to store it. This is the “best of breed” SS7 stack, sadly.

                                                                                      1. 5

                                                                                        I’ve never been able to connect with the idea that “programmers should not be interrupted” since for as long as I’ve been programming, interruptions have not been a problem for me. And I do get into a “flow” on a regular basis. Like most programmers, I’m working on existing code and a lot of my time is spent trying to understand it.

                                                                                        It would be arrogant of me to say that what I do would work for everyone, but I’ll take a stab at listing why I don’t think interruptions are typically a problem for me. This includes meetings.

                                                                                        • I take a lot of notes to remind myself of things. I also write a lot of throwaway prose in some random file to collect my thoughts. I review them a lot.
                                                                                        • I repeat steps many times to hammer the point home. What, exactly, did that code do there? Run through it. Then again and study it. Again. And make notes.
                                                                                        • I avoid doing too much at once. If I’m doing too much, I’ll try to work out some of the details with someone else who is working on a related thing. (I guess I’m interrupting that person, but usually they’re receptive to having such a conversation.)
                                                                                        • Turn notifications off. I expect many people already do this. Notifications are largely useless even when you’re not doing anything.
                                                                                        • It’s rare that I consider a meeting an interruption because I tend to know when they’re going to happen and they are often pertinent to the task at hand, even if tangentially.
                                                                                        • If another developer needs to talk to me, it’s usually related to my work.

                                                                                        Maybe I’ve been lucky, but if that’s the case, it’s been nearly 20 years of it.

                                                                                        1. 3

                                                                                          For me, programming is more of an “atmosphere” than it is an interruptible concentration. When in this “atmosphere” I work on and off but generally persist and keep the state of mind even with distractions and breaks - which I often take small breaks to help think and avoid getting stuck.

                                                                                          1. 1

                                                                                            I like the mental image that “atmosphere” conjures up. Another one that could work might be “milieu.”

                                                                                          2. 1

                                                                                            Once I get into the “flow” I have to be scraped off the ceiling when I get interrupted it’s that jarring to me. Granted, I can only get into that “flow” at home (even there it’s rare when I can descend that far into “flow”) and I work on stuff I enjoy. I don’t think I’ve ever gotten into the “flow” at work to that degree, but that’s mostly due to the work not being that difficult, nor an excessive amount.

                                                                                          1. 2

                                                                                            I’ve been using joe for over 25 years. Yes, it’s a plain text editor. It has features to help with coding. It can load files of any size. It’s fast. It’s not as small as I like [1], but compared to what people use today, it’s tiny. And more importantly, it’s NOT Internet enabled.

                                                                                            [1] The current version I use has an executable size of 1.5M. Large compared to my favorite editor under MS-DOS, which was 40K.

                                                                                            1. 1

                                                                              Thumbs up for JOE. Even though I don’t understand it (the config format is pretty damn complex), I really, really respect people using it daily, though I haven’t seen anyone using it IRL.

                                                                                              1. 1

                                                                                                Just out of curiosity, what was your favorite editor under DOS?

                                                                                                1. 3

                                                                                                  PE version 1.0 written in 1982 by Jim Wyllie (while at IBM). I only ever found one bug (lines have to end in CRLF or it does strange things) and one limitation—lines are limited to 253 characters (255 with CRLF). Other than that, no limitations. Ran under MS-DOS 1.0 (it will probably still run under Windows today) and could deal with files that even exceeded memory (tested once—it was … sluggish). 45K (oops, mis-remembered the size) in size, and programmable (to a limited degree).

                                                                                                  Later versions were okay.

                                                                                              1. 4

                                                                                                I found questions that don’t apply (I don’t use IDEs, nor do I care to use IDEs [1]) or I have a philosophical difference (OO—horribly overrated, much like Uncle Bob). Also, where are the questions about functional programming, or declarative programming?

                                                                                                [1] I have yet to find an IDE that doesn’t crash horribly. I’ve done this over the past 30 years.

                                                                                                1. 2

                                                                                                  It looks similar to ideas I have as I work on my keywordless language [1]. A simple example would be:

                                                                                                  ? t < v : ^ ERANGE , v;
                                                                                                  

                                                                                                  (where ? is IF, : is THEN and ^ is RETURN). A more complex example is:

                                                                                                  {? err,c = getc()
                                                                                                     == EOF , _           : ^ 0   , v;
                                                                                                     != 0   , _           : ^ err , v;
                                                                                                     _      , _           : { ungetc(c) ; ^ 0 , v; }
                                                                                                     _      , is_digit(c) : n = c - '0';
                                                                                                     _      , is_upper(c) : n = c - 'A' + 10;
                                                                                                     _      , is_lower(c) : n = c - 'a' + 10;
                                                                                                  }
                                                                                                  

                                                                                                  (where _ is a “don’t care” placeholder). Internally, the compiler will [2] re-order the tests from “most-specific” to “least-specific” (so the _ , _ : arm acts like ELSE). Also, here, getc() returns two values [3], both of which are checked. I do not have exceptions, because I’m not fond of them [5], so there is no syntax for them.

                                                                                                  [1] Based off an idea I read about in the mid-80s [4].

                                                                                                  [2] I’m still playing around with syntax.

                                                                                                  [3] I had a hard time moving from assembly to C, simply because I could not return multiple values easily.

                                                                                                  [4] It’s a long term PONARV of mine.

                                                                                                  [5] It’s a dynamic GOTO and abused way too much in my opinion.

                                                                                                  1. 1

                                                                                                    Very nice. Re [2], does that mean that the sequence of the checks in this construct really is immaterial?

                                                                                                    1. 1

                                                                                                      Non-existent. I’m still working (even after all these years) on the syntax. It was only after I posted the above that I realized that trying to go from “most-specific” to “least-specific” is problematic in the above example. Of these two:

                                                                                                      == EOF , _
                                                                                                      != 0   , _
                                                                                                      

                                                                                                      Which one is more specific? It’s for these reasons (and some more) that this is taking a long time.