1. 2

    I still don’t understand the necessity of centralized message brokers for IoT setups, especially for home IoT (as opposed to industrial/commercial). Even if the setup requires a central control node, I don’t understand why this node should run a message broker, with the processing node connected to that broker as a client.

    If a sensor pushes data to the processing node, why should it push via a message broker and not directly? Pull is even worse: the sensor has to get the request from the message queue and then reply to the same queue, which is worse than the sensor simply being a listening server.

    A controllable device (e.g. a light switch) should receive control messages as quickly as possible, but it can get stuck thinking it’s connected to the MQTT server while, on the server’s side, the TCP connection is already closed. It will not receive the message until the next keepalive or something like that.

    Even putting an NTP-like time protocol on top of MQTT is too quirky (and IoT devices always want to ask what time it is: they restart frequently, including when waking up from hibernation, and don’t have proper clocks).

    Message queues have no data aggregation built in (unlike time-series systems, like all these monitoring systems). They do nothing interesting. Why are they better than a direct connection? I can understand message queues for “put a long task up for asynchronous processing”, e.g. in web apps, but what’s their purpose in IoT?

    1. 3

      Consider sensors that are on battery: they’ll wake up every few seconds (for things that need to respond quickly) or every minute (for simple sensors). The control node might want to send a command to it, but if the sensor is just listening, the control node would need to keep trying to connect in hopes that it manages to catch the sensor in the 50ms or so that it’s online. What happens if the sensor gets a different IP the next time it tries to connect? It’s lost.

      How about if the sensor connects to the control node? It may be that there are multiple systems interested in readings. For example, I have a separate process that logs time series next to the control node, because home-assistant does a crap job of tagging its influxdb time series. I may also want to send commands to the sensor outside of the control node, e.g., for firmware updates. Once again, the decoupling provided by MQTT helps.

    1. 3

      as always, feel free to submit feedback, criticism or issues!

      1. 3

        Just some nitpicking on dependencies:

        • When depending on a Git repository (as you do with your colored dependency), it is good practice to point to a particular commit or tag using the rev or tag parameter instead of the branch, as the branch’s HEAD can change, but a commit or tag can only point to one specific state of the repository.
        • When publishing a binary (executable) crate, it is good practice to publish the Cargo.lock along with the crate. You can find the reasoning for publishing this file in Cargo’s FAQ.
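
        For example, a pinned git dependency might look like this in Cargo.toml (the repository URL and rev hash below are placeholders for illustration, not the real ones):

        ```toml
        [dependencies]
        # Pinned to one immutable commit rather than a moving branch HEAD:
        colored = { git = "https://github.com/example/colored", rev = "1a2b3c4" }
        ```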

        I will try it later though! I’ve always complained that some prompt frameworks use scripting languages like Python or Ruby, which have slow startup times, so this project seems interesting and a cool way to customize my ugly and boring prompt.

        1. 1

          You kind of cover this, but the Cargo.lock would capture the commit that the git dependency was at when the lock file was generated. So if the Cargo.lock were checked in, everyone would build against the same commit.

        2. 2

          I implemented a similar tool, rusty-prompt, some months ago; maybe you can get some inspiration out of it.

          1. 1

            sure! thanks for sharing!

          2. 1

            My bashes (both the one that comes with Mac OS and the latest 5.0.7 from brew) seem to cache PS1 somehow, making pista break quite a lot.

            ➜  ~ /usr/local/bin/bash
            bash-5.0$ PS1=$(pista)
            ~
            $ cd git
            ~
            $ PS1=$(pista)
            ~/git
            $ cd nomad
            ~/git
            $ PS1=$(pista)
            ~/g/nomad master ·
            $
            
            1. 2

              Try PS1='$(pista)'. What’s happening is that pista is getting called once, when you set PS1, and then never again. The single quotes force PS1 to literally contain the command substitution, which then gets expanded (thereby calling pista) each time the prompt is printed.
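
              A minimal sketch of the difference, using date as a stand-in for pista:

              ```shell
              # Double quotes: the command substitution runs once, at
              # assignment time, so the variable holds a frozen result.
              static="$(date +%s)"

              # Single quotes: the variable literally holds the text
              # $(date +%s); nothing has run yet.
              dynamic='$(date +%s)'

              echo "$dynamic"   # prints the literal string: $(date +%s)
              ```

              Because bash expands the prompt string each time it is displayed, the single-quoted form re-invokes pista for every prompt, while the double-quoted form bakes in one result forever.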

              1. 2

                Ohhh, no :( Of course. I feel like I should step far away from the computer now.

                1. 2

                  looks like the installation instructions were faulty!

                  1. 1

                    Oh, whew, thanks for that. Now I feel slightly less stupid :)

              2. 1

                can’t seem to replicate this, but it looks like a PROMPT_COMMAND thing.

              3. 1

                @hostname is nice to have if $SSH_CONNECTION is set.

                1. 4

                  i have plans to extend pista into a library, so you could build your own prompts with pista’s functions. maybe i’ll add a hostname function :^)

              1. 2

                Can anyone comment on what open-coding is? From context, it sounds like replacing the Forth definition with a reference to machine code, but I’m not 100% sure that’s right, and also can’t think through the details.

                1. 3

                  Open coding is basically inlining with specialization. I am mostly familiar with it in the context of Smalltalk and Lisp implementations, where if the compiler can know that a particular value will be a fixnum, it can, for example, replace a call to the addition primitive (which also needs to handle bigints) with an ADD instruction. This can also avoid tagging and untagging values at every step.

                1. 3

                  My prompt normally looks like this:

                  thequux@peewee <~> 
                  [0]$ 
                  

                  This basic prompt has stayed almost exactly the same for 15 years (there was a brief time when the second line looked like : 0 ;, but I ended up finding that intrusive). However, it has gotten a couple of subtle features:

                  If I am in a directory with a git repo, this changes to:

                  thequux@peewee <~/path/to/git-repo> (git:master)
                  [0]$ 
                  

                  There can be other tags that appear in the prompt as well: which python venv is activated, the name of the environment I’m chrooted into (if any), what version of Ruby is active through rvm, and any shell configuration files that have changed since the last time they were loaded.

                  There are also chevrons that appear on the left side to let me know I’m in a nested shell:

                  thequux@peewee <~> 
                  [0]$ zsh
                  thequux@peewee <~> 
                  >[0]$ zsh
                  thequux@peewee <~> 
                  >>[0]$ zsh
                  thequux@peewee <~> 
                  >>>[0]$ 
                  

                  Also, if I am root, the second line in the prompt turns red. It is all designed to be unobtrusive and only show me information when it’s relevant, but also, no matter how much information is being displayed, to give me as much space as possible to type my command line.

                  If anybody is interested, the complete configuration is at https://github.com/thequux/dotfiles/blob/master/conffiles/zshrc

                  1. 3

                    I used to run the UCLA LUG’s installfest back in the late ’00s. Aside from my role organizing the event, I would sort out all of the weird hardware that didn’t work after the bulk install, which was nearly always proprietary hardware. (To this day, I bear grudges against Broadcom and Sony for their wireless adapters and laptops, respectively.) I’d explain that I’d be able to get their hardware working for now, but because of those hardware vendors’ attitudes towards Linux, it would probably break the next time they upgraded Linux, so when they replaced their laptop, they should look for one with an Intel or Atheros network card, etc.

                    I have no idea how many people continued using Linux after the installfest; most people needed it for their university coursework. However, the dozen or so people who came back with a cheap PC (out of ~100 or so people who had weird hardware that didn’t work) suggests that the strategy worked at least a little.

                    1. 1

                      I like how alternative git interfaces are their own little cottage industry.

                      Why have none of them been blessed by the git project itself as the new standard or recommended interface?

                      1. 2

                        Near as I can tell, they all involve tradeoffs. I use stgit daily, as an adjunct to branch-per-feature. I like it a lot, but would hesitate to recommend it to most developers, because when things go wonky, I usually need to fall back on fairly low-level knowledge of git. I’d imagine the same is true of the other interfaces.

                      1. 55

                        Any tool proponent that flips the problem of tools into a problem about discipline or bad programmers is making a bad argument. Lack of discipline is a non-argument. Tools must always be subordinate to human intentions and capabilities.

                        We need to move beyond the culture of genius and disciplined programmers.

                        1. 20

                          Indeed; this practice strikes me as being uncomfortably close to victim-blaming.

                          1. 15

                            That’s a good analogy. People like to think they’re good programmers who don’t write buggy code, so when faced with evidence to the contrary in others, they defensively blame the other programmer, because otherwise they’d need to admit the same thing would happen to them.

                            I think these broken arguments persist because their internal logic forces us to admit our own faults, which most people find unpleasant, so defensive thinking kicks in to avoid it.

                            1. -5

                              Victim blaming is not a problem. It takes two people to make a person a victim: a bad actor and the actor without adequate protection.

                              1. 4

                                This is a pretty gross saying, even if it’s nice-sounding and pithy.

                                1. 0

                                  Not really. Every victim chose at some point to be a victim. That is not to say the other party can be absolved of blame. Far from it, the other party is the guilty one.

                                  Take software. If nobody chooses hard, unsafe languages, nobody will be victimized. Choosing those languages and then blaming the language leaves you responsible for your choices, even while the toolchain is at fault.

                                  1. 2

                                    This is absolutely ridiculous. If I walk down the street and I’m mugged, how did “I choose to become a victim”? There’s many, many cases where someone becomes a victim randomly.

                                    Your logic applies only if we have some sort of perfect foresight. That’s impossible.

                                    1. -1

                                      When the mugging starts, do you give up? Do you look for an exit? Or do you just hand over your dignity without a further thought? Did you not notice the people before they started mugging you?

                                      1. 1

                                        People who downvoted this as a troll: look at Active Self Protection on YouTube for examples of situations where people choose to be, or not to be, victims.

                                    2. 1

                                      Person’s walking down the street. Hell, let’s make them heavily armed, far more “adequately protected” than most people would think is reasonable. A sniper from outside visible range shoots them in the back of the head. They chose to be a victim? Come on.

                                      1. 0

                                        An exception to prove the rule. Most victimizing isn’t as mismatched, nor as final, as the proposed scenario.

                                  2. 2

                                    To those who downvoted this as a troll:

                                    Come on. This position is in good faith, and I only bring it up because yes, saying a person is at fault for choosing their tools is indeed victim blaming. And victim blaming is not a problem.

                                2. 20

                                  We’re at a point where we already were in the 60’s with cars and in the 90’s with planes. Everything was “driver error”, “pilot error”. In that case, since the results were actual fatalities, at a certain point there was this group of people that basically said: “that’s always going to happen, we’re always going to have human error, get over yourselves - how do we prevent people dying regardless”?

                                  And that’s how we got seat belts, airbags, telescopic steering wheels, etc. for cars and detailed checklists, redundant systems, etc. for airplanes. I think 2017 was the first year with 0 fatalities for air travel. So it can be done.

                                  It’s a very difficult mindset issue.

                                  1. 11

                                    Tool authors should not blame the programmer, but programmers should not blame their tools. Discipline and responsibility are needed from both.

                                    1. 8

                                      If you’re writing a lot of code in a memory-unsafe language, and find yourself occasionally writing memory- or concurrency-related bugs, and you know there are languages out there which make such bugs impossible or hard to write with no loss in performance or productivity, is it not okay to blame the tool? When should a carpenter stop blaming themselves for using the hammer incorrectly, and just accept that they need a new hammer whose head doesn’t occasionally fly off the handle?

                                      1. 3

                                        Discipline and responsibility means using the most appropriate tools for the job, being aware of the limitations of those tools, and doing what is necessary to compensate for those limitations.

                                        1. 3

                                          I agree with that; I work as a C++ programmer specifically because it’s the right tool for the particular job I’m doing. However, we use a safer language (Go in our case, because that’s what we know) for stuff where C++ isn’t strictly necessary. If performance and avoiding GC were higher concerns, or if we were on weaker hardware, Rust instead of Go would’ve been very interesting for all the parts where we don’t have to interact with a huge C++ library (in our case WebRTC).

                                        2. 2

                                          When a carpenter loses a hand, they don’t blame the bandsaw. That I see so many programmers blame dangerous tools for being dangerous means we haven’t even reached the level of a craft yet, let alone an engineering discipline.

                                          1. 18

                                            I appreciate the analogy, but it doesn’t really apply. First of all, there is no tool that can replace a bandsaw for what a bandsaw does well. However, any worker who uses a bandsaw recognizes that it’s a fundamentally dangerous machine, and they take safety precautions when using it. If the analogy really applied, it would be considered unthinkable to write C code without a full suite of static analyzers, valgrind test suites, rigorous (MISRA C-level) coding standards, etc. Second, and more importantly, the saying applies to the quality of the tool, but sometimes the tool is simply too dangerous to use: good carpenters will refuse to use a tablesaw without an anti-kickback guard or other safety features. Finally, when there was another tool that would do the job just as well, they’d use it.

                                            1. 9

                                              https://schneems.com/2016/08/16/sharp-tools.html

                                              this is not how tools work, either in programming or in carpentry

                                              1. 3

                                                This is a fantastic essay and you should submit it as a story.

                                              2. 8

                                                When a commercial carpentry shop has a carpenter lose a hand to a bandsaw, they are more or less forced to stop and try to make the bandsaw safer. This might be done by adding safety features to the tool, or by redesigning the process to use it less/differently, or by ensuring that the workers are trained in how to use it correctly without risk and are wearing proper safety equipment.

                                                It’s not the carpenter’s fault, and it’s not the bandsaw’s fault, it’s the fault of the system that brings them together in a risky manner.

                                                1. 4

                                                  The company should not blame the employee or the bandsaw, and the employee should not blame the company’s training or procedures or the bandsaw. Discipline and responsibility are needed from both. That includes the discipline and responsibility needed to make the bandsaw, training, and procedures as safe as is possible, and to only certify employees who are capable of operating the device safely, and the employee’s discipline and responsibility to follow the training and procedures properly and to not defeat the safety measures.

                                                  Assigning blame is useless. The focus should be on identifying all root and proximal causes, and eliminating each one, with priority chosen based on the hierarchy of values.

                                        1. 7

                                          T for touch-tone (because there’s also P for pulse, a whole other story)

                                          I’m old enough to remember when Touch-Tone (a trademark; {DTMF,tone}-dialing being the “more official” name) was a pay-for option. You could save some small amount of money every month by not having it. Hayes and ATDP saved me a whole like…$0.33 a month or something.

                                          1. 6

                                            On the other end of the timeline, a few years ago I had a free rotary telephone and an effectively free VOIP line with an FXO port that only supported DTMF (and sadly no SIP or H.323 access; hooray for consumer-oriented service). I ended up wiring a 3.5mm TR jack into the phone and hooking that up to my computer, with a small shell script to play the appropriate DTMF tones from my computer. It probably saved me a whole $10 that year.

                                            1. 2

                                              As a BBS operator, ditching touch-tone on those lines was a critical money-saver; they were never used to dial out. I think the feature cost more than a dollar a month.

                                            1. 2

                                              Others have hit on this problem, but I’ll make it a top-level comment: who is going to pay for this? I am pretty decent at Rust. I love working on infrastructure. I’m a senior engineer by most org charts. I’ve never seen any sort of job opportunity for this.

                                              Tech’s biggest problem is too much money being involved.

                                              1. 3

                                                It’s more strange that there’s a tremendous amount of money involved, but very little of it ever seems to make its way towards maintaining and improving core system and applications. It all ends up going towards showing us more ads, and figuring out how to gather more of our personal information, so that companies can show us more ads.

                                                1. 2

                                                  Tech’s biggest problem is too much money being involved.

                                                  Or not enough money, depending on how you look at it. When security is opt in, the market will opt out to save a buck.

                                                  1. 1

                                                    I seem to be working professionally on an infrastructure project in Rust. We got there because we needed to build a bare-metal version and I insisted on Rust. Near as I can tell, there are more than a few other companies doing infrastructure work in Rust. It’s still far from the majority, but this is to be expected from a language that’s only been stable for ~3 years.

                                                    1. 1

                                                      Looks like a great application for Rust! Best wishes to your project.

                                                  1. 2

                                                    Has anyone gotten a pinebook? I signed up a long time ago and haven’t heard anything back yet. I’d love to be able to try this stuff out, but the hype articles make me think of vaporware more than anything promising.

                                                    1. 2

                                                      Yeah, I’ve had mine for a year and a half now. They should still be shipping.

                                                      1. 1

                                                        I thought the pinebook started shipping a year ago? Are you somewhere in their queue? Still have login details?

                                                        https://hackaday.com/2017/04/28/hands-on-with-the-pinebook/

                                                      1. 3

                                                        I think there’s an important distinction to be made between what I call casual and professional use of a program. Casual interfaces are designed to be used occasionally, but not to make up the majority of what you do with your computer, whereas you use professional interfaces for most of your work day. Obviously, what may be a casual tool for some people would be a professional tool for others.

                                                        However, some classes of software are nearly only ever used professionally: CAD software, many development tools, business accounting packages, lots of enterprise software, etc. Essentially, the sort of thing where for most people who use it, it is literally their job to use that software.

                                                        This is relevant because, like any investment, investing time in learning how to use a complex interface is only useful if you expect it to pay off in time saved over the many times you use an interface. I use a disk partitioning tool maybe once or twice a year; it should be simple and obvious how to make it do the sorts of things I generally want to do. On the other hand, I practically live in my shell and my text editor, so the week or two of time I’ve spent on each to learn how to use them to their full power has paid off handsomely.

                                                        So, to me, the interesting question is: which tools that you use frequently don’t have professional interfaces?

                                                        1. 5

                                                          I question the assertion that the interpreting slowdown is exponential; whether it takes a few dozen cycles per statement (as in a bytecode interpreter) or a few thousand (as in a pure AST interpreter), the same number of statements would need to be executed, and so the slowdown is roughly linear no matter how deep the call stack gets. The exception to that would be if the interpreter were itself getting interpreted for a subroutine call, and I can’t see any reason that an implementation would do such a thing.

                                                          That said, one interesting place you might look for purely interpreted applications is in the Forth family; it’s fairly common to implement Forth using some form of threaded code, which is effectively an AST, modulo the fact that neither the S nor the T really apply to Forth. You don’t see many programs written in Forth directly (though apparently Chuck Moore has written a VLSI layout package), but PostScript is fairly common and every implementation I’ve seen has been a pure interpreter.

                                                          Another possibility, also outside the realm of traditional languages, is TeX, which seems to be completely uncompilable (even to bytecode); I’d definitely consider LaTeX to be a large application written entirely in TeX.

                                                          1. 2

                                                            and so the slowdown is roughly linear no matter how deep the call stack gets.

                                                            You’re the first one to say anything about this! I need to think about this again. Right now I think you’re right. Hm, so the slowdown isn’t from where I think it is in that case. I hope I didn’t just waste more than a month on a false premise.

                                                            Also, then I’m surprised there aren’t more examples of direct AST interpretation if it’s “just” a constant factor.

                                                            Edit: Ok, so I think the issue is that the slowdown scales exponentially with the height of the call stack, but that’s unaffected by whether bytecode or direct AST is interpreted. I’m trying to push primitives (like loops) out of the host language and have them written in L (a bit like Forth, but it’s more interpreted than compiled). Now if, instead of calling a primitive D directly, I call A, which calls B, which calls C, which calls D, and each of those has ~10 instructions in L, then I get a 1000x slowdown. Is this reasoning right?

                                                            The exception to that would be if the interpreter were itself getting interpreted for a subroutine call

                                                            There are many nested layers but the interpreter itself isn’t interpreted (except possibly as a (very deliberate) test).

                                                            Forth

                                                            I guess I’m thinking of Forth as basically already being bytecode, since it’s a stack machine like most bytecode VMs. The same goes for PostScript, although I don’t know the language.

                                                            TeX and LaTeX

                                                            That’s an interesting example. I don’t know how it works internally; I’ve just heard it described as a macro expansion language. I’m not sure I’d say that’s interpreted, though, but it’s definitely not compiled to either assembly or bytecode. So it would be an example of what I’m looking for, although I don’t think I can actually make use of this example.

                                                            1. 4

                                                              I think the way to look at it is that, as you push the veil back and write more and more of your logic in L, the slowdown applies to more and more of your execution time. It’s still not exponential; if a mixed program takes time t to run, and you have an interpretation overhead of o, the slowest you can possibly get by moving everything to interpretation is t*o and the fastest it can get is t/o. Those bounds could be tightened somewhat by considering the time that it spends in interpreted code vs. compiled code separately; take i to be the ratio of time spent in interpreted code. Then the worst case runtime would be t*i+(1-i)*t*o and the best case runtime would be t*(1-i)+i*t/o.

                                                              These bounds are difficult to think about intuitively, though, because they look at what happens if you moved everything to compiled code or to interpreted code. So, instead of looking at the possible performance improvement or loss, let’s look at the actual performance relative to the best case of everything being compiled. If the proportion of interpreted code in your program is i, the slowdown relative to the best possible case will simply be i*o+(1-i).
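
                                                              To put hypothetical numbers on that last formula (the values of o and i below are made up purely for illustration): with an interpretation overhead o = 50 and a tenth of the runtime interpreted, i = 0.1:

                                                              ```shell
                                                              # slowdown relative to the fully-compiled baseline: i*o + (1 - i)
                                                              awk -v i=0.1 -v o=50 'BEGIN { printf "%.1f\n", i*o + (1 - i) }'   # prints 5.9
                                                              ```

                                                              So even a 50x interpreter costs only ~6x overall while 90% of the time stays in compiled code; as i approaches 1, the slowdown approaches o.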

                                                              It’s worth noting, though, that for the first few levels of pushing logic into your language, i grows (roughly) exponentially larger. The base has nothing to do with the interpretation overhead though; it’s simply the branching factor of your routines.

                                                              1. 1

                                                                Thanks! Your comments here should be all the way at the top!

                                                                If the proportion of interpreted code in your program is i, the slowdown relative to the best possible case will simply be i*o+(1-i)

                                                                Yes, I agree, and the all-compiled program is a better baseline. This means that I should get a factor of o at worst, even if it’s all AST interpreted.

                                                                It’s worth noting, though, that for the first few levels of pushing logic into your language, i grows (roughly) exponentially larger.

                                                                Hm, I’m wondering if this is what I’ve been seeing and what led me to the incorrect exponential conclusion.

                                                                Still, that means both direct AST interpretation and layering will result in “only” a constant factor slowdown? And it’s also “only” a constant factor to go from bytecode to native assembly.

                                                                In general, is there somewhere I can run my reasoning (like this one) by some people for flaws? Other than posting on lobste.rs, that is. I’ve told some others and no one had seen the problem. At the same time as I’m re-doing the time bound estimates, I’m also estimating how much time I lost because of this :(.

                                                                Ok, one more thing. If functions in level j only calls functions in level j-1 (and level 0 is primitives of the host language) then the time for a level j call is exponential in j, right? (And this happens regardless of whether the levels are (all) interpreted bytecode or direct AST.)

                                                                And that means I should make calls in as low level as possible (instead of j-1) to make things faster. Except this goes in the wrong direction in terms of the abstraction I’m trying to provide (and potential polymorphism that would allow). For example, not using the object system would be faster than using it.

                                                              2. 1

                                                                Is this reasoning right?

                                                                I don’t think so.

                                                                Perhaps think of it like this. A function can have more than one call site. Those call sites may have different stack depths. When executing that function, the interpreter will do the same work irrespective of what is on the stack.

                                                                You may be thinking of nested loops? But that has nothing to do with how the program is run; it’s a property of the algorithm being implemented.

                                                                1. 1

                                                                  Also, then I’m surprised there aren’t more examples of direct AST interpretation if it’s “just” a constant factor.

                                                                  You need to take into consideration how few toy languages make it to production use, where that constant factor results in significant savings in time and money. (Fewer servers, less developer downtime while tests run, etc.)

                                                                  1. 1

                                                                    I guess I’m thinking of Forth as basically already bytecode, since it’s a stack machine like most bytecode VMs. The same goes for PostScript, although I don’t know the language.

                                                                    Hmm. Forths are typically pretty low level, and often run directly on hardware (though they often will use their own calling conventions, instead of the C one). The dynamism comes from the fact that words are “compiled” into a list of addresses to the other words, which can basically then be “jumped” to in succession. So, it’s actually slightly strange, and not really a VM with “a list of known operators that you switch on” in the traditional sense.
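                                                                    A toy illustration of the “list of addresses to other words” idea, sketched in Python rather than real Forth (the word names are invented):

```python
# Toy sketch of threaded code: a compiled word is just a list of
# references to other words, executed in succession, rather than
# opcodes dispatched through a big "switch on known operators".
stack = []

def lit(n):
    """Return a primitive that pushes the literal n."""
    return lambda: stack.append(n)

def dup():
    stack.append(stack[-1])

def add():
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def run(word):
    for w in word:
        if isinstance(w, list):
            run(w)   # a compiled word: recurse into its list of words
        else:
            w()      # a primitive: execute directly

double = [dup, add]        # a "compiled" word: DUP then + doubles the top
run([lit(21), double])     # leaves 42 on the stack
```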

                                                                    1. 2

                                                                      Good point. Although it means Forths are even more compiled (and a less good example of what I’m looking for) than languages that only stop at the bytecode.

                                                                1. 3

                                                                  i am somewhere between appalled that he’d go so long with a labelless keyboard and ruefully aware that i would probably gripe but do nothing about it too :)

                                                                  1. 2

                                                                    It’s more amazing to me that you could go that long without developing muscle memory.

                                                                    1. 5

                                                                      I can touch-type, but I still look at the keyboard when I type complex passwords. I think my muscle memory is linked to sequences of key presses more than to specific characters. So I can type a sentence, but give me a random sequence of letters, numbers, and symbols and I’ll mess it up if I don’t glance at the keyboard.

                                                                      1. 2

                                                                        I have my keyboard arranged in alphabetical order, mostly to troll anybody else who sits down at my keyboard. Oddly enough, I still find it useful to look at the keyboard when typing a complex password, even though the letters on the keys have nothing to do with the character that they result in.

                                                                  1. 4

                                                                    My automata theory is incredibly rusty, but isn’t it the case that you typically convert an NFA to a DFA, so that, say, you don’t have to simultaneously take two paths explicitly? I guess I could dig out my copy of the dragon book…

                                                                    1. 5

                                                                      You could, but NFA-to-DFA conversion is susceptible to exponential blowup in the number of DFA states. This is why production regex engines use techniques to avoid it, e.g., compiling the DFA lazily or compiling many small DFAs.

                                                                      1. 1

                                                                        exponential blowup in the number of DFA states.

                                                                        Yeah. That makes sense!

                                                                      2. 7

                                                                        Yes, you could. However, if you compute the DFA strictly, you’ll usually end up with a HUGE (exponential in the size of the NFA) DFA, which then needs to be minimized to be workable. On the other hand, Thompson’s approach generates the DFA lazily as it’s needed, allowing you to still run in linear time (constant work per input character).
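                                                                        For intuition, here’s a minimal sketch of the set-of-states NFA simulation that the lazy DFA memoizes (the hand-built transition table for “ab*c” is illustrative):

```python
# Thompson-style NFA simulation: track the *set* of active states, so
# matching takes O(len(text) * states) time with no backtracking.
# Hand-written NFA for the regex "ab*c"; state -> {char: next states}.
nfa = {
    0: {"a": {1}},
    1: {"b": {1}, "c": {2}},
    2: {},
}

def matches(text, start=0, accept=2):
    states = {start}
    for ch in text:
        # Advance every active state on ch, merging the results.
        states = {t for s in states for t in nfa[s].get(ch, set())}
        if not states:
            return False  # no state survived; fail early
    return accept in states

print(matches("abbbc"), matches("ac"), matches("abx"))  # True True False
```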

                                                                        1. 3

                                                                          you’ll usually end up with a HUGE (exponential in the size of the NFA) DFA

                                                                          That’s exaggerating a bit. There are pathological, worst case, regular expressions that are exponential in the size of the NFA, but most of them aren’t, and I wouldn’t say it’s “usual.”

                                                                          1. 1

                                                                            Thanks for that! I’ll have to look at Thompson’s algorithm. I guess I’m even more rusty than I thought…

                                                                        1. 3

                                                                          The example there about edges on an array is something I’ve had to directly deal with myself when I implemented a multidimensional image-processing function for GNU Octave. The problem is to find connected components in a binary image, where voxels are either 0 or 1. You want to find all the islands of ones, and you want to be flexible if diagonals count as being connected or not.

                                                                          The problem, of course, is that along the edges you don’t want to check for neighbours outside of the image boundary. I found no good way to do this for an n-dimensional image, after a few attempts at writing very complicated code. In the end, I padded the whole image with a boundary of zeros and iterated over the padded image with an if(current_voxel) conditional: the padding voxels are all zero, so they are never processed themselves, and neighbour checks at the original image’s edges simply find no lit neighbours in the padding.
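                                                                          The padding trick looks roughly like this in numpy (the function name and shapes are illustrative, not the actual Octave code):

```python
import numpy as np

def neighbor_count(img):
    """Count lit neighbours (including diagonals) for each voxel of an
    n-dimensional binary image, using a one-voxel zero pad so boundary
    voxels need no special-casing."""
    padded = np.pad(img, 1, mode="constant", constant_values=0)
    counts = np.zeros_like(img, dtype=int)
    # Iterate over all 3**n - 1 neighbour offsets.
    for offset in np.ndindex(*(3,) * img.ndim):
        if all(o == 1 for o in offset):
            continue  # skip the centre, i.e. the voxel itself
        # Shifted view of the padded array, aligned with the original.
        slices = tuple(slice(o, o + s) for o, s in zip(offset, img.shape))
        counts += padded[slices]
    return counts
```

                                                                          Note that np.pad still incurs the big copy mentioned below; it only cleans up the bookkeeping, not the allocation.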

                                                                          The code was cleaner, but I incurred a big realloc, because N-dimensional images in Octave are stored as contiguous Fortran-order (column-major) arrays. I’m still looking for a better solution to this problem.

                                                                          So, how do you do this cleanly?

                                                                          1. 2

                                                                            I recently ran across a blog post that claims Julia’s ‘cartesian iterators’ make this less painful/ugly than normal (scroll down to “A multidimensional boxcar filter”). I haven’t put them to any serious use myself yet though, so I can’t really vouch for this either way.

                                                                            1. 2

                                                                              I’d use a union-find algorithm along one dimension at a time. Diagonals would simply be another set of axes.
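                                                                              A rough dict-based sketch of what I mean (names are invented; only axis-aligned neighbours are shown, and diagonal offsets would be additional passes):

```python
import numpy as np

def label_components(img):
    """Label connected components of an n-d binary array by unioning
    each lit voxel with its successor along each axis in turn."""
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    lit = list(zip(*np.nonzero(img)))  # coordinates of all lit voxels
    for v in lit:
        parent[v] = v
    lit_set = set(lit)
    # One pass per axis: union each lit voxel with its +1 neighbour.
    for axis in range(img.ndim):
        for v in lit:
            w = tuple(c + (1 if i == axis else 0) for i, c in enumerate(v))
            if w in lit_set:
                union(v, w)
    return {v: find(v) for v in lit}  # voxel -> component representative
```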

                                                                              1. 1

                                                                                Can you explain this more? Note that iterating one dimension at a time is a bit tricky given that the thing is n-dimensional to begin with.

                                                                            1. 3

                                                                              I’d actually go a step further than the author: sometimes time spent not thinking about work can be the most productive possible use of your time. Programming is not an end unto itself, unless you’re writing development tools (and even then, your domain is usually not actually programming; instead, you’re writing tools to make writing some kind of software easier), and the projects that will be most useful to other people are likely to be the projects that you come up with while you aren’t actively programming.