1. 21

    As a submitter in the table here, I’m happy to stop submitting my own site. I kinda naturally stopped doing that on HN as the community grew and the conversation quality dropped.

    As a reader I think this analysis of majority-submission by one user badly needs a time threshold. It seems too much to wait indefinitely for someone else to post a site before one can do so again.

    My entire RSS feed consists of low-volume stuff by less well-known people. And for most people, if they don’t submit their own site nobody will. I consider that a loss to myself as a reader.

    I’ll play with some queries.

    1. 9

      I haven’t run it, but maybe an extra condition like this would help the query?

      select domain, count(*) as submitted, count(distinct stories.user_id) as submitters,
            (select count(*) from stories s where s.domain_id = domains.id
               group by s.user_id order by 1 desc limit 1) as from_one_submitter
        from domains join stories
        on domains.id = stories.domain_id
            and stories.created_at > date_sub(now(),INTERVAL 1 MONTH)   -- <=======
        group by domain
        having count(*) > 5
            and (from_one_submitter + 1) * 2 > count(*)
        order by 2 desc;
      

      Link to schema for convenience. I’m not sure what the query performance would be with the given indexes.

      1. 5

        Results:

        +------------------------+-----------+------------+--------------------+
        | domain                 | submitted | submitters | from_one_submitter |
        +------------------------+-----------+------------+--------------------+
        | github.com             |        55 |         40 |                187 |
        | medium.com             |        29 |         20 |                245 |
        | youtube.com            |        25 |         21 |                151 |
        | devblogs.microsoft.com |        12 |          5 |                 42 |
        | utcc.utoronto.ca       |        12 |          2 |                 26 |
        | kevq.uk                |         9 |          1 |                 32 |
        | omgubuntu.co.uk        |         7 |          2 |                 10 |
        | dev.to                 |         7 |          6 |                 14 |
        | gist.github.com        |         6 |          5 |                 16 |
        | twitter.com            |         6 |          6 |                 19 |
        +------------------------+-----------+------------+--------------------+
        10 rows in set (0.89 sec)
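
One possible reading of the table above: from_one_submitter is larger than submitted on every row, which would happen if the one-month filter applies to the outer counts but not to the per-user subquery. That is an inference from the numbers, not something the thread confirms. A minimal Python sketch of the same check with the window applied to both counts follows; the function name `dominated_domains` and the `(domain, user_id, created_at)` tuple layout are made up for illustration and are not part of the site's schema.

```python
from collections import Counter
from datetime import datetime, timedelta


def dominated_domains(stories, now, window=timedelta(days=30), min_stories=5):
    """Return {domain: (submitted, submitters, top_count)} for recent stories.

    `stories` is an iterable of (domain, user_id, created_at) tuples
    (a hypothetical layout, chosen only for this sketch).
    """
    cutoff = now - window
    per_domain = {}
    for domain, user_id, created_at in stories:
        # Apply the time window once, so every later count sees the same rows.
        if created_at > cutoff:
            per_domain.setdefault(domain, Counter())[user_id] += 1

    result = {}
    for domain, by_user in per_domain.items():
        submitted = sum(by_user.values())
        top = max(by_user.values())
        # Same thresholds as the query: more than 5 stories, and the top
        # submitter accounts for roughly half or more of them.
        if submitted > min_stories and (top + 1) * 2 > submitted:
            result[domain] = (submitted, len(by_user), top)
    return result
```

With the window applied this way, the top submitter's count can never exceed the domain's total, so the majority condition means what it appears to mean.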
        
        1. 9

          Ah yes, dev.to. Forgot about that one. Haven’t seen much high-quality content from there. I think if I really wanted to see stuff from there, I’d just create an account.

          1. 1

            So that idea was a bust :D

            Maybe we should filter common domains out of these results? Did you say the topical miscreants created 10 stories? How far down do we go before we get to their domain?

            1. 2

              The featured spammers posted 6 stories (now removed) from ten accounts.

            2. 1

              Oh my I do recognise some domains I frequently post.

        1. 2

          The article is fine as far as it goes, because it makes no prescriptions based on this data. The question is: is the data actionable, and how? If not, why is it interesting?

          1. 2

            Key observations I got from the article:

            • A tiny minority of functions have the majority of changes.
            • A vast majority of functions only have a couple of authors.
            • Functions stop being modified around 10K hours (just over 1.14 years).
            • Functions are equally likely to be modified at any given time until they stop being modified (either because they’re stabilized or removed).

            Actions this information suggests to me (trying to read the auspices):

            • For development speed, optimize for workflows that empower individual authors of functions–perhaps this includes streamlining code review and skipping maintenance documentation.
            • For longevity, either find a way of retaining the original authors (better pay? be nicer to them? avoid witchhunts for badthink?) or emphasize pair programming or similar stuff to purposefully break the trend of single-author development.
            • Any maintenance plan longer than about a year is probably optimistic.
            • There is no “quick hack” up until about a year in–expect the code to be actively modified/maintained up until that point.
            1. 2

              Functions are equally likely to be modified at any given time until they stop being modified (either because they’re stabilized or removed).

              Ooh, I missed this one. Thank you!

              The deep and well-known problem in the vicinity of OP is: 1% of the code we write will last a long time, but we don’t know which 1% it is.

          1. 9

            I’m continuing to work on my safe programming language implemented in machine code, that maps mostly 1:1 to machine code. The core language for just integer types is done, and I’m now going to start working on user-defined types. Here’s what my todo list looks like:

            unsafe language with just int types

            [X] parsing function headers
            [X] code-generating primitive instructions
            [X] function calls
            [X] arguments
            [X] return values
            [X] local variables and their reclamation
            [X] register locals and shadowing
            [X] blocks, loops and early exits
            

            unsafe language with user-defined types

            [X] compound types: `addr` and `array`
            [ ] `type` for user-defined product types
            [ ] creating types on the stack
            [ ] `index` for accessing inside array types
            [ ] `get` for accessing inside product types
            [ ] `handle` for heap allocations (fat pointers)
            

            safe language

            [ ] types in primitive instructions
            [ ] types in function calls
            [ ] register checks
            [ ] types in `get` instructions
            [ ] `ref` in caller can map to `addr` in callee
            [ ] register lifetime checks in conditionals
            [ ] register lifetime checks in loops
            [ ] check for function result initialization
            [ ] code-generate runtime checks for array bounds
            [ ] code-generate runtime checks for heap lookups
            [ ] checks for the restricted '*' operator
            

            Compare 5 weeks ago.

            1. 3

              This is sick, man. Please please continue on :) Higher order, typed assembly is something I’ve been waiting for someone to do for a while :D

            1. 18

              I get where he’s coming from, but I think the really valuable thing about microservices is the people-level organizational scaling that they allow. The benefits from having small teams own well-defined (in terms of their API) pieces of functionality are pretty high. It reduces the communication required between teams, allowing each team to move at something more closely approaching “full speed”. You could, of course, do this with a monolith as well, but that requires much more of the discipline to which the author refers.

              1. 15

                Why do you need network boundaries to introduce this separation? Why not have different teams work on different libraries, and then have a single application that ties those libraries together?

                1. 4

                  Network boundaries easily allow teams to release independently, which is key to scaling the number of people working on the application.

                  1. 1

                    Network boundaries easily allow teams to release independently

                    …Are you sure?

                    Independent systems are surely easier to release independently, but that’s only because they’re independent.

                    I think the whole point of a “microservice architecture” is that it’s one system with its multiple components spread across multiple smaller interdependent systems.

                    which is key to scaling the number of people working on the application

                    What kind of scale are we talking?

                    1. 1

                      Scaling by adding more people and teams to create and maintain the application.

                      1. 2

                        Sorry, I worded that question ambiguously (although, the passage I quoted already had “number of people” in it). Let me try again.

                        At what number of people writing code should an organisation switch to a microservices architecture?

                        1. 1

                          That’s a great question. There are anecdotes of teams with 100s of people making a monolith work (Etsy for a long time IIRC), so probably more than you’d think.

                          I’ve experienced a few painful symptoms of when monoliths were getting too big: individuals or teams locking large areas of code for some time because they were afraid to make changes in parallel, “big” releases taking days and requiring code freezes on the whole code base, difficulty testing and debugging problems on release.

                      2. 1

                        I think the whole point of a “microservice architecture” is that it’s one system with its multiple components spread across multiple smaller interdependent systems.

                        While this is often the reality, it misses the aspirational goal of microservices.

                        The ideal of software design is “small pieces, loosely joined”. This ideal is hard to attain. The value of microservices is that they provide guardrails to help keep sub-systems loosely joined. They aren’t sufficient by themselves, but they seem to nudge us in the right, more independent direction a lot of the time.

                    2. 2

                      Thinking about this more was interesting.

                      A [micro]service is really just an interface with a mutually agreed upon protocol. The advantage is the code is siloed off, which is significant in a political context: All changes have to occur on the table, so to speak. To me, this is the most compelling explanation for their popularity: they support the larger political context that they operate in.

                      There may be technical advantages, but I regard those as secondary. Never disregard the context under which most industrial programming is done. It is also weird that orgs have to erect barriers to prevent different teams from messing with each other’s code.

                    3. 12

                      The most common pattern of failure I’ve seen occurs in two steps:

                      a) A team owns two interacting microservices, and the interface is poor. Neither side works well with anyone but the other.

                      b) A reorg happens, and the two interacting microservices are distributed to different teams.

                      Repeat this enough times, and all microservices will eventually have crappy interfaces designed purely for the needs of their customers at one point in time.

                      To avoid this it seems to me microservices have to start out designed for multiple clients. But multiple clients usually won’t exist at the start. How do y’all avoid this? Is it by eliminating one of the pieces above, or some sort of compensatory maneuver?

                      1. 4

                        crappy interfaces designed purely for the needs of their customers…

                        I’m not sure by which calculus you would consider these “crappy”. If they are hard to extend towards new customer needs, then I would agree with you. This problem is inherent in all API design, though a microservice architecture forces you to do more of it, so you can get it wrong more often.

                        Our clients tend to be auto-generated from API definition files. We can generate various language-specific clients based on these definitions and these are published out to consumers for them to pick up as part of regular updates. This makes API changes somewhat less problematic than at other organizations, though they are by no means all entirely automated.

                        …at one point in time

                        This indicates to me that management has not recognized the need to keep these things up to date as time goes by. Monolith vs. microservices doesn’t really matter if management will not invest in keeping the lights on with respect to operational excellence and speed of delivery. tl;dr if you’re seeing this, you’ve got bigger problems.

                        1. 1

                          Thanks! I’d just like clarification on one point:

                          …management has not recognized the need to keep these things up to date as time goes by…

                          By “keeping up to date” you mean changing APIs in incompatible ways when necessary, and requiring clients to upgrade?


                          crappy interfaces designed purely for the needs of their customers…

                          I’m not sure by which calculus you would consider these “crappy”. If they are hard to extend towards new customer needs, then I would agree with you.

                          Yeah. Most often this is because the first version of an interface is tightly coupled to an implementation. It accidentally leaks implementation details and so on.

                          1. 1

                            By “keeping up to date” you mean changing APIs in incompatible ways when necessary, and requiring clients to upgrade?

                            Both, though the latter is much more frequent in my experience. More on the former below.

                            Yeah. Most often this is because the first version of an interface is tightly coupled to an implementation. It accidentally leaks implementation details and so on.

                            I would tend to agree with this, but I’ve found that this problem mostly solves itself if the APIs get sufficient traction with customers. As you scale the system, you find these kinds of incongruities and you either fix the underlying implementation in an API-compatible way or you introduce new APIs and sunset the old ones. All I was trying to say earlier is, if that’s not happening, then either a) the API hasn’t received sufficient customer interest and probably doesn’t require further investment, or b) management isn’t prioritizing this kind of work. The latter may be reasonable for periods of time, e.g. prioritizing delivery of major new functionality that will generate new customer interest, but can’t be sustained forever if your service is really experiencing customer-driven growth.

                            1. 1

                              Isn’t the problem now moved to the ops team, which had to grow in size in order to support the deployment of all these services, as they need to ensure that compatible versions talk to compatible versions, if that is even possible? What I found the most problematic with any microservices deployment is that ops teams suffer more: new roles are needed just to coordinate all these “independent”, small teams of developers, for the sake of reducing the burden on the programmers. One can implement pretty neat monoliths.

                              1. 2

                                We don’t have dedicated “ops” teams. The developers that write the code also run the service. Thus, the incentives for keeping this stuff working in the field are aligned.

                      2. 6

                        It reduces the communication required between teams, allowing each team to move at something more closely approaching “full speed”.

                        Unfortunately, this has not been my experience. Instead, I’ve experienced random parts of the system failing because someone changed something and didn’t tell our team. CI would usually give a false sense of “success” because everyone’s microservice would pass their own CI pipeline.

                        I don’t have a ton of experience with monoliths, but in a past project, I do remember it was nice just being able to call a function and not have to worry about network instability. Deploying just 1 thing, instead of N things and having to worry about service discovery was also nicer. Granted, I’m not sure how this works at super massive scale, but at small to medium scale it seems nice.

                        1. 2

                          Can you give an example where this really worked out this way for you? These are all the benefits one is supposed to have, but the reality often looks different in my experience.

                          1. 10

                            I work at AWS, where this has worked quite well.

                            1. 5

                              Y’all’s entire business model is effectively shipping microservices, though, right? So that kinda makes sense.

                              1. 20

                                We ship services, not microservices. Microservices are the individual components that make up a full service. The service is the thing that satisfies a customer need, whereas microservices do not do so on their own. Comprising a single service, there can be anywhere from a handful of microservices up to several hundred, but they all serve to power a coherent unit of customer value.

                                1. 4

                                  Thank you for your explanation!

                              2. 4

                                AWS might be the only place I’ve heard of which has really, truly nailed this approach. I have always wondered - do most teams bill each other for use of their services?

                                1. 5

                                  They do, yes. I would highly recommend that approach to others, as well. Without that financial pressure, it’s way too easy to build some profligate waste into your systems.

                                  1. 1

                                    I think that might be the biggest difference.

                                    I’d almost say ‘we have many products, often small, and we often use our own products’ rather than ‘we use microservices’. The latter, to me, implies that the separation stops at code - but from what I’ve read it runs through the entire business at AMZ.

                                  2. 1

                                    It’s worked well at Google too for many years but we also have a monorepo that makes it possible to update all client code when we make API changes.

                            1. 4

                              Author here, happy to answer any questions about working with Manticore or my experience implementing the WASM spec.

                              1. 2

                                This is all very new to me. How is Manticore related to fuzz testing?

                                1. 3

                                  Symbolic execution and fuzzing are kind of complementary. From https://sites.cs.ucsb.edu/~vigna/publications/2018_SPMag_MechPhish.pdf , PDF page 6:

                                  Mechanical Phish’s exploitation involves two major steps. The first finds crashes in the target programs. The second step takes those crashes and attempts to figure out how they can be modified to produce exploits that take control of the program. We used AFL, a well-known and highly successful evolutionary fuzzer, as the core of the bug-finding component of our CRS. […] Symbolic execution is a slow but powerful technique for determining the equations that describe the state of the program at any point in execution. To use it efficiently, Driller limits the search space of the symbolic execution to that of the inputs generated by AFL. Specifically, the symbolic execution component will follow each input in AFL’s corpus and check if there are any new locations in the program that it can reach.

                                  Fuzzing finds crashes, symbolic execution helps turn crashes into provable vulnerabilities.

                                  1. 2

                                    This is a great response, I’d also recommend the Driller paper (https://sites.cs.ucsb.edu/~vigna/publications/2016_NDSS_Driller.pdf) as a resource for how symbolic execution and fuzzing can fit together.

                                2. 2

                                  Do you have an instruction limit or memory limit? If so how is it configured?

                                  1. 1

                                    Manticore doesn’t impose any built-in constraints on instruction count or memory usage. In fact, it’ll happily gobble up all of your system’s resources if you give it a particularly heavy workload. We usually run Manticore on cloud machines for that reason, but if we need to use it on a shared machine, we use Linux utilities like ulimit and nice to make it a bit less aggressive.

                                1. 3

                                  Can we see some example output?

                                  1. 1

                                    Here is a screenshot: https://medv.io/assets/ll.png

                                    1. 2

                                      Thank you!

                                      What do the colors indicate in that screenshot? Is that how you indicate git status?

                                      1. 1

                                        Yes, git status. Same as in git status. Green - added. blue - modified.

                                  1. 15

                                    You may consider learning perl instead:

                                    • the programs are as terse as awk
                                    • much more flexible
                                    • wide set of libraries
                                    • as widely available

                                    Remember the power of Perl’s ‘while(<>) func’, which hides command line arg parsing, stdio handling, and the per-line loop. I highly recommend the Perl Cookbook; you will be amazed how practical it is. Don’t let your Perl (or awk) programs grow beyond 10 lines: they become a pain to maintain as they grow.
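
For readers who don't know Perl, here is a rough Python analogue of what that idiom hides, using the standard library's `fileinput` module. It is only a sketch: the function name `number_lines` and the line-numbering job are made up for illustration.

```python
import fileinput


def number_lines(files=None):
    """Yield 'N: text' for every input line.

    With files=None, fileinput reads the files named in sys.argv[1:], or
    stdin when there are none, which is roughly what Perl's while(<>) does.
    """
    with fileinput.input(files=files) as stream:
        for line in stream:
            # lineno() is cumulative across all input files.
            yield f"{stream.lineno()}: {line.rstrip()}"
```

In a script, `for out in number_lines(): print(out)` would then handle both `script.py a.txt b.txt` and `cat a.txt | script.py` without any explicit argument parsing.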

                                    1. 7

                                      I know Perl, but found that awk was much more likely to be available and it’s much faster for quick scripts because there is less overhead in running the binary. I’ve moved most of my muscle memory to reach for awk instead.

                                      1. 5

                                        Awk does have the advantage of being a smaller language. You can understand awk enough to do useful things with it in an afternoon. Perl is Byzantine in its complexity as a language, which always scared me off using it for one-liners.

                                        1. 3

                                          This is true especially when reading Perl scripts written by someone else. On the other hand, learning enough Perl to write useful one-liners and process text streams is quite easy.

                                          1. 2

                                            You can understand awk enough to do useful things with it in an afternoon.

                                            I did that with Perl and a bunch of other languages at various times. The trick is you learn just the subset you need for structured programming, basic I/O, and whatever data format you deal with. Much tinier. Just cuz it’s there doesn’t mean you have to use it.

                                            For Perl, I also had to learn regular expressions. They kept paying off outside of Perl, though.

                                            1. 1

                                              Just cuz it’s there doesn’t mean you have to use it.

                                              Yeah, but finding a useful subset of Perl means I have to learn enough Perl to know what a useful subset would be, where awk is already that useful subset.

                                              Doesn’t mean you can’t approach it like that, for sure, but I was pleasantly surprised how easy awk was to pick up when I decided to try to learn it a while back.

                                            2. 1

                                              Perl’s most ardent users are the language’s worst enemy ;) The downside of TMTOWTDI[1] is that experienced Perl hackers settle into a set of personal idioms that they are comfortable with, but that others may not be.

                                              Bondage and discipline languages with a much stricter focus on what’s “officially” idiomatic, like Python, don’t have this problem, and neither do small, focused languages like AWK.

                                              [1] “There’s More Than One Way To Do It!”

                                            3. 4

                                              One of the reasons I prefer Perl is its portability between BSD and Linux. Sadly, this isn’t the case with AWK due to different implementations.

                                              1. 2

                                                Scripts written for the One True awk (which most BSDs use) should work with gawk.

                                                1. 1

                                                  There is also GNU awk as a package.

                                                2. 3

                                                  I used to think this, but after stuff like this I exited.

                                                  1. 1

                                                    Update: That diff is not very clear, but here’s the issue from another repo I own that triggered the patch: https://github.com/akkartik/wart/issues/5

                                                    1. 1

                                                      Sad that enabling warnings caused this, it’s usually a given when writing scripts.

                                                      1. 2

                                                        It was a warning for a few minor versions, and then an error at some minor version.

                                                        1. 3

                                                          Wow, a lot of software was affected by this, based on this google search. Looks like an Autotools artifact? Someone wanted to avoid using / as delimiters?

                                                          Edit: for this specific use case, I think sed and AWK are a better fit…

                                                  1. 4

                                                    I don’t know much Haskell, so I don’t understand what Left, Right and M.Map do. But is attempt 2 basically isomorphic to Rich Hickey’s criticism of Haskell that it can’t unify T with Maybe T?

                                                    1. 7

                                                      Left and Right are data constructors for the Either a b type. This is a basic sum type: it holds either an a or a b, but not both. Pattern matching on an Either a b you see whether it’s a Left x or a Right x and then you know either x : a or x : b.

                                                      M.Map is a type constructor of type K -> V -> M.Map K V, where M.Map K V is a key-value map with keys of type K and values of type V.

                                                      But is attempt 2 basically isomorphic to Rich Hickey’s criticism of Haskell that it can’t unify T with Maybe T?

                                                      This totally misses the point of Haskell. You can’t ‘unify’ two values of different types. A function of type Maybe T -> T has an error on the Nothing branch allowing it to type check.

                                                      T -> Maybe T is the type of Just. Obviously Haskell has that.
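
For anyone who, like the parent, doesn't read Haskell, a rough Python analogue of Either may help. This is only a sketch: Python has no built-in sum types, so the `Left` and `Right` classes and the `parse_int` example below are made up for illustration, and nothing here is checked at compile time the way it is in Haskell.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Left(Generic[A]):
    value: A


@dataclass(frozen=True)
class Right(Generic[B]):
    value: B


# A value of this type is exactly one of the two wrappers, never both.
Either = Union[Left[A], Right[B]]


def parse_int(text: str) -> "Either[str, int]":
    """Return Right(number) on success, Left(error message) on failure."""
    try:
        return Right(int(text))
    except ValueError:
        return Left(f"not an integer: {text!r}")


def describe(result) -> str:
    # The isinstance checks play the role of pattern matching on the constructor.
    if isinstance(result, Left):
        return f"error: {result.value}"
    return f"ok: {result.value}"
```

The wrapper tells you which side you got, so a Left holding a string and a Right holding a string are never confused with each other.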

                                                      1. 3

                                                        I’m not a fan of Rich Hickey or his views about type systems, but let’s do a little more justice to his argument about Maybe T. He says that if I call a function that expects a Maybe T with a plain T, the compiler should be smart enough to adapt the T by inserting a Just in front of it. So, in effect, he wants T to be a subtype of Maybe T and he wants Haskell’s type system to have contravariance w.r.t. subtyping.

                                                        Even though what he wants seems reasonable to a first approximation, he doesn’t mention what should happen if the type is Maybe a for a polymorphic a, and if I try to pass in Maybe Int. Should it become Maybe (Maybe Int) or just Maybe Int? In my view, the road to programmer hell is paved with naive intentions of taking petty shortcuts like this.
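
The nesting question above can be seen in a language that already collapses "no value" into the value itself. A small Python sketch, where the `Just` wrapper, the `Nothing` alias and both lookup helpers are made up for illustration:

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Just:
    value: Any


Nothing = None  # stand-in for Haskell's Nothing in this sketch


def lookup_flat(table, key):
    """Implicit style: return the value, or None when the key is absent."""
    return table.get(key)


def lookup_wrapped(table, key):
    """Explicit style: wrap found values, so absence stays distinguishable."""
    return Just(table[key]) if key in table else Nothing


# The key is present, but its value is deliberately "no value".
settings = {"timeout": None}
```

Here `lookup_flat` returns None for both "timeout" and a missing key, while `lookup_wrapped` returns `Just(None)` for the first and `Nothing` for the second, which is the Maybe (Maybe Int) case that automatic wrapping would have to guess about.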

                                                    1. 29

                                                      This is a great post. I’m now going to be unfair and focus on one tiny part of it to criticize.

                                                      …a mass insertion of b'' prefixes everywhere… would require developers to think about whether a type was a bytes or str…

                                                      In addition, …the added b characters would cause a lot of lines to grow beyond our length limits and we’d have to reformat code.

                                                      This is a great example of the kind of brain damage fostered by style guides: we start to think concerns of style are as important as concerns of semantics (the first sentence above). Would it really be so bad to wait until the migration is done to perform a final reformatting? Just increase the line size limit until then in any linting hooks.

                                                      I don’t mean to criticize just this post. We all make penny-wise pound-foolish decisions like this every day. Myself included. Every rule we introduce has a cost, in loss of discretion and atrophying of individual judgement.
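
For context on the semantic half of the quoted passage, here is a short sketch of why the b'' prefix forces a decision in Python 3. The variable names are made up for illustration; the behaviour shown is standard Python 3.

```python
text = "h\u00e9llo"            # str: a sequence of Unicode code points
data = text.encode("utf-8")    # bytes: a sequence of integers 0..255

# A bytes literal and a str literal with the same characters are not equal.
same_literal_equal = (b"abc" == "abc")

# Indexing bytes gives an int; indexing str gives a one-character str.
first_of_bytes = b"abc"[0]
first_of_str = "abc"[0]

# Lengths differ once non-ASCII text is encoded.
lengths = (len(text), len(data))


def mixing_raises() -> bool:
    """Report whether concatenating bytes and str raises TypeError."""
    try:
        b"abc" + "def"
    except TypeError:
        return True
    return False
```

So adding or omitting the prefix changes what a value is and how it behaves, which is a different kind of concern from how long the line becomes.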

                                                      1. 5

                                                        The problem is that especially in large organizations there really is value in simply picking a set of style guidelines and then requiring that any checked in code adhere to them.

                                                        This isn’t about bike shedding either. When I started at my current job we had HUGE bodies of Python code written by folks who decidedly weren’t Python programmers. It used inconsistent tab spacing, didn’t follow variable naming conventions and generally ignored Python’s style conventions entirely.

                                                        The net result was that the code was MUCH harder to read and maintain by the long time Pythonistas on the team, and nearly impossible for newbies to modify.

                                                        1. 2

                                                          I agree with all that in isolation, but I don’t think it counters anything I said about balancing syntax and semantics.

                                                          Yes, if you leave style totally flexible, really large teams will have a really bad time. But there’s a wide spectrum here. There is always some value, but there are also costs to balance with other priorities. You don’t have to be perfectly rigid 100% of the time.

                                                          1. 1

                                                            You don’t have to be perfectly rigid 100% of the time.

                                                            Anybody who is perfectly rigid with any style guide is Just Plain Doing it Wrong :)

                                                            1. 3

                                                              Orwell’s final rule: “Break any of these rules sooner than say anything outright barbarous”.

                                                        2. 2

                                                          This is why I use a 79 character line limit.

                                                          That’s a joke, obviously, the real reason is much worse.

                                                          1. 1

                                                            I’m confused. Are there no auto-formatting tools for python that can wrap the lines for you? That seems the ideal fix.

                                                          1. 9

                                                            I’m continuing to work on my safe programming language implemented in machine code, that maps mostly 1:1 to machine code. Here’s what my todo list looks like:

                                                            unsafe language with just int types
                                                            [✓] parsing function headers
                                                            [✓] code-generating primitive instructions
                                                            [✓] function calls
                                                            [✓] arguments
                                                            [✓] return values
                                                            [ ] local variables and their reclamation
                                                            [ ] register locals and shadowing
                                                            [ ] blocks, loops and early exits
                                                            
                                                            unsafe language with user-defined types
                                                            [ ] compound types: `addr` and `array`
                                                            [ ] `type` for user-defined product types
                                                            [ ] `choice` for user-defined sum types
                                                            [ ] `ref` for creating types on the stack
                                                            [ ] `handle` for heap allocations (fat pointers)
                                                            [ ] `index` for accessing inside array types
                                                            [ ] `get` for accessing inside product types
                                                            [ ] some method for sum type access (tagged unions)
                                                            [ ] anonymous union types (e.g. `int|err`)
                                                            
                                                            safe language
                                                            [ ] types in primitive instructions
                                                            [ ] types in function calls
                                                            [ ] register checks
                                                            [ ] types in `get` instructions
                                                            [ ] types in sum type access
                                                            [ ] `ref` in caller can map to `addr` in callee
                                                            [ ] register lifetime checks in conditionals
                                                            [ ] register lifetime checks in loops
                                                            [ ] check for function result initialization
                                                            [ ] code-generate runtime checks for array bounds
                                                            [ ] code-generate runtime checks for heap lookups
                                                            
                                                            1. 9

                                                              Unjustified complexity has a name: Bloat.

                                                              1. 28

                                                                Sure. How do you know the complexity in this case is unjustified?

                                                                1. -1

                                                                  Sure. How do you know the complexity in this case is unjustified?

                                                                  It isn’t accompanied with a justification.

                                                                  1. 12

                                                                    But neither is your accusation?

                                                                    As someone who is too young for all the systemd hate, I really haven’t found any good explanation of why I should hate it too.

                                                                    1. 7

                                                                      There is a lot of justification.

                                                                      Most people care about the “Keeping the First User PID Small” aspect. The primary argument of Poettering is “shell scripts are evil”.

                                                                      On my system the scripts in /etc/init.d call grep at least 77 times. awk is called 92 times, cut 23 and sed 74. Every time those commands (and others) are called, a process is spawned, the libraries searched, some start-up stuff like i18n and so on set up and more. And then after seldom doing more than a trivial string operation the process is terminated again. Of course, that has to be incredibly slow.

                                                                      shell scripts are also very fragile, and change their behaviour drastically based on environment variables and suchlike, stuff that is hard to oversee and control.

                                                                      Most of the scripting is spent on trivial setup and tear-down of services, and should be rewritten in C, either in separate executables, or moved into the daemons themselves, or simply be done in the init system.

                                                                      1. 2

                                                                        so his reasoning is that unix tools “have to be incredibly slow,” and it’s hard to control environment variables. i haven’t read much from poettering but i think i know now why people hate him.

                                                                        1. 1

                                                                          when you start any binary many times in a loop, it is slow. that’s why if you want your shell scripts to have good performance you want to do as much in pure shell as possible, especially in any loops. Also, environment variables are global mutable state, which by definition is difficult to control. The only way to do anything about them is use cgroups for everything, which systemd largely does and sysv does not.
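
A rough way to check this on your own machine, rather than argue from theory: the sketch below does the same trivial string test once by spawning a child process per line and once in-process, and times both. The `LINES` data and the helper names are made up for the illustration, and the child here is a Python interpreter, which is heavier to start than grep, so treat the numbers as an upper bound on what a shell loop would pay.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for lines an init script might filter.
LINES = [f"service-{i}={'enabled' if i % 2 == 0 else 'disabled'}" for i in range(20)]

# Tiny child program: exit status 0 if the argument contains "=enabled".
CHILD = "import sys; sys.exit(0 if '=enabled' in sys.argv[1] else 1)"


def count_by_spawning(lines):
    """Start one child process per line, like calling grep inside a loop."""
    matches = 0
    for line in lines:
        result = subprocess.run([sys.executable, "-c", CHILD, line])
        if result.returncode == 0:
            matches += 1
    return matches


def count_in_process(lines):
    """Do the same string test without leaving the current process."""
    return sum(1 for line in lines if "=enabled" in line)


def timed(func, lines):
    """Return the function's result together with its wall-clock time."""
    start = time.perf_counter()
    value = func(lines)
    return value, time.perf_counter() - start


spawned_count, spawned_seconds = timed(count_by_spawning, LINES)
inline_count, inline_seconds = timed(count_in_process, LINES)

print(f"spawned:    {spawned_count} matches in {spawned_seconds:.4f}s")
print(f"in-process: {inline_count} matches in {inline_seconds:.4f}s")
```

Both variants find the same matches; only the elapsed time differs, and how much it differs depends on the machine and on what is already in the caches.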

                                                                          1. 1

                                                                            the theoretical reasons that starting external binaries could be slower than running a single program are not enough to say that something is slow in practice. i haven’t noticed any difference in speed between sysvinit and systemd, unless you count stalling at shutdown.

                                                                            maybe we are talking about different things but there is stuff like env(1) and execle(2) which let you control the environment for a process, without cgroups.
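
To illustrate the same idea in a scripting language: a minimal sketch, assuming a POSIX-like system, where the parent hands the child an explicit environment instead of letting it inherit one. `subprocess.run(..., env=...)` replaces the inherited environment wholesale, much like passing `envp` to execle(2). The variable names `SERVICE_MODE` and `LEAKY_PARENT_VAR` are invented for the example.

```python
import os
import subprocess
import sys

# Child program: report two variables (both names are hypothetical).
CHILD_CODE = (
    "import os;"
    "print(os.environ.get('SERVICE_MODE'), os.environ.get('LEAKY_PARENT_VAR'))"
)


def run_isolated(env):
    """Start a child whose environment is built from `env` alone."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,  # replaces the inherited environment instead of extending it
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Something set in the parent that the child should not see.
os.environ["LEAKY_PARENT_VAR"] = "set-in-parent"

output = run_isolated({"SERVICE_MODE": "production"})
print(output)
```

The child sees `SERVICE_MODE` but not the variable that was set in the parent, and no cgroups are involved. On Windows an almost-empty environment may keep the interpreter from starting, so this is only meant for Unix-like systems.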

                                                                            1. 1

                                                                              It’s “slow”, but with very fast values of slow. Sure, you probably don’t need grep and awk/sed (both have /regex/ as a core feature) - but IMNHO it’s a lot easier to read that way - and of course you don’t need a c++ compiler to fix a logic or “parser” bug.

                                                                              “Speed” was one of the arguments for perl (over sh+awk+grep+sed) - still is. But it’s normally not a very interesting or compelling argument.

                                                                              “70+ invocations of grep” might sound like a lot of “bloat”, but we’re effectively talking about quick process forks here, hot disk and cpu caches.

                                                                              I don’t think I’ve ever seen init or login being slow due to “too many forks” - but doing some things safely in parallel can be an obvious benefit. But you can do that with shell too.

                                                                    2. 10

                                                                      LOC is a terrible way to measure complexity.

                                                                      (Note that I’m not saying systemd is not complex, just that trying to equate LOC to code quality/complexity is not an accurate way to measure quality/complexity)

                                                                      1. 8

                                                                        The article you linked doesn’t say that. Also the article agrees with the tone of ethoh’s comment. I personally think that more lines of code is almost always more complex. Whether it’s irreducibly complex or just complex is another question, but it’s probably complex.

                                                                        1. 6

                                                                          “I don’t need a calculator to determine if I need to pay taxes on a million dollars.” – Ian Malcolm, Jurassic Park

                                                                          1. 1

                                                                            it’s more accurate than any alternative

                                                                            1. 2

                                                                              Which is? A knee-jerk reaction to something you don’t like? At least come up with some arguments as to why you think it’s bloated.

                                                                              1. 1

                                                                                i think you responded to the wrong comment.

                                                                              2. 0

                                                                                There is a much more accurate and meaningful alternative: cyclomatic complexity.

                                                                                However, SLOC is a good approximation, and it is much easier to count lines.

                                                                                In both cases you also must consider the complexity of the dependencies (libraries, compilers…).
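
For a feel of how the two measures differ, here is a minimal sketch that approximates cyclomatic complexity for Python source by counting decision points in the syntax tree (one plus the number of branches) and compares it with a plain count of non-blank lines. The set of node types counted is a simplification chosen for this example; real tools count more cases, and none of this measures the dependencies.

```python
import ast

# Node types treated as one decision point each (a simplifying assumption).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths, one per extra operand.
            complexity += len(node.values) - 1
    return complexity


def sloc(source):
    """Count non-blank source lines."""
    return sum(1 for line in source.splitlines() if line.strip())


SAMPLE = '''
def classify(n):
    if n < 0 and n % 2:
        return "negative odd"
    for i in range(n):
        if i == 3:
            return "has three"
    return "other"
'''

print("complexity:", cyclomatic_complexity(SAMPLE))
print("sloc:", sloc(SAMPLE))
```

The sample has seven non-blank lines and a complexity of five: the two `if` statements, the `for` loop and the `and` each add one path to the single base path.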

                                                                                1. 1

                                                                                  cool i never heard of that. are there tools to measure cyclomatic complexity?

                                                                                  1. 1

                                                                                    Yes. Some linters measure it.

                                                                          1. 1

                                                                            Wonderful. Feature requests if they don’t already exist:

                                                                            1. 15

                                                                              TL;DR: 9front plays well with other systems with very little setup.

                                                                              If you do not want to install 9front on your main disk and boot it by default, you may make use of a virtual machine, either local or hosted on a VPS or some computer at home.

                                                                              Then you can connect to it through http://drawterm.9front.org/, which does provide a “plan 9 window” (think of SSH + X11 forwarding + SFTP onto a much simpler protocol).

                                                                              It is surprising how well this works in practice since drawterm got resizing support: all the windows inside it will be resized equally well due to how plan9 handles resizes.

                                                                              You can start a window system (rio) in it, or directly one application (acme, sam, other…).

                                                                              You can also copy paste from/to it, and have your local filesystem mounted inside 9front for sharing files between the local and remote.

                                                                              After a while, you can integrate more of 9front into your environment: replace a DNS (ndb+cs) / mail (imap,smtp,spamfilter) / VPN (tinc) server with it. It works well.

                                                                              Given all of these are filesystems available as 9p streams, you can mount them onto other non-9front systems. Here comes all the fun: use 9p servers so that systems provide services to others. You can mount 9p natively on Linux, or through FUSE: https://github.com/mischief/9pfs

                                                                              Then the frontier between 9front systems and other systems tends to blur, and you have distributed your computing across multiple systems. I haven’t gone that far due to lack of time (and other ongoing projects).

                                                                              In this vision, you can use 9front for getting some work done, and make use of other systems for what else good they have and 9front lacks, and the other way around.

                                                                              I did not test the other way around, running Linux or the like inside 9front, although it has a hypervisor: http://man.9front.org/1/vmx

                                                                              1. 4

                                                                                If you use macOS and have a retina display, you can use my fork of drawterm, https://bitbucket.org/j-xy/drawterm/, which has the same metal backend as the one in plan9port devdraw.

                                                                                By the way, with the new mailing list host, you can see all the “glory/gory” details about the community at https://9fans.topicbox.com/groups/9fans

                                                                                1. 2

                                                                                  Nice, have you thought about getting your patches into the drawterm that 9front ships?

                                                                                  1. 3

                                                                                    Sure. I talked to folks on irc, but it didn’t get anywhere, apart from one commit, https://code.9front.org/hg/drawterm/rev/ff43e9bf3cea

                                                                                    1. 2

                                                                                      post to the 9front mailing list or e-mail cinap directly.

                                                                                2. 3

                                                                                  You got me to upload the ISO to a VPS on Vultr, and I can connect to its console and get a window manager! Thank you.

                                                                                  I don’t have a 3-button mouse, though. IIRC that’s pretty essential? Is there some way to emulate middle-click, particularly over the internet?

                                                                                  I see a terminal window open. How can I open a second?

                                                                                  I haven’t yet tried drawterm, but I look forward to trying it out next.

                                                                                  1. 2

                                                                                    Thank you for your input. This gradual approach does indeed sound sensible.

                                                                                  1. 1

                                                                                    @akkartik@mastodon.social

                                                                                    I mostly post updates about my Mu project. (Mastodon has become my primary platform since last year’s thread.)

                                                                                    1. 2

                                                                                      Thank you for sharing this again; I couldn’t remember the name.

                                                                                      Distinguishing input and outputs of an opcode might be done with a merely decorative sigil, just like the opcode names.

                                                                                      I allocate local variables on the stack but forget to clean them up before returning.

                                                                                      You don’t need this problem. If none of your functions are reentrant then you can use static local variables instead. Reentrancy was not the default in the ancient tongues.

                                                                                      I have what is probably a simple plan for memory safety: divide the program into an upper imperative layer and a lower functional layer. Functions in the functional layer can only return copies. In the imperative layer everything is reference-counted (Rc<RefCell<_>>). With additional complication you could allow the functional layer to modify owned values & mutable references. (This model is unfortunately incompatible with OOP.)

                                                                                      I’ve been finding good error messages to be more valuable than syntactic conveniences.

                                                                                      Perhaps “design by error quality” should be a thing.

                                                                                      1. 3

                                                                                        Thank you, yes “design by error quality” feels like a movement I can get behind.

                                                                                        But I’m still too much of a lisper to give up recursive functions :)

                                                                                      1. 3

                                                                                        Way back in the day, I played around with a language with a similar goal called “TACKLE”. Unlike the Mu language here that looks a lot like C, TACKLE was a weird combination of NASM syntax & QBasic. Because I was doing real mode development & most of my experience was in QBasic with a vague familiarity with real mode x86 assembly, this combination really worked for me & I did cool stuff in it.

                                                                                        It’s sort of unusual to see stuff pitched as explicitly low-level “system languages”. C is used for all sorts of stuff. I think aside from this and TACKLE, the only one I’m aware of is C-- (and I don’t even know what C-- looks like – just the name).

                                                                                        Since the machines you’re targeting will need to be built from scrap, are you planning to keep ASCII (with its space half wasted by no-longer-necessary control characters)? Will you remap the graphical characters for high and low range to be something useful as part of the languages you’re building for this?

                                                                                        1. 6

                                                                                          Nope, no new encoding. I’m already boiling half the ocean, I’d like to leave the other half to someone else :)

                                                                                          I do want to add utf-8 support at some point, but yes so far it’s just ASCII.

                                                                                          I was using “system language” as a sort of shorthand. It’s like this pseudo-category of languages that contains just C. The major characteristic of C that I’m trying to replicate is that it doesn’t use other (“higher level”) languages. Other languages use it in their implementations.

                                                                                        1. 4

                                                                                          As the OP of that thread and someone genuinely interested in a truly simple, minimal operating system, I say bravo! Your work is incredible. I firmly believe that implementation beats theory any and every day of the week.

                                                                                          How’s that Level 3 programming language coming along? Got any sneak peeks you can share?

                                                                                          1. 6

                                                                                            Not yet, I’m still building Level 2. (OP is just an outline of a plan.) Level 3 is just vaporware so far. Thank you, and to all the other commenters!

                                                                                          1. 3

                                                                                            I keep getting a little arrow emoji in reading your code, is some special unicode character missing from my computer, or is it supposed to be a little arrow?

                                                                                            1. 3

                                                                                              Oh sorry I forgot to post my more useful feedback.

                                                                                              Really cool post, and it will be interesting to see where Mu goes after you’ve hacked on it to a useable state (I would totally play with it). I don’t think the future needs C either, but it’s up to programmers with edge to take the approach of cutting it out of the machine code generation process to make it irrelevant.

                                                                                              1. 1

                                                                                                Sorry, I’m cheating there. The repo just uses <-.

                                                                                              1. 60

                                                                                                This site is claiming to offer a “standard for opting out of telemetry”, but that is something we already have: Unless I actively opt into telemetry, I have opted out. If I run your software and it reports on my behavior to you without my explicit consent, your software is spyware.

                                                                                                1. 11

                                                                                                  but that is something we already have: Unless I actively opt into telemetry, I have opted out.

                                                                                                  I know this comes up a lot, but I disagree with that stance. The vast majority of people leave things on their defaults. The quality of information you get from opt-in telemetry is so much worse than from telemetry by default that it’s almost not worth it.

                                                                                                  The only way I could see “opt-in” telemetry actually work is caching values locally for a while and then be so obnoxiously annoying about “voluntarily” sending the data that people will do it just to shut the program up about it.

                                                                                                  1. 26

                                                                                                    That comment acts like you deserve to have the data somehow? Why should you get telemetry data from all the people that don’t care about actively giving it to you?

                                                                                                    1. 12

                                                                                                      That comment acts like you deserve to have the data somehow?

                                                                                                      I’ve got idiosyncratic views on what “deserving” is supposed to mean, but I’ll refrain from going into philosophy here.

                                                                                                      Why should you get telemetry data from all the people that don’t care about actively giving it to you?

                                                                                                      Because the data is better and more accurate. Better and more accurate data can be used to improve the program—which is something everyone will eventually benefit from. But if you skew the data towards the kinds of people who opt into telemetry, it stops reflecting how most users actually behave.

                                                                                                      Without any telemetry, you’ll instead either (a) get the developers’ gut instinct (which may fail to reflect real-world usage), or (b) the minority that opens bug tickets dictate the UI improvements instead, possibly mixed with (a). Just as hardly anyone (in the large scale of things) bothers with opting into telemetry, hardly anyone bothers opening bug tickets. Neither group may be representative of the silent majority that just wants to get things done.

                                                                                                      Consider the following example for illustration of what I mean (it is a deliberate oversimplification, debate my points above, not the illustration):

                                                                                                      Assume you have a command-line program that has 500 users. Assume you have telemetry. You see that a significant percentage of invocations involve the subcommand check, but no such command exists; most such invocations are immediately followed by the correct info command. Therefore, you decide to add an alias. Curiously, nobody has told you about this yet. However, once the alias is there, everyone is happier and more productive.

                                                                                                      Had you not had telemetry, you would not have found out (or at least not found out as quickly, only when someone got disgruntled enough to open an issue). The “quirk” in the interface may have scared off potential users to alternatives, not actually giving your program a fair shot because of it.
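
To make the illustration concrete, here is a minimal sketch of the analysis it describes, assuming telemetry arrives as one list of attempted subcommands per session. The `KNOWN` set, the session data and the `min_share` threshold are all invented for the example; the function only looks for an unknown subcommand that is immediately followed by a known one.

```python
from collections import Counter

# Subcommands the hypothetical program actually implements.
KNOWN = {"info", "build", "clean"}


def suggest_aliases(sessions, min_share=0.1):
    """Return (unknown, followed_by, count) for frequent failed-then-fixed pairs."""
    follow_ups = Counter()
    total = 0
    for commands in sessions:
        total += len(commands)
        # Pair each invocation with the one issued right after it.
        for attempted, following in zip(commands, commands[1:]):
            if attempted not in KNOWN and following in KNOWN:
                follow_ups[(attempted, following)] += 1
    return [
        (attempted, following, count)
        for (attempted, following), count in follow_ups.most_common()
        if total and count / total >= min_share
    ]


# Made-up telemetry: each inner list is one user's session.
SESSIONS = [
    ["check", "info", "build"],
    ["build", "clean"],
    ["check", "info"],
    ["info"],
    ["build", "check", "info"],
]

suggestions = suggest_aliases(SESSIONS)
print(suggestions)
```

With this made-up data, `check` followed by `info` accounts for 3 of the 11 recorded invocations, so it is the one pair reported as an alias candidate.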

                                                                                                      1. 3

                                                                                                        Bob really wants a new feature in a piece of software he uses. Bob suggests it to the developers, but they don’t care. As far as they can tell, Bob is the only one wanting it. Bob analyzes the telemetry-related communication and writes a simple script that imitates it.

                                                                                                        Developers are concerned about privacy of their users and don’t store IP addresses (it’s less than useless to hash it), only making it easier for Bob to trick them. What appears as a slow growth of active users, and a common need for a certain feature, is really just Bob’s little fraud.

                                                                                                        It’s possible to make this harder, but it takes effort. It takes extra effort to respect users’ privacy. Is developing a system to spy on the users really more worthy than developing the product itself?

                                                                                                        You also (sort of) argued that opt-in telemetry is biased. That’s not exactly right, because telemetry is always biased. There are users with no Internet access, or at least an irregular one. And no, we don’t have to be talking about developing countries here. How do you know majority of your users aren’t medical professionals or lawyers whose computers are not connected to the Internet for security reasons? I suspect it might be more common than we think. Then on the other hand, there are users with multiple devices. What can appear as n different users can really just be one.

                                                                                                        It sort of depends on you general philosophical view. You don’t have to develop a software for free, and if you do, it’s up to you to decide the terms and conditions and the level of participation you expect from your users. But if we talk about a free software, I think that telemetry, if any, should be completely voluntary on a per-request basis, with a detailed listing of all information that’s to be sent in both human- and machine- readable form (maybe compared to average), and either smart enough to prevent fraudulent behavior, or treated with a strong caution, because it may as well be just an utter garbage. Statistically speaking, it’s probably the case anyway.

                                                                                                        I’m well aware that standing behind a big project, such as Firefox, is a huge responsibility and it would be really silly to advise developers to rather trust their guts instead of trying to collect at least some data. That’s why I also suggested how I imagine a decent telemetry. I believe users would be more than willing to participate if they saw, for example, that they used a certain feature an above-average number of times, and that their vote could stop it from being removed. It’s also possible to secure per-request telemetry with a captcha (or something like that) to make it slightly more robust. If this came up once in a few months, “hey, dear users, we want to ask”, hardly anyone would complain. That’s how some software does it, after all.

                                                                                                        1. 1

                                                                                                          The fraud thing is an interesting theory, but I am unaware how likely it is; you’ve theorised a Bob who can generate fraudulent analytics but couldn’t fake an IP address or use multiple real IP addresses or implement the feature he actually wants.

                                                                                                          1. 0

                                                                                                            It’s not that he couldn’t do it, it’s just much simpler without that. It’s really about the cost. It’s easy to curl, it’s more time consuming or expensive to use proxies, and even more so to solve captchas (or any other puzzles). The lower the cost, the higher the potential inaccuracy. And similarly, with higher cost, even legitimate users might be less willing to participate.

                                                                                                            I don’t have some universal solution or anything. It’s just something to consider. Sometimes it might be reasonable to put effort into making a robust telemetric system, sometimes none at all would be preferred. I’m trying to think of a case “in between”, but don’t see a single situation where jokingly-easy-to-fake results could be any good.

                                                                                                        2. 1

                                                                                                          Telemetry benefits companies, otherwise companies wouldn’t use it. Perhaps it can benefit users, if the product is improved as a result of telemetry. But it also harms users by compromising their privacy.

                                                                                                          The question is whether the benefits to users outweigh the costs.

                                                                                                          Opt-out telemetry-using companies obviously aren’t concerned about the costs to users, compared to the benefits they (the companies) glean from telemetry-by-default. They are placing their own interests first, ahead of their users. That’s why they resort to dark patterns like opt-out.

                                                                                                      2. 12

                                                                                                        You assume that we actually need telemetry to develop good software. I’m not so sure. We developed good software for decades without telemetry; why do we need it now?

                                                                                                        When I hear the word “telemetry”, I’m reminded of an article by Joel Spolsky where he compared Sun’s attempts at developing a GUI toolkit for Java (as of 2002) to Star Trek aliens watching humans through a telescope. The article is long-winded, but search for “telescope” to find the relevant passage. It’s no coincidence that telemetry and telescope share the same prefix. With telemetry, we’re measuring our users’ behavior from a distance. There’s not a lot of signal there, and probably a lot of noise.

                                                                                                        It helps if we can develop UsWare, not ThemWare. And I think this is why it’s important for software development teams to be diverse in every way. If our teams have people from diverse backgrounds, with diverse abilities and perspectives, then we don’t need telemetry to understand the mysterious behaviors of those mysterious people out there.

                                                                                                        (Disclaimer: I work at Microsoft on the Windows team, and we do collect telemetry on a de-facto opt-out basis, but I’m posting my own opinion here.)

                                                                                                        1. 5

                                                                                                          we don’t need telemetry to understand the mysterious behaviors of those mysterious people out there

                                                                                                          Telemetry usually is not about people’s behaviors, it’s about the mysterious environments the software runs in, the weird configurations and hardware combinations and outdated machines and so on.

                                                                                                          Behavioral data should not be called telemetry.

                                                                                                          1. 3

                                                                                                            One concrete benefit of telemetry: “How many people are using this deprecated feature? Should we delete it in this version or leave it in a while longer?”

                                                                                                            We developed good software for decades without telemetry; why do we need it now?

                                                                                                            Decades-old software is carrying decades-old cruft that we could probably delete, but we just don’t know for sure. And we all pay the complexity costs one paper cut at a time.

                                                                                                            I’m as opposed to surveillance as anybody else in this forum. But there’s a steelman question here.

                                                                                                          2. 13

                                                                                                            The quality of information you get from opt-in telemetry is so much worse than from telemetry by default that it’s almost not worth it.

                                                                                                            A social scientist could likewise say: “The quality of information you get from observing humans in a lab is so much worse than when you plant video cameras in their home without them knowing.”

                                                                                                            How is this an argument that it’s ok?

                                                                                                            1. 1

                                                                                                              There are three differences as far as I can tell:

                                                                                                              The data from a hidden camera is not anonymizable. Telemetry, if done correctly (anonymization of data as much as possible, no persistent identifiers, transparency as to what data is and has been sent in the past), cannot be linked to a natural person or an individual handle. Therefore, I see no harm to the individual caused by telemetry implemented in accordance with best data protection practices.

                                                                                                              Furthermore, the data from the hidden camera cannot cause corrective action. The scientist can publish a paper, maybe it’ll even have revolutionary insight, but can take no direct action. The net benefit is therefore slower to be achieved and very commonly much less than the immediate, corrective action that a software developer can take for their own software.

                                                                                                              Finally, it is (currently?) unreasonable to expect a hidden camera in your own home, but there is an increased amount of awareness of the public that telemetry exists and settings should be inspected if this poses a problem. People who do care to opt out will try to find out how to opt out.

                                                                                                              1. 2

                                                                                                                Finally, it is (currently?) unreasonable to expect a hidden camera in your own home, but there is an increased amount of awareness of the public that telemetry exists and settings should be inspected if this poses a problem. People who do care to opt out will try to find out how to opt out.

                                                                                                                I think this is rather deceptive. Basically it’s saying: “we know people would object to this, but if we slowly and covertly add it everywhere we can eventually say that we’re doing it because everyone is doing it and you’ve just got to deal with it”.

                                                                                                                1. 1

                                                                                                                  I still disagree but I upvoted your post for clearly laying out your argument in a reasonable way.

                                                                                                              2. 3

                                                                                                                You seem to miss a very easy, obvious, opt-in only strategy that worked for the longest time without feeling like your software was that creepy uncle in the corner undressing everyone. As you pointed out everyone keeps the defaults, you know what else most normies do? Click next until they can start their software. So you add a dialog in that first run dialog that is supposed to be there to help the users and it has a simple “Hey we use telemetry to improve our software (here is where you can see your data)[https://yoursoftware.com/data] and our (privacy policy)[https://yoursoftware.com/privacy]. By checking this box you agree to telemetry and data collection as outlined in our (data collection policy)[https://yoursoftware.com/data_collection] [X]”

                                                                                                                and boom you satisfy both conditions, the one where people don’t go out of their way to opt into data collection and the other where you’re not the creepy uncle in the corner undressing everyone.

                                                                                                              3. 3

                                                                                                                You can also view this as a standardized way for opt-in, which isn’t currently available either.

                                                                                                                1. 3

                                                                                                                  No, it is not. It is a standardized way for opt-out.

                                                                                                                2. 3

                                                                                                                  This is a bad comment, because it doesn’t add anything except for “I think non-consensual tracking is bad”, and is only tangentially related to OP insofar as OP is used as a soapbox for the above sentiment. Therefore I have flagged the comment as “Me-too”, however much I may agree with it.

                                                                                                                  1. 22

                                                                                                                    Except that in the European Union, the GDPR requires opt-in in most cases. IANAL, but I think it applies to the analytics that Homebrew collects as well. From the Homebrew website:

                                                                                                                    A Homebrew analytics user ID, e.g. 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB. This is generated by uuidgen and stored in the repository-specific Git configuration variable homebrew.analyticsuuid within $(brew --repository)/.git/config.

                                                                                                                    https://docs.brew.sh/Analytics

                                                                                                                    From the GDPR:

                                                                                                                    The data subjects are identifiable if they can be directly or indirectly identified, especially by reference to an identifier such as a name, an identification number, location data, an online identifier or one of several special characteristics, which expresses the physical, physiological, genetic, mental, commercial, cultural or social identity of these natural persons.

                                                                                                                    I am pretty sure that this UUID falls under identification number or online identifier. Personally identifiable information may not be collected without consent:

                                                                                                                    Consent should be given by a clear affirmative act establishing a freely given, specific, informed and unambiguous indication of the data subject’s agreement to the processing of personal data relating to him or her, such as by a written statement, including by electronic means, or an oral statement.

                                                                                                                    So, I am pretty sure that Homebrew is violating the GDPR and EU citizens can file a complaint. They can collect the data, but then they should have an explicit step during the installation, and the default (e.g. the user hits RETURN) should be to disable analytics.

                                                                                                                    The other interesting implication (if this is indeed collection of personal information under the GDPR) is that any user can ask Homebrew which data they collected and/or to remove the data, and Homebrew should comply.
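
                                                                                                                    A minimal sketch of the kind of install step described above, where hitting RETURN leaves analytics disabled. This is not Homebrew’s code; the function names and the prompt text are made up for illustration, and it says nothing about whether such a prompt is legally sufficient.

```python
def parse_consent(answer: str) -> bool:
    """Return True only for an explicit yes; empty input (RETURN) means no."""
    return answer.strip().lower() in {"y", "yes"}


def ask_for_analytics(read_line=input) -> bool:
    """Ask once during installation; read_line is injectable for testing."""
    # Hypothetical prompt text; "[y/N]" signals that the default is "no".
    answer = read_line("Send analytics to help improve this tool? [y/N] ")
    return parse_consent(answer)
```

                                                                                                                    Anything other than an explicit “y” or “yes” counts as a refusal, so the clear affirmative act has to come from the user.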

                                                                                                                    1. 3

                                                                                                                      The data subjects are identifiable if they can be directly or indirectly identified, especially by […]

                                                                                                                      As far as I can tell, you’re not actually citing the GDPR (CELEX 32016R0679), but rather a website that tries to make it more understandable.

                                                                                                                      GDPR article 1(1):

                                                                                                                      This Regulation lays down rules relating to the protection of natural persons with regard to the processing of personal data and rules relating to the free movement of personal data.

                                                                                                                      GDPR article 4(1) defines personal data (emphasis mine):

                                                                                                                      ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;


                                                                                                                      Thus it does not apply to data about people that are neither identified nor identifiable. An opaque identifier like 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB is not per se identifiable, but as per recital 26, determining whether a person is identifiable should take into account all means reasonably likely to be used, such as singling out, suggesting that “identifiable” in article 4(1) needs to be interpreted in a very practical sense. Recitals are not technically legally binding, but are commonly referred to for interpretation of the main text.

                                                                                                                      Additionally, if IP addresses are stored along with the identifier (e.g. in logs), it’s game over in any case; even before GDPR, IP addresses (including dynamically assigned ones) were ruled by the ECJ to be personal data in Breyer v. Germany (ECLI:EU:C:2016:779 case no. C-582/14).

                                                                                                                      1. 9

                                                                                                                        Sorry for the short answer in my other comment. I was on my phone.

                                                                                                                        Thus it does not apply to data about people that are netiher identified nor identifiable. An opaque identifier like 1BAB65CC-FE7F-4D8C-AB45-B7DB5A6BA9CB is not per se identifiable,

                                                                                                                        The EC thinks differently:

                                                                                                                        Examples of personal data

                                                                                                                        a cookie ID;

                                                                                                                        the advertising identifier of your phone;*

                                                                                                                        https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-personal-data_en

                                                                                                                        It seems to me that a UUID is similar to a cookie ID or advertising identifier. Using the identifier, it would also be trivially possible to link data. They use Google Analytics. Google could in principle cross-reference some application installs with Google searches and time frames. Based on the UUID they could then see all other applications that you have installed. Of course, Google does not do this, but this thought experiment shows that such identifiers are not really anonymous (as pointed out in the working party opinion of 2014, linked on the EC page above).

                                                                                                                        Again, IANAL, but it would probably be ok to report installs without any identifier linking the installations. They could also easily do this: make it opt-in, report all people who didn’t opt in using a single identifier, and generate a random identifier for people who opt in.
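
                                                                                                                        Roughly what I mean, as a sketch. The names here (SHARED_ANONYMOUS_ID, analytics_id) are invented for illustration and are not anything Homebrew has; whether this would actually satisfy the GDPR is still my non-lawyer guess.

```python
import uuid
from typing import Optional

# Hypothetical constant: every install that has not opted in reports this
# same value, so those reports cannot be told apart or linked per install.
SHARED_ANONYMOUS_ID = "anonymous"


def analytics_id(opted_in: bool, stored_id: Optional[str] = None) -> str:
    """Pick the identifier to attach to an analytics report."""
    if not opted_in:
        # Ignore any previously stored identifier without consent.
        return SHARED_ANONYMOUS_ID
    # Opted-in installs reuse their stored identifier, or get a fresh random
    # one (uuid4 is random, not derived from hardware or network address).
    return stored_id or str(uuid.uuid4())
```

                                                                                                                        The caller would be responsible for persisting the returned identifier for opted-in installs and for deleting it when consent is withdrawn.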

                                                                                                                        1. 4

                                                                                                                          They locked the PR talking about it and accused me of implying a legal threat for bringing it up. The maintainer who locked the thread seems really defensive about analytics.

                                                                                                                          1. 3

                                                                                                                            Once you pop, you can’t stop.

                                                                                                                            I, too, thought that your pointing out their EU-illegal activity was distinct from a legal threat (presumably you are not a prosecutor), and that they were super lame for both mischaracterizing your statement and freaking out like that.

                                                                                                                            1. 3

                                                                                                                              The maintainer who locked the thread seems really defensive about analytics.

                                                                                                                              It seems this is just a general trait. See e.g. this

                                                                                                                            2. 1

                                                                                                                              Now I really wish I had an ECJ decision to cite because at this point it’s an issue of interpretation. What is an advertising identifier in the sense that the EC understood it when they wrote that page—Is it persistent and can it be correlated with some other data to identify a person? Did they take into account web server logs when noting down the cookie ID?

                                                                                                                              Interesting legal questions, but unfortunately nothing I have a clear answer to.

                                                                                                                            3. 1

                                                                                                                              Please cite the rest of paragraph 4, definitions:

                                                                                                                              ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

                                                                                                                              https://eur-lex.europa.eu/legal-content/en/TXT/?uri=CELEX%3A32016R0679

                                                                                                                              Which was what I quoted.

                                                                                                                              1. 1

                                                                                                                                Your comment makes the following quotations:

                                                                                                                                The data subjects are identifiable if they can be directly or indirectly identified, especially by reference to an identifier such as a name, an identification number, location data, an online identifier or one of several special characteristics, which expresses the physical, physiological, genetic, mental, commercial, cultural or social identity of these natural persons.

                                                                                                                                Please ^F this entire string in the GDPR. I fail to find it as-is. They only start matching up in the latter half starting at “an identifier” and ending with “social identity”.

                                                                                                                                (1) ‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

                                                                                                                                I agree it’s pedantic of me, but it’s not a 1:1 quote from the GDPR if a sentence is modified, no matter how small.


                                                                                                                                I’ve edited in the second half in any case though. I do not, however, see any way that modification would invalidate any of the points I’ve made there.

                                                                                                                            4. 2

                                                                                                                              If that is true, consider submitting a PR, because GDPR violations are serious business.

                                                                                                                              1. 3

                                                                                                                                Or don’t submit a PR. As the project has stated:

                                                                                                                                Do not open new threads on this topic.

                                                                                                                                People have been banned from the project for doing exactly this.

                                                                                                                                1. 7

                                                                                                                                  “We don’t want to hear complaints” is not a new stance for Homebrew.

                                                                                                                                  1. 2

                                                                                                                                    Yeah, I got the impression that they are pretty hardline on this. I hope that they’ll reconsider before someone files a GDPR complaint.

                                                                                                                                    Personally, I don’t really have a stake in this anymore, since I barely use my Mac.

                                                                                                                                    I guess a more creative solution would be to fork the main repo and disable the analytics code and point people to that.

                                                                                                                                    Edit: the linked PR is from before the GDPR though.

                                                                                                                                2. 1

                                                                                                                                  But the above user didn’t post that did they? Your comment was meaningful and useful, but theirs was just sentimental. A law violation is a law violation, but OP just posted their own feelings about what they think is spyware and didn’t say anything about GDPR.

                                                                                                                                3. 4

                                                                                                                                  hmm I disagree, the OP is claiming that we should have a unified standard for “Do_Not_Track”. Finn is arguing that we shouldn’t need such a standard because unless I specifically state that I would like to be tracked, I should not be tracked, and that any attempt to track is a violation of consent. Finn here is specifically disagreeing with the website in question. Should we organize against attempts to track without explicit consent, or give a unified way to opt out? These are fundamentally different questions and are actually directly related. If I say everyone should be allowed into any yard unless it has a private property sign, that may cause real concern for people who feel that no yard should be entered without explicit permission. They are different concerns, that are related, and are more nuanced than “thing is bad”.

                                                                                                                                4. 1

                                                                                                                                  Okay. By your (non-accepted) definition, spyware abounds and is in common use.

                                                                                                                                  Simply calling it “spyware” and throwing up your hands doesn’t work. They have knobs to turn the spying off, to opt-out. I just want all those knobs to have the same label.

                                                                                                                                1. 2

                                                                                                                                  I don’t know much about Nim, but I’m pretty tired of the implication that scripting languages can’t be compiled. Nearly every well-known scripting language has a compiler; stop perpetuating 1990s stereotypes.

                                                                                                                                  1. 5

                                                                                                                                    It’s funny how you say “1990s stereotypes” when PHP didn’t have a compiler until 2010.

                                                                                                                                    1. 7

                                                                                                                                      While a fair criticism, 2010 was nearly a decade ago. There are certainly many of us who still think of the nineties as merely a decade or so ago.

                                                                                                                                      1. 3

                                                                                                                                        One’s perception doesn’t matter; 2010 is 9 years ago (9 and a half if we consider February 2010, the release date of HipHop), and calling it “1990s stereotypes” to bring forward an incorrect and overblown statement is not a valid move.

                                                                                                                                        1. 2

                                                                                                                                          Perceptions definitely matter when it comes to language and communication. This is because you must communicate such that it is received by the listener’s expectations. A speaker who communicates to the listener’s perceptions will be far more effective. 90’s stereotypes for example may not have been meant to be taken so literally. Now you can say as a listener, “It didn’t work for me, I took it literally” and that’s a fair criticism, however you’re not the only listener, and not every listener will receive it the way you did.

                                                                                                                                      2. 4

                                                                                                                                        Ruby was a tree-walking interpreter until 2009 (based on this page).

                                                                                                                                        Python is said to be compiled but has, and will continue to have, a “Global Interpreter Lock”.

                                                                                                                                        @technomancy is getting way too overwrought over the imprecise but useful phrase, “compiled language”. Wikipedia: “The term is somewhat vague.” There is a useful separation here. Until we come up with a better name for it, let’s be kind to usage of “compiled language”.

                                                                                                                                        1. 10

                                                                                                                                          Python is said to be compiled but has, and will continue to have, a “Global Interpreter Lock”.

                                                                                                                                          These things are not in opposition to each other. OCaml has a native code compiler, and its runtime system has a GIL (a misnomer: a better name is “global runtime lock”).

                                                                                                                                          1. 1

                                                                                                                                            Interesting!

                                                                                                                                          2. 4

                                                                                                                                            imprecise but useful phrase

                                                                                                                                            What do you find this phrase to be useful for?

                                                                                                                                            You can see in the comments below that the author’s intent was not to describe programs that compile to machine language, but actually to describe programs that can be distributed as single-file executables. (which is also true of Racket, Lua, Forth, and many other languages)

                                                                                                                                            So it seems that even by declaring “compiled” to mean “compiled to machine language” we haven’t actually achieved clear communication.

                                                                                                                                            1. 2

                                                                                                                                              I’m actually rather interested in how you get single file executables with Lua? Is there a tool that makes it easy, or is it something out of the way?

                                                                                                                                              EDIT: I know how to get single file executables with Love2d, and I vaguely recall that it can be done with Lua outside of Love, but it’s certainly not an automated/well-known thing.

                                                                                                                                              1. 4

                                                                                                                                                Many people simply embed Lua in their C programs (that was the original use case it was designed for) but if you’re not doing that you can streamline it using LuaStatic: https://github.com/ers35/luastatic/

                                                                                                                                              2. 2

                                                                                                                                                You’re pointing out a second way in which it is imprecise. I’m pointing out that for me – and for a large number of people who don’t know all the things you do – it was useful since it perfectly communicated what the author meant.

                                                                                                                                                1. 3

                                                                                                                                                  Oh, interesting, so you mean to say that when you read “compiled language” you took it to mean “produces a single-file executable”?

                                                                                                                                                  This is honestly surprising to me because most of the time I see “compiled language” being misused, it’s put in contrast to “interpreted language”, but in fact many interpreted languages have this property of being able to create executables.

                                                                                                                                                  It’s just a tangled mess of confusion.

                                                                                                                                                  1. 5

                                                                                                                                                    Indeed it is. I didn’t mean that I understood “produces a single-file executable.” I meant that he’s pointing towards a divide between two classes of languages, and I understood the rough outline of what languages he was including in both classes.

                                                                                                                                                    Edit: I can’t define “compiled language” and now I see it has nothing to do with compilation. But I know a “compiled language” when I see it. Most of the time :)

                                                                                                                                                    1. 3

                                                                                                                                                      Perhaps a good way to put it is “degree of runtime support that it requires”. Clearly, normal usage of both Nim and Python requires a runtime system to do things for you (e.g. garbage collection). But Nim’s runtime system does less for you than Python’s runtime system does, and gets away with it mostly because Nim can do many of those things at compile time.

                                                                                                                                                      1. 4

                                                                                                                                                        Even if a language retrofitted a compiler 20 years ago, it’s hard to move away from the initial programming UX of an interpreted language. Compilation time is a priority, and the experience is to keep people from being aware there’s a compiler. With a focus on UX, I think all your examples in this thread have a clear categorization: Perl, Python, Ruby, Racket, Lua, Node and Forth are all interpreted.

                                                                                                                                                        1. 4

                                                                                                                                                          I would frame it differently; I’d say if a language implementation has tooling that makes you manually compile your source into a binary as a separate step before you can even run it, that’s simply bad usability.

                                                                                                                                                          In the 90s, you had to choose between “I can write efficient code in this language” (basically C, Pascal, or maaaaybe Java) vs “this language has good-to-decent usability” (nearly everything else), but these days I would like to think that dichotomy is dated and misguided. Modern compiled languages like Rust and Go clearly are far from being “interpreted languages”, but their tooling gives you a single command to compile and run the code in one fell swoop.

                                                                                                                                                          1. 4

                                                                                                                                                            I’m super sympathetic to this framing. Recent languages are really showing me how ill-posed the divide is between categories like static and dynamic, or compiled and interpreted.

                                                                                                                                                            But it feels a bit off-topic to this thread. When reading what somebody else writes my focus tends to be on understanding what they are trying to say. And this thread dominating the page is a huge distraction IMO.

                                                                                                                                                            I also quibble with your repeated invocation of “the 90s”. This is a recent advance, like in the 2010s. So I think even your distraction is distractingly phrased :)

                                                                                                                                            2. 3

                                                                                                                                              Are you not barking up the wrong tree? I don’t find a line where the author implies anything close to it.

                                                                                                                                              1. 0

                                                                                                                                                I was referring to the title of the post; as if “scripting ease in a complied language” is not something provided by basically every scripting language in existence already.

                                                                                                                                                1. 3

                                                                                                                                                  Specifically, most scripting languages make it nontrivial to package an executable and move it around the filesystem without a shebang line. On Linux, this isn’t a huge issue, but it’s convenient to not have to deal with it on Windows.

                                                                                                                                                  1. 1

                                                                                                                                                    OK, but that has next to nothing to do with whether there’s a compiler or not. I think what you’re talking about is actually “emits native code” so you should like … say what you mean, instead of the other stuff.

                                                                                                                                                    1. 1

                                                                                                                                                      Fair enough.of a point, I suppose. Many people use “compiled” vs “interpreted” to imply runtime properties, not parsing/compilation properties, it isn’t exactly the proper definition.

                                                                                                                                                      I’ll try to be more precise in the future, but I would like a term for “emits native code” that is less of a mouthful.

                                                                                                                                              2. 3

                                                                                                                                                Scripting languages can be compiled, but, out of Python, Ruby, Tcl, Perl and Bash, most of them are by default written in such a way that they require code to follow a certain file structure, and if you write a program that is bigger than a single file, you end up having to lug the files around. I know that Tcl has Starkits, and I think that’s what Python wheels are for. Lua code can be wrapped into a single executable, but it’s something that isn’t baked into the standard Lua toolset.

                                                                                                                                                1. 3

                                                                                                                                                  I think it would be helpful to a lot of people if you could give examples for node, Ruby, python, … Maybe you’re just referring to the wrong usage of the word compiler ?

                                                                                                                                                  EDIT: typo

                                                                                                                                                  1. 1

                                                                                                                                                    For python there is Nuitka as far as I know.

                                                                                                                                                    1. -1

                                                                                                                                                      Node, Ruby, and Python are all typically compiled. With Ruby and Node it first goes to bytecode and then usually the hotspots are further JIT-compiled to machine code as an optimization pass; I don’t know as much about Python but it definitely supports compiling to bytecode in the reference implementation.

                                                                                                                                                      1. 7

                                                                                                                                                        When talking about “compiled languages”, people typically mean “AOT compiled to machine code”, producing a stand-alone binary file. Python’s official implementation, CPython, interprets bytecode and PyPy has a JIT compiler. V8 (the JS engine in Node) compiles JavaScript to a bytecode and then both interprets and JIT compiles that. Ruby has a similar story. The special thing about Nim is that it has the same ease of use as a “scripting language” but has the benefits of being AOT compiled with a small runtime.
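For anyone who wants to see the “CPython interprets bytecode” part for themselves, here is a minimal sketch using only the standard library. The function name `add` and the `<example>` filename are made up for illustration, and the exact opcodes you get vary between CPython versions.

```python
import dis
import types

# Source text for a tiny function; `add` is just an illustrative name.
source = "def add(a, b):\n    return a + b\n"

# Step 1: CPython compiles the source into a code object (bytecode), not machine code.
module_code = compile(source, "<example>", "exec")
assert isinstance(module_code, types.CodeType)

# Step 2: the interpreter executes that code object, which defines `add`.
namespace = {}
exec(module_code, namespace)
add = namespace["add"]

# The raw bytecode of the function, and its instruction names for inspection.
bytecode = add.__code__.co_code
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
```

Calling `dis.dis(add)` prints the same instructions in a readable listing. None of this produces a stand-alone binary, which is the AOT distinction being drawn above.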

                                                                                                                                                        1. 1

                                                                                                                                                          The special thing about Nim is that it has the same ease of use as a “scripting language” but has the benefits of being AOT compiled with a small runtime.

                                                                                                                                                          The idea that only scripting languages care about ease of use is just plain outdated.

                                                                                                                                                          In the 1990s it used to be that you could get away with having bad usability if you made up for it with speed, but that is simply not true any more; the bar has been raised across the board for everyone.

                                                                                                                                                    2. 3

                                                                                                                                                      Some languages like Dart [1] have first class support for both interpreting and compiling. I dont think its fair to say something like “some random person has a GitHub repo that does this” as being the same thing.

                                                                                                                                                      1. https://github.com/dart-lang/sdk
                                                                                                                                                      1. 0

                                                                                                                                                        That’s the whole point; Dart has a compiler, and any language that doesn’t is very unlikely to be taken seriously.

                                                                                                                                                        1. 1

                                                                                                                                                          The point is:

                                                                                                                                                          Nearly every well-known scripting language has a compiler

                                                                                                                                                          That may be true, but nearly every well-known scripting language doesn’t have an official compiler.

                                                                                                                                                          1. -2

                                                                                                                                                            Also false; Ruby’s official implementation has had a compiler (YARV) since the 1.9 days; Node.js has used the V8 JIT compiler since the beginning (not to mention TraceMonkey and its descendants); and Python has been compiling to .pyc files for longer than I’ve been a programmer.

                                                                                                                                                            According to this, Lua has had a compiler since the very beginning: https://www.lua.org/history.html I don’t know much about Perl, but this page claims that “Perl has always had a compiler”: https://metacpan.org/pod/perlcompile

                                                                                                                                                            The only exception I can think of is BASIC, and that’s just because it’s too old of a language for any of its numerous compilers to qualify as official. (edit: though I think Microsoft QuickBasic had a compiler in the 1980s or early 90s)
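The `.pyc` claim is easy to check with the standard library's `py_compile` module. This is a rough sketch, assuming a throwaway temp directory and a made-up file name `hello.py`; the helper names are hypothetical.

```python
import importlib.util
import os
import py_compile
import tempfile


def compile_to_pyc(source_text):
    """Write source_text to a temporary hello.py and byte-compile it to hello.pyc."""
    workdir = tempfile.mkdtemp()
    source_path = os.path.join(workdir, "hello.py")
    with open(source_path, "w") as handle:
        handle.write(source_text)
    # py_compile.compile returns the path of the byte-compiled file it wrote.
    return py_compile.compile(
        source_path,
        cfile=os.path.join(workdir, "hello.pyc"),
        doraise=True,
    )


def has_pyc_magic(pyc_path):
    """Check that the file starts with this interpreter's bytecode magic number."""
    with open(pyc_path, "rb") as handle:
        return handle.read(4) == importlib.util.MAGIC_NUMBER
```

Something like `has_pyc_magic(compile_to_pyc("print('hi')\n"))` should come back true. The result is still bytecode for the CPython interpreter, so it only supports the "has a compiler" reading, not the "emits native code" one.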

                                                                                                                                                            1. 6

                                                                                                                                                              QuickBasic compiled to native code, QBasic was an interpreter, GWBasic compiled to a tokenized form that just made interpretation easier (keywords like IF were replaced with binary short codes)
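To make the "tokenized form" idea concrete, here is a toy sketch of keyword tokenization. The byte values in `TOKENS` are invented for illustration and are not the real GWBasic token codes, and the real format handled far more than whole space-separated words.

```python
# Hypothetical token table: these byte values are made up, not GWBasic's actual codes.
TOKENS = {"IF": 0x80, "THEN": 0x81, "PRINT": 0x82, "GOTO": 0x83}
KEYWORDS = {code: word for word, code in TOKENS.items()}


def tokenize(line):
    """Replace each keyword with a one-byte code; everything else stays ASCII text."""
    out = bytearray()
    for index, word in enumerate(line.split(" ")):
        if index:
            out.append(0x20)  # keep the single space between words
        code = TOKENS.get(word.upper())
        if code is not None:
            out.append(code)
        else:
            out.extend(word.encode("ascii"))
    return bytes(out)


def detokenize(data):
    """Expand one-byte codes back into keywords, as a LIST command would."""
    return "".join(KEYWORDS.get(byte, chr(byte)) for byte in data)
```

With this table, `tokenize("IF X THEN PRINT X")` is 9 bytes instead of 17 characters, and `detokenize` turns it back into the original line. The interpreter then dispatches on a single byte instead of re-parsing keyword text on every pass.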