Threads for bobpoekert

    1. 2

      on a modern numa system reading from and writing to main memory is the biggest kind of i/o bottleneck. cpu cache is the new ram and main memory is the new disk

    2. 23

      The sentiment of this web page is delightful but doing web programming in C or any other language where you do your own memory management is asking for so much trouble. You’re trading the incidental complexity of [insert hipster web language + framework here] for the incidental complexity of valgrind, asan, tsan, etc.

      I wish there was an efficient memory-safe language that had the history and cachet of C among communities with the OpenBSD ethos - like Pascal or Oberon, had they survived the 90s in any meaningful sense. At least one of the useful languages from the current PL renaissance in industry will undoubtedly be in this position in 20 or 30 years - C won’t keep eating its children forever :)

      1. 20

        I imagine it could be rust, but there’s a degree of fetishism which I think is detrimental. People learned C without first having to learn to love C. There’s a trend now, however, that you can’t just learn a tool, you must love it. The hype turns off a lot of potential practitioners.

        There are some other not C languages I enjoy. But I arrived here without people telling me how much better life would be if only I made the switch.

        1. 17

          The Rust community is definitely strongly in favor of its language of choice, and while I think the core of the community (the Rust teams, developers of major Rust libraries, major language contributors) are appropriately careful about the rhetoric they use and claims they make, others in the community are often not so careful. Part of this comes from misunderstanding of the guarantees Rust provides (“no data races” becomes “no race conditions,” for example, which is a stronger claim), and part of this comes from people being imprecise in the words they use, even if they do understand the actual guarantee being made.

          I am looking forward to the results of the Rust Belt project, which is looking to formalize Rust, and provide a stronger and clearer description of what sort of guarantees Rust provides. There are also ongoing efforts by the Rust teams to provide a stronger description of what behaviors are considered safe vs. unsafe, and generally to tighten up their own understanding of the language and improve the explanations official Rust documents provide. My hope is that this formalization will provide something clearer and stronger to reference when people discuss Rust and what it can offer them.

          EDIT: I should also say that I think a lot of people in the Rust community come by the excitement honestly. Personally I can be very effusive about Rust, and have to regularly remind myself to tone it down when pitching Rust to people. It is a cool language that does a lot of things I appreciate, and that feels “right” to me in a way that engenders a strong desire to encourage its use elsewhere, if only to give me more opportunities to write in it.
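
          To make the “no data races” vs. “no race conditions” distinction above concrete, here is a minimal sketch. It is in Python rather than Rust, and the Account class, the withdraw function, and the barrier used to force the bad interleaving are all made up for illustration. Every access to the shared balance is guarded by a lock, so no access is unsynchronized, yet the check-then-act logic is still racy:

```python
import threading


class Account:
    """Every access to the balance is guarded by a lock (no unsynchronized access)."""

    def __init__(self, balance):
        self._lock = threading.Lock()
        self._balance = balance

    def read(self):
        with self._lock:
            return self._balance

    def write(self, value):
        with self._lock:
            self._balance = value


def withdraw(account, amount, both_have_read, paid_out):
    # Check-then-act: the read and the write are each synchronized,
    # but nothing makes the pair atomic.
    seen = account.read()
    both_have_read.wait()  # hypothetical hook: forces both threads to read first
    if seen >= amount:
        account.write(seen - amount)
        paid_out.append(amount)


def run_demo():
    """Return (final balance, total paid out) after two concurrent withdrawals."""
    account = Account(100)
    both_have_read = threading.Barrier(2)
    paid_out = []
    threads = [
        threading.Thread(target=withdraw, args=(account, 100, both_have_read, paid_out))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.read(), sum(paid_out)


if __name__ == "__main__":
    print(run_demo())
```

          Both withdrawals succeed against an account that only held enough for one. Locking each access is not enough to rule out that kind of bug; the whole read-check-write sequence would have to be one critical section.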

          1. 1

            I mostly agree with tedu on this. I like rust but there is a huge “fetishism” right now in the hype cycle. Reminds me a lot of what golang went through up until people realized where go didn’t fit in.

            Different languages sure, but yes everyone is excited but all the “you not writing in rust is bad and you should feel bad” (I’m paraphrasing) articles start to make me want to throw out regular old C more. It’s not great, but after 40+ years with some minimal tools you can fairly easily mitigate most problems.

            An example of something Rust doesn’t provide but gets brushed over far too often: guarantees that you don’t have a memory leak. Memory leaks are by definition “memory safe”, but that isn’t all that interesting when it occurs. It’s great it won’t break the program, but rust isn’t solving every class of problems around memory allocation. It is really tough to pierce the fanboy attitudes around the enthusiasm. It is great people are enjoying it, but let’s come down to reality and evaluate based on facts not “c isn’t memory safe”. On its own that is correct as a statement, but just as meaningless as saying having a memory leak is “memory safe” IMO.

            Don’t get me wrong, rust is a good language, just could use more of the enthusiasm at a 7 not an 11. I’m not going to convert the 3k lines of C I’ve written this year to rust just because rust exists. (kernel module so no rust isn’t a great option even at this stage)

            1. 2

              Rust is my go-to language right now, and I also agree with tedu. My point was trying to a) clarify that, unfortunately, the broader community is not as careful and measured in their proselytizing for Rust as the core community is, and b) this will almost certainly get better over time, particularly as Rust’s guarantees are given a more precise and formal treatment.

              1. 2

                Yep no worries, my bigger gripe is “memory safe” is really quickly turning into a rust thought terminating cliche at times.

        2. 5

          I think that “fetishism” is typical of early adoption of most tools. I would love to be able to read the discourse around C and Unix between 1970 and 1980, for example, where I imagine C went through the same hype cycle in a relatively tight-knit community, so it left few artifacts.

          It’s so rare that a tool is truly unequivocally better than its predecessors that effusive praise tends to help people tamp down the cognitive dissonance of recognizing the areas in which it’s worse. I take your point, though - it would be nice if the discourse around new tools wasn’t this way.

          1. 8

            Within living memory, I can recall python going from being an also-ran to mainstream, but with very few people constantly telling me “I can’t believe you’re not using python already.” I do believe there has been a change in attitude that didn’t exist before. Online communities grant status and standing to pretenders. Just look at how easy it is to gain karma by shit talking PHP, even when one has had zero experience with PHP, or any programming language!

      2. 15

        “I wish there was an efficient memory-safe language that had the history and cachet of C among communities with the OpenBSD ethos”

        ocaml comes close to being that. It’s a language with a strong type system (with type inference and everything), automatic memory management, and a very straightforward compiler. I don’t know that any C compiler beats ocamlopt in how easy to debug the generated code is.

        1. 8

          There is even a book on Unix system programming in OCaml :):

          https://ocaml.github.io/ocamlunix/

        2. 3

          You could slot Oberon (or OCaml, Go, LuaJIT, Rust - CGI scripts don’t discriminate) into this stack right now and it would work just fine - the fun of using C for this is that there’s some unquantifiable value in saying “we use one language for the bulk of work on or in this system moving forward, and it’s a language that’s suitable for any job you can throw at it.”

      3. 4

        Would Go be an alternative? e.g. this seems convincing (I’ve only skimmed it, though).

        1. 4

          Lua is a very viable alternative - all the power of C, none of the issues. OpenRESTY and LAPIS both run very well on OpenBSD too ..

          1. 3

            What does all the power of C mean? A good FFI?

      4. 4

        If you haven’t considered Go, I’d encourage you to do so. It checks the boxes. It’s a memory safe language, not too far away from C, relatively efficient. And to top it off, it does provide builtin tooling to supplant tsan via go build -race.

        If that’s not your cup of tea, Rust seems to have all those benefits of memory safety and thread safety.

      5. 2

        C++? The best thing about C++ is “You only pay for what you use”. It is as C like or Java like as you choose. At my work we use it in a very C like way, but we get the advantage of free RAII, basic containers like list, vector, map, and simple to use strings. This all means simpler, terser code which is easier to reason about, maintain, and debug.

        1. 5

          I wish there was an efficient memory-safe language […]

          C++? […]

          C++ lacks memory safety.

    3. 1

      Are there performance benchmarks for this? Writing something in C doesn’t necessarily mean it’ll run faster than if it were written in eg: LuaJIT, especially if the C code isn’t written with cache efficiency in mind. And GLib is (or was the last time I used it) very pointer-fetchy.

      1. 1

        Hopefully it’s a lot faster than “hundreds of requests per second”, the stat touted on the site.

        The author says he is planning to publish some benchmarks.

    4. 3

      Training a stacked denoising autoencoder to fix compression artifacts in video

      1. 1

        It’s not yet :). Ask me again at the end of the week.

        I think the hard part is going to be stopping it from generating its own more annoying artifacts (think: puppyslugs). I have some ideas for ways to mitigate that but it’ll take some experimenting.

  1. 3

    I get the sense that the team that made this workshopped it with actual people who weren’t already familiar with ML before releasing it. Probably with Facebook employees. I like that a lot.

    1. 1

      It’s better than guessing.

  2. 28

    This is not missing the satire tag.

    1. 18

      I was wondering. Continue to stand by decision to pass on the industry and to recommend young programmers to do the same.

      I’d rather make a business dude rich and go play with my dogs at 5 pm. That’s not mediocrity, that’s knowing a video game doesn’t change the world.

      1. 20

        Even if it did change the world, your life is important too. I like what I do, it feels important to me. But I only put in 40 hours a week, most of the time, and I stand by that.

        1. 8

          Woah, that sounds like “balance” talk. Watch your mouth.

          1. 4

            Smart and well balanced employees do significantly better work long-term than ones churning out code 16 hours a day.

        2. 2

          Don’t you work in the Google, though?

          I’m not sure normal math even applies there. :-S

          1. 5

            I do, yes. I agree it’s an uncommon place.

      2. 3

        I’ve been looking for a job, and while there are a lot of postings in tech, I don’t see that many outside the industry. Most of my contract work up to now has shown me that people outside the tech industry aren’t really aware of the costs, with most offers being magnitudes lower than what I see in tech companies.

        And I’d love to be able to work with the outdoor companies I’ve grown to know here in Quebec.

        1. 5

          If you know people in a non-tech industry who have a problem that can be solved with software and are willing to pay significant money for it to be solved then maybe you don’t need to work for them for them to pay you. You might be sitting on one of them “sales channels”.

      3. 5

        Continue to stand by decision to pass on the industry and to recommend young programmers to do the same.

        So where do you tell them to work? There don’t seem to be very many meaty jobs, from a programmer’s perspective, outside of the technology industry. Or do you suggest that they be hobbyist programmers only and use their technical skills to segue into pre-executive jobs?

        I ask this because I think there’s mutual benefit in helping great programmers get out of the technology industry while still using their skills. It’s not good for society to have the best programmers in one industry, and it’s an industry that tends to take us for granted and to treat us poorly. The thing is: I don’t know how to go about it, much less solve the problem at scale. I suppose that it would require a fleet of agents who act as tech-industry exit consultants.

        1. -6

          I don’t come to this website to discuss your agenda.

          1. 5

            No, of course not, but I thought you might have something useful to say, or some insight. That’s why I asked you the question. Apparently, I was wrong.

    2. 15

      It’s how most technology managers/executives and almost all of the VCs think. OP is just uncouth enough to express things that others would never say (such as the disgusting gendered shit that assumes that programmers burn out because of “wives/GFs”) in public.

      1. 11

        Even when it’s true that “everyone thinks X, he’s just honest enough to say it”, that act of saying X makes things worse since it further normalizes the situation.

        1. 9
          1. I agree completely.
          2. My point is certainly not “Everyone thinks X”. It’s that the technology industry is run by people who don’t share our values or even align with how most people define social justice.

          The Moldbug controversy ties in to #2. His views are disgusting. That said, the reason there’s such a push to remove him from the conference circuit isn’t just that his politics are awful, but also that he’s accessible. The billionaires who run Silicon Valley largely share his views. They just aren’t stupid enough to get caught. (“Mencius Moldbug” was a pseudonym that got doxxed.) Also, enough people want their money that they can get away with pretty much anything, just like Trump said about his hypothetical 5th Avenue murder.

          I certainly don’t wish to excuse bad behavior or exclusionary viewpoints. I just wish there was more consistency in it. I probably wouldn’t invite Yarvin to speak if it were my conference. But how many people would turn away a billionaire venture capitalist who dislikes the 19th Amendment (Thiel) or who thinks that liberals are capable of Kristallnacht (Perkins)? I’m guessing that most of the tech industry wouldn’t.

          1. 7

            It’s hard to know someone’s intent and, in that light, it’s hard to know what his true views are.

            His self-presentation is of an intellectual who simply isn’t willing to reject monarchy, slavery, white supremacy, and other unfashionable (in many cases, because they are bad) political institutions out of hand. Just as there are useful alternative logics (e.g. non-Euclidean geometry, intuitionist logic, non-ZFC set theories) I suppose he is trying to start from first principles, with no assumptions, and derive an alternative politics. At least, that’s how he wants to present himself: a free thinker on the right, unconstrained by conventional humanist assumptions.

            Part of the problem, I think, is that he’s either disingenuous or sophomoric. For example, he claims that Europeans preferred African slaves because they were “better adapted” to slavery than Native Americans. In fact, they were only better adapted to the Southern climate (35 C summers, high humidity) because the Natives are descended from Northeast Asians (hence, the most successful Mesoamerican civilizations were at altitude). He’s remarkably willing to accept bad ideas, and his reading of history is superficial and weird.

            He might be an obscurantist “dog whistle” racist. He might just be (as you suggest is possible) coming off as aloof and lacking empathy. He’s certainly contributed to an ideological movement that harbors actual racists. Also, slavery is outright evil regardless of whether it’s racially based. (African-American slavery was an especially disgusting brand of it, but slavery has existed since antiquity, and exists today, in a variety of formats.)

            I’d feel differently if he disavowed Moldbug. Look, I’ve created (more in jest than toward any serious effect) offensive internet characters. If he said, “I was full of shit back then”, I’d like to believe that many people would forgive him. However, he hasn’t. Even worse, he claims to have named his daughter after a pro-slavery man-of-letters, Thomas Carlyle. That just makes me ill.

            1. 5

              Does it not make sense that the guideline is individual to each conference? I have a really hard time buying the exclusion or lack thereof of Yarvin as the first step in a slippery slope. Strange Loop removed him, pure and simple, no discussion, and Everything Was Fine. Lambdaconf didn’t, some people pulled out, and the show will still go on. Conferences have a right to their rules and their attendees and supporters also have that right. That we fight about it on the internet is no indication that anyone is winning anything from social pressure.

              William Shockley co-invented the transistor. He won a Nobel Prize. After that he became an outspoken advocate for eugenics and some pretty brutal racist policies. His SPLC file is here. He suffered, personally and professionally, for his views. He lost friends and colleagues over these views.

              My own opinions about Yarvin aside (I think Moldbug’s ideas are repugnant and dressing up plain-ol' bigotry with equivocation doesn’t make them any less repugnant), I find it highly unlikely that conferences will standardize in this regard. If a large enough group of people disagree with how conferences are handled, there’s nothing stopping them from hosting whomever they’d like.

              Splitting hairs about what we individually find appropriate is our own business. Do we let a Klan member or donor speak so long as they don’t bring their robes? Who cares? I think it’s important for each of us to determine what we think we should support and act accordingly. If a lot of us feel that way about a person, they won’t have a platform at conferences. There’s nothing about that process that’s undemocratic or unfair.

              1. 2

                Strange Loop removed him, pure and simple, no discussion, and Everything Was Fine.

                The nature of slippery slopes is that they don’t show up immediately.

                That we fight about it on the internet is no indication that anyone is winning anything from social pressure. I think it’s important for each of us to determine what we think we should support and act accordingly. If a lot of us feel that way about a person, they won’t have a platform at conferences. There’s nothing about that process that’s undemocratic or unfair.

                If it’s the democratic decision of the conference attendees to exclude someone I’m fine with that. If it’s some vocal/famous people on Twitter whipping up a mob of people who aren’t even going I’m a lot less fine with that, and that I do think is undemocratic.

  • 10

    How deeply disappointing. :(

  • 4

    By the first quarter I was thinking, this has got to be satire… But reading his bio makes me understand why.

  • 1

    Leiningen makes this pain almost entirely disappear (for Clojure code), and you can use it to run java projects. In this case, if you put the dependencies in your project.clj, then when you typed lein run it would download them for you, compile your project, and run it. I think it would be worth using even if you’re not writing clojure.

    1. 2

      How is this feature of leiningen different from any other jvm-based build tool?

  • 7

    github issues seems to work fine, and it’s really convenient if you’re already using github

    1. 4

      GitHub issues doesn’t do email-based support, and it requires your customers to have GitHub accounts to report bugs. That’s fine if you’re doing a random open-source project, or if you have an external system to handle the customer management side, but it’s otherwise not an acceptable bug tracker.

      1. 4

        I think a lot of people upvote GitHub due to familiarity, but as a bug tracking system it’s pretty crappy compared to “real” bug tracking software.

      2. 1

        Yeah, that’s pretty much a deal-breaker (forcing customers to have github accounts). Thanks for pointing that out.

  • 2

    Someone needs to port this to OCaml, like, right away.

  • 3

    This feels like a case study for why code review matters.

    1. 4

      In this case, I think solid code guidelines + mechanizing them would be a better solution.

    2. 1

      There’s no guarantee that this would have been caught in review. This is a job for a linter.

      1. 1

        Does reviewing code guarantee anything? I’m not denying that a linter would be ideal, but I feel like spotting duplication like this is something the human eye is good at doing.

        1. 1

          Computers are definitely better though. Taking humans out of everything is the goal isn’t it? :)
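
          As a rough idea of what “mechanizing” this could look like, here is a minimal sketch of a lint check that flags a statement repeated on consecutive lines. The function name and the sample input are made up for illustration, and a real linter would work on the parsed syntax tree rather than on raw text:

```python
def find_adjacent_duplicates(source):
    """Return (line_number, text) for each line that repeats the previous non-blank line."""
    findings = []
    previous = None
    for number, raw in enumerate(source.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # blank lines are skipped and do not reset the comparison
        # Ignore lines with no letters or digits, such as lone closing braces.
        if line == previous and any(ch.isalnum() for ch in line):
            findings.append((number, line))
        previous = line
    return findings


if __name__ == "__main__":
    # Hypothetical input, not taken from the code under discussion.
    sample = "check(x);\ncleanup();\ncleanup();\n}\n}\n"
    for number, text in find_adjacent_duplicates(sample):
        print(f"line {number}: repeated statement: {text}")
```

          A check like this is noisy on code where repeating a line is legitimate, so it works better as a warning that a reviewer looks at than as a hard failure.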

  • 2

    If you’re interested in NLP or machine learning at all, “Deep Learning for NLP (without magic)” was really good (for someone who knows something about NLP but very little about neural nets) http://nlp.stanford.edu/courses/NAACL2013/

    Also https://www.youtube.com/watch?v=yDLKJtOVx5c&list=PLD0F06AA0D2E8FFBA is a list of really good Khan Academy-style videos on ML.

    And Alex Smola’s “Scalable Machine Learning” course is really good if you’re a systems person who wants to learn about applied ML http://alex.smola.org/teaching/berkeley2012/

  • 1

    This is a game-changer for diabetics, not sure why it got the superfluous title.

    1. 1

      Because Google isn’t in the medical devices business, so presumably this is R&D they’re planning to use for some other purpose more aligned with their business interest. Which is creepy as hell.

  • 2

    The thing you get from something like korma that you lose when you go back to writing raw sql is composition. Pretty often you want to write something like:

    (defn feed-ranking [user-id] (-> some (really complex subquery)))
    (defn has-permission [user-id] (-> another (really complex subquery)))
    (defn get-feed [user-id]
        (select user
            (where (has-permission user-id))
            (order (feed-ranking user-id))))
    

    How do you do that in raw sql? Stored procedures?
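
    For comparison, one way to get part of the way there without a query DSL is to pass around SQL fragments together with their bound parameters. This is only a sketch in Python with sqlite3, not korma: the Fragment type, the table names (users, permissions, follows), and both subqueries are invented stand-ins for the “really complex” ones.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """A piece of SQL plus the positional parameters it needs."""
    sql: str
    params: tuple = ()


def has_permission(user_id):
    # Hypothetical stand-in for a complex permission subquery.
    return Fragment(
        "EXISTS (SELECT 1 FROM permissions p"
        " WHERE p.owner_id = u.id AND p.viewer_id = ?)",
        (user_id,),
    )


def feed_ranking(user_id):
    # Hypothetical stand-in for a complex ranking subquery.
    return Fragment(
        "(SELECT COUNT(*) FROM follows f"
        " WHERE f.followee_id = u.id AND f.follower_id = ?)",
        (user_id,),
    )


def get_feed(user_id):
    allowed = has_permission(user_id)
    ranking = feed_ranking(user_id)
    sql = (
        "SELECT u.id FROM users AS u "
        f"WHERE {allowed.sql} ORDER BY {ranking.sql} DESC"
    )
    # Parameters are concatenated in the order their placeholders appear.
    return sql, allowed.params + ranking.params


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY);
        CREATE TABLE permissions (owner_id INTEGER, viewer_id INTEGER);
        CREATE TABLE follows (followee_id INTEGER, follower_id INTEGER);
        INSERT INTO users VALUES (1), (2), (3);
        INSERT INTO permissions VALUES (1, 9), (2, 9);
        INSERT INTO follows VALUES (2, 9);
        """
    )
    sql, params = get_feed(9)
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    print(demo())
```

    The fragments compose, but only as strings: nothing checks that a fragment is valid where it gets spliced in, or that the alias u means the same thing in each piece. That checking is the part a library like korma does for you.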

    1. 2

      Yeah… The closest analog to this is HTML templates, which usually allow you to include another template as part of the SQL. Though this isn’t really part of the SQL spec, at all, and you’d have to invent some language to do it.

  • 7

    If you have 17 million lines of code in one repo you’ve made at least one horrible mistake. Big tech companies like to shoot themselves in the foot all the time and then brag about how good their surgeons are.

    1. 5

      Big tech companies like to shoot themselves in the foot all the time and then brag about how good their surgeons are.

      This perfectly describes the whole software industry.

    2. 4

      At the risk of sounding pedantic: why? What is wrong about keeping all of your code in a single repository?

      1. 5

        As a current Google employee, I second this question. We get lots of benefits from keeping everything together, and I can’t really imagine not doing this in a large, server oriented organization. Well, I can, but I much prefer this approach.

        For a public citation, “We have a single large depot with almost all of Google’s projects on it. This aids agile development and is much loved by our users, since it allows almost anyone to easily view almost any code, allows projects to share code, and allows engineers to move freely from project to project. Documentation and data is stored on the server as well as code.” http://research.google.com/pubs/pub39983.html

        1. 3

          You can view code in other repos very easily (via github, gitweb, etc), you share code by packaging libraries as libraries and specifying them in the dependency system for your environment (pip, rubygems, apt, w/e), and being able to switch projects is determined by your company’s coding standards, which is orthogonal.

          1. 3

            Yep, there’s certainly multiple valid ways to organize things.

            We avoid libraries and prefer to build everything from head, which one giant repository helps make possible, though I’m sure that tradeoff isn’t right for everyone. There’s some info about our build system at http://google-engtools.blogspot.com/2011/08/build-in-cloud-how-build-system-works.html, and I’d expect Facebook does something similar (especially since the config format of their open sourced Buck build tool is very similar to the format of the example BUILD file).

          2. 1

            Submodules or the equivalent in another system would also help.

      2. 2

        Short answer: having everything in the same repo encourages coupling, having each service in a different repo makes you be explicit about dependencies.

        1. 5

          These constraints can be enforced in other ways. For example, Chromium uses modules with explicit dependencies, and lower modules aren’t allowed to depend on higher modules: http://www.chromium.org/developers/how-tos/getting-around-the-chrome-source-code. While using different repos is one way to help maintain separation, it’s not the only way. Having the code together helps keep dependencies in sync and eases refactors that cross module boundaries, while module separation can still be enforced at a higher layer.
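
          A layering rule like that is cheap to check mechanically. Here is a minimal sketch of the idea; the module names and layer numbers are invented for illustration and are not Chromium’s actual layout or tooling:

```python
# Hypothetical layering: a lower number means a lower layer.
LAYERS = {"core": 0, "storage": 1, "service": 2, "ui": 3}


def layering_violations(layers, dependencies):
    """Return (module, dependency) pairs where a module depends on a higher layer."""
    violations = []
    for module, deps in sorted(dependencies.items()):
        for dep in sorted(deps):
            if layers[dep] > layers[module]:
                violations.append((module, dep))
    return violations


if __name__ == "__main__":
    # Hypothetical declared dependencies, e.g. gathered from build files.
    declared = {
        "ui": {"service", "core"},
        "service": {"storage", "core"},
        "storage": {"core"},
        "core": {"service"},  # a lower module reaching up: not allowed
    }
    for module, dep in layering_violations(LAYERS, declared):
        print(f"{module} must not depend on {dep}")
```

          Run as a presubmit or CI step, a check like this keeps the separation explicit even when all the modules live in one repository.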

          1. 1

            I’m not claiming that it’s impossible to have everything in the same repo, just that the drawbacks of doing so are clear and the benefits are dubious in my opinion.

            1. 2

              That’s fair. Some of the benefit could also be cultural as opposed to technical, encouraging the mindset of being “Facebook” instead of “<some component>”.

              DVCSs that require the full history are especially problematic to scale to huge repositories, which is why Facebook’s use of mercurial here is especially interesting. Though I can certainly see where the comment about shooting themselves in the foot and bragging about patching it up is coming from :)

        2. 2

          Isn’t tight coupling a bad thing?

          At a past company there was a single monolithic codebase and multiple daily deployments. When there was a problem during deployment they had to roll back and re-deploy after kicking out the bad patch. This process would waste a lot of people’s time and limited us to two, maybe three daily deployments.

          To combat this, the company moved toward SOA and newer products were deployed decoupled as services. This allowed teams to push as often as they liked, with much faster iteration. Running a full test suite on the old code base took an hour, while services' test suites took minutes or less.

      3. 1

        At the 17 million line level you probably have several distinct products that are bundled together in one repository.

      4. 1

        I worked at a company that built embedded Linux systems; many of the products we supported were years old. All of these systems needed to stay in sync with certain core components, but there were slight modifications to the build for each (for supported features, chipsets, kernel customisations, etc). In this case, a single repo worked well.

    3. 2

      Despite the many apparent advantages of multiple repos, and the lack of obvious advantages to having one repo, every large software company tends towards the mega repo. When multiple organizations develop the same solution to a problem that doesn’t exist, that suggests to me that the problem does exist, I just can’t see it.

  • 2

    Writing segmentation-based image filters. They take a database of textures, and given a source image segments the image and replaces each segment with the closest texture from the database. Use fabric swatches as your dataset and everything looks like it’s patchwork fabric.
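
    The matching step can be sketched roughly like this. It assumes segmentation has already produced a list of pixels per segment, and it uses mean RGB color as the notion of “closest”, which is a simplifying assumption for illustration. All names and sample values here are hypothetical.

```python
def mean_color(pixels):
    """Average a non-empty list of (r, g, b) tuples channel by channel."""
    count = len(pixels)
    return tuple(sum(p[c] for p in pixels) / count for c in range(3))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def closest_texture(color, texture_colors):
    """Name of the texture whose mean color is nearest to the given color."""
    return min(
        texture_colors,
        key=lambda name: squared_distance(color, texture_colors[name]),
    )


def assign_textures(segments, textures):
    """Map each segment id to the name of its best-matching texture."""
    texture_colors = {name: mean_color(px) for name, px in textures.items()}
    return {
        seg_id: closest_texture(mean_color(px), texture_colors)
        for seg_id, px in segments.items()
    }


if __name__ == "__main__":
    # Hypothetical swatches and segments, a few pixels each.
    textures = {
        "denim": [(20, 40, 120), (30, 50, 130)],
        "burlap": [(150, 120, 80)],
    }
    segments = {
        0: [(10, 30, 100)],
        1: [(160, 130, 90), (140, 110, 70)],
    }
    print(assign_textures(segments, textures))
```

    Mean color ignores the pattern inside a swatch, so a real filter would probably compare richer features, but the structure (summarize each segment, find the nearest database entry, substitute it) stays the same.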

  • 1

    This suit was in district court, so the ruling doesn’t set a strong precedent and they still have two appeals left (one to appellate court, one to supreme court). In cases like this the district court ruling doesn’t make much difference since an appeal is basically certain.

  • 1

    If anyone has questions about the marginal ideas that make up this library, feel free to ask.

    1. 1

      Have you tried getting any of this in core?

      1. 1

        Relatively few of the things in Potemkin are an objective improvement over the Clojure semantics, and even then the process for getting things into core is punishingly slow. The only time I’ve seen fast movement is when I’ve found an actual bug.

        And honestly, if enough people were bothered by stuff like partial transparency to transaction retry exceptions [1] to warrant changing core, Potemkin would probably be in much wider use.

        [1] https://github.com/ztellman/potemkin/blob/master/src/potemkin/utils.clj#L40-L59