Correct me if I’m wrong, but I believe all AI models are trained from existing code bases, so the models are not particularly good at solving unseen problems. Who will write those programs?
You’re correct they’re trained on existing code bases. You’re incorrect about their not being able to solve unseen problems.
I’m no expert, but as far as I understand it, the underlying code base can be considered a set of building-block elements that can be connected together to create a larger, novel whole. If you have building blocks, rules on how to attach them and, maybe, rules on whether the resulting answer is admissible or not, then you can produce “novel” solutions to unseen problems.
Whether the current round of neural networks does this well, or does this in a sophisticated enough way, is, I guess, open for debate but the idea is easy enough to grasp.
This is quite a good talk.
There are a lot of Rust proponents pushing the “correctness” angle of computation. I think there will always be a place for that kind of focus but, from what I gather from Sussman’s talk, a more fundamental problem is “evolvability”, not “correctness”. That is, “correctness” can only get you so far, is slow, and is potentially very brittle, whereas thinking about systems from an “evolvability” perspective lets you make a minimal set of assumptions, allowing for more robust systems that respond to unknown situations better.
Sussman stresses that it’s not either/or, there will be contexts when “correctness” supersedes “evolvability” (or whatever other programming model you want to use) but that evolvability is probably the better model for making large scale, scalable and robust systems.
I think correctness and evolvability may be correlated. That’s part of why Haskell has become my go-to language for prototypes. I can grow and mutate my code faster because I know that I’m not breaking anything in the process. When I’m working in less strict languages, I work slower to ensure that I’m not introducing regressions.
This was a joy to watch!
Made me wonder whether there is more juice than I thought in the “0 dependencies” folks and the suckless philosophy manifesto (https://suckless.org/philosophy/).
Thank You, Mr. Conway. And we’re sorry ahah
Every dependency carries a cost that must at least be justified by its benefits. So yeah, cutting back on dependencies does have its benefits. On the flip side, we can also reduce the cost of each dependency: if I’m writing a library, it will cost less if I manage to simplify the build system, the size of the API… or the number of sub-dependencies.
Note that API width is a huge part of a dependency’s cost. The wider the API, the longer it takes to learn. Wide APIs effectively require more communication bandwidth, bandwidth that you do not have, because you’re just trying to make your stuff work, and the expert who wrote your dependency doesn’t answer emails anyway.
That’s why Ousterhout insists that classes should be deep: APIs, internal or exposed, are overhead that is best minimised. I can think of three ways to do this:
That last one is often forgotten. Take OpenSSL for instance, specifically its BIO (Basic I/O) interface. Originally, it provided lots of functions that took file descriptors as parameters. Except people also needed to work with memory buffers, without having to write files just because the API said so. They “corrected” the mistake by basically duplicating the whole API with bio versions of the same functions. As for the BIO objects themselves, they were actually an abstraction that could hide either a file descriptor or a memory buffer, depending on how they were set up.
In hindsight, this design is absolutely insane. It’s pretty obvious now that they should have had a memory-based interface instead. OpenSSL is a cryptographic library. It computes stuff, and as such should not concern itself with the trivialities of I/O. Just read from and write to buffers, and let users move the data where they need to. If users wanted memory buffers, they have them. If they wanted to write to files or the network, they can just do so. And indeed, more modern cryptographic libraries now do stick to simple memory-based interfaces.
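To make that concrete, a buffer-only interface looks something like the following sketch (the name and signature are invented for illustration, not taken from OpenSSL or any particular library): the library only reads and writes caller-provided memory, and how the bytes got there (file, socket, pipe) is entirely the caller’s business.

#include <stdint.h>
#include <stddef.h>

/* Hypothetical buffer-only AEAD encryption: all inputs and outputs are plain
   memory buffers owned by the caller; the library never touches the OS. */
int aead_encrypt(uint8_t       *ciphertext,    /* out: same length as plaintext */
                 uint8_t        mac[16],       /* out: authentication tag */
                 const uint8_t  key[32],
                 const uint8_t  nonce[24],
                 const uint8_t *plaintext,
                 size_t         plaintext_len);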
This response is a much more introspective take, thank you.
What the “Molly Rocket” fails to put emphasis on is that there are benefits to subdivision and that the reason to incur the cost of abstraction is because those benefits outweigh the cost. Thanks for highlighting this and for providing practical advice on API design.
What the “Molly Rocket” fails to put emphasis on is that there are benefits to subdivision and that the reason to incur the cost of abstraction is because those benefits outweigh the cost.
I believe he does so deliberately, to swing a pendulum he feels has swung too far the abstraction way. He probably assumes most of us have already been thoroughly exposed to the many benefits of abstraction. I wouldn’t repeat this advice in front of beginners however, they’d run the risk of feeling justified in making a mess of their programs.
I was ready to bag on the video but it was very good and I learned a lot of subtlety about Conway’s Law that I hadn’t considered before.
I will say that he’s pretty critical of the subdivision implied by Conway’s law and makes sweeping statements about it being bad. His argument is that subdivision of labor into modules in the org/code chart necessarily imposes a communication interface that restricts the possibility space a priori, before the solution space is understood. While this is true, the thing he misses is that breaking problems into components allows for parallelization and subdivision of labor, a point that he almost surely understands because he mentions Amdahl’s law, but he doesn’t make the leap to give his interpretation more nuance.
Put another way, yes, subdividing the space into an ‘org-chart’ necessarily imposes constraints on the solution space but this is, in some sense, the best we can do with any problem. The possibility space is exponentially large and without this type of structure the exploration could take exponentially longer as the problem grows. We restrict ourselves to problems that can be solved efficiently (or some semblance thereof).
The deeper question, for me, is coming up with strategies that can allow for sections of the org chart to either be merged into other parts more seamlessly or, more realistically, be made so that they can be removed with minimal damage.
Put another way, yes, subdividing the space into an ‘org-chart’ necessarily imposes constraints on the solution space but this is, in some sense, the best we can do with any problem.
This is exactly what he was saying.
No, he said multiple times that this was “very bad”, “wrong” and that we shouldn’t do this in order to solve more complex problems.
A kind reading is that he’s saying there’s an optimal solution that might be within reach with better processes or “processing power” (maybe via a Musk implant?), but he misunderstands the nature of the fundamental problem.
It’s not about being clever. The solution space grows exponentially. There is no hope but to break problems into smaller segments for solution. The best we can ever do is subdivide the space to find a solution and to find a cleaner way of swapping out our subdivided blocks to explore different parts of the solution space.
Put another way, you’ll quickly end up spending more energy searching the space than you would ever extract from finding the solution. It’s easy to construct examples where you’ll quickly end up using the energy budget of the universe before finding an optimal solution.
I agree with the sentiment. “Lean and flexible” is a great philosophy. I’m arguing over the inflection of how to implement such a strategy. He seems to be advocating for “throw out encapsulation, explore the space for the optimal solution”. I’m advocating for “understand that encapsulation is a fundamental constraint, figure out how to swap encapsulated bits more easily”.
Ah, I see. Even a Superhuman AI past its Intelligence Explosion would be limited in finding the best program for any given problem, and thus has to find ways to cut the search space. And at the very least, assuming P ≠ NP, a strategy that guarantees we’ll find the best solution is also guaranteed to be computationally intractable. Thus, all tractable strategies will have suboptimal results, and we have to accept that.
Still, his point remains: our human capabilities fall far short of a superintelligence’s, and the skill of the average hireable programmer is quite a bit lower than that of our best geniuses. This means we need to cut the search space even more drastically, and end up finding even worse solutions.
This is made even worse at the start of a new project, when we know the least about it. We need to be prepared for the fact that a couple of months or years down the line, we’ll understand the problem better, and see solutions we could not even dream of at the start. That would be my “lean and flexible” approach: understand that we start out as noobs, and leave our future expert selves room to find better solutions. That often means not setting teams in stone.
Of course, if a field is properly understood, it makes sense to put strong interface boundaries where we just know they won’t cost much. For instance, it makes a lot of sense to write cryptographic libraries whose sole job is to read and write memory buffers. Not only does it enable a nice separation of labour, such interfaces have very little computational cost, compared to that of the cryptographic primitives themselves (at least on CPUs).
Yes and I think there is more:
Very often an a priori subdivision allows us to solve a problem faster than we could without it, and while something better is possible, it is good enough.
For many business areas most of the individual problems are trivial and the only challenge is to make them all work together.
Enforcing a consistent super-structure on multiple sub-solutions might well be the best way. I think he acknowledges this and at the same time condemns it. I have a hard time understanding why you would condemn the best possible solution.
Practically, I wonder whether it is more harmful to frequently reorganize the org around the problems at hand or to keep the org relatively static.
You could argue that the second leads to easier-to-understand designs, since the time axis of the org chart is eliminated. Of course, that only holds if the org has a good enough structure.
The first seems to be a more project-oriented approach. I think it actually works well for new stuff, but then maintenance and ownership become difficult.
Here’s the relevant JavaScript that makes an XMLHttpRequest, which is then interpreted by an AWK script scanning the web server logs:
...
function reportOrNot()
{
    if(hadActivity) {
        activityCount++;
        if(Math.random() < reportingProbability) {
            let url = '/articles/report.json?scrollPerc='+Math.round(scrollPerc)+"&count="+activityCount;
            var oReq = new XMLHttpRequest();
            oReq.open("GET", url);
            oReq.setRequestHeader("Cache-Control", "no-cache, no-store, max-age=0");
            // fallbacks for IE and older browsers:
            oReq.setRequestHeader("Expires", "Tue, 01 Jan 1980 1:00:00 GMT");
            oReq.setRequestHeader("Pragma", "no-cache");
            oReq.send();
        }
        hadActivity=false;
    }
}
...
the entirety of the JavaScript code is barely larger than the above snippet but I didn’t want to wash out the comments section.
This is so simple and yet so powerful! I wish I had thought of this.
I wonder if it would be acceptable under the GDPR to include something like a 16-bit cryptographically random number in a cookie in combination with this (Note: As I understand it, GDPR does not require consent for cookies, it requires consent for tracking, irrespective of the underlying technology).
This would let you spot return visitors (for long articles, it might be interesting to know if people save the link and come back later or if they just leave - I’m somewhere in the middle and typically leave the browser tab lying around for a few days until I have some time), with some error margin: if you have more than about 32K visitors to the site, you’ve got a high probability of collisions in the identifiers and so there’s some anonymity. If that’s not allowed, what’s the threshold? Would an 8-bit identifier be sufficient? Is an 8-bit identifier + an IP address regarded as tracking?
I found some useful information related to this problem in Introduction to the hash function as a personal data pseudonymisation technique by the European Data Protection Supervisor.
Thanks. That’s a great read, but it doesn’t quite address the use case I was suggesting. The identifier that I’m considering (hypothetically - I don’t really do anything in this space, so this isn’t something I’m actually going to implement) is a random number that’s chosen with a uniform distribution and with no checks for collisions. If I picked an 8-bit identifier and I had a thousand visitors to my site then I’d expect about four of them to have the identifier 42, about four to have the identifier 255, and so on. Unlike a hash function, there’s no way of going from the user to the identifier: they just pick a random number and store it in a cookie. If they clear their cookies, they’ll get a different random number (and I’ll see a few more unique visitors than I expect), if I get more collisions on a particular number, I’ll see fewer unique users than are really there. This doesn’t matter hugely if what I’m trying to track is whether people bookmark and come back to long articles, because I’m only going to get a rough approximation anyway.
It gets interesting when you add in IP addresses in logs because, although the short random ID is not a unique tracking token, I probably have very few (on the order of 1) visitors per IP address and so it may be sufficient to differentiate between people in a household, which might make it into PII.
Are there any benefits in terms of data locality for this method vs. sequential file storage/access? It would be nice to see some benchmarks if so.
Yeah, it would be fun to try some benchmarks. My intuition tells me that the method I’ve developed is mostly useful for large universes with lots of data and queries of different sizes and dimensions, tall, wide, small, large, etc. Or if you want to begin indexing data without knowing what the queries will look like in the future.
If you only have to support one kind of query, like a square of a certain size, indexing the data as [x,y] or [y,x], with the first, most significant dimension rounded to approximately the size of your query rectangle, would probably be just as performant for reads. You would do multiple range queries over those “rows” or “columns” like I described in my “naive approach” example. But as soon as the size of the queried area can vary by orders of magnitude, this approach starts to break down, and you start seeing an unnecessary trade-off between wasted IO bandwidth and number of range queries.
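Concretely, I imagine the sliced key would look roughly like this (a sketch with made-up names, not code from the article): the slice index sits in the high bits, so a single range scan covers one horizontal strip, and a query taller than a slice just issues one scan per strip.

#include <stdint.h>

/* Sketch of the "naive" sliced key: y is rounded down to a slice roughly the
   height of the expected query rectangle and placed in the most significant
   bits. Entries may share a key; the exact (x, y) would live in the stored value. */
uint64_t sliced_key(uint32_t x, uint32_t y, uint32_t slice_height) {
    uint64_t slice = y / slice_height;   /* coarse row index, most significant */
    return (slice << 32) | x;            /* x orders entries within the strip */
}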
Ok, I couldn’t help myself, I did do the benchmarks: https://sequentialread.com/building-a-spatial-index-supporting-range-query-using-space-filling-hilbert-curve/#benchmarks
Nice, thanks, that’s awesome!
Can you give a little interpretation on this, though? The “number of range queries per rectangle” is high while the query time is small for small to tiny queries. This means that the Hilbert curve method makes many range queries in a rectangle compared to the sliced method but still outperforms it by 2x-7x?
I think it’s because bandwidth is the limiting factor in my test in that case. Look at the “wasted bandwidth” metric for the tiny queries: the query time graph looks a lot like the wasted bandwidth graph. The CPU also has to do work to throw out all the entries outside the rectangle. A wasted bandwidth metric of 2 means that there were twice as many entries found outside of the query rectangle as there were entries found inside. And it goes up to 18 in the worst case!! In other words, for the 32-pixels-wide slice (universe is 16 slices tall), only about 5% of the data that the query was hitting was actually inside the rectangle.
I will say I did this benchmark in 1 day and I didn’t try to optimize it much, so it’s possible that there are other bottlenecks getting in the way of the accuracy of the results. So take it with a grain of salt.
The main thing I wanted to show was how the space filling curve method works no matter what the scale of the query is. Even though they look different because the graphs have different scales, the amount of wasted bandwidth and range-queries-per-rectangle stays constant across all the different query sizes for the Hilbert index.
Also, you can tune its performance characteristics a little bit at query time – in other words different kinds of queries can be tuned individually with no extra cost. While the sliced method outperforms the space filling curve method when the slice size is tuned for the query, the problem is you have to re-index the data every time you want to tune it to a different slice size.
What do you mean by sequential file storage/access? Do you mean scanning through the whole spatial index data?
The magic is mostly here:
q->buffer = mmap(NULL, 2*size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
...
mmap(q->buffer, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, q->fd, 0);
...
mmap(q->buffer+size, size, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, q->fd, 0);
Where 2*size is allocated for the buffer, then both halves of the buffer are mmap’d to the same file. This allows semantics to be something like buffer[start + index] = 'x', without the need for bounds checking.
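A minimal usage sketch of what that buys you (not the article’s exact code; the size and tail fields are assumptions, while buffer and head match the snippets): a write that runs past the size boundary simply lands in the second mapping, which aliases the start of the same memory, so there is no split copy and no bounds check.

#include <string.h>
#include <stddef.h>

struct queue { unsigned char *buffer; size_t size, head, tail; };

/* Assumes len <= free space in the queue. The copy may cross the q->size
   boundary; the doubled mapping makes that land back at offset 0. */
void queue_push(struct queue *q, const void *data, size_t len) {
    memcpy(q->buffer + (q->head % q->size), data, len);
    q->head += len;   /* head grows monotonically; reduce modulo size when indexing */
}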
The article claims 3x speedup on write.
In some sense, this is basically saying that the operating system is faster at duplicating writes than a program doing modular arithmetic and bounds checking?
this is basically saying that the operating system is faster at duplicating writes than
I don’t think this is duplicating writes - the mmap calls are just aliasing, right? The fundamental magic is what q->fd points to; that’s the only real memory allocated.
E.g. this:
q->fd = memfd_create("queue_buffer", 0);
ftruncate(q->fd, size);
Is the only real, actual allocation that happens. The mmap calls are then set up to create messed-up virtual memory addresses, where both the address at q->buffer and the address at q->buffer + size actually point to the same real memory.
Once you get to the end of the first page-table section, at q->buffer + size - 1, and you write at q->buffer + size, what actually happens is that you just get teleported to the beginning of q->fd again, magically handling the wrap-around.
I would guess the reason this is faster is that it’s hardware-optimized, because of the https://en.wikipedia.org/wiki/Translation_lookaside_buffer
Which begs the question: Is this actually faster, when a real program has lots of other things contending for the TLB?
Either way, super cool. Seems like a sure-fire way to find kernel bugs!
Edit: Re-reading your comment, realizing you’re saying the same thing I’m saying, and I think I’m just misunderstanding what you mean by “duplicating” writes.
The magic is mostly here
Yes
In some sense, this is basically saying that the operating system is faster at duplicating writes than a program doing modular arithmetic and bounds checking?
The idea is that once the relevant pages are faulted in, the operating system isn’t involved and the MMU’s address translation hardware does it all.
The fact that the microbenchmark at the end of the post doesn’t actually use the queue is a liiiiitle suspicious. But queues are inherently very difficult to benchmark.
Edit: no wait, it’s multi threaded queues that are difficult to benchmark. (Because tiny changes in the average balance between producer and consumer threads can wildly change perf.) No reason why single threaded queues shouldn’t be benchmarked.
I imagine most of the speedup comes from being able to use memcpy() instead of a for loop for copying the data in and out of the queue.
The “slow” implementation that is ultimately compared does use memcpy instead of a for loop. They calculate the end using a single modular arithmetic call, then do two memcpy’s.
Here’s the relevant code snippet:
memcpy(data, q->buffer + q->head, part1);
memcpy(data + part1, q->buffer, part2);
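Presumably the split is computed along these lines (a sketch with my own names, not the article’s code): part1 is whatever fits before the end of the buffer, part2 is the remainder that wraps around to the front.

#include <string.h>
#include <stddef.h>

/* Read len bytes out of a conventional ring buffer with a split copy.
   head is the read position in [0, size); len is assumed <= available data. */
void ring_read(unsigned char *data, const unsigned char *buffer,
               size_t size, size_t head, size_t len) {
    size_t part1 = (size - head < len) ? size - head : len;  /* up to the end */
    size_t part2 = len - part1;                              /* wrapped part  */
    memcpy(data, buffer + head, part1);
    memcpy(data + part1, buffer, part2);
}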
Sun Microsystems tries to sell Brendan Gregg’s own software back to him, with the GPL and author credit stripped (circa 2005).
Great article.
I found this talk very good, thanks for the link.
For the lazy, there’s a video link of the talk. There’s also a paper referenced by the talk, “Out-Sourced Profits - The Cornerstone of Successful Subcontracting” by Hart-Smith, a Boeing employee (written in 2001).
The basic idea is that if a company aggressively subcontracts all of their value proposition, then they hollow out their core and become “zombie companies” that eventually implode. The rule of thumb Hart-Smith (and Hubert) give is 10%.
I’ve heard this idea before, that large companies often go through a change of stopping innovation and basically becoming financial companies, but Hubert and Hart-Smith fill in some details on how that process happens (incremental profit motivations to subcontract, disincentivizing in-house expertise from staying, encouragement from share-holders because ‘on-paper’ profits are still good, etc.).
Worth a watch.
Is there some type of website that allows for creating “tools that I want” proposals along with upvoting to get feedback from a community?
The following websites allow posting and voting on software ideas:
I can’t imagine what argument would convince me that the license of your artefact should in any way inform the choice of implementation language, but I’d love to hear anybody try.
The basic idea is:
To me, this is kind of talking about corporate vs. “DIY” (or “artisanal/amateur/journeyperson”) development, and it so happens that most FOSS projects are written by sole developers or a small number of people. As such, FOSS projects will presumably favor languages that allow ‘shortcuts’, high levels of expressiveness, perhaps a DSL. Sole developers also won’t be as willing to put up with a toxic community.
In a corporate context, consistency is much more important as there might be high turnover of developers, a large code base without any one or a few people knowing the whole code base. Consistency and standardization are favored as they want to be able to have programmers be as fungible as possible.
You can see this in Golang, especially considering one of its explicit intended goals was to serve Google’s needs. The fact that it can be used outside of that context is great, but the goal of the language was in service to Google, just as Rust was in service to Mozilla and its browser development effort. The same could also be said of Java, as it was marketed as a “business friendly” language for basically the reasons listed above.
The speaker goes on to talk about Raku, which I guess is what Perl6 turned into (?), as being one of the fun languages with a friendly community.
So I think it’s a little reversed. It’s more like: most free software is written by a single person or a small number of people, and this workflow has a selection bias toward a particular type of language, or a language that favors a particular set of features while discouraging others.
Yikes, I wouldn’t touch perl with a barge pole :)
I understand the idea behind small teams better being able to handle a codebase filled with dynamic magic and various other “spooky action at a distance”, but the problem isn’t just how much cognitive load you’re wasting getting up to speed, it’s the defects you build because so much is hidden and things you thought would be orthogonal end up touching at the edges.
Raku really isn’t Perl. I don’t know enough Perl to have an opinion on it one way or the other (though its sub-millisecond startup time – about an order of magnitude faster than Python and about the same as Bash – gives it a pretty clear niche in some sysadmin settings). But they are definitely different languages.
The analogy I’d pick is that their relationship is a lot like the relationship between go and C. Go (Raku) was designed by people who were deeply familiar with C (Perl) and think that it got a lot right on a philosophical level. At the same time, the designers also wanted to solve certain (in their view) flaws with the other language and to create a language aimed at a somewhat different use case. The result is a language that shares the “spirit” of C (Perl), but that makes very different tradeoffs and thus is attractive to a different set of users and is not at all a replacement for the more established language.
TBH I know nothing about Raku. I vaguely remember the next version of Perl being right around the corner for years, but by then I’d had enough of line noise masquerading as source code (which all Perl written to be “clever” definitely was at the time) so I wasn’t paying attention.
There’s an argument that you might not introduce those defects if you’re a single person working on a project because the important bits stay in your head and you remember what all the implicit invariants are.
I personally buy a weak form of that argument: things developed by individual developers do tend to have webs of plans and invariants in those individuals’ heads. AFAIK there’s some reasonable empirical research indicating that having software be modified by people other than the original authors comes with a higher rate of defects. (Lots of caveats on that: e.g. perhaps higher quality software that has its design and invariants documented well does not suffer from this.)
I’m told that hardware / FPGA designers tend to be much more territorial about having other people touch their code than software people, because of the difficulty of re-understanding code after it has been edited by someone else, because hardware contains a greater density of tricky invariants than software.
I hear what you’re saying and I think there’s a lot of validity to it but there’s a lot of subtlety and shades of gray that you’re glossing over with that argument.
So here’s a weak attempt at a counter-argument: all that dynamic magic or other trickery that might end up messing things up for beginner/intermediate programmers, or even programmers that just aren’t familiar with the trickery/context/optimizations, is not such a big deal for more experienced programmers, especially ones that would invest enough time to be the primary maintainer.
It’s not that one method is inherently better than the other, it’s that the context of how the code is created selects for a particular style of development, depending on what resources you’re optimizing for. Large company with high turnover and paid employees: “boring” languages that don’t leave a lot of room for rock star programmers. Individual or small team: choose a language that gives space for an individual programmer’s knowledge to shine.
I saw a talk (online) by Jonathan Blow on “How to program independent games” and, to me at least, I see a lot of similarities. The tactics used as a single developer vs. a developer in a team environment are different and sometimes go against orthodoxy.
There’s no silver bullet and one method is not going to be the clear winner in all contexts, on an individual or corporate level, but, to me at least, the idea that development strategies change depending on the project context (corporate vs. individual) is enlightening and helps explain some of the friction I’ve encountered at some of the jobs I’ve worked at.
Right, but if you’re the only person working on an OSS project you’re not doing it full time, and you’re also (probably) working on a wide variety of other code 9-5 to pay the rent, basically meaning whenever you’ve got some time for it, you’re always going to be coming back to your OSS code without having it all fresh in working memory.
(disclaimer: video author)
[The larger problem is] the defects you build because so much is hidden and things you thought would be orthogonal end up touching at the edges.
I 100% agree with this; spooky action at a distance is bad, and increases cognitive load no matter what. However, I think a language can be both dynamic/high context and prevent hidden interaction between supposedly orthogonal points.
Here’s one example: Raku lets you define custom operators. This can make reading code harder for a newcomer (you might not know what the symbols even mean), but is very expressive for someone spending more time in the codebase. Crucially, however, these new operator definitions are lexically scoped, so there’s no chance that someone will use them in a different module or run into issues of broken promises of orthogonality. And, generalizing a bit, Raku takes lexical scope very seriously, which helps prevent many of the sorts of hidden issues you’re discussing.
(Some of this will also depend on your use case/performance requirements. Consider upper-casing a string. In Raku, this is done like 'foo'.uc, which has the signature (available via introspection) of Str:D --> Str:D (that is, it takes a single definite string and returns a new definite string). For my use case, this doesn’t have any spooky action. But the Zig docs talk about this as an example of hidden control flow in a way that could have performance impacts for the type of code Zig is targeting).
Can you imagine Free Software being produced with MUMPS? Some languages are only viable in certain corporate environments.
I read this as “can you fit a whole genome into a QR code”. Weird.
For anyone wanting an answer to that question, here’s a SO question and answers:
I have been in a similar spot and I can understand this kind of small disappointment. I will offer a few thoughts that may or may not help!
This can be hard, but understand that by the maintainer setting aside your commit, they were surely not making a personal slight. By instead using their own commits for the fix, they are NOT saying your proposal was “wrong” or “not good enough”.
Remember the only social expectation a good open source maintainer could owe you for an unsolicited proposed contribution is a hearty thanks. You made the valiant choice to give freely of your time and talent. You should feel good about that! What you offered was a gift, and when we offer any gift, we must be mindful that we cannot expect reciprocation. That would make it a transaction, not a gift!
There are a variety of causes for why the maintainer did not merge your patch as you may have imagined they would.
You didn’t say what project it was, which is fine, but if it’s a big project or from a large organization, consider they often have strict contribution guidelines that are necessary for legal reasons, such as a “contributor license agreement”.
If your report was just a small typo or a few words, it would be a little silly for them to ask you to sign a big CLA before merging your patch. It’s possible they might have read your report, ignored the patch, and made their own fix. If that’s what happened, they actually did you a favor, by saving both of you that overhead effort.
Most large projects have a file in the repo named CONTRIBUTING or similar, that would lay this out.
It’s also possible they wanted to make the fix a slightly different way, and it was easier for them to do it directly rather than merge your patch and then make another commit on top of it. Maybe they want the commit message written a certain way, so the git history is more to their liking. Projects do not get points based on how many commits they merge in! :)
As a takeaway, remember that what you offered did have value – you saw the pothole, and then your intended impact was achieved, in that it got fixed. The question of whose version of the fix made it to the git history becomes irrelevant. Future strangers will no longer trip over this particular pothole thanks to your report.
For myself, I am glad the open source community has people doing things like you did!
I think this comment is spot on but I wanted to add a few points.
When I first started getting into free/libre/open source, I had a default understanding of what it meant to be “open” and allow for community contributions, in that I thought these types of “drive-by” patches/bug fixes/etc. were the norm. Free software projects are varied and have different ideas of what it means to allow for community involvement, and this “bazaar” approach of folding in changes proposed by community members with low engineering involvement is one style of contribution. The “cathedral” approach is another, whereby the developers are the ones with sole access to the repository, only allow contributions from their inner circle, and extend their inner circle selectively with folks who commit to having a deeper involvement in the project.
Remember that “The Cathedral and the Bazaar” [1] talks about an Emacs/Stallman type style of contribution (the “cathedral”) vs. Linux/Torvalds style of contribution (the “bazaar”). I had only really heard the title and, in my ignorance, assumed that Raymond was making a “Linux vs. Microsoft” argument.
From my personal experience, when people submit patches to my own (very small and not very popular) FOSS projects, I have an initial reaction of “not quite like that, like this”. I want to practice a more “bazaar”-like methodology, so it’s something I’m trying to wean myself off of, but it’s a natural reaction that I have to overcome and one, I imagine, many other people feel. I think accepting contributions of this sort also provides a welcoming experience, so people get positive reinforcement that paves the way for more substantial contributions.
I think [tedchs] correctly points out that a CONTRIBUTING file (or something similar) with guidelines is sometimes present and should maybe be the norm (maybe with some type of template people can choose from), but this is a layer of process that needs to be created. For small projects, especially ones that don’t have a concrete idea of how to accept contributions, this is a layer of process infrastructure that adds complexity and might not be appropriate for the project’s current scale or scope.
As a general rule of thumb, when trying to contribute to other projects, I usually create an issue with an offer to help and, only after confirmation, proceed to contribute. I violate this all the time, but it is one tactic to figure out whether the project is “cathedral-like” or “bazaar-like”.
[1] https://en.wikipedia.org/wiki/The_Cathedral_and_the_Bazaar
To add a different voice:
I, as a maintainer, think this is disrespectful.
If I receive a PR, it’s not even a question for me whether I acknowledge the contributor’s work and retain his/her authorship.
If the PR needs to be adjusted, I either tell the contributor, or do it myself in a separate commit. If the adjustment is very small, I may do it in the original commit, but still retain him/her as the author.
If your report was just a small typo or a few words, it would be a little silly for them to ask you to sign a big CLA before merging your patch.
It sometimes happens even for non-trivial patches and even if you sign CLA. Example: Unix domain sockets (UDS, AF_UNIX) in System.inheritedChannel() and elsewhere in Java. (later implemented internally).
Nobody is obliged to accept your contribution (which is OK – on the other hand, nobody can force you to merge his code).
If your report was just a small typo or a few words, it would be a little silly for them to ask you to sign a big CLA before merging your patch.
Copyright rights in the U.S. can be terminated by the estate after the author dies. This is one of the reasons that most corporate open source projects require a CLA. Here is a post by a lawyer that describes other problems solved by a CLA.
I created a JSON version of the Hershey Vector Font for one of my projects a while back, which can be found here.
For those that don’t know, the Hershey Vector Font is one of the go-to fonts for CNC or other “industrial” processes, like silk-screen text for PCBs. Sometimes you want a simple line font just to print some basic text without having to deal with the complexity of parsing TrueType fonts or doing Bezier curve interpolation, etc.
A way to see the Gray code is to think of doing a non-self-intersecting walk on the hypercube that visits all nodes (that is, finding a Hamiltonian cycle on the hypercube).
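To see it concretely, here is the standard binary-reflected construction (my addition, not from the parent comment): the n-th code word is n ^ (n >> 1), consecutive code words differ in exactly one bit, so each step moves along one edge of the hypercube and the full sequence traces a Hamiltonian cycle over all corners.

#include <stdio.h>

int main(void) {
    const unsigned bits = 3;                     /* 3-cube: 8 corners */
    for (unsigned n = 0; n < (1u << bits); n++) {
        unsigned gray = n ^ (n >> 1);            /* binary-reflected Gray code */
        for (int b = (int)bits - 1; b >= 0; b--)
            putchar((gray >> b) & 1 ? '1' : '0');
        putchar('\n');                           /* prints 000 001 011 010 110 111 101 100 */
    }
    return 0;
}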
The best explanation I’ve seen.
Until reading this post, I didn’t understand that the whole reason for currying and the contortions of the Y-Combinator is that the lambda calculus only supports function literals, function application, and functions that take exactly one argument.
The Y-Combinator, if I’m understanding it right, is essentially a guide to make Lambda Calculus/functional programming behave like programming languages we’re used to.
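For reference, the standard definition (my addition, not from the post) is just:

Y = λf. (λx. f (x x)) (λx. f (x x))

so that Y g reduces to g (Y g): g gets handed “itself” as an argument, which is how recursion is recovered using nothing but single-argument function literals and function application.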
From a HN comment:
Functional programming: Making you feel clever by allowing you to solve problems that nobody else even knew existed, in order to let you do what everyone else could do from the start.
For folks wanting easily parseable tarot data, check out dariusk/corpora, specifically the tarot_interpretations.json file.
I’d love to see a free/libre ‘hacker tarot card’ deck.
“On the Paradox of Learning to Reason from Data” by Zhang et al. (paper), on how neural networks have a hard time learning even simple logic from examples.
An old concept that I’m just now catching up with but various papers on Belief Propagation:
“Multi-Scale Truchet Patterns” by Carlson (paper)
“The Dynamics of the Transition from Kardashev Type II to Type III Galaxies Favor Technosignature Searches in the Central Regions of Galaxies” by Wright et al. (paper)
“Fast Poisson Disk Sampling in Arbitrary Dimensions” by Bridson (paper)