Disagree with the conclusion, the answer is much simpler. As a rule, if the project isn’t documented just assume it probably isn’t suitable for public use and certainly isn’t suitable for production use.
GitHub exists to share so by all means put code in any form up for others, just don’t put it undocumented into a package manager like Hackage or PyPI where it can subtly be pulled in. That is the real harm to communities.
My question is though, what do you hope to gain by sharing? Many people actually post these projects with the intent that people will use them, but then offer little or no documentation. Even many widely used projects have scant documentation (yes, few have none at all). In fact, the basis for the article came out of my own frustrations trying to use a number of open source projects recently in preparation for a conference presentation - many of which were widely used (and many users of which shared my frustrations).
I keep virtually all of my code on GitHub, that way, I have a backup. What few non-open source things I do are in private repositories.
Or if I have multiple computers, it lets me share code between them without having to do some kind of obnoxious syncing.
A ton of people have dotfiles repos. No, my dotfiles are not production ready. No, I’m not going to mark my dotfiles as not ready for production.
Right. I think a lot of people do. What I suggested though is that you indicate with some sort of disclaimer that this is personal or experimental and you do not intend to support, maintain or document it. Makes it quick and easy for you and quick and easy for a consumer to make an informed choice about using it.
This is always true with open source, no matter what. We like to pretend that it’s not true, but at the end of the day:
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
There are projects that I’ve been the maintainer of with tens of thousands of users, and then life happened.
There are all kinds of other things that point toward this, though: only one contributor, very few commits, no documentation. That says it way more effectively than any kind of disclaimer I could make.
Yeah. We’ll know software engineering has made progress if we ever get to a point where we can safely omit that wording. It’s hard to imagine, right now, even with formal verification.
It’s hard to know or predict what someone will do with a piece of software. There are a finite number of requirements for a screw you buy at the hardware store. If your shed collapses because you didn’t use enough screws, that’s obviously your fault. If your website collapses because you used the wrong template engine, well how could you have known that? Obviously time to blame the author.
Absolutely. We are in agreement. :)
There are plenty of examples where this simply isn’t true (in fact, the author clearly intended it to be shared, but didn’t bother doing any real documentation). Thus the impetus for the article.
(in fact, the author clearly intended it to be shared, but didn’t bother doing any real documentation)
This is saying that it’s intended to be shared. But in the end, it’s still not offered with any kind of warranty; I can release a library with no documentation if I don’t want to write documentation.
Of course you can. I am saying you shouldn’t. Not sure who it’s helping to do so.
Documentation decreases the barrier to entry to using a particular piece of software. Zero documentation does not necessarily imply an infinitely high barrier to entry. If someone is motivated enough, the lack of documentation may not prevent them from reaping the benefits of the code that was shared.
I get that the world would be a much nicer place if we didn’t share things that were significantly lacking in quality. We should definitely continue to advocate in favor of documenting your code, even if you don’t share it. But it’s the wrong thing to should-people-to-death for. The better thing to optimize for, IMO, is to teach others how to identify whether a project is worth using or not.
Because other people can reference it if their time is not valuable and they’re willing to put the time in to read the source. Code gets open sourced for a lot of reasons, and a lot of them are not necessarily to make something that’s immediately usable in a commercial setting. How people donate their time is really not something you can expect to control.
Thus the above advice: If it’s not documented and you’re on the clock, assume the library doesn’t exist honestly.
There are plenty of employers who look at your GitHub activity as a quick gauge of your code/activity level outside of work or classes. Not necessarily a fan of the practice, but it is a real reason people put up code that’s not for widespread consumption.
Many users shared your frustrations, but they’re still users. Their frustrations with the project must be less than whatever frustrations drove them to use someone’s undocumented code. If enough people find this code useful, maybe they’ll consider contributing documentation back to the code base as they figure it out. If the author sees this, they may realize that if people find the undocumented draft project useful, that maybe it’s worth some effort to improve it. Or the author doesn’t care, someone forks the project, and it takes off.
There are many paths to useful open-source projects, and they don’t all require the author to spend an inordinate amount of time documenting every feature for a project nobody may ever see. The onus of deciding how to put together your project is still on you, not the authors of every piece of open-source code that might be relevant to your project.
You think people publishing code are doing it for you, and want to chastise them for not bringing the quality to your standard?
I think it’s much more sensible to simply assume that they’re not doing it for you, but for some other reason. That is to say that I don’t think people post these projects with the intent that other people will use them, but for some other reason. Some do it because it is a convenient backup, and others because it’s the minimum needed to get a contribution, but I don’t know anyone who does it for other people.
And yet: I do want projects to have better documentation. I think a documentation standard is more valuable than a coding standard: We are writing for humans; the computer does what we say, but we write to express what we mean so that human beings can fix our mistakes in stating it correctly. To this end, I recommend that programmers write documentation first, and then implement the documentation, and I consider a programmer who cannot document his software to not be a very good programmer. I just don’t think this blog posting is how we get there.
Overusing Dijkstra-isms “Considered Harmful.”
Seriously. This could be a fantastic article but I just can’t be bothered to open anything with ‘considered harmful’ in the title anymore.
It isn’t, so your screen worked gloriously.
It is already a written essay. It hasn’t dissuaded people from using the catchy phrasal template in their titles: http://meyerweb.com/eric/comment/chech.html
The considered harmful part was partially a joke - obviously I don’t think every OSS project is potentially harmful. If you give it a chance, I’d love your opinion.
I’ve used it lately, although in an article that is meant to be a direct follow-up to Dijkstra’s one.
“‘Considered Harmful’ is Considered Harmful”
I can see the medium.com post already.
It’s been written already, long before Medium existed. That’s the link others have been offering. :)
“Considered Harmful” Essays Considered Harmful
It really is not that hard, instead of just searching for all available projects, to talk to some peers and see what mature options are out there, or do the ‘research’ yourself. If it is a good product and there is no documentation, and that’s a problem for you, write some. Don’t just pull some repository with version 0.0.3 and use it like a blithering idiot. It’s your product and your product alone will fail, so it’s your responsibility to make sure the code you pulled in is sane, not that of some Joe Schmo who puts 3 hours a year into this side project.
I think the problem is with transitive dependencies, not direct dependencies. Sure, you can make sure all the libraries you use are high quality and well documented, but who’s to say that those libraries' maintainers did the same thing? It may just mean that you need to consider transitive dependency quality in the vetting process, although that would be a substantial amount of work in most cases.
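To make the vetting cost concrete, here is a minimal sketch of listing everything a project pulls in indirectly. The dependency graph and every package name in it are made up for illustration; a real project would read this from its package manager's lock file or metadata.

```python
from collections import deque


def transitive_dependencies(graph, root):
    """Return every package reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen or pkg == root:
            continue
        seen.add(pkg)
        # Follow this package's own dependencies as well.
        queue.extend(graph.get(pkg, []))
    return seen


# Hypothetical dependency graph; all names are invented for this example.
DEPENDENCY_GRAPH = {
    "my-app": ["web-framework", "template-engine"],
    "web-framework": ["http-parser", "tiny-utils"],
    "template-engine": ["tiny-utils", "string-pad"],
    "http-parser": [],
    "tiny-utils": ["string-pad"],
    "string-pad": [],
}

direct = set(DEPENDENCY_GRAPH["my-app"])
everything = transitive_dependencies(DEPENDENCY_GRAPH, "my-app")
# Packages you never chose yourself but still ship.
indirect = sorted(everything - direct)
print(indirect)
```

Even in this toy graph, two direct dependencies turn into five packages to vet, which is the extra work being described.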
If your project is going to be put into production, where uptime and security matter, you had better be sure that the transitive dependencies are not a weak link. Either that or the main dependency is so popular that the issue will be resolved quickly, I suppose (lookin at you React), but that is a bit like driving a Volvo without a seatbelt. Storing your dependencies with your code (or somewhere), and testing before merging new dependencies, might not be a bad idea. I think it’s a pretty strong given that in any platform, package manager, etc, catastrophic failure is possible, and we should be at least a little prepared for it, especially when it’s inexpensive to do so.
My point is, though, why are you putting it on the consumer to figure out if your project is worth using or not? It’s also not always clear, from the consumer standpoint, that a project isn’t worth using until they’ve already invested some time in it. It’s usually not as clear cut as zero documentation - often there is very limited or poor documentation (I’ve often cited a static site generator that I used where, once you got past the very basic getting started, you were stuck just reading the code to figure out how to use it). It’s clear looking through the issues that a lot of developers attempt to use these projects - so, what, as an open source developer, do I gain by putting a project out there that, in many cases, wastes developers’ time or frustrates them mostly due to a lack of documentation?
It’s always on the user to decide whether or not the project is worth using. The developer obviously cannot jump into the user’s brain and make the decision for them Inception-style. But there are generally some useful heuristics that you can use. How long has this project been around? Is it actively maintained? Does it have a relatively large userbase? Is it backed by a reputable tech company, etc? None of these questions is difficult to find answers to.
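As a rough illustration, those heuristics can be written down as a checklist. This is only a sketch: the `ProjectInfo` fields and the thresholds (one year, six months, two contributors) are assumptions picked for the example, not established cutoffs, and the sample project is invented.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ProjectInfo:
    """Facts you can usually read off a repository page (hypothetical fields)."""
    first_release: date
    last_commit: date
    contributors: int
    has_docs: bool


def vetting_concerns(info, today):
    """Return a list of warning signs; thresholds are arbitrary assumptions."""
    concerns = []
    if (today - info.first_release).days < 365:
        concerns.append("less than a year old")
    if (today - info.last_commit).days > 180:
        concerns.append("no commits in six months")
    if info.contributors < 2:
        concerns.append("single contributor")
    if not info.has_docs:
        concerns.append("no documentation")
    return concerns


# Invented example: an abandoned one-person side project.
side_project = ProjectInfo(
    first_release=date(2015, 1, 10),
    last_commit=date(2015, 2, 1),
    contributors=1,
    has_docs=False,
)
concerns = vetting_concerns(side_project, today=date(2016, 3, 1))
print(concerns)
```

An empty list does not make a project safe to use; it only means none of these particular warning signs showed up.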
If you put code out there without documentation and somebody decides to use it, you didn’t waste their time. They wasted their own time by making a poor decision.
I don’t agree with this, and I think the author is assuming too much about how people use GitHub. Not everybody puts their code on GitHub (or sourceforge, bitbucket, etc.) with the goal of creating a big “successful” open source project.
I publish essentially all of my code to GitHub. The code is free to use, but it’s on GitHub mostly for my own convenience and it doesn’t matter to me if nobody ever looks at it, and I don’t care if it’s helpful or useful to anybody else besides me. If I think a project could be useful to other people I’ll fill out the README and publish it somewhere (add it to QuickLisp, for example), but a lot of times I don’t bother. I’m not obligated to put code or documentation on GitHub, and nobody is entitled to have me do it.
If a project is difficult to use, isn’t documented, and you don’t have time to figure it out, simply don’t use it.
So much this. I make stuff publicly available on GitHub because it provides me with a free backup and convenient syncing between computers. (I’d use a private repo but I’m too cheap.) I try to put at least a README in each repo though.
My argument is that you can go ahead and continue to do this, however, make it clear that this project is not intended to be maintained, supported or documented - so, use at your own risk. It’s a simple thing to clarify, so that if I, as an end user, come across it, I know this perhaps isn’t worth pursuing.
I’ve seen tons of projects that look like they might be real and intended for public use (some actually clearly are) but the docs are not there yet. In some cases, a developer comes across it and is like “this is exactly what I need!” It’s only after time spent trying to resolve issues that they come to realize that the developer never really intended this to be documented or maintained - i.e. it wasn’t really for public consumption.
IMO source code is not intended for end users and is always “use at your own risk,” so I don’t feel like that’s really necessary.
An application’s code is not generally for end users, and if you want a full-fledged supported, documented solution to your problem, you should look at the project website or mailing list or wiki, not the source code.
For libraries, it’s better to use Pip, NPM, QuickLisp, CPAN, etc. or even your operating system’s package manager (or homebrew) to get a “*-dev” package. If the project owner hasn’t taken steps to get the project out to the public, they’re probably not interested in getting people to use it.
FWIW I commented about disagreeing with the author’s assertions around personal projects and he has updated his wording to be more clear about that point in particular.
This scores points for the author with me. Being able to thoughtfully respond and change your opinion shows a willingness to learn and be open minded about things, which I find to be a big plus.
Thanks. I appreciate that.
Like Dave Grohl said…
Not everyone is a documentation writer, artist, public relations guru, or website wizard. In most open source projects these areas are lacking. That’s fine. Sure, there could have been READMEs and support for the end user, but I find it hard to believe that these projects are doing more harm than good by being open source.
18 million developers. Even if less than 60% of the total repositories are public, we’d still have a public repository for every professional or non-professional developer in the world. In my opinion, this indicates that we’ve clearly overshot the target on code sharing.
I don’t have the source or an easy way to update most of the devices or websites that I use, so no, humanity/corporations have not done enough. Programmers' hobby projects, open source companies and even open source projects from traditionally closed source Microsoft/Google/Facebook have helped, but until the majority of code is open source I think we still have a problem. I’d still rather have all code such as Windows, Facebook server code, my car’s ECU open sourced without a readme or documentation, than not at all.