1. 29

This is a very broad question, but I would like to narrow it down. How did you begin contributing to open source, and how did you become comfortable contributing to large and widely used projects? I would love to hear your experiences and tips.

I program in Python and really want to contribute to open source projects, but I always get overwhelmed when looking at large codebases. How do you deal with that, and how do you familiarize yourself with a codebase?

I have created tiny open source projects but have never tried contributing to an actual open source project.

I would love to hear thoughts and tips on how to overcome the fear, start contributing to open source projects, and even become a maintainer.

  2. 14

    My first two contributions to programs anyone has heard of were to the linux kernel, for which I implemented two things. The second of them runs to this day and processes outbound TCP/UDP packets on countless machines, and I’m very proud of how it’s largely unchanged after decades. The first got linux banned from some networks by causing ARP storms.

    There’s a lesson to be learned from this. Two lessons maybe, because Linus never said a harsh word after the first.

    1. 2

      If I understand his logic correctly, he would have harsh words for maintainers who approved your change, but not for you.

      1. 12

        Uhm, maintainers… this happened back when the linux world was a small clique and Linus helped people install linux. He helped me install linux, in fact ;)

        As long as I followed kernel development, he didn’t swear at anyone for any bug. I’d have gotten an earful if I had defended my code and suggested that other OSes involved in the ARP fiasco were to blame and ought to change.

        1. 1

          [Linus] helped me install linux, in fact

          Which year was this? How far from 1991? :-)

          1. 2

            ≤2 years after that July 1991 posting. I installed in July 1992 and deleted my MSDOS partition early in August. Can’t say precisely when I wrote the ARP code, since I had a rather unfortunate file system event a little later.

            1. 1

              “Only wimps use tape backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)”

              1. 1

                Only fools ask root for a favour, if that favour involves doing file system operations manually (instead of waiting for the chore to be done later by a crontab that would remember to enable tape backups).

    2. 11

      Jon Gjengset (Rust contributor & streamer) touches on this in the Humans of Open Source podcast (link to episode, it’s at 17:16).

      Sean Chen: I discount just how much time it takes to digest the entire codebase. How long would you say it takes you to digest a large codebase?

      Jon Gjengset: It’s hard for me to answer because that’s just not how I read code. It’s very rare that I just open the project and read the codebase. I’ve heard this recommended in some places and it’s just weird to me. You’re likely to just get impatient reading it.

      More commonly, there’s a particular thing that I need to know how it works. For example, let’s imagine that you need to sort an array and you’re going to use the standard library sort. You want to know what happens if two elements are equal, whether their position gets swapped or not. Let’s say for whatever reason, that matters to your application and the documentation doesn’t say. My first inclination is to click the ‘Source’ button and read what the function does. Usually, that’s how I consume a codebase, bit-by-bit, from the things that I’m using. After awhile, you get a sense for how things fit together. It’s very rare that I read the whole thing.
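
      As a small illustration of answering that kind of question by experiment and by reading source, here is a minimal Python sketch. The records are made up for the example, and read_source is a hypothetical helper, not part of any library. Python documents its sort as stable, so the output only confirms what the documentation already states.

```python
import inspect
import textwrap
from operator import itemgetter

# Hypothetical records: (name, priority). Several share a priority.
records = [("alice", 2), ("bob", 1), ("carol", 2), ("dave", 1)]

# Sort on priority only, so ties are decided by the sort itself.
by_priority = sorted(records, key=itemgetter(1))

# In a stable sort, equal elements keep their original relative order.
names_with_priority_2 = [name for name, prio in by_priority if prio == 2]
print(names_with_priority_2)


def read_source(obj):
    """Return the source of a pure-Python object, or None if unavailable."""
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        # C-implemented builtins such as sorted() have no Python source.
        return None


# sorted() is implemented in C, so there is nothing to show here.
print(read_source(sorted))

# A pure-Python standard library function can usually be read directly.
source = read_source(textwrap.dedent)
if source is not None:
    print(source.splitlines()[0])
```

      When the helper returns None, the code lives outside Python, and the project's repository is the place to keep reading.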

      If you just read a whole codebase, you get the wrong impression that some really smart person sat down and just wrote this thing. That’s just never how these things are developed. If you look at some of the really core, tricky libraries like syn, or once_cell, or rayon, their codebases are quite intricate. But they didn’t start out that way. They started with some naive implementation and then someone found a problem and then they found some neat solution. It’s rare that one person just put their thoughts into the computer and it worked.

      My way of thinking of open source is: All these codebases are continuously improving by small amounts, but the only way they improve is if someone reports something and then tries to improve it. You can be a part of that. If you’re contributing to open source for the first time, all you have to do is contribute a delta. You’re not contributing a whole thing.

      I would add: Seek out communities that value small contributions and have a holistic view of what constitutes a ‘contribution.’

      1. 10

        The earliest I can find is devel/p5-Config-ApacheFormat from April 2006. While I had worked on some of my own projects before that, I can’t really recall (or find) any contributions to other projects before that.

        I wanted to use this for some Perl program I was writing; I can’t remember what it was but based on a message to the Perl mailing list I think it was something to organize my MP3 files or some such. My stint with Perl was short; I discovered Python not long afterward, found it much more agreeable, and rewrote all the Perl things I had in Python.

        Outside of some FreeBSD ports, I think the first patch I sent was a trivial PR to GNU GNATS which changed “function foo()” to “foo()” for POSIX compatibility. Looks like that didn’t actually make it into a release until Feb 2015, heh.


        So anyway, to actually answer your question: my contributions always have been (and to a large extent still are) to solve some problem I’m having; either a bug or error I encountered, or some feature I’d like to add.

        A few days ago I worked a bit on a small PostgreSQL patch; if you use auto_explain it will show you the query (select foo from bar where x=$1) but there’s no way to know what the parameter value of $1 is. I never worked with PostgreSQL and I’m a pretty inexperienced C programmer. I usually use grep (or ripgrep, or anything similar) to find the part of the code I want to work on (in this case, auto_explain) which usually works quite well. Searching for a fixed string like a setting name or message often works well.
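
        The fixed-string search described above can be sketched in Python roughly as follows. This is only an approximation of what grep or ripgrep do; find_fixed_string is a hypothetical helper name, and the default file suffixes are an assumption for a C codebase.

```python
from pathlib import Path


def find_fixed_string(root, needle, suffixes=(".c", ".h")):
    """Yield (path, line_number, line) for each line containing needle."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # errors="replace" keeps the scan going on files with odd encodings.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path, line_number, line.rstrip("\n")
```

        Pointing it at a source checkout with a setting name or an error message as the needle lists each matching file and line number, which is usually enough to find a starting point.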

        After that, it’s just a question of mucking about with it really, which can be rather time-consuming. Setting up proper tooling can help a lot; I spent a few minutes setting up clangd for proper completion and such, and it makes everything a lot easier. I use “printf-debugging” a lot too; you can use a debugger too, but I never really liked using them myself.

        I don’t think there are any shortcuts here: if you want to work on a large complicated project you will need to invest some amount of time. Developer documentation can be useful, as can people helping you out on IRC/Slack/mailing lists, but there’s no real substitute for actually spending time with the code.

        how you can overcome the fear

        This is probably common and normal; I remember when I started working on some FreeBSD ports and it was properly scary to submit a PR. I procrastinated sending them and went over them several times to make sure I actually got it right. Even many years later when I started submitting some stuff from my website to HN, Lobsters, etc. it was a rather nervous experience at first.

        It’s just a matter of doing it, and over time your confidence will grow and you realize there isn’t that much to be afraid of. I don’t think there are any “shortcuts”. I’m not even sure this is a bad thing either. Fear or anxiety isn’t necessarily bad, and it can be a good motivator to go the extra mile and do the best work you can.

        Overall I have very few negative experiences submitting patches. Actually, I can’t remember a single one – even if a patch is rejected people are usually reasonably nice about it. I’m sure negative experiences happen; but most of the worst comes out in long-running discussions and the like and not when someone just submits a patch. The chances of being told you’re an idiot are not zero, but very slim.

        Things like Linus’ outbursts are (in)famous, but remember he doesn’t just rant to random “hey, I’m new to contributing to Linux and whadayathink about this patch?” messages but to people who, in his opinion, really ought to know better. Something to keep in mind, because I think some people might get a bit of a wrong image of what it’s like to contribute as a new user based on this.

        The most negative experience I’ve had thus far is someone on Reddit being a cunt, usually when showcasing a project or posting some article. Someone on Reddit being a cunt! Hold the presses! It’s not a great experience but also not that bad of a thing to happen either. It’s different if someone you respect tells you off, but the amount of respect I have for random Reddit people is not especially high.

        1. 2

          The chances of being told you’re an idiot are not zero, but very slim.

          That really depends on the project’s maintainers. E.g. I’ve been told that “I have no idea what I’m doing” when contributing a small fix to ForgeGradle. I contacted other maintainers on Discord and the first reply was that the change is a no-op, so I briefly explained the debugging steps from the linked issue, and another maintainer said that he’d look into it. So far he hasn’t, and I still have no freaking idea what was wrong with my change (if anything was wrong at all).

          1. 3

            Yeah; that’s not exactly a stellar response, if you can even call it that :-/ In my experience anything involved with gaming has a (much) higher chance of toxicity for some reason.

            1. 2

              It looks like you ran into an asshole. I’m sorry you experienced that. Many project maintainers are a great deal less bad than that.

              1. 1

                I’m sorry to hear this.

              2. 1

                Just read your entire comment and I totally agree with you. I think that for a beginner in the open source community it can be overwhelming and scary at first, but we shouldn’t let that fear stop us from contributing to open source projects.

                Anyway, I’ll definitely start looking into open source projects from a contributor’s perspective and try to contribute.

              3. 7

                I patched a minor bug in sendmail, but hey, it was the early ’90s – everybody patched bugs in sendmail. People would patch them, and the sendmail devs would introduce new ones. It was a social thing.

                1. 4

                  A small feature addition in FreeBSD. I added the patch in 2006, and it was accepted in 2010…that’s still my only FreeBSD commit.

                  On familiarisation with the code base, FreeBSD is huge but cp is tiny. You can focus on a small part of interest, or a small part where a bug has been reported (which is what I did) without having to learn the whole project, if the project has been designed well.

                  1. 4

                    Loved a game, the game was open source, it had an easy how-to-contribute (at the time)! The game was: https://github.com/CleverRaven/Cataclysm-DDA

                    1. 3

                      First Contributions

                      It wasn’t the earliest. Can’t remember the earliest, but one of the earliest I can remember was contributing to pkgsrc through pkgsrc-wip. Not sure when that was, but when it was still on sourceforge. Perl modules, later other things, then Tor, which I maintained for a while on the official pkgsrc. But I haven’t done Perl, nor pkgsrc in over a decade.

                      I know the feeling of not daring to contribute, which is why pkgsrc-wip was nice. It was the place for not so stable stuff, pkgsrc has very nice linting and the only thing that was scary was the VCS (I think SVN).

                      There were also other forms of contributions, like CPAN Testers (running tests of different systems) and also the ParrotVM tests on obscure systems. Fun fact: That very test suite resulted in a bug being found on OpenBSD/hppa’s libc back then.

                      Fear

                      On the subject of fear: I think it is important to put things into perspective and realize that nothing bad will happen, even in the worst case. So that’s what I’ll be focusing on.

                      Most people are nice, even in communities that don’t have the image. Especially when there is a “first contribution” tag or similar declaration on it. Your first contribution not being the best isn’t the end of the world, but in many situations it will be great, because some care is taken. Even if it’s not accepted at first, you’ll most likely get constructive feedback and then it will be fine the next time.

                      Everyone likes people who at least try to contribute code more than people who just request features, write +1, fill up everyone’s inbox, etc., even if the code is horrible - which it won’t be.

                      Also people sometimes do horrible things even with decades of experience and in their field of expertise. So having one, or two or ten things that were accidents or not as intended will essentially mean that you have a great story, for an evening with friends.

                      Last but not least: there are people reviewing code. That’s a basic thing that happens in open source projects. Especially in very active projects, people will have seen first contributions before (and they started somewhere themselves). You are trying to contribute something, so it will be good. :)

                      Code bases

                      Regarding code bases and getting familiar with them. For really big projects, maybe don’t worry about knowing the ins and outs of the code base. Instead scratch an itch. Implement a feature, fix a bug, make some kind of change that you want to see in there. It’s way easier to approach it from this direction. It’s way clearer if you work out what to do and then look around, rather than trying to get a hold over everything without any goal. Just check out the style guide, if there is one, and any GitHub templates for pull requests or the like. They usually aren’t hidden, should they exist. Otherwise just work with what’s there.

                      1. 3

                        So, for me, it was kind of a process. I would often download source code and, if I found I absolutely needed customizations, I would just poke through the code until I found what I thought was the problem, or how to implement a simple WORKSFORME version of the feature. That really helped me build up confidence. Not to mention experience in spelunking in big codebases.

                        The next step was more out of necessity.

                        I’ve been the Fedora Maintainer for a python Terminal Emulator called Terminator for years, and when Fedora deprecated python2, I reached out to the maintainers of terminator and found them completely AWOL. There hadn’t been a release in years, and the maintainer was completely unresponsive. At this point I had a choice:

                        1. Let Terminator go
                        2. Maintain a giant python3 patch.
                        3. Fork the project.

                        Originally, I decided on option 2, because I hate the idea of forking the project. However, what ended up happening was that there ended up being 4 or 5 (at least) unofficial forks with similar code, but different bugs.

                        So, when the Debian Maintainer (Markus Frosch) decided to fork, I jumped on board, and spent the next year coding Terminator while looking after my family during the pandemic.

                        I recently found new work, so my contributions have been lower recently, but I still consider myself the head developer on the Terminator project.

                        If you’re looking to contribute to an interesting python Open Source project you can ping me on https://gitter.im/gnome-terminator/community or just check out the project https://github.com/gnome-terminator/terminator. It’s come a long way from its beginnings as a 300 line python script, but I think the codebase is manageable, if a little quirky for a python project.

                        1. 1

                          Hey I’ll definitely check out that project

                        2. 3

                          Hmm, I believe it was a patch to gAIM (now Pidgin) to improve its handling of IRC color codes.

                          Personally I don’t think that being “someone who wants to contribute to open source projects” is really much of a path to becoming someone who does contribute. There are basically two ways, in my opinion, that do work:

                          1. Be a user of open-source projects. Inevitably, you will run into bugs, or features you wish were there. So scratch your own itch. Try fixing a bug that annoys you, or adding that thing you really want. You might not succeed, but when you have a specific goal in mind, “getting familiar with the code” becomes a completely different process. You’re looking for the thing that does this, which connects to that, which is controlled by that… you keep pulling on the strings until the puppet dances the way you want it to, you learn bit by bit, and you keep going because you have a personal investment. If you get lost, you can ask questions — specific questions about how something works are way more likely to engage someone than “hi I’m new what can I do?”, especially if you manage to run into someone who is bothered by the same bug / missing feature you are, and is glad that someone else is actually giving it a whack.

                          2. Create something new. Do something that hasn’t been done before, or in a way that’s never been done before, or at least on a platform that wasn’t supported before. And put the source out there so that other people can join in and you don’t have to work on it alone. Yes, this is entirely hit and miss, but as a person who is interested in things (any kind of things), with the ability to write code, and the knowledge that open-source exists, opportunities will come your way. Grab them. Most of them will probably languish in solitude, but that’s alright, you still learn something. And every now and then, one might take off.

                          As for being afraid… really, there’s no cause for it. Everyone you interact with is just people. Even the “big stars” are just people. Usually they’re helpful. Sometimes they’re not. Sometimes they’re in a bad mood and say something ill-advised and pissy, but you know what? It’s just pixels on the screen, just electrons from a stranger. You filter through it, you learn from it what you can, and you go on with your life.

                          1. 3

                            Almost certainly to one or another MSN Messenger chat bot whose source code has probably, thankfully, long vanished from the internet. 🙈

                            1. 3

                              I added pure-css comment folding here at lobsters 6 years ago. I didn’t have an account at the time and the folding used js, which was not enabled for non-logged users. So I went looking into the code even though I didn’t know any ruby and submitted the small patch.

                              1. 3

                                I would say don’t stress out.

                                Write code as usual. Whether that is for work or on a personal project doesn’t matter.

                                I’m pretty sure you’ll use some open-source libraries or frameworks for that. And I’m pretty sure that you’ll find some bug or something you don’t like in the way it is. Maybe you will need a different functionality or perhaps you’ll disagree on an API definition.

                                Anyway, at that point fork and make the changes you wished for.

                                Then some people may not like it, some people may like it, and some other people won’t care at all.

                                But hey! Stay cool. Life sometimes is just like that and everybody is trying his best. You still got your fork and you still got the freedom to use your version.

                                Anyway-anyhow you’ll learn a lot along the way and you’ll find yourself as a better human after that.

                                One inspiring article on the topic for me is “I want to contribute to your project, how do I start?” by Drew DeVault, and I wrote a more verbose version of what I just wrote here.

                                Good luck.

                                1. 3

                                  First contribution

                                  So, the first project I’ve contributed to in earnest, as opposed to random Hacktoberfest commits, is Janet. I have also given a decent amount of feedback to a project called gotoB. I actually had someone try to contribute to and help out with PISC before I had contributed much to another open source project myself.

                                  Large Codebases

                                  Navigating large codebases is very much a learned skill, and what defines “large” is likely to grow for you over time. In general, statically typed languages will be easier to understand with larger codebases than dynamic languages, in the absence of heavy testing. Beyond that, you’ll need to build understanding of code in sections, and then get that understanding out of your head, in one way or another. In the past, I’ve written narrative/prose explanations of code as I read it, or have printed out code and then marked it up (though that is less practical when the code is over about 2000 lines or so, as you start to get into more than you could fit on a table).

                                  A big part of dealing with large codebases is learning how to sort out what is important to what you’re trying to figure out vs not. And part of that is just down to learning how to see the patterns at use, building a coding vocabulary, as it were. But, until then, look for an anchoring point. If it’s obvious, you could start from the entry point of the program. If there’s a particular system that does things that look easy to understand, drop in using printf or a debugger, and see what the stack traces look like at different points.
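
                                  One way to see those stack traces in Python without starting a debugger is the standard traceback module. A minimal sketch, where where_am_i, load_config, and parse_settings are hypothetical stand-ins for code in a larger program:

```python
import traceback


def where_am_i(label):
    """Print a label plus the chain of calls that led here."""
    # The last frame is where_am_i itself, so drop it.
    frames = traceback.extract_stack()[:-1]
    names = [frame.name for frame in frames]
    print(f"[{label}] " + " -> ".join(names))
    return names


# Hypothetical stand-ins for layers of a larger program.
def load_config():
    return parse_settings()


def parse_settings():
    return where_am_i("parse_settings reached")


call_chain = load_config()
```

                                  Dropping a call like this into a function you don't understand yet shows which parts of the program lead to it, which helps decide what to read next.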

                                  If there is a particularly large project you’re wanting to get into, you might try building a smaller version of that project, or finding a smaller version. This should help you get a lay of the land. For example, having written PISC has given me a much better handle on Janet. Not because I have a perfect knowledge of Janet, but because I know where to look to find certain sorts of things. For Rails, there’s a book called Rebuilding Rails. I don’t know if the equivalent exists for Django or not.

                                  Some developers have the time and skill to make their code resemble a good explanation of what’s going on, many don’t. So you’ll have to learn how to build an understanding of what is really going on in most codebases, open source or proprietary. This is where building debugging skills comes in super handy.

                                  Overcoming Fear

                                  As far as overcoming fear, I have some thoughts:

                                  Don’t be afraid to start small, and report bugs if you find them. Some of my notable contributions to Janet haven’t been the code I’ve contributed, but reporting behavior on Windows, or finding edge cases in the standard library. This is still valuable to a project, especially if you take the time to build a minimal, reproducible example.

                                  If there is a smaller project that you want to contribute to (in terms of amount of activity, not necessarily size of codebase), then you might even be able to email the creator. For larger projects, most will have an online community you can use to ask questions and even get people to help you find which code you need to modify to accomplish something. You’d be surprised how accessible some people are if you’ve got a direct interest in what they work on (and aren’t just trying to use them for clout or generalized help).

                                  If you don’t find a receptive community the first time you try to contribute, look for another project, if you’re looking for a community. If there’s a project you really want to use, but the creator isn’t responding to bugs or the like, then fork it, or look for a more active fork. Gogs vs Gitea is an interesting case here.

                                  1. 2

                                    My first patch was to rcirc, which is one of the IRC clients built into Emacs (yes, of course there is more than one; it’s Emacs). I was in a channel with someone called technoweenie and I had technomancy as my nick. I kept getting annoyed by people highlighting me when they meant to talk to him, because the most widely used IRC client in the channel at the time used “cycling” completion that would fill out the entire first completion even if what you typed was ambiguous, as opposed to completing like bash where it would fill in as much as it can unambiguously and then show you what the options are. I thought that was a pretty bad way to do completion, but my own client did it too, so I couldn’t complain about it too much until I fixed it.
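
                                    The difference between the two completion styles can be sketched in Python. This is an illustration only, not the rcirc patch (which was elisp); the function names and the third nick are made up.

```python
import os


def matching_nicks(typed, nicks):
    """Return the nicks that start with what was typed, in sorted order."""
    return sorted(nick for nick in nicks if nick.startswith(typed))


def cycling_completion(typed, nicks):
    """Fill in the entire first match, even when the input is ambiguous."""
    matches = matching_nicks(typed, nicks)
    return matches[0] if matches else typed


def unambiguous_completion(typed, nicks):
    """Extend the input only as far as all matches agree, like bash."""
    matches = matching_nicks(typed, nicks)
    if not matches:
        return typed, []
    # commonprefix compares character by character, which suits plain strings.
    return os.path.commonprefix(matches), matches


channel = ["technomancy", "technoweenie", "someone_else"]
print(cycling_completion("tech", channel))
print(unambiguous_completion("tech", channel))
```

                                    With the cycling style, typing “tech” picks one full nick right away; the unambiguous style stops at “techno” and returns both candidates so the user can keep typing.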

                                    This was my first adventure extending Emacs, and the way you could just rewrite built-in functionality and immediately evaluate it to see its behavior with no overhead felt like such a breath of fresh air that I was addicted.

                                    Unfortunately this was back in 2007 when Emacs still used CVS so I can’t find the patch, but since it was just a bunch of elisp I put it up on the Emacs Wiki so anyone could pull it in without upgrading to the latest version of Emacs: https://www.emacswiki.org/emacs/rcircUnambiguousNickCompletion

                                    1. 2

                                      This took me down memory lane. I think my first contribution was a port of the (at the time popular) Jupiter utility to replace the C# (mono) dependencies in favor of Python; it made the entire package slimmer and it also added integration with the Ubuntu appindicator. I had an Acer Aspire One at the time (with an Intel Atom processor), which I used with a 15” external monitor, and Jupiter offered the more reliable way of applying some settings and squeezing a bit more performance from the laptop. Good times.

                                      1. 2

                                        I added a feature to firefox that I wanted. Find a bug that is small in scope and hopefully has a mentor program available. Most well-designed open source stuff should have mostly compartmentalized code, so just keep digging into the subsystems until you find out what is responsible for what.

                                        1. 2

                                          My first contribution to open source was a script to convert a GNU GNATS bug database (just a flat file with tags, really :) into a Mozilla Bugzilla database. It was so long ago (~2000) that I can’t find it anymore :)

                                          There is a big problem that’s talked about some but not enough IMO, and that’s the fact that most already underway open source projects have evolved to such a degree of complexity that it can be difficult to understand enough of the code-base to make meaningful contributions.

                                          Some are very mindful of this and have collected a bevy of smaller bugs and changes, but that’s a lot of work for the maintainers so it’s comparatively rare.

                                          One thing I’d suggest, and I know this is EVERYWHERE but it’s true, is to scratch your own itch.

                                          Like, a few weeks back I converted from Remember The Milk to Todoist for my online personal task tracking. There’s no conversion tool that I could see and no direct export/import path for tasks, so I wrote a very simple one that worked well enough for my needs called forgetthemilk and put it up on my Github.

                                          It’s not changing the world but it’s a start and it’s there for someone else who has the same itch and maybe they can make it better? Who knows :)

                                          1. 2

                                            It would have to be the old FSharpx F# OSS project on Github.

                                            Since then the project has been broken up into smaller pieces. Nuget.org lists 16 different packages, although the project itself was broken into 3-4 (depending on how you count them) Github repos.

                                            At the time I was one of the top contributors, and I would still today be one of the top 3 for FSharpx.Collections, except the way the repos were split up all my history ended up in FSharpx.Extras, which I don’t even think I contributed to.

                                            In fairness to the individual who undertook splitting the repo, at the time I wouldn’t have preserved the Git history correctly either given my state of Git competence then. This was the project that took me from zero Git skills to some competence.

                                            1. 2

                                              Working with large code bases is hard, but it’s also one of those things where being paralyzed by fear will prevent you getting anywhere. Ignoring or suppressing fear will get something done, even if it’s buggy. Maintainers will often help you address bugs, particularly if you have a change and can describe the buggy behavior being introduced. Not surprisingly, the effort maintainers will give you is a function of the amount of effort they think you’re giving the project.

                                              My initial things in open source happened at a similar time and really spoke to the massively different approaches maintainers have:

                                              1. In the beginning, I patched VNC to alter its behavior of allowing a new incoming connection to take precedence over an existing one. The change was one line and convinced me that the ability to change source has huge value. It wasn’t really in a fit state to merge though, since it solved a specifically personal problem.
                                              2. I started working on SOCKS code which happens to be fairly discrete code that affects networking layers of products, so it was easy enough to find the right code and work with it. In one case, I contributed changes to support server side name resolution to Gaim (now Pidgin.) The maintainers took the patch, moved it to the stable tree, and released it in around a week.
                                              3. I contributed similar changes to Putty, I think for CHAP authentication. The maintainers gave a thorough review and good feedback, as well as feedback on other code I was writing at the time. It was a little demanding to work through, but it was always clear that the maintainers had good reasons for what they wanted.
                                              4. I wrote changes for X-Chat to do similar things. This was a GPL product. While the patch was outstanding, the author made the Windows version commercial, ignoring all contributions to the project from others. I was horrified, had an email conversation with RMS, and my patch was never merged.
                                              5. I wrote similar changes for Mozilla/Firefox. This codebase is enormous and difficult to work with. I took a patch written by somebody else and ported it to the current tree, trying to get it merged. The patch ended up in a review quagmire where it seemed to be going nowhere. Each review would ask for more things, but each time more things were done, that just created the opportunity to ask for more changes to those things, and so on. I spoke to the original author, who said he gave up out of frustration. Eventually it got merged, although the switch from Seamonkey to Firefox meant that the UI for the feature was effectively dropped and wouldn’t be revived for a decade. I don’t think the review process really helped address the actual problems in the change either, which revolved around how to surface errors in a proxy that lived in the network layer (telling the user whether the proxy is down or the remote host is down.) HTTP proxies don’t have this problem because they were implemented in a different layer. Mozilla still doesn’t have SOCKS authentication because doing UI and accessing password stores in the middle of a TCP connection is an architectural nightmare.
                                              6. On the less successful side, I sent out a patch to allow the VFAT Linux driver to apply different file permissions to different file extensions. It worked, but the feedback from the list was to make it all configurable via /proc, which seemed a bit overwhelming for a first contribution, and I walked away from it. The feedback wasn’t wrong, and if I had been more familiar with the code I’d have done it that way from the start, so it just proved to be a difficult project/problem to choose for a first patch.
                                              1. 2

                                                I remember which project it was to, but I cannot for the life of me remember what the contribution was! The project was Retroshare, which is very much still alive and kicking! It must have been around the time of the initial open source release in 2006-2007 and I believe it was surfacing some configuration option in the UI. I have not touched the project in ages and I’d be surprised if any of my commits survive, but I do enjoy that the project has persisted despite many others in that space (private networks, file sharing) dying out or never hitting enough momentum to survive.

                                                1. 2

                                                  Hello everyone, OP here. I have read everyone’s comments, and I really appreciate the advice and the experiences you have shared about when you first began contributing to open source.

                                                  1. 2

                                                    I think mine was either some tweaks to a shell script or fixing a link in the project README. If large codebases feel intimidating, it might be a good idea to look at a project that documents its source repository and design well. The dev docs that @matklad and team maintain for rust-analyzer are a fantastic example of that, and certainly help new contributors make sense of the codebase.

                                                    1. 2

                                                      I don’t know for sure, but it must’ve been edbrowse. The original version was in Perl; I taught myself Perl so I could hack on it. I didn’t even know how to make a patch. It was a big Perl script, so I’d email the author with a changed script attached. I learned to make patches in pretty short order.

                                                      A bit later, I taught myself x86 assembly when I was on a break from university, and I wrote this version of uuencode for Linux in x86 assembly, from the asmutils project. That was just for fun.

                                                      I’m also a mostly accidental contributor to the Linux kernel.

                                                      About the only piece of advice I can give is to find an itch to scratch. If someone else is doing related work and you find it useful, drop in on the community. Spend time lurking and learning, and eventually you’ll start seeing opportunities where you can contribute. Jumping in as a rando with patches is easier than you think.

                                                      As far as large codebases go, I’ll use the Linux kernel as an example, since that’s the one I’ve contributed to. It’s huge, but it’s also modular. There is a variety of subsystems and modules within the kernel being worked on independently. It’s quite possible for somebody to work off in their own little corner of the kernel, mostly not having to be concerned with the rest of it.

                                                      If you’re really invested in what you’re contributing, that can help you get past a lot of the fear. Which is why I say find the itch you want to scratch / find your niche.

                                                      1. 2

                                                        My first contribution was a mod I wrote for Factorio and my first commit was a translation update for a Minecraft mod.

                                                        1. 2

                                                          I’m not sure I remember… It was one of two compilers, weirdly … Lisp Flavoured Erlang, or Rust…

                                                          1. 2

                                                            My first contribution was a fix for an inconsistency between Pandas and Dask. It was just a few lines of code. I would not go looking for a place to contribute; usually the contribution finds you: something that annoys you, something that can be improved, or some new feature you want from a piece of software. Coming at it from this angle keeps you motivated, and you need less guidance because you already have some idea of what you want. I would also start with something really small, just to get used to the process (always add tests, stick to the code style, leave comments where needed, etc.). Learning these lessons on a larger change can be much more frustrating.

                                                            As someone who’s programming in Python and really wants to contribute to open source projects but always get overwhelmed by looking at the large codebases.

                                                            In good code bases, small changes should take small effort. You don’t need to look at the whole codebase. For example, for my first contribution I looked at maybe a few hundred lines of code at most. The rest was of no interest to me.

                                                            Another small piece of advice: especially when you are new, create an issue first and discuss why the change is needed. Don’t just present a pull request.

                                                            1. 2

                                                              My first contribution was to a Rails plugin called nested_form by Ryan Bates, the creator of Railscasts. I was learning about a lot of new technologies at the time including git, Ruby and Rails. Ryan quickly accepted the pull request which made my day!

                                                              1. 2

                                                                The project I worked on was open sourced by the company after some time. It was not the plan initially but then management changed.

                                                                1. 2

                                                                  I started out with bug reports, with detailed investigation. Then at some point I realized there were bugs I could fix myself. It can be that simple. I still ask in the GitHub issue before I write a fix if I doubt they’ll look at it.

                                                                  1. 2

                                                                    My first two contributions were a documentation fix to Emacs and the addition of global-goto-address-mode. I’m happy I did both, but I still do not feel comfortable contributing, and I also kind of can’t at the moment, due to difficulties sorting out the paperwork I need before I can commit anything more.

                                                                    Be curious and use IRC.

                                                                    1. 2

                                                                      Probably adding “<center>” support to Lynx in ~1995 (although it turns out I was beaten to it by Foteos who was very encouraging.) Then Persistent Cookie Jar support and Lynx Style Sheets in 1996.

                                                                      1. 2

                                                                        I’m lucky in that my first contributions to open source were through my job. In particular, my job is to port a large number of open source packages to a different platform. As a result I’m spending a lot of time learning new code bases so I can be effective in porting them. Projects differ a lot in the quality of their internally facing documentation. Some (gcc for instance) have almost none. That being said, there are some places to check:

                                                                        1. Check the git logs. Recently I worked on a port of GNU patch. In GNU patch they were keeping an LRU cache of calls to glibc’s openat. There was no explanation at all in the code for why an LRU cache was needed, but that narrative history did exist in the commit which originally created the LRU cache (notably not visible with git blame).
                                                                        2. Some projects (Perl for instance) have decent internal documentation which will actually tell a story to the developer about the code. Sometimes it is in an odd format. So check if that exists (inside or outside the code base).
                                                                        3. Since my job is mostly porting and not contributing, I haven’t had occasion to actually do this, but my understanding is that talking to maintainers on IRC and looking for a list of bugs set aside for new contributors are good ways to get started.
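
The "check the git logs" tip in item 1 can be sketched in Python, since that is the language the OP works in. This is only an illustration, not how the commenter actually found the commit: it wraps git's "pickaxe" search (`git log -S<string>`), which lists commits that changed the number of occurrences of a string, so it can surface the commit that introduced something even when `git blame` points at a later edit. The function names and the search term are hypothetical.

```python
import subprocess
from typing import List, Optional


def pickaxe_command(term: str, path: Optional[str] = None) -> List[str]:
    """Build a `git log -S` command that lists commits adding or removing `term`."""
    cmd = ["git", "log", "--oneline", f"-S{term}"]
    if path is not None:
        # Restrict the search to a single file or directory.
        cmd += ["--", path]
    return cmd


def commits_touching(term: str, repo: str = ".", path: Optional[str] = None) -> List[str]:
    """Run the pickaxe search inside `repo` and return one line per matching commit."""
    result = subprocess.run(
        pickaxe_command(term, path),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails (e.g. not a repository)
    )
    return result.stdout.splitlines()
```

Calling `commits_touching("lru_cache", repo="/path/to/checkout")` (both arguments made up here) returns abbreviated hashes with subject lines, oldest match last; `git show <hash>` on the last one then shows the full commit message where the reasoning may be written down.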
                                                                        1. 2

                                                                          My first attempt at a contribution was actually to Blender. However, it never made the cut. The Blender community was great, but I was just way too inexperienced at the time, so no hard feelings there. I learned a lot through the experience.

                                                                          My first successful contribution was to an open source Erlang testing project called etap.

                                                                          1. 2

                                                                            I fixed a Segfault in GDM because I bought a shiny new (for 2014) DisplayPort monitor that made my desktop environment crash every time I turned it on or off.

                                                                            https://gitlab.gnome.org/GNOME/gdm/-/commit/c229f90449277c475a15d78a697b7fa13884e69b

                                                                            To this day, it is my only contribution in C. It was an easy experience because it was a fix for an obvious bug, rather than something like a new feature where the implementation is up for debate!

                                                                            1. 2

                                                                              Perhaps not what you asked for specifically, but writing code is not the only way to contribute to OSS. Back when I was starting out, I remember spending lots of time passionately contributing to the very first Gentoo Wiki, and engaging with others on the mailing lists about various ideas I had.

                                                                              Contributing in some form (code, documentation, etc.), interacting with people from all over the world, and having fun throughout the process … these are the delights of open source.

                                                                              Also there is nothing wrong with continuing to create your own open source projects if that’s your cup of tea. Nothing else matters aside from consistently having fun; that’s my philosophy in life, at least (some people may disagree).

                                                                              1. 2

                                                                                I added support for different g-code line endings to a 2D CNC machining program to support an old milling machine that my dad had in his garage. That was a pretty rewarding first contribution :)

                                                                                1. 2

                                                                                  Sometime in 2007, I contributed a Makefile to the l7-filter project, a Layer 7 packet classification system used to tag IP packets for further action by Linux QoS mechanisms. It had a very basic one but it was hardcoded for MIPS architecture only and I needed to build an x86 version. I got lucky that I didn’t need to make any source changes (that I can recall 14 years later!).

                                                                                  I talked about this in a thread about Linux From Scratch previously. I built a LAN party router distro for my undergrad graduation project.

                                                                                  1. 2

                                                                                    Hell, it is too long ago to remember when it all started. However about your question how to start contributing:

                                                                                    Scratch your own itch.

                                                                                    This is how 90% of my contributions happened:

                                                                                    Etc.

                                                                                    I rarely go through the issue list hunting for bugs to fix. I either fix my own problems or fix problems reported by others when I run into them myself. It is similar with my own open source projects: I see a problem that I want to solve for myself, and then I look for an existing solution. If there is none, or the existing ones do not suit my needs, then I solve it on my own; alternatively, I just want a PoC. I also have some projects that I created purely for fun.

                                                                                    1. 1

                                                                                      Performing IRC support for Minecraft mod authors, followed by an automatic error diagnosis system in a custom Minecraft launcher. It would tell you exactly how you screwed up your mod installation.

                                                                                      1. 1

                                                                                        A long time ago, a kind person paid me for writing a BSD-licensed library. I believe this was my first open source contribution.

                                                                                        1. 1

                                                                                          Not actually sure I remember, it’s been a while. Off the top of my head, I’d guess it was helping translate the PHP manual to German (around 2002-04?), with contributions to some PEAR projects not long after, but that’s just a project I’ve stuck with for a long time; I may well have done a few drive-by commits to random small projects earlier. Writing detailed bug reports and repro cases is also a meaningful contribution, but I guess that’s not what your question is about.

                                                                                          Unfortunately I don’t have great advice, but if the project has even a modicum of tests, just triaging and fixing bugs is a good way to learn your way around the code base. Even if your fix will probably be only 2 lines, you still need to understand what a part of the code base does. That is not the only reason people don’t usually submit huge things as their first contribution, and that is completely fine. (He says, having started a new job and first of all having rewritten a third of a small app, but only after adding tests for a week…)

                                                                                          1. 1

                                                                                            My first contributions were to the Oil project. Heard about it through Lobsters, actually. Just pulled up the issues, intentionally picked one that looked easy, and started on it. Later, I took the opportunity to write about the process I went through so that other newcomers had a reference point. The codebase has changed dramatically since writing it, but here it is in case anyone is interested: https://github.com/oilshell/oil/wiki/OSH-Builtins:-%22Hello,-world!%22-Example