1. 39

I’m interested in learning more about the Linux kernel, and generally how operating systems work. I’m aware of a couple of books already:

  • Linux Kernel Development, Robert Love
  • The Linux Programming Interface, Michael Kerrisk
  • A Heavily Commented Linux Kernel Source Code, Zhao Jiong (posted here a few days ago)

I’ve also read here and there that writing a toy kernel module is a good place to start. Some people will also recommend just diving into the code and trying to understand it. That seems a bit daunting given the size of the codebase. I suppose I could start with some system call and tug on that thread for a while if that’s the route I end up taking.

So, if you wanted to learn the ins and outs of the Linux kernel today, how would you go about it?

  2. 27

    It’s not about Linux per se, but it does relate to how operating systems work and a similar kernel: The Design and Implementation of the 4.4BSD Operating System. I bought it on the recommendation of John Carmack and the depth it goes into is great. Every chapter also has little quizzes without answers, so you can confirm to yourself you know how the described system components should work.

    1. 3

      Great book. Multiple pluses on this one.

      1. 2

        Thanks! I’ll give that book a look.

        1. 1

          Thanks!

        2. 17

          some modern linux exercises if you haven’t narrowed down your own yet, and want to just jump into kernel hacking:

          • start with a small loadable kernel module that logs a message with pr_info that you can read from dmesg
          • have your kernel module create a procfs entry that can be modified from userspace to get information into the module
          • modify kernel/sched/core.c (formerly sched.c) to read a pid from procfs and bias the scheduling algorithm so that process is never picked to run. test by starting a userspace process that prints in a loop, writing its pid to procfs, and seeing whether it stops
          • introduce a subtle bug in the scheduler (if you were fortunate enough not to have done so in the last step) and use qemu + gdb to peer into it at runtime
          • implement a new syscall, call it from userspace
          • intentionally introduce a memory corruption bug and use kasan to catch it
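          a minimal sketch of the userspace half of the syscall exercise, assuming Linux with glibc. a new syscall would be invoked the same way, through syscall(2) with its number; since that number only exists once you have patched your own kernel, the sketch calls existing syscalls (SYS_getpid, SYS_getppid) instead. the helper names raw_getpid and raw_syscall0 are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <sys/syscall.h> /* SYS_* syscall numbers */
#include <unistd.h>      /* syscall(), getpid(), getppid() */

/* Ask the kernel for our pid through the generic syscall(2) entry
 * point rather than the getpid() libc wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Invoke any syscall that takes no arguments, by number.
 * Returns the syscall's result, or -1 with errno set on failure. */
long raw_syscall0(long nr)
{
    return syscall(nr);
}
```

          calling raw_getpid() from main and comparing it with getpid() should give the same value. once your own syscall is in the table, pass its number to raw_syscall0; a number the kernel does not know normally fails with -1 and errno set to ENOSYS.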

          kernel dev is more of a human political problem than a technical one, and every kernel subsystem has its own way of doing things, but there’s a lot of overlap. whatever subproject you want to join or create, learn enough about its human operations so that you can be respectful. this will involve reading a bit of process documents, learning new tooling, and still probably leaving you with a sense of uncertainty about where to jump in. some projects are more newcomer friendly than others.

          it’s easy to do cool hard stuff by yourself, but with kernels it’s much more of a coordination problem. respect people by doing your homework before spending their time; then you can ask them for advice with confidence. if a subproject doesn’t make its process easy to find, you can also ask people HOW to do your homework, which saves more of their time later. you’ll find someone who is willing to point you in the right direction.

          1. 1

            Those projects seem like a nice entry point. Thanks for the advice regarding the soft-skills too.

          2. 9

            As @WilhelmVonWeiner mentioned, I would invest in a copy of The Design and Implementation of the 4.4BSD Operating System or The Design and Implementation of the FreeBSD Operating System.

            And I’d suggest starting by writing your own simple operating system kernel. Personally, I moved from there to studying Minix (this was many years ago). Minix was (and I hear still is) great for learning from. It differs from Linux in that it has a microkernel architecture. That said, you will learn a lot from it because it’s easy to read through and understand. Between writing your own and Minix, you’ll be off to a good start.

            I still find the source for the various BSDs easier to follow than Linux and would suggest making that your next big move; graduate to a BSD if you will and from there, you could leap to Linux.

            One other thing to consider: picking a less-used kernel to start hacking on when you feel comfortable might be a good idea if you can find folks in that community to mentor you. In the end, the code base matters less than having people who are grateful for your assistance and want to help you learn. In my experience, smaller communities are more likely to be ones you can find mentors in. That said, your mileage may vary greatly.

            1. 1

              Thanks! Any idea why the source for BSDs is easier to follow?

              1. 2

                Not a BSD, but Minix 2 was written with readability as nearly the only goal

                1. 1

                  I could take guesses based on number of people who are committers and the development process as to why that is the case, but in the end, it would be speculation. I know I’m not alone in this feeling, but I don’t know if I’m in the majority or minority.

                2. 1

                  Any pros and cons of starting with the 4.4 BSD Operating System vs FreeBSD Operating System book?

                  1. 1

                    I haven’t read the 2nd edition of the book so I can’t comment on that. Sorry.

                3. 5

                  There’s also this pretty thorough description of many of the kernel elements.

                  https://github.com/0xAX/linux-insides

                  1. 1

                    That looks like an excellent resource, thanks!

                  2. 4

                    I’m not a kernel dev, but I wrote a toy Linux kernel module and it definitely gave me a better understanding of how the kernel works. So if you haven’t done that I would recommend it!

                    For Unix in general, the code for xv6 is also easily buildable and runnable. I ran it in an emulator and did a little hacking on it:

                    https://pdos.csail.mit.edu/6.828/2012/xv6.html

                    I would try porting Brainfuck or tinypy to it inside the emulator and running Mandelbrot :) That is, it has a tiny shell and tiny user space. You should be able to compile a new program and boot the system with it.

                    1. 3

                      For some background, I would also suggest The Design of the UNIX Operating System (1986), by Maurice Bach.

                      1. 3

                        lovely book, I’d recommend starting with this and “Unix Internals: The New Frontiers” by Uresh Vahalia.

                        1. 1

                          That’s a wonderful book, that I wish had been updated.

                      2. 3

                        I am not a Linux kernel hacker, but here is the route I would take, in addition to all the other great advice in this thread.

                        Lastly, get good at statistics, charting and benchmarking. Lots of changes are very subtle, and doing solid engineering, testing, and confirming the result is going to take the vast majority of the time spent. The nature of the domain isn’t one of hammering out 1000 lines in a weekend. And even if it were, it would still require weeks of analysis to confirm the result.
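                        To make the benchmarking point concrete, here is a minimal sketch in C, assuming a POSIX system with clock_gettime. It times repeated calls of a function and keeps a running mean and sample variance with Welford's algorithm, so a single lucky run is not mistaken for a result. The names stats_add, stats_variance, time_runs and busy_loop are hypothetical, and busy_loop is only a stand-in workload.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* exposes clock_gettime() */
#endif
#include <stddef.h>
#include <time.h>

/* Running statistics via Welford's algorithm: numerically stable and
 * needs no stored samples. Zero-initialize before use. */
struct stats {
    size_t n;    /* number of samples seen */
    double mean; /* running mean */
    double m2;   /* running sum of squared deviations from the mean */
};

/* Fold one sample into the running mean and squared deviations. */
void stats_add(struct stats *s, double x)
{
    double delta = x - s->mean;

    s->n++;
    s->mean += delta / (double)s->n;
    s->m2 += delta * (x - s->mean);
}

/* Sample variance (n - 1 denominator); 0 with fewer than two samples. */
double stats_variance(const struct stats *s)
{
    return s->n > 1 ? s->m2 / (double)(s->n - 1) : 0.0;
}

/* Nanoseconds elapsed between two timestamps. */
static double elapsed_ns(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) * 1e9 +
           (double)(b.tv_nsec - a.tv_nsec);
}

/* Time `runs` calls of fn, adding each call's duration (ns) to *out. */
void time_runs(void (*fn)(void), size_t runs, struct stats *out)
{
    for (size_t i = 0; i < runs; i++) {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        fn();
        clock_gettime(CLOCK_MONOTONIC, &end);
        stats_add(out, elapsed_ns(start, end));
    }
}

/* Hypothetical workload, only here so there is something to time. */
void busy_loop(void)
{
    volatile unsigned long sink = 0;

    for (unsigned long i = 0; i < 100000; i++)
        sink += i;
}
```

                        A typical use would be to zero-initialize a struct stats, call time_runs(busy_loop, 1000, &s), and then compare the mean and variance before and after a change rather than comparing two single runs.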

                        1. 2

                          Thanks for the tips! It will definitely be a different mindset from my normal development mode.

                        2. 3

                          I maintain a book on kernel modules, which can be a good place to start. The examples should all compile with the latest kernel. https://code.freedombone.net/bashrc/LKMPG

                          1. 1

                            Thanks!

                          2. 2

                            May want to start with a device driver to get your feet wet. Linux Device Drivers 3rd Edition is the gold standard for that.

                            HTML version, easy to navigate: http://www.makelinux.net/ldd3/ PDF: https://bootlin.com/doc/books/ldd3.pdf

                            1. 1

                              I used LDD3 a few years ago when going through the eudyptula challenge, and found that much of the advice/examples were outdated for >2.6 kernels.

                            2. 2

                              The Linux kernel is massive, and learning how it all works is a pretty big undertaking. I learned far more by trying to add to the functionality than by reading a book.

                              If you want to learn how operating systems work, I’d recommend starting with something super simple, like a unikernel, RTOS, or the like. For example, there’s FreeRTOS. You can buy a compatible development board and a JTAG (hardware debugger) and step through the boot process to initialize hardware and watch it do things like interrupt handling and context switching. FreeRTOS typically runs on hardware with only a memory protection unit (i.e. no virtual memory), which is much simpler. Once you understand that, I would move on to Linux or something else full-featured.

                              As for Linux proper, is there something in particular you want to learn about? There are device drivers, architecture specific code, the network stack, the block layer, the scheduler, etc…

                              I took a class on device drivers and the book for that was very good: https://www.oreilly.com/openbook/linuxdrive3/book/ . It’s ancient but the concepts are all still generally applicable. I think it was easier to start trying to add a device driver than anything else.

                              The kernel has a built-in gdb stub called KGDB that typically works over a serial port, and it also has a simplified front-end called KDB. Stepping through code is sometimes easier than just trying to read it. Some hardware-specific IDEs are also capable of stepping through kernel source. TI’s Code Composer Studio is the only one that I know about, and the hobbyist SBC of choice for them is the BeagleBoard: https://beagleboard.org/static/beaglebone/a3/Docs/ccs-jtag-simple.htm

                              If at some point you just want to jump in and hack, check out https://kernelnewbies.org/KernelJanitors though I never did it myself.

                              There’s also an IRC room for kernelnewbies https://kernelnewbies.org/IRC

                              1. 2

                                Try these challenges: http://eudyptula-challenge.org

                                Nvm, they are closed. Maybe there’s an alternative?

                                1. 2

                                  I’ve heard of that before. It’s a shame that the challenges aren’t open for everyone to view.

                                2. 2

                                  I went through the path of going over bugs and just taking a stab at them. I also had mentors that didn’t mind answering my questions.

                                  Hit a kernel panic? There’s a panic call generating that output, so you can find the relevant code. Go look for it, then go look around.

                                  For one of my favorite bugs, I went over the boot process from the start symbol, seeing where it goes and figuring out what it is doing. I strongly recommend trying that out one time, but I don’t know where it goes for Linux :-)

                                  1. 3

                                    Found you something.