1. 31

What projects do you believe are essential to every programmer? What can we learn from each of them? Why is that important?

  1.  

  2. 10

    I think there will be a lot of great suggestions in this post. So I’ll just add one small practice project that taught me a lot when I was a moron (I am still a moron, sorry).

    Implement Ukkonen’s Suffix Tree Algorithm

    Primer: http://stackoverflow.com/questions/9452701/ukkonens-suffix-tree-algorithm-in-plain-english

    Notes:

    • I like that post, it links to the actual paper which might be tough to read, so the stack overflow post will explain a lot to get your feet into the pond.

    • Do NOT look at the source code, try to think about it on a white board first. You’re doing this for yourself.

    • Lots of other great Suffix Tree algorithms, I personally like Farach’s linear-time suffix tree construction algorithm.

    Reasons

    • Matching string sequences is one of the most common problems you’ll face as a programmer.

    • Condition your brain to think in the world of data structures.

    • You can time it to see how fast the computation takes, run benchmarks, much great learning! wow!

    • Lots of places where you can encounter pitfalls, if you decide to use C you will learn a great deal about memory management!

    • Its fun.

    1. 8

      Implement a simple UNIX shell. Basically, for each command you have to fork, exec, and then wait for the process to finish. You’ll learn a lot about UNIX systems programming and process management.

      1. 3

        This is a great way for people to learn about syscalls and unix in general. Once there’s a shell that can handle execution of single commands, it can be interesting to implement pipes and file redirection with dup2, or shell metacharacters like $! with the wait() syscall. Around this time it’s also a good chance to learn about zombies and orphans, and how double forking can be used to daemonize a process.

        1. 1

          s/$!/$? :)

        2. 2

          I would like to learn this! Come to think of it, I would love for a good tutorial/course in systems programming. I’ve looked around and haven’t had success in finding anything useful. If anyone has any links, please post! :)

          1. 3

            I found working through “Advanced Programming in the Unix Environment” (and doing the exercises) to be a good start; there’s also “The Linux Programming Interface”. It’ll give you a good foundation.

            Also, one of the things I like to do when learning a new language is to start writing the Unix utilities in them.

            1. 2

              Advanced Linux Programming is a good reference for the different system calls you will have to use.

              I wrote a shell as the first assignment in my Operating Systems course in university. The instructions for that particular assignment are still online on the course website.

              1. 1

                To piggyback: I think Operating Systems courses should be required in all universities teaching Computer Science, there is like a treasure chest of knowledge that is hard to grasp at first - but the rewards are so worth it.

              2. 1

                Computer Systems: A Programmer’s Perspective, which was the reference for my University’s intro to systems class is one of the best textbooks I’ve ever read; the labs that come with it (many of which can be found on the internet) are excellent as well.

            2. 6

              I’m not sure about every programmer, but implementing a toy distributed kv store can be a lot of fun and expose you to a large number of aspects of systems engineering. You should start by leveraging plenty of existing tools, but over time replacing them with pieces that you’re curious about the details of. Start with something like a few instances of memcache and create a central coordinator that maps keys to servers, then have each instance run its own coordinator that has peer discovery and leader election built in. If you want to throw a wrench into the mix, randomly kill servers and use netem or the statistic iptables plugin to induce network delay, jitter, corruption, and dropped packets.

              Eventually end up with a toy dynamo or hbase (maybe using several instances of djb’s cdb as the initial immutable storefile analogue). Then maybe you decide to add your own mvcc aware b+tree or built in machine learning or stream processing tools. You can go in so many rewarding directions.

              1. 1

                Ive made a small distributed kv store once, it was a great learning experience. Now youre making me want to go mess around with it more!

                1. 2

                  I’m actually in the middle of one right now :) I’m using this one as an excuse to learn Docker and Ansible on the ops side along with some distributed transaction techniques from Transactional Information Systems. This stuff is like crack!!!

              2. 5

                Implement a regular expression matcher. Here’s a good treatment of it: http://www.cs.princeton.edu/courses/archive/spr09/cos333/beautiful.html

                Write a Brainfuck interpreter. Then write a Brainfuck compiler and have it emit the language du jour. Here’s my try at a compiler in Python that emits C: http://justinpoliey.com/articles/2012-03-30-source-to-source-compilation.html

                1. 2

                  For anyone interested this is a chapter from a book called Beautiful Code which I really recommend reading. Its full of some great examples like this one and some deep insight.

                2. 5

                  Another good idea is to implement some reasonably advanced data structure in your language of choice. Some ideas are

                  1. 2

                    Wow Bloom Filter is actually a great call! Thanks for putting that on my radar again.

                  2. 4

                    This was an interesting challenge that I learned a lot from:

                    http://precog.com/blog-precog-2/entry/do-you-have-what-it-takes-to-be-a-precog-engineer

                    It basically has you implement a file-backed key-value store.

                    1. 1

                      Sounds interesting, but the link appears to be broken.

                        1. 1

                          I can’t seem to find it archived anywhere online, either ><

                        2. 1
                        3. 4
                          1. I’d make something that you can try to sell. You learn a lot from having to do the entire lifecycle of a product: Idea->Design->Implementation->Maintenance->Marketing->Support

                          2. ProjectEuler.net - I like to use these problems to try to learn new programming languages.

                          1. 4

                            I’d suggest participating in a LudumDate. You get 48 hours to make a game from scratch.

                            1. 2

                              A good place to test out new language/framework/engine you have not used before. After the jam you should know if you want to use the language/framework/engine in future.

                            2. 4

                              Learn Python The Hard Way, https://github.com/zedshaw/lpthw-study-projects has list of study projects for systems programming.

                              File System
                              • find: The file system is a dictionary not a tree.
                              • grep: Regex are not scary, scary regex are.
                              • cp: Copying files fast and slow.
                              • mv: A move is a rename in the filesystem dictionary.
                              • rm: A remove deletes from the dictionary.
                              • tail: A file is like a tape drive.
                              • diff: A delta is a glorious thing.
                              Daemons
                              • config: Don’t invent your own config file format.
                              • logging: You don’t need syslog.
                              • syslog: You should centralize logs with syslog.
                              • fork: Daemons are your friends.
                              • chroot: Daemons need cages.
                              • droppriv: Daemons can’t be trusted.
                              • signals: Daemons can be poked.
                              • pidfiles: Daemons can bet lost.
                              • monit: Daemons need a nanny.
                              • watcher: A Daemon that’s a tattle tail.
                              Network
                              • mail: Sending email is easy.
                              • sendmail: Sometimes email should dump to your screen.
                              • curl: Getting things from the web.
                              • webserver: Everyone can make a web framework!
                              • ssh: Logging into servers is best securely.
                              • scp: Copying to another machine (even slower).
                              • s3: Copying to the cloud.
                              • named pipe: Everything is a file (except when it’s not).
                              • sockets: A socket is a file (except when it’s not).
                              • gevent: Sockets do a lot of nothing most of the time.
                              Security
                              • iptables: Blocking ninja network haxors.
                              • blocker: Daemons can do the blocking.
                              • nmap: Scanning ports is easy and fun.
                              • honeypot: Scanning should not be easy.
                              System Info
                              • /proc: Daemons watching proc watching daemons.
                              • login: Watching logins and failures.
                              • /etc/passwd: Getting the dirt on users.
                              • stats: Good stats made easy with one file.
                              • nagios: Watching what is going on everywhere.
                              Extras
                              • cron: Time events are a queue.
                              • curses: Adding a fancy progress bar to scp.
                              1. 3

                                Making a game of some description to the point where you (or even better, someone else) can play it. There is a lot of cool algorithm work around games, especially efficiency and AI, and having to build something to a working level can teach you a lot about writing good code. It’s also quite a fun experience I’ve found.

                                1. 3

                                  Writing a simple (read: shitty) malloc/free implementation is a great way to learn about memory management/pointers in C

                                  1. 1

                                    It would be cool to replace the Lua allocator with the simple one and run the benchmarks game on it, comparing it to the built in allocator.

                                  2. 3

                                    Writing a basic compiler was really enlightening for me. This is a pretty awesome walkthrough.

                                    http://compilers.iecc.com/crenshaw/

                                    1. 3

                                      I started writing a MIPS assembler at one point, which may have been one of the most interesting (but most difficult) projects I’ve ever worked on.

                                    2. 2

                                      Implementing data structures in terms of Church encodings (i.e. catamorphisms or the eliminators). Here is an example using the Option type:

                                      https://gist.github.com/debasishg/1059864

                                      Here is an example of the List type:

                                      https://gist.github.com/debasishg/1059896

                                      The point is that data structures become just their catamorphism function. Data structures are functions! How awesome is that?

                                      1. 1

                                        Very awesome! Thanks for sharing.

                                      2. [Comment removed by author]

                                        1. 1

                                          Whoops… I read this question as “small project practices.”

                                        2. [Comment removed by author]

                                          1. 1

                                            Doing this on an 8-bit microcontroller, like an Arduino (or the microcontroller it’s based on, the ATMega328) is a pretty fun (and cheap) project – get an Arduino with small input and LCD shields, and you’ve got a fun platform to hack on. It won’t be the next Linux, with 16-32K of flash and 2K of memory, but you will certainly learn a lot.