1. 28
  1. 9

    The ray-tracer suggestion is a great one, and honestly, it’s very satisfying. Seeing it render something gives you that “yep, something’s been accomplished here” feeling. I am, sadly, entirely without artistic taste in the graphics department, so I think I’ve only implemented one once, but I had a really good time.

    I have a few “benchmark” projects, too – things that I do when I learn a new language or a new piece of a stack. These are projects I do both as a learning exercise and as a chance to evaluate how good that technology is. They’re things where I understand the underlying problem well enough that I don’t have to learn a language and how to solve the problem at the same time, so I can just focus on the language. They go like this:

    For systems programming languages:

    • Plain text editor for large files (large enough, that is, that you get to mmap them). Memory mapping is weird and full of strange but easily tested corner cases, and it’s portable in principle but not always in semantics, which helps me figure out how well the standard library delivers on its portability promises and how well it accommodates things that aren’t in the first ten lines of man mmap. The underlying workhorse data structure is usually a gap buffer, which I’m pretty familiar with from Emacs, but trying to implement and use a rope instead is very edifying for some language features. Specifically, it’s a tree-like structure (which flexes weirder memory ownership models) that lends itself to garbage collection, so it helps you evaluate both how good the garbage collector is and how easy it is to do reference-counting GC.
    • Packet decoding. This is a convenient subset of e.g. writing a DNS implementation. It’s straightforward (you do all the hard decoding part without dealing with all the handshakes that precede it), easy to validate automatically, and is a good test for how well the language helps you avoid off-by-one errors, out-of-bounds reads/writes and the like. It also offers a bunch of easily-explored parallelization options.
    • A trivial, bare-metal task scheduler. I just set up a minimal kernel – something that U-Boot can dump in memory and jump straight to, which does nothing other than set a 1:1 virtual mapping (if I’m on a platform that has an MMU), set up a system timer interrupt, and walk a task list on the system timer interrupt. This is an excellent test for the toolchain – it’s a non-standard setup that usually needs some handholding at the linking step, so if the underlying toolchain is too magic-heavy, you’ll know. Once you’ve done it for a platform, you can re-implement it in another language basically forever (I did it for i386 for years, and only switched to ARM64 last year, when I tried to do it in Rust). References are always available for the muddy parts (you can mooch the boring MMU boilerplate from NetBSD, for instance). It’s an excellent test for how well a language handles weirder memory access patterns, how easily it lets you write synchronisation primitives, and how well it deals with environments where the assumptions made by the standard library aren’t met. And it’s a few weeks of fun coding away from some easy street cred (if you expose a serial console, or write a keyboard and framebuffer driver, you get to say you wrote an OS).
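
To give a feel for the packet decoding exercise above, here is a rough sketch in Python of the bounds-checked decoding it's about. It assumes the standard 12-byte DNS header layout and handles only uncompressed names; the names `DnsHeader`, `decode_header` and `decode_name` are made up for illustration, not taken from any library.

```python
import struct
from dataclasses import dataclass
from typing import Tuple

# DNS header: six big-endian unsigned 16-bit fields, 12 bytes in total.
HEADER = struct.Struct("!HHHHHH")


@dataclass
class DnsHeader:
    ident: int
    flags: int
    qdcount: int
    ancount: int
    nscount: int
    arcount: int

    @property
    def is_response(self) -> bool:
        # QR is the top bit of the 16-bit flags word.
        return bool(self.flags & 0x8000)


def decode_header(packet: bytes) -> DnsHeader:
    """Decode the fixed-size header, refusing truncated input."""
    if len(packet) < HEADER.size:
        raise ValueError("packet shorter than a DNS header")
    return DnsHeader(*HEADER.unpack_from(packet, 0))


def decode_name(packet: bytes, offset: int) -> Tuple[str, int]:
    """Decode an uncompressed name; return (name, offset just past it)."""
    labels = []
    while True:
        if offset >= len(packet):
            raise ValueError("name runs past end of packet")
        length = packet[offset]
        offset += 1
        if length == 0:  # zero-length label terminates the name
            break
        if length & 0xC0:
            # Top bits set: compression pointer or reserved label type.
            raise ValueError("compressed names are not handled in this sketch")
        if offset + length > len(packet):
            raise ValueError("label runs past end of packet")
        labels.append(packet[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels), offset
```

Every read is preceded by an explicit length check, which is exactly the part a language can make easy or painful. Feeding it truncated or random byte strings and checking that it only ever raises `ValueError` is a cheap way to do the automatic validation.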

    For scripting languages:

    • A bookmark manager. It’s a simple CRUD application that is nonetheless useful once you’ve finished it. I’ve been using mine for about twenty years now, and it’s been rewritten in PHP, Java, C++, Common Lisp, Go, Python, and more recently Rust during this time. (I know some of those aren’t exactly scripting languages; Java is there because I actually used it to learn JSP, Rust is there because I wanted to see how it deals with heavy string manipulation and how well it handles SQLite interfacing, C++ is probably there because I was a huge fanboy at the time or I’d fallen down the stairs and hit my head, hell knows).
    • A Markdown -> HTML converter. You get to walk around and allocate strings a lot, so it’s a reasonably good profiling sampler, it helps you see how well the standard library deals with Unicode strings, and how easy file I/O is.
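
For a sense of scale, the Markdown converter above can start out as small as the following Python sketch. It only covers a tiny subset I picked for illustration (ATX headings, paragraphs, `**bold**` and `*italic*`), and `inline` and `markdown_to_html` are hypothetical names, so treat it as a starting point rather than a real implementation.

```python
import html
import re


def inline(text: str) -> str:
    """Escape HTML, then convert **bold** and *italic* spans."""
    text = html.escape(text, quote=False)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text


def markdown_to_html(source: str) -> str:
    """Convert blank-line-separated blocks into headings or paragraphs."""
    out = []
    for block in re.split(r"\n\s*\n", source.strip()):
        if not block:
            continue  # empty input yields no blocks
        # A single line starting with 1-6 '#' characters is a heading.
        heading = re.fullmatch(r"(#{1,6})\s+(.*)", block)
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{inline(heading.group(2).strip())}</h{level}>")
        else:
            # Join wrapped lines of a paragraph with single spaces.
            out.append(f"<p>{inline(' '.join(block.split()))}</p>")
    return "\n".join(out)
```

Reading the input with something like `pathlib.Path(name).read_text(encoding="utf-8")` and throwing non-ASCII text at it covers the file I/O and Unicode parts; growing it to lists, links and code spans is where the string allocation starts to show up in a profiler.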

    For GUI toolkits:

    • A rich text editor. A rich text widget is the hardest part of a GUI toolkit, to the point where you can tell how good a toolkit is based on how good its rich text widget is. Once I have it editing formatted text and displaying pictures I just copy-paste 2,000 pages’ worth of text and pictures in it. It’s a good benchmark of both the GUI toolkit and modern tech stacks: thirty-year-old toolkits routinely run circles around modern web-based WYSIWYG editors with which they are on par, feature-wise, except for font anti-aliasing.
    • EM field simulation for some trivial case (e.g. a couple of electrostatic point charges). The math is super straightforward, you just fill up a couple of matrices. But you can ramp up the computational effort by just making it big enough and adding enough sources, so you get to see how well async operation is handled (background thread doing the math, foreground thread updating a plot) with like ten lines of math.
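
The "ten lines of math" for the point-charge case above look roughly like this in Python. This is a sketch under my own assumptions: 2D positions, the potential V = k·q/r summed over all charges, a small `min_r` clamp to dodge the singularity at a charge's own position, and a plain `ThreadPoolExecutor` standing in for whatever worker mechanism the GUI toolkit provides. `potential_grid` and `solve_in_background` are illustrative names.

```python
import math
from concurrent.futures import Future, ThreadPoolExecutor

K = 8.9875517923e9  # Coulomb constant in N·m²/C² (approximate)


def potential_grid(charges, xs, ys, min_r=1e-9):
    """Return grid[j][i], the potential in volts at (xs[i], ys[j]).

    charges is a list of (q, x, y) tuples: charge in coulombs, position in metres.
    """
    grid = []
    for y in ys:
        row = []
        for x in xs:
            v = 0.0
            for q, cx, cy in charges:
                # Clamp the distance so a sample point on a charge stays finite.
                r = max(math.hypot(x - cx, y - cy), min_r)
                v += K * q / r
            row.append(v)
        grid.append(row)
    return grid


def solve_in_background(charges, xs, ys) -> Future:
    """Run the computation on a worker thread and return its Future."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(potential_grid, charges, xs, ys)
    pool.shutdown(wait=False)  # the already-submitted job still runs to completion
    return future
```

The foreground side would poll `future.done()` from a timer, or hook `future.add_done_callback(...)`, and redraw the plot when the result arrives. How gracefully the toolkit lets you hand that result back to the UI thread is the thing being tested; making `xs` and `ys` bigger and adding sources is how you ramp up the load.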
    1. 4

      The “500 Lines or Less” book has maybe 30 projects in many application domains and in varied languages, from a simple database to a 3D modeler.

      There are also good materials on writing ML algorithms from scratch.

      1. 2

        Ray-tracers are also not only about visuals. I would like to build a sonar ray-tracer which accounts for water density and so on. This video is my inspiration for that idea.
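
As a rough idea of what the core of such a sonar tracer might look like, here is a Python sketch that bends a single ray through horizontal water layers using Snell's law. It is a simplification of my own: instead of modelling density directly, each layer is given a sound speed (which is what density, temperature and salinity end up affecting), and `trace_ray` and its parameters are hypothetical names.

```python
import math


def trace_ray(layers, launch_angle_deg):
    """Trace a ray downward through stratified layers.

    layers: list of (thickness_m, sound_speed_m_per_s), top to bottom.
    launch_angle_deg: angle from the vertical in the first layer.
    Returns a list of (depth_m, range_m) points along the ray. Tracing
    stops early if the ray would be totally reflected at a boundary.
    """
    first_speed = layers[0][1]
    # Snell's law: sin(angle) / speed is the same in every layer.
    invariant = math.sin(math.radians(launch_angle_deg)) / first_speed
    depth = 0.0
    horizontal = 0.0
    path = [(depth, horizontal)]
    for thickness, speed in layers:
        sine = invariant * speed
        if sine >= 1.0:
            break  # no transmitted ray: it turns back at this boundary
        angle = math.asin(sine)
        depth += thickness
        horizontal += thickness * math.tan(angle)
        path.append((depth, horizontal))
    return path
```

With uniform layers the ray stays straight; a faster layer below bends it away from the vertical, and a steep enough speed jump makes it turn back, which is the kind of behaviour a real tracer would have to follow through.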

        1. 2

          … and I totally just coded up a prototype in Python: Link

        2. 1

          Awesome post. Thank you for sharing. One nit: the link to ray tracing in Python is broken.

          1. 2

            Nice catch, fixed!