I have a maxed-out M1 MacBook Pro 16”; it’s the best laptop I’ve ever had by far, the 2015 15” MBP being the runner-up (my younger sister still uses it, and she loves it). It’s solid, the battery is great, physical fn keys are back, and it compiles my C++ project 3x faster than my 2020 MBP did, without getting scorching hot like the old one. The only downgrade is that it’s heavier and thicker than the previous gen, but it’s absolutely worth it. I’m sad this is only a work machine and I’ll eventually have to give it back. I’ll definitely get my own MBP when I do, though.
Working through more Advent of Code problems, then continuing to build a game using Bevy. Right now I’m just reimplementing Space Invaders, but I want to explore some more interesting game mechanics. Any (not too complex) ideas would be appreciated!
I took a couple of weeks off between jobs, starting today. I’m not sure what to do; a few ideas I had: do the Stanford online compilers course, finish my Rust Monte Carlo Tree Search game AI project, learn PyTorch basics, read Refactoring. Right now I’m just playing a bunch of video games.
This is the most important part of the post for me:
I have never been a part of a hiring team and I am sure there are things going on that I don’t know about.
I have some experience evaluating candidates (at small/medium-sized startups). I’ve also gone through the BigTechCo interview wringer. My opinion: because the factors that make someone a good engineer are many, complex, and very hard to detect in the short time an interview allows; because we have no standardized licensing the way other engineering disciplines, law, or medicine do; and because the work of programmers is so broad that two senior engineers from different parts of the stack or different industries can have almost no overlap in technical knowledge, there is no method for hiring that doesn’t suck. DS&A puzzles suck for hiring, but in ways that minimize the worst outcomes for companies.
The main constraint is time: both the candidate’s and your own evaluating engineers’. High-demand candidates won’t interview with you if your process takes significantly longer than the industry average. You may need to spend 8-20 engineering hours on a single candidate; this could easily be $1000+ per candidate (plus recruiter/pipeline costs), and most will either fail or take another offer. You may get one hire for every 10 who go through the full process. I’m making up these numbers, but to be clear: it gets expensive.
Once you have them: you can have a technical conversation, but unless you see actual coding, bullshitters can get through. You can watch them code something closer to actual work, but actual work requires a lot of domain-knowledge ramp-up; throwing them into your own code base or some realistic simulation without letting them acclimate is not going to get you much signal quickly. You can extend this exercise to multiple hours or a day, but then you only get to run a single trial, so to speak. You can give them multi-hour take-home projects, but the most in-demand engineers will just take an offer somewhere that won’t make them do that. Additionally, these take more time for your own engineers to evaluate, and they’re game-able: candidates will ask their more-qualified engineer friend for help.
Data structures & algorithms are one of the very few things that all programmers are supposed to have some handle on. Almost nobody uses this knowledge directly day-to-day, but because programmer skill sets are so varied, it’s the closest thing to a middle ground we can evaluate people on. So we use it as a standardized test. People who are incredibly smart/naturally gifted won’t have to study much to pass DS&A interviews; you want these people. People who are kinda smart but incredibly hard-working will be able to study a lot and pass these interviews; you also want these people. There’s another huge class of people who are good at their jobs but don’t want to or can’t put in time studying. You probably want a lot of this group, but without the signal that DS&A interviews provide, you can’t tell who’s good without spending more time than a DS&A interview takes. Because a bad hire is so expensive (they can easily provide negative value), even if only 1/10th of this group is bad, the expected value of riskier hiring is lower than throwing away candidates you’re almost totally sure would be good hires: if a good hire is worth +1 and a bad hire costs -10, a pool that’s 90% good still has negative expected value.
We’re basically using DS&A interviews the way other industries/disciplines use standardized licensing. There’s a formal process run by some central organization that certifies that a given person can be a mechanical engineer/doctor/lawyer. We don’t have this for software engineering because: software engineering is still in its infancy, so best practices and required skills change 10x faster; required skill sets between different programmers can have nearly no overlap; the demand for software engineers is growing much faster; and for the majority of software engineers (maybe just web developers), when someone makes a mistake, no one dies, so there’s no external (government) enforcer of competency. In a few decades it may make more sense to have a licensing system like MechEs do, once the field solidifies or is broken into specializations that are more easily evaluable.
It sucks, but it sucks less than other systems, from the company’s POV. It sucks a LOT for us engineers. But unfortunately, this may actually be the current global maximum. I understand the complaints the author has, but they should be tempered by the perspective from the other side.
without the signal that DS&A interviews provide, you can’t tell who’s good without spending more time than a DS&A interview takes.
There are plenty of other, more real-world tasks you can give them, though. At my last job we just asked people to write a CSV contact-list importer for an SQLite database with some minor constraints (e.g. merge duplicate emails), and that seemed to work quite well.
The thing with many of these algorithm questions is that most of these algorithms were invented by people who went on to win Turing Awards and the like for doing so, and have Wikipedia pages. You can’t really expect anyone to invent something equivalent in an interview, so you’re basically just testing memory/knowledge more than anything else. And if you only hire people with enough interest in that area, you’ll end up with only a certain type of programmer, which probably won’t be too great for the company either.
This works, but you’re filtering out a lot of people who would be good hires but maybe haven’t done much data-engineering work recently. If you’re okay with that, then great, but I think what FAANG+ are going for is something that doesn’t advantage any one specific type of engineer. They’re also going to filter out a lot of good people, but not from any specific engineering demographic. I know engineers I would regard as generally better than me who would not be able to complete the task as quickly, either because I’ve done something similar more recently or because they happen not to have needed SQLite in their careers so far. The big companies want a test they can use for any type of software engineer.
To your second point: right, I don’t think you’re expected to invent anything truly novel. If you have a deep understanding of the dozen or so algorithms that show up in these tests, what they’re looking for is whether you know which ones to apply when. As I said in my original comment, you can acquire this either by being very smart and studying a little or by being a little smart and studying a lot. So the test ends up evaluating (intelligence × diligence), which may be better for large tech companies (who have a wide variety of work that needs doing, but can’t create a dozen discipline-specific tests that are objectively of equal difficulty [otherwise everyone who wants in would just study for the easiest test]) than testing how recently you’ve done something like the specific exercise you give them. I think generally any company can use smart and/or hardworking people, so this works well enough without biasing towards specific types of engineers.
If the difference for you is that you really do want someone with the specific domain knowledge of “data engineer who works with CSV and SQLite”, then great, but that test won’t scale to your entire engineering team forever.
It seems to me that this kind of task is generic enough that it doesn’t really require a lot of “data engineering” experience; at least, that was our intent. Basic SQL knowledge is not completely universal, but it’s probably as close as you can get, at least in the web-backend space.
Note this was a “come back in a few days when you’re ready” task, not a “please write this in n minutes” kind of task. I appreciate that some people may be more familiar with this kind of stuff than others, but in the “real world” you’ll end up doing loads of stuff without prior experience with the specific task. Even with 0% familiarity (which is probably very rare), it’s not very hard to pick up the required knowledge with a few internet searches. I think Joel Spolsky once phrased his hiring criteria as “smart and gets things done”, and I think this is a reasonable – though not perfect – test of that. In the end, it’s the kind of task people are expected to be able to solve in their actual job, regardless of previous experience.
I wrote a 40-line version (in Go, a rather verbose language) in 20 minutes or so. Most of the rejections I saw fell into two categories: the horribly overcomplicated (the instructions explicitly said we were looking for just a “simple straightforward solution, no caveats”, yet some people sent in versions with loads of abstractions, needless parallelism, etc., sometimes hundreds of lines), and the chaotic, clueless programming-by-trial-and-error, where the candidate clearly had trouble understanding basic programming constructs and just tried and tried until something kind-of-maybe worked.
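For a sense of scale, a “simple straightforward solution” in the sense we meant might look something like the sketch below. To be clear, this is a reconstruction, not the actual exercise: the table schema, the name,email column order in the CSV, and the mattn/go-sqlite3 driver are all assumptions on my part.

```go
// Sketch of a minimal importer. Assumptions (mine, not the original
// exercise): a CSV with a header row and name,email columns, a single
// "contacts" table, and the mattn/go-sqlite3 driver.
package main

import (
	"database/sql"
	"encoding/csv"
	"log"
	"os"

	_ "github.com/mattn/go-sqlite3"
)

func main() {
	db, err := sql.Open("sqlite3", "contacts.db")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// The "merge duplicate emails" constraint is delegated to SQLite:
	// a primary key on email plus an upsert.
	if _, err := db.Exec(`CREATE TABLE IF NOT EXISTS contacts (
		email TEXT PRIMARY KEY,
		name  TEXT
	)`); err != nil {
		log.Fatal(err)
	}

	f, err := os.Open("contacts.csv")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	rows, err := csv.NewReader(f).ReadAll()
	if err != nil {
		log.Fatal(err)
	}
	for _, row := range rows[1:] { // skip the header row (assumed present)
		if _, err := db.Exec(`INSERT INTO contacts (email, name) VALUES (?, ?)
			ON CONFLICT(email) DO UPDATE SET name = excluded.name`,
			row[1], row[0]); err != nil {
			log.Fatal(err)
		}
	}
}
```

The point being: the whole “merge duplicate emails” constraint collapses into a primary key plus an upsert, with no need for extra abstractions or parallelism.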
Companies like Google probably have many more applicants than we had (although we got quite a lot), so they can afford to be more selective, but I see loads of smaller companies copy the same techniques while simultaneously complaining that it’s so hard to find good engineers.
In your previous post you compared software to other specialized fields (law, engineering, medicine, etc.), and I wonder what the state of competency testing is there. I also doubt that all (otherwise competent) people in those fields would be able to pass stringent competency tests in their field, just as quite a few drivers would likely fail a driving-license test after 15 years of driving.
At the end of the day, interviewing is hard no matter what you do, and I don’t think there is One True Way™, but I’m skeptical that testing candidates on theoretical knowledge that bears almost no relation to their actual job is the way to go for most interviews.
I’ve been writing a Monte Carlo Tree Search game AI in Rust. It’s still got a few problems and is much slower than it could be, but I’m hoping to get it playing a game a friend designed, to discover interesting play patterns.
TIL about Monte Carlo Tree Search, thanks! I did a BSc AI course 16 years ago and played a lot with MiniMax and alpha-beta pruning. I tried a genetic algorithm to generate an evaluation function, but it sounds like a Monte Carlo approach would have been better.
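For anyone else just discovering it: the appeal over MiniMax plus an evaluation function is that MCTS scores positions with random playouts, so you don’t need a hand-tuned (or genetically evolved) evaluator at all. Here’s a rough sketch of one iteration; the GameState interface is a hypothetical placeholder of mine, and the result bookkeeping is simplified to a single perspective (a real two-player implementation flips the result at each ply):

```go
// Sketch of one MCTS iteration: select, expand, simulate, backpropagate.
// GameState and its methods are hypothetical placeholders.
package mcts

import (
	"math"
	"math/rand"
)

type GameState interface {
	LegalMoves() []int
	Apply(move int) GameState
	IsTerminal() bool
	Result() float64 // 1 if the root player won, 0 otherwise (simplified)
}

type node struct {
	state        GameState
	parent       *node
	children     []*node
	untried      []int
	wins, visits float64
}

func newRoot(s GameState) *node {
	return &node{state: s, untried: s.LegalMoves()}
}

// selectChild picks the child maximizing the UCT score: average win rate
// plus an exploration bonus for rarely visited children.
func (n *node) selectChild() *node {
	best, bestScore := (*node)(nil), math.Inf(-1)
	for _, c := range n.children {
		score := c.wins/c.visits + math.Sqrt2*math.Sqrt(math.Log(n.visits)/c.visits)
		if score > bestScore {
			best, bestScore = c, score
		}
	}
	return best
}

func iterate(root *node) {
	n := root
	// 1. Selection: walk down until we hit a node with untried moves.
	for len(n.untried) == 0 && len(n.children) > 0 {
		n = n.selectChild()
	}
	// 2. Expansion: play one untried move and add the resulting child.
	if len(n.untried) > 0 {
		move := n.untried[len(n.untried)-1]
		n.untried = n.untried[:len(n.untried)-1]
		child := &node{state: n.state.Apply(move), parent: n}
		child.untried = child.state.LegalMoves()
		n.children = append(n.children, child)
		n = child
	}
	// 3. Simulation: random playout to the end of the game; this is what
	//    replaces the evaluation function entirely.
	s := n.state
	for !s.IsTerminal() {
		moves := s.LegalMoves()
		s = s.Apply(moves[rand.Intn(len(moves))])
	}
	// 4. Backpropagation: update win/visit statistics back up to the root.
	for result := s.Result(); n != nil; n = n.parent {
		n.visits++
		n.wins += result
	}
}
```

After enough iterations you just pick the root child with the most visits as your move.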
Dude, this sounds super interesting; if you can spend some time writing it up, I’d love to see it here some time in the future.
Trying to wrap up this article I threatened to write a few weeks ago. I’m worried that it’s not interesting enough; my conclusion is basically that all the clichés/stereotypes about each language showed up while writing these programs.
I’d still read it ;)
Python is a language I never worked with professionally but understand to some extent. I started learning Rust and Go at some point; I have working compilers on my system and I’ve written some code, but then I focused on something else, and at this point I don’t even know how to write “hello world” in them.
I just finished writing the same program in Rust that I’d already written in Python and Go; I was considering doing a write-up of the experience and the performance.
I just built a desktop machine, my first in 9 years. Going to finish setting up my Ubuntu install and then see if I can make use of this expensive GPU and train a GAN to do something neat. Any ideas?
Perhaps thislobsterspostdoesnotexist.com along the lines of https://www.thiswaifudoesnotexist.net/ ?
I know nothing about GANs; would it be easier to create random real-looking comments?
Or, for extra terror, this H.R. Giger painting does not exist?
One of these waifus is not like the others.
Until I can feed it my entire monorepo codebase as context, it’s not going to be able to replace me. That said, that could happen a lot sooner than I expect.
I don’t think humans grok the entire monorepo either. We typically work with a set of interfaces from the monorepo and the standard library. If GPT’s context window is large enough for the interfaces from the monorepo, that’s probably enough.
EDIT: a big corporation might just fine-tune the model on its monorepo. I wonder if OpenAI has already done that with their own code.
This has already happened. There are multiple ways you could load your code base into an LLM today if you wanted.
I’m curious, how would you do that?
Depends on what you want to do. If you want to generate code from scratch, then you can take a curated selection of your best procedures/classes and use them as context. If you want to search your code base (say, to curate a selection of good procedures/classes) then you can parallelize the search over each module and then summarize the search results. This model is a decent starting point for code completion.
Yeah, you could probably train the LLM on all the code, the documentation, all the dependencies (including the tooling, the runtime, and the APIs) and their documentation, as well as all the existing issues, Slack messages, customer emails, and support tickets, and see what comes out of all that…
I think it would still require a lot of supervision to write code that goes to production, but it could really speed up feature development.
Our codebase is 15 years old, and we still need to change things in the oldest code but definitely don’t want to write code like that anymore. Bits of code get updated the Boy Scout rule way, one merge request at a time. LLMs have no sense of time in that way: even if all the code they write is correct, it’ll be eclectic.