Studying the Raft paper and doing the lab exercises from the MIT 6.824 course, which implement it
Started the MIT Distributed Systems course with a former colleague: https://pdos.csail.mit.edu/6.824/schedule.html
We’re going through papers, making notes and doing exercises. :D
Finishing off a Snowplow migration from AWS to GCP. It turned out to be trickier than expected because the AWS pipelines were ~2 years old, and things had changed quite a bit. I spent the last few weeks working on an intermediate proxy layer written in Go which does schema injection/transformation and makes things compatible between old and new. This was my second non-trivial Go project and I enjoyed it more than expected.
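The actual proxy was in Go, but the core schema-compatibility idea can be sketched in a few lines (Python here for illustration; the field names and schema versions below are hypothetical, not Snowplow's):

```python
# Hypothetical sketch: translate an event from an old schema version
# to a new one by renaming legacy fields and injecting required defaults.
OLD_TO_NEW_FIELDS = {"evt_ts": "event_timestamp", "uid": "user_id"}
REQUIRED_DEFAULTS = {"schema_version": "2.0", "platform": "unknown"}

def transform_event(old_event: dict) -> dict:
    """Rename legacy fields and inject fields the new pipeline requires."""
    new_event = {OLD_TO_NEW_FIELDS.get(k, k): v for k, v in old_event.items()}
    for key, default in REQUIRED_DEFAULTS.items():
        new_event.setdefault(key, default)
    return new_event
```

In the real proxy this sits between the old collectors and the new pipeline, so neither side has to change.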
A proof of concept evaluating BigTable for time-series data, plus Grafana, in support of Data Science. Shouldn’t be difficult.
Refactoring my first Rust project to improve code quality.
Finish a blog post on lessons learned from the above project.
Begin a couple of new projects - a distributed KV Database exercise (https://github.com/pingcap/talent-plan/blob/master/rust/docs/lesson-plan.md) and something else I haven’t thought of
Some Leetcoding on Linked Lists
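For anyone warming up on the same topic, the canonical linked-list exercise is in-place reversal; a minimal Python sketch (not from any particular problem set):

```python
class Node:
    """A node in a singly linked list."""
    def __init__(self, val, nxt=None):
        self.val = val
        self.next = nxt

def reverse(head):
    """Iteratively reverse a singly linked list; returns the new head."""
    prev = None
    while head:
        # Re-point the current node backwards, then advance.
        head.next, prev, head = prev, head, head.next
    return prev
```

The tuple assignment works because the right-hand side is evaluated before any of the left-hand names are rebound.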
Meeting a friend who is here in Berlin. Studying some Data Structures. Finishing off the font and sound subsystem support in my toy VM in Rust. Thinking about creating an assembler and a debugger for that VM. So maybe I’ll spend the weekend doing some research on that.
Trying out the Zettelkasten Method (https://zettelkasten.de) via the Sublime Text plugin; I use it for learning/thinking. For tasks, I use PlainTasks (a Sublime Text and VSCode extension) to keep checklists for my work/home things: I have work.todo and personal.todo files which the plugin picks up.
I’ve tried all the fancy paid apps (OmniFocus, Things, etc.), but they aren’t really going to enforce discipline. I’ve given up on bullet journaling because handwriting is too slow for jotting down notes. It’s good for brainstorming/idea exploration, but not for the grunt work of getting the things in your head onto an external system.
The Chip8 emulator is about to be wrapped up. Next I’m going to build a debugger on top of it; writing a debugger should be a good exercise.
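The fetch-decode-execute core that a debugger hooks into can be sketched like this (a minimal illustration in Python, not the actual emulator; only one real CHIP-8 opcode, 6XNN "set register VX to NN", is handled):

```python
class Chip8:
    """Minimal sketch of a CHIP-8 core that a debugger can hook into."""
    def __init__(self, program: bytes):
        self.memory = bytearray(4096)
        self.memory[0x200:0x200 + len(program)] = program  # ROMs load at 0x200
        self.v = [0] * 16          # registers V0..VF
        self.pc = 0x200
        self.breakpoints = set()   # a debugger pauses when pc lands here

    def step(self):
        if self.pc in self.breakpoints:
            raise RuntimeError(f"breakpoint hit at {self.pc:#05x}")
        # Fetch: every CHIP-8 opcode is two bytes, big-endian.
        opcode = (self.memory[self.pc] << 8) | self.memory[self.pc + 1]
        self.pc += 2
        # Decode/execute: 6XNN sets register VX to NN.
        if opcode & 0xF000 == 0x6000:
            self.v[(opcode >> 8) & 0xF] = opcode & 0xFF
```

Since the debugger only needs `pc`, the registers, and a per-instruction `step()`, single-stepping and breakpoints fall out of the interpreter loop almost for free.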
‘Coders at Work’. Fantastic Book
Doing a Snowplow migration from one cloud provider to another (migrating a 2 year old legacy service, with the expectation of almost-zero downtime).
Attending a Linux user conference here in Berlin (https://all-systems-go.io/), which is coincidentally about 10 minutes from my office, and I got a free ticket!
Nothing to optimize if no queries can be run! :D
Strange. I never saw any downtime.
Been getting dragged into some toxic meetings at work and it’s not fun. Having to defend my team’s work (they’re all amazing) while bearing the brunt of unrealistic expectations has been taking a toll. Hopefully the weekend will clear some fog. :)
Studied the lifetimes and testing chapters of the Rust book. Making good progress with it.
Going to re-do some initial Project Euler problems and discuss some advanced ones with a colleague.
More annual and quarterly earnings/analysis report reading. Yay.
I live with my family and they’re moving, so I’m moving. Helping them move most of their things Saturday. Next Friday is my last day at work, and I don’t have a new job lined up yet. I’m planning to go to university in the fall so I’ll have to find something just for the summer. I also screwed up my dates so I have to go to the library Sunday and renew books, which is a 6 hour drive. A lot of driving this weekend.
I don’t think I know much history, so I’m soliciting recommendations for history books. My plan is to do a general overview of world history and then delve into specific nations and events that influenced the West e.g. Rome and WWII.
One fantastic history book that I can recommend is Glimpses of World History by Jawaharlal Nehru, India’s first Prime Minister. He wrote much of it while he was in prison, with the express purpose of educating his daughter. It’s a massive, expansive work, and it covers both the East and the West. Nehru was particularly suited to it, since he studied law in the UK and came home to take part in the independence struggle; he’s a child of both worlds.
One beautiful thing that this book does, which I have yet to see any other history book do, is relate different events happening in different parts of the world at the same time. We often see history as big events dotted over a timeline in various parts of the world, but the truth is that a lot of things were happening simultaneously. For example: while Europe was going through the Dark Ages, what was happening in China?
For a book of this size and scope, it’s amazingly readable as well - since it was basically just letters to his little girl.
I would highly recommend this book, because while I was reading it, I distinctly remember thinking, “this is how history should be taught”.
As someone who does ML in C++, has built a production petabyte-scale computer vision system, and has interviewed many ML people, the reason is pretty obvious: ML people can’t code. Seriously, many are math people, academics with very little experience building production systems. It’s not their fault; it’s not what they trained for or are interested in. These high-level APIs exist to address their needs.
I want to emphasize one thing that might make my previous comment more clear. The hard part of ML isn’t programming.
The hard part of ML is data collection, feature selection, and algorithm construction.
The only part where programming matters is building the training software and execution software. However most ML people care about the former, not the latter.
ML has certainly been growing fast. I see this as mirroring what’s happening in CS in general nowadays, with ML simply being the foremost frontier, and a hype word to boot.
However, I would temper your statement with an aspect of what u/zxtx said. Even those who are capable of building a project in a low-level language, won’t always want to. It’s nice to be able to dodge the boilerplate while hacking on something new. And that goes for those who understand the low-level stuff, too. So I’m not too surprised that people aren’t using low-level languages for everyday ML development. (Libraries are another story, of course.)
Perhaps you know more about how to write ML projects in C++ without getting mired in boilerplate, though. Was this ever a problem for you? Or does low-level boilerplate generally not get in your way?
Great question. I am not mired in boilerplate because C++ is a high level language. I use it because the transition from prototype to production is very smooth and natural.
I think ultimately it’s not fashionable to learn. I’m finding most younger programmers simply don’t have proficiency in it, meaning they haven’t developed the muscle memory, so it feels slower. The computer field is all about pop culture, which I believe is the actual answer to OP’s question now that I think about it. In other words, Python and R are fashionable, and that’s why they are being used.
I think ultimately it’s not fashionable to learn
It’s not just fashion: it’s an incredibly complicated language. It’s so complicated that Edison Design Group supplies the C++ front-ends for most commercial compiler vendors, just because those vendors know they’d screw it up themselves. Some C++ alternatives are easier to learn or provide extra benefits for the extra effort.
On top of that, C++ had really slow compiles compared to almost any other language I was using at the time I considered it. That breaks a developer’s mental state of flow. To test the problem, I mocked up some of the same constructs in a language designed for fast compiles, and things sped way up. It was clear C++ had fundamental design weaknesses. The designers of the D language confirmed my intuition with design choices that let it compile fast despite having many features and a C-like style.
It’s true the compile times are slow, but it doesn’t kill flow, because you don’t need to compile while you program, only when you want to run and test. I would argue any dev style where you quickly switch between running and coding slows you down and takes you out of flow anyway.
In regard to it being complicated, this is true. However, C++17 is much more beginner friendly. And even though 1980s C++ was arguably harder to learn than today’s, millions learned it anyway because of fashion. Don’t underestimate the power of fashion.
And lastly, D has its own design flaws, like introducing garbage collection. Why, in a language that has RAII, do you need or want garbage collection? Nobody writing modern C++ worries about leaking memory.
“Don’t underestimate the power of fashion.”
You just said it’s out of fashion. So, it needs to be easier to learn and more advantageous than the languages currently in fashion. I’m not sure that’s the case. The hardest comparison is C++ vs Rust, where I don’t know which will come out ahead for newcomers. I think the reduction in temporal memory errors is a big motivator for getting through Rust’s complexity.
“And lastly , D has it’s own design flaws like introducing garbage collection.”
You can use D without garbage collection (article here). The Wirth languages all let you do that too, with a keyword marking the module as unsafe. So they defaulted to the safest option, with the developer turning it off when necessary. Ada made GC optional, defaulting to unsafe (my memory is fuzzy here), since real-time with no dynamic allocation was the most common usage. There are implementations of reference counting for it, and a RAII-like feature called controlled types, per some Ada folks on a forum.
So, even for C++ alternatives with garbage collection, those targeting the system space don’t mandate it. Feel free to turn it off using other methods like unsafe, memory pools, ref counting, and so on.
Sorry, I had a very hard time grokking your response. What I meant was that Python and R are used for ML not for technical reasons, but because they’re fashionable. There is social capital behind those tools now. C++ was fashionable from the late ’80s to the late ’90s in programming (not in ML). Back then, Lisp and friends were popular for ML!
Do you mind clarifying your response about fashion?
In regard to D, I still think garbage collection, even though it’s optional, is a design flaw. It was such a flaw that if you turned it off, you could not use the standard library, so they were forced to write a new one.
C++ is such a well designed language that you can do pretty much any kind of programming (generic, OOP, functional, structural, actor) with it and it’s still being updated and improved without compromising backwards compatibility. Bjarne is amazing. By this time, most language designers go off and create a new language, but not Bjarne. I would argue that’s why he is one of the greatest language designers ever. He was able to create a language that has never stopped improving.
Now WebAssembly is even getting web developers interested in C++ again!
I was agreeing that it went out of fashion. I don’t know about young folks, seeing as it happened during the push by managers toward Java and C#, which kept getting faster, too. Even languages like Python replaced it for prototyping, and sometimes production. Now there are C/C++ alternatives with compelling benefits, at least one of which is massively popular. The young crowd is all over this stuff, for jobs and/or fun depending on the language.
So, I just don’t see a lot of people going with it in the future, past the social inertia and optimized tooling that cause many to default to it. The language improvements recently have been great, though. I liked reading Bjarne’s papers, too, since the analyses and tradeoffs were really interesting. Hell, I even found a web application framework using it.
I would argue any dev style where you quickly switch between running and coding slows you down and takes you out of flow anyway.
I have to disagree. REPL-driven development has only grown more and more useful over time. Now, when I say this, you may think of languages with high runtime overhead, such as Python and Clojure, and cringe. But nowadays you can get this affordance in a language that also leaves room for efficient compilation, such as Haskell. And don’t forget that even Python has tools for compiling to efficient code.
If you feel like having your mind opened on this matter, Bret Victor has done some very interesting work liberating coding from the “staring at a wall of text” paradigm. I think we’re all bettered by this type of work, but perhaps there’s something to be said for keeping the old, mature standbys in close proximity.
Sorry, just to clarify: I LOVE REPL development! Bret Victor’s work is amazing. What I mean is anything that takes you out of the editor. For example, if you have to change windows to compile and run.
REPLs are completely part of the editor, and live-coding systems don’t take you out of the flow. But if you need to switch out of the editor and run things by hand, that takes you out of flow because it’s a context switch.
100% agreed. Fast compilation times can’t fix a crappy development cycle.
I think the worst example of anti-flow programming is TDD. A REPL is infinitely better.
Do the proponents of TDD knock REPLs? In my opinion, REPL-driven is just the next logical step in TDD’s progression.
no, I’m knocking TDD ;-)
How do you feel about platforms like https://onnx.ai/ which make it easy to write a model in a language like python, but have it deployed into a production system likely written in C++?
I think they are great but don’t go far enough, because we are entering a new paradigm where we write programs that write programs. People need to go further and write a DSL, not just a wrapper in Python. I think a visual language where you connect high-level blocks into a computation graph would be wonderful. You would then feed it data and have it learn the parameters.
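As a toy illustration (entirely hypothetical, and in Python rather than a standalone visual language) of wiring high-level blocks into a computation graph and then feeding it data:

```python
class Block:
    """A block in the graph; calling it evaluates its whole subgraph."""
    def __init__(self, fn, *inputs):
        # Each input is either another Block or the name of a data feed.
        self.fn, self.inputs = fn, inputs

    def __call__(self, **feeds):
        args = [i(**feeds) if isinstance(i, Block) else feeds[i]
                for i in self.inputs]
        return self.fn(*args)

# Wire up blocks: output = relu(x * w + b)
linear = Block(lambda x, w, b: x * w + b, "x", "w", "b")
relu = Block(lambda y: max(0.0, y), linear)

# Feed it data; a learner would then adjust w and b from examples.
print(relu(x=2.0, w=0.5, b=-2.0))  # 0.0, since 2*0.5 - 2 = -1 clips to 0
```

A real system would of course add typed tensors, gradients, and a visual editor on top; the point is just that the graph, not the host language, carries the program.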
So intuitively a DSL is the correct approach, but as can be seen with systems like Tensorflow, it leads to impedance mismatches with the host language. This mismatch slows people down and ultimately leads them to systems that just try to extend the host language, like PyTorch.
I guess when I think of DSLs, I don’t think of host languages. I’m thinking more about languages that exist by themselves, specific to a domain. In other words, there isn’t a host language as in Tensorflow.
Well for Tensorflow, I mean something like python as the host language.
Wow, rereading my last sentence I can see how it had the opposite meaning to the one I intended. I meant I was thinking of DSLs without a host language, unlike Tensorflow.
Would you mind shedding some light on what this petabyte level computer vision system is? I’m very curious!
It was a project within HERE Maps to identify road features and signs to help automate map creation. Last time I was there, it processed dozens of petabytes of LiDAR and imagery data from over 30 countries. It’s been a couple of years, so I can’t tell you where it’s at today.
Several of my peers at Arcadia.io’s Pittsburgh office are hiring.
If you see “data ingestion pipeline” in any other job description, that’s my team. I’m building a pipeline right now for a start sometime later in Q1 next year; my team does stuff in Scala, Groovy, Ruby, and Rust.
I don’t suppose you’re looking for remote engineers or sponsoring visas? I couldn’t find anything about either in the posting.
We’d consider remote for someone who fits the position perfectly or is willing to move near Pittsburgh or Boston within a few months.
My team is actually “local remote”: we all live in the greater Pittsburgh area but only go into the office once a week. However, the other teams are the opposite: they work in the office daily but remotely one day per week.
We do visa sponsorship for the right candidate.
Sounds good. I’ll give it a shot :-)
Pretty much the same experience. I went through 5 rounds with a company (all online, 1 to 1.5 hrs after my work hours, via Skype). All of them involved doing live coding (algorithms, rate limiters, multithreading problems and so on), extensive system design questions, quite a bit of time spent on work experience, past projects, scalability, microservices and everything you can think of. After every round I was given the feedback that they like me and still want to go forward.
At the end, they decided they wanted to do one more round, in which I took too long to answer a linked list question and was rejected.
Not only was this mentally exhausting, it also wasted a lot of time going back and forth between me and the consultant, scheduling and rescheduling calls because we were in different timezones. Quite a lot of effort.
After all of that they gave you a linked list question at the end? And then failed you on it? Was there any chance you would be implementing linked lists in that job?
On the face of it that all seems pretty ridiculous.
Nah. I would’ve most likely worked on infrastructure, golang, cloud, etc etc.
I suspect it was simply this: Engineer 1 ran the interview the way they wanted, and Engineer 2 most likely didn’t collect any feedback from the previous round and interviewed their own way. In the end, someone didn’t like something about me, and all 5 rounds of progress weren’t weighed against the couple of things I fell short on.
It’s fine I guess, a lesson learned. :-)
Working on a Google Code Jam problem. Algorithms/data structures are something I know well enough, but it’s always been a sticking point that I’m not super confident when a new problem comes my way. Aiming to slowly build those skills up.
At office taking lots and lots of interviews. Sigh.
Are there good lectures available for distributed systems somewhere? I find most classes are usually just about reading lots of papers. While that’s great, and I’m doing it, it would be really fantastic to have an experienced instructor drilling into the principles and practical examples of distributed systems. Thanks :)
Someone in #go-nuts linked me this excellent video from Linux Conf Australia: Introduction to go by Mark Smith from Dropbox. :)