Working on more benchmarks for Oil, which currently look something like this:
http://www.oilshell.org/release/0.2.0/benchmarks/osh-parser.wwz/
I’m measuring the virtual memory used, as well as the runtime performance (not just parser performance).
I have a pending change that speeds OSH up considerably (the ASDL code generation I mentioned last week) and I want to know exactly how much faster it is.
I understand a little more why most programming languages don’t publish benchmarks with every release. It is a big pain to do so :) But I think Oil itself could change that: shell is the right tool for conducting performance experiments (especially across multiple machines, which I am doing).
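The core of such a measurement is small. Here’s a hedged sketch in Go of timing a command and recording its peak resident memory; this is not Oil’s actual harness, and the real benchmarks also track virtual memory (e.g. VmPeak from /proc), which this omits:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
	"time"
)

// Times a child process and reports its peak resident memory.
// Unix-specific: SysUsage() is a *syscall.Rusage on Linux/macOS.
func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: bench <command> [args...]")
		os.Exit(1)
	}
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	start := time.Now()
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "command failed:", err)
	}
	elapsed := time.Since(start)

	// On Linux, Maxrss is reported in kilobytes.
	ru := cmd.ProcessState.SysUsage().(*syscall.Rusage)
	fmt.Printf("elapsed: %v  max RSS: %d KiB\n", elapsed, ru.Maxrss)
}
```

Usage would look like `go run bench.go osh -c 'echo hi'`, run once per shell and per machine.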
[Comment removed by author]
You should cut and paste this comment into your README, just so people have a basic idea of what it’s about.
I’ve been in a cycle of writing little prototypes to suss out some of the implementation details of my little PL.
Last night I sat down and wrote a lexer + recursive descent parser for the phrase grammar. In the past, I’ve leaned on lex/yacc-like tools for this, and always dreaded any sort of grammar evolution. Although the phrase grammar is simple, I’m blown away by the unreasonable effectiveness of hand-writing a lexer and parser. Limiting yourself to recursive descent always imposes a worthwhile complexity constraint on the grammar, if you can stomach it. It also made me realize that if my goal is economy of implementation, dispensing with parser generators is a good idea.

The second-stage parser (I haven’t come up with a term for it) needs to be written. This is where macros hook in. I have a simple, unhygienic pattern-matcher that can be slotted in for now, but it really needs to be written as an NFA. All three of these components will be part of the Reader (to use Lisp parlance).

Before that, I’d like to add some special forms, as they constitute the ground floor of the language. To do that, I need to decide on an IR: I need a syntax that is lower-level than the surface syntax (which is editable by the user), but still conveys the needed structure. I’m leaning toward a sum type, but might consider sexps as well.
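For the curious, the technique has a very regular shape. A minimal sketch in Go (the comment doesn’t say what language or grammar is involved, so the toy arithmetic grammar and all names here are mine):

```go
package main

import (
	"fmt"
	"strconv"
	"unicode"
)

// Token kinds for a toy grammar (the real phrase grammar isn't shown above).
type tokKind int

const (
	tokNum tokKind = iota
	tokPlus
	tokLParen
	tokRParen
	tokEOF
)

type token struct {
	kind tokKind
	text string
}

// lex is the hand-written lexer: a single pass, no generator involved.
func lex(src string) []token {
	var toks []token
	for i := 0; i < len(src); {
		switch c := src[i]; {
		case c == ' ':
			i++
		case c == '+':
			toks = append(toks, token{tokPlus, "+"})
			i++
		case c == '(':
			toks = append(toks, token{tokLParen, "("})
			i++
		case c == ')':
			toks = append(toks, token{tokRParen, ")"})
			i++
		case unicode.IsDigit(rune(c)):
			j := i
			for j < len(src) && unicode.IsDigit(rune(src[j])) {
				j++
			}
			toks = append(toks, token{tokNum, src[i:j]})
			i = j
		default:
			panic("unexpected character: " + string(c))
		}
	}
	return append(toks, token{tokEOF, ""})
}

// parser is classic recursive descent: one method per grammar rule,
// which is what keeps grammar evolution cheap.
type parser struct {
	toks []token
	pos  int
}

func (p *parser) peek() token { return p.toks[p.pos] }
func (p *parser) next() token { t := p.toks[p.pos]; p.pos++; return t }

// expr := term ('+' term)*
func (p *parser) expr() int {
	v := p.term()
	for p.peek().kind == tokPlus {
		p.next()
		v += p.term()
	}
	return v
}

// term := NUM | '(' expr ')'
func (p *parser) term() int {
	switch t := p.next(); t.kind {
	case tokNum:
		n, _ := strconv.Atoi(t.text)
		return n
	case tokLParen:
		v := p.expr()
		if p.next().kind != tokRParen {
			panic("expected )")
		}
		return v
	default:
		panic("unexpected token: " + t.text)
	}
}

func main() {
	p := &parser{toks: lex("1 + (2 + 3)")}
	fmt.Println(p.expr()) // prints 6
}
```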
I am working on refactoring the frontend of http://raspchat.com/. Previously it was written in Vue 1 without any kind of packaging tools (webpack or rollup). I am refactoring the frontend to be simpler, with a new idea around chatting in multiple rooms, this time with hyperapp and rollup. Staying away from React :) trying to keep it minimal. A few weeks back I switched the backend from Golang to Node.js (for simplicity in the codebase). I hope to finish the pending tasks and hit v1.0 pretty soon.
Looks nice; however, the download button doesn’t work. Have you considered making the server actually IRC-compatible? It’s quite a simple protocol.
The download link will point to GitHub. About IRC, I have similar ideas, but every time I think about it, it raises the question of whether I should just use an IRC server with Node.js doing the WebSocket relaying, which leads me down the path of https://github.com/kiwiirc/webircgateway/. I think I will keep it simple for now and gradually evolve it into something bigger.
I’ve been working a lot recently on my kernel, brackos. My biggest advance has been successfully implementing Symmetric Multiprocessing (SMP), which lets me use all the processors in a computer, rather than just the one that happens to boot. While getting this to work has been incredibly rewarding, it has come at the cost of needing to redesign a lot of the functionality I had already gotten in :(. Most of my memory management functions at the very least need lots of locking put in, and in some cases (for example, my slab allocator) need to be partially reworked to support concurrent accesses. Then I also need to do work on my scheduler to support multiple processors as well.
I think my goal this week will be to try to get my memory management functions ready. That subdivides into a few different tasks: the slab allocator, the page frame allocator (buddy system), and the virtual address space allocator. I’ve actually already got some basic locking in the page frame allocator, but it’s untested and probably incomplete. For the slab allocator, I think I’m going to go the route Linux did with its SLUB allocator and, for each cache, give each processor its own slabs (see the sketch below). Then lastly, for the virtual address space allocator: I think for now I’m just going to get locking in. Way down the line I might replace it with something more sophisticated, but right now it works well enough. Then the tricky part becomes testing the concurrency. It might be worthwhile to bring the allocators out of the kernel and test them with TSan for validation.
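For what it’s worth, the SLUB-style per-CPU idea boils down to something like the toy model below. It’s Go purely for illustration (brackos is a kernel, so the real thing would be C and would manage whole slabs of pages rather than a slice of objects):

```go
package main

import (
	"fmt"
	"sync"
)

const ncpu = 4

type object struct{ id int }

// cache is a toy model of SLUB-style per-CPU caching: each processor
// owns a private free list it can allocate from without locking, and
// the shared pool is the only path that takes the lock.
type cache struct {
	mu     sync.Mutex
	shared []*object       // slow path: shared, lock-protected
	percpu [ncpu][]*object // fast path: one free list per processor
}

// alloc serves from the caller's per-CPU list, refilling a small batch
// from the shared pool (under the lock) only when the local list is empty.
func (c *cache) alloc(cpu int) *object {
	if n := len(c.percpu[cpu]); n > 0 {
		obj := c.percpu[cpu][n-1]
		c.percpu[cpu] = c.percpu[cpu][:n-1]
		return obj
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	batch := 8
	if batch > len(c.shared) {
		batch = len(c.shared)
	}
	if batch == 0 {
		return nil // a real allocator would grow a new slab here
	}
	// Copy the batch out so later appends to shared can't alias it.
	moved := append([]*object(nil), c.shared[len(c.shared)-batch:]...)
	c.shared = c.shared[:len(c.shared)-batch]
	c.percpu[cpu] = append(c.percpu[cpu], moved[:batch-1]...)
	return moved[batch-1]
}

// free returns an object to the caller's private list, again lock-free.
func (c *cache) free(cpu int, obj *object) {
	c.percpu[cpu] = append(c.percpu[cpu], obj)
}

func main() {
	c := &cache{}
	for i := 0; i < 32; i++ {
		c.shared = append(c.shared, &object{id: i})
	}
	a := c.alloc(0) // takes the lock once to refill CPU 0's list
	b := c.alloc(0) // then this one is lock-free
	fmt.Println(a.id, b.id)
	c.free(0, b)
}
```

The point of the design is that the common path never contends: a processor only touches the lock when its private list runs dry.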
Oh and also study for my finals. Gotta get good grades first semester :}
I said last week that Mu’s implementation of continuations had reached some sort of closure. Unfortunately this wasn’t remotely true.
a) In the process of trying to implement McCarthy’s amb operator, I found a bug with Mu reclaiming continuations too quickly.

b) In the process of trying to solve the same-fringe problem, I realized I needed to fix a long-standing hole in Mu’s static dispatch: picking the right overloaded function when called through an indirect call. Mu has function overloading and generics. Calling a generic function will specialize it and add the specialization as a new overload. But when you call a function through a function pointer using call, it currently only checks existing overloads during static dispatch. It doesn’t perform specialization.

So I have more bug-fixing before me. The good news is that trying to write small programs using continuations and coroutines has been fun.
I’ve also been spending some time with Russ Cox’s paper on NFA-based regular expressions at @andyc’s suggestion last week. Andy, that transliteration of the code seems to be quite obviously off. Even patterns without any metacharacters don’t work. They compile down to the right MATCH instructions, but interspersed with unnecessary JUMPs. I’m not sure what sort of bitrot could cause something that obvious, but I only made sense of the evaluation stage this past week. Now I’ll focus on the compiler. Or maybe it’ll be faster to just reimplement it for myself now that I have internalized the paper.
Yeah that is unfortunate. Maybe it was never right – it looks like someone sent it to Russ and he just threw it up there.
Spending some time correcting that would be a great way for me to test my knowledge. Although, as I mentioned, since my lexer works with re2c now, I have less concrete motivation.
If I could see a path to getting rid of re2c via reimplementing something small like that, that would be awesome, but I don’t see that path. re2c is 20K lines of code (C code), and I believe it’s doing a lot that I need! So for the time being it will stay. It won’t cause any problems for users, because I’ll just ship the generated C code.
Maybe if I manage to get a couple of blog posts out about the lexer, then I will have time to dive into that code! But please keep me updated on what you find.
Makes sense. For me it’s just a way to play with some sort of simple compiler. Understanding regular expressions and NFAs seems like a good use of time as well.
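For readers following along: the paper’s scheme compiles a pattern to a tiny instruction set and then simulates it. Below is a minimal Go sketch of that idea; it is my own transliteration, not the C code linked from the paper, and it only compiles literal patterns (exactly the case where stray JUMPs should never appear):

```go
package main

import "fmt"

// A sketch of the bytecode VM from Cox's "regular expressions as a
// virtual machine" approach, limited to literal patterns.
type opcode int

const (
	opChar  opcode = iota // match one literal byte, advance
	opMatch               // whole pattern matched
	opJmp                 // unconditional jump
	opSplit               // fork: try x first, then y
)

type inst struct {
	op   opcode
	c    byte
	x, y int // jump targets
}

// compileLiteral compiles a pattern with no metacharacters: one opChar
// per byte and a final opMatch. Notably, no opJmp should appear at all.
func compileLiteral(pat string) []inst {
	prog := make([]inst, 0, len(pat)+1)
	for i := 0; i < len(pat); i++ {
		prog = append(prog, inst{op: opChar, c: pat[i]})
	}
	return append(prog, inst{op: opMatch})
}

// run is the Thompson-style simulation: advance a set of threads in
// lockstep over the input, O(len(prog) * len(input)) time.
func run(prog []inst, input string) bool {
	clist := addThread(nil, prog, 0)
	for i := 0; i <= len(input); i++ {
		var nlist []int
		for _, pc := range clist {
			switch in := prog[pc]; in.op {
			case opChar:
				if i < len(input) && input[i] == in.c {
					nlist = addThread(nlist, prog, pc+1)
				}
			case opMatch:
				return true
			}
		}
		clist = nlist
	}
	return false
}

// addThread follows Jmp/Split eagerly so thread lists only ever hold
// Char/Match instructions.
func addThread(list []int, prog []inst, pc int) []int {
	switch prog[pc].op {
	case opJmp:
		return addThread(list, prog, prog[pc].x)
	case opSplit:
		list = addThread(list, prog, prog[pc].x)
		return addThread(list, prog, prog[pc].y)
	}
	return append(list, pc)
}

func main() {
	prog := compileLiteral("abc")
	fmt.Println(run(prog, "abc"), run(prog, "abd")) // true false
}
```

A full version also needs a visited set in addThread so Split loops (from * and +) don’t recurse forever; it’s omitted here since literal programs never contain Split.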
Helmspoint, a tool to deploy machine learning models to the web.
I’m configuring a filofax to work as a bullet journal. This is a periodic high point in my organisation of my task list; I occasionally get distracted by the large number of disparate sources of work, find an overarching management system, apply it for a couple of weeks, then forget about it. My filofax already has my calendar in it, so I’m used to carrying it everywhere, and I hope that makes for better adoption.

I have to respond to a due diligence request from a potential customer, which means finding a bunch of documentation we already have and producing a bunch of documentation we don’t have.

Our all-hands/Christmas party is on Thursday and I’m emceeing for part of it.

The project I’m working on wraps up today, so any development time I get will be spent jumping onto another project that needs a push.

In the spirit of passing arbitrary condiments: in building the product we just built, I accidentally designed a thing that can solve a whole collection of problems in our domain, so I’m spending some time investigating which of these other problems are actually amenable to that solution.
Working on my fun little side-project Chrome extension that lets you navigate a browser with your voice. It’s good for people who have hand/wrist issues, and for you slobs that eat while you browse (that’s you, greasy-pizza-fingers keyboard guy). It is plugin-based (think userscripts for Greasemonkey), so you can install site-specific plugins within the extension to work with, for example, the beloved lobste.rs and provide voice commands specifically for the site. You can say things like “click second” to open the second article, “back” to trigger the browser’s back button, or “pause” to pause a video you’re watching.
Sounds fun. Do you mind if I ask a few questions? What API are you going to use for voice analysis? Does something like getUserMedia work in background scripts, or do you have to inject a content script into every web page?
Just the WebKit speech recognition API. Yes, it works in the background continuously with some little tricks. In Chrome it goes to Google’s servers, which are nice and snappy. It doesn’t work in FFX yet, but now that Mozilla has recently released DeepSpeech, there’s potential to do much more interesting things on my own servers in later versions, and to bring it to FFX of course.
Ah, didn’t know this was a thing. Interesting API, thanks!
Work: Finalizing documentation to unveil new process standards for our product development teams. Most of it changes things for the better, and a lot of it will be pretty conventional/uncontroversial, but part of it involves giving up on a few of my ideals. Not for bad reasons, mind you; I am just learning to compromise with reality, I suppose.
Code-wise, I will only be writing a bunch of unit tests for a bunch of legacy utilities and library code. One of the libraries might end up extracted and open-sourced, depending on what my team thinks.
Personal: Nothing major in terms of digital technology. I have been saving and using wine bottles to craft interesting containers for plants, though. Most recently I have been cutting them in half to make self-watering planters, where some mesh keeps the soil and the plant above, while some twine hangs through it into a pool of water, which seeps up into the soil via capillary action. I have 4 more bottles to cut and craft in this way.
Going through Thorsten Ball’s book, Writing An Interpreter In Go.
Hey, I’m reading this too! :) Currently on chapter 2.
I can’t talk about my work, but I am immensely enjoying solving each day’s Advent of Code problem.
Day 3 kicked my ass but day 4 was easy. If you didn’t know, we have a GitHub group/IRC channel for doing it with your fellow crustaceans.
Whoa, thanks man!
Yeah, I cheated on the second part of Day 3 by looking it up in OEIS. That’s because I’d been doing them at the end of the day and was afraid of missing the “deadline”. But once the pressure to get an answer quickly went away, I went back and actually redid part 2.
My work so far.
Is the group for feedback and such? I’ve been looking for something like that, to work through problems with a group and maybe get some feedback. I’ve never really participated in Advent of Code before.
Besides HackWeek at the company where I work (where I will be refactoring a couple of libraries that became too entangled over time), I’ll be getting everything ready for SpawnFest.
Made a little thing for scriptable input mapping for evdev devices (a slightly over-engineered solution to the xcape-on-Wayland problem).
I am currently building a NES emulator on a Raspberry Pi, and the plan is to build a Lego case for it. That, and a Telegram bot that transcribes voice notes to text, although I’m not sure how reliable it’s going to be.
For $client I’ll still be plugging potential SQL injection holes, and debugging more weird errors with a third-party API.

The internal phpdoc-to-markdown documentation tool I started last week (after peeking at the phpDocumentor code to fix an 18-month-old bug, and realising it’s an over-engineered POS) is coming along quite well, and has gained JSON output to boot. Hopefully this week I’ll get it doing some tag-specific parsing to extract specific value types: URLs, name+email pairs, version numbers, symbol/type references, and of course name/type/description sets.
As part of this I’ve started updating bits of the framework it builds upon to target its next minimum supported PHP version. It’s nice to finally get to use some of the recent-ish advancements in the framework itself after supporting v5.5 syntax for a while.
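As a sketch of what that tag-specific parsing could look like (the tool itself is internal PHP, so this Go version and its regexes are only my guess at the shape: one small parser per tag kind rather than a generic text splitter):

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// @param string $name Optional description
	paramRe = regexp.MustCompile(`^@param\s+(\S+)\s+(\$\w+)(?:\s+(.*))?$`)
	// @author Some Name <some@example.com>
	authorRe = regexp.MustCompile(`^@author\s+([^<]+?)\s*<([^>]+)>$`)
)

type paramTag struct{ Type, Name, Desc string }  // name/type/description set
type authorTag struct{ Name, Email string }      // name+email pair

// parseTag dispatches a docblock line to the parser for its tag kind,
// returning a structured value instead of raw text.
func parseTag(line string) interface{} {
	if m := paramRe.FindStringSubmatch(line); m != nil {
		return paramTag{Type: m[1], Name: m[2], Desc: m[3]}
	}
	if m := authorRe.FindStringSubmatch(line); m != nil {
		return authorTag{Name: m[1], Email: m[2]}
	}
	return nil
}

func main() {
	fmt.Printf("%+v\n", parseTag(`@param string $name The user's name`))
	fmt.Printf("%+v\n", parseTag(`@author Jane Doe <jane@example.com>`))
}
```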
Also, according to the mechanic we’ll finally have the car back this Friday.
Continuing to work on mead, a Go tool I started last week to aid in maintaining Go packages in Homebrew. I’ll probably give some TLC to bakelite, my Go tool for doing GOOS/GOARCH builds in parallel, in the process.

On Thursday I’m giving a meetup talk on the standard Go tools. I need to write this talk and design the slides before then. This pairs well with the above, as I want bakelite to feel like an extension of go build.

Picked up a second Hetzner box to replace the first (a couple of € more a month for double the RAM; seemed rude not to). So now I’m migrating the handful of SmartOS zones I had running on the old one across to the new one. (Having migrated 400+ zones in the last few months with my team at work, I’m rather used to doing this. Tedious & time-consuming more than anything!)
Advent of Code 2017 in Kotlin.
I’m supposed to be working on a paper, so I wrote a CLI to search for BibTeX references instead :)
This was just a little learning project, my first time writing Rust. Coming from JavaScript and Python, I really enjoyed the experience.
Azure Functions with F#. I spent a bit of time getting the interpreted version working, then realized the compiled version is preferred, but had some trouble deploying the compiled version. It’s probably something trivial. I’d say Azure Functions works; they aren’t without some pain points, but better than usual for Microsoft.
I tend to only talk about development stuff here, but seeing as life is pretty stagnant and we’re gearing up for 2018, not a whole lot is going down at the office, or in my free time (development fatigue on that front.)
So instead I’m going to talk about my podcast! I am the producer for the show Mr. Rewatch, where my two friends dissect each episode of Mr. Robot. One host is a hacker, the other a comedian, and the dynamic is pretty interesting. If you want to talk about audio production (and maybe some video too!) I am all ears. Currently I am putting together a time-lapse of the last edit I did, but ran into trouble with Kdenlive trying to speed up a video from 3h12m to 3m20s (11520s down to 200s, a 57.6× speed-up). So any advice on that is more than welcome!
Converting paper-based processes to paperless. Trying to figure out TypeScript and Koa2.
Trying to figure out clever ways to test performance on a given project.
Lies, damned lies and benchmarks.
Chaos reigns. I’ve been tasked with being the project manager for a month to try to impose some sanity and get things under control.
It won’t work unless the CEO stops micromanaging devs. Also, a month is not enough.
Over the weekend I set up Isso on my blog. It was also the first time I fully set up a container using Ansible; it’s quite neat (though I think I’m violating 20 best practices /shrug).
I’m probably going to try to refine the memory model for my kernel so I can make it robust against faulty kernel modules. I looked at using VT instructions, but those are complicated and I’d rather avoid them. At least I got my Task Model figured out!
I will probably automate a few more of my servers and set up backups; I’m still on the “pray it works” method of backups here.
At work: same routine apps, no new or interesting projects in sight; expanding the internal framework when I have time.
At home: looking at job postings; it would be nice to move to Europe next year. Even though I have almost ten years of experience, I’m lacking a few techs that the most interesting projects require, unit testing most prominently. I’ll try this week to start on side projects for learning and testing things; I was thinking of a finder for discounted games, scraping bundle sites.
Last week I did some work on getting more documentation into PISC; this week I intend to do a batch of Advent of Code exercises at some point.
In Berlin, hanging out and making art! Starting with fragment shading webcam inputs: https://maxbittker.github.io/webcam-sketches/ Also blogging this week: https://maxbittker.com/rc-art-pop-up/
I’m working on dashcache, a caching proxy for Prometheus predicated on the proposition that if data for the range (a,b) is requested, it’s quite likely (a+delta, b+delta) will be requested in delta seconds, allowing me to stitch together a mostly cached response augmented with some fresh data.
It sort of works, I wouldn’t recommend anyone waste their time with it quite yet unless they are interested in making some pre-alpha software.
One weird implementation detail is that it leans heavily on Postgres range types and indexes in order to find candidate cached ranges for a given query.
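To make the range-type idea concrete, here’s a sketch in Go with a guessed-at schema (dashcache is pre-alpha, so its real tables surely differ). Postgres can index tstzrange columns with GiST, which makes “which cached spans overlap (a, b)?” a single indexed query:

```go
package main

import (
	"database/sql"
	"time"

	_ "github.com/lib/pq" // any Postgres driver works; lib/pq is one choice
)

// Hypothetical schema: one row per cached response span for a query.
const schema = `
CREATE TABLE IF NOT EXISTS cached_ranges (
    query text      NOT NULL,
    span  tstzrange NOT NULL,
    body  bytea     NOT NULL
);
CREATE INDEX IF NOT EXISTS cached_ranges_span
    ON cached_ranges USING gist (span);`

// candidates finds cached spans overlapping the requested (a, b) range
// via the && overlap operator; the proxy can then stitch them together
// and fetch only the missing (mostly trailing) slice from Prometheus.
func candidates(db *sql.DB, query string, a, b time.Time) (*sql.Rows, error) {
	return db.Query(`
		SELECT span, body FROM cached_ranges
		WHERE query = $1 AND span && tstzrange($2, $3)`,
		query, a, b)
}

func main() {
	db, err := sql.Open("postgres", "dbname=dashcache sslmode=disable")
	if err != nil {
		panic(err)
	}
	if _, err := db.Exec(schema); err != nil {
		panic(err)
	}
	rows, err := candidates(db, `up`, time.Now().Add(-time.Hour), time.Now())
	if err != nil {
		panic(err)
	}
	defer rows.Close()
}
```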