It’s Monday, so it is time for our weekly “What are you working on?” thread. Please share links and tell us about your current project. Do you need feedback, proofreading, collaborators?
This week, I’ve been working on a couple of things.
First up was a blog post in Open Dylan about function types and why we want to add them to Dylan 2016. I’m now working on a new blog post about how the type relationships involving function types should work. The important type relationships in Dylan are instance?, subtype?, and known-disjoint?. For function types in Dylan, the tricky one is the subtype? relationship, as we have keyword and rest arguments to consider. Fortunately, some work was done on this in 1996 at Harlequin.
I’m also working on some other blog posts, including a follow-up to the LLDB integration post.
Outside of Dylan work, I’ve been working on a heap / memory profiler for emscripten. I’m working on getting some parts of this submitted upstream to emscripten now, while I continue to work on the server that collects and analyzes the data. This is built in a client/server model so that a program compiled for emscripten is submitting data to a server rather than trying to perform the analysis within the same web browser window as previous tools have done for emscripten.
This tool is turning out to be really enjoyable to work on and I’m trying to make it pretty general. (It can do a lot more than just memory / heap analysis.) It also needn’t be restricted to only being used with emscripten. Someone could write code for other platforms to submit data to the collection server without too much trouble. (Although I’d have to modify the collection server to be aware of some additional data, like thread IDs.)
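Roughly, the collection side might look something like this. This is a hypothetical Python sketch of the idea, not the actual tool (which targets emscripten); all names and the event shape are invented. The client streams allocation events to the server, and the server keeps the analysis state, so the browser window running the program stays lightweight:

```python
# Hypothetical sketch of the server side of a client/server heap
# profiler: the instrumented program sends allocation events, and the
# collector aggregates them. All names here are invented.

class HeapCollector:
    """Aggregates malloc/free events sent by an instrumented client."""

    def __init__(self):
        self.live = {}        # address -> size of blocks still allocated
        self.live_bytes = 0
        self.peak_bytes = 0

    def record(self, event):
        """event is a dict like {"op": "alloc", "addr": 4096, "size": 128}."""
        if event["op"] == "alloc":
            self.live[event["addr"]] = event["size"]
            self.live_bytes += event["size"]
            self.peak_bytes = max(self.peak_bytes, self.live_bytes)
        elif event["op"] == "free":
            # Unknown addresses are ignored, matching a lossy transport.
            self.live_bytes -= self.live.pop(event["addr"], 0)

collector = HeapCollector()
for ev in [{"op": "alloc", "addr": 1, "size": 100},
           {"op": "alloc", "addr": 2, "size": 50},
           {"op": "free", "addr": 1}]:
    collector.record(ev)
```

Extending it for other platforms would mostly mean adding fields to the event dict (thread IDs, call stacks) rather than changing the aggregation logic.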
I’m continuing to work on my PureScript book as usual. This week is the chapter on monads and interacting with the browser.
Just discovered your book thanks to your comment :) Well you just sold one :)
I’m writing an inductive programming tool.
Ok, that is cool.
Any chance you read this year’s SIGBOVIK paper on this? Search for the proceedings and look for “Unit Test Based Programming.” (SIGBOVIK is a real conference that is more or less, and by design, a joke, but it sometimes has real, albeit usually ridiculous, results. This paper is one such thing.)
Really interested to see where your implementation goes!
Just found it. That paper was an entertaining read. Its approach is “purer,” I’d say, from a theoretical perspective as it generates complex functions from a small set of “ground” primitives, whereas inductive.js relies on the programmer explicitly listing the primitives for each function. This results in a trade-off. Solve attempts in inductive.js are generally faster; arbitrary operations like AJAX are easily represented; and it’s straightforward to compose programs of arbitrary size. On the other hand, utbp’s approach is simpler, and it’s definitely more convenient to be able to omit a list of operators. It’s an interesting field of study with a lot of thought-provoking papers out there, like utbp’s.
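The core idea both tools share can be sketched in a few lines: search for a composition of primitives that satisfies the given unit tests. This is a toy illustration (the primitives, the composition shape, and the names are all made up; real systems search far more cleverly than this brute force):

```python
# Toy inductive synthesis: find a program, built from primitives,
# that passes every given unit test. Everything here is invented
# for illustration.
from itertools import product

PRIMITIVES = {
    "add": lambda x, y: x + y,
    "mul": lambda x, y: x * y,
    "sub": lambda x, y: x - y,
}

def synthesize(tests):
    """tests: list of ((x, y), expected) pairs. Try each primitive,
    then compositions of the shape f(g(x, y), y), returning a readable
    description of the first program that passes every test."""
    for name, f in PRIMITIVES.items():
        if all(f(*args) == want for args, want in tests):
            return name
    for (n1, f), (n2, g) in product(PRIMITIVES.items(), repeat=2):
        if all(f(g(*args), args[1]) == want for args, want in tests):
            return f"{n1}({n2}(x, y), y)"
    return None

prog = synthesize([((2, 3), 5), ((1, 1), 2)])
```

The trade-off discussed above shows up even here: a fixed global primitive set keeps the interface minimal, while listing primitives per function (inductive.js-style) shrinks the search space.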
Sorry – I would have linked, but was on my phone.
Ah, interesting! I’ll have to look into your approach a bit more, but this is a great project!
This is the thing that will get me to enjoy writing tests.
I’m working on scraping the Chicago Transit Authority’s bus tracker API to build a database of bus stop events (i.e. times when a given bus stopped at a given stop). The idea is to provide a tool that will let people advocate for better transit by seeing how the actual frequency compares to the scheduled frequency. By storing everything in PostGIS it should be really easy to do things like compare median wait times across different neighborhoods or wards.
Are you using GeoJSON? I would really appreciate a blog post or two on the techniques used and problems encountered.
No, I’m not using GeoJSON; the data comes in as XML, with elements containing a latitude and longitude, which I map to PostGIS POINTs. So far I haven’t built any front end to this thing; I’m still working on polling the API and storing the results in the database. But when I get a bit further along I will definitely write something up.
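The XML-to-POINT step might look roughly like this. This is a hedged sketch: the element and field names below are invented, and the real BusTracker API’s schema differs.

```python
# Sketch of mapping XML lat/lon records to PostGIS-ready WKT POINTs.
# The <vehicle>/<lat>/<lon> names are invented for illustration.
import xml.etree.ElementTree as ET

def vehicle_points(xml_text):
    """Parse <vehicle><lat>..</lat><lon>..</lon></vehicle> records into
    WKT POINT strings suitable for a PostGIS geometry column."""
    root = ET.fromstring(xml_text)
    points = []
    for v in root.iter("vehicle"):
        lat = float(v.findtext("lat"))
        lon = float(v.findtext("lon"))
        # WKT puts x (longitude) first, then y (latitude).
        points.append(f"POINT({lon} {lat})")
    return points

sample = """<response>
  <vehicle><lat>41.8781</lat><lon>-87.6298</lon></vehicle>
</response>"""
```

On the database side these strings would feed into something like `ST_GeomFromText(%s, 4326)` in the INSERT.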
Gensym just joined lobste.rs: https://github.com/gensym/ctacruncher. Maybe you two could team up?
also take a look here for some inspiration http://bdon.org/2014/07/13/realtime-transit-data-howto/
Tying up loose ends on a couple personal projects. I released theft, a property-based testing library for C, and am working on getting an MQTT client library for very low memory embedded systems ready to release as well. Most of the remaining work is on the documentation.
I’m also finishing up an embedded project’s bootloader, and infrastructure to do reliable firmware upgrades over unreliable packet radio.
I’m finishing my work on subtitle upscaling today, at least for now. I ended up doing a few dirty hacks, like diving into codec-specific private data in FFmpeg, because the information I need to decide when to scale is not exposed through their subtitle API. It seems to work fine on my desktop, but I still need to test it on-device (it’s an embedded product).
In my spare time, I’ve been helping out a friend with his master thesis, which is a recommender for recommender engines (very meta). It extracts certain properties about a dataset of reviews or purchases, and then uses Weka to predict which recommender engines and configurations are likely to work well for this dataset. No idea if it will actually work.
I’ve also been playing with terrain generation, working on a modified diamond-square algorithm to generate height maps for a sphere using the Peirce quincuncial projection.
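For reference, the plain (unmodified, flat-grid) diamond-square algorithm looks something like this sketch; the sphere/projection modifications would layer on top. The roughness value and RNG choices here are arbitrary:

```python
# Minimal plain diamond-square: generates a (2**n + 1)-square
# heightmap with zeroed corners. A sketch, not tuned for quality.
import random

def diamond_square(n, roughness=0.5, seed=0):
    size = 2**n + 1
    rng = random.Random(seed)
    h = [[0.0] * size for _ in range(size)]
    step, scale = size - 1, 1.0
    while step > 1:
        half = step // 2
        # Diamond step: each square's center gets the corner average.
        for y in range(half, size, step):
            for x in range(half, size, step):
                avg = (h[y-half][x-half] + h[y-half][x+half] +
                       h[y+half][x-half] + h[y+half][x+half]) / 4
                h[y][x] = avg + rng.uniform(-scale, scale)
        # Square step: edge midpoints average their in-bounds neighbors.
        for y in range(0, size, half):
            for x in range((y + half) % step, size, step):
                nbrs = [h[ny][nx] for ny, nx in
                        [(y-half, x), (y+half, x), (y, x-half), (y, x+half)]
                        if 0 <= ny < size and 0 <= nx < size]
                h[y][x] = sum(nbrs) / len(nbrs) + rng.uniform(-scale, scale)
        step //= 2
        scale *= roughness
    return h

grid = diamond_square(3)
```

The interesting part of the sphere variant is presumably the square step's neighbor lookup, which has to wrap across the projection's seams instead of clamping at the grid edge.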
Last week, I pushed a few more commits to hython (https://github.com/mattgreen/hython).
I started in on classes, objects, and attribute dictionaries. The latter is a key part of Python, so I have some work in front of me to express almost everything else in terms of it. In the process I learned there’s no real magic in implementing OO, just book-keeping. I’ve been accruing technical debt in the parser, so I’m going to have to start paying that down to support method calls properly. (I don’t want to switch away from Parsec, but it might happen.)
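The "no magic, just book-keeping" point can be made concrete with a toy object model (this is an illustration of the idea, not hython's actual representation): classes and instances are nothing but attribute dictionaries, and a method call is a lookup plus an explicitly passed self.

```python
# Toy object model: attribute dictionaries all the way down.
# Names and structure here are invented for illustration.

class ToyClass:
    def __init__(self, name, attrs):
        self.name = name
        self.attrs = attrs          # the class's attribute dictionary

class ToyInstance:
    def __init__(self, cls):
        self.cls = cls
        self.attrs = {}             # the instance's own attribute dictionary

    def get(self, name):
        if name in self.attrs:      # instance dict first...
            return self.attrs[name]
        return self.cls.attrs[name] # ...then fall back to the class dict

    def call(self, name, *args):
        # A "method call" is just attribute lookup plus binding self.
        return self.get(name)(self, *args)

Point = ToyClass("Point", {"magnitude": lambda self: abs(self.get("x"))})
p = ToyInstance(Point)
p.attrs["x"] = -3
```

Inheritance, in this picture, is just more book-keeping: a chain of class dicts to fall back through.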
I’m also excited to sit down and use the ContT monad to express control flow better, but I need some time to play with it.
More Rust docs. The guide is pretty close to getting finished! Five more sections. I think that means it’ll be wrapped up next week.
I really, really, really need to write a second post for metaphysics.io. I want it to be an introduction to Marx that doesn’t say it’s about Marx until the end. I also want to write a post about “Disrupting Disruption,” and how ‘disruption’ has a whole lot of negatives associated with it that we don’t talk about.
Thanks for the great docs! As I just posted in another comment, I’m returning to Rust now and trying to learn my way through it again. It’s much easier now with some documentation.
Awesome. Please open issues and tag me with any weaknesses you find. They’re still far from perfect.
I was thinking there needs to be a disrupting disrupting startup that allows entrenched players to remain entrenched. No wait, that is a common startup strategy. Move rapidly and then sell technology back to the giants who would move there anyways (VMWare, Salesforce, Cisco, Autodesk).
Creative Disruptive Destruction opens possibilities and makes the world a better place, though very hard to do.
I look forward to your post, the world needs more meta.
I’m continuing work on merging MongoDB 2.6 work into TokuMX. It’s not glamorous but it’s well worth doing. The changes in MongoDB 2.6 are largely refactoring efforts, but they’re leading to some good code cleanup (though some is in danger of introducing performance regressions, we’re being really careful about that) and some nice features or general improvements. In particular, the routing in the sharding layer has gotten a lot better, it’s more careful, handles batches much better, and issues operations to multiple shards concurrently, which has the potential for massive throughput improvements, especially for hashed sharding.
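The concurrent scatter-gather idea can be shown in miniature. This is a hypothetical Python sketch (MongoDB's router is C++, and the real one handles errors, retries, and merging far more carefully); the shard "query" is a stand-in function:

```python
# Miniature scatter-gather: issue all shard requests at once and
# merge the results, instead of querying shards one by one.
from concurrent.futures import ThreadPoolExecutor

def query_shard(shard):
    # Stand-in for a network round trip to one shard.
    return list(shard["docs"])

def scatter_gather(shards):
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        results = pool.map(query_shard, shards)   # runs concurrently
    return [doc for batch in results for doc in batch]

shards = [{"docs": [1, 2]}, {"docs": [3]}, {"docs": [4, 5]}]
docs = scatter_gather(shards)
```

The throughput win comes from total latency being roughly the slowest shard's round trip rather than the sum of all of them, which matters most for hashed sharding, where most operations touch every shard.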
I’m trying to do more work on my thesis.
I found this pomodoro tool (https://github.com/tobym/pom), which I think someone posted here, useful for forcing myself to sit and think about stuff.
Spent some time over the weekend finishing up Creek. This week, I’m not working on anything technical outside of the day job. I’m trying to get everything ready before I move to Berkeley in two weeks’ time to start grad school.
You might find this VLIW DSP interesting, http://www.chipwrights.com/cw5631_table.php
I’m wrapping up a W3C CSS3-compliant tokenizer and parser in Go. The project originally started as a port of LESS to Go but I realized that doing a solid CSS3 parser first would be easier. Ultimately my goal is to make a fast asset pipeline tool (LESS, SASS, Minifier, etc) with zero dependencies. Just download and run.
Also, I’m looking for suggestions for names for the asset pipeline tool. And no, “asspipe” is not a good name. :-/
Do you have a public repo for it? I’d be very interested in taking a look.
I had a good chuckle at “asspipe”, but given that it’s a tool that would “sew” various assets together, why not the play on words “sewer”?
It’ll be at https://github.com/benbjohnson/css. I’m cleaning it up a little bit right now before I open it up publicly. It uses a similar setup to the go/* packages. The CSS3 spec has additional information per-token so I’m splitting the tokens up into their own types.
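The "one type per token" idea can be illustrated with a toy tokenizer. To be clear, this is a Python illustration of the structure, not the Go implementation, and it is nowhere near the real spec's tokenization rules:

```python
# Toy illustration of per-token types: distinct classes instead of a
# single tagged struct, so each type can carry its own extra fields.
import re

class IdentToken:
    def __init__(self, value): self.value = value

class NumberToken:
    def __init__(self, value): self.value = value

class DelimToken:
    def __init__(self, value): self.value = value

def tokenize(css):
    """Grossly simplified: real CSS3 tokenization has many more token
    types (hash, string, dimension, ...) and escape handling."""
    tokens = []
    for m in re.finditer(r"[-a-zA-Z_][-\w]*|\d+(?:\.\d+)?|\S", css):
        text = m.group()
        if text[0].isalpha() or text[0] in "-_":
            tokens.append(IdentToken(text))
        elif text[0].isdigit():
            tokens.append(NumberToken(float(text)))
        else:
            tokens.append(DelimToken(text))
    return tokens

toks = tokenize("margin: 1.5")
```

With separate types, spec-mandated per-token data (a dimension's unit, a hash token's type flag) gets a natural home as fields on the relevant class only.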
Here’s the spec I’m going off of: http://www.w3.org/TR/css3-syntax
I find it strangely fun to implement lexers and parsers. I’m not sure why. :)
Continuing to have more fun with OpenBSD and flashrd, as time allows. They gave my Alix board a good refresh.
In my free time I’m starting to mess around more with Rust. I played with it a little a long time ago, but that was well before Cargo and many other new things. I’m not working on anything in particular with it just yet, just re-learning my way through it, since I never really developed a feel for it the first time.
At $WORK we are continuing to learn about AWS, preparing to move a project there. Our business is quite spiky, so it’s mainly for peak scaling, but also flexibility in tooling & access for developers.
No projects lined up for home, unless you count working hard at relaxing. Playing board games after work with some colleagues tonight.
Following a friend blogging about taxicab numbers, I spent the evening figuring out a way to do this without much manual intervention. I started off with Ruby & SQLite, and am currently using Bash (hell yeah!) scripts & MySQL as the data store, maxing out my laptop somewhat as it generates cubes of numbers and adds them together. The SQL query to find a taxicab number for a given value of n sums is pretty quick though, yay for SQL!
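For comparison, here is a database-free sketch of the same search: find the smallest number expressible as a sum of two positive cubes in at least n distinct ways, brute-forced in memory instead of via SQL. The search limit is arbitrary:

```python
# Brute-force taxicab search: group sums of two cubes by value,
# then take the smallest value with enough representations.
from collections import defaultdict

def taxicab(n, limit=50):
    """Search pairs (a, b) with 1 <= a <= b <= limit. Truncating at
    `limit` can only miss representations, never invent them, so any
    answer returned is genuine."""
    ways = defaultdict(list)
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            ways[a**3 + b**3].append((a, b))
    hits = [s for s, pairs in ways.items() if len(pairs) >= n]
    return min(hits) if hits else None
```

taxicab(2) recovers the famous 1729 (1³ + 12³ = 9³ + 10³); the SQL version is essentially the same GROUP BY ... HAVING COUNT(*) >= n query over a precomputed sums table.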
For work, some classic Rails app related to airlines, not that fun so far, but it’ll become much more interesting when we start crunching large amounts of data.
On the hobby side, I slowed my Clojure readings and now focus on coding instead. I started building some Rails backend with a Clojurescript frontend. So far it’s really fun to code, Om and Clojurescript make the front end much more interesting and meaningful than it used to be. I could do the backend in Clojure instead of Rails but it’s so damn easy to build a basic backend when you’re used to Rails and it allows me to focus on client-side in the meantime.
It also allows me to spark interest for Clojure in my environment, which is mostly choosing Rails + AngularJS for every project no matter what.
Last week I added compression to messages in Fire★. I got distracted doing that instead of working on my secret game FCFODS.
This week I am going to work on the game, though I am a bit discouraged now, reading all the posts about how much money indie game makers make. Sad, really. If I get 100 bucks out of this, I will be happy. It is that sad.
At work, I’m testing and adding polish to our Ubuntu OpenStack Installer.
At home I have been working on my hacky script to import photos into git-annex and make use of their EXIF GPS metadata to add annex metadata: git-annex-photo-import (github). My idea is to use git-annex views to manage all my photos and the S3 & flickr remotes for backup.
Right now the biggest problem is that views are very slow (here’s a short thread on the git-annex forum explaining it). Joey’s plan is apparently to use a SQLite cache DB to index the metadata, but I’m not sure what the status of that is.
BTW, if you have a much better photo management workflow, can you take this public gdocs survey and share what it is? Organizing and archiving photos is such a mess, and I really want to know if anyone’s got it figured out cleanly.
You guys are using github now? When I worked at Canonical everything was entirely in bzr and Launchpad. Is this just for a couple specific projects or are you guys kind of shifting in that direction?
This is just for a few projects. There is no overall shift I’m aware of. Launchpad does a lot of things that github doesn’t and probably won’t (and for good reason, not every project is a distro).
Our cloud-installer project also uses Launchpad for PPA building and bug tracking.
Launchpad is in fact still under development - just recently they rolled out a beta of inline comments on merge proposals, which (aside from cleaner design) was one of the few major things I liked better about github vs. LP.
Yeah, Launchpad does a ton of stuff. Github’s issues are very primitive, and Launchpad has features for translations and stuff that I can’t imagine Github having much use for. But I do kind of wish they would add support for git to Launchpad, so I kind of wondered if they might be moving a little more towards git.
I’ll be keeping an eye on your git-annex import project, I’ve been searching for something similar to help manage my photos/images.
Just so you know, I don’t have a lot of future plans for that specific project, except just to ensure that it’s smooth to add new pictures as I take them. It worked fairly well for me, and now I’ve imported all my backlog of photos.
It could probably use some attention to be useful for others - e.g. it includes some U.S.-specific heuristics in the location handling (i.e., it looks for ‘Country’, ‘County’, and ‘State’). I’d be happy to help with fixes and would welcome forks.
The things I’d most like to improve about this setup are the view performance scaling issue and adding support for thumbnails. Ideally I’d love to see git-annex core support a ‘content summary’ that lets you preview files that aren’t currently in an annex; for images, that’d be a thumbnail.
I am taking two weeks of vacation (well deserved, I may add). I am currently on a bike ride (I’m taking a break at the halfway point; I don’t ride and text) and I hope to keep doing non-CS things and recharge my batteries. I’m also playing Super Ghouls ’n Ghosts in the evenings.
I’m studying Mandarin.
I’m doing a lot of thinking lately, both about personal things and about things I’d like to fix technically. I have a bunch of projects on the back burner that I won’t be picking up this week. If I do write any code, it’ll likely be experimenting with Cairo to build some simple visualization tools that can be used in command pipelines.
I wrote this awful one-liner the other day and wanted to append | linegraph to it and have it spit out a PNG to see what it looked like, but that proved trickier than I had hoped.
grep -C 1 '%MEM' pidstat.144731 | grep -v '%MEM' | grep -v -E "^$" | grep -E -v "^--" | awk '{while ( ("date +%s --date=\"07/31/2013 "$1"PM\"" | getline result) > 0 ) { print result" "$8 } }'
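A Cairo-free stand-in for that imagined linegraph filter is easy to sketch: read the "timestamp value" pairs the one-liner emits and scale each value to a bar. This is hypothetical (the tool doesn't exist); a real version would render a PNG instead of ASCII:

```python
# Hypothetical `linegraph`-style filter: turn "timestamp value" lines
# into bars scaled against the maximum value seen.
def ascii_linegraph(lines, width=20):
    pairs = [line.split() for line in lines if line.strip()]
    values = [float(v) for _, v in pairs]
    top = max(values) or 1.0          # avoid dividing by zero
    return [t + " " + "#" * round(v / top * width)
            for (t, _), v in zip(pairs, values)]

bars = ascii_linegraph(["1375300000 2.0", "1375300060 4.0"])
```

Hooked up as `... | linegraph`, the same function would just read sys.stdin instead of a list.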
I’m working on getting Whitane Tech, my Go consulting business, off the ground. I’ve got my first client (a startup using App Engine) and I’m putting together a company site using Hugo. The website needs work.
I’m writing a case study for switching to Go, based on a post I wrote previously. I’m hoping people looking to adopt Go can use my case studies to get buy-in from the rest of their team.
I plan on quitting my day job by the end of the year. I’m writing down my goal in public to help force myself to commit. Wish me luck!
I’m working on an API which will index white papers for keywords and full-text search and let you retrieve URLs for those papers via the API. I’m building it using Elasticsearch, with Go for the web server.
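The keyword-search half of what Elasticsearch provides here is, at heart, an inverted index from terms to document URLs. A toy in-memory version (paper titles and URLs below are made up) looks like this:

```python
# Toy inverted index: what keyword search over papers boils down to.
# Elasticsearch adds analysis, ranking, and persistence on top.
from collections import defaultdict

class PaperIndex:
    def __init__(self):
        self.index = defaultdict(set)   # keyword -> set of URLs

    def add(self, url, text):
        for word in text.lower().split():
            self.index[word].add(url)

    def search(self, *keywords):
        """URLs containing every keyword (AND semantics)."""
        sets = [self.index[k.lower()] for k in keywords]
        return set.intersection(*sets) if sets else set()

idx = PaperIndex()
idx.add("http://example.com/raft.pdf", "consensus replicated log")
idx.add("http://example.com/paxos.pdf", "consensus agreement protocol")
```

The API layer then just wraps search() in an HTTP handler and returns the URL set as JSON.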
Massive piles of refactors at $WORK, par for the course.
Learning O'Caml in my spare time and putting together a minimal website for making training schedules for a race (running), based on a given date.
FYI, there’s no apostrophe in OCaml.
The TREC Microblog Track.
I’m working on the Tweet Timeline Generation task, while my colleague is working on the ad-hoc search task.
I also created a set of Go bindings for the Thrift server used.