As per the usual …
Feel free to share what you’ve been working on here. Also mention if you need advice, help, or a second pair of eyes.
This week, I’m working on 3 different things. This feels like I wrote a small book.
The new thing that I’m doing this week is a lot of re-learning about ontologies, OWL2 and related technologies. I’ve been doing some experiments with providing context for content using named entity recognition and DBPedia. In the end, I don’t think that this is going to be enough (at all) for what I want to do, so there’s a lot more work to do and things to figure out. I’m thinking of starting a separate blog about this, as the act of writing about it helps get my thoughts together. I’m also working on some HTML / CSS / JS prototypes of how some of the tools that I’m thinking about might work, how the user might experience the annotations, and what’s going to be an interesting path forward (rather than a dead-end).
Second is Open Dylan. I published a couple of posts on the Dylan Foundry blog recently. One was the Type System Overview, which got posted here on lobste.rs. The other was a sign of what I hope the future will bring: Saying Good-bye: HARP. This was a post where I indicated my intention to remove 1 of our 3 compiler back-ends despite the fact that it will bring some short term loss of functionality.
I had a lot of discussions with people in the last week about the future of Dylan, why they have had trouble getting involved, and where we should focus our efforts. I’ve now started a new thread asking people to discuss what they value about Dylan and what we see the core values of Dylan as being. My post in the thread can be found here. I’m still hopeful that more people will respond as this is a holiday weekend in the US and ICFP is also happening now, so many people are traveling or otherwise occupied.
One of the things that came up repeatedly was how difficult people find it to actually build Open Dylan and start hacking on the compiler. I believe there are many projects that don’t involve that level of dedication, but at any rate, this has come up often. As a result, we’re looking to start removing some parts of the codebase that may not be necessary today and we’ll hopefully be able to simplify some other areas of the codebase. One big place to start, as mentioned above, is our compiler back-end HARP. This leaves us with a functioning C back-end and an in-progress LLVM back-end.
I also have a patch that I’ll submit as a pull request soon that implements an extension to our method dispatch code that was a research project about 15 years ago. Unfortunately, while our method dispatch code is slow and this experimental extension was intended to speed it up, it doesn’t actually do so on a modern machine. It also increases the memory footprint of applications. We’ve never been able to confirm that this code is fully functional or bug-free as it is quite complex. We’d rather fix a number of issues with our dispatch code by replacing it with something simpler and based on other techniques rather than try to get the current technique to perform better. (Part of the problem is that it chases through memory a bit too much.)
I’m also working on putting together some other proposals for some projects that would improve Dylan. Some would make it easier to hack on while others would make it more useful for server-side applications. Among these are adding event logging to the core run-time in a way that is similar to what our GC does, what I’ve done for emscripten, and what GHC does. (Nothing new here, just some nice tools.) Another would be to change the C run-time to use standard type names and continue the process of making the code clearer that was begun earlier this year. We’d also like to improve the performance of our regular expression engine, get back to work on Unicode support, and a number of other things. There are also some opportunities that open up once the HARP back-end is gone, like adding SIMD primitives and a basic vector math library that uses them. Overall, some fun and relatively easy stuff.
Finally, I’m also still working on some stuff with emscripten and a memory / heap profiler. I’ve been integrating the memory tools with my client’s application and using it to diagnose some issues (and having to improve the tools while doing so). This has been pretty enjoyable and I’ve gotten to learn about tools like crossfilter.
I’ve been doing some experiments with providing context for content using named entity recognition and DBPedia. In the end, I don’t think that this is going to be enough (at all) for what I want to do, so there’s a lot more work to do and things to figure out.
Have you looked at or used Apache Stanbol yet? http://stanbol.apache.org
If not, Stanbol is an engine that is designed to do exactly what you’re talking about… you pass it content, and it does NER, concept extraction, entity-linking with dbpedia (and optionally your own custom entities), etc., and gives you the results back in JSON-LD.
We’re doing a lot of work with this for our startup… if you’re ever interested in chatting on this topic, feel free to give me a shout.
Dropped you a PM on here. This looks really interesting. Thanks!
Cool. Just replied to your PM and I’m on IRC now as well.
At work, I plan to continue working on my Haskell web service.
This week has seen a lot of work on improving our CA infrastructure, and working on adding revocation support to our open source CA software (none of which has made it public yet, as I’m still working out some of the details). Also, some work involving hardware security measures, which is both fascinating and maddening.
Would love to hear more about CA revocation as what I’ve heard so far makes it seem a bit dodgy.
I’m working on implementing the standard, pre-existing mechanisms for clients that require them, starting with CRLs. After that, we’ll see about OCSP (specifically to try to support OCSP pinning) and CT.
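For readers who have not worked with CRLs, the client-side check can be sketched roughly as follows. This is not the CA software being discussed here; the `RevocationList` type and its fields are hypothetical, and signature verification is left out entirely. The sketch only models the idea that a CA publishes a dated list of revoked serial numbers and that a client should not trust a list outside its validity window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class RevocationList:
    """Hypothetical in-memory model of a CRL (signature checking omitted)."""
    this_update: datetime
    next_update: datetime
    revoked_serials: set = field(default_factory=set)


def check_certificate(serial: int, crl: RevocationList, now: datetime) -> str:
    """Return 'revoked', 'good', or 'unknown' for a certificate serial."""
    # A list outside its validity window cannot vouch for anything.
    if not (crl.this_update <= now < crl.next_update):
        return "unknown"
    return "revoked" if serial in crl.revoked_serials else "good"


issued = datetime(2014, 9, 1, tzinfo=timezone.utc)
crl = RevocationList(issued, issued + timedelta(days=7), {0x1A2B, 0x3C4D})

status_revoked = check_certificate(0x1A2B, crl, issued + timedelta(days=1))
status_good = check_certificate(0x9999, crl, issued + timedelta(days=1))
status_stale = check_certificate(0x9999, crl, issued + timedelta(days=30))
```

The "unknown" result for a stale list is the awkward part in practice: the client has to decide whether to fail open or fail closed.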
Don’t forget to link it from Lobsters. We are currently using TinyCA at NTK and it sucks a lot. A replacement would be welcome.
Sure. It’s cfssl, and in my opinion, it’s as useful as a tool for building CAs with Go as it is as an actual CA. In fact, one of the reasons I went to work here was to build this; it’s something I was trying to build on my own, but it’s a lot of work to do in your spare time. Feature requests welcome.
I’m continuing work on Call to Speakers, my website for tracking open applications for speakers at conferences. I posted it on Hacker News a few weeks ago, and have seen steady growth in Twitter followers. This week I’m working on adding application forms directly to the site, so users won’t have to navigate (often confusing) conference websites to submit their talks. Filtering the list of conferences on the front page is also high on the list.
I’ve also been doing API reviews for companies here in SF. I comb through the documentation and play around with an API to find bugs, poorly designed features, and incorrect behavior. It may sound dull, but I find it exciting. It’s a great way to validate all the API work I’ve put in over the last three years. I’m hoping to finish two more reviews this week.
And as always, I’m working on the API at Stripe, addressing performance problems and developing new features. We just had an intern start last week, so I’m helping him get up to speed.
Nice. I pasted to my company Slack, and the preview appeared to be… “foo”. ;-) Be sure to change it.
<meta name="description" content="foo">
Whoops, all fixed :)
Is there anything out there that you would consider to be a good guide to designing (and documenting) a good API? What do you think of things like Swagger?
I’m not a huge fan of Swagger as I think the documentation it generates isn’t user friendly. I’m becoming a fan of Hyper Schema, simply because it’s machine-readable.
Last week, I got some more work done on Hython. I was pleased to get support for tuples in there, as the syntax is a bit strange, and the grammar has several spots that take either a single expr or a tuple. To fully support tuples, I also had to add support for the subscript operator, and built-in functions. I used the latter to spur a cleanup and refactor. I also got a blog post done yesterday on the motivations for the project. It’s a bit squishier than I would have wanted, but I figured I had to get that part out before I can get more technical.
This week: probably blog some more. I’ll probably do some more mechanical things on Hython, but I’ll need to start reading to know what to be looking toward.
Continuing to learn Common Lisp. Mostly just toy stuff like generating fractals, but also writing a library to process GPX files. For the GPX library, I’m porting a Perl library for converting latitude and longitude to UTM. Not as powerful as cl-proj, which wraps Proj.4, but lighter weight and easier to use. I’m hoping to plot the GPX tracks and calculate stats like total elevation gain, max speed, total distance, etc. Totally reinventing the wheel, but fun for learning.
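As a rough illustration of the track stats mentioned above, here is a minimal sketch in Python (not Common Lisp, and not the library being ported). It assumes the track has already been parsed into `(lat, lon, elevation_m)` tuples, and it uses the haversine great-circle formula on a spherical Earth rather than a UTM projection, so distances are approximations. The function names and the sample track are made up for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical model)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def track_stats(points):
    """Return (total_distance_m, total_elevation_gain_m) for a track.

    points is a list of (lat, lon, elevation_m) tuples in track order.
    """
    distance = 0.0
    gain = 0.0
    for (lat1, lon1, ele1), (lat2, lon2, ele2) in zip(points, points[1:]):
        distance += haversine_m(lat1, lon1, lat2, lon2)
        gain += max(0.0, ele2 - ele1)  # only climbs count toward gain
    return distance, gain


# Made-up sample track: one leg east with a climb, one leg north with a descent.
track = [(38.0, -78.5, 150.0), (38.0, -78.49, 170.0), (38.01, -78.49, 160.0)]
total_distance_m, total_gain_m = track_stats(track)
```

Real GPS elevation data is noisy, so a practical version would probably smooth the elevations before summing the gain.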
At work I’m continuing work on our resource and state tracking system.
I’ve been working on https://hex.pm again, and I’ve almost got some pull requests ready to get it closer to a production-ready state, including email verification, two-factor auth, multiple package repos and package signing. I’m hoping to have this all done by Elixir v1.0.0’s release!
Besides $work, I’m hardening my Rust CSV type based encoder/decoder and working on a new CSV toolkit for slicing, splitting, searching, joining, etc. (Think csvkit but hopefully faster and simpler.)
Learning my way around 0install to see if I can convince the felix developer to consider it as a language package manager, rather than rolling his own.
In the Dylan world, we’re thinking a lot about package management as well. It is pretty tempting to look at Nix in that regard. I’ve got a request out to someone to hopefully discuss that later this week. (He’s busy with work.)
0Install looks pretty interesting from that perspective, and it supports Windows, unlike Nix.
Do you know of any other language using 0install rather than rolling their own?
no, but from their webpage i saw the following:
The Ryppl project is using 0install as the package manager for a modular C++ build system, starting with a modularised C++ Boost library. This has driven many of the enhancements in 2.0, such as support compiling 0install packages on Windows. We hope that 0install will one day replace many of the language-specific packaging systems currently in use.
and while ryppl itself seems dead, i at least have the confidence that 0install itself has language-specific packaging as one of its explicit goals. The other big thing is that 0install seems committed to being truly cross-platform; like you i did get excited by nix as a possibility, but dropped the idea when i saw it did not support windows. for a language package manager that’s a pretty huge limitation to impose.
i’m actually not sure this will go anywhere, because the felix developer seems to think it won’t be a good idea, but i’m curious enough to at least give it a try.
I’m just trying to read and understand the Surface code paper to maybe give a talk on it, or at least write something up about it (it’s a scheme which is used to perform fault-tolerant large-scale quantum computation.)
Learning Ember.js, and looking lustfully at all the NLP-in-Clojure stories we have deep in the backlog.
Spinning up a prototype iPhone app that uses Bluetooth 4.0 to find things in your house.
Learning Ember.js for a new app for music practicing. And practicing fiercely on period instruments for the 2014 National Scottish Fiddling Championship, which is on Saturday.
I’m working on a few things.
$WORK is over for me. $SCHOOL is starting, so I’m figuring out my final schedule with labs and office hours and other things. No coursework besides reading yet.
I shipped Clojure code, though I’m sure it’s bad. I ported a Python and Scala library I’d seen before to Clojure. It’s my first shot at writing even really trivial Clojure code, so any code reviews or feedback would be really appreciated.
Congratulations on shipping!
Since you asked for it, a few thoughts on the code:
Silently suppressing all the IllegalArgumentExceptions is not a good thing. The best would be to reshuffle things so you never got them in the first place, but if nothing else you should surface something from the function.
I’d break it up into several namespaces/files. You de facto did this with some of your comments (e.g. ; Stopwords)—I’d go all the way.
Speaking of comments, I’d follow these guidelines on how many semicolons to use.
Try to avoid declare as much as possible.
Some of your function naming could be rethought (e.g. filter-stopwords-wordmap).
The dbs function is scary. Think long and hard before using atoms.
Make a file called config.edn that looks like this, and then use clojure.edn to load it:
I’m always wary of code review since it makes me look like an irritable grouch…again, congratulations on shipping and on an excellent start to Clojure!
No need to feel like a grouch, I wanted feedback! It’s all true and actionable, good traits of a good review. Thank you for helping out.
All the projects at once. That’s what it feels like
Blacktip, my clone of Boundary’s Flake, is now working, tested, and pushed to Hackage. I can generate 100,000 unique ids in 125 milliseconds - roughly twice as fast as Boundary’s Flake. I could make it faster still but haven’t devoted the time to it.
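For anyone who hasn’t seen Flake-style ids, here is a minimal sketch of the general scheme in Python. It is not Blacktip’s code or API. It assumes the 128-bit layout used by Boundary’s Flake as I understand it (64-bit millisecond timestamp, 48-bit worker id, 16-bit per-millisecond sequence); the class and parameter names are hypothetical.

```python
import time


class FlakeGenerator:
    """Illustrative Flake-style generator: timestamp | worker id | sequence."""

    def __init__(self, worker_id, clock=None):
        if not 0 <= worker_id < (1 << 48):
            raise ValueError("worker_id must fit in 48 bits")
        self.worker_id = worker_id
        # clock returns milliseconds since the epoch; injectable for testing
        self.clock = clock or (lambda: int(time.time() * 1000))
        self.last_ms = -1
        self.sequence = 0

    def next_id(self):
        now = self.clock()
        if now < self.last_ms:
            # Refuse to issue ids that could collide with earlier ones.
            raise RuntimeError("clock moved backwards")
        if now == self.last_ms:
            self.sequence += 1
            if self.sequence >= (1 << 16):
                raise RuntimeError("sequence exhausted for this millisecond")
        else:
            self.last_ms = now
            self.sequence = 0
        return (now << 64) | (self.worker_id << 16) | self.sequence


gen = FlakeGenerator(worker_id=0x0123456789AB)
ids = [gen.next_id() for _ in range(1000)]
```

Because the timestamp occupies the high bits, ids from a single generator sort in the order they were issued, and no coordination between workers is needed as long as worker ids are distinct.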
I wrote more material for my FP/Haskell book, it’s up to 31 pages of content now. I’m still hammering out the pedagogy/approach, but I’m converging on something more similar to the NICTA course or Zed Shaw’s Learn Python the Hard Way than to traditional Haskell books.
This week I’m seeing family so I’ll get less done than I’d like.
Goals for this week:
Write an HTTP service wrapper for Blacktip. Possibly also a Cloud Haskell wrapper (similar to Flake’s gen server).
Get ~5-10 more pages of the book written and/or get the current content refined a bit. Refinement would entail translating some of the prose into exercises or at least adding exercises.
Continuing to work on Quoddy and Neddick, preparing for the upcoming CED Tech Venture Conference demo sessions.
Also, working on my talk for the upcoming All Things Open conference, and working on a talk for the October Triangle Java User’s Group meeting.
And, since I spilled coffee on my laptop yesterday, working on getting my new laptop set up while in the middle of all that. :-(
On Tuesday night, gave a presentation at the PythonCharlottesville meetup group about Modern Python Concurrency. Got to learn about fun things like concurrent.futures, Python 3.4’s asyncio, and the yield from expression. Also had a nice discussion on process- vs thread-level concurrency and how Python compares to other languages in this regard. This week, I’m back at work on some Apache Storm + Cassandra + Elasticsearch data analysis and backend engineering tasks.
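To give a flavour of two of the pieces mentioned above, here is a small sketch (not taken from the talk): a thread pool from concurrent.futures, and generator delegation with yield from. The helper functions and sample data are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def word_count(text):
    """Stand-in for a task worth running concurrently (e.g. I/O-bound work)."""
    return len(text.split())


def flatten(nested):
    """Recursively flatten nested lists, delegating with `yield from`."""
    for item in nested:
        if isinstance(item, list):
            yield from flatten(item)  # hand control to the sub-generator
        else:
            yield item


docs = ["a b c", "d e", "f"]
with ThreadPoolExecutor(max_workers=3) as pool:
    # map() returns results in input order, regardless of completion order
    counts = list(pool.map(word_count, docs))

flat = list(flatten([1, [2, [3, 4]], 5]))
```

Swapping in ProcessPoolExecutor gives process-level parallelism with the same interface, with the usual caveats that the function and its arguments must be picklable and the pool should be created under an `if __name__ == "__main__":` guard.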
This week I wrote a tiny URL server in Go with a MySQL backend as a way to get myself familiar with the language and some of the language constructs that eluded me a bit earlier in my playing with it. I’m probably going to write a blog post about it, just to dive more deeply into different aspects of the code and further my understanding.
Simple, yet effective.
I have started reading GEB. The first few pages are very interesting. I will try to complete another chapter this week. I have started preparing for this year’s ACM ICPC (going through the past Codeforces contests). I have my mid-semester exams next week. Paying the price for not attending lectures :)
Added ability to export and install apps from files in Fire★. Also published example apps that you can drag and drop to install.
Also updated the website with gifs, glorious gifs!
At work we are continuing the migration to AWS for our main project. The last two weeks have been mainly setting up a Solr cluster for our esoteric use of Solr (that pretty much disqualifies any out-of-the-box vendors). This is on the list of things to fix, but it’s below getting the thing migrated so we can scale for the Black Friday sale ;-)
At home I’m still toying with Clojure. I’m implementing a game nobody’s heard of (called phage) using Clojure, ClojureScript, & Reagent. Eventually it will probably use web sockets, but I haven’t managed to get that far. The core game logic now more or less works, and initial rendering using Reagent also works–though they haven’t been linked yet. Comments welcome, as I’m pretty much fumbling in the dark here!