I’ll be doing a lot of reading. At my startup, our main focus largely reduces to “knowledge management.” And over the past few months I’ve been deliberately going back and reading (or re-reading in some cases) a lot of the books that I consider the “canon” of KM, looking to solidify my understanding of certain things, mine ideas for messaging from a sales and marketing viewpoint, mine ideas for new product features (or even whole new products), and so on.
Right now I’m working on The Fifth Discipline by Peter Senge. Once I finish that, I’ll probably move on to The Living Company by Arie de Geus.
If I take a break from all that, it will probably be to go to the gym and lift, or maybe get out and do some bicycle riding if the weather is nice.
What are the others? And if only one, which should someone buy to learn the most important concepts?
I’ve been meaning to write a blog post about this for a while, but haven’t gotten to it yet. But briefly, a few of the books that I consider “the canon of KM” would include:
Wellsprings of Knowledge
If Only We Knew What We Know
Winning The Knowledge Transfer Race
The Fifth Discipline
Business @ The Speed of Thought
The Living Company
If I had to pick one book that gets the key ideas across, I’d probably start with Wellsprings of Knowledge.
Interesting. I’m finding at my company that we have a… deep need for improvement with regards to knowledge management (not a subject I know a lot about, unfortunately.) It seems like most efforts for improvement are rapidly lost in the background noise and “what actually happens” ends up fairly random.
Still working on stuff for https://www.neuralobjects.com. Specifically, working on getting Zeppelin into the basic cluster provisioning process. Once that’s working, I’ll probably start on putting DL4J in, and then move to working on the training API and prediction API components.
There’s also work to be done on the back-end, related to billing. And then there’s a lot of general code cleanup, refactoring, UI tweaks, etc. to be done.
Also, a lot of copy needs to be written for the website itself, and marketing collateral and what-not.
So all of that will be keeping me quite busy for several more weeks.
Still working on NeuralObjects.com stuff. Right now I’m focusing on integrating GPG so we can email sensitive information to customers (login credentials and the like). So I’ll be implementing some code to allow users to upload and manage public keys and revocations, as well as backend code to encrypt data when needed.
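Conceptually the key-management piece boils down to something like the following toy sketch (Python here just for illustration; the real backend is Java/OSGI, and every name below is invented):

```python
class PublicKeyStore:
    """Toy in-memory store for per-user GPG public keys and revocations."""

    def __init__(self):
        self._keys = {}     # user_id -> {fingerprint: armored_key}
        self._revoked = {}  # user_id -> set of revoked fingerprints

    def upload_key(self, user_id, fingerprint, armored_key):
        # In the real service this would also validate the key material.
        self._keys.setdefault(user_id, {})[fingerprint] = armored_key
        return fingerprint

    def revoke_key(self, user_id, fingerprint):
        if fingerprint not in self._keys.get(user_id, {}):
            raise KeyError("unknown key for user")
        self._revoked.setdefault(user_id, set()).add(fingerprint)

    def active_keys(self, user_id):
        """Keys eligible for encrypting outbound mail to this user."""
        revoked = self._revoked.get(user_id, set())
        return {fp: key for fp, key in self._keys.get(user_id, {}).items()
                if fp not in revoked}
```

Before emailing credentials, the backend would look up `active_keys(user)` and hand the armored key to whatever GPG library does the actual encryption.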
Once that’s done, I have another task queued up to setup a scheduled job to automatically renew our Let’s Encrypt certificates.
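For the renewal job itself, a cron entry is probably all that’s needed (illustrative only; this assumes the certbot client and typical paths, which may differ on our boxes):

```shell
# /etc/cron.d/letsencrypt-renew (illustrative; assumes the certbot client)
# Attempt renewal twice a day; certbot is a no-op unless a certificate
# is actually within its renewal window.
17 3,15 * * *  root  /usr/bin/certbot renew --quiet --post-hook "systemctl reload nginx"
```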
Once those are done, we’re real close to having enough of the provisioning stuff done to support a beta test. But there’s still work to be done on the Job API and the Prediction API before then as well. So hopefully by next week or so I can start shifting my focus to one of those areas and put the provisioning stuff to bed for a little while.
Oh, and I also have some stuff I want to do in terms of internal analytics / reporting. So at some point I’ll be looking at pulling some data from the Twitter API and possibly other social media sources, as well as Google Analytics.
And then when I’m not working, I’ve got a couple of books I’m reading. I’m chugging through The Penguin History Of The World for one.
Outside of the dayjob, finally back to hacking on neuralobjects.com stuff a bit. I finally finished some work I was doing on having the provisioning process for clusters update a db record in stages throughout the process, and added a service that we can call to get the current “percent provisioning complete” for a given cluster. Now I need to wire that into the UI so that the “manage clusters” page shows a progress bar for “still provisioning” clusters, along with the percentage complete. Related to that will be adding the ability to send an email or an XMPP message or something at the end of the process.
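The staged-progress idea, in miniature (an illustrative Python sketch, not the actual Java service; the stage names are made up):

```python
class ProvisioningTracker:
    """Records completed provisioning stages and reports percent complete."""

    STAGES = ["allocate_nodes", "install_packages", "configure_spark",
              "start_services", "smoke_test"]

    def __init__(self):
        self._done = set()

    def complete(self, stage):
        if stage not in self.STAGES:
            raise ValueError("unknown stage: %s" % stage)
        # In the real service, this is where the db record gets updated.
        self._done.add(stage)

    def percent_complete(self):
        return int(100 * len(self._done) / len(self.STAGES))
```

The “manage clusters” page would then just poll a service endpoint that returns `percent_complete()` and render the progress bar from that.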
I do a lot of bicycling. MTB'ing is my real passion, but I tore my rotator cuff late last year (in a MTB crash, go figure) and had surgery in January of this year, and my surgeon isn’t clearing me to start mountain biking again until the first week of July. So just a little longer… but in the mean-time, I’ve been doing a lot of road riding just to get back in shape and get some miles in.
Outside of that, I read a lot. Currently I’m (re) reading Asimov’s Foundation series. And then there’s Netflix. Right now I’m going through House of Cards (just started Season 2).
I’m also taking some classes on Coursera, but I don’t really think of that as a “spare” time activity.
And last but not least, I hang out at the local hackerspace sometimes, and tinker with various stuff. I just ordered a bunch of parts for a new project I have in mind. I want to do the “retrocomputer” thing and build myself a homebrew Zilog Z80 based computer. My Z80 chips should be coming in on Wednesday!
Very cool! What OS do you plan to run on it? CP/M?
Probably so, yeah. It’s pretty much that or take a stab at writing my own OS. And as much as I’d like to do that, I’m not sure I have the time or the requisite knowledge. But time will tell, I suppose.
Depends on where your interests lie. If I had the time and built a thing like you’re describing, I’d likely do something like implement an ethernet driver or maybe parts of a TCP/IP stack.
Well, just to give some context… when I was in college in the 90’s, my dream (then) was to graduate with my C.S. degree, and land a job with IBM at the Boca Raton site, working on OS/2. :-)
That didn’t happen for a variety of reasons, but I’ve always been fascinated with OS level stuff. It just happens my career didn’t wind up taking me in that direction. So yeah, I’d still love to take a stab at writing at least a minimal OS, it’s more a question of time than anything. And yeah, implementing ethernet and TCP/IP would be awesome as well.
I’ve seen some homebrew Z80 projects with support for USB, SD-cards and other modern contrivances, so I am hoping I’ll be able to do some fun stuff with this over time.
Finally back to working on https://neuralobjects.com stuff again. The last couple of weeks have been fairly unproductive for various reasons, but I’m hoping to get a lot done this week. Right now I’m reworking the build process a little, and about to deploy a new build to test out some changes in the way we package and build the system. Basically, we use ServiceMix as a container for, well, services. Before I was just dropping our bundle in the deploy directory with a bunch of our jars, but I finally got around to setting up the pom to build a kar (Karaf Archive) feature file.
Functionality wise, I’m still working on having the provisioning process update an object in the database as it works, so we can show the user a progress indicator as it works. Once that’s working, I plan to add a feature to alert the user (via email, xmpp, etc.) when an environment provisioning job is done and an environment is up and ready for use.
Once that is done, I need to do a little more work on the billing stuff (a portion of the monthly charges will be variable depending on resources used) and more work on the API for interacting with the environment (that is, the API for uploading data, defining models, triggering training jobs, making predictions with a model, etc.).
And then I think we’ll be ready for an alpha / soft launch. Piece of cake, right? :-)
Well, for last week I finally made a lot of progress on some backend work for https://neuralobjects.com that I was working on. I had been briefly held up by some issues getting JPA working in an OSGI environment (we use ServiceMix as the container for our backend services), as my first stab at this was based on Spring Data JPA, and Pivotal no longer package the Spring jars as OSGI bundles… and I ran into dependency hell trying to get it all to work.
Then, I realized that I really don’t need Spring Data for this, and since Hibernate ships with OSGI support, I switched it all out to use Hibernate directly and got that working. Now I have to finish defining some domain classes, and plug that code into my provisioning service so that it updates the database in response to provisioning operations.
Next steps for this week: wire that up to a progress indicator in the web interface, and set up something to send an email, xmpp message, etc. when the provisioning of a new environment is complete.
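The notification piece can be a thin fan-out layer on top of that (again a toy Python sketch; the actual email/XMPP senders are stand-ins):

```python
class CompletionNotifier:
    """Fans a 'provisioning done' event out to whatever channels a user chose."""

    def __init__(self):
        self._channels = []  # callables like send_email, send_xmpp

    def register(self, channel):
        self._channels.append(channel)

    def provisioning_complete(self, cluster_id):
        message = "Cluster %s is provisioned and ready for use." % cluster_id
        for channel in self._channels:
            channel(message)
```

Each delivery mechanism (SMTP, XMPP, whatever comes later) is just another callable registered against the user’s preferences.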
Also, taking a course on Statistical Inference on Coursera, so I’ll have homework to do for that.
Holy shit! An “under construction” gif! Blast from the past! Cool!
LOL, yeah, I couldn’t resist. And other than posts here and barnacl.es, we haven’t really publicized this much yet and the robots.txt is set to disallow all crawlers, so we don’t expect many eyeballs on the site just yet. So I figured a cheesy 90’s “under construction” gif would be a nice touch until we’re ready to release to a broader audience.
Still working on https://neuralobjects.com stuff. Specifically now, I’m working on updating the backend service that does all the cluster provisioning to update a database record as it progresses, so we can show a graphical indicator in the UI. I also want to be able to send an email / XMPP message / etc. at the end when everything is up and running.
Unfortunately I got hung up in a bit of a rabbit hole on this over the weekend. I prototyped some of the db stuff using Spring Data JPA which works fine, but the services are deployed as OSGI bundles in ServiceMix… and a lot of the Spring jars are not available as OSGI bundles, and I ran into a lot of issues getting that all to work. So now I’m debating if I want to switch to using Apache Aries or something, or spend the time learning to work with bndtools so I can wrap/repackage the Spring jars myself. sigh
Edit: this might not be so bad. I just realized that I don’t need to go as far as using Aries, which provides a pile of “Enterprise OSGI” features… Hibernate has an OSGI specific bundle which (purportedly) makes it pretty easy to just use vanilla Hibernate (with or without JPA) from inside OSGI. If this works the way I expect, I should have this database stuff done this week with no problem. fingers crossed
Got the Hibernate “Unmanaged JPA” OSGI demo app running in ServiceMix, so this should be downhill now. I just need to wire up Hibernate into my service and fix up a few small things, and that bit will be done at least. Hopefully I can get this done tomorrow or Friday.
I have mixed feelings about this, but as much as I dislike aspects of systemd, and have low regard for the way Poettering conducts himself, I am not exactly “up in arms” about this. Given that there’s a config option to change this behavior, as well as the linger stuff… well, I can live with this. I understand being upset about breaking a long-standing default behavior, but at least they didn’t change the behavior and leave no recourse to correct things.
And I say this as somebody who routinely uses detached screen sessions for long running processes, at least early on when I’m prototyping things or experimenting. If I have to twiddle a setting in /etc/whatever.conf to keep doing that, that’s tolerable in my world.
Definitely FUD. There was a lot more subtlety to this decision than “you can copy anything and claim fair use”. The specifics DO matter. Copying API declarations / function signatures / etc. is apparently fair use, at least in some cases. Nothing even guarantees this fair-use defense would hold up in a separate case with different details. But more to the point, this case wasn’t about the actual implementation code at all (barring the thing about rangeCheck, which was trivial). It’s not like Google were accused of copying the entire JDK library source code or anything. If that had been the case, the outcome would likely have been different - and rightly so.
So programming as we know it hasn’t ended? Sigh… Back to work.
Maybe it hasn’t but the court ruled that APIs are copyrightable so this might be a Pyrrhic victory in the long run.
I’m pretty sure that was the default state before this trial even began though… and that isn’t any sort of binding precedent, as I understand it. The lower court ruled that APIs aren’t eligible for copyright, the higher court disagreed, and the SCOTUS declined to get involved. But the declination by the SCOTUS is very specifically not considered an endorsement of the other court’s decision, if I understand correctly. IOW, the SCOTUS could still rule against API copyrights in some other, yet to be originated, case.
And even now, what happened in this case isn’t binding precedent for regular District Court circuits, if I’m reading this right. This narrowly pertains to the Federal Circuit where patent claims are heard (this case went to that circuit because the original case included some patent claims).
Somebody who is a lawyer please correct me if I’m wrong, but I think this is a bump in the road, as it stands right now, not a sheer cliff face or whatever.
Just imagine sitting there working when all of a sudden your manager busts through the door and yells “STOP CODING” at you. Like, really? That would actually happen? No, I don’t think people would stop coding. They’d just treat it like the legality of smoking marijuana. (At least, in California.)
Smoking marijuana in California, but your employer is in on it, and has lawyered up in case another corporation is looking at you smoking marijuana and wants to sue the hell out of you because that marijuana is similar to theirs. And also, your employer somehow benefits massively from your smoking marijuana.
Actually, if they won, the law would get changed. You cannot really ask most of the economy to suck it up and comply. Unlike consumers… sigh.
OK, there’s an interesting point there, or at least the kernel of one… but a little more depth would be useful. What agenda does the author of TFA think that the EFF is pursuing? If this is such a fundamental switch in position, what end is being served, and why the swerve? Have usurpers taken control of the EFF? Or was the EFF merely a front for BigGov all along? Or something even more sinister? And if not the EFF, what organization(s) do we support today that are working to defend freedom in cyberspace?
Also, this bit: “Does the EFF repudiate the law, and want government to avoid regulating cyberspace? Or does the EFF use that law to encourage government to regulate cyberspace? Both have their pros and cons, but you really can have only one.” feels like a classic Fallacy of the Excluded Middle to me. IOW, I don’t see why the EFF can’t take a pragmatic position and say “you made these rules, now at least play by your own self-imposed rules” while simultaneously wanting the overall rule changed.
My personal take is that the EFF, like any organization, is (subconsciously) primarily concerned with its own perpetuation. What gets people twittering? What brings in the donations? Every advocacy group faces an existential crisis after they’ve won. What battle to fight next?
This article strikes me as wholly unconvincing. I couldn’t even make myself read it all, because the first two-thirds or so seemed to be just a lot of assertions with no justification. At best it seems that this may be vacuously true in a pedantic / overly strict sense. “The brain doesn’t store representations of dollar bills” or what have you. That’s probably true. There’s no reason to think that you have an exact image of a dollar bill in your head at all times. That seems pretty irrelevant to me, as it appears that we must store at least some fuzzy representation of the dollar bill, in order to recognize it, or to describe it - from memory - to the amount of detail that we can.
But digital computers don’t necessarily have to work with exact representations either; witness all the recent successes we’ve seen in image recognition using artificial neural networks, etc.
Personally I suspect that the brain is a biological implementation of a sort of bayesian pattern matching system which does, indeed, share quite a lot with computers - unless you just define that away by saying “a computer is something that works differently from the way the brain does”.
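To make the “inexact representation” point concrete, here’s a toy sketch: a program can recognize a noisy input by scoring it against fuzzy stored prototypes, with no bit-exact copy of the input anywhere:

```python
def similarity(a, b):
    """Fraction of positions where two equal-length feature vectors agree."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

def recognize(observed, prototypes, threshold=0.7):
    """Return the best-matching label if it clears the threshold, else None."""
    best_label, best_score = None, 0.0
    for label, proto in prototypes.items():
        score = similarity(observed, proto)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None
```

Feed it a corrupted version of a stored pattern and it still identifies it, which is all “recognizing a dollar bill from a fuzzy memory” requires in principle.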
As if a computer has an exact representation of a dollar bill. An EXACT representation would take an uncountable amount of resources.
This was my main contention with this article. No such “exact” representation of data occurs on a computer, either. It’s not as if people are saying our brains work exactly like computers, anyhow.
@mindcrime, the question of “representation” in the brain of course is very interesting and complicated, but as a starting point, I find the concept of complex feature selective neurons in the brain very fascinating and illuminating. A quick rundown is here https://en.wikipedia.org/wiki/Grandmother_cell#Face_selective_cells . This topic was made very popular by the “Jennifer Aniston” cell. The who cell, you ask? That’s the risk you run when trying to popularize your research by tying it to the fickle star of popular culture :)
I just quit my previous $dayjob and I have two weeks until I start at the new job (I offered to work out a two week notice period, but they only needed me to finish up some stuff today), so I should be able to get some stuff done on https://neuralobjects.com
Last week was a total bust for various reasons, so I’m still basically where I was before - need to finish getting the subscription stuff set up in Stripe, fix the signup wizard to include a section for picking a subscription, and then finish integrating everything so that you can build an environment and start loading data, training models, making predictions, etc.
I also need to write a bunch of copy for the website, and finalize the pricing stuff. I also want to plug in Google Analytics on the site.
Once all that’s done, I hope to formally launch our beta version and see what kind of response we can drum up. From there it will be a question of iteratively adding features, fixing issues, and working on the marketing and sales process.
I agree that the phrasing of the announcement is slimy PR-speak, but the new pricing structure is certainly interesting—I can think of organizations I’ve worked at where it will make github dramatically more affordable (~5 devs, 30+ repos).
Sure. There will be edge cases that go both ways. A friend just tweeted that their bill will go from ~£200 to ~£2000 – an order of magnitude more expensive. (However, this is the UK government so I’m sure they can find some loose change to cover that.)
For sure there are many organizations for which it will be much much worse. I can only assume Github has done the numbers and figures most of them are large enough to find some loose change, as you say.
I don’t think it’s an edge-case at all. I’d guess many organisations have more repos than members.
FWIW, for us, our bill just went from $50 / month to $25 / month, and now we have unlimited private repos. And considering that we were one repo short of the cap from the old plan… yeah, I’ll take this new plan in a heart-beat.
FWIW, for us, our bill just went from $50 / month to $25 / month, and now we have unlimited private repos.

Ours would go from $200/month to $350/month if we moved to this new plan. We’re using 64 out of 125 private repos in our current plan, so I’m guessing we’re sticking to that :-)

If we were limited by our repo allowance and were considering the switch, we could probably make the new plan a bit cheaper if we were willing to change a few things:

And considering that we were one repo short of the cap from the old plan… yeah, I’ll take this new plan in a heart-beat.

I would too in your situation!
Ours would go from $450 to $448, but honestly we’re adding people faster than repos. That said, this provides a strong negative incentive to moving our company wiki and documentation to GitHub – it will remain external.
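The back-of-the-envelope math in this subthread is easy to script. This sketch assumes the per-user figures commonly quoted at the time ($25/month covering the first 5 users, plus $9/month per additional user); treat those numbers as assumptions, not gospel:

```python
def per_user_monthly_cost(users, base=25, base_users=5, per_extra=9):
    """Estimated monthly cost under a per-user plan (assumed price points)."""
    extra = max(0, users - base_users)
    return base + per_extra * extra
```

Under those assumptions a team of around 41 users comes out to about $349/month, in the ballpark of the $350 figure quoted above.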
Didn’t get much done last week, so largely more of the same. I did get the Stripe stuff mostly finished up, so what I need to do now includes:
That will still leave some work to do, but once that entire flow is working, we’ll be real close to where we can start a beta run.
I lost my shit at your “Under Construction” gifs and images. Thank you for keeping that alive! :)
EDIT: Seriously, do you want some help copywriting or whatever? PM if interested.
Yeah, I want it to be pretty obvious that the site isn’t “live” yet, and I figure a few cheesy 90’s “under construction” GIFs make that point pretty clear. Plus we haven’t really publicized this very widely yet. These threads here, the corresponding ones on barnacl.es, and an offhand mention on HN here and there. And the robots.txt is set to disallow crawlers right now, so I’m not too worried about people just stumbling over the site and getting a bad impression before it’s actually ready.
Very possibly. I’ll shoot you a PM.
Seriously, nice work. Not sure what practical application this has, but still very, very cool.
having the coolest terminal in the world.
Late last week I made a lot of progress on getting Stripe integrated so we can start taking payments, which means we are really, really close to a launch. We’ll probably do a soft launch / beta period with just a note here, on HN, etc., so look for it some time in the next month or so, if all goes well.
So, what is this? Well, it’s a “Machine Learning as a Service” offering which is meant to make advanced machine learning / analytics tools more available and accessible to everyone. The first service we offer will be based on a Spark/Hadoop cluster running Apache SystemML. We’ll be providing APIs for uploading data, submitting jobs to train models, and downloading results, as well as a Prediction API which lets you directly access a trained model. Early on we’ll provide a number of pre-defined ML algorithms, including things like Linear Regression, Random Forests, SVMs, etc. We’ll also be exposing the ability to write and upload your own algorithms in DML (an R-like language designed for SystemML) and PyDML.
The nice thing about DML is that it, along with SystemML, supports seamless scalability, unlike “real” R. With DML, you write your algorithm once and run it on anything from your laptop, to a 5000 node distributed cluster, with no changes required. With SystemML and our service, data scientists can run ML / analytics jobs on huge datasets without having to worry about: installing Spark/Hadoop, installing SystemML, maintaining a data center, porting R code to a different language to get scalability, etc.
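For flavor, here’s what the simplest of those pre-defined algorithms, one-variable linear regression, does under the hood (plain Python, purely illustrative; the service itself would run a DML implementation on SystemML, not this):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b on a toy dataset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) divided by variance(x).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b
```

The point of SystemML is that the same high-level description of an algorithm like this gets compiled down to run on a single machine or a whole cluster, depending on the data size.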
Later, as the service evolves, we will be looking to add the ability to directly use TensorFlow, Warp-CTC, CaffeOnSpark and other platforms. Beyond even that, I think we’ll probably eventually offer Beowulf clusters for MPI/OpenMP programming. One thought I’m toying with is to have a cluster type that sets up MPI, R, and Rmpi so that anyone who wants to do distributed development with vanilla R will have access to that.
Anyway, that’s the basic idea. Feel free to poke around at the site, keeping in mind that there are still a lot of things that aren’t fleshed out yet. There’s a username / password of user123 / realitybomb40 that you can use to login, if you want to poke around without registering. If you DO register, we won’t be keeping any information you provide us right now. The database is periodically wiped and reloaded as we iterate the code-base.
In about another week, I expect we’ll have a full end-to-end workflow finished, so you can register, sign-in, spin up a cluster, and start doing work. We’re REAL close to that point, but not quite there yet. If you think you might be interested in something like this, click through to the site, scroll down to the bottom of the main page (or click the “under construction” logo) and fill in the form to subscribe to our (very low volume) mailing list.
Also, before anybody asks… the video that’s there is just a placeholder to mark out where on the page we plan to put our explainer video. I just chose something random that was machine learning related, but don’t take that to mean we are associated with Andrew Ng or anything like that.
bash - it’s usually the default on most Linux systems, and it works well for my purposes. I think it might be interesting to learn a “better” shell one day, but I’ve not yet felt enough pain to invest a lot of time into trying to learn tcsh or zsh or fish or whatever.
Still working on NeuralObjects, our new open source Machine Learning / Analytics as a Service offering. Over the last week I did a lot of work on the UI for signing up and creating environments. What I’ll be doing this week is finishing up the wizard for configuring a new environment, and hopefully testing the full end-to-end flow of “sign up as a customer, create an environment, train a model, make some predictions”.
Beyond that, the TODO list still includes doing Stripe integration, figuring out pricing, writing documentation / tutorials / blog posts, integrating Candlepin, and getting Google Analytics setup. I also want to integrate the customer registration stuff with the main Fogbeam CRM system.
Looking further out, we need to get things set up so we can deploy TensorFlow and other packages. We also want to offer more than just Spark/Hadoop clusters, and we’ll be looking at making MPI clusters part of the offering as well. One specific thing I’ll be looking into is making it an option to provision a cluster running MPI with R + Rmpi installed, for people who want to use R and MPI. Depending on how far down this path we go, we may look at using a provider that supports high speed interconnects like InfiniBand.
We’ll also be looking into whether or not it makes sense to introduce something like Apache Taverna into the stack.
Another option we might explore is defining an IoT-specific environment, adding “baked in” support for MQTT and the like, and offering something that caters specifically to IoT analytics.
When it’s all done, we’re going to have a really nice setup that makes it pretty painless to provision environments for doing a variety of machine learning / analytics tasks, and then use APIs to drive the entire process. We’ll also have Apache Zeppelin configured for doing interactive and collaborative exploration of data in the environment.