I’ve been slowly learning After Effects for the last couple of months in order to do The Secret Lives of Data as videos rather than complex D3.js visualizations. My thought is to go with short, conversational five-minute videos that explain basic use cases and the overall structure of the topic. My first video is going to be on Apache Kafka. I’m crossing my fingers that this all works! :)
I’m looking forward to seeing this!
I’m exploring different mediums for The Secret Lives of Data project. I’m studying up on motion graphics and After Effects to see if that would be easier and more effective than the D3.js approach I did with my Raft visualization. I’m also researching how to do effective voiceovers.
If anybody has any tips on After Effects or voiceovers I’d love to hear them!
For voiceovers, I’ve done a few for client project reviews (clients love when they have a nice video with a pleasant voice narrating what they’re paying me for). With the assumption that you’ve never done one before, here is some advice I’ve gleaned from my experience.
You can improvise a pop screen in a pinch with a t-shirt or other piece of thin cloth stretched around a clothes hanger. Just hold it about halfway between you and the mic (you want to be about 3 ft away, give or take). Sit comfortably but not too close, and put the mic slightly off-center from you; these things will help reduce puffing/popping, as well as harsh sibilance (the nasty screeching ’s’ sound) and overly bass-y tones. Good mics can be acquired relatively cheap. A USB mic might run you $50, but the quality difference over a webcam mic or (FSM forbid) a headset mic is astounding. Look for something with a bit of a shock mount to it to help reduce hum. Don’t spend absurd money on anything; I have one of these which I rather like for Skype calls and the occasional bit of impromptu guitar recording.
I usually try to write a general script which covers the main thrust of what I’m talking about, and then vamp a little to explain detail as I see it. Some people like to write everything out and just read the script, but this has a tendency to sound very – well… scripted. Unless you’re very good at both writing natural speaking prose, and then reading that speaking prose in a natural way, leave yourself a bit of room to improvise.
Editing programs – even relatively bare-bones tools like Audacity – make it very easy to duck in audio and generally edit together multiple takes. If you just get in the habit of recording absolutely everything, you’ll find it very easy to pick out the good versions of sentences from across half a dozen or a dozen takes. I’ve been known to duck the occasional word.
A little reverb here, some compression there, a bit of EQ, and sooner or later you have a really nice vocal. Even if you just use the ‘small room’ preset on whatever reverb effect you can find, and set the dry/wet to around 10%, you’ll notice a marked improvement in how ‘real’ the vocal sounds. A little goes a long way, and you can really make an okay, obviously dubbed voiceover turn into a lovely, seamless track. If you want some more specific advice, I’m happy to share what I know, but I’m definitely an amateur.
Listen on good headphones, on bad headphones, on speakers, on laptop speakers, etc. Listen on a variety of potential sources and try to make it sound good on most of them, but don’t worry if traditionally shitty sources (laptop speakers, crappy/cheap headphones) sound worse than traditionally good ones (decent speakers, good headphones, etc.).
That’s about all I’ve got. Good luck!
Wow! That was a lot of great advice. I’ve never done voiceovers before and I haven’t found many good resources on voiceovers or voice acting in general. I’ll definitely hit you up if I have more questions!
Sure thing, tbh, I’d just look at resources for vocal processing in songs. Most of the same principles will apply. It’s actually a bit easier in some ways, since you don’t have to worry as much about creating sonic ‘space’ for the vocal to sit (unless you’re going to have some music in the background).
I started looking at implementing the XML DOM spec and HTML 4.01 spec in pure Go. I want to make something like capybara for Go but the biggest issue is the limited XML/HTML support in Go. There’s a decent number of specs to go through:
I debated porting over libxml2 but there’s a lot of extra XML specs implemented in there that I’m not interested in.
I got my chromeless gist mirroring service up and running. I need to submit it to Embed.ly as a provider and hopefully in a week or so I’ll be able to use it to embed D3.js visualizations into Medium blog posts. After that, Project Yak Shave will be complete. :)
I’m building a simple app for serving gists similar to Mike Bostock’s bl.ocks but I need the chrome removed and I’m going to integrate it with Embed.ly. The end goal is to be able to host D3.js visualizations inside of Medium blog posts.
I want to do more visualizations for The Secret Lives of Data project but the Raft visualization I did was really time intensive. I think if I can break everything into smaller pieces then it’ll be much faster to implement.
That Raft visualization is really nice.
I just finished up a W3C-compliant CSS3 parser & lexer in pure Go. There are some docs in there right now but I’m going to work on improving them and adding usage examples. Then I’m going to start on a pure Go LESS port. Ultimately, I want to build a zero-dependency asset pipeline tool so people don’t have to download and install node.js just to use tools like LESS.
I’m refactoring the API on my Go W3C-compliant CSS3 parser/lexer. This project is a stepping stone to writing a pure Go LESS compiler and building a zero-dependency asset pipeline tool.
I’m wrapping up a W3C CSS3-compliant tokenizer and parser in Go. The project originally started as a port of LESS to Go but I realized that doing a solid CSS3 parser first would be easier. Ultimately my goal is to make a fast asset pipeline tool (LESS, SASS, Minifier, etc) with zero dependencies. Just download and run.
Also, I’m looking for suggestions for names for the asset pipeline tool. And no, “asspipe” is not a good name. :-/
Do you have a public repo for it? I’d be very interested in taking a look.
I had a good chuckle at “asspipe”, but given that it’s a tool that would “sew” various assets together, why not the play on words “sewer”?
It’ll be at https://github.com/benbjohnson/css. I’m cleaning it up a little bit right now before I open it up publicly. It uses a similar setup to the go/* packages. The CSS3 spec has additional information per-token so I’m splitting the tokens up into their own types.
Here’s the spec I’m going off of: http://www.w3.org/TR/css3-syntax
I find it strangely fun to implement lexers and parsers. I’m not sure why. :)
Argghh, why do people use Medium for articles about code? It has no good support for actually displaying it.
Author here. I’ve tried a bunch of tools and platforms for blogging but they either have poor code support or they use Markdown. I like Markdown for docs but I find it harder to write my thoughts out since I’m worried about formatting. I picked Medium because it displays non-code really well, supports inline comments, and it does a decent job of displaying code. I don’t find syntax highlighting to be much of a boon so I don’t miss that much.
This article has few listings and each of them is rather small. Medium does the job in this case. And it’s probably easier to promote your articles on Medium.
What do you find lacking? It certainly doesn’t do highlighting, but an argument could be made that that is distracting in the context of short snippets anyway… Is there something else?
I just open sourced a user tracking and funnel analysis application last night called Skybox. It’s still rough around the edges but I’m going to be polishing it up this week. It currently does Mixpanel-style event tracking and lets you build funnels on the fly.
The goal of the project is to let people own their analytics data. Most tools like Google Analytics or Mixpanel have a limited API into the raw data or how you query it. Skybox is backed using SkyDB so I’ll be opening up the raw query API soon.
Most analytics tools cost an arm and a leg too. Mixpanel, for example, costs $150/month to track 500K events. Skybox can process millions of events per month on a $5/month DigitalOcean droplet.
I’d love to hear some feedback!
Thanks for telling us about the demo. Can you add a bunch of dummy data on the demo site so there’s more to play with?
Good call! I’ll put something together.
Welcome! I’m Ben Johnson from Denver, CO. I’m sponsored to work full time on an open source, behavioral analytics database called Sky. I also have some other Go OSS projects such as go-raft and BoltDB as well as a data visualization site called The Secret Lives of Data.
I finished up an ERb-style templating language for Go called ego. It transpiles to pure Go so it allows for any Go constructs, statically compiles, and supports compile-time type checking. Feedback welcome!
I’m using that inside my open source Mixpanel implementation that I’m currently calling Skybox.
I saw this yesterday and thought it looked interesting. What made you decide to not use Go’s text/template and html/template? Have you found that your ego templates are overall more or less complex than the equivalent stdlib templates?
I’ve used the standard library templates before and they work fine but I run into a few issues:
I always have to reference the docs a lot to remember how to use the pipeline syntax.
I have to register any functions I want to use in my view.
I have to separately run go-bindata to embed templates into the binary.
There’s no static type checking.
Nested templates are a pain to manage IMO. With ego, everything is just a function call so they nest well.
So far it’s been pretty straightforward. I don’t have to context switch in my head between Go syntax and template syntax. It’s all Go code. Compiling templates using the ego CLI is quick and painless too. I’m using line pragmas in the generated source so template errors are reported based on the template’s line numbers (and not the generated code’s line numbers).
Cool! The type safety sounds particularly nice. I look forward to trying out ego!
I’m trying to figure out how to store some timeseries data for trading applications. So far, we’ve implemented a kind of lame time-series database ourselves, but it sucks, so I’ve been trying KairosDB. Turns out KairosDB rounds my data points to 32-bit floats, which isn’t acceptable (in other parts of the system we’re representing prices as 11-digit integers), so I’m thinking I may have to try something else.
For non-$work, I need to finish up turning dumbfts into a book chapter for the “500 lines or less” book coming out later this year. I need to port it to Python 3, fix at least one bug, and explain it clearly. Also, I’m wondering if maybe I can work some kind of index compression into it.
What kind of queries are you doing for the time series data? I work on a behavioral analytics database (similar to time series) and I’ve always been interested in the trading application use cases.
That sucks about 32-bit floats. I used to use LuaJIT for query compilation but it had weird number restrictions. I’ve since moved to LLVM, which uses int64 & double.
I’m not sure how much I can say about it, so I’m going to err on the side of silence.
Do you happen to have more information on this book? It sounds pretty interesting!
Yes! Call for reviewers, github repository. I’m excited that this book exists!
Yes! That’s awesome! I wish I had an idea for something to contribute, though perhaps it’s too late at this point.
That’s awesome! I signed up as a reviewer.
Last week I finished up the initial alpha version of Bolt, my LMDB port in pure Go, so this week I’m putting it to use in a new project. I also wrote some extensive documentation that I would love some feedback on:
My new project is an open source behavioral analytics application similar to Mixpanel / KISSMetrics. Those tools are great but they’re so expensive for startups (e.g. lowest plan is $150/month to track 500,000 events). The aim of my project is to be able to track millions of events per month and do ad hoc funnel analysis on a $5/month DigitalOcean box.
I’d love to get some beta/alpha testers if anyone has some analytics they want to track.
I feel like these “what are you working on” threads are my weekly update for what you’re doing just a couple blocks away. Maybe I should visit more often. Oh, and I have analytics I want to track. May I test for you?
lol, yeah, we should get together more often! Maybe a regular lunch or something is in order. What about tomorrow or Friday?
And yes, analytics for me to track would be great! It’s not done yet but I’m hoping it’ll be finished in about a month.
I’m wrapping up my LMDB port to Go this week. I just need to get deletion, tree rebalancing and page reclamation working. And a whole bunch of documentation. The library is pre-alpha but here’s the repo for anyone interested in poking around:
I started using the testing/quick standard library which is a QuickCheck-style library for black box testing. It’s a cool library but I never hear anyone talking about it.
Very cool. I haven’t used LMDB before, how would you compare it in practice with LevelDB?
I used LevelDB as a backing store in the past but I had some issues with it when using it from multiple threads via the C API. It worked pretty well but its approach is somewhat complicated: it has multiple levels of storage files that have to be compacted periodically, so performance can be variable.
LMDB uses a mmap’d B+tree that is updated in place, so there’s no compaction required. The mmap is read-only, so data structures can be mapped directly onto the underlying data with no memcpy() required. (Writes occur using vectorized I/O on a regular file descriptor.)
Ultimately I like the LMDB approach because it’s simple and grokable. My implementation omits the niche features in LMDB and it’s currently 1500 LOC.
Thanks, that’s helpful. I’m using LevelDB in a project (on the topic of things we’re working on this week) but I’ve only read about LMDB. In the reading it’s hard to sort out its real merits from the author’s ranting but being able to use values directly from mapped memory sounds useful. One caveat to the LMDB benchmarks for anyone who’s following this and skims them is that they disable compression in LevelDB. This probably speeds it up for the memory workloads but will hurt if you’re using slower storage like AWS EBS volumes.
Also, take a look at BangDB.
I’m currently trying to fight through integrating it into a C project (with an autotools flag to switch between LevelDB, BDB, and BangDB).
Thanks for that! BangDB looks pretty awesome, and I’ll probably give it a try later. I’ve been looking at a cross-platform k/v store as a persistence layer for a Lua project. My initial thought was LevelDB, and while it’s supposed to support Windows I couldn’t get it to compile at all. I ended up using UnQLite (and wrote LuaJIT bindings for it), but I’m still open to anything with a decent license.
We state quite clearly that LMDB is read-optimized, not write-optimized. I wrote this for the OpenLDAP Project; LDAP workloads are traditionally 80-90% reads. Write performance was not the goal of this design, read performance is. We make no claims that LMDB is a silver bullet, good for every situation. It’s not meant to be – but it is still far better at many things than all of the other DBs out there that do claim to be good for everything.
Disclaimer: I have no experience with LMDB. I maintain one of the ruby LevelDB wrappers: https://github.com/vjoel/ruby-leveldb-native.
Last week I got it into my head to start playing with building an Apple //c emulator, so I started writing a 6502 CPU emulator first.
What are you implementing the emulator in?
C++ – it’ll eventually end up on an Arduino, and this will make the porting process easier, I hope.
Also, forgot I submitted a pull request to the Go standard library this weekend, so I’ll probably be working on getting that cleaned up and fixed as per the feedback I get on it.
I’ve been thinking about this same issue quite a bit lately. I also put myself in the camp of “not smart enough to Haskell”. :)
I know Haskell can prevent a class of errors at compile time but I’m curious if anyone knows if that comes at a higher cost of other types of errors. For example, I find Haskell (and functional languages in general) to be less readable than their imperative counterparts. Does poor readability cause errors when code reviewing or when translating requirements to code? Does abstracting the underlying hardware to high level abstractions cause performance issues down the road?
I’d be curious to hear from functional language folks and from people who used to be functional programmers and have switched back to imperative languages.
Ultimately I agree with kellogh that programming is not the end goal. It seems like it’s possible to write solid code (or shit code) in any language. There are trade offs to everything. I personally love Go because it’s simple and easy for me to reason about.
For example, I find Haskell (and functional languages in general) to be less readable than their imperative counterparts.
It’s interesting because my non-technical friends find imperative languages just as unreadable as functional languages. This may sound like hyperbole, but bear with me, because I truly believe there’s something important revealed about our assumptions here.
So what exactly do I mean? You learned one way of formulating logic and now it feels natural to you. But I argue there’s nothing inherently natural about it. At least not in a way that isn’t inherently true of functional programming or logic programming or…
Learned patterns do feel natural, but you probably could have just as easily learned some other paradigm and then have argued that imperative languages were difficult to read.
Does poor readability cause errors when code reviewing or when translating requirements to code?
The salient point here is that this whole “functional programming just isn’t as readable” argument is specious: once you learn the idioms, it isn’t true. Also keep in mind, you had to do the same with the imperative paradigm as well.
So no, it doesn’t, because people who have learned functional paradigms don’t suffer from this mythical poor readability ailment. Granted, if you aren’t familiar with a given paradigm, then it’s legitimate to state that for you there is potentially an issue of readability.
My advice? Learn functional programming! :) It really isn’t as hard as it might seem.
I’ve done a fair amount of CL (the One True Lisp ;)), Clojure, and OCaml, dabbled with Erlang, and poked about with a couple other functional languages, and Haskell is still a language I view in the same light as XKCD. However, the fact that I don’t care for the language at all doesn’t mean I’m going to start telling people to stop using it. At the end of the day, build useful, interesting stuff in whatever language gets the job done. For me, that’s mostly Go with some C. For others, it might be Haskell. I don’t really care.
It’s self-deprecating humor though, as parts of xkcd are written in Haskell.
Learn functional programming! :) It really isn’t as hard as it might seem.
I wholeheartedly and unequivocally agree.
However, I’d like to maybe elaborate a bit on this “FP has poor readability” notion. I basically agree with your conclusion (that many of us got used to a different paradigm, so something foreign can be difficult to pick up). But I think there’s more to it than that, at least for Haskell. (And I know you didn’t mention a specific language, but that’s what I’m going to talk about.)
When starting Haskell, I primarily struggled with two things. The first was being able to read someone else’s code (like, a piece of the standard library or a popular package). The second was laziness. The first is crucial for me personally, because that’s one of the primary ways that I pick up a new language.
I attribute the difficulty of reading someone else’s Haskell source code to the prevalence of Haskell extensions (GADTs, type families, fun deps, existential quantification, higher-rank polymorphism) and the absolute necessity of understanding monads and monad transformers.
To your point, it may be the case that if monads were only called Warm Fuzzy Things, it wouldn’t have been so hard. But the bottom line is, I didn’t completely understand them, and it made reading source code (or even the Haddock documentation) incredibly difficult. Once I had that “ah ha!” moment about monads, it’s truly amazing how much code and documentation became almost immediately accessible to me. But it took a long time to get there.
Beyond that, Haskell is a dumping ground for new academic research. (Which is a good thing!) But for curious programmers who like shiny new things (me included), it’s hard to resist using those things when the opportunity presents itself. This imposes a huge hurdle on beginning programmers (or just programmers new to FP in general) who learn by reading someone else’s code.
In general, I don’t think FP has poor readability. But I do think there are some concrete hurdles that beginners need to overcome to learn a language like Haskell. But as should be clear, these hurdles don’t really apply to functional programming in general, but rather, to Haskell specifically. (In my experience.) But we shouldn’t blame beginners for conflating Haskell with all of functional programming. :-)
You learned one way of formulating logic and now it feels natural to you. But I argue there’s nothing inherently natural about it.
Good point. Certain elements (e.g. tail recursion instead of a loop) feels very unnatural but I certainly have a bias.
I’m going to give it another go. I definitely know a lot of smart people who are functional programmers.
You really need to qualify readability before talking about it. The only sensible definition of readability that I’ve found is “to what extent does this code preserve equational reasoning” - a definition which is not only simple but also measurable.
Which is exactly why I find Haskell to be readable.
Does abstracting the underlying hardware to high level abstractions cause performance issues down the road?
Haskell is extremely fast precisely because it’s high-level. A compiler has deep introspection into your code. For example, it can inline code, automatically specialise, fuse equations and decide when things should be evaluated.
It seems like it’s possible to write solid code (or shit code) in any language.
I don’t think this is true. Would you be so diplomatic to Malbolge, INTERCAL or Whitespace?
I agree that programming is not the end goal. You use the tool that is best for the job. Go happens to be a very good tool for distributed systems.
Having said that, strict functional languages like Haskell sometimes are the best tool for the job. In my compilers class in university, we developed a compiler in OCaml. Things like algebraic datatypes and pattern matching come in really handy when you’re processing a complex AST. I wouldn’t say functional programming languages are less readable. If you get familiar with the language and do things in an idiomatic way, the code can be very clear and concise. Also, you tend not to run into bugs because the compiler detects a lot of them as type errors (although the compiler error messages can sometimes be really cryptic).
One more point on the readability of code: the longest part of the software development lifecycle (open source or otherwise) is maintenance. If the code you wrote truly has no side effects, then it should be really obvious what’s going on when you go to track down a bug five months after you wrote it. Of course, this may or may not have anything to do with the language (though Perl is fairly frequently a write-once language), but some languages take longer and are generally coded in a fashion that’s harder to troubleshoot and read later on. For instance, concurrency can be done in Java, and it’s easier to read than in assembler. There are languages that just about any developer can jump into and figure out what’s going on… now, being that I’m not smart enough to learn Haskell, I have no idea what it looks like.
I’m porting over the fast LMDB key/value database library to a pure Go implementation.
[Comment removed by author]
I saw the tiedot database but it didn’t look like it had the performance characteristics and features I was hoping for. LMDB has a great simple design that supports real-time updates, safety under system failure and MVCC (to name a few). It’s been a fascinating code base to read through as well.
I’d be really interested to see where this goes. I’ve found gokabinet to be solid, but a native K/V DB would be neat.
Yep, OpenLDAP’s LMDB. I really like the B+tree approach that Howard Chu uses. I’m removing a few niche features (nested transactions, multi-process support) and cleaning up the API. I’ve seen K/V bindings before but it’d be really nice to have a solid, fast pure Go K/V.
I enjoy reading all the distributed systems and database research papers that come out but I feel like I get a better understanding if I actually dive into the code itself.