I’m really excited about this; more implementations of compilers can only be a good thing for the language.
Working on IPFS: I’m building an object for making modifications to files represented as DAG trees. It’s very strange working on a ‘filesystem’ with no set block size. We are getting really close to an alpha release, just cleaning up a few remaining tasks!
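To give a feel for the “no set block size” part, here is a minimal sketch (my own toy types, not IPFS’s actual ones) of a file represented as a DAG whose leaf blocks can each be a different size:

```go
package main

import "fmt"

// Block is one chunk of file data; note that there is no fixed size.
type Block struct {
	Data []byte
}

// FileNode represents a file as a DAG node: either a leaf holding a block,
// or an interior node whose children hold the file's bytes in order.
type FileNode struct {
	Children []*FileNode
	Leaf     *Block
}

// Size walks the tree and sums the bytes of every leaf block,
// regardless of how large or small each block happens to be.
func (n *FileNode) Size() int {
	if n.Leaf != nil {
		return len(n.Leaf.Data)
	}
	total := 0
	for _, c := range n.Children {
		total += c.Size()
	}
	return total
}

func main() {
	f := &FileNode{Children: []*FileNode{
		{Leaf: &Block{Data: []byte("hello ")}}, // a 6-byte block
		{Leaf: &Block{Data: []byte("world")}},  // a 5-byte block: sizes differ
	}}
	fmt.Println(f.Size()) // 11
}
```

Modifications then become a matter of splitting, splicing, or replacing blocks in that tree rather than rewriting fixed-size sectors.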
I’m cleaning up and polishing IPFS. We’re getting really close to having an alpha-type ‘release’ (aka, this probably works, y'all should try it). Almost everything is done feature-wise; we’re just finishing up some refactors before we say it’s ready. Also, if anyone is interested, we’re looking for Go and Node programmers to help out.
I’m working on a method for DHT peer bootstrapping for IPFS, and building a daemon that the user commands communicate with. Bootstrapping is a harder problem than I initially thought it would be; there are many solutions, but finding the right one is tricky.
Nice to see the resurgence of interest in Vim as one part of a larger system, Unix-philosophy style. I remember back in the day the integration of Vim as a KPart into Kate looked like it was going to be the perfect lightweight middle ground between an IDE and an editor (it had three panes: project tree, console, and editor), but sadly using Vim as a KPart never quite worked out.
Yeah, treating Vim (or any editor) as simply a replaceable part in a larger system would be really neat. Being able to switch between true Vim, Emacs, or other (lesser) text editors would be really cool.
On a slightly meta note, I enjoy posts like this that just discuss a certain way of going about something. It’s nice to hear some polite discourse about different methods and strategies; you don’t really get that anywhere else…
Thanks, I have been looking for exactly “polite discourse about different methods and strategies”.
With ideas like this… it seems “Good To Me”… but it’s always a good thing to bounce ideas off other minds to solidify and grow them… and, where need be, to change or kill the idea.
I’ve been working on Hex yet again, putting the finishing touches on pull requests to make it more secure, as well as more stable and usable in self-run environments.
I’m glad to hear about progress! I’m not familiar with Erlang, but how do you handle versioning?
Nothing exciting; a front-end refactor, some schema tidying, and in my free time … dotfiles. There are times when my job is SO GLAMOROUS.
Why does SQL make it such a pain in the ass to deal with composite foreign keys? Oh, because SQL HATE ME. Sigh.
Don’t worry, SQL hates everyone equally.
How I wish I had some better way to store and query very large volumes of structured data.
EDIT to make sense.
I didn’t post last week, Open Dylan related or not.
I didn’t think I was going to post this week either. I’ve been going through a change in medications that has left me with a resting heart rate of about 120bpm and occasional spikes above that. It is a pretty terrible experience.
I’m also really much less than thrilled with what is going on in the world (Ferguson among many other things), which probably doesn’t help my heart rate. At any rate, at the end of the day, I haven’t felt like doing much outside of my day to day work. I’ve been told that my Twitter timeline reads like a list of war crimes and health complaints, and that’s probably fair.
For work, I’m still working on stuff related to the memory/heap profiler that will be contributed to Emscripten. (There’s actually an open pull request for it.) This has been pretty enjoyable and interesting work. It seems to be working out pretty well for its intended purpose for my client so far as well.
I’ve had a couple of comments from people that I should do a version that is not limited to / targeted at Emscripten and that would work on multiple platforms. This is a pretty interesting idea to me and something I’m actively considering. There’s a lot of room for interesting integrations and extensions as well, like getting a dump of the events inside a GC (be it Boehm, the Memory Pool System, or something custom). I’m not sure how this would work out as a commercial product, though. It seems like a lot of people just don’t care about memory usage to this extent outside of the games industry and some mobile applications. I’m not sure that I’d be willing to invest the effort into something open source along these lines without some sort of funding or compensation.
As for Open Dylan … I’ve been writing some stuff about the type system for future blog posts and I did some quick experiments with the idea of allowing users to create their own kinds of types. I really wish that we had some people interested in type systems helping out, like jozefg, but somehow, that’s never worked out with anyone, which is too bad … as there’s a lot of novel and interesting work that could be done, especially by someone who was more versed in the theory than I am.
I work in the embedded world, and tracking and managing memory consumption is a big challenge. I think the market for software development tooling is also larger in embedded than in games or mobile.
I’m currently working on memory consumption reduction at $WORK, and at my previous job I briefly inherited memory budgeting. That was done entirely by hand in a spreadsheet, which had been expanded from the microcontroller days up to today’s embedded Linux stack with multiple processors and a shared memory architecture. Prior to my arrival, memory consumption wasn’t measured, and it was really becoming a problem. By the time I left that company I had only managed to add per-process memory consumption to our automated post-mortem crash reports, with memory usage guessed from there.
At my current job, this is done mostly using static analysis tools, which also give limited information. Given that in both cases memory consumption was an issue (the old job just dimensioned memory to be “just enough”; the current job has devices in the field that receive software updates for decades), you might have some success selling to embedded software firms.
Basic memory measurements (ps, /proc/*/maps, …) all have process-level granularity. Many embedded projects are still “one process to rule them all” (really, my current project has a binary that breaks the 1 GB limit when you keep debug symbols; we have workarounds for GNU ld bugs that only occur with unrealistically large binaries). So knowing that process X has a heap of 504 MB is really not very useful.
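For context, here is a minimal sketch (mine, not a tool from this thread) of that kind of process-level measurement on Linux: it reads VmRSS from /proc/&lt;pid&gt;/status, which tells you how big a process is overall but nothing about which subsystem inside it owns the memory.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// readRSS returns the resident set size of a process as reported in
// /proc/<pid>/status. It's exactly the coarse, whole-process number
// described above: one figure, no breakdown of who allocated what.
func readRSS(pid int) (string, error) {
	f, err := os.Open(fmt.Sprintf("/proc/%d/status", pid))
	if err != nil {
		return "", err
	}
	defer f.Close()

	s := bufio.NewScanner(f)
	for s.Scan() {
		if strings.HasPrefix(s.Text(), "VmRSS:") {
			return strings.TrimSpace(strings.TrimPrefix(s.Text(), "VmRSS:")), nil
		}
	}
	return "", s.Err()
}

func main() {
	rss, err := readRSS(os.Getpid())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("whole-process RSS:", rss)
}
```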
I’m not saying that the companies I work(ed) for would buy these tools, but at least there’s some use for them. Getting companies to buy these things is another challenge :) I know we spend a lot of time and manpower on getting Valgrind to work on our various platforms, and the performance impact is so large you can’t use it on a running system, only on standalone tests. Since Valgrind is mostly used to find memory leaks, a lower-overhead tool that just dispatches information to a developer’s machine would be very useful. The developer’s workstation could then just highlight suspicious or growing allocations, allocations that are never released, etc.
I have so many things to say in reply to this that it would end up being as long as your comment or longer … I’ll save it for a blog post perhaps!
Your honesty and openness in recording what you’re up to week to week is a good benchmark to me of what is possible for someone like me who aspires to practice this craft called programming at a higher level :-) Your posts are appreciated. What you describe doing in your free time is more than I would usually get done in a week, work and free time included. Amid the feelings of drudgery at certain jobs I’ve had (PHP-land, ugh), it really is an indicator to me that the promised land is out there and people can work on cool things :-) I hope you feel better soon.
I always appreciate your weekly posts, thank you for posting this week!
A Storm 0.9.2 + Kafka 0.7 + Cassandra 2.1.0-rc5 + Elasticsearch 1.3 cluster is now up and running in production, handling around 3,000 web traffic requests per second. Time to test it in more detail and make it fast!
Are these 3,000 real requests per second, or is that just in benchmarks?
real requests per second
Why deploy Kafka 0.7 instead of 0.8.1?
We want to upgrade to 0.8.1, but we currently use a Python driver we wrote for 0.7 and are in the midst of merging its functionality with an open-source driver for 0.8.1.
I’ve been reading papers on various DHTs and implementing one in Go for the IPFS project that I’ve been working on for a little while. I’m also trying to build a better testbed for stress-testing my code at the moment.
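As a flavour of the routing math most of those DHT papers revolve around, here is a tiny Kademlia-style sketch (toy 64-bit IDs and my own names, not the actual IPFS implementation): peers are ordered by the XOR distance between their IDs, and that distance decides which k-bucket a peer lands in.

```go
package main

import (
	"fmt"
	"math/bits"
)

// ID is a toy 64-bit node identifier; real DHTs use 160- or 256-bit hashes.
type ID uint64

// distance is the Kademlia XOR metric: symmetric, and distance(a, a) == 0.
func distance(a, b ID) ID { return a ^ b }

// bucketIndex says which k-bucket a peer belongs in, i.e. the position
// of the highest bit in which the peer's ID differs from ours.
func bucketIndex(self, peer ID) int {
	d := distance(self, peer)
	if d == 0 {
		return -1 // same node
	}
	return 63 - bits.LeadingZeros64(uint64(d))
}

func main() {
	self := ID(0b1010)
	fmt.Println(bucketIndex(self, ID(0b1011))) // 0: differs only in the lowest bit, "close"
	fmt.Println(bucketIndex(self, ID(0b0010))) // 3: differs in a high bit, "far away"
}
```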
I feel like I always get to these late… but this week I’m working on building a DSHT for IPFS (http://ipfs.io). It is really interesting to see all the different error cases and considerations you have to take into account when building such a distributed application.
In the OSS world, I’ve begun to contribute to the Hex package manager. I’ll be assisting with the web interface redesign and API documentation, trying to close all current issues, and bringing it into a state where it can be run on an isolated system, along with many small improvements such as email verification, password resets, and better support links.
I’ve also pushed some bug fixes to my Lobsters browser for iOS, which were accepted today!
In !OSS, I’ve been working on my MVP application, which is starting to take shape, as well as reading many papers on targeted advertising for the service!
Until next week! :)
I’ve always been curious what sort of use Erlang gets in the real world. It seems odd (from my perspective, of course) that there is enough demand for it to need a package manager.
Erlang is really good for applications that can’t go down, and where jobs can be picked up from where they left off in disposable processes, although the syntax is quite a change and in some cases inefficient. Elixir fixes a lot of these problems and makes writing OTP applications a lot easier and more productive (and sometimes even fun), while still keeping the power and the VM of Erlang, which is why many are starting to experiment with it and even use it in production systems. And from now on there is a breaking-change freeze until 1.0, so now is a perfect time for people to start using it. And people want packages, and it needs to be done well (cough npm cough).
I’ve been really happy with the size that Lobsters currently is; the articles and discussions are all very interesting, and the level of noise is very low compared to other places I go for news and information.
Late to the party! I’ve been working on translating SDL from C into Go so you can have a graphics library in Go with minimal dependencies. In the process I’ve written a helper tool to translate a lot of the very common, easy-to-switch differences between C and Go. My goal this week is to get a lot of the unit tests finished so I can then focus on “turning green to red”.
This looks interesting, but I really dislike Go’s use of comments for what are essentially preprocessor statements (for instance, in cgo and now in go generate). Comments should be, well, comments, and not have any effect on the program generated from the code. I don’t understand why they didn’t just add some other symbol (like #) to denote preprocessor statements.
I agree that they shouldn’t use comments for this type of stuff, but their main goal is to keep this out of the build proper. It is strictly separate from it, so if they started defining some sort of “#define”-style syntax for it, it would start to feel like a language feature as opposed to a toolchain feature, and a language feature is something that is expected to be handled by the compiler.
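For anyone who hasn’t run into it, this is roughly what the directive style looks like in practice: the compiler treats the //go:generate line as an ordinary comment, and only the separate go generate tool scans for and runs it (the Color type and the use of stringer here are just an illustrative example, not anyone’s actual code).

```go
// Package colors is a small example of Go's comment-as-directive convention.
package colors

// To the language the next line is just a comment, but `go generate`
// scans source files for the //go:generate prefix and runs the command.
//go:generate stringer -type=Color

// Color is the type the generated String() method would cover.
type Color int

const (
	Red Color = iota
	Green
	Blue
)
```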
I’m working on a Go port of SDL, rewriting as much of SDL’s source in Go as I can so that we can have a Go package for graphics without (many) other dependencies.
Write a test, or document some code. Emailing the owner and asking how you can help out is also useful.
I’ve also read some code and then refactored it afterwards, if I found it confusing to reason about.
I like this idea. Do you normally submit your refactors as pull requests afterwards, or is that just for personal practice? If you submit PRs, do maintainers normally accept refactors?
Full disclosure: I’m now an employee at Twitter, although I contributed to Twitter OSS before I was employed by them.
Different maintainers feel differently about it. I got excited about twitter/algebird a couple of years ago, and submitted a couple of refactoring-only pull requests. It’s very inexpensive for them to accept a refactor, since they only have to read the code and click the merge button. It might be more complicated for other repos; for example, I’m a maintainer on twitter/finagle, and we need to go through a thorough review process internally and also make sure that it passes all of our internal CI tests. However, we still commit documentation changes, including typo fixes.
It really depends on the kind of maintainer. Some maintainers never look at pull requests. Other maintainers try to constantly garden their pull requests. Generally, only really big projects will reject tiny commits.
Definitely depends on the project and maintainers. As another example, there are some projects that would prefer you to open a ticket first talking about your suggested changes.
E.g. someone opens up a PR with 1,000 lines of changes (all good changes!)… but it’s on code that is slated to be deprecated, removed, or have its feature changed completely. Now the maintainers are in a rough spot.
Do you close the PR because the code is going to be ripped out soon anyway? That’s a good way to burn a potentially useful contributor.
Do you have someone do a code review (eating up time) so that the contributor feels welcome… only to merge out their changes a week or three later? That’s a waste of everyone’s time.
Do you try to get the contributor involved in the refactor, but find out it is way above their skill and/or time availability? Now your feature XYZ is stalled on an external entity.
Those are kinda extremes, but they make the point. Unless the PR is small, I like to open a ticket and get feedback first. Works better for everyone, IMO.
Work, work, work. And doing some work on refactoring a game idea I’ve been toying with for a while.
This is the reason I’m excited for new innovations in CPU architecture (e.g. the Mill CPU). So many of our three-point-something billion instructions per second are being wasted waiting on memory. It’s just absurd.
I suspect that’s really mostly because CPUs are marketed by performance, while RAM is marketed by capacity. There’s no particularly inherent reason memory needs to be so much slower than CPU, but when all the market pressures are for it to be bigger and cheaper rather than faster, it’s obviously not going to catch up.
There’s a lot more in play than just “market pressures”. Take, for example, the L1 cache in a recent CPU. This is an on-chip memory (SRAM) much smaller than main memory (say 32K), and with a vastly higher cost-per-unit-size. And it’s still “slower” than the CPU, in that it’ll typically take multiple cycles to access. The biggest constraints aren’t economic, they’re physical – electrical signals propagate through wires at finite speeds, and as your memory gets bigger (even at just a few KB of SRAM) this starts to be a non-negligible factor. So if you wanted a memory that was “as fast as your CPU”, you could build it, but it’d be so tiny as to be completely unusable. Imagine your register file being all the memory you had.
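A rough back-of-envelope illustration of that physical limit (my numbers, not the parent’s): at a few GHz, even a signal travelling at the speed of light covers only about 10 cm per cycle, and real on-chip signals are considerably slower, so a physically large memory simply can’t be reached and read within a cycle or two.

```go
package main

import "fmt"

func main() {
	const (
		clockHz      = 3e9 // a 3 GHz CPU, for illustration
		speedOfLight = 3e8 // metres per second; an upper bound on signal speed
	)
	cycleTime := 1.0 / clockHz              // seconds per cycle (~0.33 ns)
	maxDistance := speedOfLight * cycleTime // metres a signal could cover per cycle
	fmt.Printf("one cycle: %.2f ns, at most %.1f cm of signal travel\n",
		cycleTime*1e9, maxDistance*100)
}
```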
Notably, the PlayStation 4 has a unified memory system with 8 GB of GDDR5 that clocks in at 176 GB/s! Maybe it’ll persuade other manufacturers to adopt similar architectures?
But the problem is that for many (most?) workloads, the relevant aspect of memory performance that’s problematic isn’t bandwidth, it’s latency – and the two are often at odds with each other.