MongoDB 2.8 is going to ship with a storage engine API. So while TokuMX isn’t going anywhere for a while, we’re also working on a separate storage engine implementation. It won’t have all of TokuMX’s fanciness like Ark, fast updates, partitioned collections, or clustering indexes, but it will have better compatibility with things like text search and it’ll support mixed replica sets.
Hoping to have some good news on that front soon.
These are the types of arguments in complex analysis that I always found the most beautiful. Take something that looks weird and seems to apply too often, rephrase it as a question using the residue theorem, and out pops a Result. So amazing.
This is probably the most comprehensive talk about how fractal trees work: http://www3.cs.stonybrook.edu/~bender/talks/2013-BenderKuszmaul-xldb-tutorial.pdf
I’m going to give a few talks in SF in November, you should come if you’re curious and in the area.
Basically, the fractal tree implements inserts and deletes as messages that travel down the tree. To perform, say, db.coll.update({_id: 10}, {'$inc': {a: 4}}) we basically just encode {'$inc': {a: 4}} as an “update” message, send it down to the key {_id: 10}, and whenever it gets applied (due to a flush or a query), we just increment whatever value was there. You lose out on information you’d get from the query, like “how many docs were affected” or “was there a type error (e.g. incrementing a string)” because we don’t query before returning to the client, but if your app can tolerate that, it’s nice.
Interesting! I’ve only seen a little bit about fractal trees before, I’ll check it out.
And I might be in SF, no definite plans though. When / where are the talks? :D
The November 12 talk is a redux of a talk I gave at Big Data DC in August, it’s more in-depth and theory oriented than the November 11 talk, which is more product-focused.
Hope to see you there!
I don’t know; however, I do know that TokuMX’s oplog does not require operations to be idempotent, which means it could just toss the operation into the oplog and return to the user before actually applying it. If it is implemented this way, it also means it might behave oddly in a distributed setting.
Requiring either the oplog or the tree messages to be idempotent would really mess this up. This feature, fast updates, is one of the main reasons we made sure to not require the oplog to be idempotent from the beginning. This requires e.g. rollback’s algorithm to change (you can’t just blindly play forward stuff from some point before the partition, you actually have to go back and find the spot carefully), but it’s not too bad.
I’m pretty sure that oplog not being idempotent makes doing proper oplog application impossible. How do you deal with applying the change but dying before updating the position in the oplog?
Not sure if this is a troll response or not, but transaction logs are idempotent in every database I’m aware of (which isn’t many) so you can play transactions twice and it’s ok.
We made it safe for the oplog to not be idempotent specifically to support fast updates. This is not a troll response. The storage engine is transactional, so we can apply the effect of the update and modify the applied bit together in an atomic transaction.
How do you guarantee the application of an operation + storing that it was applied happens atomically?
Both are writes to the storage engine that happen in a transaction. Making transactions atomic with respect to crashes is an old and well-studied problem. The most well known paper is probably the ARIES paper, which is how our transaction log works, pretty much. Most durable, transactional storage engines (of which there are many, including BDB and InnoDB) use an ARIES-style log and provide multi-element transactions that are atomic with respect to crash.
So in the ARIES log you materialize the actual changes from the relative ones in the oplog? Thanks for the information.
Finally shipping TokuMX 2.0 tomorrow! This is a big milestone for us, has a lot of cool things we’re excited about (Ark elections, fast updates, geo features, enterprise audit and point-in-time recovery) and it’s taken a lot of work from my amazing and dedicated coworkers. Thrilled to be getting it out the door on time.
I was looking for an equivalent to that option for fish. I found that in fish, “directory history” is always tracked, separately from the “directory stack”. Use prevd and nextd to go forward and back in directory history, and dirh to print the history. The directory stack commands are pushd and popd to move and dirs to print.
With type conversion. Here is their example fixed to work: http://play.golang.org/p/de-6a0bK1w
Note that conversion is different than type assertion. Conversion changes the expression to the type: http://golang.org/ref/spec#Conversions
Assertion says the type has an interface: http://golang.org/ref/spec#Type_assertions
These are easy to get mixed up because of the syntax: string(foo) vs foo.(string)
And do you know why it was decided to add this ‘untyped constant’ idea rather than just say a constant is of a type and one has to use type conversion to force it into their type?
I suspect this is confusing because you’re thinking constants are variables defined with const. A constant is any literal value in the program. So even if you say something like:
var s string = "foo"
You’ve just assigned the constant string literal “foo” to the variable s.
The reason for this is that Go is very simple and transparent about how memory is managed. When you have a string literal “foo” in your code, that string literal exists in the program image that is loaded into memory. You can take its address, but you can’t modify it, because the program image can’t be modified. It’s not part of the stack/heap where variables are created.
Getting back to untyped constants… what Go is doing is best described by what it’s not doing: it’s not attaching a type label to that string literal. The literal exists as a pure string representation, and Go determines a default type for it by examining its syntax.
When you say:
const foo = "foo"
You’re saying, store this string literal and give it a label so I can refer to it.
When you say:
const foo string = "string"
You’re saying, store this string literal with a label, and also specify its type as string.
I’m probably off on the exact memory representations going on, but that’s the gist of it.
I am asking why was it decided that I can do var s MyString = "foo" rather than saying that "foo" is always a string and one must always perform a type conversion, so var s MyString = MyString("foo")
I dunno the why and I’ll refrain from guessing, but all the rules are here if you’re interested: http://golang.org/ref/spec#Properties_of_types_and_values
Seems like it’s more for ints than for typedef’d strings. With your proposal:
var x int = 1 // ok
var y uint = 1 // bad
var z uint = uint(1) // ok, but ugly
With Go’s behavior:
var x int = 1 // ok
var y uint = 1 // ok, because the constant 1 *could be* of type uint
var z uint = uint(1) // unnecessary, but ok
Most of the time you write:
z := 1
No var needed. The := operator declares a new variable and infers its type automatically. I’d recommend walking through the Go tour: http://tour.golang.org/#1
I’m giving a talk tomorrow on write-optimization in external memory data structures in DC! You should come! http://www.meetup.com/bigdatadc/events/196161852/
In emacs, there are some facilities to solve this problem (http://www.emacswiki.org/emacs/MultipleModes). They’re less general (typically only some combination of html, js, php, css, etc.) but tend to be more automatic. MuMaMo is fairly solid.
I’m giving a Papers We Love talk in NYC tomorrow night: https://www.meetup.com/papers-we-love/events/184704082/
I just finished a hopefully-correct implementation of the algorithms described in the paper, here: https://github.com/leifwalsh/rmq
I’m continuing work on merging MongoDB 2.6 work into TokuMX. It’s not glamorous but it’s well worth doing. The changes in MongoDB 2.6 are largely refactoring efforts, but they’re leading to some good code cleanup (though some is in danger of introducing performance regressions, we’re being really careful about that) and some nice features or general improvements. In particular, the routing in the sharding layer has gotten a lot better, it’s more careful, handles batches much better, and issues operations to multiple shards concurrently, which has the potential for massive throughput improvements, especially for hashed sharding.
I saw a talk Justin Sheehy gave at some point I think last year (or maybe the one before) where he mentioned the CAP theorem, but stressed that it should be thought of not as a dichotomy, but as a continuum. The words he used were “harvest” and “yield” which I believe come from Harvest, Yield, and Scalable Tolerant Systems.
I’m not a big fan of the names “harvest” and “yield” (mostly because they both mean about the same thing to me in terms of farming, so I struggle to remember which is a metaphor for which systems concept), but the important thing is that it’s more of a sliding scale than most people let on when they talk about CAP.
PACELC does a nice job of saying “during a partition, how does the system trade off between availability and consistency”, but I don’t think we need the new acronym. For one thing, it’s longer and less pronounceable.
What we really need to do is talk about how there is a trade-off around how much a system wants to communicate globally to handle client requests, and what happens when it can’t. More communication and tight control creates a more CP system that probably has more latency, since operations require more network trips. Less communication and looser control creates a more AP system that probably has less latency but allows state on different servers to diverge for a while.
I still will want to call this “CAP” because that’s a nice name.
You’re definitely correct that there’s a continuum. There is a lot of interesting research around the different models that can be offered with different levels of coordination, and how seeing CAP as a dichotomy misses a lot of opportunities to build systems that are practically more useful.
On the other hand, there are some real discontinuities on this spectrum. Various named consistency models (in both the ACID and CAP senses of consistency) have absolute minimum amounts of coordination that are required to achieve them. Linearizability, for example, is a local model which requires strong coordination within the replicas of a single “value”, but no coordination across “values”. Snapshot Isolation, on the other hand, requires less coordination between replicas (but still quite a lot), but significant coordination across “values”.
Two things in TokuMX:
I’ll be at MongoDB World with the rest of the TokuMX team. If you’re in town, come say hi! We just released TokuMX 1.5 last week (nice stuff for time-series applications in there), and a brand new docs page that I’ve been working on for about a month: http://docs.tokutek.com/tokumx.
Working on beefing up the documentation for TokuMX. We’re releasing version 1.5.0 later this week and I’m hoping to meet that deadline with the new documentation. Pretty excited, it’s work long overdue.
Most NoSQL databases avoid transactions (whether multi-key or multi-statement) to reduce the amount of consensus that must be reached in the cluster, because consensus is expensive in terms of network traffic, and hurts latency.
I don’t know that much about MarkLogic’s clustering, and I’m finding it hard to learn much in 10 minutes of googling. Do you know what kind of clustering MarkLogic supports? Does it allow for sharding? What kinds of guarantees does it give to transactions that need to touch multiple nodes? If you use transactions like this, how do they perform?
Here’s the part that bugs me:
it provides multi-statement transaction capabilities to the Java developer without sacrificing the other benefits we have come to expect from NoSQL, such as agility and scale-out capability across commodity hardware
TANSTAAFL, so I find it really hard to believe this statement. There’s a tradeoff somewhere and this article isn’t admitting it. Does anyone know what the tradeoff is?
For reference, the product I work on, TokuMX, also is a NoSQL database that supports ACID transactions, but we limit what you can do with them if you’re sharding (currently looking at reducing those restrictions, email me if you want to work on those kinds of things).
@aphyr: is MarkLogic on the Jepsen list? I’d be curious to see what you can figure out.
Got TokuMX 1.4.2 out the door, working on features and code review for 1.5, and writing more documentation. Hoping to finalize the TokuMX-Datadog integration this week as well.
github, emacs, tmux, zsh, cmake, ninja, markdown, golang, clang, arch, irccloud, buildbot, freshdesk, google docs
I was thinking of http://hugin.sourceforge.net/ :-/
Among other things, working on integrating DataDog with TokuMX: https://github.com/Tokutek/dd-agent
So far, really enjoying the sandbox nature of DataDog, and the power of their “graph any expression composing data series you have” feature.
I’m going to San Francisco to give a series of talks, if you’re in SF or the bay area, come say hi! I’ll be around through Sunday if anyone wants to grab a beer, find me on twitter or something.
TokuMX SJC: http://www.meetup.com/TokuMX-SJC/events/209263912/
Just an introduction to TokuMX, what it is, what it do.
TokuMX SFO: http://www.meetup.com/TokuMX-SFO/events/201774892/
An in-depth discussion of how you build a Fractal Tree and why it works, from a write-optimization standpoint. Also, how LSM trees work and what’s different about Fractal Trees.
Papers We Love SF: http://www.meetup.com/papers-we-love-too/events/197577972/#description-tab
I’ll be talking about a neat little tree processing algorithm I like.