https://github.com/begriffs/pg_rational
I wanted to create a custom type for precise fractions in postgres. They are useful for making user-defined row ordering (like a todo list) that you can rearrange without having to renumber surrounding items.
This is my first extension, and also the first time I’ve used C in many years. I wrote a lot of tests for this extension, and would love a code review.
Some coding questions that came to mind:
- I use pointers in many places where I could have passed a 128-bit argument by value. Does this give a speed boost? Is it worth the messier looking code?
- How portable is my use of
__int128_t and __builtin_saddl_overflow?
- I have guards for dangerous two’s complement operations, but do some CPUs use another representation that I ought to respect with conditional compilation?
- In a few places I turned some loops into a goto construction. Sounds nasty, I know, but I thought it helped emphasize that the common code path is not to loop at all.
- Coding conventions: naming, indentation, etc.
- Any optimizations I’m missing?
- Any bugs?
(Cross-posted on r/PostgreSQL)
Not very many places, just
add,mul,cmp, andmediant. Bothnegandsimplifydepend on mutating their arguments.No. All your call sites will be inlined and those struct accesses will be turned into direct register manipulation, whether they’re pointers or not.
On the contrary, all your preamble calling memcpy on PG_GETARG will be slower because you’re putting useless junk on the stack for no reason. Again, the compiler is just going to inline everything and use registers anyway, and if it decides not to then it can save the values on the stack itself. You’d be better off doing this:
It’s not really messier. Change some
->to., drop some&, add some different&, no problem.There are probably intrinsic replacements for what you’re doing.
It’s not nasty at all, these are perfectly reasonable uses of goto.
Love it! Super useful, and thanks for including the how-to, too. I just created a pull request on Github to fix a typo.
Would you mind explaining how that works?
Suppose I used integers to keep track of the order of todos:
If I wanted to insert a new value E between A and B when ordering by prio, I’d have to do:
Basically I have to move a bunch of items ahead to make room for E. Whereas if I number the rows with fractions I can find a place for E without moving any other items.
If I understand you, then you could, e.g., set the priorities of A, …, D to be 1, 1/2, 1/3, 1/4, respectively. Then the priority of E could be (1 + 1/2) / 2 = 3/4. Generally, the priority of an item inserted between items x and y would be the average of their priorities. Super neat!
Edit: I now see your
rational_intermediatefunction, and realize that my approach definitely wouldn’t scale. E.g., my method to find a priority between 3/4 and 1/2 would give a denominator of 8. In the pathological case, these denominators could quickly become too big to efficiently represent.Averages were my first attempt as well. Turns out clockmakers in the 1800s were pretty ingenious, though. :)
You eventually need to normalize the fractions somehow and I don’t see this in your work or documentation.
Andrew (the author of the “User-specified ordering with fractions” wiki page) has given a query to update all fractions across the set of rows to be ordered. That query could be further simplified if you use deferred exclusion constraint instead of a unique constraint, though he prefers the latter.
OK, good call I’ll implement a similar function for that. As a side note, I wonder how long it takes before the growing terms of the fractions created by rational_intermediate() begin to cause problems and should get renormalized?