1. 6

Improvements welcome :).

  1.  

  2. 9

    Thoughts:

    • Optimising on the order of hundreds of bits doesn’t feel so meaningful when practically speaking there will be comparatively enormous overhead involved in transferring product information (i.e. what’s behind the identifier). Given any modification to an item whatsoever will involve a new identifier, you will need to grow that field’s size, fast, and every time will require updates to the mapping table to be distributed to end clients.
    • It’s strange that you identified “stock” as being as important as the item description (identifier) and price – 99% of online stores don’t trade in limited, known inventory sizes.
    • Who says every possible currency has a fraction divided into 100 parts? Many do not have any fraction at all.
    • Glossing over the (necessary) field size descriptions toward the end feels counter to the nature of the exploration, especially given they will change as the inventory expands or different currencies are used.
    • When we’re talking about entries this small, transport methods are really important; a TCP packet retry could easily quadruple the size of an item listing request response. Maybe the extreme economy here is because it wants to be used over I2C. You may suggest UDP or something else entirely.
      • Basically, when we’re talking about saving a handful of bits at a time, whatever real world use case might be trying to do that will care very much about the transport, and a design document like this would be lacking if it didn’t explore that aspect too.
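    To make the currency point concrete, here is a minimal sketch of converting a price into whole minor units when the number of fractional digits depends on the currency. The table is a small illustrative subset, and the function name `to_minor_units` is hypothetical rather than anything from the article.

```python
from decimal import Decimal

# Fractional digits per currency (illustrative subset, not a complete table).
# JPY has no minor unit; KWD is divided into 1000 parts, not 100.
MINOR_UNIT_EXPONENT = {"USD": 2, "EUR": 2, "JPY": 0, "KWD": 3}


def to_minor_units(amount: str, currency: str) -> int:
    """Convert a decimal string such as '12.50' into an integer count of minor units."""
    exponent = MINOR_UNIT_EXPONENT[currency]
    scaled = Decimal(amount).scaleb(exponent)  # shift the decimal point right
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has too many fractional digits for {currency}")
    return int(scaled)
```

    A fixed "two digits after the point" field would be wasted on JPY and too narrow for KWD, so the price field width is tied to the currency in use.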

    It feels like the theoretical nature of the survey has led to this being so abstract that it doesn’t represent much of anything.

    edit: I should add, I like that you’ve done this at all. I agree with the general “wat” sentiment from @james, but also, why not explore the extremes and edge cases of economy?

    1. 2

      Optimising […]

      It’s for fun, come on :).

      99% of online stores don’t trade in limited, known inventory sizes

      Most I’ve been to actually do tell you how much is left; even Amazon does this.

      […] transport methods […]

      I mention that in the last bit; both are free to use whatever is the most proficient for them. Really the whole thing presumes they both use the most efficient transport method so we can focus on the actual store information.

      1. 1

        It’s for fun, come on :).

        Sure, and my reply is intended in the same manner :) If the point of the fun is to optimise at the level of bits, then we should hold ourselves to that same rigorous standard evenly!

        I mention that in the last bit; both are free to use whatever is the most proficient for them. Really the whole thing presumes they both use the most efficient transport method so we can focus on the actual store information.

        I know, I was responding to that! My point is, you’re doing yourself a disservice by not including discussion of it, because abstraction gaps like that aren’t so neat when you’re talking about the real world. If the example is completely divorced from any concrete representation, then the question arises of why we’re packing into minimal bitfields in the first place, y’know?

        1. 2

          I’ll be honest, your original response did not feel as light-hearted as this one :).

          Totally understand what you mean, and really, I agree. The issue is, it’s extremely difficult to analyze those parts because people will then complain we are comparing apples to oranges. So we need to generalize as much as we can, and remove as much as we can from the problem we’re tackling, and then build on top. So maybe the next fun article could be the smallest possible transport protocol - then both the traditional way and the smallest way could use this.

          To me exploring these smaller parts in isolation, then combining them, can be really fun and sometimes something useful can come out of them. I don’t expect anyone to ever implement this, but at least it’s written to be analyzed and criticized.

    2. 6

      That’s not a ‘store’, that’s a ‘database’. I can’t see stuff in it, I can’t buy stuff from it, it doesn’t provide any legally/financially required paper trails, etc. Choosing to include current stock level but not including the transactions themselves is arbitrary.

      If you want to reduce your store database even further: sell omnipotent singular integers. Price is the value of the integer, and there is only one of each (positive) integer available, so once it’s sold you delete (or add) it to the database. Choose the number of bits for integer size that takes your fancy (slightly above your highest probable sale value could work).

      Fundamentally: stores are required to be heavily interconnected with lots of things. An isolated store component is not a store, it needs to be connected to visualisations (list, search), user-editing (transactions), user verification, and admin-editing at an absolute minimum.

      1. 1

        I can’t see stuff in it, I can’t buy stuff from it,

        It’s up to the client to decide how to render the items.

        it doesn’t provide any legally/financially required paper trails, etc. Choosing to include current stock level but not including the transactions themselves is arbitrary.

        I just didn’t think that’d be important, since banks track this information. Why duplicate it?

        If you want to reduce your store database even further: sell omnipotent singular integers. Price is the value of the integer, and there is only one of each (positive) integer available, so once it’s sold you delete (or add) it to the database. Choose the number of bits for integer size that takes your fancy (slightly above your highest probable sale value could work).

        This is essentially what I proposed? Except the “price is the value of the integer”, because many things can be priced the same.

        Fundamentally […]

        I would say no, just listing what is available is all that’s needed. User verification should be taken care of by another service. Editing is an entirely different aspect too.


        All in all great comments :)!

      2. 9

        wat

        1. 2

          Indeed.

        2. 4

          Oh, that’s a trip!

          A schema for some things you might expect in a normal e-commerce system:

          • Customer profiles (at their simplest, a customer identifier, access token/mechanism)
          • Inventory profiles (inventory identifier a.k.a. SKU or GTIN, human-readable name, product description, pictures)
          • Review profiles (customer identifier, inventory identifier, review body)
          • Stock information (inventory identifier, quantity)
          • Transaction profile (customer identifier, inventory identifier(s), payment identifier, shipping identifier, transaction state, discount identifier(s))
          • Payment profile (payment identifier, payment details–card number/bitcoin address/whatever)
          • Shipping profile (shipping profile identifier, customer identifier, shipping details)
          • Discount profile (discount identifier, discount type, discount restrictions predicate, discount details)
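          A few of the profiles in this list could be sketched as plain records. This is only an illustration: every field name and type here is an assumption, and the review, payment, shipping, and discount profiles are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    customer_id: int
    access_token: str


@dataclass
class InventoryItem:
    sku: str                      # inventory identifier (SKU or GTIN)
    name: str                     # human-readable name
    description: str = ""
    pictures: list[str] = field(default_factory=list)


@dataclass
class Stock:
    sku: str
    quantity: int


@dataclass
class Transaction:
    customer_id: int
    skus: list[str]               # inventory identifier(s)
    payment_id: int
    shipping_id: int
    state: str = "pending"        # transaction state (value is an assumption)
    discount_ids: list[int] = field(default_factory=list)
```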

          The above would get you to a decent version of a store, although it’d be severely under-prepared for the types of analytics you’d want to run on it. Also, one of the core issues with these sorts of systems is at what point the developers say “okay, fuck it, these really are interlocking state machines” and then inevitably switch to a proper event-sourcing model.

          Anyways, that’s just background (from doing some e-commerce things).

          ~

          Going along with the spirit of the post, maybe we should pick one section of the experience (back-office record keeping, front-office user interactions, etc.) to focus on. The original article kinda focused on back-office stuff, so that’s what I’ll talk about.

          The schema the article picked for representing stock is a list of 3-tuples {identifier, stock quantity, stock price}. It’s super-limited, but I think you could totally make it work for a (very) barebones store.

          We can golf this further though!

          First thing to do is remove complexity. We’re going to switch away from “stock prices” to tokens, because humans are suckers for token economies. This also means we don’t have to deal with exchange rates or precision or anything else–we’re just going to quantize stock prices into integer tokens. Switching from real dollarydoos to tokens happens elsewhere so we can ignore it.

          Next, let’s unpack the idea of “stock quantity”. The only reason we have that is because the naive representation of “here is a list of every item for sale, literally every instance of every item” would take up too much space (though it is attractive for other reasons). So, the “stock quantity” can really be thought of as a helper for what is effectively run-length encoding of the master list of item instances available. Anyways, to clean up the point the author made, this takes about log2(quantity) bits of storage, rounded up to a whole number of bits. If we’re willing to limit the number of items we can hold (say, no more than 10 bathtubs at a time) this can be nice and fixed; otherwise, we’ll have to use a variable-length encoding.
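          The two options in this paragraph can be sketched as follows. `bits_for` gives the fixed field width for a capped quantity, and `encode_varint` shows one possible variable-length encoding (a LEB128-style scheme with 7 payload bits per byte). The comment does not say which variable-length encoding it has in mind, so that choice is an assumption, as are both function names.

```python
def bits_for(max_value: int) -> int:
    """Bits needed for a fixed-width field holding any integer in 0..max_value."""
    if max_value < 0:
        raise ValueError("max_value must be non-negative")
    return max(1, max_value.bit_length())


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer with 7 payload bits per byte (LEB128-style).

    The high bit of each byte says whether another byte follows.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte
            return bytes(out)
```

          With a cap of 10 bathtubs, `bits_for(10)` gives a 4-bit field, since log2(10) is a little over 3.3 and has to round up. The varint trades that tight fit for the ability to grow.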

          The next shaving point is the price. If we’ve already quantized everything into tokens, we can organize things into lists of similar token counts so we don’t have to store that for every item. We have the list of 1 token items, the list of 2 token items, etc. etc. I posit the storage of these list labels (read: keys in a dictionary mapping tokens to lists of items) is negligible compared to the overall number of items for sale.
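          A minimal sketch of that regrouping, assuming the starting point is the article's list of (identifier, quantity, price) tuples with prices already quantized to whole tokens. The name `group_by_price` is hypothetical.

```python
from collections import defaultdict


def group_by_price(items):
    """Group (identifier, quantity, price_in_tokens) tuples by price.

    Returns a dict mapping each token price to a list of (identifier, quantity),
    so the price is stored once per list instead of once per item.
    """
    price_lists = defaultdict(list)
    for identifier, quantity, price in items:
        price_lists[price].append((identifier, quantity))
    return dict(price_lists)
```

          The saving only holds while the number of distinct prices stays small relative to the number of items, which is the assumption the comment states.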

          Next, the identifier again need only be a variable-length integer, with its bit size given by log2(number of distinct items for sale).

          Cool, so now we’re down to {identifier, quantity}, with the expectation the price can be inferred from which list you pulled the item from. The size (in bits) is {log2(total identifiers), log2(quantity)}.
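          As a rough illustration of what one price list looks like at this point, the sketch below packs {identifier, quantity} records into a single integer using fixed field widths and reports the total size in bits. Fixed widths are used here for simplicity instead of the variable-length integers mentioned earlier, and both function names are hypothetical.

```python
def pack_records(records, id_bits, qty_bits):
    """Pack (identifier, quantity) pairs into one integer, first record in the high bits.

    Returns (packed_value, total_bits).
    """
    packed = 0
    for identifier, quantity in records:
        if not 0 <= identifier < (1 << id_bits):
            raise ValueError(f"identifier {identifier} does not fit in {id_bits} bits")
        if not 0 <= quantity < (1 << qty_bits):
            raise ValueError(f"quantity {quantity} does not fit in {qty_bits} bits")
        packed = (packed << (id_bits + qty_bits)) | (identifier << qty_bits) | quantity
    return packed, len(records) * (id_bits + qty_bits)


def unpack_records(packed, count, id_bits, qty_bits):
    """Inverse of pack_records; the record count must be known to the reader."""
    width = id_bits + qty_bits
    records = []
    for _ in range(count):
        chunk = packed & ((1 << width) - 1)
        records.append((chunk >> qty_bits, chunk & ((1 << qty_bits) - 1)))
        packed >>= width
    records.reverse()
    return records
```

          Note that the reader needs the record count and both field widths from somewhere, which is the kind of overhead the thread argues about.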

          If we want to be really mean about it, we can make the business decision that we drop-ship everything, so “quantity” doesn’t actually matter–we order direct from the manufacturer anyways and keep no stock on hand. So, now we’re down to just a dictionary of lists of {log2(total identifiers)}.

          ~

          I hope this got across the point that real optimizations are often unlocked in the business domain, and not exactly at the software level. :)

          1. 2

            First off, I want to say I really appreciate that you got the point of the post. I don’t know if it’s because there are a ton of egos, or because everyone expects serious content, but it really leaves a bad taste in my mouth when no one knows how to have fun. Thank you for having fun with me!

            Now your analysis: excellent idea on just using tokens for pricing - it removes the whole tax issue the other user pointed out and no more currency bits are needed.

            The rest of the analysis seems pretty much what I was describing; I was not sure if variable length integers would be more efficient but I guess they would be!

            I like the idea with assigning items to a price list too.

            Drop-shipping is a good idea too, but I think it leans too heavily on a third party. Maybe this is just an arbitrary worry.

            I will add your improvements to the piece :D


            And yes, I know for a “fully featured” store, there is a ton to be considered. I really wanted to focus on the most minimal representation, as you picked up on. :)

            1. 1

              I wish we had more fun posts like this. :)

              1. 1

                I’ve got some in the pipeline so keep watch :P. Actually http://len.falken.ink is an RSS feed to subscribe to, if you use an RSS reader.

          2. 3

            Kind of missing information about who is buying what, and how to deliver the stuff to them…

            1. 1

              Great point; but I’ll ask: doesn’t the transaction information store this? It could store the item identifier to indicate what is being bought, and everything else is “free”.

              How to deliver the stuff is definitely something which should be added!

            2. 2

              Did you just share a bunch of basic maths?

              I think you might want to look more into what kind of information one actually wants to store in a store.

              1. 1

                In a world without taxes, this would have one less problem.

                1. 1

                  Yeah it seems global price calculation is what throws a big wrench into things.

                  1. 2

                    No, I didn’t even mean global.

                    In Europe every country has a different tax rate on different goods, e.g. in Germany there’s less tax on basic goods like food (not everything) and there’s a temporal component as currently it’s 16% and not 19% due to Covid (for the higher, default bracket) - but we still have an economic union so ordering and paying in “foreign” countries is easy, usually only shipping is the problem.

                    And afaik in the US it’s completely different based on state, with sales tax and what not. I simply think that view is so overly simplistic that it’s useless for most applications. I have zero clue about web shops and instantly found a few dealbreakers.
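                    The temporal component alone already means a stored price cannot be a single number. A minimal sketch, using only the German standard rate from the comment (16% during the temporary cut, 19% otherwise): the cut-off dates, the half-up rounding, and the function names are assumptions, and the reduced rate for basic goods is not modelled.

```python
from datetime import date

# Assumed window for the temporary cut of the German standard VAT rate.
TEMPORARY_CUT = (date(2020, 7, 1), date(2020, 12, 31))


def german_standard_vat(on: date) -> int:
    """Standard VAT rate in percent on the given day."""
    start, end = TEMPORARY_CUT
    return 16 if start <= on <= end else 19


def gross_price_cents(net_cents: int, on: date) -> int:
    """Net price plus VAT, in cents, rounding the tax half-up (assumed rule)."""
    tax = (net_cents * german_standard_vat(on) + 50) // 100
    return net_cents + tax
```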

                    1. 1

                      There was the good suggestion by friendlysock to instead just use “tokens” which exist in their own economy, so we can ignore it all :D