1. 15
  1.  

  2. 5

    I would be more attracted to the book if it built on its own conceptual merits, rather than through a critical perspective of object-oriented programming.

    Object orientation remains one of the best alternatives for modeling Abstract Data Types, development and runtime frameworks, simulations, process and control systems, and complex problem domains outside the domains of computing, computer infrastructure and the data sciences.

    What is data-oriented programming best suited to?

    How does it allow me to better represent the complexity of, say, a payroll system in code? In terms of design, how does it differ from the relatively discredited decade of large enterprise applications built with Structured Analysis, which attempted to model complex systems as data and the flows that transformed them?

    1. 1

      Please don’t read the book as a critique of OOP.

      The goal of the book is to illustrate how one could apply the principles of Data-Oriented programming in any language (whether it is OOP or FP) in order to reduce the complexity of the system.

      Data-oriented programming is best suited to build information systems.

    2. 2

      Thanks for sharing your book! I really enjoyed reading the discussions here as well. As a past Clojure dev and a current Rust dev, I can start to see the ways DOP & DOD are contrasted between two very different languages.

      1. 1

        Could you share your insights about the ways DOP & DOD are contrasted?

        An example: the struct of arrays pattern from DOD is beneficial in terms of performance but not in terms of reducing complexity. It means it is not a good fit for DOP.
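
        To make the contrast concrete, here is a minimal sketch of the two layouts in JavaScript. The `booksWritten` counts and all helper names are made up for illustration; they do not come from the book.

        ```javascript
        // Array of structs: each record is a self-contained map.
        const authorsAoS = [
          { firstName: "Alan", lastName: "Moore", booksWritten: 2 },   // made-up count
          { firstName: "Dave", lastName: "Gibbons", booksWritten: 1 }, // made-up count
        ];

        // Struct of arrays: one array per field.
        // A record only exists implicitly, as "index i across all arrays".
        const authorsSoA = {
          firstName: ["Alan", "Dave"],
          lastName: ["Moore", "Gibbons"],
          booksWritten: [2, 1],
        };

        // With AoS, a record can be handed to any generic function as-is.
        function fullNameAoS(author) {
          return author.firstName + " " + author.lastName;
        }

        // With SoA, every consumer has to know the layout and carry an index around.
        function fullNameSoA(authors, i) {
          return authors.firstName[i] + " " + authors.lastName[i];
        }

        // Aggregating one field reads a single array in SoA,
        // which is where the performance argument comes from.
        const totalSoA = authorsSoA.booksWritten.reduce((sum, n) => sum + n, 0);
        const totalAoS = authorsAoS.reduce((sum, a) => sum + a.booksWritten, 0);
        ```

        Both layouts hold the same information, but only the first one lets a single record travel through the program as an ordinary value.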

      2. 2

        Haskell also is very prone to this style of programming.

        I find it really useful for my personal projects but unfortunately have never been able to “sell it” to any team I have worked with. Maybe this book can bring some awareness into the benefits of designing code with pure data flow transformations in mind.

        1. 2

          Hopefully!

        2. 1

          I don’t know much about data-oriented programming, but it seems popular among some game developers. Cool to see that a book is being written on it though, I might give it a read to see what the fuss is about.

          1. 7

            The programming paradigm used by game developers is called Data-Oriented design, whose main purpose is to improve the performance of an application.

            The book is about Data-Oriented programming, a paradigm aimed at reducing the complexity of software systems.

            More about the distinction between the two in this article.

            1. 2

              Really? Those terms are so similar, my bad. That’s quite interesting though, I wonder how it works. Guess I’ll take a look at those preview chapters. Thanks for clearing that up.

              1. 3

                Uhh, when I look at the table of contents, this book is indistinguishable from one about functional programming. Functional programming can be viewed as data-oriented as well – it’s two sides of the same coin.

                Rich Hickey advocates programming with “functions and data”. The book has chapters about persistent and immutable data structures, and the author lists Clojure in his experience.

                So I hope that he explains somewhere why a new term is necessary. I don’t think it is, but I’m not interested enough in the book as is to make it worth finding out.

                1. 2

                  DOP is not a new term and it is not the same as FP.

                  1. The term Data-Oriented programming was coined in the 2000s by Eugene Kuznetsov.
                  2. Clojure was the first programming language to embrace DOP. I don’t think that other FP languages embrace data in the same way as Clojure.
                  3. In a sense, the purpose of the book is to illustrate how to apply DOP to languages other than Clojure.

                  1. 2

                    Clojure was the first programming language to embrace DOP.

                    I’m trying to understand in what way DOP differs from more traditional functional approaches. Erlang is the one I am most familiar with, and between persistent data structures and structural pattern matching, it seems to match DOP just fine. How is Clojure different?

                    1. 1

                      Does Erlang give you a generic access to the data via its information path?

                      I’ll give you an example from my book. Consider a simplistic representation of library catalog data:

                      var catalog = {
                          "books": [
                            {
                              "title": "Watchmen",
                              "publicationYear": 1986,
                              "authors": [
                                {
                                  "firstName": "Alan",
                                  "lastName": "Moore"
                                },
                                {
                                  "firstName": "Dave",
                                  "lastName": "Gibbons"
                                }
                              ]
                            }
                          ]
                      }
                      

                      Assume you’d like to retrieve the first name of the first author of the first book. The information path is: [0, "authors", 0, "firstName"].

                      In DOP, we access the information with code like this:

                      get(catalog, [0, "authors", 0, "firstName"])
                      

                      Accessing data in DOP via its information path requires only knowledge about the structure of the data (basically field names).

                      As a consequence, we leverage general-purpose data manipulation functions (provided by the language or by third party libraries) to write our business logic.

                      It makes a huge difference!

                      1. 4

                        To be honest that explanation still feels extremely abstract to me, as it’s not clear what “get” actually represents in your description, but in Erlang that kind of “path” can be handled using function argument destructuring. In the following example, the function get_thing accepts an argument that contains a structure corresponding to your object of nested maps and lists, and pulls out the interesting bit without any code in the actual function body, binding it to the variable FirstName which is then printed out in the single line in the function body.

                        % this is a comment, starting with a % character
                        
                        % the following is the main function, and we don't care
                        % about the arguments passed to the function so we use
                        % an underscore instead of naming them or destructuring
                        % them. Underscores will be used later to ignore the
                        % "rest of the list" in the pattern match that handles
                        % our "path".
                        
                        main(_) ->
                            % Catalog corresponds to "var catalog" in
                            % the above example. Variables are
                            % capitalized in Erlang.
                            Catalog = #{
                              "books" => [
                                #{
                                  "title" => "Watchmen",
                                  "publicationYear" => 1986,
                                  "authors" => [
                                    #{
                                      "firstName" => "Alan",
                                      "lastName" => "Moore"
                                    },
                                    #{
                                      "firstName" => "Dave",
                                      "lastName" => "Gibbons"
                                    }
                                  ]
                                }
                              ]
                            },
                            
                            % The final line of an Erlang function ends with a period.
                            % Others before the end end with a comma.
                        
                            % passes the Catalog variable to the get_thing function
                            get_thing(Catalog).
                        
                        % the following is a function that uses 
                        % destructuring to pull apart the complex
                        % argument and binds the interesting item
                        % to the variable named "FirstName",
                        % and then prints it out.
                        
                        get_thing(#{"books" := [ #{"authors" := [#{"firstName" := FirstName} | _]} | _]}) ->
                          io:format("first name of first author of the first book: ~p\n", [FirstName]).
                        

                        It’s not clear to me what a non-data-oriented programming approach to this would be, though. A helpful idea from Saussure (I think): words are meaningful only due to how they are different from other words.

                        1. 2

                          Wouldn’t the ‘information path’ in this case be ["books", 0, ...]? Anyway, I think most people would be more familiar with the notation catalog.books[0].authors[0].firstName, but that’s a small nitpick. I think a more interesting point is that to a ‘dyed in the wool’ functional programmer, this would be an example of where one could use lenses to compose together powerful data access patterns, e.g. in Haskell notation imagine you have some basic lenses on the above data structures, you could compose them like:

                          (firstName.first.authors.first.books) catalog
                          

                          (The words inside the parentheses are the lenses and they’re read from right-to-left.)

                          Gabriel Gonzalez has a very cool blog post about the power of lenses: https://www.haskellforall.com/2013/05/program-imperatively-using-haskell.html

                          1. 3

                            You are right: the correct information path is ["books", 0, "authors", 0, "firstName"].

                            There is an important difference between the ability to access any piece of data via its information path (like in get(catalog, ["books", 0, "authors", 0, "firstName"])) and the familiar notation that you mentioned (like in catalog.books[0].authors[0].firstName).

                            The difference is that the information path is a first-class citizen (it’s nothing more than an array!) that can be manipulated programmatically.

                            For example, we can:

                            1. Pass the information path as an argument to a function
                            2. Store the information path in a variable
                            3. Count the number of accesses per information path
                            4. Use the information path as a key for a cache
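
                            A minimal sketch of those four uses in JavaScript. The `get` and `cachedGet` functions here are hypothetical helpers written for this example, not an API from the book or from any library, and the cache assumes a single catalog that never changes.

                            ```javascript
                            // The catalog from the example earlier in the thread.
                            const catalog = {
                              books: [
                                {
                                  title: "Watchmen",
                                  publicationYear: 1986,
                                  authors: [
                                    { firstName: "Alan", lastName: "Moore" },
                                    { firstName: "Dave", lastName: "Gibbons" },
                                  ],
                                },
                              ],
                            };

                            // Hypothetical generic getter: walks the path one key at a time
                            // and yields undefined as soon as a step is missing.
                            function get(data, path) {
                              return path.reduce(
                                (node, key) => (node == null ? undefined : node[key]),
                                data
                              );
                            }

                            // 2. Store the information path in a variable: it is just an array.
                            const firstAuthorFirstName = ["books", 0, "authors", 0, "firstName"];

                            // 3. and 4. Count accesses and cache values, both keyed by the path.
                            // JSON.stringify turns the array into a string usable as a Map key.
                            const accessCounts = new Map();
                            const cache = new Map();

                            // 1. Pass the information path as an argument to a function.
                            // Assumption: one immutable catalog, so the path alone identifies a value.
                            function cachedGet(data, path) {
                              const key = JSON.stringify(path);
                              accessCounts.set(key, (accessCounts.get(key) || 0) + 1);
                              if (!cache.has(key)) {
                                cache.set(key, get(data, path));
                              }
                              return cache.get(key);
                            }

                            const first = cachedGet(catalog, firstAuthorFirstName);
                            const second = cachedGet(catalog, firstAuthorFirstName);
                            ```

                            None of this is possible with the literal expression catalog.books[0].authors[0].firstName, because that expression is code rather than a value.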

                            Would you say that with lenses, the information path is a first-class citizen?

                            1. 1

                              Information paths remind me of key paths in Swift. Given a struct S, you can define a key path for S.member, store it, pass it and use it as a subscript for any value of S.

                              1. 1

                                Yes, lenses are normal values in the program, so they are very much first-class.

                        2. 1

                          I think it’s weird that you use Clojure as an example of Data-Oriented Programming, seeing as it is a peculiar dialect of Lisp. Lisp is famous for blurring the line between code and data, which allows for a powerful macro system. Can you elaborate on this?

                          1. 3

                            Clojure is not just an example of Data-Oriented Programming. Clojure is, as far as I know, the first language to embrace DOP at the level of the language.

                            Clojure’s native data structures (maps, vectors, sets etc.) are persistent and their implementation is efficient both in terms of memory and computation. Since then, the implementation of Clojure’s persistent data structures has been ported to other programming languages (e.g. Immutable.js for JavaScript, Paguro for Java). As a consequence, DOP is applicable to other languages.

                            Clojure is also a LISP and as such it is a homoiconic language. However, the fact that code and data share the same syntax doesn’t mean that the line between code and data is blurred. It means that we can write macros that manipulate code as if it were data.

                            In DOP, we separate code from data in the sense that data is not encapsulated into objects or lexical scope. We prefer to receive the data that we manipulate as an explicit argument.
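
                            As a rough illustration in JavaScript (the `makeAuthor` and `fullName` names and the library member record are hypothetical, chosen only for this sketch):

                            ```javascript
                            // Encapsulated style: the data lives in the closure's lexical scope,
                            // so only the functions returned here can ever read it.
                            function makeAuthor(firstName, lastName) {
                              return {
                                fullName: () => firstName + " " + lastName,
                              };
                            }

                            // DOP style: the data is a plain map and the code is a standalone
                            // function that receives the data as an explicit argument.
                            function fullName(person) {
                              return person.firstName + " " + person.lastName;
                            }

                            const encapsulated = makeAuthor("Alan", "Moore");
                            const author = { firstName: "Alan", lastName: "Moore" };
                            // Hypothetical library member: a different entity with the same field names.
                            const member = { firstName: "Jane", lastName: "Doe", memberId: 42 };

                            const names = [
                              encapsulated.fullName(),
                              fullName(author),
                              fullName(member),
                            ];
                            ```

                            The standalone function works on any map that has the two fields, while the closure version only works on the data it captured.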

                            I hope it makes things a bit clearer.

                            For more details, you can watch my talk “Data-Oriented programming: the secret sauce that makes Clojure systems less complex”

                            1. 2

                              The Erlang behaviors also force you to be very explicit about the state involved, causing overly complex architectures to feel kind of painful until you refactor them to be simpler. This downward pressure imposed by the language is one of my favorite aspects of it. Most languages pride themselves on making abstraction as easy as possible, but it’s hazardous to make abstraction so easy that you forget about the risk accumulating in the system with more code and data.

                        3. 1

                          I agree, it seems quite similar. From the draft Wikipedia article he links to in his article, it seems as if Data-Oriented Programming likes more general data structures. Maybe they’re opposed to strong type systems? I can’t really tell myself. It also says “Also, in FP usage of lexical scope could break the clear separation between code and data that DOP requires”, which just makes me more confused. Do they love global state?

                          1. 2

                            Indeed, DOP is a more natural fit for dynamically-typed languages, although I believe it could be applied to statically-typed languages as well.

                            According to DOP, functions should receive the data they manipulate as an explicit argument.

                            We are allowed to represent the whole state of the system as a (big nested) hash map. But even then, the state is passed as an explicit argument to functions that access the state.
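
                            A small sketch of what that could look like in JavaScript. The shape of the state and the `countBooks` and `addBook` names are hypothetical, and returning a new version instead of mutating is an assumption made here, in line with the immutable data structures mentioned elsewhere in the thread.

                            ```javascript
                            // Hypothetical system state: one big nested map.
                            const initialState = {
                              catalog: { books: [{ title: "Watchmen", publicationYear: 1986 }] },
                              members: [],
                            };

                            // Reads take the state as an explicit argument
                            // instead of reaching for a global variable.
                            function countBooks(state) {
                              return state.catalog.books.length;
                            }

                            // Updates also take the state explicitly and return a new version,
                            // leaving the version that was passed in untouched.
                            function addBook(state, book) {
                              return {
                                ...state,
                                catalog: {
                                  ...state.catalog,
                                  books: [...state.catalog.books, book],
                                },
                              };
                            }

                            const nextState = addBook(initialState, {
                              title: "A hypothetical second book",
                            });
                            ```

                            Nothing in these functions refers to a global: the caller decides which version of the state each function sees.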