1. 46
  1. 16

    When writing a small-scale personal project, why bother with the headache of setting up yet another MySQL database, or running MongoDB in a Docker container? Yes, you could use SQLite, and I do find myself doing this on occasion. But SQLite is just a file, as is YAML, so where are the gains?

    Easier parsing, consistent locking, performance?

    1. 4

      That was pretty tongue in cheek but maybe it didn’t come across that way. SQLite is definitely more than just a file.

      • Not sure what you mean by easier parsing, is that by code?
      • I’ve found locking file access with a mutex sufficient
      • Agreed, performance is better with SQLite, but for this use case it wouldn’t realistically make a massive difference. At least, nothing about my apps’ performance makes me think switching to SQLite would give me any discernible boost.
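
      For anyone wondering what the mutex approach in the second bullet might look like, here is a minimal sketch rather than the code from the post. `FileStore`, its methods, and the file name are hypothetical, and it assumes a single process owns the file: a `sync.RWMutex` does nothing for other processes touching the same path.

      ```go
      package main

      import (
      	"fmt"
      	"os"
      	"path/filepath"
      	"sync"
      )

      // FileStore serialises access to one data file within a single process.
      // The type and its methods are hypothetical, not taken from the post.
      type FileStore struct {
      	mu   sync.RWMutex
      	path string
      }

      // Read returns the file's current contents under a shared lock.
      func (s *FileStore) Read() ([]byte, error) {
      	s.mu.RLock()
      	defer s.mu.RUnlock()
      	return os.ReadFile(s.path)
      }

      // Write replaces the file's contents under an exclusive lock.
      // It writes a temporary file and renames it into place, so that on
      // POSIX systems (where rename is atomic) nothing sees a half-written file.
      func (s *FileStore) Write(data []byte) error {
      	s.mu.Lock()
      	defer s.mu.Unlock()
      	tmp := s.path + ".tmp"
      	if err := os.WriteFile(tmp, data, 0o600); err != nil {
      		return err
      	}
      	return os.Rename(tmp, s.path)
      }

      func main() {
      	dir, err := os.MkdirTemp("", "store")
      	if err != nil {
      		panic(err)
      	}
      	defer os.RemoveAll(dir)

      	store := &FileStore{path: filepath.Join(dir, "data.yaml")}

      	// Ten concurrent writers; the mutex makes them take turns.
      	var wg sync.WaitGroup
      	for i := 0; i < 10; i++ {
      		wg.Add(1)
      		go func(n int) {
      			defer wg.Done()
      			if err := store.Write([]byte(fmt.Sprintf("writer: %d\n", n))); err != nil {
      				fmt.Println("write failed:", err)
      			}
      		}(i)
      	}
      	wg.Wait()

      	data, err := store.Read()
      	if err != nil {
      		panic(err)
      	}
      	fmt.Print(string(data)) // contents from whichever writer ran last
      }
      ```

      The store deals in raw bytes on purpose, so the serialisation step (YAML or anything else) stays outside the lock-handling code.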
      1. 13

        I’d rather write SQL and get structured results back than parse out an ambiguous YAML file.

        1. 12

          But then it would be called OrGS, and where’s the fun in that?

          1. 4

            I see your distinction, but if it’s your application that is writing the YAML file, it’s not necessarily ambiguous. Likewise, with libraries that parse YAML into a kv map, it is structured (for varying degrees of “structured”).

            Yes, there is nothing stopping you going in and adding data/fields to a YAML file on disc, but realistically, for most hobby projects, the same is true of the “grown up” DB technology that is used.

            1. 2

              Point taken. A write/read round-trip by your own application can still be ambiguous if you’re not quoting things properly (newlines, quotes, colons, etc.).

              1. 1

                100% true, it’s not infallible and yes it does lack some protections that things like prepared SQL statements give you

            2. 3

              This is not really YAML specific, but I found that document-style databases (MongoDB, a YAML file) tend to be much easier to handle than expected when using a typed language (Golang in the OP’s case). You still need to be kind of careful, but you have some sort of schema at the object level at least, which in most cases is good enough to prevent inconsistent data and other mistakes from sneaking in.

              When it comes to untyped languages, though, I found document stores to be a recipe for disaster no matter how careful you are, and I would agree with your comment 10/10.

            3. 2

              I’ve read and written a moderate amount of YAML, yet I feel like I will never properly learn to distinguish between

              - foo
              - bar
              - baz


              - foo

              This might be a personal failing, but as always, one likes to blame technology in those cases ;) Therefore, if I can, I will choose a language that makes that distinction clear.

              1. 2

                I suppose if you look at it with a more realistic example it makes more sense:

                - name: foo
                - name: bar
                - name: baz


                - name: foo
                  author: Bob
                - name: bar
                  author: John
                - name: baz
                  author: Jeff

                If you put both of those into a YAML to JSON converter you can see the difference.

              2. 1

                They actually have a page on those benefits. One I liked was how it was more resilient during crashes. Their focus on reliability means both their fast path and error handling work better than 99+% of what developers will write themselves. The first link claims they’re faster than file I/O, too, in some cases. The library itself loads fast, too.

                SQLite is probably the best default for storage if speed, reliability, and portability are concerns.

            4. 11

              There seemed to be quite a bit of interest when I mentioned what I was doing in the What are you doing this week? post, so thought I would polish off the blog post sharpish and get it shared!

              I don’t normally write this much about a topic, but I had to stop myself going too in the reeds for this one. If there is a specific subject that people find interesting, I could do a deep dive though.

              1. 2

                “but I had to stop myself going too in the reeds for this one”

                If you ever felt like going far off in the reeds, I would click that link so hard. Did you tweak kern.bufcachepercent or whatever it’s called to keep YAML files in memory?

                1. 2

                  Thanks for the encouragement. I haven’t needed to change it from the default, mainly because the data volumes I have right now are relatively small.

                  One of my hobby projects is a clone of Lobste.rs written in this stack, and while it is functional, it isn’t ready for an onslaught of traffic yet.

                  Memory usage is of course something that gets constant attention though, so if it transpires I need to up kern.bufcachepercent then the option is there like you say.

              2. 6

                Your example program is chock full of data races. Don’t use global variables.

                1. 1

                  I wondered when I would see you here :D I refer you to the callout at the bottom:

                  “This isn’t optimal code, remember earlier how I said I have gotten by in life by hacking things together, but it is elegant enough for me and does the job in a way that I feel happy with.”

                  But thank you for the input, I am always improving and feedback like this helps!

                  1. 5

                    I mean, the design (minus the global variables) is fine enough. But “optimal” and “elegant” are kind of over here (gestures to the left) and “data races” are substantially more this way (gestures to the right). The program violates Go’s memory model, it’s invalid. Data races are always showstopper bugs, you can’t just shrug at them.

                    1. 1

                      I’ll take another look. I think someone else in the comments here or elsewhere mentioned having a Mutex in place for the increment, which is sensible.
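
                      For reference, a mutex around the increment could look roughly like the sketch below. `Counter` and `Next` are hypothetical names standing in for whatever shared state the example program increments; the point is only that every read and write of the shared value happens while the lock is held.

                      ```go
                      package main

                      import (
                      	"fmt"
                      	"sync"
                      )

                      // Counter is a hypothetical stand-in for shared state that several
                      // goroutines (for example HTTP handlers) increment concurrently.
                      type Counter struct {
                      	mu sync.Mutex
                      	n  int
                      }

                      // Next increments the counter and returns the new value.
                      // The lock covers both the write and the read of n.
                      func (c *Counter) Next() int {
                      	c.mu.Lock()
                      	defer c.mu.Unlock()
                      	c.n++
                      	return c.n
                      }

                      func main() {
                      	var c Counter
                      	var wg sync.WaitGroup
                      	for i := 0; i < 1000; i++ {
                      		wg.Add(1)
                      		go func() {
                      			defer wg.Done()
                      			c.Next()
                      		}()
                      	}
                      	wg.Wait()
                      	fmt.Println(c.Next()) // prints 1001
                      }
                      ```

                      Running the program with `go run -race` is a quick way to check whether any unsynchronised access remains.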

                2. 4

                  “I quite like it [YAML] though. It is clear, you can comment it, it is parseable by humans and machines, what is not to like?”

                  Whitespace-sensitivity, made worse by syntax that results in incredible levels of nesting, five hundred ways to declare a multiline string, fifty thousand ways to accidentally convert something to a boolean, a pointer and reference system, …

                  1. 12

                    When it comes to YAML, just say Norway.

                    1. 3

                      Jeremy Null from Norway, also known as Jeremy None from False

                    2. 2

                      You see reasons not to like it, I see challenges. Nesting is suboptimal, I would agree, but for human parsing, tools like yq do make it easier. Arguably easier than ad-hoc extraction of data from a more traditional datastore.

                      1. 5

                        I don’t like to be “challenged” by my data formats. They should IMHO be one of the least exciting parts of a stack.

                    3. 2

                      thx for pointing out relayd.

                      A backend I am working on supports everything I need as far as web serving goes (it has built-in Jetty), runs HTTP/2, and also terminates encrypted traffic. The only thing I use nginx for is, basically, load balancing of HTTPS traffic (without terminating TLS on the nginx side).

                      For the OpenBSD target platform, my Ansible role cannot install nginx (not yet, at least). So instead of working on that, I am thinking of just simplifying and not using nginx… for OpenBSD targets. But I needed a built-in load balancing solution.

                      Your post helped me to discover relayd.

                      By the way, a single ‘stack’ solution for any personal project that needs dynamic features could be just Erlang/OTP. It seems like it does everything one would need and then some.

                      1. 2

                        Well this is certainly interesting, a different style of systems programming. Relayd is new to me; if you’re saying that config is “a pretty consolidated view” I assume one needs to read the manual quite a bit. Knowing OpenBSD (vaguely), I imagine the manual is pretty good, though?

                        YAML seems to be the sacrificial anode [1] of this setup. My own instinct is to just go with PostgreSQL, though I can see the case for SQLite too. Go is just a great default, not much to say about it.

                        I only wonder about scripting: do you really just write that all out in Go, or do you use bash or some OpenBSD shell?

                        [1] in that it will inevitably draw the most criticism and get swapped out first

                        1. 1

                          Consolidated in that it doesn’t contain all the apps I have running on that server, nor IPv6, nor TLS. But if you put that in your /etc/relayd.conf then you’d get the result in the post.

                          YAML is definitely the most divisive component on there. The fact that it’s up and running straight away, schema changes are painless, data can be edited very easily in your favourite $EDITOR, etc. mean it’s a very strong choice for me.

                          Example of scripting: I needed to upload a large number of kv pairs to HashiCorp Vault. I had already written an application that interfaced with Vault, so it was very quick to wrap this in some concurrency magic and do the task, rather than write something less performant in bash or ksh.
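
                          To give a flavour of that “concurrency magic”, here is a rough sketch of a small worker pool, not the actual tool. `uploadAll` and `putFunc` are hypothetical names, and `putFunc` is a placeholder for whatever call writes one pair to Vault; no real Vault client API is used or implied.

                          ```go
                          package main

                          import (
                          	"fmt"
                          	"sync"
                          )

                          // putFunc is a hypothetical placeholder for the call that writes
                          // one key/value pair to the remote store.
                          type putFunc func(key, value string) error

                          // uploadAll sends every pair through put using a fixed number of
                          // workers and returns the errors that occurred, if any.
                          func uploadAll(pairs map[string]string, workers int, put putFunc) []error {
                          	type kv struct{ k, v string }
                          	jobs := make(chan kv)

                          	var (
                          		wg   sync.WaitGroup
                          		mu   sync.Mutex // guards errs
                          		errs []error
                          	)

                          	for i := 0; i < workers; i++ {
                          		wg.Add(1)
                          		go func() {
                          			defer wg.Done()
                          			for p := range jobs {
                          				if err := put(p.k, p.v); err != nil {
                          					mu.Lock()
                          					errs = append(errs, fmt.Errorf("%s: %w", p.k, err))
                          					mu.Unlock()
                          				}
                          			}
                          		}()
                          	}

                          	// Feed the workers, then signal that no more jobs are coming.
                          	for k, v := range pairs {
                          		jobs <- kv{k, v}
                          	}
                          	close(jobs)
                          	wg.Wait()
                          	return errs
                          }

                          func main() {
                          	// An in-memory fake stands in for the remote store.
                          	var mu sync.Mutex
                          	stored := map[string]string{}
                          	fakePut := func(k, v string) error {
                          		mu.Lock()
                          		defer mu.Unlock()
                          		stored[k] = v
                          		return nil
                          	}

                          	pairs := map[string]string{"a": "1", "b": "2", "c": "3"}
                          	errs := uploadAll(pairs, 2, fakePut)
                          	fmt.Println(len(stored), len(errs)) // prints: 3 0
                          }
                          ```

                          Capping the number of workers keeps the number of in-flight requests bounded, which is usually kinder to the server than one goroutine per pair.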

                        2. 2

                          Just want to say that I appreciate the web design of the linked post, it’s refreshing to see some different colors and typefaces!

                          1. 2

                            Thanks! I’ve been there and done the black on white, and even off-black on off-white, but this was a labour of love and I don’t think I’ll be doing a redesign for at least a month ;)