1. 2

    This is very interesting and easy to understand. It seems unlikely to me that anyone would set up nginx in this way for anything of importance though. A very simple solution to all the problems mentioned would be to use https (with certificate verification) for the proxy protocol. There is more operational overhead to the certificate management of course. But for production stuff it is most certainly worth it.

    1. 17

      Julia Evans’s zine? http://jvns.ca/ is quickly becoming one of my favorite software blogs. Her insights are really great, I love the pragmatism, she’s clever, and it’s fun to read! Highly recommend.

      1. 4

        Using this in production seems like bad advice.

        Just look at the ruby regex, imagine blindly pasting that into your production code. How could you possibly even begin to read that or debug that?

        What problem does this really address? Are invalid email addresses getting sent off to outbound email clients a necessary thing to defend against? Because obviously a simple regex isn’t sufficient. Why bother with such complexity?

        1. 3

          Worse still, there’s a vector for a denial of service attack there. Combining user-supplied content with a backtracking-based regular expression library is dangerous.
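
          For comparison, a minimal sketch in Python of a sanity check that uses no regular expression at all, so there is nothing to backtrack. The name `looks_like_email` and the 254-character cap are my own assumptions, not anything from the article, and this is only a cheap filter, not a validator.

          ```python
          MAX_LEN = 254  # assumed cap; bounds the work done on user input


          def looks_like_email(value: str) -> bool:
              """Cheap, linear-time sanity check; not a validator."""
              if not value or len(value) > MAX_LEN:
                  return False
              if any(ch.isspace() for ch in value):
                  return False
              local, sep, domain = value.rpartition("@")
              # Require a non-empty local part, an "@", and a dot inside the domain.
              return bool(sep and local and "." in domain.strip("."))
          ```

          Each step is a single pass over the string, so the cost grows linearly with the input. The only real test of an address is still sending mail to it.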

        1. 6

          Interesting! This is very pertinent to a thing I’m working on. I recently decided to use \x1F (31 Unit Separator) to encode a set of strings that could validly include commas, tabs, newlines, etc. into a single string. I could be pretty sure that \x1F wasn’t part of the data, or at least sanitize it out of the input. I was able to do this precisely because it isn’t a character a human could type or would be interested in. It’s a simple solution, works very elegantly, and no dealing with all the flavors of quoted csv.
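
          Roughly what that looks like, as a minimal sketch in Python. The function names are made up for illustration, and this version rejects input containing \x1F rather than stripping it, which is one of the two options mentioned above.

          ```python
          UNIT_SEP = "\x1f"  # ASCII 31, Unit Separator


          def encode_fields(fields):
              """Join strings into one, refusing input that contains the separator."""
              for field in fields:
                  if UNIT_SEP in field:
                      raise ValueError("field contains the unit separator")
              return UNIT_SEP.join(fields)


          def decode_fields(blob):
              """Split a joined string back into the original list."""
              return blob.split(UNIT_SEP)


          fields = ["a,b", "tab\there", "line\nbreak", 'quote "x"']
          assert decode_fields(encode_fields(fields)) == fields
          ```

          One caveat: an empty list and a list holding a single empty string both encode to the empty string, so that case needs separate handling if it matters.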

          1. 4

            and no dealing with all the flavors of quoted csv.

            Preach.

            I used to do quite a bit of “integration engineering” - which almost always means “take this input file (almost always CSV or tab-delimited) and do process X, line by line”. I’ve never seen a CSV that could be parsed the first time. I’ve tried yelling and pleading with people to use RFC 4180 because at least you have something to work against, but nothing.
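
            For what it’s worth, here is a minimal sketch of what “something to work against” buys you, using Python’s standard `csv` module, whose default dialect is, as far as I know, close to RFC 4180. The sample data is made up.

            ```python
            import csv
            import io

            # Hypothetical sample: a comma, doubled quotes, and a line break in fields.
            data = 'name,note\r\n"Smith, J","said ""hi""\r\nthen left"\r\n'

            # newline="" leaves line endings untouched so the csv module sees them.
            rows = list(csv.reader(io.StringIO(data, newline="")))

            for row in rows:
                print(row)
            ```

            When the producer sticks to the quoting rules, the embedded comma, quotes, and line break all land in the right fields without any hand-written splitting.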

            I suspect a big reason we see this happening is that generating a CSV is a low difficulty task, and Excel makes it easy to do manual/semi-automated work to the files. But if you ask for US/RS delimited files, or (heaven help us) a fixed-width file, you’ll end up with a programmer in the loop.

          1. 3

            Nice walkthrough. Is there a way to do all that UI-based form-filling via API calls? Even though presumably you only do it once per blog, it’d be cool to have a YAML (or better: TOML) file that defines (almost?) everything. Can something like Terraform do this? (It might be overkill.)

            One thing that would help would be to highlight in the screenshots the form fields that the reader would be changing vs. what they’d leave alone. Especially for that huge CloudFront form.

            1. 2

              Close to 100% of this — if not 100% — should be automatable! Amazon has a looooot of APIs.

              1. 2

                The only thing you probably can’t automate is getting the ACM certificate. Everything else can be achieved through the command line, although it might not be as straightforward as clicking a couple of buttons. For example, here’s how you can programmatically create a CloudFront distribution:

                http://docs.aws.amazon.com/AmazonCloudFront/latest/APIReference/CreateDistribution.html
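
                A rough sketch of that call from Python with boto3, assuming an S3 bucket as the origin. The origin ID, the comment text, and the legacy `ForwardedValues` cache settings are my own illustrative choices, and the API may require more fields depending on its version, so check everything against the reference above before relying on it.

                ```python
                import uuid


                def build_distribution_config(bucket_domain, comment="static blog"):
                    """Build a minimal DistributionConfig dict for an S3 origin.

                    The origin ID and cache settings are illustrative assumptions.
                    """
                    origin_id = "blog-s3-origin"  # hypothetical name
                    return {
                        "CallerReference": str(uuid.uuid4()),  # unique per request
                        "Comment": comment,
                        "Enabled": True,
                        "Origins": {
                            "Quantity": 1,
                            "Items": [{
                                "Id": origin_id,
                                "DomainName": bucket_domain,
                                "S3OriginConfig": {"OriginAccessIdentity": ""},
                            }],
                        },
                        "DefaultCacheBehavior": {
                            "TargetOriginId": origin_id,
                            "ViewerProtocolPolicy": "redirect-to-https",
                            "ForwardedValues": {
                                "QueryString": False,
                                "Cookies": {"Forward": "none"},
                            },
                            "MinTTL": 0,
                        },
                    }


                def create_distribution(bucket_domain):
                    """Send the request; needs boto3 and AWS credentials."""
                    import boto3  # third-party; imported here so the builder runs without it

                    client = boto3.client("cloudfront")
                    return client.create_distribution(
                        DistributionConfig=build_distribution_config(bucket_domain)
                    )
                ```

                Keeping the config in a plain dict means it could just as easily be loaded from a YAML or TOML file, which is what was asked for upthread.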

                1. 1

                  Since it’s AWS, I’m absolutely sure it’s possible to do entirely in API calls. Just a matter of looking up the APIs and/or using your aws library of choice. It would be very neat to have a simple one-shot end-to-end setup scripted up. Maybe a follow up post!

                  Good point on that. I called out only the fields I changed as bullet points under that section, but point well taken. It would be a lot nicer to only see the differences.

                  Thanks!

                  1. 2

                    I had been working on a system to do this for LetsEncrypt before ACM was launched and have almost all of the code done and working. If you (or anyone else) wants to take some of it or modify it to do this I’d be happy to help with pointers or code (it’s all python + boto3).

                    https://github.com/ubergeek42/lambda-letsencrypt/

                1. 3

                  I hope someone finds this useful and I’d love some feedback on the article as a whole. I’m just getting my blog / blogging together and I’m open to critiques of everything from my writing to the technical content, etc.

                  1. 4

                    Good article - Thanks. My observation is that if any of the interfaces change the article will become obsolete - but the principles behind it should remain, so maybe focussing on that side of the content would be more valuable long term.

                  1. 4

                    Johnny encrypts every day in a very usable way when he reads his email or banks online. He just doesn’t care to manage a web of trust on his own, and even if he did his friends don’t.

                    I feel the biggest usability challenge to gpg is how to express and maintain a web of trust. Trust is a very complicated thing. And all the complexity of gpg’s trust mechanism is too much for Johnny to care for. Johnny’s friends don’t care either.

                    I would love to see some usability magic applied towards managing a web of trust. Maybe something like letsencrypt.org but for personal keys. Maybe something even cooler.

                    1. 1

                      There’s keybase.io, which sounds similar (using existing social networks to model trust).

                    1. 9

                      Very excited to see Phoenix coming along. I want elixir to succeed so badly.

                        1. 2

                          It’s really great watching intelligent people converse. I’m curious where Simon Peyton Jones would put erlang in his matrix. I assume higher than haskell in terms of usefulness, but where in terms of safeness?

                          Edit: extra commas

                          1. 1

                            I thought it was interesting to think of the WhatsApp purchase in the context of this video. ~10 minutes in he’s talking about Skype.

                            1. 4

                              Pretty cool. I’m curious how this compares to doing direct find_by_sql queries, if anybody’s tried that.

                              1. 2

                                Just speculating here, but I think if you were to do find_by_sql queries in the same benchmark he’s using, you’d find them just as slow. My thinking is that his optimization is to cache the prepared statement, but with find_by_sql you’d be recreating that string every iteration and losing the benefit. However, if you were to create the prepared statement outside the loop and only call it from within, you might get similar results.
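
                                To show the general idea (not ActiveRecord itself), here is a small sketch using Python’s built-in `sqlite3` module, which keeps a cache of compiled statements keyed by the SQL text. The table and function names are made up for illustration.

                                ```python
                                import sqlite3

                                conn = sqlite3.connect(":memory:")
                                conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
                                conn.executemany("INSERT INTO users (name) VALUES (?)", [("ann",), ("bob",)])

                                # Constant SQL text: the driver can reuse one compiled statement.
                                QUERY = "SELECT name FROM users WHERE id = ?"


                                def find_name_bound(user_id):
                                    """Look up a name with a bound parameter."""
                                    return conn.execute(QUERY, (user_id,)).fetchone()[0]


                                def find_name_interpolated(user_id):
                                    """Look up a name by building new SQL text for every id."""
                                    # Each distinct id yields different SQL text to compile.
                                    # int() keeps this illustration free of injection.
                                    sql = f"SELECT name FROM users WHERE id = {int(user_id)}"
                                    return conn.execute(sql).fetchone()[0]
                                ```

                                Both functions return the same rows; the difference is only in how much parsing and planning work the database has to repeat per call.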

                              1. 3

                                I’m trying to learn elixir! And trying to make some more progress on the stripe-ctf. Plus the day job.

                                1. 1

                                  It’s incredibly hard to take this seriously without specific tested versions or any sources or experimental data. Also the article seems contradictory at the end. It’s saying that Bignum uses “special” 4th grade math for multiplication, but GMP as well? I’d like to know if there are specific sizes of integer that ruby optimizes at, or if they’re always naive. And if they do optimize, in what versions were the optimizations added.
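
                                  To be concrete about what I mean by “naive” versus “optimized at specific sizes”, here is a sketch in Python. It is not Ruby’s or GMP’s code, and the cutoff value is a made-up number purely for illustration.

                                  ```python
                                  def schoolbook(a, b):
                                      """Grade-school multiplication on little-endian decimal digit lists."""
                                      result = [0] * (len(a) + len(b))
                                      for i, da in enumerate(a):
                                          carry = 0
                                          for j, db in enumerate(b):
                                              total = result[i + j] + da * db + carry
                                              result[i + j] = total % 10
                                              carry = total // 10
                                          result[i + len(b)] += carry
                                      return result


                                  def to_digits(n):
                                      """Non-negative int to little-endian decimal digits."""
                                      return [int(c) for c in reversed(str(n))]


                                  def from_digits(digits):
                                      """Little-endian decimal digits back to an int."""
                                      return int("".join(str(d) for d in reversed(digits)))


                                  THRESHOLD_BITS = 64  # hypothetical cutoff, not Ruby's real value


                                  def karatsuba(x, y):
                                      """Karatsuba on non-negative ints; plain * below the cutoff."""
                                      if x.bit_length() <= THRESHOLD_BITS or y.bit_length() <= THRESHOLD_BITS:
                                          return x * y
                                      half = max(x.bit_length(), y.bit_length()) // 2
                                      mask = (1 << half) - 1
                                      x_hi, x_lo = x >> half, x & mask
                                      y_hi, y_lo = y >> half, y & mask
                                      hi = karatsuba(x_hi, y_hi)
                                      lo = karatsuba(x_lo, y_lo)
                                      # Three recursive multiplications instead of four.
                                      mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo
                                      return (hi << (2 * half)) + (mid << half) + lo
                                  ```

                                  The schoolbook version does work proportional to the product of the two lengths, while the Karatsuba version only pays off above some size, which is why the threshold and the version it was introduced in are the details I’d want the article to state.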