1. 29

  2. 10

    Now that I’m learning more about FastCGI, I think we replaced it well when we collectively moved toward HTTP servers for everything.

    FastCGI was a way to run a web server, except speaking a client protocol almost nobody uses. It was a necessary step between the glory of CGI (a short-lived process handling a single request in a defined way) and HTTP servers everywhere. Once we could run our code as servers, why not let ourselves debug with nothing but a browser?

    1. 14

      FastCGI is lighter and less complicated, because it isn’t HTTP and it can ignore a lot of the complexities of talking to random real-world HTTP clients with 30 years of back-compatibility and edge cases.

      In the real world you’re going to put some kind of web server / load balancer / application proxy between your app and the outside world anyway, and it has to deal with that stuff, so why not let it? And once it’s done the work of parsing the request, why not send it straight to the app in a nice concise binary format, instead of going to the effort of re-formatting it as HTTP, adding some headers that say “hey, I know it looks like this request is coming from me, but actually I’m just the middleman, it really came from over there, my external name and port are such-and-such, and the request was/wasn’t secure”? That approach leaves the app to parse the request again and re-form, in an error-prone manner, state that the server already has and that the app needs to operate properly. A FastCGI app has the same “already on the inside” perspective as a CGI app: it gets the real client info and the real server info in environment variables, no munging required, and since they don’t come in as headers, you have fewer worries about some malicious client smuggling a value in.
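      A sketch of that difference in Python (the dicts and addresses are made-up example values, not a real request): a (Fast)CGI app reads the real client address straight from its environment, while a standalone HTTP app behind a proxy has to re-derive it from a forwarded header.

```python
# Sketch: how a (Fast)CGI app vs. a proxied HTTP app sees the client address.
# All values below are hypothetical example data.

def client_ip_cgi(environ):
    # CGI/FastCGI: the front-end server sets REMOTE_ADDR itself,
    # so the app can trust it without parsing any headers.
    return environ["REMOTE_ADDR"]

def client_ip_http(headers, peer_addr):
    # Standalone HTTP app behind a proxy: the TCP peer is the proxy,
    # so the real client must be recovered from a forwarded header,
    # which a malicious client could try to smuggle a value into.
    forwarded = headers.get("X-Forwarded-For")
    if forwarded:
        # Naive "first hop" parsing like this is a classic bug source.
        return forwarded.split(",")[0].strip()
    return peer_addr

print(client_ip_cgi({"REMOTE_ADDR": "203.0.113.7"}))
print(client_ip_http({"X-Forwarded-For": "203.0.113.7, 10.0.0.1"}, "10.0.0.1"))
```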

      tl;dr “my app is its own HTTP server” is an asset in dev and a liability in prod.

      1. 1

        tl;dr “my app is its own HTTP server” is an asset in dev and a liability in prod.

        Perhaps it makes sense for prod to need more setup than dev. Including perhaps figuring out headers and ensuring all is ok on the request. Are you aware of mainstream mechanisms for just running fastcgi processes for dev?

        Release engineering is real, and it doesn’t really make sense to minimize it for the sake of not having an independent http parser.

        1. 1

          All my stuff for the last umpteen years uses some kind of psgi/wsgi/asgi/you-get-the-idea compatible framework, so it’s equally capable of running under FastCGI or a standalone HTTP server for dev. But also, it only takes like a dozen lines of config to make nginx or lighttpd forward to a FastCGI app (plus serve your static files if you want), and for a more “organized” dev setup with a docker-compose or something that’s what I’d do… that way dev can replicate prod a little more closely.
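          For anyone curious what that dozen lines looks like, something along these lines should be close (the paths, port, and socket address are placeholders, and details vary by nginx version):

```nginx
# Hypothetical minimal nginx config forwarding to a local FastCGI app.
server {
    listen 8080;
    root /srv/myapp/static;           # serve static files directly

    location / {
        try_files $uri @app;          # everything else goes to the app
    }
    location @app {
        include fastcgi_params;       # ships with nginx; sets REMOTE_ADDR etc.
        fastcgi_pass unix:/run/myapp.sock;
    }
}
```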

      2. 2

        Also see this response: https://lobste.rs/s/xl63ah/fastcgi_forgotten_treasure#c_kaajpp

        summary: FastCGI includes process management that is useful. The binary protocol part is perhaps superfluous

        1. 1

          Fantastic reason.

        2. 1

          FastCGI’s benefit is allowing de-coupling of the web server from the rest of the application, instead of building another monolith.

          This is why it remains relevant.

          1. 1

            It’s rather tightly coupled, don’t you think?

            1. 1

              To what?

              Certainly not to a particular web server…

              1. 1

                To having a web server as part of the monolith

        3. 5

          I’m still using FastCGI! It works well on Dreamhost.

          The Python support is not good! In theory you just write a WSGI app, and it will work under a FastCGI wrapper.

          But I had to revive the old “flup” wrapper, since Dreamhost has Python 2. I downloaded an older tarball and built it myself.

          Use case: I parse thousands of shell scripts on every release and upload the results as a “.wwz” file, which is just a zip file served by a FastCGI script.

          https://www.oilshell.org/release/0.8.1/test/wild.wwz/

          So whenever there’s a URL with .wwz in it, you’re hitting a FastCGI script!

          This technique makes backing up a website a lot easier: you can sync a single 50 MB zip file rather than 10,000 tiny files, where just stat()-ing all the file system metadata takes forever.

          It’s more rsync-friendly, in other words.

          I also use it for logs in my continuous build: http://travis-ci.oilshell.org/jobs/


          Does anyone know of any other web hosts that support FastCGI well? I like having my site portable and host independent. I think FastCGI is a good open standard for dynamic content, and it works well on shared hosting (which has a lot of the benefits of the cloud, and not many of the downsides).

          (copy of HN comment https://news.ycombinator.com/item?id=24684563)

          1. 1

            Your WWZ system is interesting and I’d love to learn more about it. I remember taking down a note to dig more into it, or to try to build something like it for my own usage. If I recall correctly from something I read, you created it primarily to get excellent compression across a large number of text files while only having to manage a single file, with reads from that file being cached. Is that accurate?

            1. 1

              So you generate zip files, store them on disk, then add a FastCGI script to get the zips?

              1. 1

                Yes exactly! Just drop the .zip files in a dir, renamed to .wwz, and they get served!

                This is a tiny program that I meant to share, but never got around to it… I find it very useful, when you want to serve thousands of tiny files. It’s in Python 2 since Dreamhost is running Debian with Python 2.
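                A minimal sketch of the idea (my own illustration, not the actual wwz source; the names and archive contents are made up):

```python
# Sketch of the ".wwz" idea: serve members of a zip archive directly.
# The real wwz script is a FastCGI/WSGI app; this just shows the core lookup.
import io
import zipfile

def serve_from_wwz(wwz_file, member):
    """Return (status, body) for a request path inside the archive."""
    with zipfile.ZipFile(wwz_file) as zf:
        try:
            return "200 OK", zf.read(member)
        except KeyError:
            return "404 Not Found", b"no such file in archive"

# Build a tiny demo archive in memory (on a shared host this would be a
# real .wwz file sitting on disk next to the script).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("index.html", "<h1>hello</h1>")

print(serve_from_wwz(demo, "index.html"))  # ('200 OK', b'<h1>hello</h1>')
```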

                1. 1

                  It’s always easier to manage a single archive than thousands of smaller files. Zip is a really old compression format, but it has an index, so it allows fast random access, and it’s easy to manage since most OSes have built-in support. I don’t know of a better alternative to zip that provides both random access and a decent compression ratio. Maybe LevelDB?

                  1. 1

                    Zip is really an old compression algo and file format

                    Zip has a few options for compression algorithm but for uses like this it’s very common to use the no-compression option. This has the advantage that you can mmap (or equivalent) the entire file and use the index to get pointers to the data. On a 64-bit system, the FastCGI process can probably mmap all of the files you’ll want and rely on the OS evicting the pages when it is short on RAM. Dovecot uses this strategy (though not with zip files) very effectively, mmaping all of the index files and relying on the OS to keep its physical memory consumption under control (any of the mapped pages can be evicted almost for free and then read back in as needed).
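                    A quick way to see the no-compression property (a sketch; a real implementation would get data offsets from the zip’s index rather than a byte search):

```python
import io
import zipfile

payload = b"raw uncompressed bytes"

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("data.bin", payload)

raw = buf.getvalue()
# With ZIP_STORED the member's bytes sit verbatim in the archive, so an
# index of offsets lets you mmap the whole file and hand out pointers to
# the data with no decompression step.
offset = raw.find(payload)
print("member stored verbatim at offset", offset)
```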

                    1. 1

                      Android does this with asset files inside APKs (which are zips) and the build process includes a utility for adjusting zip files so that every uncompressed file inside is aligned to some alignment (iirc 2 bytes) to support getting assets contents by mmap()ing the APK. :)

                  2. 1

                    Please do share, I’d like to learn more about it.

                    I’m considering adding fastcgi to my project, and the zip portion is also interesting.

              2. 4

                For the non-Pythonistas here you might be interested in looking at WSGI and ASGI, two protocols similar in spirit to FastCGI, but with a tighter coupling to the host language. I find it interesting that WSGI managed to keep up and stay relevant, paving the road for ASGI, which supports WebSocket as well as HTTP/2.

                1. 8

                  WSGI and ASGI aren’t alternatives to FastCGI.

                  • WSGI is a Python protocol, i.e. an “API” that gives you a Python dictionary representing the request. You write the response back to a Python file-like object in the dictionary.
                  • CGI and FastCGI are Unix protocols.
                    • CGI starts a process with a given env, and you write the response to stdout. You can write a CGI script in any language (Perl was once the favored language). You can use WSGI or not. Perl now has something analogous called PSGI I think.
                    • FastCGI uses a persistent process that sends the env dictionary over a socket (in some weird binary format).

                  The way I use FastCGI is to write a WSGI app (which can be done using any Python framework; I use my own framework).

                  And then I use the “flup” wrapper to create a FastCGI binary.

                  The project is kind of “hidden” now, but it works: https://pypi.org/project/flup/ and https://pypi.org/project/flup-py3/

                  https://www.saddi.com/software/flup/

                  https://www.geoffreybrown.com/blog/python-flup-and-fastcgi/
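                  The shape of that arrangement, roughly (flup’s `WSGIServer` import path is from its docs; the rest is a generic WSGI app, invoked directly here the way any WSGI server would):

```python
def app(environ, start_response):
    # WSGI: the request arrives as a plain dict of CGI-style variables.
    body = ("Hello from %s" % environ.get("PATH_INFO", "/")).encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]

# Under FastCGI in production (flup installed), the same object works:
#   from flup.server.fcgi import WSGIServer
#   WSGIServer(app).run()
# For a quick demo, call it directly with a fake environ:
statuses = []
result = app({"PATH_INFO": "/demo"}, lambda status, headers: statuses.append(status))
print(statuses[0], b"".join(result))
```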

                  1. 3

                    Does anyone here have experience with PHP deployment? I’m curious if FastCGI (FPM) is the preferred “gateway solution” for PHP? vs. mod_php which is a shared library dynamically linked with Apache.

                    Some of these links seem to suggest that this is true? You get better performance with FastCGI? That is a little surprising.

                    Either way it seems like FastCGI is relatively popular with PHP, but sorta unknown in other languages? I never heard of anyone running Django or Rails with FastCGI. I think those frameworks are designed to run their own servers and don’t play well with FastCGI, even though (in Django’s case at least) they can technically expose a WSGI app.

                    https://serverfault.com/questions/645755/differences-and-dis-advanages-between-fast-cgi-cgi-mod-php-suphp-php-fpm

                    https://blog.layershift.com/which-php-mode-apache-vs-cgi-vs-fastcgi/

                    https://stackoverflow.com/questions/3953793/mod-php-vs-cgi-vs-fast-cgi

                    1. 6

                      Yes! I’ve been using that for many years on CentOS/Fedora. See https://developers.redhat.com/blog/2017/10/25/php-configuration-tips/ for more information from Red Hat.

                      I also wrote blog posts for CentOS and Debian 10 on how I use php-fpm in production.

                      1. 1

                        Cool. Is it correct to say that PHP-FPM is a C program that embeds the PHP interpreter and makes .php scripts into FastCGI apps? I’m just curious how it works.

                        I think Python never developed an analogous thing, which is a shame, because then there would be more shared Python hosts like there are shared PHP hosts. The closest thing is “flup”, which is not well documented (or maintained, at least at some points).

                      2. 6

                        mod_php still has some usage, and is still maintained, but IMO yes PHP-FPM (essentially a long lived process manager for PHP) accessed via FastCGI from a regular http server (normally apache or nginx, recently HAProxy also added support for fastcgi) is the “best” solution for now.

                        mod_php will probably have a slight latency benefit, but it means Apache will use more memory and is limited to the pre-fork worker, plus you lose a lot of flexibility (e.g. going the fastcgi/fpm route you can have multiple versions of PHP installed side by side, you can have multiple completely different FPM instances, etc.).

                        1. 1

                          I don’t know if this is fixed, but mod_php used to run the PHP scripts in the same process as the web server. In a shared hosting environment, this meant that any file readable by one user was readable by scripts run by the others (for example, if you put your database password in your PHP file, someone else could write a PHP file that would read that file and show it to the user, then compromise your database). It also meant that a vulnerability in the PHP interpreter could be exploited by one user to completely take control of the web server. The big advantage of FastCGI for multi-tenant systems was the ability to run a copy of the PHP interpreter for each user, as that user.

                          1. 1

                            I don’t think “fixed” is the right term there, but regardless that is the inherent nature of mod_php, yes.

                            There was (/is, via a fork) a variant called mod_suphp that uses a setuid helper, so the process runs as the owner of the php file it’s executing.

                          2. 1

                            Cool thanks… I asked the same question in this sibling.

                            https://lobste.rs/s/xl63ah/fastcgi_forgotten_treasure#c_6u4wq3

                            Basically I want to make an “Oil-FPM” :) I think I can do that with

                            https://kristaps.bsd.lv/kcgi/

                            that wraps the Oil interpreter? And I probably need some more process management too?

                            There is no Python-FPM as far as I know, and that is a shame.

                            I want to preserve the deployment model of PHP – rsync a bunch of .PHP files. Likewise you should be able to rsync a bunch of Oil files and make a simple and fast script :)

                            Similar to what I have here if it was dynamic rather than static: http://travis-ci.oilshell.org/jobs/ That could easily be written in Oil.


                            Found the source. Woah is it true this hasn’t had a release since 2009 ???

                            https://launchpad.net/php-fpm

                            https://code.launchpad.net/php-fpm

                            https://github.com/dreamcat4/php-fpm

                            Or maybe it’s built into PHP now?

                            Ah yes looks like it is in there as of 2011, interesting: https://www.php.net/archive/2011.php#id2011-11-29-1

                            But the old source is useful. It’s about 8K lines of C and handles processes and signals! Doesn’t look too bad. If anyone wants to help integrate it into Oil let me know :)

                          3. 2

                            First of all, I’ve been out of the loop for a few years, but from ~2010-2017 Apache was falling out of favor anyway, so mod_php was out of the question if you used nginx or lighttpd. I think 2.4 brought some renewed interest in Apache, but I have no facts to back that up.

                            1. 1

                              Right, that makes sense. I think Nginx encourages its own uwsgi, and I’m not sure it has FastCGI support?

                              The downside is that I’ve never seen a shared host that lets you “drop in the uwsgi file” like you just “drop in a .php file” or in Python’s case “drop in WSGI app wrapped by flup” ?

                              Basically Nginx doesn’t seem to support shared hosting as well as Apache? I’d be interested to hear otherwise. Dreamhost still uses Apache and the setup is pretty nice.


                              EDIT: Someone e-mailed me to clarify that uwsgi is a program that supports the FastCGI protocol in addition to the uwsgi protocol :)

                              1. 1

                                No idea, I haven’t used a shared host in many years.

                                But most web servers do support an arbitrary FastCGI interface, and if you’re allowed to run a binary you could put everything behind that web server. It’s just that I’ve never seen non-dynamic languages do that; Rust and Go apps mostly bring their own web server and you just reverse-proxy to them.

                            2. 2

                              From what I’ve seen mod_php is not preferred anymore. Apache is a pretty amazing swiss army knife, but it kind of has to do too much in one binary. The trend has been to use a pretty thin L7 proxy like nginx and/or haproxy to route to services.

                              I don’t know why PHP itself uses FastCGI rather than a native HTTP implementation. Maybe it’s faster to parse? Maybe there are better side channels for things like the remote IP when proxying?

                              A side note: I think Apache was unfairly maligned, a victim of bad defaults. IIRC Debian shipped it in multi-process mode with 10 workers, but Apache has pluggable MPMs (multi-processing modules), so you can configure it to be epoll/thread based like nginx and be a decent file server or proxy. Unfortunately not all modules are compatible with every MPM.

                              1. 3

                                The main difference between a FastCGI backend and an HTTP backend to me is sort of accidental – in the FastCGI world, the process is known to be ephemeral like CGI, but the server keeps it alive between requests as an optimization.

                                If you lose your state, well no big deal – it was supposed to be like a CGI script.

                                But that is not true of all HTTP servers.

                                I think this matters in practice: on Dreamhost I get new FastCGI processes started every minute or ten minutes. That is not customary for HTTP servers! (They also start two at a time.)

                                Also I think FastCGI processes are safely killed with signals.


                                So it is true that FastCGI has a weird and somewhat unnecessary binary protocol. But it also includes the “process” part, which is useful.

                                1. 2

                                  Debian shipped Apache with Apache‘s default, which is the worker MPM for 2.2 and the event MPM for 2.4.

                                  But mod_php switched you to the prefork MPM because some PHP extensions are not threadsafe.

                                2. 1

                                  I never heard of anyone running Django or Rails with FastCGI?

                                  Way back in the earliest days, Django’s first packaged release (0.90) shipped handlers for running under mod_python, or as a WSGI application under any WSGI-compatible server, but recommended mod_python. Django 0.95 added a document explaining how to run Django behind FastCGI, and a helper module using flup as the FastCGI-to-WSGI bridge.

                                  The mod_python handler was removed after Django 1.4 (so 1.5 was the first version without it). The flup/FastCGI support was removed after Django 1.8. Since then, Django has only supported running as a WSGI application.

                                  I can’t speak to anyone else, but I for one have run Django in production under each of those options: mod_python, FastCGI, and pure WSGI.

                                3. 1

                                  I guess the similarity is that WSGI passes data in a similar way: the dict acts as the store for environment variables and the stdin/stdout are passed as explicit values. The takeaway for me is that a simple Unix-style design can survive many years of battle testing and stay relevant!

                                4. 3

                                  For context and accuracy, WSGI was not a new idea, nor the only ubiquitous one in its problem space. In the WSGI proposal, Guido van Rossum clearly states that the idea was to build something modeled after Java’s Servlet API. Java was the most used programming language at the time, and the Servlet API is still widely used.

                                  1. 2

                                    The uWSGI documentation also has support for similar integrations with other languages (perl, ruby): https://uwsgi-docs.readthedocs.io/en/latest/PSGIquickstart.html

                                  2. 2

                                    Surprised to see all the love for FastCGI. My recollection is that it was a nightmare to use – very fussy (hard to program for and integrate with) and quite brittle (needing regular sysad intervention).

                                    1. 2

                                      I remember trying to set it up once on the server side (~10 years ago?) and it was not fun.

                                      However as a user on shared hosting, it works great. I’ve been running the same FastCGI script for years, and it’s fast, with no problems. So someone figured out how to set it up better than me (which is not surprising).

                                      I think the core idea is good, but for a while the implementations were spotty and it was not well documented in general. There seems to be significant confusion about it to this day, even on this thread of domain experts.

                                      To me the value is to provide the PHP deployment model and concurrency model (stateless/shared nothing/but with caching), but with any language.

                                      1. 1

                                        We ran FastCGI at quite large scale back around 2000 and it was very reliable and not particularly difficult to work with.

                                        1. 1

                                          I was using it at mid-scale in the aughts (mod_fastcgi on apache) and it was not a pleasant experience. Maybe our sysads were particularly bad, or maybe our devs just didn’t get the concepts, but I recall others in my local user groups having similar difficulties.

                                      2. 1

                                        Also friendly reminder that once upon a time there was Mongrel2 - where the backends were wired to the webserver via ZeroMQ and not FastCGI. Same principle but different protocol.

                                        1. 1

                                          I think a lot of people misunderstand CGI (and I lump CGI, FastCGI, and SCGI together here). A CGI server can do almost everything an HTTP server can do - the main difference is WebSockets, and that depends more on the outer server than on CGI per se - but it has a lot more standard consistency to it.

                                          Let’s say you want to put a dynamic application on example.com/foo and a different one on example.com/bar. With an embedded HTTP server, you need to configure each app to know how to adjust its links (or perhaps have the outer layer hack it in, but that’s really ugly). This means a config file or a custom header or something. Totally doable, but with CGI it is part of the standard protocol, so it is relatively mindless.

                                          I sometimes see people object “but I don’t want the outer server to do the application url routing”… well, don’t! Everything after the cgi application’s outer path is also passed to the cgi application. Even example.com/cgi-bin/bar/foo/baz. If bar is the application, it sees /foo/baz as its PATH_INFO argument.
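                                          How that split works, as a simplified illustration (the paths are the hypothetical ones from above): the server divides the request path at the application’s mount point, so the app sees its own prefix as SCRIPT_NAME and the remainder as PATH_INFO.

```python
# Simplified sketch of the CGI path split: SCRIPT_NAME is where the app
# is mounted, PATH_INFO is everything after it (what the app routes on).

def split_cgi_path(request_path, script_mount):
    assert request_path.startswith(script_mount)
    return {
        "SCRIPT_NAME": script_mount,                   # the app's mount point
        "PATH_INFO": request_path[len(script_mount):]  # remainder for the app
    }

env = split_cgi_path("/cgi-bin/bar/foo/baz", "/cgi-bin/bar")
print(env["SCRIPT_NAME"], env["PATH_INFO"])  # /cgi-bin/bar /foo/baz
```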

                                          It seems to me most language-specific web gateway middleware is basically a reinvention of CGI. That’s not bad - they are solving the same problem so learn from the past and/or independently come to the same solutions - but I do think it is important to realize cgi has indeed also addressed all this and does so pretty well.

                                          And then there’s the worker-process management that CGI and FastCGI do, which again simplifies your application code and naturally leads to easier load balancing - which, again, so many in-language servers end up having to reinvent sooner or later.