Neat! Asking too much to add some affordances for hitting a running service via /debug/pprof/xxx? I ask because this is overwhelmingly how I use the tool.
I don’t see why not. This was just 10 minutes worth of work I figured I’d share with everyone, especially to lower the bar to get into pprof-ing.
I also made this public to get feedback like this. I’ll throw up an issue on the repo, unless you would like to?
Looks like this made it over to the orange site and doing just as well. Kinda cool since this is a first for me.
I love plain text protocols, but … HTTP is neither simple to implement nor fast to parse.
Yeah the problem of parsing text-based protocols in an async style has been floating around my head for a number of years. (I prefer not to parse in the async or push style, but people need to do both, depending on the situation.)
This was motivated by looking at the nginx and node.js HTTP parsers, which are both very low level C. Hand-coded state machines.
I just went and looked, and this is the smelly and somewhat irresponsible code I remember:
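https://github.com/nodejs/http-parser/blob/master/http_parser.c#L507
/* Proxied requests are followed by scheme of an absolute URI (alpha).
All methods except CONNECT are followed by '/' or '*'.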
I say irresponsible because it’s network-facing code with tons of state and rare code paths, done in plain C. nginx has had vulnerabilities in the analogous code, and I’d be surprised if this code didn’t.
Looks like they have a new library and admit as much:
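https://github.com/nodejs/llhttp
“Let’s face it, http_parser is practically unmaintainable. Even introduction of a single new method results in a significant code churn.”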
Looks interesting and I will be watching the talk and seeing how it works!
But really I do think there should be text-based protocols that are easy to parse in an async style (without necessarily using Go, where goroutines give you your stack back)
A while back I did an experiment with netstrings, because length-prefixed protocols are easier to parse async than delimiter-based protocols (like HTTP and newlines). I may revisit that experiment, since Oil will likely grow netstrings: https://www.oilshell.org/release/0.8.7/doc/framing.html
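To make the length-prefix point concrete, here is a minimal sketch of a push-style netstring decoder in Python. It is not the experiment mentioned above; the class name `NetstringParser`, the `feed` method and the `MAX_LEN` cap are all made up for illustration. The only state carried between calls is the unconsumed buffer, because the length prefix says exactly how many bytes to wait for.

```python
class NetstringParser:
    """Push-style decoder: feed() bytes as they arrive, get complete messages back.

    Hypothetical illustration only; netstring framing is b"<decimal length>:<payload>,".
    """

    MAX_LEN = 1 << 20  # arbitrary cap chosen for this sketch

    def __init__(self):
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        self._buf += data
        messages = []
        while True:
            colon = self._buf.find(b":")
            if colon == -1:
                # Length prefix is still incomplete; whatever we have must be digits.
                if self._buf and not self._buf.isdigit():
                    raise ValueError("malformed length prefix")
                return messages
            prefix = bytes(self._buf[:colon])
            if not prefix.isdigit():
                raise ValueError("malformed length prefix")
            length = int(prefix)
            if length > self.MAX_LEN:
                raise ValueError("message too large")
            end = colon + 1 + length  # index where the trailing comma must sit
            if len(self._buf) <= end:
                return messages  # payload (or comma) has not fully arrived yet
            if self._buf[end:end + 1] != b",":
                raise ValueError("missing trailing comma")
            messages.append(bytes(self._buf[colon + 1:end]))
            del self._buf[:end + 1]  # drop the consumed message, keep the rest
```

Usage is just `parser.feed(chunk)` from whatever read callback you have. The sketch does not reject leading zeros in the length, which a strict netstring implementation would.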
OK wow that new library uses a parser generator I hadn’t seen:
https://llparse.org/
https://github.com/nodejs/llparse
which does seem like the right way to do it: do the inversion automatically, not manually.
Was going to say this. Especially when you have people misbehaving around things like Content-Length and Transfer-Encoding: chunked; the request smuggling that results seems to imply it’s too complex. Plus, I still don’t know which response code is appropriate for every occasion.
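For anyone who has not seen the Content-Length versus Transfer-Encoding problem spelled out, here is a rough Python sketch. The two "parsers" are deliberately naive and hypothetical (the names `frame_by_content_length` and `frame_by_chunked` are made up, and trailer fields are ignored); they stand in for a front-end and a back-end that each trust a different header. The HTTP/1.1 spec says Transfer-Encoding wins when both are present, but the smuggling risk comes from implementations that disagree.

```python
# A request carrying both framing headers (the classic "CL.TE" shape).
REQUEST = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 13\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"0\r\n"
    b"\r\n"
    b"SMUGGLED"
)


def split_head(raw: bytes):
    """Split raw bytes into a lowercase header dict and everything after the blank line."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers, rest


def frame_by_content_length(raw: bytes):
    """Naive parser that trusts Content-Length. Returns (body, leftover bytes)."""
    headers, rest = split_head(raw)
    n = int(headers[b"content-length"])
    return rest[:n], rest[n:]


def frame_by_chunked(raw: bytes):
    """Naive parser that trusts chunked framing. Returns (body, leftover bytes)."""
    _headers, rest = split_head(raw)
    body = b""
    while True:
        size_line, _, rest = rest.partition(b"\r\n")
        size = int(size_line.split(b";")[0], 16)  # ignore chunk extensions
        if size == 0:
            # Assumes no trailer fields: just consume the final CRLF.
            _, _, rest = rest.partition(b"\r\n")
            return body, rest
        body += rest[:size]
        rest = rest[size + 2:]  # skip the chunk data and its CRLF
```

Run both on `REQUEST` and they disagree about where the message ends: the Content-Length reader swallows all 13 bytes as the body, while the chunked reader sees an empty body and leaves `SMUGGLED` on the wire to be treated as the start of the next request.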
Curious what part of HTTP you think is not simple? And on which side (client, server)?
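There’s quite a bit. You can ignore most of it, but once you get to HTTP/1.1 where chunked-encoding is a thing, it starts getting way more complicated.
Status code 100 (continue + expect)
Status code 101 - essentially allowing hijacking of the underlying connection to use it as another protocol
Chunked transfer encoding
The request “method” can technically be an arbitrary string - protocols like WebDAV have added many more verbs than originally intended
Properly handling caching/CORS (these are more browser/client issues, but they’re still a part of the protocol)
Digest authentication
Redirect handling by clients
The Range header
The application/x-www-form-urlencoded format
HTTP/2, which is now a binary protocol
Some servers allow you to specify keep-alive to leave a connection open to make more requests in the future
Some servers still serve different content based on the User-Agent header
The Accept header
There’s more, but that’s what I’ve come up with just looking quickly.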
Would add to this that it’s not just complicated because all these features exist, it’s very complicated because buggy halfway implementations of them are common-to-ubiquitous in the wild and you’ll usually need to interoperate with them.
And, as far as I know, there is no conformance test suite.
Ugh, yes. WPT should’ve existed 20 years ago.
Heh, don’t forget HTTP/1.1 Pipelining. Then there’s caching, and ETags.
You make a valid point. I find it easy to read as a human being though which is also important when dealing with protocols.
I’ve found a lot of web devs I’ve interviewed have no idea that HTTP is just plain text over TCP. When the lightbulb finally goes on for them a whole new world opens up.
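A quick way to see the "plain text over TCP" point is to type the request by hand. The sketch below is only an illustration: `HelloHandler`, `raw_get` and `demo` are made-up names, and the throwaway server from Python's standard library exists purely so the example has something local to talk to. The interesting part is `raw_get`, which writes a few lines of text to a socket and reads text back.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Tiny local server so the demo does not depend on the network."""

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def raw_get(host: str, port: int, path: str = "/") -> bytes:
    """Send a hand-written HTTP/1.1 request and return the raw response bytes."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Start the local server on a free port, fetch "/" by hand, shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return raw_get("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

`print(demo().decode())` shows the status line, the headers, a blank line and the body, all readable as text.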
It’s interesting to note that while “original HTTP” was plain text over TCP, we’re heading toward a situation where HTTP is a binary protocol run over an encrypted connection and transmitted via UDP—and yet the semantics are still similar enough that you can “decode” back to something resembling HTTP/1.1.
UDP? I thought HTTP/2 was binary over TCP. But yes, TLS is a lot easier thanks to ACME cert issuance and Let’s Encrypt for sure.
HTTP/3 is binary over QUIC, which runs over UDP.
SIP is another plain text protocol that is not simple to implement. I like it and it is very robust though. And it was originally modeled after HTTP.
Weird to put this much work into a programming language but not bother to register a domain for it.
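Looking a little deeper:
“Every type in Teal accepts nil as a valid value, even if, like in Lua, attempting to use it with some operations would cause a runtime error, so be aware!”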
This is a bit disappointing for me to read since nils are by far the most common type errors in Lua. I’m definitely open to the idea of putting a little more work into my coding by thinking in types, but the types need to pull their weight! A type system which can’t catch the most common type error feels like a missed opportunity.
Fwiw the talk mentioned nil safety as a potential future direction.
While still in semi-early development, Pallene is another alternative with some additional performance benefits.
White Paper
Repo
Yeah, it looks really promising. IIRC Pallene is developed by the core Lua developers. Unfortunately the documentation in their repo does not have enough detail to determine whether their type system has the same nil problem as Teal’s.
One of the things I notice when working in Lua is, I’m sure because of its relatively small developer community (as compared to say Java or Python or C/C++) I find a lot of places where the Lua ecosystem goes right up to the edge of the water and then just … stops.
Like, as a for instance, try getting luarocks working on a non *NIX based system. It’s not easy :) I know it’s been done but - not easy.
Again this is totally understandable because polish and depth require engineer hours to create and those don’t grow on trees.
My perspective on this is that Lua developers tend to have more restraint and recognize that sometimes if you can’t do something right, it’s better not to do it at all.
I appreciate that. It definitely is nice to skip the super annoying “Here are 30 half-baked almost-implementations of $THING” phase.
Like the fact that there used to be about 9000 Python distros for Windows and now there’s essentially 1 mainstream one.
I didn’t like this either. I appreciate that Go has the zero-value idea for basic types, but there is still the nil issue for interfaces and pointers.
Back in my JavaScript days it was tedious always checking for null values before doing the real work.
Unrelated to this, but you may be pleased to know that you can use ?. to safely access values that may not exist in JS, e.g. const name = some?.nested?.obj?.name;
Totally agree. This makes me think of all the gyrations Swift goes through to ensure that you’re never using or getting a potentially Nil value unless you really REALLY need it and mean for that to be possible in this circumstance.
There is a domain, but not a website: http://teal-language.org
Recently featured here.
Ah, that is why it didn’t show up as a duplicate. The original post was a video.
I’m half wondering if a PR for the site code to search for a submission URL in comments from the past X days might be worthwhile. Obviously in this case, there wasn’t really any discussion other than someone submitting the project URL. But it seems to happen regularly where there was discussion and a comment search would’ve found it.
And it probably needs to wait till the DB story is settled.