A TCP/IP connection is identified by a four element tuple: {source IP, source port, destination IP, destination port}.
IP is an OSI layer 3 (network) protocol, and IP connections are (indeed) identified by this 4-tuple.
TCP is an OSI layer 4 (transport) protocol, and TCP connections are (technically) identified by a 5-tuple: the 4-tuple from the IP, plus a protocol field, which is always gonna be TCP for, er, TCP.
That’s because you can, theoretically, mux different transport-layer protocols over a single network-layer identifier. For example, you can run a UDP/IP server and a TCP/IP server both on localhost:1234. This isn’t a huge deal in practice, because most server software will bind to a SOCK_STREAM socket by default, and most client software will assume a bare host:port should use SOCK_STREAM as well.
The example code demonstrates this:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Let the source address be 192.168.1.21:1234
s.bind(("192.168.1.21", 1234))
s.connect(("www.google.com", 80))
The socket is constructed with AF_INET i.e. IP, and SOCK_STREAM i.e. TCP. So the protocol part is kind of baked-in. Calling bind or connect uses the host and port provided to the function, and takes the protocol from the existing socket.
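The different-protocols-same-port point can be sketched directly (the loopback address is just for illustration; letting the OS pick the port avoids colliding with anything already bound):

```python
import socket

# Bind a TCP and a UDP socket to the same host:port.
# This works because TCP and UDP ports live in separate namespaces.
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

tcp.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
port = tcp.getsockname()[1]
udp.bind(("127.0.0.1", port))   # no EADDRINUSE: different protocol

assert udp.getsockname()[1] == port
tcp.close()
udp.close()
```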
—
If you want to establish more than 64k (ephemeral port range) connections to a single destination, you need to use all the tricks:
Oof! Sockets and connections aren’t free, especially when they traverse the network (c.f. local Unix domain sockets). Bind-before-connect can be useful in very specific use cases, but if your application is making more than a handful of physical (socket) connections to a single destination, it’s usually a bug or design error.
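For what it's worth, the bind-before-connect trick looks roughly like this sketch (the helper name and addresses are illustrative assumptions, not part of any real API):

```python
import socket

# Sketch of bind-before-connect: fixing the source IP before connect()
# makes the 4-tuple unique per local address, so each additional local
# IP contributes its own ephemeral port range toward one destination.
def connect_from(src_ip, dst):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((src_ip, 0))   # port 0: let the kernel pick an ephemeral port
    s.connect(dst)
    return s
```

In a real deployment you'd rotate through several configured source addresses once one address's ephemeral range is exhausted.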
I think you have the layers mixed up a bit conceptually. IP addresses are an IP concept. Ports are a concept that is shared by the two common transport protocols of TCP and UDP.
I hope I don’t have the concepts mixed up, I’m (supposed to be) a network engineer! 🙃 I also agree with the things you said. Can you point out what you think I got wrong?
https://en.wikipedia.org/wiki/Internet_Protocol_version_4#/media/File:IPv4_Packet-en.svg
At the IP layer, there are source and destination addresses, but ports don’t show up until you get to the TCP or UDP layer:
https://en.wikipedia.org/wiki/Transmission_Control_Protocol#TCP_segment_structure https://en.wikipedia.org/wiki/User_Datagram_Protocol#UDP_datagram_structure
Other protocols, e.g. ICMP (ping) don’t have source or destination ports at all: https://en.wikipedia.org/wiki/Internet_Control_Message_Protocol#Datagram_structure
You’re right, my previous comment was conflating concepts in a way that was confusing.
I think the only thing I really wanted to point out was the different-protocols-same-port bit.
Conceptually though the 4-tuple/5-tuple abstraction is still very useful in practice, especially given that the OSI model is generally… well, it’s a model too, and if I’m remembering my history correctly, it was actually built in parallel to the IPv4 stack that won. The 7-layer model was meant to guide a protocol stack that actually implemented all 7 layers, but TCP/IP “won” and OSI lived on as a model, not as an implementation.
Reasonable people may disagree, but I don’t think the OSI layer model and TCP/IP are, like, mutually exclusive to each other. There’s some squidgy-ness in the details: IP is pretty clearly an OSI layer 3 thing, and TCP is layer 4 and maybe 5 depending on how you look at it. So there’s no bijective mapping from the one to the other. But that’s fine: all models are wrong, some models are useful, and I think the OSI model continues to be useful :)
I don’t know how I missed this reply until today but I absolutely agree. It gets even trickier when you introduce TLS, but yeah, still a useful model!
I don’t think he mixed anything up. TCP connections are identified by the 5-tuple. The three fields that are part of IP are the source and destination (IP) addresses and the protocol. The source and destination ports are part of the TCP or UDP header. As he says, this means that you can use the same port for both UDP and TCP, because they are separate namespaces (though, again as he says, a lot of software somewhat conflates them). This conflation is even more likely with things like HTTP, which now runs over either TLS+TCP or QUIC+UDP and so wants to use the same port on the server for both TCP and UDP.
The trick I think is that the 5-tuple is an OS abstraction that leverages the fact that TCP and UDP both have same-sized port fields that behave similarly. At the IP level, there’s nothing preventing me from inventing a new IP packet type (say EUDPX) that has 128-bit port numbers inside a fully encrypted IP payload; most routers and firewalls would just drop it to the floor because they have no idea what it is, but on a local LAN segment it would probably work just fine. The OS wouldn’t know anything about the port numbers, but it would know what the IP addresses were.
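To make that thought experiment concrete, here is a hypothetical sketch of what an EUDPX payload could look like (the name, layout, and field sizes are all invented for illustration; actually putting it on the wire would need a raw socket and an experimental IP protocol number, neither of which is shown here):

```python
# Hypothetical "EUDPX" payload layout: 128-bit source and destination
# ports followed by opaque (notionally encrypted) data. The OS and
# routers know nothing about these ports; only the endpoints do.
def eudpx_pack(src_port, dst_port, data):
    return src_port.to_bytes(16, "big") + dst_port.to_bytes(16, "big") + data

def eudpx_unpack(payload):
    src = int.from_bytes(payload[:16], "big")
    dst = int.from_bytes(payload[16:32], "big")
    return src, dst, payload[32:]

pkt = eudpx_pack(2**100, 7, b"hello")
assert eudpx_unpack(pkt) == (2**100, 7, b"hello")
```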
but if your application is making more than a handful of physical (socket) connections to a single destination, it’s usually a bug or design error.
just as with all things there are exceptions / mitigating circumstances. for example, for one of the projects where we are providing fixed-wireless internet access, we want to load-test a large number (1024) of connections on a base-station / enode-b (for those familiar with mobile networks). as you can imagine, from a practical p.o.v., it is not possible to perform this kind of simulation in the real world with real devices etc.
so, we have a simple application running on an x86 machine making 1024 connections to the real base-node, each connection pretending to be a remote-node and running its control-plane state machines etc. etc.
Oh sure! Load tests are a different beast. It can be surprising, and entertaining, to learn all of the different things that can become bottlenecks. Available ports, file descriptors, the network stack itself, so many things go wrong before you get anywhere near saturating your CPU or exhausting your memory 🙃
Used Alpine for years until I got tired of DNS randomly not working. Moved to Debian slim variants and will never look back.
Is this an Alpine issue or a docker image issue?
A quick search found that it’s a Docker image issue, and it looks easily solvable.
Musl has unconventional behavior when it comes to DNS: it doesn’t upgrade to TCP even when the DNS response is too large to fit in a single UDP packet. This breaks “silently” and has been the source of countless hours of debugging.
https://twitter.com/RichFelker/status/994629795551031296?s=20
My choice not to do TCP in musl’s stub resolver was based on an interpretation that truncated results are not just acceptable but better ux - not only do you save major round-trip delays to DNS but you also get a reasonable upper bound on # of addrs in result.
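For context, the conventional fallback musl skips hinges on one header bit: a stub resolver that sees TC=1 in a UDP response is expected to retry the query over TCP. A minimal sketch of that check (the header bytes below are fabricated for illustration):

```python
import struct

# The TC (truncated) bit lives in the 16-bit flags field of the DNS
# header, right after the 16-bit query ID. A conventional stub resolver
# that sees TC=1 in a UDP response retries over TCP; musl's resolver
# just returns the truncated answer.
TC_BIT = 0x0200

def is_truncated(response):
    (flags,) = struct.unpack_from("!H", response, 2)
    return bool(flags & TC_BIT)

# Fake 12-byte headers: one truncated response (QR|TC), one complete (QR).
truncated = struct.pack("!HHHHHH", 0x1234, 0x8200, 1, 0, 0, 0)
complete = struct.pack("!HHHHHH", 0x1234, 0x8000, 1, 0, 0, 0)

assert is_truncated(truncated)
assert not is_truncated(complete)
```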
wow ! thank you for the information. to me it seems kind of shortsighted (imho), and not really following the latest rfcs anyway, where, iirc, tcp support is mandatory.
That’s really interesting. Using Debian and Ubuntu for the last 25 years and sometimes wishing I have a more minimalist system with more up-to-date packages. Had my eyes on Alpine, Void and even *BSD but I’m afraid that I take for granted so many things that work in Debian world that may not work as well otherwise.
Really interested by articles about Alpine/Debian and other minimal desktop OSes. This article was really good, hope to read more from the author.
I use Debian by default but have played with Void and Arch and quite enjoyed both. Of the two I’d say Void is a bit more refined while Arch is more mature. You definitely take for granted so many things that work in the Debian world, and exploring other systems helps make you appreciate it again.
What are the things that “just work” in Debian? I haven’t used it in a long while, I wonder what I’m missing out on.
Things I love:

- Upgrades that work in place, unlike Arch’s pacman -Sy, or Ubuntu where major version upgrades basically always are a clean reinstall. (Ubuntu recently got the ability to try to do a major version upgrade, but my work IT department’s policy on it is “try it if you want, but expect to bring us the computer to be fixed afterwards.”)
- Cross-compiling: install crossbuild-essential-arm64 or whatever and you’re pretty much ready to get going.

Downsides:

- There’s the testing release, which is basically a rolling release moderated by the tides of the stable releases, but you will still seldom get a version of software newer than 6 months old, so if you really want to use the most recent kakoune or such then you’ll be building it from source or riding the unstable release.
- Mixing in the unstable release and running it on a more-stable system is a bit of a fraught process. If you want to use unstable versions of stuff it’s easiest to just have an entire system dedicated to it.

Wow, thanks, that’s way more comprehensive than I expected! As for the breaking updates - I only found that to be an issue on Arch. I’ve been using Void as my daily driver for a while now, and it’s been very stable (but not flawless - in particular FDE is clunky, and Bluetooth behaves weirdly on my laptop. That might just be the laptop’s fault, though.)
I remember the first time I heard of Void, it was from someone who came into my IRC channel asking for help because he was trying to use a program I wrote and getting a bunch of TLS errors. After a lot of back and forth, it was revealed that he was using openjdk from Void, which didn’t use the system’s CA store. I wondered aloud why they hadn’t fixed that, but he told me that it was part of the Void philosophy to ship software “pristine” according to upstream and not to “mess around with things” and he was adamant it was better this way.
I was like … “if that way is better, why are you even in this channel trying to debug this issue that was specifically caused by Void’s refusal to integrate the CA store?” and … well, he didn’t really have an answer for that.
My favourite lesser-known C feature:
The header <iso646.h> defines the following eleven macros
(on the left) that expand to the corresponding tokens (on the right):
and &&
and_eq &=
bitand &
bitor |
compl ~
not !
not_eq !=
or ||
or_eq |=
xor ^
xor_eq ^=
So you can write conditions Python-style!
#include <iso646.h>
if (not (cond0 or cond1)) {
// ...
}
Also, since these are naive text substitutions, the particularly perverse among you might enjoy:
int i = 0;
int *i_p = bitand i;
If anyone’s curious exactly why the Vogon delegation proposed that thing and how it wound up in the C90 standard, the C committee fortunately kept records: https://www.lysator.liu.se/c/na1.html .
Also, as someone who still writes Lisp pretty often because of Emacs, this:
if (not (cond0 or cond1))
makes me extremely uncomfortable.
Does not seem to be mentioned here, but the idea of a “world” comes from Gentoo. I don’t recall if it was as simple to back up as a single file, though.
It was, and apparently still is.
but it seems to be not the same afaics in the sense that in alpine, the user is encouraged to edit that file with modifications (hopefully) not getting clobbered. in portage there is no such guarantee, in fact user modifications are ‘aggressively’ edited (according to the posted link)
https://wiki.gentoo.org/wiki/Selected-packages_set_(Portage)#Editing_world_file_by_hand
Though the emerge man page says that the world file can “safely” be edited by hand, Portage will aggressively rewrite that file. Comments or changes in order of packages will be lost and there will be no checking for typos.
Tried giving it a build, basically git clone ... && cd ... && guix shell --development emacs -- {./autogen.sh, ./configure, make}, but get this error here: https://pastebin.com/raw/7x1Q0AdF
Probably the first build that isn’t on a Mac ¯\_(ツ)_/¯
And with that said, this is awesome! I haven’t the faintest idea how this part of the stack works and wouldn’t have known where to start, or even been able to recognize the fundamentals of what the issue was and how to approach it; there’s a lot of background and work in that commit! Are the benefits limited to macOS, or generalizable to other Unixes?
Promising! Knowing that it works on 18.04 helps a bunch. I’ll take another look when I have a second, and should probably roll back to upstream and make sure I can build that anyways :p
Definitely is that commit, but it looks like they’re on the case c:
oh, i am really sorry, there were minor modifications that i had to make for that to work, specifically:
make `PROCESS_OUTPUT_MAX` a macro and
make `FD_COPY(...)` also a macro
lemme know if you care about the diffs etc.
Oh, no problem! C just ain’t something I’m familiar with yet, and those diff’s will be upstream soon enough c:
Installing Slackware from a heap of floppies on a second hand piece of crap Packard Bell in 1994 is a very big part of why I’m where I’m at today.
I got a Slackware CD-ROM with a magazine in 1993. Unfortunately, the 2 or 3 MB RAM we had in our home PC was too little. So I traded the CD with my uncle for some Sherlock Holmes game.
In 1994 we had 5 MB RAM (we wanted to play Doom) and I had another Slackware CD-ROM and I was hooked. Lots of discussions with my brother followed about how the 40 MB hard disk should be partitioned between Linux and MS-DOS (though I also used loadlin + UMSDOS for a while).
Patrick Volkerding bootstrapped my Unix education.
Edit: I even wrote a Slackware Book when I was a student: https://rlworkman.net/howtos/slackbasics.pdf , though I never completely finished it.
1995 for me. On a Gateway 2000 486SX33 desktop that I picked up for $30 when a local business sold it off as surplus. The floppies were nearly all AOL freebies with scotch tape on the write protect notch.
And I let the smoke out of a 14” monitor that cost at least as much as that PC when I screwed up my XFree86 modeline.
The things I learned exploring that system gave me an entirely different view of what I was studying and influenced my tech choices for a long time after.
The last time I daily drove Slackware was 2002. I’m half tempted to do it again for a while now, just for nostalgia’s sake.
Same here, but in ’97. I downloaded the release from an FTP server on a 33.6 modem. The damn thing took a week!
Whew, I feel young again! Same in 2001, on a 90 MHz Pentium someone at my dad’s office found in a closet. It had a whole 8 MB of RAM and Slackware was the only thing that would install on it. Never did get X11 working on it, but I sure learned a lot about living in the command line.
I don’t recall if I first downloaded Slackware via a modem or what. I bought my first copy of SLS Linux (SoftLanding Systems, which predated Slackware) from a Usenet post from some random guy. He mailed me a set of 5.25in floppies.
After that I was using Slackware for a while.
By the mid-1990s I had a CD-ROM drive and was buying CDs with the latest releases. Sometimes in stores. Which had shelves of software in boxes. I am so old.
But my experience was on a Toshiba Portege 610CT: https://www.youtube.com/watch?v=Ram4Faoo9t8
Joe’s thesis is also extremely approachable and still full of great ideas: http://www.cs.otago.ac.nz/coursework/cosc461/armstrong_thesis_2003.pdf
if you like this kind of thing, this is the kind of thing you will like :o)
is rosetta-code not a better resource for this sort of thing ?
i found this to be quite instructive as an overview of various hashing techniques. check it out :)
Cool! The overall series looks useful, as I’m implementing a key-value store too.
Of hashing techniques, the only fancy one I’ve tried (in a prior project) is Robin Hood, which I found worked well, increasing both speed and maximum load.
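For the curious, a minimal sketch of the Robin Hood insertion step (illustrative only; a real table also needs lookup, deletion with backward shifting, and resizing):

```python
# Robin Hood insertion into an open-addressed table: on collision, the
# incoming key "steals" the slot from any resident whose probe distance
# is shorter, which keeps probe lengths tightly clustered. Each slot
# stores (key, probe_distance) or None. Assumes the table is never full.
def rh_insert(table, key):
    n = len(table)
    dist = 0
    slot = hash(key) % n
    while True:
        if table[slot] is None:
            table[slot] = (key, dist)
            return
        res_key, res_dist = table[slot]
        if res_dist < dist:              # resident is "richer": evict it
            table[slot] = (key, dist)
            key, dist = res_key, res_dist
        slot = (slot + 1) % n
        dist += 1

table = [None] * 8
for k in ["a", "b", "c", "d"]:
    rh_insert(table, k)
assert sorted(e[0] for e in table if e) == ["a", "b", "c", "d"]
```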
I found that bidirectional linear probing worked better than Robin Hood: https://github.com/senderista/hashtable-benchmarks
This is a great writeup of an interesting and attractive build, but looking at that low laptop-like monitor position is making my neck ache… 🙂
Same! Most ergonomics guides I’ve read advise you to position your monitor with the top edge roughly at eye level. (“Eye level” as you’re looking straight in front of you, not hunched over.) Having a monitor on an adjustable arm is generally one of the big advantages of not using a laptop!
I think the wide-angle photo of my desk made it look like the monitor was much more reclined than it is. Here’s a side photo.
Probably in here; https://github.com/jcs/dotfiles
Nope, I got it right at the first photo. It looks as painfully reclined as in the first set of pics.
That is a really complicated setup.
At first, I was thinking that I can’t imagine what you’d want 25Gbit for. But then again, I moved recently from a 400Mbit cable to a 50-ish DSL and I really, really don’t like DSL. (Side note, they’ve just announced they’re laying fiber in my place, contract is signed and this time next year I could be on gigabit).
I assume jumping from < 1Gbit to 10+ Gbit is just a natural next step. I mean, yes I don’t need that speed all the time. But it’d still be nice to click “Download” and just a minute later, the entire 100+ GB Elder Scrolls online is here.
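Back-of-envelope arithmetic on that, assuming an ideally saturated link and ignoring protocol overhead:

```python
# Time to download 100 GB at various line rates (idealized: no protocol
# overhead, link fully saturated the whole time).
def seconds(gigabytes, gigabits_per_sec):
    return gigabytes * 8 / gigabits_per_sec

assert round(seconds(100, 25)) == 32      # 25 Gbit/s: about half a minute
assert round(seconds(100, 1)) == 800      # 1 Gbit/s: over 13 minutes
assert round(seconds(100, 0.4)) == 2000   # 400 Mbit/s: about 33 minutes
```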
That is a really complicated setup.
It seems that he’s not even using multiple subnets - I’d say his setup is a lot simpler than mine. :)
I could imagine having 25 Gbps at home, but I’ve just started to deploy 10 Gbps in my internal network (between a few hosts) so it might be slightly overkill for me as well. My current max is 1000/100 but I’ve only opted for 100/100 as I don’t need more, and since I can’t have 1000 Mbps in upload…
might be useful to scan the entire ipv4 address space in a couple of minutes (or even faster)
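Rough arithmetic on what “a couple of minutes” implies (ignoring reserved ranges and assuming a constant send rate):

```python
# Packets per second needed to probe every IPv4 address in two minutes.
total = 2**32                    # all IPv4 addresses, reserved ranges included
pps = total / 120
assert round(pps) == 35_791_394  # roughly 36M packets/sec
```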
If your ISP doesn’t block you, that’s a great way to end up on threat intelligence feeds and labelled as a bot.
does this feel exactly like the windows browser saga from about the turn of the century to anyone else too ?
Honestly? As someone who remembers the Halloween Documents clearly … not much.
Microsoft’s power at the time is nothing compared to the power Google wields now.
I believe the specific complaint @signal-11 is referring to is that IE 4 had a special mode that made connection establishment faster with IIS, but it only worked if the client was IE 4 and the server was IIS, and this was seen as an example of using the browser to get an unfair advantage in the web server market.
i have my doubts, fwiw. programming at the level of specifications is still programming, and is quite hard to get right. for example, try coming up with a specification for a hash table which does not end up being linear search…
Immutable by default prevents a lot of mistakes. Rust got this 100% correct. When I don’t see const on a variable in C++, I immediately believe that it’s going to change somewhere, and I consider it a mistake if it doesn’t.
iirc, mips are still used within cisco routers ? there used to be a book called “see mips run” that i have used for hacking around in mips asm. quite good too (fwiw).
If you read ‘See MIPS Run’, make sure it’s the 32-bit version. The 64-bit version has a huge number of errors in it.
That said, even at my most cranky, MIPS assembly is not something I would ever inflict on someone, no matter how much they’d annoyed me. Between the lack of useful addressing modes, the inconsistent register naming (what is $t0? Depends on the assembler you’re using!), the huge number of pseudos that most MIPS assemblers make look like normal instructions but that will clobber $at, the magic of $25 in PIC modes, branch delay slots, and the exciting logic in the assembler for either letting you fill delay slots, padding them with nops, or trying to fill them from one of your instructions depending on the mode, it’s an awful experience.
I’m not really a fan of RISC-V, but RISC-V manages to copy MIPS while avoiding the most awful parts of MIPS. If you want to learn a simple RISC assembly language, RISC-V is a better choice than MIPS. If you want to learn assembly language for a well-designed ISA, learn AArch64. If you want to learn assembly language that’s a joy to write, learn AArch32 (things like stm and ldm, predication, and the fact that $pc is a general-purpose register are great to use for assembly programmers, difficult to use for compilers, and awful to implement).
There’s an implicit “RISC-V is not a well-designed ISA” there.
Could you elaborate on what issues you see with RISC-V?
pjsip, while specific to VoIP/RTC, is really interesting because it’s built from the ground up with a focus on portability.
may you please consider updating the url ? the one referenced above, doesn’t point to https://www.pjsip.org …
Thank you for pointing that out. I don’t seem to be able to edit/update the comment. But, here is the PJSIP git repo as well.
i, fwiw, am personally biased towards ietf-xdr. it provides the minimal thing that is required to exchange data between nodes separated by a network. everything else is up to the endpoints.
frankly, the idea of making a procedure call over the network horrifies me :) it just hides so many failure modes…
You might want to read the papers on OKWS. This was the OkCupid web server architecture. It was designed by a few people from MIT and makes clever use of Unix domain sockets.
MIT 6.858 Computer Systems Security covers OKWS as a case study. Well worth a watch IMO (or at least a skim of the lecture notes).
is this the one that you had in mind ? may you please let me know ? thank you kindly !
Not GP, but that seems correct. You can check the GitHub repo too; they have linked the paper.
imho, ‘C’ models the underlying machine, while ‘LISP’ models the computation, take your pick. everything after that is just window dressing.