$ is by far the most “useless”: up until programming it exclusively meant “dollars”,
Bit of a pedantic Mexican pet peeve here, but it also means “pesos”, which predates the “dollars” usage (because Spain predates England as an economic power in America), and $ currently means about 30 different currencies worldwide.
Well, if we’re going to be pedantic… the term “dollar” in the Americas originally referred to pesos, i.e. Spanish pieces of eight. So while it is true that the symbol $ predates the existence of the US dollar, it doesn’t actually predate the dollar.
The origin of “dollar” is:
Joachimsthaler for a silver coin minted/mined in and around Joachimsthal (modern-day Jáchymov, Czechia)
Shortened to thaler and appears in the names of other similarly-sized silver coins of the Holy Roman Empire (Reichsthaler, etc.)
Modified to dollar and variants in several languages including English
Adopted in the Americas to refer to the “Spanish silver dollar” or “Spanish dollar”, a silver coin with face value eight Spanish reales (thus “piece of eight”) and approximately the same silver content as the earlier thalers
Adopted as the name of the official currency of the United States, and initially defined with reference to the Spanish silver dollar
Related: 16 Spanish reales made up a larger currency unit known as the escudo, which was initially minted in gold. The two-escudo (32 reales) coin, known in Spanish as the escudo doblón (“double escudo”), is the origin of the English name “gold doubloon”.
The town was founded in a nameless valley called in German just Thal (i.e. “valley”). Later it was named Sankt Joachimsthal after Saint Joachim, meaning “Saint Joachim’s Valley”.
Yep, that’s why I wrote “the term dollar in the Americas”. It was also used to refer to silver coins in Scandinavia.
More importantly, as far as I know, $ was not used to refer to dollars outside the Americas. I don’t think anyone knows for sure where $ came from but some people think it was originally an S superimposed on a P for pesos.
Less importantly, thal from Joachimsthal (from which Thaler came) is cognate with dale or vale in English. So a dollar is a coin from the dale, or a daler. (I’m sure some pun about a dollar from the dale of dol could be made… but it’s too early in the morning and I haven’t had any coffee yet.)
An alternate form of “$” is an S with two vertical lines through it (found in older print media). That form, it is thought, comes from a U superimposed on an S, to stand for “United States”.
Learned that the hard way when I looked at SIM card prices after landing and (as a German) found 25$ to be expensive but not terrible. 25 pesos, on the other hand, was quite good… (That was before checking for included data; otherwise it would have been a bit much, yeah.)
Page 8 has diagrams of the keyboard layout. It looks like the caret symbol ^ would require a shift-N for the 1967 version. The 1963 keyboard would print an up-arrow, although that appears to be protocol equivalent.
Not entirely relevant, but interesting perhaps, to note that at the same time (ASCII 1967) up-arrow became caret, back-arrow became underscore. The model 33 used ASCII 1963 and had up and back arrows.
The back-arrow was used for variable assignment in some programming languages. I would guess the only easily accessible example of this today is Smalltalk.
And on a bit-paired keyboard, shift + _ is ASCII '\x20' | '\x5F', which is '\x7F', the delete character. For example, the ADM 3A has RUB (out) on the _ key. I haven’t found a keyboard that clearly shares <- (delete) and <- (ASCII ’63 back-arrow) on a keycap.
My theory is that this flipped when lowercase came out. The actual ASR 33 is uppercase-only, and has shift-N as uparrow and shift-O as backarrow, but a separate RUBOUT key. So the shift key is turning bit 5 on. The 7F character is out by itself on the chart so it just gets its own key. The Datapoint 3300, designed as a Model 33 emulator, has this setup.
If you have lowercase, then shift still turns bit 5 on for numbers, but it turns bit 6 off for letters. Putting DEL in that bit 6 zone means shift-DEL generates underscore (not backarrow, because lowercase implies ASCII 1967). The Datapoint 2200 is a very bit-paired keyboard (shift-zero even generates a space!) and has an underscore on the RUB key like the ADM3A.
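For anyone who wants to check the arithmetic in the two comments above, here is a rough shell sketch. It assumes "bit 5" and "bit 6" are counted from 1 (so 0x10 and 0x20), which is my reading rather than something the comments state, and the variable names are made up for illustration.

```shell
# ASCII codes as hex literals (variable names are illustrative only).
underscore=0x5F   # '_' in ASCII-1967, back-arrow in ASCII-1963
del=0x7F          # DEL / RUBOUT
zero=0x30         # '0'

# Shift on '_' sets 0x20, which lands on DEL.
shift_underscore=$(printf '0x%02X' $(( $underscore | 0x20 )))

# With DEL as the base key, shift clears 0x20 and yields '_'.
shift_del=$(printf '0x%02X' $(( $del & ~0x20 )))

# Shift on a digit toggles 0x10, so shift-zero comes out as space (0x20).
shift_zero=$(printf '0x%02X' $(( $zero ^ 0x10 )))

echo "shift-_ = $shift_underscore, shift-DEL = $shift_del, shift-0 = $shift_zero"
```

This only models the character codes, not any particular keyboard's wiring.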
My assumption has always been that ^ is beginning of line because in ASCII-1963, this character code was assigned to ↑ (up arrow), and that is a better mnemonic for start-of-line than the other available characters (up meaning earlier in the text). This is also the reason why ^ is used for superscript in TeX math mode and exponentiation in some programming languages (up in this case meaning in superscript position).
Why the N key? This is a “bit paired” keyboard, and there is only one bit difference between the character codes of ↑ and N. So the hardware has logic to flip the appropriate bit when shift is depressed. Also note that there is no lower case on the pictured teletype. ASCII-1963 had a 6 bit subset (64 characters) to support the popular hardware at the time which used 6 bits to represent characters.
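The "one bit difference" claim is easy to check with shell arithmetic. This is just a sketch of the character codes (0x4E for N, 0x5E for ↑ in ASCII-1963 and ^ in ASCII-1967); the variable names are mine.

```shell
n=0x4E       # 'N'
caret=0x5E   # up-arrow in ASCII-1963, '^' in ASCII-1967

# XOR shows which bits differ between the two codes.
diff_bits=$(printf '0x%02X' $(( $n ^ $caret )))

# Setting that single bit on 'N' produces the shifted character.
shift_n=$(printf '0x%02X' $(( $n | 0x10 )))

echo "N xor ^ = $diff_bits, shift-N = $shift_n"
```

A single set bit in the XOR result means the two codes differ in exactly one bit position.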
Why do article titles about regex line anchors mention the EOL code first and the BOL code second? It would be more self-documenting if the title were Why do regexes use ^ and $ as line anchors?. Picky? Yes. Organized? Much yes.
The $ in ex means the last line while addressing lines. Within basic regular expressions (BRE), it still means an anchor that matches the end of each string (effectively, each line of input). This is very likely influenced by its predecessor ed. In fact, ed influenced these choices in sed too. For example, all three commands below print the last line of /etc/hosts.
echo '$p' | ed -s /etc/hosts
echo '$p' | ex /etc/hosts
sed -n '$p' /etc/hosts
All three commands below print the second line while replacing the last three characters of the second line with the string “foo”.
echo '2s/...$/foo/p' | ed -s /etc/hosts
echo '2s/...$/foo/p' | ex /etc/hosts
sed -n '2s/...$/foo/p' /etc/hosts
It is possible to use both meanings of $ in the same command. For example, the following commands print the last line while replacing the last three characters of the last line with the string “foo”.
echo '$s/...$/foo/p' | ed -s /etc/hosts
echo '$s/...$/foo/p' | ex /etc/hosts
sed -n '$s/...$/foo/p' /etc/hosts
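Since /etc/hosts differs from machine to machine, here is the same idea as a sketch against a small throwaway file. The file contents and variable names are made up for illustration, and only the sed variants are shown.

```shell
# Hypothetical sample file, so the results do not depend on /etc/hosts.
sample=$(mktemp)
printf 'alpha\nbravo\ncharlie\n' > "$sample"

# '$' as an address: select the last line.
last_line=$(sed -n '$p' "$sample")

# '$' as an anchor: replace the last three characters of line 2.
second_line=$(sed -n '2s/...$/foo/p' "$sample")

# Both at once: address the last line, then anchor within it.
both=$(sed -n '$s/...$/foo/p' "$sample")

rm -f "$sample"
echo "$last_line / $second_line / $both"
```

With this sample the three results should be charlie, brfoo and charfoo.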
Quoting the relevant sections from the POSIX documents below.
An address is either a decimal number that counts input lines cumulatively across files, a ‘$’ character that addresses the last line of input, or a context address (which consists of a BRE, as described in Regular Expressions in sed, preceded and followed by a delimiter, usually a <slash>).
A <dollar-sign> ( ‘$’ ) shall be an anchor when used as the last character of an entire BRE. The implementation may treat a <dollar-sign> as an anchor when used as the last character of a subexpression. The <dollar-sign> shall anchor the expression (or optionally subexpression) to the end of the string being matched; the <dollar-sign> can be said to match the end-of-string following the last character.
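A quick way to see the "last character of an entire BRE" rule in action is with grep, which uses BREs by default. This is only a sketch with made-up input; it assumes the common behaviour that a $ in the middle of a BRE is matched literally.

```shell
# '$' at the end of the BRE is an anchor: only the line ending in "ab" matches.
anchored=$(printf 'ab\nabc\n' | grep -c 'ab$')

# '$' in the middle of the BRE is an ordinary character: it matches a literal "$".
literal=$(printf 'a$b\nab\n' | grep -c 'a$b')

echo "anchored matches: $anchored, literal matches: $literal"
```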
Dollar, taler, (the thing from) the valley.
And worldwide the first coin named ‘dollar’ was from Bohemia, struck about 200 years before the US was founded
All of the authors mentioned in this newsletter are still alive. If you wanna know why they picked these symbols you could probably get in touch.
When I started reading the article I thought this was going to be what the author did to answer the question.
This is a manual for the Teletype Model 35.
ztoz beat me to this, but here’s a photo of a teletype keyboard with ↑ on the N key: https://alchetron.com/cdn/bit-paired-keyboard-4746cb46-b473-4e0a-8d72-059ff014604-resize-750.jpg
Interestingly, in ex (which today we mostly experience when we press : in vi), $ means the end of the buffer: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ex.html#tag_20_40_13_02
From https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ed.html:
From https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ex.html:
From https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html:
Quoting from section 9.3.8 “BRE Expression Anchoring” of https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html:
Yes, sorry. I didn’t mean to imply that it didn’t mean EOL too. It’s just interesting to me that it means last-line in a line context.