Zig runtime is very bare-bones: no GC, no utilities for HTTP or JSON, even no way to concatenate strings! But this also implies that most minimal Zig programs can remain small.
The frontpage for Zig has an example parsing some JSON from the standard library. I feel like there’s an important distinction between the standard library and the runtime that I’m missing. Can anyone provide any more context?
Yeah that isn’t the most clear statement. While the standard library may have JSON and other utilities, it isn’t part of the final binary unless you use it. Compared to something like Go, where you have the garbage collector included in the output executable file. (Though I’m not sure of any compiled language that would include an http server if you don’t need it, I guess it might exist?)
I’m currently working my way through http://craftinginterpreters.com/ and really enjoying it. I’ve previously tried implementing an interpreter in a very slap-dash fashion, so having a more structured approach to it feels much nicer to me.
Also I recently moved my desktop machine from Windows to Linux, using https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF to run Windows in a VM with near-native performance for gaming. I’ll be working to improve boot times and get the host OS into a more usable state.
How about, “Four books professional server-side Python developers should read (and a few they don’t need to)”.
As a client-side C++ programmer, that list doesn’t do me much good, except for “High Performance Browser Networking” which is a killer read for anyone involved in HTTP-based APIs on either end of the wire. (And OK, I do write some utility scripts in Python.)
On the other hand, “Design Patterns” is a stone classic.
The stone cold classic is SICP, to the point of cliche but no less valid for it. It doesn’t have the instant for-my-job practicality boost that a style guide like “90 Specific Ways to Write Better Python” does, but it’s in another ballpark of depth.
@eatonphil I am curious if you had criticisms of it or if you think its lessons are just not nuts-and-bolts enough for your list.
It’s a perennial goal of mine to finish SICP. 😀 If someone is coming to this list and sees value in four top books, I can’t really recommend a book in the same category here that’s taken me years to finish. But for people who read a lot, of course you can give it a go.
I’m very much a PL nerd. As mentioned elsewhere, I’ve spent a lot of time programing in Standard ML (see ponyo on my github) and Scheme (see bsdscheme on my github). Forcing myself to spend years getting good at both has seriously improved my ability to write simple code.
But I also can’t really recommend that to most devs either because who knows if you’ll enjoy it. Doing what inspires you is the most important thing so you actually complete what it is you want to do.
Scheme and the art of programming by George Springer is almost the same material as SICP but in Scheme. Honestly if you have worked for years in ML or Scheme you probably don’t need SICP since it is a introductory level text.
I just picked this one up a few weeks back and have slowly been working through it. I’m surprised that I hadn’t run into it before then. It’s a textbook and it reads like a textbook, there’s no getting around that, but it’s accessible and seems to be a great intro to Scheme in general.
SICP is using Scheme too, isn’t it? I recall running across MIT being the recommended implementation, but I’ve had no problems with Guile, and there’s a language in Racket, too. I bounce between the Springer book, SICP, and Little Schemer when I’m not too busy. I’m in no rush, but I’ve found those three to be a great way to get acquainted with emacs/geiser & guile.
I wish I could recommend books on other languages! But no book I’ve read on JavaScript (thinking JavaScript the Good Parts, How JavaScript Works), Go (there’s almost nothing here that’s not just reference), Java (Effective Java is massive and not all useful), or C++ (what, Meyers’ Effective C++?) is in the same category in terms of information density, being well-written, and still being applicable today. I’d be happy to take recommendations though for books not among these.
Thinking Forth (1984) and Writing Solid Code (1993) are two books that radically changed how I approached programming. Yes, one is about Forth, and one is about C, but I’ve found them applicable to more than just those two languages.
Fully agree on both and in particular that TF is applicable beyond Forth. Great book.
I wonder whether you might enjoy Elements of Programming and/or The Humane Interface (and here’s the bit I’ve quoted most often). Very different books, both of them insightful and hard. EoP is a C++ book that’s about C++ in the way TF is about Forth, THI is a UI book that uses phrases like “information-theorietic efficiency” instead of “at [famous company] we do this [this way]”.
Personally, my qualm wouldn’t be about the language, it would be about the domain. I haven’t read the books, but those 4 titles really are web focused. I’m sure lots of those translate to other domains, but it’s clearly not aimed as the general purpose developer who doesn’t yet know which domain they will tackle. No, this is for backend web developers.
Thus, your title should really be closer to: Four books professional backend developers should read. That way it’s more focused to your actual target audience.
My point is that it wasn’t intended to be backends focused. I haven’t read any really great books on JavaScript (but HPBN is clearly browser, not backend, focused). And I genuinely wish frontend developers I worked with knew more that’s covered in DDIA.
But I don’t pretend to speak for every domain of programming for sure.
It’s easy to for us to get myopic about stuff outside our domain.
Note that even here you seem to be implicitly assuming that not-backend = in-browser. That’s leaving out (native) desktop, mobile and embedded development.
I suspect there aren’t actually any books that are super important for all serious programmers. The books I learned the most from are way out of date (K&R, the Smalltalk-80 Blue Book, Advanced Cryptography) or very domain specific (Unix Network Programming) or both (Inside Macintosh.)
Yes I’m definitely missing domains but mobile isn’t one of them! HPBN devotes a ton of time to the effect of protocols on mobile battery life of all things. There’s a chapter or two on the behavior of mobile networks. Mobile developers are also responsible for monitoring and alerting on their production deploys. And they’ll also be heavy users of backend systems.
All this isn’t to say that I’ve got the list perfect. I hope I change my mind on it over time as I’ve read new books, learned new things, and expired different domains.
I feel The Go Programming Language is quite good; certainly more than just a reference.
The Little Schemer is probably the best programming book I’ve read. While it’s quite limited in scope (functional programming in Scheme), it does explain it in such a way that’s very insightful and useful for other languages too, IMO. And the writing style is just fun!
I feel The Go Programming Language is quite good; certainly more than just a reference.
It’s not bad. And certainly new Go developers would do well to use it as a style guide since writing “idiomatic” Go takes a bit. I’ll consider adding it to a future version of this post.
The Little Schemer is probably the best programming book I’ve read. While it’s quite limited in scope (functional programming in Scheme), it does explain it in such a way that’s very insightful and useful for other languages too, IMO. And the writing style is just fun!
In the category of niche languages there is no shortage of gems. :) I haven’t read The Little Schemer but I’m working through The Little Typer at the moment. I’ve got a post in mind for a rundown of 6-8 Lisp books alone. But I wouldn’t consider recommending any of these to developers en masse (even though I believe programming in Scheme or Standard ML for a while helps you write simpler code).
I don’t think the value of The Little Schemer is in learning Scheme, but rather in learning recursion and functional programming, and Scheme is just the tool that explains these concepts. The reason I got it in the first place is because it came recommended to better understand JavaScript’s functional aspects.
You Don’t Know JS was an excellent language-specific book back in ~2013 when I read it, and I think it’s been updated after that.
I do prefer more general books these days, now that I know enough languages well enough. SICP is a famous one, but I think How to Design Programs might be a more approachable similar book. Nb. I’ve only worked about halfway through it.
My copy of Effective Java by Joshua Bloch is 346 pages, are you perhaps confusing it with another? I found it useful, too. Some of it is a bit self-evident, but I opened my copy at a random page and read an item (46, pages 212-214, about for(:) compared to for(;;)) and still found it good.
It’s not a book on java, though. It’s for people who already know java, and who write or maintain large java programs, say 50kloc or more.
I dunno, I’m reading through Designing Data-Intensive Applications at the moment, and it feels pretty language agnostic. I’ve seen some snippets of bash, but that’s it. Definitely not Python-focused.
Language-agnostic, but server-oriented. That’s why I wrote “server-side”. As a client (mostly mobile) developer, I’m not much concerned with terabyte-scale data or with distributed algorithms that only make sense for clusters in a data center (because they assume latency is zero and uptime is 99% and all peers are trusted.)
I’m not dissing any of this, just pointing out the applicability is narrower than advertised :)
Thanks! I was going to remove it and take another picture but then I thought, well why not just show a slice of everyday life? It’s cold and dry where I live in Canada and my skin needs some lotion so I don’t get the alligator complexion.
I was thinking a lot of this excellent Calvin and Hobbes comic when I was taking the picture, should I clean up my desk before I take a picture so I appear to be neat and tidy or just present my life as-is unfiltered?
I often struggle with how messy my desk becomes. My preferred style of note taking to work out a problem is a good pen and pads of paper, so things end up accumulating and I don’t feel like I want people to see my office. Thank you for sharing this picture! I’m right in the middle of reorganizing, or I’d show you how bad mine can get.
It is! It was an accessory that was available with that monitor and from what I recall, a lot of Dell business/professional monitors. Here’s what it looks like off the monitor.
It’s not the most sensible implementation, because simple key patterns cause collisions that never resolve, even when resizing the hashtable. The comment you linked specifically mentions this pathology, and the numerous ways it destroys performance.
The rest of the comment describes how CPython mitigates the issue by adding weak integer hashing to its collision probing. At first I thought integer keys were never hashed at any point, hence my surprise.
From the comments it sounds like sequential integer dict keys are somehow common in Python, which I don’t understand. But I don’t write much Python.
From the comments it sounds like sequential integer dict keys are somehow common in Python, which I don’t understand. But I don’t write much Python.
While you can have a dict with keys of any hashable type – and a single dict may have keys of many types – the most common case, so overwhelmingly more common that it’s almost not even worth thinking about other cases, is a dict whose keys are all strings. This is because, sooner or later, basically everything in Python is backed by a dict. Every namespace is backed by a dict with string keys (the names defined in that namespace). Every object is backed by a dict with string keys (the names of the object’s attributes and methods). Keyword arguments to functions/methods? Yup, dict. In comparisons of languages by their “one big idea”, Python is sometimes described as having its big idea be “what if everything was a string-keyed hash table”?
Anyway. This is so common that Python goes out of its way to have special-case optimized implementations for the case of a dict whose keys are all strings (and for what it’s worth, in CPython as of Python 3.4, str is hashed using SipHash-2-4).
As to hashing of numeric types, it’s a bit more complicated than “ints hash to themselves”. Here’s what the Python documentation has to say. For the specific case of int, you can think of it as reducing to hash(n) == hash(n % sys.hash_info.modulus), where in CPython sys.hash_info.modulus is 2^61 - 1 on systems with 64-bit long and 2^31 - 1 on systems with 32-bit long.
While I don’t have a way of being certain, I suspect the linked comment’s note that the hashing of int is “important” has to do with the importance of real-world int key values being unlikely to collide with the hashes of other common real-world key types.
In comparisons of languages by their “one big idea”, Python is sometimes described as having its big idea be “what if everything was a string-keyed hash table”?
I’ve always thought of PHP’s “one big idea” as “What if everything is an array” where array means PHP’s strange half-dict half-list interface that funnily enough Python is now one small step closer to.
It depends on the exact implementation, but in my understanding, not exactly; you also want a good distribution between your buckets, even if there are patterns / non-random distributions in the actual encountered keys. It might waste space rather than time, but it’s still not great.
Python’s hash table isn’t implemented as an array-of-buckets. It’s a single contiguous array into which you insert a new element at the position determined by the hash of its key, and if that position is occupied you try the next one in a pseudo random order. Same with lookups: you try entries in succession until you find the one that equals (it’s usually the first one). And this is why the number of free spots and the probability of collisions are directly related.
Python 3.7.6 (default, Dec 21 2019, 11:56:31)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> hash(2)
2
>>> hash(37)
37
>>> hash(892474)
892474
Python 3.8.1 (default, Jan 8 2020, 23:09:20)
[GCC 9.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> hash(-1)
-2
>>> hash(-2)
-2
>>> hash(-3)
-3
I’ve heard that this is a somewhat common way to implement hashing for ints, but I don’t understand why it’s a good idea. Isn’t hash collisions terrible for hash tables? And isn’t a somewhat common key pattern “some number with some low bits masked”? And wouldn’t that be a pathological case for a hash table which grows with a factor of 2?
Are hash table implementations which does hash(x) = x somehow better at handling collisions than most hash tables, or do they just hope that the ints people want to put in their tables have high entropy in the lower bits?
IIRC there is some sort of random salt added to it and it goes through some internal hash for the actual hash table, since there was a DoS attack by abusing worst case scenario over HTTP requests.
This doesn’t answer the most important question: is the
.cpp2
file extension fixed or can it be.cxx2
,.cc2
,.C2
, and.c++2
?Edit: On a serious note, here is a video (starts at a bit after 4h mark) of Herb’s talk at CppCon. It’s pretty good, IMO.
My vote is for
.c++++
or.cpp++
.I got it,
.Cⵌ
Oh…
Nah, .c*
I’m a little confused by this statement:
The frontpage for Zig has an example parsing some JSON from the standard library. I feel like there’s an important distinction between the standard library and the runtime that I’m missing. Can anyone provide any more context?
Yeah that isn’t the most clear statement. While the standard library may have JSON and other utilities, it isn’t part of the final binary unless you use it. Compared to something like Go, where you have the garbage collector included in the output executable file. (Though I’m not sure of any compiled language that would include an http server if you don’t need it, I guess it might exist?)
I’m currently working my way through http://craftinginterpreters.com/ and really enjoying it. I’ve previously tried implementing an interpreter in a very slap-dash fashion, so having a more structured approach to it feels much nicer to me.
Also I recently moved my desktop machine from Windows to Linux, using https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF to run Windows in a VM with near-native performance for gaming. I’ll be working to improve boot times and get the host OS into a more usable state.
How about, “Four books professional server-side Python developers should read (and a few they don’t need to)”.
As a client-side C++ programmer, that list doesn’t do me much good, except for “High Performance Browser Networking” which is a killer read for anyone involved in HTTP-based APIs on either end of the wire. (And OK, I do write some utility scripts in Python.)
On the other hand, “Design Patterns” is a stone classic.
The stone cold classic is SICP, to the point of cliche but no less valid for it. It doesn’t have the instant for-my-job practicality boost that a style guide like “90 Specific Ways to Write Better Python” does, but it’s in another ballpark of depth.
@eatonphil I am curious if you had criticisms of it or if you think its lessons are just not nuts-and-bolts enough for your list.
It’s a perennial goal of mine to finish SICP. 😀 If someone is coming to this list and sees value in four top books, I can’t really recommend a book in the same category here that’s taken me years to finish. But for people who read a lot, of course you can give it a go.
I’m very much a PL nerd. As mentioned elsewhere, I’ve spent a lot of time programing in Standard ML (see ponyo on my github) and Scheme (see bsdscheme on my github). Forcing myself to spend years getting good at both has seriously improved my ability to write simple code.
But I also can’t really recommend that to most devs either because who knows if you’ll enjoy it. Doing what inspires you is the most important thing so you actually complete what it is you want to do.
Scheme and the art of programming by George Springer is almost the same material as SICP but in Scheme. Honestly if you have worked for years in ML or Scheme you probably don’t need SICP since it is a introductory level text.
I just picked this one up a few weeks back and have slowly been working through it. I’m surprised that I hadn’t run into it before then. It’s a textbook and it reads like a textbook, there’s no getting around that, but it’s accessible and seems to be a great intro to Scheme in general.
SICP is using Scheme too, isn’t it? I recall running across MIT being the recommended implementation, but I’ve had no problems with Guile, and there’s a language in Racket, too. I bounce between the Springer book, SICP, and Little Schemer when I’m not too busy. I’m in no rush, but I’ve found those three to be a great way to get acquainted with emacs/geiser & guile.
I’m in the same boat with SICP and similar books with an academic feel. If it’s not project oriented, it’s usually too hard for me to focus on.
I wish I could recommend books on other languages! But no book I’ve read on JavaScript (thinking JavaScript the Good Parts, How JavaScript Works), Go (there’s almost nothing here that’s not just reference), Java (Effective Java is massive and not all useful), or C++ (what, Meyers’ Effective C++?) is in the same category in terms of information density, being well-written, and still being applicable today. I’d be happy to take recommendations though for books not among these.
Thinking Forth (1984) and Writing Solid Code (1993) are two books that radically changed how I approached programming. Yes, one is about Forth, and one is about C, but I’ve found them applicable to more than just those two languages.
Edit: added years first published.
Fully agree on both and in particular that TF is applicable beyond Forth. Great book.
I wonder whether you might enjoy Elements of Programming and/or The Humane Interface (and here’s the bit I’ve quoted most often). Very different books, both of them insightful and hard. EoP is a C++ book that’s about C++ in the way TF is about Forth, THI is a UI book that uses phrases like “information-theorietic efficiency” instead of “at [famous company] we do this [this way]”.
Personally, my qualm wouldn’t be about the language, it would be about the domain. I haven’t read the books, but those 4 titles really are web focused. I’m sure lots of those translate to other domains, but it’s clearly not aimed as the general purpose developer who doesn’t yet know which domain they will tackle. No, this is for backend web developers.
Thus, your title should really be closer to: Four books professional backend developers should read. That way it’s more focused to your actual target audience.
My point is that it wasn’t intended to be backends focused. I haven’t read any really great books on JavaScript (but HPBN is clearly browser, not backend, focused). And I genuinely wish frontend developers I worked with knew more that’s covered in DDIA.
But I don’t pretend to speak for every domain of programming for sure.
It’s easy to for us to get myopic about stuff outside our domain.
Note that even here you seem to be implicitly assuming that not-backend = in-browser. That’s leaving out (native) desktop, mobile and embedded development.
I suspect there aren’t actually any books that are super important for all serious programmers. The books I learned the most from are way out of date (K&R, the Smalltalk-80 Blue Book, Advanced Cryptography) or very domain specific (Unix Network Programming) or both (Inside Macintosh.)
Yes I’m definitely missing domains but mobile isn’t one of them! HPBN devotes a ton of time to the effect of protocols on mobile battery life of all things. There’s a chapter or two on the behavior of mobile networks. Mobile developers are also responsible for monitoring and alerting on their production deploys. And they’ll also be heavy users of backend systems.
All this isn’t to say that I’ve got the list perfect. I hope I change my mind on it over time as I’ve read new books, learned new things, and expired different domains.
I feel The Go Programming Language is quite good; certainly more than just a reference.
The Little Schemer is probably the best programming book I’ve read. While it’s quite limited in scope (functional programming in Scheme), it does explain it in such a way that’s very insightful and useful for other languages too, IMO. And the writing style is just fun!
It’s not bad. And certainly new Go developers would do well to use it as a style guide since writing “idiomatic” Go takes a bit. I’ll consider adding it to a future version of this post.
In the category of niche languages there is no shortage of gems. :) I haven’t read The Little Schemer but I’m working through The Little Typer at the moment. I’ve got a post in mind for a rundown of 6-8 Lisp books alone. But I wouldn’t consider recommending any of these to developers en masse (even though I believe programming in Scheme or Standard ML for a while helps you write simpler code).
I don’t think the value of The Little Schemer is in learning Scheme, but rather in learning recursion and functional programming, and Scheme is just the tool that explains these concepts. The reason I got it in the first place is because it came recommended to better understand JavaScript’s functional aspects.
I highly recommend Eloquent JavaScript by Marijn Haverbeke. It has a great style of writing and starts from some really nice computing basics.
Learn you a haskell has been transformative to me, for this reason
You Don’t Know JS was an excellent language-specific book back in ~2013 when I read it, and I think it’s been updated after that.
I do prefer more general books these days, now that I know enough languages well enough. SICP is a famous one, but I think How to Design Programs might be a more approachable similar book. Nb. I’ve only worked about halfway through it.
My copy of Effective Java by Joshua Bloch is 346 pages, are you perhaps confusing it with another? I found it useful, too. Some of it is a bit self-evident, but I opened my copy at a random page and read an item (46, pages 212-214, about for(:) compared to for(;;)) and still found it good.
It’s not a book on java, though. It’s for people who already know java, and who write or maintain large java programs, say 50kloc or more.
I dunno, I’m reading through Designing Data-Intensive Applications at the moment, and it feels pretty language agnostic. I’ve seen some snippets of bash, but that’s it. Definitely not Python-focused.
Language-agnostic, but server-oriented. That’s why I wrote “server-side”. As a client (mostly mobile) developer, I’m not much concerned with terabyte-scale data or with distributed algorithms that only make sense for clusters in a data center (because they assume latency is zero and uptime is 99% and all peers are trusted.)
I’m not dissing any of this, just pointing out the applicability is narrower than advertised :)
Obligatory please don’t tell anyone how I live, here is my very messy desk:
OS: Arch Linux
CPU: Intel i5-6600 @ 3.30 GHz
RAM: 16 GB DDR4
WM: i3
MB: Gigabyte Q170M-D3H
KB: IBM Model M
GPU: Nah
Cat: Orange and White Maine Coon, “Salsa” aka “Salsa T. Cat Esq.”
Cat treats: Chicken
Water: Tap
Coffee: Black
Whisky: Neat
I enjoyed this image very, very much. Thank you for your honesty! I particularly enjoyed the pump bottle of vaseline.
Thanks! I was going to remove it and take another picture but then I thought, well why not just show a slice of everyday life? It’s cold and dry where I live in Canada and my skin needs some lotion so I don’t get the alligator complexion.
I was thinking a lot of this excellent Calvin and Hobbes comic when I was taking the picture, should I clean up my desk before I take a picture so I appear to be neat and tidy or just present my life as-is unfiltered?
This feels like home. I don’t know if you can actually compare two messes but our areas feel equal in messiness.
I see the base for the soldering iron. I’m scared to ask where in here the actual iron is.
Haha, it’s off to the left, on the window sill.
Fantastic! As well as Arch, I’m a huge Kubrick fan—where did you get your desktop background?
Awesome, glad you liked it. I’ve had that one for a long time, I did a search on the filename and there is a copy here: https://www.colipera.com/you-deserve-nothing/vector-2001_00359644/
I often struggle with how messy my desk becomes. My preferred style of note taking to work out a problem is a good pen and pads of paper, so things end up accumulating and I don’t feel like I want people to see my office. Thank you for sharing this picture! I’m right in the middle of reorganizing, or I’d show you how bad mine can get.
is that a speaker strapped to the bottom of the left monitor? If yes, why?
It is! It was an accessory that was available with that monitor and from what I recall, a lot of Dell business/professional monitors. Here’s what it looks like off the monitor.
Oh wow I hope not. Is this actually true in CPython?
This is the most sensible implementation as you want to avoid collisions in a hash table. It isn’t supposed to bear any cryptographic properties if that’s your concern. Here’s more: https://github.com/python/cpython/blob/master/Objects/dictobject.c#L134
It’s not the most sensible implementation, because simple key patterns cause collisions that never resolve, even when resizing the hashtable. The comment you linked specifically mentions this pathology, and the numerous ways it destroys performance.
The rest of the comment describes how CPython mitigates the issue by adding weak integer hashing to its collision probing. At first I thought integer keys were never hashed at any point, hence my surprise.
From the comments it sounds like sequential integer dict keys are somehow common in Python, which I don’t understand. But I don’t write much Python.
While you can have a
dict
with keys of any hashable type – and a singledict
may have keys of many types – the most common case, so overwhelmingly more common that it’s almost not even worth thinking about other cases, is adict
whose keys are all strings. This is because, sooner or later, basically everything in Python is backed by adict
. Every namespace is backed by adict
with string keys (the names defined in that namespace). Every object is backed by adict
with string keys (the names of the object’s attributes and methods). Keyword arguments to functions/methods? Yup,dict
. In comparisons of languages by their “one big idea”, Python is sometimes described as having its big idea be “what if everything was a string-keyed hash table”?Anyway. This is so common that Python goes out of its way to have special-case optimized implementations for the case of a
dict
whose keys are all strings (and for what it’s worth, in CPython as of Python 3.4,str
is hashed using SipHash-2-4).As to hashing of numeric types, it’s a bit more complicated than “ints hash to themselves”. Here’s what the Python documentation has to say. For the specific case of
int
, you can think of it as reducing tohash(n) == hash(n % sys.hash_info.modulus)
, where in CPythonsys.hash_info.modulus
is 2^61 - 1 on systems with 64-bitlong
and 2^31 - 1 on systems with 32-bitlong
.While I don’t have a way of being certain, I suspect the linked comment’s note that the hashing of
int
is “important” has to do with the importance of real-worldint
key values being unlikely to collide with the hashes of other common real-world key types.I’ve always thought of PHP’s “one big idea” as “What if everything is an array” where array means PHP’s strange half-dict half-list interface that funnily enough Python is now one small step closer to.
Avoiding collisions isn’t as important as using up a larger % of the spots before you need allocate and move things, I believe.
Aren’t those the same thing? Less collisions implies you can go longer without expanding.
It depends on the exact implementation, but in my understanding, not exactly; you also want a good distribution between your buckets, even if there are patterns / non-random distributions in the actual encountered keys. It might waste space rather than time, but it’s still not great.
Python’s hash table isn’t implemented as an array-of-buckets. It’s a single contiguous array into which you insert a new element at the position determined by the hash of its key, and if that position is occupied you try the next one in a pseudo random order. Same with lookups: you try entries in succession until you find the one that equals (it’s usually the first one). And this is why the number of free spots and the probability of collisions are directly related.
it is:
Almost! hash(-1) returns -2.
wat
do you happen to know why?
Ah, it’s because the C API function uses -1 as an error code. It goes deeper than that too:
I’ve heard that this is a somewhat common way to implement hashing for ints, but I don’t understand why it’s a good idea. Isn’t hash collisions terrible for hash tables? And isn’t a somewhat common key pattern “some number with some low bits masked”? And wouldn’t that be a pathological case for a hash table which grows with a factor of 2?
Are hash table implementations which does
hash(x) = x
somehow better at handling collisions than most hash tables, or do they just hope that the ints people want to put in their tables have high entropy in the lower bits?IIRC there is some sort of random salt added to it and it goes through some internal hash for the actual hash table, since there was a DoS attack by abusing worst case scenario over HTTP requests.
Why would that be a problem?
The walrus operator is fine (it doesn’t light my fire but it’s not a bad thing).
I do wish Python would add two operators:
A None-coalescing operator. You get something like this:
And so if
foo
isNone
you get the value"default"
. You can almost do the same thing withor
but there are values that are falsey that aren’t None.A Smalltalk-like chaining operator:
That executes the
foo
,bar
, andbaz
methods onmyobject
, discarding the return values fromfoo
andbar
. That saves you from having to dowhen you don’t care about the returned values.
Maybe I should write a PEP. :)
There’s a PEP for the None-coalescing operator, at least - https://www.python.org/dev/peps/pep-0505/
Well, actually it adds three operators, so that’s exciting.
??
for coalescing,?.
for None-aware attribute access, and?[]
for None-aware indexing.