1. 7

Hash [] and fetch are subtly different.

{}.fetch(:a)
KeyError: key not found: :a
from (pry):7:in `fetch'

{}[:a]
=> nil

But I usually want to use “[]” rather than “.fetch()”, as there is so much less typing…

…but I’m paranoid that I’ll sometimes end up with a nil where I didn’t want one, showing up either as an empty string from a .to_s or as a NoMethodError on NilClass somewhere down the line.

So…

hsh = Hash.new{|h,k| raise "No such key #{k.inspect}, expected one of #{h.keys.sort.inspect}"}
hsh[:f]=2
hsh[:g]=4
hsh[:z]
 RuntimeError: No such key :z, expected one of [:f, :g]
 from (pry):9:in `block in __pry__'

Gee, that’s even better than fetch(), as I get a list of possible keys I could have meant!

  2. 7

    The downside of this is that it’s non-standard behaviour, and may be quite surprising for new people. You may also run into the trap of thinking you’re using your custom Hash implementation when you actually aren’t.

    It seems to me that just using the standard and explicit .fetch() is clearer for everyone (including yourself). It’s unfortunate that it’s not the default, but it is what it is, and attempting to change the language semantics to meet your needs usually isn’t worth it.

    1. 2

      I think the second thing you raise would prove to be a huge source of problems for me: if you make this your default Hash, you will start expecting every Hash to behave like this and Hashes returned by various libraries won’t behave this way. Even for your own version you would have to override e.g. select, transform_values and various other methods so Hashes constructed from this one are the same kinds of Hashes.

      Also note that a Hash such as this one cannot be serialized. Problematic when storing as yaml, communicating between processes using DRb, etc.
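Both points are easy to check. This is a minimal sketch, assuming a reasonably recent MRI; the variable names are illustrative only and do not come from the thread. It shows that hashes derived via select and transform_values come back without the default proc, and that Marshal (which DRb relies on) rejects a hash that has one.

```ruby
# A hash whose default proc raises on unknown keys, as in the original post.
strict = Hash.new { |_h, k| raise KeyError, "No such key #{k.inspect}" }
strict[:f] = 2
strict[:g] = 4

# Hashes derived from it are plain hashes: the default proc is not carried over.
selected = strict.select { |_k, v| v > 2 }
doubled  = strict.transform_values { |v| v * 2 }
selected_is_plain = selected.default_proc.nil?
doubled_is_plain  = doubled.default_proc.nil?
missing_in_selected = selected[:z] # back to a silent nil

# Marshal refuses to dump a hash that has a default proc.
marshal_error =
  begin
    Marshal.dump(strict)
    nil
  rescue TypeError => e
    e
  end

# Dropping the default proc on a copy makes it dumpable, at the cost of the check.
plain = strict.dup
plain.default_proc = nil
round_tripped = Marshal.load(Marshal.dump(plain))
```

So the strictness has to be re-applied after every transformation and removed before every dump, which is the maintenance burden described above.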

      So definitely a cute Ruby trick that I hadn’t thought of, but I’m afraid I have to pass :)

      1. 2

        I don’t think it’s “unfortunate” that the default for grabbing data out of a Hash is to return nil instead of an exception. This is, in my opinion, one of the better uses of a hash data structure…if you don’t really care much about the contents and are just passing on some non-structured data.

        If you do care about that kind of stuff, it’s way better to use your own object in place of a Hash, and define the [] method yourself:

        class Dictionary
          include Enumerable
        
          def initialize(hash = {})
            @hash = hash
          end
        
          def method_missing(method, *arguments, &block)
            return super unless respond_to_missing? method
            @hash.public_send(method, *arguments, &block)
          end
        
          def respond_to_missing?(method, include_private = false)
            @hash.respond_to?(method) || super
          end
        
          def [](key)
            raise KeyError, key unless key?(key)
            @hash[key]
          end
        end
        

        I have a few reasons for doing things this way:

        • You have more control over the data structure
        • When debugging, .class.name points you to the class you defined, rather than Hash
        • It’s easier to test, and reproduce issues found in the wild
        • Hash is not code you control, it can change without your knowledge, and cause subtle bugs. Hash is also not entirely written in Ruby, so when things go wrong, you’ll find yourself perusing through C code and difficult-to-read documentation for some of the lesser-used Hash features…

        I always recommend against subclassing built-in objects and, in Hash’s case, against using the block notation to define default behavior. It’s much easier, and clearer, to describe that in your own class.

        1. 2

          It’s slow to bounce through method_missing for most methods…. rather…

           class Dictionary < Hash
          

          And then just override []

          I have used that approach before.
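A minimal sketch of that subclass approach, assuming the only goal is a strict [] and that instances are built explicitly through Dictionary[...] (hash literals still produce plain Hash objects). The body of [] is a guess at what the override would look like, not code from the thread.

```ruby
# Hypothetical subclass: only [] is overridden, so every other Hash method
# is dispatched directly instead of bouncing through method_missing.
class Dictionary < Hash
  def [](key)
    fetch(key) # raises KeyError for unknown keys
  end
end

# Hash.[] called on the subclass builds an instance of the subclass.
dict = Dictionary[f: 2, g: 4]

lookup_error =
  begin
    dict[:z]
  rescue KeyError => e
    e
  end
```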

          In this particular case I was dealing with a lot of PODs created as hash literals when I got bitten by a silent failure from using the wrong key.

          My trick, and its companion…

          hsh.default_proc = ->(h,k){ raise "No such key #{k.inspect}, expected one of #{h.keys.sort.inspect}"}
          

          caught the bug on the line where it happened.

          “Hash is not code you control, it can change without your knowledge”

          The interface is very stable.

          1. 1

            Nitpicking, but there are a few bugs: it should be @hash = hash in initialize, and return super unless @hash.respond_to? method. You are also forgetting &block in method_missing.

            1. 1

              edited the obvious errors, but the guard clause seems more like a style thing…I prefer the happy || sad syntax in that method myself.

          2. 1

            Well .default_proc= is part of the language semantics… but yes, some libraries (even some of my own code) do use code like….

            foo = hsh[bah]
            if foo
              # do stuff with foo
            else
              # handle the case where key bah didn't exist
            end
            

            So yes, this breaks that behaviour… on the other hand, if you need to catch your own stupidity quickly, this trick can save you an hour of scratching your head.
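Code that needs to probe for an optional key can sidestep the raising default proc, since key? and fetch with an explicit default never consult it. A small sketch with made-up keys, reusing the default_proc from earlier in the thread:

```ruby
hsh = { f: 2, g: 4 }
hsh.default_proc = ->(h, k) { raise "No such key #{k.inspect}, expected one of #{h.keys.sort.inspect}" }

# key? never consults the default proc.
has_z = hsh.key?(:z)

# fetch with an explicit default (argument or block) bypasses it as well.
z_or_nil  = hsh.fetch(:z, nil)
z_or_zero = hsh.fetch(:z) { 0 }

# Plain [] still raises, which is the point of the trick.
bracket_error =
  begin
    hsh[:z]
  rescue RuntimeError => e
    e.message
  end
```

That only helps in code you can change, of course; a library that does `if hsh[key]` internally will still hit the exception.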

          3. 2

            “Hash [] and fetch are subtly different.”

            I would not say “subtly” as this is the whole point of Hash#fetch.

            If you want less typing you might want to have a Struct and do thing.a instead (which raises if a doesn’t exist) (I don’t like OpenStruct personally).
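For illustration, here is what the Struct route looks like. Thing and its members are invented for this sketch; the point is only that a misspelled member fails loudly instead of returning nil.

```ruby
# Thing and its members are made up for illustration.
Thing = Struct.new(:a, :b)
thing = Thing.new(1, 2)

value = thing.a

# A misspelled member is a NoMethodError at the call site, not a silent nil.
typo_error =
  begin
    thing.c
  rescue NoMethodError => e
    e
  end

# Struct#[] is strict as well: unknown member names raise NameError.
index_error =
  begin
    thing[:c]
  rescue NameError => e
    e
  end
```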

            I have a SimpleStruct class that combines the best of Hash, Struct and OpenStruct, which I like a lot. It’s quite messy, but the core idea is this:

              def method_missing(method_name, *args, &block)
                # Known keys act as reader methods; anything else falls through.
                if @data.key?(method_name)
                  @data[method_name]
                else
                  super
                end
              end
            

            (I’m sure there is a better way. I want to make the class inherit from Hash, as I’m reimplementing most of its methods, and use define_method in the initializer instead of method_missing.)
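A rough sketch of the define_method idea, not the actual SimpleStruct (which isn’t shown here). It assumes the keys are symbols that are valid method names, and it uses define_singleton_method so each instance only gets readers for its own keys.

```ruby
# Hypothetical reconstruction: the real SimpleStruct is not shown in the thread.
class SimpleStruct
  def initialize(data = {})
    @data = data
    # Define one reader per key up front, so unknown names fail with the
    # ordinary NoMethodError and no method_missing hook is needed.
    @data.each_key do |key|
      define_singleton_method(key) { @data.fetch(key) }
    end
  end
end

config = SimpleStruct.new({ host: "localhost", port: 8080 })

unknown_error =
  begin
    config.user
  rescue NoMethodError => e
    e
  end
```

One trade-off of this sketch: keys added to the underlying hash after construction don’t get readers, whereas the method_missing version picks them up automatically.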

            1. 1

              I think this is a valid use of default_proc, but with the caveat that your program may break in unexpected ways if you try to pass this object to a method expecting a Hash.

              Why are you using RuntimeError instead of KeyError?

              1. 2

                Hmm, you’re probably right.

                Curiously enough KeyError has a method to report the key and receiver but not to set them…

                ri KeyError
                KeyError < IndexError
                
                (from ruby core)
                ------------------------------------------------------------------------------
                Raised when the specified key is not found. It is a subclass of IndexError.
                
                  h = {"foo" => :bar}
                  h.fetch("foo") #=> :bar
                  h.fetch("baz") #=> KeyError: key not found: "baz"
                ------------------------------------------------------------------------------
                Instance methods:
                
                  key
                  receiver
                
                [13] pry(main)> ri IndexError
                IndexError < StandardError
                
                (from ruby core)
                ------------------------------------------------------------------------------
                Raised when the given index is invalid.
                
                  a = [:foo, :bar]
                  a.fetch(0)   #=> :foo
                  a[4]         #=> nil
                  a.fetch(4)   #=> IndexError: index 4 outside of array bounds: -2...2
                ------------------------------------------------------------------------------
                

                And nothing useful on the constructor either…

                KeyError.new(:a,{})
                ArgumentError: wrong number of arguments (given 2, expected 0..1)
                from (pry):7:in `initialize'
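For what it’s worth, newer Rubies do let you set them: assuming Ruby 2.6 or later, KeyError.new accepts receiver: and key: keyword arguments. On older versions, such as the one producing the error above, this sketch will not work. The hash contents are the same made-up example as before.

```ruby
# Assumes Ruby 2.6+, where KeyError.new takes receiver: and key: keywords.
strict_lookup = lambda do |h, k|
  raise KeyError.new(
    "No such key #{k.inspect}, expected one of #{h.keys.sort.inspect}",
    receiver: h,
    key: k
  )
end

hsh = { f: 2, g: 4 }
hsh.default_proc = strict_lookup

error =
  begin
    hsh[:z]
  rescue KeyError => e
    e
  end

failed_key      = error.key
failed_receiver = error.receiver
```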
                
                1. 1

                  That is curious. It seems KeyError#key was added relatively recently, so I wonder whether it was an oversight.

                  At least you can fake it using #fetch even with default_proc:

                  2.5.1 :012 > h = Hash.new { 42 }
                   => {}
                  2.5.1 :013 > h.fetch('foo')
                  Traceback (most recent call last):
                          3: from /home/pbrannan/.rvm/rubies/ruby-2.5.1/bin/irb:11:in `<main>'
                          2: from (irb):13
                          1: from (irb):13:in `fetch'
                  KeyError (key not found: "foo")
                  
              2. 1

                Thinking about it for a few… I don’t think I want this, because I don’t see a good place to use it. In a small codebase not likely to be shared, it seems like a significant amount of boilerplate for minimal gain - you’re less likely to have surprising results that aren’t immediately clear. In a big codebase, where it can be tucked away somewhere, it’s too likely to trip up someone not familiar with it or interfere with the operations of some other gem that expects to be passed a normal hash. If you want that behavior, just stick with fetch.

                1. 1

                  The nice thing is it’s not silent or obscure.

                  If it kicks in, it gives you a stack trace all the way down to the file and line where the default proc is defined… which of course has a couple of comments about what this is doing and why.