1. 17
  1.  

  2. 14

    Nice article.

    Sorry for the tangent, but I wanted to use this as an opportunity to discuss error handling strategies. There was some discussion back on https://lobste.rs/s/yjvmlh/go_ing_insane_part_one_endless_error - but I and others expressed needing specific cases in order to discuss improvements, removing if err != nil { return err } patterns.

    For this code, I’d probably make a structure for encapsulating the state of the decoding process:

    type headerDecoder struct {
      bs []byte
      offset int
      err error
    }
    

    Then for the read helpers, I’d change them so that they fail silently:

    func (dec *headerDecoder) readUint32() (i uint32) {
      if dec.err != nil {
        return
      }
      end := offset + 4
      if end > len(dec.bs) {
        dec.err = errOverranBuffer
        return
      }
      i = binary.LittleEndian.Uint32(dec.bs[dec.offset:end])
      dec.offset = end
      return
    }
    

    Alternatively, can return an extra/ignorable ok bool if the call sites need it.

    Now, your main decoding logic is dramatically cleaner:

    version := dec.readUint16()
    bitFlag := dec.readUint16()
    compression := noCompression
    compressionRaw := dec.readUint16()
    if compressionRaw == 8 {
        compression = deflateCompression
    }
    lmTime := dec.readUint16()
    lmDate := dec.readUint16()
    lastModified := msdosTimeToGoTime(lmDate, lmTime)
    crc32 := dec.readUint32()
    compressedSize := dec.readUint32()
    uncompressedSize := dec.readUint32()
    fileNameLength := dec.readUint16()
    extraFieldLength := dec.readUint16()
    fileName := dec.readString(int(fileNameLength))
    extraField := dec.readBytes(int(extraFieldLength))
    return &localFileHeader{
        signature: signature,
        version: version,
        bitFlag: bitFlag,
        compression: compression,
        lastModified: lastModified,
        crc32: crc32,
        compressedSize: compressedSize,
        uncompressedSize: uncompressedSize,
        fileName: fileName,
        extraField: extraField,
        fileContents: fileContents,
    }, dec.offset, dec.err
    
    1. 3

      Programmable control flow, nice!

      1. 6

        Rob Pike wrote about this pattern a few years ago: https://go.dev/blog/errors-are-values

      2. 2

        I’m a huge fan of this pattern. (As @pims says in another reply, if you don’t know the pattern, check out Rob Pike’s “Errors are values” on the Go blog.)

        Another nice use for it is in command line applications. I define a small helper function like this:

        func (app *App) NoOp() bool {
        	return app.ExitValue != exitSuccess || app.HelpWanted || app.VersionWanted
        }
        

        That function is called at the top of (almost) all the methods for App:

        if app.NoOp() {
            // return whatever makes sense for the given method, often nothing at all
        }
        
        

        Then the command’s main Run function looks as clean as this:

        func Run(args []string) int {
            app := &App{ExitValue: exitSuccess}
            cfg := app.ParseFlags(args)
            courseData := app.Unmarshal(classFile)
            app.Validate(cfg, courseData)
            app.Write(cfg, courseData)
            return app.ExitValue
        }
        

        Run in turn is called from a very simple main.go. (Happily stolen from Nate Finch, Mat Ryer, and @carlmjohnson.

        func main() {
            os.Exit(cli.Run(os.Args[1:]))
        }
        
        1. 1

          That’s a nice idea for code like this. I guess it’s a little bit less efficient in the error case, as it’d have to run through all those checks even if the failure is on the first one, but presumably the error case is relatively rare so that doesn’t matter.

          1. 3

            presumably the error case is relatively rare so that doesn’t matter

            This is true. Moreover, the worst case performance here is bounded by the performance of the success case.

            Not super relevant here, but in other cases, this approach is useful for improving error reporting more generally. For example, compilers that give up at the first sign of error are less pleasant to work with than those that attempt to recover and report multiple errors in a single compiler run.

            1. 1

              Moreover, the worst case performance here is bounded by the performance of the success case.

              Yep, good point.

              For example, compilers that give up at the first sign of error are less pleasant to work with than those that attempt to recover and report multiple errors in a single compiler run.

              Maybe it depends on one’s working style, but I tend to just fix one (or maybe one or two) issues and recompile often, so I haven’t found this to be the case (Python only shows you the first error, for example). Even in Go, which prints the first few errors (10 I think) I often only fix one or two at a time, then go run again. Then again, I guess it’s strictly better to give you at least a few errors at once – if you want to only fix one at a time, you can, but with Python you don’t get the choice.

        2. 2

          I didn’t know about the zip format specification that you linked. A couple years ago I happened upon the extra field concept while reading the Ruby Zip API. We use it to store a signature of validity, so it would have been nice to have this!