1. 25
  1. 7

    While I know that this will always be up for debate, I think the author is wrong to suggest that parsing should not return a Result. Always verify when you’re handling stuff coming from outside of your program. Also, all those unwraps can be avoided by using ?.
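
    As a rough sketch of the `?` point (std only; `parse_price` and `total` are made-up names, and nothing here is scraper's API), each fallible step hands its error to the caller instead of unwrapping:

    ```rust
    use std::num::ParseIntError;

    /// Hypothetical helper: turns text like "42 USD" into a number of units.
    /// The input comes from outside the program, so failure is returned.
    fn parse_price(raw: &str) -> Result<u32, ParseIntError> {
        // Take the first whitespace-separated token; empty input yields "",
        // which `parse` rejects with an error of its own.
        let token = raw.split_whitespace().next().unwrap_or("");
        // `?` returns early with the error instead of panicking like `unwrap()`.
        let units = token.parse::<u32>()?;
        Ok(units)
    }

    /// Callers can keep chaining with `?` as long as the error types line up.
    fn total(raw_prices: &[&str]) -> Result<u32, ParseIntError> {
        let mut sum = 0;
        for raw in raw_prices {
            sum += parse_price(raw)?;
        }
        Ok(sum)
    }
    ```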

    Other than that, I think there is valuable feedback here.

    1. 2

      Always verify when you’re handling stuff coming from outside of your program

      The case mentioned was not for parsing the HTML – it was for “parsing” the CSS selector the author had hard-coded into the program. Which raises a question about whether it really was “coming from outside your program”. From the perspective of the scraper authors, sure. From the perspective of the programmer using scraper, though, it wasn’t.

      I suspect the static-typist’s ideal alternative would be a language that could express grammatically-valid CSS selectors as a type and thus expose a Selector interface that just statically rejects anything that isn’t a CSS selector[1]. But as a matter of programmer convenience unless/until that ideal world arrives, there might be room to explore alternatives that don’t require the extra Selector::parse for a trusted value.

      [1] This is mostly humor. One can write parsers and validators for the grammar of CSS selectors, but would gain very little real-world safety from doing so – the risks associated with untrusted input would not be mitigated by a requirement that all selectors be grammatically valid, any more than SQL injection vulnerabilities would be mitigated by only accepting user input that is grammatically-valid SQL.

      1. 6

        This is an important lesson to learn as soon as possible when working with failures encoded in return values. It’s 100% acceptable, and in fact more correct, to panic/throw/crash on errors that are the fault of the author of the function.

        You (generally, as a rule of thumb) return an error value if the inputs of the function cause failure or if one of the dependencies of the function failed (database returned a duplicate key error, etc).
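
        A minimal sketch of that rule of thumb, using std's `IpAddr` parsing as a stand-in (both function names are made up for illustration): the hard-coded literal panics if the programmer got it wrong, while the caller-supplied string comes back as an error value.

        ```rust
        use std::net::{AddrParseError, IpAddr};

        /// The literal below is written by the author of this function, so a
        /// parse failure is a bug in this code: crash loudly instead of returning it.
        fn default_bind_addr() -> IpAddr {
            "127.0.0.1"
                .parse()
                .expect("hard-coded address literal must be valid")
        }

        /// Here the string is an input of the function, so a bad value is an
        /// ordinary, foreseeable failure and goes back to the caller as an error.
        fn bind_addr_from_config(value: &str) -> Result<IpAddr, AddrParseError> {
            value.trim().parse()
        }
        ```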

        IMO, of course.

        I like to reference the last section of this page: https://dev.realworldocaml.org/error-handling.html

        1. 2

          Another important lesson is: when you disagree with someone, don’t automatically assume it’s because they’re ignorant and don’t know what you know.

          It’s entirely possible that someone could be perfectly well familiar with your preferred pattern and still disagree that that pattern is the best possible one for this case.

          This is, also, why functional programming gets kind of a bad name – the built-in assumption of “everything about this is so obviously objectively superior that only someone who doesn’t know about it would disagree” is off-putting.

          1. 1

            I’m not exactly sure why I deserved this response. I thought it was clear that I was simply advocating for a particular “philosophy” around when to return an error and when to throw an exception.

            Yes, I framed it as a “lesson” that one might learn from working with languages that return error values, but I feel like I left room for myself not being omniscient (the part where I said “IMO”).

            It was also agreeing with you. Since the person using the library was hard coding the values used for selectors, it’s better to unwrap and crash than to bubble up an error value that only says “the author of this function wrote it wrong”.

          2. 1

            Thanks for linking that page - after reading it I think it makes some great points that are certainly applicable outside of OCaml. The final sentence makes for a nice TL;DR:

            In short, for errors that are a foreseeable and ordinary part of the execution of your production code and that are not omnipresent, error-aware return types are typically the right solution.

      2. 5

        I still try to avoid async Rust when I can tbh, dunno if it’s just me.

        1. 5

          Not just you, it’s intense. The errors I get and their apparent lack of relation to the actual code as typed make me feel like I’m C++ template programming again.

        2. 4

          Good timing - I’ve recently been in a similar boat. Actix was giving me async-related errors that no amount of web searching could help with. So I switched to warp, which resulted in more errors. So I’ve switched to Elixir. I like Rust very much, but I’m starting to think it’s too big to be a hobby language. The correct solution to my errors above is really to go away and learn Rust’s async, but I’m using Elixir for another project so I easily slipped into it instead.

          1. 4

            Rust is not a good “just learn while doing” language. Using an async HTTP web library in Rust requires you to understand a lot of language features. Just starting and then trying to understand what the heck the errors mean can be overwhelming.

            A better approach is to go by the book and learn the language first. Whether it is worth it for you is another matter. There are definitely easier languages for a simple web app.

            The other part that I commonly hear and do not get at all is that actix has too much boilerplate. Mind you, I only used older versions of actix (which had more of it) and not the recent ones, but I really don’t get it. Adding and mounting a request handler takes only a few lines. Eliminating one or two more of those lines seems nearly pointless, or even counterproductive.

            1. 3

              The big error “type cannot be shared between threads safely” looks scary, but it contains all the information you need. If you follow “because it appears within the type” you’ll find the type you’ve used (here scraper::Html).

              the compiler cannot reason that it isn’t used anywhere after initial part.

              It is used after the initial part — its destructor is called at the end of the scope, so the document is kept around until the end. This is unfortunately a common gotcha. It’s done deliberately, because Rust guarantees a predictable drop order, so things like mutex guards can be relied upon.
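
              A small sketch of the usual workaround, using `Rc` as a stand-in for a non-`Send` type (the function names are hypothetical and nothing here is scraper's API): confining the value to an inner block ends its life before the `.await`, so the future no longer has to carry it.

              ```rust
              use std::rc::Rc;

              /// Stand-in for any awaited operation (an HTTP request, for example).
              async fn next_request() {}

              /// `Rc` is not `Send`. Declared at function scope it would live until the
              /// closing brace, i.e. across the `.await`, and the future would not be
              /// `Send`. The inner block drops it before the `.await` instead.
              async fn title_len() -> usize {
                  let len = {
                      let doc = Rc::new(String::from("<title>hi</title>"));
                      doc.len()
                  }; // `doc` is dropped here, before the await point below
                  next_request().await;
                  len
              }

              /// Compiles only if `T` is `Send`; a compile-time check, nothing runs.
              fn assert_send<T: Send>(_: &T) {}
              ```

              Calling `assert_send(&title_len())` compiles with the inner block in place; moving `doc` out to function scope should bring back the “cannot be sent between threads safely” family of errors.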

              1. 2

                So far I’ve had my biggest success using actix, though I was used to futures pre async/await. I’d say the ecosystem really lacks some of the stuff you’d just get out of the box from Rails/Django etc. (Though async MySQL can be a real pain in Python also..)

                I’ve had to implement basically everything based on different libs of my web-service-controller. So auth, 2FA, login verification, DB etc. I may have gone a little bit overboard by using sled as DB, making me implement my own indices on top of a KV-Store.

                1. 1

                  I’ve had to implement basically everything based on different libs of my web-service-controller. So auth, 2FA, login verification, DB etc.

                  Is there anything you built that could be bundled up into its own crate and re-used as an actix add-on crate? Sounds like you have built some of the pieces that would fill a substantial gap in the ecosystem today?

                  1. 1

                    Oh, I don’t think I expressed that very well. For all of this I’ve used different crates that already provide some of the core functionality (TOTP for 2FA, a bcrypt implementation, etc.), but the work was finding good crates, wiring it all up into one thing, adding some kind of user permission system on top and running this verification for every relevant API path. The latter is done by three big macros. (Yeah, I’ll have to experiment with some kind of middleware, but this will also include the next actix upgrade..)

                    I ended up creating a REST-like JSON API and just the typical React frontend that calls it. Though I feel a little bad about my choice of bitflags for the different permissions; they are awkward to use in JavaScript.

                2. 1

                  For the CSS selector issue, this is a good candidate for doing your parsing and unwrapping in user code. You could just wrap it up into a more readable panicking function with #[track_caller].
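
                  Something along these lines, sketched with a toy `Selector` stand-in so it compiles without dependencies (this is not scraper's API, and `selector` is a hypothetical helper name):

                  ```rust
                  /// Toy stand-in for a selector type; NOT scraper's API.
                  #[derive(Debug, PartialEq)]
                  struct Selector(String);

                  impl Selector {
                      /// Toy validation: only rejects empty or whitespace-only input.
                      fn parse(css: &str) -> Result<Selector, String> {
                          if css.trim().is_empty() {
                              Err(format!("invalid selector: {css:?}"))
                          } else {
                              Ok(Selector(css.to_owned()))
                          }
                      }
                  }

                  /// Panicking wrapper for hard-coded selectors. With `#[track_caller]`,
                  /// the panic location reported is the call site, not this helper.
                  #[track_caller]
                  fn selector(css: &'static str) -> Selector {
                      match Selector::parse(css) {
                          Ok(sel) => sel,
                          Err(err) => panic!("bad hard-coded selector: {err}"),
                      }
                  }
                  ```

                  Call sites then shrink to `selector("a.title")`, and a typo in the literal points at the line that wrote it.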

                  At my work, our crates all have a crate::prelude with various goodies and helpers that make the most common needs easier.

                  1. 1

                    No matter what, Rust is going to be something like an order of magnitude more complicated to build in than Python. Not saying APIs can’t be optimized. The borrow checker and the relatively advanced type system add that cost upfront. However, once you get going with actix or some other library for a project, it’s easy to reuse the code going forward, and everything is more reliable and less buggy. Costs get amortized.