1. 15
  1. 9

    Can I use this with a browser that’s not Chrome-based? No.

    Then I’m not using it.

    1. 1

      To elaborate on that, I think it’s this bit that’s Chrome-specific:

      Uses DevTools protocol to intercept all requests, and caches responses against a key made of (METHOD and URL) onto disk. It also maintains an in memory set of keys so it knows what it has on disk.

      So I guess it’s fair to say it’s “offline browsing”, but maybe not “archive”. If you want an “archive”, then tying it to a specific browser seems like a very bad idea.

      1. 1

        You may enjoy ArchiveBox. It uses headless Chrome to download stuff, but you can use it from whatever browser you want. It doesn’t do the fancy “devtools to intercept requests” shenanigans though, so it isn’t always as good with JS-heavy sites.

        1. 1

          Is there anything like ArchiveBox, but one that extracts the web page as a single HTML file without ads? Ideally, a readable version of the article.

          1. 1

            ArchiveBox does this. It processes downloaded articles through readability and mercury, saving each of those as separate files alongside the other formats. Seems to do a pretty good job of it in my experience.

      2. 3

        I don’t understand how this works in the presence of dynamic content, either server side or client side. What I see in my browser is not a pure function of the addresses I load. Wouldn’t I need to reproduce the execution too?

        Actually, how does this thing even work when people serve static HTML but sometimes overwrite it?
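
For anyone trying to picture the caching scheme quoted earlier in the thread (responses stored on disk under a key made of METHOD and URL, plus an in-memory set of keys), here is a minimal sketch in Python. It is not the tool's actual code: the `ResponseCache` class, its method names, the hashed file names, and the "last write wins" policy are all assumptions made for illustration, and the DevTools interception step is left out entirely. It also shows the concern raised above: two different responses for the same method and URL collapse into a single entry.

```python
import hashlib
import tempfile
from pathlib import Path


class ResponseCache:
    """Toy disk cache keyed by (METHOD, URL). Hypothetical, for illustration only."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        # In-memory set of keys, so a lookup knows what is on disk
        # without touching the filesystem first.
        self.keys = set()

    @staticmethod
    def make_key(method, url):
        # The key is only the method and the URL: headers, cookies,
        # request bodies and time of day play no part in it.
        return f"{method.upper()} {url}"

    def _path(self, key):
        # Hash the key to get a filesystem-safe file name (an assumption;
        # the real on-disk layout is not described in the thread).
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.root / digest

    def store(self, method, url, body):
        key = self.make_key(method, url)
        # Assumed policy: a later response overwrites an earlier one.
        self._path(key).write_bytes(body)
        self.keys.add(key)

    def lookup(self, method, url):
        key = self.make_key(method, url)
        if key not in self.keys:
            return None
        return self._path(key).read_bytes()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = ResponseCache(tmp)
        cache.store("GET", "https://example.com/feed", b"monday")
        cache.store("GET", "https://example.com/feed", b"tuesday")
        print(cache.lookup("GET", "https://example.com/feed"))
        print(cache.lookup("POST", "https://example.com/feed"))
```

Under these assumptions, the page you replay offline is whatever response happened to be stored last for each (METHOD, URL) pair, which is why content that varies between requests to the same address cannot be reproduced faithfully by a key like this alone.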