1. 3

    Use whatever you are most familiar with and feel productive in, then work on making it horizontally scalable.

    1. 3

      Wow! And my homelab for the last 9 years has been an old beat-up Lenovo laptop with the cheapest external USB hard drive available..

      1. 2

        Old laptops are the best, they even have a built in battery backup that’s already tightly integrated with the OS!

        1. 1

          3 months ago I had a similar setup to yours, a ThinkPad X201 with a nice SSD and a USB3 HDD over a USB2 port.

          Yesterday, one of the things I learnt with that laptop solved a 2-week partial outage at $WORK.

          Never underestimate what one can learn from hobbying. Keep it up!

        1. 15

          This post uses GIN indexes; one of their authors (Oleg Bartunov) strongly suggested using RUM indexes (https://github.com/postgrespro/rum) when I spoke with him at PGCon in Ottawa (2019).

          Other things that would be nice:

          • show us your PostgreSQL config
          • show us your query plans
          • use a tsvector column instead of functional indexes
          • avoid using OR in the query by using ANY - this would likely improve the query plans a lot (if we were able to see them); see the sketch below
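
          A rough sketch of the last two points (hypothetical table and column names mirroring the post’s fields; it assumes PostgreSQL 12+ for the generated column and the node-postgres client, since the benchmark scripts are Node-based):

            const { Client } = require("pg")

            async function main() {
              const client = new Client() // connection settings come from the usual PG* env vars
              await client.connect()

              // Keep the tsvector in a real column (here via a generated column) instead of a
              // functional index, then index that column with GIN (or RUM, if that extension is installed).
              await client.query(`
                ALTER TABLE docs
                  ADD COLUMN val_tsv tsvector
                  GENERATED ALWAYS AS (to_tsvector('english', val)) STORED`)
              await client.query(`CREATE INDEX docs_val_tsv_idx ON docs USING GIN (val_tsv)`)

              // Avoid OR chains: pass an array and let ANY() handle it, and put
              // full-text alternatives inside the tsquery itself ('|' means OR there).
              const { rows } = await client.query(
                `SELECT *
                   FROM docs
                  WHERE key = ANY($1)
                    AND val_tsv @@ to_tsquery('english', $2)`,
                [["dolor sit", "amet consectetur"], "ipsum | lorem"]
              )
              console.log(rows.length)

              await client.end()
            }

            main().catch(console.error)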

          For a moment I thought there was enough information to actually reproduce the test, but I think there is a fundamental flaw in this whole experiment: the datasets are not the same for both systems!

                JSON.stringify({
                  key: `${faker.lorem.word()} ${faker.lorem.word()}`,
                  val: faker.lorem.words(),
                  valInt: Math.floor(faker.random.float()),
                  valDate: faker.date.past()
                })
          

          Please correct me if I am wrong, as I don’t know the ‘faker’ used here, but I assume it’s building a randomised lorem ipsum document. Depending on the actual distribution of words, this would greatly impact how the index is built and would affect the execution times of queries run against it. This means that one system could have gotten a much worse execution path simply because its documents were biased differently than the other system’s.

          Regardless of word distribution - is Elasticsearch really working on the same initial data, and are its documents split into equivalent logical chunks? PostgreSQL is searching over 1.5M entries; is the data loaded into Elasticsearch also split into 1.5M documents?

          Am I really not understanding something fundamental? I only see one iteration (Iterations = 1) of the faker for Elasticsearch:

          const faker = require("faker")
          const { writeFileSync } = require("fs")
          const Iterations = 1
          writeFileSync(
            "./dataset.ndjson",
            Array.from(Array(Iterations))
              .map(() =>
                JSON.stringify({
                  key: `${faker.lorem.word()} ${faker.lorem.word()}`,
                  val: faker.lorem.words(),
                  valInt: Math.floor(faker.random.float()),
                  valDate: faker.date.past()
                })
              )
              .join("\n")
          )
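
          If the goal is an apples-to-apples comparison, one option (just a sketch, assuming the faker library’s seed() API) is to seed the generator once and write a single dataset file that both PostgreSQL and Elasticsearch load, with the iteration count matching the ~1.5M rows on the PostgreSQL side:

            const faker = require("faker")
            const { createWriteStream } = require("fs")

            // Seed faker so re-runs generate (mostly) the same documents; more importantly,
            // load this one file into *both* PostgreSQL and Elasticsearch so they index identical data.
            faker.seed(12345)

            const Iterations = 1500000 // match the ~1.5M rows queried on the PostgreSQL side
            const out = createWriteStream("./dataset.ndjson")
            for (let i = 0; i < Iterations; i++) {
              out.write(
                JSON.stringify({
                  key: `${faker.lorem.word()} ${faker.lorem.word()}`,
                  val: faker.lorem.words(),
                  valInt: Math.floor(faker.random.float()),
                  valDate: faker.date.past()
                }) + "\n"
              )
            }
            out.end()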
          
          1. 4

            Lorem ipsum text also makes it impossible to evaluate search result quality (unlike, say, the Wikipedia dataset).

            1. 1

              What is the Wikipedia dataset? URL?

              1. 1
                1. 1

                  Wikipedia offers full database backups: https://en.m.wikipedia.org/wiki/Wikipedia:Database_download

            1. 10

              Maybe people will reconsider using MiTMflare if we get a few more outages like this.

              1. 7

                Can you suggest some comparable services with better uptime or, failing that, better postmortems?

                1. 6

                  What’s the use case you have?

                  I just use … “my web host” (which happens to be Dreamhost, which does offer optional Cloudflare integration, but I intentionally leave it off). It has survived all the HN traffic spikes just fine, as well as spikes from Reddit, lobste.rs, an O’Reilly newsletter, and what I think is some weird Google content suggestion thing (based on user agent).

                  It has worked fine for 10 years. The system administration seems very competent. I don’t know the details, but they have their own caches.

                  I noticed this guy said the same thing about Dreamhost: https://www.roguelazer.com/2020/07/etcd-post-follow-up/ i.e. that it’s worked for 15 years.

                  I feel like a lot of people are using Cloudflare for some weird “just in case” moment that never happens. I’m not saying you don’t have that use case, but I think many people talking about and using Cloudflare don’t.

                  To me Cloudflare is just another layer of complexity and insecurity. I would consider using something like it if I had a concrete use case, but not until then. Computers are fast and can serve a lot of traffic.

                  1. 3

                    The use case is free caching, and free bandwidth if you use some services for hosting (like Backblaze), which cuts down a lot of costs depending on the website you’re running.

                    1. 3

                      Where is the original site hosted? Why does it need caching?

                      (I’m not familiar with Backblaze – is it a web host or is it an object store or both?)

                      My point is that, depending on the use case, you probably don’t need caching, so it doesn’t matter if it’s free. There is a downside in security and complexity, which is not theoretical (as this outage shows, and as MITM attacks by state actors and others have shown.)

                      1. 2

                        (I’m not familiar with Backblaze – is it a web host or is it an object store or both?)

                        Backblaze has a backup service, as well as a service called “b2”, which is basically an S3-like object storage service.

                    2. 1

                      For the use cases I’ve had, I have (we have) used Fastly, a local Varnish/Apache/Nginx, or Rails middleware. The goals were some combination of a) overriding the backend’s declared cache lifetime b) speeding up page response c) letting the client cache various things even if not cachable by intermediates.

                      Cloudflare combines all that with good DDoS protection and good performance globally. I can see how that’s an attractive feature set to many people, and while it’s a shame that VCs haven’t funded three dozen copycats, suggestions like @asymptotically’s that people just shouldn’t use it are stupid. It’s a fine combination of features, and telling people to just not want it, without suggesting alternatives, is IMO offensive and stupid.

                    3. 4

                      I don’t think so. I think that Cloudflare’s offerings are very good, they got this whole thing fixed in 30 minutes and explained how they’re making sure nothing similar happens again.

                      The main problem I have with Cloudflare is their size. What good is a decentralised internet if we just connect through the Cloudflare VPN, resolve a domain via Cloudflare DNS and then get our requests proxied through Cloudflare?

                      I also hate the captchas that you are occasionally forced to do.

                      1. 3

                        the captchas that you are occasionally forced to do

                        Or all the time when connecting through Tor. Privacy Pass barely works :/ and it’s really silly that you need captchas to just view public pages! If they want to prevent comment spam and whatnot, why not restrict captchas to non-GET requests by default >_<

                      2. 1

                          DNS or anti-DDoS? Doesn’t OVH have anti-DDoS servers, for example?

                        1. 6

                            Cloudflare is a CDN with anti-DDoS features (and has some related products, such as a registrar). It offers quick page access anywhere in the world, excellent support for load spikes, and DDoS protection.

                            A lot of ISPs offer anti-DDoS features for their products (which may be a product like Cloudflare’s or a different one, like OVH’s), but the feature is often one that displeases the victim: dropping packets to the attacked IP address until the attacker grows bored and goes away. I don’t know what OVH means by anti-DDoS, and their description page sounds a little noncommittal to my ears.

                          1. 3

                              OVH’s anti-DDoS will trigger on legitimate traffic, and then people will say your website has been “hugged to death” when it’s just OVH shutting down all incoming connections.

                            1. 2

                              OVH, the service from which 1/3 of my current bot-attacks come..

                              1. 1

                                Okay. Never used their services myself and don’t know how bots affect their anti-ddos or DNS.

                          2. 2

                              My impression was that BGP problems (specifically BGP leaks, I think) were not just a problem for a CDN like Cloudflare, but have also allowed mistakes by small players to make huge numbers of people temporarily lose internet access.

                            Is there a difference in what happened here, and if so, is it a difference of scale, or some other kind of difference?

                            1. 3

                              This incident is related to internal BGP, not eBGP, and could’ve happened with any internal routing protocol.

                          1. 3

                              The link seems to go to your post on some other website? This is the correct link: https://www.itworldcanada.com/slideshow/xerox-parc-50-years-of-innovation

                            1. 4

                                Why remap to the Enter key? It’s placed so far away. I remapped my ‘a’ key to Control, as I pretty much always hover over the ‘a’ key already.

                              1. 1

                                I agree with you, but the author shared their motivations in an older post, http://emacsredux.com/blog/2013/11/12/a-crazy-productivity-boost-remap-return-to-control/

                                1. 1

                                    I guess I can see it, but the advantages seem pretty minor, certainly not what I would call “crazy”. Also, having used the “dual function” key that is described, it can be pretty finicky sometimes. Hitting Enter instead of Control can be rather disruptive.

                                  If it works, great. It just feels unnecessary.

                                  1. 2

                                      Well, back then this type of keyboard remapping seemed pretty novel to me; today I probably wouldn’t use the adjective crazy to describe it. :-) Still, Enter is definitely easier to press with a pinky than the actual left CTRL on most ANSI keyboards, and not having to move my hand off the home row is quite nice. I did play at some point with using SPC as Control, but typing several spaces in a row becomes quite problematic with that arrangement. :-)

                                    1. 1

                                        One more thing - I came across this idea when I was working on a Mac keyboard without a left Control to begin with, and it was the only way I wouldn’t lose any other key (e.g. one of the Options) in exchange for the left Control I desperately needed. With Linux and a normal Win keyboard that’s not as big of an issue, but I still prefer that arrangement over a Control on the bottom row of the keyboard.

                              1. 3

                                The webpage seems to be down, is there a working mirror?

                                1. 2

                                  It’s back up. I need to move it over to my new server; will probably do that this weekend.

                                  1. 2

                                      What problem does Pleroma solve in contrast to Mastodon? Can you elaborate a bit more? Edit: I ask because I thought Mastodon and Pleroma both use ActivityPub and can interconnect that way.

                                    1. 3

                                        Yep, both use the same protocol. I suggested Pleroma because I think meta-federation is important, meaning that not all instances should run Mastodon and that instances running different software should be able to talk to each other.

                                  1. 1

                                    CaptureOne, Synology NAS and JottaCloud

                                    1. 5

                                      It’s a shame that Deno’s community feels hostile. An example of when somebody suggested they should use a code of conduct,

                                      1. 1

                                        Set your DoH endpoint to “custom” and add https://doh.libredns.gr/dns-query (For more info, https://libredns.gr/) :)
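
                                          If you prefer to skip the UI, the same thing can be set from user.js (just a sketch, and it assumes you are doing this in Firefox; network.trr.mode and network.trr.uri are the prefs behind Firefox’s DoH settings, with mode 3 meaning DoH only and 2 meaning DoH with fallback to the system resolver):

                                            // user.js - Firefox's "Trusted Recursive Resolver" (DoH) prefs
                                            user_pref("network.trr.mode", 3); // 3 = DoH only, 2 = DoH with fallback
                                            user_pref("network.trr.uri", "https://doh.libredns.gr/dns-query");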

                                          1. 3

                                            I used to use Android phones, but the support cycle was ridiculous; I felt like most major upgrades never landed on the phones I had. Then I got an iPhone, and it felt hassle-free: everything just worked, and they support older models pretty long too. I’ll probably use my iPhone X as long as it gets updates.

                                            1. 11

                                              One thing I like about HN is that it has a broader scope; there are often many interesting non-IT-related articles, although the broader scope is both an advantage and a disadvantage.

                                              Tildes might be a good complement to Lobsters for that, although it’s still very young. I’ve got a bunch of invites if people want them.

                                              1. 3

                                                I like this about HN too. It’s fine for Lobsters to be a place solely devoted to computer technology, but I’m happy to read Hacker News articles that “gratify one’s intellectual curiosity” without necessarily being computing related.

                                                1. 1

                                                  Yup, for being “tag driven” Lobsters is surely narrow. It seems like a wasted opportunity for a more open and flexible platform. I was disappointed to find out that tags were basically crossposts.

                                                  1. 3

                                                    Having a narrower platform has its benefits. Reddit and 4chan have a lot of openness and flexibility, but a decidedly different culture.

                                                    1. 2

                                                      The platform is available; depending on design that tweak may be as little as a few hours of work.

                                                      This community, however, succeeds in large part because it is narrow. Look at Reddit 13 years ago today: it was a programming community with thoughtful sidelines into history, science, and even politics. Today I can only describe it as lowest-common-denominator pop culture, mostly images and video that can be fully consumed and understood in a few seconds.

                                                      1. 1

                                                        I mean, is Lobsters narrow by design or because there aren’t many people here? I’d argue the latter.

                                                        It has always been a matter of numbers for Reddit and many other forums. Moderating critical mass is the challenge, and I don’t see how Lobsters addresses that. I was under the impression tags were the proposed solution, but they don’t seem to be any different from subreddits, other than that you can add multiple tags instead of one subreddit.

                                                        Personally I think that dynamic tags with tag subscriptions are the way to do this. Diaspora kinda nailed the design in that regard but unfortunately failed in many others.

                                                        1. 3

                                                          Lobsters is narrow by design. The about page has always included that focus in its first sentence, and always included:

                                                          [Tagging] keeps the site on-topic by only allowing a predefined list of tags. These tags represent what most of the users of the site want to read, so content that does not fit into any of those categories should not be submitted.

                                                          1. 1

                                                            That’s a bit of a stretch, isn’t it? It’s identical to subreddits - it’s just that now you can have 3 from a limited selection instead of 1 from a very big selection. It doesn’t make anything narrow, just limited.

                                                      2. 1

                                                        Laarc had some great articles but didn’t take off. The experiment did show how it might look to have a HN-like site with a small community, more open tagging, and better search capabilities. Check its design out.

                                                      3. 1

                                                        I’d love to have an invitation, my email is jussi at protonmail.com - thanks!

                                                      1. 2

                                                        Gandi and namegear

                                                        1. 2

                                                          Gitlab CI when I get to choose, but usually I’ve used Drone or Jenkins

                                                          1. 1

                                                            bin

                                                            binaries and scripts I want to include in my $PATH

                                                            development

                                                            I organize everything by repositories, like

                                                            • development/sr.ht/jussi/repo-name
                                                            • development/github.com/metosin/reitit

                                                            documents

                                                            A lot of various documents from my download directory, sorted by filetype, client name or content. I automate sorting these using Hazel.

                                                            archive

                                                            Automated by hazel as well. After a file has been in documents for 4 months, it’ll be archived (or deleted, depends on various things like content and client).

                                                            1. 5

                                                              I like Wire quite a lot! Desktop & mobile apps are both great.

                                                              1. 20

                                                                This is not an apology for Comcast, but my gut tells me that wrapping yet another protocol in HTTPS is maybe not the best idea. To be more technical, TCP overhead and SNI loopholes make DoH seem like a half-solution–which could be worse than no solution at all.

                                                                Also, I think DoH is yet another Google power-play–just like AMP–to build yet another moat around the castle.

                                                                1. 16

                                                                  Yea .. I mean, the slides aren’t wrong. And once Firefox is DoH->Cloudflare and Chrome is DoH->Google, who is to say either one wouldn’t just decide to delist a DNS entry they don’t like, claiming it’s hate speech? Keep in mind, both companies have already done this to varying extents, and it should be deeply troubling.

                                                                  I run a local DNS server on my router that I control. Still, it queries the root servers in plain text, and my ISP could see that (even though I don’t use my ISP’s DNS .. not sure if they’re set to monitor raw DNS traffic or not). I could also pump that through one of my hosting providers (Vultr or DigitalOcean), and it’s less likely they’d be monitoring and selling DNS data (but they still could if they wanted).

                                                                  Ultimately, the right legal argument to lobby for is banning ISPs from collecting DNS data or altering DNS requests at all (no more redirects to a Comcast search page for non-existent domains!). That feels like a more correct solution than centralizing control in Google’s/Cloudflare’s DNS.

                                                                  1. 12

                                                                    I also run a local resolver (a Pi-hole, for DNS-based ad filtering), and use DoT (DNS over TLS) between my resolver and an upstream resolver.

                                                                    It seems like host OS resolvers natively (and opportunistically) supporting DoT would solve a lot of problems, versus this weird Frankenstein per-app DoH thing we seem to be moving towards.

                                                                    1. 4

                                                                      not sure if they’re set to monitor raw DNS traffic or not

                                                                      They most certainly do, and a few less scrupulous ISPs have been shown to be MITM’ing DNS responses for various reasons but usually $$$.

                                                                      1. 4

                                                                      Isn’t the real problem here the user’s choice of ISP? Or has so much of the internet become extremely monopolized around the world?

                                                                        1. 9

                                                                        In the USA, there is basically zero choice in who your ISP can be; many urban areas, even big ones, have only one ISP. Perhaps if SpaceX can get their Starlink stuff commercialized next year, the effective number will grow to 2… maybe. I can’t speak for other countries, but in my experience they aren’t generally better in terms of options, though they do tend to be better on price. US ISPs know they are the only game in town and charge accordingly.

                                                                          1. 2

                                                                            In the USA, there is basically zero choice in who your ISP can be

                                                                            That’s understandable, but DoH is not the answer here. Addressing the lack of choice is the answer. If Google and Firefox/CF get a free pass in the US, it affects the rest of the world.

                                                                            1. 1

                                                                              I totally agree with you.

                                                                            2. 2

                                                                          I consider myself lucky, then. I can choose between anything that can run over POTS, cable, and fiber, with the POTS and fiber networks being required to open up their networks to other ISPs as well.

                                                                              1. 1

                                                                            In the USA, there is basically zero choice in who your ISP can be; many urban areas, even big ones, have only one ISP.

                                                                            This is not strictly true at all. Most urban areas in the US are a duopoly insofar as the internet goes: you usually have a choice between the cableco and the telco. In addition, telcos are often required to provide CLECs with some sort of access to the copper lines as well, so there’s some potential for additional choices like Sonic DSL, although those are becoming more rare because the telco often charges CLECs more for access to this copper than the price of its own internet service sold directly to the consumer; Sonic is one of the few remaining independent CLECs out there.

                                                                            Some areas do have additional choices like PAXIO, Webpass, Google Fiber, as well as local municipal networks.

                                                                                1. 3

                                                                              Only about 10% of the US has more than 2 providers at any speed. When you get into slower speeds, there are 2 choices (telco and cable company).

                                                                                  “At the FCC’s 25Mbps download/3Mbps upload broadband standard, there are no ISPs at all in 30 percent of developed census blocks and only one offering service that fast in 48 percent of the blocks. About 55 percent of census blocks have no 100Mbps/10Mbps providers, and only about 10 percent have multiple options at that speed.” - https://arstechnica.com/information-technology/2016/08/us-broadband-still-no-isp-choice-for-many-especially-at-higher-speeds/

                                                                                  Figure 5 in the linked article above pretty much sums it up. So we are both correct, depending on perspective. :) The FCC thinks all is fine and dandy in the world of US internet providers. Something tells me the Cable companies are encouraging that behaviour :)

                                                                          2. 1

                                                                              And once Firefox is DoH->Cloudflare and Chrome is DoH->Google

                                                                              Once the standards are in place for DHCP (et al.) to report a default DoH endpoint to use, and OSes can propagate their own idea of it (informed by DHCP or user configuration) to clients, or do the resolving for them via DoH, there’s little reason for Firefox or Chrome not to use that data.

                                                                              That issue is regularly mentioned in the draft RFCs, so there will be some solution to it. But given that there’s hijacking going on, browser vendors seem to be looking for a solution now instead of waiting for this part of the puzzle to propagate through systems they don’t control.

                                                                            Also, web browsers have a culture of “implement first, standardize once you experienced the constraints”, so this is well within their regular modus operandi - just outside their regular field of work.

                                                                            Lobbying work isn’t as effective as just starting to use DoH because you have to do it in each of the nearly 200 jurisdictions around the globe.

                                                                            1. 1

                                                                              Not holding my breath on a legal solution. US gov has not been a friend of privacy, and other governments are far worse.

                                                                              Only thing coming to mind here is some sort of privacy-oriented low-profit/non-profit organization to pool and anonymize queries over many different clients. Even that’s not so great when most setups are 8.8.8.8, admin/password, and absolutely DNGAF.

                                                                              1. 1

                                                                                And once Firefox is DoH->Cloudflare and Chrome is DoH->Google, who is to say either one wouldn’t just decide to delist a DNS entry they don’t like, claiming it’s hate speech? Keep in mind, both companies have already done this to varying extents, and it should be deeply troubling.

                                                                                Like Cloudflare not supporting EDNS.. :/

                                                                              2. 3

                                                                                To be more technical, TCP overhead and SNI loopholes make DoH seem like a half-solution

                                                                                The TCP/TLS overhead can be minimized with keep-alive, which DoT clients like stubby already do. You can simply reuse an established connection for multiple queries. This has worked very well for me in my own setups.
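
                                                                                For what it’s worth, the same connection-reuse trick carries over to DoH. A minimal Node sketch (it assumes Google’s JSON DoH API at dns.google/resolve, used only because it fits in a few lines; binary RFC 8484 DoH benefits from keep-alive the same way):

                                                                                  const https = require("https")

                                                                                  // One keep-alive agent reused for every lookup, so the TCP+TLS handshake is paid
                                                                                  // once and later queries ride the same connection.
                                                                                  const agent = new https.Agent({ keepAlive: true })

                                                                                  function lookup(name, type = "A") {
                                                                                    return new Promise((resolve, reject) => {
                                                                                      https
                                                                                        .get(`https://dns.google/resolve?name=${encodeURIComponent(name)}&type=${type}`, { agent }, (res) => {
                                                                                          let body = ""
                                                                                          res.on("data", (chunk) => (body += chunk))
                                                                                          res.on("end", () => resolve(JSON.parse(body)))
                                                                                        })
                                                                                        .on("error", reject)
                                                                                    })
                                                                                  }

                                                                                  // Sequential lookups reuse the agent's pooled TLS connection.
                                                                                  lookup("lobste.rs")
                                                                                    .then((r) => console.log(r.Answer))
                                                                                    .then(() => lookup("example.com", "AAAA"))
                                                                                    .then((r) => console.log(r.Answer))
                                                                                    .catch(console.error)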

                                                                                As others have probably pointed out, the SNI loophole can be closed with eSNI. How soon this will take hold, and whether it does at all, is anyone’s guess at this point. But I personally see privacy as more of a side effect, as I simply care that my queries are not manipulated by weird networks.

                                                                                This is not an apology for Comcast, but my gut tells me that wrapping yet another protocol in HTTPS is maybe not the best idea.

                                                                                I would love to agree with you here (and I do in principle), but from my own experience with DoT and DoH I can tell you that many networks simply don’t allow a direct DoT port, leaving you with either DoH or plain DNS to an untrusted (and probably non-validating) resolver. The shift to “X over HTTPS” is just a reaction to real-world limitations, where almost everything but HTTP(S) is likely to be unreachable in many networks. I’d love to use DoT, and I do so whenever I can, but I need to disable it more often than I’d like. :(
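
                                                                                A quick way to see whether a given network lets DoT through at all is to try port 853 directly (just a sketch using Node’s tls module; 1.1.1.1 / cloudflare-dns.com is only one example of a DoT-capable resolver):

                                                                                  const tls = require("tls")

                                                                                  // Quick check: does this network let DoT (TCP port 853) through at all?
                                                                                  const socket = tls.connect({ host: "1.1.1.1", port: 853, servername: "cloudflare-dns.com" }, () => {
                                                                                    console.log("port 853 reachable, TLS established, certificate ok:", socket.authorized)
                                                                                    socket.end()
                                                                                  })
                                                                                  socket.setTimeout(3000, () => {
                                                                                    console.log("timed out - DoT is probably blocked on this network")
                                                                                    socket.destroy()
                                                                                  })
                                                                                  socket.on("error", (err) => console.log("DoT connection failed:", err.code))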

                                                                                A minor fun fact regarding DoH: since an HTTP(S) server can redirect to different endpoints, it’s in principle possible for clients to choose different “offers” - a DoH server may offer standard resolution on /query and filter out ad networks on /pihole or whatever. And using dnsdist, this is easy to set up and operate yourself. DoH doesn’t really mean DNS centralization; it opens the door to quite the opposite: you could now take your own resolver with you wherever you go.

                                                                                1. 1

                                                                                  I’m fine with DoH as a configurable system-level feature, but application-level resolvers are bad news, and that seems to be where all of this is headed.

                                                                                  If that’s where it goes, many applications will default to their own favorite DoH provider for some kind of kickback. The prospect of having to find the “use system resolver” checkbox for every single application after every other update does not bring joy.

                                                                                2. 3

                                                                                  HTTPS is upgrading to QUIC, so we’ll eventually have DNS back on UDP, but with proper encryption this time.