Threads for miku

  1. 15

    The Internet Archive is a unique memory institution in our time. It aims to preserve parts of the vast digital world and to digitize bits of the physical world [1]. It doesn’t track anyone, you can access it freely (and conveniently, e.g. from the command line [2]), and it is used by millions every day.

    I hope people support their fight [3][4].

    [1] https://twitter.com/internetarchive/status/1589671252968734720 [2] https://github.com/jjjake/internetarchive [3] https://blog.archive.org/2023/03/25/the-fight-continues/ [4] https://mastodon.archive.org/@internetarchive/110081855742221604

    Disclaimer: I’m contributing to a few software projects at IA.

    1. 7

      The Internet Archive is a unique memory institution in our time.

      Most people I’m reading who know about archiving and its importance are livid. Not at the court, at IA for doing something so mind-bogglingly risky that might end up bringing the whole project down.

      The best support the rest of us can offer might be to stand up a replacement with better leadership, get as much data copied over as possible, and then let IA collapse.

    1. 4

      6th generation ThinkPad X1; I cannot think of a better machine at this point in time (I only wish it were more repairable). An X220 for recreation.

      1. 2

        I use a 5th gen X1 Carbon and keep failing to find a good reason to update it.

      1. 20

        After I learned about “ci” in vim I got hooked. All of a sudden, replacing text in quotes became as simple as ci" and now I’m having a hard time using other editors. Sometimes a little detail is all that it takes.

        1. 8

          This was extremely helpful thanks.

          Just to clarify for others: in vim, “c” starts a change, and the next keystrokes determine what will be changed. For example, “c$” removes the text from the cursor to the end of the line and leaves you in insert mode.

          What is new for me is that vim has a concept of “inner text”: things in quotes, or in between any two matching delimiters. The text between those two delimiters is the “inner text”.

          For example, in this line, we want to change the “tag stuff” to “anything”.

          <tag style="tag stuff">Stuff</tag>
          

          Move the cursor anywhere between the quotes and type ci followed by a double quote (ci"), and you are left with

          <tag style="">Stuff</tag>
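
A minimal sketch of what the “inner text” lookup amounts to, written in Python rather than Vim script. It assumes quotes are paired left to right along the line and ignores escaped quotes; the function name change_inner_quotes and the cursor-as-index convention are made up for illustration and are not how Vim implements it.

```python
def change_inner_quotes(line, cursor, replacement="", quote='"'):
    """Replace the text between the pair of quotes that encloses `cursor`.

    Hypothetical helper: `cursor` is a 0-based index into `line`.
    """
    positions = [i for i, ch in enumerate(line) if ch == quote]
    # Pair quotes left to right: (1st, 2nd), (3rd, 4th), ...
    for open_at, close_at in zip(positions[0::2], positions[1::2]):
        if open_at <= cursor <= close_at:
            return line[:open_at + 1] + replacement + line[close_at:]
    return line  # cursor is not inside a quoted string: leave the line alone


line = '<tag style="tag stuff">Stuff</tag>'
cursor = line.index("stuff")  # somewhere between the quotes
print(change_inner_quotes(line, cursor))
print(change_inner_quotes(line, cursor, "anything"))
```

The first call mirrors the state right after ci" (empty quotes); the second mirrors the result once the new text has been typed.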
          
          1. 8

            This is a good example of why to me learning vi is not worth the trouble. In my normal editor, which does things the normal way, and does not have weird modes that require pressing a key before you are allowed to start typing and about which there are no memes for how saving and quitting is hard, I would remove the stuff in the quotes by doing cmd-shift-space backspace. Yes, that technically is twice as many key presses as Vi. No, there is no circumstance where that would matter. Pretty much every neat Vi trick I see online is like “oh if you do xvC14; it will remove all characters up to the semicolon” and then I say, it takes a similar number of keystrokes in my editor, and I even get to see highlight before it completes, so I’m not typing into a void. I think the thing is just that people who like to go deep end up learning vi, but it turns out if you go deep in basically any editor there are ways to do the same sorts of things with a similar number of keystrokes.

            1. 14

              There is not only the difference in the number of keystrokes but more importantly in ergonomics. In Vim I don’t need to hold 4 keys at once but I can achieve this by the usual flow of typing. Also things are coherent and mnemonic.

              E.g. to change the text within the quotes I type ci" (change inner ") as the parent already explained. However this is only one tiny thing. You can do all the commands you use for “change(c)” with “delete(d)” or “yield(y)” and they behave the same way.

              • ci": removes everything within the quotes and goes to insert mode
              • di": deletes everything within the quotes
              • yi": copies everything within the quotes

              d3w, c3w, y3w would for example delete, replace or copy the next 3 words.
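
The operator-plus-target grammar described above can be modelled roughly as follows. This is an illustrative Python sketch, not Vim's implementation: words are taken to be whitespace-separated (Vim's w also stops at punctuation, and cw has special-case behaviour), and all function names are hypothetical. Each operator returns the new text together with the text it captured, standing in for Vim's register.

```python
import re

WORD = re.compile(r"\S+\s*")  # simplified "word": non-blanks plus trailing blanks


def words_span(text, cursor, count):
    """Return (start, end) covering `count` words starting at `cursor`."""
    end = cursor
    for _ in range(count):
        match = WORD.match(text, end)
        if match is None:
            break
        end = match.end()
    return cursor, end


def delete(text, span):
    """Like d{target}: remove the span, keep the removed text."""
    start, end = span
    return text[:start] + text[end:], text[start:end]


def yank(text, span):
    """Like y{target}: leave the text alone, copy the span."""
    start, end = span
    return text, text[start:end]


def change(text, span, new_text):
    """Like c{target} followed by typing `new_text`."""
    start, end = span
    return text[:start] + new_text + text[end:], text[start:end]


text = "one two three four five"
span = words_span(text, 0, 3)  # the target of d3w / c3w / y3w
print(delete(text, span))
print(change(text, span, "1 2 3 "))
print(yank(text, span))
```

The point of the sketch is that the target is computed once and every operator accepts it unchanged, which is why learning one new target makes it available to c, d and y at the same time.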

              These are just the basics of Vim but they alone are so powerful that it’s absolutely worth learning them.

              1. 3

                Just a small correction; I think you meant “yank(y)” instead of “yield(y)”.

                1. 1

                  Haha yes thanks I really got confused :)

                2. 2

                  And if you want to remove the delimiters too, you use ‘a’ instead of ‘i’ (I think the logic is that it’s a variation around ‘i’ like ‘a’ alone is).

                  Moreover, you are free to choose the pair of delimiters: ", ', {}, (), [], and probably more. It even works when nested, and even when the nesting involves the same delimiter. If you have foo(bar("baz")) and your cursor is on baz, then c2i) will let you change bar("baz") at once. You want visual mode stuff instead? Use v instead of c.

                  This goes on for a long time.

                3. 6

                  One difference is that if you are doing the same edit in lots of places in your editor you have to do the cmd-shift-space backspace in every one, while in vi you can tap a period which means “do it again!” And the “it” that you are doing can be pretty fancy, like “move to the next EOL and replace string A with string B.”

                  1. 2

                    Sublime Text: ctrl+f search, ctrl+alt+enter select all results, then type your replacement.

                    1. 2

                      Yeah I just do CMD-D after selecting a line ending if I need to do something like that.

                  2. 3

                    I would remove the stuff in the quotes by doing cmd-shift-space backspace

                    What is a command-shift-space? Does it always select stuff between quotes? What if you wanted everything inside parentheses instead?

                    and then I say, it takes a similar number of keystrokes in my editor, and I even get to see highlight before it completes, so I’m not typing into a void

                    You can do it that way in vim too if you’re unsure about what you want, it’s only one keypress more (instead of ci" you do vi"c; after the " and before the c the stuff you’re about to replace will be highlighted). You’re not forced to fly blind. Hell, if your computer is less than 30 years old you can probably just use the mouse to select some stuff and press the delete key and that will work too.

                    The point isn’t to avoid those modes and build strength through self-flagellation; the point is to enable a new mode of working where something like “replace this string’s contents” or “replace this function parameter” become part of your muscle memory and you perform them with such facility that you don’t need feedback on what you’re about to do because you’ve already done it and typed in the new value faster than you can register visual feedback. Instead of breaking it into steps, you get feedback on whether the final result is right, and if it isn’t, you just bonk u, which doesn’t even require a modifier key, and get back to the previous state.

                    1. 2

                      What if you wanted everything inside parentheses instead?

                      It is context sensitive and expands to the next context when you do it again.

                      Like I appreciate that vi works for other people but literally none of the examples I read ever make me think “I wish my editor did that”. It’s always “I know how I would do that in my editor. I’d just make a multiselection and then do X.” The really powerful stuff comes from using an LSP, which is orthogonal to the choice of editors.

                    2. 2

                      I do not disagree. For vim, as for your editor, the process is in both places somewhat complex.

                      Like you I feel I only want to learn one editor really well. So I choose the one which is installed by default on every system I touch.

                      For which I give up being able to preview what happens and some other niceties. Everything is a tradeoff in the end

                    3. 2

                      In a similar way, if you want to change the actual tag contents from “Stuff” to something else:

                      <tag style="tag stuff">Stuff</tag>
                      

                      you can use cit anywhere on the line (between the first < and the last >) to give you this (| is the cursor):

                      <tag style="tag stuff">|</tag>
                      

                      Or yit to copy (yank) the tag contents, dit to delete them, etc. You can also use the at text object instead of the it text object to include the rest of the tag: yat will yank the entire tag <tag style="tag stuff">Stuff</tag>.

                      Note that this only works in supported filetypes, html, xml etc., where vim knows to parse markup tags.

                    4. 2

                      I really like that I keep stumbling on tidbits like this one that continue to improve my workflow even further.

                    1. 2

                      One related PEP that is not mentioned would be PEP 582 “Python local packages directory” [1], and a tool that uses that called pdm [2].

                      I regularly come across the use case where people should be able to get started with Python code taken from a repo, but do not know about virtual environments. PEP 582 would seem like a real solution in that respect.

                      Projects that use a source management system can include a __pypackages__ directory (empty or with e.g. a file like .gitignore). After a fresh checkout of the source code, a tool like pip can be used to install the required dependencies directly into this directory.

                      [1] https://www.python.org/dev/peps/pep-0582/ [2] https://pdm.fming.dev/
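
A rough sketch of the lookup PEP 582 proposes, emulated by hand because a stock CPython interpreter does not perform it. It assumes the __pypackages__/<major>.<minor>/lib layout from the PEP text; the helper names local_packages_dir and enable_local_packages are hypothetical.

```python
import sys
from pathlib import Path


def local_packages_dir(project_root):
    """Return <root>/__pypackages__/<major>.<minor>/lib for the running interpreter."""
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    return Path(project_root) / "__pypackages__" / version / "lib"


def enable_local_packages(project_root):
    """Put the local packages directory first on sys.path, if it exists.

    Returns True when the directory was found, False otherwise.
    """
    lib_dir = local_packages_dir(project_root)
    if not lib_dir.is_dir():
        return False
    if str(lib_dir) not in sys.path:
        sys.path.insert(0, str(lib_dir))
    return True


if __name__ == "__main__":
    found = enable_local_packages(Path.cwd())
    print("local packages enabled" if found else "no __pypackages__ directory here")
```

One way to populate such a directory is pip's --target option pointed at the path returned by local_packages_dir; which requirements file to install from is left to the project.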

                      1. 19

                        If the project allows it, Go and sqlite3 can be a wonderful combination. For a recent web service [1], we had billions of rows (and documents) stored in various sqlite3 databases, between 250 and 600GB in total. Plain vanilla net/http, sqlx and go-sqlite3 got us to hundreds of requests per second - resulting at times in over 10K queries per second (on a single m4.2xlarge equivalent). Now that’s not that impressive, but what makes this setup enjoyable is its simplicity and the prospect of low maintenance.

                        [1] lightning talk: https://github.com/miku/dwstalk
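
To illustrate the “a file on disk is the database” pattern, here is a small sketch using Python's standard-library sqlite3 module as a stand-in for the Go stack named above (net/http, sqlx, go-sqlite3). The docs table, its schema and the function names are assumptions made for this example and say nothing about how the actual service was built.

```python
import json
import sqlite3


def open_store(path):
    """Open (or create) a single-file document store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def put(conn, doc_id, doc):
    """Insert or overwrite one JSON document."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO docs (id, body) VALUES (?, ?)",
            (doc_id, json.dumps(doc)),
        )


def get(conn, doc_id):
    """Fetch one document by primary key, or None if it is missing."""
    row = conn.execute("SELECT body FROM docs WHERE id = ?", (doc_id,)).fetchone()
    return json.loads(row[0]) if row else None


conn = open_store(":memory:")  # use a file path for a persistent store
put(conn, "doc-1", {"title": "hello"})
print(get(conn, "doc-1"))
```

Lookups by primary key like get are the kind of query that stays cheap as the file grows, which is part of why such a setup needs little maintenance.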

                        1. 6

                          That is pretty impressive! I know this isn’t a tech silver bullet (nothing is) but I do think it’s an underutilized pattern right now. There is a time and a place for giant database servers/services and there’s a time when a file sitting on a hard drive will do just fine.

                          1. 2

                            Ah, don’t undersell it. IMO, while hundreds of rq/s isn’t interesting for a ping-pong type webserver benchmark, it’s pretty respectable for something that’s actually reading information from persistent storage and doing real work.

                            1. 1

                              as a devops evangelist (perhaps more towards the ops side) it’s a refreshing pattern… this is the kind of stuff ops people love to run

                              i wonder if that’s who’s building a lot of the systems designed like this, the type of devs who appreciate what it takes to run a service

                            1. 36

                              Pip pip! Time to rev our coding engines to a high RPM! We’ll be sure to have a lot of snacks (yum!) to keep our spirits up. Of course, we’ll be apt to get some people who nix our great ideas, but I’m sure that’s just because of a desire to avoid cargo-cult programming.

                              1. 24

                                This seems like a tangled ball of yarn from the go get. It’s possible there will be some gems, but that’s assuming that nobody placed a hex on the conference. Regardless, I’m sure there will be a cabal of mavens in attendance ready to talk about their stack.

                                1. 4

                                  Take it easy everyone. Go brew some coffee and watch some asdf videos.

                                  1. 3

                                    Nah, me and the Guix were hungry–and not for drinks or something chocolatey–so we rock-paper-scissor’d, but I threw papers in luarocks so this little coding cabal’ll be getting some spago at the place with the foreign name, Leiningen’s.

                                2. 8

                                  banned

                                  1. 3

                                    Or let’s just play some pac-man.

                                  1. 12

                                    Another nice project with an S3 compatible API is seaweedfs (née weedfs): https://github.com/chrislusf/seaweedfs, inspired by the haystack paper (from FB, when FB had around 1-2K photo uploads per second); we use it in production (albeit not in distributed mode). A lightning talk I did a few months ago: https://github.com/miku/haystack

                                    1. 1

                                      if you do not mind, a question – did you find any solutions that are based on JDK 11+ (Java/clojure/scala, etc) – I am looking for a file store open source lib, but I would like it to be compatible with a JVM ecosystem.

                                      1. 1

                                        Interesting, I’d assume a JVM ecosystem would permit non JVM code. Is it a JVM client library you want?

                                        1. 1

                                          Not the OP, but I’ve heard that some banks will refuse to deploy any code that doesn’t run on the JVM.

                                          1. 1

                                            Wow, do you have perhaps an example, or country of a possible example?

                                            I know crux db is on the JVM, and they can use and even encourage their object store to be on Kafka (famously JVM)

                                            1. 1

                                              Unfortunately, no. This was just word-of-mouth from people in adjacent businesses so feel free to take it with a grain of salt.

                                              The general contour of reasoning was that with security being a top concern, they prefer to deploy code in ways that are familiar.

                                          2. 1

                                            Thank you for the follow-ups. I would like the whole service to be packageable and accessible as a JAR that I can incorporate in our ‘uber’ JAR.

                                            One of the ‘features’ of the backend I am working on is a simple deployment. In practice, that means a single JAR + PostgreSQL.

                                            The single JAR contains about 20 ‘micro-services’, essentially. So a user can start up just one JAR and have everything on ‘one host’, or start the JAR with config params telling it which microservices to start on which host. That configuration file is like a ‘static’ service orchestrator. It is the same for all the hosts, but there are sections for each host in the deployment.

                                            One of the microservices (or easier to call them just services) I am going to be enhancing is a ‘content server’. Today the content service basically needs a ‘backend-accessible-directory’.

                                            That service does all the other things: administering the content, acting as IAM Policy Enforcement Point, caching content in a memory-mapped db (if a particular content is determined to be needed ‘often’), a non-trivial hierarchical directory management to ensure that too many files do not end up in ‘one folder’, and so on.

                                            I need to support features where files are in a ‘remote content server’ (rather than in a locally accessible directory), where the content server is S3 (or some other standard-compatible system). So I would like the ‘content server’ to be included as a 21st service in that single JAR.

                                            Which is why, I am not just looking for a client, but for the actual server – to be compatible with JVMs. Hope the explanation gives more color to the question I asked.


                                            With regard to the other comment, where folks mention that some organizations like banks prefer JVM-only code: that’s true to a degree. It is a preference, not an absolute requirement though.

                                            That’s because some of these organizations have built by themselves ‘pre-docker’ deployment infrastructures. Where it is easy to request a ‘production capacity’ as long as the deployed backend is a JAR (because those deployment infrastructures are basically JVM clusters that support migrations, software defined networks, load balancing, password vaults, monitoring, etc)

                                            So when a vendor (or even an internal team) comes in and says “for our solution we run on docker”, it is OK, but they have invested millions and now want to continue to get benefits (internal payment, basically) for their self-built infrastructure management tools. That is why there is a preference for JVM-only solutions and, perhaps, will be for some time.

                                            And to be honest, the JVM (and JVM-based languages) and its tool ecosystem continue to evolve (security, code analysis, performance, etc.), so it seems that the decisions back then to invest in managed infrastructure around the JVM were good decisions.

                                      1. 3

                                        Daily:

                                        • z (better cd)
                                        • jq (json data munging)
                                        • delta (diff)
                                        • pandoc (pdf, docx generation)
                                        • tokei, ssc (sloc)
                                        • tmux (terminal multiplexer)
                                        • pv (progress bar)
                                        • shuf (reservoir sampling)
                                        • fzf and ag (only within vim)
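
For readers unfamiliar with the technique the list associates with shuf, here is a minimal sketch of reservoir sampling (the classic “Algorithm R”): it picks k items uniformly at random from a stream of unknown length in a single pass. This is an illustration of the general algorithm, not a description of shuf's source code; the function name and the optional rng parameter are choices made for this example.

```python
import random


def reservoir_sample(items, k, rng=None):
    """Return k items chosen uniformly at random from the iterable `items`.

    If the iterable yields fewer than k items, all of them are returned.
    """
    if rng is None:
        rng = random.Random()
    reservoir = []
    for index, item in enumerate(items):
        if index < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (index + 1).
            j = rng.randint(0, index)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


print(reservoir_sample(range(1_000_000), 5))
```

Memory use stays proportional to k regardless of how long the input is, which is what makes the approach attractive for sampling lines from large files or pipes.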

                                        Alias:

                                        $ type open
                                        open is aliased to `xdg-open'
                                        

                                        Occasionally:

                                        • ncdu, diskonaut (disk usage)
                                        • pup (like jq, but for html)
                                        • ranger, nnn (file browser)
                                        • fd (like find)
                                        • esbulk, solrbulk, esdump, solrdump (indexing and index export tools)
                                        • git-cal (git calendar view)
                                        • clinker (mass link checker)
                                        • tcptrack, speedometer, iftop, bmon (net monitoring)
                                        1. 5

                                          While technically impressive, copilot seems to me like a solution looking for a problem. If all you have is GPT, then all problems look like a completion problem with a prompt.

                                          On the flip-side, there is already so much code in the world today - someone, someday needs to read it, if it’s relevant. People forget that the ratio between writing and reading code is maybe 1/9. I feel writing code is not the bottleneck.

                                            1. 8

                                              The largest railway operator in Europe, Deutsche Bahn, recently announced they finally moved completely into the cloud. This article confirms my own back of the envelope calculations that they just increased their operational cost by a good amount. Cloud is like really cool, if you are getting started and you do not have any capital. In many other scenarios you pay a good premium for what might become a vendor lock-in over time.

                                              1. 11

                                                There are two things that make the cloud cheaper than alternatives:

                                                • You’re outsourcing things like physical security, buying replacements, installing software updates, and so on and so can share those costs with a load of other companies and pay a lot less than you’d pay full-time admin staff to do them.
                                                • You’re paying for what you need. You’re probably going to pay more for base load than you could otherwise, but if your peak demand times are ten times higher than your baseline then your average cost is going to be a lot lower than if you had to provision enough infrastructure to cover your peak load all of the time (this is where AWS came from: Amazon had a load of spare infrastructure from buying enough servers to cover peak buying time and wanted to make money from it the rest of the time).
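
The second point can be made concrete with a back-of-the-envelope calculation. All figures below are invented for illustration (10 servers at baseline, 100 at peak, peak load 5% of the time, and a pay-per-use price twice the self-hosted price per server-hour); they are not taken from the article or from any provider's price list.

```python
def pay_per_use_hourly_cost(baseline, peak, peak_fraction, unit_price):
    """Average hourly cost when you only pay for the servers in use."""
    average_servers = baseline * (1 - peak_fraction) + peak * peak_fraction
    return average_servers * unit_price


def provisioned_hourly_cost(peak, unit_price):
    """Hourly cost when enough servers for peak load run all the time."""
    return peak * unit_price


# Hypothetical numbers: the elastic option costs 2x per server-hour.
elastic = pay_per_use_hourly_cost(baseline=10, peak=100, peak_fraction=0.05, unit_price=2.0)
fixed = provisioned_hourly_cost(peak=100, unit_price=1.0)
print(f"pay-per-use: {elastic:.1f} per hour, provisioned for peak: {fixed:.1f} per hour")
```

Under these assumptions the elastic setup averages 14.5 servers and comes out well below the cost of keeping 100 running, despite the doubled unit price. With a flatter load profile (peak close to baseline) the same premium would make it the more expensive option.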

                                                That said, this article isn’t discussing the merits of the cloud, it’s comparing two cloud providers. I don’t know OVH, but my experience with other smaller providers is that they typically have a single datacenter and various single points of failure that can cause long periods of downtime. As a result, they’re cheaper. For my personal use, that’s absolutely fine: if I have 10 hours of downtime, it costs me absolutely nothing and I’d much rather pay less and occasionally grumble. For corporate use, 10 hours of downtime may cost more than I’m paying for a year of the server.

                                                1. 3

                                                  OVH is one of the bigger datacenter players. I’m personally running stuff on a “root” VM from netcup, with a “Minimum availability” of 99.9% for one simple personal host. So yeah, of course Amazon can provide much more, but others can also be pretty well equipped or have a really good guarantee. Why buy from them if you really don’t need it at all?! I don’t think this requirement is really necessary for most of the companies moving to the “cloud”. They probably have more outages due to their own misconfigurations and errors in their own software.

                                              1. 2

                                                This piece inspired a simplistic key-value store that we wrote to replace memcachedb. During development, the pcstat tool came in handy to answer the question:

                                                “is Linux caching my data or not?”

                                                pcstat gets that information for you using the mincore(2) syscall.

                                                1. 1

                                                  Another great utility in this space is diffoscope, which I believe came out of the reproducible builds effort; Chris Lamb presented it last year.

                                                  1. 1

                                                    This is cool. Better to have good UI though.

                                                  1. 3

                                                    For some background, I would also suggest The Design of the UNIX Operating System (1986), by Maurice Bach.

                                                    1. 3

                                                      lovely book, I’d recommend starting with this and “Unix Internals: The New Frontiers” by Uresh Vahalia.

                                                      1. 1

                                                        That’s a wonderful book that I wish had been updated.