Threads for zladuric

  1. 8

    I partly disagree. Of course you can practice learning, but I also think you should be an expert on something you are writing about as a guide.

    The reason is that there is real harm done when people learn and adopt bad habits and bad mental models. It can be really hard to untrain these things, and the reward is low.

    Of course it depends on the context, but I think the advice that everyone should publish tutorials on everything they touch is bad advice.

    However, writing about it for all the personal reasons listed (writing skills, better understanding, etc.) is, in my opinion, good advice.

    Don’t get me wrong, I don’t think it’s bad, but writing tutorials and explanations that create wrong mental models is something I wish people would be more aware of. I’ve just seen a lot of people with a wrong understanding of a part of what they do every day.

    To be fair, however, I think blogs and the like are the smaller issue here. Books containing these issues, half-marketing pieces, or simply hype-based articles are way worse.

    Like I said, it’s context dependent, and of course the reader is also to blame when they blindly copy problematic code or concepts they found on a personal blog.

    So in short, I think the discussion is a bit too black and white. Maybe a good option would be to say at the beginning of the article that you are just starting out. I think a problem might be that sometimes a side goal is to look good to employers or similar, and the tone therefore becomes that of an expert.

    On a similar note, this all applies even more to books. I’ve become very skeptical of books (or video courses or similar) written by people who are authors first and essentially never use what they write about in any production app.

    One last piece of advice: try to start with official docs as a reader, and look to official docs for further reading. Also consider writing official documentation, because then it’s central and more likely to be updated or fixed.

    Also, wikibooks etc. still exist. All of these have the benefit that others can update and fix them if necessary. And writing official docs certainly looks good on a resume, if that’s a side goal.

    1. 15

      I think you’re missing the point. The article isn’t saying everyone should write guides, as you seem to imply. It’s more that the article is suggesting that you should document your learning process. Not so much to teach others, as much as to teach yourself.

      I mean, the paragraph suggesting topics to write about is indicative of this: mistakes you made, insights that helped you progress past an issue, things you didn’t know… It’s more about writing for yourself, not for others. And in that case, even newbies can write. And hopefully get feedback on what they could do better, but that’s irrelevant.

      1. 2

        Yes, this wasn’t really just in direct response to the article, so I didn’t mean to imply that. I meant it more as a “yes, but there are also these cases…”. Me saying that I “partly disagree” was probably the wrong wording. Sorry about that.

        I meant it more as “so make sure to be clear that you are documenting your learning process”. Don’t end up phrasing things as if you were an expert, for whatever reason that may be. These things can happen by accident: maybe you write for yourself in a style you are used to from books, maybe at some point you are looking for a job and turn the writing into your portfolio, maybe you become an expert in the field later on. These things happen, and for someone searching for information it can be hard to distinguish.

        There are many layers to this. There’s of course also the non-expert who just tries to sell knowledge.

        On a related note, various companies do something similar to guest authors on blogs, but more like “knowledge bases built by users”, where users are paid in cash or product credits for writing content. If you like to write about stuff, that sounds fun, but it can also lead to the unintended side effect of lower-quality, SEO-optimized content that at first sight might even look like it comes from the company itself.

        So there are multiple thoughts there. Be transparent about something being your learning content, think about how you will deal with outdated information, explicitly state when something is an ugly hack or an experiment, and, like I wrote, also consider writing official documentation when you figure something out. It’s easier to find, and people typically review and update it.

        All of those are just suggestions, though. I think there are a lot of “document what you do” proponents out there, and I see myself as one of them. However, I have certainly been bitten by documentation that taught faulty concepts and was worse than it looked, and that can happen for various reasons. So I think the next step should be thinking about what you want to achieve and how. If it’s really just notes, make sure the reader knows. If you think there is too little documentation on a project, consider contributing to the official documentation. If you hacked something together, write about it and just mention that it’s a hack: it might be more obvious to you than to someone clicking on the fifth search result.

      2. 7

        I believe the best time to write a tutorial is when you both:

        • know the subject matter very well;
        • remember what it was like not knowing it.

        It matters that you are writing from a place of expertise, but remembering where you started from, and what actually helped you learn, can make the tutorial even more valuable. Though I reckon it’s not a reliable measure, I feel that my most-liked writings are the ones that follow this pattern.

        1. 5

          This was my first thought, too, but there’s a balance.

          It’s definitely helpful to read about other people’s learning experiences, and to do that somebody has to write about their learning experience…

          What I can’t stand, though, and where I think the danger lies, is the “newbie expert” style of writing. Sometimes I’ll be reading through an article with an authoritative style, on a topic I’m familiar with or a related sub-topic, and halfway through I’ll realize the author has basically no (or very minimal) real experience and doesn’t really understand what they’re writing about.

          1. 2

            This is something I see so often, and I don’t even think it’s ill-intentioned in most situations. I think it might be wanting to share something that was found, while using the style picked up from a book, official docs, etc. Or maybe people believe they are experts, which can also happen. And then there are also people pretending to be experts, often for monetary reasons, in the hope of impressing a future employer, or simply for fame.

            And therefore it makes sense to be clear about what your intention as a writer is. This can easily be done in a sentence. Starting an article with “I’ve recently started looking into X, and while setting it up I had some problems with Y, so I thought I’d share this here” comes off differently from “How to set up X with Y”.

          2. 5

            If the author of the technical content is honest about the fact that they just learned it, readers will know better than to take it as gospel. Students attempt to teach each other concepts they just learned in class an hour ago all the time and this is encouraged by educators.

            The real damage comes from actual experts who speak with authority but are still fallible. Any example of a bad mental model that is difficult to unlearn can probably be traced back to misguided advice by actual experts.

            1. 3

              Here’s an idea: write for yourself. You don’t even have to share it!

              1. 1

                If it were less work for experts to correct people who are “wrong on the internet”, then writing about things you are interested in would be like homework that corrects itself.

              1. 5

                It seems to me like everything that would fit in ‘fediverse’ would also fit under a broader ‘social media’ tag. So why not that? They’re alike in their potential for discussion about all kinds of nerd topics, they’re both made out of computer, they have the same problems categorically.

                I think we can agree that ‘social media’ wouldn’t be a good tag to add, but it’d certainly be welcome if there was a post about, say, why Twitter wasn’t going to immediately break, even after everyone got fired. That fits because it is appropriate for other tags. Imagine an article that is appropriate for ‘social media’ but not for any other tags. Bad fit, right?

                Is there a difference between ‘fediverse’ and ’social media’ that would make an article with only that tag be a good fit?

                1. 16

                  Personally I would rather see an ActivityPub tag, it’s much more technical and has narrower focus. I’m not sure if we’d have enough stories for it, though.

                  1. 7

                    But what about the dozens of people using Diaspora and Ostatus?

                    1. 1

                      That’s not very Zot of you.

                      1. 1

                        I mean I considered including Zot but there’s only red links about it on the wiki article.

                        And in fact, searching for “Zot” here turns up this comment, pointing to a repo site that’s currently closed for new accounts: https://framagit.org/zot/zap/blob/master/spec/Zot6/Messages.md

                        This seems to be the canonical site: https://getzot.com/, which is empty.

                  2. 2

                    Social media is only one possible use. There’s really no limit on what could be communicated over it. There’s already a chess server, several blogging platforms, and file sharing (Nextcloud). While these can be social, they aren’t social media in the way people mean when they say social media.

                  1. 7

                    I love all the talk about security, encryption, etc. that we can do in regards to the Fediverse (I really do want it to work, wish it could), but without some kind of ranking system around its full-text search, it’s practically useless as a Twitter competitor. That’s why it markets itself as “microblogs”, which is something nobody really wants. If you want a ranking system, Twitter is very strong evidence that you basically need to pipe all of your data to GCP.

                    Ignoring the lack of ranking, we have a network that has demonstrated that it fragments and disintegrates over time, where admins can grep your DMs if they want, and where anyone with enough users and/or domains will spam you as much as they please without any kind of access restriction. A large part of Twitter’s product is content and user moderation, something they struggled with as a $44bn company; I don’t see a couple of German nonprofits and guys running basement Raspberry Pi LAMP stacks coming up with an adequate solution to this problem that doesn’t require centralisation.

                    1. 15

                      Actually, the standard search and ranking system on commercial social media is one of their flaws, and the lack of it is one of the key advantages the fediverse has over them. Globally searchable content makes behaviour modification possible: governments and advertising agencies just need to find ways to optimise the search results and they get global reach, based not on quality of content but on quantity of capital. Good content on the fediverse gradually spreads to users that are interested in it, but in a way that can’t easily be manipulated for profit and power.

                      I know this is not a popular position. I know people coming from other platforms expect to have world-wide reach from the word go, but we need to examine this desire and have a real debate about whether it is necessary or desirable to have instant world-wide reach in a social network. Social networks are not news platforms, so instant spread of information is less important than the ability to communicate easily with a specific group of personal contacts. The very definitions of the words ‘social’ and ‘network’ strongly imply not having a system of mega-influencers with millions of followers and legions of unwashed nobodies that follow them.

                      I feel like the social aspect of social networks has been lost in the churn somehow. Even if what you really want is a new twitter, shouldn’t there be a place for people like me that just want to share information and news with friends and family? To keep in touch and communicate without being bombarded with advertising and attempted behaviour modification?

                      1. 3

                        I’m not saying you shouldn’t or can’t have a fediverse, or microblogs, but I am saying they’re not a competitor to Twitter et al. because the apparent similarities are mostly surface-level and user-facing. The value Twitter had over everyone just blogging or texting each other is lost in the migration to federation.

                        The search and ranking features are only “flaws” because you don’t like them. Great. I don’t care, I find utility in these features and millions of other people do too, with centralisation being one of the few things that can coordinate defense against malicious actors wrt these features.

                        1. 1

                          Right but you are suggesting we change something into something else. I just think it is fair to point out that maybe not everyone wants it changed. Making mastodon just like twitter might be attractive to some people, but a lot of the people that have used the fediverse and supported and developed it for years might not want that. It is easy enough to just make a new twitter exactly as it was and leave the fediverse alone.

                          …only “flaws” because you don’t like them

                          I think you might be misreading the opinion of most fediverse users towards commercialisation and centralisation. Maybe I am wrong, time will tell. It should be interesting to see how things develop.

                          1. 2

                            I’m not at all suggesting Mastodon should become Twitter. All I’m saying is it’s not the competitor to Twitter people want it to be, and it won’t have the migration of users people pray it’ll have, because it’s not what people want.

                      2. 5

                        without some kind of ranking system around it’s full-text search, it’s practically useless as a Twitter competitor.

                        Maybe that’s a personal preference? I have never once thought “gee, I’d like to search all the text on this social network.” Because 99.9% of that text will be crap written by stupid people I don’t know. Part of the usefulness of a social network is that you find stuff (and people) through the people you already follow. But then, I’ve never understood the appeal of Twitter.

                        1. 2

                          You’ve never used the search function on Twitter to find new things, or people? I used to find so many smart or funny accounts to follow by searching “sentence fragment that’s interesting”.

                          1. 4

                            I’ve used search on Twitter to see reactions to a current event or meme, but I’ve never followed someone as a result. By definition, search is randos.

                            1. 1

                              I find utility in Twitter, but I very rarely use search, especially so for discovery.

                              To me, Twitter stuff is ephemeral and if I didn’t see it “there and then” I probably don’t need it now. I very rarely ever used the “trending” stuff.

                              This requires aggressive pruning; you constantly have to adjust who you follow, though. The “algorithm” has made this difficult: instead of the managed and moderated feed, I see a ton of content from people I don’t know, on topics I don’t care about. E.g. I follow a tech person because I like their blog posts and tweet threads. But they liked (not even responded to!) someone’s political update, and now my timeline is polluted with that political situation.

                              Plus now everybody there is treating it as an outreach platform. Not honest discussions and daily sharing of thoughts.

                              I think that is why Twitter lost its appeal to me. It’s still interesting sometimes, but the signal-to-noise ratio is quite bad; I open it once or twice a week, or when I wanna announce the next Angular meetup.

                              For the last few years, I’ve liked the fediverse much more for the “social” part of my internet needs. All this is to say, I don’t think search is that important.

                            2. 1

                              I’d guess 95% of my searches on Twitter were “I know person X posted something about Y” but of course I didn’t bookmark it and even if I had faved it I wouldn’t find it easily.

                            3. 2

                              I mean, they already created a high-quality platform that is good enough to warrant something like 7 million users (estimated), ~550,000 of those having joined in the last week.[0] Pretty sure they’ll figure out something suitable for search.

                              Regarding content & user moderation, a couple German nonprofits and randos with basement RPi’s aren’t the ones doing the moderation. Each instance is responsible for moderating properly. It’s not up to some outsourced below-minimum-wage team. Further, the actual design of content moderation in Mastodon makes it substantially easier to deal with. Any user on an instance can report content (or a user), and if the admins/moderators block a user (or instance), all members of the instance benefit from that action immediately. Thus, the entire community works together to keep things running smoothly. Who knows, we’ll see how it goes, but so far the platform has been working extremely well, even despite adding another ~million users over the past handful of months.

                              [0] https://bitcoinhackers.org/@mastodonusercount

                              1. 4

                                Each instance

                                These are the nonprofits and raspis.

                            1. 1

                              Just a hint: I know you’re not scraping instances themselves, but if you did plan to, please note that a lot of them explicitly forbid that in their rules pages (in addition to their robots.txt entries). I know it may not mean much, but people are sensitive about it.

                              1. 2

                                Yeah that’s one of the reasons I’m only working with the instances.json data from https://instances.social/ - which works on an opt-in basis from the instances themselves.

                              1. 34

                                I’m not saying the interest is gonna last only one news cycle but let’s wait at least one news cycle.

                                1. 9

                                  FWIW, decentralised networks like the fediverse have existed for a long time. GNU social, identi.ca, etc. are just a few of them.

                                  Personally I like how mastodon approaches this: you get four key-value entries, so you can put whatever in there. I find that quite suitable.

                                  1. 1

                                    And this approach handles a variety of service types. In addition to the fediverse and code forges, it could also cover sites like keyoxide.

                                  2. 7

                                    Out of curiosity, how would this feature bother anyone? It has all the indicators that it would require minimal changes (multiple inputs for the “website” profile entry)

                                    People on lobste.rs have shown interest in the fediverse for quite some time.

                                    1. 11

                                      It wouldn’t bother me none, especially if the fields are only present when populated. It’s your profile; if that’s how you want to be reached, that’s appropriate info. I just don’t think one busy day represents a trend, and the decision shouldn’t be made on that argument specifically.

                                    2. 5

                                      On the flip side: if you only see Twitter, Facebook, and GitHub on people’s profiles, you’ll obviously not think about the possibility of the fediverse. But the usual candidates are getting free PR.

                                    1. 5

                                      I’ve been searching for a Mastodon alternative written in Go, so that I can contribute if need be. I’m a total n00b to the fediverse and ActivityPub, so none of my searches ever found this. Thank you greatly for posting.

                                      1. 4

                                        Another option that fits the small and hackable criteria is honk. There are a handful of people running their own forks that I’ve discovered since running my own instance.

                                        1. 3

                                          I’d say that if you’re wanting high compatibility with the fediverse, honk is probably further away than gotosocial and with less inclination to fix the issues.

                                          1. 6

                                            High compatibility with Mastodon, you mean. honk is perfectly ActivityPub compliant. Can’t blame honk if Mastodon does things in a non-standard way.

                                          2. 1

                                            I have been considering honk for a while but haven’t made the switch yet. Do you have experience with moving an account using the account migration feature from Mastodon? (https://docs.joinmastodon.org/user/moving/#move)

                                            1. 3

                                              Not sure about honk, but gotosocial doesn’t support the move activity yet, there’s an issue and the thing is on the roadmap for next year.

                                              1. 2

                                                Honk does have an import command for pulling in content from Mastodon and Twitter backups. I’ve never tried it before; I started with Honk and left my Twitter behind.

                                              2. 1

                                                One more question, do you maybe have some pointers to forks? I couldn’t find any and I’d like to remove the sqlite dependency and let honk serve tls itself instead of requiring relayd or a webserver in front of it.

                                                  1. 2

                                                    Thanks! Just added some patches myself: https://github.com/timkuijsten/honk

                                              3. 3

                                                GtS is still a young project, but definitely one that would benefit from those eager to help!

                                                1. 1

                                                  There is also Akkoma, which I’ve yet to have a good look at. Written in Elixir, I believe.

                                                1. 11

                                                  When I chose a server, I considered their federation policy, because I didn’t want to out-source deciding which accounts I should be allowed to follow.

                                                  https://fosstodon.org/about and https://hachyderm.io/about/more both have long lists of suspended servers: “No data from these servers will be processed, stored or exchanged, making any interaction or communication with users from these servers impossible”.

                                                  I prefer the federation policy of https://qoto.org/about/more, which doesn’t suspend any servers. There’s a few others like that.

                                                  1. 8

                                                    The unfortunate reality of being on an instance like qoto.org is other, “heavily moderated” instances will suspend/silence you because of the lax moderation policy.

                                                    1. 6

                                                      The qoto.org admin notes:

                                                      Thankfully the servers blocking us are few and far between and are limited to only the most excessive and aggressive block lists. As I said, QOTO has one of the largest federation footprints on the fediverse,

                                                      https://qoto.org/@freemo/109319817943835261

                                                      1. 1

                                                        Anecdotally, every other server I’ve seriously looked at joining has had QOTO completely blocked/suspended/filtered. There are some things about it I found attractive but it seems like I’d be cut off from a lot of the community I’m looking to find on the fediverse based on where my twitter follows/followers have migrated.

                                                        Alright, I should have double-checked before posting. It looks like this is being corrected, as at least Hachyderm and infosec.exchange do allow it now. (It still appears blocked at Hachyderm, but the issue for removing it is closed.)

                                                      2. 2

                                                        It seems to have a lax federation policy, not a lax moderation policy. It doesn’t block other instances, but it moderates its members’ behavior.

                                                      3. 3

                                                        I can understand your line of thought, but oftentimes there are good reasons to defederate from certain instances. For example, pawoo.net (a Japanese instance) allows content which is illegal in other countries. And since Mastodon caches content from remote servers, this makes defederation, or at least restrictions, almost a must.

                                                        1. 3

                                                          Yes, qoto.org’s policy is:

                                                          We do not silence or block other Fediverse instances based on agenda, politics, or opinions held by their staff or users. We only require servers we federate with to follow one simple rule: respect a user’s right to disengage. Offending servers will only be silenced, not blocked, blocks will be reserved for technical assaults only such as DDoS attacks, or legal issues such as sexual abuse and child porn.

                                                          qoto.org doesn’t currently block any servers, but is willing to if needed for the above technical/legal reasons.

                                                          Other instances’ blocklists go beyond these technical/DDoS reasons. The advantage of a federated protocol is being able to pick.

                                                          1. 1

                                                            I was on mastodon.technology, but the whole time I just wanted my own instance. Now that it has shut down, I finally have one, and I can set my own policies.

                                                          2. 2

                                                            Wow, I didn’t know Mastodon instances are censoring each other already.

                                                            I just tried to send a message from qoto.org to hachyderm.io and it did not arrive.

                                                            No error message on the sending side.

                                                            Then I sent a message from indiehackers.social to hachyderm.io and it arrived immediately.

                                                            1. 5

                                                              hachyderm.io has recently removed qoto.org from its blocklist: https://github.com/hachyderm/hack/issues/8

                                                              1. 1

                                                                But the direct message never arrived.

                                                                1. 1

                                                                  Why is it still listed on their /about/more page?

                                                                  1. 2

                                                                    Possibly a mistake and/or the lifted ban hasn’t taken effect yet.

                                                                2. 4

                                                                  Instances have blocked/silenced other instances for a long time. It’s a core part of how the Fediverse views federation.

                                                                  1. 3

                                                                    One of the core ideas of Mastodon is that instances control who they federate with.

                                                                    So you are free to create an account on any instance you like and post anything that stays within the instance’s rules. You just aren’t guaranteed an audience – other people may block you, or other instances may choose not to federate with the instance you’re posting on. This is freedom of speech in its purest form: you can say what you like, and other people can ignore you if they like. Or if they dislike their instance’s policies, they can move to another one or set up their own. But you can never, ever, a million billion times never, force another instance to federate with you or show your posts, or force another user to listen to you.

                                                                1. 10

                                                                  I’ve been writing a small service in Deno for the past couple of months, and I really like it. By far the nicest experience writing TypeScript for the backend. Hard to express what a relief it is not having to deal with bundlers and transpiling.

                                                                  The stdlib is quite excellent, I would highly recommend being as minimal with npm as you can.

                                                                  1. 1

                                                                    Thanks for this comment, it’s been a real inspiration. I started writing a small CLI tool yesterday and didn’t feel like messing with setup and linting and everything, but I hadn’t considered Deno. I think I will give it a try.

                                                                    1. 2

                                                                      Deno is especially interesting for CLI tooling, because it supports compiling your tool into a standalone binary.
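
                                                                      As a rough sketch of what that can look like (the file and binary names here are placeholders, not something from the parent comment):

                                                                        // greet.ts - a tiny Deno CLI that echoes its first argument
                                                                        const name = Deno.args[0] ?? "world";
                                                                        console.log(`Hello, ${name}!`);

                                                                        // Compile it into a standalone binary and run it without a Deno install:
                                                                        //   deno compile --output greet greet.ts
                                                                        //   ./greet lobsters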

                                                                  1. 35

                                                                    Pretty ironic to see this post hosted on a .sh domain name. Yes, .io domains are harmful but so are .sh domains!

                                                                    1. 16

                                                                      true!! I wish someone had informed me before I hitched my life to a .sh domain. i hope my post helps someone avoid similar regrets.

                                                                      1. 1

                                                                        Well, the web isn’t (yet) Hotel California; domains can move hinthint

                                                                        1. 3

                                                                          easier said than done when my email is hosted on that domain as well & everyone polling my RSS feed would need redirection 😥 Plus, all my historical links would break. I would like to move, but it’s a lot of work and churn for a site that I’ve built up over years.

                                                                          1. 4

                                                                            I hosted my website, RSS feeds, and email on a .in domain for 16 years. Then I decided to move everything to a .net domain. Since my entire setup is version controlled, it took only a few commits to move my website and email from the old .in domain to the new .net domain.

                                                                            The migration went well without any issue. There has been no loss of RSS feed subscribers. The web server access logs show that all the RSS clients and aggregators were successfully redirected to the new domain.

                                                                            1. 4

                                                                              receive emails for both old and new domains

                                                                              I agree with you that it’s not hard to switch domains (I’ve also done it myself), but, because I used my old domain for email, I’m essentially forced to keep paying for it. Otherwise, someone could just buy up my old domain and start receiving my emails. I’ve gone through the process of updating my email for just about every service I know, but even after four years I still get the occasional email sent to my old domain. Maybe if I can go five years without receiving any emails at my old domain I’d feel more comfortable giving it up, but even then it still feels like a security risk.

                                                                              Switching domains for ethical reasons seems moot if you still have to keep paying for it.

                                                                    1. 22

                                                                      Decent photo organizing software.

                                                                      Photos and Lightroom both fail on the “let me put thousands of photos on a remote share and let any number of laptops/phones/tablets poke around, tagging, organizing, making smart albums, etc.”. I don’t care about RAW, I don’t care about anything more than basic editing, and I would like to keep all the face recognition etc. that the basic Photos app can do. If I could have it generate a “feed” or something I could point a smart TV or smart picture frame to and have it cycle through a smart album, that’d be amazing.

                                                                      It seems insane that Photos can’t do this. I tried something from Adobe a while ago too (can’t remember the name now), but it wasn’t up to the task either.

                                                                      1. 4

                                                                        Did you ever look into digiKam? I use it a lot, from what you’re describing, it looks like it matches a lot of your “requirements”.

                                                                        Some things that might fit your bill:

                                                                        • it can do face recognition for you (and learn and help you automate this stuff)
                                                                        • it doesn’t “import” your pics anywhere, it just works on wherever they are.
                                                                        • it has its database, but:
                                                                        • you can write most of the changes in the sidecar file, so that other programs can also use it.
                                                                        • the db is sqlite (by default) so you can even do something with tools like datasette e.g. query what’s the most frequent time of day or location your photos are taken, or - if you manage RAWs there as well - what focal lengths do you have the most or similar.
                                                                        • it can do some basic editing for you (for proper edits I still use external software)
                                                                        • you should be able to point it to a smart TV, I think, and even if not, there are many cool plugins for such stuff (as well as for e.g. backups to Box or Google, posts to Twitter or Facebook, regular HTML galleries, and a lot of others).

                                                                        The part that I like the most about it is that it is focused on the library functions: searching, tagging, filtering etc. (I also combine it with rapid photo downloader to get pics off my cameras, and darktable and/or gimp to do actual editing but that’s just me).

                                                                        1. 1

                                                                          After using digiKam for a few years I only use it to import images from the camera. Persistent issues:

                                                                          • The DB is corrupted once in a while, and the application crashes on the regular.
                                                                          • The UI is 1990s. Why is the settings button not on the home screen? Why is there a splash screen (by default, you can turn this off but dammit, who actually wants that?)? Why is there no way to go from “WTF does this do?” to the actual documentation? Like, unless I spend every waking second editing photos, why would I know which “quality” setting to choose?

                                                                          Darktable uses sidecar files, and although they refuse to believe the filesystem is important (so I have to keep re-importing the entire catalogue and run a service to clean out missing files nightly), at least its database isn’t actually corrupted. And it’s much faster than digiKam, which is barely usable on a reasonably modern desktop.

                                                                          1. 1

                                                                            Wow, we really have different experiences there. I don’t think I’ve ever seen the database get corrupted, for example, while with darktable that did happen to me.

                                                                            And the UI, some of it might seem “1990s”, but for me it’s just fine, and far, far better than Darktable for managing a library (not editing, I still use dt for that).


                                                                            For importing, did you ever consider Rapid Photo Downloader? For me it does a noticeably better job than the other two.

                                                                        2. 2

                                                                          I think PhotoPrism should be able to handle thousands of photos. However, I tried to import around 100,000 photos and it was almost impossible to manage that many in it.

                                                                          So far I’ve settled on Plex for building playlists + digiKam for organizing the collection + my own photo frame software: https://github.com/carbolymer/plex-photo-frame

                                                                          1. 1

                                                                            oh wow this looks very promising… thank you for the recommendation!

                                                                            1. 1

                                                                              Literally just set up my own PhotoPrism instance a week ago, and it currently has some 55k images and videos in it.

                                                                              There are a few things that could be improved, e.g. allowing renaming of persons tagged through face detection, easier labeling of multiple images and better face detection for kids. All of these improvements already have issues on GitHub BTW.

                                                                              One thing it doesn’t support is time-shifting of images with bad EXIF tags. For some reason I had a whole lot of photos that had GPS data but a wrong EXIF create date. Luckily exiftool was able to fix all the bad timestamps. Here’s a one-liner that can fix the issue that I had:

                                                                              exiftool -progress -r \
                                                                                -if '$createdate eq "2002:12:08 12:00:00"' \
                                                                                -if '$gpsdatetime' '-createdate<$gpsdatetime' \
                                                                                -overwrite_original_in_place . |&
                                                                                tee -a log.txt
                                                                              

                                                                              All in all I’m pretty satisfied with PhotoPrism!

                                                                          2. 2

                                                                            Yea, I have a similar dream. For me, it would be an open source project with a single-binary self-hosting approach, which embeds a high performance web server for an interface and API similar to Google Photos. The storage is “just files” so the binary + photos can be backed up easily. Metadata stored in SQLite alongside the photos on the filesystem. Cool tricks like face recognition and autosizing, autocropping, etc. outsourced to other binaries with a scheme for storing “derived” images non-destructively, similar to what Picasa used to do. And then sync tooling that puts the same thing in a self-hosted cloud, with a VM + cache for 30 days of photos, with rest backed by Amazon S3, Google GCS, or Backblaze B2. Then maybe some tooling that can auto install this on a VM and secure it via Google Login, while supporting private photo sharing at stable URLs similar to Google Photos.

                                                                            This would be a big project because, to be good, there’d also have to be great open source mobile apps on iOS and Android.

                                                                            Some friends of mine prototyped the start of this project using redbean (Lua single binary web server) and got as far as the basics of photo serving from disk and having access to SQLite schema for metadata. It’s totally doable.

                                                                            For the time being, I have been using Android’s support for SD card readers to sync all my photos from phones, DSLR, and mirrorless cams into Google Photos, and keeping RAW storage separate on my Mac Mini, backed up using a standard 3-2-1 approach. But it’s cumbersome (requires manual processes), lossy (I lose backups of edits in GPhotos), proprietary, and I’m afraid of Google’s track record with the longevity of its services. It also saddens me that Google Photos doesn’t give me any way to post public photo albums at my domain (https://amontalenti.com) with stable URLs that I know I can make last forever, regardless of photo serving tech.

                                                                            My theory for why something like this never got built is that Apple Photos and Google Photos are just so darn convenient, and non-phone photography is relatively rare among consumers these days, so it just never fell into the indieweb and f/oss sweet spot of an itch worth scratching. But I still have the itch and have for a long time.

                                                                            It also seems like a lot of the backend functionality could be done by somehow modularizing the GNOME Shotwell codebase. For example to handle image metadata, color correction, cropping, and so forth.

                                                                            1. 1

                                                                              I know it’s not software, but in case you didn’t know about this, if you have an Apple TV you can show slideshows on your TV.

                                                                            1. 1

                                                                              One of these days I’ll really have to go explore that Emacs stuff. I’ve never tried it, maybe it’s time.

                                                                              Although I’m not sure that getting online for this fediverse stuff is why I should start using it.

                                                                              1. 14

                                                                                A thing I would classify as a performance “no-no” is having high latency between the database(s) and the application servers. Running the database server on (presumably) a residential internet connection, potentially far away from the data center where the application servers are, likely increases the latency by a significant amount, as each page view on the website will likely require many round trips back and forth to the database for the page to be rendered.

                                                                                Moving the Redis cache to also be on the other side of this high-latency link (compared to intra-DC) is likely to make this even worse, as programmers often assume the cache is only a few milliseconds away when asking it for information.

                                                                                It’s possible that the Raspberry Pi running locally to the PostgreSQL and Redis databases made a big difference in how quickly it could process tasks, because its round-trip time was much lower than that of the Sidekiq instance running at DigitalOcean.
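
                                                                                As a rough, purely hypothetical back-of-the-envelope sketch of how those round trips add up (all numbers are assumptions, not measurements from this setup):

                                                                                  // Illustrative only: sequential DB round trips per rendered page.
                                                                                  const intraDcRttMs = 1;    // app server and database in the same data center
                                                                                  const homeLinkRttMs = 40;  // database reached over a residential connection
                                                                                  const queriesPerPage = 30; // assumed sequential queries to render one page

                                                                                  console.log(`intra-DC wait:  ~${intraDcRttMs * queriesPerPage} ms`);  // ~30 ms
                                                                                  console.log(`home-link wait: ~${homeLinkRttMs * queriesPerPage} ms`); // ~1200 ms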

                                                                                This is, of course, all speculation, but it doesn’t sound like an optimal setup to me for getting the best performance out of the system.

                                                                                1. 5

                                                                                  I’m actually impressed they went ahead with self-hosting parts of this and it worked out. I wouldn’t bet on my home network, power, and ping times, especially when I’ll also be using the connection myself. I do have a quite powerful machine at home for server stuff, but I use it primarily for things that don’t need an external connection, and I let the server in the datacenter handle everything that has to run all the time.

                                                                                  1. 5

                                                                                    In the case of the fediverse, I think latency isn’t that problematic, as long as there’s enough throughput. Federation mostly takes a toot and shoots it out to 50 different databases. If you can push it out fast enough, it’s not that bad.

                                                                                  2. 3

                                                                                    You’re probably right. Based on my experience in the US, they’ll have ~15ms latency between home and DC at absolute minimum, probably more like ~40ms. But we also have seen lobsters posts about 25 gbps home fiber connections…

                                                                                    1. 1

                                                                                      From my experience, bandwidth can hog the service even worse than latency. And combined, they probably affect things more than linearly.

                                                                                    1. 1

                                                                                      That is, it could handle queued tasks about 5 times as quickly as an untuned Mastodon instance.

                                                                                      Provided that you have 25 vCPUs, that is. I’m sure scaling up the worker count is beneficial even beyond the CPU count, but I think the overhead is noticeable.

                                                                                      1. 3

                                                                                        Ruby has the GIL and can’t scale beyond a single core with a single process. Having multiple vCPUs won’t do anything except let you saturate each one with a Ruby process for 25x the memory usage of a different language.

                                                                                        1. 1

                                                                                          Sidekiq is mostly calling imagemagick or ffmpeg. The subprocess can run on a different cpu, and the GIL (but not the current thread) is released while waiting for the subprocess to finish running.

                                                                                          Ruby also has Fiber, which lets you yield control during IO without using an extra thread.

                                                                                          If your workload is “Download a file, pass it to a subprocess, upload the result elsewhere” then you might want 20x as many threads as you have CPUs.

                                                                                          1. 2

                                                                                            What you’re stating is only half-correct. While it’s true that the GIL may be released more often under I/O loads, you still have to prove that this is where your process spends most of its time. Increasing the number of Sidekiq worker threads without tuning the VM parameters, such as malloc arena max and potentially memory compaction, will only result in a lot of page faults, and your process spending more time garbage collecting than it should.

                                                                                            While Ruby has fibers, Sidekiq makes no use of them. So this is a non-factor for your setup.

                                                                                            Another thing: Sidekiq is written in Ruby, not Ruby on Rails. But one downside of running Rails in Sidekiq (at least for older versions of Rails?) is having the workers fetch a database connection early, before doing actual work, and hold on to it until the work is done. This is suboptimal, as you always have to keep the max DB pool size the same as the thread count in order to avoid timeout errors when acquiring connections. This severely hampers horizontal scaling. If Mastodon were built on a different Ruby stack, using the sequel gem, you could have a max DB pool size of M for N thread workers, where N > M, no problem. But Mastodon isn’t gonna be rewritten anytime soon, so be aware of this known (anti)pattern.

                                                                                            However, if you can work your way around that known limitation, it’s always preferable to run, as an example, 5 Sidekiq processes with 5 workers each rather than a single process with 25 workers, due to the limitations explained above.

                                                                                            Another option is running jruby.

                                                                                            1. 1

                                                                                              While it’s true that the GIL may be released more often during I/O loads, you still have to prove that this is where your process spends most of the time.

                                                                                              I’ve worked on 30+ rails codebases since 2007, including multiple sites in the alexa 10k list. I think I’ve seen one case, in all that time, where background workers were CPU-bound. You definitely want to check, but IMO it’s a reasonable starting assumption.

                                                                                              one downside of running rails in sidekiq (at least for older versions of rails?) is having the workers fetching a database connection early

                                                                                              Rails added lazy-loaded connection pooling in version 2.2 (2008). If your knowledge of performance tuning rails dates to before 2008, then perhaps include that caveat in your advice?

                                                                                              It’s always preferable to run, as an example, 5 sidekiq processes of 5 workers each, than a single process with 25 workers, due to the limitations explained above.

                                                                                              Sidekiq doesn’t offer preload/fork (or if it does, the docs are well hidden). Without forking, 5x5 has a higher memory footprint (for mid-sized rails apps that’s plausibly ~1gb of additional ram use).

                                                                                              1. 1

                                                                                                Lazy-loaded connections don’t stop the behaviour I described.

                                                                                                As for sidekiq, it offers managed forks in the enterprise version.

                                                                                                1. 1

                                                                                                  Perhaps I’ve misunderstood. If you aren’t starting a transaction (which reserves a connection until you finish it), why would the worker hold a connection?

                                                                                                  1. 1

                                                                                                    You may acquire a connection only to perform a read query. In such cases, sequel acquires, selects, returns the results, and sends the connection back to the pool. Active Record would acquire, select, and keep the connection until the request is over and the Rack middleware is reached. This was probably done because the assumption is that you want to run multiple queries, so better to keep the connection around. In practice, this means the contention threshold is higher, hence the recommendation of thread / DB pool size parity.

                                                                                                    I haven’t researched rails 6 or more recently, so I don’t know how much changed.

                                                                                      1. 7

                                                                                        I’m a little puzzled. I thought the storage was actually encrypted on these things, and the existence of this bug seems to strongly suggest otherwise unless I’ve severely misunderstood. If swapping out an attacker controlled SIM can get you access to the device storage, it’s not encrypted, right? Is everything here a lie?

                                                                                        1. 3

                                                                                          After accepting my finger, it got stuck on a weird “Pixel is starting…” message, and stayed there until I rebooted it again.

                                                                                          After rebooting the phone, putting in the incorrect PIN 3 times, entering the PUK, and choosing a new PIN, I got to the same “Pixel is starting…” state.

                                                                                          I thought the same thing until I saw these snippets. I believe the “Pixel is starting…” screen is it decrypting the phone using your pin (and failing in this case).

                                                                                          1. 3

                                                                                            To my knowledge an Android phone is encrypted (if you have encryption enabled) when shut off. On boot, you decrypt it using a pin or password.

After that first decryption at boot, the lock screen is just a simple lock screen. It prevents somebody from accessing your data through the GUI, but the decryption key is loaded somewhere and a dedicated attacker might be able to get the data off a running phone.

There is also a small difference between the two lock screens. The first lock screen (which decrypts the device) has a small additional message telling you to unlock the phone to use all features (translated from my language; the wording is probably different on native English devices). The lock screens afterwards do not show this message.

                                                                                            I’m really bad at mobile phones though, so my understanding might be wrong. That’s how I understood it when I researched android device encryption.

                                                                                            1. 5

                                                                                              To my knowledge an Android phone is encrypted (if you have encryption enabled) when shut off. On boot, you decrypt it using a pin or password.

For a while now Android has used file-based encryption rather than full-disk encryption. This means that on boot there is no longer a point where you need to type the password to continue booting. Android’s file-based encryption allows the phone to boot all the way to the lock screen; however, at this point user data is still all encrypted.
After the user types their PIN correctly (the first time after boot), user data is decrypted.
And yes, you’d be correct that after this point the user data is decrypted and the lock screen now just acts as a lock screen.

                                                                                              but the decryption key is loaded somewhere and a dedicated attacker might be able to get the data off a running phone.

That’s not entirely correct, at least not for modern phones with dedicated security chips, like the Pixel’s Titan M. The decryption key is ‘stored’ in the Titan M - it’s very much protected in there. I say ‘stored’ in quotes because it’s technically a lot more complicated than that (Key Encryption Keys, Weaver tokens, etc).

                                                                                              1. 2

The key is stored there, but the data is not. Which is what the commenter above said the attacker could get.

                                                                                                1. 1

                                                                                                  Oh, I see.

                                                                                                2. 1

So, is the thought here that inserting the new SIM and resetting its PIN then triggers the “unlock encrypted user volume” functionality?

                                                                                                  1. 1

                                                                                                    I honestly have no idea. In fact I’m surprised doing anything with the SIM affects the encryption system like this.

                                                                                                3. 1

                                                                                                  I was assuming the physical SIM swap involved a reboot. Maybe that was too generous an assumption.

                                                                                                  1. 3

                                                                                                    The video clearly shows doing the SIM swap whilst powered on.

                                                                                                    1. 1

                                                                                                      I didn’t doubt that. But I thought swapping it would reboot from a cold state, not hold any decryption keys in memory.

                                                                                                4. 1

                                                                                                  That’s how I first interpreted this too, but in the demo video you can see that they never turn the phone off.

                                                                                                  It’s still a pretty useful bug. If someone steals/seizes your phone you don’t have time to turn it off, and you probably don’t carry it around powered off.

                                                                                                1. 24

                                                                                                  Any OpenSSL 3.0 application that verifies X.509 certificates received from untrusted sources should be considered vulnerable. This includes TLS clients, and TLS servers that are configured to use TLS client authentication.

                                                                                                  I won’t call this a nothingburger but there are a few things that make it look a bit less spicy for people running servers:

                                                                                                  • 3.x only
                                                                                                  • Most TLS servers aren’t doing client authentication
                                                                                                  • From the advisory, if you are doing client TLS, the vulnerability is post-chain validation: so you have to get a public CA to sign a bad certificate (and at least you’ll see them in CT logs, wonder if anyone has searched!) or you have to skip cert validation
                                                                                                  • Even then, you have to get past mitigations to turn a DOS into an RCE

                                                                                                  It’s still obviously bad, but it doesn’t seem to be “internet crippling”

                                                                                                  1. 8

                                                                                                    Well said! While there was a lot of anxiety around this issue since it was marked CRITICAL (a-la-heartbleed), the OpenSSL team did post a rationale in their blog (see next response by @freddyb) for the severity downgrade (which aligns with your explanation as well).

                                                                                                    1. 6

                                                                                                      I don’t understand why the OpenSSL maintainers didn’t downgrade the pre-announcement. People cleared their diaries for this; a post-pre-announcement saying “whoops it’s only a high, you can stop preparing for armageddon” might have saved thousands of hours, even if it only came on Monday.

                                                                                                      1. 7

                                                                                                        Aren’t the OpenSSL maintainers basically one guy fulltime and some part-timers? Why are they expected to perform at the level of a fully-funded security response team? If they can save thousands of hours, shouldn’t they be funded accordingly?

                                                                                                        1. 2

                                                                                                          I mean, it’s absolutely true in general that people making money out of FOSS should fund its development and maintenance, and to the extent this isn’t already true of OpenSSL, of course I think it should be fixed. But I think it’s wrong to couch everything in terms of money. Voluntary roles exist in lots of communities, and their voluntary nature doesn’t negate the responsibility that comes with them. If it’s wrong for big tech to profit by extracting free labour out of such social contracts—and I do think it is—it doesn’t seem much better to just assimilate all socially mediated labour into capitalism and have done.

                                                                                                          But I also just think that if one makes a mistake that wastes lots of people’s time, it’s a nice gesture to try to give them that time back when the mistake is realised.

                                                                                                          1. 1

                                                                                                            I think it’s wrong to couch everything in terms of money.

                                                                                                            I wonder what you would say if your employer responded like this when you asked why you didn’t get your paycheck?

                                                                                                            Voluntary roles exist in lots of communities, and their voluntary nature doesn’t negate the responsibility that comes with them.

                                                                                                            Uh, what responsibility is that exactly? The OpenSSL people have gone above and beyond to handle this security issue and you’re complaining that they’re not fulfilling their ‘voluntary responsibility’ because they didn’t save a few hours of your time? Do you realize how entitled and churlish you sound? Do me a favour, don’t talk about Open Source until you develop a sense of gratitude for the immensity of the benefits you’re getting from the maintainers every day. And I’m not even talking about paying them money, since that seems too much to ask! Just some basic human respect and gratitude.

                                                                                                            1. 1

                                                                                                              Uh, what responsibility is that exactly? The OpenSSL people have gone above and beyond to handle this security issue and you’re complaining that they’re not fulfilling their ‘voluntary responsibility’ because they didn’t save a few hours of your time?

                                                                                                              It’s not my time personally that’s at issue here. I respect that the OpenSSL people have handled this security issue, and that they’ve stepped up to maintain an important thing. I do think that position comes with a responsibility not to raise false alarms if possible.

                                                                                                              The OpenSSL maintainers are, like us, members of a community in which we all contribute what we can. I don’t think that gratitude and criticism are mutually exclusive, but I am sorry if I seemed too complainy, and I appreciate the people in question were probably operating under no small amount of pressure.

                                                                                                              1. 1

                                                                                                                I do think that position comes with a responsibility not to raise false alarms if possible.

                                                                                                                People are very quick to assign new responsibilities to people who give them free stuff. Time and again I find this pretty incredible. Like, here are some people making a public good on their own dime. And users feel so incredibly comfortable jumping in with critiques. They didn’t do this, they should have done it like that. Pretty easy to sit back, do nothing, and complain about others’ hard work.

                                                                                                        2. 3

                                                                                                          I heard third hand that the downgrade was done a couple hours before the release. Wouldn’t be that useful if it is indeed the case.

                                                                                                        3. 1

                                                                                                          Well, a lot of corporate networks have internal trust chains and suites and often you have to skip validation to get past this step.

                                                                                                          Still not Internet crippling though.

                                                                                                        1. 5

                                                                                                          In terms of committing tests with the feature/fix, I’m keen on the idea that we write a “failing test” before the fix (to avoid the failure mode where you write the test for your fix afterwards, see it pass and say “that’s all ok”).

                                                                                                          What do people think about reflecting this in the commit history? Doing so makes a visible statement that you did this, which is good (and helps promote the practice on the team) but it also breaks CI until you push the fix. This might be a feature? (The codebase does actually have a bug, it is just that it wasn’t surfaced until now). But this might also be disruptive.

                                                                                                          1. 4

A perfectly deployable master branch at every commit is overrated. It is perfectly acceptable to gate decisions based on tests or even human awareness.

                                                                                                            1. 3

                                                                                                              having a perfectly deployable master branch when working with other people means that at any moment I can just start work on a new feature by taking master.

                                                                                                              Perhaps you use some other marking to determine this, but then it’s the same principle, just with a different mark.

                                                                                                              (Of course in practice that means that sometimes things get merged into the main branch that are busted, and this doesn’t hold. But if it’s exceptional it’s wasting way less time than if it’s the norm)

                                                                                                              1. 2

                                                                                                                This, there’s so many cases where you just don’t have a nice way of solving a problem, and sacrificing the actual solution at the altar of an arbitrary process is not worth it imo.

                                                                                                                1. 2

                                                                                                                  In addition, you rarely find solutions to current problems in git history. Sometimes, yes, for consultation, but rarely will you actually be deploying that ancient commit.

                                                                                                                  1. 4

                                                                                                                    Most often you want to know why the line of code was added, what problem it was fixing. Or you are doing a git bisect, and a failing build might interfere with that, but most often you really just need sufficient context for the commit in the commit.

                                                                                                                    Anything else, perfect tests, perfect messages, perfect docs, is overkill, though can be nice.

                                                                                                                2. 3

                                                                                                                  When I’ve done this workflow (commit the failing test first) I do it in a branch, then publish a PR to GitHub (where I can watch CI fail on the PR page), then squash commit that to main later along with the implementation.

                                                                                                                  I just saw an interesting idea on Twitter: pytest offers an “xfail” mechanism which can mark a test as “expected to fail” such that it can fail while still leaving CI green. So you can use that to check in a known-to-fail test without breaking things like “git bisect” later on: https://twitter.com/maartenbreddels/status/1586609659464630273
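
For what it’s worth, here’s a minimal sketch of how that xfail approach could look (the module and bug number below are hypothetical, just to illustrate the shape):

    import pytest

    from myapp.parser import parse_price  # hypothetical module under test


    @pytest.mark.xfail(reason="bug #123: negative prices are not rejected yet", strict=True)
    def test_rejects_negative_prices():
        # the failing test committed before the fix; CI stays green because it's expected to fail
        with pytest.raises(ValueError):
            parse_price("-9.99")

With strict=True, the test starts failing the suite as soon as it unexpectedly passes, which nudges you to drop the marker in the same commit as the fix.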

                                                                                                                  1. 3

                                                                                                                    You can, even should, write the test first, but it should go in the same commit as the fix, at least for published commits.

                                                                                                                  1. 6

I see where the author is coming from; I also think some of the design changes in macOS look worse than they used to. But at the same time, 99% of my time is spent in either the Terminal (SSHed into a local VM running Debian) or a browser, so I feel like I am not much exposed to the design changes anyway, and I can imagine that there’s quite a large percentage of macOS users who would feel the same way.

                                                                                                                    1. 2

That’s exactly why I don’t complain much about gnome. I don’t use many applications, and those that I do use each do their own thing relatively well. So I don’t need too much else from my DE.

                                                                                                                      1. 1

                                                                                                                        The big UI regression for me was around 10.7, when they significantly increased the size of shadows in background windows. Before then, I had never typed in one window thinking another was foreground on OS X. After that change, I think I’ve done it at least once a week. I have no idea how that passed any kind of user testing.

                                                                                                                      1. 4

                                                                                                                        … we found a simple solution: block the undesired traffic from these apps. Even so, we continue to serve about 100TB of “Access Denied” pages monthly!

                                                                                                                        What am I missing here? Something seems off. Unless this traffic is generating 100 billion requests a month? Or your access denied response is enormous? Or my math sucks?

                                                                                                                        1. 16

100 billion requests a month doesn’t actually sound beyond the realm of possibility if these phones are making multiple requests per day even when the browser isn’t used. After all, “popular in India” could mean >hundreds of millions of installations.

                                                                                                                          1. 9

Esp. with certain Android devices being very eager to kill unused apps, one would have a lot of restarts in a day.

                                                                                                                          2. 6

India has ~1_411_853_746 people. If 0.01% of them use the browser, that’s ~141_185.37 users. If each of these browsers has to restart 10 times a day due to memory or battery savings, you are back at ~1_411_853.75 requests.

Let’s say I need 146 bytes to serve a nearly blank 404, as my nginx does; then I’m at ~0.2GB of traffic per day, purely for 404 status codes.

You can vary these numbers by having more users, more restarting or re-downloading every 15 minutes to be up-to-date (very possible) and maybe a bigger 404 page (more information, headers, …). Keep in mind these are all multipliers.

                                                                                                                            Also modern request frameworks on android have automatic retry.
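
A quick back-of-the-envelope check of the multipliers above (all figures are the illustrative assumptions already stated, not measured values):

    population = 1_411_853_746             # ~India's population
    users = population * 0.0001            # 0.01% of them -> ~141,185 users
    requests_per_day = users * 10          # 10 restarts per device per day
    response_bytes = 146                   # a nearly blank nginx error response

    traffic_gb_per_day = requests_per_day * response_bytes / 1e9
    print(round(traffic_gb_per_day, 2))    # ~0.21 GB/day before scaling any factor up

Bump any one of those factors (share of users, restarts per day, response size) and the total scales linearly with it.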

                                                                                                                            1. 2

                                                                                                                              You’d really hope they don’t have automatic retry for 4xx…

                                                                                                                              1. 3

I do hope so, but I could think of a lot of reasons this ends up re-requesting all the time, maybe even per page view. Also, it appears they’re not 404ing but instead sending an access-denied response, which isn’t as clear-cut?

                                                                                                                            2. 3

                                                                                                                              If you assume an average of 10 restarts per day, 100b reqs a month comes to about 300 million devices using this misconfiguration. That’s a lot but still sounds within reasonable limits.

                                                                                                                              1. 1

                                                                                                                                Wait, but that could have made it worse. Serving a small file straight from the disk is one of the fastest things you could do with a Web server. The only thing that could be faster is not serving any content at all. I have a feeling a nice looking standard error page from CloudFlare may require more resources than the actual file, especially if it’s constructed dynamically.

                                                                                                                              1. 6

                                                                                                                                I take issue with the statement that you need a relational database, and not just because my day job is at a document-database company (Couchbase.) Saying “most data is naturally relational” is misleading. Most data includes relationships, links, between records, yes. That does not mean the same thing as the specific mathematical formalism of relations implemented in relational databases.

                                                                                                                                For example, the linked-to article about switching from MongoDB talks about the social network Diaspora. Social network data sets are practically poster children for graph databases, another type of non-relational DB. The key reason Diaspora switched from MongoDB turns out to be:

                                                                                                                                What’s missing from MongoDB is a SQL-style join operation, which is the ability to write one query that mashes together the activity stream and all the users that the stream references. Because MongoDB doesn’t have this ability, you end up manually doing that mashup in your application code, instead.

                                                                                                                                Ouch. That is a problem with MongoDB, not with document databases themselves. As a counterexample, Couchbase’s N1QL query language definitely has joins (it’s roughly a superset of SQL) and there are other document DBs I’m less familiar with that do joins too. Joins are not something limited to relational databases. (And they’re of course the bread and butter of graph DBs.)

                                                                                                                                In my own projects I’ve found document databases very useful during prototyping and development because their schemas are much more flexible. You can more easily apply YAGNI to your schema because, when you do need to add a property/column/relation, you don’t have to build migrations or upgrade databases or throw out existing data. You just start using the new property where you need it. (This is an even bigger boon in a distributed system where migrating every instance in lockstep can be infeasible.)

                                                                                                                                1. 7

                                                                                                                                  very useful during prototyping and development because their schemas are much more flexible

Prototypes always become production systems. That flexibility makes them hell to work with (ask me how I know). I feel MongoDB is the document DB to most devs, and I’ll share that my experience with it has been miserable in every. single. instance.

                                                                                                                                  1. 1

My experience differs. I think it’s the way the data is structured that brings a lot of pain - its leeway - later on.

The problems I’ve often seen with both SQL and document-based solutions are not always obviously similar, but they often fall into the same general category.

But yeah, working with low-level SQL or low-level MongoDB usually has more pitfalls.

                                                                                                                                1. 83

                                                                                                                                  I feel like this entire post reinforces just how difficult Python dependency management is. I’m a huge fan of Python, despite all of its flaws, but the dependency management is horrible compared to other languages. Nobody should have to learn the intricacies of their build tools in order to build a system, nor should we have to memorize a ton of flags just for the tools to work right. And this isn’t even going into the issue where building a Python package just doesn’t work, even if you follow the directions in a README, simply because of how much is going on. It is incredibly hard to debug, and that is for just getting started on a project (and who knows what subtle versioning mistakes exist once it does build).

                                                                                                                                  I think Cargo/Rust really showed just how simple dependency management can be. There are no special flags, it just works, and there are two tools (Cargo and rustup) each with one or two commands you have to remember. I have yet to find a Rust project I can’t build first try with Cargo build. Until Python gets to that point, and poetry is definitely going down the right path, then Python’s reputation as having terrible dependency management is well deserved.

                                                                                                                                  1. 20

                                                                                                                                    Completely agree. I’ve been writing Python for 15 years professionally, and it’s a form of psychological abuse to keep telling people that their problems are imaginary and solved by switching to yet another new dependency manager, which merely has a different set of hidden pitfalls (that one only uncovers after spending considerable time and energy exploring).

                                                                                                                                    Every colleague I’ve worked with in the Python space kind of feels jaded by anyone who promises some tool or technology can make life better, because they’ve been so jaded by this kind of thing in Python (not just dependency management, but false promises about how “just rewrite the slow bits in C/Numpy/multiprocessing/etc” will improve performance and other such things)–they often really can’t believe that other languages (e.g., Go, Rust, etc) don’t have their own considerable pitfalls. Programmers who work exclusively in Python kind of seem to have trust issues, and understandably so.

                                                                                                                                    1. 13

The problem is that no matter how good Poetry gets, it still has to deal with deficiencies that exist in the ecosystem. For example, having lockfiles is great, but they don’t help you if the packages themselves specify poor/incorrect version bounds when you come to refresh your lockfiles (and this is something I’ve been bitten by personally).

                                                                                                                                      1. 11

That’s not a python-specific issue though. It’s not even a Python-like issue. You’ll have the same problem with autoconf / go.mod / cargo / any other system where people have to define version bounds.

                                                                                                                                        1. 21

                                                                                                                                          if I create a go.mod in my repo and you clone that repo and run “go build” you will use the exact same dependencies I used and you cannot bypass that. I cannot forget to add dependencies, I cannot forget to lock them, you cannot accidentally pick up dependencies that are already present on your system

                                                                                                                                          1. 15

                                                                                                                                            Keep in mind that Go and Rust get to basically ignore the difficulty here by being static-linking-only. So they can download an isolated set of dependencies at compile time, and then never need them again. Python’s import statement is effectively dynamic linking, and thus requires the dependencies to exist and be resolvable at runtime. And because it’s a Unix-y language from the 90s, it historically defaulted to a single system-wide shared location for that, which opens the way for installation of one project’s dependencies to conflict with installation of another’s.

                                                                                                                                            Python’s venv is an attempt to emulate the isolation that statically-linked languages get for free.
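
As a rough sketch of what that isolation looks like in practice with the stdlib venv module (the paths here are just examples):

    import venv

    # each project gets its own private environment for its runtime dependencies
    venv.create("./project-a/.venv", with_pip=True)
    venv.create("./project-b/.venv", with_pip=True)

    # installing into ./project-a/.venv can no longer conflict with project B,
    # or with whatever happens to be installed system-wide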

                                                                                                                                            1. 4

                                                                                                                                              I described the situation for Go during build time, not during runtime.

                                                                                                                                              1. 3

                                                                                                                                                And my point is that a lot of the things people complain about are not build-time issues, and that Go gets to sidestep them by being statically linked and not having to continue to resolve dependencies at runtime.

                                                                                                                                                1. 2

                                                                                                                                                  I don’t get the importance of distinguishing when linking happens. Are there things possible at build time that are not possible at runtime?

                                                                                                                                                  1. 7

                                                                                                                                                    Isolation at build time is extremely easy – it can be as simple as just downloading everything into a subdirectory of wherever a project’s build is running. And then you can throw all that stuff away as soon as the build is done, and never have to worry about it again.

                                                                                                                                                    Isolation at runtime is far from trivial. Do you give each project its own permanent isolated location to put copies of its runtime dependencies? Do you try to create a shared location which will be accessed by multiple projects (and thus may break if their dependencies conflict with each other)?

                                                                                                                                                    So with runtime dynamic linking you could, to take one of your original examples, “accidentally pick up” things that were already on the system, if the system uses a shared location for the runtime dynamically-linked dependencies. This is not somehow a unique-to-Python problem – it’s the exact same problem as “DLL hell”, “JAR hell”, etc.

                                                                                                                                                    1. 4

                                                                                                                                                      Isolation at runtime is far from trivial. Do you give each project its own permanent isolated location to put copies of its runtime dependencies? Do you try to create a shared location which will be accessed by multiple projects (and thus may break if their dependencies conflict with each other)?

                                                                                                                                                      But the same issues exist with managing the source of dependencies during build time.

                                                                                                                                                      1. 4

                                                                                                                                                        Yeah, I’m not seeing anything different here. The problem is hard, but foisting it on users is worse.

The project-specific sandbox vs disk space usage tradeoff recurs in compiled langs, and is endemic to any dependency management system that does not make strong guarantees about versioning.

                                                                                                                                                        1. 3

                                                                                                                                                          No, because at build time you only are dealing with one project’s dependencies. You can download them into an isolated directory, use them for the build, then delete them, and you’re good.

                                                                                                                                                          At runtime you may have dozens of different projects each wanting to dynamically load their own set of dependencies, and there may not be a single solvable set of dependencies that can satisfy all of them simultaneously.

                                                                                                                                                          1. 1

                                                                                                                                                            You can put them into an isolated directory at runtime, that’s literally what virtualenv, Bundler’s deployment mode or NPM do.

                                                                                                                                                            And at build time you don’t have to keep them in an isolated directory, that’s what Bundler’s standard mode and Go modules do. There’s just some lookup logic that loads the right things from the shared directories.

                                                                                                                                                            1. 2

                                                                                                                                                              The point is that any runtime dynamic linking system has to think about this stuff in ways that compile-time static linking can just ignore by downloading into a local subdirectory.

Isolated runtime directories like a Python venv or a node_modules also don’t come for free – they proliferate multiple copies of dependencies throughout different locations on the filesystem, and make things like upgrades (especially for security issues) more difficult, since now you have to track down every single copy of the outdated library.

                                                                                                                                            2. 8

                                                                                                                                              It might be possible to have this issue in other languages and ecosystems, but most of them avoid them because their communities have developed good conventions and best practices around both package versioning (and the contracts around versioning) and dependency version bound specification, whereas a lot of the Python packages predate there being much community consensus in this area. In practice I see very little of it comparatively in say, npm and Cargo. Though obviously this is just anecdotal.

                                                                                                                                              1. 1

                                                                                                                                                Pretty sure it’s not possible to have this issue in either of your two examples; npm because all dependencies have their transitive dependencies isolated from other dependencies’ transitive dependencies, and it just creates a whole tree of dependencies in the filesystem (which comes with its own problems), and Cargo because, as @mxey pointed out (after your comment), dependencies are statically linked into their dependents, which are statically linked into their dependents, all the way up.

                                                                                                                                                This has been a big problem in the Haskell ecosystem (known as Cabal hell), although it’s been heavily attacked with Stack (a package set that are known to all work together), and cabal v2-* commands (which builds all the dependencies for a given project in an isolated directory), but I don’t think that solves it completely transitively.

                                                                                                                                                1. 2

                                                                                                                                                  @mxey pointed out (after your comment), dependencies are statically linked into their dependents, which are statically linked into their dependents, all the way up.

That’s not true for Go. Everything that is part of the same build has its requirements combined, across modules. See https://go.dev/ref/mod#minimal-version-selection for the process. In summary: if 2 modules are part of the same build and they require the same dependency, then the higher of the 2 specified versions will be used (different major versions are handled as different modules). My point was only that it’s completely reproducible regardless of the system state or the state of the world outside the go.mod files.
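
A toy illustration of that rule (this is only the core idea of minimal version selection, not the full go.mod algorithm):

    def select_version(required_minimums):
        # every module in the build states the minimum version it needs;
        # the build uses the highest of those minimums, not "latest available"
        return max(required_minimums, key=lambda v: tuple(map(int, v.split("."))))

    print(select_version(["1.2.0", "1.4.1"]))  # -> 1.4.1, the same result on every machine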

                                                                                                                                                  1. 1

                                                                                                                                                    Ah, I misunderstood your comment and misinterpreted @ubernostrum’s response to your comment. Thanks for clarifying. Apologies for my lack of clarity and misleading wording.

                                                                                                                                                  2. 1

                                                                                                                                                    To be clear, I’m not talking about transitive dependencies being shared inappropriately, but the much simpler and higher level problem of just having inappropriate dependency versioning, which causes the packages to pick up versions with breaking API changes.

                                                                                                                                                    1. 1

                                                                                                                                                      Ah, I reread your original comment:

                                                                                                                                                      For example, having lockfiles are great, but they don’t help you if the packages themselves specify poor/incorrect package version bounds when you come to refresh your lockfiles (and this is something I’ve been bitten by personally).

                                                                                                                                                      Are you talking about transitive dependencies being upgraded with a major version despite the parent dependency only being upgraded by a minor or patch version because of the parent dependency being too loose in their version constraints? Are you saying this is much more endemic problem in the Python community?

                                                                                                                                                      1. 2

                                                                                                                                                        Well, it fits into one of two problem areas:

• As you say, incorrect version specification in dependencies allowing major version upgrades when not appropriate - this is something I rarely if ever see outside Python.

• A failure of common understanding of the contracts around versioning, either from a maintainer who doesn’t make semver-like guarantees while downstream consumers assume they do, or from the accidental release of breaking changes. This happens everywhere but I (anecdotally) encounter it more often with Python packages.

                                                                                                                                                    2. 1

                                                                                                                                                      npm because all dependencies have their transitive dependencies isolated from other dependencies’ transitive dependencies

                                                                                                                                                      npm has had dedupe and yarn has had --flat for years now.

                                                                                                                                                      Go handles it by enforcing that you can have multiples of major versions but not minor or patch (so having both dep v1.2.3 and v2.3.4 is okay, but you can’t have both v1.2.3 and v1.4.5).

                                                                                                                                                      1. 1

                                                                                                                                                        npm has had dedupe and yarn has had --flat for years now.

                                                                                                                                                        I was unaware of that, but is it required or optional? If it’s optional, then by default, you wouldn’t have this problem of sharing possibly conflicting (for any reason) dependencies, right? What were the reasons for adding this?

                                                                                                                                              2. 11

                                                                                                                                                I have mixed feelings about Poetry. I started using it when I didn’t know any better and it seemed like the way to go, but as time goes on it’s becoming evident that it’s probably not even necessary for my use case and I’m better served with a vanilla pip workflow. I’m especially bothered by the slow update and install times, how it doesn’t always do what I expected (just update a single package), and how it seems to be so very over-engineered. Anthony Sottile of anthonywritescode (great channel, check it out) has made a video highlighting why he will never use Poetry that’s also worth a watch.

                                                                                                                                                1. 5

                                                                                                                                                  If you have an article that summarizes the Poetry flaws I’d appreciate it (I’m not a big video person). I’ll defer to your opinion here since I’m not as active in Python development as I was a few years ago, so I haven’t worked with a lot of the newer tooling extensively.

                                                                                                                                                  But I think that further complicates the whole Python dependency management story if Poetry is heavily flawed. I do remember using it a few years back and it was weirdly tricky to get working, but I had hoped those issues were fixed. Disappointing to hear Poetry is not holding up to expectations, though I will say proper dependency management is a gritty hard problem, especially retrofitting it into an ecosystem that has not had it before.

                                                                                                                                                  1. 15

                                                                                                                                                    Sure, here’s what he laid out in the video from his point of view:

                                                                                                                                                    • he ran into 3 bugs in the first 5 minutes when using it for the first time back in 2020, which didn’t bode well
                                                                                                                                                    • it pulls in quite a few dependencies (45 at the time of writing this, which includes transitive dependencies)
                                                                                                                                                      • create virtual environment
                                                                                                                                                      • pip install poetry
                                                                                                                                                      • pip freeze --all | wc -l
• by default it adds dependencies to your project with caret constraints, which allow automatic updates up to (but excluding) the next major version bump, or the next minor version bump for 0.x packages
• for example python = "^3.8", which is equivalent to >= 3.8, <4 (see the sketch after this list)
                                                                                                                                                      • this causes conflicts with dependencies of libraries that are often updated and with those that aren’t
                                                                                                                                                        • he mentions requests specifically
                                                                                                                                                    • pip already has a dependency resolver and a way to freeze requirements and their very specific versions
                                                                                                                                                      • i.e. use == and not use caret or tilde versioning
                                                                                                                                                      • he also shouts out ‘pip-tools’ here, which I haven’t used myself for the sake of keeping things simple
                                                                                                                                                    • the maintainers of Poetry have done something weird with how they wanted to deprecate an installer, which has eroded trust (for him)
                                                                                                                                                      • they essentially introduced a 5% chance that any CI job using get-poetry.py (their old way of installing Poetry) would fail, in order to push people away from that script; outside of CI the script would simply always fail (a sketch of this logic follows the list)
                                                                                                                                                      • this is terrible because it introduces unnecessary flakiness into CI systems and, rather than giving people time to migrate on their own schedule, forces the change upon them
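
                                                                                                                                                    As a small aside on the caret constraints mentioned above, here is a sketch of my own (not from the video) that checks the claimed equivalence using the packaging library, which pip itself builds on; the variable names are only illustrative.

                                                                                                                                                        # Sketch: Poetry expands "^3.8" to the equivalent of ">=3.8,<4.0".
                                                                                                                                                        # Requires the third-party `packaging` package (pip install packaging).
                                                                                                                                                        from packaging.specifiers import SpecifierSet
                                                                                                                                                        from packaging.version import Version

                                                                                                                                                        caret_3_8 = SpecifierSet(">=3.8,<4.0")

                                                                                                                                                        print(Version("3.8") in caret_3_8)     # True
                                                                                                                                                        print(Version("3.11") in caret_3_8)    # True
                                                                                                                                                        print(Version("4.0") in caret_3_8)     # False

                                                                                                                                                        # The == pinning style mentioned above matches exactly one version.
                                                                                                                                                        exact_pin = SpecifierSet("==2.28.1")
                                                                                                                                                        print(Version("2.28.1") in exact_pin)  # True
                                                                                                                                                        print(Version("2.28.2") in exact_pin)  # False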
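
                                                                                                                                                    And for illustration, the deprecation behaviour described above amounts to something like the following. This is a hypothetical sketch of the idea, not the actual get-poetry.py code; the function name and the CI environment check are assumptions on my part.

                                                                                                                                                        # Hypothetical sketch of the deprecation logic described above,
                                                                                                                                                        # NOT the actual get-poetry.py source.
                                                                                                                                                        import os
                                                                                                                                                        import random
                                                                                                                                                        import sys

                                                                                                                                                        def enforce_installer_deprecation() -> None:
                                                                                                                                                            message = "get-poetry.py is deprecated, please switch to the new installer"
                                                                                                                                                            if os.environ.get("CI"):
                                                                                                                                                                # In CI, fail roughly 5% of the time to nudge users off the script.
                                                                                                                                                                if random.random() < 0.05:
                                                                                                                                                                    sys.exit(message)
                                                                                                                                                            else:
                                                                                                                                                                # Outside CI, the script refuses to run at all.
                                                                                                                                                                sys.exit(message)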
                                                                                                                                                    1. 6

                                                                                                                                                      I have used pip-tools and it is my favorite way of doing dependency management in Python, but it’s also part of the problem: because I have a solution that works for me, it no longer matters to me that the core tools are user-hostile. The Python core team should really be taking ownership of this problem instead of letting it dissolve into a million different little solutions.

                                                                                                                                                      1. 5

                                                                                                                                                        the maintainers of Poetry have done something weird with how they wanted to deprecate an installer, which has eroded trust (for him)

                                                                                                                                                        I don’t wish to ascribe malice to people, but it comes off as contemptuous of users.

                                                                                                                                                        Infrastructure should be as invisible as possible. Poetry deprecating something is Poetry’s problem. Pushing it on all users presumes that they care, can act on it, and have time/money/energy to deal with it. Ridiculous.

                                                                                                                                                        1. 1

                                                                                                                                                          Absolutely, very unprofessional. Is the tool deprecated? Just drop the damn tool, don’t bring down my CI! You don’t want future versions? Don’t release any!

                                                                                                                                                    2. 1

                                                                                                                                                      I wanted to just settle on Poetry. I was willing to overlook so many flaws.

                                                                                                                                                      I have simply never gotten it to work on Windows. Oh well.

                                                                                                                                                    3. 5

                                                                                                                                                      Poetry is here now, though, and it is ready to use. There are good reasons not to bundle and freeze such tools in the upstream distribution: for example, rubygems is separate from ruby, and Cargo is separate from the Rust compiler. The Python project itself doesn’t have to do anything here. It would be nice if they said “this is the blessed solution”, but nothing stops anyone from using Poetry today.

                                                                                                                                                      1. 9

                                                                                                                                                        Another commenter posted about the issues with Poetry, which I take to mean it is not quite ready to use everywhere. I think not having a blessed solution is a big mistake, and one that the JS ecosystem is also making (it’s now npm, yarn, and some other thing): it complicates things for the end user for no discernible benefit.

                                                                                                                                                        While Cargo and rubygems may be separate from the compiler/interpreter, they are also closely linked and developed in sync (at least I know this is the case for Cargo). One of the best decisions the Rust team made was realizing that a language is its ecosystem, and investing heavily in best-in-class tooling. Without a blessed solution from the Python team, I feel the dependency management situation will continue as-is.

                                                                                                                                                        1. 4

                                                                                                                                                          There was a time, before we had bundler, when ruby dependency management was kind of brutal as well. I guess there is still hope for python if they decide to adopt something as a first-class citizen and take on these problems with an “official” answer.

                                                                                                                                                      2. 2

                                                                                                                                                        I tried to add advice about dependency and packaging tooling to my code style guide for Python. My best attempt doubled the style guide’s word count, so I abandoned the effort. I recently wrote about this here:

                                                                                                                                                        https://amontalenti.com/2022/10/09/python-packaging-and-zig

                                                                                                                                                        I’d really like to understand Rust and Cargo a little better, but I’m not a Rust programmer at the moment. Any recommendations for reading about the Cargo and crate architecture?