Error handling in drop is problematic indeed. Some libraries provide close(self) -> Result
for handling this the hard way when you really care about the result.
std::fs::File chooses to ignore those errors.
Ah, but importantly, it gives you the option to explicitly handle them and also explicitly documents what happens in the default case:
https://doc.rust-lang.org/stable/std/fs/struct.File.html#method.sync_all
To be honest, how would you like to handle that situation in your program? Terminate it completely? Retry closing? What if you can’t close the file at all? This is one of those scenarios where error handling isn’t obvious.
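For comparison, Go pushes the same decision to the caller: the idiomatic defer f.Close() silently drops the error, but nothing stops you from checking Sync and Close explicitly and deciding what failure should mean. A minimal sketch (hypothetical write helper; terminating on failure is just one possible policy):

```go
package main

import (
	"log"
	"os"
)

// write saves data to path, surfacing the I/O errors that a
// fire-and-forget close would swallow.
func write(path string, data []byte) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close() // best effort; we are already failing
		return err
	}
	if err := f.Sync(); err != nil { // flush to disk, catch write-back errors
		f.Close()
		return err
	}
	return f.Close() // explicitly propagate the close error
}

func main() {
	if err := write("out.txt", []byte("hello\n")); err != nil {
		log.Fatal(err) // terminate; retrying would be another valid policy
	}
}
```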
Only a Sith deals in absolutes. (:
What about tools like PostgREST that allow you to embed all the business logic in the database?
There are conflicting opinions on this.
Some people think they should push everything into the DB and the UI basically equates to pretty eye candy and user experience (UX).
Some people prefer a middle ground, and some people think all the business logic should live outside the DB.
I personally don’t think there is any one right answer. I think it depends on where the API boundary is, which mostly depends on the application.
If you have a need/desire to give end users DB access, then you almost certainly want a lot (if not all) of the business logic in the DB.
If you treat the DB as nothing more than a convenient place to store some data, then putting business logic there is stupid.
Most databases can support any/all of these options.
zie’s answer is good. Another perspective: single responsibility for microservices.
You don’t reimplement logic in multiple apps writing to the same store. So either you have a microservice dedicated to providing the API needed while remaining the single-source-of-truth, or you have business logic in the RDBMS so that the RDBMS effectively embeds that microservice.
And then it’s a question of available skillsets of your staff, and how many people are good at debugging stored procedures vs issuing distributed traces, etc.
It’s all trade-offs.
so that the RDBMS effectively embeds that microservice
That’s an awesome way to describe the approach that I haven’t heard before. It sounds like it could also be used to make some db engineers twitch when I refer to their stored procedures repo as a microservice. Great! :-)
It’s not business logic, it’s pretty much like a database view, which certainly belongs in a database. I think abstractions for data belong close to the data. That also means that you can change, or replace, the thing that interacts with the data, the actual business logic.
One can get cut by this easily when designing database schemas too close to a framework, ORM, etc., only for things to change, leaving a horrible migration path and potentially having to replicate very unidiomatic behavior.
So I’d argue, data and data representation doesn’t belong in your business logic.
Or at least for keeping logic and data separate, which you don’t do if you essentially build parts of the schema or views in your business logic.
…unless you can put all the logic in there.
…or you need to do selects on that data.
…or other business units have views into your schema.
…or you have compliance requirements to do so.
etc.
What this article fails to mention is that none of the popups demonstrated are necessary, and hence of dubious legality. Rather than design a clear cookie consent banner, just defer displaying it until it is necessary. And no, your Google AdWords cookie is not necessary.
It is also ironic that the article itself is covered by a floating banner:
To make Medium work, we log user data. By using Medium, you agree to our Privacy Policy, including cookie policy.
Plus the idea that the reason these bad dialogs exist is because no one’s designed a better one is just … hopelessly misguided. Offering a better alternative won’t make sites switch to it, because they’re doing what they’re doing now because it’s bad, not because it’s good.
Yes. It’s basically a form of civil disobedience.
Basically operating on game theory, hoping the other sites don’t break rank
It’s basically a form of civil disobedience.
Uhhhh… that’s an odd analogy.
Civil disobedience is what you do when the law is unjust or immoral; this is more like “the law doesn’t allow us to be as profitable as we would like so we are going to ignore it.”
Civil disobedience doesn’t have to be ethically or morally just.
civil disobedience, also called passive resistance, the refusal to obey the demands or commands of a government or occupying power, without resorting to violence or active measures of opposition; its usual purpose is to force concessions from the government or occupying power.
Hm; never thought about it that way but fair point! It feels a bit off to compare corporate greed with like … Gandhi, but technically it fits.
I’ve actually had some very reasonable patches that I didn’t submit because of this.
This is not under the control of the Go team.
Whose control is it under? I don’t like it when people say that “it can’t be done”. It can; there is a difference between “can’t” and “we don’t want to”.
It sounds like it’s a requirement for all Google projects, so whoever at Google is in charge of open-source.
The core problem is the only entities currently paying for web browser development have mixed motives. The EU should just buy out Mozilla and make Firefox into the browser for the people instead of waiting around for Google to stop breaking their laws.
The GDPR is specific that cookie banners must not be obtrusive, and that rejecting tracking must be as easy as accepting.
The only compliant banner I regularly see is from gov.uk, and I find it doesn’t annoy me at all.
The popups are as obnoxious as possible to make us hate the GDPR. Can’t we oppose the tracking instead of the law telling us when it’s happening?
And of course the core thing is you don’t need the cookie popups if you’re not doing random tracking of people!
Every cookie popup is an announcement that the site has some automated trackers set up. If you are just using cookies for things like handling sessions you do not need the cookies.
Absolutely. The options are either make your tracking opt-in through genuinely informed consent, or don’t track at all.
Companies found the secret third option, which is just ignore the law and dark pattern your users into agreeing to anything.
Banners say things like “we need cookies for this site to work” and pretend they need your permission to use them. Ironically they only need permission for the cookies that aren’t essential to make the site work.
Hiding things away under “legitimate interest” makes things even more confusing. Are the other things illegitimate interests?
…you do not need the cookies.
Do you mean the cookies or the popups? I’m not familiar with how the GDPR treats non-cookie based things like JWT in local storage and sent with every request.
The same. You require consent to store any data on the user’s computer. However, consent is not required for “essential” cookies - for example, a cookie with preferences for dark/light theme does not require consent if it results from a direct action on the website, a cookie containing a session ID does not require consent, etc. That applies to local cookies only, though.
If you already block tracking by any mean, you can get rid of those banners using something like https://addons.mozilla.org/en-GB/firefox/addon/i-dont-care-about-cookies/.
Yeah, the EU’s heart was in the right place, but the implementation has been a disaster. It’s like passing a law that murder is okay as long as you say “I am going to murder you” as you take out the knife.
What the EU did was basically passing a law that makes murder illegal. Companies/Murderers just ignore it and go around saying “anyone that doesn’t want to be murdered, please answer by saying your name within the next millisecond. Guess no one answered, so you’ve just consented to murder!”
GDPR explicitly bans all the annoying dark patterns of cookie banners. A GDPR-compliant cookie banner would ask you once whether you consent to tracking. It’d have one huge no button (but no easily accessible yes button). If you ever click no, it’d have to remember that as long as possible and close itself immediately. If you click yes, you’d have to go through a handful of options to specifically choose which tracking methods to allow.
So, basically the polar opposite of many cookie popups today, which have a big “I ACCEPT” button and a “More options” button that you have to click to manually turn off all tracking…
Indeed. Which is now finally happening: https://www.theverge.com/2022/4/21/23035289/google-reject-all-cookie-button-eu-privacy-data-laws
Except large Internet companies are much more powerful and accountable to public pressure than murderers, so they should face at least as much public scorn as the lawmakers.
There’s a saying that the road to hell is paved with good intentions.
That often means that if someone is not sure how to help, then proceeding with helping can create more problems than it resolves.
That’s better than having no law against murder. Then we can move away from all the people saying “I am going to murder you.”
Umm… we’ve just today decided to instruct Matomo not to use cookies rather than implement a cookie banner for our new Wagtail-based websites. I think it’s working?
What’s to buy? It’s open source. They can contribute to it or fork it if Mozilla Corp doesn’t like their changes.
The Mozilla organization, including the expertise necessary to develop and maintain Firefox. It would probably cost more to build an independent organization capable of doing the same thing.
Which Mozilla organization? The non-profit Mozilla Foundation or the for-profit Mozilla Corporation?
The Mozilla Corporation is owned in its entirety by the Mozilla Foundation. Even if somehow the Foundation were convinced to sell the Corporation, the Foundation is the one that owns the key intellectual property and is the actual steward of the things people think of as “Mozilla”. The Corporation’s purpose is to be an entity that pays taxes and thus can have types of revenue and business deals that are forbidden to a non-profit.
The employees who work on Firefox and everything that encompasses work for the Corporation. It has more of a purpose than “taxes”.
MoCo gets all of the revenue that’s generated by Firefox and employs most of the developers. All but one of the members of the Firefox Technical Leadership team work for Mozilla Corp - the one that doesn’t did until relatively recently: https://wiki.mozilla.org/Modules/Firefox_Technical_Leadership
While the Foundation technically owns the IP the Corporation controls the direction of the product and collects all of the revenue generated by the work of both their employees and contributions from the community.
Declare Firefox a public infrastructure and fund Mozilla or another entity to upkeep and enhance that infrastructure.
A slightly related Go nit: the case of structure members determines whether they’re exported or not. It’s crazy - why not explicitly add a private keyword or something?
why not explicitly add a private keyword or something?
Because capitalization does the same thing with less ceremony. It’s not crazy. It’s just a design decision.
And limiting variable names to just “x”, “y” and “z” is also simpler and much less ceremony than typing out full variable names.
I’m not sure how this relates. Is your claim that the loss of semantic information that comes with terse identifiers is comparable to the difference between type Foo struct and e.g. type public foo struct?
That is actually a Go convention, too. Two-letter or three-letter variable names like cs instead of customerService.
This would be a more substantive comment chain if you can express why it’s crazy, not just calling it crazy. Why is it important that it should be a private keyword “or something”? In Go, the “or something” is literally the case sensitive member name…which is an explicit way of expressing whether it’s exported or not. How much more explicit can you get than a phenotypical designation? You can look at the member name and know then and there whether it’s exported. An implicit export would require the reader to look at the member name and at least one other source to figure out if it’s exported.
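For anyone unfamiliar with the rule, a minimal sketch (hypothetical names):

```go
package geometry

// Point is visible to importing packages because its name starts
// with an upper-case letter; its fields demonstrate both cases.
type Point struct {
	X, Y  int    // exported: importers can read and write these
	label string // unexported: visible only within package geometry
}

// Label exposes the unexported field, read-only, to other packages.
func (p Point) Label() string { return p.label }
```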
It’s bad because changing the visibility of a member requires renaming it, which requires finding and updating every caller. This is an annoying manual task if your editor doesn’t do automatic refactoring, and it pollutes patches with many tiny one-character diffs.
It reminds me of old versions of Fortran where variables that started with I, J, K, L, M or N were automatically integers and the rest were real. 🙄
M-x lsp-rename
I don’t think of those changes as patch pollution — I think of them as opportunities to see where something formerly private is now exposed. E.g. when a var was unexported I knew that my package controlled it, but if I export it now it is mutable outside my control — it is good to see that in the diff.
That’s not the point. The point is you have to edit every place that variable/function appears in the source.
I was going to suggest that gofmt’s pattern rewriting would help here but it seems you can’t limit it to a type (although gofmt -r 'oldname -> Oldname' works if the fieldname is unique enough). Then I was going to suggest gorename, which can limit to struct fields but apparently hasn’t been updated to work with modules. Apparently gopls is the new hotness but, despite the “it’ll rename throughout a package”, when I tested it, specifying main.go:9:9 Oldname only fixed it (correctly!) in main.go, not the other files in the main package.
In summary, this is all a bit of a mess from the Go camp.
It looks like rsc’s experimental “refactor” can do this - successfully renamed a field in multiple files for me with rf 'mv Fish.name Fish.Name'.
The author of the submitted article wrote a sequel article, Go’ing Insane Part Two: Partial Privacy. It includes a section Privacy via Capitalisation that details what they find frustrating about the feature.
A slightly related not-Go nit: the private keyword determines whether struct fields are exported or not. It’s crazy - why not just use the case of the field names, saving everyone some keypresses?
I really appreciate it, and find myself missing it on every other language. To be honest, I have difficulty understanding why folks would want anything else.
On the contrary, I rather like that it’s obvious in all cases whether something is exported or not without having to find the actual definition.
https://data.firefox.com/dashboard/user-activity
Sadly, soon there will be nobody left to experience any speedups; maybe it is time to rethink their development.
Oh please. Do you really believe they should just give up because they now have >100M active clients in the last month?
No, maybe it is time to stop cutting features left and right, stop making it more like Chrome, and stop focusing on irrelevant things unrelated to web browsers.
I just can’t shake the feeling that Kubernetes is Google externalizing their training costs to the industry as a whole (and I feel the same applies to Go).
Golang is great for general application development, IME. I like the culture of explicit error handling with thoughtful error messages, the culture of debugging with fast unit tests, when possible, and the culture of straightforward design. And interfaces are great for breaking development into components which can be developed in parallel. What don’t you like about it?
It was initially the patronizing quote from Rob Pike that turned me off Go. I’m also not a fan of gofmt [1] (and I’m not a fan of opinionated software in general, unless I’m the one controlling the opinions [2]). I’m also unconvinced about the whole “unit testing” thing [5]. Also, it’s from Google [3]. I rarely mention it, because it goes against the current zeitgeist (especially at the Orange Site), and really, what can I do about it?
[1] I’m sorry, but opening braces go on their own line. We aren’t developing on 24 line terminals anymore, so stop shoving your outdated opinions in my face.
[2] And yes, I realize I’m being hypocritical here.
[3] Google is (in my opinion, in case that’s not apparent) shoving what they want on the entire industry to a degree that Microsoft could only dream of. [4]
[4] No, I’m not bitter. Really!
[5] As an aside: up through late 2020, my department had a method of development that worked (and it did not involve anything resembling a “unit test”)—in 10 years we only had two bugs get to production. In the past few months there’s been a management change and a drastic change in how we do development (Agile! Scrum! Unit tests über alles! We want push button testing!) and so far, we’ve had four bugs in production.
Way to go!
I should also note that my current manager retired, the other developer left for another job, and the QA engineer assigned to our team also left for another job (but has since come back because the job he moved to was worse, and we could really use him back in our office). So nearly the entire team was replaced back around December of 2020.
I’m sorry, but opening braces go on their own line. We aren’t developing on 24 line terminals anymore, so stop shoving your outdated opinions in my face.
I use a portrait monitor with a full-screen Emacs window for my programming, and I still find myself wishing for more vertical space when programming in curly-brace languages such as Go. And when I am stuck on a laptop screen I am delighted when working on a codebase which does not waste vertical space.
Are you perhaps younger than I am, with very small fonts configured? I have found that as I age I find a need for large and larger fonts. Nothing grotesque yet, but I went from 9 to 12 to 14 and, in a few places, 16 points. All real 1/72” points, because I have my display settings configured that way. 18-year-old me would have thought I am ridiculous! Granted, you’ve been at your current employer at least 10 years, so I doubt you are 18🙂
I’m also unconvinced about the whole “unit testing” thing … my department had a method of development that worked (and it did not involve anything resembling a “unit test”)—in 10 years we only had two bugs get to production. In the past few moths there’s been a management change and a drastic change in how we do development (Agile! Scrum! Unit tests über alles! We want push button testing!) and so far, we’ve had four bugs in production.
I suspect that the increase in bugs has to do with the change in process rather than the testing regime. Adding more tests on its own can only lead to more bugs if either incorrect tests flag correct behaviour as bugs (leading to buggy ‘bugfixes,’ or rework to fix the tests), or if correct tests for unimportant bugs lead to investing resources inefficiently, or if the increased emphasis leads to worse code architecture or rework rewriting old code to conform to the new architecture (I think I covered all the bases here). OTOH, changing development processes almost inevitably leads to poor outcomes in the short term: there is a learning curve; people and secondary processes must adapt &c.
That is worth it if the long-term outcomes are sufficiently better. In the specific case of unit testing, I think it is worth it, especially in the long run and especially as team size increases. The trickiest thing about it in my experience has been getting the units right. I feel pretty confident about the right approach now, but … ask me in a decade!
Are you perhaps younger than I am, with very small fonts configured?
I don’t know, you didn’t give your age. I’m currently 52, and my coworkers (back when I was in the office) often complained about the small font size I use (and have used).
I suspect that the increase in bugs has to do with the change in process rather than the testing regime.
The code (and it’s several different programs that comprise the whole thing) was not written with unit testing in mind (even though it was initially written in 2010, it’s in C89/C++98, and the developer who wrote it didn’t believe in unit tests). We do have a regression test that tests end-to-end [1] but there are a few cases that as of right now require manual testing [2], which I (as a dev) can do, but generally QA does a more in-depth testing. And I (or rather, we devs did, before the major change) work closely with the QA engineer to coordinate testing.
And that’s just the testing regime. The development regime is also being forced changed.
[1] One program to generate the data required, and another program that runs the eight programs required (five of which aren’t being tested but need to be endpoints our stuff talks to) and runs through 15,800+ tests we have (it takes around two minutes). It’s gotten harder to add tests to it (the regression test is over five years old) due to the nature of how the cases are generated (automatically, and not all cases generated are technically “valid” in the sense we’ll see it in production).
[2] Our business logic module queries two databases at the same time (via UDP—they’re DNS queries), so how does one automate the testing of result A returns before result B, result B returns before result A, A returns but B times out, B returns and A times out? The new manager wants “push button testing”.
[2] Our business logic module queries two databases at the same time (via UDP—they’re DNS queries), so how does one automate the testing of result A returns before result B, result B returns before result A, A returns but B times out, B returns and A times out? The new manager wants “push button testing”
Here are three options, but there are many others:
Re-ordering results from databases is a major part of what Jepsen does; you could take ideas from there too.
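For example, one (hypothetical) option is to put the two lookups behind an interface and drive the four orderings with fakes; a minimal Go sketch, assuming the business logic could be refactored to accept such an interface:

```go
package biz

import (
	"errors"
	"testing"
	"time"
)

// Resolver abstracts the two parallel lookups so tests can control
// ordering and timeouts. (Hypothetical interface; the real code
// would need refactoring to accept it in place of raw UDP queries.)
type Resolver interface {
	LookupName(number string) (string, error)
	LookupReputation(number string) (string, error)
}

// query runs both lookups in parallel; it stands in for the real
// business-logic entry point.
func query(r Resolver, number string) (name, rep string, err error) {
	type res struct {
		val string
		err error
	}
	nameCh, repCh := make(chan res, 1), make(chan res, 1)
	go func() { v, e := r.LookupName(number); nameCh <- res{v, e} }()
	go func() { v, e := r.LookupReputation(number); repCh <- res{v, e} }()
	n, p := <-nameCh, <-repCh
	if n.err != nil {
		return "", p.val, n.err
	}
	return n.val, p.val, p.err
}

// fake returns canned answers after configurable delays, letting a
// test force each of the four orderings deterministically.
type fake struct {
	nameDelay, repDelay time.Duration
	nameErr, repErr     error
}

func (f fake) LookupName(string) (string, error) {
	time.Sleep(f.nameDelay)
	return "name", f.nameErr
}

func (f fake) LookupReputation(string) (string, error) {
	time.Sleep(f.repDelay)
	return "rep", f.repErr
}

func TestOrderings(t *testing.T) {
	timeout := errors.New("timeout")
	cases := map[string]fake{
		"name before reputation": {repDelay: 10 * time.Millisecond},
		"reputation before name": {nameDelay: 10 * time.Millisecond},
		"name lookup times out":  {nameErr: timeout},
		"rep lookup times out":   {repErr: timeout},
	}
	for label, f := range cases {
		t.Run(label, func(t *testing.T) {
			if _, _, err := query(f, "+15551234567"); err != nil && f.nameErr == nil && f.repErr == nil {
				t.Fatalf("unexpected error: %v", err)
			}
		})
	}
}
```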
Yes, of course the networking code would still need to be tested.
Ideally, the networking code would have its own unit tests. And, of course, unit tests don’t replace integration tests. Test pyramid and such.
🚀
netfilter can be automated. It’s an API.
What’s push button testing?
You want to test the program. You push a button. All the tests run. That’s it. Fully automated testing.
👌🏾
Everything I’ve worked on since ~2005 has been fully and automatically tested via continuous integration. IMHO it’s a game changer.
Would love to hear about your prior development method. Did adopting the new practices have any upsides?
First off, our stuff is a collection of components that work together. There are two front-end pieces (one for SS7 traffic, one for SIP traffic) that then talk to the back-end (that implements the business logic). The back-end makes parallel DNS queries [1] to get the required information, mucks with the data according to the business logic, then returns data to the front-ends to ultimately return the information back to the Oligarchic Cell Phone Companies. Since this process happens as a call is being placed, we are on the Oligarchic Cell Phone Companies’ network, and we have some pretty short time constraints. And due to this, not only do we have some pretty severe SLAs, but any updates have to be approved 10 business days before deployment by said Oligarchic Cell Phone Companies. As a result, we might get four deployments per year [2].
And the components are written in a combination of C89, C++98 [3], C99, and Lua [4].
So, now that you have some background, our development process. We do trunk based development (all work done on one branch, for the most part). We do NOT have continuous deployment (as noted above). When working, we developers (which never numbered more than three) would do local testing, either with the regression test, or another tool that allows us to target a particular data configuration (based off the regression test, which starts eight programs, five of which are just needed for the components being tested). Why not test just the business logic? Said logic is spread throughout the back-end process, intermixed with all the I/O it does (it needs data from multiple sources, queried at the same time).
Anyway, code is written, committed (main line), tested, fixed, committed (main line), repeat, until we feel it’s good. And the “tested” part not only includes us developers, but also QA at the same time. Once it’s deemed working (using both regression testing and manual testing), we then officially pass it over to QA, who walks it down the line from the QA servers, staging servers and finally (once we get permission from the Oligarchic Cell Phone Companies) into production, where not only devops is involved, but QA and the developer whose code is being installed (at 2:00 am Eastern, Tuesday, Wednesday or Thursday, never Monday or Friday).
Due to the nature of what we are dealing with, testing at all is damn near impossible (or rather, hideously expensive, because getting actual cell phone traffic through the lab environment involves, well, being a phone company (which we aren’t), very expensive and hard to get equipment, and a very expensive and hard to get laboratory setup (that will meet FCC regulations, blah blah yada yada)) so we do the best we can. We can inject messages as if they were coming from cell phones, but it’s still not a real cell phone, so there is testing done during deployment into production.
It’s been a 10 year process, and it has gotten better until this past December.
Now it’s all Agile, scrum, stories, milestones, sprints, and unit testing über alles! As I told my new manager, why bother with a two week sprint when the Oligarchic Cell Phone Companies have a two year sprint? It’s not like we ever did continuous deployment. Could more testing be done automatically? I’m sure, but there are aspects that are very difficult to test automatically [5]. Also, more branch development. I wouldn’t mind this so much, except we’re using SVN (for reasons that are mostly historical at this point) and branching is … um … not as easy as in git. [6] And the new developer sent me diffs to ensure his work passes the tests. When I asked him why he didn’t check the new code in, he said he was told by the new manager not to, as it could “break the build.” But we’ve broken the build before this—all we do is just fix the code and check it in [8]. But no, no “breaking the build”, even though we don’t do continuous integration, nor continuous deployment, and what deployment process we do have locks the build number from Jenkins of what does get pushed (or considered “gold”).
Is there any upside to the new regime? Well, I have rewritten the regression test (for the third time now) to include such features as “delay this response” and “did we not send a notification to this process”. I should note that this is code for us, not for our customer, which, need I remind people, is the Oligarchic Cell Phone Companies. If anyone is interested, I have spent June and July blogging about this (among other things).
[1] Looking up NAPTR records to convert phone numbers to names, and another set to return the “reputation” of the phone number.
[2] It took us five years to get one SIP header changed slightly by the Oligarchic Cell Phone Companies to add a bit more context to the call. Five years. Continuous deployment? What’s that?
[3] The original development happened in 2010, and the only developer at the time was a) very conservative, b) didn’t believe in unit tests. The code is not written in a way to make it easy to unit test, at least, as how I understand unit testing.
[4] A prototype I wrote to get my head around parsing SIP messages that got deployed to production without my knowing it by a previous manager who was convinced the company would go out of business if it wasn’t. This was six years ago. We’re still in business, and I don’t think we’re going out of business any time soon.
[5] As I mentioned, we have multiple outstanding requests to various data sources, and other components that are notified on a “fire and forget” mechanism (UDP, but it’s all on the same segment) that the new regime want to ensure gets notified correctly. Think about that for a second, how do you prove a negative? That is, something that wasn’t supposed to happen (like a component not getting notified) didn’t happen?
[6] I think we’re the only department left using SVN—the rest of the company has switched to git. Why are we still on SVN? 1) Because the Solaris [7] build servers aren’t configured to pull from git yet and 2) the only redeeming feature of SVN is the ability to check out a subdirectory, which, given the layout of our repository and how devops want the build servers configured, is used extensively. I did look into using git submodules, but man, what a mess. It totally doesn’t work for us.
[7] Oh, did I neglect to mention we’re still using Solaris because of SLAs? Because we are.
[8] Usually, it’s Jenkins that breaks the build, not the code we checked in. Sometimes, the Jenkins checkout fails. Devops has to fix the build server [7] and try the call again.
As a result, we might get four deployments per year [2]
AIUI most agile practices are to decrease cycle time and get faster feedback. If you can’t, though, then you can’t! Wrong practices for the wrong context.
I feel for you.
Thank you! More grist for my “unit testing is fine in its place” mill.
Also: hiring new management is super risky.
This is just a laptop with a proprietary parts system.
Not sure if this advertisement belongs here.
proprietary parts system
That’s true for some of the parts I’m sure (due to necessity since the market doesn’t have a concept of “standardized laptop enclosures”), but the expansion cards are just internal USB C dongles. They’ve also released the CAD files for the expansion card housing, so people can make their own.
Maybe, apart from the screen, expansion cards with ports on them, speakers, memory, storage, camera, microphone, plastic bit around the screen, wifi module.
Nothing is stopping people from buying the same components or compatible components or even making new compatible components. If I am wrong and naive then please tell me why.
Proprietary as in “only used by the one company”, or proprietary as in “fees required for production of compatible devices”?
If the former, that’s how most good hardware standards start off - someone makes their version and shows it can work (and gains nontrivial marketshare), then others produce components that can match.
If the latter, well, that’s news to me.
If the latter, well, that’s news to me.
AIUI only the USB3-based slots are open and royalty-free. Anything else is proprietary.
Sadly idiots online have made this effort harder for the team: https://github.com/temporary-audacity/audacity/issues/48#issuecomment-874555049
It looks like the maintainer has resigned due to IRL harassment from channers: https://github.com/tenacityteam/tenacity/issues/99
Based on their followup comment, seems like harassment is an understatement, they were assaulted. :((
Assault with a knife and a running investigation by the Federal Criminal Police Office. So it doesn’t actually matter anymore what 4chan might have intended; I hope they really catch those guys for good.
It’s likely that they’ll catch and convict the one person who did the assault, and that nobody else will have any liability. I say this as somebody who’s followed the activities of hate groups for years.
Well, that would be at least something. If we can push such activity back to online harassment, it’ll already be a win. Or rather: I wouldn’t be surprised if they can’t find out where/who it was and the charge is so light for some technical reason that nothing actually happens. I think it’s fair to convict at least the person who was ready enough to start going at people with a knife; these are ticking bombs anyway, in my experience. But yeah, it’s probably not the last “raid” by 4chan.
Is the context for this preserved somewhere? The comment is replying to @alicemargatroid but I don’t see any posts from them in that issue. Seems like GitHub may have Optimized our Experience.
As near as I can tell in the five minutes I’m willing to spend looking in to this, the joke on 4chan was that the project should be named “Sneedacity”. Apparently “sneed” is some sort of meme, in-joke, or something. And instead of leaving it as just some joke comment people started “campaigning” to name it Sneedacity.
🤷
Huh. I assumed it was a play on Au (gold) vs Sn (tin) with “ee” filled in to make it pronounceable.
I’m glad I’m at least operating on a different level, whether you want to call it higher or lower. Wow. When I first heard about the policy change announcements, I thought about grabbing the source from the last change set before the transition, tossing it into a git repo, and putting builds online for the platforms I use + Windows. ’Cause I use it regularly and generally build it from source for myself.
Now I’ll just keep building it for myself, and won’t jump into this fray. It’s not like I was really going to do any more maintenance than fixing the odd wx upgrade breakage anyway. Bleh
Apparently it’s 4chan-speak for “special needs”, though with the amount of fake symbol-recontextualizing 4chan does I’m not sure if I believe it.
It’s a Simpsons reference: https://knowyourmeme.com/memes/sneeds-feed-and-seed
I believe a poll organised by the dev for the name was won by “sneedacity”, and the dev refused to use the name, thereby starting this situation.
Sorry, but this reads as blaming the victim. Sure, the dev decided not to use the name, but angry 4chan mob appearing in front of his place is way above any meaningful escalation.
I don’t think it was intended like that at all: it was just establishing what happened exactly, not assigning blame. That’s how I read it anyway.
Well yeah, but what I mean is that people came and voted and didn’t “hack” the result, whatever that is supposed to mean.
I’m honestly appalled that such an ignorant article has been written by a former EU MEP. This article completely ignores the fact that the creation of Copilot’s model itself is a copyright infringement. You give Github a license to store and distribute your code from public repositories. You do not give Github permission to use it or create derivative works. And as Copilot’s model is created from various public code, it is a derivative of that code. Some may try to argue that training machine learning models is ‘fair use’, yet I doubt that you can argue that something which can regurgitate the entire meaningful portion of a file (example taken from Github’s own public dataset of exact generated code collisions) is not a derivative work.
In many jurisdictions, as noted in the article, the “right to read is the right to mine” - that is the point. There is already an automatic exemption from copyright law for the purposes of computational analysis, and GitHub don’t need to get that permission from you, as long as they have the legal right to read the code (i.e. they didn’t obtain it illegally).
This appears to be the case in the EU and Britain - https://www.gov.uk/guidance/exceptions-to-copyright - I’m not sure about the US.
Something is not a derivative work in copyright law simply due to having a work as an “input” - you cannot simply argue “it is derived from” therefore “it is a derivative work”, because copyright law, not English language, defines what a “derivative work” is.
For example, Markov chain analysis done on SICP is not infringing.
Obviously, there are limits to this argument. If Copilot regurgitates a significant portion verbatim, e.g. 200 LOC, is that a derivative? If it is 1,000 lines where not one line matches, but it is essentially the same with just variables renamed, is that a derivative work? etc. I think the problem is that existing law doesn’t properly anticipate the kind of machine learning we are talking about here.
Dunno how it is in other countries, but in Lithuania I cannot find any exception that would fit what Github has done and allow using my works without my agreement. The closest one could be citation, but they do not comply with the requirement of mentioning my name and the work from which the citation is taken.
I gave them the license to reproduce, not to use or modify - these are two entirely different things. If they weren’t, then Github has the ability to use all AGPL’d code hosted on it without any problems, and that’s obviously wrong.
There is no separate “mining” clause. That is not a term in copyright. Notice how research is quite explicitly “non-commercial” - and I very much doubt that what Github is doing with Copilot is non-commercial in nature.
The fact that similar works were done previously doesn’t mean that they were legal. They might have been ignored by the copyright owners, but this one quite obviously isn’t.
There is no separate “mining” clause. That is not a term in copyright. Notice how research is quite explicitly “non-commercial” - and I very much doubt that what Github is doing with Copilot is non-commercial in nature.
Ms. Reda is referring to a copyright reform adopted at the EU level in 2019. This reform entailed the DSM directive 2019/790, which is more commonly known for the regulations regarding upload filters. This directive contains a text and data mining copyright limitation in Art. 3 ff. The reason why you don’t see this limitation in Lithuanian law (yet) is probably because Lithuania has not yet transposed the DSM directive into its national law. This should probably follow soon, since Art. 29 mandates transposition into national law by June 7th, 2021. Germany has not yet completed the transposition either.
That is, “text and data mining” now is a term in copyright. It is even legally defined on the EU level in Art. 2 Nr. 2 DSM directive.
That being said, the text and data mining exception in Art. 3 ff. DSM directive does not – at first glance, I have only taken a cursory look – allow commercial use of the technique, but only permits research.
Oh, huh, here it’s called an education and research exception and has been in law for way longer than that directive, and it doesn’t mention anything remotely translatable as mining. It didn’t even cross my mind that she could have been referring to that. I see that she pushed for that exception to be available for everyone, not only research and cultural heritage, but it is careless of her to mix up what she wants the law to be, and what the law is.
Just as a preventative answer: no, Art. 4 of the DSM directive does not allow Github to do what it does either, as it applies to works whose use “has not been expressly reserved by their rightholders in an appropriate manner, such as machine-readable means in the case of content made publicly available online.”, and Github was free to get the content in an appropriate manner for machine learning. It is using the content for machine learning that infringes the code owners’ copyright.
I gave them the license to reproduce, not to use or modify - these are two entirely different things. If they weren’t, then Github has the ability to use all AGPL’d code hosted on it without any problems, and that’s obviously wrong.
An important thing is also that the copyright owner is often a different person than the one who signed a contract with GitHub and uploaded the code there (git commit vs. git push). The uploader might agree with whatever terms and conditions, but the copyright owner’s rights must not be disrupted in any way.
Nobody is required to accept terms of a software license. If they don’t agree to the license terms, then they don’t get additional rights granted in the license, but it doesn’t take away rights granted by the copyright law by default.
Even if you licensed your code under “I forbid you from even looking at this!!!”, I can still look at it, and copy portions of it, parody it, create transformative works, use it for educational purposes, etc., as permitted by copyright law exceptions (details vary from country to country, but the gist is the same).
Ms. Reda is a member of the Pirate Party, which is primarily focused on the intersection of tech and copyright. She has a lot of experience working on copyright-related legislation, including proposals specifically about text mining. She’s been a voice of reason when the link tax and upload filters were proposed. She’s probably the copyright expert in the EU parliament.
So be careful when you call her ignorant and mistaken about basics of copyright. She may have drafted the laws you’re trying to explain to her.
It is precisely because of her credentials that I am so appalled. I cannot in a good mind find this statement not ignorant.
The directive about text mining very explicitly specifies “only for ‘research institutions’ and ‘for the purposes of scientific research’.” Github and its Copilot don’t fall into that classification at all.
Indeed.
Even though my opinion of Copilot is near-instant revulsion, the basic idea is that information and code is being used to train a machine learning system.
This is analogous to a human reviewing and reading code, and learning how to do so from lots of examples. And someone going through higher ed school isn’t “owned” by the copyright owners of the books and code they read and review.
If Copilot is violating, so are humans who read. And that… that’s a very disturbing and disgusting precedent that I hope we don’t set.
Copilot doesn’t infringe, but GitHub does, when they distribute Copilot’s output. Analogously to humans, humans who read do not infringe, but they do when they distribute.
I don’t think that’s right. A human who learns doesn’t just parrot out pre-memorized code, and if they do they’re infringing on the copyright in that code.
The real question, which I think people are missing, is whether learning itself is a derivative work.
How that learning happens can either be with a human, or with a machine learning algorithm. And with the squishiness and lack of insight with human brains, a human can claim they insightfully invented it, even if it was derived. The ML we’re seeing here is doing a rudimentary version of what a human would do.
If Copilot is ‘violating’, then humans can also be ‘violating’. And I believe that is a dangerous path, laying IP based claims on humans because they read something.
And as I said upthread, as much as I have a kneejerk that Copilot is bad, I don’t see how it could be infringing without also doing the same to humans.
And as a underlying idea: copyright itself is a busted concept. It worked for the time before mechanical and electrical duplication took hold at a near 0 value. Now? Not so much.
I don’t agree with you that humans and Copilot are learning somewhat the same.
The human may learn by rote memorization, but more likely, they are learning patterns and the why behind those patterns. Copilot also learns patterns, but there is no why in its “brain.” It is completely rote memorization of patterns.
The fact that humans learn the why is what makes us different and not infringing, while Copilot infringes.
No I don’t think that’s the real question. Copying is treated as an objective question (and I’m willing to be corrected by experts in copyright law) ie similarity or its lack determine copying regardless of intent to copy, unless the creation was independent.
But even if we address ourselves to that question, I don’t think machine learning is qualitatively similar to human learning. Shoving a bunch of data together into a numerical model to perform sequence prediction doesn’t equate to human invention, it’s a stochastic copying tool.
It seems like it could be used to shirk the effort required for a clean room implementation. What if I trained the model on one and only one piece of code I didn’t like the license of, and then used the model to regurgitate it, can I then just stick my own license on it and claim it’s not derivative?
Ms. Reda is a member of the Pirate Party
She has left the Pirate Party years ago, after having installed a potential MEP “successor” who was unknown to almost everyone in the party; she subsequently published a video not to vote Pirates because of him as he was allegedly a sex offender (which was proven untrue months later).
Why exactly do you think someone from the ‘pirate party’ would respect any sort of copyright? That sounds like they might be pretty biased against copyright…
Despite a cheeky name, it’s a serious party. Check out their programme. Even if the party is biased against copyright monopolies, DRM, frivolous patents, etc. they still need expertise in how things work currently in order to effectively oppose them.
Have you read the article?
She addresses these concerns directly. You might not agree but you claim she “ignores” this.
And as Copilot’s model is created from various public code, it is a derivative of that code.
Depends on the legal system. I don’t know what happens if I am based in Europe but the guys doing this are in the USA. It probably just means that they can do whatever they want. The article makes a ton of claims about various legal aspects of all of this, but as far as I know Julia is not actually a lawyer, so I think we can ignore this article.
In Poland maybe this could be considered a “derivative work”, but then work which was “inspired” by the original is not covered (so maybe the output of the network is inspired?), and then you have a separate section about databases, so maybe this is a database in some weird way of understanding it? If you are not a lawyer, I doubt you can properly analyse this. The article tries to analyse the legal aspect and the moral aspect at the same time, while those are completely different things.
The short code snippets that Copilot reproduces from training data are unlikely to reach the threshold of originality.
The thing is that the core logic of a tricky problem can very well be very little code. Take quicksort for instance. It is a super-clever algorithm, yet not much code. Luckily quicksort is not under some license that hinders its use in any setting, but it could very well be. Just because it is only 10 lines does not mean it is not an invention that is copyrightable. Code is very different from written language in that regard.
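To make “super-clever, yet not much code” concrete: a textbook in-place quicksort in Go really is about ten lines.

```go
// quicksort sorts a slice in place: pick the last element as the
// pivot, partition the rest around it, then recurse on both halves.
func quicksort(a []int) {
	if len(a) < 2 {
		return
	}
	pivot, i := a[len(a)-1], 0
	for j := 0; j < len(a)-1; j++ {
		if a[j] < pivot {
			a[i], a[j] = a[j], a[i]
			i++
		}
	}
	a[i], a[len(a)-1] = a[len(a)-1], a[i] // move pivot into place
	quicksort(a[:i])
	quicksort(a[i+1:])
}
```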
Yeah the title is ignoring this important bit. The claim is precisely that Github Copilot can suggest code that DOES exceed the threshold of originality.
Based on machine learning systems I’ve used, it seems very plausible. There is no guarantee that individual pieces of the training data don’t get output verbatim.
And in fact I say that’s likely, not just plausible. Several years ago, I worked on a paper called Deep Learning with Differential Privacy that addresses the leakage from the training data -> model -> inferences (which seems to have nearly 2000 citations now). If such things were impossible then there’s no reason to do such research.
That was/is still a concern at least about two years ago, because it was one of the topics I was contemplating for my bachelor’s thesis. The hypothesis I was presented was that there (allegedly) is a smooth transition: the better your model the more training data it leaks, and vice versa. Unfortunately, I chose a different topic so I can’t go into detail here.
There is no guarantee that individual pieces of the training data don’t get output verbatim.
The funny thing is that humans do that too. They read something and unknowingly reproduce the same thing in writing as their own w/o bad intent. I think there is a name for that effect, but I fail to find it atm.
I’d say it depends on the scale. Sure in theory it’s possible for a human to type out 1000 lines of code that are identical to what they saw elsewhere, without realizing it, but vanishingly unlikely. They might do that for 10 lines, but not 1000 lines, and honestly not even 100.
On the other hand it’s pretty easy to imagine a machine learning system doing that. Computers are fundamentally different than humans in that they can remember stuff exactly … It’s actually harder to make them remember stuff approximately, which is what humans do :)
Quicksort is an algorithm, and isn’t covered under copyright in the first place. A specific implementation might be, but there’s a very limited number of ways you can implement quicksort and usually there’s just one “obvious” way, and that’s usually not covered under copyright either – and neither should it IMO. It would be severely debilitating as it would mean that the first person to come up with a piece of code to make a HTTP request or other trivial stuff would hold copyright over that. Open source code as we know it today would be nigh impossible.
So that means the fast inverse square root example from the Quake engine source code can be copied by anyone w/o adhering to the GPL? It is just an algorithm after all. If that is truly the case, then the GPL is completely useless, since I can start copy & pasting GPL code w/o any repercussions, since it is “just an algorithm”.
If your friend looks at the Quake example and describes it to you without actually telling you the code – by describing the algorithm – and you write it in a different language, you are definitely safe.
If your friend copies the Quake engine code into a chat message and sends it to you, and you copy it into your codebase and change the variable names to match what you were doing, you are very probably in the wrong.
Somewhere in between those two it gets fuzzy.
It looks like my friend copilot is willing to paste the quake version of it directly into my editor verbatim, comments included, without telling me its provenance. If a human did that, it would be problematic.
In the Quake example it copied the (fairly minor) comments too, which is perhaps a bit iffy but a minor detail. There is just one way to express this algorithm: if anyone were to implement this by just having the algorithm described to them but without actually having seen the code then the code would be pretty much identical.
I’m not sure if you’re appreciating the implications if it worked any differently. Patents on these sorts of things are already plenty controversial. Copyright would mean writing any software would have a huge potential for copyright suits and trolls; there’s loads of “only one obvious implementation” code like this, both low-level and high-level. What you’re arguing for would be much much worse than software patents.
People seem awfully focused on this Copilot thing at the moment. The Open Source/Free Software movement has spent decades fighting against expansion of copyright and other IP on these kind of things. The main beneficiaries of such an expansion wouldn’t be authors of free software but Microsoft, copyright trolls, and other corporations.
Semi-related, Carmack sorta regrets using the GPL license
https://twitter.com/ID_AA_Carmack/status/1412271091393994754
Semi-related, Carmack sorta regrets using the GPL license
…in this specific instance.
Quote from Carmack in that thread: “I’m still supportive of lots of GPL work, but I don’t think the restrictions helped in this particular case.”
I’m not implying you meant to imply otherwise, I’m just adding some context so it’s clearer.
That is fine, I am not the biggest GPL fan myself, yet if I use GPL code/software, I try to adhere to its license
quicksort is not under some license that hinders its use in any setting, but it could very well be
Please do not confuse patent law and copyright law. By referring to an algorithm you seem to be alluding to patent law. And yet you mixed the terms “invention” and “copyrightable”. Please explain what you mean further because as far as I know this is nonsense. As far as I know, “programs for computers” can’t be regarded as inventions and therefore patented under the European legal system; this is definitely the case in Poland. This is a separate concept from source code copyright.
Maybe your comment is somehow relevant in the USA but I am suspicious.
Ignoring all the other problems, the fact that the verification options are hidden in the UI on purpose is extremely harmful, and I’ve been ranting about it for years. Instead they should make this a prominent feature, and also make it easier to verify someone by using a trusted third party that you have both verified. On top of that, the verification number is not a real hash: as far as I can tell it is imperative to check the entire number, not just the beginning or the end of it, which obviously is a huge problem, since I personally just checked the first couple of groups of numbers before I learned that they can stay the same. I think this is a serious problem as well.
The reason some stay the same is because the safety numbers are just a concatenation of your hash and the other party’s, so the overall number is the same on both devices. That simplifies the UX because users have to do one comparison that works both ways instead of two that work one way. Clearly though the fact that you made this mistake means there’s more work needed to improve the UX.
I do think the QR code scanning helps a lot with that, if you can use it.
Either way, you can make this a single comparison, for example by always appending the larger number to the smaller one before hashing them. This is not an argument.
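A minimal Go sketch of that idea (hypothetical; Signal’s actual safety numbers are derived differently, as described above):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// combined derives one shared comparison string from two per-party
// fingerprints by ordering the inputs before hashing, so both sides
// compute the same value regardless of who is "first".
func combined(a, b []byte) string {
	lo, hi := a, b
	if string(lo) > string(hi) {
		lo, hi = hi, lo
	}
	sum := sha256.Sum256(append(append([]byte{}, lo...), hi...))
	return fmt.Sprintf("%x", sum[:8]) // shortened for display
}

func main() {
	alice := []byte("alice-identity-fingerprint")
	bob := []byte("bob-identity-fingerprint")
	// One comparison works both ways:
	fmt.Println(combined(alice, bob) == combined(bob, alice)) // true
}
```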
Global rate limit of 100 RPS by IP
The bane of anyone using a VPN/connecting from a university campus etc.
This article actually made me consciously aware that I habitually hover my mouse cursor over a link before clicking it, to see what the pop-up at the bottom of the screen says is the real URL. I did this before I clicked the link, like I have apparently trained myself to do with any link over many years of browser usage, and immediately noted that the link was to a different page on this person’s site rather than the wikipedia article. I thought for a moment that the “real” trick might be faking the text in that pop-up, but that wasn’t it.
I do this as well, but you can bypass this with JavaScript: a user will see a different link on hover than the one you navigate to after clicking it.
When I’m feeling particularly distrusting (i.e. by default, when not on a site I’m pretty confident won’t be engaging in such sleaziness), I often opt for a right-click (or perhaps even a shift-right-click as I recently learned can be used to bypass right-click interception fuckery), copy the link, and paste it into a new tab as an extra line of defense.
bit of an odd example because go test has a repeat flag; you can do go test -run TestWhatever -count=1000
And how is what is written in the article different than a 3 line bash script with a loop in it? Why do I need rr? What advantage does rr bring to the table? What is it used for in this article that a bash script can’t solve? Those and many other questions are left unanswered I am afraid.
Because it is actually reproducible, or at least closer:
Remember, you’re debugging the recorded trace deterministically; not a live, nondeterministic execution. The replayed execution’s address spaces, register contents, syscall data etc are exactly the same in every run.
The rr website explains it pretty well https://rr-project.org/
You record a failure once, then debug the recording, deterministically, as many times as you want. The same execution is replayed every time.
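Concretely, the workflow is something like this (a sketch; run flags omitted):

```
rr record ./flaky-test   # run under rr until a failure is captured
rr replay                # debug that exact execution in gdb; every
                         # replay is instruction-for-instruction identical
```

A bash loop can reproduce a failure rate; it can’t hand you the same failing execution twice.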
Why is it used then? Put your money where your mouth is and don’t use it.
The thing is, you aren’t stuck with the poor visual accessibility of Medium’s site design. Virtually every Medium article renders great in Reader Mode on Safari or Chrome, which lets you choose the font, font size, and contrast pretty easily.