I think it’s kind of funny that they include the antivirus updates when their service is probably responsible for the viruses.
It’s not clear if the experiment broke ethical or even legal boundaries, since it relied on confusion if not outright deceit to trick people into installing something other than what they intended to install. Still, the lesson the experiment imparts is worth heeding.
I’m not sure I understand what “ethical or even legal boundaries” they are implying were broken here. The article doesn’t go into detail about what the script he wrote does; if it was something malicious, that would make more sense. But if I read it correctly, he basically wrote a script that shows a warning message telling the developer about their mistake and pings home to register the download in order to see how large the attack vector was. Am I missing something, or is the article trying to make things sound way more interesting than they really were?
I think the article is just being bombastic. I don’t see anything unethical about it, it’s basically how all computer security research works.
That being said, judging from the recent CFAA cases, it probably would be considered illegal by US law. It’s a good thing the student lives in Germany and not the US, or they might be looking at jail time (especially since the package infected .mil domains).
I would argue that the specifics of what he did were both unethical and illegal. Illegal by the letter of the Computer Fraud and Abuse Act, as you mentioned (he certainly exceeded authorized access on the machines that downloaded his fraudulent packages, as the users no doubt had no expectation that downloading the packages would result in searching of their machine or transmission of data to some outside location). Unethical because his packages scanned the user’s machines, including command history, resulting in potential accidental disclosure of private information. I understand that he had some personal justification for this in the context of his research, but without permission (which would likely have had to have been given by the users when they first accessed the package manager, likely with some sort of credential system to track their having opted in to experiments that may expose personal information), this definitely seems like a breach of reasonable ethical practices in the security field.
Where does it say his program scanned the user’s machine? All I got from the article is that it logged its own invocations.
It’s on page 23 of the thesis for which this work was done. Here’s the quote listing what the fraudulent packages collected and transmitted back to the university machine. Note that all data was transmitted unencrypted over HTTP as the query string of a GET request.
- The typosquatted package name and the (assumed) correct name of the package. This information was hard-coded in the notification program before the package was distributed. Example: coffe-script and coffee-script (correct name).
- The package manager name and version that triggered the operation. The package manager name was also hard-coded, before the package was uploaded. The package manager version was retrieved dynamically. Example: pip and the outputs of the command pip --version
- The operating system and architecture of the host. Example: Linux-3.14.48
- Boolean flag, that indicates whether the code was run with administrative rights. Getting this information on Windows systems is not trivial and possibly error prone.
- The past command history of the current user that contains the package manager name as a substring. This information could only be retrieved from unixoid systems, because Windows systems do not store shell command history data. Example: Output of the shell command grep "pip[23]? install" ~/.bash_history
- A list of installed packages that were installed with the package manager.
- Hardware information of the host. Example: Outputs of lspci for linux. On OS X, the outputs of system_profiler -detailLevel mini were taken.
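To make the command-history item concrete, here is a minimal sketch of which lines that grep would select. It assumes the pattern is read with extended-regex semantics (as with grep -E), and the history contents are invented for illustration; `matching_history_lines` and `SAMPLE_HISTORY` are hypothetical names, not anything from the thesis.

```python
import re

# Same pattern as the thesis example, read with extended-regex semantics,
# so "[23]?" means an optional 2 or 3 after "pip".
PATTERN = re.compile(r"pip[23]? install")


def matching_history_lines(history_text):
    """Return the history lines that the quoted grep would have selected."""
    return [line for line in history_text.splitlines() if PATTERN.search(line)]


# Hypothetical history contents, invented for illustration only.
SAMPLE_HISTORY = """\
cd ~/project
pip install coffe-script
sudo pip3 install --index-url https://user:secret@pypi.example.com/simple internal-tool
ls -la
pip2 install requests
"""

for line in matching_history_lines(SAMPLE_HISTORY):
    print(line)
```

The second matching line is the worrying kind: anything typed on the same command line as the install, such as a private index URL with credentials, would go along with it.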
users no doubt had no expectation that downloading the packages would result in searching of their machine or transmission of data to some outside location
Why do you say that? That’s what pretty much every ruby gem I’ve ever installed did. Digs around on my hard drive for a while, downloads some more pieces, compiles some code, runs some code, blah blah, finally announces it’s done.
I suppose I should have been more precise. The type of data collection the program did, in particular grepping the bash history of any Linux machine for commands containing the name of the package manager and then transmitting the result of that search back to a remote machine, is probably behavior the average user would not expect.
A Ruby gem that computes and installs dependencies is not even remotely the same thing as what happened here.
I absolutely do not expect installing a package or gem will scrape arbitrary information from my system and send it to an unknown third party, and I don’t think many people do expect that or think it’s okay.
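On the defensive side, the coffe-script / coffee-script example from the thesis is exactly one edit away from the real name, so a check before installing could catch it. A rough sketch, assuming you keep your own list of packages you actually mean to install; `KNOWN_PACKAGES`, `possible_typosquat`, and the distance threshold are all hypothetical, and no package manager is claimed to work this way.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or keep) a character
        prev = curr
    return prev[-1]


# Hypothetical allow-list; in practice this might come from a manifest or lockfile.
KNOWN_PACKAGES = ["coffee-script", "requests", "express"]


def possible_typosquat(name, known=KNOWN_PACKAGES, max_distance=2):
    """Return the known package that `name` nearly matches, or None."""
    if name in known:
        return None  # exact match: this is the package we meant
    for candidate in known:
        if edit_distance(name, candidate) <= max_distance:
            return candidate
    return None


print(possible_typosquat("coffe-script"))
```

A small threshold keeps false positives down, but short package names will still collide with each other, so this is a prompt to double-check rather than a verdict.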
Experience is likely to lead to specialization. I’ve spent a lot of time with Parsing Techniques: A Practical Guide, Salomon’s Handbook of Data Compression, Noble/Weir’s Small Memory Software, and various books about TCP/IP protocols and implementations in the last year, but for someone who focuses on machine learning, mobile, or front-end development, that list would probably be completely different. (I do mainly embedded & distsys stuff.) Learning programming is about learning problem domains as much as it is learning languages and libraries.
That said, Cristina Lopes’s Exercises in Programming Style is cross-cutting and thought-provoking. It takes a fairly straightforward task, shows a few dozen different ways the program could be structured, and compares the approaches' design trade-offs.
This tweet also comes to mind:
Advice I gave to a talented dev (already good at half a dozen langs) on which prog language he should learn next - “learn math or hardware” - @ravi_mohan
For learning more about hardware coming from a software background, I highly recommend the two Patterson and Hennessy books: Computer Organization and Design and Computer Architecture: A Quantitative Approach. The former is the introductory textbook on the subject, and the latter is a more advanced book.
Serious question, what do people here think is likely to be a better fit for a self-taught programmer, math or hardware? I’ve dabbled with both, both were fun. I currently work in the financial world, and like low-stress, high-paying jobs.
I want to complement my skills, but I suspect neither math nor hardware offers many career opportunities without a degree in math or engineering. Is this the case? I’ve only got a BA in CS.
For me (being self-taught), I hit a wall about ten years into my career where I realized I needed to learn a lot more math to go further as a programmer, and I’ve never regretted it. Both math and hardware are deeply related to all the hard problems in programming.
That’s orthogonal to career opportunities, though; for me, it’s more about mastery and finding ways to go deeper. If you want better career opportunities, I think learning about project management is a better bet.
I was wondering when Andrew’s thesis would show up here. If you like this, check out our recently published tech report on RocketChip, our open-source RISC-V SoC generator.
http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-17.html
I’m a Ph.D. student at UC Berkeley in the computer architecture lab. I worked with Andrew before he graduated and will be interning at his new company, SiFive, this summer. Happy to answer any questions about RISC-V or RocketChip.
As the article mentions, his “last lecture” at Berkeley will be Friday, May 6. The title of the talk will be “How to be a Bad Professor”.
It really is not that hard, instead of just searching through all available projects, to talk to some peers and see what mature options are out there, or to do the “research” yourself. If it is a good product and there is no documentation, and that’s a problem for you, write some. Don’t just pull some repository with version 0.0.3 and use it like a blithering idiot. It’s your product, and your product alone will fail, so it’s your responsibility to make sure the code you pulled in is sane, not the responsibility of some Joe Schmo who puts 3 hours a year into this side project.
I think the problem is with transitive dependencies, not direct dependencies. Sure, you can make sure all the libraries you use are high quality and well documented, but who’s to say that those libraries' maintainers did the same thing? It may just mean that you need to consider transitive dependency quality in the vetting process, although that would be a substantial amount of work in most cases.
If your project is going to be put into production, where uptime and security matter, you had better be sure that the transitive dependencies are not a weak link. Either that, or the main dependency is so popular that the issue will be resolved quickly, I suppose (lookin at you, React), but that is a bit like driving a Volvo without a seatbelt. Storing your dependencies with your code (or somewhere), and testing before merging new dependencies, might not be a bad idea. I think it’s a pretty strong given that on any platform, package manager, etc., catastrophic failure is possible, and we should be at least a little prepared for it, especially when it’s inexpensive to do so.
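To get a feel for how much vetting the transitive case adds, here is a minimal sketch that walks a dependency graph and separates direct from indirect dependencies. The graph and package names are made up for illustration, and `transitive_dependencies` is a hypothetical helper, not part of any package manager.

```python
from collections import deque


def transitive_dependencies(graph, root):
    """Breadth-first walk returning everything `root` pulls in, directly or not."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        pkg = queue.popleft()
        if pkg in seen:
            continue  # already visited; also protects against cycles
        seen.add(pkg)
        queue.extend(graph.get(pkg, []))
    return seen


# Hypothetical dependency graph: package name -> packages it depends on.
GRAPH = {
    "my-app": ["web-framework", "logger"],
    "web-framework": ["router", "string-utils"],
    "router": ["string-utils"],
    "logger": [],
    "string-utils": ["tiny-pad"],
}

direct = set(GRAPH["my-app"])
everything = transitive_dependencies(GRAPH, "my-app")
indirect = everything - direct

print("direct:", sorted(direct))
print("indirect:", sorted(indirect))
```

Even in this toy graph, two direct dependencies pull in three more that nobody on the project explicitly chose, which is the part that tends to go unvetted.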
My point is, though: why are you putting it on the consumer to figure out whether your project is worth using? It’s also not always clear, from the consumer’s standpoint, that a project isn’t worth using until they’ve already invested some time in it. It’s usually not as clear-cut as zero documentation; often there is very limited or poor documentation (I’ve often cited a static site generator I used where, once you got past the very basic getting started, you were stuck just reading the code to figure out how to use it). It’s clear from looking through the issues that a lot of developers attempt to use these projects. So what, as an open source developer, do I gain by putting a project out there that, in many cases, wastes developers’ time or frustrates them, mostly due to a lack of documentation?
It’s always on the user to decide whether or not the project is worth using. The developer obviously cannot jump into the user’s brain and make the decision for them Inception-style. But there are generally some useful heuristics that you can use. How long has this project been around? Is it actively maintained? Does it have a relatively large userbase? Is it backed by a reputable tech company? None of these questions is difficult to find answers to.
If you put code out there without documentation and somebody decides to use it, you didn’t waste their time. They wasted their own time by making a poor decision.
The real reason for the redesign, of course, is to stymie those fake Uber drivers who hustle you at New York airports, holding up a fake ‘UBER’ sign and conning you with their sheer confidence and chutzpah. The new logo is a bitch to draw and will no doubt cause such con-men to rethink their mis-use of the Uber brand. Additionally, it has such little resemblance to anything related to Uber that prospective victims will pass by blissfully unaware of the great danger they have just avoided.
The thesis here is “just hire random people (regardless of coding ability) and pay them all $30k a year; software will happen,” right? Does that actually match anyone’s experience? Wouldn’t anyone who discovers that they’re decent immediately leave to make twice, three times, or four times more money? Wouldn’t you have an almost immediate dead sea effect in which the only people staying in your company were the people you hired for their “soft skills” who proved useless except for their pleasant small talk in meetings?
Well, they’re not going to stay at $30K a year. The proposal explicitly states increasing their salary as they advance in the program.
But yeah, I’m pretty skeptical this could ever work. First, because there are very few companies who can afford to spend two years training someone with no prior dev experience. Second, because it assumes you can teach anyone to code. Some people just won’t take to it. Third, because, as you said, you have no way to incentivize them to stay once they’ve completed their training and become much more employable.
One of the things that seemed strangest to me about this proposal was the complete neglect of the people already at a company. It takes a lot of work to get a senior developer in place without serious disruption in a small team. I can’t even imagine what bringing in multiple people without experience would do.
That said, I think it’s easy to imagine there’s something spectacularly unique about programming which precludes people who want to keep a job from improving at it. I really don’t think that’s true and it’s one of the reasons I shared this in the first place. It would definitely take a high degree of humility and patience to execute any part of this, but I think the main thrust of the post (open hiring to a broader pool of applicants) isn’t far off from something to work toward.
I would say that using a less mind-bogglingly complex ISA would reduce the likelihood of these issues. But having fixed so many stupid bugs in RocketChip over the last year, I’m not so convinced. There’s no way to make things completely bug-free. There’s just way too many weird corner cases.
Verification of FM9801 (2002) says the proof of the microprocessor design was checked by a theorem prover, so it’s apparently possible. Has there been more progress on this front?
I think one issue is: how do you verify a multicore processor? Sure, for a single-core CPU, you can generate a formal spec from the ISA manual. But how do you specify the myriad interactions of a multicore system?
https://media.ccc.de/v/32c3-7171-when_hardware_must_just_work is directly relevant to this question. Formal proof is only a part of cpu design.
Working on posters for our lab’s winter retreat. This is when we showcase the work we’ve done over the last semester to our government and industry sponsors so that they know their money is being put to good use.
Hey, that’s awesome. I’m on the RISC-V team at Berkeley, and we’re trying to move towards un-tethered systems as well. I won’t be able to attend the RISC-V workshop, but some of the other students will. I think it would be great if you could share some of your experience with them so we know what pitfalls to avoid.
So far, we have most of the required hardware features like MMIO and a hardware device tree implemented. From here on, it’s mostly implementing peripherals and updating all the software.
It’s a shame you can’t make it. Both Wei Song and I are flying over for the workshop and presenting - Wei will give more detail on the untethering work. It will be good to catch up with your colleagues.
Our current implementation uses a separate I/O bus for simplicity and to minimise the invasive changes to the Rocket codebase. It seems like your MMIO implementation may have sorted most of the bugs now, so we may want to move over to that.
We finished our computer architecture class project last Friday. This week, I’m merging some of the changes back into the master branch of Rocket Chip. Unfortunately, we weren’t very careful managing our branches, so it’s going to take some git magic and careful validation to put together a decent pull request.
This article is probably OT for lobste.rs, but personally I am interested in this kind of content (anyone up for a lobste.rs clone for this kind of alternative economy stuff?). This article is toothless, though; as many words as it contains now, it needs to be prefixed with that many again to explode terms from economic science like ‘economic growth’ and ‘income per capita’ and ‘sustainability’ into the smaller, understandable, re-humanised concepts we need for the future: economic science cannot fix economic science.
Ok, so for anyone interested: http://aesi.news:3000 (DNS still updating, www. seems to have updated faster for me)
Would like to play with it for a few days in ‘dev’ mode as I am not an expert sys admin or rails person so this needs to be ‘shored’ up a bit quickly if it gets used. @irene @brinker @zem PM your email address here for invitation.
@jcs any tips for a fresh lobste.rs install?
Not really, it’s pretty self-contained. I hate external dependencies and dangling configuration files, so pretty much everything is in the git repo with the exception of the files listed in .gitignore. The nginx config is very typical for proxying to unicorn. There are two cron jobs, one that runs every 20 minutes to run rake ts:index and one that runs every 5 minutes to run script/mail_new_activity and script/post_to_twitter.
You know, I really like this model of “it’s easy to host your own” as a way to kind of spread the good things around and grow adjacent communities, without being Reddit.
Although subreddits have communities, a problem is as you get bigger, you succumb to reddit’s own hivemind, with its own problems.
Also note subreddits, for the most part, killed off traditional forums. It’s all under Reddit, instead of separate places, with their own style/accounts/etc.
Yes, essentially this. Subreddit moderators have a fair amount of autonomy, but are still working within Reddit’s larger policy framework. It means, for example, that the Lobsters invite-only policy wouldn’t make sense because it’s too large a user pool for that to mean anything. I don’t regard being part of Reddit as bad per se, it’s just a different choice.
I’d enjoy perhaps documentation in the repo about all this, speaking as someone who considered using the Lobsters code for a similar goal - perhaps there could be even an ecosystem around Lobsters “clones” focused on certain fields out of scope of Lobsters itself.
The code needs some reworking to abstract the remaining Lobsters-specific stuff in there, like wording in templates, icons, colors, etc. Most of the forks on Github are people using it for their own sites that have had to make a bunch of commits to re-brand it, but then they can’t rebase because of all their changes.
It was reasonably straightforward following the main readme. Had to fiddle a little with some dependencies, and using rvm for managing ruby (Ubuntu 14.04) seemed useful, as I had some problems with a dependency (nokogiri) that seemed to be related to using ruby 1.9.3, but using ruby 2.1.0 seemed to fix it. Other things outside the docs were setting up SMTP and creating content for About, etc., but nothing that took more than a few mins to figure out.
Agreed, that would be a very interesting forum to have but is most likely of limited appeal here - I was surprised to realize we have a finance tag, so it’s worth asking for general opinions, but to really discuss economics involves a lot of specialized knowledge and almost certainly wouldn’t go well with an audience not primarily interested in it.
/r/economics is one of the two subreddits I read without feeling strange for reading Reddit (one of about five I read overall :)), and I do recommend it if you’re not aware of it.
If you start such a clone, I’ll join it. The lobste.rs codebase and policies are definitely a good basis for that sort of thing.
What’s the other subreddit you read without feeling strange?
Also, I too would join a lobste.rs-like forum for the discussion of economics.
got an idea for a name for it? @brinker
If we want to stick with the plural animal name domain-hack type, I’ve come up with the following options, which are all potentially available according to Domainr.
Edit: Removed was.ps, as it turned out to not be a valid domain name.
.gs (dugon.gs, herrin.gs, and starlin.gs) looks to be as easy as a .com. I’ll check out the others.
Edit: .as (barracud.as, chinchill.as, cobr.as, echidn.as, and hyen.as) is easy but expensive. I’ve also removed was.ps, as it turned out not to be valid.
Given all of this, of the three it seems dugon.gs, herrin.gs, and starlin.gs are the best choices.
I use marcaria.com for most of my ccTLD registrations like lobste.rs, as they have agents in all of those countries to satisfy the local residency requirements. Pricing is quite a bit higher than normal domains because of that, though.
Ah, then that would change the list a bit, as I excluded any TLD with strict residency requirements from consideration. Anyway, it sounds like we’re going to punt on the name for now.
I just double-checked. Looks like dugon.gs, herrin.gs, and starlin.gs are all actually available for registration.
Looks like .gs is ~40EUR? I might start with a cheap throw away domain, see if some people use it and if they do then round up some people who want to share server/admin responsibility (all of us in this thread seem like initial candidates?) and rebrand at that point with a sweet strange animal name domain ;)
That sounds like a good idea - run the site on a boring .com or .org, for a while, first, and see whether it gathers critical mass. Naming discussions can take a while, anyway. :)
I have to decline your kind offer to administrate; mental health stuff periodically affects my SLA for that sort of task, so I’m not a good choice. But I hope other people on the thread will volunteer. :)
I like the lighthearted feel of starlin.gs. But dugon.gs would stick with the theme of marine life that doesn’t get a lot of positive attention. :)
You know, with all the new gTLDs being registered, I wouldn’t have been surprised if this actually did exist. But it looks like it doesn’t … yet.
Well, Assembler is becoming more and more irrelevant and people program in high-level languages with compilers and translators written in C.
I love RISC to be honest and am not a big fan of the amd64 instruction set. A new, “clean” approach to a 64-Bit architecture would be the “cleanest” solution, but I definitely understand why no one cares. If an architecture doesn’t bring anything new to the table, who would use it? If I came up with some revolutionary way of language design which somehow magically reduces the number of cache-misses, it may be adopted. But not just for the architecture’s sake.
This was a nice write-up on this matter. Good work by the author!
A new, “clean” approach to a 64-Bit architecture would be the “cleanest” solution, but I definitely understand why no one cares.
The main things RISC-V brings to the table are
Custom opcode space is a recipe for cruft as soon as people start using it. What’s the betting OSes end up having to support two different extensions that do the same thing with slightly different opcode syntax?
Midnight today is the submission deadline for ISCA. I’m co-authoring a paper with other students in my lab.
After that, it’s back to working on our computer architecture class project, which involves comparisons of various prefetching and DMA schemes.
There is no other language, for example, that is close enough to English that we can get about half of what people are saying without training and the rest with only modest effort.
Why would that be abnormal? There are a lot of language isolates. Just look at Basque, Hungarian, Korean, etc.
It’s indeed not abnormal. It seems to me that Greek and (modern Israeli) Hebrew are in the same situation.
Also, what’s a language, as opposed to a dialect? Are Scottish and Jamaican dialects really the same language, and Serbian and Croatian two distinct ones? The usual answer, “a Schprach is a Dialekt mit an Armee un Flot”, doesn’t seem to hold in case of Jamaica and Scotland.
And then apparently going even further and using contains. Seriously, how bad does your backend have to be to not handle “null” in any part of the string?
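For anyone wondering what that kind of check looks like, here is a guess at the pattern being complained about, next to a version that only treats a genuinely absent value as missing. Both function names and the sample inputs are hypothetical; nothing here is taken from the backend in question.

```python
def is_missing_naive(value):
    """The buggy pattern: any text containing "null" is treated as absent."""
    return value is None or "null" in str(value).lower()


def is_missing(value):
    """Only a real absent value counts; the text "null" is ordinary data."""
    return value is None


# Hypothetical inputs: legitimate strings that happen to contain "null".
for sample in ["Null", "Nullmeyer", "annulled order", None]:
    print(repr(sample), is_missing_naive(sample), is_missing(sample))
```

The naive version throws away real data as soon as a name or a free-text field happens to contain those four letters, while the second one keeps the distinction between "no value" and "the string null" intact.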
Just got to Shenzhen (深圳) and started exploring the streets, arts districts, museums and makerspaces. Easily the most exciting city I’ve visited in China…
If anyone has suggestions for places to check out for someone on a tech tourism kick, let me know!
Whoa, that’s pretty cool. How long are you planning on staying in China, and do you plan to visit any other cities?
Sorry that I missed your response!
I’m still wandering around the Chinese speaking world. Thus far I’ve visited Beijing, Shenyang, Tongliao, Chifeng, Tai'an, Shanghai, Shenzhen, Chengdu, HK, Tai(pei|chung|nan). I’m on a bit of a tech break with my girlfriend hanging out in the mountains of Taiwan right now. Should head back toward HK and southern China in a couple weeks though!
Do you have any suggestions? I’d still really appreciate them.
Posting this mainly for the part about the feline inebriation bug; “The original bug report is, ‘There’s cat vomit all over my tavern, and there’s a few dead cats'”. It starts at “It’s funny how I have popular bugs, right?”.
Sounds like a feature and not a bug to me.
Yeah, that would be some funny shit to have encountered. It’s novel, something to figure out, and a good story for friends later. One of the best combos. All the things that led to it also make for a more interesting intro to emergent behavior. I mean, still start with flocking, but make the next one “why is there cat vomit and dead cats all over my tavern in game?” Many different reactions will happen to that one that should all motivate greater understanding in their own way. :)