My biggest question on this is whether the TOTP key database is hashed along with the user passwords. Does anyone know? Elliot’s assertion is that these cannot be hashed, but if you’re already running the user’s password through a hash, why not run the same hash to recover the TOTP key? Of course, that means the unhashed TOTP key would remain in the clear while waiting for the user to input the 2FA value (or else the user would need to re-enter their password if they miss the 2FA window). Anyone have experience with this?
2FA only really matters if your password is already owned. So if you’re deriving the TOTP secret from the password, it’s not much better than storing it in the clear, right?
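To make that concrete, here is a minimal HOTP/TOTP sketch (RFC 4226/6238). The password, the SHA-256 derivation, and the variable names below are all hypothetical; the point is only that a secret derived from the password is computable by anyone who already has the password, so the second factor adds nothing:

```python
import hashlib
import hmac
import struct
import time

def hotp(secret, counter, digits=6):
    # RFC 4226: HMAC-SHA1 over the big-endian counter, then dynamic truncation.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return code % (10 ** digits)

def totp(secret, t=None, step=30):
    # RFC 6238: HOTP applied to the current 30-second time window.
    if t is None:
        t = time.time()
    return hotp(secret, int(t) // step)

# If the TOTP secret is merely a hash of the password, the "second factor"
# collapses: an attacker who has the password derives the same secret and
# computes the same codes the server expects.
stolen_password = b"hunter2"  # hypothetical
derived_secret = hashlib.sha256(stolen_password).digest()
attacker_code = totp(derived_secret)
```

Encrypting stored secrets under a separate server-side key (rather than deriving them from the password) avoids this, at the cost of key management.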
Here’s what I gather from the released documents:
The FBI started with the suspect and worked backwards. This isn’t a situation of PureVPN giving up customer IP addresses.
I would be surprised if you can pull even transferred packet counts off an iPhone.
Can you? Does anyone monitor that?
No need to rely on the phone. Connect to wifi, run tcpdump on router. Read the dictionary aloud until a matching ad appears.
If Instagram uses TLS, you’ll need to jailbreak the phone, install your own CA, issue certs for Instagram/Facebook, and MITM the encrypted traffic.
Sure, to be able to see the actual data. But even without decrypting, it would be meaningful to see that traffic is going to Instagram servers when the phone is locked.
Doubt you can do it without jailbreaking, but you should be able to at least pipe everything through Charles and get a sense of network activity from there.
This class of issues is where I find most of my performance bugs when I use Rails, et al.
It’s just not obvious how the ORM interacts with the database. Which is nice as an abstraction, but it causes all sorts of inefficient behavior to arise (N+1 queries, memory bugs, overeager loading, etc.)
This is disappointing.
With an automated, zero-cost CA, there are very few legitimate cases for wildcard certificates, and the risks increase with their use.
I don’t understand why LE couldn’t simply allow for higher thresholds on certificate issuance, and instead support certificates that are actually a worthwhile goal: free S/MIME that doesn’t involve suckling at the Comodo teat.
The biggest use case for wildcard certs is SaaS. If I have 10,000 SaaS customers with hosted domains like customer.example.com, LE wouldn’t want to issue (and renew!) that many certs. It also may exceed their rate limiter.
LE creates SAN certificates, which let you group together multiple domains under one certificate. So you can use LE for a SaaS product like this if you’re clever about automatically grouping domains together. See: https://letsencrypt.org/docs/rate-limits/
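For instance, batching hosted domains into SAN groups is trivial; `group_for_san` and the example domains below are hypothetical, and the 100-name limit comes from the linked rate-limit docs:

```python
def group_for_san(domains, max_per_cert=100):
    # Let's Encrypt allows up to 100 names on one certificate, so batch
    # customer domains into SAN groups of at most that size.
    return [domains[i:i + max_per_cert]
            for i in range(0, len(domains), max_per_cert)]

# 10,000 hosted domains become 100 certificates instead of 10,000.
groups = group_for_san([f"c{i}.example.com" for i in range(10000)])
```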
I know that LE can support up to 100 domains in the same certificate with SAN certificates. But I feel like the complexity implied by grouping domains together is not worth the few hundred bucks of a wildcard certificate.
I do like the option when it’s there. For example when SNI is not available and you are running low on IPs.
The main concern is phishing.
If you look at your URL bar and see a green lock next to https://www.paypal.com.mysite.biz/login.php, you’re a lot more likely to log in.
[Comment removed by author]
I agree. If you can prove you own the domain, shouldn’t you be able to call your domain whatever you want and get a certificate for it?
So the real risk, it seems to me, is in the way you show that proof. If the CA asks for this proof in a way that’s not secure, that to me would be a problem.
You may be interested to know that browsers limit wildcard certs to one level deep, for this reason.
What does this risk have to do with phishing?
In any event, the CAs aren’t the right place to solve phishing, services like SafeBrowsing are.
I like supporting wildcards but I do wish they’d dramatically increase the rate limits and decrease the suspension time. Getting banned for a week after a fuckup or bug is nuts.
You can do this with nginx using the built-in gzip_static module.
If you want to protect /blah.txt, create the empty file blah.txt and store your zip bomb as blah.txt.gz. Make sure to add "text/plain" to gzip_types, so that nginx knows to serve the compressed version.
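A minimal sketch of that setup (paths and filenames hypothetical):

```nginx
# /var/www/html contains an empty blah.txt plus blah.txt.gz (the bomb).
location = /blah.txt {
    root /var/www/html;
    gzip_static on;          # serve the precompressed blah.txt.gz when the client accepts gzip
    gzip_types text/plain;   # per the suggestion above
}
```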
It’s not built by default. There’s really no need for it in this situation, though — you can just serve the .gz files directly for certain locations, perhaps tweaking some headers to explicitly indicate the encoding.
If anyone here is interested in this sort of thing, you’ll find r/mechanicalkeyboards interesting.
Typo: r/mechanicalkeyboards
Since the rule is almost entirely phonetic, this is actually quite common among people who have mostly read and written English and not spoken it much or at all.
There could also be potentially valid cases for paypal in a domain, like “dont-use-paypal.com” or “paypal-is-awful.com” or something similar. Someone may even want one of those domains to have perfectly valid SSL too.
OpenBSD developer uebayasi@ once had problems signing up for a web site that complained he shall not use a username containing ‘ebay’.
Yes, but the idea of having trusted CAs is that there should be someone taking a look at these before approving the certificates.
How could a CA tell? What’s to stop me from creating paypal.mysite.com, filling it with blog posts critiquing paypal’s business practices, and then once I get a cert for it, replacing it with a phishing site?
All a CA can do is verify that the person requesting a cert for a given domain name controls the domain. Constantly auditing the ‘worthiness’ of a cert-holder’s domain would be an enormously impractical burden.
[Comment removed by author]
The “major CAs” that the article talks about will gladly issue me a certificate for wildcard subdomains. There is nothing new about this problem with Let’s Encrypt. A cert for, say, *.*.mysite.com will let me serve paypal.com.mysite.com.
The new part is that it’s free and easy to automate. I still think this isn’t Let’s Encrypt’s problem.
Indeed, wildcard certificates are only supported for the first-level subdomain. See, e.g., RFC 2818, section 3.1:
Matching is performed using the matching rules specified by [RFC2459]. If more than one identity of a given type is present in the certificate (e.g., more than one dNSName name, a match in any one of the set is considered acceptable.) Names may contain the wildcard character * which is considered to match any single domain name component or component fragment. E.g., *.a.com matches foo.a.com but not bar.foo.a.com. f*.com matches foo.com but not bar.com.
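The quoted rule can be sketched label by label. This is a rough illustration only (fnmatch also treats ? and [ specially, so it is not a real certificate verifier):

```python
import fnmatch

def wildcard_match(pattern, hostname):
    # RFC 2818: '*' matches a single domain name component or a fragment
    # of one; it never spans a dot.
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # *.a.com has 3 labels, bar.foo.a.com has 4: no match
    return all(fnmatch.fnmatchcase(h, p)
               for p, h in zip(p_labels, h_labels))
```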
Of course, you can just buy a certificate for *.com.mysite.com, which lets you do the same thing.
You pose that as if they’re equivalent, but it’s not really the case, is it? Domain registrars have issued invalid certificates used for evil, yes. But is it quite so prevalent as with LE?
The point jcs is trying to make, I believe, is that the domain registrars allowed someone to register the domains in the first place. For DV (domain validation) certificates, it is only verified that you own the machine at the IP address the DNS for that domain points to.
You have to first install a program on the computer, then you have to have a detector with line of sight to the HDD LED. I suppose this would make a good plot for a TV episode, but at this level of intrusion, why not monitor the user’s screen?
Years ago there were a lot of modems and network devices (routers, etc.) where the Tx and Rx LEDs (or equivalent) had sufficiently fast response times that you could actually read the data going through the device [cite]–no special software required.
That’s one of the neatest side-channel attacks I’ve seen. Intercepting data just by looking at the device.
The novelty is in leaking the information covertly with some sort of control. Observing a user’s screen isn’t exactly the same.
Let me see if I understand correctly: SQL is insecure because mainstream programming languages don’t have good interfaces to SQL databases?
Bad programmers will write bad code using any tools, but that’s really beside the point. OP argues that SQL is insecure because:
Raw SQL strings, prepared statements and ORMs are all interfaces between databases and programming languages. Unfortunately, none of them is perfect:
The real problem is types. Most programming languages don’t have sophisticated enough type systems to model the operations of relational algebra. (But some do!) In particular, nominal types don’t help. For example, if you have two classes Customer and Order, there is no type-level operation that can produce a third class corresponding to select * from Customer C join Order O on C.CustomerID = O.CustomerID. This is a very sorry state of affairs, and it absolutely isn’t SQL’s fault.
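To illustrate with hypothetical classes in Python: the joined shape has to be declared and populated by hand, and nothing in the type system derives it from Customer and Order or checks it against what the SQL join actually returns:

```python
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int
    total: float

# Hand-written stand-in for the result type of
#   select * from Customer C join Order O on C.CustomerID = O.CustomerID
# No type-level operation produced this from Customer and Order.
@dataclass
class CustomerOrder:
    customer_id: int
    name: str
    order_id: int
    total: float

def join(customers, orders):
    by_id = {c.customer_id: c for c in customers}
    return [CustomerOrder(o.customer_id, by_id[o.customer_id].name,
                          o.order_id, o.total)
            for o in orders if o.customer_id in by_id]
```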
Prepared statements are a chore to use and they don’t buy you that much security, because you’re still manually supplying strings.
How so? The strings you do supply to prepared statements are incapable of changing the pre-prepared parse tree, which is the big insecurity of smashing random strings together.
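A self-contained demonstration using Python’s stdlib sqlite3 (any parameterized driver behaves the same way): the hostile string is bound as a literal and cannot alter the already-parsed statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

# The statement's parse tree is fixed before any user data arrives;
# the placeholder value can only ever be bound as a literal.
evil = "x'); DROP TABLE users; --"
conn.execute("INSERT INTO users (name) VALUES (?)", (evil,))

# The table still exists, and the hostile string is stored verbatim as data.
rows = conn.execute("SELECT name FROM users").fetchall()
```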
I agree I’d rather use an ORM with correct-by-design types, though.
If a new maintainer of your software needs to add a field of a weird type to the query, will they learn how to add that flavor of field to the prepared statement, or will they interpolate a string in just this once?
I know what I would do with Go+Postgres: add a type conversion from string $1::json (etc) in the SQL, then marshal the data to string right before the query and give a string to the driver in Exec() or Query().
I’m actually not sure what the “correct” way to do that would be. One that popped out from the documentation is to implement sql/driver.Valuer on a local typedef or something like that. But that’s a massive pain in the behind and also depends on driver internals.
That isn’t really true. Ur/Web rules out invalid queries at compile time. But this requires two things:
Oh you’re talking about the strings submitted for the prepared statements, not the user input filling in the ?s. I misunderstood and was talking about runtime input.
Yes, I was primarily talking about the strings submitted for the prepared statements. However, even the user input filling in the ?s is often less statically checked than it could be. Will Java’s type checker complain if you attempt to supply an int where the database would expect a varchar? Ur/Web’s will.
How exactly can you argue in quantitative terms the difficulty of using prepared statements?
Isn’t the difficulty of a thing somewhat subjective?
This whole post seems like… satire
Nobody guarantees that the result of preparing a statement will be meaningful according to the database schema. That’s the difficulty.
Contrast with Ur/Web, where the type-checker makes sure that your SQL statements make sense.
You beat me to it. I was going to add Opa language, too, as it raises the bar vs common options. One could throw in memory-safe languages like Component Pascal or concurrency-safe languages like Eiffel or Rust. Like the web languages, these simply don’t allow specific classes of problems to occur unless the developer goes out of their way to make it happen. Always good to design languages to knock out entire classes of common problems without negative impact on usability if possible.
The real problem is types. Most programming languages don’t have sophisticated enough type systems to model the operations of relational algebra.
Types, yes. Type systems, no.
Q has tables, and operations that work on tables. There’s no reason a lesser language like PHP couldn’t do this, it’s just that PHP programmers don’t do this.
Prepared statements are a chore to use and they don’t buy you that much security, because you’re still manually supplying strings.
If you move your authentication into the database (like with row-level security) then your web-layer can simply authenticate against the database and run the prepared queries like an RPC. The biggest problem I see people have with prepared statements is if they have inadequate tooling and don’t invest in it. (Migrations are a dumb and painful way to program, and while commercial offerings are much better, open source is very popular)
My day job is to maintain a rather large ERP system. You know, the kind where the typical table has 40-50 fields and the typical primary key has 5 fields. The kind where people are afraid of altering existing tables, because who knows what queries might be affected, so they create another table with the same exact primary key, whose rows are intended to be in 1-to-1 correspondence with the original table, even though that will only make things harder in the long run and we know it.
This tremendous amount of pain is the price of the lack of coordination between language and database. If there were an automatic, convenient way to determine what parts of our application have to be changed in response to a given change in the database, I estimate that we could be twice as productive, while at the same time creating less technical debt. This is precisely the problem type systems solve.
This tremendous amount of pain is the price of the lack of coordination between language and database.
I’m not disagreeing with that: Having the business logic in the same language as the database is another way to obtain that coordination, and it offers far more benefits:
A large amount of pain is had in synchronising the continuous single history of “the business database” with the many-branches of modern software development. Building directly on top of the database, and solving the problems that you need in order to do that eliminates pain that you never thought possible, like writing migrations or having to maintain test databases. A type system doesn’t help me get there.
I estimate that we could be twice as productive
Using the same language for your database and your application wins much more than 2x. I would say it wins 10x or even 100x.
The kind where people are afraid of altering existing tables, because who knows what queries might be affected, so they create another table with the same exact primary key
Really the goal should be to have the data in the correct shape. KDB is column-based, and column-based data stores are useful here because you don’t usually want to alter the existing table. You want to hang another column on there, or you want another rollup/index somewhere. That’s cheap (microseconds) in KDB.
Having the database contain your program also means you can easily do analytics on which queries touch which columns, which increases bravery significantly (and safely!).
My day job is to maintain a rather large ERP system.
I have a similar database, although in addition to those fat business data tables that are ingested from a bunch of Oracle/Siebel databases, it also contains very tall analytics data growing at a rate of around 300m web events per day and around 60k call records per day.
KDB also has the advantage of being quite a bit faster than other database engines, so it wouldn’t surprise me if I’m dealing with more data than you.
If you don’t know KDB/Q, you should look into it. Ur/web+postgresql is great, but it has nothing on commercial offerings.
A large amount of pain is had in synchronising the continuous single history of “the business database” with the many-branches of modern software development.
Right. We need a notion of “time-evolving schema”, allowing new data to have a different structure from old data, while at the same time allowing queries to be meaningful across schema versions. As far as I know, that problem hasn’t been satisfactorily solved yet.
Building directly on top of the database, and solving the problems that you need in order to do that eliminates pain that you never thought possible, like writing migrations or having to maintain test databases. A type system doesn’t help me get there.
You piqued my curiosity. Let’s say you have a language where tables are first-class values. Altering the structure of a table amounts to changing its type. (As opposed to inserting, updating or deleting rows, which amounts to constructing a different value of the same type.) How do you validate that every part of your application that depends on this table is compatible with the new version, without type checking?
KDB is column-based, and column-based data stores are useful here because you don’t usually want to alter the existing table.
This is a physical implementation detail. I don’t want to worry about that.
KDB also has the advantage of being quite a bit faster than other database engines, so it wouldn’t surprise me if I’m dealing with more data than you.
I’m not too worried about the amount of data I need to process. I’m worried about the complexity of the logical constraints the data must satisfy in order to make sense. Logical errors can manifest themselves even with modest amounts of data.
If you don’t know KDB/Q, you should look into it.
I will.
Right. We need a notion of “time-evolving schema”, allowing new data to have a different structure from old data, while at the same time allowing queries to be meaningful across schema versions. As far as I know, that problem hasn’t been satisfactorily solved yet.
Tooling can help a lot, though, and may be good enough. There is commercial tooling (like Control for Kx) which is basically an IDE for your database, complete with multi-user version control. It has the disadvantage of being an online tool, but it provides hints of what the correct solution might look like to me.
This is something I’ve been thinking about for a while.
Let’s say you have a language where tables are first-class values. Altering the structure of a table amounts to changing its type.
However adding a column doesn’t affect code that doesn’t use the column.
How do you validate that every part of your application that depends on this table is compatible with the new version, without type checking?
Static analysis remains possible without type systems provided you don’t learn column names from the network (and if you do, your type system would be incomplete anyway).
This is a physical implementation detail. I don’t want to worry about that.
I know you don’t, but removing abstraction reduces program size (and therefore bugs), and increases program speed so much that I think it’s often worth thinking about the fact that we are meat programming metal. Bugs mean fixes, which is programming we didn’t plan for, and slowness generates heat that harms the environment. And so on.
If you want to change the type of a column from a 64-bit unix-seconds timestamp to a 32-bit time and a 32-bit date (KDB has native date types, btw), you have to decide:
And so on. These are real considerations that affect a real system. If we could only sit in our purely-software universe and have enough abstraction, we could make our decisions on what makes better software (asking for a date and getting a date is probably better than doing arithmetic on seconds – and what happens when the calendar changes, anyway) but someone has to solve them, and unfortunately a type system doesn’t actually solve these problems.
A type system only helps with the same part of the problem that tooling solves: static analysis can find the code, and having a real table “type” means you just use a couple of in-memory copies of some rows that you, the programmer, believe are representative, which then form your tests for regression tracking.
However having views and a real table type (i.e. doing the database in your programming language) means (performance) testing is easier, there’s a migration path for the data, and you’ll have a good handle on what the real user-impact is.
I will.
Awesome. It is not easy to get into without a commercial need, but the #kq channel on freenode contains people willing to help answer questions. It’s not as high-volume as #ocaml so you might have to wait for the earth to turn and someone in the right timezone to wake up :)
That’s a good statement of an important point. If the simplest, most obvious way to use a tool isn’t secure, we must consider the system fundamentally insecure, because that’s what will happen in practice. The programmer’s UX of security concerns is vital.
That seems mostly reasonable to me. Using SQL in PLs where the default way to use it is by passing in ordinary strings that contain code is indeed insecure. Imagine if mainstream PLs had us defining and calling functions by calling eval() on strings all over the place: I would expect that to lead to terrific quantities of horrid security problems too. I’d argue that passing a string to sqlite3_exec() or mysql_query() or PQexec() is just as scary as passing a string to eval(), because RDBMS query languages are either powerful enough to execute arbitrary code or complicated enough to inevitably have bugs that can be leveraged into arbitrary execution.
I’ve seen an interesting alternative in one of C J Date’s older books, “An Introduction to Database Systems”. He has examples of relational queries embedded directly into a language that looks like PL/1, where the queries are actually fully parsed at compile time. I think they had all the niceties, like references to ordinary lexical scoped variables in the queries turning into code that does all the correct binding at runtime and everything.
I’m thinking that one could make a much safer language be just as convenient as doing broken string concatenation is in current PHP, by using quasiquoting, reader macros or just straight up embedding SQL’s entire grammar into the PL’s own grammar in an expression context. I’d identify “PHP with mysql_query() replaced by quasiquoting” as a safer PL than “current PHP”.
Another strategy for making SQL injection harder to write by accident that I’ve seen is in the postgresql-simple library for Haskell. The query execution functions accept a string-like type called Query for which there is an IsString instance, so you can switch on the OverloadedStrings pragma and write code like execute connection "INSERT INTO dogs VALUES (?, ?);" (name, cuteness) — so the correct, parameterised-query pattern is easy and convenient to write. At the same time, the incorrect string-concatenation code is still possible but much less convenient, so you’re much less likely to write it. While you can build Query objects from strings, the syntax to actually do that is longer and involves looking up more stuff than the syntax for putting parameters in your queries.
IIRC there are also quasiquoters that let you write that as something looking like [sql|INSERT INTO foo VALUES (${name}, ${cuteness});|] as an expression and automatically turn that into the above parameterised query.
In all of the above, anywhere I refer to “PHP” you may instead read “any PL in which you use SQL by passing an ordinary string to a function or method with query or execute in the name”, i.e. very nearly all of them. PHP only does slightly worse than average here because mysql_query() comes bundled with the runtime but you have to install an ORM on purpose, whereas plenty of other PLs come with neither SQL bindings nor an ORM so it’s almost equally difficult to install the ORM or the raw SQL binding.
Imagine if mainstream PLs had us defining and calling functions by calling eval() on strings all over the place: I would expect that to lead to terrific quantities of horrid security problems too.
This gets to the heart of my position. Very well said.
Unfortunately, like any end-to-end encryption scheme, this is still insecure, since the author deploys it on potentially Internet-connected computers. The computer will be hacked, a keylogger installed, and plaintext recovered. It may also have the property of Perfect Forward Interception, where future messages are breached via a rootkit. The necessary modification comes from high-assurance security: a data diode. A one-way link. It will allow the 32-byte packets to leave the encrypting machine, but no attacks can come back in. Preferably an optical one, with the two ends at a specific distance to eliminate electrical-level attacks. With the two computers in separate Faraday cages running on batteries to cover some other attacks. :)
Two computers in separate space observatories, point the telescopes at each other and use maritime signalling.
I mean, none of this protects metadata privacy at all… with the space observatories, it’s incredibly obvious who’s talking to who. :)
Such a risk might be mitigated with the observatory equivalent of mix networks, like we did with email. That means they gotta constantly look at each other plus random ones. All of them participating in even one shared message do this with a secret timing, ensuring they look at the right time. They’d probably be out of service due to mechanical failure most of the year.
My old method of using infrared at drop locations is starting to look better. Just got the idea to modify flood lamps or street lights to encode the message in flickering. It can be obvious with timing looking like a broken light or imperceptible flickering like the old LED side channels.
This is still potentially insecure. One station might use its signaling ability to start running IPoMS (IP over Maritime Signaling), and end up getting hacked.
Taking the first 4 bytes and just passing them straight into malloc? That’s terrifying, I hope this isn’t facing the open internet.
This post is garbage. Don’t learn 3 as a beginner? Fine, I’ll agree with that. Don’t use it at all? I don’t know about that.
Unicode/byte strings are annoying, sure. Learn how they work and move on. People break non-latin strings constantly, but when they’re using unicode strings by default it happens much less often.
Complaining about running py2 bytecode in the py3 VM? How often do you run into that problem?
At this point there are more 3-only libraries than 2-only libraries, and that’s only going to get more so. Beginners are better off starting with 3.
I thought the HTTP-layer cacheability of GET requests was one of the features of a properly working web API!
Passwords are pretty insecure. Most people reuse them. Compromising their password in one place (eg. Neopets) will allow you to assume their identity on most other services (Google, Facebook, Amazon, maybe a bank).
Is it passwords that are insecure or the person behind the password? If you use a password manager that lets you generate strong and unique passwords, then I don’t see an issue. Then if a site gets compromised you just need to generate a new password for that site and not worry about the others you may have used the old password on.
At the end of the day, it comes down to making a conscious effort to be smart about how you maintain your online identities.
From a policy perspective, there’s not much difference between “this cannot be used securely” and “this is not used securely”. The net result in either case is poor security, and bemoaning the fact that people don’t pick secure passwords doesn’t solve the problem.
Obviously from a personal perspective, there’s a lot you can do to maximize the security of your passwords (starting with generating distinct long, random passwords for everything and keeping them in a password manager). But your good password practices don’t really matter to Google, or anybody else using those passwords to authenticate you.
Many security issues will be closely linked in some way to people. If your security mechanisms can’t account for the soft exploits, it’s an insecure system.
The problem is that most people don’t do this, and there’s only so much a site can do to encourage its users to do this. In the end, it can’t tell if your password is used elsewhere, or from a password manager, or anything else. What they can do is look for some method of authentication that avoids or augments the password, to (hopefully) provide a greater degree of security by default.
Passwords are absolutely broken.
LinkedIn was hacked 4 years ago, 164 million accounts compromised, and we only found this out in the past month?
Https? There’s no way to ensure that it’s even set up properly. The DROWN attack and heartbleed are both great examples. https://thehackernews.com/2016/03/drown-attack-openssl-vulnerability.html
Depending on any multi-use token for authentication should be considered poor security.
Https? There’s no way to ensure that it’s even set up properly.
I’m not sure what this has to do with passwords.
If this is a problem for passwords, isn’t it also a problem for biometric data sent to a backend?
Edit: in fact, all biometric data is “reused”, so wouldn’t that be even worse because once someone captures whatever data your fingerprint is turned into they can use that with any system that uses the same type of data?
[Comment removed by author]
KoreLogic’s analysis of the LinkedIn hash dump shows some interesting issues with the use and generation of passwords.
I’m not sure what the solution is - but for me passwords are part of the problem.
Here’s the thing: You’d have a similar problem with storing biometric data. Biometrics are strictly equivalent to a reasonably strong password, from the server’s perspective.
They’re quite different from the user’s perspective, but I’d argue that they’re a step backwards because they’re immutable.
Wasn’t the hack acknowledged way back in 2012? The new thing you’re hearing about is that the hacked passwords are finally being used.
Depending on any multi-use token for authentication should be considered poor security.
What would you do instead, in the context of a web application that needs to authenticate a user?
Are there performance benchmarks for this? Writing something in C doesn’t necessarily mean it’ll run faster than if it were written in, e.g., LuaJIT, especially if the C code isn’t written with cache efficiency in mind. And GLib is (or was, the last time I used it) very pointer-fetchy.
Hopefully it’s a lot faster than “hundreds of requests per second”, the stat touted on the site.
The author says he is planning to publish some benchmarks.
People are using Lua to serve web apps. I wonder if the performance hit is bad enough to lead to a DoS.
I think this came up before, and the consensus was that there are many other DoS avenues; the best solution is probably some sort of front end that filters stupid requests.