1. 19

  2. 5

    I have read for years about how evil email enumeration is… but guess what? I think the benefit of being able to tell users that they are using the wrong username, rather than a wrong password, outweighs any theoretical danger of revealing that a certain email is in use. Change my mind.

    1. 10

      I’ll take a stab at trying to change your mind. For some context, I’m a Penetration Tester by trade, and this specific topic is, in my opinion, a great example of a subtle risk with huge real-world impact.

      Username/email enumeration feeds two attack patterns:

      • Password spraying - Guessing one weak password across tons of accounts. It is like brute forcing, except you are looking for the email that has the weak password rather than the weak password for a given email.
      • Password “stuffing” (often called credential stuffing) - Taking a known compromised credential pair and trying to authenticate with it against tons of other services where it may have been re-used.

      For password spraying, there is only one thing I actually need: a username/email. In the real world I go from an External Network Penetration Test to internal network access ~80% of the time because of username enumeration and some strategically guessed passwords. Having a list of known usernames to target greatly reduces the number of guesses I have to make and ramps my accuracy up a ton.

      For a full example, say I am targeting your corporate mail server, running Exchange or O365, to try to guess credentials that I can then re-use on the target VPN infrastructure. My very first step is to grab a list of known emails/usernames from previous password dumps, public information, or directories. Then I generate a list of potential name combinations from location-specific birth information by year. Next comes the actual username enumeration, where I try to identify the “valid” accounts (aka what you are asking about).

      In my example, Microsoft agrees with you and doesn’t believe that username/email enumeration is a risk… which is why I wrote a ton of tooling that automatically uses NTLM/HTTP timing-based responses to enumerate the valid users. Now, armed with a list of what are guaranteed usernames/emails, I just start working down the list of the season’s hottest passwords over the next few days: Summer2020!, Password1!, Companyname2020!.

      All I really need is one credential. It’s not about the single user, it’s about the bulk knowledge. If I were going in blind, without the confirmed accounts, I would be generating tons and tons more traffic and would be even easier to flag. Enumeration tilts the statistics of automated guessing way, way more to the attacker’s side.

      The other example is password stuffing. This is more straightforward: given a compromised username/email and password for a user, I can take a bot that knows how to authenticate to tons of different services (banks, social media, blah blah blah) and try that combination on each. If username enumeration exists on these services, it lets me check whether an account is valid for the service before I actually submit my automated logins. If I am a bot herder, my job is to stay undetected for as long as possible, and enumeration greatly assists in that.

      Hopefully that helps! It’s one of those strange things where people forget about the collective risk and focus on the threat model for a single user. Attackers rarely care about the individual irl.
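
      On the defensive side, the timing-based enumeration described above suggests that a login check should do the same work and return the same message whether or not the account exists. The following is only a minimal sketch of that idea, not how any real product does it: the in-memory `users` dict, the `login` and `hash_password` helpers, the PBKDF2 parameters, and the messages are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

GENERIC_ERROR = "Invalid username or password."
_ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, _ITERATIONS)


# Dummy record: unknown usernames still pay for one full hash computation,
# so response time does not obviously separate "no such user" from "bad password".
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("not-a-real-password", _DUMMY_SALT)


def login(users: dict, username: str, password: str) -> tuple:
    """Return (ok, message); every failure yields the same generic message.

    `users` is a hypothetical store mapping username -> (salt, password_hash).
    """
    known = username in users
    salt, stored = users.get(username, (_DUMMY_SALT, _DUMMY_HASH))
    candidate = hash_password(password, salt)
    # compare_digest runs in time independent of where the bytes differ.
    matches = hmac.compare_digest(candidate, stored)
    if known and matches:
        return True, "Welcome."
    return False, GENERIC_ERROR
```

      With this shape, a wrong password for a real account and any password for a non-existent account produce the same result. Registration and password-reset endpoints would need the same treatment, which this sketch does not cover.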

      1. 4

        This is great advice. And it really reinforces for me why appsec people should be way more involved in the software development process as early as possible.

        At a previous job we were identified by nine-digit numeric IDs (no, not those nine digits!). I built a public-facing API for internal use that returned public-facing data created by employees. No problem, thinks me. But I left the SSO ID on the API because why not? Ship it!

        A few days later one of the blue team guys sent me an email with two thirds of my database, exfiltrated by walking the API with a dictionary file, and explained what you just explained above. Oops.

        1. 2

          Not a pen-tester, but I would’ve assumed that allowing Password1! as a valid password is a bigger issue than email enumeration. These days you can check passwords against lists of bad passwords from dumps.

          1. 2

            You’d think, right? But you are fighting human nature and historical IT practices. As it turns out, making a comprehensive deny list is extremely difficult. On top of that, because passwords are stored hashed, the only time a password gets checked is at the filter level, when the credential is changed. You can’t just look up your passwords in your ntds.dit and compare them with historical dumps (I try to do that for my clients, because the reality is the offensive tools are actually better at it than the defensive ones). As for the historical reasons, IT often resets credentials to a weak or organizationally default password that never gets changed, and support desk staff often don’t remember to check the “change after first login” checkbox.

            Like I said, it only takes one. Also, password patterns follow human nature in more ways than one: I’ve been popping my American clients that have comprehensive blocklists left and right with Trump2020!. Passwords suck, haha.

            EDIT: To add another thing, think about Password1!. Lots of orgs require an 8-character password with a special character and a number, so technically it is accepted in lots of places. If there is organizational SSO and the filters are not enforced everywhere, it can also propagate to other authentication areas.

            1. 2

              “To add another thing think about Password1!, lots of orgs have an 8 character password with special and numerical requirement.”

              Even better is to have entropy requirements, including dictionary files. zxcvbn is a good example of a frontend library for this.

              You can also compare hashes with the HIBP Pwned Passwords dataset and reject new passwords that match.
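
              One way to do that comparison without downloading the whole dataset is the Pwned Passwords range endpoint, which takes the first five hex characters of the SHA-1 hash and returns the matching hash suffixes with counts. The sketch below assumes that endpoint; `pwned_count`, `fetch_range`, and the User-Agent string are hypothetical names chosen for illustration, and the fetcher is injectable so the parsing logic can be exercised offline.

```python
import hashlib
import urllib.request
from typing import Callable

# Range endpoint of the Pwned Passwords service (k-anonymity lookup).
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def fetch_range(prefix: str) -> str:
    """Fetch all known hash suffixes that share a 5-character SHA-1 prefix."""
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-check-sketch"},  # illustrative value
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


def pwned_count(password: str, fetch: Callable[[str], str] = fetch_range) -> int:
    """Return how many times the password appears in the dataset (0 if absent).

    Only the first five hex characters of the hash leave the machine.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    prefix, suffix = digest[:5], digest[5:]
    for line in fetch(prefix).splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count.strip())
    return 0
```

              A password-change handler could then reject any new password for which `pwned_count(new_password) > 0`. What to do when the lookup fails (block the change or let it through) is a policy decision this sketch leaves open.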

              1. 1

                Are there databases other than HIBP that are commonly used for this?

                1. 2

                  I don’t know. Pwned Passwords has 573 million SHA1 hashes, so I’ve not felt the need to look further.

          2. 1

            This is great advice. Thank you for writing such a comprehensive answer.

          3. 1

            Aside from the technical side explored by other replies, depending on your location and/or the location of your users, you could face legal consequences. Under legislation such as the GDPR, an email address is considered personal data. If someone realises that you are leaking such personal information and reports you, you could face a fine. In some cases, the user may also claim compensation from you. If the user suffers a loss due to your failure to safeguard their data, it could be a large amount of money. (E.g. imagine you run a site which is legal, but not considered socially acceptable. A public figure signs up using their email address. Someone uses email enumeration to discover that said public figure has an account on your site, causing damage to their reputation and consequent loss of earnings.)

          4. 7

            Great read but the section on “third-world country” and “non-English” speaking was disappointing. We’ve seen major “first-world” websites get hacked by kids in “third-world” countries.

            1. 1

              Yeah, there are probably plenty of resources on Host header injection in the contractors’ native language – they just didn’t care, or weren’t very good.

            2. 2

              One thing I would like to add: use a cryptographically secure random number as the reset token. Make sure it is something that can’t be guessed by an attacker. So let me explicitly state:

              • Don’t create a hash from a timestamp
              • Don’t create a hash related to user input (like an id or email of the user)
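
              A minimal sketch of what that can look like in Python, using the standard-library `secrets` module. The helper names `new_reset_token` and `token_matches` are hypothetical, and storing only a SHA-256 digest of the token on the server is an extra assumption on top of the advice above.

```python
import hashlib
import hmac
import secrets


def new_reset_token() -> tuple:
    """Return (token_for_email_link, digest_to_store).

    The token comes from the OS CSPRNG: 32 random bytes, URL-safe encoded.
    It is not derived from a timestamp, a user id, or an email address.
    """
    token = secrets.token_urlsafe(32)
    digest = hashlib.sha256(token.encode("ascii")).hexdigest()
    return token, digest


def token_matches(presented: str, stored_digest: str) -> bool:
    """Check a presented token against the stored digest in constant time."""
    presented_digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(presented_digest, stored_digest)
```

              In use, the token goes into the emailed link and only the digest is stored next to the account, ideally with an expiry time and single-use handling, neither of which is shown here.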