1. 31

  2. 23

    I’m strongly in favour of #NoEstimates under the following conditions….

    • If you aren’t going to do anything useful with the estimates, don’t bother.
    • If you aren’t going to change anything when new information arrives that alters the estimates, don’t bother.
    • If you’re going to ignore the estimate and impose a fantasy deadline… don’t bother.
    • If you aren’t going to keep the estimate up to date as things progress / change. Don’t bother.
    • If you aren’t going to estimate properly, don’t bother.
    • If you think the result of an estimation is a deadline. Don’t bother, since you’re pretty clueless anyway and lack of estimates is the least of your problems.

    Good Estimation isn’t impossible, it’s just work. Quite a lot of it.

    Thus if you aren’t going to be getting an equivalent amount of value from all that work, you’d be better off ignoring estimation and delivering real value instead.

    All that said…..

    I’m actually in favour of estimation.

    The result should be a living, daily-updated burndown chart giving a probability distribution for completion.

    It should be updated daily to reflect changes in staffing (sickness, leave, turnover…)

    It should be updated daily to reflect changes in scope, new requirements uncovered, features negotiated out, new risks discovered.

    It should be acted on daily. It’s telling you about risks and value. You should be using it to work out how you can manage the risks, how you can prioritize delivering value sooner.
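    To make “probability distribution for completion” concrete, here is a minimal sketch of one way to get it: resample past daily throughput (Monte Carlo) and count the days until the backlog is empty. The function names, the sample history and the 40-point backlog are all made up for illustration; this is not a description of any particular tool.

```python
import math
import random


def simulate_completion_days(remaining_points, daily_history, runs=10_000, seed=0):
    """Resample past daily throughput to forecast working days until done."""
    if max(daily_history) <= 0:
        raise ValueError("history needs at least one day with progress")
    rng = random.Random(seed)
    outcomes = []
    for _ in range(runs):
        remaining, days = remaining_points, 0
        while remaining > 0:
            # Zero-output days (sickness, leave, firefighting) stay in the history,
            # so they show up in the forecast as well.
            remaining -= rng.choice(daily_history)
            days += 1
        outcomes.append(days)
    return sorted(outcomes)


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list, 0 < p <= 100."""
    rank = math.ceil(len(sorted_values) * p / 100)
    return sorted_values[max(rank, 1) - 1]


# Hypothetical data: points finished on each of the last ten working days.
history = [0, 2, 3, 1, 4, 2, 0, 3, 2, 3]
forecast = simulate_completion_days(40, history)
for p in (50, 80, 95):
    print(f"{p}% of simulated runs finish within {percentile(forecast, p)} working days")
```

    Rerunning it every day with the latest history and the current backlog size is what makes it “living”: a scope change or a week of sick leave shifts the percentiles straight away.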

    On its own, the probability distribution for completion is worthless.

    No one cares.

    You care about opportunities lost, opportunity costs, staffing costs, revenue earned or not earned. Without those estimates in the model as well…

    Who cares when it’s done anyway?

    Retrospectives are often disappointing talkfests that achieve nothing… The same points arise time and time again.

    If they are about closing the feedback loop to work out the root cause of discrepancies between the estimates and reality… progress might be achieved.

    Of course, if you do a sensitivity analysis, quite likely the biggest source of error is not time estimation.

    Odds on it’s market size and appetite estimates.

    If those totally dwarf time estimates… again, why are you bothering with time estimates? Ignore estimation, go for an MVP as soon as possible and get some real data from the market.
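    A rough sketch of what such a sensitivity analysis can look like: vary one input at a time between a low and a high guess, hold the rest at their base values, and see how far the outcome swings. The profit model and every number in it are invented for illustration, so the ranking it prints only shows the mechanics, not a finding.

```python
def profit(market_size, adoption_rate, price, dev_months, monthly_team_cost):
    """Toy model: revenue from adopters minus the cost of building the thing."""
    return market_size * adoption_rate * price - dev_months * monthly_team_cost


# Hypothetical base case and (low, high) guesses for the uncertain inputs.
BASE = {
    "market_size": 100_000,
    "adoption_rate": 0.02,
    "price": 100,
    "dev_months": 6,
    "monthly_team_cost": 40_000,
}
RANGES = {
    "market_size": (20_000, 500_000),
    "adoption_rate": (0.005, 0.05),
    "dev_months": (4, 12),
}


def swings(model, base, ranges):
    """One-at-a-time sensitivity: outcome swing per input, largest first."""
    result = {}
    for name, (low, high) in ranges.items():
        outcomes = [model(**{**base, name: value}) for value in (low, high)]
        result[name] = max(outcomes) - min(outcomes)
    return dict(sorted(result.items(), key=lambda item: item[1], reverse=True))


result = swings(profit, BASE, RANGES)
for name, swing in result.items():
    print(f"{name:>14}: outcome swings by {swing:,.0f}")
```

    With these made-up ranges the market inputs swing the outcome more than the schedule does. Plug in your own ranges before drawing any conclusion.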

    Part of the value in estimation should be bullshit detection and to learn to smell it. If you never do it properly and understand what’s going wrong, you never learn.

    Again, the hard part is “The number is 42, what are you going to do about it?” Hire staff? Fire staff? Tell them to sit on their hands?

    No, hiring and firing are the most expensive things a company can do.

    What you’re going to do is pick the highest value target and tell them to work at that until it’s achieved.

    And since usually the market estimates of ROI are wildly more error prone than the wildly error prone dev estimates… a thumb suck from the devs is usually as much as you need anyway.

    1. 9

      Wow, that first link was a trip. The #NoEstimates people weird me out.

      When it comes to why estimates are difficult, most engineers and engineering leads throw their hands up in the air and say, “Software has too many unknowns! We can’t tell what unknowns we’ll find, even with the simplest task!”

      This is the second time in two days I’ve seen this “software is special!!!” thinking. We’re not. Nothing makes software essentially harder to estimate than any other large, complex project. We might not have the training, or the tools, or the theories on how to estimate better. But software isn’t special. Either we can estimate software or we can’t estimate anything.

      1. 19

        My dad has worked in construction for many years, and although we don’t know much about each other’s work worlds, we are always able to connect when talking about project planning, estimations, and management. There are so many parallels that I can’t help but believe that there’s some hope for software.

        That said, it’s not like it’s a solved problem in civil engineering projects either. Crazy stuff happens there too. My intuition tells me there is a lack of appreciation for the non-physical complexity in software. If someone finds a pipeline no one expected when digging, it seems easier to grasp the gravity of it versus finding a dependency in code that no one knew about.

        1. 8

          Same setup here, with the experience of construction sites (actually demolishing though) and software.

          In construction I’m used to doubling or tripling the original estimate, but never “oops, it’s 10x the work now”. But that mostly speaks of experience and apparently more “normal” projects; if you look at Berlin’s BER airport disaster, I guess 100x comes closer…

          I hate to turn to these age-old bad examples, but I’ve never heard of stuff like:

          • oh we DO need a basement after all, throw away the ground level
          • we decided to use wood and not cement
          • wait, nobody told me we need to put lights in there?

          What I did hear:

          • no, we wanted the other wall to be torn down
          • we want the new door exactly here! (turns out there was a wall at the back, imagine a T-shaped piece of wall)

          But yes, sometimes there are known tasks like

          • this wall is 3m x 5m, it will take X hours
          • this feature needs one new form and 3 database calls, it will take X hours

          and then there are things where you simply don’t know from a one sentence description if they want a garden shack or a new mall.

          The main problem is that it’s less tangible and a lot more opaque. On the other hand some things that are 10 minute tasks can seem like magic - that also hardly happens in construction. You don’t just get a surprise benefit by discovering a material (library) unknown to you that will save a week’s worth of work.

          1. 5

            If I ask you to estimate the weight of all your colleagues and to guess the total weight of the whole team….

            I bet you won’t be far out either.

            Some probability distributions, especially physically based ones, are thin tailed.

            Your guesses are very unlikely to be far off, a smallish sample is likely to be representative, nothing is going to be orders of magnitude different, and when you sum (or average) your guesses, the deviations cancel out.

            Other distributions, especially non-physical ones like wealth, software complexity, ….. are fat tailed.

            i.e. you need very large samples to be able to estimate accurately. Yes, quite likely there is one story in the backlog that explodes out to 100x larger than the rest. Summing (or averaging) does reduce the deviation, but not by much.

            What really happens in the software world is when that task blows up by a factor of 100x… Odds on we say, Meh, maybe we don’t need to do that in the 1st release… or 2nd or ..

            Estimation is not about deadlines, it’s about risk management. When we see a story blowing up… we need to step in and make some hard choices.
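            A small simulation sketch of the thin-tail versus fat-tail point: draw 50 items from each kind of distribution and look at how much of the total the single largest item accounts for. The choice of a normal distribution for body weights and a Pareto distribution (shape 1.1) for story sizes is an assumption made purely for illustration, not a measured property of any real backlog.

```python
import random


def largest_share(samples):
    """Fraction of the total contributed by the single largest item."""
    return max(samples) / sum(samples)


def median_largest_share(draw, items=50, trials=2_000, seed=1):
    """Median, over many trials, of the largest item's share of the total."""
    rng = random.Random(seed)
    shares = sorted(
        largest_share([draw(rng) for _ in range(items)]) for _ in range(trials)
    )
    return shares[len(shares) // 2]


# Thin tailed: body weights in kg, assumed roughly normal around 80.
thin = median_largest_share(lambda rng: max(1.0, rng.gauss(80, 12)))
# Fat tailed: story sizes, assumed Pareto with shape 1.1 (illustrative only).
fat = median_largest_share(lambda rng: rng.paretovariate(1.1))

print(f"thin tailed: largest item is typically {thin:.0%} of the total")
print(f"fat tailed:  largest item is typically {fat:.0%} of the total")
```

            In the thin-tailed case no single colleague moves the total much, so the errors in your guesses wash out. In the fat-tailed case one item routinely accounts for a large slice of the whole, which is why summing the backlog tells you far less than it seems to.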

          2. 2

            I work in Chicago so I immediately thought of the much-delayed Jane Byrne Interchange construction.

            https://www.chicagotribune.com/news/ct-met-jane-byrne-delays-20190306-story.html

            In January 2015 — just over a year into construction — university workers noticed the building had been sinking and shifting, leaving cracks in the foundation and making it impossible to shut some doors and windows, according to court records.

            Over the next 1½ years, IDOT blamed engineering firms it had hired for missing the poor soil conditions that contributed to the problem. That led to a redesign of a key retaining wall that boosted costs by $12.5 million and dragged out that part of the project at least 18 more months.

          3. 13

            But software is, if not special, different to most physical engineering projects.

            Building software is like combining the worst cases in renovation TV shows - you have a heritage listed building with unknown structural issues, a homeowner who follows the builders around everywhere, can’t envision what they want until they see it, and constantly changes their mind after work has been completed. And they have a strict budget and need to be in before Christmas.

            The fundamental reasons why software estimation is hard are:

            • most team leads/scrum masters won’t say no to changes in the estimated scope
            • team membership changing due to business priorities
            • insufficient investment in quality measures across the board

            They are, to a degree, our own fault. We don’t say ‘no’. We don’t insist on quality. We don’t insist on team stability.

            In my time as a team lead I pushed as hard as I could on those issues. If priorities changed and some new work needed to be fitted into the current timeframe, I made the product owners pick an equivalent sized piece of work to remove. The team already had a great quality ethic, which I protected by making sure that there was enough time allowed to maintain our test suite etc. I couldn’t really stop management from moving people in and out of the team, but I made enough noise that it was at least somewhat uncomfortable for them. Our estimates were not perfect, but pretty decent.

            But it was exhausting, and while I think people respected me, it would have been “career limiting behaviour” if I’d been focused at all on rising up the ranks in that org.

            After 6 months I left and found a pure dev job.

            1. 11

              I’ve been interviewing “crossovers”, people who started off as traditional engineers and moved to software. I’m basing all my claims off what they said. The overwhelming consensus is that almost everything we think about “trad” is a misconception.

              Building software is like combining the worst cases in renovation TV shows - you have a heritage listed building with unknown structural issues,

              In software, if you need to figure out what’s wrong with the codebase, you can inspect the source code. If you need to figure out what’s wrong with the electrical system, you have to tear down the wall.

              One former electrical engineer talked about how often mechanical projects would go wrong. Often a supplier had tons of implicit knowledge about their parts: switching to a different supplier for the exact same design could get you something completely incompatible, simply because of slight differences in the tolerances.

              a homeowner who follows the builders around everywhere,

              Plenty of engineers complained about this. Scope creep and overbearing clients are universal.

              can’t envision what they want until they see it,

              Also extremely common. It’s less of a problem in most engineering fields, but most of my interviewees think it’s because they just spent more time gathering requirements in trad.

              and constantly changes their mind after work has been completed.

              I talked to one engineer who had to move a bridge. I’d have to go back and check with her, but I think it was something like “The demographics of the area had changed.”

              The fundamental reasons why software estimation is hard are:

              • most team leads/scrum masters won’t say no to changes in the estimated scope
              • team membership changing due to business priorities
              • insufficient investment in quality measures across the board

              These happen all the time in trad, too.

              1. 3

                a homeowner who follows the builders around everywhere,

                Plenty of engineers complained about this. Scope creep and overbearing clients are universal.

                I can attest to that. I’ve trained and worked in software for 20 years now, but I originally trained to be a TV-repairer. One repairshop I had a placement in during training had a notice on the wall, that could be seen by customers:

                We charge 300/hour.
                If the customer wants to watch, we charge 600/hour.
                If the customer wants to help, we charge 900/hour.

                To be totally honest I believe it was meant more as a humorous deterrent than to be taken literally, but there’s no smoke without fire as they say :-)

                1. 2

                  I’ve been interviewing “crossovers”, people who started off as traditional engineers and moved to software. I’m basing all my claims off what they said. The overwhelming consensus is that almost everything we think about “trad” is a misconception.

                  That’s really interesting, and supports a suspicion I’ve long held that the software industry’s “imposter syndrome” with respect to the engineering community causes us to over-glorify physical engineering.

                  Although I haven’t been interviewing or studying it, I know quite a few tradesmen (and my father was a plumber and builder). I get the impression that, as you say, estimation is not exactly a precise art for them, either.

                  1. 1

                    Agreed on all counts. These are the kinds of things I hear from my father, too. One difference might be that in software we’re too eager to deploy what we’ve got and then never change it. Also, software does a lot less work up front. (Sometimes that code doesn’t get examined much!)

              2. 9

                Something I don’t understand is why Evidence Based Scheduling isn’t more popular. Joel’s original article is well-written and insightful, but I haven’t seen anyone mention this outside of Fog Creek.

                I wrote a small tool to try this technique out, and was surprised to find out how reliable the data can be (the left and center charts represent the same dataset, one is the raw data and one is a graph of the median IIRC). Not only does it show that “smaller” tasks have a tighter estimation bound, but my estimates weren’t meaningless.

                This tool unfortunately has sat dormant for a while – I recently started trying to resurrect it to get it working with AppEngine’s “Cloud Endpoints Frameworks version 2.0” (what a mouthful), but ran into some snags. If I get it working again I’ll post more about it here.

                1. 6

                  I mean, I worked at Fog Creek, and the thing is that EBS is honestly eerily, amazingly accurate, but you need to:

                  1. Stay completely on top of what you’re working on, and
                  2. Not lie

                  Most people don’t do #1, so then they try to guess what they actually did, which results in a well-intentioned #2. And plenty of people actually do #1, but they don’t like what the 80th percentile for shipping is, so, eh, I mean, I know I marked that I spent 45 minutes on ticket 1234, but like half that was actually email, so I’ll just revise the estimate. In both cases, even if you were coming from a good place, EBS will now be wrong, and you blame the tool.

                  I’d be curious if @tedu has a radically different opinion, but I doubt it.

                  That said, I’d love a reimplementation. As I said, when you do feed it good data, it is accurate. There are times I’d gladly deal with the micromanagement to get the right results.
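                  For anyone who hasn’t read the article, here is a minimal sketch of the core idea as I understand it from Joel’s write-up: compute a “velocity” (estimate divided by actual) for each finished task, then run a Monte Carlo simulation that divides each remaining estimate by a randomly drawn historical velocity. The function names and the sample numbers are made up, and FogBugz’s real implementation almost certainly did more than this (per-developer histories, calendars and so on).

```python
import math
import random


def ebs_forecast(history, remaining_estimates, runs=10_000, seed=0):
    """history: (estimated_hours, actual_hours) pairs for finished tasks."""
    rng = random.Random(seed)
    # Velocity = estimate / actual; below 1 means the task overran its estimate.
    velocities = [est / act for est, act in history if est > 0 and act > 0]
    if not velocities:
        raise ValueError("need at least one finished task with tracked time")
    totals = []
    for _ in range(runs):
        # Each remaining task gets scaled by a randomly chosen past velocity.
        totals.append(sum(est / rng.choice(velocities) for est in remaining_estimates))
    return sorted(totals)


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list, 0 < p <= 100."""
    rank = math.ceil(len(sorted_values) * p / 100)
    return sorted_values[max(rank, 1) - 1]


# Hypothetical tracked data: (estimated hours, hours actually spent).
history = [(4, 5), (2, 2), (8, 14), (1, 1.5), (3, 3), (6, 5), (2, 6)]
remaining = [3, 5, 2, 8, 1]  # estimates for tasks still open, 19 hours in total

totals = ebs_forecast(history, remaining)
for p in (50, 80, 95):
    print(f"{p}% of simulated runs need at most {percentile(totals, p):.1f} hours")
```

                  This also shows why both conditions matter: the forecast is only as honest as the “actual” column. Quietly trimming the 45 minutes you logged changes the velocities, and every future forecast inherits the fudge.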

                  1. 1

                    Did @tedu work at Fog Creek as well then?

                    1. 1

                      Yes. He was a key dev on the Unix version of FogBugz.

                  2. 1

                    This actually does sound really good. However, I wouldn’t want to ask a team to try this without also presenting a good tool that could actuate it. In my case, something which could use Trello’s API to automatically get data on card lifetimes to feed into the EBS calculations would be ideal.

                    1. 1

                      I’ve done something similar with t-shirt sizes – track how long tasks actually take and classify them based on story points. I never extended it to the probability curve level, but over time it did become an increasingly useful metric. It was also excellent feedback for the developers as we figured out more appropriate t-shirt size definitions.
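                      A sketch of the bookkeeping this takes, assuming you can export finished tasks as (size, actual days) pairs. The function name and the data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def actuals_by_size(completed):
    """Summarise actual durations per t-shirt size from (size, days) pairs."""
    grouped = defaultdict(list)
    for size, days in completed:
        grouped[size].append(days)
    return {
        size: {
            "count": len(durations),
            "median": median(durations),
            "min": min(durations),
            "max": max(durations),
        }
        for size, durations in grouped.items()
    }


# Hypothetical finished tasks: (t-shirt size, actual days taken).
completed = [
    ("S", 1), ("S", 2), ("S", 1.5),
    ("M", 3), ("M", 5), ("M", 4), ("M", 9),
    ("L", 8), ("L", 15),
]
summary = actuals_by_size(completed)
for size, stats in summary.items():
    print(size, stats)
```

                      When the ranges of two sizes overlap heavily (the slowest “M” here took longer than the fastest “L”), that is usually the cue to revisit the size definitions.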

                    2. 8

                      The author writes:

                      Estimates matter because most people and businesses are date-driven

                      And here lies the problem exactly: Whenever I give you an estimate, you hear a deadline. I never gave you a deadline.

                      Or as a client once told me, “There is effort and there is duration”

                      I am in favor of estimation, when you understand what the answer you received is. Otherwise you are setting me up and this is something I do not like.

                      1. 6

                        So, projects (as defined by the rather pedantic PMI) have an end.

                        After this engineer launched stuff like Skype on Xbox One, what happened? Did all work stop, because the project ended? All engineers never touched that code again? Nope! They posted 4-5 months later about new features and bug fixes.

                        What happened here is they estimated a product launch (which could be a project) and then kept working on the product, into perpetuity. If it’s supported into perpetuity, it doesn’t qualify as a project anymore.

                        The reason I make the distinction is that most of the teams I have seen struggle are struggling because once the “launch” is complete, the project management team gives them a new project for a completely different product. They then spend their time losing their minds trying to do maintenance for all of their old stuff while also building the new stuff.

                        1. 2

                          Funny you’re asking this question. Yes, work continued for a while. And about 6 months later - right after that second update - the team was dismissed, Skype for Xbox One put into maintenance mode. It was good enough. An Xbox team specialising in maintaining “complete” apps took it over, doing monitoring and having the code, in case any changes were needed later. Devs either moved to a new team within Skype/Microsoft, or left. None of the original devs touched the codebase again (as far as I know).

                          So despite what you’d think from the outside, there really was an end here quite soon after the launch.

                          1. 1

                            Yes, work continued for a while. And about 6 months later - right after that second update - the team was dismissed

                            So…. you had a project timeline that you set with 100+ teams a year in advance. The product launched, but it had enough new features to add (and bugs to fix) that they kept the project running for an extra 6 months (+50% of the original estimate). This is your example of a good case for estimates?

                            Your other case was that you beat your estimate, leaving more time to experiment… but doesn’t that mean you had a feature complete app that you sat on for a period of time - experimenting? Why was this a good thing?

                            From your article:

                            Have you noticed how Apple ships most of their big bang projects at WWDC, a date they commit to far ahead?

                            From the Wall Street Journal (quoted in another place, admittedly)

                            “Of the 70-plus new and updated products launched during Mr. Cook’s tenure,” the Journal notes, “five had a delay between announcement and shipping of three months or more and nine had delays of between one and three months.”

                            Being generous and assuming 70+ means “80”, Apple has missed a little over 17% of their product launch dates by more than a month. 6% of the time they’re off by more than 3 months. What are the odds that they could achieve similar results simply by saying

                            “Whatever is ready for launch the month before WWDC makes it into the presentation”

                            instead of

                            “Here’s a sneak peek at Apple AirPower, coming next year!” followed a year+ later with “whoops! It turns out we can’t figure out how to make it work. 😳”
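                            For anyone who wants to check the percentages above, a quick sketch of the arithmetic. The counts 5 and 9 come from the Journal quote; the totals of 70 and 80 are just the two ends of how “70-plus” could be read.

```python
# Counts taken from the Wall Street Journal quote.
delayed_over_3_months = 5
delayed_1_to_3_months = 9


def miss_rates(total_launches):
    """Return (share late by over a month, share late by over three months)."""
    over_one_month = (delayed_over_3_months + delayed_1_to_3_months) / total_launches
    over_three_months = delayed_over_3_months / total_launches
    return over_one_month, over_three_months


# 70 is the least generous reading of "70-plus", 80 the generous one used above.
for total in (70, 80):
    over_one, over_three = miss_rates(total)
    print(f"{total} launches: {over_one:.1%} late by over a month, "
          f"{over_three:.1%} late by over three months")
```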