Accuracy and Precision

Today I was asked what I thought the difference was between accuracy and precision as it pertains to relative estimation. I wasn’t ready for the question though, and it’s been sitting on my mind bugging me ever since, so here are my thoughts:

I would say accuracy is how close the estimate is to the truth, and precision is how often you arrive at the same estimate given the same or similar complexity; in other words, how repeatable you are. Accuracy is about how right or wrong you are, and precision is about how consistent you are (irrespective of right or wrong).

So, if you estimate correctly over and over, you’d be accurate and precise, but if you estimate incorrectly yet are consistent about the “incorrectness”, you would be precise but not accurate. If you estimate correctly only every now and then for the same given complexity, you would be accurate but lack precision.
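
To put rough numbers on it, here’s a tiny sketch of a team that is precise but not accurate (the figures, and the use of Python, are purely illustrative):

  from statistics import mean, pstdev

  true_effort = 5               # what the story actually turned out to be
  estimates = [8, 8, 9, 8, 8]   # consistent, but consistently too high

  accuracy_error = mean(estimates) - true_effort   # how far off we are on average
  precision_spread = pstdev(estimates)             # how much the estimates vary

  print(f"Average error: {accuracy_error:+.1f} points (accuracy)")
  print(f"Spread: {precision_spread:.1f} points (precision)")

Here the spread is tiny (precise) but the average error is large (inaccurate).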

Interesting to hear what others think (like if I’ve lost my mind completely 🙂 )…

 

Velocity for the layman

WARNING: This is a brain dump of an idea I’ve been playing around with; some confusion may occur!

As a ScrumMaster I often struggle to explain the concept of velocity to people, yet it’s a critical concept to grasp in order to report on progress and capability. I often get these questions thrown at me:

  • How many features can we complete in X weeks? or…
  • How many bugs can one developer fix per day?
The danger of these questions, for me, is not so much in the questions themselves (which don’t really make sense) but in what happens afterwards:
  • 1 developer = 1 story per sprint, so 2 developers = 2 stories per sprint, right? or…
  • 1 developer can fix 1 bug in 1 day, so we could double our rate by doubling our people!!
So, to the challenge: how to explain the danger of treating all pieces of work as equal, and how to explain the benefits and logical sense of measuring in velocity.
Let’s start with the analogy: you’re going on holiday and need to travel from city A to city B (your goal), with 5 stops in between.
Doing it wrong would be to take the time to the first stop, multiply it by the number of remaining stops, and call that your total travelling time.
Doing it right would be to look at the distance between each stop, calculate the average speed you can maintain, and then add up the travel time between each stop. Even better would be to:
  • Adapt your average speed based on environmental conditions. Bad weather, day/night travelling, fatigue, additional rest stops, travelling through mountains? These all affect your average speed.
  • Keep track of your average speed as you travel; this gives you a realistic indication of the average you can actually maintain.
  • Check what affects the distance between towns, like road closures and alternate routes.
  • Be ready to adapt to changing conditions, like accidents and sudden road closures.

Given the above, you should be able to predict fairly accurately when you will arrive at your goal. As a bonus, you can also work backwards: if you need to arrive at a particular time, sum up the total distance and divide it by the time you have, and that gives you the speed you need to average.
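
As a minimal sketch of that arithmetic (in Python purely for illustration, with made-up distances and times):

  legs = [120, 80, 200, 150, 90, 160]   # distance (km) of each leg between stops

  # Forwards: estimate arrival from the average speed you think you can hold
  average_speed = 80                    # km/h, adjusted for weather, fatigue, mountains...
  travel_time = sum(legs) / average_speed
  print(f"Estimated travel time: {travel_time:.1f} hours")

  # Backwards: given the time you have, what speed must you average?
  time_available = 9                    # hours
  required_speed = sum(legs) / time_available
  print(f"Required average speed: {required_speed:.1f} km/h")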

So, to bring it back to velocity: your average speed can be equated to the team’s average velocity (or capability). Stops are user stories/bugs, and the distance between cities is the effort of doing the story. The condition of the roads, road closures, mountain travelling, predicted bad weather etc. are your complexity.

Here’s hoping this analogy makes sense; if it doesn’t, please let me know. I still think it’s missing something though, so I may add to it at some point…

When will we be finished?

I’m a firm believer that all work should have an agreed end date, but end dates are always a bone of contention and cause endless confusion, wasted arguments and disagreements. These usually hover around the “when will we be finished?” question and, more to the point, the lack of a properly defined (empirical) way of working it out.

With that in mind, I thought I’d lay down the process we use to arrive at our end date. I will also attempt to do this without using the words scrum or agile, not because I don’t like or agree with them, but because those words tend to create predefined thoughts and have a life of their own (unfortunately).

 

So first the basics: In a nutshell we work in iterations, releasing something (hopefully of value) every 2 weeks based on a prioritized backlog. The backlog is constantly updated as stories are added / removed / completed.

Now for the fun. In order to calculate an end date with reasonable empirical accuracy you need two things: the remaining effort, and the average effort per sprint/day the team is capable of.

Effort Scale

Effort is difficult to define but key to the entire process. Everyone should agree on and understand what is meant when we say “X amount of effort”. What you must realize is that it does not matter what unit of measure you use to define your “effort”, as long as:

  • it is used consistently across all stories
  • it is understood by everyone
  • something of effort X closely matches another thing of effort X, for small values of X
  • the units increase roughly exponentially, to indicate greater uncertainty at significantly higher numbers. Some people use the Fibonacci sequence (1, 2, 3, 5, 8, 13 …). Also remember to allow for zero effort.

We struggled with our effort scale but eventually settled on a time-relative one (for various reasons):

  • Very Easy (1 point) – roughly a day to complete
  • Easy (2 points) – between one and two days
  • Medium (4 points) – roughly half a sprint
  • Hard (7 points) – roughly a full sprint
  • Epic (20 points) – unknown or way more than a sprint

and of course 0 points for quickies.

Effort per story

Assigning effort to stories is an exercise the entire team should do or, failing that, at least the majority of the team. Stories should also be continuously rechecked to see if the effort is still correct, especially stories with high effort assigned to them.

The process we use is as follows:

  1. Discuss the story
  2. Vote on an effort. If there’s a dispute, use the highest of the choices; if there’s huge uncertainty, assign the highest effort.
  3. Rinse and repeat

Remember also to compare stories of similar effort. Another form of checksumming is to assign tasks to the story during planning (as in what you’re actually going to do to achieve the story) and then put a rough hour estimate on each task. Sum up the hours for the story: a story of equivalent effort should have a roughly similar number of hours, although this is only really useful for well-understood stories.
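
A rough sketch of that hours “checksum” (Python, with invented story names and numbers, just to show the idea):

  stories = {
      "Login page":        {"points": 4, "task_hours": [8, 6, 4]},
      "Password reset":    {"points": 4, "task_hours": [6, 8, 6]},
      "Reporting rewrite": {"points": 4, "task_hours": [16, 16, 12, 8]},
  }

  for name, story in stories.items():
      total_hours = sum(story["task_hours"])
      print(f"{name}: {story['points']} points, {total_hours} task hours")

  # If one 4-point story sums to far more hours than its peers (the reporting
  # rewrite above), that's a hint its point estimate needs revisiting.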

Average Effort Per Iteration

Calculating the effort per iteration (sometimes called the velocity) is a matter of adding up the effort units (points in our case) of all completed stories in a sprint. It’s important to only tally completed stories, or the whole process is pointless. The average effort is just the sum of the effort over X iterations divided by X; obviously, the more data you have the better. What we also do is remove the highest and lowest values (as anomalies) from the calculation, to account for those strange iterations (over Christmas, Easter etc.).
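
A minimal sketch of that calculation (Python, illustrative numbers only):

  def average_velocity(points_per_sprint):
      if len(points_per_sprint) < 3:             # too little data to discard anything
          return sum(points_per_sprint) / len(points_per_sprint)
      trimmed = sorted(points_per_sprint)[1:-1]  # drop the lowest and highest values
      return sum(trimmed) / len(trimmed)

  history = [21, 18, 7, 23, 20, 34, 19]          # 7 was the Christmas sprint
  print(f"Average velocity: {average_velocity(history):.1f} points per sprint")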

Remaining Effort

Remaining effort is simply the sum of all effort that has not been completed.

The Formula

Calculating the end date then becomes the following:

By Day: today’s date + ((remaining effort / average velocity) × number of work days in an iteration), divided by 5 and then multiplied by 7 to account for weekends, = your end date

By Iteration: current iteration number + (remaining effort / average velocity), rounded up to a whole number, = the final iteration of work; the end of that iteration is your end date
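
As a worked sketch of both formulas (Python, with made-up figures, assuming a 2-week iteration of 10 work days):

  from datetime import date, timedelta
  import math

  remaining_effort = 120            # points not yet completed
  average_velocity = 20             # points per iteration
  work_days_per_iteration = 10

  # By day: remaining work days, converted to calendar days (x 7/5 for weekends)
  work_days_left = remaining_effort / average_velocity * work_days_per_iteration
  calendar_days_left = work_days_left / 5 * 7
  print("Predicted end date:", date.today() + timedelta(days=calendar_days_left))

  # By iteration: the iteration in which the remaining effort runs out
  current_iteration = 14
  final_iteration = current_iteration + math.ceil(remaining_effort / average_velocity)
  print("Final iteration:", final_iteration)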

And that’s it: you’ve predicted a future date based on historical data using a reasonably empirical method. You can improve the accuracy of this estimate by doing the following on a regular basis:

  • Re-evaluate the backlog from an effort point of view, focusing particularly on high effort stories.
  • Recalculate your average velocity at the end of each sprint and factor it in.
  • Factor holidays and leave into your calculation.

Comments or suggestions welcome

Star Trek, a model for Agility

As a Trekkie I’m prone to watching reruns of old Star Trek episodes from time to time and I started seeing many similarities between an agile approach to “getting things done” and the way the crew of the Enterprise work. Yes, granted, it’s a militaristic structure but that’s not to say you can’t still be agile:

  • Crew members exhibit a general knowledge of most fields, with expertise in a particular field (see how well the bridge crew interchange roles during disaster situations where members go down).
  • Away teams are generally cross functional based on the task at hand, and usually range in size from 3 to 7.
  • The Captain issues orders (“what”) but relies on the crew member(s) to execute them (“how”), and does not interfere unless further clarity is needed.
  • There is a deep level of trust and respect between crew members.
  • There is constant collaboration, communication channels are constantly open from anywhere on the ship (and off).
  • They celebrate their victories and learn from their mistakes, and adapt as needed.
  • At any point in time, the crew is working on whatever will add the most “value” to the mission (work is prioritized).
  • There is an understanding that collaborative teamwork is the way to success.
  • There is a good work/life balance, even on a Starship.
  • The crew are passionate about their roles.
  • Mission planning sessions only plan enough to get the mission going, and are then adapted on the way.

Any other Trekkies out there see similarities I’ve missed?

Change is good

I’m often amazed at the number of dogmatic “prophets” out there when it comes to the latest fads of agile development. I myself am all about getting productive and like to take the best of ALL worlds; change is not only inevitable but necessary, even change to the process itself, BUT only if the change leads to improvement! I believe a process of continual introspection and change by a team, combined with a good way of measuring the effect of that change, is essential for good team performance.

With that in mind, we made the following changes to the way we practice scrum:

  1. We added a default story to cover recovering technical debt (or refactoring, if you will). Essentially we assign a “story point cap” to this story (say 5 points) that’s based on our current velocity and sprint duration. We then assign “technical debt recovery” tasks to this story until the team feels there are enough tasks to fill the story points. I know, it sounds like the wagon pushing the horse, but it’s something we’re trying that other teams have used effectively.
  2. We’ve started estimating (wait for it…) HOURS on tasks. Stories are still broken into story points, but the tasks are estimated in “hour blocks” (2, 4, 8, 16) as they are put up. The purpose of this is to act as a sanity check on the story point estimation. An added benefit I found was that team members seemed to think a lot more about a task when they had to assign time to it.
  3. We’ve started using a round table instead of the traditional square meeting tables. I found this very effective in encouraging participation and collaboration.
  4. Story point estimation was done using a technique I learned on a training course that speeds up the estimation incredibly (it works well with lots of stories). This is how it works (well, how we did it anyway):
    1. All the stories are explained to the team and placed in a pile on the table.
    2. A wall/board is divided into sections, each representing a story point value (we used 1, 3, 5, 8, 13, 20, 40, 100).
    3. Each team member takes a pile of cards and sticks each one in the column they believe it belongs in; no one talks or interacts with anyone other than the product owner, and only to understand the story further.
    4. If a team member disagrees with where a card is, he/she simply moves it to where they believe it SHOULD be.
    5. This carries on until all the stories are placed. Remember, the key is NO COMMUNICATION between team members during this process; this forces people to think about why someone decided to place something in a particular column, or why they’re moving it.
  5. We’ve started experimenting with TDD and Paired Programming, which seems to be going well.

One additional thing I’m toying with is moving to a more Kanban-based board; the team has a problem with taking on too much work, and I think introducing a way of limiting the pulling of work will help.

Finally, on how to measure whether these changes are effective: well, to be honest, I’m not entirely sure. I think the best way is simply to see if the team becomes better at actually doing what they predict they can do, while maintaining a sustainable pace and having fun; other than that, I guess we wait and see…

I’d be interested to know what kind of changes you’ve brought about in your team and their effect (both good and bad), so let me know…

Continuous Integration with Hudson and .NET

Recently I was tasked with setting up a development environment for a .NET shop. I’m a great fan(atic) of Continuous Integration (CI) and set about thinking of a good way of doing this (and a fun/cool way as well).

For those of you new to Continuous Integration, I think Martin Fowler describes it best on his Continuous Integration site:

Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily – leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.

I’d been using CruiseControl for a while now, but for Java projects, and I had a recommendation to look at Hudson as an alternative. I evaluated it and decided it would be a much better solution for the following reasons:

  • Its easy-to-use web-based interface makes setup, configuration and use a breeze (compared to the XML-driven environment of CruiseControl). This, combined with…
  • Hudson’s extensible plugin architecture makes it easy to create custom functionality. Finally add to this…
  • A large community support/development base

and you have a great, easy to use, well supported product with numerous extensions that will inevitably cover everything you need to do.

Ok, I know, CruiseControl has lots of this as well, but come on, who wants to struggle with XML all their lives?

Installation

Installation is pretty easy: follow the instructions here based on your setup/choices. Personally, we installed Tomcat 6 and deployed the Hudson WAR via the Tomcat Management interface. One additional thing we did was to set up a HUDSON_HOME environment variable pointing to a disk with lots of space; I like to keep LOTS of build information (I love metrics), so space was a concern.

Configuration

Configuration is a breeze. Once Hudson is up and running, access it via a web browser (normally http://<server address>:<port>/hudson) and select Manage Hudson. Before configuring Hudson itself (Configure System), I suggest you install any plugins you want to use. Do this by selecting Manage Plugins.

Plugins

The plugin interface is divided into four tabs: Updates, Available, Installed and Advanced, all pretty self-explanatory (right?). Use one of the following methods to install your plugin(s):

  1. Select the plugin(s) in the Available tab, scroll down to the bottom and select Install.
  2. Download the plugin(s) (generally from the Hudson plugin repository), select the Advanced tab, fill in any proxy settings (if needed) and upload the plugin.

Either way, your plugin will be available once you restart Hudson.

For our installation these were some of the key plugins we installed, and why:

  • Hudson Backup – Allows you to backup/restore your Hudson configuration, we found this invaluable for obvious reasons.
  • Hudson build timeout – This plugin allows you to set a timeout threshold on builds, in case it goes into a “hang” state.
  • Claim – This plugin allows a team member to claim responsibility for fixing a broken or unstable build. Broken builds are BAD, and should be fixed immediately; it’s always good if someone takes ownership.
  • Green Balls – Green Balls instead of Blue Balls. People respond better to the Green/Red colour combination than Blue/Red.
  • Hudson GIT – We chose GIT as our repository and this plug-in enables access to it. No need to stress the importance of good Source Code Management (right?)
  • Hudson Doxygen – This plugin enables the generation of Doxygen documentation from the source code. It’s a good tool to (visually) check the overall quality of the in-code documentation, and I believe it encourages people to write coherent, understandable documentation (although it’s only truly effective if someone periodically goes over it).
  • Hudson MSBuild – MSBuild is used to build our projects from the command line. This plugin enables this functionality.
  • NCover – This plug-in enables the gathering and presentation of NCover results within Hudson.
  • Hudson NUnit – This plug-in enables the gathering and presentation of NUnit statistics within Hudson.
  • Hudson Seleniumhq – This enables running and gathering of Selenium tests. This exercises our front end to ensure we don’t break anything already in place.
  • Task Scanner – This plugin scans for open tasks in a specified set of files in the project modules and displays the results.
  • Warnings – This plugin collects the compiler warnings of the project modules and visualizes the results.
  • Hudson instant-messaging – This enables instant messaging capabilities in Hudson. Used by the Jabber plugin.
  • Jabber – Sends build notifications to jabber contacts and/or chatrooms. Also allows control of builds via a jabber ‘bot’.
  • Twitter – This allows the sending of (basic) tweets after builds have completed to a specific twitter account. Visibility is important on any project.
  • Hudson Violations – This allows the gathering of data from various “violation checking” tools (we use it for StyleCop and Simian specifically) and presents it in Hudson. Also allows for limits to be set on “violations” forcing builds into an unstable state if necessary.
  • Hudson Release – This plugin allows you to configure pre- and post-build actions that are executed when a release build is manually triggered. We use this plugin to do our “releases”.
  • The Continuous Integration Game – Enables scoring of builds, encourages good behaviour and is generally a good fun plugin to have.
  • Chuck Norris – A “fun” plugin; fun is important. Chuck gets upset when the build breaks, plus there are a few cool Chuck lines to boot.

Hudson

Ok, now for Hudson itself. Hudson’s configuration is pretty self-explanatory and well documented (click on the question marks for context-sensitive help). I’ll run through what we did to get our server ready for jobs (some of these options only appear once the plugins are installed, which is why I recommend installing them first):

  • Home Directory – This is picked up from the HUDSON_HOME environment variable.
  • System Message – Appears on your landing page, write something cool here about your CI environment.
  • Enable Security – checked, with these options:
    • Random TCP Ports, Access Control using Hudson’s own Database, Logged-in users can do anything
  • Ant – added an Ant installation for use by our Release Plugin.
  • MSBuild Builder – Configured for our builds, needed for .NET building
  • Selenium Remote Control – Configured the htmlSuite runner for running our Selenium tests.
  • Git – Configured the location of the Git executable (include the actual exe as part of the path).
  • Email Notification – Pretty standard SMTP information, remember to set the Hudson URL to something people who receive the email can use to access the build machine (ie, don’t use something like localhost). Also, set the System Admin Email address.
  • Continuous Integration Game – Activate it.
  • Jabber Notification – Enable it. Use the context help to understand the options and complete the needed settings. One suggestion: change the Bot command prefix to something other than !! to prevent outside people accidentally accessing your build bot and messing with your system.
  • Global Twitter Settings – Setup twitter access, Hudson will tweet based on the options you give, configurable per project as well.

Done, click Save

Creating Jobs

We were constrained to use .NET so we chose to build all our projects as freestyle projects, and the following describes how and what we configured using this option:

Click New Job, fill in a job name and choose Build a freestyle-software project.

Again, most of the options are well documented in the context-sensitive help. Here are some of the options that we configured (I’ve left out most of the common ones that are self-explanatory):

  • Main Section
    • Discard Old Builds – We chose to only keep 100 builds, for metrics purposes (did I mention I like metrics?); we may change this later.
  • Advanced Project Options Section
    • Quiet Period – Changed this to 15 secs to prevent problems with synchronization and building.
  • Source Code Management Section
    • Git – We use git as our SCM of choice; we have a “central point” where everyone synchronizes to and the build server monitors for changes. We just added a file-based location (it was on the same server as Hudson, so this was the easiest option) pointing to the git repository. The rest of the options were left as defaults.
  • Build Triggers Section
    • Poll SCM – checked, polls SCM every minute for changes. This triggers the build.
  • Build Environment Section
    • Configure Release Build
      • Release Version Template – This label is used by the Release Manager to label builds. Can use the environment variables defined in the next section.
      • Release Parameters – Define parameters that you need during the release build here. We defined RELEASE_VERSION to allow for a release number to be passed through to the ant build script, defined later. These parameters will be prompted for when selecting Release Build from the projects menu.
      • Before Release Build – Steps to run before the build. Multiple steps can be defined.
      • After Release Build – Steps to run after the build. Multiple steps can be defined. We define a single Ant Build step to package, compress and add the build number to the relevant artifacts for the particular project. This compressed package is then uploaded to a FTP “release” site for consumption by other parties (QA people, release people etc).
    • Abort the build if stuck – We set this to 10 min so the build fails if it gets stuck, just in case.
  • Build Section
    • Here you can define a number of build steps to construct your solution and prepare it for release. These are the steps we performed, in order (for most projects, some were excluded for particular projects):
      • Windows Batch Command to pre-configure the environment. Essentially we modified the existing app.config and web.config and merged in environment-specific values to override default values for things like databases and file paths. The tool we used for this was the MergeConfiguration tool supplied with Microsoft Enterprise Library 4.1. It takes a delta file (xxx.dconfig), merges it with the existing file (app.config / web.config) and creates a new web.config / app.config. One thing to note here is that we had to move the existing app.config / web.config file to a temporary file (e.g. base.config) BEFORE merging, and then merge base.config with xxx.dconfig to create app.config / web.config:

      move "<path>\app.config" "<path>\base.config"
      "<path>\MergeConfiguration.exe" "<path>\base.config" "<path>\ci.dconfig" "<path>\app.config"

      • Windows Batch Command to run the database scripts using osql.exe. All databases are recreated from scratch using the supplied scripts on each build. We did this as a failsafe to make sure the SQL scripts matched the code that was being deployed.

      osql -S<server> -U<user> -P<password> -i<path>\ResetCIDatabase.sql
      osql -S<server> -U<user> -P<password> -d<database> -i<path>\Initialise.sql

      • MSBuild step, no parameters, just build the solution from the root.
      • Windows Batch Command to run the Doxygen command line utility to generate the code docs for the project using a custom doxyfile.config file.

      "<path>\doxygen.exe" doxyfile.config

      • Windows Batch Command to run the StyleCop command line utility (StyleCopCmd) to check the code for style violations. Some things to note about the command line version were that the settings file MUST be called Settings.Stylecop and be referenced without a path (ie, put it in the root of the project). Run the command without parameters to see what else you can do with it. This generates an XML output of violations that is used by the Violations plugin.

      "<path>\StyleCopCmd.exe" -of stylecop-output.xml -d . -r -sc Settings.Stylecop -ifp AssemblyInfo*

      • Windows Batch Command to run the Simian command line utility to check the code for duplication. A very handy tool to catch the cut-and-paste warriors out there; it generates an XML file used by the Violations plugin.

      "<path>\simian-2.2.24.exe" **/*.cs -formatter=xml:simian-output.xml -failOnDuplication-

      • Windows Batch Command to execute NUnit and run NCover against the results to generate a report on unit test coverage. NUnit executes the unit tests; NCover takes the XML output of NUnit and generates reports with trends. A very handy tool. One thing we haven’t figured out yet is how to run the unit tests that require a running web service, although supposedly the //iis parameter does it (not for use though).

      "<path>\ncover.console.exe" //x coverage.xml "<path>\nunit-console.exe" /noshadow /nologo <full path to Unit Test DLLs, space separated> //h coveragedir //at ncover3.trend

      • Windows Batch Command to release the built artifacts to the internal environment. We “release” our files onto the CI server (running IIS) for Selenium to run against. Currently we do not have a need to run Selenium, but when the need does come up, this will facilitate it.

      xcopy PrecompiledWeb <path of built artifacts> /I /R /V /E /Y

    That’s it as far as building the project goes for now. There are a few more things, like the Selenium tests, that we had not implemented at the time of writing, but will in time.

  • Post Build Actions Section

This section handles all the reporting back of builds. Here are some of the important things we enabled/configured:

  • Archive the artifacts – We chose to archive the entire build for reference purposes. Maybe a bit overkill.
  • Activate the Chuck Norris Plugin – Chuck is watching you!
  • Scan workspace for open tasks – This plugin scans the workspace for TODOs, FIXMEs and NOTEs (the tags we use to signify something still to do in the code) and reports on their occurrence. These are bad; putting TODOs in code is pointless, rather just DO it.
  • Report Violations – gathers all the input XML files and creates a single Violations report. Used to pull in the Simian and StyleCop violations into a report.
  • Scan for Compiler Warnings – Checks standard output for compiler warnings; we set it to scan using only the MSBuild parser.
  • Publish Doxygen – publish the Doxygen generated files, using the Doxygen HTML directory.
  • Email notifications – enabled the email notifications to send mail to everyone in the team whenever an event happened on the build (fail, fix, build etc.)
  • Activate the Continuous Integration Game (see the plugin link above for more details on this)
  • Allow broken build claiming – Allow people to claim broken builds. Ownership is important, someone must take responsibility for fixing the broken build. Remember, it’s not about blame, it’s about fixing it.
  • Jabber notification – Activated it and entered everyone’s Jabber accounts. This allows instant notification of build events to developers. Again, everyone gets notified of everything.

That’s it, your job is created and ready to build.

I hope this proves useful to some of you out there. I didn’t have too many issues configuring Hudson in a .NET environment, and those I did have were eventually solved. There are still a few issues/concerns that we need to iron out, like:

  • How do we control versions and releasing versions?
  • How do we deal with build dependencies and releasing dependencies with projects? (Hudson supposedly handles this well, yet to be proven though)
  • How do I get NUnit to run the Web Service Unit tests?
  • Automating releases into a QA environment.

Other than that, things are running pretty smoothly; I’ll post more (shorter) entries as we make further progress.