Friday, June 19, 2009

Team Foundation Server: Some thoughts on source control branching strategies

I thought I'd write a short note on some of the source control branching strategies I have seen employed, and how these relate to the way I might use my preferred source control repository, Team Foundation Server (TFS).

I have been working with a customer who is using the IBM Rational toolset for just about everything, and I have been ranting about how bad it is and how great TFS is. One of the things that has given me cause to think, though, is the way that the customer uses ClearCase source control for branching and merging.

This echoes another customer I worked with last year (a very large UK bank) that was using Harvest - now a CA product - for their source control in a similar way, and I have been considering how this contrasts with the way that we usually use TFS.

Streaming Source Control Model

The first thing that I would say is that TFS is tuned for developer productivity, whereas IBM Rational ClearCase is tuned for control. This in itself makes for some interesting differences, which I may expound on in a later blog post. The next immediate observation is that they handle changesets differently, because Harvest and ClearCase use a hierarchical branching structure. Typically, you have a series of environments that your code progresses through, and you might have different builds of code at each of these stages. You therefore create a branch that represents each stage and order the branches in a hierarchy. At the top of the stack you have your current production code, then you might have QA, then maybe System Integration, then Continuous Integration, and at the bottom of the stack you have a branch where you are actually developing.

The way that Harvest and ClearCase work is that you create packages of changes that essentially ought to relate to features, and which may actually contain multiple check-ins of code. When a feature is deemed to be complete in development it is "promoted" to the next level (i.e. CI), built and tested. You might vary the next bit depending on your project methodology, but essentially you take periodic releases of the software. This is usually done by promoting the tested packages from your CI environment and progressing them through the remaining branches until they become the production release. If bugs occur in any environment, fixes can be applied at any of the levels and added to the build.
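To make the promotion idea concrete, here is a toy sketch in Python. All the names here are mine, purely for illustration - they are not part of any Harvest or ClearCase API - but they capture the shape of the model: a feature package made of several check-ins moves up a fixed branch hierarchy one level at a time.

```python
# A toy model of the streaming approach (the names are illustrative only,
# not part of any Harvest or ClearCase API).
BRANCHES = ["Dev", "CI", "SysInt", "QA", "Production"]  # bottom to top

class FeaturePackage:
    """A package of related check-ins that is promoted as one unit."""
    def __init__(self, name, checkins):
        self.name = name
        self.checkins = checkins  # several check-ins make up one feature
        self.level = 0            # everything starts in Dev

    def promote(self):
        """Move the whole package one level up the hierarchy."""
        if self.level < len(BRANCHES) - 1:
            self.level += 1
        return BRANCHES[self.level]

feature = FeaturePackage("invoice-export", ["checkin-1", "checkin-2"])
print(feature.promote())  # Dev -> CI: build and test here
print(feature.promote())  # CI -> SysInt, and so on up to Production
```

The point to notice is that the unit moving between branches is the whole feature package, not an individual check-in.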

One of the sticking points here is that you can only check in changes against a single feature at a time, which makes concurrent development more difficult - and this is especially awkward during the bug-fixing stage of a project iteration, when lots of small changes are being made at high frequency. (I know we should only be promoting bug-free code, but welcome to the real world.)

In Team Foundation Server source control it is possible to set up a similar branching strategy and use the merging features as a means of promotion. You can still create all of the associated levels, but there is no direct hierarchy implicit in the system; the hierarchy is enforced by usage, and by the fact that merges can only be made up and down the lines of the branches.

One of the major contrasts lies in how TFS handles changesets. These can be associated with one or more work items, but the correlation is much looser, and the changeset - rather than the feature - is the atomic unit of change. Therefore, if you want to promote a feature you have to merge in all of the changesets up to the last changeset for that feature. This will usually mean merging in all of the changesets that relate to other features as well, meaning that code for "unfinished" features may get promoted too.

This might appear to mean that TFS has much less control (and there is some merit in that observation), but it also means that you get more consistent behaviour when you merge. In the feature-based model of ClearCase it is possible that some dependent code - not changed in the feature being promoted, but changed in another, non-promoted feature - will change the overall behaviour of the solution. If you promote all changesets up to a given point then at least you know that your build will behave the same.
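A toy sketch makes the trade-off visible. Again this is my own model, not the TFS API: each changeset is tagged with the feature it belongs to, and merging "up to" a changeset takes every earlier changeset with it, whatever feature it belongs to.

```python
# Toy sketch of changeset-based promotion (my own model, not the TFS API).
# Each changeset is tagged with the feature/work item it belongs to.
history = [
    (101, "FeatureA"),
    (102, "FeatureB"),
    (103, "FeatureA"),  # FeatureA is still unfinished...
    (104, "FeatureB"),  # ...but FeatureB is now complete
]

def merge_up_to(history, changeset_id):
    """Promote every changeset at or before changeset_id, regardless of feature."""
    return [cs for cs, _feature in history if cs <= changeset_id]

# Promoting FeatureB (last changeset 104) inevitably takes FeatureA's
# changesets 101 and 103 with it - but the promoted code is a consistent
# snapshot of the branch, so the build behaves the same everywhere.
print(merge_up_to(history, 104))  # [101, 102, 103, 104]
```

Less control over what moves, but what moves is always a consistent snapshot.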

Release-Based Source Control Model

One of the implicit assumptions of the streaming approach is that you have a separate build in each stream and that the source code of the stream constitutes the "release" in many ways. An alternative model that is often used with TFS is to branch based on releases. Let's imagine you are shipping a software product, Widget 2009. You also have to support Widget 2008, Widget 2007 and Widget 2006, all of which are based on the same source code but with enhancements and developments in the intervening period. You have to support all of these products and be able to issue service packs against them. You also need to make sure that if you bug-fix Widget 2006, that same fix can be merged into the later releases as well.

In this scenario the hierarchical streaming model is not suitable, because each time you promote a new set of features to your production stream and start building a new "production" release of your software, you effectively end the ability to build your previous versions.

What you might do in this model is have a branch for each release of your software, again with a build process for each branch; but no branch overwrites the others, and each has its own lifecycle. If you need to patch a previous release into "production" you don't have to overwrite the current production release.
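As a sketch (with illustrative names only, borrowing the Widget example above), the release-based model looks something like this: one branch per shipped release, each living on independently, with a fix to an old release merged forward into every later one.

```python
# Sketch of branch-per-release (illustrative only): each release branch
# lives on independently, and a fix to an old release is merged forward.
releases = {
    "Widget2006": ["base"],
    "Widget2007": ["base", "2007-features"],
    "Widget2008": ["base", "2007-features", "2008-features"],
    "Widget2009": ["base", "2007-features", "2008-features", "2009-features"],
}

def fix_and_merge_forward(releases, fixed_in, fix):
    """Apply a fix to one release branch and merge it into every later one."""
    names = sorted(releases)  # chronological order falls out of the names here
    for name in names[names.index(fixed_in):]:
        releases[name].append(fix)

fix_and_merge_forward(releases, "Widget2006", "hotfix-123")
# Every release from Widget2006 onwards now carries hotfix-123, and
# Widget2006 can still be built and serviced on its own branch.
```

The key property is that servicing an old release never disturbs the branches for the releases either side of it.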

When to use each approach

I have discussed a couple of different approaches to branching and maintaining releases, both of which are in use in various organisations, and each has its merits and demerits. The question is: if you have to put in place a branching strategy, which one would you choose? Which one is most appropriate in different circumstances? What are the pros and cons of each approach?

What I would say is that if you are shipping a software product where you need to be able to manage the source for many versions of the software at the same time, then you will need the second approach, with a branch for each release. This scenario involves many different users who use different releases from each other and therefore all need to be supported. Desktop applications definitely fall into this category, as do many other retail packaged software products such as components and server products.

The release-based approach does have its side-effects, and these should not be ignored. The main one is that if you have many releases of your software you end up with a large number of branches and build profiles, and these become difficult to manage. That said, an annual release of a software product isn't going to lead to unacceptable overhead in this model.

However, another very common scenario is where an organisation has a software product that needs to be refreshed on a periodic basis. When the upgrade happens all users are affected at the same time and cannot choose whether or not to participate - they are sent the upgrade anyway. This most commonly applies to software teams within a company producing bespoke software, but may also apply to self-updating applications such as iTunes, where upgrades are pushed out regularly, downloaded over the Internet and installed. It also applies to .com organisations, where you obviously have one current production build of your software.

In these scenarios you tend to have a high number of releases, especially with agile projects, and once a release has made it to production you discard the previous releases, as you will never be opening up and servicing the old code. In this scenario it is easier to manage your source code if you have a limited number of streams and promote changes up through them, irrespective of whether you are promoting changesets in TFS or features in Harvest or ClearCase. Once you have all of your branches building, you may find you have a lower project overhead in maintaining your builds.

In conclusion.....

A modern source control repository must support effective branching and merging in order to handle development of new versions of software whilst supporting current versions. The manner in which you branch will depend on your release cycle and the type of software that you produce. Picking the correct branching strategy for your project will have a direct impact on how effectively you can support your software, so take time to think about it and get it right.

Tuesday, June 09, 2009

Enable Unit Testing in BizTalk 2009

BizTalk 2009 has, for the first time, built-in developer support for testing schemas, maps and pipelines that is available for automated unit testing, rather than being hidden behind features in Visual Studio where it could only be used for manual testing.

Anyone interested in finding out more about BizTalk unit testing wouldn't go far wrong looking at Michael Stephenson's series on unit testing.

To use or not to use (the unit testing framework)

There has been a debate around the office lately regarding the unit testing features in BizTalk 2009. The debate goes something like this:

1. You don't want to have unit testing enabled on your production assemblies.
2. If you enable unit testing on your assemblies for test and then switch it off for release then you are releasing different (although generated) code.
3. Is the unit testing framework any good anyway?

There seems to be a consensus that the unit testing features in BizTalk 2009 aren't that good and we should do without them. However, I beg to differ! Premise #1 - that unit testing should not be enabled on production releases - is, I think, a false premise.

For example, let's look at unit testing of schemas in more detail. When you enable unit testing you actually change the base class from which your schemas derive, but this new base class in turn inherits from the SchemaBase class anyway. All that is added is a method called ValidateInstance which you hook into for your tests - all the rest of the implementation is the same, so to me there is no issue with using this in production code. It means that you can build in release mode and then run your unit tests against your production assemblies to examine their quality, which surely has to be a good thing!
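The shape of that argument can be sketched in Python (the real classes are .NET types, so SchemaBase, TestableSchemaBase and CustomerSchema below are just stand-ins for the idea, not the actual BizTalk API): the "testable" base class only adds a test hook, and everything the production code relies on is inherited unchanged.

```python
# Stand-in sketch of the BizTalk pattern: the "testable" base class only
# adds a test entry point; production behaviour is inherited unchanged.
# (These classes are illustrative, not the real .NET types.)
class SchemaBase:
    def xml_content(self):
        return "<xs:schema/>"  # what production code uses

class TestableSchemaBase(SchemaBase):
    def validate_instance(self, message):
        # The extra hook the unit testing framework calls into.
        return message.startswith("<")

class CustomerSchema(TestableSchemaBase):
    pass

schema = CustomerSchema()
print(schema.xml_content())                # production behaviour: unchanged
print(schema.validate_instance("<xml/>"))  # test-only extra entry point
```

Because the testable class is still a SchemaBase, shipping it in a release build changes nothing for production callers.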

Remember to set the configurations

When you set the deployment properties on your BizTalk project, you are probably developing in Debug mode, and you might not think to change the settings for Release. If you don't explicitly enable unit testing for all configurations, or specifically for Release, then by default your release assemblies will be built without unit testing enabled.

This is something that happened to us today. We are in the first iteration of a project, and we successfully put a build process in place right at the start; that build process packages the code into an InstallShield package and installs it onto a Continuous Integration environment. We were just getting the build to run the unit and integration tests and publish the results back so we could examine the build quality, when we started getting an error like this while building the test projects:

Cannot implicitly convert type 'XXXX.YYYY.Transforms.CustomerAccount_to_CustomerDetails' to 'Microsoft.BizTalk.TestTools.Mapper.TestableMapBase'

My first reaction was to check that the references in the unit test project were correct - which they were. I had copied the Microsoft.BizTalk.TestTools.dll assembly into a referenced assemblies folder. That was all OK. I then checked whether the referenced assembly was available on the build server. It was. In the end the issue was that we were building in Release mode and I had only enabled unit testing in Debug mode. Because of this, my schemas were inheriting from SchemaBase rather than TestableSchemaBase, my maps from MapBase rather than TestableMapBase, and so on. Whilst the schemas and maps projects all built OK, the build error appeared in the test project.

As soon as I enabled unit testing for all build configurations, everything built OK and I could go ahead and run my tests.

Test tools not available

This quickly led on to another issue. The build process always builds in Release mode, but it had been set up so that unit testing was not enabled. When I enabled unit testing the build was OK and all of the unit tests were passing; but we're doing continuous integration, so the build process, after the unit tests, includes a deployment step and then a second set of integration tests on the deployed system.

Unfortunately, once the build was making the assemblies testable, the deployment script failed (we use InstallShield to put the DLLs onto the file system and then use BTSTask to deploy them into BizTalk as a custom action). BTSTask was failing to deploy the new build because the reference to Microsoft.BizTalk.TestTools could not be resolved. I checked my development machine - it has BTS2009 Developer Edition, and Microsoft.BizTalk.TestTools was in the Global Assembly Cache. I then checked the build server, also running Developer Edition; Microsoft.BizTalk.TestTools was in the GAC there too. I then checked the integration server we are automatically deploying to - Microsoft.BizTalk.TestTools was missing from the GAC, as the developer tools were not installed.

I had to modify the build so that the test tools were deployed into the GAC, and then the assemblies would deploy OK. This goes right back to point #1 above: if the core BizTalk 2009 install does not include the test tools, should we be putting them on there anyway, or using a different method of unit testing that does not rely on the test tools? On the other hand, if the test tools are there out-of-the-box, shouldn't we be using them?

The debate rumbles on.......

Wednesday, June 03, 2009

Some funnies with creating and writing to event logs

Today I was working with my project team on tracing and diagnostics, and we made a decision to move all of our event sources into a new application log.  So, I changed my installer to create a new application log, removed my old application sources and registered my new application sources.  

I then made sure that the BizUnit tests were changed to look for the event log entries in the correct event log and ran the regression suite and guess what - they all failed!  I looked for the events and found that my new event log was there but pitifully empty and instead the events were still being written into the Application log.  

I then checked my registry keys in HKLM/SYSTEM/CurrentControlSet/Services/EventLog and they were all OK as well, all pointing to the correct event log.

I was also wondering if this was a Windows 2008 thing, because the event log infrastructure has changed quite a bit in there as well.

I did a bit of rooting around and found some good thoughts out there, including a forums post here that points to some support info from Microsoft, but that didn't really take me anywhere because I was already doing what seemed to be the right thing.

In the end it was much more mundane.  If you have removed an event source and registered it onto a different event log you need to reboot.  When the machine comes back up again all the events go into the right place. 

Tuesday, June 02, 2009

Gotcha with the BizTalk Pipeline Component Wizard

When I was in the BizTalk equivalent of "short trousers" I put lots of my integration functionality inside orchestrations (as the training courses suggest).  As I got further under the bonnet, especially when performance was critical or, more recently, when considering asynchronous ESB patterns, I have tried to put more functionality into the messaging infrastructure when designing my BizTalk architectures.  

The key to unlocking the messaging engine is pipelines, where you can manipulate messages and message context without having to fire up the orchestration engine.  However, when I first started to write my own pipeline components I mainly based them on the SDK samples.  This was OK, but there was a lot of cut-and-paste involved.  It was a blessing to get hold of the BizTalk Pipeline Component Wizard from CodePlex and use it as a software factory for creating pipeline components.

So there I was, merrily making pipeline components with the pipeline component wizard, and all seemed OK - until I created a new project with the wizard and got this....

So, I created a solution and then created some solution folders to help organise things, and then I right-clicked on a solution folder and chose the project template for PLCW:

And then I put in some options.

So, after this I hit the finish button and:

The actual text is as follows:

System.ArgumentException: Value does not fall within the expected range.

   at EnvDTE.Projects.Item(Object index)

   at MartijnHoogendoorn.BizTalk.Wizards.PipeLineComponentWizard.BizTalkPipeLineWizard.CreateProject(Solution mySolution)

   at MartijnHoogendoorn.BizTalk.Wizards.PipeLineComponentWizard.BizTalkPipeLineWizard.CreateSolution(_DTE IDEObject, Object[] ContextParams)

   at MartijnHoogendoorn.BizTalk.Wizards.PipeLineComponentWizard.BizTalkPipeLineWizard.Execute(Object Application, Int32 hwndOwner, Object[]& ContextParams, Object[]& CustomParams, wizardResult& retval)

This is a fully repeatable issue, and here is how to work around it: right-click on the solution itself and add the project to the solution directly.  If you do this it works like a dream.  Once the project has been created you can then move it under the correct solution folder.

I am assuming that if you are reading this, the chances are you have done a search on Google after running into this issue - so there's your answer.  Happy pipelining!

Monday, June 01, 2009

Some funnies with BizTalk property schemas

I haven't blogged for a couple of weeks, and I have been doing loads of interesting stuff recently, but so little time to blog about it.

So, just as a little nugget: I have been creating an ESB solution using BizTalk (mainly messaging) and I have created a property schema that I want to use for routing in my ESB.  Because I wanted to share my property schema I put it into a separate assembly.  Let's say its default namespace and assembly name are of this form:

MyCompany.MyApplication.Integration.Schemas.Routing
Let's say that I have a single property in the property schema (out of pure laziness, because I'm replaying this in a sample app) and have called it Property1.  I signed and then built the assembly.  So far so good.  If I want to use the property it will be MyCompany.MyApplication.Integration.Schemas.Routing.Property1.

I then created another schema assembly and started to add schemas to it.  Again, let's say that the default namespace of the assembly is of this form:

MyCompany.MyApplication.Integration.Schemas
I added a schema to the file with a couple of fields under the root element.  I signed and built it.  So far so good.

I then referenced my property schema assembly and promoted some fields by referencing the property schema from the referenced assembly.  Again, so far so good, it all builds.

I then decided that I wanted to shorten the namespace of the property schema (as it is to be shared).  Let's say that I want the property to be MyApplication.Property1.  In order to achieve this I click on the property schema file, view the properties, and change the namespace to MyApplication.  I then try to build the project and - BANG - I get the following error:

Error 1 The type or namespace name 'Routing' does not exist in the namespace 'MyCompany.MyApplication.Integration.Schemas' (are you missing an assembly reference?) C:\_projects\Solutions\Integration.Schemas.Routing\Integration.Schemas\Schema1.xsd.cs 43 68 Integration.Schemas

This may not be too unexpected, because I hadn't changed the reference to the property schema in my schema, so I removed the reference and re-added it.  When I re-add the property schema everything seems to be OK - but I still get the same error!  The property schema builds OK and can be referenced from the schema OK, yet the schema project still fails to build.

Ho-ho, I thought.  What's going on here?  I then tried changing the default namespace of the assembly to be the same as the namespace of my file.  Same error.  I then tried changing the default and file namespaces to something completely different (Badgers) and then it built!

After some chasing up, it appears that there is a known issue with BizTalk references that means you can't use part of an existing namespace as the namespace of a referenced schema, because the generated code can't then resolve the reference.  (This looks like standard C# name lookup: inside the MyCompany.MyApplication.* namespace, the identifier MyApplication binds to MyCompany.MyApplication first, so MyApplication.Property1 cannot be found.)  You either have to keep the original namespace, fully qualified from the root namespace as per your naming conventions, or you have to choose something that will not clash at all.  You can't abbreviate with just part of the namespace.  That's just the way it is.  If you're reading this and it's saved you from tearing out some of your hair then let me know :)