The Perfect Build Part 2: Version Control by Matt Wrock

Over a year ago, my team was using Visual Source Safe (VSS) for version control. I’m not going to spend time discussing why we wanted to migrate away from VSS. The reasons should be obvious to anyone who has worked with VSS and especially to those who have worked with both VSS and another source control system like SVN, CVS or GIT.

We decided to migrate to Subversion. At the time there were only two options we were considering: TFS and Subversion (SVN). We chose SVN because it seemed much lighter than TFS, it had wide industry adoption, another business unit in our organization had been happily using it for years, and I could not find anything stating that SVN sucks. If I were making the same decision today, I would probably choose GIT, especially since shortly after our SVN migration we spawned a team in Vietnam that is somewhat bandwidth constrained.

The Source Safe repository we had been using had been in existence for over eight years and had survived several managerial “dynasties.” In short, the repository structure was completely disorganized. We did not intend to simply transplant our VSS structure to SVN. We wanted to use the migration as an opportunity to reorganize our repository into something much more usable that would support a smooth code promotion process. As a side effect of this decision, we also had no intention of migrating our VSS history to the SVN repository, although there are tools that will assist in doing this. Our plan was to keep the VSS repository alive in a read-only state as a reference for project history. As new development progressed in the new SVN repository, the need to refer to VSS would become more and more infrequent.

Structure Requirements

My development team consists of a few sub teams (2 to 4 developers each) that each build and maintain multiple applications. We also have common libraries that are shared across teams. As we planned our structure, we had the following requirements:

  1. We wanted to have a structure that would allow us to commit common code to a single trunk so that when other teams checked out the trunk, they would be guaranteed to have the latest code.
  2. We wanted checkouts to be simple where developers could get everything they need by doing a single checkout of the trunk.
  3. We wanted to be able to lock down an app at a certain point in its release cycle so that changes to common code from another team would not propagate to that app while it was in final QA.
  4. We wanted to have snapshots of every app at each release.

 

“Suggested” SVN Structure

Many blogs and the SVN Book suggest a structure that looks like the following:

/projA/trunk
/projA/branches
/projA/tags
/projB/trunk
/projB/branches
/projB/tags

I’m sure this structure works great in the majority of cases where you have a single dev team, are working on a single application, or have multiple teams with strictly partitioned code bases. However, it would be awkward to apply it to our environment. How would you handle shared libraries here?

Our Structure

/prod
/prod/app1/20090903
/prod/app1/20090917
/prod/app2/20090903
/sandbox
/sandbox/users/mwrock
/staging
/staging/app1
/staging/app2
/trunk
/trunk/libraries
/trunk/desktop
/trunk/docs
/trunk/services
/trunk/sql
/trunk/testing
/trunk/thirdparty
/trunk/web

Here we have the following root folders:

  • Trunk: This holds the latest code revisions suitable for integration into the mainline. There is a single trunk shared by all projects and teams. Developers only have to check out this single folder and they have everything they need.
  • Sandbox: These are personal development areas used for branching long running changes that you want to keep separate from the trunk until they are ready to be merged back.
  • Staging: This is the final QA/UAT area. The trunk is copied here once development is thought to be stable and ready for final testing. This protects the release from development committed to the trunk by other teams. When a release is in this stage, you do not want unknown commits from someone else entering your code base.
  • Prod: This contains production releases. Each application has its own folder under prod and each release has a folder named after the date of its release. The staging branch is copied to these release tags upon deployment and they represent a snapshot of the code at the time of release. The prod area is a historical record of exactly what was released and when. The example commands just after this list show how a release flows through these areas.
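
As a rough illustration of the promotion flow (the repository URL and the date here are hypothetical), moving an app to staging and then tagging a production release are just server-side copies:

svn copy https://svn.example.com/repo/trunk https://svn.example.com/repo/staging/app1 -m "Copy trunk to staging for app1 final QA"
svn copy https://svn.example.com/repo/staging/app1 https://svn.example.com/repo/prod/app1/20090917 -m "Tag app1 production release"

If staging/app1 already exists from a previous release, you would delete or update it before copying. Because svn copy is a cheap metadata operation, these staging copies and prod tags add almost nothing to repository size.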

Limitations

This structure has worked great for us. However, no structure is perfect for everyone. For instance, if you have separate teams working on applications with no shared code that are completely unrelated, it may be better to separate the code bases of such teams into completely separate repositories. Another limitation we have run into is with offshore development. We have a team in Vietnam that has a much thinner bandwidth pipe, and it takes a couple of hours for them to check out the entire trunk. Admittedly, my personal experience with SVN is not vast and there are likely better ways to organize a repository for our team. But the fact remains that this has worked well for us and has tremendously improved the state of our configuration management.

In my next post in this series, I will discuss how our integration server interacts with our repository and automates builds and deployments.

The Perfect Build Part 1 by Matt Wrock

A year ago, my team was using Visual Source Safe as our version control repository and our builds were built and deployed manually using sophisticated tools like Windows Explorer and Remote Desktop. This was not a happy time.

  • VSS was awfully slow and practically impossible to work with remotely.
  • VSS branching, tagging and merging capabilities were primitive at best.
  • Our manual deployments were not scripted and therefore not easily repeatable, and certainly not guaranteed to be repeatable.
  • Deployments were inadequately documented and it was often impossible to know exactly what was deployed and when.
  • Complicated deployments were time consuming, error prone and just plain stressful for the build engineer.

Well all of this has changed now. We migrated to Subversion for source control and implemented CruiseControl and NANT as an integration server and deployment solution. Looking back on this migration, I can honestly say that it represents the biggest leap forward we have made in terms of process improvement. This is the first in a series of blog posts where I will document this process in detail. I will outline how we tackled the following:

  • Repository structure: We did not want to merely copy our VSS repository tree. We wanted to create something fundamentally more useful.
  • Generic NANT scripts to handle Visual Studio Projects and Solutions
  • Managing configuration files in different environments (dev, staging and production)
  • Automated tagging of production deployments
  • CruiseControl organization

Some may say, "It's only source control, what's the big deal?" Well, our source is our product. Any inability to produce our product or any risk to the integrity of our product is a big deal. As developers, we interact with source control throughout every working day and it is imperative that these interactions are simple, reliable and valuable. If there is waste in terms of time or lost code at this layer of the development process, it's like a death by a thousand cuts to the whole dev team.

This was a migration that I took very seriously. Like most software shops, we are always very busy and juggle lots of projects at the same time. This migration provided no direct revenue increase to our business and was always difficult to prioritize. I was often asked, "Why don't we just migrate now? Just have Joe Jr. Developer port the code from VSS to SVN and be done with it." This was something I really wanted to think through, research and do right. I knew it was not going to be a "set and forget" operation. It needed to be planned, implemented, enforced and babysat. My team, including myself, had been using VSS for years and was not familiar with the branch/merge paradigm of SVN. I knew we needed to understand these concepts and use them to their fullest. I also knew that developers were going to experience hiccups with using SVN and its related tools. I needed to educate myself and make myself available as a mentor of this technology even though I was a novice. There was no budget to bring in a consultant.

So I researched on the web, I talked to colleagues with experience using SVN in varying environments, and I mapped out my own custom solution. My tech department is composed of different teams that produce and maintain several different applications: web based, desktop and background services. We also have some common libraries shared and maintained by all teams, and we share third party tools and libraries, some of which are open source and require constant updates to new versions. We needed a solution that would allow us to build separate applications maintained by separate teams with code shared across the organization. We needed a setup that would allow developers to commit code to a centralized Trunk that had all of the most recent changes but would present these changes to different applications in a controlled manner.

I believe we have achieved exactly this and more. So what do we have now?

  • We have a source control repository that informs us of exactly what is on an individual dev's environment, on our integrated dev box, on staging and on production.
  • We have a setup that allows us to see exactly what code was deployed on which application on which date.
  • Commits to SVN are automatically built and deployed to our integration dev server
  • Commits to staging are automatically built and deployed to our staging servers.
  • With a click of a button in CruiseControl, an application can be deployed to our production server farms
  • Each environment can have its own versioned configuration
  • Rolling back to a previous deployment is trivial

My next post in this series will focus on the SVN structure we developed and how it partitions our different environments and applications but ensures that each new build incorporates the latest common code.

Debugging Windows Services in Visual Studio by Matt Wrock

One challenge involved in developing Windows service applications is debugging them in Visual Studio. I don't know why Visual Studio does not provide better support for this, but I've seen some creative techniques employed to make debugging Windows services possible. One popular method is to put a really long Thread.Sleep(30000) in the OnStart event, then install the service, start it and attach the debugger to the service's process, hoping that it will take less than 30 seconds to start it, find the process and attach.

There is a better way and it is quite trivial.

There is one prerequisite: make sure you do not have service logic in your OnStart() method. This turns out to be a good thing either way. I've seen 100 line service OnStart methods that put a good deal if not all of the logic into this one method. Well, what if you want to reuse this logic in a console, WPF or WinForms app? It's not a very flexible methodology. I believe that a windows service project should never contain more than your installer and entry point class, and the entry point class should only handle start, stop and pause by calling into a completely separate assembly. However, before over philosophizing this, let's quickly jump into how to set up your service project so that it can be easily debugged in VS.

Follow these steps:

  1. Change your Main() entry point method to Main(string[] args)
  2. Change the Output Type property of the service project from Windows Application to Console Application
  3. Your Main(string[] args) method should look something like:
        static void Main(string[] args)
        {
            if (args.Length > 0)
            {
                MyApp app = new MyApp();
                app.Run();
            }
            else
            {
                ServiceBase[] ServicesToRun;
                ServicesToRun = new ServiceBase[]
                {
                    new Service1()
                };
                ServiceBase.Run(ServicesToRun);
            }
        }

Finally, in the Debug section of the project properties, provide a command line argument (any value will do; its presence is what triggers the console code path above).

Your OnStart() should contain the same app startup code as your Main method:

        protected override void OnStart(string[] args)
        {
            MyApp app = new MyApp();
            app.Run();
        }

That's it. Now hit F5 and you will see a command window pop up and all of your break points should be recognized. MyApp contains the meat of the logic. Main and OnStart are just dumb harnesses.
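
To make the harness/worker split concrete, here is a minimal sketch of what a class like MyApp might look like. The class name, the Stop() method and the background thread are illustrative assumptions, not a required pattern:

using System.Threading;

// Lives in a separate class library so the same logic can be reused
// from a console app, a WPF/WinForms app or the service harness.
public class MyApp
{
    private Thread _worker;
    private volatile bool _stopRequested;

    // Called by both Main() and OnStart(); starts the real work on a
    // background thread so the caller can return quickly.
    public void Run()
    {
        _worker = new Thread(DoWork);
        _worker.Start();
    }

    // Called from the service's OnStop().
    public void Stop()
    {
        _stopRequested = true;
        if (_worker != null) _worker.Join();
    }

    private void DoWork()
    {
        while (!_stopRequested)
        {
            // ... the actual service logic goes here ...
            Thread.Sleep(1000);
        }
    }
}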

 

 

Debugging Managed Production Applications with WinDbg by Matt Wrock

Yesterday our issue tracking software was hanging and the vendor was not responding to our ticket requests (they are nine hours ahead of us). The application is a .NET application so I decided to capture a crash dump and dive in with windbg. I have a love/hate relationship with windbg. I love it because it provides a vast wealth of information, virtually telling me EVERYTHING that's going on with my process. It has saved my behind several times. I hate it because I don't use it frequently enough to have all of the cryptic commands memorized and often have to relearn and re-research the commands I need to use in order to solve my problem. Windbg is not for the faint of heart. There is no drag and drop here. But if you have an app bugging out on a production server and don't want to attach a debugger to it, windbg is the tool for you.

This post is an adaptation of a document I created for my team and me a few years back. I use it like a cheat sheet to help me get started quickly.

When servers start crashing and/or hanging in production, often the only recourse you have is to capture a memory dump of the ailing process and analyze it using Microsoft's native debugger, WinDbg. Without this tool, you may just be shooting in the dark. These techniques can be applied not only to web applications but to any application, managed or unmanaged.

A memory dump will allow you to see everything going on in the captured process: executing threads and how long each has been running, stack traces of all threads and even the values of parameters passed to functions. Memory dumps can also be used to troubleshoot memory leaks, allowing you to see what is in the heap.

A word of caution is in order: windbg is a pain to use. At least that has been my experience. There is almost no documentation included and the commands are very unintuitive, and this is compounded by the fact that you (hopefully) rarely use it.

There are three basic steps to this process:

  1. Preparing the debugging environment on the problem server.
  2. Actually capturing the dump while your process is crashing or has crashed.
  3. Analyzing the dump in windbg.

 

Preparing the Debugging Environment

There are a few steps to complete to get the server ready:

  1. Install Microsoft’s Debugging toolkit. Get the latest version at http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx. Note that there is a 32 bit and 64 bit version. If you are running on a 64 bit server but you have a managed app that is compiled for 32 bit, you will need to use the 32 bit version of windbg to debug.
  2. Create an environment variable for the path to the symbol files (.pdb files that contain information mapping native instructions to function calls). Create a system environment variable called _NT_SYMBOL_PATH with the value: C:\symbols\debugginglabs*http://msdl.microsoft.com/download/symbols;C:\symbols\debugginglabs;C:\Program Files\Microsoft.Net\FrameworkSDK\symbols;C:\windows\system32 (one way to set this from the command line is shown just after this list).
  3. Copy sos.dll from the Microsoft.net directory to the same directory where you installed the debugging toolkit. This file provides extensions to windbg for analyzing managed code.
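
For example, on servers where the setx utility is available (it is built into Windows Vista/Server 2008 and later), the symbol path from step 2 can be set from an elevated command prompt:

setx _NT_SYMBOL_PATH "C:\symbols\debugginglabs*http://msdl.microsoft.com/download/symbols;C:\symbols\debugginglabs;C:\Program Files\Microsoft.Net\FrameworkSDK\symbols;C:\windows\system32" /M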

 

Capturing a Memory Dump

This step can be a bit tricky depending on the circumstances of your crashing behavior. There are typically three ways to determine that the process is in trouble:

  1. Call up a test URL to see if the app has crashed or is hanging
  2. Use Task Manager to see if the CPU is pinned
  3. Use Performance Monitor and look for queueing threads. If threads are queueing, that means that all available .net worker threads are busy which usually means something is wrong.

Once you have determined that the process has crashed, bring up a command prompt and navigate to the directory where you downloaded the debugging toolkit. Next type:

adplus.vbs -hang -pid [process ID of problem process]

If more than one worker process is running and you are not sure which one is causing problems, repeat the above command for each process.
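
If you are not sure which processes those are, tasklist can list them. The filter below assumes ASP.NET running in IIS 6 or later application pools (w3wp.exe); on IIS 5 the worker process is aspnet_wp.exe instead:

tasklist /FI "IMAGENAME eq w3wp.exe"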

This command will launch windbg in a separate window to load the process information. Just let it run and it will close when it completes.

Analyzing the Dump

  1. Open windbg.exe which is inside the directory that you extracted the debugging toolkit to.
  2. Go to File/Open Crash Dump and find the dump (.DMP) file you just captured. It will be in a subfolder of the debugging toolkit directory.
  3. Type .load sos.dll to load the managed code extensions.
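
If the sos.dll you copied does not match the runtime version captured in the dump, a commonly used alternative (assuming a .NET 2.0/3.x process, where the runtime module is mscorwks) is to have the debugger load SOS from the same framework directory the dump references:

.loadby sos mscorwks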

 

You are now ready to start troubleshooting. Below are some commands I commonly use to get useful information. At the end of this document are some links to some MS white papers with more detailed information on performance debugging.

Listing all threads and how long they have been running

!runaway

Note the thread IDs of any particularly long running threads. If you have several threads that have been running for minutes, that could point to a never ending loop that is eating CPU or just a long running background thread.

Listing Managed Threads

!threads

There are several noteworthy tidbits here:

Lock Count: If this is greater than 0, it means that the thread is waiting (blocking) on another thread. For instance, it might be waiting for a DB query to come back or a response from a socket. If you have a bunch of these, it could be a tip that there is a bad query. See below on how to get the call stack of an individual thread to see exactly what it is doing.

Domain: This is the address of the app domain that the thread is running in. This is very helpful if you have several web sites running in the same worker process. Once you find the problem thread(s), you can use this to see which web app is causing the problem. Keep in mind that all ASP.NET worker processes have a “default” app domain used for launching new app domains (there is one per web app) and handling GC.

Determine which Web Application a thread is running in

!dumpdomain [address]

This dumps a list of assemblies loaded into the domain which should tip you off as to which web app it is running in.

Get summary information on the Threadpool

!threadpool

This tells you how many threads are free/in use and what the CPU utilization was at the time of the capture.

Get the stack trace of a single thread including passed parameters

~[thread id]e !clrstack -p

Get the thread ID from !threads or use “*” to get a dump of ALL threads.
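
For example, if !runaway and !threads point at thread 12 (a made-up thread number for illustration), its managed stack with parameter values would be:

~12e !clrstack -p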

Get detailed information on an object

!dumpobj [address]

This gives info on all fields in the object.

More Resources

http://msdn.microsoft.com/en-us/library/ms954594.aspx
This is an old link but has good and thorough information.

http://blogs.msdn.com/tess/ This is Tess Ferrandez's blog. She has tons of great posts on this subject and also on analyzing memory leaking problems.

 

Web Site Performance - It's not just your code by Matt Wrock

I'm a software developer. By default, I tend to focus on optimizing my code and architecture patterns in order to tune performance. However, there are factors that lie well outside of your application code and structure that can have a profound impact on web site performance. Two of these factors that this post will focus on are the use of a CDN and IIS compression settings.

Over the last month, my team and I have contracted a CDN and tweaked compression settings, resulting in a 40% improvement in average page load time outside of our immediate geographical region!

Using a CDN for static content

A CDN (content delivery network) is a company that has data centers distributed over the globe. Some of the dominant players are Akamai, CDNetworks and Limelight. These have many data centers dispersed over the United States as well as several scattered internationally. Chances are they have servers that are located much closer to your users than your own servers. You may not think this is a big deal. What's a couple hundred milliseconds to load an image? Well, if your site has a lot of images as well as css and javascript (jquery alone packs a hefty download size), the boost can be significant. Not only will average load time improve, but you will be less susceptible to general internet peaks and jitters that can potentially cause significant and annoying load delays.

One of the disadvantages of using a CDN is cost. You will need to work with a commercial CDN if you want to have your own images and other static content that you have authored to be hosted via CDN. However, several popular frameworks are freely hosted on the CDNs of large internet companies. For example just this week Scott Guthrie announced that Microsoft will be hosting jQuery and the asp.net AJAX libraries. You can also use Google's CDN to serve jQuery. I use Yahoo's YUI ajax libraries and they can be served from Yahoo. Another cost related consideration is that the bandwidth you would be paying to serve this content yourself is now absorbed by the CDN.
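
For example, referencing Google's hosted copy of jQuery is a one-line change to your pages (the version number below is just an example; use whichever version your site targets):

<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js"></script>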

One other disadvantage that I often hear related to CDNs is that the control and availability of the content is now at the mercy of the CDN. Well I personally feel much more comfortable using Google's vast network resources than my own.

Migrating to a CDN for the hosting of your own media is fairly transparent and painless, especially if you already host all of this content on its own domain or sub domain (e.g. images.coolsite.com). The CDN will assign you a DNS CNAME and you will add that to your domain's zone file. Now all requests for static content will go to the CDN's servers. If they do not have the content, they will go to your server to get it, and all subsequent requests will be served from the CDN's cache. You can specify what the cache expiration will be and you should also be able to manually flush the cache if you need to.
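
The resulting zone file entry looks something like this (the CDN hostname here is made up; your provider will supply the real one):

images.coolsite.com.    3600    IN    CNAME    images.coolsite.cdnprovider.example.net.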

One other perk that most commercial CDNs provide is detailed reporting on bandwidth and geographical data telling where your content is being requested from.

Our servers are based in California's Silicon Valley and we realized a 30% performance boost in the midwest and eastern United States. It should also be noted that our pages are very image light. So a site that has lots of rich media has even more to gain.

Compress Everything

I had always been compressing static content but not dynamic content. I think I was scared off by the warnings that the IIS documentation gives regarding high CPU usage. Yes, compression does incur CPU overhead, but with today's CPU specs, chances are that it is not significant enough to keep you from turning on this feature. Our servers tend to run at about 5 to 10% CPU. After turning on dynamic content compression, I saw no noticeable CPU increase but I did see a 10% increase in page load performance. All of this is free and took me just a few minutes to configure. In fact, it is better than free: you will save on bandwidth.

I did do some poking around the web for best practices and found that it is worthwhile to tweak the default compression levels in IIS. Here is a good blog article that goes into detail on this topic. To turn on static and dynamic compression at the ideal levels on IIS 7, I issued these commands:

C:\Windows\System32\Inetsrv\Appcmd.exe set config -section:urlCompression -doStaticCompression:true -doDynamicCompression:true
C:\Windows\System32\Inetsrv\Appcmd.exe set config -section:httpCompression -[name='gzip'].staticCompressionLevel:9 -[name='gzip'].dynamicCompressionLevel:4

 

Tools for analyzing page load performance

Here are a few nice tools I use to observe and analyze web site performance:

  1. Firebug. A must-have Firefox plugin that will tell you how long each resource takes to load.
  2. An external monitoring service. I use Gomez. This will not only tell you how long it takes for your site to load, but it can monitor from around the globe and provide very rich and detailed reporting. I have several alerts configured that page me if my site is taking too long to load or is broken. "Broken" can mean 500 errors, Server too busy errors, non responsive servers due to server outages or bad DNS or it can even mean a failed pattern match of strings expected on the page.
  3. YSlow. This is a Firefox plugin from Yahoo that works with Firebug and analyzes several key indicators on your site. It examines your headers, caching, javascript, style sheet usage and much more, and then gives you a detailed list of possible improvements you can make.

 

So if you feel that you have done all the code tweaking you can to get the most performance from your site, think again and take a look at these tools to see how the outside world is experiencing your content.