www.orangelightning.co.uk makes 40% improvement in page load performance using RequestReduce / September 22, 2011 by Matt Wrock

This week I worked with Phil Jones (@philjones88) of www.orangelightning.co.uk to get RequestReduce up and running on his web site hosted on AppHarbor. There were a couple of issues specific to AppHarbor’s configuration that prevented RequestReduce’s default configuration from working. It’s actually a fairly typical situation: their load balancers forward requests to their web servers on different ports. RequestReduce then assumes that the site is publicly accessible on this nonstandard port, which it is not, and things quickly stop working. It was easy to work around this and, by doing so, I was able to make my app all the more accessible.
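To make the port problem concrete, here is a minimal sketch in Python. It is not RequestReduce's actual code; the function names, the host name and the port 16000 are all made up for illustration, and it assumes the load balancer passes the client's Host header through unchanged.

```python
def naive_base_url(scheme, host_header, server_port):
    """Build a base URL from the port the web server itself is listening on."""
    hostname = host_header.split(":")[0]
    default_port = 443 if scheme == "https" else 80
    if server_port == default_port:
        return f"{scheme}://{hostname}"
    # Behind a load balancer this leaks the internal port into public URLs.
    return f"{scheme}://{hostname}:{server_port}"


def public_base_url(scheme, host_header):
    """Build a base URL from the Host header, which reflects what the client asked for."""
    return f"{scheme}://{host_header}"


# The web server listens on a hypothetical internal port 16000,
# but visitors reach the site on the default port 80.
internal = naive_base_url("http", "www.example.com", 16000)
public = public_base_url("http", "www.example.com")
print(internal)
print(public)
```

URLs built the first way point at a port that is not reachable from outside, which is the kind of failure described above.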

So now that Phil has orangelightning up and running on RequestReduce, their Google Page Speed score went from 81 to 96 and their YSlow grade went from a low B at 83 to a solid A at 95.

The total number of HTTP requests was cut roughly in half, from 13 to 6, and the page size dropped from 93K to 54K.

And of course the bottom line is page load times. Using http://www.webpagetest.org, I tested from the Eastern United States (orangelightning is in the UK) over three runs. Here are the median results:

With RequestReduce

Without RequestReduce

RequestReduce is free, requires very little effort to install and supports both small blogs and large multi-server, CDN-based enterprises. You can download it from http://requestreduce.com/ or even easier, simply enter:

Install-Package RequestReduce

from the NuGet Package Manager Console right inside Visual Studio. Source code, a wiki with thorough documentation and bug reporting are available from my GitHub page at https://github.com/mwrock/RequestReduce.

RequestReduce now fully compatible with AppHarbor and Medium Trust hosting environments. / September 20, 2011 by Matt Wrock

Now even more sites can take advantage of automatic CSS merging and minification as well as image spriting and color optimization with no code changes or directory structure conventions.

This week I rolled out two key features which make RequestReduce’s core functionality compatible with some popular hosting environments. In a nutshell, here is what has been added:

Support for web server environments behind proxies. No extra configuration is needed. It just works.
Full support for AppHarbor applications. If you have not heard of AppHarbor, I strongly encourage you to check it out. It ties into your Git repository and automatically builds and deploys your Visual Studio solution upon git push.
RequestReduce now runs in Medium Trust environments such as GoDaddy. There are some features that will not work here such as image color and compression optimizations and other multi server synchronization scenarios, but the core out of the box functionality of CSS merging, minification and on the fly background image spriting will work in these environments.

And as from the beginning, RequestReduce will run on ANY IIS hosted environment including ASP.NET Web Forms, all versions and view engines of MVC, Webmatrix “Web Pages” and even static html files.

So download the latest bits from www.RequestReduce.com or simply enter:

Install-Package RequestReduce

from the NuGet PowerShell console to get these features added to your site with no change to your code, almost no configuration and no rearranging of files and stylesheets into arbitrary folder conventions. As long as your background images are marked no-repeat and have explicit widths in their class properties, RequestReduce does all of the tedious work for you on the fly and makes sure that these resources have far future Expires headers and multi-server friendly ETags, allowing browsers to properly cache your content.
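As a rough illustration of what "far future Expires headers and multi-server friendly ETags" can mean, here is a minimal Python sketch. It assumes the ETag is derived from a hash of the content, which is one common way to get the same ETag on every server; I am not claiming this is how RequestReduce computes its headers, and `caching_headers` is a hypothetical helper.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(content, now, days=365):
    """Return a far-future Expires header and a content-derived ETag."""
    # Hashing the bytes means every server produces the same ETag for the same file.
    etag = '"' + hashlib.sha256(content).hexdigest()[:32] + '"'
    # usegmt=True requires a timezone-aware UTC datetime and emits an HTTP-style date.
    expires = format_datetime(now + timedelta(days=days), usegmt=True)
    return {"ETag": etag, "Expires": expires}


now = datetime(2011, 9, 20, tzinfo=timezone.utc)
server_a = caching_headers(b".nav { color: red; }", now)
server_b = caching_headers(b".nav { color: red; }", now)
changed = caching_headers(b".nav { color: blue; }", now)
print(server_a)
```

Two servers holding identical generated CSS agree on the ETag, while any change to the content yields a new one, so a browser never keeps serving a stale file under the old tag.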

Do you need multi server synchronization and CDN support? RequestReduce has got you covered.

Resolving InvalidCastException when two different versions of Structuremap are loaded in the same appdomain / September 19, 2011 by Matt Wrock

Last week I was integrating my automatic css merge, minify and sprite utility, RequestReduce, into the MSDN Forums and search applications. Any time you have the opportunity to integrate a component into a new app, there are often new edge cases to explore and therefore new bugs to surface since no app is exactly the same. Especially if the application has any level of complexity.

The integration went pretty smoothly until I started getting odd Structuremap exceptions in the search application. I had never encountered these before. I had a type that was using the HybridHttpOrThreadLocalScoped Lifecycle and when Structuremap attempted to create this type I received the following error:

System.InvalidCastException: Unable to cast object of type 'StructureMap.Pipeline.MainObjectCache' to type 'StructureMap.Pipeline.IObjectCache'

Well that’s odd since MainObjectCache implements IObjectCache. This smelled to me like some sort of a version conflict. The hosting application also uses Structuremap and uses version 2.6.1 while my component RequestReduce uses 2.6.3. I use ILMerge to merge RequestReduce and its dependencies into a single dll - RequestReduce.dll. While NuGet does make deployment much simpler, I still like having just a single dll for consumers to drop into their bin.

Unfortunately, searching online for this exception turned up absolutely nothing; so I turned to Reflector. The exception was coming from the HttpContextLifecycle class and it did not take long to track down what was happening. HttpContextLifecycle includes the following code:

public static readonly string ITEM_NAME = "STRUCTUREMAP-INSTANCES";

public void EjectAll()
{
    FindCache().DisposeAndClear();
}

public IObjectCache FindCache()
{
    IDictionary items = findHttpDictionary();

    if (!items.Contains(ITEM_NAME))
    {
        lock (items.SyncRoot)
        {
            if (!items.Contains(ITEM_NAME))
            {
                var cache = new MainObjectCache();
                items.Add(ITEM_NAME, cache);

                return cache;
            }
        }
    }

    return (IObjectCache) items[ITEM_NAME];
}

public string Scope { get { return InstanceScope.HttpContext.ToString(); } }

public static bool HasContext()
{
    return HttpContext.Current != null;
}

public static void DisposeAndClearAll()
{
    new HttpContextLifecycle().FindCache().DisposeAndClear();
}

protected virtual IDictionary findHttpDictionary()
{
    if (!HasContext())
        throw new StructureMapException(309);

    return HttpContext.Current.Items;
}

It’s ITEM_NAME that is the culprit here. This is a static readonly field that is the key to the object cache stored in the HttpContext. There is no means to change or override it, so whichever version of Structuremap is the first to create the cache, the other version will always throw an error when retrieving it. While both will store an IObjectCache, they will be different versions of IObjectCache and therefore different types altogether, which leads to an InvalidCastException when one tries to cast to the other.
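The collision is easier to see in a stripped-down model. The following is a sketch in Python rather than C#, and it only imitates the situation: `load_library` is a hypothetical stand-in for loading one copy of the Structuremap assembly, and an `isinstance` check plays the role of the cast. Each "copy" defines its own classes with identical names, and both copies share one per-request dictionary and one key.

```python
ITEM_NAME = "STRUCTUREMAP-INSTANCES"


def load_library(version):
    """Simulate loading one copy of a library; each copy gets its own class objects."""
    class IObjectCache:
        pass

    class MainObjectCache(IObjectCache):
        pass

    return {"version": version,
            "IObjectCache": IObjectCache,
            "MainObjectCache": MainObjectCache}


def find_cache(library, items):
    """Mimic FindCache: create the cache under the shared key, then 'cast' it."""
    if ITEM_NAME not in items:
        items[ITEM_NAME] = library["MainObjectCache"]()
    cache = items[ITEM_NAME]
    # Same class name, different class object: the check fails across copies.
    if not isinstance(cache, library["IObjectCache"]):
        raise TypeError("Unable to cast MainObjectCache to IObjectCache")
    return cache


host_copy = load_library("2.6.1")     # the hosting application's copy
merged_copy = load_library("2.6.3")   # the copy merged into the component
request_items = {}                    # stands in for HttpContext.Current.Items

find_cache(host_copy, request_items)  # the first caller creates the cache
try:
    find_cache(merged_copy, request_items)
    collided = False
except TypeError:
    collided = True
print(collided)
```

Whichever copy runs first owns the entry, and the other copy finds an object whose type it does not recognize even though the names match.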

The workaround I came up with was to create a new class that has the same behavior as HttpContextLifecycle but uses a different key:

public class RRHttpContextLifecycle : ILifecycle
{
    public static readonly string RRITEM_NAME = "RR-STRUCTUREMAP-INSTANCES";

    public void EjectAll()
    {
        FindCache().DisposeAndClear();
    }

    protected virtual IDictionary findHttpDictionary()
    {
        if (!HttpContextLifecycle.HasContext())
            throw new StructureMapException(309);

        return HttpContext.Current.Items;
    }

    public IObjectCache FindCache()
    {
        var dictionary = findHttpDictionary();
        if (!dictionary.Contains(RRITEM_NAME))
        {
            lock (dictionary.SyncRoot)
            {
                if (!dictionary.Contains(RRITEM_NAME))
                {
                    var cache = new MainObjectCache();
                    dictionary.Add(RRITEM_NAME, cache);
                    return cache;
                }
            }
        }
        return (IObjectCache)dictionary[RRITEM_NAME];
    }

    public string Scope
    {
        get { return "RRHttpContextLifecycle"; }
    }
}

As you can see, I copy most of the code from HttpContextLifecycle but use a different key for the string and scope. To get this all wired up correctly with HybridHttpOrThreadLocalScoped, I also need to subclass HttpLifecycleBase. Here is the code from HttpLifecycleBase:

public abstract class HttpLifecycleBase<HTTP, NONHTTP> : ILifecycle
    where HTTP : ILifecycle, new()
    where NONHTTP : ILifecycle, new()
{
    private readonly ILifecycle _http;
    private readonly ILifecycle _nonHttp;

    public HttpLifecycleBase()
    {
        _http = new HTTP();
        _nonHttp = new NONHTTP();
    }

    public void EjectAll()
    {
        _http.EjectAll();
        _nonHttp.EjectAll();
    }

    public IObjectCache FindCache()
    {
        return HttpContextLifecycle.HasContext()
                   ? _http.FindCache()
                   : _nonHttp.FindCache();
    }

    public abstract string Scope { get; }
}

All HybridHttpOrThreadLocalScoped does is derive from HttpLifecycleBase and use HttpContextLifecycle as the HTTP cache; so I need to do the same using RRHttpContextLifecycle instead:

public class RRHybridLifecycle : HttpLifecycleBase<RRHttpContextLifecycle, ThreadLocalStorageLifecycle>
{
    public override string Scope
    {
        get
        {
            return "RRHybridLifecycle";
        }
    }
}

Then I change my container configuration code from:

x.For<SqlServerStore>().HybridHttpOrThreadLocalScoped().Use<SqlServerStore>().
    Ctor<IStore>().Is(y => y.GetInstance<DbDiskCache>());

to:

x.For<SqlServerStore>().LifecycleIs(new RRHybridLifecycle()).Use<SqlServerStore>().
    Ctor<IStore>().Is(y => y.GetInstance<DbDiskCache>());

This does feel particularly dirty. Copying and pasting code always feels wrong. What happens if Structuremap makes changes to the implementation of HttpContextLifecycle and I do not update my code to sync with those changes? You can see how this could become fragile. It would be nice if ITEM_NAME were not static and there was a way for derived types to override it, or if the key name were at least suffixed with the version of the Structuremap assembly.

Well, until such changes are made in Structuremap, I see no better alternative to my workaround.

I hope this is helpful to any others who have experienced this scenario. I am also very open to suggestions for a better workaround. In the meantime, I have submitted a pull request to the Structuremap repository that appends the assembly version to the HttpContext.Items key name.
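To illustrate the idea of a version-suffixed key, here is a minimal sketch in Python rather than C#. The key format and the helper names `item_name` and `find_cache` are hypothetical; the exact format used in the pull request may differ.

```python
def item_name(assembly_version):
    """Build a per-version key so that two copies never share a cache entry."""
    return f"STRUCTUREMAP-INSTANCES-{assembly_version}"


def find_cache(assembly_version, items):
    """Return the cache belonging to this version, creating it on first use."""
    key = item_name(assembly_version)
    if key not in items:
        # A plain dict stands in for the real object cache.
        items[key] = {"owner": assembly_version}
    return items[key]


request_items = {}  # stands in for HttpContext.Current.Items
host_cache = find_cache("2.6.1", request_items)
merged_cache = find_cache("2.6.3", request_items)
host_cache_again = find_cache("2.6.1", request_items)
print(sorted(request_items))
```

Each version now reads back only what it stored itself, so the cast in FindCache would always see its own type.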

Adopt RequestReduce and see immediate Yslow and Google Page Speed score improvements not to mention a faster site! / September 10, 2011 by Matt Wrock

Since March I have been working in my “free” time on a framework to reduce the number and size of HTTP requests incurred from loading a web page. In short it merges and minifies css and javascript on your page and automatically sprites and optimizes css background images. All this is done on the fly (with caching) with no code changes or configuration required. All processed and reduced resources are served with far future caching headers and custom ETags.

In August I had something solid enough to push to production on Microsoft’s Gallery platform (what brings home my bacon) which hosts the Visual Studio Gallery, the MSDN Code Samples Gallery, the Technet Gallery and Script Center and many more galleries.

UPDATE: In November, RequestReduce was adopted by MSDN and Technet Forums and Search which serve millions of page views a day.

Results on Microsoft Gallery Platform

I’m very pleased with the results. We saw an 18% improvement in global page load times. We have a large international audience and the further you are from Washington state the more you will benefit from this improvement. VisualStudio Gallery raised its YSlow score from a B to an A and went from 41 HTTP requests to 30. Additionally, our workflow for spriting background images is completely automated.

Results from China Without RequestReduce:	Results from China With RequestReduce:

Key RequestReduce WebPage Optimizations

RequestReduce will do the following on any page where the RequestReduce HttpModule is loaded:

Look for background images that it can sprite. This is the process of combining multiple images into a single image and using some CSS syntax to pull in specific images from that single file into a CSS class’s background.
Merge these images into a single PNG that is quantized down to 256 colors and then run through optipng for lossless compression. Unsatisfied with the quality I was getting from the popular open source quantizers, I created a quantizer based on the Wu quantization algorithm and have released that separately as nQuant on CodePlex. This often makes the image up to 3x smaller than the original.
Merges all CSS in the head and minifies it. This includes any text/css resource so it includes files like WebResource.axd.
Automatically expands CSS @imports.
Minifies and merges all adjacent JavaScript on the page that is not served with a no-cache or no-store header or with an Expires or max-age of less than a week. This includes any file with a valid JavaScript MIME type, so ScriptResource.axd and WebResource.axd are included.
Manages the downloads of these CSS and image requests using ETag and expires headers to ensure optimal caching on the browser.
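For readers new to spriting, the following Python sketch shows the basic arithmetic behind it under simple assumptions: images are placed side by side in one row, and each CSS class then pulls its own slice out of the combined file with a negative horizontal offset. This is an illustration only, not RequestReduce's layout algorithm; `layout_sprite`, the class names and `sprite.png` are made up.

```python
def layout_sprite(images):
    """Lay images out left to right and return (sprite_width, sprite_height, css_rules).

    images is a list of (css_class, width, height) tuples in pixels.
    """
    x = 0
    tallest = 0
    rules = []
    for css_class, width, height in images:
        # Shift the sprite left by x pixels so this image lines up with the element.
        offset = f"-{x}px" if x else "0"
        rules.append(
            f".{css_class} {{ background: url('sprite.png') {offset} 0 no-repeat; "
            f"width: {width}px; }}"
        )
        x += width
        tallest = max(tallest, height)
    return x, tallest, rules


sprite_width, sprite_height, css_rules = layout_sprite(
    [("icon-a", 16, 16), ("icon-b", 32, 24)]
)
for rule in css_rules:
    print(rule)
```

The sketch also hints at why an explicit width and no-repeat matter: without them, neighboring images in the sprite would show through the element's background.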

Other Great RequestReduce Features

Since I wanted to deploy RequestReduce on Microsoft websites, it obviously needed to scale to Millions of page views and be maintainable in an enterprise environment. To do this RequestReduce supports:

CDN hosting of the CSS and sprited images.
Synchronizing generated CSS and image files across multiple servers via a Sql Server db or a distributed file replication system.
Custom API allowing the addition of your own minifier or filtering out specific pages or resources.

Of course RequestReduce works perfectly on a small site or blog as well. This blog went from a YSlow C to an A after using RequestReduce.

Why I Created RequestReduce

First, I’ve been looking for an idea for quite some time to use for an Open Source project. This one struck me while on a run along the Sammamish River in February. Over the past 10 years I have worked on many large, high traffic websites that used somewhat complicated frameworks for organizing CSS. These often make including simple minification an impossible task in a build script especially if CSS can be changed out of band. Also, image spriting has always been difficult to keep up with. New images get rolled in to CSS and we are too busy getting features out the door; so spriting these images falls through the cracks. To have a process that did all of this automatically and at run time (Note: RequestReduce does not block requests while it does this. That would be a perf catastrophe. See here for details.) seemed ideal. I wanted a plug and play solution. Drop a dll in the bin directory and it just happens.

RequestReduce makes this vision come very close to reality. In this version, there are some things that RequestReduce expects of the CSS class containing the background image in order to successfully sprite it. In a future release I will be taking advantage of CSS3 which will mean RequestReduce will be able to sprite more images on modern browsers. The Microsoft Gallery sites have to support IE 7 and 8 so the first release had to be CSS2 compliant.

RequestReduce is now available for community use and contributions

To get started using RequestReduce:

1. If you have NuGet, simply enter this command in the Package Manager Console and skip steps two and three:
```
Install-Package RequestReduce
```
2. Otherwise, download the latest RequestReduce version.
3. Extract the contents of the downloaded zip and copy RequestReduce.dll as well as optipng.exe to your website's bin directory. (If for some reason you cannot put optipng.exe in your bin, RequestReduce will function as expected but will skip the lossless compression of its sprite images.)
4. Add the RequestReduceModule to your web.config or using the IIS GUI.
5. Optimize your CSS to help RequestReduce better locate your background images.
6. Optional: Configure RequestReduce. You can control where generated CSS and sprites are stored and their size thresholds, and specify a CDN host name to reference.

You may also fork the RequestReduce source code from its GitHub site.

For links to RequestReduce documentation, bug reports and the latest download, you can visit http://www.RequestReduce.com. I’d be very interested in hearing any feedback on the tool as well as any problems you have implementing it. I plan to be continually adding to this project with more features to reduce HTTP Requests from any web site.

Latch waits on 2:1:103? You are probably creating too many temp tables in Sql Server / September 10, 2011 by Matt Wrock

Last week my team pushed our monthly release and all looked well and good after the launch. Then at about 10PM our site monitoring alerts started to fire as our page load times soared and this continued until early morning. There were no unusual errors in our logs and our traffic load was normal. What was clear was that our database had been running quite hot on CPU.

As we continued to investigate the following day on a Friday, all systems looked quite healthy and we were very perplexed. That Friday and Saturday night there was no reoccurrence of this behavior but on Sunday it happened again. A very abrupt and fast rise in page load times, followed by a just as sudden drop in load times at around 9AM. All of this happening during what we thought was our lull in traffic.

Interestingly to us but not so topical to this post, after studying our logs back several weeks, we learned that 10PM to 9AM actually is our traffic peak. So it began to make sense why this was occurring during this time. Knowing that this was manifesting itself as a database CPU issue, we ran the following query:

select * from master..sysprocesses sp cross apply fn_get_sql(sql_handle) order by cpu desc

This query will return a row for every process in descending order of CPU load, along with the exact SQL statement it is executing. It will also tell you if the process is blocked by another process and, if so, which process and what resource it is waiting on. Having run this query many times under this load, we were seeing a pattern with results like this:

spid	kpid	blocked	waittype	waittime	lastwaittype	waitresource	dbid
254	9680	0	0x0034	2	PAGELATCH_EX	2:1:103	9
128	10288	163	0x0032	11	PAGELATCH_SH	2:1:103	9
129	0	0	0x0000	0	MISCELLANEOUS		9
116	10152	0	0x0000	0	SOS_SCHEDULER_YIELD		9
169	0	0	0x0000	0	MISCELLANEOUS		1
52	0	0	0x0000	0	MISCELLANEOUS		4
148	9468	163	0x0034	9	PAGELATCH_EX	2:1:103	11
90	0	0	0x0000	0	MISCELLANEOUS		9
274	6612	0	0x0034	15	PAGELATCH_EX	2:1:103	9
131	8568	163	0x0032	4	PAGELATCH_SH	2:1:103	9
120	0	0	0x0000	0	MISCELLANEOUS		11
105	8728	163	0x0034	11	PAGELATCH_EX	2:1:103	9
247	5660	0	0x0000	0	PAGELATCH_SH	2:1:103	9
216	0	0	0x0000	0	MISCELLANEOUS		1
163	8108	0	0x0000	0	PAGELATCH_SH	2:1:103	9
201	7004	0	0x0034	1	PAGELATCH_EX	2:1:103	9
267	0	0	0x0000	0	MISCELLANEOUS		6
117	11124	163	0x0034	10	PAGELATCH_EX	2:1:103	11
205	10016	0	0x0034	1	PAGELATCH_EX	2:1:103	9
210	10108	0	0x0034	14	PAGELATCH_EX	2:1:103	9
226	0	0	0x0000	0	MISCELLANEOUS		10
164	0	0	0x0000	0	MISCELLANEOUS		1
175	0	0	0x0000	0	MISCELLANEOUS		11
98	0	0	0x0000	0	MISCELLANEOUS		7

What this tells me is that there are several queries lined up behind the query executing under spid 163 and they are waiting for 163 to release its shared lock on a latch so that they can acquire an exclusive lock. To quote the SQLCAT team, “Latches are lightweight synchronization primitives that are used by the SQL Server engine to guarantee consistency of in-memory structures including; index, data pages and internal structures such as non-leaf pages in a B-Tree.” See this white paper for an in depth discussion on latch contention. So the next question is: what is 2:1:103, the resource everyone is fighting for?

The first number is the database id, the second is the database file and the third and last number is the page. 2 is tempdb. If you don’t believe me, just execute:

select db_name(2)

It says “tempdb” doesn’t it? I just knew that it would.
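If you have captured the sysprocesses output to a file or script, a few lines of code can split the wait resource into its parts and count how many sessions are queued on each one. This is a minimal Python sketch under the assumption that the value has the plain database:file:page form shown above (other resource types use different formats); the helper names are hypothetical and the sample rows are taken from the results earlier in this post.

```python
from collections import Counter


def parse_wait_resource(resource):
    """Split a 'database:file:page' wait resource such as '2:1:103' into its parts."""
    db_id, file_id, page_id = (int(part) for part in resource.strip().split(":"))
    return {"db_id": db_id, "file_id": file_id, "page_id": page_id}


def tally_wait_resources(rows):
    """Count sessions per wait resource; rows are (spid, blocked, lastwaittype, waitresource)."""
    return Counter(row[3].strip() for row in rows if row[3].strip())


# A handful of rows copied from the sysprocesses results above.
sample_rows = [
    (254, 0, "PAGELATCH_EX", "2:1:103"),
    (128, 163, "PAGELATCH_SH", "2:1:103"),
    (129, 0, "MISCELLANEOUS", ""),
    (116, 0, "SOS_SCHEDULER_YIELD", ""),
    (148, 163, "PAGELATCH_EX", "2:1:103"),
]

hot_page = parse_wait_resource("2:1:103")
counts = tally_wait_resources(sample_rows)
print(hot_page, counts.most_common(1))
```

A single page dominating the tally, as it does here, is the signal that the sessions are contending on one resource rather than being slow for unrelated reasons.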

To track down the actual object that owns that page, execute:

DBCC Traceon(3604)
DBCC PAGE (2, 1, 103)

The key item of interest in the results is the Metadata: ObjectId with a value of 75. Then execute:

select object_name(75)

and the object is sysmultiobjrefs.

So what the heck am I supposed to do with this? If this was blocking on a user table that was part of my application, it seems I would have something to work with. But tempdb? Really? Ugh.

After recovering from a spiral of primal self-loathing and despair, I turn to Google. I’m sure lots of people have seen contention on 2:1:103 and it will be no time at all before a clear explanation and remedy are revealed. Well, yes, there are several people struggling with contention on this object, but alas, there are no clear explanations or answers. I find several posts talking about 2:1:1 and 2:2:3 (see this KB article for those). But those struggling with 2:1:103 seem to be told to turn on trace flag -T1118, only to complain later that this flag is not doing anything. Or they were told to add more files but they replied that they had plenty of files. And by the way, -T1118 was turned on already in our DB and we have plenty of files too.

Well, finally I come across this post from someone with this exact issue. The poster states that they contacted Microsoft and then reports the explanation MS gave, and this explanation actually makes sense. The issue is from contention of DDL operations in tempdb. In other words, the creating and dropping of temp tables. Well, we just so happen to do a lot of that and added more in this last release.

I then came upon this white paper and while it did not specifically refer to 2:1:103, it really provided an in depth description of the inner workings of tempdb and how to identify and troubleshoot different forms of contention there. Of particular interest was its explanation of contention in DML and DDL operations.

Contention in DML (select, update, insert) operations is identified by contention on 2:1:1 or 2:1:3. This can be caused by “internal” objects, such as the objects that the SQL query engine creates when executing certain hash joins, sorts and groupings. Or it can be caused by user objects like temp tables and temp table variables. At the root of the issue here is SQL Server’s struggle to allocate and deallocate pages and to update the metadata tables that track those pages. In addition to suggesting that those who suffer from this issue analyze query plans to make more efficient use of tempdb, the paper also suggests turning on -T1118 and adding data files to tempdb.

My team’s plight seemed to fall under DDL Contention. This can be determined by checking if the object under contention is a system catalog table in sysobjects. Well according to:

select * from sys.objects where object_id=75

sysmultiobjrefs is a system table. DDL contention is not impacted by internal objects but deals strictly with the creation and destruction of user objects. Namely temp tables and temp table variables.

In addition to checking on the wait resource, there are two performance counters that you can turn on to track the real time creation and destruction of temp tables:

Temp Tables Creation Rate
Temp Tables For Destruction

You can turn these counters on and run your queries to determine how many temporary resources they are creating. We were able to identify that 70% of our traffic was issuing a query that now created 4 temp tables. The advice given here is to:

Use fewer temp tables and temp table variables.
Check to see if your use of temp tables takes advantage of temp table caching which reduces contention. Sql Server will automatically cache created temp tables as long as:
- Named constraints are not created.
- Data Definition Language (DDL) statements that affect the table are not run after the temp table has been created, such as the CREATE INDEX or CREATE STATISTICS statements.
- Temp object is not created by using dynamic SQL, such as: sp_executesql N'create table #t(a int)'.
- Temp object is created inside another object, such as a stored procedure, trigger, and user-defined function; or is the return table of a user-defined, table-valued function.

Well we use dynamic SQL so none of our temp tables get cached which probably means 2:1:103 is being updated on every creation.

Our “quick fix” was to take the query that 70% of our traffic hits and remove the use of temp tables as well as trim it down to something highly optimized to exactly what that traffic was doing. Fortunately something very simple. Moving forward we will need to seriously curb the use of temp tables or wrap them in sql functions or stored procs.

I am intentionally putting 2:1:103 in the title of this post with the hope that someone will find this more quickly than it took me to track all of this down.

UPDATE: Michael J. Swart (@MJSwart) wrote a great post on this exact same problem here. I strongly encourage you to read about his run-in with this awful, awful problem.

Hurry Up and Wait!

Tales from an automation engineer