
Saturday, July 12, 2014

DLL Powers(hell) - A New Twist on an Old Problem

The Descent, or Finding a New Way into Hell

Not too long ago, we were trying to run some Powershell scripts as part of a release. As is typical, we were under the gun to get things done quickly. This time, however, we ran into a weird issue, and it's an issue that's had a lot of twists and turns.  Ultimately, it turned out to be interesting enough to write about - so here we are!

First, a little bit of background.  Each Autonomous Service we would build would generally have its own Powershell module assemblies, which would be bin-deployed into separate folders on a given server.  We would then write scripts to load the cmdlets and perform various tasks.

So what was the actual issue?  It wasn't a bug in the business logic.  Instead, a subtle and unusual form of DLL Hell unexpectedly reared its head.  Here's a breakdown of what happened:
  1. We wrote two scripts that each loaded and called cmdlets belonging to different Autonomous Services.  The mechanism was a pretty standard set of Import-Module calls, and as I mentioned before, these cmdlets were bin-deployed into separate folders.
  2. The bin-deployed code had dependencies on some Common library assemblies that we had built and hosted in NuGet.  The dependent assemblies were also bin-deployed into their respective folders.
  3. One of the Powershell module assemblies was referencing the v1 version of a particular Common library, while the other was referencing v2.  Both cmdlets compiled and ran just fine in isolation.
  4. We called the cmdlet referencing v1 in the first script, then the one referencing v2 in the second script.
  5. We got the dreaded MissingMethodException when the second cmdlet tried to call its dependent library.
The issue was that the second cmdlet was expecting v2 of a dependent assembly, but v1 was already loaded into the default Powershell appdomain, causing it to be used instead.  Our bin-deployment strategy had prevented dependencies from colliding on disk, but it could do nothing to prevent them from colliding in memory.
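You can see the collision for yourself with a quick diagnostic like the sketch below, run from inside a cmdlet (or anything else in the default appdomain).  The shared library name "Common" is illustrative here, matching the naming used in the rest of this post.  Once v1 is bound, later requests for the same simple name get v1, no matter what the caller compiled against.

```csharp
using System;
using System.Linq;

// Diagnostic sketch: check which version of the shared dependency the
// default appdomain has already bound to. "Common" is an illustrative name.
class LoadedVersionCheck
{
    static void Main()
    {
        var common = AppDomain.CurrentDomain.GetAssemblies()
            .FirstOrDefault(a => a.GetName().Name == "Common");

        Console.WriteLine(common == null
            ? "Common is not loaded yet"
            : "Common is already bound to version " + common.GetName().Version);
    }
}
```

Running this after the first script would show "version 1.0.0.0", and the second script's request for v2 never stood a chance.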


The First Exit Goes to Purgatory

After realizing what was happening, we were able to get our release out the door by quickly upgrading the v1 dependency to v2 and redeploying that Service.  However, we knew this issue would keep surfacing, so we tried a couple of ideas to address the root cause.  We were definitely not about to abandon the Autonomous Services pattern because of a few lousy scripts!

One of the team members tried strong-naming the conflicting assemblies.  After all, two assemblies that differ by strong name can be loaded side-by-side into a single appdomain.  It worked!  However, we collectively frowned at the implications.  Strong-named assemblies cannot have dependencies on any assemblies that do not have strong names.  Since we were using so many third-party components on that team, we were skittish about adopting such a heavy-handed policy as a remedy.
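For reference, the strong-naming itself is just assembly-level metadata.  A sketch of what it looked like is below; the key file name is illustrative, generated with the sn.exe tool (sn -k Common.snk):

```csharp
// AssemblyInfo.cs for the v2 build of the shared library (sketch).
// A strong name is the simple name + version + culture + public key token,
// which lets v1 and v2 coexist in a single appdomain.
// "Common.snk" is an illustrative key pair file.
using System.Reflection;

[assembly: AssemblyVersion("2.0.0.0")]
[assembly: AssemblyKeyFile("Common.snk")]
```

The catch, as noted above, is that every assembly a strong-named assembly references must itself be strong-named, which is what made this remedy feel so heavy-handed with so many third-party components in play.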


I Think the Exit to Earth is That Way

As an alternative, I started playing around with using separate appdomains in which to sandbox the loading of dependent assemblies.  Our original cmdlets were dead simple, looking something like the code below:

using System.Management.Automation;

namespace CmdletOne
{
    [Cmdlet(VerbsCommon.Show, "Path")]
    public class ShowPathCmdlet : Cmdlet
    {
        protected override void ProcessRecord()
        {
           // ...
           // Run some business logic & load assemblies
           // ...
           WriteObject("Value is foo");
        }
    }
}

The trick was to find a way to wrap the logic we needed to run so that its dependencies would not load in the default Powershell appdomain, but instead in a separate appdomain we could control.


First Step

The first step involved pulling out all the meat of the original cmdlet into a class that would run in the separate appdomain.  This seemed simple enough, and looked something like this:
using System;
using Common;

namespace CmdletOne
{
    public class Proxy : MarshalByRefObject
    {
        public string DoWork()
        {
            // ...
            // Run some business logic & load assemblies
            // ...
            return "foo";
        }
    }
}


Second Step

The second thing we needed was a way to create new appdomains and remoting proxies for arbitrary objects.  Well, I say "arbitrary," but I really mean objects that derive from MarshalByRefObject, like the Proxy class we made.  Note that we needed to use reflection along with a supplied path string to locate the assembly and create the type, rather than relying on the ambient environment to do it automatically.  We have to do everything we can to make sure that the code we want to execute (inside T) doesn't run except through the proxy returned by the CreateInstanceFromAndUnwrap() call.
using System;
using System.IO;

namespace Common
{
    /// <summary>
    /// General-purpose class that can put a remoting proxy around a given type and create a new appdomain for it to run in.
    /// This effectively "sandboxes" the code being run and isolates its dependencies from other pieces of code.
    /// </summary>
    /// <typeparam name="T">The type of object that will be run in the sandbox.  Must be compatible with Remoting.</typeparam>
    public class ExecutionSandbox<T> : IDisposable
        where T : MarshalByRefObject
    {
        /// <summary>
        /// Local copy of the sandbox app domain
        /// </summary>
        private AppDomain _domain;

        /// <summary>
        /// Reference of the proxy wrapper for T
        /// </summary>
        public T ObjectProxy { get; private set; }

        /// <summary>
        /// Creates an instance of ExecutionSandbox
        /// </summary>
        /// <param name="assemblyPath">The path where the assembly that contains type T may be found</param>
        public ExecutionSandbox(string assemblyPath)
        {
            Type sandboxedType = typeof (T);
            AppDomainSetup domainInfo = new AppDomainSetup();
            domainInfo.ApplicationBase = assemblyPath;
            _domain = AppDomain.CreateDomain(string.Format("Sandbox.{0}", sandboxedType.Namespace), null, domainInfo);

            string assemblyFileName = Path.Combine(assemblyPath, sandboxedType.Assembly.GetName().Name) + ".dll";
            object instanceAndUnwrap = _domain.CreateInstanceFromAndUnwrap(assemblyFileName, sandboxedType.FullName);
            ObjectProxy = (T)instanceAndUnwrap;
        }

        /// <summary>
        /// Allows safe cleanup of the sandbox app domain.
        /// </summary>
        public void Dispose()
        {
            if (_domain != null)
            {
                AppDomain.Unload(_domain);
                _domain = null;
            }

            ObjectProxy = null;
        }
    }
}

Third Step...

Bringing it all together, we could then refactor our cmdlets to do something like the snippet below.  There's certainly more code involved than in the original; however, about half of it exists to optimize for performance and could probably be refactored away.
using System;
using System.IO;
using System.Management.Automation;
using System.Reflection;
using Common;

namespace CmdletOne
{
    [Cmdlet(VerbsCommon.Show, "Path")]
    public class ShowPathCmdlet : Cmdlet
    {
        private static ExecutionSandbox<Proxy> _executionSandbox;
        private static readonly object _lockObject = new object();

        protected override void ProcessRecord()
        {
            DateTime start = DateTime.Now;

            lock (_lockObject)
            {
                if (_executionSandbox == null)
                {
                    string cmdletExecutionPath = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
                    _executionSandbox = new ExecutionSandbox<Proxy>(cmdletExecutionPath);
                }
            }

            Proxy proxy = _executionSandbox.ObjectProxy;
            string path = proxy.DoWork();

            DateTime end = DateTime.Now;

            WriteObject(string.Format("Value is {0}.  Elapsed MS: {1}", path, (end - start).TotalMilliseconds));
        }
    }
}

...Right Into Hot Lava

The code above looks like it should work; however, here's where things really stop making sense - it doesn't.  Not exactly.  What happens is that yes, you do create separate appdomains, the proxy stub casts properly, and the code inside it runs.  You can even see it loading dependencies using AppDomain.CurrentDomain.GetAssemblies().  The main issue is that not only does all of this happen, but somehow the default appdomain loads the dependencies too.  In other words, inside your proxy, v1 of a dependent assembly loads into the alternate appdomain, but it also loads into the default appdomain.  Then your second proxy runs, but it blows up because it loads the wrong dependency, probably from the default appdomain (but I'm not sure).

Fourth Step - When All Else Fails, Sometimes Random Wandering Is Best

I spent days and days in the lava, trying to make sense of what was going on.  I consulted a number of MSDN articles like this one, thinking I had missed something fundamental.  I plugged in fuslogvw.exe, which reported some assembly loading behavior that looked odd, but did not tell me the cause.  I even read and reread Suzanne Cook's early .NET articles like this one on the dark art of assembly loading contexts.  I was convinced that something was awry with these load contexts.  I kept switching bits back and forth, trying to change something, anything, that would make this work.  Then, I stumbled on this blog post and noted that he decorated his proxy classes with an interface.  Since his example looked like it would work too, I gave it a shot.  Astonishingly, adding an interface to the Proxy class shown above fixed everything.  This was it:
namespace CmdletOne
{
    public interface IProxy
    {
        string DoWork();
    }
}
combined with a one-line change to the code seen previously:
using System;
using Common;

namespace CmdletOne
{
    public class Proxy : MarshalByRefObject, IProxy
    {
        public string DoWork()
        {
            // ...
            // Run some business logic & load assemblies
            // ...
            return "foo";
        }
    }
}

What Now?

So I escaped from Hell, because I randomly stepped on a teleporter that sent me home.  Yes, I know that sounds like Doom... It's a good analogy, OK?  Anyway, I'm just happy to be un-stuck for the time being.  I'm going to eventually figure out why this interface fixed everything, but that will be a topic for another day.  Maybe Jon Skeet knows?  The only evidence I was left with was a mysterious difference in the IL code, and I'm as yet too dense to figure out why it matters.

Code Samples

Working code samples for the above can be found on GitHub.

If you open a Powershell console (I used v3, v2 might also work), you can run three different test scripts:
  • WorksOk.ps1 - runs a cmdlet that references v2 before running a cmdlet that references v1.  In this case, the calls made to the wrong assembly don't cause a MissingMethodException.  This demonstrates the fact that things could be wrong without any external manifestation of a problem.  In other words, things could work entirely by accident.
  • GeneratesError.ps1 - reproduces the MissingMethodException and fouls up your Powershell session so that running WorksOk.ps1 now fails.  The only way to get out of this state is to restart the console.  This case is what we ran into during our release.
  • AlwaysWorks.ps1 - The order of the cmdlets being run in this script is identical to GeneratesError.ps1, however the cmdlet is using the sandboxed appdomains and does not manifest any errors.

In Summary...

If you find yourself running into DLL Hell with Powershell, try one of the following:
  • Add strong names to your dependencies and put them in the GAC
  • Add strong names to your dependencies and bin-deploy to separate folders, one per Powershell module assembly
  • If strong-naming isn't an option, bin-deploy to separate folders, one per Powershell module assembly.  Inside your cmdlets, use proxy wrappers to run your code inside separate appdomains
  • This might be overkill, but you might also be able to build your own Powershell hosting context in which you can customize how you load and run cmdlets.  If you've got one handy, try it!
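For anyone curious about that last option, a self-hosting sketch might start something like the code below.  It assumes a reference to System.Management.Automation; the module path is made up, and Show-Path is the example cmdlet from earlier in this post:

```csharp
using System;
using System.Management.Automation;

// Sketch of self-hosting PowerShell: give each module its own PowerShell
// instance so no two modules share a session. The module path is illustrative.
class IsolatedCmdletRunner
{
    static void Main()
    {
        using (PowerShell ps = PowerShell.Create())
        {
            // Load one module's bin-deployed assembly into this session only...
            ps.AddCommand("Import-Module")
              .AddParameter("Name", @"C:\Services\ServiceOne\bin\CmdletOne.dll");

            // ...then invoke its cmdlet as a separate pipeline statement.
            ps.AddStatement().AddCommand("Show-Path");

            foreach (PSObject result in ps.Invoke())
            {
                Console.WriteLine(result);
            }
        }
    }
}
```

Note that every PowerShell instance created this way still shares the host process's default appdomain, so true dependency isolation would also need the appdomain (or separate-process) tricks described above; this only sketches the hosting entry point.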


Friday, May 9, 2014

Domain-Driven Design Rewired My Brain

"DDD... What is that?" I asked when interviewing with Michael Paterson at EBSCO back in 2010.

"It is...awesome!" was the reply, in the style I would learn to associate with Mike. Head nod for emphasis and big grin, as if remembering how it felt to watch the game-winning field goal in the 2001 Super Bowl.

In popular vernacular, "awesome" is reserved for CoD headshots, explosions in slo-mo, spring break vacations, and Pearl Jam concerts. Hyperbole. Idle interjection. Kid stuff. He-Man was Awesome in 1982.
"A strong feeling of fear or respect and also wonder."
That's what something that's Awesome gives you. Mike was right. Extremely right. One Blue Book, Red Book, and White Book later, I left the labyrinth behind forever. 

What strikes me as so profound about Eric Evans' insight is just how astonishingly powerful it is.  Not to mention elegant and intuitive.  How did we all miss this?  Thinking back to a time before I knew about DDD, I recalled instances where I worked on the design and implementation for numerous projects, some more successful than others.  The successful ones, as I recall, involved a crude form of Domain Modeling and creation of a Ubiquitous Language.  Sitting in a room with Sales and Domain Experts and hashing out the details made an incredible difference in the quality of the deliverable.  Of course, my code wasn't that sophisticated back then, but it worked reliably and could adapt as needs changed.  I had no idea why this was the case at the time, nor could I really hope to repeat that success consistently.

In my opinion, what's truly profound about DDD as presented in the first few chapters of the Blue Book is the tacit claim that tools and algorithms alone can't unravel the complexity at the heart of software.  This is a human problem - a problem of language, of comprehension and consensus.  Before a software engineer can ever hope to context-switch into his or her programming language of choice, the internal monologue intones, interrupting and nagging, "But what does it all mean?"

From the Domain Model spring forth so many valuable artifacts - illustrative diagrams, objects and database entities, Autonomous Services, events, user stories and backlogs, cross-functional teams, training manuals, glossaries, etc.  I've even noticed that Domain Modeling can help Experts and stakeholders see inefficiencies and problems in their own business processes - without any help from an analyst or engineer.  Awesome, indeed.

In some ways, I see the design patterns presented in the Blue Book as secondary.  Don't get me wrong, Aggregate Roots, Value Objects, Bounded Contexts, and Anticorruption Layers are extremely useful in practice.  But without a visceral understanding of the early chapters, I suspect one could mis-apply all those great ideas and end up in a very bad place.  I'm not convinced the reverse is true.  Without strong design patterns, it seems that one can still manage to build valuable software that follows the domain, albeit crudely.

I think it's the focus on language, concepts, and feedback loops that really makes DDD work.  It's already in our brains, central to our lives as humans.  We just have to rewire a little bit and take advantage of it.