The Descent, or Finding a New Way into Hell
Not too long ago, we were trying to run some Powershell scripts as part of a release. As is typical, we were under the gun to get things done quickly. This time, however, we ran into a weird issue, and it's an issue that's had a lot of twists and turns. Ultimately, it turned out to be interesting enough to write about - so here we are!
First, a little bit of background. Each Autonomous Service we would build would generally have its own Powershell module assemblies, which would be bin-deployed into separate folders on a given server. We would then write scripts to load the cmdlets and perform various tasks.
So what was the actual issue? It wasn't a bug in the business logic. Instead, a subtle and unusual form of DLL Hell unexpectedly reared its head. Here's a breakdown of what happened:
- We wrote two scripts that each loaded and called cmdlets belonging to different Autonomous Services. The mechanism was a pretty standard set of Import-Module calls, and as I mentioned before, these cmdlets were bin-deployed into separate folders.
- The bin-deployed code had dependencies on some Common library assemblies that we had built and hosted in NuGet. The dependent assemblies were also bin-deployed into their respective folders.
- One of the Powershell module assemblies was referencing the v1 version of a particular Common library, while the other was referencing v2. Both cmdlets compiled and ran just fine in isolation.
- We called the cmdlet referencing v1 in the first script, then the one referencing v2 in the second script.
- We got the dreaded MissingMethodException when the second cmdlet tried to call its dependent library.
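One way to confirm a mismatch like this is to ask the current appdomain which copies of a library it has actually loaded. The following is a minimal sketch, not taken from the original scripts: the LoadedAssemblyReport class and its Describe method are hypothetical names, and the simple name you pass in (for example "Common") is an assumption about what the shared library is called.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LoadedAssemblyReport
{
    // Returns one line per assembly loaded in the current appdomain whose
    // simple name matches, showing its version and where it was loaded from.
    public static IList<string> Describe(string simpleName)
    {
        return AppDomain.CurrentDomain.GetAssemblies()
            .Where(a => string.Equals(a.GetName().Name, simpleName, StringComparison.OrdinalIgnoreCase))
            .Select(a => string.Format(
                "{0} v{1} from {2}",
                a.GetName().Name,
                a.GetName().Version,
                // Dynamic assemblies have no file location, so avoid asking for one.
                a.IsDynamic ? "(dynamic)" : a.Location))
            .ToList();
    }
}
```

Run just before the failing cmdlet, a report like this should show which version and which folder won the race, which is usually enough to tell whether the second cmdlet is bound to the first cmdlet's copy.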
The First Exit Goes to Purgatory
After realizing what was happening, we were able to get our release out the door by quickly upgrading the v1 dependency to v2 and redeploying that Service. However, we knew this issue would keep surfacing, so we tried a couple of ideas to address the root cause. We were definitely not about to abandon the Autonomous Services pattern because of a few lousy scripts!
One of the team members tried strong-naming the conflicting assemblies. After all, two assemblies that differ by strong name can be loaded side-by-side into a single appdomain. It worked! However, we collectively frowned at the implications. Strong-named assemblies cannot have dependencies on any assemblies that do not have strong names. Since we were using so many third-party components on that team, we were skittish about adopting such a heavy-handed policy as a remedy.
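To get a feel for how heavy-handed the strong-naming policy would be, you can list which of an assembly's references carry no public key token. This is a rough sketch under stated assumptions, not something from the original project: StrongNameAudit, HasPublicKeyToken and UnsignedReferences are hypothetical names, and the check only looks at the reference metadata rather than verifying any signature.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class StrongNameAudit
{
    // An assembly name without a public key token is not strong-named.
    public static bool HasPublicKeyToken(AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken();
        return token != null && token.Length > 0;
    }

    // Lists the references of the given assembly that have no public key token.
    // Each of these would have to be signed before the assembly itself could be.
    public static IList<string> UnsignedReferences(Assembly assembly)
    {
        return assembly.GetReferencedAssemblies()
            .Where(reference => !HasPublicKeyToken(reference))
            .Select(reference => reference.FullName)
            .ToList();
    }
}
```

Pointing UnsignedReferences at a cmdlet assembly gives a quick inventory of the dependencies, third-party or otherwise, that stand in the way of signing it.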
I Think the Exit to Earth is That Way
As an alternative, I started playing around with using separate appdomains in which to sandbox the loading of dependent assemblies. Our original cmdlets were dead simple, looking something like the code below:

    using System.Management.Automation;

    namespace CmdletOne
    {
        [Cmdlet(VerbsCommon.Show, "Path")]
        public class ShowPathCmdlet : Cmdlet
        {
            protected override void ProcessRecord()
            {
                // ...
                // Run some business logic & load assemblies
                // ...
                WriteObject("Value is foo");
            }
        }
    }

The trick was to find a way to wrap the logic we needed to run so that its dependencies would not load in the default Powershell appdomain, but instead in a separate appdomain we could control.
First Step
The first step involved pulling out all the meat of the original cmdlet into a class that would run in the separate appdomain. This seemed simple enough, and looked something like this:

    using System;
    using Common;

    namespace CmdletOne
    {
        public class Proxy : MarshalByRefObject
        {
            public string DoWork()
            {
                // ...
                // Run some business logic & load assemblies
                // ...
                return "foo";
            }
        }
    }

Second Step
The second thing we needed was a way to create new app domains and remoting proxies for arbitrary objects. Well, I say "arbitrary" but I really mean objects that derive from MarshalByRefObject, like the Proxy class we made. Note that we needed to use reflection along with a supplied path string to locate the assembly and create the type, rather than relying on the ambient environment to do it automatically. We have to do everything we can to make sure that the code we want to execute (inside T) doesn't run except through the proxy created through the CreateInstanceFromAndUnwrap() call.

    using System;
    using System.IO;

    namespace Common
    {
        /// <summary>
        /// General-purpose class that can put a remoting proxy around a given type and create a new appdomain for it to run in.
        /// This effectively "sandboxes" the code being run and isolates its dependencies from other pieces of code.
        /// </summary>
        /// <typeparam name="T">The type of object that will be run in the sandbox. Must be compatible with Remoting.</typeparam>
        public class ExecutionSandbox<T> : IDisposable where T : MarshalByRefObject
        {
            /// <summary>
            /// Local copy of the sandbox app domain
            /// </summary>
            private AppDomain _domain;

            /// <summary>
            /// Reference of the proxy wrapper for T
            /// </summary>
            public T ObjectProxy { get; private set; }

            /// <summary>
            /// Creates an instance of ExecutionSandbox
            /// </summary>
            /// <param name="assemblyPath">The path where the assembly that contains type T may be found</param>
            public ExecutionSandbox(string assemblyPath)
            {
                Type sandboxedType = typeof (T);

                // Point the new appdomain at the cmdlet's own folder so that
                // dependencies resolve from there.
                AppDomainSetup domainInfo = new AppDomainSetup();
                domainInfo.ApplicationBase = assemblyPath;
                _domain = AppDomain.CreateDomain(string.Format("Sandbox.{0}", sandboxedType.Namespace), null, domainInfo);

                // Create the instance inside the sandbox from an explicit file path
                // and keep only the remoting proxy on this side.
                string assemblyFileName = Path.Combine(assemblyPath, sandboxedType.Assembly.GetName().Name) + ".dll";
                object instanceAndUnwrap = _domain.CreateInstanceFromAndUnwrap(assemblyFileName, sandboxedType.FullName);
                ObjectProxy = (T)instanceAndUnwrap;
            }

            /// <summary>
            /// Allows safe cleanup of the sandbox app domain.
            /// </summary>
            public void Dispose()
            {
                if (_domain != null)
                {
                    AppDomain.Unload(_domain);
                    _domain = null;
                }
                ObjectProxy = null;
            }
        }
    }

Third Step...
Bringing it all together, we could then refactor our cmdlets to do something like the snippet below. There's certainly more code involved than the original; however, about 50% of it exists to optimize for performance and could probably be refactored away.

    using System;
    using System.IO;
    using System.Management.Automation;
    using System.Reflection;
    using Common;

    namespace CmdletOne
    {
        [Cmdlet(VerbsCommon.Show, "Path")]
        public class ShowPathCmdlet : Cmdlet
        {
            // Created once and reused by every invocation in the session.
            private static ExecutionSandbox<Proxy> _executionSandbox;

            // Static so that it guards the static field above across all cmdlet instances.
            private static readonly object _lockObject = new object();

            protected override void ProcessRecord()
            {
                DateTime start = DateTime.Now;
                lock (_lockObject)
                {
                    if (_executionSandbox == null)
                    {
                        string cmdletExecutionPath = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
                        _executionSandbox = new ExecutionSandbox<Proxy>(cmdletExecutionPath);
                    }
                }

                // Everything past this point runs inside the sandbox appdomain.
                Proxy proxy = _executionSandbox.ObjectProxy;
                string path = proxy.DoWork();
                DateTime end = DateTime.Now;
                WriteObject(string.Format("Value is {0}. Elapsed MS: {1}", path, (end - start).TotalMilliseconds));
            }
        }
    }

...Right Into Hot Lava
The code above looks like it should work; however, here's where things really stop making sense - it doesn't work. Not exactly. What happens is that yes, you do create separate app domains, and the proxy stub casts properly and the code inside it runs. You can even see it loading dependencies using AppDomain.CurrentDomain.GetAssemblies(). The main issue is that not only does all of this happen, but somehow the Default AppDomain loads the dependencies too. In other words, inside your proxy, v1 of a dependent assembly loads into the alternate appdomain, but also loads into the Default appdomain. Then your second proxy runs, but it blows up because it loads the wrong dependency, probably from the default appdomain (but I'm not sure).
Fourth Step - When All Else Fails, Sometimes Random Wandering Is Best
I spent days and days in the lava, trying to make sense of what was going on. I consulted a number of MSDN articles, thinking I had missed something fundamental. I plugged in fuslogvw.exe, which reported some assembly loading behavior that looked odd, but did not tell me what the cause was. I even read and reread Suzanne Cook's early .NET articles on the dark art of assembly loading contexts. I was convinced that something was awry with these load contexts. I kept switching bits back and forth, trying to change something, anything that would make this work. Then I stumbled on a blog post whose author decorated his proxy classes with an interface. Since his example looked like it would work too, I gave it a shot. Astonishingly, adding an interface to the Proxy class shown above fixed everything. This was it:

    namespace CmdletOne
    {
        public interface IProxy
        {
            string DoWork();
        }
    }

combined with a one-line change to the code seen previously:

    using System;
    using Common;

    namespace CmdletOne
    {
        public class Proxy : MarshalByRefObject, IProxy
        {
            public string DoWork()
            {
                // ...
                // Run some business logic & load assemblies
                // ...
                return "foo";
            }
        }
    }

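When wandering around like this, it can help to record exactly when the default appdomain picks up an assembly, rather than inspecting it after the fact. Below is a minimal sketch of that idea using the AppDomain.AssemblyLoad event. It is not part of the original samples: AssemblyLoadLogger and Snapshot are hypothetical names, and the logger only watches the appdomain it is created in.

```csharp
using System;
using System.Collections.Generic;

public sealed class AssemblyLoadLogger : IDisposable
{
    private readonly List<string> _entries = new List<string>();

    public AssemblyLoadLogger()
    {
        // Fires every time an assembly is loaded into the current appdomain.
        AppDomain.CurrentDomain.AssemblyLoad += OnAssemblyLoad;
    }

    // Returns a copy of everything recorded so far.
    public string[] Snapshot()
    {
        lock (_entries)
        {
            return _entries.ToArray();
        }
    }

    private void OnAssemblyLoad(object sender, AssemblyLoadEventArgs args)
    {
        string line = string.Format(
            "[{0}] loaded {1}",
            AppDomain.CurrentDomain.FriendlyName,
            args.LoadedAssembly.FullName);
        lock (_entries)
        {
            _entries.Add(line);
        }
    }

    public void Dispose()
    {
        AppDomain.CurrentDomain.AssemblyLoad -= OnAssemblyLoad;
    }
}
```

Creating the logger before calling into the proxy and comparing snapshots with and without the interface is one way to see whether the dependency still leaks into the default appdomain.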
What Now?
So I escaped from Hell, because I randomly stepped on a teleporter that sent me home. Yes, I know that sounds like Doom... It's a good analogy, OK? Anyways, I'm just happy to be un-stuck for the time being. I'm going to eventually figure out why this interface fixed everything, but that will be a topic for another day. Maybe Jon Skeet knows? The only evidence I was left with was a mysterious difference in the IL code, whose significance I am as yet too dense to figure out.
Code Samples
Working code samples for the above can be found on GitHub.
If you open a Powershell console (I used v3, v2 might also work), you can run three different test scripts:
- WorksOk.ps1 - runs a cmdlet that references v2 before running a cmdlet that references v1. In this case, the calls made to the wrong assembly don't cause a MissingMethodException. This demonstrates the fact that things could be wrong without any external manifestation of a problem. In other words, things could work entirely by accident.
- GeneratesError.ps1 - reproduces the MissingMethodException and fouls up your Powershell session so that running WorksOk.ps1 now fails. The only way to get out of this state is to restart the console. This case is what we ran into during our release.
- AlwaysWorks.ps1 - The order of the cmdlets being run in this script is identical to GeneratesError.ps1; however, the cmdlet is using the sandboxed appdomains and does not manifest any errors.
In Summary...
If you find yourself running into DLL Hell with Powershell, try one of the following:
- Add strong names to your dependencies and put them in the GAC
- Add strong names to your dependencies and bin-deploy to separate folders, one per Powershell module assembly
- If strong-naming isn't an option, bin-deploy to separate folders, one per Powershell module assembly. Inside your cmdlets, use proxy wrappers to run your code inside separate appdomains
- This might be overkill, but you might also be able to build your own Powershell hosting context in which you can customize how you load and run cmdlets. If you've got one handy, try it!