Productive Rage

14 June 2015

Throwing exceptions through COM

At work, I've got a project where we're trying to migrate from an old technology to new - we've got COM components that used to be hosted in one environment that has now been replaced, with a view to replacing the legacy COM components in the future. This means that these components are often brought to life in this new environment and have some of their internal functionality relying on calls back into the new host. In a roundabout way, what I'm trying to say is that these COM components are called by .net code, they call back into that .net code themselves and sometimes have to deal with exceptions being thrown by the .net code they're calling and have to communicate those failures back up to the .net code that called them!

Phew! I think I just made unnecessarily hard work out of that introduction! :D

So.. an interesting problem has arisen in this scenario. There are limits to the ways in which managed (.net) code can talk with unmanaged (COM) code. Most of the time you can get away with following a few simple rules around types and then letting the .net interoperability magic do its work.

One place where this falls down is in exception handling. Specifically, if the (.net) hosting environment calls a COM component that calls back into the host - if that call throws an exception then that exception travels through the COM component and comes out the other side.. just not quite in the way that you might expect. Exceptions from COM components can basically only express a string message and a 32-bit "HRESULT" value (this is simplifying a bit, but it's close enough). The HRESULT value is a status code that follows a particular format (see the MSDN article "HRESULT" for the full details, but I'll touch on it further down).

If the HRESULT is a value that .net recognises as relating nicely to a particular framework exception, then it will represent the error from the COM component as an instance of that exception - eg. "COR_E_ENDOFSTREAM" (0x80070026, or -2147024858 as an Int32) will result in an EndOfStreamException being raised. Any value that isn't considered special enough to have its own framework type will be raised as a COMException.

This means that if you have any custom exceptions that you want to have pass through the system then you're in for a nasty surprise.

Take the following contrived example:

public string GetMagicName(int id)
{
  try
  {
    return GetMagicNameFromComponent(id);
  }
  catch (WidgetiserException e)
  {
    // Log Widgetiser dependency failure..
    throw;
  }
}

private string GetMagicNameFromComponent(int id)
{
  // Note: The "_widgetiser" reference is a ComVisible .net component passed into
  // the COM component as a dependency
  dynamic legacyComponent = Activator.CreateInstance(
    Type.GetTypeFromProgID("OldSystem.Calculator")
  );
  legacyComponent.Widgetiser = _widgetiser;
  return legacyComponent.GetMagicName(id);
}

There is a fictional "Calculator" COM component that was used by the old system. It had a "Widgetiser" dependency that was provided by this old system. In the new system, this component is still required but as part of the work of replacing the system around it, a new "Widgetiser" has been created - it is a facade over functionality in the new system, such that the interface expected by the legacy components is still available.

In this story I'm telling, sometimes the call into the legacy component fails. Of those times, it is useful for us to know which were due to a problem raised by the "Widgetiser" and which were due to something else that originated in the component. Helpfully, the Widgetiser throws specialised exceptions -

[Serializable]
public class WidgetiserException : Exception
{
  protected WidgetiserException(string message, string gadgetName) : base(message)
  {
    GadgetName = gadgetName;
  }

  public string GadgetName { get; private set; }

  private const string GADGET_NAME_ID = "GadgetName";
  protected WidgetiserException(SerializationInfo info, StreamingContext context)
    : base(info, context)
  {
    GadgetName = info.GetString(GADGET_NAME_ID);
  }
  public override void GetObjectData(SerializationInfo info, StreamingContext context)
  {
    info.AddValue(GADGET_NAME_ID, GadgetName);
    base.GetObjectData(info, context);
  }
}

This follows the best practices (it ends with the word "Exception", it's serialisable and it has the serialisation-based constructor and GetObjectData method).

However..

In testing, it's found that the "Log Widgetiser dependency failure.." condition is never entered.

This shouldn't (now) be a surprise since I've just explained that custom exceptions can never come directly from COM failures; any exception may only be represented by a COMException or some other specific framework exception classes.

But I want my custom exceptions! And I don't want to have to worry about whether particular code in my new system is allowed to throw custom exceptions because it will only ever be called by .net code, or if it will have to stick to built-in exception types since it might be called by legacy components.

Working with what we've got

So what we're basically limited to working with is a message (string) and a "HRESULT". If the Widgetiser throws an exception then, while it will be transformed into a COMException when it comes out of the COM component, whatever message and HRESULT values we specify will be maintained.

The HRESULT is a structure that describes a response - either a success or a failure. The first (ie. most significant) of its 32 bits indicates "severity" (1 being failure, 0 being success). Then there's a reserved bit, then a bit to indicate whether this is a "customer" response (1) or a Microsoft / framework value (0). Then two more bits we don't worry about and set to zero, then 11 bits to indicate a "Facility" (but since the options for Facility are things like "The error code is specific to Windows CE" and "The source of the error code is a Windows Defender component" it makes sense to leave these all as zero for custom errors, which means "Default Facility"). Then there's 16 bits for an error code. This, basically, can be whatever you like but each code should uniquely identify a given error type.

So, if we raise custom exceptions that have unique error codes then we could potentially use the HRESULT value from the COMException to map back to the original type (just like happens with those special exception types that .net automatically maps, like COR_E_ENDOFSTREAM to EndOfStreamException).

The simplest approach, then, would be to change our calling code -

public string GetMagicName(int id)
{
  try
  {
    return GetMagicNameFromComponent(id);
  }
  catch (COMException e)
  {
    if (e.HResult == WidgetiserException.UNIQUE_HRESULT)
    {
      // Log Widgetiser dependency failure..
    }
  }
}

private string GetMagicNameFromComponent(int id)
{
  // Note: The "_widgetiser" reference is a ComVisible .net component passed into
  // the COM component as a dependency
  dynamic legacyComponent = Activator.CreateInstance(
    Type.GetTypeFromProgID("OldSystem.Calculator")
  );
  legacyComponent.Widgetiser = _widgetiser;
  return legacyComponent.GetMagicName(id);
}

and to change the exception class slightly -

[Serializable]
public class WidgetiserException : COMException
{
  public static readonly int UNIQUE_HRESULT = -1610612735;
  protected WidgetiserException(string message, string gadgetName)
    : base(message, UNIQUE_HRESULT)
  {
    // .. the rest of the class is unaltered..

(Note: Inheriting from COMException doesn't magically allow for the exception class to be recognised when it pops out of a COM component, but it does have a constructor that takes a message and a HRESULT value, which is handy here).

This still feels too error-prone for my liking, though. What if I add a range of custom exceptions that need to be supported? Then I'd need to check for all of these different HRESULT values in my try..catch blocks.

A slight variation would be to have a helper function "TryToRetrieveCustomException" -

public static Exception TryToRetrieveCustomException(COMException e)
{
  if (e.HResult == WidgetiserException.ERROR_HRESULT)
    return new WidgetiserException(e.message, "something");
  return null;
}

and to call this from within each catch block. That way, when new exceptions are defined they only need to explicitly be considered by the "TryToRetrieveCustomException" function and not within every possibly-affected catch block.

Another thing that bothers me is that (returning to my example) the "GetMagicName" function has to consider that it's relying upon a COM component and that a COMException must be caught. In the future, the COM component may get re-written into a .net version - at which point, it will look odd to future maintainers that a COMException is being caught when, really, it's a WidgetiserException that is of interest.

We can do better.

The "COMSurvivableException"

In case you weren't paying full attention, there is another problem in the "TryToRetrieveCustomException" function above - the constructor on the WidgetiserException takes two arguments; the message and the gadgetName. When "TryToRetrieveCustomException" creates a new WidgetiserException, it can only set the message (not the gadgetName) since that's all that's available on the COMException that is has a reference to. It doesn't know what the gadgetName should be!

Let's jump straight into a possible solution -

[Serializable]
public abstract class COMSurvivableException : COMException
{
  private static readonly Dictionary<ushort, Reviver> _revivers
    = new Dictionary<ushort, Reviver>();
  protected COMSurvivableException(string messageWithAnyStateData, Reviver reviver)
    : base(messageWithAnyStateData)
  {
    if (string.IsNullOrWhiteSpace(messageWithAnyStateData))
      throw new ArgumentException("Null/blank messageWithAnyStateData specified");
    if (reviver == null)
      throw new ArgumentNullException("reviver");

    lock (_revivers)
    {
      _revivers[UniqueErrorCode] = reviver;
    }
    HResult = CustomErrorHResultGenerator.GetHResult(UniqueErrorCode);
  }

  protected delegate COMSurvivableException Reviver(string messageWithAnyStateData);

  protected abstract ushort UniqueErrorCode { get; }

  protected COMSurvivableException(SerializationInfo info, StreamingContext context)
    : base(info, context) { }

  [DebuggerStepThrough]
  public static void RethrowAsOriginalIfPossible(COMException e)
  {
    if (e == null)
      throw new ArgumentNullException("e");

    var uniqueErrorCode = CustomErrorHResultGenerator.GetErrorCode(e.HResult);
    Reviver reviver;
    lock (_revivers)
    {
      if (!_revivers.TryGetValue(uniqueErrorCode, out reviver))
        return;
    }
    throw reviver(e.Message);
  }

  private static class CustomErrorHResultGenerator
  {
    private const int CUSTOMERROR_BASE
      = (1 << 31) /* Severity = 1 */
      | (1 << 29) /* Customer = 1 */;

    public static int GetHResult(ushort errorCode)
    {
      return CUSTOMERROR_BASE | errorCode;
    }

    private const int ERROR_CODE_MASK = (int)short.MaxValue;
    public static ushort GetErrorCode(int hresult)
    {
      return (ushort)(hresult & ERROR_CODE_MASK);
    }
  }
}

This is a base class that exceptions may be derived from if they must be able to survive travelling through COM components.

Each derived class will be responsible for declaring a unique error code and a "reviver". The error code is of type ushort, which is a 16 bit .net type - there's no point making the derived types do the maths around working out how to set the is-failure, is-custom-error, etc.. bits in a HRESULT. A reviver is basically just a way to de-serialise data that is stored in the message property.

Derived classes are responsible for serialising data in the message property - this sounds like it could be complex but it can also be very simple in many cases (as shown below). The COMSurvivableException maintains mappings of error codes to revivers so that it can implement a "RethrowAsOriginalIfPossible" function to handle translating a COMException back into a more meaningful type.

In practice, this means that we could implement WidgetiserException as

[Serializable]
public class WidgetiserException : COMSurvivableException
{
  protected WidgetiserException(string message, string gadgetName)
    : base(GetMessageWithStateData(message, gadgetName), Revive)
  {
    GadgetName = gadgetName;
  }

  public string GadgetName { get; private set; }

  protected override ushort UniqueErrorCode { get { return 1; } }

  private const string GADGET_NAME_ID = "GadgetName";
  protected WidgetiserException(SerializationInfo info, StreamingContext context)
    : base(info, context)
  {
    GadgetName = info.GetString(GADGET_NAME_ID);
  }
  public override void GetObjectData(SerializationInfo info, StreamingContext context)
  {
    info.AddValue(GADGET_NAME_ID, GadgetName);
    base.GetObjectData(info, context);
  }

  private static string GetMessageWithStateData(string message, string gadgetName)
  {
    return (message ?? "").Replace("\n", " ") + "\n" + gadgetName;
  }
  private static COMSurvivableException Revive(string messageWithAnyStateData)
  {
    var messageParts = (messageWithAnyStateData ?? "").Split(new[] { '\n' }, 2);
    if (messageParts.Length != 2)
      throw new Exception("Invalid state data");
    return new WidgetiserException(messageParts[0], messageParts[1]);
  }
}

and then change the COM component calling code to

public string GetMagicName(int id)
{
  try
  {
    return GetMagicNameFromComponent(id);
  }
  catch (WidgetiserException e)
  {
    // Log Widgetiser dependency failure..
    throw;
  }
}

private string GetMagicNameFromComponent(int id)
{
  try
  {
    // Note: The "_widgetiser" reference is a ComVisible .net component passed into
    // the COM component as a dependency
    dynamic legacyComponent = Activator.CreateInstance(
      Type.GetTypeFromProgID("OldSystem.Calculator")
    );
    legacyComponent.Widgetiser = _widgetiser;
    return legacyComponent.GetMagicName(id);
  }
  catch (COMException e)
  {
    COMSurvivableException.RethrowAsOriginalIfPossible(e);
    throw;
  }
}

Note that the "GetMagicName" function has now returned to the idealised version that I started with.

There are a couple of sacrifices - the WidgetiserException has expanded a little and it has become aware that it must be "COM-survivable". This adds some maintenance burden in that all exceptions that may have to pass through a COM component need to know that this may happen. And, in the glorious future in which all COM components have been rewritten with efficient, clean, testable, beautiful .net versions, it will look strange that there are still exceptions in use which identify themselves as COM-survivable. When this day arrives, it should not be much work to change the custom exceptions to not derive from COMSurvivableException - and doing so should be a pleasingly cathartic way to celebrate the completed migration :)

Another cost is that the call to the COM component has now got an additional try..catch block. I think this is a very reasonable trade-off, though, since now the complexity has been pushed right down to the call site of the COM component (the "GetMagicNameFromComponent" function) - the function that calls that ("GetMagicName") is clean. And when the COM component is no longer used, when the "Activator.CreateInstance" call is replaced with a simple new'ing-up of a .net class, the no-longer-necessary try..catch may be removed - this is another benefit to the complexity being pushed down to the call site; it's clear what it's for (unlike in the first solution I proposed, where it's not clear why the "GetMagicNumber" function has to go querying HRESULT values until you look into the "GetMagicNameFromComponent" that it calls).

Other remarks

The WidgetiserException class above uses a very coarse "serialisation" mechanism - it hopes that the "message" string will have no line returns in (replacing any that it does have with spaces) and then uses a line return as a delimiter in a string that combines the "message" and "gadgetName" properties. This is very simple to "deserialise", the combined string need only be split on the first line return and then the two original values are once again available. It's feasible that this approach would not be acceptable in some use cases (if the "message" value must be allowed to contain line returns) but I just used it to make the example clear. The serialisation mechanism could use the .net xml serialiser, for example, to cram the exception's state into a string. Or you might be a big fan of Json.NET and want to serialise into json. So long as you can record the derived classes' data in a string to give to the COMSurvivableException, you can do whatever you want!

On the COMSurvivableException class itself, you might notice that it has to lock its error-code-to-reviver mappings dictionary whenever a new instance of an exception is created that is derived from this abstract class. And locks it again when that mapping data is read in order to try to translate a COMException back into its original form. There are times to stress and worry about whether locking is going to affect throughput and what it might do to concurrency, but when you're throwing an exception then you're already in the middle of a fairly expensive operation (it has to generate a stack trace, aside from anything else) and so this is not somewhere where a little bit of locking is going to be anything to lose sleep over.

The "RethrowAsOriginalIfPossible" function on the COMSurvivableException class is decorated with the "DebuggerStepThrough" attribute. This instructs the debugger never to break inside this function. Considering that the sole purpose of this method is to raise an exception for the caller, it makes little sense to allow the debugger to stop inside the function - the point is to deliver interesting information to the caller of "RethrowAsOriginalIfPossible" and that is where it may make sense for a debugger to stop on an exception. Only a minor tweak but it makes this whole dirty workaround a little more palatable when used in the real world.

More information

I've already linked to the MSDN article that explains the layout of the HRESULT value but it might also be of interest to know which HRESULT values are automatically translated into framework exception types - see How to: Map HRESULTs and Exceptions. That article lists the constants by name, I don't know of any better way to find out what the numeric values are (should you need them) than putting the name into Google and following a link or two!

Finally, there was an old MSDN article about doing something similar in nature to what I've covered here - but it seems to have gone off-line now, the link I had to it is broken. Thankfully the WayBackMachine can help us and make available Throwing Custom Exception Types from a Managed COM+ Server Application. From what I understand, this talks about wrapping components fully in a proxy object that maps custom exceptions based upon HRESULT and that serialises exception state as XML. There are complexities around proxying in this sort of manner but the article is worth a read - the benefits are that the custom exceptions do not need to be derived from anything other than Exception and, once the proxies are created, you don't need to add extra try..catch blocks around COM component calls in order to map exceptions. The downsides include potential complications around deployment and debugging, which are described in the article.

Posted at 21:29

Tags:

Comments

29 October 2013

What is Nothing?

What a deep existential question!

Well.. maybe not in this context..

Here I'm talking about good old VBScript; a technology at work that just refuses to completely go away. We still have software running on a combination of VBScript and .net. One of them makes use of Windows Scripting Components; basically VBScript wrapped up to act like a COM component. The advantage is that we can look at replacing areas of legacy code with .net (on-going maintenance and testing concerns are important here but the performance gap between the two technologies is startling too*) without having to throw everything away all at once.

* (Not surprising since not only is VBScript interpreted - rather than compiled - but also since it hasn't benefited from optimisation or active development for over a decade).

One of the downsides of this, however, is dealing with VBScript's oddities. A lot of this is handled very nicely by COM (and .net's COM integrations) at the boundaries - a lot of basic types can be passed from .net to these components (and vice versa). You pass in a .net string and it's happily translated into BSTR (see Eric's Complete Guide To BSTR Semantics, before Eric Lippert was a C# genius he was responsible for a lot of work on the VBScript interpreter). Likewise with ints and booleans.

But one of the craziest areas of VBScript is its representations of null. It has three of them. Three. And this is where we can get unstuck.

Empty, Null, Nothing (why, oh why?)

This is a bit of history, if you've ended up at this page looking for the same thing I was (until recently) looking for (how "Nothing" can be represented by .net) then jump down to the next section.

I'm going to draw a parallel to JavaScript here since that effectively has two representations of "null" and will be much more well known.

In JavaScript, if a variable is declared but unintialised then it has type "undefined" - eg.

var a;
alert(typeof(a)); // "undefined"

This means that this variable has no value, we have not given it a value, we don't care at this point what it's value may or may not be.

This is different from explicitly setting a variable to null. This is an intentional application of a value to a variable - eg.

var a = null;
alert(typeof(a)); // "object"

Why it decides to describe "null" as an "object" could be a discussion for another day, but it's sufficient to show that it has been given an actual value, it is not "undefined" any more.

Now these are similar to VBScript's Empty and Null - in VBScript, Empty means that the variable has not been initialised while Null means that it has explicitly set to Null. There are occasions where it's useful to say "I have tried to access this item and have found it to be absent" - hence giving it a null value - as opposed to "I haven't even attempted to populate this value".

But Nothing is a different beast. VBScript has different assignment semantics for what it considers to be object references versus primitive types. If you want to set a value to be an "object" type (a VBScript class instance, for example) then you have to use the "SET" keyword -

Set u = GetUser()

If you omitted the "SET" then it would try to set "u" to what VBScript considers a value type (a string, number, etc..). To do this it would look for a default (parameter-less) property or function on the object. If it can't find one then it will throw a rather unhelpful "Type mismatch" error.

So far as I can tell, this is solely to try to make some tasks which are already easy even easier. For example, if the GetUser function returns an object reference with a default (and parameter-less) Name property then writing

WScript.Echo GetUser()

would print out the Name property. This is presumably because

WScript.Echo GetUser().Name

would be too hard??

By supporting these default member options, a way to say "I don't want a default property, I want the object reference itself" is required. This is what the "SET" keyword is for.

I'm thinking it's total madness. While possibly making some easy things a tiny bit easier, it makes some otherwise-not-too-difficult things really difficult and convoluted!

The prime example is "Nothing". If you want a function that will return an object then you will call that method using "SET". But this will mean that you can't return Null to indicate no result since Null isn't an object and trying to do what amounts to

Set u = Null

will result in another unfriendly error

Object required: 'Null'

Fantastic.

So VBScript needs a way to represent an object type that effectively means "no value", but that is different to Empty (since that means not initialised) and Null (since that isn't an object).

Nothing in .net

For a long time I'd thought that Nothing must somehow be an internal VBScript concept. There were three things that had me half-convinced of this:

There was no carryover into VB.Net, there is "Nothing" there but it is equivalent to null in C# - there aren't two values that can be accessed (Null vs Nothing), not even for some sort of backward compatibility
If you pass Nothing over the COM boundary to a .net COM component, you get a null reference (not some magic other object type)
Multiple web searches failed; "How do I represent Nothing in a COM component to interact with VBScript?" Crickets..

Point 2 is partly down to the cleverness of the .net / COM integration where it converts types into native CLR types where it can. VBScript's "Nothing" really could be said to equate to null in an environment where such a hard distinction between value and reference types is unrequired.

But there could be legacy WSC components that have methods that differentiate between an argument that represents Null and one that represents Nothing, so I didn't want to give up completely.

At some point, I had two breakthroughs. I don't know what was different about this web search.. maybe the work I did earlier this year with COM and IDispatch has helped me understand that way of thinking more or perhaps I was just more dogged in my refusing to accept defeat when looking for an answer. But I've finally struck gold! (Wow, such an exaggeration for something that may never be of use to anyone else, ever :)

And as I write it out, it sounds frustratingly rudimentary. But, as I said, I found it incredibly hard to actually piece this together.

In VBScript, all values are of type VARIANT. This can represent booleans, numbers, strings, a pointer to an IDispatch implementation, all sorts.

A VARIANT has a type to indicate what it represents, as can be seen on MSDN: VARIANT Type Constants.

To VBScript, Empty means a null VARIANT. No reference to a variant at all.

Null means a VARIANT of type VT_NULL (incidentally, System.DBNull.Value maps back and forth onto this over the COM boundary).

Nothing means a VARIANT of type VT_EMPTY. (VBScript internally decides that this is an "object" type, as opposed to Null, which a value type).

So the final puzzle piece; how do we represent this arbitrary VARIANT type in .net?

I found this article (well, chapter from the book ".NET and COM: The Complete Interoperability Guide"): The Essentials for Using COM in Managed Code - which contains this magic section

Because null (Nothing) is mapped to an "empty object" (a VARIANT with type VT_EMPTY) when passed to COM via a System.Object parameter, you can pass new DispatchWrapper(null) or new UnknownWrapper(null) to represent a null object.

And that's it! All you need is

var nothing = new DispatchWrapper(null);

and you've got a genuine "Nothing" reference that you can pass to VBScript and have it recognise! If you use the VBScript TypeName function then you get "Nothing" reported. That's all there is to to it, it is possible!

Follow-up: ComVisible return types (28th Dec 2013)

I've done some more experimenting with this since I found some legacy code that I'd written a few years ago that infuriatingly seemed to manage to return Nothing from a method without explicitly specifying it with the DispatchWrapper as above.

It turns out that if the return type of a method is a class that has the [ComVisible(true)] attribute then returning null from .net will result in VBScript interpreting the response as Nothing. However, if the return type is not a type with that attribute then it will not be translated into null.

public ComVisibleType Get(int id)
{
  return null; // VBScript will interpet this as Nothing
}

public object Get(int id)
{
  return null; // VBScript will interpet this as Empty
}

[ComVisible(true)]
public class ComVisibleType
{
  public string Name { get; set; }
}

Posted at 22:34

Tags:

Comments

5 March 2013

Supporting IDispatch through the COMInteraction wrapper

Some time ago, I wrote some code that would generate a wrapper to apply a given interface to any object using reflection. The target object would need to expose the properties and methods of the interface but may not implement the interface itself. This was intended to wrap some old WSC components that I was having to work with but is just as easy to demonstrate with .Net classes:

using System;
using COMInteraction.InterfaceApplication;
using COMInteraction.InterfaceApplication.ReadValueConverters;

namespace DemoApp
{
  class Program
  {
    static void Main(string[] args)
    {
      // Warning: This code will not compile against the current code since the interface has changed
      // since the example was written but read on to find out how it's changed!
      // - Ok, ok, just replace "new InterfaceApplierFactory" with "new ReflectionInterfaceApplierFactory"
      //   and "InterfaceApplierFactory.ComVisibilityOptions.Visible" with "ComVisibilityOptions.Visible"
      //   :)
      var interfaceApplierFactory = new InterfaceApplierFactory(
        "DynamicAssembly",
        InterfaceApplierFactory.ComVisibilityOptions.Visible
      );
      var interfaceApplier = interfaceApplierFactory.GenerateInterfaceApplier<IAmNamed>(
        new CachedReadValueConverter(interfaceApplierFactory)
      );

      var person = new Person() { Id = 1, Name = "Teddy" };
      var namedEntity = interfaceApplier.Apply(person);
    }
  }

  public interface IAmNamed
  {
    string Name { get; }
  }

  public class Person
  {
    public int Id { get; set; }
    public string Name { get; set; }
  }
}

The "namedEntity" reference implements IAmNamed and passes the calls through to the wrapped "Person" instance through reflection (noting that Person does not implement IAmNamed). Of course, this will cause exceptions if an instance is wrapped that doesn't expose the properties or methods of the interface when those are called.

I wrote about the development of this across a few posts: Dynamically applying interfaces to objects, Part 2 and Part 3 (with the code available at the COMInteraction project on Bitbucket).

And this worked fine for the purpose at hand. To be completely frank, I'm not entirely sure how it worked when calling into the WSC components that expose a COM interface since I'm surprised the reflection calls are able to hook into the methods! It felt a bit hacky..

But just recently I've gotten into some of the nitty gritty of IDispatch (see IDispatch (IWastedTimeOnThis but ILearntLots)) and thought maybe I could bring this information to bear on this project.

Supporting Reflection and IDispatch

The existing InterfaceApplierFactory class has been renamed to the ReflectionInterfaceApplierFactory and a new implementation of IInterfaceApplierFactory has been added: the IDispatchInterfaceApplierFactory. (Where do I come up with these catchy names?! :).

Where the existing (and now renamed) class generated IL to access the methods and properties through reflection, the new class generates IL to access the methods and properties using the code from my previous IDispatch post, handily wrapped up into a static IDispatchAccess class.

The code to do this wasn't too difficult to write, starting with the reflection-approach code as a template and writing the odd bit of test code to disassemble with ildasm if I found myself getting a bit lost.

While I was doing this, I changed the structure of the ReflectionInterfaceApplierFactory slightly - I had left a comment in the code explaining that properties would be defined for the type generated by the IL with getter and setter methods attached to it but that these methods would then be overwritten since the code that enumerates methods for the implemented interface(s) picks up "get_" and "set_" methods for each property. The comment goes on to say that this doesn't appear to have any negative effect and so hasn't been addressed. But with this revision I've gone a step further and removed the code the generates the properties entirely, relying solely on the "get_" and "set_" methods that are found in the interface methods as this seems to work without difficulty too! Even indexed properties continue to work as they get the methods "get_Item" and "set_Item" - if you have a class with an indexed property you may not also have a method named "Item" as you'll get a compilation error:

"The type 'Whatever' already contains a definition for 'Item'

I'm not 100% confident at this point that what I've done here is correct and whether or not I'm relying on some conventions that may not be guaranteed in the future. But I've just received a copy of "CLR via C#" in the post so maybe I'll get a better idea of what's going on and amend this in the future if required!

The types generated by the IDispatchInterfaceApplierFactory will not work if properties are not explicitly defined (and so IL is emitted to do this properly in this case).

Choosing Reflection, IDispatch (or neither)

Another new class is the CombinedInterfaceApplierFactory which is intended to take the decision-making out of the use of reflection / IDispatch. It will generate an IInterfaceApplier whose Apply method will apply an IDispatch wrapper if the object-to-wrap's type's IsCOMObject property is true. Otherwise it will use reflection. Actually, it performs a check before this to ensure that the specified object-to-wrap doesn't already implement the required interface - in which case it returns it straight back! (This was useful in some scenarios I was testing out and also makes sense; if the type doesn't require any manipulation then don't perform any).

Not being Lazy

I realised, going back to this project, that I'd got over-excited when I discovered the Lazy<T> class in .Net 4 and used it when there was some work in the code that I wanted to defer until it was definitely required. But this, of course, meant that the code had a dependency on .Net 4! Since I imagine that this could be useful in place of the "dynamic" keyword in some cases, I figured it would make sense to try to remove this dependency. (My over-excitement is probably visible when I wrote about it at Check, check it out).

I was using it with the "threadSafe" option set to true so that the work would only be executed once (at most). This is a fairly straight forward implementation of the double-checked locking pattern (if there is such a thing! :) with a twist that if the work threw an exception then that exception should be thrown for not only the call on the thread that actually performed the work but also subsequent calls:

using System;

namespace COMInteraction.Misc
{
  /// <summary>
  /// This is similar to the .Net 4's Lazy class with the isThreadSafe argument set to true
  /// </summary>
  public class DelayedExecutor<T> where T : class
  {
    private readonly Func<T> _work;
    private readonly object _lock;
    private volatile Result _result;
    public DelayedExecutor(Func<T> work)
    {
      if (work == null)
        throw new ArgumentNullException("work");

      _work = work;
      _lock = new object();
      _result = null;
    }

    public T Value
    {
      get
      {
        if (_result == null)
        {
          lock (_lock)
          {
            if (_result == null)
            {
              try
              {
                _result = Result.Success(_work());
              }
              catch(Exception e)
              {
                _result = Result.Failure(e);
              }
            }
          }
        }
        if (_result.Error != null)
          throw _result.Error;
        return _result.Value;
      }
    }

    private class Result
    {
      public static Result Success(T value)
      {
        return new Result(value, null);
      }
      public static Result Failure(Exception error)
      {
        if (error == null)
          throw new ArgumentNullException("error");
        return new Result(null, error);
      }
      private Result(T value, Exception error)
      {
        Value = value;
        Error = error;
      }

      public T Value { get; private set; }

      public Exception Error { get; private set; }
    }
  }
}

Conclusion

So finally, the project works with .Net 3.5 and can be used with only the following lines:

var interfaceApplierFactory = new CombinedInterfaceApplierFactory(
  new ReflectionInterfaceApplierFactory("DynamicAssembly", ComVisibilityOptions.Visible),
  new IDispatchInterfaceApplierFactory("DynamicAssembly", ComVisibilityOptions.Visible)
);
var interfaceApplier = interfaceApplierFactory.GenerateInterfaceApplier<IWhatever>(
  new CachedReadValueConverter(interfaceApplierFactory)
);
var wrappedInstance = interfaceApplier.Apply(obj);

In real use, you would want to share generated Interface Appliers rather than creating them each time a new instance needs wrapping up in an interface but how you decide to handle that is down to you!*

* (I've also added a CachingInterfaceApplierFactory class which can be handed to multiple places to easily enable the sharing of generated Interface Appliers - that may well be useful in preventing more dynamic types being generated than necessary).

Posted at 23:29

Tags:

Comments

18 February 2013

IDispatch (IWastedTimeOnThis but ILearntLots)

For something I've been working on it looked like I was going to have to interact with COM objects from a legacy system without type libraries and where the internals were written in VBScript. Ouch. It seemed like a restriction of the environment meant that .Net 4 wouldn't be available and so the dynamic keyword wouldn't be available.

It would seem that the COMInteraction code that I wrote in the past would be ideal for this since it should wrap access to generic COM objects but I encountered a problem with that (which I'll touch briefly on later in this post).

So the next step was to find out about the mysterious IDispatch interface that I've heard whispered about in relation to dealings with generic COM objects! Unfortunately, I think in the end I found a way to get .Net 4 into play for my original problem so this might all have been a bit of a waste of time.. but not only was it really interesting but I also found nowhere else on the internet that was doing this with C#. And I read up a lot. (There's articles that touch on most of it, but not all - read on to find out more! :)

What is IDispatch

From IDispatch on Wikipedia:

IDispatch is the interface that exposes the OLE Automation protocol. It is one of the standard interfaces that can be exposed by COM objects .. IDispatch derives from IUnknown and extends its set of three methods (AddRef, Release and QueryInterface) with four more methods - GetTypeInfoCount, GetTypeInfo, GetIDsOfNames and Invoke.

Each property and method implemented by an object that supports the IDispatch interface has what is called a Dispatch ID, which is often abbreviated DISPID. The DISPID is the primary means of identifying a property or method and must be supplied to the Invoke function for a property or method to be invoked, along with an array of Variants containing the parameters. The GetIDsOfNames function can be used to get the appropriate DISPID from a property or method name that is in string format.

It's basically a way to determine what methods can be called on an object and how to call them.

How to use it

I got most of the useful information first from these links:

Calling a member of IDispatch COM interface from C# (Stack Overflow)
Setting a Property by IDispatch Invoke (particularly section 3.4)
Using VARIANTs in Managed Code Part 3 (section 2.4)

The first thing to do is to cast the object reference to the IDispatch interface (this will only work if the object implements IDispatch, for the COM components I was targetting this was the case). The interface isn't available in the framework but can be hooked up with

[ComImport()]
[Guid("00020400-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IDispatch
{
  [PreserveSig]
  int GetTypeInfoCount(out int Count);

  [PreserveSig]
  int GetTypeInfo
  (
    [MarshalAs(UnmanagedType.U4)] int iTInfo,
    [MarshalAs(UnmanagedType.U4)] int lcid,
    out System.Runtime.InteropServices.ComTypes.ITypeInfo typeInfo
  );

  [PreserveSig]
  int GetIDsOfNames
  (
    ref Guid riid,
    [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.LPWStr)]
    string[] rgsNames,
    int cNames,
    int lcid,
    [MarshalAs(UnmanagedType.LPArray)] int[] rgDispId
  );

  [PreserveSig]
  int Invoke
  (
    int dispIdMember,
    ref Guid riid,
    uint lcid,
    ushort wFlags,
    ref System.Runtime.InteropServices.ComTypes.DISPPARAMS pDispParams,
    out object pVarResult,
    ref System.Runtime.InteropServices.ComTypes.EXCEPINFO pExcepInfo,
    out UInt32 pArgErr
  );
}

Then the GetIDsofNames is called to determine whether a given method is present:

private const int LOCALE_SYSTEM_DEFAULT = 2048;

// rgDispId will be populated with the DispId of the named member (if available)
var rgDispId = new int[1] { 0 };

// IID_NULL must always be specified for the "riid" argument
// (see http://msdn.microsoft.com/en-gb/library/windows/desktop/ms221306(v=vs.85).aspx)
var IID_NULL = new Guid("00000000-0000-0000-0000-000000000000");

var hrRet = ((IDispatch)source).GetIDsOfNames
(
  ref IID_NULL,
  new string[1] { name },
  1, // number of names to get ids for
  LOCALE_SYSTEM_DEFAULT,
  rgDispId
);
if (hrRet != 0)
  throw new Exception("Uh-oh!");

return rgDispId[0];

Then the Invoke method is called with the Disp Id, the type of call (eg. execute method, set property, etc..), a "local context" ("applications that do not support multiple national languages can ignore this parameter" - IDispatch::Invoke method (Automation) at MSDN) and the parameters.

Calling a argument-less method

private const int LOCALE_SYSTEM_DEFAULT = 2048;
private const ushort DISPATCH_METHOD = 1;

var dispId = 19; // Or whatever the above code reported

// This DISPPARAMS structure describes zero arguments
var dispParams = new System.Runtime.InteropServices.ComTypes.DISPPARAMS()
{
  cArgs = 0,
  cNamedArgs = 0,
  rgdispidNamedArgs = IntPtr.Zero,
  rgvarg = IntPtr.Zero
};

var IID_NULL = new Guid("00000000-0000-0000-0000-000000000000");
UInt32 pArgErr = 0;
object varResult;
var excepInfo = new System.Runtime.InteropServices.ComTypes.EXCEPINFO();
var hrRet = ((IDispatch)source).Invoke
(
  dispId,
  ref IID_NULL,
  LOCALE_SYSTEM_DEFAULT,
  DISPATCH_METHOD,
  ref dispParams,
  out varResult,
  ref excepInfo,
  out pArgErr
);
if (hrRet != 0)
  throw new Exception("FAIL!");
return varResult;

The DISPPARAMS structure (which is part of the framework) enables the specification of both "named" and "unnamed" arguments. When calling a method, unnamed arguments may be passed in but when setting a property, the value that the property is to be set to must be passed as a named argument with the special constant DISPID_PROPERTYPUT (-3).

The above code could also be used to retrieve a property value (a non-indexed property) by replacing the DISPATCH_METHOD value with DISPATCH_PROPERTYGET (2).

Calling a single-argument method

[DllImport(@"oleaut32.dll", SetLastError = true, CallingConvention = CallingConvention.StdCall)]
static extern Int32 VariantClear(IntPtr pvarg);

private const int LOCALE_SYSTEM_DEFAULT = 2048;
private const ushort DISPATCH_METHOD = 1;
private const int SizeOfNativeVariant = 16;

var dispId = 19; // Or whatever the above code reported
var arg = "Value";

// This DISPPARAMS describes a single (unnamed) argument
var pVariant = Marshal.AllocCoTaskMem(SizeOfNativeVariant);
Marshal.GetNativeVariantForObject(arg, pVariant);
var dispParams = new System.Runtime.InteropServices.ComTypes.DISPPARAMS()
{
  cArgs = 1,
  cNamedArgs = 0,
  rgdispidNamedArgs = IntPtr.Zero,
  rgvarg = pVariant
};

try
{
  var IID_NULL = new Guid("00000000-0000-0000-0000-000000000000");
  UInt32 pArgErr = 0;
  object varResult;
  var excepInfo = new System.Runtime.InteropServices.ComTypes.EXCEPINFO();
  var hrRet = ((IDispatch)source).Invoke
  (
    dispId,
    ref IID_NULL,
    LOCALE_SYSTEM_DEFAULT,
    DISPATCH_METHOD,
    ref dispParams,
    out varResult,
    ref excepInfo,
    out pArgErr
  );
  if (hrRet != 0)
    throw new Exception("FAIL!");
  return varResult;
}
finally
{
  VariantClear(pVariant);
  Marshal.FreeCoTaskMem(pVariant);
}

As mentioned above, when calling methods there is no need to named arguments so cNamedArgs is still 0 and rgdispidNamedArgs is still IntPtr.Zero (a managed version of a null pointer).

From what I understand (and I'd never used Marshal.AllocCoTaskMem or Marshal.GetNativeVariantForObject before a couple of days ago!), the AllocCoTaskMem call allocates a chunk of unmanaged memory and then GetNativeVariantForObject copies a managed reference into that memory. A variant is always 16 bytes. This is the same variant type used for all VBScript calls, for example, and used for method arguments for IDispatch. More about the VARIANT structure can be found at this MSDN article.

The framework does some sort of clever manipulation to copy the contents of the managed reference into unmanaged memory, the internals of which I'm not going to worry too much about. But there's a couple of things to note; this is a copy operation so if I was getting involved with unmanaged memory for performance reasons then I'd probably want to avoid this. But it does mean that this copied memory is "safe" from the garbage collector doing anything with it. When you peel it back a layer, managed memory can't be expected to work as predictably as unmanaged memory as the garbage collector is free to be doing all manner of clever things to stay on top of memory usage and references and, er.. stuff. Which is a good thing because (for the large part) I don't have to worry about it! But it would be no good if the garbage collector moved memory around that the COM component was in the middle of accessing. Bad things would happen. Bad intermittent things (the worst kind). But this does have one important consequence; since the GC is not in control of this memory, I need to explicitly release it myself when I'm done with it.

Another side note on this: The system also needs to be sure that the GC doesn't do anything interesting with memory contents while it's performing to copy to the variant. The framework uses something called "automatic pinning" to ensure that the reference being considered by the Marshal.GetNativeVariantForObject doesn't move during this operation (ie. it is "pinned" in place in memory). There is also a way to manually pin data where a particular reference can be marked such that its memory not be touched by the GC until it's freed (using GCHandle.Alloc and the GCHandleType.Pinned option, and later calling .Free on the handle returned by Alloc) which may be used in the passing-by-reference approach I alluded to above, but I won't need it here.

Setting a (non-indexed) property

[DllImport(@"oleaut32.dll", SetLastError = true, CallingConvention = CallingConvention.StdCall)]
static extern Int32 VariantClear(IntPtr pvarg);

private const int LOCALE_SYSTEM_DEFAULT = 2048;
private const ushort DISPATCH_PROPERTYPUT = 4;
private const int DISPID_PROPERTYPUT = -3;
private const int SizeOfNativeVariant = 16;

var dispId = 19; // Or whatever the above code reported
var arg = "Value";

// This DISPPARAMS describes a single named (DISPID_PROPERTYPUT) argument
var pNamedArg = Marshal.AllocCoTaskMem(sizeof(Int64));
Marshal.WriteInt64(pNamedArg, DISPID_PROPERTYPUT);
var pVariant = Marshal.AllocCoTaskMem(SizeOfNativeVariant);
Marshal.GetNativeVariantForObject(arg, pVariant);
var dispParams = new System.Runtime.InteropServices.ComTypes.DISPPARAMS()
{
  cArgs = 1,
  cNamedArgs = 1,
  rgdispidNamedArgs = pNamedArg,
  rgvarg = pVariant
};

try
{
  var IID_NULL = new Guid("00000000-0000-0000-0000-000000000000");
  UInt32 pArgErr = 0;
  object varResult;
  var excepInfo = new System.Runtime.InteropServices.ComTypes.EXCEPINFO();
  var hrRet = ((IDispatch)source).Invoke
  (
    dispId,
    ref IID_NULL,
    LOCALE_SYSTEM_DEFAULT,
    DISPATCH_PROPERTYPUT,
    ref dispParams,
    out varResult,
    ref excepInfo,
    out pArgErr
  );
  if (hrRet != 0)
    throw new Exception("FAIL!");
}
finally
{
  VariantClear(pVariant);
  Marshal.FreeCoTaskMem(pVariant);
  VariantClear(pNamedArg);
  Marshal.FreeCoTaskMem(pNamedArg);
}

The example code in section 3.4 of the Setting a Property by IDispatch Invoke post I linked to earlier uses a manual pinning approach to specifying the named arguments data but as I understand it we can copy the DISPID_PROPERTYPUT value into unmanaged memory instead, in the same way as the property value is passed over the COM boundary.

Specifying multiple arguments

The final step is to support multiple arguments, whether this be for calling methods or for dealing with indexed properties. This is the step that I've been unable to find any examples for in C#.

The problem is that there need to be multiple variant arguments passed to the Invoke call but no built-in way to allocate an array of variants to unmanaged memory. This Stack Overflow question on IntPtr arithmetics looked promising but didn't quite cover it. And it revealed that I didn't know very much about the unsafe and fixed keywords :(

The final code I've ended up with doesn't seem that complicated in and of itself, but I feel like I've gone through the wringer a bit trying to confirm that it's actually correct! The biggest question was how to go allocating a single variant

var rgvarg = Marshal.AllocCoTaskMem(SizeOfNativeVariant);
Marshal.GetNativeVariantForObject(arg, rgvarg);

// Do stuff..

VariantClear(rgvarg);
Marshal.FreeCoTaskMem(rgvarg);

to allocating multiple. I understood that the array of variants should be laid out sequentially in memory but the leap took me some time to get to

var rgvarg = Marshal.AllocCoTaskMem(SizeOfNativeVariant * args.Length);
var variantsToClear = new List<IntPtr>();
for (var index = 0; index < args.Length; index++)
{
  var arg = args[(args.Length - 1) - index]; // Explanation below..
  var pVariant = new IntPtr(
    rgvarg.ToInt64() + (SizeOfNativeVariant * index)
  );
  Marshal.GetNativeVariantForObject(arg, pVariant);
  variantsToClear.Add(pVariant);
}

// Do stuff..

foreach (var variantToClear in variantsToClear)
  VariantClear(variantToClear);
Marshal.FreeCoTaskMem(rgvarg);

Particularly the concerns about the pointer arithmetic which I wasn't sure C# would like, especially after trying to digest all of the Stack Overflow question. But another Add offset to IntPtr did give me some hope thought it led me get thrown by this MSDN page for the .Net 4 IntPtr.Add method, with its usage of unsafe and fixed!

public static void Main()
{
  int[] arr = { 2, 4, 6, 8, 10, 12, 14, 16, 18, 20 };
  unsafe {
    fixed(int* parr = arr) {
      IntPtr ptr = new IntPtr(parr);
      for (int ctr = 0; ctr < arr.Length; ctr++)
      {
        IntPtr newPtr = IntPtr.Add(ptr, ctr * sizeof(Int32));
        Console.Write("{0}   ", Marshal.ReadInt32(newPtr));
      }
    }
  }
}

So the good news; pointer arithmetic would, dealt with properly, not end the world. Ok, good. And apparently it's safe to always manipulate them using the ToInt64 method

IntPtr ptr = new IntPtr(oldptr.ToInt64() + 2);

whether on a 32 or 64 bit machine. With overhead on 32 bit systems, but I'm not looking for ultimate performance here, I'm looking for functionality! (This last part is one of the answers on Stack Overflow: Add offset to IntPtr.

From what I've learnt about pinning and its effects on the garbage collector, the "fixed" call in the MSDN example is to lock the array in place while it's being iterated over. Since at each insertion into the unmanaged memory I've allocated I'm using Marshal.GetNativeVariantForObject then I don't need to worry about this as that method is copying the data and automatic pinning is holding the data in place while it does so. So I'm all good - I just need to keep track of the variants I've copied so they can be cleared when I'm done and keep tracking of the one area of unmanaged memory I allocated which will need freeing.

One more thing! And this took me a while to track down - I wasn't getting errors but I wasn't getting the results I was expecting. According to the MSDN IDispatch::Invoke method (Automation) page, arguments are stored in the DISPPARAMS structure in reverse order. Reverse order!! Why??! Ah, who cares, I'm over it.

So, without further ado, here's an Invoke method that wraps up all of the above code so that any variety of call - method, indexed-or-not property get, indexed-or-not property set - can be made with all of the complications hidden away. If you don't want it to try to cast the return value then specify "object" as the type param. Anything that has a void return type will return null. This throws the named-argument requirement for property-setting into the mix but should be easy enough to follow if you're fine with everything up til now. (Where an indexed property is set, the last value in the args array should be the value to set it to and the preceeding args elements be the property indices).

public static T Invoke<T>(object source, InvokeFlags invokeFlags, int dispId, params object[] args)
{
  if (source == null)
    throw new ArgumentNullException("source");
  if (!Enum.IsDefined(typeof(InvokeFlags), invokeFlags))
    throw new ArgumentOutOfRangeException("invokeFlags");
  if (args == null)
    throw new ArgumentNullException("args");

  var memoryAllocationsToFree = new List<IntPtr>();
  IntPtr rgdispidNamedArgs;
  int cNamedArgs;
  if (invokeFlags == InvokeFlags.DISPATCH_PROPERTYPUT)
  {
    // There must be at least one argument specified; only one if it is a non-indexed property and
    // multiple if there are index values as well as the value to set to
    if (args.Length < 1)
      throw new ArgumentException("At least one argument must be specified for DISPATCH_PROPERTYPUT");

    var pdPutID = Marshal.AllocCoTaskMem(sizeof(Int64));
    Marshal.WriteInt64(pdPutID, DISPID_PROPERTYPUT);
    memoryAllocationsToFree.Add(pdPutID);

    rgdispidNamedArgs = pdPutID;
    cNamedArgs = 1;
  }
  else
  {
    rgdispidNamedArgs = IntPtr.Zero;
    cNamedArgs = 0;
  }

  var variantsToClear = new List<IntPtr>();
  IntPtr rgvarg;
  if (args.Length == 0)
    rgvarg = IntPtr.Zero;
  else
  {
    // We need to allocate enough memory to store a variant for each argument (and then populate this
    // memory)
    rgvarg = Marshal.AllocCoTaskMem(SizeOfNativeVariant * args.Length);
    memoryAllocationsToFree.Add(rgvarg);
    for (var index = 0; index < args.Length; index++)
    {
      // Note: The "IDispatch::Invoke method (Automation)" page
      // (http://msdn.microsoft.com/en-us/library/windows/desktop/ms221479(v=vs.85).aspx) states that
      // "Arguments are stored in pDispParams->rgvarg in reverse order" so we'll reverse them here
      var arg = args[(args.Length - 1) - index];

      // According to http://stackoverflow.com/a/1866268 it seems like using ToInt64 here will be valid
      // for both 32 and 64 bit machines. While this may apparently not be the most performant approach,
      // it should do the job.
      // Don't think we have to worry about pinning any references when we do this manipulation here
      // since we are allocating the array in unmanaged memory and so the garbage collector won't be
      // moving anything around (and GetNativeVariantForObject copies the reference and automatic
      // pinning will prevent the GC from interfering while this is happening).
      var pVariant = new IntPtr(
        rgvarg.ToInt64() + (SizeOfNativeVariant * index)
      );
      Marshal.GetNativeVariantForObject(arg, pVariant);
      variantsToClear.Add(pVariant);
    }
  }

  var dispParams = new ComTypes.DISPPARAMS()
  {
    cArgs = args.Length,
    rgvarg = rgvarg,
    cNamedArgs = cNamedArgs,
    rgdispidNamedArgs = rgdispidNamedArgs
  };

  try
  {
    var IID_NULL = new Guid("00000000-0000-0000-0000-000000000000");
    UInt32 pArgErr = 0;
    object varResult;
    var excepInfo = new ComTypes.EXCEPINFO();
    var hrRet = ((IDispatch)source).Invoke
    (
      dispId,
      ref IID_NULL,
      LOCALE_SYSTEM_DEFAULT,
      (ushort)invokeFlags,
      ref dispParams,
      out varResult,
      ref excepInfo,
      out pArgErr
    );
    if (hrRet != 0)
    {
      var message = "Failing attempting to invoke method with DispId " + dispId + ": ";
      if ((excepInfo.bstrDescription ?? "").Trim() == "")
        message += "Unspecified error";
      else
        message += excepInfo.bstrDescription;
      var errorType = GetErrorMessageForHResult(hrRet);
      if (errorType != CommonErrors.Unknown)
        message += " [" + errorType.ToString() + "]";
      throw new ArgumentException(message);
    }
    return (T)varResult;
  }
  finally
  {
    foreach (var variantToClear in variantsToClear)
      VariantClear(variantToClear);

    foreach (var memoryAllocationToFree in memoryAllocationsToFree)
      Marshal.FreeCoTaskMem(memoryAllocationToFree);
  }
}

public static int GetDispId(object source, string name)
{
  if (source == null)
    throw new ArgumentNullException("source");
  if (string.IsNullOrEmpty(name))
    throw new ArgumentNullException("Null/blank name specified");

  // This will be populated with a the DispId of the named member (if available)
  var rgDispId = new int[1] { 0 };
  var IID_NULL = new Guid("00000000-0000-0000-0000-000000000000");
  var hrRet = ((IDispatch)source).GetIDsOfNames
  (
    ref IID_NULL,
    new string[1] { name },
    1, // number of names to get ids for
    LOCALE_SYSTEM_DEFAULT,
    rgDispId
  );
  if (hrRet != 0)
  {
    var message = "Invalid member \"" + name + "\"";
    var errorType = GetErrorMessageForHResult(hrRet);
    if (errorType != CommonErrors.Unknown)
      message += " [" + errorType.ToString() + "]";
    throw new ArgumentException(message);
  }
  return rgDispId[0];
}

public enum InvokeFlags : ushort
{
  DISPATCH_METHOD = 1,
  DISPATCH_PROPERTYGET = 2,
  DISPATCH_PROPERTYPUT = 4
}

private static CommonErrors GetErrorMessageForHResult(int hrRet)
{
  if (Enum.IsDefined(typeof(CommonErrors), hrRet))
    return (CommonErrors)hrRet;

  return CommonErrors.Unknown;
}

public enum CommonErrors
{
  Unknown = 0,

  // A load of values from http://blogs.msdn.com/b/eldar/archive/2007/04/03/a-lot-of-hresult-codes.aspx
}

Included is a GetDispId method and an "InvokeFlags" enum to wrap up those values. If an error is encountered, it will try to look up the hresult value in an enum that I've trimmed out here but you can find the values at http://blogs.msdn.com/b/eldar/archive/2007/04/03/a-lot-of-hresult-codes.aspx.

A waste of my time?

It's looking like the environment restriction against using .Net 4 is going to go away (I think it was just me being a bit dense with configuration to be honest but I'm not quite convinced yet!) so I should be able to replace all of this code I was thinking of using with the "dynamic" keyword again.

But it's certainly been interesting getting to the bottom of this, and it's given me a greater appreciation of the "dynamic" implementation! Until now I was under the impression that it did much of what it does with fairly straight forward reflection and some sort of caching for performance. But after looking into this I've looked into it more and realised that it does a lot more, varying its integration method depending upon what it's talking to (like if it's a .Net object, a IDispatch-implementing reference, an Iron Python object and whatever else). I have a much greater respect for it now! :)

The COMInteraction Project

One thing it has got me thinking about, though, is the COMInteraction code I wrote. The current code uses reflection and IL generation to sort of force method and property calls onto COM objects, which worked great for the components I was targetting at the time (VBScript WSC components) but which failed when I tried to use it with a Classic ASP Server reference that got passed through the chain. It didn't like the possibly hacky approach I used at all. But it is happy with being called by the Invoke method above since it implements IDispatch. So I'm contemplating now whether I can extend the work to generate different IL depending upon the source type; leaving it using reflection where possible and alternately using IDispatch where reflection won't work but IDispatch may. Sort of like "dynamic" much on a more conservative scale :)

A little bit more about dynamic's magic

Now that I understand more about how IDispatch enables the implementing type to be queried it answers a question I've wondered about before: how can the debugger show properties and data for a dynamic reference that's pointing at a COM object? The GetTypeInfo and GetIDsOfNames of the IDispatch interface can expose this information.

There's some example code on this blog post (by the same guy who wrote some of the other posts I linked earlier): Obtain Type Information of IDispatch-Based COM Objects from Managed Code.. I've played with it a bit and it looks interesting, but I've not gone any further than his method querying code (he retrieves a list of methods but doesn't examine the arguments that the methods take, for example).

Posted at 20:54

Tags:

Comments

27 December 2011

Dynamically applying interfaces to objects - Part 3

In this final part of this mini series there are two enhancements I want to add:

For cases where the same interface may be applied to multiple objects, I want to replace the WrapObject method (which applies the interface to a single reference) with a GenerateInterfaceApplier method which returns an object that applies a specified interface to any given reference but only generating the IL once
The ability to recursively wrap returned data

eg. be able to apply ITest to the ExampleObject reference:

public interface ITest
{
    IEmployee Get(int id);
}

public interface IEmployee
{
    int Id { get; }
    string Name { get; }
}

public class ExampleObject
{
    public object Get(int id)
    {
        if (id != 1)
            throw new ArgumentException("Only id value 1 is supported");
        return new Person(1, "Ted");
    }

    public class Person
    {
        public Person(int id, string name)
        {
            if (string.IsNullOrEmpty(name))
                throw new ArgumentException("Null/blank name specified");
            Id = id;
            Name = name.Trim();
        }
        public int Id { get; private set; }
        public string Name { get; private set; }
    }
}

Task 1: The IInterfaceApplier class

The first part is fairly straight-forward so I'll jump straight in -

public class InterfaceApplierFactory : IInterfaceApplierFactory
{
    private string _assemblyName;
    private bool _createComVisibleClasses;
    private Lazy<ModuleBuilder> _moduleBuilder;
    public InterfaceApplierFactory(string assemblyName, ComVisibility comVisibilityOfClasses)
    {
        assemblyName = (assemblyName ?? "").Trim();
        if (assemblyName == "")
            throw new ArgumentException("Null or empty assemblyName specified");
        if (!Enum.IsDefined(typeof(ComVisibility), comVisibilityOfClasses))
            throw new ArgumentOutOfRangeException("comVisibilityOfClasses");

        _assemblyName = assemblyName;
        _createComVisibleClasses = (comVisibilityOfClasses == ComVisibility.Visible);
        _moduleBuilder = new Lazy<ModuleBuilder>(
            () =>
            {
                var assemblyBuilder = Thread.GetDomain().DefineDynamicAssembly(
                    new AssemblyName(_assemblyName),
                    AssemblyBuilderAccess.Run
                );
                return assemblyBuilder.DefineDynamicModule(
                    assemblyBuilder.GetName().Name,
                    false
                );
            },
            true // isThreadSafe
        );
    }

    public enum ComVisibility
    {
        Visible,
        NotVisible
    }

    public IInterfaceApplier<T> GenerateInterfaceApplier<T>()
    {
        if (!typeof(T).IsInterface)
            throw new ArgumentException("typeparam must be an interface type");

        var typeName = "InterfaceApplier" + Guid.NewGuid().ToString("N");
        var typeBuilder = _moduleBuilder.Value.DefineType(
            typeName,
            TypeAttributes.Public | TypeAttributes.Class | TypeAttributes.AutoClass |
            TypeAttributes.AnsiClass | TypeAttributes.BeforeFieldInit |
            TypeAttributes.AutoLayout,
            typeof(object),
            new Type[] { typeof(T) }
        );

        // The content from the previous posts goes here (generating the constructor,
        // properties and methods)..

        return new InterfaceApplier<T>(
            src => (T)Activator.CreateInstance(
                typeBuilder.CreateType(),
                src
            )
        );
    }

    public IInterfaceApplier GenerateInterfaceApplier(Type targetType)
    {
        var generate = this.GetType().GetMethod("GenerateInterfaceApplier", Type.EmptyTypes);
        var generateGeneric = generate.MakeGenericMethod(targetType);
        return (IInterfaceApplier)generateGeneric.Invoke(this, new object[0]);
    }

    private class InterfaceApplier<T> : IInterfaceApplier<T>
    {
        private Func<object, T> _conversion;
        public InterfaceApplier(Func<object, T> conversion)
        {
            if (!typeof(T).IsInterface)
                throw new ArgumentException("Invalid typeparam - must be an interface");
            if (conversion == null)
                throw new ArgumentNullException("conversion");
            _conversion = conversion;
        }

        public Type TargetType
        {
            get { return typeof(T); }
        }

        public T Apply(object src)
        {
            return _conversion(src);
        }

        object IInterfaceApplier.Apply(object src)
        {
            return Apply(src);
        }
    }
}

public interface IInterfaceApplierFactory
{
    IInterfaceApplier<T> GenerateInterfaceApplier<T>();
    IInterfaceApplier GenerateInterfaceApplier(Type targetType);
}

public interface IInterfaceApplier<T> : IInterfaceApplier
{
    new T Apply(object src);
}

public interface IInterfaceApplier
{
    Type TargetType { get; }
    object Apply(object src);
}

Using this class means that we only need one instance of the ModuleBuilder no matter how many interfaces we're wrapping around objects and an "IInterfaceApplier" is returned instead of a reference to the interface-wrapped object. Note that I've used the .Net 4.0 Lazy class to instantiate the ModuleBuilder only the first time that it's required, but if you're using an earlier version of .Net then this could be replaced with the implementation (see my previous post about this here) or even by instantiating it directly from within the constructor.

I've also supported an alternative method signature for GenerateInterfaceApplier such that the target interface can be specified as an argument rather than a typeparam to a generic method - this will become important in the next section and the only interesting things to note are how IInterfaceApplier<T> is returned from the generic method as opposed to the IInterfaceApplier returned from the typeparam-less signature and how the alternate method calls into the typeparam'd version using reflection.

Task 2: Recursively applying interfaces

The approach I'm going to use here is to introduce a new interface that will be used when generating the interface appliers -

public interface IReadValueConverter
{
    object Convert(PropertyInfo property, object value);
    object Convert(MethodInfo method, object value);
}

Values will be passed through this when returned by property getters or (non-void) methods and it wil be responsible for ensuring that the value returned from the Convert method matches the property.PropertyType / method.ReturnType.

This will mean we'll change the method signature to:

public InterfaceApplier<T> GenerateInterfaceApplier<T>(IReadValueConverter readValueConverter)

That we'll change the constructor on the generated type:

// Declare private fields
var srcField = typeBuilder.DefineField("_src", typeof(object), FieldAttributes.Private);
var readValueConverterField = typeBuilder.DefineField(
    "_readValueConverter",
    typeof(IReadValueConverter),
    FieldAttributes.Private
);

// Generate: base.ctor()
var ctorBuilder = typeBuilder.DefineConstructor(
    MethodAttributes.Public,
    CallingConventions.Standard,
    new[] { typeof(object) }
);
var ilCtor = ctorBuilder.GetILGenerator();
ilCtor.Emit(OpCodes.Ldarg_0);
ilCtor.Emit(OpCodes.Call, typeBuilder.BaseType.GetConstructor(Type.EmptyTypes));

// Generate: if (src != null), don't throw exception
var nonNullSrcArgumentLabel = ilCtor.DefineLabel();
ilCtor.Emit(OpCodes.Ldarg_1);
ilCtor.Emit(OpCodes.Brtrue, nonNullSrcArgumentLabel);
ilCtor.Emit(OpCodes.Ldstr, "src");
ilCtor.Emit(
    OpCodes.Newobj,
    typeof(ArgumentNullException).GetConstructor(new[] { typeof(string) })
);
ilCtor.Emit(OpCodes.Throw);
ilCtor.MarkLabel(nonNullSrcArgumentLabel);

// Generate: if (readValueConverter != null), don't throw exception
var nonNullReadValueConverterArgumentLabel = ilCtor.DefineLabel();
ilCtor.Emit(OpCodes.Ldarg_2);
ilCtor.Emit(OpCodes.Brtrue, nonNullReadValueConverterArgumentLabel);
ilCtor.Emit(OpCodes.Ldstr, "readValueConverter");
ilCtor.Emit(
    OpCodes.Newobj,
    typeof(ArgumentNullException).GetConstructor(new[] { typeof(string) })
);
ilCtor.Emit(OpCodes.Throw);
ilCtor.MarkLabel(nonNullReadValueConverterArgumentLabel);

// Generate: this._src = src
ilCtor.Emit(OpCodes.Ldarg_0);
ilCtor.Emit(OpCodes.Ldarg_1);
ilCtor.Emit(OpCodes.Stfld, srcField);

// Generate: this._readValueConverter = readValueConverter
ilCtor.Emit(OpCodes.Ldarg_0);
ilCtor.Emit(OpCodes.Ldarg_2);
ilCtor.Emit(OpCodes.Stfld, readValueConverterField);

// All done
ilCtor.Emit(OpCodes.Ret);

That we'll change the reading of properties:

// Define get method, if required
if (property.CanRead)
{
    var getFuncBuilder = typeBuilder.DefineMethod(
        "get_" + property.Name,
        MethodAttributes.Public | MethodAttributes.HideBySig | MethodAttributes.NewSlot |
        MethodAttributes.SpecialName | MethodAttributes.Virtual | MethodAttributes.Final,
        property.PropertyType,
        Type.EmptyTypes
    );

    // Generate: return this._readValueConverter.Convert(
    //  property.DeclaringType.GetProperty(property.Name)
    //  _src.GetType().InvokeMember(property.Name, BindingFlags.GetProperty, null, _src, null)
    // );
    var ilGetFunc = getFuncBuilder.GetILGenerator();
    ilGetFunc.Emit(OpCodes.Ldarg_0);
    ilGetFunc.Emit(OpCodes.Ldfld, readValueConverterField);
    ilGetFunc.Emit(OpCodes.Ldtoken, property.DeclaringType);
    ilGetFunc.Emit(
        OpCodes.Call,
        typeof(Type).GetMethod("GetTypeFromHandle", new[] { typeof(RuntimeTypeHandle) })
    );
    ilGetFunc.Emit(OpCodes.Ldstr, property.Name);
    ilGetFunc.Emit(
        OpCodes.Call,
        typeof(Type).GetMethod("GetProperty", new[] { typeof(string) })
    );
    ilGetFunc.Emit(OpCodes.Ldarg_0);
    ilGetFunc.Emit(OpCodes.Ldfld, srcField);
    ilGetFunc.Emit(OpCodes.Callvirt, typeof(Type).GetMethod("GetType", Type.EmptyTypes));
    ilGetFunc.Emit(OpCodes.Ldstr, property.Name);
    ilGetFunc.Emit(OpCodes.Ldc_I4, (int)BindingFlags.GetProperty);
    ilGetFunc.Emit(OpCodes.Ldnull);
    ilGetFunc.Emit(OpCodes.Ldarg_0);
    ilGetFunc.Emit(OpCodes.Ldfld, srcField);
    ilGetFunc.Emit(OpCodes.Ldnull);
    ilGetFunc.Emit(OpCodes.Callvirt, methodInfoInvokeMember);
    ilGetFunc.Emit(
        OpCodes.Callvirt,
        typeof(IReadValueConverter).GetMethod(
            "Convert",
            new[] { typeof(PropertyInfo), typeof(object) }
        )
    );
    if (property.PropertyType.IsValueType)
        ilGetFunc.Emit(OpCodes.Unbox_Any, property.PropertyType);
    ilGetFunc.Emit(OpCodes.Ret);
    propBuilder.SetGetMethod(getFuncBuilder);
}

And that we'll change the calling of methods:

// .. skipped out the first half of the method-generating code
// - see https://www.productiverage.com/Read/15

// Generate either:
//  _src.GetType().InvokeMember(method.Name, BindingFlags.InvokeMethod, null, _src, args);
// or
//  return this._readValueConverter.Convert(
//   method.DeclaringType.GetMethod(method.Name, {MethodArgTypes})
//   this._src.GetType().InvokeMember(
//    property.Name,
//    BindingFlags.InvokeMethod,
//    null,
//    _src,
//    null
//   )
//  );
if (!method.ReturnType.Equals(typeof(void)))
{
    // We only need to use the readValueConverter if returning a value

    // Generate: Type[] argTypes
    var argTypes = ilFunc.DeclareLocal(typeof(Type[]));

    // Generate: argTypes = new Type[x]
    ilFunc.Emit(OpCodes.Ldc_I4, parameters.Length);
    ilFunc.Emit(OpCodes.Newarr, typeof(Type));
    ilFunc.Emit(OpCodes.Stloc_1);
    for (var index = 0; index < parameters.Length; index++)
    {
        // Generate: argTypes[n] = ..;
        var parameter = parameters[index];
        ilFunc.Emit(OpCodes.Ldloc_1);
        ilFunc.Emit(OpCodes.Ldc_I4, index);
        ilFunc.Emit(OpCodes.Ldtoken, parameters[index].ParameterType);
        ilFunc.Emit(OpCodes.Stelem_Ref);
    }

    // Will call readValueConverter.Convert, passing MethodInfo reference before value
    ilFunc.Emit(OpCodes.Ldarg_0);
    ilFunc.Emit(OpCodes.Ldfld, readValueConverterField);
    ilFunc.Emit(OpCodes.Ldtoken, method.DeclaringType);
    ilFunc.Emit(
        OpCodes.Call,
        typeof(Type).GetMethod("GetTypeFromHandle", new[] { typeof(RuntimeTypeHandle) })
    );
    ilFunc.Emit(OpCodes.Ldstr, method.Name);
    ilFunc.Emit(OpCodes.Ldloc_1);
    ilFunc.Emit(
        OpCodes.Call,
        typeof(Type).GetMethod("GetMethod", new[] { typeof(string), typeof(Type[]) })
    );
}
ilFunc.Emit(OpCodes.Ldarg_0);
ilFunc.Emit(OpCodes.Ldfld, srcField);
ilFunc.Emit(OpCodes.Callvirt, typeof(Type).GetMethod("GetType", Type.EmptyTypes));
ilFunc.Emit(OpCodes.Ldstr, method.Name);
ilFunc.Emit(OpCodes.Ldc_I4, (int)BindingFlags.InvokeMethod);
ilFunc.Emit(OpCodes.Ldnull);
ilFunc.Emit(OpCodes.Ldarg_0);
ilFunc.Emit(OpCodes.Ldfld, srcField);
ilFunc.Emit(OpCodes.Ldloc_0);
ilFunc.Emit(OpCodes.Callvirt, methodInfoInvokeMember);

if (method.ReturnType.Equals(typeof(void)))
    ilFunc.Emit(OpCodes.Pop);
else
{
    ilFunc.Emit(
        OpCodes.Callvirt,
        typeof(IReadValueConverter).GetMethod(
            "Convert",
            new[] { typeof(MethodInfo), typeof(object) }
        )
    );
    if (method.ReturnType.IsValueType)
        ilFunc.Emit(OpCodes.Unbox_Any, method.ReturnType);
}

ilFunc.Emit(OpCodes.Ret);

Task 2.1: Implementing IReadValueConverter

A naive implementation might be as follows:

public class SimpleReadValueConverter : IReadValueConverter
{
    private IInterfaceApplierFactory _interfaceApplierFactory;
    public SimpleReadValueConverter(IInterfaceApplierFactory interfaceApplierFactory)
    {
        if (interfaceApplierFactory == null)
            throw new ArgumentNullException("interfaceApplierFactory");
        _interfaceApplierFactory = interfaceApplierFactory;
    }

    public object Convert(PropertyInfo property, object value)
    {
        if (property == null)
            throw new ArgumentNullException("property");
        return tryToConvertValueIfRequired(property.PropertyType, value);
    }

    public object Convert(MethodInfo method, object value)
    {
        if (method == null)
            throw new ArgumentNullException("method");
        return tryToConvertValueIfRequired(method.ReturnType, value);
    }

    private object tryToConvertValueIfRequired(Type targetType, object value)
    {
        if (targetType == null)
            throw new ArgumentNullException("targetType");

        // If no conversion is required, no work to do
        // - Note: We can only deal with applying interfaces to objects so if a conversion
        //   is required where the target is not an interface then there's nothing we can do
        //   here, we'll have to return the value unconverted (likewise, if the target type
        //   is an int but the current value is null, although this is obviously incorrect
        //   there's nothing we can do about it here)
        if (!targetType.IsInterface || (value == null)
        || (value.GetType().IsSubclassOf(targetType)))
            return value;

        return _interfaceApplierFactory.GenerateInterfaceApplier(targetType, this)
            .Apply(value);
    }
}

This will do the job but it jumps out at me that if the same interface needs to be applied to multiple return values (ie. from different properties or methods) then the work done to generate that interface applier will be repeated for each request. It might be better (require less memory and cpu resources) to build up a list of interfaces that have already been handled and re-use the interface appliers where possible -

public class CachedReadValueConverter : IReadValueConverter
{
    private IInterfaceApplierFactory _interfaceApplierFactory;
    private NonNullImmutableList<IInterfaceApplier> _interfaceAppliers;
    private object _writeLock;

    public CachedReadValueConverter(IInterfaceApplierFactory interfaceApplierFactory)
    {
        if (interfaceApplierFactory == null)
            throw new ArgumentNullException("interfaceApplierFactory");

        _interfaceApplierFactory = interfaceApplierFactory;
        _interfaceAppliers = new NonNullImmutableList<IInterfaceApplier>();
        _writeLock = new object();
    }

    public object Convert(PropertyInfo property, object value)
    {
        if (property == null)
            throw new ArgumentNullException("property");

        return tryToConvertValueIfRequired(property.PropertyType, value);
    }

    public object Convert(MethodInfo method, object value)
    {
        if (method == null)
            throw new ArgumentNullException("method");

        return tryToConvertValueIfRequired(method.ReturnType, value);
    }

    private object tryToConvertValueIfRequired(Type targetType, object value)
    {
        if (targetType == null)
            throw new ArgumentNullException("targetType");

        // If no conversion is required, no work to do
        // - Note: We can only deal with applying interfaces to objects so if a conversion
        //   is required where the target is not an interface then there's nothing we can
        //   do here so we'll have to return the value unconverted (likewise, if the target
        //   type is an int but the current value is null, although this is obviously
        //   incorrect but there's nothing we can do about it here)
        if (!targetType.IsInterface || (value == null)
        || (value.GetType().IsSubclassOf(targetType)))
            return value;

        // Do we already have an interface applier available for this type?
        var interfaceApplierExisting = _interfaceAppliers.FirstOrDefault(
            i => i.TargetType.Equals(targetType)
        );
        if (interfaceApplierExisting != null)
            return interfaceApplierExisting.Apply(value);

        // Try to generate new interface applier
        var interfaceApplierNew = _interfaceApplierFactory.GenerateInterfaceApplier(
            targetType,
            this
        );
        lock (_writeLock)
        {
            if (!_interfaceAppliers.Any(i => i.TargetType.Equals(targetType)))
                _interfaceAppliers = _interfaceAppliers.Add(interfaceApplierNew);
        }
        return interfaceApplierNew.Apply(value);
    }
}

There will still be cases where there have to be multiple interface appliers for a given interface if there are interrelated references but this should limit how many duplicates are generated. For example:

using COMInteraction.InterfaceApplication;
using COMInteraction.InterfaceApplication.ReadValueConverters;

namespace Tester
{
    class Program
    {
        static void Main(string[] args)
        {
            var n1 = new Node() { Name = "Node1" };
            var n2 = new Node() { Name = "Node2" };
            var n3 = new Node() { Name = "Node3" };
            n1.Next = n2;
            n2.Previous = n1;
            n2.Next = n3;
            n3.Previous = n2;

            var interfaceApplierFactory = new InterfaceApplierFactory(
                "DynamicAssembly",
                InterfaceApplierFactory.ComVisibility.NotVisible
            );
            var interfaceApplier = interfaceApplierFactory.GenerateInterfaceApplier<INode>(
                new CachedReadValueConverter(interfaceApplierFactory)
            );
            var n2Wrapped = interfaceApplier.Apply(n2);

        }

        public class Node
        {
            public string Name { get; set; }
            public Node Previous { get; set; }
            public Node Next { get; set; }
        }
    }

    public interface INode
    {
        string Name { get; set; }
        INode Previous { get; set; }
        INode Next { get; set; }
    }
}

Each unique interface applier has a specific class name (of the form "InterfaceApplier{0}" where {0} is a guid). Examining the properties on n2Wrapped you can see that the class for n2Wrapped is different to the class for its Next and Previous properties as the INode interface applier hadn't been completely generated against n2 before an INode interface applier was required for these properties. But after this, all further INode-wrapped instances will share the same interface applier as the Previous and Next properties received - so there will be one "wasted" generated class but that's still a better job than the SimpleReadValueConverter would have managed.

"Inaccessible Interface"

In the above example, the INode interface has to be located outside of the Program class since it must be accessible by the InterfaceApplierFactory and the Program class is private. This isn't an issue for the Node class as the InterfaceApplierFactory doesn't need direct access to that, it just returns an IInterfaceApplier that we pass the n2 reference to as an argument. Alternatively, the Program class could be made public. This isn't exactly rocket science but if it slips your mind then you're presented with an error such as -

TypeLoadException: "Type 'InterfaceApplierdb2cb792e09d424a8dcecbeca6276dc8' from assembly 'DynamicAssembly, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' is attempting to implement an inaccessible interface."

at the line

return new InterfaceApplier<T>(
    src => (T)Activator.CreateInstance(
        typeBuilder.CreateType(),
        src,
        readValueConverter
    )
);

in InterfaceApplierFactory, which isn't the friendliest of warnings!

Posted at 18:38

Tags:

Comments

14 December 2011

Dynamically applying interfaces to objects - Part 2

Today, I'm going to address some of the "future developments" I left at the end of the last post. Specifically:

Wrapping up the "Interface Applier" into a generic method that specifies the target interface
Handling interface hierarchies
Marking the wrapper as being ComVisible

Wrapping up in a method

Part one is easy. Take the code from the last article and wrap in

public static T WrapObject<T>(object src)
{
    if (src == null)
        throw new ArgumentNullException("src");
    if (!typeof(T).IsInterface)
        throw new ArgumentException("Typeparam T must be an interface type");

    // Insert existing code that generates the new class with its constructor, properties
    // and methods here..

    return (T)Activator.CreateInstance(
        typeBuilder.CreateType(),
        src
    );
}

Ta-da! Note that we ensure that the typeparam T really is an interface - we made assumptions about this in the last article, so we need to assert this fact here. (It means that we only ever have to deal with properties and methods, and that they will always be public).

Handling interface hierarchies

This part is not much more difficult. We'll introduce something to recursively trawl through any interfaces that the target interface implements and build a list of them all:

public class InterfaceHierarchyCombiner
{
    private Type _targetInterface;
    private List<Type> _interfaces;
    public InterfaceHierarchyCombiner(Type targetInterface)
    {
        if (targetInterface == null)
            throw new ArgumentNullException("targetInterface");
        if (!targetInterface.IsInterface)
            throw new ArgumentException("targetInterface must be an interface type", "targetInterface");

        _interfaces = new List<Type>();
        buildInterfaceInheritanceList(targetInterface, _interfaces);
        _targetInterface = targetInterface;
    }

    private static void buildInterfaceInheritanceList(Type targetInterface, List<Type> types)
    {
        if (targetInterface == null)
            throw new ArgumentNullException("targetInterface");
        if (!targetInterface.IsInterface)
            throw new ArgumentException("targetInterface must be an interface type", "targetInterface");
        if (types == null)
            throw new ArgumentNullException("types");

        if (!types.Contains(targetInterface))
            types.Add(targetInterface);

        foreach (var inheritedInterface in targetInterface.GetInterfaces())
        {
            if (!types.Contains(inheritedInterface))
            {
                types.Add(inheritedInterface);
                buildInterfaceInheritanceList(inheritedInterface, types);
            }
        }
    }

    public Type TargetInterface
    {
        get { return _targetInterface; }
    }

    public IEnumerable<Type> Interfaces
    {
        get { return _interfaces.AsReadOnly(); }
    }
}

Then, in this new WrapObject method, we call instantiate a new InterfaceHierarchyCombiner for the typeparam T and use retrieve all the properties and methods from the Interfaces list, rather than just those on T.

eg. Instead of

foreach (var property in typeof(ITest).GetProperties())
{
    // Deal with the properties..

we consider

var interfaces = (new InterfaceHierarchyCombiner(typeof(T))).Interfaces;
foreach (var property in interfaces.SelectMany(i => i.GetProperties()))
{
    // Deal with the properties..

and likewise for the methods.

It's worth noting that there may be multiple methods within the interface hierarchy with the same name and signature. It may be worth keeping track of which properties / methods have had corresponding IL generated but - other than generating more instructions in the loop than strictly necessary - it doesn't do any harm generating duplicate properties or methods (so I haven't worried about it for now).

Com Visibility

What I wanted to do for the wrappers I was implementing was to create classes with the [ComVisible(true)] and [ClassInterface(ClassInterface.None)] attributes. This is achieved by specifying these attributes on the typeBuilder (as seen in the code in the last article):

typeBuilder.SetCustomAttribute(
    new CustomAttributeBuilder(
        typeof(ComVisibleAttribute).GetConstructor(new[] { typeof(bool) }),
        new object[] { true }
    )
);
typeBuilder.SetCustomAttribute(
    new CustomAttributeBuilder(
        typeof(ClassInterfaceAttribute).GetConstructor(new[] { typeof(ClassInterfaceType) }),
        new object[] { ClassInterfaceType.None }
    )
);

Again, easy! Once you know how :)

Example code / jumping to the end

I've not included a complete sample here since it would take up a fairly hefty chunk of space but also because I created a GitHub repository with the final code. This can be found at https://github.com/ProductiveRage/COMInteraction

This includes the work here but also the work I want to address next time; a way to automagically wrap return values where required and a way to change the WrapObject method so that if the same interface is to be applied to multiple objects, only a single call is required (and an object be returned that can wrap any given reference in that interface). The example I put out for this return-value-wrapping was that we want to wrap an object in ITest and have the return value from its Get method also be wrapped if it did not already implement IEmployee:

public interface ITest
{
    IEmployee Get(int id);
}

public interface IEmployee
{
    int Id { get; }
    string Name { get; }
}

For more details of how exactly this will all work, I'll see you back here; same bat-time, same bat-channel! :)

Posted at 21:16

Tags:

Comments

13 December 2011

Dynamically applying interfaces to objects

Another area of this migration proof-of-concept work I'm doing at the day job involves investigating the best way to swap out a load of COM components for C# versions over time. The plan initially is to define interfaces for them and code against those interfaces, write wrapper classes around the components that implement these interfaces and one day rewrite them one-by-one.

Writing the interfaces is valuable since it enables some documentation-through-comments to be generated for each method and property and forces me to look into the idiosyncracies of the various components.

However, writing endless wrappers for the components to "join" them to the interfaces sounded boring! Even if I used the .Net 4.0 "dynamic" keyword it seemed like there'd be a lot of repetition and opportunity for me to mistype a property name and not realise until debugging / writing tests. (Plus I found a problem that prevented me from using "dynamic" with the WSCs I was wrapping - see the bottom of this post for more details).

I figured this is the sort of thing that should be dynamically generatable from the interfaces instead of writing them all by hand - something like how Moq can create generate Mock<ITest> implementations. I did most of this investigation back in Summer, not long after the AutoMapper work I was looking into, and had hoped I'd be able to leverage my sharpened Linq Expression dynamic code generation skills. Alas, it seems that new classes can not be defined in this manner so I had to go deeper..

Reflection.Emit

I was aware that IL could be generated by code at runtime and executed as any other other loaded assembly might be. I'd read (and chuckled at) this article in the past but never taken it any further: Dynamic... But Fast: The Tale of Three Monkeys, A Wolf and the DynamicMethod and ILGenerator Classes

As I tried to find out more information, though, it seemed that a lot of articles would make the point that you could find out how to construct IL by using the IL Disassembler that's part of the .Net SDK: ildasm.exe (located in C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin on my computer). This makes sense because once you start constructing simple classes and examining the generated code in ildasm you can start to get a reasonable idea for how to write the generation code yourself. But it still took me quite a while to get to the point where the following worked!

What I really wanted was something to take, for example:

public interface ITest
{
    int GetValue(string id);
    string Name { get; set; }
}

and wrap an object that had that method and property such that the interface was exposed - eg.

public class TestWrapper : ITest
{
  private object _src;
  public TestWrapper(object src)
  {
    if (src == null)
      throw new ArgumentNullException("src");
    _src = src;
  }

  public int GetValue(string id)
  {
    return _src.GetType().InvokeMember(
      "GetValue",
      BindingFlags.InvokeMethod,
      null,
      _src,
      new object[] { id }
    );
  }

  public string Name
  {
    get
    {
      return _src.GetType().InvokeMember("Name", BindingFlags.GetProperty, null, _src, null)
    }
    set
    {
      _src.GetType().InvokeMember("Name", BindingFlags.SetProperty, null, _src, new object[] { value });
    }
  }
}

It may seem like using reflection will result in there being overhead in the calls but the primary objective was to wrap a load of WSCs in C# interfaces so they could be rewritten later while doing the job for now - so performance wasn't really a massive concern at this point.

Somewhere to work in

The first thing to be aware of is that we can't create new classes in the current assembly, we'll have to create them in a new one. So we start off with

var assemblyBuilder = Thread.GetDomain().DefineDynamicAssembly(
  new AssemblyName("DynamicAssembly"), // This is not a magic string, it can be called anything
  AssemblyBuilderAccess.Run
);
var moduleBuilder = assemblyBuilder.DefineDynamicModule(
  assemblyBuilder.GetName().Name,
  false
);

// This NewGuid call is just to get a unique name for the new construct
var typeName = "InterfaceApplier" + Guid.NewGuid().ToString();
var typeBuilder = moduleBuilder.DefineType(
  typeName,
  TypeAttributes.Public
    | TypeAttributes.Class
    | TypeAttributes.AutoClass
    | TypeAttributes.AnsiClass
    | TypeAttributes.BeforeFieldInit
    | TypeAttributes.AutoLayout,
  typeof(object),
  new Type[] { typeof(ITest) }
);

The TypeAttribute values I copied from the ildasm output I examined.

Note that we're specifying ITest as the interface we're implementing by passing it as the "interfaces" parameter to the moduleBuilder's DefineType method.

The Constructor

The constructor is fairly straight forward. The thing that took me longest to wrap my head around was how to form the "if (src == null) throw new ArgumentNullException()" construct. If seems that this is most easily done by declaring a label to jump to if src is not null which allows execution to leap over the point at which an ArgumentNullException will be raised.

// Declare private _src field
var srcField = typeBuilder.DefineField("_src", typeof(object), FieldAttributes.Private);
var ctorBuilder = typeBuilder.DefineConstructor(
  MethodAttributes.Public,
  CallingConventions.Standard,
  new[] { typeof(object) }
);

// Generate: base.ctor()
var ilCtor = ctorBuilder.GetILGenerator();
ilCtor.Emit(OpCodes.Ldarg_0);
ilCtor.Emit(OpCodes.Call, typeBuilder.BaseType.GetConstructor(Type.EmptyTypes));

// Generate: if (src != null), don't throw new ArgumentException("src")
var nonNullSrcArgumentLabel = ilCtor.DefineLabel();
ilCtor.Emit(OpCodes.Ldarg_1);
ilCtor.Emit(OpCodes.Brtrue, nonNullSrcArgumentLabel);
ilCtor.Emit(OpCodes.Ldstr, "src");
ilCtor.Emit(OpCodes.Newobj, typeof(ArgumentNullException).GetConstructor(new[] { typeof(string) }));
ilCtor.Emit(OpCodes.Throw);
ilCtor.MarkLabel(nonNullSrcArgumentLabel);

// Generate: _src = src
ilCtor.Emit(OpCodes.Ldarg_0);
ilCtor.Emit(OpCodes.Ldarg_1);
ilCtor.Emit(OpCodes.Stfld, srcField);

// All done!
ilCtor.Emit(OpCodes.Ret);

Properties

Although there's only a single property in the ITest example we're looking at, we might as look ahead and loop over all properties the interface has so we can apply the same sort of code to other interfaces. Since we are dealing with interfaces, we only need to consider whether a property is gettable, settable or both - there's no public / internal / protected / private / etc.. to worry about. Likewise, we only have to worry about properties and methods - interfaces can't declare fields.

foreach (var property in typeof(ITest).GetProperties())
{
  var methodInfoInvokeMember = typeof(Type).GetMethod(
    "InvokeMember",
    new[]
    {
      typeof(string),
      typeof(BindingFlags),
      typeof(Binder),
      typeof(object),
      typeof(object[])
    }
  );

  // Prepare the property we'll add get and/or set accessors to
  var propBuilder = typeBuilder.DefineProperty(
    property.Name,
    PropertyAttributes.None,
    property.PropertyType,
    Type.EmptyTypes
  );

  // Define get method, if required
  if (property.CanRead)
  {
    var getFuncBuilder = typeBuilder.DefineMethod(
      "get_" + property.Name,
      MethodAttributes.Public
        | MethodAttributes.HideBySig
        | MethodAttributes.NewSlot
        | MethodAttributes.SpecialName
        | MethodAttributes.Virtual
        | MethodAttributes.Final,
      property.PropertyType,
      Type.EmptyTypes
    );

    // Generate:
    //   return _src.GetType().InvokeMember(property.Name, BindingFlags.GetProperty, null, _src, null)
    var ilGetFunc = getFuncBuilder.GetILGenerator();
    ilGetFunc.Emit(OpCodes.Ldarg_0);
    ilGetFunc.Emit(OpCodes.Ldfld, srcField);
    ilGetFunc.Emit(OpCodes.Callvirt, typeof(Type).GetMethod("GetType", Type.EmptyTypes));
    ilGetFunc.Emit(OpCodes.Ldstr, property.Name);
    ilGetFunc.Emit(OpCodes.Ldc_I4, (int)BindingFlags.GetProperty);
    ilGetFunc.Emit(OpCodes.Ldnull);
    ilGetFunc.Emit(OpCodes.Ldarg_0);
    ilGetFunc.Emit(OpCodes.Ldfld, srcField);
    ilGetFunc.Emit(OpCodes.Ldnull);
    ilGetFunc.Emit(OpCodes.Callvirt, methodInfoInvokeMember);
    if (property.PropertyType.IsValueType)
      ilGetFunc.Emit(OpCodes.Unbox_Any, property.PropertyType);
    ilGetFunc.Emit(OpCodes.Ret);
    propBuilder.SetGetMethod(getFuncBuilder);
  }

  // Define set method, if required
  if (property.CanWrite)
  {
    var setFuncBuilder = typeBuilder.DefineMethod(
      "set_" + property.Name,
      MethodAttributes.Public
        | MethodAttributes.HideBySig
        | MethodAttributes.SpecialName
        | MethodAttributes.Virtual,
      null,
      new Type[] { property.PropertyType }
    );
    var valueParameter = setFuncBuilder.DefineParameter(1, ParameterAttributes.None, "value");
    var ilSetFunc = setFuncBuilder.GetILGenerator();

    // Generate:
    //   _src.GetType().InvokeMember(
    //     property.Name, BindingFlags.SetProperty, null, _src, new object[1] { value }
    //   );
    // Note: Need to declare assignment of local array to pass to InvokeMember (argValues)
    var argValues = ilSetFunc.DeclareLocal(typeof(object[]));
    ilSetFunc.Emit(OpCodes.Ldarg_0);
    ilSetFunc.Emit(OpCodes.Ldfld, srcField);
    ilSetFunc.Emit(OpCodes.Callvirt, typeof(Type).GetMethod("GetType", Type.EmptyTypes));
    ilSetFunc.Emit(OpCodes.Ldstr, property.Name);
    ilSetFunc.Emit(OpCodes.Ldc_I4, (int)BindingFlags.SetProperty);
    ilSetFunc.Emit(OpCodes.Ldnull);
    ilSetFunc.Emit(OpCodes.Ldarg_0);
    ilSetFunc.Emit(OpCodes.Ldfld, srcField);
    ilSetFunc.Emit(OpCodes.Ldc_I4_1);
    ilSetFunc.Emit(OpCodes.Newarr, typeof(Object));
    ilSetFunc.Emit(OpCodes.Stloc_0);
    ilSetFunc.Emit(OpCodes.Ldloc_0);
    ilSetFunc.Emit(OpCodes.Ldc_I4_0);
    ilSetFunc.Emit(OpCodes.Ldarg_1);
    if (property.PropertyType.IsValueType)
      ilSetFunc.Emit(OpCodes.Box, property.PropertyType);
    ilSetFunc.Emit(OpCodes.Stelem_Ref);
    ilSetFunc.Emit(OpCodes.Ldloc_0);
    ilSetFunc.Emit(OpCodes.Callvirt, methodInfoInvokeMember);
    ilSetFunc.Emit(OpCodes.Pop);

    ilSetFunc.Emit(OpCodes.Ret);
    propBuilder.SetSetMethod(setFuncBuilder);
  }
}

The gist is that for the getter and/or setter, we have to declare a method and then assign that method to be the GetMethod or SetMethod for the property. The method is named by prefixing the property name with either "get_" or "set_", as is consistent with how C# generates it class properties' IL.

The call to

_src.GetType().InvokeMember(
  property.Name,
  BindingFlags.SetProperty,
  null,
  _src,
  new object[] { value }
);

is a bit painful as we have to declare an array with a single element to pass to the method, where that single element is the "value" reference available within the setter.

Also worthy of note is that when returning a ValueType or setting a ValueType. As we're expecting to either return an object or set an object, the value has to be "boxed" otherwise bad things will happen!

Like the TypeAttributes in the Constructor, the MethodAttributes I've applied here were gleaned from looking at IL generated by Visual Studio.

Methods

We're on the home stretch now! Methods are very similar to the property setters except that we may have zero, one or multiple parameters to handle and we may or may not (if the return type is void) return a value from the method.

foreach (var method in typeof(ITest).GetMethods())
{
  var parameters = method.GetParameters();
  var parameterTypes = new List<Type>();
  foreach (var parameter in parameters)
  {
    if (parameter.IsOut)
      throw new ArgumentException("Output parameters are not supported");
    if (parameter.IsOptional)
      throw new ArgumentException("Optional parameters are not supported");
    if (parameter.ParameterType.IsByRef)
      throw new ArgumentException("Ref parameters are not supported");
    parameterTypes.Add(parameter.ParameterType);
  }
  var funcBuilder = typeBuilder.DefineMethod(
    method.Name,
    MethodAttributes.Public
      | MethodAttributes.HideBySig
      | MethodAttributes.NewSlot
      | MethodAttributes.Virtual
      | MethodAttributes.Final,
    method.ReturnType,
    parameterTypes.ToArray()
  );
  var ilFunc = funcBuilder.GetILGenerator();

  // Generate: object[] args
  var argValues = ilFunc.DeclareLocal(typeof(object[]));

  // Generate: args = new object[x]
  ilFunc.Emit(OpCodes.Ldc_I4, parameters.Length);
  ilFunc.Emit(OpCodes.Newarr, typeof(Object));
  ilFunc.Emit(OpCodes.Stloc_0);
  for (var index = 0; index < parameters.Length; index++)
  {
    // Generate: args[n] = ..;
    var parameter = parameters[index];
    ilFunc.Emit(OpCodes.Ldloc_0);
    ilFunc.Emit(OpCodes.Ldc_I4, index);
    ilFunc.Emit(OpCodes.Ldarg, index + 1);
    if (parameter.ParameterType.IsValueType)
      ilFunc.Emit(OpCodes.Box, parameter.ParameterType);
    ilFunc.Emit(OpCodes.Stelem_Ref);
  }

  var methodInfoInvokeMember = typeof(Type).GetMethod(
    "InvokeMember",
    new[]
    {
      typeof(string),
      typeof(BindingFlags),
      typeof(Binder),
      typeof(object),
      typeof(object[])
    }
  );

  // Generate:
  //   [return] _src.GetType().InvokeMember(method.Name, BindingFlags.InvokeMethod, null, _src, args);
  ilFunc.Emit(OpCodes.Ldarg_0);
  ilFunc.Emit(OpCodes.Ldfld, srcField);
  ilFunc.Emit(OpCodes.Callvirt, typeof(Type).GetMethod("GetType", Type.EmptyTypes));
  ilFunc.Emit(OpCodes.Ldstr, method.Name);
  ilFunc.Emit(OpCodes.Ldc_I4, (int)BindingFlags.InvokeMethod);
  ilFunc.Emit(OpCodes.Ldnull);
  ilFunc.Emit(OpCodes.Ldarg_0);
  ilFunc.Emit(OpCodes.Ldfld, srcField);
  ilFunc.Emit(OpCodes.Ldloc_0);
  ilFunc.Emit(OpCodes.Callvirt, methodInfoInvokeMember);

  if (method.ReturnType.Equals(typeof(void)))
    ilFunc.Emit(OpCodes.Pop);
  else if (method.ReturnType.IsValueType)
    ilFunc.Emit(OpCodes.Unbox_Any, method.ReturnType);

  ilFunc.Emit(OpCodes.Ret);
}

The boxing of ValueTypes when passed as parameters or returned from the method is required, just like the property accessors.

The only real point of interest here is the array generation for the parameters - this also took me a little while to wrap my head around! You may note that I've been a bit lazy and not supported optional, out or ref parameters - I didn't need these for anything I was working on and didn't feel like diving into it at this point. I'm fairly sure that if they become important features then whipping out ildasm and looking at the generated code there will reveal the best way to proceed with these.

The payoff!

Now that we've defined everything about the class, we can instantiate it!

var wrapper = (ITest)Activator.CreateInstance(
  typeBuilder.CreateType(),
  src
);

This gives us back an instance of a magic new class that wraps a specified "src" and passes through the ITest properties and methods! If it's applied to an object that doesn't have the required properties and methods then exceptions will be thrown when they are called - the standard exceptions that reflection calls to invalid properties/methods would result in.

But I think that's pretty much enough for this installment - it's been a bit dry but I think it's been worthwhile!

And then..

There are a lot of extension points that naturally arise from this rough-from-the-edges code - the first things that spring to my mind are a way to wrap this up nicely into a generic class that could create wrappers for any given interface, a way to handle interface inheritance, a way to possibly wrap the returned values - eg. if we have

public interface ITest
{
  IEmployee Get(int id);
}

public interface IEmployee
{
  int Id { get; }
  string Name { get; }
}

can the method that applies ITest to an object also apply IEmployee to the value returned from ITest.Get if that value itself doesn't already implement IEmployee??

Finally, off the top of my head, if I'm using these generated classes to read interact with WSCs / COM components, am I going to need to pass references over to COM components? If so, I'm going to have to find a way to flag them as ComVisible.

But these are issues to address another time :)

Footnote: WSC properties and "dynamic"

The code here doesn't use the .Net 4.0 "dynamic" keyword and so will compile under .Net 2.0. I had a bit of a poke around in the IL generated that makes use of dynamic since in some situations it should offer benefits - often performance is one such benefit! However, in the particularly niche scenario I'm working with it refuses to work :( Most of the components I'm wrapping are legacy VBScript WSCs and trying to set properties on these seems to fail when using dynamic.

I've pulled this example from a test (in xUnit) I wrote to illustrate that there were issues ..

[Fact]
public void SettingCachePropertyThrowsRuntimeBinderException()
{
  // The only way to demonstrate properly is unfortunately by loading an actual wsc - if we used a
  // standard .Net class as the source then it would work fine
  var src = Microsoft.VisualBasic.Interaction.GetObject(
    Path.Combine(
      new FileInfo(this.GetType().Assembly.FullName).DirectoryName,
      "TestSrc.wsc"
    )
  );
  var srcWithInterface = new ControlInterfaceApplierUsingDynamic(src);

  // Expect a RuntimeBinderException with message "'System.__ComObject' does not contain a definition
  // for 'Cache'"
  Assert.Throws<RuntimeBinderException>(() =>
  {
    srcWithInterface.Cache = new NullCache();
  });
}

public class ControlInterfaceApplierUsingDynamic : IControl
{
  private dynamic _src;
  public ControlInterfaceApplierUsingDynamic(object src)
  {
    if (src == null)
      throw new ArgumentNullException("src");

    _src = src;
  }

  public ICache Cache
  {
    set
    {
      _src.Cache = value;
    }
  }
}

[ComVisible(true)]
private class NullCache : ICache
{
  public object this[string key] { get { return null; } }
  public void Add(string key, object value, int cacheDurationInSeconds) { }
  public bool Exists(string key) { return false; }
  public void Remove(string key) { }
}

The WSC content is as follows:

<?xml version="1.0" ?>
<?component error="false" debug="false" ?>
<package>
  <component id="TestSrc">
    <registration progid="TestSrc" description="Test Control" version="1" />
    <public>
      <property name="Config" />
    </public>
    <script language="VBScript">
    <![CDATA[
      Public Cache
    ]]>
    </script>
  </component>
</package>

I've not been able to get to the bottom of why this fails and it's not exactly a common problem - most people left WSCs back in the depths of time.. along with VBScript! But for the meantime we're stuck with them since the thought of trying to migrate the entire codebase fills me with dread, at least splitting it this way means we can move over piecemeal and re-write isolated components of the code at a time. And then one day the old cruft will have gone! And people will consider the earlier migration code the new "old cruft" and the great cycle can continue!

Update (2nd May 2014): It turns out that if

var src = Microsoft.VisualBasic.Interaction.GetObject(
  Path.Combine(
    new FileInfo(this.GetType().Assembly.FullName).DirectoryName,
    "TestSrc.wsc"
  )
);

is altered to read

var src = Microsoft.VisualBasic.Interaction.GetObject(
  "script:"
  Path.Combine(
    new FileInfo(this.GetType().Assembly.FullName).DirectoryName,
    "TestSrc.wsc"
  )
);

then this test will pass! I'll leave the content here for posterity but it struck me while flipping through this old post that I had seen and addressed this problem since writing this.

Posted at 22:01

Tags:

Comments

27 August 2011

Impersonating the ASP Request Object

I've got something coming up at work soon where we're hoping to migrate some internal web software from VBScript ASP to .Net, largely for performance reasons. The basic structure is that there's an ASP "Engine" running which instantiates and renders Controls that are VBScript WSC components. The initial task is going to be to try to replace the main Engine code and work with the existing Controls - this architecture give us the flexibility to migrate in this manner, rather than having to try to attack the entire codebase all at once. References are passed into the WSC Controls for various elements of the Engine but also for ASP objects such as Request and Response.

The problem comes with the use of the Request object. I want to be able to swap it out for a .Net COM component since access to the ASP Request object won't be available when the Engine is running in .Net. But the Request collections (Form, QueryString and ServerVariables) have a variety of access methods that are not particular easy to replicate -

' Returns the full QueryString content (url-encoded),
Request.QueryString

Request.QueryString.Count
Request.QueryString.Keys.Count

' Loops over the keys in the collections
For .. in Request.QueryString
For .. in Request.QueryString.Keys

' Returns a string containing values for the specified key (comma-separated)
Request.QueryString(key)
Request.QueryString.Item(key)

' Loops over the values for the specified key
For Each .. In Request.QueryString(key)
For Each .. In Request.QueryString.Item(key)

Approaches

In the past I've made a few attempts at attacking this before -

First trying a VBScript wrapper to take advantage of VBScript's Default properties and methods. But it doesn't seem possible to create a collection in VBScript that the For.. Each construct can work over.

Another time I tried a Javascript wrapper - a returned array can be enumerate with For.. Each and I thought I might be able to add methods of properties to the returned array for the default properties, but these were returned in the keys when enumerated.

I've previously tried to write a COM component but was unable to construct classes that would be accessible by all the above examples. This exact problem is described in a thread on StackOverflow and I thought that one of the answers would solve my problem by returning different data depending upon whether a key was supplied: here.

Hooray!

Actually, no. I tried using that code and couldn't get it to work as advertised - getting a COM exception when trying to access QueryString without a key.

However, further down in that thread (here) there's another suggestion - to implement IReflect. Not an interface I was familiar with..

IReflect

It turns out writing a class that implements IReflect and specifies ClassInterface(ClassInterfaceType.AutoDispatch) will enable us to handle all querying and invoking of the class interface from COM! The AutoDispatch value, as I understand it (and I'm far from an authority on this!), prevents the class from being used in any manner other than late binding as it doesn't publish any interface data in a type library - callers must always query the object for method, property, etc.. accessibility. And this will enable us to intercept this queries and invoke requests and handle as we see fit.

It turns out that we don't even really have to do anything particularly fancy with the requests, and can pass them straight through to a .Net object that has method signatures with different number of parameters (which ordinarily we can't do through a COM interface).

A cut down version of the code I've ended up with will demonstrate:

// This doesn't need to be ComVisible since we're never returning an instance of it through COM, only
// one wrapped in a LateBindingComWrapper
public class RequestImpersonator
{
    public RequestDictionary Querystring()
    {
      // Return a reference to the whole RequestDictionary if no key specified
    }
    public RequestStringList Querystring(string key)
    {
      // Return data for the particular key, if one is specified
    }

    // .. code for Form, ServerVariables, etc..

}

[ClassInterface(ClassInterfaceType.AutoDispatch)]
[ComVisible(true)]
public class LateBindingComWrapper : IReflect
{
    private object _target;
    public LateBindingComWrapper(object target)
    {
      if (target == null)
        throw new ArgumentNullException("target");
      _target = target;
    }

    public Type UnderlyingSystemType
    {
      get { return _target.GetType().UnderlyingSystemType; }
    }

    public object InvokeMember(
      string name,
      BindingFlags invokeAttr,
      Binder binder,
      object target,
      object[] args,
      ParameterModifier[] modifiers,
      CultureInfo culture,
      string[] namedParameters)
    {
      return _target.GetType().InvokeMember(
        name,
        invokeAttr,
        binder,
        _target,
        args,
        modifiers,
        culture,
        namedParameters
      );
    }

    public MethodInfo GetMethod(string name, BindingFlags bindingAttr)
    {
      return _target.GetType().GetMethod(name, bindingAttr);
    }

    public MethodInfo GetMethod(
      string name,
      BindingFlags bindingAttr,
      Binder binder,
      Type[] types,
      ParameterModifier[] modifiers)
    {
      return _target.GetType().GetMethod(name, bindingAttr, binder, types, modifiers);
    }

    public MethodInfo[] GetMethods(BindingFlags bindingAttr)
    {
      return _target.GetType().GetMethods();
    }

    // .. Other IReflect methods for fields, members and properties

}

If we pass a RequestImpersonator-wrapping LateBindingComWrapper reference that wraps one of the WSC Controls as its Request reference then we've got over the problem with the optional key parameter and we're well on our way to a solution!

RequestDictionary is enumerable for VBScript and exposes a Keys property which is a self-reference so that "For Each .. In Request.QueryString" and "For Each .. In Request.QueryString.Keys" constructs are possible. It also has a default GetSummary method which returns the entire querystring content (url-encoded). The enumerated values are RequestStringList instances which are in turn enumerable so that "For Each .. In Request.QueryString(key)" is possible but also have a default property which combines the values into a single (comma-separated) string.

VBScript Enumeration

I spent a lot of time trying to ascertain what exactly was required for a class to be enumerable by VBScript - implementing Generic.IEnumerable and/or IEnumerable didn't work, returning an ArrayList did work, implementing ICollection did work. Now I thought I was on to something! After looking into which methods and properties were actually being used by the COM interaction, it seemed that only "IEnumerator GetEnumerator()" and "int Count" were called. So I started off with:

[ComVisible(true)]
public class RequestStringList
{
    private List<string> _values;

    // ..

    [DispId(-4)]
    public IEnumerator GetEnumerator()
    {
        return _values.GetEnumerator();
    }
    public int Count
    {
        get { return _values.Count; }
    }
}

which worked great.

This concept of Dispatch Ids (DispId) was ringing a vague bell from some VB6 component work I'd done the best part of a decade ago but not really encountered much since. These Dispatch Ids identify particular functions in a COM interface with zero and below having secret special Microsoft meanings. Zero would be default and -4 was to do with enumeration, so I guess this explains why there is a [DispId(-4)] attribute on GetEnumerator in IEnumerable.

However, .. RequestStringList also works if we DON'T include the [DispId(-4)] and try to enumerate over it. To be completely honest, I'm not sure what's going on with that. I'm not sure if the VBScript approach to the enumeration is performing some special check to request the GetEnumerator method by name rather than specific Dispatch Id.

On a side note, I optimistically wondered if I could create an enumerable class in VBScript by exposing a GetEnumerator method and Count property (implementing an Enumerator class matching .Net's IEnumerator interface).. but VBScript was having none of it, giving me the "object not a collection" error. Oh well; no harm, no foul.

More Dispatch Id Confusion

As mentioned above, RequestDictionary and RequestStringList have default values on them. The would ordinarily be done with a method or property with Dispatch Id of zero. But again, VBScript seems to have its own special cases - if a method or property is named "Value" then this will be used as the default even if it doesn't have DispId(0) specified.

Limitations

I wrote this to try to solve a very specific problem, to create a COM component that could be passed to a VBScript WSC Control that would appear to mimic the ASP Request object's interface. And while I'm happy with the solution, it's not perfect - the RequestDictionary and RequestStringList classes are not enumerable from Javascript in a "for (var .. in ..)" construct. I've not looked into why this this or how easy (or not!) it would be to solve since it's not important for my purposes.

One thing I did do after the bulk of the work was done, though, was to add some managed interfaces to RequestDictionary, RequestStringList and RequestImpersonatorCom which enabled managed code to access the data in a sensible manner. Adding classes to RequestImpersonatorCom has no effect on the COM side since all of the invoke calls are performed against the RequestImpersonator that's wrapped up in the LateBindingComWrapper.

Success!

After the various attempts I've made at looking into this over the years, I'm delighted that I've got a workable solution that integrates nicely with both VBScript and the managed side (though the latter was definitely a bonus more than an original requirement). The current code can be found on GitHub at: https://github.com/ProductiveRage/ASPRequestImpersonator.

Posted at 10:20

Tags:

Comments

You may also be interested in (see here for information about how these are generated):

You may also be interested in (see here for information about how these are generated):

You may also be interested in (see here for information about how these are generated):

You may also be interested in (see here for information about how these are generated):

You may also be interested in (see here for information about how these are generated):

You may also be interested in (see here for information about how these are generated):

You may also be interested in (see here for information about how these are generated):

You may also be interested in (see here for information about how these are generated):

About

Recent Posts

Highlights

Archives