When Is the Content of a String Not What You See?

Turns out, in C#, when it’s coming from a COM interface…sometimes.

I’m writing a C# library to interface with Windows’ File History subsystem. There’s no type library so I have to roll my own set of classes and interfaces in order for C#’s built-in COM interop stuff to access the functions embedded in the File History DLLs. Not too hard, at least for a simple case like the File History subsystem (these directions were invaluable for getting started).

The basic approach involves two steps. First, define a class decorated with two key attributes:

 [ComImport, Guid("ED43BB3C-09E9-498a-9DF6-2177244C6DB4")]
 internal class FHManager
 {
 }

ComImport tells the compiler this class is a COM object, while the Guid attribute identifies which COM object the class represents.

Next, define an interface for the COM object:

[ Guid( "6A5FEA5B-BF8F-4EE5-B8C3-44D8A0D7331C" ), InterfaceType( ComInterfaceType.InterfaceIsIUnknown ) ]
internal interface IFHManager
{
    void LoadConfiguration();
    void CreateDefaultConfiguration( bool overwriteIfExists );
    void SaveConfiguration();
    void AddRemoveExcludeRule( bool add, ProtectedItemCategory category, [ MarshalAs( UnmanagedType.BStr ) ] string item );
    IFhScopeIterator GetIncludeExcludeRules( bool include,  ProtectedItemCategory category );
    ulong GetLocalPolicy( LocalPolicy localPolicy );
    void SetLocalPolicy( LocalPolicy localPolicy, ulong policyValue );
    BackupStatus GetBackupStatus();
    void SetBackupStatus( BackupStatus status );
    IFhTarget GetDefaultTarget();
    ValidationResult ValidateTarget( [ MarshalAs( UnmanagedType.BStr ) ] string url );
    void ProvisionAndSetNewTarget( [ MarshalAs( UnmanagedType.BStr ) ] string url, [ MarshalAs( UnmanagedType.BStr ) ] string name );
    void ChangeDefaultTargetRecommendation( bool recommend );
    void QueryProtectionStatus( out int protectionState, [ MarshalAs( UnmanagedType.BStr ) ] out string protectedUntilTime );
}

Note that the Guid for the interface is different from the Guid for the COM object (both Guids can be found in the SDK header files).

Be careful about the InterfaceType you declare. ComInterfaceType has several values (InterfaceIsIUnknown, InterfaceIsIDispatch and InterfaceIsDual among them), and they’re not interchangeable. The SDK header files can help you figure out which one a given interface is (or you can just use trial and error).

Also be careful in how you define the interface. Apparently (I’m no COM interop expert) what matters is the order in which you define the interface methods and their parameters, not their names. Again, the SDK header files are invaluable for getting this right.

You access the COM object methods by creating an instance of the COM object class and then casting it to the COM interface:

// remember to check that the result of the cast is non-null
// if it's null, you've probably misconfigured something in the
// C# interface definition.
var _fhMgr = new FHManager() as IFHManager;

The problem I ran into involved the QueryProtectionStatus() method, the last one in the interface listing. I thought the MarshalAs attributes decorating the parameters (which, as out parameters, are return values) would take care of all the messy details involved in converting unmanaged objects to managed objects.

Turns out there’s a little detail — a little, invisible detail — that they don’t handle: they don’t strip out special Unicode characters which can be embedded in a string but don’t display. In this particular case they are codes that indicate whether the text they wrap should be displayed left-to-right or right-to-left.

Those invisible codes totally flummox the DateTime parsing routines I was trying to use to get the DateTime when File History last did a backup. I thought I had lost my mind when the value returned by QueryProtectionStatus(), “5/30/2019 8:52 PM”, was deemed unequal to the exact same string I had typed into a test statement in the code.

You can only tell the codes are there by checking the length of the string and seeing if it matches what you think the length should be by looking at the displayed string. Or by doing a character-by-character scan, which is how I first stumbled across the problem.
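
Here is a rough sketch of that kind of check. The sample string is an assumption: I’m guessing the embedded codes were U+200E (LEFT-TO-RIGHT MARK) and guessing where they sat, and DumpCodePoints is just a helper name made up for this example.

```csharp
using System;
using System.Linq;

static class InvisibleCharDemo
{
    // Renders every UTF-16 code unit as "U+XXXX" so non-printing characters become visible.
    public static string DumpCodePoints( string s ) =>
        string.Join( " ", s.Select( c => "U+" + ( (int) c ).ToString( "X4" ) ) );

    static void Main()
    {
        // ASSUMPTION: stand-in for the COM return value, with U+200E marks sprinkled in.
        var fromCom = "\u200E5/\u200E30/\u200E2019 \u200E8:\u200E52 PM";
        var typed = "5/30/2019 8:52 PM";

        // The two strings look identical on screen, but the lengths differ...
        Console.WriteLine( fromCom.Length + " vs " + typed.Length );

        // ...so an ordinal comparison fails.
        Console.WriteLine( fromCom == typed );

        // The dump shows exactly where the extra characters are.
        Console.WriteLine( DumpCodePoints( fromCom ) );
    }
}
```

If the dump shows anything other than the characters you can see, you’ve found your culprit.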

The solution is to remove all those invisible Unicode characters. One way of doing that is like this:

lastBackup = new string( lastBackup.Where( c => c < 128 ).ToArray() );

I doubt this would work for non-English languages, which use characters with code points of 128 and above, but it works for English.
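
A possible alternative that should be friendlier to non-English text is to drop only the characters Unicode classifies as “Format” (category Cf), which is where the directional marks U+200E and U+200F live. This is a sketch, not the fix I actually used: the sample input is assumed, StripFormatChars is a made-up name, and I’m assuming en-US date formatting for the parse.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class FormatCharStripper
{
    // Removes characters in Unicode category Cf (Format) and keeps everything else,
    // including letters and digits above the ASCII range.
    public static string StripFormatChars( string s ) =>
        new string( s.Where( c => CharUnicodeInfo.GetUnicodeCategory( c ) != UnicodeCategory.Format ).ToArray() );

    static void Main()
    {
        // ASSUMPTION: stand-in for the string returned through the COM interface.
        var raw = "\u200E5/\u200E30/\u200E2019 \u200E8:\u200E52 PM";
        var cleaned = StripFormatChars( raw );

        // ASSUMPTION: the timestamp follows en-US conventions.
        var culture = CultureInfo.GetCultureInfo( "en-US" );

        if( DateTime.TryParse( cleaned, culture, DateTimeStyles.None, out var lastBackup ) )
            Console.WriteLine( lastBackup.ToString( "s" ) );
        else
            Console.WriteLine( "Still could not parse: " + cleaned );
    }
}
```

One caveat: category Cf also covers things like the soft hyphen and the zero-width joiner, so this removes more than just the directional marks. That seems harmless for a timestamp, but it might not be for arbitrary text.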

Thanx to I4V over on StackOverflow for his explanation and fix.
