29 March 2011
Earlier this year someone I work with introduced me to AutoMapper. It turned out to be a very convenient time since I was in the middle of a couple of projects where I had to do a lot of run-of-the-mill gluing together of requests between web services - very similar object models were in play but they came from different services, so I was looking at writing a load of code that basically took a request from one side and re-formed it into a very similar request to push elsewhere. Not particularly fun, and I find that one of the times I'm most likely to make stupid mistakes is when I'm not 100% mentally switched on because the task at hand makes me feel like I'm being a robot!
So, for one of these projects that I was getting stuck into: AutoMapper to the rescue!
AutoMapper is an "object-to-object" mapper which, well.. maps from one object to another! :) If the source and destination objects have identical structure but different namespaces then most times AutoMapper will be able to translate from one to another as-if-by-magic, and there are several conventions that are applied by the default mapper that perform simple object flattening and other tricks.
There are loads of introductory tutorials out there for AutoMapper so this is just a dead simple example to get across the gist - I can use one call to the CreateMap method with a nice fluent coding style to tweak it how I want, and then conversions between lists or arrays or enumerables of mappable types are handled automatically:
var data = new Employee()
{
    Name = new Employee.EmployeeName()
    {
        Title = "Mr",
        First = "Andrew",
        Last = "Test",
    },
    DateOfBirth = new DateTime(1990, 6, 14)
};

Mapper.CreateMap<Employee, Person>()
    .ForMember(d => d.Name, o => o.MapFrom(s => s.Name.Title + " " + s.Name.First + " " + s.Name.Last));

var dataList = new Employee[] { data };
var translated = Mapper.Map<Employee[], List<Person>>(dataList);

public class Employee
{
    public EmployeeName Name { get; set; }
    public DateTime DateOfBirth { get; set; }

    public class EmployeeName
    {
        public string Title { get; set; }
        public string First { get; set; }
        public string Last { get; set; }
    }
}

public class Person
{
    public string Name { get; set; }
    public DateTime DateOfBirth { get; set; }
}
This doesn't even scratch the surface; it can handle nested types and complex object models, you can define custom naming conventions for property mappings, specify properties to ignore or map other than to the conventions, map onto existing instances rather than creating new, create distinct configuration instances, .. loads and loads of stuff.
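To give a flavour of a couple of those (just a rough sketch, re-using the Employee, Person and "data" references from the example above with the static Mapper API that AutoMapper had at the time):

// Tell AutoMapper to leave a destination property alone entirely..
Mapper.CreateMap<Employee, Person>()
    .ForMember(d => d.Name, o => o.Ignore());

// .. and map onto an existing instance rather than having it create a new one;
// here DateOfBirth gets populated from data while Name is left untouched
var existingPerson = new Person();
Mapper.Map<Employee, Person>(data, existingPerson);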
An example of its use out-in-the-field is in the MVC Nerd Dinner demo project and Jimmy Bogard (who wrote AutoMapper) mentions how he uses it in his article "How we do MVC" -
AutoMapper to go from Domain -> ViewModel and Domain -> EditModel. This is again because the view and controller put constraints on our model that we didn't want in our domain. AutoMapper flattened our domain into very discrete ViewModel objects, containing only the data for our view, and only in the shape we want.
.. which sounds like a very sensible application for it to me! (The rest of that article's definitely worth a read, btw).
There was one gotcha using it that caught me out, but it made perfect sense when I reasoned it through afterward.
I was mapping from one large, mostly-flat object into another where the first was a subset of the second; it was an old legacy webservice whose interface accepted every property for several types of booking - maybe 60% of the properties were shared between the types and the rest were specific to the different booking types. So a booking made through the web interface resulted in a HotelBooking being instantiated, for example, and this was mapped onto the "super" booking object of the legacy service interface.
var source = new Hotel(
    Guid.NewGuid(),
    // .. other properties
    "Test"
    // .. other properties
);

Mapper.CreateMap<Hotel, Booking>();
var dest = Mapper.Map<Hotel, Booking>(source);

public class Hotel
{
    public Hotel(Guid id, /* .. other properties .. */ string network)
    {
        if ((network ?? "").Trim() == "")
            throw new ArgumentException("Null/empty network specified");
        // .. other validation ..

        Id = id;
        // .. other properties ..
        Network = network;
        // .. other properties ..
    }

    public Guid Id { get; private set; }

    // .. other properties ..

    /// <summary>
    /// This will never be null
    /// </summary>
    public string Network { get; private set; }

    // .. other properties ..
}

public class Booking
{
    public Guid Id { get; set; }
    // .. other properties
    public string NetworkType { get; set; }
    // .. other properties
}
On the translated "dest" instance, the NetworkType property is "System.String" - er, what??
Well, it turns out that AutoMapper finds there's no NetworkType property on Hotel to map straight onto Booking's NetworkType, but it sees that there is a "Network" value. It then tries to see if it can perform some object flattening by checking whether the Network value has a Type property which, being a string, it doesn't. But it will also consider a property retrieval method rather than a standard property getter, so it looks for a GetType() method which, since string inherits from object, it has! So it takes the .Network.GetType() value and assumes we want this for the Booking.NetworkType value!
Like I said, it all makes perfect sense but it took me a little while to work out what was happening in this case :)
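One way around it is simply to take the guesswork away from the convention and map that member explicitly (or tell it to ignore the property altogether) - something along these lines:

// Being explicit about where NetworkType comes from stops the flattening
// convention from reaching for Network.GetType()
Mapper.CreateMap<Hotel, Booking>()
    .ForMember(d => d.NetworkType, o => o.MapFrom(s => s.Network));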
Did I mention AutoMapper is open source? This is great cos it let me have a poke around the source code and get a feel for what magic seemed to be going on!
My biggest problem is with scenarios where I want to do the opposite of the above - instead of translating from an "always-valid" internal object to a webservice class I'd like to be able to instantiate a class through its constructor, using data from a source class.
Now, AutoMapper does have some sort of support for using constructors for mapping - eg.
Mapper.CreateMap<Booking, Hotel>()
    .ConstructUsing(src => new Hotel(src.Id, /* .. other properties .. */));
But here I've got to manually map all of the properties from the source onto arguments of the destination's constructor! What I'd really like is for all of that clever name-convention malarkey in AutoMapper to be applied to the constructor arguments of destination types. I mean, argument names are always present in compiled C# code so it's not as if that data is unavailable for examination by AutoMapper. And having conversions like this would save me having to write a lot of boring code at webservice boundaries!
Now, since I seem to think it's so easy - How Hard Can It Be? :) - I'm going to have a bit of a play around and see if I can slap something together to do this. If I don't end up reduced to tears (and maybe even if I do!) I'll see what I can do about posting the results!
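For what it's worth, the naive core of what I'm imagining is only a few lines of reflection - this is very much a sketch (the ConstructorMapper name and everything else here is made up, and real code would need type conversion, validation and caching):

using System;
using System.Linq;
using System.Reflection;

public static class ConstructorMapper
{
    // Pick the constructor with the most parameters and satisfy each argument
    // from a source property with the same name (ignoring case)
    public static TDest Map<TSource, TDest>(TSource source)
    {
        var constructor = typeof(TDest).GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .First();
        var args = constructor.GetParameters()
            .Select(p =>
            {
                var property = typeof(TSource).GetProperty(
                    p.Name,
                    BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase
                );
                if (property == null)
                    throw new ArgumentException("No source property found for argument: " + p.Name);
                return property.GetValue(source, null);
            })
            .ToArray();
        return (TDest)constructor.Invoke(args);
    }
}

Usage would then just be something like var hotel = ConstructorMapper.Map<Booking, Hotel>(source); - with all of the interesting work being in making that name matching as flexible as AutoMapper's property conventions.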
Posted at 21:38
22 March 2011
Having sung the praises of immutability last time, there are a couple of flies in the ointment. The first is a bit of a non-issue I think, but bears mentioning; if I have a class with half a dozen properties it feels like a lot of monotonous typing hammering out those private properties, those arguments-not-whitespace-or-null checks, those property assignments, those public properties, those comments about the class contract - it's boring! Now I know that developers should all be superfast, touch-typist maniacs (http://www.codinghorror.com/blog/2008/11/we-are-typists-first-programmers-second.html) - and I am, that's not the problem - but it still makes me grimace when I know I have to throw down a big chunk of boilerplate-looking code like the Name class I used in an example last time.
public class Name
{
    public Name(string title, string firstName, string lastName)
    {
        if ((title ?? "").Trim() == "")
            throw new ArgumentException("Null/empty title specified");
        if ((firstName ?? "").Trim() == "")
            throw new ArgumentException("Null/empty firstName specified");
        if ((lastName ?? "").Trim() == "")
            throw new ArgumentException("Null/empty lastName specified");

        Title = title;
        FirstName = firstName;
        LastName = lastName;
    }

    /// <summary>
    /// This will never be null or empty
    /// </summary>
    public string Title { get; private set; }

    /// <summary>
    /// This will never be null or empty
    /// </summary>
    public string FirstName { get; private set; }

    /// <summary>
    /// This will never be null or empty
    /// </summary>
    public string LastName { get; private set; }
}
Now, on the other hand, this means that callers can rely on certain guarantees about the class and so contain less "gotcha" code (checking for null values and all that). So this is probably code that would have to be written in one way or another elsewhere - possibly many times over. So I think it's definitely a win overall, which is why I said it's kind of a non-issue - but it still makes my fingers hurt a bit for that quick blaze of crazy typing. I'm concerned the key motions for ArgumentNullException are so ingrained in my muscle memory that one day my hands will refuse to type anything else!
Another issue I've come across a few times was highlighted quite nicely by something I was writing the other day; we had some forms that we wanted to generate from xml config files so some of the elements could be added, removed, made optional, compulsory, etc, etc.. It was fairly straight-forward and each element was parsed from the file and described by a corresponding immutable class but there was a problem - some of the elements were related, or rather one might depend on another. A cascading dropdown scenario, basically. So each element needed a way to access the other elements in the form data to read their values and whatnot. But when initialising each element there didn't exist any single object that had awareness of all of the elements since we were still in the process of initialising them! Catch-22, bang!
To work around this I used an object that appeared immutable to the elements but which did not guarantee to be able to respond to requests for references to other elements until the Init phase of the page lifecycle (this was in ASP.Net and the process was to parse the config file, build a list of controls that the elements required and then add those controls to a Page all at once - so the Init event for each of those controls would be raised after all of the elements had been initialised and the controls created). This object would contain no data initially and be used just as a reference to pass to the elements during initialisation. When the initialisation of all of the elements was complete, a reference to the list of these elements was passed to this mystery object; our "deferred element store". And then the elements' controls were added to the Page. So when the element classes requested access to other elements during or after the Init phase, the data was available!
Now, this clearly isn't immutable data - it's more like some sort of single-setting, delayed-instantiation object.. or something. I'm going to link to Eric Lippert again here since he's pretty much the guru on this sort of thing and he describes this precise scenario in his "Immutability in C#, Part One: Kinds of Immutability" article:
.. I'm not so sure about the phrase "popsicle immutability" but that's basically what I'm talking about! There's a slight variation that I used here (which is actually talked about in the comments on that article) where the real "element store" class is not passed to the elements during initialisation, only a wrapper around it. This ensured that the element classes couldn't mess with the state - only the form parser could:
public interface IDeferredElementStore
{
    AbstractFormElement TryToGetElement(string id);
}

public class DeferredElementStore : IDeferredElementStore
{
    private NonNullImmutableList<AbstractFormElement> _elements;
    public DeferredElementStore()
    {
        _elements = new NonNullImmutableList<AbstractFormElement>();
    }
    public void StoreElementData(NonNullImmutableList<AbstractFormElement> elements)
    {
        if (elements == null)
            throw new ArgumentNullException("elements");
        _elements = elements;
    }
    public AbstractFormElement TryToGetElement(string id)
    {
        var element = _elements.FirstOrDefault(e => e.Id == id);
        if (element == null)
            throw new ArgumentException("Invalid Id");
        return element;
    }
}

public class ReadOnlyDeferredElementStore : IDeferredElementStore
{
    private IDeferredElementStore _elementStore;
    public ReadOnlyDeferredElementStore(IDeferredElementStore elementStore)
    {
        if (elementStore == null)
            throw new ArgumentNullException("elementStore");
        _elementStore = elementStore;
    }
    public AbstractFormElement TryToGetElement(string id)
    {
        return _elementStore.TryToGetElement(id);
    }
}
.. and the element generation code could look something like:
var elements = new List<AbstractFormElement>();
var elementStore = new DeferredElementStore();
var elementStoreReadOnly = new ReadOnlyDeferredElementStore(elementStore);
elements.Add(new FreeTextElement(.., elementStoreReadOnly, ..));
elements.Add(new DropDownElement(.., elementStoreReadOnly, ..));
elementStore.StoreElementData(new NonNullImmutableList<AbstractFormElement>(elements));
foreach (var element in elements)
{
    foreach (var control in element.Controls)
        this.Controls.Add(control);
}
This could just as well be used if there are circular references between classes. I suppose then you'd have to have a container to handle both objects being instantiated and pass a read-only wrapper of this container to both classes, then push references to those instances into the container.
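Something along these lines, perhaps (purely a hypothetical sketch - SideA, SideB and PairStore are made-up names here):

public interface IPairStore
{
    SideA GetA();
    SideB GetB();
}

public class PairStore : IPairStore
{
    private SideA _a;
    private SideB _b;
    public void Set(SideA a, SideB b)
    {
        if (a == null)
            throw new ArgumentNullException("a");
        if (b == null)
            throw new ArgumentNullException("b");
        _a = a;
        _b = b;
    }
    public SideA GetA() { return _a; }
    public SideB GetB() { return _b; }
}

public class ReadOnlyPairStore : IPairStore
{
    private IPairStore _store;
    public ReadOnlyPairStore(IPairStore store)
    {
        if (store == null)
            throw new ArgumentNullException("store");
        _store = store;
    }
    public SideA GetA() { return _store.GetA(); }
    public SideB GetB() { return _store.GetB(); }
}

public class SideA
{
    private IPairStore _store;
    public SideA(IPairStore store) { _store = store; }
    public SideB Partner { get { return _store.GetB(); } }
}

public class SideB
{
    private IPairStore _store;
    public SideB(IPairStore store) { _store = store; }
    public SideA Partner { get { return _store.GetA(); } }
}

The wiring-up code would instantiate the PairStore, hand a ReadOnlyPairStore wrapper to both SideA and SideB as they're constructed and then call Set on the real store - after which each side can happily reach the other through its Partner property.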
This isn't quite the same as the "observational immutability" described in that article, but I feel I've got an article about dynamic factory classes coming on which will touch on that!
All in all, I'm still definitely a big fan of this immutability lark and am still convinced it makes the code easier to deal with overall. I was reading something earlier that I now can't find so I'll have to paraphrase - they were saying that when you're trying to get to grips with existing code, the less you have to keep in your head about what's going on at any time, the easier it is. This is hardly news but it was used in the context of the advantages of immutable data; if you have references that just are and aren't going to undergo all sorts of state changes, there are far fewer potential interactions you have to deal with mentally. And that means it should be easier to deal with!
Posted at 19:00
14 March 2011
I love immutable data. There, I said it. I think over the last couple of years a few major factors have had the most influence in leading me to this point -
The first point could really be addressed in all sorts of ways - the code's all a bit wishy-washy and poorly defined and nobody seems to know which fields are for what in the example I'm thinking of. But when I think of immutable types I instinctively think of classes whose values are set once through a constructor (though there are other variations that can be used) and then that instance is "locked" such that we know its state will never change - and that constructor will have ensured that this state is valid. If the classes in point were all written in this way then never again (hopefully!) would there be concerns regarding the validity of the states of the objects, they must have been valid in order to be instantiated and immutability means they can't have changed since!
While we're doing some sort of validation on the constructor arguments I think it also encourages you to think about the various states that can exist - eg.
public class Employee
{
    public string Title { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string[] Roles { get; set; }
}
This is the sort of thing that's found all over the place - especially across webservice interfaces. Assume that we have the requirements that Title, FirstName and LastName all have values and that all Employees have zero or more Roles. I think describing the requirements in constructor validation and then some liberal commenting ends up in nicer code:
public class Employee
{
    public Employee(Name name, DefinedStringList roles)
    {
        if (name == null)
            throw new ArgumentNullException("name");
        if (roles == null)
            throw new ArgumentNullException("roles");

        Name = name;
        Roles = roles;
    }

    /// <summary>
    /// This will never be null
    /// </summary>
    public Name Name { get; private set; }

    /// <summary>
    /// This will never be null
    /// </summary>
    public DefinedStringList Roles { get; private set; }
}

public class Name
{
    public Name(string title, string firstName, string lastName)
    {
        if ((title ?? "").Trim() == "")
            throw new ArgumentException("Null/empty title specified");
        if ((firstName ?? "").Trim() == "")
            throw new ArgumentException("Null/empty firstName specified");
        if ((lastName ?? "").Trim() == "")
            throw new ArgumentException("Null/empty lastName specified");

        Title = title;
        FirstName = firstName;
        LastName = lastName;
    }

    /// <summary>
    /// This will never be null or empty
    /// </summary>
    public string Title { get; private set; }

    /// <summary>
    /// This will never be null or empty
    /// </summary>
    public string FirstName { get; private set; }

    /// <summary>
    /// This will never be null or empty
    /// </summary>
    public string LastName { get; private set; }
}
Except - wow! - the amount of code seems to have ballooned and I've not even included the "DefinedStringList" class! (Well, not here at least - it's down the bottom of the post).
But what we do have now will be instances of Employee that are always in a known good state and we can safely retrieve employee.Name.FirstName without first ensuring that Name is not null. We also know that Employees that have not been assigned roles will have a Roles instance that declares a Count of zero rather than wondering if it will be that or whether there will be a null Roles instance. So the upshot should be that there will actually be less code in places where Employee instances are accessed.
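To illustrate the sort of defensive code that disappears at the calling end (a contrived couple of lines):

// Without the guarantees, every access gets wrapped in defensive checks..
var firstName = (employee.Name == null) ? "" : (employee.Name.FirstName ?? "");

// .. while with them we can just dive straight in
firstName = employee.Name.FirstName;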
Now, to recreate a really trivial version of the multithreaded datastore I mentioned earlier, imagine we have a local store of Employees that is being written to and read from - eg.
public class EmployeeStore
{
    private List<Employee> _data = new List<Employee>();

    public IEnumerable<Employee> GetAll()
    {
        lock (_data)
        {
            return _data.AsReadOnly();
        }
    }

    public void Add(Employee employeeToAdd)
    {
        if (employeeToAdd == null)
            throw new ArgumentNullException("employeeToAdd");
        lock (_data)
        {
            _data.Add(employeeToAdd);
        }
    }
}
We'll ignore any concept of deleting or updating for now. Since we don't know how many threads are at work in this scenario, or who's doing what, we lock the internal data at each read or write. We're also returning the data as an IEnumerable and using List's .AsReadOnly method in an optimistic attempt to keep the internal data from being manipulated externally after we return it. In fact, in the example I had, the data was actually (deep-)cloned before returning to ensure that no caller could manipulate any data inside the data store.
If we're working with immutable data types and have access to an immutable list then we can change this without much effort to require no locks for reading and we can implicitly forget any AsReadOnly or cloning malarkey if we have an immutable list to work with as well. An immutable list works by returning new instances when methods that would otherwise effect its contents are called - so if a list has 3 items and we call Add then the existing list is unchanged and the Add method returns a new list with all 4 items. Example code is at the end of this post, along with a DefinedStringList implementation, as mentioned earlier.
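Just to make that behaviour concrete (a trivial illustration using the ImmutableList included at the end of this post):

var threeNames = new ImmutableList<string>("Bill", "Ben", "Weed");
var fourNames = threeNames.Add("Flowerpot");

// threeNames.Count is still 3 while fourNames.Count is 4 - the original
// reference is completely untouched by the Add call

With one of those to hand, the store no longer needs to lock for reads at all: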
public class EmployeeStoreWithoutReadLocking
{
    private object _writeLock = new object();
    private ImmutableList<Employee> _data = new ImmutableList<Employee>();

    public ImmutableList<Employee> GetAll()
    {
        return _data;
    }

    public void Add(Employee employeeToAdd)
    {
        if (employeeToAdd == null)
            throw new ArgumentNullException("employeeToAdd");
        lock (_writeLock)
        {
            _data = _data.Add(employeeToAdd);
        }
    }
}
Easy! Of course this relies upon the Employee class being immutable (which must cover all of its properties' types as well). Now we're not just reaping the benefits in state validity but we've got more performant threaded code too (again, my example was heavy on reads and light on writes). In a lot of cases immutability such as this can make areas of multi-threaded code much easier to write and maintain.
I think in this case I extended the ImmutableList to a NonNullImmutableList which had validation to ensure it would never contain any null references. Similar to how the DefinedStringList will ensure it has no null or empty values. Another layer of comforting behaviour guarantee so that callers don't have to worry about nulls. It makes me feel warm and fuzzy.
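It would look something along these lines (a sketch only - essentially the same treatment that the DefinedStringList at the end of this post gives to strings, but for any reference type):

public class NonNullImmutableList<T> : ImmutableList<T> where T : class
{
    public NonNullImmutableList(IEnumerable<T> values)
        : base(values, new NonNullValidator()) { }
    public NonNullImmutableList(params T[] values) : this((IEnumerable<T>)values) { }
    public NonNullImmutableList() : this(new T[0]) { }

    public new NonNullImmutableList<T> Add(T value)
    {
        return toDerivedClass<NonNullImmutableList<T>>(base.Add(value));
    }
    public new NonNullImmutableList<T> Remove(T value)
    {
        return toDerivedClass<NonNullImmutableList<T>>(base.Remove(value));
    }

    private class NonNullValidator : IValueValidator<T>
    {
        public void EnsureValid(T value)
        {
            if (value == null)
                throw new ArgumentNullException("value");
        }
    }
}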
In most scenarios it seems I've been working with recently, classes such as Employee would be instantiated just the once and then not changed unless another query was executed that returned a new set of Employee data. But feasibly we may want to alter the Employee class such that it is "editable" in the same way that the DefinedStringList that we're talking about is - you can call methods that return a new instance of the class with the alteration made, leaving the original reference unaltered.
public class Employee
{
    public Employee(Name name, DefinedStringList roles)
    {
        if (name == null)
            throw new ArgumentNullException("name");
        if (roles == null)
            throw new ArgumentNullException("roles");

        Name = name;
        Roles = roles;
    }

    /// <summary>
    /// This will never be null
    /// </summary>
    public Name Name { get; private set; }

    /// <summary>
    /// This will never be null
    /// </summary>
    public DefinedStringList Roles { get; private set; }

    public Employee UpdateName(Name name)
    {
        // This will throw an exception for a null name reference
        return new Employee(name, Roles);
    }

    public Employee AddRole(string role)
    {
        // This will throw an exception for a null or empty role value
        return new Employee(Name, Roles.Add(role));
    }

    public Employee RemoveRole(string role)
    {
        return new Employee(Name, Roles.Remove(role));
    }
}
Here the name can be overwritten and roles can be added or removed. What's interesting about this approach is that returning new instances each time means you could persist a chain of changes - an undo history of sorts! I must admit that I've never taken advantage of this in any way, but it's often struck me that it could be useful in some situations..
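A trivial sketch of that undo idea (the names and roles here are obviously made up, and Stack<T> comes from System.Collections.Generic):

var history = new Stack<Employee>();
var employee = new Employee(new Name("Mr", "Andrew", "Test"), new DefinedStringList());

history.Push(employee);
employee = employee.AddRole("Tea Maker");

history.Push(employee);
employee = employee.UpdateName(new Name("Dr", "Andrew", "Test"));

// "Undo" the last change by popping the previous instance back off the stack -
// it's still there, exactly as it was, because nothing could have altered it
employee = history.Pop();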
While writing this post, I did a bit of research to try and make sure I wasn't saying anything either too done-to-death or too stupid and the following links are articles I like, largely because they agree with me! :)
Immutable data structures are the way of the future in C#
http://blogs.msdn.com/b/ericlippert/archive/2007/10/04/path-finding-using-a-in-c-3-0-part-two.aspx
One of reasons why immutable types can be faster is that they are optimized due to having dealt with memory management in years past
http://en.csharp-online.net/CSharp_Coding_Solutions-Immutable_Types_Are_Scalable_Types
However there's also this one:
The "verbose constructor" is itself a good candidate for an anti-pattern for the following reasons:
http://blog.dezfowler.com/2009/05/always-valid-entity-anti-pattern.html
I've worked with Derek before so although I read that article two or three times and couldn't agree with it, I didn't give up 'cos I know he's a bright guy. And what I think he meant finally clicked for me when I read the comments on that piece - there are only four and it's the last one that made it stick for me. Partly because someone I work with now has a similar view, I think. The way I see things working together is that the validation in these "verbose constructors" is a last line of defence that guarantees the object's state is valid; it isn't business logic where the intention is to throw a load of possibly-valid values at it and see what sticks. There should be a nice validation layer between the UI and these constructors that only allows through allowable state and handles the aggregation of errors where required. The exceptions in the constructor should still be just that; exceptions, not the norm for invalid UI input.
But in summary, I'm still all for these "verbose constructors" - as this final defense that allows us not to worry about instances of these immutable classes - if they exist, then they're valid. And I like that.
Since this code is a bit long to jam in the middle of the article, here it is in all its glory:
public class ImmutableList<T> : IEnumerable<T>
{
    private List<T> values;
    private IValueValidator<T> validator;

    public ImmutableList(IEnumerable<T> values, IValueValidator<T> validator)
    {
        if (values == null)
            throw new ArgumentNullException("values");

        var valuesList = new List<T>();
        foreach (var value in values)
        {
            if (validator != null)
            {
                try { validator.EnsureValid(value); }
                catch (Exception e)
                {
                    throw new ArgumentException("Invalid reference encountered in values", e);
                }
            }
            valuesList.Add(value);
        }
        this.values = valuesList;
        this.validator = validator;
    }

    public ImmutableList(IEnumerable<T> values) : this(values, null) { }

    public ImmutableList(IValueValidator<T> validator, params T[] values)
        : this((IEnumerable<T>)values, validator) { }

    public ImmutableList(params T[] values) : this(null, values) { }

    public T this[int index]
    {
        get
        {
            if ((index < 0) || (index >= this.values.Count))
                throw new ArgumentOutOfRangeException("index");
            return this.values[index];
        }
    }

    public int Count
    {
        get { return this.values.Count; }
    }

    public bool Contains(T value)
    {
        return this.values.Contains(value);
    }

    public ImmutableList<T> Add(T value)
    {
        if (this.validator != null)
        {
            try { this.validator.EnsureValid(value); }
            catch (Exception e)
            {
                throw new ArgumentException("Invalid value", e);
            }
        }
        var valuesNew = new List<T>();
        valuesNew.AddRange(this.values);
        valuesNew.Add(value);
        return new ImmutableList<T>()
        {
            values = valuesNew,
            validator = this.validator
        };
    }

    /// <summary>
    /// Removes the first occurrence of a specific object
    /// </summary>
    public ImmutableList<T> Remove(T value)
    {
        var valuesNew = new List<T>();
        valuesNew.AddRange(this.values);
        valuesNew.Remove(value);
        return new ImmutableList<T>()
        {
            values = valuesNew,
            validator = this.validator
        };
    }

    /// <summary>
    /// This is just a convenience method so that derived types can call Add, Remove, etc.. and return
    /// instances of themselves without having to pass that data back through a constructor which will
    /// check each value against the validator even though we already know they're valid! Note: This
    /// can only be used by derived classes that don't have any new requirements of any type - we're
    /// setting only the values and validator references here!
    /// </summary>
    protected static U toDerivedClass<U>(ImmutableList<T> list) where U : ImmutableList<T>, new()
    {
        if (list == null)
            throw new ArgumentNullException("list");

        // Use same trick as the above methods to cheat - we're changing the state of the object after
        // instantiation, but after returning from this method it can be considered immutable
        return new U()
        {
            values = list.values,
            validator = list.validator
        };
    }

    public IEnumerator<T> GetEnumerator()
    {
        return this.values.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
public interface IValueValidator<T>
{
    /// <summary>
    /// This will throw an exception for a value that does not pass validation requirements
    /// </summary>
    void EnsureValid(T value);
}
That's all the setup required to enable a DefinedStringList class, which we can build with:
public class DefinedStringList : ImmutableList<string>
{
    public DefinedStringList(IEnumerable<string> values)
        : base(values, new NonNullOrEmptyWrappingValueValidator()) { }

    public DefinedStringList(params string[] values) : this((IEnumerable<string>)values) { }

    public DefinedStringList() : this(new string[0]) { }

    public new DefinedStringList Add(string value)
    {
        return toDerivedClass<DefinedStringList>(base.Add(value));
    }

    public new DefinedStringList Remove(string value)
    {
        return toDerivedClass<DefinedStringList>(base.Remove(value));
    }

    private class NonNullOrEmptyWrappingValueValidator : IValueValidator<string>
    {
        public void EnsureValid(string value)
        {
            if ((value ?? "").Trim() == "")
                throw new ArgumentException("Null/empty value specified");
        }
    }
}
These are actually cut-down versions of classes I've got in one of my projects that also include AddRange, Insert, RemoveAt, Contains(T value, IEqualityComparer<T> comparer), etc..
A final side note(*) - you might notice that internally the ImmutableList does actually participate in some mutability! When calling the Add method, we validate the new value (if required), create a new instance of the class with no data and then assign its internal "values" and "validator" references, meaning we sidestep the looping over all of the data in the constructor - which is unnecessary since we already know the values are all valid; that's part of the point of the class! BTW, it feels like a bit of a trick updating these private references after creating the new instance and it's only possible because we've just created the instance ourselves and the new object is an instance of the class that is performing the work. I don't know if there's a phrase to describe this method and I was a bit surprised to discover it could be done since it has a feeling of breaking the "private" member contract!
* I don't want to go into too much detail since I want to talk about this further another time!
Update (26th November 2012): A re-visit of this principle can be seen in the post Persistent Immutable Lists which has an alternate implementation of the immutable list with improved performance but all of the immutability-based safety!
Posted at 20:14