Work Posts Résumé

How I Converted ADO.NET DataSets into Dapper data models

Parcel Details is a monolithic .NET web app written using the now deprecated Web Forms framework that Microsoft introduced in 2002. For data access this project uses a technology called ADO.NET DataSets. In this article I’ll walk you through how I silently replaced the entire data access layer of this web app with Dapper-based data models.

Create a new DataSet

To create a DataSet in Visual studio you can add it to your project like any other .cs or .cshtml file. Upon creating the DataSet you’ll be greeted with page that you can fill with drag-and-drop elements by right clicking and creating a new TableAdapter. These TableAdaptors are models of specific tables in your database.

TableAdapter Wizard

A wizard leads you through the process of picking out which database table you’d like to query, what operations (Create, Read, Update, and Delete) you’d like to perform on this table, and what SQL queries you specifically want to execute when you call the methods of this TableAdapter.

A real production DataSet

With the wizard completed a new little object will appear on your screen that shows the fields in the database table and the names of the methods you defined to execute specific SQL actions against this table.

Autogenerated DataSet code

In the background Visual Studio auto-generates a bunch of code to define the TableAdapter and its methods. If you modify the drag-and-drop object this auto-generation process will repeat and if you modify the auto-generated code this whole process gets unhappy.

A DataSet modification error

With the exception of the actual SQL statements, for which there is an additional SQL definition wizard, this is basically a no-code database interaction layer.

But I am a programmer; I like code.

I want more control and I want a data access layer that my coworkers and I can reason about with ease. This means reducing the amount of abstraction between the database and the data models in my application.

Hello Dapper

Micro-ORMs are a hot topic in many application domains, but specifically in .NET land there seems to be four schools of thought that I’ve encountered.

  1. The Microsoft Docs say to use EF Core.
  2. I don’t think EF Core’s features are necessary, let’s use something simpler (Dapper).
  3. I don’t like ORMs, and I know how to use basic ADO.NET connections, I’ll roll my own. (Re-inventing Dapper)
  4. What’s an ORM? Just make it work. Curse Microsoft for making this hard.

These are all valid responses. For greenfield applications #1 makes the most sense to me. Requirements can change rapidly and the flexibility that you can get from using a full ORM like EF Core can make pivoting to meet new customer requirements trivial. Plus the knowledge base and community that’s built up around EF Core over the past few years is pretty good.

For existing apps like my own I think #2 makes the most sense. I don’t want to reinvent the wheel. I don’t want to provide training or documentation on the boilerplate of my database interaction layer.

I would much rather point someone to a tutorial on using Dapper, and then once they’re comfortable with the library walk them through how I’ve mobilized its capabilities in my specific application. This saves me time and it increases the rate at which new developers can come up to speed on the codebase.

I had a coworker go with #3 on a large team project around the middle of last year. This individual is the most talented developer I work with and they have an intuitive grasp of our databases. I am not this good. Worse yet, the other developers we onboarded later in the project were from the camp described by #4 and struggled mightily to understand data models and how to write integration tests for them.

Getting back to the plot, Dapper is a Micro-ORM that offers a thin abstraction over the iDbConnection interface in .NET. If an ADO.NET database provider exists for the specific database you want to talk to then you can use Dapper. This is in contrast to EF Core which requires its own EF Core-specific database providers.

ADO.NET database providers have been around since the dawn of time, or at least the dawn of dotnet. One of our largest databases is an Oracle 11g instance. We also have a variety of MS-SQL databases that all have to play nice with our data models. Because EF Core doesn’t have a provider that supports our specific and admittedly quite old, Oracle database, I ruled it out. Thanks to its simplicity Dapper does not suffer from this flaw and I can use Oracle.ManagedDataAccess, an existing ADO.NET database provider.

If I need to switch a data access method from one database to another all I have to do is swap the name of the database provider for that specific method and then rewrite the SQL statement in a syntax compatible with the new database provider. The structure and responsibilities of the data access method stay the same. I find this quite satisfying from an object-oriented programming PoV.

DataSets to data models

I had seven DataSets that I needed to convert into data models. I started by creating a new DataAccess folder inside the project and then I created a folder for each of the databases this app needs to talk to.

For the sake of simplicity I kept the namespace of all of these models limited to {appName}.DataAccess rather than what Visual Studio defaults to when you create a new class {appName}.DataAccess.{databaseFolderName}, which mirrors the folder structure of the application. I find that this practice reduces the amount of cognitive load involved when referencing a data model while maintaining a workable folder structure as opposed to a grab bag of junk.

Extracting an SQL statement from a TableAdapter

Then I opened up one of the smaller DataSets that modeled just a couple of tables and went to work. When looking at a TableAdapter, if you click on the specific data access method you’re interested in; its name and the SQL statement it uses will appear in the properties panel in Visual Studio. From there you can copy the name, and extract the raw SQL it use to query the database.

Testing PL-SQL statements

I like to test SQL statements before I put them into code. Oracle’s SQL Developer application is the best tool I have to run queries against our Oracle 11g database and get immediate feedback.

For our MS-SQL database I used the venerable SQL Server Management Studio to fill the same role. I copied the SQL from the data access method, ran it against the database, and then looked at the results returned by query to verify that it worked right.

A real data model

Based on the result of the query I created a new class in C# and wrote up the getters and setters that mirror that result. As a matter of convention, I kept each of the data models in their own class file, although they all shared the DataAccess namespace. Getters and Setter are great but we need a data access method to make this data model useful.

For this I followed the first example from this Dapper Tutorials site. It’s a static method that returns an instance of the data model. It wraps the Dapper call in a using block that creates an instance of the ADO.NET database provider that Dapper then uses to execute the SQL statement that was defined in this method as a string and passed into the Dapper call as its first parameter.

What Dapper returns to your method can be one of four things:

  1. A data model with data in it.
  2. A data model with nothing in it.
  3. Null
  4. An exception.

#3 and #4 are configurable within Dapper based on which query you use.

For example QueryFirstOrDefault will never return an exception but QuerySingle will return an exception if the database responded with more than one row or it responds with no rows.

This is a pick your poison moment, but I prefer to write my flow control around checking for specific ranges of valid values as opposed to checking for nulls or wrapping calls in try catch statements to handle exceptions thrown by the data model.

Often I’ll find myself using the null-coalescing operator to check for the OrDefault response and then creating an instance of the data model where I just return just the parameters that were passed in.

Using the data models

Alright, now I’ve got a data model that returns the same data as the TableAdapter I built it to replace. But how do you actually perform the swap?

Step one is to use the Find All command to locate all of the places in the app where the name of that specific TableAdapter is mentioned. Then you can work through them on a case by case basis; be forewarned this isn’t a copy and replace kind of solution, rather this will require a bit of thinking.

After and before

On the left we have the version of this function where I’ve used a data model and on the right we have the older version that uses a TableAdapter. The key points to note here are that the version with a data model is more succinct, doesn’t require nested using blocks, a DataTable, or DataRows. The data model is a much simpler abstraction which makes it easier to pick out flaws like bad variable naming (ex. OtherAssessments and otherAssessements), oops.

It also allows me to make better use of C#’s static typing to prevent errors. If you compare how the “local” string is set inside each foreach loop you can see that in the version on the right I have to treat the DataRow like an array and I’m calling the ToString method because who knows what’s really in row[2]? In the version on the left I’m just grabbing a property of the data model and I don’t have to do any conversion because it’s already a string. In this way the data model makes it easier to reason about the data it contains and prevent simple mistakes.

To flip this on its head for a moment, the use of relatively complicated structures like a TableAdapter and a DataTable didn’t help the original author to write better code. Worse yet, they actively made it difficult for subsequent programmers to reason about and modify the existing code. There certainly can be advantages to adding additional abstractions, but abstraction have a cost. There’s an engineering trade off being made here between complexity and abstraction.

It took about two weeks for me to convert all seven DataSets into forty-one data models and replace their use in this web app’s code base. Of course, it wasn’t a completely painless transition. In the week following the introduction of this new DataAccess layer I triaged about 7 new bugs that our telemetry picked up in production. These issues were mostly due to the data models attempting to serialize poorly formatted data from the database into static types.

Specifically, one table kept returning poorly formatted dates that would occasionally fail to be serialized into DateTimes. The solution here was to serialize it as a private string and then use the DateTime.TryParse() technique to set a similarly named public field in the data model.

Is there really anything wrong with DataSets and their TableAdapters? Well, no. They are a perfectly functional way to interact with a database inside of a C# app. But for Parcel Details and my team, working with data models is a much easier task than maintaining and extending these increasingly grumpy DataSets.

Not only is it possible to convert DataSets into data models in a reasonable amount of time, but it’s kind of fun.