Monday, January 14, 2013

Data Migration Framework Overview & Shortcomings

When I attended the AX Technical Conference in Seattle last October, I was mildly impressed with some of the new features coming into the AX circle.  One of the tools that caught my attention was the Data Migration Framework (DMF), which was, and still is, in its 'beta' phase.  Now, I've only been exposed to AX implementations since this past August, when I started this job as a technical consultant for a Microsoft Gold Partner, so my knowledge of what data migrations require is still quite limited at this point.  In fact, the partner I work for has typically gone through another partner to handle data migrations.  However, in an effort to further grow our firm, my boss asked me to undertake the task of researching data migration and the tools that are available to us.

I did a little research and, like other AX-related Google searches, found very little.  However, since I had been to the Technical Conference, I knew about DMF and some of the theory behind it.  From my understanding, if you are moving from one version of AX to a newer version of AX, 'Data Import' is what should be used.  However, if you are moving from a non-AX system to AX, the DMF is the recommended solution (minus the fact that it is still in Beta 2.0 AND Microsoft doesn't provide support for it).  After playing with the DMF, though, I've found it to be very buggy and wouldn't recommend it just yet.  Like anything else new that Microsoft has released, it's really great in theory; there are just too many things that are not quite there yet, which I imagine will be fixed in future iterations.

For now, I recommend using AIF instead.  I will be providing a similar analysis of AIF in the near future; follow me on Twitter for that update and other AX-related tweets.  In the meantime, I will give you a general overview of the process I used to try out the tool and what I've found along the way.

Installation


The first step is the installation of the tool.  There is an article that helps with this, because it's not an intuitive install.  In addition to executing DmfSetup.exe, which MUST be 'Run as Administrator', you also have to install an .axmodel file (which is extracted from the setup) using AxUtil or its PowerShell equivalent.  The setup neither informs you of this nor just does it for you; that would make too much sense.  The article I mentioned above walks you through it.  Once the model is installed, you have to wait 8+ hours for a full AOT compile and also have to kick off a full CIL compile.  Having completed that, and after roughly two work days of waiting, the 'Data migration framework' shows up as a module in the navigation pane of AX.
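For reference, importing the extracted model from an AX management utilities command prompt looks something like the line below (the path is a placeholder, so double-check it against the article; there is also an equivalent Install-AXModel PowerShell cmdlet if you prefer that route):

    axutil import /file:<path to the extracted DMF .axmodel>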

Setup


The next step is setup, and there isn't too much to it.  The first thing you must do is set up a shared directory where your source files will reside.  This is configured in 'Data migration framework parameters' (pictured below).  You must click the Validate button to continue using DMF.  At first I was getting an error: "Unable to load file. AOS user does not have access to shared folder."  This was resolved by adding permissions to the shared directory I was selecting.  To see which user should be granted access, look at the AOS service and figure out what account it runs under.  Once the proper permissions were added, the shared directory validated.  I didn't even need to restart the AOS; that's always a bonus.


Another part of the setup is 'Source data formats'.  This is where you specify what your input file format will be.  For instance, I am primarily working with tab-delimited files, so as pictured below, you can see where I specified that.  Other input options include fixed-width data files and a direct database connection through ODBC.  There is another option called AX which I am unsure about at this point, so I won't pretend to know what it does.  We can make a good guess as to how ODBC works, but I won't discuss it either; in my case, I don't have an environment with direct access to the source database, so that option won't work.  In the meantime, I'm sticking with delimited flat files.


Introduction


Now that we are set to begin, I will start with a high-level overview of the theory behind how data progresses through DMF.  You have the "Source", the "Stage", and the "Target".  The Source, in my case, is the flat file I will be using.  The source file's data migrates into one flat temporary staging table within AX.  From the staging table, the data migrates to a target.  In most, if not all, cases, that target is multiple tables that are related to each other.

Because DMF ships with many pre-defined entities (table mappings), 'Target entities' (pictured below) is the easiest thing to look at right away; this is the relationship between Stage and Target.


This is a listing of all the mappings that ship with DMF.  You can also define your own custom entities here if you have custom data that must be imported into AX, but I won't go over that.  The interesting thing to note is the AOT objects referenced here: the Staging table is a table in AX, the Entity class is a class within AX, and the Target entity is a query object within AX.  So, if we click on 'Entity structure', we will see the same structure that the query object defines.


This is the magic behind how data gets from one flat staging table to multiple target tables, but something is still missing: we need to map fields from the staging table to the target fields.  Luckily, the target entities that ship with DMF already have these defined.  Click on 'Modify target mapping'.


This looks chaotic, but it has to be; it is a complex set of table relations.  The lucky part is that it is already defined.  Truthfully, the hard part will be getting the data into the staging table.

One thing to note here, which I won't elaborate on too much, is the three boxes in the middle.  These are three methods located in the Entity class I referred to earlier.  They are created for complex data transformations that are needed before the data reaches the target tables.  For instance, you see the GenerateDefaultDimension method; this exists because AX 2012 handles financial dimensions using a very different table structure than previous versions.
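To give an idea of what a method like GenerateDefaultDimension has to accomplish, here is a minimal X++ sketch of turning a single dimension name/value pair into a DefaultDimension RecId using the standard AX 2012 dimension framework.  To be clear, this is my own illustration, not the actual DMF code; the method name and parameters are made up:

    private DimensionDefault buildDefaultDimension(Name _dimensionName, DimensionValue _value)
    {
        DimensionAttributeValueSetStorage storage = new DimensionAttributeValueSetStorage();
        DimensionAttribute dimAttribute = DimensionAttribute::findByName(_dimensionName);
        DimensionAttributeValue dimAttributeValue;

        if (dimAttribute.RecId && _value)
        {
            // Look up the attribute value, creating it if it doesn't exist yet
            dimAttributeValue = DimensionAttributeValue::findByDimensionAttributeAndValue(dimAttribute, _value, false, true);
            storage.addItem(dimAttributeValue);
        }

        // Persist the value set; the returned RecId is what a DefaultDimension field stores
        return storage.save();
    }

The staging table only holds the raw dimension strings; a transformation along these lines is what turns them into the surrogate key the target tables actually store.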

Looking at this, you might be confused as to why the target shows up as one table and not multiple, as I mentioned before.  The simple explanation is that this is built off a query object.  Imagine executing this query against the database: you would get one result set back, and that is what this is.  AX manages this much like it handles table buffers in forms and form data sources; it might look like one grid of data, but it can come from multiple sources.  The same applies here, except that the data flows in the opposite direction.  Notice how ItemId in the Staging table is mapped to multiple Target fields; this is because ItemId exists as a field in multiple tables of the target.
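If the query-object concept is new to you, here is a quick X++ illustration of the idea: two joined tables, one result set.  This is not the actual Items target query, just the same mechanics, using InventTable and InventTableModule since ItemId lives in both:

    static void joinedQuerySketch(Args _args)
    {
        Query                   query = new Query();
        QueryBuildDataSource    inventTableDs;
        QueryBuildDataSource    moduleDs;
        QueryRun                queryRun;
        InventTable             inventTable;
        InventTableModule       inventTableModule;

        inventTableDs = query.addDataSource(tableNum(InventTable));
        moduleDs = inventTableDs.addDataSource(tableNum(InventTableModule));
        moduleDs.relations(true); // join on the AOT relation, which includes ItemId

        queryRun = new QueryRun(query);
        while (queryRun.next())
        {
            // One row in the result set, but the data comes from two table buffers
            inventTable = queryRun.get(tableNum(InventTable));
            inventTableModule = queryRun.get(tableNum(InventTableModule));
            info(strFmt("%1 - %2", inventTable.ItemId, inventTableModule.ModuleType));
        }
    }

Run a query like that and the output looks like one grid, even though every row is stitched together from more than one table; DMF's target entity works the same way, just with the data flowing into the tables instead of out of them.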

That pretty much wraps up the introduction and everything that ships out of the box.  Everything beyond this point we have to custom-define and execute manually, starting with how the data gets from the Source to the Staging table.

Source-to-Stage


Now we are ready to start migrating data from the Source to the Staging table.  First, we will need to create a 'Processing group'.  This is basically just a way to group all the types of data you will be processing together.  For instance, we will be trying to import data relating to items, so we will create a group for that.


Notice in the example above that the Entities button is the only option available to us (never mind the 'Export to' option).  If we click on Entities, it brings up another form.  This is basically where we set up everything related to the source file.


If we add a record here, we need to specify an Entity, a data format, and the path to a sample file.  I selected Items as the Entity, as we are importing items.  I also chose the Tab format we set up earlier, since my flat file is tab-delimited.  That is the only setup needed as far as the source file is concerned.  The next part is to define the table mappings between the Source and the Staging table.

The first thing that must be done is the 'Generate source mapping' option at the top.  This sets up your Source, giving it the ability to be mapped.  Be careful about pressing this later on; it will reset all your mappings to the default.  Most often, the column names in your source file won't match AX naming conventions, so nothing may map at all by default.  In fact, you may get a warning: "There are field(s) which are not mapped to Entity field(s)".  This is okay and expected.

Now we have the ability to change the default mappings by clicking 'Modify source mapping'.  As pictured below, we can see that no fields were mapped by default.


So, from here, I can drag from any black dot next to a column name on the Source side and drop it on a black dot next to a column in the Staging table.  I added only a few mappings just to keep this example simple.


One thing to note here: there is an option at the top to switch to a 'Mapping details' view.  It shows exactly what you see in the visualization, only in grid format.


If you select one of the source fields, you can click the Conversion button.  Here, you can define conversions, or transformations, that will be applied to values coming from the source file.  In my example, I have a column called 'type' which is mapped to the ItemType field.  AX expects either 'Item' or 'Service' as valid values; however, my flat file only has 'I' and 'S', which the ItemType field within AX will not know how to handle.  So, we can add a conversion to say: when the value is 'S', change it to 'Service'; when it is 'I', change it to 'Item'.  This is a very neat feature of DMF.
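Under the covers, a conversion is really just a value map.  If you had to do the same thing yourself in X++, it would look something like this hypothetical helper (assuming the staging field uses the standard ItemType enum):

    private ItemType convertItemType(str _sourceValue)
    {
        switch (_sourceValue)
        {
            case 'I':
                return ItemType::Item;      // my file's shorthand for an item
            case 'S':
                return ItemType::Service;   // my file's shorthand for a service
            default:
                // Note: DMF's Conversion form has no fallback like this; see shortcoming 3 below
                throw error(strFmt("Unrecognized item type value: %1", _sourceValue));
        }
    }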

Also, don't forget to hit Save before you exit; a message will warn you if you didn't.  Now that the mappings have been defined, we can hit 'Validate', which makes sure that all the required fields are mapped.  By this point we should be set, but before we move on, we should preview what the data will look like when it goes into the Staging table.  Click 'Preview source file', and a grid should appear in the Preview pane, pictured below.


At this point, it looks great!  We are now ready to load data into the Staging table.  We can close out of the Entities form and a new option appears in our Processing group, 'Get staging data'.


Go ahead and click it.  We are asked to supply a Job ID; the default value will do.  The Job ID is used to track which data was processed successfully and which failed with errors.  We could run more jobs later and still be able to tell what errors came out of the initial run.


A data execution form appears and lets us know what is expected to run and what hasn't run yet.  We can click Run, and a batch dialog appears.  At this point, we're not going to batch process this, so we can just leave the defaults and run it right away.  An infolog message shows "'12' 'Items' record(s) inserted in staging", or however many records were inserted.  If there were any errors, it would let you know; for more specifics on the errors, you can open the Execution history form to see the log.

Now, the data should all be loaded into the Staging table.  We can verify this easily by looking at the DMF table in the AOT.
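If you'd rather not dig through the AOT, a quick job can confirm the row count.  I'm using a staging table name of DMFItemEntity here, which is hypothetical; check the 'Staging table' column on the Target entities form for the actual name your entity uses:

    static void verifyItemStaging(Args _args)
    {
        DMFItemEntity staging; // hypothetical table name - confirm it in Target entities

        select count(RecId) from staging;
        info(strFmt("%1 record(s) in the Items staging table", staging.RecId));
    }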

Stage-to-Target


Back at the 'Processing groups' form, we now see all options available to us.  The option 'Copy data to target' is the one we are interested in next.  When we click it, we are prompted for the Job ID we were using earlier.  If we had run several jobs and loaded different pieces of data, we could easily segregate the migration by choosing only specific jobs.  There are also options to run only the records that previously had errors, or records picked by user selection.  Click OK to continue.

Another execution form appears that is similar to the Staging table execution.  If we click Run, it will start processing records.

Unfortunately, this is where my tutorial ends.  I get a stack trace error and cannot seem to continue.  If I someday figure out why this happens, or Microsoft fixes this bug, I will update this post.  But this should help get you started if you are interested in looking to the future, setting up DMF, and playing around with it a bit.

Shortcomings


In reviewing this tool, I found a few things that I would call shortcomings of this software.  These are issues that Microsoft should definitely have a look at.

1) Stack Trace Error:
In the last step, I mentioned a stack trace error I've been getting.  When I run 'Copy data to target', my debugger pops up with the following error:
Error executing code: The field with ID '0' does not exist in table 'Common'.
Stack trace
(C)\Classes\DMFEntityWriter\write - line 534
(C)\Classes\DMFEntityWriter\run - line 83
(C)\Classes\DMFEntityWriter\main - line 24
(C)\Classes\xMenuFunction\run
(C)\Classes\MenuFunction\run - line 87
(C)\Forms\DMFWriteData\Designs\DesignList\DMFEntityWriterBatch\Methods\Clicked - line 16
It sounds like a very generic error, but considering the Stage-to-Target piece is out-of-the-box functionality, this should never happen in any case.

2) Cannot Map One-to-Many
When setting up the Source file mapping, it appears that you are able to map one Source field to multiple Stage fields.  However, when you attempt to run the process that copies the data into the Staging table, an error appears:
Error updating flat file source properties-Exception from HRESULT: 0xC0207015
After Googling that exception code, all signs pointed to multiple columns having the same name.  Of course, I was positive I didn't have two columns named the same, and double-checking confirmed there weren't any; the real cause was the one-to-many mapping on the Source.  After I removed that mapping, the error went away.  Rather than keep fighting DMF, I decided it would be easier to create a duplicate column in the source flat file itself, with a different column name of course, and use the duplicated column for the second mapping instead.

3) No Defaulted Values
Although we can specify Conversions for each field in the mappings, there is no option to specify a default value.  More often than not, I'd assume, the data you are trying to import will not match what AX is expecting.  There should be an option to fall back to a default value when none of the Conversions match; this would allow for a much cleaner and smoother data migration.

4) Source Data Types Not Recognized
When we initially generate the source mapping of a flat file, all columns show up as String data types, regardless of whether the values in a column are all numeric or all dates.  So, when you try to map these String fields to Date, Real, or Int fields in the Staging table, an error pops up letting you know that the types don't match.  This is a big issue!

The only workaround I can think of is to somehow load your flat file into a database and then use an ODBC connection to read your source data, but that seems like a lot of work just to avoid something that should already be working.

5) Mandatory Fields Not Complete
When DMF is shipped, it comes with all the DMF* Staging tables.  These tables have certain fields marked as Mandatory in the AOT properties.  However, some of the Entities, and the tables related to them, actually require more fields than what is marked mandatory in these DMF* Staging tables.  This is something the user must keep in mind: even if your Source validates and successfully gets to the Staging table, it may not successfully get to the Target tables, because it might be missing required fields that nothing informed you of earlier.