Monday, March 21, 2011

Unintentional DB Synchronize before an XPO Import

The last thing anyone wants to do is cause their business to drop to their knees and force their entire customer service department to hand-take orders for a full day.  Well that's what happened when a co-worker of mine who was simply trying to do a code compare the day before a weekly build.  If you, also, are a business who cares enough to take precaution enough to compare code before moving code from a test environment to a production environment, then you may want to listen up.

Our business has 3 environments, namely: DEV, TEST, and PROD.  When promoting code through the environments, we export code from one environment using into an XPO file, and then import it into the next environment.  Along the way, during each import, we compare the code first so we ensure there are no surprises with the outcome.  The primary process of comparing code is done using the Import form as if we were actually importing code, only in this case we would just cancel out before committing the actual changes.  In the special case of our PROD environment, we plan our build (code changes) ahead of time and only import code off hours, typically anywhere from 1am-3am when our customer service department is closed.  However, we compare the code well before our builds, usually the day before during normal business hours, so we can have time to correct any differences we don't expect.  One of these times, we were attempting to do a code compare on our SalesLine table and the AX client session locked up and the DBAs told us that our SPID put an exclusive lock on it and dropped all the indexes.  After a day of hell and rebuilding indexes, I uncovered something disturbing about what automatically happens behind the scenes during an import (or code compare).

Unfortunately, us AX developers are forced to bow down to the almighty kernel and just trust that what it does is right, but it is even worse that we cannot even see what the code does.  After much debugging and using SQL Profiler (to see what SQL gets emitted from DAX and its kernel), I found that before a code compare happens, it does a Synchronize if it is a table (more specifically, in the Import form when you right click on a table but before the context menu pops up, giving you the option, Compare).  To the typical AX developer, this doesn't seem like a big deal, but this is a very unsafe operation that you do NOT want to be invoked in such a reckless manner.

A Synchronize is intended to keep your database schema in sync with what it shows in the AOT.  Most often, the database will probably already be up to date, but sometimes it might get out of sync, like when you modify fields or indexes.  I won't walk through each of those cases because there are many and probably more than I even know about.  But during a sync, it is possible to have this happen:
  • Put an exclusive lock on the table being sync'd
    • CREATE TABLE X<TableId>X ...
      • Creates a new table with the same schema, where <TableId> is the ID of the table being sync'd
    • INSERT INTO X<TableId>X SELECT * FROM <Table>
      • Notice: No DataAreaId, so your indexes will be of no use
    • DROP TABLE <Table>
    • SP_RENAME X<TableId>X <Table>
      • Renames the newly created copy of the table back to the original name
    • CREATE INDEX ...
      • Creates all the indexes that are defined for this table
  • Remove the exclusive lock
Now I don't know about you, but I would not want this to happen to a table like SalesLine, which it did for us.  The 30 million rows we have in that table didn't do justice for the INSERT INTO statement using SELECT * FROM SalesLine.  Our DBAs ended up killing the SPID after getting calls from several of our departments that touch anything sales-related.  Somewhere in the process of killing it halfway, it left SalesLine without any indexes!

Because we don't have that access to the kernel code to see "exactly" what code runs, we're not entirely sure what decision points that made it happen, it still remains a mystery.  Regardless, we are talking about business, we want to do everything to prevent this from happening again.  Granted, we still want our database to be in sync, but wouldn't it make more sense to perform the Synchronize off-hours when it isn't used?

I was shocked to see that something as innocent as a code compare could cause a Synchronize without asking you first, much less a notification.  After debugging a ton, I narrowed it down to one line that calls into the kernel.  From there, the call stack eventually gets to the Synchronize portion.  But this is the only line that is editable outside of the kernel to stop this from happening.  It is in a class/method called, SysImportElements.buildTreeContextMenu(), the line that calls it is:


infolog.endImport(exportId, 1);

You can choose to comment it out and it won't cause a Synchronize during an import.  A better solution would be to wrap a dialog box around it, that way you have the choice every time.  An example we use is below:

if (tmpImportAot.UtilElementType == UtilElementType::Table
    && box::yesNo(strfmt("Do you really want to syncronize: %1?", tmpImportAot.TreeNodeName), DialogButton::Yes))
    infolog.endImport(exportId, 1);

It is unclear if there are side effects when you just ignore the infolog.endImport() call entirely.  We don't know if there are objects in memory that are normally destructed or disposed during this call.  We just don't know, we can't see what the kernel does, but we know that by ignoring it, we are putting a stop to the Synchronize.

UPDATE 2012-08-01: We have discovered another case where the Import form issues a Synchronize without warning.  We had a project XPO with a number of objects within them.  We had only planned on importing some, not all, of the objects as this was to be a build that was a follow-up from another build (an original fix didn't work).  So, we were importing the same project from before but with more objects.  But, we had no plans to import the whole thing, just the few objects that have changed.  So, during the Import, we "checked" only the objects we wanted to import, along with the Shared Project defintion object, and we imported it.  For some reason, a Synchronize was issued for all Table objects within the XPO even though none of them were checked for Import.  I don't have the resources right now to look into exactly why this happens, but I'm guessing that somewhere, another infolog.endImport() is getting called and is ignoring what objects are truly getting imported.  My initial thought is, there is probably no workaround for this.  The only thing that comes to mind, is that when we Import, we must make sure that the XPO in question contains ONLY what we plan to change, nothing more.

No comments:

Post a Comment