Monday, February 22, 2010

AT Transition Update

I apologize for the long delay in posting to the blog. I have spent the better part of the last several months addressing several (thousand) issues raised by our legacy data migration and plug-in development. Today, though, I am happy to announce that the day has finally come where we, Manuscripts and Archives, have finally made the jump. We are now fully in and committed the AT! Unfortunately, much still remains to be worked out.

First off, we have to fix thousands of errors reported out from our programmatic efforts to match legacy instance information to resources in the AT. These were mostly the result of errors in our finding aids and, less often, errors in our location database. These have mostly been addressed, leaving only the truly disparate errors that need a good deal of research into how collections were processed and, often times, examining/verifying box contents.

Another major category of clean-up work stems from our QC of data sent back from our consultant where we found that, with some overlap with our other error logs, several thousand barcodes from our locations database did not come into the AT. Thankfully, many of these can be easily diagnosed with our Box Lookup plug-in and fixed with our Assign Container Information plug-in. Others require more in-depth problem-solving. For those large collections where things just went haywire, or for those small collections with only a few boxes, we've decided to delete the resource, re-import and then re-assign instance information using the Assign Container Information plug-in. Other collections will require deleting one or more components, importing those components in a dummy resource, transferring said components back into the original resource, and then re-assigning container information.

A third major challenge is cleaning up restriction flags we've assigned instances based on notes in our locations database. Our locations database had a variety of notes both at the collection/series/accession level and at the item level. Since these notes were wildly inconsistent and unable to be easily parsed, we created blanket restrictions for instances based on the notes. As a result, we have to review the restrictions assigned, verifying those that need to be restricted are and fixing those that are open. Thankfully, these errors can easily be fixed with our Box Lookup and Assign Container Information plug-ins.

Aside from these data errors, which are our first priority, we also have to finalize workflows, procedures, and documentation for accessioning, arrangement and description, and collections management. Although equally critical to our day-to-day operations, these were put off until we were in the AT so that we could fully model what needed to be done.

So, although we've made such great progress up until this point, much remains to be done, much needs to be resolved. This is more or the less the lasting impression of the project. For other large institutions planning similar migration projects, I can't say enough just how much work is involved and how important it is to get staff, especially technical staff, on board. For those institutions without technical support and dedicated staff, it is probably best to hire a consultant, especially when it comes to legacy data (e.g. instance) migration and customizations to the AT.


  1. Thanks so much for sharing this with the world.

  2. Follow-up to mere "thanks."

    It's important to see the amount of work and effort that is required to make data clean. The benefits will become apparent. For those of us with lots of legacy data, the improvements are more remarkable once the data is available, but they are certainly hard-earned.

    So thanks for being a pioneer!