Friday, March 12, 2010

Locations

Creating locations in the AT is fairly straightforward, especially for simple numeric ranges. AT's batch location creation tool is very handy here. Creating more complex locations, especially those with alphanumeric components, however, cannot be done in the AT. This was unfortunate for us because although the vast majority of our material is stored offsite, the number of distinct onsite locations we have is still close to 3000 given the unique combination of rooms, ranges, sections, and shelves. Rather than try to generate these manually in the AT, I chose to create them in Excel and then paste directly into the AT tables. Here's how I did it.

First (after having surveyed all of our locations to come up with room numbers, ranges, sections, and shelves), I opened the LocationsTable in Navicat and copied the last record into Excel. [note: we had entered in some locations already.] I then used this record to structure and format subsequent location entries and assign sequential locationIds. Excel's fill-in feature is extremely helpful here, especially for alphanumeric data. Simply typing in 1A1, for example, and then dragging the fill-in cross-hairs to the appropriate point, allows you to quickly generate values 1A1, 1A2,...1AX. Once finished adding in all your data, your spreadsheet should look like this: sample (.xls).

The final step is to paste the records into the LocationTable via Navicat. [note: as I explain in my post on creators, you need to make sure you are the only one using the AT when pasting in data or looking up ids. Again, this is to prevent overlap or duplication of work.] Once finished, check your work in the Locations module (Tools > Locations), making sure you can create additional (i.e. test) locations.

Thursday, March 11, 2010

AT Issues: Creators

Another pesky problem we've encountered with the AT EAD import process is that finding aids not encoded with a <corpname/> or <persname/> or <famname/> in <origination label="creator"/> do not get names created with the assigned function of creator in resource records. Unfortunately for us, this amounted to the vast majority of our finding aids, some 1600+ records. Rather than fix these one at a time in the AT, we came up with the following workaround.

1. Generate list of filenames and creators from our EAD files that lack <corpname/> or <persname/> or <famname/> in <origination/>. To do this Mark Matienzo, our new digital archivist, wrote a script for us that parsed our EAD files and dumped this data into an Excel file.

2. Format filenames as needed in separate spreadsheet (e.g. remove .xml extension, add quotes) to create list as follows:

"mssa.ms.0001",
"mssa.ms.0002",
...

3. Use Navicat to run query generating list of resourceIds based on formatted filenames (eadFaUniqueIdentifier in AT Resources table):

SELECT eadFaUniqueIdentifier, resourceId FROM Resources WHERE eadFaUniqueIdentifier IN ("mssa.ms.0001", mssa.ms.002", "mssa.ru.9019") ORDER BY eadFaUniqueIdentifier;

4. Copy resourceIds back into Excel file

5. Format names in separate spreadsheet (add quotes) to create list as follows:

"Yale University. Office of the Secretary.",
"Yale University. Office of the President.",
...

6. Create query in Navicat to retrieve nameids for any names that did make it into the AT:

SELECT nameId, sortName FROM Names WHERE sortName IN ("Yale University. Office of the Secretary.", "Yale University. Office of the President.", "Yale University. Office of the Provost") ORDER BY sortName;

7. Paste nameIds into master file with resourceIds and corpnames, famnames and persnames.

For those records with nameIds, proceed to step 13. Otherwise, for those names not present in the name table, proceed to step 8.

8. Open Names table in Navicat and proceed to last record

9. Copy last record in Names table into Excel to serve as model, noting column names. It is important to note that corpnames and persnames use different columns/fields so make sure to examine records in Names table for formatting both persnames and corpnames.

10. Copy contents of your master original file with persnames, famnames and corpnames into the new spreadsheet according to the Name table structure. Use Excel's fill-in feature to fill in data as needed and assign sequential nameIds from last record in table. Make sure to format the lastUpdated and created fields as text in Excel so as to mirror date encoding in the AT. Here is what your spreadsheet should look like: sample (.xls).

11. Paste records into Names table using Navicat

12. Copy nameIds for newly created names back into your master file of resourceIds and corpnames and persnames

13. Open ArchDescriptionNames table in Navicat

14. Go to last record and copy into new Excel file to serve as model for formatting data

15. Copy contents of master file with resourceIds, corpnames, famnames, and persnames, and nameids into Excel file mirrororing structre of sample record. Use Excel's fill-in feature to format data and assign sequential ids from last record in table. Here is a what your spreadsheet should look like: sample (.xls).

16. Paste contents from Excel file into ArchDescriptionNames table using Navicat

A couple of caveats. First, it is important that these steps be taken by someone comfortable with the aforementioned programs, as well as MySQL. Second, and most important, id creation and pasting of data directly into the AT tables should be done when no one else is using to AT to prevent accidental overlap or duplication of work. Finally, make sure to test the results, including creating new names and resource creator links in the AT client just to make sure everything is ok. If something does go haywire, you can always delete the records from the AT tables you just pasted in.