MARC Report and XML

The program's support for XML has undergone, and will continue to undergo, many changes. The purpose of this page is to provide an overview of the current status of XML support in MARC Report.

For a list of all articles on this site pertaining to MARC/XML, type 'xml' in the search box on the left and press <Enter>.

MARCXML Support

There are several reasons why MARCXML is important. The main thing to remember is that it moves our bibliographic data quickly, and without loss, into the international world of XML.

MARC Report has built-in support for two MARCXML conversions (under the Utilities menu): 'MARC to XML' (export) and 'XML to MARC' (import).

These conversions require no external resources, such as schemas or stylesheets, and have been designed as batch operations that will attain very good performance on large files1). The latest version of the program (233) also includes an option to export 'qualified' MARCXML, and to automatically recognize qualified MARCXML during import.
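To give a feel for what the MARC to MARCXML step involves, here is a minimal Python sketch. It is not MARC Report's own code: the function name marc_to_marcxml is made up for this illustration, the record is assumed to be UTF-8, and 'qualified' is assumed to mean that every element is placed in the LC MARC21 slim namespace (serialized here with a 'marc:' prefix).

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)  # serialize qualified names as marc:...

FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator
SF = b"\x1f"  # subfield delimiter


def marc_to_marcxml(raw, qualified=True):
    """Convert one ISO 2709 record (bytes, UTF-8 assumed) to a <record> element."""
    def name(local):
        # Assumption: 'qualified' means namespace-qualified element names.
        return "{%s}%s" % (MARC_NS, local) if qualified else local

    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])        # base address of the field data
    directory = raw[24:base - 1]     # the directory ends with a field terminator

    record = ET.Element(name("record"))
    ET.SubElement(record, name("leader")).text = leader

    # Each directory entry is 12 bytes: tag (3), field length (4), start (5).
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # drop terminator
        if tag < "010":
            # Control fields (001-009) have no indicators or subfields.
            field = ET.SubElement(record, name("controlfield"), {"tag": tag})
            field.text = data.decode("utf-8")
        else:
            chunks = data.split(SF)
            indicators = chunks[0].decode("ascii")
            field = ET.SubElement(
                record, name("datafield"),
                {"tag": tag, "ind1": indicators[0], "ind2": indicators[1]},
            )
            for chunk in chunks[1:]:
                if chunk:  # skip empty chunks from doubled delimiters
                    sub = ET.SubElement(field, name("subfield"),
                                        {"code": chr(chunk[0])})
                    sub.text = chunk[1:].decode("utf-8")
    return record
```

Calling the function with qualified=False on the same record would produce the same structure with plain element names and no prefix.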

'Other' XML Support

Exporting MARC to XML

MARC Report also supports conversion of MARCXML into other XML types, subject to the availability of third-party stylesheets. Therefore, it may be possible to use the 'MARC to XML' utility to export your database to MODS, or to Dublin Core. The scenario we want to support looks like this:

 MARC --> MARCXML --> OTHER XML

The first conversion (MARC to MARCXML) is handled by MARC Report's built-in converter, and the second conversion (MARCXML to OTHER XML) is handled by the stylesheet selected by the user.

For a conversion to 'Other' XML to be successful, therefore, a stylesheet that converts MARCXML to another XML is required. The two primary XML targets currently supported by MARC Report are MODS/MADS and simple DC. There are a number of stylesheets for these XMLs available on the LC MARCXML site, and several others can be found at the linked pages for MODS and MADS. Note that, as LC refers to MARCXML as 'MARC21Slim', the filenames of the supported stylesheets are named accordingly: MARC21slim2MODS3-3.xsl, MARC21slim2OAIDC.xsl, etc.

It may also be possible for the user to convert MARC to another XML using a custom stylesheet. Theoretically, as long as the custom stylesheet converts MARCXML to the target XML, the conversion should be supported by MARC Report. However, in practice, we have found that this is not always the case. If you run into a problem here, we would greatly appreciate it if you would email the stylesheet and a sample of the XML to us so that we can improve the program.

Importing XML to MARC

Conversions in the other direction (importing XML to MARC) using stylesheets are also possible:

 OTHER XML --> MARCXML  --> MARC

Here the intermediary stylesheet must support a conversion of the Other XML into MARCXML; once that is accomplished, MARC Report's built-in converter will take over. Stylesheets that fit into this scenario are also available on the LC link listed above, though they are not as numerous as in the export direction2).

Screenshots and options

Exporting MARC

To export MARC to XML, start the program, select the MARC file that you want to convert, then select the corresponding option from the Utilities menu. You will be greeted with the following screen:

Click on the XML button to set up a filename for the results. By default, the program will construct a result filename for you by appending '.xml' to the MARC filename.

Select the corresponding checkbox if you want to export qualified MARCXML3).

If you are not converting to another XML, at this point just click the 'Start conversion' button. But if you are, first you must select the stylesheet to use (see Stylesheets below).

Importing XML

To import XML to MARC, start the program, and select the 'XML to MARC' option from the Utilities menu. You will be greeted with the following screen:

Click on the XML button to select the XML file that you want to import, then set up a filename for the results. By default, the program will construct a result filename for you by appending '.mrc' to the XML filename.

If you are importing from MARCXML, just click the 'Start conversion' button. But if you are importing from another type of XML, you may first need to select the stylesheet to use (see Stylesheets below). There is no need to tell the program whether to look for qualified MARCXML, as it will determine that automatically.

Stylesheets

Stylesheets (files with the .xsl extension) are specified by clicking on the 'Custom' tab at the right edge of the form. The page that is then shown will look somewhat like the following ('somewhat', because this part of the program is changing often these days):

To select a stylesheet, click inside the 'Stylesheet' box at the top of the form. The program will open an explorer window that will list all of the stylesheets distributed with the current version. Select the one you want to use and click 'Open'; the program will return to the 'Custom' form, and you may now click 'Start conversion' to begin the export.

The program assumes that you know which stylesheet to use! When you click in the 'Stylesheet' box, it will list all the stylesheets in the folder, and will not restrict the list to the current conversion context (i.e., it will not show only those stylesheets that could apply).
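Since the list is not filtered, it can help to sort the filenames by direction yourself. The Python sketch below guesses the direction from LC's 'source + 2 + target' naming pattern. It is only an illustration, not something the program does: the function names are made up, and 'MODS2MARC21slim.xsl' is used as an assumed example of an import stylesheet name.

```python
import re

# LC-style names look like <source>2<target>.xsl, where 'MARC21slim' means MARCXML.
# 'MARC21slim' is tried first because it contains a '2' of its own.
_NAME = re.compile(r"^(MARC21slim|.+?)2(MARC21slim|.+)\.xsl$", re.IGNORECASE)


def direction(filename):
    """Return 'export', 'import', or None if the name does not fit the pattern."""
    match = _NAME.match(filename)
    if not match:
        return None
    source, target = match.group(1).lower(), match.group(2).lower()
    if source == "marc21slim":
        return "export"   # MARCXML -> other XML
    if target.startswith("marc21slim"):
        return "import"   # other XML -> MARCXML (also matches an appended 'r')
    return None


def stylesheets_for(filenames, wanted):
    """Keep only the filenames whose guessed direction equals `wanted`."""
    return [name for name in filenames if direction(name) == wanted]


folder = [
    "MARC21slim2MODS3-3.xsl",
    "MARC21slim2OAIDC.xsl",
    "MODS2MARC21slim.xsl",   # assumed name, following the same pattern
    "MARC21slimUtils.xsl",   # utility stylesheet, not a conversion
]
print(stylesheets_for(folder, "export"))
print(stylesheets_for(folder, "import"))
```

A custom stylesheet that does not follow the naming pattern gets no direction at all, so a filter like this should be treated as a hint rather than a rule.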

Stylesheet folder

If you want to use your own stylesheet, you can either navigate and select it in explorer, or you can copy it to the stylesheet repository folder used by the program; in a default installation, this folder is:

 C:\Program Files\TMQ\MARC Report\data

In general, the stylesheets distributed with the program are those available from the LC MARCXML sites. However, in some cases you will notice two stylesheets that have the same name but for the letter 'r' appended to one. The 'r' denotes that we have made a minor change to the LC stylesheet4).

If your installation is non-standard, you will no doubt get an error (something like 'Error -226; unable to find specified file') the first time you try to use an 'r' stylesheet. To fix this problem, open the stylesheet in a text editor, look for this line (near the top)–

<xsl:include href="file:///C:/Program Files/TMQ/MARC Report/data/MARC21slimUtils.xsl"/>

–and change 'C:/Program Files/TMQ/MARC Report' to the MARC Report installation folder. Note that the slashes are forward, and not backward as in Windows.

This problem has been fixed in version 235; see the Change note for that version.
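For versions before 235, the manual edit described above could also be scripted. The following Python sketch is only an illustration: the function name and the example folder 'D:\Apps\MARC Report' are hypothetical, and it assumes the include line uses the default path exactly as shown above.

```python
DEFAULT_ROOT = "C:/Program Files/TMQ/MARC Report"


def repoint_include(xsl_text, install_dir):
    """Replace the default install path in file:/// URIs with `install_dir`.

    Backslashes in `install_dir` are converted to forward slashes, because
    the href is a URI rather than a Windows path.
    """
    new_root = install_dir.replace("\\", "/").rstrip("/")
    return xsl_text.replace(
        "file:///" + DEFAULT_ROOT + "/",
        "file:///" + new_root + "/",
    )


line = ('<xsl:include href="file:///C:/Program Files/TMQ/MARC Report'
        '/data/MARC21slimUtils.xsl"/>')
# Hypothetical non-standard installation folder:
print(repoint_include(line, r"D:\Apps\MARC Report"))
# prints <xsl:include href="file:///D:/Apps/MARC Report/data/MARC21slimUtils.xsl"/>
```

Text that does not contain the default path is returned unchanged, so running the function twice on the same stylesheet does no harm.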

MARC to XML stylesheets

The following stylesheets can be used for exporting MARC records5):

All of the above stylesheets are available from the LC MARCXML website. It's possible that other stylesheets that convert MARCXML to other forms of XML may also work here.

XML to MARC stylesheets

The following stylesheets can be used for importing XML to MARC:

Again, all of the above stylesheets are available from the LC MARCXML website.

If the XML file is not MARCXML, but one of the two other types of XML that the program currently supports (MODS and DC), it may not be necessary to select a stylesheet. When an XML file is selected, the program will parse the beginning of the file and try to automatically determine what stylesheet (if any) to use.

However, if you have an XML file that is neither MARCXML, nor one of these other two supported xml types, you will need your own stylesheet to convert the XML into MARCXML.
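The kind of detection described above can be sketched in Python by reading only the first few elements of the file and checking their namespaces. This is an assumption about how such a check might work, not a description of the program's actual logic. The namespace URIs are the published ones for MARCXML, MODS and Dublin Core; the function name and the 20-element limit are made up for the example.

```python
import io
import xml.etree.ElementTree as ET

NAMESPACE_TO_KIND = {
    "http://www.loc.gov/MARC21/slim": "MARCXML",
    "http://www.loc.gov/mods/v3": "MODS",
    "http://www.openarchives.org/OAI/2.0/oai_dc/": "DC",
    "http://purl.org/dc/elements/1.1/": "DC",
}


def sniff_xml_kind(source, max_elements=20):
    """Return 'MARCXML', 'MODS', 'DC' or 'UNKNOWN' from the start of the file."""
    events = ET.iterparse(source, events=("start",))
    for count, (_event, elem) in enumerate(events):
        if elem.tag.startswith("{"):
            namespace, local = elem.tag[1:].split("}", 1)
        else:
            namespace, local = "", elem.tag
        kind = NAMESPACE_TO_KIND.get(namespace)
        if kind:
            return kind
        # Unqualified MARCXML has no namespace, so fall back on element names.
        if namespace == "" and local in ("collection", "record"):
            return "MARCXML"
        if count + 1 >= max_elements:
            break
    return "UNKNOWN"


sample = io.BytesIO(b'<mods xmlns="http://www.loc.gov/mods/v3"><titleInfo/></mods>')
print(sniff_xml_kind(sample))  # prints MODS
```

Several elements are checked rather than just the root because a wrapper element without a namespace may sit above the records. A result of 'UNKNOWN' corresponds to the case where you need to supply your own stylesheet.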

Assorted Notes

This is probably obvious to most catalogers and metadaticians, but for those for whom it is not: exporting MARC to Dublin Core is very lossy. This may, or may not, be acceptable depending upon your objective. If you have a choice, and would like to get as much of your bibliographic data as possible out of MARC and into XML, use MODS instead.

There are fewer stylesheets available for importing than exporting. For example, although MARC can be exported to MADS, LC has not provided a stylesheet to import MADS into MARC 6).

One quirk about exporting MARC to DC is that there is no top-level, 'collection'-type element defined for DC. However, to create a valid XML file we have to add one. Therefore, the current version (233) of the program will use the element '<tmqCollection>' for this purpose.

Converting a file from MODS to MARC seems to be very slow, even when using a TMQ-modified 'r' stylesheet.

On the 'Custom' page is a section called MSXML options. In most cases, the program automatically selects the correct set of options, and you should probably not make any changes to these checkboxes unless we tell you to.

On the 'Custom' page is an editbox called 'Document Element'. This is another aspect of the program that is undergoing a lot of change, and this too should be left blank unless instructed otherwise.
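The note above about the '<tmqCollection>' element can be illustrated with a short Python sketch. It shows why a wrapper is needed: two DC records placed side by side do not form a well-formed XML document, while the same records inside one top-level element do. The function name and the sample records are made up for the example, and the program's actual output may differ in detail.

```python
import xml.etree.ElementTree as ET

DC_RECORD = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>%s</dc:title></oai_dc:dc>"
)


def wrap_records(record_strings, wrapper="tmqCollection"):
    """Put each record under a single top-level element and serialize it."""
    root = ET.Element(wrapper)
    for text in record_strings:
        root.append(ET.fromstring(text))
    return ET.tostring(root, encoding="unicode")


records = [DC_RECORD % "First title", DC_RECORD % "Second title"]

# Two root elements in one document: the parser rejects this.
try:
    ET.fromstring("".join(records))
    side_by_side_ok = True
except ET.ParseError:
    side_by_side_ok = False

wrapped = wrap_records(records)
print(side_by_side_ok)               # prints False
print(len(ET.fromstring(wrapped)))   # prints 2
```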

1)
On a 2007 Dell Precision 490 with 15K SCSI disks, the export converter outputs about 1,000 records per second; the import converter attains about 250 records per second.
2)
which makes sense, perhaps, as we now seem to be more interested in moving our existing data out of MARC, than importing other data into MARC
3)
There are times when you will want to do this and times when you won't. Qualified MARCXML increases the size of the export file by 20%, and thus may slow things down a little bit. If you are using MARC Report to upgrade a large MARC-8 file to UTF-8, for example, then there's no reason to use qualified MARCXML. On the other hand, if you are setting up a conversion to another XML, some stylesheets may require qualified XML. Because we think the latter case is more likely to suit our users, the program selects 'Qualified MARCXML' by default
4)
Almost all LC stylesheets reference a 'utility' stylesheet on the LC website. We have found that there is a significant performance improvement when the URI for this stylesheet is redirected from the web to the local disk: for example, when using the LC stylesheet to convert MARC to MODS3.3, performance improves up to 10 times with the 'r' stylesheet, and when using the LC stylesheet to convert MARC to OAIDC, performance improves up to 15 times. And so, in the 'r' stylesheets, the reference to the 'MARC21SlimUtils' stylesheet has been changed to point to the copy of this stylesheet distributed with the program.
5)
LC's naming convention seems to be–
XML Source type + '2' + XML Target type
–where 'MARC21Slim' is LC's name for MARCXML
6)
Not sure how much to read into this. One might say that MARC, although an excellent source of bibliographic metadata, is not the best choice as a format for working with metadata generated elsewhere. On the other hand, one might argue that the export of MARC into XML is so lossy, that we do not want to encourage moving data in the other direction