Why MARCXML?
There are two, perhaps three, reasons why MARCXML is important to us now.
First, MARCXML is primarily a communications format1). With MARCXML, all of your bibliographic data can be moved quickly into the XML world without loss. From there, it is a relatively painless task to convert the MARCXML data to a more useful XML format, such as MODS or Dublin Core. All that is needed for this secondary conversion is a stylesheet and software that supports XSLT, both of which are available on the web.
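To give a feel for what such a secondary conversion involves, here is a minimal sketch in Python. It is not XSLT: it uses only the standard library to imitate the kind of tag-to-element mapping a stylesheet would perform. The sample record, the `MAPPING` table, and the function name `marcxml_to_dc` are all invented for illustration, and the mapping is far cruder than any real MARC to Dublin Core crosswalk.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARCXML ("MARC21 slim") records.
MARC_NS = {"m": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example title :</subfield>
    <subfield code="b">an invented subtitle</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="b">Example Press,</subfield>
    <subfield code="c">1977.</subfield>
  </datafield>
</record>"""

# Hypothetical, heavily simplified (tag, subfield code) -> Dublin Core element map.
MAPPING = {
    ("245", "a"): "title",
    ("245", "b"): "title",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}


def marcxml_to_dc(xml_text):
    """Collect mapped subfield values into a dict keyed by Dublin Core element."""
    root = ET.fromstring(xml_text)
    collected = {}
    for field in root.findall("m:datafield", MARC_NS):
        for sub in field.findall("m:subfield", MARC_NS):
            element = MAPPING.get((field.get("tag"), sub.get("code")))
            if element is not None:
                collected.setdefault(element, []).append((sub.text or "").strip())
    # Join repeated values (e.g. 245 $a and $b) into a single string.
    return {name: " ".join(values) for name, values in collected.items()}


if __name__ == "__main__":
    print(marcxml_to_dc(SAMPLE))
```

A real conversion would more likely apply a published stylesheet with an XSLT processor; the sketch only shows that once the data is in XML, the mapping itself is a small, declarative table.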
Second, MARCXML, by virtue of being XML, supports Unicode without requiring any intervention (i.e., special software). Technically, this argument is not as strong as it once was, because UTF-8 encoding is now well supported in MARC and by most library systems. Also, UTF-8 in itself does not solve all problems of character encoding, and may simply exchange one set of problems for another2). However, in the larger picture of internationalization and the resulting global sharing of resource descriptions, the fundamental support for Unicode in XML has to be viewed as a major advantage3).
Finally, using MARCXML as a communications format will enable us to get around some of the persistent limitations of MARC itself. What happens in MARC when a record needs to exceed 99,999 bytes, or a field needs to be longer than 9,999 bytes? The result depends entirely on the vendor's software, and there are no good solutions, only kludges and many software bugs. These problems could be solved by changing MARC, in some places within the standard itself and in others by extending it4). But doing so would break all existing MARC software.
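These limits come from the fixed-width decimal counters in the MARC record structure: the record length occupies five digits in the leader, and each directory entry gives the field length in four digits. The sketch below illustrates the field limit by building a single 12-character directory entry. The function name `directory_entry` and the constants are invented for illustration; this is not a complete MARC writer.

```python
MAX_RECORD_LENGTH = 99_999  # leader positions 00-04 hold five decimal digits
MAX_FIELD_LENGTH = 9_999    # a directory entry holds four digits of field length
MAX_START = 99_999          # and five digits of starting character position


def directory_entry(tag, field_length, start):
    """Build one directory entry: tag (3) + field length (4) + start position (5)."""
    if len(tag) != 3:
        raise ValueError("tag must be exactly three characters")
    if not 0 <= field_length <= MAX_FIELD_LENGTH:
        raise ValueError(f"field length {field_length} does not fit in four digits")
    if not 0 <= start <= MAX_START:
        raise ValueError(f"start position {start} does not fit in five digits")
    return f"{tag}{field_length:04d}{start:05d}"


if __name__ == "__main__":
    print(directory_entry("245", 42, 0))
    try:
        directory_entry("520", 10_000, 42)  # one byte too long for the directory
    except ValueError as err:
        print("cannot encode:", err)
```

MARCXML has no directory and no length counters, so a long note field is simply a longer element.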
Consider the MARC-8 escape sequences used to represent foreign script in the 880 fields of LC records:
880 $6260-03/(3/r$a(3edJGf, HGcSJGf :(B$b(3GdecJHI GdaGQhbjI,(B$c1977(3.(B
In MARCXML, the same data, encoded as UTF-8, looks like this in the raw, when the bytes are displayed without being interpreted as UTF-8:
<datafield tag="880" ind1=" " ind2=" ">
  <subfield code="6">260-03/(3/r</subfield>
  <subfield code="a">ملتان، باكستان :</subfield>
  <subfield code="b">المكتبة الÙاروقية،</subfield>
  <subfield code="c">1977.</subfield>
</datafield>
and like this in a typical browser:
<datafield tag="880" ind1=" " ind2=" ">
  <subfield code="6">260-03/(3/r</subfield>
  <subfield code="a">ملتان، باكستان :</subfield>
  <subfield code="b">المكتبة الفاروقية،</subfield>
  <subfield code="c">1977.</subfield>
</datafield>
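The two views above contain the same bytes; only the interpretation differs. As a minimal sketch, the Python below reproduces the "raw" view by encoding a word as UTF-8 and then misreading the bytes in a single-byte code page. The choice of Windows-1252 is an assumption about the viewer that produced the garbled text, and the function names are invented for illustration.

```python
def as_seen_raw(text):
    """Encode text as UTF-8, then misread the bytes as Windows-1252 (assumed)."""
    # Bytes that Windows-1252 leaves undefined become U+FFFD replacement marks.
    return text.encode("utf-8").decode("cp1252", errors="replace")


def repair(garbled):
    """Undo the misreading; only works if no byte was lost to replacement."""
    return garbled.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    city = "ملتان"  # first word of subfield $a in the example above
    garbled = as_seen_raw(city)
    print(garbled)                  # two garbled characters per Arabic letter
    print(repair(garbled) == city)  # the data itself was never damaged
```

Each Arabic letter occupies two bytes in UTF-8, which is why the raw view shows two Latin characters for every letter in the browser view.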