
Robyn Ward-MARCXML 1

Running Head: MARCXML AS STANDARD FOR ACME INSTITUTIONAL REPOSITORY

Why MARCXML Should Be Considered For ACME University Institutional Repository

Robyn Ward

Emporia State University



Why MARCXML Should Be Considered For ACME University Institutional Repository

The intention of this paper is to present MARCXML as a viable metadata schema for ACME University’s Digital Institutional Repository. Outside research is presented to support the recommendation of MARCXML. The limitations and benefits of the schema are addressed, along with its practical uses and implementation.

Introduction

According to Perkins (2007) “In an open access environment it is necessary to

ensure that all collections meet the minimal requirements for interoperability” (p. 24).

Metadata planning is about interoperability and compliance. There are three types of

interoperability: (1) Semantic, (2) Syntactic, and (3) Structural. These may occur at a

number of different levels, i.e. local, consortial, or within communities of practice. In

order to choose a metadata scheme one must first evaluate the local needs and what

functions the metadata needs to serve.

Considering the above criteria, MARCXML will meet the semantic, syntactic, and structural needs of the Institutional Repository; each is addressed below as the MARCXML schema is evaluated.

Background and Purpose of MARCXML

MARC (MAchine Readable Cataloging) has been the standard for exchanging

bibliographic records between systems for decades. This standard is unlikely to go away

for a number of reasons, including but not limited to: financial commitment and

familiarity within the library community. “The development of an XML version of

MARC21 was critical for the format. The economically deep commitment to MARC data

elements, proliferation of schemas beyond the library community control, and the rapidly

growing XML tool environment mandated an evolutionary path into XML for MARC

21” (McCallum, 2006, 4).

In 2002 the Library of Congress established MARCXML as a standard for expressing MARC data in XML (Extensible Markup Language) syntax. The MARCXML record structure is based on the W3C XML standard. Before MARCXML, during the 1990s, the Library of Congress developed two SGML DTDs for MARC21, one for bibliographic information and the other for authority information. These SGML DTDs were later converted to XML DTDs. The MARCXML standard grew out of these DTDs but differs from them in many respects.

The MARCXML schema supports all MARC-encoded data regardless of format, and it is currently used to aid the interoperability and transferability of cataloging records between metadata standards. According to the MARC21 XML Schema Web Site, it is a “framework [that] is intended to be flexible and extensible to allow users to work with MARC data in ways specific to their needs. The framework itself includes many components such as schemas, stylesheets, and software tools.”

Applicable Research

Findings from the Los Alamos National Laboratory Research Library are presented here to support MARCXML as the standard for ACME University’s digital repository. The Los Alamos “Library Without Walls” team compared five XML schemas when creating its digital object repository: MARCXML, Dublin Core, PRISM, ONIX, and MODS were all considered viable for its needs. The team surveyed each schema against three distinct requisites for a uniform standard: (1) granularity, (2) transparency, and (3) extensibility. Other traits the team looked for were (4) support for hierarchical data structures, (5) cooperative management of the standard, (6) support for simple and complex use, and (7) familiarity or experience with the selected standard. These seven criteria are also important to keep in mind when implementing a standard for the ACME University Digital Repository. The study concluded that MARCXML was a robust schema capable of meeting all of the requirements of granularity, transparency, and extensibility. These three requirements deserve further explanation. Granularity “insures lossless data mapping without blurring the finer shades of meaning intrinsic to the original data”. Of transparency, the authors write that “…this requirement relates to interoperability, requiring a standard widely known throughout the community…”. Extensibility matters because “since no one metadata standard is appropriate to every situation, standards must permit growth without fracture” (Goldsmith & Knudson, 2006, ¶ 7).

Jeffrey Beall, Catalog Librarian at Auraria Library, University of Colorado at Denver, performed an analysis of twelve available metadata schemes, and his findings are relevant to this paper. He compiled them in a chart comparing each scheme against the following criteria: granularity, formats of description, content standards, availability of searching systems, level of community or domain specificity, interoperability, proven success (reputation and popularity), training, viability of the organization behind the scheme, ability to handle a particular metadata function, adaptability of the scheme to local needs, scalability, and surrogacy. MARC did well in all categories. MARC has rich granularity. Its content standards are flexible though highly established, with AACR2 and Library of Congress Subject Headings. Many commercial systems are available for its use. Training is widely available and associated with the library community at large. These are just a few samples from the findings on MARC; more detail can be obtained from the Beall article (Beall, 2007, 31).

These are two separate analyses that should be considered when deciding upon the XML schema for the ACME Institutional Repository.

The Low Down or How It Works

The MARCXML framework is a simple XML structure that contains MARC data. Its main characteristics are as follows. The leader and the control fields are each treated as a single data string. The MARC data fields are treated as elements, with their tags and indicators as attributes. Subfields are treated as sub-elements of those fields, with the subfield codes as attributes. MARC data in XML can be presented by writing an XML stylesheet, which selects the particular MARC elements to be displayed and applies the appropriate markup.
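To make this structure concrete, the following is a minimal Python sketch that reads one MARCXML record using only the standard library. The sample record is invented and heavily abbreviated for illustration, and the function name read_record is hypothetical; only the element names, attribute names, and namespace are taken from the Library of Congress MARCXML schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Library of Congress MARCXML ("MARC21 slim") schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented, heavily abbreviated record used only for illustration.
SAMPLE_RECORD = """\
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">example-0001</controlfield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
    <subfield code="c">Jane Doe.</subfield>
  </datafield>
</record>
"""


def read_record(xml_text):
    """Return the leader, control fields, and data fields of one record."""
    record = ET.fromstring(xml_text)
    # The leader and control fields are plain data strings.
    leader = record.findtext("marc:leader", default="", namespaces=MARC_NS)
    control_fields = {
        cf.get("tag"): cf.text
        for cf in record.findall("marc:controlfield", MARC_NS)
    }
    # Data fields carry the tag and indicators as attributes;
    # subfields carry their code as an attribute.
    data_fields = []
    for df in record.findall("marc:datafield", MARC_NS):
        subfields = [
            (sf.get("code"), sf.text)
            for sf in df.findall("marc:subfield", MARC_NS)
        ]
        data_fields.append({
            "tag": df.get("tag"),
            "ind1": df.get("ind1"),
            "ind2": df.get("ind2"),
            "subfields": subfields,
        })
    return leader, control_fields, data_fields


leader, control_fields, data_fields = read_record(SAMPLE_RECORD)
print("Leader:", leader)
for field in data_fields:
    codes = " ".join(f"${code} {text}" for code, text in field["subfields"])
    print(field["tag"], repr(field["ind1"] + field["ind2"]), codes)
```

Because the tag, indicators, and subfield codes are ordinary XML attributes, any general-purpose XML tool can address them without knowledge of the ISO 2709 byte layout.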

There are three categories of MARCXML consumers. The first category, transformation, consists of conversion between MARCXML and other metadata formats such as Dublin Core. The second, presentation, allows for the display and/or markup of MARC data in some readable form. The third, analysis, involves processing MARC data to produce analytical output such as validation. Validation is important for making sure that the basic XML is in accordance with the MARCXML schema, that the MARC21 tagging of fields and subfields is correct, and that the MARC record content is valid. These functions are provided through downloadable software offered by the Library of Congress, referred to as the MARCXML toolkit. Another function provided by the toolkit is the FRBR (Functional Requirements for Bibliographic Records) display. FRBR describes the structure and relationships of bibliographic and authority records and is intended to be independent of any particular cataloging code or implementation.
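The transformation category can be illustrated with a small Python sketch that maps a few MARCXML subfields to simple Dublin Core elements. This is not the Library of Congress toolkit, which uses stylesheets; the three-entry CROSSWALK table, the function name marcxml_to_dc, the wrapper element, and the sample record are assumptions made purely for illustration, and a real crosswalk covers far more fields.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}
DC_URI = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_URI)

# Hypothetical, deliberately tiny crosswalk:
# (MARC tag, subfield code) -> Dublin Core element name.
CROSSWALK = {
    ("100", "a"): "creator",
    ("245", "a"): "title",
    ("650", "a"): "subject",
}

# Invented record used only for illustration.
SAMPLE_RECORD = """\
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
    <subfield code="c">Jane Doe.</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Metadata.</subfield>
  </datafield>
</record>
"""


def marcxml_to_dc(xml_text):
    """Map selected MARCXML subfields to simple Dublin Core elements."""
    record = ET.fromstring(xml_text)
    # Neutral wrapper element chosen for this sketch, not part of any standard.
    dc_root = ET.Element("metadata")
    for field in record.findall("marc:datafield", MARC_NS):
        for sub in field.findall("marc:subfield", MARC_NS):
            dc_name = CROSSWALK.get((field.get("tag"), sub.get("code")))
            if dc_name and sub.text:
                element = ET.SubElement(dc_root, f"{{{DC_URI}}}{dc_name}")
                element.text = sub.text.strip()
    return dc_root


dc_record = marcxml_to_dc(SAMPLE_RECORD)
print(ET.tostring(dc_record, encoding="unicode"))
```

The sketch also shows why such a transformation loses information: subfield 245 $c has no entry in the table and is silently dropped, whereas MARCXML itself retains every subfield.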

Pros and Cons

The MARC format has a number of limitations that must be considered when looking at a metadata schema that will essentially support this existing format. According to the American Library Association report (2005), the limitations of MARC include (1) exclusive record structure and coding, (2) inconsistent granularity, (3) technical obsolescence, and (4) lack of scalability to digital materials (p. 21). The team at the Los Alamos National Laboratory Research Library identified other limitations of MARC and MARCXML, including the idea that MARC is too “bibliocentric and rigid”, its declining popularity in the library community, questions about its viability, and the complexity of the format. These limitations have proven to be either unfounded or manageable.

The benefits of MARCXML outweigh its limitations. MARCXML can produce an exact equivalent of the MARC21 record, thus allowing lossless conversion in both directions. MARCXML is also a widely used schema and, according to McCallum (2006), is the basis for the international standard for an XML version of the MARC structure that Danish Standards has proposed to ISO (p. 4).

The MARCXML structure allows users to more easily write their own tools to ingest, manipulate, and convert MARC data, thus making MARCXML extensible. The architecture also allows different software to be combined in order to build custom solutions (Library of Congress, 2006). The ability to use external software is a benefit, but it also points to a limitation: validation of MARC content can only be enforced by external software and not by the schema itself. This is a minor limitation.
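The kind of external check this limitation calls for can be sketched in a few lines of Python. The two rules below (exactly one 245 field and at most one 1XX main entry field) are chosen only as examples of MARC21 content rules; the function name check_record and both sample records are hypothetical, and a production validator would apply many more rules.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented records used only for illustration.
GOOD_RECORD = """\
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title.</subfield>
  </datafield>
</record>
"""

BAD_RECORD = """\
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Doe, Jane.</subfield>
  </datafield>
  <datafield tag="110" ind1="2" ind2=" ">
    <subfield code="a">Example Society.</subfield>
  </datafield>
</record>
"""


def check_record(xml_text):
    """Return a list of problems; an empty list means both checks passed."""
    record = ET.fromstring(xml_text)
    tags = [df.get("tag", "") for df in record.findall("marc:datafield", MARC_NS)]
    problems = []
    # Rule 1: the title statement (245) must occur exactly once.
    title_count = tags.count("245")
    if title_count != 1:
        problems.append(f"expected exactly one 245 field, found {title_count}")
    # Rule 2: a record may carry at most one main entry (1XX) field.
    main_entries = [tag for tag in tags if tag.startswith("1")]
    if len(main_entries) > 1:
        problems.append("more than one 1XX field: " + ", ".join(main_entries))
    return problems


for name, text in (("good", GOOD_RECORD), ("bad", BAD_RECORD)):
    print(name, check_record(text))
```

Both sample records are well-formed XML that use only MARCXML element names, yet the second breaks two cataloging rules, which is why content checks of this kind sit in software outside the schema.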

The MARCXML schema also supports all MARC-encoded data regardless of format, and it has a number of potential uses, described in turn below. First, it can represent a complete MARC record in XML. It can be used for original resource description in XML syntax and can function as XML metadata that is then combined with an electronic resource. Second, it can be used as an extension schema to METS (Metadata Encoding and Transmission Standard). METS is a digital “wrapper”: an XML text file that binds together content files and metadata and specifies the logical relationships among them. METS supports metadata standards such as MARCXML, which allows different metadata schemes to be included to describe various facets and various representations of an object. Third, MARCXML can represent metadata for OAI (Open Archives Initiative) harvesting. The Open Archives Initiative is dedicated to providing digital library interoperability by defining simple protocols and standards. The function of its harvesting protocol is to transfer metadata from a source archive to a destination archive.
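The METS “wrapper” idea can be sketched as follows: a MARCXML record is embedded as descriptive metadata, one content file is referenced, and a structural map ties the two together. This is a minimal Python illustration, not a complete or schema-validated METS document; the function name wrap_in_mets, the identifiers DMD1 and FILE1, the sample record, and the file URL are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

METS_URI = "http://www.loc.gov/METS/"
MARC_URI = "http://www.loc.gov/MARC21/slim"
XLINK_URI = "http://www.w3.org/1999/xlink"
for prefix, uri in (("mets", METS_URI), ("marc", MARC_URI), ("xlink", XLINK_URI)):
    ET.register_namespace(prefix, uri)

# Invented record used only for illustration.
SAMPLE_RECORD = """\
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">An example thesis.</subfield>
  </datafield>
</record>
"""


def mets_tag(name):
    """Qualify an element name with the METS namespace."""
    return f"{{{METS_URI}}}{name}"


def wrap_in_mets(marc_record, content_href):
    """Bind one MARCXML record and one content file in a METS document."""
    mets = ET.Element(mets_tag("mets"))
    # Descriptive metadata section: the MARCXML record travels inside xmlData.
    dmd_sec = ET.SubElement(mets, mets_tag("dmdSec"), ID="DMD1")
    md_wrap = ET.SubElement(dmd_sec, mets_tag("mdWrap"), MDTYPE="MARC")
    ET.SubElement(md_wrap, mets_tag("xmlData")).append(marc_record)
    # File section: a pointer to the content file.
    file_sec = ET.SubElement(mets, mets_tag("fileSec"))
    file_grp = ET.SubElement(file_sec, mets_tag("fileGrp"))
    file_el = ET.SubElement(file_grp, mets_tag("file"), ID="FILE1")
    ET.SubElement(
        file_el,
        mets_tag("FLocat"),
        {"LOCTYPE": "URL", f"{{{XLINK_URI}}}href": content_href},
    )
    # Structural map: states which metadata and which file belong together.
    struct_map = ET.SubElement(mets, mets_tag("structMap"))
    div = ET.SubElement(struct_map, mets_tag("div"), DMDID="DMD1")
    ET.SubElement(div, mets_tag("fptr"), FILEID="FILE1")
    return mets


mets_doc = wrap_in_mets(
    ET.fromstring(SAMPLE_RECORD),
    "https://repository.example.edu/files/thesis-0001.pdf",  # hypothetical URL
)
print(ET.tostring(mets_doc, encoding="unicode"))
```

The MARCXML record is carried unchanged inside the wrapper, so it can later be extracted and converted back to MARC21 without loss.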

Sampling of MARC/XML Put to Use

MARCXML has been used in OCLC’s Terminology Services Project. The project is intended to provide machine-to-machine web services that can be used in a number of different ways. It handles knowledge organization vocabularies, i.e. authority files, subject heading systems, thesauri, and classification schemes (McCallum, 2006, 5), and it maps a term in one vocabulary to one or more terms in a different vocabulary. MARCXML is used to normalize the data in MARC21. Normalization is a “formal analytical process by which various metadata formats are standardized to a pre-selected metadata standard” (Hutt, Rose-Sandler, & Westbrook, 2007, 41).

Terry Reese, at Oregon State University, developed MarcEdit, a MARC21 editing utility. MARCXML is central to the crosswalk tools provided in MarcEdit, which include data conversions from Dublin Core, EAD, and FGDC to MARC21.

At New York University, Bill Jones used MARC/XML to perform routine tasks on the library’s catalog. He used XSL transformations on batches of records to update data within the records all at one time. He changed content in XML and then converted it back into MARC records for the library’s general catalog.

Work has been done at Virginia Tech with the Networked Digital Library of

Theses and Dissertations regarding the use of the OAI harvesting protocol. The Open

Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) supports the use of

MARC21 in XML. The Library of Congress American Memory project also exposes its

metadata for OAI harvesting in MARCXML.
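A harvesting request of this kind is simply a URL with a few query parameters. The Python sketch below builds an OAI-PMH ListRecords request that asks for MARCXML records. The base URL is hypothetical and the function name list_records_url is invented for the example; the metadata prefix marc21 is assumed here as the prefix commonly used for MARCXML, but the prefixes a given repository actually supports should be confirmed from its ListMetadataFormats response.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def list_records_url(base_url, metadata_prefix="marc21", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access is made)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # optional: limit harvesting to one set
    if from_date:
        params["from"] = from_date      # optional: selective harvesting by date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint used only for illustration.
harvest_url = list_records_url(
    "https://repository.example.edu/oai",
    from_date="2007-01-01",
)
print(harvest_url)
print(parse_qs(urlsplit(harvest_url).query))
```

The response to such a request is an XML document in which each harvested record carries its metadata in the requested format, so a harvester that already reads MARCXML needs no additional parser.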

The Library of Congress uses MARCXML to create MODS (Metadata Object Description Schema, another XML schema for MARC21) records from MARC21 records and to move ONIX data to MARCXML. It also distributes all of its MARC21 cataloging records in the MARCXML schema, in addition to the ISO 2709 structure.

MARCXML is also used by the Search/Retrieve URL service and Search/Retrieve Web service (SRU/SRW) protocols. “XML is thus the retrieval vehicle for searches requesting MARC21 records in their entirety” (McCallum, 2006, 6). SRU/SRW is an XML-based protocol that corresponds to Z39.50 in the binary environment. “A very important aspect of structured search-and-retrieve protocols such as Z39.50 and the SRW/U family is the provision of a structure query language in which rich queries can be expressed” (Taylor & Dickmeiss, 2006, 8). SRW/U provides a text query format known as CQL, or Common Query Language.
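As an illustration of how a CQL query travels in an SRU request, the Python sketch below assembles a searchRetrieve URL that asks for results in MARCXML. The server address is hypothetical and the function name sru_search_url is invented for the example; the record schema name marcxml and the index names dc.title and dc.creator are assumptions, since the schemas and indexes a server supports are listed in its explain response.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sru_search_url(base_url, cql_query, max_records=10, record_schema="marcxml"):
    """Build an SRU searchRetrieve URL (no network access is made)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,              # the CQL query, URL-encoded below
        "recordSchema": record_schema,   # ask for records in MARCXML
        "maximumRecords": str(max_records),
    }
    return f"{base_url}?{urlencode(params)}"


# A CQL query combining two index searches with a boolean operator.
cql = 'dc.title = "metadata" and dc.creator = "smith"'

# Hypothetical SRU server used only for illustration.
search_url = sru_search_url("https://sru.example.org/catalog", cql, max_records=5)
print(search_url)
print(parse_qs(urlsplit(search_url).query)["query"][0])
```

The whole search is expressed in an ordinary URL, which is what allows SRU to be used from a web browser or a short script where Z39.50 would require a dedicated client.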

Conclusions

The above uses demonstrate the flexibility of the MARCXML format in meeting current and expanding needs for expressing MARC in an XML environment. MARCXML is able to support external technologies such as OAI-PMH and SRU/W. Regarding structural metadata, MARCXML can utilize METS for wrapping and packaging objects together. MARCXML can be crosswalked to and from other formats with the aid of utilities such as MarcEdit. As presented in this paper, MARCXML converts losslessly to and from MARC21. MARCXML is currently used within the library environment and should grow in use as more libraries experiment with and move toward digital collections. MARC21 is the long-established standard for expressing bibliographic data, and MARCXML seems to be the logical XML standard for expressing MARC21 data in the digital environment. MARCXML also complies with the interoperability standards of syntax, semantics, and structure. The evidence provided favors MARCXML as the metadata schema of choice for the upcoming implementation of the Institutional Repository at ACME University.


Bibliography

American Library Association. (2005). Update on major metadata standards. In

Library Technology Reports, 41(6), 20-33. Retrieved April 9, 2007, from

Academic OneFile via Thomson Gale.

Beall, J. (2007). Discrete criteria for selecting and comparing metadata schemes. Against

the Grain, 19(1), 28-31.

Clarke, K. S. (2002). Updating MARC records with XMLMARC. In R. Tennant (Ed.),

XML in libraries (pp. 3-16). New York: Neal-Schuman Publishers.

Goldsmith, B., & Knudson, F. (2006). Repository librarian and the next

crusade. D-Lib Magazine, 12(9). Retrieved April 9, 2007, from

http://www.dlib.org/dlib/september06/goldsmith/09goldsmith.html

Hutt, A., Rose-Sandler, T., & Westbrook, B. D. (2007). Balancing the needs of producers

and managers of digital assets through extensible metadata normalization. Against

the Grain, 19(1), 41-43, 45.

Library of Congress. (2006). MARCXML. Retrieved April 21, 2007, from

http://www.loc.gov/standards/marcxml/

McCallum, S. H. (2006). MARC/XML sampler. International Cataloguing and

Bibliographic Control, 35(1), 4-6. Retrieved April 15, 2007, from Library

Literature & Information Science via Wilson Web.

Perkins, J. (2007). Planning for metadata: the quick tour. Against the Grain, 19(1), 20-27.

Radebaugh, J. (2007). MARC 21 / MARCXML. Computers in Libraries, 27(4), 15.

Taylor, M., & Dickmeiss, A. (2006). Delivering MARC/XML records from the Library of Congress catalogue using the open protocols SRW/U and Z39.50. International Cataloguing and Bibliographic Control, 35(1), 7-10. Retrieved April 15, 2007, from Library Literature & Information Science via Wilson Web.

Wolfe, J., & Anderson, M. (2007). Digital collections, the next generation. Against the

Grain, 19(1), 37-40.
