
Florence Margaret Paisey
April 26, 2013
LIS 5787: Professor Stvilia
XML Metadata Schema Creation

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="resource">
    <xs:annotation>
      <xs:documentation>
        Rationale: The following descriptive metadata is encoded with Qualified Dublin Core (QDC) elements.
      </xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element form="qualified" name="title" maxOccurs="3" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: A name given to the resource. A title is the name by which a resource is formally known.
              Constraints: Some resources are formally recognized by more than one name; I have arbitrarily decided on "3" as the maximum.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="creator" maxOccurs="unbounded" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: The creator refers to the entity primarily responsible for producing the content of the resource. Creators should be listed separately and may include a person, an organization or a service.
              Constraints: Generally, the creator of a document is named; however, some ancient texts, as well as later texts of the 17th to 19th centuries, may not have a named creator. In this case, I would use the creator element to state that the creator's identity is not known. In other cases, a creator may have used "anonymous" to remain unknown. Here, anonymous would be the creator. When the creator or creators are known, both the family and given names should be included where available.
              Constraints: Some documents, particularly in the sciences, will have several creators. So, the maximum is set to unbounded.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="subject" maxOccurs="unbounded" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: Dublin Core describes this element as the topic of the resource.
              Best practice recommends selecting the most significant keywords, including classification numbers or descriptors from controlled vocabularies. Layne (2002) and Harpring (2002) discuss how to determine the subject of works of pictorial art. They emphasize the usefulness of considering kinds of of-ness as well as an image's about-ness. Art may depict (be of) people as well as events and activities. These of-ness subjects can be described variously, from generic vocabulary to very specific terms (broad to narrow), using controlled vocabularies.
              Layne (2002) points out that description of narrative is different from images because images concern a specific work of art. I would argue that some written documents also concern specific works of art and their hermeneutics. So descriptors of some narratives may find both concepts, of-ness and about-ness, useful. About-ness relates to interpretation of art or "meaning beyond simply what it is of." Certainly many textual objects also involve interpretation, or hermeneutics: the meaning beyond the storyline. The of-ness and about-ness are important in providing subject access points. It is also pointed out that the subject (of-ness and about-ness) can overlap with different metadata categories, producing redundancy. Deciding which subject terms to use involves the audience, the interested disciplines and the object itself.
              Harpring (2002) holds that the subject of an artwork may encompass the "narrative, iconic, or non-objective meaning." She states that subject involves "what is depicted in and by a work of art." This definition includes the function of an object (such as architecture or furnishings) that otherwise has no narrative content.
              Constraints: A document may include more than one subject; in this case multiple topics or subjects are discussed, so unbounded occurrences are required.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="description" maxOccurs="unbounded" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: Used to present information that will foster discovery of the resource. Dublin Core guidelines recommend complete sentences as best practice. Information can be copied or extracted from the resource if no abstract or structured information is available. In Dublin Core this element subsumes the "note" element in MODS, meaning that notes from the metadata designer may be included.
              Constraints: The description may include a table of contents, an abstract, or any information, including graphical representations, that may facilitate resource discovery.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="publisher" maxOccurs="unbounded" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: The publisher is the entity responsible for issuing the resource. Prior to the modern publishing house, printers were often publishers in the sense that we use the term today. In such a case, the printer would be identified as the publisher. Dublin Core states that the nature of the publisher is ambiguous. Generally, best practice recommends using publisher for organizations and creator for individuals. When the distinction for responsibility is not clear, best practice recommends using contributor.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="source" maxOccurs="3" minOccurs="0">
          <xs:annotation>
            <xs:documentation>
              Purpose: To distinguish between the publisher and the publication in which the resource was issued.
              Constraints: Publisher and publication may, ostensibly, be identical.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="contributor" maxOccurs="unbounded" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: A contributor is any agent responsible for making contributions to the resource. This could be an individual, such as an editor, or an organization or service, such as a consulting service.
              Constraints: There need not be any contributors, or, conversely, there may be many. So, the minimum number of occurrences is set to 0 while the maximum is indefinite.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="date" maxOccurs="unbounded" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: The date is associated with an event in the life cycle of a resource; usually this will be the date of creation or availability. It is recommended to encode the date following the rules of ISO 8601.
              Constraints: The date refers to the single initial event. Other dates in the development of the resource are recorded with qualifiers.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="dateModified" maxOccurs="unbounded" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: Date modified refers to the date on which the initial resource was changed.
              Constraints: More than one modification can be recorded; when only one is recorded, it is assumed to be the latest modification. So, this is an optional element.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="type" maxOccurs="1" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: Type identifies the nature or genre of the resource. Best practices recommend using a controlled vocabulary.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="format" maxOccurs="unbounded" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: Format describes the physical or digital manifestation of the object. It can include the physical dimensions as well as the material from which an object is made.
              Constraints: Format may refer to the generic name of the object, a formalized name, its dimensions and the materials from which it is constructed.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="isFormatof" maxOccurs="unbounded" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: To distinguish between objects transformed in format by technologies. Examples include a facsimile of a rare book or an electronic reproduction of an initial print format.
              Constraints: Depending upon the object, multiple formatting transformations can occur. With books, folios may be re-formatted to octavos, quartos, etc.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="identifier" maxOccurs="3" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: The identifier is an unambiguous reference to the resource within a given context.
              Constraints: I used the number "3" for the maximum number of occurrences because some objects are identified with more than one identifier depending on the institution or ownership. Museums and special collections often have more than one identifier for objects.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="language" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: Dublin Core designates language as an element.
              Constraints: Not all objects are of a linguistic nature, hence no required language.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="coverage" maxOccurs="unbounded" minOccurs="1" type="xs:string">

          <xs:annotation>
            <xs:documentation>
              Purpose: Coverage refers to the extent or scope of the resource's content. This includes spatial (geographic location) and/or temporal period (date range or historical period).
              Constraints: Unbounded to allow for the fullest extent of coverage, if deemed desirable.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="spatial" maxOccurs="unbounded" minOccurs="0">
          <xs:annotation>
            <xs:documentation>
              Spatial characteristics of the resource.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="rights" maxOccurs="unbounded" minOccurs="1" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: A rights statement is essential and covers intellectual property rights, copyright, property rights or management rights over the resource. When no rights statement exists, no assumptions regarding rights can be made.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="audience" maxOccurs="unbounded" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: To identify the audience for whom the resource is intended. In DC, the audience is viewed as a refinement or qualifier. It is not required, though desirable.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="provenance" maxOccurs="unbounded" minOccurs="1">
          <xs:annotation>
            <xs:documentation>
              Purpose: To verify the chain of legal custody of the object or resource. This is essential for authenticity, integrity and interpretation. Provenance includes a description of changes that successive custodians may have made to the resource. Preservation and conservation comments seem appropriate here.
              Constraints: Depending upon the age of the resource, the provenance history may be complex, as may descriptions of alterations or changes of custodians.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="instructionalMethod" maxOccurs="unbounded" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: To describe the process used to engender knowledge, skills and attitudes. This includes ways of presenting instructional materials and mechanisms by which levels of learning are measured.
              Constraints: Various instructional styles and strategies may be employed as well as the means of measuring learning.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element form="qualified" name="accrualMethod" maxOccurs="4" minOccurs="0" type="xs:string">
          <xs:annotation>
            <xs:documentation>
              Purpose: To document how items or resources are added to a collection.
              Constraints: Various means of acquisition exist, including purchase, donation and gift. Each means involves legal definitions and terms of acceptance. Some agencies may not reveal the methods of building collections or resources.
            </xs:documentation>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
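The annotations for date and type recommend ISO 8601 dates and a controlled vocabulary, yet both elements are declared as plain xs:string, so a validator cannot enforce either recommendation. The fragment below is a minimal sketch of how the two declarations might be tightened. It is not part of the submitted schema: the names dateValue, typeValue and resourceSketch are hypothetical, and the enumeration assumes the twelve terms of the DCMI Type Vocabulary as the controlled list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical type: accepts a full date (2011-01-13), a year and
       month (2011-01), or a bare year (1996). -->
  <xs:simpleType name="dateValue">
    <xs:union memberTypes="xs:date xs:gYearMonth xs:gYear"/>
  </xs:simpleType>

  <!-- Hypothetical type: limits values to the DCMI Type Vocabulary terms. -->
  <xs:simpleType name="typeValue">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Collection"/>
      <xs:enumeration value="Dataset"/>
      <xs:enumeration value="Event"/>
      <xs:enumeration value="Image"/>
      <xs:enumeration value="InteractiveResource"/>
      <xs:enumeration value="MovingImage"/>
      <xs:enumeration value="PhysicalObject"/>
      <xs:enumeration value="Service"/>
      <xs:enumeration value="Software"/>
      <xs:enumeration value="Sound"/>
      <xs:enumeration value="StillImage"/>
      <xs:enumeration value="Text"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Illustration only: a cut-down element that uses the two types. -->
  <xs:element name="resourceSketch">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="date" type="dateValue" maxOccurs="unbounded"/>
        <xs:element name="type" type="typeValue"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Under such a restriction, free-text values like "c.1830" or "physical object" would no longer validate and would need to be recast (for example as 1830 and PhysicalObject), with the uncertainty moved into the description. That trade-off between strictness and descriptive freedom is a design choice rather than a requirement of Dublin Core.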

XML Objects

Journal Paper


<?xml version="1.0" encoding="ISO-8859-1" ?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="mySchemaresource.xsd">
  <title>"BeGemmed and beAmuletted": Tennyson and Those "Vapid" Gift Books</title>
  <creator>Karen Ledbetter</creator>
  <subject>gift books, annual gift books, Alfred Lord Tennyson, Victorian attitudes, "The Gem", "The Amulet", "The Keepsake", Victorian illustrated books</subject>
  <description>Literary annuals or gift books became a profitable publishing fad during the early Victorian period. They were named gift books or annuals because they were issued once a year, usually during the autumn, and they were intended as ornamental books suitable for gifts. Many noted English and American authors, such as Tennyson, Wordsworth, Poe and Emerson, contributed to them. These books were usually lavishly bound with numerous engravings or illustrations. Noted illustrators included Turner and Landseer. This paper describes their social function, content, and the conflicted attitudes felt among some who contributed to them.</description>
  <publisher>West Virginia University Press</publisher>
  <source>Victorian Poetry</source>
  <contributor>John P. Lamb, editor</contributor>
  <date>1996</date>
  <dateModified>No modifications are documented.</dateModified>
  <type>text</type>
  <format>text/html</format>
  <isFormatof>This electronic text was initially issued in a print journal, then transformed to a digital format.</isFormatof>
  <identifier>http://www.jstor.org/stable/40002915</identifier>
  <identifier>ISSN: 0042-5206</identifier>
  <language>en-U.S.</language>
  <coverage>Early Victorian period</coverage>
  <coverage>1820-1860</coverage>
  <rights>U.S. copyright protects this text file.</rights>
  <audience>adolescent and adult</audience>
  <provenance>Associated with access policies, use policies and licensing.</provenance>
  <instructionalMethod>Narrative presentation.</instructionalMethod>
  <accrualMethod>Contractual policies between West Virginia University Press, the journal Victorian Poetry, and the database JSTOR apply.</accrualMethod>
</resource>

Website
<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="mySchemaresource.xsd">
  <title>V &amp; A: teachers' resource: Victorian social life from paintings</title>
  <creator>Victoria and Albert Museum, Online Museum, Web Team</creator>
  <subject>Victorian social life, genre paintings, Victorian art, Victorian themes, teachers, online, resource, schools, education</subject>
  <description>"The V&amp;A is one of the best museums in the world for learning about the Victorians. This resource contains an introduction to some of the Victorian paintings in the V&amp;A and suggests ways of using them as sources of historical evidence."</description>
  <publisher>Victoria and Albert Museum, Cromwell Road, South Kensington, London SW7 2RL. Telephone +44 (0)20 7942 2000. Email vanda@vam.ac.uk</publisher>
  <source>Victorian Poetry</source>
  <contributor>Victoria and Albert Museum, Digital Media webmaster@vam.ac.uk</contributor>
  <date>2011-01-13</date>
  <dateModified>2013-01-13</dateModified>
  <type>multimedia</type>
  <format>interactive resource</format>
  <identifier>Asset ID 81635</identifier>
  <identifier>http://www.vam.ac.uk/content/articles/t/a-teachers-resource-victorian-social-life-from-paintings/</identifier>
  <language>en-U.K.</language>
  <coverage>Early Victorian period</coverage>
  <coverage>1820-1860</coverage>
  <rights>U.K. copyright protects content and digital images.</rights>
  <audience>adolescent and adult</audience>
  <provenance>Associated with access policies, use policies and licensing at the Victoria &amp; Albert Museum</provenance>
  <instructionalMethod>Resource-based instruction. Guided questions that stimulate analytical thought and synthesis are presented.</instructionalMethod>
</resource>

Artifact
<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="mySchemaresource.xsd">
  <title>Regency mahogany sofa</title>
  <creator>unknown</creator>
  <subject>furnishings, Regency mahogany framed sofa, interiors period rooms, domestic spaces, living rooms</subject>
  <description>This artifact, located in the Geffrye Museum, London, England, typifies early Victorian middle class drawing room furniture. This artifact is a mahogany-framed sofa on reeded sabre legs with brass straw caps, upholstered in modern silk. Gift books would have been exchanged among this social population.</description>
  <publisher>unknown manufacturer</publisher>
  <contributor>unknown artisans including woodworker, upholsterer, furniture designer</contributor>
  <contributor>photographer, Chris Ridley</contributor>
  <date>c.1830</date>
  <dateModified>recent restoration of artifact, no date supplied</dateModified>
  <type>physical object</type>
  <format>physical object</format>
  <identifier>2/1937</identifier>
  <identifier>02033</identifier>
  <identifier>http://www.geffrye-museum.org.uk/period-rooms-and-gardens/explore-rooms/drawing-room-1830/</identifier>
  <coverage>Victorian Period, 1714-1837</coverage>
  <spatial>90cm x 222cm x 64cm</spatial>
  <rights>U.K. licensing and physical property rights protected</rights>
  <audience>primary school through adults</audience>
  <provenance>previous ownership not disclosed, currently owned by the Geffrye Museum</provenance>
  <instructionalMethod>experiential</instructionalMethod>
  <accrualMethod>purchase</accrualMethod>
</resource>

Discussion
Basis for Schema
Several questions and subsequent decisions guided me in designing a metadata schema applicable to the three objects: journal, webpage and artifact. These questions involved the purpose of the metadata schema: should the schema facilitate interoperability, resource discovery, preservation, document navigation, or all of these? Who will use the metadata or find it useful; that is, who are the users or communities of users? Besser and Trant (1995) maintain that a key best practice involves consideration of the users, uses and characteristics of a collection. In addition, Besser (2004) notes the importance of providing consistent descriptive metadata and discovery metadata. With regard to the digital version of a book, Besser emphasizes the need for structural metadata that aids navigation through the resource, identification metadata to determine the particular version of a work, and conditions metadata for rights and use constraints. While printers and publishers of printed materials have a considerable body of literature that sets standards for the publication of editions and their issues, the need for dissemination standards for digital works and for documentation of their versions is only gradually being recognized, understood and accepted. Identification of particular versions of digital works is particularly important in view of their evanescent nature. How easily a different digital version or edition can supplant an earlier one! This would be particularly true of born-digital documents, where no physical version exists. Identification metadata can serve to identify these various and variant versions.

In a nutshell, NISO refers to metadata as a method to identify, authenticate, describe, locate and manage resources in a precise and consistent way... (Hodgson, 2008). This definition would, clearly, require identification metadata. However, the effective implementation of any metadata initiative presupposes that the schema has widespread acceptability and use. I felt Dublin Core (DC) would be an appropriate schema.
Dublin Core is widely used and has the advantage of integrating with other metadata schemas. It has been developed collaboratively and is designed to be extensible, international and capable of handling different types of information, from metadata describing electronic records and websites to elements that can describe a multitude of physical objects. According to Hockey (1995, p. 13), DC allows specific applications to build on this framework to develop their own metadata systems. So, DC can be foundational, extensible and interoperable to the extent that any current schema interacts or unifies well with other schemas. The Warwick Framework supports this aggregating or unifying aspect of Dublin Core (Besser, 2004). Within the Warwick Framework, or container architecture, each community of users can maintain its own container of metadata and still interact or interoperate with metadata sets or packages from other containers or communities of expertise (Lagoze, 1996). The integrating set of metadata would be Dublin Core. So, it would seem that the museum or special collections communities could employ the metadata scheme Categories for the Description of Works of Art (CDWA) or another appropriate metadata schema (VRA or MODS) for their specific user community and collections while also overlapping and interoperating with core Dublin Core descriptive metadata. CDWA is excellent for nuanced and layered metadata, but it is complex and labor-intensive, and it requires substantial knowledge of the object, the collections and the users. This is not always a realistic option for museums or special collections. If one intends to provide detailed metadata, Dublin Core could start up a metadata initiative while one gradually builds the schema most useful for a particular community of users or experts. The notions of the user and collection are paramount.
XML Elements
Selecting elements was difficult. I considered the initial scenario, in which the fictional Victorian scholar designs a special collections exhibition of 19th century ornamental gift books and wants scholarly information as labels and pamphlets to accompany the exhibit. In order to provide a sense of the cultural ambience, she also wanted period furnishings. Finally, she needed scholarly information regarding these particular books, so that her lecture on the exhibition would attract university students as well as scholars. What descriptive metadata elements would retrieve materials helpful to her? The information needs of this Victorian scholar, as the user, were therefore central in deciding on elements. Because I felt Dublin Core elements would provide sufficient descriptive metadata to meet those needs, I applied DC elements. But which elements and qualifiers would supply sufficient metadata for baseline description, identification and potential resource discovery, while also retaining DC's streamlined simplicity? This question directed my thought on the selection of elements and a few qualifiers. With some elements, my documentation extends beyond that of the DC data dictionary. One instance of this is in description, where I raised the issue of about-ness and of-ness. This documentation and related information would become very useful if an additional metadata schema, based on CDWA or VRA, were developed for scholarly art-related communities.

In addition, in my previous schema, I had not addressed the distinction between the publisher and the specific journal that issued the academic paper. In trying to resolve this issue, I decided to use the element publisher as well as the element source. A further complication arose when teasing out how to document the paper's digital transformation and addition to the database JSTOR. The website had been modified once, and the modification dates were clearly documented in the source code, so applying the qualifier dateModified seemed straightforward. The website is also an instructional site, so I included the element instructionalMethod. I found this element particularly useful and ended up applying it to all three objects. Both museums offered extensive instructional activities and materials relating to the topic or the artifact: interactive instructional activities along with teachers' lesson plans, guidance for parents and hands-on activities when visiting the museum. The museum artifact should also have a statement or element for provenance and accrualMethod. This information is essential in any art or rare book collection. And, if one is conducting a project on the history of a particular item or items, these elements will serve the information needs of those users. The problems I ran into were technical, and though I researched them, I recognize that developing a strong base will require much more effort and time. Some of these problems involved terms such as Literal, simpleLiteral and complexLiteral: what role they play and whether they are desirable. These sorts of issues aside, I feel I now have a foundation on which to build more skill with metadata and am aware of resources that support further study.
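The question about simple literals can be made a little more concrete. Broadly, a simple literal is element content that is plain text only, with no child elements, optionally tagged with a language. The fragment below is a simplified sketch of that idea, not the definition used in the DCMI XML schemas: the type name simpleText is hypothetical, and the choice to attach xml:lang is an assumption about what such a type would usefully carry.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Brings in the W3C declaration of the xml:lang attribute. -->
  <xs:import namespace="http://www.w3.org/XML/1998/namespace"
             schemaLocation="http://www.w3.org/2001/xml.xsd"/>

  <!-- Hypothetical type: text-only content with an optional language tag. -->
  <xs:complexType name="simpleText">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute ref="xml:lang" use="optional"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>

  <!-- Illustration only: a title element declared with the type. -->
  <xs:element name="title" type="simpleText"/>

</xs:schema>
```

A record could then carry the language on the value itself, for example `<title xml:lang="en">Regency mahogany sofa</title>`, instead of relying only on a separate language element. Whether that is desirable depends on how multilingual the collection is.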

References
Baker, T. (2000). A grammar of Dublin Core. D-Lib Magazine, 6(10).
Besser, H. (2004). Past, present and future of digital libraries. In S. Schreibman, R. Siemens, & J. Unsworth (Eds.), A companion to digital humanities (pp. 557-575). Oxford: Blackwell.
Greenberg, J. (2012). Understanding metadata and metadata schemes. In R. Smiraglia (Ed.), Metadata: A cataloger's primer. New York: Routledge.
Groth, P. (2012). Requirements for provenance on the Web. The International Journal of Digital Curation, 7(1), 39-56. doi: 10.2218/ijdc.v7i1.213
Harold, E. R. (2004). XML 1.1 bible (3rd ed.). New York: Wiley.
Harpring, P. (2002). The language of images: Enhancing access to images by applying metadata schemas and structured vocabularies. In M. Baca (Ed.), Art image access. Los Angeles: Getty Research Institute.
Hockey, S. (2000). Electronic texts in the humanities. Oxford: Oxford University Press.
Hodgson, C. (2008). Building a metadata schema: Where to start? Washington: NISO.
Hyvönen, E. (2012). Publishing and using cultural heritage linked data on the semantic web. Morgan & Claypool.
Lagoze, C. (1996). The Warwick Framework: A container architecture for diverse sets of metadata. D-Lib Magazine. Retrieved from D-Lib website:
Ledbetter, K. (1996). "BeGemmed and beAmuletted": Tennyson and Those "Vapid" Gift Books. Victorian Poetry, 34(2), 235-245.
J. Paul Getty Trust. (2006). CDWA Lite: Specification for an XML Schema for Contributing Records via the OAI Harvesting Protocol (pp. 33). Los Angeles: J. Paul Getty Trust.
