
Ensuring Consistency of Data in XML Documents

Objectives
In this lesson, you will learn to:
☛ Declare elements and attributes in a Document Type
Definition (DTD)
☛ Create an XML Schema

©NIIT eXtensible Markup Language/Lesson 2/Slide 1 of 49



Problem Statement 2.D.1


☛ The head office of CyberShoppe sends the information
about its products to the branch offices. The product
details must be stored in a consistent format at all
branches. Restrictions must be placed on the kind of
data that can be saved in the data store to ensure
uniformity and consistency of information.
The products sold by CyberShoppe are organized into
two categories, toys and books. The product details
comprise the name of the product, a brief description
about it, the price of the product, and the quantity
available in stock. Every product is uniquely identified
by a product ID.


Task List
☛ Identify the elements required for storing structured
data.
☛ Identify the attributes.
☛ Identify the method for storing consistent data.
☛ Identify the method for declaring elements to be used
for storing structured data.
☛ Identify the method for declaring attributes.
☛ Identify the method to validate the structure of data.


Task List (Contd.)


☛ Declare elements and attributes.
☛ Store data.
☛ Validate the structure of data.


Task 1: Identify the elements required for storing structured data.
Result:
☛ The elements required to store the details about
products sold at CyberShoppe are as follows:
Element       Description
PRODUCTDATA   Indicates that data specific to various products is being stored in the
              document. Acts as the root element for all other elements.
PRODUCT       Represents the details (product name, description, price, and quantity)
              for each product.
PRODUCTNAME   Represents the name of each product.
DESCRIPTION   Represents the description of each product.
PRICE         Represents the price of each product.
QUANTITY      Represents the quantity of each product.


Task 2: Identify the attributes.


Result:
☛ In case of CyberShoppe, you need to store all details
about products in an XML document.
☛ Each product needs to have a unique identification
number for easy identification of a particular product.
Therefore, PRODUCTID can be defined as an
attribute of the PRODUCT element.
☛ The category classifies a product as Book or Toy.
Therefore, CATEGORY can also be defined as an
attribute of the PRODUCT element.


Task 2: Identify the attributes. (Contd.)


☛ The following table specifies the attributes to be
used in the XML document that stores product
details:
Attribute     Description
PRODUCTID     Represents a unique identification value for each product. It must
              be specified for every product.
CATEGORY      Represents the category of a product, and specifies whether a
              product is a TOY or BOOK.
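
As a rough illustration of how the elements and attributes identified so far fit together, a single product entry might look like the fragment below. The product ID, name, description, price, and quantity values are hypothetical and chosen only for illustration; the category spelling TOY follows the table above.

```xml
<!-- One hypothetical PRODUCT entry; all data values are invented for illustration -->
<PRODUCT PRODUCTID="P001" CATEGORY="TOY">
  <PRODUCTNAME>Mini Bus</PRODUCTNAME>
  <DESCRIPTION>A toy bus for children aged five to ten.</DESCRIPTION>
  <PRICE>75</PRICE>
  <QUANTITY>54</QUANTITY>
</PRODUCT>
```

The identifying details (PRODUCTID, CATEGORY) appear as attributes on the start tag, while the descriptive details appear as child elements.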


Task 3: Identify the method for storing consistent data.
Document Type Definition
☛ A DTD defines the structure of the content of an XML
document, thereby allowing you to store data in a
consistent format.
☛ XML allows you to create your own DTDs for
applications.
✓ You can check an XML document against a DTD.
✓ This checking process is called validation.
✓ XML documents that conform to a DTD are
considered valid documents.


Task 3: Identify the method for… (Contd.)


Result:
☛ As a DTD allows you to specify the structure and type
of data elements, a DTD can be created to specify the
structure of the document.


Task 4: Identify the method for declaring elements to be used for storing structured data.
☛ In a DTD, elements are declared by using the following
syntax:
<!ELEMENT elementname (content-type or content-model)>
☛ Elements can be of the following types:
✓ Empty
✓ Unrestricted
✓ Container
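
The three element types listed above can be sketched with the declarations below. This assumes that "empty" corresponds to the DTD keyword EMPTY and "unrestricted" to the keyword ANY; IMAGE and NOTES are hypothetical element names that are not part of the CyberShoppe scenario.

```xml
<!-- Empty: the element has no content (IMAGE is a hypothetical element) -->
<!ELEMENT IMAGE EMPTY>

<!-- Unrestricted: the element may hold text or any declared element (NOTES is hypothetical) -->
<!ELEMENT NOTES ANY>

<!-- Container: the element holds the listed child elements -->
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY)>
```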


Task 4: Identify the method for… (Contd.)


☛ While declaring elements in a DTD, different symbols
can be used to specify whether an element is mandatory
or optional, and whether it can occur more than once.
☛ The following table lists the various symbols that can be
used while defining the DTD:
Symbol  Meaning                    Example                      Description
,       "and" in specific order    PRODUCTNAME, DESCRIPTION     PRODUCTNAME and DESCRIPTION must both
                                                                occur, in that order.
|       "or"                       PRODUCTNAME | DESCRIPTION    Either PRODUCTNAME or DESCRIPTION
                                                                must occur.


Task 4: Identify the method for… (Contd.)


Symbol  Meaning                       Example                       Description
?       Optional; the element can     DESCRIPTION?                  DESCRIPTION need not be present, but if it
        occur at most once.                                         is present, it can occur only once.
*       The element can occur zero    (PRODUCTNAME|DESCRIPTION)*    Any number of PRODUCTNAME or DESCRIPTION
        or more times.                                              elements can be present in any order.
+       The element must occur at     DESCRIPTION+                  DESCRIPTION must occur at least once and
        least once and can occur                                    can occur multiple times.
        multiple times.
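
To make the symbols concrete, the sketch below shows each one in a content model for PRODUCT. The declarations are alternatives shown side by side for comparison only; a real DTD declares each element exactly once, so only one of them would be used.

```xml
<!-- Sequence (,): PRODUCTNAME followed by DESCRIPTION, both required -->
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION)>

<!-- Choice (|): exactly one of PRODUCTNAME or DESCRIPTION -->
<!ELEMENT PRODUCT (PRODUCTNAME | DESCRIPTION)>

<!-- Optional (?): DESCRIPTION may be omitted, but cannot repeat -->
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION?)>

<!-- Zero or more (*): any number of either element, in any order -->
<!ELEMENT PRODUCT (PRODUCTNAME | DESCRIPTION)*>

<!-- One or more (+): at least one DESCRIPTION after the name -->
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION+)>
```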


Task 4: Identify the method for… (Contd.)


☛ As per the given scenario, the type of content for each
element is given in the following table:

Element       Content Type      Description
PRODUCTDATA   Element content   Contains one or more PRODUCT elements.
PRODUCT       Element content   Contains the details of a product, and hence contains other
                                elements such as PRODUCTNAME, DESCRIPTION, PRICE, and QUANTITY.
PRODUCTNAME   Data content      Contains regular text that represents the name of a product.
DESCRIPTION   Data content      Contains regular text that represents the description of a product.
PRICE         Data content      Contains regular text that represents the price of a product.
QUANTITY      Data content      Contains regular text that represents the quantity of a product.


Task 4: Identify the method for… (Contd.)


☛ You need to use the <!ELEMENT> statement for
declaring elements in a DTD.
☛ For example, the PRODUCTNAME element used in the
CyberShoppe scenario can be declared as follows:
<!ELEMENT PRODUCTNAME (#PCDATA)>
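
Extending the same pattern to the other elements in the scenario, the full set of element declarations might look like the sketch below. The order of the child elements inside PRODUCT is an assumption; the lesson lists the children but does not fix their order.

```xml
<!-- Root element: one or more PRODUCT elements -->
<!ELEMENT PRODUCTDATA (PRODUCT+)>

<!-- Each product holds its four details (the order shown is an assumption) -->
<!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY)>

<!-- Leaf elements hold parsed character data (regular text) -->
<!ELEMENT PRODUCTNAME (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT QUANTITY (#PCDATA)>
```

Because a DTD treats PRICE and QUANTITY as plain text, it cannot check that they contain numbers.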


Task 5: Identify the method for declaring attributes.
☛ The syntax for declaring attributes in a DTD is as
follows:
<!ATTLIST elementname attributename valuetype [attributetype] ["default"]>
✓ The attributename valuetype [attributetype] ["default"] section is
repeated as often as necessary to create multiple
attributes for any given element.


Task 5: Identify the method for declaring attributes. (Contd.)
☛ The value types that can be specified for attributes in
a DTD include:
✓ CDATA
✓ ID
✓ (enumerated)
☛ The attribute types are:
✓ #REQUIRED
✓ #FIXED
✓ #IMPLIED


Task 5: Identify the attribute types and… (Contd.)


Result:
☛ In the case of CyberShoppe, the attributes and their
value types will be as follows:
Attribute   Attribute Type   Value Type     Description
PRODUCTID   #REQUIRED        ID             Product ID must have a unique value and has to
                                            be specified for every product.
CATEGORY    #REQUIRED        (enumerated)   Category must be TOY or BOOK.

☛ You need to use the <!ATTLIST> statement for declaring attributes in a DTD.
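
A minimal sketch of such a declaration for the PRODUCT element is shown below. The enumerated values TOY and BOOK follow the attribute table in Task 2; their exact spelling in the course files is an assumption.

```xml
<!-- Both attributes are declared in one ATTLIST for the PRODUCT element -->
<!ATTLIST PRODUCT
  PRODUCTID ID           #REQUIRED
  CATEGORY  (TOY | BOOK) #REQUIRED>
```

An attribute of type ID must hold a value that is a valid XML name (for example, starting with a letter) and that is unique within the document, which is what makes it suitable for PRODUCTID.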

Task 6: Identify the method to validate the structure of data.
☛ To validate the structure of data in an XML document
you need to use parsers.
☛ Parsers are software programs that check the syntax
used in an XML file. There are two types of parsers.
They are:
✓ Non-validating parsers: Check whether an XML
document is well-formed.
✓ Validating parsers: Check for well-formedness and
validity of an XML document.
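
The difference between the two checks can be seen in the small document sketched below, which uses a simplified DTD and hypothetical data. A non-validating parser would accept it because every tag is properly nested and closed; a validating parser would reject it because the required PRICE element is missing.

```xml
<?xml version="1.0"?>
<!DOCTYPE PRODUCT [
  <!-- Simplified rule for illustration: a product needs a name and a price -->
  <!ELEMENT PRODUCT (PRODUCTNAME, PRICE)>
  <!ELEMENT PRODUCTNAME (#PCDATA)>
  <!ELEMENT PRICE (#PCDATA)>
]>
<!-- Well-formed, but not valid: PRICE does not appear -->
<PRODUCT>
  <PRODUCTNAME>Mini Bus</PRODUCTNAME>
</PRODUCT>
```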


Task 6: Identify the method to validate… (Contd.)


Result:
☛ In order to check whether the data sent by the branches
of CyberShoppe conforms to the structure specified in
the DTD, you need a validating parser.


Task 7: Declare elements and attributes.


☛ Internal and External DTDs
✓ You can declare elements and attributes in a DTD.
✓ A DTD can be classified into two types. They are:
➤ Internal DTD
➤ External DTD


Task 7: Declare elements and attributes. (Contd.)


☛ Differences between internal and external DTDs are
given in the following table:
Internal DTD                                 External DTD
This DTD is a part of the XML document.      This DTD is maintained as a separate file. A
                                             reference to this file is included in the XML
                                             document.
This DTD can be used only by the document    This DTD can be used across multiple
in which it is created and cannot be used    documents.
across multiple documents.


Task 7: Declare elements and attributes. (Contd.)


☛ To ensure that the structure of an XML document
conforms to the DTD, you must associate the DTD
with the XML document.
☛ The <!DOCTYPE> declaration is used to define the
internal DTD. It can also be used to reference an
external DTD.
☛ The syntax for defining an internal DTD in an XML
document is as follows:
<!DOCTYPE rootelement
[element and attribute declarations]>
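
As a sketch of the internal form, the document below carries its DTD inside the <!DOCTYPE> declaration. The child element order, the category values, and all data values are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE PRODUCTDATA [
  <!ELEMENT PRODUCTDATA (PRODUCT+)>
  <!ELEMENT PRODUCT (PRODUCTNAME, DESCRIPTION, PRICE, QUANTITY)>
  <!ELEMENT PRODUCTNAME (#PCDATA)>
  <!ELEMENT DESCRIPTION (#PCDATA)>
  <!ELEMENT PRICE (#PCDATA)>
  <!ELEMENT QUANTITY (#PCDATA)>
  <!ATTLIST PRODUCT
    PRODUCTID ID           #REQUIRED
    CATEGORY  (TOY | BOOK) #REQUIRED>
]>
<PRODUCTDATA>
  <!-- Hypothetical data -->
  <PRODUCT PRODUCTID="P001" CATEGORY="TOY">
    <PRODUCTNAME>Mini Bus</PRODUCTNAME>
    <DESCRIPTION>A toy bus for children aged five to ten.</DESCRIPTION>
    <PRICE>75</PRICE>
    <QUANTITY>54</QUANTITY>
  </PRODUCT>
</PRODUCTDATA>
```

The name after <!DOCTYPE must match the root element of the document, here PRODUCTDATA.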


Task 7: Declare elements and attributes. (Contd.)


☛ The syntax for referencing an external DTD in the
XML document is as follows:
<!DOCTYPE rootelement PUBLIC|SYSTEM "path-of-file">
Action:
☛ Type the code for creating the DTD.
☛ Save the file as products.dtd.


Task 8: Store data.


Action:
☛ Write the code for creating the XML document.
☛ Save the file as products.xml.
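
A possible shape for products.xml is sketched below. It assumes that products.dtd is stored in the same folder and declares the elements and attributes identified in the earlier tasks; the product data and the category values TOY and BOOK are hypothetical.

```xml
<?xml version="1.0"?>
<!-- SYSTEM points to the external DTD file by its path -->
<!DOCTYPE PRODUCTDATA SYSTEM "products.dtd">
<PRODUCTDATA>
  <PRODUCT PRODUCTID="P001" CATEGORY="TOY">
    <PRODUCTNAME>Mini Bus</PRODUCTNAME>
    <DESCRIPTION>A toy bus for children aged five to ten.</DESCRIPTION>
    <PRICE>75</PRICE>
    <QUANTITY>54</QUANTITY>
  </PRODUCT>
  <PRODUCT PRODUCTID="P002" CATEGORY="BOOK">
    <PRODUCTNAME>World Atlas</PRODUCTNAME>
    <DESCRIPTION>An illustrated atlas for young readers.</DESCRIPTION>
    <PRICE>40</PRICE>
    <QUANTITY>12</QUANTITY>
  </PRODUCT>
</PRODUCTDATA>
```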


Task 9: Validate the structure of data.


Action:
☛ Open index.htm in Internet Explorer.
☛ Click the DTD Validator link.
☛ Type the name of the XML document that you want to
parse in the text box.
☛ Click the Validate button.


Just a Minute…
The branches of CyberShoppe send information about
books sold by them to the head office. The book details
must be stored in a consistent format. Restrictions must
be placed on the kind of data that can be saved in the data
store to ensure uniformity and consistency of
information. The details of the books sold by
CyberShoppe consist of the name of the book, ISBN of
the book, first and last names of the author of the book,
and the price of the book. The ISBN should be unique
for each book. In addition, you need to ensure that the
book category contains HISTORY, SCIENCE, or
FICTION as its valid values. Create a DTD for declaring
the elements to be used for storing book details in an
XML document.


Introduction to XML Schemas


☛ An XML schema is used to define the structure of an
XML document.
☛ The World Wide Web Consortium (W3C) has standardized a
language that is used to define the schema of an XML
document. This language is called the XML Schema
Definition (XSD) language.


Advantages of XML Schemas over DTDs


☛ Some of the advantages of an XML schema created
by using XSD over DTD are as follows:
✓ XSD provides more control over the type of data
that can be assigned to elements and attributes as
compared to DTD.
✓ DTD does not enable you to define your own
customized data types. XSD enables you to create
your own data types.
✓ XSD also allows you to specify restrictions on data.


Advantages of XML Schemas over DTDs (Contd.)


✓ The syntax for defining a DTD is different from the
syntax used for creating an XML document.
However, the syntax for defining an XSD is the
same as the syntax of the XML document.


Problem Statement 2.D.2


The head office of CyberShoppe sends information about
its products to its branch offices. The product details must
be stored in a consistent format. Restrictions must be
placed on the kind of data that can be saved in the data
store to ensure uniformity and consistency of information.

The product details comprise the name of the product, a
brief description about it, the price of the product, and the
quantity available in stock. The price of the product must
always be greater than zero.


Task List
☛ Identify the elements required to store data.
☛ Identify the data type of the contents of an element.
☛ Identify the method for declaring a simple type
element.
☛ Identify the method for declaring a complex type
element.
☛ Create the XML schema.
☛ Create an XML document conforming to the schema.
☛ Validate an XML document against the schema.


Task 1: Identify the elements required to store data.
Result:
☛ As per the problem, the elements required in the XML
document are:
Element       Description
PRODUCTDATA   This element indicates that data specific to various products is being
              stored in the document. Therefore, it contains more elements and acts
              as the root element.
PRODUCT       Represents the details (product name, description, price, and quantity)
              for each product.
PRODUCTNAME   Represents the name of each product.
DESCRIPTION   Represents the description of each product.
PRICE         Represents the price of each product.
QUANTITY      Represents the quantity of each product.


Task 2: Identify the data type of the contents of an element.
☛ Every element declared in XSD must be associated
with a data type.
☛ XSD provides a list of pre-defined data types.
✓ Primitive Data Types: Fundamental data types of
XSD, such as string, decimal, float, and boolean.
✓ Derived Data Types: Defined by using other data
types.
✓ Atomic Data Types: Data types that cannot be
broken further.
✓ List Data Types: Contain a set of values.
✓ Union Data Types: Derived from list and atomic data
types.
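
The list and union categories are easier to see in markup. The sketch below is illustrative only: the type names quantityList, stockLabel, and stockValue are hypothetical and are not part of the CyberShoppe schema.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- List type: a whitespace-separated set of atomic values, e.g. "10 25 0" -->
  <xsd:simpleType name="quantityList">
    <xsd:list itemType="xsd:nonNegativeInteger"/>
  </xsd:simpleType>

  <!-- Atomic type derived from the primitive type string -->
  <xsd:simpleType name="stockLabel">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="OUTOFSTOCK"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Union type: a value may be either a number or the label above -->
  <xsd:simpleType name="stockValue">
    <xsd:union memberTypes="xsd:nonNegativeInteger stockLabel"/>
  </xsd:simpleType>

</xsd:schema>
```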

Task 2: Identify the data type of the… (Contd.)


☛ XSD also allows definition of custom data types. These
custom data types can be classified as follows:
✓ Simple data type: A data type that contains only
values.
✓ Complex data type: A data type that can contain child
elements, attributes, and mixed content.


Task 2: Identify the data type of the… (Contd.)


Result:
☛ The data type for the contents of the elements will be:
Element       Data Type            Description
PRODUCTDATA   Complex data type    A complex type element that can hold other elements, attributes,
                                   and mixed content. This element will hold a complex data type,
                                   which will be defined in a later section.
PRODUCT       Complex data type    A complex type element that can hold other elements, attributes,
                                   and mixed content. This element will hold a complex data type,
                                   which will be defined in a later section.
PRODUCTNAME   string               A simple type element that contains values of string data type.
DESCRIPTION   string               A simple type element that contains values of string data type.
PRICE         positiveInteger      A simple type element that contains values of positiveInteger
                                   data type (product price must be greater than zero).
QUANTITY      nonNegativeInteger   A simple type element that contains values of
                                   nonNegativeInteger data type.


Task 3: Identify the method for declaring a simple type element.
☛ A simple element does not contain any child elements
or attributes. Simple elements contain only values
such as numbers, strings, and dates.
☛ The syntax for declaring elements with a simple data
type is as follows:
<xsd:element name="element-name" type="data type" />


Task 3: Identify the method for declaring a simple type element. (Contd.)
☛ You can associate an element with a user-defined
simple data type. To do so, you must define the new
simple data type.
☛ You can use the simpleType element of XSD to create
a user-defined simple data type.
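
One way to use simpleType is sketched below: a user-defined type that accepts only decimal values greater than zero. The type name pricetype is hypothetical, and the choice of xsd:decimal as the base type is an assumption made for illustration; the lesson's own result declares PRICE with the built-in xsd:positiveInteger type instead.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- User-defined simple type: a decimal that must be greater than zero -->
  <xsd:simpleType name="pricetype">
    <xsd:restriction base="xsd:decimal">
      <xsd:minExclusive value="0"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- An element associated with the user-defined type -->
  <xsd:element name="PRICE" type="pricetype"/>

</xsd:schema>
```

With this type, a value such as 12.50 would be accepted, while 0 or -3 would be rejected during validation.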


Task 3: Identify the method for declaring a simple type element. (Contd.)
Result:
☛ As per the problem, the simple elements can be
declared in the XSD as follows:
<xsd:element name="PRODUCTNAME"
type="xsd:string"/>
<xsd:element name="DESCRIPTION"
type="xsd:string"/>
<xsd:element name="PRICE"
type="xsd:positiveInteger"/>
<xsd:element name="QUANTITY"
type="xsd:nonNegativeInteger"/>


Task 4: Identify the method for declaring a complex type element.
☛ A complex type element is one that contains other
markup elements, attributes, and mixed content.
☛ To declare a complex type element, you need to first
define a complex data type. After you define a
complex data type, you can declare a complex
element by associating this data type with the
element.
☛ You can define a complex data type by using the
syntax given below:
<xsd:complexType name="data type name">
Content model declaration
</xsd:complexType>

Task 4: Identify the method for declaring a complex type element. (Contd.)
☛ To declare an element as a complex type element, the
element must be associated with a complex data type.
☛ For example, to declare the element PRODUCT as a
complex type element you can associate this element
with the prdt data type as shown below:
<xsd:element name="PRODUCT" type="prdt"/>
Result:
☛ In the CyberShoppe scenario, you require two complex
type elements, PRODUCTDATA and PRODUCT.
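
As a sketch of the first of these, the prdt data type could be defined with a sequence of child elements and then associated with PRODUCT. The order of the children inside the sequence is an assumption.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Complex data type: the children must appear in the order listed -->
  <xsd:complexType name="prdt">
    <xsd:sequence>
      <xsd:element name="PRODUCTNAME" type="xsd:string"/>
      <xsd:element name="DESCRIPTION" type="xsd:string"/>
      <xsd:element name="PRICE" type="xsd:positiveInteger"/>
      <xsd:element name="QUANTITY" type="xsd:nonNegativeInteger"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- Complex type element: PRODUCT is associated with the prdt data type -->
  <xsd:element name="PRODUCT" type="prdt"/>

</xsd:schema>
```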


Task 4: Identify the method for declaring a complex type element. (Contd.)
☛ You can create complex type elements by associating
them with complex data types.
☛ You can use the element element of XSD to declare a
complex type element.
☛ You can use the complexType element of XSD to
create the complex data type.


Task 5: Create the XML Schema.


☛ The Schema element
✓ The integration of the various components of the
XSD is done using the schema element.
✓ The declaration of an XML schema starts with the
<schema> element.
✓ The <schema> element uses the xmlns attribute to
specify the namespace associated with the
document.
Action:
✓ Type the XML Schema in Notepad.
✓ Save the file as product.xsd.
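
A possible shape for product.xsd is sketched below. It assumes the W3C namespace http://www.w3.org/2001/XMLSchema bound to the xsd prefix, a hypothetical type name prdata for the root element, and an assumed order for the child elements; the course's actual file may differ.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Root element, associated with the hypothetical complex type prdata -->
  <xsd:element name="PRODUCTDATA" type="prdata"/>

  <!-- Simple type elements, declared once and referenced below -->
  <xsd:element name="PRODUCTNAME" type="xsd:string"/>
  <xsd:element name="DESCRIPTION" type="xsd:string"/>
  <xsd:element name="PRICE" type="xsd:positiveInteger"/>
  <xsd:element name="QUANTITY" type="xsd:nonNegativeInteger"/>

  <!-- PRODUCTDATA holds one or more PRODUCT elements -->
  <xsd:complexType name="prdata">
    <xsd:sequence>
      <xsd:element name="PRODUCT" type="prdt" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <!-- PRODUCT holds the four details by reference -->
  <xsd:complexType name="prdt">
    <xsd:sequence>
      <xsd:element ref="PRODUCTNAME"/>
      <xsd:element ref="DESCRIPTION"/>
      <xsd:element ref="PRICE"/>
      <xsd:element ref="QUANTITY"/>
    </xsd:sequence>
  </xsd:complexType>

</xsd:schema>
```

The xsd:positiveInteger type on PRICE enforces the requirement that the price must always be greater than zero.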

Task 6: Create an XML document conforming to the schema.
☛ To create a data structure that conforms to the XML
schema, you should create an XML document and
associate it with the XML schema.
☛ In this lesson, the XML file does not reference the
XML schema file directly. The XML file is associated with
the XML schema through a validator.
Action:
✓ Type the code in Notepad.
✓ Save the file as products.xml.


Task 7: Validate an XML document against the schema.
Action:
✓ Open index.htm.
✓ Click the Schema Validator link.
✓ Type the name of the XML document and the XSD
file.
✓ Click the Validate button.


Problem Statement 2.P.2


☛ The details of the books sold by CyberShoppe consist
of the name of the book, the ISBN of the book, the first
and last names of the author of the book, and the
price of the book. The ISBN must start with the letter I
and be followed by three digits.

This data must be validated to ensure that it conforms
to the standards specified in order to maintain data
integrity. Also, the data types used for the data must
be compatible with those used in databases. All data
must be stored in a consistent format.
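
As a hint for the ISBN rule, a constraint of this kind can be expressed with a pattern facet inside a restriction. The sketch below is one possible approach, not the course's solution; the names isbntype and ISBN are hypothetical.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- The letter I followed by exactly three digits, e.g. I123 -->
  <xsd:simpleType name="isbntype">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="I\d{3}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:element name="ISBN" type="isbntype"/>

</xsd:schema>
```

An XSD pattern must match the whole value, so a value such as I1234 or AI123 would be rejected.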


Summary
In this lesson, you learned that:
☛ A Document Type Definition (DTD) is a method for defining
the structure of the data in an XML document.
☛ There are two types of DTD:
✓ Internal DTD: It can be included as a part of the
document.
✓ External DTD: It is stored as a separate file having
the declaration of all elements and attributes that
can be used in an XML document.
☛ There are three types of elements: empty,
unrestricted, and container.


Summary (Contd.)
☛ The <!ELEMENT> statement is used to declare an
element in a DTD.
☛ The <!ATTLIST> statement is used to declare a list of
attributes for an element in a DTD.
☛ The <!DOCTYPE> statement is used in an XML
document to associate the XML document with a
DTD.
☛ Non-validating XML parsers check whether an XML
document is well-formed.
☛ Validating XML parsers are used to validate an XML
document against a DTD or a schema.


Summary (Contd.)
☛ Schema can be used to specify the list of elements
and the order in which these elements must appear in
the XML document.
☛ The language that is used to describe the structure of
the elements in a schema is called the XML Schema
Definition (XSD) language.
☛ The data types supported by schema are of the
following types:
✓ Primitive
✓ Derived
✓ Atomic
✓ List
✓ Union


Summary (Contd.)
☛ The simpleType element of XSD allows you to create
user-defined simple data types.
☛ The complexType element of XSD allows you to
create complex data types.
☛ The restriction element can be used to specify
constraints on values that can be stored in elements
and attributes.
