XML Schema Part I: Introduction

XML Schema

XML itself does not restrict what elements exist in a document. In a given application, you want to fix a vocabulary: what elements make sense, what their types are, etc. Use a schema to define an XML dialect (MusicXML, ChemXML, VoiceXML, ADXML, etc.).

Restrict documents to those tags. A schema can be used to validate a document, i.e. to see if it obeys the rules of the dialect. Schemas determine:
  What sort of elements can appear in the document
  What elements MUST appear
  Which elements can appear as part of another element
  What attributes can appear or must appear
  What kind of values can/must be in an attribute

We start with a sample XML document and reverse engineer a schema as a simple example.

Sample document (element content only):
  0836217462
  Being a Dog is a Full-Time Job
  Charles Schulz, 1922-11-26, 2000-02-12
  Peppermint Patty, 1966-08-22, bold, brash, and tomboyish
  Snoopy, 1950-10-04, extroverted beagle
  Schroeder, 1951-05-30, brought classical music to the Peanuts Strip
  Lucy, 1952-03-03, bossy, crabby, and selfish

First identify the elements: author, book, born, character, dead, isbn, library, name, qualification, title

Next categorize by content model:
  Empty: contains nothing
  Simple: only text nodes
  Complex: only sub-elements
  Mixed: text nodes + sub-elements
Note: the content model is independent of comments, attributes, or processing instructions!

Content models

Simple content model: name, born, title, dead, isbn, qualification
Complex content model: library, character, book, author

Content Types
We further distinguish between complex and simple content types:
  Simple type: an element with only text nodes and no child elements or attributes
  Complex type: all other cases

We also say (and require) that all attributes themselves have simple type.

Content Types
  Simple content type: name, born, dead, isbn, qualification
  Complex content type: library, character, book, author, title

Building the schema
Schemas are XML documents.

They must contain a schema root element as such: ... ...
We will discuss details in a bit; note for now that the yellow part can be excluded.

Flat schema for library
Start by defining all of the simple types (including attributes):
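The original listing is not present here. As a minimal sketch of what the simple-type declarations for the library example could look like: the element names come from the list above and `lang` is the attribute mentioned in the text, but the chosen datatypes (`xs:date`, `xs:language`, and so on) are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Elements with simple type: text only, no children, no attributes -->
  <xs:element name="name" type="xs:string"/>
  <xs:element name="qualification" type="xs:string"/>
  <xs:element name="born" type="xs:date"/>
  <xs:element name="dead" type="xs:date"/>
  <!-- xs:string (assumed) keeps the leading zero of 0836217462 -->
  <xs:element name="isbn" type="xs:string"/>

  <!-- Attributes always have a simple type -->
  <xs:attribute name="lang" type="xs:language"/>
</xs:schema>
```

Everything is declared globally (directly under the root), which is what makes the schema "flat".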

Complex types with simple content
Now to complex types. The element named title (for example, the one containing "Being a Dog is a Full-Time Job") has a complex type with simple content, obtained by extending the predefined datatype xs:string with the attribute named lang that is defined in this schema.

Complex Types
All other types are complex types with complex content. For example:
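The two kinds of complex type just described might be declared as follows. This is a sketch under assumptions: the child order of `character` (name, born, qualification) is inferred from the sample data, and the datatypes are guesses rather than the original listing.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global declarations referenced below -->
  <xs:element name="name" type="xs:string"/>
  <xs:element name="born" type="xs:date"/>
  <xs:element name="qualification" type="xs:string"/>
  <xs:attribute name="lang" type="xs:language"/>

  <!-- Complex type with simple content: text plus the lang attribute -->
  <xs:element name="title">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute ref="lang"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <!-- Complex type with complex content: child elements only -->
  <xs:element name="character">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="name"/>
        <xs:element ref="born"/>
        <xs:element ref="qualification"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```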

Same schema but with everything defined locally!

Dissecting Schema

What's in a Schema?
  A schema is an XML document (a DTD is not).
  Because it is an XML document, it must have a root element.
  The root element is <schema>.
  Within the root element, there can be:
    Any number and combination of
      Inclusions
      Imports
      Re-definitions
      Annotations
    Followed by any number and combination of
      Simple and complex data type definitions
      Element and attribute definitions
      Model group definitions
      Annotations

Structure of a Schema
... ... ... ... ... ... ... ... ...

Simple Types

Elements
What is an element with simple type?
A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes. You can also add restrictions (facets) to a data type in order to limit its content, and you can require the data to match a defined pattern.

Example Simple Element
The syntax for defining a simple element is <xs:element name="xxx" type="yyy"/>, where xxx is the name of the element and yyy is the data type of the element.
Here are some XML elements (content only): Refsnes, 37, 1968-03-27
And here are the corresponding simple element definitions:
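The definitions themselves are missing at this point. A sketch of what they could look like for the three values shown: the element names `lastname`, `age`, and `dateborn` are assumptions chosen to fit the values, not names given in the text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Matches an element such as <lastname>Refsnes</lastname> -->
  <xs:element name="lastname" type="xs:string"/>
  <!-- Matches an element such as <age>37</age> -->
  <xs:element name="age" type="xs:integer"/>
  <!-- Matches an element such as <dateborn>1968-03-27</dateborn> -->
  <xs:element name="dateborn" type="xs:date"/>
</xs:schema>
```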

Common XML Schema Data Types
XML Schema has a lot of built-in data types. Here is a list of the most common types:
  xs:string
  xs:decimal
  xs:integer
  xs:boolean
  xs:date
  xs:time

Declare Default and Fixed Values for Simple Elements
Simple elements can have a default value OR a fixed value set.
A default value is automatically assigned to the element when no other value is specified. In the following example the default value is "red":
A fixed value is also automatically assigned to the element. You cannot specify another value. In the following example the fixed value is "red":
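A minimal sketch of both cases, using the `default` and `fixed` attributes of `xs:element`. The element names `color` and `signalColor` are hypothetical; two names are used only so that both declarations can sit in one schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Default: an empty <color/> is treated as "red",
       but any other string may be supplied -->
  <xs:element name="color" type="xs:string" default="red"/>

  <!-- Fixed: the value is always "red"; any other content is invalid -->
  <xs:element name="signalColor" type="xs:string" fixed="red"/>
</xs:schema>
```

A declaration may carry `default` or `fixed`, but not both.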

Attributes (Another simple type)
All attributes are declared as simple types.
Only complex elements can have attributes!

What is an Attribute?
Simple elements cannot have attributes. If an element has attributes, it is considered to be of complex type. But the attribute itself is always declared as a simple type. This means that an element with attributes always has a complex type definition.

How to Define an Attribute
The syntax for defining an attribute is <xs:attribute name="xxx" type="yyy"/>, where xxx is the name of the attribute and yyy is the data type of the attribute.
Here is an XML element with an attribute (content only): Smith
And here is a corresponding simple attribute definition:

Declare Default and Fixed Values for Attributes
Attributes can have a default value OR a fixed value specified.
A default value is automatically assigned to the attribute when no other value is specified. In the following example the default value is "EN":
A fixed value is also automatically assigned to the attribute. You cannot specify another value. In the following example the fixed value is "EN":

Creating Optional and Required Attributes
All attributes are optional by default. To explicitly specify that the attribute is optional, use the "use" attribute:
To make an attribute required:
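The attribute variants described above can be sketched in a single declaration. The element name `lastname` and all attribute names are hypothetical, with a separate name per variant so the fragment stays valid. Note that `use` is only allowed on attributes declared or referenced inside a complex type, not on global attribute declarations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Text content plus attributes, so the element's type is complex,
       while every attribute itself has a simple type -->
  <xs:element name="lastname">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <!-- Plain declaration: optional by default -->
          <xs:attribute name="lang" type="xs:string"/>
          <!-- Default value, used when the attribute is absent -->
          <xs:attribute name="langDefault" type="xs:string" default="EN"/>
          <!-- Fixed value: no other value is accepted -->
          <xs:attribute name="langFixed" type="xs:string" fixed="EN"/>
          <!-- Explicitly optional -->
          <xs:attribute name="langOptional" type="xs:string" use="optional"/>
          <!-- Required: the element is invalid without it -->
          <xs:attribute name="langRequired" type="xs:string" use="required"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, `<lastname lang="EN" langRequired="EN">Smith</lastname>` would be a valid instance.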

Restrictions
As we will see later, simple types can have ranges put on their values. These are known as restrictions.

Complex Types

Complex Elements
A complex element is an XML element that contains other elements and/or attributes.
There are four kinds of complex elements:
  empty elements
  elements that contain only other elements
  elements that contain only text
  elements that contain both other elements and text
Note: Each of these elements may (or must) contain attributes as well!

Examples of Complex XML Elements
A complex XML element, "product", which is empty:
A complex XML element, "employee", which contains only other elements (content only): John Smith
A complex XML element, "food", which contains only text (content only): Ice cream
A complex XML element, "description", which contains both elements and text (content only): It happened on 03.03.99..

An Example XML Schema
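The example schema itself is not reproduced here. The following sketch is consistent with the header discussion that follows, which names the elements note, to, from, heading, and body and the namespace http://www.w3schools.com. The `xs:string` types and the use of a sequence are assumptions.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">

  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```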

Referencing XML Schema in XML documents

Sample Schema header
The <schema> element may contain some attributes. A schema declaration often looks something like this: ... ...

Schema headers, cont.
The following fragment:
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
indicates that the elements and data types used in the schema (schema, element, complexType, sequence, string, boolean, etc.) come from the "http://www.w3.org/2001/XMLSchema" namespace. It also specifies that the elements and data types that come from the "http://www.w3.org/2001/XMLSchema" namespace should be prefixed with xs:

Schema header, cont.
This fragment:
  targetNamespace="http://www.w3schools.com"
indicates that the elements defined by this schema (note, to, from, heading, body) come from the "http://www.w3schools.com" namespace.
This fragment:
  xmlns="http://www.w3schools.com"
indicates that the default namespace is "http://www.w3schools.com".
This fragment:
  elementFormDefault="qualified"
indicates that any elements used by the XML instance document which were declared in this schema must be namespace qualified.

Referencing schema in XML
This XML document has a reference to an XML Schema:
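A sketch of such an instance document, assembled from the pieces named in the surrounding text: the element names note, to, from, heading, and body, the namespace declarations, and the `xsi:schemaLocation` value. The assignment of each text value to a particular element is an assumption.

```xml
<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
```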

Tove Jani Reminder Don't forget me this weekend!

Referencing schema in XML, cont.
The following fragment:
  xmlns="http://www.w3schools.com"
specifies the default namespace declaration. This declaration tells the schema-validator that all the elements used in this XML document are declared in the "http://www.w3schools.com" namespace.
Once you have the XML Schema Instance namespace available:
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
you can use the schemaLocation attribute. This attribute has two values, separated by a space. The first value is the namespace to use. The second value is the location of the XML schema to use for that namespace:
  xsi:schemaLocation="http://www.w3schools.com note.xsd"

Using References

You don't have to define the content of an element in the nested fashion just shown. You can define the element globally and use a reference to it instead.
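A minimal sketch of the idea, using hypothetical `rooms` and `room` elements (this is not the original Rooms schema): the inner element is declared once at the top level and then pulled in with `ref`.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Global declaration: defined once, directly under the root -->
  <xs:element name="room" type="xs:string"/>

  <xs:element name="rooms">
    <xs:complexType>
      <xs:sequence>
        <!-- Reference to the global element instead of a nested definition -->
        <xs:element ref="room" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Occurrence constraints such as `maxOccurs` go on the reference, not on the global declaration.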

Rooms Schema using References

Types
Both elements and attributes have types, which are defined in the schema. One can reuse types by giving them names.
OR

Other XML Schema Features
  Foreign key facility (uses XPath)
  Rich datatype facility
    Build up datatypes by inheritance
    Don't need to list all of the attributes (can say "these attributes plus others")
    Restrict strings using regular expressions
  Namespace aware
    Can restrict the location of an element based on a namespace

Restrictions

Datatype Restrictions
A DTD can only say that price can be any non-markup text. Translated to a schema, that looks like this:

But in a schema you can do better:
Or even, make your own restrictions

Restriction Ranges
The restrictions must be "derived" from a base type, so it's object based.
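A sketch of a range restriction of this kind: a type derived from `xs:integer` with two facets, so that the only valid value is 42. The element name `answer` is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="answer">
    <xs:simpleType>
      <!-- Derived from the built-in integer type -->
      <xs:restriction base="xs:integer">
        <!-- Facet 1: the value must be greater than 41 -->
        <xs:minExclusive value="41"/>
        <!-- Facet 2: the value must be less than 43 -->
        <xs:maxExclusive value="43"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```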

The preceding type is "derived" from "integer".
It has 2 restrictions (called "facets"):
  The first says that the value must be greater than 41
  The second says that the value must be less than 43
The XML file is "42"

xsi:noNamespaceSchemaLocation="LifeUniverseEverything.xsd">42

Facet            Description
enumeration      Defines a list of acceptable values
fractionDigits   The maximum number of decimal places allowed (must be >= 0)
length           The exact number of characters or list items allowed (must be >= 0)
maxExclusive     The upper bound for numeric values (the value must be less than the value specified)
maxInclusive     The upper bound for numeric values (the value must be less than or equal to the value specified)
maxLength        The maximum number of characters or list items allowed (must be >= 0)
minExclusive     The lower bound for numeric values (the value must be greater than the value specified)
minInclusive     The lower bound for numeric values (the value must be greater than or equal to the value specified)
minLength        The minimum number of characters or list items allowed (must be >= 0)
pattern          The sequence of acceptable characters, based on a regular expression

Enumeration Facet

Patterns (Regular Expressions)
One interesting facet is the pattern, which allows restrictions based on a regular expression.
This regular expression specifies a normal word of one or more characters:

Patterns (Regular Expressions)
Individual characters may be repeated a specific number of times in the regular expression.
The following regular expression restricts the string to exactly 8 alpha-numeric characters:
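Both pattern restrictions can be sketched as named simple types. The type names are hypothetical. The first expression, `[a-zA-Z]+`, is only one plausible reading of "a normal word", and `[a-zA-Z0-9]{8}` is the usual way to write "exactly 8 alpha-numeric characters".

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- One or more letters (assumed meaning of "a normal word") -->
  <xs:simpleType name="WordType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z]+"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Exactly 8 characters, each a letter or a digit -->
  <xs:simpleType name="EightAlnumType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-zA-Z0-9]{8}"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

A pattern in XML Schema must match the whole value, so no `^` or `$` anchors are needed.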

Whitespace facet
The "whiteSpace" facet controls how white space in the element will be processed.
There are three possible values for the whiteSpace facet:
  "preserve" causes the processor to keep all whitespace as-is
  "replace" causes the processor to replace each whitespace character (tab, carriage return, line feed) with a space character
  "collapse" causes the processor to replace every run of whitespace characters (tabs, carriage returns, line feeds, spaces) with a single space character, and to remove leading and trailing whitespace

Types
Both elements and attributes have types, which are defined in the schema. One can reuse types by giving them names.
Addr.xsd:
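The contents of Addr.xsd are not shown here. As an illustration only, a named type reused by two elements might look like the sketch below. Every name in it (`AddressType`, `street`, `office`, `zip`, `home`, `work`) is hypothetical, not taken from the original file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- A named complex type can be reused by any number of elements -->
  <xs:complexType name="AddressType">
    <xs:sequence>
      <xs:element name="street" type="xs:string"/>
      <xs:element name="office" type="xs:string"/>
      <xs:element name="zip" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Two elements sharing the same named type -->
  <xs:element name="home" type="AddressType"/>
  <xs:element name="work" type="AddressType"/>
</xs:schema>
```

The alternative is an anonymous type: the `xs:complexType` is nested directly inside one `xs:element` and cannot be reused elsewhere.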

OR

Types
The usage in the XML file is identical:
xsi:noNamespaceSchemaLocation="Address-WithTypeName.xsd" >
1108 E. 58th St. Ryerson 155 60637
1108 E. 58th St. Ryerson 155 60637

Type Extensions
A third way of creating a complex type is to extend another complex type (like OO inheritance).
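A sketch of extension with hypothetical types: `PersonType` is the base, and `KnightType` extends it with further elements. None of these names are from the original slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Base type -->
  <xs:complexType name="PersonType">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Derived type: everything in PersonType, followed by the additions -->
  <xs:complexType name="KnightType">
    <xs:complexContent>
      <xs:extension base="PersonType">
        <xs:sequence>
          <xs:element name="order" type="xs:string"/>
          <xs:element name="castle" type="xs:string"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:element name="knight" type="KnightType"/>
</xs:schema>
```

An instance of `knight` then contains `name`, `order`, and `castle` in that order: the inherited content first, then the extension.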

Type Extensions (use)
To use a type that is an extension of another, it is as though it were all defined in a single type.
King Arthur
Round Table
Camelot England

Simple Content in Complex Type
If a type contains only simple content (text and attributes), a <simpleContent> element can be put inside the <complexType>. The <simpleContent> element must have either an <extension> or a <restriction>.

This example is from the (Bridge of Death) Episode Dialog:

Model Groups
Model groups are used to define an element that has
  mixed content (elements and text mixed)
  element content
Model groups can be
  all: the elements specified must all be there, but in any order
  choice: only one of the elements specified may appear (any one of them, but not several)
  sequence: all of the elements specified must appear in the specified order

"All" Model Group
The following schema specifies 3 elements and mixed content.
The following XML file is valid in the above schema:
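The schema for this slide is missing. A sketch of what an `xs:all` group with mixed content could look like: the element names `film`, `title`, `published`, and `author` are hypothetical, guessed from the labels in the text that follows.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="film">
    <!-- mixed="true" allows text between the child elements -->
    <xs:complexType mixed="true">
      <!-- xs:all: each child appears once, in any order -->
      <xs:all>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="published" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, labels such as "Title:" would be the free text permitted by `mixed="true"`, with each value wrapped in its own child element.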

Title: The Holy Grail Published: Moose Author: Monty Python

Attributes
The attribute declaration is part of the type of the element.

Attributes

If an attribute type is more complicated than a basic type, then we spell out the type in a type declaration.

XML Schema Status

Became a W3C recommendation in Spring 2001.
  World domination expected imminently.
  Supported in Xalan.
  Supported in XMLSpy and other editors/validators.
On the other hand:
  More complex than DTDs.
  Ultra verbose.

Validating a Schema
By using Xeena or XMLSpy or XML Notepad.
  When publishing hand-written XML docs, this is the way to go.
By using a Java program that performs validation.
  When validating on-the-fly, you must do it this way.

Some guidelines for Schema design

Designing a Schema
Analogous to database schema design: look for intuitive names.
Can start with an E-R diagram, and then convert
  Attributes to Attributes
  Subobjects to Subelements
  Relationships to IDREFS
Normalization? It still makes sense to avoid repetition whenever possible.
  If you have an Enrolment document, only list Ids of students, not their names.
  Store names in a separate document.
  Leave it to tools to connect them.

Designing a Schema (cont.)
Difficulties:
  Many more degrees of freedom than with database schemas: e.g. one can associate information with something by including it as an attribute or as a subelement.

Martin Sheen 4145

ELEMENTS are more extensible; use them when there is a possibility that more substructure will be added.
ATTRIBUTES are easier to search on.

Rules for Designing a Schema
Never leave structure out. The following is definitely a bad idea:
Martin Sheen 1222 Alameda Drive, Carmel, CA 40145
Better would be:
city=Carmel state=CA zip=40145 />
Or:
Martin Sheen 1222 Alameda Drive Carmel CA 40145
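The three variants can be sketched side by side. All element and attribute names here are hypothetical, apart from `city`, `state`, and `zip`, which appear in the text above. Only the data values come from the original.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<people>
  <!-- Bad idea: the address is one unstructured string -->
  <person>
    <name>Martin Sheen</name>
    <address>1222 Alameda Drive, Carmel, CA 40145</address>
  </person>

  <!-- Better: the structure is exposed as attributes -->
  <person name="Martin Sheen">
    <address street="1222 Alameda Drive" city="Carmel" state="CA" zip="40145"/>
  </person>

  <!-- Or: the structure is exposed as sub-elements -->
  <person>
    <name>
      <first>Martin</first>
      <last>Sheen</last>
    </name>
    <address>
      <number>1222</number>
      <street>Alameda Drive</street>
      <city>Carmel</city>
      <state>CA</state>
      <zip>40145</zip>
    </address>
  </person>
</people>
```

With either structured form, a schema can type and validate each part (for example, the zip code) separately.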
More Rules for Designing a Schema
When to use Elements (instead of Attributes):
  Do not put large text blocks inside an attribute (Bad Idea).
  Elements are more flexible, so use an Element if you think you might have to add more substructure later on.

More Rules for Designing Schemas
More on when to use Elements (instead of Attributes):
  Use an embedded element when the information you are recording is a constituent part of the parent element.
    One's head and one's height are both inherent to a human being; you can't be a conventionally structured human being without having a head and having a height.
    One's head is a constituent part and one's height isn't: you can cut off my head, but not my height.
  Use embedded elements for complex structure validation (obvious).
  Use embedded elements when you need to show order (attributes are not ordered).

More Rules for Designing Schemas
When to use Attributes instead of Elements:
  Use an attribute when the information is inherent to the parent but not a constituent part (height instead of head).
  Use attributes to stress the one-to-one relationship among pieces of information, to stress that the element represents a tuple of information.
    This is a dangerous rule, though. It leads to the extreme formulation that an element can have a TITLE= attribute,
    and then to the conclusion that it really ought to have a CONTENT= attribute too.
    Then you find yourself writing the entire document as an empty element with an attribute value as long as the Quest for the Holy Grail.
  Use attributes for simple datatype validation (obviously).

Bookings XML Schema
Note that there are four global types in this document!

Bookings, cont.

Bookings, cont.

An Example Bookings Document
Reverse engineer a reasonable schema for the following sample XML file (content only):
  2003 April 1
  Democratic Party, Green Room
  Republican Party, Red Room
