LBSC 690: Session 5 - Metadata and XML

Jimmy Lin
College of Information Studies, University of Maryland
Monday, October 8, 2007

Blind Men and Elephants
  Is this an elephant?

Metadata
  Literally "data about data"
  "a set of data that describes and gives information about other data" (Oxford English Dictionary)

Data without Metadata
  [Table of unlabeled values: daily dates from 7/1/1988 through 8/5/1988, the code "OL" repeated for every row, and several columns of numeric readings, with "." marking missing values and no column headings]
  can be pretty useless!

  Who: authored it? to contact about data?
  What: are contents of database?
  When: was it collected? processed? finalized?
  Where: was the study done?
  Why: was the data collected?
  How: were data collected?
    processed? verified?

Early Example of Metadata

Encoding Metadata
  Language for expressing metadata should be:
    Universal - so all can understand
    Flexible - to incorporate different types
    Extensible - flexible to custom types
    Simple - to encourage adoption
    Modular - so that schemes can be mixed, extended
  From: Ian Graham, An Introduction to RDF. http://www.utoronto.ca/ian/talks/

Metadata
  How do we encode metadata?
  How do we encode metadata to support interoperability?
  Simple example:
    January 31, 2001
    31 janvier 2001
    2001-01-31
    01-31-2000
    31012000

What is the Dublin Core?
  A metadata standard for describing digital resources
  An initiative to create a digital library card catalog for the Web
  Dublin Core fields (all optional):
    Title
    Description
    Date
    Identifier
    Relation
    Creator
    Publisher
    Type
    Source
    Coverage
    Subject
    Contributor
    Format
    Language
    Rights

What is XML?
  XML = eXtensible Markup Language
  XML is a standard for exchanging structured data
    Provides standardization at the syntactic level
    Does not provide meaning for the tags
  XML is a standard recommended by the W3C

Goals of XML
  Easy to use
  Easy to extend and adapt
  Easy to write programs that use XML
  Support a wide variety of applications
  Should be human legible
  Formal and concise

The Basic Rules
  XML is case sensitive
  All start tags must have end tags
  Elements must be properly nested
  XML declaration is the first statement
  Every document must contain a root element
  Attribute values must have quotation marks
  Certain characters are reserved for parsing: &lt; = <

Questions about XML
  How is XML like HTML?
  How is HTML like XML?
  What's the relationship between XML and structured documents?
  How are the rules governing a structured document encoded?

XML: Historic Perspective
  HTML and the birth of the Web
  HTML is not enough
  Development of XML
  This section contains slides adapted from presentations by Ian Graham: http://www.utoronto.ca/ian/talks/

In the beginning
  The foundations of the Web: HTML, HTTP, URLs
  [Diagram labels: FTP, News, Email; Web Server; Db & other software; HTML (data/display); Internet communication protocols; URLs (location, e.g., http://www.foo.org/); HTTP (transfer)]

Three Core Technologies
  HTTP - HyperText Transfer Protocol
    A protocol for transferring data between machines on the Internet
  URL - Uniform Resource Locator
    A scheme for referencing the specific location of a resource
  HTML - HyperText Markup Language
    A markup language for encoding information to be read by humans
  HTTP and URLs have pretty well stood the test of time. But by 1996, HTML was already showing signs of age ...

HTML
  Started with very few tags
  Language evolved as more tags were added:
    Forms
    Tables
    Fonts
    Frames

Problems with HTML
  Desire for personalized tags
    HTML can't be extended
  Desire to incorporate other types of data
    Mathematics, database entries, literary text, poems, purchase orders
    HTML can't accommodate other types of data
  Desire for automatic processing by software
    HTML is too messy and inconsistent

Back to the Basics
  HTML was defined using SGML
    Standard Generalized Markup Language
    A meta-language for defining languages
    Complex, sophisticated, powerful
    Too difficult to use
  Idea: create a simpler version of SGML
  The birth of XML!

Evolution of XML
  XML can be used to define other languages
  Many XML languages, optimized for different roles
    MathML: for mathematics
    SMIL: for synchronized multimedia
    RSS: for news feeds
    XHTML: HTML by XML rules
    RDF: for the Semantic Web

MathML
  An XML language for defining mathematical formulas
  x² + 4x + 4 = 0
  See http://www.mozilla.org/projects/mathml/demo/tester.html

MathML
  What advantages does it offer?

SMIL
  Synchronized Multimedia Integration Language
  Integration of multimedia with text, audio, video
  Support in RealPlayer

SMIL Example
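The SMIL example shown on this slide is not preserved in the transcript. Purely as an illustration, the sketch below parses a small, hypothetical SMIL-style presentation (the file names and the duration are invented) and lists which media items are grouped to play together. It relies on SMIL's `seq` and `par` containers for sequential and parallel playback.

```python
import xml.etree.ElementTree as ET

# Hypothetical SMIL-style document; file names and the duration are made up.
SMIL_DOC = """
<smil>
  <body>
    <seq>
      <par>
        <img src="slide1.png" dur="10s"/>
        <audio src="narration1.mp3"/>
      </par>
      <par>
        <video src="demo.mpg"/>
        <text src="captions.txt"/>
      </par>
    </seq>
  </body>
</smil>
"""

def parallel_groups(smil_text):
    """Return one list per <par> block, holding (media type, source) pairs."""
    root = ET.fromstring(smil_text)
    return [[(item.tag, item.get("src")) for item in par]
            for par in root.iter("par")]

groups = parallel_groups(SMIL_DOC)
for number, group in enumerate(groups, start=1):
    print(f"Step {number} plays together: {group}")
```

The point is only that synchronization is expressed declaratively in the markup, so any XML-aware program can inspect it.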

RSS
  RSS = Really Simple Syndication or Rich Site Summary
  An XML format for distributing news headlines on the Web
  See example at http://www.nytimes.com/services/xml/rss/

XHTML: Beyond HTML

Title of text XHTML Document

Heading of Page

here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy.

Here is another paragraph with inline emphasized text, and absolutely no sense of humor.

And another paragraph, this one with an image (alt text: waste of time), and a
line break.
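Because XHTML follows XML's rules, an ordinary XML parser can read it. The sketch below uses an assumed reconstruction of a page like this one (the original markup is not preserved in this transcript, so the tags and the file name `junk.gif` are guesses) and contrasts it with HTML-style markup that leaves tags unclosed.

```python
import xml.etree.ElementTree as ET

# Assumed XHTML-style markup: every element closed, attribute values quoted.
XHTML_PAGE = """<html>
  <head><title>Title of text XHTML Document</title></head>
  <body>
    <h1>Heading of Page</h1>
    <p>Here is another paragraph with <em>inline emphasized</em> text.</p>
    <p>And another paragraph, this one with an
       <img src="junk.gif" alt="waste of time"/> image, and a <br/> line break.</p>
  </body>
</html>"""

# Similar content written HTML-style: <p> and <br> are never closed.
SLOPPY_HTML = "<html><body><p>First paragraph<p>Second paragraph<br></body></html>"

def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(XHTML_PAGE))   # True
print(is_well_formed(SLOPPY_HTML))  # False
```

A browser would tolerate the second fragment, but an XML parser rejects it outright, which is the strictness that makes automatic processing easier.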

XHTML
  Just like HTML, but based on XML rules
  Will support integration of different data into a single document

XHTML and other Data

Title of XHTML Document

Heading of Page

MathML markup

more html stuff goes here

SMIL markup
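One way such a mixed document can be written is with XML namespaces, which label each element with the dialect it belongs to. The sketch below is an assumption about what the markup might look like (the slide's own markup is not preserved here). It embeds a MathML fragment for x² + 4x + 4 = 0 in an XHTML page and counts the elements belonging to each dialect.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XHTML_NS = "http://www.w3.org/1999/xhtml"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Assumed markup: an XHTML page containing a MathML island.
MIXED_DOC = f"""<html xmlns="{XHTML_NS}">
  <head><title>Title of XHTML Document</title></head>
  <body>
    <h1>Heading of Page</h1>
    <math xmlns="{MATHML_NS}">
      <mrow>
        <msup><mi>x</mi><mn>2</mn></msup>
        <mo>+</mo><mn>4</mn><mi>x</mi>
        <mo>+</mo><mn>4</mn>
        <mo>=</mo><mn>0</mn>
      </mrow>
    </math>
    <p>more html stuff goes here</p>
  </body>
</html>"""

def count_by_namespace(markup):
    """Count elements per namespace URI ('' for elements with no namespace)."""
    root = ET.fromstring(markup)
    counts = Counter()
    for element in root.iter():
        # ElementTree stores qualified names as "{namespace-uri}localname".
        tag = element.tag
        namespace = tag[1:].split("}")[0] if tag.startswith("{") else ""
        counts[namespace] += 1
    return counts

counts = count_by_namespace(MIXED_DOC)
print("XHTML elements:", counts[XHTML_NS])
print("MathML elements:", counts[MATHML_NS])
```

Software that only understands XHTML can skip the elements in the MathML namespace, and a math renderer can do the reverse.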

Demo at: http://www.umiacs.umd.edu/~jimmylin/LBSC690-2007-Fall/XML-demo/math-demo.xml

And Others
  CML: Chemical Markup Language
  CellML: biological models
  BSML: bioinformatic sequences
  MAGE-ML: Microarray Gene Expression
  XSTAR: for archaeological research
  MARCXML: MARC in XML
  AML: astronomy markup language
  SportsML: for sharing sports data

The XML Family Tree
  [Diagram of a family tree with nodes: SGML, XML, HTML, XHTML, SMIL, SpeechML, MathML, TEI, XUL, RDF, ...]

Mixing XML Dialects
  XML is designed to support the integration of multiple standards
  Allows users to mix elements from different standards
  Snapping together XML dialects like Lego pieces
  Based on the notion of namespaces

Example

  xmlns:dc="http://purl.org/dc/elements/1.1/">
    XML.com
    http://xml.com/pub
    XML.com features a rich mix of information and services for the XML community.
    XML, RDF, metadata, information syndication services
    http://www.xml.com
    O'Reilly & Associates, Inc.
    Copyright 2000, O'Reilly & Associates, Inc.
  Example from http://www.xml.com/pub/a/2000/10/25/dublincore/

Interoperability
  What does it mean and what's the role of XML?
  XML as a universal format for data interchange
    Software exchange data as XML-format messages
  Advantages?
    Eliminates proprietary data formats
    Promotes interoperability
    Encourages cooperation
    Leverages lots of existing XML processing software
  Interoperability slides adapted from presentations by Ian Graham: http://www.utoronto.ca/ian/talks/

XML Messaging
  [Diagram: a Factory and three Suppliers, connected by "Place order" and "Response" messages]

XML Messaging
  [Diagram: four Databases, connected by "Send/request data" and "Request/send data" messages]

Example Message
  Gold sprockel grommets, with matching hamster
  12
  .
  Order something else ..
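The markup of the example message is not preserved in this transcript. As a hedged sketch of XML messaging, the code below builds an order message on the sender's side and reads it back as a receiver would. The element names (`order`, `item`, `description`, `quantity`) are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def build_order(description, quantity):
    """Sender side: serialize an order as an XML message (element names are hypothetical)."""
    order = ET.Element("order")
    item = ET.SubElement(order, "item")
    ET.SubElement(item, "description").text = description
    ET.SubElement(item, "quantity").text = str(quantity)
    return ET.tostring(order, encoding="unicode")

def read_order(message):
    """Receiver side: parse the message and pull out the fields it needs."""
    order = ET.fromstring(message)
    return [
        (item.findtext("description"), int(item.findtext("quantity")))
        for item in order.findall("item")
    ]

message = build_order("Gold sprockel grommets, with matching hamster", 12)
print(message)
print(read_order(message))
```

Sender and receiver share only the message format, not each other's internal data structures, which is what allows a factory and its suppliers to run different software.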

The next best thing since
  What's the big deal about XML?
  What does XML not do?
  How do XML tags acquire meaning?
  How do standards arise?

What's wrong with the Web?
  It was meant for humans, not machines
  The current Web contains only data, not knowledge
  From Web of data to Web of knowledge
  Difficult to
    Aggregate/compare data across sites
    Delegate complex tasks to agents
    Formulate complex queries involving multiple constraints

The Semantic Web
  [Diagram labels: REALITY, COMPUTER DOMAIN, knowledge layer, information layer; nodes Puccini, Tosca, Madame Butterfly, Lucca; relations "composed by" and "born in"]
  Slide from http://www.ontopia.net/

Web 2.0
  Tagging (folksonomy)
  Blogging
  The Long Tail
  Web services
  Wikipedia

Back to the elephant
  Concepts covered:
    Metadata
    XML
    Semantic Web
  Questions? Confused?
