BA372 - XML & Web Services
- Notice how the markup tags convey meaning, yet say nothing
about how this information must be displayed.
- Also notice how the concepts are hierarchical, i.e., 'nested':
  - a collection can contain more than one painting.
  - titles and artists appear only within paintings.
- This strict hierarchy is easy for programs to 'parse.'
- An XML document or fragment can be represented as a tree
(similar to a directory/folder tree).
- Problem: Can you draw the <collection> tree?
- Trees are 'nice' data structures to represent and parse.
- The Document Object Model (DOM) is a W3C-governed standard for representing XML documents as 'trees.'
- An entire XML document can be read into a tree and stored
in memory; e.g., as a DOM object.
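As a rough illustration of reading a whole document into an in-memory tree, the sketch below uses Python's standard xml.dom.minidom module. The sample markup is an assumption: only the element names (collection, painting, title, artist) come from these notes, and the helper name outline is made up for this example.

```python
from xml.dom import minidom

# Assumed sample document; only the element names come from the notes.
COLLECTION_XML = """<collection>
  <painting>
    <title>Mona Lisa</title>
    <artist>Leonardo da Vinci</artist>
  </painting>
  <painting>
    <title>The Starry Night</title>
    <artist>Vincent van Gogh</artist>
  </painting>
</collection>"""


def outline(node, depth=0):
    """Return the element tree as a list of indented lines (hypothetical helper)."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        # Collect the text that sits directly inside this element.
        text = "".join(
            child.data for child in node.childNodes
            if child.nodeType == child.TEXT_NODE
        ).strip()
        label = node.tagName + (": " + text if text else "")
        lines.append("  " * depth + label)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    return lines


dom = minidom.parseString(COLLECTION_XML)  # the whole document now lives in memory
tree_lines = outline(dom.documentElement)
print("\n".join(tree_lines))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the markup, which is the same picture as a directory/folder tree.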
- When the document's elements nest properly, i.e., they form a single tree under one root element, the document is said to be well formed.
- Only well-formed documents can be reliably parsed ==> XML makes things very rigid & predictable ==> easy for programs to parse and 'take apart.'
- Problem: What are some examples of XML that is not well formed?
- Problem: Make an XML document, introduce some well-formedness errors, and try to open the 'broken' document with your browser. What do you observe?
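A program sees much the same thing a browser does. The sketch below, which assumes Python's standard xml.etree.ElementTree parser, feeds it a few hypothetical fragments; the fragments and the helper name is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragments; each broken one violates a different rule.
SAMPLES = {
    "well formed": "<painting><title>Mona Lisa</title></painting>",
    "overlapping tags": "<painting><title>Mona Lisa</painting></title>",
    "missing close tag": "<painting><title>Mona Lisa</title>",
    "two root elements": "<painting/><painting/>",
    "unquoted attribute": "<painting id=1/>",
}


def is_well_formed(text):
    """Return True if the parser can build a tree from the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


results = {name: is_well_formed(text) for name, text in SAMPLES.items()}
for name, ok in results.items():
    print(f"{name}: {'parsed' if ok else 'rejected'}")
```

Note that the parser does not try to guess what was meant: anything that does not form a single, properly nested tree is refused outright.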
- Since XML (like HTML) is text-based, it is compatible across
computing platforms.
- Problem: Who determines which 'tags' or 'elements' are available? What if I would like to create some XML to represent course student enrollment? Which XML elements are there for me to use?
- Note how an element or 'node' such as <section> can
have attributes, similar to attributes in HTML tags.
- Problem: When should something be an attribute rather than an element?
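To make the difference concrete, here is a minimal sketch (again assuming Python's xml.etree.ElementTree) of how a program reads an attribute versus a child element. The enrollment markup is hypothetical. One common rule of thumb, though not the only one, is that an attribute holds a single flat value while an element can repeat and carry structure of its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical enrollment fragment; the names follow the notes, the values do not.
SECTION_XML = """<section number="001">
  <student>
    <first_name>Ada</first_name>
    <last_name>Lovelace</last_name>
  </student>
</section>"""

section = ET.fromstring(SECTION_XML)
number = section.get("number")          # attribute: one flat string per name
student = section.find("student")       # element: may have children of its own
first = student.findtext("first_name")  # text content of a child element
print(number, first)
```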
- But how does the receiving computer know which XML elements I made up? How, therefore, can it 'parse' my XML?
- Answer 1: If the document is well formed, a document tree can always be formed.
- However, what about more complex constraints?
  - <student> must have both a <first_name> and a <last_name>.
  - <first_name>s and <last_name>s are atomic; i.e., they must contain nothing but characters.
  - A <section> has an attribute number.
- Answer 2: You have to provide a declaration of your syntax (grammar + vocabulary): a Document Type Definition (DTD) or an XML Schema (XSD).
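Without a DTD or XSD, the receiving program would have to check the three constraints above by hand. The sketch below shows roughly what that looks like in Python. The function name check_section and both sample documents are assumptions made for illustration, and a real validating parser does this work from the declaration instead.

```python
import xml.etree.ElementTree as ET


def check_section(section):
    """Return a list of constraint violations (an empty list means none found)."""
    problems = []
    if section.tag != "section" or "number" not in section.attrib:
        problems.append("<section> needs a number attribute")
    for student in section.iter("student"):
        for name in ("first_name", "last_name"):
            found = student.findall(name)
            if not found:
                problems.append(f"<student> is missing <{name}>")
            for element in found:
                if len(element) > 0:  # len() counts child elements
                    problems.append(f"<{name}> must contain only characters")
    return problems


# Hypothetical documents: one that meets the constraints, one that breaks all three.
GOOD = ('<section number="001"><student><first_name>Ada</first_name>'
        '<last_name>Lovelace</last_name></student></section>')
BAD = '<section><student><first_name><b>Ada</b></first_name></student></section>'

good_problems = check_section(ET.fromstring(GOOD))
bad_problems = check_section(ET.fromstring(BAD))
print(good_problems)
print(bad_problems)
```

Every new rule means more hand-written code like this, which is the motivation for stating the rules once, declaratively, in a DTD or XSD.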
XML (Harold & Means (2001), p. 30):
<person>
  <name>
    <first>Alan</first>
    <last>Turing</last>
  </name>
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>
DTD:
<!ELEMENT first (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT profession (#PCDATA)>
<!ELEMENT name (first, last)>
<!ELEMENT person (name, profession*)>
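One way to read the DTD above is that each content model is a regular expression over the names of an element's children. The sketch below translates the five declarations by hand and checks a document against them. It is not a DTD parser, the names CONTENT_MODELS and conforms are made up, and it ignores stray text between child elements, which a real validator would also reject.

```python
import re
import xml.etree.ElementTree as ET

# Hand translation of the five declarations: each content model becomes a
# regular expression over the space-terminated names of an element's children.
# None stands for #PCDATA (character data only, no child elements).
CONTENT_MODELS = {
    "person": r"name (profession )*",
    "name": r"first last ",
    "first": None,
    "last": None,
    "profession": None,
}


def conforms(element):
    """Check an element and all of its descendants against CONTENT_MODELS."""
    if element.tag not in CONTENT_MODELS:
        return False  # undeclared element
    model = CONTENT_MODELS[element.tag]
    children = "".join(child.tag + " " for child in element)
    if model is None:
        ok = children == ""
    else:
        ok = re.fullmatch(model, children) is not None
    return ok and all(conforms(child) for child in element)


VALID = ("<person><name><first>Alan</first><last>Turing</last></name>"
         "<profession>mathematician</profession></person>")
NO_NAME = "<person><profession>mathematician</profession></person>"
SWAPPED = "<person><name><last>Turing</last><first>Alan</first></name></person>"

verdicts = [conforms(ET.fromstring(doc)) for doc in (VALID, NO_NAME, SWAPPED)]
print(verdicts)
```

The comma in (first, last) fixes the order of the children, and the * in profession* allows zero or more, which is why the document with no name and the one with swapped names both fail.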
- Problem: So how do the XML document and the DTD get associated with each other? How does the receiving computer know which DTD goes with which XML document?
- Internal DTD: the DTD and the XML are stored in the same document:
<!DOCTYPE person [
  <!ELEMENT first (#PCDATA)>
  <!ELEMENT last (#PCDATA)>
  <!ELEMENT profession (#PCDATA)>
  <!ELEMENT name (first, last)>
  <!ELEMENT person (name, profession*)>
]>
<person>
  <name>
    <first>Alan</first>
    <last>Turing</last>
  </name>
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>
- External DTD: the XML document contains a reference to a DTD stored elsewhere on the Web:
<!DOCTYPE person SYSTEM "http://www.mywebsite.com/person.dtd">
<person>
  <name>
    <first>Alan</first>
    <last>Turing</last>
  </name>
  <profession>computer scientist</profession>
  <profession>mathematician</profession>
  <profession>cryptographer</profession>
</person>
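The receiving program finds the DTD through the DOCTYPE declaration. As a small sketch, Python's standard xml.dom.minidom exposes the declaration on the parsed document. To my understanding this parser is non-validating and does not fetch the external DTD, so the example only shows where a validating parser would look; the last two lines illustrate that an undeclared element is not caught here.

```python
from xml.dom import minidom

# Shortened version of the external-DTD example above.
EXTERNAL = """<!DOCTYPE person SYSTEM "http://www.mywebsite.com/person.dtd">
<person>
  <name><first>Alan</first><last>Turing</last></name>
  <profession>mathematician</profession>
</person>"""

doc = minidom.parseString(EXTERNAL)
doctype = doc.doctype
print(doctype.name)      # root element named in the DOCTYPE
print(doctype.systemId)  # where the DTD is said to live

# No validation happens: an element the DTD never declares still parses.
unchecked = minidom.parseString(EXTERNAL.replace("profession", "hobby"))
print(unchecked.documentElement.tagName)
```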
- Examples of XML-based formats:
- Web service: HTTP/XML-based data provider; two types:
- Problem: What is a so-called Service-Oriented Architecture (SOA)?
- Problem: Isn't this what OOP was supposed to give us?
- Problem: So, what's the difference? Why would XML-based Web services be able to accomplish what OOP did not?