What is XML well-formedness and how could you measure it?
A document that strictly adheres to the XML syntax rules is termed a "well-formed" XML document. To be well formed, a document must at least follow the rules listed below:
- All XML documents must contain at least one element.
- Every element, starting with the first, must be delimited by an opening tag and a matching closing tag.
- The whole XML document must be enclosed in a single pair of opening and closing tags (the root element) to be well formed.
- Elements must be nested properly. Overlapping tags such as <b><i>text</b></i> are not allowed.
- Each tag is delimited by the angle brackets "<" and ">", and no other type of bracket may be used.
- XML is case sensitive. This means that, for example, <BOOK> is not the same as <book>. The XML spec does not mandate a particular case for element names (lower case is a common convention, and XHTML requires it), whereas DTD keywords such as ELEMENT, ATTLIST, #IMPLIED and #REQUIRED must be written in upper case. Custom element names may mix character cases, provided the opening and closing tags use exactly the same spelling.
- Empty elements have to be closed with a slash. The <BR> tag, which is a line break in HTML, must for example be written as <br/>.
- Attribute values must always be quoted (enclosed in single or double quotes).
Modern browsers have a built-in XML parser. If the markup breaks any of the above rules, the parser cannot process the document and raises an error. Developers can also use dedicated parser or checker applications to confirm that the XML syntax follows the rules above. The job of the parser is to read the XML file and build the hierarchical tree structure that is inherent to all XML documents.
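As a rough illustration of such a check, the sketch below uses the parser in Python's standard xml.etree.ElementTree module, which rejects any input that is not well formed. The helper name is_well_formed and the sample strings are invented for this example. The parser checks syntax only, not conformance to a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule from the list above.
samples = {
    "properly nested": "<book><title>Cooking at Home</title></book>",
    "overlapping tags": "<b><i>text</b></i>",
    "case mismatch": "<BOOK></book>",
    "unclosed empty tag": "<p>line<br></p>",
    "unquoted attribute": "<title id=1>The Killer</title>",
    "two root elements": "<book/><book/>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the first sample is accepted. Each of the others breaks one rule and makes the parser raise a ParseError.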
There are many examples of XML parsers. The Xerces Java Parser may be used to check the integrity of business data in XML. XP is a parser written in Java that offers high performance and speed. XT is a related XSLT processor, also written in Java.
XML Parser for Java is also popular. Like the parsers mentioned above, it has the advantage of being cross-platform and will run on any operating system that has a Java runtime.
What is XML validation and what is the output?
Validation goes a step beyond well-formedness. It refers to checking that an XML document, besides following the markup and syntax rules of the XML spec, also conforms to a DTD (Document Type Definition), which is a further set of rules that defines which tags may appear in the document and how they may be used. Since XML is extensible and customizable, the definitions in a DTD are required to make sense of the data contained. The DTD describes what is allowed in the structure of a document: the names that can be used for element types, how often an element may or must occur, and the order of the elements. It also includes the rules on how elements may be nested, and it specifies which attributes are used with which elements and whether they can be omitted or not. The output of validation is typically either a confirmation that the document is valid or a list of the errors found.
A valid XML document must have its elements in the order specified by the DTD, the outermost element being the "root" element. To be valid it must also include the correct DOCTYPE declaration, which tells the parser which DTD the document follows.
The DTD belongs to the prolog of an XML document. It can be placed inside the DOCTYPE declaration, or the DOCTYPE declaration can point to an external file. In both cases the DOCTYPE declaration contains the name of the root element, so the root element used in the XML needs to be identical to the name specified there. Using an external DTD allows the same DTD to be re-used in multiple XML documents, where it serves as a guide in the validation of all the documents. This is of course preferable to repeating the DTD in every single XML file. It also offers substantial advantages in terms of streamlining the XML data.
Example calling an external DTD:
<?xml version="1.0"?>
<!DOCTYPE book SYSTEM "book.dtd">
<book>
<title id="b1">Cooking at Home</title>
<genre>&COO;</genre>
<year>2010</year>
<title id="b2">The Killer</title>
<genre>&THR;</genre>
<year>2000</year>
</book>
Example showing DTD in same document:
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE book [
<!ENTITY COO "Cookery">
<!ENTITY THR "Thrillers">
<!ELEMENT book (title, genre, year)+>
<!ELEMENT title (#PCDATA)>
<!ATTLIST title
xml:lang NMTOKEN "EN"
id ID #IMPLIED>
<!ELEMENT genre (#PCDATA)>
<!ELEMENT year (#PCDATA)>
]>
<book>
<title id="b1">Cooking at Home</title>
<genre>&COO;</genre>
<year>2010</year>
<title id="b2">The Killer</title>
<genre>&THR;</genre>
<year>2000</year>
</book>
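The difference between well-formedness and validity can be sketched with Python's standard xml.etree.ElementTree module. Its parser is non-validating: it reads the internal DTD far enough to expand entities such as &COO;, but it does not check the document against the element declarations. The shortened document below and the variable names are assumptions made for this illustration. Full validation would need a validating parser, which is outside the scope of this sketch.

```python
import xml.etree.ElementTree as ET

# A shortened, hypothetical version of the document above.
DOC = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE book [
<!ENTITY COO "Cookery">
<!ELEMENT book (title, genre)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT genre (#PCDATA)>
]>
<book>
<title>Cooking at Home</title>
<genre>&COO;</genre>
</book>"""

root = ET.fromstring(DOC)
genre_text = root.find("genre").text  # entity reference replaced by its value
print(genre_text)

# Removing <genre> breaks the declared content model (title, genre), so the
# document is no longer valid. It is still well formed, so this parser accepts it.
invalid_doc = DOC.replace("<genre>&COO;</genre>", "")
still_parses = ET.fromstring(invalid_doc).find("genre") is None
print(still_parses)
```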
How can an XML document be best presented?
XML does not include a styling system "out of the box", so the data contained in it, if displayed as is, will not look like much and will not make much sense to the viewer. Other languages have been created to satisfy this need, such as XSL, XUL, XIML and the better-known CSS.
CSS is the most commonly used way to style XML and is widely used in web design today. CSS allows the developer to separate content from styling by creating style sheets, which contain instructions about how the display system (usually a browser) will display the data. A style sheet can be re-used across many XML documents and is usually used to keep the styling of multiple documents in one easily accessible place. CSS in its latest incarnation, version 3, allows the developer to change the style and color of text and to include images, for example. It does have some shortcomings. Historically, for instance, it offered no simple arithmetic when specifying styles, so computed values were usually produced on the server through the use of PHP.
CSS also allows for styling through the use of classes. Classes allow a number of elements to be grouped by making them "belong" to a specific class. For example, a class may be created to display something in red with a font size of 15 pixels. Any element to which this class is applied will be rendered by the browser in the specified format. Elements can also be made unique through the use of the "id" attribute.
Lists and tables can also be styled through CSS. The way lists are displayed is fully customizable, and table spacing and padding, as well as the colour of cells, rows and columns, may also be specified.
Hypertext links can be styled in the same way as other elements, and CSS supports the various states of a link, such as hover and visited.
What is the importance of the first line of an XML document?
XML documents begin with an XML declaration, which identifies the document as an XML document and specifies a version number. Version 1.0 is the most commonly used. Example:
<?xml version="1.0"?>
The XML declaration, together with any document type declaration, comments and processing instructions that come before the root element, is called the "prolog". The declaration can also state the encoding of the document, which is specified as follows:
<?xml version="1.0" encoding="utf-8"?>
The above specifies that the document is encoded in UTF-8 (8-bit Unicode Transformation Format). Other encodings can be specified, such as UTF-16 and ISO-8859-1.
XML documents have to be saved by the editor in the same encoding that is declared in the document. Otherwise there is a risk that applications such as browsers will reject the document or render its characters incorrectly.
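The effect of a mismatch between the declared and the actual encoding can be sketched with Python's standard xml.etree.ElementTree module. The helper parse_as and the sample word are made up for this illustration, and other parsers or browsers may report the problem differently.

```python
import xml.etree.ElementTree as ET

TEXT = "caf\u00e9"  # "café": the é is one byte in ISO-8859-1, two bytes in UTF-8


def parse_as(declared: str, saved_as: str) -> str:
    """Declare one encoding in the prolog, save the bytes in another, then parse."""
    document = f'<?xml version="1.0" encoding="{declared}"?><word>{TEXT}</word>'
    return ET.fromstring(document.encode(saved_as)).text


# Declaration and bytes agree: the text survives intact.
matching = parse_as("ISO-8859-1", "iso-8859-1")

# Declared ISO-8859-1 but saved as UTF-8: parses, but the é turns into two characters.
garbled = parse_as("ISO-8859-1", "utf-8")

# Declared UTF-8 but saved as ISO-8859-1: the bytes are not valid UTF-8.
try:
    parse_as("utf-8", "iso-8859-1")
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True

print(matching, garbled, mismatch_rejected)
```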
The prolog of the XML document can also include the DOCTYPE declaration, which specifies the DTD, internal or external, that is used to validate the document. The "standalone" attribute of the XML declaration can also be specified at this point. Setting it to "yes" states that the document does not depend on external markup declarations, so it can be parsed without referring to external sources.
Finally, processing instructions may also be included in the prolog of an XML document. These instructions tell the application that processes the document about any other things that need to be done when working with it.
What are the differences between elements and attributes and what are their different uses?
In XML, elements are used to describe the data, while attributes hold additional information about an element, often information that will be used to display the data, which is why they are frequently associated with presentation. Attributes are written in the opening tag. Web developers are very used to this syntax in HTML, for example:
<table border='1'>
Here <table> is the actual element, while border='1' is an attribute that tells the browser that the table must be shown with a border of 1 pixel.
There is a defined set of rules for declaring elements and attributes in the DTD. Here are some of them:
The element declaration:
<!ELEMENT element-name (regular-expression)>
For example: <!ELEMENT book (author, publishdate, genre)>
Using the "pipe" symbol (vertical bar) specifies an OR condition. For example:
<!ELEMENT book (green | red | blue)>
This specifies that the element book contains exactly one of the child elements green, red or blue.
The asterisk (*) is used to specify zero or more occurrences, such as:
<!ELEMENT car (spoiler*)>
The above specifies that the car may have any number of spoilers, or none at all.
The plus (+) symbol specifies that the element may occur once or more than once (one or more).
For example:
<!ELEMENT car (spoiler+)>
A car must have at least one spoiler and can have more.
The question mark (?) can also be used and specifies "zero or one", which means that the element can occur once only or not at all.
Empty elements may also be declared, which means that in a valid XML document an element of this type cannot have any content:
<!ELEMENT tintedglass EMPTY>
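Because a content model behaves like a regular expression over the names of the child elements, the occurrence symbols can be tried out with a small Python sketch. The functions model_to_regex and children_match are hypothetical helpers written for this illustration only. They handle element names, sequences, the pipe and the symbols *, + and ?, but not #PCDATA, EMPTY or ANY, and they are no substitute for a validating parser.

```python
import re
import xml.etree.ElementTree as ET


def model_to_regex(model: str) -> "re.Pattern[str]":
    """Translate a simple DTD content model into a regex over child names."""
    def convert(match: "re.Match[str]") -> str:
        token = match.group(0)
        if token == "(":
            return "(?:"          # DTD grouping becomes a non-capturing group
        if token == "," or token.isspace():
            return ""             # a sequence is implicit in a regex
        return "(?:" + re.escape(token) + " )"  # a name consumes "name "
    # ")", "|", "*", "+" and "?" mean the same in both notations and pass through.
    return re.compile(re.sub(r"\(|,|\s+|[A-Za-z_][\w.-]*", convert, model))


def children_match(element: ET.Element, model: str) -> bool:
    """Check the direct children of element against the content model."""
    names = "".join(child.tag + " " for child in element)
    return model_to_regex(model).fullmatch(names) is not None


two_spoilers = ET.fromstring("<car><spoiler/><spoiler/></car>")
no_spoiler = ET.fromstring("<car/>")

for model in ("(spoiler*)", "(spoiler+)", "(spoiler?)"):
    print(model, children_match(two_spoilers, model), children_match(no_spoiler, model))
```

A car with two spoilers satisfies (spoiler*) and (spoiler+) but not (spoiler?), while a car without spoilers satisfies (spoiler*) and (spoiler?) but not (spoiler+).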
Attributes in DTDs
Attribute declarations are used in a DTD to achieve the following:
- Define default values
- Define sets of allowed and valid values
- Create references between elements
- Define fixed values
These rules and definitions cannot be specified in the XML itself and this is what makes the DTD important.
The DTD is then used by the parser to validate the document. The attributes of an element are declared in a single list using ATTLIST:
<!ATTLIST element-name attribute-specification … attribute-specification>
The element must be defined in the same DTD, and more than one attribute can be specified in a single ATTLIST declaration.
Each attribute specification has the form "name type value", where name is the chosen attribute name. An attribute name may only appear once in an attribute declaration, but the same attribute name can be used with different elements.
The CDATA attribute type is the most commonly used and specifies "character data".
Attribute keywords specify whether an attribute is compulsory (#REQUIRED), optional (#IMPLIED) or constant (#FIXED). Here is an example of a "required" or compulsory attribute:
<!ATTLIST author type CDATA #REQUIRED>
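The sketch below shows, under stated assumptions, how attribute declarations affect parsing with Python's standard xml.etree.ElementTree module. The library element, the status attribute and the author names are invented for this example. The parser is non-validating: it supplies default values declared in the internal DTD, but it does not enforce #REQUIRED.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "type" is required, "status" has a default value.
DOC = """<!DOCTYPE library [
<!ELEMENT library (author+)>
<!ELEMENT author (#PCDATA)>
<!ATTLIST author
  type   CDATA #REQUIRED
  status CDATA "active">
]>
<library>
<author type="novelist">A. Writer</author>
<author>B. Writer</author>
</library>"""

root = ET.fromstring(DOC)
first, second = root.findall("author")

# The default declared in the ATTLIST is filled in although the tag omits it.
print(first.get("type"), first.get("status"))

# The second author lacks the required "type" attribute. A validating parser
# would report this, but the document is well formed, so it still parses here.
print(second.get("type"), second.get("status"))
```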
What is AJAX and what are its purposes?
AJAX (Asynchronous JavaScript and XML) is well known for allowing web developers to create web pages in which parts can be updated with fresh information without having to reload the whole page, which is what plain HTML pages normally require. The term AJAX was coined in 2005, and the approach originates from a tradition of client-side technologies such as JavaScript, DHTML (Dynamic HTML) and Java applets.
The component that sets AJAX apart is the 'XMLHttpRequest' object. This object allows browsers to send requests to the server and receive replies in the background, without requiring any action from the user. The server may respond with an XML document or a simple stream of characters, which can then be processed by client scripts and, through the DOM (Document Object Model), be used to update the page. 'XMLHttpRequest' does come with some security considerations, which browsers guard against. By default, cross-domain requests are forbidden: sending a request to a domain other than the one that originated the page is not allowed, in order to mitigate these risks.
Ajax can be considered as a combination of:
- A standards-based presentation using XHTML and CSS
- Dynamic display and interaction using the document object model
- Data interchange and manipulation using XML and XSLT (eXtensible Stylesheet Language Transformation)
- Asynchronous data retrieval using ‘XMLHttpRequest’
- JavaScript binding everything together
Ajax code is written either directly in JavaScript or with the help of special APIs such as the Google Ajax API, or libraries such as jQuery, which also includes a rich Ajax framework.
Ajax also has some well-known disadvantages, such as its limited set of capabilities. It does not, for instance, support multimedia or interaction with hardware such as printers and web cameras.
Perhaps the most obvious limitation is Ajax's dependence on an internet connection. Since an Ajax page relies on updates from the server, it does not work without a connection to that server.
"Browser-side caching" often has to be employed to ensure that an Ajax application does not "lock up" while it is waiting for data from the server. Without extensive use of browser caching there can be serious performance implications.
Finally, a developer needs to be conversant with JavaScript to be able to develop sites with Ajax, and for this reason Ajax is often considered a second-tier technology.
