XML

What is XML?

XML, short for eXtensible Markup Language, is a commonly used markup language that marks up pieces of text using <tags>, that is, tag names surrounded by angle brackets, e.g. normal text <bold>bold text</bold> normal text.

XML is used by all sorts of applications to do all sorts of things. Although it looks similar to HTML, there are key differences. In particular, HTML is primarily used by web browsers to display web pages, while XML doesn't serve this purpose and is instead a general-purpose way to encode data in a text format. In that respect, XML is more comparable to JSON, which is another data transfer format. Note that there is also XHTML, which is HTML written according to XML's syntax rules.

In XML, tags represent an XML element in an XML document. For example, a bold element has an opening tag, e.g. <bold>, and a closing tag that has a forward slash (/) before its name, e.g. </bold>. What's between these two tags is the content of the element. Essentially, the XML code you see is interpreted by a program, called an XML parser, which turns it into a tree-shaped data structure of nodes, commonly the Document Object Model (or DOM).

Each node may contain a number of child nodes, of which there are many types. You can have element nodes inside other element nodes, but you can also have text that isn't inside a child element. This text is considered its own type of node: a text node. Another type of node is the <!-- comment --> node that is meant to be ignored.

For example, for <bold>foo <italic>bar</italic></bold>, we would have the following hierarchy:

<bold> (start element node)
   foo (text node)
   <italic> (start element node)
       bar (text node)
   </italic>
</bold>
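
The hierarchy above can be reproduced with an XML parser. The following is a minimal sketch using Python's standard xml.dom.minidom module; the helper name describe is made up for this illustration, and other DOM implementations may expose the tree somewhat differently.

```python
from xml.dom import Node, minidom


def describe(node, depth=0):
    """Return one indented line per node in the subtree rooted at `node`."""
    indent = "   " * depth
    if node.nodeType == Node.ELEMENT_NODE:
        lines = [f"{indent}<{node.tagName}> (element node)"]
        # Recurse into child nodes: elements, text, comments, and so on.
        for child in node.childNodes:
            lines.extend(describe(child, depth + 1))
        return lines
    if node.nodeType == Node.TEXT_NODE:
        return [f"{indent}{node.data!r} (text node)"]
    if node.nodeType == Node.COMMENT_NODE:
        return [f"{indent}{node.data!r} (comment node)"]
    return [f"{indent}(other node type {node.nodeType})"]


document = minidom.parseString("<bold>foo <italic>bar</italic></bold>")
tree_lines = describe(document.documentElement)
print("\n".join(tree_lines))
```

Note that the text node under <bold> keeps its trailing space ('foo '), since whitespace inside an element is part of its text content.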

A node without children can be shortened to a single tag by placing the forward slash after its tag name, e.g. <bold/> is equivalent to <bold></bold>. Note that this is different from how HTML5 works. In HTML5, <br> represents a line break element that can't have any children, so a closing tag isn't necessary. Writing <br/> is still allowed in HTML5 for backwards compatibility with XHTML, but the slash has no effect there. On the other hand, <div/> means <div></div> in XHTML, but in HTML5 the / is ignored, so <div/> means just <div>: it opens an element without closing it.
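
To check the XML side of this, here is a small sketch using Python's standard xml.etree.ElementTree module. It only demonstrates XML parsing; an HTML5 parser would treat the same input differently, as described above.

```python
import xml.etree.ElementTree as ET

# Parse the short (self-closing) form and the long (open + close) form.
short_form = ET.fromstring("<bold/>")
long_form = ET.fromstring("<bold></bold>")

# Both produce an element named "bold" with no children,
# and both serialize back to the same bytes.
same_serialization = ET.tostring(short_form) == ET.tostring(long_form)
print(short_form.tag, long_form.tag)
print(len(short_form), len(long_form))
print(same_serialization)
```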

Each element node may also have a number of key="value" attributes in its opening tag, separated from the tag name by a space.
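
As a rough illustration, attributes can be read back as key/value pairs once the element is parsed. This sketch uses Python's standard xml.etree.ElementTree; the element name text and its attributes font and size are hypothetical, chosen only for the example.

```python
import xml.etree.ElementTree as ET

# A hypothetical element with two attributes and some text content.
element = ET.fromstring('<text font="serif" size="12">hello</text>')

# Attributes are exposed as a mapping of names to string values.
attributes = dict(element.attrib)
print(attributes)

# Attribute values are always strings; convert them yourself if needed.
size = int(element.get("size"))
print(size)

# get() returns None for an attribute that is not present.
print(element.get("color"))
```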

In any XML document, there is exactly one top-level element, called the root element (or root node), which encompasses all the other elements of the document.

The first line of an XML document may be a <!DOCTYPE> declaration. This declaration tells the program reading it what kind of data the XML code is encoding. For example, Krita uses XML as a way to transfer its filter settings. The XML that stores settings for a Gaussian blur filter looks like this:

<!DOCTYPE params>
<params version="1">
 <param type="internal" name="horizRadius">10</param>
 <param type="internal" name="lockAspect">true</param>
 <param type="internal" name="vertRadius">10</param>
</params>

Note that the example above uses params as its doctype. This name has no predefined meaning in XML itself. It's something Krita chose for its own use of XML.
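
To show how a program might consume such a document, here is a minimal sketch that reads the Krita settings above with Python's standard xml.etree.ElementTree. This is not how Krita itself reads the data; the variable names and the choice of a dictionary as the result are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The filter settings from the example above, copied verbatim.
KRITA_FILTER_XML = """<!DOCTYPE params>
<params version="1">
 <param type="internal" name="horizRadius">10</param>
 <param type="internal" name="lockAspect">true</param>
 <param type="internal" name="vertRadius">10</param>
</params>"""

# fromstring() returns the root element, here <params>.
root = ET.fromstring(KRITA_FILTER_XML)
print(root.tag, root.get("version"))

# Map each param's "name" attribute to its text content.
# All values come back as strings; the parser does not interpret them.
settings = {param.get("name"): param.text for param in root.findall("param")}
print(settings)
```

The parser accepts the doctype line but attaches no meaning to the name params, so it is up to the application to decide what the document represents.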
