Living Standard — Last Updated 13 November 2024
This section only describes the rules for XML resources. Rules for text/html resources are discussed in the section above entitled "The HTML syntax".
Using the XML syntax is not recommended, for two main reasons. First, no specification defines the rules for how an XML parser must map a string of bytes or characters into a Document object. Second, the XML syntax is essentially unmaintained: it is not expected that any further features will ever be added to the XML syntax (even when such features have been added to the HTML syntax).
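The XML syntax for HTML was formerly referred to as "XHTML", but this specification does not use that term (among other reasons, because no such term is used for the HTML syntaxes of MathML and SVG).
The syntax for XML is defined in XML and Namespaces in XML. [XML] [XMLNS]
This specification does not define any syntax-level requirements beyond those defined for XML proper.
XML documents may contain a DOCTYPE if desired, but this is not required to conform to this specification. This specification does not define a public or system identifier, nor provide a formal DTD.
According to XML, XML processors are not guaranteed to process the external DTD subset referenced in the DOCTYPE. This means, for example, that using entity references for characters in XML documents is unsafe if they are defined in an external file (except for &lt;, &gt;, &amp;, &quot;, and &apos;).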
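The risk described above can be observed with any XML processor that does not load an external subset. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree parser as a stand-in (it is not the XML parser this specification describes); the helper name try_parse is hypothetical. The predefined references and a numeric character reference parse, while a named reference that would only be declared in a DTD does not.

```python
import xml.etree.ElementTree as ET


def try_parse(markup):
    """Return the root element's text, or None if the markup is not well-formed."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


# The five predefined entities are always available.
predefined = try_parse("<p>&lt;&gt;&amp;&quot;&apos;</p>")

# A numeric character reference needs no declaration either.
numeric = try_parse("<p>&#160;</p>")

# &nbsp; would be declared only in a DTD, which is absent here, so parsing fails.
named = try_parse("<p>&nbsp;</p>")
```

Using a numeric character reference (or the character itself) avoids depending on whether a given processor reads the external subset.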
This section describes the relationship between XML and the DOM, with a particular emphasis on how this interacts with HTML.
An XML parser, for the purposes of this specification, is a construct that follows the rules given in XML to map a string of bytes or characters into a Document object.
At the time of writing, no such rules actually exist.
An XML parser is either associated with a Document object when it is created, or creates one implicitly.
This Document must then be populated with DOM nodes that represent the tree structure of the input passed to the parser, as defined by XML, Namespaces in XML, and DOM. When creating DOM nodes representing elements, the create an element for a token algorithm or some equivalent that operates on appropriate XML data structures must be used, to ensure the proper element interfaces are created and that custom elements are set up correctly.
For the operations that the XML parser performs on the Document's tree, the user agent must act as if elements and attributes were individually appended and set respectively so as to trigger rules in this specification regarding what happens when an element is inserted into a document or has its attributes set, and DOM's requirements regarding mutation observers mean that mutation observers are fired. [XML] [XMLNS] [DOM] [UIEVENTS]
Between the time an element's start tag is parsed and the time either the element's end tag is parsed or the parser detects a well-formedness error, the user agent must act as if the element was in a stack of open elements.
This is used by various elements to only start certain processes once they are popped off of the stack of open elements.
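As an illustration of the stack behaviour described above, here is a minimal sketch using Python's standard-library expat bindings. The function name trace_open_elements and the plain list used as the stack are assumptions made for this example; a real user agent keeps element nodes, not names, on its stack of open elements.

```python
import xml.parsers.expat


def trace_open_elements(markup):
    """Parse markup and record what remains on the stack at each pop."""
    stack = []   # hypothetical stand-in for the stack of open elements
    popped = []  # (element name, names still open after the pop)

    def on_start(name, attrs):
        # The start tag has been parsed: the element is now open.
        stack.append(name)

    def on_end(name):
        # The end tag has been parsed: the element is popped, and this is
        # where element-specific processing could begin.
        stack.pop()
        popped.append((name, list(stack)))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    try:
        parser.Parse(markup, True)
    except xml.parsers.expat.ExpatError:
        # After a well-formedness error, no element is treated as open.
        stack.clear()
    return popped, stack
```

For the input `<a><b/><c></c></a>`, the elements are popped in the order b, c, a, and the stack is empty once parsing ends.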
This specification provides the following additional information that user agents should use when retrieving an external entity: the public identifiers given in the following list all correspond to the URL given by this link. (This URL is a DTD containing the entity declarations for the names listed in the named character references section.) [XML]
-//W3C//DTD XHTML 1.0 Transitional//EN
-//W3C//DTD XHTML 1.1//EN
-//W3C//DTD XHTML 1.0 Strict//EN
-//W3C//DTD XHTML 1.0 Frameset//EN
-//W3C//DTD XHTML Basic 1.0//EN
-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN
-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN
-//W3C//DTD MathML 2.0//EN
-//WAPFORUM//DTD XHTML Mobile 1.0//EN
Furthermore, user agents should attempt to retrieve the above external entity's content when one of the above public identifiers is used, and should not attempt to retrieve any other external entity's content.
This is not strictly a violation of XML, but it does contradict the spirit of XML's requirements. This is motivated by a desire for user agents to all handle entities in an interoperable fashion without requiring any network access for handling external subsets. [XML]
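The retrieval rule above amounts to a fixed lookup that needs no network access. The sketch below is one way to express it, under two assumptions: public identifiers are compared as exact strings, and the DTD is represented by a placeholder constant, because the source gives its URL only as a link. The names KNOWN_PUBLIC_IDS, ENTITY_DTD, and resolve_external_entity are hypothetical.

```python
# Public identifiers taken from the list above.
KNOWN_PUBLIC_IDS = frozenset({
    "-//W3C//DTD XHTML 1.0 Transitional//EN",
    "-//W3C//DTD XHTML 1.1//EN",
    "-//W3C//DTD XHTML 1.0 Strict//EN",
    "-//W3C//DTD XHTML 1.0 Frameset//EN",
    "-//W3C//DTD XHTML Basic 1.0//EN",
    "-//W3C//DTD XHTML 1.1 plus MathML 2.0//EN",
    "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN",
    "-//W3C//DTD MathML 2.0//EN",
    "-//WAPFORUM//DTD XHTML Mobile 1.0//EN",
})

# Placeholder only: the actual URL of the entity-declaration DTD is not
# reproduced in the source text, so it is not guessed here.
ENTITY_DTD = "<entity-declarations DTD>"


def resolve_external_entity(public_id):
    """Return the resource to load for public_id, or None to skip retrieval."""
    if public_id in KNOWN_PUBLIC_IDS:
        return ENTITY_DTD
    # Any other external entity is not retrieved.
    return None
```

Because the mapping is constant, an implementation along these lines could bundle the DTD's entity declarations and never fetch anything.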
XML parsers can be invoked with XML scripting support enabled or XML scripting support disabled. Except where otherwise specified, XML parsers are invoked with XML scripting support enabled.
When an XML parser with XML scripting support enabled creates a script element, it must have its parser document set and its force async set to false. If the parser was created as part of the XML fragment parsing algorithm, then the element's already started must be set to true. When the element's end tag is subsequently parsed, the user agent must perform a microtask checkpoint, and then prepare the script element. If this causes there to be a pending parsing-blocking script, then the user agent must run the following steps:
Block this instance of the XML parser, such that the event loop will not run tasks that invoke it.
Spin the event loop until the parser's Document has no style sheet that is blocking scripts and the pending parsing-blocking script's ready to be parser-executed is true.
Unblock this instance of the XML parser, such that tasks that invoke it can again be run.
Execute the script element given by the pending parsing-blocking script.
Set the pending parsing-blocking script to null.
Since the document.write() API is not available for XML documents, much of the complexity in the HTML parser is not needed in the XML parser.
When the XML parser has XML scripting support disabled, none of this happens.
When an XML parser would append a node to a template element, it must instead append it to the template element's template contents (a DocumentFragment node).
This is a willful violation of XML; unfortunately, XML is not formally extensible in the manner that is needed for template processing. [XML]
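The redirection of template children can be sketched with a small tree builder on top of Python's standard-library expat bindings. Everything here is an assumption made for illustration: the Node class is a hypothetical minimal node rather than a DOM interface, text nodes are ignored, and the element is recognised by the name "template" alone without checking its namespace.

```python
import xml.parsers.expat


class Node:
    """Hypothetical minimal node; not a DOM interface."""

    def __init__(self, name):
        self.name = name
        self.children = []
        # Stand-in for template contents (a DocumentFragment in the DOM).
        self.template_contents = Node("#fragment") if name == "template" else None

    def append_target(self):
        """Return the node that parser-appended children should go to."""
        if self.template_contents is not None:
            return self.template_contents
        return self


def build_tree(markup):
    """Build a Node tree from markup, redirecting children of template."""
    root = Node("#document")
    open_nodes = [root]

    def on_start(name, attrs):
        node = Node(name)
        # Append to the template contents instead of the template itself.
        open_nodes[-1].append_target().children.append(node)
        open_nodes.append(node)

    def on_end(name):
        open_nodes.pop()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(markup, True)
    return root
```

After parsing `<div><template><p/><span/></template></div>` with this sketch, the template node has no children of its own, and p and span sit in its template_contents.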
When an XML parser creates a Node object, its node document must be set to the node document of the node into which the newly created node is to be inserted.
Certain algorithms in this specification spoon-feed the parser characters one string at a time. In such cases, the XML parser must act as it would have if faced with a single string consisting of the concatenation of all those characters.
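The equivalence between chunked and whole input can be demonstrated with an incremental parser. This is a minimal sketch using the feed() and close() methods of Python's standard-library xml.etree.ElementTree.XMLParser as a stand-in for the XML parser described here; the helper name parse_in_chunks and the sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_in_chunks(chunks):
    """Feed the parser one string at a time and return the root element."""
    parser = ET.XMLParser()
    for chunk in chunks:
        parser.feed(chunk)
    return parser.close()


# The same document, once as a single string and once split mid-tag.
whole = ET.fromstring("<root><item>one</item><item>two</item></root>")
pieces = parse_in_chunks(["<root><it", "em>one</item><i", "tem>two</item></root>"])

# Both routes produce the same tree.
same = ET.tostring(whole) == ET.tostring(pieces)
```

The chunk boundaries fall inside tag names on purpose: the result must not depend on where the input happens to be split.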
When an XML parser reaches the end of its input, it must stop parsing, following the same rules as the HTML parser. An XML parser can also be aborted, which must again be done in the same way as for an HTML parser.
For the purposes of conformance checkers, if a resource is determined to be in the XML syntax, then it is an XML document.
The XML fragment serialization algorithm for a Document or Element node either returns a fragment of XML that represents that node or throws an exception.
For Documents, the algorithm must return a string in the form of a document entity, if none of the error cases below apply.
For Elements, the algorithm must return a string in the form of an internal general parsed entity, if none of the error cases below apply.
In both cases, the string returned must be XML namespace-well-formed and must be an isomorphic serialization of all of that node's relevant child nodes, in tree order. User agents may adjust prefixes and namespace declarations in the serialization (and indeed might be forced to do so in some cases to obtain namespace-well-formed XML). User agents may use a combination of regular text and character references to represent Text nodes in the DOM.
A node's relevant child nodes are those that apply given the following rules:
For template elements: the relevant child nodes are the child nodes of the template element's template contents, if any.
For all other nodes: the relevant child nodes are the child nodes of the node itself, if any.
For Elements, if any of the elements in the serialization are in no namespace, the default namespace in scope for those elements must be explicitly declared as the empty string. (This doesn't apply in the Document case.) [XML] [XMLNS]
For the purposes of this section, an internal general parsed entity is considered XML namespace-well-formed if a document consisting of an element with no namespace declarations whose contents are the internal general parsed entity would itself be XML namespace-well-formed.
If any of the following error cases are found in the DOM subtree being serialized, then the algorithm must throw an "InvalidStateError" DOMException instead of returning a string:
A Document node with no child element nodes.
A DocumentType node that has an external subset public identifier that contains characters that are not matched by the XML PubidChar production. [XML]
A DocumentType node that has an external subset system identifier that contains both a U+0022 QUOTATION MARK (") and a U+0027 APOSTROPHE (') or that contains characters that are not matched by the XML Char production. [XML]
A node with a local name that does not match the XML Name production. [XML]
An Attr node with no namespace whose local name is the lowercase string "xmlns". [XMLNS]
An Element node with two or more attributes with the same local name and namespace.
An Attr node, Text node, Comment node, or ProcessingInstruction node whose data contains characters that are not matched by the XML Char production. [XML]
A Comment node whose data contains two adjacent U+002D HYPHEN-MINUS characters (-) or ends with such a character.
A ProcessingInstruction node whose target name is an ASCII case-insensitive match for the string "xml".
A ProcessingInstruction node whose target name contains a U+003A COLON (:).
A ProcessingInstruction node whose data contains the string "?>".
These are the only ways to make a DOM unserialisable. The DOM enforces all the other XML constraints; for example, trying to append two elements to a Document node will throw a "HierarchyRequestError" DOMException.
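Several of the error cases above are plain string checks. The sketch below covers only the Comment and ProcessingInstruction cases plus the XML Char test, and it works on strings rather than DOM nodes. The exception class InvalidStateError and the function names are hypothetical stand-ins; the character ranges are those of the Char production in XML 1.0.

```python
import re
import string

# Anything outside the XML 1.0 Char production.
_NON_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Lowercases A-Z only, for an ASCII case-insensitive comparison.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


class InvalidStateError(Exception):
    """Hypothetical stand-in for an "InvalidStateError" DOMException."""


def check_data(data):
    """Reject data containing characters not matched by XML Char."""
    if _NON_XML_CHAR.search(data):
        raise InvalidStateError("data contains a non-XML character")


def check_comment(data):
    """Apply the Comment error cases to a comment's data."""
    check_data(data)
    if "--" in data or data.endswith("-"):
        raise InvalidStateError("comment contains '--' or ends with '-'")


def check_processing_instruction(target, data):
    """Apply the ProcessingInstruction error cases to a target and data."""
    check_data(data)
    if target.translate(_ASCII_LOWER) == "xml":
        raise InvalidStateError("target is an ASCII case-insensitive match for 'xml'")
    if ":" in target:
        raise InvalidStateError("target contains a colon")
    if "?>" in data:
        raise InvalidStateError("data contains '?>'")


def passes(check, *args):
    """Return True if the given check raises no InvalidStateError."""
    try:
        check(*args)
    except InvalidStateError:
        return False
    return True
```

A full implementation would walk the subtree and apply the remaining cases (Document, DocumentType, Attr, and Element) as well; those are omitted here to keep the example short.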
The XML fragment parsing algorithm either returns a Document or throws a "SyntaxError" DOMException. Given a string input and a context element context, the algorithm is as follows:
Create a new XML parser.
Feed the parser just created the string corresponding to the start tag of the context element, declaring all the namespace prefixes that are in scope on that element in the DOM, as well as declaring the default namespace (if any) that is in scope on that element in the DOM.
A namespace prefix is in scope if the DOM lookupNamespaceURI() method on the element would return a non-null value for that prefix.
The default namespace is the namespace for which the DOM isDefaultNamespace() method on the element would return true.
No DOCTYPE is passed to the parser, and therefore no external subset is referenced, and therefore no entities will be recognized.
Feed the parser just created the string input.
Feed the parser just created the string corresponding to the end tag of the context element.
If there is an XML well-formedness or XML namespace well-formedness error, then throw a "SyntaxError" DOMException.
If the document element of the resulting Document has any sibling nodes, then throw a "SyntaxError" DOMException.
Return the child nodes of the document element of the resulting Document, in tree order.
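The steps above can be approximated with Python's standard-library xml.dom.minidom. This is a sketch under stated assumptions, not the algorithm as a user agent would implement it: the context element is described by its name, a prefix-to-namespace mapping, and a default namespace passed in by the caller (rather than being read from a live DOM element), the whole string is parsed in one call rather than fed in three parts, and FragmentSyntaxError is a hypothetical stand-in for a "SyntaxError" DOMException.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError
from xml.sax.saxutils import quoteattr


class FragmentSyntaxError(Exception):
    """Hypothetical stand-in for a "SyntaxError" DOMException."""


def parse_xml_fragment(input_text, context_name, prefixes=None, default_ns=None):
    """Parse input_text as the contents of an element named context_name."""
    # Start tag of the context element, declaring the in-scope namespaces.
    attrs = []
    if default_ns is not None:
        attrs.append("xmlns=" + quoteattr(default_ns))
    for prefix, uri in (prefixes or {}).items():
        attrs.append("xmlns:%s=%s" % (prefix, quoteattr(uri)))
    start_tag = "<" + " ".join([context_name] + attrs) + ">"
    end_tag = "</" + context_name + ">"

    # No DOCTYPE is supplied, so only the predefined entities are recognized.
    try:
        doc = minidom.parseString(start_tag + input_text + end_tag)
    except ExpatError as exc:
        # Well-formedness or namespace well-formedness error.
        raise FragmentSyntaxError(str(exc)) from exc

    # The document element must not have any sibling nodes.
    if len(doc.childNodes) != 1:
        raise FragmentSyntaxError("document element has sibling nodes")

    # Child nodes of the document element, in tree order.
    return list(doc.documentElement.childNodes)
```

For example, calling parse_xml_fragment("<b>hi</b> there", "p", default_ns="http://www.w3.org/1999/xhtml") yields a b element in the XHTML namespace followed by a text node, while input such as "&nbsp;" or an unclosed tag raises FragmentSyntaxError.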