    1. 8.4 Dynamic markup insertion
      1. 8.4.1 Opening the input stream
      2. 8.4.2 Closing the input stream
      3. 8.4.3 document.write()
      4. 8.4.4 document.writeln()
    2. 8.5 DOM parsing and serialization APIs
      1. 8.5.1 The DOMParser interface
      2. 8.5.2 HTML parsing methods
      3. 8.5.3 HTML serialization methods
      4. 8.5.4 The innerHTML property
      5. 8.5.5 The outerHTML property
      6. 8.5.6 The insertAdjacentHTML() method
      7. 8.5.7 The createContextualFragment() method
      8. 8.5.8 The XMLSerializer interface
    3. 8.6 HTML sanitization
      1. 8.6.1 Introduction
        1. 8.6.1.1 Safe and unsafe
      2. 8.6.2 The Sanitizer interface
      3. 8.6.3 Sanitizer configuration
        1. 8.6.3.1 Configuration invariants
      4. 8.6.4 Security considerations
        1. 8.6.4.1 Server-side reflected and stored XSS
        2. 8.6.4.2 DOM clobbering
        3. 8.6.4.3 XSS with script gadgets
        4. 8.6.4.4 Mutation XSS

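8.4 Dynamic markup insertion

APIs for dynamically inserting markup into the document interact with the parser, so their behavior depends on whether they are used with HTML documents (and the HTML parser) or XML documents (and the XML parser).

8.4.1 Opening the input stream

document = document.open()

Causes the Document to be replaced in place, as if it were a new Document object, but reusing the previous object, which is then returned.

The resulting Document has an HTML parser associated with it, which can be given data to parse using document.write().

The method has no effect if the Document is still being parsed.

Throws an "InvalidStateError" DOMException if the Document is an XML document.

Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.

window = document.open(url, name, features)

Works like the window.open() method.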

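8.4.2 Closing the input stream

document.close()

Closes the input stream that was opened by the document.open() method.

Throws an "InvalidStateError" DOMException if the Document is an XML document.

Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.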

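8.4.3 document.write()

document.write(...text)

In general, adds the given string(s) to the Document's input stream.

This method has very idiosyncratic behavior. In some cases, this method can affect the state of the HTML parser while the parser is running, resulting in a DOM that does not correspond to the source of the document (e.g. if the string written is the string "<plaintext>" or "<!--"). In other cases, the call can clear the current page first, as if document.open() had been called. In yet more cases, the method is simply ignored, or throws an exception. User agents are explicitly allowed to avoid executing script elements inserted via this method. To make matters worse, the exact behavior of this method can in some cases depend on network latency, which can lead to failures that are very hard to debug. For all these reasons, use of this method is strongly discouraged.

Throws an "InvalidStateError" DOMException when invoked on XML documents.

Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.

This method performs no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.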

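8.4.4 document.writeln()

document.writeln(...text)

Adds the given string(s) to the Document's input stream, followed by a newline character. If necessary, calls the open() method implicitly first.

This method has very idiosyncratic behavior. Use of this method is strongly discouraged, for the same reasons as document.write().

Throws an "InvalidStateError" DOMException when invoked on XML documents.

Throws an "InvalidStateError" DOMException if the parser is currently executing a custom element constructor.

This method performs no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.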

8.5 DOM parsing and serialization APIs

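8.5.1 The DOMParser interface

The DOMParser interface allows authors to create new Document objects by parsing strings, as either HTML or XML.

parser = new DOMParser()

Constructs a new DOMParser object.

document = parser.parseFromString(string, type)

Parses string using either the HTML or XML parser, according to type, and returns the resulting Document. type can be "text/html" (which invokes the HTML parser), or any of "text/xml", "application/xml", "application/xhtml+xml", or "image/svg+xml" (which invoke the XML parser).

For the XML parser, if string cannot be parsed, then the returned Document will contain elements describing the resulting error.

Note that script elements are not evaluated during parsing, and the resulting document's encoding will always be UTF-8. The document's URL will be inherited from parser's relevant global object.

Values other than the above for type will cause a TypeError exception to be thrown.

The design of DOMParser, as a class that needs to be constructed and then have its parseFromString() method called, is an unfortunate historical artifact. If we were designing this functionality today it would be a standalone function. For parsing HTML, the modern alternative is Document.parseHTMLUnsafe().

This method performs no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.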
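As a rough illustration of the type handling described above, the following sketch wraps parseFromString() in a small helper. The helper name parseMarkup, the SUPPORTED_TYPES constant, and the null return outside a browser are assumptions made for this example and are not part of the API.

```javascript
// Hypothetical helper (not a standard API): validates `type` up front and
// returns null in environments that have no DOMParser.
const SUPPORTED_TYPES = [
  "text/html",             // invokes the HTML parser
  "text/xml",              // the remaining types invoke the XML parser
  "application/xml",
  "application/xhtml+xml",
  "image/svg+xml",
];

function parseMarkup(source, type = "text/html") {
  // Mirror the documented behavior: any other type is a TypeError.
  if (!SUPPORTED_TYPES.includes(type)) {
    throw new TypeError(`Unsupported type: ${type}`);
  }
  // Assumption: outside a browser (for example in Node.js) DOMParser is missing.
  if (typeof DOMParser === "undefined") {
    return null;
  }
  // No sanitization happens here; script elements are parsed but not evaluated.
  return new DOMParser().parseFromString(source, type);
}

const parsedDocument = parseMarkup("<p>Hello</p>");
if (parsedDocument !== null) {
  console.log(parsedDocument.body.firstChild.textContent);
}
```

Because the result is unsanitized, it should be treated as untrusted until it has been filtered.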

8.5.2 HTML parsing methods

element.setHTML(html, options)

Parses html using the HTML parser with options options, and replaces the children of element with the result. element provides context for the HTML parser. The parsed fragment is sanitized based on the options's "sanitizer" member, and unsafe content is removed.

shadowRoot.setHTML(html, options)

Parses html using the HTML parser with options options, and replaces the children of shadowRoot with the result. shadowRoot's host provides context for the HTML parser. The parsed fragment is sanitized based on the options's "sanitizer" member, and unsafe content is removed.

element.setHTMLUnsafe(html, options)

Parses html using the HTML parser with options options, and replaces the children of element with the result. element provides context for the HTML parser. If the options dictionary contains a "sanitizer" member, it is used to sanitize the parsed fragment before it is inserted into element.

shadowRoot.setHTMLUnsafe(html, options)

Parses html using the HTML parser with options options, and replaces the children of shadowRoot with the result. shadowRoot's host provides context for the HTML parser. If the options dictionary contains a "sanitizer" member, it is used to sanitize the parsed fragment before it is inserted into shadowRoot.

doc = Document.parseHTML(html, options)

Parses html using the HTML parser with options options, and returns a new Document containing the result. The resulting document is sanitized based on the options's "sanitizer" member, and unsafe content is removed.

doc = Document.parseHTMLUnsafe(html, options)

Parses html using the HTML parser with options options, and returns the resulting Document.

Note that script elements are not evaluated during parsing, and the resulting document's encoding will always be UTF-8. The document's URL will be about:blank. If the options dictionary contains a "sanitizer" member, it is used to sanitize the resulting DOM.

The methods with an Unsafe suffix perform no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.
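The difference between the two families can be sketched with a small helper that prefers the sanitizing setHTML() and otherwise falls back to textContent, which never parses markup. The helper name renderUntrusted is hypothetical, and the sketch assumes that the options argument of setHTML() may be omitted.

```javascript
// Hypothetical helper: choose the safest available way to show untrusted input.
function renderUntrusted(element, html) {
  if (typeof element.setHTML === "function") {
    // Sanitizing path: unsafe content is removed before insertion.
    element.setHTML(html); // assumption: options may be omitted
    return "setHTML";
  }
  // Fallback: the string is shown as literal text and is never parsed.
  element.textContent = html;
  return "textContent";
}

// Plain objects stand in for elements so the sketch also runs outside a browser.
const legacyTarget = { textContent: "" };
console.log(renderUntrusted(legacyTarget, "<img src=x onerror=alert(1)>"));
```

The fallback deliberately avoids innerHTML and setHTMLUnsafe(), since neither sanitizes by default.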

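8.5.3 HTML serialization methods

html = element.getHTML({ serializableShadowRoots, shadowRoots })

Returns the result of serializing element to HTML. Shadow roots within element are serialized according to the provided options.

If neither option is provided, then no shadow roots are serialized.

html = shadowRoot.getHTML({ serializableShadowRoots, shadowRoots })

Returns the result of serializing shadowRoot to HTML, using its shadow host as the context element. Shadow roots within shadowRoot are serialized according to the provided options, as above.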

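8.5.4 The innerHTML property

The innerHTML property has a number of outstanding issues in the DOM Parsing and Serialization issue tracker, documenting various problems with its specification.

element.innerHTML

Returns a fragment of HTML or XML that represents the element's contents.

In the case of an XML document, throws an "InvalidStateError" DOMException if the element cannot be serialized to XML.

element.innerHTML = value

Replaces the contents of the element with nodes parsed from the given string.

In the case of an XML document, throws a "SyntaxError" DOMException if the given string is not well-formed.

shadowRoot.innerHTML

Returns a fragment of HTML that represents the shadow root's contents.

shadowRoot.innerHTML = value

Replaces the contents of the shadow root with nodes parsed from the given string.

These properties' setters perform no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.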

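8.5.5 The outerHTML property

The outerHTML property has a number of outstanding issues in the DOM Parsing and Serialization issue tracker, documenting various problems with its specification.

element.outerHTML

Returns a fragment of HTML or XML that represents the element and its contents.

In the case of an XML document, throws an "InvalidStateError" DOMException if the element cannot be serialized to XML.

element.outerHTML = value

Replaces the element with nodes parsed from the given string.

In the case of an XML document, throws a "SyntaxError" DOMException if the given string is not well-formed.

Throws a "NoModificationAllowedError" DOMException if the parent of the element is a Document.

This property's setter performs no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.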

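8.5.6 The insertAdjacentHTML() method

The insertAdjacentHTML() method has a number of outstanding issues in the DOM Parsing and Serialization issue tracker, documenting various problems with its specification.

element.insertAdjacentHTML(position, string)

Parses string as HTML or XML and inserts the resulting nodes into the tree in the position given by the position argument, as follows:

"beforebegin"
Before the element itself (i.e., after element's previous sibling).
"afterbegin"
Just inside the element, before its first child.
"beforeend"
Just inside the element, after its last child.
"afterend"
After the element itself (i.e., before element's next sibling).

Throws a "SyntaxError" DOMException if the arguments have invalid values (e.g., in the case of an XML document, if the given string is not well-formed).

Throws a "NoModificationAllowedError" DOMException if the given position isn't possible (e.g. inserting elements after the root element of a Document).

This method performs no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.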
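The four positions correspond to familiar DOM insertion methods. The sketch below maps each position keyword to before(), prepend(), append(), or after() for nodes that have already been parsed. The function name insertNodesAt is hypothetical, the lowercase normalization is an assumption about how position keywords are matched, and a plain SyntaxError stands in for the "SyntaxError" DOMException. Unlike insertAdjacentHTML(), before() and after() do nothing when the element has no parent instead of throwing.

```javascript
// Hypothetical helper: insert already-parsed nodes at an insertAdjacentHTML-style position.
function insertNodesAt(element, position, ...nodes) {
  switch (String(position).toLowerCase()) {
    case "beforebegin":
      element.before(...nodes);  // before the element itself
      break;
    case "afterbegin":
      element.prepend(...nodes); // just inside, before the first child
      break;
    case "beforeend":
      element.append(...nodes);  // just inside, after the last child
      break;
    case "afterend":
      element.after(...nodes);   // after the element itself
      break;
    default:
      // Stand-in for the "SyntaxError" DOMException thrown by the real method.
      throw new SyntaxError(`Unknown position: ${position}`);
  }
}
```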

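8.5.7 The createContextualFragment() method

The createContextualFragment() method has a number of outstanding issues in the DOM Parsing and Serialization issue tracker, documenting various problems with its specification.

docFragment = range.createContextualFragment(string)

Returns a DocumentFragment created from the markup string string, using range's start node as the context in which the fragment is parsed.

This method performs no sanitization to remove potentially-dangerous elements and attributes like script or event handler content attributes.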

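8.5.8 The XMLSerializer interface

The XMLSerializer interface has a number of outstanding issues in the DOM Parsing and Serialization issue tracker, documenting various problems with its specification. The remainder of DOM Parsing and Serialization will be gradually upstreamed to this specification.

xmlSerializer = new XMLSerializer()

Constructs a new XMLSerializer object.

string = xmlSerializer.serializeToString(root)

Returns the result of serializing root to XML.

Throws an "InvalidStateError" DOMException if root cannot be serialized to XML.

The design of XMLSerializer, as a class that needs to be constructed and then have its serializeToString() method called, is an unfortunate historical artifact. If we were designing this functionality today it would be a standalone function.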

8.6 HTML sanitization

8.6.1 Introduction

Web applications often need to process untrusted HTML strings, such as when rendering user-generated content or using client-side templates. Safely inserting these strings into the DOM requires careful sanitization to prevent DOM-based cross-site scripting (XSS) attacks.

HTML sanitization provides a native mechanism for safely parsing and sanitizing HTML strings. Because the sanitization APIs use the user agent's own HTML parser, the sanitized output accurately reflects how the browser will render the content, which prevents script execution and mitigates advanced attacks such as script gadgets.

These APIs offer functionality to parse a string containing HTML into a DOM tree, and to filter the resulting tree according to a user-supplied configuration. The methods come in two main flavors: "safe" and "unsafe".

8.6.1.1 Safe and unsafe

The "safe" methods will not generate any markup that executes script. That is, they are intended to be safe from XSS. The "unsafe" methods will parse and filter based on the provided configuration, but do not have the same safety guarantees by default.

8.6.2 The Sanitizer interface

config = sanitizer.get()

Returns a copy of the sanitizer's configuration.

sanitizer.allowElement(element)

Ensures that the sanitizer configuration allows the specified element.

sanitizer.removeElement(element)

Ensures that the sanitizer configuration blocks the specified element.

sanitizer.replaceElementWithChildren(element)

Configures the sanitizer to remove the specified element but keep its child nodes.

sanitizer.allowAttribute(attribute)

Configures the sanitizer to allow the specified attribute globally.

sanitizer.removeAttribute(attribute)

Configures the sanitizer to block the specified attribute globally.

sanitizer.allowProcessingInstruction(pi)

Configures the sanitizer to allow the specified processing instruction.

sanitizer.removeProcessingInstruction(pi)

Configures the sanitizer to block the specified processing instruction.

sanitizer.setComments(allow)

Sets whether the sanitizer preserves comments.

sanitizer.setDataAttributes(allow)

Sets whether the sanitizer preserves custom data attributes (e.g., data-*).

sanitizer.removeUnsafe()

Modifies the configuration to automatically remove elements and attributes that are considered unsafe.

A Sanitizer object has an associated configuration, which is a SanitizerConfig.

The new Sanitizer(configuration) constructor steps are:

  1. If configuration is a SanitizerPresets string:

    1. Assert: configuration is "default".

    2. Set configuration to the built-in safe default configuration.

  2. Configure this given configuration and true.

To configure a Sanitizer sanitizer, given a dictionary configuration and a boolean allowCommentsPIsAndDataAttributes:

  1. Canonicalize the configuration configuration with allowCommentsPIsAndDataAttributes.

  2. If configuration is not valid, then throw a TypeError.

  3. Set sanitizer's configuration to configuration.

To canonicalize the configuration SanitizerConfig configuration with a boolean allowCommentsPIsAndDataAttributes:

  1. For each member of configuration that is a list of strings:

    1. Replace each string in member with the result of canonicalizing it.

  2. If neither configuration["elements"] nor configuration["removeElements"] exists, then set configuration["removeElements"] to an empty list.

  3. If neither configuration["attributes"] nor configuration["removeAttributes"] exists, then set configuration["removeAttributes"] to an empty list.

  4. If neither configuration["processingInstructions"] nor configuration["removeProcessingInstructions"] exists:

    1. If allowCommentsPIsAndDataAttributes is true, then set configuration["removeProcessingInstructions"] to an empty list.

    2. Otherwise, set configuration["processingInstructions"] to an empty list.

  5. If configuration["elements"] exists:

    1. Let newElements be « ».

    2. For each element of configuration["elements"], append the result of canonicalizing element to newElements.

    3. Set configuration["elements"] to newElements.

  6. If configuration["removeElements"] exists, then set configuration["removeElements"] to the result of canonicalizing configuration["removeElements"].

  7. If configuration["attributes"] exists, then set configuration["attributes"] to the result of canonicalizing configuration["attributes"].

  8. If configuration["removeAttributes"] exists, then set configuration["removeAttributes"] to the result of canonicalizing configuration["removeAttributes"].

  9. If configuration["replaceWithChildrenElements"] exists, then set configuration["replaceWithChildrenElements"] to the result of canonicalizing configuration["replaceWithChildrenElements"].

  10. If configuration["processingInstructions"] exists, then set configuration["processingInstructions"] to the result of canonicalizing configuration["processingInstructions"].

  11. If configuration["removeProcessingInstructions"] exists, then set configuration["removeProcessingInstructions"] to the result of canonicalizing configuration["removeProcessingInstructions"].

  12. If configuration["comments"] does not exist, then set it to allowCommentsPIsAndDataAttributes.

  13. If configuration["attributes"] exists and configuration["dataAttributes"] does not exist, then set configuration["dataAttributes"] to allowCommentsPIsAndDataAttributes.
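The defaulting steps of this algorithm (steps 2 to 4, 12, and 13) can be sketched as follows. This is a partial illustration only: it skips the name canonicalization in the other steps, and it returns a shallow copy rather than modifying the dictionary in place. The function name applyConfigDefaults is hypothetical.

```javascript
// Partial sketch of "canonicalize the configuration": default-filling only.
function applyConfigDefaults(configuration, allowCommentsPIsAndDataAttributes) {
  const config = { ...configuration }; // shallow copy, unlike the in-place steps

  // Steps 2 and 3: with neither list present, start from an empty remove-list.
  if (!("elements" in config) && !("removeElements" in config)) {
    config.removeElements = [];
  }
  if (!("attributes" in config) && !("removeAttributes" in config)) {
    config.removeAttributes = [];
  }

  // Step 4: processing instructions default to "remove nothing" or "allow nothing".
  if (!("processingInstructions" in config) &&
      !("removeProcessingInstructions" in config)) {
    if (allowCommentsPIsAndDataAttributes) {
      config.removeProcessingInstructions = [];
    } else {
      config.processingInstructions = [];
    }
  }

  // Step 12: comments follow the boolean when unspecified.
  if (!("comments" in config)) {
    config.comments = allowCommentsPIsAndDataAttributes;
  }

  // Step 13: dataAttributes is only defaulted for attribute allow-lists.
  if ("attributes" in config && !("dataAttributes" in config)) {
    config.dataAttributes = allowCommentsPIsAndDataAttributes;
  }
  return config;
}

console.log(applyConfigDefaults({}, true));
console.log(applyConfigDefaults({ attributes: ["href"] }, false));
```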

To canonicalize a sanitizer list list:

  1. Let newList be « ».

  2. For each item in list, append the result of canonicalizing item to newList.

  3. Return newList.

To canonicalize a processing instruction list list:

  1. Let newList be « ».

  2. For each item in list, append the result of canonicalizing item to newList.

  3. Return newList.

To canonicalize a processing instruction given a SanitizerPI pi:

  1. If pi is a DOMString, then return «[ "target" → pi ]».

  2. Assert: pi is a dictionary and pi["target"] exists.

  3. Return «[ "target" → pi["target"] ]».

To canonicalize a sanitizer name given a DOMString or dictionary name, and a default namespace defaultNamespace (default null):

  1. If name is a DOMString, then return «[ "name" → name, "namespace" → defaultNamespace ]».

  2. Assert: name is a dictionary and both name["name"] and name["namespace"] exist.

  3. If name["namespace"] is the empty string, then set it to null.

  4. Return «[ "name" → name["name"], "namespace" → name["namespace"] ]».

To canonicalize a sanitizer element given a SanitizerElement element:

  1. Return the result of canonicalizing element with the HTML namespace as the default namespace.
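A minimal sketch of the two name-canonicalization algorithms above follows. The function names are hypothetical, and where the steps use an assertion this sketch throws a TypeError so that misuse is visible.

```javascript
const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";

// Sketch of "canonicalize a sanitizer name".
function canonicalizeSanitizerName(name, defaultNamespace = null) {
  // Step 1: a bare string receives the default namespace.
  if (typeof name === "string") {
    return { name, namespace: defaultNamespace };
  }
  // Step 2: the steps assert this; the sketch throws instead.
  if (typeof name !== "object" || name === null ||
      !("name" in name) || !("namespace" in name)) {
    throw new TypeError("expected a string or a { name, namespace } dictionary");
  }
  // Step 3: the empty-string namespace is normalized to null.
  const namespace = name.namespace === "" ? null : name.namespace;
  // Step 4: return a fresh map with only the two canonical members.
  return { name: name.name, namespace };
}

// Sketch of "canonicalize a sanitizer element": the HTML namespace is the default.
function canonicalizeSanitizerElement(element) {
  return canonicalizeSanitizerName(element, HTML_NAMESPACE);
}

console.log(canonicalizeSanitizerElement("div"));
console.log(canonicalizeSanitizerName({ name: "href", namespace: "" }));
```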

To canonicalize a sanitizer element list list:

  1. Let newList be « ».

  2. For each item in list, append the result of canonicalizing item to newList.

  3. Return newList.

To find the canonicalized intersection of lists A and B:

  1. Let setA be « ».

  2. Let setB be « ».

  3. For each entry of A, append the result of canonicalizing entry to setA.

  4. For each entry of B, append the result of canonicalizing entry to setB.

  5. Return the intersection of setA and setB.
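The intersection steps can be sketched as below. The sketch assumes that entries are attribute-like names canonicalized with a null default namespace, and that two canonical entries match when both their name and namespace are equal. The function names are hypothetical.

```javascript
// Minimal canonicalization for the sketch: strings get a null namespace.
function canonicalizeEntry(entry) {
  if (typeof entry === "string") {
    return { name: entry, namespace: null };
  }
  return { name: entry.name, namespace: entry.namespace === "" ? null : entry.namespace };
}

// Sketch of "find the canonicalized intersection of lists A and B".
function canonicalizedIntersection(listA, listB) {
  const setA = listA.map(canonicalizeEntry); // steps 1 and 3
  const setB = listB.map(canonicalizeEntry); // steps 2 and 4
  // Step 5: keep the entries of setA that also occur in setB.
  return setA.filter((a) =>
    setB.some((b) => b.name === a.name && b.namespace === a.namespace));
}

console.log(canonicalizedIntersection(
  ["id", { name: "class", namespace: "" }],
  ["class", "title"],
));
```

Canonicalizing first matters: the string "class" and the dictionary with an empty namespace only compare equal once both are in canonical form.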

The get() method steps are:

Outside of the get() method, the order of the Sanitizer's elements and attributes is unobservable. By explicitly sorting the result of this method, we give implementations the opportunity to optimize by, for example, using unordered sets internally.

  1. Let config be this's configuration.

  2. Assert: config is valid.

  3. If config["elements"] exists:

    1. For each element of config["elements"]:

      1. If element["attributes"] exists, then set element["attributes"] to the result of sorting element["attributes"], with compare sanitizer items.

      2. If element["removeAttributes"] exists, then set element["removeAttributes"] to the result of sorting element["removeAttributes"], with compare sanitizer items.

    2. Set config["elements"] to the result of sorting config["elements"], with compare sanitizer items.

  4. Otherwise:

    1. Set config["removeElements"] to the result of sorting config["removeElements"], with compare sanitizer items.

  5. If config["replaceWithChildrenElements"] exists, then set config["replaceWithChildrenElements"] to the result of sorting config["replaceWithChildrenElements"], with compare sanitizer items.

  6. If config["processingInstructions"] exists, then set config["processingInstructions"] to the result of sorting config["processingInstructions"], with piA["target"] being code unit less than piB["target"].

  7. Otherwise:

    1. Set config["removeProcessingInstructions"] to the result of sorting config["removeProcessingInstructions"], with piA["target"] being code unit less than piB["target"].

  8. If config["attributes"] exists, then set config["attributes"] to the result of sorting config["attributes"] given compare sanitizer items.

  9. Otherwise:

    1. Set config["removeAttributes"] to the result of sorting config["removeAttributes"] given compare sanitizer items.

  10. Return config.
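The sorting performed by get() can be sketched as below for a configuration that is already valid and canonical. The "compare sanitizer items" ordering is not defined in this section, so compareSanitizerItems here is a stand-in that orders by namespace (null first) and then by name; that ordering is an assumption. The function names are hypothetical.

```javascript
// JavaScript's < and > on strings compare UTF-16 code units.
function byCodeUnit(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// ASSUMPTION: stand-in for "compare sanitizer items" (namespace, then name).
function compareSanitizerItems(a, b) {
  return byCodeUnit(a.namespace ?? "", b.namespace ?? "") || byCodeUnit(a.name, b.name);
}

function comparePIs(piA, piB) {
  return byCodeUnit(piA.target, piB.target);
}

// Sketch of the get() steps; expects a valid, canonical configuration.
function getSortedConfig(configuration) {
  // get() returns a copy; a JSON round-trip is enough for plain data.
  const config = JSON.parse(JSON.stringify(configuration));
  if (config.elements) {
    for (const element of config.elements) {
      if (element.attributes) element.attributes.sort(compareSanitizerItems);
      if (element.removeAttributes) element.removeAttributes.sort(compareSanitizerItems);
    }
    config.elements.sort(compareSanitizerItems);
  } else {
    config.removeElements.sort(compareSanitizerItems);
  }
  if (config.replaceWithChildrenElements) {
    config.replaceWithChildrenElements.sort(compareSanitizerItems);
  }
  if (config.processingInstructions) {
    config.processingInstructions.sort(comparePIs);
  } else {
    config.removeProcessingInstructions.sort(comparePIs);
  }
  if (config.attributes) {
    config.attributes.sort(compareSanitizerItems);
  } else {
    config.removeAttributes.sort(compareSanitizerItems);
  }
  return config;
}
```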

The allowElement(element) method steps are:

  1. Let configuration be this's configuration.

  2. Assert: configuration is valid.

  3. Set element to the result of canonicalizing element.

  4. If configuration["elements"] exists:

    1. Let modified be the result of removing element from configuration["replaceWithChildrenElements"].

    2. If configuration["attributes"] exists:

      1. If element["attributes"] exists:

        1. Set element["attributes"] to the result of creating a set from element["attributes"].

        2. Set element["attributes"] to the difference of element["attributes"] and configuration["attributes"].

        3. If configuration["dataAttributes"] is true, then remove all items item from element["attributes"] where item is a custom data attribute.

      2. If element["removeAttributes"] exists:

        1. Set element["removeAttributes"] to the result of creating a set from element["removeAttributes"].

        2. Set element["removeAttributes"] to the intersection of element["removeAttributes"] and configuration["attributes"].

    3. Otherwise:

      1. If element["attributes"] exists:

        1. Set element["attributes"] to the result of creating a set from element["attributes"].

        2. Set element["attributes"] to the difference of element["attributes"] and element["removeAttributes"] with default « ».

        3. Remove element["removeAttributes"].

        4. Set element["attributes"] to the difference of element["attributes"] and configuration["removeAttributes"].

      2. If element["removeAttributes"] exists:

        1. Set element["removeAttributes"] to the result of creating a set from element["removeAttributes"].

        2. Set element["removeAttributes"] to the difference of element["removeAttributes"] and configuration["removeAttributes"].

    4. If configuration["elements"] does not contain element:

      1. Append element to configuration["elements"].

      2. Return true.

    5. Let currentElement be the item in configuration["elements"] whose name member is element's name member and whose namespace member is element's namespace member.

    6. If element is equal to currentElement, then return modified.

    7. Remove element from configuration["elements"].

    8. Append element to configuration["elements"].

    9. Return true.

  5. Otherwise:

    1. If element["attributes"] exists or element["removeAttributes"] with default « » is not empty, then return false.

    2. Let modified be the result of removing element from configuration["replaceWithChildrenElements"].

    3. If configuration["removeElements"] does not contain element, then return modified.

    4. Remove element from configuration["removeElements"].

    5. Return true.
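The simpler of the two branches, step 5 (a configuration that uses a removeElements list), can be sketched as follows. The allow-list branch in step 4 is intentionally not covered. The helper names are hypothetical, and list membership is assumed to match on name and namespace.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

function canonicalizeElement(element) {
  if (typeof element === "string") return { name: element, namespace: HTML_NS };
  return { ...element, namespace: element.namespace === "" ? null : element.namespace };
}

// ASSUMPTION: two entries match when name and namespace are equal.
function sameItem(a, b) {
  return a.name === b.name && a.namespace === b.namespace;
}

// Removes every matching entry; returns whether anything was removed.
function removeFromList(item, list) {
  if (!list) return false;
  let removed = false;
  for (let i = list.length - 1; i >= 0; i--) {
    if (sameItem(list[i], item)) {
      list.splice(i, 1);
      removed = true;
    }
  }
  return removed;
}

// Sketch of allowElement(), step 5 only (remove-list configurations).
function allowElementInRemoveListConfig(configuration, element) {
  if (configuration.elements) {
    throw new Error("allow-list configurations are not covered by this sketch");
  }
  element = canonicalizeElement(element);
  // 5.1: per-element attribute lists cannot be expressed without an allow-list.
  if (element.attributes !== undefined || (element.removeAttributes ?? []).length > 0) {
    return false;
  }
  // 5.2: an allowed element must no longer be replaced with its children.
  const modified = removeFromList(element, configuration.replaceWithChildrenElements);
  // 5.3: nothing more to do if the element was not being removed.
  if (!configuration.removeElements.some((entry) => sameItem(entry, element))) {
    return modified;
  }
  // 5.4 and 5.5: drop it from the remove-list.
  removeFromList(element, configuration.removeElements);
  return true;
}
```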

The removeElement(element) method steps are to return the result of removing element from this's configuration.

The replaceElementWithChildren(element) method steps are:

  1. Let configuration be this's configuration.

  2. Assert: configuration is valid.

  3. Set element to the result of canonicalizing element.

  4. If the built-in non-replaceable elements list contains element, then return false.

  5. Let modified be the result of removing element from configuration["elements"].

  6. If removing element from configuration["removeElements"] is true, then set modified to true.

  7. If configuration["replaceWithChildrenElements"] does not contain element:

    1. Append element to configuration["replaceWithChildrenElements"].

    2. Return true.

  8. Return modified.
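These steps can be sketched as below. The contents of the built-in non-replaceable elements list are not given in this section, so the sketch takes the list as a parameter; the function names are hypothetical, and creating replaceWithChildrenElements when it is missing is an assumption.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

function canonicalizeElement(element) {
  if (typeof element === "string") return { name: element, namespace: HTML_NS };
  return { name: element.name, namespace: element.namespace === "" ? null : element.namespace };
}

// ASSUMPTION: entries match on name and namespace.
function sameItem(a, b) {
  return a.name === b.name && a.namespace === b.namespace;
}

function removeFromList(item, list) {
  if (!list) return false;
  let removed = false;
  for (let i = list.length - 1; i >= 0; i--) {
    if (sameItem(list[i], item)) {
      list.splice(i, 1);
      removed = true;
    }
  }
  return removed;
}

// Sketch of replaceElementWithChildren(); `nonReplaceable` stands in for the
// built-in non-replaceable elements list.
function replaceElementWithChildren(configuration, element, nonReplaceable) {
  element = canonicalizeElement(element);                        // step 3
  if (nonReplaceable.some((entry) => sameItem(entry, element))) {
    return false;                                                // step 4
  }
  let modified = removeFromList(element, configuration.elements); // step 5
  if (removeFromList(element, configuration.removeElements)) {   // step 6
    modified = true;
  }
  if (!configuration.replaceWithChildrenElements) {
    configuration.replaceWithChildrenElements = []; // assumption: create if absent
  }
  const list = configuration.replaceWithChildrenElements;
  if (!list.some((entry) => sameItem(entry, element))) {         // step 7
    list.push(element);
    return true;
  }
  return modified;                                               // step 8
}
```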

The allowAttribute(attribute) method steps are:

  1. Let configuration be this's configuration.

  2. Assert: configuration is valid.

  3. Set attribute to the result of canonicalizing attribute.

  4. If configuration["attributes"] exists:

    1. If configuration["dataAttributes"] is true and attribute is a custom data attribute, then return false.

    2. If configuration["attributes"] contains attribute, then return false.

    3. If configuration["elements"] exists:

      1. For each element in configuration["elements"]:

        1. If element["attributes"] with default « » contains attribute, then remove attribute from element["attributes"].

    4. Append attribute to configuration["attributes"].

    5. Return true.

  5. Otherwise:

    1. If configuration["removeAttributes"] does not contain attribute, then return false.

    2. Remove attribute from configuration["removeAttributes"].

    3. Return true.
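A sketch of these steps over a plain configuration object follows. The custom data attribute check is simplified (no namespace, a "data-" prefix, and at least one further character), and list membership is assumed to match on name and namespace. All function names are hypothetical.

```javascript
function canonicalizeAttribute(attribute) {
  if (typeof attribute === "string") return { name: attribute, namespace: null };
  return { name: attribute.name, namespace: attribute.namespace === "" ? null : attribute.namespace };
}

function sameItem(a, b) {
  return a.name === b.name && a.namespace === b.namespace;
}

function removeFromList(item, list) {
  const before = list.length;
  for (let i = list.length - 1; i >= 0; i--) {
    if (sameItem(list[i], item)) list.splice(i, 1);
  }
  return list.length !== before;
}

// Simplified check; the full definition has further constraints.
function isCustomDataAttribute(attribute) {
  return attribute.namespace === null &&
    attribute.name.startsWith("data-") && attribute.name.length > 5;
}

// Sketch of allowAttribute().
function allowAttribute(configuration, attribute) {
  attribute = canonicalizeAttribute(attribute);
  if (configuration.attributes) {
    // Already covered by dataAttributes or by the global allow-list.
    if (configuration.dataAttributes === true && isCustomDataAttribute(attribute)) return false;
    if (configuration.attributes.some((a) => sameItem(a, attribute))) return false;
    // Drop per-element duplicates, which would make the configuration invalid.
    for (const element of configuration.elements ?? []) {
      if (element.attributes) removeFromList(attribute, element.attributes);
    }
    configuration.attributes.push(attribute);
    return true;
  }
  // Remove-list configuration: allowing means un-removing.
  if (!configuration.removeAttributes.some((a) => sameItem(a, attribute))) return false;
  removeFromList(attribute, configuration.removeAttributes);
  return true;
}
```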

The removeAttribute(attribute) method steps are to return the result of removing attribute from this's configuration.

The setComments(allow) method steps are:

  1. Let configuration be this's configuration.

  2. Assert: configuration is valid.

  3. If configuration["comments"] exists and is equal to allow, then return false.

  4. Set configuration["comments"] to allow.

  5. Return true.

The setDataAttributes(allow) method steps are:

  1. Let configuration be this's configuration.

  2. Assert: configuration is valid.

  3. If configuration["attributes"] does not exist, then return false.

  4. If configuration["dataAttributes"] exists and is equal to allow, then return false.

  5. If allow is true:

    1. If configuration["elements"] exists:

      1. For each element of configuration["elements"]:

        1. If element["attributes"] exists, then remove all items item from element["attributes"] where item is a custom data attribute.

    2. Remove all items item from configuration["attributes"] where item is a custom data attribute.

  6. Set configuration["dataAttributes"] to allow.

  7. Return true.
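The setComments() and setDataAttributes() steps can be sketched together, since both report whether the configuration actually changed. The custom data attribute check is a simplification, and the function names mirror the methods but operate on a plain configuration object, which is an assumption of this sketch.

```javascript
// Simplified check; the full definition has further constraints.
function isCustomDataAttribute(attribute) {
  return attribute.namespace === null &&
    attribute.name.startsWith("data-") && attribute.name.length > 5;
}

// Sketch of setComments(): returns true only when the setting changes.
function setComments(configuration, allow) {
  if ("comments" in configuration && configuration.comments === allow) return false;
  configuration.comments = allow;
  return true;
}

// Sketch of setDataAttributes().
function setDataAttributes(configuration, allow) {
  // Step 3: the setting only applies to attribute allow-lists.
  if (!("attributes" in configuration)) return false;
  // Step 4: no change requested.
  if ("dataAttributes" in configuration && configuration.dataAttributes === allow) return false;
  if (allow === true) {
    // Step 5: explicit data-* entries would now be duplicates, so drop them.
    for (const element of configuration.elements ?? []) {
      if (element.attributes) {
        element.attributes = element.attributes.filter((a) => !isCustomDataAttribute(a));
      }
    }
    configuration.attributes = configuration.attributes.filter((a) => !isCustomDataAttribute(a));
  }
  configuration.dataAttributes = allow; // step 6
  return true;                          // step 7
}
```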

The allowProcessingInstruction(pi) method steps are:

  1. Let configuration be this's configuration.

  2. Assert: configuration is valid.

  3. Set pi to the result of canonicalizing pi.

  4. If configuration["processingInstructions"] exists:

    1. If configuration["processingInstructions"] contains pi, then return false.

    2. Append pi to configuration["processingInstructions"].

    3. Return true.

  5. Otherwise:

    1. If configuration["removeProcessingInstructions"] contains pi:

      1. Remove pi from configuration["removeProcessingInstructions"].

      2. Return true.

    2. Return false.

The removeProcessingInstruction(pi) method steps are:

  1. Let configuration be this's configuration.

  2. Assert: configuration is valid.

  3. Set pi to the result of canonicalizing pi.

  4. If configuration["processingInstructions"] exists:

    1. If configuration["processingInstructions"] contains pi:

      1. Remove pi from configuration["processingInstructions"].

      2. Return true.

    2. Return false.

  5. Otherwise:

    1. If configuration["removeProcessingInstructions"] contains pi, then return false.

    2. Append pi to configuration["removeProcessingInstructions"].

    3. Return true.
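The two processing instruction methods mirror each other, as this sketch over a plain configuration object shows. Function names are hypothetical, and entries are assumed to match on their target member.

```javascript
// Sketch of "canonicalize a processing instruction".
function canonicalizePI(pi) {
  if (typeof pi === "string") return { target: pi };
  return { target: pi.target };
}

function indexOfPI(list, pi) {
  return list.findIndex((entry) => entry.target === pi.target);
}

// Sketch of allowProcessingInstruction().
function allowProcessingInstruction(configuration, pi) {
  pi = canonicalizePI(pi);
  if (configuration.processingInstructions) {
    // Allow-list: add the target unless it is already present.
    if (indexOfPI(configuration.processingInstructions, pi) !== -1) return false;
    configuration.processingInstructions.push(pi);
    return true;
  }
  // Remove-list: allowing means taking the target off the list.
  const index = indexOfPI(configuration.removeProcessingInstructions, pi);
  if (index !== -1) {
    configuration.removeProcessingInstructions.splice(index, 1);
    return true;
  }
  return false;
}

// Sketch of removeProcessingInstruction().
function removeProcessingInstruction(configuration, pi) {
  pi = canonicalizePI(pi);
  if (configuration.processingInstructions) {
    // Allow-list: removing means taking the target off the list.
    const index = indexOfPI(configuration.processingInstructions, pi);
    if (index !== -1) {
      configuration.processingInstructions.splice(index, 1);
      return true;
    }
    return false;
  }
  // Remove-list: add the target unless it is already present.
  if (indexOfPI(configuration.removeProcessingInstructions, pi) !== -1) return false;
  configuration.removeProcessingInstructions.push(pi);
  return true;
}
```

In both methods the return value reports whether the configuration changed, so repeating a call returns false.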

The removeUnsafe() method steps are to return the result of removing unsafe from this's configuration.

8.6.3 Sanitizer configuration

SanitizerElementNamespace, SanitizerAttributeNamespace, SanitizerElementNamespaceWithAttributes, and SanitizerProcessingInstruction dictionaries are considered equal when all of their members are equal.

Equality should be defined in the infra spec instead. See issue #664.

8.6.3.1 Configuration invariants

Configurations can and ought to be modified by developers to suit their purposes. Options are to write a new SanitizerConfig dictionary from scratch, to modify an existing Sanitizer's configuration by using the modifier methods, or to get() an existing Sanitizer's configuration as a dictionary and modify the dictionary and then create a new Sanitizer with it.

An empty configuration allows everything (when called with the "unsafe" methods like setHTMLUnsafe()). A configuration "default" contains a built-in safe default configuration. Note that "safe" and "unsafe" sanitizer methods have different defaults.

Not all configuration dictionaries are valid. A valid configuration avoids redundancy (like specifying the same element to be allowed twice) and contradictions (like specifying an element to be both removed and allowed).

Several conditions need to hold for a configuration to be valid:

The elements element allow-list can also specify allowing or removing attributes for a given element. This is meant to mirror this standard's structure, which knows both global attributes as well as local attributes that apply to a specific element. Global and local attributes can be mixed, but note that ambiguous configurations where a particular attribute would be allowed by one list and forbidden by another, are generally invalid.

| | global attributes | global removeAttributes |
| --- | --- | --- |
| local attributes | An attribute is allowed if it matches either list. No duplicates are allowed. | An attribute is only allowed if it's in the local allow list. No duplicate entries between global remove and local allow lists are allowed. Note that the global remove list has no function for this particular element, but can apply to other elements that do not have a local allow list. |
| local removeAttributes | An attribute is allowed if it's in the global allow-list, but not in the local remove-list. Local remove has to be a subset of the global allow lists. | An attribute is allowed if it is in neither list. No duplicate entries between global remove and local remove lists are allowed. |
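The four combinations in the table can be read as a single decision function. The sketch below uses plain strings for attribute names and ignores namespaces and dataAttributes for brevity. Treating an element with no local lists as having an empty local remove-list is an assumption, and the function name isAttributeAllowed is hypothetical.

```javascript
// Sketch: decide whether `attribute` survives for one element, given the
// global configuration and that element's own (local) lists.
function isAttributeAllowed(globalConfig, elementConfig, attribute) {
  const inList = (list) => (list ?? []).includes(attribute);

  if (globalConfig.attributes !== undefined) {
    // Global allow-list: either allow-list admits the attribute,
    // and a local remove-list can take it away again.
    return (inList(globalConfig.attributes) || inList(elementConfig.attributes)) &&
      !inList(elementConfig.removeAttributes);
  }
  if (elementConfig.attributes !== undefined) {
    // Global remove-list with a local allow-list: only the local list counts.
    return inList(elementConfig.attributes);
  }
  // Global remove-list with a local remove-list (or no local list):
  // allowed if it appears in neither list.
  return !inList(globalConfig.removeAttributes) &&
    !inList(elementConfig.removeAttributes);
}

console.log(isAttributeAllowed({ attributes: ["class"] }, { attributes: ["href"] }, "href"));
console.log(isAttributeAllowed({ removeAttributes: ["id"] }, { removeAttributes: ["style"] }, "id"));
```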

Please note the asymmetry where mostly no duplicates between global and per-element lists are permitted, but in the case of a global allow-list and a per-element remove-list the latter has to be a subset of the former. An excerpt of the table above, only focusing on duplicates, is as follows:

| | global attributes | global removeAttributes |
| --- | --- | --- |
| local attributes | No duplicates are allowed. | No duplicates are allowed. |
| local removeAttributes | Local remove has to be a subset of the global allow lists. | No duplicates are allowed. |

The dataAttributes setting allows custom data attributes. The rules above easily extend to custom data attributes if one considers dataAttributes to be an allow-list:

| | global attributes and dataAttributes set |
| --- | --- |
| local attributes | All custom data attributes are allowed. No custom data attributes can be listed in any allow-list, as that would mean a duplicate entry. |
| local removeAttributes | A custom data attribute is allowed, unless it's listed in the local remove-list. No custom data attribute can be listed in the global allow-list, as that would mean a duplicate entry. |

Putting these rules in words:

8.6.4 Security considerations

The Sanitizer API is intended to prevent DOM-based cross-site scripting by traversing supplied HTML content and removing elements and attributes according to a configuration. By design, the setHTML() and parseHTML() methods remove script-capable markup regardless of the configuration supplied; if any configuration could preserve such markup through these methods, that would be a bug.

However, there are security issues that the Sanitizer API cannot prevent. The following sections describe them.

8.6.4.1 Server-side reflected and stored XSS

The Sanitizer API operates solely in the DOM and adds a capability to traverse and filter an existing DocumentFragment. The Sanitizer API does not address server-side reflected or stored XSS.

8.6.4.2 DOM clobbering

DOM clobbering describes an attack in which malicious HTML confuses an application by using id or name attributes such that DOM properties, such as the children property of an HTML element, are shadowed by malicious content.

The Sanitizer API does not protect against DOM clobbering attacks by default, but can be configured to remove id and name attributes.
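As an illustration only, a configuration that strips id and name could look like the sketch below. The variable and function names are hypothetical, and the sketch assumes the Sanitizer constructor accepts such a dictionary; where the constructor is unavailable the helper returns null.

```javascript
// Illustrative configuration: remove the attributes that DOM clobbering relies on.
const antiClobberingConfig = { removeAttributes: ["id", "name"] };

// Hypothetical helper: build a Sanitizer only where the constructor exists.
function createAntiClobberingSanitizer() {
  if (typeof Sanitizer !== "function") {
    return null; // e.g. Node.js, or a browser without the Sanitizer interface
  }
  return new Sanitizer(antiClobberingConfig);
}

console.log(createAntiClobberingSanitizer() === null
  ? "Sanitizer is not available in this environment"
  : "Sanitizer created");
```

The resulting object would be passed as the "sanitizer" member of the options given to a method such as setHTML(). Removing id and name everywhere can break fragment links and form submission, so this trade-off deserves review per application.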

8.6.4.3 XSS with script gadgets

Script gadgets are a technique in which an attacker uses existing application code from popular JavaScript libraries to cause their own code to execute. This is often done by injecting innocent-looking code or seemingly inert DOM nodes that are only parsed and interpreted by a framework which then performs the execution of JavaScript based on that input.

The Sanitizer API cannot prevent these attacks. Instead, it relies on authors to explicitly allow unknown elements in general, and additionally to explicitly allow attributes, elements, and markup commonly used for templating and framework-specific code, such as data-* and slot attributes and elements like slot and template. These restrictions are not exhaustive and authors are encouraged to examine their third party libraries for this behavior.

8.6.4.4 Mutation XSS

Mutation XSS or mXSS describes an attack that exploits cases where the parsed DOM structure is not the same after serializing and parsing again, to bypass sanitization that happens before serialization. An example for carrying out such an attack is by relying on the change of parsing behavior for foreign content or mis-nested tags.

The Sanitizer API offers only functions that turn a string into a node tree. The context is supplied implicitly by all sanitizer functions: setHTML() uses the current element; Document.parseHTML() creates a new document. Therefore the Sanitizer API is not directly affected by mutation XSS.

If a developer were to retrieve a sanitized node tree as a string, e.g. via innerHTML, and then parse it again, mutation XSS can occur. This practice is strongly discouraged. If processing or passing of HTML as a string is necessary after all, then any string is to be considered untrusted and re-sanitized when inserted into the DOM. In other words, a sanitized and then serialized HTML tree can no longer be considered sanitized. A more complete treatment of mXSS can be found in [MXSS].