html
The html module provides HTML/URL utilities and a complete DOM implementation for parsing, querying, manipulating, and serializing HTML documents.
import "html" for Html, Document, Element, TextNode, NodeList
Html Class
Html
HTML and URL encoding utilities
Static Methods
URL-encodes a string for use in URLs and query parameters.
- string (String) - The string to encode
- Returns: URL-encoded string
System.print(Html.urlencode("hello world")) // hello+world
Decodes a URL-encoded string.
- string (String) - The URL-encoded string
- Returns: Decoded string
System.print(Html.urldecode("hello+world")) // hello world
Converts a string to a URL-friendly slug.
- string (String) - The string to slugify
- Returns: URL-friendly slug
System.print(Html.slugify("Hello World!")) // hello-world
Escapes HTML special characters so that untrusted text can be embedded in markup safely, which helps prevent XSS attacks.
- string (String) - The string to escape
- Returns: HTML-escaped string
System.print(Html.quote("<script>")) // &lt;script&gt;
Unescapes HTML entities back to their original characters.
- string (String) - The HTML-escaped string
- Returns: Unescaped string
System.print(Html.unquote("&lt;div&gt;")) // <div>
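As a minimal sketch of how the escaping helpers fit together, the snippet below escapes a piece of untrusted text before embedding it in markup and then reverses the escaping. The variable names are illustrative only, and the exact set of characters that quote escapes is not stated in this reference.

```wren
import "html" for Html

// Hypothetical untrusted input, e.g. a user comment
var comment = "<script>alert(1)</script>"

// Escape before embedding in markup so the tags are rendered as text
var safe = Html.quote(comment)
System.print("<p>%(safe)</p>")

// Unescaping the result is expected to give back the original text
System.print(Html.unquote(safe) == comment)
```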
Encodes a map of key-value pairs into a URL query string.
- params (Map) - Map of parameters to encode
- Returns: URL-encoded query string
System.print(Html.encodeParams({"name": "John", "age": 30})) // name=John&age=30
Decodes a URL query string into a map of key-value pairs.
- string (String) - URL-encoded query string
- Returns: Map of decoded parameters
var params = Html.decodeParams("name=John&age=30")
System.print(params["name"]) // John
Document Class
Document
Represents a parsed HTML document. Provides methods to query and manipulate the DOM tree.
Constructor
Parses an HTML string and returns a Document object.
- html (String) - HTML string to parse
- Returns: Document object representing the parsed HTML
var doc = Document.parse("<div><p>Hello</p></div>")
System.print(doc.body) // Element
Properties
Returns the root element of the document (typically the html element).
Returns the head element of the document, or null if not present.
Returns the body element of the document, or null if not present.
Gets the document title (content of the title element).
var doc = Document.parse("<title>My Page</title>")
System.print(doc.title) // My Page
Sets the document title.
doc.title = "New Title"
Returns the inner HTML content of the document element.
Returns the complete HTML serialization of the document.
var doc = Document.parse("<div>Hello</div>")
System.print(doc.outerHTML) // <html><div>Hello</div></html>
Query Methods
Returns the first element matching the CSS selector, or null if not found.
- selector (String) - CSS selector string
- Returns: First matching Element or null
var doc = Document.parse("<div class='box'>Content</div>")
var elem = doc.querySelector(".box")
System.print(elem.textContent) // Content
Returns all elements matching the CSS selector.
- selector (String) - CSS selector string
- Returns: NodeList of matching elements
var doc = Document.parse("<p>One</p><p>Two</p>")
var elems = doc.querySelectorAll("p")
System.print(elems.count) // 2
Returns the element with the specified ID, or null if not found.
- id (String) - The element ID to search for
- Returns: Element with matching ID or null
var doc = Document.parse("<div id='main'>Content</div>")
var elem = doc.getElementById("main")
System.print(elem.tagName) // DIV
Returns all elements with the specified class name.
- className (String) - The class name to search for
- Returns: NodeList of matching elements
var doc = Document.parse("<div class='item'>A</div><div class='item'>B</div>")
var elems = doc.getElementsByClassName("item")
System.print(elems.count) // 2
Returns all elements with the specified tag name.
- tagName (String) - The tag name to search for (case-insensitive)
- Returns: NodeList of matching elements
var doc = Document.parse("<p>One</p><p>Two</p><span>Three</span>")
var elems = doc.getElementsByTagName("p")
System.print(elems.count) // 2
Element Creation
Creates a new element with the specified tag name.
- tagName (String) - The tag name for the new element
- Returns: New Element object
var doc = Document.parse("<div></div>")
var p = doc.createElement("p")
p.textContent = "Hello"
doc.querySelector("div").appendChild(p)
Creates a new text node with the specified content.
- text (String) - The text content
- Returns: New TextNode object
var doc = Document.parse("<div></div>")
var text = doc.createTextNode("Hello World")
doc.querySelector("div").appendChild(text)
Element Class
Element
Represents an HTML element in the DOM tree. Provides methods for traversal, manipulation, and querying.
Identity Properties
Returns the tag name of the element in uppercase.
var div = doc.querySelector("div")
System.print(div.tagName) // DIV
Gets the ID attribute of the element.
Sets the ID attribute of the element.
elem.id = "newId"
System.print(elem.id) // newId
Gets the class attribute as a string.
Sets the class attribute.
elem.className = "foo bar"
System.print(elem.className) // foo bar
Returns the class names as a list of strings.
elem.className = "foo bar baz"
var classes = elem.classList
System.print(classes[0]) // foo
System.print(classes.count) // 3
Node Properties
Returns the node type (1 for Element).
Returns the node name (same as tagName for elements).
Returns null for elements (nodeValue is only meaningful for text nodes).
Content Properties
Gets the text content of the element and all descendants.
Sets the text content, replacing all children with a single text node.
elem.textContent = "New text"
Gets the HTML content inside the element.
Sets the inner HTML, parsing and replacing all children.
elem.innerHTML = "<p>New content</p>"
Gets the HTML serialization of the element, including the element's own tag (outerHTML).
Sets the outer HTML, replacing the element itself with the parsed HTML.
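The difference between the three content properties is easiest to see side by side. The following is a small sketch using only names that appear in this reference; the exact serialization (quoting style, whitespace) is not specified here, so no output is shown.

```wren
import "html" for Document

var doc = Document.parse("<div id='box'><p>Hello <b>World</b></p></div>")
var box = doc.getElementById("box")

System.print(box.textContent) // text of the div and all its descendants
System.print(box.innerHTML)   // markup inside the div
System.print(box.outerHTML)   // markup including the div itself
```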
Attribute Methods
Gets the value of the specified attribute, or null if not present.
- name (String) - Attribute name
- Returns: Attribute value or null
var href = elem.getAttribute("href")
System.print(href)
Sets the value of the specified attribute.
- name (String) - Attribute name
- value (String) - Attribute value
elem.setAttribute("data-id", "123")
Removes the specified attribute.
- name (String) - Attribute name to remove
elem.removeAttribute("disabled")
Returns true if the element has the specified attribute.
- name (String) - Attribute name
- Returns: Boolean indicating presence
if (elem.hasAttribute("disabled")) {
  System.print("Element is disabled")
}
Returns a map of all attributes (name to value).
var attrs = elem.attributes
for (name in attrs.keys) {
  System.print("%(name)=%(attrs[name])")
}
Returns a map of all data-* attributes with camelCase keys.
// <div data-user-id="123" data-active="true">
var data = elem.dataset
System.print(data["userId"]) // 123
System.print(data["active"]) // true
Traversal Properties
Returns the parent node (element or document).
Returns the parent element, or null if parent is not an element.
Returns all child elements (excludes text nodes).
Returns all child nodes including text nodes.
Returns the first child node.
Returns the last child node.
Returns the first child that is an element.
Returns the last child that is an element.
Returns the next sibling node.
Returns the previous sibling node.
Returns the next sibling that is an element.
Returns the previous sibling that is an element.
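The traversal properties above can be combined to walk a subtree. This is a hedged sketch: only parentNode appears by name elsewhere in this reference, so firstElementChild, lastElementChild, nextElementSibling, and children are assumed to follow the standard DOM naming and may differ in the actual module.

```wren
import "html" for Document

var doc = Document.parse("<ul id='list'><li>One</li><li>Two</li><li>Three</li></ul>")
var list = doc.getElementById("list")

// Assumed property names (standard DOM naming, not confirmed by this reference)
var first = list.firstElementChild
var second = first.nextElementSibling

System.print(first.textContent)                 // One
System.print(second.textContent)                // Two
System.print(list.lastElementChild.textContent) // Three
System.print(list.children.count)               // 3

// parentNode is used elsewhere in this reference
System.print(second.parentNode.id)              // list
```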
Manipulation Methods
Appends a child node to the end of the element's children.
- child (Element|TextNode) - Node to append
- Returns: The appended node
var p = doc.createElement("p")
p.textContent = "New paragraph"
container.appendChild(p)
Inserts a node before the reference node.
- newNode (Element|TextNode) - Node to insert
- refNode (Element|TextNode|Null) - Reference node; if null, newNode is appended at the end
- Returns: The inserted node
var newP = doc.createElement("p")
container.insertBefore(newP, existingP)
Removes a child node from the element.
- child (Element|TextNode) - Node to remove
- Returns: The removed node
var removed = container.removeChild(child)
System.print(removed.textContent)
Replaces a child node with a new node.
- newChild (Element|TextNode) - Replacement node
- oldChild (Element|TextNode) - Node to replace
- Returns: The replaced (old) node
var newP = doc.createElement("p")
container.replaceChild(newP, oldP)
Creates a copy of the element.
- deep (Bool) - If true, clones all descendants
- Returns: Cloned element
var shallow = elem.cloneNode(false) // Just the element
var deep = elem.cloneNode(true) // Element and all children
Removes the element from its parent.
elem.remove() // Element is now detached
Merges adjacent text nodes and removes empty text nodes.
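A short sketch of when merging text nodes matters: appending several text nodes in a row leaves them as separate children until they are merged. The method name normalize() is an assumption based on the standard DOM, since the name is not shown in this reference.

```wren
import "html" for Document

var doc = Document.parse("<p></p>")
var p = doc.querySelector("p")

// Two separate, adjacent text nodes
p.appendChild(doc.createTextNode("Hello, "))
p.appendChild(doc.createTextNode("World"))

// Assumed method name (standard DOM naming): merge them into one text node
p.normalize()

// The visible text is the same before and after merging
System.print(p.textContent) // Hello, World
```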
Query Methods
Returns the first descendant matching the CSS selector.
Returns all descendants matching the CSS selector.
Returns all descendants with the specified class.
Returns all descendants with the specified tag.
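The element-level query methods search only below the element they are called on. The sketch below assumes they share their names with the Document query methods (querySelector on an element is used elsewhere in this reference; the others are assumed to match).

```wren
import "html" for Document

var doc = Document.parse("<ul id='a'><li class='x'>1</li><li>2</li></ul><ul id='b'><li class='x'>3</li></ul>")
var a = doc.getElementById("a")

// Scoped to the first list
System.print(a.querySelectorAll("li").count)      // 2
System.print(a.getElementsByClassName("x").count) // 1
System.print(a.getElementsByTagName("li").count)  // 2

// The same selector on the document sees both lists
System.print(doc.querySelectorAll("li").count)    // 3
```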
Returns true if the element matches the CSS selector.
if (elem.matches(".active")) {
  System.print("Element is active")
}
Returns the closest ancestor (or self) matching the selector.
var container = elem.closest(".container")
if (container) {
  System.print("Found container: %(container.id)")
}
Returns true if the node is a descendant of this element.
if (container.contains(child)) {
  System.print("Child is inside container")
}
Returns true if the element has any child nodes.
if (elem.hasChildNodes()) {
  System.print("Element has children")
}
TextNode Class
TextNode
Represents a text node in the DOM tree.
Properties
Gets the text content.
Sets the text content.
Returns 3 (TEXT_NODE).
Returns "#text".
Gets the node value, which for a text node is its text content (same as textContent).
Sets the node value, replacing the text content.
Traversal Properties
Returns the parent node.
Returns the parent element.
Returns the next sibling node.
Returns the previous sibling node.
Methods
Creates a copy of the text node (deep parameter is ignored).
Removes the text node from its parent.
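A brief sketch of working with a text node directly. It assumes the TextNode methods are named cloneNode(deep) and remove(), mirroring the Element methods of the same description; those names are not shown for TextNode in this reference.

```wren
import "html" for Document

var doc = Document.parse("<p></p>")
var p = doc.querySelector("p")

var text = doc.createTextNode("Hello")
p.appendChild(text)

// Changing the node's text changes what the parent reports
text.textContent = "Goodbye"
System.print(p.textContent) // Goodbye

// Assumed names: cloneNode(deep) and remove(), as on Element
var copy = text.cloneNode(false)
text.remove()

System.print(p.hasChildNodes()) // false
System.print(copy.textContent)  // Goodbye
```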
NodeList Class
NodeList
An iterable collection of DOM nodes returned by query methods.
Properties
Returns the number of nodes in the list.
Access Methods
Returns the node at the specified index, or null if out of bounds.
var first = list[0]
var last = list[list.count - 1]
Same as subscript access.
Converts the NodeList to a standard Wren List.
var list = nodeList.toList
for (elem in list) {
  System.print(elem.tagName)
}
Iteration
NodeList supports iteration with for-in loops:
for (elem in doc.querySelectorAll("p")) {
  System.print(elem.textContent)
}
Calls the function for each node in the list.
nodeList.forEach {|elem|
  System.print(elem.tagName)
}
CSS Selectors
The DOM implementation supports the following CSS selector syntax:
| Selector | Example | Description |
|---|---|---|
| Tag | div, p | Matches elements by tag name |
| Universal | * | Matches all elements |
| ID | #myId | Matches element with id="myId" |
| Class | .myClass | Matches elements with class="myClass" |
| Attribute | [href] | Matches elements with href attribute |
| Attribute equals | [type="text"] | Matches elements with type="text" |
| Attribute contains word | [class~="item"] | Matches if class contains "item" as a word |
| Attribute starts with | [href^="https"] | Matches if href starts with "https" |
| Attribute ends with | [src$=".png"] | Matches if src ends with ".png" |
| Attribute contains | [title*="hello"] | Matches if title contains "hello" |
| Descendant | div p | Matches p anywhere inside div |
| Child | div > p | Matches p that is direct child of div |
| Adjacent sibling | h1 + p | Matches p immediately after h1 |
| General sibling | h1 ~ p | Matches any p after h1 |
| :first-child | p:first-child | Matches first child element |
| :last-child | p:last-child | Matches last child element |
| :nth-child(n) | li:nth-child(2) | Matches 2nd child |
| :nth-child(odd/even) | tr:nth-child(odd) | Matches odd rows |
| :first-of-type | p:first-of-type | First p among siblings |
| :last-of-type | p:last-of-type | Last p among siblings |
| :only-child | p:only-child | Matches if only child |
| :empty | div:empty | Matches elements with no children |
Compound Selectors
doc.querySelector("div.container#main") // Tag + class + ID
doc.querySelector("input[type='text'].large") // Tag + attribute + class
doc.querySelector("ul > li:first-child") // Combinator + pseudo-class
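To see several of the selector forms from this section against one small document, here is a minimal sketch. The markup is made up for illustration, and the commented results follow from the selector descriptions in this section rather than from a recorded run.

```wren
import "html" for Document

var doc = Document.parse("<ul><li class='item'>One</li><li class='item'>Two</li><li>Three</li></ul>")

// Compound: tag + class
System.print(doc.querySelectorAll("li.item").count)            // 2

// Structural pseudo-classes
System.print(doc.querySelector("li:first-child").textContent)  // One
System.print(doc.querySelector("li:nth-child(2)").textContent) // Two
System.print(doc.querySelector("li:last-child").textContent)   // Three

// Combinators: direct children, then the first li that follows another li
System.print(doc.querySelectorAll("ul > li").count)            // 3
System.print(doc.querySelector("li + li").textContent)         // Two
```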
Examples
Web Scraping
import "html" for Document
var html = """
<div class="products">
  <div class="product">
    <h2>Widget</h2>
    <span class="price">$9.99</span>
  </div>
  <div class="product">
    <h2>Gadget</h2>
    <span class="price">$19.99</span>
  </div>
</div>
"""
var doc = Document.parse(html)
for (product in doc.querySelectorAll(".product")) {
  var name = product.querySelector("h2").textContent
  var price = product.querySelector(".price").textContent
  System.print("%(name): %(price)")
}
HTML Generation
import "html" for Document
var doc = Document.parse("<div id='root'></div>")
var root = doc.getElementById("root")
var items = ["Apple", "Banana", "Cherry"]
var ul = doc.createElement("ul")
for (item in items) {
  var li = doc.createElement("li")
  li.textContent = item
  ul.appendChild(li)
}
root.appendChild(ul)
System.print(doc.outerHTML)
DOM Transformation
import "html" for Document
var doc = Document.parse("<p>Hello <b>World</b></p>")
// Replace all <b> with <strong>
for (b in doc.querySelectorAll("b").toList) {
  var strong = doc.createElement("strong")
  strong.innerHTML = b.innerHTML
  b.parentNode.replaceChild(strong, b)
}
System.print(doc.outerHTML)
The DOM implementation automatically decodes HTML entities when parsing and encodes them when serializing.
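As a small illustration of that behavior, the sketch below parses markup containing an entity and reads it back both as text and as serialized HTML. The commented results are what the statement above implies, not a recorded run.

```wren
import "html" for Document

var doc = Document.parse("<p>Fish &amp; Chips</p>")
var p = doc.querySelector("p")

// Entities are decoded in the parsed tree
System.print(p.textContent) // Fish & Chips

// and encoded again on serialization
System.print(p.innerHTML)   // Fish &amp; Chips
```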