The HTML Selection API lets developers access the text a user has highlighted in the browser and perform DOM and text manipulation on it. These features are available in every modern browser, as well as in legacy browsers back to IE9. The API can do more complex things than are covered here, but this article illustrates some possible uses and should give you an idea of how to start using it.
The Window Selection
Use of the HTML Selection API revolves primarily around two objects: the Selection object and the Range object. The Selection object refers to the current text selection on the page. Four of its properties can be used in combination to locate the selection within the HTML document: anchorNode, anchorOffset, focusNode, and focusOffset. The anchorNode is the Node in which the user began the selection and the focusNode is the Node in which it ended, so for a selection made from left to right the anchorNode contains the start of the selected text. The meaning of the anchorOffset depends on the type of the anchorNode. If the anchorNode is a text node, the offset is the index of the starting character within that node. If the anchorNode is an Element, the offset is the index of the child node of that element that contains the starting character, and in that case you can assume that the text selection starts at the beginning of that child node. The focusOffset follows the same rules for the focusNode.
Two other properties are also useful. isCollapsed is a boolean that is true when the selection has a length of 0, meaning there is no selected text. The rangeCount property indicates how many Range objects are part of the selection.
    var selection = window.getSelection();
    if (!selection.isCollapsed) {
        // we have a non-zero length selection
        // note: anchor and focus follow the direction of the selection, so for a
        // selection made from right to left the anchor comes after the focus
        var startNode = selection.anchorNode;
        var startOffset = selection.anchorOffset;
        if (startNode instanceof Element) {
            // if it is an element then the offset is the child node index
            startNode = startNode.childNodes[startOffset];
            startOffset = 0;
        }
        var endNode = selection.focusNode;
        var endOffset = selection.focusOffset;
        if (endNode instanceof Element) {
            // if it is an element then the offset is the child node index
            endNode = endNode.childNodes[endOffset];
            endOffset = 0;
        }
        console.log(startNode, startOffset, endNode, endOffset);
    }
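One caveat with the anchor and focus is that they follow the direction in which the user dragged, so they are not guaranteed to be in document order. A minimal sketch of one way around that, assuming it is acceptable to read the boundaries from the selection's first Range, whose start always comes before its end. The helper name getSelectionBounds is hypothetical and not part of the API.

```javascript
// Hypothetical helper (not part of the Selection API): returns the selection
// boundaries in document order, or null when nothing is selected.
function getSelectionBounds(selection) {
  if (!selection || selection.isCollapsed || selection.rangeCount === 0) {
    return null;
  }
  // A Range reports its start before its end, regardless of the
  // direction in which the user made the selection.
  var range = selection.getRangeAt(0);
  return {
    startNode: range.startContainer,
    startOffset: range.startOffset,
    endNode: range.endContainer,
    endOffset: range.endOffset
  };
}

// Browser-only usage; the guard lets the script load outside a browser too.
if (typeof window !== 'undefined' && typeof window.getSelection === 'function') {
  console.log(getSelectionBounds(window.getSelection()));
}
```

The returned offsets follow the same text-node versus Element rules described above, so the Element handling from the previous example still applies to them.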
The Selection Range
Range objects represent fragments of the document, and they can be added to the Selection object. A Range that belongs to a Selection never extends beyond the boundaries of that selection. By default the 0th Range in a Selection contains the entire selection, although it can be modified programmatically, or additional Ranges can be constructed and added to the Selection (browser support for more than one Range per selection varies). The Range object exposes various methods for manipulating the text in the document. The simplest, toString(), returns the text included in the range. It returns only the text, so if the text is wrapped in markup such as strong or em tags you will not see those tags, only the text inside them.
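As a rough sketch of reading the text out of a selection, the helper below loops over rangeCount and collects the toString() result of each Range. The name getSelectedTexts is hypothetical, not part of the API, and in many browsers the loop will run only once because the selection holds a single Range.

```javascript
// Hypothetical helper: returns the plain text of every Range in a Selection.
function getSelectedTexts(selection) {
  var texts = [];
  for (var i = 0; i < selection.rangeCount; i++) {
    // toString() drops the markup and keeps only the text
    texts.push(selection.getRangeAt(i).toString());
  }
  return texts;
}

// Browser-only usage; the guard lets the script load outside a browser too.
if (typeof window !== 'undefined' && typeof window.getSelection === 'function') {
  var pageSelection = window.getSelection();
  if (pageSelection) {
    console.log(getSelectedTexts(pageSelection));
  }
}
```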
Text Manipulation
Some methods make life easier when you need to manipulate the markup or text. The Range has two methods, cloneContents() and extractContents(), that return a DocumentFragment containing the selected content; cloneContents() copies it and leaves the document untouched, while extractContents() removes it from the document. If a node is only partially included, its parent tags are cloned so that the partial content is wrapped appropriately. Event listeners are not copied over, but IDs and other attributes are, so appending a cloned fragment back into the document can leave you with invalid HTML such as duplicate IDs. There are probably many uses for these methods; one example is to copy the text into a “clipboard” that shows the user everything they have highlighted within the document.
    var range = selection.getRangeAt(0);
    var selectedFragment = range.cloneContents();
    var clippedText = document.createElement('li');
    clippedText.appendChild(selectedFragment);
    document.getElementById('output').appendChild(clippedText);
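If copied IDs are a concern, one option is to strip them from the cloned fragment before appending it. The sketch below assumes that simply dropping the id attribute is acceptable for your page. stripIds is a hypothetical helper name, and 'output' is the element ID used in the example above.

```javascript
// Hypothetical helper: removes every id attribute inside a DocumentFragment
// (or element) so that re-inserting the clone cannot duplicate an ID.
function stripIds(fragment) {
  var withIds = fragment.querySelectorAll('[id]');
  for (var i = 0; i < withIds.length; i++) {
    withIds[i].removeAttribute('id');
  }
  return fragment;
}

// Browser-only usage; the guard lets the script load outside a browser too.
if (typeof window !== 'undefined' && typeof window.getSelection === 'function') {
  var currentSelection = window.getSelection();
  var output = document.getElementById('output');
  if (currentSelection && currentSelection.rangeCount > 0 && output) {
    var item = document.createElement('li');
    item.appendChild(stripIds(currentSelection.getRangeAt(0).cloneContents()));
    output.appendChild(item);
  }
}
```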
Using a TreeWalker
DocumentFragments might get you some of what you need. However, if you want the HTML markup contained in the selection along with some filtering ability, the Range exposes enough information for you to set up a TreeWalker relatively easily. A TreeWalker lets you walk from the start node to the end node, filter out nodes based on criteria that you provide, and capture the contents of the nodes contained in your Range. You will want to restrict the traversal to the selected text only. The general principle is to use the Text node’s splitText() method to split the starting node in two at the start offset, and to split the ending node at the end offset. You then construct a TreeWalker, start it at the newly created starting node, and stop when it reaches the ending node. The Range also has a commonAncestorContainer property that you can use as the TreeWalker’s root, which limits how high the TreeWalker will traverse and keeps it away from parts of the document that cannot lie between the starting node and the ending node. Since a Range can be just a subset of the Selection, the Range also has startContainer, endContainer, startOffset, and endOffset properties that follow the same rules as the Selection’s properties.
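    // this example will grab all of the text EXCEPT for that inside of any elements with the ID of ignore
    // startNode, startOffset, endNode and endOffset come from the first example,
    // so this assumes a non-collapsed selection

    // the common ancestor limits how high the TreeWalker is allowed to go
    var root = selection.getRangeAt(0).commonAncestorContainer;
    if (root.nodeType === Node.TEXT_NODE) {
        root = root.parentNode;
    }

    // now let's split the text
    var sameNode = (startNode === endNode);
    if (endNode instanceof Text) {
        endNode.splitText(endOffset);
    }
    if (startNode instanceof Text) {
        // splitText() returns the new node holding the text after the offset
        startNode = startNode.splitText(startOffset);
        if (sameNode) {
            // the selection sits inside a single text node, so the new node is also the end
            endNode = startNode;
        }
    }

    // filter out the element with id=ignore
    function filterFunction(node) {
        if (node.id === 'ignore') {
            return NodeFilter.FILTER_REJECT;
        }
        return NodeFilter.FILTER_ACCEPT;
    }

    var nodeTypes = NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_TEXT;
    var walker = document.createTreeWalker(root, nodeTypes, { acceptNode: filterFunction }, false);
    walker.currentNode = startNode;
    var nextNode = walker.currentNode;
    var html = [];
    while (nextNode && nextNode !== endNode) {
        if (nextNode.nodeType === Node.TEXT_NODE) {
            html.push(nextNode.nodeValue);
        }
        nextNode = walker.nextNode();
    }
    // nextNode is null if the walker ran out of nodes before reaching endNode
    if (nextNode && nextNode.nodeType === Node.TEXT_NODE) {
        html.push(nextNode.nodeValue);
    }

    var filteredOutput = document.createElement('li');
    // the collected pieces are plain text, so assign them as text rather than parsing them as HTML
    filteredOutput.textContent = html.join('');
    document.getElementById('output').appendChild(filteredOutput);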
Once you have used a TreeWalker to traverse the contents of your Range then you can also change your approach and modify the markup as you go. For instance, if you wanted to make all of the selected text bold then you could use the TreeWalker to find all Text nodes within the Range and wrap them with strong elements or span elements appropriately styled.
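A minimal sketch of that idea, under a few assumptions: the walker is already positioned on the first selected node and endNode is the last one, as in the example above, and the text nodes are collected before any wrapping so that the tree is not modified while it is being walked. collectTextNodes and wrapInStrong are hypothetical helper names.

```javascript
var TEXT_NODE = 3; // same value as Node.TEXT_NODE

// Walks from walker.currentNode up to and including endNode and
// returns the text nodes found along the way.
function collectTextNodes(walker, endNode) {
  var textNodes = [];
  var node = walker.currentNode;
  while (node) {
    if (node.nodeType === TEXT_NODE) {
      textNodes.push(node);
    }
    if (node === endNode) {
      break;
    }
    node = walker.nextNode();
  }
  return textNodes;
}

// Wraps each text node in its own <strong> element, in place.
// doc is the document used to create the new elements.
function wrapInStrong(textNodes, doc) {
  for (var i = 0; i < textNodes.length; i++) {
    var textNode = textNodes[i];
    var strong = doc.createElement('strong');
    // put the <strong> where the text node was, then move the text inside it
    textNode.parentNode.replaceChild(strong, textNode);
    strong.appendChild(textNode);
  }
}
```

In a page this would be called as wrapInStrong(collectTextNodes(walker, endNode), document). Swapping 'strong' for an appropriately styled span would work the same way.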
How to Use Selections
Now that we have gone over how to use the Selection and Range objects, let’s take a quick look at how you might use them in your code. You can call window.getSelection() at any time; if there is no selected text, you will be given a collapsed Selection object, so any JavaScript event can be used to check for a valid Selection. One common approach is to use the mouseup event to trigger the check for a Selection, since mouseup fires when a user finishes highlighting text with the mouse. It is important, though, to always check the isCollapsed property before doing anything with it, since you will always be given a Selection object.
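A minimal sketch of the mouseup approach might look like the following. readSelection is a hypothetical helper name, and logging the text is only a stand-in for whatever your application does with the selection.

```javascript
// Hypothetical helper: returns the selected text, or null when the
// selection is missing or collapsed.
function readSelection(selection) {
  if (!selection || selection.isCollapsed) {
    return null;
  }
  return selection.toString();
}

// Browser-only wiring; the guard lets the script load outside a browser too.
if (typeof window !== 'undefined' && typeof document !== 'undefined') {
  document.addEventListener('mouseup', function () {
    var text = readSelection(window.getSelection());
    if (text !== null) {
      console.log('Selected text:', text);
    }
  });
}
```

Selections made with the keyboard do not fire mouseup, so depending on your needs you may also want to listen for other events, such as keyup or the document's selectionchange event.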
I hope that this article was able to shed some light on an aspect of the HTML specification that does not seem to be widely known, and to give you some ideas of how it might be useful in your applications. With more applications moving to the web these days, I think that text manipulation features like these will become more widely used. All of the source code examples used in this article are available in a JSFiddle snippet.