You might expect all XML parsers to provide access to this sort of information as a matter of course. The standard SAX parser does (as we will see), but the DOM parser does not, unless the XML actually fails to parse. The standard DOM parser probably uses a SAX parser under the hood, but the API denies us access to it.
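To make concrete what SAX offers here, the following is a minimal sketch of a handler that records the line reported by the parser's Locator for each start tag. It is an illustration only, not part of the final solution; the names SaxLocationDemo and elementLines are made up for this example. A parser is not obliged to supply a Locator, hence the null check.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the article's solution.
class SaxLocationDemo {

    /** Parses the XML text and returns one "name@line" entry per element. */
    public static List<String> elementLines(String xml) throws Exception {
        final List<String> found = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private Locator locator;

            @Override
            public void setDocumentLocator(Locator locator) {
                // Called by the parser before any other event, if at all.
                this.locator = locator;
            }

            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes atts) {
                // The locator describes the position of the current event.
                int line = (locator == null) ? -1 : locator.getLineNumber();
                found.add(qName + "@" + line);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return found;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<book>\n  <title>Example</title>\n</book>";
        System.out.println(elementLines(xml));
    }
}
```

Each entry pairs an element name with the line the parser reported when it delivered the start tag event.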
Switching from DOM to SAX is a high price to pay just for better error reporting. Besides, you may need DOM-based tools such as XSLT. You could switch to a third-party parser, or access the underlying SAX parser in some sneaky, non-standard and unsupported way, or you can just use the following trick.
Use the javax.xml.transform API.
The great thing about the XSL Transformation API is that it accepts a number of different types of input source and output destination. Possible options are:
- input and output streams
- files
- SAX parser input sources
- DOM fragments and documents
- JAXB object models
So we can read XML using a SAX parser and get a resulting DOM.
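As a rough sketch of that idea, the snippet below feeds a SAX XMLReader into a Transformer created without a stylesheet and collects the output in a DOMResult. The class and method names (SaxToDomDemo, parse) are invented for illustration, and turning on namespace awareness is a choice made for this example rather than a requirement stated here.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical demo class: SAX in, DOM out.
class SaxToDomDemo {

    /** Parses XML text with a SAX reader and returns the resulting DOM. */
    public static Document parse(String xml) throws Exception {
        SAXParserFactory saxFactory = SAXParserFactory.newInstance();
        saxFactory.setNamespaceAware(true); // assumption: wanted by most DOM consumers
        XMLReader reader = saxFactory.newSAXParser().getXMLReader();

        // The SAXSource tells the transformer which reader to drive.
        SAXSource source = new SAXSource(reader,
                new InputSource(new StringReader(xml)));

        // With no node supplied, the DOMResult creates a new Document.
        DOMResult result = new DOMResult();

        // A transformer created without a stylesheet copies input to output.
        Transformer copier = TransformerFactory.newInstance().newTransformer();
        copier.transform(source, result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document dom = parse("<book><title>Example</title></book>");
        System.out.println(dom.getDocumentElement().getTagName());
    }
}
```

Because the reader is passed in explicitly, it can later be swapped for a filter that wraps it, which is what the rest of this article relies on.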
In our case we don't actually want to apply an XSL transformation. However, the API will provide us with a Transformer that just copies its input to the output form without altering the data. So here we have a tool that can convert XML from one form to another. The following examples convert XML from a file into a DOM and back.
Reading an XML file into a DOM:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
// Do not share transformers between threads
Transformer nullTransformer = transformerFactory.newTransformer();
Source fileSource = new StreamSource(new File("input.xml"));
DOMResult domResult = new DOMResult();
nullTransformer.transform(fileSource, domResult);
Document dom = (Document) domResult.getNode();
Writing an XML DOM to a file:
Source domSource = new DOMSource(dom);
Result fileResult = new StreamResult(new File("output.xml"));
nullTransformer.transform(domSource, fileResult);
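For experimenting without touching the file system, the same round trip can be sketched entirely in memory. This is only an illustrative variant of the two snippets above: the class name RoundTripDemo and the method roundTrip are hypothetical, and suppressing the XML declaration is a choice made here to keep the output short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

// Hypothetical demo class: text -> DOM -> text with a copying transformer.
class RoundTripDemo {

    /** Reads XML text into a DOM, then serializes that DOM back to text. */
    public static String roundTrip(String xml) throws Exception {
        // One transformer, used by one thread, for both directions.
        Transformer copier = TransformerFactory.newInstance().newTransformer();
        copier.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        // Text -> DOM
        DOMResult domResult = new DOMResult();
        copier.transform(new StreamSource(new StringReader(xml)), domResult);
        Document dom = (Document) domResult.getNode();

        // DOM -> text
        StringWriter out = new StringWriter();
        copier.transform(new DOMSource(dom), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("<book><title>Example</title></book>"));
    }
}
```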
So how do we obtain the line number information for DOM nodes?
The trick is to use a SAX parser and attach the location information it provides to the created element nodes as they are added to the DOM. Here is a SAX filter that does exactly this:
import java.util.Stack;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.UserDataHandler;
import org.w3c.dom.events.Event;
import org.w3c.dom.events.EventListener;
import org.w3c.dom.events.EventTarget;
import org.w3c.dom.events.MutationEvent;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.LocatorImpl;
import org.xml.sax.helpers.XMLFilterImpl;

public class LocationAnnotator extends XMLFilterImpl {

    private Locator locator;
    private Element lastAddedElement;
    private Stack<Locator> locatorStack = new Stack<Locator>();
    private UserDataHandler dataHandler = new LocationDataHandler();

    LocationAnnotator(XMLReader xmlReader, Document dom) {
        super(xmlReader);

        // Add listener to DOM, so we know which node was added.
        EventListener modListener = new EventListener() {
            @Override
            public void handleEvent(Event e) {
                EventTarget target = ((MutationEvent) e).getTarget();
                // Only element nodes are annotated; ignore any other
                // node type that reports an insertion.
                if (target instanceof Element) {
                    lastAddedElement = (Element) target;
                }
            }
        };
        ((EventTarget) dom).addEventListener("DOMNodeInserted",
                modListener, true);
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        super.setDocumentLocator(locator);
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes atts) throws SAXException {
        super.startElement(uri, localName, qName, atts);

        // Keep snapshot of start location,
        // for later when end of element is found.
        locatorStack.push(new LocatorImpl(locator));
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {

        // Mutation event fired by the adding of element end,
        // and so lastAddedElement will be set.
        super.endElement(uri, localName, qName);

        if (locatorStack.size() > 0) {
            Locator startLocator = locatorStack.pop();

            LocationData location = new LocationData(
                    startLocator.getSystemId(),
                    startLocator.getLineNumber(),
                    startLocator.getColumnNumber(),
                    locator.getLineNumber(),
                    locator.getColumnNumber());

            lastAddedElement.setUserData(
                    LocationData.LOCATION_DATA_KEY, location,
                    dataHandler);
        }
    }

    // Ensure location data copied to any new DOM node.
    private class LocationDataHandler implements UserDataHandler {

        @Override
        public void handle(short operation, String key, Object data,
                Node src, Node dst) {

            if (src != null && dst != null) {
                LocationData locationData = (LocationData)
                        src.getUserData(LocationData.LOCATION_DATA_KEY);

                if (locationData != null) {
                    dst.setUserData(LocationData.LOCATION_DATA_KEY,
                            locationData, dataHandler);
                }
            }
        }
    }
}
Next, the LocationData objects that the filter attaches to each DOM element node:
public class LocationData {

    public static final String LOCATION_DATA_KEY = "locationDataKey";

    private final String systemId;
    private final int startLine;
    private final int startColumn;
    private final int endLine;
    private final int endColumn;

    public LocationData(String systemId, int startLine,
            int startColumn, int endLine, int endColumn) {
        this.systemId = systemId;
        this.startLine = startLine;
        this.startColumn = startColumn;
        this.endLine = endLine;
        this.endColumn = endColumn;
    }

    public String getSystemId() {
        return systemId;
    }

    public int getStartLine() {
        return startLine;
    }

    public int getStartColumn() {
        return startColumn;
    }

    public int getEndLine() {
        return endLine;
    }

    public int getEndColumn() {
        return endColumn;
    }

    @Override
    public String toString() {
        return getSystemId() + "[line " + startLine + ":"
                + startColumn + " to line " + endLine + ":"
                + endColumn + "]";
    }
}
The final piece of code shows how to wire up all the pieces:
/*
* During application startup
*/
DocumentBuilderFactory documentBuilderFactory
= DocumentBuilderFactory.newInstance();
TransformerFactory transformerFactory
= TransformerFactory.newInstance();
Transformer nullTransformer
= transformerFactory.newTransformer();
/*
* Create an empty document to be populated within a DOMResult.
*/
DocumentBuilder docBuilder
= documentBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.newDocument();
DOMResult domResult = new DOMResult(doc);
/*
* Create SAX parser/XMLReader that will parse XML. If factory
* options are not required then this can be short cut by:
* xmlReader = XMLReaderFactory.createXMLReader();
*/
SAXParserFactory saxParserFactory
= SAXParserFactory.newInstance();
// saxParserFactory.setNamespaceAware(true);
// saxParserFactory.setValidating(true);
SAXParser saxParser = saxParserFactory.newSAXParser();
XMLReader xmlReader = saxParser.getXMLReader();
/*
* Create our filter to wrap the SAX parser, that captures the
* locations of elements and annotates their nodes as they are
* inserted into the DOM.
*/
LocationAnnotator locationAnnotator
= new LocationAnnotator(xmlReader, doc);
/*
* Create the SAXSource to use the annotator.
*/
// A system ID should be a URI rather than a bare file path.
String systemId = new File("example.xml").toURI().toString();
InputSource inputSource = new InputSource(systemId);
SAXSource saxSource
= new SAXSource(locationAnnotator, inputSource);
/*
* Finally read the XML into the DOM.
*/
nullTransformer.transform(saxSource, domResult);
/*
* Find one of the element nodes in our DOM and output the location
* information.
*/
Node n = doc.getElementsByTagName("title").item(0);
LocationData locationData = (LocationData)
n.getUserData(LocationData.LOCATION_DATA_KEY);
System.out.println(locationData);
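Since only element nodes are annotated, code that reports a problem on a text or attribute node has to look upwards for the nearest location. The following is one possible sketch of such a lookup; LocationLookupDemo, nearestLocation, describe and demo are hypothetical names, and the demo attaches a hand-written string where the real code would find a LocationData object.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper, not part of the article's solution.
class LocationLookupDemo {

    // Same key string as LocationData.LOCATION_DATA_KEY.
    public static final String LOCATION_DATA_KEY = "locationDataKey";

    /** Returns the user data of the nearest annotated node, or null. */
    public static Object nearestLocation(Node node) {
        Node n = node;
        while (n != null) {
            Object data = n.getUserData(LOCATION_DATA_KEY);
            if (data != null) {
                return data;
            }
            // Attributes have no parent node; step to their owner element.
            n = (n instanceof Attr)
                    ? ((Attr) n).getOwnerElement()
                    : n.getParentNode();
        }
        return null;
    }

    /** Builds an error message prefixed with the best known location. */
    public static String describe(Node node, String message) {
        Object location = nearestLocation(node);
        return (location == null ? "unknown location" : location)
                + ": " + message;
    }

    /** Builds a tiny DOM by hand and describes a problem on a text node. */
    public static String demo() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element book = doc.createElement("book");
        doc.appendChild(book);
        Element title = doc.createElement("title");
        book.appendChild(title);
        Node text = title.appendChild(doc.createTextNode("Example"));

        // Stand-in value; the annotator would store a LocationData here.
        title.setUserData(LOCATION_DATA_KEY, "title-location", null);
        return describe(text, "unexpected title");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Falling back to "unknown location" keeps the error message usable for nodes that were created programmatically and never passed through the annotating filter.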
XML files can include other XML files if XInclude is enabled on the SAXParserFactory, but this technique does not currently give correct locations within the included files. See XERCESJ-1247.
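For reference, enabling XInclude on the factory might look like the sketch below. The class and method names are invented for the example, and namespace awareness is switched on because XInclude elements are identified by their namespace; the location caveat above still applies to anything pulled in this way.

```java
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

// Hypothetical demo class showing the factory settings only.
class XIncludeFactoryDemo {

    /** Creates a SAX parser with XInclude processing requested. */
    public static SAXParser newXIncludeParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);  // xi:include is a namespaced element
        factory.setXIncludeAware(true);   // may throw if the implementation lacks support
        return factory.newSAXParser();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(newXIncludeParser().isXIncludeAware());
    }
}
```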