Dealing with XML errors when loading documents is a very simple task. Using the libxml functionality it is possible to suppress all XML errors when loading the document and then iterate over the errors.
The libXMLError object, returned by libxml_get_errors(), contains several properties including the message, line and column (position) of the error.
Example #1 Loading broken XML string
<?php
libxml_use_internal_errors(true);
$sxe = simplexml_load_string("<?xml version='1.0'><broken><xml></broken>");
if ($sxe === false) {
echo "Failed loading XML\n";
foreach(libxml_get_errors() as $error) {
echo "\t", $error->message;
}
}
?>
以上例程会输出:
Failed loading XML Blank needed here parsing XML declaration: '?>' expected Opening and ending tag mismatch: xml line 1 and broken Premature end of data in tag broken line 1
openbip at gmail dot com (2010-02-25 20:20:17)
Note that "if (! $sxe) {" may give you a false-negative if the XML document was empty (e.g. "<root />"). In that case, $sxe will be:
object(SimpleXMLElement)#1 (0) {
}
which will evaluate to false, even though nothing technically went wrong.
Consider instead: "if ($sxe === false) {"
Jacob Tabak (2010-02-15 13:17:36)
If you are trying to load an XML string with some escaped and some unescaped ampersands, you can pre-parse the string to ecsape the unescaped ampersands without modifying the already escaped ones:
<?php
$s = preg_replace('/&[^; ]{0,6}.?/e', "((substr('\\0',-1) == ';') ? '\\0' : '&'.substr('\\0',1))", $s);
?>