DOMDocument

DOMDocument::saveHTML

(PHP 5)

DOMDocument::saveHTML - Dumps the internal document into a string using HTML formatting

Description

public string DOMDocument::saveHTML ([ DOMNode $node = NULL ] )

Creates an HTML document from the DOM representation. This function is usually called after building a new DOM document from scratch, as in the example below.

Parameters

node

Optional parameter to output a subset of the document.

Return Values

Returns the HTML, or FALSE if an error occurred.

Changelog

Version Description
5.3.6 The node parameter was added.

Examples

Example #1 Saving an HTML tree into a string

<?php

$doc = new DOMDocument('1.0');

$root = $doc->createElement('html');
$root = $doc->appendChild($root);

$head = $doc->createElement('head');
$head = $root->appendChild($head);

$title = $doc->createElement('title');
$title = $head->appendChild($title);

$text = $doc->createTextNode('This is the title');
$text = $title->appendChild($text);

echo $doc->saveHTML();

?>


User Contributed Notes:

mpeters at domblogger dot net (2011-11-03 00:27:24)

There is not a <script /> problem.
When a script node does not have a child and it is dumped as XML, a self closing script node is proper. Any browser with XML support will do the right thing IF you send your document with the right mime type -- application/xhtml+xml
When you dump it via saveHTML() - the script node will not be self closing.
There is however a <source /> problem.
With the new HTML5 media tags, <source src="whatever"> is not closed in HTML, so when sending the document as HTML, run preg_replace() on the output of saveHTML() to get rid of the </source> tags, which are invalid.
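
A minimal sketch of the post-processing step described above. The helper name stripSourceClosers() is hypothetical, and the input string is hand-written to stand in for saveHTML() output, because whether </source> is actually emitted may depend on the libxml2 version in use.

```php
<?php
// Hypothetical helper: removes closing </source> tags from serialized HTML.
function stripSourceClosers($html)
{
    // Case-insensitive; tolerates whitespace before the closing bracket.
    return preg_replace('~</source\s*>~i', '', $html);
}

// Hand-written stand-in for what saveHTML() might return.
$html = '<video controls><source src="a.ogv"></source><source src="a.mp4"></source></video>';

echo stripSourceClosers($html);
// <video controls><source src="a.ogv"><source src="a.mp4"></video>
?>
```

A regular expression is enough here only because a closing tag has a fixed, simple shape; it is not a general way to edit HTML.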

alvaro at demogracia dot com (2011-03-29 11:04:18)

Since PHP/5.3.6, DOMDocument->saveHTML() accepts an optional DOMNode parameter similarly to DOMDocument->saveXML():
http://bugs.php.net/bug.php?id=39771
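
As an illustration of the node parameter, the sketch below rebuilds the tree from Example #1 and serializes only the title element. It assumes PHP 5.3.6 or later; the variable name $titleHtml is just for illustration.

```php
<?php
$doc = new DOMDocument('1.0');

$root  = $doc->appendChild($doc->createElement('html'));
$head  = $root->appendChild($doc->createElement('head'));
$title = $head->appendChild($doc->createElement('title'));
$title->appendChild($doc->createTextNode('This is the title'));

// Passing a node serializes only that node and its descendants,
// so no doctype or <html> wrapper is included in the result.
$titleHtml = $doc->saveHTML($title);

echo $titleHtml;
?>
```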

Yajo (2010-11-24 07:21:13)

Another way to work around the <script/> problem is to put a semicolon (;) inside the script element.

Anonymous (2010-02-09 07:52:04)

If you want a simpler way to get around the <script> tag problem try:

<?php

$script = $doc->createElement('script');

// Creating an empty text node forces <script></script>
$script->appendChild($doc->createTextNode(''));

$head->appendChild($script);

?>

Anonymous (2009-05-12 19:35:54)

To avoid script tags from being output as <script />, you can use the DOMDocumentFragment class:

<?php

$doc = new DOMDocument();
$doc->loadXML($xmlstring);
$fragment = $doc->createDocumentFragment();
/* Append the script element to the fragment using raw XML strings (will be preserved in their raw form) and if successful proceed to insert it in the DOM tree */
if ($fragment->appendXML("<script type='text/javascript' src='$source'></script>")) {
    $xpath = new DOMXPath($doc);
    /* namespace-safe method to find all head elements which are children of the html element, should only return 1 match */
    $resultlist = $xpath->query("//*[local-name() = 'html']/*[local-name() = 'head']");
    foreach ($resultlist as $headnode) {
        // insert the script tag
        $headnode->appendChild($fragment);
    }
}
$doc->saveXML(); /* and our script tags will still be <script></script> */

?>

Bart Feenstra (2009-01-18 10:17:23)

I am using this solution to prevent tags and the doctype from being added to the HTML string automatically:

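<?php
$html = '<h1>Hello world!</h1>';
// Wrap the fragment in a <div> so it can be located after parsing
$html = '<div>' . $html . '</div>';
$doc = new DOMDocument;
$doc->loadHTML($html);
// Strip the leading "<div>" (5 characters) and the trailing "</div>" (6 characters)
echo substr($doc->saveXML($doc->getElementsByTagName('div')->item(0)), 5, -6);

// Outputs: "<h1>Hello world!</h1>"
?>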

m at hbblogs daught calm (2008-08-18 08:41:41)

This method, as of 5.2.6, will automatically add <html>, <body> and <!DOCTYPE> tags to the document if they are missing, without asking whether you want them. In my application, I needed to use the DOM methods to manipulate just a fragment of HTML, so these tags were rather unhelpful.
Here's a simple hack to remove them in case, like me, all you wanted to do was perform a few operations on an HTML fragment.
$html_fragment = preg_replace('/^<!DOCTYPE.+?>/', '', str_replace( array('<html>', '</html>', '<body>', '</body>'), array('', '', '', ''), $dom->saveHTML()));
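
Where the node parameter is available (PHP 5.3.6 or later), string surgery can be avoided by serializing only the children of <body>. This is a sketch under that assumption, not the method used in the note above; the helper name fragmentToHtml() is hypothetical, and the exact whitespace between the serialized nodes may vary with the libxml2 version.

```php
<?php
// Hypothetical helper: returns the markup of a fragment after a DOM round trip,
// without the doctype, <html> or <body> wrappers.
function fragmentToHtml($fragment)
{
    $doc = new DOMDocument();
    // Wrap the fragment explicitly so a <body> element is always present;
    // @ suppresses parser warnings about sloppy markup.
    @$doc->loadHTML('<html><body>' . $fragment . '</body></html>');
    $body = $doc->getElementsByTagName('body')->item(0);

    $out = '';
    foreach ($body->childNodes as $child) {
        // Serialize each top-level node of the fragment individually.
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo fragmentToHtml('<h1>Hello world!</h1><p>Second block</p>');
?>
```

Any DOM manipulation would go between loading and the serialization loop; the point is only that the wrapper elements never reach the output.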

Anonymous (2008-04-25 20:15:25)

<?php
// Serializes a node as XML, then rewrites empty HTML elements as
// self-closing tags and named entities as numeric ones.
function getDOMString($retNode) {
  if (!$retNode) return null;
  $retval = strtr($retNode->ownerDocument->saveXML($retNode),
  array(
    '></area>' => ' />',
    '></base>' => ' />',
    '></basefont>' => ' />',
    '></br>' => ' />',
    '></col>' => ' />',
    '></frame>' => ' />',
    '></hr>' => ' />',
    '></img>' => ' />',
    '></input>' => ' />',
    '></isindex>' => ' />',
    '></link>' => ' />',
    '></meta>' => ' />',
    '></param>' => ' />',
    'default:' => '',
    // sometimes, you have to decode entities too...
    '&quot;' => '&#34;',
    '&amp;' =>  '&#38;',
    '&apos;' => '&#39;',
    '&lt;' =>   '&#60;',
    '&gt;' =>   '&#62;',
    '&nbsp;' => '&#160;',
    '&copy;' => '&#169;',
    '&laquo;' => '&#171;',
    '&reg;' =>   '&#174;',
    '&raquo;' => '&#187;',
    '&trade;' => '&#8482;'
  ));
  return $retval;
}
?>

mjaque at ilkebenson dot com (2008-02-19 11:34:37)

DOMDocument->saveXML() doesn't generate a proper XHTML format either.

There is a problem with empty "script" elements. For example, this is the code generated by saveXML(), with a self-closed empty script tag:

<html>
  <head>
    <script type="text/JavaScript" src="myScript.js"/>
  </head>
  <body>
    <p>I will not appear</p>
    <script type="text/JavaScript">
    alert("Not working");
    </script>
  </body>
</html>

I don't know if this is valid XHTML (W3C Validator doesn't complain), but both FF 2.0 and IE 6 will not render it properly. Both will use </script> as the closing tag for the first script causing js errors and ignoring in between elements.

You can post-process the saveXML() string in order to close empty tags with the following function:

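<?php
    // Replaces self-closed <tag .../> occurrences with <tag ...></tag>
    function cerrarTag($tag, $xml){
        $indice = 0;
        while ($indice < strlen($xml)){
            $pos = strpos($xml, "<$tag ", $indice);
            // compare against false so a match at offset 0 is not skipped
            if ($pos !== false){
                $posCierre = strpos($xml, ">", $pos);
                if ($xml[$posCierre-1] == "/"){
                    // replace the two characters "/>" with an explicit closing tag
                    $xml = substr_replace($xml, "></$tag>", $posCierre-1, 2);
                }
                $indice = $posCierre;
            }
            else break;
        }
        return $xml;
    }
?>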

At least script and select empty elements should be closed. This example shows how it can be used:

<?php
    define("CABECERA_XHTML", '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">');

    $xhtml = $docXML->saveXML($docXML->documentElement);
    $xhtml = cerrarTag("script", $xhtml);
    $xhtml = cerrarTag("select", $xhtml);
    $xhtml = CABECERA_XHTML."\n".$xhtml;
    echo $xhtml;
?>
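
As an alternative to post-processing the string, saveXML() accepts the LIBXML_NOEMPTYTAG option, which writes an explicit closing tag for empty elements. The sketch below is not part of the note above; it builds a lone script element purely for illustration.

```php
<?php
$doc = new DOMDocument('1.0');

$script = $doc->createElement('script');
$script->setAttribute('type', 'text/javascript');
$script->setAttribute('src', 'myScript.js');
$doc->appendChild($script);

// Default XML serialization collapses the empty element.
$collapsed = $doc->saveXML($script);
// LIBXML_NOEMPTYTAG writes an explicit closing tag instead.
$expanded = $doc->saveXML($script, LIBXML_NOEMPTYTAG);

echo $collapsed, "\n"; // <script type="text/javascript" src="myScript.js"/>
echo $expanded, "\n";  // <script type="text/javascript" src="myScript.js"></script>
?>
```

Note that the option applies to every empty element being serialized, so elements such as br would come out as <br></br> as well; a targeted function like cerrarTag() may still be preferable when only some tags should be expanded.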

archanglmr at yahoo dot com (2007-11-27 15:28:44)

If you created your DOMDocument object using loadHTML() (where the source is from another site) and want to pass your changes back to the browser, you should make sure the HTTP Content-Type header matches the value of your meta content-type tag, because modern browsers seem to ignore the meta tag and trust just the HTTP header. For example, if you're reading an ISO-8859-1 document and your web server is claiming UTF-8, you need to correct it using the header() function.

<?php
header('Content-Type: text/html; charset=iso-8859-1');
?>

xoplqox (2007-11-20 11:07:44)

XHTML:
If the output should be XHTML, use saveXML() instead of saveHTML().
Output example for saveHTML:
<select name="pet" size="3" multiple>
<option selected>mouse</option>
<option>bird</option>
<option>cat</option>
</select>
XHTML-conforming output using saveXML:
<select name="pet" size="3" multiple="multiple">
<option selected="selected">mouse</option>
<option>bird</option>
<option>cat</option>
</select>
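
The difference can be reproduced on a single element. This is a sketch that assumes PHP 5.3.6 or later for the node parameter; the saveHTML() result is expected to carry the minimized attribute form shown above, but its exact text is left unstated here because it comes from the underlying libxml2 HTML serializer.

```php
<?php
$doc = new DOMDocument('1.0');

$option = $doc->createElement('option', 'mouse');
$option->setAttribute('selected', 'selected');
$doc->appendChild($option);

// XML serialization always writes the attribute value.
$asXml = $doc->saveXML($option);
// HTML serialization may minimize boolean attributes such as "selected".
$asHtml = $doc->saveHTML($option);

echo $asXml, "\n"; // <option selected="selected">mouse</option>
echo $asHtml, "\n";
?>
```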

tyson at clugg dot net (2005-04-21 17:44:56)

<?php
// Using DOM to fix sloppy HTML.
// An example by Tyson Clugg <tyson@clugg.net>
//
// vim: syntax=php expandtab tabstop=2

function tidyHTML($buffer)
{
  // load our document into a DOM object
  // (loadHTML() is called on an instance; a static call fails in PHP 8)
  $dom = new DOMDocument();
  @$dom->loadHTML($buffer);
  // we want nice output
  $dom->formatOutput = true;
  return($dom->saveHTML());
}

// start output buffering, using our nice
// callback function to format the output.
ob_start("tidyHTML");

?>
<html>
<p>It's like comparing apples to oranges.
</html>
<?php

// this will be called implicitly, but we'll
// call it manually to illustrate the point.
ob_end_flush();

?>

The above code takes sloppy HTML:
 <html>
 <p>It's like comparing apples to oranges.
 </html>

And cleans it up to the following:
 <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
 <html><body><p>It's like comparing apples to oranges.
 </p></body></html>
