htmlbalance: don't compact whitespace, and set misc other options
author Simon McVittie <smcv@ http://smcv.pseudorandom.co.uk/>
Tue, 18 Nov 2008 11:25:13 +0000 (11:25 +0000)
committer Joey Hess <joey@kodama.kitenet.net>
Fri, 12 Dec 2008 19:23:12 +0000 (14:23 -0500)
Not compacting whitespace is the most important one: now that we run
sanitize hooks on individual posted comments in the comments plugin,
whitespace that is significant to Markdown (but not to HTML) would
otherwise be lost.
(cherry picked from commit cb5aaa3cee8b35d6fc6e88a7449a9477a6587c7a)
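
To illustrate the whitespace problem described above, here is a minimal sketch comparing a default HTML::TreeBuilder parse with one using the options this commit sets. The sample comment text and the flatten helper are hypothetical and are not part of the plugin; the helper only approximates the plugin's serialization loop. It assumes HTML::TreeBuilder is installed.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical comment body: the four-space indent marks a Markdown code block.
my $content = "Some text:\n\n    my \$x = 1;\n    print \$x;\n";

# Hypothetical helper: flatten the parsed fragment back into a string.
# This is a simplification, not the plugin's actual serialization.
sub flatten {
    my ($tree) = @_;
    my $ret = '';
    foreach my $node ($tree->disembowel()) {
        # Element nodes are objects; plain text nodes are ordinary strings.
        $ret .= ref $node ? $node->as_HTML() : $node;
    }
    return $ret;
}

# Old behaviour: parse with the default options.
my $compacted = flatten(HTML::TreeBuilder->new_from_content($content));

# New behaviour: the same options this commit sets before parsing.
my $tree = HTML::TreeBuilder->new();
$tree->ignore_unknown(0);
$tree->ignore_ignorable_whitespace(0);
$tree->no_space_compacting(1);
$tree->p_strict(1);
$tree->store_comments(0);
$tree->store_declarations(0);
$tree->store_pis(0);
$tree->parse_content($content);
my $preserved = flatten($tree);

print "default:   [$compacted]\n";
print "preserved: [$preserved]\n";
```

Under the documented default, each run of whitespace outside PRE or TEXTAREA is collapsed to a single space, so the indent that Markdown needs should survive only in the second result.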

IkiWiki/Plugin/htmlbalance.pm

index 3a2d62d15d47feb20886b594a51d3bc231e62a1b..dcd92055fbebcdeefdbf07d45337e7f0b2f9e360 100644 (file)
@@ -30,7 +30,15 @@ sub sanitize (@) { #{{{
        my %params=@_;
        my $ret = '';
 
-       my $tree = HTML::TreeBuilder->new_from_content($params{content});
+       my $tree = HTML::TreeBuilder->new();
+       $tree->ignore_unknown(0);
+       $tree->ignore_ignorable_whitespace(0);
+       $tree->no_space_compacting(1);
+       $tree->p_strict(1);
+       $tree->store_comments(0);
+       $tree->store_declarations(0);
+       $tree->store_pis(0);
+       $tree->parse_content($params{content});
        my @nodes = $tree->disembowel();
        foreach my $node (@nodes) {
                if (ref $node) {