htmlbalance: don't compact whitespace, and set misc other options
author Simon McVittie <smcv@ http://smcv.pseudorandom.co.uk/>
Tue, 18 Nov 2008 11:25:13 +0000 (11:25 +0000)
committer Simon McVittie <smcv@ http://smcv.pseudorandom.co.uk/>
Thu, 11 Dec 2008 21:14:03 +0000 (21:14 +0000)
Not compacting whitespace is the most important of these changes: now
that the comments plugin runs sanitize hooks on individual posted
comments, compacting discards whitespace that is significant to
Markdown (but not to HTML).
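
The effect can be illustrated with a minimal sketch, assuming
HTML::TreeBuilder and HTML::Entities are installed. The helper name
roundtrip and its $keep_whitespace flag are hypothetical and are not
part of IkiWiki; the sketch only compares a default parse with one that
sets the two whitespace-related options from this commit.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

# Hypothetical helper, not IkiWiki code: parse a fragment, then
# serialise the contents of the implicit <body> back to a string.
sub roundtrip {
    my ($content, $keep_whitespace) = @_;

    my $tree = HTML::TreeBuilder->new();
    if ($keep_whitespace) {
        # The two whitespace-related options set by this commit.
        $tree->ignore_ignorable_whitespace(0);
        $tree->no_space_compacting(1);
    }
    $tree->parse_content($content);

    my $ret = '';
    foreach my $node ($tree->disembowel()) {
        if (ref $node) {
            # Element node: serialise with all end tags, no indenting.
            $ret .= $node->as_HTML('<>&', undef, {});
            $node->delete();
        }
        else {
            # Plain text node: re-escape it.
            $ret .= encode_entities($node, '<>&');
        }
    }
    $tree->delete();
    return $ret;
}

# Two trailing spaces (hard line break) and a four-space indent
# (code block) both carry meaning in Markdown.
my $comment = "First line  \nsecond line\n\n    indented code\n";

my $compacted = roundtrip($comment, 0);
my $preserved = roundtrip($comment, 1);

print "compacted: [$compacted]\n";
print "preserved: [$preserved]\n";
```

With the default settings, runs of whitespace in text nodes are
collapsed, so the four-space indent no longer reaches Markdown; with
the options set, the text should pass through with its spacing intact.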

IkiWiki/Plugin/htmlbalance.pm

index 3a2d62d15d47feb20886b594a51d3bc231e62a1b..dcd92055fbebcdeefdbf07d45337e7f0b2f9e360 100644
@@ -30,7 +30,15 @@ sub sanitize (@) { #{{{
        my %params=@_;
        my $ret = '';
 
-       my $tree = HTML::TreeBuilder->new_from_content($params{content});
+       my $tree = HTML::TreeBuilder->new();
+       $tree->ignore_unknown(0);
+       $tree->ignore_ignorable_whitespace(0);
+       $tree->no_space_compacting(1);
+       $tree->p_strict(1);
+       $tree->store_comments(0);
+       $tree->store_declarations(0);
+       $tree->store_pis(0);
+       $tree->parse_content($params{content});
        my @nodes = $tree->disembowel();
        foreach my $node (@nodes) {
                if (ref $node) {