Data::Dumper: Option to avoid building much of the seen hash

If the "$Sparseseen" option is set by the user, Data::Dumper eschews building the seen-this-scalar hash for ALL SCALARS and instead adds only those that have a refcount > 1.

Since the seen hash is exposed to the user in the OO interface (rats!), this needs to be opt-in if the OO interface is used. If the DD constructor is called from Dumpxs (because the user used the functional interface, as is customary), then this option could be implicitly enabled, since in those cases the seen hash is never visible to the user.

In my real-world-data benchmark, setting this option speeds up serialization by about 50%!

This is really Yves Orton's idea. I'm just the code monkey on this one.
author: Steffen Mueller <smueller@cpan.org> 2012-08-02 18:51:19 +0200
committer: Steffen Mueller <smueller@cpan.org> 2012-08-02 20:09:10 +0200
commit: d424882cc3537598b5c65fc8a4426bf49da5d903 (patch)
tree: cfdfadd9c521eab55bdce01b86502162837c97b8 /dist/Data-Dumper/Dumper.pm
parent: 08b2a930f16c631ad58d4ec6d184e81c0a4ec7b6 (diff)
download: perl-d424882cc3537598b5c65fc8a4426bf49da5d903.tar.gz
1 files changed, 27 insertions, 0 deletions
diff --git a/dist/Data-Dumper/Dumper.pm b/dist/Data-Dumper/Dumper.pm
index a5a6b312f5..a7dc82f9cb 100644
--- a/dist/Data-Dumper/Dumper.pm
+++ b/dist/Data-Dumper/Dumper.pm
@@ -55,6 +55,7 @@ $Pair       = ' => '    unless defined $Pair;
 $Useperl    = 0         unless defined $Useperl;
 $Sortkeys   = 0         unless defined $Sortkeys;
 $Deparse    = 0         unless defined $Deparse;
+$Sparseseen = 0         unless defined $Sparseseen;
 
 #
 # expects an arrayref of values to be dumped.
@@ -94,6 +95,7 @@ sub new {
 	     useperl    => $Useperl,    # use the pure Perl implementation
 	     sortkeys   => $Sortkeys,   # flag or filter for sorting hash keys
 	     deparse	=> $Deparse,	# use B::Deparse for coderefs
+             noseen     => $Sparseseen, # do not populate the seen hash unless necessary
 	   };
 
   if ($Indent > 0) {
@@ -700,6 +702,11 @@ sub Deparse {
   defined($v) ? (($s->{'deparse'} = $v), return $s) : $s->{'deparse'};
 }
 
+sub Sparseseen {
+  my($s, $v) = @_;
+  defined($v) ? (($s->{'noseen'} = $v), return $s) : $s->{'noseen'};
+}
+
 # used by qquote below
 my %esc = (  
     "\a" => "\\a",
@@ -1099,6 +1106,26 @@ XSUB implementation doesn't support it.
 Caution : use this option only if you know that your coderefs will be
 properly reconstructed by C<B::Deparse>.
 
+=item *
+
+$Data::Dumper::Sparseseen I<or>  $I<OBJ>->Sparseseen(I<[NEWVAL]>)
+
+By default, Data::Dumper builds up the "seen" hash of scalars that
+it has encountered during serialization. This is very expensive.
+This seen hash is necessary to support and even just detect circular
+references. It is exposed to the user via the C<Seen()> call both
+for writing and reading.
+
+If you, as a user, do not need explicit access to the "seen" hash,
+then you can set the C<Sparseseen> option to allow Data::Dumper
+to eschew building the "seen" hash for scalars that are known not
+to possess more than one reference. This speeds up serialization
+considerably if you use the XS implementation.
+
+Note: If you turn on C<Sparseseen>, then you must not rely on the
+content of the seen hash since its contents will be an
+implementation detail!
+
 =back
 
 =head2 Exports
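
For illustration, here is a minimal sketch of how the option added by this patch might be used from both interfaces. It assumes a Data::Dumper release that already includes C<Sparseseen>; the variable names ($data, $cycle and so on) are hypothetical and not part of the patch. Sortkeys is set only to make the two outputs comparable.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data, not taken from the patch.
my $data = { name => 'example', list => [ 1, 2, 3 ] };

# Functional interface: localize the package variable so the
# setting does not leak into other callers of Dumper().
my $functional = do {
    local $Data::Dumper::Sparseseen = 1;
    local $Data::Dumper::Sortkeys   = 1;
    Dumper($data);
};

# OO interface: the accessor returns the object when given a
# defined value, so calls can be chained.
my $dumper = Data::Dumper->new( [$data] );
$dumper->Sortkeys(1)->Sparseseen(1);
my $oo = $dumper->Dump;

# A self-referencing structure: the hash has more than one
# reference to it, so it should still be tracked and the cycle
# should still be detected with Sparseseen enabled.
my $cycle = {};
$cycle->{self} = $cycle;
my $cyclic_out = do {
    local $Data::Dumper::Sparseseen = 1;
    Dumper($cycle);
};

print $functional, $oo, $cyclic_out;
```

As the documentation in the patch notes, code that enables the option should not read back the seen hash via Seen(), since its contents become an implementation detail.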