author | Jarkko Hietaniemi <jhi@iki.fi> | 2001-06-16 21:47:00 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2001-06-16 21:47:00 +0000 |
commit | a0cb39004565ec2396fbdb3f1949b8f13304208e (patch) | |
tree | 67b23b5671a1bf84313263478ddd1c4894a7b7ad /lib/Memoize/TODO | |
parent | 58a21a9b07f5f6666d09bb8c0b9bf9150baca513 (diff) | |
Integrate Memoize 0.64. Few tweaks were required in
the test scripts. Note that the speed and expire*
tests take several dozen seconds to run.
p4raw-id: //depot/perl@10645
Diffstat (limited to 'lib/Memoize/TODO')
-rw-r--r-- | lib/Memoize/TODO | 335 |
1 files changed, 335 insertions, 0 deletions
diff --git a/lib/Memoize/TODO b/lib/Memoize/TODO
new file mode 100644
index 0000000000..db0843b2a9
--- /dev/null
+++ b/lib/Memoize/TODO
@@ -0,0 +1,335 @@
+# Version 0.05 alpha $Revision: 1.5 $ $Date: 1999/09/17 14:57:55 $
+
+=head1 TO DO
+
+=over 4
+
+=item *
+
+LIST_CACHE doesn't work with ties to most DBM implementations, because
+Memoize tries to save a listref, and DB_File etc. can only store
+strings. This should at least be documented. Maybe Memoize could
+detect the problem at TIE time and throw a fatal error.
+
+Try out MLDBM here and document it if it works.
+
+=item *
+
+We should extend the benchmarking module to allow
+
+    timethis(main, { MEMOIZED => [ suba, subb ] })
+
+What would this do? It would time C<main> three times, once with
+C<suba> and C<subb> unmemoized, twice with them memoized.
+
+Why would you want to do this? By the third set of runs, the memo
+tables would be fully populated, so all calls by C<main> to C<suba>
+and C<subb> would return immediately. You would be able to see how
+much of C<main>'s running time was due to time spent computing in
+C<suba> and C<subb>. If that was just a little time, you would know
+that optimizing or improving C<suba> and C<subb> would not have a
+large effect on the performance of C<main>. But if there was a big
+difference, you would know that C<suba> or C<subb> was a good
+candidate for optimization if you needed to make C<main> go faster.
+
+Done.
+
+=item *
+
+Perhaps C<memoize> should return a reference to the original function
+as well as one to the memoized version? But the programmer could
+always construct such a reference themselves, so perhaps it's not
+necessary. We save such a reference anyway, so a new package method
+could return it on demand even if it wasn't provided by C<memoize>.
+We could even bless the new function reference so that it could have
+accessor methods for getting to the original function, the options,
+the memo table, etc.
+
+Naah.
+
+=item *
+
+The TODISK feature is not ready yet. It will have to be rather
+complicated, providing options for which disk method to use (GDBM?
+DB_File? Flat file? Storable? User-supplied?) and which stringizing
+method to use (FreezeThaw? Marshal? User-supplied?)
+
+Done!
+
+=item *
+
+Maybe an option for automatic expiration of cache values? (`After one
+day,' `After five uses,' etc.) Also possibly an option to limit the
+number of active entries with automatic LRU expiration.
+
+You have a long note to Mike Cariaso that outlines a good approach
+that you sent on 9 April 1999.
+
+What's the timeout stuff going to look like?
+
+    EXPIRE_TIME => time_in_sec
+    EXPIRE_USES => num_uses
+    MAXENTRIES => n
+
+perhaps? Is EXPIRE_USES actually useful?
+
+19990916: Memoize::Expire does EXPIRE_TIME and EXPIRE_USES.
+MAXENTRIES can come later as a separate module.
+
+=item *
+
+Put in a better example than C<fibo>. Show an example of a
+nonrecursive function that simply takes a long time to run.
+C<getpwuid> for example? But this exposes the bug that you can't say
+C<memoize('getpwuid')>, so perhaps it's not a very good example.
+
+Well, I did add the ColorToRGB example, but it's still not so good.
+These examples need a lot of work. C<factorial> might be a better
+example than C<fibo>.
+
+=item *
+
+Add more regression tests for normalizers.
+
+=item *
+
+Maybe resolve normalizer function to code-ref at memoize time instead
+of at function call time for efficiency? I think there was some
+reason not to do this, but I can't remember what it was.
+
+=item *
+
+Add more array value tests to the test suite.
+
+Does it need more now?
+
+=item *
+
+Fix that `Subroutine u redefined ... line 484' message.
+
+Fixed, I think.
+
+=item *
+
+Get rid of any remaining *{$ref}{CODE} or similar magic hashes.
+
+=item *
+
+There should be an option to dump out the memoized values or to
+otherwise traverse them.
+
+What for?
+
+Maybe the tied hash interface takes care of this anyway?
+
+=item *
+
+Include an example that caches DNS lookups.
+
+=item *
+
+Make tie for Storable (Memoize::Storable)
+
+A prototype of Memoize::Storable is finished. Test it and add to the
+test suite.
+
+Done.
+
+=item *
+
+Make tie for DBI (Memoize::DBI)
+
+=item *
+
+I think there's a bug. See `###BUG'.
+
+=item *
+
+Storable probably can't be done, because it doesn't allow updating.
+Maybe a different interface that supports readonly caches fronted by a
+writable in-memory cache? A generic tied hash maybe?
+
+    FETCH {
+        if (it's in the memory hash) {
+            return it
+        } elsif (it's in the readonly disk hash) {
+            return it
+        } else {
+            not-there
+        }
+    }
+
+    STORE {
+        put it into the in-memory hash
+    }
+
+Maybe `save' and `restore' methods?
+
+It isn't working right because the destructor doesn't get called at
+the right time.
+
+This is fixed. `use strict vars' would have caught it immediately. Duh.
+
+=item *
+
+Don't forget about generic interface to Storable-like packages
+
+=item *
+
+
+Maybe add in TODISK after all, with TODISK => 'filename' equivalent to
+
+    SCALAR_CACHE => [TIE, Memoize::SDBM_File, $filename, O_RDWR|O_CREAT, 0666],
+    LIST_CACHE => MERGE
+
+=item *
+
+Maybe the default for LIST_CACHE should be MERGE anyway.
+
+=item *
+
+There's some terrible bug probably related to use under threaded perl,
+possibly connected with line 56:
+
+    my $wrapper = eval "sub { unshift \@_, qq{$cref}; goto &_memoizer; }";
+
+I think because C<@_> is lexically scoped in threadperl, the effect of
+C<unshift> never makes it into C<_memoizer>. That's probably a bug in
+Perl, but maybe I should work around it. Can anyone provide more
+information here, or lend me a machine with threaded Perl where I can
+test this theory? Line 59, currently commented out, may fix the
+problem.
+
+=item *
+
+Maybe if the original function has a prototype, the module can use
+that to select the most appropriate default normalizer. For example,
+if the prototype was C<($)>, there's no reason to use `join'. If it's
+C<(\@)> then it can use C<join $;,@{$_[0]};> instead of C<join $;,@_;>.
+
+=item *
+
+Ariel Scolnikov suggests using the change counting problem as an
+example. (How many ways to make change of a dollar?)
+
+=item *
+
+I found a use for `unmemoize'. If you're using the Storable glue, and
+your program gets SIGINT, you find that the cache data is not in the
+cache, because Perl normally writes it all out at once from a
+DESTROY method, and signals skip DESTROY processing. So you could add
+
+    $SIG{INT} = sub { unmemoize ... };
+
+(Jonathan Roy pointed this out)
+
+=item *
+
+This means it would be useful to have a method to return references to
+all the currently-memoized functions so that you could say
+
+    $SIG{INT} = sub { for $f (Memoize->all_memoized) {
+                        unmemoize $f;
+                      }
+                    }
+
+
+=item *
+
+19990917 There should be a call you can make to get back the cache
+itself. If there were, then you could delete stuff from it to
+manually expire data items.
+
+=item *
+
+19990925 Randal says that the docs for Memoize::Expire should make it
+clear that the expired entries are never flushed all at once. He
+asked if you would need to do that manually. I said:
+
+    Right, if that's what you want. If you have EXISTS return false,
+    it'll throw away the old cached item and replace it in the cache
+    with a new item. But if you want the cache to actually get smaller,
+    you have to do that yourself.
+
+    I was planning to build an Expire module that implemented an LRU
+    queue and kept the cache at a constant fixed size, but I didn't get
+    to it yet. It's not clear to me that the automatic emptying-out
+    behavior is very useful anyway. The whole point of a cache is to
+    trade space for time, so why bother going through the cache to throw
+    away old items before you need to?
+
+Randal then pointed out that it could discard expired items at DESTROY
+or TIEHASH time, which seemed like a good idea, because if the cache
+is on disk you might like to keep it as small as possible.
+
+=item *
+
+19991219 Philip Gwyn suggests this technique: You have a load_file
+function that memoizes the file contents. But then if the file
+changes you get the old contents. So add a normalizer that does
+
+    return join $;, (stat($_[0]))[9], $_[0];
+
+Now when the modification date changes, the true key returned by the
+normalizer is different, so you get a cache miss and it loads the new
+contents. Disadvantage: The old contents are still in the cache. I
+think it makes more sense to have a special expiration manager for
+this. Make one up and bundle it.
+
+19991220 I have one written: Memoize::ExpireFile. But how can you
+make this work when the function might have several arguments, of
+which some are filenames and some aren't?
+
+=item *
+
+19991219 There should be an inheritable TIEHASH method that does the
+argument processing properly.
+
+19991220 Philip Gwyn contributed a patch for this.
+
+20001231 You should really put this in. Jonathan Roy uncovered a
+problem that it will be needed to solve. Here's the problem: He has:
+
+    memoize "get_items",
+      LIST_CACHE => ["TIE", "Memoize::Expire",
+                     LIFETIME => 86400,
+                     TIE => ["DB_File", "debug.db", O_CREAT|O_RDWR, 0666]
+                    ];
+
+This won't work, because memoize is trying to store listrefs in a
+DB_File. He would have gotten a fatal error if he had done this:
+
+    memoize "get_items",
+      LIST_CACHE => ["TIE", "DB_File", "debug.db", O_CREAT|O_RDWR, 0666];
+
+
+But in this case, he tied the cache to Memoize::Expire, which is *not*
+scalar-only, and the check for scalar-only ties is missing from
+Memoize::Expire. The inheritable method can take care of this.
+
+=item *
+
+20001130 Custom cache manager that checks to make sure the function
+return values actually match the memoized values.
+
+=item *
+
+20001231 Expiration manager that watches cache performance and
+accumulates statistics. Variation: Have it automatically unmemoize
+the function if performance is bad.
+
+=item *
+
+20010517 Option to have normalizer *modify* @_ for use by memoized
+function. This would save code and time in cases like the one in the
+manual under 'NORMALIZER', where both f() and normalize_f() do the
+same analysis and make the same adjustments to the hash. If the
+normalizer could make the adjustments and save the changes in @_, you
+wouldn't have to do it twice.
+
+=item *
+There was probably some other stuff that I forgot.
+
+
+
+=back
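
The 19991219 item in the TODO (Philip Gwyn's suggestion) describes a normalizer that folds a file's modification time into the cache key. The following is a minimal sketch of that idea and is not part of this commit. The names `load_file`, `mtime_key`, `write_file`, and the `$loads` counter are hypothetical and exist only for illustration. The `stat` slice is written as `(stat($_[0]))[9]` so that it compiles as a list slice, and the demo pins mtimes with `utime` rather than waiting for the clock to advance.

```perl
use strict;
use warnings;
use Memoize;
use File::Temp qw(tempfile);

my $loads = 0;    # counts real reads so cache hits are visible

# Hypothetical loader: slurp a whole file and return its contents.
sub load_file {
    my ($path) = @_;
    $loads++;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                 # slurp mode
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Normalizer in the spirit of the TODO note: mtime plus filename.
sub mtime_key {
    return join $;, (stat($_[0]))[9], $_[0];
}

memoize('load_file', NORMALIZER => 'mtime_key');

# Hypothetical helper: write text and force a known mtime.
sub write_file {
    my ($path, $text, $mtime) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $text;
    close $out;
    utime $mtime, $mtime, $path or die "utime failed: $!";
}

my (undef, $path) = tempfile(UNLINK => 1);
write_file($path, "first\n", 1_000_000);
my $a1 = load_file($path);    # miss: reads the file
my $a2 = load_file($path);    # hit: same mtime, so same key
write_file($path, "second\n", 2_000_000);
my $b1 = load_file($path);    # miss: mtime changed, so new key
```

As the TODO item itself points out, the entry stored under the old key stays in the cache; this sketch does nothing to evict it.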