[prev] [thread] [next] [lurker] [Date index for 2003/05/29]
OK. I admit that for >50% of the subject I don't know what I'm doing. (ie 100% of the "mariachi" bit) I used the 6M mailbox from p5p for May 2003, on mirth: $ /usr/local/perl5.8.1/bin/perl5.8.1 -d:DProf mariachi 2003-05.mbox foo Should pass a list title reticulating splines 0.000 elapsed 0.000 total 1200 messages load 1293 22.714 elapsed 22.714 total dropped 0 duplicate messages dedupe 0.091 elapsed 22.804 total thread 6.134 elapsed 28.938 total sanity 0.599 elapsed 29.537 total order 1.831 elapsed 31.368 total sanity 0.668 elapsed 32.037 total strand 0.735 elapsed 32.771 total splicing threads in 45 places deep threads split up 0.619 elapsed 33.390 total sanity 0.670 elapsed 34.060 total order 0.981 elapsed 35.041 total index 21 thread indexes 12.229 elapsed 47.271 total date indexes 11.769 elapsed 59.040 total message 1200 message bodies 95.481 elapsed 154.521 total generate 0.008 elapsed 154.529 total So it looks like "message bodies" is the first thing to attack. dprofpp says: $ dprofpp tmon.out Total Elapsed Time = 115.3059 Seconds User+System Time = 113.1859 Seconds Exclusive Times %Time ExclSec CumulS #Calls sec/call Csec/c Name 13.3 15.06 19.496 3129 0.0048 0.0062 Email::Find::find 13.2 15.03 18.450 3129 0.0048 0.0059 URI::Find::find 8.30 9.395 149.71 8777 0.0011 0.0171 Template::Document::__ANON__ 7.83 8.867 8.866 1293 0.0069 0.0069 Email::Simple::_read_headers 7.71 8.723 20.316 173776 0.0001 0.0001 Template::Stash::XS::get 5.73 6.488 6.486 1293 0.0050 0.0050 Email::Simple::_split_head_from_bo dy 4.22 4.778 5.116 34375 0.0001 0.0001 Memoize::_memoizer 2.84 3.209 2.732 159192 0.0000 0.0000 Class::Accessor::Fast::__ANON__ 2.32 2.625 159.53 8777 0.0003 0.0182 Template::Context::process 1.87 2.119 2.750 11049 0.0002 0.0002 Template::Plugin::Date::format 1.67 1.889 1.886 1294 0.0015 0.0015 Email::Folder::Mbox::next_message 1.50 1.699 1.630 23212 0.0001 0.0001 Template::Filters::html_filter 1.50 1.699 3.264 26341 0.0001 0.0001 Template::Context::filter 1.49 1.686 3.520 3296 0.0005 0.0011 Mail::Thread::Container::iterate_d own 1.48 1.679 6.335 10124 0.0002 0.0006 Template::Context::template 0: Observation - all of those are in modules not directly mariachi. 1: Does this list of prime time eating functions bear any relation to what people know mariachi to be doing, particularly in the slower sections in the elapsed output? 2: Is it worrying that Memoize::_memoizer shows up? Given that sub addr_regex { $Addr_spec_re } returns a constant, and the winner on time bloat is: sub find { my($self, $r_text) = @_; my $emails_found = 0; my $re = $self->addr_regex; $$r_text =~ s{($re)}{ my($replace, $found) = $self->validate($1); $emails_found += $found; $replace; }eg; return $emails_found; } would turning that s///eg into s///ego be a good idea? [it doesn't let you subclass with a dynamic return result for addr_regex] Nick
Generated at 13:56 on 01 Jul 2004 by mariachi 0.52