On Thu, May 29, 2003 at 11:30:05PM +0100, Nicholas Clark wrote:
>
> OK. I admit that for >50% of the subject I don't know what I'm doing.
> (ie 100% of the "mariachi" bit)
>
> I used the 6M mailbox from p5p for May 2003, on mirth:
>
> $ /usr/local/perl5.8.1/bin/perl5.8.1 -d:DProf mariachi 2003-05.mbox foo
> Should pass a list title
> reticulating splines 0.000 elapsed 0.000 total
> 1200 messages
> load 1293 22.714 elapsed 22.714 total
[...]
> thread indexes 12.229 elapsed 47.271 total
> date indexes 11.769 elapsed 59.040 total
> message 1200
> message bodies 95.481 elapsed 154.521 total
> generate 0.008 elapsed 154.529 total
>
> So it looks like "message bodies" is the first thing to attack.
> dprofpp says:

My feeling was to start on 'load' (and indeed I already have), which is
invoking all those C<Email::Simple::_read_headers> calls. It's important
for penderel (Paul did some benchmarking there, http://paste.husk.org/198
shows incremental "message bodies" generation really kicking ass). It's
also one of the two steps where you can't skip any of the work by doing
"this already has a html file" checking. The other is threading, though
for that we can store a pre-computed thread tree and just update it.

I guess we'll meet halfway through :)

> 0: Observation - all of those are in modules not directly mariachi.

Considering that Mariachi is two modules, and depends on many more, this
isn't a shock.

> 1: Does this list of prime time eating functions bear any relation to what
> people know mariachi to be doing, particularly in the slower sections
> in the elapsed output?

Yup. Email::Find is used by Template::Plugin::Mariachi, as is URI::Find.
That gets called for every page we output so we can mark it up nicely.

> 2: Is it worrying that Memoize::_memoizer shows up?
>

Ish. I asked for it to be there: in Mariachi::Message, filename is
memoized, as that allowed me to move that code out of C<new>, which was
previously a ratty pile.
It's a little worrying that it's called enough to show up though.

> Given that (in Email::Find)
>
> sub addr_regex { $Addr_spec_re }
>
> returns a constant, and the winner on time bloat is:
>
> sub find {
>     my($self, $r_text) = @_;
>
>     my $emails_found = 0;
>     my $re = $self->addr_regex;
>     $$r_text =~ s{($re)}{
>         my($replace, $found) = $self->validate($1);
>         $emails_found += $found;
>         $replace;
>     }eg;
>     return $emails_found;
> }
>
> would turning that s///eg into s///ego be a good idea?

Looks likely.

> [it doesn't let you subclass with a dynamic return result for addr_regex]

Ah, but if such a change is unwanted upstream then we can subclass it,
and just replace C<find> with one that does, and s'all good.

-- 
Richard Clamp <richardc@xxxxxxxxx.xxx>
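
The subclass-and-override idea above could look roughly like the sketch below. To keep it runnable without Email::Find installed, it uses a hypothetical stand-in base class (My::Find::Base) that copies only the three methods quoted in the thread; the package names, the simplified address pattern, and the constructor are assumptions for illustration, not Email::Find's real internals. The only substantive difference in the subclass is the /o flag.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Email::Find, reduced to the methods quoted in
# the thread. The real module's constructor and regex are not reproduced.
package My::Find::Base;

# Simplified address pattern, for illustration only (not RFC 822).
my $Addr_spec_re = qr/[\w.+-]+\@[\w-]+(?:\.[\w-]+)+/;

sub new        { my ($class, $cb) = @_; return bless { callback => $cb }, $class }
sub addr_regex { $Addr_spec_re }
sub validate   { my ($self, $addr) = @_; return ($self->{callback}->($addr), 1) }

sub find {
    my ($self, $r_text) = @_;
    my $emails_found = 0;
    my $re = $self->addr_regex;
    $$r_text =~ s{($re)}{
        my ($replace, $found) = $self->validate($1);
        $emails_found += $found;
        $replace;
    }eg;
    return $emails_found;
}

# The subclass: same find(), but the pattern is compiled only once (/o).
# Caveat from the thread: after the first call, a different addr_regex
# return value is ignored, so this only suits a constant regex.
package My::Find::Once;
our @ISA = ('My::Find::Base');

sub find {
    my ($self, $r_text) = @_;
    my $emails_found = 0;
    my $re = $self->addr_regex;
    $$r_text =~ s{($re)}{
        my ($replace, $found) = $self->validate($1);
        $emails_found += $found;
        $replace;
    }ego;
    return $emails_found;
}

package main;

# Wrap each address in angle brackets so the substitution is visible.
my $finder = My::Find::Once->new(sub { "<$_[0]>" });
my $text   = 'mail alice@example.org or bob@example.com';
my $count  = $finder->find(\$text);
print "$count: $text\n";
```

Whether /o actually buys anything is worth measuring with the profiler first: when the pattern is already a precompiled qr// object, the saving may turn out to be small.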
Generated at 13:56 on 01 Jul 2004 by mariachi 0.52