Re: [siesta-dev] plugin idea


From: Simon Wistow
Subject: Re: [siesta-dev] plugin idea
Date: 13:48 on 20 Oct 2003
On Mon, Oct 20, 2003 at 01:44:04PM +0100, Simon Wistow said:
> since we have algorithms to determine threadishness how hard do you 
> reckon it would be to fetch the previous mail in a thread and check to see 
> if >$threshold appears in a current mail once whitespace and new lines 
> are ignored?

Apparently I didn't explain myself very well.

Basically what I want to do is snip out stuff from a mail which is the 
previous mail in the thread copied in verbatim.

The best way I can think of doing this is:

1. Work out thread parent of current mail.
2. Retrieve that.
3. Determine whether more than some threshold amount of the parent mail 
   is present in the current mail as one contiguous block (using n-grams?).
4. Remove said block.
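
Steps 3 and 4 could be sketched roughly as below. This is only an
illustration in Python, not Siesta code: the function names, the 0.5
default threshold and the handling of ">" quote markers are all
assumptions, and it uses difflib's longest-match search in place of
n-grams. Steps 1 and 2 (finding and fetching the thread parent) are
assumed to have happened already, so both mails arrive as plain strings.

```python
import re
from difflib import SequenceMatcher

# Tokens made up only of '>' are treated as quote markers (an assumption).
QUOTE_MARK = re.compile(r"^>+$")


def words_with_spans(text):
    """Return (word, start, end) triples, skipping bare quote markers.

    Splitting on whitespace is what makes the comparison ignore
    spacing and line breaks.
    """
    return [
        (m.group(), m.start(), m.end())
        for m in re.finditer(r"\S+", text)
        if not QUOTE_MARK.match(m.group())
    ]


def snip_quoted_parent(parent, current, threshold=0.5):
    """Remove the longest contiguous run of parent words from current.

    The run is removed only if it covers more than `threshold` of the
    parent's words; otherwise current is returned unchanged.
    """
    p = words_with_spans(parent)
    c = words_with_spans(current)
    if not p or not c:
        return current

    matcher = SequenceMatcher(
        None, [w for w, _, _ in p], [w for w, _, _ in c], autojunk=False
    )
    match = matcher.find_longest_match(0, len(p), 0, len(c))
    if match.size / len(p) <= threshold:
        return current

    # Map the matched words back to character offsets in the current mail.
    start = c[match.b][1]
    end = c[match.b + match.size - 1][2]
    # Also swallow quote markers and indentation just before the block.
    while start > 0 and current[start - 1] in "> \t":
        start -= 1
    return current[:start] + current[end:]


parent_mail = "I think we should use n-grams\nfor this plugin.\n"
current_mail = (
    "On Monday, Bob said:\n"
    "> I think we should use\n"
    "> n-grams for this plugin.\n"
    "\n"
    "Sounds good to me.\n"
)
snipped = snip_quoted_parent(parent_mail, current_mail)
```

Working on words rather than lines means a re-wrapped quote still
matches, at the cost of leaving the attribution line and some blank
lines behind after the cut.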


Simon

