Copyright 2006 Neil Edelman, distributed under the terms of the GNU General Public License 3.

In 2006, after attending a talk on bioinformatics, I had the idea of making an email client that would apply the methods of bioinformatics to spam detection.

FindPatterns searches through its input and outputs the sequences that are repeated. Because it is intended for text files, control characters are ignored.
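As a rough illustration of the idea, here is a minimal sketch in C. It is not the code FindPatterns uses: the function names `strip_control` and `longest_repeat` are hypothetical, the search is a brute-force comparison of every pair of positions, and it reports only the single longest repeat rather than every repeated sequence.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src to dst without control characters.
   dst must have room for strlen(src) + 1 bytes. Returns the new length. */
size_t strip_control(const char *src, char *dst) {
    size_t n = 0;
    assert(src != NULL && dst != NULL);
    for (; *src != '\0'; src++) {
        if (!iscntrl((unsigned char)*src)) dst[n++] = *src;
    }
    dst[n] = '\0';
    return n;
}

/* Hypothetical helper: finds the longest substring that occurs twice
   without overlapping itself. Returns its length and stores the index of
   its first occurrence in *start, or returns 0 (leaving *start alone) if
   no repeat is at least min_len symbols long. */
size_t longest_repeat(const char *text, size_t min_len, size_t *start) {
    size_t len, best = 0, i, j, k;
    assert(text != NULL && start != NULL);
    len = strlen(text);
    for (i = 0; i < len; i++) {
        for (j = i + 1; j < len; j++) {
            /* Extend while both copies agree and the first copy
               does not run into the second. */
            k = 0;
            while (j + k < len && i + k < j && text[i + k] == text[j + k]) k++;
            if (k >= min_len && k > best) {
                best = k;
                *start = i;
            }
        }
    }
    return best;
}
```

For the input `abcXabcY` and a minimum length of 3, `longest_repeat` finds `abc` at index 0. The brute-force loop is quadratic in the input length, so it is only suitable for short inputs.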

`FindPatterns [filename] [-b] [-e] [-i] [-o] [-v] [-m<n>] [-l<n>] [-g<n>] [-?|h]`

- `filename`: Attempt to read input from this file; otherwise use stdin.
- `-b`: Keep a buffer to count repeated matches (`!o -> b`: turning `-o` off implies `-b`).
- `-e`: Echo input.
- `-i`: Case-insensitive (not implemented).
- `-n`: Don't display matches at the end.
- `-o`: Output matches immediately as they are found.
- `-s`: Silent mode: plain output with no extra characters.
- `-v`: Verbose comments while outputting.
- `-g<n>`: Set memory buffer granularity to the closest power of two lower than `<n>` bytes (default 1024).
- `-l<n>`: Set match limit to `<n>` matches (default 4096; 0 -> no limit).
- `-m<n>`: Set minimum match length to `<n>` symbols (default 3).
- `-?` or `-h`: Display this help screen and exit.

Appending `-` to a switch, as in `-<s>-`, turns switch `<s>` off.
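The switch syntax can be illustrated with a short C sketch. This is an assumption about how such arguments might be parsed, not the program's actual parser: `struct options`, its field names, and `parse_switch` are all hypothetical, and only a few of the switches are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical option set; field names are illustrative only. */
struct options {
    int echo;                 /* -e */
    int immediate;            /* -o */
    int verbose;              /* -v */
    unsigned long min_length; /* -m<n>, default 3 */
    unsigned long limit;      /* -l<n>, default 4096, 0 = no limit */
};

/* Applies one argument such as "-e", "-e-" or "-m5" to *opt.
   Returns 1 if the argument was understood, 0 otherwise. */
int parse_switch(const char *arg, struct options *opt) {
    const char *rest;
    char name;
    int on;
    assert(arg != NULL && opt != NULL);
    if (arg[0] != '-' || arg[1] == '\0') return 0;
    name = arg[1];
    rest = arg + 2;
    if (name == 'm' || name == 'l') {
        /* Numeric switches: require digits right after the letter. */
        char *end;
        unsigned long n;
        if (!isdigit((unsigned char)*rest)) return 0;
        n = strtoul(rest, &end, 10);
        if (*end != '\0') return 0;
        if (name == 'm') opt->min_length = n; else opt->limit = n;
        return 1;
    }
    /* Boolean switches: "-s" turns s on, "-s-" turns it off. */
    if (rest[0] == '\0') on = 1;
    else if (rest[0] == '-' && rest[1] == '\0') on = 0;
    else return 0;
    switch (name) {
    case 'e': opt->echo = on; break;
    case 'o': opt->immediate = on; break;
    case 'v': opt->verbose = on; break;
    default: return 0;
    }
    return 1;
}
```

With this sketch, `-e` followed by `-e-` leaves echoing off, and `-m5` sets the minimum match length to 5. An argument that does not start with `-` is rejected, so a caller could treat it as the filename.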

Also included is KillSpam, a simple email client that takes the patterns generated by FindPatterns and eliminates every email that contains a matching pattern. This is compiled with lcc Win32.
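How KillSpam matches patterns is not described here, so the following C sketch simply assumes plain substring matching. The functions `matches_any` and `drop_matching` are hypothetical names introduced for illustration, and messages are represented as plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if message contains any of the patterns
   as a substring, 0 otherwise. Empty patterns are skipped. */
int matches_any(const char *message, const char *const *patterns, size_t count) {
    size_t i;
    assert(message != NULL && (patterns != NULL || count == 0));
    for (i = 0; i < count; i++) {
        if (patterns[i][0] != '\0' && strstr(message, patterns[i]) != NULL)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: removes every message that matches a pattern by
   compacting the array in place. Returns the number of messages kept. */
size_t drop_matching(const char *messages[], size_t n,
                     const char *const *patterns, size_t count) {
    size_t i, kept = 0;
    assert(messages != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!matches_any(messages[i], patterns, count))
            messages[kept++] = messages[i];
    }
    return kept;
}
```

The order of the surviving messages is preserved, and nothing is freed: the sketch only rearranges pointers, so the caller keeps ownership of the strings.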