Libtextcat is a library with functions that implement the classification
technique described in Cavnar & Trenkle, "N-Gram-Based Text
Categorization". It was primarily developed for language guessing, a
task on which it is known to perform with near-perfect accuracy.
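The ranking idea behind this technique can be sketched in a few lines.
The following is a simplified illustration of n-gram profiles and the
"out-of-place" distance from the paper, not libtextcat's own code or
API; the function names, the padding character and the default limits
(n-grams up to length 5, 300 entries per profile) are assumptions made
for this example.

```python
from collections import Counter


def ngram_profile(text, max_n=5, top=300):
    """Return the most frequent character n-grams (1..max_n), best first."""
    counts = Counter()
    for word in text.lower().split():
        padded = "_" + word + "_"  # '_' marks word boundaries (assumed convention)
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending frequency, breaking ties alphabetically so the
    # ranking is deterministic.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:top]


def out_of_place(doc_profile, cat_profile):
    """Sum of rank differences; n-grams missing from the category
    profile get a fixed maximum penalty."""
    rank = {g: r for r, g in enumerate(cat_profile)}
    max_penalty = len(cat_profile)
    return sum(abs(r - rank[g]) if g in rank else max_penalty
               for r, g in enumerate(doc_profile))


def guess(text, profiles):
    """Return the name of the category profile closest to the text."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda name: out_of_place(doc, profiles[name]))
```

In use, one profile is built per language from sample text, and
guess() picks the profile with the smallest distance to the input.
Real profiles are trained on much larger samples than a toy example
would use.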
Based on the FreeBSD port.
This is a prerequisite for pinot.
Hunspell is a spell checker and morphological analyzer library and
program designed for languages with rich morphology and complex word
compounding or character encoding.
Note that this is not to be considered an aspell replacement just
yet. We install no hunspell dictionaries for now but use the ones from
mozilla.
Reworked from an original port by Edd Barrett (maintainer).
Tested by sthen@ in a bulk build, thanks!
ok sthen@
This module parses a query string into a data structure to be handled
by external search engines. For examples of such engines, see
File::Tabular and Search::Indexer.
The query string can contain simple terms, "exact phrases", field
names and comparison operators, '+/-' prefixes, parentheses,
and boolean connectors.
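A rough idea of the resulting data structure can be given with a small
tokenizer. This sketch is written in Python rather than Perl and does
not reproduce the module's interface or output format; the
parse_query name and the dictionary keys are hypothetical. It handles
'+/-' prefixes, field names with an operator, and quoted phrases, and
it leaves out parentheses and boolean connectors.

```python
import re

# One query term: optional prefix, optional field + operator, then a value.
TERM = re.compile(r'''
    (?P<prefix>[+-])?                            # '+' required, '-' excluded
    (?:(?P<field>\w+)(?P<op>:|<=|>=|!=|=|<|>))?  # optional field and operator
    (?:"(?P<phrase>[^"]*)"|(?P<word>[^\s"]+))    # quoted phrase or bare word
''', re.VERBOSE)


def parse_query(query):
    """Split a query string into a list of term dictionaries."""
    terms = []
    for m in TERM.finditer(query):
        is_phrase = m.group("phrase") is not None
        terms.append({
            "prefix": m.group("prefix") or "",
            "field": m.group("field") or "",
            "op": m.group("op") or "",
            "value": m.group("phrase") if is_phrase else m.group("word"),
            "phrase": is_phrase,
        })
    return terms
```

For example, parse_query('+title:"text categorization" -draft') yields
two terms: a required phrase on the field "title" and an excluded bare
word. A search engine would then walk this list to build its own
lookup.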
from Ian Mcwilliam (MAINTAINER)
Catfish is a handy file searching tool for Linux and Unix. Basically it
is a frontend for different search engines (daemons) which provides a
unified interface. The interface is intentionally lightweight and
simple, using only GTK+2. You can configure it to your needs by using
several command line options.
ok ajacoutot@
Meld is a visual diff and merge tool. You can compare two or three files
and edit them in place (diffs update dynamically). You can compare two
or three folders and launch file comparisons. You can browse and view a
working copy from popular version control systems such as CVS,
Subversion, Bazaar-ng and Mercurial if the corresponding commands are
installed.
ok ajacoutot@ wcmaier@
disabled for now. "i'm stunned by the quality and that it doesn't
choke on a recent document[0] where xpdf had issues with" simon@
(who also helped track down the key bindings, thanks!).
Fitz is a project to create a new and modern graphics library.
At the core of Fitz is the display tree: a scene graph of vector
graphics, images and text making up the contents of a page.
The standard components of Fitz are:
* Base runtime (thin memory and error handling layer)
* Streams and filters (standard PostScript, PDF and TIFF filters)
* World model (display trees and resources)
* Drawing (draw the tree to a bitmap raster)
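The display tree idea can be illustrated with a toy scene graph. This
is a generic sketch and not Fitz's actual data model or API: the node
classes (Path, Image, Text, Group) and the walk function are
hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Path:
    points: List[Tuple[float, float]]  # vector outline


@dataclass
class Image:
    width: int
    height: int


@dataclass
class Text:
    string: str


@dataclass
class Group:
    # Interior node: holds other nodes in drawing order.
    children: List["Node"] = field(default_factory=list)


Node = Union[Path, Image, Text, Group]


def walk(node, depth=0):
    """Yield (depth, leaf) pairs depth-first, left to right."""
    if isinstance(node, Group):
        for child in node.children:
            yield from walk(child, depth + 1)
    else:
        yield depth, node
```

A renderer in this model would visit the leaves in the order produced
by walk() and draw each one to the bitmap.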
MuPDF is a PDF parser that reads PDF files and creates Fitz trees.
MuPDF also has an API to modify internal objects in the PDF files
and write PDF files. For instance, it is possible to use the MuPDF
library to encrypt existing PDF files, or to rearrange the pages.
pdftool is a command-line demo of this functionality; it is a portable
PDF Swiss Army knife for fixing broken PDF files, changing permissions,
merging and extracting pages, and examining the internal object
structure of a PDF file.
The mupdf binary (aka pdfview) is a bare-bones PDF viewer.