library. This can be used to embed the Swish-e search code into your
Perl program, avoiding the need to run the swish-e binary for searching.
From simon@
ok sturm@, steven@
collections of Web pages or other files. Swish-e is ideally suited for
collections of a million documents or fewer. Using the GNOME libxml2
parser and a collection of filters, Swish-e can index plain text,
e-mail, PDF, HTML, XML, Microsoft Word/PowerPoint/Excel and just about
any file that can be converted to XML or HTML text. Swish-e is also
often used to supplement databases like the MySQL DBMS for very fast
full-text searching.
help from simon, ok steven@, sturm@
There is only one way to make sure you don't have to bump PKGNAME: build
the package with PLIST_DB set, and ONLY if it doesn't complain can you
skip the bump.
like grep, aimed at programmers with large trees of heterogeneous source
code. ack is written purely in Perl, and takes advantage of the power
of Perl's regular expressions.
ok simon@
This module attempts to extract the maximum amount of content from
available documents, and is less concerned with XML compliance than
alternatives. Rather than rely on XML::Parser, it uses heuristics and
good old-fashioned Perl regular expressions. It stores the data in a
simple hash structure, and "aliases" certain tags so that when done,
you can count on having the minimal data necessary for re-constructing
a valid RSS file. This means you get the basic title, description,
and link for a channel and its items.
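
The heuristic approach described above can be sketched roughly as
follows. This is an illustration in Python, not the module's actual
code or API: the names parse_feed and field are hypothetical, and the
regular expressions are a simplified assumption about how such lenient
extraction might work.

```python
import re

# Matches one <item>...</item> block, ignoring case and attributes.
ITEM_RE = re.compile(r"<item\b[^>]*>(.*?)</item>", re.S | re.I)
FIELDS = ("title", "description", "link")


def field(chunk, tag):
    """Return the text of the first <tag>...</tag> in chunk, or None."""
    m = re.search(r"<%s\b[^>]*>(.*?)</%s>" % (tag, tag), chunk, re.S | re.I)
    return m.group(1).strip() if m else None


def parse_feed(text):
    """Extract title, description and link for the channel and its items.

    No XML parser is involved, so unclosed elements or bare ampersands
    elsewhere in the feed do not stop the extraction.
    """
    items = [{tag: field(body, tag) for tag in FIELDS}
             for body in ITEM_RE.findall(text)]
    # Drop the items so the channel fields are not confused with item fields.
    channel_text = ITEM_RE.sub("", text)
    channel = {tag: field(channel_text, tag) for tag in FIELDS}
    channel["items"] = items
    return channel
```

The result is a plain dictionary, mirroring the "simple hash structure"
idea: missing fields come back as None instead of raising an error.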
This module provides a Perl interface to the GNU Aspell library. It is
there to meet the need of looking up many words, one at a time, in a
single session, such as spell-checking a document in memory.
ICU (International Components for Unicode) is a set of C/C++ and Java
libraries providing Unicode and globalization support. icu4c is the
C/C++ version.
ICU services include code page conversion, collation (comparison using
locale-specific ordering), locale-aware formatting, Unicode regular
expressions and bidirectional text handling.
ICU is available under an open source non-copyleft licence.
from MAINTAINER Vincent Gross via jasper@, with hints from ajacoutot@
and tweak by me
ok jasper@
Highlight converts source code to HTML, XHTML, RTF, LaTeX, TeX and XML
files with syntax highlighting. Its language definitions, colour themes
and indentation schemes are customizable.
tweak & ok ajacoutot@
Yould is a generator for pronounceable random words. The engine uses
Markov chains with two-letter transitions. This distribution includes
trained engines for several languages: English, Dutch, Finnish, Italian,
French and German.
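
As a rough illustration of the technique (not Yould's own code), a
Markov chain whose state is the previous two letters might look like
the sketch below. The function names train and generate, the "^" and
"$" padding markers, and the tiny word list in the usage note are all
assumptions made for this example.

```python
import random
from collections import defaultdict

START = "^"  # padding marker placed before the first letter
END = "$"    # marker meaning "the word stops here"


def train(words):
    """Record, for every pair of letters, which letters followed it."""
    table = defaultdict(list)
    for word in words:
        padded = START * 2 + word.lower() + END
        for i in range(len(padded) - 2):
            table[padded[i:i + 2]].append(padded[i + 2])
    return table


def generate(table, rng, max_len=12):
    """Walk the chain from the start state until END or max_len letters."""
    state = START * 2
    out = []
    while len(out) < max_len:
        nxt = rng.choice(table[state])
        if nxt == END:
            break
        out.append(nxt)
        state = state[1] + nxt  # slide the two-letter window forward
    return "".join(out)
```

For example, generate(train(["banana", "bandana"]), random.Random())
returns a word built only from letter sequences seen in the training
words; a real engine would be trained on a much larger word list per
language.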
ok ajacoutot@
Pygments is a generic syntax highlighter for general use in all kinds of
software such as forum systems, wikis or other applications that need to
prettify source code.