From Jeff Bachtel <>, reviewed by naddy@
w3mir is an all-purpose HTTP copying and mirroring tool. The main
focus of w3mir is to create and maintain a browseable copy of one,
or several, remote WWW site(s). Used to the max, w3mir can retrieve
the contents of several related sites and leave the mirror browseable
via a local web server, or from a filesystem, such as directly from
w3mir's goal is to be able to make useful mirrors of any reasonable
WWW site. It specifically preserves link integrity within the
mirrored documents as well as the integrity of links outside the
mirror, following redirects as needed, if you want it to. w3mir has
a powerful ``multi-scope'' mechanism enabling the user to make
mirrors of several related sites and have links between them refer
to the mirrored documents rather than the original site. w3mir has
several features aimed at preparing mirrors for CDROM burning and
at handling some rarely seen problems when mirroring.
w3mir supports HTML4, and has partial support for CSS, Java and
October 21, 2000, Version 3.0.18
- Fixed file upload bugs (Sascha)
October 11, 2000, Version 3.0.17
- Fixed output functions (Sascha)
- Added odbc_tables() (Frank)
- Fixed htmlspecialchars/htmlentities inconsistencies (Rasmus)
- Added is_uploaded_file() (Zeev)
- Clean up htmlspecialchars/htmlentities inconsistencies (Rasmus)
- Add optional charset parameter to sybase_[p]connect (
- Fixed incorrect handling of 0-precision strings (e.g., %4.0s)
in printf (Ken Coar)
- You can now call Ora_Error() without parameters to get the reason
for a failed connection attempt. (Kirill Maximov)
- Fixed crash in OCIFetchStatement() when trying to read after
all data has already been read. (Thies)
- Added --enable-sigchild. Use this option if you encounter
<defunct> processes when using Oracle 8i. (Thies)
- Uncommitted outstanding OCI8 transactions are now rolled back
before the connection is closed. (Thies)
- Improved configure checks for Oracle 8i. (Thies)
- Added imap_mime_header_decode() function (Skalski)
clientAccessCheck incorrectly returns ACCESS_ALLOWED for proxy requests
when configured as an HTTP accelerator only
Wherever Squid inserts text received from the network into an HTML
page (error pages, FTP listings, Gopher listings, ...) care must be taken
to ensure that the text is properly encoded as HTML, or a malicious user
might be able to insert script code or other HTML tags, and exploit the
web browser of any user visiting their page or clicking on that funny link
received in an email.
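
The encoding step described above can be sketched as follows. This is only
an illustration in Python, not Squid's actual code: the function name
render_error_page and the example URL are hypothetical, and html.escape
from the Python standard library stands in for whatever encoding routine
Squid uses internally.

```python
import html


def render_error_page(requested_url: str) -> str:
    """Build a tiny error page, encoding the untrusted URL as HTML first."""
    # html.escape replaces &, <, > and (with quote=True) both quote
    # characters, so the text cannot open a tag or break out of an attribute.
    safe_url = html.escape(requested_url, quote=True)
    return (
        "<html><body>"
        f"<p>Unable to retrieve: {safe_url}</p>"
        "</body></html>"
    )


# A URL crafted to smuggle a script element into the page (hypothetical).
page = render_error_page('http://example.invalid/<script>alert("x")</script>')
print(page)
```

With the escaping in place, the browser displays the script element as
literal text instead of executing it.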
MHonArc is a Perl mail-to-HTML converter. MHonArc provides HTML
mail archiving with index, mail thread linking, etc., plus other
capabilities including support for MIME and powerful user customization
HTML::Base is an expansion module for Perl 5 which provides
an object-oriented way to build pages of HTML. Its purpose
is to create HTML tags at the lowest level of functionality,
that is to say, it creates HTML and doesn't do much else.
Specifically, it does not provide any CGI-like services
(see the CGI modules for that!).
Currently, the module supports all of the HTML 2.0 tags,
plus some selected tags from HTML 3.0, and some Netscape-isms.
ok by brad@
Libwww-perl is a collection of Perl modules which provides a simple
and consistent application programming interface (API) to the
World-Wide Web.
The main focus of the library is to provide classes and functions
that allow you to write WWW clients, thus libwww-perl is said to be a
WWW client library. The library also contains modules that are of
more general use and even classes that help you implement simple
HTTP servers.
guess what, ok'ed by brad@ !
This is a collection of modules that parse and extract information
from HTML documents. Bug reports and discussions about these modules
can be sent to the <> mailing list. Remember to
also look at the HTML-Tree package that creates and extracts
information from HTML syntax trees.
The modules present in this collection are:
HTML::Parser - The parser base class. It receives arbitrary sized
chunks of the HTML text, recognizes markup elements, and
separates them from the plain text. As different kinds of
markup and text are recognized, the corresponding event
handlers are invoked.
HTML::Entities - Provides functions to encode and decode text
with embedded HTML <entities>.
HTML::HeadParser - A lightweight HTML::Parser subclass that
extracts information from the <HEAD> section of an HTML document.
HTML::LinkExtor - An HTML::Parser subclass that extracts links
from an HTML document.
HTML::TokeParser - An alternative interface to the basic parser
that does not require event driven programming. Most simple
parsing needs are probably best attacked with this module.
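
The event-driven style described for HTML::Parser and HTML::LinkExtor can
be illustrated with a rough analogue. The sketch below uses Python's
standard html.parser module as a stand-in; it is not the Perl API, and the
class name LinkCollector and the sample markup are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href/src attribute values as start tags are recognized."""

    def __init__(self):
        super().__init__()
        self.links = []        # (tag, url) pairs, in document order
        self.text_chunks = []  # non-blank plain text between markup

    def handle_starttag(self, tag, attrs):
        # Event handler: invoked once per recognized start tag.
        for name, value in attrs:
            if name in ("href", "src") and value is not None:
                self.links.append((tag, value))

    def handle_data(self, data):
        # Event handler: invoked for plain text separated from the markup.
        if data.strip():
            self.text_chunks.append(data.strip())


collector = LinkCollector()
# Feed the document in arbitrarily sized chunks; the first chunk ends in
# the middle of a tag on purpose, and the parser buffers it until complete.
for chunk in ['<p>See <a hr', 'ef="/docs/">the docs</a> and ',
              '<img src="logo.png"></p>']:
    collector.feed(chunk)
collector.close()

print(collector.links)
```

The handlers fire as each piece of markup is completed, so the caller never
has to hold the whole document in memory.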
ok by brad@
This package contains the module with friends. The module
implements the URI class. Objects of this class represent Uniform
Resource Identifier (URI) references as specified in RFC 2396.
URI objects can be used to access and manipulate the various
components that make up these strings. There are also methods to
combine URIs in various ways.
The URI class replaces the URI::URL class that used to be distributed
with libwww-perl. This package contains an emulation of the old
URI::URL interface. The emulated URI::URL implements both the old
and the new interface.
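
As a rough illustration of what accessing the components of a URI
reference and combining URIs look like, here is a minimal sketch. It uses
Python's standard urllib.parse module as a stand-in, not the Perl URI
class, and the example.org addresses are placeholders. Note that
urllib.parse follows the later RFC 3986 resolution rules; for simple
relative paths such as these the results should match RFC 2396.

```python
from urllib.parse import urljoin, urlsplit

# Split a URI reference into its components.
ref = urlsplit("http://www.example.org:8080/docs/guide.html?lang=en#intro")
print(ref.scheme, ref.hostname, ref.port, ref.path, ref.query, ref.fragment)

# Combine a base URI with relative references to get absolute URIs.
base = "http://www.example.org/docs/guide.html"
absolute = urljoin(base, "../images/logo.png")
sibling = urljoin(base, "faq.html")
print(absolute)
print(sibling)
```

Resolving relative references against a base in this way is the operation
a mirroring tool or link extractor needs before it can fetch or rewrite a
link.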