prtwash.pl: initial commit

This commit is contained in:
John McQuah 2022-05-28 12:49:56 -04:00
parent aedef1fb39
commit 9ccc1ebf59
2 changed files with 392 additions and 0 deletions

142
man1/prtwash.1 Normal file
View File

@ -0,0 +1,142 @@
.TH prtwash 1 "May 28, 2022" "prtwash 1.2.1" ""
.SH NAME
\fBprtwash \fP- a script to clean the port dirs in CRUX.
\fB
.SH SYNOPSIS
.nf
.fam C
\fBprtwash\fP [-p] [-b] [-d] [-s] [-t] [-q] [-a] <path> [<path> ...]
.fam T
.fi
.SH DESCRIPTION
\fBprtwash\fP is a perl script to clean the ports tree of a CRUX
distribution. You can clean a single directory, multiple directories,
or the entire ports tree. With the given options it is possible to
choose what kind of files the program should delete.
A test mode is provided.
.PP
\fBprtwash\fP was inspired by Martin Opel's prtsweep script.
.SH OPTIONS
By default prtwash will NOT delete the following items
in a port directory:
.IP \(bu 3
The Pkgfile
.IP \(bu 3
The sources (as stated in the Pkgfile)
.IP \(bu 3
The built package
.IP \(bu 3
The .signature, .footprint, .32bit and .nostrip files
.IP \(bu 3
Additional files: README, FAQ, pre-install, post-install
.PP
All items not included in the previous list WILL be deleted
(e.g.: the 'work' dir of pkgmk, if present).
.PP
You can choose to delete some of the above by passing the proper
option:
.TP
.B
-p
also remove the built package (current version)
.TP
.B
-b
also remove other built packages (older versions)
.TP
.B
-d
also remove the add-on files provided by CRUX maintainers
(patches and initscripts downloaded with \fBports -u\fP)
.TP
.B
-s
also remove the upstream sources (those NOT obtained via \fBports -u\fP)
.TP
.B
-t
(test mode) display the target files without deleting
.TP
.B
-q
(quiet mode) suppress messages about files that are kept;
basically the same as \fBprtwash [OPTIONS] | grep -v "keeping"\fP
.TP
.B
-a
(automatic mode) take the port collections from the
/etc/prt-get.conf file and do a recursive cleaning;
if omitted, the command line must contain at least one <path> specifying
a valid port directory
.PP
Misc options
.TP
.B
-h
Display usage information
.TP
.B
-v
Display version
.SH ENVIRONMENT
In automatic mode, \fBprtwash\fP gets a list of repositories from
/etc/prt-get.conf, and for each repository descends into the
individual port directories to read the associated Pkgfiles. In non-automatic
mode, \fBprtwash\fP only cleans the paths given on the command line. Directories
without a Pkgfile are skipped entirely, and a warning is given unless you
pass the option \fB\-q\fP. After reading the Pkgfile and /etc/pkgmk.conf,
\fBprtwash\fP will know the source filenames, the package filename, and the
distinction between sources under CRUX version control and sources downloaded
directly from upstream.
.PP
Sources under CRUX version control are only deleted if you pass the
option \fB\-d\fP. Sources from upstream are only deleted if you pass the
option \fB\-s\fP.
.PP
The current version of the package is only deleted if you pass the
option \fB\-p\fP. Older versions of the package are only deleted if you
pass the option \fB\-b\fP. Note that if the compression mode defined
in /etc/pkgmk.conf has been changed since you last built the package,
\fBprtwash\fP will not recognize the tarball in its regexp search.
.SH COMPARISON WITH OTHER UTILITIES
Because \fBprtwash\fP reads the location of port collections from \fB/etc/prt-get.conf\fP(5),
you can easily control which collections are cleaned in automatic mode by commenting
or uncommenting the appropriate \fIprtdir\fP directives in that file. This detection
algorithm makes it easy to wash all your locally-curated repositories, in contrast
to the algorithm in \fBprtsweep\fP(1) which looks at the sup files used by \fBports\fP(8).
.PP
Another contrast between the two tools is that \fBprtwash\fP sources each Pkgfile to
generate its keep list, whereas \fBprtsweep\fP reads the signatures file. Spawning an
external bash shell for each port should in theory make \fBprtwash\fP slower
than \fBprtsweep\fP (which does everything in native perl), but on modern hardware the
difference is basically undetectable.
.PP
Neither \fBprtwash\fP nor \fBprtsweep\fP will ever touch anything outside your ports tree
or the directories passed on the command line. If you have centralized directories outside
the ports tree for all downloaded sources and built packages, you might find
\fBoldfiles\fP(1) more useful. Nevertheless, the pace of CRUX port updates is fast enough that
some cruft is bound to accumulate in the ports tree, and for this reason \fBprtwash\fP and
\fBprtsweep\fP still have their place in a CRUX admin's toolbox.
.SH EXAMPLES
.TP
.B prtwash \-a \-t
does a default wash action on the entire port tree;
doesn't really delete the files
.TP
.B prtwash \-a \-p \-s
now we're getting serious: removes all downoaded files
and built packages from the entire port tree
.TP
.B prtwash \-a \-p \-s \-d
same as the above, but also removes patches, initscripts, etc
.TP
.B prtwash \-p \-s /usr/ports/contrib/sitecopy
removes all downoaded files
and built packages from the sitecopy directory
.TP
.B prtwash /usr/ports/contrib/*
does the default wash action on the entire contrib collection
.SH AUTHORS
Simone Rota <sip at varlock dot com>, John McQuah <jmcquah at disroot dot org>
.SH SEE ALSO
\fBprtsweep\fP(1), \fBoldfiles\fP(1)

250
scripts/prtwash.pl Executable file
View File

@ -0,0 +1,250 @@
#!/usr/bin/perl
#
########################################################################
#
# prtwash
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# This is a script for removing rubbish from a CRUX port tree.
# Distributed under the terms of the GPL license.
#
# ChangeLog and author information available in the prt-utils tarball.
#
########################################################################
use strict;
use File::Basename;
use File::Path;
our @portdirs;
our $argports;
my $version = 1.2.1;
our %options = ( oldver=>0, pkgtoo=>0, srctoo=>0, addons=>0,
dryrun=>0, auto=>0, quiet=>0 );
#################### main routine ################################
parse_args();
print_usage() if ((2*$argports-1)*(1-2*$options{auto}) < 0);
our $compression_mode = get_compress();
if ($options{auto} == 1) {
my @bldirs = getportdirs();
my @basedirs = @{$bldirs[0]}; my @localports = @{$bldirs[1]};
my $port;
foreach my $collection (@basedirs) {
print "====> Washing port collection $collection\n";
foreach $port (list_subdirs($collection)) {
do_wash($port);
}
}
foreach $port (@localports) {
do_wash($port);
}
} else {
foreach my $port (@portdirs) {
do_wash($port);
}
}
#################### subroutines #################################
sub parse_args {
while ( my $arg = pop(@ARGV) ) {
next if (-f $arg); # no filenames, only directories and options
if ($arg eq "-b") {
$options{oldver} = 1;
} elsif ($arg eq "-p") {
$options{pkgtoo} = 1;
} elsif ($arg eq "-s") {
$options{srctoo} = 1;
} elsif ($arg eq "-d") {
$options{addons} = 1;
} elsif ($arg eq "-t") {
$options{dryrun} = 1;
} elsif ($arg eq "-a") {
$options{auto} = 1;
} elsif ($arg eq "-q") {
$options{quiet} = 1;
} elsif ($arg eq "-v") {
print "$version\n";
exit(0);
} elsif (-d $arg) {
push (@portdirs, $arg);
} else {
print_usage();
}
}
$argports = @portdirs;
}
sub print_usage {
print <<EOT;
Usage: prtwash [OPTION]... [PORTDIRS]...
-a automatic mode, only valid when [PORTDIRS] are omitted
-p delete the built package (current version)
-b delete older versions of the built package
-s remove upstream tarballs (those NOT obtained by ports -u)
-d remove add-on files (sources obtained by ports -u)
-t test mode, don't actually delete anything
-q quiet mode, don't report files that are kept
-h print this message and exit
-v show version information and exit
Report bugs on libera.chat #crux-devel
EOT
exit(0);
}
sub parse_pkgfile {
my $pkgfile = shift;
my $name; my $version; my $release=1; my @source; my @renames;
my $cmd = "bash -c \'source $pkgfile; ";
$cmd = $cmd.'sa=(_name _version _release ${source[@]}); ';
$cmd = $cmd.'ra=($name $version $release ${renames[@]}); ';
$cmd = $cmd.'for ((s=0; s<${#sa[@]}; s++)); do ';
$cmd = $cmd.'echo "${sa[$s]} ==> ${ra[$s]}"; done'."\' |";
open(PIPE,$cmd) or return;
while (<PIPE>) {
chomp;
my @a = split(/ ==> /,$_);
if ($a[0] eq "_name") {
$name = $a[1];
} elsif ($a[0] eq "_version") {
$version = $a[1];
} elsif ($a[0] eq "_release") {
$release = $a[1];
} elsif ($a[0] =~ /^(ftp|http|https):/) {
$a[0] =~ s/^(ftp|http|https):.*\/(.*)$/remote:\/$2/ ;
push (@source, $a[0]);
push (@renames, $a[1]);
} else {
push (@source, $a[0]);
push (@renames, $a[1]);
}
}
close(PIPE);
return \$name, \$version, \$release, \@source, \@renames;
}
sub keeplist { # remember to pop off the last element for regex purposes
my $port = shift;
my @keepers = ("Pkgfile",".footprint",".signature");
push (@keepers,"pre-install","post-install","README","README.md",
".32bit",".nostrip") if $options{addons}==0;
my @parsed = parse_pkgfile("$port/Pkgfile"); # file existence already tested
my $name = ${$parsed[0]}; # by the caller. But if Pkgfile
my $version = ${$parsed[1]}; # is unreadable (and these vars
my $release = ${$parsed[2]}; # are initialized empty), then
my @source = @{$parsed[3]}; # the user probably doesn't have
my @renames = @{$parsed[4]}; # permissions to do much damage,
# even with a minimal keeplist.
my $i; my $si; my $ki;
ENTRY: for ($i=0; $i<=$#source; $i++) {
$si = $source[$i];
$ki = ($renames[$i] eq "SKIP" or $renames[$i] eq "") ? basename($si) :
$renames[$i];
if ($si =~ /^remote:/) {
next ENTRY if ($options{srctoo}==1);
push(@keepers, $ki);
} else {
next ENTRY if ($options{addons}==1);
push(@keepers, $ki);
}
}
push (@keepers, $name.'#'.$version.'-'.$release.".pkg.tar.$compression_mode") if $options{pkgtoo}==0;
push (@keepers, $name."#.*pkg.tar.$compression_mode");
return @keepers;
}
sub get_compress {
my $conf = "/etc/pkgmk.conf";
$compression_mode = "gz";
open(CONFIG,$conf) or return $compression_mode;
while(<CONFIG>) {
$compression_mode = $1 if m/^[\t ]*PKGMK_COMPRESSION_MODE=(.*)\n/;
}
close(CONFIG);
$compression_mode =~ s/"//g; # ensure no double-quotes remain (thanks jaeger)
return $compression_mode;
}
sub do_wash {
my $port = shift;
while ($port =~ s/\/+$//) {}; # ensure the path contains no trailing slash
if ( ! -f "$port/Pkgfile" ) {
print "WARN: no Pkgfile found in $port. Skipping.\n";
return;
} else {
my @keepers = keeplist($port);
my $nevermatch = "ReallyLongStringThatWouldNeverMatchAnyBuiltPackage";
my $pkgrex = pop(@keepers);
$pkgrex = ($options{oldver} == 1 or $options{pkgtoo} == 1) ? $nevermatch : $pkgrex;
opendir (DIR,$port) or return;
foreach my $f (sort(readdir(DIR))) {
next if ($f eq '.' or $f eq '..');
$f =~ s/\+/\\\+/g; # plus sign in filenames interferes with regex search
if ((grep /$f/, @keepers) or ($f =~ /$pkgrex/)) {
print "... keeping file $port/$f.\n" unless $options{quiet} == 1;
} else {
remove ("$port/$f");
}
}
closedir (DIR);
}
}
sub getportdirs { # returns scalar references to two arrays
my @basedirs;
my @localports;
my $conf = "/etc/prt-get.conf";
open(PORTS, $conf) or die "could not open $conf";
while (<PORTS>) {
chomp;
if ( /^prtdir\s+/ ) {
my $line = $_;
$line =~ s/^prtdir\s+//; #remove the leading directive
$line =~ s/#.*$//; #strip inline comments like this one
$line =~ s/\s+//g; #collapse all whitespace, even if in a path!
#(not that we ever put spaces or commas in repo names anyway)
if ( $line !~ /:/ ) {
push @basedirs, $line if (-d $line);
} else {
my @a = split(/:/, $line);
my @b = split(/,/, @a[1]);
while ( my $c = pop @b ) {
my $port = $a[0] . "/" . $c;
push @localports, $port if (-d $port);
}
}
}
}
close(PORTS);
return \@basedirs, \@localports;
}
sub list_subdirs { # roughly equivalent to `find $1 -maxdepth 1 -type d`
my $path = shift;
my @list;
while ($path =~ s/\/\//\//g) {}
$path =~ s/\/$//;
opendir(DIR, $path) or return;
foreach my $entry(sort(readdir(DIR))) {
next if ( substr($entry,0,1) eq '.' );
push (@list, "$path/$entry") if -d "$path/$entry";
}
closedir(DIR);
return @list;
}
sub remove {
my $path=shift;
my $prepend = ($options{dryrun}==1) ? "+ (t) " : "+ ";
if (-d $path) {
print "$prepend removing directory $path\n";
rmtree ($path,0,1) if ($options{dryrun}==0);
} else {
print "$prepend removing file $path\n";
if ($options{dryrun}==0) { unlink "$path" or return };
}
}