fetch from ftp-master and pointyhat; they'll just get timeouts.
Instead, each machine is expected to set up their own MASTER_SITE_*
variables in etc/make.conf via a bindist-${hostname}.tar file.
Approved by: portmgr (self)
on a disconnected client, without running the time-consuming rsyncs.
This is useful when a build is interrupted and needs to be restarted.
* After we have cleaned up the machine, reset the queue counter by using
pollmachine -queue. This has a race condition if other builds are being
dispatched to the machine (e.g. builds on another branch):
getmachine can claim a directory and increment the counter, then the
machine is polled and finds e.g. 0 chroots in use, and resets the
counter to 0, then claim-chroot is run and the build dispatched, with
the counter now off-by-one. This could be fixed by running
claim-chroot with the .lock held, but this turns out to be too
time-consuming. A two-level lock approach might also fix this
efficiently.
same time, assuming that the admin has already built the INDEX and
INDEX.old in advance.
* Adapt to new method of calculating build concurrency, by summing the
value of ${maxjobs} listed in every portbuild.${machine}
* Support 5-exp builds
(i.e. if the package lists a dependency on the relevant package in the
PACKAGE_BUILDING case). This allows packages that require an
available DISPLAY to again build (with some forthcoming fixes to
existing ports).
Improve the reporting of detected filesystem anomalies (extra files
left behind after deinstallation, changes to and removal of
pre-existing files)
synchronously instead of probabilistically scheduling jobs, which
means that the job load on a machine never exceeds a desired
threshold, and we can preferentially use faster machines when they are
available. This has a dramatic effect on package build throughput,
although I don't yet have precise measurements of the performance
improvements.
Specifically, the changes are:
* Introduce the new variable maxjobs in portbuild. This replaces the
build scheduling weights previously listed in the mlist file, which
now changes format to list the build machines only, ranked in order of
preference for job dispatches (i.e. faster machines first).
* The ${arch}/queue directory is used to list machines available for
jobs (file content is the number of jobs currently running on the
machine). Changes to files in this directory are serialized using
lockf on the .lock file.
* Claim a machine with the getmachine script, with the .lock held.
This picks the machine with the fewestnumber of jobs running, which is
listed highest in the mlist file in case of multiple machines with
equal load. The job counter is incremented, and the file removed if
the counter reaches ${maxjobs} for that machine. If all machines are
busy, sleep for 15 seconds and retry.
* After we have claimed a machine, we run claim-chroot on it to claim
an empty chroot, as before. If the claim fails, release the job from
the queue with the releasemachine script and retry after a 15 second
wait.
* When the build is finished, decrement the job counter with the
releasemachine script, with .lock held.
* The checkmachines script now exists only to poll the load averages
for admin convenience (every 2 minutes), and to ping for unreachable
machines. When a machine cannot be reached, remove the entry in the
queue directory to stop further job dispatches to it. This needs more
work to deal with reinitialization of machines after they become
available again.
synchronously instead of probabilistically scheduling jobs, which
means that the job load on a machine never exceeds a desired
threshold, and we can preferentially use faster machines when they are
available. This has a dramatic effect on package build throughput,
although I don't yet have precise measurements of the performance
improvements.
Specifically, the changes are:
* Introduce the new variable maxjobs in portbuild. This replaces the
build scheduling weights previously listed in the mlist file, which
now changes format to list the build machines only, ranked in order of
preference for job dispatches (i.e. faster machines first).
* The ${arch}/queue directory is used to list machines available for
jobs (file content is the number of jobs currently running on the
machine). Changes to files in this directory are serialized using
lockf on the .lock file.
* Claim a machine with the getmachine script, with the .lock held.
This picks the machine with the fewestnumber of jobs running, which is
listed highest in the mlist file in case of multiple machines with
equal load. The job counter is incremented, and the file removed if
the counter reaches ${maxjobs} for that machine. If all machines are
busy, sleep for 15 seconds and retry.
* After we have claimed a machine, we run claim-chroot on it to claim
an empty chroot, as before. If the claim fails, release the job from
the queue with the releasemachine script and retry after a 15 second
wait.
* When the build is finished, decrement the job counter with the
releasemachine script, with .lock held.
* The checkmachines script now exists only to poll the load averages
for admin convenience (every 2 minutes), and to ping for unreachable
machines. When a machine cannot be reached, remove the entry in the
queue directory to stop further job dispatches to it. This needs more
work to deal with reinitialization of machines after they become
available again.
Additional changes to this file:
* Exit if passed a null package name, to avoid badness later on
* Send a nag-mail if pkg-plist errors are detected in the build
/rescue/mount -t linprocfs, so assume that the i386 build hosts have
statically-built copies of the necessary binaries in /sbin, until this is
fixed.
Create /usr/X11R6 inside the chroot so that mtree has something to do, since
this directory is otherwise orphaned.
List the extra/removed/changed files separately, and colour-code the
serious errors (files left behind outside of /usr/local and /usr/X11R^;
files removed that were installed by another port, and files with changed
permissions or ownership)
the port deinstall; mtree does not recurse into subdirectories it does
not know about
* Break out the 'files incorrectly removed' and 'files incorrectly changed'
into their own sections
* Remove USE_QT2 since it's obsolete now. [2]
* Clarify comments about ARCH. [3]
* Speedup 'make readmes'. Add a perl script "Tools/make_readmes"
and modify bsd.port.subdir.mk to avoid recursing into individual
port directories to create README.html. [4]
* Fix 'make search' to allow case insensitive search on 5-x/6-x. [5]
* Add the possibility to search the ports by category. [6]
* Remove tk42 and tcl76 from virtual categories since they're
obsolete. [7]
* Introduce new variable - DISTVERSION, vendor version of the
distribution, that can be set instead of PORTVERSION and is
automatically converted in a conforming PORTVERSION. [8]
* Use --suffix instead of -b option for patch(1) to make it
compatible with BSD patch(1) [9]
* Fix {WANT,WITH}_MYSQL_VER behavior, to deal with conflicting
versions. [10]
PR: ports/68895 [1], ports/69486 [2], ports/68539 [3],
ports/70018 [4], ports/68896 [5], ports/73299 [6],
ports/73570 [7], ports/67171 [8], ports/72182 [9]
Submitted by: linimon [1][3], arved [2][7], cperciva [4],
Matthew Seaman <m.seaman@infracaninophile.co.uk> [5],
Radek Kozlowski <radek@raadradd.com> [6],
eik [8], Andreas Hauser <andy-freebsd@splashground.de> [9],
clement [10]
restricted ports' instead of 'don't build any restricted ports' since
the former is useful when we're not intending to publish the results
of a build, but the latter is not.
Move the build preprocessing (directory setup, old build rotation,
etc) out from under -nobuild, so that we can set up a new build using
that option.
${arch}/${branch}/latest/${portdir}. We will use this in the
processfail script, so that the "new package build errors" webpages do
not have out-of-date links but instead link to the most recent copy of
the build error.
that it may be called by hand.
Support new portbuild.conf variables
client_user = user to connect to on the client (not necessarily
root). This user must have write permission to the
/var/portbuild tree if disconnected=1 (i.e. we're
going to run rsync).
rsync_gzip = set to "-z" to enable compression on low-bandwidth
disconnected clients.
Approved by: portmgr (self)
ssh times out)
* Support new portbuild.conf settings:
client_user = user to connect to on the client (not necessarily root)
sudo_cmd = If ssh'ing to a non-root user, run this command to gain
root privs (set to empty string for client_user=root,
or sudo for !root). Cannot require interactivity, of
course.
Approved by: portmgr (self)
because this file is a chronological history of port builds that have
failed, the files listed may not be present in the current set of
error logs, and we currently have no easy way to find the most recent
failure log to use instead.
i386-5-latest that are linked to from the index.html are symlinks to
dated directories (e.5.`date`), so the URLs in the error reports will
expire with the start of the next build when the symlink is repointed.
This change makes the URLs in the error reports use the realpath of
the target file, so they do not expire.
* Clients no longer have ssh access to the master, so we need to
push/pull everything on the client from here. This means we need to
know where the build took place so we can go in and get the files
after it finishes. Introduce the claim-chroot script which
atomically claims a free chroot directory on the host and returns
the name. This directory is later populated by the portbuild script
if it does not already contain an extracted bindist.
* Use the per-node portbuild.$(hostname) config file to decide where
in the filesystem to claim the chroot on the build host.
* If a port failed unexpectedly (i.e. is not marked BROKEN), or if
something strange happened when trying to pull in build results from
a client, then send me email (XXX should be configurable).
* Clean up after the build finishes and we have everything we need, by
dispatching the clean-chroot script on the client.
if requested (".keep" file in the port directory), no matter where
we fail.
* Add package dependencies before the corresponding build stage
(e.g. FETCH_DEPENDS before 'make fetch'), and remove them again
afterwards. This allows us to catch ports that list their
dependencies too early/late.
* No need to check for set[ug]id files here, the security-check target
in bsd.port.mk does it for us.
* Exclude some more directories and files from showing up in the mtree
before/after comparison, to trim down the false-positive in the
pkg-plist check.
* Other minor changes
it's done properly^Wbetter in makeparallel
* Script accepts new arguments:
-nodoccvs: skip cvs update of the doc tree
-trybroken: try to build BROKEN ports (off by default because the
i386 cluster is fast enough now that when doing incremental builds we
were spending most of the time rebuilding things we know are probably
going to fail anyway. Conversely, the other clusters are slow enough
that we also usually don't want to waste time on BROKEN ports).
-incremental: compare the interesting fields of the new INDEX with
the previous one, remove packages and log files for the old ports that
have changed, and rebuild the rest. This substantially cuts down on
build times since we don't rebuild ports that we know have not
changed. XXX checkpoint of work-in-progress, not yet working as
committed.
* When setting up the nodes, read in per-node config files
("portbuild.$(hostname)") before dispatching the setupnode script on
each node. For disconnected nodes (which don't mount the master via
NFS), we also rsync the interesting files required by the builds
(ports/src/doc trees, bindist tarballs, scripts) into place on the
client. They will be mounted locally via nullfs in the build chroots.
* Break out the restricted.sh generation into a makerestr script so it
can be called manually as needed.
* Remove the -nocvsup argument which has been unused for a long time.
* For now, don't prune the list of failed ports with prunefail,
since when -trybroken is not specified, every BROKEN port ends up in
the duds file (so the build is skipped), and as a result we would
prune almost everything from the list of failed ports. XXX
prunefailure should be run conditionally on -trybroken, or I should
find a way to prune in both cases.
* Don't run index in the background, it was thrashing against makeduds
and not saving any time by doing it concurrently.
* Build with 'make quickports all' to kick off the quickports builds
earlier.
* Delete restricted and/or cdrom distfiles *after* post-processing the
distfiles, otherwise the script doesn't remove any of them since
they're not in the expected place.
* Miscellaneous other minor changes and cleanups
tells us whether the node has NFS access to the master.
* Also copy the bindist-$(hostname).tar file to allow local
customization of the build chroots (e.g. resolv.conf and make.conf
files for disconnected systems)
* For disconnected hosts, we don't copy the bindist files from the
master, but just set up the local directories and let the server rsync
them into place later. Also set up dangling symlinks to the bindist
files in the build area, which will be filled in by the server too (in
the NFS case it makes sense to cache the bindist files locally to
avoid extra NFS traffic, but here we know the file is local so a
symlink is fine)
* Remove an apparently spurious 'killall fetch' that snuck in for what
were probably transient reasons.
* Forcibly clean up old chroot directories since we are preparing to
start another build and don't want old (possibly orphaned) builds to
skew the job scheduling or use up resources.
host), specified by disconnected=1 in portbuild.$(hostname) file.
These do not mount via NFS, so we need to maintain a local copy of
things needed by the build (like the ports/src/doc trees) on the build
host, which are mounted into the chroot by read-only nullfs. These
local files are maintained in the dopackages script via rsync.
* Download packages via http instead of NFS. Allow fetching via a
local http proxy (http_proxy variable in per-node
portbuild.$(hostname) file). Caching package dependencies saves about
85% of package fetches and similar reduction in package fetch traffic
by byte count.
* Support a per-node tarball (bindist-$(hostname).tar) to customize
the build chroots. This is used for things like local resolv.conf and
make.conf files on disconnected nodes.
* Make sure we don't use a chroot until it is finished extracting.
* Don't set '.' in PATH; this is bad practise, and fortunately nothing
seems to rely on it.
* Only try to build broken packages if requested
* Try harder to unmount leftover linprocfs mounts in the chroot, by copying
in the 5.x mount binary and supporting libraries from the host system.
The 5.x mount is able to unmount by FSID in situations where the 4.x umount
becomes confused.
* Don't clean up when we are signalled, that is done by the build
master from outside.
* Suppress some code relating to jail builds, which are not yet ready
for use.
* Don't push results of the build back to the master; the master now
pulls them from the client when the build completes. Clients no
longer need ssh access into the master; this is good for security as
well as significantly reducing the load on the master since it is not
thrashed by dozens of sshd processes.
advantage is that here we know the value of PKGSUFFIX (.tgz/.tbz) for
the build via buildenv.
* Add a list of 'quickports', which are ports with long dependency chains
that we should kick off straight away to try and avoid bottlenecks later
on when most of the cluster idles waiting for one or two ports to build.
Ideally we'd build dependencies of these ports exclusively first and only
build other ports when we run out (i.e. a build slot becomes free), but I
couldn't work out how to do this. As a compromise, we now do
'make -k -j<#> quickports all' which doesn't give quite as high a
priority to the quickports (i.e. we also build other ports from the
beginning while there are quickport dependencies still to build), but is
better than nothing.
* Pass in the FETCH/EXTRACT/PATCH/BUILD/RUN_DEPENDS separately via env
variables when dispatching a job. This allows us to add and remove
the dependencies at the corresponding build stage to catch ports
with dependencies listed too early/late.
sure we don't try and schedule jobs on it even if all other machines are
busy
* Remove sleep in outer loop, this isn't needed or worthwhile now that there
are so many machines being monitored
for INDEX builds [1]
* Remove the parallel target from Makefile; this is heavily tied to
the package build cluster and can be better done in the makeparallel
script (commit to follow) [2]
* Extend the format of INDEX to separately list the
EXTRACT/PATCH/FETCH_DEPENDS instead of lumping them all in together
with BUILD_DEPENDS. The three new fields are appended to the end of
the record in that order. [2]
* Change BROKEN to IGNORE in BROKEN_WITH_MYSQL failure code [3]
* Support non-default PREFIX for perl 5.00503 [5]
* Use pkg_info -I instead of ls when searching for conflicts [6]
* Allow local customization of the port subdirectories by including
${.CURDIR}/Makefile.local in bsd.subdir.mk if it exists [7]
* Fix 'make search' when ${PORTSDIR} is a symlink to a directory name
containing extended regexp metacharacters [8]
Submitted by: linimon [1] [3], kris [2], lth [4], sem [5], eik [5] [6],
Roman Neuhauser <neuhauser@chello.cz> [7]
PR: 68299 [1], 67705 [3], 67264 [4], 59696 [5], 66568 [6],
68072 [7]
build locking, log files, and cleans things up if a build fails.
This script is the primary starting point for a package build. Symlinks
should be created in the form of dopackages.${branch} -> dopackages.wrapper
where ${branch} is currently one of 4, 4-exp, or 5. This script takes the
place of the unofficial (i.e. uncommitted) dopackages.steveX scripts.
Ok'd by: kris
Tested by: 4.10-RELEASE package build
- CC committers and maintainer [1]
- include affected ports in the subject line [2]
- do a CVS log of the version checked out [3]
Suggsted by: Ade Lovett <ade@FreeBSD.org> [1]
Bjoern A. Zeeb <bzeeb-lists@lists.zabbadoz.net> [2]
Pav Lucistnik <pav@FreeBSD.org> [3]
You can even get notified of version changes in your favourite
perl modules by setting
WATCH_REGEX='p5-.*'
Plus, it has a nice configurable nagging option.
used in 20 minutes, as well as directories listed as 'in use' that have not been touched
in 24 hours (corresponding to port builds that have timed out or shut down uncleanly)
and prunes them to reclaim space. This is intended to be run as a cron job.