Jolly-Seber Models
The essential features of Jolly-Seber experiments
(Jolly 1965; Seber 1965, 1982 Chapter 5) are:
- The population is sampled on K > 2 occasions, designated as sample i,
i=1,...,K. Samples need not be equally spaced in time, but if they are not, the
times between samples, d_i = t_{i+1} - t_i, should be known. Animals are captured at each
sample, i, and are returned marked to the population. Marks may uniquely
identify the animal and be applied at its first capture occasion, or batch
marks identifying sample time may be used, but in the latter case, the batch
mark must be re-applied each time an animal is captured. Animals that are
killed in sample i or are, for any other reason, not returned to the population
after sample i, are called losses on capture at i. At the end, the capture
history of each animal is known or (with batch marks) the number of animals
with any given history of captures is known.
- The population is assumed to be open; that is, subject to "births"
(additions to the catchable population through recruitment, immigration, birth)
and "deaths" (through any mechanism of loss of animals from the catchable
population; deaths do not include known losses through loss on capture).
- The fundamental parameters of the Jolly-Seber model are:
- p_i
- the capture probability at sample i.
- S_i (written phi_i in the model descriptions below)
- the survival probability to sample i+1 for animals alive just after i.
Note that the time interval, d_i, affects this parameter.
- B_i
- the number of "net births" between samples i and i+1: i.e. the number of
animals that enter the catchable population during this interval and are still
there at the end of the interval. B_0 can be understood to be the initial
population size. As we shall see later, however, this is not the only way to
parameterize the birth process.
Derived parameters of interest are the population size, N_i, and the number of
marked animals alive, M_i.
- The main assumptions of the Jolly-Seber method are:
- no mark loss and correct identification of marks.
- homogeneity of capture probability for all animals alive just before
sample i.
- homogeneity of survival for all animals in the population just after
sample i.
Some of the tests for, and effects of, assumption failures of various types are
reviewed by Pollock et al. (1990), and it is
important that data analysis include careful and thorough testing.
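As an informal illustration of how the derived population sizes relate to the fundamental parameters, the sketch below projects expected values of N_i forward from B_0, the survival probabilities and the net births. It ignores losses on capture and works with expected values only. The function name and the numbers are illustrative assumptions, not POPAN output.

```python
def expected_population_sizes(initial_size, survival, net_births):
    """Project expected population sizes N_1..N_K.

    initial_size -- B_0, the population size at the first sample
    survival     -- survival probabilities for intervals 1..K-1
    net_births   -- net births B_1..B_{K-1} for the same intervals
    Losses on capture are ignored (an assumption of this sketch).
    """
    if len(survival) != len(net_births):
        raise ValueError("need one survival and one net-birth value per interval")
    sizes = [float(initial_size)]
    for phi, births in zip(survival, net_births):
        # Survivors of the interval plus entrants still present at its end.
        sizes.append(sizes[-1] * phi + births)
    return sizes


# Hypothetical values: 1000 animals, 80% survival, 200 net births per interval.
sizes = expected_population_sizes(1000, [0.8] * 5, [200] * 5)
```

With these values the deaths in each interval are balanced by the net births, so the expected population size stays at 1000 throughout.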
The family of Jolly-Seber (J-S) models includes:
(a) Full model: the model with all fundamental parameters possibly varying
with each i. This model often gives poor estimate precision because of the large number of parameters.
(b) Closure models: where some or all of the phi_i are known to be 1.0
("no-death") or where some or all of the B_i are known to be 0 ("no-birth").
Such models can give enormous precision gains over the full model but are
often more sensitive to assumption failures (e.g. to tag loss, as shown in
Arnason & Mills 1981).
(c) Constant parameter models: where some or all of the p_i are assumed to
be equal (constant capture models); similarly where some or all of the phi_i
are assumed equal (constant survival models), although here it is necessary
to adjust for differences due to unequal inter-sample intervals (Jolly 1982).
These models also confer higher precision, especially the constant
capture models.
(d) Generalized models: where additional parameters are introduced to model
age or previous capture history effects on the phi_i and p_i
(Pollock 1975, 1981).
These models include group-effect models where some but not all parameters may
differ between 2 (or more) attribute groups. For example, the males and
females in a population may be equally catchable, but their survival rates
might differ. If all parameters differ, the attribute groups should simply
be analysed separately, but often the question of biological interest is
whether group differences exist and, if so, to estimate their magnitude
(Burnham et al. 1987), and answering these questions requires analysing the
groups together.
(e) Covariate models: where auxiliary variables are used to model the
fundamental parameters; for example, sampling effort at t_i might be used to
explain p_i (Pollock et al. 1984) or weather variables might be used to
explain phi_i (Clobert & Lebreton 1985).
As in (d), it is useful to have covariate models
where some or all of the auxiliary variable coefficients are shared across
attribute groups. As in (c), use of covariate models can greatly improve
precision (Clobert & Lebreton 1985).
All classes of models (a-e) can currently be applied to the survival and
capture components using the SURGE program (Lebreton et al. 1992), although many
of the analyses in SURGE assume that samples are taken at equally spaced times.
SURGE handles release-recapture models, called Cormack-Jolly-Seber (CJS)
models, for use when estimates of the marked fraction in the sample are not
available or are not unbiased for the marked fraction in the population. In
this case, estimates of abundance N_i and net births B_i are not available.
POPAN Organization
POPAN is essentially batch oriented in the way other large statistical systems
such as SAS or BMDP are:
the user prepares a Command file and (possibly)
a Raw Data file and submits a job which launches the statistical system.
This causes the Command file to be read, which in turn causes the data to be
read and analysed according to the directives in the Command file,
and produces output files:
- Binary Data files for subsequent analysis
- a Log file that reflects the user's commands and any syntactic or semantic
errors in the commands
- and a Results file of the analyses performed on the data and any execution
time error messages.
The Log and Results files can then be browsed and the process can be repeated
to carry out further analyses. The user can either re-use the raw data or
can use the saved binary version of the data. The latter is a bit faster and
saves the user re-specifying the metadata (information about the encoding of
the raw data supplied through the command input).
The Command file directives are in the form of paragraphs that start with a
paragraph name. Data manipulation paragraphs include CREATE, SELECT, and LIST.
Data analysis paragraphs include: STATISTICS, ANALYSIS, and TEST. There is
also a SIMULATE paragraph. Each paragraph is made up of a number of sentences
of the form Keyword=Keyword_value; where the keywords are reserved words
indicating an option or action choice for the current paragraph, and
keyword_value supplies user information or choices for that action. We discuss
the capabilities of each of these paragraphs and give several examples below.
The Raw Data file contains animal histories in the form
Id A1 A2 ... T X1 X2 ... XT
where:
- Id
- is an individual animal identifier (e.g. tag number) or a group count of
the number of animals with this history.
- Ai
- are (up to 8) single character attribute codes. For example, A1 may be
coded 'M' or 'F' for sex, A2 might be coded '1', '2', ... for year cohort,
etc. Metadata about attribute names, codes and labels are provided through
keywords and stored in a header to the binary data file. The metadata can be
accessed by other paragraphs or printed later.
- T
- is a count of the number of capture occasions.
- Xi
- is a (variable length) list of the T capture times. Allowance is made for
injections (animals known not to be in the population prior to X1) and losses
on capture (known not to be alive after XT).
For example, with grouped counts, the history: 3 F2 2 1 -4
indicates that there were 3 female, cohort 2 animals that were seen twice,
in samples 1 and 4 and were lost on capture at sample 4.
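To make the raw history layout concrete, here is a minimal Python sketch that parses one history line such as the example above. It is not POPAN's reader. It assumes whitespace-separated fields, attribute codes run together in a single field (as in 'F2'), integer sample numbers as capture times, and a negative final time as the loss-on-capture flag, as the example suggests. Injections and blank attribute codes are not handled. The names History and parse_history are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class History:
    ident: str             # animal ID, or the group count for grouped histories
    attributes: tuple      # single-character attribute codes, e.g. ('F', '2')
    times: tuple           # capture times with the loss flag (sign) removed
    lost_on_capture: bool  # True if the last capture is flagged as a loss


def parse_history(line, n_attributes):
    """Parse 'Id A1A2.. T X1 .. XT' under the assumptions stated above."""
    fields = line.split()
    ident, attr_field, count = fields[0], fields[1], int(fields[2])
    if len(attr_field) != n_attributes:
        raise ValueError("expected %d attribute codes" % n_attributes)
    raw_times = [int(x) for x in fields[3:]]
    if len(raw_times) != count:
        raise ValueError("T does not match the number of capture times")
    # Assumed convention: a negative final time marks a loss on capture.
    lost = bool(raw_times) and raw_times[-1] < 0
    return History(ident, tuple(attr_field), tuple(abs(t) for t in raw_times), lost)


example = parse_history("3 F2 2 1 -4", n_attributes=2)
```

For the example line this yields a group count of '3', attributes ('F', '2'), capture times (1, 4) and the loss-on-capture flag set.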
POPAN-PC, the IBM/PC implementation of POPAN-3, has a full-screen interactive
user-interface that lets you carry out all the actions needed to run a POPAN
job. You can generate a customized form for raw data entry and have the data
checked interactively as they are entered. When the data are saved, the
appropriate CREATE paragraph is generated for you, and you can run it to create
the Binary Data file. You can compose a Command file by filling in paragraph
templates, save it, and run it. When the job is complete, you can retrieve
the output files and browse through them using full-screen scrolling and send
them to your printer. You can retrieve a saved Command file, re-edit it, and
repeat this cycle as often as you wish. The user-interface also lets you
manage and keep track of your files. Help prompts are displayed at every step.
The interface can be run from DOS or Windows (Windows only in POPAN-4).
POPAN-3
We discuss the capabilities of POPAN-3 grouped by function, giving a
representative example of the command syntax for each. These examples show only a
fraction of the complete set of possibilities. Further examples are given
in the manuals (Arnason & Baniuk 1978; Arnason & Schwarz 1987).
1 Data management (CREATE, SELECT, LIST)
CREATE provides the directives for converting raw data to a binary file and
recording the metadata on sample times (t_i) and attributes (number of Ai
and their codes and labels). Example 1 shows a CREATE paragraph. POPAN performs
thorough checks of the raw data to ensure consistency with these metadata.
SELECT, in its simplest form, specifies the binary file to be used for all
subsequent LIST or data analysis paragraphs (section 2). But it also includes
a very general ability to select out data subsets based on attribute and
history conditions and the ability to remap histories by specifying the
omission or grouping of sample times. Examples are given in
Example 1.
LIST allows the user to list the SELECTed contents of binary files in various
sorted orders and formats. It also provides a way to convert binary files
back to raw history format, and to convert individual histories into grouped
histories. In this case, of course, the individual IDs are lost.
2 Data Analysis (STATISTICS, ANALYSIS, TEST)
STATISTICS provides the user with a very general statistics gathering
capability controlled by a formal but natural syntax. For example, the
number of unmarked captures at sample i that are returned to the population
is given by "firstseen at (i) and not lost at (i)". This phrase causes a
count to be made of the number of SELECTed animals that satisfy this condition
for each of the (possibly re-mapped) sample times i=1,...,K. Results for up
to L=14 statistics phrases can be gathered in a single pass of the data file
and are presented in a K by L table along with user-supplied labels and
descriptions. Example 2 gives more complex examples
of statistics phrases.
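The effect of a statistics phrase can be pictured with a short Python sketch. The function below counts, for each sample i, the animals that satisfy "firstseen at (i) and not lost at (i)". It is only an illustration of the counting rule, not POPAN's phrase interpreter. The tuple layout (capture times, loss-on-capture flag, group count) is an assumption of the sketch.

```python
def count_unmarked_returned(histories, n_samples):
    """Count animals first seen at sample i and not lost at sample i.

    histories -- iterable of (times, lost_on_capture, group_count), where
                 times is an increasing tuple of sample numbers (1-based)
    """
    counts = [0] * n_samples
    for times, lost_on_capture, group_count in histories:
        first, last = times[0], times[-1]
        # A loss on capture always refers to the last capture in the history.
        lost_at_first = lost_on_capture and last == first
        if not lost_at_first:
            counts[first - 1] += group_count
    return counts


# Hypothetical grouped histories for a K=4 sample experiment.
histories = [
    ((1, 4), True, 3),   # first seen at 1, lost at 4: counted at sample 1
    ((2,), True, 1),     # first seen and lost at 2: not counted
    ((2, 3), False, 5),  # first seen at 2, never lost: counted at sample 2
]
counts = count_unmarked_returned(histories, n_samples=4)
```

Each statistics phrase yields one such column of K counts. Gathering several phrases in one pass gives the K by L table described above.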
ANALYSIS provides pre-programmed ("black-box") analysis of all closure
(Jolly 1965) and constant parameter
(Jolly 1982) models; that is, all combinations of
- losses = present/absent/constant (allowing for unequal inter-sample
intervals)
- births = present/absent
- constant capture probability = present/absent
Note that this includes 2 closed models (M_0 and M_t of Otis et al. 1978).
Also note that the closure and constant parameter models are not selective
about sample times: they apply to all times or to none.
All analyses allow for losses on capture. In addition, all analyses may
have Reduced Capture History pooling applied. This is a moving average
smoothing technique that was first proposed by Jolly (pers. comm.) and
investigated by Kreger (1973).
It has recently been shown to be approximately unbiased by
Hargrove & Borland (1995) and robust to some forms of heterogeneity in
p_i and phi_i (M. Efford, pers. comm.).
Output from ANALYSIS is in the form of 2 tables: A K by L Statistics Table
with the L statistics automatically chosen appropriate to the analysis, and
a K row Estimate Table with columns for the estimates and their estimated
standard errors (SEs). For the full and closure models, estimates are
corrected for small sample bias and inadmissible estimates are flagged and
reset (to avoid computation problems in the SE computations). For the
constant parameter models, inadmissible estimates are left as is. Proper handling of inadmissible estimates by constrained likelihood is achieved in
POPAN-4.
STATISTICS and ANALYSIS can be used together to carry out general 2 by 2
contingency table tests. Most of the goodness-of-fit tests and tests for
assumption failures in J-S models can be cast in this form
(Pollock et al. 1990).
The user specifies the statistics phrases defining the counts in each of the
4 cells of the contingency table and then requests an ANALYSIS: NUM=39; which
will use the saved statistics definitions to compute the cell counts (and put
in the Statistics table) and test statistics and significance (Estimate table).
Example 2 gives an example.
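For readers who want to see the arithmetic behind a 2 by 2 contingency table test, the following Python sketch computes the usual Pearson chi-square statistic (1 degree of freedom, no continuity correction) from four cell counts given in row order. Whether POPAN's NUMBER=39 analysis applies a correction or reports additional quantities is not stated here, so treat this as a generic illustration. The cell counts are made up.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the 2 by 2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no
    continuity correction.
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a zero margin makes the test undefined")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical cell counts in row order (e.g. M.1, Z.1, M.2, Z.2).
statistic, p_value = chi_square_2x2(10, 20, 30, 40)
```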
TEST allows the user to carry out hierarchical fits of the log-linear models of
Cormack (1989). This was the primary means, in POPAN-3,
of carrying out model selection using the likelihood ratio criterion from
among virtually the full range of models (a-e).
It also provided residual plotting and some unique models of dependency in capture probability
that are important for testing assumptions and fit.
There are some problems with the TEST analyses:
- Storage cost is proportional to 2**K, limiting it to small numbers of
samples (K<10).
- No SEs were computed for the estimates because no general model-independent
algorithm was available.
- The model is formulated using somewhat complicated functions of the
fundamental parameters, making it difficult or impossible to impose some
constraints on the fundamental parameters.
For these reasons, extensions to include covariate models and models across
attribute groups would have been difficult in POPAN. These
problems are largely resolved in POPAN-4.
3 Simulation of sampling experiments (SIMULATE)
SIMULATE lets the user generate sample histories by simulation of a population
governed by user-specified entry rates, capture, survival, loss-on-capture,
and tag-loss rates. Mechanisms are fully stochastic and may be fixed or
varying across sample times and/or animals. Mechanisms are available for
causing temporary emigration of animals. Further generality is permitted
through group (in POPAN-4) and age cohort generation mechanisms that allow
group and sample history dependencies in rates. Simulated histories can be
output as a binary file, or an ANALYSIS or TEST paragraph can be invoked
from within SIMULATE to produce the Statistics and Estimate tables for that
analysis. A simulation can be replicated up to 999 times in which case each
table is replaced by 2 tables: one of means and one of standard deviations
over (valid) replications. Control of the initial seed of the random number
generator allows populations to be re-generated exactly to compare results of
different analyses.
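The idea of a seeded, fully stochastic simulation can be sketched in a few lines of Python. This is a deliberately reduced stand-in for SIMULATE, not a reimplementation: it assumes a single cohort present at the first sample, constant capture and survival probabilities, and no births, losses on capture, tag loss or temporary emigration. All names and values are hypothetical.

```python
import random


def simulate_histories(n_initial, n_samples, capture_prob, survival_prob, seed):
    """Generate capture histories for one cohort of n_initial animals.

    Returns a list of tuples of sample numbers at which each animal was
    caught; animals that were never caught are omitted, as they would be
    unobserved in a real experiment.
    """
    rng = random.Random(seed)  # private generator so runs are reproducible
    histories = []
    for _ in range(n_initial):
        captures = []
        alive = True
        for i in range(1, n_samples + 1):
            if not alive:
                break
            if rng.random() < capture_prob:
                captures.append(i)
            # Survival to the next sample is decided after sampling at i.
            if i < n_samples and rng.random() >= survival_prob:
                alive = False
        if captures:
            histories.append(tuple(captures))
    return histories


run_a = simulate_histories(200, 6, 0.4, 0.8, seed=385674511)
run_b = simulate_histories(200, 6, 0.4, 0.8, seed=385674511)
```

Because both runs use the same seed they produce identical histories, which is the property that lets different analyses be compared on exactly the same simulated population.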
Simulations can be run in which all the assumptions of the chosen analysis
are satisfied. These simulations will reveal precision and small-sample bias
of the estimates. They are useful in planning sampling experiments to
determine the number of samples and allocation of sampling effort needed to
obtain satisfactory precision in populations of given (guesstimated) size
and turn-over rates. Simulations can also be run where assumptions are
deliberately violated in known ways, either through choice of non-homogeneous
mechanisms, forbidden mechanisms (e.g. tag loss, temporary emigration) or
through choice of an incorrect model for data analysis (e.g. a too restrictive
closure model). Such simulations can be used to estimate the resulting bias
and loss of precision due to such violations. Because the 2 by 2 Chi-Square
test is implemented as an ANALYSIS (NUMBER= 39) and puts its results in the
Estimate table, SIMULATE can also be used to investigate the power of a
previously specified analysis (Example 2).
POPAN-4
POPAN-4 adds general hierarchical model selection of a very wide class of
models. The models are based on a new unified formulation of general
Jolly-Seber models developed by one of us (CJS). The main features of this
formulation are (Schwarz & Arnason 1995):
- The model is formulated in terms of the logits (log odds) of the
fundamental parameters p_i, phi_i, and a new parameter b_i. The new birth
parameters, b_i, are the net B_i normalized to sum to 1 (i=0,...,K-1). The
parameter for the initial population size, B_0, permits derivation of the
B_i from the b_i. This formulation makes it
very easy to obtain estimates of the fundamental parameters and their SEs
from the model parameters. It is the fundamental parameters that are of
interest to the biologist. (Some of the parameters, such as B_0, may be
confounded with other parameters and hence are not estimable in some models.)
A derived parameter for Gross Births (BG_i) can be obtained from the B_i
and phi_i. This parameter is of
considerable interest to fisheries biologists who use J-S models to estimate
total run sizes of salmon, but gross estimates and their SEs have not
previously been available (Schwarz et al. 1993).
- The model likelihood can be factored in a way that permits very efficient
iterative searches for the maximum likelihood estimates by successive searches
through each parameter subspace before attempting a global search for all
(3K-1) parameters. This allows reliable convergence of the search for models
with large K, and storage costs are proportional to K**2.
- Variances (and covariances) for the model parameters are computed
numerically using standard likelihood theory (the negated Hessian matrix of
second partial derivatives of the log-likelihood with respect to the
parameters is the Information matrix, and its inverse estimates the
variance-covariance matrix) and are transformed to SEs for the fundamental
parameters using the delta method and appropriate unconditioning techniques.
- Very general constraints on the parameters can be imposed in the numerical
optimization using the method of Lagrange multipliers. Because there is a
simple and direct relationship between fundamental parameters and model
parameters, constraints on the fundamental parameters can be translated
(automatically, in POPAN-4) into constraints on the model parameters.
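The two transformations in this formulation, the logit and the normalization of net births into proportions, can be illustrated with a small Python sketch. The function names are hypothetical. The net birth values are those used in the SIMULATE paragraph of Example 3, so the resulting proportions can be checked against the comment there. Gross births are not covered.

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(x):
    """Map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def birth_proportions(net_births):
    """Normalize net births B_0..B_{K-1} so that they sum to 1."""
    total = float(sum(net_births))
    return [b / total for b in net_births]


def net_births_from_proportions(proportions, initial_size):
    """Recover the B_i from the b_i, given B_0 (the initial population size)."""
    total = initial_size / proportions[0]
    return [p * total for p in proportions]


net = [800, 160, 280, 320, 280, 160]   # B_0..B_5 as in Example 3
b = birth_proportions(net)             # 0.4, 0.08, 0.14, 0.16, 0.14, 0.08
recovered = net_births_from_proportions(b, initial_size=800)
```

Working on the logit scale keeps probabilities inside (0, 1) during a numerical search, and the round trip through inverse_logit returns the fundamental parameter itself.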
The implementation in POPAN-4 provides both automatic constraints and very
general user-specified constraints through the UFIT paragraph. Constraints
may be imposed on any of the 3 parameter types and constraints may be
within-group (e.g. applied to all animals: there are 3K-1 parameters before
constraints) or across groups (e.g. parameters for females constrained to equal
those for males: there are G(3K-1) parameters for populations made up of G
groups). Redundant or contradictory constraints will generally produce a
clear error warning ("singular matrix").
UFIT prints out the maximized likelihood and the number of restrictions used
in the fit. This permits the user to use likelihood ratio or Akaike
Information Criterion (AIC) methods for model selection. The UFIT paragraph
is fully integrated with SIMULATE: A UFIT paragraph is defined and its keywords
SAVEd for a future SIMULATE (Example 3).
The SIMULATE then specifies ANALYSIS = UFIT causing the user specified model
fit to be applied to each replicated population. When UFIT is used to analyse
actual rather than simulated data, it is preceded by a SELECT.
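The maximized likelihood and the number of restrictions can be combined by hand, as in this Python sketch. It assumes that each restriction removes exactly one free parameter from the G(3K-1) total and ignores non-identifiable parameters, which is a simplification. The log-likelihood values are invented for illustration and are not UFIT output.

```python
def n_free_parameters(k_samples, n_groups=1, n_restrictions=0):
    """G(3K-1) parameters minus the restrictions imposed (assumed one each)."""
    return n_groups * (3 * k_samples - 1) - n_restrictions


def aic(log_likelihood, n_parameters):
    """Akaike Information Criterion: smaller values indicate a better trade-off."""
    return -2.0 * log_likelihood + 2.0 * n_parameters


def likelihood_ratio(log_lik_general, log_lik_restricted):
    """Likelihood ratio statistic for a restricted model nested in a general one."""
    return 2.0 * (log_lik_general - log_lik_restricted)


# Invented log-likelihoods for a K=5, two-group experiment.
general = {"loglik": -100.0, "params": n_free_parameters(5, n_groups=2)}
restricted = {"loglik": -103.0,
              "params": n_free_parameters(5, n_groups=2, n_restrictions=5)}

lr_statistic = likelihood_ratio(general["loglik"], restricted["loglik"])
lr_df = general["params"] - restricted["params"]
aic_general = aic(general["loglik"], general["params"])
aic_restricted = aic(restricted["loglik"], restricted["params"])
```

The likelihood ratio statistic would be referred to a chi-square distribution with lr_df degrees of freedom, while the AIC values can be compared directly across models that are not nested.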
We discuss the simpler no-group-effects implementation first.
1 Non-group constraints
Constraints are specified for each of the parameter types using keywords
CPCONST (on Capture Probabilities p_i), SPCONST (on Survival Probabilities
phi_i) or BPCONST (on Birth Proportions b_i).
The keyword_values are the same for each of these and allow the following
constraints, specified as multiple contrasts.
- Constant constraints (Pi - value) where value is a constant or a second
parameter.
- The first form (Pi - c), where value is a constant, allows selective
closure at any time: for example
SPCONST = (P3-1)(P5-1); BPCONST = (P1 - 0)(P2 - 0)(P3 - 0) ;
specifies no losses between samples 3 and 4 and samples 5 and 6 and no new
entries between samples 1 through 4. Constraints like the second can be
shortened to (P1:P3 - 0) using range notation. POPAN-4 adds constraints
automatically during the search for any parameter that wanders off towards
inadmissible values and automatically adds the constraint to normalize the
b_i. Constrained parameters have an
estimated SE of 0 whether they are constrained deliberately, because there is
real knowledge about the parameter, or automatically because the unconstrained
estimates just happened to be inadmissible. Ideally, one would like the latter
situation to reflect some imprecision in the estimates.
The second form, (Pi - Pj) where value is a second parameter, allows selective
equality of parameters at different sample times i,j. For example, in a K=5
sample experiment CPCONST = (P1:P4 - P5) is equivalent to the Jolly-Dickson
constant capture probability model (Jolly 1982), but
selective constraints like CPCONST = (P1-P2)(P4-P5) are also allowed and may
be necessary to resolve the non-identifiability of p_1 and p_K. Equality
constraints on phi_i and b_i may not be meaningful when
sample times are unequally spaced.
Keywords ADJUST=YES/NO and BIRTHS = NET/GROSS can be used to have POPAN
modify the constraints appropriately. For example, with ADJUST = YES the
constraint SPCONST = (P1 - P2) is imposed as
phi_1**(1/d_1) = phi_2**(1/d_2)
where 1/d_i is the inverse of the time interval between samples i and i+1.
- Covariate constraints P-(C1, C2, ..CL)
- The user can define up to 9 covariates which are real valued vectors of
auxiliary variables associated with each sample time. Polynomial covariate
models of up to quadratic terms in the covariates may then be defined for each
fundamental parameter or its logit using a fairly compact notation. The
notation C0 indicates the constant vector (for the intercept) and Cxy indicates
the term-by-term product of vectors x and y. In
Example 3, two covariates have been defined (keywords
C1= and C2=). The constraint equation
SPCON = P - (C1,C2)
fits the model: phi_i = beta_1 C1_i + beta_2 C2_i; similarly,
SPCON = LOGITP - (C0, C1,C11)
fits the model: logit(phi_i) = beta_1 + beta_2 C1_i + beta_3 (C1_i)**2. The ADJUST keyword can
be used to allow for unequal sample time spacing. Covariate models on net
births are allowed but implementation for gross births proved too complicated.
Covariate models are implemented by allowing the first L model parameters to
be freely optimized in the numeric search. At each iteration, these L
parameters are used to solve for current values of the beta coefficients and
these values are used to impose constraints on the remaining model parameters
(as illustrated by the 2 constraint equations above involving the betas). At
convergence, estimates for the betas and their SEs can be derived from the
first L model parameter estimates and their variance-covariance matrix.
These estimates are added to the Estimate Table.
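The covariate notation can be checked numerically against Example 3. The Python sketch below evaluates a linear combination of covariate vectors, using the C1 and C2 values and the coefficient values quoted in the comments of that example (0.02 for capture, 1/8 for survival). The helper name linear_predictor is hypothetical, and the code only evaluates a model for given coefficients. It does not fit anything.

```python
def linear_predictor(coefficients, covariate_columns):
    """Return sum_j beta_j * C_j[i] for each sample time i."""
    n = len(covariate_columns[0])
    if any(len(col) != n for col in covariate_columns):
        raise ValueError("all covariate vectors must have the same length")
    return [
        sum(beta * col[i] for beta, col in zip(coefficients, covariate_columns))
        for i in range(n)
    ]


c1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
c2 = [5.0, 2.0, 1.0, 1.0, 2.0, 5.0]
c0 = [1.0] * len(c1)                  # constant vector for the intercept
c12 = [x * y for x, y in zip(c1, c2)] # term-by-term product of C1 and C2

# CPCONST = P - (C0,C1,C2,C12) with every coefficient equal to 0.02
capture = linear_predictor([0.02] * 4, [c0, c1, c2, c12])

# SPCONST = P - (C1,C2) with both coefficients equal to 1/8
survival = linear_predictor([0.125, 0.125], [c1, c2])
```

The capture values reproduce the CMECHANISM vector of Example 3 (0.24, 0.18, 0.16, 0.20, 0.36, 0.84). The survival values reproduce SMECHANISM except for the last entry, where the linear model gives 1.375, which is not a valid probability. This is presumably why the example uses 0.99 there and notes "except last".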
2 Across-group constraints
Up to 9 groups can be specified in the UFIT paragraph using keywords
G1=,...,G9=; the keyword_value for these is the same as for the ATTRIBUTE=
keyword in SELECT (Example 1). Constraints within and
across groups are then specified using a group prefix in the constraint
keywords.
As a specific example, suppose we are interested in comparisons of males and
females in a K=5 sample experiment and that attribute 1 defines the sex of each
animal as in Example 1. There are now a total of
2(3K-1)=28 parameters, all of which are assumed to differ between groups and
across sample times unless constraints are specified. The Statistics and
Estimate tables will now each have 2K=10 rows, the first K for Group 1 (males)
and the next K for Group 2 (females). We would first specify
NGROUP=2; G1 = (A1 .eq. 'M'); G2 = (A1 .eq. 'F');
Constant constraints can be specified for either group: e.g. no losses of
females:
SPCONST = (G2P1 - 1)(G2P2 - 1)(G2P3 - 1)(G2P4 - 1) ;
but this can be shortened to SPCONST = (G2P1:P4 - 1) ; The range notation
P1:P4 implies a vector of 4 parameters. Vectors can be equated to single
values or term by term to another vector of equal length. When NGROUP is
specified, a group prefix must be used on all contrasts.
Equality constraints can be applied in various ways across groups. The
examples below are applied to capture rates.
- (a) Temporal effect and group effect (pt*g):
- no constraints (this is the default)
- (b) No temporal effect, group effect (pg):
- CPCONST = (G1P1:P4 - G1P5)(G2P1:P4 - G2P5) ;
- (c) Temporal effect, no group effect (pt):
- CPCONST = (G1P1:P5 - G2P1:P5) ;
- (d) No temporal effect or group effect (p):
- CPCONST = (G1P1:P4 - G1P5)(G2P1:P5 - G1P5) ;
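To see how many individual restrictions each of these specifications implies, the range notation can be expanded mechanically. The Python sketch below is a hypothetical helper, not POPAN's parser. It handles only constant and equality contrasts of the forms shown above (not covariate constraints) and assumes constants are non-negative numbers.

```python
import re

# An optional group prefix, a parameter index, and an optional range end.
_TERM = re.compile(r"^(G\d+)?P(\d+)(?::(?:G\d+)?P(\d+))?$")


def expand_term(term):
    """Expand 'G1P1:P4' to ['G1P1', ..., 'G1P4']; a number becomes [float]."""
    term = term.replace(" ", "")
    match = _TERM.match(term)
    if match is None:
        return [float(term)]
    group = match.group(1) or ""
    start = int(match.group(2))
    end = int(match.group(3)) if match.group(3) else start
    return [f"{group}P{i}" for i in range(start, end + 1)]


def expand_constraints(spec):
    """Turn '(G1P1:P4 - G1P5)(...)' into a list of (left, right) pairs."""
    pairs = []
    for contrast in re.findall(r"\(([^()]*)\)", spec):
        left, right = contrast.split("-", 1)
        lhs, rhs = expand_term(left), expand_term(right)
        if len(rhs) == 1:
            rhs = rhs * len(lhs)  # a single value is equated to every term
        if len(lhs) != len(rhs):
            raise ValueError("vectors in a contrast must have equal length")
        pairs.extend(zip(lhs, rhs))
    return pairs


model_b = expand_constraints("(G1P1:P4 - G1P5)(G2P1:P4 - G2P5)")
model_c = expand_constraints("(G1P1:P5 - G2P1:P5)")
model_d = expand_constraints("(G1P1:P4 - G1P5)(G2P1:P5 - G1P5)")
```

With K=5 and two groups there are 10 capture probabilities. The expansions give 8, 5 and 9 restrictions for models (b), (c) and (d), leaving 2, 5 and 1 free capture parameters respectively, as the model names suggest.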
Covariate constraints also use the Group prefix notation to indicate group
effects on specific terms of the covariate model. For example:
SPCONST = P - (C0, G1:G2C1) ;
specifies a linear covariate model where the intercepts are different for
males and females but the slopes are the same. Here the range notation in
the Group prefix is used to indicate equality constraints on the covariate
coefficients across groups. As before, the user can specify that equality and
covariate constraints be ADJUSTed to account for unequal sample time interval
effects on survival, and to constrain gross rather than net births. Unlike
SURGE, POPAN does not explicitly model age effects, but if samples are
taken annually and if the file contains an attribute for age class at initial
release, then this syntax does permit modelling of age effects on capture
and survival rates.
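The effect of ADJUSTing survival constraints for unequal intervals can be illustrated in Python. The sketch converts between survival over an interval of length d and survival per unit time, using the relation phi**(1/d) described for ADJUST = YES. The sample times are those of Example 1. The daily survival of 0.9 and the function names are assumptions made for the illustration.

```python
def intervals(sample_times):
    """Inter-sample intervals d_i = t_{i+1} - t_i."""
    return [later - earlier
            for earlier, later in zip(sample_times, sample_times[1:])]


def interval_survival(phi_per_unit, interval):
    """Survival over an interval, given survival per unit time."""
    return phi_per_unit ** interval


def per_unit_survival(phi, interval):
    """Survival per unit time, given survival over an interval."""
    return phi ** (1.0 / interval)


times = [1, 2, 2.5, 7, 10.5, 15, 15.5, 16, 19, 21]  # SVALUES of Example 1
d = intervals(times)

# Hypothetical constant survival of 0.9 per day.
phi = [interval_survival(0.9, d_i) for d_i in d]
adjusted = [per_unit_survival(p, d_i) for p, d_i in zip(phi, d)]
```

The interval survivals in phi differ widely because the intervals do, but the adjusted values are all 0.9. This is why an equality constraint on survival is imposed on the per-unit-time scale when sample times are unequally spaced.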
3 Current state of POPAN-4
At the time of writing (January 1995) POPAN-4 has been implemented with the
non-group constraints described for paragraph UFIT (section 1). The syntax
for the group constraints has been implemented and the workability of group
constraints has been tested in a stand-alone version. The current SUN
workstation version of POPAN-4 is ready for release once the manuals are
completed. An update will follow in 1995 when group constraints have been
fully implemented. A version that runs under Windows for IBM/PCs has been
successfully tested and will be available in 1995 when the POPAN-PC interface
has been updated. A version for OS/2 for PCs has also been developed and will
be available at the same time as the SUN version.
Design aims of POPAN
Some of the main design aims throughout the development of POPAN have been:
- (a) Comprehensive data
- The data for a population should be kept together along with appropriate
metadata. The program can then give the user options for analysing various
data subsets, or for selecting out and pooling subsets of the sample times.
This is a top-down approach in which all the data is organised together, and
then reduced by exploratory analyses and data manipulations.
- (b) Comprehensive analyses
- Provide both simple "black box" analyses and more customizable methods.
Provide criteria for model selection and assessment of fit and tests for
assumption failures.
- (c) Orthogonality
- Everything works with everything else: for example any analysis can be
invoked with SIMULATE; every analysis allows for losses-on-capture and can
be used with the RCH pooling method; within a paragraph, all combinations of
keyword_value choices should lead to meaningful choices.
- (d) Realism and reliability
- The software must allow for the awkwardness of real data. Animals are lost
on capture; sample times are not always at equally spaced times; allowance
must be made for large numbers of small samples and over-parameterization,
poor precision, inadmissible estimates and numeric problems. The analysis
must automatically protect against problems resulting from inconvenient sample
results: null samples can occur when selecting sample sub-sets, samples may
have no marked animals, etc. It is also important that analyses used
in simulations be bullet-proof; if the analyses don't anticipate and recover
from fatal errors, then a simulation can fail. This is particularly annoying
if it happens on the last of 900 replications.
To the best of our knowledge the following major features are unique to POPAN:
- General group sub-selection based on attributes; general methods for
sample time omission and pooling.
- The ability to gather almost any statistic based on capture histories
and to use these to construct contingency table tests.
- A non-parametric smoothing technique that can be applied to all analyses.
- A very general simulation capability.
- The most flexible model customizing and fitting procedure (UFIT) available for Jolly-Seber type models.
COMMAND examples
Example 1 of POPAN-3 command paragraphs creates a binary file from
raw data with 2 attributes and 10 sample times and analyzes the male sub-group
and then the full set of animals but with capture histories re-mapped.
CREATE:
NAME = 'two-sex population, grouped histories' ;
BEGIN = 1 ; END = 10; ID = GROUP ;
INPUT = 11; SAVE = ASIS ; DATASET = 12 ;
C specify (unequally spaced) sample times (days) for samples 1...10
SVALUES = (1, 2, 2.5, 7, 10.5, 15, 15.5, 16, 19, 21) ;
C specify 2 attributes with 3 and 2 codes and give their names and values
ANUM = 2(3,2); ALIST = SEX, AGE
AVALUES = SEX (M) 'Male' (F) 'Female' ( ) 'Undetermined'
AGE (1) 'Juvenile' (2) 'Adult' ;
C turn on range checking of raw history capture times and attribute codes
TCHECK = RANGE; ACHECK = YES; /
SELECT:
TITLE = 'Selecting out adult males' ; INPUT = 12;
ATTRIBUTE = (A1 .eq. 'M' .and. A2 .eq. '2') ; /
ANALYSIS:
TITLE = 'first analysis...no births, constant survival per day' ;
DILUTION = ABSENT ; LOSSES = FIXED ; /
SELECT:
TITLE = 'Selecting all animals...reduced to 5 sample times';
INPUT = 12;
OMIT = (1, 10); GROUP = (2,3),(6:8); /
ANALYSIS:
TITLE = 'second analysis...same as first but specified by number' ;
NUMBER = 8; /
Example 2 of POPAN-3 command paragraphs uses STATISTICS and
ANALYSIS: Number=39 from within SIMULATE to investigate the power of a
goodness-of-fit test (Pollock et al. 1990, Fig 4.2) to detect trap avoidance.
To investigate the bias in the J-S Full model estimates produced by the trap
avoidance, the SIMULATE paragraph can be run alone with just a change to the
specified analysis (ANAL = 1 instead of 39).
STATISTICS:
TITLE = 'Stats for second component goodness-of-fit' ;
NUMBER = 4; SAVE = STAT; SCAN = FULL ;
C give the column symbol, verbal description, and formal condition for each
C of the 4 cells of the 2 by 2 table in row order
SYM = ' M.1 ' ; DES = 'early marks...recaptured now' ;
CON = 'FIRSTSEEN BEFORE (I -1) AND SEEN AT (I) ;
SYM = ' Z.1 ' ; DES = 'early marks...recaptured later' ;
CON = 'FIRSTSEEN BEFORE (I -1) AND NOT SEEN AT (I)
AND SEEN AFTER (I) ;
SYM = ' M.2 ' ; DES = 'recent marks...recaptured now' ;
CON = 'FIRSTSEEN AT (I -1) AND SEEN AT (I) ;
SYM = ' Z.2 ' ; DES = 'recent marks...recaptured later' ;
CON = 'FIRSTSEEN AT (I -1) AND NOT SEEN AT (I)
AND SEEN AFTER (I) ; /
SIMULATE:
TITLE = 'stable population with trap avoidance' ;
C allow 6 sample times (weeks) in a stable population of 1000 animals
C and weekly survival of 80%
LSEL = 6 ; NEWENT = ENDOGENOUS(1000) ; SMECH = FIXED (0.8) ;
C all births enter as age 0 animals and age by 1 at each capture...age 0
C animals (unmarked) are captured at rate 40%, marked (age >0) at 25%.
AMECH = FIXED(0); ATYPE = ENDOGENOUS ;
CMECH = AGEDEP, DISCRETE, 2, (0,0.40),(1, 0.25) ;
C run parameters control number of reps, amount of output, random seed
RUN = YES ; REP = 100 ; WRITE = NO ; SEED = 385674511 ;
C analyse using previously defined STATISTICS and a 2-tailed test at the
C 1%, 5% and 10% significance level.
SIZE = 1(1,5,10) ; ANAL = 39 ; /
Example 3 is a POPAN-4 UFIT paragraph that fits covariate models
for each of the mechanisms. This fitting procedure is applied to the data
generated in each replicate population of the SIMULATE which follows. The
mechanisms specified in SIMULATE are chosen to generate a correct model of
the type UFIT is attempting to fit and so all estimates and SEs should be
unbiased.
UFIT:
C test that covariate constraints work
LSEL = 6;
C1 = (1.0, 2.0, 3.0, 4.0, 5.0, 6.0) ;
C2 = (5.0, 2.0, 1.0, 1.0, 2.0, 5.0);
C try with =P and =LOGITP
CPCONST = P - (C0,C1,C2,C12);
C try with =P and =LOGITP and with ADJUST = YES or NO
SPCONST = P - (C1,C2); ADJUST = YES;
C try with =P and =LOGITP
BPCONST = P - (C0, C1, C11) ;
ANALYSIS = 9; SAVE = UFIT;
TITLE = 'Covariate models for all parameters'; /
SIMULATE:
LSEL = 6; REPLICATIONS = 100;
TITLE = 'SIMULATION TEST - CALLING UFIT FROM SIMULATE';
C CP true values for all covariate coefficients = 0.02
CMECHANISM = VECTOR(0.24, 0.18, 0.16, 0.20, 0.36, 0.84);
C SP true values for all covariate coefficients = 1/8 (except last)
SMECHANISM = VECTOR(0.75, 0.5, 0.5,0.625,0.875, 0.99);
C NE true model is -40 + 240 C1 - 40 C1*C1
C NET Ntot = 2000 and b0....b5=0.4, 0.08, 0.14, 0.16, 0.14, 0.08
NEWENTRIES = VECTOR(800, 160, 280, 320, 280, 160);
SEED = 123456789; RUN = YES; WRITE = NO ;
ANALYSIS = UFIT; SAVE = NO ; /
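The true values given in the SIMULATE comments can be checked against the
covariates C1 and C2 defined in the UFIT paragraph. The Python sketch
below is an arithmetic check only. It assumes that C12 denotes the
product C1*C2 and C11 the square C1*C1, and that the quadratic for new
entries lines up with the second to sixth NEWENTRIES values (the first
value, 800, being the initial population); that alignment is inferred
from the numbers and is not stated in the listing.

```python
c1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
c2 = [5.0, 2.0, 1.0, 1.0, 2.0, 5.0]

# Capture: coefficients of C0, C1, C2 and C12 all equal to 0.02.
capture = [0.02 * (1 + a + b + a * b) for a, b in zip(c1, c2)]

# Survival: no intercept, coefficients of C1 and C2 both 1/8.
# The last value exceeds 1, which is why the listing uses 0.99 there.
survival = [(a + b) / 8 for a, b in zip(c1, c2)]

# New entries: -40 + 240 C1 - 40 C1*C1.
entries_model = [-40 + 240 * a - 40 * a * a for a in c1]

# Values taken from the SIMULATE paragraph.
new_entries = [800, 160, 280, 320, 280, 160]
n_total = sum(new_entries)
fractions = [x / n_total for x in new_entries]

if __name__ == "__main__":
    print("capture  ", [round(x, 4) for x in capture])
    print("survival ", survival)
    print("entries  ", entries_model)
    print("Ntot     ", n_total)
    print("fractions", [round(x, 4) for x in fractions])
```

The capture and survival values reproduce the CMECHANISM and SMECHANISM
vectors (apart from the truncated final survival), and the entry
fractions sum to 1 with Ntot = 2000.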
Acknowledgements
Work on POPAN has been supported by individual operating grants to each of
the authors from the Natural Sciences and Engineering Research Council of
Canada and by a Canada Department of Fisheries and Oceans/NSERC subvention
grant to the authors jointly. We wish to acknowledge the major programming
contribution to POPAN-4 made by Gord Boyer. We also thank Christopher
Lapkowski, Ben Li and Chris Kirby for recent contributions to POPAN-PC
and POPAN-3.
References
Arnason, A. N., & Baniuk, L. (1978) POPAN-2: A data maintenance and
analysis system for mark-recapture data (Box 272, St. Norbert,
Manitoba, Charles Babbage Research Centre).
Arnason, A. N., Miller, D. W. & Lapkowski, C. (1992) POPAN-PC:
Installation and user's manual for running POPAN-3 on IBM PC
microcomputers under DOS or Windows 3 (Box 272, St. Norbert,
Manitoba, Charles Babbage Research Centre).
Arnason, A. N. & Mills, K.H. (1981) Bias and loss of precision due to
tag loss in Jolly-Seber estimates for mark-recapture experiments,
Canadian Journal of Fisheries and Aquatic Sciences, 38, pp. 1077-1095.
Arnason, A. N. & Schwarz, C.J. (1987) POPAN-3: Extended analysis and
testing features for POPAN-2 (Box 272, St. Norbert, Manitoba,
Charles Babbage Research Centre).
Burnham, K. P., Anderson, D.R., White, G.C., Brownie, C., & Pollock,
K. H. (1987) Design and analysis methods for fish survival
experiments based on release-recapture (Bethesda MD, American
Fisheries Society Monograph Number 5).
Clobert, J. & Lebreton, J. D. (1985) Dépendance de facteurs de milieu
dans les estimations de taux de survie par capture-recapture,
Biometrics, 41, pp. 1031-1037.
Cormack, R. M. (1989) Log-linear models for capture-recapture,
Biometrics, 45, pp. 395-413.
Hargrove, J. W. & Borland, C. W. (1994) Pooled population parameter
estimates from mark-recapture data, To appear: Biometrics, 50.
Jolly, G. M. (1965) Explicit estimates from capture-recapture data
with both death and immigration - stochastic model,
Biometrika, 52, pp. 225-247.
Jolly, G. M. (1982) Mark-recapture models with parameters constant
in time, Biometrics, 38, pp. 301-321.
Kreger, N.S. (1973) A simulation study of Jolly's estimates for
animal populations when sampling intensity is low (Winnipeg,
Manitoba, Department of Computer Science, M.Sc. thesis).
Lebreton, J.-D., Burnham, K. P., Clobert, J. & Anderson, D. R.
(1992) Modeling survival and testing biological hypotheses
using marked animals: a unified approach with case studies,
Ecological Monographs, 62, pp. 67-118.
Otis, D. L., Burnham, K. P., White, G. C. & Anderson, D. R.
(1978) Statistical inference from capture data on closed
animal populations, Wildlife Monograph No. 62, pp. 1-135.
Pollock, K. H. (1975) A K-sample tag-recapture model allowing for
unequal survival and catchability, Biometrika, 62, pp. 577-584.
Pollock, K. H. (1981) Capture-recapture models allowing for age-dependent
survival and capture rates, Biometrics, 37, pp. 521-529.
Pollock, K. H., Hines, J. E., & Nichols, J. D. (1984) The use of
auxiliary variables in capture-recapture and removal experiments,
Biometrics, 40, pp. 329-340.
Pollock, K. H., Nichols, J. D., Brownie, C., & Hines, J. E. (1990)
Statistical inference for capture-recapture experiments,
Wildlife Monograph No. 107, pp. 1-97.
Schwarz, C. J. & Arnason, A. N. (1995) A general methodology for the
analysis of capture-recapture experiments in open populations.
Submitted to: Biometrics.
Schwarz, C. J., Bailey, R.E., Irvine, J. R., & Dalziel, F. C. (1993)
Estimating salmon spawning escapement using capture-recapture methods,
Canadian Journal of Fisheries and Aquatic Sciences, 50, pp. 1181-1197.
Seber, G. A. F. (1965) A note on the multiple-recapture census,
Biometrika, 52, pp. 249-259.
Seber, G. A. F. (1982) The estimation of animal abundance and related
parameters (Second edition, London, Griffin).
POPAN Maintainer: Gord Boyer (gboyer@cs.umanitoba.ca)