Jolly-Seber Models

The essential features of Jolly-Seber experiments (Jolly 1965; Seber 1965, 1982 Chapter 5) are:

The population is sampled on K > 2 occasions, designated as sample i, i=1,...,K. Samples need not be equally spaced in time, but if they are not, the times between samples, d_i = t_(i+1) - t_i, should be known. Animals are captured at each sample, i, and are returned marked to the population. Marks may uniquely identify the animal and be applied at its first capture occasion, or batch marks identifying sample time may be used, but in the latter case, the batch mark must be re-applied each time an animal is captured. Animals that are killed in sample i or are, for any other reason, not returned to the population after sample i, are called losses on capture at i. At the end, the capture history of each animal is known or (with batch marks) the number of animals with any given history of captures is known.
The population is assumed to be open; that is, subject to "births" (additions to the catchable population through recruitment, immigration, birth) and "deaths" (through any mechanism of loss of animals from the catchable population; deaths do not include known losses through loss on capture).
The fundamental parameters of the Jolly-Seber model are:

p_i
the capture probability at sample i.
phi_i
the survival probability to sample i+1 for animals alive just after sample i. Note that the time interval, d_i, affects this parameter.
B_i
the number of "net births" between samples i and i+1: i.e. the number of animals that enter the catchable population during this interval and are still there at the end of the interval. B₀ can be understood to be the initial population size. As we shall see later, however, this is not the only way to parameterize the birth process.
Derived parameters of interest are the population size, N_i , and the number of marked animals alive, M_i.
The main assumptions of the Jolly-Seber method are:
- no mark loss and correct identification of marks.
- homogeneity of capture probability for all animals alive just before sample i.
- homogeneity of survival for all animals in the population just after sample i.
Some of the tests for, and effects of, assumption failures of various types are reviewed by Pollock et al. (1990). It is important that data analysis include careful and thorough testing of these assumptions.

The family of Jolly-Seber (J-S) models includes:

(a) Full model: the model with all fundamental parameters possibly varying with each i. This model often gives poor estimate precision because of the large number of parameters.

(b) Closure models: where some or all of the phi_i are known to be 1.0 ("no-death") or where some or all of the B_i are known to be 0 ("no-birth"). Such models can give enormous precision gains over the full model but are often more sensitive to assumption failures (e.g. to tag loss, as shown in Arnason & Mills 1981).

(c) Constant parameter models: where some or all of the p_i are assumed to be equal (constant capture models); similarly where some or all of the phi_i are assumed equal (constant survival models), although here it is necessary to adjust for differences due to unequal inter-sample intervals (Jolly 1982). These models also confer higher precision, especially the constant capture models.

(d) Generalized models: where additional parameters are introduced to model age or previous capture history effects on the phi_i and p_i (Pollock 1975, 1981). These models include group-effect models where some but not all parameters may differ between 2 (or more) attribute groups. For example, the males and females in a population may be equally catchable, but their survival rates might differ. If all parameters differ, the attribute groups should simply be analysed separately. Often, however, the question of biological interest is whether group differences exist and, if so, what their magnitude is (Burnham et al. 1987); answering these questions requires analysing the groups together.

(e) Covariate models: where auxiliary variables are used to model the fundamental parameters; for example, sampling effort at t_i might be used to explain p_i (Pollock et al. 1984) or weather variables might be used to explain phi_i (Clobert & Lebreton 1985). As in (d), it is useful to have covariate models where some or all of the auxiliary variable coefficients are shared across attribute groups. As in (c), use of covariate models can greatly improve precision (Clobert & Lebreton 1985).

All classes of models (a-e) can currently be applied to the survival and capture components using the SURGE program (Lebreton et al. 1992), although many of the analyses in SURGE assume that samples are taken at equally spaced times. SURGE handles release-recapture models, called Cormack-Jolly-Seber (CJS) models, for use when estimates of the marked fraction in the sample are not available or are not unbiased for the marked fraction in the population. In this case, estimates of abundance N_i and net births B_i are not available.

POPAN Organization

POPAN is essentially batch oriented in the way other large statistical systems such as SAS or BMDP are: the user prepares a Command file and (possibly) a Raw Data file and submits a job which launches the statistical system. This causes the Command file to be read, which in turn causes the data to be read and analysed according to the directives in the Command file, and produces output files:

Binary Data files for subsequent analysis
a Log file that reflects the user's commands and any syntactic or semantic errors in the commands
and a Results file of the analyses performed on the data and any execution time error messages.

The Log and Results files can then be browsed and the process can be repeated to carry out further analyses. The user can either re-use the raw data or use the saved binary version of the data. The latter is a bit faster and saves the user re-specifying the metadata (information about the encoding of the raw data supplied through the command input).

The Command file directives are in the form of paragraphs that start with a paragraph name. Data manipulation paragraphs include CREATE, SELECT, and LIST. Data analysis paragraphs include: STATISTICS, ANALYSIS, and TEST. There is also a SIMULATE paragraph. Each paragraph is made up of a number of sentences of the form Keyword=Keyword_value; where the keywords are reserved words indicating an option or action choice for the current paragraph, and keyword_value supplies user information or choices for that action. We discuss the capabilities of each of these paragraphs and give several examples below.

The Raw Data file contains animal histories in the form Id A₁ A₂... T X₁ X₂ ... X_T where:

Id: is an individual animal identifier (e.g. tag number) or a group count of the number of animals with this history.
A_i: are (up to 8) single character attribute codes. For example, A1 may be coded 'M' or 'F' for sex, A2 might be coded '1', '2', ... for year cohort, etc. Metadata about attribute names, codes and labels are provided through keywords and stored in a header to the binary data file. The metadata can be accessed by other paragraphs or printed later.
T: is a count of the number of capture occasions.
X_i: is a (variable length) list of the T capture times. Allowance is made for injections (animals known not to be in the population prior to X1) and losses on capture (known not to be alive after X_T).

For example, with grouped counts, the history: 3 F2 2 1 -4 indicates that there were 3 female, cohort 2 animals that were seen twice, in samples 1 and 4 and were lost on capture at sample 4.
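
As a rough illustration of this record layout, the following Python sketch parses one grouped-history line. It assumes whitespace-separated fields, attribute codes packed into a single token, integer sample numbers, and a negative final capture time marking a loss on capture (as in the example above). The function name `parse_history` and the dictionary keys are hypothetical and are not part of POPAN, whose actual input format is governed by the CREATE metadata; in particular, a blank attribute code would need fixed-column parsing instead.

```python
def parse_history(line, n_attributes=2):
    """Parse one raw history line of the form: Id A1A2... T X1 ... XT."""
    fields = line.split()
    ident, codes = fields[0], fields[1]
    if len(codes) != n_attributes:
        raise ValueError(f"expected {n_attributes} attribute codes, got {codes!r}")
    n_captures = int(fields[2])
    raw_times = [int(x) for x in fields[3:]]
    if len(raw_times) != n_captures:
        raise ValueError(f"expected {n_captures} capture times, got {len(raw_times)}")
    # Assumption: a negative final capture time flags a loss on capture.
    lost_on_capture = raw_times[-1] < 0
    return {
        "id_or_count": ident,                  # tag number or group count
        "attributes": list(codes),             # one single-character code each
        "times": [abs(t) for t in raw_times],  # sample numbers at which seen
        "lost_on_capture": lost_on_capture,
    }


record = parse_history("3 F2 2 1 -4")
print(record)
```

The length check against T mirrors, in a very small way, the kind of consistency checking of raw histories that the text attributes to POPAN.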

POPAN-PC, the IBM/PC implementation of POPAN-3, has a full-screen interactive user-interface that lets you carry out all the actions needed to run a POPAN job. You can generate a customized form for raw data entry and have the data checked interactively as they are entered. When the data are saved, the appropriate CREATE paragraph is generated for you, and you can run it to create the Binary Data file. You can compose a Command file by filling in paragraph templates, save it, and run it. When the job is complete, you can retrieve the output files and browse through them using full-screen scrolling and send them to your printer. You can retrieve a saved Command file, re-edit it, and repeat this cycle as often as you wish. The user-interface also lets you manage and keep track of your files. Help prompts are displayed at every step. The interface can be run from DOS or Windows (Windows only in POPAN-4).

POPAN-3

We discuss the capabilities of POPAN-3 grouped by function and giving a representative example of the command syntax. These examples show only a fraction of the complete set of possibilities. Further examples are given in the manuals (Arnason & Baniuk 1978; Arnason & Schwarz 1987).

1 Data management (CREATE, SELECT, LIST)

CREATE provides the directives for converting raw data to a binary file and recording the metadata on sample times (t_i) and attributes (number of A_i and their codes and labels); Example 1 illustrates its use. POPAN performs thorough checks of the raw data to ensure consistency with these metadata. SELECT, in its simplest form, specifies the binary file to be used for all subsequent LIST or data analysis paragraphs (section 2). It also includes a very general ability to select data subsets based on attribute and history conditions, and the ability to remap histories by specifying the omission or grouping of sample times. Both uses are shown in Example 1. LIST allows the user to list the SELECTed contents of binary files in various sorted orders and formats. It also provides a way to convert binary files back to raw history format, and to convert individual histories into grouped histories. In the latter case, of course, the individual IDs are lost.

2 Data Analysis (STATISTICS, ANALYSIS, TEST)

STATISTICS provides the user with a very general statistics gathering capability controlled by a formal but natural syntax. For example, the number of unmarked captures at sample i that are returned to the population is given by "firstseen at (i) and not lost at (i)". This phrase causes a count to be made of the number of SELECTed animals that satisfy this condition for each of the (possibly re-mapped) sample times i=1,...,K. Results for up to L=14 statistics phrases can be gathered in a single pass of the data file and are presented in a K by L table along with user-supplied labels and descriptions. Example 2 gives more complex examples of statistics phrases.
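
To make the counting rule concrete, here is a minimal Python sketch of what the phrase "firstseen at (i) and not lost at (i)" tallies. It is not POPAN code: the record layout (a dictionary with `count`, `times`, and `lost` keys) and the function name `unmarked_released` are assumptions made for illustration, and a loss on capture is assumed to occur at the last recorded capture time.

```python
def unmarked_released(histories, n_samples):
    """Count animals first seen at sample i and returned to the population, i = 1..K."""
    counts = [0] * n_samples
    for h in histories:
        first = h["times"][0]
        # A loss on capture can only happen at the final capture of a history.
        lost_at_first = h["lost"] and h["times"][-1] == first
        if not lost_at_first:
            counts[first - 1] += h.get("count", 1)
    return counts


# Hypothetical grouped histories for a K = 4 sample experiment.
histories = [
    {"count": 3, "times": [1, 4], "lost": True},   # first seen at 1, lost at 4
    {"count": 2, "times": [2], "lost": True},      # first seen and lost at 2
    {"count": 5, "times": [2, 3], "lost": False},  # first seen at 2, released
]
released = unmarked_released(histories, n_samples=4)
print(released)
```

Each statistics phrase yields one such column of K counts, which is how the K by L table described above is built up.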

ANALYSIS provides pre-programmed ("black-box") analysis of all closure (Jolly 1965) and constant parameter (Jolly 1982) models; that is, all combinations of

losses = present/absent/constant (allowing for unequal inter-sample intervals)
births = present/absent
constant capture probability = present/absent

Note that this includes 2 closed models (M₀ and M_t of Otis et al. 1978). Also note that the closure and constant parameter models are not selective about sample times: they apply to all times or to none. All analyses allow for losses on capture. In addition, all analyses may have Reduced Capture History (RCH) pooling applied. This is a moving average smoothing technique that was first proposed by Jolly (pers. comm.) and investigated by Kreger (1973). It has recently been shown to be approximately unbiased by Hargrove & Borland (1995) and robust to some forms of heterogeneity in p_i and phi_i (M. Efford, pers. comm.). Output from ANALYSIS is in the form of 2 tables: a K by L Statistics Table, with the L statistics automatically chosen as appropriate to the analysis, and a K row Estimate Table with columns for the estimates and their estimated standard errors (SEs). For the full and closure models, estimates are corrected for small sample bias and inadmissible estimates are flagged and reset (to avoid computation problems in the SE computations). For the constant parameter models, inadmissible estimates are left as is. Proper handling of inadmissible estimates by constrained likelihood is achieved in POPAN-4.

STATISTICS and ANALYSIS can be used together to carry out general 2 by 2 contingency table tests. Most of the goodness-of-fit tests and tests for assumption failures in J-S models can be cast in this form (Pollock et al. 1990). The user specifies the statistics phrases defining the counts in each of the 4 cells of the contingency table and then requests an ANALYSIS: NUM=39; which will use the saved statistics definitions to compute the cell counts (and put in the Statistics table) and test statistics and significance (Estimate table). Example 2 gives an example.
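
For readers who want to check such a table by hand, the Pearson chi-square statistic for a 2 by 2 table can be sketched in a few lines of Python. This is the textbook statistic without continuity correction, and it is an assumption that POPAN's ANALYSIS 39 computes the same form; the function name `chi_square_2x2` and the cell counts are hypothetical. Cells are given in row order, as in the M.1, Z.1, M.2, Z.2 layout of Example 2.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df) and its upper-tail p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero; the test is undefined")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


stat, p_value = chi_square_2x2(20, 10, 10, 20)
print(round(stat, 3), round(p_value, 4))
```

The explicit check for empty margins reflects the point made later in the text that null samples must be anticipated, especially when a test is run inside a simulation.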

TEST allows the user to carry out hierarchical fits of the log-linear models of Cormack (1989). This was the primary means, in POPAN-3, of carrying out model selection using the likelihood ratio criterion from among virtually the full range of models (a-e). It also provided residual plotting and some unique models of dependency in capture probability that are important for testing assumptions and fit.

There are some problems with the TEST analyses:

storage cost is proportional to 2**K, limiting it to small numbers of samples (K<10);
no SEs are computed for the estimates, because no general model-independent algorithm was available;
the model is formulated using somewhat complicated functions of the fundamental parameters, making it difficult or impossible to impose some constraints on the fundamental parameters.

For these reasons, extensions to include covariate models and models across attribute groups would have been difficult in POPAN. These problems are largely resolved in POPAN-4.

3 Simulation of sampling experiments (SIMULATE)

SIMULATE lets the user generate sample histories by simulation of a population governed by user-specified entry, capture, survival, loss-on-capture, and tag-loss rates. Mechanisms are fully stochastic and may be fixed or varying across sample times and/or animals. Mechanisms are available for causing temporary emigration of animals. Further generality is permitted through group (in POPAN-4) and age cohort generation mechanisms that allow group and sample history dependencies in rates. Simulated histories can be output as a binary file, or an ANALYSIS or TEST paragraph can be invoked from within SIMULATE to produce the Statistics and Estimate tables for that analysis. A simulation can be replicated up to 999 times, in which case each table is replaced by 2 tables: one of means and one of standard deviations over (valid) replications. Control of the initial seed of the random number generator allows populations to be re-generated exactly to compare results of different analyses.

Simulations can be run in which all the assumptions of the chosen analysis are satisfied. These simulations will reveal precision and small-sample bias of the estimates. They are useful in planning sampling experiments to determine the number of samples and allocation of sampling effort needed to obtain satisfactory precision in populations of given (guesstimated) size and turn-over rates. Simulations can also be run where assumptions are deliberately violated in known ways, either through choice of non-homogeneous mechanisms, forbidden mechanisms (e.g. tag loss, temporary emigration) or through choice of an incorrect model for data analysis (e.g. a too restrictive closure model). Such simulations can be used to estimate the resulting bias and loss of precision due to such violations. Because the 2 by 2 Chi-Square test is implemented as an ANALYSIS (NUMBER = 39) and puts its results in the Estimate table, SIMULATE can also be used to investigate the power of a previously specified analysis (Example 2).
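
The basic idea of a seeded, fully stochastic simulation can be sketched in Python as follows. This is a deliberately stripped-down illustration and not the SIMULATE algorithm: it assumes homogeneous capture and survival, treats entrants as net births (all present at the next sample), and omits losses on capture, tag loss, and temporary emigration. The function name `simulate_histories` and all parameter values are hypothetical.

```python
import random


def simulate_histories(n_initial, net_births, survival, capture, seed):
    """Return capture histories (lists of sample numbers) for animals seen at least once."""
    rng = random.Random(seed)               # same seed => same population regenerated
    n_samples = len(capture)
    alive = [[] for _ in range(n_initial)]  # one list of capture times per animal
    finished = []
    for i in range(n_samples):
        for animal in alive:                # capture step at sample i+1
            if rng.random() < capture[i]:
                animal.append(i + 1)
        if i == n_samples - 1:
            break
        survivors = []
        for animal in alive:                # survival step to the next sample
            (survivors if rng.random() < survival[i] else finished).append(animal)
        # Net births: entrants are added after the survival step.
        alive = survivors + [[] for _ in range(net_births[i])]
    finished.extend(alive)
    return [h for h in finished if h]


K = 6
settings = dict(n_initial=1000, net_births=[200] * (K - 1),
                survival=[0.8] * (K - 1), capture=[0.3] * K)
run_a = simulate_histories(seed=385674511, **settings)
run_b = simulate_histories(seed=385674511, **settings)
print(len(run_a), run_a == run_b)
```

Repeating such a run with different seeds and applying the same estimator or test to each replicate gives the means, standard deviations, and rejection rates from which bias, precision, and power are judged.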

POPAN-4

POPAN-4 adds general hierarchical model selection of a very wide class of models. The models are based on a new unified formulation of general Jolly-Seber models developed by one of us (CJS). The main features of this formulation are (Schwarz & Arnason 1995):

The model is formulated in terms of the logits (log odds) of the fundamental parameters p_i, phi_i, and a new parameter b_i. The new birth parameters, b_i, are the net B_i normalized to sum to 1 (i=0, ...,K-1). The parameter for the initial population size, B₀, permits derivation of the B_i from the b_i. This formulation makes it very easy to obtain estimates of the fundamental parameters and their SEs from the model parameters. It is the fundamental parameters that are of interest to the biologist. (Some of the parameters, such as B₀, may be confounded with other parameters and hence are not estimable in some models.) A derived parameter for Gross Births (BG_i) can be obtained from the B_i and phi_i. This parameter is of considerable interest to fisheries biologists who use J-S models to estimate total run sizes of salmon but gross estimates and their SEs have not previously been available (Schwarz et al. 1993).
The model likelihood can be factored in a way that permits very efficient iterative searches for the maximum likelihood estimates by successive searches through each parameter subspace before attempting a global search for all (3K-1) parameters. This allows reliable convergence of the search for models with large K and storage costs are proportional to K².
Variances (and covariances) for the model parameters are computed numerically using standard likelihood theory (the negated Hessian matrix of second partial derivatives of the log-likelihood with respect to the parameters is the observed Information matrix, and its inverse estimates the variance-covariance matrix) and are transformed to SEs for the fundamental parameters using the delta method and appropriate unconditioning techniques.
Very general constraints on the parameters can be imposed in the numerical optimization using the method of Lagrange multipliers. Because there is a simple and direct relationship between fundamental parameters and model parameters, constraints on the fundamental parameters can be translated (automatically, in POPAN-4) into constraints on the model parameters.
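
Two of the ingredients above, the normalization of net births to proportions and the back-transformation of a logit-scale estimate with a delta-method SE, can be illustrated with a short Python sketch. It is a simplified, single-parameter illustration under the usual first-order approximation SE(p) = p(1 - p) SE(logit p); it ignores covariances and the unconditioning step, and the function names are hypothetical rather than POPAN's. The net births used are the NEWENTRIES values of Example 3.

```python
import math


def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-x))


def birth_proportions(net_births):
    """Normalize net births B_0..B_(K-1) so that the b_i sum to 1."""
    total = sum(net_births)
    return [b / total for b in net_births]


def delta_method_se(logit_estimate, logit_se):
    """Back-transform a logit-scale estimate and approximate its SE."""
    p = inv_logit(logit_estimate)
    return p, p * (1.0 - p) * logit_se


b = birth_proportions([800, 160, 280, 320, 280, 160])
p_hat, p_se = delta_method_se(logit_estimate=0.0, logit_se=0.4)
print(b, p_hat, p_se)
```

Working on the logit scale keeps the search away from the boundaries 0 and 1, while the simple inverse transform is what makes it easy to report results in terms of the fundamental parameters.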

The implementation in POPAN-4 provides both automatic constraints and very general user-specified constraints through the UFIT paragraph. Constraints may be imposed on any of the 3 parameter types, and constraints may be within-group (e.g. applied to all animals: there are 3K-1 parameters before constraints) or across groups (e.g. parameters for females constrained to equal those for males: there are G(3K-1) parameters for populations made up of G groups). Redundant or contradictory constraints will generally produce a clear error warning ("singular matrix").

UFIT prints out the maximized likelihood and the number of restrictions used in the fit. This permits the user to use likelihood ratio or Akaike Information Criterion (AIC) methods for model selection. The UFIT paragraph is fully integrated with SIMULATE: A UFIT paragraph is defined and its keywords SAVEd for a future SIMULATE (Example 3). The SIMULATE then specifies ANALYSIS = UFIT causing the user specified model fit to be applied to each replicated population. When UFIT is used to analyse actual rather than simulated data, it is preceded by a SELECT.
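
Given the maximized log-likelihood and the number of restrictions, the two selection criteria mentioned here are simple to compute. The Python sketch below uses made-up log-likelihood values purely for illustration; the function names are hypothetical, and the count of free parameters is taken to be the unconstrained total (3K-1 for a single group) minus the number of restrictions.

```python
def aic(log_likelihood, n_free_parameters):
    """Akaike Information Criterion: smaller values indicate a preferred model."""
    return -2.0 * log_likelihood + 2.0 * n_free_parameters


def likelihood_ratio(log_lik_general, log_lik_restricted, extra_restrictions):
    """Likelihood ratio statistic and its degrees of freedom for nested models."""
    return 2.0 * (log_lik_general - log_lik_restricted), extra_restrictions


K = 5
n_total = 3 * K - 1                 # 14 parameters before constraints
# Hypothetical maximized log-likelihoods, for illustration only.
full = {"loglik": -250.0, "restrictions": 0}
constant_p = {"loglik": -252.5, "restrictions": 4}   # e.g. CPCONST = (P1:P4 - P5)

aic_full = aic(full["loglik"], n_total - full["restrictions"])
aic_const = aic(constant_p["loglik"], n_total - constant_p["restrictions"])
lr_stat, lr_df = likelihood_ratio(full["loglik"], constant_p["loglik"],
                                  constant_p["restrictions"] - full["restrictions"])
print(aic_full, aic_const, lr_stat, lr_df)
```

With these invented numbers the restricted model has the smaller AIC, and the likelihood ratio statistic of 5.0 on 4 degrees of freedom falls below the 5% chi-square critical value of 9.488, so both criteria would favour the constant capture model.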

We discuss the simpler no-group-effects implementation first.

1 Non-group constraints

Constraints are specified for each of the parameter types using keywords CPCONST (on Capture Probabilities p_i), SPCONST (on Survival Probabilities phi_i) or BPCONST (on Birth Proportions b_i). The keyword_values are the same for each of these and allow the following constraints specified as multiple contrasts.

Constant constraints (Pi - value) where value is a constant or a second parameter.

The first form (Pi - c), where value is a constant, allows selective closure at any time: for example
SPCONST = (P3-1)(P5-1); BPCONST = (P1 - 0)(P2 - 0)(P3 - 0) ;
specifies no losses between samples 3 and 4 and samples 5 and 6 and no new entries between samples 1 through 4. Constraints like the second can be shortened, to (P1:P3 - 0), using range notation. POPAN-4 adds constraints automatically during the search for any parameter that wanders off towards inadmissible values and automatically adds the constraint to normalize the b_i. Constrained parameters have estimated SE of 0 whether they are constrained deliberately, because there is real knowledge about the parameter, or automatically because the unconstrained estimates just happened to be inadmissible. Ideally, one would like the latter situation to reflect some imprecision in the estimates.

The second form, (Pi - Pj) where value is a second parameter, allows selective equality of parameters at different sample times i,j. For example, in a K=5 sample experiment CPCONST = (P1:P4 - P5) is equivalent to the Jolly-Dickson constant capture probability model (Jolly 1982), but selective constraints like CPCONST = (P1-P2)(P4-P5) are also allowed and may be necessary to resolve the non-identifiability of p₁ and p_K. Equality constraints on phi_i and b_i may not be meaningful when sample times are unequally spaced.

Keywords ADJUST=YES/NO and BIRTHS = NET/GROSS can be used to have POPAN modify the constraints appropriately. For example, with ADJUST = YES the constraint SPCONST = (P1 - P2) is imposed as phi₁**(1/d₁) = phi₂**(1/d₂), where d_i is the time interval between samples i and i+1; that is, survival per unit time is equated rather than survival per interval.
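
The effect of the interval adjustment can be seen in a small Python sketch. It uses the unequally spaced sample times from the SVALUES keyword of Example 1 and a hypothetical constant daily survival of 0.9; the variable names are illustrative only.

```python
sample_times = [1, 2, 2.5, 7, 10.5, 15, 15.5, 16, 19, 21]   # days, from Example 1
intervals = [t1 - t0 for t0, t1 in zip(sample_times, sample_times[1:])]

daily_survival = 0.9   # hypothetical constant survival per day

# Survival over each interval differs because the intervals differ ...
phi = [daily_survival ** d for d in intervals]

# ... but the per-unit-time rates phi_i ** (1 / d_i) are all equal.
per_day = [p ** (1.0 / d) for p, d in zip(phi, intervals)]

for d, p, r in zip(intervals, phi, per_day):
    print(f"d = {d:4.1f}   phi = {p:.4f}   per-day = {r:.4f}")
```

Equating the phi_i directly would be wrong here, since an interval of 4.5 days necessarily has lower survival than one of half a day even when the daily rate is constant.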

Covariate constraints P-(C1, C2, ..CL)

The user can define up to 9 covariates, which are real-valued vectors of auxiliary variables associated with each sample time. Polynomial covariate models of up to quadratic terms in the covariates may then be defined for each fundamental parameter or its logit using a fairly compact notation. The notation C0 indicates the constant vector (for the intercept) and Cxy indicates the term-by-term product of vectors x and y. In Example 3, two covariates have been defined (keywords C1= and C2=). The constraint equation
SPCON = P - (C1,C2)
fits the model: phi_i = beta₁ C1_i + beta₂ C2_i ; similarly,
SPCON = LOGITP - (C0, C1,C11)
fits the model: logit(phi_i) = beta₁ + beta₂ C1_i + beta₃ (C1_i)**2 . The ADJUST keyword can be used to allow for unequal sample time spacing. Covariate models on net births are allowed but implementation for gross births proved too complicated. Covariate models are implemented by allowing the first L model parameters to be freely optimized in the numeric search. At each iteration, these L parameters are used to solve for current values of the beta coefficients and these values are used to impose constraints on the remaining model parameters (as illustrated by the 2 constraint equations above involving the betas). At convergence, estimates for the betas and their SEs can be derived from the first L model parameter estimates and their variance-covariance matrix. These estimates are added to the Estimate Table.
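
One way to read the mechanism just described, for the two-covariate model on the untransformed scale, is sketched below in Python. The first L = 2 survival values are treated as free, the L by L linear system is solved for the coefficients, and the remaining values are then determined by the model. This is our illustrative interpretation, not POPAN's code; the helper `solve_2x2` is hypothetical, and the covariates and the two free values are taken from Example 3.

```python
def solve_2x2(a11, a12, a21, a22, y1, y2):
    """Solve [[a11, a12], [a21, a22]] @ [x1, x2] = [y1, y2] by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("covariates are collinear at the first two sample times")
    return (y1 * a22 - a12 * y2) / det, (a11 * y2 - a21 * y1) / det


C1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
C2 = [5.0, 2.0, 1.0, 1.0, 2.0, 5.0]

# Current values of the first L = 2 (free) survival parameters.
free_phi = [0.75, 0.5]

# Solve for the coefficients implied by the free parameters ...
beta1, beta2 = solve_2x2(C1[0], C2[0], C1[1], C2[1], free_phi[0], free_phi[1])

# ... then the model fixes the survival values at every sample time.
implied_phi = [beta1 * c1 + beta2 * c2 for c1, c2 in zip(C1, C2)]
print(beta1, beta2, implied_phi)
```

Note that on the untransformed scale nothing keeps an implied value inside [0, 1] (the last value here exceeds 1), which is one practical reason for offering the LOGITP form of the constraint.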

2 Across-group constraints

Up to 9 groups can be specified in the UFIT paragraph using keywords G1=,...,G9=; the keyword_value for these is the same as for the ATTRIBUTE= keyword in SELECT (Example 1). Constraints within and across groups are then specified using a group prefix in the constraint keywords.

As a specific example, suppose we are interested in comparisons of males and females in a K=5 sample experiment and that attribute 1 defines the sex of each animal as in Example 1. There are now a total of 2(3K-1)=28 parameters, all of which are assumed to differ between groups and across sample times unless constraints are specified. The Statistics and Estimate tables will now each have 2K=10 rows, the first K for Group 1 (males) and the next K for Group 2 (females). We would first specify
NGROUP=2; G1 = (A1 .eq. 'M'); G2 = (A1 .eq. 'F');
Constant constraints can be specified for either group: e.g. no losses of females:
SPCONST = (G2P1 - 1)(G2P2 - 1)(G2P3 - 1)(G2P4 - 1) ;
but this can be shortened to SPCONST = (G2P1:P4 - 1) ; The range notation P1:P4 implies a vector of 4 parameters. Vectors can be equated to single values or term by term to another vector of equal length. When NGROUP is specified, a group prefix must be used on all contrasts.

Equality constraints can be applied in various ways across groups. The examples below are applied to capture rates.

(a) Temporal effect and group effect (pt*g): no constraints (this is the default)
(b) No temporal effect, group effect (pg): CPCONST = (G1P1:P4 - G1P5)(G2P1:P4 - G2P5) ;
(c) Temporal effect, no group effect (pt): CPCONST = (G1P1:P5 - G2P1:P5) ;
(d) No temporal effect or group effect (p): CPCONST = (G1P1:P4 - G1P5)(G2P1:P5 - G1P5) ;
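
A quick way to check that a set of equality constraints produces the intended model is to count how many distinct capture parameters remain. The Python sketch below does this with a small union-find structure for the four models above (2 groups, K = 5). The representation of each contrast as a pair of (group, sample) labels and the function name `count_free_parameters` are assumptions for illustration; identifiability issues such as that of p₁ and p_K are ignored.

```python
def count_free_parameters(n_groups, n_samples, equalities):
    """Count distinct parameters left after imposing pairwise equality constraints."""
    parent = {(g, i): (g, i)
              for g in range(1, n_groups + 1) for i in range(1, n_samples + 1)}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    for a, b in equalities:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    return len({find(x) for x in parent})


K = 5
models = {
    "pt*g": [],
    "pg": [((1, i), (1, 5)) for i in range(1, 5)]
          + [((2, i), (2, 5)) for i in range(1, 5)],
    "pt": [((1, i), (2, i)) for i in range(1, 6)],
    "p": [((1, i), (1, 5)) for i in range(1, 5)]
         + [((2, i), (1, 5)) for i in range(1, 6)],
}
free = {name: count_free_parameters(2, K, eqs) for name, eqs in models.items()}
print(free)
```

The counts (10, 2, 5, and 1 capture parameters) match the number of unconstrained parameters minus the number of contrasts in each CPCONST sentence, which is the quantity needed for likelihood ratio or AIC comparisons.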

Covariate constraints also use the Group prefix notation to indicate group effects on specific terms of the covariate model. For example:
SPCONST = P - (C0, G1:G2C1) ;
specifies a linear covariate model where the intercepts are different for males and females but the slopes are the same. Here the range notation in the Group prefix is used to indicate equality constraints on the covariate coefficients across groups. As before, the user can specify that equality and covariate constraints be ADJUSTed to account for unequal sample time interval effects on survival, and to constrain gross rather than net births. Unlike SURGE, POPAN does not explicitly model age effects, but if samples are taken annually and if the file contains an attribute for age class at initial release, then this syntax does permit modelling of age effects on capture and survival rates.

3 Current state of POPAN-4

At the time of writing (January 1995) POPAN-4 has been implemented with the non-group constraints described for paragraph UFIT (section 1). The syntax for the group constraints has been implemented and the workability of group constraints has been tested in a stand-alone version. The current SUN workstation version of POPAN-4 is ready for release once the manuals are completed. An update will follow in 1995 when group constraints have been fully implemented. A version that runs under Windows for IBM/PCs has been successfully tested and will be available in 1995 when the POPAN-PC interface has been updated. A version for OS/2 for PCs has also been developed and will be available at the same time as the SUN version.

Design aims of POPAN

Some of the main design aims throughout the development of POPAN have been:

(a) Comprehensive data: The data for a population should be kept together along with appropriate metadata. The program can then give the user options for analysing various data subsets, or for selecting out and pooling subsets of the sample times. This is a top-down approach in which all the data is organised together, and then reduced by exploratory analyses and data manipulations.
(b) Comprehensive analyses: Provide both simple "black box" analyses and more customizable methods. Provide criteria for model selection and assessment of fit and tests for assumption failures.
(c) Orthogonality: Everything works with everything else: for example any analysis can be invoked with SIMULATE; every analysis allows for losses-on-capture and can be used with the RCH pooling method; within a paragraph, all combinations of keyword_value choices should lead to meaningful choices.
(d) Realism and reliability: The software must allow for the awkwardness of real data. Animals are lost on capture; sample times are not always at equally spaced times; allowance must be made for large numbers of small samples and over-parameterization, poor precision, inadmissible estimates and numeric problems. The analysis must automatically protect against problems resulting from inconvenient sample results: null samples can occur when selecting sample sub-sets, samples may have no marked animals, etc. It is also important that analyses used in simulations be bullet-proof; if the analyses don't anticipate and recover from fatal errors, then a simulation can fail. This is particularly annoying if it happens on the last of 900 replications.

To the best of our knowledge the following major features are unique to POPAN:

General group sub-selection based on attributes; general methods for sample time omission and pooling.
The ability to gather almost any statistic based on capture histories and to use these to construct contingency table tests.
A non-parametric smoothing technique that can be applied to all analyses.
A very general simulation capability.
The most flexible model customizing and fitting procedure (UFIT) available for Jolly-Seber type models.

COMMAND examples

Example 1 of POPAN-3 command paragraphs creates a binary file from raw data with 2 attributes and 10 sample times and analyzes the male sub-group and then the full set of animals but with capture histories re-mapped.

     CREATE:

	NAME = 'two-sex population, grouped histories' ;  
            BEGIN = 1 ;   END = 10;   ID = GROUP ;  
	INPUT = 11;    SAVE = ASIS ;    DATASET = 12 ;

C  specify (unequally spaced) sample times (days) for samples 1...10
	SVALUES = (1,  2,  2.5,  7,  10.5, 15, 15.5, 16, 19, 21) ;

C  specify 2 attributes with 3 and 2 codes and give their names and values 
	ANUM = 2(3,2);     ALIST = SEX, AGE ;
	AVALUES =	SEX (M) 'Male'  (F) 'Female'  (  ) 'Undetermined' 
			AGE (1) 'Juvenile'   (2) 'Adult' ;

C turn on range checking of raw history capture times and attribute codes
	TCHECK = RANGE;   ACHECK = YES; /

     SELECT:
	TITLE = 'Selecting out adult males' ;  INPUT = 12;
	ATTRIBUTE =  (A1 .eq. 'M' .and. A2 .eq. '2') ; /

     ANALYSIS:
	TITLE = 'first analysis...no births,  constant survival per day' ;
	DILUTION = ABSENT ;     LOSSES = FIXED ; /

    SELECT:
	TITLE = 'Selecting all animals...reduced to 5 sample times';
	INPUT = 12;
	OMIT = (1, 10);   GROUP = (2,3),(6:8); /

     ANALYSIS:
	TITLE = 'second analysis...same as first but specified by number' ;
	NUMBER = 8; /

Example 2 of POPAN-3 command paragraphs uses STATISTICS and ANALYSIS: NUMBER=39 from within SIMULATE to investigate the power of a goodness-of-fit test (Pollock et al. 1990, Fig 4.2) to detect trap avoidance. To investigate the bias in the J-S Full model estimates produced by the trap avoidance, the SIMULATE paragraph can be run alone with just a change to the specified analysis (ANAL = 1 instead of 39).

     STATISTICS:
	TITLE = 'Stats for second component goodness-of-fit' ;
	NUMBER = 4;	SAVE = STAT;	SCAN = FULL ;

C  give the column symbol, verbal description, and formal condition for each
C  of the 4 cells of the 2 by 2 table in row order

	SYM = '   M.1  ' ;      DES = 'early marks...recaptured now' ;
	CON = 'FIRSTSEEN BEFORE (I -1) AND SEEN AT (I)' ;

	SYM = '   Z.1  ' ;      DES = 'early marks...recaptured later' ;
	CON = 'FIRSTSEEN BEFORE (I -1) AND NOT SEEN AT (I) 
			AND SEEN AFTER (I)' ;

	SYM = '   M.2  ' ;      DES = 'recent marks...recaptured now' ;
	CON = 'FIRSTSEEN AT (I -1) AND SEEN AT (I)' ;

	SYM = '   Z.2  ' ;      DES = 'recent marks...recaptured later' ;
	CON = 'FIRSTSEEN AT (I -1) AND NOT SEEN AT (I) 
			AND SEEN AFTER (I)' ; /


     SIMULATE:
	TITLE = 'stable population with trap avoidance' ;

C  allow 6 sample times (weeks) in a stable population of 1000 animals
C  and weekly survival of 80%

	LSEL = 6 ;   NEWENT = ENDOGENOUS(1000) ;  SMECH = FIXED (0.8) ;

C  all births enter as age 0 animals and age by 1 at each capture...age 0 
C  animals (unmarked) are captured at rate 40%,  marked (age >0) at 25%.

	AMECH = FIXED(0);  ATYPE = ENDOGENOUS ;
	CMECH = AGEDEP, DISCRETE, 2, (0,0.40),(1, 0.25) ;

C  run parameters control number of reps, amount of output, random seed

	RUN = YES ;   REP = 100 ;  WRITE = NO ;  SEED = 385674511 ;

C  analyse using previously defined STATISTICS and a 2-tailed test at the
C  1%, 5% and 10% significance level.

	SIZE = 1(1,5,10) ;  ANAL = 39 ; /

Example 3 is a POPAN-4 UFIT paragraph that fits covariate models for each of the mechanisms. This fitting procedure is applied to the data generated in each replicate population of the SIMULATE which follows. The mechanisms specified in SIMULATE are chosen to generate a correct model of the type UFIT is attempting to fit and so all estimates and SEs should be unbiased.

       UFIT:
C  test that covariate constraints work
     LSEL = 6;
     C1 = (1.0, 2.0, 3.0, 4.0, 5.0, 6.0) ;
     C2 = (5.0, 2.0, 1.0, 1.0, 2.0, 5.0);

C try with =P and =LOGITP
     CPCONST = P - (C0,C1,C2,C12);

C try with =P and =LOGITP and with ADJUST = YES or NO
     SPCONST =  P - (C1,C2);        ADJUST = YES; 
     
C try with =P and =LOGITP
     BPCONST = P - (C0, C1, C11) ;

     ANALYSIS = 9;  SAVE = UFIT;
     TITLE = 'Covariate models for all parameters'; /

  
SIMULATE:

    LSEL = 6;       REPLICATIONS = 100;
    TITLE = 'SIMULATION TEST - CALLING UFIT FROM SIMULATE';

C  CP  true values for all covariate coefficients = 0.02
    CMECHANISM = VECTOR(0.24, 0.18, 0.16, 0.20, 0.36, 0.84);

C  SP  true values for all  covariate coefficients  = 1/8 (except last)
    SMECHANISM = VECTOR(0.75, 0.5, 0.5,0.625,0.875, 0.99);

C  NE  true model is  -40 + 240 C1 - 40 C1*C1  
C  NET Ntot = 2000  and b0....b5=0.4, 0.08, 0.14, 0.16, 0.14, 0.08
    NEWENTRIES = VECTOR(800, 160, 280, 320, 280, 160);

    SEED = 123456789;     RUN = YES;       WRITE = NO ;  
    ANALYSIS = UFIT;       SAVE = NO ; /
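
The true values quoted in the SIMULATE comments can be checked by hand against the covariates given in UFIT. The Python sketch below is such a check, not POPAN code. It assumes that C0 denotes an intercept term equal to 1, that C12 and C11 denote the products C1*C2 and C1*C1, that the last survival value is capped at 0.99 (the "except last" in the comment), and that the initial 800 entrants lie outside the covariate model, which is applied to the five intervals using the first five values of C1. These readings are inferred from the listed numbers rather than taken from the POPAN documentation.

```python
import math

c1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
c2 = [5.0, 2.0, 1.0, 1.0, 2.0, 5.0]

# Capture: every coefficient of (C0, C1, C2, C12) is 0.02.
cp = [0.02 * (1 + a + b + a * b) for a, b in zip(c1, c2)]

# Survival: coefficients of (C1, C2) are 1/8 with no intercept; cap at 0.99.
sp = [min((a + b) / 8, 0.99) for a, b in zip(c1, c2)]

# Net births: -40 + 240 C1 - 40 C1*C1 for the five intervals.
births = [-40 + 240 * a - 40 * a * a for a in c1[:5]]
new_entries = [800.0] + births            # 800 initial animals, then births
n_total = sum(new_entries)
entry_fractions = [b / n_total for b in new_entries]

print([round(x, 3) for x in cp])
print(sp)
print(new_entries, n_total)
print([round(x, 2) for x in entry_fractions])
```

The computed vectors agree with the CMECHANISM, SMECHANISM and NEWENTRIES values in the paragraph, which supports the readings of C0, C12 and C11 assumed here.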

Acknowledgements

Work on POPAN has been supported by individual operating grants to each of the authors from the Natural Sciences and Engineering Research Council of Canada and by a Canada Department of Fisheries and Oceans/NSERC subvention grant to the authors jointly. We wish to acknowledge the major programming contribution to POPAN-4 made by Gord Boyer. We also thank Christopher Lapkowski, Ben Li and Chris Kirby for recent contributions to POPAN-PC and POPAN-3.

REFERENCES

Arnason, A. N., & Baniuk, L. (1978) POPAN-2: A data maintenance and analysis system for mark-recapture data (Box 272, St. Norbert, Manitoba, Charles Babbage Research Centre).

Arnason, A. N., Miller, D. W. & Lapkowski, C. (1992) POPAN-PC: Installation and user's manual for running POPAN-3 on IBM PC microcomputers under DOS or Windows 3 (Box 272, St. Norbert, Manitoba, Charles Babbage Research Centre).

Arnason, A. N. & Mills, K.H. (1981) Bias and loss of precision due to tag loss in Jolly-Seber estimates for mark-recapture experiments, Canadian Journal of Fisheries and Aquatic Sciences, 38, pp. 1077-1095.

Arnason, A. N. & Schwarz, C.J. (1987) POPAN-3: Extended analysis and testing features for POPAN-2 (Box 272, St. Norbert, Manitoba, Charles Babbage Research Centre).

Burnham, K. P., Anderson, D.R., White, G.C., Brownie, C., & Pollock, K. H. (1987) Design and analysis methods for fish survival experiments based on release-recapture (Bethesda MD, American Fisheries Society Monograph Number 5).

Clobert, J. & Lebreton, J. D. (1985) Dépendance de facteurs de milieu dans les estimations de taux de survie par capture-recapture, Biometrics, 41, pp. 1031-1037.

Cormack, R. M. (1989) Loglinear models for capture-recapture, Biometrics, 45, pp. 395-413.

Hargrove, J. W. & Borland, C. W. (1994) Pooled population parameter estimates from mark-recapture data, To appear: Biometrics, 50.

Jolly, G. M. (1965) Explicit estimates from capture-recapture data with both death and immigration - stochastic model, Biometrika, 52, pp. 225-247.

Jolly, G. M. (1982) Mark-recapture models with parameters constant in time, Biometrics, 38, pp. 301-321.

Kreger, N.S. (1973) A simulation study of Jolly's estimates for animal populations when sampling intensity is low (Winnipeg, Manitoba, Department of Computer Science, M.Sc. thesis).

Lebreton, J.-D., Burnham, K. P., Clobert, J. & Anderson, D. R. (1992) Modeling survival and testing biological hypotheses using marked animals: a unified approach with case studies, Ecological Monographs, 62, pp. 67-118.

Otis, D. L., Burnham, K. P., White, G. C. & Anderson, D. R. (1978) Statistical inference from capture data on closed animal populations, Wildlife Monograph No. 62, pp. 1-135.

Pollock, K. H. (1975) A K-sample tag-recapture model allowing for unequal survival and catchability, Biometrika, 62, pp. 577-584.

Pollock, K. H. (1981) Capture-recapture models allowing for age-dependent survival and capture rates, Biometrics, 37, pp. 521-529.

Pollock, K. H., Hines, J. E., & Nichols, J. D. (1984) The use of auxiliary variables in capture-recapture and removal experiments, Biometrics, 40, pp. 329-340.

Pollock, K. H., Nichols, J. D., Brownie, C., & Hines, J. E. (1990) Statistical inference for capture-recapture experiments, Wildlife Monograph No. 107, pp. 1-97.

Schwarz, C. J. & Arnason, A. N. (1995) A general methodology for the analysis of capture-recapture experiments in open populations. Submitted to: Biometrics.

Schwarz, C. J., Bailey, R.E., Irvine, J. R., & Dalziel, F. C. (1993) Estimating salmon spawning escapement using capture-recapture methods, Canadian Journal of Fisheries and Aquatic Sciences, 50, pp. 1181-1197.

Seber, G. A. F. (1965) A note on the multiple-recapture census, Biometrika, 52, pp. 249-259.

Seber, G. A. F. (1982) The estimation of animal abundance and related parameters (Second edition, London, Griffin).

POPAN Maintainer: Gord Boyer -- gboyer@cs.umanitoba.ca