
Journal of Animal Science - Animal Production

Hierarchical Bayesian modeling of heterogeneous variances in average daily weight gain of commercial feedlot cattle1


This article in JAS

  1. Vol. 91 No. 6, p. 2910-2919
    Received: June 09, 2012
    Accepted: Feb 27, 2013
    Published: November 25, 2014

    2 Corresponding author(s):

  1. N. Cernicchiaro*,
  2. D. G. Renter*,
  3. S. Xiang,
  4. B. J. White and
  5. N. M. Bello 2
  1. Department of Diagnostic Medicine/Pathobiology, College of Veterinary Medicine, Kansas State University, Manhattan 66506
    Department of Statistics, College of Arts and Sciences, Kansas State University, Manhattan 66506
    Department of Clinical Sciences, Kansas State University, Manhattan 66506


Variability in ADG of feedlot cattle can affect profits, thus making overall returns more unstable. Hence, knowledge of the factors that contribute to heterogeneity of variances in animal performance can help feedlot managers evaluate risks and minimize profit volatility when making managerial and economic decisions in commercial feedlots. The objectives of the present study were to evaluate heteroskedasticity, defined as heterogeneity of variances, in ADG of cohorts of commercial feedlot cattle, and to identify cattle demographic factors at feedlot arrival as potential sources of variance heterogeneity, accounting for cohort- and feedlot-level information in the data structure. An operational dataset compiled from 24,050 cohorts from 25 U. S. commercial feedlots in 2005 and 2006 was used for this study. Inference was based on a hierarchical Bayesian model implemented with Markov chain Monte Carlo, whereby cohorts were modeled at the residual level and feedlot-year clusters were modeled as random effects. Forward model selection based on deviance information criteria was used to screen potentially important explanatory variables for heteroskedasticity at cohort- and feedlot-year levels. The Bayesian modeling framework was preferred as it naturally accommodates the inherently hierarchical structure of feedlot data whereby cohorts are nested within feedlot-year clusters. Evidence for heterogeneity of variance components of ADG was substantial and primarily concentrated at the cohort level. Feedlot-year specific effects were, by far, the greatest contributors to ADG heteroskedasticity among cohorts, with an estimated ∼12-fold change in dispersion between most and least extreme feedlot-year clusters. In addition, identifiable demographic factors associated with greater heterogeneity of cohort-level variance included smaller cohort sizes, fewer days on feed, and greater arrival BW, as well as feedlot arrival during summer months. 
These results support that heterogeneity of variances in ADG is prevalent in feedlot performance and indicate potential sources of heteroskedasticity. Further investigation of factors associated with heteroskedasticity in feedlot performance is warranted to increase consistency and uniformity in commercial beef cattle production and subsequent profitability.


Accurate assessment of feedlot cattle performance and its response to management strategies is essential to the profitability of beef production systems. Linear mixed models are commonly used in data analysis to evaluate the effects of management practices on mean performance of feedlot cattle (Babcock et al., 2009; Cernicchiaro et al., 2012). One critical assumption of these statistical methods is that variance components are homogeneous and constant across managerial environments (Kutner et al., 2005). However, numerous studies indicate that unequal or heterogeneous variances, also referred to as heteroskedasticity, are not uncommon in livestock production systems (SanCristobal-Gaudy et al., 2001; Cardoso et al., 2005, 2007; Gernand et al., 2007). Fig. 1 illustrates the concept of heterogeneous variances between groups for a performance outcome of interest.

Figure 1.

Illustrative demonstration of heterogeneity of variances and of mean performance in ADG between hypothetical groups of feedlot cattle. Depicted is the distribution of ADG for 3 hypothetical groups of cattle, representative of 3 levels of a fixed effect factor (i.e., managerial practices or demographic factors). The comparison of Group 1 and Group 2 shows a situation where groups have different mean performance but similar variability around their means (i.e., homogeneous variances). In contrast, Group 1 and Group 3 share the same mean, yet Group 3 has greater dispersion around that mean than Group 1 (i.e., heteroskedasticity or heterogeneity of variances).


Ignoring heterogeneous variances, when present, has been shown to lead to inefficient and possibly misleading (i.e., biased) inference on treatment mean differences of interest (Wiggans and Vanraden, 1991; Wolfinger, 1996). Furthermore, identifying sources of heteroskedasticity is likely to offer additional benefits because it can inform management decisions aimed at enhancing consistency of feedlot yield, thus allowing for effective risk management strategies associated with profit variability (Belasco et al., 2009b). Here, heteroskedasticity in feedlot performance is explored using a hierarchical Bayesian framework (Kizilkaya and Tempelman, 2005) that specifies variances as functions of explanatory variables of interest.

The objectives of the study are 1) to assess heteroskedasticity in ADG of feedlot cattle using operational data from commercial feedlots fitted with a hierarchical Bayesian modeling approach and 2) to identify demographic factors at arrival as potential sources of variance heterogeneity, accounting for cohort and feedlot levels of information in the data structure.


Animal Care and Use Committee approval was not obtained for this study because the data were obtained from a preexisting database.

Data Description

The dataset used in this study included information from cattle cohorts compiled from commercial feedlots located in the central and southern high plains of the United States. These data were retrieved from a feedlot management software program through a central database. The existing database included information collected routinely from multiple feedlots on cohort management characteristics, performance, and demographic information. These data were subsequently exported and reformatted in SAS (SAS Inst. Inc., Cary, NC) for analysis. A cohort (i.e., lot) was defined as a group of cattle that were acquired, managed, and marketed similarly, but not necessarily housed in the same physical location (i.e., pen) during the feeding period.

The dataset was refined to consist only of cattle arriving at the commercial feedlots during 2005 and 2006. To be included in the study, cohorts were required to have a minimum cohort size of 50 cattle and a mean arrival BW of at least 181 kg per animal. Also, cohorts were required to be fully composed of either heifers or steers; mixed gender and Holstein cohorts were excluded from analyses. Cohorts recorded to have an ADG greater than 4 kg or spending more than 450 d on feed were also excluded from analysis. In total, these cohort-level criteria eliminated 11.1% (n = 2,998) of the cohorts present in the original dataset. After these edits, data from 24,050 cattle cohorts from 25 commercial feedlots were available for analysis.
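As an illustration only, the cohort-level inclusion criteria listed above could be expressed as a single filter. The record layout and field names below are hypothetical and do not reflect the structure of the feedlot management database used in the study.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    # Hypothetical record layout; field names are illustrative assumptions.
    size: int                  # number of cattle at arrival
    mean_arrival_bw_kg: float  # mean arrival BW per animal, kg
    gender: str                # "steer", "heifer", or "mixed"
    is_holstein: bool
    adg_kg: float              # cohort-level ADG, kg
    days_on_feed: int
    arrival_year: int


def is_eligible(c: Cohort) -> bool:
    """Return True if the cohort meets the inclusion criteria stated in the text."""
    return (
        c.arrival_year in (2005, 2006)          # arrivals during 2005 and 2006 only
        and c.size >= 50                        # minimum cohort size of 50 cattle
        and c.mean_arrival_bw_kg >= 181         # mean arrival BW of at least 181 kg
        and c.gender in ("steer", "heifer")     # mixed-gender cohorts excluded
        and not c.is_holstein                   # Holstein cohorts excluded
        and c.adg_kg <= 4                       # ADG greater than 4 kg excluded
        and c.days_on_feed <= 450               # more than 450 d on feed excluded
    )
```

Applying such a filter to the original dataset would correspond to the step that removed 11.1% of cohorts, although the actual edits were performed in SAS.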

To study feedlot performance, cohort-level ADG per animal (expressed in kg) was calculated as the difference between the total BW of the cohort at the end of the finishing period and its total BW on arrival at the feedlot, divided by the product of the total number of cattle at arrival and the number of days on feed (DOF). This calculation incorporates all animals ever present in the cohort, including cattle that died (“deads in”), and uses this formula:

$$\mathrm{ADG} = \frac{\text{total cohort BW at end of finishing} - \text{total cohort BW at arrival}}{\text{number of cattle at arrival} \times \mathrm{DOF}}$$

For this study, the clustering effect of a feedlot was defined within a particular calendar year, thereby yielding 50 feedlot-year clusters, which may be considered equivalent to the concept of contemporary groups commonly modeled in the context of other production systems (Kadarmideen et al., 2003; Tsuruta et al., 2004; Bello et al., 2012).
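The "deads in" ADG calculation and the feedlot-year clustering described above can be sketched as follows. This is a minimal illustration; the function and argument names are assumptions, and the use of arrival year to define the calendar year of a cluster is also an assumption.

```python
def cohort_adg(total_bw_out_kg, total_bw_in_kg, n_cattle_at_arrival, days_on_feed):
    """Cohort-level ADG per animal (kg), 'deads in' basis.

    The denominator uses the head count at arrival, so animals that died
    during the feeding period still count toward the cohort.
    """
    if n_cattle_at_arrival <= 0 or days_on_feed <= 0:
        raise ValueError("head count and days on feed must be positive")
    return (total_bw_out_kg - total_bw_in_kg) / (n_cattle_at_arrival * days_on_feed)


def feedlot_year_cluster(feedlot_id, year):
    """Cluster key: one cluster per feedlot within a calendar year."""
    return (feedlot_id, year)
```

With 25 feedlots observed over 2 years, this keying yields the 50 feedlot-year clusters mentioned in the text.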

Explanatory classification factors of interest recorded on individual cohorts (i.e., cohort-level demographic variables) are presented in Table 1. The season of arrival, defined by the month of arrival at the feedlot, was categorized as follows: winter (January to March), spring (April to June), summer (July to September), and fall (October to December). Days on feed, computed as the difference between the date of a cohort arrival at the feedlot and the date of shipment, was categorized in quartiles as follows: 1 to 156, 157 to 183, 184 to 216, and 217 to 395 d on feed. Cohort size at arrival, recorded as the number of cattle per cohort on arrival at the feedlot, was categorized in quartiles: 50 to 87, 88 to 109, 110 to 166, and ≥167 cattle. Cohort gender referred to heifer and steer categories. Mean cohort arrival BW, expressed in kg, was categorized in 45 kg (or equivalently, 100 lb) increments as 181 to 226, 227 to 272, 273 to 317, 318 to 362, and >362 kg. Other feedlot attributes and feedlot-year descriptors that were considered are listed in Table 1. The percentage of steer cohorts per feedlot per year was expressed in quartiles (≤74.2, 74.3 to 83.1, 83.2 to 86.4, and ≥86.5%). Feedlot capacity, expressed as the number of cattle per feedlot per year, was categorized in quartiles (≤46,621; 46,622 to 64,011; 64,012 to 103,713; and ≥103,714 cattle). Mean cohort arrival BW per feedlot per year was also expressed in quartiles (≤293, 294 to 319, 320 to 332, and ≥333 kg). All explanatory variables originally recorded on a continuous scale were categorized, as previously described, to avoid violation of the linearity assumption. Categorization was conducted on the basis of quartiles (e.g., cohort size, percentage of steer cohorts per feedlot per year) or, when feasible, based on industry-standard cut-offs (e.g., mean cohort arrival BW).
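The categorization rules above can be sketched in a few lines. The helper names are hypothetical; the cut-offs are those stated in the text, and values below the lowest category are not checked here because the inclusion criteria are assumed to have been applied beforehand.

```python
import bisect

SEASONS = ("winter", "spring", "summer", "fall")


def arrival_season(month):
    """Map month of arrival (1..12) to season: Jan-Mar winter, ..., Oct-Dec fall."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return SEASONS[(month - 1) // 3]


def categorize(value, upper_bounds, labels):
    """Return the label of the first category whose upper bound is >= value.

    upper_bounds holds the inclusive upper limits of all but the last category.
    """
    return labels[bisect.bisect_left(upper_bounds, value)]


# Quartile cut-offs for days on feed and cohort size, as given in the text.
DOF_UPPER = [156, 183, 216]
DOF_LABELS = ["1 to 156", "157 to 183", "184 to 216", "217 to 395"]
SIZE_UPPER = [87, 109, 166]
SIZE_LABELS = ["50 to 87", "88 to 109", "110 to 166", ">=167"]
```

For example, `categorize(157, DOF_UPPER, DOF_LABELS)` places a cohort fed for 157 d in the second DOF quartile.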

Table 1.

Listing and descriptive statistics of cohort-level and feedlot-year-level explanatory variables available for model selection and assessment of sources of variance heterogeneity

Variable | % of cohorts (clusters) | Mean | SD | Median | Range (minimum to maximum)
Cohort-level variables
    No. of cohorts included for analysis | 100 (n = 24,050)
    Arrival year
        2005 | 51.9
        2006 | 48.1
    Arrival season
        Winter | 23.1
        Spring | 23.7
        Summer | 27.8
        Fall | 25.4
    Days on feed, d | | 164 | 51 | 160 | 1 to 395
    Cohort size, cattle | | 153 | 88 | 125 | 50 to 1,659
    Gender
        Steers | 55.1
        Heifers | 44.9
    Mean arrival BW, kg | | 308.6 | 55.1 | 314.0 | 181.0 to 408.0
Feedlot-year-level variables
    No. of feedlot-year clusters | 100 (n = 50)
    Steer cohorts, % of total cohorts per feedlot-year cluster | | 75.6 | 16.9 | 83.5 | 38.4 to 93.5
    Feedlot size (No. of cattle per feedlot per year), cattle | | 86,003 | 32,625 | 80,423 | 1,954 to 139,320
    Mean arrival BW per feedlot-year cluster, kg | | 308.5 | 31.4 | 318.0 | 226.0 to 359.0

Statistical Analyses

Model Specification.

Our primary interest focuses on measures of dispersion in ADG, namely variance components at the residual (i.e., cohort) level and at the cluster (i.e., feedlot-year) level (further defined below), as well as potential sources of heterogeneity on such variance components.

Let yi denote the observed response ADG for the ith cattle cohort (i = 1, …, 24,050), such that

$$y_i = \mathbf{x}_i'\boldsymbol{\beta} + \mathbf{z}_i'\mathbf{u} + e_i \qquad [1]$$

Here, β denotes a vector of fixed effects and u represents a vector of random feedlot-year-specific effects on ADG, whereas ei indicates a random cohort-specific residual effect; xi and zi are known vectors connecting the model effects to the response.

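For feedlot-year clusters j = 1, …, 50 and cohorts i = 1, …, 24,050, the random effects $u_j \sim N(0, \sigma^2_{u_j})$ and the residuals $e_i \sim N(0, \sigma^2_{e_i})$ are specified to be mutually independent.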

Following Foulley et al. (1990), a generalized linear model on each variance parameter was specified using a logarithmic link function as follows:

$$\log(\sigma^2_{e_i}) = \mathbf{x}_{e,i}'\boldsymbol{\tau}_e + \mathbf{z}_{e,i}'\mathbf{v}_e \qquad [2]$$

$$\log(\sigma^2_{u_j}) = \mathbf{x}_{u,j}'\boldsymbol{\tau}_u \qquad [3]$$

where τe and τu denote vectors of fixed scaling effects on cohort-level and feedlot-year-level variances, respectively, and ve denotes a vector of random cluster-specific (i.e., feedlot-year-specific) scaling effects on cohort-level variances; xe,i, xu,j, and ze,i indicate the corresponding known incidence vectors. Further, the cluster-specific scaling effects in ve are specified, on the multiplicative (i.e., data) scale, to follow independent inverse gamma distributions with an expected value of 1 and a variance governed by the hyperparameter ηe, with larger values of ηe indicating less heterogeneity on cohort-level variances between feedlot-year clusters, as described for heteroskedastic mixed models by Kizilkaya and Tempelman (2005).

Hence, the proposed hierarchical Bayesian model embeds 3 different submodels that specify heterogeneity at various levels; namely, Eq. [1] directly defines a classical linear mixed model on the location parameters, whereas Eq. [2] and [3] specify linear mixed models on the cohort-level and feedlot-year-level variances, respectively. Prior distributions on unknown parameters were diffuse, as specified by Bello et al. (2010), so that inference was dominated by the information in the data.
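To make the log-link variance model more concrete, the sketch below evaluates a cohort-level variance from fixed scaling effects and a cluster-specific scaling effect. It is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, and the cluster effect is passed on the multiplicative scale (reference value 1), so it enters the linear predictor through its logarithm.

```python
import math


def cohort_variance(tau_e, x_e, cluster_scale=1.0):
    """Cohort-level variance under a log link.

    tau_e         : fixed scaling effects on the log-variance scale
    x_e           : incidence vector selecting the effects that apply to the cohort
    cluster_scale : multiplicative feedlot-year scaling effect (1 = typical cluster);
                    an assumed parameterization chosen for readability
    """
    if cluster_scale <= 0:
        raise ValueError("cluster_scale must be positive")
    log_variance = sum(x * t for x, t in zip(x_e, tau_e)) + math.log(cluster_scale)
    return math.exp(log_variance)
```

Because effects add on the log scale, they multiply on the data scale: doubling `cluster_scale` doubles the cohort-level variance regardless of the fixed effects.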

Posterior Computation.

The proposed hierarchical Bayesian model is implemented using a Markov chain Monte Carlo (MCMC) sampling algorithm, which greatly facilitates data analysis for multistage hierarchical models such as that investigated in this paper. MCMC is a simulation-intensive method that involves drawing random samples of all parameters from the joint posterior density of all unknown parameters (Sorensen and Gianola, 2002). This allows for simultaneous estimation of all model parameters, namely location parameters (i.e., β and u in Eq. [1]), components of the residual cohort-level variance (i.e., τe and ve in Eq. [2], as well as ηe), and components of the random feedlot-year variance (i.e., τu in Eq. [3]). The MCMC algorithm is based on a combination of Gibbs sampling updates from fully conditional posterior distributions (β, u, τe, ve, τu) and Metropolis-Hastings updates from random walk proposals (ηe). Additional technical details on distributional specifications of fully conditional posterior densities are available in Bello et al. (2010, 2012). The MCMC algorithm for this hierarchical Bayesian model was programmed in R (R Development Core Team, 2012) using the sparse linear algebra package SparseM (Koenker and Ng, 2010). All MCMC chains for competing models at a given selection step (defined in the Model Selection section below) were run in parallel on Kansas State University’s Beowulf high-performance computing cluster. Computer code is available from the corresponding author on request.
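For readers unfamiliar with random walk Metropolis-Hastings updates, a generic single-parameter sketch is given below. It is not the authors' R code: the target here is a toy log-density supplied by the caller, whereas the study applied this kind of update to the fully conditional posterior of ηe. The function name, step size, and seed are illustrative assumptions.

```python
import math
import random


def random_walk_metropolis(log_density, start, step_sd, n_iter, seed=0):
    """Draw n_iter dependent samples from a density known up to a constant.

    A Gaussian random walk proposal is symmetric, so the acceptance
    probability reduces to min(1, target(proposal) / target(current)).
    """
    rng = random.Random(seed)
    current = start
    current_lp = log_density(current)
    samples = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_density(proposal)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)  # a rejected proposal repeats the current value
    return samples
```

A parameter with restricted support can be handled by having `log_density` return `float("-inf")` outside the support, which makes such proposals always rejected.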

For each competing model (described in the Model Selection section below), model selection was based on an MCMC chain of 55,000 saved iterations after the first 2,000 burn-in cycles. For the final selected model that was used for inference, the MCMC chain was extended to 250,000 saved iterations. Trace plots and the Raftery-Lewis diagnostics (Raftery and Lewis, 1992) were used to monitor convergence of MCMC chains. In addition, effective sample size (ESS) was estimated to assess the number of effectively independent samples within the MCMC chain. Posterior inference on heterogeneous variances was conducted by computing the marginal variance in the log link scale for levels of a given fixed effect factor, averaged across levels of other factors also selected into the final model, and then back-transforming to the data (i.e., inverse link) scale by exponentiation (Bello et al., 2012). These link-scaled marginal variances are parameterized in a manner analogous to least squares mean estimates, as popularly implemented by SAS software. For each marginal variance of interest, the posterior means (PM), posterior standard deviations (PSD), and 95% highest posterior density intervals (HPD) in the data scale are reported, as well as the ESS. Statistical significance for fixed effect parameters on variances was established when the 95% HPD on the ratio of marginal variances corresponding to levels of a fixed effect factor did not include the null value 1. For comparisons of interest between any 2 variance parameters θ1 and θ2, Bayesian P-values are provided; these are defined as

Furthermore, the residual CV (Kizilkaya and Tempelman, 2005) is used to describe dispersion of the cohort-level variance across feedlot-year clusters; greater values of CVe indicate increased dispersion of ADG variability between feedlot-year clusters relative to a “typical” variance, which for a multiplicative model is equal to the null value 1.
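The posterior summaries described above (PM, PSD, and 95% HPD, plus the rule that a ratio of marginal variances is declared significant when its HPD excludes 1) can be sketched from a vector of MCMC draws. This is a minimal illustration with hypothetical function names; the HPD routine takes the shortest interval containing the requested share of sorted draws, which is appropriate for unimodal posteriors. ESS and Bayesian P-values are not covered here.

```python
import math


def posterior_mean_sd(samples):
    """Posterior mean (PM) and posterior standard deviation (PSD) of the draws."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, math.sqrt(var)


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing a fraction `mass` of the sorted draws."""
    s = sorted(samples)
    n = len(s)
    k = max(1, math.ceil(mass * n - 1e-9))  # number of draws inside the interval
    start = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[start], s[start + k - 1]


def ratio_excludes_one(samples_a, samples_b, mass=0.95):
    """True when the HPD of the draw-by-draw ratio a/b does not contain 1."""
    ratios = [a / b for a, b in zip(samples_a, samples_b)]
    lower, upper = hpd_interval(ratios, mass)
    return not (lower <= 1.0 <= upper)
```

Note that the ratio is formed draw by draw before summarizing, so the interval reflects the joint posterior of both variances rather than a ratio of two separately computed summaries.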

Model Selection.

A sequential forward selection approach similar to that implemented by Bello et al. (2012) was performed to decide which demographic characteristics and feedlot descriptors (Table 1) were to be incorporated as explanatory variables to model cohort-level and feedlot-year-level variance components in τe and τu of Eq. [2] and [3], respectively. The deviance information criterion (DIC; Spiegelhalter et al., 2002) was used to compare goodness of fit between competing models. Smaller values of DIC indicate better fit of the data, with differences greater than 7 considered indicative of a decisive difference in model fit (Spiegelhalter et al., 2002). Briefly, selection of sources of heterogeneity on the variances consisted of 3 steps, of which steps 1 and 2 involved model selection on the cohort-level variance. Step 1 included in τe the fixed effects factor corresponding to the explanatory variable that led to the largest decrease in DIC and restarted the process with respect to the remaining explanatory variables until none led to a DIC decrease of 7 or greater. Step 2 considered whether to include the feedlot-year random effect ve in Eq. [2], depending on whether estimating the hyperparameter ηe decreased DIC by more than 7. Step 3 mirrored Step 1 except that it pertained to selecting the fixed effects τu for the best-fitting model on the feedlot-year-level variance. Table 2 outlines the details of the DIC-based forward model selection procedure.
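The logic of Step 1 (and, analogously, Step 3) can be sketched as a generic forward selection loop. This is an illustration under stated assumptions: `dic_of` stands in for fitting a model by MCMC and returning its DIC, and the toy DIC function below is hypothetical. Its gains are loosely patterned on the step-wise differences in Table 2 and treated as additive for simplicity, and the value for gender is made up.

```python
def forward_select(candidates, dic_of, threshold=7.0):
    """Add, one at a time, the factor giving the largest DIC decrease.

    Stops when no remaining factor decreases DIC by at least `threshold`.
    dic_of takes a tuple of selected factor names and returns a DIC value.
    """
    selected = []
    remaining = list(candidates)
    current_dic = dic_of(tuple(selected))
    while remaining:
        trial = {name: dic_of(tuple(selected + [name])) for name in remaining}
        best = min(trial, key=trial.get)
        if current_dic - trial[best] < threshold:
            break
        selected.append(best)
        remaining.remove(best)
        current_dic = trial[best]
    return selected, current_dic


# Toy stand-in for model fitting: hypothetical additive DIC gains per factor.
TOY_GAINS = {"cohort_size": 198, "days_on_feed": 170, "season": 72,
             "arrival_bw": 52, "gender": 3}


def toy_dic(selected):
    return 1000.0 - sum(TOY_GAINS[name] for name in selected)


chosen, final_dic = forward_select(list(TOY_GAINS), toy_dic)
```

In this toy run the four factors with large gains enter in decreasing order of gain and gender is left out because its DIC decrease falls below the threshold.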

Table 2.

Details of the forward model selection strategy implemented on variances for ADG of feedlot cattle, both at the cohort level (Steps 1 and 2) and at the feedlot-year level (Step 3)1

Factors entering the model | DIC difference relative to null model | DIC difference relative to model in preceding step
Null model consisting of: | 0 |
    Fixed effects on the mean-model specification: cohort size (in quartiles), mean arrival BW class, gender, feedlot size (in quartiles), arrival season, and year
    Random effect of feedlot-year cluster on the mean-model specification
Step 1: Selection of fixed effects (i.e., τe) on the cohort-level variance
    1.1 Cohort size (in quartiles) | −198 | −198
    1.2 Days on feed (in quartiles) | −368 | −170
    1.3 Season | −440 | −72
    1.4 Weight class | −492 | −52
    No additional effects entered the model
Step 2: Selection of random effects (i.e., ve) on the cohort-level variance
    2.1 Clustering effect of feedlot-year | −2904 | −2412
Step 3: Selection of fixed effects (i.e., τu) on the feedlot-year-level variance
    No effect entered the model
1Selection of fixed and random effects into the model was based on deviance information criterion (DIC) differences exceeding 7, following Spiegelhalter et al. (2002).

It is emphasized that our main interest in this study is inference on sources of heterogeneity on the variance components at the cohort-level and at the feedlot-year cluster level. Therefore, our model selection approach was limited to choosing fixed effects τe and τu for Eq. [2] and [3], as well as parameters that define variability between random clustering effects, namely ηe in Eq. [2]. It is noted that the focus of interest was not specifically inference on the classical mean-model fixed effect parameters β in Eq. [1], for which much literature is already available (Holland et al., 2010; Tatum et al., 2012). Therefore, the classical fixed effects β always included the effects of cohort size, mean arrival BW, and feedlot size, all categorized as previously explained in the Data Description section, as well as the effects of gender, arrival season, and year. Also included in the means model (Eq. [1]) was the random clustering effect of contemporary group u, denoted as feedlot-year cluster. In specifying the means model, our intention was to purposefully overfit fixed effects β, thus ensuring robust inference on heterogeneous variance components (Wolfinger, 1993).


Cohort-Level Heteroskedasticity in ADG

Evidence for heterogeneity of variance components on ADG was substantial and primarily concentrated at the cohort level. Table 3 summarizes posterior inference on variance components on ADG for the demographic factors and feedlot descriptors that were selected (Table 2) as potential sources of heterogeneity on the cohort-level variance based on their contribution to enhanced model fit.

Table 3.

Posterior inference on sources of heterogeneous variances for ADG in commercial feedlot cattle, including marginal posterior means (PM), posterior standard deviations (PSD), 95% highest posterior density intervals (HPD), and effective sample size (ESS) on feedlot-year-level (i.e., cluster; $\sigma^2_u$) and cohort-level (i.e., $\sigma^2_e$) variances.

Variance components (kg2) | PM | PSD | 95% HPD | ESS
Feedlot-year-level variance
    Overall^1 | 0.0190 | 0.0043 | 0.0114, 0.0276 | 197,732
Cohort-level variances
Fixed effects τe
    Cohort size
        50 to 87 cattle | 0.0333^a | 0.0033 | 0.0274, 0.0398 | 1296
        88 to 109 cattle | 0.0267^b | 0.0026 | 0.0221, 0.0320 | 1280
        110 to 166 cattle | 0.0238^c | 0.0023 | 0.0197, 0.0285 | 1271
        ≥167 cattle | 0.0240^c | 0.0024 | 0.0198, 0.0287 | 1128
    Arrival season
        Winter | 0.0235^d | 0.0023 | 0.0194, 0.0282 | 1266
        Spring | 0.0263^e | 0.0026 | 0.0217, 0.0315 | 1296
        Summer | 0.0303^f | 0.0030 | 0.0251, 0.0363 | 1282
        Fall | 0.0273^e | 0.0027 | 0.0225, 0.0326 | 1177
    Days on feed
        1 to 156 d | 0.0347^g | 0.0034 | 0.0286, 0.0417 | 1496
        157 to 183 d | 0.0250^h | 0.0025 | 0.0205, 0.0298 | 1396
        184 to 216 d | 0.0230^i | 0.0023 | 0.0190, 0.0276 | 1272
        217 to 395 d | 0.0256^h | 0.0025 | 0.0210, 0.0306 | 1079
    Arrival BW
        181 to 226 kg | 0.0221^j | 0.0023 | 0.0180, 0.0268 | 1510
        227 to 272 kg | 0.0243^k | 0.0024 | 0.0200, 0.0291 | 1377
        273 to 317 kg | 0.0292^l | 0.0028 | 0.0241, 0.0349 | 1271
        318 to 362 kg | 0.0289^l | 0.0028 | 0.0238, 0.0346 | 1184
        >362 kg | 0.0302^l | 0.0030 | 0.0250, 0.0363 | 1078
Random effects ve between feedlot-year clusters
    Coefficient of variation CVe | 0.6579 | 0.1770 | 0.4185, 0.9575 | 3152
a–l Within a fixed effect factor, marginal means with different superscripts (a–c: P < 0.001; d–f: P < 0.0001; g–i: P < 0.01; j–l: P < 0.05) indicate levels that differ from each other.
^1 No multifactorial heterogeneity was inferred for the feedlot-year-level variance; i.e., it was modeled as homogeneous across feedlot-year clusters.

Cohort size was identified as a contributor to cohort-level heteroskedasticity (Table 3), whereby the smallest cohorts (first cohort size quartile, consisting of 50 to 87 cattle) were about 25 to 40% more variable in their ADG performance than cohorts of any larger size (P < 0.0001). In turn, the most uniform (i.e., least variable) ADG was observed among cohorts with more than 110 cattle (third and fourth quartiles; P < 0.01), whereas the second quartile group (88 to 109 cattle) had intermediate variability relative to the previous groups.

Also, cohort-level variability in ADG performance differed with season of cohort arrival at the feedlot (Table 3). More specifically, cohort performance at the feedlot was most variable for summer arrivals relative to any other seasons (P < 0.001), whereas winter arrivals showed the most consistent (i.e., least variable) ADG outcomes of all seasons (P < 0.001). In turn, spring and fall arrivals had intermediate cohort-level variance on ADG; that is, cohorts in these seasons had more consistent ADG performance than summer arrivals (P < 0.001) but more variable ADG than cohorts arriving during the winter season (P < 0.001).

In addition, the length of the stay in the feedlot, as characterized by DOF quartiles, was associated with changes in the variability of ADG (Table 3). In particular, cattle cohorts that were on feed for the shortest period evaluated (first quartile for DOF, consisting of 1 to 156 d) showed greater variability in ADG by approximately 35 to 50% relative to that of cattle cohorts fed longer than 156 d (P < 0.01). Moreover, cohorts in the third quartile of DOF (184 to 216 d) were the least variable in ADG performance of all DOF categories (P < 0.01).

Average arrival BW also contributed to ADG variability between cohorts (Table 3). Cohorts arriving in the lightest BW category (181 to 226 kg) had the most consistent ADG relative to any other BW categories (P < 0.05). Further, ADG cohort-level heteroskedasticity increased (P < 0.0001) with each category of increasing arrival BW until an average arrival BW of approximately 273 kg, after which no heterogeneity in variability of ADG was apparent.

Feedlot-year specific scaling effects were, by far, the greatest contributors to ADG heteroskedasticity among cohorts, as indicated by the largest DIC difference in the model selection approach (Table 2). Heteroskedastic cohort-level ADG across feedlot-year clusters was apparent through posterior inference on the coefficient of variation (Table 3), which serves as a normalized measure of dispersion of the cluster-specific (i.e., feedlot-year specific) scaling effects on the cohort-level variances (Kizilkaya and Tempelman, 2005). The large magnitude of the posterior mean (0.658) of CVe and that of the lower bound of its 95% HPD (0.418) indicate strong evidence that the degree of residual (i.e., cohort level) heteroskedasticity across feedlot-year clusters was substantial for ADG. Further, feedlot-year driven heterogeneity in the cohort-level variance of ADG is interpreted by looking at the elements of ve in Eq. [2], which characterize feedlot-year specific scaling effects relative to a reference variance of 1 (i.e., typically dispersed feedlot-year cluster). In particular, the posterior means of cohort-level variances in ADG for the most and least variable feedlot-year clusters were 3.94 and 0.34 times that of the reference variance (i.e., equal to 1), respectively; their ratio thereby indicates an approximate 12-fold change between feedlot-year clusters of most and least extreme variability in ADG.
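The relative differences quoted in this section can be checked against the posterior means in Table 3 with simple arithmetic. The sketch below is only a consistency check with a hypothetical helper name; ratios of posterior means are not the same as posterior means of ratios, and the study's inference relied on the latter.

```python
def percent_more_variable(var_a, var_b):
    """Percentage by which variance a exceeds variance b."""
    return 100.0 * (var_a / var_b - 1.0)


# Cohort size: smallest quartile vs. second and third quartiles (Table 3 PM).
size_vs_q2 = percent_more_variable(0.0333, 0.0267)
size_vs_q3 = percent_more_variable(0.0333, 0.0238)

# Days on feed: first quartile vs. fourth and third quartiles (Table 3 PM).
dof_vs_q4 = percent_more_variable(0.0347, 0.0256)
dof_vs_q3 = percent_more_variable(0.0347, 0.0230)

# Most vs. least variable feedlot-year cluster, relative to a reference of 1.
fold_change = 3.94 / 0.34
```

These values fall in the ranges reported in the text: roughly 25 to 40% for cohort size, roughly 35 to 50% for days on feed, and a fold change close to 12 between the extreme feedlot-year clusters.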

Feedlot-Year-Level Variance on ADG

The feedlot-year-level variance on ADG was estimated at 0.019 kg2, as per its posterior mean, with 95% HPD = [0.011 kg2, 0.028 kg2]. Our DIC-based model selection approach indicated no evidence for enhanced model fit when any of the feedlot-specific demographic factors available from the dataset (Table 1) entered the model in Eq. [3]. Thus, our inference is based on a model with homogeneous variance between feedlot-year clusters (Table 3).


One of the main challenges faced by the feedlot industry is the prediction of performance based on limited operational data available at the start of the finishing period. These predictions are crucial for making managerial and economic decisions in commercial feedlots (Galyean et al., 2011). Previous research demonstrated that ADG significantly affects the mean and variability of profits (Belasco et al., 2009a,b). When mean ADG increases, its variability has a greater impact on profit distribution, making overall profits more volatile (Belasco et al., 2009b). Therefore, the elicitation of factors that contribute to heteroskedasticity in feedlot performance can help cattle managers evaluate risks and, conditional on cattle and feed market value fluctuations, decide, for instance, about retaining ownership through the feeding period (White et al., 2007).

This study provides evidence for substantial heterogeneity in the variance of ADG of commercial feedlot cattle, most of which was concentrated among cattle cohorts (i.e., cohort-level) within a feedlot-year cluster. A variety of statistical methods are commonly applied to feedlot data in an attempt to identify factors and conditions under which the prediction of mean ADG performance can be optimized. However, in practice, accurate performance predictions can be hard to achieve due to considerable dispersion around the predicted mean. Indeed, lack of homogeneity within and between cattle cohorts due to demographic characteristics, feed management, genetic background, and managerial practices, among others, easily translates into much uncertainty for mean predictions, as well as considerable variability for inference not only on centrality parameters (i.e., means and medians) but also on measures of dispersion (i.e., variances; Fig. 1), as supported by this study. This arguably adds another layer of complexity to the production system, whereby it is not only mean performance that should be optimized, but also variability around such ideal mean to ensure consistency and uniformity of the production process toward the targeted mean goal.

In this study, our methodological approach to modeling heteroskedasticity of ADG was based on the hierarchical Bayesian framework proposed by Kizilkaya and Tempelman (2005). One of the strengths of this approach lies in the flexibility that it allows for explicit modeling not only of mean-model parameters but also of variance components. In addition, this approach accounts for the inherently hierarchical structure of the data consisting of cattle cohorts nested within managerial groups (i.e., feedlot-year clusters).

Our results indicate that heterogeneity of variances on ADG was substantial and that it existed primarily among cohorts. In particular, cohort-level heteroskedasticity in ADG was primarily driven by unidentified scaling effects specific to feedlot-year clusters, thus warranting further investigation of additional feedlot attributes and management practices as potential sources of heteroskedasticity. Identifiable cohort-level demographic factors on feedlot arrival also contributed to explain heterogeneous dispersion of ADG, including cohort size, mean arrival BW, DOF and season. More specifically, smaller size cohorts of heavier average arrival BW that started their finishing period during summer months and spent fewer DOF were characterized by more variable ADG. In turn, larger size cohorts of lighter arrival BW that were introduced to the feedlot during the winter season were associated with a more consistent (i.e., less variable) performance. In contrast, the variance of ADG among feedlot-year clusters (i.e., feedlot-year level) could not be partitioned as a function of the feedlot-level demographic factors available in the dataset and was thus deemed homogeneous in this study. This, however, does not preclude future consideration of other feedlot attributes and management practices as potential sources of heteroskedasticity, as availability of feedlot-level information was scarce in this study.

Among the demographic factors of interest, cohort-level heteroskedasticity was identified to vary with cohort size. It is possible that this result may partially reflect a statistical, rather than a biological effect. Observed values of ADG for larger cohorts would naturally be obtained as the summary measure (i.e., average) of observations on more cattle relative to the corresponding ADG observations recorded in smaller cohorts. Averages computed from many observations can be naturally expected to be less variable than those based on fewer observations. Also, observations of extreme magnitude within a group (i.e., outliers) are likely to have greater influence on descriptive statistics for smaller cohorts of cattle, thus providing an additional possible explanation of the results observed here. Nevertheless, one may also conceive that cattle are often grouped at arrival by BW, but subsequently tend to grow apart during the feeding phase depending on where they were on the growth curve at the time of feedlot arrival. Thus, one might also anticipate a biological component of the greater variability of ADG observed in cohorts of smaller size.
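The statistical argument above, that averages over more animals are less variable, can be illustrated with the textbook result that the variance of a mean of n independent observations with common variance is that variance divided by n. The independence and common-variance assumptions are made here purely for illustration and are not claims of the study; the function name is hypothetical.

```python
def variance_of_cohort_mean(animal_variance, cohort_size):
    """Variance of a cohort average under independent, equal-variance animals."""
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    return animal_variance / cohort_size


# Under these simplifying assumptions, a cohort of 50 cattle would have a mean
# about 3.3 times as variable (on the variance scale) as a cohort of 167 cattle.
ratio_50_vs_167 = variance_of_cohort_mean(1.0, 50) / variance_of_cohort_mean(1.0, 167)
```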

A similar statistical effect is likely to hold, at least partially, for explaining cohort-level heteroskedasticity as a function of DOF because an average computed from more DOF can be naturally expected to be less variable than the average computed from fewer DOF. This does not, however, preclude additional biological effects. It is conceivable that the increased ADG variability observed during shorter feedlot stays may be partially explained by daily variations in feed intake, health status, and individual physiological factors occurring during a short period of time to which the animal has limited time to adapt. In addition, measurement errors and the frequency at which BW measures were taken may also affect the variability of ADG.
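The role of measurement error can be made concrete with a short calculation. The sketch below is an illustration only and assumes, hypothetically, that ADG is computed from two weighings as (final BW minus arrival BW) divided by DOF, and that the two weighing errors are independent with the same SD. Under those assumptions the error SD of ADG is the square root of 2 times the weighing SD, divided by DOF. The weighing SD of 10 kg, the DOF values, and the function names are made up for the example.

```python
import math


def adg(bw_in_kg, bw_out_kg, dof):
    """ADG from two weighings (an assumed definition for this sketch)."""
    return (bw_out_kg - bw_in_kg) / dof


def adg_error_sd(weighing_sd_kg, dof):
    """SD of the ADG error induced by two independent weighing errors
    with a common SD: sqrt(2) * weighing SD / DOF."""
    return math.sqrt(2.0) * weighing_sd_kg / dof


# Hypothetical weighing SD of 10 kg, evaluated at two feeding period lengths.
for dof in (100, 200):
    print(dof, "DOF -> ADG error SD (kg/d):", adg_error_sd(10.0, dof))
```

In this simplified setting, halving DOF doubles the error SD of ADG, which is in line with the greater dispersion expected for cohorts with shorter feedlot stays.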

Season of arrival at the feedlot was associated with heterogeneity in the cohort-level variance. This is consistent with the results of Belasco et al. (2009b), who found placement season to be associated with the variance of ADG such that summer placements had greater ADG variances than placements at other times of the year. Possibly, lower feed intake during extreme heat in summer arrivals, coupled with differing resilience of animals acclimating to extreme weather conditions, may have led to the greater variability observed in summer months. In addition, the cattle population, even within the same BW class, likely differs based on the season of the year. Most calves in the United States are born in spring; thus light BW calves in fall are typically those born during spring months; however, light BW calves arriving at the feedlot in summer may be younger (born that spring and weaned early) or older (born in fall but below average size for their age) than their fall arrival counterparts. In contrast, one may speculate that the lower variability in ADG observed for cohorts arriving in winter months may be partially due to cattle being able to adapt to temperatures closer to their comfort zone (Hahn, 1999), thus mitigating the need for expressing individual adaptations to extreme environments. Further evaluation of the effect of arrival season on heterogeneity of ADG is warranted.

Mean arrival BW was also associated with ADG variance among cohorts, with heavier cohorts showing greater variability in ADG than lighter BW cohorts. Likewise, Belasco et al. (2009b) found that heavier BW placements were associated with greater ADG variance. Cattle in the heavier BW categories and in the lowest DOF quartile showed greater ADG variability. This runs contrary to what one might expect: because heavier BW cattle are usually more mature (older) and subsequently have fewer health problems, they would be expected to show decreased variability in ADG. Although the specific reasons for this effect (i.e., heavier BW cattle showing greater ADG variance) are not known, it is possible that they may be related to feed conversion mechanisms that can be maximized for either extreme low or high entry BW classes (Belasco et al., 2009a). Moreover, this phenomenon could be related to the infrastructure of the beef industry. Traditionally, cattle buyers purchase groups of animals through direct sales (i.e., ranch to yard) or from secondary markets (e.g., auction markets, backgrounders, stocker operations, other buyers; Smith, 2009). A large proportion of light BW cattle are likely purchased directly from ranches, whereas heavier BW calves are more likely to originate from backgrounder or stocker operations. Increased commingling or mixing is expected in cattle purchased from secondary markets as opposed to cattle originating from ranches. Heavier BW cattle may be more likely to originate from secondary sources and thus have greater biological and genetic diversity, experience greater levels of stress due to more frequent social interactions and transport, and have greater exposure to pathogens than calves originating from a single source (Galyean et al., 1999; Duff and Galyean, 2007), any of which could possibly influence the variability in ADG within the cohort, as previously reported (Step et al., 2008).

Heteroskedasticity of performance outcomes has received little attention in feedlot production systems, with the exception of recent studies by Belasco et al. (2009a,b). Limitations in their approach include arbitrary selection of fixed effects to model heterogeneous variances, as well as lack of a straightforward interpretation of model parameters to report the magnitude of heteroskedasticity in the scale of the data. In contrast, the general hierarchical Bayesian framework proposed by Kizilkaya and Tempelman (2005) that was implemented here allows for formal DIC-based selection of demographic factors that is driven by enhanced model fit to the data and variance components that have a direct interpretation in the scale of the data (i.e., kilograms in the case of ADG).

Most data analysis techniques that are available through commercial software and that are commonly applied to feedlot data focus on inference about group mean differences and assume homogeneous variances for all groups. Potential violation of this assumption is of concern; in fact, it has been previously warned that “the risk of inferential bias due to failure to take heterogeneity of variances into consideration should not be dismissed lightly” (Wolfinger, 1996), as statistical procedures that do not adequately capture the structure and complexity of feedlot production data can result in misleading conclusions (Larson and White, 2008). Indeed, overlooking heteroskedasticity has been reported to lead to biased inference on group mean differences or treatment effects, as well as inefficient estimates of variance components (White, 1980; Foulley and Quaas, 1995; Wolfinger, 1996). Moreover, inferential artifacts are not uncommon when the homoskedasticity assumption is violated. For example, ANOVA F-tests for high order interactions in multifactorial models that incorrectly assume homoskedasticity have been shown to be wildly liberal and not trustworthy due to inflation of Type I error (Gaugler and Akritas, 2011). On the other hand, statistical power can further benefit from explicit modeling of heterogeneous variances by appropriately down-weighting portions of the data that are highly variable and extracting more information from other portions of the data that are more precise, thus enhancing precision of the inference on mean-model parameters. Further development and commercial delivery of statistical approaches to address heterogeneous variances, a phenomenon demonstrated by this study to exist in ADG of feedlot cattle cohorts, is warranted.
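The gain in precision from down-weighting highly variable data can be illustrated with classical inverse-variance weighting. The sketch below is a deliberately simplified stand-in and is not the hierarchical Bayesian model used in this study. It assumes independent group estimates of a common mean with known variances, and the three variances shown (two precise groups and one group with 12 times their variance) are hypothetical.

```python
def inverse_variance_weights(variances):
    """Weights proportional to 1/variance, normalized to sum to 1."""
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    return [p / total for p in precisions]


def variance_unweighted_mean(variances):
    """Variance of the simple average of independent estimates."""
    n = len(variances)
    return sum(variances) / n ** 2


def variance_weighted_mean(variances):
    """Variance of the inverse-variance weighted average of independent estimates."""
    return 1.0 / sum(1.0 / v for v in variances)


# Hypothetical variances of three independent estimates of the same mean.
variances = [0.01, 0.01, 0.12]

print("weights:", inverse_variance_weights(variances))
print("variance, equal weights:", variance_unweighted_mean(variances))
print("variance, inverse-variance weights:", variance_weighted_mean(variances))
```

In this example the noisy group receives a small weight, and the weighted average has a smaller variance than the equally weighted average that implicitly assumes homogeneous variances.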

The authors acknowledge that only a few risk factors were analyzed in this study, as standard reporting of demographic factors of cohorts and attributes of feedlots is not consistent across commercial operations (Babcock et al., 2013). However, the structure and timing of the data collected are consistent with the type of data routinely collected by feedlot operations (Corbin and Griffin, 2006) and the time (i.e., feedlot arrival) at which economic decisions are made. Although there are inherent limitations when using existing operational data for epidemiologic analyses, untangling signals embedded in the data can enhance its current and future use in decision making. Caution should be exercised, though, when interpreting results. The observational nature of the data, which is determined by the retrospective cross-sectional features of data collection, limits cause-and-effect inference between the demographic factors and heteroskedasticity of ADG (Casella, 2008; Dohoo et al., 2009). In other words, due to lack of randomization during the data collection process, one can only draw conclusions in terms of nondirectional association between the factors analyzed and the variability of the response; in turn, conclusions on causality are not appropriate.

This study represents, to our knowledge, the first application of a hierarchical Bayesian model to investigate heteroskedasticity of a performance outcome in feedlot production systems. Directing efforts to assess heteroskedasticity of feedlot performance outcomes can benefit inferential accuracy and precision when drawing conclusions about group mean differences or treatment effects (Wolfinger, 1996). In addition, results of the current study have practical implications for reducing variability and eliciting uniform and consistent performance of cattle within the feedlot production system.



