
Journal of Animal Science - Animal Genetics

Technical note: An approach to derive breeding goals from the preferences of decision makers

 

This article in JAS

  1. Vol. 94 No. 11, p. 4498-4506
     
    Received: May 31, 2016
    Accepted: Aug 05, 2016
    Published: October 7, 2016


    1 Corresponding author(s): leo.alfonso@unavarra.es

doi:10.2527/jas.2016-0685
L. Alfonso 1,a
a Escuela Técnica Superior de Ingenieros Agrónomos, Universidad Pública Navarra, Campus de Arrosadia, 31006 Pamplona, Spain

Abstract

This paper deals with the use of the Choquet integral to identify breeding objectives and construct an aggregate genotype. The Choquet integral can be interpreted as an extension of the aggregate genotype based on profit equations, substituting the vector of economic weights by a monotone function, called capacity, which allows the aggregation of traits based, for instance, on the preferences of decision makers. It allows the aggregation of traits with or without economic value, taking into account not only the importance of the breeding value of each trait but also the interaction among them. Two examples have been worked out for pig and dairy cattle breeding scenarios to illustrate its application. It is shown that the expression of stakeholders’ or decision makers’ preferences, as a single ranking of animals or groups of animals, could be sufficient to extract information to derive breeding objectives. It is also shown that coalitions among traits can be identified to evaluate whether a linear additive function, equivalent of the Hazel aggregate genotype where economic values are replaced by Shapley values, could be adequate to define the net merit of breeding animals.



INTRODUCTION

Since Hazel (1943) proposed to compute the net merit of breeding animals, aggregating the genetic value of different traits and weighting each one according to its relative economic value, the methodology for defining breeding objectives has long been debated (Goddard, 1998).

The main point for discussion is that the arguments for considering more than just farm profit as the primary driver when establishing a breeding goal have become more important (Amer, 2011). It is recognized that non-economic factors related to sustainability issues of animal production need to be also considered (Nielsen et al., 2014). In global markets breeding objectives should be readjusted according to customer preferences (Sae-Lim et al., 2012); animal welfare and environmental effects, and “societally important traits” are difficult to include in profit equations (Kanis et al., 2005); and though trait economic values could be approximately derived in marginal low-input livestock systems of subsistence farmers (e.g., Alfonso et al., 2011) intangible benefits of livestock keeping should also be considered (Gizaw et al., 2010).

This note does not attempt to review the different approaches developed for these and other scenarios (see, for instance, Nielsen et al., 2014) but to introduce and illustrate how to derive the relative importance of breeding objective traits based on the preferences expressed, either by stakeholders (companies, breeders, farmers, or smallholders) or by final decision makers, using the discrete Choquet integral. The preferences of stakeholders or final decision makers (hereinafter called “decision makers”) have been modeled based on this technique in several scientific areas (Grabisch and Labreuche, 2010), including the animal sciences (Veissier et al., 2010; Alfonso, 2013; Brosig et al., 2016). This note aims to show how to apply the discrete Choquet integral to derive breeding goals and construct an aggregate genotype from the preferences of decision makers.


MATERIAL AND METHODS

Choquet Integral of Aggregate Genotype

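Instead of the usual way of combining the genetic merit of various traits into a single value (the aggregate genotype, $H_{AG}$), which assumes a linear additive function of breeding values ($BV_i$) weighted by their relative economic values ($a_i$),

$$H_{AG} = \sum_{i=1}^{n} a_i BV_i,$$

the use of the discrete Choquet integral is proposed,

$$H_{AG} = \sum_{i=1}^{n} \left[ BV_{(i)} - BV_{(i-1)} \right] v_{S_{(i)}}, \qquad [1]$$

where $BV_{()} = \{BV_{(1)}, BV_{(2)}, \ldots, BV_{(n)}\}$ is a reordered set of the breeding values $BV = \{BV_1, BV_2, \ldots, BV_n\}$ (expressed in a unipolar scale) such that $BV_{(1)} \le BV_{(2)} \le \ldots \le BV_{(n)}$, with $BV_{(0)} = 0$; $v$ is a vector of measures (“capacities”), $v: 2^N \to [0,1]$, where $N = \{1, \ldots, n\}$ is the set of traits, with $v_{\emptyset} = 0$ and $v_N = 1$; and $S_{(i)}$ is a collection of subsets defined by $S_{(i)} = \{(i), \ldots, (n)\}$.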

As can be inferred, the Choquet integral allows the aggregation of different criteria, with respect to a capacity, taking into account the importance of all possible groups of criteria (Vu et al., 2013). The term “capacity,” proposed by Choquet (1954), also referred to as “fuzzy measure,” “non-additive measure,” “monotone measure,” etc. (e.g., Grabisch, 2003), makes reference to a monotone function $v: 2^N \to [0,1]$, satisfying $v_{\emptyset} = 0$ and $v_N = 1$.
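The aggregation described above can be sketched in a few lines of Python. This is only an illustration, not the implementation used in Kappalab or Rfmtool: the function name `choquet_integral`, the dictionary layout for the capacity, and the numerical capacities are all hypothetical. The capacity is stored as a mapping from subsets of trait indices to values in [0,1].

```python
from typing import Dict, FrozenSet, Sequence


def choquet_integral(values: Sequence[float],
                     capacity: Dict[FrozenSet[int], float]) -> float:
    """Discrete Choquet integral of `values` with respect to `capacity`.

    `capacity` maps each non-empty subset of trait indices (0-based) to a
    number in [0, 1]; the full set of traits must map to 1.
    """
    # Reorder trait indices so that the values are non-decreasing.
    order = sorted(range(len(values)), key=lambda i: values[i])
    total = 0.0
    previous = 0.0  # BV_(0) = 0 by convention
    for position, index in enumerate(order):
        # S_(i): the traits whose value is at least the current one.
        subset = frozenset(order[position:])
        total += (values[index] - previous) * capacity[subset]
        previous = values[index]
    return total


# Hypothetical monotone capacity for 3 traits (indices 0, 1, 2 = traits 1, 2, 3).
v = {
    frozenset({0}): 0.2, frozenset({1}): 0.3, frozenset({2}): 0.4,
    frozenset({0, 1}): 0.5, frozenset({0, 2}): 0.7, frozenset({1, 2}): 0.6,
    frozenset({0, 1, 2}): 1.0,
}
bv = [0.5, 0.2, 0.9]  # hypothetical values with BV2 <= BV1 <= BV3
h_ag = choquet_integral(bv, v)
# By hand: BV2 + (BV1 - BV2) * v13 + (BV3 - BV1) * v3
#        = 0.2 + 0.3 * 0.7 + 0.4 * 0.4, approximately 0.57
```

Sorting the indices once and slicing the ordered list gives each subset S(i) directly, which keeps the sketch close to the notation of the text.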

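Consider the aggregation of the breeding values of 3 traits with the usual aggregate genotype $H_{AG} = a_1 BV_1 + a_2 BV_2 + a_3 BV_3$. Applying the Choquet integral, and noting that $v_{123} = 1$, the aggregate genotype would be

$$H_{AG} = \begin{cases}
BV_1 + (BV_2 - BV_1) v_{23} + (BV_3 - BV_2) v_3 & \text{if } BV_1 \le BV_2 \le BV_3 \\
BV_1 + (BV_3 - BV_1) v_{23} + (BV_2 - BV_3) v_2 & \text{if } BV_1 \le BV_3 \le BV_2 \\
BV_2 + (BV_1 - BV_2) v_{13} + (BV_3 - BV_1) v_3 & \text{if } BV_2 \le BV_1 \le BV_3 \\
BV_2 + (BV_3 - BV_2) v_{13} + (BV_1 - BV_3) v_1 & \text{if } BV_2 \le BV_3 \le BV_1 \\
BV_3 + (BV_1 - BV_3) v_{12} + (BV_2 - BV_1) v_2 & \text{if } BV_3 \le BV_1 \le BV_2 \\
BV_3 + (BV_2 - BV_3) v_{12} + (BV_1 - BV_2) v_1 & \text{if } BV_3 \le BV_2 \le BV_1
\end{cases}$$

Let’s take the third equation ($BV_2 \le BV_1 \le BV_3$) as an example. Figure 1 represents the aggregation of the 3 breeding values, which following Eq [1] are reordered as $BV_{(1)} = BV_2$, $BV_{(2)} = BV_1$, and $BV_{(3)} = BV_3$, with $S_{(1)} = \{1, 2, 3\}$, $S_{(2)} = \{1, 3\}$, and $S_{(3)} = \{3\}$; so,

$$H_{AG} = BV_2 v_{123} + (BV_1 - BV_2) v_{13} + (BV_3 - BV_1) v_3 = BV_2 + (BV_1 - BV_2) v_{13} + (BV_3 - BV_1) v_3.$$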

Figure 1.

Graphical representation of the aggregation of 3 breeding values ($BV_2 \le BV_1 \le BV_3$) in relation to a set of capacities $v_i$, using the Choquet integral [$H_{AG} = BV_2 + (BV_1 - BV_2) v_{13} + (BV_3 - BV_1) v_3$]. Subscripts in parentheses represent a reordered set of breeding values and capacities, such that $BV_{(1)} \le BV_{(2)} \le \ldots \le BV_{(n)}$.

 

Note that v1, v2, and v3 are the capacities for the individual breeding values; v12, v13, and v23 are the capacities for the combinations of pairs of breeding values; and v123 is the capacity of the combination of the 3 breeding values.

The Two-Additive Choquet Integral of Aggregate Genotype

According to Eq [1], the number of parameters needed for aggregating n traits is $2^n - 2$, so the number of subsets S(i) must be reduced to limit the complexity of the Choquet integral for practical application. With the concept of k-additivity, a k-additive capacity can model interaction among at most k criteria (k = number of criteria). The notion of 1-additivity coincides with that of additivity, i.e., the linear aggregate genotype. In practice, a 2-additive Choquet integral that considers only the coalitions between pairs of criteria is considered a good compromise between complexity and richness of the model (Grabisch and Labreuche, 2010).
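As a rough illustration of this complexity argument, the following sketch counts the free parameters of a general capacity and of a 2-additive capacity. The count for the 2-additive case (one term per trait plus one term per pair of traits, less one normalization constraint) is an assumption added here and is not stated in the text; the function names are hypothetical.

```python
from math import comb


def n_capacity_parameters(n: int) -> int:
    """Free values of a general capacity: all subsets except the empty and full sets."""
    return 2 ** n - 2


def n_two_additive_parameters(n: int) -> int:
    """Assumed count for a 2-additive capacity: singletons plus pairs, minus normalization."""
    return n + comb(n, 2) - 1


# 3 traits as in the pig example, 7 traits as in the dairy cattle example.
counts = {n: (n_capacity_parameters(n), n_two_additive_parameters(n)) for n in (3, 7)}
```

With 7 traits the general capacity would need 126 values against 27 under the assumed 2-additive count, which is the kind of reduction the text refers to.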

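The interpretation of capacities is not always straightforward, but it is for the 2-additive Choquet integral of aggregate genotype because it can be expressed as

$$H_{AG} = \sum_{I_{ij} > 0} \min(BV_i, BV_j)\, I_{ij} + \sum_{I_{ij} < 0} \max(BV_i, BV_j)\, |I_{ij}| + \sum_{i=1}^{n} BV_i \left( \phi_i - \frac{1}{2} \sum_{j \ne i} |I_{ij}| \right)$$

or alternatively, taking into account that the absolute value of a real number $x$ can be defined as $|x| = \max(x, -x)$, following the expression

$$H_{AG} = \sum_{i=1}^{n} \phi_i BV_i - \frac{1}{2} \sum_{i<j} I_{ij} \left| BV_i - BV_j \right|, \qquad [2]$$

where $\phi_i$ is the Shapley value of breeding value i, and $I_{ij}$ is the interaction index between breeding values i and j, both of which are easier to interpret than capacities.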

The Shapley value can be interpreted as the overall contribution of a breeding value to the aggregate genotype, and the interaction index can be interpreted as the joint contribution of 2 breeding values (i,j). If Iij = 0, the aggregate genotype will correspond to independence between BV. If Iij > 0, the aggregate genotype will not be greater than the sum of individual BV, and the more BVi differs from BVj, the more the aggregate genotype is penalized. If Iij < 0, the aggregate genotype will be greater than the sum of individual BV, and the more BVi differs from BVj, the more the aggregate genotype is increased. It should be noted that if breeding values of different traits do not interact to determine decision maker preference, Eq [2] would be the equivalent of the Hazel linear additive function where economic values are replaced by Shapley values.

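Shapley values and interaction indexes can be computed by the expressions

$$\phi_i = \sum_{K \subseteq N \setminus \{i\}} \frac{(n - |K| - 1)!\,|K|!}{n!} \left( v_{K \cup \{i\}} - v_K \right)$$

$$I_{ij} = \sum_{K \subseteq N \setminus \{i,j\}} \frac{(n - |K| - 2)!\,|K|!}{(n - 1)!} \left( v_{K \cup \{i,j\}} - v_{K \cup \{i\}} - v_{K \cup \{j\}} + v_K \right)$$

Considering, for instance, 2 traits, the estimated Shapley values and interaction index would be

$$\phi_1 = \tfrac{1}{2}(v_1 + 1 - v_2), \qquad \phi_2 = \tfrac{1}{2}(v_2 + 1 - v_1), \qquad I_{12} = 1 - v_1 - v_2$$

(a more detailed example can be found in Alfonso [2013]), and the estimated aggregate genotype following Eq [2] would be

$$H_{AG} = \phi_1 BV_1 + \phi_2 BV_2 - \tfrac{1}{2} I_{12} \left| BV_1 - BV_2 \right|$$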
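A minimal sketch of how Shapley values and interaction indexes could be obtained from a capacity is given below. It uses the usual combinatorial definitions (weighted averages of marginal contributions over all subsets of the remaining traits), which is an assumption about the expressions intended here; the function names and the 2-trait capacity are hypothetical.

```python
from itertools import combinations
from math import factorial
from typing import Dict, FrozenSet, Iterable, Iterator


def _subsets(items: Iterable[int]) -> Iterator[FrozenSet[int]]:
    """Yield every subset of `items`, including the empty set."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def shapley_value(capacity: Dict[FrozenSet[int], float], n: int, i: int) -> float:
    """Average marginal contribution of trait i over all subsets of the other traits."""
    total = 0.0
    for k in _subsets(t for t in range(n) if t != i):
        weight = factorial(n - len(k) - 1) * factorial(len(k)) / factorial(n)
        total += weight * (capacity[k | {i}] - capacity[k])
    return total


def interaction_index(capacity: Dict[FrozenSet[int], float], n: int,
                      i: int, j: int) -> float:
    """Average joint contribution of traits i and j beyond their separate contributions."""
    total = 0.0
    for k in _subsets(t for t in range(n) if t not in (i, j)):
        weight = factorial(n - len(k) - 2) * factorial(len(k)) / factorial(n - 1)
        total += weight * (capacity[k | {i, j}] - capacity[k | {i}]
                           - capacity[k | {j}] + capacity[k])
    return total


# Hypothetical capacity for 2 traits; the empty set must be included with value 0.
v = {frozenset(): 0.0, frozenset({0}): 0.3, frozenset({1}): 0.5, frozenset({0, 1}): 1.0}
phi_1 = shapley_value(v, 2, 0)
phi_2 = shapley_value(v, 2, 1)
i_12 = interaction_index(v, 2, 0, 1)
```

For 2 traits the sketch reduces to phi_1 = (v1 + 1 - v2)/2, phi_2 = (v2 + 1 - v1)/2, and I12 = 1 - v1 - v2, so the hypothetical capacity above gives 0.4, 0.6, and 0.2, and the Shapley values sum to 1.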

Capacity Estimation

The underlying capacities must be estimated before a Choquet integral can be used as an alternative way to derive breeding goals and construct an aggregate genotype. This estimation can be performed by relying on a valuation measure of animals with different breeding values (expressed in a unipolar scale, usually [0,1]). Different valuation measures, such as stated or revealed preferences of experts, breeders, farmers, smallholders, etc., can be considered. Once the aggregate genotype for these reference animals, or groups of animals, has been established, different estimation approaches can be applied, such as the minimization of the squared error or the minimization of the absolute distance, the latter being recommended in the presence of outliers (Beliakov, 2009). Both are implemented in packages for the GNU R statistical system such as Kappalab (Grabisch et al., 2008) and Rfmtool (Vu et al., 2013).

Simulated Pig Data: An Example for the Derivation of Breeding Goals

A pig production scenario was considered in which the preferences of decision makers were expressed in relation to the relative genetic merit of different groups of animals. It was assumed that 10 different groups of decision makers evaluated 10 different groups of pigs, taking into account characteristics of growth, reproduction, and environmental impact. They were asked to express their preference based on the relative genetic merit of each group of pigs (BV) for ADG (g), number of piglets born alive (TBA, piglets), and ratio of N excreted over N intake (RNE, %). This last trait was considered as an example of a trait that is difficult to take into account in profit equation analysis. Two preference expressions were considered: first, a scale from 0 to 1 and second, a rank from the least preferable to the most preferable.

A data set was simulated by sampling BV from normal distributions N(0,50), N(0,0.8), and N(0,2) for ADG, TBA, and RNE, respectively; the traits were assumed to be genetically uncorrelated. The average relative preferences of each group of decision makers for improving the BV of ADG, TBA, and RNE (PADG, PTBA, and PRNE, respectively) were also sampled from normal distributions N(0.002,0.001), N(0.1,0.02), and N(1,0.05), respectively; that is, the utilities of improving ADG by 1 g and the number of piglets born alive by 1 piglet per parity were considered to be 0.2% and 10%, respectively, of the utility of improving RNE by 1%. Negative values of PRNE were considered because the objective was to decrease N excretion. Standard deviations of PADG, PTBA, and PRNE represented variability between groups of decision makers, with the coefficient of variation of preference being high for BVADG (50%), intermediate for BVTBA (10%), and low for BVRNE (5%). Taking into account the restrictions of the R package that was later used for analysis, BV values were transformed to the interval [0,1] {BVu = [BV − min(BV)]/[max(BV) − min(BV)]} to simulate the way the information about genetic merit was offered to decision makers (Table 1).
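The transformation to [0,1] described above, BVu = [BV − min(BV)]/[max(BV) − min(BV)], can be sketched as follows. The function name and the breeding values used in the example are hypothetical and are not taken from the simulated data set.

```python
from typing import List, Sequence


def to_unit_interval(values: Sequence[float]) -> List[float]:
    """Rescale values so that the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("all values are equal; the transformation is undefined")
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical breeding values for ADG (g) of 3 groups of pigs.
bv_adg = [-40.0, 10.0, 60.0]
bvu_adg = to_unit_interval(bv_adg)  # [0.0, 0.5, 1.0]
```

For a trait where lower values are preferred, the values could be negated before rescaling so that 1 still corresponds to the most desirable group.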


Table 1.

Description of data set simulated for the pig scenario example. Relative breeding values for ADG (BVuADG), number of piglets born alive (BVuTBA), and ratio of N excretion (BVuRNE) are shown for each population. Average values for overall preference simulated (OCPui) and corresponding average ranking values (RPi) are also shown for the scenarios simulated (i = 1) independence among traits; (i = 2) coalition between BVADG and BVRNE

 
Pig population BVuADG BVuTBA BVuRNE OCPu1 RP1 OCPu2 RP2
1 0.000 0.664 0.000 0.03 0.00 0.33 0.19
2 0.948 0.293 0.251 0.32 0.40 0.81 0.55
3 0.871 0.307 0.670 0.68 0.60 0.96 0.80
4 0.363 0.445 0.214 0.24 0.20 0.78 0.34
5 0.496 1.000 0.737 0.73 0.71 0.79 0.38
6 0.649 0.856 0.236 0.29 0.30 0.81 0.53
7 0.647 0.337 1.000 0.96 0.90 0.16 0.00
8 0.553 0.000 0.427 0.43 0.50 0.99 0.88
9 0.788 0.283 0.031 0.11 0.10 0.29 0.11
10 1.000 0.774 0.717 0.75 0.79 0.93 0.72

The overall preference (OCP) of decision makers for each group of pigs was computed in 2 different ways:

  1. Assuming an additive model:

  2. Assuming a model with an additional term proportional to the distance between ADG and RNE breeding values:

The first model simulates a scenario in which overall preference is independent of the combination of trait values. The second simulates a scenario where an equilibrated population for BVADG and BVRNE is preferable, i.e., a population as good for BVADG as for BVRNE. Figure 2 shows the resulting relationship between OCP and the utility (Pi*BVi) of each trait.

Figure 2.

Variation of overall preference with respect to the utility of each trait (Pi*BVi), simulated under 2 scenarios (black circles, OCP1: additive model; grey circles, OCP2: preference penalized by differences between BVADG and BVRNE). Adjusted polynomial lines (order 2) are shown. Pi, average relative preferences; BVi, breeding values; OCP, overall preference.

 

Finally, OCPi values were transformed to [0,1] (OCPui) and to a ranking preference (RPi) on a scale from 0 to 0.9 (Table 1). Data were analyzed using the Rfmtool package (Vu et al., 2013) for the GNU R statistical system. Shapley and interaction index values were computed to determine breeding goals, assuming a 2- and 1-additive Choquet integral. Additionally, values for OCPui and RPi were estimated and compared with those simulated.
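One way to obtain a ranking preference on a scale from 0 to 0.9 for 10 groups is to divide the rank position (0 for the least preferable group) by the number of groups. This is an assumption about how the ranking was scaled, and the function name is hypothetical. The input below is the average OCPu1 column of Table 1.

```python
from typing import List, Sequence


def ranking_preference(scores: Sequence[float]) -> List[float]:
    """Rank scores from least (0) to most preferable and divide by the number of items."""
    n = len(scores)
    # Indices sorted by increasing score; ties keep their original order.
    order = sorted(range(n), key=lambda i: scores[i])
    preference = [0.0] * n
    for rank, index in enumerate(order):
        preference[index] = rank / n
    return preference


# Average OCPu1 values of the 10 pig populations (Table 1).
ocpu1 = [0.03, 0.32, 0.68, 0.24, 0.73, 0.29, 0.96, 0.43, 0.11, 0.75]
rp1 = ranking_preference(ocpu1)
```

Applied to the averages, this gives 0.0, 0.4, 0.6, 0.2, 0.7, 0.3, 0.9, 0.5, 0.1, and 0.8, close to but not identical with the RP1 column of Table 1 (e.g., 0.71 and 0.79 for populations 5 and 10), which would be expected if the tabulated ranks are averages over the groups of decision makers.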

Dairy Cattle Data: Example for the Construction of Aggregate Genotype

The construction of an aggregate genotype for dairy cattle production was considered to be a second example using available real data. Dairy cattle are one of the most studied livestock systems in terms of profit equations and derivation of economic values for traits included in different breeding goals defined across the world. Specifically, Holstein U.S. dairy cattle data were analyzed because a large amount of information is available. It was assumed that the preferences of decision makers were expressed based on predicted breeding values of animals.

First, information about sires’ expected genetic values was obtained from the Active AI Sire List database of the National Association of Animal Breeders (NAAB). This is a database with genetic evaluations and semen price information for the Active AI Sires and was obtained in December 2015 (http://www.naab-css.org). The predicted transmitting abilities (PTA) for 7 traits included in the Total Performance Index (TPI) were considered: milk protein (P), milk fat (F), productive life (PL), somatic cell score (SCS), daughter pregnancy rate (DPR), daughter calving ease (DCE), and daughter stillbirth (DSB). These traits were chosen as examples of old and new breeding objectives in dairy cattle.

Second, information about the preferences of dairy breeders was extracted from the High Registry Activity by Bull list produced by Holstein Association USA, Inc. (http://www.holsteinusa.com/hol/highRegistryBulls.action). It is a list that ranks the bulls by most offspring in the last 2 wk and was obtained in February, March, and April of 2016. For this example, the number of offspring was considered as a “realized preference” on aggregate genotype.

Merged information from both databases for 42 bulls was available for analysis (Table 2). Capacities were estimated based on the values of PTA and the number of offspring using Rfmtool (Vu et al., 2013), which uses linear programming techniques. As Rfmtool assumes input variables in [0,1], data were transformed as in the simulated example. Beforehand, PTA values for SCS, DCE, and DSB were changed in sign because lower genetic values are preferred for these traits. The 2-additive Choquet integral was considered to determine the aggregate genotype.


Table 2.

Data set considered in the dairy cattle example

 
PTA1
Bull P F PL SCS DPR DCE DSB No_offs2
1 42 49 4.7 2.79 1.2 4.1 3.7 615
2 54 95 5 2.98 0.2 5 4.8 474
3 59 63 9.2 2.9 1.8 5.9 5 428
4 65 81 4.2 3.07 −1.1 6.2 6.4 396
5 44 59 5.1 2.97 0.9 4.5 6.9 388
6 44 64 6.4 2.94 1.4 3.3 3.2 322
7 55 65 7.7 2.88 2.6 5.9 7.4 289
8 53 89 8.1 2.83 1.6 5.6 5.1 281
9 57 102 4.6 2.88 −1 5.3 7.1 278
10 57 48 5.4 2.89 2.8 6.1 5.5 271
11 50 53 6.8 2.87 1.6 6.2 5.2 260
12 61 72 6.6 2.73 0.6 4.7 4.7 245
13 36 57 3.3 2.82 0.1 5.8 6.2 242
14 79 80 6.8 3 1.7 4.6 4.4 212
15 42 59 7.9 2.76 2.6 5.5 5 182
16 24 29 5.6 2.88 4.4 4.3 6.7 159
17 20 34 7.2 2.49 6 5.4 4.7 157
18 62 72 5.6 2.89 1.6 3.4 3.9 149
19 39 79 4.9 2.85 0.2 5.1 5.9 142
20 66 100 5.7 3.07 −0.4 7.2 6.3 141
21 50 66 7.5 2.7 2.7 4.6 4.1 140
22 55 100 6.8 3 0.8 4.9 4.6 134
23 27 70 4.7 2.82 0.6 6 6.5 134
24 2 7 0 2.74 −1 4 5.9 134
25 43 79 3.8 2.92 1.3 6.7 6.5 125
26 39 68 4 2.88 0.4 6.1 7.2 120
27 50 57 6.3 2.74 0.6 5.3 5.8 119
28 41 51 8.2 2.77 2.5 7.4 5.7 118
29 75 94 5.1 2.92 0.4 4.8 3.4 114
30 51 72 4.8 2.91 1.3 6.9 5.8 108
31 48 64 3.9 2.74 1.1 5.2 4.5 107
32 51 76 4.1 2.89 −0.2 6.9 7.4 105
33 41 91 6.2 2.73 1.1 4.7 6.9 90
34 59 75 5 2.93 1.3 3.3 4.5 85
35 48 67 4.9 2.8 0.8 4 5.9 84
36 −19 −10 −0.6 3.14 −1.7 9.7 10.4 82
37 32 64 5.2 2.76 5.4 5.4 8 81
38 68 74 6.6 2.93 2.5 4.5 6.2 78
39 37 33 6.6 2.53 3.4 5.6 6.3 75
40 37 68 4.6 2.52 −0.6 6 6.4 67
41 32 44 5.8 2.86 2.3 5.9 6.8 67
42 34 42 0.2 2.87 −3.4 7.2 9.2 63
1Predicted transmitting ability (PTA) for milk protein (P), milk fat (F), productive life (PL), somatic cell score (SCS), daughter pregnancy rate (DPR), daughter calving ease (DCE), and daughter stillbirth (DSB).
2Number of offspring for 42 highly employed bulls over the period considered.


RESULTS AND DISCUSSION

Derivation of Breeding Goals From Simulated Data

Under the P1 scenario, no important coalitions among traits were detected, but they were in the P2 scenario (Table 3). In this last scenario, a coalition was detected between ADG and RNE (IADG-RNE > 0) that shows that the more BVADG was different from BVRNE, the more the aggregation was penalized; the other 2 interactions were complementary (IADG-TBA > 0; ITBA-RNE > 0), indicating that it is preferable that populations have high BV for ADG, TBA, and RNE traits. These results were in agreement with the simulation models.


Table 3.

Shapley values and interaction indexes estimated in the pig example when considering overall preference (OCPui) or ranking values (RPi) simulated assuming (i = 1) independence among traits or (i = 2) coalition between breeding values for ADG (BVADG) and breeding values for the ratio of N excretion (BVRNE)

 
                       k = 2                                  k = 1
                       Shapley value   Interaction index      Shapley value
Simulation   Trait1                    TBA        RNE
OCPu1        ADG       0.082           −0.007     −0.004      0.091
             TBA       0.030                      −0.018      0.025
             RNE       0.888                                  0.883
OCPu2        ADG       0.754           −0.568     0.489       0.570
             TBA       0.229                      −0.455      0.430
             RNE       0.017                                  0.000
RP1          ADG       0.186           −0.062     −0.001      0.155
             TBA       0.018                      0.097       0.000
             RNE       0.797                                  0.845
RP2          ADG       0.677           −0.161     0.817       0.414
             TBA       0.068                      −0.306      0.151
             RNE       0.256                                  0.436
1TBA, number of piglets born alive; RNE, ratio of N excreted over N intake.

Focusing first on the P1 scenario, interaction indexes estimates indicate that no important coalitions exist among traits, in agreement with the additive simulation model, with respect to all 3 traits. Consequently, estimated Shapley values for 2-additivity and 1-additivity Choquet integrals were similar. Simulated OCP1 and RP1 values were accurately estimated considering both 2- and 1-additivity expressions; the Spearman coefficient of correlation was always close to unity (r = 0.99).
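The Spearman coefficient used to compare simulated and estimated values can be sketched in plain Python as the Pearson correlation of the ranks, with ties given their average rank. The function names are hypothetical, and the "estimated" values in the example are made up for illustration; they are not results from the article.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


simulated = [0.03, 0.32, 0.68, 0.24, 0.73]   # first 5 OCPu1 values of Table 1
estimated = [0.05, 0.30, 0.70, 0.20, 0.75]   # hypothetical estimates
rho = spearman(simulated, estimated)
```

Because only the ordering matters, the coefficient stays at unity whenever the estimated values rank the populations in the same order as the simulated ones, even if the values themselves differ.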

Shapley values indicate that preference was mainly determined by RNE and that ADG and TBA were of little importance, especially TBA. These results were also in agreement with the simulation, as the utilities of improving 1 unit of ADG and TBA were 0.2 and 10% of the utility of improving 1 unit of RNE, and their standard deviations were 50, 0.8, and 2 units, respectively, so the simulated overall preference OCP1 was mainly determined by RNE, as shown in Fig. 2. The variability of OCP1 around the polynomial-adjusted lines observed in Fig. 2, due to the variability between groups of simulated decision makers, was high for BVADG (CV = 50%), intermediate for BVTBA (CV = 10%), and low for BVRNE (CV = 5%).

Under the P2 scenario, when preferences were simulated as being aggregated in a nonadditive way, the Choquet integral had more difficulty extracting information from the data. Larger differences between simulated and estimated values were observed than in the P1 scenario. The Spearman coefficients of correlation were low for the 2-additivity model (0.45 and 0.44 for OCP2 and RP2) and lower still (0.30 and 0.26 for OCP2 and RP2) when no coalitions were taken into account (1-additivity). These results were expected because the simulation model was different from the Choquet integral for k = 1, and also, though more similar, for k = 2.

Estimated Shapley values (k = 2; Table 3) indicate that the most important trait for decision makers under this scenario was ADG instead of RNE. This result agrees with the simulated relationship between RNE utility and overall preference OCP2. As shown in Fig. 2, decision makers preferred intermediate breeding values for RNE but high breeding values for ADG. Shapley values estimated for the 2- and 1-additivity Choquet integrals differed more than in the P1 scenario, as expected, because the simulated breeding value interaction was neglected when assuming 1-additivity. The coalition estimated between ADG and RNE corresponded to the simulated term $-(BV_{ADG} - BV_{RNE})^2$. Interaction indexes for ADG and RNE were positive for both OCP2 and RP2, indicating a synergy between both traits, and negative for TBA with ADG and RNE, indicating redundancy according to simulation.

In summary, results indicate that the 2-additivity Choquet integral could help in some way to identify coalitions among traits when deriving breeding goals, whatever the model is that determines the preferences of decision makers. Differences found when using OCP and RP measurements also indicated that the expression of decision maker preferences as a single ranking of different animals or groups of animals under evaluation could be sufficient to extract information to determine breeding objectives.

Construction of Aggregate Genotype for Dairy Cattle Data

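According to the results shown in Table 4 and considering the PTA values of P, F, PL, SCS, DPR, DCE, and DSB, which are taken into account for the dairy cattle example, the aggregate genotype could be approximated by the following expression (where u indicates data in [0,1]):

$$H_{AG} = 0.0834\,PTA^{u}_{F} + 0.4166\,PTA^{u}_{-SCS} + 0.5\,PTA^{u}_{DPR} - \tfrac{1}{2} \left( 0.1667 \left| PTA^{u}_{F} - PTA^{u}_{DPR} \right| + 0.8333 \left| PTA^{u}_{-SCS} - PTA^{u}_{DPR} \right| \right)$$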


Table 4.

Shapley values and interaction indexes estimated in the dairy cattle example

 
                            Interaction indexes
Trait1   Shapley values     F    PL   -SCS   DPR      -DCE   -DSB
P        0                  0    0    0      0        0      0
F        0.0834                  0    0      0.1667   0      0
PL       0                            0      0        0      0
-SCS     0.4166                              0.8333   0      0
DPR      0.5000                                       0      0
-DCE     0                                                   0
-DSB     0
1P, milk protein; F, milk fat; PL, productive life; SCS, somatic cell score; DPR, daughter pregnancy rate; DCE, daughter calving ease; DSB, daughter stillbirth.

It was mainly determined by the PTA of F, SCS, and DPR. The PTA of P, PL, DCE, and DSB were not found to be relevant for decision makers (Shapley values = 0). The apparent importance of SCS and DPR was similar and greater than the importance of F (Shapley values, 0.417, 0.5, and 0.083, respectively). However, interaction index values indicated some degree of coalition between F and DPR and especially between SCS and DPR. The aggregate genotype would be lower than the sum of the individual PTA for F, SCS, and DPR because positive values were found, indicating redundancy between these PTA values. The preference for one or the other bull is not only dependent on the individual PTA but also on their combination. For instance, bull “A” with PTAu values for F, SCS, and DPR of 0, 1, and 1, respectively, would have a greater aggregate genotype (0.834) than bull “B” with values 1, 0.5, and 1 (0.583).
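The values quoted for bulls A and B can be checked with a short sketch that assumes the 2-additive aggregate genotype is the Shapley-weighted sum of the PTAu values minus half of each interaction index times the absolute difference between the two PTAu values involved. The function name is hypothetical; the Shapley values and interaction indexes are those of Table 4, the traits with zero Shapley value and zero interactions are omitted, and "SCS" stands for the sign-reversed, rescaled SCS.

```python
from typing import Dict, Tuple


def two_additive_aggregate(values: Dict[str, float],
                           shapley: Dict[str, float],
                           interaction: Dict[Tuple[str, str], float]) -> float:
    """Shapley-weighted sum penalized (or rewarded) by pairwise interaction terms."""
    total = sum(shapley[trait] * values[trait] for trait in values)
    for (first, second), index in interaction.items():
        total -= 0.5 * index * abs(values[first] - values[second])
    return total


# Estimates from Table 4 (only the traits with non-zero Shapley values).
shapley = {"F": 0.0834, "SCS": 0.4166, "DPR": 0.5000}
interaction = {("F", "SCS"): 0.0, ("F", "DPR"): 0.1667, ("SCS", "DPR"): 0.8333}

bull_a = {"F": 0.0, "SCS": 1.0, "DPR": 1.0}
bull_b = {"F": 1.0, "SCS": 0.5, "DPR": 1.0}
h_a = two_additive_aggregate(bull_a, shapley, interaction)
h_b = two_additive_aggregate(bull_b, shapley, interaction)
```

Under these assumptions the sketch gives about 0.833 for bull A and 0.583 for bull B, in line with the values quoted in the text; the small difference for bull A is consistent with rounding of the estimates in Table 4.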

To assess the relevance of considering interaction indexes among traits when constructing aggregate genotypes, these values can be compared with those obtained under the assumption of independence. Under the simplest linear model (1-capacity model), the corresponding aggregate genotype was computed, with values for bulls A and B of 1 and 0.595, respectively. So, the aggregate genotypes were overestimated under the linear model for these 2 bulls because the linear aggregate did not account for redundancy between breeding values of different traits in decision maker preference. The ranking would be the same for both bulls, but when all of the bulls in Table 2 were considered, the Spearman coefficient of correlation between the 2-additive and 1-additive aggregate genotypes was 0.73. In spite of the changes in bull ranking, the greater complexity of the 2-additive vs. the 1-additive aggregate genotype should be considered, especially if traits under genetic evaluation differ from the traits of the aggregate genotype. Nonlinear models for the aggregate genotype have been previously discussed in the literature in the context of profit functions, but uniformly best solutions have not been proposed (Weller et al., 1996). If weak coalitions among traits are estimated, the 1-additive Choquet integral could be adequate to aggregate values with and without estimable economic values.


CONCLUSIONS

The main interest of the proposed approach lies in analyzing new traits that are difficult to include in profit equations based on the preferences of decision makers. The Choquet integral provides an easy way to construct an aggregate genotype, eliminating the need for economic values, and can be applied using previously collected data to include societally and environmentally important traits in breeding objectives. An additional interest lies in the analysis of possible interactions among traits defining aggregate genotype, though the resulting increased complexity should be evaluated compared with the simpler additive model before constructing a practical aggregate genotype.

 
