This article in JAS

  1. Vol. 95 No. 4, p. 1489-1501
    Received: Nov 19, 2016
    Accepted: Jan 25, 2017
    Published: April 13, 2017

    2 Corresponding author(s):


The impact of multi-generational genotype imputation strategies on imputation accuracy and subsequent genomic predictions

  1. M. M. Judge*,
  2. D. C. Purfield*,
  3. R. D. Sleator and
  4. D. P. Berry 2*
  1. * Teagasc, Animal & Grassland Research and Innovation Center, Moorepark, Ireland
     Department of Biological Sciences, Cork Institute of Technology, Bishopstown, Ireland


The objective of the present study was to quantify, using simulations, the impact of successive generations of genotype imputation on genomic predictions. The impact of using a small reference population of true genotypes versus a larger reference population of imputed genotypes on the accuracy of genomic predictions was also investigated. After construction of a founder population, high-density (HD) genotypes (n = 43,500 single nucleotide polymorphisms, SNP) were simulated across 25 generations (n = 46,800 per generation); a low-density genotype panel (n = 3,000 SNP) was developed from these HD genotypes and was then used to impute genotypes using 7 alternative imputation strategies. Phenotypes of both low (0.03) and moderate (0.35) heritability were simulated. Direct genomic values (DGV) were estimated using imputed genotypes from the investigated scenarios, and the accuracy of predicting the simulated true breeding values (TBV) was expressed relative to the accuracy when the true genotypes were used. Mean allele concordance rate and the rate of change in mean allele concordance per generation differed between the imputation strategies investigated. Imputation was most accurate when the true HD genotypes of sires and 50% of the dams of the generation being imputed were included in the reference population; the average allele concordance rate for this scenario across generations was 0.9707. The strongest correlation between the TBV and DGV of the last generation was achieved when the reference population included sequentially imputed HD genotypes of all previous generations, plus the true HD genotypes of all sires of the previous generations (0.987 as efficient as when the true genotypes were used in the reference population). With a moderate heritability, the correlation between the TBV and the DGV using a small reference population of accurate genotypes was, on average, 0.07 units stronger compared to DGV generated using a larger population of imputed genotypes.
When the heritability was low, the accuracy of genomic predictions benefited from a larger reference population, even if SNP were imputed. The impact on the accuracy of genomic predictions from the accumulation of imputation errors across generations indicates the need to routinely generate HD genotypes on influential animals to reduce the accumulation of imputation errors over generations.
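
The two summary statistics reported above, allele concordance rate and the accuracy of DGV relative to that obtained with true genotypes, can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions here are assumptions: genotypes are coded as 0, 1 or 2 copies of an allele, concordance is taken as the proportion of alleles imputed correctly, and relative accuracy is taken as a ratio of Pearson correlations. All function names and the toy data are hypothetical.

```python
from math import sqrt


def allele_concordance(true_genotypes, imputed_genotypes):
    """Proportion of alleles imputed correctly (assumed definition).

    Each genotype is coded as the number of copies (0, 1 or 2) of one allele,
    so a genotype contributes 2 alleles and |true - imputed| mismatches.
    """
    total_alleles = 0
    matching_alleles = 0
    for true_row, imputed_row in zip(true_genotypes, imputed_genotypes):
        for true_g, imputed_g in zip(true_row, imputed_row):
            total_alleles += 2
            matching_alleles += 2 - abs(true_g - imputed_g)
    return matching_alleles / total_alleles


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def relative_accuracy(tbv, dgv_imputed, dgv_true):
    """Accuracy with imputed genotypes relative to accuracy with true genotypes.

    Assumes accuracy is measured as corr(TBV, DGV), as described in the abstract.
    """
    return pearson(tbv, dgv_imputed) / pearson(tbv, dgv_true)


# Toy data (hypothetical): 2 animals x 4 SNP, with two single-allele errors.
true_hd = [[0, 1, 2, 1], [2, 2, 0, 1]]
imputed_hd = [[0, 1, 1, 1], [2, 2, 0, 0]]
concordance = allele_concordance(true_hd, imputed_hd)
print(concordance)
```

Under these assumed definitions, a relative accuracy of 0.987 would mean that the correlation between TBV and DGV obtained with imputed genotypes is 98.7% of the correlation obtained when true genotypes are used in the reference population.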


Copyright © 2017. American Society of Animal Science