Selection index theory is an important tool for animal breeders. It combines multiple sources of information into a single estimate of an animal's breeding value, weighting each source so that the accuracy of the estimated breeding value is maximized. This post outlines selection index theory in simple terms, gives some examples, and introduces an online tool that can quickly compute the optimal index weights.
Breeding Values
Animal breeders aim to gradually improve populations by means of artificial selection. Each generation, the genetically superior animals are selected to become parents of the next generation. In most cases, breeders identify the superior individuals by estimating breeding values.
The first step towards estimating breeding values is to collect data. Suppose we want to increase egg size in chickens. We then need to measure egg size in our current population of animals. These measurements are typically called phenotypes.
The simplest strategy is to estimate the breeding value of each animal from its own phenotype only. Animals with a high phenotype will then have a higher estimated breeding value (EBV) than animals with a low phenotype. However, this strategy may be suboptimal, because additional information is often available that can increase the accuracy of the EBV. For example, phenotypes from relatives (parents, (half-)sisters, offspring, etc.) provide additional information on the breeding value of an animal. If we combine all the available information sources, we can maximize the accuracy of the EBV.
Selection Index Theory
Suppose that I am interested in estimating the breeding value for milk production of a dairy cow named Molly. Phenotypes are collected from Molly herself and from her mother. These two sources of information are not independent (they are correlated), because the genes that were transmitted from the mother to Molly are expressed in both phenotypes. If we ignored this correlation, we would give too much weight to each of these information sources. In other words, we would be counting part of the same information twice, which leads to suboptimal accuracy of the estimated breeding value.
To account for correlations between information sources, we can use selection index theory. Selection index theory allows us to combine multiple information sources while accounting for the fact that these sources are correlated and therefore provide overlapping information.
The selection index for a single trait can be written as a simple mathematical formula:
EBV = b1X1 + b2X2 + b3X3 + … + bnXn
where EBV is the estimated breeding value, the b variables are called the index weights, the X variables are phenotype values expressed as deviations from the population mean, and subscripts denote the different information sources.
Let’s go back to the example of Molly. She has produced 9000 kilograms of milk in a year, while her mother has produced 9700 kilograms of milk. The average milk production of the entire population is 8500 kg. Suppose that the optimal index weights are 0.24 for the phenotype of Molly and 0.10 for the phenotype of her mother. The EBV of Molly can then be calculated as
EBV = b1X1 + b2X2 = 0.24*(9000-8500) + 0.10*(9700-8500) = 120 + 120 = 240 kg
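The same calculation can be written as a few lines of Python. This is only a minimal sketch of the index formula given earlier; the function name `estimated_breeding_value` and its parameter names are made up for illustration and do not come from any existing tool.

```python
def estimated_breeding_value(weights, phenotypes, population_mean):
    """Return the index value: sum of b_i * (phenotype_i - population mean)."""
    if len(weights) != len(phenotypes):
        raise ValueError("need exactly one index weight per phenotype")
    return sum(b * (x - population_mean) for b, x in zip(weights, phenotypes))


# Molly's own record and her mother's record, in kg of milk per year.
molly_ebv = estimated_breeding_value(
    weights=[0.24, 0.10],
    phenotypes=[9000, 9700],
    population_mean=8500,
)
print(round(molly_ebv, 1))  # 240.0
```

Because the phenotypes enter as deviations from the population mean, an animal whose records all equal the mean gets an EBV of zero, whatever the weights are.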
Optimizing the index
The goal of using selection index theory is to find the values of b (the index weights) that maximize the accuracy of the EBV, where accuracy is the correlation between the estimated and the true breeding value. I will not show the full theory here, but with some relatively simple algebra we can come up with formulas that give us the optimal index weights.
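To give a feel for what such a formula looks like, here is a rough sketch of the standard textbook result, in which the optimal weights solve the linear system P b = g. Here P holds the (co)variances among the information sources and g holds their covariances with the true breeding value. Everything specific in the example is an assumption made for illustration: a heritability of 0.25, one record per animal, no shared environment between dam and daughter, and variances scaled so that the phenotypic variance equals 1. The helper `solve_linear_system` is a hypothetical name written for this post, not part of any library.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's lists are left untouched.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - known) / aug[row][row]
    return x


h2 = 0.25           # assumed heritability (not given in this post)
relationship = 0.5  # additive genetic relationship between dam and daughter

# Information sources: [own phenotype, dam's phenotype].
# Phenotypic variance is scaled to 1, so the additive genetic variance is h2.
P = [[1.0, relationship * h2],
     [relationship * h2, 1.0]]
g = [h2, relationship * h2]

weights = solve_linear_system(P, g)
# Accuracy of the index: sqrt(b'g / additive genetic variance).
accuracy = math.sqrt(sum(b * gi for b, gi in zip(weights, g)) / h2)

print([round(b, 3) for b in weights])  # [0.238, 0.095]
print(round(accuracy, 3))              # 0.535
```

Under these assumptions the weights come out close to the 0.24 and 0.10 used for Molly. The accuracy of about 0.535 is higher than the 0.5 (the square root of 0.25) that her own phenotype alone would give, which illustrates what the mother's record adds.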
I have developed a simple online tool for calculating index weights using selection index theory. The tool can be used for educational purposes and to gain insight into the value of different information sources and how that value depends on parameters such as heritability and repeatability.