Campus Units

Animal Science

Publication Version

Accepted Manuscript

Journal or Book Title

Livestock Science

Breed associations and registries maintain breed purity by enforcing the conformational characteristics that define the breed and by cataloguing the pedigree of every animal in the registry. Furthermore, niche markets are often built on specialized products from heritage breeds, for which breed purity must be guaranteed. Genomic technology and the progressively lower cost of genotyping can help assess breed purity by estimating breed composition. In this research, genotypes from 648 pigs of 11 breeds were used to develop marker panels to estimate breed composition, with special emphasis on Mangalitsa pigs as a heritage breed. Two sets of panels were created. The first set was based on Fst scores calculated individually for ~31,000 available markers across the pig genome; from these, panels composed of the 10, 50, 100, 500 and 1000 markers with the highest Fst scores were generated.
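
The Fst-based panel construction can be illustrated with a minimal sketch. The abstract does not state which Fst estimator was used, so the Nei-style ratio (H_T - H_S) / H_T below is an assumption, as are the function names `fst_per_marker` and `top_fst_panel` and the 0/1/2 genotype coding.

```python
import numpy as np

def fst_per_marker(genotypes, breeds):
    """Nei-style Fst per marker (assumed estimator, not taken from the paper).

    genotypes: (n_animals, n_markers) array of allele counts coded 0/1/2.
    breeds:    length n_animals array of breed labels.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    breeds = np.asarray(breeds)
    labels = np.unique(breeds)
    # Frequency of the counted allele within each breed.
    p = np.array([genotypes[breeds == b].mean(axis=0) / 2.0 for b in labels])
    p_bar = p.mean(axis=0)                     # unweighted mean across breeds
    h_s = (2.0 * p * (1.0 - p)).mean(axis=0)   # mean within-breed heterozygosity
    h_t = 2.0 * p_bar * (1.0 - p_bar)          # total expected heterozygosity
    # Monomorphic markers have h_t == 0; assign them an Fst of 0.
    return np.divide(h_t - h_s, h_t, out=np.zeros_like(h_t), where=h_t > 0)

def top_fst_panel(fst, size):
    """Indices of the `size` markers with the highest Fst, best first."""
    return np.argsort(fst)[::-1][:size]
```

Under these assumptions, calling `top_fst_panel(fst, n)` for n in 10, 50, 100, 500 and 1000 would give nested panels analogous to the ones described above.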

The second set was composed of randomly selected markers, with the same numbers of markers as the Fst-derived panels. Two statistical methods, linear regression and random forest, were then used on the marker panels to estimate the breed composition of 107 pigs, including 47 individuals known to have Mangalitsa background. Fst-selected markers appeared to be better than random markers at identifying Mangalitsa individuals, regardless of the method used to estimate breed composition. However, random markers were more accurate at estimating breed composition for non-Mangalitsa individuals.
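
One common way to estimate breed composition by linear regression is to regress an animal's allele dosages on the allele frequencies of the reference breeds and read the coefficients as breed proportions. The sketch below follows that idea. It is not necessarily the model fitted in the study: the no-intercept least-squares fit, the clipping of negative coefficients and the rescaling to sum to one are all assumptions, and `breed_composition` is a hypothetical name.

```python
import numpy as np

def breed_composition(animal_genotype, ref_freqs):
    """Estimate breed proportions for one animal (illustrative sketch).

    animal_genotype: (n_markers,) allele counts coded 0/1/2.
    ref_freqs:       (n_markers, n_breeds) reference allele frequencies.
    """
    y = np.asarray(animal_genotype, dtype=float) / 2.0   # dosage on a 0..1 scale
    X = np.asarray(ref_freqs, dtype=float)
    # Ordinary least squares without an intercept.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Assumption: negative estimates are set to zero and the rest rescaled.
    coef = np.clip(coef, 0.0, None)
    total = coef.sum()
    return coef / total if total > 0 else coef
```

With reference frequencies that are fixed for opposite alleles in two breeds, a fully heterozygous animal would be estimated as half of each breed, which is the expected result for a first-generation cross.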

When the results were compared across methods, linear regression produced more accurate estimates of breed composition than random forest. However, both methods lacked accuracy when estimating breed composition for crossbred individuals. It must also be noted that these methods were focused on estimating the breed composition of Mangalitsa pigs; different markers should be selected if other breeds are the focus. Accuracy of prediction will also depend on the breeds available as references for the Fst calculations.

The results presented in this study allow us to conclude that: 1) Random forest was effective at classifying individuals into breeds, but less effective than linear regression at estimating breed composition. 2) Markers filtered using Fst scores are more effective at identifying Mangalitsa breed composition but less effective at identifying other breeds. 3) If Fst-filtered markers that are effective at distinguishing Mangalitsa from other breeds are used to estimate breed composition for individuals of other breeds, a greater number of markers is needed.


This is a manuscript from an article published as Chinchilla-Vargas, Josue, Francesca Bertolini, K. J. Stalder, J. P. Steibel, and M. F. Rothschild. "Estimating breed composition for pigs: A case study focused on Mangalitsa pigs and two methods." Livestock Science: 104398. doi: 10.1016/j.livsci.2021.104398. Posted with permission.

Copyright Owner

Elsevier B.V.



Available for download on Sunday, January 09, 2022
