Center for Digital Agriculture

Project Team


The approaches commonly used to identify cattle with the highest genetic potential for milk production and health make simplifying assumptions about the relationship between genotypes and phenotypes. These simplifications bias the identification of genetically superior animals and hinder the improvement of the U.S. dairy cattle population. We propose the use of deep learning to address the analytical limitations of current models. The goal of this proposal is to assess how well convolutional neural networks (CNNs) relate genomic and phenotypic information. The capacity of this approach to accommodate both additive and non-additive genomic effects will improve the identification of superior animals and advance the understanding of the molecular architecture of dairy traits.

Our team is uniquely positioned to pioneer the application of deep learning methods to U.S. dairy cattle improvement. A one-of-a-kind dataset to train and validate the CNNs is available to investigator Rodriguez Zas through her role as an investigator on a USDA multi-institutional grant. This dataset includes milk yield and health records from over 11,000 Holstein cows across the U.S. Cows from this population were genotyped for 770,000 single nucleotide polymorphisms (SNPs) across the genome. A new collaboration between ACES investigator Rodriguez Zas, who contributes expertise in livestock genomic analysis, and NCSA investigator Huerta Escudero, who provides expertise in deep learning methods in high-performance computing environments, will enable the application of CNNs to this comprehensive dataset. Results from the proposed project will support grant applications aligned with USDA NIFA Foundational program priority areas. The proposed project will showcase the multiple benefits of deep learning approaches, including: a) the identification of genomic locations influencing traits of economic importance to the dairy industry; b) the characterization of epistatic effects influencing dairy traits; and c) the computation of precise merit estimates for genome-enabled improvement of the U.S. dairy population.
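As a rough illustration of how a CNN can relate SNP genotypes to a phenotype, the sketch below runs a single untrained forward pass in NumPy. It is not the project's model. The one-hot genotype encoding, the layer sizes, the pooling step, and all function names are assumptions made for this example, and the genotypes and weights are random placeholders rather than real data. One-hot encoding the allele counts (0, 1, 2) avoids forcing the heterozygote to the midpoint of the two homozygotes, and a kernel spanning several adjacent SNPs followed by a nonlinearity lets the network combine nearby loci non-additively.

```python
import numpy as np


def one_hot_genotypes(genotypes):
    """Encode allele counts (0, 1, 2) as one-hot channels.

    genotypes: integer array of shape (animals, snps).
    Returns a float array of shape (animals, snps, 3).
    """
    return np.eye(3)[genotypes]


def conv1d_relu(x, kernels, bias):
    """Valid 1D convolution along the SNP axis, followed by ReLU.

    x: (animals, snps, channels)
    kernels: (width, channels, filters)
    bias: (filters,)
    Returns an array of shape (animals, snps - width + 1, filters).
    """
    width = kernels.shape[0]
    # windows has shape (animals, positions, channels, width)
    windows = np.lib.stride_tricks.sliding_window_view(x, width, axis=1)
    out = np.einsum("npcw,wcf->npf", windows, kernels) + bias
    return np.maximum(out, 0.0)


def predict(genotypes, kernels, bias, weights, intercept):
    """Hypothetical one-layer CNN: convolution, ReLU, mean pooling, linear output."""
    features = conv1d_relu(one_hot_genotypes(genotypes), kernels, bias)
    pooled = features.mean(axis=1)  # average over SNP positions -> (animals, filters)
    return pooled @ weights + intercept


# Random placeholder inputs (assumed sizes): 4 animals, 20 SNPs, 8 filters of width 5.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(4, 20))
kernels = rng.normal(size=(5, 3, 8))
bias = np.zeros(8)
weights = rng.normal(size=8)

predictions = predict(genotypes, kernels, bias, weights, 0.0)
```

In practice the kernels and output weights would be fitted to phenotype records with a deep learning framework, and a panel of 770,000 SNPs would call for deeper architectures and batched training on high-performance computing resources.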