The Harry M. Zweig Memorial Fund for Equine Research

Horse Genome Project: Functional Genomics through Equine Microarrays

Dr. Douglas F. Antczak

In 2005 we proposed to evaluate an important new tool for studying the expression of very large numbers of genes in single experiments: the expression array gene chip. This research was to be a collaborative undertaking within the overall Horse Genome Project Workshop, in which our laboratory has participated successfully for the past decade. Our part of this new project depended upon our colleagues at Texas A&M University, who were responsible for constructing the gene chip using cDNA sequences produced earlier by several laboratories, including ours. Unfortunately, the Texas group encountered unexpected difficulties in the production process and has thus far been unable to produce the expression array. Without this gene chip, we have been unable to conduct all of the experiments that we had proposed for year one of this project.

However, because of an unanticipated development in horse genomics, we have been able to begin some very promising genomic experiments that are consistent with our original aims. Furthermore, there is good news to report concerning new competitive grant funding for horse genomics. Finally, my laboratory continues to publish scientific papers based on research in equine genomics supported by the Zweig Fund.

Whole genome sequencing of the horse: In early 2006 the National Institutes of Health selected the horse for the list of mammals to be used for high density whole genome sequencing. ‘High density whole genome sequencing’ means that, on average, the entire equine genome will be sequenced five times over (so-called 5X coverage of the genome). In practice, this means that about 85% of the horse genome will be sequenced, because some parts of the genome are very difficult to sequence and would not be captured at 5X coverage. This 5X level of coverage is equivalent to that obtained for mice, rats, and dogs, and is exceeded only by the 8-9X coverage obtained for the human genome sequence. As of this writing (September 2006) the DNA sequencing of the horse genome is virtually complete, but an additional six months of work will be required to ‘assemble’ the raw data into a finished sequence. Additional details of the whole genome sequencing project are provided in the scientific Progress Report.
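The arithmetic behind an "NX coverage" figure can be sketched briefly: average coverage is simply the total number of bases sequenced divided by the length of the genome. The short Python example below is only an illustration. The genome size and read length it uses are assumed round numbers chosen for the example, not figures taken from the sequencing project, and the function names are hypothetical.

```python
import math


def average_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth: total bases sequenced divided by genome length."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose total length reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    # Assumed, illustrative values (not from the report).
    genome_size = 2_700_000_000  # bases
    read_length = 700            # bases per sequencing read

    n = reads_needed(5, read_length, genome_size)
    print(f"Reads needed for 5X average coverage: {n:,}")
    print(f"Resulting average coverage: {average_coverage(n, read_length, genome_size):.2f}X")
```

Note that this is an average only. As described above, some regions are difficult to sequence, so an average depth of 5X does not imply that every base of the genome is represented in the final sequence.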

The primary motivation of the NIH in sequencing several mammalian species is to obtain comparative genomic information that will improve the understanding of the structure, function, and evolution of the human genome. However, the NIH decision to sequence the horse genome at this time was strongly influenced by the tremendous progress in equine gene mapping made over the past decade by the Horse Genome Project Workshop, and by the equine genetic researchers' demonstration of the many important applications to equine medicine, surgery, and husbandry that a whole genome sequence of the horse would facilitate. Thus, the equine genome sequence is expected to serve both human and equine health. It is difficult to overstate the importance of this advance in equine genetics. The cost of 5X whole genome sequencing today is estimated to be about $40 million. Equine researchers now have access to a level of DNA sequence that was undreamed of even one year ago. The door has been opened to highly sophisticated, gene-based studies of the horse, and already this has changed the way equine research is conducted.

Current research: Over the past winter my laboratory prepared to test the Texas A&M expression array gene chip, which we expected to arrive by March. When it became clear in April that the delays in production would continue, we decided to commit the current year's Zweig Fund support to studies of embryo and placenta gene expression, using the new whole genome sequence database that is available in the public domain via the internet. The pilot experiment we conducted over the past summer illustrates the power and potential of the whole genome sequence. We designed projects for three veterinary students who were awarded fellowships from the Havemeyer and Cornell Leadership Programs to conduct summer research in my laboratory. Each student was assigned between 10 and 15 genes and given the task of searching for the horse sequences in the whole genome database, designing polymerase chain reaction primers for these genes, and testing for the expression of these genes in embryonic and fetal tissues from various stages of development between days 15 and 35 of gestation. The table included in the Appendix shows the list of 43 genes, all of which have been shown to be important for fetal or placental development in other species. Progress was made on characterizing 39 of the 43 genes, and for some genes the data obtained are nearly ready for publication. This is extraordinary progress by a small group of talented students who had the benefit of only minimal training and instruction. By way of comparison, last winter, before the new sequence became available, a competent scientist might easily have spent three months obtaining the sequence of a single horse gene, using the previously available equine genomic resources.
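To give a sense of what the primer design step involves, the Python sketch below runs two routine checks on a candidate primer: its GC fraction and a rough melting temperature from the Wallace rule of thumb (2 °C for each A or T, 4 °C for each G or C, a common approximation for short oligonucleotides). This is an illustration only and is not the procedure used by the students. The primer sequence is invented for the example and the function names are hypothetical.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Reverse complement, used to derive a reverse primer from the forward strand."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    # Invented sequence for illustration; not a real horse gene primer.
    candidate = "AGCTGACCTGAAGGCTCATG"
    print(f"Length: {len(candidate)} bases")
    print(f"GC fraction: {gc_fraction(candidate):.2f}")
    print(f"Approximate Tm (Wallace rule): {wallace_tm(candidate)} C")
    print(f"Reverse complement: {reverse_complement(candidate)}")
```

In practice, dedicated primer design software applies more accurate thermodynamic models and also checks for primer self-complementarity and specificity against the genome sequence.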

Plans for next year: In 2007 we plan to complete the characterization of a selected group of the genes identified over the past summer. We will concentrate our efforts on genes that determine the development of different cell types in the placenta, with an emphasis on the chorionic girdle and endometrial cups. These tissues have been the target of many of our studies over the past 20 years.

In another important development, the Morris Animal Foundation has announced its intent to fund a five-year, $2.5 million Equine Consortium Grant to the Horse Genome Project participants. The equine genome scientists submitted a proposal to this new program last February, and it was selected for funding in July. Details of the implementation of the grant are not yet complete, but it is certain that the program will support the development, testing, and application of two genomic tools that will be based on the equine whole genome sequence. A new expression microarray gene chip containing virtually all of the 20,000+ genes that make up the horse genome is planned, as is a new Single Nucleotide Polymorphism (SNP) chip for assessing genetic variation between horses. It is likely that the expression array gene chip and the SNP chip will be produced by commercial biotechnology firms experienced in the construction of these nano-scale devices. This outsourcing should ensure that the array and SNP chip are produced on schedule and with good quality control. Some funds from this new grant will come to my laboratory at Cornell, but in the first year or two most of the support will go towards the production of the genomic tools.

Publications: During the past year I have authored or co-authored four publications that used genomic data generated with Zweig funds. The papers are listed on the Appendix page.

Closure: It is a pleasure to note that the horse chosen as the DNA donor for the equine whole genome sequence is Twilight, a Thoroughbred mare raised at Cornell and a member of the experimental herd of Major Histocompatibility Complex homozygous horses that have been specifically bred for our research program in equine immunology, genetics, and reproduction over the past 25 years. This honor will bring special recognition to the equine research program at Cornell and to the Zweig Fund, which has provided so much important support for the Horse Genome Project since its inception in 1995.