Notes regarding hindex, a program for calculating a hybrid index for
individuals of unknown ancestry. For additional information see
http://www.uwyo.edu/buerkle/hindex.

_______________________________________
INSTALLATION and USAGE

Mac OS X:

Decompress the downloaded file with StuffIt (this may be taken care of
automatically by your browser). Move the program 'hindex' to a
convenient location. The /Applications folder would be a sensible
location, but it can be placed anywhere that you can find it.
Double-click on the icon to launch the program. Retain the folder that
contains the README.txt, license information and sample files
elsewhere.

Linux (x86):

Decompress the downloaded file (e.g. tar xvzf hindex.1.3_i386.tgz).
Move the program 'hindex' to a convenient location in your $PATH. At a
command prompt in a terminal window, type hindex to launch the program.
A graphical user interface will be launched. Retain the folder that
contains the README.txt, license information and sample files
elsewhere.

_______________________________________
Data input

Example input files are included with the distribution. Details are
given here.

It is probably easiest to create input files in a spreadsheet program
that can save text files. Simple ASCII text files are required (not
rtf, doc, xls, etc). Text files can have Windows, UNIX, or MacOS
line-endings.

Rows in the files correspond to loci. In the file containing data for
individuals, columns correspond to individuals. In the file containing
data for parental species, columns correspond to alleles or genotypes
(depending on the marker type).

In input files, everything preceding a # symbol on a line is ignored
and thus this space can be used for locus designations. Use of the #
symbol to designate anything but the boundary between locus designation
and the data that follow will result in errors.

Individual data:

The first row in the input file must begin with the word 'Individual'.
The second row in the input file lists labels for individuals that will
be used in the output file.

Genotype designations:

Codominant loci: alleles are sequential integers starting at 1 and up
to the number of alleles for that locus. Integers for alleles refer to
the order in which frequencies are given in the parental file. For
readability, the two alleles at a locus are separated by a comma, and
individuals are separated by any number of spaces or tabs.

Dominant loci: 1 for band present and 0 for null allele.

Missing data should be encoded with NA for a dominant locus and NA,NA
for codominant loci (a single NA is an error at a codominant locus).

Alleles observed in individuals to be classified, but not present in
either parental taxon, should be entered as NA. Otherwise, an error
will be printed and hybrid indexes will not be calculated. This is
meant as a guard against entering nonsensical allele designations.

Parental data:

The first row in the input file must begin with the word 'Parental'. It
would be nonsensical to have missing data for parental allele
frequencies, so missing values are not expected in the input file.

Codominant loci: hindex expects input of parental frequencies for the
alleles in each of two parental populations (all alleles for parent 1,
followed by all alleles for parent 2). This is best illustrated by an
example, in this case a codominant locus (the leading C) with three
alleles:

    chr4 locus5 # C 0.9 0.1 0 0.1 0.8 0.1

The six numbers after the locus type designation (D or C) are the
frequencies for the three alleles in the two parental populations
(3 alleles x 2 populations). Some sanity checking is done on the input
frequencies, but this is no substitute for making sure that the
frequencies are correctly entered.

Dominant loci: hindex expects input of the frequencies of individuals
with a band at a dominant locus (sometimes referred to as possessing a
dominant marker).
The designation for dominant loci (D) is followed by the frequency in
each of the parental populations. For example:

    chr4 locus5 # D 0.9 0.1

Additional notes:

Data for loci need to be given in the same order in both data files,
and all loci need to be listed in both files. No attempt is made to
match labels for loci in the two files. In fact, labels for loci are
ignored intentionally, since small typographical errors could lead to
mismatches. Keeping the loci in order and complete is the job of the
user.

If a hybrid index cannot be calculated for an individual because of
missing or uninformative data, "nan" (not a number) will be printed in
the output.

License and contact info

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program (it is available via a menu option and as a text file
with the source code); if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

C. Alex Buerkle
Department of Botany, 3165
1000 E. University Ave.
University of Wyoming
Laramie, WY 82071
http://www.uwyo.edu/buerkle
buerkle@uwyo.edu
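
Supplementary example (not part of hindex):

Because hindex does only limited sanity checking on parental
frequencies, it may help to check a parental file before loading it.
The following is a minimal Python sketch, not the check that hindex
itself performs. The function name parse_parental_line, the tolerance
value, the rule that each parent's codominant frequencies sum to 1, and
the treatment of a line without a # label are all assumptions made for
illustration. Only the line layout (label, #, locus type, frequencies)
is taken from the format described under "Parental data".

```python
def parse_parental_line(line, tol=1e-6):
    """Parse one locus line of a parental file.

    Returns (locus_type, parent1_freqs, parent2_freqs).
    Hypothetical helper; not part of hindex.
    """
    # Text before the '#' is a free-form locus label and is ignored.
    _, sep, data = line.partition("#")
    if not sep:
        # Assumption: a line without '#' carries no label, only data.
        data = line
    tokens = data.split()
    if not tokens:
        raise ValueError("no data found on line")

    locus_type = tokens[0]
    # float() raises ValueError for non-numeric entries such as NA.
    freqs = [float(t) for t in tokens[1:]]
    if any(not (0.0 <= f <= 1.0) for f in freqs):
        raise ValueError("frequencies must lie between 0 and 1")

    if locus_type == "D":
        # Dominant locus: one band frequency per parental population.
        if len(freqs) != 2:
            raise ValueError("a D locus needs exactly two frequencies")
        return "D", freqs[:1], freqs[1:]

    if locus_type == "C":
        # Codominant locus: all alleles for parent 1, then parent 2.
        if len(freqs) < 2 or len(freqs) % 2 != 0:
            raise ValueError("a C locus needs the same alleles for both parents")
        half = len(freqs) // 2
        parent1, parent2 = freqs[:half], freqs[half:]
        # Assumed check: allele frequencies within a parent sum to 1.
        for name, parent in (("parent 1", parent1), ("parent 2", parent2)):
            if abs(sum(parent) - 1.0) > tol:
                raise ValueError("frequencies for %s do not sum to 1" % name)
        return "C", parent1, parent2

    raise ValueError("locus type must be C or D, got %r" % locus_type)


if __name__ == "__main__":
    # The two example lines from this README.
    print(parse_parental_line("chr4 locus5 # C 0.9 0.1 0 0.1 0.8 0.1"))
    print(parse_parental_line("chr4 locus5 # D 0.9 0.1"))
```

Running the function over every line after the 'Parental' header, and
counting the lines, also gives a quick way to confirm that the parental
file lists the same number of loci as the file of individuals.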