Finally, some progress. I finished the marker work yesterday, but it took most of today to get a working example out of my code. I believe I now have one, and a base to continue working from. Thanks, Randy, for your pointers on simulated data generation in R.
I set up a new trait in Dragons, "Combonism", which is strongly expressed in dragons with HH on chromosome 1 and WW on chromosome 2. The trait diminishes linearly with the number of recessive alleles in either gene, so that HhWw expresses it moderately and hhww does not express it at all.
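As a rough illustration of that linear rule, here is a minimal Python sketch. It is not the Dragons/Biologica code: the function name `combonism_score` and the 0-to-1 scale are assumptions made for this example only.

```python
def combonism_score(gene1: str, gene2: str) -> float:
    """Return trait expression on an assumed 0..1 scale.

    gene1 is the chromosome 1 genotype (e.g. "Hh"), gene2 the
    chromosome 2 genotype (e.g. "Ww"). Uppercase letters are
    dominant alleles, lowercase letters are recessive alleles.
    """
    alleles = gene1 + gene2
    dominant = sum(1 for allele in alleles if allele.isupper())
    # Each recessive allele removes an equal share of the trait.
    return dominant / len(alleles)


full = combonism_score("HH", "WW")
middle = combonism_score("Hh", "Ww")
absent = combonism_score("hh", "ww")
```

Under these assumptions, HHWW scores highest, HhWw lands halfway, and hhww scores zero.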
Then I created two dragons, one HHWW and one hhww (with the rest of the genes randomly homozygous), and made sure that every marker differed between the two. From them I bred a female and a male child (F1), and from that pair I bred 100 dragons (F2). From these F2 dragons I extracted the phenotype value and marker values to create a dataset that could be loaded into R/qtl.
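For reference, a sketch of what such a dataset could look like in R/qtl's comma-delimited layout: a header row with the phenotype name and marker names, a second row with the chromosome for each marker (blank under the phenotype), then one row per individual. The marker names, the phenotype values, and the helper `write_rqtl_csv` are made up for illustration; the A/H/B codes stand for homozygous from one founder, heterozygous, and homozygous from the other founder.

```python
import csv
import io


def write_rqtl_csv(out, phenotype_name, markers, rows):
    """Write an F2 dataset in a comma-delimited layout for R/qtl.

    markers: list of (marker_name, chromosome) pairs (hypothetical names).
    rows:    list of (phenotype_value, [genotype codes]) per individual.
    """
    writer = csv.writer(out, lineterminator="\n")
    # Row 1: phenotype column name, then marker names.
    writer.writerow([phenotype_name] + [name for name, _ in markers])
    # Row 2: blank under the phenotype, then each marker's chromosome.
    writer.writerow([""] + [chrom for _, chrom in markers])
    # Remaining rows: one individual each.
    for phenotype, codes in rows:
        writer.writerow([phenotype] + list(codes))


# Illustrative values only, not real output from the simulation.
markers = [("m1a", "1"), ("m1b", "1"), ("m2a", "2")]
rows = [(1.0, ["A", "A", "A"]), (0.5, ["H", "H", "H"]), (0.0, ["B", "B", "B"])]

buf = io.StringIO()
write_rqtl_csv(buf, "combonism", markers, rows)
csv_text = buf.getvalue()
print(csv_text)
```

A file in this shape should be readable from R with something like `read.cross("csv", file="dragons.csv", genotypes=c("A","H","B"))`, where the file name is a placeholder.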
I then used R/qtl to analyze the dataset, and mostly got what I expected (I have some questions on that, though).
Questions (forgive my rusty/rough understanding of biology/genetics):
- Will an inbred strain (generally? always?) be homozygous across all its genes?
- Markers have no direct correlation with genes, correct? It is only through a marker's proximity to a gene that one can track which genes end up in an offspring.
- So one strain may have a 'G' SNP and a dominant allele where another strain might have a 'C' and a dominant allele?
- In an inbred strain, will it be the case that both copies of a chromosome will have the same marker?
- In a world where crossing over doesn't happen, QTL analysis would only be able to pinpoint the chromosomes involved, and could not narrow things down to the actual markers involved, correct? I suspect that Biologica isn't simulating crossovers the way I thought it was, and that this is why I'm not getting the exact markers I expected.
- When a crossover happens, is it usually only at one point in the chromosome? Will it ever cross at more than one point?
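To make the crossover questions concrete, here is a small Python sketch. It is not Biologica's code, and the fixed number of crossovers per gamete is a simplifying assumption (real meiosis varies, and can produce more than one crossover on a chromosome). It builds F2 genotypes for one chromosome from F1 parents that carry one all-"A" and one all-"B" haplotype, then checks whether every marker on the chromosome carries the same code within each individual.

```python
import random


def make_gamete(hap_a, hap_b, n_crossovers, rng):
    """Build one gamete from two parental haplotypes (one allele per marker)."""
    n = len(hap_a)
    # Pick distinct breakpoints between adjacent markers.
    breakpoints = set(rng.sample(range(1, n), n_crossovers))
    # Start copying from a randomly chosen parental haplotype.
    current, other = (hap_a, hap_b) if rng.random() < 0.5 else (hap_b, hap_a)
    gamete = []
    for i in range(n):
        if i in breakpoints:
            current, other = other, current  # switch strands at a crossover
        gamete.append(current[i])
    return gamete


def f2_genotype(n_markers, n_crossovers, rng):
    """Genotype codes (A, H, B) along one chromosome of an F2 individual."""
    hap_a, hap_b = ["A"] * n_markers, ["B"] * n_markers
    g1 = make_gamete(hap_a, hap_b, n_crossovers, rng)
    g2 = make_gamete(hap_a, hap_b, n_crossovers, rng)
    return [x if x == y else "H" for x, y in zip(g1, g2)]


rng = random.Random(0)
no_xo = [f2_genotype(10, 0, rng) for _ in range(100)]
with_xo = [f2_genotype(10, 2, rng) for _ in range(100)]

# True when every marker on the chromosome has the same code in every dragon.
uniform_no_xo = all(len(set(g)) == 1 for g in no_xo)
uniform_with_xo = all(len(set(g)) == 1 for g in with_xo)
```

Without crossovers, all markers on a chromosome travel together, so they look identical to the analysis and only the chromosome can be singled out. With crossovers, markers on the same chromosome start to differ between individuals, which is what lets a QTL be narrowed down to a region.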
My next step will be to tie this together with the work I've done interfacing with R through JRI, so that the QTL analysis runs automatically when the F2 generation is generated.
I'm not sure what format you want this in. We have not yet named the genes (we are going to use names from the mouse database so that we can link to the MGI database for searches).
We were holding off on naming the genes until we had a data structure from you, but if this structure looks OK, let me know and we will start to populate the gene list with real gene names. The G(number) names are just placeholders.