Title: | Construction of Genetic Maps in Experimental Crosses |
---|---|
Description: | Analysis of molecular marker data from model (backcrosses, F2 and recombinant inbred lines) and non-model systems (i.e., outcrossing species). For the latter, it allows statistical analysis by simultaneously estimating linkage and linkage phases (genetic map construction) according to Wu et al. (2002) <doi:10.1006/tpbi.2002.1577>. All analyses are based on multipoint approaches using hidden Markov models. |
Authors: | Gabriel Margarido [aut], Marcelo Mollinari [aut], Cristiane Taniguti [ctb, cre], Getulio Ferreira [ctb], Rodrigo Amadeu [ctb], Jeekin Lau [ctb], Karl Broman [ctb], Katharine Preedy [ctb, cph] (MDS ordering algorithm), Bastian Schiffthaler [ctb, cph] (HMM parallelization), Augusto Garcia [aut, ctb] |
Maintainer: | Cristiane Taniguti <[email protected]> |
License: | GPL-3 |
Version: | 3.2.0 |
Built: | 2025-01-11 09:36:24 UTC |
Source: | https://github.com/cristianetaniguti/onemap |
Perform gaussian sum
acum(w)
w |
vector of numbers |
Creates a new sequence by adding markers from a predetermined one. The markers are added at the end of the sequence.
add_marker(input.seq, mrks)
input.seq |
an object of class |
mrks |
a vector containing the markers to be added from the
|
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Marcelo Mollinari, [email protected]
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt, "all")
groups <- group(all_mark)
(LG1 <- make_seq(groups, 1))
(LG.aug <- add_marker(LG1, c(4, 7)))
Adds back the redundant markers removed by the create_data_bins function.
add_redundants(sequence, onemap.obj, bins)
sequence |
object of class |
onemap.obj |
object of class |
bins |
object of class |
New sequence object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
object of class |
twopt |
object of class |
Cristiane Taniguti, [email protected]
Shows the alpha value to be used in each chi-square segregation test in order to achieve a given global type I error. To do so, it uses Bonferroni's criterion.
Bonferroni_alpha(x, global.alpha = 0.05)
x |
an object of class onemap_segreg_test |
global.alpha |
the global alpha that |
the alpha value for each test (numeric)
data(onemap_example_bc) # Loads a fake backcross dataset installed with onemap
Chi <- test_segregation(onemap_example_bc) # Performs the chi-square test for all markers
print(Chi) # Shows the results of the chi-square tests
Bonferroni_alpha(Chi) # Shows the individual alpha level to be used
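The Bonferroni criterion itself is simple arithmetic: the global type I error is divided evenly among the tests. The following is a minimal base R sketch of that rule; `bonferroni_alpha_sketch` and the marker count are hypothetical names and values used only for illustration, not part of the package.

```r
# Bonferroni criterion: split the global type I error evenly across all tests.
# Hypothetical helper for illustration; not the package function.
bonferroni_alpha_sketch <- function(n_tests, global_alpha = 0.05) {
  stopifnot(n_tests >= 1, global_alpha > 0, global_alpha < 1)
  global_alpha / n_tests
}

# Hypothetical case: 67 markers, one chi-square segregation test per marker.
alpha_per_test <- bonferroni_alpha_sketch(67)
print(alpha_per_test)
```

With more markers the per-test alpha shrinks proportionally, which is why the criterion becomes conservative for large marker sets.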
Based on MAPpoly check_data_sanity function by Marcelo Mollinari
check_data(x)
x |
an object of class |
If consistent, returns 0. If not consistent, returns a vector with a number of tests, where TRUE indicates a failed test.
Cristiane Taniguti, [email protected]
data(onemap_example_bc)
check_data(onemap_example_bc)
Based on MAPpoly check_data_sanity function by Marcelo Mollinari
check_twopts(x)
x |
an object of class |
If consistent, returns 0. If not consistent, returns a vector with a number of tests, where TRUE indicates a failed test.
Cristiane Taniguti, [email protected]
data(onemap_example_bc)
twopts <- rf_2pts(onemap_example_bc)
check_twopts(twopts)
Merges two or more OneMap datasets from the same cross type. Creates an object of class onemap.
combine_onemap(...)
... |
Two or more |
Given a set of OneMap datasets, all from the same cross type (full-sib, backcross, F2 intercross or recombinant inbred lines obtained by self- or sib-mating), merges marker and phenotype information to create a single onemap object.
If sample IDs are present in all datasets (the standard new format), not all individuals need to be genotyped in all datasets - the merged dataset will contain all available information, with missing data elsewhere. If sample IDs are missing in at least one dataset, it is required that all datasets have the same number of individuals, and it is assumed that they are arranged in the same order in every dataset.
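The ID-based merge described above can be pictured with plain matrices. The sketch below is a base R illustration only, assuming genotype matrices with individuals in rows and markers in columns, and using NA for missing data (an assumption of this example); `merge_by_id` and the toy data are hypothetical and do not reproduce the package internals.

```r
# Two hypothetical genotype matrices (rows = individuals, columns = markers).
geno_a <- matrix(c(1, 2, 1, 2, 2, 1), nrow = 3,
                 dimnames = list(c("ind1", "ind2", "ind3"), c("M1", "M2")))
geno_b <- matrix(c(2, 1, 1, 1), nrow = 2,
                 dimnames = list(c("ind2", "ind4"), c("M3", "M4")))

# Merge by sample ID: keep every individual seen in either dataset and
# fill genotypes that were not observed with NA.
merge_by_id <- function(a, b) {
  ids <- union(rownames(a), rownames(b))
  out <- matrix(NA_real_, nrow = length(ids), ncol = ncol(a) + ncol(b),
                dimnames = list(ids, c(colnames(a), colnames(b))))
  out[rownames(a), colnames(a)] <- a
  out[rownames(b), colnames(b)] <- b
  out
}

merged <- merge_by_id(geno_a, geno_b)
print(merged)
```

Without sample IDs there is nothing to align on, which is why the datasets must then have the same number of individuals in the same order.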
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
a string indicating that this is a combined dataset. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
Gabriel R A Margarido, [email protected]
Lincoln, S. E., Daly, M. J. and Lander, E. S. (1993) Constructing genetic linkage maps with MAPMAKER/EXP Version 3.0: a tutorial and reference manual. A Whitehead Institute for Biomedical Research Technical Report.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
read_onemap and read_mapmaker.
data("onemap_example_out")
data("vcf_example_out")
combined_data <- combine_onemap(onemap_example_out, vcf_example_out)
For a given sequence with n markers, computes the multipoint likelihood of all n!/2 possible orders.
compare(input.seq, n.best = 50, tol = 0.001, verbose = FALSE)
input.seq |
an object of class |
n.best |
the number of best orders to store in object (defaults to 50). |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
verbose |
if |
Since the number of possible orders, n!/2, is large even for moderate values
of n, this function should be used only for sequences with
relatively few markers. If markers were genotyped in an outcross population,
linkage phases need to be estimated and therefore more states need to be
visited in the Markov chain; when segregation types are D1, D2 and C,
computation can require a very long time (especially when markers linked in
repulsion are involved), so we recommend using this function with at most 6 or 7 markers.
For inbred-based populations, up to 10 or 11 markers can be ordered with this function,
since linkage phases are known.
The multipoint likelihood is calculated according to Wu et al. (2002b) (Eqs. 7a to 11), assuming that the recombination fraction is the same in both parents. Hidden Markov chain codes adapted from Broman et al. (2008) were used.
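To see why the marker limits matter, the sketch below tabulates how quickly the number of candidate orders grows. It assumes that an order and its reverse describe the same map, so n markers have n!/2 distinct orders; `n_orders` is a hypothetical helper written for this illustration.

```r
# Number of distinct marker orders when an order and its reverse are equivalent.
n_orders <- function(n) factorial(n) / 2

n_markers <- 5:11
growth <- data.frame(markers = n_markers, orders = n_orders(n_markers))
print(growth)
```

Each of these orders requires its own multipoint likelihood evaluation, and outcrossing data add the cost of estimating linkage phases on top of that.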
An object of class compare, which is a list containing the following components:
best.ord |
a |
best.ord.rf |
a |
best.ord.phase |
a |
best.ord.like |
a
|
best.ord.LOD |
a |
data.name |
name of the object of class |
twopt |
name of the object of class |
Marcelo Mollinari, [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments R package version 1.09-43
Jiang, C. and Zeng, Z.-B. (1997). Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101: 47-58.
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln, S. E. and Newburg, L. (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetics maps. Heredity 103: 494-502.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002a) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. (2002b) Linkage mapping of sex-specific differences. Genetical Research 79: 85-96.
marker_type for details about segregation types and make_seq.
#outcrossing example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
markers <- make_seq(twopt, c(12, 14, 15, 26, 28))
(markers.comp <- compare(markers))
(markers.comp <- compare(markers, verbose = TRUE))

#F2 example
data(onemap_example_f2)
twopt <- rf_2pts(onemap_example_f2)
markers <- make_seq(twopt, c(17, 26, 29, 30, 44, 46, 55))
(markers.comp <- compare(markers))
(markers.comp <- compare(markers, verbose = TRUE))
Creates a new dataset based on a onemap_bin object.
create_data_bins(input.obj, bins)
input.obj |
an object of class |
bins |
an object of class |
Given a onemap_bin object, creates a new data set where the redundant markers are collapsed into bins and represented by the marker with the lowest amount of missing data among those in the bin.
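The choice of a representative marker per bin can be sketched in base R as follows. This is an illustration under stated assumptions, not the package code: the genotype matrix, the `bin_id` assignment and the helper `pick_representatives` are hypothetical, and missing genotypes are coded as NA here.

```r
# Hypothetical genotype matrix: rows = individuals, columns = markers, NA = missing.
geno <- cbind(M1 = c(1, 2, NA, 1),
              M2 = c(1, 2, 2, 1),
              M3 = c(2, NA, NA, 1))

# Hypothetical bin assignment: M1 and M2 share a bin, M3 is alone.
bin_id <- c(M1 = 1, M2 = 1, M3 = 2)

pick_representatives <- function(geno, bin_id) {
  missing_rate <- colMeans(is.na(geno))
  # Within each bin, keep the marker with the smallest proportion of missing data.
  vapply(split(names(bin_id), bin_id),
         function(mk) mk[which.min(missing_rate[mk])],
         character(1))
}

reps <- pick_representatives(geno, bin_id)
collapsed <- geno[, reps, drop = FALSE]
print(colnames(collapsed))
```

Ties are resolved here by taking the first marker in the bin, which is a choice of this sketch only.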
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
error |
matrix containing HMM emission probabilities |
Marcelo Mollinari, [email protected]
data("onemap_example_f2")
(bins <- find_bins(onemap_example_f2, exact = FALSE))
onemap_bins <- create_data_bins(onemap_example_f2, bins)
An internal function that prepares a data frame suitable for drawing a graphic of raw data using ggplot2, i.e., a data frame in long format.
create_dataframe_for_plot_outcross(x)
x |
an object of classes |
a dataframe
Creates a database and a ggplot graphic of allele read depths.
create_depths_profile( onemap.obj = NULL, vcfR.object = NULL, vcf = NULL, parent1 = NULL, parent2 = NULL, vcf.par = "AD", recovering = FALSE, mks = NULL, inds = NULL, GTfrom = "onemap", alpha = 1, rds.file = "data.rds", y_lim = NULL, x_lim = NULL, verbose = TRUE )
onemap.obj |
an object of class |
vcfR.object |
object of class vcfR; |
vcf |
path to VCF file. |
parent1 |
a character specifying the first parent ID |
parent2 |
a character specifying the second parent ID |
vcf.par |
the VCF parameter that stores the allele depth information. |
recovering |
logical. If TRUE, all markers in the VCF are considered; if FALSE, only those in onemap.obj. |
mks |
a vector of characters specifying the marker names to be considered, or NULL to consider all markers |
inds |
a vector of characters specifying the individual names to be considered or NULL to consider all individuals |
GTfrom |
the graphic should contain the genotypes from onemap.obj or from the vcf? Specify using "onemap", "vcf" or "prob". |
alpha |
define the transparency of the dots in the graphic |
rds.file |
rds file name to store the data frame with values used to build the graphic |
y_lim |
set scale limit for y axis |
x_lim |
set scale limit for x axis |
verbose |
If |
an rds file and a ggplot graphic.
Cristiane Taniguti, [email protected]
Genotype probabilities can be calculated considering a global error (default method) or considering a genotype error probability for each genotype. Furthermore, the user can directly provide the genotype probability matrix.
create_probs( input.obj = NULL, global_error = NULL, genotypes_errors = NULL, genotypes_probs = NULL )
input.obj |
object of class onemap or onemap sequence |
global_error |
a numeric value specifying the global error |
genotypes_errors |
a matrix with dimensions (number of individuals) x (number of markers) with genotype error values |
genotypes_probs |
a matrix with dimensions (number of individuals)*(number of markers) x possible genotypes (i.e., a, ab, ba, b), with four columns for F2 and outcrossing populations and two for backcross and RIL populations. |
The genotype probability matrix has number of individuals x number of markers rows and four columns (or two if considering backcross or RILs populations), one for each possible genotype of the population. This format follows the one proposed by MAPpoly.
The genotype probabilities come from SNP calling methods. If you do not have them, you can use a global error or an error value for each genotype. OneMap versions up to 2.1 offer only the global error option.
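One way to picture how a single global error turns into a genotype probability matrix is sketched below. It assumes a common convention in which the called genotype receives probability 1 - error and the remaining probability is shared equally among the other genotypes; this convention, the helper `genotype_probs_from_error` and the toy calls are assumptions of this example and may differ from what the package does internally. Missing calls are not handled.

```r
# Hypothetical helper: one row per individual-by-marker observation,
# one column per possible genotype.
genotype_probs_from_error <- function(calls, n_genotypes, global_error) {
  # Start with the error mass spread equally over all genotypes ...
  probs <- matrix(global_error / (n_genotypes - 1),
                  nrow = length(calls), ncol = n_genotypes)
  # ... then give the called genotype the remaining probability.
  probs[cbind(seq_along(calls), calls)] <- 1 - global_error
  probs
}

# Three hypothetical calls, coded 1..4 (four genotypes, as in F2 or outcrossing data).
calls <- c(1, 3, 4)
probs <- genotype_probs_from_error(calls, n_genotypes = 4, global_error = 1e-5)
print(rowSums(probs))
```

For backcross or RIL data the same idea applies with `n_genotypes = 2`.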
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
error |
matrix containing HMM emission probabilities |
Cristiane Taniguti [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments R package version 1.09-43
data(onemap_example_out)
new.data <- create_probs(onemap_example_out, global_error = 10^-5)
Draws a simple genetic map.
draw_map( map.list, horizontal = FALSE, names = FALSE, grid = FALSE, cex.mrk = 1, cex.grp = 0.75 )
map.list |
a map, i.e. an object of class |
horizontal |
if |
names |
if |
grid |
if |
cex.mrk |
the magnification to be used for markers. |
cex.grp |
the magnification to be used for group axis annotation. |
A figure with the drawn genetic map.
Marcelo Mollinari, [email protected]
#outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
lg <- group(make_seq(twopt, "all"))
maps <- vector("list", lg$n.groups)
for (i in 1:lg$n.groups)
  maps[[i]] <- make_seq(order_seq(input.seq = make_seq(lg, i), twopt.alg = "rcd"), "force")
draw_map(maps, grid = TRUE)
draw_map(maps, grid = TRUE, horizontal = TRUE)
Draws a simple linkage map.
draw_map2( ..., tag = NULL, id = TRUE, pos = TRUE, cex.label = NULL, main = NULL, group.names = NULL, centered = FALSE, y.axis = TRUE, space = NULL, col.group = NULL, col.mark = NULL, col.tag = NULL, output = NULL, verbose = TRUE )
... |
map(s). Object(s) of class |
tag |
name(s) of the marker(s) to highlight. If "all", all markers will be highlighted. Default is |
id |
logical. If |
pos |
logical. If |
cex.label |
the magnification used for label(s) of tagged marker(s). If |
main |
an overall title for the plot. Default is |
group.names |
name(s) to identify the group(s). If |
centered |
logical. If |
y.axis |
logical. If |
space |
numerical. Spacing between groups. If |
col.group |
the color used for group(s). |
col.mark |
the color used for marker(s). |
col.tag |
the color used for highlighted marker(s) and its/their label(s). |
output |
the name of the output file. The file format can be specified by adding its extension. Available formats: 'bmp', 'jpeg', 'png', 'tiff', 'pdf' and 'eps' (default). |
verbose |
If |
A ggplot graphic with the drawn genetic map.
Getulio Caixeta Ferreira, [email protected]
data("onemap_example_out")
twopt <- rf_2pts(onemap_example_out)
lg <- group(make_seq(twopt, "all"))
seq1 <- make_seq(order_seq(input.seq = make_seq(lg, 1), twopt.alg = "rcd"), "force")
seq2 <- make_seq(order_seq(input.seq = make_seq(lg, 2), twopt.alg = "rcd"), "force")
seq3 <- make_seq(order_seq(input.seq = make_seq(lg, 3), twopt.alg = "rcd"), "force")
draw_map2(seq1, seq2, seq3, tag = c("M1", "M2", "M3", "M4", "M5"), output = paste0(tempfile(), ".png"))
Creates a new sequence by dropping markers from a predetermined one.
drop_marker(input.seq, mrks)
input.seq |
an object of class |
mrks |
a vector containing the markers to be removed
from the |
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Marcelo Mollinari, [email protected]
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt, "all")
groups <- group(all_mark)
(LG1 <- make_seq(groups, 1))
(LG.aug <- drop_marker(LG1, c(10, 14)))
Edits a sequence ordered by reference genome positions by comparing it with another given order.
edit_order_onemap(input.seq)
input.seq |
object of class sequence with alternative order (not genomic order) |
Cristiane Taniguti, [email protected]
Produces an empty object to avoid code breaks. Function for internal use.
empty_onemap_obj(vcf, P1, P2, cross)
vcf |
object of class vcfR |
P1 |
character with parent 1 ID |
P2 |
character with parent 2 ID |
cross |
type of cross. Must be one of: |
An empty object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
Cristiane Taniguti, [email protected]
Export genotype probabilities in MAPpoly format (input for QTLpoly)
export_mappoly_genoprob(input.map)
input.map |
object of class 'sequence' |
object of class 'mappoly.genoprob'
Export OneMap maps to be visualized in VIEWpoly
export_viewpoly(seqs.list)
seqs.list |
a list with 'sequence' objects |
object of class viewmap
Uses the vcfR package and a onemap object to generate a list of vectors with reference allele counts and total counts for each marker and genotype included in the onemap object (only available for biallelic sites).
extract_depth( vcfR.object = NULL, onemap.object = NULL, vcf.par = c("GQ", "AD", "DPR, PL", "GL"), parent1 = "P1", parent2 = "P2", f1 = "F1", recovering = FALSE )
vcfR.object |
object output from vcfR package |
onemap.object |
onemap object output from read_onemap, read_mapmaker or onemap_read_vcf function |
vcf.par |
VCF format field that contains the allele count information. The implemented fields are AD, DPR, GQ, PL and GL. AD and DPR return a list with allele depth information. GQ returns a matrix with the error probability for each genotype. PL returns a data.frame with genotype probabilities for every genotype. |
parent1 |
parent 1 identification in vcfR object |
parent2 |
parent 2 identification in vcfR object |
f1 |
if your cross type is f2, you must define the F1 individual |
recovering |
TRUE/FALSE, if TRUE evaluate all markers from vcf file, if FALSE evaluate only markers in onemap object |
list containing the following components:
palt |
a |
pref |
a |
psize |
a |
oalt |
a |
oref |
a |
osize |
a |
n.mks |
total number of markers. |
n.ind |
total number of individuals in progeny. |
inds |
progeny individuals identification. |
mks |
markers identification. |
onemap.object |
same onemap.object inputted |
Cristiane Taniguti, [email protected]
Filter markers based on 2pts distance
filter_2pts_gaps(input.seq, max.gap = 10)
input.seq |
object of class sequence with ordered markers |
max.gap |
maximum gap, measured in Kosambi centiMorgans, allowed between adjacent markers. Markers that present the defined distance to both adjacent neighbors will be removed. |
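The gap criterion can be illustrated with the Kosambi mapping function, d = 25 ln((1 + 2r) / (1 - 2r)) cM, applied to the recombination fractions between adjacent markers. The base R sketch below is an illustration only: the recombination fractions are hypothetical, and flagging only interior markers whose gaps on both sides exceed the threshold is an assumption of this example rather than a description of the package internals.

```r
# Kosambi mapping function: recombination fraction -> distance in centiMorgans.
kosambi_cm <- function(r) 25 * log((1 + 2 * r) / (1 - 2 * r))

# Hypothetical recombination fractions between adjacent markers
# in an ordered sequence of 5 markers.
rf_adjacent <- c(0.05, 0.20, 0.25, 0.02)
gaps <- kosambi_cm(rf_adjacent)
max_gap <- 10

# Interior marker i has a left gap gaps[i - 1] and a right gap gaps[i].
# Flag it when both gaps exceed max_gap; end markers are never flagged here.
isolated <- c(FALSE,
              gaps[-length(gaps)] > max_gap & gaps[-1] > max_gap,
              FALSE)
print(round(gaps, 2))
print(which(isolated))
```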
New sequence object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
object of class |
twopt |
object of class |
Cristiane Taniguti, [email protected]
Filters markers according to a missing data threshold.
filter_missing( onemap.obj = NULL, threshold = 0.25, by = "markers", verbose = TRUE )
onemap.obj |
an object of class |
threshold |
a numeric from 0 to 1 to define the threshold of missing data allowed |
by |
character defining if 'markers' or 'individuals' should be filtered |
verbose |
logical. If TRUE, outputs progress status information. |
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
error |
matrix containing HMM emission probabilities |
Cristiane Taniguti, [email protected]
data(onemap_example_out)
filt_obj <- filter_missing(onemap_example_out, threshold = 0.25)
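The idea behind the threshold can be sketched on a plain genotype matrix. The following base R example is an illustration only: it assumes missing genotypes are coded as NA and that markers or individuals with a missing proportion at or below the threshold are kept (whether the boundary is inclusive in the package is not stated here). `filter_missing_sketch` and the toy matrix are hypothetical.

```r
# Hypothetical helper: drop markers (columns) or individuals (rows)
# whose proportion of missing genotypes exceeds the threshold.
filter_missing_sketch <- function(geno, threshold = 0.25,
                                  by = c("markers", "individuals")) {
  by <- match.arg(by)
  if (by == "markers") {
    keep <- colMeans(is.na(geno)) <= threshold
    geno[, keep, drop = FALSE]
  } else {
    keep <- rowMeans(is.na(geno)) <= threshold
    geno[keep, , drop = FALSE]
  }
}

geno <- cbind(M1 = c(1, 2, 1, 2),
              M2 = c(NA, NA, 1, 2),
              M3 = c(1, NA, 2, 2))
filtered <- filter_missing_sketch(geno, threshold = 0.25)
print(colnames(filtered))
```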
Filters genotypes by genotype probability.
filter_prob(onemap.obj = NULL, threshold = 0.8, verbose = TRUE)
onemap.obj |
an object of class |
threshold |
a numeric from 0 to 1 to define the threshold for the probability of the called genotype (highest probability) |
verbose |
If |
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
error |
matrix containing HMM emission probabilities |
Cristiane Taniguti, [email protected]
data(onemap_example_out)
filt_obj <- filter_prob(onemap_example_out, threshold = 0.8)
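A probability-based filter of this kind can be sketched as follows: each called genotype is the one with the highest probability, and calls whose highest probability falls below the threshold are treated as missing. This is a base R illustration under those assumptions; `mask_low_confidence`, the NA coding and the toy probabilities are hypothetical and not taken from the package.

```r
# Hypothetical helper: set calls to NA when their best probability is below threshold.
mask_low_confidence <- function(calls, probs, threshold = 0.8) {
  best <- apply(probs, 1, max)
  calls[best < threshold] <- NA
  calls
}

# One row per individual-by-marker observation, one column per genotype.
probs <- rbind(c(0.97, 0.01, 0.01, 0.01),
               c(0.40, 0.35, 0.15, 0.10),
               c(0.05, 0.85, 0.05, 0.05))
calls <- max.col(probs, ties.method = "first")  # index of the most probable genotype
masked <- mask_low_confidence(calls, probs, threshold = 0.8)
print(masked)
```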
Function to allocate markers with redundant information into bins. Within each bin, the pairwise recombination fraction between markers is zero.
find_bins(input.obj, exact = TRUE)
input.obj |
an object of class |
exact |
logical. If |
An object of class onemap_bin, which is a list containing the following components:
bins |
a list containing the bins. Each element of the list is a table whose lines indicate the name of the marker, the bin in which that particular marker was allocated and the percentage of missing data. The name of each element of the list corresponds to the marker with the lowest amount of missing data among those in the bin |
n.mar |
total number of markers. |
n.ind |
number of individuals. |
exact.search |
logical; indicates if
the search was performed with the argument |
Marcelo Mollinari, [email protected]
data("vcf_example_out")
(bins <- find_bins(vcf_example_out, exact = FALSE))
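Markers carry redundant information when their genotype vectors are identical across individuals, which is what makes their pairwise recombination fraction zero. The base R sketch below groups such markers on a toy matrix with complete data; how missing genotypes are treated is deliberately left out, and the data and variable names are hypothetical.

```r
# Hypothetical genotype matrix without missing data
# (rows = individuals, columns = markers).
geno <- cbind(M1 = c(1, 2, 1, 2),
              M2 = c(1, 2, 1, 2),
              M3 = c(2, 2, 1, 1),
              M4 = c(1, 2, 1, 2))

# Build one key per marker from its genotype vector; identical keys share a bin.
keys <- apply(geno, 2, paste, collapse = "-")
bins <- split(colnames(geno), factor(keys, levels = unique(keys)))
names(bins) <- NULL
print(bins)
```

Here M1, M2 and M4 fall into one bin and M3 forms a bin of its own.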
Divides the sequence into batches of a user-defined size.
generate_overlapping_batches(input.seq, size = 50, overlap = 15)
input.seq |
an object of class |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
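The general idea of overlapping batches can be sketched with marker positions alone. The example below uses a fixed batch size, whereas the `size` argument above is described as a center around which an optimum is searched, so this is a simplification and not the package algorithm; `overlapping_batches` is a hypothetical helper.

```r
# Hypothetical helper: split positions 1..n into windows of `size` markers,
# with consecutive windows sharing `overlap` markers.
overlapping_batches <- function(n, size = 50, overlap = 15) {
  stopifnot(size > overlap, overlap >= 0)
  step <- size - overlap
  starts <- seq(1, max(1, n - overlap), by = step)
  lapply(starts, function(s) s:min(s + size - 1, n))
}

batches <- overlapping_batches(120, size = 50, overlap = 15)
print(vapply(batches, function(b) paste(range(b), collapse = "-"), character(1)))
```

The shared markers let information from one batch carry over to the next when batches are processed in sequence.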
Identifies linkage groups of markers, using results from two-point (pairwise) analysis and the transitive property of linkage.
group(input.seq, LOD = NULL, max.rf = NULL, verbose = TRUE)
input.seq |
an object of class |
LOD |
a (positive) real number used as minimum LOD score (threshold) to declare linkage. |
max.rf |
a real number (usually smaller than 0.5) used as maximum recombination fraction to declare linkage. |
verbose |
logical. If |
If the arguments specifying thresholds used to group markers, i.e., minimum LOD Score and maximum recombination fraction, are NULL (default), the values used are those contained in object input.seq. If not using NULL, the new values override the ones in object input.seq.
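The transitive property of linkage means that if marker A is linked to B and B is linked to C, then A, B and C belong to the same group even when A and C fail the thresholds directly. The base R sketch below illustrates this with a breadth-first search over a pairwise "linked" relation. The threshold comparisons, the toy matrices and `group_sketch` are assumptions of this example; unlike the package, it gives every unlinked marker its own group.

```r
# Hypothetical helper: group markers by transitive linkage.
group_sketch <- function(lod, rf, min_lod = 3, max_rf = 0.5) {
  n <- nrow(lod)
  linked <- lod >= min_lod & rf <= max_rf
  diag(linked) <- FALSE
  groups <- integer(n)          # 0 means "not assigned yet"
  current <- 0L
  for (start in seq_len(n)) {
    if (groups[start] != 0L) next
    current <- current + 1L
    groups[start] <- current
    queue <- start
    while (length(queue) > 0) {
      m <- queue[1]
      queue <- queue[-1]
      # Unassigned markers linked to m join the current group.
      nb <- which(linked[m, ] & groups == 0L)
      groups[nb] <- current
      queue <- c(queue, nb)
    }
  }
  groups
}

# Toy two-point results for 5 markers: 1-2, 2-3 and 4-5 are linked.
lod <- matrix(0, 5, 5)
rf <- matrix(0.5, 5, 5)
pairs <- rbind(c(1, 2), c(2, 3), c(4, 5))
for (k in seq_len(nrow(pairs))) {
  i <- pairs[k, 1]
  j <- pairs[k, 2]
  lod[i, j] <- lod[j, i] <- 5
  rf[i, j] <- rf[j, i] <- 0.1
}
groups <- group_sketch(lod, rf, min_lod = 3, max_rf = 0.4)
print(groups)
```

Markers 1 and 3 end up together through marker 2, although their own pair does not pass the thresholds.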
Returns an object of class group, which is a list containing the following components:
data.name |
name of
the object of class |
twopt |
name of the object of class |
marnames |
marker names, according to the input file. |
n.mar |
total number of markers. |
LOD |
minimum LOD Score to declare linkage. |
max.rf |
maximum recombination fraction to declare linkage. |
n.groups |
number of linkage groups found. |
groups |
number of the linkage group to which each marker is assigned. |
Gabriel R A Margarido, [email protected] and Marcelo Mollinari, [email protected]
Lincoln, S. E., Daly, M. J. and Lander, E. S. (1993) Constructing genetic linkage maps with MAPMAKER/EXP Version 3.0: a tutorial and reference manual. A Whitehead Institute for Biomedical Research Technical Report.
data(onemap_example_out)
twopts <- rf_2pts(onemap_example_out)
all.data <- make_seq(twopts, "all")
link_gr <- group(all.data)
link_gr
print(link_gr, details = FALSE) #omit the names of the markers
Identifies linkage groups of markers by combining input sequence objects with unlinked markers from an rf_2pts object. The results from two-point (pairwise) analysis and the transitive property of linkage are used for grouping, as in the group function.
group_seq( input.2pts, seqs = "CHROM", unlink.mks = "all", repeated = FALSE, LOD = NULL, max.rf = NULL, min_mks = NULL )
input.2pts |
an object of class |
seqs |
a list of objects of class |
unlink.mks |
a object of class |
repeated |
logical. If |
LOD |
a (positive) real number used as minimum LOD score (threshold) to declare linkage. |
max.rf |
a real number (usually smaller than 0.5) used as maximum recombination fraction to declare linkage. |
min_mks |
integer defining the minimum number of markers that a provided sequence (seqs or CHROM) should have to be considered a group. |
If the arguments specifying thresholds used to group markers, i.e., minimum LOD Score and maximum recombination fraction, are NULL (default), the values used are those contained in object input.2pts. If not using NULL, the new values override the ones in object input.2pts.
Returns an object of class group_seq, which is a list containing the following components:
data.name |
name of
the object of class |
twopt |
name of the object of class |
mk.names |
marker names, according to the input file. |
input.seqs |
list with the numbers of the markers in each inputted sequence |
input.unlink.mks |
numbers of the unlinked markers in inputted sequence |
out.seqs |
list with the numbers of the markers in each outputted sequence |
n.unlinked |
number of markers that remained unlinked |
n.repeated |
number of markers repeated in more than one group |
n.mar |
total number of markers evaluated |
LOD |
minimum LOD Score to declare linkage. |
max.rf |
maximum recombination fraction to declare linkage. |
sequences |
list of outputted sequences |
repeated |
list with the number of the markers that are repeated in each outputted sequence |
unlinked |
number of the markers which remained unlinked |
Cristiane Taniguti
data(onemap_example_out) # load OneMap's fake dataset for an outcrossing population
data(vcf_example_out) # load OneMap's fake dataset from a VCF file for an outcrossing population
comb_example <- combine_onemap(onemap_example_out, vcf_example_out) # Combine datasets
twopts <- rf_2pts(comb_example)

out_CHROM <- group_seq(twopts, seqs="CHROM", repeated=FALSE)
out_CHROM

seq1 <- make_seq(twopts, c(1,2,3,4,5,25,26))
seq2 <- make_seq(twopts, c(8,18))
seq3 <- make_seq(twopts, c(4,16,20,21,24,29))

out_seqs <- group_seq(twopts, seqs=list(seq1,seq2,seq3))
out_seqs
Identifies linkage groups of markers using the results of the two-point (pairwise) analysis and the UPGMA method. Function adapted from the MAPpoly package, written by Marcelo Mollinari.
group_upgma(input.seq, expected.groups = NULL, inter = TRUE, comp.mat = FALSE)
input.seq |
an object of class |
expected.groups |
the number of expected linkage groups (i.e., chromosomes) for the species, when available |
inter |
if |
comp.mat |
if |
Returns an object of class group, which is a list containing the following components:
data.name |
the referred dataset name |
hc.snp |
a list containing information related to the UPGMA grouping method |
expected.groups |
the number of expected linkage groups |
groups.snp |
the groups to which each of the markers belong |
seq.vs.grouped.snp |
comparison between the genomic group information
(when available) and the groups provided by |
LOD |
minimum LOD Score to declare linkage. |
max.rf |
maximum recombination fraction to declare linkage. |
twopt |
name of the object of class |
Marcelo Mollinari
Cristiane Taniguti
Mollinari, M., and Garcia, A. A. F. (2019) Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models. G3: Genes, Genomes, Genetics. doi:10.1534/g3.119.400378
data("vcf_example_out")
twopts <- rf_2pts(vcf_example_out)
input.seq <- make_seq(twopts, "all")
lgs <- group_upgma(input.seq, expected.groups = 3, comp.mat=TRUE, inter = FALSE)
plot(lgs)
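UPGMA corresponds to average-linkage hierarchical clustering, which base R provides through `hclust(method = "average")`. The sketch below only illustrates the idea on a made-up recombination fraction matrix used as a distance; it does not reproduce the internals of `group_upgma`, and the variable names are illustrative.

```r
# Toy recombination fraction matrix for six markers (made-up values):
# M1-M3 are tightly linked, M4-M6 are tightly linked,
# and the two blocks are unlinked (rf = 0.5).
mk <- paste0("M", 1:6)
rf <- matrix(0.5, 6, 6, dimnames = list(mk, mk))
rf[1:3, 1:3] <- 0.1
rf[4:6, 4:6] <- 0.1
diag(rf) <- 0

# UPGMA = hierarchical clustering with average linkage
hc <- hclust(as.dist(rf), method = "average")

# Cut the dendrogram into the expected number of linkage groups
expected.groups <- 2
lg <- cutree(hc, k = expected.groups)
table(lg)
```

Cutting with `k` mirrors the role of `expected.groups`: the tree is the same, and only the number of groups requested changes.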
Apply Haldane mapping function
haldane(rcmb)
rcmb |
vector of recombination fraction values |
vector with centimorgan values
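For reference, Haldane's map function converts a recombination fraction r into a distance in centimorgans as d = -50 ln(1 - 2r), assuming no crossover interference. A minimal base R sketch follows; `haldane_cm` and `haldane_inv` are illustrative names, not OneMap functions.

```r
# Haldane's map function: recombination fraction -> distance in cM
haldane_cm <- function(r) -50 * log(1 - 2 * r)

# Inverse: distance in cM -> recombination fraction
haldane_inv <- function(d) 0.5 * (1 - exp(-2 * d / 100))

r <- c(0.01, 0.10, 0.25)
d <- haldane_cm(r)
round(d, 2)   # 1.01 11.16 34.66
```

For small r the distance in cM is close to 100 * r; the two diverge as r approaches 0.5, where the distance tends to infinity.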
Keeps only the markers present in the sequences in the onemap and twopts objects
keep_only_selected_mks(list.sequences = NULL)
list.sequences |
a list of objects of class 'sequence' |
a list of objects of class 'sequence' whose internal onemap and twopts objects are reduced to the selected markers
Cristiane Taniguti
Apply Kosambi mapping function
kosambi(rcmb)
rcmb |
vector of recombination fraction values |
vector with centimorgan values
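For reference, Kosambi's map function is d = 25 ln((1 + 2r) / (1 - 2r)) in centimorgans, which allows for some crossover interference. A minimal base R sketch follows; `kosambi_cm` and `kosambi_inv` are illustrative names, not OneMap functions, and the recombination fractions are made up.

```r
# Kosambi's map function: recombination fraction -> distance in cM
kosambi_cm  <- function(r) 25 * log((1 + 2 * r) / (1 - 2 * r))

# Inverse: distance in cM -> recombination fraction
kosambi_inv <- function(d) 0.5 * tanh(d / 50)

rf  <- c(0.05, 0.10, 0.20)   # made-up rf between adjacent markers
d   <- kosambi_cm(rf)
pos <- c(0, cumsum(d))       # cumulative map positions in cM
round(pos, 2)
```

Accumulating the converted distances, as in `pos`, is how a vector of adjacent recombination fractions becomes marker positions along a linkage group.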
Load list of sequences saved by save_onemap_sequences
load_onemap_sequences(filename)
filename |
name of the file to be loaded |
Makes a sequence of markers based on an object of another type.
make_seq(input.obj, arg = NULL, phase = NULL, data.name = NULL, twopt = NULL)
input.obj |
an object of class |
arg |
its value depends on the type of object |
phase |
its value is also dependent on the type of |
data.name |
the object which contains the raw data. This does not have to be defined by the user: it is here for compatibility issues when calling |
twopt |
the object which contains the two-point information. This does not have to be defined by the user: it is here for compatibility issues when calling |
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
object of class |
twopt |
object of class |
Gabriel Margarido
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln, S. E. and Newburg, L. (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.
compare, try_seq, order_seq and map.
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)

all_mark <- make_seq(twopt,"all")
all_mark <- make_seq(twopt,1:30) # same as above, for this data set
groups <- group(all_mark)
LG1 <- make_seq(groups,1)
LG1.ord <- order_seq(LG1)
(LG1.final <- make_seq(LG1.ord)) # safe order
(LG1.final.all <- make_seq(LG1.ord,"force")) # forced order

markers <- make_seq(twopt,c(2,3,12,14))
markers.comp <- compare(markers)
(base.map <- make_seq(markers.comp))
base.map <- make_seq(markers.comp,1,1) # same as above
(extend.map <- try_seq(base.map,30))
(base.map <- make_seq(extend.map,5)) # fifth position is the best
Estimates the multipoint log-likelihood, linkage phases and recombination frequencies for a sequence of markers in a given order.
map( input.seq, tol = 1e-04, verbose = FALSE, rm_unlinked = FALSE, phase_cores = 1, parallelization.type = "PSOCK", global_error = NULL, genotypes_errors = NULL, genotypes_probs = NULL )
input.seq |
an object of class |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
verbose |
If |
rm_unlinked |
When some pair of markers do not follow the linkage criteria, if |
phase_cores |
number of computer cores to be used in analysis |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
global_error |
single value to be considered as error probability in HMM emission function |
genotypes_errors |
matrix individuals x markers with error values for each marker |
genotypes_probs |
table containing the probability distribution for each combination of marker × individual. Each line of this table represents the combination of one marker with one individual, and the respective probabilities. The table should contain three columns (prob(AA), prob(AB) and prob(BB)) and individuals*markers rows. |
Markers are mapped in the order defined in the object input.seq. If this object also contains a user-defined combination of linkage phases, recombination frequencies and log-likelihood are estimated for that particular case. Otherwise, the best linkage phase combination is also estimated. The multipoint likelihood is calculated according to Wu et al. (2002b) (Eqs. 7a to 11), assuming that the recombination fraction is the same in both parents. Hidden Markov chain codes adapted from Broman et al. (2008) were used.
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Adapted from Karl Broman (package 'qtl') by Gabriel R A Margarido and Marcelo Mollinari, with minor changes by Cristiane Taniguti and Bastian Schiffthaler
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments. R package version 1.09-43.
Jiang, C. and Zeng, Z.-B. (1997). Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101: 47-58.
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln, S. E. and Newburg, L. (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002a) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. (2002b). Linkage mapping of sex-specific differences. Genetical Research 79: 85-96.
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)

markers <- make_seq(twopt,c(30,12,3,14,2)) # correct phases
map(markers)

markers <- make_seq(twopt,c(30,12,3,14,2),phase=c(4,1,4,3)) # incorrect phases
map(markers)
Repeats the HMM if map finds an unlinked marker
map_avoid_unlinked( input.seq, size = NULL, overlap = NULL, phase_cores = 1, tol = 1e-04, parallelization.type = "PSOCK", max.gap = FALSE, global_error = NULL, genotypes_errors = NULL, genotypes_probs = NULL )
input.seq |
object of class sequence |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
max.gap |
the marker will be removed if it has gaps larger than this threshold on both sides |
global_error |
single value to be considered as error probability in HMM emission function |
genotypes_errors |
matrix individuals x markers with error values for each marker |
genotypes_probs |
table containing the probability distribution for each combination of marker × individual. Each line of this table represents the combination of one marker with one individual, and the respective probabilities. The table should contain three columns (prob(AA), prob(AB) and prob(BB)) and individuals*markers rows. |
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)

markers <- make_seq(twopt,c(30,12,3,14,2)) # correct phases
map_avoid_unlinked(markers)

markers <- make_seq(twopt,c(30,12,3,14,2),phase=c(4,1,4,3)) # incorrect phases
map_avoid_unlinked(markers)
Apply the batch mapping algorithm using overlapping windows.
map_overlapping_batches( input.seq, size = 50, overlap = 15, phase_cores = 1, verbose = FALSE, seeds = NULL, tol = 1e-04, rm_unlinked = TRUE, max.gap = FALSE, parallelization.type = "PSOCK" )
input.seq |
an object of class |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
verbose |
A logical. If TRUE, progress status information is printed. |
seeds |
A vector of phase information used as seeds for the first batch |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
rm_unlinked |
When some pair of markers do not follow the linkage criteria, if |
max.gap |
the marker will be removed if it has gaps larger than this threshold on both sides |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
This algorithm implements overlapping batch maps for high-density marker sets. The mapping problem is reduced to a number of subsets (batches), which carry information forward in order to estimate recombination fractions and phases more accurately. It is an adapted version of the map.overlapping.batches function of the BatchMap package. The main differences are that this onemap version does not have the option to reorder the markers according to the ripple algorithm and that, if it finds markers that do not meet the linkage criteria, the algorithm removes the problematic marker and repeats the analysis. Thus, the output map can have fewer markers than input.seq.
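To make the idea of overlapping windows concrete, the following base R sketch splits marker positions 1..n into batches of a fixed size in which consecutive batches share `overlap` markers. `make_batches` is a hypothetical helper written for this illustration; the actual batching in BatchMap and OneMap may choose batch sizes differently (the `size` argument is described as a center around which an optimum is searched).

```r
# Hypothetical helper (not part of OneMap): split positions 1..n into
# batches of at most `size` markers, sharing `overlap` markers between
# consecutive batches.
make_batches <- function(n, size = 50, overlap = 15) {
  stopifnot(overlap < size)
  step   <- size - overlap
  starts <- seq(1, max(1, n - overlap), by = step)
  lapply(starts, function(s) s:min(s + size - 1, n))
}

batches <- make_batches(n = 120, size = 50, overlap = 15)
sapply(batches, range)   # first and last position of each batch
```

The markers shared between two consecutive batches are where information estimated in one batch (such as linkage phases) can be carried forward to seed the next.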
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Performs map using background objects with only the selected markers. It saves RAM during the procedure and is useful when the total data set contains many markers.
map_save_ram( input.seq, tol = 1e-04, verbose = FALSE, rm_unlinked = FALSE, phase_cores = 1, size = NULL, overlap = NULL, parallelization.type = "PSOCK", max.gap = FALSE )
input.seq |
object of class sequence |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
verbose |
If |
rm_unlinked |
When some pair of markers do not follow the linkage criteria, if |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
max.gap |
the marker will be removed if it has gaps larger than this threshold on both sides |
Simulated data set from an F2 population.
data("mapmaker_example_f2")
The format is:
List of 8
 $ geno         : num [1:200, 1:66] 1 3 2 2 1 0 3 1 1 3 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:66] "M1" "M2" "M3" "M4" ...
 $ n.ind        : num 200
 $ n.mar        : num 66
 $ segr.type    : chr [1:66] "A.H.B" "C.A" "D.B" "C.A" ...
 $ segr.type.num: num [1:66] 1 3 2 3 3 2 1 3 2 1 ...
 $ input        : chr "/home/cristiane/R/x86_64-pc-linux-gnu-library/3.4/onemap/extdata/mapmaker_example_f2.raw"
 $ n.phe        : num 1
 $ pheno        : num [1:200, 1] 37.6 36.4 37.2 35.8 37.1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr "Trait_1"
 - attr(*, "class")= chr [1:2] "onemap" "f2"
A total of 200 individuals were genotyped for 66 markers (36 co-dominant, i.e., a, ab or b, and 30 dominant, i.e., c or a and d or b) with 15% of missing data. There is one quantitative phenotype to show how to use onemap output as R/qtl and QTL Cartographer input. Also, it is used for the analysis in the tutorial that comes with OneMap.
data(mapmaker_example_f2)

# perform two-point analyses
twopts <- rf_2pts(mapmaker_example_f2)
twopts
Informs the type of segregation of all markers from an object of class sequence. For outcross populations it uses the notation by Wu et al., 2002. For backcrosses, F2s and RILs, it uses the traditional notation from MAPMAKER, i.e., AA, AB, BB, not AA and not BB.
marker_type(input.seq)
input.seq |
an object of class |
The segregation types are (Wu et al., 2002):
Type | Cross | Segregation |
A.1 | ab x cd | 1:1:1:1 |
A.2 | ab x ac | 1:1:1:1 |
A.3 | ab x co | 1:1:1:1 |
A.4 | ao x bo | 1:1:1:1 |
B1.5 | ab x ao | 1:2:1 |
B2.6 | ao x ab | 1:2:1 |
B3.7 | ab x ab | 1:2:1 |
C.8 | ao x ao | 3:1 |
D1.9 | ab x cc | 1:1 |
D1.10 | ab x aa | 1:1 |
D1.11 | ab x oo | 1:1 |
D1.12 | bo x aa | 1:1 |
D1.13 | ao x oo | 1:1 |
D2.14 | cc x ab | 1:1 |
D2.15 | aa x ab | 1:1 |
D2.16 | oo x ab | 1:1 |
D2.17 | aa x bo | 1:1 |
D2.18 | oo x ao | 1:1 |
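The Segregation column gives the expected ratio of genotype classes in the progeny. As an illustration of how such a ratio can be checked, the base R sketch below runs a chi-square goodness-of-fit test for one marker of type B3.7 (expected 1:2:1). The counts are made up, and this is a generic use of `chisq.test`, not a description of how OneMap tests segregation.

```r
# Expected segregation ratios for a few marker types from the table above
expected_ratio <- list(
  "A.1"   = c(1, 1, 1, 1),
  "B3.7"  = c(1, 2, 1),
  "D1.10" = c(1, 1)
)

# Made-up genotype counts for one B3.7 marker scored in 100 individuals
observed <- c(aa = 27, ab = 48, bb = 25)

ratio <- expected_ratio[["B3.7"]]
res <- chisq.test(x = observed, p = ratio / sum(ratio))
res$statistic   # chi-square statistic
res$p.value     # a large p-value gives no evidence against 1:2:1
```

A small p-value would indicate segregation distortion, that is, counts that depart from the ratio expected for the marker type.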
A data.frame with the segregation types of all markers in the sequence, which is displayed on the screen.
Gabriel R A Margarido
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
data(onemap_example_out)
twopts <- rf_2pts(onemap_example_out)
markers.ex <- make_seq(twopts,c(3,6,8,12,16,25))
marker_type(input.seq = markers.ex) # segregation type for some markers

data(onemap_example_f2)
twopts <- rf_2pts(onemap_example_f2)
all_mrk<-make_seq(twopts, "all")
lgs<-group(all_mrk)
lg1<-make_seq(lgs,1)
marker_type(lg1) # segregation type for linkage group 1
For a given sequence of markers, applies the MDS method described in Preedy and Hackett (2016), using the MDSMap package, to order the markers, and estimates the genetic distances with the OneMap multipoint approach. It also writes an input file in MDSMap format for direct analysis in that package.
mds_onemap( input.seq, out.file = NULL, p = NULL, ispc = TRUE, displaytext = FALSE, weightfn = "lod2", mapfn = "haldane", ndim = 2, rm_unlinked = TRUE, size = NULL, overlap = NULL, phase_cores = 1, tol = 1e-05, hmm = TRUE, parallelization.type = "PSOCK" )
input.seq |
an object of class |
out.file |
path to the generated MDSMap input file. |
p |
Integer - the penalty for deviations from the sphere - higher p forces points more closely onto a sphere. |
ispc |
Logical determining the method to be used to estimate the map. By default this is TRUE and the method of principal curves will be used. If FALSE then the constrained MDS method will be used. |
displaytext |
Shows marker names in the analysis graphic view |
weightfn |
Character string specifying the values to use for the weight matrix in the MDS: 'lod2' or 'lod'. |
mapfn |
Character string specifying the map function to use on the recombination fractions: 'haldane' (default), 'kosambi' or 'none'. |
ndim |
number of dimensions to be considered in the multidimensional scaling procedure (default = 2) |
rm_unlinked |
When some pair of markers do not follow the linkage criteria, if |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
hmm |
logical defining if the HMM must be applied to estimate multipoint genetic distances |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
For a more detailed description of the MDS method, see the MDSMap package vignette.
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Cristiane Taniguti
Preedy, K. F. and Hackett, C. A. (2016). A rapid marker ordering approach for high-density genetic linkage maps in experimental autotetraploid populations using multidimensional scaling. Theoretical and Applied Genetics 129: 2117-2132.
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetics maps. Heredity 103: 494-502.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002a) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. (2002b). Linkage mapping of sex-specific differences. Genetical Research 79: 85-96.
https://CRAN.R-project.org/package=MDSMap.
Simulated data set from a backcross population.
data(onemap_example_bc)
The format is:
List of 10
 $ geno         : num [1:150, 1:67] 1 2 1 1 2 1 2 1 1 2 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:150] "ID1" "ID2" "ID3" "ID4" ...
  .. ..$ : chr [1:67] "M1" "M2" "M3" "M4" ...
 $ n.ind        : int 150
 $ n.mar        : int 67
 $ segr.type    : chr [1:67] "A.H" "A.H" "A.H" "A.H" ...
 $ segr.type.num: logi [1:67] NA NA NA NA NA NA ...
 $ n.phe        : int 1
 $ pheno        : num [1:150, 1] 40.8 39.5 37.9 34.2 38.9 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr "Trait_1"
 $ CHROM        : NULL
 $ POS          : NULL
 $ input        : chr "onemap_example_bc.raw"
 - attr(*, "class")= chr [1:2] "onemap" "backcross"
A total of 150 individuals were genotyped for 67 markers with 15% of missing data. There is one quantitative phenotype to show how to use onemap output as R/qtl input.
Marcelo Mollinari
read_onemap and read_mapmaker.
data(onemap_example_bc)

# perform two-point analyses
twopts <- rf_2pts(onemap_example_bc)
twopts
Simulated data set from an F2 population.
data("onemap_example_f2")
The format is:
List of 10
 $ geno         : num [1:200, 1:66] 1 3 2 2 1 0 3 1 1 3 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:200] "IND1" "IND2" "IND3" "IND4" ...
  .. ..$ : chr [1:66] "M1" "M2" "M3" "M4" ...
 $ n.ind        : int 200
 $ n.mar        : int 66
 $ segr.type    : chr [1:66] "A.H.B" "C.A" "D.B" "C.A" ...
 $ segr.type.num: num [1:66] 1 3 2 3 3 2 1 3 2 1 ...
 $ n.phe        : int 1
 $ pheno        : num [1:200, 1] 37.6 36.4 37.2 35.8 37.1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr "Trait_1"
 $ CHROM        : NULL
 $ POS          : NULL
 $ input        : chr "/home/cristiane/R/x86_64-pc-linux-gnu-library/3.4/onemap/extdata/onemap_example_f2.raw"
 - attr(*, "class")= chr [1:2] "onemap" "f2"
A total of 200 individuals were genotyped for 66 markers (36 co-dominant, i.e., a, ab or b, and 30 dominant, i.e., c or a and d or b) with 15% of missing data. There is one quantitative phenotype to show how to use onemap output as R/qtl and QTL Cartographer input. Also, it is used for the analysis in the tutorial that comes with OneMap.
data(onemap_example_f2)
plot(onemap_example_f2)
Simulated data set for an outcross, i.e., an F1 population obtained by crossing two non-homozygous parents.
data(onemap_example_out)
An object of class onemap.
A total of 100 F1 individuals were genotyped for 30 markers. The data currently contains only genotype information (no phenotypes). It is included as a reference for how a data file must be formatted. Also, it is used for the analysis in the tutorial that comes with OneMap.
Gabriel R A Margarido
read_onemap for details about objects of class onemap.
data(onemap_example_out)

# perform two-point analyses
twopts <- rf_2pts(onemap_example_out)
twopts
Simulated biallelic data set for an ri self population.
data("onemap_example_riself")
The format is:
List of 10
 $ geno         : num [1:100, 1:68] 3 1 3 1 1 1 1 1 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:100] "ID1" "ID2" "ID3" "ID4" ...
  .. ..$ : chr [1:68] "M1" "M2" "M3" "M4" ...
 $ n.ind        : int 100
 $ n.mar        : int 68
 $ segr.type    : chr [1:68] "A.B" "A.B" "A.B" "A.B" ...
 $ segr.type.num: logi [1:68] NA NA NA NA NA NA ...
 $ n.phe        : int 0
 $ pheno        : NULL
 $ CHROM        : NULL
 $ POS          : NULL
 $ input        : chr "onemap_example_riself.raw"
 - attr(*, "class")= chr [1:2] "onemap" "riself"
A total of 100 individuals were genotyped for 68 markers. The data currently contains only genotype information (no phenotypes). It is included as a reference for how a data file must be formatted.
Cristiane Taniguti
read_onemap for details about objects of class onemap.
data(onemap_example_riself)
plot(onemap_example_riself)
Converts data from a VCF file to a onemap initial object, while identifying the appropriate marker segregation patterns.
onemap_read_vcfR( vcf = NULL, vcfR.object = NULL, cross = NULL, parent1 = NULL, parent2 = NULL, f1 = NULL, only_biallelic = TRUE, output_info_rds = NULL, verbose = TRUE )
vcf |
string defining the path to VCF file; |
vcfR.object |
object of class vcfR; |
cross |
type of cross. Must be one of: |
parent1 |
|
parent2 |
|
f1 |
|
only_biallelic |
if TRUE (default) only biallelic markers are considered, if FALSE multiallelic markers are included. |
output_info_rds |
define a name for the file with alleles information. |
verbose |
A logical. If TRUE, progress status information is printed. |
Only biallelic SNPs and indels for diploid variant sites are considered.
Genotype information on the parents is required for all cross types. For full-sib progenies, both outbred parents must be genotyped. For backcrosses, F2 intercrosses and recombinant inbred lines, the original inbred lines must be genotyped. Particularly for backcross progenies, the recurrent line must be provided as the first parent in the function arguments.
Marker type is determined based on parental genotypes. Variants for which parent genotypes cannot be determined are discarded.
Reference sequence ID and position for each variant site are also stored.
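As an illustration of how parental genotypes determine the marker type for a biallelic site in an outcross, the base R sketch below follows the segregation notation of Wu et al. (2002): a site heterozygous in both parents segregates 1:2:1 (B3.7), a site heterozygous only in the first parent segregates 1:1 (D1.10), and a site heterozygous only in the second parent segregates 1:1 (D2.15). `classify_biallelic` is a hypothetical helper written for this example; it is not the code used by `onemap_read_vcfR`, and it ignores other cross types and multiallelic sites.

```r
# Hypothetical helper (not part of OneMap): classify a biallelic marker
# from the VCF GT strings of the two parents of an outcross.
classify_biallelic <- function(p1, p2) {
  alleles <- function(g) strsplit(g, "[/|]")[[1]]   # handles "0/1" and "0|1"
  called  <- function(a) length(a) == 2 && all(a != ".")
  is_het  <- function(g) { a <- alleles(g); called(a) && a[1] != a[2] }
  is_hom  <- function(g) { a <- alleles(g); called(a) && a[1] == a[2] }

  if (is_het(p1) && is_het(p2)) return("B3.7")    # ab x ab, 1:2:1
  if (is_het(p1) && is_hom(p2)) return("D1.10")   # ab x aa, 1:1
  if (is_hom(p1) && is_het(p2)) return("D2.15")   # aa x ab, 1:1
  NA_character_   # missing parent genotype or non-segregating site
}

classify_biallelic("0/1", "0/1")
classify_biallelic("0|1", "0/0")
classify_biallelic("1/1", "0/1")
classify_biallelic("./.", "0/1")
```

Returning `NA` for undetermined or non-segregating parental genotypes corresponds to the statement that such variants are discarded.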
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the segregation type of each marker, as |
segr.type.num |
a vector with the segregation type of each marker, represented in a simplified manner as integers, i.e. 1 corresponds to markers of type |
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
error |
matrix containing HMM emission probabilities |
Cristiane Taniguti
read_onemap for a description of the output object of class onemap.
data <- onemap_read_vcfR(vcf=system.file("extdata/vcf_example_out.vcf.gz", package = "onemap"), cross="outcross", parent1=c("P1"), parent2=c("P2"))
Order the markers in a sequence using the genomic position
ord_by_geno(input.seq)
input.seq |
object of class 'sequence' |
An object of class sequence
Cristiane Taniguti
For a given sequence of markers, this function first uses the compare function to create a framework for a subset of informative markers. Then, it tries to map remaining ones using the try_seq function.
order_seq( input.seq, n.init = 5, subset.search = c("twopt", "sample"), subset.n.try = 30, subset.THRES = 3, twopt.alg = c("rec", "rcd", "ser", "ug"), THRES = 3, touchdown = FALSE, tol = 0.1, rm_unlinked = FALSE, verbose = FALSE )
input.seq |
an object of class |
n.init |
the number of markers to be used in the |
subset.search |
a character string indicating which method should be used to search for a subset of informative markers for the |
subset.n.try |
integer. The number of times to repeat the subset search procedure. It is only used if |
subset.THRES |
numerical. The threshold for the subset search procedure. It is only used if |
twopt.alg |
a character string indicating which two-point algorithm should be used if |
THRES |
threshold to be used when positioning markers in the |
touchdown |
logical. If |
tol |
tolerance number for the C routine, i.e., the value used to evaluate convergence of the EM algorithm. |
rm_unlinked |
When some pair of markers do not follow the linkage criteria, if |
verbose |
A logical. If TRUE, progress status information is printed. |
For outcrossing populations, the initial subset and the order in which
remaining markers will be used in the try_seq
step is given by the
degree of informativeness of markers (i.e markers of type A, B, C and D, in
this order).
For backcrosses, F2s or RILs, two methods can be used for
choosing the initial subset: i) "sample"
randomly chooses a number
of markers, indicated by n.init
, and calculates the multipoint
log-likelihood of the possible orders.
If the LOD Score of the second best order is greater than
subset.THRES
, than it takes the best order to proceed with the
try_seq
step. If not, the procedure is repeated. The maximum number
of times to repeat this procedure is given by the subset.n.try
argument. ii) "twopt"
uses a two-point based algorithm, given by the
option "twopt.alg"
, to construct a two-point based map. The options
are "rec"
for RECORD algorithm, "rcd"
for Rapid Chain
Delineation, "ser"
for Seriation and "ug"
for Unidirectional
Growth. Then, equally spaced markers are taken from this map. The
"compare"
step will then be applied on this subset of markers.
In both cases, the order in which the other markers will be used in the try_seq step is given by marker types (i.e., co-dominant before dominant) and by the missing information on each marker.
After running the compare and try_seq steps, which result in a "safe" order, markers that could not be mapped are "forced" into the map, resulting in a map with all markers positioned.
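The "twopt" subset selection described above can be pictured with a few lines of base R. This is only a sketch: twopt_order is a made-up two-point order, and reading "equally spaced" as evenly spaced index positions is an assumption, not necessarily what the package does internally.

```r
# Hypothetical two-point order of 20 markers (marker numbers are made up)
twopt_order <- c(12, 3, 7, 19, 1, 5, 14, 8, 2, 17,
                 10, 6, 20, 4, 15, 9, 13, 11, 18, 16)
n.init <- 5

# Evenly spaced index positions along the preliminary map (assumption)
idx <- round(seq(1, length(twopt_order), length.out = n.init))

# Markers that would seed the exhaustive "compare" step
initial_subset <- twopt_order[idx]
initial_subset
```

The first and last markers of the preliminary order are always part of the subset under this reading, so the initial framework spans the whole group.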
An object of class order, which is a list containing the following components:
ord |
an object of class |
mrk.unpos |
a |
LOD.unpos |
a |
THRES |
the same as the input value, just for printing. |
ord.all |
an object of class |
data.name |
name of the object of class |
twopt |
name of the object of class |
Gabriel R A Margarido, [email protected] and Marcelo Mollinari, [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments R package version 1.09-43
Jiang, C. and Zeng, Z.-B. (1997). Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101: 47-58.
Lander, E. S. and Green, P. (1987). Construction of multilocus genetic linkage maps in humans. Proc. Natl. Acad. Sci. USA 84: 2363-2367.
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln, S. E. and Newburg, L. (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetic maps. Heredity 103: 494-502.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002a) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. (2002b). Linkage mapping of sex-specific differences. Genetical Research 79: 85-96
make_seq, compare and try_seq.
# outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG2 <- make_seq(groups,2)
LG2.ord <- order_seq(LG2,touchdown=TRUE)
LG2.ord
make_seq(LG2.ord) # get safe sequence
make_seq(LG2.ord,"force") # get forced sequence
Generates a data.frame with the parents' estimated haplotypes
parents_haplotypes( ..., group_names = NULL, map.function = "kosambi", ref_alt_alleles = FALSE )
... |
objects of class sequence |
group_names |
vector of characters defining the group names |
map.function |
"kosambi" or "haldane" according to which was used to build the map |
ref_alt_alleles |
TRUE to return the parents' haplotypes coded as reference and alternative alleles |
data.frame with group ID (group), marker number (mk.number) and names (mk.names), position in centimorgan (dist) and parents haplotypes (P1_1, P1_2, P2_1, P2_2)
Getulio Caixeta Ferreira, [email protected]
Cristiane Taniguti, [email protected]
data("onemap_example_out")
twopts <- rf_2pts(onemap_example_out)
lg1 <- make_seq(twopts, 1:5)
lg1.map <- map(lg1)
parents_haplotypes(lg1.map)
Suggest an optimal batch size value for use in map_overlapping_batches
pick_batch_sizes(input.seq, size = 50, overlap = 15, around = 5)
input.seq |
an object of class |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
around |
The range around the center which is maximally allowed to be searched. |
An integer value for the size which most evenly divides batches. In case of ties, bigger batch sizes are preferred.
Bastian Schiffthaler, [email protected]
LG <- structure(list(seq.num = seq(1,800)), class = "sequence")
batchsize <- pick_batch_sizes(LG, 50, 19)
The function receives an object of class onemap. For outcrossing populations, it can show detailed information (all 18 possible categories), or a simplified version.
plot_by_segreg_type(x, subcateg = TRUE)
x |
an object of class |
subcateg |
a TRUE/FALSE option to indicate if results will be plotted showing all possible categories (only for outcrossing populations) |
a ggplot graphic
data(onemap_example_out) # Outcrossing data
plot_by_segreg_type(onemap_example_out)
plot_by_segreg_type(onemap_example_out, subcateg=FALSE)
data(onemap_example_bc)
plot_by_segreg_type(onemap_example_bc)
data(mapmaker_example_f2)
plot_by_segreg_type(mapmaker_example_f2)
Provides a simple ggplot of genetic versus physical positions.
plot_genome_vs_cm(map.list, mapping_function = "kosambi", group.names = NULL)
map.list |
a map, i.e. an object of class |
mapping_function |
either "kosambi" or "haldane" |
group.names |
vector with group name for each sequence object in the map.list |
ggplot with cM on x-axis and physical position on y-axis
Jeekin Lau, [email protected]
Shows a heatmap (in ggplot2, a graphic of geom "tile") for raw data. Lines correspond to markers and columns to individuals. The function can plot a graph for all marker types, depending on the cross type (dominant/codominant markers, in all combinations). The function receives an object of class onemap, reads the genotype information from this object, converts it to a long dataframe format using function melt() from package reshape2 or the internal function create_dataframe_for_plot_outcross(), converts numbers from the object to genetic notation (according to the cross type), then plots the graphic. If there are more than 20 markers, the y-axis labels are removed. For outcross populations, it can show all markers together, or it can split them according to their segregation pattern.
## S3 method for class 'onemap'
plot(x, all = TRUE, ...)
x |
an object of class |
all |
a TRUE/FALSE option to indicate if results will be plotted together (if TRUE) or split based on their segregation pattern. Only used for outcross populations. |
... |
currently ignored |
a ggplot graphic
# library(ggplot2)
data(onemap_example_bc) # Loads a fake backcross dataset installed with onemap
plot(onemap_example_bc) # This will show you the graph
# You can store the graphic in an object, then save it with a number of properties
# For details, see the help of ggplot2's function ggsave()
g <- plot(onemap_example_bc)

data(onemap_example_f2) # Loads a fake F2 dataset installed with onemap
plot(onemap_example_f2) # This will show you the graph
# You can store the graphic in an object, then save it with a number of properties
# For details, see the help of ggplot2's function ggsave()
g <- plot(onemap_example_f2)

data(onemap_example_out) # Loads a fake full-sib dataset installed with onemap
plot(onemap_example_out) # This will show you the graph for all markers
plot(onemap_example_out, all=FALSE) # This will show you the graph split by marker type
# You can store the graphic in an object, then save it.
# For details, see the help of ggplot2's function ggsave()
g <- plot(onemap_example_out, all=FALSE)
The figure is generated with the haplotypes of each selected individual. As a representation, the recombination breakpoints are placed at the midpoint of the distance between two markers. Note that this does not reflect the exact breakpoint position, especially if the genetic map has low resolution.
## S3 method for class 'onemap_progeny_haplotypes'
plot(x, col = NULL, position = "stack", show_markers = TRUE, main = "Genotypes", ncol = 4, ...)
x |
object of class onemap_progeny_haplotypes |
col |
Color of the parents' homologs. |
position |
"split" or "stack"; if "split", the alleles are plotted separately; if "stack" (the default in the usage above), the parents' alleles are plotted together. |
show_markers |
logical; if |
main |
An overall title for the plot; default is |
ncol |
number of columns of the facet_wrap |
... |
currently ignored |
a ggplot graphic
Getulio Caixeta Ferreira, [email protected]
Cristiane Taniguti, [email protected]
data("onemap_example_out")
twopts <- rf_2pts(onemap_example_out)
lg1 <- make_seq(twopts, 1:5)
lg1.map <- map(lg1)
plot(progeny_haplotypes(lg1.map))
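The midpoint convention mentioned in the description of this plot method can be illustrated with base R alone. The positions and haplotype labels below are invented, and the code is a sketch of the idea rather than the package's plotting code.

```r
# Hypothetical marker positions (cM) and the parental homolog inherited
# at each marker by one progeny chromosome (values are made up)
pos <- c(0, 4.2, 9.8, 15.1, 22.6)
hap <- c("P1_H1", "P1_H1", "P1_H2", "P1_H2", "P1_H1")

# A change of homolog between adjacent markers signals a recombination
switches <- which(head(hap, -1) != tail(hap, -1))

# Each breakpoint is drawn halfway between its two flanking markers
breakpoints <- (pos[switches] + pos[switches + 1]) / 2
breakpoints
```

The wider the interval between the flanking markers, the less the drawn midpoint says about where the crossover actually occurred.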
Plot recombination breakpoints counts for each individual
## S3 method for class 'onemap_progeny_haplotypes_counts'
plot(x, by_homolog = FALSE, n.graphics = NULL, ncol = NULL, ...)
x |
object of class onemap_progeny_haplotypes_counts |
by_homolog |
logical; if TRUE, plots counts by homolog (two for each individual); if FALSE, plots total counts by individual |
n.graphics |
integer defining the number of graphics to be plotted; the individuals are split across the different plots |
ncol |
integer defining the number of columns in plot |
... |
currently ignored |
a ggplot graphic
data("onemap_example_out")
twopts <- rf_2pts(onemap_example_out)
lg1 <- make_seq(twopts, 1:5)
lg1.map <- map(lg1)
prog.haplo <- progeny_haplotypes(lg1.map, most_likely = TRUE)
plot(progeny_haplotypes_counts(prog.haplo))
Draw a graphic showing the p-values (re-scaled to -log10(p-values)) associated with the chi-square tests for the expected segregation patterns for all markers in a dataset. It includes a vertical line showing the threshold for declaring statistical significance if Bonferroni's correction is considered, as well as the percentage of markers that will be discarded if this criterion is used.
## S3 method for class 'onemap_segreg_test'
plot(x, order = TRUE, ...)
x |
an object of class onemap_segreg_test (produced by onemap's function test_segregation()), i. e., after performing segregation tests |
order |
a variable to define if p-values will be ordered in the plot |
... |
currently ignored |
a ggplot graphic
data(onemap_example_bc) # load OneMap's fake dataset for a backcross population
BC.seg <- test_segregation(onemap_example_bc) # Applies chi-square tests
print(BC.seg) # Shows the results
plot(BC.seg) # Plot the graph, ordering the p-values
plot(BC.seg, order=FALSE) # Plot the graph showing the results keeping the order in the dataset

data(onemap_example_out) # load OneMap's fake dataset for an outcrossing population
Out.seg <- test_segregation(onemap_example_out) # Applies chi-square tests
print(Out.seg) # Shows the results
plot(Out.seg) # Plot the graph, ordering the p-values
plot(Out.seg, order=FALSE) # Plot the graph showing the results keeping the order in the dataset
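The Bonferroni threshold shown in this plot can be reproduced by hand. The sketch below uses made-up p-values and assumes a global significance level of 0.05; the level actually used by the plot method is not stated here.

```r
# Made-up p-values standing in for per-marker chi-square test results
p_values <- c(0.80, 0.03, 0.0004, 0.25, 0.00001, 0.60)
alpha <- 0.05                      # assumed global significance level
n_markers <- length(p_values)

# Bonferroni-corrected threshold on the -log10 scale used in the plot
threshold <- -log10(alpha / n_markers)

# Markers whose -log10(p-value) exceeds the threshold would be discarded
distorted <- -log10(p_values) > threshold
pct_discarded <- 100 * mean(distorted)
```

Because the threshold grows with the number of markers, a p-value that is flagged in a small dataset may pass in a larger one.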
print method for object class 'compare'
## S3 method for class 'compare'
print(x, ...)
x |
object of class compare |
... |
currently ignored |
compare object description
Print method for object class 'onemap'
## S3 method for class 'onemap'
print(x, ...)
x |
object of class onemap |
... |
currently ignored |
printed information about onemap object
print method for object class 'onemap_bin'
## S3 method for class 'onemap_bin'
print(x, ...)
x |
object of class |
... |
currently ignored |
No return value, called for side effects
It shows the results of the chi-square tests performed for all markers in a onemap object of cross type outcross, backcross, F2 intercross or recombinant inbred lines.
## S3 method for class 'onemap_segreg_test'
print(x, ...)
x |
an object of class onemap_segreg_test |
... |
currently ignored |
a dataframe with marker name, H0 hypothesis, chi-square statistics, p-values, and
data(onemap_example_out) # Loads a fake outcross dataset installed with onemap
Chi <- test_segregation(onemap_example_out) # Performs the chi-square test for all markers
print(Chi) # Shows the results
Print order_seq object
## S3 method for class 'order'
print(x, ...)
x |
object of class order_seq |
... |
currently ignored |
printed information about order_seq object
Print method for object class 'sequence'
## S3 method for class 'sequence'
print(x, ...)
x |
object of class sequence |
... |
currently ignored |
printed information about sequence object
Generates a data.frame with the genotypes estimated by the HMM and their probabilities
progeny_haplotypes(..., ind = 1, group_names = NULL, most_likely = FALSE)
... |
Map(s) or list(s) of maps. Object(s) of class sequence. |
ind |
vector with individual index to be evaluated or "all" to include all individuals |
group_names |
Names of the groups. |
most_likely |
logical; if |
a data.frame with the following information: individual (ind) and marker ID, group ID (grp), position in centimorgan (pos), genotype probabilities (prob), parents, the parents' homologs and the allele IDs.
Getulio Caixeta Ferreira, [email protected]
Cristiane Taniguti, [email protected]
data("onemap_example_out")
twopts <- rf_2pts(onemap_example_out)
lg1 <- make_seq(twopts, 1:5)
lg1.map <- map(lg1)
progeny_haplotypes(lg1.map)
Generates the counts of breakpoints for each individual (which can then be plotted), considering the most likely genotypes estimated by the HMM. Positions where two genotypes have the same probability are removed. Currently only available for outcrossing and F2 intercross populations.
progeny_haplotypes_counts(x)
x |
object of class onemap_progeny_haplotypes |
a data.frame with columns individuals ID (ind), group ID (grp), homolog (homolog) and counts of breakpoints
data("onemap_example_out")
twopts <- rf_2pts(onemap_example_out)
lg1 <- make_seq(twopts, 1:5)
lg1.map <- map(lg1)
progeny_haplotypes_counts(progeny_haplotypes(lg1.map, most_likely = TRUE))
Implements the marker ordering algorithm Rapid Chain Delineation (Doerge, 1996).
rcd( input.seq, LOD = 0, max.rf = 0.5, tol = 1e-04, rm_unlinked = TRUE, size = NULL, overlap = NULL, phase_cores = 1, hmm = TRUE, parallelization.type = "PSOCK", verbose = TRUE )
input.seq |
an object of class |
LOD |
minimum LOD-Score threshold used when constructing the pairwise recombination fraction matrix. |
max.rf |
maximum recombination fraction threshold used as the LOD value above. |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
rm_unlinked |
When some pair of markers do not follow the linkage criteria,
if |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
hmm |
logical defining if the HMM must be applied to estimate multipoint genetic distances |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
verbose |
A logical; if TRUE, progress status information is printed. |
Rapid Chain Delineation (RCD) is an algorithm for marker ordering in linkage groups. It is not an exhaustive search method and, therefore, is not computationally intensive. However, it does not guarantee that the best order is always found. The only requirement is a matrix with recombination fractions between markers. Next is an excerpt from QTL Cartographer Version 1.17 Manual describing the RCD algorithm (Basten et al., 2005):
The linkage group is initiated with the pair of markers having the smallest recombination fraction. The remaining markers are placed in a “pool” awaiting placement on the map. The linkage group is extended by adding markers from the pool of unlinked markers. Each terminal marker of the linkage group is a candidate for extension of the chain: The unlinked marker that has the smallest recombination fraction with either is added to the chain subject to the provision that the recombination fraction is statistically significant at a prespecified level. This process is repeated as long as markers can be added to the chain.
After determining the order with RCD, the final map is constructed using the multipoint approach (function map).
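The chain-building steps in the excerpt above can be sketched in base R. This is an illustration only: rcd_sketch is a hypothetical helper, not the package's rcd() function, and the statistical significance check mentioned in the excerpt is deliberately omitted to keep the example short.

```r
# Sketch of Rapid Chain Delineation on a symmetric matrix of
# recombination fractions (no significance test: an assumption)
rcd_sketch <- function(rf) {
  n <- nrow(rf)
  diag(rf) <- NA                       # ignore self-comparisons
  # Start the chain with the pair having the smallest recombination fraction
  start <- which(rf == min(rf, na.rm = TRUE), arr.ind = TRUE)[1, ]
  chain <- unname(as.vector(start))
  pool <- setdiff(seq_len(n), chain)   # markers awaiting placement
  while (length(pool) > 0) {
    left <- chain[1]
    right <- chain[length(chain)]
    # Closest pool marker to each terminal marker of the chain
    best_left <- pool[which.min(rf[left, pool])]
    best_right <- pool[which.min(rf[right, pool])]
    if (rf[left, best_left] < rf[right, best_right]) {
      chain <- c(best_left, chain)     # extend at the left end
      pool <- setdiff(pool, best_left)
    } else {
      chain <- c(chain, best_right)    # extend at the right end
      pool <- setdiff(pool, best_right)
    }
  }
  chain
}

# Toy data: five markers at known positions (Morgans), Haldane function
pos <- c(0, 0.10, 0.25, 0.45, 0.70)
rf <- 0.5 * (1 - exp(-2 * abs(outer(pos, pos, "-"))))
shuffle <- c(3, 1, 5, 2, 4)            # present the markers out of order
ord <- rcd_sketch(rf[shuffle, shuffle])
recovered <- shuffle[ord]              # true order, up to orientation
recovered
```

With noise-free recombination fractions the greedy chain recovers the true order, possibly reversed; with real data the result is only a starting order, which is why the final map is still estimated with the multipoint approach.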
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Gabriel R A Margarido, [email protected]
Basten, C. J., Weir, B. S. and Zeng, Z.-B. (2005) QTL Cartographer Version 1.17: A Reference Manual and Tutorial for QTL Mapping.
Doerge, R. W. (1996) Constructing genetic maps by rapid chain delineation. Journal of Quantitative Trait Loci 2: 121-132.
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetic maps. Heredity 103: 494-502.
# outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG1 <- make_seq(groups,1)
LG1.rcd <- rcd(LG1, hmm = FALSE)

# F2 example
data(onemap_example_f2)
twopt <- rf_2pts(onemap_example_f2)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG1 <- make_seq(groups,1)
LG1.rcd <- rcd(LG1, hmm = FALSE)
LG1.rcd
Imports data from a Mapmaker raw file.
read_mapmaker(file = NULL, dir = NULL, verbose = TRUE)
file |
the name of the input file which contains the data to be read. |
dir |
directory where the input file is located. |
verbose |
A logical; if TRUE, progress status information is printed. |
For details about MAPMAKER files see Lincoln et al. (1993). The current version supports backcross, F2s and RIL populations. The file can contain phenotypic data, but it will not be used in the analysis.
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker in MAPMAKER/EXP fashion, i.e., 1, 2, 3: AA, AB, BB, respectively; 3, 4: BB, not BB, respectively; 1, 5: AA, not AA, respectively. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the segregation type of each marker, as
|
segr.type.num |
a vector with the segregation type of each marker, represented in a simplified manner as integers. Segregation types were adapted from outcross segregation types. For details see read_onemap. |
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. Currently ignored. |
error |
matrix containing HMM emission probabilities |
Adapted from Karl Broman (package qtl) by Marcelo Mollinari, [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments R package version 1.09-43
Lincoln, S. E., Daly, M. J. and Lander, E. S. (1993) Constructing genetic linkage maps with MAPMAKER/EXP Version 3.0: a tutorial and reference manual. A Whitehead Institute for Biomedical Research Technical Report.
mapmaker_example_bc and mapmaker_example_f2 raw files in the package source.
map_data <- read_mapmaker(file=system.file("extdata/mapmaker_example_f2.raw", package = "onemap"))
# Checking 'mapmaker_example_f2'
data(mapmaker_example_f2)
names(mapmaker_example_f2)
Imports data derived from outbred parents (full-sib family) or inbred parents (backcross, F2 intercross and recombinant inbred lines obtained by self- or sib-mating). Creates an object of class onemap.
read_onemap(inputfile = NULL, dir = NULL, verbose = TRUE)
inputfile |
the name of the input file which contains the data to be read. |
dir |
directory where the input file is located. |
verbose |
A logical; if TRUE, progress status information is printed. |
The file format is similar to that used by MAPMAKER/EXP (Lincoln et al., 1993). The first line indicates the cross type and is structured as data type {cross}, where cross must be one of "outcross", "f2 intercross", "f2 backcross", "ri self" or "ri sib". The second line contains five integers: i) the number of individuals; ii) the number of markers; iii) an indicator variable taking the value 1 if there is CHROM information, i.e., if markers are anchored on any reference sequence, and 0 otherwise; iv) a similar 1/0 variable indicating whether there is POS information for markers; and v) the number of phenotypic traits.
The next line contains sample IDs, separated by empty spaces or tabs. Addition of this sample ID requirement makes it possible for separate input datasets to be merged.
Next comes the genotype data for all markers. Each new marker is initiated with a “*” (without the quotes) followed by the marker name, without any space between them. Each marker name is followed by the corresponding segregation type, which may be: "A.1", "A.2", "A.3", "A.4", "B1.5", "B2.6", "B3.7", "C.8", "D1.9", "D1.10", "D1.11", "D1.12", "D1.13", "D2.14", "D2.15", "D2.16", "D2.17" or "D2.18" (without quotes), for full-sibs [see marker_type and Wu et al. (2002) for details]. Other cross types have special marker types: "A.H" for backcrosses; "A.H.B" for F2 intercrosses; and "A.B" for recombinant inbred lines.
After the segregation type comes the genotype data for the corresponding marker. Depending on the segregation type, genotypes may be denoted by ac, ad, bc, bd, a, ba, b, bc, ab and o, in several possible combinations. To make things easier, we have followed exactly the notation used by Wu et al. (2002). Allowed values for backcrosses are a and ab; for F2 crosses they are a, ab and b; for RILs they may be a and b. Genotypes must be separated by a space. Missing values are denoted by "-".
If there is physical information for markers, i.e., if they are anchored at specific positions in reference sequences (usually chromosomes), this is included immediately after the marker data. These lines start with special keywords *CHROM and *POS and contain strings and integers, respectively, indicating the reference sequence and position for each marker. These also need to be separated by spaces.
Finally, if there is phenotypic data, it will be added just after the marker or CHROM/POS data. They need to be separated by spaces as well, using the same symbol for missing information.
The example directory in the package distribution contains an example data file to be read with this function. Further instructions can be found in the tutorial distributed along with this package.
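To make the layout described above concrete, the following base R sketch writes a tiny F2 intercross file to a temporary location and checks that its header agrees with its marker lines. Sample IDs, marker names and genotypes are invented, and the file is an illustration of the described format rather than an official example.

```r
# Write a tiny F2 intercross file following the layout described above.
# All names and genotypes are made up for illustration.
raw_lines <- c(
  "data type f2 intercross",
  "4 2 0 0 0",              # individuals, markers, CHROM?, POS?, phenotypes
  "ID1 ID2 ID3 ID4",        # sample IDs
  "*M1 A.H.B a ab b -",     # "-" marks a missing genotype
  "*M2 A.H.B ab ab a b"
)
path <- tempfile(fileext = ".raw")
writeLines(raw_lines, path)

# Sanity-check the header against the marker lines
header <- as.integer(strsplit(raw_lines[2], " ")[[1]])
marker_lines <- grep("^\\*", readLines(path), value = TRUE)
n_geno <- lengths(strsplit(marker_lines, " ")) - 2  # drop name and type
```

If the onemap package is installed, a file like this could in principle be passed as read_onemap(inputfile = path); whether such a small toy file passes all of the function's internal checks has not been verified here.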
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the segregation type of each marker, as |
segr.type.num |
a vector with the segregation type of each marker, represented in a simplified manner as integers, i.e. 1 corresponds to markers of type |
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
error |
matrix containing HMM emission probabilities |
Gabriel R A Margarido, [email protected]
Lincoln, S. E., Daly, M. J. and Lander, E. S. (1993) Constructing genetic linkage maps with MAPMAKER/EXP Version 3.0: a tutorial and reference manual. A Whitehead Institute for Biomedical Research Technical Report.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
combine_onemap and the example directory in the package source.
outcr_data <- read_onemap(inputfile= system.file("extdata/onemap_example_out.raw", package= "onemap"))
Implements the marker ordering algorithm Recombination Counting and Ordering (Van Os et al., 2005).
record( input.seq, times = 10, LOD = 0, max.rf = 0.5, tol = 1e-04, rm_unlinked = TRUE, size = NULL, overlap = NULL, phase_cores = 1, hmm = TRUE, parallelization.type = "PSOCK", verbose = TRUE )
input.seq |
an object of class |
times |
integer. Number of replicates of the RECORD procedure. |
LOD |
minimum LOD-Score threshold used when constructing the pairwise recombination fraction matrix. |
max.rf |
maximum recombination fraction threshold used as the LOD value above. |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
rm_unlinked |
When some pair of markers do not follow the linkage criteria,
if |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
hmm |
logical defining if the HMM must be applied to estimate multipoint genetic distances |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
verbose |
A logical; if TRUE, progress status information is printed. |
Recombination Counting and Ordering (RECORD) is an algorithm for marker ordering in linkage groups. It is not an exhaustive search method and, therefore, is not computationally intensive. However, it does not guarantee that the best order is always found. The only requirement is a matrix with recombination fractions between markers.
After determining the order with RECORD, the final map is constructed using the multipoint approach (function map).
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Marcelo Mollinari, [email protected]
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetic maps. Heredity 103: 494-502.
Van Os, H., Stam, P., Visser, R.G.F. and Van Eck, H.J. (2005) RECORD: a novel method for ordering loci on a genetic linkage map. Theoretical and Applied Genetics 112: 30-40.
## outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG1 <- make_seq(groups,1)
LG1.rec <- record(LG1, hmm = FALSE)

## F2 example
data(onemap_example_f2)
twopt <- rf_2pts(onemap_example_f2)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG1 <- make_seq(groups,1)
LG1.rec <- record(LG1, hmm = FALSE)
LG1.rec
Remove individuals from the onemap object
remove_inds(onemap.obj = NULL, rm.ind = NULL, list.seqs = NULL)
onemap.obj |
object of class onemap |
rm.ind |
vector of characters with individuals names |
list.seqs |
list of objects of class sequence |
An object of class onemap without the selected individuals if a onemap object is used as input, or a list of objects of class sequence without the selected individuals if a list of sequence objects is used as input.
Cristiane Taniguti, [email protected]
Performs the two-point (pairwise) analysis proposed by Wu et al. (2002) between all pairs of markers.
rf_2pts(input.obj, LOD = 3, max.rf = 0.5, verbose = TRUE, rm_mks = FALSE)
input.obj |
an object of class |
LOD |
minimum LOD Score to declare linkage (defaults to 3). |
max.rf |
maximum recombination fraction to declare linkage (defaults to 0.5). |
verbose |
logical. If |
rm_mks |
logical. If |
For n markers, there are n(n-1)/2 pairs of markers to be analyzed. Therefore, completion of the two-point analyses can take a long time.
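To get a feel for how quickly this workload grows, the pair count can be tabulated in base R. This is only an illustrative sketch; the helper name n_pairs is hypothetical and not part of onemap.

```r
# Number of two-point tests for n markers: n * (n - 1) / 2, i.e. choose(n, 2).
# n_pairs is a hypothetical helper used only for this illustration.
n_pairs <- function(n) n * (n - 1) / 2

n_markers <- c(30, 100, 1000, 10000)
pairs_table <- data.frame(markers = n_markers, pairs = n_pairs(n_markers))
print(pairs_table)
```

The count grows quadratically, so a tenfold increase in markers means roughly a hundredfold increase in pairwise tests.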
An object of class rf_2pts, which is a list containing the following components:
n.mar |
total number of markers. |
LOD |
minimum LOD Score to declare linkage. |
max.rf |
maximum recombination fraction to declare linkage. |
input |
the name of the input file. |
analysis |
an array with the complete results of the two-point analysis for each pair of markers. |
The thresholds used for LOD and max.rf will be used in subsequent analyses, but can be overridden.
Gabriel R A Margarido [email protected] and Marcelo Mollinari [email protected]
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
data(onemap_example_out)
twopts <- rf_2pts(onemap_example_out,LOD=3,max.rf=0.5) # perform two-point analyses
twopts
print(twopts,c("M1","M2")) # detailed results for markers 1 and 2
Plots a matrix of pairwise recombination fraction or LOD Scores using a color scale. Any value of the matrix can be easily accessed using an interactive plotly-html interface, helping users to check for possible problems.
rf_graph_table( input.seq, graph.LOD = FALSE, main = NULL, inter = FALSE, html.file = NULL, mrk.axis = "numbers", lab.xy = NULL, n.colors = 4, display = TRUE )
input.seq |
an object of class |
graph.LOD |
logical. If |
main |
character. The title of the plot. |
inter |
logical. If |
html.file |
character naming the html file with the interactive graphic. |
mrk.axis |
character, "names" to display marker names in the axis, "numbers" to display marker numbers and "none" to display axis free of labels. |
lab.xy |
character vector with length 2, first component is the label of x axis and second of the y axis. |
n.colors |
integer. Number of colors in the palette. |
display |
logical. If inter |
The color scale varies from red (small distances or big LODs) to purple. When hovering over a cell, a dialog box is displayed with some information about the corresponding markers for that cell (line (y) × column (x)). They are: i) the names of the markers; ii) the numbers of the markers in the data set; iii) the segregation types; iv) the recombination fraction between the markers and v) the LOD-Score for each possible linkage phase calculated via two-point analysis. For neighboring markers, the multipoint recombination fraction is printed; otherwise, the two-point recombination fraction is printed. For markers of type D1 and D2, it is impossible to calculate the recombination fraction via two-point analysis and, therefore, the corresponding cell will be empty (white color). For cells on the diagonal of the matrix, the name, the number and the type of the marker are printed, as well as the percentage of missing data for that marker.
a ggplot graphic
Rodrigo Amadeu, [email protected]
##outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG1 <- make_seq(groups,1)
LG1.rcd <- rcd(LG1)
rf_graph_table(LG1.rcd, inter=FALSE)

##F2 example
data(onemap_example_f2)
twopt <- rf_2pts(onemap_example_f2)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)

##"pre-allocate" an empty list of length groups$n.groups (3, in this case)
maps.list<-vector("list", groups$n.groups)
for(i in 1:groups$n.groups){
  ##create linkage group i
  LG.cur <- make_seq(groups,i)
  ##ordering
  map.cur<-order_seq(LG.cur, subset.search = "sample")
  ##assign the map of the i-th group to the maps.list
  maps.list[[i]]<-make_seq(map.cur, "force")
}
Filters markers according to two-point recombination fraction and LOD thresholds. Adapted from MAPpoly.
rf_snp_filter_onemap( input.seq, thresh.LOD.rf = 5, thresh.rf = 0.15, probs = c(0.05, 1) )
input.seq |
an object of class |
thresh.LOD.rf |
LOD score threshold for recombination fraction (default = 5) |
thresh.rf |
threshold for recombination fractions (default = 0.15) |
probs |
indicates the probability corresponding to the filtering quantiles. (default = c(0.05, 1)) |
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
object of class |
twopt |
object of class |
Cristiane Taniguti, [email protected]
data("vcf_example_out")
twopts <- rf_2pts(vcf_example_out)
seq1 <- make_seq(twopts, which(vcf_example_out$CHROM == "1"))
filt_seq <- rf_snp_filter_onemap(seq1, 20, 0.5, c(0.5,1))
For a given sequence of ordered markers, computes the multipoint likelihood of alternative orders, by shuffling subsets (windows) of markers within the sequence. For each position of the window, all ws! possible orders are compared.
ripple_seq(input.seq, ws = 4, ext.w = NULL, LOD = 3, tol = 0.1, verbose = TRUE)
input.seq |
an object of class |
ws |
an integer specifying the length of the window size (defaults to 4). |
ext.w |
an integer specifying how many markers should be
considered in the vicinity of the permuted window. If
|
LOD |
threshold for the LOD-Score, so that alternative orders with LOD less than or equal to this threshold will be displayed. |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
verbose |
A logical; if TRUE, progress status information is printed. |
Large values for the window size make computations very slow, especially if there are many partially informative markers.
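The cost of a ripple pass can be estimated from the window size alone. The sketch below counts the orders compared per window (ws!) and, as an assumption not stated in this manual, takes the number of window positions to be n - ws + 1 for a window that slides one marker at a time. Both helper names are hypothetical.

```r
# Orders compared inside one window of size ws: ws!
orders_per_window <- function(ws) factorial(ws)

# ASSUMPTION: the window slides one marker at a time along n markers,
# giving n - ws + 1 positions. This is not confirmed by the manual.
window_positions <- function(n, ws) n - ws + 1

n <- 10
ws <- 3:6
cost <- data.frame(
  ws = ws,
  orders_per_window = orders_per_window(ws),
  total_orders = orders_per_window(ws) * window_positions(n, ws)
)
print(cost)
```

Because ws! grows factorially, moving from ws = 4 to ws = 6 multiplies the orders per window by 30, which is why small windows are usually preferred.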
This function does not return any value; it just produces text output to suggest alternative orders.
Gabriel R A Margarido, [email protected] and Marcelo Mollinari, [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments R package version 1.09-43
Jiang, C. and Zeng, Z.-B. (1997). Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101: 47-58.
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln, S. E. and Newburg, L. (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetic maps. Heredity 103: 494-502.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002a) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. (2002b). Linkage mapping of sex-specific differences. Genetical Research 79: 85-96
make_seq, compare, try_seq and order_seq.
#Outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
markers <- make_seq(twopt,c(27,16,20,4,19,21,23,9,24,29))
markers.map <- map(markers)
ripple_seq(markers.map)

#F2 example
data(onemap_example_f2)
twopt <- rf_2pts(onemap_example_f2)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG3 <- make_seq(groups,1)
LG3.ord <- order_seq(LG3, subset.search = "twopt", twopt.alg = "rcd", touchdown=TRUE)
LG3.ord
make_seq(LG3.ord) # get safe sequence
ord.1<-make_seq(LG3.ord,"force") # get forced sequence
ripple_seq(ord.1, ws=5)
Remove duplicated markers keeping the one with less missing data
rm_dupli_mks(onemap.obj)
onemap.obj |
object of class |
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
Cristiane Taniguti, [email protected]
The onemap sequence object contains everything users need to reproduce the complete analysis: the input onemap object, the rf_2pts result, and the genetic distances and marker order of the sequence. Therefore, a list of sequences is the only object users need to save to be able to recover the whole analysis. However, simply saving the list of sequences stores many redundant objects: one copy of the input object and of the rf_2pts result is saved for every sequence. This redundancy only appears when R saves the object.
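The redundancy described above can be reproduced with plain R lists. The sketch below is not how save_onemap_sequences is implemented; it only illustrates, with hypothetical stand-in objects and field names, that serializing a list in which every element carries the same large component writes that component once per element, whereas storing it a single time is much smaller.

```r
# Hypothetical stand-ins: one large shared "input" object and three small
# sequences that each carry it (the field names are illustrative only).
shared_input <- matrix(rnorm(1e4), nrow = 100)
sequences <- lapply(1:3, function(i) {
  list(seq.num = sample(100, 10), data = shared_input)
})

# Naive approach: the shared matrix is serialized once per sequence.
naive_bytes <- length(serialize(sequences, NULL))

# De-duplicated layout: keep the shared object once and only the small
# per-sequence parts alongside it.
compact <- list(
  data = shared_input,
  sequences = lapply(sequences, function(s) s["seq.num"])
)
compact_bytes <- length(serialize(compact, NULL))

cat("naive:", naive_bytes, "bytes; compact:", compact_bytes, "bytes\n")
```

With three sequences the naive layout is roughly three times larger; the gap widens with every additional linkage group saved.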
save_onemap_sequences(sequences.list, filename)
sequences.list |
list of |
filename |
name of the output file (Ex: my_beautiful_map.RData) |
Estimates the multipoint log-likelihood, linkage phases and recombination frequencies for a sequence of markers in a given order using seeded phases.
seeded_map( input.seq, tol = 1e-04, phase_cores = 1, seeds, verbose = FALSE, rm_unlinked = FALSE, parallelization.type = "PSOCK" )
input.seq |
an object of class |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
seeds |
A vector giving the integer encoding of phases for the first N positions of the map |
verbose |
A logical; if TRUE, progress status information is printed. |
rm_unlinked |
When some pair of markers do not follow the linkage criteria,
if |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
Markers are mapped in the order defined in the object input.seq. The best combination of linkage phases is also estimated, starting from the first position not in the given seeds. The multipoint likelihood is calculated according to Wu et al. (2002b) (Eqs. 7a to 11), assuming that the recombination fraction is the same in both parents. Hidden Markov chain codes adapted from Broman et al. (2008) were used.
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Adapted from Karl Broman (package 'qtl') by Gabriel R A Margarido, [email protected] and Marcelo Mollinari, [email protected]. Modified to use seeded phases by Bastian Schiffthaler [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments R package version 1.09-43
Jiang, C. and Zeng, Z.-B. (1997). Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101: 47-58.
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln, S. E. and Newburg, L. (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002a) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. (2002b). Linkage mapping of sex-specific differences. Genetical Research 79: 85-96
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
markers <- make_seq(twopt,c(30,12,3,14,2))
seeded_map(markers, seeds = c(4,2))
Shows which markers have segregation distortion when Bonferroni's correction is applied to the chi-square tests of Mendelian segregation.
select_segreg(x, distorted = FALSE, numbers = FALSE, threshold = NULL)
x |
an object of class onemap_segreg_test |
distorted |
a TRUE/FALSE variable to show distorted or non-distorted markers |
numbers |
a TRUE/FALSE variable to show the numbers or the names of the markers |
threshold |
a number between 0 and 1 to specify the threshold (alpha) to be considered in the test. If NULL, it uses the threshold alpha = 0.05. Bonferroni correction is applied for multiple test correction. |
a vector with marker names or numbers, according to the option for "distorted" and "numbers"
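The selection rule can be illustrated in base R. This is a minimal sketch under stated assumptions: the p-values are made up, the per-test threshold is taken to be alpha divided by the number of markers tested, and a marker is called distorted when its p-value falls strictly below that threshold. The exact comparison used inside select_segreg is not documented here.

```r
# Hypothetical p-values from chi-square segregation tests, one per marker.
p_values <- c(M1 = 0.40, M2 = 0.0004, M3 = 0.03, M4 = 0.75, M5 = 0.00001)

alpha <- 0.05
# Bonferroni correction: divide alpha by the number of tests performed.
bonferroni_threshold <- alpha / length(p_values)

# ASSUMPTION: strict inequality marks a marker as distorted.
distorted <- names(p_values)[p_values < bonferroni_threshold]
non_distorted <- names(p_values)[p_values >= bonferroni_threshold]

print(distorted)
print(non_distorted)
```

Note that M3 would be flagged at an uncorrected alpha of 0.05 but is retained once the correction is applied.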
# Loads a fake outcross dataset installed with onemap
data(onemap_example_out)
# Performs the chi-square test for all markers
Chi <- test_segregation(onemap_example_out)
# To show non-distorted markers
select_segreg(Chi)
# To show markers with segregation distortion
select_segreg(Chi, distorted=TRUE)
# To show the numbers of the markers with segregation distortion
select_segreg(Chi, distorted=TRUE, numbers=TRUE)
Extract markers of the selected type(s) from a sequence
seq_by_type(sequence, mk_type)
sequence |
object of class sequence |
mk_type |
vector of character with marker type to be selected |
New object of class sequence with the selected marker type, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
object of class |
twopt |
object of class |
Cristiane Taniguti, [email protected]
Implements the marker ordering algorithm Seriation (Buetow & Chakravarti, 1987).
seriation( input.seq, LOD = 0, max.rf = 0.5, tol = 1e-04, rm_unlinked = TRUE, size = NULL, overlap = NULL, phase_cores = 1, hmm = TRUE, parallelization.type = "PSOCK", verbose = TRUE )
input.seq |
an object of class |
LOD |
minimum LOD-Score threshold used when constructing the pairwise recombination fraction matrix. |
max.rf |
maximum recombination fraction threshold used as the LOD value above. |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
rm_unlinked |
When some pair of markers do not follow the linkage criteria,
if |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
hmm |
logical defining if the HMM must be applied to estimate multipoint genetic distances |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
verbose |
A logical; if TRUE, progress status information is printed. |
Seriation is an algorithm for marker ordering in linkage groups. It is not an exhaustive search method and, therefore, is not computationally intensive. However, it does not guarantee that the best order is always found. The only requirement is a matrix with recombination fractions between markers.
NOTE: When there are too many pairs of markers with the same value in the recombination fraction matrix, ties can occur during the ordination process and the Seriation algorithm may not work properly. This is particularly relevant for outcrossing populations with a mixture of markers of type D1 and D2. When this occurs, the function shows the following error message: There are too many ties in the ordination process - please, consider using another ordering algorithm.
After determining the order with Seriation, the final map is constructed using the multipoint approach (function map).
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
Gabriel R A Margarido, [email protected]
Buetow, K. H. and Chakravarti, A. (1987) Multipoint gene mapping using seriation. I. General methods. American Journal of Human Genetics 41: 180-188.
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetic maps. Heredity 103: 494-502.
##outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt,"all")
groups <- group(all_mark)
LG3 <- make_seq(groups,3)
LG3.ser <- seriation(LG3)
Defines the function that should be used to display the genetic map through the analysis.
set_map_fun(type = c("kosambi", "haldane"))
type |
Indicates the function that should be used, which can be
|
No return value, called for side effects
Kosambi, D. D. (1944) The estimation of map distance from recombination values. Annals of Eugenics 12: 172-175.
Marcelo Mollinari, [email protected]
Haldane, J. B. S. (1919) The combination of linkage values and the calculation of distance between the loci of linked factors. Journal of Genetics 8: 299-309.
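For reference, the two mapping functions convert a recombination fraction r into a map distance in centiMorgans as sketched below. These are the standard textbook formulas written in base R, not onemap's internal code, and the helper names are hypothetical.

```r
# Haldane (1919): assumes no crossover interference.
# d = -50 * ln(1 - 2r) centiMorgans
haldane_cM <- function(r) -50 * log(1 - 2 * r)

# Kosambi (1944): allows for some crossover interference.
# d = 25 * ln((1 + 2r) / (1 - 2r)) centiMorgans
kosambi_cM <- function(r) 25 * log((1 + 2 * r) / (1 - 2 * r))

r <- c(0.01, 0.10, 0.20, 0.30)
distances <- data.frame(
  rf = r,
  haldane = round(haldane_cM(r), 2),
  kosambi = round(kosambi_cM(r), 2)
)
print(distances)
```

For small r both functions give nearly r * 100 cM; as r grows, the Haldane distance increases faster than the Kosambi distance, so the choice matters most for sparse maps.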
Simulated data set from a backcross population.
data(simu_example_bc)
The format is:
List of 11
 $ geno         : num [1:200, 1:54] 1 2 1 1 2 2 2 1 1 2 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:200] "BC_001" "BC_002" "BC_003" "BC_004" ...
  .. ..$ : chr [1:54] "M001" "M002" "M003" "M004" ...
 $ n.ind        : int 200
 $ n.mar        : int 54
 $ segr.type    : chr [1:54] "A.H" "A.H" "A.H" "A.H" ...
 $ segr.type.num: num [1:54] 8 8 8 8 8 8 8 8 8 8 ...
 $ n.phe        : int 0
 $ pheno        : NULL
 $ CHROM        : NULL
 $ POS          : NULL
 $ input        : chr "simu_example_bc.raw"
 $ error        : num [1:10800, 1:2] 1 1 1 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:10800] "M001_BC_001" "M002_BC_001" "M003_BC_001" "M004_BC_001" ...
  .. ..$ : NULL
 - attr(*, "class")= chr [1:2] "onemap" "backcross"
A simulation of a backcross population of 200 individuals genotyped with 54 markers. There are no missing data. There are two groups, one (Chr01) with a total of 100 cM and the other (Chr10) with 150 cM. The markers are positioned equidistant from each other.
Cristiane Taniguti, [email protected]
read_onemap and read_mapmaker.
data(simu_example_bc)
# perform two-point analyses
twopts <- rf_2pts(simu_example_bc)
twopts
Simulated data set from an F2 intercross population.
data(simu_example_f2)
The format is:
List of 11
 $ geno         : num [1:200, 1:54] 1 2 1 1 2 2 1 1 1 2 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:200] "F2_001" "F2_002" "F2_003" "F2_004" ...
  .. ..$ : chr [1:54] "M001" "M002" "M003" "M004" ...
 $ n.ind        : int 200
 $ n.mar        : int 54
 $ segr.type    : chr [1:54] "C.A" "C.A" "C.A" "C.A" ...
 $ segr.type.num: num [1:54] 7 7 7 7 4 4 7 4 4 4 ...
 $ n.phe        : int 0
 $ pheno        : NULL
 $ CHROM        : NULL
 $ POS          : NULL
 $ input        : chr "simu_example_f2.raw"
 $ error        : num [1:10800, 1:4] 1 1 1 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:10800] "M001_F2_001" "M002_F2_001" "M003_F2_001" "M004_F2_001" ...
  .. ..$ : NULL
 - attr(*, "class")= chr [1:2] "onemap" "f2"
A simulation of an F2 intercross population of 200 individuals genotyped with 54 markers. There are no missing data. There are two groups, one (Chr01) with a total of 100 cM and the other (Chr10) with 150 cM. The markers are positioned equidistant from each other.
Cristiane Taniguti, [email protected]
read_onemap and read_mapmaker.
data(simu_example_f2)
# perform two-point analyses
twopts <- rf_2pts(simu_example_f2)
twopts
Simulated data set from an outcrossing population.
data(simu_example_out)
The format is:
List of 11
 $ geno         : num [1:200, 1:54] 2 1 2 1 1 2 2 2 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:200] "F1_001" "F1_002" "F1_003" "F1_004" ...
  .. ..$ : chr [1:54] "M001" "M002" "M003" "M004" ...
 $ n.ind        : int 200
 $ n.mar        : int 54
 $ segr.type    : chr [1:54] "D2.16" "D2.17" "D2.17" "D1.9" ...
 $ segr.type.num: num [1:54] 7 7 7 6 1 3 3 1 7 6 ...
 $ n.phe        : int 0
 $ pheno        : NULL
 $ CHROM        : NULL
 $ POS          : NULL
 $ input        : chr "simu_example_out.raw"
 $ error        : num [1:10800, 1:4] 1.00e-05 1.00e-05 1.00e-05 1.00 3.33e-06 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:10800] "M001_F1_001" "M002_F1_001" "M003_F1_001" "M004_F1_001" ...
  .. ..$ : NULL
 - attr(*, "class")= chr [1:2] "onemap" "outcross"
A simulation of an outcrossing population of 200 individuals genotyped with 54 markers. There are no missing data. There are two groups, one (Chr01) with a total of 100 cM and the other (Chr10) with 150 cM. The markers are positioned equidistant from each other.
Cristiane Taniguti, [email protected]
read_onemap and read_mapmaker.
data(simu_example_out)
# perform two-point analyses
twopts <- rf_2pts(simu_example_out)
twopts
Sort markers in onemap object by their position in reference genome
sort_by_pos(onemap.obj)
onemap.obj |
object of class onemap |
An object of class onemap, i.e., a list with the following components:
geno |
a matrix with integers indicating the genotypes read for each marker. Each column contains data for a marker and each row represents an individual. |
n.ind |
number of individuals. |
n.mar |
number of markers. |
segr.type |
a vector with the
segregation type of each marker, as |
segr.type.num |
a
vector with the segregation type of each marker, represented in a
simplified manner as integers, i.e. 1 corresponds to markers of type
|
input |
the name of the input file. |
n.phe |
number of phenotypes. |
pheno |
a matrix with phenotypic values. Each column contains data for a trait and each row represents an individual. |
Cristiane Taniguti, [email protected]
Split rf_2pts object by markers
split_2pts(twopts.obj, mks)
twopts.obj |
object of class rf_2pts |
mks |
markers names (vector of characters) or number (vector of integers) to be removed and added to a new rf_2pts object |
An object of class rf_2pts with only the selected markers, which is a list containing the following components:
n.mar |
total number of markers. |
LOD |
minimum LOD Score to declare linkage. |
max.rf |
maximum recombination fraction to declare linkage. |
Cristiane Taniguti, [email protected]
Receives one onemap object and a vector with markers names to be removed from the input onemap object and inserted in a new one. The output is a list containing the two onemap objects.
split_onemap(onemap.obj = NULL, mks = NULL)
onemap.obj |
object of class onemap |
mks |
markers names (vector of characters) or number (vector of integers) to be removed and added to a new onemap object |
a list whose first element is the original onemap object without the indicated markers and whose second element is the new onemap object with only the indicated markers
It suggests a LOD Score for declaring statistical significance for two-point tests for linkage between all pairs of markers, considering that multiple tests are being performed.
suggest_lod(x)
x |
an object of class |
In a somewhat naive approach, the function calculates the number of two-point tests that will be performed for all markers in the data set, and then uses this number to calculate the global alpha required to control type I error using Bonferroni's correction.
From this global alpha, the corresponding quantile from the chi-square distribution is taken and then converted to LOD Score.
This can be seen as just an initial approximation to help users to select a LOD Score for two point tests.
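The three steps above can be sketched in base R. This is an illustration under assumptions that the manual does not state: a nominal alpha of 0.05, a chi-square distribution with one degree of freedom, and the usual conversion LOD = chi-square / (2 ln 10). The function name is hypothetical, and the value returned by suggest_lod may differ.

```r
# Hypothetical re-implementation of the idea, NOT onemap's own code.
suggest_lod_sketch <- function(n_markers, alpha = 0.05) {
  # Step 1: number of two-point tests among all marker pairs.
  n_tests <- choose(n_markers, 2)
  # Step 2: Bonferroni-corrected significance level.
  alpha_corrected <- alpha / n_tests
  # Step 3: chi-square quantile (ASSUMPTION: 1 degree of freedom),
  # converted to the LOD scale by dividing by 2 * ln(10).
  chisq_quantile <- qchisq(1 - alpha_corrected, df = 1)
  chisq_quantile / (2 * log(10))
}

n <- c(50, 100, 1000)
lod_table <- data.frame(markers = n, suggested_lod = suggest_lod_sketch(n))
print(lod_table)
```

The suggested threshold rises only slowly with the number of markers, because the quantile depends on the logarithm of the number of tests.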
the suggested LOD to be used for testing linkage
data(onemap_example_bc) # Loads a fake backcross dataset installed with onemap
suggest_lod(onemap_example_bc) # A value that should be used to start the analysis
Create table with summary information about the linkage map
summary_maps_onemap(map.list, mapping_function = "kosambi")
map.list |
a map, i.e. an object of class |
mapping_function |
either "kosambi" or "haldane" |
data.frame with basic summary statistics
Jeekin Lau, [email protected]
Using OneMap internal function test_segregation_of_a_marker(), performs the Chi-square test to check if all markers in a dataset are following the expected segregation pattern, i. e., 1:1:1:1 (A), 1:2:1 (B), 3:1 (C) and 1:1 (D) according to OneMap's notation.
test_segregation(x, simulate.p.value = FALSE)
x |
an object of class |
simulate.p.value |
a logical indicating whether to compute p-values by Monte Carlo simulation. |
First, it identifies the correct segregation pattern and corresponding H0 hypothesis, and then tests it.
an object of class onemap_segreg_test, which is a list with marker name, H0 hypothesis being tested, the chi-square statistics, the associated p-values and the % of individuals genotyped. To see the object, it is necessary to print it.
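The underlying test for a single marker can be illustrated with base R's chisq.test. The genotype counts below are made up, and the 1:2:1 hypothesis corresponds to a type B marker in the notation above; this is a sketch of the statistical idea, not of onemap's internal function.

```r
# Hypothetical genotype counts for one marker expected to segregate 1:2:1.
observed <- c(AA = 48, AB = 102, BB = 50)
expected_p <- c(1, 2, 1) / 4

# Goodness-of-fit chi-square test against the expected proportions.
# For a goodness-of-fit test no continuity correction is applied.
fit <- chisq.test(observed, p = expected_p)

# Percentage of individuals genotyped, assuming 200 individuals in total
# (an illustrative number).
pct_genotyped <- 100 * sum(observed) / 200

print(fit$statistic)
print(fit$p.value)
print(pct_genotyped)
```

Passing simulate.p.value = TRUE to chisq.test computes the p-value by Monte Carlo simulation instead of the asymptotic chi-square distribution, which can help when expected counts are small.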
data(onemap_example_out) # Loads a fake outcross dataset installed with onemap
Chi <- test_segregation(onemap_example_out) # Performs the chi-square test for all markers
print(Chi) # Shows the results
Applies the chi-square test to check if markers are following the expected segregation pattern, i. e., 1:1:1:1 (A), 1:2:1 (B), 3:1 (C) and 1:1 (D) according to OneMap's notation. It does not use Yates' correction.
test_segregation_of_a_marker(x, marker, simulate.p.value = FALSE)
x |
an object of class |
marker |
the marker which will be tested for its segregation. |
simulate.p.value |
a logical indicating whether to compute p-values by Monte Carlo simulation. |
First, the function selects the correct segregation pattern, then it defines the H0 hypothesis, and then tests it, together with percentage of missing data.
a list with the H0 hypothesis being tested, the chi-square statistics, the associated p-values, and the % of individuals genotyped.
data(onemap_example_bc) # Loads a fake backcross dataset installed with onemap
test_segregation_of_a_marker(onemap_example_bc,1)
data(onemap_example_out) # Loads a fake outcross dataset installed with onemap
test_segregation_of_a_marker(onemap_example_out,1)
For a given linkage map, tries to add an additional unpositioned marker. This function estimates parameters for all possible maps including the new marker in all possible positions, while keeping the original linkage map unaltered.
try_seq(input.seq, mrk, tol = 0.1, pos = NULL, verbose = FALSE)
input.seq |
an object of class |
mrk |
the index of the marker to be tried, according to the input file. |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
pos |
defines in which position the new marker |
verbose |
if |
An object of class try, which is a list containing the following components:
ord |
a |
LOD |
a |
try.ord |
a |
data.name |
name of the object of
class |
twopt |
name of
the object of class |
Marcelo Mollinari, [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments. R package version 1.09-43.
Jiang, C. and Zeng, Z.-B. (1997) Mapping quantitative trait loci with dominant and missing markers in various crosses from two inbred lines. Genetica 101: 47-58.
Lander, E. S., Green, P., Abrahamson, J., Barlow, A., Daly, M. J., Lincoln, S. E. and Newburg, L. (1987) MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1: 174-181.
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetic maps. Heredity 103: 494-502.
Wu, R., Ma, C.-X., Painter, I. and Zeng, Z.-B. (2002a) Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61: 349-363.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. (2002b) Linkage mapping of sex-specific differences. Genetical Research 79: 85-96.
# outcrossing example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
markers <- make_seq(twopt, c(2, 3, 12, 14))
markers.comp <- compare(markers)
base.map <- make_seq(markers.comp, 1)

extend.map <- try_seq(base.map, 30)
extend.map
print(extend.map, 5) # best position
print(extend.map, 4) # second best position
Repeatedly applies the try_seq function to position each marker in a vector of markers into an already ordered sequence. Each marker in the vector "markers" is kept in the sequence if, comparing the models with and without the marker, the increase in total group size is below the threshold "cM.thr" and the difference in LOD is above the threshold "lod.thr".
try_seq_by_seq(sequence, markers, cM.thr = 10, lod.thr = -10, verbose = TRUE)
sequence |
object of class sequence with ordered markers |
markers |
vector of integers defining the marker numbers to be inserted in the |
cM.thr |
number defining the threshold for total map size increase when inserting a single marker |
lod.thr |
the difference in LOD between the models after and before inserting the marker must be higher than the value defined in this argument |
verbose |
A logical; if TRUE, progress status information is printed. |
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
name of the object of class |
twopt |
name of the object of class |
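The acceptance rule for try_seq_by_seq can be sketched in a few lines of base R. This is only one reading of the cM.thr and lod.thr arguments as documented above, not the package's code: the helper name `keep_marker` and all numbers are hypothetical, and the internal computation of map size and LOD may differ.

```r
# Illustrative acceptance rule; names and values are assumptions
keep_marker <- function(size_old, size_new, lod_old, lod_new,
                        cM.thr = 10, lod.thr = -10) {
  diff_size <- size_new - size_old  # increase in total map size (cM)
  diff_lod  <- lod_new - lod_old    # change in LOD after inserting the marker
  # keep the marker only if the map does not inflate too much
  # and the LOD does not drop below the allowed difference
  diff_size < cM.thr && diff_lod > lod.thr
}

# Small map expansion and small LOD drop
accept_small <- keep_marker(size_old = 100, size_new = 104,
                            lod_old = -250, lod_new = -253)

# Map inflated by 25 cM after inserting the marker
accept_inflated <- keep_marker(size_old = 100, size_new = 125,
                               lod_old = -250, lod_new = -253)
```

Under these assumed thresholds the first marker would be kept and the second discarded, because a single marker that stretches the map by more than cM.thr usually signals genotyping errors or a misplaced marker.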
Implements the marker ordering algorithm Unidirectional Growth (Tan & Fu, 2006).
ug( input.seq, LOD = 0, max.rf = 0.5, tol = 1e-04, rm_unlinked = TRUE, size = NULL, overlap = NULL, phase_cores = 1, hmm = TRUE, parallelization.type = "PSOCK", verbose = TRUE )
input.seq |
an object of class |
LOD |
minimum LOD-Score threshold used when constructing the pairwise recombination fraction matrix. |
max.rf |
maximum recombination fraction threshold, used in the same way as the LOD threshold above. |
tol |
tolerance for the C routine, i.e., the value used to evaluate convergence. |
rm_unlinked |
When some pair of markers do not follow the linkage criteria,
if |
size |
The center size around which an optimum is to be searched |
overlap |
The desired overlap between batches |
phase_cores |
The number of parallel processes to use when estimating the phase of a marker. (Should be no more than 4) |
hmm |
logical defining if the HMM must be applied to estimate multipoint genetic distances |
parallelization.type |
one of the supported cluster types. This should be either PSOCK (default) or FORK. |
verbose |
A logical; if TRUE, progress status information is printed. |
Unidirectional Growth (UG) is an algorithm for marker ordering in linkage groups. It is not an exhaustive search method and, therefore, is not computationally intensive. However, it does not guarantee that the best order is always found. The only requirement is a matrix with recombination fractions between markers.
After determining the order with UG, the final map is constructed using the multipoint approach (function map).
An object of class sequence, which is a list containing the following components:
seq.num |
a |
seq.phases |
a |
seq.rf |
a |
seq.like |
log-likelihood of the corresponding linkage map. |
data.name |
object of class |
twopt |
object of class |
Marcelo Mollinari, [email protected]
Mollinari, M., Margarido, G. R. A., Vencovsky, R. and Garcia, A. A. F. (2009) Evaluation of algorithms used to order markers on genetic maps. Heredity 103: 494-502.
Tan, Y. and Fu, Y. (2006) A novel method for estimating linkage maps. Genetics 173: 2383-2390.
# outcross example
data(onemap_example_out)
twopt <- rf_2pts(onemap_example_out)
all_mark <- make_seq(twopt, "all")
groups <- group(all_mark)
LG1 <- make_seq(groups, 1)
LG1.ug <- ug(LG1)

# F2 example
data(mapmaker_example_f2)
twopt <- rf_2pts(mapmaker_example_f2)
all_mark <- make_seq(twopt, "all")
groups <- group(all_mark)
LG1 <- make_seq(groups, 1)
LG1.ug <- ug(LG1)
LG1.ug
Simulated biallelic data set for a backcross population
data("vcf_example_bc")
An object of class onemap.
A total of 142 backcross individuals were genotyped with 25 markers. The data were generated from a VCF file and contain chromosome and position information for each marker. The data set is included as an example of how to convert a VCF file to OneMap input data with the functions vcf2raw and onemap_read_vcfR.
Cristiane Hayumi Taniguti, [email protected]
read_onemap for details about objects of class onemap.
data(vcf_example_bc)
plot(vcf_example_bc)
Simulated biallelic data set for an F2 population
data(vcf_example_f2)
An object of class onemap.
A total of 192 F2 individuals were genotyped with 25 markers. The data were generated from a VCF file and contain chromosome and position information for each marker. The data set is included as a reference for how to convert a VCF file to OneMap input data. It is also used for the analysis in the tutorial that comes with OneMap.
Cristiane Hayumi Taniguti, [email protected]
read_onemap for details about objects of class onemap.
data(vcf_example_f2)
# plot marker information
plot(vcf_example_f2)
Simulated biallelic data set for an outcross, i.e., an F1 population obtained by crossing two non-homozygous parents.
data(vcf_example_out)
An object of class onemap.
A total of 92 F1 individuals were genotyped with 27 markers. The data were generated from a VCF file and contain chromosome and position information for each marker. The data set is included as a reference for how to convert a VCF file to OneMap input data. It is also used for the analysis in the tutorial that comes with OneMap.
Cristiane Hayumi Taniguti, [email protected]
read_onemap for details about objects of class onemap.
data(vcf_example_out)
# plot marker information
plot(vcf_example_out)
Simulated biallelic data set for an ri self population.
data("vcf_example_riself")
The format is:
List of 10
 $ geno         : num [1:92, 1:25] 3 3 1 3 1 3 3 1 3 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:92] "ID1" "ID3" "ID4" "ID5" ...
  .. ..$ : chr [1:25] "SNP16" "SNP12" "SNP17" "SNP10" ...
 $ n.ind        : int 92
 $ n.mar        : int 25
 $ segr.type    : chr [1:25] "A.B" "A.B" "A.B" "A.B" ...
 $ segr.type.num: logi [1:25] NA NA NA NA NA NA ...
 $ n.phe        : int 0
 $ pheno        : NULL
 $ CHROM        : chr [1:25] "1" "1" "1" "1" ...
 $ POS          : int [1:25] 1791 6606 9001 11326 11702 15533 17151 18637 19146 19220 ...
 $ input        : chr "vcf_example_riself.raw"
 - attr(*, "class")= chr [1:2] "onemap" "riself"
A total of 92 RIL individuals were genotyped with 25 markers. The data were generated from a VCF file and contain chromosome and position information for each marker. The data set is included as an example of how to convert a VCF file to OneMap input data with the functions vcf2raw and onemap_read_vcfR.
Cristiane Hayumi Taniguti, [email protected]
read_onemap for details about objects of class onemap.
data(vcf_example_riself)
plot(vcf_example_riself)
These functions are defunct and no longer available.
vcf2raw()
No return value, called for side effects
Writes a genetic map to a file, based on a given map or a list of maps. The output file can be used as input to perform QTL mapping with the package R/qtl. It is also possible to create output for use with the QTL Cartographer program.
write_map(map.list, file.out)
map.list |
a map, i.e. an object of class |
file.out |
output map file. |
This function is available only for backcross, F2 and RILs.
file with genetic map information
Wang S., Basten, C. J. and Zeng Z.-B. (2010) Windows QTL Cartographer 2.5. Department of Statistics, North Carolina State University, Raleigh, NC.
Marcelo Mollinari, [email protected]
Broman, K. W., Wu, H., Churchill, G., Sen, S., Yandell, B. (2008) qtl: Tools for analyzing QTL experiments. R package version 1.09-43.
data(mapmaker_example_f2)
twopt <- rf_2pts(mapmaker_example_f2)
lg <- group(make_seq(twopt, "all"))

## "pre-allocate" an empty list of length lg$n.groups (3, in this case)
maps.list <- vector("list", lg$n.groups)

for (i in seq_len(lg$n.groups)) {
  ## create linkage group i
  LG.cur <- make_seq(lg, i)
  ## ordering
  map.cur <- order_seq(LG.cur, subset.search = "sample")
  ## assign the map of the i-th group to the maps.list
  maps.list[[i]] <- make_seq(map.cur, "force")
}

## write maps.list to a ".map" file once every group has been mapped
write_map(maps.list, tempfile(fileext = ".map"))
Converts a onemap R object to a onemap input file. The input file contains information about the mapping population. First line: cross type, which can be "outcrossing", "f2 intercross", "f2 backcross", "ri self" or "ri sib". Second line: number of individuals, number of markers, presence (1) or absence (0) of chromosome and position of the markers, and number of phenotypes measured. Third line: individual/sample names. Following lines: marker name, marker type and genotypes, one line for each marker. Final lines: chromosome, position and phenotype information. See the vignettes for more about the input file format.
write_onemap_raw(onemap.obj = NULL, file.name = NULL)
onemap.obj |
object of class 'onemap' |
file.name |
a character for the onemap raw file name. Default is "out.raw" |
A onemap input file.
Cristiane Taniguti, [email protected]
read_onemap for a description of the output object of class onemap.
data(onemap_example_out)
write_onemap_raw(onemap_example_out, file.name = paste0(tempfile(), ".raw"))
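To see the file layout described for write_onemap_raw, the first lines of a written file can be inspected with base R. This is a small sketch under assumptions: it only runs the onemap part when the package is installed, the variable names `raw_path` and `header` are illustrative, and the exact header text depends on the package version.

```r
# Temporary path for the raw file, so the working directory stays clean
raw_path <- tempfile(fileext = ".raw")

# Guarded so the snippet does nothing when onemap is not installed
if (requireNamespace("onemap", quietly = TRUE)) {
  data("onemap_example_out", package = "onemap")
  onemap::write_onemap_raw(onemap_example_out, file.name = raw_path)

  # First line: cross type; second line: counts and chromosome/position flags
  header <- readLines(raw_path, n = 2)
  print(header)
}
```

Writing to a temporary file, as the package example also does, avoids leaving an "out.raw" file behind in the working directory.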