Title: | Genotyping Triploids (or Diploids) from Luminescence Data |
---|---|
Description: | Genotyping of triploid individuals from luminescence data (marker probesets A and B). It also works for diploids. Two main functions: Run_Clustering(), which groups individuals sharing the same genotype based on proximity, and Run_Genotyping(), which assigns a genotype to each cluster. For the Shiny interface, use launch_GenoShiny(). |
Authors: | Julien Roche [aut, cre], Florence Phocas [aut], Mathieu Besson [aut], Pierre Patrice [aut], Marc Vandeputte [aut], François Allal [aut], Pierrick Haffray [aut] |
Maintainer: | Julien Roche <[email protected]> |
License: | GPL |
Version: | 1.1.3 |
Built: | 2025-03-25 17:40:30 UTC |
Source: | https://github.com/cran/GenoTriplo |
Runs the clustering without parallelization or automatic saving.
Clustering( dataset, nb_clust_possible, n_iter = 5, Dmin = 0.28, SampleName = NULL )
dataset |
dataset with Contrast and SigStren for each individual (as SampleName) and each marker (as MarkerName) |
nb_clust_possible |
number of possible clusters (ploidy+1) |
n_iter |
number of iterations to perform for clustering |
Dmin |
minimal distance between two clusters |
SampleName |
vector with all SampleName values (important when genotypes are missing) |
list of results of clustering
data(GenoTriplo_to_clust)
ploidy=3
res = Clustering(dataset=GenoTriplo_to_clust, nb_clust_possible=ploidy+1, n_iter=5)
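To make the `nb_clust_possible = ploidy+1` convention concrete, the sketch below lists the genotype classes a biallelic marker can take at a given ploidy. The helper `possible_genotypes()` is hypothetical and is not part of GenoTriplo; it only illustrates why a triploid has four possible clusters and a diploid three.

```r
# Hypothetical helper (not a GenoTriplo function): enumerate the genotype
# classes of a biallelic A/B marker for a given ploidy.
possible_genotypes <- function(ploidy) {
  n_b <- 0:ploidy  # number of B alleles carried, from 0 to ploidy
  vapply(
    n_b,
    function(k) paste0(strrep("A", ploidy - k), strrep("B", k)),
    character(1)
  )
}

triploid <- possible_genotypes(3)  # AAA, AAB, ABB, BBB
diploid  <- possible_genotypes(2)  # AA, AB, BB

print(triploid)
print(diploid)
```

The number of classes is always `ploidy + 1`, which matches the value passed as `nb_clust_possible` in the example above.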
Creates the SigStren and Contrast variables from the luminescence values of probesets A and B of each marker. Returns a dataframe to be used for clustering, or saves the result if a saving name is given.
Create_Dataset(data, save_name = NULL)
data |
dataframe with probeset_id as the first variable (marker name ending in -A or -B depending on the probeset) and one variable per individual holding the luminescence value for each probeset (dataset created by bash code by shiny app) |
save_name |
saving name |
number of individuals and markers (automatically save the dataset)
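The input layout described above can be illustrated with a small base R sketch that does not call GenoTriplo. The luminescence values, marker names and sample names are made up, and the formulas are an assumption: Contrast is taken as log2(A/B) and SigStren as the mean of log2(A) and log2(B), which is a common convention for this kind of array data. The exact computation performed by Create_Dataset() is not stated here and may differ.

```r
# Toy input in the layout described for Create_Dataset(): probeset_id first
# (marker name ending in -A or -B), then one column per individual.
# All values and names below are invented for illustration.
raw <- data.frame(
  probeset_id = c("AX-1-A", "AX-1-B", "AX-2-A", "AX-2-B"),
  ind1 = c(1024, 256, 512, 512),
  ind2 = c(256, 1024, 2048, 512),
  stringsAsFactors = FALSE
)

# Split probeset_id into the marker name and the probeset letter
raw$MarkerName <- sub("-[AB]$", "", raw$probeset_id)
raw$Probeset   <- sub("^.*-", "", raw$probeset_id)

samples <- c("ind1", "ind2")
a <- raw[raw$Probeset == "A", ]
b <- raw[raw$Probeset == "B", ]
b <- b[match(a$MarkerName, b$MarkerName), ]  # align B rows with A rows

# ASSUMPTION: Contrast = log2(A / B), SigStren = (log2(A) + log2(B)) / 2
long <- do.call(rbind, lapply(samples, function(s) {
  data.frame(
    SigStren   = (log2(a[[s]]) + log2(b[[s]])) / 2,
    Contrast   = log2(a[[s]]) - log2(b[[s]]),
    SampleName = s,
    MarkerName = a$MarkerName,
    stringsAsFactors = FALSE
  )
}))

print(long)
```

The result has one row per individual and marker with the four columns SigStren, Contrast, SampleName and MarkerName, which is the shape expected by the clustering functions.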
Example of dataset for clustering
GenoTriplo_to_clust
A dataframe with 500 rows (each corresponding to one individual for a given marker) and 4 columns (SigStren, Contrast, SampleName, MarkerName)
Example of dataset for genotyping
GenoTriplo_to_geno
A list of 10 elements, each being the result of clustering for a given marker
Launches a Shiny interface for GenoTriplo. It is easy to use and user friendly, and will help you save time.
launch_GenoShiny()
void: most results are saved automatically
Launches the clustering phase in parallel from the dataset with SampleName, Contrast and SigStren for each marker (MarkerName).
Run_Clustering( data_clustering, ploidy, save_n = "", n_iter = 5, D_min = 0.28, n_core = 1, path_log = "" )
data_clustering |
dataframe resulting from the dataset creation phase |
ploidy |
ploidy of offspring |
save_n |
name of the saving file |
n_iter |
number of iterations of clustering |
D_min |
threshold distance between two clusters |
n_core |
number of cores used for parallelization |
path_log |
path for log file when run by the shiny app |
the result of clustering, or, if a saving name has been provided, a list of objects that is saved automatically
data(GenoTriplo_to_clust)
res = Run_Clustering(data_clustering=GenoTriplo_to_clust, ploidy=3, n_iter=5, n_core=1)
# or if you want to automatically save the result
# This will automatically create a folder and save the result in it
# Run_Clustering(data_clustering=GenoTriplo_to_clust,
#                ploidy=3, n_iter=5, n_core=1, save_n='exemple')
Launches the genotyping phase from the dataset with SampleName, Contrast and SigStren for each marker and the result of the 'Run_Clustering' function.
Run_Genotyping( data_clustering, res_clust, ploidy, SeuilNoCall = 0.85, SeuilNbSD = 2.8, SeuilSD = 0.28, n_core = 1, corres_ATCG = NULL, pop = "Yes", cr_marker = 0.97, fld_marker = 3.4, hetso_marker = -0.3, save_n = "", batch = "", ALL = TRUE, path_log = "" )
data_clustering |
dataframe resulting from the dataset creation phase |
res_clust |
object from clustering phase |
ploidy |
ploidy of offspring |
SeuilNoCall |
threshold of the probability of belonging to a cluster |
SeuilNbSD |
threshold for the distance between an individual and its cluster (x=Contrast) |
SeuilSD |
threshold for the standard deviation of a cluster (SeuilSD*(1+0.5*abs(mean_contrast_cluster))) |
n_core |
number of cores used for parallelization |
corres_ATCG |
dataframe with the correspondence between A/B of AXAS and A/T/C/G (three columns : probeset_id, Allele_A, Allele_B) |
pop |
Yes or No: whether the individuals come from the same population |
cr_marker |
call rate threshold |
fld_marker |
FLD threshold |
hetso_marker |
HetSO threshold |
save_n |
name of the saving file. If "", there is no auto save and the return value changes |
batch |
batch number in case of parallelization; otherwise ignore |
ALL |
TRUE/FALSE whether the dataset has been cut or not (from the shiny app) |
path_log |
path for log file when run by the shiny app |
If save_n != "": a list of 3 objects (a dataframe with the call rate by individual, a dataframe with the call rate and other metrics of markers, and another dataframe); results are saved automatically. Otherwise: returns a list with the genotypes.
data(GenoTriplo_to_clust)
data(GenoTriplo_to_geno)
res = Run_Genotyping(data_clustering=GenoTriplo_to_clust, res_clust=GenoTriplo_to_geno, ploidy=3)
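The SeuilSD argument description gives the expression `SeuilSD*(1+0.5*abs(mean_contrast_cluster))` for the standard deviation threshold of a cluster. The sketch below evaluates that expression with the default `SeuilSD = 0.28` for a few made-up cluster means. The function name `sd_threshold()` is hypothetical, and how Run_Genotyping() applies the threshold internally is not shown here.

```r
# Hypothetical helper (not a GenoTriplo function): evaluate the threshold
# expression quoted in the SeuilSD argument description.
sd_threshold <- function(mean_contrast_cluster, SeuilSD = 0.28) {
  SeuilSD * (1 + 0.5 * abs(mean_contrast_cluster))
}

# Made-up mean Contrast values for three clusters
cluster_means <- c(-2, 0, 2)
thresholds <- sd_threshold(cluster_means)

print(data.frame(mean_contrast = cluster_means, sd_threshold = thresholds))
```

With these values, the threshold is 0.28 for a cluster centred on a Contrast of 0 and 0.56 for clusters centred on -2 or 2, so the expression allows clusters further from zero to be more dispersed.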