This function constructs the machine learning model. The output object can be used as input for plotting and further analyses.
Usage
ml_model(
  processed_se,
  char = "none",
  transform = c("none", "log10", "square", "cube"),
  ranking_method = c("p_value", "pvalue_FC", "ROC", "Random_forest", "SVM", "Lasso",
    "Ridge", "ElasticNet"),
  ml_method = c("Random_forest", "SVM", "Lasso", "Ridge", "ElasticNet", "xgboost"),
  split_prop = 0.3,
  nfold = 10,
  alpha = NULL
)
Arguments
- processed_se
A SummarizedExperiment object constructed by as_summarized_experiment and processed by data_process.
- char
Character list. Lipid characteristics selected from the ml_char list returned by list_lipid_char. Select 'none' to exclude all lipid characteristics.
- transform
Character. Method for data transformation. Allowed methods are "none", "log10", "square", and "cube". Select 'none' to skip data transformation. Default is 'log10'.
- ranking_method
Character. The ranking method to be computed. Allowed methods are 'p_value', 'pvalue_FC', 'ROC', 'Random_forest', 'SVM', 'Lasso', 'Ridge', and 'ElasticNet'. Default is 'Random_forest'.
- ml_method
Character. The machine learning method to be computed. Allowed methods are 'Random_forest', 'SVM', 'Lasso', 'Ridge', 'ElasticNet', and 'xgboost'. Default is 'Random_forest'.
- split_prop
Numeric. The proportion of data to be retained for modeling/analysis. The range is 0.1 to 0.5. Default is 0.3.
- nfold
Numeric. The number of folds, i.e. the number of equal-sized subsamples into which the original dataset is randomly partitioned. Must be a positive integer. Default is 10.
- alpha
Numeric. The alpha value, between 0 and 1, used when "ElasticNet" is chosen as ml_method. 0 corresponds to Ridge and 1 to Lasso. If "ElasticNet" is not selected as the ml_method, set the value to NULL.
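To make the role of alpha concrete, the following base R sketch computes an elastic net penalty for a fixed coefficient vector. It assumes the common glmnet-style parameterization, which this page does not confirm as the backend used by ml_model. The function name elastic_net_penalty, the lambda argument, and the coefficient values are hypothetical and exist only for illustration.

```r
# Elastic net penalty in the glmnet-style parameterization (an assumption:
# this page does not state which backend ml_model() uses).
elastic_net_penalty <- function(beta, alpha, lambda = 1) {
  stopifnot(is.numeric(alpha), length(alpha) == 1, alpha >= 0, alpha <= 1)
  l1 <- sum(abs(beta))    # Lasso (L1) term
  l2 <- sum(beta^2) / 2   # Ridge (L2) term
  lambda * (alpha * l1 + (1 - alpha) * l2)
}

beta <- c(0.5, -2, 0)     # hypothetical coefficients

elastic_net_penalty(beta, alpha = 0)    # pure Ridge: 2.125
elastic_net_penalty(beta, alpha = 1)    # pure Lasso: 2.5
elastic_net_penalty(beta, alpha = 0.5)  # equal mix:  2.3125
```

With alpha = 0 only the squared term contributes, and with alpha = 1 only the absolute-value term contributes, which matches the description of 0 as Ridge and 1 as Lasso.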
Examples
data("ml_sub")
processed_se <- data_process(
    ml_sub, exclude_missing=TRUE, exclude_missing_pct=70,
    replace_na_method='min', replace_na_method_ref=0.5,
    normalization='Percentage', transform='log10')
char_list <- list_lipid_char(processed_se)
ml_se <- ml_model(
    processed_se, char=c("class","Total.DB"), transform='log10',
    ranking_method='Random_forest', ml_method='Random_forest', split_prop=0.3,
    nfold=5, alpha=NULL)
#> Processing CV fold 1
#> Registered S3 methods overwritten by 'proxy':
#> method from
#> print.registry_field registry
#> print.registry_entry registry
#> Processing CV fold 2
#> Processing CV fold 3
#> Processing CV fold 4
#> Processing CV fold 5
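The five "Processing CV fold" messages above correspond to nfold=5. As a rough illustration of what a random partition into near-equal folds can look like, here is a minimal base R sketch. It is not the package's implementation: how ml_model assigns samples to folds is not documented on this page, and n_samples, fold_id, test_idx, and train_idx are hypothetical names.

```r
set.seed(1)        # make the random shuffle reproducible
n_samples <- 23    # hypothetical number of samples
nfold <- 5

# Give every sample a fold label in 1..nfold; fold sizes differ by at most one.
fold_id <- sample(rep(seq_len(nfold), length.out = n_samples))

for (k in seq_len(nfold)) {
  test_idx  <- which(fold_id == k)   # samples held out in this fold
  train_idx <- which(fold_id != k)   # samples available for fitting
  message("Processing CV fold ", k, ": ",
          length(train_idx), " train / ", length(test_idx), " test")
}
```

Each sample is held out exactly once across the loop, so every observation contributes to both fitting and evaluation.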