This function constructs the machine learning model. The output object can be used as input for plotting and further analyses.
ml_model(
processed_se,
char = "none",
transform = c("none", "log10", "square", "cube"),
ranking_method = c("p_value", "pvalue_FC", "ROC", "Random_forest", "SVM", "Lasso",
"Ridge", "ElasticNet"),
ml_method = c("Random_forest", "SVM", "Lasso", "Ridge", "ElasticNet", "xgboost"),
split_prop = 0.3,
nfold = 10,
alpha = NULL
)
processed_se
A SummarizedExperiment object constructed by as_summarized_experiment and processed by data_process.
char
Character list. Lipid characteristics selected from the ml_char list returned by list_lipid_char. Select 'none' to exclude all lipid characteristics.
transform
Character. Method for data transformation. Allowed methods include 'none', 'log10', 'square', 'cube'. Select 'none' to skip data transformation. Default is 'log10'.
ranking_method
Character. The feature ranking method to use. Allowed methods include 'p_value', 'pvalue_FC', 'ROC', 'Random_forest', 'SVM', 'Lasso', 'Ridge', 'ElasticNet'. Default is 'Random_forest'.
ml_method
Character. The machine learning method to use. Allowed methods include 'Random_forest', 'SVM', 'Lasso', 'Ridge', 'ElasticNet', 'xgboost'. Default is 'Random_forest'.
split_prop
Numeric. The proportion of data to be retained for modeling/analysis. The range is 0.1 to 0.5. Default is 0.3.
nfold
Numeric. The number of folds, i.e. the number of equal-sized subsamples into which the original dataset is randomly partitioned. Must be a positive integer. Default is 10.
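The random partition into equal-sized subsamples described for nfold can be sketched in base R as follows. This is a minimal illustration only: `assign_folds` is a hypothetical helper that is not part of the package, and the source does not state how ml_model itself draws the folds or how nfold interacts with split_prop.

```r
# Hypothetical helper (not part of the package): randomly assign n samples
# to nfold folds whose sizes differ by at most one.
assign_folds <- function(n, nfold = 10, seed = 1) {
  stopifnot(nfold >= 1, nfold == round(nfold), n >= nfold)
  set.seed(seed)
  # Repeat the fold labels 1..nfold up to length n, then shuffle them
  sample(rep(seq_len(nfold), length.out = n))
}

folds <- assign_folds(n = 23, nfold = 5)
table(folds)                     # number of samples per fold

# In each round, one fold is held out and the rest are used for fitting
test_idx  <- which(folds == 1)
train_idx <- which(folds != 1)
```

With a small sample size, a lower nfold (such as the 5 used in the example for this function) keeps each fold from becoming too small.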
alpha
Numeric. The alpha value, between 0 and 1, used when "ElasticNet" is chosen as ml_method. A value of 0 corresponds to Ridge and 1 to Lasso. If "ElasticNet" is not selected as the ml_method, set the value to NULL.
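To make the role of alpha concrete, the sketch below computes an elastic-net penalty in the widely used glmnet-style form, where alpha mixes the L1 (Lasso) and L2 (Ridge) terms. It assumes that convention, which matches the "0 is Ridge, 1 is Lasso" description above, but the source does not confirm the exact formula used internally. `elastic_net_penalty` and `lambda` are hypothetical names introduced for illustration.

```r
# Illustrative only (assumed glmnet-style form, not taken from the package):
# penalty = lambda * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2)
elastic_net_penalty <- function(beta, alpha, lambda = 1) {
  stopifnot(is.numeric(alpha), length(alpha) == 1, alpha >= 0, alpha <= 1)
  l1 <- sum(abs(beta))    # Lasso term
  l2 <- sum(beta^2) / 2   # Ridge term
  lambda * (alpha * l1 + (1 - alpha) * l2)
}

beta <- c(0.5, -2, 0)
elastic_net_penalty(beta, alpha = 1)    # pure Lasso penalty: 2.5
elastic_net_penalty(beta, alpha = 0)    # pure Ridge penalty: 2.125
elastic_net_penalty(beta, alpha = 0.5)  # equal mix of both: 2.3125
```

Intermediate values trade off the sparsity encouraged by the L1 term against the shrinkage of the L2 term.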
Returns a SummarizedExperiment object containing the analysis results.
# Load the example dataset
data("ml_sub")
# Filter, impute, and normalize the data
processed_se <- data_process(
    ml_sub, exclude_missing=TRUE, exclude_missing_pct=70,
    replace_na_method='min', replace_na_method_ref=0.5,
    normalization='Percentage')
# List the lipid characteristics; values for char come from its ml_char list
char_list <- list_lipid_char(processed_se)
# Build the model with 5 cross-validation folds
ml_se <- ml_model(
    processed_se, char=c("class","Total.DB"), transform='log10',
    ranking_method='Random_forest', ml_method='Random_forest', split_prop=0.3,
    nfold=5, alpha=NULL)
#> Processing CV fold 1
#> Processing CV fold 2
#> Processing CV fold 3
#> Processing CV fold 4
#> Processing CV fold 5