This function processes the abundance data according to user-selected options: removing features with missing values, imputing missing values, normalizing, and transforming.
Usage
data_process(
se,
exclude_missing = TRUE,
exclude_missing_pct = 70,
replace_na_method = c("none", "QRILC", "SVD", "KNN", "IRMI", "min", "mean", "median",
"PPCA", "BPCA", "RandomForest"),
replace_na_method_ref = 0.5,
normalization = c("none", "Percentage", "PQN", "Quantile", "Sum", "Median"),
transform = c("none", "log10", "cube", "square")
)
Arguments
- se
A SummarizedExperiment object constructed by as_summarized_experiment.
- exclude_missing
Logical. If exclude_missing=TRUE, lipids whose percentage of missing values exceeds exclude_missing_pct will be removed. Default is TRUE.
- exclude_missing_pct
Numeric. The missing-value threshold, as a percentage (5-100). Lipids with missing values over this percentage are removed. Default is 70.
- replace_na_method
Character. The method for replacing NA values. Allowed methods include "QRILC", "SVD", "KNN", "IRMI", "min", "mean", "median", "PPCA", "BPCA", "RandomForest", and "none". If you have already replaced NAs, select 'none'. Default is 'min'.
- replace_na_method_ref
Numeric. The reference value used by the selected NA replacement method. Its meaning and allowed range depend on the method:
QRILC: 0.1-1
SVD: 1-10
KNN: 1-10
min: 0.1-0.5
PPCA: 1-10
BPCA: 1-10
Default is 0.5 for replace_na_method='min'.
- normalization
Character. Normalization function. Allowed methods include "Percentage", "PQN", "Quantile", "Sum", "Median", and "none". If you have already normalized the abundance values, select 'none'. Default is 'Percentage'.
- transform
Character. Transformation function. Allowed methods include "log10", "cube", "square", and "none". If you have already transformed the abundance values, select 'none'. Default is 'log10'.
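To make the effect of these options concrete, the following base R sketch walks a toy abundance matrix through the four steps in the order the arguments are listed. It is an illustration only, not the internals of data_process: the toy matrix, the variable names, the exact cutoff rule for the missing percentage, and the reading of replace_na_method_ref as a multiplier on each lipid's smallest observed value are all assumptions.

```r
# Toy abundance matrix (hypothetical data): rows are lipids, columns are samples.
abund <- matrix(
  c(10, 20, NA, 40,
    NA, NA, NA,  8,
     5, 15, 25, 35),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("lipid_A", "lipid_B", "lipid_C"), paste0("S", 1:4))
)

# Step 1 (exclude_missing, exclude_missing_pct = 70):
# drop lipids whose percentage of missing values reaches the threshold.
# Whether the real cutoff is ">" or ">=" is an assumption.
missing_pct <- rowMeans(is.na(abund)) * 100
kept <- abund[missing_pct < 70, , drop = FALSE]

# Step 2 (replace_na_method = 'min', replace_na_method_ref = 0.5):
# assumed to mean "replace each NA with 0.5 times that lipid's minimum".
imputed <- t(apply(kept, 1, function(x) {
  x[is.na(x)] <- 0.5 * min(x, na.rm = TRUE)
  x
}))

# Step 3 (normalization = 'Percentage'):
# rescale every sample (column) so that its abundances sum to 100.
normalized <- sweep(imputed, 2, colSums(imputed), "/") * 100

# Step 4 (transform = 'log10').
transformed <- log10(normalized)
```

In this toy case lipid_B is missing in three of four samples (75%), so it is dropped before imputation, and the single NA left in lipid_A is filled from that lipid's own observed values.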
Examples
data("de_data_twoGroup")
processed_data <- data_process(de_data_twoGroup, exclude_missing=TRUE,
exclude_missing_pct=70, replace_na_method='min', replace_na_method_ref=0.5,
normalization='Percentage', transform='log10')