Chapter 15 Combining datasets
15.1 Introduction
This page describes how the function r_combine_datasets is used to create a new combined object. Creating this object is the first main step of any combination analysis. In particular, it is the first step of the MFA and PLS analyses.
15.1.1 Used datasets
The workflow will be illustrated on the protein dataset, the clinical dataset and the mRNA dataset. These datasets do not have exactly the same rows.
input <- "../forge/backend/R/data/protein.csv"
r_wrapp("r_import", input, data.name = "proteins", sep = " ", row.names = 1)
input <- "../forge/backend/R/data/clinical.csv"
r_wrapp("r_import", input, data.name = "clinical", row.names = 1)
input <- "../forge/backend/R/data/mrna.csv"
r_wrapp("r_import", input, data.name = "mrna", row.names = 1)
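To make the remark about non-matching rows concrete, here is a minimal base R sketch of how the overlap between row names could be checked. The three data frames are small hypothetical stand-ins, not the actual protein, clinical and mRNA files, and this is not taken from the implementation of r_import or r_combine_datasets.

```r
# Toy stand-ins for the imported datasets (hypothetical values and row names).
proteins <- data.frame(p1 = c(1.2, 0.8, 1.5), row.names = c("A", "B", "C"))
clinical <- data.frame(age = c(54, 61, 47), row.names = c("B", "C", "D"))
mrna     <- data.frame(g1 = c(3.1, 2.7, 2.9, 3.4),
                       row.names = c("A", "B", "C", "D"))

datasets <- list(proteins = proteins, clinical = clinical, mrna = mrna)

# Row names present in every dataset.
common_rows <- Reduce(intersect, lapply(datasets, rownames))

# For each dataset, the row names that are not shared by all datasets.
not_shared <- lapply(datasets, function(d) setdiff(rownames(d), common_rows))

print(common_rows)
print(not_shared)
```

In this toy case only the rows "B" and "C" are shared by the three datasets.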
15.2 Function call and options
The function r_combine_datasets has the following options:
datasetNames: list (required, no default) of the names (character) of the datasets used as inputs.
userName: character (optional; if omitted, the default name of the output object is used) giving the name chosen by the user for the complex object returned by the function.
out_combine <- r_wrapp("r_combine_datasets",
                       list("proteins", "mrna", "clinical"))
15.2.1 State of the workspace after the function call
After the function call, the R workspace contains the following objects, where the combined analysis contains information on its parent datasets, ordered alphabetically:
print(names(object_db))
## [1] "proteins" "clinical" "mrna" "combinedDF_1"
jsonview::json_tree_view(
  jsonlite::toJSON(graph_db, pretty = TRUE, auto_unbox = TRUE),
  scroll = T
)
15.3 Output of the function
In addition to the created object (whose name is also returned in the entry ObjectName of the output of the r_wrapp call), the function returns descriptive statistics and plots that are meant to be displayed to the user.
15.3.1 Returned tables
Two tables provide descriptive statistics before and, if it is performed, after the filtering step, which removes all rows that are not common to all datasets. This information is provided in the entries dataInfoBefore and dataInfoAfter (the latter is not always provided) of the output of the r_wrapp call.
jsonview::json_tree_view(
  jsonlite::toJSON(combinedDF_1$Table$dataInfoBefore,
                   pretty = TRUE, auto_unbox = TRUE),
  scroll = T
)
jsonview::json_tree_view(
  jsonlite::toJSON(combinedDF_1$Table$dataInfoAfter,
                   pretty = TRUE, auto_unbox = TRUE),
  scroll = T
)
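The filtering step described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the datasets are hypothetical toy data frames, describe is a hypothetical helper, and the summaries computed here (numbers of rows and columns) are not necessarily the statistics stored in dataInfoBefore and dataInfoAfter.

```r
# Toy datasets (hypothetical); only rows "B" and "C" are shared by all three.
datasets <- list(
  proteins = data.frame(p1 = c(1.2, 0.8, 1.5), row.names = c("A", "B", "C")),
  clinical = data.frame(age = c(54, 61, 47), row.names = c("B", "C", "D")),
  mrna     = data.frame(g1 = c(3.1, 2.7, 2.9, 3.4),
                        row.names = c("A", "B", "C", "D"))
)

# Hypothetical helper: one line of dimensions per dataset.
describe <- function(dfs) {
  data.frame(dataset = names(dfs),
             nrow = vapply(dfs, nrow, integer(1)),
             ncol = vapply(dfs, ncol, integer(1)),
             row.names = NULL)
}

info_before <- describe(datasets)

# Keep only the rows that are common to all datasets.
common_rows <- Reduce(intersect, lapply(datasets, rownames))
filtered <- lapply(datasets, function(d) d[common_rows, , drop = FALSE])

info_after <- describe(filtered)

print(info_before)
print(info_after)
```

When all datasets already share the same rows, the filtering changes nothing, which is consistent with the second table not always being provided.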
15.3.2 Returned plots
Two plots are returned that provide information on the individuals common to all datasets. The upset plot is provided in the entry UpsetPlot of the output of the r_wrapp call. It is a list meant for JSON conversion. Here is a truncated version:
jsonview::json_tree_view(
  jsonlite::toJSON(list(type = combinedDF_1$Graphical$UpsetPlot$type,
                        data = combinedDF_1$Graphical$UpsetPlot$data[1:10]),
                   pretty = TRUE, auto_unbox = TRUE),
  scroll = T
)
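As a rough idea of the kind of information an upset plot summarizes, the base R sketch below counts how many individuals belong to each exact combination of datasets. The row names are hypothetical, and the resulting structure is an assumption made for illustration; it is not the format of the UpsetPlot entry.

```r
# Row names of three toy datasets (hypothetical).
ids <- list(
  proteins = c("A", "B", "C"),
  clinical = c("B", "C", "D"),
  mrna     = c("A", "B", "C", "D")
)

all_ids <- sort(unique(unlist(ids, use.names = FALSE)))

# Logical membership matrix: one row per individual, one column per dataset.
membership <- sapply(ids, function(x) all_ids %in% x)
rownames(membership) <- all_ids

# Label each individual by the exact set of datasets it belongs to,
# then count the individuals in each combination.
combo <- apply(membership, 1,
               function(m) paste(names(ids)[m], collapse = "&"))
combo_counts <- table(combo)

print(membership)
print(combo_counts)
```

Each count corresponds to one bar of an upset plot; the same counts fill the regions of a Venn diagram.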
The Venn diagram is given in the entry VennPlot of the output of the r_wrapp call. It is a list meant for JSON conversion:
jsonview::json_tree_view(
  jsonlite::toJSON(combinedDF_1$Graphical$VennPlot,
                   pretty = TRUE, auto_unbox = TRUE),
  scroll = T
)