Chapter 7 Variates: r_univariate, r_bivariate, r_multivariate, r_univariate_dataset
proteins <- read.table("../forge/backend/R/data/protein.csv", sep = " ",
                       quote = '\"', dec = ".", row.names = 1)
clinical <- read.table("../forge/backend/R/data/clinical.csv", sep = ",",
                       quote = '\"', dec = ".", row.names = 1)
7.1 Univariate
Analysis on one variable from one dataset.
The function r_univariate takes two inputs:
- dataset: the name of the dataset, as a character string. For internal use in R, the dataset can also be passed directly, without the quotes.
- varname: the name of the variable, as a character string.
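The two accepted forms of the dataset argument can be illustrated with a small sketch. The helper resolve_dataset below is hypothetical and is not part of the documented functions; it only shows one plausible way, using base R's get(), to accept either a quoted name or the object itself.

```r
# Hypothetical helper, not part of the documented API.
resolve_dataset <- function(dataset, envir = globalenv()) {
  if (is.character(dataset)) {
    # A character string is treated as the name of an object to look up.
    get(dataset, envir = envir)
  } else {
    # Anything else is assumed to already be the data frame itself.
    dataset
  }
}

# Toy data standing in for a real dataset.
toy <- data.frame(age = c(34, 51, 47), gender = c("F", "M", "F"))

by_name   <- resolve_dataset("toy", envir = environment())
by_object <- resolve_dataset(toy)
identical(by_name, by_object)
```

Both calls are expected to yield the same data frame, which is the behaviour the two input forms suggest.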
The outputs depend on the type of the variable: factor (i.e. non-numerical) or numerical.
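As a rough illustration of this type-dependent behaviour, the sketch below uses only base R. The function summarise_variable is hypothetical and much simpler than r_univariate; it assumes that any non-numerical variable is treated as a factor.

```r
# Hypothetical sketch: summarise one variable according to its type.
summarise_variable <- function(x) {
  if (is.numeric(x)) {
    # Numerical variable: minimum, quartiles, mean and maximum.
    summary(x)
  } else {
    # Anything non-numerical is treated as a factor: count each level.
    table(factor(x), useNA = "ifany")
  }
}

# Toy data; the column names are illustrative only.
toy <- data.frame(
  quantity = c(1.2, 3.4, 2.8, 5.1),
  status   = c("consented", "deceased", "consented", "consented")
)

num_summary <- summarise_variable(toy$quantity)
fac_summary <- summarise_variable(toy$status)
num_summary
fac_summary
```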
7.1.1 Factor variables
Example for a factor variable:
out <- r_univariate("clinical", "patient.clinical_cqcf.consent_or_death_status")
names(out)
## [1] "Table" "Graphical"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
The Graphical component:
out$Graphical
## $Barplot
7.1.2 Numerical variables
Example for a numerical variable:
# examples univariate
out <- r_univariate(clinical, "patient.samples.sample.2.portions.portion.analytes.analyte.aliquots.aliquot.quantity")
names(out)
## [1] "Table" "Graphical"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
The Graphical component:
names(out$Graphical)
## [1] "Boxplot" "Histogram" "Density" "Violin" "Stripchart"
out$Graphical$Boxplot
out$Graphical$Histogram
out$Graphical$Density
out$Graphical$Violin
out$Graphical$Stripchart
7.2 Bivariate
Cross-analysis on two variables, from one or two datasets.
The function r_bivariate takes four inputs:
- dataset1: the name of the first dataset, as a character string. For internal use in R, the dataset can also be passed directly, without the quotes.
- varname1: the name of the first variable, as a character string.
- dataset2: the name of the second dataset, as a character string. For internal use in R, the dataset can also be passed directly, without the quotes.
- varname2: the name of the second variable, as a character string.
The outputs depend on the types of the two variables: factor (i.e. non-numerical) or numerical.
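The three type combinations covered in the subsections that follow can be sketched with base R. The function cross_summary is hypothetical; the statistics it picks (Pearson correlation, contingency table, per-level mean) are assumptions made for illustration and are not taken from r_bivariate.

```r
# Hypothetical sketch: pick a cross-analysis from the types of two variables.
cross_summary <- function(x, y) {
  x_num <- is.numeric(x)
  y_num <- is.numeric(y)
  if (x_num && y_num) {
    # Numeric - numeric: Pearson correlation on complete pairs.
    cor(x, y, use = "complete.obs")
  } else if (!x_num && !y_num) {
    # Factor - factor: contingency table.
    table(x, y)
  } else {
    # Numeric - factor: mean of the numerical variable within each level.
    num <- if (x_num) x else y
    fac <- if (x_num) y else x
    tapply(num, fac, mean, na.rm = TRUE)
  }
}

# Toy data; the column names are illustrative only.
toy <- data.frame(
  quantity = c(1, 2, 3, 4),
  day      = c(2, 4, 6, 8),
  gender   = c("F", "M", "F", "M"),
  status   = c("a", "a", "b", "b")
)

num_num <- cross_summary(toy$quantity, toy$day)     # numeric - numeric
num_fac <- cross_summary(toy$quantity, toy$gender)  # numeric - factor
fac_fac <- cross_summary(toy$status, toy$gender)    # factor - factor
```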
7.2.1 Factor - factor
Example in the case of two factor variables:
out <- r_bivariate("clinical", "patient.clinical_cqcf.consent_or_death_status", "clinical", "patient.gender")
names(out)
## [1] "Table" "Graphical"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
The Graphical component:
names(out$Graphical)
## [1] "Barplot_base" "Barplot_fill" "Barplot_dodge"
out$Graphical$Barplot_base
out$Graphical$Barplot_fill
out$Graphical$Barplot_dodge
Second example with two factor variables:
out <- r_bivariate("clinical", "patient.samples.sample.portions.portion.analytes.analyte.3.analyte_type", "clinical", "patient.gender")
names(out)
## [1] "Table" "Graphical"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
The Graphical component:
names(out$Graphical)
## [1] "Barplot_base" "Barplot_fill" "Barplot_dodge"
out$Graphical$Barplot_base
out$Graphical$Barplot_fill
out$Graphical$Barplot_dodge
7.2.2 Numeric - factor
Example in the case of one numeric and one factor variable:
out <- r_bivariate("clinical", "patient.samples.sample.2.portions.portion.analytes.analyte.aliquots.aliquot.quantity", "clinical", "patient.gender")
names(out)
## [1] "Table" "Graphical"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
The Graphical component:
names(out$Graphical)
## [1] "Stripchart" "Boxplot" "Violin" "Density"
out$Graphical$Stripchart
out$Graphical$Boxplot
out$Graphical$Violin
out$Graphical$Density
Second example with one numeric and one factor variable:
out <- r_bivariate("clinical", "patient.samples.sample.2.portions.portion.analytes.analyte.aliquots.aliquot.quantity", "clinical", "patient.clinical_cqcf.consent_or_death_status")
names(out)
## [1] "Table" "Graphical"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
The Graphical component:
names(out$Graphical)
## [1] "Stripchart" "Boxplot" "Violin" "Density" "TukeyPlot"
out$Graphical$Stripchart
out$Graphical$Boxplot
out$Graphical$Violin
out$Graphical$Density
out$Graphical$TukeyPlot
7.2.3 Numeric - numeric
Example in the case of two numeric variables:
out <- r_bivariate("clinical",
                   "patient.samples.sample.2.portions.portion.analytes.analyte.aliquots.aliquot.quantity",
                   "clinical", "patient.day_of_form_completion")
names(out)
## [1] "Table" "Graphical"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
The Graphical component:
names(out$Graphical)
## [1] "Scatterplot"
out$Graphical$Scatterplot
7.3 Multivariate Dotplot
Here the output is a graph.
The function r_multivariate_dotplot takes up to 10 arguments.
Four are mandatory:
- datasetxaxis: a character, the name of the dataset for the x-axis variable
- varxaxis: a character, the name of the variable for the x-axis
- datasetyaxis: a character, the name of the dataset for the y-axis variable
- varyaxis: a character, the name of the variable for the y-axis
Six are optional:
- datasetcolor: a character, the name of the dataset for the colour of points
- varcolor: a character, the name of the variable for the colour of points
- datasetshape: a character, the name of the dataset for the shape of points
- varshape: a character, the name of the variable for the shape of points
- datasetsize: a character, the name of the dataset for the size of points
- varsize: a character, the name of the variable for the size of points
All the variables can be either numerical or categorical.
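Because each aesthetic (axis, colour, shape, size) names its own dataset, the variables have to be brought together before plotting. This chapter does not say how that is done; the sketch below assumes the datasets are aligned on their row names, which is plausible since both are read with row.names = 1. The helper gather_aesthetics and the toy data are hypothetical.

```r
# Hypothetical helper: gather one variable per aesthetic into a single data
# frame, aligning the datasets on their row names (an assumption of this sketch).
gather_aesthetics <- function(datasets, mapping) {
  # Row names present in every dataset used by the mapping.
  ids <- Reduce(intersect,
                lapply(mapping, function(m) rownames(datasets[[m[["dataset"]]]])))
  # Extract each requested variable in the common row order.
  cols <- lapply(mapping, function(m) datasets[[m[["dataset"]]]][ids, m[["variable"]]])
  data.frame(cols, row.names = ids, stringsAsFactors = FALSE)
}

# Toy data with the same samples in a different row order.
proteins_toy <- data.frame(AR = c(0.1, 0.5, 0.9), Akt = c(1.2, 0.7, 0.3),
                           row.names = c("p1", "p2", "p3"))
clinical_toy <- data.frame(gender = c("F", "M", "F"),
                           row.names = c("p3", "p1", "p2"))

plot_data <- gather_aesthetics(
  datasets = list(proteins = proteins_toy, clinical = clinical_toy),
  mapping  = list(
    x     = c(dataset = "proteins", variable = "AR"),
    y     = c(dataset = "proteins", variable = "Akt"),
    shape = c(dataset = "clinical", variable = "gender")
  )
)
plot_data
```

The resulting data frame has one column per aesthetic and could then be handed to any plotting routine.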
Example:
out <- r_multivariate_dotplot(datasetxaxis = "proteins", varxaxis = "AR",
                              datasetyaxis = "proteins", varyaxis = "Akt",
                              datasetcolor = "proteins", varcolor = "C.Raf",
                              datasetshape = "clinical", varshape = "patient.gender",
                              datasetsize = "proteins", varsize = "Bak")
names(out)
## [1] "Graphical"
names(out$Graphical)
## [1] "Dotplot"
out$Graphical$Dotplot
7.4 Univariate on a dataset
The function r_univariate_dataset performs univariate analysis on all variables of a dataset.
It handles numerical and categorical variables separately.
The function takes two arguments:
- datasetName: the name of the dataset,
- scale: a boolean, FALSE by default, indicating whether the numerical variables should be scaled for the plot.
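The separate handling of numerical and categorical variables, together with the optional scaling, can be sketched in base R. The function split_by_type is hypothetical; its max_num and max_cat arguments mirror the plotting limits of 150 and 50 variables stated in this section, and the use of base::scale (centre to mean 0, scale to unit variance) is an assumption about what scaling means here.

```r
# Hypothetical sketch of the pre-processing; names and defaults are illustrative.
split_by_type <- function(df, scale = FALSE, max_num = 150, max_cat = 50) {
  is_num <- vapply(df, is.numeric, logical(1))
  num   <- df[, is_num, drop = FALSE]
  categ <- df[, !is_num, drop = FALSE]
  if (scale && ncol(num) > 0) {
    # Centre and scale each numerical column (assumed meaning of `scale`).
    num[] <- lapply(num, function(col) as.numeric(base::scale(col)))
  }
  # Keep only the first columns of each kind for plotting.
  list(
    num   = num[, seq_len(min(ncol(num), max_num)), drop = FALSE],
    categ = categ[, seq_len(min(ncol(categ), max_cat)), drop = FALSE]
  )
}

# Toy data; a tiny max_num makes the truncation visible.
toy <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30), g = c("x", "y", "x"))
parts <- split_by_type(toy, scale = TRUE, max_num = 1)
names(parts$num)
names(parts$categ)
```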
The function returns an object in the global environment (the Object component). It also returns at least one table (up to 3) and at least one plot (up to 2), produced with plotly.
If there are too many variables, the plots use only the first ones (the first 150 numerical variables and the first 50 categorical variables).
Example:
out <- r_univariate_dataset(datasetName = "clinical",
                            scale = TRUE)
names(out)
## [1] "Graphical" "Table" "Object"
names(out$Graphical)
## [1] "plotNum" "plotCateg"
names(out$Table)
## [1] "numSummary" "numNormTests" "catSummary"
The Table component of output in JSON as passed to the interface:
out_TableOnly <- out
out_TableOnly$Graphical <- NULL
out_TableOnly$Object <- NULL
jsonview::json_tree_view(out_TableOnly, scroll = TRUE)
Content of the “Graphical” component:
names(out$Graphical)
## [1] "plotNum" "plotCateg"
out$Graphical$plotNum
out$Graphical$plotCateg
rm(list=ls())