1. R2 data-base

The R2 platform is a freely accessible online genomics analysis and visualization tool which can analyze a large collection of public data, but also allows shielded analysis of private dataset(s) as indicated by Rijswijk (2024).
R2 consists of a database, storing the genomic information, coupled to an extensive set of tools to analyze/visualize the datasets.
A link to the can be found in the references session (R2 2023) and below:
- https://hgserver1.amc.nl/cgi-bin/r2/main.cgi

2. Loading the neuroblastoma data-frame

Downloaded data from the R2 website, in the data folder

The file downloaded from R2 comes with the “#” symbols, which need to be removed.

r2_gse62564 <- read.table("../data/ps_avgpres_gse62564geo498_seqcnb1_box1687888611-datagrabber-.txt")
class(r2_gse62564)
saveRDS(r2_gse62564, file = "../data/r2_gse62564.rds")

Removal of “#” will create and save the appropriate data-frame

r2_gse62564 <- read.table("../data/ps_avgpres_gse62564geo498_seqcnb1_box1687888611-datagrabber_modified.txt")
class(r2_gse62564)
saveRDS(r2_gse62564, file = "../data/r2_gse62564.rds")

Load rds object and calculate how long it takes R to load the object

r2_gse62564.rds loaded needs to be the correct one.
If it was loaded from the first chunk above, R throws an error.
Object r2_gse62564.rds is stored in the data folder of the BBC-2024 Google Drive:
- https://drive.google.com/drive/folders/1fr7SkOqMqOqQIW5ttjblGkPHCJY2HDJv?usp=sharing
This code allows calculation of the time R takes to load the dataframe into the R environment:

start_time <- Sys.time()
r2_gse62564 <- readRDS("../data/r2_gse62564.rds")
end_time <- Sys.time()
end_time - start_time

## Time difference of 2.061233 secs

Visualize the dataframe

Use the head() function

head(r2_gse62564, n = 20)

Use the dim() function

dim(r2_gse62564)

## [1] 24961   500

3. Processing the neuroblastoma dataframe

Understanding the metadata part of the dataframe

Load dplyr package

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

Metadata is a set of data that describes and gives information about other data.
Take the first 17 lines to be the metadata.

metadata_gse62564               <- r2_gse62564[1:17,]

Visualize metadata dataframe

## View(metadata_gse62564)
head(metadata_gse62564, n = 20)

Visualize gene expression dataframe

head(r2_gse62564, n = 20)

Note that the second column, in the previous visualization, is repeated
Remove the second column, which is repeated using the select method of the dplyr package

r2_gse62564   <- dplyr::select(r2_gse62564, -V2)
head(r2_gse62564, n = 20)

Remove all metadata information, keep only gene counts in the gene expression dataframe

r2_gse62564   <- r2_gse62564[-c(2:18),]

Give names to columns using row 1

colnames(r2_gse62564) <- r2_gse62564[1, ]

Remove column that contain rows names

r2_gse62564           <- r2_gse62564[-1,]
head(r2_gse62564, n = 20)

Removing repeated rows

library(maditr)

## 
## To modify variables or add new variables:
##              let(mtcars, new_var = 42, new_var2 = new_var*hp) %>% head()

## 
## Attaching package: 'maditr'

## The following objects are masked from 'package:dplyr':
## 
##     between, coalesce, first, last

## The following object is masked from 'package:base':
## 
##     sort_by

Remove duplicated rows using the maditr package
distinct() keeps only unique/distinct rows from a data frame.

r2_gse62564_distinct           <- r2_gse62564 %>% distinct(`H:hugo`, .keep_all = TRUE)

Look at the gene expression dataset

head(r2_gse62564, n = 20)

Give names to rows

rownames(r2_gse62564_distinct) <- r2_gse62564_distinct$`H:hugo`

Remove the H:hugo column

r2_gse62564_distinct           <- dplyr::select(r2_gse62564_distinct, -`H:hugo`)

Look at the gene expression dataset after the distinct method is applied

head(r2_gse62564, n = 20)

names(r2_gse62564_distinct)    <- toupper(names(r2_gse62564_distinct))
r2_gse62564_distinct           <- r2_gse62564_distinct %>% mutate_if(is.character, as.numeric)
r2_gse62564_matrix             <- as.matrix(r2_gse62564_distinct)
r2_gse62564_df                 <- as.data.frame(r2_gse62564_matrix)

Remove the extra H:hugo column in r2_gse62564

r2_gse62564           <- dplyr::select(r2_gse62564, -`H:hugo`)

Rename columns in r2_gse62564_distinct

names(r2_gse62564_distinct)    <- toupper(names(r2_gse62564))
r2_gse62564_distinct           <- r2_gse62564_distinct %>% mutate_if(is.character, as.numeric)
r2_gse62564_matrix             <- as.matrix(r2_gse62564_distinct)
r2_gse62564_df                 <- as.data.frame(r2_gse62564_matrix)

4. Phenotype Score Calculation

In this part, we construct the dataframe to include phenotype scoring with the gene expression values. To calculate phenotype scores, we use the GSVA library.

The GSVA Library

Load the GSVA and GSEABAse libraries

# install.packages(BiocManager)
# library(BiocManager)
# BiocManager::install("GSVA")
library("GSVA")
library("GSEABase")

GSVA error:

Show in New Window
[1] "data.frame"
Show in New Window
Time difference of 5.320694 secs
Show in New Window
Error in getGmt("../data/cfDNA_genes_PCA.txt") : 
  could not find function "getGmt"
Show in New Window
Error: package or namespace load failed for ‘GSVA’:
 .onLoad failed in loadNamespace() for 'XML', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/Library/Frameworks/R.framework/Versions/4.1/Resources/library/XML/libs/XML.so':
  dlopen(/Library/Frameworks/R.framework/Versions/4.1/Resources/library/XML/libs/XML.so, 6): Library not loaded: @rpath/libxml2.2.dylib
  Referenced from: /Library/Frameworks/R.framework/Versions/4.1/Resources/library/XML/libs/XML.so
  Reason: image not found

Reference for GSVA error:

https://support.bioconductor.org/p/9157686/

Re-install Bioconductor and GSVA:

https://bioconductor.org/install/

Load gene sets

cfDNA_PCA_gene_list    <- getGmt("../data/cfDNA_genes_PCA.txt")

## Warning in readLines(con, ...): incomplete final line found on
## '../data/cfDNA_genes_PCA.txt'

## Warning in getGmt("../data/cfDNA_genes_PCA.txt"): 321 record(s) contain
## duplicate ids: ADRN_Gene_List_373, ADRN_Gronigen, ...,
## WP_GLYCOLYSIS_AND_GLUCONEOGENESIS_7, WP_GLYCOLYSIS_IN_SENESCENCE_Genes_4

Estimating GSVA enrichment scores (depracated)

Construct GSVA data-frame. This step needs to be re-run everytime there is a new gene set. From previous code versions, this chunk is now edpracated.

r2_gse62564_GSVA <- gsva(r2_gse62564_matrix, 
                          cfDNA_PCA_gene_list, 
                          min.sz=1, max.sz=Inf, 
                          verbose=TRUE)

## for simplicity, use only a subset of the sample data
ses <- leukemia_eset[1:1000, ]
gsc <- c2BroadSets[1:100]
gp1 <- gsvaParam(ses, gsc, minSize = 2, maxSize = Inf)
gp1

gsva_gp1 <- gsva(gp1, verbose=TRUE)

Estimating GSVA enrichment scores

build GSVA parameter object, according to documentation. In the examples of the gsvaParam class documentation of the leukemia and c2BroadSets, only the ses and gsc objects were used, not the maxDiff set to TRUE.

# gsvapar        <- gsvaParam(y, geneSets, maxDiff=TRUE)
gsvapar_gse62564 <- gsvaParam(r2_gse62564_matrix, cfDNA_PCA_gene_list)

## estimate GSVA enrichment scores for the three sets
# gsva_es        <- gsva(gsvapar)
gsva_es_gse62564 <- gsva(gsvapar_gse62564)

## Warning in .filterGenes(dataMatrix, removeConstant = removeConstant,
## removeNzConstant = removeNzConstant): 164 genes with constant values throughout
## the samples.

## Warning in .filterGenes(dataMatrix, removeConstant = removeConstant,
## removeNzConstant = removeNzConstant): Genes with constant values are discarded.

## No annotation package name available in the input data object.
## Attempting to directly match identifiers in data to gene sets.

## Warning in .filterAndMapGenesAndGeneSets(param): Some gene sets have size one.
## Consider setting 'minSize' to a value > 1.

## Estimating GSVA scores for 312 gene sets.
## Estimating ECDFs with Gaussian kernels
##   |                                                                              |                                                                      |   0%  |                                                                              |                                                                      |   1%  |                                                                              |=                                                                     |   1%  |                                                                              |=                                                                     |   2%  |                                                                              |==                                                                    |   2%  |                                                                              |==                                                                    |   3%  |                                                                              |==                                                                    |   4%  |                                                                              |===                                                                   |   4%  |                                                                              |===                                                                   |   5%  |                                                                              |====                                                                  |   5%  |                                                                              |====                                                                  |   6%  |                                                                              |=====                                                                 |   7%  |                                                                              |=====                                                                 |   8%  |                                                                              |======                                                                |   8%  |                                                                              |======                                                                |   9%  |                                                                              |=======                                                               |   9%  |                                                                              |=======                                                               |  10%  |                                                                              |=======                                                               |  11%  |                                                                              |========                                                              |  11%  |                                                                              |========                                                              |  12%  |                                                                              |=========                                                             |  12%  |                                                                              |=========                                                             |  13%  |                                                                              |==========                                                            |  14%  |                                                                              |==========                                                            |  15%  |                                                                              |===========                                                           |  15%  |                                                                              |===========                                                           |  16%  |                                                                              |============                                                          |  17%  |                                                                              |============                                                          |  18%  |                                                                              |=============                                                         |  18%  |                                                                              |=============                                                         |  19%  |                                                                              |==============                                                        |  20%  |                                                                              |==============                                                        |  21%  |                                                                              |===============                                                       |  21%  |                                                                              |===============                                                       |  22%  |                                                                              |================                                                      |  22%  |                                                                              |================                                                      |  23%  |                                                                              |=================                                                     |  24%  |                                                                              |=================                                                     |  25%  |                                                                              |==================                                                    |  25%  |                                                                              |==================                                                    |  26%  |                                                                              |===================                                                   |  27%  |                                                                              |===================                                                   |  28%  |                                                                              |====================                                                  |  28%  |                                                                              |====================                                                  |  29%  |                                                                              |=====================                                                 |  29%  |                                                                              |=====================                                                 |  30%  |                                                                              |======================                                                |  31%  |                                                                              |======================                                                |  32%  |                                                                              |=======================                                               |  32%  |                                                                              |=======================                                               |  33%  |                                                                              |========================                                              |  34%  |                                                                              |========================                                              |  35%  |                                                                              |=========================                                             |  35%  |                                                                              |=========================                                             |  36%  |                                                                              |==========================                                            |  37%  |                                                                              |==========================                                            |  38%  |                                                                              |===========================                                           |  38%  |                                                                              |===========================                                           |  39%  |                                                                              |============================                                          |  39%  |                                                                              |============================                                          |  40%  |                                                                              |============================                                          |  41%  |                                                                              |=============================                                         |  41%  |                                                                              |=============================                                         |  42%  |                                                                              |==============================                                        |  42%  |                                                                              |==============================                                        |  43%  |                                                                              |===============================                                       |  44%  |                                                                              |===============================                                       |  45%  |                                                                              |================================                                      |  45%  |                                                                              |================================                                      |  46%  |                                                                              |=================================                                     |  46%  |                                                                              |=================================                                     |  47%  |                                                                              |=================================                                     |  48%  |                                                                              |==================================                                    |  48%  |                                                                              |==================================                                    |  49%  |                                                                              |===================================                                   |  49%  |                                                                              |===================================                                   |  50%  |                                                                              |===================================                                   |  51%  |                                                                              |====================================                                  |  51%  |                                                                              |====================================                                  |  52%  |                                                                              |=====================================                                 |  52%  |                                                                              |=====================================                                 |  53%  |                                                                              |=====================================                                 |  54%  |                                                                              |======================================                                |  54%  |                                                                              |======================================                                |  55%  |                                                                              |=======================================                               |  55%  |                                                                              |=======================================                               |  56%  |                                                                              |========================================                              |  57%  |                                                                              |========================================                              |  58%  |                                                                              |=========================================                             |  58%  |                                                                              |=========================================                             |  59%  |                                                                              |==========================================                            |  59%  |                                                                              |==========================================                            |  60%  |                                                                              |==========================================                            |  61%  |                                                                              |===========================================                           |  61%  |                                                                              |===========================================                           |  62%  |                                                                              |============================================                          |  62%  |                                                                              |============================================                          |  63%  |                                                                              |=============================================                         |  64%  |                                                                              |=============================================                         |  65%  |                                                                              |==============================================                        |  65%  |                                                                              |==============================================                        |  66%  |                                                                              |===============================================                       |  67%  |                                                                              |===============================================                       |  68%  |                                                                              |================================================                      |  68%  |                                                                              |================================================                      |  69%  |                                                                              |=================================================                     |  70%  |                                                                              |=================================================                     |  71%  |                                                                              |==================================================                    |  71%  |                                                                              |==================================================                    |  72%  |                                                                              |===================================================                   |  72%  |                                                                              |===================================================                   |  73%  |                                                                              |====================================================                  |  74%  |                                                                              |====================================================                  |  75%  |                                                                              |=====================================================                 |  75%  |                                                                              |=====================================================                 |  76%  |                                                                              |======================================================                |  77%  |                                                                              |======================================================                |  78%  |                                                                              |=======================================================               |  78%  |                                                                              |=======================================================               |  79%  |                                                                              |========================================================              |  79%  |                                                                              |========================================================              |  80%  |                                                                              |=========================================================             |  81%  |                                                                              |=========================================================             |  82%  |                                                                              |==========================================================            |  82%  |                                                                              |==========================================================            |  83%  |                                                                              |===========================================================           |  84%  |                                                                              |===========================================================           |  85%  |                                                                              |============================================================          |  85%  |                                                                              |============================================================          |  86%  |                                                                              |=============================================================         |  87%  |                                                                              |=============================================================         |  88%  |                                                                              |==============================================================        |  88%  |                                                                              |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              |===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%

Merge gene expression and gsva scores

# r2_gse62564_GSVA_genes <- rbind(r2_gse62564_GSVA, r2_gse62564_matrix)
gsva_es_gse62564_genes   <- rbind(gsva_es_gse62564, r2_gse62564_matrix)

5. Save dataframe (old dataframe)

Save the new dataframe

# r2_gse62564_GSVA_genes <- rbind(r2_gse62564_GSVA, r2_gse62564_matrix)
gsva_es_gse62564_genes   <- rbind(gsva_es_gse62564, r2_gse62564_matrix)
saveRDS(gsva_es_gse62564_genes, file = "../data/gsva_es_gse62564_genes.rds")

Save dataframe (old dataframe)

gse62564_metadata <- metadata_gse62564
# Repeat row with IDs to confirm matching later
gse62564_metadata[18,] <- gse62564_metadata[1,]
# Remove first column of metadata DF
gse62564_metadata <- gse62564_metadata[,-c(1)]
# Make all IDs in row 2, uppercase
gse62564_metadata[1,2:499] <- toupper(gse62564_metadata[1,2:499])
# Make row 1, column names
library(janitor)
gse62564_metadata <- gse62564_metadata %>%
  row_to_names(row_number = 1)
# Make column as row names in metadata
library(tidyverse)
gse62564_metadata <- gse62564_metadata %>% remove_rownames %>% column_to_rownames(var="probeset")
r2_gse62564_GSVA_Metadata <- rbind(gse62564_metadata, r2_gse62564_GSVA_genes)
r2_gse62564_GSVA_Metadata <- t(r2_gse62564_GSVA_Metadata)
r2_gse62564_GSVA_Metadata <- as.data.frame(r2_gse62564_GSVA_Metadata)
saveRDS(r2_gse62564_GSVA_Metadata, file = "../results/r2_gse62564_GSVA_Metadata.rds")

References

R2. 2023. R2: Genomics Analysis and Visualization Platform. http://r2.amc.nl.

Rijswijk, Merlijn van. 2024. R2 Genomics Analysis and Visualization Platform – AMC. https://www.dtls.nl/technology-hotels/list/r2-genomics-analysis-and-visualization-platform-amc/.

Session Info

sessionInfo()

## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS Big Sur 11.4
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Chicago
## tzcode source: internal
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] GSEABase_1.66.0      graph_1.82.0         annotate_1.82.0     
##  [4] XML_3.99-0.17        AnnotationDbi_1.66.0 IRanges_2.38.1      
##  [7] S4Vectors_0.42.1     Biobase_2.64.0       BiocGenerics_0.50.0 
## [10] GSVA_1.52.3          maditr_0.8.4         dplyr_1.1.4         
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1            blob_1.2.4                 
##  [3] Biostrings_2.72.1           fastmap_1.2.0              
##  [5] SingleCellExperiment_1.26.0 digest_0.6.37              
##  [7] rsvd_1.0.5                  lifecycle_1.0.4            
##  [9] KEGGREST_1.44.1             RSQLite_2.3.7              
## [11] magrittr_2.0.3              compiler_4.4.1             
## [13] rlang_1.1.4                 sass_0.4.9                 
## [15] tools_4.4.1                 utf8_1.2.4                 
## [17] yaml_2.3.10                 data.table_1.15.4          
## [19] knitr_1.48                  S4Arrays_1.4.1             
## [21] bit_4.0.5                   DelayedArray_0.30.1        
## [23] abind_1.4-5                 BiocParallel_1.38.0        
## [25] HDF5Array_1.32.1            withr_3.0.1                
## [27] grid_4.4.1                  fansi_1.0.6                
## [29] beachmat_2.20.0             xtable_1.8-4               
## [31] Rhdf5lib_1.26.0             SummarizedExperiment_1.34.0
## [33] cli_3.6.3                   rmarkdown_2.28             
## [35] crayon_1.5.3                generics_0.1.3             
## [37] rstudioapi_0.16.0           httr_1.4.7                 
## [39] rjson_0.2.22                DBI_1.2.3                  
## [41] cachem_1.1.0                rhdf5_2.48.0               
## [43] zlibbioc_1.50.0             parallel_4.4.1             
## [45] XVector_0.44.0              matrixStats_1.3.0          
## [47] vctrs_0.6.5                 Matrix_1.7-0               
## [49] jsonlite_1.8.8              BiocSingular_1.20.0        
## [51] bit64_4.0.5                 irlba_2.3.5.1              
## [53] magick_2.8.4                jquerylib_0.1.4            
## [55] glue_1.7.0                  codetools_0.2-20           
## [57] GenomeInfoDb_1.40.1         GenomicRanges_1.56.1       
## [59] UCSC.utils_1.0.0            ScaledMatrix_1.12.0        
## [61] tibble_3.2.1                pillar_1.9.0               
## [63] htmltools_0.5.8.1           rhdf5filters_1.16.0        
## [65] GenomeInfoDbData_1.2.12     R6_2.5.1                   
## [67] sparseMatrixStats_1.16.0    evaluate_0.24.0            
## [69] lattice_0.22-6              png_0.1-8                  
## [71] SpatialExperiment_1.14.0    memoise_2.0.1              
## [73] bslib_0.8.0                 Rcpp_1.0.13                
## [75] SparseArray_1.4.8           xfun_0.47                  
## [77] MatrixGenerics_1.16.0       pkgconfig_2.0.3

Download and Process Neuroblastoma Data-Frame from R2 Database

Gepoliano Chaves

August 20th, 2024