library(xcms)
library(tidyverse)
library(plotly)

Introduction

The objective of this demo is to show how xcms can be used to read the results of fragmentation experiments and using of them to perform annotation.

Annotation is the process of going back from features to compounds and it can be considered one (or may be the bigger) challenge in metabolomics. In real life world, annotation is an effort which integrate a lot of different expertises ranging from chemistry, informatics, chemoinformatics,….

Annotation in MS is particularly challenging due to the fact that a mass spectrometer is basically a tool able to measure only mass to charge ratios … I mean can you imagine how difficult would be to distinguish people on the bases only of their weight?

Fragmentation experiments, often called MS/MS, are an essential tool for this task. They can be used either to deduce information on the chemical characteristics of unknown compounds, or to confirm the annotation when authentic chemical standards are available.

In this second case, the typical workflow is to inject the standards with the same analytical pipeline used for the samples, and then use m/z, rt and MS/MS to confirm the presence of a given compound inside my samples.

The importance of having reliable and efficient databases of chemical standards cannot be overlooked and the community is putting a lot of effort on that in terms of data organization, standardization and development of specific software libraries.

Reading MS/MS

Let’s suppose that having a library of standards in our lab, we injected them with the same chromatography used for the wines. Among the other compounds, our library contains 3 types of procyanidins B1, B2, B3. They are rather large polyphenols widely present in many matrices of vegetal origin.

Suppose that our analytical facility told us that they were analysed (with other compounds) in three injections:

We also know that in negative ion mode all these three molecules, will mainly yield the [M-H]- ion in negative ion mode MS (m/z 577.1352).

Our task now is to confirm (or not) the presence of the three of them in the wine samples. Let’s start with B1 …

Identify the retention time

First of all we have to identify the retention time of the standard. This could be done manually, or automatically. The manual way is to look for peaks in the extracted ion trace of the (577.1352) ion and identify the position of the peak.

The automatic way is to do peak picking … but this will be part of your assignment ;-)

std_files <- list.files("../data/wines/", pattern = "mix", full.names = TRUE)


std_raw <- readMSData(std_files, 
                    msLevel. = 1, ## we read only MS1 
                    mode = "onDisk")

Since we know what we are looking for let’s get the extracted ion trace

proc_chrom <- chromatogram(std_raw, mz = 577.1352 + 0.01*c(-1, 1))

plot(proc_chrom, col = seq(3)+1)

So three large peaks are visible in the three files, just to have an idea let’s look to the three profiles in interactive mode:

ggplotly(ggplot() + 
  geom_line(aes(x = rtime(proc_chrom[1,1]), intensity(proc_chrom[1,1]), col = "red")) + 
  geom_line(aes(x = rtime(proc_chrom[1,2]), intensity(proc_chrom[1,2])), col = "green") + 
  geom_line(aes(x = rtime(proc_chrom[1,3]), intensity(proc_chrom[1,3])), col = "blue") + 
  theme_bw())

So the tentative peak position turns out to be

B3: 242
B2: 267
B1: 234

Let’s find the spectrum on the top of the blue trace:

blue_id <- which(rtime(std_raw) == 233.9871)

ggplotly(plot(std_raw[[blue_id]]))

So the full scan does not give a clear fingerprint of the fragmentation spectrum, to understand that we have to query MS2 spectra:

std_MS_MS <- readMSData(std_files, 
                    msLevel. = 2, ## we read only MS1 and MS2
                    mode = "onDisk")

The question to be addressed, now is: do we have fragmentation spectra of the ion around 577?

The list of precursor ions can be directly accessed with

precursorMz(std_MS_MS)[1:10]
## F1.S0002 F1.S0005 F1.S0008 F1.S0011 F1.S0014 F1.S0017 F1.S0020 F1.S0023 
## 330.8706 330.8705 194.9464 194.9465 330.8705 330.8703 330.8704 330.8705 
## F1.S0026 F1.S0029 
## 330.8704 194.9464

But two handy methods can be used to filter the list of precursors

focus_MS_MS <- std_MS_MS %>% 
  filterPrecursorMz(577.1352, ppm = 10) %>% 
  filterRt(rt = c(224,280))


focus_MS_MS
## MSn experiment data ("OnDiskMSnExp")
## Object size in memory: 0.03 Mb
## - - - Spectra data - - -
##  MS level(s): 2 
##  Number of spectra: 4 
##  MSn retention times: 3:53 - 4:29 minutes
## - - - Processing information - - -
## Data loaded [Wed Nov 29 00:17:34 2023] 
## Filter: select MS level(s) 2. [Wed Nov 29 00:17:34 2023] 
## Filter: select by precursor m/z: 577.1352. [Wed Nov 29 00:17:34 2023] 
## Filter: select retention time [224-280] and MS level(s), 2 [Wed Nov 29 00:17:34 2023] 
##  MSnbase version: 2.26.0 
## - - - Meta data  - - -
## phenoData
##   rowNames: x015_mix02_2_NEG_DDA.mzML x016_mix03_2_NEG_DDA.mzML
##     x017_mix04_2_NEG_DDA.mzML
##   varLabels: sampleNames
##   varMetadata: labelDescription
## Loaded from:
##   [1] x015_mix02_2_NEG_DDA.mzML...  [3] x017_mix04_2_NEG_DDA.mzML
##   Use 'fileNames(.)' to see all files.
## protocolData: none
## featureData
##   featureNames: F1.S0530 F2.S0578 F3.S0506 F3.S0515
##   fvarLabels: fileIdx spIdx ... spectrum (35 total)
##   fvarMetadata: labelDescription
## experimentData: use 'experimentData(object)'

So here we have four spectra compatible with the rules …Let’s look to their retention times

rtime(focus_MS_MS)
## F1.S0530 F2.S0578 F3.S0506 F3.S0515 
## 243.0483 269.1749 232.8409 237.7909

The first two are OK (File1 and File2), for the third file we have two fragmentations. This is not unreasonable since the blue peak is large, the ion at m/z 577 was, most likely fragmented in two consecutive scans. To understand which is the good one one should rely on DBs or interpretation of mass spectra.

Let’s now make a nice plot with the MS/MS spectra:

plot(mz(focus_MS_MS[[1]]), 
     intensity(focus_MS_MS[[1]])/max(intensity(focus_MS_MS[[1]])), 
     type = "h",
     xlab = "mz", col = "red",
     ylab = "normalized Intensity")
points(mz(focus_MS_MS[[2]])+2, intensity(focus_MS_MS[[2]])/max(intensity(focus_MS_MS[[2]])), col = "green", type = "h")
points(mz(focus_MS_MS[[4]])-2, intensity(focus_MS_MS[[4]])/max(intensity(focus_MS_MS[[4]])), col = "blue", type = "h")

So, as expected the three procyanidins are giving substantially the same fragmentation spectrum and are eluting at three different retention times.

DIY

Try to check if these compounds are visible inside one of our wine samples. Hint, (1) check the extracted ion trace of the 577 ion (2) check if this ion was fragmented.