Introduction

This is an example of loading a search result from MaxQuant and analysing it in MSstats. We could do this using the MSstats Shiny application, but doing it in R directly gives us more control over the process and lets us examine intermediate steps in the processing.

The data we are using comes from PRIDE accession PXD043985. It’s a yeast dataset which contains both whole proteome samples and affinity pull downs using the eIF2a protein. The aim here is to find the proteins which are enriched in the pull down compared to the whole proteome.

Setup

We’re going to base this analysis on the MSstats package, but will supplement this with the standard tidyverse packages. We’ll use pheatmap for drawing heatmaps of the hits.

library(MSstats)
library(MSstatsConvert)
library(tidyverse)
library(pheatmap)
theme_set(theme_bw())

Loading Data

We are going to import two files from the MaxQuant output. These are:

  1. evidence.txt: the main quantified file at the peptide level
  2. proteinGroups.txt: provides protein level quantitation and shows how the peptides were combined

We’ll read in the evidence file first to look at some of the properties of the data, and we can then get MSstats to convert this into its standard format. We’re importing data from MaxQuant, but the same approach would work with data from other search platforms.

Read in evidence file

We read in the file using the standard read_delim, but we use the same style of column name repair that read.delim would use, as MSstats expects the names to be in this format.

read_delim(
  "evidence.txt",
  name_repair = "universal"
) -> evidence

head(evidence)
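To see what this kind of name repair does, here is a minimal sketch using base R's make.names, which is the function read.delim applies when check.names = TRUE. The header strings are a small illustrative subset of MaxQuant column names. The "universal" repair in readr follows similar rules, although the two are not guaranteed to agree in every edge case.

```r
# Illustrative subset of MaxQuant evidence.txt headers
maxquant_headers <- c(
  "Raw file", "Retention time", "Potential contaminant", "Match score"
)

# make.names() is what read.delim() applies when check.names = TRUE:
# characters that are not valid in an R name (such as spaces) become "."
repaired_headers <- make.names(maxquant_headers, unique = TRUE)

print(repaired_headers)
```

The repaired names are the ones used in the rest of this analysis, for example Raw.file and Match.score.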

Read in protein file

We can also load the protein level information.

read_delim(
  "proteinGroups.txt", 
  name_repair = "universal"
) -> protein_groups

head(protein_groups)

Create annotation file

For the downstream analysis we also need to make up a tibble of annotations to say which group each file belongs to. There are only two groups here: the full proteomes and the affinity tag pull downs. We’ll make the annotation from the data in the evidence file.

We’re not doing a mass tagged experiment, so we need to say that all of the samples are using light (L), i.e. normal masses.

evidence %>%
  distinct(`Raw.file`) %>%
  mutate(Condition = str_replace(Raw.file,"^.*-","")) %>%
  mutate(Condition = str_replace(Condition,"TAP_Prot","Prot")) %>%
  mutate(Condition = str_replace(Condition,"_Rep.","")) %>%
  arrange(Raw.file) %>%
  group_by(Condition) %>%
  mutate(BioReplicate = 1:n()) %>%
  ungroup() %>%
  add_column(IsotopeLabelType="L")  -> annotation

annotation
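The chain of string replacements above is easier to follow on a small example. This is a sketch only: the raw file names below are hypothetical (the real names in PXD043985 may differ), and it uses base R sub() in place of str_replace so that it runs without any packages.

```r
# Hypothetical raw file names, invented for illustration
raw_files <- c(
  "Q1-ProtTot_Rep1", "Q1-TAP_Rep1", "Q2-TAP_ProtTot_Rep2", "Q2-TAP_Rep2"
)

# The same three substitutions as in the pipeline above
condition <- sub("^.*-", "", raw_files)          # drop everything up to the last hyphen
condition <- sub("TAP_Prot", "Prot", condition)  # collapse a "TAP_Prot" prefix to "Prot"
condition <- sub("_Rep.", "", condition)         # strip the replicate suffix

toy_annotation <- data.frame(
  Raw.file = raw_files,
  Condition = condition,
  stringsAsFactors = FALSE
)

# Number the replicates within each condition, in file name order
toy_annotation <- toy_annotation[order(toy_annotation$Raw.file), ]
toy_annotation$BioReplicate <- ave(
  seq_len(nrow(toy_annotation)), toy_annotation$Condition, FUN = seq_along
)

print(toy_annotation)
```

Each hypothetical file ends up in either the ProtTot or the TAP condition, with replicates numbered separately within each one.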

Properties of input data

We’ve already looked at the QC of this data using PTXQC, but we can also look directly into the evidence and protein data to see what we’re working with. MSstats will do some filtering for us when the data is loaded, but the exact metrics which are used will vary between different search programs.

Retention time

It’s good to check that we’re getting a reasonably even spread of peptides eluting across the whole retention time range. We can see this visually.

evidence %>%
  ggplot(aes(x=Retention.time, colour=Raw.file)) +
  geom_density()

Number of peptides per sample

evidence %>%
  group_by(Raw.file) %>%
  summarise(
    true_matches = sum(is.na(Match.score)),
    transfer_matches = sum(!is.na(Match.score))
  ) %>%
  ungroup() %>%
  pivot_longer(
    cols = -Raw.file,
    names_to="match_type",
    values_to="peptide_count"
  ) %>%
  ggplot(aes(x=Raw.file, y=peptide_count, fill=match_type)) +
  geom_col() +
  coord_flip() +
  scale_fill_brewer(palette = "Set1")

All of the plots we could make in PTXQC we could remake here. We can see that the TAP samples show fewer peptides than the proteomes.

Amount of contamination

evidence %>%
  group_by(Raw.file) %>%
  summarise(
    contaminants = sum(!is.na(Potential.contaminant)),
    non_contaminants = sum(is.na(Potential.contaminant))
  ) %>%
  left_join(annotation) %>%
  ggplot(aes(x=non_contaminants, y=contaminants, colour=Condition, label=BioReplicate)) +
  geom_point(size=7) +
  geom_text(colour="black")
## Joining with `by = join_by(Raw.file)`

Again, we can see that the full proteomes have many more peptides in them, but also many fewer contaminants. The TAP samples are less complex (as you’d expect) and more contaminated.

We can also see that there is a small batch effect where samples 1, 2 and 3 of the full proteome have fewer peptides than samples 4, 5 and 6.

PEP scores

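The PEP (posterior error probability) scores represent the probability that the reported match is incorrect, and are based on the strength of the hits to the real proteome reference vs the hits to the reversed reference. To make them easier to plot we convert them to a PHRED-style score, where higher values mean more confident matches.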

evidence %>%
  filter(is.na(Potential.contaminant)) %>%
  select(Raw.file,PEP) %>%
  mutate(PHRED = -10*log10(PEP)) %>%
  filter(!is.nan(PHRED)) %>%
  mutate(PHRED = replace(PHRED, PHRED > 50,50)) %>%
  left_join(annotation) %>%
  ggplot(aes(x=Raw.file, y=PHRED, fill=Condition)) +
  geom_violin() +
  coord_flip()
## Joining with `by = join_by(Raw.file)`

We can see that although a lot of the matches are of high quality, the Q1 files, especially the Q1 TAP, have more hits with lower quality matches. The Q2 files look much nicer. It’s only the PEP scores with a PHRED of less than 20 (below 99% confidence) where we’d have any concerns.
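As a worked example of the PEP to PHRED conversion used in the plot, here is a minimal sketch on made-up PEP values (they are not taken from the dataset). It treats PEP as the probability that a match is wrong, so the confidence is 1 - PEP.

```r
# Made-up posterior error probabilities, for illustration only
pep <- c(0.1, 0.01, 0.001, 1e-8, 0)

# PHRED-style score: every 10 points is a 10-fold drop in error probability
phred <- -10 * log10(pep)

# A PEP of exactly 0 gives Inf, so cap the scale at 50 as in the plot above
phred_capped <- pmin(phred, 50)

phred_table <- data.frame(
  PEP = pep,
  PHRED = phred_capped,
  confidence = 1 - pep
)
print(phred_table)
```

A PEP of 0.01 maps to a PHRED of 20, which is the 99% confidence threshold mentioned above.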

Number of peptides per protein

protein_groups %>%
  select(starts_with("Peptides.")) %>%
  pivot_longer(
    cols=everything(),
    names_to="Raw.File",
    values_to="peptide_count"
  ) %>%
  filter(peptide_count > 0) %>%
  mutate(Raw.File=str_replace(Raw.File,"^Peptides\\.","")) %>%
  ggplot(aes(x=peptide_count, group=Raw.File)) +
  geom_density() +
  coord_cartesian(xlim=c(0,25)) +
  scale_x_continuous(breaks=0:25)

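There are quite a lot of proteins identified by only a single peptide. These can be removed during the data import and processing, but note that this does not happen with the settings used here: the import log below reports that proteins with a single feature are not removed.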

Number and type of proteins

We can also look at the total number of proteins observed in each sample and how they were identified (directly observed or via MS1 transfer).

protein_groups %>%
  select(starts_with("Identification.type.")) %>%
  pivot_longer(
    cols=everything(),
    names_to="Raw.File",
    values_to="IdentificationType"
  ) %>%
  group_by(Raw.File,IdentificationType) %>%
  count() %>%
  ggplot(aes(x=Raw.File, y=n, fill=IdentificationType)) +
  geom_col() +
  coord_flip()

We can see that there are many fewer observed proteins in the TAP samples, and that the Q1 replicates had a higher proportion of MS2 identifications than the Q2.

Having taken a basic look at the raw data we can start working through the MSstats pipeline.

Running MSstats

Converting the input data

MaxQtoMSstatsFormat(
  evidence = evidence,
  annotation = annotation,
  proteinGroups = protein_groups
) -> raw_data
## INFO  [2024-09-26 15:52:03] ** Raw data from MaxQuant imported successfully.
## INFO  [2024-09-26 15:52:03] ** Rows with values of Potentialcontaminant equal to + are removed 
## INFO  [2024-09-26 15:52:04] ** Rows with values of Reverse equal to + are removed 
## INFO  [2024-09-26 15:52:04] ** Rows with values of Potentialcontaminant equal to + are removed 
## INFO  [2024-09-26 15:52:04] ** Rows with values of Reverse equal to + are removed 
## INFO  [2024-09-26 15:52:04] ** Rows with values of Onlyidentifiedbysite equal to + are removed 
## INFO  [2024-09-26 15:52:04] ** + Contaminant, + Reverse, + Potential.contaminant, + Only.identified.by.site proteins are removed.
## INFO  [2024-09-26 15:52:05] ** Raw data from MaxQuant cleaned successfully.
## INFO  [2024-09-26 15:52:05] ** Using provided annotation.
## INFO  [2024-09-26 15:52:05] ** Run labels were standardized to remove symbols such as '.' or '%'.
## INFO  [2024-09-26 15:52:05] ** The following options are used:
##   - Features will be defined by the columns: PeptideSequence, PrecursorCharge
##   - Shared peptides will be removed.
##   - Proteins with single feature will not be removed.
##   - Features with less than 3 measurements across runs will be removed.
## INFO  [2024-09-26 15:52:05] ** Features with all missing measurements across runs are removed.
## INFO  [2024-09-26 15:52:05] ** Shared peptides are removed.
## INFO  [2024-09-26 15:52:05] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
## INFO  [2024-09-26 15:52:05] ** Features with one or two measurements across runs are removed.
## INFO  [2024-09-26 15:52:05] ** Run annotation merged with quantification data.
## INFO  [2024-09-26 15:52:05] ** Features with one or two measurements across runs are removed.
## INFO  [2024-09-26 15:52:05] ** Fractionation handled.
## INFO  [2024-09-26 15:52:06] ** Updated quantification data to make balanced design. Missing values are marked by NA
## INFO  [2024-09-26 15:52:06] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.

raw_data %>%
  as_tibble() %>%
  head()

We can see a simplified, standardised version of the data.
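One of the options reported in the import log is that features with fewer than 3 measurements across runs are removed. The sketch below illustrates that idea on a made-up table using MSstats-style column names. It is not the MSstatsConvert implementation, just a base R illustration of the rule as the log describes it.

```r
# Made-up long-format table; the peptides and intensities are invented
toy <- data.frame(
  PeptideSequence = rep(c("PEPTIDEA", "PEPTIDEB"), each = 4),
  PrecursorCharge = 2,
  Run = rep(paste0("run", 1:4), times = 2),
  Intensity = c(1e6, 2e6, NA, 3e6,   # PEPTIDEA: 3 measurements
                5e5, NA, NA, 4e5),   # PEPTIDEB: 2 measurements
  stringsAsFactors = FALSE
)

# A feature is defined by peptide sequence plus precursor charge
toy$feature <- paste(toy$PeptideSequence, toy$PrecursorCharge, sep = "_")

# Count the non-missing intensities per feature and keep those with at least 3
n_measured <- tapply(!is.na(toy$Intensity), toy$feature, sum)
kept <- toy[toy$feature %in% names(n_measured)[n_measured >= 3], ]

print(unique(kept$feature))
```

Only the feature measured in three of the four runs survives; the one measured twice is dropped.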

Quantifying the data
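
The quantification step in this section log2 transforms the intensities and normalises the runs by equalising their medians. As a rough intuition for what that normalisation does, here is a minimal base R sketch on made-up numbers. It is a simplification and not the MSstats implementation, which works on the full feature table.

```r
# Made-up log2 intensities; each run is offset from the previous one by 0.5
log2_intensity <- list(
  run1 = c(20.0, 21.0, 22.0),
  run2 = c(20.5, 21.5, 22.5),
  run3 = c(21.0, 22.0, 23.0)
)

# Median of each run, and a common target (the median of the run medians)
run_medians <- sapply(log2_intensity, median)
target <- median(run_medians)

# Shift every run by a constant so that all run medians match the target
normalised <- Map(function(x, m) x - m + target, log2_intensity, run_medians)

print(sapply(normalised, median))
```

After the shift all three runs share the same median, while the spacing of values within each run is unchanged.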

dataProcess(
  raw_data,
  logTrans = 2,
  normalization = "equalizeMedians"
) -> quantified_data
## INFO  [2024-09-26 15:52:07] ** Log2 intensities under cutoff = 15.108  were considered as censored missing values.
## INFO  [2024-09-26 15:52:07] ** Log2 intensities = NA were considered as censored missing values.
## INFO  [2024-09-26 15:52:07] ** Use all features that the dataset originally has.
## INFO  [2024-09-26 15:52:08] 
##  # proteins: 1681
##  # peptides per protein: 1-145
##  # features per peptide: 1-1
## INFO  [2024-09-26 15:52:08] Some proteins have only one feature: 
##  D6VTK4,
##  O74700,
##  O94742,
##  P00044;P00045,
##  P04037 ...
## INFO  [2024-09-26 15:52:08] 
##                     ProtTot TAP
##              # runs       6   6
##     # bioreplicates       6   6
##  # tech. replicates       1   1
## INFO  [2024-09-26 15:52:08] Some features are completely missing in at least one condition:  
##  NQFYQLPTPTSSK_2_NA_NA,
##  AEASGEAADEADEADEE_2_NA_NA,
##  INEKPTVVNDYEAAR_2_NA_NA,
##  LRGNNIGSPLGAPK_2_NA_NA,
##  DGEVIANIIGEAK_2_NA_NA ...
## INFO  [2024-09-26 15:52:08]  == Start the summarization per subplot...
                                                      |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              
|===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
## INFO  [2024-09-26 15:52:57]  == Summarization is done.
quantified_data$FeatureLevelData %>% head()
quantified_data$ProteinLevelData %>% head()

We now have data which has been quantitated and normalised. The dataProcess function has additional options for changing the default methods used for filtering, quantitation and normalisation. By default it normalises at the PSM level to equalise the median intensity between files, it summarises the intensities of multiple PSMs in a protein using Tukey’s median polish, and it keeps all features in the data. It will impute missing values rather than leaving them empty.
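To make those two default steps more concrete, here is a toy sketch in base R of median equalisation across runs and a Tukey median polish summary for a single protein. The matrix `log_int` and all of its values are invented for illustration, and this only approximates the idea rather than reproducing the MSstats implementation. In a real dataset the run medians would be calculated over all features, not just those of one protein.

```r
# Toy log2 intensities for one protein: 4 features (rows) x 3 runs (columns).
# All values are made up for illustration.
log_int <- matrix(
  c(20, 21, 22,
    18, 19, 20,
    22, 23, 24,
    19, 20, 21),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feature", 1:4), paste0("run", 1:3))
)

# Median equalisation: shift each run so every run has the same median.
run_medians <- apply(log_int, 2, median)
normalised <- sweep(log_int, 2, run_medians - median(run_medians))
apply(normalised, 2, median)

# Tukey median polish (shown on the un-normalised values so the run effect
# is visible): decompose into overall + feature effect + run effect.
fit <- medpolish(log_int, trace.iter = FALSE)
run_summary <- fit$overall + fit$col   # one summarised value per run
run_summary
```

The median polish is robust to a single odd feature in a way that a simple mean would not be, which is one reason it is a reasonable choice for summarising PSMs to proteins.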

The review of statistics said that just using the default MaxQuant quantitation, with no additional normalisation, is also a good approach, so you could use that instead.

Visualising the quantitated data

Checking peptide normalisation

We can look at the distribution of normalised intensity values in the peptides. We should see that their medians are all the same because of the method used.

quantified_data$FeatureLevelData %>%
  filter(!censored) %>%
  ggplot(aes(x=originalRUN, y=newABUNDANCE, fill=GROUP)) +
  geom_boxplot() +
  coord_flip()

These seem fairly comparable from this level of view. Whilst this type of normalisation would generally work well with whole proteomes we will check how it performs on this data later, since we are looking at enriched data.

Checking protein normalisation

We can also look at the protein level.

quantified_data$ProteinLevelData %>%
  ggplot(aes(x=originalRUN, y=LogIntensities, fill=GROUP)) +
  geom_boxplot() +
  coord_flip() 

At the protein level we can see that the normalisation doesn’t look as nice and that the intensities in the total protein samples are higher than in the TAP. We saw earlier that the TAP was a less complex mix, so there are many fewer proteins in it. We would also expect the two to be distributed differently: the TAP samples contain only proteins which were affinity pulled-down, so we might expect a smaller number of more highly abundant proteins. For whole proteome comparisons normalisation should be fairly straightforward, but for enriched mixes it can be more complex.

We can look at the distributions in a different way, looking at the detail of the distribution for each sample.

quantified_data$ProteinLevelData %>%
  ggplot(aes(x=LogIntensities, colour=GROUP, group=originalRUN)) +
  geom_density(linewidth=1) 

Again, we can see the distributions look different, but now it’s clearer that we can see an outgroup with high abundance in the TAP samples, and the majority of the data with relatively low abundance. It’s the high abundance proteins in the TAP relative to Total that we’re likely to be most interested in.

Clustering

We can perform clustering on the samples in a few different ways. The simplest is to do a PCA plot. PCA needs a complete matrix, so we keep only proteins which have a value in all 12 samples.

quantified_data$ProteinLevelData %>%
  select(Protein,originalRUN,LogIntensities) %>%
  filter(!is.na(LogIntensities)) %>%
  group_by(Protein) %>%
  filter(n() == 12) %>%
  ungroup() %>%
  pivot_wider(
    names_from=originalRUN,
    values_from=LogIntensities
  ) %>%
  column_to_rownames(var="Protein") %>%
  t() %>%
  prcomp(scale=TRUE) -> pca_results

We can then plot the results

pca_results$x %>%
  as_tibble(rownames="Raw.file") %>%
  left_join(annotation) %>%
  ggplot(aes(x=PC1, y=PC2, fill=Condition)) +
  geom_point(pch=21, size=6) +
  ggtitle("PCA plot of all samples")
## Joining with `by = join_by(Raw.file)`

We can see that the samples separate between groups really nicely on PC1. On PC2 we see the ProtTot samples separate into two groups, the same as we saw when we looked at the amount of contaminants.

We can see how much of the total variance is contained within the different principal components.

tibble(
  PC = paste0("PC",1:12),
  SD = pca_results$sdev
) %>%
  mutate(PC = factor(PC, levels=PC)) %>%
  mutate(
    PC_SD = 100 * SD^2 / sum(SD^2)
  ) %>%
  ggplot(aes(x=PC,y=PC_SD)) +
  geom_col() +
  ggtitle("Amount of information in each PC")

We can see that by only looking at PCs 1 and 2 we’re really not missing much.
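The calculation above squares the standard deviations, so what is plotted is the percentage of variance per component. As a small self-contained check of that arithmetic, here is a sketch on an invented random matrix (the object names `toy` and `toy_pca` are just for this example), comparing the manual calculation with the proportions reported by `summary()`.

```r
set.seed(1)

# Toy matrix: 6 samples (rows) x 5 proteins (columns), random invented values.
toy <- matrix(
  rnorm(30), nrow = 6,
  dimnames = list(paste0("sample", 1:6), paste0("protein", 1:5))
)
toy_pca <- prcomp(toy, scale. = TRUE)

# Variance is the square of the standard deviation of each component.
pct_var <- 100 * toy_pca$sdev^2 / sum(toy_pca$sdev^2)
cumsum(pct_var)

# The same proportions (as fractions, rounded) are reported by summary().
summary(toy_pca)$importance["Proportion of Variance", ]
```

Looking at the cumulative percentage is a quick way to decide how many components are worth plotting.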

If we wanted to see which proteins were driving the changes in PC1 and PC2 then we could look at the largest absolute weights in those components.

pca_results$rotation %>%
  as_tibble(rownames="Protein") %>%
  select(Protein,PC1,PC2) %>%
  pivot_longer(
    cols=-Protein,
    names_to="PC",
    values_to="Weight"
  ) %>%
  arrange(desc(abs(Weight))) %>%
  group_by(PC) %>%
  slice(1:20) %>%
  group_split()
## <list_of<
##   tbl_df<
##     Protein: character
##     PC     : character
##     Weight : double
##   >
## >[2]>
## [[1]]
## # A tibble: 20 × 3
##    Protein                     PC     Weight
##    <chr>                       <chr>   <dbl>
##  1 P12709                      PC1   -0.0554
##  2 P34760;Q04120               PC1   -0.0554
##  3 P14540                      PC1   -0.0554
##  4 P00560                      PC1   -0.0553
##  5 P54115                      PC1   -0.0553
##  6 P00950                      PC1   -0.0553
##  7 P00817                      PC1   -0.0552
##  8 P06169;P26263;P16467;Q07471 PC1   -0.0552
##  9 P00925                      PC1   -0.0552
## 10 P17255                      PC1   -0.0552
## 11 P00549                      PC1   -0.0552
## 12 P04840                      PC1   -0.0552
## 13 Q03048                      PC1   -0.0552
## 14 P00330;P00331               PC1   -0.0552
## 15 P32589                      PC1   -0.0552
## 16 P00830                      PC1   -0.0552
## 17 P32582                      PC1   -0.0551
## 18 P22515                      PC1   -0.0550
## 19 P00498                      PC1   -0.0550
## 20 P20081                      PC1   -0.0549
## 
## [[2]]
## # A tibble: 20 × 3
##    Protein       PC    Weight
##    <chr>         <chr>  <dbl>
##  1 P25443        PC2   0.101 
##  2 P02309        PC2   0.0985
##  3 P02294;P02293 PC2   0.0969
##  4 P40495        PC2   0.0964
##  5 P0CX46;P0CX45 PC2   0.0954
##  6 P12695        PC2   0.0945
##  7 P32497        PC2   0.0945
##  8 P14120        PC2   0.0932
##  9 P32481        PC2   0.0921
## 10 P30822        PC2   0.0920
## 11 P16387        PC2   0.0906
## 12 P39015        PC2   0.0902
## 13 P53622        PC2   0.0894
## 14 Q06287        PC2   0.0886
## 15 P53551        PC2   0.0883
## 16 P0CX42;P0CX41 PC2   0.0882
## 17 P0CX28;P0CX27 PC2   0.0866
## 18 P53742        PC2   0.0864
## 19 P39730        PC2   0.0851
## 20 P09624        PC2   0.0849

Scatterplot

Let’s look at the scatterplot of Total vs TAP at the protein level.

quantified_data$ProteinLevelData %>%
  group_by(Protein, GROUP) %>%
  summarise(
    LogIntensities = mean(LogIntensities, na.rm = TRUE)
  ) %>%
  pivot_wider(
    names_from=GROUP,
    values_from=LogIntensities,
    values_fill = 15
  ) %>%
  ggplot(aes(x=ProtTot, y=TAP)) +
  geom_jitter(width=0.1, height=0.1) +
  geom_abline(slope=1, intercept = 0, colour="red", linewidth=1)
## `summarise()` has grouped output by 'Protein'. You can override using the
## `.groups` argument.

Because a lot of proteins were only observed in one condition or the other, I’ve filled these missing values with an artificial value of 15 and added a bit of jitter so we can get a sense of how many there are. We can see a lot of proteins in Total which are absent in TAP, and only a few which go the other way. There is a diagonal set of points which look higher in TAP, but those might be the true unchanging proteins, so we could later opt to re-normalise to centre the data on those proteins, which would give us a very different answer for the changing proteins.
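As a rough sketch of what that re-normalisation might look like, with invented numbers: take a set of proteins assumed to be unchanged, measure the median offset between TAP and Total for them, and subtract that offset from the TAP values. The data frame `toy` and its `reference` flag are hypothetical, and choosing the reference set is the hard part which this sketch doesn’t address.

```r
# Invented mean log2 intensities per protein in each condition.
toy <- data.frame(
  Protein   = paste0("P", 1:6),
  ProtTot   = c(20, 22, 24, 26, 21, 23),
  TAP       = c(22, 24, 26, 28, 27, 30),
  reference = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)  # assumed unchanged
)

# Median TAP - Total difference among the assumed-unchanged proteins.
offset <- median(toy$TAP[toy$reference] - toy$ProtTot[toy$reference])

# Shift TAP so the reference proteins sit on the y = x diagonal.
toy$TAP_recentred <- toy$TAP - offset
toy$log2FC <- toy$TAP_recentred - toy$ProtTot
toy
```

After the shift the reference proteins have a fold change of zero, and only the proteins which sit above the diagonal band stand out as enriched.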

Differential Expression

We’d like to do some statistics to compare the two conditions. We need to build a contrast, which is very simple when we only have two conditions: we need a “1” for one condition and a “-1” for the other.

levels(quantified_data$FeatureLevelData$GROUP)
## [1] "ProtTot" "TAP"
matrix(c(-1,1),nrow=1) -> contrasts

row.names(contrasts) <- "TAP_vs_Total"
colnames(contrasts) <- levels(quantified_data$FeatureLevelData$GROUP)

contrasts
##              ProtTot TAP
## TAP_vs_Total      -1   1
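If there were more than two conditions, the same idea extends by giving a weight of 0 to any group not involved in the comparison. Here is a minimal sketch of a helper which builds one row of a contrast matrix from group names. The function `make_contrast` and the three-group example are hypothetical and not part of MSstats.

```r
# Hypothetical helper: build a one-row contrast of `numerator` vs `denominator`.
make_contrast <- function(groups, numerator, denominator) {
  stopifnot(numerator %in% groups, denominator %in% groups)
  weights <- setNames(numeric(length(groups)), groups)
  weights[numerator] <- 1
  weights[denominator] <- -1
  matrix(
    weights, nrow = 1,
    dimnames = list(paste0(numerator, "_vs_", denominator), groups)
  )
}

# Reproduces the two-group contrast used here (with a different row name).
make_contrast(c("ProtTot", "TAP"), "TAP", "ProtTot")

# With three (invented) groups, the unused group gets a weight of 0.
make_contrast(c("A", "B", "C"), "C", "A")
```

The column order has to match the order of the factor levels in the data, which is why the levels are checked before the matrix is built.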

Now we can run the statistics. This runs the mixed effects model built into MSstats.

groupComparison(contrast.matrix = contrasts, data=quantified_data) -> comparison_result
## INFO  [2024-09-26 15:53:01]  == Start to test and get inference in whole plot ...
                                                      |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              
|===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
## INFO  [2024-09-26 15:53:41]  == Comparisons for all proteins are done.

Review stats results

Let’s take a look at the results. We’re only really interested in proteins which are higher in TAP than Total.

comparison_result$ComparisonResult %>%
  filter(log2FC >0) %>%
  arrange(adj.pvalue) %>%
  head(n=20)

We can see that the most significant hits are proteins which appeared in one condition but not the other, so their log2FC is infinite. Let’s have a look at those which were measured in both.

comparison_result$ComparisonResult %>%
  filter(log2FC >0 & !is.infinite(log2FC)) %>%
  arrange(adj.pvalue) %>%
  head(n=20)
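To make the split between these two kinds of hit concrete, here is a minimal sketch on an invented table. The `toy_result` tibble and all of its values are made up for illustration; only the column names mirror the real `ComparisonResult`.

```r
library(dplyr)

# Toy stand-in for comparison_result$ComparisonResult (all values invented)
toy_result <- tibble(
  Protein    = c("A", "B", "C", "D"),
  log2FC     = c(Inf, 2.5, -Inf, 0.8),
  adj.pvalue = c(0, 0.001, 0, 0.2)
)

# Proteins with an infinite positive fold change (seen in only one condition)
toy_result %>%
  filter(log2FC > 0 & is.infinite(log2FC)) %>%
  pull(Protein) -> only_in_one

# Proteins with a finite positive fold change (measured in both conditions)
toy_result %>%
  filter(log2FC > 0 & !is.infinite(log2FC)) %>%
  arrange(adj.pvalue) %>%
  pull(Protein) -> measured_in_both
```

Keeping the two sets apart means the categorical hits can be reviewed separately from the hits with a measurable fold change.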

Volcano Plot

We can plot this out as a volcano plot.

comparison_result$ComparisonResult %>%
  filter(!is.na(log2FC) & !is.na(adj.pvalue)) %>%
  mutate(PHRED=-10*log10(adj.pvalue)) %>%
  mutate(PHRED=replace(PHRED,PHRED>55,55)) %>%
  mutate(log2FC = replace(log2FC, log2FC< -10, -10)) %>%
  mutate(log2FC = replace(log2FC, log2FC> 10, 10)) %>%
  mutate(significant = log2FC>0 & adj.pvalue<0.01) -> volcano_data

volcano_data %>%
  ggplot(aes(x=log2FC, y=PHRED, colour=significant, label=Protein)) +
  geom_jitter(show.legend = FALSE, width=0.2, height=1) +
  scale_colour_manual(values=c("grey","blue2")) +
  geom_vline(xintercept = 0, linewidth=1) +
  geom_text(data = volcano_data %>% filter(significant & log2FC > 4), colour="black", size=3, vjust=1.5)

These statistics give a uniformly high significance to proteins which are consistently observed in one condition but not the other. This is one of the major differences between this method and other statistical methods.
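The capping steps in the volcano code are what place those points on the edges of the plot. A small sketch with invented values shows the effect; the vectors below are hypothetical and use base R `pmin()` and `pmax()` rather than the `replace()` calls above, which should give the same result for non-missing values.

```r
# Invented adjusted p-values and fold changes, including the extremes
adj_p  <- c(0, 1e-8, 0.01, 0.5)
log2fc <- c(Inf, -12, 3, -0.5)

# -10 * log10(p): a p-value of 0.01 scores 20, a p-value of 0 scores Inf
phred <- -10 * log10(adj_p)

# Cap the score at 55 and the fold change at +/- 10
phred_capped  <- pmin(phred, 55)
log2fc_capped <- pmax(pmin(log2fc, 10), -10)
```

Any protein with an adjusted p-value of 0 or an infinite fold change ends up at the cap, which is why these hits line up along the plot boundary.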

Scatterplot

We can plot these out on the same scatterplot as before.

quantified_data$ProteinLevelData %>%
  group_by(Protein, GROUP) %>%
  summarise(
    LogIntensities = mean(LogIntensities, na.rm = TRUE)
  ) %>%
  pivot_wider(
    names_from=GROUP,
    values_from=LogIntensities,
    values_fill = 15
  ) %>%
  mutate(up_in_TAP = Protein %in% (comparison_result$ComparisonResult %>% filter(log2FC >0 & adj.pvalue < 0.01) %>% pull(Protein))) %>%
  arrange(up_in_TAP) %>%
  ggplot(aes(x=ProtTot, y=TAP, colour=up_in_TAP)) +
  geom_jitter(width=0.1, height=0.1, show.legend = FALSE) +
  geom_abline(slope=1, intercept = 0, colour="red", linewidth=1) +
  scale_colour_manual(values=c("grey","blue2")) 
## `summarise()` has grouped output by 'Protein'. You can override using the
## `.groups` argument.
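The `values_fill = 15` argument is what lets proteins missing from one group still appear on the scatterplot. A minimal sketch on invented data (the `toy_levels` tibble is hypothetical) shows what the reshaping does; adding `.groups = "drop"` to `summarise()` also silences the message shown above.

```r
library(dplyr)
library(tidyr)

# Invented protein level data: protein B was never measured in ProtTot
toy_levels <- tibble(
  Protein        = c("A", "A", "A", "A", "B", "B"),
  GROUP          = c("ProtTot", "ProtTot", "TAP", "TAP", "TAP", "TAP"),
  LogIntensities = c(24, 26, 27, 29, 21, 23)
)

toy_levels %>%
  group_by(Protein, GROUP) %>%
  summarise(
    LogIntensities = mean(LogIntensities, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  pivot_wider(
    names_from = GROUP,
    values_from = LogIntensities,
    values_fill = 15   # stand-in value for a group with no measurements
  ) -> toy_wide
```

Protein B gets the fill value of 15 for ProtTot, so it would be drawn at the low edge of the x axis rather than being dropped.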

Validating quantitative hits

If we wanted to make a strong claim about an individual protein then it would be good to review the information we have for it. Let’s look at a few of the top hits to see how well they behave across samples.

comparison_result$ComparisonResult %>%
  filter(log2FC >0 & !is.infinite(log2FC)) %>%
  arrange(adj.pvalue) %>%
  head(n=20) %>% 
  pull(Protein) %>%
  as.character() -> top_20_measured

top_20_measured
##  [1] "P22147"        "P53235"        "Q06218"        "Q06344"       
##  [5] "Q06631"        "P42846"        "Q12460"        "P38697"       
##  [9] "P25555"        "P04147"        "P25586"        "P41056"       
## [13] "P25644"        "Q12754"        "P24276"        "Q3E792;P0C0T4"
## [17] "P41819"        "Q03532"        "Q12024"        "P40070"

These are the top 20 hits which were measured in both conditions.

quantified_data$ProteinLevelData %>%
  filter(Protein %in% top_20_measured) %>%
  ggplot(aes(x=GROUP, y=LogIntensities, size=NumMeasuredFeature, colour=GROUP)) +
  geom_jitter(show.legend = FALSE) +
  stat_summary(geom="crossbar", colour="black", fun=mean, show.legend = FALSE) +
  facet_wrap(vars(Protein))
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

We can see a nice consistent separation of the two groups. We can see from the point sizes that there is variability in the number of peptides seen in each sample and that some samples are better observed than others.

Peptide level differences

The view we see above comes from the protein level aggregation, and each of those measurements derives from multiple peptide level measurements. We can go back to the peptides to see how well the evidence there stacks up.

Let’s look at one of the moderate hits, Q12460.

quantified_data$FeatureLevelData %>%
  filter(PROTEIN == "Q12460") %>%
  arrange(GROUP,originalRUN) %>%
  mutate(originalRUN=factor(originalRUN, levels=unique(originalRUN))) %>%
  ggplot(aes(x=originalRUN, y=newABUNDANCE, colour=GROUP, fill=GROUP, shape=censored, group=PEPTIDE)) +
  geom_line(linewidth=1)+
  geom_point(size=3, colour="black") +
  scale_shape_manual(values=c(21,24))

Now we can see the more detailed peptide level view of the data showing the degree of overall consistency. From the point shapes we can also see which values were actually measured (circles), and which were censored and imputed (triangles).

Heatmap

We can also look at the significant hits in a heatmap. Because of the clustering methods we can only use proteins with complete observations, i.e. no NA values, so we’d either need to substitute those for some arbitrarily low value, or we’d need to exclude them.

This is excluding them to just use the fully measured proteins.

quantified_data$ProteinLevelData %>%
  filter(Protein %in% (comparison_result$ComparisonResult %>% filter(adj.pvalue<0.01) %>% pull(Protein) )) %>%
  select(Protein,originalRUN,LogIntensities) %>%
  group_by(Protein) %>%
  filter(n()>=12) %>%
  pivot_wider(
    names_from=originalRUN,
    values_from=LogIntensities
  ) %>%
  column_to_rownames(var="Protein") %>%
  pheatmap(
    scale="row",
    show_rownames = FALSE,
    fontsize_col = 6,
    color = colorRampPalette(c("magenta2","black","green2"))(50),
    breaks=seq(from=-2,to=2,length.out=50)
  )
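The `filter(n()>=12)` step works because there are 12 runs in this experiment, so a protein with 12 rows has a value in every run. A small sketch on invented data shows the same idea without hard-coding the number of runs; `toy_levels` and `n_runs` are hypothetical names used only for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Invented data: protein B is missing from run2
toy_levels <- tibble(
  Protein        = c("A", "A", "A", "B", "B"),
  originalRUN    = c("run1", "run2", "run3", "run1", "run3"),
  LogIntensities = c(24, 25, 26, 21, 22)
)

# Number of runs a protein must appear in to count as complete
n_runs <- n_distinct(toy_levels$originalRUN)

toy_levels %>%
  group_by(Protein) %>%
  filter(n() >= n_runs) %>%   # keep only proteins present in every run
  ungroup() %>%
  pivot_wider(names_from = originalRUN, values_from = LogIntensities) %>%
  column_to_rownames(var = "Protein") %>%
  as.matrix() -> complete_matrix
```

The resulting matrix has no missing values, which is what the clustering in `pheatmap` needs.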

This is swapping them for a value of 15 (as we used previously).

quantified_data$ProteinLevelData %>%
  filter(Protein %in% (comparison_result$ComparisonResult %>% filter(adj.pvalue<0.01) %>% pull(Protein) )) %>%
  select(Protein,originalRUN,LogIntensities) %>%
  group_by(Protein) %>%
  pivot_wider(
    names_from=originalRUN,
    values_from=LogIntensities,
    values_fill = 15
  ) %>%
  column_to_rownames(var="Protein") %>%
  pheatmap(
    scale="row",
    show_rownames = FALSE,
    fontsize_col = 6,
    color = colorRampPalette(c("magenta2","black","green2"))(50),
    breaks=seq(from=-2,to=2,length.out=50)
  )

We can see that the missing values don’t always look great. There are several proteins which are present in only one replicate of the 6 available, and some which show relatively small increases.

Let’s look at these categorical hits more closely.

Categorical Hits

We can also look at the hits which were present in one condition and absent in the other. Seeing how strong that difference is lets us gauge how confident we should be that it reflects a real change in the underlying protein abundance.

comparison_result$ComparisonResult %>%
  filter(is.infinite(log2FC), log2FC>0)  %>%
  pull(Protein) %>%
  as.character() -> categorical_hits

categorical_hits
##  [1] "P24482" "P25515" "P32605" "P32639" "P32644" "P36527" "P40460" "P41903"
##  [9] "P47015" "P47100" "P47108" "P47130" "P53866" "P53953" "P53959" "P53971"
## [17] "Q00539" "Q00916" "Q03776" "Q04119" "Q04408" "Q04602" "Q04867" "Q05900"
## [25] "Q07508" "Q08492" "Q08925" "Q12525"

We can now see how high and consistent these are in the TAP condition, since we know that they will be absent in ProtTot.

quantified_data$ProteinLevelData %>%
  filter(Protein %in% categorical_hits) %>%
  filter(GROUP=="TAP") %>%
  ggplot(aes(x=Protein, y=LogIntensities)) +
  geom_jitter(colour="darkgrey", size=4, width=0.2) +
  stat_summary(geom="crossbar", fun.data=mean_se) +
  coord_flip()

We can see that a lot of the measurements here are near the limit of detection (which was around 20 in this data). Some of the proteins also show a lot of variability between the replicates which were measured, and some have replicates missing (there should be 6), so we’d need to be cautious about believing some of these candidates.
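One way to put numbers on this is to count how many replicates were measured for each hit and how variable they were. The sketch below uses an invented table (`toy_tap` is hypothetical, standing in for the TAP rows of the protein level data) rather than the real results.

```r
library(dplyr)

# Invented TAP measurements: A is seen in all 6 replicates, B in only 2
toy_tap <- tibble(
  Protein        = c("A", "A", "A", "A", "A", "A", "B", "B"),
  LogIntensities = c(24, 24.5, 23.8, 24.2, 24.1, 23.9, 20.5, 23)
)

toy_tap %>%
  group_by(Protein) %>%
  summarise(
    n_measured     = sum(!is.na(LogIntensities)),          # replicates observed
    mean_intensity = mean(LogIntensities, na.rm = TRUE),
    sd_intensity   = sd(LogIntensities, na.rm = TRUE),     # spread between replicates
    .groups = "drop"
  ) %>%
  arrange(desc(n_measured), desc(mean_intensity)) -> hit_support
```

Hits with few measured replicates, a low mean or a large spread would be the ones to treat with most caution.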

Renormalising

With this type of data it’s not obvious where the correct position for protein level normalisation should be. We have a few different normalisation options but most of them are based around matching some properties of the distributions. In our case we don’t necessarily expect the distributions to match nicely because of the TAP enrichment. Instead we can pull out genes in the central high intensity band and use those to direct the normalisation.
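The idea of normalising against a chosen set of proteins can be sketched on invented values. This is only an illustration of the principle (shift each run so that the median of the selected proteins matches across runs); the calculation MSstats actually performs may differ in detail, and `toy_runs`, `standards` and all of the numbers are hypothetical.

```r
library(dplyr)

# Invented log intensities for two runs; S1 and S2 are the chosen standards
toy_runs <- tibble(
  Run          = rep(c("run1", "run2"), each = 4),
  Protein      = rep(c("S1", "S2", "X", "Y"), times = 2),
  LogIntensity = c(25, 27, 20, 22, 26, 28, 24, 21)
)
standards <- c("S1", "S2")

# Per-run median of the standards, and the shift needed to align them
toy_runs %>%
  filter(Protein %in% standards) %>%
  group_by(Run) %>%
  summarise(standard_median = median(LogIntensity), .groups = "drop") %>%
  mutate(shift = mean(standard_median) - standard_median) -> run_shifts

# Apply the same shift to every protein in the run
toy_runs %>%
  left_join(run_shifts, by = "Run") %>%
  mutate(Normalised = LogIntensity + shift) -> toy_normalised
```

After the shift the standards have the same median in both runs, while the other proteins keep their offsets relative to the standards, which is the behaviour we want when the overall distributions are not expected to match.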

Selecting Genes

It looks like I can select genes with an intensity above 23 in both samples and use those for normalisation.

quantified_data$ProteinLevelData %>%
  group_by(Protein, GROUP) %>%
  summarise(
    LogIntensities = mean(LogIntensities, na.rm = TRUE)
  ) %>%
  pivot_wider(
    names_from=GROUP,
    values_from=LogIntensities,
    values_fill = 15
  ) %>%
  filter(ProtTot > 23 & TAP > 23) %>%
  pull(Protein) -> genes_to_normalise_with
## `summarise()` has grouped output by 'Protein'. You can override using the
## `.groups` argument.

Let’s have a look at those.

quantified_data$ProteinLevelData %>%
  group_by(Protein, GROUP) %>%
  summarise(
    LogIntensities = mean(LogIntensities, na.rm = TRUE)
  ) %>%
  pivot_wider(
    names_from=GROUP,
    values_from=LogIntensities,
    values_fill = 15
  ) %>%
  mutate(normalise = Protein %in% genes_to_normalise_with) %>%
  arrange(normalise) %>%
  ggplot(aes(x=ProtTot, y=TAP, colour=normalise)) +
  geom_jitter(width=0.1, height=0.1, show.legend = FALSE) +
  geom_abline(slope=1, intercept = 0, colour="red", linewidth=1) +
  scale_colour_manual(values=c("grey","blue2")) 
## `summarise()` has grouped output by 'Protein'. You can override using the
## `.groups` argument.

That doesn’t look too bad.

Re-run normalisation

dataProcess(
  raw_data,
  normalization = "globalStandards",
  nameStandards = genes_to_normalise_with
) -> quantified_data2
## INFO  [2024-09-26 15:53:51] ** Log2 intensities under cutoff = 15.292  were considered as censored missing values.
## INFO  [2024-09-26 15:53:51] ** Log2 intensities = NA were considered as censored missing values.
## INFO  [2024-09-26 15:53:51] ** Use all features that the dataset originally has.
## INFO  [2024-09-26 15:53:51] 
##  # proteins: 1681
##  # peptides per protein: 1-145
##  # features per peptide: 1-1
## INFO  [2024-09-26 15:53:51] Some proteins have only one feature: 
##  D6VTK4,
##  O74700,
##  O94742,
##  P00044;P00045,
##  P04037 ...
## INFO  [2024-09-26 15:53:51] 
##                     ProtTot TAP
##              # runs       6   6
##     # bioreplicates       6   6
##  # tech. replicates       1   1
## INFO  [2024-09-26 15:53:52] Some features are completely missing in at least one condition:  
##  NQFYQLPTPTSSK_2_NA_NA,
##  AEASGEAADEADEADEE_2_NA_NA,
##  INEKPTVVNDYEAAR_2_NA_NA,
##  LRGNNIGSPLGAPK_2_NA_NA,
##  DGEVIANIIGEAK_2_NA_NA ...
## INFO  [2024-09-26 15:53:52]  == Start the summarization per subplot...
                                                          |==========================================                            |  59%  |                                                                              |==========================================                            |  60%  |                                                                              |==========================================                            |  61%  |                                                                              |===========================================                           |  61%  |                                                                              |===========================================                           |  62%  |                                                                              |============================================                          |  62%  |                                                                              |============================================                          |  63%  |                                                                              |============================================                          |  64%  |                                                                              |=============================================                         |  64%  |                                                                              |=============================================                         |  65%  |                                                                              |==============================================                        |  65%  |                                                                              |==============================================                        |  66%  |                                                                              
|===============================================                       |  66%  |                                                                              |===============================================                       |  67%  |                                                                              |===============================================                       |  68%  |                                                                              |================================================                      |  68%  |                                                                              |================================================                      |  69%  |                                                                              |=================================================                     |  69%  |                                                                              |=================================================                     |  70%  |                                                                              |=================================================                     |  71%  |                                                                              |==================================================                    |  71%  |                                                                              |==================================================                    |  72%  |                                                                              |===================================================                   |  72%  |                                                                              |===================================================                   |  73%  |                                                                              |===================================================                   |  74%  |                        
                                                      |====================================================                  |  74%  |                                                                              |====================================================                  |  75%  |                                                                              |=====================================================                 |  75%  |                                                                              |=====================================================                 |  76%  |                                                                              |======================================================                |  76%  |                                                                              |======================================================                |  77%  |                                                                              |======================================================                |  78%  |                                                                              |=======================================================               |  78%  |                                                                              |=======================================================               |  79%  |                                                                              |========================================================              |  79%  |                                                                              |========================================================              |  80%  |                                                                              |========================================================              |  81%  |                                                                              
|=========================================================             |  81%  |                                                                              |=========================================================             |  82%  |                                                                              |==========================================================            |  82%  |                                                                              |==========================================================            |  83%  |                                                                              |==========================================================            |  84%  |                                                                              |===========================================================           |  84%  |                                                                              |===========================================================           |  85%  |                                                                              |============================================================          |  85%  |                                                                              |============================================================          |  86%  |                                                                              |=============================================================         |  86%  |                                                                              |=============================================================         |  87%  |                                                                              |=============================================================         |  88%  |                                                                              |==============================================================        |  88%  |                        
                                                      |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              
|===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
## INFO  [2024-09-26 15:54:38]  == Summarization is done.

Let’s check the re-normalised data.

quantified_data2$ProteinLevelData %>%
  ggplot(aes(x=LogIntensities, colour=GROUP, group=originalRUN)) +
  geom_density(linewidth=1) 

From a distribution point of view this actually looks worse, but the patterning we can observe in the scatterplot makes more sense.
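To put some numbers alongside the density plot, one option is to summarise the protein-level log intensities per run. The following is a minimal sketch rather than part of the MSstats workflow: `summarise_runs` is a hypothetical helper name, the toy tibble contains made-up values purely to show the expected shape, and it assumes the `GROUP`, `originalRUN` and `LogIntensities` columns used in the plot above.

```r
library(dplyr)

# Hypothetical helper (not an MSstats function): per-run summary of
# protein-level log intensities, grouped the same way as the density plot.
summarise_runs <- function(protein_level_data) {
  protein_level_data %>%
    group_by(GROUP, originalRUN) %>%
    summarise(
      n_values   = sum(!is.na(LogIntensities)),          # quantified proteins
      median_log = median(LogIntensities, na.rm = TRUE), # centre of the run
      iqr_log    = IQR(LogIntensities, na.rm = TRUE),    # spread of the run
      .groups = "drop"
    )
}

# Toy input with made-up values, only to illustrate the input and output shape
toy <- tibble(
  GROUP          = c("A", "A", "B", "B"),
  originalRUN    = c("run1", "run1", "run2", "run2"),
  LogIntensities = c(10, 12, 20, NA)
)

toy_summary <- summarise_runs(toy)
toy_summary

# In this analysis the call would be:
# summarise_runs(quantified_data2$ProteinLevelData)
```

If the medians differ strongly between runs within the same group, that would be consistent with the densities looking less well aligned after re-normalisation.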

Rerun comparison

groupComparison(contrast.matrix = contrasts, data=quantified_data2) -> comparison_result2
## INFO  [2024-09-26 15:54:39]  == Start to test and get inference in whole plot ...
                                                      |==============================================================        |  89%  |                                                                              |===============================================================       |  89%  |                                                                              |===============================================================       |  90%  |                                                                              |===============================================================       |  91%  |                                                                              |================================================================      |  91%  |                                                                              |================================================================      |  92%  |                                                                              |=================================================================     |  92%  |                                                                              |=================================================================     |  93%  |                                                                              |=================================================================     |  94%  |                                                                              |==================================================================    |  94%  |                                                                              |==================================================================    |  95%  |                                                                              |===================================================================   |  95%  |                                                                              
|===================================================================   |  96%  |                                                                              |====================================================================  |  96%  |                                                                              |====================================================================  |  97%  |                                                                              |====================================================================  |  98%  |                                                                              |===================================================================== |  98%  |                                                                              |===================================================================== |  99%  |                                                                              |======================================================================|  99%  |                                                                              |======================================================================| 100%
## INFO  [2024-09-26 15:55:17]  == Comparisons for all proteins are done.

Replotting

# Proteins significantly up in the pull down (TAP) after the modified normalisation
comparison_result2$ComparisonResult %>%
  filter(log2FC > 0 & adj.pvalue < 0.01) %>%
  pull(Protein) -> tap_hits

# Mean log intensity per protein in each group, one column per group
quantified_data2$ProteinLevelData %>%
  group_by(Protein, GROUP) %>%
  summarise(
    LogIntensities = mean(LogIntensities, na.rm = TRUE)
  ) %>%
  pivot_wider(
    names_from = GROUP,
    values_from = LogIntensities,
    values_fill = 15   # proteins missing from a group get a floor value of 15
  ) %>%
  mutate(up_in_TAP = Protein %in% tap_hits) %>%
  arrange(up_in_TAP) %>%   # draw the hits last so they sit on top
  ggplot(aes(x = ProtTot, y = TAP, colour = up_in_TAP)) +
  geom_jitter(width = 0.1, height = 0.1, show.legend = FALSE) +
  geom_abline(slope = 1, intercept = 0, colour = "red", linewidth = 1) +
  scale_colour_manual(values = c("grey", "blue2"))
## `summarise()` has grouped output by 'Protein'. You can override using the
## `.groups` argument.
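
To put a number on the highlighted points rather than judging them by eye, a small helper can count the proteins which pass the same thresholds used to colour the plot. This is only a sketch: `count_up_hits` is a made-up helper, not part of MSstats, and the toy table below contains invented values with just the three columns the helper needs.

```r
library(dplyr)

# Hypothetical helper (not an MSstats function): count the proteins which
# pass the log2FC and adjusted p-value thresholds used in the plot.
count_up_hits <- function(result_table, fc_cutoff = 0, p_cutoff = 0.01) {
  result_table %>%
    filter(log2FC > fc_cutoff & adj.pvalue < p_cutoff) %>%
    distinct(Protein) %>%
    nrow()
}

# Toy comparison table with invented values. Rows with a missing
# adj.pvalue are dropped by filter(), just as in the plotting code.
toy_result <- data.frame(
  Protein    = c("P1", "P2", "P3", "P4"),
  log2FC     = c(2.1, -1.5, 0.8, 3.0),
  adj.pvalue = c(0.001, 0.0005, 0.2, NA)
)

toy_hits <- count_up_hits(toy_result)
```

On the real data the same call would be `count_up_hits(comparison_result2$ComparisonResult)`, and running it on the result from the original normalisation as well would give a direct comparison of the two hit lists.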

We can see that the modified normalisation gives us a smaller collection of hits. The data has shifted to centre on the genes we selected for normalisation. It’s not clear which result is more correct, since we have no absolute measurements, but this method lets us rerun the normalisation with any selection of genes we care to pick.
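
To see why the whole cloud of points moves, here is a toy sketch of the idea behind normalising to a chosen set of proteins: each run is shifted so that the median log intensity of the chosen standards is the same in every run. The data and the names `toy`, `standards` and `run_shifts` are all made up, and this illustrates the principle only; it is not a reproduction of the MSstats internals.

```r
library(dplyr)

# Invented log2 intensities for three proteins in two runs
toy <- data.frame(
  Run          = rep(c("run1", "run2"), each = 3),
  Protein      = rep(c("STD1", "STD2", "OTHER"), times = 2),
  LogIntensity = c(20, 22, 25, 23, 25, 26)
)

# Hypothetical selection of proteins to normalise against
standards <- c("STD1", "STD2")

# Median of the standards in each run, and the shift needed to bring
# each run's median to the mean of those medians
toy %>%
  filter(Protein %in% standards) %>%
  group_by(Run) %>%
  summarise(std_median = median(LogIntensity), .groups = "drop") %>%
  mutate(shift = mean(std_median) - std_median) -> run_shifts

# Apply the per-run shift to every protein in that run
toy %>%
  left_join(run_shifts, by = "Run") %>%
  mutate(Normalised = LogIntensity + shift) %>%
  select(Run, Protein, LogIntensity, Normalised) -> toy_normalised

toy_normalised
```

After the shift the two standards have identical values in both runs, and the apparent change in `OTHER` between runs flips direction. This is the same effect we see in the plot: the choice of standards decides where the centre of the data sits. In MSstats itself this style of normalisation is available through `dataProcess()` with `normalization = "globalStandards"` and the chosen names passed in `nameStandards`.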