Plot 2D scatter plots comparing gene expression of two genes

mmccoy · July 17, 2022, 7:13pm

Sometimes you want to know how the expression of two genes relate to each other. Here is some code to plot a 2D scatter plot:

# Create a variable with your favorite gene
my_favorite_genes <- c("ENSMUSG00000038192","ENSMUSG00000047139")
my_favorite_celltype <- "Def. endoderm"

# Plot
logcounts(sce)[my_favorite_genes,sce$celltype %in% my_favorite_celltype] %>%
  t() %>%
  as.matrix() %>%
  as_tibble() %>%
  rename(gene1 = 1, gene2 = 2) %>%
  ggplot(aes(x = gene1, y = gene2)) +
  geom_point() +
  theme_bw() +
  xlab(paste(my_favorite_genes[1], "log(counts)")) +
  ylab(paste(my_favorite_genes[2], "log(counts)")) +
  ggtitle(my_favorite_celltype)

To plot a linear regression on top, you can add one with geom_smooth(method = “lm”):

# Plot with linear regression line
logcounts(sce)[my_favorite_genes,sce$celltype %in% my_favorite_celltype] %>%
  t() %>%
  as.matrix() %>%
  as_tibble() %>%
  rename(gene1 = 1, gene2 = 2) %>%
  ggplot(aes(x = gene1, y = gene2)) +
  geom_point() +
  geom_smooth(method = "lm", col = "firebrick") +
  theme_bw() +
  xlab(paste(my_favorite_genes[1], "log(counts)")) +
  ylab(paste(my_favorite_genes[2], "log(counts)")) +
  ggtitle(my_favorite_celltype)

Together with the correlatePairs() function, which I posted about recently (Finding coexpressed genes in specific cell types), you can use the p value and FDR to get a sense of whether your genes are significantly correlated, and then the rho value to see if they are positively or negatively correlated.

We can see from the above plots that these two genes are positively correlated. The output from correlatePairs() confirms this:

> cor.pairs <- correlatePairs(x = sce_subset, subset.row = of.interest)
> cor.pairs %>% as_tibble() %>% filter(gene1 %in% my_favorite_genes, gene2 %in% my_favorite_genes)
# A tibble: 1 × 5
  gene1              gene2                rho  p.value      FDR
  <chr>              <chr>              <dbl>    <dbl>    <dbl>
1 ENSMUSG00000038192 ENSMUSG00000047139 0.860 7.97e-30 1.59e-25

@yanitrevino

yanitrevino · July 18, 2022, 8:08pm

Hey @mmccoy so I tried running it to find the rho for Wnt7B(ENSMUSG00000022382) and c-jun(ENSMUSG00000052684), however the two genes I looked for didn’t pop up. What should I do?

yanethp · July 19, 2022, 3:19am

I’m having the same issue as Yani ^.

mmccoy · July 19, 2022, 3:44am

Hi @yanethp and @yanitrevino , your genes are in the cor.pairs object you made, you just have to search for them. I edited the code above to show you how to filter the cor.pairs object to only show your two genes of interest:

cor.pairs %>% as_tibble() %>% filter(gene1 %in% my_favorite_genes, gene2 %in% my_favorite_genes)

yanitrevino · July 19, 2022, 6:08am

Okay thank you I got it working!
@yanethp this is how I got it to work:
cor.pairs ← correlatePairs(x = sce_subset, subset.row = of.interest)
cor.pairs %>% as_tibble() %>% filter(gene1 %in% my_favorite_gene, gene2 %in% my_favorite_gene)

A tibble: 1 × 5

ShamounDivanyan · July 19, 2022, 6:47am

How should I alter this code to show a violin plot of gene expression relative the embryonic stages, for all cell types, would this even be possible ?

LindsayHayes · July 19, 2022, 2:57pm

Plotting the gastrulation across modalities using facet_wrap()

gene_name <- "Sox17"
gene_id <- rowData(sce)[which(rowData(sce)[,2] %in% gene_name),1]

# favorite plot to combine both stage and cell type
plotExpression(sce, features = gene_id, colour_by = "celltype", x = "celltype") + theme(axis.text.x = element_text(angle = 90), legend.position = "none") + facet_wrap( ~colData(sce)$stage ) + labs(subtitle = gene_name)

ShamounDivanyan · July 20, 2022, 5:22am

Whenever I run this code with multiple gene names, it would only provide the plots for the very first gene, How could I tweak this so that it would show every single gene with their own individual plots, or would I have to run the code 9 times with each gene

LindsayHayes · July 20, 2022, 12:47pm

You have to plot each one individually.

yanethp · July 20, 2022, 3:35pm

I’m still ruining into errors.

mmccoy · July 20, 2022, 8:25pm

Hi @yanethp , your variable is named “my_favorite_gene”, not “my_favorite_genes”. They need to match.

Topic		Replies	Views
Gene_A x Gene_B Plot Code Cookbook	0	269	April 11, 2022
Rstudio Poster Project Help Help project-work , R-Notebooks	3	477	April 19, 2022
Gene Coexpression Code Cookbook	0	272	July 7, 2022
Finding coexpressed genes in specific cell types Code Cookbook	4	316	July 17, 2022
Add gene SYMBOL to your plots Code Cookbook	0	361	February 18, 2022

Plot 2D scatter plots comparing gene expression of two genes

A tibble: 1 × 5

Related topics