How to navigate chapter 7 in miniCURE-BioDIGS

I was trying the test version of miniCURE for BioDIGS when I got stuck in Chapter 7 (Kickstart Project Work). I am having trouble calculating summary statistics with NanoPlot in Activity 2 – Adjust for PacBio HiFi.
I think I cannot locate the required tool. Please let me know how I can proceed with Activity 2.
Thanks!!

Hi,

The tool you need is called NanoPlot, and it takes FASTQ files as input. If you still have trouble running it after locating the tool, we will need more information to troubleshoot.
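
For orientation, here is a minimal Python sketch of the kind of summary statistics a read-QC tool reports from a FASTQ file (read count, total bases, mean read length, N50). It is not NanoPlot's implementation, and NanoPlot reports many more metrics; the function names and the tiny in-memory example are made up for illustration, and the sketch assumes the plain four-lines-per-record FASTQ layout.

```python
import io


def read_lengths(handle):
    """Yield the sequence length of each record in a 4-line-per-record FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of every record holds the bases
            yield len(line.strip())


def n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


def summarize(handle):
    """Collect a few basic summary statistics from a FASTQ stream."""
    lengths = list(read_lengths(handle))
    total = sum(lengths)
    return {
        "reads": len(lengths),
        "bases": total,
        "mean_length": total / len(lengths) if lengths else 0.0,
        "n50": n50(lengths),
    }


# Hypothetical three-read example; a real file would be opened with open(path).
example = io.StringIO("@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACGT\n+\nIIII\n@r3\nAC\n+\nII\n")
stats = summarize(example)
print(stats)
```

In Galaxy you do not need any of this; running NanoPlot on the FASTQ dataset produces these numbers, along with plots, for you.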

To start with, we need to know which dataset you are using. The easiest way for us to help is if you share a link to your Galaxy history (your working directory): a) open the History options menu (the small tab with three horizontal lines), b) select Share or Publish, c) make the history accessible, and d) copy the link and paste it here. Thanks and good luck!


Hi!
Here is the link: Galaxy
I tried to play around with it a bit more. I was able to run NanoPlot on the BioDIGS pilot dataset.
The ABRicate tool supports FASTA format, whereas I have FASTQ. I don’t really know how to proceed with it.
Also, using any other datasets from the NCBI Sequence Read Archive with NanoPlot or ABRicate is confusing, mainly because I don’t understand what the first step after importing a new dataset is.
Thanks,
Gauri

Hi Gauri,

Here are a few notes to help you proceed:

  1. In general, since the BioDIGS pilot dataset is part of the Kickstart Project, refer to the miniCURE-BioDIGS instructions in Chapter 7: Chapter 7 Kickstart Project Work | BioDIGS miniCURE

  2. The BioDIGS pilot is expected to be a PacBio dataset. To adjust for that, simply skip the trimming and QC steps.

  3. You are correct: ABRicate takes FASTA format, not FASTQ. If you look up the ABRicate tool on Galaxy, the description says that ABRicate is a tool for “Mass screening of contigs for antimicrobial and virulence genes”. The key word here is ‘contigs’, which implies that ABRicate takes a contig file in FASTA format as input.

To get the contigs file, you first need to assemble the contigs from your FASTQ reads. So, before running ABRicate, do this:

a) In Galaxy, assemble the contigs from the FASTQ file by running Flye, which is a de novo assembler for single-molecule sequencing reads.
i) For Mode, select PacBio HiFi (--pacbio-hifi)
ii) Set the Perform Metagenomics assembly option to Yes

b) Once Flye completes the contig assembly, it will output a contig file in FASTA format. Use that contig FASTA file (which will be called flye_consensus_fasta) as the input for ABRicate, then run ABRicate.
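
If it helps to see the same two steps outside the Galaxy interface, here is a rough Python sketch that only builds and prints the equivalent command lines, without running anything. The file and directory names are hypothetical. It assumes Flye and ABRicate would be installed locally, and that Flye writes its contigs to assembly.fasta inside the output directory, as it does on the command line. The small sniff_format helper just illustrates the FASTQ versus FASTA distinction from point 3 by looking at the first character of a file.

```python
import shlex


def sniff_format(first_line):
    """Guess the sequence format from the first line: '>' starts FASTA, '@' starts FASTQ."""
    if first_line.startswith(">"):
        return "fasta"
    if first_line.startswith("@"):
        return "fastq"
    return "unknown"


def flye_command(reads, out_dir, threads=4):
    """Build a Flye call for PacBio HiFi reads in metagenome mode."""
    return [
        "flye",
        "--pacbio-hifi", reads,   # same choice as Mode = PacBio HiFi in Galaxy
        "--meta",                 # same choice as Perform Metagenomics assembly = Yes
        "--out-dir", out_dir,
        "--threads", str(threads),
    ]


def abricate_command(contigs):
    """Build an ABRicate call that screens an assembled contig FASTA file."""
    return ["abricate", contigs]


reads = "biodigs_reads.fastq"   # hypothetical file name
out_dir = "flye_out"            # hypothetical output directory

print(sniff_format("@read_1"))  # reads are FASTQ, so they must be assembled first
print(shlex.join(flye_command(reads, out_dir)))
print(shlex.join(abricate_command(out_dir + "/assembly.fasta")))
```

The order is the important part: the FASTQ reads go into Flye, and only the FASTA contigs that Flye produces go into ABRicate.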

Let us know if you succeed or if you need more help. Good luck!

Thank you so much! I will try this out.