How I perform Genome Mapping using BWA | Mapping any Reads to a Reference Genome | Paired-end Reads

Science and Technology

This tutorial shows you how to perform genome mapping using the Burrows-Wheeler Aligner (BWA) and how to process the output BAM file using samtools.
Script and data can be downloaded from here: / 75872047
Dealing with multiple samples: / genome-mapping-77559127
Support my work
www.buymeacoffee.com/informat...
www.paypal.com/paypalme/thein...
/ bigdataanalytics
One-on-One coaching (Video Conferencing)
calendly.com/bioinformaticscoach
One-on-One coaching(Audio Call)
clarity.fm/vincentappiah
Get more bioinformatics tutorials on Patreon
/ bigdataanalytics
Subscribe to my channels
Bioinformatics: / @bioinformaticscoach
Data Science: / @datasciencecoach
Short Clips: / @bioinformaticsclips
Reach out
bioinformaticscoach@gmail.com
Commands used in this tutorial
bwa index
bwa mem
samtools view
samtools sort
samtools index
samtools flagstat
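The commands listed above can be chained roughly as follows. This is a minimal sketch and not the script from the video: the function name map_paired_reads, the file names in the example call, and the default of 8 threads are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: file names and the function name are placeholders.

# Usage: map_paired_reads REF R1 R2 PREFIX [THREADS]
map_paired_reads() {
    local ref=$1 r1=$2 r2=$3 prefix=$4 threads=${5:-8}

    # 1. Index the reference once; bwa index writes REF.bwt among other files.
    if [ ! -f "$ref.bwt" ]; then
        bwa index "$ref" || return 1
    fi

    # 2. Align the paired-end reads; bwa mem writes SAM to standard output.
    bwa mem -t "$threads" "$ref" "$r1" "$r2" > "$prefix.sam" || return 1

    # 3. Convert SAM to BAM (-b selects BAM output).
    samtools view -b -o "$prefix.bam" "$prefix.sam" || return 1

    # 4. Sort by coordinate, which samtools index requires.
    samtools sort -o "$prefix.sorted.bam" "$prefix.bam" || return 1

    # 5. Index the sorted BAM (creates PREFIX.sorted.bam.bai).
    samtools index "$prefix.sorted.bam" || return 1

    # 6. Print summary counts (total, mapped, properly paired, ...).
    samtools flagstat "$prefix.sorted.bam"
}

# Example call (placeholder file names):
# map_paired_reads reference.fasta reads_1.fastq.gz reads_2.fastq.gz sample 8
```

Keeping each step as a separate command mirrors the order of the list; the intermediate SAM and unsorted BAM files can be deleted once the sorted BAM has been indexed.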
Course materials
reference genome: www.ncbi.nlm.nih.gov/nuccore/...
source of data: www.ncbi.nlm.nih.gov/pmc/arti...
sra-database: trace.ncbi.nlm.nih.gov/Traces...
download link for read 1: ftp.sra.ebi.ac.uk/vol1/run/ERR...
download link for read 2: ftp.sra.ebi.ac.uk/vol1/run/ERR...
BWA home page
github.com/lh3/bwa
bio-bwa.sourceforge.net/
samtools
github.com/bahlolab/bioinfoto...
www.htslib.org/doc/samtools-fl...
How to install BWA
build from source: • Bioinformatics Genome ...
use binaries: • Bioinformatics Tools |...
use conda: • Bioinformatics Tools |...
How to install samtools
binaries: • Bioinformatics Tools |...
build from source: • Bioinformatics Tools |...
use conda: • Bioinformatics Tools |...
Chapters
00:00 Intro
00:13 PC Requirement
00:43 Data Description
00:58 Download data
01:33 Open terminal
01:42 Create a working directory
01:57 Download data
03:19 Download the reference sequence
04:27 Index the reference sequence
05:50 Map reads to the reference sequence
13:01 Index the bam file
14:11 Check the mapping statistics
#bioinformatics #bioinformática #bioinformaticsforbeginners #genomics
DNA Icon by flaticon

Comments: 18

@lakshmipraba4398 1 year ago
It's really helpful for my project. Thanks a lot 🙏
@dr.maqsoodahmad8572 1 year ago
Very helpful Thanks
@kiplimosimon1429 2 months ago
very informative. Thank you
@serychristianrenaud 2 years ago
thank
@islamicmotivational_1786 2 years ago
Hello, the video is very helpful. Can you share a video on the same concept but using Python? I would be very grateful. Thanks
@desaishailesh3527 1 year ago
This is super, and thanks for making it. A question I had after the alignment: if you want to remove the non-aligned sequences (to reduce the file size, or because the unaligned data is of no use), how can we keep only the aligned data?
@bioinformaticscoach
1 year ago
You can use samtools to filter the alignment records, e.g. select reads with mapped mate pairs, reads with a particular mapping quality, etc. Alternatively, you can book a session with me and we can go through the steps.
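For anyone following along, here is a minimal sketch of that kind of filter. It is an assumption about what such a step could look like, not the author's script; the function name keep_mapped_pairs and the file names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function and file names are hypothetical.

# Usage: keep_mapped_pairs IN_BAM OUT_BAM [MIN_MAPQ]
keep_mapped_pairs() {
    local in_bam=$1 out_bam=$2 min_mapq=${3:-0}
    # -F 12 drops records with flag bit 4 (read unmapped) or 8 (mate unmapped) set.
    # -q keeps only records whose mapping quality is at least MIN_MAPQ.
    samtools view -b -F 12 -q "$min_mapq" -o "$out_bam" "$in_bam"
}

# Example: keep pairs where both mates mapped, with mapping quality >= 30
# keep_mapped_pairs sample.sorted.bam sample.filtered.bam 30
```

Using -F 4 instead of -F 12 would keep every mapped read regardless of whether its mate mapped.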
@desaishailesh3527
1 year ago
@bioinformaticscoach Thank you very much, I will try it
@alexandranikolaeva876 1 year ago
This is super helpful, thank you! I am wondering what are the benefits of converting to BAM besides the fact that it is a smaller file. Do you lose any information when you convert?
@bioinformaticscoach
1 year ago
You don't lose any information at all. But I recommend using BAM files, especially when dealing with lots of samples. Smaller files also mean fewer computational constraints.
@alexandranikolaeva876
1 year ago
@bioinformaticscoach Awesome, thank you! Your videos are super helpful! I am using BAM files in a downstream application to check the ploidy of my samples, and this is the most straightforward explanation of alignment that I have been able to find so far!
@bioinformaticscoach
1 year ago
@alexandranikolaeva876 It is also important to filter the alignment records in the BAM files. I am working on a tutorial on that, which will be released next month. You can use tools like samtools and bamtools.
@alexandranikolaeva876
1 year ago
@bioinformaticscoach That's really awesome, thanks! One more question: in the bwa mem command, how much does the number of threads (8) matter? My file has been running for 2 hours now, but I have a very big genome (27 gb). Do you think I can decrease that number without loss of information?
@bioinformaticscoach
1 year ago
@alexandranikolaeva876 Genome mapping is computationally expensive, so you need a high-end PC or, better still, a compute server. You can then increase the number of threads to make BWA run faster.
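A small sketch of how the thread count for bwa mem could be matched to the machine instead of being hard-coded. The helper name pick_threads and the file names in the example are hypothetical, and the sketch says nothing about how long a particular genome will take.

```shell
#!/usr/bin/env bash
# Sketch only: pick_threads is a hypothetical helper, not part of bwa.

# Usage: pick_threads [CAP]
# Prints the number of online CPU cores, limited to CAP when CAP > 0.
pick_threads() {
    local cap=${1:-0} cores
    # getconf reports the online core count on Linux and macOS; fall back to 1.
    cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=1
    if [ "$cap" -gt 0 ] && [ "$cores" -gt "$cap" ]; then
        cores=$cap
    fi
    echo "$cores"
}

# Example (placeholder file names): use every core, but never more than 8
# bwa mem -t "$(pick_threads 8)" reference.fasta reads_1.fastq.gz reads_2.fastq.gz > sample.sam
```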
@user-tm9di3ot3x 11 months ago
It's really helpful. I have 34 files of R1 Illumina reads and 34 files of R2 Illumina reads. Since it is not easy to write them all out in one command like bwa mem -t 8 ...., could you help me write a script to merge them all?
@bioinformaticscoach
11 months ago
Yes. That can be done. You can book a session and I will work on it.
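One possible shape for such a script, offered as a rough sketch rather than the author's solution. It maps each R1/R2 pair separately instead of merging the files, and it assumes a hypothetical naming scheme in which the two files of a pair differ only by _R1 and _R2 and end in .fastq.gz. The function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes pairs are named like sample_R1*.fastq.gz / sample_R2*.fastq.gz.

# Usage: mate_of R1_FILE
# Prints the matching R2 file name by swapping the last _R1 for _R2.
mate_of() {
    printf '%s\n' "${1%_R1*}_R2${1##*_R1}"
}

# Usage: map_all_pairs REF DIR [THREADS]
# Runs bwa mem on every pair in DIR and writes SAMPLE.sam to the current directory.
map_all_pairs() {
    local ref=$1 dir=$2 threads=${3:-8} r1 r2 sample
    for r1 in "$dir"/*_R1*.fastq.gz; do
        [ -e "$r1" ] || continue   # nothing matched, the pattern stayed literal
        r2=$(mate_of "$r1")
        [ -e "$r2" ] || { echo "missing mate for $r1" >&2; return 1; }
        sample=$(basename "${r1%_R1*}")
        bwa mem -t "$threads" "$ref" "$r1" "$r2" > "$sample.sam" || return 1
    done
}

# Example (placeholder names): map_all_pairs reference.fasta ./reads 8
```

If the 34 files are lanes of a single sample rather than separate samples, the per-pair outputs would still need to be combined afterwards.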
@erindarruci1085
1 month ago
Hi, did you perform any trimming, or are you using the raw reads? Thank you!!
@Likitu264 10 months ago
What is the best software to align a file that contains many sequences? I'm going to be working with data from an Illumina NGS sequencing panel. It covers about 100 human genes, coding regions only plus 20 bp of the flanking introns.