How I perform Genome Mapping using BWA | Mapping any Reads to a Reference Genome | Paired-end Reads
Science and Technology
This tutorial shows you how to perform genome mapping using the Burrows-Wheeler Aligner (BWA) and process the output BAM file using samtools.
Script and data can be downloaded from here: / 75872047
Dealing with multiple samples: / genome-mapping-77559127
Support my work
www.buymeacoffee.com/informat...
www.paypal.com/paypalme/thein...
/ bigdataanalytics
One-on-One coaching (Video Conferencing)
calendly.com/bioinformaticscoach
One-on-One coaching (Audio Call)
clarity.fm/vincentappiah
Get more bioinformatics tutorials on Patreon
/ bigdataanalytics
Subscribe to my channels
Bioinformatics: / @bioinformaticscoach
Data Science: / @datasciencecoach
Short Clips: / @bioinformaticsclips
Reach out
bioinformaticscoach@gmail.com
Commands used in this tutorial
bwa index
bwa mem
samtools view
samtools sort
samtools index
samtools flagstat
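The commands listed above can be chained roughly as follows. This is a minimal sketch, not the script from the video: the function name `map_paired_reads`, its argument order, and the output file naming are assumptions made for illustration.

```shell
# Sketch only: function name, arguments and output names are hypothetical.
# Usage: map_paired_reads <reference.fasta> <read1.fastq.gz> <read2.fastq.gz> <prefix> [threads]
map_paired_reads() {
  ref=$1; r1=$2; r2=$3; prefix=$4; threads=${5:-8}

  # Build the BWA index for the reference (needed once per reference).
  bwa index "$ref" || return 1

  # Align the paired-end reads; bwa mem writes SAM to standard output.
  bwa mem -t "$threads" "$ref" "$r1" "$r2" > "$prefix.sam" || return 1

  # Convert SAM to BAM, then sort the BAM by coordinate.
  samtools view -b -o "$prefix.bam" "$prefix.sam" || return 1
  samtools sort -@ "$threads" -o "$prefix.sorted.bam" "$prefix.bam" || return 1

  # Index the sorted BAM and print the mapping statistics.
  samtools index "$prefix.sorted.bam" || return 1
  samtools flagstat "$prefix.sorted.bam"
}
```

Nothing runs until the function is called, for example `map_paired_reads reference.fasta reads_1.fastq.gz reads_2.fastq.gz sample 8`, where all four names are placeholders for your own files.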
Course materials
reference genome: www.ncbi.nlm.nih.gov/nuccore/...
source of data: www.ncbi.nlm.nih.gov/pmc/arti...
sra-database: trace.ncbi.nlm.nih.gov/Traces...
download link for read 1: ftp.sra.ebi.ac.uk/vol1/run/ERR...
download link for read 2: ftp.sra.ebi.ac.uk/vol1/run/ERR...
BWA home page
github.com/lh3/bwa
bio-bwa.sourceforge.net/
samtools
github.com/bahlolab/bioinfoto...
www.htslib.org/doc/samtools-fl...
How to install BWA
build from source: • Bioinformatics Genome ...
use binaries: • Bioinformatics Tools |...
use conda: • Bioinformatics Tools |...
How to install samtools
binaries: • Bioinformatics Tools |...
build from source: • Bioinformatics Tools |...
use conda: • Bioinformatics Tools |...
Chapters
00:00 Intro
00:13 PC Requirement
00:43 Data Description
00:58 Download data
01:33 Open terminal
01:42 Create a working directory
01:57 Download data
03:19 Download the reference sequence
04:27 Index the reference sequence
05:50 Map reads to the reference sequence
13:01 Index the bam file
14:11 Check the mapping statistics
#bioinformatics #bioinformática #bioinformaticsforbeginners #genomics
DNA Icon by flaticon
Comments: 18
It's really helpful for my project. Thanks a lot 🙏
Very helpful Thanks
very informative. Thank you
Thanks
Hello, the video is very helpful. Could you share a video on the same concept but using Python? I would be very grateful. Thanks.
This is super, and thanks for making it. A question I had: after alignment, if you want to remove the non-aligned sequences (to reduce the file size, since the unaligned data may be of no use), how can we keep only the aligned data?
@bioinformaticscoach
A year ago
You can use samtools to filter the alignment records, e.g. select reads with mapped mate pairs, reads with a particular mapping quality, etc. Alternatively, you can book a session with me and we can go through the steps.
@desaishailesh3527
A year ago
@@bioinformaticscoach Thank you very much, I will try it.
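The filtering suggested in the reply above might look something like the sketch below. It relies on the `-F` (exclude flag bits) and `-q` (minimum mapping quality) options of `samtools view`; the function names and the default quality threshold of 30 are assumptions chosen for illustration, not values from the video.

```shell
# SAM flag bits, as defined in the SAM specification.
READ_UNMAPPED=4      # 0x4: the read itself is unmapped
MATE_UNMAPPED=8      # 0x8: the read's mate is unmapped
EXCLUDE_MASK=$((READ_UNMAPPED | MATE_UNMAPPED))

# Keep only records whose read is mapped (hypothetical helper).
# Usage: keep_mapped <in.bam> <out.bam>
keep_mapped() {
  samtools view -b -F "$READ_UNMAPPED" -o "$2" "$1"
}

# Keep records where both the read and its mate are mapped and the
# mapping quality is at least min_mapq (default 30, an arbitrary choice).
# Usage: keep_mapped_pairs <in.bam> <out.bam> [min_mapq]
keep_mapped_pairs() {
  samtools view -b -F "$EXCLUDE_MASK" -q "${3:-30}" -o "$2" "$1"
}
```

Running `samtools flagstat` on the input and the filtered output is a quick way to check how many records were dropped.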
This is super helpful, thank you! I am wondering what are the benefits of converting to BAM besides the fact that it is a smaller file. Do you lose any information when you convert?
@bioinformaticscoach
A year ago
You don't lose any information at all. I recommend using BAM files, especially if you are dealing with lots of samples. Smaller files also mean less computational load.
@alexandranikolaeva876
A year ago
@@bioinformaticscoach Awesome, thank you! Your videos are super helpful! I am using BAM files in the downstream application to check the ploidy of my samples, and this is the most straightforward explanation on the alignment that I was able to find so far!
@bioinformaticscoach
A year ago
@@alexandranikolaeva876 It is also important to filter the alignment records in the BAM files. I am working on a tutorial on that, which will be released next month. You can use tools like samtools and bamtools.
@alexandranikolaeva876
A year ago
@@bioinformaticscoach That's really awesome, thanks! One more question: in the bwa mem command, how much does the number of threads (8) matter? My file has been running for 2 hours now, but I have a very big genome (27 gb). Do you think I can decrease that number without loss of information?
@bioinformaticscoach
A year ago
@@alexandranikolaeva876 Genome mapping is computationally expensive, so you need a high-end PC or, better still, a compute server. You can then increase the number of threads to make BWA run faster.
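In `bwa mem`, the `-t` option sets the number of threads, so a sensible value depends on how many cores the machine has. The sketch below picks a value from the core count. The helper name `pick_threads`, the rule of leaving one core free, and the file names in the printed command are all assumptions for illustration.

```shell
# Sketch only: pick_threads and the "leave one core free" rule are assumptions.
pick_threads() {
  # getconf reports the number of online cores on Linux and macOS;
  # fall back to 1 if it is unavailable or returns something non-numeric.
  cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || cores=1
  case "$cores" in
    ''|*[!0-9]*) cores=1 ;;
  esac
  if [ "$cores" -gt 1 ]; then
    echo $((cores - 1))
  else
    echo 1
  fi
}

THREADS=$(pick_threads)

# Print the command that would be run; the file names are placeholders.
echo "bwa mem -t $THREADS reference.fasta reads_1.fastq.gz reads_2.fastq.gz"
```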
This is really helpful. I have 34 files of R1 Illumina reads and 34 files of R2 Illumina reads. Since it is not easy to write them all out in a single command like bwa mem -t 8 ...., could you help me write a script to merge them all?
@bioinformaticscoach
11 months ago
Yes. That can be done. You can book a session and I will work on it.
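If the 34 pairs are separate samples, one common approach is to loop over the pairs rather than list them in one command. The sketch below assumes the files are named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz`, which is a guess at the naming scheme; the helper names are hypothetical. If the files are instead lanes of a single sample, they would need concatenating in matching order first, which this sketch does not do.

```shell
# Sketch only: assumes files named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz.

# Derive the read 2 file name from the read 1 file name.
mate_of() {
  echo "${1%_R1.fastq.gz}_R2.fastq.gz"
}

# Derive the sample name (no directory, no suffix) from the read 1 file name.
sample_of() {
  name=${1##*/}
  echo "${name%_R1.fastq.gz}"
}

# Map every R1/R2 pair found in a directory against an indexed reference.
# Usage: map_all <reference.fasta> <directory-with-fastq-files> [threads]
map_all() {
  ref=$1; dir=$2; threads=${3:-8}
  for r1 in "$dir"/*_R1.fastq.gz; do
    [ -e "$r1" ] || continue                      # no matching files
    r2=$(mate_of "$r1")
    [ -e "$r2" ] || { echo "missing mate for $r1" >&2; continue; }
    sample=$(sample_of "$r1")
    # Align, then sort straight into one BAM file per sample.
    bwa mem -t "$threads" "$ref" "$r1" "$r2" |
      samtools sort -@ "$threads" -o "$sample.sorted.bam" - || return 1
    samtools index "$sample.sorted.bam" || return 1
  done
}
```

The reference must already have been indexed with `bwa index` before `map_all` is called.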
@erindarruci1085
A month ago
Hi, did you perform any trimming, or are you using the raw reads? Thank you!!
What is the best software to perform alignment of a file that contains many sequences? I'm going to be working with data from an Illumina NGS sequencing panel; it covers about 100 human genes, only coding regions and 20 bp of flanking introns.