Python For Bioinformatics and Your First Python for Bioinformatics Program

For more in-depth Python for Bioinformatics training visit: www.howtobioinformatics.com/py...
Hi, and welcome to Python for Bioinformatics. My name is Blake Allen, and I am going to show you how to make your first Python for Bioinformatics program in under 20 minutes.
We're going to go over calculating GC content and making your first Python program. If you're a little more advanced and already know how to use Python but would like to learn more, go ahead and click the link below, where I'll show you advanced techniques in Python for bioinformatics.
The first thing we're going to need is some data. If you don't have any data, you can't do any bioinformatics, but the great thing is that there is a ton of free data online, ready to go.
So go ahead and open up your web browser and let's get started. I use Chrome.
Go ahead and type in the letters NCBI. In the search bar, go ahead and type in BRCA1.
Click on the little tab that says Nucleotide. Up at the top we've got a few things; go ahead and click on the Homo sapiens BRCA1 FASTA tab.
Click on Send in the top right-hand corner, choose Send to File, and download as FASTA. Then copy that sequence.fasta to a new folder we'll be working in. Rename it to BRCA1_BAP1.TXT; then you can open it and look at it.
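
The GC-content calculation this tutorial builds toward can be sketched roughly as follows. The transcript stops before the code appears, so this is a hedged reconstruction, written for Python 3, and not necessarily the exact program from the video. The function name count_bases and the toy record are assumptions made up for illustration.

```python
import io


def count_bases(handle):
    """Count g, a, t and c in a single-record FASTA file object."""
    handle.readline()  # skip the FASTA header line (the one starting with ">")
    g = a = t = c = 0  # every counter starts at zero
    for line in handle:
        for char in line.lower():  # lower-case so "G" and "g" both count
            if char == "g":
                g += 1
            elif char == "a":
                a += 1
            elif char == "t":
                t += 1
            elif char == "c":
                c += 1
    return g, a, t, c


# Toy record invented for illustration; it is NOT the BRCA1 sequence.
toy_file = io.StringIO(">toy_record\nGATTACA\nGGCC\n")
g, a, t, c = count_bases(toy_file)
gc_content = (g + c) / (g + a + t + c)  # fraction of bases that are G or C

print("number of g's " + str(g))
print("number of a's " + str(a))
print("number of t's " + str(t))
print("number of c's " + str(c))
print("gc content: " + str(gc_content))
```

For the downloaded file, the toy object would be swapped for an opened file, for example open("BRCA1_BAP1.TXT").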

Comments: 87

  • @MyMasaka
    @MyMasaka 2 years ago

    The best video I have seen on bioinformatics

  • @tomhitch763
    @tomhitch763 10 years ago

    This tutorial is brilliant, please create more!

  • @georgegrevera7000
    @georgegrevera7000 6 years ago

    I very much enjoyed this video. I like the fact that, by the end, I'm working with real data and doing something useful. Thanks!

  • @chaokang3594
    @chaokang3594 9 years ago

    Really helpful! I love Python!

  • @LauraBrock
    @LauraBrock 11 years ago

    This was really informative and interesting!

  • @MyChannel-jf7mr
    @MyChannel-jf7mr 10 years ago

    Very informative. Thank you for providing this example.

  • @ricardomoran3
    @ricardomoran3 11 years ago

    FANTASTIC! Thank you!

  • @NA0S90
    @NA0S90 9 years ago

    Very straightforward tutorial, thanks

  • @ShadArfMohammed
    @ShadArfMohammed 7 years ago

    Thanks a lot, it was really helpful. You haven't put any other videos on this subject since 2013, though.

  • @SeemaP83
    @SeemaP83 10 years ago

    It was helpful, thank you. Keep adding more.

  • @alexanderdavis3117
    @alexanderdavis3117 11 years ago

    Very cool! I need to learn Python ASAP!

  • @laceycarlyle7754
    @laceycarlyle7754 11 years ago

    Very informative!

  • @cherryblossoms95
    @cherryblossoms95 11 years ago

    THIS IS AMAZING.

  • @dhivyas9908
    @dhivyas9908 5 years ago

    Thank you it works very well

  • @jpshiva1
    @jpshiva1 11 years ago

    Noel Tanner, thanks for the reference sequence; I was having a hard time finding the correct nucleotide.

  • @mardiclements1571
    @mardiclements1571 11 years ago

    Very Helpful!

  • @rusbiology3460
    @rusbiology3460 4 years ago

    Thank you so much for this breakdown!

  • @omotosoolatunde9139
    @omotosoolatunde9139 3 years ago

    Thank You!

  • @nityaaryasomayajula2204
    @nityaaryasomayajula2204 5 years ago

    Hello, Thanks for this video! I was wondering if we could use the difflib program to do comparative genomics for two different files and create a report of differences?
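
    A minimal sketch of what a difflib-based report could look like, assuming two short made-up sequences split into lines (the names seq_a, seq_b, a.fasta and b.fasta are hypothetical). Note that difflib compares text line by line; it is not a sequence aligner, so this only shows where lines differ and would not stand in for proper alignment tools on real genomes.

```python
import difflib

# Two made-up sequences, one line each, differing in the second line only.
seq_a = ["GATTACA", "GGCC", "TTAA"]
seq_b = ["GATTACA", "GGCT", "TTAA"]

# unified_diff yields header lines plus "-" / "+" lines for the differences.
report = list(
    difflib.unified_diff(
        seq_a, seq_b, fromfile="a.fasta", tofile="b.fasta", lineterm=""
    )
)
print("\n".join(report))
```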

  • @meanderband
    @meanderband 11 years ago

    Very Nice!

  • @aalimmujawar582
    @aalimmujawar582 2 years ago

    thanks it is very good information

  • @zapy422
    @zapy422 8 years ago

    Nice cool intro to bioinfo

  • @grimreapper2358
    @grimreapper2358 4 years ago

    This is outstanding. I am hoping you can show more examples in Jupyter Notebook.

  • @MrGomajo
    @MrGomajo 8 years ago

    Why not write it in the Python IDLE?

  • @mni79
    @mni79 4 years ago

    good work

  • @ujenetics
    @ujenetics 8 years ago

    Thanks a lot for a nice tutorial! But have you tried TextWrangler instead of TextEdit?

  • @jmadzo
    @jmadzo 11 years ago

    It would be more Pythonic to get rid of the nested loop and just use the built-in string method count(): for line in gene: g += line.count('g'); a += line.count('a'); c += line.count('c'); t += line.count('t');
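
    The count() suggestion above, laid out as a runnable sketch. The function name count_bases_with_count and the toy lines are assumptions for illustration; str.count() itself is a standard string method. The lines are lower-cased first so that upper-case FASTA data is counted too.

```python
def count_bases_with_count(lines):
    """Count g, a, t and c using str.count() instead of a nested loop."""
    g = a = t = c = 0
    for line in lines:
        line = line.lower()  # FASTA sequence data is often upper-case
        g += line.count("g")
        a += line.count("a")
        t += line.count("t")
        c += line.count("c")
    return g, a, t, c


# Made-up sequence lines, not real BRCA1 data.
toy_lines = ["GATTACA\n", "ggcc\n"]
counts = count_bases_with_count(toy_lines)
print(counts)
```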

  • @dr.md.ismailhossain2681
    @dr.md.ismailhossain2681 4 years ago

    very nice

  • @kjeyaprakash2638
    @kjeyaprakash2638 8 years ago

    Which Python book would be best as a reference? This is nice!

  • @favoriteundsubscribe
    @favoriteundsubscribe 11 years ago

    awesome

  • @unays
    @unays 4 years ago

    oh man, wow thanx

  • @irenez.b.1730
    @irenez.b.1730 6 years ago

    Are there any more advanced Python scripts to use for the analysis of sequencing data?

  • @SpamHead8
    @SpamHead8 11 years ago

    Very clear and informative - thanks! Do you mind if I post/share?

  • @kavansoni4671
    @kavansoni4671 6 years ago

    Please provide the exact link for the dataset download in the description

  • @davidr.martinezph.d.4746
    @davidr.martinezph.d.4746 9 years ago

    Hi, so I wrote the same program in PyCharm. I tried opening it in the Bash shell and I get told "not a directory". I switched directories to ensure I was in the right folder. Does anyone have suggestions?

  • @gitarrestunden2445
    @gitarrestunden2445 10 years ago

    Hi! Thanks for the video!! However, can you please explain why you set the g, a, t and c at 0 in the beginning? Thanks!

  • @stevanbr1

    @stevanbr1

    10 years ago

    Because you have to initialize variables to zero before you add a number to them (g += 1 => g = g + 1). If you don't initialize the variables to zero, your variable has some trash value, and you won't have a valid result. The first time it enters the 'if' with 'g', g is going to be zero, so g = 0 + 1 = 1; if you don't initialize, it will be g = #$#@$ + 1 = ?. Hope that helps :)
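
    A small sketch of why the counters need a starting value. One caveat on the explanation above: in Python specifically, a name that was never assigned does not hold a leftover value; using it raises an error instead. The name total is hypothetical and used only to trigger that error.

```python
# Using a counter before it exists: Python raises an error rather than
# starting from some leftover value.
try:
    total += 1  # "total" (a hypothetical name) has never been assigned
    raised = False
except NameError:  # also covers UnboundLocalError inside a function
    raised = True

g = 0   # initialise first ...
g += 1  # ... then g = g + 1 has a value to add to
print(raised, g)
```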

  • @irenez.b.1730
    @irenez.b.1730 6 years ago

    👏👏👏

  • @MrLompa76
    @MrLompa76 10 years ago

    So I have to create a folder first then create another folder to put the file inside of it?

  • @VercingetoR3x
    @VercingetoR3x 6 years ago

    What version of python did you use?

  • @shankfan
    @shankfan 10 years ago

    This is for Python 2.7.x, right? It doesn't work with my 3.3.x.

  • @dragonsteria3042
    @dragonsteria3042 8 years ago

    Awesome, my first Python program to find the GC content... I have a question: what is the GC content for? What does it tell me exactly? I did not understand that very well. BTW, I used this sequence: Rattus norvegicus BRCA1 mRNA, complete cds. GC content: 0.460014

  • @NoelTanner
    @NoelTanner 11 years ago

    I had a little trouble finding the correct nucleotide record. To save time, here is the reference number for the example in the video: NCBI Reference Sequence: NG_031859.1

  • @Stepwise9000

    @Stepwise9000

    4 years ago

    Now this doesn't work! :(

  • @Neohowphinktams
    @Neohowphinktams 11 years ago

    Good video, just wish it was more streamlined

  • @cgroza
    @cgroza 8 years ago

    Why not use count() or regular expressions?

  • @temaz3334

    @temaz3334

    7 years ago

    poor Python skills

  • @rafsanjanimuhammod309

    @rafsanjanimuhammod309

    7 years ago

    No, poor programming skills.

  • @bogdanbogdanovich140
    @bogdanbogdanovich140 4 years ago

    invalid syntax on the second quote of print "number of g's " + str(g)

  • @MrChacha1994

    @MrChacha1994

    4 years ago

    IDK if it's because he's using a Mac, but if you are using Windows like I am, make sure to use parentheses when you use the "print" function. Ex. (EXACTLY LIKE THIS): print("number of g's " + str(g))

  • @Paul-su7sb

    @Paul-su7sb

    3 years ago

    Same here, thank you so much for the advice I am going to try it

  • @kareenamulchandani3356

    @kareenamulchandani3356

    1 year ago

    I think the syntax changed in Python3
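
    The fix discussed in this thread, as a short Python 3 sketch. In Python 3, print is a function and needs parentheses; the Python 2 statement form (print "..." with no parentheses) is a syntax error there, on any operating system. The count value 3 is made up for illustration.

```python
g = 3  # example count, made up for illustration

# Python 3: print is a function, so the argument goes inside parentheses.
message = "number of g's " + str(g)
print(message)

# An f-string gives the same text without the explicit str() call.
print(f"number of g's {g}")
```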

  • @DaN3xtEconomist
    @DaN3xtEconomist 4 years ago

    Just a small question. Is this what bioinformaticians mostly do? Sequence genes, then use a programming language for analysis?

  • @MrChristian331

    @MrChristian331

    4 years ago

    In a nutshell... YES. But in addition to analysis, they can use programming for drug discovery and therapeutics. They can use programming for predictive analytics to see whether something will switch a gene on or off before administering it or experimenting with it, to save time and money.

  • @titanoboa100
    @titanoboa100 10 years ago

    My problem so far is saving the folder as a plain txt file. My macbook will not give me the option when I select the drop down list.

  • @vivanranjan261

    @vivanranjan261

    3 years ago

    yes even mine

  • @biemsklebob
    @biemsklebob 5 years ago

    9:00 variable*

  • @MadMechwarrior
    @MadMechwarrior 11 years ago

    I live python. Great tutorial!

  • @bhrishxxn1639
    @bhrishxxn1639 8 years ago

    thanks so much i'll definitely be coming back

  • @queenofunderland
    @queenofunderland 8 years ago

    Does anyone know the answer? If you take the FASTA file without the header, can you get rid of gene.readline()? And if the counters are checked against uppercase A, C, T, G strings, can you get rid of line.lower()? Thanks for any suggestions.

  • @nenadsvrzikapa6893

    @nenadsvrzikapa6893

    8 years ago

    +willie ekaputra Yeah, that just skips the line, so if the line is not there you don't need to skip it; but if you remove it, then it's no longer a FASTA file. Either way, this is not how an advanced bioinformatician would solve this task. I think Blake is showing that you can make the string lower case. It usually is upper case, so if you compare against upper-case letters you don't need that line.

  • @queenofunderland

    @queenofunderland

    7 years ago

    I have another question: can you then make this code a function with def ...():, so that you can open ANY FASTA file saved on your PC and count its GC content?
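
    One way the function asked about here might look, as a hedged sketch. The name gc_content is an assumption, not something from the video. It skips any line starting with ">" instead of calling readline() once, so it works with or without a header, and it lower-cases each line so either letter case is counted. The toy FASTA file is written to a temporary location purely so the example runs on its own.

```python
import os
import tempfile


def gc_content(path):
    """Return the G+C fraction of the sequence lines in a FASTA file."""
    gc = total = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):  # header line(s): nothing to count
                continue
            seq = line.strip().lower()
            gc += seq.count("g") + seq.count("c")
            total += sum(seq.count(base) for base in "gatc")
    return gc / total if total else 0.0


# Write a made-up FASTA record to a temporary file for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">toy_record\nGATTACA\nGGCC\n")
    toy_path = tmp.name

result = gc_content(toy_path)
os.remove(toy_path)  # clean up the temporary file
print("gc content:", result)
```

    Any saved FASTA file could then be passed by path, for example gc_content("BRCA1_BAP1.TXT").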

  • @76BlueLions
    @76BlueLions 11 years ago

    Your web page is down, can you let me download this. Your channel blocks it from being able to download.

  • @science_mbg
    @science_mbg 8 years ago

    Thanks, but I had a problem while running it. I used Windows bash and I got: print "number of g's " + str(g) ^ SyntaxError: invalid syntax, even though I did the same thing that you did. Please help me.

  • @nagaswaroopkenguntenagaraj8677

    @nagaswaroopkenguntenagaraj8677

    8 years ago

    +Suleyman Bozkurt That may be because you are using Python 3+, where the syntax for print is print("number of g's " + str(g)) [notice the parentheses], whereas in Python 2 the syntax for print is as shown in the video [ print "number of g's " + str(g) ]. Hope it helped! :)

  • @d34thcom3sripping

    @d34thcom3sripping

    6 years ago

    thnx boss. resolved my issues.

  • @Actanonverba01
    @Actanonverba01 7 years ago

    for beginners only

  • @MWorks08
    @MWorks08 6 years ago

    1.75x Speed would be really appreciated for this video :D

  • @bhanuchandrakarisetty9718
    @bhanuchandrakarisetty9718 10 years ago

    Sir, I am using the Windows 7 operating system and Python, and instead of Coda I am using Sublime Text 2. I have followed everything up to the TERMINAL option. It is not there in Windows. Can you tell me the equivalent, so that I can finish the last step? Waiting for your reply, sir. Thank you.

  • @wavesofgrey-vb9gw

    @wavesofgrey-vb9gw

    5 years ago

    The Windows command line, or now PowerShell. You will have to add Python to the PATH to run Python from the command line.

  • @previeweverything6124
    @previeweverything6124 4 years ago

    I always get a syntax error in If char == "g" : usually at the (if) and at the (g). Please help me understand why.

  • @dxamphetamin

    @dxamphetamin

    4 years ago

    'g', you need to check for a char not a string
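
    A short check that may help with this error, with the caveat that the original code is not shown, so the cause is a guess. Python has no separate character type: 'g' and "g" are the same one-character string, so the quote style alone should not matter. A capitalised If, as typed in the question, is one plausible culprit, since keywords are case-sensitive.

```python
char = "g"

# Single and double quotes build the same one-character string.
same_value = ('g' == "g")

matched = False
if char == "g":  # lowercase "if", with a colon at the end of the line
    matched = True

# A capitalised "If" is not a keyword, so this source text fails to compile.
try:
    compile('If char == "g":\n    pass\n', "<example>", "exec")
    capital_if_ok = True
except SyntaxError:
    capital_if_ok = False

print(same_value, matched, capital_if_ok)
```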

  • @mannyfan165
    @mannyfan165 7 years ago

    Dude, why does this not work at all on Windows?

  • @LegeFles

    @LegeFles

    7 years ago

    did you install python?

  • @mannyfan165

    @mannyfan165

    7 years ago

    yes

  • @LegeFles

    @LegeFles

    7 years ago

    Matt, saying it doesn't work "at all" isn't really a helpful comment.

  • @mauroresaca
    @mauroresaca 4 years ago

    Why does this man never start with the code?

  • @jaredakers7683
    @jaredakers7683 7 years ago

    Someone should re-do these videos in Windows.

  • @IsaacPiera
    @IsaacPiera 7 years ago

    Super inefficient code. Use the count() function, which is WAY faster!

  • @georgegrevera7000

    @georgegrevera7000

    6 years ago

    I timed both ways on a file of 117k bases. His way used 0.02 sec. Using count() used 0.005 sec. Both are fast enough for me.

  • @johnfedorov8089

    @johnfedorov8089

    5 years ago

    @@georgegrevera7000 The problem is scale. Had the gene sequences been longer, this would be exponentially inefficient. I'm coming from a computer science background though, where efficiency is hammered into our heads due to scalability
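
    A rough way to repeat the timing comparison from this thread, assuming a made-up 110,000-base string in place of a real file. The function names are hypothetical. Both approaches do work proportional to the sequence length (the loop reads each character once, count() scans the string four times), so the gap between them should be a roughly constant factor as sequences get longer, not an exponential one. Actual timings depend on the machine, so none are quoted here.

```python
import timeit

sequence = "gattacaggcc" * 10_000  # made-up 110,000-base test string


def loop_count(seq):
    """Character-by-character counting, as in the tutorial's style."""
    g = a = t = c = 0
    for char in seq:
        if char == "g":
            g += 1
        elif char == "a":
            a += 1
        elif char == "t":
            t += 1
        elif char == "c":
            c += 1
    return g, a, t, c


def builtin_count(seq):
    """Same counts using the built-in str.count() method."""
    return seq.count("g"), seq.count("a"), seq.count("t"), seq.count("c")


loop_time = timeit.timeit(lambda: loop_count(sequence), number=5)
builtin_time = timeit.timeit(lambda: builtin_count(sequence), number=5)
print("loop:   ", loop_time)
print("count():", builtin_time)
```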

  • @pankajsaraswat3110
    @pankajsaraswat3110 8 years ago

    bevkuff

  • @ggyanwali
    @ggyanwali 8 years ago

    poor video making quality