Length of Human DNA in bytes



It's a very valid question and I have a very good reason for asking it. I will post more about my reasoning in another thread later on.

I had a look on Yahoo Answers and I read this: http://answers.yahoo.com/question/index?qi...25135324AApHMlU

I wonder if the Human Genome Project has the full file for us to download? If they RAR it up, the file should compress nicely to about 50MB.

Why would you need 12 bytes per base pair? There can be only four types of pairing (AT, TA, GC, CG). So, one pair can be encoded in two bits [say, 00=AT, 01=TA, 10=GC, 11=CG]. Since a male human has 3080 million base pairs in his DNA, that translates to 6160000000 bits, or a little more than 734 MB. And for the human female with 3022 million base pairs in her DNA, it's 6044000000 bits, or a little more than 720 MB.
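For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name `two_bit_size` is made up for illustration, and the "MB" figures above only come out as 734 and 720 if a megabyte is taken as 2^20 bytes.

```python
def two_bit_size(base_pairs: int) -> tuple[int, float]:
    """Return (bytes, mebibytes) needed to store base_pairs at 2 bits each."""
    bits = 2 * base_pairs
    n_bytes = (bits + 7) // 8          # round up to a whole byte
    return n_bytes, n_bytes / 2**20    # 1 MiB = 2**20 bytes

# Base-pair counts are the ones quoted in the post above.
male_bytes, male_mib = two_bit_size(3_080_000_000)
female_bytes, female_mib = two_bit_size(3_022_000_000)

print(f"male:   {male_bytes} bytes = {male_mib:.1f} MiB")
print(f"female: {female_bytes} bytes = {female_mib:.1f} MiB")
```

In decimal megabytes (10^6 bytes) the same totals are 770 MB and 755.5 MB.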

That's of course assuming it is written in binary. Written as ASCII text with one letter per base pair it will be 4 times larger (8 times if both letters of each pair are spelled out), and Unicode as UTF-16 or UTF-32 will be two or four times larger than the ASCII text file. I don't know how well it will compress.
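A rough sketch of how the encodings compare on a toy sequence, in Python. The `CODE` mapping and the `pack` helper are assumptions for illustration only: each pair is stored as one letter (the base on one strand), so A stands for AT, T for TA, G for GC and C for CG. Spelling out both letters of each pair would double the three text sizes.

```python
# Hypothetical 2-bit code, one letter per base pair (A=AT, T=TA, G=GC, C=CG).
CODE = {"A": 0b00, "T": 0b01, "G": 0b10, "C": 0b11}

def pack(seq: str) -> bytes:
    """Pack a string of A/T/G/C into 2 bits per base, 4 bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq):
        # The first base of each group of four goes in the top two bits.
        out[i // 4] |= CODE[base] << (2 * (3 - i % 4))
    return bytes(out)

seq = "ATGC" * 1000                      # toy sequence, 4000 bases
packed = pack(seq)
ascii_len = len(seq.encode("ascii"))     # 1 byte per letter
utf16_len = len(seq.encode("utf-16-le")) # 2 bytes per letter
utf32_len = len(seq.encode("utf-32-le")) # 4 bytes per letter

print(len(packed), ascii_len, utf16_len, utf32_len)
```

Note that UTF-8 would be the same size as ASCII here, since A, T, G and C are all single-byte characters in UTF-8.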

Edit: Wikipedia gets a bit larger file size.

Edit 2: While I was calculating, virtorio beat me with the WP link.

Also, does that sequence include the RNA sequence too?

Central dogma of genetics: DNA -> RNA -> Protein. The short story is that RNA is transcribed from DNA (the genome).

So what are you asking for exactly? The entire genome as in all DNA contained in a human cell's chromosomes? Total coding regions? Total protein-encoding genes? Total DNA and products such as mRNA, siRNA, miRNA, tRNA, rRNA?

Say, 00=AT, 01=TA, 10=GC, 11=CG

If you're willing to simplify and assume Watson-Crick base-pairing, then you may as well take only the total genome length (as opposed to total genome length x 2 to account for dsDNA), since A "always" pairs with T, and C "always" base pairs with G. If you want to get more complicated, you will have modified bases, ranging from methylation, acetylation, and a few other base analogues, as well as non-W/C pairing.
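Under that simplifying assumption (strict Watson-Crick pairing, no modified bases), the second strand carries no extra information, because it can be rebuilt from the first. A minimal Python sketch, with `reverse_complement` as an illustrative name:

```python
# Watson-Crick pairing: A<->T, G<->C. Modified bases are ignored here.
COMPLEMENT = str.maketrans("ATGC", "TACG")

def reverse_complement(strand: str) -> str:
    """Return the opposite strand, read in its own 5'->3' direction."""
    return strand.translate(COMPLEMENT)[::-1]

one_strand = "ATGCC"
other_strand = reverse_complement(one_strand)
print(other_strand)
```

Applying the function twice gives back the original strand, which is why storing one strand is enough in this simplified model.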

Also note the presence of mobile DNA elements such as (retro)transposons. They can move or replicate and insert themselves back into the genome. You also get all sorts of other crazy events like site-specific recombination that can result in duplication or excision of a stretch of DNA from a replicating chromosome, so really, the take-home message is that it is a freaking miracle that anything manages to live.

the take-home message is that it is a freaking miracle that anything manages to live.

lol, this^

I think there are quite large issues with the HGP (or anyone else) publishing somebody's genome. I doubt they'd allow any Joe Bloggs to download it.

edit: "All genome sequence generated by the Human Genome Project has been deposited into GenBank, a public database freely accessible by anyone with a connection to the Internet."
