Length of Human DNA in bytes


Recommended Posts

It's a very valid question and I have a very good reason for asking it. I will post more about my reasoning in another tread later on.

I had a look on Yahoo Answers and I read this: http://answers.yahoo.com/question/index?qi...25135324AApHMlU

I wonder if the Human Genome Project have the full file for us to download? If they RAR it up, the file should compress nicely for about 50MB.

Why would you need 12 bytes per base pair? There can be only four types of pairing (AT, TA, GC, CG). So, one pair can be encoded in two bits. [say, 00=AT, 01=TA, 10=GC, 11=CG]. Since a male human has 3080 Million base pairs in his DNA, it translates to 6160000000 bits or a little more than 734 MB. And for the human female with 3022 base pairs in her DNA, its 6044000000 bits or a little more than 720 MB.

That's of course assuming writing it in binary. Writing it in ASCII will be 8 times larger, and using Unicode will be two or four times larger than the ASCII text file. I don't know how well it will get compressed.

Edit: Wikipedia gets a big larger file size.

Edit 2: While I was calculating, virtorio beat me with the WP link.

Also, does that sequence include the RNA sequence too?

Central dogma of genetics: DNA -> RNA -> Protein. The short story is that RNA is transcribed from DNA (the genome).

So what are you asking for exactly? The entire genome as in all DNA contained in a human cell's chromosomes? Total coding regions? Total protein-encoding genes? Total DNA and products such as mRNA, siRNA, miRNA, tRNA, rRNA?

Say, 00=AT, 01=TA, 10=GC, 11=CG

If you're willing to simplify and assume Watson-Crick base-pairing, then you may as well take only the total genome length (as opposed to total genome length x 2 to account for dsDNA), since A "always" pairs with T, and C "always" base pairs with G. If you want to get more complicated, you will have modified bases, ranging from methylation, acetylation, and a few other base analogues, as well as non-W/C pairing.

Also note the presence of mobile DNA elements such as (retro)transposons. They can move or replicate and insert themselves back into the genome. You also get all sorts of other crazy events like site-specific recombination that can result in duplication or excision of a stretch of DNA from a replicating chromosome, so really, the take-home message is that it is a freaking miracle that anything manages to live.

the take-home message is that it is a freaking miracle that anything manages to live.

lol, this^

I think there are quite large issues with the HGP (or anyone else) publishing somebody's genome, I doubt they'd allow any Joe Bloggs to download it.

edit: "All genome sequence generated by the Human Genome Project has been deposited into GenBank, a public database freely accessible by anyone with a connection to the Internet."

This topic is now closed to further replies.
  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Posts

    • Didn’t Dbrand once complain that Casetify was ripping off their designs a well? seems pretty bad of them to try and get around Valve’s copyright this way with that in mind.
    • Dbrand thought they could get away with this Steam Machine case, Valve disagreed by David Uzondu Image via Dbrand Dbrand has cancelled its highly anticipated Companion Cube enclosure for the Valve Steam Machine, which it teased back in November of last year with a concept render and sign-up page, because it did not ask Valve for permission first before manufacturing the case. According to Dbrand, it took the "backwards approach" of building the product first before asking for permission from the copyright holder. Seven months of work went into the project, requiring over a thousand engineering hours from the design team. Workers developed forty-four sets of injection molding tools, making a unique mold for each sub-component of the crate. When the Companion Cube went live on Monday last week, it, according to Dbrand, quickly became the second-fastest-selling product in the company's fifteen-year history, racking up orders for hundreds of thousands of units. Customers eagerly bought the $129.95 deluxe edition or the bare-bones $99.95 version, which the manufacturer cheekily branded as the "Poverty Cube". It was around this time that the legal eagles at Valve descended on the accessory maker with a formal demand. The developer pointed out that the iconic block design remains protected intellectual property from the game Portal, so unlicensed sales had to stop. Dbrand said that all its pleas to salvage the project with the Valve team, including proposals to run a properly licensed release under official terms "with their blessing", fell on deaf ears, so it had no choice but to obey and remove every trace of the product from the internet. If you bought the enclosure, the company said that banks will process your refund by the end of this week, but if it still hasn't arrived in your account by then, you should not hesitate to contact support. The Steam Machine itself is a high-performance console that Valve designed directly to bring PC gaming into the living room. It was announced on 12th November 2025 (the same day Dbrand announced the Cube) and runs on the Linux-based SteamOS, the same OS that powers the Steam Deck. As for the price, due to the shortage of memory and storage chips, the hardware cost landed much higher than people were expecting, starting at $1,049 for the 512 model (without a controller) or $1,128 with the new gamepad. The premium 2 TB model pushes those prices even higher, selling at $1,349 for the standalone console and hitting $1,428 if you want the bundle.
    • It's listed #399.99 on Amazon, per your link. It's not $299.99.
  • Recent Achievements

    • Rookie
      Almohandis went up a rank
      Rookie
    • Apprentice
      jahara21 went up a rank
      Apprentice
    • Reacting Well
      NovaEdgeX earned a badge
      Reacting Well
    • Week One Done
      NovaEdgeX earned a badge
      Week One Done
    • One Year In
      BA the Curmudgeon earned a badge
      One Year In
  • Popular Contributors

    1. 1
      +primortal
      534
    2. 2
      +Edouard
      263
    3. 3
      PsYcHoKiLLa
      148
    4. 4
      Steven P.
      97
    5. 5
      macoman
      59
  • Tell a friend

    Love Neowin? Tell a friend!