Length of Human DNA in bytes



It's a very valid question and I have a very good reason for asking it. I will post more about my reasoning in another thread later on.

I had a look on Yahoo Answers and I read this: http://answers.yahoo.com/question/index?qi...25135324AApHMlU

I wonder if the Human Genome Project has the full file for us to download? If they RAR it up, the file should compress nicely to about 50MB.

Why would you need 12 bytes per base pair? There can be only four types of pairing (AT, TA, GC, CG), so one pair can be encoded in two bits (say, 00=AT, 01=TA, 10=GC, 11=CG). Since a male human has 3080 million base pairs in his DNA, that translates to 6160000000 bits, or a little more than 734 MB. For a human female, with 3022 million base pairs in her DNA, it's 6044000000 bits, or a little more than 720 MB.
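The arithmetic above can be checked with a few lines of Python. This is only a sketch: the base-pair counts are the ones quoted in the post, the function name `genome_size_mib` is made up for illustration, and "MB" is taken to mean 1024 x 1024 bytes, which is what makes the 734 and 720 figures come out.

```python
# Base-pair counts as quoted in the post (assumed, not independently checked).
MALE_BASE_PAIRS = 3_080_000_000
FEMALE_BASE_PAIRS = 3_022_000_000
BITS_PER_PAIR = 2  # 00=AT, 01=TA, 10=GC, 11=CG


def genome_size_mib(base_pairs: int, bits_per_pair: int = BITS_PER_PAIR) -> float:
    """Return the storage size in MiB (1 MiB = 1024 * 1024 bytes)."""
    total_bits = base_pairs * bits_per_pair
    total_bytes = total_bits / 8
    return total_bytes / (1024 * 1024)


male_mib = genome_size_mib(MALE_BASE_PAIRS)
female_mib = genome_size_mib(FEMALE_BASE_PAIRS)
print(f"male:   {male_mib:.1f} MiB")
print(f"female: {female_mib:.1f} MiB")
```

With decimal megabytes (1,000,000 bytes) the same bit counts give 770 MB and about 755.5 MB, so the unit convention matters more than the rounding.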

That's of course assuming you write it in binary. Writing it in ASCII will be 8 times larger if you write both letters of each pair (4 times larger with one letter per pair), and using Unicode as UTF-16 or UTF-32 will be two or four times larger than the ASCII text file. I don't know how well it will compress.
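To make the binary versus ASCII comparison concrete, here is a minimal sketch that packs pairs four to a byte using the 2-bit codes suggested above. The helper `pack_pairs` and the sample sequence are invented for illustration, and padding the last byte with zero bits is an assumption, not part of any real file format.

```python
# 2-bit codes from the post: 00=AT, 01=TA, 10=GC, 11=CG.
PAIR_CODES = {"AT": 0b00, "TA": 0b01, "GC": 0b10, "CG": 0b11}


def pack_pairs(pairs):
    """Pack a list of base pairs into bytes, four pairs per byte."""
    out = bytearray()
    for i in range(0, len(pairs), 4):
        chunk = pairs[i:i + 4]
        byte = 0
        for pair in chunk:
            byte = (byte << 2) | PAIR_CODES[pair]
        # Left-align a final partial chunk so the padding sits in the low bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


sample = ["AT", "TA", "GC", "CG", "GC", "AT", "AT", "TA"]
packed = pack_pairs(sample)

# ASCII alternatives: one letter per pair, or both letters of each pair.
ascii_one_letter = "".join(p[0] for p in sample).encode("ascii")
ascii_two_letters = "".join(sample).encode("ascii")

print(len(packed), len(ascii_one_letter), len(ascii_two_letters))
```

For these eight pairs the packed form takes 2 bytes, one letter per pair takes 8 bytes, and two letters per pair takes 16 bytes.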

Edit: Wikipedia gets a bit larger file size.

Edit 2: While I was calculating, virtorio beat me with the WP link.

Also, does that sequence include the RNA sequence too?

Central dogma of genetics: DNA -> RNA -> Protein. The short story is that RNA is transcribed from DNA (the genome).

So what are you asking for exactly? The entire genome as in all DNA contained in a human cell's chromosomes? Total coding regions? Total protein-encoding genes? Total DNA and products such as mRNA, siRNA, miRNA, tRNA, rRNA?

Say, 00=AT, 01=TA, 10=GC, 11=CG

If you're willing to simplify and assume Watson-Crick base-pairing, then you may as well take only the total genome length (as opposed to total genome length x 2 to account for dsDNA), since A "always" pairs with T, and C "always" base pairs with G. If you want to get more complicated, you will have modified bases, ranging from methylation, acetylation, and a few other base analogues, as well as non-W/C pairing.
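As a rough sketch of that simplification: if you store only one strand, the other can be regenerated from the A-T and C-G pairing rule. The function name `complementary_strand` and the example sequence are made up here, and the code deliberately ignores modified bases and non-Watson-Crick pairing.

```python
# Strict Watson-Crick pairing only: A<->T and C<->G (a simplifying assumption).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complementary_strand(strand: str) -> str:
    """Return the partner strand, read in its own 5'->3' direction."""
    # Complement each base, then reverse, because the strands are antiparallel.
    return strand.translate(COMPLEMENT)[::-1]


strand = "ATGCCG"
partner = complementary_strand(strand)
print(strand, partner)
```

Applying the function twice returns the original strand, which is why storing the second strand adds no information under this assumption.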

Also note the presence of mobile DNA elements such as (retro)transposons. They can move or replicate and insert themselves back into the genome. You also get all sorts of other crazy events like site-specific recombination that can result in duplication or excision of a stretch of DNA from a replicating chromosome, so really, the take-home message is that it is a freaking miracle that anything manages to live.

the take-home message is that it is a freaking miracle that anything manages to live.

lol, this^

I think there are quite large issues with the HGP (or anyone else) publishing somebody's genome. I doubt they'd allow any Joe Bloggs to download it.

edit: "All genome sequence generated by the Human Genome Project has been deposited into GenBank, a public database freely accessible by anyone with a connection to the Internet."
