Length of Human DNA in bytes



#1 TonyLock

    Neowinian Senior

  • 2,148 posts
  • Joined: 23-July 02

Posted 22 August 2009 - 05:55

If you were able to write out the human genome into a computer text file, how big do you think that file would be in terms of bytes?

If each base pair strand is equivalent to 12 bytes, then I'm guessing the file may be about one gig. Just wondering.

Any geneticists here on Neowin?


#2 bloodrain

    Resident Elite

  • 1,055 posts
  • Joined: 08-March 07

Posted 22 August 2009 - 05:56

its over WUN THOUSAND!!!!!!

#3 vetAndrew Lyle

    Don't Panic!

  • 31,628 posts
  • Joined: 15-December 03
  • Location: Toronto, Ontario
  • OS: Windows 7 SP1

Posted 22 August 2009 - 05:59

Is this a real question?

I don't think you could possibly put an answer on this.

#4 TonyLock

    Neowinian Senior

  • 2,148 posts
  • Joined: 23-July 02

Posted 22 August 2009 - 06:01

It's a very valid question and I have a very good reason for asking it. I will post more about my reasoning in another thread later on.

I had a look on Yahoo Answers and I read this: http://answers.yahoo.com/question/index?qi...25135324AApHMlU

I wonder if the Human Genome Project has the full file for us to download? If they RAR it up, the file should compress nicely to about 50MB.

#5 virtorio

    Virtorio

  • 5,944 posts
  • Joined: 28-April 03
  • Location: New Zealand
  • OS: OSX Lion, Windows 7
  • Phone: Windows Phone 7

Posted 22 August 2009 - 06:03

I don't know a thing about this, but it looks like the Wikipedia page may have the exact answer you're looking for:

http://en.wikipedia....ki/Human_genome

#6 Azusa

    Resident Panty thief

  • 8,717 posts
  • Joined: 07-December 04
  • Location: -=Sunderland=-
  • OS: Windows 7
  • Phone: HTC WildFire

Posted 22 August 2009 - 06:04

5.96046448 gigabytes

going on the number of base pairs x 2
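
For anyone checking the figure above: one reading that reproduces it is roughly 3.2 billion base pairs at two bytes per pair, expressed in binary gigabytes. Both the 3.2 billion count and the two-bytes-per-pair interpretation of "x 2" are assumptions, not something stated in the post. A minimal sketch:

```python
# Assumption: about 3.2 billion base pairs (a common round figure, not from the post).
BASE_PAIRS = 3_200_000_000
# Assumption: "x 2" means two bytes per base pair, e.g. two one-byte letters such as "AT".
BYTES_PER_PAIR = 2

size_bytes = BASE_PAIRS * BYTES_PER_PAIR
size_gib = size_bytes / 2**30  # binary gigabytes (GiB)

print(f"{size_bytes} bytes = {size_gib:.8f} GiB")
```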

#7 vetAndrew Lyle

    Don't Panic!

  • 31,628 posts
  • Joined: 15-December 03
  • Location: Toronto, Ontario
  • OS: Windows 7 SP1

Posted 22 August 2009 - 06:05

Wouldn't each human being have a unique amount of "bytes"? I wouldn't think there is an exact amount of "bytes" per human genome..

#8 TonyLock

    Neowinian Senior

  • 2,148 posts
  • Joined: 23-July 02

Posted 22 August 2009 - 06:08

Also, does that sequence include the RNA sequence too?

Andrew Lyle, on Aug 22 2009, 07:05, said:

Wouldn't each human being have a unique amount of "bytes"? I wouldn't think there is an exact amount of "bytes" per human genome..

I think if you have Down syndrome, you may have more.

#9 soumyasch

    Resident Fanatic

  • 584 posts
  • Joined: 05-June 05

Posted 22 August 2009 - 06:09

Why would you need 12 bytes per base pair? There are only four possible pairings (AT, TA, GC, CG), so one pair can be encoded in two bits [say, 00=AT, 01=TA, 10=GC, 11=CG]. Since a male human has 3080 million base pairs in his DNA, that translates to 6160000000 bits, or a little more than 734 MB. For a human female, with 3022 million base pairs in her DNA, it's 6044000000 bits, or a little more than 720 MB.

That's of course assuming writing it in binary. Writing it in ASCII will be 8 times larger, and using Unicode will be two or four times larger than the ASCII text file. I don't know how well it will get compressed.
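
The two-bit arithmetic above can be checked with a short sketch. The base-pair counts are the ones quoted in this post (reading "3022" as 3022 million); the function name `packed_size` is just an illustrative choice, and "MB" is taken to mean 2^20 bytes, which is what the 734 and 720 figures imply.

```python
# Base-pair counts as quoted in the post above.
MALE_BASE_PAIRS = 3_080_000_000
FEMALE_BASE_PAIRS = 3_022_000_000  # assumes "3022" means 3022 million
BITS_PER_PAIR = 2                  # 00=AT, 01=TA, 10=GC, 11=CG


def packed_size(base_pairs, bits_per_pair=BITS_PER_PAIR):
    """Return (bits, bytes, binary megabytes) for a genome packed at a fixed bit width."""
    bits = base_pairs * bits_per_pair
    size_bytes = (bits + 7) // 8        # round up to whole bytes
    return bits, size_bytes, size_bytes / 2**20


for label, count in (("male", MALE_BASE_PAIRS), ("female", FEMALE_BASE_PAIRS)):
    bits, size_bytes, mib = packed_size(count)
    print(f"{label}: {bits} bits, {size_bytes} bytes, {mib:.1f} MB (2^20 bytes)")
```

Note that the male figure is exactly 770,000,000 bytes, so it reads as about 734 MB in binary megabytes and as 770 MB in decimal megabytes.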

Edit: Wikipedia gets a bit larger file size.

Edit 2: While I was calculating, virtorio beat me with the WP link.

#10 Snowl

    ‮i wasted 10 seconds of your time

  • 1,898 posts
  • Joined: 01-December 08
  • Location: Australia.

Posted 22 August 2009 - 07:16

Around 807 403 520 bytes.

#11 qdave

    Neowinian Super Cool

  • 15,745 posts
  • Joined: 02-October 02
  • Location: Vilnius, Lithuania | Toronto,On

Posted 22 August 2009 - 07:23

It says on wiki that it would be 770MB. That's not much really.

#12 Relativity_17

    Just bitter.

  • 8,387 posts
  • Joined: 21-August 02

Posted 22 August 2009 - 07:28

Quote

Also, does that sequence include the RNA sequence too?
Central dogma of genetics: DNA -> RNA -> Protein. The short story is that RNA is transcribed from DNA (the genome).

So what are you asking for exactly? The entire genome as in all DNA contained in a human cell's chromosomes? Total coding regions? Total protein-encoding genes? Total DNA and products such as mRNA, siRNA, miRNA, tRNA, rRNA?

Quote

Say, 00=AT, 01=TA, 10=GC, 11=CG
If you're willing to simplify and assume Watson-Crick base-pairing, then you may as well take only the total genome length (as opposed to total genome length x 2 to account for dsDNA), since A "always" pairs with T, and C "always" base pairs with G. If you want to get more complicated, you will have modified bases, ranging from methylation, acetylation, and a few other base analogues, as well as non-W/C pairing.
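
To make the simplification concrete, here is a minimal sketch that stores only one strand at two bits per base and derives the partner strand on demand. It assumes strict Watson-Crick pairing and only the four letters A, C, G, T; the bit assignment and the function names are illustrative choices, not any standard file format.

```python
# Assumption: strict Watson-Crick pairing, so the second strand is fully determined.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = "ACGT"                      # index in this string = 2-bit code (arbitrary choice)
CODES = {base: code for code, base in enumerate(BASES)}


def complement(strand):
    """Return the partner base for each position (no reversal of orientation)."""
    return strand.translate(COMPLEMENT)


def pack(strand):
    """Pack a strand into bytes, four bases per byte, lowest bits first."""
    out = bytearray((len(strand) + 3) // 4)
    for i, base in enumerate(strand):
        out[i // 4] |= CODES[base] << (2 * (i % 4))
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    return "".join(BASES[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(length))


strand = "GATTACA"
packed = pack(strand)
print(len(packed), "bytes for", len(strand), "bases")
print(unpack(packed, len(strand)), "pairs with", complement(strand))
```

The length has to be stored alongside the packed bytes, because the last byte may be only partly used. Modified bases and non-W/C pairing would need more than two bits per position.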

Also note the presence of mobile DNA elements such as (retro)transposons. They can move or replicate and insert themselves back into the genome. You also get all sorts of other crazy events like site-specific recombination that can result in duplication or excision of a stretch of DNA from a replicating chromosome, so really, the take-home message is that it is a freaking miracle that anything manages to live.

#13 vetMike Brown

    Neowinian ULTRAKILL

  • 13,764 posts
  • Joined: 02-August 05
  • Location: London, UK

Posted 22 August 2009 - 07:29

Andrew Lyle, on Aug 22 2009, 07:05, said:

Wouldn't each human being have a unique amount of "bytes"? I wouldn't think there is an exact amount of "bytes" per human genome..
Surely everyone has the same amount (unless they have a genetic disorder like Down's syndrome).

#14 Relativity_17

    Just bitter.

  • 8,387 posts
  • Joined: 21-August 02

Posted 22 August 2009 - 07:30

Quote

Surely everyone has the same amount (unless they have a genetic disorder like Down's syndrome).
No... You'll probably see variation even between cells in the same individual.

#15 TonyLock

    Neowinian Senior

  • 2,148 posts
  • Joined: 23-July 02

Posted 22 August 2009 - 07:55

Anyone know if the HGP has a 700MB download of someone's DNA?