Length of Human DNA in bytes



It's a very valid question and I have a very good reason for asking it. I will post more about my reasoning in another thread later on.

I had a look on Yahoo Answers and I read this: http://answers.yahoo.com/question/index?qi...25135324AApHMlU

I wonder if the Human Genome Project have the full file for us to download? If they RAR it up, the file should compress nicely to about 50MB.

Why would you need 12 bytes per base pair? There can be only four types of pairing (AT, TA, GC, CG), so one pair can be encoded in two bits (say, 00=AT, 01=TA, 10=GC, 11=CG). Since a male human has 3,080 million base pairs in his DNA, that translates to 6,160,000,000 bits, or a little more than 734 MB (counting 1 MB as 1,048,576 bytes). For a human female, with 3,022 million base pairs in her DNA, it's 6,044,000,000 bits, or a little more than 720 MB.
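For anyone who wants to check the arithmetic, here is a minimal Python sketch of that two-bit scheme. The function names and the choice to put the first pair in the high bits of each byte are my own assumptions for illustration; the base-pair counts are just the figures quoted in the post.

```python
# Two-bit codes from the post: 00=AT, 01=TA, 10=GC, 11=CG
PAIR_TO_BITS = {"AT": 0b00, "TA": 0b01, "GC": 0b10, "CG": 0b11}
BITS_TO_PAIR = {bits: pair for pair, bits in PAIR_TO_BITS.items()}


def pack_pairs(pairs):
    """Pack base pairs into bytes, four pairs per byte (first pair in the high bits)."""
    out = bytearray((len(pairs) + 3) // 4)
    for i, pair in enumerate(pairs):
        shift = 6 - 2 * (i % 4)
        out[i // 4] |= PAIR_TO_BITS[pair] << shift
    return bytes(out)


def unpack_pairs(data, count):
    """Recover `count` base pairs from bytes produced by pack_pairs."""
    return [BITS_TO_PAIR[(data[i // 4] >> (6 - 2 * (i % 4))) & 0b11]
            for i in range(count)]


def packed_size_bytes(base_pairs):
    """Bytes needed at 2 bits per base pair, rounded up to a whole byte."""
    return (base_pairs * 2 + 7) // 8


def to_mib(n_bytes):
    """Convert bytes to units of 1,048,576 bytes."""
    return n_bytes / (1024 * 1024)


# Base-pair counts as quoted in the post (in millions: 3080 and 3022).
for label, bp in (("male", 3_080_000_000), ("female", 3_022_000_000)):
    size = packed_size_bytes(bp)
    print(f"{label}: {bp * 2} bits = {size} bytes = {to_mib(size):.1f} MiB")
```

Note that the "734" and "720" figures only come out if a megabyte is taken as 1,048,576 bytes; in decimal megabytes the same files are 770 MB and about 755 MB.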

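That's of course assuming it's written in packed binary. Writing it as ASCII text will be 8 times larger if each pair is spelled out as two letters (4 times larger at one letter per pair), and a fixed-width Unicode encoding such as UTF-16 or UTF-32 will be two or four times larger than the ASCII text file. I don't know how well it will compress.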
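A quick way to sanity-check those ratios, as a rough sketch: measure how many bytes one letter takes in each encoding and compare with 2 bits per pair. The helper names are made up for this example, and whether a pair is written as one letter or two is an assumption you have to pick.

```python
BITS_PACKED = 2  # bits per base pair in the packed binary form


def bytes_per_letter(encoding):
    """Bytes used by a single letter such as 'A' in the given text encoding."""
    return len("A".encode(encoding))


def ratio_vs_packed(encoding, letters_per_pair):
    """How many times larger the text form is than the 2-bit packed form."""
    bits = 8 * bytes_per_letter(encoding) * letters_per_pair
    return bits / BITS_PACKED


# The "-le" variants are used so that no byte-order mark is counted.
for enc in ("ascii", "utf-8", "utf-16-le", "utf-32-le"):
    one = ratio_vs_packed(enc, 1)
    two = ratio_vs_packed(enc, 2)
    print(f"{enc}: {one:g}x at one letter per pair, {two:g}x at two letters per pair")
```

UTF-8 comes out the same size as ASCII for this alphabet, so the "two or four times larger" only applies to the wider encodings.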

Edit: Wikipedia gets a bit larger file size.

Edit 2: While I was calculating, virtorio beat me with the WP link.

Also, does that sequence include the RNA sequence too?

Central dogma of genetics: DNA -> RNA -> Protein. The short story is that RNA is transcribed from DNA (the genome).

So what are you asking for exactly? The entire genome as in all DNA contained in a human cell's chromosomes? Total coding regions? Total protein-encoding genes? Total DNA and products such as mRNA, siRNA, miRNA, tRNA, rRNA?

Say, 00=AT, 01=TA, 10=GC, 11=CG

If you're willing to simplify and assume Watson-Crick base-pairing, then you may as well take only the total genome length (as opposed to total genome length x 2 to account for dsDNA), since A "always" pairs with T, and C "always" base pairs with G. If you want to get more complicated, you will have modified bases, ranging from methylation, acetylation, and a few other base analogues, as well as non-W/C pairing.
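To illustrate why one strand is enough under that simplification, here is a small Python sketch that rebuilds the partner strand from a single one. It assumes strict Watson-Crick pairing and an alphabet of only A, C, G and T, so it ignores the modified bases and non-W/C pairing mentioned above; the function names are just for this example.

```python
# Watson-Crick partners: A<->T and C<->G
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Base-by-base partner of a strand, in the same left-to-right order."""
    return strand.translate(COMPLEMENT)


def reverse_complement(strand):
    """Partner strand read in its own 5'->3' direction (the two strands are antiparallel)."""
    return complement(strand)[::-1]


strand = "GATTACA"
print(strand, complement(strand), reverse_complement(strand))
```

Since the second strand is fully determined by the first, storing one strand at 2 bits per base costs the same as storing 2 bits per base pair.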

Also note the presence of mobile DNA elements such as (retro)transposons. They can move or replicate and insert themselves back into the genome. You also get all sorts of other crazy events like site-specific recombination that can result in duplication or excision of a stretch of DNA from a replicating chromosome, so really, the take-home message is that it is a freaking miracle that anything manages to live.

the take-home message is that it is a freaking miracle that anything manages to live.

lol, this^

I think there are quite large issues with the HGP (or anyone else) publishing somebody's genome; I doubt they'd allow any Joe Bloggs to download it.

edit: "All genome sequence generated by the Human Genome Project has been deposited into GenBank, a public database freely accessible by anyone with a connection to the Internet."
