Dropbox is intentionally scanning users' public files in order to combat piracy. However, it seems to be taking a reasonable stand on what has become a highly sensitive topic that is close to the public's heart.
When we use cloud storage we often take the right to privacy for granted, yet we seem to forget that we live in an age where privacy is on the lower end of most government agendas. The above image, which shows a file blocked under the DMCA (Digital Millennium Copyright Act) and was tweeted by Darrel Whitelaw, caused a mini Internet riot recently.
Dropbox even replied to the tweet, saying:
What this means is that once a file is identified as non-compliant it is not deleted; its sharing capability is simply removed. It is important to note, though, that Dropbox is authorized to remove the file entirely under its DMCA policy. From a technical perspective, this is done via hashing, which generates a practically unique number from the exact sequence of zeros and ones that make up a file. The same technology is also used as a means of data de-duplication. This raises privacy issues because ultimately, as a user, you are entrusting Dropbox to maintain the integrity of your files.
Sadly, privacy has once again been put on the back burner. It is nevertheless empowering to see companies taking a stand against anti-privacy laws, though without any direct comment from Dropbox this standpoint is purely speculative. One important item to note, too, is that no person at Dropbox is looking through your content; the scanning is performed by a machine.
Hashing Methodology Explained:
When a user uploads a file it is immediately hashed. Hashing assigns the file an identifier, called a hash value, which is computed from the data the file contains and is, for practical purposes, unique to that data. Should the file be opened and saved without any changes to its contents, the hash value will remain unchanged. Should the user reopen the same file (which contains, for example, 25 million characters) and add a single extra space somewhere, the hashing process will produce a different hash value.
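To make this concrete, here is a minimal sketch in Python. Dropbox has not said which hash function it uses, so SHA-256 from Python's standard hashlib module is an assumption made purely for illustration, and the helper name hash_value is hypothetical.

```python
import hashlib


def hash_value(data: bytes) -> str:
    """Return the SHA-256 digest of the file contents as a hex string.

    SHA-256 is an assumption for illustration; the article does not
    say which hash function Dropbox actually uses.
    """
    return hashlib.sha256(data).hexdigest()


original = b"The quick brown fox jumps over the lazy dog"
resaved = bytes(original)   # identical bytes, as if opened and saved unchanged
edited = original + b" "    # the same content with one extra space added

print(hash_value(original) == hash_value(resaved))  # True
print(hash_value(original) == hash_value(edited))   # False
```

The hash depends only on the bytes in the file, so renaming a file would not change its hash value, while even a one-character edit does.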
Dropbox uses a blacklist of hash values, which is compared against all user-uploaded files that have been marked as public/shareable, in order to detect pirated material.
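The comparison itself can be pictured as a simple set lookup. The sketch below is not Dropbox's implementation; the hash function (SHA-256), the names blocked_hashes and sharing_allowed, and the sample file contents are all hypothetical and chosen only to illustrate the idea.

```python
import hashlib


def hash_value(data: bytes) -> str:
    # SHA-256 is an assumption; the real hash function is not public.
    return hashlib.sha256(data).hexdigest()


def sharing_allowed(data: bytes, blocked_hashes: set) -> bool:
    """Return False if the file's hash appears on the blacklist.

    Only the sharing decision is affected here; nothing in this
    sketch deletes or modifies the stored file.
    """
    return hash_value(data) not in blocked_hashes


# Hypothetical blacklist built from a file named in a takedown notice.
flagged_file = b"bytes of a file named in a takedown notice"
blocked_hashes = {hash_value(flagged_file)}

print(sharing_allowed(flagged_file, blocked_hashes))            # False
print(sharing_allowed(b"my holiday photos", blocked_hashes))    # True
print(sharing_allowed(flagged_file + b" ", blocked_hashes))     # True
```

The last line shows the limitation implied by the hashing explanation above: a copy that differs by even a single byte has a different hash value and would not match the blacklist.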