Not to complain about AV-Test specifically, since this is a general issue facing all testers, but as I'm sure you're aware, any sample set containing files not individually verified by a human can include files that are incorrectly identified as malicious when, in fact, they contain no executable code at all, or contain code that does nothing threatening even though its behavior may initially be diagnosed as malicious (for example, a license-key mechanism that injects the key into a runtime executable or library). While rare, such "false 'false positives'" do turn up in sample-based tests, and investigating and resolving those cases can be labor-intensive for both the tester and the vendor being tested.
I completely agree.
In my opinion, these tests are not representative of reality, but they can be useful as long as the reader understands the data. I remember a few months ago we absolutely bombed one of these tests because we generated hundreds of false positives. The tester installed us on a machine with thousands of infections, and we (rightfully, in my opinion) automatically ramped the heuristics up to maximum, so we started treating every file on the PC with maximum suspicion. Of course we generated lots of false positives, and they trashed the product! In the real world, if one of our customers installed us on a machine with thousands of infections, the last thing they'd be concerned about is a false positive! Not to mention it would be pretty much impossible to get a PC into that state with Webroot SecureAnywhere installed in the first place!
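To illustrate the idea, here's a toy sketch of that kind of adaptive behavior. The function name and thresholds are made-up illustrations, not our actual engine logic:

```python
# A toy sketch of the adaptive behavior described above: the dirtier
# the machine looks at install time, the more aggressive the
# heuristics become. The thresholds are invented for illustration.

def heuristic_level(active_infections: int) -> str:
    if active_infections == 0:
        return "normal"
    if active_infections < 50:
        return "elevated"
    return "maximum"  # thousands of infections -> treat every file as suspect

print(heuristic_level(0))     # normal
print(heuristic_level(3000))  # maximum -> expect more false positives
```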
One of the biggest problems I have with these tests is that the testers manually update the signature definitions immediately before running their sample malware. In the real world, we don't get the luxury of updating our definitions the second before an infection strikes. With roughly 50,000 new threats appearing every day, there's a huge window of exposure between updates that the tests simply don't account for.
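To put rough numbers on that window, here's a quick back-of-the-envelope sketch. The ~50,000/day figure is the one above; the update intervals are assumptions for illustration only:

```python
# Back-of-the-envelope estimate of how many brand-new threats appear
# between two consecutive signature updates. The ~50,000/day figure
# comes from the post above; the update cadences are hypothetical.

NEW_THREATS_PER_DAY = 50_000

def threats_in_window(update_interval_hours: float) -> float:
    """Average number of new threats emerging between two updates."""
    return NEW_THREATS_PER_DAY * update_interval_hours / 24

for interval in (1, 4, 12, 24):  # hypothetical update intervals, in hours
    print(f"{interval:>2}h between updates -> "
          f"~{threats_in_window(interval):,.0f} new, undetectable threats")
```

Even with hourly updates, that's roughly two thousand threats per window that no signature can catch yet, which is exactly the gap the tests paper over.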
The 0-day tests they perform are also very weak. They typically scan the sample, and if the security product fails to detect it, the sample is executed. If it's then running in memory, the product is assumed to have failed. That ignores the monitoring capability of Webroot SecureAnywhere and the fact that the endpoint is still protected from the threat even while it's running (as you can see in the video in the OP).
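Here's a hypothetical sketch contrasting the two judging policies. The class and method names are illustrative stand-ins, not any tester's or vendor's real procedure:

```python
# Contrasts the 0-day test flow described above with a policy that
# credits monitoring/containment. All names here are illustrative.

from dataclasses import dataclass

@dataclass
class Sample:
    detected_on_scan: bool   # did the scanner flag it up front?
    running: bool = False    # is it executing in memory?
    contained: bool = False  # is the product monitoring/containing it?

def naive_verdict(s: Sample) -> str:
    """The test flow described above: scan, execute if undetected,
    and fail the product the moment the sample is running in memory."""
    if s.detected_on_scan:
        return "pass"
    s.running = True  # tester executes the undetected sample
    return "fail"     # running in memory == failure, full stop

def monitoring_aware_verdict(s: Sample) -> str:
    """A fairer policy: an undetected sample that is running but
    monitored and contained does not count as a failure."""
    if s.detected_on_scan:
        return "pass"
    s.running = True
    return "pass (contained)" if s.contained else "fail"

journaled = Sample(detected_on_scan=False, contained=True)
print(naive_verdict(journaled))             # fail
print(monitoring_aware_verdict(journaled))  # pass (contained)
```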
The performance tests they perform can be very useful, though. :-)