Virus

Virus vs Anti-Virus: The Arms Race
Patrick Graydon, Qiuhua Cao
Outline
Viruses
Anti-Viruses
Discussion
Viruses
"A virus is a program that can infect other programs by modifying them to include a possibly evolved copy of itself." - Fred Cohen
Fred Cohen seems to have been the first to define the term virus, but the concept had been discussed earlier, and there were some viruses out in the wild before he began his research.
Link to virus history
Example of a virus
In his Turing Award acceptance speech to the ACM (published in 1984), Ken Thompson related the story of how he modified the C compiler to insert a backdoor into the UNIX login program and to carry both modifications into any C compiler built with his modified compiler.
Slick: no trace of the backdoor remains in any source code!
Viruses example
The WM.Nuclear Microsoft Word macro virus infects Word documents during opening, saving, and printing by adding a set of macros to them. On April 5th it attempts to overwrite critical system files, and it occasionally adds the text "STOP ALL FRENCH NUCLEAR TESTING IN THE PACIFIC!" to the current document. (Information from Symantec's security bulletin.)
Worms are not viruses
The VBS.SST@mm ("Anna Kournikova") malware is a worm, not a virus, because it emails copies of itself but does not infect any other documents. (Information about VBS.SST@mm from Symantec's security bulletin.)
Malware terminology
We found a web site listing 56 different terms related to viruses and malware, including:

Backdoor
Boot sector viruses
Encrypted virus
Hoax
Macro virus
Virus statistics
Here are some statistics from 2000 we found on the web:
Over 85% of all the known viruses are for Microsoft platforms (nearly all the self-propagating worms are as well)
Slightly less than 52,000 are viruses for DOS/Windows/NT platforms
- about 6000 of these are Word macro viruses
- about 150-200 of these are known to be widespread "in the wild"
- in 1999, approximately 650 new viruses were reported each month (more than 20 a day)
Virus statistics
More statistics from the same site
A few hundred are for Javascript, Hypercard, Perl, and other scripting languages. Few of these can spread beyond a few machines without active support of the users.
150 are for the Atari
31 are native to the Macintosh, and only two of them are known to exist anymore
2 or 3 are viruses native to OS/2
Virus statistics (contd)
More statistics from the same site
About 5 are for Linux/Unix/etc, but none have been found in quantity "in the wild", nor would they be likely to spread very far if they were "loose"
None are for BeOS, ErOS, or other small-population systems.
Question: can we reduce the risk of getting a virus infection by not using Microsoft products?
Example virus
Fred Cohen's example virus:

program virus :=
{ 1234567;

  subroutine infect-executable :=
  { loop: file = get-random-executable-file;
    if first-line-of-file = 1234567 then goto loop;
    prepend virus to file;
  }

  subroutine do-damage :=
  { whatever damage is to be done }

  subroutine trigger-pulled :=
  { return true if some condition holds }

  main-program :=
  { infect-executable;
    if trigger-pulled then do-damage;
    goto next;
  }

  next:
}
More about viruses
Viruses aren't necessarily hard to write
Cohen reports that his first virus took only 8 hours for an experienced programmer to write.
Viruses aren't necessarily big
Cohen reports on a UNIX shell script virus that was only 7 lines long.
Viruses aren't necessarily malware
Cohen describes a hypothetical virus that compresses executables to conserve disk space.
Viruses can be malicious in many ways
Virus payloads could:
Carry out a denial of service attack
Crash the machine
Randomly destroy data
Install a trojan horse program
Perform password cracking
and basically any other nasty thing you can think of.
Making matters worse
Virus payloads may not trigger immediately. If a virus has few detectable side effects, it could spread without notice and become widespread before the payload is triggered.
Question: is it possible that there are viruses in the wild today that have infected large numbers of systems but have gone unnoticed because they have few if any side effects and have not yet triggered their destructive payloads?
Isolation
One way to protect against infection is to isolate systems, users, and/or information to make it difficult or impossible for a virus to spread widely. Total isolation is a sure cure.
Total isolation probably isn't practical for most users. Imagine life:
- without Google
- without BitTorrent
- without Amazon.com
Partitioning
If we can't isolate systems and users from each other completely, maybe we can erect partitions to limit the spread of malware. It was thought that the Bell-LaPadula model might help limit the spread of viruses, but Cohen reports that viruses demonstrated the ability to cross user boundaries and move from a given security level to a higher security level.
Partitioning (continued)
According to Cohen, the Biba and Bell-LaPadula models, if combined, would tend to create partitions.
Unfortunately: "When we mix the Biba and Bell-LaPadula models, we find that the resulting isolationism secures us from viruses, but doesn't permit any user to write programs that can be used throughout the system." - Cohen
Bad news about partitioning
Transitivity is a problem:
"If there is a path from user A to user B, and there is a path from user B to user C, then there is a path from user A to user C with the witting or unwitting cooperation of user B." - Cohen
The military uses a category system in which users can only access information needed for their current duties. But some users have simultaneous access to multiple categories.
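The transitivity problem can be illustrated with a small reachability computation. This is a minimal sketch rather than Cohen's own formulation: the sharing graph, the user names, and the `reachable` helper are all hypothetical.

```python
from collections import deque

def reachable(shares, start):
    """Return every user that information from `start` can eventually reach.

    `shares` maps each user to the users they can pass information to directly.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        for peer in shares.get(user, ()):
            if peer not in seen:
                seen.add(peer)
                queue.append(peer)
    seen.discard(start)  # report only the other users
    return seen

# Hypothetical sharing graph: A shares with B, and B shares with C.
shares = {"A": ["B"], "B": ["C"], "C": []}
print(sorted(reachable(shares, "A")))  # ['B', 'C']
```

A never shares with C directly, yet C is reachable from A through B. A user with simultaneous access to two categories plays the role of B.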
More bad news
According to Cohen, a precise system for integrity is NP-complete, and any non-NP-complete solution must tend toward isolationism.
If a system restricts users' actions unnecessarily, it will be unpopular.
And the hits just keep on coming
Cohen notes that flow distance and flow list models may limit virus spread.
Flow distance restrictions limit how far information can travel.
Flow lists allow more arbitrary expressions for accessibility based on the list of users who have had an effect on an object.
BUT: tracing exact information flow requires NP-complete time, and maintaining markings requires large amounts of space.
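Both ideas can be sketched in a few lines. This is an illustrative model under simplifying assumptions, not Cohen's definition: `propagate`, `flow_list_allows`, the hop-count notion of distance, and the "everyone on the list must be trusted" policy are hypothetical choices.

```python
def propagate(shares, origin, max_distance):
    """Map each user reached from `origin` to its hop count,
    when information may travel at most `max_distance` hops."""
    distance = {origin: 0}
    frontier = [origin]
    while frontier:
        next_frontier = []
        for user in frontier:
            if distance[user] == max_distance:
                continue  # flow distance exhausted: no further sharing
            for peer in shares.get(user, ()):
                if peer not in distance:
                    distance[peer] = distance[user] + 1
                    next_frontier.append(peer)
        frontier = next_frontier
    return distance

def flow_list_allows(flow_list, trusted):
    """One possible flow-list policy: allow access only if every user
    who has affected the object is in the trusted set."""
    return set(flow_list) <= set(trusted)

# Hypothetical chain of users who share with one another.
shares = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
print(propagate(shares, "A", 1))  # {'A': 0, 'B': 1}
print(flow_list_allows(["A", "B"], {"A", "B", "C"}))  # True
```

With a flow distance of 1, anything originating at A stops at B. The cost the slide mentions shows up here as the `distance` and flow-list bookkeeping that must be kept for every object.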
Prevention by law
Couldn't we just make it against the law?
"By simply telling users not to launch attacks, little is accomplished; users who can be trusted will not launch attacks; but users who would do damage cannot be trusted, so only legitimate work is blocked." - Cohen
Limited interpretation
If a given document is interpreted, and the interpreter lacks commands like "write file", it may be impossible for it to have a virus.
Graphics files are probably immune
- Except AnnaKournikova.jpg.vbs, which is a script with a misleading name
Documents that can hold scripts probably aren't
- Word documents can contain macro viruses such as WM.Nuclear
Detection
If we can't limit the spread of a virus, maybe we can find it and quarantine infected files.
Unfortunately, no general algorithm for detecting virus behavior is possible.
Cohen argues this by proposing a virus that infects only when the detection algorithm thinks it isn't a virus. Anti-virus programs must make do with more limited solutions, such as scanning for a virus signature.
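The shape of Cohen's argument can be shown with plain booleans. This is only a sketch of the contradiction, with no infection logic at all: `ContraryProgram` and the two toy detectors are hypothetical names, and "behaves like a virus" is reduced to a true/false answer.

```python
class ContraryProgram:
    """A program that asks a detector about itself and then does the opposite."""

    def __init__(self, detector):
        self.detector = detector

    def behaves_like_virus(self):
        # Acts like a virus only when the detector says it is NOT one.
        return not self.detector(self)

def always_clean(program):
    return False  # verdict: not a virus

def always_infected(program):
    return True   # verdict: virus

for detector in (always_clean, always_infected):
    program = ContraryProgram(detector)
    verdict = detector(program)
    actual = program.behaves_like_virus()
    print(detector.__name__, "verdict:", verdict, "actual:", actual)
```

Whatever a detector answers, the contrary program built from it makes that answer wrong. A detector that tried to decide by running `program.behaves_like_virus()` would call itself without end.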
Virus detection problems
According to Cohen, the following are undecidable:

Detection of a virus by its appearance
Detection of a virus by its behavior
Detection of an evolution of a known virus
Detection of a triggering mechanism by its appearance
Detection of a triggering mechanism by its behavior
Detection of an evolution of a known triggering mechanism
Detection of a virus detector by its appearance
Detection of a virus detector by its behavior
Detection of an evolution of a known viral detector
Detection by signature
Rather than implement a general solution, virus scanners look for virus signatures.
These signatures could be as small as a few bytes or as large as the entire virus code. If a virus scanner uses the whole virus code as a signature, it may not be able to find simple variants of a virus. However, if a virus scanner uses a very small signature, it may incorrectly report infections that aren't there.
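The trade-off in signature length can be seen with a toy scanner. This is a minimal sketch, not how any particular product works: the `scan` helper and the signature names are hypothetical, and the byte strings are made-up placeholders rather than real virus code.

```python
def scan(data: bytes, signatures: dict) -> list:
    """Return the names of all signatures that occur anywhere in `data`."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)

# Stand-in byte strings; these are placeholders, not real virus code.
sample = b"HEADER" + b"\x90\x90MARKER-1234\xeb\xfe" + b"TAIL"
variant = b"HEADER" + b"\x90\x90MARKER-1234\xeb\xff" + b"TAIL"  # one byte changed
clean = b"a harmless file that happens to contain \x90\x90 bytes"

signatures = {
    "whole-body": b"\x90\x90MARKER-1234\xeb\xfe",  # the entire sample body
    "medium": b"MARKER-1234",
    "tiny": b"\x90\x90",
}

print(scan(sample, signatures))   # ['medium', 'tiny', 'whole-body']
print(scan(variant, signatures))  # ['medium', 'tiny']
print(scan(clean, signatures))    # ['tiny']
```

The whole-body signature misses the one-byte variant, while the two-byte signature flags a clean file.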
Updated signatures
Anti-virus companies must release new signatures each time a new virus is discovered.
A virus's spread is unimpeded for a while. According to Andreas Marx of AV-Test.org, it took Symantec 25h 5m to release an updated signature file in response to the W32/Sober.C worm attack.
The arms race
To make it hard for virus scanners to detect their viruses, virus writers can add morphing behavior to their creations:
"A polymorphic virus morphs itself in order to evade detection. Metamorphic viruses attempt to evade heuristic detection techniques by using more complex obfuscations." - Christodorescu and Jha
More bad news
Cohen argues that no general solution for proving the equivalence of two programs is possible.
His argument follows the same form as his argument against a general algorithm for virus detection: he proposes a virus in which two different infection instances will behave differently when a watching antivirus program believes they are the same.
Morphing
A virus may morph itself by:

Encrypting part of itself using a different key for each infection
Changing variable names (in a script virus)
Binary obfuscation techniques (more on this later)
Polymorphic virus examples:
Chameleon: first polymorphic virus, 90s
A partial list of the viruses that can be called 100 percent polymorphic (late 1993): Bootache, CivilWar (four versions), Crusher, Dudley, Fly, Freddy, Ginger, Grog, Haifa, Moctezuma (two versions), MVF, Necros, Nukehard, PcFly (three versions), Predator, Satanbug, Sandra, Shoker, Todor, Tremor, Trigger, Uruguay (eight versions).
Source: Virus-Scan-Software

Arming the virus writers
If a virus author knew what the anti-virus programs look for, he or she could design a virus that they wouldn't find.
Example: in the early 90s there were a few MS-DOS "stealth" viruses that could intercept a virus-scanning program's attempt to read the boot record and show it a clean version rather than what was really there.

See Symantec's description of the Stealth_boot virus.
"Frodo.4096" virus, first stealth virus
"Beast.512" stealth virus, less than a year after Frodo.4096
More on this at Virus-Scan-Software
Extracting signatures
Christodorescu and Jha report on a technique for extracting the signature used by a given antivirus program.
Basically they obfuscate parts of the program and determine what has to remain unobfuscated for the antivirus program to find the virus.
FYI there is a typo in the paper: the conditions on the loop in the SignatureExtraction function cause it to never execute
They say it was successful in many cases.
Binary obfuscation techniques
The goal of binary obfuscation is to make it difficult to obtain an assembly-language description of a program from its raw bytes.
You need to turn raw bytes back into assembly code before you can decompile.
You can obfuscate by:

Garbage insertion (more in a minute)
Variable renaming
Code reordering
Encapsulating/encrypting code or data
x86 binary obfuscation
If you create unused regions in the executable and fill them with garbage bytes, the variable-length nature of the x86 instruction set can cause disassemblers to think that the legitimate instructions following the garbage are in fact operands. You can use a conditional branch instruction to do an unconditional jump; disassemblers assume that there are no garbage bytes at the target address or following the branch instruction.
Better obfuscation
Linn and Debray describe obfuscation using a branch function
This function in turn branches to another target depending on where it is called from.
This makes determining which parts of the program are real by following the branch instructions difficult. The function can return to an instruction one or more bytes after the usual return point, opening up a region to insert more garbage bytes into.
Advances in disassembly
Kruegel, Robertson, Valeur and Vigna describe a disassembler that is able to correctly disassemble most instructions from a program obfuscated by the obfuscator Linn and Debray describe.
Disassembly in detail
Static analysis techniques
Linear sweep
- GNU's objdump uses linear sweep
- Gets confused by garbage bytes in unreachable areas
Recursive traversal following control flow
- Drawback: indirect jumps
- Doesn't always see the whole binary
Hybrid approach
- Speculative disassembly
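The difference between the first two techniques shows up even on a toy instruction set. This is a sketch under heavy simplification: the four-opcode variable-length encoding, and the names `decode`, `linear_sweep`, and `recursive_traversal`, are invented for illustration and are not x86 or any real tool's API.

```python
# Toy variable-length instruction set (hypothetical, for illustration only):
# opcode -> (mnemonic, total length in bytes)
OPCODES = {0x01: ("NOP", 1), 0x02: ("LOAD", 2), 0x03: ("JMP", 2), 0x04: ("HALT", 1)}

def decode(code, addr):
    """Decode one instruction at `addr`; return (mnemonic, operand, length)."""
    name, length = OPCODES.get(code[addr], ("??", 1))
    operand = code[addr + 1] if length == 2 and addr + 1 < len(code) else None
    return name, operand, length

def fmt(name, operand):
    return name if operand is None else f"{name} {operand}"

def linear_sweep(code):
    """Decode byte after byte from the start, ignoring control flow."""
    out, addr = [], 0
    while addr < len(code):
        name, operand, length = decode(code, addr)
        out.append((addr, fmt(name, operand)))
        addr += length
    return out

def recursive_traversal(code, entry=0):
    """Decode only the addresses that control flow can actually reach."""
    seen, work = {}, [entry]
    while work:
        addr = work.pop()
        if addr in seen or addr >= len(code):
            continue
        name, operand, length = decode(code, addr)
        seen[addr] = fmt(name, operand)
        if name == "JMP":
            if operand is not None:
                work.append(operand)    # follow the jump, skip what lies after it
        elif name not in ("HALT", "??"):
            work.append(addr + length)  # fall through to the next instruction
    return sorted(seen.items())

# JMP 3, then one garbage byte (0x02 looks like LOAD), then the real code.
code = bytes([0x03, 3, 0x02, 0x01, 0x04])
print(linear_sweep(code))         # [(0, 'JMP 3'), (2, 'LOAD 1'), (4, 'HALT')]
print(recursive_traversal(code))  # [(0, 'JMP 3'), (3, 'NOP'), (4, 'HALT')]
```

Linear sweep decodes the garbage byte as a LOAD and swallows the real NOP as its operand. Recursive traversal follows the jump and recovers the real code. The toy has no indirect jumps, which is where recursive traversal would lose track of targets.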
Now for some good news
"This arms race is usually in favor of the deobfuscator. The obfuscator has to devise techniques that transform the program without seriously impacting the run-time performance or increasing the binary's size or memory footprint while there are no such constraints for the de-obfuscator." - Kruegel et al.
AV tool resistance to obfuscation
Christodorescu and Jha claim the state of the art for malware detectors is dismal!
They propose a testing technique and then use it to show that the tested virus scanners were not generally able to identify the sampled viruses when they were obfuscated by code reordering or encapsulation.
AV tool resistance to obfuscation (contd)
This doesn't mean that these products aren't capable of detecting morphing viruses; the viruses in the sample set did not perform these morphs in the wild. This does mean that, to protect against a new virus that is just a simple modification of one of these existing viruses, the AV companies would have to release a new signature file.
Known clean system
Some virus detection techniques require you to start from a clean system.
DOS users used clean boot disks to defeat stealth viruses.
But is it always possible to get to a known clean state?
What if every UNIX vendor had been infected with Ken Thompson's C compiler virus? Even their "clean" distribution media would be infected.
Discussion
Obfuscation vs. deobfuscation: who can win?
Discussion (contd)
Can anti-virus win in the future?
Questions?
Thanks
