TOPIC: - DISASSEMBLERS
SUBMITTED BY:-
AMBER
COURSE- B.TECH (CSE)
REGD. NO.:-10802649
ROLL NO.:- RC1801A10
ACKNOWLEDGEMENT
Introduction:-
Assembly language:-
Assembly language is the symbolic representation of a computer's binary encoding, its
machine language. Assembly language is more readable than machine language because it
uses symbols instead of bits. The symbols name commonly occurring bit patterns, such as
opcodes and register specifiers, so that people can read and remember them.
In addition, assembly language permits programmers to use labels to identify and name
particular memory words that hold instructions or data. Assembly language comes close to
a one-to-one correspondence between symbolic instructions and executable machine
instructions. Assembly languages also include directives to the assembler, directives to
the linker, directives for organizing data space, and macros. Macros can be used, among
other purposes, to combine several assembly language instructions into a construct that
resembles a high-level language. In some cases a single symbolic instruction is translated
into more than one machine instruction, but in general each symbolic instruction
corresponds to one executable machine instruction.
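The near one-to-one correspondence described above can be illustrated with a small
sketch. It assumes a tiny subset of 32-bit x86 in which every instruction is a single
byte (nop, ret, int3, and push/pop of a general register). The table and the function
names assemble and disassemble are made up for this illustration and are not taken
from any real tool.

```python
# Toy assembler/disassembler for a handful of one-byte 32-bit x86 instructions.
# The subset and all names here are illustrative assumptions.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def build_tables():
    encode = {"nop": 0x90, "ret": 0xC3, "int3": 0xCC}
    for index, reg in enumerate(REGS):
        encode[f"push {reg}"] = 0x50 + index   # push r32 is 0x50 + register number
        encode[f"pop {reg}"] = 0x58 + index    # pop r32 is 0x58 + register number
    decode = {byte: text for text, byte in encode.items()}
    return encode, decode

ENCODE, DECODE = build_tables()

def assemble(lines):
    """Translate each symbolic instruction into exactly one byte."""
    return bytes(ENCODE[line.strip()] for line in lines)

def disassemble(code):
    """Translate each byte back; unknown bytes are shown as raw data (db)."""
    return [DECODE.get(byte, f"db 0x{byte:02x}") for byte in code]

source = ["push ebp", "nop", "pop ebp", "ret"]
machine_code = assemble(source)
print(machine_code.hex())          # 55905dc3
print(disassemble(machine_code))   # the original four instructions
```

Because each table entry maps one symbol to one byte, disassembly here is just the
inverse lookup. Real instruction sets have variable-length encodings and operands, so
a real disassembler needs a much larger decoder than this table.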
• IDA 3.7:- A DOS GUI tool that behaves very much like IDA Pro, but is
considerably more limited. It can disassemble code for the Z80, 6502, Intel 8051,
Intel i860, and PDP-11 processors, as well as x86 instructions up to the 486.
• IDA Pro Freeware :- Behaves almost exactly like IDA Pro, but disassembles only
Intel x86 opcodes and is Windows-only. It can disassemble instructions for those
processors available as of 2003.
• BORG Disassembler :- BORG is an excellent Win32 disassembler with a GUI.
• diStorm64 disassembler:- diStorm is an open-source, highly optimized stream
disassembler library for 80x86 and AMD64.
• OllyDbg: - OllyDbg is currently one of the most popular disassemblers. It has a large
community and a wide variety of plugins available.
• Objconv
• Objdump: - objdump comes standard (it is part of GNU binutils), and is typically used
for general inspection of binaries. Pay attention to the relocation option (-r) and the
dynamic symbol table option (-T).
• Gdb: - gdb comes standard as a debugger, but is very often used for disassembly.
• Dissy: - This program is an interactive disassembler that uses objdump.
Separating Code from Data:-
Since data and instructions are all stored in an executable as binary data, the obvious
question arises: how can a disassembler tell code from data? Is any given byte a
variable, or part of an instruction?
The problem would not be as difficult if data were limited to the .data section of an
executable and executable code were limited to the .code section, but this is often not
the case. Data may be inserted directly into the code section (e.g. jump address tables,
constant strings), and executable code may be stored in the data section (although newer
systems try to prevent this for security reasons by marking data as non-executable).
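The effect of data placed inside the code section can be shown with a small sketch.
It assumes a toy decoder that knows only a few 32-bit x86 opcodes (jmp short, push,
pop, nop, ret) and decodes bytes strictly one after another, a strategy usually
called a linear sweep. The function names and the sample bytes are made up for this
illustration. The sample contains a jump over the four ASCII characters "PQRS", which
happen to have the same byte values as four push instructions.

```python
# Linear sweep over a toy 32-bit x86 subset (illustrative assumption, not a real tool).
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_one(code, pc):
    """Decode the instruction at offset pc and return (text, length)."""
    op = code[pc]
    if op == 0xEB and pc + 1 < len(code):            # jmp short rel8
        rel = int.from_bytes(code[pc + 1:pc + 2], "little", signed=True)
        return f"jmp short 0x{pc + 2 + rel:x}", 2
    if 0x50 <= op <= 0x57:                           # push r32
        return f"push {REGS[op - 0x50]}", 1
    if 0x58 <= op <= 0x5F:                           # pop r32
        return f"pop {REGS[op - 0x58]}", 1
    if op == 0x90:
        return "nop", 1
    if op == 0xC3:
        return "ret", 1
    return f"db 0x{op:02x}", 1                       # unknown byte, shown as data

def linear_sweep(code):
    """Decode from the first byte to the last, ignoring control flow."""
    pc, listing = 0, []
    while pc < len(code):
        text, length = decode_one(code, pc)
        listing.append((pc, text))
        pc += length
    return listing

# jmp over an embedded string, then the real code: nop, ret
program = b"\xEB\x04" + b"PQRS" + b"\x90\xC3"
for offset, text in linear_sweep(program):
    print(f"{offset:04x}  {text}")
```

The listing shows push eax, push ecx, push edx and push ebx at offsets 2 to 5, although
those bytes are a string that is never executed. Nothing in the bytes themselves marks
them as data.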
The general problem of separating code from data in arbitrary executable programs is
equivalent to the halting problem. As a consequence, it is not possible to write a
disassembler that will correctly separate code and data for all possible input programs.
Reverse engineering is full of such theoretical limitations; indeed, by Rice's theorem
all non-trivial questions about program behavior are undecidable. In practice, a
combination of interactive and automatic analysis and perseverance can handle all but
those programs specifically designed to thwart reverse engineering, for example by
encrypting code and decrypting it just prior to use, or by moving code around in memory.
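One common automatic heuristic is to follow the control flow from a known entry point
and to treat only the bytes reached that way as code, often called recursive
traversal. The sketch below assumes a toy subset of 32-bit x86 (jmp short, jz, push,
pop, nop, ret); the function names and the sample bytes are made up for this
illustration. It is a sketch of the idea, not a complete method, since it cannot
follow jumps whose target is computed at run time.

```python
# Recursive traversal over a toy 32-bit x86 subset (illustrative assumption).
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode(code, pc):
    """Return (text, length, successor offsets), or None for an unknown byte."""
    op = code[pc]
    if op in (0xEB, 0x74) and pc + 1 < len(code):
        rel = int.from_bytes(code[pc + 1:pc + 2], "little", signed=True)
        target = pc + 2 + rel
        if op == 0xEB:                                # jmp short: only the target follows
            return f"jmp short 0x{target:x}", 2, [target]
        return f"jz 0x{target:x}", 2, [pc + 2, target]  # jz: fall-through and target
    if 0x50 <= op <= 0x57:
        return f"push {REGS[op - 0x50]}", 1, [pc + 1]
    if 0x58 <= op <= 0x5F:
        return f"pop {REGS[op - 0x58]}", 1, [pc + 1]
    if op == 0x90:
        return "nop", 1, [pc + 1]
    if op == 0xC3:
        return "ret", 1, []                           # ret ends this path
    return None

def recursive_traversal(code, entry=0):
    """Decode only bytes reachable from entry; report everything else as data."""
    listing, is_code, worklist = {}, [False] * len(code), [entry]
    while worklist:
        pc = worklist.pop()
        if not 0 <= pc < len(code) or is_code[pc]:
            continue
        decoded = decode(code, pc)
        if decoded is None:
            continue
        text, length, successors = decoded
        listing[pc] = text
        for offset in range(pc, pc + length):
            is_code[offset] = True
        worklist.extend(successors)
    for pc, reached in enumerate(is_code):
        if not reached:
            listing[pc] = f"db 0x{code[pc]:02x}"
    return [(pc, listing[pc]) for pc in sorted(listing)]

# jmp over the embedded string "PQRS", then nop, ret
program = b"\xEB\x04" + b"PQRS" + b"\x90\xC3"
for offset, text in recursive_traversal(program):
    print(f"{offset:04x}  {text}")
```

Here the four string bytes are reported as db lines because no path reaches them. The
same rule also shows the limits mentioned above: code that is stored encrypted, or
that is reached only through a computed jump, would be reported as data too.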
Lost Information:-
All text-based identifiers, such as variable names, label names, and macros, are removed
by the assembly process. They may still be present in generated object files, for use by
tools like debuggers and relocating linkers, but the direct connection is lost, and
re-establishing that connection requires more than a mere disassembler. These
identifiers, together with the comments in the source file, help to make the code more
readable to a human, and can also give clues about the purpose of the code. Without
these comments and identifiers, it is harder to understand the purpose of the code, and
it can be difficult to determine the algorithm it uses. When you combine this problem
with the possibility that the code you are trying to read may, in reality, be data, it
can be even harder to determine what is going on.
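Since the original label names are gone, a disassembler can at best invent new ones
from the addresses it sees. The sketch below assumes the instructions have already
been decoded into (address, mnemonic, jump target) tuples, and it uses a made-up
naming scheme, "loc_" followed by the address. The function name and the sample input
are hypothetical.

```python
# Generate synthetic labels for jump targets (illustrative sketch).
def add_synthetic_labels(instructions):
    """instructions: list of (address, mnemonic, target address or None)."""
    targets = {target for _, _, target in instructions if target is not None}
    names = {target: f"loc_{target:04x}" for target in targets}
    lines = []
    for address, mnemonic, target in instructions:
        if address in names:                      # this address is a jump target
            lines.append(f"{names[address]}:")
        operand = f" {names[target]}" if target is not None else ""
        lines.append(f"    {mnemonic}{operand}")
    return lines

decoded = [
    (0, "jmp short", 6),
    (6, "nop", None),
    (7, "ret", None),
]
for line in add_synthetic_labels(decoded):
    print(line)
```

Whatever name the programmer gave to the location at address 6, the output can only
call it loc_0006. The structure of the jump is recovered, but the meaning that the
name carried is not.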