August 30, 2003, 10:41
|
#1
|
Deity
Join Date: Sep 2000
Location: Latvia, Riga
Posts: 18,355
|
Programming Question
I could definitely use input from Asher or someone else here.
Obviously, it's possible to write a disassembler that would disassemble another executable and provide the code for it.
The question is whether a program can disassemble itself. Dunno, maybe by first creating a copy of my program, but somehow. The disassembled code doesn't have to change - I'm not aiming for a self-modifying program here. Merely viewing the code, and that's it.
Help or links to material would be much appreciated.
__________________
Solver, WePlayCiv Co-Administrator
Contact: solver-at-weplayciv-dot-com
I can kill you whenever I please... but not today. - The Cigarette Smoking Man
|
|
|
|
August 30, 2003, 11:14
|
#2
|
Civ4: Colonization Content Editor
Join Date: Dec 2001
Posts: 11,117
|
Re: Programming Question
Quote:
|
Originally posted by Solver
The question stands, is it possible for a program to disassemble itself.
|
Only if the program is a disassembler. But I fear that's not the answer you're looking for.
Quote:
|
Dunno, maybe with first creating a copy of my program first, but somehow. The disassembled code doesn't have to be changing - I'm not aiming for a self-changing program here. Merely viewing the code, and that's it.
|
Some programs contain debug information, which describes symbols such as function, class and variable names. Together with a debugger, you can then reverse engineer the program. But all you get is the assembler code with label names, and that's not enough to analyze complex programs. You can't, for instance, recover C++ source code this way.
Debug information is usually not included in the release version of a program, though.
|
|
|
|
August 30, 2003, 11:16
|
#3
|
Deity
Join Date: Sep 2000
Location: Latvia, Riga
Posts: 18,355
|
But what if I also build the code of a disassembler into my program? That is, let it do its normal work plus the hidden disassembler work, so that it could produce disassembler output for itself.
Debug information... yep, good stuff, but it doesn't suit me. I need the program to provide full disassembler output for itself.
If only I knew how exactly one makes a disassembler?
|
|
|
|
August 30, 2003, 11:21
|
#4
|
Emperor
Join Date: May 2001
Location: flying too low to the ground
Posts: 4,625
|
i have never seen a "perfect" disassembler; every time i have used one, a lot of code was "lost" in translation.
but that was a long time ago, perhaps things have improved.
as for links, a quick search of my fav coding sites and some googling yielded nothing of major interest.
i'm also assuming you mean a C disassembler.
__________________
"I've lived too long with pain. I won't know who I am without it. We have to leave this place, I am almost happy here."
- Ender, from Ender's Game by Orson Scott Card
|
|
|
|
August 30, 2003, 11:35
|
#5
|
Deity
Join Date: Sep 2000
Location: Latvia, Riga
Posts: 18,355
|
Quote:
|
i'm also assuming you mean a C disassembler.
|
Indeed, C or C++ stuff...
|
|
|
|
August 30, 2003, 12:46
|
#6
|
Civ4: Colonization Content Editor
Join Date: Dec 2001
Posts: 11,117
|
Quote:
|
Originally posted by Solver
If only I knew how exactly does one make a disassembler ?
|
Writing a disassembler is pretty easy in principle, but for modern processors it is a hell of a lot of work. Just take the whole numeric instruction set and write a program that uses the opcode values and addressing modes to reconstruct the assembler code. I've done that for several processors of the 65xx series and for the Z80. In theory, you can write a disassembler for the Pentium 4 this way, too. But nobody would do so.
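To make the table-driven approach concrete, here is a minimal sketch in Python. It covers only a handful of 6502 opcodes (one member of the 65xx family mentioned above); the table layout and the function name `disassemble` are illustrative choices, not taken from any existing tool. Unknown bytes are emitted as data, which is what a simple linear disassembler typically does.

```python
# Opcode table: opcode byte -> (mnemonic, addressing mode).
# Only a small, illustrative subset of the 6502 instruction set.
OPCODES = {
    0xA9: ("LDA", "imm"),
    0xA2: ("LDX", "imm"),
    0xAD: ("LDA", "abs"),
    0x8D: ("STA", "abs"),
    0x4C: ("JMP", "abs"),
    0x20: ("JSR", "abs"),
    0xD0: ("BNE", "rel"),
    0xE8: ("INX", "impl"),
    0xCA: ("DEX", "impl"),
    0xEA: ("NOP", "impl"),
    0x60: ("RTS", "impl"),
}

# Number of operand bytes that follow the opcode for each addressing mode.
OPERAND_SIZE = {"impl": 0, "imm": 1, "rel": 1, "abs": 2}


def disassemble(code, origin=0):
    """Linearly decode `code` (bytes) loaded at address `origin`."""
    lines = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        address = origin + pc
        entry = OPCODES.get(opcode)
        size = OPERAND_SIZE[entry[1]] if entry else 0
        operand = code[pc + 1:pc + 1 + size]
        if entry is None or len(operand) < size:
            # Unknown opcode or truncated instruction: emit it as raw data.
            lines.append(f"{address:04X}  .byte ${opcode:02X}")
            pc += 1
            continue
        mnemonic, mode = entry
        if mode == "impl":
            text = mnemonic
        elif mode == "imm":
            text = f"{mnemonic} #${operand[0]:02X}"
        elif mode == "abs":
            # 16-bit operands are stored low byte first.
            text = f"{mnemonic} ${operand[0] | (operand[1] << 8):04X}"
        else:
            # Relative branch: signed 8-bit offset from the next instruction.
            offset = operand[0] - 256 if operand[0] >= 0x80 else operand[0]
            text = f"{mnemonic} ${(address + 2 + offset) & 0xFFFF:04X}"
        lines.append(f"{address:04X}  {text}")
        pc += 1 + size
    return lines


if __name__ == "__main__":
    # A small countdown loop: LDX #5 / DEX / BNE back to DEX / RTS.
    for line in disassemble(bytes([0xA2, 0x05, 0xCA, 0xD0, 0xFD, 0x60]), 0x0600):
        print(line)
```

The real work for a modern processor is filling in the table and handling variable-length encodings; the decoding loop itself stays about this simple.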
Reverse engineering C or C++ code this way is not possible, because compilation is not unambiguously reversible. Many modern compilers optimize code: they reorder instructions, keep variables in registers instead of memory, inline functions, and eliminate unused code and data altogether. You can't restore the program as it was. Forget it. All you can do is compromise. Introduce variables named EAX, EBX, ECX, EDX, ESI, EDI, EBP and so on, which correspond to the processor registers, keep track of the memory addresses used, and find a naming scheme to symbolize them (like SYM_001234AB for address 001234AB hex). This way you'd get C code that could compile back into the program, but that code has absolutely nothing to do with the original source the program was written in.
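The address-naming step can be sketched as follows. This is only an illustration in Python: the function name `symbolize` and the assumption that addresses appear in the listing as `0x` followed by eight hex digits are mine, not part of any particular disassembler's output format.

```python
import re

# Assumed listing format: absolute addresses written as 0x + 8 hex digits.
ADDRESS = re.compile(r"\b0x([0-9A-Fa-f]{8})\b")


def symbolize(listing):
    """Replace raw addresses with SYM_xxxxxxxx names and collect a symbol table."""
    symbols = {}

    def replace(match):
        address = int(match.group(1), 16)
        name = f"SYM_{address:08X}"
        symbols[name] = address
        return name

    rewritten = [ADDRESS.sub(replace, line) for line in listing]
    return rewritten, symbols


if __name__ == "__main__":
    lines, table = symbolize([
        "mov eax, [0x001234AB]",
        "call 0x00401000",
        "mov [0x001234ab], ebx",
    ])
    print("\n".join(lines))
    print(table)
```

The same address always maps to the same symbol, so references to one memory location stay recognizable throughout the listing even though the original variable name is gone.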
|
|
|
|
August 30, 2003, 13:37
|
#7
|
Deity
Join Date: Sep 2000
Location: Latvia, Riga
Posts: 18,355
|
Yup, the Halting Problem gets in the way there.
The good thing, though, is that any disassembler will recover the MOV, CALL, PUSH and NOP instructions from the executable... basically all I need.
Looked at a few thingies that try to produce C code that way... nahh... let the kids play with them, I'll go with my ASM instructions.
|
|
|
|
August 30, 2003, 15:27
|
#8
|
Warlord
Join Date: Oct 1999
Location: Milan, Italy
Posts: 127
|
An interesting request. I'm not sure I have understood it correctly, so forgive me if my suggestions don't apply to your case.
Scripting languages usually offer this feature (I don't know whether you need a compiled exe program).
For writing self-modifying code (you don't need it, I know, but if a program can modify itself, it can also read itself), the king is always LISP.
Anyway, if I remember correctly, Rexx is also very good at this:
Yes, I checked: it has a built-in function called Sourceline(n), which returns a string containing the Rexx program source line you requested.
And with the Interpret instruction, Rexx does the other thing: it takes a line (or more) of Rexx code and executes it.
If you don't want to use scripting languages, you could look at VM-based languages (Java and .NET). Since the compiled code is designed to be interpreted if needed (bytecode for Java, MSIL for .NET), decompilation is generally feasible.
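The same property holds for other bytecode-based runtimes. As an illustration (using Python rather than Java or .NET, purely as a stand-in), the standard `dis` module lets a running program list the bytecode of its own functions. The helper name `disassemble_self` is hypothetical, and the exact opcode names printed depend on the interpreter version.

```python
import dis


def add(a, b):
    return a + b


def disassemble_self(func):
    """Return (offset, opname, argrepr) for every bytecode instruction of func."""
    return [
        (ins.offset, ins.opname, ins.argrepr)
        for ins in dis.get_instructions(func)
    ]


if __name__ == "__main__":
    # The program lists the bytecode of one of its own functions ...
    for offset, opname, argrepr in disassemble_self(add):
        print(f"{offset:4d}  {opname:<24} {argrepr}")
    # ... and the disassembling function can also inspect itself.
    print(len(disassemble_self(disassemble_self)), "instructions in disassemble_self")
```

Note that this shows VM bytecode, not native machine instructions, so it answers the "program viewing its own code" question only for this class of languages.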
.NET in particular comes with some interesting Reflection features. Aside from letting you inspect everything about your assemblies, such as the types they contain with their methods, fields, properties, etc. (which can be done with Java Reflection too), .NET goes further and lets you add metadata to an assembly and inspect it afterwards, using so-called attributes. Attributes are user-defined and can contain anything you want, so one solution could be tagging every method with a "SourceAttribute" containing its source code.
Another .NET feature is runtime compilation: the CodeDom classes (for example CSharpCodeProvider) let you invoke the C# (or VB) compiler at runtime to compile source files into an assembly (a sort of Rexx Interpret), whereas Reflection.Emit generates IL directly. You could use it this way: the exe is a small piece of startup code with a C# source file (containing the true application) embedded in it; the startup code just compiles the source file and launches it. In this way you can always request the source files of the program directly from the program.
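The "small startup code plus embedded source" idea can be sketched in a few lines. The example below uses Python's built-in `compile` and `exec` as a stand-in for the .NET compiler, and the names `EMBEDDED_SOURCE`, `launch` and `show_source` are hypothetical; it is meant only to show the shape of the design, not how it would look in C#.

```python
# The "true application", carried inside the program as plain source text.
EMBEDDED_SOURCE = '''
def greet(name):
    return "Hello, " + name
'''


def launch(source):
    """Compile the embedded source at runtime and return its namespace."""
    namespace = {}
    code = compile(source, "<embedded>", "exec")
    exec(code, namespace)
    return namespace


def show_source():
    """Because the source travels with the program, it can always be shown."""
    return EMBEDDED_SOURCE


if __name__ == "__main__":
    app = launch(EMBEDDED_SOURCE)
    print(app["greet"]("Solver"))
    print(show_source())
```

The trade-off is that the program ships its own source, which answers the "view my own code" question by construction rather than by disassembly.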
What else to add? In my experience, C/C++ decompilers are just a joke, but extracting the code section from an executable (at least on Windows, from a PE executable) is (relatively) easy; the difficult part is correctly decompiling what you extract.
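As a rough sketch of the extraction step, the following Python function walks the PE headers and returns the raw bytes of every section flagged as code. The function name `extract_code_sections` is hypothetical and error handling is minimal; the header offsets and the `IMAGE_SCN_CNT_CODE` flag follow the published PE/COFF layout.

```python
import struct

IMAGE_SCN_CNT_CODE = 0x00000020  # section characteristic: contains code


def extract_code_sections(image):
    """Return {section name: raw bytes} for every code section of a PE image."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS header")
    # e_lfanew at offset 0x3C points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # COFF file header: 20 bytes right after the signature.
    _machine, section_count, _, _, _, optional_size, _ = struct.unpack_from(
        "<HHIIIHH", image, pe_offset + 4
    )
    # The section table follows the optional header; entries are 40 bytes.
    table = pe_offset + 4 + 20 + optional_size
    sections = {}
    for index in range(section_count):
        entry = struct.unpack_from("<8sIIIIIIHHI", image, table + 40 * index)
        name, virtual_size, _rva, raw_size, raw_pointer = entry[:5]
        characteristics = entry[9]
        if characteristics & IMAGE_SCN_CNT_CODE:
            # Raw data is padded to file alignment; trim to the virtual size.
            size = min(virtual_size, raw_size) if virtual_size else raw_size
            label = name.rstrip(b"\0").decode("ascii", "replace")
            sections[label] = image[raw_pointer:raw_pointer + size]
    return sections
```

For the self-inspection case discussed in this thread, a program could read its own executable file from disk and pass the bytes to such a function, then feed the result to a disassembler; that wiring is left out here because it depends on the platform.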
I hope this helps,
Quote:
|
Originally posted by Solver
Indeed, C or C++ stuff...
|
Maybe not...
Greetings,
Angelo
__________________
"If it works, it's obsolete."
-- Marshall McLuhan
|
|
|
|
August 30, 2003, 15:32
|
#9
|
Deity
Join Date: Sep 2000
Location: Latvia, Riga
Posts: 18,355
|
Angelo - excellent thoughts. Very helpful.
I took a look at a few C/C++ decompilers and found them really weak. Extracting just the code section from PE files seems like a much better idea.
Didn't know about Rexx... gotta do some research on that.
Thanks again, just the reply I was hoping for.
This can be closed now...
|
|
|
|
All times are GMT -4.