Need help with Windows reversing: ccdgen.exe from TI
Mychaela Falconia
falcon at freecalypso.org
Wed May 3 20:38:07 UTC 2023
Hi DS,
> Please have a look at https://www.freecalypso.org/members/ds/ccdgen.exe.c
> which is the result of the whole executable decompilation.
Is it from IDA or Ghidra? In either case, thank you for this point of
reference - it gives me an idea of what can be expected from the
existing tools.
> I recommend you give ghidra a try nonetheless.
Adding it to my long to-do list... Just "giving it a try" is already
difficult in itself because they don't support 32-bit Linux hosts: I
refuse to defile my pristine primary-use machine with a 64-bit OS,
thus I have to use a computer other than my preferred primary for
this whole "give it a try" experiment...
> I would be surprised if either IDA's or ghidra output can be recompiled
> as-is. The result of the decompiler was not meant for recompilation,
This part is the most disappointing: the tools are simply not made for
the purpose that's needed here, and the remaining problems then mostly
stem from that mismatch between what the creators of the tool built it
for and what its user needs it to do...
When the problem to be solved is recovery of a lost-source math, data
processing or "business logic" application (meaning a program that is
known to issue no system calls and to access no hw or network
resources, only reading and writing files via stdio), the first step
before applying decompilation logic needs to be identification of
program vs standard library code. I expect that
every part of the .text section in a binary such as ccdgen.exe must
fall neatly into one of just a few mutually exclusive categories:
1) entry point code coming from crt0.o or whatever M$ called their
version, executing before main() entry;
2) actual program code of interest, beginning with main() and ending
at points where the code calls fopen(), fgets(), fscanf(), printf(),
fprintf() etc, as well as other (not stdio) libc functions like
strcmp(), malloc() and whatnot;
3) bodies of all those libc functions just named and everything they
call further downstream;
4) bits of code inserted by the linker, whatever is needed for Win32
environment - my memory is rusty after not touching that stuff for
over 25 years.
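As a minimal sketch of what "mutually exclusive categories" means in
practice, the following Python checks that a proposed set of address
ranges tiles the .text section with no gaps and no overlaps. The
function name, the category labels and all addresses are hypothetical
and chosen for illustration only; nothing here was measured from
ccdgen.exe.

```python
def check_partition(text_start, text_end, regions):
    """Check that regions tile [text_start, text_end) exactly.

    regions is a list of (start, end, category) tuples, end exclusive.
    Returns a list of (problem, from_addr, to_addr) tuples; an empty
    list means every byte belongs to exactly one category.
    """
    problems = []
    cursor = text_start
    for start, end, category in sorted(regions):
        if start > cursor:
            # Bytes between cursor and start belong to no category.
            problems.append(("gap", cursor, start))
        elif start < cursor:
            # This region begins inside one that was already counted.
            problems.append(("overlap", start, min(cursor, end)))
        cursor = max(cursor, end)
    if cursor < text_end:
        problems.append(("gap", cursor, text_end))
    elif cursor > text_end:
        problems.append(("overrun", text_end, cursor))
    return problems


# Hypothetical layout: addresses are invented, not taken from ccdgen.exe.
layout = [
    (0x1000, 0x1100, "startup"),  # category 1: code run before main()
    (0x1100, 0x3000, "program"),  # category 2: code of interest
    (0x3000, 0x4F00, "libc"),     # category 3: library function bodies
    (0x4F00, 0x5000, "linker"),   # category 4: linker-inserted bits
]
print(check_partition(0x1000, 0x5000, layout))
```

An empty result says the proposed division is at least self-consistent;
it says nothing about whether the boundaries were put in the right
places.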
Actual decompilation logic, as in machine generation of recompilable C
code from disassembly, needs to be applied *only* to part 2 of the
just-listed division, and not any other parts. Also given how linkers
typically work, especially old and "dumb" ones, I would expect the 4
code divisions I just listed above to actually appear in the .text
section in that order: the linker would first process crt0.o and the
application objects listed on its invocation line, then start pulling
modules from whatever was MSVC's equivalent of libc+libgcc in order to
satisfy externals. Hence I would expect to see a boundary in the
.text section between the end of interesting code and the beginning of
uninteresting bits pulled from the standard library - but looking at
the fully automated decompiler output, it looks like the tools aren't
smart enough to recognize it...
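The expected boundary could be searched for along the following lines.
This is only a sketch under stated assumptions: it presumes that the
function start addresses are already known from disassembly, that a
few library functions have already been identified by some other means
(by hand, for instance), and that main() is the lowest-addressed
program function, which may not hold if the objects were linked in a
different order. The function name and all addresses are hypothetical.

```python
def split_at_library_boundary(function_starts, main_addr, known_library):
    """Guess where program code ends and library code begins.

    function_starts: iterable of function start addresses.
    main_addr: address of main().
    known_library: set of addresses already identified as library
        functions (an assumed input, not something computed here).
    Returns (boundary, program_functions), or None when no known
    library function lies above main().
    """
    ordered = sorted(function_starts)
    lib_after_main = [a for a in ordered
                      if a > main_addr and a in known_library]
    if not lib_after_main:
        return None
    # The lowest known library function above main() is only an upper
    # bound on the true boundary: unidentified library functions may
    # still sit just below it.
    boundary = lib_after_main[0]
    program = [a for a in ordered if main_addr <= a < boundary]
    return boundary, program


# Invented addresses for illustration only.
starts = [0x1000, 0x1100, 0x1180, 0x1300, 0x2000, 0x2040, 0x2100]
result = split_at_library_boundary(starts, 0x1100, {0x2000, 0x2100})
print(result)
```

The more library functions are identified up front, the tighter this
upper bound gets; with none identified, the sketch declines to guess.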
It has been 25 years since I did any work with x86 assembly, and 28 years
since I did truly hard-core x86 reversing, so my memory is quite rusty,
but it looks like I have no choice but to heavily brush up on my x86
knowledge, dust off and re-read all those books about Win32 and PE
file format which I should still have somewhere, and then decide on
the most appropriate course of action, which may involve developing
some new tools.
Hasta la Victoria, Siempre,
Mychaela aka The Mother