From 9a7e36b520fc66be1867f91a0c5af3d141f06379 Mon Sep 17 00:00:00 2001
From: Christoph Egger
Date: Wed, 10 Dec 2014 17:03:45 +0100
Subject: [PATCH] todo page

---
 html/main.css |   5 ++
 src/todo.txt  | 194 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 199 insertions(+)
 create mode 100644 src/todo.txt

diff --git a/html/main.css b/html/main.css
index ee515fc..c58d5eb 100644
--- a/html/main.css
+++ b/html/main.css
@@ -70,6 +70,11 @@ body {
     margin-bottom: 0.5em;
 }
 
+.document h2 , .document h3 {
+    margin-top: .5em;
+    margin-bottom: .3em;
+}
+
 .document ul {
     margin-left: 2em;
 }
diff --git a/src/todo.txt b/src/todo.txt
new file mode 100644
index 0000000..d57d479
--- /dev/null
+++ b/src/todo.txt
@@ -0,0 +1,194 @@
+restindex
+ page-title: TODO
+ include: yes
+/restindex
+
+Doable
+======
+
+robust binary support
+~~~~~~~~~~~~~~~~~~~~~
+
+TODO understanding _start
+-------------------------
+
+ If we do not have symbols, the only known entry point is _start, and
+ that only calls a library function. However, it has a regular
+ structure and one can find the address of main by being
+ smart. Unfortunately it's not necessarily possible to do that portably
+ so I kind of want to move it out of the core.
+
+ [http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html]
+
+
+TODO finding _start for !ELF
+----------------------------
+
+ COFF / Mach-O have an entry point. Unfortunately LLVM doesn't directly
+ let me access this information so currently it does some magic for ELF
+ but not others. Might be possible to find something in LLDB which we
+ might want to have long-term anyway to add debugging capabilities.
+
+
+TODO finding plausible function prologs
+---------------------------------------
+
+ Some nice heuristics (implemented as plugins) to get something even
+ if we can't properly find function calls.
+
+
+serializing
+~~~~~~~~~~~
+
+TODO routing data through serializing point
+-------------------------------------------
+
+ Send all relevant data to some management instance. The management
+ instance then notifies all stakeholders of the update. This would make
+ sure the resulting state of the system is reproducible from the
+ serialized stream and is network-streamable.
+
+ One will need to take care of potential performance issues.
+
+
+TODO read/write files
+---------------------
+
+ What we actually want is a bag of transactions and a logical (semi)
+ order on them. Idea is to mostly store xml in a zip container.
+
+
+TODO network stuff
+------------------
+
+ If we can serialize to zip containers we can stream them over the
+ network. Idea is to use XMPP as transport, a central instance that
+ supplies new participants with all past information and a MUC to which
+ updates are sent.
+
+ Master/slave setups should be easy. Also normal editing should be
+ almost conflict free. Probably needs some locking to not cause
+ conflicts when scripts run over the whole binary and add information
+ everywhere.
+
+
+graph layouting
+~~~~~~~~~~~~~~~
+
+TODO special entry (exits?) points
+----------------------------------
+
+ B0 with metainformation about the function. Also gives a ⊤ and ⊥ for
+ the graph, which is nice for all kinds of algorithms.
+
+
+TODO routing edges
+------------------
+
+ Edges skipping one layer of blocks sometimes pass through (below) other
+ blocks. Fixing that makes the whole thing a complex problem, while
+ currently it's a set of rather simple heuristics.
+
+
+TODO Dominators -> stack blocks vertically if totally dominated
+---------------------------------------------------------------
+
+ Currently blocks are ordered by address. Sometimes blocks are
+ semantically strictly after another but come earlier in the address
+ space. Would be visually nicer to have a semantic order so backward
+ (upward) edges really only happen in loops.
+
+
+Scripting
+~~~~~~~~~
+
+TODO design reasonable API
+--------------------------
+
+ SWIG? We really want an API that looks native for all supported
+ scripting languages and we want an API that is semantically the same
+ for all languages.
+
+
+TODO python
+-----------
+
+ We have guile implemented and working. Python seems to be highly
+ popular so we will want to have it as well some time in the not too
+ distant future.
+
+
+Non-.text stuff
+~~~~~~~~~~~~~~~
+
+TODO finding data and strings
+-----------------------------
+
+ Identify data types by interpreting the instruction sequence
+ referencing a datum. Probably we want to have all instructions
+ referencing an address in data segments and do some type narrowing
+ based on that. Probably also function calls.
+
+
+Annotating stuff
+~~~~~~~~~~~~~~~~
+
+TODO notification of annotations to stakeholders
+------------------------------------------------
+
+
+TODO Configuration stuff
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+
+Bugfixes
+~~~~~~~~
+
+TODO build up instruction analysis for !arm !x86
+------------------------------------------------
+
+
+TODO instruction alignment on RISC
+----------------------------------
+
+
+TODO stop on decoding error?
+----------------------------
+
+
+TODO hlt in _start / general
+----------------------------
+
+
+TODO do not create functions for plt entries
+--------------------------------------------
+
+
+TODO blocks not displayed in i4/cip
+-----------------------------------
+
+
+Advanced
+========
+
+Deduce structure
+~~~~~~~~~~~~~~~~
+
+TODO Natural loops
+------------------
+
+
+TODO trivial control-flow-split
+-------------------------------
+
+
+Non-.text stuff
+~~~~~~~~~~~~~~~
+
+TODO plt stuff and finding API via parsing /usr/include
+-------------------------------------------------------
+
+ We already have a C parser (clang) so if we see a call to function@plt
+ we can look for potential prototypes for that function in /usr/include,
+ which would give nice type information for C libraries. We also want
+ to display the manpage for that function if available.
-- 
2.39.5