#+OPTIONS: html-link-use-abs-url:nil html-postamble:auto
#+OPTIONS: html-preamble:t html-scripts:t html-style:t
#+OPTIONS: html5-fancy:nil tex:t toc:2
#+CREATOR: <a href="http://www.gnu.org/software/emacs/">Emacs</a> 24.4.1 (<a href="http://orgmode.org">Org</a> mode 8.2.10)
#+HTML_CONTAINER: div
#+HTML_DOCTYPE: xhtml-strict
#+HTML_HEAD:
#+HTML_HEAD_EXTRA:
#+HTML_LINK_HOME:
#+HTML_LINK_UP:
#+HTML_MATHJAX:
#+INFOJS_OPT:
#+LATEX_HEADER:

* Doable
** robust binary support
*** TODO understanding _start

If we do not have symbols, the only known entry point is _start,
and that only calls a library function. However, it has a regular
structure, and one can find the address of main by being smart.
Unfortunately it's not necessarily possible to do that portably, so
I kind of want to move it out of the core.

http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html

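A minimal sketch of the "being smart" part, assuming an x86-64 Linux
binary with a glibc-style _start that loads the address of main into
%rdi before calling __libc_start_main. It is a plain byte-pattern scan,
not a real disassembly, and the name guess_main is made up for
illustration.

```python
import struct

MOV_RDI_IMM32 = b"\x48\xc7\xc7"  # mov $imm32, %rdi       (non-PIE startup code)
LEA_RDI_RIP = b"\x48\x8d\x3d"    # lea rel32(%rip), %rdi  (PIE startup code)
MASK64 = 0xFFFFFFFFFFFFFFFF


def guess_main(code, start_addr):
    """Guess the address of main from the raw bytes of _start.

    code       -- bytes of _start, beginning at its first instruction
    start_addr -- virtual address of that first byte
    Returns the guessed address, or None if neither pattern is found.
    """
    for off in range(len(code) - 6):
        prefix = code[off:off + 3]
        if prefix == MOV_RDI_IMM32:
            (imm,) = struct.unpack_from("<i", code, off + 3)
            return imm & MASK64  # imm32 is sign-extended to 64 bits
        if prefix == LEA_RDI_RIP:
            (rel,) = struct.unpack_from("<i", code, off + 3)
            # rel32 is relative to the end of the 7-byte instruction
            return (start_addr + off + 7 + rel) & MASK64
    return None
```

This is exactly the kind of non-portable knowledge that argues for
keeping it out of the core: other architectures and C libraries need
their own patterns.
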
*** TODO finding _start for !ELF

COFF / Mach-O have an entry point. Unfortunately LLVM doesn't
directly let me access this information, so currently it does some
magic for ELF but not for the others. It might be possible to find
something in LLDB, which we might want to have long-term anyway to
add debugging capabilities.

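For reference, a sketch of what reading the entry point by hand could
look like, assuming "COFF" here means a PE/COFF executable image. It
follows the documented PE header layout and is independent of LLVM; the
function name pe_entry_rva is made up for illustration.

```python
import struct


def pe_entry_rva(image):
    """Return AddressOfEntryPoint (an RVA) from a PE/COFF image in memory."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ/PE image")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # signature (4 bytes) + COFF file header (20 bytes) -> optional header
    optional = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", image, optional)
    if magic not in (0x10B, 0x20B):  # PE32 / PE32+
        raise ValueError("unknown optional header magic")
    # AddressOfEntryPoint sits at offset 16 in both PE32 and PE32+
    (entry,) = struct.unpack_from("<I", image, optional + 16)
    return entry
```

The result is relative to the image base, so it still has to be mapped
to a section before disassembly can start there.
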
*** TODO finding plausible function prologs

Some nice heuristics (implemented as plugins) to get something
even if we can't properly find function calls.

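A rough sketch of how such plugins could look, assuming each heuristic
is just a function from raw code bytes to candidate offsets. The
registry, the decorator and the single x86-64 pattern (push %rbp;
mov %rsp,%rbp) are illustrative assumptions, not existing code.

```python
PROLOG_HEURISTICS = {}  # hypothetical plugin registry: name -> function


def prolog_heuristic(name):
    """Decorator that registers a heuristic under the given name."""
    def register(func):
        PROLOG_HEURISTICS[name] = func
        return func
    return register


def find_all(code, pattern):
    """Return every offset at which pattern occurs in code."""
    offsets = []
    off = code.find(pattern)
    while off != -1:
        offsets.append(off)
        off = code.find(pattern, off + 1)
    return offsets


@prolog_heuristic("x86_64-frame-pointer")
def x86_64_frame_pointer(code):
    # 55 = push %rbp, 48 89 e5 = mov %rsp,%rbp
    return find_all(code, b"\x55\x48\x89\xe5")


def plausible_functions(code, base_addr):
    """Run all registered heuristics and return sorted candidate addresses."""
    hits = set()
    for heuristic in PROLOG_HEURISTICS.values():
        hits.update(base_addr + off for off in heuristic(code))
    return sorted(hits)
```

Byte patterns can also match in the middle of an instruction or in
data, so the results are only candidates.
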
** serializing
*** TODO routing data through serializing point

Send all relevant data to some management instance. The management
instance then notifies all stakeholders of the update. This would
make sure the resulting state of the system is reproducible from
the serialized stream and is network-streamable.

One will need to take care of potential performance issues.

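A minimal sketch of the idea, assuming updates are small
JSON-serializable dictionaries. The class and function names are
invented for illustration. Every update is serialized first and
stakeholders only ever see the deserialized form, so the log alone is
enough to rebuild the state.

```python
import json


class SerializingPoint:
    """Hypothetical management instance: log first, then notify."""

    def __init__(self):
        self.log = []        # the serialized stream, one record per update
        self.listeners = []  # stakeholder callbacks

    def subscribe(self, callback):
        self.listeners.append(callback)

    def publish(self, update):
        record = json.dumps(update, sort_keys=True)
        self.log.append(record)
        for callback in self.listeners:
            # hand out what was serialized, not the original object
            callback(json.loads(record))


def replay(log, apply):
    """Rebuild state by feeding every logged record to apply."""
    for record in log:
        apply(json.loads(record))
```

The extra serialize/deserialize round trip on every update is where
the performance concern comes from.
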
*** TODO read/write files

What we actually want is a bag of transactions and a logical
(semi-)order on them. The idea is to mostly store XML in a zip
container.

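A sketch of one possible container layout, assuming one XML file per
transaction and explicit "after" references to express the order. The
element names, the transactions/ directory and both function names are
assumptions for illustration only. Reading returns the transactions in
an order that respects the references.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter  # Python 3.9+


def write_transactions(fileobj, transactions):
    """transactions: dict id -> (list of ids it must come after, payload)."""
    with zipfile.ZipFile(fileobj, "w", zipfile.ZIP_DEFLATED) as archive:
        for tid, (after, payload) in transactions.items():
            root = ET.Element("transaction", id=tid)
            for dep in after:
                ET.SubElement(root, "after", ref=dep)
            ET.SubElement(root, "payload").text = payload
            archive.writestr(f"transactions/{tid}.xml",
                             ET.tostring(root, encoding="unicode"))


def read_transactions(fileobj):
    """Return [(id, payload)] in an order compatible with the references."""
    graph, payloads = {}, {}
    with zipfile.ZipFile(fileobj) as archive:
        for name in archive.namelist():
            root = ET.fromstring(archive.read(name))
            tid = root.get("id")
            graph[tid] = {node.get("ref") for node in root.findall("after")}
            payloads[tid] = root.findtext("payload")
    # TopologicalSorter takes node -> predecessors, which is what we stored
    order = TopologicalSorter(graph).static_order()
    return [(tid, payloads[tid]) for tid in order]
```

Transactions without a reference between them are unordered, so any
relative order of those is acceptable on reading.
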
*** TODO network stuff

If we can serialize to zip containers, we can stream them over the
network. The idea is to use XMPP as transport, a central instance
that supplies new participants with all past information, and a MUC
where updates are sent to.

Master/slave setups should be easy. Also, normal editing should be
almost conflict-free. It probably needs some locking to not cause
conflicts when scripts run over the whole binary and add
information everywhere.

** graph layouting
*** TODO special entry (exits?) points

A block B0 with meta-information about the function. This also
gives the graph a ⊤ and a ⊥, which is nice for all kinds of
algorithms.

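A small sketch of adding such virtual nodes, assuming the control flow
graph is a dictionary from block to successor list. The representation
and the function name are assumptions for illustration.

```python
TOP, BOTTOM = "⊤", "⊥"


def add_virtual_entry_exit(successors, entry):
    """Return a copy of the graph with a unique entry ⊤ and exit ⊥.

    successors -- dict: block -> list of successor blocks
    entry      -- the real entry block of the function
    """
    graph = {block: list(succs) for block, succs in successors.items()}
    # every block without successors (returns etc.) flows into ⊥
    for block in [b for b, succs in graph.items() if not succs]:
        graph[block].append(BOTTOM)
    graph[TOP] = [entry]
    graph[BOTTOM] = []
    return graph
```

Blocks that end in an endless loop never reach ⊥ this way. Whether
they should get an artificial edge is a separate decision.
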
*** TODO routing edges

Edges skipping one layer of blocks sometimes run through (below)
other blocks. Fixing that makes the whole thing a complex problem,
while currently it's a set of rather simple heuristics.

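One standard trick from layered graph drawing (Sugiyama-style
layouts), sketched here as a possible direction and not as what the
current heuristics do: split every edge that skips layers into a chain
of dummy nodes, one per skipped layer, so the edge gets its own slot
in each layer instead of crossing a block. The data layout is an
assumption.

```python
def split_long_edges(edges, layer):
    """Replace layer-skipping edges by chains through dummy nodes.

    edges -- list of (src, dst) pairs
    layer -- dict: block -> layer index (0 = top)
    Returns (new_edges, new_layer); dummy nodes are tuples.
    """
    new_edges = []
    new_layer = dict(layer)
    for src, dst in edges:
        if layer[dst] - layer[src] <= 1:
            # adjacent layers, same layer or upward edge: left alone here
            new_edges.append((src, dst))
            continue
        prev = src
        for level in range(layer[src] + 1, layer[dst]):
            dummy = ("dummy", src, dst, level)
            new_layer[dummy] = level
            new_edges.append((prev, dummy))
            prev = dummy
        new_edges.append((prev, dst))
    return new_edges, new_layer
```

The dummy nodes then take part in the horizontal ordering like real
blocks, which is where the complexity mentioned above comes in.
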
*** TODO Dominator -> arrange blocks one below the other if totally dominated

Currently blocks are ordered by address. Sometimes a block comes
semantically strictly after another one but appears before it in
the address space. It would be visually nicer to have a semantic
order so that backward (upward) edges really only happen in loops.

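A sketch of one way to get such an order without computing dominators
explicitly: a reverse postorder of a depth-first search from the entry
block. In that order every dominator comes before the blocks it
dominates, and the only edges pointing backward are the retreating
edges of the search, which belong to loops. The graph representation
and function names are assumptions.

```python
def reverse_postorder(successors, entry):
    """Order blocks so that dominators precede the blocks they dominate."""
    seen, postorder = {entry}, []
    stack = [(entry, iter(successors.get(entry, ())))]
    while stack:
        block, remaining = stack[-1]
        for succ in remaining:
            if succ not in seen:
                seen.add(succ)
                stack.append((succ, iter(successors.get(succ, ()))))
                break
        else:
            # all successors handled: block is finished
            stack.pop()
            postorder.append(block)
    return postorder[::-1]


def backward_edges(successors, order):
    """Edges that point upward (or to the same block) in the given order."""
    rank = {block: i for i, block in enumerate(order)}
    return [(src, dst) for src in order
            for dst in successors.get(src, ()) if rank[dst] <= rank[src]]
```

Unreachable blocks do not show up in the result and would need to be
appended separately, for example in address order.
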
** Scripting
*** TODO design reasonable API

SWIG? We really want an API that looks native for all supported
scripting languages, and we want an API that is semantically the
same for all languages.

*** TODO python

We have guile implemented and working. Python seems to be highly
popular, so we will want to have it as well some time in the not
too distant future.

** Non-.text stuff
*** TODO finding data and strings

Identify data types by interpreting the instruction sequence
referencing a datum. We probably want to collect all instructions
referencing an address in the data segments and do some type
narrowing based on that. Function calls can probably be used as
well.

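A toy sketch of the narrowing step, assuming we already have a list of
(data address, mnemonic) pairs for all referencing instructions. The
mnemonic table is a made-up example in AT&T syntax and far from
complete. Each reference restricts the set of plausible types for a
datum by intersection.

```python
# Hypothetical table: types an access with this mnemonic is compatible with.
ACCESS_TYPES = {
    "movzbl": {"uint8_t", "char"},
    "movsbl": {"int8_t", "char"},
    "movl": {"int32_t", "uint32_t", "float"},
    "movss": {"float"},   # SSE scalar single
    "movsd": {"double"},  # SSE2 scalar double
    "movq": {"int64_t", "uint64_t", "double", "pointer"},
}


def narrow_types(references):
    """Return dict: data address -> set of still plausible types.

    references -- iterable of (data_address, mnemonic) pairs
    """
    candidates = {}
    for address, mnemonic in references:
        allowed = ACCESS_TYPES.get(mnemonic)
        if allowed is None:
            continue  # unknown instruction: tells us nothing
        if address in candidates:
            candidates[address] &= allowed
        else:
            candidates[address] = set(allowed)
    return candidates
```

An empty set means the references contradict each other, which hints
at a union, a reused location or a decoding problem.
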
** Annotating stuff
*** TODO notification of annotations to stakeholders

** TODO Configuration stuff
** Bugfixes
*** TODO build up instruction analysis for !arm !x86
*** TODO instruction alignment on RISC
*** TODO stop on decoding error?
*** TODO hlt in _start / general
*** TODO do not create functions for plt entries
*** TODO blocks not displayed in i4/cip


* Advanced
** Deduce structure
*** TODO Natural loops
*** TODO trivial control-flow-split
** Non-.text stuff
*** TODO plt stuff and finding API via parsing /usr/include

We already have a C parser (clang), so if we see a call to
function@plt we can look for potential prototypes for that function
in /usr/include, which would give nice type information for C
libraries. We also want to display the manpage for that function if
available.
