Dead Store Elimination #1130

Closed
JonathanSalwan opened this issue May 26, 2022 · 1 comment

@JonathanSalwan
Owner

JonathanSalwan commented May 26, 2022

Introduction

We recently added the concept of a basic block (#1121) to Triton, so we can now disassemble and process a whole block. How can this new feature improve Triton for binary deobfuscation? With blocks, we can now provide dead store elimination as a simplification pass over a given block. The simplify method can therefore now take a BasicBlock as input.

Example

Let's take a VMProtect sample as an example (thanks to @_xeroxz for providing it). How does it work? We symbolically execute the block, which creates SSA symbolic expressions for each instruction. With the SSA form, and within a single block, it is then easy to remove expressions that are no longer referenced. For example:

mov rdi, 1 ; <-- dead code
mov rdi, 2 ; previous rdi expression can be removed
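
To make the idea concrete, here is a minimal sketch in plain Python. It is not Triton's implementation: Triton works on SSA symbolic expressions, whereas this sketch uses a simple backward scan over a hypothetical `Insn` record whose read and write sets are supplied by hand. It assumes every register is live at the end of the block and treats each register name as independent (no sub-register aliasing such as `di` vs `rdi`).

```python
from typing import NamedTuple


class Insn(NamedTuple):
    """Hypothetical instruction record; read/write sets are given by hand."""
    text: str
    writes: frozenset
    reads: frozenset


def insn(text, writes=(), reads=()):
    return Insn(text, frozenset(writes), frozenset(reads))


def eliminate_dead_stores(block):
    """Drop instructions whose results are all overwritten before being read.

    Scans the block backwards. `overwritten` holds the registers that are
    written again later in the block without being read in between.
    Assumption: every register is live when the block exits.
    """
    overwritten = set()
    kept = []
    for ins in reversed(block):
        # An instruction is dead only if it writes something and every
        # location it writes is overwritten later without being read.
        if ins.writes and ins.writes <= overwritten:
            continue
        kept.append(ins)
        overwritten |= ins.writes   # earlier writes to these are now dead
        overwritten -= ins.reads    # unless this instruction reads them
    kept.reverse()
    return kept


block = [
    insn("mov rdi, 1", writes=["rdi"]),                 # dead store
    insn("mov rdi, 2", writes=["rdi"]),
    insn("mov rax, rdi", writes=["rax"], reads=["rdi"]),
    insn("mov rdi, 3", writes=["rdi"]),
]

for ins in eliminate_dead_stores(block):
    print(ins.text)
```

A real pass also has to account for flags, memory, and partial-register writes, which is why relying on the symbolic engine's expressions is more robust than hand-written read/write sets.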

Let's see the result on VMProtect's junk code:

.vmp0:0000000140004149 66 D3 D7               rcl     di, cl
.vmp0:000000014000414C 58                     pop     rax
.vmp0:000000014000414D 66 41 0F A4 DB 01      shld    r11w, bx, 1
.vmp0:0000000140004153 41 5B                  pop     r11
.vmp0:0000000140004155 80 E6 CA               and     dh, 0CAh
.vmp0:0000000140004158 66 F7 D7               not     di
.vmp0:000000014000415B 5F                     pop     rdi
.vmp0:000000014000415C 66 41 C1 C1 0C         rol     r9w, 0Ch
.vmp0:0000000140004161 F9                     stc
.vmp0:0000000140004162 41 58                  pop     r8
.vmp0:0000000140004164 F5                     cmc
.vmp0:0000000140004165 F8                     clc
.vmp0:0000000140004166 66 41 C1 E1 0B         shl     r9w, 0Bh
.vmp0:000000014000416B 5A                     pop     rdx
.vmp0:000000014000416C 66 81 F9 EB D2         cmp     cx, 0D2EBh
.vmp0:0000000140004171 48 0F A3 F1            bt      rcx, rsi
.vmp0:0000000140004175 41 59                  pop     r9
.vmp0:0000000140004177 66 41 21 E2            and     r10w, sp
.vmp0:000000014000417B 41 C1 D2 10            rcl     r10d, 10h
.vmp0:000000014000417F 41 5A                  pop     r10
.vmp0:0000000140004181 66 0F BA F9 0C         btc     cx, 0Ch
.vmp0:0000000140004186 49 0F CC               bswap   r12
.vmp0:0000000140004189 48 3D 97 74 7D C7      cmp     rax, 0FFFFFFFFC77D7497h
.vmp0:000000014000418F 41 5C                  pop     r12
.vmp0:0000000140004191 66 D3 C1               rol     cx, cl
.vmp0:0000000140004194 F5                     cmc
.vmp0:0000000140004195 66 0F BA F5 01         btr     bp, 1
.vmp0:000000014000419A 66 41 D3 FE            sar     r14w, cl
.vmp0:000000014000419E 5D                     pop     rbp
.vmp0:000000014000419F 66 41 29 F6            sub     r14w, si
.vmp0:00000001400041A3 66 09 F6               or      si, si
.vmp0:00000001400041A6 01 C6                  add     esi, eax
.vmp0:00000001400041A8 66 0F C1 CE            xadd    si, cx
.vmp0:00000001400041AC 9D                     popfq
.vmp0:00000001400041AD 0F 9F C1               setnle  cl
.vmp0:00000001400041B0 0F 9E C1               setle   cl
.vmp0:00000001400041B3 4C 0F BE F0            movsx   r14, al
.vmp0:00000001400041B7 59                     pop     rcx
.vmp0:00000001400041B8 F7 D1                  not     ecx
.vmp0:00000001400041BA 59                     pop     rcx
.vmp0:00000001400041BB 4C 8D A8 ED 19 28 C9   lea     r13, [rax-36D7E613h]
.vmp0:00000001400041C2 66 F7 D6               not     si
.vmp0:00000001400041CB 41 5E                  pop     r14
.vmp0:00000001400041CD 66 F7 D6               not     si
.vmp0:00000001400041D0 66 44 0F BE EA         movsx   r13w, dl
.vmp0:00000001400041D5 41 BD B2 6B 48 B7      mov     r13d, 0B7486BB2h
.vmp0:00000001400041DB 5E                     pop     rsi
.vmp0:00000001400041DC 66 41 BD CA 44         mov     r13w, 44CAh
.vmp0:0000000140007AEA 4C 8D AB 31 11 63 14   lea     r13, [rbx+14631131h]
.vmp0:0000000140007AF1 41 0F CD               bswap   r13d
.vmp0:0000000140007AF4 41 5D                  pop     r13
.vmp0:0000000140007AF6 C3                     retn

If we give the previous code to Triton for dead store analysis, we get the following result:

>>> sblock = ctx.simplify(block)
>>> ctx.disassembly(sblock, 0x140004149)
>>> print(sblock)
0x140004149: pop rax
0x14000414a: pop r11
0x14000414c: pop rdi
0x14000414d: pop r8
0x14000414f: pop rdx
0x140004150: pop r9
0x140004152: pop r10
0x140004154: pop r12
0x140004156: pop rbp
0x140004157: popfq
0x140004158: pop rcx
0x140004159: pop rcx
0x14000415a: pop r14
0x14000415c: pop rsi
0x14000415d: pop r13
0x14000415f: ret
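
The session above omits how `ctx` and `block` were created. The following is a sketch of one way to set them up, applied to the two-instruction `mov rdi` example rather than the VMProtect sample. It assumes the Triton Python bindings are installed; the helper name `simplify_opcodes`, the `EXAMPLE` list, and the base address `0x1000` are illustrative and not part of Triton.

```python
def simplify_opcodes(opcodes, base=0x1000):
    """Return the disassembly of a block after Triton's block simplification."""
    # Imported here so that defining the helper does not require Triton;
    # calling it does (assumption: the Triton Python bindings are installed).
    from triton import ARCH, BasicBlock, Instruction, TritonContext

    ctx = TritonContext(ARCH.X86_64)
    block = BasicBlock([Instruction(op) for op in opcodes])
    ctx.disassembly(block, base)    # decode the block and assign addresses
    sblock = ctx.simplify(block)    # dead store elimination on the block
    ctx.disassembly(sblock, base)   # re-address the simplified block
    return [inst.getDisassembly() for inst in sblock.getInstructions()]


# x86-64 encodings of: mov rdi, 1 ; mov rdi, 2
EXAMPLE = [
    b"\x48\xc7\xc7\x01\x00\x00\x00",
    b"\x48\xc7\xc7\x02\x00\x00\x00",
]
```

Calling `simplify_opcodes(EXAMPLE)` should leave only the second `mov`, since the first write to `rdi` is never referenced.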

Full example here.

@risuxx
risuxx commented May 3, 2023

I would like to ask a question: how can we quickly convert the code segment in a binary file into blocks? In the example I saw, it is necessary to know the length of each instruction when constructing a block. However, for a block of obfuscated code, it is not easy to determine the length of each instruction.
