Aarch64 decompilation #533

Merged
merged 112 commits into from
Mar 28, 2019

MatejKastak
Member

Changes include instructions translation to llvmir, unit tests, llvmir environment generation, arch abi, etc.

Run tool with reasonable Capstone basic modes for specified architecture.
Default values are as follows:
-a arm   : CS_MODE_ARM
-a arm64 : CS_MODE_ARM [looks like keystone doesn't like this]
-a mips  : CS_MODE_MIPS32
-a x86   : CS_MODE_32
-a ppc   : CS_MODE_32
-a <rest>: CS_MODE_LITTLE_ENDIAN
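
The default table above can be read as a simple lookup. The following is only a sketch of that mapping, not the tool's actual code: `defaultBasicModeName` is a hypothetical helper, and it returns the mode names as strings so that the example compiles without the Capstone headers.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of retdec): returns the name of the default
// Capstone basic mode for a given "-a" value, following the table above.
std::string defaultBasicModeName(const std::string& arch)
{
    assert(!arch.empty() && "architecture name must not be empty");

    if (arch == "arm" || arch == "arm64") return "CS_MODE_ARM";
    if (arch == "mips")                   return "CS_MODE_MIPS32";
    if (arch == "x86" || arch == "ppc")   return "CS_MODE_32";

    // Every other architecture falls back to the little-endian mode.
    return "CS_MODE_LITTLE_ENDIAN";
}
```

In a real tool the function would presumably return the `cs_mode` enumerators themselves rather than their names.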
- register maps(_reg2type)
- instructions map(_i2fm)
Modified ARM Translator unit, Work in progress.
- register name could not be found because of the wrong cs_arch in constructor
- capstone was configured without the ARM64 support, this caused
  cs_open to fail
- flags from status register added to arm64 env
- program counter added to arm64 env
- basic implementation of functions needed for loading and storing operands
- translateAdd is for testing purposes
- started implementation of MEM operand type
- Store register instruction translation method
e.g. retdec-capstone2llvmir -a arm64 -t 'str x0, [x1]'
- MOV, MVN and MOVZ instructions
- operand shift functions moved and changed for ARM64
- instructions like 'movz x0, #3 LSL 16' work now
- test framework capstone2llvmirtranslator
- first INS_ADD test
- cmake compilation
- MOV, MOVZ
- Store pair instruction{pre-index, post-index, signed-offset}
- test for all cases except 32bit operands
- pc moved to its own enum
- generateGetOperandAddr to generate address from instruction operand
- LDR{pre-index, post-index, signed-offset} instruction implemented
- STR{pre-index, post-index, signed-offset} instruction implemented
- LDR tests ported from ARM
- LDP todo
- Register parent map
- Storing registers
- Loading registers
- Headers

- Need more changes to conversions, I think 'mov w0, #3' zeroes out
  the upper 32bits of x0 register. But need to investigate further.
- taken from uname -a in qemu arm64 machine
Linux debian-aarch64 4.9.0-4-arm64 #1 SMP Debian
4.9.65-3+deb9u1 (2017-12-23) aarch64 GNU/Linux
- when a value is written to a 32bit register, it is zero
  extended to the whole 64bit register
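
The register behaviour described here can be modelled in plain C++. This is a sketch of the semantics only, not of the LLVM IR the translator emits; `writeW` is a hypothetical name introduced for illustration.

```cpp
#include <cstdint>

// Hypothetical model of writing a 32-bit value to a W register:
// the previous contents of the parent X register are discarded and
// the new value is zero-extended to 64 bits.
constexpr std::uint64_t writeW(std::uint64_t /*previousX*/, std::uint32_t value)
{
    return static_cast<std::uint64_t>(value);
}

// 'mov w0, #3' with x0 previously all ones leaves x0 == 3.
static_assert(writeW(0xFFFFFFFFFFFFFFFFull, 3u) == 3u,
              "upper 32 bits must be cleared");
```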
- parent register mapping enabled in tests
- 32bit version of tests
- added tests for label and imm branch
- added tests for instruction
- real binary testing is needed
- without tests
in Architecture::setArch() ARM64 needs to be set before ARM
because "arm" from ARM matches the "arm aarch64" from ARM64
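
The ordering issue can be shown with a small substring matcher. This is a sketch under assumptions: `Arch` and `classifyArch` are hypothetical names, and the real Architecture::setArch() may match its strings differently. The point is only that the more specific ARM64 check has to come first.

```cpp
#include <string>

enum class Arch { Unknown, Arm, Arm64 };

// Hypothetical classifier based on substring matching.
Arch classifyArch(const std::string& name)
{
    // ARM64 must be tested first: its description "arm aarch64"
    // also contains the substring "arm".
    if (name.find("aarch64") != std::string::npos) return Arch::Arm64;
    if (name.find("arm") != std::string::npos)     return Arch::Arm;
    return Arch::Unknown;
}
```

With the two checks swapped, "arm aarch64" would be classified as `Arch::Arm`.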
- Added the option to switch this behaviour
- add one ADD test with shift
- Arm supports the extension of operand e.g. 'add x0, x1, w2, SXTW'
  will sign-extend the w2 register to 64 bit and after that add the values
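
The effect of the SXTW extension can be sketched in plain C++. The helper names `addSxtw` and `addUxtw` are hypothetical and describe only the arithmetic, not the generated LLVM IR. The unsigned-to-signed cast assumes two's complement wrap-around, which is guaranteed from C++20 on and is what mainstream compilers do before that.

```cpp
#include <cstdint>

// 'add x0, x1, w2, SXTW': sign-extend w2 to 64 bits, then add.
constexpr std::uint64_t addSxtw(std::uint64_t x1, std::uint32_t w2)
{
    return x1 + static_cast<std::uint64_t>(
                    static_cast<std::int64_t>(static_cast<std::int32_t>(w2)));
}

// For contrast, 'add x0, x1, w2, UXTW' zero-extends w2 instead.
constexpr std::uint64_t addUxtw(std::uint64_t x1, std::uint32_t w2)
{
    return x1 + static_cast<std::uint64_t>(w2);
}
```

For example, with w2 = 0xFFFFFFFF (that is, -1 as a signed 32-bit value) and x1 = 10, the SXTW variant yields 9, while the UXTW variant yields 0x100000009.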
- test for 64bit variant implemented
- need to check the optional imm(shift VM outputs weird values)
-> isArmOrThumb renamed to isArm32OrThumb
-> added isArm32 method
-> thumb is now set with a flag _thumbFlag
Now the enum eArch represents only the general architecture, and all
subtypes of an architecture are checked via getBitSize() or _thumbFlag.

The function isArm() returns true for every type of subarchitecture,
e.g. {arm32, arm64 or thumb}
- Added some instruction IDs to branch types
- For example 'str w0, [sp]' should store only 4bytes to stack pointer
Replace svc #0 with corresponding syscall decoded from previous assignments.
Generate vector registers so that we don't crash when pseudo instructions
with them as operands are generated. For a similar purpose I
changed the f16 in ARM64_REG_H* to i16, since the half type is not
supported and we want to be able to at least generate pseudo instructions.
Those tests target loading and storing floating point values.
- Zero division is NOW undefined behaviour
- This caused problems in modulo idiom detection
- Also removed corresponding tests
- Correctly handle imm values as operands of this instruction
This reverts commit 7b88475.
This change caused other tests to fail.
- Removed unused code from decoder/arm64.cpp
- Fixed insnWrittesPcArm64 to work better
- Fixed Cond branch tests
@PeterMatula
Collaborator

Although I said that you don't need to document every single thing in Doxygen, please make sure that the existing comments are free of errors, so that doxygen-build doesn't fail.

@MatejKastak
Member Author

Yes, I completely forgot about documentation builds. It should be fixed now.

@PeterMatula PeterMatula changed the base branch from master to arm64 March 28, 2019 10:43
@PeterMatula PeterMatula merged commit f07407f into avast:arm64 Mar 28, 2019
s3rvac added a commit that referenced this pull request May 29, 2019