andq $-16, %rsp # clear the 4 least significant bits of stack pointer to align it
RSP is already aligned by 16 on process entry, as guaranteed by the x86-64 System V ABI.
mov $4, %rax # SYS_write
mov $1, %rdi
You can use mov $4, %eax to do this more efficiently (writing a 32-bit register implicitly zero-extends into the full 64-bit register), especially if you're later trying to optimize by merging a length into the low byte of RDX (which most kernels zero on process entry). Also, you can #include <sys/syscall.h> to get call numbers as CPP macro #defines, so you can mov $SYS_write, %eax. (Give your file a .S suffix so gcc will run it through CPP first.)
You can use as -O2 or -Os to do simple optimizations like turning mov $4, %rax into mov $4, %eax, as NASM does, because the architectural effect is identical. (If using GCC, pass -Wa,-O2, not gcc -O2.)
Putting a constant byte in static storage is just silly; make it an assemble-time constant you can use as an immediate, like mov $hello_len, %edx (or %rdx if you want).
.section .data # could be .section .rodata
hello:
.ascii "Hello world!\n"
hello_len = . - hello
# .equ hello_len, . - hello # alternative using .equ
#.byte . - hello
So
mov hello_len, %dl # Note: does not clear upper bytes. Use movzxb (move zero extend) for that
becomes
mov $hello_len, %edx # zero-extends to fill RDX
- add attribution (mostly stolen from https://polprog.net/blog/netbsdasmprog/)
- replace syscall args comment with the actual one from syscall.c
- rename $hello to $hello_str
(in response to first part of issue #2)
Thanks very much for your detailed comments and analysis!
I've dealt with the first item (the bug in the comment), and noted the origin of this example -- that's what I get for copy&paste!
It has been a long time since I did any Intel assembly coding, and this is actually my first x86_64-specific toy. Most of my practical experience with assembler is way back when on pdp11, vax, 6502, 1802, etc. and ancient x86, so I definitely appreciate your insight!
BTW, I like the idea of storing the length of the string in memory for other purposes, i.e. not just having a constant in the current assembly unit, so I'll probably keep that as an example, but I'll add a comment about avoiding the storage and using a constant instead.
Someone linked https://github.com/robohack/experiments/blob/430b5ea22bc2f4f697c659aeb399e938d09744c1/thello.s for an example of a BSD build command, which is why I'm randomly looking at it.
It has one bug (in a comment): syscall definitely can't take an arg in RCX, because the syscall instruction itself destroys RCX before the kernel gets control (https://stackoverflow.com/questions/32253144/why-is-rcx-not-used-for-passing-parameters-to-system-calls-being-replaced-with). Linux uses R10 instead of RCX, with the rest of the convention matching the function-calling convention. I'd guess most other x86-64 SysV OSes do the same, but I don't know for sure.
Separately from that:
RSP is already aligned by 16 on process entry, as guaranteed by the x86-64 System V ABI.
You can use mov $4, %eax to do this more efficiently (implicit zero-extension to 64-bit), especially if you're later trying to optimize by merging a length into the low byte of RDX (which most kernels zero on process entry). Also, you can #include <sys/syscall.h> to get call numbers as CPP macro #defines, so you can mov $SYS_write, %eax. (Call your file .S so gcc will run it through CPP first.)
You can use as -O2 or -Os to do simple optimizations like mov $4, %rax into mov $4, %eax like NASM does, because the architectural effect is identical. (If using GCC, -Wa,-O2, not gcc -O2.)
Using a 32-bit sign-extended immediate for an absolute address is possible but inefficient. Normally you'd use lea hello(%rip), %rsi, or mov $hello, %esi (if 32-bit sign-extended works, so does zero-extended, assuming user-space using the bottom of the virtual address space, not the top). https://stackoverflow.com/questions/57212012/how-to-load-address-of-function-or-label-into-register
Again, 32-bit operand-size is 100% fine, especially for the xor since exit() takes an int arg. See my answer on https://stackoverflow.com/questions/33666617/what-is-the-best-way-to-set-a-register-to-zero-in-x86-assembly-xor-mov-or-and
Putting a constant byte in static storage is just silly; make it an assemble-time constant you can use as an immediate like mov $hello_len, %edx (or %rdx if you want).
So
mov hello_len, %dl # Note: does not clear upper bytes. Use movzxb (move zero extend) for that
becomes
mov $hello_len, %edx # zero-extends to fill RDX
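Putting the suggestions above together, a complete file might look roughly like the sketch below. It is not the code from the repository under discussion. It assumes x86-64 with GCC and GNU as, and the file name hello.S and the build command in the comment are illustrative (the command is what I'd expect to work on Linux; other systems may need extra steps or different flags). Full-line comments use /* */ so the C preprocessor does not mistake them for directives.

```asm
/* hello.S: the capital .S suffix makes gcc run the C preprocessor first.
   Assumed build command (Linux): gcc -nostdlib -static -o hello hello.S */
#include <sys/syscall.h>                /* SYS_write, SYS_exit as CPP macros */

        .section .rodata
hello:
        .ascii "Hello world!\n"
hello_len = . - hello                   # assemble-time constant, takes no storage

        .text
        .globl _start
_start:                                 # RSP is already 16-byte aligned here
        mov     $SYS_write, %eax        # 32-bit write zero-extends into RAX
        mov     $1, %edi                # fd 1 = stdout
        lea     hello(%rip), %rsi       # RIP-relative address of the string
        mov     $hello_len, %edx        # length as an immediate, fills RDX
        syscall                         # the instruction itself clobbers RCX and R11

        mov     $SYS_exit, %eax
        xor     %edi, %edi              # exit status 0, an int, so 32-bit is enough
        syscall
```

Because the call numbers come from the system header, the same source should pick up the right values for SYS_write and SYS_exit on whichever OS it is built for, rather than hard-coding 4 or any other number.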