thello.s: sizes of constant strings should use .equ, not loading a byte from data memory, and various other optimizations. #2

pcordes · 2021-05-18T03:02:26Z

Someone linked https://github.com/robohack/experiments/blob/430b5ea22bc2f4f697c659aeb399e938d09744c1/thello.s for an example of a BSD build command, which is why I'm randomly looking at it.

It has one bug (in a comment): syscall definitely can't take an arg in RCX, because the syscall instruction itself destroys RCX before the kernel gets control (https://stackoverflow.com/questions/32253144/why-is-rcx-not-used-for-passing-parameters-to-system-calls-being-replaced-with). Linux uses R10 instead of RCX, with the rest of the convention matching the function-calling convention. I'd guess most other x86-64 SysV OSes do the same, but I don't know for sure.

	# SYSCALL ARGS
	# rdi rsi rdx r10 r8 r9
	# wrong original: # rdi rsi rdx rcx r8 r9    # that's the function-calling convention.

Separately from that:

	andq $-16, %rsp		# clear the 4 least significant bits of stack pointer to align it

RSP is already aligned by 16 on process entry, as guaranteed by the x86-64 System V ABI.

	mov $4, %rax		# SYS_write
	mov $1, %rdi

You can mov $4, %eax to do this more efficiently (implicit zero-extension to 64-bit), especially if you're later trying to optimize by merging a length into the low byte of RDX (which most kernels zero on process entry). Also, you can #include <sys/syscall.h> to get call numbers as CPP macro #defines, so you can mov $SYS_write, %eax. (Name your file .S so gcc will run it through CPP first.)

You can use as -O2 or -Os to do simple optimizations like mov $4, %rax into mov $4, %eax like NASM does, because the architectural effect is identical. (If using GCC, -Wa,-O2, not gcc -O2)

	mov $hello, %rsi

Using a 32-bit sign-extended immediate for an absolute address is possible but inefficient. Normally you'd use lea hello(%rip), %rsi, or mov $hello, %esi (if 32-bit sign-extended works, so does zero-extended, assuming user-space using the bottom of the virtual address space, not the top.) https://stackoverflow.com/questions/57212012/how-to-load-address-of-function-or-label-into-register
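For comparison, here is a rough sketch of the usual ways to get that address into RSI, with the encoded sizes I'd expect for each. The label name hello is the one from the original file; which forms actually link depends on whether the executable is position-independent and where it is mapped, so treat this as an illustration rather than a recommendation for every build.

```asm
	lea hello(%rip), %rsi	# 7 bytes: RIP-relative, works in PIE and non-PIE code
	mov $hello, %esi	# 5 bytes: zero-extended imm32, only if the address fits in the low 32 bits (non-PIE)
	mov $hello, %rsi	# 7 bytes: sign-extended imm32, same size as the lea but not position-independent
	movabs $hello, %rsi	# 10 bytes: full 64-bit absolute address, rarely needed
```

The lea form is the safe default; the 5-byte mov is only a size win when you know the program is linked at a fixed low address.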

	mov $1, %rax		# SYS_exit
	xor %rdi, %rdi

Again, 32-bit operand-size is 100% fine, especially for the xor since exit() takes an int arg. See my answer on https://stackoverflow.com/questions/33666617/what-is-the-best-way-to-set-a-register-to-zero-in-x86-assembly-xor-mov-or-and

Putting a constant byte in static storage is just silly; make it an assemble-time constant you can use as an immediate, like mov $hello_len, %edx (or %rdx if you want).

.section .data       # could be .section .rodata

hello:
	.ascii "Hello world!\n"
hello_len = . - hello
# .equ hello_len,  . - hello     # alternative using .equ

	#.byte . - hello

So

	mov hello_len, %dl	# Note: does not clear upper bytes. Use movzxb (move zero extend) for that

becomes

	mov $hello_len, %edx       # zero-extends to fill RDX
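Putting the suggestions above together, a minimal sketch of the whole program might look something like this. It assumes the file is named with a .S suffix (thello.S here is just a hypothetical name) so that CPP processes the #include, that the entry point is _start, and that SYS_write and SYS_exit come from <sys/syscall.h>. Any ELF note section the OS needs to recognize the binary (NetBSD wants one) is left out and would have to be carried over from the original file.

```asm
/* thello.S: sketch only; keep the ELF ident note from the original file */
#include <sys/syscall.h>

	.section .rodata
hello:
	.ascii "Hello world!\n"
	.equ hello_len, . - hello	/* assemble-time constant, takes no storage */

	.text
	.globl _start
_start:
	/* RSP is already 16-byte aligned on entry, so no andq $-16, %rsp */
	mov $SYS_write, %eax		/* writing EAX zero-extends into RAX */
	mov $1, %edi			/* fd 1 is stdout */
	lea hello(%rip), %rsi		/* RIP-relative address of the string */
	mov $hello_len, %edx		/* length as an immediate, zero-extends into RDX */
	syscall

	mov $SYS_exit, %eax
	xor %edi, %edi			/* exit status 0; also zero-extends into RDI */
	syscall
```

The comments use /* */ rather than # so that CPP does not try to interpret a comment at the start of a line as a preprocessor directive.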


- add attribution (mostly stolen from https://polprog.net/blog/netbsdasmprog/)
- replace syscall args comment with the actual one from syscall.c
- rename $hello to $hello_str (in response to first part of issue #2)

robohack · 2021-05-19T05:28:03Z

Hi Peter,

Thanks very much for your detailed comments and analysis!

I've dealt with the first item (the bug in the comment), and noted the origin of this example -- that's what I get for copy&paste!

It has been a long time since I did any Intel assembly coding, and this is actually my first x86_64-specific toy. Most of my practical experience with assembler is way back when on pdp11, vax, 6502, 1802, etc. and ancient x86, so I definitely appreciate your insight!

BTW, I like the idea of storing the length of the string in memory for other purposes, i.e. not just having a constant in the current assembly unit, so I'll probably keep that as an example, but I'll add a comment about avoiding the storage and using a constant instead.

robohack self-assigned this May 18, 2021
