Tracking issue for the start feature #29633

Open
aturon opened this issue Nov 5, 2015 · 54 comments
Labels
B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. I-lang-nominated Nominated for discussion during a lang team meeting. S-tracking-design-concerns Status: There are blocking ❌ design concerns. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@aturon
Member

aturon commented Nov 5, 2015

Tracking issue for #[start], which indicates that a function should be used as the entry point, overriding the "start" language item. In general this forgoes a bit of runtime setup that's normally run before and after main.

Open questions

@aturon aturon added T-lang Relevant to the language team, which will review and decide on the PR/issue. B-unstable Blocker: Implemented in the nightly compiler and unstable. labels Nov 5, 2015
@alexcrichton
Member

I believe the semantics of this today are that the compiler will generate a function with the symbol main which calls the #[start] function, if present, in an executable. This skips the #[lang = "start"] implementation, if any, in an upstream library (the standard library provides this to set up the first catch_panic among a few other minor things).

The signature for this function is also fn(isize, *const *const u8) -> isize, where the isize may no longer be "the most correct". Additionally, the *const *const u8 may not be the most appropriate option for Windows (although it works). I'm not 100% sure what "the best" signature on Windows is.
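
To make the call chain described above concrete, here is a minimal hand-written sketch of a function with the quoted signature and a wrapper standing in for the generated `main` symbol. The names `user_start` and `generated_main` are hypothetical, the unstable `#[start]` attribute is deliberately left off, and the use of `c_int` for the C side is an assumption of this sketch rather than something the comment states.

```rust
use std::ffi::{c_char, c_int};

/// Hypothetical user function with the signature quoted above,
/// `fn(isize, *const *const u8) -> isize`. The `#[start]` attribute is
/// omitted because it is unstable.
fn user_start(argc: isize, _argv: *const *const u8) -> isize {
    // Return the argument count as the status, purely for illustration.
    argc
}

/// Hand-written stand-in for the `main` symbol the compiler generates.
/// It is named `generated_main` (hypothetical) so that it does not clash
/// with an ordinary `main`, and it uses C's own integer type (an assumption).
extern "C" fn generated_main(argc: c_int, argv: *const *const c_char) -> c_int {
    // Convert between the C types and the Rust-side signature explicitly.
    user_start(argc as isize, argv as *const *const u8) as c_int
}
```

Calling `generated_main(3, std::ptr::null())` simply forwards to `user_start` and hands its result back, which is all the wrapper is meant to illustrate.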

@retep998
Member

retep998 commented Nov 9, 2015

On Windows the executable entry point does not take any arguments. Currently we let the CRT act as the executable entry point. The CRT calls our Rust entry wrapper, which invokes the start function: either #[lang = "start"], which in turn invokes the user's main function through the pointer provided to it, or a user-provided #[start] function. Which executable entry point the linker uses depends on /SUBSYSTEM and on which main function it can find (https://msdn.microsoft.com/en-us/library/f9t8842e.aspx). All information that the CRT provides to the main function can be obtained through alternative means. Note that if we eventually provide an option for /SUBSYSTEM:Windows, that main function takes a very different set of (useless) arguments than the traditional main (https://msdn.microsoft.com/en-us/library/windows/desktop/ms633559%28v=vs.85%29.aspx).

@SimonSapin
Contributor

html5ever uses #[start] in an ugly hack that overrides the main function used by cargo test in order to generate thousands of tests dynamically. (The tests share the same code but are parameterized on (input, expected result).)

This would be better solved by some way to override the test harness used by cargo test. (Is there an issue/RFC for that already?)

@alexcrichton
Member

@SimonSapin the use case for that with Cargo should in theory be harness = false, but I'm curious how that interacts with #[start]?

@SimonSapin
Contributor

The #[start] trick doesn’t work anymore, but it looked like this: servo/html5ever@df8e749

Is there a tracking issue for harness = false?

@alexcrichton
Member

Oh that's already implemented today, if a test target is listed as harness = false then Cargo just won't pass --test when compiling it and expects it to be a binary.

(this may be a bit off-topic from #[start] though so feel free to ping me on IRC)
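
As a rough illustration of the harness = false route for the parameterized-test use case mentioned earlier in the thread, a test target built as a plain binary could drive its cases from a table. This is only a sketch: `normalize` is a hypothetical function under test and `run_cases` is a hypothetical driver, neither taken from html5ever.

```rust
/// Hypothetical function under test: collapses runs of whitespace.
fn normalize(input: &str) -> String {
    input.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Runs every `(input, expected)` case, prints one line per case,
/// and returns the number of failures.
fn run_cases(cases: &[(&str, &str)]) -> usize {
    let mut failures = 0;
    for &(input, expected) in cases {
        let actual = normalize(input);
        if actual == expected {
            println!("ok      {:?}", input);
        } else {
            println!("FAILED  {:?}: expected {:?}, got {:?}", input, expected, actual);
            failures += 1;
        }
    }
    failures
}
```

In a target marked harness = false, the binary's own `fn main` would call `run_cases` with the generated table and exit with a non-zero status (for example via `std::process::exit(1)`) when the returned count is not zero.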

@mahkoh
Contributor

mahkoh commented Jan 7, 2016

The current signature for the lang item is

fn lang_start(main: *const u8, argc: isize, argv: *const *const u8) -> isize {

which is called by a generated main function. Instead, the signatures of both should be arbitrary and the symbols translate to main directly. This allows the main function to be platform dependent. A pointer to the user's main function can be obtained via an intrinsic.
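
For readers unfamiliar with what the lang item does, the following is a simplified sketch of the idea, not the standard library's implementation. It assumes a typed `fn()` pointer instead of the type-erased `*const u8` in the signature above, ignores the arguments, and reduces the runtime setup to the panic guard mentioned earlier in the thread. The name `lang_start_sketch` is hypothetical.

```rust
use std::panic;

/// Simplified stand-in for the `start` lang item: run the user's `main`
/// behind a panic guard and turn the outcome into an exit status.
fn lang_start_sketch(main: fn(), _argc: isize, _argv: *const *const u8) -> isize {
    match panic::catch_unwind(main) {
        // `main` returned normally.
        Ok(()) => 0,
        // `main` panicked; 101 is the status Rust programs exit with on panic.
        Err(_) => 101,
    }
}
```

A `#[start]` function bypasses this layer entirely, which is why a panic escaping it is not caught by anything comparable.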

@retep998
Member

retep998 commented Jan 7, 2016

Note that on Windows it really shouldn't always be main. If the user sets the subsystem to windows instead of console, then the CRT expects to find WinMain, and linking fails because it wasn't defined.

@steveklabnik
Member

#20064 suggests that the signature here is wrong; we should consider this before making this feature stable.

@comex
Contributor

comex commented Jun 7, 2016

Just to clarify, it's not just a question of what signature lang_start should have; rustc currently generates C entry points (main) that only work "by accident" (because on 32-bit platforms isize = c_int and on 64-bit the calling conventions happen to work out). On my Mac, I get:

define i64 @main(i64, i8**) unnamed_addr {

Of course this would be an easy fix.
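
The width mismatch behind this can be checked directly. The helper names below (`int_widths`, `isize_matches_c_int`) are hypothetical and exist only for this sketch.

```rust
use std::ffi::c_int;
use std::mem::size_of;

/// Returns `(size_of::<c_int>(), size_of::<isize>())` in bytes
/// for the target being compiled.
fn int_widths() -> (usize, usize) {
    (size_of::<c_int>(), size_of::<isize>())
}

/// True only when an `isize` has the same width as C's `int`,
/// i.e. when a `main` declared with `isize` matches the C signature.
fn isize_matches_c_int() -> bool {
    let (c, rust) = int_widths();
    c == rust
}
```

On the 64-bit targets discussed here the two widths differ, so a generated `main` taking `i64` relies on the calling convention tolerating the mismatch rather than on a matching signature.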

@retep998
Member

retep998 commented Jun 7, 2016

Just don't stabilize this until consideration is taken for subsystems, which change the entry point completely from main to WinMain and are really important for Rust to support if it wants to be used in the Windows world (several people have spoken to me and said this is one of the issues getting in the way of them using Rust in production on Windows)

@pravic
Contributor

pravic commented Jun 7, 2016

The entry point name is actually irrelevant for Windows apps. The ability to specify the subsystem is one of the important things for creating an application, because most of them are GUI apps that use the "windows" subsystem.

@Mark-Simulacrum Mark-Simulacrum added the C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. label Jul 22, 2017
bors added a commit that referenced this issue Oct 1, 2017
Fix native main() signature on 64bit

Hello,

in LLVM-IR produced by rustc on x86_64-linux-gnu, the native main() function had incorrect types for the function result and argc parameter: i64, while it should be i32 (really c_int). See also #20064, #29633.

So I've attempted a fix here. I tested it by checking the LLVM IR produced with --target x86_64-unknown-linux-gnu and i686-unknown-linux-gnu. I also tried running the tests (`./x.py test`); however, I'm getting two failures with and without the patch, which I'm guessing are unrelated.
@AndrewGaspar
Contributor

Pinging this thread since it seems inactive. I wanted to express interest in this feature being stabilized. I was really excited when I discovered that you could replace the entry point in Rust, and then real bummed when I found out that you could only do it on nightly.

@glandium
Contributor

glandium commented Mar 2, 2018

It's also necessary to write #![no_std] programs.

@glandium
Contributor

glandium commented Mar 2, 2018

I would expect #[start] to pass the /ENTRY argument to link.exe, but it apparently doesn't. Although it's worth noting that the function signature for an entry point on Windows is different anyways, so one would need a different #[start] function there.

@clarfonthey
Contributor

clarfonthey commented Mar 5, 2018

It's also necessary to write #![no_std] programs.

Speaking of which… is there any reason for this? We could make a version of Termination for libcore and remove all of the system-specific stuff from the shim. We'd also want some form of env::Args for libcore, but I'd argue that this is reasonable, especially if it's very bare-bones.

Personally, instead of offering #[start], I think that it makes more sense to be able to prune down the existing start shim by removing some of its guarantees. For example, aborting on panics, disallowing multithreading, and disabling stack protection would be enough.

@retep998
Member

retep998 commented Mar 5, 2018

@glandium The binary entry point is completely different from #[start]. Unless you don't want to link to the CRT at all, you'll probably want the binary entry point to remain the mainCRTStartup provided by the CRT, otherwise the C RunTime won't be initialized!

@clarcharr Adding some form of env::Args to libcore is a bad idea, as on Windows it requires calling system functions and heap allocating which is firmly in the realm of libstd. On the other hand, because it is only a system function, Windows users don't need to ask libcore/libstd for the args and can just go through winapi themselves. Unix users would still need to get argc/argv somehow...

@Amanieu
Member

Amanieu commented Mar 5, 2018

I've just been using #[no_main] in my no_std programs. This completely eliminates the default entry point logic.

I just define my main function with #[no_mangle] and have it called by my initialization code.

@clarfonthey
Contributor

@retep998 Oh, I didn't know that. I think that in that case, it makes sense to simply have a MainArgs opaque struct which encapsulates argc and argv or nothing if they're not available.

That was initially the idea, but I didn't realise how Windows did things.
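
One way to picture the MainArgs idea is an opaque wrapper that either holds the raw argc/argv pair or holds nothing. Only the name `MainArgs` comes from the comment above; the constructors and accessors below are hypothetical and sketch one possible shape, assuming C-style NUL-terminated arguments where they are available.

```rust
use std::ffi::{c_char, CStr};

/// Opaque handle over the process arguments, or nothing when the
/// platform does not pass them to the entry point.
pub struct MainArgs {
    raw: Option<(usize, *const *const u8)>,
}

impl MainArgs {
    /// For platforms that pass no arguments to the entry point.
    pub fn unavailable() -> Self {
        MainArgs { raw: None }
    }

    /// # Safety
    /// `argv` must point to `argc` valid, NUL-terminated strings that
    /// outlive the returned value.
    pub unsafe fn from_raw(argc: isize, argv: *const *const u8) -> Self {
        if argv.is_null() || argc <= 0 {
            MainArgs { raw: None }
        } else {
            MainArgs { raw: Some((argc as usize, argv)) }
        }
    }

    /// Number of arguments, or 0 when none are available.
    pub fn len(&self) -> usize {
        self.raw.map_or(0, |(count, _)| count)
    }

    /// Borrows the argument at `index`, if there is one.
    pub fn get(&self, index: usize) -> Option<&CStr> {
        let (count, argv) = self.raw?;
        if index >= count {
            return None;
        }
        // SAFETY: guaranteed by the contract of `from_raw`.
        unsafe {
            let ptr = *argv.add(index);
            if ptr.is_null() {
                None
            } else {
                Some(CStr::from_ptr(ptr as *const c_char))
            }
        }
    }
}
```

Because the struct never allocates or calls into the operating system, a type of this shape would not run into the libcore objection raised above; on Windows it would simply be constructed with `unavailable()`.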

@glandium
Contributor

glandium commented Mar 15, 2018

I've just been using #[no_main] in my no_std programs. This completely eliminates the default entry point logic.

Indeed, it surprisingly works on Linux, macOS and even Windows, with a #[no_mangle] pub extern "C" fn main(...) -> isize. (edit: maybe actually i32?)

@japaric
Member

japaric commented Mar 15, 2018

#[no_mangle] is not type safe so I don't consider it a proper user facing way to set the entry point. It also doesn't require unsafe which makes it not obvious that it's extremely dangerous to get the type signature wrong. This is "safe" and segfaults:

#![no_main]

#[no_mangle]
pub fn main(args: Vec<String>) {
    for arg in args {
        println!("{}", arg);
    }
}
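
For contrast, a type-correct counterpart takes the C-style arguments and builds the strings itself. This is a sketch under assumptions: in a real #![no_main] binary the function would be named main and carry #[no_mangle], whereas here it is called `entry` (hypothetical) and left unexported so that the snippet also compiles next to an ordinary `fn main`. `collect_args` is a hypothetical helper.

```rust
use std::ffi::{c_char, c_int, CStr};

/// Decodes C-style arguments into owned strings.
///
/// # Safety
/// `argv` must point to `argc` valid, NUL-terminated strings.
unsafe fn collect_args(argc: c_int, argv: *const *const c_char) -> Vec<String> {
    let mut args = Vec::new();
    if argv.is_null() {
        return args;
    }
    for i in 0..argc.max(0) as usize {
        // SAFETY: the caller guarantees `argv[i]` is readable for `i < argc`.
        let ptr = unsafe { *argv.add(i) };
        if ptr.is_null() {
            break;
        }
        // SAFETY: the caller guarantees each entry is NUL-terminated.
        let arg = unsafe { CStr::from_ptr(ptr) };
        args.push(arg.to_string_lossy().into_owned());
    }
    args
}

/// Entry function with the C ABI and C types that a C runtime passes.
pub extern "C" fn entry(argc: c_int, argv: *const *const c_char) -> c_int {
    // SAFETY: a C runtime calling this function upholds the contract above.
    let args = unsafe { collect_args(argc, argv) };
    for arg in &args {
        println!("{}", arg);
    }
    0
}
```

Nothing checks that an exported symbol has this shape, which is the point being made above: the compiler accepts the `Vec<String>` version just as readily.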

@ketsuban
Contributor

The current signature for #[start] functions isn't great in embedded contexts either. There's nothing to pass arguments to a Game Boy Advance game or read a return value—the only way execution is going to end is if the player switches the game off.

@RalfJung
Member

And then the automatic abort-on-unwind logic will kick in? Yeah, that makes sense -- but would also need documentation.

bors bot added a commit to intellij-rust/intellij-rust that referenced this issue Feb 3, 2023
10066: ANN: Add support for E0131, E0197, E0203 r=vlad20012 a=kuksag

changelog:

* Add support for E0131
Error code reference: https://doc.rust-lang.org/error_codes/E0131.html
There's a feature that might be connected to this error code: rust-lang/rust#29633

* Add support for E0197
Error code reference: https://doc.rust-lang.org/error_codes/E0197.html

* Add support for E0203
Error code reference: https://doc.rust-lang.org/error_codes/E0203.html
Compiler implementation: https://github.com/rust-lang/rust/blob/master/compiler/rustc_hir_analysis/src/astconv/mod.rs#L877


Co-authored-by: kuksag <[email protected]>
@mikeleany
Contributor

mikeleany commented Apr 4, 2023

From the original post:

Tracking issue for #[start], which indicates that a function should be used as the entry point, overriding the "start" language item. In general this forgoes a bit of runtime setup that's normally run before and after main.

Has anyone actually used this successfully for the purpose stated above? From my testing (using bare-metal targets), it doesn't seem to define the entry point at all. It silences the following error: "error: requires start lang_item", but doesn't define an entry point.

What advantage does this (as currently implemented) even provide over using #![no_main] and defining the entry point with #[no_mangle] or #[export_name = "_start"]?

@bjorn3
Member

bjorn3 commented Apr 4, 2023

From my testing (using bare-metal targets), it doesn't seem to define the entry point at all.

It defines the main function. The CRT defines _start and is expected to call main after libc has been initialized. On bare metal targets there is no libc, so directly defining _start makes sense unless you need an assembly trampoline to set up e.g. the stack pointer. And even then you can use #![no_main] + #[no_mangle] to define the function that the trampoline will call.

@mikeleany
Contributor

From my testing (using bare-metal targets), it doesn't seem to define the entry point at all.

It defines the main function. The CRT defines _start and is expected to call main after libc has been initialized. On bare metal targets there is no libc, so directly defining _start makes sense unless you need an assembly trampoline to set up e.g. the stack pointer. And even then you can use #![no_main] + #[no_mangle] to define the function that the trampoline will call.

So, if I understand you correctly, you're saying that the description of this feature is wrong — that it was never intended to override the entry point at all, but to override the main function instead, on the assumption that you are linking to the C runtime (or something else that defines the entry point and calls a C-like main function).

Also, if it's meant to be called from the CRT, shouldn't it require an extern "C" function instead of requiring the Rust ABI as it currently does? I guess changing that would also solve the unwinding issue mentioned by @RalfJung.

@bjorn3
Member

bjorn3 commented Apr 5, 2023

that it was never intended to override the entry point at all, but to override the main function instead, on the assumption that you are linking to the C runtime (or something else that defines the entry point and calls a C-like main function).

main is the user facing entry point in C. _start is an implementation detail on ELF systems and doesn't exist on Windows at all.

Also, if it's meant to be called from the CRT, shouldn't it require an extern "C" function instead of requiring the Rust ABI as it currently does?

Rustc actually won't rename the function annotated with #[start] to main. Instead it generates a main function with extern "C" which calls this function. Also, on some targets like UEFI it generates a differently named function with a different calling convention, as appropriate, when the entrypoint on that target has a different name and/or calling convention.

@mikeleany
Contributor

main is the user facing entry point in C.

That still makes wording like "indicates a function should be used as the entry point" ambiguous at best, and very confusing (as can be seen in previous discussions in this tracking issue). My suggestion, now that I understand what this feature is really intended for, is simply that the description of this feature needs to be clarified.

_start is an implementation detail on ELF systems and doesn't exist on Windows at all.

As far as I understand, Windows still has an entry point equivalent to _start, but just names it differently. In fact, even when using ELF object files, the entry point doesn't have to be called _start, though that's the common default. But yes, the naming of such startup routines is just an implementation detail.

Also, if it's meant to be called from the CRT, shouldn't it require an extern "C" function instead of requiring the Rust ABI as it currently does?

Rustc actually won't rename the function annotated with #[start] to main. Instead it generates a main function with extern "C" which calls this function. Also, on some targets like UEFI it generates a differently named function with a different calling convention, as appropriate, when the entrypoint on that target has a different name and/or calling convention.

Ah, I see.

@Noratrieb
Member

Noratrieb commented May 1, 2024

I think this issue should be closed and #[start] should be deleted. It's nothing but an accidentally leaked implementation detail that's a not very useful mix between "portable" entrypoint logic and bad abstraction.

I think the way the stable user-facing entrypoint should work (and works today on stable) is pretty simple:

  • std-using cross-platform programs should use fn main(). The compiler, together with std, will then ensure that code ends up at main (by having a platform-specific entrypoint that gets directed through lang_start in std to main - but that's just an implementation detail)
  • no_std platform-specific programs should use #![no_main] and define their own platform-specific entrypoint symbol with #[no_mangle], like main, _start, WinMain or my_embedded_platform_wants_to_start_here. Most of them only support a single platform anyways, and need cfg for the different platforms' ways of passing arguments or other things anyways

#[start] is in a super weird position of being neither of those two. It tries to pretend that it's cross-platform, but its signature is a total lie. Those arguments are just stubbed out to zero on Windows, for example. It also only handles the platform-specific entrypoints for a few platforms that are supported by std, like Windows or Unix-likes. my_embedded_platform_wants_to_start_here can't use it, and neither could a libc-less Linux program.
So we have an attribute that only works in some cases anyways, that has a signature that's a total lie (and a signature that, as I might want to add, has changed recently, and that I definitely would not be comfortable giving any stability guarantees on), and where there's a pretty easy way to get things working without it in the first place.

Note that this feature has not been RFCed in the first place.

@RalfJung
Member

RalfJung commented May 1, 2024

Miri currently relies on #[start] to support running no-std binaries, but that could fairly easily be switched to a different scheme like the one described here. (I don't think I want to implement support for all the platform-specific start functions in Miri...)

@scottmcm
Member

scottmcm commented May 1, 2024

Lang-nominated for the proposal in #29633 (comment)

@scottmcm scottmcm added the I-lang-nominated Nominated for discussion during a lang team meeting. label May 1, 2024
@RalfJung
Member

RalfJung commented Aug 2, 2024

With rust-lang/miri#3769, Miri now has a way to run no_std binaries that does not rely on this feature. So from our perspective it would be fine to remove this. :)
