Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use full Link-Time Optimization (LTO) for the release binaries #1433

Merged
merged 1 commit into from
Apr 12, 2018

Conversation

wilzbach
Copy link
Member

@wilzbach wilzbach commented Apr 3, 2018

Without LTO

~/dlang/avgtime/avgtime -q -r1000 ~/dlang/dub-repo/bin/dub.stable describe

------------------------
Total time (ms): 110249
Repetitions    : 1000
Sample mode    : 100 (586 occurrences)
Median time    : 108.899
Avg time       : 110.249
Std dev.       : 6.25044
Minimum        : 100.357
Maximum        : 142.289
95% conf.int.  : [97.9979, 122.499]  e = 12.2506
99% conf.int.  : [94.1484, 126.349]  e = 16.1001
EstimatedAvg95%: [109.861, 110.636]  e = 0.387399
EstimatedAvg99%: [109.739, 110.758]  e = 0.509129

With LTO

~/dlang/avgtime/avgtime -q -r1000 ~/dlang/dub-repo/bin/dub.lto describe

------------------------
Total time (ms): 97952.6
Repetitions    : 1000
Sample mode    : 93 (117 occurrences)
Median time    : 96.2915
Avg time       : 97.9526
Std dev.       : 5.86976
Minimum        : 89.173
Maximum        : 146.487
95% conf.int.  : [86.4481, 109.457]  e = 11.5045
99% conf.int.  : [82.8331, 113.072]  e = 15.1195
EstimatedAvg95%: [97.5888, 98.3164]  e = 0.363805
EstimatedAvg99%: [97.4745, 98.4307]  e = 0.478121

Note that the binary size goes up from 17M to 22M.
(Compressed from 3.8M to 5.2M), but imho space is cheap compared to this huge speed-up.

Learn more about LTO -> http://johanengelen.github.io/ldc/2016/11/10/Link-Time-Optimization-LDC.html

@dlang-bot
Copy link
Collaborator

Thanks for your pull request, @wilzbach!

release.sh Outdated
;;
*) echo "Unknown OS: $unameOut"; exit 1
esac

if [[ $(basename "$DMD") =~ ldmd.* ]] ; then
CUSTOM_FLAGS+=("-flto=full" "-linker=gold")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ported the configuration from dlang-community/D-Scanner#598 to this script.
(the binaries are here https://github.com/dlang-community/D-Scanner/releases/tag/v0.5.1)
Note: LDC's -static linking can't be used because curl (it's dlopen-ed in std.net.curl and apparently is std.socket not very static link-friendly either)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should have a version statement that allows to statically link curl. Then we could use the musl C standard library that fully supports static linking.

release.sh Outdated
;;
Darwin*)
OS=osx
CUSTOM_FLAGS+=("-L-macosx_version_min" "-L10.7" "-L-lcrt1.o")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend bumping to 10.8 to avoid warnings like this:

ld: warning: object file (/Users/travis/dlang/ldc-1.8.0/lib/libdruntime-ldc.a(osx_tls.c.o)) was built for newer OSX version (10.8) than being linked (10.7)

Seems LDC builds Phobos targeting 10.8.

https://travis-ci.org/dlang-community/D-Scanner/jobs/362495647

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even 10.9? (as this will come with 2.079)

https://dlang.org/changelog/2.079.0.html#osx-10_9

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW what for do we need lcrt1.o?
I assumed this is a legacy thing too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe even 10.9? (as this will come with 2.079)

That was bumped due to linking with libc++. If we don't need that I don't see a reason to increase the version.

BTW what for do we need lcrt1.o?
I assumed this is a legacy thing too?

That's to link the correct startup code (IIRC). Alternatively the MACOSX_DEPLOYMENT_TARGET environment variable can be used instead, which handle these two linker flags automatically. I use the linker flags because it's not possible to set an environment variable in a Dub file.

I had another look at how this behaves with the C compiler and how the linker is invoked (on macOS 10.12.6 with Xcode 9.2):

(The code blocks indicate the relevant flags the linker was invoked with by the compiler)

Compiling as regular with no flags:

-macosx_version_min 10.12.0

Setting the MACOSX_DEPLOYMENT_TARGET=10.7 environment variable:

-macosx_version_min 10.7.0 -lcrt1.10.6.o

Setting the MACOSX_DEPLOYMENT_TARGET=10.8 environment variable:

-macosx_version_min 10.8.0

Manually specifying the linker flags without the startup code -Xlinker -macosx_version_min -Xlinker 10.7.0:

Undefined symbols for architecture x86_64:
  "start", referenced from:
     implicit entry/start for main executable

So the best would be if the MACOSX_DEPLOYMENT_TARGET environment variable can be used.

release.sh Outdated
;;
*) echo "Unknown OS: $unameOut"; exit 1
esac

if [[ $(basename "$DMD") =~ ldmd.* ]] && [[ ${OS:-} == "linux" ]] ; then
CUSTOM_FLAGS+=("-flto=full" "-linker=gold")
Copy link
Contributor

@jacob-carlborg jacob-carlborg Apr 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this is not enabled on macOS? At least a Hello World compiles and runs with -flto=full enabled.

@jacob-carlborg
Copy link
Contributor

Note that the binary size goes up from 17M to 22M.

Have a look at [1]. On macOS for D-Scanner the binary size is reduced using LTO and -L-dead_strip reduces it even more.

[1] dlang-community/D-Scanner#598 (comment)

# For OSX compatibility >= 10.7
MACOSX_DEPLOYMENT_TARGET=10.7
# For OSX compatibility >= 10.8
MACOSX_DEPLOYMENT_TARGET=10.8
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's due to a recent bump in dmd, isn't it? Seems unrelated to the PR though, at least it's uncommented why it's here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It causes linker warnings because linked dependencies are built targeting 10.8.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment: #1433 (comment).

@@ -107,7 +107,8 @@ jobs:
stages:
- name: test
if: type = pull_request or (type = push and branch = master)
- name: deploy
if: type = push and tag =~ ^v
# Until deployment of the release binaries is fixed, always build them
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It don't understand this sentence, also seems unrelated to LTO ;).
Disabling auto-deploy to GH releases seems fine for now, as mentioned earlier I'd prefer to use the official binaries instead of rebuilding from sources. IIUC this PR adds support for LDC building and the GH binaries would prolly use it, while the official binaries still use dmd.
This inconsistency is likely to require additional maintenance and debug efforts down the road. Would be slightly cleaner to switch over dub together with using gdc or ldc to build optimized dmd release binaries.

In fact integrating gdc or ldc with installer/create_dmd_release just for dub would be a good first step towards changing the optimized compiler for dmd releases.


set -v -e -o pipefail
set -eux -o pipefail
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should standardize the writing to simplify reviews ;).
How about set -xeuo pipefail and set -euo pipefail?

unameOut="$(uname -s)"
case "$unameOut" in
Linux*)
OS=linux
CUSTOM_FLAGS="-L--export-dynamic"
CUSTOM_FLAGS+=("-L--export-dynamic")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally prefer single ticks for strings w/o interpolation.

@MartinNowak MartinNowak merged commit 4716396 into dlang:stable Apr 12, 2018
@MartinNowak
Copy link
Member

Also compiler optimizations are mood for an app that mostly waits on remote network requests, and optimizing offline 100ms latencies isn't paying back that much ;).
I guess the dependency resolver is the only hot loop in dub for complex dependency graphs.

@jacob-carlborg
Copy link
Contributor

LTO reduces binary size, at least on macOS.

@Geod24 Geod24 mentioned this pull request Dec 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants