Reduce startup time. #1288

Merged
merged 1 commit into from
Dec 19, 2018


dmaclach
Contributor

Make the global static a function-local static. This stops it from being initialized before main() and adding to startup time.
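
As a rough illustration of the difference described above, here is a minimal sketch in plain C++ (which also compiles as Objective-C++). The counters, `MakeNames`, and the accessor names are hypothetical and exist only to make the initialization time observable; this is not the code changed by this pull request. It assumes a toolchain that runs dynamic initializers of namespace-scope statics before `main()`, which mainstream compilers do.

```cpp
#include <cassert>  // for the checks described in the note below
#include <string>
#include <vector>

// Hypothetical counters, used only to observe when each initializer runs.
static int g_globalInitCount = 0;
static int g_localInitCount = 0;

// Hypothetical factory standing in for any non-trivial initializer.
static std::vector<std::string> MakeNames(int &counter) {
  ++counter;
  return {"NSLinkAttributeName"};
}

// Namespace-scope static with a dynamic initializer:
// it runs during program startup, before main() is entered.
static const std::vector<std::string> kGlobalNames = MakeNames(g_globalInitCount);

static const std::vector<std::string> &GlobalNames() {
  return kGlobalNames;
}

// Function-local static: the initializer runs on the first call only,
// so programs that never call LocalNames() never pay for it.
static const std::vector<std::string> &LocalNames() {
  static const std::vector<std::string> names = MakeNames(g_localInitCount);
  return names;
}
```

Under these assumptions, `g_globalInitCount` is already 1 when `main()` starts, while `g_localInitCount` stays 0 until `LocalNames()` is first called and remains 1 afterwards.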

@nguyenhuy
Member

I think @Adlai-Holler mentioned that dispatch_once is faster than a function-local static because the latter uses a global mutex shared by all statics. Maybe switch to dispatch_once to get the tiny perf gain?

@maicki
Contributor

maicki commented Dec 18, 2018

I would be interested in seeing some benchmark numbers in this area. @dmaclach, did you profile this by any chance?

Looking at http://www.modernescpp.com/index.php/thread-safe-initialization-of-a-singleton, it seems static variables with block scope are faster than std::call_once, which is what I would compare dispatch_once with.
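
For readers unfamiliar with the two standard C++ approaches being compared here, a minimal sketch follows. The `Names` alias and both accessor names are hypothetical, and `std::call_once` is used only as the portable analogue of `dispatch_once`; the sketch shows the two patterns and makes no claim about their relative speed.

```cpp
#include <cassert>  // for the checks described in the note below
#include <mutex>
#include <string>
#include <vector>

// Hypothetical alias for the value being lazily created.
using Names = std::vector<std::string>;

// Variant 1: block-scope static ("magic static").
// Since C++11 the language guarantees this initialization is thread-safe.
static const Names &NamesViaMagicStatic() {
  static const Names names = {"NSLinkAttributeName"};
  return names;
}

// Variant 2: explicit once-flag, the closest standard-library
// counterpart to the dispatch_once pattern.
static const Names &NamesViaCallOnce() {
  static std::once_flag flag;
  static Names *names = nullptr;  // constant-initialized, no guard needed
  std::call_once(flag, [] {
    // Intentionally never freed: the value lives for the whole process.
    names = new Names{"NSLinkAttributeName"};
  });
  return *names;
}
```

Both accessors return a reference to the same object on every call, so repeated calls only pay for the "already initialized?" check, which is the cost a benchmark of this kind measures.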

@maicki
Contributor

maicki commented Dec 18, 2018

I looked a bit more into benchmarking C++ static variables with block scope against dispatch_once initialization. Here is a small benchmark script, with the results of running it three times at the end. Interestingly, based on the results we should probably go with dispatch_once:

#import <AppKit/AppKit.h>

extern "C" {
  extern uint64_t dispatch_benchmark(size_t count, void (^block)(void));
}

static NSArray *DefaultLinkAttributeNamesMagicStatic() {
  static NSArray *names = @[ NSLinkAttributeName ];
  return names;
}

static NSArray *DefaultLinkAttributeNamesMagicDispatchOnce() {
  static NSArray *names = nil;
  static dispatch_once_t onceToken;
  dispatch_once(&onceToken, ^{
    names = @[ NSLinkAttributeName ];
  });
  return names;
}

static void testDefaultLinkAttributeNamesMagicStatic(size_t iterations) {
  uint64_t t = dispatch_benchmark(iterations, ^{
    __unused NSArray *_ = DefaultLinkAttributeNamesMagicStatic();
  });
  NSLog(@"DefaultLinkAttributeNamesMagicStatic Avg. Runtime: %llu ns", t);
}

static void testDefaultLinkAttributeNamesMagicDispatchOnce(size_t iterations) {
  uint64_t t = dispatch_benchmark(iterations, ^{
    __unused NSArray *_ = DefaultLinkAttributeNamesMagicDispatchOnce();
  });
  NSLog(@"DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: %llu ns", t);
}

int main(int argc, const char * argv[]) {
  @autoreleasepool {
    size_t iterations = 100000;

    for (int i = 0; i < 5; i++) {
      testDefaultLinkAttributeNamesMagicStatic(iterations);
    }

    for (int i = 0; i < 5; i++) {
      testDefaultLinkAttributeNamesMagicDispatchOnce(iterations);
    }
  }
  return 0;
}

Runs:

1. Run:

2018-12-18 15:38:26.754728+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:38:26.759215+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:38:26.763499+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:38:26.767830+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 42 ns
2018-12-18 15:38:26.772055+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:38:26.775158+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:38:26.778210+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:38:26.781240+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:38:26.784296+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:38:26.787350+0100 StaticInitializerBenchmark[62817:763142] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns

2. Run:

2018-12-18 15:41:12.422922+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 40 ns
2018-12-18 15:41:12.427357+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:41:12.431553+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 40 ns
2018-12-18 15:41:12.435736+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 40 ns
2018-12-18 15:41:12.440016+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:41:12.443089+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:41:12.446144+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:41:12.449175+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:41:12.452215+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:41:12.455317+0100 StaticInitializerBenchmark[63262:768527] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns

3. Run:

2018-12-18 15:44:32.690212+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 42 ns
2018-12-18 15:44:32.694629+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:44:32.698876+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:44:32.703140+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:44:32.707375+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 41 ns
2018-12-18 15:44:32.710474+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:44:32.713577+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:44:32.716666+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:44:32.719714+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns
2018-12-18 15:44:32.722748+0100 StaticInitializerBenchmark[63835:775815] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 29 ns

@maicki maicki closed this Dec 18, 2018
@maicki maicki reopened this Dec 18, 2018
@maicki
Contributor

maicki commented Dec 18, 2018

It is also interesting to move this to multithreaded access via dispatch_apply:

int main(int argc, const char * argv[]) {
  @autoreleasepool {
    size_t iterations = 10000000;

    dispatch_apply(5, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(size_t size) {
      testDefaultLinkAttributeNamesMagicStatic(iterations);
    });
//    for (int i = 0; i < 5; i++) {
//      testDefaultLinkAttributeNamesMagicStatic(iterations);
//    }

    dispatch_apply(5, dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^(size_t size) {
      testDefaultLinkAttributeNamesMagicDispatchOnce(iterations);
    });
//    for (int i = 0; i < 5; i++) {
//      testDefaultLinkAttributeNamesMagicDispatchOnce(iterations);
//    }
  }
  return 0;
}

1. Run:

2018-12-18 15:58:05.995261+0100 StaticInitializerBenchmark[66418:809388] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 910 ns
2018-12-18 15:58:06.001869+0100 StaticInitializerBenchmark[66418:809489] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 911 ns
2018-12-18 15:58:06.007117+0100 StaticInitializerBenchmark[66418:809491] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 911 ns
2018-12-18 15:58:06.109230+0100 StaticInitializerBenchmark[66418:809490] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 922 ns
2018-12-18 15:58:06.109656+0100 StaticInitializerBenchmark[66418:809492] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 922 ns
2018-12-18 15:58:12.715117+0100 StaticInitializerBenchmark[66418:809961] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 659 ns
2018-12-18 15:58:12.761388+0100 StaticInitializerBenchmark[66418:809963] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 664 ns
2018-12-18 15:58:12.845285+0100 StaticInitializerBenchmark[66418:809388] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 672 ns
2018-12-18 15:58:12.868158+0100 StaticInitializerBenchmark[66418:809960] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 674 ns
2018-12-18 15:58:12.883791+0100 StaticInitializerBenchmark[66418:809962] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 676 ns

2. Run:

2018-12-18 15:58:36.470113+0100 StaticInitializerBenchmark[66479:810703] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 899 ns
2018-12-18 15:58:36.508274+0100 StaticInitializerBenchmark[66479:810786] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 903 ns
2018-12-18 15:58:36.646166+0100 StaticInitializerBenchmark[66479:810799] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 917 ns
2018-12-18 15:58:36.666202+0100 StaticInitializerBenchmark[66479:810785] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 919 ns
2018-12-18 15:58:36.707457+0100 StaticInitializerBenchmark[66479:810798] DefaultLinkAttributeNamesMagicStatic Avg. Runtime: 923 ns
2018-12-18 15:58:43.947740+0100 StaticInitializerBenchmark[66479:811002] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 723 ns
2018-12-18 15:58:44.046547+0100 StaticInitializerBenchmark[66479:811001] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 732 ns
2018-12-18 15:58:44.054506+0100 StaticInitializerBenchmark[66479:810703] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 733 ns
2018-12-18 15:58:44.054509+0100 StaticInitializerBenchmark[66479:811000] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 733 ns
2018-12-18 15:58:44.055542+0100 StaticInitializerBenchmark[66479:810999] DefaultLinkAttributeNamesMagicDispatchOnce Avg. Runtime: 733 ns

@dmaclach
Contributor Author

Thanks folks. Moved over to dispatch_once.

@nguyenhuy
Member

Very interesting numbers, thanks for sharing @maicki.

Member

@nguyenhuy nguyenhuy left a comment

LGTM now. It's great that we can reclaim some pre-main time. Thank you, @dmaclach!

@dmaclach
Contributor Author

dmaclach commented Dec 18, 2018

FWIW, https://opensource.apple.com/source/libcppabi/libcppabi-26/src/cxa_guard.cxx.auto.html shows how the static initialization is done. There does appear to be a large recursive lock.
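
To make the idea of a guard behind one shared lock concrete, here is a heavily simplified sketch in C++. It is not the libcppabi implementation: `Guard`, `GuardAcquire`, `GuardRelease`, `ComputeValue`, and `GuardedValue` are hypothetical names, and the single process-wide recursive mutex is an assumption based on the comment above.

```cpp
#include <atomic>
#include <cassert>  // for the checks described in the note below
#include <mutex>

// Hypothetical model of the per-variable guard a compiler emits
// for one function-local static.
struct Guard {
  std::atomic<bool> initialized{false};
};

// Assumption: one recursive lock shared by every guarded static in the process.
static std::recursive_mutex g_guardMutex;

// Returns true if the caller must run the initializer.
// In that case the shared lock stays held until GuardRelease().
static bool GuardAcquire(Guard &guard) {
  // Fast path: already initialized, no lock taken.
  if (guard.initialized.load(std::memory_order_acquire)) return false;
  g_guardMutex.lock();
  // Re-check under the lock: another thread may have finished first.
  if (guard.initialized.load(std::memory_order_relaxed)) {
    g_guardMutex.unlock();
    return false;
  }
  return true;
}

static void GuardRelease(Guard &guard) {
  guard.initialized.store(true, std::memory_order_release);
  g_guardMutex.unlock();
}

// Hypothetical initializer; the counter only makes the behavior observable.
static int g_initCalls = 0;
static int ComputeValue() {
  ++g_initCalls;
  return 42;
}

// Roughly what `static int value = ComputeValue();` expands to in this model.
static int GuardedValue() {
  static Guard guard;
  static int value;
  if (GuardAcquire(guard)) {
    value = ComputeValue();
    GuardRelease(guard);
  }
  return value;
}
```

In this model every first-time initialization in the process serializes on `g_guardMutex`, whereas a `dispatch_once_t` token is private to its own call site. Whether that accounts for the timings measured earlier in this thread is not established here.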

Contributor

@maicki maicki left a comment

@dmaclach Thanks for improving!

@mikezucc
Contributor

cool stuff!

@dmaclach
Contributor Author

@maicki can you merge? I do not have privs.

@nguyenhuy nguyenhuy merged commit 7cddc2b into TextureGroup:master Dec 19, 2018
@dmaclach dmaclach deleted the static1 branch December 19, 2018 16:27
raviTokopedia pushed a commit to raviTokopedia/Texture that referenced this pull request Mar 5, 2020
Make global static a function local static. This stops it from being
initialized premain and affecting startup time.

(cherry picked from commit 7cddc2b)