Regression in using NVCC as a CUDA compiler: redzone_allocator_kernel_cuda compilation fails #13460

Open
pearu opened this issue Jun 6, 2024 · 11 comments

Comments

@pearu
Contributor

pearu commented Jun 6, 2024

After commit d8f0c1a, building XLA for the CUDA backend fails when NVCC is the CUDA compiler. Reproducer:

$ ./configure.py --backend=CUDA --cuda_compiler=NVCC
$ bazel build --test_output=all --spawn_strategy=sandboxed //xla/tests:complex_unary_op_test
<snip>
ERROR: /home/pearu/git/pearu/xla/xla/stream_executor/gpu/BUILD:476:19: Compiling xla/stream_executor/gpu/redzone_allocator_kernel_cuda.cc failed: (Exit 2): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //xla/stream_executor/gpu:redzone_allocator_kernel_cuda) external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF ... (remaining 163 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
external/com_google_absl/absl/hash/hash.h(327): warning #549-D: variable "s" is used before its value is set
      return s;
             ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
      str_format_internal::ArgumentToConv<Args>()...>;
      ^
external/com_google_absl/absl/strings/internal/str_format/bind.h(155): note #2818-D: attribute was declared here
        __attribute__((enable_if(ValidFormatImpl<Args...>(s), "bad format trap")))
                       ^

external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
      str_format_internal::ArgumentToConv<Args>()...>;
      ^
external/com_google_absl/absl/strings/internal/str_format/bind.h(140): note #2818-D: attribute was declared here
            enable_if(str_format_internal::EnsureConstexpr(s), "constexpr trap"),
            ^

/mnt/md1/pearu/miniconda3/envs/xla-cuda-dev/lib/clang/18/include/emmintrin.h(47): error: identifier "__bf16" is undefined
  typedef __bf16 __v8bf __attribute__((__vector_size__(16), __aligned__(16)));
          ^

/mnt/md1/pearu/miniconda3/envs/xla-cuda-dev/lib/clang/18/include/emmintrin.h(48): error: identifier "__bf16" is undefined
  typedef __bf16 __m128bh __attribute__((__vector_size__(16), __aligned__(16)));
          ^

external/com_google_absl/absl/status/status.h(796): warning #2810-D: ignoring return value type with "nodiscard" attribute
      *this = std::move(new_status);
            ^

external/com_google_absl/absl/status/internal/statusor_internal.h(240): warning #2810-D: ignoring return value type with "nodiscard" attribute
        status_ = OkStatus();
                ^

external/com_google_absl/absl/status/internal/statusor_internal.h(247): warning #2810-D: ignoring return value type with "nodiscard" attribute
      status_ = static_cast<absl::Status>(std::forward<U>(v));
              ^

external/com_google_absl/absl/status/internal/statusor_internal.h(29): warning #1835-D: attribute "warn_unused_result" does not apply here
  class __attribute__((warn_unused_result)) StatusOr;
                       ^

4 errors detected in the compilation of "xla/stream_executor/gpu/redzone_allocator_kernel_cuda.cc".
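
For triage, it can help to separate the errors in the NVCC output above from the surrounding warnings and notes. The following is a minimal sketch, assuming diagnostics follow the `file(line): severity: message` shape seen in this log; the names `DIAG_RE`, `parse_diagnostics`, `summarize_errors`, and `SAMPLE` are illustrative and not part of XLA or any tool.

```python
import re
from collections import Counter

# Assumed shape of an NVCC front-end diagnostic header, as seen in the log above:
#   path/to/file.h(273): error: message
#   path/to/file.h(327): warning #549-D: message
DIAG_RE = re.compile(
    r'^(?P<path>[^\s(][^(]*)\((?P<line>\d+)\): '
    r'(?P<severity>error|warning|note)(?: #(?P<code>\d+)-D)?: (?P<message>.*)$'
)


def parse_diagnostics(log_text):
    """Return one dict per diagnostic header line found in log_text."""
    diagnostics = []
    for raw in log_text.splitlines():
        match = DIAG_RE.match(raw.strip())
        if match:
            diagnostics.append(match.groupdict())
    return diagnostics


def summarize_errors(log_text):
    """Count error diagnostics, grouped by (file, message)."""
    return Counter(
        (d["path"], d["message"])
        for d in parse_diagnostics(log_text)
        if d["severity"] == "error"
    )


# Header lines copied from the log above (source excerpts and carets omitted).
SAMPLE = """\
external/com_google_absl/absl/hash/hash.h(327): warning #549-D: variable "s" is used before its value is set
external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
external/com_google_absl/absl/strings/internal/str_format/bind.h(155): note #2818-D: attribute was declared here
external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
/mnt/md1/pearu/miniconda3/envs/xla-cuda-dev/lib/clang/18/include/emmintrin.h(47): error: identifier "__bf16" is undefined
/mnt/md1/pearu/miniconda3/envs/xla-cuda-dev/lib/clang/18/include/emmintrin.h(48): error: identifier "__bf16" is undefined
"""

if __name__ == "__main__":
    for (path, message), count in summarize_errors(SAMPLE).items():
        print(f"{count}x {path}: {message}")
```

On the sample, the four errors collapse into two groups of two: the `enable_if` attribute errors from Abseil's `str_format.h`, and the undefined `__bf16` identifier in clang 18's `emmintrin.h`.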

With

$ ./configure.py --backend=CUDA --cuda_compiler=CLANG

the XLA build is successful.

Using:

XLA main branch
clang version 18.1.6
bazel 6.5.0
nvcc: Cuda compilation tools, release 12.1, V12.1.66

CC: @beckerhe

@cheshire
Member

cheshire commented Jun 6, 2024

Do we even support nvcc as a CUDA compiler?

@cheshire
Member

cheshire commented Jun 6, 2024

also CC @ddunl

@pearu
Contributor Author

pearu commented Jun 6, 2024

> Do we even support nvcc as a CUDA compiler?

FWIW, NVCC is the default cuda compiler in configure.py.

@beckerhe
Member

Sorry for the delay - I was out. I will look into this. Interestingly we don't see this on the CI. Which version of CUDA are you using?

@pearu
Contributor Author

pearu commented Jun 17, 2024

> Interestingly we don't see this on the CI. Which version of CUDA are you using?

12.1.0

copybara-service bot pushed a commit that referenced this issue Jun 19, 2024
It used to be a `gpu_kernel_library` which means it gets compiled
for device using NVCC in OSS. But the source files don't contain
any CUDA code, so there is no need for that.

#13460 reports that this behaviour
even fails with some versions of NVCC because NVCC can't deal
with C++ template magic being pulled in by Abseil.

PiperOrigin-RevId: 644658241
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Jun 19, 2024
It used to be a `gpu_kernel_library` which means it gets compiled
for device using NVCC in OSS. But the source files don't contain
any CUDA code, so there is no need for that.

openxla/xla#13460 reports that this behaviour
even fails with some versions of NVCC because NVCC can't deal
with C++ template magic being pulled in by Abseil.

Reverts c0e79da

PiperOrigin-RevId: 644658241
copybara-service bot pushed a commit that referenced this issue Jun 19, 2024
It used to be a `gpu_kernel_library` which means it gets compiled
for device using NVCC in OSS. But the source files don't contain
any CUDA code, so there is no need for that.

#13460 reports that this behaviour
even fails with some versions of NVCC because NVCC can't deal
with C++ template magic being pulled in by Abseil.

PiperOrigin-RevId: 644685844
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this issue Jun 19, 2024
It used to be a `gpu_kernel_library` which means it gets compiled
for device using NVCC in OSS. But the source files don't contain
any CUDA code, so there is no need for that.

openxla/xla#13460 reports that this behaviour
even fails with some versions of NVCC because NVCC can't deal
with C++ template magic being pulled in by Abseil.

PiperOrigin-RevId: 644685844
@beckerhe
Member

Hey @pearu, I've pushed a (potential) fix. Would you be able to confirm whether it actually fixes your issue?

@pearu
Contributor Author

pearu commented Jun 19, 2024

IIUC, the (potential) fix is equivalent to applying a patch containing:

--- a/xla/stream_executor/gpu/BUILD
+++ b/xla/stream_executor/gpu/BUILD
@@ -473,7 +473,7 @@ gpu_only_cc_library(
     ]),
 )
 
-gpu_kernel_library(
+cc_library(
     name = "redzone_allocator_kernel_cuda",
     srcs = [
         "redzone_allocator_kernel.h",

When I apply this to my local branch (I can try main later if needed), there is progress: redzone_allocator_kernel_cuda.cc compiles successfully. However, the build then breaks when compiling gpu_timer_kernel_cuda.cu.cc with a similar failure:

ERROR: /home/pearu/git/pearu/xla/xla/stream_executor/gpu/BUILD:333:19: Compiling xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc failed: (Exit 2): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //xla/stream_executor/gpu:gpu_timer_kernel_cuda) external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/xla/stream_executor/gpu/_objs/gpu_timer_kernel_cuda/gpu_timer_kernel_cuda.cu.pic.d ... (remaining 132 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:199:9: warning: 'LOG' macro redefined [-Wmacro-redefined]
  199 | #define LOG(severity) ABSL_LOG_INTERNAL_LOG_IMPL(_##severity)
      |         ^
external/tsl/tsl/platform/default/logging.h:165:9: note: previous definition is here
  165 | #define LOG(severity) _TF_LOG_##severity
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:237:9: warning: 'LOG_EVERY_N' macro redefined [-Wmacro-redefined]
  237 | #define LOG_EVERY_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:278:9: note: previous definition is here
  278 | #define LOG_EVERY_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:245:9: warning: 'LOG_FIRST_N' macro redefined [-Wmacro-redefined]
  245 | #define LOG_FIRST_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:284:9: note: previous definition is here
  284 | #define LOG_FIRST_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:253:9: warning: 'LOG_EVERY_POW_2' macro redefined [-Wmacro-redefined]
  253 | #define LOG_EVERY_POW_2(severity) \
      |         ^
external/tsl/tsl/platform/default/logging.h:290:9: note: previous definition is here
  290 | #define LOG_EVERY_POW_2(severity)                         \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:265:9: warning: 'LOG_EVERY_N_SEC' macro redefined [-Wmacro-redefined]
  265 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
      |         ^
external/tsl/tsl/platform/default/logging.h:300:9: note: previous definition is here
  300 | #define LOG_EVERY_N_SEC(severity, n_seconds)                      \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:57:9: warning: 'CHECK' macro redefined [-Wmacro-redefined]
   57 | #define CHECK(condition) ABSL_LOG_INTERNAL_CHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:308:9: note: previous definition is here
  308 | #define CHECK(condition)              \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:65:9: warning: 'QCHECK' macro redefined [-Wmacro-redefined]
   65 | #define QCHECK(condition) ABSL_LOG_INTERNAL_QCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:542:9: note: previous definition is here
  542 | #define QCHECK(condition) CHECK(condition)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:88:9: warning: 'DCHECK' macro redefined [-Wmacro-redefined]
   88 | #define DCHECK(condition) ABSL_LOG_INTERNAL_DCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:521:9: note: previous definition is here
  521 | #define DCHECK(condition) \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:116:9: warning: 'CHECK_EQ' macro redefined [-Wmacro-redefined]
  116 | #define CHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:499:9: note: previous definition is here
  499 | #define CHECK_EQ(val1, val2) CHECK_OP(Check_EQ, ==, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:118:9: warning: 'CHECK_NE' macro redefined [-Wmacro-redefined]
  118 | #define CHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:500:9: note: previous definition is here
  500 | #define CHECK_NE(val1, val2) CHECK_OP(Check_NE, !=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:120:9: warning: 'CHECK_LE' macro redefined [-Wmacro-redefined]
  120 | #define CHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:501:9: note: previous definition is here
  501 | #define CHECK_LE(val1, val2) CHECK_OP(Check_LE, <=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:122:9: warning: 'CHECK_LT' macro redefined [-Wmacro-redefined]
  122 | #define CHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:502:9: note: previous definition is here
  502 | #define CHECK_LT(val1, val2) CHECK_OP(Check_LT, <, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:124:9: warning: 'CHECK_GE' macro redefined [-Wmacro-redefined]
  124 | #define CHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:503:9: note: previous definition is here
  503 | #define CHECK_GE(val1, val2) CHECK_OP(Check_GE, >=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:126:9: warning: 'CHECK_GT' macro redefined [-Wmacro-redefined]
  126 | #define CHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:504:9: note: previous definition is here
  504 | #define CHECK_GT(val1, val2) CHECK_OP(Check_GT, >, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:128:9: warning: 'QCHECK_EQ' macro redefined [-Wmacro-redefined]
  128 | #define QCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:543:9: note: previous definition is here
  543 | #define QCHECK_EQ(x, y) CHECK_EQ(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:130:9: warning: 'QCHECK_NE' macro redefined [-Wmacro-redefined]
  130 | #define QCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:544:9: note: previous definition is here
  544 | #define QCHECK_NE(x, y) CHECK_NE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:132:9: warning: 'QCHECK_LE' macro redefined [-Wmacro-redefined]
  132 | #define QCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:545:9: note: previous definition is here
  545 | #define QCHECK_LE(x, y) CHECK_LE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:134:9: warning: 'QCHECK_LT' macro redefined [-Wmacro-redefined]
  134 | #define QCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:546:9: note: previous definition is here
  546 | #define QCHECK_LT(x, y) CHECK_LT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:136:9: warning: 'QCHECK_GE' macro redefined [-Wmacro-redefined]
  136 | #define QCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:547:9: note: previous definition is here
  547 | #define QCHECK_GE(x, y) CHECK_GE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:138:9: warning: 'QCHECK_GT' macro redefined [-Wmacro-redefined]
  138 | #define QCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:548:9: note: previous definition is here
  548 | #define QCHECK_GT(x, y) CHECK_GT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:140:9: warning: 'DCHECK_EQ' macro redefined [-Wmacro-redefined]
  140 | #define DCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:531:9: note: previous definition is here
  531 | #define DCHECK_EQ(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:142:9: warning: 'DCHECK_NE' macro redefined [-Wmacro-redefined]
  142 | #define DCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:532:9: note: previous definition is here
  532 | #define DCHECK_NE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:144:9: warning: 'DCHECK_LE' macro redefined [-Wmacro-redefined]
  144 | #define DCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:533:9: note: previous definition is here
  533 | #define DCHECK_LE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:146:9: warning: 'DCHECK_LT' macro redefined [-Wmacro-redefined]
  146 | #define DCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:534:9: note: previous definition is here
  534 | #define DCHECK_LT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:148:9: warning: 'DCHECK_GE' macro redefined [-Wmacro-redefined]
  148 | #define DCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:535:9: note: previous definition is here
  535 | #define DCHECK_GE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:150:9: warning: 'DCHECK_GT' macro redefined [-Wmacro-redefined]
  150 | #define DCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:536:9: note: previous definition is here
  536 | #define DCHECK_GT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
26 warnings generated.
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:199:9: warning: 'LOG' macro redefined [-Wmacro-redefined]
  199 | #define LOG(severity) ABSL_LOG_INTERNAL_LOG_IMPL(_##severity)
      |         ^
external/tsl/tsl/platform/default/logging.h:165:9: note: previous definition is here
  165 | #define LOG(severity) _TF_LOG_##severity
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:237:9: warning: 'LOG_EVERY_N' macro redefined [-Wmacro-redefined]
  237 | #define LOG_EVERY_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:278:9: note: previous definition is here
  278 | #define LOG_EVERY_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:245:9: warning: 'LOG_FIRST_N' macro redefined [-Wmacro-redefined]
  245 | #define LOG_FIRST_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:284:9: note: previous definition is here
  284 | #define LOG_FIRST_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:253:9: warning: 'LOG_EVERY_POW_2' macro redefined [-Wmacro-redefined]
  253 | #define LOG_EVERY_POW_2(severity) \
      |         ^
external/tsl/tsl/platform/default/logging.h:290:9: note: previous definition is here
  290 | #define LOG_EVERY_POW_2(severity)                         \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:265:9: warning: 'LOG_EVERY_N_SEC' macro redefined [-Wmacro-redefined]
  265 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
      |         ^
external/tsl/tsl/platform/default/logging.h:300:9: note: previous definition is here
  300 | #define LOG_EVERY_N_SEC(severity, n_seconds)                      \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:57:9: warning: 'CHECK' macro redefined [-Wmacro-redefined]
   57 | #define CHECK(condition) ABSL_LOG_INTERNAL_CHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:308:9: note: previous definition is here
  308 | #define CHECK(condition)              \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:65:9: warning: 'QCHECK' macro redefined [-Wmacro-redefined]
   65 | #define QCHECK(condition) ABSL_LOG_INTERNAL_QCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:542:9: note: previous definition is here
  542 | #define QCHECK(condition) CHECK(condition)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:88:9: warning: 'DCHECK' macro redefined [-Wmacro-redefined]
   88 | #define DCHECK(condition) ABSL_LOG_INTERNAL_DCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:521:9: note: previous definition is here
  521 | #define DCHECK(condition) \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:116:9: warning: 'CHECK_EQ' macro redefined [-Wmacro-redefined]
  116 | #define CHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:499:9: note: previous definition is here
  499 | #define CHECK_EQ(val1, val2) CHECK_OP(Check_EQ, ==, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:118:9: warning: 'CHECK_NE' macro redefined [-Wmacro-redefined]
  118 | #define CHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:500:9: note: previous definition is here
  500 | #define CHECK_NE(val1, val2) CHECK_OP(Check_NE, !=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:120:9: warning: 'CHECK_LE' macro redefined [-Wmacro-redefined]
  120 | #define CHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:501:9: note: previous definition is here
  501 | #define CHECK_LE(val1, val2) CHECK_OP(Check_LE, <=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:122:9: warning: 'CHECK_LT' macro redefined [-Wmacro-redefined]
  122 | #define CHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:502:9: note: previous definition is here
  502 | #define CHECK_LT(val1, val2) CHECK_OP(Check_LT, <, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:124:9: warning: 'CHECK_GE' macro redefined [-Wmacro-redefined]
  124 | #define CHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:503:9: note: previous definition is here
  503 | #define CHECK_GE(val1, val2) CHECK_OP(Check_GE, >=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:126:9: warning: 'CHECK_GT' macro redefined [-Wmacro-redefined]
  126 | #define CHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:504:9: note: previous definition is here
  504 | #define CHECK_GT(val1, val2) CHECK_OP(Check_GT, >, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:128:9: warning: 'QCHECK_EQ' macro redefined [-Wmacro-redefined]
  128 | #define QCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:543:9: note: previous definition is here
  543 | #define QCHECK_EQ(x, y) CHECK_EQ(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:130:9: warning: 'QCHECK_NE' macro redefined [-Wmacro-redefined]
  130 | #define QCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:544:9: note: previous definition is here
  544 | #define QCHECK_NE(x, y) CHECK_NE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:132:9: warning: 'QCHECK_LE' macro redefined [-Wmacro-redefined]
  132 | #define QCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:545:9: note: previous definition is here
  545 | #define QCHECK_LE(x, y) CHECK_LE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:134:9: warning: 'QCHECK_LT' macro redefined [-Wmacro-redefined]
  134 | #define QCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:546:9: note: previous definition is here
  546 | #define QCHECK_LT(x, y) CHECK_LT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:136:9: warning: 'QCHECK_GE' macro redefined [-Wmacro-redefined]
  136 | #define QCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:547:9: note: previous definition is here
  547 | #define QCHECK_GE(x, y) CHECK_GE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:138:9: warning: 'QCHECK_GT' macro redefined [-Wmacro-redefined]
  138 | #define QCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:548:9: note: previous definition is here
  548 | #define QCHECK_GT(x, y) CHECK_GT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:140:9: warning: 'DCHECK_EQ' macro redefined [-Wmacro-redefined]
  140 | #define DCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:531:9: note: previous definition is here
  531 | #define DCHECK_EQ(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:142:9: warning: 'DCHECK_NE' macro redefined [-Wmacro-redefined]
  142 | #define DCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:532:9: note: previous definition is here
  532 | #define DCHECK_NE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:144:9: warning: 'DCHECK_LE' macro redefined [-Wmacro-redefined]
  144 | #define DCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:533:9: note: previous definition is here
  533 | #define DCHECK_LE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:146:9: warning: 'DCHECK_LT' macro redefined [-Wmacro-redefined]
  146 | #define DCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:534:9: note: previous definition is here
  534 | #define DCHECK_LT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:148:9: warning: 'DCHECK_GE' macro redefined [-Wmacro-redefined]
  148 | #define DCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:535:9: note: previous definition is here
  535 | #define DCHECK_GE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:150:9: warning: 'DCHECK_GT' macro redefined [-Wmacro-redefined]
  150 | #define DCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:536:9: note: previous definition is here
  536 | #define DCHECK_GT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
26 warnings generated.
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:199:9: warning: 'LOG' macro redefined [-Wmacro-redefined]
  199 | #define LOG(severity) ABSL_LOG_INTERNAL_LOG_IMPL(_##severity)
      |         ^
external/tsl/tsl/platform/default/logging.h:165:9: note: previous definition is here
  165 | #define LOG(severity) _TF_LOG_##severity
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:237:9: warning: 'LOG_EVERY_N' macro redefined [-Wmacro-redefined]
  237 | #define LOG_EVERY_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:278:9: note: previous definition is here
  278 | #define LOG_EVERY_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:245:9: warning: 'LOG_FIRST_N' macro redefined [-Wmacro-redefined]
  245 | #define LOG_FIRST_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:284:9: note: previous definition is here
  284 | #define LOG_FIRST_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:253:9: warning: 'LOG_EVERY_POW_2' macro redefined [-Wmacro-redefined]
  253 | #define LOG_EVERY_POW_2(severity) \
      |         ^
external/tsl/tsl/platform/default/logging.h:290:9: note: previous definition is here
  290 | #define LOG_EVERY_POW_2(severity)                         \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:265:9: warning: 'LOG_EVERY_N_SEC' macro redefined [-Wmacro-redefined]
  265 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
      |         ^
external/tsl/tsl/platform/default/logging.h:300:9: note: previous definition is here
  300 | #define LOG_EVERY_N_SEC(severity, n_seconds)                      \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:57:9: warning: 'CHECK' macro redefined [-Wmacro-redefined]
   57 | #define CHECK(condition) ABSL_LOG_INTERNAL_CHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:308:9: note: previous definition is here
  308 | #define CHECK(condition)              \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:65:9: warning: 'QCHECK' macro redefined [-Wmacro-redefined]
   65 | #define QCHECK(condition) ABSL_LOG_INTERNAL_QCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:542:9: note: previous definition is here
  542 | #define QCHECK(condition) CHECK(condition)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:88:9: warning: 'DCHECK' macro redefined [-Wmacro-redefined]
   88 | #define DCHECK(condition) ABSL_LOG_INTERNAL_DCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:521:9: note: previous definition is here
  521 | #define DCHECK(condition) \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:116:9: warning: 'CHECK_EQ' macro redefined [-Wmacro-redefined]
  116 | #define CHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:499:9: note: previous definition is here
  499 | #define CHECK_EQ(val1, val2) CHECK_OP(Check_EQ, ==, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:118:9: warning: 'CHECK_NE' macro redefined [-Wmacro-redefined]
  118 | #define CHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:500:9: note: previous definition is here
  500 | #define CHECK_NE(val1, val2) CHECK_OP(Check_NE, !=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:120:9: warning: 'CHECK_LE' macro redefined [-Wmacro-redefined]
  120 | #define CHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:501:9: note: previous definition is here
  501 | #define CHECK_LE(val1, val2) CHECK_OP(Check_LE, <=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:122:9: warning: 'CHECK_LT' macro redefined [-Wmacro-redefined]
  122 | #define CHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:502:9: note: previous definition is here
  502 | #define CHECK_LT(val1, val2) CHECK_OP(Check_LT, <, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:124:9: warning: 'CHECK_GE' macro redefined [-Wmacro-redefined]
  124 | #define CHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:503:9: note: previous definition is here
  503 | #define CHECK_GE(val1, val2) CHECK_OP(Check_GE, >=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:126:9: warning: 'CHECK_GT' macro redefined [-Wmacro-redefined]
  126 | #define CHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:504:9: note: previous definition is here
  504 | #define CHECK_GT(val1, val2) CHECK_OP(Check_GT, >, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:128:9: warning: 'QCHECK_EQ' macro redefined [-Wmacro-redefined]
  128 | #define QCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:543:9: note: previous definition is here
  543 | #define QCHECK_EQ(x, y) CHECK_EQ(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:130:9: warning: 'QCHECK_NE' macro redefined [-Wmacro-redefined]
  130 | #define QCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:544:9: note: previous definition is here
  544 | #define QCHECK_NE(x, y) CHECK_NE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:132:9: warning: 'QCHECK_LE' macro redefined [-Wmacro-redefined]
  132 | #define QCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:545:9: note: previous definition is here
  545 | #define QCHECK_LE(x, y) CHECK_LE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:134:9: warning: 'QCHECK_LT' macro redefined [-Wmacro-redefined]
  134 | #define QCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:546:9: note: previous definition is here
  546 | #define QCHECK_LT(x, y) CHECK_LT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:136:9: warning: 'QCHECK_GE' macro redefined [-Wmacro-redefined]
  136 | #define QCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:547:9: note: previous definition is here
  547 | #define QCHECK_GE(x, y) CHECK_GE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:138:9: warning: 'QCHECK_GT' macro redefined [-Wmacro-redefined]
  138 | #define QCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:548:9: note: previous definition is here
  548 | #define QCHECK_GT(x, y) CHECK_GT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:140:9: warning: 'DCHECK_EQ' macro redefined [-Wmacro-redefined]
  140 | #define DCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:531:9: note: previous definition is here
  531 | #define DCHECK_EQ(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:142:9: warning: 'DCHECK_NE' macro redefined [-Wmacro-redefined]
  142 | #define DCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:532:9: note: previous definition is here
  532 | #define DCHECK_NE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:144:9: warning: 'DCHECK_LE' macro redefined [-Wmacro-redefined]
  144 | #define DCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:533:9: note: previous definition is here
  533 | #define DCHECK_LE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:146:9: warning: 'DCHECK_LT' macro redefined [-Wmacro-redefined]
  146 | #define DCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:534:9: note: previous definition is here
  534 | #define DCHECK_LT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:148:9: warning: 'DCHECK_GE' macro redefined [-Wmacro-redefined]
  148 | #define DCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:535:9: note: previous definition is here
  535 | #define DCHECK_GE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:150:9: warning: 'DCHECK_GT' macro redefined [-Wmacro-redefined]
  150 | #define DCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:536:9: note: previous definition is here
  536 | #define DCHECK_GT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
26 warnings generated.
external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
      str_format_internal::ArgumentToConv<Args>()...>;
      ^
external/com_google_absl/absl/strings/internal/str_format/bind.h(155): note #2818-D: attribute was declared here
        __attribute__((enable_if(ValidFormatImpl<Args...>(s), "bad format trap")))
                       ^

external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
      str_format_internal::ArgumentToConv<Args>()...>;
      ^
external/com_google_absl/absl/strings/internal/str_format/bind.h(140): note #2818-D: attribute was declared here
            enable_if(str_format_internal::EnsureConstexpr(s), "constexpr trap"),
            ^

external/com_google_absl/absl/status/status.h(796): warning #2810-D: ignoring return value type with "nodiscard" attribute
      *this = std::move(new_status);
            ^

Remark: The warnings can be suppressed with "-diag-suppress <warning-number>"

external/com_google_absl/absl/status/internal/statusor_internal.h(240): warning #2810-D: ignoring return value type with "nodiscard" attribute
        status_ = OkStatus();
                ^

external/com_google_absl/absl/status/internal/statusor_internal.h(247): warning #2810-D: ignoring return value type with "nodiscard" attribute
      status_ = static_cast<absl::Status>(std::forward<U>(v));
              ^

external/com_google_absl/absl/status/internal/statusor_internal.h(29): warning #1835-D: attribute "warn_unused_result" does not apply here
  class __attribute__((warn_unused_result)) StatusOr;
                       ^

external/com_google_absl/absl/hash/hash.h(327): warning #549-D: variable "s" is used before its value is set
      return s;
             ^

/mnt/md1/pearu/miniconda3/envs/xla-cuda-dev/lib/clang/18/include/emmintrin.h(47): error: identifier "__bf16" is undefined
  typedef __bf16 __v8bf __attribute__((__vector_size__(16), __aligned__(16)));
          ^

/mnt/md1/pearu/miniconda3/envs/xla-cuda-dev/lib/clang/18/include/emmintrin.h(48): error: identifier "__bf16" is undefined
  typedef __bf16 __m128bh __attribute__((__vector_size__(16), __aligned__(16)));
          ^

external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
      str_format_internal::ArgumentToConv<Args>()...>;
      ^
external/com_google_absl/absl/strings/internal/str_format/bind.h(155): note #2818-D: attribute was declared here
        __attribute__((enable_if(ValidFormatImpl<Args...>(s), "bad format trap")))
                       ^

external/com_google_absl/absl/strings/str_format.h(273): error: "enable_if" attributes with conditions that are not constant values are not currently supported
      str_format_internal::ArgumentToConv<Args>()...>;
      ^
external/com_google_absl/absl/strings/internal/str_format/bind.h(140): note #2818-D: attribute was declared here
            enable_if(str_format_internal::EnsureConstexpr(s), "constexpr trap"),
            ^

6 errors detected in the compilation of "xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc".

As an experiment, I also made gpu_timer_kernel_cuda a cc_library; the build now fails with:

ERROR: /home/pearu/git/pearu/xla/xla/stream_executor/gpu/BUILD:333:11: Compiling xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc failed: (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command (from target //xla/stream_executor/gpu:gpu_timer_kernel_cuda) external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/k8-opt/bin/xla/stream_executor/gpu/_objs/gpu_timer_kernel_cuda/gpu_timer_kernel_cuda.cu.pic.d ... (remaining 123 arguments skipped)

Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
clang-18: warning: argument unused during compilation: '--cuda-path=/usr/local/cuda-12.1.0' [-Wunused-command-line-argument]
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:199:9: warning: 'LOG' macro redefined [-Wmacro-redefined]
  199 | #define LOG(severity) ABSL_LOG_INTERNAL_LOG_IMPL(_##severity)
      |         ^
external/tsl/tsl/platform/default/logging.h:165:9: note: previous definition is here
  165 | #define LOG(severity) _TF_LOG_##severity
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:237:9: warning: 'LOG_EVERY_N' macro redefined [-Wmacro-redefined]
  237 | #define LOG_EVERY_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:278:9: note: previous definition is here
  278 | #define LOG_EVERY_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:245:9: warning: 'LOG_FIRST_N' macro redefined [-Wmacro-redefined]
  245 | #define LOG_FIRST_N(severity, n) \
      |         ^
external/tsl/tsl/platform/default/logging.h:284:9: note: previous definition is here
  284 | #define LOG_FIRST_N(severity, n)                       \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:253:9: warning: 'LOG_EVERY_POW_2' macro redefined [-Wmacro-redefined]
  253 | #define LOG_EVERY_POW_2(severity) \
      |         ^
external/tsl/tsl/platform/default/logging.h:290:9: note: previous definition is here
  290 | #define LOG_EVERY_POW_2(severity)                         \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:2:
external/com_google_absl/absl/log/log.h:265:9: warning: 'LOG_EVERY_N_SEC' macro redefined [-Wmacro-redefined]
  265 | #define LOG_EVERY_N_SEC(severity, n_seconds) \
      |         ^
external/tsl/tsl/platform/default/logging.h:300:9: note: previous definition is here
  300 | #define LOG_EVERY_N_SEC(severity, n_seconds)                      \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:57:9: warning: 'CHECK' macro redefined [-Wmacro-redefined]
   57 | #define CHECK(condition) ABSL_LOG_INTERNAL_CHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:308:9: note: previous definition is here
  308 | #define CHECK(condition)              \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:65:9: warning: 'QCHECK' macro redefined [-Wmacro-redefined]
   65 | #define QCHECK(condition) ABSL_LOG_INTERNAL_QCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:542:9: note: previous definition is here
  542 | #define QCHECK(condition) CHECK(condition)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:88:9: warning: 'DCHECK' macro redefined [-Wmacro-redefined]
   88 | #define DCHECK(condition) ABSL_LOG_INTERNAL_DCHECK_IMPL((condition), #condition)
      |         ^
external/tsl/tsl/platform/default/logging.h:521:9: note: previous definition is here
  521 | #define DCHECK(condition) \
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:116:9: warning: 'CHECK_EQ' macro redefined [-Wmacro-redefined]
  116 | #define CHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:499:9: note: previous definition is here
  499 | #define CHECK_EQ(val1, val2) CHECK_OP(Check_EQ, ==, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:118:9: warning: 'CHECK_NE' macro redefined [-Wmacro-redefined]
  118 | #define CHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:500:9: note: previous definition is here
  500 | #define CHECK_NE(val1, val2) CHECK_OP(Check_NE, !=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:120:9: warning: 'CHECK_LE' macro redefined [-Wmacro-redefined]
  120 | #define CHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:501:9: note: previous definition is here
  501 | #define CHECK_LE(val1, val2) CHECK_OP(Check_LE, <=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:122:9: warning: 'CHECK_LT' macro redefined [-Wmacro-redefined]
  122 | #define CHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:502:9: note: previous definition is here
  502 | #define CHECK_LT(val1, val2) CHECK_OP(Check_LT, <, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:124:9: warning: 'CHECK_GE' macro redefined [-Wmacro-redefined]
  124 | #define CHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:503:9: note: previous definition is here
  503 | #define CHECK_GE(val1, val2) CHECK_OP(Check_GE, >=, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:126:9: warning: 'CHECK_GT' macro redefined [-Wmacro-redefined]
  126 | #define CHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:504:9: note: previous definition is here
  504 | #define CHECK_GT(val1, val2) CHECK_OP(Check_GT, >, val1, val2)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:128:9: warning: 'QCHECK_EQ' macro redefined [-Wmacro-redefined]
  128 | #define QCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:543:9: note: previous definition is here
  543 | #define QCHECK_EQ(x, y) CHECK_EQ(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:130:9: warning: 'QCHECK_NE' macro redefined [-Wmacro-redefined]
  130 | #define QCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:544:9: note: previous definition is here
  544 | #define QCHECK_NE(x, y) CHECK_NE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:132:9: warning: 'QCHECK_LE' macro redefined [-Wmacro-redefined]
  132 | #define QCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:545:9: note: previous definition is here
  545 | #define QCHECK_LE(x, y) CHECK_LE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:134:9: warning: 'QCHECK_LT' macro redefined [-Wmacro-redefined]
  134 | #define QCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:546:9: note: previous definition is here
  546 | #define QCHECK_LT(x, y) CHECK_LT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:136:9: warning: 'QCHECK_GE' macro redefined [-Wmacro-redefined]
  136 | #define QCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:547:9: note: previous definition is here
  547 | #define QCHECK_GE(x, y) CHECK_GE(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:138:9: warning: 'QCHECK_GT' macro redefined [-Wmacro-redefined]
  138 | #define QCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:548:9: note: previous definition is here
  548 | #define QCHECK_GT(x, y) CHECK_GT(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:140:9: warning: 'DCHECK_EQ' macro redefined [-Wmacro-redefined]
  140 | #define DCHECK_EQ(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:531:9: note: previous definition is here
  531 | #define DCHECK_EQ(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:142:9: warning: 'DCHECK_NE' macro redefined [-Wmacro-redefined]
  142 | #define DCHECK_NE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:532:9: note: previous definition is here
  532 | #define DCHECK_NE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:144:9: warning: 'DCHECK_LE' macro redefined [-Wmacro-redefined]
  144 | #define DCHECK_LE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:533:9: note: previous definition is here
  533 | #define DCHECK_LE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:146:9: warning: 'DCHECK_LT' macro redefined [-Wmacro-redefined]
  146 | #define DCHECK_LT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:534:9: note: previous definition is here
  534 | #define DCHECK_LT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:148:9: warning: 'DCHECK_GE' macro redefined [-Wmacro-redefined]
  148 | #define DCHECK_GE(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:535:9: note: previous definition is here
  535 | #define DCHECK_GE(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:19:
In file included from ./xla/stream_executor/gpu/gpu_executor.h:60:
In file included from ./xla/stream_executor/stream_executor_common.h:29:
In file included from ./xla/stream_executor/stream_executor.h:50:
In file included from ./xla/stream_executor/stream.h:28:
external/com_google_absl/absl/log/check.h:150:9: warning: 'DCHECK_GT' macro redefined [-Wmacro-redefined]
  150 | #define DCHECK_GT(val1, val2) \
      |         ^
external/tsl/tsl/platform/default/logging.h:536:9: note: previous definition is here
  536 | #define DCHECK_GT(x, y) _TF_DCHECK_NOP(x, y)
      |         ^
In file included from xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:21:
In file included from ./xla/stream_executor/gpu/gpu_timer_kernel.h:21:
./xla/stream_executor/gpu/gpu_stream.h:95:16: warning: 'parent' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
   95 |   GpuExecutor* parent() const { return parent_; }
      |                ^
./xla/stream_executor/stream_common.h:91:19: note: overridden virtual function is here
   91 |   StreamExecutor *parent() const override {
      |                   ^
xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:35:24: error: use of undeclared identifier 'clock64'
   35 |   const int64_t tstart{clock64()};
      |                        ^
xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:38:11: error: use of undeclared identifier 'clock64'
   38 |          (clock64() - tstart) < TIMEOUT_CYCLES) {
      |           ^
xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:40:22: error: use of undeclared identifier 'clock64'
   40 |     const int64_t t0{clock64()};
      |                      ^
xla/stream_executor/gpu/gpu_timer_kernel_cuda.cu.cc:42:17: error: use of undeclared identifier 'clock64'
   42 |       elapsed = clock64() - t0;
      |                 ^
27 warnings and 4 errors generated.

HTH

@beckerhe
Member

Ok cool. That seems to be the same issue, but in a different file. I can fix that tomorrow.

@beckerhe
Member

Hey, I won't get to this today. As a workaround, you could use a later version of the CUDA toolkit. We use CUDA 12.3 in the CI, so I believe 12.3+ shouldn't have the bug anymore. Or, as you already discovered, you can use Clang as the CUDA compiler instead.
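
For anyone who wants to check their toolkit against the version mentioned above before building, here is a minimal sketch. It is not part of XLA or configure.py: the helper names are hypothetical, and the (12, 3) threshold is only the CI version quoted in this thread, not a confirmed cutoff for the bug. It parses the `release X.Y` token that `nvcc --version` prints.

```python
import re

# Assumption: 12.3 is the CI toolkit version mentioned in this thread,
# not an officially documented minimum.
MIN_CUDA = (12, 3)


def parse_nvcc_release(version_output):
    """Return (major, minor) parsed from `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_at_least_ci_version(version_output, minimum=MIN_CUDA):
    """True if the parsed toolkit release is >= the assumed minimum."""
    release = parse_nvcc_release(version_output)
    return release is not None and release >= minimum


# Sample line in the format printed by `nvcc --version`.
sample = "Cuda compilation tools, release 12.1, V12.1.105"
print(parse_nvcc_release(sample), is_at_least_ci_version(sample))
```

In practice the text would come from something like `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout`; the sample string is used here so the snippet runs without a CUDA install.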

@cheshire
Member

> FWIW, NVCC is the default cuda compiler in configure.py

Yes, it should be changed to Clang since that's the only thing we test in CI.

@pearu
Contributor Author

pearu commented Jun 20, 2024

I confirm that with CUDA 12.3.2 this issue cannot be reproduced: the bazel build succeeds without applying the above-mentioned patches.
