Core dumped in sdk:6.0.100-bullseye-slim-arm32v7 and sdk:6.0.100-alpine3.14-arm32v7 without --privileged #3253

Closed
pablofrommars opened this issue Oct 27, 2021 · 26 comments

Comments

@pablofrommars

pablofrommars commented Oct 27, 2021

Describe the Bug

In dotnet/nightly/sdk:6.0.100-bullseye-slim-arm32v7 and 6.0.100-alpine3.14-arm32v7, running dotnet commands inside a container started without --privileged aborts with a core dump. The Docker host runs Raspberry Pi OS (32-bit) on a Raspberry Pi 4. The steps below are the minimal way to reproduce the issue. I noticed it after upgrading my Dockerfiles from 5.0 to 6.0 and attempting to build.

sdk:5.0.402-buster-slim-arm32v7 works as expected both with and without --privileged.

sdk:5.0.402-bullseye-slim-arm32v7 core dumps without --privileged but works fine when --privileged is supplied. That would indicate the issue has more to do with bullseye and alpine3.14 than with the version of dotnet.

Thanks to anyone involved.

Steps to Reproduce

$ docker run -it mcr.microsoft.com/dotnet/nightly/sdk:6.0.100-bullseye-slim-arm32v7 bash
root@679d83583dbd:/# dotnet --version
Aborted (core dumped)

While the following works fine:

$ docker run -it --privileged mcr.microsoft.com/dotnet/nightly/sdk:6.0.100-bullseye-slim-arm32v7 bash
root@271ea1adfedd:/# dotnet --version
6.0.100

Other Information

I am unaware of a way to build images in "privileged mode" and bypass the issue.

Output of docker version

Client: Docker Engine - Community
 Version:           20.10.10
 API version:       1.41
 Go version:        go1.16.9
 Git commit:        b485636
 Built:             Mon Oct 25 07:42:19 2021
 OS/Arch:           linux/arm
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.10
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.9
  Git commit:       e2f740d
  Built:            Mon Oct 25 07:40:35 2021
  OS/Arch:          linux/arm
  Experimental:     false
 containerd:
  Version:          1.4.11
  GitCommit:        5b46e404f6b9f661a205e28d59c982d3634148f8
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Output of docker info

Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.6.3-docker)
  compose: Docker Compose (Docker Inc., v2.0.0-beta.6)

Server:
 Containers: 48
  Running: 0
  Paused: 0
  Stopped: 48
 Images: 68
 Server Version: 20.10.10
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2 io.containerd.runtime.v1.linux
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 5b46e404f6b9f661a205e28d59c982d3634148f8
 runc version: v1.0.2-0-g52b36a2
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 5.10.63-v7l+
 Operating System: Raspbian GNU/Linux 10 (buster)
 OSType: linux
 Architecture: armv7l
 CPUs: 4
 Total Memory: 7.539GiB
 Name: aeritlab
 ID: F6KA:LW5H:7ZN2:YACU:MWWG:HLBT:NEFO:SOWU:76HY:4455:JOTO:5CQH
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Username: gravityeye
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory TCP limit support
WARNING: No oom kill disable support

@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@mthalman
Member

Thanks @pablofrommars! I can repro this. I'm just gathering some more data to post on this.

@mthalman
Member

mthalman commented Oct 27, 2021

It definitely seems to be distro version-related. I spent some time trying to determine the scope of the environments in which this can repro. Here's what I've got:

  • Arm32 device: Raspberry Pi 4
  • Affects all .NET versions: 3.1, 5.0, 6.0
  • Affects Debian 11 (Bullseye), Ubuntu 20.04 (Focal), Alpine 3.14
  • Does not affect Debian 10 (Buster), Ubuntu 18.04 (Bionic). Earlier versions of Alpine are not applicable because .NET doesn't support those versions for Arm32.
  • Only occurs in the sdk container image. A similar command, dotnet --info, works in the runtime and aspnet images and fails in the sdk image.

Here's the core dump to be investigated:
core.zip
coredump.zip

@janvorli - Is this something you can investigate to determine the root cause of the crash?

@pablofrommars
Author

pablofrommars commented Oct 27, 2021

Could it be ICU related?

$ docker run -it --rm  --privileged -v /home/pi/dotnet:/repo arm32v7/debian:bullseye-slim bash
root@45622b4ee36b:/# cd repo
root@45622b4ee36b:/repo# ./dotnet --info
Process terminated. Couldn't find a valid ICU package installed on the system. Please install libicu using your package manager and try again. Alternatively you can set the configuration flag System.Globalization.Invariant to true if you want to run with no globalization support. Please see https://aka.ms/dotnet-missing-libicu for more information.
   at System.Environment.FailFast(System.String)
   at System.Globalization.GlobalizationMode+Settings..cctor()
   at System.Globalization.CultureData.CreateCultureWithInvariantData()
   at System.Globalization.CultureData.get_Invariant()
   at System.Globalization.CultureInfo..cctor()
   at System.Globalization.CultureInfo.get_CurrentUICulture()
   at System.TimeZoneInfo.GetUtcStandardDisplayName()
   at System.TimeZoneInfo.CreateUtcTimeZone()
   at System.TimeZoneInfo..cctor()
   at System.DateTime.get_Now()
   at Microsoft.DotNet.Cli.Program.Main(System.String[])
Aborted (core dumped)

Note that it's run with --privileged and still core dumped. Without --privileged, it fails a bit more silently (no exception stack trace).

$ docker run -it --rm  -v /home/pi/source/dotnet:/repo arm32v7/debian:bullseye-slim bash
root@c93e94cda139:/# cd repo/
root@c93e94cda139:/repo# ./dotnet --info
Aborted (core dumped)

@mthalman
Member

.NET requires a number of packages, including ICU, that aren't installed in the bullseye-slim image so it would certainly be expected for things not to work if they aren't installed. Here are the dependencies:

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        ca-certificates \
        \
        # .NET Core dependencies
        libc6 \
        libgcc1 \
        libgssapi-krb5-2 \
        libicu67 \
        libssl1.1 \
        libstdc++6 \
        zlib1g \
    && rm -rf /var/lib/apt/lists/*

@janvorli
Member

janvorli commented Nov 1, 2021

Both the core.zip and coredump.zip shared above show that a call to QueryPerformanceCounter has failed. Here is the top of the stack trace:

* thread #1, name = 'dotnet', stop reason = signal SIGABRT
  * frame #0: 0xb6c66c66 libc.so.6
    frame #1: 0xb6c75ea0 libc.so.6`raise + 124
    frame #2: 0xb6c667a2 libc.so.6`abort + 174
    frame #3: 0xb6b4dc5a libcoreclr.so`::PROCAbort(signal=<unavailable>) at process.cpp:3431:5
    frame #4: 0xb6b2ac2e libcoreclr.so`sigtrap_handler(int, siginfo_t*, void*) [inlined] invoke_previous_action(action=<unavailable>, code=<unavailable>, siginfo=<unavailable>, context=<unavailable>, signalRestarts=<unavailable>) at signal.cpp:406:13
    frame #5: 0xb6b2ac28 libcoreclr.so`sigtrap_handler(code=<unavailable>, siginfo=<unavailable>, context=<unavailable>) at signal.cpp:656
    frame #6: 0xb6c76cc0 libc.so.6
    frame #7: 0xb6b5235a libcoreclr.so`DBG_DebugBreak + 2
    frame #8: 0xb694f5b0 libcoreclr.so`GCToOSInterface::QueryPerformanceCounter() [inlined] GCToOSInterface::DebugBreak() at gcenv.os.cpp:299:5
    frame #9: 0xb694f5ac libcoreclr.so`GCToOSInterface::QueryPerformanceCounter() at gcenv.os.cpp:990
    frame #10: 0xb6a2329a libcoreclr.so`WKS::gc_heap::init_dynamic_data() [inlined] WKS::RawGetHighPrecisionTimeStamp() at gc.cpp:253:22
    frame #11: 0xb6a23296 libcoreclr.so`WKS::gc_heap::init_dynamic_data() at gc.cpp:37872
    frame #12: 0xb6a22632 libcoreclr.so`WKS::gc_heap::init_gc_heap(h_number=0) at gc.cpp:13298:10
    frame #13: 0xb6a40f9c libcoreclr.so`WKS::GCHeap::Initialize() [inlined] WKS::gc_heap::make_gc_heap() at gc.cpp:13032:9
    frame #14: 0xb6a40f96 libcoreclr.so`WKS::GCHeap::Initialize() [inlined] WKS::GCHeap::Init(this=<unavailable>, hn=<unavailable>) at gc.cpp:42468
    frame #15: 0xb6a40f96 libcoreclr.so`WKS::GCHeap::Initialize(this=<unavailable>) at gc.cpp:42912
    frame #16: 0xb6b1bbdc libcoreclr.so`EEStartupHelper() at ceemain.cpp:955:25
    frame #17: 0xb6b1b2e6 libcoreclr.so`EEStartup() [inlined] EEStartup(this=<unavailable>, p=<unavailable>)::$_0::operator()(void*) const at ceemain.cpp:1153:9
    frame #18: 0xb6b1b2aa libcoreclr.so`EEStartup() at ceemain.cpp:1155
    frame #19: 0xb6b1b242 libcoreclr.so`EnsureEEStarted() at ceemain.cpp:321:17
    frame #20: 0xb6871320 libcoreclr.so`CorHost2::Start(this=0x0122e538) at corhost.cpp:101:14

All that QueryPerformanceCounter does is this:

    BOOL retval = TRUE;

    struct timespec ts;
    int result = clock_gettime(CLOCK_MONOTONIC, &ts);

    if (result != 0)
    {
        ASSERT("clock_gettime(CLOCK_MONOTONIC) failed: %d\n", result);
        retval = FALSE;
    }
    else
    {
        lpPerformanceCount->QuadPart =
                ((LONGLONG)(ts.tv_sec) * (LONGLONG)(tccSecondsToNanoSeconds)) + (LONGLONG)(ts.tv_nsec);
    }

So clock_gettime must have failed. If it failed both with and without --privileged, I would have thought that CLOCK_MONOTONIC is somehow not supported in Docker. However, since it fails only in the non-privileged case, I am not sure what's causing it to fail. The clock_gettime documentation mentions errors only for an invalid pointer to the time structure or an invalid clock ID. I'll try to create a simple C repro to see what error code it gets.

@mthalman
Copy link
Member

mthalman commented Nov 1, 2021

There have definitely been some changes in clock_gettime between Debian 10, where it works, and Debian 11, where it doesn't, specifically around 32-/64-bit.

@janvorli
Member

janvorli commented Nov 1, 2021

Hmm, the strange thing is that on my device, which runs an aarch64 distro and uses Docker with the 6.0.100-bullseye-slim-arm32v7 image, both dotnet and a simple test C app that calls clock_gettime work as expected with and without --privileged. So maybe the kernel is related too.

@mthalman
Member

mthalman commented Nov 1, 2021

Yes, exactly. This is why it hasn't been caught by any of our builds, because all of our Arm build machines are aarch64. It only repros on an Arm32 machine.

@janvorli
Member

janvorli commented Nov 1, 2021

It seems the issue might be the same one as the one mentioned here: debuerreotype/docker-debian-artifacts#106
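
One way to probe that hypothesis on an affected host is to rerun the repro with only the seccomp filter disabled instead of using full --privileged. This is a sketch, assuming the default seccomp profile is what blocks the call, as the linked issue suggests. `--security-opt seccomp=unconfined` is a standard `docker run` option; the `IMAGE` variable and the `probe` helper are hypothetical names used only for this illustration.

```shell
#!/bin/sh
# Image tag taken from the original report; override IMAGE to test another one.
IMAGE="${IMAGE:-mcr.microsoft.com/dotnet/nightly/sdk:6.0.100-bullseye-slim-arm32v7}"

# probe [docker run options...]: run `dotnet --version` in a throwaway
# container and report whether it succeeded. Any arguments are passed
# through to `docker run` (hypothetical helper, not part of Docker).
probe() {
    if docker run --rm "$@" "$IMAGE" dotnet --version >/dev/null 2>&1; then
        echo "ok"
    else
        echo "failed"
    fi
}

# Usage on the affected host (not run automatically):
#   probe                                      # default seccomp profile
#   probe --security-opt seccomp=unconfined    # only the seccomp filter disabled
#   probe --privileged                         # everything relaxed
```

If the default run reports failed while the seccomp=unconfined run reports ok, the seccomp filter is the likely cause. Disabling it relaxes container isolation, so this is a diagnostic rather than a fix.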

@janvorli
Member

janvorli commented Nov 1, 2021

Yes, it is the same issue. I was able to repro it on my RPi 4 with 32-bit Raspbian installed as the host OS for Docker.
I then updated libseccomp on the host to a version >= 2.4.2 (2.5.2 in my case on Raspbian) and the issue is gone.

To install the updated libseccomp, I had to add the testing repo to the package repositories in /etc/apt/sources.list:

deb http://raspbian.raspberrypi.org/raspbian/ testing main

I then followed some advice to set the priority of the testing repo to low, so that packages are still installed from the main repo by default (https://tech.borpin.co.uk/2019/12/17/install-a-package-from-the-testing-repository/).

Then I ran apt-get install libseccomp2/testing to install the newer version.
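
The steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the exact commands used: the file names under /etc/apt/sources.list.d and /etc/apt/preferences.d, the helper function names, and the Pin-Priority value of 50 are illustrative choices (the linked article uses its own values). A priority below 100 keeps routine upgrades from switching installed packages to testing, while an explicit request such as libseccomp2/testing still works.

```shell
#!/bin/sh
# Sketch only: the paths and the priority value are assumptions; adjust them
# for your host.
TESTING_LIST="${TESTING_LIST:-/etc/apt/sources.list.d/raspbian-testing.list}"
TESTING_PREFS="${TESTING_PREFS:-/etc/apt/preferences.d/raspbian-testing}"

# Write the testing repository entry (the same line quoted in the comment above).
write_testing_repo() {
    printf 'deb http://raspbian.raspberrypi.org/raspbian/ testing main\n' > "$1"
}

# Pin everything from testing to a low priority so that routine upgrades
# keep using the regular repositories.
write_testing_pin() {
    printf 'Package: *\nPin: release a=testing\nPin-Priority: 50\n' > "$1"
}

# Run as root on the Docker host (not invoked automatically).
install_newer_libseccomp() {
    write_testing_repo "$TESTING_LIST" &&
    write_testing_pin "$TESTING_PREFS" &&
    apt-get update &&
    apt-get install -y libseccomp2/testing
}
```

Afterwards, `apt-cache policy` can be used to confirm that the testing repository is listed with the low priority rather than the default.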

@janvorli
Member

janvorli commented Nov 1, 2021

Please note that this is not a .NET issue; other things in the bullseye image fail too (e.g. just apt-get update) without this change.

@pablofrommars
Author

pablofrommars commented Nov 1, 2021

I confirm that @janvorli's fix resolved my build issue. Thank you so much.

@pablofrommars
Author

Thanks again!

@paulpeeters

Thanks @janvorli, that solved the issue for me too. Date and time were incorrect in many recent images when run on arm32.

Before:

docker run --rm httpd date
Thu Jan  1 00:00:00 UTC 1970
docker run --rm httpd:buster date
Fri Nov 26 16:59:12 UTC 2021

And after updating libseccomp on host:

docker run --rm httpd date
Fri Nov 26 17:01:29 UTC 2021
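
The symptom above, a container clock stuck at the Unix epoch, gives a quick way to check whether a host is affected. A minimal sketch, assuming the image ships a `date` command that supports `+%s`; the helper names and the one-day threshold are hypothetical choices for this illustration:

```shell
#!/bin/sh
# clock_looks_broken EPOCH_SECONDS: succeed when the timestamp is within one
# day of 1970-01-01, which is what the failing containers above report.
clock_looks_broken() {
    [ "$1" -lt 86400 ]
}

# check_image IMAGE: print the container's clock reading and a verdict.
check_image() {
    now=$(docker run --rm "$1" date +%s) || { echo "$1: could not run date"; return 2; }
    case "$now" in
        ''|*[!0-9]*) echo "$1: unexpected output from date"; return 2 ;;
    esac
    if clock_looks_broken "$now"; then
        echo "$1: clock reads $now, host looks affected"
    else
        echo "$1: clock reads $now, looks fine"
    fi
}

# Usage on the Docker host (not run automatically):
#   check_image httpd
#   check_image httpd:buster
```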

@florianbader

florianbader commented Dec 1, 2021

@janvorli We have the same issue for our .NET 6 IoT edge modules. Updating hundreds of devices and installing libseccomp2/testing isn't really an option for us. Is there any other alternative?
We tried building .NET 6 on alpine 3.12 but that doesn't work. We're currently trying to build .NET 6 for Debian Buster and see if that helps (Debian Buster for .NET 5 worked great for us).

The devices are ARM32v7 and running Debian 9 and Debian 10 on the host with Docker 3.0.13+azure.

@florianbader

florianbader commented Dec 1, 2021

So it seems that manually building .NET 6 (wget runtime) based on Debian Buster Slim works for us. Is there any way to get an official image for that?

@janvorli
Member

janvorli commented Dec 1, 2021

@florianbader unfortunately I am not aware of other alternatives (other than running docker with --privileged).
As for Alpine, besides Debian Bullseye, we have Docker images for .NET 6 based on Alpine 3.14 and Ubuntu 20.04. Maybe one of them would work for you.

@mthalman
Member

mthalman commented Dec 1, 2021

> So it seems that manually building .NET 6 (wget runtime) based on Debian Buster Slim works for us. Is there any way to get an official image for that?

@florianbader - Our policy is to only provide images for the latest stable version of Debian when a major version of .NET is released. Depending on feedback from people being negatively impacted by this, it's possible that we could reconsider the set of official images that are provided. But I don't think we're there yet.

You're likely familiar with this already, but we do have documentation on how to install .NET in containers for scenarios where we don't provide official images.

@florianbader

I would love to see an official image because this affects all our edge devices. We cannot update from Debian 9 or 10 to the latest stable which means we cannot update the docker (moby) version.
This means we are either stuck with .NET 5 on Debian Buster or we have to build our own images which we did.

For anyone who finds this and wants to build their own .NET 6 image for Debian Buster, this is what we did, and it worked for us:

FROM arm32v7/debian:buster-slim

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    ca-certificates \
    \
    # .NET Core dependencies
    libc6 \
    libgcc1 \
    libgssapi-krb5-2 \
    libssl1.1 \
    libstdc++6 \
    zlib1g \
    wget \
    && rm -rf /var/lib/apt/lists/*

ENV \
    # Configure web servers to bind to port 80 when present
    ASPNETCORE_URLS=http://+:80 \
    # Enable detection of running in a container
    DOTNET_RUNNING_IN_CONTAINER=true \
    # Set the invariant mode since icu-libs isn't included (see https://github.com/dotnet/announcements/issues/20)
    DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=true \
    # ASP.NET Core version
    ASPNET_VERSION=6.0.0 \
    # Set the default console formatter to JSON
    Logging__Console__FormatterName=Json

ENV DOTNET_VERSION=6.0.0

RUN wget -O aspnetcore.tar.gz https://dotnetcli.azureedge.net/dotnet/aspnetcore/Runtime/$DOTNET_VERSION/aspnetcore-runtime-$DOTNET_VERSION-linux-arm.tar.gz \
    && aspnetcore_sha512='36be738bb40a0cadacd4531c3597a25fd45deb7c48090ffb61c79a5db7742a5b8e13051b06556e15e7e162e4a044795c0ca5e6da4db26b63b05c37b39e74e301' \
    && echo "$aspnetcore_sha512  aspnetcore.tar.gz" | sha512sum -c - \
    && mkdir -p /usr/share/dotnet \
    && tar -C /usr/share/dotnet -oxzf aspnetcore.tar.gz \
    && ln -s /usr/share/dotnet/dotnet /usr/bin/dotnet \
    && rm aspnetcore.tar.gz

If you only need the .NET runtime instead of ASP.NET Core use the following wget:

RUN wget -O dotnet.tar.gz https://dotnetcli.azureedge.net/dotnet/Runtime/$DOTNET_VERSION/dotnet-runtime-$DOTNET_VERSION-linux-arm.tar.gz \
    && dotnet_sha512='575037f2e164deaf3bcdd82f7b3f2b5a5784547c5bad4070375c00373722265401b88a81695b919f92ca176f21c1bdf1716f8fce16ab3d301ae666daa8cae750' \
    && echo "$dotnet_sha512  dotnet.tar.gz" | sha512sum -c - \
    && mkdir -p /usr/share/dotnet \
    && tar -C /usr/share/dotnet -oxzf dotnet.tar.gz \
    && ln -s /usr/share/dotnet/dotnet /usr/bin/dotnet \
    && rm dotnet.tar.gz

@mu88

mu88 commented Feb 21, 2022

As recommended in this issue, I've updated libseccomp. But now when executing sudo apt full-upgrade, I get the following output:

Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Error!
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libc6-dev : Breaks: libgcc-8-dev (< 8.4.0-2~) but 8.3.0-6+rpi1 is to be installed
E: Error, pkgProblemResolver::Resolve generated breaks, this may be caused by held packages.

@mthalman How can I solve this?

@janvorli
Member

@mu88 have you followed all the steps from the comment, including setting the priority of the testing repo to low?
Also, I wonder if uninstalling libseccomp before attempting the upgrade would help.

@mu88

mu88 commented Feb 21, 2022

@janvorli I've created the file /etc/apt/preferences with the following content:

Package: *
Pin: release a=stable
Pin-Priority: 700
 
Package: *
Pin: release o=Raspberry Pi Foundation,a=testing,n=buster,l=Raspberry Pi Foundation,c=main,b=armhf
Pin-Priority: 675
 
Package: *
Pin: release a=testing
Pin-Priority: 650
 
Package: *
Pin: release a=unstable
Pin-Priority: 600

Afterwards I called sudo apt update.

Now apt-cache policy gives me:

Package files:
 100 /var/lib/dpkg/status
     release a=now
 500 http://archive.raspberrypi.org/debian buster/main armhf Packages
     release o=Raspberry Pi Foundation,a=oldstable,n=buster,l=Raspberry Pi Foundation,c=main,b=armhf
     origin archive.raspberrypi.org
 500 https://download.docker.com/linux/raspbian buster/stable armhf Packages
     release o=Docker,a=buster,l=Docker CE,c=stable,b=armhf
     origin download.docker.com
 500 http://raspbian.raspberrypi.org/raspbian testing/main armhf Packages
     release o=Raspbian,a=testing,n=bookworm,l=Raspbian,c=main,b=armhf
     origin raspbian.raspberrypi.org
 500 http://raspbian.raspberrypi.org/raspbian buster/rpi armhf Packages
     release o=Raspbian,a=oldstable,n=buster,l=Raspbian,c=rpi,b=armhf
     origin raspbian.raspberrypi.org
 500 http://raspbian.raspberrypi.org/raspbian buster/non-free armhf Packages
     release o=Raspbian,a=oldstable,n=buster,l=Raspbian,c=non-free,b=armhf
     origin raspbian.raspberrypi.org
 500 http://raspbian.raspberrypi.org/raspbian buster/contrib armhf Packages
     release o=Raspbian,a=oldstable,n=buster,l=Raspbian,c=contrib,b=armhf
     origin raspbian.raspberrypi.org
 500 http://raspbian.raspberrypi.org/raspbian buster/main armhf Packages
     release o=Raspbian,a=oldstable,n=buster,l=Raspbian,c=main,b=armhf
     origin raspbian.raspberrypi.org

And sudo apt update says:

Hit:1 http://raspbian.raspberrypi.org/raspbian buster InRelease
Hit:2 http://raspbian.raspberrypi.org/raspbian testing InRelease
Hit:3 http://archive.raspberrypi.org/debian buster InRelease
Hit:4 https://download.docker.com/linux/raspbian buster InRelease
Reading package lists... Done
Building dependency tree
Reading state information... Done
1083 packages can be upgraded. Run 'apt list --upgradable' to see them.

1083 seems incredibly high... 🤔

@mu88

mu88 commented Feb 22, 2022

@janvorli or @mthalman , can you pls help?

@iioter

iioter commented Mar 18, 2022

@janvorli or @mthalman , can you pls help?
I can't build a .NET 6 Docker image on a Raspberry Pi 4 (arm32).
I use mcr.microsoft.com/dotnet/aspnet:6.0-bullseye-slim-arm32v7 AS base and
mcr.microsoft.com/dotnet/sdk:6.0-bullseye-slim-arm32v7 AS build.
When dotnet restore xx.csproj runs,
the console shows the error 'Aborted (core dumped) The command '/bin/sh -c dotnet restore "xx.csproj"' returned'.
Can anyone help me, please?

@zhaopeiym

zhaopeiym commented May 17, 2022

Aborted (core dumped)

1. Host libseccomp 2.4.2 or later

sudo vi /etc/apt/sources.list
deb http://raspbian.raspberrypi.org/raspbian/ testing main
sudo apt-get update
sudo apt-get install libseccomp2/testing

2. Docker version 19.03.9 or later

sudo apt install docker-ce=5:19.03.9~3-0~raspbian-buster
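
Both requirements listed above can be checked on the host before changing anything. A minimal sketch, assuming a Debian-based host with GNU `sort` (for its `-V` option); `version_ge` and `check_host` are hypothetical helper names, and on Debian `dpkg --compare-versions` is the more exact comparison for package versions:

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to B.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_host: compare the installed libseccomp2 and Docker server versions
# with the minimums quoted above (not run automatically).
check_host() {
    seccomp=$(dpkg-query -W -f='${Version}' libseccomp2)
    engine=$(docker version --format '{{.Server.Version}}')
    if version_ge "$seccomp" 2.4.2; then
        echo "libseccomp2 $seccomp: ok"
    else
        echo "libseccomp2 $seccomp: too old"
    fi
    if version_ge "$engine" 19.03.9; then
        echo "docker $engine: ok"
    else
        echo "docker $engine: too old"
    fi
}
```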
