Windows build issues #7

Open
toothache opened this issue Dec 7, 2020 · 15 comments

@toothache
Contributor

toothache commented Dec 7, 2020

Here are some issues observed when building roaster packages on Windows:

  • toolchain.ps1:
    • Python39 is installed by default, but the script still checks for Python38;
    • VS2019 (vc142) is assumed, but the generated nuget package still contains the v141 keyword;
    • Scripts under the win/pkgs/env folder are ignored by mistake;
    • Scripts to install VS build tools are missing;
  • winsdk:
    • Several Win10 SDK versions are installed, but it seems that only 10.0.16299 is initialized;
  • cmake:
    • It depends on curl being built and installed;
  • zlib and openssl:
    • Do you need to export some CMake environment variables so that CMake can find them?
  • freetype and harfbuzz:
    • They have a circular dependency, but freetype doesn't support building w/o harfbuzz;
  • boost:
    • Cloning with -j50 doesn't reliably clone the repo and all the submodules;
    • The build variant is not configured to release;
  • snappy:
    • latest_ver is overridden to latest, which causes a build failure.
@xkszltl
Owner

xkszltl commented Dec 7, 2020

VS2019 (vc142) is assumed, but the generated nuget package still contains the v141 keyword

Correct and by design.
It is for backward compatibility, to allow a seamless upgrade downstream.
We only maintain one version, most likely the latest, due to resource constraints.
Ideally there should be an auto-generated meta nuget for v141->v142, with everything renamed to 142.
Would you mind helping to add that?

Scripts to install VS build tools are missing

Because I don't know a good way to solve all of the below together:

  • Install when the user doesn't have one.
  • Complete the user's installation if it is insufficient (missing components).
  • Avoid multiple installations, even in the case where we installed first and the user wants to install a normal VS later.

Suggestions wanted.

Several Win10 SDK versions are installed, but it seems that only 10.0.16299 is initialized;

What do you mean by initialized?
You can only build against one winsdk version at a time.
I remember this was added due to some downstream OS requirement, probably WinServer Core or WinServer DC version and CDPx.
Maybe this can be updated to a later version; I haven't had a chance to investigate.

See the version mapping here in case you need it: https://developer.microsoft.com/en-us/windows/downloads/sdk-archive/

It depends on curl being built and installed

It should be installed.
If not, try ./setup.ps1 curl.

Do you need to export some CMake environment variables so that CMake can find them?

Export from where, and find it where?

They have a circular dependency, but freetype doesn't support building w/o harfbuzz;

Yes, that's why the default setup.ps1 workflow builds it twice.

Cloning with -j50 doesn't reliably clone the repo and all the submodules;

This depends on GitHub behavior in different space/time, and needs to be carefully tuned.
Less concurrency also means I cannot fully utilize the bandwidth.
Many networks (both ISP and website) even rely heavily on multi-threading to get reasonable performance.

On the Linux side I implemented a back-off strategy as part of the mirror-aware submodule cloning.
Similar things can be done for Windows, but I'm not sure if we should duplicate this complex module or figure out a way to get bash.

for i in $(seq 10 -1 0); do
    # Exponential back-off.
    # GitHub may stall with too many large submodules.
    [ "$(git submodule init | wc -l)" -gt 10 ] && timeout -k 10s 5m git submodule update -j 100 && break
    [ "$(git submodule init | wc -l)" -gt 1 ] && timeout -k 10s 30m git submodule update -j 10 && break
    [ "$(git submodule init | wc -l)" -gt 0 ] && git submodule update -j 1 && break
    git submodule update && break
    sleep 1
    echo "Retrying... $i time(s) left."
done
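
The loop above lowers the job count between attempts but always sleeps for one second. A minimal sketch of the same idea with a delay that doubles after each failure might look like this. The helper names `retry_with_backoff` and `update_submodules` are hypothetical and not part of roaster; the job counts are borrowed from the loop above.

```shell
#!/usr/bin/env bash
# Sketch only: retry_with_backoff and update_submodules are hypothetical
# helpers, not part of roaster.

# Usage: retry_with_backoff <initial-delay-seconds> <command> [args...]
# Runs "<command> [args...] <jobs>" with 100, 10, then 1 job(s).
# Between attempts it sleeps, doubling the delay each time.
retry_with_backoff() {
    local delay="$1"
    shift
    local jobs
    for jobs in 100 10 1; do
        if "$@" "${jobs}"; then
            return 0
        fi
        if [ "${jobs}" -gt 1 ]; then
            echo "Failed with ${jobs} jobs; retrying in ${delay}s." >&2
            sleep "${delay}"
            delay=$((delay * 2))
        fi
    done
    return 1
}

# Fetch all submodules with the given number of parallel jobs.
update_submodules() {
    git submodule update --init --recursive -j "$1"
}

# Example: retry_with_backoff 1 update_submodules
```

Dropping to a single job on the last attempt trades bandwidth for reliability, which matches the concern about GitHub stalling on many large submodules.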

The build variant is not configured to release

Yes, dual-build is well supported there and there aren't many dependencies.

latest_ver is overridden to latest, which causes a build failure.

This has been fixed with /DNOMINMAX: e49f2d7
The latest release has broken linkage, so I have to use master.

@xkszltl
Owner

xkszltl commented Dec 7, 2020

FYI I did a Windows build over the weekend.
All libraries built successfully (on my system) except pytorch: pytorch/pytorch#48895

@toothache
Contributor Author

toothache commented Dec 8, 2020

I can successfully build libraries except for pytorch and onnx.

Will try to set up another clean machine to validate.

FYI I did a Windows build over the weekend.
All libraries built successfully (on my system) except pytorch: pytorch/pytorch#48895

@toothache
Contributor Author

Scripts to install VS build tools are missing

Because I don't know a good way to solve all of the below together:

  • Install when the user doesn't have one.
  • Complete the user's installation if it is insufficient (missing components).
  • Avoid multiple installations, even in the case where we installed first and the user wants to install a normal VS later.

Suggestions wanted.

vswhere provides a reliable way to detect multiple installations. We can provide a default script to install vs_buildtools. The script can be helpful in the following cases:

  • No build tools are detected;
  • Docker build or CI automation;
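
As a rough sketch of the detection step, the check could look like the following when run from Git Bash. The `-latest`, `-products`, `-requires`, and `-property` options and the component ID `Microsoft.VisualStudio.Component.VC.Tools.x86.x64` come from the vswhere and Visual Studio documentation; the function names, the `VSWHERE` override, and the Git Bash style path are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Assumption: vswhere sits in its usual installer location as seen from
# Git Bash. Set VSWHERE beforehand to point somewhere else.
: "${VSWHERE:=/c/Program Files (x86)/Microsoft Visual Studio/Installer/vswhere.exe}"

# Print the path of the newest VS or Build Tools install that has the
# MSVC x64/x86 toolset; prints nothing if there is none.
find_vs_with_vctools() {
    "${VSWHERE}" -latest -products '*' \
        -requires Microsoft.VisualStudio.Component.VC.Tools.x86.x64 \
        -property installationPath
}

# Succeed (exit status 0) when no suitable installation was found,
# including the case where vswhere itself is missing.
needs_build_tools() {
    local path
    path="$(find_vs_with_vctools 2>/dev/null || true)"
    [ -z "${path}" ]
}

# Example:
#   if needs_build_tools; then echo "Install vs_buildtools first."; fi
```

Querying by required component rather than by product also covers the "insufficient installation" case: an existing VS without the C++ toolset is reported as missing.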

Several Win10 SDK versions are installed, but it seems that only 10.0.16299 is initialized;

What do you mean by initialized?
You can only build against one winsdk version at a time.
I remember this was added due to some downstream OS requirement, probably WinServer Core or WinServer DC version and CDPx.
Maybe this can be updated to a later version; I haven't had a chance to investigate.
See the version mapping here in case you need it: https://developer.microsoft.com/en-us/windows/downloads/sdk-archive/

You explicitly initialize it here:

Invoke-Expression $($(cmd /c "`"${VS_HOME}/VC/Auxiliary/Build/vcvarsall.bat`" x64 10.0.16299.0 & set") -Match '^.+=' -Replace '^','${Env:' -Replace '=','}="' -Replace '$','"' | Out-String)
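
For readers unfamiliar with the one-liner: it runs vcvarsall.bat, dumps the resulting environment with `set`, and turns every `NAME=value` line into an assignment in the current session. A minimal sketch of the same parsing idea in bash follows. `import_env_dump` is a hypothetical name, and names that are not valid shell identifiers (such as `ProgramFiles(x86)`) are skipped, which the PowerShell version does not need to do.

```shell
#!/usr/bin/env bash
# Sketch only: import_env_dump is a hypothetical helper.
# Reads "NAME=value" lines on stdin (the format printed by cmd's "set")
# and exports each one into the current shell.
import_env_dump() {
    local line
    while IFS= read -r line; do
        line="${line%$'\r'}"    # drop a trailing CR from Windows output
        # Skip names bash cannot export, e.g. "ProgramFiles(x86)".
        if [[ "${line}" =~ ^[A-Za-z_][A-Za-z0-9_]*= ]]; then
            export "${line%%=*}=${line#*=}"
        fi
    done
}

# Example, where env_dump.txt holds the saved output of
# "vcvarsall.bat x64 10.0.16299.0 & set". Use a redirection, not a pipe,
# so that the exports happen in the current shell:
#   import_env_dump < env_dump.txt
```

The sketch only illustrates the parsing; a Windows-style PATH would still need converting before it is usable from bash.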

As I understand it, the Win10 SDK version doesn't matter much since you don't rely on any new Windows APIs. In that case, the Windows SDK should have a stable ABI across versions.

BTW, another suggestion is to use the VS command line to install the SDKs. They are installed incrementally, and you don't need to maintain the SDK links either.

https://docs.microsoft.com/en-us/visualstudio/install/workload-component-id-vs-build-tools?view=vs-2019

It depends on curl being built and installed

It should be installed.
If not, try ./setup.ps1 curl.

My mistake. I didn't notice that cmake.ps1 falls back to Invoke-WebRequest if curl is not available.

Do you need to export some CMake environment variables so that CMake can find them?

Export from where, and find it where?

In my initial trials, CMake complained that it couldn't locate zlib or openssl (I manually executed those scripts one by one). I had to manually set some {LIB}_ROOT environment variables according to the CMake documentation. Later, though, the issue went away.
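
For reference, hints of this kind can also be passed explicitly at configure time. In the sketch below, `ZLIB_ROOT` and `OPENSSL_ROOT_DIR` are the hint variables documented for CMake's FindZLIB and FindOpenSSL modules, passed as `-D` cache entries rather than environment variables. The function name and the install prefixes in the usage line are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: configure_with_hints and the example prefixes are hypothetical.
# Usage: configure_with_hints <zlib-prefix> <openssl-prefix> [cmake args...]
configure_with_hints() {
    local zlib_root="$1" openssl_root="$2"
    shift 2
    # Tell the find modules where to look before their default search.
    cmake -G Ninja \
        -DZLIB_ROOT="${zlib_root}" \
        -DOPENSSL_ROOT_DIR="${openssl_root}" \
        "$@"
}

# Example:
#   configure_with_hints "C:/Program Files/zlib" "C:/Program Files/OpenSSL" -S . -B build
```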

They have a circular dependency, but freetype doesn't support building w/o harfbuzz;

Yes, that's why the default setup.ps1 workflow builds it twice.

Yes, it worked as you described.

Cloning with -j50 doesn't reliably clone the repo and all the submodules;

This depends on GitHub behavior in different space/time, and needs to be carefully tuned.
Less concurrency also means I cannot fully utilize the bandwidth.
Many networks (both ISP and website) even rely heavily on multi-threading to get reasonable performance.

On the Linux side I implemented a back-off strategy as part of the mirror-aware submodule cloning.
Similar things can be done for Windows, but I'm not sure if we should duplicate this complex module or figure out a way to get bash.

for i in $(seq 10 -1 0); do
    # Exponential back-off.
    # GitHub may stall with too many large submodules.
    [ "$(git submodule init | wc -l)" -gt 10 ] && timeout -k 10s 5m git submodule update -j 100 && break
    [ "$(git submodule init | wc -l)" -gt 1 ] && timeout -k 10s 30m git submodule update -j 10 && break
    [ "$(git submodule init | wc -l)" -gt 0 ] && git submodule update -j 1 && break
    git submodule update && break
    sleep 1
    echo "Retrying... $i time(s) left."
done

The build variant is not configured to release

Yes, dual-build is well supported there and there aren't many dependencies.

But we don't enable dual build in the Windows build, and the default build variant for boost is debug.

latest_ver is overridden to latest, which causes a build failure.

This has been fixed with /DNOMINMAX: e49f2d7
The latest release has broken linkage, so I have to use master.

@xkszltl
Owner

xkszltl commented Dec 9, 2020

vswhere provides a reliable way to detect multiple installations. ...

Ideally I'd like to install the standard (maybe Community, since that's free?) version of VS, so that people can add things later.
The Build Tools version seems very CI-specific.

Is it possible to install standard VS via the command line?

You explicitly initialize it here: ...

WINVER and the other WinSDK macros for the minimum target platform default to the SDK version, so the resulting binaries won't work on an older OS.
I'm not aware of a global way to set it, but I also want to avoid setting it per project if possible.

But we don't enable dual build in the Windows build, and the default build variant for boost is debug.

I double-checked; it is dual-build by default.
We get both mt and mt-gd DLLs.

@toothache
Contributor Author

toothache commented Dec 10, 2020

Ideally I'd like to install the standard (maybe Community, since that's free?) version of VS, so that people can add things later.
The Build Tools version seems very CI-specific.

Is it possible to install standard VS via the command line?

Yes. The script would be very similar to the vs_buildtools.ps1 in #7. You just need to change the download URL and possibly update the workloads to install.
https://docs.microsoft.com/en-us/visualstudio/install/workload-and-component-ids?view=vs-2019
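
A minimal sketch of what such a command line could look like, written as a small helper that assembles the installer arguments. The `--quiet`, `--wait`, `--norestart`, `--add`, and `--includeRecommended` switches and the workload ID are taken from the Visual Studio installer documentation; the helper name and the bootstrapper file name in the usage lines are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: vs_install_args is a hypothetical helper.
# Prints one installer argument per line for the given workload IDs.
vs_install_args() {
    local workload
    printf '%s\n' --quiet --wait --norestart
    for workload in "$@"; do
        printf '%s\n' --add "${workload}"
    done
    printf '%s\n' --includeRecommended
}

# Example (assumes vs_community.exe was downloaded beforehand):
#   mapfile -t args < <(vs_install_args Microsoft.VisualStudio.Workload.NativeDesktop)
#   ./vs_community.exe "${args[@]}"
```

Only the bootstrapper and the workload list differ between the Build Tools and Community editions, so one helper can serve both.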

WINVER and the other WinSDK macros for the minimum target platform default to the SDK version, so the resulting binaries won't work on an older OS.
I'm not aware of a global way to set it, but I also want to avoid setting it per project if possible.

A per-project setting is also something I want to avoid. But my question is why we seem to install more WinSDKs than we really need.

We actually have two kinds of CMake generators, Ninja and VS2019.

  • For Ninja build, it takes the environment from vcvarsall.bat. From the link, winsdk_version specifies the version of Windows SDK to use. Thus, in our case, it's 10.0.16299.0.
  • For VS build, we don't specifically target a WinSDK version. So I guess the latest one will be used.

Therefore, at most two Windows SDK versions will be used across the Windows build, but in winsdk.ps1, four Windows SDKs are installed. That seems unnecessary.
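
If the VS generator does need an explicit SDK version, one option is CMake's `CMAKE_SYSTEM_VERSION` variable, which the Visual Studio generators use to choose the Windows 10 SDK. The sketch below is an assumption about how that could be wired up, not what roaster currently does; `configure_vs2019` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch only: configure_vs2019 is a hypothetical helper.
# Usage: configure_vs2019 <sdk-version> [cmake args...]
configure_vs2019() {
    local sdk_version="$1"
    shift
    # CMAKE_SYSTEM_VERSION selects the Windows SDK for VS generators.
    cmake -G "Visual Studio 16 2019" -A x64 \
        -DCMAKE_SYSTEM_VERSION="${sdk_version}" \
        "$@"
}

# Example: configure_vs2019 10.0.16299.0 -S . -B build
```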

But we don't enable dual build in the Windows build, and the default build variant for boost is debug.

I double-checked; it is dual-build by default.
We get both mt and mt-gd DLLs.

You're correct, it's dual build. I probably just glanced at the build log and thought it only built the debug variant.

@xkszltl
Owner

xkszltl commented Dec 11, 2020

For VS build, we don't specifically target a WinSDK version.

We do. It's the same as the Ninja build. The VS generator should take what's given by the env var. Let me know if it fails to do so.

but in winsdk.ps1, four Windows SDKs are installed

Those 4 are selected based on the popularity of Win10/Win2019 versions.

@toothache
Contributor Author

The Snappy build failed again. Snappy is now adding third-party submodules to their repo.

@toothache toothache reopened this Dec 17, 2020
@xkszltl
Owner

xkszltl commented Dec 17, 2020

Fixed with --recursive in c54898a
Please give it a try.

@toothache
Contributor Author

Not quite. I'll provide error logs later.

@toothache
Contributor Author

toothache commented Dec 17, 2020

2020-12-17T06:22:01.3378721Z ..\third_party\googletest\googlemock\src\gmock_main.cc(63): error C2491: 'main': definition of dllimport function not allowed
2020-12-17T06:22:01.3388588Z [13/46] Linking CXX executable snappy_test_tool.exe
2020-12-17T06:22:01.3394114Z FAILED: snappy_test_tool.exe 
2020-12-17T06:22:01.3398712Z cmd.exe /C "cd . && "C:\Program Files\CMake\bin\cmake.exe" -E vs_link_exe --intdir=CMakeFiles\snappy_test_tool.dir --rc="C:\PROGRA~2\Windows Kits\10\bin\10.0.16299.0\x64\rc.exe" --mt="C:\PROGRA~2\Windows Kits\10\bin\10.0.16299.0\x64\mt.exe" --manifests  -- "C:\PROGRA~2\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\bin\Hostx64\x64\link.exe" /nologo CMakeFiles\snappy_test_tool.dir\snappy_test_tool.cc.obj  /out:snappy_test_tool.exe /implib:snappy_test_tool.lib /pdb:pdb\snappy_test_tool.pdb /version:0.0 /DEBUG:FASTLINK /LTCG:incremental /INCREMENTAL:NO /subsystem:console  snappy_test_support.lib  snappy.lib  "C:\Program Files\gflags\lib\gflags.lib"  shlwapi.lib  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
2020-12-17T06:22:01.3403393Z LINK: command "C:\PROGRA~2\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\bin\Hostx64\x64\link.exe /nologo CMakeFiles\snappy_test_tool.dir\snappy_test_tool.cc.obj /out:snappy_test_tool.exe /implib:snappy_test_tool.lib /pdb:pdb\snappy_test_tool.pdb /version:0.0 /DEBUG:FASTLINK /LTCG:incremental /INCREMENTAL:NO /subsystem:console snappy_test_support.lib snappy.lib C:\Program Files\gflags\lib\gflags.lib shlwapi.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:snappy_test_tool.exe.manifest" failed (exit code 1120) with the following output:
2020-12-17T06:22:01.3405412Z    Creating library snappy_test_tool.lib and object snappy_test_tool.exp
2020-12-17T06:22:01.3406820Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "class file::OptionsStub const & __cdecl file::Defaults(void)" (?Defaults@file@@YAAEBVOptionsStub@1@XZ)
2020-12-17T06:22:01.3410428Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "public: __cdecl file::StatusStub::~StatusStub(void)" (??1StatusStub@file@@QEAA@XZ)
2020-12-17T06:22:01.3411848Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "public: bool __cdecl file::StatusStub::ok(void)" (?ok@StatusStub@file@@QEAA_NXZ)
2020-12-17T06:22:01.3417757Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "class file::StatusStub __cdecl file::GetContents(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > *,class file::OptionsStub const &)" (?GetContents@file@@YA?AVStatusStub@1@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PEAV34@AEBVOptionsStub@1@@Z)
2020-12-17T06:22:01.3421431Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "class file::StatusStub __cdecl file::SetContents(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class file::OptionsStub const &)" (?SetContents@file@@YA?AVStatusStub@1@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@0AEBVOptionsStub@1@@Z)
2020-12-17T06:22:01.3424734Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl snappy::StrFormat(char const *,...)" (?StrFormat@snappy@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@PEBDZZ)
2020-12-17T06:22:01.3426301Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "public: __cdecl snappy::LogMessage::~LogMessage(void)" (??1LogMessage@snappy@@QEAA@XZ)
2020-12-17T06:22:01.3428086Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "public: class snappy::LogMessage & __cdecl snappy::LogMessage::operator<<(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)" (??6LogMessage@snappy@@QEAAAEAV01@AEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@Z)
2020-12-17T06:22:01.3429751Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "public: class snappy::LogMessage & __cdecl snappy::LogMessage::operator<<(int)" (??6LogMessage@snappy@@QEAAAEAV01@H@Z)
2020-12-17T06:22:01.3432074Z snappy_test_tool.cc.obj : error LNK2001: unresolved external symbol "public: __cdecl snappy::LogMessageCrash::~LogMessageCrash(void)" (??1LogMessageCrash@snappy@@QEAA@XZ)
2020-12-17T06:22:01.3434104Z snappy_test_tool.exe : fatal error LNK1120: 10 unresolved externals

@xkszltl
Owner

xkszltl commented Dec 17, 2020

Confirmed. Seems to be a bug. The static lib works, but we need dynamic libs.
Maybe due to a missing dllexport.

@toothache
Contributor Author

Thanks, this should fix the issue.

BTW, why don't we use a stable release branch to build snappy?

@toothache
Contributor Author

toothache commented Dec 17, 2020

And I can't find the issues tab in snappy repo... =。=

@xkszltl
Owner

xkszltl commented Dec 17, 2020

BTW, why don't we use a stable release branch to build snappy?

Because the unit tests of that one don't work: a gtest linkage issue.

And I can't find the issues tab in snappy repo... =。=

Same here.
Probably they closed it.
See if you can find their internal channel (or abuse PR....)
