[BUG] lmp command doesn't start lammps (pip install) 2.2.0b0 #2307

Closed
luukasnik opened this issue Feb 9, 2023 · 5 comments
Labels: bug, reproduced, upstream

Comments

@luukasnik

Bug summary

I installed DeePMD-kit with the following command:

pip install deepmd-kit[gpu,lmp]==2.2.0b0
on top of an existing installation of TensorFlow 2.11.

The dp command works, but when I run the lmp command, LAMMPS does not start. No relevant error message is shown.

input command

lmp -h

DeePMD-kit Version

2.2.0b0

TensorFlow Version

2.11

How did you download the software?

pip

Input Files, Running Commands, Error Log, etc.

$ lmp -h
2023-02-09 16:35:29.935085: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-02-09 16:35:30.146605: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2023-02-09 16:35:32.755988: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64/:/usr/mpi/gcc/openmpi-4.1.2rc2/lib64::/.singularity.d/libs
2023-02-09 16:35:32.756269: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64/:/usr/mpi/gcc/openmpi-4.1.2rc2/lib64::/.singularity.d/libs
2023-02-09 16:35:32.756291: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
WARNING:tensorflow:From /users/luukasni/.local/lib/python3.9/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS.

Steps to Reproduce

pip install deepmd-kit[gpu,lmp]==2.2.0b0

lmp -h

Further Information, Files, and Links

No response

@luukasnik luukasnik added the bug label Feb 9, 2023
@njzjz njzjz self-assigned this Feb 9, 2023
@njzjz
Member

njzjz commented Feb 9, 2023

Hi, I tried to reproduce the issue in a new Kaggle notebook, but everything looks fine there: https://www.kaggle.com/jinzhezeng/test-lmp-h

@luukasnik
Author

Hello,

Is there a way to run lmp in debug mode so I can trace where the problem lies? The TensorFlow installation is inside a Singularity container because of constraints on the HPC system.
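
A first diagnostic step that does not rely on any special debug mode could look like the sketch below. It is not a DeePMD-kit or LAMMPS feature; it only uses the Python standard library to show which `lmp` executable is found on `PATH` and what exit status it returns, which helps when the process ends without printing an error. The helper name `run_and_report` is hypothetical.

```python
import shutil
import subprocess


def run_and_report(cmd):
    """Run cmd, capture its output, and return a small report dict."""
    # shutil.which returns None when the executable cannot be found.
    resolved = shutil.which(cmd[0])
    if resolved is None:
        return {"resolved": None, "returncode": None, "stdout": "", "stderr": ""}
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return {
        "resolved": resolved,            # full path of the executable that ran
        "returncode": proc.returncode,   # negative on POSIX means killed by a signal
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

Calling `run_and_report(["lmp", "-h"])` and printing the `resolved` and `returncode` fields shows whether the pip-installed entry point is the one being picked up and whether it exited normally. On POSIX systems a negative return code `-N` means the child process was terminated by signal `N`, which would point toward a crash in a native library rather than a Python-level error.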

@njzjz
Member

njzjz commented Feb 10, 2023

Perhaps you could share the container, and I can take a look.

@njzjz
Member

njzjz commented Apr 7, 2023

I reproduced this issue on an HPC system. The cause still needs to be investigated.

@njzjz njzjz added the reproduced label and removed the failed to reproduce label Apr 7, 2023
@njzjz
Member

njzjz commented Apr 25, 2023

Fixed in njzjz/lammps-wheel#24 by bumping patchelf to 0.17.2.

The root cause might be NixOS/patchelf#446.
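
For anyone building the wheels themselves, a rough sketch of checking the local patchelf version against the one named above might look like this. It assumes `patchelf --version` prints a dotted version number somewhere in its output, and it treats 0.17.2 as a minimum purely for illustration, since the thread only states that the fix bumped patchelf to that version. The function names are hypothetical.

```python
import re
import shutil
import subprocess

# Version named in the fix; using it as a lower bound is an assumption.
MIN_PATCHELF = (0, 17, 2)


def parse_version(text):
    """Extract the first dotted numeric version in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def patchelf_is_recent_enough(minimum=MIN_PATCHELF):
    """Return True or False, or None if patchelf is missing or unparsable."""
    if shutil.which("patchelf") is None:
        return None
    out = subprocess.run(["patchelf", "--version"], capture_output=True, text=True)
    version = parse_version(out.stdout + out.stderr)
    return None if version is None else version >= minimum
```

This only applies to the build environment that produces the LAMMPS wheel. Users installing from pip get whatever patchelf version was used when the published wheel was built.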

@njzjz njzjz closed this as completed Apr 25, 2023
@njzjz njzjz added the upstream label Apr 25, 2023