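Installation following the docs completes without any errors, but the speech recognition command fails; reinstalling several times gives the same result #3692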

Open
LjPro opened this issue Mar 3, 2024 · 6 comments
LjPro commented Mar 3, 2024

General Question

pip list:
`Package Version


absl-py 2.1.0
aiohttp 3.9.3
aiosignal 1.3.1
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.3.0
astor 0.8.1
asttokens 2.4.1
async-timeout 4.0.3
attrs 23.2.0
audioread 3.0.1
Babel 2.14.0
bce-python-sdk 0.9.4
blinker 1.7.0
bokeh 3.3.4
boltons 23.1.1
Bottleneck 1.3.8
braceexpand 0.1.7
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.8.2
contourpy 1.2.0
cycler 0.12.1
Cython 3.0.8
datasets 2.18.0
decorator 5.1.1
dill 0.3.4
Distance 0.1.3
editdistance 0.8.1
einops 0.7.0
exceptiongroup 1.2.0
executing 2.0.1
fastapi 0.110.0
filelock 3.13.1
Flask 3.0.2
flask-babel 4.0.0
flatbuffers 23.5.26
fonttools 4.49.0
frozenlist 1.4.1
fsspec 2024.2.0
ftfy 6.1.3
future 1.0.0
g2p-en 2.1.0
g2pM 0.1.2.5
h11 0.14.0
h5py 3.10.0
httpcore 1.0.4
httpx 0.27.0
huggingface-hub 0.21.3
humanfriendly 10.0
HyperPyYAML 1.2.2
idna 3.6
inflect 7.0.0
intervaltree 3.1.0
ipython 8.22.1
itsdangerous 2.1.2
jedi 0.19.1
jieba 0.42.1
Jinja2 3.1.3
joblib 1.3.2
jsonlines 4.0.0
kaldiio 2.18.0
kiwisolver 1.4.5
librosa 0.8.1
llvmlite 0.42.0
loguru 0.7.2
lxml 5.1.0
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
matplotlib-inline 0.1.6
mdurl 0.1.2
mido 1.3.2
mock 5.1.0
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.12.2
nara-wpe 0.0.9
nltk 3.8.1
note-seq 0.0.3
numba 0.59.0
numpy 1.23.5
omegaconf 2.3.0
onnx 1.15.0
onnxruntime 1.17.1
OpenCC 1.1.7
opencc-python-reimplemented 0.1.7
opencv-python 4.6.0.66
opt-einsum 3.3.0
packaging 23.2
paddle2onnx 1.0.6
paddleaudio 1.1.0
paddlefsl 1.1.0
paddlenlp 2.6.1
paddlepaddle-gpu 2.6.0
paddlesde 0.2.5
paddleslim 2.6.0
paddlespeech 0.0.0
paddlespeech-feat 0.1.0
pandas 2.2.1
parameterized 0.9.0
parso 0.8.3
pathos 0.2.8
pattern-singleton 1.2.0
pillow 10.2.0
pip 23.3.1
platformdirs 4.2.0
pooch 1.8.1
portalocker 2.8.2
pox 0.3.4
ppdiffusers 0.19.4
ppft 1.7.6.8
praatio 5.1.1
pretty-midi 0.2.10
prettytable 3.10.0
prompt-toolkit 3.0.43
protobuf 3.20.2
psutil 5.9.8
pure-eval 0.2.2
pyarrow 15.0.0
pyarrow-hotfix 0.6
pybind11 2.11.1
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.3
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
pygtrie 2.5.0
pyparsing 3.1.1
pypinyin 0.44.0
pypinyin-dict 0.7.0
pyreadline3 3.4.1
pytest-runner 6.0.1
python-dateutil 2.9.0.post0
pytz 2024.1
pywin32 306
pyworld 0.3.4
PyYAML 6.0.1
pyzmq 25.1.2
rarfile 4.1
regex 2023.12.25
requests 2.31.0
requests-mock 1.11.0
resampy 0.4.2
rich 13.7.1
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
sacrebleu 2.4.0
safetensors 0.4.2
scikit-learn 1.4.1.post1
scipy 1.12.0
sentencepiece 0.2.0
seqeval 1.2.2
setuptools 68.2.2
six 1.16.0
sniffio 1.3.1
sortedcontainers 2.4.0
soundfile 0.12.1
stack-data 0.6.3
starlette 0.36.3
swig 4.2.1
sympy 1.12
tabulate 0.9.0
TextGrid 1.6.1
threadpoolctl 3.3.0
timer 0.2.2
ToJyutping 0.2.1
tornado 6.4
tqdm 4.66.2
traitlets 5.14.1
trampoline 0.1.2
typeguard 2.13.3
typer 0.9.0
typing_extensions 4.10.0
tzdata 2024.1
urllib3 1.26.18
uvicorn 0.27.1
visualdl 2.5.3
wcwidth 0.2.13
webrtcvad 2.0.10
websockets 12.0
Werkzeug 3.0.1
wheel 0.41.2
win32-setctime 1.1.0
xxhash 3.4.1
xyzservices 2023.10.1
yacs 0.1.8
yarl 1.9.4
zhon 2.0.2`

Run in PowerShell: paddlespeech asr --lang zh --input zh.wav
`(paddle_test) PS E:\AI_WorkSpace> paddlespeech asr --lang zh --input zh.wav
C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddleaudio_extension.py:141: UserWarning: paddleaudio C++ extension is not available.
warnings.warn("paddleaudio C++ extension is not available.")
C:\Users\an.conda\envs\paddle_test\lib\site-packages_distutils_hack_init_.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2024-03-03 10:39:49.952 | INFO | paddlespeech.s2t.modules.ctc::45 - paddlespeech_ctcdecoders not installed!
W0303 10:39:49.955922 31980 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.3, Runtime API Version: 11.8
W0303 10:39:49.970871 31980 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
2024-03-03 10:39:50.355 | INFO | paddlespeech.s2t.modules.embedding:init:153 - max len: 5000
[2024-03-03 10:39:52,263] [ ERROR] - (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)
Traceback (most recent call last):
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\cli\asr\infer.py", line 314, in infer
result_transcripts = self.model.decode(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\base\dygraph\base.py", line 352, in _decorate_function
return func(*args, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 818, in decode
hyp = self.attention_rescoring(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 543, in attention_rescoring
hyps, encoder_out = self._ctc_prefix_beam_search(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 424, in _ctc_prefix_beam_search
encoder_out, encoder_mask = self._forward_encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 229, in _forward_encoder
encoder_out, encoder_mask = self.encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\nn\layer\layers.py", line 1429, in call
return self.forward(*inputs, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\encoder.py", line 184, in forward
chunk_masks = add_optional_chunk_mask(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\mask.py", line 202, in add_optional_chunk_mask
chunk_masks = masks.logical_and(chunk_masks) # (B, L, L)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\tensor\logic.py", line 143, in logical_and
return _C_ops.logical_and(x, y)
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)

KeyError: 'result'`
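
For anyone reading the error above: the compatibility rule quoted in the `[Hint: ...]` line can be checked by hand without PaddlePaddle. The following is a minimal sketch, assuming the shorter shape is left-padded with 1s before the per-axis comparison (an assumption that is consistent with the reported `i:3`). The function name `first_broadcast_mismatch` is hypothetical and is not part of Paddle or PaddleSpeech.

```python
def first_broadcast_mismatch(x_shape, y_shape):
    """Return (axis, x_dim, y_dim) for the first incompatible axis, or None.

    Applies the rule quoted in the hint: two dims are compatible when
    x == y or x <= 1 or y <= 1.
    """
    rank = max(len(x_shape), len(y_shape))
    # Assumption: the shorter shape is left-padded with 1s to equal rank.
    x = [1] * (rank - len(x_shape)) + list(x_shape)
    y = [1] * (rank - len(y_shape)) + list(y_shape)
    for axis, (x_dim, y_dim) in enumerate(zip(x, y)):
        if not (x_dim == y_dim or x_dim <= 1 or y_dim <= 1):
            return axis, x_dim, y_dim
    return None


# Shapes copied from the error message above.
print(first_broadcast_mismatch([1, 1, 0, 498], [1, 123, 123]))
```

With the shapes from the log, the only incompatible pair is 498 versus 123 on the last axis. The zero-length axis in X passes the quoted rule because 0 <= 1. The check only locates the mismatch; it does not explain why the two masks were built with these shapes.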

@LjPro LjPro added the Question label Mar 3, 2024
@18721688783

Same here. I tried various versions; installation succeeds, but I end up with the same error: Broadcast dimension mismatch.

@Ray961123

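Hello, and thank you for your interest in the PaddleSpeech open-source project. We apologize for the poor development experience. Maintenance manpower for the project is currently limited; we suggest referring to #3246.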

@ljh-coder

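You can refer to #3697 to see whether it helps. At the bottom of that issue I posted a blog link with a detailed installation walkthrough and fixes for some of the errors. Your problem is mainly a version issue.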

@ljh-coder

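> Same here. I tried various versions; installation succeeds, but I end up with the same error: Broadcast dimension mismatch.

You can refer to #3697 to see whether it helps. At the bottom of that issue I posted a blog link with a detailed installation walkthrough and fixes for some of the errors.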

@18721688783

18721688783 commented Mar 22, 2024 via email

@hbjhyhb

hbjhyhb commented Jun 25, 2024

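This project's package version management is a real mess. requirement.txt could have solved the package compatibility problem in one go, yet it does not specify package versions, as if to deliberately make things hard for everyone. Unbelievable.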
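
On the pinning point: once someone has an environment that does work, a `pip list` dump like the one in the first post can be turned into pinned `name==version` lines. This is a minimal sketch, not something the project provides. The helper name `to_pinned` is hypothetical, and the sample lines are copied from the first post purely to show the format (that environment is the one that fails, so it is not a recommended set of versions).

```python
def to_pinned(pip_list_lines):
    """Convert `pip list` style lines ("name version") to "name==version"."""
    pinned = []
    for line in pip_list_lines:
        # Drop surrounding whitespace and stray backticks from pasted output.
        parts = line.strip().strip("`").split()
        if len(parts) != 2:
            continue  # blank or malformed line
        name, version = parts
        # Skip the "Package Version" header and the dashed separator row.
        if name == "Package" or set(name) == {"-"}:
            continue
        pinned.append(f"{name}=={version}")
    return pinned


sample = [
    "`Package Version",
    "",
    "numpy 1.23.5",
    "paddlepaddle-gpu 2.6.0",
    "zhon 2.0.2`",
]
for requirement in to_pinned(sample):
    print(requirement)
```

In practice `pip freeze` produces the same kind of pinned list directly from a live environment; the sketch is only for the case where all that is available is a pasted `pip list`.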
