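Installation following the docs completes without any errors, but the speech recognition command fails; reinstalling several times gives the same result #3692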

LjPro · 2024-03-03T02:54:44Z

General Question

pip list:
`Package Version

absl-py 2.1.0
aiohttp 3.9.3
aiosignal 1.3.1
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.3.0
astor 0.8.1
asttokens 2.4.1
async-timeout 4.0.3
attrs 23.2.0
audioread 3.0.1
Babel 2.14.0
bce-python-sdk 0.9.4
blinker 1.7.0
bokeh 3.3.4
boltons 23.1.1
Bottleneck 1.3.8
braceexpand 0.1.7
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
coloredlogs 15.0.1
colorlog 6.8.2
contourpy 1.2.0
cycler 0.12.1
Cython 3.0.8
datasets 2.18.0
decorator 5.1.1
dill 0.3.4
Distance 0.1.3
editdistance 0.8.1
einops 0.7.0
exceptiongroup 1.2.0
executing 2.0.1
fastapi 0.110.0
filelock 3.13.1
Flask 3.0.2
flask-babel 4.0.0
flatbuffers 23.5.26
fonttools 4.49.0
frozenlist 1.4.1
fsspec 2024.2.0
ftfy 6.1.3
future 1.0.0
g2p-en 2.1.0
g2pM 0.1.2.5
h11 0.14.0
h5py 3.10.0
httpcore 1.0.4
httpx 0.27.0
huggingface-hub 0.21.3
humanfriendly 10.0
HyperPyYAML 1.2.2
idna 3.6
inflect 7.0.0
intervaltree 3.1.0
ipython 8.22.1
itsdangerous 2.1.2
jedi 0.19.1
jieba 0.42.1
Jinja2 3.1.3
joblib 1.3.2
jsonlines 4.0.0
kaldiio 2.18.0
kiwisolver 1.4.5
librosa 0.8.1
llvmlite 0.42.0
loguru 0.7.2
lxml 5.1.0
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
matplotlib-inline 0.1.6
mdurl 0.1.2
mido 1.3.2
mock 5.1.0
mpmath 1.3.0
multidict 6.0.5
multiprocess 0.70.12.2
nara-wpe 0.0.9
nltk 3.8.1
note-seq 0.0.3
numba 0.59.0
numpy 1.23.5
omegaconf 2.3.0
onnx 1.15.0
onnxruntime 1.17.1
OpenCC 1.1.7
opencc-python-reimplemented 0.1.7
opencv-python 4.6.0.66
opt-einsum 3.3.0
packaging 23.2
paddle2onnx 1.0.6
paddleaudio 1.1.0
paddlefsl 1.1.0
paddlenlp 2.6.1
paddlepaddle-gpu 2.6.0
paddlesde 0.2.5
paddleslim 2.6.0
paddlespeech 0.0.0
paddlespeech-feat 0.1.0
pandas 2.2.1
parameterized 0.9.0
parso 0.8.3
pathos 0.2.8
pattern-singleton 1.2.0
pillow 10.2.0
pip 23.3.1
platformdirs 4.2.0
pooch 1.8.1
portalocker 2.8.2
pox 0.3.4
ppdiffusers 0.19.4
ppft 1.7.6.8
praatio 5.1.1
pretty-midi 0.2.10
prettytable 3.10.0
prompt-toolkit 3.0.43
protobuf 3.20.2
psutil 5.9.8
pure-eval 0.2.2
pyarrow 15.0.0
pyarrow-hotfix 0.6
pybind11 2.11.1
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.3
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
pygtrie 2.5.0
pyparsing 3.1.1
pypinyin 0.44.0
pypinyin-dict 0.7.0
pyreadline3 3.4.1
pytest-runner 6.0.1
python-dateutil 2.9.0.post0
pytz 2024.1
pywin32 306
pyworld 0.3.4
PyYAML 6.0.1
pyzmq 25.1.2
rarfile 4.1
regex 2023.12.25
requests 2.31.0
requests-mock 1.11.0
resampy 0.4.2
rich 13.7.1
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
sacrebleu 2.4.0
safetensors 0.4.2
scikit-learn 1.4.1.post1
scipy 1.12.0
sentencepiece 0.2.0
seqeval 1.2.2
setuptools 68.2.2
six 1.16.0
sniffio 1.3.1
sortedcontainers 2.4.0
soundfile 0.12.1
stack-data 0.6.3
starlette 0.36.3
swig 4.2.1
sympy 1.12
tabulate 0.9.0
TextGrid 1.6.1
threadpoolctl 3.3.0
timer 0.2.2
ToJyutping 0.2.1
tornado 6.4
tqdm 4.66.2
traitlets 5.14.1
trampoline 0.1.2
typeguard 2.13.3
typer 0.9.0
typing_extensions 4.10.0
tzdata 2024.1
urllib3 1.26.18
uvicorn 0.27.1
visualdl 2.5.3
wcwidth 0.2.13
webrtcvad 2.0.10
websockets 12.0
Werkzeug 3.0.1
wheel 0.41.2
win32-setctime 1.1.0
xxhash 3.4.1
xyzservices 2023.10.1
yacs 0.1.8
yarl 1.9.4
zhon 2.0.2`

Run in PowerShell: paddlespeech asr --lang zh --input zh.wav
`(paddle_test) PS E:\AI_WorkSpace> paddlespeech asr --lang zh --input zh.wav
C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddleaudio_extension.py:141: UserWarning: paddleaudio C++ extension is not available.
warnings.warn("paddleaudio C++ extension is not available.")
C:\Users\an.conda\envs\paddle_test\lib\site-packages_distutils_hack_init_.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2024-03-03 10:39:49.952 | INFO | paddlespeech.s2t.modules.ctc::45 - paddlespeech_ctcdecoders not installed!
W0303 10:39:49.955922 31980 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.3, Runtime API Version: 11.8
W0303 10:39:49.970871 31980 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
2024-03-03 10:39:50.355 | INFO | paddlespeech.s2t.modules.embedding:init:153 - max len: 5000
[2024-03-03 10:39:52,263] [ ERROR] - (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)
Traceback (most recent call last):
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\cli\asr\infer.py", line 314, in infer
result_transcripts = self.model.decode(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\base\dygraph\base.py", line 352, in _decorate_function
return func(*args, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 818, in decode
hyp = self.attention_rescoring(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 543, in attention_rescoring
hyps, encoder_out = self._ctc_prefix_beam_search(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 424, in _ctc_prefix_beam_search
encoder_out, encoder_mask = self._forward_encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\models\u2\u2.py", line 229, in _forward_encoder
encoder_out, encoder_mask = self.encoder(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\nn\layer\layers.py", line 1429, in call
return self.forward(*inputs, **kwargs)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\encoder.py", line 184, in forward
chunk_masks = add_optional_chunk_mask(
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddlespeech\s2t\modules\mask.py", line 202, in add_optional_chunk_mask
chunk_masks = masks.logical_and(chunk_masks) # (B, L, L)
File "C:\Users\an.conda\envs\paddle_test\lib\site-packages\paddle\tensor\logic.py", line 143, in logical_and
return _C_ops.logical_and(x, y)
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ..\paddle/phi/kernels/funcs/common_shape.h:86)

KeyError: 'result'`
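
For readers decoding the message above: the `[Hint: ...]` line in the log states the rule being enforced, namely that each pair of right-aligned dimensions must be equal or one of them must be `<= 1`. The sketch below re-implements only that rule in plain Python to show why the reported shapes fail at `i:3`. The function name `first_broadcast_mismatch` is hypothetical and is not part of Paddle; it is an illustration of the check, not Paddle's actual implementation.

```python
def first_broadcast_mismatch(x_shape, y_shape):
    """Return (axis, x_dim, y_dim) for the first incompatible axis, or None.

    Mirrors the condition quoted in the log hint:
    x_dims[i] == y_dims[i] or x_dims[i] <= 1 or y_dims[i] <= 1
    """
    rank = max(len(x_shape), len(y_shape))
    # Right-align the shapes by left-padding the shorter one with 1s.
    x_dims = [1] * (rank - len(x_shape)) + list(x_shape)
    y_dims = [1] * (rank - len(y_shape)) + list(y_shape)
    for axis, (x_dim, y_dim) in enumerate(zip(x_dims, y_dims)):
        if not (x_dim == y_dim or x_dim <= 1 or y_dim <= 1):
            return axis, x_dim, y_dim
    return None


# Shapes copied from the error message in the log above.
print(first_broadcast_mismatch([1, 1, 0, 498], [1, 123, 123]))  # (3, 498, 123)
```

Note that the zero-length axis in X passes this check because of the `<= 1` clause, so the failure is reported on the last axis (498 vs. 123) rather than on the empty one.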

18721688783 · 2024-03-03T16:35:24Z

Same here. I tried various versions; the installation succeeds, but I end up with this same error: Broadcast dimension mismatch.

Ray961123 · 2024-03-08T06:58:36Z

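Hello developer, thank you for your interest in the PaddleSpeech open-source project, and sorry for the poor development experience. Maintainer capacity for the project is currently limited; we suggest referring to #3246.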

ljh-coder · 2024-03-08T14:31:00Z

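You could take a look at #3697 and see whether it helps. At the bottom I posted a blog link with a detailed installation walkthrough and fixes for some of the errors. Your problem is mainly a version issue.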
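
Since the suggestion here is that versions are the cause, it may help to report the exact versions of the packages involved. The following is a minimal sketch using the standard-library `importlib.metadata`; the package names are taken from the pip list in the original post, and the helper name `installed_versions` is hypothetical. It only reads metadata and does not say which combination of versions works.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they appear in the pip list from the original post.
PACKAGES = ["paddlepaddle-gpu", "paddlespeech", "paddleaudio", "numpy"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    report = {}
    for name in names:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name}: {ver or 'not installed'}")
```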

ljh-coder · 2024-03-08T14:32:01Z

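> Same here. I tried various versions; the installation succeeds, but I end up with this same error: Broadcast dimension mismatch.

You could take a look at #3697 and see whether it helps. At the bottom I posted a blog link with a detailed installation walkthrough and fixes for some of the errors.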

18721688783 · 2024-03-22T00:33:25Z

Hello, the problem is described in the screenshot above. Thank you.

hbjhyhb · 2024-06-25T08:51:44Z

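This project's package version management is a complete mess. requirement.txt could have resolved the package compatibility problem in one go, yet it does not pin package versions, as if to deliberately make things hard for everyone. I give up.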

LjPro added the Question label Mar 3, 2024

Ray961123 mentioned this issue Mar 25, 2024

[S2T]Absurd basic bug #3722

Open
