Handling the error ValueError (InvalidArgument) Broadcast dimension mismatch when using PaddleSpeech #3697

Open
ljh-coder opened this issue Mar 8, 2024 · 2 comments

Comments

@ljh-coder

Running on Baidu AI Studio, with the following relevant packages installed:

paddle-bfloat               0.1.7
paddle2onnx                 1.1.0
paddleaudio                 1.1.0
paddlefsl                   1.1.0
paddlenlp                   2.5.2
paddlepaddle                2.4.2
paddlesde                   0.2.5
paddleslim                  2.6.0
paddlespeech                1.4.1
paddlespeech-ctcdecoders    0.2.1
paddlespeech-feat           0.1.0

ppdiffusers                 0.19.4
Python                      3.8.18
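
For anyone who wants to report their environment in the same way, here is a minimal sketch that collects installed versions using only the standard library. The helper name `installed_versions` is made up for illustration, and the package names passed to it are simply taken from the list above.

```python
from importlib import metadata


def installed_versions(names):
    """Return a dict mapping each distribution name to its installed version.

    Packages that are not installed map to None instead of raising.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Package names copied from the listing above; adjust as needed.
    packages = ["paddlepaddle", "paddlespeech", "paddleaudio", "paddlenlp"]
    for name, version in installed_versions(packages).items():
        print(f"{name:28}{version or 'not installed'}")
```

`importlib.metadata` is available from Python 3.8 onward, so it should work on the interpreter version listed above.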

Problem description:

An error occurs when using the Python API for speech recognition and video subtitle generation.

The following error is raised:

ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 1, 0, 498] and the shape of Y = [1, 123, 123]. Received [498] in X is not equal to [123] in Y at i:3.
  [Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at /paddle/paddle/phi/kernels/funcs/common_shape.h:84)

Traceback (most recent call last):
  File "test1.py", line 3, in <module>
    result = asr(audio_file="zh.wav")
  File "/home/aistudio/PaddleSpeech/paddlespeech/cli/utils.py", line 328, in _warpper
    return executor_func(self, *args, **kwargs)
  File "/home/aistudio/PaddleSpeech/paddlespeech/cli/asr/infer.py", line 512, in __call__
    res = self.postprocess()  # Retrieve result of asr.
  File "/home/aistudio/PaddleSpeech/paddlespeech/cli/asr/infer.py", line 335, in postprocess
    return self._outputs["result"]
KeyError: 'result'
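
To make the message easier to read, here is a small pure-Python sketch of the compatibility rule quoted in the Hint (two dimensions are compatible if they are equal or either is at most 1). This is not Paddle's implementation; the function name `first_broadcast_mismatch` is hypothetical, and it assumes the shapes are right-aligned and padded with leading 1s, which is consistent with the index reported in the message.

```python
def first_broadcast_mismatch(x_shape, y_shape):
    """Return (index, x_dim, y_dim) for the first incompatible axis, or None.

    Assumption: shapes are aligned on their trailing axes and the shorter
    one is padded with leading 1s before the per-axis comparison.
    """
    rank = max(len(x_shape), len(y_shape))
    x = [1] * (rank - len(x_shape)) + list(x_shape)
    y = [1] * (rank - len(y_shape)) + list(y_shape)
    for i, (a, b) in enumerate(zip(x, y)):
        # Same condition as in the Hint: equal, or either side <= 1.
        if not (a == b or a <= 1 or b <= 1):
            return i, a, b
    return None


if __name__ == "__main__":
    # Shapes copied from the error message above.
    print(first_broadcast_mismatch([1, 1, 0, 498], [1, 123, 123]))
    # -> (3, 498, 123)
```

Under this rule the zero-length axis in X passes the check (0 <= 1), so the first failure is at index 3, where 498 meets 123.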

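When I followed the documentation for the speech recognition feature, however, the Python API example there ran without error.
Reference: https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/demos/speech_recognition/README_cn.md

Testing showed that the model also needs to be specified when calling asr_executor().

Example: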

from paddlespeech.cli.asr.infer import ASRExecutor
asr = ASRExecutor()
result = asr(audio_file="zh.wav", model='conformer_wenetspeech')
print(result)

Output:

2
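
If the executor is called from several places, one way to avoid forgetting the model argument is a thin wrapper that always passes it. The sketch below is only an illustration: `transcribe` is a hypothetical helper, it takes the executor as a parameter, and it uses only the `audio_file` and `model` keyword arguments shown in the example above.

```python
def transcribe(asr, audio_file, model="conformer_wenetspeech"):
    """Call an ASR executor with the model name always given explicitly.

    `asr` is any callable that accepts `audio_file` and `model` keyword
    arguments, for example an ASRExecutor instance as in the example above.
    """
    return asr(audio_file=audio_file, model=model)


# Intended usage (requires paddlespeech to be installed):
#   from paddlespeech.cli.asr.infer import ASRExecutor
#   print(transcribe(ASRExecutor(), "zh.wav"))
```

Passing the executor in, rather than creating it inside the helper, keeps the helper easy to try out without loading a model.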

I wrote two blog posts documenting the process; I hope they help others:

@Ray961123

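Hello, and thank you for your interest in the PaddleSpeech open-source project. We apologize for the poor development experience. Contributions to PaddleSpeech are welcome; pull requests can be submitted at: https://github.com/PaddlePaddle/PaddleSpeech/pulls
Contribution workflow:
1. Clone the develop code: git clone https://github.com/PaddlePaddle/PaddleSpeech.git
2. Enter the PaddleSpeech directory: cd PaddleSpeech
3. Modify or add the code you want to contribute
4. Normalize the code formatting: pre-commit run --files <modified or added files>
5. Stage the changes: git add <modified or added files>
6. Commit with a message: git commit -m "<purpose of the change>"
7. Push to the repository: git push
8. After pushing, GitHub will show a prompt to open a pull request; fill in the requested text and submit the pull request.
Once it is submitted, several CI jobs automatically check the code's correctness and formatting. If all of them pass, someone will review the code and merge it.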

@ljh-coder
Author

I'm not sure whether this is a version issue, but I may be able to add some supplementary notes to the documentation.
