Recognition quality surprisingly bad? #4

Open
navid-zamani opened this issue Apr 27, 2023 · 4 comments

navid-zamani commented Apr 27, 2023

I thought DeepSpeech was a good NN-based model.

I found no details regarding how well this works. I could not get it to recognize things correctly even one time.

The best recognition was also the funniest one: it turned “substance abuse is bad” into “substance the best”. 🤣

So this issue is a request to add a link to the README where users can read about DeepSpeech. (And possibly download other speech files, as it may be my accent. 😇)

dp0s commented Jan 12, 2024

Unfortunately, the speech recognition quality is bad for me too. It sometimes understands single words correctly, but never a complete sentence.
I don't have this problem with Google or Dicio; there the recognition works fine.

Owner

T-vK commented Jan 12, 2024

I agree it's really bad by any modern standard. But since it's developed by Mozilla, I would think that it is just a matter of bad setup/configuration on my part.

Google's speech recognition is proprietary and requires a remote backend. So it is not really comparable imo.

Dicio uses Vosk, which is comparable to DeepSpeech (open source, works offline). Vosk (at least the way Dicio integrated it) yields far better results than Termux-DeepSpeech. Maybe someone should develop Termux-Vosk or something like that.

dp0s commented Mar 25, 2024

Update: I found a way to get reliable open source offline voice recognition in Termux.

1. Install Sayboard from F-Droid.
2. Go to settings and select Sayboard as the default Voice Input Method.
3. Download the required Vosk models in the Sayboard settings.
4. Use the default termux-speech-to-text command.
5. Wait a couple of seconds and then speak slowly.

Step 2 is only possible thanks to the latest Sayboard update.

It is possible to configure the used language within Sayboard settings.
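
For anyone who wants to script step 4, here is a minimal sketch of a wrapper around termux-speech-to-text. The function name `transcribe` and the `STT_CMD` variable are made up for illustration and are not part of Termux. It also assumes that the last non-empty line the command prints is the final transcript, which may not hold for every termux-api version.

```sh
#!/bin/sh
# Minimal sketch: capture one utterance via the Termux:API speech command.
# Assumption: the recognizer prints one or more lines and the last
# non-empty line is the final transcript.

# transcribe: hypothetical helper name, not part of Termux.
# STT_CMD is an illustrative override so another command can be swapped in.
transcribe() {
    cmd="${STT_CMD:-termux-speech-to-text}"
    if ! command -v "$cmd" >/dev/null 2>&1; then
        echo "error: $cmd not found (is the termux-api package installed?)" >&2
        return 127
    fi
    # Keep only the last non-empty line of the recognizer output.
    "$cmd" | awk 'NF { last = $0 } END { if (last != "") print last }'
}
```

After steps 1 to 3, something like `text="$(transcribe)"` should store the recognized sentence in a variable. If the command is missing, the function returns 127 instead of hanging.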

Author

navid-zamani commented Mar 26, 2024

@dp0s: TermuxActivity just crashes here when I use termux-speech-to-text:

Unable to create service com.termux.api.apis.SpeechToTextAPI$SpeechToTextService: android.view.WindowManager$BadTokenException: Unable to add window -- token null is not valid; is your activity running?

After that, retrying the command does nothing until Termux is actually closed and restarted, no matter whether I say something or how long I wait, and I have to Ctrl-C it.

The app is enabled as a keyboard. But there seems to be no way to pick a default speech recognition app. Is it possible this is hidden when there is only one? I found no place to pick the default. So I can’t really even tell if Sayboard is actually used.

That being said, recognition works halfway acceptably in Sayboard itself. I’m not sure it is useful with such a tiny dictionary though. I would have to speak unnaturally, like a (or to a) small child. It also has big trouble with German compound words, and introduces invalid grammar by separating the words, which gives them a very different meaning. (Something common with functional illiterates, that makes one look really stupid, and a bit like a certain kind of radical nationalist too. So you probably understand why that might be a no go. :))
There are probably bigger models, but I don’t know how much RAM they will use and if they can even work on an average 2023 phone.

So I don’t think the technology is ready yet. And I’ve used software in the 90s, on a 66MHz 486, that used 64MB RAM and did a better job.
