Export an ONNX graph that performs ISTFT. Designed for TTS models.

mush42/istft-onnx

Create an ONNX model that performs ISTFT

If you want to stay within the ONNX stack and don't want to pull in external dependencies for ISTFT, this is for you.

Note that this is at least 4x slower than the torch ISTFT implementation, but the ISTFT overhead is negligible anyway.

Usage

Install requirements:

pip3 install -r requirements.txt

Run the script:

python3 istft_onnx.py

Make sure to correctly specify your ISTFT parameters.

You should also specify the maximum number of frames the exported model will operate on with the --max-frames parameter. By default it is set to 5200 (around 60 seconds at a 22.05 kHz sample rate).
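To pick a value for --max-frames, it can help to convert between frame counts and audio duration. The sketch below assumes a hop length of 256 samples, which is not stated in this README but is consistent with 5200 frames covering roughly 60 seconds at 22.05 kHz. The helper names are hypothetical and are not part of istft_onnx.py.

```python
import math

# Assumed values: the hop length is NOT given in the README. 256 is a guess
# that matches "5200 frames is about 60 seconds at 22.05 kHz".
HOP_LENGTH = 256
SAMPLE_RATE = 22050


def frames_to_seconds(num_frames, hop_length=HOP_LENGTH, sample_rate=SAMPLE_RATE):
    """Approximate audio duration covered by num_frames STFT frames."""
    return num_frames * hop_length / sample_rate


def seconds_to_frames(seconds, hop_length=HOP_LENGTH, sample_rate=SAMPLE_RATE):
    """Smallest frame count that covers the given duration."""
    return math.ceil(seconds * sample_rate / hop_length)


default_duration = frames_to_seconds(5200)
frames_for_30s = seconds_to_frames(30)
print(f"5200 frames cover about {default_duration:.1f} s")
print(f"30 s of audio needs about {frames_for_30s} frames")
```

If your vocoder uses a different hop length or sample rate, pass those values in instead of the assumed defaults.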

The model is designed around the output of the Vocos vocoder. Change the inputs to suit your needs; most likely, you will need to input mag and phase.
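For reference, this is how a magnitude and phase pair relates to the complex spectrogram that an ISTFT consumes. It is a minimal NumPy sketch, not code from istft_onnx.py: the function name, the shapes, and the n_fft value are assumptions chosen for illustration, and the phase is assumed to be in radians.

```python
import numpy as np


def polar_to_complex(mag, phase):
    """Combine magnitude and phase (radians) into a complex spectrogram."""
    mag = np.asarray(mag, dtype=np.float32)
    phase = np.asarray(phase, dtype=np.float32)
    # mag * (cos(phase) + j * sin(phase))
    return mag * np.exp(1j * phase)


# Hypothetical shapes: (n_fft // 2 + 1) frequency bins by num_frames frames.
n_fft = 1024
num_frames = 8
rng = np.random.default_rng(0)
mag = rng.random((n_fft // 2 + 1, num_frames), dtype=np.float32)
phase = rng.uniform(-np.pi, np.pi, size=mag.shape).astype(np.float32)

spec = polar_to_complex(mag, phase)
# Real and imaginary parts, in case the graph inputs are split that way.
real, imag = spec.real, spec.imag
```

Whether the exported graph expects mag and phase directly, or real and imaginary parts, depends on how you adapt the inputs in the script.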

License

See the source file for more details.
