vadim-v-lebedev/audio_style_tranfer

Audio Style Transfer

This is an implementation of the artistic style transfer algorithm for audio. It uses convolutions with random weights to represent audio features.
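To illustrate the idea of random-weight convolutional features, here is a minimal NumPy sketch. It is not the notebook's code: the function names, filter count, kernel width, weight scaling, and the use of a Gram matrix as the style statistic are all assumptions made for illustration. The input is assumed to be a spectrogram with frequency bins treated as channels.

```python
import numpy as np


def random_conv_features(spec, n_filters=64, width=11, seed=0):
    """Apply a 1-D convolution with random, untrained weights along time.

    spec: array of shape (n_channels, n_frames), e.g. a magnitude spectrogram
          whose frequency bins are treated as input channels (assumption).
    Returns an array of shape (n_filters, n_frames - width + 1).
    """
    rng = np.random.default_rng(seed)
    n_channels, _ = spec.shape
    # Scale the random weights so activations stay in a reasonable range
    # (hypothetical choice; any fixed random initialisation would do here).
    std = np.sqrt(2.0 / (n_channels * width))
    kernel = rng.standard_normal((n_filters, n_channels, width)) * std
    # Sliding windows over the time axis: (n_channels, n_out, width).
    windows = np.lib.stride_tricks.sliding_window_view(spec, width, axis=1)
    feats = np.einsum("fcw,ctw->ft", kernel, windows)
    return np.maximum(feats, 0.0)  # ReLU non-linearity


def gram_matrix(feats):
    """Time-averaged correlations between filters, a common style statistic."""
    return feats @ feats.T / feats.shape[1]


# Example with synthetic data standing in for a real spectrogram.
demo_spec = np.abs(np.random.default_rng(1).standard_normal((20, 50)))
demo_feats = random_conv_features(demo_spec, n_filters=8, width=5)
demo_gram = gram_matrix(demo_feats)  # shape (8, 8)
```

Because the weights are never trained, the same seed always yields the same feature extractor, so content, style, and output audio can be compared in one shared feature space.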

To listen to examples, go to the blog post. Also check out the functionally identical implementations in TensorFlow and Torch.

Dependencies

  • librosa (pip install librosa)
  • numpy and matplotlib

The easiest way to install Python is to use Anaconda.

How to run

  • Open audio_style_transfer.ipynb in Jupyter Notebook.
  • If you want to use your own audio files as inputs, first cut them to 10 s with:
ffmpeg -i yourfile.mp3 -ss 00:00:00 -t 10 yourfile_10s.mp3
  • Set CONTENT_FILENAME and STYLE_FILENAME in the third cell of the notebook to your input files.
  • Run all cells.
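If the audio is already loaded as a NumPy array, the same 10 s cut can be made in Python instead of with ffmpeg. This is a minimal sketch. The helper name trim_to_seconds and the 22050 Hz sample rate in the example are assumptions, not part of the notebook.

```python
import numpy as np


def trim_to_seconds(samples, sample_rate, seconds=10.0):
    """Keep only the first `seconds` of a mono waveform.

    samples: 1-D array of audio samples.
    sample_rate: samples per second of the waveform.
    Shorter inputs are returned unchanged.
    """
    n_keep = int(round(seconds * sample_rate))
    return samples[:n_keep]


# Example: a 12 s synthetic tone at an assumed rate of 22050 Hz.
sr = 22050
t = np.arange(12 * sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
clip = trim_to_seconds(tone, sr)  # first 10 s only
```

With librosa, passing duration=10.0 to librosa.load should have a similar effect at load time.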

The most frequent problem is that either content or style dominates the output. To fix this, adjust the ALPHA parameter. A larger ALPHA means more content in the output, and ALPHA=0 means no content, which reduces stylization to texture generation. The example output outputs/imperial_usa.wav, which mixes the content of the Imperial March from Star Wars with the style of the U.S. national anthem, was obtained with the default value ALPHA=1e-2.
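The role of ALPHA can be sketched as a weight on the content term of the objective. The exact loss used in the notebook is not reproduced here: the squared-error content term, the Gram-matrix style term, and the function names below are assumptions chosen to match the behaviour described above (larger ALPHA gives more content, ALPHA=0 gives pure texture generation).

```python
import numpy as np

ALPHA = 1e-2  # default value quoted in this README


def gram(feats):
    """Filter-by-filter correlations averaged over time."""
    return feats @ feats.T / feats.shape[1]


def total_loss(out_feats, content_feats, style_feats, alpha=ALPHA):
    """Weighted sum of a content term and a style term (illustrative form).

    All inputs are feature maps of shape (n_filters, n_frames).
    """
    content_loss = np.sum((out_feats - content_feats) ** 2)
    style_loss = np.sum((gram(out_feats) - gram(style_feats)) ** 2)
    # alpha = 0 drops the content term, leaving only style matching.
    return alpha * content_loss + style_loss


# Example with random feature maps standing in for real ones.
rng = np.random.default_rng(0)
content = rng.standard_normal((4, 30))
style = rng.standard_normal((4, 30))
loss_default = total_loss(content, content, style)
loss_texture_only = total_loss(content, content, style, alpha=0.0)
```

In this form, raising alpha makes deviations from the content features more expensive relative to deviations from the style statistics, which is why it shifts the output toward the content recording.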
