Shivamkak19/Deepfake-Detector


VoiceProtect - Deepfake Audio Detector - Implemented on the Tortoise-TTS Library

VoiceProtect lets users gauge whether an audio file or a live audio stream was generated with AI. As deepfake audio scams become more prevalent, reliable deepfake audio detectors will become increasingly valuable. This app builds on the Tortoise-TTS library, specifically its AudioMiniEncoderWithClassifierHead class, together with a classification model published under Tortoise-TTS on HuggingFace (saved as classifier.pth in the root of this repository).

The original intent of this app is real-time deployment on iOS/Android call data, which is not accessible via public APIs. The pyaudio audio recording input acts as a prototype for live scam call detection on call data.


View Product · Report Bug · Request Feature

Table of Contents
  1. Getting Started
  2. Usage
  3. Roadmap
  4. Contributing
  5. License
  6. Contact
  7. Acknowledgments

Built With

  • tortoise-tts
  • pydub
  • torchaudio
  • pyaudio
  • librosa
  • pytorch
  • streamlit

(back to top)

Getting Started

The steps below set up VoiceProtect on your local machine. Be careful to install both the library requirements and the system requirements.

Prerequisites

To run this project, upgrade pip to the latest version and install the system requirements listed below.

  • ffmpeg (used by pydub): https://ffmpeg.org/download.html

  • PortAudio (portaudio19-dev): on macOS, install it as shown below; on Windows, it should be installed implicitly with pyaudio

  • Upgrade pip:

    pip install --upgrade pip
  • macOS only:

    brew install portaudio

Installation

  1. Clone the repo

    git clone https://github.com/Shivamkak19/Deepfake-Detector.git
  2. Switch to the tortoise_tts folder inside the cloned repository

    cd Deepfake-Detector/tortoise_tts
  3. Install dependencies

    pip install -r requirements.txt
  4. Deploy Streamlit app on local server

    streamlit run voiceProtect_app.py
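
If the app fails to start, a quick environment check can help separate missing Python packages from missing system tools. The script below is a minimal sketch and not part of the repository. The import names in REQUIRED_MODULES are assumptions derived from the "Built With" list (for example, tortoise-tts is assumed to import as `tortoise` and pytorch as `torch`).

```python
import importlib.util
import shutil

# Assumed import names for the packages in the "Built With" list.
REQUIRED_MODULES = ["tortoise", "pydub", "torchaudio", "pyaudio",
                    "librosa", "torch", "streamlit"]
# System tools that must be reachable on PATH (pydub relies on ffmpeg).
REQUIRED_TOOLS = ["ffmpeg"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names=REQUIRED_TOOLS):
    """Return the executables that are not on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("Missing Python packages:", missing_modules() or "none")
    print("Missing system tools:", missing_tools() or "none")
```

Run it from the same environment you use for `streamlit run`, so that it reports on the interpreter the app will actually use.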

Usage

VoiceProtect Screen Shot

Use the local VoiceProtect deployment to estimate the likelihood that an input audio file or live audio recording contains audio created with generative AI. To receive results, wait until the Streamlit app has finished processing (indicated in the product pictures). The accuracy of this identification depends on the pretrained Tortoise-TTS model and functions described in the main description above.
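
For readers who want to see roughly what the classification step involves, here is a minimal sketch rather than the app's actual code. It assumes the hyperparameters that Tortoise-TTS uses in its own classify_audio_clip helper, and assumes that class index 0 is the "generated" class. The `verdict` helper and its 0.5 threshold are hypothetical and only illustrate how a probability might be presented.

```python
from pathlib import Path


def load_classifier(weights_path="classifier.pth"):
    """Build the Tortoise-TTS classifier and load the checkpoint on CPU."""
    # Imported lazily so verdict() below works without torch or tortoise.
    import torch
    from tortoise.models.classifier import AudioMiniEncoderWithClassifierHead

    # Assumption: same hyperparameters as Tortoise-TTS's classify_audio_clip.
    model = AudioMiniEncoderWithClassifierHead(
        2, spec_dim=1, embedding_dim=512, depth=5, downsample_factor=4,
        resnet_blocks=2, attn_blocks=4, num_attn_heads=4, base_channels=32,
        dropout=0, kernel_size=5, distribute_zero_label=False,
    )
    model.load_state_dict(torch.load(Path(weights_path), map_location="cpu"))
    model.eval()
    return model


def ai_probability(model, clip):
    """Return the probability that a mono clip of shape (1, samples) is AI-generated."""
    import torch
    import torch.nn.functional as F

    with torch.no_grad():
        logits = model(clip.cpu().unsqueeze(0))  # add a batch dimension
    # Assumption: index 0 is the "generated" class.
    return F.softmax(logits, dim=-1)[0][0].item()


def verdict(probability, threshold=0.5):
    """Hypothetical helper: turn a probability into a short label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    label = "likely AI-generated" if probability >= threshold else "likely human"
    return f"{label} ({probability:.1%})"
```

A typical call would be `verdict(ai_probability(load_classifier(), clip))`, where `clip` is a waveform tensor loaded with torchaudio.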

Make sure to launch the file ./tortoise_tts/voiceProtect_app.py. The main app must be launched from within the tortoise_tts folder, because Tortoise-TTS must run in the main thread to resolve signal issues with the atlastk library (see issues.txt).
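
The main-thread requirement can be made explicit in code. Python only allows signal handlers to be installed from the main thread, which is one plausible source of the signal issues mentioned above. The guard below is a hypothetical sketch, not a function from this repository, that fails fast when Tortoise-TTS code is about to run on a worker thread.

```python
import threading


def require_main_thread(what="Tortoise-TTS"):
    """Raise RuntimeError if the caller is not running on the main thread."""
    if threading.current_thread() is not threading.main_thread():
        raise RuntimeError(f"{what} must be called from the main thread")
```

Calling `require_main_thread()` at the top of any function that loads or runs the classifier turns a confusing signal error into a clear message.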

Uploaded file audio classification: VoiceProtect Screen Shot

Live audio stream classification: VoiceProtect Screen Shot

Results processing: VoiceProtect Screen Shot

Note: the hosted Streamlit deployment of VoiceProtect currently has trouble detecting an input device for audio recording with pyaudio. Check back here for updates.
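
To diagnose the input-device problem locally, it can help to list the devices pyaudio sees before recording. The following is a minimal sketch under stated assumptions, not the app's recording code. The function names, the 16 kHz sample rate, and the 5-second duration are illustrative choices.

```python
import wave


def input_device_names(pa):
    """Return the names of devices that expose at least one input channel."""
    names = []
    for index in range(pa.get_device_count()):
        info = pa.get_device_info_by_index(index)
        if info.get("maxInputChannels", 0) > 0:
            names.append(info["name"])
    return names


def record_frames(seconds=5, rate=16000, chunk=1024):
    """Record mono 16-bit audio from the default input device."""
    import pyaudio  # imported lazily; requires PortAudio on the system

    pa = pyaudio.PyAudio()
    try:
        if not input_device_names(pa):
            raise RuntimeError("no audio input device detected")
        stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                         input=True, frames_per_buffer=chunk)
        try:
            return [stream.read(chunk)
                    for _ in range(int(rate / chunk * seconds))]
        finally:
            stream.stop_stream()
            stream.close()
    finally:
        pa.terminate()


def save_wav(frames, path, rate=16000, sample_width=2, channels=1):
    """Write raw 16-bit PCM frames to a .wav file and return the path."""
    with wave.open(str(path), "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(b"".join(frames))
    return path
```

In a hosted environment with no microphone attached, `record_frames()` raises the RuntimeError above instead of failing inside PortAudio, which matches the symptom described in the note.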

Roadmap

  • Clone Tortoise-tts library locally

  • Collect .mp3 file path

    • Load File, convert to tensor object with torchaudio
  • Utilize functions in classifier.py to classify deepfake audio input

  • Set up Streamlit GUI

  • Utilize pyaudio to accept live audio stream via default device microphone

    • Convert files to/from wav as needed with pydub
  • Generate waveform plot from input audio file or recorded audio

    • Convert files to/from wav as needed with pydub
    • Configure ffmpeg on system, add to system path
  • Export requirements.txt

  • Troubleshoot various threading issues:

    • MatPlotLib backend GUI issues
    • Tortoise-tts must be called from main thread
  • TODO Issues - Streamlit Deploy:

    • pyaudio detects no default microphone in the hosted Streamlit environment
      • Utilize the streamlit-webrtc library
    • The hosted Streamlit deployment does not have access to the system file uploader

See the open issues for a full list of proposed features (and known issues).
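
The file-handling items in the roadmap (collect an .mp3 path, convert to wav with pydub, load as a tensor with torchaudio) can be sketched as follows. This is an illustrative outline, not the repository's implementation. The function names are hypothetical, and the 24 kHz target rate is an assumption based on the rate Tortoise-TTS uses when loading audio.

```python
from pathlib import Path


def wav_path_for(src):
    """Return the .wav path that sits next to the given audio file."""
    return Path(src).with_suffix(".wav")


def convert_to_wav(src):
    """Convert any ffmpeg-readable file (for example .mp3) to .wav with pydub."""
    from pydub import AudioSegment  # lazy import; needs ffmpeg on PATH

    dst = wav_path_for(src)
    AudioSegment.from_file(str(src)).export(str(dst), format="wav")
    return dst


def load_clip(wav_path, target_rate=24000):
    """Load a .wav file as a mono tensor of shape (1, samples) at target_rate."""
    import torchaudio  # lazy import

    waveform, rate = torchaudio.load(str(wav_path))
    waveform = waveform.mean(dim=0, keepdim=True)  # mix down to mono
    if rate != target_rate:
        # Assumption: the classifier expects audio at target_rate.
        waveform = torchaudio.functional.resample(waveform, rate, target_rate)
    return waveform
```

For example, `load_clip(convert_to_wav("call.mp3"))` would produce a tensor suitable for passing to a classifier that takes mono input.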

Contributing

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/newFeature)
  3. Commit your Changes (git commit -m 'Add some new feature to Deepfake-Detector')
  4. Push to the Branch (git push origin feature/newFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact


Acknowledgments

  • AI Anytime, for tutorials on the Tortoise-TTS library, useful function calls, and integration with other relevant libraries (torchaudio, librosa, etc).
  • AI Anytime Youtube Channel
