Deep Fake Detection
Keywords: Deep Fake Detection, Convolutional Neural Networks (CNNs), Xception Network, Librosa

Abstract
Deepfake technology poses an escalating threat by enabling the creation of highly realistic AI-generated videos that can be exploited for misinformation, blackmail, and identity fraud. As deepfake generation becomes more accessible through advanced neural networks such as Generative Adversarial Networks (GANs) and autoencoders, robust detection methods are increasingly needed to mitigate these risks. In this study, we propose a deep learning-based detection system that analyzes both visual and audio cues to improve accuracy. Our approach employs an Xception Convolutional Neural Network (CNN) for frame-level feature extraction, as it has demonstrated strong performance in image classification and deepfake detection. The extracted features are analyzed for spatial inconsistencies that indicate potential forgeries, and temporal inconsistencies between frames are examined to detect the unnatural transitions characteristic of manipulated videos. For audio analysis, we use Librosa and PyAudio to extract vocal features and detect anomalies in speech patterns, identifying AI-generated voices through pitch modulation, frequency variation, and unnatural transitions. The model is trained on large-scale datasets, including FaceForensics++, the Deep Fake Detection Challenge dataset, and Celeb-DF, to ensure robustness and generalizability. To further improve accuracy, we adopt a multimodal fusion approach that combines visual and audio features into a more comprehensive deepfake detection framework. For real-world usability, we developed a user-friendly application that lets users upload videos for deepfake analysis; the system provides real-time classification with a confidence score, enabling effective identification of manipulated content.
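The abstract describes flagging unnatural transitions between consecutive frames but does not state a scoring rule. The sketch below is a minimal illustrative example (not the paper's actual method): it scores temporal inconsistency as the mean absolute pixel change between adjacent grayscale frames, so a spliced or manipulated segment shows up as a spike. The function name and the synthetic clip are assumptions for illustration only.

```python
import numpy as np

def temporal_inconsistency_scores(frames: np.ndarray) -> np.ndarray:
    """Mean absolute pixel change between consecutive frames.

    frames: array of shape (T, H, W), values in [0, 1].
    Returns T-1 scores; an abrupt spike suggests an unnatural transition.
    """
    diffs = np.abs(np.diff(frames, axis=0))  # (T-1, H, W) frame-to-frame change
    return diffs.mean(axis=(1, 2))           # average change per transition

# Synthetic clip: tiny smooth drift, with one abrupt jump at frame 5
# standing in for a splice/manipulation artifact.
rng = np.random.default_rng(0)
clip = np.cumsum(rng.normal(0.0, 0.001, (10, 8, 8)), axis=0)
clip[5:] += 0.5  # simulated splice between frames 4 and 5
scores = temporal_inconsistency_scores(clip)
print(int(scores.argmax()))  # prints 4 (the spliced transition)
```

A real system would compute such scores on face crops and feed them, alongside the Xception features, into a classifier rather than thresholding raw pixel differences.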
The application is designed to be computationally efficient, making it suitable for deployment on both cloud-based and edge computing platforms. By integrating visual and audio-based detection methods, our approach offers a scalable and reliable solution to combat deepfake threats. Ongoing research aims to enhance detection capabilities by incorporating explainable AI techniques, which will provide deeper insight into the model's decision-making process.
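The abstract mentions multimodal fusion of visual and audio cues and a final confidence score, but does not specify the fusion scheme. A common baseline is late fusion: each modality produces its own fake probability, and the two are combined with a weighted average. The sketch below assumes that setup; the weight of 0.6 and the function name are illustrative, not values from the paper.

```python
def fuse_scores(visual: float, audio: float, w_visual: float = 0.6) -> float:
    """Late fusion: weighted average of per-modality fake probabilities.

    visual, audio: probabilities in [0, 1] from the frame-level and
    audio-level detectors. w_visual is an assumed illustrative weight.
    """
    return w_visual * visual + (1.0 - w_visual) * audio

# Example: strong visual evidence, moderate audio evidence.
confidence = fuse_scores(0.9, 0.7)
label = "FAKE" if confidence >= 0.5 else "REAL"
print(label, round(confidence, 2))  # prints: FAKE 0.82
```

Learned fusion (e.g. concatenating feature vectors before a joint classifier) typically outperforms a fixed weighted average, but the weighted form makes the combined confidence score easy to interpret.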
License
Copyright (c) 2025 IES International Journal of Multidisciplinary Engineering Research

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.