LanguageBind-MLP Model

Model Description

This is a fine-tuned LanguageBind model for detecting machine-generated content across multiple modalities (text, image, and audio). The model is part of the RU-AI project, which introduces a large multimodal dataset for AI-generated content detection.

This model leverages LanguageBind's multi-modal semantic alignment capabilities to identify whether content is human-generated or machine-generated across different modalities.

Model Details

Intended Use

This model is designed for detecting AI-generated content in:

  • Text: Identifying AI-written articles, essays, responses, and general text
  • Images: Detecting images generated by models such as Stable Diffusion and DALL-E
  • Audio: Identifying synthetic speech from TTS models

Use Cases

  • Content moderation and authenticity verification
  • Academic integrity checking
  • Media forensics and fact-checking
  • Research on AI-generated content detection

Training Data

The model was trained on the RU-AI dataset, which includes:

  • 245,895 real/human-generated samples
  • 1,229,475 machine-generated samples
  • Source datasets: COCO, Flickr8k, and Places
  • AI-generated content from various models:
    • Images: Stable Diffusion (v1.5, v6.0, XL v3.0, AbsoluteReality, EpicRealism)
    • Audio: EfficientSpeech, StyleTTS2, VITS, XTTS2, YourTTS
    • Text: Various LLM-generated captions and descriptions

The dataset is publicly available on Zenodo.
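
The two sample counts above imply a 5:1 imbalance between machine-generated and human-generated samples. The sketch below computes that ratio and one common set of inverse-frequency class weights. The weighting scheme and the helper name `inverse_frequency_weights` are illustrative assumptions; nothing here states how the RU-AI authors handled the imbalance during training.

```python
# Sample counts taken from the Training Data list above.
counts = {"human": 245_895, "machine": 1_229_475}


def inverse_frequency_weights(class_counts):
    """Return n_total / (n_classes * n_class) per class, so rarer classes weigh more.

    This is a generic heuristic, not a documented part of the RU-AI training recipe.
    """
    total = sum(class_counts.values())
    n_classes = len(class_counts)
    return {name: total / (n_classes * n) for name, n in class_counts.items()}


ratio = counts["machine"] / counts["human"]
weights = inverse_frequency_weights(counts)

print(f"machine:human ratio = {ratio:.1f}")
for name, weight in weights.items():
    print(f"{name}: weight = {weight:.2f}")
```

A classifier that always answers "machine-generated" would already be right on about five of every six samples, so plain accuracy on this split should be read with that baseline in mind.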

Requirements

Hardware

  • NVIDIA GPU with at least 16GB VRAM (RTX 3090 24GB or higher recommended)
  • At least 500GB disk space for the full dataset

Software

  • Python >= 3.8
  • PyTorch >= 1.13.1
  • CUDA >= 11.6
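
The requirements above can be checked before installing anything heavy. The following is a minimal sketch, not part of the RU-AI repository: the helper names `parse_version`, `free_disk_gb`, and `check_environment` are hypothetical, and the PyTorch checks are skipped when PyTorch is not installed yet.

```python
import re
import shutil
import sys

# Minimums copied from the Hardware and Software lists above.
MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 13, 1)
MIN_CUDA = (11, 6)
MIN_FREE_GB = 500  # only needed for the full dataset


def parse_version(text):
    """Turn a version string such as '1.13.1+cu116' into (1, 13, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def free_disk_gb(path="."):
    """Free space, in GB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 1e9


def check_environment(path="."):
    """Return a dict of pass/fail flags; None means the check could not run."""
    report = {
        "python": sys.version_info[:2] >= MIN_PYTHON,
        "disk": free_disk_gb(path) >= MIN_FREE_GB,
    }
    try:
        import torch
    except ImportError:
        report["torch"] = None
        report["cuda"] = None
    else:
        report["torch"] = parse_version(str(torch.__version__)) >= MIN_TORCH
        # torch.version.cuda is the CUDA version PyTorch was built against,
        # or None for CPU-only builds.
        built_cuda = torch.version.cuda
        report["cuda"] = built_cuda is not None and parse_version(built_cuda) >= MIN_CUDA
    return report


for name, ok in check_environment().items():
    print(f"{name}: {'not checked' if ok is None else 'ok' if ok else 'below minimum'}")
```

The disk check is only relevant when downloading the full dataset; the Flickr8k sample needs far less space.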

Installation

# Clone the repository
git clone https://github.com/ZhihaoZhang97/RU-AI.git
cd RU-AI

# Create virtual environment
conda create -n ruai python=3.8
conda activate ruai

# Install dependencies
pip3 install -r requirements.txt

Usage

Model Inference

# See infer_languagebind_model.py in the GitHub repository
python infer_languagebind_model.py

Before running inference, you need to:

  1. Download the dataset or prepare your own data
  2. Update the data paths in infer_languagebind_model.py:
    • image_data_paths
    • audio_data_paths
    • text_data
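
One way to fill in the three variables is to collect files from local folders. The sketch below assumes that `image_data_paths` and `audio_data_paths` are lists of file paths and that `text_data` is a list of strings; the script itself defines the expected types, so check it before copying this. The folder layout and the helper `collect_files` are hypothetical.

```python
from pathlib import Path


def collect_files(folder, extensions):
    """Return sorted file paths under `folder` whose suffix is in `extensions`.

    A missing folder yields an empty list instead of raising.
    """
    root = Path(folder)
    if not root.is_dir():
        return []
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        str(path)
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )


# Hypothetical layout: point these at wherever your data actually lives.
image_data_paths = collect_files("data/images", {".jpg", ".jpeg", ".png"})
audio_data_paths = collect_files("data/audio", {".wav", ".flac", ".mp3"})
text_data = ["A dog runs across a grassy field."]  # assumed to be raw strings

print(
    f"{len(image_data_paths)} images, "
    f"{len(audio_data_paths)} audio files, "
    f"{len(text_data)} texts"
)
```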

Quick Start with Sample Data

# Download Flickr8k sample data
python ./download_flickr.py

# Or download the full dataset (157GB compressed, 500GB uncompressed)
python ./download_all.py

Model Performance

This model is designed to detect AI-generated content across multiple modalities simultaneously, leveraging LanguageBind's language-based semantic alignment to create unified representations.

For detailed performance metrics and evaluation results, please refer to the paper.

Limitations

  • The model's performance depends on the quality and diversity of training data
  • May not generalize well to AI models or techniques not represented in the training set
  • Detection accuracy may vary across different modalities
  • Requires significant computational resources for inference (see Hardware: a GPU with at least 16GB VRAM)

Ethical Considerations

This model is intended for research and legitimate content verification purposes. Users should:

  • Consider privacy implications when analyzing user-generated content
  • Be aware of potential biases in training data
  • Use the model responsibly and not for censorship without human oversight
  • Understand that detection is probabilistic and may produce false positives/negatives
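
The last point can be made concrete by measuring error rates at different decision thresholds. The sketch below assumes the detector yields a score in [0, 1] where higher means "more likely machine-generated"; that output format is an assumption, and the scores and labels are made-up illustration values, not results from this model.

```python
def confusion_counts(scores, labels, threshold=0.5):
    """Count (tp, fp, tn, fn), with label 1 meaning machine-generated."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def error_rates(scores, labels, threshold=0.5):
    """Return (false positive rate, false negative rate) at `threshold`."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # human content flagged as machine
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # machine content missed
    return fpr, fnr


# Made-up scores and ground-truth labels, for illustration only.
scores = [0.95, 0.80, 0.55, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.35, 0.5, 0.9):
    fpr, fnr = error_rates(scores, labels, threshold)
    print(f"threshold={threshold}: FPR={fpr:.2f}, FNR={fnr:.2f}")
```

Raising the threshold lowers the chance of wrongly flagging human content but misses more machine-generated content, so the operating point should match the cost of each kind of error in the intended use case.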

Citation

If you use this model in your research, please cite:

@misc{huang2024ruai,
  title={RU-AI: A Large Multimodal Dataset for Machine Generated Content Detection},
  author={Liting Huang and Zhihao Zhang and Yiran Zhang and Xiyue Zhou and Shoujin Wang},
  year={2024},
  eprint={2406.04906},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Acknowledgments

This work builds upon LanguageBind and on the datasets and generative models listed under Training Data. We appreciate the open-source community for the datasets and models that made this work possible.

License

Please refer to the GitHub repository for license information.

Contact

For questions and issues:

  • Open an issue on the GitHub repository
  • Refer to the paper for contact information of the authors