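Text-to-Sound-Effect Generation with Stable-Audio-Open
A free, open-source, and powerful text-to-sound-effect model that lets you generate all kinds of sound effects.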
Free Online Stable-Audio-Open
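What is Stable Audio Open?
Stable Audio Open lets anyone generate up to 47 seconds of high-quality audio from a simple text prompt. Its specialized training makes it well suited to creating drum beats, instrument riffs, ambient sounds, foley recordings, and other audio samples for music production and sound design.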
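To give a feel for what the 47-second limit means in raw audio terms, here is a minimal sketch that converts a clip length into a sample count. The 44.1 kHz sample rate is an assumption used for illustration (in practice it should be read from the model configuration), and `seconds_to_samples` is a hypothetical helper, not part of any library.

```python
MAX_SECONDS = 47        # upper limit on clip length stated above
SAMPLE_RATE = 44_100    # assumption: samples per second, per channel


def seconds_to_samples(seconds, sample_rate=SAMPLE_RATE):
    """Return how many samples per channel a clip of this length needs."""
    if not 0 < seconds <= MAX_SECONDS:
        raise ValueError(f"seconds must be in (0, {MAX_SECONDS}]")
    return int(round(seconds * sample_rate))


if __name__ == "__main__":
    for length in (5, 30, 47):
        print(f"{length:>2} s -> {seconds_to_samples(length)} samples per channel")
```

For stereo output, the number of stored values is twice the per-channel count.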
Feature
How to use Stable Audio Open?
Let's get started with Stable Audio Open in just a few simple steps.
1
Download the model from Hugging Face
git clone https://huggingface.co/stabilityai/stable-audio-open-1.0
2
Install Dependencies
pip install torch torchaudio stable_audio_tools einops
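Before moving on, it can help to confirm that the packages are importable. The following is a small sketch using only the standard library; `missing_packages` is a hypothetical helper name, and the module names are taken from the install command above.

```python
import importlib.util


def missing_packages(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    required = ["torch", "torchaudio", "stable_audio_tools", "einops"]
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are installed.")
```

If anything is reported as missing, rerun the install command in the same Python environment you plan to use.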
3
Import Required Libraries
import torch
import torchaudio
from einops import rearrange
from stable_audio_tools import get_pretrained_model
from stable_audio_tools.inference.generation import generate_diffusion_cond
import gradio as gr
4
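Load the model
# Use a GPU when available; otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
# Download the model and read its audio settings from the config
model, model_config = get_pretrained_model("stabilityai/stable-audio-open-1.0")
sample_rate = model_config["sample_rate"]
sample_size = model_config["sample_size"]
model = model.to(device)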
5
Generate Audio
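# Text prompt and timing for the clip (example values)
conditioning = [{
    "prompt": "128 BPM tech house drum loop",
    "seconds_start": 0,
    "seconds_total": 30
}]
output = generate_diffusion_cond(
    model,
    steps=100,
    cfg_scale=7,
    conditioning=conditioning,
    sample_size=sample_size,
    sigma_min=0.3,
    sigma_max=500,
    sampler_type="dpmpp-3m-sde",
    device=device
)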
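The `conditioning` argument passed to `generate_diffusion_cond` is a list of dictionaries. As a rough sketch, the helper below builds one entry per prompt and checks the timing values against the 47-second limit. The key names (`prompt`, `seconds_start`, `seconds_total`) are assumed from the model's published usage example, and `build_conditioning` itself is a hypothetical helper, not part of stable_audio_tools.

```python
def build_conditioning(prompts, seconds_total=30, seconds_start=0, max_seconds=47):
    """Build one conditioning dict per prompt (hypothetical helper)."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    if not 0 <= seconds_start < seconds_total <= max_seconds:
        raise ValueError(
            f"need 0 <= seconds_start < seconds_total <= {max_seconds}"
        )
    return [
        {
            "prompt": prompt,
            "seconds_start": seconds_start,
            "seconds_total": seconds_total,
        }
        for prompt in prompts
    ]


if __name__ == "__main__":
    for entry in build_conditioning(["rain on a tin roof", "door creak"], 10):
        print(entry)
```

When passing more than one entry, the batch size given to the generation call likely needs to match the length of the list; treat that as an assumption to verify against the library's documentation.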
6
Save the output audio
# Rearrange audio batch to a single sequence
output = rearrange(output, "b d n -> d (b n)")
# Peak normalize, clip, convert to int16, and save to file
output = output.to(torch.float32).div(torch.max(torch.abs(output))).clamp(-1, 1).mul(32767).to(torch.int16).cpu()
torchaudio.save("output.wav", output, sample_rate)
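To make the normalization chain in step 6 easier to follow, here is a plain-Python sketch of the same arithmetic on a flat list of samples: divide by the peak, clip to [-1, 1], scale by 32767, and truncate to an integer. `peak_normalize_to_int16` is a hypothetical name, and the guard for all-zero (silent) input is an addition that the one-line tensor version above does not include.

```python
def peak_normalize_to_int16(samples):
    """Scale float samples so the loudest one maps to +/-32767."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        # Silent (or empty) input: avoid dividing by zero
        return [0] * len(samples)
    result = []
    for s in samples:
        scaled = max(-1.0, min(1.0, s / peak))  # peak normalize, then clip
        result.append(int(scaled * 32767))      # int() truncates toward zero
    return result


if __name__ == "__main__":
    print(peak_normalize_to_int16([0.5, -0.25, 0.0]))
```

Because the loudest sample is always mapped to full scale, quiet generations come out as loud as energetic ones; skip the division if you want to preserve relative levels between clips.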
FAQs
Here are some of the most frequently asked questions.