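ThinkSound - One-click audio for silent videos: generates matching sound effects for AI videos, supports 50-series GPUs, one-click bundle download - AI Voice - 前沿AI软件资源站 (Frontier AI Software Resource Site)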

无言以对 posted on 2025-7-6 17:07:55

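ThinkSound is the first audio generation model open-sourced by Alibaba's Tongyi Lab. It lets an AI work like a professional sound designer, generating realistic audio that closely matches what is happening in the video.
ThinkSound can be used directly in film and TV post-production, automatically adding fitting ambient noise and explosion effects to AI-generated video. In game development it can generate adaptive sound effects in real time for dynamic scenes such as changing rainfall. It can also support accessible video production, generating scene descriptions and ambient sound in sync for visually impaired users.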

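Application areas

Creative industries: ThinkSound can greatly assist audio production for film, animation, advertising, and other creative work. It automatically generates high-quality sound effects and background music for video content, reducing the workload of audio engineers while improving production efficiency and audio quality.
Audio for video generation models: the framework can also be paired with video generation models to supply audio for the videos they produce. Matching audio can then be generated automatically alongside the video, pushing automated content creation further.
Audio restoration and editing: for restoration, ThinkSound can accurately recover audio segments masked by noise. It can also perform fine-grained edits on user instruction, such as adding, removing, or modifying specific sound elements.
Education and training: ThinkSound can be used to create multimedia teaching materials with rich sound effects, helping students understand and remember what they learn.
Virtual and augmented reality: in VR and AR applications, ThinkSound can generate audio that matches the user's interactions in real time, improving immersion and realism.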

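How to use: (NVIDIA GPU recommended, with at least 12 GB of VRAM and at least 32 GB of system RAM. 50-series GPUs are supported; the bundle is built on CUDA 12.8.)

Upload the video you want to add audio to, optionally enter a prompt and a description, then submit.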
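
As a rough way to check the GPU requirement above before downloading the bundle, the sketch below reads total VRAM from `nvidia-smi` and compares it with the 12 GB minimum stated in the post. The function names and the 5% slack are assumptions made for illustration; they are not part of ThinkSound or the bundle, and the 32 GB system RAM requirement is not checked here.

```python
import shutil
import subprocess

MIN_VRAM_GIB = 12  # minimum stated in the post


def parse_vram_mib(csv_text: str) -> list[int]:
    """Parse one integer (MiB) per line, skipping blanks and non-numeric lines."""
    values = []
    for line in csv_text.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def meets_minimum(vram_mib: int, min_gib: int = MIN_VRAM_GIB, slack: float = 0.05) -> bool:
    """Return True if vram_mib reaches min_gib, allowing a small slack.

    The slack is an assumption: drivers may report slightly less than the
    nominal card size, so an exact comparison could reject a 12 GB card.
    """
    return vram_mib >= min_gib * 1024 * (1 - slack)


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for total memory per GPU; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        encoding="utf-8",
        errors="replace",
        check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


if __name__ == "__main__":
    gpus = query_vram_mib()
    if not gpus:
        print("No NVIDIA GPU detected via nvidia-smi.")
    for index, mib in enumerate(gpus):
        verdict = "OK" if meets_minimum(mib) else "below the stated minimum"
        print(f"GPU {index}: {mib} MiB total, {verdict}")
```

Parsing is kept separate from the `nvidia-smi` call so the threshold logic can be exercised on machines without a GPU.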

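Download links:
Xunlei Cloud Drive: https://pan.xunlei.com/s/VOUXHPuTQl-VA1w0IAneHKqGA1?pwd=nzqw
Baidu Netdisk: **** This content requires purchase ****

Extraction password: https://deepfaces.cc/ (the complete URL is the password; copy and paste it exactly, with no spaces)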

ynzh668 posted on 2025-7-6 17:29:41

Waiting for the update! ;P

lujun1996 posted on 2025-7-6 19:24:02

Even sound effects have large models now. Impressive.

来日方长 posted on 2025-7-7 23:49:22

The results are good, but the model is too large, and it also uses far too much RAM.

alfanso posted on 2025-7-8 20:21:44

Processing: 0%| | 0/1 D:\AIworks\ThinkSound\deepface\lib\site-packages\librosa\util\files.py:10: UserWarning: pkg_resources is deprecated as an API. See The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
from pkg_resources import resource_filename
D:\AIworks\ThinkSound\deepface\lib\site-packages\librosa\util\files.py:10: UserWarning: pkg_resources is deprecated as an API. See The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
from pkg_resources import resource_filename
2025-07-08 20:18:34,708 - INFO - torch.Size()

Processing: 100%|██████████| 1/1 >>> 一键包制作：https://deepfaces.cc <<<
>>> 欢迎转载，转载请注明出处 <<<

>>> 一键包制作：https://deepfaces.cc <<<
>>> 欢迎转载，转载请注明出处 <<<

Processing: 100%|██████████| 1/1
{'caption': "['Plastic Debris Handling']", 'caption_cot': "['Begin with the sound of hands scooping up loose plastic debris, followed by the subtle cascading noise as the pieces fall and scatter back down. Include soft crinkling and rustling to emphasize the texture of the plastic. Add ambient factory background noise with distant machinery to create an industrial atmosphere.']"}
Traceback (most recent call last):
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\queueing.py", line 407, in call_prediction
    output = await route_utils.call_process_api(
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\route_utils.py", line 226, in call_process_api
    output = await app.get_blocks().process_api(
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\blocks.py", line 1550, in process_api
    result = await self.call_function(
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\blocks.py", line 1199, in call_function
    prediction = await utils.async_iteration(iterator)
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\utils.py", line 519, in async_iteration
    return await iterator.__anext__()
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\utils.py", line 512, in __anext__
    return await anyio.to_thread.run_sync(
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\anyio\_backends\_asyncio.py", line 2470, in run_sync_in_worker_thread
    return await future
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\anyio\_backends\_asyncio.py", line 967, in run
    result = context.run(func, *args)
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\utils.py", line 495, in run_sync_iterator_async
    return next(iterator)
  File "D:\AIworks\ThinkSound\deepface\lib\site-packages\gradio\utils.py", line 649, in gen_wrapper
    yield from f(*args, **kwargs)
  File "<frozen app>", line 130, in generate_audio
  File "<frozen app>", line 36, in run_infer
UnicodeDecodeError: 'gbk' codec can't decode byte 0xff in position 47: illegal multibyte sequence
It throws an error when I run it. I tried two different sample videos and both fail with the same error. What is going wrong?

jeck123 posted on 2025-7-8 21:10:50

The extraction password doesn't work.

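alfanso posted on 2025-7-8 21:44:31

I have already extracted it, and the models downloaded automatically; the whole package comes to 65.5 GB. The error occurs with the sample videos bundled in the package.
"The extraction password doesn't work": could you be more specific?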

无言以对 posted on 2025-7-8 22:11:45

alfanso posted on 2025-7-8 20:21
Processing: 0%| | 0/1

Try deleting all of the prompt text.

alfanso posted on 2025-7-8 22:29:14

I deleted both the caption and the CoT description, and I still get the same error.

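无言以对 posted on 2025-7-8 22:48:03

alfanso posted on 2025-7-8 22:29
I deleted both the caption and the CoT description, and I still get the same error.

It looks like an encoding error. Does your username contain Chinese characters?
An updated version will be out in the next couple of days. If nothing else works, wait for that release; it improves a lot of things, such as RAM and VRAM usage and generation speed.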
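
For readers hitting the same `UnicodeDecodeError`, the sketch below reproduces this class of failure in isolation. On Simplified Chinese Windows, Python decodes text with code page 936 (reported as 'gbk') whenever no encoding is given, and 0xff is never a valid GBK lead byte. The sample bytes and function names here are made up for illustration; the traceback does not show what the bundle was actually reading at line 36 of `run_infer`, so this is not a confirmed fix.

```python
# Illustrative bytes only: 47 ASCII characters followed by 0xff 0xfe and more
# text. This is NOT the data ThinkSound actually read; it just places the
# offending byte at the same offset as in the traceback.
SAMPLE = b"x" * 47 + b"\xff\xfe" + b" tail"


def decode_like_default_windows(data: bytes) -> str:
    """Decode with GBK, as happens by default on Chinese-locale Windows."""
    return data.decode("gbk")


def decode_explicit(data: bytes) -> str:
    """Decode as UTF-8 and substitute U+FFFD for bytes that are not valid UTF-8."""
    return data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    try:
        decode_like_default_windows(SAMPLE)
    except UnicodeDecodeError as exc:
        print(f"{exc.encoding} failed at byte offset {exc.start}: {exc.reason}")
    print(decode_explicit(SAMPLE))
```

Note that 0xff is invalid in UTF-8 as well, so a strict UTF-8 decode of the same bytes would also fail; `errors="replace"` only keeps the program running by substituting a placeholder character. A 0xff byte often points to UTF-16 or binary data being read as text, but the real cause in the bundle cannot be determined from the traceback alone.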
