无言以对 posted on 2025-8-13 12:10:57

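EchoMimic V3 - Generate a half-body talking digital human video from one photo and one audio clip; supports hand gestures; supports 50-series GPUs; one-click all-in-one package download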


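EchoMimicV3 is an AI-based multimodal human animation system developed by Ant Group. A 1.3-billion-parameter model provides unified "audio + text + image" driving and generates highly realistic virtual human animation (talking heads, body movement, and so on). Put simply, it lets a virtual character automatically produce synchronized lip movement, facial expressions, and body motion from input speech, text, or a reference image, so that it moves as naturally as a real person.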



Application areas

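Virtual digital humans
Provides natural motion and lip sync for virtual streamers, AI customer service agents, and smart assistants, improving the interaction experience.
Example: in an e-commerce livestream, a virtual model automatically adjusts its expressions and gestures to match the product introduction.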

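Film and game production
Quickly generates animation clips, reducing the cost and time of traditional motion capture.
Example: adding voice-driven real-time expressions to game characters, or fixing lip movement that does not match the dialogue in film and TV footage.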

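Education and research
Assists language learning (for example, generating foreign-language teaching animations with lip movement), or supports psychology and human behavior research.
Example: analyzing how different intonations affect a virtual human's expressions.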

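Accessibility
Helps people with speech impairments express themselves through motion generated from text or images, or provides visualized speech content for deaf and hard-of-hearing people.
Example: converting speech into animation with lip movement so that deaf and hard-of-hearing users can "watch" a conversation.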


Usage guide: (NVIDIA GPU recommended, 12 GB of VRAM or more. 50-series GPUs are supported; built on CUDA 12.8)

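Upload a photo of a person (head shots and half-body shots are both supported) and a driving audio clip, set the parameters, and submit.
Models are not bundled by default; after extracting the package you need to download them manually.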
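
A quick way to check the hardware guideline above before launching is sketched here. It is only an illustration and not part of the package: the helper names `meets_vram_requirement` and `report_gpu` are made up for this example, and it assumes PyTorch can be imported from the Python environment you run it in. The 12 GB threshold is the figure quoted in this post, compared in decimal gigabytes because cards usually report slightly less than their nominal size.

```python
MIN_VRAM_GB = 12  # figure quoted in the post; a guideline, not an official limit


def meets_vram_requirement(total_bytes, min_gb=MIN_VRAM_GB):
    """Return True if total_bytes is at least min_gb decimal gigabytes.

    Decimal GB (10**9 bytes) keeps the check loose, since a nominal
    "12 GB" card typically reports a little under 12 GiB to the driver.
    """
    return total_bytes >= min_gb * 10 ** 9


def report_gpu():
    """Describe the first CUDA device, or explain why none is usable."""
    try:
        import torch  # assumption: PyTorch is installed in this environment
    except ImportError:
        return "PyTorch is not installed here, so the GPU cannot be checked."
    if not torch.cuda.is_available():
        return "No usable CUDA device was found."
    props = torch.cuda.get_device_properties(0)
    verdict = "meets" if meets_vram_requirement(props.total_memory) else "is below"
    return (f"{props.name}: {props.total_memory / 10 ** 9:.1f} GB VRAM, "
            f"{verdict} the {MIN_VRAM_GB} GB guideline "
            f"(PyTorch built for CUDA {torch.version.cuda})")


if __name__ == "__main__":
    print(report_gpu())
```

If the script reports that no CUDA device was found even though the driver is installed, the problem is in the environment rather than in the photo or audio inputs.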


Download links:
Quark Netdisk: https://pan.quark.cn/s/f7f3026f28ee
Baidu Netdisk:
**** This content requires purchase ****

wujunchen posted on 2025-8-15 11:29:10

Hi, if I've purchased a resource and it gets updated later, do I have to pay again?

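无言以对 posted on 2025-8-15 11:39:32

wujunchen posted on 2025-8-15 11:29
Hi, if I've purchased a resource and it gets updated later, do I have to pay again?

Yes.
If you need these regularly, you can sign up for an annual membership; all packages are then free to download.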

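weilaiyilai posted on 2025-8-18 17:59:37

I have a Tesla T10 card and the drivers are all working, but this still ran straight on the CPU. It is too slow to use, and I don't know how to change the settings myself. Sigh.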

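无言以对 posted on 2025-8-18 18:22:31

weilaiyilai posted on 2025-8-18 17:59
I have a Tesla T10 card and the drivers are all working, but this still ran straight on the CPU. It is too slow to use, and I don't know how to change the settings myself. Sigh.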
...

That card doesn't seem to support bf16.
Many Tesla cards use older architectures, and a lot of models won't run on them.
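
For anyone who wants to check their own card, here is a minimal sketch. It relies on one fact: native bf16 arithmetic arrived with NVIDIA compute capability 8.0 (Ampere), so Turing-class cards, which report 7.5, lack it. That the T10 belongs to that generation is assumed here, and whether this package falls back to the CPU for that reason is not confirmed in this thread. The function names are made up for the example.

```python
def supports_native_bf16(capability):
    """capability is a (major, minor) tuple, the shape returned by
    torch.cuda.get_device_capability()."""
    major, _minor = capability
    return major >= 8  # 8.x is Ampere; older generations lack native bf16


def describe_device():
    """Summarize the first CUDA device as PyTorch sees it."""
    try:
        import torch  # assumption: PyTorch is installed in this environment
    except ImportError:
        return "PyTorch is not installed here."
    if not torch.cuda.is_available():
        return "CUDA is not available to PyTorch, so work would run on the CPU."
    cap = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    status = "has" if supports_native_bf16(cap) else "lacks"
    return f"{name} (compute capability {cap[0]}.{cap[1]}) {status} native bf16 support."


if __name__ == "__main__":
    print(describe_device())
```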

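weilaiyilai posted on 2025-8-18 18:46:41

无言以对 posted on 2025-8-18 18:22
That card doesn't seem to support bf16.
Many Tesla cards use older architectures, and a lot of models won't run on them.

Oh, thanks for the reply.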

tnt56466939 posted on 2025-9-6 00:32:24

-----------------------------------------------------
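For more cutting-edge AI technology and applications, visit https://deepface.cc
The software WebUI is starting, please wait...
-----------------------------------------------------
*One-click package made by: https://deepface.cc
* Please use AI technology reasonably and lawfully; do not use it for illegal purposes
* Otherwise, you bear any resulting liability and consequences yourself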
2025-09-06 00:28:53.594201: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
WARNING:tensorflow:From E:\EchoMimic_v3\deepface\lib\site-packages\keras\src\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.

loaded 3D transformer's pretrained weights from models/Wan2.1-Fun-V1.1-1.3B-InP\./ ...
### missing keys: 93;
### unexpected keys: 0;
['audio2token.proj.weight', 'audio2token.norm.weight', 'audio2token.norm.bias', 'blocks.0.cross_attn.k_audio.weight', 'blocks.0.cross_attn.v_audio.weight', 'blocks.0.cross_attn.norm_k_audio.weight', 'blocks.1.cross_attn.k_audio.weight', 'blocks.1.cross_attn.v_audio.weight', 'blocks.1.cross_attn.norm_k_audio.weight', 'blocks.2.cross_attn.k_audio.weight', 'blocks.2.cross_attn.v_audio.weight', 'blocks.2.cross_attn.norm_k_audio.weight', 'blocks.3.cross_attn.k_audio.weight', 'blocks.3.cross_attn.v_audio.weight', 'blocks.3.cross_attn.norm_k_audio.weight', 'blocks.4.cross_attn.k_audio.weight', 'blocks.4.cross_attn.v_audio.weight', 'blocks.4.cross_attn.norm_k_audio.weight', 'blocks.5.cross_attn.k_audio.weight', 'blocks.5.cross_attn.v_audio.weight', 'blocks.5.cross_attn.norm_k_audio.weight', 'blocks.6.cross_attn.k_audio.weight', 'blocks.6.cross_attn.v_audio.weight', 'blocks.6.cross_attn.norm_k_audio.weight', 'blocks.7.cross_attn.k_audio.weight', 'blocks.7.cross_attn.v_audio.weight', 'blocks.7.cross_attn.norm_k_audio.weight', 'blocks.8.cross_attn.k_audio.weight', 'blocks.8.cross_attn.v_audio.weight', 'blocks.8.cross_attn.norm_k_audio.weight', 'blocks.9.cross_attn.k_audio.weight', 'blocks.9.cross_attn.v_audio.weight', 'blocks.9.cross_attn.norm_k_audio.weight', 'blocks.10.cross_attn.k_audio.weight', 'blocks.10.cross_attn.v_audio.weight', 'blocks.10.cross_attn.norm_k_audio.weight', 'blocks.11.cross_attn.k_audio.weight', 'blocks.11.cross_attn.v_audio.weight', 'blocks.11.cross_attn.norm_k_audio.weight', 'blocks.12.cross_attn.k_audio.weight', 'blocks.12.cross_attn.v_audio.weight', 'blocks.12.cross_attn.norm_k_audio.weight', 'blocks.13.cross_attn.k_audio.weight', 'blocks.13.cross_attn.v_audio.weight', 'blocks.13.cross_attn.norm_k_audio.weight', 'blocks.14.cross_attn.k_audio.weight', 'blocks.14.cross_attn.v_audio.weight', 'blocks.14.cross_attn.norm_k_audio.weight', 'blocks.15.cross_attn.k_audio.weight', 'blocks.15.cross_attn.v_audio.weight', 
'blocks.15.cross_attn.norm_k_audio.weight', 'blocks.16.cross_attn.k_audio.weight', 'blocks.16.cross_attn.v_audio.weight', 'blocks.16.cross_attn.norm_k_audio.weight', 'blocks.17.cross_attn.k_audio.weight', 'blocks.17.cross_attn.v_audio.weight', 'blocks.17.cross_attn.norm_k_audio.weight', 'blocks.18.cross_attn.k_audio.weight', 'blocks.18.cross_attn.v_audio.weight', 'blocks.18.cross_attn.norm_k_audio.weight', 'blocks.19.cross_attn.k_audio.weight', 'blocks.19.cross_attn.v_audio.weight', 'blocks.19.cross_attn.norm_k_audio.weight', 'blocks.20.cross_attn.k_audio.weight', 'blocks.20.cross_attn.v_audio.weight', 'blocks.20.cross_attn.norm_k_audio.weight', 'blocks.21.cross_attn.k_audio.weight', 'blocks.21.cross_attn.v_audio.weight', 'blocks.21.cross_attn.norm_k_audio.weight', 'blocks.22.cross_attn.k_audio.weight', 'blocks.22.cross_attn.v_audio.weight', 'blocks.22.cross_attn.norm_k_audio.weight', 'blocks.23.cross_attn.k_audio.weight', 'blocks.23.cross_attn.v_audio.weight', 'blocks.23.cross_attn.norm_k_audio.weight', 'blocks.24.cross_attn.k_audio.weight', 'blocks.24.cross_attn.v_audio.weight', 'blocks.24.cross_attn.norm_k_audio.weight', 'blocks.25.cross_attn.k_audio.weight', 'blocks.25.cross_attn.v_audio.weight', 'blocks.25.cross_attn.norm_k_audio.weight', 'blocks.26.cross_attn.k_audio.weight', 'blocks.26.cross_attn.v_audio.weight', 'blocks.26.cross_attn.norm_k_audio.weight', 'blocks.27.cross_attn.k_audio.weight', 'blocks.27.cross_attn.v_audio.weight', 'blocks.27.cross_attn.norm_k_audio.weight', 'blocks.28.cross_attn.k_audio.weight', 'blocks.28.cross_attn.v_audio.weight', 'blocks.28.cross_attn.norm_k_audio.weight', 'blocks.29.cross_attn.k_audio.weight', 'blocks.29.cross_attn.v_audio.weight', 'blocks.29.cross_attn.norm_k_audio.weight']
### All Parameters: 1707.215168 M
### attn1 Parameters: 0.0 M
Missing keys: 0, Unexpected keys: 0
### missing keys: 0;
### unexpected keys: 0;
[] []
Press any key to continue . . .

tnt56466939 posted on 2025-9-6 00:33:25

I don't understand where the problem is. All the models are in place; are weights missing?
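
For what it's worth, the 93 missing keys in the log above all carry audio-related names: three `audio2token` entries plus three `cross_attn` entries for each of `blocks.0` through `blocks.29`. The sketch below just rebuilds that list from the pattern and counts it. One reading, which is an assumption and not something confirmed in this thread, is that the base Wan2.1-Fun checkpoint simply has no audio layers and a later load fills them in, since the log goes on to print `Missing keys: 0`.

```python
NUM_BLOCKS = 30  # blocks.0 through blocks.29 appear in the log

AUDIO2TOKEN_KEYS = [
    "audio2token.proj.weight",
    "audio2token.norm.weight",
    "audio2token.norm.bias",
]
CROSS_ATTN_SUFFIXES = ["k_audio.weight", "v_audio.weight", "norm_k_audio.weight"]


def expected_audio_keys(num_blocks=NUM_BLOCKS):
    """Rebuild the key names following the pattern printed in the log."""
    keys = list(AUDIO2TOKEN_KEYS)
    for i in range(num_blocks):
        for suffix in CROSS_ATTN_SUFFIXES:
            keys.append(f"blocks.{i}.cross_attn.{suffix}")
    return keys


keys = expected_audio_keys()
print(len(keys))                        # 3 + 30 * 3 = 93, matching "missing keys: 93"
print(all("audio" in k for k in keys))  # True: every key is audio-related
```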

无言以对 posted on 2025-9-6 10:18:33

tnt56466939 posted on 2025-9-6 00:32
-----------------------------------------------------
For more cutting-edge AI technology and applications, visit https://deepface.c ...

Set the virtual memory (page file) manually, 32 GB or more.
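
As a rough sense of scale, the log reports `All Parameters: 1707.215168 M` for the transformer. The sketch below turns that count into a weight size at two precisions. It is back-of-the-envelope arithmetic only: which precision is used while loading is an assumption, and the figure covers just this one component. The log does not size the other models that get loaded, so this does not by itself account for the 32 GB suggestion.

```python
PARAMS_MILLIONS = 1707.215168  # "### All Parameters" value from the log

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weights_size_gib(params_millions, bytes_per_param):
    """Size of the raw weights in GiB for a given parameter count and width."""
    return params_millions * 1e6 * bytes_per_param / 1024 ** 3


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weights_size_gib(PARAMS_MILLIONS, nbytes):.2f} GiB")
```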

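tnt56466939 posted on 2025-9-6 13:32:25

无言以对 posted on 2025-9-6 10:18
Set the virtual memory (page file) manually, 32 GB or more.

Thanks, I'll test it tonight. One more question: why does the computer get so laggy during loading? The specs are fairly high, yet it still lags badly while loading.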