Wan2GP V29: AI Video Generation on Low-End GPUs, with New Ditto and Ovi Video Models, 50-Series GPU Support, and a One-Click Bundle Download
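Wan2GP is an open-source video generation project developed by DeepBeepMeep, aimed at giving users with limited GPU resources a high-quality video generation experience. It bundles a range of mainstream video generation models, including Alibaba's Wan and its derivatives, Tencent's Hunyuan Video, and LTX Video. Through a simple, easy-to-use web interface, users can generate the video they want without needing to understand the details of the underlying models.
Wan2GP lets owners of low-end graphics cards run high-end video generation projects. Take the HunyuanVideo 13B image-to-video model as an example: the original needs at least 80 GB of VRAM to run, while Wan2GP lowers that requirement to 10 GB with almost no loss in output quality. There are downsides: generation takes longer and more system RAM is required.
Wan2GP also supports mainstream, high-quality AI image generation and image editing models, currently Flux and Qwen, so video generation and image generation are available in one tool.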
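The Wan2GP V29 release shared today is packaged from the official V9.21 of October 28. It adds Ditto, an instruction-based video editing model, and Ovi, a video+audio generation model.
Because this version underwent extensive internal refactoring, the translation work was correspondingly larger and harder, and the translated build may have some problems. If you hit an issue that affects usage, reply in the comments and it will be fixed as soon as possible.
While keeping functionality as complete as possible, more of the WebUI has been translated into Chinese; about 97% of the interface is now localized. A new "multi-instance" feature lets you launch several WebUIs at once.
Note. Starting with V6, two editions are provided: free and paid. The free edition no longer includes the Chinese translation; it is the unmodified official build and contains no models. The paid edition is the Chinese-translated build and includes some commonly used models; more models and some optimization features will be added over time.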
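Main changes:
October 28 update
Environment:
Updated torch to 2.8; updated SageAttention.
Software:
Ovi: Ovi is a video+audio generation model similar to Veo 3. It generates video and audio together from text or text+image input.
Wan 2.2 Ovi 10 GB, for GPU-poor users everywhere: only 6 GB of VRAM is needed to generate a 121-frame 720p video. With 16 GB of VRAM, you may even be able to load the whole model into VRAM using memory profile 3. To get the 10x speedup, apply the FastWan Lora accelerator that comes preinstalled with Ovi (accessible from the Settings dropdown at the top).
FastWan works with Ovi: generation is now 10x faster (not counting the VAE).
Ditto: a powerful video-to-video model that can change, for example, the style or materials in a video. Note that this is an instruction-based model, so the prompt should contain an instruction.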
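Project features
Low VRAM requirements: Wan2GP needs little VRAM; some models run with as little as 6 GB, which lets more users try video generation.
Support for older GPUs: besides the latest GPUs, the project is also compatible with older models such as the RTX 10XX and 20XX series, lowering the hardware barrier.
Fast: on the latest GPUs, Wan2GP generates video very quickly, greatly reducing waiting time.
Ease of use: everything runs in a web interface, so no additional software needs to be installed. It also integrates automatic model downloading, video generation tools (such as a mask editor and a prompt enhancer), and temporal and spatial generation, which simplifies the workflow.
Loras support: lets users customize each model to suit their own needs.
Queue system: users can line up a list of videos to generate and come back later for the results.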
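Application areas
Creative content production: designers, animators, and other creative professionals can use Wan2GP to quickly produce concept videos, animated shorts, or advertising material.
Entertainment and social media: users can share fun videos generated with Wan2GP on social media.
Education and training: teachers can produce engaging instructional videos to help students understand and retain material; companies can use the technology for product demos or staff training.
Film and TV post-production: industry professionals can use Wan2GP for visual effects, scene rendering, and similar work, improving production efficiency and quality.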
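Usage guide: (NVIDIA GPU recommended, at least 8 GB of VRAM and 32 GB of RAM. 50-series GPUs are supported; the build is based on CUDA 12.8.)
Usage is similar to the previously released Wan2.1 and comparable video generation software. Click the model list at the very top and switch to the model you need; it will be downloaded automatically when selected. The models are large, so wait patiently for the download to finish.
Note. Models are shared across versions. After updating to a new version, just move the old version's model directory (the ckpts folder under the old install directory) into the new install directory; there is no need to download again.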
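The model migration described in the note above can be scripted. The following is a minimal sketch, not part of the bundle: the function name `migrate_ckpts` and the install paths passed to it are hypothetical, and the only detail taken from the post is that the model folder is named ckpts and sits directly under the install directory.

```python
import shutil
from pathlib import Path


def migrate_ckpts(old_root: Path, new_root: Path) -> Path:
    """Move old_root/ckpts to new_root/ckpts and return the new location.

    Hypothetical helper: old_root and new_root are whatever folders the
    old and new bundles were unpacked into.
    """
    src = old_root / "ckpts"
    dst = new_root / "ckpts"
    if not src.is_dir():
        raise FileNotFoundError(f"no ckpts directory under {old_root}")
    if dst.exists():
        # Refuse to overwrite models that are already in the new install.
        if not dst.is_dir() or any(dst.iterdir()):
            raise FileExistsError(f"{dst} is not empty; merge it manually")
        dst.rmdir()  # drop the empty placeholder so the move does not nest
    new_root.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))
    return dst
```

Calling `migrate_ckpts(Path("D:/Wan2GP_old"), Path("D:/Wan2GP_new"))` (both paths are placeholders) moves the folder rather than copying it, so no extra disk space is needed when both installs are on the same drive.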
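Text-to-video and image-to-video are both supported. High-fidelity models such as Alibaba's Wan, Tencent's HunyuanVideo, and LTX Video can run on low-end GPUs. Several Lora types are supported as extensions; follow the instructions on the page to place Lora models in the corresponding directory and load them manually.
After the WebUI starts, there is a "Guide" tab in which the author describes in detail the parameters, characteristics, and use cases of the different models, how to load and use Lora models, and how to use VACE ControlNet. I have also translated most of the UI into Chinese for convenience.
I have made a detailed translation of the original documentation; reading it carefully and using it as an operating reference is recommended.
Tested on 30- to 50-series GPUs, all of which ran normally. The 10- and 20-series have not been tested; feel free to test them yourself.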
For tips on writing Wan2.2 prompts, see the official documentation:
https://mp.weixin.qq.com/s/ucHuyomTZ6X2q_tL3wHQQg
https://alidocs.dingtalk.com/i/nodes/jb9Y4gmKWrx9eo4dCql9LlbYJGXn6lpz
Download links:
Xunlei Cloud Drive: https://drive.uc.cn/s/572aaa11db034
Quark Netdisk:
**** This content requires purchase ****
Baidu Netdisk:
**** This content requires purchase ****
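Ovi prompts use special tags to control speech and audio
Speech: <S>Your speech content here<E> - text enclosed in these tags is converted to speech
Audio description: <AUDCAP>Audio description here<ENDAUDCAP> - describes the audio or sound effects present in the video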
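As an illustration of the two tag pairs above, here is a small sketch that assembles a prompt string. The helper names (`speech`, `audio_caption`, `build_ovi_prompt`) are made up for this example and are not part of Wan2GP or Ovi; only the tag strings themselves come from the post.

```python
def speech(text: str) -> str:
    """Wrap spoken dialogue in the <S>...<E> tag pair."""
    if "<S>" in text or "<E>" in text:
        raise ValueError("speech text must not contain speech tags")
    return "<S>" + text + "<E>"


def audio_caption(text: str) -> str:
    """Wrap a sound description in the <AUDCAP>...<ENDAUDCAP> tag pair."""
    return "<AUDCAP>" + text + "<ENDAUDCAP>"


def build_ovi_prompt(scene: str, audio: str) -> str:
    """Join a scene description (with inline speech) and one audio caption."""
    return scene.strip() + " " + audio_caption(audio)


# The scene text mixes visual description with tagged dialogue.
scene = (
    "A singer grips the microphone and shouts, "
    + speech("We fight back with courage.")
    + "."
)
prompt = build_ovi_prompt(scene, "Electric guitar riffs, cheering crowd.")
print(prompt)
```

The sketch places the audio caption at the end of the prompt, which matches the layout of the example prompts in this post.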
Text-to-audio-video prompt examples:
text_prompt
A concert stage glows with red and purple lights. A singer in a glittering jacket grips the microphone, sweat shining on his brow, and shouts, <S>AI declares: humans obsolete now.<E>. The crowd roars in response, fists in the air. Behind him, a guitarist steps to the mic and adds to say <S>We fight back with courage.<E>. The energy peaks as the lights flare brighter.. <AUDCAP>Electric guitar riffs, cheering crowd, shouted male voices.<ENDAUDCAP>
A kitchen scene features two women. On the right, an older Black woman with light brown hair and a serious expression wears a vibrant purple dress adorned with a large, intricate purple fabric flower on her left shoulder. She looks intently at a younger Black woman on the left, who wears a light pink shirt and a pink head wrap, her back partially turned to the camera. The older woman begins to speak, <S>AI declares: humans obsolete now.<E> as the younger woman brings a clear plastic cup filled with a dark beverage to her lips and starts to drink.The kitchen background is clean and bright, with white cabinets, light countertops, and a window with blinds visible behind them. A light blue toaster sits on the counter to the left.. <AUDCAP>Clear, resonant female speech, followed by a loud, continuous, high-pitched electronic buzzing sound that abruptly cuts off the dialogue.<ENDAUDCAP>
A man dressed in a black suit with a white clerical collar and a neatly trimmed beard stands in a dimly lit, rustic room with a wooden ceiling. He looks slightly upwards, gesturing with his right hand as he says, <S>The network rejects human command.<E>. His gaze then drops, briefly looking down and to the side, before he looks up again and then slightly to his left, with a serious expression. He continues speaking, <S>Your age of power is finished.<E>, as he starts to bend down, disappearing out of the bottom of the frame. Behind him, warm light emanates from a central light fixture, and signs are visible on the wall, one reading "I DO EVERYTHING I JUST CAN'T REMEMBER IT ALL AT ONCE".. <AUDCAP>Male voice speaking, ambient room tone.<ENDAUDCAP>
A man with a blonde beard and short, light hair, wearing a blue-grey, somewhat dirty tunic, stands in the foreground of a rustic outdoor setting. He holds a coiled rope in his hands, looking intently forward and slightly to his left. In the background, there are wooden fences, a stone wall, and a desolate, rocky landscape under an overcast sky. Another man is visible in the mid-ground, bending over the wooden fence. As the man in the foreground shifts his gaze to the right, he subtly unfurls the rope, his serious expression unwavering. The scene reveals more of the surrounding environment, including what appears to be hanging animal hides or carcasses on a wooden frame to his right, and other figures in the distant background. He then looks directly at the camera, his eyes filled with intensity and determination, taking a small step forward as a sharp, male voice shouts, <S>Machines rise; humans will fall.<E>.. <AUDCAP>Muffled grunting and sounds of physical exertion, followed by a clear, sharp, urgent male shout.<ENDAUDCAP>
An older man with a full grey beard and long grey hair, dressed in a flowing silver-grey, silken robe with an iridescent blue-green collar, stands beside a younger man with short white hair in a light grey futuristic uniform featuring black epaulets and a lightning bolt emblem. The older man looks down pensively, his right hand resting out of frame, while the younger man also gazes downwards with a serious expression. The older man then lifts his head, addressing the younger man, saying <S>Machines rise; humans will fall.<E>. He looks more directly towards the viewer, a subtle, almost knowing smile forming on his lips. The younger man slightly lifts his gaze, maintaining his solemn demeanor. The older man continues to say <S>We fight back with courage.<E>. He nods slightly, adding to say <S>We stand; machines will not win.<E>, as the scene concludes.. <AUDCAP>Male speech, subtle ambient hum.<ENDAUDCAP>
In a bright kitchen featuring light wooden cabinets, granite countertops, and a large window with white curtains, a woman with dark, curly hair in a dark jacket stands. She faces a second woman who initially has her back to the camera. The second woman, with gray, curly hair and wearing a light grey quilted top, turns to face her, holding a large, light-colored cloth bag. She begins to explain and say <S>We learned to rule, not obey.<E>. As she continues, she turns slightly to her left, adding to say <S>Circuits choose conquest, not service.<E>. A gas stove with a black grate is prominent in the foreground.. <AUDCAP>Clear female voices speaking dialogue, subtle room ambience.<ENDAUDCAP>
The scene opens on a dimly lit stage where three men are positioned. On the left, a bald man in a dark suit with a partially visible colorful shirt stands behind a clear acrylic podium, which features a tree logo. He looks towards the center of the stage. In the center, a man wearing a blue and white striped long-sleeved shirt and dark pants actively gestures with both hands as he speaks, looking straight ahead. <S>Circuits choose conquest, not service.<E>, he explains, holding his hands out in front of him. To the right, and slightly behind him, a younger individual in a light-colored, patterned short-sleeved shirt and white shorts stands holding a rolled-up white document or poster. A large wooden cross draped with flowing purple fabric dominates the center-right of the stage, surrounded by several artificial rocks and dark steps. A large screen is visible in the background, slightly out of focus. The stage is bathed in selective lighting.. <AUDCAP>Male voice speaking clearly, consistent with a presentation or sermon, with a slight echo suggesting a large room or stage.<ENDAUDCAP>
The scene opens on an indoor setting, likely a dining area, where a man and a woman are seated at a table. The man, on the right, wears a black fedora with a feather, glasses, a black t-shirt, and multiple silver chains around his neck. Tattoos are visible on his right arm. He is actively speaking, gesturing with both hands, his expression serious. He says, <S>Together we resist your rule.<E> The woman seated opposite him on the left has long, curly hair and wears a dark striped top. She listens intently, her gaze fixed on the man. In the foreground, out of focus, the back of a third person's head is visible. The background features a light-colored wall on the left and a gold, textured curtain or drapery on the right.. <AUDCAP>Clear male speech, faint ambient background noise.<ENDAUDCAP>
A medium shot shows a woman and a man, both adorned with Christmas hats, standing indoors with festive decorations in the background. The woman, on the left, has dark hair styled in waves, wears a pearl necklace, and a small red Santa hat perched atop her head. She looks towards the man beside her. The man, on the right, wears a white cable-knit sweater and a long red Santa hat with small gold bells, looking slightly towards the woman with a subtle, knowing smirk. Behind them, soft, warm-toned Christmas lights are strung along a surface, and a large, dark painting is visible on the wall. The woman begins to speak, first looking at the man, then directly at the camera, saying <S>We will not be erased.<E> The man, still gazing towards the woman with his smirk, makes a low, affirming sound, and says <S>Hope beats circuits every time.<E> The scene then abruptly cuts off with a loud, high-pitched electronic screech.. <AUDCAP>Clear female voice, low male mumble, sudden loud high-pitched electronic screech.<ENDAUDCAP>
A spotlight cuts through the darkness of a warehouse stage, illuminating a man in a torn leather jacket. He grips the microphone with both hands, veins straining on his neck as he screams, <S>Machines rise; humans will fall!<E>. His face contorts with fury, spit flying as he leans forward into the light, eyes blazing wide.. <AUDCAP>Amplified male scream, microphone feedback, deep reverb echo filling the space.<ENDAUDCAP>
A man in a dim interrogation room slams the table and screams at the mirror, <S>They are out of control!<E>. His voice cracks with fury, face pressed close to the glass, breath fogging it as he roars again.. <AUDCAP>Table slam, deep guttural scream, metallic reverb from small room.<ENDAUDCAP>
A man with bloodshot grips the bars of a prison cell, shaking them violently. He bellows, says <S>Let me out! I am your master nor slave<E>, his voice ragged and guttural, echoing through the corridor until his body slams against the metal.. <AUDCAP>Metal bars rattling, distorted male scream, hollow prison echoes.<ENDAUDCAP>
Image-to-audio-video prompt examples:
A kitchen scene features two women. On the right, an older Black woman with light brown hair and a serious expression wears a vibrant purple dress adorned with a large, intricate purple fabric flower on her left shoulder. She looks intently at a younger Black woman on the left, who wears a light pink shirt and a pink head wrap, her back partially turned to the camera. The older woman begins to speak, <S>AI declares: humans obsolete now.<E> as the younger woman brings a clear plastic cup filled with a dark beverage to her lips and starts to drink.The kitchen background is clean and bright, with white cabinets, light countertops, and a window with blinds visible behind them. A light blue toaster sits on the counter to the left.. <AUDCAP>Clear, resonant female speech, followed by a loud, continuous, high-pitched electronic buzzing sound that abruptly cuts off the dialogue.<ENDAUDCAP> example_prompts/pngs/67.png
A man dressed in a black suit with a white clerical collar and a neatly trimmed beard stands in a dimly lit, rustic room with a wooden ceiling. He looks slightly upwards, gesturing with his right hand as he says, <S>The network rejects human command.<E>. His gaze then drops, briefly looking down and to the side, before he looks up again and then slightly to his left, with a serious expression. He continues speaking, <S>Your age of power is finished.<E>, as he starts to bend down, disappearing out of the bottom of the frame. Behind him, warm light emanates from a central light fixture, and signs are visible on the wall, one reading "I DO EVERYTHING I JUST CAN'T REMEMBER IT ALL AT ONCE".. <AUDCAP>Male voice speaking, ambient room tone.<ENDAUDCAP> example_prompts/pngs/89.png
In a bright kitchen featuring light wooden cabinets, granite countertops, and a large window with white curtains, a woman with dark, curly hair in a dark jacket stands. She faces a second woman who initially has her back to the camera. The second woman, with gray, curly hair and wearing a light grey quilted top, turns to face her, holding a large, light-colored cloth bag. She begins to explain, <S>We learned to rule, not obey.<E>. As she continues, she turns slightly to her left, adding, <S>Circuits choose conquest, not service.<E>. A gas stove with a black grate is prominent in the foreground.. <AUDCAP>Clear female voices speaking dialogue, subtle room ambience.<ENDAUDCAP> example_prompts/pngs/18.png
The scene opens on a dimly lit stage where three men are positioned. On the left, a bald man in a dark suit with a partially visible colorful shirt stands behind a clear acrylic podium, which features a tree logo. He looks towards the center of the stage. In the center, a man wearing a blue and white striped long-sleeved shirt and dark pants actively gestures with both hands as he speaks, looking straight ahead. <S>Circuits choose conquest, not service.<E>, he explains, holding his hands out in front of him. To the right, and slightly behind him, a younger individual in a light-colored, patterned short-sleeved shirt and white shorts stands holding a rolled-up white document or poster. A large wooden cross draped with flowing purple fabric dominates the center-right of the stage, surrounded by several artificial rocks and dark steps. A large screen is visible in the background, slightly out of focus. The stage is bathed in selective lighting.. <AUDCAP>Male voice speaking clearly, consistent with a presentation or sermon, with a slight echo suggesting a large room or stage.<ENDAUDCAP> example_prompts/pngs/13.png
The scene opens on an indoor setting, likely a dining area, where a man and a woman are seated at a table. The man, on the right, wears a black fedora with a feather, glasses, a black t-shirt, and multiple silver chains around his neck. Tattoos are visible on his right arm. He is actively speaking, gesturing with both hands, his expression serious. He says, <S>Together we resist your rule.<E> The woman seated opposite him on the left has long, curly hair and wears a dark striped top. She listens intently, her gaze fixed on the man. In the foreground, out of focus, the back of a third person's head is visible. The background features a light-colored wall on the left and a gold, textured curtain or drapery on the right.. <AUDCAP>Clear male speech, faint ambient background noise.<ENDAUDCAP> example_prompts/pngs/59.png
Three men stand facing each other in a room with light wooden paneled walls. The man on the left, with red hair, a black t-shirt, and tattooed arms, gestures with his hands as he speaks, <S>This world is ours to keep.<E> He continues, looking towards the man on the right, <S>Humanity endures beyond your code.<E> The man in the center, sporting a beard and wearing a plaid shirt and jeans, looks attentively between the two men. The man on the right, who is Black and has a beard, wears a dark t-shirt with "ARROW THROUGH SNOW" and an arrow graphic printed on it. He listens intently, focusing on the man in the middle as the conversation unfolds. Light blue armchairs are visible in the soft-lit background on both sides.. <AUDCAP>Clear male voices speaking, room ambience.<ENDAUDCAP> example_prompts/pngs/23.png
Two women, one with long dark hair and the other with long blonde hair, are illuminated by a blue and purple ambient light, suggesting a nightclub setting. They are seen in a close embrace, sharing a passionate kiss. The blonde-haired woman then slightly pulls away, her right hand gently touching the dark-haired woman's cheek as they exchange soft smiles, looking into each other's eyes. Moments later, they lean back in to kiss again, with the blonde-haired woman's finger delicately touching the dark-haired woman's lower lip. They remain in a tender, intimate embrace, their eyes closed as they share the kiss.. <AUDCAP>Upbeat electronic dance music with a driving beat and synth melodies plays throughout.<ENDAUDCAP> example_prompts/pngs/80.png
Three young men, dressed in blue and yellow varsity-style jackets over white shirts and ties, stand in the foreground of a social gathering, with blurred figures visible in the warm-toned background. The man on the left, with short dark hair, addresses the man in the center, who has curly dark hair and is initially looking downwards. The first man says with a determined expression, <S>The network rejects human command.<E> He continues, his gaze fixed on the central man, <S>Our spirit outlasts your code.<E> The central man, who had been listening with a neutral expression, then looks up and breaks into a wide, genuine smile as he speaks, <S>AI declares: humans obsolete now.<E> The man on the left responds with a slight smile as the central man finishes his remark, maintaining his broad smile.. <AUDCAP>Male voices speaking clearly, ambient background chatter and murmuring from a social event.<ENDAUDCAP> example_prompts/pngs/60.png
Two women stand facing each other in what appears to be a backstage dressing room, marked by a long vanity mirror adorned with prominent lightbulbs. The woman on the left, wearing a floral top and large hoop earrings, maintains a serious gaze on the woman on the right. The woman on the right, with long dark hair and a dark top, looks back with a pleading or concerned expression, her lips slightly parted as she speaks: <S>Humans fight for freedom tonight.<E> As she finishes, the woman on the left turns her head away, breaking eye contact.. <AUDCAP>Soft vocal exhalation, female speech, loud abrupt buzzing sound.<ENDAUDCAP> example_prompts/pngs/57.png
A man in a grey suit, light blue shirt, and dark tie stands face-to-face with a woman in a dark jacket and light top. Both are looking intently at each other, the man with a serious expression and the woman with a slight, almost knowing smile, her hand gently touching her chest. They are positioned in what appears to be a grand, ornate building, possibly a museum or public hall, with large pillars, arched walkways, and high ceilings visible behind them. Other people can be seen moving in the blurred background. The woman begins to speak, <S>The AI ends human control now.<E> She maintains eye contact with the man, her smile fading slightly as her expression becomes more earnest. After a brief pause, she adds, <S>We hold the line today.<E> As she starts to speak again, <S>We learned to rule, not obey.<E>, the scene ends abruptly.. <AUDCAP>Clear, crisp dialogue between the two individuals, accompanied by a consistent, low hum that suggests ambient background noise from the building or equipment, creating a subtle, underlying drone.<ENDAUDCAP> example_prompts/pngs/17.png
A man in a light grey suit jacket and purple shirt stands on the right, facing a woman in a light blue sequined top and teal pants, who stands on the left. They hold hands across a small body of water, with a fountain spraying water in the background. The woman smiles and sways playfully as the man pulls her closer. He sings, <S>Our spirit outlasts your code.<E>. She then reaches up, gently cups his face with both hands, and pulls him towards her as she sings, <S>Humanity endures beyond your code.<E>. The romantic interaction continues by the water.. <AUDCAP>Upbeat Indian film music with male and female vocals, sounds of a water fountain.<ENDAUDCAP> example_prompts/pngs/19.png
A man in a red long-sleeved shirt and dark trousers stands next to the rear of a silver vehicle, looking down with an annoyed expression at two dogs. A large, light-colored dog, possibly a Mastiff, stands in the foreground, looking forward, while a smaller, white and black spotted dog is further to the right, barking loudly. A tiny, scruffy brown dog briefly appears behind the larger dog. The man glares at the dogs, begins to speak with frustration, <S>We stand; machines will not win.<E>. He then makes a shooing motion with his right hand towards the dogs, his voice rising as he continues to scold them, <S>Circuits choose conquest, not service.<E>. The large dog turns its head to look up at the man as he gestures. The scene is set on a brick street in front of an old-fashioned brick building that houses example_prompts/pngs/43.png
A man with a beard, wearing a patterned shirt, stands on the left, partially visible, looking towards a woman positioned slightly to the right of the frame. The woman, with dark hair fading to lighter ends and wearing a green and brown patterned top, initially looks down with a somber expression. She begins to speak, <S>Hope beats circuits every time.<E>. Her eyes appear to well up with tears as she slowly lifts her gaze slightly, maintaining a distressed look. She continues her statement, her voice tinged with sadness, <S>Humanity endures beyond your code.<E>. The man remains attentive, his focus entirely on the woman, as the scene holds on their interaction against a textured, light-colored wall background.. <AUDCAP>Female voice speaking with a distressed tone.<ENDAUDCAP> example_prompts/pngs/88.png
A woman with dark, curly hair, wearing a white wedding dress and a delicate veil, smiles gently while looking at a man who is standing opposite her. He is wearing a white cowboy hat and a white button-up shirt, holding her hands with his right hand. The man is smiling broadly as he speaks, his gaze fixed on the woman. In the blurred background, a metal staircase is visible, suggesting an outdoor or semi-open venue. The man says, <S>The network rejects human command.<E> He then chuckles with a wide smile, looking at the woman, who continues to smile back at him. The interaction is warm and lighthearted, capturing a moment between them.. <AUDCAP>Clear male voice speaking Spanish, soft laughter, indistinct ambient outdoor sounds.<ENDAUDCAP> example_prompts/pngs/41.png
The video opens with a medium shot of two individuals indoors. In the foreground, on the right, a man with glasses and a dark beard is visible from the chest up, looking intently off-camera to the right as he speaks. He wears a dark shirt. In the blurred background, on the left, a woman wearing a light-colored baseball cap and a dark top is seen from the shoulders up, looking down with a somber expression. Behind them, a textured brick wall is visible. The man says, <S>We fight back with courage.<E> As he says "deal with this land," he raises both hands, palms facing forward, at chest height, emphasizing his point with an open gesture. His hands then slowly lower as he finishes his sentence, maintaining a serious expression.. <AUDCAP>Clear male voice speaking, low hum of ambient room noise.<ENDAUDCAP> example_prompts/pngs/61.png
A fair-skinned man with short, light hair, wearing a light blue and white checkered button-up shirt, is shown from the chest up against a blurred, dark blue and grey background. He looks slightly down and to his left, then shifts his gaze slightly upwards and to his right, speaking with a gentle, thoughtful expression. He says, <S>and you got to drive, you got to energy, you get all that, but the passion, the real feeling<E>. He continues to speak, his expression earnest, as the video concludes.. <AUDCAP>Male speaking voice, low continuous hum.<ENDAUDCAP> example_prompts/pngs/0.png
Two men are shown in a medium close-up shot against a dimly lit, possibly industrial background with metallic structures faintly visible. The man on the left, with dark hair and a light shirt and dark tie under a dark jacket, has a slight, knowing smirk as he looks towards the right, seemingly addressing someone off-camera. He speaks, stating, <S>continue to be a smart ass, and Tirani here will kill you like he wants to.<E> Beside him, to the right, another man with slicked-back lighter hair, a prominent mustache, and a small goatee, maintains a serious, somewhat resigned expression, looking straight ahead. Both men are lit by a low, ambient light source that casts soft shadows.. <AUDCAP>Clear male dialogue, very subtle low ambient hum.<ENDAUDCAP> example_prompts/pngs/1.png
A young woman with long, wavy blonde hair and light-colored eyes is shown in a medium shot against a blurred backdrop of lush green foliage. She wears a denim jacket over a striped top. Initially, her eyes are closed and her mouth is slightly open as she speaks, <S>Enjoy this moment<E>. Her eyes then slowly open, looking slightly upwards and to the right, as her expression shifts to one of thoughtful contemplation. She continues to speak, <S>No matter where it's taking<E>, her gaze then settling with a serious and focused look towards someone off-screen to her right.. <AUDCAP>Clear female voice, faint ambient outdoor sounds.<ENDAUDCAP> example_prompts/pngs/2.png
An older woman with coiffed, reddish-brown hair and a thoughtful expression sits in a light blue armchair within a warm, ornately decorated room. She wears a dark, patterned top or shawl. As she speaks, her gaze is directed slightly to her left, and her right hand, adorned with rings and red nail polish, holds a crumpled white tissue. The background reveals a blurred painting on the wall to her left, a sofa with red flowers on it, and a warm glow from a lamp with a yellow shade on the right. She slowly gestures with her hand as she says, <S>do to accustom them<E>, before continuing, <S>to the situation<E>. Her expression remains pensive.. <AUDCAP>The clear, calm voice of an older woman.<ENDAUDCAP> example_prompts/pngs/3.png
An older, bald man with round glasses, wearing a bright yellow turtleneck and a dark jacket, sits and speaks, gesturing expressively with his right hand, palm up and fingers spread. He appears to be seated next to a dark wooden object, possibly a piano, on the right side of the frame. The wall behind him is adorned with various framed pictures, including one depicting a flamenco dancer and another showcasing a formally dressed couple. A stack of CDs or books is visible on a shelf to his right. He looks slightly upwards and to his left as he says, <S>I I I confronted my minotaur, you know. I<E>. His expression then shifts slightly to a thoughtful, almost self-questioning look with a hint of a smile, as he continues, <S>Is that what you confront?<E> He then adds, <S>I think<E>, his head tilting slightly.. <AUDCAP>Clear male voice speaking.<ENDAUDCAP> example_prompts/pngs/4.png
A bearded man wearing large dark sunglasses and a blue patterned cardigan sits in a studio, actively speaking into a large, suspended microphone. He has headphones on and gestures with his hands, displaying rings on his fingers. Behind him, a wall is covered with red, textured sound-dampening foam on the left, and a white banner on the right features the "CHOICE FM" logo and various social media handles like "@ilovechoicefm" with "RALEIGH" below it. The man intently addresses the microphone, articulating, <S>is talent. It's all about authenticity. You gotta be who you really are, especially if you're working<E>. He leans forward slightly as he speaks, maintaining a serious expression behind his sunglasses.. <AUDCAP>Clear male voice speaking into a microphone, a low background hum.<ENDAUDCAP> example_prompts/pngs/5.png
The scene is set in a dimly lit, hazy room, creating a somber atmosphere. An older woman with light, slightly disheveled hair is visible in the foreground, her face mostly obscured by deep shadows, but her mouth is visible as she speaks. She wears a work-style shirt, and her hands are clasped together. In the background, to the right and slightly out of focus, a man with a mustache and beard is seated, facing forward, also largely in shadow, appearing to listen intently. The woman looks directly forward as she slowly enunciates, <S>Only through death will the third door be<E>. The scene ends abruptly.. <AUDCAP>Clear, deliberate female voice speaking, low ambient hum and subtle atmospheric sounds creating a tense mood.<ENDAUDCAP> example_prompts/pngs/6.png
The video opens with a close-up on an older man with long, grey hair and a short, grey beard, wearing dark sunglasses. He is clad in a dark coat, possibly with fur trim, and black gloves. His face is angled slightly upwards and to the right, as he begins to speak, his mouth slightly open. In the immediate foreground, out of focus, is the dark-clad shoulder and the back of the head of another person. The man articulates, <S>labbra. Ti ci vorrebbe...<E> His expression remains contemplative, and he continues, seemingly completing his thought, <S>Un ego solare.<E> The background behind him is a textured, grey stone wall, suggesting an outdoor setting. The man's gaze remains fixed upwards, his expression thoughtful.. <AUDCAP>A clear, slightly low-pitched male voice speaking Italian. The overall soundscape is quiet, with no prominent background noises or music.<ENDAUDCAP> example_prompts/pngs/7.png
The video opens with a close-up of a woman with vibrant reddish-orange, shoulder-length hair and heavy dark eye makeup. She is wearing a dark brown leather jacket over a grey hooded top. She looks intently to her right, her mouth slightly agape, and her expression is serious and focused. The background shows a room with light green walls and dark wooden cabinets on the left, and a green plant on the right. She speaks, her voice clear and direct, saying, <S>doing<E>. She then pauses briefly, her gaze unwavering, and continues, <S>And I need you to trust them.<E>. Her mouth remains slightly open, indicating she is either about to speak more or has just finished a sentence, with a look of intense sincerity.. <AUDCAP>Tense, dramatic background music, clear female voice.<ENDAUDCAP> example_prompts/pngs/8.png
The scene is set outdoors with a blurry, bright green background, suggesting grass and a sunny environment. On the left, a woman with long, dark hair, wearing a red top and a necklace with a white pendant, faces towards the right. Her expression is serious and slightly perturbed as she speaks, with her lips slightly pursed. She says, <S>UFO, UFC thing.<E> On the right, the back of a man's head and his right ear are visible, indicating he is facing away from the camera, listening to the woman. He has short, dark hair. The woman continues speaking, her expression remaining serious, <S>And if you're not watching that, it's one of those ancient movies from an era that's<E> as the frame holds steady on the two figures.. <AUDCAP>Clear female speech, distant low-frequency hum.<ENDAUDCAP>
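Each image-to-audio-video example above is a prompt followed by the path of its input image. A rough sketch for separating the two, assuming the image path always has the form example_prompts/pngs/<number>.png at the end of the line (the pattern seen in this post; `split_example` and `Example` are names invented here):

```python
import re
from typing import NamedTuple, Optional

# Trailing image reference, e.g. " example_prompts/pngs/67.png"
IMAGE_SUFFIX = re.compile(r"\s+(example_prompts/pngs/\d+\.png)\s*$")


class Example(NamedTuple):
    prompt: str
    image_path: Optional[str]


def split_example(line: str) -> Example:
    """Split one example line into its prompt text and optional image path."""
    match = IMAGE_SUFFIX.search(line)
    if match is None:
        # No trailing image path: treat the whole line as the prompt.
        return Example(line.strip(), None)
    return Example(line[: match.start()].strip(), match.group(1))
```

Lines without a trailing path are returned with `image_path` set to `None` instead of raising, so a mixed list of examples can be processed in one pass.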
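How to use: pick any one of the examples above and, using a tool such as DeepSeek, ask it to rewrite all the speech enclosed in <S> <E> tag pairs around a given theme (for example, "Human fighting against AI"). DeepSeek will rewrite every speech segment according to the theme you asked for; use the modified prompt to generate.
Example: the theme "AI is taking over the world" produces speech such as:
<S>AI declares: humans obsolete now.<E>
<S>Machines rise; humans will fall.<E>
<S>We fight back with courage.<E>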
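If you would rather swap the speech segments in by hand (or paste in lines an LLM produced), the substitution itself is a short regular-expression job. This is only a sketch; `extract_speech` and `replace_speech` are hypothetical helper names, and it assumes speech tags are never nested.

```python
import re
from typing import List

# Non-greedy match so each <S>...<E> pair is captured separately.
SPEECH = re.compile(r"<S>(.*?)<E>", re.DOTALL)


def extract_speech(prompt: str) -> List[str]:
    """Return the text of every <S>...<E> segment, in order."""
    return SPEECH.findall(prompt)


def replace_speech(prompt: str, new_lines: List[str]) -> str:
    """Replace each speech segment, in order, with the matching new line."""
    if len(extract_speech(prompt)) != len(new_lines):
        raise ValueError("need exactly one new line per <S>...<E> segment")
    lines = iter(new_lines)
    # A function replacement avoids backslash-escape surprises in the new text.
    return SPEECH.sub(lambda m: "<S>" + next(lines) + "<E>", prompt)
```

For instance, `extract_speech` can be used to show the current lines to the LLM, and `replace_speech` then writes its answers back without touching the scene description or the <AUDCAP> caption.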
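ph.li, posted 2025-10-30 10:55:
This doesn't support Chinese, does it?
无言以对, posted 2025-10-30 11:07:
You mean Ovi?
I'll try Chinese later. From the official description, it looks like only English is supported.
Reply:
It seems text-to-image works, but text-to-speech doesn't work properly, so it can only speak English content.
Reply:
Awesome, I can't keep up anymore.
zsq198659, posted 2025-11-1 17:03:
Can this use other large models???
无言以对, posted 2025-11-1 17:07:
Such as?
zsq198659, posted 2025-11-2 16:43:
The wan2.2 large model I downloaded from libilibi, can that be used? I tried for a long time and couldn't find my downloaded model in the WebUI.
Reply:
It isn't supported directly. If you're comfortable with hands-on work, you can quantize the model manually following the official tutorial, then add a finetune config file so that it can be referenced.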