Posted on 2025-11-28 12:24:09
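Don't change the parameters; try the defaults first. You're hitting the same thing as the poster below: you most likely selected the wrong model.
I just generated a few more clips and they all came out fine. Here is the console log from one run: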
ComfyUI found: E:\ComfyUI_InfiniteTalk_V2\ComfyUI
T5Encoder: 100%|███████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 51.40it/s]
T5Encoder: 100%|█████████████████████████████████████████████████████████████████████| 24/24 [00:00<00:00, 1354.48it/s]
CUDA Compute Capability: 12.0
Detected model in_channels: 36
Model cross attention type: i2v, num_heads: 40, num_layers: 40
Model variant detected: i2v_480
InfiniteTalk detected, patching model...
model_type FLOW
Loading LoRA: Wan21_I2V_14B_lightx2v_cfg_step_distill_lora_rank64 with strength: 1
Requested to load CLIPVisionModelProjection
loaded completely 12107.0744140625 1208.09814453125 True
Clip embeds shape: torch.Size([1, 257, 1280]), dtype: torch.float32
Combined clip embeds shape: torch.Size([1, 257, 1280])
The local file (ComfyUI\models\torch\hub\torchaudio\models\hdemucs_high_trained.pt) exists. Skipping the download.
[MultiTalk] --- Raw speaker lengths (samples) ---
speaker 1: 192000 samples (shape: torch.Size([1, 1, 192000]))
[MultiTalk] total raw duration = 12.000s
[MultiTalk] multi_audio_type=para | final waveform shape=torch.Size([1, 1, 192000]) | length=192000 samples | seconds=12.000s (expected max of raw)
Using GGUF to load and assign model weights to device...
Loading transformer parameters to cuda:0: 100%|█████████████████████████████████| 1633/1633 [00:00<00:00, 37113.91it/s]
------- Scheduler info -------
Total timesteps: tensor([999, 970, 916, 785], device='cuda:0')
Using timesteps: tensor([999, 970, 916, 785], device='cuda:0')
Using sigmas: tensor([1.0000, 0.9706, 0.9167, 0.7857, 0.0000])
------------------------------
sigmas: tensor([1.0000, 0.9706, 0.9167, 0.7857, 0.0000])
Multitalk audio features shapes (per speaker): [(300, 12, 768)]
Multitalk mode: infinitetalk
Sampling 300 frames in 8 windows, at 480x640 with 4 steps
Sampling audio indices 0-49: 100%|███████████████████████████████████████████████████████| 4/4 [00:54<00:00, 13.71s/it]
Sampling audio indices 42-91: 100%|██████████████████████████████████████████████████████| 4/4 [00:51<00:00, 12.95s/it]
Sampling audio indices 84-133: 100%|█████████████████████████████████████████████████████| 4/4 [00:51<00:00, 12.94s/it]
Sampling audio indices 126-175: 100%|████████████████████████████████████████████████████| 4/4 [00:51<00:00, 12.98s/it]
Sampling audio indices 168-217: 100%|████████████████████████████████████████████████████| 4/4 [00:51<00:00, 12.96s/it]
Sampling audio indices 210-259: 100%|████████████████████████████████████████████████████| 4/4 [00:57<00:00, 14.48s/it]
Audio embedding for subject 0 not long enough: 300, need 301, padding...
Padding length: 4
Sampling audio indices 252-301: 100%|████████████████████████████████████████████████████| 4/4 [00:51<00:00, 12.97s/it]
Allocated memory: memory=0.232 GB
Max allocated memory: max_memory=9.531 GB
Max reserved memory: max_reserved=12.090 GB
ComfyUI found: E:\ComfyUI_InfiniteTalk_V2\ComfyUI
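
If you want to compare your own log against this one, here is a rough sketch of the arithmetic behind the numbers above. It only uses values read off the log (192000 samples, 12.000s, 300 audio feature frames, and the printed "Sampling audio indices" ranges). The stride of 42 and span of 50 are inferred from those printed ranges, not taken from the node's source code, and `window_ranges` is a made-up helper name for illustration. The log header says "8 windows" while seven ranges appear in the paste; the sketch only reproduces the seven printed ranges and does not try to explain the difference.

```python
# Values copied from the log above.
SAMPLES = 192_000        # "speaker 1: 192000 samples"
SECONDS = 12.0           # "total raw duration = 12.000s"
FEATURE_FRAMES = 300     # "Multitalk audio features shapes ... (300, 12, 768)"

# Assumptions inferred from the printed ranges (0-49, 42-91, ...).
STRIDE = 42              # distance between consecutive window starts
SPAN = 50                # indices covered by each printed range (inclusive)


def window_ranges(first_start, last_start, stride, span):
    """Hypothetical helper: rebuild the (start, end) pairs printed in the log.

    `end` is inclusive, matching the "0-49" style used by the progress bars.
    """
    return [(s, s + span - 1) for s in range(first_start, last_start + 1, stride)]


sample_rate = SAMPLES / SECONDS               # audio samples per second
frames_per_second = FEATURE_FRAMES / SECONDS  # audio feature frames per second
overlap = SPAN - STRIDE                       # indices shared by neighbouring windows

ranges = window_ranges(0, 252, STRIDE, SPAN)

# Ranges whose end index runs past the available audio feature frames.
needs_padding = [r for r in ranges if r[1] >= FEATURE_FRAMES]

print("sample rate:", sample_rate)
print("feature frames per second:", frames_per_second)
print("overlap between windows:", overlap)
for start, end in ranges:
    print(f"audio indices {start}-{end}")
print("ranges past the end of the audio:", needs_padding)
```

Only the last range (252-301) goes past the 300 feature frames, which is consistent with the "not long enough ... padding" message appearing right before that window. If your log shows a different resolution, step count, or model variant line than the one above, that is a hint the workflow is not running with the default model and settings.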