Budget
Split raw compute from practical project allowance.
Compute is not the expensive part.
Generating test stills should not require 100 GPU hours. A larger allowance covers learning, failed setup, model downloads, endpoint tests, storage, and rejected clips. Raw compute for a first test is more likely to cost single-digit pounds to the low tens of pounds.
| Thing | Concrete calculation | Compute-only cost | Practical cash to allow |
|---|---|---|---|
| No-source ComfyUI smoke test | 2-6 hours on RunPod RTX 4090 at $0.69/hr. This is for proving the workflow, not final-quality video. | £1-£4 | £5-£30 |
| Public endpoint video tests | RunPod Wan I2V is listed at $0.30 per 5s request. 50-100 attempts = $15-$30. | £12-£24 | £20-£50 |
| First private LoRA | A LoRA can reportedly be trained from about 20 images on 24GB VRAM in ~15 minutes. Realistically, allow several runs. | £1-£10 | £15-£50 |
| LoRA + 100 stills + 50 clips | Use a 4090 or 48GB GPU for stills and LoRA training; use a 48-80GB GPU or endpoint I2V for serious video attempts. | £20-£60 | £50-£150 |
| Paid pilot pack | More attempts, edit time, thumbnails, captions, creator review. Compute is still not the main cost. | £30-£120 | £100-£500 if doing it yourselves; more if hiring help. |
| Local hardware | A £2k-£5k NVIDIA box only beats £0.55/hr rental after roughly 3,600-9,100 rented hours (purchase price divided by hourly rate, ignoring electricity). | Not relevant now | Do not buy for discovery. |
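
The calculations in the table can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the exchange rate of £0.80 per $1 is an assumption inferred from the table itself (where $15-$30 is shown as £12-£24), not a live rate.

```python
# Assumed exchange rate, inferred from the table ($15-$30 shown as £12-£24).
GBP_PER_USD = 0.80


def rental_cost_gbp(hours: float, usd_per_hour: float) -> float:
    """Compute-only cost of renting a GPU by the hour, in pounds."""
    return hours * usd_per_hour * GBP_PER_USD


def endpoint_cost_gbp(attempts: int, usd_per_request: float) -> float:
    """Compute-only cost of pay-per-request endpoint calls, in pounds."""
    return attempts * usd_per_request * GBP_PER_USD


def break_even_hours(hardware_gbp: float, rental_gbp_per_hour: float) -> float:
    """Rented hours at which cumulative rental spend equals the hardware price.

    Ignores electricity, resale value, and setup time.
    """
    return hardware_gbp / rental_gbp_per_hour


# Smoke test row: 2-6 hours at $0.69/hr is about £1.10 to £3.31.
smoke_test = (rental_cost_gbp(2, 0.69), rental_cost_gbp(6, 0.69))

# Endpoint row: 50-100 requests at $0.30 each is £12 to £24.
endpoint = (endpoint_cost_gbp(50, 0.30), endpoint_cost_gbp(100, 0.30))

# Local hardware row: £2k-£5k against £0.55/hr is about 3,636 to 9,091 hours.
break_even = (break_even_hours(2000, 0.55), break_even_hours(5000, 0.55))
```

Changing `GBP_PER_USD` or the hourly rate shifts every row proportionally, which is why the practical allowance column carries most of the uncertainty.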
GPU choice
Quality target and hardware target are not the same thing.
4090 is enough to learn. It is not the best final video tier.
A 24GB RTX 4090 is a good cheap CUDA machine for ComfyUI, still-image generation, LoRA training, inpainting, upscaling, and short quantised/offloaded video tests. For high-quality final image-to-video, rent 48GB or 80GB NVIDIA GPUs and compare output quality before committing.
| Hardware | Use it for | Do not expect | Practical view |
|---|---|---|---|
| MBP M4 Pro, 48GB unified memory | Planning, prompt writing, reviewing outputs, small local tests. | Fast ComfyUI video diffusion. Apple unified memory is not equivalent to CUDA VRAM for these workflows. | Bad proxy for rented NVIDIA performance. |
| RTX 4090, 24GB | Learning ComfyUI, SDXL/FLUX stills, LoRA training, inpainting, 480p/short I2V tests. | Comfortable full-quality large video models without quantisation/offload compromises. | Best first rental tier. |
| RTX 6000 Ada, 48GB | More comfortable video testing, larger batches, fewer memory workarounds. | Guaranteed good output. Source quality and workflow still dominate. | Good serious-test tier if price is close to 4090. |
| A100/H100, 80GB | High-end I2V tests, larger models, less offloading, faster iteration. | A magic quality button. It mainly buys speed, headroom, and fewer compromises. | Use for final comparison runs, not first setup. |
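
One way to read the hardware table is "rent the smallest tier that fits the workload". The sketch below encodes only the VRAM sizes listed above; the function name is hypothetical, and the VRAM requirement is an estimate you supply yourself (for example from a model card), not something the code can work out.

```python
# VRAM tiers from the hardware table, smallest first.
TIERS = [
    (24, "RTX 4090"),
    (48, "RTX 6000 Ada"),
    (80, "A100/H100"),
]


def smallest_tier(required_vram_gb: float) -> str:
    """Return the smallest listed GPU tier with at least the required VRAM."""
    for vram_gb, name in TIERS:
        if required_vram_gb <= vram_gb:
            return name
    raise ValueError(
        f"{required_vram_gb}GB exceeds the largest listed tier; "
        "consider quantisation or offloading"
    )
```

For example, `smallest_tier(20)` returns `"RTX 4090"` and `smallest_tier(40)` returns `"RTX 6000 Ada"`. This only answers whether a workload fits; it says nothing about output quality, which the table attributes mainly to source material and workflow.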
Technical stack
Recommended for a private, consenting source workflow.
ComfyUI on rented GPU
Use a RunPod/Vast pod with a persistent volume. Do not keep private source files scattered across public tools.
Still images first
Use SDXL/FLUX-style workflows for stills, inpainting, pose control, and face correction. Video quality depends on the keyframe.
Private LoRA
Train one creator/style LoRA. Keep it private. Do not upload the weights to Civitai or shared model hosts.
Wan I2V
Most practical first video target. Start at 480p/720p and 4-6 seconds. Generate many short attempts, then edit.
WanVideoWrapper
Useful practical reference for Wan workflows, FP8/GGUF models, examples, block swapping, and VRAM caveats. Pin versions; custom nodes can break.
LTX-Video, HunyuanVideo
LTX is useful for speed tests. Hunyuan can look good but is heavier. Compare one baseline at a time.
Full video fine-tune
Too much complexity for discovery. Get a still LoRA and image-to-video loop pack working first.
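
The "pin versions" advice above can be followed by recording which commit each custom node is on while the setup works. The sketch below is one possible approach, not a ComfyUI feature: it assumes custom nodes are installed as git clones under ComfyUI's `custom_nodes` folder, and the function names and lock file layout are hypothetical.

```python
import json
import subprocess
from pathlib import Path
from typing import Callable


def git_head(repo: Path) -> str:
    """Return the commit hash currently checked out in a git repository."""
    result = subprocess.run(
        ["git", "-C", str(repo), "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def snapshot_custom_nodes(
    custom_nodes_dir: Path,
    get_commit: Callable[[Path], str] = git_head,
) -> dict[str, str]:
    """Map each git-managed custom node folder to its current commit hash."""
    pins = {}
    for node_dir in sorted(custom_nodes_dir.iterdir()):
        # Skip loose files and folders that were not installed via git.
        if node_dir.is_dir() and (node_dir / ".git").exists():
            pins[node_dir.name] = get_commit(node_dir)
    return pins


def write_lock(pins: dict[str, str], lock_path: Path) -> None:
    """Save the pins as JSON so a working setup can be rebuilt later."""
    lock_path.write_text(json.dumps(pins, indent=2, sort_keys=True) + "\n")
```

A typical call would be `write_lock(snapshot_custom_nodes(Path("ComfyUI/custom_nodes")), Path("custom_nodes.lock.json"))`, where both paths are placeholders for your own layout. If an update breaks a workflow, checking each node back out to its recorded hash with `git checkout` should restore the previous state.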
ComfyUI workflow
What to actually build first.
| Stage | Workflow | Output target | Pass/fail metric |
|---|---|---|---|
| 1. Generic smoke test | ComfyUI template, no private source. Generate stills, inpaint, upscale, then I2V. | 50 stills, 10 video attempts. | Can you get 3 usable 4-6s clips in one evening? |
| 2. Private LoRA | Train creator/costume LoRA from approved source set. Test weights on neutral prompts first. | 1 LoRA, 100 still candidates. | Creator approves 10+ stills without an "almost me but wrong" reaction. |
| 3. I2V loop pack | Animate only approved stills. Do not try to generate full scenes directly. | 30-80 video attempts. | 3-8 clips survive face drift, hand/body errors, and bad motion. |
| 4. Paid mini-pack | Final stills + loops + SFW teaser crops + explicit OF version if allowed by her rules. | 10-20 stills, 3-6 loops. | Fan response beats normal PPV baseline or creates custom demand. |
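
The numeric pass/fail metrics in stages 1-3 amount to counting survivors against a threshold. The sketch below is a hypothetical helper for tracking that, along with the cost per surviving item; the class and field names are illustrative, and the per-attempt cost is whatever you measure or assume.

```python
from dataclasses import dataclass


@dataclass
class StageResult:
    attempts: int                      # stills or clips generated
    survivors: int                     # items that passed review
    cost_per_attempt_gbp: float = 0.0  # assumed or measured cost per attempt

    @property
    def survival_rate(self) -> float:
        """Fraction of attempts that passed review."""
        return self.survivors / self.attempts if self.attempts else 0.0

    @property
    def cost_per_survivor_gbp(self) -> float:
        """Total spend divided by surviving items; infinite if none survive."""
        if self.survivors == 0:
            return float("inf")
        return self.attempts * self.cost_per_attempt_gbp / self.survivors


def passes(result: StageResult, min_survivors: int) -> bool:
    """True if the stage met its minimum count of usable items."""
    return result.survivors >= min_survivors


# Stage 3 example: 50 I2V attempts at an assumed £0.24 each, 5 clips survive.
loop_pack = StageResult(attempts=50, survivors=5, cost_per_attempt_gbp=0.24)
stage_3_ok = passes(loop_pack, min_survivors=3)
```

In this example the survival rate is 10% and each usable clip costs about £2.40 in compute. Stage 4 is judged on fan response, which this helper does not try to model.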
Source material
A good source set saves more money than cheaper GPU time.
20-30 images
- Face closeups, half-body, full-body.
- Front, 3/4, profile.
- Neutral light, clean background.
40-60 images
- Same costume + alternate costume.
- Several expressions.
- Makeup/hair closeups and full outfit shots.
Bad training data
- Other people in frame.
- Old/under-18/ambiguous material.
- Brand logos or exact copyrighted character marks.
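
The coverage checklist above can be checked mechanically before any GPU time is spent. The sketch below assumes a hand-written manifest in which each approved image is tagged with a `framing` and an `angle`; that schema, the tag spellings, and the 20-image minimum (taken from the lower end of the "20-30 images" set) are all assumptions for illustration.

```python
REQUIRED_FRAMINGS = {"face closeup", "half-body", "full-body"}
REQUIRED_ANGLES = {"front", "3/4", "profile"}
MIN_IMAGES = 20  # lower bound of the 20-30 image set


def coverage_gaps(manifest: list[dict]) -> list[str]:
    """List what the source set is missing; an empty list means it is covered."""
    gaps = []
    if len(manifest) < MIN_IMAGES:
        gaps.append(f"only {len(manifest)} images; want at least {MIN_IMAGES}")
    framings = {item["framing"] for item in manifest}
    angles = {item["angle"] for item in manifest}
    for missing in sorted(REQUIRED_FRAMINGS - framings):
        gaps.append(f"no {missing} shots")
    for missing in sorted(REQUIRED_ANGLES - angles):
        gaps.append(f"no {missing} angle")
    return gaps
```

A manifest entry might look like `{"file": "img_001.jpg", "framing": "half-body", "angle": "3/4"}`. The check covers counts and variety only; the exclusions under "Bad training data" still need a human review of every image.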
First useful experiment
Small enough to run; specific enough to learn from.
| Experiment | Spend cap | Run | Decision |
|---|---|---|---|
| No-source technical test | £25 | Run 50 stills and 10 I2V clips using generic references on a 4090-tier rental. | If this is painful, do not use her source material yet. |
| Creator-source LoRA test | £80 compute/API | Train one LoRA, produce 100 stills, animate top 20. Use a 48-80GB GPU or endpoint I2V for the top candidates. | Continue only if she approves likeness and at least 3 clips look sellable. |
| OF paid mini-pack | £150 compute/edit buffer | 10-20 stills, 3-6 loops, labelled AI-assisted and creator-approved. | Compare PPV conversion, comments, refunds, custom requests. |
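
A spend cap only helps if spending is tracked against it as the experiment runs. The sketch below is a minimal, hypothetical ledger for that purpose; the class name and behaviour (refusing a cost that would exceed the cap) are design assumptions, and the caps are the ones in the table above.

```python
from dataclasses import dataclass


@dataclass
class SpendCap:
    name: str
    cap_gbp: float
    spent_gbp: float = 0.0

    def remaining(self) -> float:
        """Pounds left before the cap is reached."""
        return self.cap_gbp - self.spent_gbp

    def record(self, cost_gbp: float) -> bool:
        """Add a cost; return False and record nothing if it would exceed the cap."""
        if self.spent_gbp + cost_gbp > self.cap_gbp:
            return False
        self.spent_gbp += cost_gbp
        return True


# Caps taken from the experiment table.
caps = [
    SpendCap("No-source technical test", 25.0),
    SpendCap("Creator-source LoRA test", 80.0),
    SpendCap("OF paid mini-pack", 150.0),
]
```

Calling `record` before each rental session or batch of endpoint requests turns the cap into a stop signal: a `False` return is the point to review results and make the decision in the last column rather than keep spending.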
Source links
Specific sources used for the rewrite.