Comments (41)

  • mishu2
    Having the ability to do real-time video generation on a single workstation GPU is mind-blowing. I'm currently hosting a video generation website, also on a single GPU (with a queue), which is also something I didn't even think possible a few years ago (my Show HN from earlier today, coincidentally: https://news.ycombinator.com/item?id=46388819). Interesting times.
  • jjcm
    Looks like there is some quality reduction, but nonetheless 2s to generate a 5s video on a 5090 for WAN 2.1 is absolutely crazy. Excited to see more optimizations like this moving into 2026.
  • kristopolous
    This is probably the best tool for this stuff now: https://github.com/deepbeepmeep/Wan2GP
    It has fastwan and will probably have this soon. It's requested in multiple tickets: https://github.com/deepbeepmeep/Wan2GP/issues
  • bsenftner
    Video AI acceleration is tricky. Many of the acceleration LoRAs and cache-level accelerations currently in use have an impact on the generated video that is subtle at first, which makes them poison for video work: the AIs become dumber to the degree that they can't follow camera directions, character performances suffer, lip sync becomes lip flap, and body motions lose quality and become repetitive.
    Now, I've not tested TurboDiffusion yet, but I am very actively generating AI video; I probably did a half hour of finished video clips yesterday. There is no test for this issue yet, and most people have yet to recognize it as an issue.
  • codingbuddy
    We are scarily close to real-time personalization of video, which, if you agree with this NeurIPS paper [1], may lead to someone inadvertently creating “digital heroin”.
    [1] https://neurips.cc/virtual/2025/loc/san-diego/poster/121952
  • sroussey
    I want to use this on a website!
  • redundantly
    Now if someone could release an optimization like this for the M4 Max I would be so happy. Last time I tried generating a video it was something like an hour for a 480p 5-second clip.
  • villgax
    I mean, the baselines were deliberately worse and not how anyone would actually use these to begin with, except maybe noobs. And the quoted number covers only the DiT steps, not the other encoding and decoding steps, whose cost is actually still quite high. No actual use of FA4/CUTLASS-based kernels or TRT at any point.