prompts · 3 min read · 7.9.2025

Testing Wan2.2 Best Practices for I2V

Hello everyone! I wanted to share some tests I ran to find a good setup for WAN 2.2 image-to-video generation.

First, much appreciation for everyone who has posted about WAN 2.2 setups, whether asking for help or making suggestions. There have been several "best practices" posts lately, and they were incredibly informative. I was having a hard time figuring out which of the many recommended "best practices" strike the best compromise between quality and speed, so I put together a small test suite for myself.

I generated a set of prompts with Google Gemini by feeding it information about how WAN 2.2 likes to be prompted and the different capabilities I wanted to test (camera movement, subject movement, prompt adherence, etc.). I kept some of the suggested prompts that seemed to illustrate these well (and threw out a bunch that just failed). I then picked four setups to test: two that are basically the default workflow with and without the LightX2V LoRA, one with no LoRAs and a sampler/scheduler combination I have seen recommended a few times (dpmpp_2m/sgm_uniform), and one following the three-KSampler approach described in this post. There are obviously many more options to try for a more complete picture, but I had to start somewhere, and every additional variation takes that much more time to generate. I plan to run more tests over time, but I wanted to get something out before another model comes along and makes it all obsolete.

This is all specifically I2V. I cannot say whether the results for these setups would carry over to T2V; that would need another series of tests.

Observations:
- I would never use the default 4-step workflow. That said, I imagine it could do better with different samplers or other optimizations.
- The three-KSampler approach seems like a good balance of speed and quality, but with the settings I used it also diverges the most from the default 20-step video (apart from the default 4-step).
- The three-KSampler setup often misses the end of the prompt. Adding an extra, unnecessary event at the end can help. For example, in the necromancer video where only the arms come out of the ground, I added "the necromancer grins" to the end of the prompt, and that made the body rise up toward the end as well (it didn't look great, but I think that was more down to the LoRAs). I need to get better at prompting.
- I should have recorded the time of each generation as part of the comparison. I may add that later.

What does everyone think? I would love to hear other people's opinions on which setup wins when you weigh time against quality. Does anyone have specific comparisons they would like to see? If there are a lot of requests I probably can't do them all, but I could at least do a sample. If you have better prompts (including a starting image, or a prompt to generate one), I would be grateful for them and could run some more tests with them. And does anyone know of a site where I can upload multiple images/videos that preserves the metadata, so I can more easily share the workflows/prompts for everything?
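To make the comparison easier to talk about, here is a rough Python sketch of the four configurations written out as plain data. The numeric values (steps, CFG, the step split for the three-KSampler stages) and the sampler/scheduler for the "default" runs are placeholders I picked for illustration, not necessarily the exact settings in my workflows.

# Rough sketch of the four WAN 2.2 I2V test configurations, as plain data.
# Numeric values and the sampler/scheduler for the "default" runs are
# placeholders, not the exact workflow settings.

TEST_CONFIGS = [
    {
        "name": "default_4step_lightx2v",
        "notes": "default workflow + LightX2V LoRA (low step count)",
        "loras": ["lightx2v"],
        "steps": 4,             # placeholder
        "cfg": 1.0,             # placeholder
        "sampler": "euler",     # placeholder
        "scheduler": "simple",  # placeholder
    },
    {
        "name": "default_20step_no_lora",
        "notes": "default workflow, no speed LoRAs",
        "loras": [],
        "steps": 20,            # placeholder
        "cfg": 3.5,             # placeholder
        "sampler": "euler",     # placeholder
        "scheduler": "simple",  # placeholder
    },
    {
        "name": "no_lora_dpmpp_2m_sgm_uniform",
        "notes": "no LoRAs, sampler/scheduler combo I have seen recommended",
        "loras": [],
        "steps": 20,            # placeholder
        "cfg": 3.5,             # placeholder
        "sampler": "dpmpp_2m",
        "scheduler": "sgm_uniform",
    },
    {
        "name": "three_ksampler",
        "notes": "three chained KSamplers, per the linked post",
        # placeholder split: high-noise model without LoRA, high-noise with
        # LoRA, then low-noise with LoRA, each handling a slice of the steps
        "stages": [
            {"model": "high_noise", "loras": [],            "steps": (0, 2)},
            {"model": "high_noise", "loras": ["lightx2v"],  "steps": (2, 4)},
            {"model": "low_noise",  "loras": ["lightx2v"],  "steps": (4, 8)},
        ],
        "sampler": "euler",     # placeholder
        "scheduler": "simple",  # placeholder
    },
]

if __name__ == "__main__":
    for cfg in TEST_CONFIGS:
        print(f"{cfg['name']}: {cfg['notes']}")

Keeping the setups as plain data like this also makes it easy to loop over them when queueing the same prompt/image pair through each one.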
I am happy to share everything that went into creating these generations, but I don't know the easiest way to do that, and I don't think 20 exported .json files are the answer.

Submitted by /u/dzdn1

