I think one thing to remember is that it’s going to be on Xbox Game Pass when it releases. So if you are subscribed to that, you can download it at no extra cost and play it before you decide whether to buy it or not.
Though, given it’s supposed to be 125 GB, I’m seriously wondering if I want to dedicate that much storage space on my Xbox Series S to it…
Generally speaking, the way training works is this:
You put together a folder of pictures, all the same size. It would’ve been 1024x1024 in this case. Other models have used 768x768 or 512x512. For every picture, you also have a text file with a description.
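To make the folder layout concrete, here's a rough sketch of how a script might pair each picture with its description. It assumes the text file has the same name as the picture with a .txt extension, which is a common convention but not something every trainer requires. The function name load_pairs and the list of image extensions are made up for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset actually uses.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def load_pairs(folder):
    """Return (image_path, caption) pairs for every image with a same-named .txt file."""
    pairs = []
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        caption_path = image_path.with_suffix(".txt")
        if not caption_path.exists():
            continue  # no description for this picture, so skip it
        caption = caption_path.read_text(encoding="utf-8").strip()
        pairs.append((image_path, caption))
    return pairs
```

This only matches files up. Checking that every picture really is the same size would need an image library, so it's left out here.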
The training software takes a picture, slices it into squares, generates a square of random noise the same size, then trains on how to change that noise into that square. It associates that training with tokens from the description that went with that picture. And it keeps doing this, over and over, for every picture in the folder.
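Here's a toy version of one training step, just to show the idea. It assumes the usual noise-prediction setup, where noise is mixed into the picture and the model is scored on how well it guesses that noise. Real trainers use a neural network and usually work on a compressed version of the picture, so treat this as a sketch. All the names (add_noise, remove_noise, keep) are made up for illustration.

```python
import math
import random


def add_noise(pixels, keep, rng):
    """Blend pixel values with random noise; keep=1.0 is untouched, keep near 0.0 is almost pure noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(keep) * p + math.sqrt(1.0 - keep) * n
             for p, n in zip(pixels, noise)]
    return noisy, noise


def mse(a, b):
    """Mean squared error: the score training tries to push toward zero."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def remove_noise(noisy, predicted_noise, keep):
    """Undo add_noise using a guess at the noise that was mixed in."""
    return [(x - math.sqrt(1.0 - keep) * n) / math.sqrt(keep)
            for x, n in zip(noisy, predicted_noise)]


rng = random.Random(0)
pixels = [0.1, 0.5, 0.9, 0.3]  # stand-in for one tiny grayscale square
noisy, noise = add_noise(pixels, keep=0.5, rng=rng)

# A perfect model would guess exactly `noise`; training nudges the network toward that.
loss_if_perfect = mse(noise, noise)
recovered = remove_noise(noisy, noise, keep=0.5)
```

With a perfect guess the loss is zero and the original square comes back. The model never gets that good, but each step moves it a little closer for pictures whose descriptions share those tokens.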
Then later, when someone types a prompt into the software, it tokenizes it, generates more random noise, and uses the denoising methods associated with the tokens you typed in. The pictures in the folder aren’t actually kept by it anywhere.
From the side of the person doing the training, though, it’s just a matter of putting together the pictures and descriptions, setting some settings, and letting the training software do its work.
(No money involved in this one. One person trained it and plopped it on a website where people can download loras for free…)