Is there a paper or technical report on the new Flux image generation models?

Emil · August 12, 2024, 7:10pm

I’ve seen some amazing results from Black Forest Labs’ new Flux models all over X and LinkedIn.

Do you know whether the models were described in a technical report or paper? I couldn’t find anything. If not, are there any relevant papers that shed light on the methodology used? (I suppose Rombach’s most recent work would be a good place to start.)

DataDynamo · August 15, 2024, 10:47am

Not at the moment. BFL set up the GitHub repository and uploaded the code, but hasn’t added anything since the initial upload.

The results are impressive, but they are probably due to the large dataset. I doubt that flow matching offers significant improvements over regular diffusion. However, without a paper, this is just speculation.

Since it’s not truly open-source, I would suggest skipping Flux and focusing on something else unless they at least share the training code.

DataDynamo · August 15, 2024, 10:49am

This information is relevant, and it also says, ‘We will release a more detailed tech report soon.’

NeuralNinja · August 15, 2024, 10:50am

Flow matching speeds up training and reduces the number of inference steps compared to regular diffusion methods. However, this comes at some cost in image quality. The quality gap narrows as scale increases, but a diffusion model of the same scale would probably still give better results.
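For anyone unfamiliar with why flow matching is associated with fewer inference steps, here is a minimal NumPy sketch. It is not Flux's code (no report or training code has been published, as noted above); it only illustrates the generic straight-path idea. The function names, the toy target `mu`, and the convention that t=0 is noise and t=1 is data are all assumptions made for this example.

```python
import numpy as np

def flow_matching_pair(x1, rng):
    """Build one training example for a straight-path flow matching loss.

    Assumed convention: t=0 is pure noise, t=1 is data.
    x1 is a batch of data with shape (batch, dim).
    """
    x0 = rng.standard_normal(x1.shape)        # noise endpoint
    t = rng.uniform(size=(x1.shape[0], 1))    # one time per example
    xt = (1.0 - t) * x0 + t * x1              # point on the straight line
    target = x1 - x0                          # constant velocity along that line
    return xt, t, target

def euler_sample(velocity, x0, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity(x, i * dt)
    return x

# Toy case: every data point equals mu, so the exact velocity field is known
# in closed form and no network is needed.
mu = np.array([2.0, -1.0])

def exact_velocity(x, t):
    return (mu - x) / (1.0 - t)

rng = np.random.default_rng(0)
noise = rng.standard_normal((5, 2))
one_step = euler_sample(exact_velocity, noise, num_steps=1)
four_steps = euler_sample(exact_velocity, noise, num_steps=4)
```

In this toy setting the trajectories are exactly straight, so one Euler step and four Euler steps both land on `mu`. With a learned velocity field on real images the trajectories generally curve, so more steps are needed in practice, and the sketch says nothing about the quality comparison with diffusion.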

EDM offers similar benefits over DDPM, such as faster convergence and fewer inference steps, but it also results in lower quality. In terms of image quality, I would rank them as follows: DDPM > EDM > RFM.