
New diffusion model revolutionizes AI image generation, solving key issues
Kapil Kajal/Interesting Engineering
Generative artificial intelligence (AI) has historically struggled to produce consistent images, often misinterpreting details such as fingers and facial symmetry. Moreover, when prompted to generate images of different sizes and resolutions, these models can fail. Rice University computer scientists have developed a new method for generating images using pre-trained diffusion models to curb such issues.
Diffusion models are a type of generative AI that learn by adding layer after layer of random noise to the images they are trained on, then generate new images by removing that noise.
ElasticDiffusion
Moayed Haji Ali, a doctoral student in computer science at Rice University, presented the new approach, called ElasticDiffusion, in a peer-reviewed paper at the 2024 Institute of Electrical and Electronics Engineers (IEEE) Conference on Computer Vision and Pattern Recognition (CVPR) in Seattle. “Diffusion models like Stable Diffusion, Midjourney, and DALL-E create impressive results, generating fairly lifelike and photorealistic images,” Haji Ali said. “But they have a weakness: They can only generate square images. So, in cases where you have different aspect ratios, like on a monitor or a smartwatch … that’s where these models become problematic.”
If you instruct a model like Stable Diffusion to generate a non-square image, such as one with a 16:9 aspect ratio, the elements used to construct the resulting image may become repetitive.
That repetition manifests as abnormal deformities in the image or image subjects, such as individuals with six fingers or a strangely elongated car. The way these models are trained also contributes to the problem. “If you train the model on only images that are a certain resolution, they can only generate images with that resolution,” said Vicente Ordóñez-Román, an associate professor of computer science who advised Haji Ali on his work alongside Guha Balakrishnan, assistant professor of electrical and computer engineering.
Overfitting
Ordóñez-Román explained that overfitting is a common problem in AI, where the model becomes too specialized in the training data. “You could solve that by training the model on a wider variety of images, but it’s expensive and requires massive amounts of computing power: hundreds, maybe even thousands of graphics processing units,” Ordóñez-Román said.
According to Haji Ali, digital noise used by diffusion models can be translated into a signal with two data types: local and global. The local signal contains detailed pixel-level information, such as the shape of an eye or the texture of a dog’s fur, while the global signal captures the image’s overall outline. “One reason diffusion models need help with non-square aspect ratios is that they usually package local and global information together,” said Haji Ali, who worked on synthesizing motion in AI-generated videos before joining Ordóñez-Román’s research group at Rice for his Ph.D. studies. “When the model tries to duplicate that data to account for the extra space in a non-square image, it results in visual imperfections.”
Different approach
The ElasticDiffusion method explained in Haji Ali’s paper takes a unique approach to generating images.
Instead of combining both signals, ElasticDiffusion separates the local and global signals into conditional and unconditional generation paths. It takes the difference between the conditional and unconditional model outputs, yielding a score that captures the overall image information. After that, the unconditional path, which carries the local pixel-level detail, is applied to the image in quadrants, filling in the details one square at a time. Global information, such as the image aspect ratio and the content of the image (e.g., a dog or a person running), remains separate. This ensures that the AI does not confuse the signals and repeat data. The result is a cleaner image at any aspect ratio, produced without additional training. The only drawback to ElasticDiffusion relative to other diffusion models is time. Currently, Haji Ali’s method takes 6 to 9 times as long to make an image.
The goal is to reduce that to the same inference time as other models like Stable Diffusion or DALL-E.
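The split between a global and a local signal described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the two "model" functions are made-up stand-ins rather than Stable Diffusion or the researchers' code, the names TILE, prompt_bias and guidance are hypothetical, and the difference is taken as conditional minus unconditional, which is the usual classifier-free guidance convention and not something the article confirms. The sketch only shows the shape of the idea: the global signal is estimated once on a square thumbnail, and the local signal is filled in one square tile at a time on a non-square canvas.

```python
import random

TILE = 4  # hypothetical square size the pretend model was "trained" on


def unconditional_score(patch):
    """Stand-in for the unconditional path: local detail for one square tile."""
    mean = sum(map(sum, patch)) / (TILE * TILE)
    return [[mean - v for v in row] for row in patch]


def conditional_score(patch, prompt_bias):
    """Stand-in for the prompt-conditioned path on the same square tile."""
    return [[v + prompt_bias for v in row] for row in unconditional_score(patch)]


def tile_corners(height, width):
    """Top-left corners of the square tiles that cover the canvas exactly once."""
    return [(t, l) for t in range(0, height, TILE) for l in range(0, width, TILE)]


def downsample(canvas):
    """Block-average a canvas of any aspect ratio down to a TILE x TILE thumbnail."""
    bh, bw = len(canvas) // TILE, len(canvas[0]) // TILE
    return [[sum(canvas[r * bh + i][c * bw + j]
                 for i in range(bh) for j in range(bw)) / (bh * bw)
             for c in range(TILE)] for r in range(TILE)]


def elastic_step(canvas, prompt_bias, guidance=1.0):
    """One toy update; canvas height and width must be multiples of TILE."""
    h, w = len(canvas), len(canvas[0])
    bh, bw = h // TILE, w // TILE
    # Global signal: conditional minus unconditional, on the square thumbnail.
    thumb = downsample(canvas)
    cond = conditional_score(thumb, prompt_bias)
    uncond = unconditional_score(thumb)
    global_small = [[c - u for c, u in zip(cr, ur)] for cr, ur in zip(cond, uncond)]
    out = [[0.0] * w for _ in range(h)]
    # Local signal: the unconditional path, applied one square tile at a time.
    for top, left in tile_corners(h, w):
        patch = [row[left:left + TILE] for row in canvas[top:top + TILE]]
        local = unconditional_score(patch)
        for i in range(TILE):
            for j in range(TILE):
                y, x = top + i, left + j
                # Nearest-neighbour upsampling of the global signal to this pixel.
                out[y][x] = local[i][j] + guidance * global_small[y // bh][x // bw]
    return out


rng = random.Random(0)
canvas = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(8)]  # 8 x 16, non-square
score = elastic_step(canvas, prompt_bias=0.5, guidance=2.0)
```

Because each tile is processed at the square size the stand-in model expects, the canvas can take any aspect ratio that the tiles divide evenly. A real implementation would also have to deal with seams between tiles and with sizes that are not exact multiples, which this sketch ignores.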
(Except for the headline, this story has not been edited by VoM News staff and is published from the syndicated feed)