New diffusion model revolutionizes AI image generation, solving key issues

Image: Rice University

VoM News Desk | September 15, 2024 | 08:43 AM IST | AI & ML, Artificial Intelligence

Kapil Kajal / Interesting Engineering

Generative artificial intelligence (AI) has historically struggled to produce consistent images, often misinterpreting details such as fingers and facial symmetry. Moreover, when prompted to generate images of different sizes and resolutions, these models can fail. Rice University computer scientists have developed a new method for generating images using pre-trained diffusion models to curb such issues.

Diffusion models are a type of generative AI that learns by adding layer after layer of random noise to the images it is trained on, and then generates new images by removing that noise.

ElasticDiffusion

Moayed Haji Ali, a doctoral student in computer science at Rice University, presented the new approach, called ElasticDiffusion, in a peer-reviewed paper at the 2024 Institute of Electrical and Electronics Engineers (IEEE) Conference on Computer Vision and Pattern Recognition (CVPR) in Seattle.

“Diffusion models like Stable Diffusion, Midjourney, and DALL-E create impressive results, generating fairly lifelike and photorealistic images,” Haji Ali said. “But they have a weakness: They can only generate square images. So, in cases where you have different aspect ratios, like on a monitor or a smartwatch … that’s where these models become problematic.”

If you instruct a model like Stable Diffusion to generate a non-square image, such as one with a 16:9 aspect ratio, the elements used to construct the resulting image may become repetitive.

That repetition manifests as abnormal deformities in the image or its subjects, such as people with six fingers or a strangely elongated car. The way these models are trained also contributes to the problem.

“If you train the model on only images that are a certain resolution, they can only generate images with that resolution,” said Vicente Ordóñez-Román, an associate professor of computer science who advised Haji Ali on his work alongside Guha Balakrishnan, assistant professor of electrical and computer engineering.

Overfitting

Ordóñez-Román explained that overfitting is a common problem in AI, in which a model becomes too specialized to its training data.

“You could solve that by training the model on a wider variety of images, but it’s expensive and requires massive amounts of computing power, hundreds, maybe even thousands of graphics processing units,” Ordóñez-Román said.

According to Haji Ali, digital noise used by diffusion models can be translated into a signal with two data types: local and global. The local signal contains detailed pixel-level information, such as the shape of an eye or the texture of a dog’s fur, while the global signal captures the image’s overall outline.

“One reason diffusion models need help with non-square aspect ratios is that they usually package local and global information together,” said Haji Ali, who worked on synthesizing motion in AI-generated videos before joining Ordóñez-Román’s research group at Rice for his Ph.D. studies. “When the model tries to duplicate that data to account for the extra space in a non-square image, it results in visual imperfections.”

Different approach

The ElasticDiffusion method explained in Haji Ali’s paper takes a unique approach to generating images.

Instead of combining both signals, ElasticDiffusion separates the local and global signals into conditional and unconditional generation paths. It takes the difference between the conditional and unconditional model outputs, which yields a score carrying the overall image information. After that, the unconditional path, which carries the local pixel-level detail, is applied to the image in quadrants, filling in the details one square at a time. Global information, such as the image aspect ratio and the content of the image (e.g., a dog or a person running), remains separate. This ensures that the AI does not confuse the signals and repeat data.

The result is a clearer image, regardless of the aspect ratio, with no additional training required. The only drawback to ElasticDiffusion relative to other diffusion models is time. Currently, Haji Ali’s method takes six to nine times as long to make an image.

The goal is to reduce that to the same inference time as other models like Stable Diffusion or DALL-E.
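The separation of local and global signals described above can be illustrated with a toy sketch. This is not the authors' implementation: `toy_eps` is a hypothetical stand-in for a pre-trained denoiser that only accepts square inputs, `PATCH`, `scale`, and `resize_nearest` are illustrative names, and computing the global signal at the native square size before resizing it is an assumption of this sketch rather than something the article states.

```python
import numpy as np

PATCH = 8  # hypothetical "native" square size the toy model accepts


def toy_eps(x, cond=None):
    """Hypothetical stand-in for a pre-trained denoiser; not a real model."""
    assert x.shape == (PATCH, PATCH), "toy model only handles square inputs"
    eps = 0.1 * x              # pretend unconditional noise estimate
    if cond is not None:
        eps = eps + cond       # pretend the prompt shifts the estimate
    return eps


def resize_nearest(a, h, w):
    """Nearest-neighbour resize of a 2-D array to shape (h, w)."""
    rows = np.arange(h) * a.shape[0] // h
    cols = np.arange(w) * a.shape[1] // w
    return a[rows][:, cols]


def elastic_step(x, cond, scale=7.5):
    """One illustrative guidance step; assumes h and w are multiples of PATCH."""
    h, w = x.shape
    # Local signal: unconditional estimate, filled in one square at a time.
    local = np.zeros_like(x)
    for i in range(0, h, PATCH):
        for j in range(0, w, PATCH):
            local[i:i + PATCH, j:j + PATCH] = toy_eps(x[i:i + PATCH, j:j + PATCH])
    # Global signal: difference between conditional and unconditional estimates,
    # computed once at the square size and resized to the target (assumption).
    small = resize_nearest(x, PATCH, PATCH)
    direction = toy_eps(small, cond) - toy_eps(small)
    return local + scale * resize_nearest(direction, h, w)


# Usage: a non-square 16 x 24 input, which the toy model could not take whole.
x = np.random.default_rng(0).normal(size=(16, 24))
cond = np.full((PATCH, PATCH), 0.5)
print(elastic_step(x, cond).shape)  # (16, 24)
```

Because the toy model is only ever called on square inputs, the non-square output is assembled without retraining, at the cost of extra model calls per step, which is in line with the slower inference the article reports.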

(Except for the headline, this story has not been edited by VoM News staff and is published from the syndicated feed)
