Researchers at the Electronics and Telecommunications Research Institute (ETRI) are poised to unveil technology that combines generative artificial intelligence (AI) with visual intelligence to create an image in as little as two seconds after a sentence is entered. ETRI has announced plans to release a series of five models to the public.
Among them, KOALA and Ko-LLaVA, though compact in size, boast impressive capabilities that have reshaped the generative AI market. KOALA is a text-to-image model that rapidly generates images from input sentences or text. Ko-LLaVA, by contrast, is an interactive AI model that can answer questions about images or videos, bridging the gap between visual information and textual queries.
These models are foundational elements of generative AI technology, facilitating seamless conversions across different modalities, including text, speech, image, and video, thus opening up new avenues for AI applications.
KOALA, with its remarkable speed and compact design, represents a significant advance in image generation. Dr. Lee Yong-joo, Head of ETRI's Visual Intelligence Research Center, explains that KOALA uses a knowledge distillation technique, which compresses the model while preserving its performance. This allows KOALA to generate images in just one to two seconds, faster than existing openly available AI models.
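For readers unfamiliar with knowledge distillation, the general idea can be sketched in a few lines of Python. This is a generic illustration, not ETRI's training code: a small "student" model is penalized whenever its output distribution drifts from that of a larger "teacher". The function names, the use of classification-style logits, and the temperature value are all assumptions made for the example; how KOALA applies distillation to an image generator is not described in the announcement.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper for illustration only. The T^2 factor is the usual
    scaling that keeps gradient magnitudes comparable across temperatures.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0
    )
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]        # made-up teacher scores
    close_student = [3.8, 1.1, 0.4]  # nearly imitates the teacher
    far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher
    print(distillation_loss(close_student, teacher))
    print(distillation_loss(far_student, teacher))
```

The loss is zero when the student reproduces the teacher exactly and grows as the two diverge, so minimizing it pushes the smaller model toward the larger model's behavior.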
Similarly, the Ko-LLaVA model extends the capabilities of conversational AI by integrating textual questions and answers with visual information. By seamlessly combining text models with images or video, Ko-LLaVA facilitates intuitive interactions, offering promising applications across various industries such as education, broadcasting, and content creation.
Dr. Lee emphasizes the necessity of continued research and development in AI technology, despite challenges such as limited computing resources and data availability. ETRI remains committed to advancing the field by conducting fundamental studies on video generation and developing core algorithms to further enhance AI capabilities.
The implications of these advancements are vast. KOALA and Ko-LLaVA hold promise for numerous industries, including education, broadcasting, content creation, and conversational AI. As these models pave the way for future innovations, experts expect them to serve as catalysts for a new era of intelligent technologies.

