SK Telecom Releases New Vision-Language and Document Analysis AI Models on Open-Source Platform
  • Jung So-yeon
  • Approved 2025.07.29 10:28

Headquarters of SK Telecom

SK Telecom announced the official release of its latest vision-language model (VLM) and versatile document interpretation technology on July 29. The models, based on its proprietary large language model (LLM) called ‘A.X’, were made available through the open-source platform Hugging Face.

The newly released models, ‘A.X Encoder’ and ‘A.X 4.0 VL Light’, are freely available for research and commercial use. The launch is part of SK Telecom’s ongoing effort to advance its AI technology and expand its industrial applications. Together with the four models released earlier in July (the standard and lightweight versions of A.X 4.0 and two versions of A.X 3.1 built from scratch), the company now offers a total of six models. SK Telecom plans to continue refining its A.X 4.0-based inference models and to further expand the practical scope of its LLM technology.

The ‘A.X Encoder’ is a natural language processing (NLP) encoder model optimized for large-scale LLM training. It can process up to 16,384 tokens and offers up to three times faster inference and twice the training speed of previous models, which lets it handle longer documents and more complex contexts. With approximately 149 million parameters, the model achieved an average score of 85.47 on the KLUE natural language understanding benchmark, surpassing the 80.19 scored by the global open-source model ‘RoBERTa-base’.
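To make the effect of the longer context concrete, the sketch below counts how many non-overlapping windows a document must be split into at a given token limit. The 16,384 figure comes from the article; the 512-token comparison limit and the 12,000-token document are assumptions chosen for illustration, and the helper `windows_needed` is hypothetical rather than part of any SK Telecom release.

```python
# Context length quoted in the article for A.X Encoder.
AX_ENCODER_MAX_TOKENS = 16_384
# Assumption (not from the article): a 512-token limit, as commonly
# used by BERT/RoBERTa-style encoders, for comparison only.
SHORT_CONTEXT_MAX_TOKENS = 512


def windows_needed(doc_tokens: int, max_tokens: int) -> int:
    """Return how many non-overlapping windows cover `doc_tokens` tokens."""
    if doc_tokens < 0:
        raise ValueError("doc_tokens must be non-negative")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    # Integer ceiling division, so no floating-point rounding is involved.
    return (doc_tokens + max_tokens - 1) // max_tokens


if __name__ == "__main__":
    doc_tokens = 12_000  # hypothetical document length in tokens
    print(windows_needed(doc_tokens, SHORT_CONTEXT_MAX_TOKENS))  # 24
    print(windows_needed(doc_tokens, AX_ENCODER_MAX_TOKENS))     # 1
```

Under these assumptions, a document that would need 24 separate passes at a 512-token limit fits in a single pass at 16,384 tokens, so the encoder sees the whole context at once.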

Lightweight yet Powerful: ‘A.X 4.0 VL Light’

‘A.X 4.0 VL Light’ is a multimodal Korean vision-language model trained on a diverse dataset that combines visual and textual data. It performs strongly on complex visual material such as tables, graphs, and manufacturing diagrams, which makes it well suited to corporate environments.

Despite its lighter structure of 7 billion parameters, the model delivers performance comparable to medium-sized models. It achieved an average score of 79.4 on Korean visual benchmarks and 60.2 on textual benchmarks, placing it among Korea’s top lightweight models. On the K-Viscuit benchmark, which measures multimodal understanding of Korean culture and context, it scored 80.2, and on KoBizDoc, which measures document and chart comprehension, it scored 89.8. Notably, it uses approximately 41% fewer text tokens than the Qwen2.5-VL-32B model for the same input, improving cost efficiency for enterprise users.
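As a rough illustration of what the token figure means for cost, the sketch below applies the article's "approximately 41% fewer text tokens" to a baseline token count and converts it to a cost. The baseline volume and the per-million-token price are placeholders invented for the example, not published figures, and both helper functions are hypothetical.

```python
# Reduction quoted in the article ("approximately 41% fewer text tokens").
TOKEN_REDUCTION = 0.41


def estimated_tokens(baseline_tokens: int, reduction: float = TOKEN_REDUCTION) -> int:
    """Estimate the token count after applying a fractional reduction."""
    if baseline_tokens < 0:
        raise ValueError("baseline_tokens must be non-negative")
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return round(baseline_tokens * (1.0 - reduction))


def estimated_cost(tokens: int, price_per_million: float) -> float:
    """Cost of `tokens` tokens at a price quoted per one million tokens."""
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    baseline = 1_000_000  # hypothetical token volume for the baseline model
    price = 1.0           # hypothetical price per million tokens
    reduced = estimated_tokens(baseline)
    print(reduced)
    print(estimated_cost(baseline, price), estimated_cost(reduced, price))
```

If the price per token is the same for both models, the cost falls in proportion to the token count, so the saving is about 41% for any workload under that assumption.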

Kim Taeyoon, head of SK Telecom’s foundation models division, emphasized the importance of proprietary technology in realizing AI sovereignty. He stated, “Securing independent technological capabilities is at the core of AI sovereignty. We will continue to enhance our technology and strengthen collaboration within our consortium to increase our global AI competitiveness.”



  • URL: www.koreaittimes.com | Tel: +82-2-578-0434 / +82-10-2442-9446 | North America Dept: 070-7008-0005
  • Email: info@koreaittimes.com | Publisher, Editor: Chung Younsoo
  • Masthead: Korea IT Times. Copyright(C) Korea IT Times, All rights reserved.