Research Findings on On-Premises LLM-Based Materials Data Presented at the Korean Society of Heat Treatment / Students from the Artificial Intelligence Materials Design Laboratory
- 2026-07-03 / Hong Yu-min (홍유민)

Ph.D. candidate Park Se-jin and undergraduate research assistants Moon Hyun-jun and Choi Jun-hyuk (advisor: Professor Cho Ki Sub) of the Artificial Intelligence Materials Design Laboratory in the Department of New Materials Engineering at Kookmin University presented their work on on-premises LLM-based materials and industrial databases at the 2026 Spring Academic Conference of the Korean Society of Heat Treatment. Park Se-jin also received the Jeong In Award for Outstanding Paper in recognition of the research.
Park Se-jin presented a paper titled “A Methodology for Constructing a CPSP Database for Ni-Based Superalloys by Combining Adaptive Schema Induction and Semantic Alignment Based on Contrastive Learning.” The work proposes a more refined and reproducible way to build a Composition–Processing–Structure–Property (CPSP) database, the core foundation for AI-based materials design and property prediction.
Ni-based superalloys are high-performance materials used in extreme environments such as aircraft engines and gas turbines, and their composition, processing, microstructure, and properties are closely interdependent. Existing fixed-schema approaches to structuring data, however, struggle to accommodate the variety of information reported in the literature and cannot systematically represent CPSP relationships. In response, Park proposed a schema-induction-based methodology that derives the necessary structure inductively from the literature data itself. Using this approach in an on-premises LLM environment, Park organized the process history, microstructure, and mechanical properties of Ni-based superalloys into a hierarchical JSON structure, and applied semantic alignment and contrastive learning to keep the data structurally consistent.
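To make the idea of inducing a schema from the data more concrete, the following is a minimal sketch in Python. It is not the method presented at the conference: it simply collects the field paths that appear in hierarchical CPSP records and grows the schema as new fields show up. The records, field names, and values are hypothetical placeholders, and the functions `field_paths` and `induce_schema` are invented for this illustration.

```python
import json

# Hypothetical records, shaped as an LLM might extract them from two papers.
# Field names and values are illustrative placeholders, not data from the study.
records = [
    {
        "composition": {"Ni": "bal.", "Cr": 19.0},
        "processing": {"solution_treatment": {"temperature_C": 1080, "time_h": 4}},
        "property": {"yield_strength_MPa": 900},
    },
    {
        "composition": {"Ni": "bal.", "Al": 5.5},
        "processing": {"aging": {"temperature_C": 760, "time_h": 16}},
        "structure": {"gamma_prime_fraction": 0.6},
    },
]


def field_paths(node, prefix=""):
    """Yield a dotted path for every leaf field in a nested dict."""
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from field_paths(value, path)
        else:
            yield path


def induce_schema(recs):
    """Return the sorted union of leaf paths seen across all records."""
    schema = set()
    for rec in recs:
        schema.update(field_paths(rec))
    return sorted(schema)


schema = induce_schema(records)
print(json.dumps(schema, indent=2))
```

In this sketch the second record adds `structure` and `processing.aging` fields that a schema fixed in advance from the first record alone would have had no place for. A real system would presumably also need to merge differently named fields that mean the same thing, which is the role the article attributes to semantic alignment.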
Undergraduate research assistants Moon Hyun-jun and Choi Jun-hyuk presented a paper titled “Establishment of a Multimodal LLM-Based Automated Data Extraction and Integration Pipeline in an On-Premises Environment Considering Industrial Data Security.” The work proposes a multimodal LLM-based pipeline that automatically extracts and integrates unstructured data (image-based documents, experimental data, and process data) that is scattered across industrial sites.
Materials and process data in industrial settings are spread across many document formats and image files, and the lack of standardized formats makes the data hard to analyze and use. Security concerns have also limited the use of external cloud-based AI systems on sensitive internal manufacturing and process data. The team therefore built its automated extraction and integration pipeline with multimodal LLMs running on-premises rather than in the cloud. To extract key information from image-based documents reliably, the team designed a cross-validation system that runs the three best-performing multimodal LLMs, selected through preliminary performance evaluations, in parallel and compares their outputs for review. This reduces reliance on any single model and, by combining AI with human review, improves the accuracy and reliability of data refinement.
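The cross-checking step described above can be illustrated with a short Python sketch. It assumes, purely for illustration, that each of the three models returns a flat dictionary of field values for the same document; fields on which at least two models agree are accepted, and the rest are set aside for a human reviewer. The function `cross_check`, the model names, and the sample values are all hypothetical and are not taken from the team's pipeline.

```python
from collections import Counter


def cross_check(extractions, min_agree=2):
    """Compare per-field values returned by several models.

    Returns (accepted, needs_review): fields where at least `min_agree`
    models gave the same value, and fields left for a human reviewer.
    """
    accepted, needs_review = {}, {}
    fields = set().union(*(out.keys() for out in extractions.values()))
    for field in sorted(fields):
        values = {name: out.get(field) for name, out in extractions.items()}
        counts = Counter(v for v in values.values() if v is not None)
        top = counts.most_common(1)
        if top and top[0][1] >= min_agree:
            accepted[field] = top[0][0]
        else:
            # Keep every model's answer so the reviewer can see the disagreement.
            needs_review[field] = values
    return accepted, needs_review


# Hypothetical outputs for one scanned process sheet.
outputs = {
    "model_a": {"furnace_temp_C": "850", "hold_time_min": "30"},
    "model_b": {"furnace_temp_C": "850", "hold_time_min": "80"},
    "model_c": {"furnace_temp_C": "850", "hold_time_min": "3O"},
}

accepted, needs_review = cross_check(outputs)
print("accepted:", accepted)
print("needs review:", needs_review)
```

Here all three hypothetical models agree on the furnace temperature, so it is accepted automatically, while the three different readings of the hold time are routed to a person. Majority agreement is only one possible rule; the article does not say how the team's system compares results.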
These results reflect the work of Kookmin University’s Artificial Intelligence Materials Design Laboratory, which focuses on on-premises LLM-based materials databases, industrial data automation, and AI-based materials analysis and design. The research is significant because it shows that high-quality data can be extracted and structured automatically even in research and industrial environments where security and reproducibility are critical. It is expected to find future application in process monitoring and the development of high-reliability materials for the aerospace, energy, defense, and metals industries.
This content is translated from Korean to English using the AI translation service DeepL and may contain translation errors such as jargon/pronouns. If you find any, please send your feedback to kookminpr@kookmin.ac.kr so we can correct them.