TY - GEN
T1 - Generating SysML Behavior Models via Large Language Models
T2 - 16th International Conference on Internetware, Internetware 2025
AU - Wang, Yuan
AU - Ge, Ning
AU - Liu, Jiangxi
AU - Cao, Zhilong
AU - Chen, Zheping
AU - Hu, Chunming
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/10/27
Y1 - 2025/10/27
N2 - Model-driven development (MDD) is a mainstream approach in safety-critical domains, providing standardized modeling languages like SysML. SysML behavior models describe system dynamics and are widely used in aerospace, manufacturing, and IoT. However, manual modeling is inefficient and prone to quality issues, restricting MDD’s practical adoption. The potential of LLMs in SysML behavior model generation and its challenges remain unclear, making it a key research topic. This empirical study evaluates LLMs in generating three types of SysML behavior models, focusing on performance and hallucinations. Our contributions are twofold: (1) constructing and publishing a dataset of 107 SysML behavior models spanning various domains; (2) analyzing hallucinations in LLM-assisted SysML behavior model generation from syntactic and semantic perspectives and proposing model-checking rules to mitigate them and enhance model quality. We analyze hallucinations in SysML behavior model generation, classifying them and exploring their possible causes. The evaluation results show that while the models generally meet syntactic requirements, they consistently lack semantic accuracy. Across both phases, LLMs achieve over 90% grammar accuracy. For semantic accuracy, the average F1-score for ACT reaches 95%, while SD drops to just 50%. These results demonstrate that while our model-checking rules effectively correct format and syntax, they are insufficient for addressing deeper semantic gaps. Overcoming these challenges requires advanced strategies, such as counterexamples and simulation traces, to provide optimal feedback. Additionally, model-checking in LLM-based generation is costly, and reducing this cost is another critical issue to address in the future.
KW - SysML behavior models
KW - hallucination
KW - large language models
KW - model checking
KW - model generation
UR - https://www.scopus.com/pages/publications/105023698390
U2 - 10.1145/3755881.3755926
DO - 10.1145/3755881.3755926
M3 - Conference contribution
AN - SCOPUS:105023698390
T3 - 16th International Conference on Internetware, Internetware 2025 - Proceedings
SP - 366
EP - 377
BT - 16th International Conference on Internetware, Internetware 2025 - Proceedings
A2 - Mei, Hong
A2 - Lv, Jian
A2 - Jin, Zhi
A2 - Li, Xuandong
A2 - Zimmermann, Thomas
A2 - Li, Ge
A2 - Bu, Lei
A2 - Xia, Xin
PB - Association for Computing Machinery, Inc
Y2 - 20 June 2025 through 22 June 2025
ER -