Abstract
Natural-language software requirements documents are analyzed using big-data methods. Based on latent Dirichlet allocation (LDA), a three-layer Bayesian network probabilistic topic model, each document is modeled as a probability mixture over multiple topics, and each latent topic as a probability mixture over multiple words. The model's topic and word probability distributions are estimated with the Gibbs sampling algorithm. The document topics computed from the requirements document set are mapped to requirement viewpoints, and the software requirements are then analyzed with a multi-viewpoint method: viewpoints that are decomposed and projected onto different sub-problem domains are refined independently, system requirements are transformed into viewpoint requirements, and the viewpoints are integrated to form the system's requirements specification. The word probability distribution of each document topic is mapped to the stakeholder knowledge and requirements-specification knowledge of the corresponding viewpoint, which provides a reliable basis for reusing requirements knowledge and makes requirements analysis more rigorous and complete.
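The estimation step described in the abstract can be illustrated with a minimal collapsed Gibbs sampler for LDA. This is a sketch under stated assumptions, not the paper's implementation: the function name `lda_gibbs`, the symmetric hyperparameters `alpha` and `beta`, the iteration count, and the toy requirement snippets in the usage note are all hypothetical choices made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents.

    Hypothetical helper for illustration. Returns (theta, phi, vocab):
    theta[d][k] estimates P(topic k | document d) and
    phi[k][v] estimates P(word vocab[v] | topic k).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized full conditional P(z = k | all other assignments).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed point estimates taken from the final counts.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha)
         for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return theta, phi, vocab
```

As a usage illustration with made-up requirement snippets, `lda_gibbs([["login", "password", "user"], ["report", "export", "pdf"], ["user", "login"]], n_topics=2)` returns one topic distribution per document and one word distribution per topic. In the approach the abstract describes, each topic would then be read as a candidate requirement viewpoint, and its highest-probability words as that viewpoint's stakeholder and specification vocabulary.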
Author
ZHANG Guo-sheng (张国生), Yunnan University, Kunming 650500, China
Source
Journal of China Academy of Electronics and Information Technology (《中国电子科学研究院学报》)
Indexed in the Peking University Core Journals list (北大核心)
2020, Issue 2, pp. 147-151, 158 (6 pages)
Funding
National Natural Science Foundation of China (Grant No. 61379032).
Keywords
LDA
topic
Gibbs sampling
viewpoint
probability distribution
big data
requirements specification
About the author
ZHANG Guo-sheng (born 1968), male, from Yunnan, associate professor. Main research interest: software engineering. E-mail: zhanggs@ynu.edu.cn