The existing software bug localization models treat the source file as natural language, which leads to the loss of syntactical and structure information of the source file. A bug localization model based on syntactic...The existing software bug localization models treat the source file as natural language, which leads to the loss of syntactical and structure information of the source file. A bug localization model based on syntactical and semantic information of source code is proposed. Firstly, abstract syntax tree(AST) is divided based on node category to obtain statement sequence. The statement tree is encoded into vectors to capture lexical and syntactical knowledge at the statement level.Secondly, the source code is transformed into vector representation by the sequence naturalness of the statement. Therefore,the problem of gradient vanishing and explosion caused by a large AST size is obviated when using AST to the represent source code. Finally, the correlation between bug reports and source files are comprehensively analyzed from three aspects of syntax, semantics and text to locate the buggy code. Experiments show that compared with other standard models, the proposed model improves the performance of bug localization, and it has good advantages in mean reciprocal rank(MRR), mean average precision(MAP) and Top N Rank.展开更多
In order to deal with the complex association relationships between classes in an object-oriented software system,a novel approach for identifying refactoring opportunities is proposed.The approach can be used to dete...In order to deal with the complex association relationships between classes in an object-oriented software system,a novel approach for identifying refactoring opportunities is proposed.The approach can be used to detect complex and duplicated many-to-many association relationships in source code,and to provide guidance for further refactoring.In the approach,source code is first transformed to an abstract syntax tree from which all data members of each class are extracted,then each class is characterized in connection with a set of association classes saving its data members.Next,classes in common associations are obtained by comparing different association classes sets in integrated analysis.Finally,on condition of pre-defined thresholds,all class sets in candidate for refactoring and their common association classes are saved and exported.This approach is tested on 4 projects.The results show that the precision is over 96%when the threshold is 3,and 100%when the threshold is 4.Meanwhile,this approach has good execution efficiency as the execution time taken for a project with more than 500 classes is less than 4 s,which also indicates that it can be applied to projects of different scales to identify their refactoring opportunities effectively.展开更多
基金supported by the National Key R&D Program of China (2018YFB1702700)。
文摘The existing software bug localization models treat the source file as natural language, which leads to the loss of syntactical and structure information of the source file. A bug localization model based on syntactical and semantic information of source code is proposed. Firstly, abstract syntax tree(AST) is divided based on node category to obtain statement sequence. The statement tree is encoded into vectors to capture lexical and syntactical knowledge at the statement level.Secondly, the source code is transformed into vector representation by the sequence naturalness of the statement. Therefore,the problem of gradient vanishing and explosion caused by a large AST size is obviated when using AST to the represent source code. Finally, the correlation between bug reports and source files are comprehensively analyzed from three aspects of syntax, semantics and text to locate the buggy code. Experiments show that compared with other standard models, the proposed model improves the performance of bug localization, and it has good advantages in mean reciprocal rank(MRR), mean average precision(MAP) and Top N Rank.
文摘In order to deal with the complex association relationships between classes in an object-oriented software system,a novel approach for identifying refactoring opportunities is proposed.The approach can be used to detect complex and duplicated many-to-many association relationships in source code,and to provide guidance for further refactoring.In the approach,source code is first transformed to an abstract syntax tree from which all data members of each class are extracted,then each class is characterized in connection with a set of association classes saving its data members.Next,classes in common associations are obtained by comparing different association classes sets in integrated analysis.Finally,on condition of pre-defined thresholds,all class sets in candidate for refactoring and their common association classes are saved and exported.This approach is tested on 4 projects.The results show that the precision is over 96%when the threshold is 3,and 100%when the threshold is 4.Meanwhile,this approach has good execution efficiency as the execution time taken for a project with more than 500 classes is less than 4 s,which also indicates that it can be applied to projects of different scales to identify their refactoring opportunities effectively.