Abstract: Purpose: Detecting research fields or topics and understanding their dynamics helps the scientific community decide on the establishment of scientific fields. It also supports better collaboration with governments and businesses. This study aims to investigate the development of research fields over time, translating it into a topic detection problem. Design/methodology/approach: To achieve the objectives, we propose a modified deep clustering method to detect research trends from the abstracts and titles of academic documents. Document embedding approaches are used to transform documents into vector-based representations. The proposed method is evaluated against a benchmark dataset by comparing it with combinations of different embedding and clustering approaches and with classical topic modeling algorithms (i.e., LDA). A case study is also conducted that explores the evolution of Artificial Intelligence (AI) by detecting the research topics or sub-fields in related AI publications. Findings: Evaluation with clustering performance indicators shows that the proposed method outperforms similar approaches on the benchmark dataset. Using the proposed method, we also show how the topics have evolved over the past 30 years, taking advantage of a keyword extraction method for cluster tagging and labeling, which demonstrates the context of the topics. Research limitations: We noticed that it is not possible to generalize one solution for all downstream tasks. Hence, the solutions must be fine-tuned or optimized for each task and even for each dataset. In addition, the interpretation of cluster labels can be subjective and vary with readers' opinions. It is also very difficult to evaluate the labeling techniques, which further limits the explanation of the clusters. Practical implications: The case study shows, in a real-world example, how the proposed method enables researchers and reviewers of academic research to detect, summarize, analyze, and visualize research topics from decades of academic documents. This helps the scientific community and all related organizations analyze the fields quickly and effectively by establishing and explaining the topics. Originality/value: In this study, we introduce a modified and tuned deep embedding clustering coupled with Doc2Vec representations for topic extraction. We also use a concept extraction method as a labeling approach. The effectiveness of the method has been evaluated in a case study of AI publications, where we analyze the AI topics of the past three decades.
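Illustration (not from the paper): the embed, cluster, and label pipeline described in this abstract can be sketched in a few lines. This is only a toy under stated assumptions: normalized term-count vectors stand in for Doc2Vec, plain k-means stands in for the modified deep embedding clustering, and most-frequent terms stand in for the keyword extraction method. The titles, the stopword list, and all function names are hypothetical.

```python
from collections import Counter
import math

# Assumed minimal stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "for", "and", "in", "with", "to", "on"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def embed(docs):
    """Turn each document into an L2-normalized term-count vector (stand-in for Doc2Vec)."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    vectors = []
    for d in docs:
        counts = Counter(tokenize(d))
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors

def kmeans(vectors, seeds, iterations=10):
    """Plain k-means (stand-in for deep clustering); `seeds` index the initial centroids."""
    centroids = [vectors[i][:] for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def label_clusters(docs, labels, top_n=2):
    """Tag each cluster with its most frequent terms, ties broken alphabetically."""
    tags = {}
    for k in sorted(set(labels)):
        counts = Counter(w for d, l in zip(docs, labels) if l == k for w in tokenize(d))
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        tags[k] = [w for w, _ in ranked[:top_n]]
    return tags

# Hypothetical publication titles, not taken from the study's dataset.
titles = [
    "neural networks for image recognition",
    "deep neural networks image classification",
    "expert systems rule based reasoning",
    "rule based expert systems knowledge",
]
vocab, vectors = embed(titles)
labels = kmeans(vectors, seeds=[0, 2])
tags = label_clusters(titles, labels)
print(labels, tags)
```

Fixing the seed documents keeps the toy deterministic; a real run would need a proper initialization strategy and the clustering performance indicators mentioned in the abstract to compare alternatives.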
Abstract: While XML web services have become recognized as a solution for business-to-business transactions, many problems remain to be solved. For example, it is not easy to manipulate business documents of existing standards such as RosettaNet and UN/EDIFACT EDI, traditionally regarded as an important resource for managing B2B relationships. As a starting point for the complete implementation of B2B web services, this paper deals with how to support B2B business documents in XML web services. First, basic requirements for driving XML web services by business documents are introduced. As a solution, this paper presents how to express B2B business documents in WSDL, a core standard for XML web services. This approach facilitates the reuse of existing business documents and enhances interoperability between implemented web services. Furthermore, it suggests how to link with other conceptual modeling frameworks such as ebXML/UMM, built on a rich heritage of electronic business experience.
Abstract: Establishing a campus network is a systems engineering task that requires cooperation among departments. Standardization is the key to solving the problem, and its core is the standardization of documents. This paper therefore concentrates on discussing the relevant problems in light of our campus network practice.
Abstract: Plain Language has made a great difference nowadays. As it turns out, Plain Language is effective for expressing ideas clearly, concisely, and systematically. However, contemporary practitioners need to review the origin and development of the Plain Language Movement and to examine whether Plain Language policies have been thoroughly implemented in every federal document. Examining a contemporary federal document against the Guidelines for Document Designers reveals existing problems for further development.
Funding: Supported by the Shanghai International Studies University (Grant No.: 2011114061)
Abstract: Purpose: This paper presents a method for improving the accuracy of automatic indexing of Chinese-English mixed documents. Design/methodology/approach: Based on the inherent characteristics of Chinese-English mixed texts and on cybernetics theory, we propose an integrated control method for indexing documents. It consists of 'feed-forward control', 'in-progress control' and 'feed-back control'. An experiment was conducted to investigate the effect of the proposed method. Findings: The method distinguishes Chinese and English documents by grammatical structure and word formation rules. Implementing the method in the three phases of automatic indexing of Chinese-English mixed documents gave encouraging results: precision increased from 88.54% to 97.10% and recall improved from 97.37% to 99.47%. Research limitations: The indexing method is relatively complicated and the whole indexing process requires substantial human intervention. Because pattern matching is based on a brute-force (BF) approach, indexing efficiency is reduced to some extent. Practical implications: The research is of both theoretical significance and practical value in improving the accuracy of automatic indexing of multilingual documents (not confined to Chinese-English mixed documents). The proposed method will benefit not only the indexing of life science documents but also the indexing of documents in other subject areas. Originality/value: So far, few studies have been published on methods for increasing the accuracy of multilingual automatic indexing. This study provides insights into the automatic indexing of multilingual documents, especially Chinese-English mixed documents.
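Illustration (not from the paper): the brute-force (BF) pattern matching named in the limitations can be sketched as a naive scan that compares an index term against every position of the text. This is a minimal sketch, assuming index terms come from a fixed vocabulary; the sample sentence, the vocabulary, and the function names are hypothetical.

```python
def brute_force_find(text, pattern):
    """Return every start offset of `pattern` in `text` by naive character comparison."""
    hits = []
    n, m = len(text), len(pattern)
    if m == 0:
        return hits
    for i in range(n - m + 1):
        j = 0
        # Compare character by character until a mismatch or a full match.
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:
            hits.append(i)
    return hits

def index_terms(text, vocabulary):
    """Map each vocabulary term that occurs in `text` to its offsets."""
    found = {}
    for term in vocabulary:
        offsets = brute_force_find(text, term)
        if offsets:
            found[term] = offsets
    return found

# Hypothetical Chinese-English mixed sentence and term list.
text = "基因表达与DNA甲基化：DNA methylation 调控基因表达"
vocabulary = ["基因表达", "DNA", "methylation", "蛋白质"]
result = index_terms(text, vocabulary)
print(result)
```

Each term costs up to len(text) × len(term) comparisons in the worst case, which is consistent with the reduced indexing efficiency the abstract attributes to the BF approach.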
Abstract: Purpose: Accurately assigning the document type of review articles in citation index databases such as Web of Science (WoS) and Scopus is important. This study aims to investigate, on a large scale, the document type assignment of review articles in Web of Science, Scopus, and publishers' websites. Design/methodology/approach: 27,616 papers from 160 journals in 10 review journal series indexed in SCI are analyzed. The document types of these papers, as labeled on the journals' websites and as assigned by WoS and Scopus, are retrieved and compared to determine the assignment accuracy and to identify possible reasons for wrong assignments. Document types labeled on the websites are further divided into explicit reviews and implicit reviews, based on whether the website directly indicates that the paper is a review. Findings: Overall, WoS and Scopus performed similarly, with an average precision of about 99% and a recall of about 80%. However, there were some differences between WoS and Scopus across different journal series and within the same journal series. The assignment accuracy of WoS and Scopus dropped significantly for implicit reviews, especially for Scopus. Research limitations: The document types used as the gold standard were based on the journal websites' labeling, which was not manually validated one by one. We only studied the labeling performance for review articles published during 2017-2018 in review journals. Whether the conclusion extends to review articles published in non-review journals and to the most recent situation is not clear. Practical implications: This study provides a reference for the accuracy of document type assignment of review articles in WoS and Scopus, and the identified pattern for assigning implicit reviews may help improve labeling on websites, WoS, and Scopus. Originality/value: This study investigated the assignment accuracy of the document type of reviews and identified some patterns of wrong assignments.
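Illustration (not from the paper): the precision and recall figures in this abstract compare a database's document type labels against a gold standard. A minimal sketch of that calculation follows, assuming each paper carries one label per source; the five papers, their labels, and the function name are hypothetical and are not the study's data.

```python
def precision_recall(gold, assigned, positive="review"):
    """Return (precision, recall) of `assigned` labels against `gold` for one label."""
    # True positives: papers labeled `positive` by both the database and the gold standard.
    tp = sum(
        1 for pid, label in assigned.items()
        if label == positive and gold.get(pid) == positive
    )
    predicted = sum(1 for label in assigned.values() if label == positive)
    actual = sum(1 for label in gold.values() if label == positive)
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall

# Hypothetical toy labels: website labeling as gold standard, database assignment to check.
gold = {"p1": "review", "p2": "review", "p3": "review", "p4": "review", "p5": "article"}
database = {"p1": "review", "p2": "review", "p3": "review", "p4": "article", "p5": "article"}

precision, recall = precision_recall(gold, database)
print(precision, recall)
```

In this toy case every paper the database calls a review is one (high precision), while one true review is filed as an article (lower recall), which is the same direction of imbalance the abstract reports.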
Abstract: The eXtensible Markup Language (XML) is a new kind of meta language for replacing HTML and has many advantages. Traditional engineering documents have too many forms of expression to be managed conveniently and have no dynamic correlation functions. This paper introduces a new method that uses XML to store and manage engineering documents, achieving a unified format for engineering documents and dynamic correlations among them.
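Illustration (not from the paper): one way to picture a unified XML format with correlations between engineering documents is a single tree in which every document is an element and links are stored as references by id. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the element names, attribute names, and sample documents are all hypothetical and do not reproduce the paper's schema.

```python
import xml.etree.ElementTree as ET

def build_document_store(documents):
    """Build one XML tree with a <document> per record and <related ref="..."/> links."""
    root = ET.Element("engineeringDocuments")
    for doc in documents:
        node = ET.SubElement(root, "document", id=doc["id"], type=doc["type"])
        ET.SubElement(node, "title").text = doc["title"]
        for ref in doc.get("related", []):
            ET.SubElement(node, "related", ref=ref)
    return root

def related_titles(root, doc_id):
    """Resolve the links of one document to the titles of the documents they point to."""
    by_id = {d.get("id"): d for d in root.findall("document")}
    source = by_id[doc_id]
    return [
        by_id[link.get("ref")].findtext("title")
        for link in source.findall("related")
        if link.get("ref") in by_id
    ]

# Hypothetical records: three document types stored in the same format.
docs = [
    {"id": "D1", "type": "drawing", "title": "Gearbox assembly drawing", "related": ["D2", "D3"]},
    {"id": "D2", "type": "specification", "title": "Gearbox material specification"},
    {"id": "D3", "type": "report", "title": "Gearbox test report", "related": ["D1"]},
]
root = build_document_store(docs)
titles = related_titles(root, "D1")
xml_text = ET.tostring(root, encoding="unicode")
print(titles)
```

Because links are resolved by id at query time rather than copied into each document, changing a linked document's title is reflected wherever it is referenced, which is one plausible reading of "dynamic correlation".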