Big data and chemical data mining (大数据与化学数据挖掘)
Abstract: Data are an important strategic resource, and big data mining has become a focus of attention in academia, industry, and national governments. This paper introduces the basic concepts and current state of big data, reviews big data research related to chemistry, discusses the main problems of big data at the two levels of fundamental theory and key technology, surveys applications of big data mining across the fields of chemistry, and offers an outlook on the future of big data and its prospects in the chemical sciences.

Big data is fast becoming an important resource and a hot topic in academic research, business, and government. In this paper, we introduce the concept of big data and review advances in big data research, including technology for big data collection; cloud computing technologies such as the Google File System, BigTable, MapReduce, and Hadoop; and data mining and visualization methods for big data. Big data are commonly defined by the so-called 4 V's: volume, variety, velocity, and value. High volume combined with large variety makes big data much more difficult to analyze. Because velocity matters, fast, high-performance analysis methods are needed. The high value of big data is precisely the reason for the importance of, and the research activity in, this area. We also summarize applications of big data in chemistry. Professional information platforms such as the Collaboratory for Multi-scale Chemical Sciences (CMCS) and the Chemical Informatics and Cyberinfrastructure Collaboratory (CICC) have been developed to manage and study chemical big data, while search engines such as the ChemDB Portal have been established to extract chemical information from the internet. Software such as the Integrated Project View and ArQiologist can assist in the design of new medicines in medicinal chemistry. A data management system called BioGames has been proposed for analyzing microfluidics big data. In addition, graphics processing units are widely used to improve the computational capability of molecular dynamics simulations, and compressed score plots have been proposed to address visualization problems in chemometrics. In the era of big data, analytical instruments, chemical data systems, and even research methods may need to change; new strategies and techniques are therefore still needed for generating and processing big data.
Source: Chinese Science Bulletin (《科学通报》), 2015, No. 8, pp. 694-703 (10 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Keywords: big data, data mining, visualization, cloud computing, chemistry
Author contact: E-mail: xshao@nankai.edu.cn