Encoding of Primary Structures of Biological Macromolecules Within a Data Mining Perspective (cited by: 1)

Abstract: An encoding method has a direct effect on the quality and the representation of the knowledge discovered by data mining systems. Biological macromolecules are encoded as strings of characters, called primary structures. Because data mining systems usually use relational tables to encode data, these strings must be re-encoded and transformed into relational tables. In this paper, we present a comparative study of the existing static encoding methods, which are based on biologists' know-how, and our new dynamic encoding method, which is based on the construction of Discriminant and Minimal Substrings (DMS). Different classification methods are used in this study. The experimental results show that our dynamic encoding method is more efficient than the static ones for encoding biological macromolecules within a data mining perspective.
Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, Issue 1, pp. 78-88 (11 pages). Chinese title: 计算机科学技术学报(英文版)
Keywords: encoding methods; biological macromolecules; data mining; strings
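
The re-encoding step described in the abstract (strings turned into a relational table whose columns are substrings) can be sketched as follows. This is a minimal illustration, not the authors' algorithm. It assumes that a substring is "discriminant" when it occurs in at least a fraction `alpha` of the sequences of one family and in at most a fraction `beta` of the sequences of the other families, and "minimal" when it contains no shorter discriminant substring. The names `find_dms` and `encode`, the thresholds, the `max_len` cap, and the toy sequences are all hypothetical.

```python
def substrings(seq, max_len):
    """Return the set of distinct substrings of seq with length 1..max_len."""
    return {seq[i:i + k]
            for k in range(1, max_len + 1)
            for i in range(len(seq) - k + 1)}


def occurrence_rate(sub, substring_sets):
    """Fraction of sequences (given as substring sets) that contain sub."""
    if not substring_sets:
        return 0.0
    return sum(sub in s for s in substring_sets) / len(substring_sets)


def find_dms(families, alpha=1.0, beta=0.0, max_len=4):
    """Assumed DMS criterion (see lead-in); families maps label -> sequences."""
    per_family = {label: [substrings(s, max_len) for s in seqs]
                  for label, seqs in families.items()}

    # Every substring seen anywhere is a candidate feature.
    candidates = set()
    for sets in per_family.values():
        for s in sets:
            candidates |= s

    # Discriminant: frequent in one family, rare in all the others.
    discriminant = set()
    for sub in candidates:
        for label, own in per_family.items():
            others = [s for other, sets in per_family.items()
                      if other != label for s in sets]
            if (occurrence_rate(sub, own) >= alpha
                    and occurrence_rate(sub, others) <= beta):
                discriminant.add(sub)
                break

    # Minimal: drop any substring that contains a shorter discriminant one.
    minimal = {d for d in discriminant
               if not any(p != d and p in d for p in discriminant)}
    return sorted(minimal)


def encode(sequences, features):
    """Relational table: one row per sequence, one 0/1 column per feature."""
    return [[int(f in seq) for f in features] for seq in sequences]


# Toy data, invented for illustration only.
families = {"A": ["ACGTAC", "TACGGA"], "B": ["GGTTCA", "TTCAGG"]}
features = find_dms(families, alpha=1.0, beta=0.0, max_len=3)
table = encode(families["A"] + families["B"], features)
```

Each row of `table` can then be passed to any classifier that expects fixed-length attribute vectors. The columns are derived from the data rather than fixed in advance, which is the sense in which such an encoding is dynamic rather than static.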