Abstract: This paper presents an improved heuristic ant-clustering algorithm (HAC). It introduces a "memory bank" device that supplies heuristic knowledge to guide the ants' movement in the two-dimensional grid space. The device reduces the randomness of the ants' movement and avoids producing "unassigned data objects". Experiments on real and synthetic data sets show that HAC achieves a lower misclassification error rate and a shorter runtime than the classical algorithm.
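For context, the pick-up and drop rules of a basic ant-clustering model can be sketched as follows. This is only an illustration of the kind of classical baseline the abstract compares against: the formulas are the commonly cited Deneubourg-style probabilities, not necessarily the ones used in the paper, and the thresholds `k1` and `k2` are assumed values. The memory bank itself is not shown because the abstract does not describe its mechanics.

```python
def pick_probability(f: float, k1: float = 0.1) -> float:
    """Probability that an unladen ant picks up an object.

    f  -- local similarity/density around the object, in [0, 1]
    k1 -- pick-up threshold (assumed value, not from the paper)
    """
    return (k1 / (k1 + f)) ** 2


def drop_probability(f: float, k2: float = 0.15) -> float:
    """Probability that a laden ant drops its object.

    f  -- local similarity/density at the ant's position, in [0, 1]
    k2 -- drop threshold (assumed value, not from the paper)
    """
    return (f / (k2 + f)) ** 2
```

An isolated object (low `f`) is likely to be picked up and unlikely to be dropped, which is what gradually gathers similar objects together on the grid.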
Abstract: A scheme is presented for automatically constructing Snort signatures from data captured by honeypots. With this scheme, intrusion detection systems can be updated quickly to detect new intrusions soon after they occur. The idea rests on the observation that any traffic to or from a honeypot represents abnormal activity, so data patterns extracted from these packets can be used by a misuse detection system to identify new attacks. The rule-construction algorithm is discussed, and an experiment illustrates the effectiveness of the scheme.
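One way such a pattern-to-rule step could look is sketched below. The abstract does not state how patterns are extracted, so the use of a longest common substring across captured payloads is an assumption, as are the helper names `common_pattern` and `to_snort_rule`, the generic `any any -> any any` header, and the example `sid`. Folding the payloads pairwise is a greedy heuristic and may return a shorter pattern than the true longest substring shared by all of them.

```python
def longest_common_substring(a: bytes, b: bytes) -> bytes:
    """Longest contiguous byte sequence shared by a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def common_pattern(payloads: list[bytes]) -> bytes:
    """Greedily reduce a list of payloads to one shared substring."""
    pattern = payloads[0]
    for payload in payloads[1:]:
        pattern = longest_common_substring(pattern, payload)
    return pattern


def to_snort_rule(pattern: bytes, sid: int, msg: str = "honeypot pattern") -> str:
    """Format the pattern as a Snort rule with a hex-encoded content match."""
    hex_bytes = " ".join(f"{byte:02X}" for byte in pattern)
    return (
        "alert tcp any any -> any any "
        f'(msg:"{msg}"; content:"|{hex_bytes}|"; sid:{sid}; rev:1;)'
    )
```

For instance, `to_snort_rule(common_pattern(payloads), sid=1000001)` yields a single rule line; a practical system would also need to filter out patterns that are too short or too common in benign traffic.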
Funding: This research was partially funded by Tripos Inc., the United States Department of Energy through the Sandia National Laboratories LDRD program and ASCI VIEWS Data Discovery Program contract number DE-AC04-76D000789, and the National Science Foundati
Abstract: Protein secondary structure prediction and data mining of high-throughput drug screens are two important applications in bioinformatics. The data are represented in sparse feature spaces and can be unrepresentative of future data. The data certainly contain some noise, and the noise may be significant. Supervised learners in this context display their inherent bias toward certain solutions, generally those that fit the training set well. In this paper, we first describe an ensemble approach using subsampling that scales well with dataset size. A sufficient number of ensemble members trained on subsamples of the data can yield a more accurate classifier than a single classifier trained on the entire dataset. Experiments on several datasets demonstrate the effectiveness of the approach. We report results on the KDD Cup 2001 drug discovery dataset, on which our approach yields a higher weighted accuracy than the winning entry. We then extend the ensemble approach to create an over-generalized classifier for prediction by reducing the individual subsample size. The ensemble strategy using small subsamples has the effect of averaging over a wider range of hypotheses. We show that both protein secondary structure prediction and drug discovery prediction can be improved by over-generalization, specifically through ensembles of small subsamples.
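The idea of voting over many members, each trained on a small random subsample, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the nearest-centroid base learner on one-dimensional features, the majority vote, and all function names are assumptions chosen to keep the example self-contained.

```python
import random
from collections import Counter


def train_centroid(sample):
    """Base learner (assumed): per-class mean of a 1-D feature."""
    sums, counts = {}, Counter()
    for x, label in sample:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroid(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


def train_ensemble(data, n_members, subsample_size, seed=0):
    """Train each member on its own random subsample (without replacement)."""
    rng = random.Random(seed)
    return [
        train_centroid(rng.sample(data, subsample_size))
        for _ in range(n_members)
    ]


def predict_ensemble(models, x):
    """Combine the members by unweighted majority vote."""
    votes = Counter(predict_centroid(model, x) for model in models)
    return votes.most_common(1)[0][0]
```

Shrinking `subsample_size` while increasing `n_members` moves this sketch toward the small-subsample regime the abstract calls over-generalization: each member sees less data, so the vote averages over a wider range of hypotheses.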
Funding: This project was supported by the High Technology Research and Development Programme of China (2002AA111040).
Abstract: Sparse arrays of telescopes have limited (u, v)-plane coverage. This paper proposes an optimization method, based on a distributed genetic algorithm, for designing planar arrays of an aperture synthesis telescope. The distributed genetic algorithm is implemented on a network of workstations using a community communication model. Such an aperture synthesis system operates with imperfect (u, v) components caused by deviations and/or missing baselines. With the maximum (u, v)-plane coverage of the rotation-optimized array, the image of the source reconstructed by inverse Fourier transform is satisfactory.
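A genetic algorithm for this kind of array design needs a fitness function that scores (u, v)-plane coverage. The sketch below shows one plausible form, counting the distinct grid cells sampled by all antenna pairs in a single snapshot. The function name, the grid cell size, and the decision to ignore Earth rotation are assumptions; the abstract does not specify the fitness function actually used.

```python
from itertools import permutations


def uv_coverage(positions, cell=1.0):
    """Count distinct (u, v) grid cells sampled by all baselines.

    positions -- list of (x, y) antenna coordinates in a plane
    cell      -- side length of a (u, v) grid cell (assumed parameter)

    Each ordered antenna pair contributes one baseline vector, so every
    baseline and its mirror image (-u, -v) are both counted.
    """
    cells = set()
    for (x1, y1), (x2, y2) in permutations(positions, 2):
        u, v = x2 - x1, y2 - y1
        cells.add((round(u / cell), round(v / cell)))
    return len(cells)
```

For example, three evenly spaced antennas on a line repeat the same short baseline and cover only 4 cells, whereas spacing them at 0, 1 and 3 covers 6; a genetic algorithm would search for layouts that maximize such a score.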
Funding: This project was supported by the National Natural Science Foundation of China (60141002).
Abstract: A hybrid approach to fuzzy system design, based on clustering and a class of neurofuzzy networks, is proposed. An unsupervised clustering technique is first used to determine the number of if-then fuzzy rules and to generate an initial fuzzy rule base from the given input-output data. A neurofuzzy network is then constructed and its weights are tuned so that the resulting fuzzy rule base is highly accurate. Finally, two function approximation examples illustrate the effectiveness of the proposed approach.