Funding: Supported by the National Natural Science Foundation of China (60736021) and the Joint Funds of NSFC-Guangdong Province (U0735003).
Abstract: Kernel-based methods work by embedding the data into a feature space and then searching for a linear hypothesis among the embedded data points. Their performance is largely determined by the choice of kernel, so a promising direction is to learn the kernel from the data automatically. A general regularized risk functional (RRF) criterion for kernel matrix learning is proposed. Compared with the standard RRF criterion, the general RRF criterion takes into account the geometric distributions of the embedded data points. It is proven that the distance between different geometric distributions can be estimated by the distance between their centroids in the reproducing kernel Hilbert space. Using this criterion for kernel matrix learning leads to a convex quadratically constrained quadratic programming (QCQP) problem. Mathematical formulations are given for several commonly used loss functions. Experimental results on a collection of benchmark data sets demonstrate the effectiveness of the proposed method.
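The general RRF criterion itself is not reproduced here, but the abstract's key computational point is that kernel matrix learning becomes a convex QCQP. A minimal sketch of that flavor of problem, assuming a kernel parameterized as a nonnegative combination of base RBF kernels and a simple alignment objective with a quadratic norm constraint (the objective, the base kernels, and the CVXPY usage are illustrative assumptions, not the paper's formulation):

```python
# Minimal sketch of kernel matrix learning as a convex QCQP, NOT the paper's
# general-RRF formulation: learn nonnegative weights over base RBF kernels so
# that the combined kernel aligns with the label matrix y y^T, subject to a
# quadratic (Frobenius-norm) constraint. Data, kernels and objective are
# illustrative assumptions.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))              # toy inputs
y = np.sign(rng.normal(size=40))          # toy binary labels

def rbf_kernel(X, gamma):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

base_kernels = [rbf_kernel(X, g) for g in (0.1, 1.0, 10.0)]
Y = np.outer(y, y)                        # ideal target kernel

mu = cp.Variable(len(base_kernels), nonneg=True)          # combination weights
K = sum(mu[i] * base_kernels[i] for i in range(len(base_kernels)))

objective = cp.Maximize(cp.sum(cp.multiply(K, Y)))        # alignment with y y^T
constraints = [cp.sum_squares(K) <= 1.0]                  # quadratic constraint -> QCQP
cp.Problem(objective, constraints).solve()

print("learned kernel weights:", mu.value)
```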
Funding: Projects (11661069, 61763041) supported by the National Natural Science Foundation of China; Project (IRT_15R40) supported by Changjiang Scholars and Innovative Research Team in University, China; Project (2017TS045) supported by the Fundamental Research Funds for the Central Universities, China.
Abstract: Most existing network representation learning algorithms focus on network structure. However, network structure is only one view of a network and cannot fully reflect all of its characteristics. In fact, network vertices usually carry rich text information, which can be exploited to learn text-enhanced network representations. Meanwhile, the Matrix-Forest Index (MFI) has shown high effectiveness and stability in link prediction tasks compared with other link prediction algorithms, yet neither MFI nor Inductive Matrix Completion (IMC) has been well integrated into the algorithmic frameworks of typical representation learning methods. We therefore propose a novel semi-supervised algorithm, tri-party deep network representation learning using inductive matrix completion (TDNR). Built on the inductive matrix completion algorithm, TDNR incorporates text features, the link certainty degrees of existing edges, and the future link probabilities of non-existing edges into the network representations. Experimental results demonstrate that TDNR outperforms other baselines on three real-world datasets, and its visualizations show that the proposed algorithm is more discriminative than other unsupervised approaches.
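The Matrix-Forest Index mentioned above has a simple closed form: the similarity matrix is S = (I + L)^(-1), where L is the graph Laplacian. A minimal sketch of scoring candidate links with MFI on a toy undirected graph (the graph and the scoring loop are assumptions for illustration; this is not the TDNR pipeline):

```python
# Minimal sketch: Matrix-Forest Index (MFI) link scores for an undirected graph.
# MFI similarity is S = (I + L)^(-1), with L = D - A the graph Laplacian.
# The toy edge list below is an assumption for illustration.
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
n = 4

A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

L = np.diag(A.sum(axis=1)) - A          # graph Laplacian
S = np.linalg.inv(np.eye(n) + L)        # Matrix-Forest Index similarity matrix

# Score non-existing edges by their MFI similarity (higher = more likely link).
candidates = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i, j] == 0]
for i, j in sorted(candidates, key=lambda e: -S[e[0], e[1]]):
    print(f"candidate edge ({i}, {j}): MFI score {S[i, j]:.3f}")
```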
Abstract: A Newton-based learning algorithm for neural networks (NN) is presented and implemented. In theory, an NN learning algorithm based on Newton's method should converge faster than backpropagation (BP) and other learning algorithms, because gradient descent converges linearly while Newton's method has a second-order convergence rate. A fast algorithm for computing the Hessian matrix of the NN cost function is proposed, and it forms the theoretical basis of the improved Newton learning algorithm. Simulation results show that the Newton learning algorithm converges markedly faster than the traditional BP method, and its robustness is also better than BP's.
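The contrast drawn above is between the BP gradient step w ← w − ηg and the Newton step w ← w − H⁻¹g, where H is the Hessian of the cost. A minimal sketch comparing the two on a tiny linear model with squared-error cost, where the gradient and Hessian have closed forms (the model and data are assumptions; the paper's fast Hessian computation for general NNs is not shown):

```python
# Minimal sketch: one Newton step vs. one gradient (BP-style) step for a tiny
# linear "network" y = X @ w with squared-error cost. For this cost the
# gradient is X^T (X w - y) and the Hessian is X^T X (closed forms), so a
# single Newton step lands on the least-squares optimum. Data are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.01 * rng.normal(size=100)

def cost(w):
    r = X @ w - y
    return 0.5 * r @ r

w = np.zeros(3)
grad = X.T @ (X @ w - y)                   # gradient of the cost at w
H = X.T @ X                                # Hessian of the cost

w_gd = w - 0.001 * grad                    # one gradient (BP-style) step
w_newton = w - np.linalg.solve(H, grad)    # one Newton step

print("cost after gradient step:", cost(w_gd))
print("cost after Newton step:  ", cost(w_newton))
```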
Funding: Projects (61173122, 61262032) supported by the National Natural Science Foundation of China; Projects (11JJ3067, 12JJ2038) supported by the Natural Science Foundation of Hunan Province, China.
Abstract: Low-rank matrix recovery is an important problem that has been extensively studied in the machine learning, data mining, and computer vision communities. A novel method is proposed for low-rank matrix recovery that targets higher recovery accuracy and a stronger theoretical guarantee. Specifically, the method is based on a nonconvex optimization model in which the low-rank matrix is recovered from the noisy observation. To solve the model, an effective algorithm is derived that minimizes over the variables alternately, and it is proved that this algorithm enjoys a stronger theoretical guarantee than existing work. In natural image denoising experiments, the proposed method achieves lower recovery error than the two compared methods. The method is also applied to two real-world problems, removing noise from verification codes and removing watermarks from images, in which the images recovered by the proposed method are less noisy than those of the two compared methods.
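The abstract does not spell out the nonconvex model, but "minimizing over the variables alternately" for low-rank recovery commonly means factoring the unknown as X = U Vᵀ and alternating regularized least-squares updates of U and V. A minimal sketch under that assumption (Frobenius-norm fit with ridge regularization; not the paper's exact model or guarantee):

```python
# Minimal sketch: alternating minimization for low-rank recovery from a noisy
# observation Y ≈ X + noise, with X factored as U @ V.T of rank r. Each step
# solves a ridge-regularized least-squares problem for one factor while the
# other is fixed. Model and parameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
m, n, r, lam = 60, 50, 3, 0.1
X_true = rng.normal(size=(m, r)) @ rng.normal(size=(r, n))   # ground-truth low-rank matrix
Y = X_true + 0.1 * rng.normal(size=(m, n))                   # noisy observation

U = rng.normal(size=(m, r))
V = rng.normal(size=(n, r))
for _ in range(50):
    # Update U with V fixed: minimize ||Y - U V^T||_F^2 + lam ||U||_F^2
    U = Y @ V @ np.linalg.inv(V.T @ V + lam * np.eye(r))
    # Update V with U fixed: minimize ||Y - U V^T||_F^2 + lam ||V||_F^2
    V = Y.T @ U @ np.linalg.inv(U.T @ U + lam * np.eye(r))

X_hat = U @ V.T
print("relative recovery error:",
      np.linalg.norm(X_hat - X_true) / np.linalg.norm(X_true))
```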
Funding: Supported by the National Basic Research Program of China (973 Program) (2005CB321902); the National Natural Science Foundation of China (60727002, 60774003, 60921001, 90916024); the Commission on Science, Technology and Industry for National Defense (A2120061303); the Doctoral Program Foundation of the Ministry of Education of China (20030006003); and the Innovation Foundation of BUAA for Ph.D. Graduates.
Abstract: Graph-based unsupervised feature selection algorithms often fail to fully mine the intrinsic information in the data and, being susceptible to noise, have difficulty obtaining truly discriminative features. To address this, an unsupervised feature selection method based on generalized uncorrelated regression and latent representation learning (uncorrelated regression and latent representation for unsupervised feature selection, URLUFS) is proposed. The method applies non-negative matrix factorization to the projection matrix of a generalized uncorrelated regression model, so that the projection matrix performs nonlinear dimensionality reduction and yields a feature selection matrix. On top of the feature selection matrix, adaptive graph learning is introduced to further exploit the local manifold structure of the data, and a norm constraint is imposed on the feature selection matrix to preserve sparsity. Latent representations are used to learn the relationships among data samples and to guide the pseudo-label matrix in the regression model, so that more discriminative features are selected. Comparative numerical experiments on eight public datasets show that the proposed algorithm based on generalized uncorrelated regression and latent representation learning clearly outperforms eight other unsupervised feature selection algorithms.
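URLUFS combines several coupled components (NMF applied to the projection matrix, adaptive graph learning, a sparsity-inducing norm constraint, and latent-representation-guided pseudo-labels) that the abstract only outlines. The sketch below shows only the generic final step shared by regression-based unsupervised feature selection methods: fit a projection matrix W from features to pseudo-labels and rank features by the row norms of W. The k-means pseudo-labels and ridge regression here are stand-in assumptions, not the URLUFS formulation:

```python
# Minimal sketch of the common final step in regression-based unsupervised
# feature selection: learn a projection matrix W mapping features to
# pseudo-labels, then rank features by the row norms of W. The pseudo-labels
# come from k-means, and ridge regression stands in for URLUFS's generalized
# uncorrelated regression. All of this is illustrative, not the paper's model.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))            # toy data: 200 samples, 20 features
labels_true = rng.integers(0, 2, size=200)
X[:, 0] += 3 * labels_true                # features 0 and 2 carry the cluster
X[:, 2] -= 3 * labels_true                # structure (assumed for illustration)

k = 2
pseudo = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
Y = np.eye(k)[pseudo]                     # one-hot pseudo-label matrix

lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)  # ridge solution

scores = np.linalg.norm(W, axis=1)        # row-norm importance per feature
top = np.argsort(-scores)[:5]
print("top-ranked features:", top)
```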