Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 72101046 and 61672128).
Abstract: Recent studies employing deep learning to solve the traveling salesman problem (TSP) have mainly focused on learning construction heuristics. Such methods can improve TSP solutions, but still depend on additional programs. Methods that learn improvement heuristics to iteratively refine solutions, however, remain underexplored. Traditional improvement heuristics are guided by a manually designed search strategy and may achieve only limited improvements. This paper proposes a novel framework for learning improvement heuristics, which automatically discovers better improvement policies for iteratively solving the TSP. The framework first parameterizes the policy network with a new transformer-based architecture, which includes an action-dropout layer to prevent action selection from overfitting. It then proposes a deep reinforcement learning approach that integrates a simulated annealing mechanism (named RL-SA) to learn the pairwise selection policy, with the aim of improving the performance of the 2-opt algorithm. RL-SA uses the whale optimization algorithm to generate initial solutions for better sampling efficiency, and uses a Gaussian perturbation strategy to tackle the sparse reward problem of reinforcement learning. The experimental results show that the proposed approach is significantly superior to state-of-the-art learning-based methods and further reduces the gap between learning-based methods and highly optimized solvers on the benchmark datasets. Moreover, the pre-trained model M can be applied to guide the SA algorithm (named M-SA (ours)), which performs better than existing deep models on small-, medium-, and large-scale TSPLIB datasets. Additionally, M-SA (ours) achieves excellent generalization performance on a real-world dataset of global liner shipping routes, with distance reductions ranging from 3.52% to 17.99%.
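For readers unfamiliar with the 2-opt operator whose node-pair selection the policy above is trained to guide, a minimal sketch of a single 2-opt move may help. It assumes 2D Euclidean coordinates and a tour stored as a list of city indices. The names `tour_length` and `two_opt_move` are illustrative and do not come from the paper, and the pair (i, j) is chosen by hand here rather than by a trained policy.

```python
import math

def tour_length(tour, coords):
    """Total length of the closed tour (it returns to the start city)."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[k]], coords[tour[(k + 1) % n]])
        for k in range(n)
    )

def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] reversed (0 <= i < j < n)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

# Four cities on the corners of a unit square, visited in a self-crossing order.
coords = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
crossed = [0, 1, 2, 3]
uncrossed = two_opt_move(crossed, 1, 2)  # reverse the segment at positions 1..2

print(tour_length(crossed, coords), tour_length(uncrossed, coords))
```

Reversing the segment removes the two crossing diagonals, so the second tour is the shorter one. In a learned improvement heuristic of the kind described, the policy's job would be to propose which (i, j) to try at each step.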
Abstract: This paper discusses recent developments in several heuristic algorithms. It focuses on improvements to the ant-cycle (AC) algorithm, based on an analysis of the performance of simulated annealing (SA) and AC on the traveling salesman problem (TSP). The Metropolis rule from SA was applied to AC, yielding an improved AC algorithm. The computational results from the case study indicate that the improved AC algorithm has advantages over pure SA and unmixed AC.
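The Metropolis rule mentioned above always accepts a candidate that is no worse than the current solution, and accepts a worse one with probability exp(-Δ/T), where Δ is the cost increase and T the temperature. The following is a minimal sketch of that rule only, assuming a minimization objective and a positive temperature. The function name `metropolis_accept` and its parameters are illustrative; how the rule is wired into the AC algorithm is not described in the abstract and is not shown here.

```python
import math
import random

def metropolis_accept(current_cost, candidate_cost, temperature, rng=random):
    """Metropolis criterion for minimization.

    Improving (or equal) candidates are always accepted; worse candidates
    are accepted with probability exp(-delta / temperature).
    """
    delta = candidate_cost - current_cost
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)
```

At high temperature almost any move is accepted, which encourages exploration; as the temperature falls, the rule approaches plain greedy acceptance.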
Abstract: A new local search method for the traveling salesman problem is proposed, based on an original greedy representation of the solution space and neighborhood structure. First, a partial closed route consisting of only three cities is given; the remaining cities are then added to this route one at a time by a greedy procedure. Implemented on a personal computer, this algorithm finds optimal solutions for 24 out of 27 standard benchmarks and outperforms the Full Subpath Ejection Algorithm (F-SEC) proposed by Rego in 1998.
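The construction step described above (start from a three-city closed route, then add the other cities greedily) can be illustrated with a cheapest-insertion sketch. The abstract does not state which greedy criterion the authors use, so the rule below, which places each city where it lengthens the route least, is an assumption, as are the names `insertion_cost` and `greedy_build`.

```python
import math

def insertion_cost(route, pos, city, coords):
    """Extra length from inserting `city` between route[pos - 1] and route[pos].

    pos == len(route) means the closing edge from the last city back to the first.
    """
    a, b = route[pos - 1], route[pos % len(route)]
    return (math.dist(coords[a], coords[city])
            + math.dist(coords[city], coords[b])
            - math.dist(coords[a], coords[b]))

def greedy_build(seed_route, remaining, coords):
    """Grow a closed route by inserting each remaining city at its cheapest position."""
    route = list(seed_route)
    for city in remaining:
        best_pos = min(
            range(1, len(route) + 1),
            key=lambda p: insertion_cost(route, p, city, coords),
        )
        route.insert(best_pos, city)
    return route

# Corners of a unit square; the seed route covers three of them.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
route = greedy_build([0, 1, 2], [3], coords)
print(route)
```

In this sketch the order in which the remaining cities are presented fully determines the resulting route, which is what makes an insertion order usable as a representation of a solution.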
Abstract: The Travelling Salesman Problem (TSP) is a classical optimization problem and one of the class of NP problems. The purpose of this work is to apply data mining methodologies to explore the patterns in data generated by an Ant Colony Algorithm (ACA) performing a search operation, and to develop a rule-set searcher that approximates the ACA's searcher. An attribute-oriented induction methodology was used to explore the relationship between an operation sequence and its attributes, and a set of rules was developed. The experimental results show that the proposed approach performs well with respect to both solution quality and computation speed.
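Abstract: A hybrid algorithm based on ant colony optimization and particle swarm optimization is proposed for solving the Traveling Salesman Problem (TSP). While the ant colony algorithm solves the TSP, the particle swarm algorithm is used to optimize the parameters of the ant colony system. The aim is to improve the optimization performance of the ant colony system, so that its parameters no longer have to be chosen by human experience or repeated trial and error but are instead selected adaptively through the particle search.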
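The parameter-tuning idea above can be sketched with a small particle swarm optimizer in which each particle is one candidate parameter vector. Everything below is an assumption made for illustration: the parameter names (alpha, beta, rho), their bounds, the swarm settings, and in particular `placeholder_objective`, a simple quadratic that stands in for "run the ant colony system with these parameters and return the best tour length". It does not reproduce the authors' algorithm.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer over a box-constrained search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's personal best
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]  # swarm-wide best
    for _ in range(n_iters):
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (inertia * vel[k][d]
                             + c1 * r1 * (best_pos[k][d] - pos[k][d])
                             + c2 * r2 * (g_pos[d] - pos[k][d]))
                # Move, then clamp the coordinate back into its bounds.
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val < g_val:
                    g_pos, g_val = pos[k][:], val
    return g_pos, g_val

def placeholder_objective(params):
    """Hypothetical stand-in for an ant colony run; NOT a real tour length."""
    alpha, beta, rho = params
    return (alpha - 1.0) ** 2 + (beta - 3.0) ** 2 + (rho - 0.5) ** 2

bounds = [(0.0, 5.0), (0.0, 5.0), (0.0, 1.0)]  # assumed ranges for alpha, beta, rho
params, value = pso_minimize(placeholder_objective, bounds)
```

In a real hybrid, each objective evaluation would be a full (and noisy) ant colony run, so the swarm size and iteration count would have to be weighed against that cost.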