Abstract: Based on the general methods of power flow calculation for power systems and on the concepts and classifications of parallel algorithms, this paper puts forward a new approach, the Dynamic Asynchronous Parallel Algorithm, for the online analysis and real-time dispatching and control of large-scale power networks. Its high speed and dynamic tracking performance have been verified on the IEEE 14-bus system.
Abstract: Based on the two-list algorithm and the parallel three-list algorithm, an improved parallel three-list algorithm for the knapsack problem is proposed, which adopts divide and conquer and parallel merging without memory conflicts. To find a solution for the n-element knapsack problem, the proposed algorithm needs O(2^(3n/8)) time when O(2^(3n/8)) shared memory units and O(2^(n/4)) processors are available. Comparisons between the proposed algorithm and 10 existing algorithms show that the improved parallel three-list algorithm is the first exclusive-read exclusive-write (EREW) parallel algorithm that can solve knapsack instances in less than O(2^(n/2)) time when the available hardware resource is smaller than O(2^(n/2)), and hence improves on previous results.
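For orientation, the sequential two-list idea that the abstract builds on can be sketched as follows. This is only a minimal illustration, assuming the subset-sum form of the knapsack problem (find a subset of weights that sums exactly to a target); it is not the parallel three-list algorithm of the paper, and the names `subset_sums` and `two_list_knapsack` are hypothetical.

```python
def subset_sums(weights):
    """Return (sum, bitmask) for every subset of `weights`."""
    sums = [(0, 0)]
    for i, w in enumerate(weights):
        # Each existing subset either skips item i or includes it.
        sums += [(s + w, mask | (1 << i)) for s, mask in sums]
    return sums


def two_list_knapsack(weights, target):
    """Find indices of a subset summing to `target`, or None (assumed subset-sum variant)."""
    half = len(weights) // 2
    left, right = weights[:half], weights[half:]
    a = sorted(subset_sums(left))                 # ascending sums of the first half
    b = sorted(subset_sums(right), reverse=True)  # descending sums of the second half
    i = j = 0
    while i < len(a) and j < len(b):
        total = a[i][0] + b[j][0]
        if total == target:
            chosen = [k for k in range(len(left)) if (a[i][1] >> k) & 1]
            chosen += [half + k for k in range(len(right)) if (b[j][1] >> k) & 1]
            return chosen
        if total < target:
            i += 1  # need a larger sum: advance in the ascending list
        else:
            j += 1  # need a smaller sum: advance in the descending list
    return None
```

Each list holds about 2^(n/2) sums, which is where the O(2^(n/2)) time and memory of the sequential baseline come from; the parallel algorithms in the abstracts aim to beat that bound per processor.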
Abstract: A new parallel algorithm based on divide and conquer is proposed for the knapsack problem. On an EREW-SIMD machine with shared memory, the proposed algorithm uses O((2^(n/4))^(1-ε)) processors, 0 ≤ ε ≤ 1, and O(2^(n/2)) memory to find a solution for the n-element knapsack problem in time O(2^(n/4) (2^(n/4))^ε). The cost of the proposed parallel algorithm is O(2^(n/2)), which makes it an optimal method for solving the knapsack problem without memory conflicts and an improvement on previous results.
Funding: the National Natural Science Foundation of China under Grant No. 60671033.
Abstract: On the basis of the Floyd algorithm with an extended path matrix, a parallel algorithm that solves the all-pairs shortest path (APSP) problem in a cluster environment is analyzed and designed. The parallel APSP pipelining algorithm makes full use of overlapping computation with communication. Compared with the broadcast operation, the parallel algorithm reduces communication cost. The algorithm has been implemented with MPI on a PC cluster. Theoretical analysis and experimental results show that the parallel algorithm is efficient and scalable.
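As a point of reference, a sequential Floyd algorithm that also records paths might look like the sketch below. The successor matrix `nxt` is one common way to recover paths and is an assumption here; the abstract does not define its "extended path matrix", and the MPI pipelining is not reproduced. The function names are hypothetical.

```python
import math


def floyd_with_paths(weights):
    """All-pairs shortest paths; weights[i][j] is math.inf where no edge exists."""
    n = len(weights)
    dist = [row[:] for row in weights]
    # nxt[i][j] is the node that follows i on the current best path to j.
    nxt = [[j if i != j and weights[i][j] != math.inf else None
            for j in range(n)] for i in range(n)]
    for k in range(n):              # allow node k as an intermediate stop
        for i in range(n):
            for j in range(n):
                via = dist[i][k] + dist[k][j]
                if via < dist[i][j]:
                    dist[i][j] = via
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def reconstruct(nxt, i, j):
    """Rebuild the node sequence from i to j, or [] if j is unreachable."""
    if i == j:
        return [i]
    if nxt[i][j] is None:
        return []
    path = [i]
    while i != j:
        i = nxt[i][j]
        path.append(i)
    return path
```

For a fixed k, the updates of different rows i are independent once row k is known, which is what row-wise distribution across cluster nodes typically exploits.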
Funding: supported by the National Natural Science Foundation of China (11071169) and the Natural Science Foundation of Zhejiang Province (Y6110287).
Abstract: The purpose of this paper is to study a necessary and sufficient condition for the strong convergence of a new parallel iterative algorithm with errors for two finite families of uniformly L-Lipschitzian mappings in Banach spaces. The results presented in this paper improve and extend the recent ones announced in [2–7].
Abstract: This paper proposes a real-time, reduced-complexity implementation of a low-density parity-check (LDPC) decoder for satellite communication on an FPGA platform. By adopting a (2048, 4096) irregular quasi-cyclic (QC) LDPC code, the proposed partly parallel decoding structure balances the complexity between the check node unit (CNU) and the variable node unit (VNU) based on the min-sum (MS) algorithm, thereby using fewer Slice resources and achieving superior clock performance. Moreover, because a lookup table (LUT) is used to locate the node messages stored in the time-shared memory unit, the structure is simple to reuse and saves a large amount of storage. Implementation results on a Xilinx FPGA chip show that, compared with the conventional structure, the proposed scheme achieves at least 28.6% and 8% cost reductions in RAM and Slice, respectively. The clock frequency is also increased to 280 MHz without deterioration in decoding performance or reduction in convergence speed.
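The check node update of the min-sum algorithm mentioned above can be illustrated in software as follows. This is a sketch of the standard min-sum rule only, not of the paper's hardware CNU; the function name and the convention that a zero message counts as positive are assumptions.

```python
def min_sum_check_node(messages):
    """Min-sum check node update.

    `messages` holds the incoming log-likelihood ratios, one per edge.
    The outgoing message on each edge has the magnitude of the smallest
    *other* incoming magnitude and the sign of the product of the *other* signs.
    """
    if len(messages) < 2:
        raise ValueError("a check node needs at least two edges")
    magnitudes = [abs(m) for m in messages]
    # Only the two smallest magnitudes are needed, which keeps a hardware CNU small.
    min1_index = min(range(len(messages)), key=magnitudes.__getitem__)
    min1 = magnitudes[min1_index]
    min2 = min(mag for i, mag in enumerate(magnitudes) if i != min1_index)
    sign_product = 1
    for m in messages:
        if m < 0:
            sign_product = -sign_product
    outgoing = []
    for i, m in enumerate(messages):
        magnitude = min2 if i == min1_index else min1
        own_sign = -1 if m < 0 else 1
        # Multiplying by the edge's own sign removes it from the total product.
        outgoing.append(sign_product * own_sign * magnitude)
    return outgoing
```

For example, the inputs `[2.0, -0.5, 3.0, -1.5]` give `[0.5, -1.5, 0.5, -0.5]`.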
Funding: supported by the National Defense Basic Research Program during the Twelfth Five-Year Plan Period.
Abstract: The influence of chemical nonequilibrium on thermal characteristics is explored using the 2D hybrid-grid direct simulation Monte Carlo (DSMC) parallel method. An improved molecule search algorithm is proposed that preserves the high efficiency of the area search algorithm, overcomes its defects, and provides all information about molecules hitting the surface. A heat flux calculation method for rarefied hypersonic flow is established. In addition, testing methods for the chemical reaction probability are selected for a five-species gas mixture with finite-rate chemical reactions. To validate the present method, hypersonic flow around a cylinder is simulated first, and numerical simulations of the heat flux and flow field characteristics around a blunt body at different altitudes are then carried out for two cases: the thermal nonequilibrium condition and the thermochemical nonequilibrium condition. Numerical results demonstrate the validity and reliability of the proposed methods.
Funding: subsidized by NSFC (11701343) and partially supported by NSFC (11571274, 11401466).
Abstract: In this article, two kinds of expandable parallel finite element methods based on two-grid discretizations are given to solve linear elliptic problems. Compared with the classical local and parallel finite element methods, the methods in this article have two attractive features: 1) a partition of unity is used to generate a series of local and independent subproblems, which guarantees that the final approximation is globally continuous; 2) the computational domain of each local subproblem is contained in a ball with radius O(H) (H is the coarse mesh parameter), which makes the methods more suitable for parallel computing on a large parallel computer system. Some a priori error estimates are obtained and optimal error bounds in both the H^1-norm and the L^2-norm are derived. Finally, numerical results are reported to verify the feasibility and validity of the methods.
Funding: supported by ZTE Industry-Academia-Research Cooperation Funds.
Abstract: This paper proposes an analytical mining tool for big graph data based on MapReduce and the bulk synchronous parallel (BSP) computing model. The tool is named the MapReduce and BSP based Graph-mining tool (MBGM). The core of this mining system consists of four sets of parallel graph-mining algorithms programmed in the BSP parallel model and one set of data extraction-transformation-loading (ETL) algorithms implemented in MapReduce. To invoke these algorithm sets, we designed a workflow engine optimized for cloud computing. Finally, a well-designed data management function enables users to view, delete, and input data in the Hadoop distributed file system (HDFS). Experiments on artificial data show that the graph-mining algorithm components in MBGM are efficient.
Abstract: The average (mean) voter is one of the most common voting methods for decision making in highly available, long-mission applications where the availability and speed of the system are critical. In this paper, a new generation of average voter based on parallel algorithms and the parallel random access machine (PRAM) structure is proposed. The analysis shows that the algorithm is optimal in terms of its improved time complexity, speed-up, and efficiency, and that it is especially appropriate for applications where the input space is large.
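One way to picture an average voter on a PRAM is a pairwise tree reduction, simulated sequentially below. This is an illustrative sketch under that assumption, not the algorithm from the paper; `tree_average` is a hypothetical name. Within each round the additions are independent, so on a PRAM each could be assigned to its own processor.

```python
def tree_average(values):
    """Average `values` by pairwise reduction; also return the number of rounds."""
    if not values:
        raise ValueError("the voter needs at least one input")
    level = list(values)
    rounds = 0
    while len(level) > 1:
        # All pair sums in this round are independent of each other.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # an unpaired element is carried to the next round
        level = paired
        rounds += 1
    return level[0] / len(values), rounds
```

With n inputs the loop runs ceil(log2 n) rounds, compared with n - 1 sequential additions, which is the kind of saving that matters when the input space is large.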
Abstract: In this paper, we present two parallel multiplicative algorithms for convex programming. We discuss the convergence of the algorithms when the objective function has compact level sets and a locally Lipschitz continuous gradient. The proofs are essentially based on the results for sequential methods given by Eggermont [1].
Funding: supported by the Graduate Starting Seed Fund of Northwestern Polytechnical University (Z2012030).
Abstract: We present a fast method for polynomial evaluation at points in arithmetic progression. By dividing the progression into m new ones and recursively evaluating the polynomial at each point of these new progressions, the method saves most of the multiplications at the price of a small increase in additions compared with Horner's method, while the accuracy of the two is almost the same. We also introduce a vector structure into the recursive process, making it suitable for parallel applications.
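To make the trade of multiplications for additions concrete, the sketch below compares Horner's method with the classical forward-difference scheme. Forward differences are a different, well-known technique swapped in for illustration; this is not the recursive progression-splitting method of the paper. The function names are hypothetical, and `coeffs` is assumed non-empty and ordered from the highest degree down.

```python
def horner(coeffs, x):
    """Evaluate a polynomial (highest-degree coefficient first) with Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * x + c  # one multiplication and one addition per coefficient
    return result


def evaluate_progression(coeffs, start, step, count):
    """Evaluate at start, start + step, ... (`count` points) using forward differences."""
    degree = len(coeffs) - 1
    # Seed with degree + 1 Horner evaluations (fewer if fewer points are requested).
    seed = [horner(coeffs, start + i * step) for i in range(min(count, degree + 1))]
    if count <= degree + 1:
        return seed
    # diffs[k] holds the k-th forward difference at the current point.
    diffs = []
    column = seed[:]
    while column:
        diffs.append(column[0])
        column = [b - a for a, b in zip(column, column[1:])]
    values = []
    for _ in range(count):
        values.append(diffs[0])
        # Move to the next point using additions only; the top difference is constant.
        for k in range(degree):
            diffs[k] += diffs[k + 1]
    return values
```

After seeding, each further point costs `degree` additions and no multiplications. In floating point the accumulated additions can let rounding error grow along the progression, so the accuracy behaviour of this scheme should not be read as that of the paper's method.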
Funding: supported by the Innovation Foundation for Graduates in National University of Defense Technology, China (Grant No. B080702).
Abstract: A novel scalable architecture for coherent beam combining is presented, with hybrid phase control that involves both passive and active phasing in a master oscillator-power amplifier configuration. Wide-linewidth, mutually injected, passively phased fibre laser arrays serve as master oscillators for the power amplifiers, and active phasing using the stochastic parallel gradient descent algorithm is introduced. The wide-linewidth seed laser can suppress stimulated Brillouin scattering effectively and improve the output power of the fibre laser amplifier, while hybrid phase control provides a robust way to achieve in-phase mode coherent beam combining at the same time. An experiment is performed by actively phasing fibre laser amplifiers seeded by a passively phased fibre ring laser array. The power encircled in the main lobe increases 1.57 times and the long-exposure fringe contrast reaches 78% when the system evolves from passive phasing to hybrid phasing.
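The stochastic parallel gradient descent (SPGD) update used for active phasing can be sketched numerically as below. This is a toy model under stated assumptions: the metric is the normalized on-axis intensity of equal-amplitude beams rather than the measured main-lobe power, the two-sided perturbation form is used, and the `gain`, `delta`, and `iterations` values are illustrative choices, not parameters from the paper.

```python
import cmath
import random


def combining_metric(phases):
    """Normalized on-axis intensity of equal-amplitude beams (1.0 when all are in phase)."""
    field = sum(cmath.exp(1j * p) for p in phases)
    return abs(field) ** 2 / len(phases) ** 2


def spgd_phase_lock(phases, gain=5.0, delta=0.1, iterations=2000, seed=0):
    """Drive the phases toward the in-phase state by maximizing the metric with SPGD."""
    rng = random.Random(seed)
    phases = list(phases)
    for _ in range(iterations):
        # Perturb every channel at once with a random sign: the "parallel" part.
        perturbation = [delta * rng.choice((-1.0, 1.0)) for _ in phases]
        plus = combining_metric([p + d for p, d in zip(phases, perturbation)])
        minus = combining_metric([p - d for p, d in zip(phases, perturbation)])
        change = plus - minus
        # Step each channel along its own perturbation, scaled by the metric change.
        phases = [p + gain * change * d for p, d in zip(phases, perturbation)]
    return phases
```

Only the scalar metric is measured in each iteration, so the controller needs no per-channel phase sensing, which is why the approach scales with the number of amplifiers.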
Abstract: A new real-time model based on parallel time-series mining is proposed to improve the accuracy and efficiency of network intrusion detection systems (NIDS). In this model, a multidimensional dataset is constructed to describe network events, and a sliding-window updating algorithm is used to maintain the network stream. Moreover, parallel frequent-pattern and frequent-episode mining algorithms are applied to implement a parallel time-series mining engine that can intelligently generate rules to distinguish intrusions from normal activities. Analysis and study on the DAWNING 3000 indicate that this parallel time-series mining-based model provides a more accurate and efficient way to build a real-time NIDS.
Funding: the National Natural Science Foundation of China (No. 59665002) and the Scientific Research Foundation of Guangxi University (No. X032032).
Abstract: A two-level optimization method for the design of complex trusses is presented, together with its parallel distributed implementation on a LAN that uses parallel virtual machine (PVM) for Win32 for message passing between PCs. The volume of the truss is minimized by decomposing the original optimization problem into a number of bar optimization problems executed concurrently and a coordination optimization problem, subject to constraints on nodal displacements and on the stresses, buckling, and crippling of bars, among others. The system sensitivity analysis, which derives the partial derivatives of displacements and stresses with respect to areas, is also performed in parallel to shorten the analysis time. The convergence, speedup, and parallel computing efficiency of the method are investigated through the optimization examples of a 52-bar planar truss and a 3126-bar three-dimensional truss. The results show that the ideal speedup is obtained with 2 PCs for the 3126-bar space truss optimization, while no speedup is observed for the 52-bar truss. It is concluded that (1) the proposed parallel distributed algorithm is efficient on a PC-based LAN for coarse-grained large optimization problems; (2) to obtain a high speedup, the problem granularity should match the network granularity; and (3) the larger the problem size, the higher the parallel efficiency.