17 articles found
Parallel computing approach for efficient 3-D X-ray-simulated image reconstruction (cited by: 1)
1
Authors: Ou-Yi Li, Yang Wang, Qiong Zhang, Yong-Hui Li. Nuclear Science and Techniques (SCIE, EI, CAS, CSCD), 2023, No. 7, pp. 122-136 (15 pages)
Accurate 3-dimensional (3-D) reconstruction technology for nondestructive testing based on digital radiography (DR) is of great importance for alleviating the drawbacks of the existing computed tomography (CT)-based method. The commonly used Monte Carlo simulation method ensures well-performing imaging results for DR. However, for 3-D reconstruction, it is limited by its high time consumption. To solve this problem, this study proposes a parallel computing method to accelerate Monte Carlo simulation for projection images with a parallel interface and a specific DR application. The images are utilized for 3-D reconstruction of the test model. We verify the accuracy of parallel computing for DR and evaluate the performance of two parallel computing modes, multithreaded applications (G4-MT) and message-passing interfaces (G4-MPI), by assessing parallel speedup and efficiency. This study explores the scalability of the hybrid G4-MPI and G4-MT modes. The results show that the two parallel computing modes can significantly reduce the Monte Carlo simulation time because the parallel speedup grows approximately linearly and the parallel efficiency is maintained at a high level. The hybrid mode has strong scalability, as the overall run time of the 180 simulations using 320 threads is 15.35 h with 10 billion particles emitted, and the parallel speedup can be up to 151.36. The 3-D reconstruction of the model is achieved based on the filtered back projection (FBP) algorithm using 180 projection images obtained with the hybrid G4-MPI and G4-MT. The quality of the reconstructed sliced images is satisfactory because the images can reflect the internal structure of the test model. This method is applied to a complex model, and the quality of the reconstructed images is evaluated.
Keywords: parallel computing; Monte Carlo; digital radiography; 3-D reconstruction
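The abstract above reports a parallel speedup of up to 151.36 on 320 threads. As a rough illustration only, and assuming the textbook definitions (speedup S = T_serial / T_parallel, efficiency E = S / p), which the paper itself may define or measure differently, the two figures can be related as in the sketch below. The function names are hypothetical.

```python
def parallel_speedup(serial_time: float, parallel_time: float) -> float:
    """Textbook speedup S = T_serial / T_parallel (assumed definition)."""
    if parallel_time <= 0:
        raise ValueError("parallel_time must be positive")
    return serial_time / parallel_time


def parallel_efficiency(speedup: float, workers: int) -> float:
    """Textbook efficiency E = S / p (assumed definition)."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Figures quoted in the abstract: speedup 151.36 on 320 threads.
reported_speedup = 151.36
threads = 320
efficiency = parallel_efficiency(reported_speedup, threads)
print(f"efficiency under the assumed definition: {efficiency:.3f}")
```

Under these assumed definitions the quoted figures correspond to an efficiency of about 0.47 per thread; the abstract does not say which baseline its speedup is measured against.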
Parallel Computing of the Underwater Explosion Cavitation Effects on Full-scale Ship Structures (cited by: 7)
2
Authors: Zhi Zong, Yanjie Zhao, Fan Ye, Haitao Li, Gang Chen. Journal of Marine Science and Application, 2012, No. 4, pp. 469-477 (9 pages)
As well as shock wave and bubble pulse loading, cavitation also has a very significant influence on the dynamic response of surface ships and other near-surface marine structures to underwater explosive loadings. In this paper, the acoustic-structure coupling method embedded in ABAQUS is adopted for the numerical analysis of underwater explosions considering cavitation. The shapes of both the bulk cavitation region and the local cavitation region are obtained, and they are in good agreement with analytical results. The duration of reloading is several times longer than that of a shock wave. Finally, both the single computation and the parallel computation of the cavitation effect on the dynamic responses of a full-scale ship are presented, which show that the reloading caused by cavitation cannot be ignored. All these results are helpful in understanding underwater explosion cavitation effects.
Keywords: underwater explosion; cavitation; parallel computation; full-scale ship
Parallel-Computing Wavelet-Based FDTD Method for Modeling Nanoscale Optical Resonator
3
Authors: 蒋锡燕, 王瑾, 陆云清, 许吉. Transactions of Nanjing University of Aeronautics and Astronautics (EI), 2014, No. 3, pp. 260-268 (9 pages)
An efficient wavelet-based finite-difference time-domain (FDTD) method is implemented for analyzing nanoscale optical devices, especially optical resonators. Because of its highly linear numerical dispersion properties, the high-spatial-order FDTD achieves a significant reduction in the number of cells, i.e., in the memory used, when analyzing a high-index dielectric ring resonator working as an add/drop multiplexer. The main novelty is that the wavelet-based FDTD model is extended to a parallel computation environment to solve physical problems with large dimensions. To demonstrate the efficiency of the parallelized FDTD model, a mirrored cavity is analyzed. The analysis shows that the proposed model reduces computation time and memory cost, and the parallel computation result matches the theoretical model.
Keywords: integrated optics; electromagnetic field analysis; finite difference time domain (FDTD); wavelet; parallel computation
APO-Based Parallel Algorithm of Channel Allocation for Cognitive Networks (cited by: 1)
4
Authors: Ming Zhong, Hailin Zhang, Bei Ma. China Communications (SCIE, CSCD), 2016, No. 6, pp. 100-109 (10 pages)
This article investigates channel allocation for cognitive networks, for which the optimal allocation distribution is difficult to obtain. We first study interference between nodes in cognitive networks and establish the channel allocation model with interference constraints. Then we focus on the use of evolutionary algorithms to solve for the optimal allocation distribution. We further consider that the search time can be reduced by means of parallel computing, and an APO-based parallel algorithm is proposed. In contrast with the existing algorithms, we decompose the allocation vector into a number of sub-vectors and search for the optimal allocation distribution of each sub-vector in parallel. In order to speed up the convergence rate and improve the converged value, some typical operations of evolutionary algorithms are modified by two novel operators. Finally, simulation results show that the proposed algorithm drastically outperforms other optimal solutions in terms of network utilization.
Keywords: CRNs; channel allocation; parallel computing; APO; PSO
MDSLB: A new static load balancing method for parallel molecular dynamics simulations (cited by: 1)
5
Authors: 武云龙, 徐新海, 杨学军, 邹顺, 任小广. Chinese Physics B (SCIE, EI, CAS, CSCD), 2014, No. 2, pp. 628-643 (16 pages)
Large-scale parallelization of molecular dynamics simulations is facing challenges which seriously affect the simulation efficiency, among which the load imbalance problem is the most critical. In this paper, we propose a new molecular dynamics static load balancing method (MDSLB). By analyzing the characteristics of the short-range force of molecular dynamics programs running in parallel, we divide the short-range force into three kinds of force models, and then package the computations of each force model into many tiny computational units called "cell loads", which provide the basic data structures for our load balancing method. In MDSLB, the spatial region is separated into sub-regions called "local domains", and the cell loads of each local domain are allocated to every processor in turn. Compared with the dynamic load balancing method, MDSLB can guarantee load balance by executing the algorithm only once at program startup without migrating the loads dynamically. We implement MDSLB in OpenFOAM software and test it on the TianHe-1A supercomputer with 16 to 512 processors. Experimental results show that MDSLB can save 34%-64% of the time for the load-imbalanced cases.
Keywords: molecular dynamics; static load balancing; parallel computing
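The allocation step described above ("the cell loads of each local domain are allocated to every processor in turn") can be read as a round-robin assignment performed once at startup. The sketch below illustrates that reading only; it is not the authors' OpenFOAM implementation, and the function name, the dictionary layout, and the toy domain labels are all assumptions.

```python
from collections import defaultdict


def allocate_round_robin(local_domains, num_processors):
    """Deal the cell loads of every local domain out to processors in turn.

    local_domains: dict mapping a domain id to a list of cell-load ids
                   (hypothetical layout, chosen for illustration).
    Returns a dict mapping processor rank -> list of (domain id, cell load).
    """
    if num_processors <= 0:
        raise ValueError("num_processors must be positive")
    assignment = defaultdict(list)
    for domain_id, cell_loads in local_domains.items():
        # Restart the deal for each domain so that every processor
        # receives a share of every domain, dense or sparse.
        for index, cell in enumerate(cell_loads):
            assignment[index % num_processors].append((domain_id, cell))
    return dict(assignment)


# Toy example: one dense domain and one sparse domain, two processors.
domains = {"dense": [0, 1, 2, 3], "sparse": [4, 5]}
plan = allocate_round_robin(domains, 2)
print(plan)
```

Because each processor takes an equal slice of every local domain, a spatially uneven particle distribution does not concentrate work on one rank, which is the intuition behind running the algorithm only once.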
Study of MPI based on parallel MOM on PC clusters for EM-beam scattering by 2-D PEC rough surfaces
6
Authors: 麻军, 郭立新, 王安琪. Chinese Physics B (SCIE, EI, CAS, CSCD), 2009, No. 8, pp. 3431-3437 (7 pages)
This paper first applies finite impulse response (FIR) filter theory combined with the fast Fourier transform (FFT) method to generate a two-dimensional Gaussian rough surface. Using the electric field integral equation (EFIE), it introduces the method of moments (MOM) with RWG vector basis functions and Galerkin's method to investigate electromagnetic beam scattering by a two-dimensional PEC Gaussian rough surface on personal computer (PC) clusters. The details of the parallel conjugate gradient method (CGM) for solving the matrix equation are also presented, and the numerical simulations are obtained through the message passing interface (MPI) platform on the PC clusters. The results indicate that the parallel MOM provides a novel technique for solving two-dimensional rough surface electromagnetic scattering problems. Finally, the influences of the root-mean-square height, the correlation length, and the polarization on the beam scattering characteristics of two-dimensional PEC Gaussian rough surfaces are discussed.
Keywords: electromagnetic scattering; rough surface; beam; parallel computing
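The abstract mentions a parallel conjugate gradient method for the MOM matrix equation. As a minimal serial sketch of the basic iteration only, the code below solves a small real symmetric positive-definite system in plain Python. This is a simplification and an assumption: MOM matrices are typically dense and complex, so the paper presumably uses a variant suited to such systems, and nothing here reproduces its MPI decomposition.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a real symmetric positive-definite matrix A.

    A is a list of rows, b a list of floats. Serial, dense, illustrative.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the start x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # converged: residual norm is small
            break
        # The matrix-vector product dominates the cost; in a distributed
        # code, rows of A would typically be split across processes.
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x


solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(solution)
```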
Parallel numerical simulations for quantized vortices in Bose-Einstein condensates
7
Authors: 黄朝晖, 王德生. Chinese Physics B (SCIE, EI, CAS, CSCD), 2007, No. 1, pp. 32-37 (6 pages)
We employ parallel computing technology to study numerically the three-dimensional structure of quantized vortices in Bose-Einstein condensates. For anisotropic cases, the bending process of vortices is described in detail by the decrease of the Gross-Pitaevskii energy. A completely straight vortex and steady, symmetrical multiple-vortex configurations are obtained. We analyse the effect of the initial conditions and the angular velocity on the number and shape of vortices.
Keywords: 3D numerical simulations; quantized vortices; Bose-Einstein condensates; parallel computing
Switching Delay Aware Computing Resource Allocation in Virtualized Base Station
8
Authors: Mingjin Gao, He(Henry) Chen, Yonghui Li, Yiqing Zhou, Jinglin Shi. China Communications (SCIE, CSCD), 2016, No. 11, pp. 226-233 (8 pages)
In centralized cellular network architecture, the concept of the virtualized base station (VBS) is becoming attractive since it enables all base stations (BSs) to share computing resources in a dynamic manner. This can significantly improve the utilization efficiency of computing resources. In this paper, we study the computing resource allocation strategy for one VBS by considering the non-negligible effect of the delay introduced by switches. Specifically, we formulate the VBS's sum computing rate maximization as a set optimization problem. To address this problem, we first propose a computing resource scheduling algorithm, namely weight before one-step-greedy (WBOSG), which has linear computational complexity and considerable performance. Then, the OSG retreat (OSG-R) algorithm is developed to further improve the system performance at the expense of computational complexity. Simulation results under practical settings are provided to validate the two proposed algorithms.
Keywords: virtualized base station; parallel computing; computing resource allocation; C-RAN
Fast parallel Grad–Shafranov solver for real-time equilibrium reconstruction in EAST tokamak using graphic processing unit (cited by: 1)
9
Authors: 黄耀, 肖炳甲, 罗正平. Chinese Physics B (SCIE, EI, CAS, CSCD), 2017, No. 8, pp. 276-283 (8 pages)
To achieve real-time control of tokamak plasmas, the equilibrium reconstruction has to be completed sufficiently quickly. For the case of an EAST tokamak experiment, real-time equilibrium reconstruction is generally required to provide results within 1 ms. A graphic processing unit (GPU) parallel Grad–Shafranov (G-S) solver is developed in the P-EFIT code, which is built with the CUDA™ architecture to take advantage of massively parallel GPU cores and significantly accelerate the computation. Optimization and implementation of numerical algorithms for a block tri-diagonal linear system are presented. The solver can complete a calculation within 16 μs with a 65×65 grid size and 27 μs with a 129×129 grid size, which allows P-EFIT to meet the timing requirement for real-time plasma control with both grid sizes.
Keywords: tokamak; Grad-Shafranov equation; equilibrium reconstruction; GPU parallel computation
GPIC: A GPU-based parallel independent cascade algorithm in complex networks
10
Authors: Chang Su, Xu Na, Fang Zhou, Linyuan Lü. Chinese Physics B, 2025, No. 3, pp. 20-30 (11 pages)
Independent cascade (IC) models, by simulating how one node can activate another, are important tools for studying the dynamics of information spreading in complex networks. However, traditional algorithms for the IC model implementation face significant efficiency bottlenecks when dealing with large-scale networks and multi-round simulations. To settle this problem, this study introduces a GPU-based parallel independent cascade (GPIC) algorithm, featuring an optimized representation of the network data structure and parallel task scheduling strategies. Specifically, for this GPIC algorithm, we propose a network data structure tailored for GPU processing, thereby enhancing the computational efficiency and the scalability of the IC model. In addition, we design a parallel framework that utilizes the full potential of the GPU's parallel processing capabilities, thereby augmenting the computational efficiency. The results from our simulation experiments demonstrate that GPIC not only preserves accuracy but also significantly boosts efficiency, achieving a speedup factor of 129 when compared to the baseline IC method. Our experiments also reveal that when using GPIC for the independent cascade simulation, 100-200 simulation rounds are sufficient for higher-cost studies, while high-precision studies benefit from 500 rounds to ensure reliable results, providing empirical guidance for applying this new algorithm to practical research.
Keywords: complex networks; information spreading; independent cascade model; parallel computing; GPU
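For orientation, the independent cascade model that GPIC accelerates can be sketched serially as below. This is a plain CPU baseline written for illustration, assuming a uniform activation probability on every edge and an adjacency-list graph; the GPU data structure and scheduling described in the abstract are not reproduced, and all names are hypothetical.

```python
import random


def independent_cascade(graph, seeds, prob, rng):
    """Run one independent cascade; return the set of activated nodes.

    graph: dict mapping node -> list of out-neighbours.
    prob:  activation probability, assumed identical for every edge.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in graph.get(u, ()):
                # Each newly active node gets one chance per inactive neighbour.
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, prob, rounds, seed=0):
    """Average cascade size over independent simulation rounds."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(graph, seeds, prob, rng))
                for _ in range(rounds))
    return total / rounds


chain = {0: [1], 1: [2], 2: []}
always = independent_cascade(chain, [0], 1.0, random.Random(0))
never = independent_cascade(chain, [0], 0.0, random.Random(0))
mean_size = estimate_spread(chain, [0], 1.0, rounds=100)
print(always, never, mean_size)
```

The rounds are mutually independent, which is what makes the multi-round estimate a natural target for parallel execution.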
A high fidelity general purpose 3-D Monte Carlo particle transport program JMCT3.0 (cited by: 12)
11
Authors: Li Deng, Gang Li, Bao-Yin Zhang, Rui Li, Ling-Yu Zhang, Xin Wang, Yuan-Gang Fu, Dun-Fu Shi, Peng Liu, Yan Ma, Dan-Hu Shangguan, Ze-Hua Hu, Sheng-Cheng Zhou, Jing-Wen Shen. Nuclear Science and Techniques (SCIE, EI, CAS, CSCD), 2022, No. 8, pp. 175-192 (18 pages)
JMCT is a large-scale, high-fidelity, three-dimensional general neutron–photon–electron–proton transport Monte Carlo software system. It was developed based on the combinatorial geometry parallel infrastructure JCOGIN and the adaptive structured mesh infrastructure JASMIN. JMCT is equipped with CAD modeling and visualizes the image output. It supports the geometry of the body and the structured/unstructured mesh. JMCT has most functions, variance reduction techniques, and tallies of the traditional Monte Carlo particle transport codes. Two energy models, multi-group and continuous, are provided. In recent years, some new functions and algorithms have been developed, such as Doppler broadening on-the-fly (OTF), uniform tally density (UTD), consistent adjoint driven importance sampling (CADIS), fast criticality search of boron concentration (FCSBC), domain decomposition (DD), adaptive control rod moving (ACRM), and random geometry (RG). JMCT is also coupled with the discrete ordinate SN code JSNT to generate source-biasing factors and weight-window parameters. At present, the numbers of geometric bodies, materials, tallies, depletion zones, and parallel processors are sufficiently large to simulate extremely complicated device problems. JMCT can be used to simulate reactor physics, criticality safety analysis, radiation shielding, detector response, nuclear well logging, dosimetry calculations, etc. In particular, JMCT can be coupled with depletion and thermal-hydraulics for the simulation of reactor nuclear-hot feedback effects. This paper describes the progress in advanced modeling, high-performance numerical simulation of particle transport, multiphysics coupled calculations, and large-scale parallel computing.
Keywords: advanced modeling; high-performance numerical simulation; multi-physics coupled calculation; large-scale parallel computing; JMCT
A Matrix Formulation of Discrete Chirp Fourier Transform Algorithms (cited by: 1)
12
Authors: Juan Pablo Soto Quiros, Domingo Rodriguez. Journal of Electronic Science and Technology (CAS), 2014, No. 2, pp. 206-210 (5 pages)
This work presents a computational matrix framework in terms of tensor signal algebra for the formulation of discrete chirp Fourier transform algorithms. These algorithms are used in this work to estimate the point target functions (impulse response functions) of multiple-input multiple-output (MIMO) synthetic aperture radar (SAR) systems. This estimation technique is being studied as an alternative to the estimation of point target functions using the discrete cross-ambiguity function for certain types of environmental surveillance applications. The tensor signal algebra is presented as a mathematics environment composed of signal spaces, finite dimensional linear operators, and special matrices where algebraic methods are used to generate these signal transforms as computational estimators. Also, the tensor signal algebra contributes to the analysis, design, and implementation of parallel algorithms. An instantiation of the framework was performed by using the MATLAB Parallel Computing Toolbox, where all the algorithms presented in this paper were implemented.
Keywords: discrete chirp Fourier transform; MATLAB; parallel computing; tensor signal algebra
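As a small illustration of what a discrete chirp Fourier transform computes, the sketch below evaluates one commonly used definition directly: X(k, l) = N^(-1/2) Σ_n x(n) W^(l n² + k n) with W = exp(-2πi/N). Both this definition and the name `dcft` are assumptions, since the abstract states neither, and the brute-force O(N³) loop is not the paper's tensor-algebra or MATLAB formulation.

```python
import cmath


def dcft(x):
    """Direct discrete chirp Fourier transform (assumed definition).

    Returns a table X with X[k][l] for constant frequency k and chirp rate l.
    """
    N = len(x)
    w = cmath.exp(-2j * cmath.pi / N)
    scale = N ** -0.5
    return [[scale * sum(x[n] * w ** ((l * n * n + k * n) % N)
                         for n in range(N))
             for l in range(N)]
            for k in range(N)]


# Single chirp with frequency k0 and chirp rate l0, prime length N.
N, k0, l0 = 7, 2, 3
signal = [cmath.exp(2j * cmath.pi * (l0 * n * n + k0 * n) / N)
          for n in range(N)]
X = dcft(signal)
peak = max(((k, l) for k in range(N) for l in range(N)),
           key=lambda kl: abs(X[kl[0]][kl[1]]))
print(peak, abs(X[k0][l0]))
```

For a prime length N, a single chirp produces a peak of magnitude √N at its own (k0, l0), while every other entry has magnitude at most 1, which is why the transform is useful for estimating chirp parameters.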
Towards sparse matrix operations: graph database approach for power grid computation
13
Authors: Daoxing Li, Kai Xiao, Xiaohui Wang, Pengtian Guo, Yong Chen. Global Energy Interconnection (EI, CAS, CSCD), 2023, No. 1, pp. 50-63 (14 pages)
The construction of new power systems presents higher requirements for the Power Internet of Things (PIoT) technology. The "source-grid-load-storage" architecture of a new power system requires PIoT to have a stronger multi-source heterogeneous data fusion ability. Native graph databases have great advantages in dealing with multi-source heterogeneous data, which make them suitable for an increasing number of analytical computing tasks. However, only a few existing graph database products have native support for matrix operation-related interfaces or functions, resulting in low efficiency when handling matrix calculations that are commonly encountered in power grids. In this paper, the matrix computation process is expressed by a strategy called graph description, which relies on the natural connection between the matrix and the structure of the graph. Based on that, we implement matrix operations on a graph database, including matrix multiplication, matrix decomposition, etc. Specifically, only the nodes relevant to the computation and their neighbors are involved in the process, which prunes the influence of zero elements in the matrix and avoids useless iterations compared to conventional matrix computation. Based on the graph description, a series of power grid computations can be implemented on a graph database, which reduces redundant data import and export operations while leveraging the parallel computing capability of the graph database. It promotes the efficiency of PIoT when handling multi-source heterogeneous data. A comprehensive experimental study over two power system datasets of different scales compares the proposed method with Python and MATLAB baselines. The results reveal the superior performance of our proposed method in both power flow and N-1 contingency computations.
Keywords: graph database; graph description; matrix; parallel computing; power flow
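The idea that a sparse matrix can be handled as a graph, visiting only nodes and their neighbours, can be illustrated with a matrix-vector product over an adjacency structure. The sketch below is a generic in-memory illustration in Python, not the paper's graph-database implementation; the dictionary layout and the function name are assumptions.

```python
def graph_matvec(edges, x):
    """Compute y = A x where A is stored as a weighted directed graph.

    edges: dict mapping row node i -> {column node j: A[i][j]} holding
           only the nonzero entries (hypothetical layout).
    x:     sequence indexed by node id.
    """
    y = {}
    for i, neighbours in edges.items():
        # Only existing edges are visited, so zero entries cost nothing.
        y[i] = sum(weight * x[j] for j, weight in neighbours.items())
    return y


# 3x3 example with four nonzeros:
# [[2, 0, 1],
#  [0, 3, 0],
#  [4, 0, 0]]
matrix_graph = {0: {0: 2.0, 2: 1.0}, 1: {1: 3.0}, 2: {0: 4.0}}
result = graph_matvec(matrix_graph, [1.0, 2.0, 3.0])
print(result)
```

Each row node's sum depends only on its own neighbours, so the per-node computations are independent of one another and could run concurrently.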
Compute Unified Device Architecture Implementation of Euler/Navier-Stokes Solver on Graphics Processing Unit Desktop Platform for 2-D Compressible Flows
14
Authors: Zhang Jiale, Chen Hongquan. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2016, No. 5, pp. 536-545 (10 pages)
A personal desktop platform with teraflops peak performance and thousands of cores is realized at the price of a conventional workstation using programmable graphics processing units (GPUs). A GPU-based parallel Euler/Navier-Stokes solver is developed for 2-D compressible flows by using NVIDIA's Compute Unified Device Architecture (CUDA) programming model in the CUDA Fortran programming language. The techniques for implementing CUDA kernels, the double-layered thread hierarchy, and the varied memory hierarchy are presented to form the GPU-based algorithm for the Euler/Navier-Stokes equations. The resulting parallel solver is validated by a set of typical test flow cases. The numerical results show that a speedup of dozens of times relative to a serial CPU implementation can be achieved using a single GPU desktop platform, which demonstrates that a GPU desktop can serve as a cost-effective parallel computing platform to accelerate computational fluid dynamics (CFD) simulations substantially.
Keywords: graphics processing unit (GPU); GPU parallel computing; compute unified device architecture (CUDA) Fortran; finite volume method (FVM); acceleration
A Case for Cloud-Based Mobile Search
15
Authors: Yan Gao, Li Fu, Zhenwei Zhang, Shengmei Luo, Ping Lu. ZTE Communications, 2011, No. 1, pp. 33-36 (4 pages)
Mobile search is beset with problems because of mobile terminal constraints and also because its characteristics are different from the traditional Internet search model. This paper analyzes cloud computing technologies, especially mass data storage, parallel computing, and virtualization, in an attempt to solve technical problems in mobile search. The broad prospects of cloud computing are also discussed.
Keywords: mobile search; cloud computing; parallel computing; virtualization
Chip-Based High-Dimensional Optical Neural Network (cited by: 7)
16
Authors: Xinyu Wang, Peng Xie, Bohan Chen, Xingcai Zhang. Nano-Micro Letters (SCIE, EI, CAS, CSCD), 2022, No. 12, pp. 570-578 (9 pages)
Parallel multi-thread processing in advanced intelligent processors is the core to realize high-speed and high-capacity signal processing systems. The optical neural network (ONN) has the native advantages of high parallelization, large bandwidth, and low power consumption to meet the demand of big data. Here, we demonstrate a dual-layer ONN with a Mach-Zehnder interferometer (MZI) network and a nonlinear layer, in which the nonlinear activation function is achieved by optical-electronic signal conversion. Two frequency components from the microcomb source carrying digit datasets are simultaneously imposed and intelligently recognized through the ONN. We successfully achieve the digit classification of different frequency components by demultiplexing the output signal and testing the power distribution. Efficient parallelization feasibility with wavelength division multiplexing is demonstrated in our high-dimensional ONN. This work provides a high-performance architecture for future parallel high-capacity optical analog computing.
Keywords: integrated optics; optical neural network; high-dimension; Mach-Zehnder interferometer; nonlinear activation function; parallel high-capacity analog computing
Numerical Simulation of ATPS Parachute Transient Dynamics Using Fluid-Structure Interaction Method
17
Authors: Fan Yuxin, Xia Jian. Transactions of Nanjing University of Aeronautics and Astronautics (EI, CSCD), 2017, No. 5, pp. 535-542 (8 pages)
In order to simulate and analyze the dynamic characteristics of the parachute from the advanced tactical parachute system (ATPS), a nonlinear finite element algorithm and a preconditioning finite volume method are employed and developed to construct a three-dimensional parachute fluid-structure interaction (FSI) model. The parachute fabric material is represented by membrane-cable elements, and a geometrically nonlinear algorithm with an embedded wrinkling technique is employed to simulate the large deformations of the parachute structure by applying the Newton-Raphson iteration method. On the other hand, the time-dependent flow surrounding the parachute canopy is simulated using the preconditioned lower-upper symmetric Gauss-Seidel (LU-SGS) method. The pseudo-solid dynamic mesh algorithm is employed to update the flow-field mesh based on the complex and arbitrary motion of the parachute canopy. Due to the large amount of computation during the FSI simulation, the message passing interface (MPI) parallel computation technique is used for all three modules to improve the performance of the FSI code. The FSI method is tested by simulating one kind of ATPS parachute to predict the parachute configuration and anticipate the parachute descent speeds. The comparison of results between the proposed method and those in the literature demonstrates that the method is a useful tool for parachute designers.
Keywords: parachute dynamics; fluid-structure interaction; nonlinear structure dynamics; time dependent flow; parallel computation technique