In this work, the air breakdown problem encountered in high-power microwave (HPM) operation is modeled using a fully coupled nonlinear Newton scheme in the time domain. During the air breakdown process, the HPM ionizes neutral air molecules and generates free electrons, which are driven by the Lorentz force produced by the electromagnetic fields. The motion of free electrons produces plasma currents, which generate secondary electromagnetic fields that couple back to the externally applied fields and interact with the free electrons. Such a breakdown process is highly nonlinear and can be described by a coupled electromagnetic-plasma system, where the electromagnetic fields are governed by Maxwell’s equations and the plasma current is modeled by a simplified plasma fluid equation, owing to the high electron density in air breakdown under atmospheric conditions. The coupled nonlinear system of equations is solved by the time-domain finite element method (TDFEM) together with a coupled Newton’s method. Numerical examples are presented to demonstrate the nonlinear characteristics of the breakdown phenomenon and the self-sustaining property of the plasma current.
Su Yan and J.-M. Jin, “A fully coupled nonlinear scheme for time-domain modeling of high-power microwave air breakdown,” IEEE Trans. Microw. Theory Tech., 2015, under review.
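The idea of a fully coupled Newton solve, as used for the air breakdown model, can be illustrated on a toy problem. The sketch below is not the TDFEM formulation from the paper: the two scalar unknowns `e` (a field-like quantity) and `j` (a current-like quantity), the residual equations, and the function name `coupled_newton` are all assumptions chosen only to show both unknowns being updated together from one Jacobian rather than being solved one after the other.

```python
def coupled_newton(source, e=0.0, j=0.0, tol=1e-12, max_iter=50):
    """Solve a toy coupled nonlinear system with Newton's method.

    Hypothetical residuals (illustration only, not the paper's equations):
        f1(e, j) = e + j - source        # linear balance equation
        f2(e, j) = j - (1 + e**2) * e    # nonlinear constitutive relation
    """
    for _ in range(max_iter):
        f1 = e + j - source
        f2 = j - (1.0 + e * e) * e
        if max(abs(f1), abs(f2)) < tol:
            return e, j

        # Analytic Jacobian of (f1, f2) with respect to (e, j).
        a11, a12 = 1.0, 1.0
        a21, a22 = -(1.0 + 3.0 * e * e), 1.0

        # Solve the 2x2 system J * delta = -F by Cramer's rule.
        det = a11 * a22 - a12 * a21
        de = (-a22 * f1 + a12 * f2) / det
        dj = (a21 * f1 - a11 * f2) / det

        # Both unknowns are updated simultaneously (fully coupled step).
        e += de
        j += dj
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    e, j = coupled_newton(3.0)
    print(f"e = {e:.6f}, j = {j:.6f}")
```

For `source = 3.0` the exact solution of this toy system is `e = 1`, `j = 2`. In a real solver the 2x2 Jacobian would be replaced by a large sparse system assembled at every time step.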
Su Yan and J.-M. Jin, “Three-dimensional time-domain finite element simulation of dielectric breakdown based on nonlinear conductivity model,” IEEE Trans. Antennas Propag., 2015, under review.
The standard finite-element method (FEM) is one of the best choices for materials with complex structures. At material interfaces, the FEM has to resort to meshes that are conformal with the interfaces to yield an accurate representation of the solution. However, the discretization of complex 3D objects with high-quality elements remains challenging and time-consuming despite the rapid development of computer-aided design (CAD). There are also other occasions, such as crack growth simulations, shape optimizations, and transient field analysis, where the generation of multiple conformal meshes needed to capture the geometric changes is cumbersome, expensive, and sometimes impractical. In this work, an interface-enriched generalized FEM (IGFEM) is proposed for accurate and efficient electromagnetic analysis of problems involving intricate internal structures. Without using meshes that conform to the material interfaces, which greatly lessens the burden of mesh generation, the method assigns generalized degrees of freedom (DOFs) at material interfaces to capture the normal derivative discontinuity of the tangential field. The generalized DOFs are supported by enriched vector basis functions, which are constructed through a linear combination of the vector basis functions from the sub-elements. Several verification examples are provided to show that the IGFEM is not sensitive to the quality of the sub-elements and maintains the same level of solution accuracy and computational complexity as the standard FEM based on conformal meshes. The potential of the proposed IGFEM is demonstrated by simulating engineering problems with complex, periodic internal structures, including composite materials with randomly distributed spherical particles, ellipsoidal inclusions, and microvascular channels.
K.-D. Zhang, J.-M. Jin, and P. H. Geubelle, “A 3D Interface-Enriched Generalized FEM for Electromagnetic Problems with Non-Conformal Discretizations,” IEEE Trans. Antennas Propag., submitted for review.
K.-D. Zhang, A. Raeisi Najafi, J.-M. Jin, and P. H. Geubelle, “An Interface-Enriched Generalized Finite Element Analysis for Electromagnetic Problems with Non-Conformal Discretizations,” Int. J. Numer. Model., accepted.
The dual-primal finite-element tearing and interconnecting (FETI-DP) algorithm has been shown to be very powerful for electromagnetic analysis because of its numerical stability. As a nonoverlapping domain decomposition method, the FETI-DP algorithm forms fully decoupled subdomain problems and exhibits high scalability potential. However, due to the complex structure of this algorithm, efficient parallelization is not trivial, especially on a large number of processors. In this work, an efficient parallel strategy for the FETI-DP algorithm is proposed. To achieve a good load balance, the mesh is partitioned into subdomains with similar sizes and shapes. The subdomains in close proximity are then distributed to the same processor to minimize inter-process communication. The parallel generalized minimal residual method, enhanced with the iterative classical Gram-Schmidt orthogonalization scheme to reduce global communication, is adopted to solve the order-reduced global interface problem in the FETI-DP algorithm with a fast convergence rate. The global coarse problem, formed to improve the convergence of the global interface problem, is solved iteratively by a parallel communication-avoiding biconjugate gradient stabilized method to minimize global communication. The iterative solution of the global coarse problem is accelerated by a diagonal preconditioner constructed from the coarse system matrix. To alleviate neighboring communication overhead, the non-blocking communication approach is employed in both Krylov subspace methods. Numerical examples are presented to demonstrate the accuracy and scalability of the proposed parallel scheme for electromagnetic modeling of general objects and antenna arrays.
K.-D. Zhang and J.-M. Jin, “Parallel FETI-DP Algorithm for Efficient Simulation of Large-Scale EM Problems,” Int. J. Numer. Model., submitted for review.
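The iterative classical Gram-Schmidt step mentioned for the parallel GMRES solver can be sketched serially as follows. This is a minimal illustration, not the authors' implementation: the function names `dot` and `icgs_orthogonalize` and the default of two passes are assumptions. In classical Gram-Schmidt every projection coefficient is computed from the same vector, so on a distributed machine the dot products of one pass could be batched into a single global reduction; repeating the pass recovers the orthogonality that a single classical pass may lose.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def icgs_orthogonalize(basis, w, passes=2):
    """Orthogonalize w against an orthonormal basis with iterated classical
    Gram-Schmidt. Returns (coefficients, norm, normalized vector).

    Assumes w is not (numerically) in the span of the basis, so norm > 0.
    """
    w = list(w)
    coeffs = [0.0] * len(basis)
    for _ in range(passes):
        # All projections use the same w: in a parallel code these dot
        # products would be combined into one global reduction per pass.
        h = [dot(q, w) for q in basis]
        for q, hj in zip(basis, h):
            w = [wi - hj * qi for wi, qi in zip(w, q)]
        # Accumulate the corrections from each pass into the coefficients.
        coeffs = [c + hj for c, hj in zip(coeffs, h)]
    norm = math.sqrt(dot(w, w))
    return coeffs, norm, [wi / norm for wi in w]


if __name__ == "__main__":
    basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    coeffs, norm, q_new = icgs_orthogonalize(basis, [1.0, 2.0, 3.0])
    print(coeffs, norm, q_new)
```

In a GMRES iteration, `coeffs` and `norm` would fill one column of the Hessenberg matrix and `q_new` would become the next Krylov basis vector.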
Nonlinear phenomena in electromagnetics generally involve changes in the material properties due to the presence of electromagnetic fields. The changes in the material properties in turn modify the state of the original electromagnetic fields in the medium. Since the material properties and the contained fields interact with each other constantly, it is most natural to describe and model these interactions in the time domain, where at each time instant the changes in the fields induce nonlinear modifications on both the material properties and the fields themselves. In this work, a discontinuous Galerkin time-domain (DGTD) algorithm is formulated and implemented to model the third-order instantaneous nonlinear effect on electromagnetic fields due to the field-dependent medium permittivity. The nonlinear DGTD computation is accelerated using graphics processing units (GPUs), and examples are presented to show the different Kerr effects observed through the third-order nonlinearity. With the acceleration using MPI + GPU under a large cluster environment, the solution times for nonlinear simulations are significantly reduced.
H.-T. Meng and J.-M. Jin, “GPU Acceleration of Nonlinear Modeling by the Discontinuous Galerkin Time-Domain Method,” ACES Express Journal, submitted for review.
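A field-dependent permittivity of the third-order instantaneous (Kerr) type is commonly written as D = eps0 * (eps_r + chi3 * |E|^2) * E for an isotropic medium. The sketch below evaluates the effective relative permittivity under that textbook scalar model. The function name `kerr_relative_permittivity` and the numerical values are illustrative assumptions and are not taken from the paper.

```python
def kerr_relative_permittivity(e_field, eps_linear, chi3):
    """Effective relative permittivity eps_linear + chi3 * |E|^2.

    e_field    : (Ex, Ey, Ez) instantaneous field components in V/m
    eps_linear : linear relative permittivity (dimensionless)
    chi3       : third-order susceptibility in m^2/V^2 (assumed scalar)
    """
    e_squared = sum(component * component for component in e_field)
    return eps_linear + chi3 * e_squared


if __name__ == "__main__":
    # Illustrative values only: eps_linear = 2.25, chi3 = 1e-18 m^2/V^2.
    for amplitude in (1e6, 1e7, 1e8):
        eps = kerr_relative_permittivity((amplitude, 0.0, 0.0), 2.25, 1e-18)
        print(f"|E| = {amplitude:.0e} V/m -> eps_r = {eps:.6f}")
```

Because the permittivity depends on the instantaneous field, a time-domain solver has to re-evaluate it at every time step, which is the nonlinearity the abstract refers to.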
An accurate and efficient finite element-boundary integral (FE-BI) method with graphics processing unit (GPU) acceleration is presented for solving electromagnetic problems with complex structures and materials. A mixed testing scheme, in which the Rao-Wilton-Glisson and the Buffa-Christiansen functions are both employed as the testing functions, is first presented to improve the accuracy of the FE-BI method. An efficient absorbing boundary condition (ABC)-based preconditioner is then proposed to accelerate the convergence of the iterative solution. To further improve the efficiency of the total computation, a GPU-accelerated multilevel fast multipole algorithm (MLFMA) is applied to the iterative solution. The radar cross sections (RCS) of several benchmark objects are calculated to demonstrate the numerical accuracy of the solution and also to show that the proposed method is not only free of interior resonance corruption but also converges faster than the conventional FE-BI methods. The capability and efficiency of the proposed method are analyzed through several numerical examples, including a large dielectric coated sphere, a partial human body, and a coated missile-like object. Compared with the 8-threaded CPU-based algorithm, the GPU-accelerated FE-BI-MLFMA algorithm can achieve a total speedup of up to 25.5 times.
J. Guan, S. Yan, and J.-M. Jin, "An Accurate and Efficient Finite Element-Boundary Integral Method with GPU Acceleration for 3-D Electromagnetic Analysis," IEEE Trans. Antennas Propag., vol. 62, no. 12, pp. 6325-6336, Dec. 2014.
With the fast rate of innovation of portable wireless devices, more and more communication and entertainment functions are featured in portable devices such as cell phones, many of which require the integration of multiple transmitting/receiving antennas into the limited space of the device. In this research, we propose to simulate the specific absorption rate (SAR) induced in the human head by multiple transmitting antennas, and the total radiated power (TRP) from the antennas into space. In order to obtain robust mathematical models for SAR and TRP that incorporate various input parameters, we will consider different scenarios, for example, different operating frequencies, different phase combinations of the input signals, different physical locations of the antennas, and different gestures in which the customer holds the device. Based on the data obtained from the full-wave simulation, we will formulate and construct accurate mathematical models for the SAR and TRP. We will check the models against different input parameter combinations for validation purposes. The result of this research serves as a starting point for SAR minimization with multiple transmitting chains in portable devices.
B. M. Hochwald, D. J. Love, S. Yan, P. Fay, and J. M. Jin, “Incorporating specific absorption rate (SAR) constraints into wireless signal design,” IEEE Communications Magazine, vol. 52, no. 9, pp. 126-133, Sept. 2014.
A multi-GPU implementation of the multilevel fast multipole algorithm (MLFMA) based on the hybrid OpenMP-CUDA parallel programming model (OpenMP-CUDA-MLFMA) is presented for computing electromagnetic scattering of a three-dimensional conducting object. The proposed hierarchical parallelization strategy ensures a high computational throughput for the GPU calculation. The resulting OpenMP-based multi-GPU implementation is capable of solving real-life problems with over one million unknowns with a remarkable speed-up. The radar cross sections of a few benchmark objects are calculated to demonstrate the accuracy of the solution. The results are compared with those from the CPU-based MLFMA and measurements. The capability and efficiency of the presented method are analyzed through the examples of a sphere, an aircraft, and a missile-like object. Compared with the 8-threaded CPU-based MLFMA, the OpenMP-CUDA-MLFMA method achieves total speed-up ratios from 5 to 20.
J. Guan, S. Yan, and J.-M. Jin, "An OpenMP-CUDA implementation of multilevel fast multipole algorithm for electromagnetic simulation on multi-GPU computing systems," IEEE Trans. Antennas Propag., vol. 61, no. 7, pp. 3607-3616, July 2013.
General-purpose computing on graphics processing units (GPGPU), with programming models such as the Compute Unified Device Architecture (CUDA) by NVIDIA, offers the capability to accelerate the solution process of computational electromagnetic analysis. However, due to the communication-intensive nature of the finite element algorithm, neither the assembly phase nor the solution phase can be implemented on fine-grained many-core GPU processors in a straightforward manner. In this work, we identify the bottlenecks in the GPU parallelization of the finite element method for electromagnetic analysis, and propose potential solutions to alleviate the bottlenecks. We first investigate efficient parallelization strategies for the finite element matrix assembly on a single GPU and on multiple GPUs. We then explore parallelization strategies for the finite element matrix solution, in conjunction with parallelizable preconditioners to reduce the total solution time. We show that with proper parallelization and implementation, GPUs are able to achieve significant speedup over OpenMP-enabled multi-core CPUs.
H.-T. Meng and J.-M. Jin, “Acceleration of the dual-field domain decomposition algorithm using MPI-CUDA on large-scale computing systems,” IEEE Trans. Antennas Propag., vol. 62, no. 9, pp. 4706-4715, Sept. 2014.
H.-T. Meng, B.-L. Nie, S. Wong, C. Macon, and J.-M. Jin, “GPU accelerated finite element computation for electromagnetic analysis,” IEEE Antennas Propag. Mag., vol. 56, no. 2, pp. 39-62, Apr. 2014.