Hosted by Curvenote
Submission Preview
status: pending

Sparse magnetic vector inversion in spherical coordinates

Abstract

Magnetic vector inversion has received considerable attention over recent years for processing magnetic field data that are affected by remanent magnetization. However, the magnetization models obtained with current inversion algorithms are generally too smooth to be easily interpreted geologically. To address this, we review the magnetic vector inversion formulated in a spherical coordinate system. We tackle convergence issues posed by the non-linear transformation from Cartesian to spherical coordinates by using an iterative sensitivity weighting approach and a scaling of the spherical parameters. The spherical formulation allows us to impose sparsity assumptions on the magnitude and direction of magnetization independently and, as a result, the inversion recovers simpler and more coherent magnetization orientations. The numerical implementation of our algorithm on large-scale problems is facilitated by discretizing the forward problem using tiled Octree meshes. All of our results are generated using the open-source SimPEG software. We demonstrate the enhanced capabilities of our algorithm on a large airborne magnetic survey collected over the Kevitsa Ni-Cu-PGE deposit. The recovered magnetization direction inside the intrusive and in the host stratigraphy is consistent with laboratory measurements and provides evidence for tectonic deformation.

Keywords:3Dinversionmagnetizationnonlinear

1Introduction

The study of magnetism in rocks has a long history in Earth sciences and continues to play a key role in mineral exploration and tectonic studies Kissel & Laj (1989)Pueyo et al. (2016)Li et al. (2019). Magnetic susceptibility, the ability of rocks to become magnetized by the geomagnetic field, is a useful property for mapping geology under cover. To this end, a large number of datasets, from global satellite measurements to deposit scale surveys, have been made available over the years. These data sets have also brought to light the prevalence of remanent magnetization: a permanent magnetization direction acquired by certain minerals and often associated with mineral deposits such as diamondiferous kimberlites, volcanogenic massive sulphides and porphyry deposits Henkel (1991)Enkin (2014). The remanent component of magnetization is typically ignored in the interpretation of magnetic data, and therefore it is often considered as “noise” which complicates the geologic interpretation.

Meanwhile, the same remanent component has been used extensively for paleomagnetic studies and regarded as meaningful geophysical “data”. A number of researchers have used the permanent magnetization orientation to map continental block rotation Norris & Black (1961)Vine & Matthews (1963)Kissel & Laj (1989), for fold and thrust belts reconstruction Ramon et al. (2012)Villalaín et al. (2016) and in geochronology Henkel (1991)Lockhart et al. (2004)Enkin (2003).

While these studies provide valuable information about Earth’s history, they have relied primarily on laboratory measurements performed on oriented cores. Analysis based on point measurements of magnetization direction remains a limiting factor in understanding the spatial and temporal variability of large fold and thrust belts systems Pueyo et al. (2016).

Geophysical inversions provide an approach for extracting information about magnetization and constructing a 3D model of the subsurface from observed total magnetic field (TMI) data.

Imaging the shape and depth of magnetic bodies can help identify mineral deposits, while recovering the orientation of magnetization can benefit paleomagnetic studies.

As noted in the review by Li (2017), inversion strategies for magnetic data put forward in the literature can be broadly grouped into three main categories. In the first group, the magnetization orientation of compact bodies is estimated through search algorithms Fedi et al. (1994)Dannemiller & Li (2006), magnetic moment analysis Helbig (1963)Phillips (2008), and inversion methods. The most common inversion approach uses simple parametric shapes to approximate elongated and tabular bodies Foss & McKenzie (2011)Fullagar & Pears (2013)Clark (2014)Pratt et al. (2014). These methods are computationally inexpensive and work well when the magnetic response can be isolated, but, in practice, they rely heavily on experts to build and test different scenarios; this can become impractical in complex geological settings.

The second group deals with remanence by transforming the field data into a quantity that is weakly sensitive to the orientation of magnetization. Magnetic amplitude data, for example, can be inverted for an effective susceptibility model; this approach was first introduced by Shearer (2005,  p. 52). While the amplitude inversion method has proven to be successful in identifying geological features Krahenbuhl & Li (2007)Li et al. (2010), several drawbacks persist. First, the amplitude data must be calculated by pre-processing the observed TMI data either through a Fourier transform or by the equivalent source method Li et al. (2014). This process may introduce unintended biases or numerical errors into the inversion. Secondly, the solution does not provide information about the direction of magnetization unless estimated in post-processing. Thirdly, the conversion to amplitude removes phase information from the data. This makes dip interpretation more difficult.

The third category, which is the focus of this study, aims to recover a 3D distribution of magnetization vectors directly from TMI data. Lelievre & Oldenburg (2009) introduced a Magnetic Vector Inversion algorithm, implemented in both Cartesian (MVI-C) and spherical (MVI-S) coordinate systems.

The MVI-C formulation is closely related to the method proposed by Kubota & Uchiyama (2005), and later adopted by Ellis et al. (2012). This has substantially improved the interpretation of magnetic data and it is now frequently used by industry.

Solving for a vector model remains difficult however, since it results in a large underdetermined inverse problem with three times the number of variables (i.e. spatial components) compared to conventional inversions. The method relies heavily on the regularization function and often yields a smooth model that is too blurred for direct geological interpretation.

In order to deal with this deficiency, Li & Sun (2016) introduce a fuzzy c-means clustering technique to force the magnetization to be in a set number of domains.

Sparsity constraints on the vector components have been proposed, either directly applied to the Cartesian components through Gramian constraints Zhu et al. (2015), or indirectly through a cooperative approach Liu et al. (2015)Fournier (2015,  p. 101). These ‘focusing’ methods have shown success in better recovering the shape of compact bodies, but ambiguity remains about the magnetization direction.

Lelievre & Oldenburg (2009) advocate the use of the spherical representation so that constraints can be applied independently on the amplitude and orientation angles. This formulation is well-suited for incorporating sparsity and physical property constraints, but its implementation remains challenging. The nonlinear transformation makes the inversion prone to converge to a local minimum. The main reason is that the inversion parameters have different units (length, radian) and their respective sensitivities vary over several orders of magnitude. This issue has partially been addressed in Liu et al. (2017) with a fixed scaling parameter.

In this study, we propose to address the current limitations encountered with the MVI-S algorithm. Building upon the work of Liu et al. (2017), we define two scaling mechanisms: the first addresses the differing magnitudes of parameters in spherical coordinates and the second compensates for rapid changes in the sensitivity due to the non-linearity of the problem. We demonstrate the benefit of our iterative sensitivity re-weighting strategy on a simple two-parameter problem. Employing a stable MVI-S algorithm allows us to impose penalties on the magnetization direction and amplitude independently, which allows us to recover compact bodies with coherent magnetization direction. We use a tiled Octree mesh decoupling approach to deal with the large memory footprint of the problem; this removes the need for compression. We first showcase our approach on a synthetic block model. The same algorithm is applied to an airborne magnetic field data set acquired over the Kevitsa Ni-Cu-PGE deposit, Finland. We unravel the complex magnetic signal to recover geologic units and their magnetization directions and we recover regions of high magnetization at depth which are in good agreement with known dunite units. The orientation of magnetization is consistent with previously published measurements performed on core samples. We infer tectonic deformation from the orientation of magnetization recovered within the folded host stratigraphy.

2Methodology

Our goal is to image the subsurface using magnetic field data.

From Gauss’s law, the forward relation between magnetization and the magnetic field response can be expressed as:

b(r)=μ04πV1rM  dV  ,\vec{b}(r) = \frac{\mu_0}{4\pi} \int_{V} \nabla \nabla \frac{1}{ r} \cdot \vec{M} \; dV\;,

where b\vec{b} is the magnetic flux density in teslas (T) and rr is the radial distance between an arbitrary position and the magnetic source with magnetization per unit volume M\vec{M} (A/m).

For most geophysical applications, we do not measure the vector field b\vec b of magnetized bodies, but rather the amplitude of the field, or TMI data, that includes both the geomagnetic and secondary fields

bTMI=b0+bb^{TMI} = \|\vec b_0 + \vec b \|

where b0=μ0h0\vec b_0 = \mu_0\vec h_0 is the geomagnetic flux density.

The anomalous field data, dTMAd^{TMA}, is computed by subtracting b0\|\vec b_0\| from the TMI data. Thus

dTMA=b0+bb0bb^0\begin{split} {d}^{TMA} &= \|\vec b_0 + \vec b \| - \|\vec b_0\| \\ &\approx \vec{b}\cdot \hat b_0 \end{split}

This approximation is valid so long as bb0\|\vec b\| \ll \|\vec b_0\|.

In matter, the total magnetization per unit volume can be separated into its induced and remanent components such that:

M=κ(h0+hs)+Mr  ,\vec{M} = \kappa (\vec{h}_0 + \vec{h}_s) + \vec{M}_{r}\;,

where the magnetic susceptibility κ (SI) is the physical property describing the ability of a rock to get magnetized under an applied field. In nature this inducing field has two components: the Earth’s geomagnetic field h0\vec{h}_0 and secondary fields hs\vec{h}_s related to local magnetic anomalies.

The remanent magnetization Mr\vec{M}_{r} is a permanent magnetization preserved in the absence of an inducing field. This is the quantity of interest in the field of paleomagnetism as it preserves information about the inducing field direction and position of a rock at the time of remanence acquisition.

Our goal is to recover a 3D model mm describing some quantity related to magnetization (m:=Mm:=\vec{M}) from the observed data.

The inverse problem can be formulated as an optimization problem of the form:

minm  ϕ(m)=  ϕd+βϕmsubject to  ϕdϕd  .\begin{split} \underset{{m}}{\text{min}}\; \phi({m}) & = \; \phi_d + \beta \phi_m \\ \text{subject to} \; \phi_d & \leq \phi_d^* \; . \end{split}

Tikhonov et al. (1995,  p. 7).

The objective function ϕ(m)\phi({m}) has two terms.

The misfit function,

ϕd=i=1N(diprediobsσi)2  ,\phi_d =\sum_{i=1}^{N}\left(\frac{{d}_i^{pre} - {d}_i^{obs}}{\sigma_i}\right)^2 \;,

measures the residual between the predicted data dpre=F[m]{d}^{pre} = F[m] and observed dobs{d}^{obs} data normalized by estimated uncertainties σ{\sigma}. Assuming that the noise and error on the data are random, our expected target misfit ϕd=N\phi_d^*=N, where NN is the number of data.

The model objective function, ϕm\phi_m, also known as the regularization function, is typically made up of several functions that control the magnitude and roughness of the model. The trade-off parameter β controls the relative importance between the two competing objectives.

In this research, we use the general lpl_p-norm measure of the model and its spatial gradients as our model objective function

ϕm=r=s,x,y,zαjVwr  fr(m)pr  dV  .\begin{split} \phi_m &= \sum_{r=s,x,y,z} \alpha_j \int_V w_r \;|f_r (m)|^{p_r} \;dV\;. \\ \end{split}

The regularization functions frf_r are user-defined but most often have the following form

fs=m,  fx=dmdx,  fy=dmdy,  fz=dmdz  .\begin{split} f_s= m,\;f_x= \frac{d m}{dx},\; f_y= \frac{d m}{dy},\;f_z= \frac{d m}{dz}\;. \end{split}

Thus fs(m)f_s(m) measures the size of mm and fx(m)f_x(m), fy(m)f_y(m) and fz(m)f_z(m) measure the roughness along orthogonal directions. These functionals can also include a reference model mrefm_{ref} but for the sake of notation we set it to zero.

Weighting terms wrw_r allow the user to adjust the strength of the regularization function in specific regions of the inversion domain.

The general strategy to minimize equation 5 requires that we discretize the model onto a mesh (we use m\mathbf{m} to denote the discretized model which has length MM) and find a solution such that the gradient of the objective function

g=mϕ(m)=mϕd+β[αsmϕs+αxmϕx+αymϕy+αzmϕz]=0  .\mathbf{g} = \nabla_m \phi(\mathbf{m}) = \nabla_m \phi_d + \beta \bigg[ \alpha_s \nabla_m \phi_s + \alpha_x \nabla_m \phi_x + \alpha_y \nabla_m \phi_y + \alpha_z \nabla_m \phi_z \bigg] = \mathbf{0} \;.

A solution to the optimization (equation 5) is found for decreasing values of the trade-off parameter β; we stop when we find a suitable model that reproduces the data to within a predefined tolerance:

ϕdϕdϕdηϕd  .\frac{|\phi_d- \phi_d^*|}{\phi_d^*} \leq \eta_{\phi_d}\;.

Building upon the work presented in Fournier & Oldenburg (2019), we solve the nonlinear problem in equation 7 by the Scaled Iteratively Reweighted Least Squares (S-IRLS) method.

The p\ell_p-norm is approximated by the Lawson method, such that equation 7 is expressed as a weighted least-squares regularization of the form

ϕrpr=jMfrj2((frj(k1))2+ϵr2)1pr/2wj  ,\phi_r^{p_r} = \sum_{j}^M\frac{f_{r_j}^2}{{{((f_{r_j}^{(k-1)})^{2} + \epsilon_r^2 )}^{1-p_r/2}} }w_j\;,

where kk denotes the iteration number and the subscript rr is one of the functions making up the regularization (e.g. ss, xx, yy, zz). The regularization function in equation 12 allows for the independent mixing of norms on the complete interval 0pr20 \leq p_r \leq 2. This lets us generate a suite of models with different characteristics and assess the variability of the solution by varying the prp_r parameters. For small psp_s values, the inversion favors compact anomalies with large physical property contrasts, while reducing the pxyzp_{xyz} values generate flat anomalies with sharp edges along the Cartesian directions. This formulation greatly increases the flexibility of inversion outcomes compared to those using conventional 2\ell_2-norm measures.

We express the regularization function equation 7 in matrix form as:

ϕm=r=s,x,y,zαrWr  Rr  Dr  m22  ,\begin{split} \phi_m = \sum_{r = s,x,y,z} \alpha_r {\|\mathbf{W}_{r} \;\mathbf{R}_r \; \mathbf{D}_r \; \mathbf{m}\|}^2_2 \;, \end{split}

where the gradient terms Dx,Dy\mathbf{D}_x, \mathbf{D}_y and Dz\mathbf{D}_z are finite difference operators measuring the model roughness along the three Cartesian directions, and Ds\mathbf{D}_s is the identity matrix. Once again, each term may, or may not, include a reference model mref\mathbf{m}_{ref}. Hyper-parameters αr\alpha_r allow the user to change the relative influence of each term. In this study we set αr=1\alpha_r=1 for simplicity.

Sparsity weights Rr\mathbf{R}_r are the discretized version of equation 11 and are calculated as:

Rr=diag[γr(Dr(m(k1))2+ϵr2)(1pr/2)]1/2\begin{split} \mathbf{R}_r = \text{diag} \left[ \gamma_r {\Big( {{\mathbf{D}_r\:(\mathbf{m}^{(k-1)}}})^{2} + \epsilon_r^2 \Big)}^{- (1 - p_r/2)} \right]^{1/2} \\ \end{split}

such that weights depend on the model obtained at a previous kthk^{th} iteration. The γr\gamma_r scaling parameters are used to balance the contribution of different p\ell_p-norms based on the maximum derivatives such that:

γr=gr2grpr  .\gamma_r = \frac{ \left\| \mathbf{g}_r^2 \right\|_\infty}{\left\|\mathbf{g}_r^{p_r}\right\|_\infty}\;.

In equation 14, the superscript denotes the pp-value used to approximate the p\ell_p-norm used in equation 13. Our objective is to minimize equation 12 for ϵr108\epsilon_r \rightarrow 10^{-8}, which we do along scaled gradient steps such that each regularization term ϕr\phi_r remains influential throughout the iterative process.

More details about the S-IRLS algorithm are provided in Fournier & Oldenburg (2019).

Lastly, the sensitivity weighting functions Wr\mathbf{W}_{r} are used to counteract spatial changes in sensitivities; designing these weights is the main focus of this research. In the work of Li & Oldenburg (1996), a distance weighting function is employed and is fixed at the onset. In this study we advocate for an iterative re-weighting strategy calculated directly from the sensitivity of a given problem. Adapted from Haber et al. (1997), we formulate the sensitivity-based weighting function:

Wr=PCFrdiag[[wmax(w)]1/2]wj=[i=1NJij2+δ]1/2  ,\begin{split} \mathbf{W}_r &= \mathbf{P}_C^{F_r} \rm{diag} \left[ {\left[\frac{\mathbf{ w}}{\rm{max}(\mathbf{ w})}\right]}^{1/2}\right]\\ w_{j} &= {\left[\sum_{i=1}^{N}{J_{ij}}^2 + \delta \right]}^{1/2}\;, \end{split}

where weights wjw_j measure the sum of squares of the columns of the sensitivity matrix

J=  F[m(k)]m  .\mathbf{J} = \frac{\partial \; \mathbb{F}[\mathbf{m}^{(k)}]}{\partial \boldsymbol{\mathbf{m}}}\;.

The projection matrices PCFr\mathbf{P}_C^{F_r} average the sensitivity weights from cell-center to corresponding faces (or the identity matrix for ϕs\phi_s). A small constant δ is added to avoid singularity. This sensitivity weighting strategy is general and adaptable to any inverse problems where the sensitivity matrix can be calculated explicitly.

While the initial purpose of the sensitivity weighting function of Li & Oldenburg (1996) is to simply counteract the decay of potential fields, we will show numerically that the iterative re-scaling process can also be beneficial in improving the convergence of gradient methods applied to non-linear inverse problems.

2.0.1Susceptibility Inversion

To illustrate some challenges that remanent magnetization can cause in magnetic inversions, we revisit the linear susceptibility method implemented by Li & Oldenburg (1996), Pilkington (1997) and others.

As derived by Sharma (1966), the integral in equation 1 can be evaluated analytically for the magnetic field of a rectangular prism such that:

F[m]=dpre=b^0Tm  ,F[m] = d^{pre} = \mathbf{\hat b}_0^\top \: \mathbf{T} \:\mathbf{m}\;,

where the matrix TR3×3\mathbf{T} \in \mathbb{R}^{3 \times 3} describes the linear relation between the magnetic field components measured outside a discrete prism with magnetization m=[mx,  my,  mz]\mathbf{m} = [m_x,\;m_y,\;m_z]^\top. The dot product with the normalized inducing flux b^0\mathbf{\hat b}_0 handles the TMA projection.

Different assumptions regarding the magnetization in equation 4 give rise to different inverse problems.

We will assume here that the magnetic response is purely induced along Earth’s field and ignore remanent and self-demagnetization effects (Mr=hs=0\vec{M}_{r}=\vec{h}_s=0).

Under this assumption, the definition of magnetization equation 4 simplifies to

M=κh0  .\vec{M}= \kappa \vec{h}_0\;.

This assumption gives rise to a linear system relating NN data, dpre\mathbf{d}^{pre}, to MM discrete model cells of magnetic susceptibility κ\boldsymbol \kappa

dpre=  F  κdpreRN,FRN×M,κRM  .\begin{split} \mathbf{d}^{pre}&= \;\mathbf{F\;\boldsymbol{\kappa}} \\ \mathbf{d}^{pre} \in \mathbb{R}^{N},\mathbf{F}& \in \mathbb{R}^{N \times M}, \boldsymbol{\kappa} \in \mathbb{R}^{M}\;. \end{split}

As an entry point to the inverse problem with remanent magnetism, we provide a synthetic experiment. From equation 18 we generate magnetic data on 441 stations on a 21×2121 \times 21 grid. The data is centered over a magnetized cube 25 m in width and placed 15 m below the grid (Figure 1a). The earth is discretized in 5 m cubic cells. The magnetization of the block is made up of an induced and a remanent component. We set the magnetic susceptibility of the cube to be κ=0.035 SI and the inducing flux to be b0\vec{b}_0 [50,000 nT, I: 9090^\circ, D: 00^\circ]. We set the remanent component equal in magnitude and pointing along the x-axis Mr\vec{M}_r [1.4A/m, I: 0°, D: 9090^\circ]. This results in a total magnetization M\vec{M} [2.0 A/m, I: 4545^\circ, D: 9090^\circ] as shown in Figure 1b. From equation 18, we calculate the TMA data with random Gaussian noise added to simulate field conditions (σ = 1 nT).

(a) Vertical section through the synthetic block model (Y = 0 m) with magnetization \mathbf{M} [2.0\: A/m,\; I: 45^\circ,\; D: 90^\circ](\kappa_e = 0.05 SI). Survey points (black dots) are placed 15 m above the magnetic anomaly. (b) Simulated TMI data map with random Gaussian noise added, 1 nT standard deviation. The horizontal position of the block is shown in black for reference.

Figure 1:(a) Vertical section through the synthetic block model (Y = 0 m) with magnetization M\mathbf{M} [2.0A/m,  I:45,  D:902.0\: A/m,\; I: 45^\circ,\; D: 90^\circ](κe\kappa_e = 0.05 SI). Survey points (black dots) are placed 15 m above the magnetic anomaly. (b) Simulated TMI data map with random Gaussian noise added, 1 nT standard deviation. The horizontal position of the block is shown in black for reference.

Since we are dealing with strictly positive magnetic susceptibility κ, we impose bounds by the projected gradient method Vogel (2002,  p. 157) and use a Gauss-Newton method to solve the inverse problem.

A gradient descent direction δm\delta \mathbf{m} is calculated by solving

H  δm=g  ,\mathbf{H}\; \delta \mathbf{m} = -\mathbf{g}\;,

where g\mathbf{g} is the gradient in equation 9 of the objective function

g=  JWdWd(F[m(k1)]dobs)+βr=s,x,y,zαrDrRrWrWr  Rr  Drm(k1)  ,\begin{split} \mathbf{g} =\; \mathbf{J}^\top \mathbf{W}_d^\top \mathbf{W}_d \left( \mathbb{F}[\mathbf{m}^{(k-1)}] - \mathbf{d}^{obs}\right)+ \beta \sum_{r = s,x,y,z} \alpha_r \mathbf{D}_r^\top \mathbf{R}_r^\top \mathbf{W}_r^\top \mathbf{W}_r \;\mathbf{R}_r \; \mathbf{D}_r \mathbf{m}^{(k-1)}\;, \end{split}

and H\mathbf{H} is the approximate Hessian:

2ϕm2H=  JWdWdJ+βr=s,x,y,zαrDrRrWrWr  Rr  Dr  .\begin{split} \frac{\partial^2 \phi}{\partial m^2} \approx \mathbf{H} =\;& \mathbf{J}^\top \mathbf{W}_d^\top \mathbf{W}_d \mathbf{J} + \\ & \beta \sum_{r = s,x,y,z} \alpha_r \mathbf{D}_r^\top \mathbf{R}_r^\top \mathbf{W}_r^\top \mathbf{W}_r \;\mathbf{R}_r \; \mathbf{D}_r \;. \end{split}

We use the Conjugate Gradient method Hestenes & Stiefel (1952) to solve equation 19.

The model update at the kthk^{th} iteration is then given by

m(k)=m(k1)+αδm\mathbf{m}^{(k)} = \mathbf{m}^{(k-1)} + \alpha \delta \mathbf{m}

where the step length α is found by a line-search method Nocedal & Wright (1999,  p. 30).

The optimization problem is solved for a sequence of β-values until the data misfit reaches the user-defined tolerance ηϕd\eta_{\phi_d} defined in equation 10.

Since this problem is linear with respect to the model parameters, the sensitivity matrix simplifies to:

J=  F[κ]κ=F\mathbf{J} = \frac{\partial \; \mathbb{F}[\boldsymbol{\kappa}]}{\partial \boldsymbol{\kappa}} = \mathbf{F}

and does not change as a function of iteration.

We begin with the conventional approach that assumes a smooth model (psp_s, pxp_x, pyp_y, pz=2p_z = 2). After reaching the target misfit criteria in equation 10, we recover the susceptibility model shown in Figure 2a. We note that the position of the susceptibility anomaly is shifted to the side of the true block and appears to dip at 4545^\circ angle towards the west. This is due to the large negative lobe introduced in the data by the remanent component that we have ignored. Attempting to improve the solution by solving for a sparse model (psp_s, pxp_x, pyp_y, pz=0p_z = 0) yields the solution presented in Figure 2b. The magnetic anomaly is imaged at the right depth and the vertical extent is better recovered, but the position and shape of the anomaly have not improved. In both cases, the data residual maps (Figure 2c and 2d) show correlated signal with the location of negative data. The inversion has difficulty reproducing the negative anomalies using strictly positive susceptibility subject to a vertical inducing field.

Vertical section through the recovered susceptibility model using (a) smooth assumption (p_s, \;p_x,\; p_y,\; p_z = 2) and (b) sparse \ell_p-norms to recover a compact model (p_s,\;p_x,\; p_y,\; p_z = 0). Both solutions show an anomaly with a false dip due to the wrong assumption of a vertical magnetization. Data residuals for (c) the smooth and (d) compact solution shows correlated signal with the negative data.

Figure 2:Vertical section through the recovered susceptibility model using (a) smooth assumption (ps,  px,  py,  pz=2p_s, \;p_x,\; p_y,\; p_z = 2) and (b) sparse p\ell_p-norms to recover a compact model (ps,  px,  py,  pz=0p_s,\;p_x,\; p_y,\; p_z = 0). Both solutions show an anomaly with a false dip due to the wrong assumption of a vertical magnetization. Data residuals for (c) the smooth and (d) compact solution shows correlated signal with the negative data.

The presence of remanence has long been recognized as an obstacle for the geological interpretation of magnetic data.

In a mining exploration context, having the wrong image could result in false drilling targets which is costly in time, resources, and confidence in geophysical methods.

These factors motivate the need for a more robust algorithm that does not require knowledge about the orientation of magnetization.

2.0.2Magnetic Vector Inversion - Cartesian parameters

Generalizing the susceptibility inversion, Lelievre & Oldenburg (2009)

proposed a strategy to directly recover the magnetization vector without assumptions about the orientation. They define an effective susceptibility parameter that scales the strength of magnetization along orthogonal directions such that

κe=Mh0  .\vec \kappa_e = \frac{\vec{M}}{\|\vec{h}_0\|} \;.

Re-writing the discrete system in equation 18 in terms of three orthogonal components of magnetization (u,v,wu, v, w), we obtain the augmented system:

dpre=Feκe=[FuFvFw]κuκvκwFu,  Fv,  FwRN×M  ,\begin{split} \mathbf{d}^{pre} &= \mathbf{F}_{e} \boldsymbol{\kappa}_e \\ &= [ \mathbf{F}_u \: \mathbf{F}_v \: \mathbf{F}_w] \begin{matrix} \boldsymbol{\kappa}_{u}\\ \boldsymbol{\kappa}_{v}\\ \boldsymbol{\kappa}_{w} \end{matrix} \\ \mathbf{F}_u,\; &\mathbf{F}_v, \;\mathbf{F}_w \in \mathbb{R}^{N \times M} \;, \end{split}

where Fu\mathbf{F}_u, Fv\mathbf{F}_v and Fw\mathbf{F}_w are forward operators for the components of magnetization.

We are now dealing with a linear system that has three times the number of unknown parameters compared to the susceptibility assumption (κeR3M\boldsymbol{\kappa}_e \in \mathbb{R}^{3M}). The regularization function in equation 12 becomes

ϕm=c=u,v,w  r=s,x,y,zαcrWr  Rcr  Dcr  Pc  κe22  ,\begin{split} \phi_m = \sum_{c = u,v,w} \; \sum_{r = s,x,y,z} \alpha_{c_r} {\|\mathbf{W}_r \;\mathbf{R}_{c_r} \; \mathbf{D}_{c_r} \;\mathbf{P}_{c} \; \boldsymbol{\kappa}_e\|}^2_2 \;, \end{split}

where the projection matrices Pc\mathbf{P}_{c} select individual components of the vector model κe\boldsymbol{\kappa}_e. Our regularization is made up of 12 terms. Norm measures can be applied to each Cartesian component independently.

Keeping the same inversion methodology and smooth assumptions (pcs,  pcx,  pcy,  pcz=2p_{c_s}, \;p_{c_x},\; p_{c_y},\; p_{c_z} = 2), we recover the magnetization model presented in Figure 3a.

This solution is an improvement over the susceptibility inversion as the bulk magnetization is recovered at the right position and with the correct magnetization orientation on average inside the block. We note however that the solution is distributed over a large volume and with a broad distribution in magnetization direction.

Vertical section through the recovered magnetization vector model using the Cartesian formulation with (a) smooth l_2-norm assumption and (b) sparsity constraints applied on all three Cartesian components (p_{i_s}, \;p_{i_x},\; p_{i_y},\; p_{i_z} = 0). Color is scaled by the magnitude of magnetization. The true magnetization direction is shown with a red arrow. Data residual map calculated with (c) the smooth and (d) compact solution.

Figure 3:Vertical section through the recovered magnetization vector model using the Cartesian formulation with (a) smooth l2l_2-norm assumption and (b) sparsity constraints applied on all three Cartesian components (pis,  pix,  piy,  piz=0p_{i_s}, \;p_{i_x},\; p_{i_y},\; p_{i_z} = 0). Color is scaled by the magnitude of magnetization. The true magnetization direction is shown with a red arrow. Data residual map calculated with (c) the smooth and (d) compact solution.

In order to reduce the complexity of the solution, we once again resort to the p\ell_p-norm measure (pcs,  pcx,  pcy,  pcz=0p_{c_s}, \;p_{c_x},\; p_{c_y},\; p_{c_z} = 0). As shown in Figure 3b, the recovery of the block has clearly improved. It is important to point out, however, that the magnetization vectors are pointing along the Cartesian directions and the anomaly is slightly wider than the true model. In the Cartesian formulation, both the direction and strength of magnetizations are coupled in the vector components. This formulation therefore lacks flexibility in recovering sparse vector along arbitrary orientations. This was the main motivation behind recent research investigating more advanced regularization methodologies Zhu et al. (2015)Liu et al. (2015)Fournier (2015,  p. 101). We will attempt to improve this solution by decoupling the strength and direction of the magnetization vector with the spherical formulation.

2.0.3Magnetic Vector Inversion - Spherical parameters

As an alternative to the Cartesian formulation, Lelievre & Oldenburg (2009) also proposed the vector inversion in a spherical coordinate system.

The conversion between Cartesian to spherical system follows the relation:

u=ρ  cos(θ)  cos(ϕ)v=ρ  cos(θ)  sin(ϕ)w=ρ  sin(θ)\begin{split} u = & \rho \; \cos(\theta)\;\cos(\phi) \\ v = & \rho \; \cos(\theta)\;\sin(\phi) \\ w = & \rho \; \sin(\theta) \end{split}

where the magnetization vector is defined by parameters of amplitude (ρ{\rho}) and two angles (θ{\theta}, ϕ{\phi}).

The spherical formulation separates the magnitude and orientation of magnetization vector, which comes with two advantages. First, physical property constraints from rock measurements (Koenigsberger ratio, magnetization angle) can easily be incorporated. Second, sparsity constraints can be applied to the magnitude and orientation independently, potentially resulting in compact bodies with uniform magnetization direction in any orientation.

Despite its obvious advantages, the MVI-S method has received little attention in the literature due to the non-linear transformation it introduces.

We demonstrate challenges encountered with the spherical approach on our synthetic problem.

Taking the partial derivatives of equation 25 as a function of m(ρ,θ,ϕ)\mathbf{m}( \boldsymbol{\rho}, \boldsymbol{\theta}, \boldsymbol{\phi}) yields:

J=Fe[κe]m=Fe[κe]κeκem  ,\mathbf{J} = \frac{\partial \mathbf{F}_{e}[\boldsymbol{\kappa}_e]}{\partial \mathbf{m}} = \frac{\partial \mathbf{F}_{e}[\boldsymbol{\kappa}_e]}{\partial \boldsymbol{\kappa}_e} \frac{\partial \boldsymbol{\kappa}_e}{\partial \mathbf{m}}\;,

where kem\frac{\partial \mathbf{k}_e}{\partial \mathbf{m}} involves partial derivatives of trigonometric functions prescribed in equation 27.

The sensitivity matrix can be linearized before each Gauss-Newton step as:

J=FeS  ,\mathbf{J} = \mathbf{F}_{e}\: \mathbf{S}\;,

where the matrix S\mathbf{S} holds the partial derivatives

S=[cosθcosϕρsinθcosϕρcosθsinϕcosθsinϕρsinθsinϕρcosθcosϕsinθρcosθ0]  .\mathbf{S} = \begin{bmatrix} \cos{\theta}\cos{\phi} & -\rho\sin{\theta}\cos{\phi} & -\rho\cos{\theta}\sin{\phi} \\ \cos{\theta}\sin{\phi} & -\rho\sin{\theta}\sin{\phi} & \rho\cos{\theta}\cos{\phi} \\ \sin{\theta} & \rho\cos{\theta} & 0 \end{bmatrix}\;.

With this choice of parameterization, the regularization function becomes

ϕm=c=ρ,θ,ϕ  r=s,x,y,zαcrWr  Rcr  Dcr  Pc  m22  .\begin{split} \phi_m = \sum_{c = \rho,\theta,\phi} \; \sum_{r = s,x,y,z} \alpha_{c_r} {\|\mathbf{W}_r \;\mathbf{R}_{c_r} \; \mathbf{D}_{c_r} \;\mathbf{P}_{c} \; \mathbf{m}\|}^2_2 \;. \end{split}

We note that zero reference angle values θref\boldsymbol \theta_{ref} and ϕref\boldsymbol \phi_{ref} would imply a magnetization direction pointing along the xx-axis. Since we do not want to assume a specific orientation (no ground truth), we set αθs=αϕs=0\alpha_{\theta_s}=\alpha_{\phi_s} = 0 in all our experiments such that the regularization only penalizes the change in angle between neighboring cells.

To demonstrate the difficulties that the nonlinearity of the MVI-S formulation can cause in the inversion, we invert our synthetic TMI data with the wrong initial assumption m0(ρ=102,θ=45,ϕ=0\mathbf{m_0}(\rho=10^{-2}, \theta=-45^{\circ}, \phi=0^{\circ}), such that the starting magnetization orientation is 9090^\circ from the true model (Figure 4a). After convergence of the algorithm, we recover the model shown in Figure 4b. The solution is a poor representation the true magnetization. Model updates were forced to stop before reaching the target data misfit and the optimization is likely trapped in a local minimum. We note that most of the model updates were performed on the amplitude ρ, with only marginal changes on the angle of magnetization. Similar behaviors have been documented by Lelievre & Oldenburg (2009) and later by Liu et al. (2017), who attributed the problem to an imbalance between the model parameters. Before attempting to implement more advanced constraints, we make improvements to the convergence of the non-linear MVI-S formulation.

Vertical section through the (a) starting model and (b) recovered magnetization vector model in spherical coordinates with its (c) data residual map. The inversion stopped after three iterations, unable to further reduce the objective function.

Figure 4:Vertical section through the (a) starting model and (b) recovered magnetization vector model in spherical coordinates with its (c) data residual map. The inversion stopped after three iterations, unable to further reduce the objective function.

2.1Iterative sensitivity re-weighting

To gain some insight about the issues encountered with the MVI-S formulation, we consider a simpler two-parameter linear problem of the form

x+2y=1  ,x + \:2\:y =1 \;,

which we can express in matrix form as

FC  mC=dobs  ,FC=[1  2],  mC=xy,  dobs=1\begin{split} \mathbf{F}_C\;\mathbf{m}_C &= \mathbf{d}^{obs} \;,\\ \mathbf{F}_C = [1\;2], \; \mathbf{m}_C &= \begin{matrix} x\\ y \end{matrix}, \; \mathbf{d}^{obs} = 1 \end{split}

This defines an under-determined linear system of equations. Just as we did for the magnetic inverse problem, we can isolate a solution by minimizing an objective function of the form

ϕ(m)=FC  mCdobs22+βCmC22  .\phi(m) = \| \mathbf{F}_C\;\mathbf{m}_C - \mathbf{d}^{obs} \|_2^2 + \beta_C \| \mathbf{m}_C \|_2^2 \;.

Figure 5a displays a contour map of the objective function along with its gradients.

Contour maps for the objective functions of a two-parameter inverse problem solved in (a) Cartesian and (b) polar coordinate systems. Solid colored lines show the model updates taken by different algorithms in their respective coordinate systems and, in dash, the equivalent steps in the other domain for comparison. Each inversion started with the same initial model (triangle). The colors and inversions are: Gray: the non-weighted Cartesian problem; Black: the sensitivity weighted Cartesian problem; Red: the non-linear polar coordinate system with fixed sensitivity weights; Green: the polar problem with iterative sensitivity re-weighting; and Blue: with the added scaling to compensate for the dynamic range of the parameters (see equation 50).

Figure 5:Contour maps for the objective functions of a two-parameter inverse problem solved in (a) Cartesian and (b) polar coordinate systems. Solid colored lines show the model updates taken by different algorithms in their respective coordinate systems and, in dash, the equivalent steps in the other domain for comparison. Each inversion started with the same initial model (triangle). The colors and inversions are: Gray: the non-weighted Cartesian problem; Black: the sensitivity weighted Cartesian problem; Red: the non-linear polar coordinate system with fixed sensitivity weights; Green: the polar problem with iterative sensitivity re-weighting; and Blue: with the added scaling to compensate for the dynamic range of the parameters (see equation 50).

Following the same methodology as in equation 9, we find a solution such that the gradient of the objective function ϕ(m)\phi(m) vanishes

ϕm=g=(FCFC+βCI)mCFCdobs=0  ,\begin{split} \frac{\partial \phi}{\partial \mathbf{m}}=\mathbf{g} = \left(\mathbf{F}_C^\top\mathbf{F}_C + \beta_C \mathbf{I}\right) \mathbf{m}_C - \mathbf{F}_C^\top \mathbf{d}^{obs} &= \mathbf{0}\;, \end{split}

where I\mathbf{I} is the identity matrix. The factor of 2 from the derivative of the 2\ell_2-norm is absorbed by the zero right-hand side.

After determining a trade-off parameter βC\beta_C such that ϕd103\phi_d \leq 10^{-3}, we recover the Cartesian model mC[x=0.2,y=0.4]\mathbf{m}_C[x=0.2, y=0.4]. It is the solution that minimizes the distance (evaluated with the 2\ell_2-norm) between the origin and the solution space of FC\mathbf{F}_C. We note that the relative magnitudes of model parameters in mC\mathbf{m}_C reflect the size of the forward coefficients in FC\mathbf{F}_C.

As previously discussed for the magnetic problem, the smallest solution is often not satisfactory as it is strongly influenced by the physics of the experiment.

From equation 15, we can introduce sensitivity-based weights to counteract this bias towards a large yy value:

WC=diag[[wCmax(wC)]1/2]wCj=[i=1NFij2]1/2  ,\begin{split} \mathbf{W}_C &= \rm{diag} \left[ {\left[\frac{\mathbf{ w}_C}{\rm{max}(\mathbf{ w}_C)}\right]}^{1/2}\right]\\ w_{C_j} &= {\left[\sum_{i=1}^{N}{F_{ij}}^2 \right]}^{1/2}\;, \end{split}

where WC\mathbf{W}_C holds sensitivity weights added to the regularization (wC=[1,2]\mathbf{w}_C=[1,\:2]^\top).

The new weighted objective function becomes

ϕ(m)=FC  mCdobs22+βCWCmC22  ,\phi(m) = \| \mathbf{F}_C\;\mathbf{m}_C - \mathbf{d}^{obs} \|_2^2 + \beta_C \| \mathbf{W}_C \mathbf{m}_C \|_2^2 \;,

and the weighted gradient is

gC=FCFCmC+βCWCWCmCFCdobs  .\begin{split} \mathbf{g}_C = \mathbf{F}_C^\top\mathbf{F}_C \mathbf{m}_C + \beta_C \mathbf{W}_C^\top\mathbf{W}_C\mathbf{m}_C - \mathbf{F}_C^\top \mathbf{d}^{obs}\;. \end{split}

After determining the appropriate βC\beta_C^*, we get the solution mC[x=0.33,y=0.33]\mathbf{m}^*_C[x=0.33, y=0.33] marked with a black circle in Figure 5a. We have reached the only solution with equal contribution from both model parameters that also predicts the data within the tolerance.

Alternatively, we can attempt to solve the same problem in a polar coordinate system under the transformation

mP=[ρ,  θ]x=ρcosθy=ρsinθ  ,\begin{split} \mathbf{m}_P &= [\rho,\;\theta]^\top\\ x &= \rho \cos{\theta} \\ y &= \rho \sin{\theta} \;,\\ \end{split}

where the polar model mP\mathbf{m}_P is defined by a radius ρ and an angle θ. This is analogous to the spherical transformation performed for the MVI-S formulation in equation 27.

The objective function to be minimized becomes

ϕ(mP)=F[mP]dobs22+βPWCmP22  .\begin{split} \phi(\mathbf{m}_P)= \| \mathbb{F}[\mathbf{m}_P] - \mathbf{d}^{obs} \|_2^2 + \beta_P \| \mathbf{W}_C \mathbf{m}_P \|_2^2 \;. \end{split}

The inverse problem is now non-linear with respect to the polar model so we solve it iteratively with the standard Gauss-Newton procedure described in equation 19.

The partial derivatives of the forward mapping with respect to the polar coordinates are calculated by

J=F[mP]mP=F[mP]mCmCmP=FCS\begin{split} \mathbf{J} = \frac{\partial \mathbb{F}[\mathbf{m}_P]}{\partial \mathbf{m}_P} &= \frac{\partial \mathbb{F}[\mathbf{m}_P]}{\partial \mathbf{m}_C} \frac{\partial \mathbf{m}_C}{\partial \mathbf{m}_P} = \mathbf{F}_C\mathbf{S} \\ \end{split}

where the matrix S\mathbf{S} holds the partial derivatives of the model with respect to the polar parameters

S=[cosθρsinθsinθρcosθ]  .\begin{split} \mathbf{S} &= \begin{bmatrix} \cos{\theta} & -\rho\sin{\theta} \\ \sin{\theta} & \rho\cos{\theta} \end{bmatrix} \;. \end{split}

The gradient of the objective function becomes

gP=SFCF[mP]+βPWCWCmPSFCdobs  .\begin{split} \mathbf{g}_P = \mathbf{S}^\top \mathbf{F}_C^\top \:\mathbb{F}[\mathbf{m}_P] + \beta_P \mathbf{W}_C^\top\mathbf{W}_C \mathbf{m}_P - \mathbf{S}^\top \mathbf{F}_C^\top \mathbf{d}^{obs} \;. \end{split}

A trade-off parameter βP\beta_P is determined through the cooling schedule established previously. The inversion is terminated once the data misfit falls below the tolerances ηϕd\eta_{\phi_d} defined in equation 10.

Since mC  [x=0.33,y=0.33]\mathbf{m}_C^*\;[x=0.33, y=0.33] is a desirable model, we would like to be able to recover a similar solution in polar parameters (mP[ρ=0.47,θ=0.76]\mathbf{m}_P^{*}[\rho=0.47, \theta=0.76]). Unfortunately, as shown in Figure 5b, the minimization process performed in polar coordinates converges to a different solution (mP  [ρ=0.67,θ=0.26]\mathbf{m}_P\;[ \rho=0.67,\:\theta=0.26]) and the iterations steps are oscillatory.

We display the equivalent iterations (red dash) in the Cartesian space for comparison (mCP  [x=0.65,y=0.17]\mathbf{m}_C^P\;[ x=0.65,\:y=0.17]). This is an unsatisfactory solution.

Our main goal is to recover the solution mP\mathbf{m}_P^{*}, and we want to reach it with only a few model updates.

To understand why the problem has arisen we compare their respective gradients for a given starting model mC(0)\mathbf{m}_C^{(0)} and its equivalent polar model mP(0)\mathbf{m}_P^{(0)}. In Cartesian coordinates, the gradient direction is

gC(0)=FCFCmC(0)+βCWCWCmC(0)FCdobs  .\begin{split} \mathbf{g}_C^{(0)} = \mathbf{F}_C^\top \:\mathbf{F}_C\mathbf{m}_C^{(0)} + \beta_C \mathbf{W}_C^\top\mathbf{W}_C \mathbf{m}_C^{(0)} - \mathbf{F}_C^\top \mathbf{d}^{obs}\;. \\ \end{split}

We can convert these gradients to polar coordinate by multiplying equation 44 with the matrix of partial derivatives S\mathbf{S} such that

gPC=S[FCFCmC(0)+βCWCWCmC(0)FCdobs]  .\begin{split} \mathbf{g}_P^C = \mathbf{S}^\top \left[ \mathbf{F}_C^\top \:\mathbf{F}_C\mathbf{m}_C^{(0)} + \beta_C \mathbf{W}_C^\top\mathbf{W}_C \mathbf{m}_C^{(0)} - \mathbf{F}_C^\top \mathbf{d}^{obs} \right] \;.\\ \end{split}

We want to compare this gradient to the gradient calculated in polar coordinates. From equation 43, and also keeping the same βC\beta_C used in the Cartesian framework, we have

gP=SFCF[mP(0)]+βCWCWCmP(0)SFCdobs  .\begin{split} \mathbf{g}_P &= \mathbf{S}^\top \mathbf{F}_C^\top \:\mathbb{F}[\mathbf{m}_P^{(0)}] + \beta_C \mathbf{W}_C^\top\mathbf{W}_C \mathbf{m}_P^{(0)} - \mathbf{S}^\top \mathbf{F}_C^\top \mathbf{d}^{obs}\;. \end{split}

Noting that FCmC(0)=F[mP(0)]\mathbf{F}_C \mathbf{m}_C^{(0)} = \mathbb{F}[ \mathbf{m}_P^{(0)}] (i.e. the forward modellings are consistent) then equation 45 and 46 are the same only if

SWCWCmC(0)WCWCmP(0)  .\begin{split} \mathbf{S}^\top \mathbf{W}_C^\top\mathbf{W}_C \mathbf{m}_C^{(0)} & \simeq \mathbf{W}_C^\top\mathbf{W}_C \mathbf{m}_P^{(0)}\;. \end{split}

We would like both sides to be roughly equal such that the gradient direction in polar space resembles the gradient direction calculated in the Cartesian space. This is unlikely since S is a coordinate transformation matrix whose columns can have quite different values (see equation 42).

The critical element is the specification of the regularization matrix for the polar system. Using the regularization matrix generated for the Cartesian system is inappropriate. Instead, we should use the sensitivity weighting for the polar problem. From equation 41 we had defined the polar sensitivities as J=FCS\mathbf{J}=\mathbf{F}_C\mathbf{S}. We can define a new weighting matrix WP\mathbf{W}_P by

WP=diag[[wPmax(wP)]1/2]wPj=[i=1NJij2]1/2  ,\begin{split} \mathbf{W}_P &= \rm{diag} \left[ {\left[\frac{\mathbf{ w}_P}{\rm{max}(\mathbf{w}_P)}\right]}^{1/2}\right]\\ w_{P_j} &= {\left[\sum_{i=1}^{N}{J_{ij}}^2 \right]}^{1/2}\;, \end{split}

and our objective function to be minimized is

ϕ(mP)=F[mP]dobs22+βPWPmP22  .\begin{split} \phi(\mathbf{m}_P) = \| \mathbb{F}[\mathbf{m}_P] - \mathbf{d}^{obs} \|_2^2 + \beta_P \| \mathbf{W}_P \mathbf{m}_P \|_2^2 \;. \end{split}

Importantly, we note that the sensitivities change at each iteration and so WP\mathbf{W}_P must be continually updated.

Inverting the non-linear problem with the iterative scaling strategies (green) we recover the model mP[ρ=0.53,θ=0.53]\mathbf{m}_P[\rho=0.53,\: \theta=0.53] (see Figure 5b). The solution has equal parameters of ρ and θ, and we reached this solution in a few iterations. In most applications, however, obtaining proportionality between the magnitude and angle of the vector is not meaningful. Converted to Cartesian space mCP[x=0.46,y=0.27]\mathbf{m}_C^P[x=0.46,\: y=0.27], we note that the solution is still different from mC\mathbf{m}_C^*.

To understand this result, we now examine equation 47 in terms of the size of the model parameters in mP\mathbf{m}_P.

We have used a regularization function to penalize two parameters with different units: the radius ρ[0,]\rho \in [0, \infty] has units of length and angle θ[π,π]\theta \in [-\pi,\: \pi] is in radians. The range of values spanned by these parameters differs in scale as depicted by the aspect ratio of Figure 5b.

Handling this disparity can be accomplished by introducing an additional weighting matrix that effectively scales the variables to the same dynamic range. We define a scaling factor between the two parameters

ω=ρθ  .\omega = \frac{\|\boldsymbol{\rho} \|_\infty}{\|\boldsymbol{\theta} \|_\infty}\;.

In a general problem with lots of variables, we can evaluate ρ\|\boldsymbol{\rho} \|_\infty and θ\|\boldsymbol{\theta} \|_\infty (ratio of largest model parameters at a given iteration).

Here, where we have a restricted problem of two variables and one datum, we set this to the target

(ω=ρ/θ\omega = \rho^*/\theta^*).

The scaled regularization becomes

W^P=diag([1ω](1/2))WP  .\begin{split} \mathbf{\hat W}_P &= diag\left( \begin{bmatrix} 1 & \omega \end{bmatrix}^{(1/2)}\right) \mathbf{W}_P\;. \end{split}

The new scaled objective function becomes

ϕ(mP)=F[mP]dobs22+βPW^PmP22  .\begin{split} \phi(\mathbf{m}_P) = \| \mathbb{F}[\mathbf{m}_P] - \mathbf{d}^{obs} \|_2^2 + \beta_P \| \mathbf{\hat W}_P \mathbf{m}_P \|_2^2 \;. \end{split}

Minimizing this function we get the model (blue) presented in Figure 5 (mP  [ρ=0.47,θ=0.74]\mathbf{m}_P\;[\rho=0.47,\: \theta=0.74]). Converted to Cartesian space, this solution mCP[x=0.35,y=0.32]\mathbf{m}_C^P[x=0.35,\: y=0.32] closely matches mC\mathbf{m}^*_C.

2.1.1Scaled MVI-S algorithm

Now that we have demonstrated the benefit of an iterative sensitivity re-weighting of the regularization, we re-visit our synthetic magnetic problem.

We invert the synthetic data once more with the starting magnetization orientated at 9090^\circ from the true model m0(ρ=102,  θ=45,  ϕ=0)\mathbf{m}_0( \rho=10^{-2},\; \theta=-45^\circ,\;\phi=0^\circ).

The recovered magnetization obtained after convergence of the scaled MVIS-S algorithm with smooth assumptions is presented in Figure 6a. We note close similarities with the MVI-C solution presented in 3a, with the bulk magnetization centered around the position of the block. The inversion took 15 iterations to converge to this solution. This is a clear improvement over the model previously shown in Figure 4b.

(a) Vertical section through the recovered magnetization vector model in spherical coordinates using sensitivity based weighting and (b) the corresponding data residual map. The same poor starting model shown in Figure %sa was used, but the algorithm converged to a solution similar to the Cartesian solution.

Figure 6:(a) Vertical section through the recovered magnetization vector model in spherical coordinates using sensitivity based weighting and (b) the corresponding data residual map. The same poor starting model shown in Figure 4a was used, but the algorithm converged to a solution similar to the Cartesian solution.

From a practical standpoint, we have found that it is more efficient to initialize the MVI-S algorithm with the Cartesian solution. The linear MVI-C approach allows us to rapidly find a model that fits the observed data and it provides a good starting point for the MVI-S formulation. We invert the data once more using the smooth Cartesian solution as a starting model. Figure 7a shows the recovered solution obtained after only three iterations of MVI-S. The solution closely resembles the starting Cartesian model. We note, however, that there is some correlated signal in the residual misfit map in Figure 7c.

Having achieved a stable and reasonable solution with the 2\ell_2-norm, we can now apply sparse norms to recover a block with a coherent magnetization direction.

We vary the regularization measures on the amplitude, derivatives of amplitude and derivatives of angles uniformly such that (pisp_{i_s}, pixp_{i_x}, piyp_{i_y} piz=0p_{i_z} = 0).

Figure 7b presents a section through the magnetic vector model. The shape of the anomalous body matches the magnetic block and the magnetization direction is uniform and orientated at 4545^\circ inclination. Also, the residual data map in Figure 7d shows almost no correlated signal. In previous inversions we showed that the data could be fit within the global tolerance ϕdϕd\phi_d \leq \phi_d^*, but some of the important signal could not be replicated due to the smooth regularization. We have managed to better replicate the fields of a compact source by using the appropriate sparse and blocky assumptions. This result increases our confidence in our ability to accurately recover the magnetization of geological bodies in 3D, as long as the regularization function is adaptable to the geological settings.

Vertical section through the recovered magnetization vector model in spherical coordinates using sensitivity based weighting and the smooth Cartesian solution as starting model: (a) smooth regularization (p_s=p_x=p_y=p_z = 2) and (b) sparse norms on the strength and angles of the magnetization vector (p_s=p_x=p_y=p_z = 0). By using the sparse norms, we recover a solution that closely resembles the true model and, as observed in (d), we also eliminate the coherent signal in the residual data that is observed in (c).

Figure 7:Vertical section through the recovered magnetization vector model in spherical coordinates using sensitivity based weighting and the smooth Cartesian solution as starting model: (a) smooth regularization (ps=px=py=pz=2p_s=p_x=p_y=p_z = 2) and (b) sparse norms on the strength and angles of the magnetization vector (ps=px=py=pz=0p_s=p_x=p_y=p_z = 0). By using the sparse norms, we recover a solution that closely resembles the true model and, as observed in (d), we also eliminate the coherent signal in the residual data that is observed in (c).

3Case Study: Kevitsa Ni-Cu-PGE deposit

We demonstrate the capabilities of our inversion strategy with an airborne magnetic survey acquired over the Kevitsa Ni-Cu-PGE deposit, Northern Finland.

The deposit was discovered in the mid-1980s through exploration programs sponsored by the Geological Survey of Finland.

The geology of the deposit has been studied extensively over the past three decades by using surface mapping and borehole logging.

Figure 8a presents a simplified geological map of the Kevitsa-Satovaara intrusive complex, adapted from Koivisto et al. (2015).

(a) Geological map of the Kevitsa-Satovaara intrusive complex adapted from , with geological definition provided in Table 1. Mapped faults (dash) are shown for reference. (b) 2D seismic line reflection line E5 with interpreted geological contacts between the main reflectors.

Figure 8:(a) Geological map of the Kevitsa-Satovaara intrusive complex adapted from Koivisto et al. (2015), with geological definition provided in Table 1. Mapped faults (dash) are shown for reference. (b) 2D seismic line reflection line E5 with interpreted geological contacts between the main reflectors.

CodeDescriptionSusceptibility
[OVB]0.25cm0.25cmOVBOverburdenMedium
[IGB]0.25cm0.25cmIGB
IGB: Gabbro
IPG: Pegmatite
IGBO: Olivine gabbro
IDI: Diorite
IDB: Diabase
IMO: Intrusive (mafic)
IGBM: Magnetite gabbro
Medium
High
[UPX]0.25cm0.25cmUPX
UPXO: Olivine peroxinite
UOO: Ultramafic (Undiff.)
MPE: Metaperidotite
UWB: Websterite
Medium
[UDU]0.25cm0.25cmUDU
UDU: Dunite
UPE: Peridotite
High
[UKO]0.25cm0.25cmUKOKomatiiteHigh
[BXH]0.25cm0.25cmBXH
BXO (undiff.)
BXHC: Hydrothermal (crackle)
Medium
[SOO]0.25cm0.25cmSOO
MAB: Albitite
MAM: Amphibolite
MQZ: Quartzite
MSCSD: Schist
Low
[MPHB]0.25cm0.25cmMPH
MPH: Phyllite
MPHB: Black Phyllite
MSCBK: Black Schist
MHF: Hornfels
Low
[VMO]0.25cm0.25cmMVS
VMO: Volcanic Mafic
VBA: Basalt
VTUM: Volcanic tuff
Low
[VOO]0.25cm0.25cmVIO
VIO: Volcanic Intermediate
VTUI: Volcanic tuff
VAN: Andesite
VOO: Volcanic (undiff.)
Medium

The Ni-Cu-PGE mineralization is hosted in a funnel shaped ultramafic olivine pyroxenite (UPXO) unit, bordered to the south-west by a gabbro (IGB) unit. The intrusion is hosted in a layered sequence of mafic volcanic (MVS) to intermediate volcanic (VIO) and carbonaceous phyllites (MPH) units. This sequence is interbedded with discontinuous layers of arkose (ARK), arenite (ARN), and felsic volcanic (FVS) units.

It is believed that the disseminated sulphide mineralization within the UPXO unit may have precipitated from the dissolution of Proterozoic Ni-Cu-PGE rich MPH units and sulfur rich evaporates Mutanen (1997). From seismic reflection surveys and borehole data, Koivisto et al. (2015) identified geological contacts at a depth >1>1 km that defines the base of the intrusion (Figure 8b).

A large dunite (UDU) block is located in the center of the intrusion. Thick lenses of dunite below the base of the Kevitsa intrusion were identified, as well as a vertical unit between the IGB and UPXO units. While the extent and geometry of these UDU units are not well understood, we expect the dunite be serpentinized and hence highly magnetic.

Table 3 provides a relative ranking of expected magnetization strength based upon the compilation of 105,000 susceptibility readings made on cores from 279 boreholes. Figure 9 summarizes the susceptibility measurements grouped by lithologies.

Hole IDInterval (m)κ (SI)Inc. ()Q
KV2970-52.90.034-42.4[2,10][2, 10]
KV20029.90.038-50.95.4
Whisker plot of magnetic susceptibility measured along 279 boreholes. The coloured boxes have a width scaled by the calculated standard deviation and centered on the mean value for all intercepts belonging to the same lithological classification, as defined in Table 1. The black lines on either side define the minimum and maximum values. The different lithologies are colour coded and grouped based on relative age and similarities in physical properties.

Figure 9:Whisker plot of magnetic susceptibility measured along 279 boreholes. The coloured boxes have a width scaled by the calculated standard deviation and centered on the mean value for all intercepts belonging to the same lithological classification, as defined in Table 1. The black lines on either side define the minimum and maximum values. The different lithologies are colour coded and grouped based on relative age and similarities in physical properties.

The deposit is interesting from a geophysical perspective due to the large number of data acquired and made available to researchers: borehole petrophysical measurements, direct-current resistivity, magnetotelluric, ground gravity and magnetic data as well as two airborne time-domain EM surveys (VTEM 2009, SkyTEM 2010).

In this study, we focus our efforts on the magnetic data collected during the 2009 VTEM survey, presented in Figure 10. The inducing field parameters at the time of acquisition were B0  [A:52,800nT,I:77.5,  D:12.2]B_0\;[A:\:52,800\: \text{nT}, I:\:77.5^\circ, \;D:\:12.2^\circ].

Observed TMI data over the Kevitsa intrusion with histogram equalized color scale. Geological contacts (black), faults (dash) identified from surface mapping and the 2D seismic line locations E5, are shown for reference. Sun shading from east is added to highlight subtle features (Azimuth: 270^\circ, Dip: 45^\circ).

Figure 10:Observed TMI data over the Kevitsa intrusion with histogram equalized color scale. Geological contacts (black), faults (dash) identified from surface mapping and the 2D seismic line locations E5, are shown for reference. Sun shading from east is added to highlight subtle features (Azimuth: 270270^\circ, Dip: 4545^\circ).

Table 3:Summary table grouping the various lithological units logged from boreholes. Expected magnetic susceptibility contrasts are derived from Figure 9.

From visual inspection, we note some obvious connections between the observed TMI data and the surface geology:

  • Strong magnetic signal correlated with the UKO, UPXO and hydrothermal BXH

  • Moderate response from the MVS and VIO

  • Weak fields over most of the MPH units

  • Large negative anomaly within the UDU and near the southern edge of UPXO

The strong negative field observed over the dunite unit within the UPXO is of particular interest for this study since it is likely related to remanent magnetization.

Analysis of core samples indicates large Koenigsberger ratios and reversed magnetization direction in the UDU unit as summarized in Table %s Montonen (2012,  p. 47). It is important to note that large Koenigsberger ratios were also measured in the lower UPXO unit, although susceptibility values remained small. In the absence of oriented core, no magnetic declinations were provided. From forward modeling of magnetized sheets, Montonen (2012) estimated that a magnetized unit with effective susceptibility 0.82 SI and orientated [I=42.5,D=240][I=-42.5^\circ, D=240^\circ] could be responsible for the observed negative magnetic anomaly.

3.1Magnetic susceptibility model

As a first pass, we invert the TMI data for a smooth susceptibility model (ps=px=py=pz=2p_s=p_x=p_y=p_z=2) and ignore the effects of remanence.

To invert this large dataset we resort to a tiled-Octree mesh decoupling strategy Haber & Schwartzbach (2014). Sensitivity calculations are performed on nested sub-meshes to reduce the memory footprint required for the forward calculations. The dense sensitivity matrices are stored on disk in zarr file format and accessed in parallel using the open-source Dask library Dask (2016). The combination of both the forward mesh decoupling and lazy-loading of sensitivities allows us to run large problems on a desktop computer without the need for compression. The inversion algorithm is written under the open-source SimPEG package in Python Cockett et al. (2015). More details regarding the algorithm are provided in Fournier (n.d.)(p.13). The full inversion domain comprises over 500,000 cells and 17,000 data points.

From sections through the recovered susceptibility model, presented in Figure 11a and 11b, we note discrepancies with the known geology:

  • In plan view, the arc shaped anomaly, SW of the deposit, is recovered outside the mapped UKO unit.

  • Along the E5 seismic section, no susceptibility anomaly is recovered over the central dunite. This directly contradicts the core sample measurements made by Montonen (2012).

  • The shape and extent of the large anomaly correlates poorly with the UPXO unit interpreted by Koivisto et al. (2015) from seismic reflectors.

It is also important to note the large correlated residuals shown in Figure 11c. The inversion had difficulty finding a strictly positive susceptibility model that could account for both the large positive and negative fields. This is a good indicator that the magnetic response observed at Kevitsa cannot solely be attributed to an induced magnetization alone.

(a) Horizontal and (b) vertical sections through the recovered susceptibility model that ignores the effect of remanence. Lithological contacts (black) identified by  are shown for reference. (c) The residual map shows strong correlation with the negative magnetic data.

Figure 11:(a) Horizontal and (b) vertical sections through the recovered susceptibility model that ignores the effect of remanence. Lithological contacts (black) identified by Koivisto et al. (2015) are shown for reference. (c) The residual map shows strong correlation with the negative magnetic data.

3.2Magnetization vector model

To address the issues posed by remanence, we proceed with the MVI-S algorithm.

We perform a series of nine inversions with varying sparsity measures to assess the variability in the magnetization model. Starting from a common 2\ell_2-norm MVI-C model, we sequentially vary the combination of norms applied to the amplitude and its derivatives for (pρs,pρxyz[0,2]p_{\rho_s}, p_{\rho_{xyz}} \in [0, 2]). In all cases, we fix the p\ell_p-norm regularization on the derivatives of the angles (pθxyz,  pϕxyz=0p_{\theta_{xyz}},\;p_{\phi_{xyz}} =0) to promote coherent magnetization orientations.

Horizontal and vertical sections through the recovered nine magnetization models are shown in Figures 12 and 13, respectively.

Horizontal sections at \approx 300 m below topography for a suite of models using various sparsity assumptions put on the amplitude of magnetization for p_s\;, p_{x,y,z} \in [0,\: 2]. Norm measure on the magnetization angle are fixed to p_{x,y,z}=0 in order to promote uniform magnetization direction.

Figure 12:Horizontal sections at 300 m below topography for a suite of models using various sparsity assumptions put on the amplitude of magnetization for ps  ,px,y,z[0,2]p_s\;, p_{x,y,z} \in [0,\: 2]. Norm measure on the magnetization angle are fixed to px,y,z=0p_{x,y,z}=0 in order to promote uniform magnetization direction.

Vertical sections along the E5 seismic reflection line for a suite of models using various sparsity assumptions put on the amplitude of magnetization for p_s\;, p_{x,y,z} \in [0,\: 2]. Norm measure on the magnetization angle are fixed to p_{x,y,z}=0 in order to promote uniform magnetization direction.

Figure 13:Vertical sections along the E5 seismic reflection line for a suite of models using various sparsity assumptions put on the amplitude of magnetization for ps  ,px,y,z[0,2]p_s\;, p_{x,y,z} \in [0,\: 2]. Norm measure on the magnetization angle are fixed to px,y,z=0p_{x,y,z}=0 in order to promote uniform magnetization direction.

To simplify the analysis, we superimpose the 90th90^{th} percentile iso-value of amplitude for each of the nine models (Figure 13). We calculate an average magnetization direction (white) and standard deviation on the angle (red). We observe the following:

  • Parts of the central dunite (UDU) unit appear to be reversely magnetized [κe=0.09  SI,  I=52±15,  D=246±5][\kappa_{e} = 0.09\;SI, \;I=-52^\circ \pm 15^\circ, \;D=246^\circ \pm 5^\circ]. The recovered inclination in the model cells nearest to the published results from Montonen (2012) agree well.

  • A tabular magnetic anomaly between the IGB and UPX unit, likely related to the central dunite unit, appears to be plunging towards SE, potentially extending below the UPX unit as hypothesized by Koivisto et al. (2015).

  • Strong magnetization recovered along the outershell of the UPX ultramafic intrusion appears to be pointing normal to its base.

  • Similar radial outward magnetization recovered along the arc-shaped UKO unit, east of the Kevitsa deposit.

The last two remarks are interesting for a few reasons. First, strong magnetization near the base of the ultramafic supports the presence of magnetic UDU units below the intrusion.

Second, the orientation of magnetization pointing radially outward may be indicative of past tectonic deformation.

Under the assumption that the remanent magnetization component had been fairly uniform within the layered UDU, UKO and UPX unit at the time of formation, then the current radiating magnetization pattern would be explained by subsequent folding of the units.

If this is the case, than it would be one of the most complex geological scenarios for which magnetic data inversions have been used to infer tectonic deformation.

While our modeling of the central dunite unit agrees with published laboratory measurements, the cause for this reverse magnetization direction remains unclear. No other lithological units appear to share this orientation. Re-magnetization after emplacement of the ultramafic intrusion is unlikely as a similar reversed polarity pattern would also be expected elsewhere at Kevitsa. We speculate that the dunite block could be related to the lower UDU unit, which would have been folded to its current sub-vertical location.

4Conclusion

In this study, we introduced an iterative sensitivity re-weighting strategy to improve the convergence of the non-linear MVI-S formulation.

The iterative re-scaling of the regularization function associated with the amplitude and angles of magnetization was crucial in order to achieve stable convergence of the algorithm.

Smoother and more robust solutions allowed us to apply compact norms on the three model parameters independently, which greatly simplified the solution over the conventional MVI-C formulation. Despite this improvement, the MVI problem remains largely under-determined. Incorporating a  prioria \; priori information, either through model constraints or joint physical properties, remains important to accurately represent the geology.

We demonstrated the capability of the newly accessible MVI-S formulation on an airborne magnetic survey collected over the Kevitsa Ni-Cu-PGE deposit. The recovered effective susceptibility and magnetization model confirmed that the central dunite unit was associated with strong reversed magnetization oriented roughly [κe=0.09\kappa_{e} = 0.09 SI, I=52±15I=-52^\circ \pm 15^\circ, D=246±5D=246^\circ \pm 5^\circ].

Potentially the most significant outcome of this case study is the recovered sparse magnetization pointing normal to the base of the Kevitsa olivine-pyroxenite unit. If confirmed by laboratory measurements, this result would be one of the few paleomagnetic interpretations carried over folded geology that is based on the inversion of airborne magnetic data.

Acknowledgments

The authors would like to thank Markku Montonen at New Boliden for given us access to the geophysical data. Special thanks to Dr. Emilia Koivisto at the University of Helsinki for providing the 3D geological surfaces and interpretation and Dr. Frank Santaguida for providing the background information and putting us in contact with key people. Last but not least, we are grateful to the open-source community, in particular, those that contributed to the Dask and SimPEG packages.

References
  1. Kissel, C., & Laj, C. (1989). Paleomagnetic rotation and continental deformation (NATO ASI Series, Ed.; Vol. 254). Kluwer Academic Publishers.
  2. Pueyo, E. L., Cifelli, F., Sussman, A. J., & Oliva-Urcia, B. (2016). Introduction: Palaeomagnetism in fold and Thrust belts: New perspective. Geological Society, London, 425(1), 1–6.
  3. Li, Y., Melo, A., Martinez, C., & Sun, J. (2019). Geology differentiation: A new frontier in quantitative geophysical interpretation in mineral exploration. The Leading Edge, 38(1), 60–66.
  4. Henkel, H. (1991). Petrophysical properties (density and magnetization) of rocks from the northern part of the Baltic Shield. Tectonophysics, 192, 1–19.
  5. Enkin, R. J. (2014). The rock physical property database of British Columbia, and the distinct petrophysical signature of the Chilcotin basalts. Canadian Journal of Earth Sciences, 51, 327–338.