Introduction
In this post I go over the basics of index notation for calculus. This is the summation convention introduced by Einstein, known in the machine learning community through `einsum`. It is a convenient way to suppress summation signs in formulas: any repeated index is understood to be summed over. In tensor calculus, and in fluid dynamics in particular, this notation comes in handy when deriving complex formulas involving $\nabla, \nabla \cdot, \nabla^2$.
In this post, I follow the standard physics convention (rather than the machine learning one) and denote a generic vector as $\underline{x}$, a scalar as $x$, and a generic tensor as $\underline{\underline{X}}$. Note that tensors generalize the matrices of linear algebra, as they can carry more than two indices (e.g., the alternating tensor $\epsilon_{ijk}$ defined below has three indices $i,j,k$, whereas the stress tensor in fluid mechanics is a rank-2 tensor with two indices).
Elementary Symbols
Two symbols are central to index notation:
The Kronecker Delta is a rank-2 symmetric tensor defined as:
$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$
The alternating unit tensor is a rank-3 antisymmetric tensor defined as: $$ \epsilon_{ijk} = \begin{cases} 1 & \text{if } ijk = 123, 231, \text{ or } 312 \\ 0 & \text{if any two indices are the same} \\ -1 & \text{if }ijk = 132, 213, \text{ or } 321 \end{cases} $$
The delta-epsilon identity
$$\begin{align} \epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{jm}\delta_{kl} \end{align}$$
Also the conversion of indices by Kronecker Delta: $\delta_{ij} x_j = x_i$.
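The two symbols and the delta-epsilon identity can be checked numerically. The following is a minimal sketch using NumPy's `einsum`; the variable names are my own, and indices run over 0, 1, 2 rather than 1, 2, 3.

```python
import numpy as np

# Kronecker delta: the 3x3 identity matrix.
delta = np.eye(3)

# Alternating tensor: +1 on even permutations, -1 on odd ones, 0 elsewhere.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k] = 1.0    # even permutation
    eps[i, k, j] = -1.0   # swapping the last two indices flips the sign

# Delta-epsilon identity: eps_ijk eps_ilm = delta_jl delta_km - delta_jm delta_kl
lhs = np.einsum('ijk,ilm->jklm', eps, eps)
rhs = (np.einsum('jl,km->jklm', delta, delta)
       - np.einsum('jm,kl->jklm', delta, delta))
identity_holds = np.allclose(lhs, rhs)

# Index conversion: delta_ij x_j = x_i
x = np.array([1.0, 2.0, 3.0])
relabelled = np.einsum('ij,j->i', delta, x)
```

Here `identity_holds` compares all 81 components of the rank-4 tensors on each side at once, which is often less error-prone than checking individual index combinations by hand.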
Tensor Arithmetic
This part is similar to what we in the machine learning community are familiar with:
- Element-wise addition, multiplication, subtraction, division.
- vector inner and outer product, vector-matrix multiplication.
- Tensor-Tensor multiplication (also known as tensor product), with special case of matrix multiplication.
$$\begin{aligned} & b \underline{a} = \underline{c} \equiv b a_i = c_i && (\text{scalar-vector product})\\ & \underline{a}\cdot \underline{b} = c \equiv a_i b_i = b_i a_i = c && (\text{vector inner product})\\ & \underline{a} \otimes \underline{b} = \underline{\underline{c}} \equiv a_i b_j = c_{ij} && (\text{vector outer product})\\ & \underline{\underline{a}} \cdot \underline{b} = \underline{c} \equiv a_{ij} b_j = c_i && (\text{matrix-vector product})\\ & \underline{\underline{a}} \cdot \underline{\underline{b}} = \underline{\underline{c}} \equiv a_{ij}b_{jk} = c_{ik} && (\text{matrix-matrix product})\\ & \underline{\underline{a}} : \underline{\underline{b}} = c \equiv a_{ij} b_{ji} = c && (\text{tensor-tensor inner product})\\ & \underline{\underline{a}} \otimes \underline{\underline{b}} = \underline{\underline{c}} \equiv a_{ij} b_{kl} = c_{ijkl} && (\text{general tensor product})\\ & \underline{a} \times \underline{b} = \underline{c} \equiv \epsilon_{ijk} a_j b_k = c_i && (\text{vector cross product}) \end{aligned}$$
The tensor product scenario is the most complicated one, with a rich literature in tensor analysis (we are only listing the 3D case), as general tensors are defined mathematically as multilinear maps. The last one is used to compute the vorticity of a fluid, and has the familiar form: $(a_1, a_2, a_3) \times (b_1, b_2, b_3) = (a_2b_3 - a_3b_2, a_3b_1-a_1b_3, a_1b_2-a_2b_1)$.
The trace operator can also be generalized to be:
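Each row of the table above maps almost character for character onto an `einsum` subscript string. The sketch below is illustrative only: the random inputs and variable names are my own, and the cross product is written as $c_i = \epsilon_{ijk} a_j b_k$.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=3), rng.normal(size=3)
A, B = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))

# Alternating tensor, needed for the cross product.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

inner = np.einsum('i,i->', a, b)             # a_i b_i = c
outer = np.einsum('i,j->ij', a, b)           # a_i b_j = c_ij
matvec = np.einsum('ij,j->i', A, b)          # a_ij b_j = c_i
matmul = np.einsum('ij,jk->ik', A, B)        # a_ij b_jk = c_ik
double_dot = np.einsum('ij,ji->', A, B)      # a_ij b_ji = c
tensor_prod = np.einsum('ij,kl->ijkl', A, B) # a_ij b_kl = c_ijkl
cross = np.einsum('ijk,j,k->i', eps, a, b)   # eps_ijk a_j b_k = c_i
```

Indices that appear on the left of `->` but not on the right are summed over, which is exactly the repeated-index rule of the notation.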
$$ tr(\underline{\underline{A}}) = b \equiv a_{ii\cdots i} = b $$
Calculus
This part talks about the index notation for derivative-based operators: $\nabla$ (grad), $\nabla \cdot$ (divergence), $\nabla \times$ (curl) and the caveats of ordering. Let $\phi(\underline{x}, t)$ be a scalar field, such as the one described by a PDE, and $\underline{a}(\underline{x}, t)$ be a vector field, then:
$$\begin{aligned} & \nabla \phi \equiv \partial_i \phi = b_i && (\text{spatial derivative (gradient) of scalar field})\\ & \nabla \underline{a} \equiv \partial_j a_i = c_{ij} && (\text{Jacobian (matrix) of vector field})\\ & \nabla \cdot \underline{a} \equiv \partial_i a_i = b && (\text{divergence of vector field})\\ & \nabla \times \underline{a} \equiv \epsilon_{ijk} \partial_j a_k = b_i && (\text{curl of vector field})\\ & \nabla^2 \underline{a} = \nabla \cdot (\nabla \underline{a}) \equiv \partial_i \partial_i a_j = b_j && (\text{Laplacian of vector field}) \end{aligned}$$
The caveat with vector calculus in index notation is that ordering matters: $$\begin{aligned} \underline{a} \cdot \nabla = a_i \partial_i = \sum_{i=1}^d a_i \frac{\partial}{\partial x_i} \end{aligned}$$ is an operator that still has to act on something, whereas $\nabla \cdot \underline{a} = \partial_i a_i$ is a scalar field.
Furthermore, suppose that $\underline{\underline{z}}(\underline{x},t)$ is a tensor-valued function, then we can similarly define the differential operators using the index notation.
First we talk about $\underline{\underline{z}}(\underline{x},t)$ as a rank-2 tensor (a matrix):
$$\begin{aligned} & \nabla \underline{\underline{z}} \equiv \partial_k z_{ij} = f_{ijk} && (\text{gradient of matrix is rank-3 tensor})\\ & \nabla \cdot \underline{\underline{z}}(\underline{x},t) \equiv \partial_i z_{ij} = g_j && (\text{divergence of matrix is a vector})\\ & \nabla \times \underline{\underline{z}}(\underline{x},t) \equiv \epsilon_{ijk} \partial_j z_{kl} = h_{il} && (\text{curl of matrix is a matrix}) \end{aligned}$$
We can generalize the result to a rank-$n$ tensor, for $n = 1,2,\cdots, N$:
- $\nabla$ increments the rank of the tensor (e.g., grad of rank-$n$ tensor is a rank-($n+1$) tensor)
- $\nabla \cdot$ decrements the rank of the tensor (e.g., divergence of a rank-$n$ tensor is rank-($n-1$) tensor).
- $\nabla \times$ doesn’t change the rank of the tensor.
Usually, when working with differential operators applied to tensors, it is easier to fix an index beforehand, e.g., for $\nabla \cdot (\underline{x} \otimes \underline{x})$, we can work on its $i$-th component:
Example 1: divergence of a vector outer product.
$$\begin{aligned} (\nabla \cdot (\underline{x} \otimes \underline{x}))_i &= \frac{\partial}{\partial x_j} (x_i x_j)\\ &= \frac{\partial x_i}{\partial x_j} x_j + \frac{\partial x_j}{\partial x_j} x_i && (\text{product rule})\\ &= \delta_{ij} x_j + \delta_{jj} x_i && (\text{since } \frac{\partial x_i}{\partial x_j} = \delta_{ij})\\ &= x_i + d x_i = (d+1)x_i && (\delta_{jj} = d \text{, the dimension of } \underline{x}) \end{aligned}$$
In vector notation, this becomes $\nabla \cdot (\underline{x} \otimes \underline{x}) = (d+1)\underline{x}$, which is $4\underline{x}$ for $d = 3$.
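The result of Example 1 can be sanity-checked with central finite differences. This is a rough sketch rather than a proof: the helper name `div_outer`, the step size `h`, and the test point are my own choices.

```python
import numpy as np

def div_outer(x, h=1e-5):
    """Approximate d/dx_j (x_i x_j), summed over j, by central differences."""
    d = x.size
    result = np.zeros(d)
    for j in range(d):
        step = np.zeros(d)
        step[j] = h
        plus = np.outer(x + step, x + step)
        minus = np.outer(x - step, x - step)
        # Column j of the outer product holds x_i x_j for every i.
        result += (plus[:, j] - minus[:, j]) / (2 * h)
    return result

x = np.array([0.5, -1.0, 2.0])
approx = div_outer(x)
exact = (x.size + 1) * x   # (d + 1) x_i, as derived above
```

Because $x_i x_j$ is quadratic in $\underline{x}$, central differences reproduce the derivative up to rounding error, so `approx` and `exact` should agree closely in any dimension $d$.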
Example 2: Divergence of curl for a $C^2$ vector field $\underline{x}$
$$\begin{aligned} (\nabla \cdot (\nabla \times \underline{x})) &\equiv \partial_i (\epsilon_{ijk} \partial_j x_k)\\ &= \epsilon_{ijk} \partial_i \partial_j x_k\\ &= \epsilon_{ijk} \partial_j \partial_i x_k && (\text{partial derivatives commute for } C^2 \text{ fields})\\ &= -\epsilon_{jik} \partial_j \partial_i x_k && (\text{antisymmetry of } \epsilon)\\ &= -\epsilon_{ijk} \partial_i \partial_j x_k && (\text{relabelling the dummy indices } i \leftrightarrow j) \end{aligned}$$
The expression equals its own negative, so it must vanish. This shows that $\nabla \cdot (\nabla \times \underline{x}) = 0$, i.e., the divergence of the curl of a $C^2$ vector field is zero.
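The argument in Example 2 only uses two facts: the alternating tensor is antisymmetric in its first two indices, and the second derivatives are symmetric in those same indices. A small sketch of that contraction follows, where the random array `H` is a hypothetical stand-in for $\partial_i \partial_j x_k$ and not derived from any actual field.

```python
import numpy as np

# Alternating tensor.
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[i, k, j] = 1.0, -1.0

rng = np.random.default_rng(1)
T = rng.normal(size=(3, 3, 3))

# H[i, j, k] plays the role of d_i d_j x_k: symmetrise in the first two indices.
H = 0.5 * (T + T.transpose(1, 0, 2))

# Full contraction eps_ijk H_ijk, the index form of div(curl x).
div_curl = np.einsum('ijk,ijk->', eps, H)
```

The value of `div_curl` is zero up to rounding for any such `H`, which mirrors the fact that the identity holds for every $C^2$ field.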
Notations specific to Fluid Dynamics
$(\mathbf{u}\cdot \nabla)f$
The parenthesis-grad notation applies to a scalar quantity of interest $f(x_1,x_2,x_3,t)$ on spacetime. For the velocity field $\mathbf{u} = (dx_1/dt, dx_2/dt, dx_3/dt) = (u_1, u_2, u_3)$, we have: $$\begin{aligned} (\mathbf{u}\cdot \nabla) f \equiv u_i \partial_i f = \sum_{i=1}^3 u_i \frac{\partial f}{\partial x_i} \end{aligned}$$
and the material derivative is:
$$\begin{aligned} \frac{Df}{Dt} = \frac{\partial f}{\partial t} + (\mathbf{u}\cdot \nabla) f = \frac{\partial f}{\partial t} + (u_i \partial_i) f \end{aligned}$$
This measures the rate of change of the quantity following the fluid flow, along the path of a fluid element. An important scenario in this case is Bernoulli's equation for irrotational flow, where $f$ is the scalar field $H = \frac{p}{\rho} + \frac{1}{2}\mathbf{u}^2 + \chi$; e.g., the notation $(\mathbf{u} \cdot \nabla) f$ appears when we study conserved quantities in steady flow.
$(\mathbf{u}\cdot \nabla)\mathbf{u}$
The parenthesis-grad notation also applies to a vector quantity, such as the nonlinear term in the Navier-Stokes equation:
$$ \begin{aligned} (\mathbf{u}\cdot \nabla)\mathbf{u} \equiv (u_j \partial_j) u_i &= u_j \frac{\partial u_i}{\partial x_j} = (u_1 \frac{\partial}{\partial x} + u_2 \frac{\partial }{\partial y} + u_3 \frac{\partial}{\partial z})(u_1,u_2,u_3)\\ &= (u_1 \frac{\partial u_1}{\partial x} + u_2\frac{\partial u_1}{\partial y} + u_3 \frac{\partial u_1}{\partial z}, u_1\frac{\partial u_2}{\partial x} + u_2\frac{\partial u_2}{\partial y} + u_3 \frac{\partial u_2}{\partial z}, u_1\frac{\partial u_3}{\partial x} + u_2\frac{\partial u_3}{\partial y} + u_3 \frac{\partial u_3}{\partial z}) \end{aligned}$$
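In code, the nonlinear term is a single contraction between the velocity and its Jacobian. The sketch below assumes a linear velocity field $\mathbf{u} = A\mathbf{x}$, chosen only because its Jacobian is the constant matrix $A$; the matrix, the point, and the function names are all illustrative.

```python
import numpy as np

# Hypothetical linear velocity field u(x) = A x (rotation in the xy-plane plus stretching in z).
A = np.array([[0.0, 1.0, 0.0],
              [-1.0, 0.0, 0.0],
              [0.0, 0.0, 0.5]])

def velocity(x):
    return A @ x

def jacobian(x):
    # J[i, j] = du_i/dx_j, which is constant for a linear field.
    return A

x = np.array([1.0, 2.0, 3.0])
u = velocity(x)
J = jacobian(x)

# (u . grad) u in index form: u_j d_j u_i
advection = np.einsum('j,ij->i', u, J)
```

For this field the result coincides with `A @ A @ x`, since $u_j \partial_j u_i = A_{ij} A_{jk} x_k$.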
The index form of Euler’s equation:
$$\begin{aligned} \frac{D \mathbf{u}}{Dt} = -\frac{1}{\rho}\nabla p + \mathbf{g} \equiv \frac{\partial u_i}{\partial t} + u_j \frac{\partial u_i}{\partial x_j} = -\frac{1}{\rho}\frac{\partial p}{\partial x_i} + g_i \end{aligned}$$
which, written in full, are 4 equations (together with the incompressible flow condition $\nabla \cdot \mathbf{u} = 0$): $$\begin{aligned} & \frac{\partial u_1}{\partial t} + u_1 \frac{\partial u_1}{\partial x} + u_2 \frac{\partial u_1}{\partial y} + u_3 \frac{\partial u_1}{\partial z} = -\frac{1}{\rho}\frac{\partial p}{\partial x} \\ & \frac{\partial u_2}{\partial t} + u_1 \frac{\partial u_2}{\partial x} + u_2 \frac{\partial u_2}{\partial y} + u_3 \frac{\partial u_2}{\partial z} = -\frac{1}{\rho}\frac{\partial p}{\partial y} \\ & \frac{\partial u_3}{\partial t} + u_1 \frac{\partial u_3}{\partial x} + u_2 \frac{\partial u_3}{\partial y} + u_3 \frac{\partial u_3}{\partial z} = -\frac{1}{\rho}\frac{\partial p}{\partial z} - g \\ & \frac{\partial u_1}{\partial x} + \frac{\partial u_2}{\partial y} + \frac{\partial u_3}{\partial z} = 0 \end{aligned}$$
Index Form Bernoulli’s Equation
Bernoulli Equation Identity: $(\nabla \times \mathbf{u})\times \mathbf{u} = -\nabla H$. Here $H$ was defined above, and the actual formula to use is the following:
$$\begin{aligned} (\mathbf{u}\cdot \nabla) \mathbf{u} = \frac{1}{2}\nabla (\mathbf{u} \cdot \mathbf{u}) - \mathbf{u}\times (\nabla \times \mathbf{u}) \end{aligned}$$
The index notation for this tricky formula is: $$\begin{aligned} u_j \frac{\partial u_i}{\partial x_j} = \frac{1}{2}\frac{\partial (u_j u_j)}{\partial x_i} -\epsilon_{ijk}\left( u_j \epsilon_{klm} \frac{\partial u_m}{\partial x_l}\right) \end{aligned}$$
As a sanity check, notice that the LHS carries the single free index $i$, and the same goes for every term on the RHS.
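Beyond matching free indices, the formula can be derived from the delta-epsilon identity. A sketch of the derivation, using the shorthand $\partial_i = \partial / \partial x_i$ and the cyclic property $\epsilon_{ijk} = \epsilon_{kij}$:

$$\begin{aligned} \epsilon_{ijk}\, u_j\, \epsilon_{klm}\, \partial_l u_m &= \epsilon_{kij}\epsilon_{klm}\, u_j\, \partial_l u_m && (\text{cyclic permutation of } ijk)\\ &= (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})\, u_j\, \partial_l u_m && (\text{delta-epsilon identity})\\ &= u_j\, \partial_i u_j - u_j\, \partial_j u_i && (\text{contracting the deltas})\\ &= \frac{1}{2}\partial_i (u_j u_j) - u_j\, \partial_j u_i && (\text{product rule in reverse}) \end{aligned}$$

Rearranging gives $u_j \partial_j u_i = \frac{1}{2}\partial_i (u_j u_j) - \epsilon_{ijk} u_j \epsilon_{klm} \partial_l u_m$, which is the $i$-th component of $(\mathbf{u}\cdot \nabla) \mathbf{u} = \frac{1}{2}\nabla (\mathbf{u} \cdot \mathbf{u}) - \mathbf{u}\times (\nabla \times \mathbf{u})$.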
Index Form Vorticity Equation
The Vorticity equation relates $\mathbf{w} = \nabla \times \mathbf{u}$ with $\mathbf{u}$:
$$\begin{aligned} \frac{D\mathbf{w}}{Dt} = (\mathbf{w}\cdot \nabla) \mathbf{u} \equiv \frac{\partial w_i}{\partial t} + u_j \frac{\partial w_i}{\partial x_j} = w_j \frac{\partial u_i}{\partial x_j} \end{aligned}$$
Index Form Navier-Stokes Equation
The Navier-Stokes equation for Incompressible Flow is:
$$\begin{aligned} \frac{D\mathbf{u}}{Dt} = -\frac{1}{\rho}\nabla p + \nu \nabla^2 \mathbf{u} + \mathbf{g} \end{aligned}$$
where $\nu$ is the kinematic viscosity. The index notation form is:
$$\begin{aligned} \frac{\partial u_i}{\partial t} + u_j \frac{\partial u_i}{\partial x_j} = -\frac{1}{\rho} \frac{\partial p}{\partial x_i} + \nu \frac{\partial^2 u_i}{\partial x_j \partial x_j} + g_i \end{aligned}$$
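The viscous term is worth a closer look, because the Laplacian sums over a dummy index that is distinct from the free index $i$, i.e. $\nu\, \partial_j \partial_j u_i$. The sketch below illustrates this with a hypothetical velocity field $\mathbf{u} = (y^2, z^2, x^2)$ and an assumed value of $\nu$; the array `H` stores its (constant) second derivatives.

```python
import numpy as np

nu = 0.1  # assumed viscosity value, for illustration only

# H[i, j, k] = d^2 u_i / (dx_j dx_k) for the hypothetical field u = (y^2, z^2, x^2).
H = np.zeros((3, 3, 3))
H[0, 1, 1] = 2.0   # d^2(y^2)/dy^2
H[1, 2, 2] = 2.0   # d^2(z^2)/dz^2
H[2, 0, 0] = 2.0   # d^2(x^2)/dx^2

# Laplacian of each component: sum over the repeated derivative index j.
laplacian = np.einsum('ijj->i', H)
viscous_term = nu * laplacian

# Differentiating u_i twice with respect to x_i only (no sum) gives a different result.
same_index_only = np.einsum('iii->i', H)
```

For this field `laplacian` is `[2, 2, 2]` while `same_index_only` is `[0, 0, 0]`, which shows why the derivative index in the viscous term must be summed independently of $i$.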