We know that the general form of an ODE system is where
is called the derivative function or the right hand side of the ODE
is called the dependent variable or the states.
is the parameters.
is the independent variable.
In a freshman differential equations course you will learn a bag of tricks to rearrange such equations to analytically solve them, i.e. to find the formula for . However, most of the ODEs are impossible to solve analytically, so in practice, we solve ODEs numerically. This means that we need to come up with some iterative procedure that approximates the function for starting from .
The basic idea behind all numerical ODE solvers is to arrange terms in a particular way to truncate the Taylor series. For instance, let be the step size. The famous explicit Euler method is
Of course, there's a wide array of methods to solve ODEs, but they all originate from the idea of truncating Taylor series.
In engineering, the right hand side of the ODE often doesn't have an explicit formula. For instance, in a circuit, we know that the change of voltage is the current over the capacitance, i.e. . Also, the voltage and current follow the Kirchhoff laws, which form a system of linear equations. Hence, the governing equations of circuits naturally have both differential equations and algebraic equations.
We also want to compose different components together to build a larger model, and connection equations are algebraic relations among the state variables. It's possible to reduce algebraic equations away so that we are only left with explicit differential equations. However, this process is extremely time consuming and error-prone. Also, we lose the composability and the acausalness in the process. Hence, we want to use algorithms to automate the simplification and simulation of differential-algebraic equations (DAEs) to make modeling large scale simulation possible.
Like we hinted before, ODEs with algebraic relations among its states are DAEs. We can rephrase this as "differentiated states and states form an implicit relation together with parameters and the independent variable ." Translating that into a formula gives
where .
Similar to ODEs, we can truncate the Taylor series of to solve the above system. The explicit Euler method becomes
Let's introduce to denote the numerical solution to avoid visual clutter. As we already need to solve the implicit equations to approximate using , we can use the implicit Euler method to gain more stability
Numerically, we use Newton's method to solve potentially nonlinear equations by solving the best approximating linear equations (i.e. the Jacobian of the nonlinear function with respect to the unknowns) iteratively to refine an initial guess. By the chain rule, we have
The Newton iteration is then
where denotes the iteration variable at the -th iteration. Astute readers will notice that the Jacobian in Newton's method is a matrix pencil of the form . We know that Newton's method only makes sense when there exists a (a suitable step size) such that (J is invertible), or in other words, the matrix pencil is regular.
A natural question is: how can we relate DAEs to ODEs? Notice that is 0 for all , so the derivative of with respect to time also gives a new set of equations, so do any higher order time derivatives of , i.e.
We need a way of knowing when differentiating does not add new "information" into the system. First, let's develop a characterization on the variables. Let be the set of the highest order derivative variables, and let contain the rest of the variables. Note that and must be disjoint. Therefore, DAEs can then be written as
Differentiating the above equation gives us
Note that contains all the new terms generated by the differentiation, as it contains variables with higher order derivatives than before. Rearranging terms, we get
When is invertible, we can use old terms to explicitly solve for , so we don't generate genuinely new equations. Therefore, we only add new equations to the system if and only if is singular. It's wasteful to differentiate the entire system until the matrix is invertible; we can differentiate a minimal subset of equations to make fully ranked.
Also, note that the above equation can be converted to an ODE if is invertible. We can characterize DAEs by the number of differentiations needed to convert it into an ODE, and we call this property the index of a DAE. In particular, when the Jacobian matrix is invertible, the system is index 1. Here, we note that the numerical significance of an index 1 DAE system is that, we can initialize and solve the system easily with Newton's method because all the "latent" equations appear in the system, or in other words, is invertible, so all the constraining variables can be solved from the differential variables.

Now, let's translate the theory into a concrete example. The equations of motion of the pendulum system in Cartesian coordinate are
where is the tension from the string, is the gravitational acceleration, is the length of the string, and we assume the mass of the object is 1. Physically, we know that must be uniquely determined from and . However, computing is not obvious given the above equations. Let's use the mathematical tools that we defined earlier to analyze this system.
The highest order derivative terms are . Note that is one of the highest order derivative terms because no derivative of appears in the system. The exact rank of is too difficult to compute because it's time-varying. In practice, we use the sparsity pattern or the structure of the Jacobian to determine its structural rank — a weak version of the numerical rank. We can use a bipartite graph to represent the sparsity pattern. We define the structural rank as the maximum cardinality of the bipartite matching. If we assume that no cancellations among the non-zero entries are possible, then the structural rank is exactly the numerical rank.
The structure of from the pendulum system is
which is clearly singular. Hence, we cannot use Newton's method to compute from given and . Since only the last row of the matrix is problematic, we just need to use differentiations to introduce , or into the last equation to make the Jacobian structurally fully ranked. Differentiating the last equation twice, we get
Now, the structure of is
which has full structural rank. Note that we can simplify the differentiated equation by substituting the differential equations into it, which gives us
The resulting system is
We can use the initial condition of , , , and to uniquely determine the initial condition of the tension , because is invertible. This also means that the resulting system is an index 1 DAE, and the original pendulum system is index 3, since we differentiated the last equation twice. Lastly, as an exercise, we can get the underlying ODE by differentiating the last equation one more time
which implies
Hence, the ODE system is then