A Gentle Guide to Partial Differential Equations
This page is print-friendly. Simply press Ctrl + P (or Command + P if you use a Mac) to print the page and download it as a PDF.
Table of contents
Note: it is highly recommended to navigate by clicking links in the table of contents! That way, you can use your browser's back button to return to any section you were reading, letting you jump back and forth between sections.
This is a short guide/mini-book introducing various topics in partial differential equations, including analytical methods of finding solutions, boundary-value problems, and discussions of widely-known PDEs.
This guide is dedicated to Professor Yuri Lvov of Rensselaer Polytechnic Institute, who teaches the course on which this guide is based, and to whom I am greatly thankful. It is freely sharable and released to the public domain. This guide also closely follows the book Partial Differential Equations, 2nd ed., by Walter A. Strauss, which is highly recommended for following along while reading the guide.
Note: familiarity with vector calculus and ordinary differential equations is assumed. Full-length guides for both are available if a refresher is needed; see the vector calculus guide and the introduction to differential equations.
Introduction to partial differential equations
A partial differential equation (PDE) is an equation that describes a function of several variables in terms of its partial derivatives. For instance, let $u(x_1, x_2, \ldots, x_n)$ be an arbitrary function of several variables; a PDE takes the general form:

$$F\left(x_1, \ldots, x_n,\ u,\ \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_n},\ \frac{\partial^2 u}{\partial x_1^2}, \ldots\right) = 0$$
A few of the most well-known partial differential equations are listed in the table below:
PDE name | Mathematical form |
---|---|
1D heat equation | $\dfrac{\partial u}{\partial t} = k\dfrac{\partial^2 u}{\partial x^2}$ |
1D transport equation | $\dfrac{\partial u}{\partial t} + c\dfrac{\partial u}{\partial x} = 0$ |
1D inviscid Burgers' equation | $\dfrac{\partial u}{\partial t} + u\dfrac{\partial u}{\partial x} = 0$ |
1D viscous Burgers' equation | $\dfrac{\partial u}{\partial t} + u\dfrac{\partial u}{\partial x} = \nu\dfrac{\partial^2 u}{\partial x^2}$ |
1D wave equation | $\dfrac{\partial^2 u}{\partial t^2} = c^2\dfrac{\partial^2 u}{\partial x^2}$ |
Korteweg–De Vries (KdV) equation | $\dfrac{\partial u}{\partial t} + 6u\dfrac{\partial u}{\partial x} + \dfrac{\partial^3 u}{\partial x^3} = 0$ |
2D Laplace's equation | $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} = 0$ |
3D Laplace's equation | $\dfrac{\partial^2 u}{\partial x^2} + \dfrac{\partial^2 u}{\partial y^2} + \dfrac{\partial^2 u}{\partial z^2} = 0$ |
Incompressible Euler equations | $\dfrac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\dfrac{1}{\rho}\nabla p$, $\nabla \cdot \mathbf{u} = 0$ |
We want to solve PDEs because they provide mathematical descriptions which allow us to understand the dynamics of a physical system. The processes of solving and analyzing PDEs are the focus of this guide.
Linearity
A crucial distinction that must be made before attempting any solution of a PDE is whether it is linear or nonlinear. A linear PDE has no terms involving its unknown function multiplied by partial derivatives, and only linear terms involving its unknown function anywhere else. For instance, consider the PDE shown (a heat-type equation with an extra $u$ term, used here as a running example):

$$\frac{\partial u}{\partial t} = k\nabla^2 u + u$$

It may be illuminating to write it in expanded form:

$$\frac{\partial u}{\partial t} = k\left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}\right) + u$$

Notice that the PDE consists of an unknown function $u$. None of the derivatives is multiplied by a term involving $u$. The only time $u$ itself appears on its own is in the standalone $u$ term (which is not multiplied by a derivative, and is linear in form). We therefore say that the PDE is linear. All of the following modifications are similarly linear:

Linear modification | Reason for linearity |
---|---|
$\dfrac{\partial u}{\partial t} = k\nabla^2 u + u + x^2 + t$ | Only terms involving $u$ (the unknown function) matter when analyzing linearity; any terms in $x$, $t$, etc. don't matter |
$\dfrac{\partial u}{\partial t} = kx\,\nabla^2 u + u$ | Same; only terms involving $u$ matter when analyzing linearity, so the factor of $x$ does not change the linearity of the differential equation |
$\dfrac{\partial u}{\partial t} = k\dfrac{\partial^4 u}{\partial x^4} + u$ | It doesn't matter whether the partial derivatives are first derivatives, second derivatives, nth derivatives, etc. |
By contrast, any of the following cases are nonlinear:

Nonlinear modification | Reason for nonlinearity |
---|---|
$\dfrac{\partial u}{\partial t} = k\,u\,\nabla^2 u$ | There is a term in $u$ multiplied by a derivative |
$\dfrac{\partial u}{\partial t} = k\nabla^2 u + u^2$ | There is a squared term in $u$ (the $u^2$ term), which is not a linear term |
$\dfrac{\partial u}{\partial t} = k\,u\,\nabla^2 u + u^2$ | Both a term in $u$ multiplying one of the derivatives and a nonlinear term in $u$ |
$\dfrac{\partial u}{\partial t} = k\left(\dfrac{\partial u}{\partial x}\right)^2$ | Taking powers of derivatives also makes a PDE nonlinear |
Linear differential equations allow us to write a PDE in terms of a linear differential operator, denoted $\mathcal{L}$. For instance, consider the heat equation:

$$\frac{\partial u}{\partial t} = k\frac{\partial^2 u}{\partial x^2}$$

We note that we can rewrite the heat equation as follows:

$$\left[\frac{\partial}{\partial t} - k\frac{\partial^2}{\partial x^2}\right]u = 0$$

The quantity in the brackets on the left-hand side of the equation is the linear operator. If we let:

$$\mathcal{L} = \frac{\partial}{\partial t} - k\frac{\partial^2}{\partial x^2}$$

Then we may write the heat equation as $\mathcal{L}u = 0$. As $\mathcal{L}$ is a linear operator, it has the properties that, for two solutions $u_1$ and $u_2$ and a constant $c$, $\mathcal{L}(u_1 + u_2) = \mathcal{L}u_1 + \mathcal{L}u_2$ and $\mathcal{L}(cu_1) = c\,\mathcal{L}u_1$. Linearity means that any sum of two solutions is also a solution of the PDE, so it is possible to write a general solution as a superposition of individual solutions:

$$u = \sum_i c_i u_i$$
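To make the linearity property concrete, here is a short check, written for the heat operator in the form $\mathcal{L} = \dfrac{\partial}{\partial t} - k\dfrac{\partial^2}{\partial x^2}$ (an assumed standard form, used here only for this sketch):

```latex
\mathcal{L}(c_1 u_1 + c_2 u_2)
  = \frac{\partial}{\partial t}\left(c_1 u_1 + c_2 u_2\right)
    - k \frac{\partial^2}{\partial x^2}\left(c_1 u_1 + c_2 u_2\right)
  = c_1 \left(\frac{\partial u_1}{\partial t} - k \frac{\partial^2 u_1}{\partial x^2}\right)
  + c_2 \left(\frac{\partial u_2}{\partial t} - k \frac{\partial^2 u_2}{\partial x^2}\right)
  = c_1 \mathcal{L} u_1 + c_2 \mathcal{L} u_2
```

In particular, if $\mathcal{L}u_1 = 0$ and $\mathcal{L}u_2 = 0$, then $\mathcal{L}(c_1 u_1 + c_2 u_2) = 0$ as well; this is the superposition principle.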
Homogeneity
Linear PDEs can further be divided into two main types: homogeneous linear PDEs, which take the form $\mathcal{L}u = 0$, and inhomogeneous (also called nonhomogeneous) linear PDEs, which take the form $\mathcal{L}u = g$ for some nonzero function $g$ of the independent variables. That is to say, roughly speaking, if one rearranges a linear PDE such that every term involving the unknown function and its derivatives is moved to the left-hand side, then the right-hand side will be zero for homogeneous PDEs and some nonzero function for inhomogeneous PDEs.
For instance, the following PDE (the heat equation, rearranged) is a homogeneous linear PDE:

$$\frac{\partial u}{\partial t} - k\frac{\partial^2 u}{\partial x^2} = 0$$

Whereas the following PDE is an inhomogeneous linear PDE:

$$\frac{\partial u}{\partial t} - k\frac{\partial^2 u}{\partial x^2} = f(x, t)$$
Note how in both cases, all the terms involving the unknown function and its derivatives have been moved over to the left-hand side of the equation, and the right-hand side of the equation then determines the homogeneity (whether the equation is homogeneous or inhomogeneous).
Solving by direct integration
Without any additional knowledge, we may begin our study of PDEs by examining the direct integration approach, which is applicable to a few very simple PDEs. Consider, for instance, the following PDE:

$$\frac{\partial^2 u}{\partial x^2} = 0$$

If we take the partial integral with respect to $x$ twice, we have:

$$\frac{\partial u}{\partial x} = g(y), \qquad u = f(y) + x\,g(y)$$

Thus our general solution is:

$$u(x, y) = f(y) + x\,g(y)$$

The reason why the general solution contains two functions of $y$, namely $f(y)$ and $g(y)$, is that we are performing partial integration, since this is a partial differential equation. Thus, the constants of integration are actually functions $f$ and $g$, which can be arbitrary functions of $y$. For instance, the following are all valid solutions to our PDE:

$$u = \sin y + x\,e^{y}, \qquad u = y^2 - 5x, \qquad u = 3 + x\cosh y$$
Similarly, consider the PDE:

$$\frac{\partial^2 u}{\partial x\,\partial y} = 0$$

The general solution is given by $u(x, y) = F(x) + G(y)$, which means that all of the following are also solutions:

$$u = x^3 + \sin y, \qquad u = \cos x + e^{y}, \qquad u = 2x + y^2$$
The sheer diversity of solutions (and yes, all of these are valid solutions) means that finding general solutions alone is not very useful when solving PDEs, since the arbitrary functions leave degrees of freedom that can make particular solutions very, very different. Therefore, we usually need to specify boundary conditions, specific mathematical requirements that a PDE's solution must satisfy on a particular domain, to find a unique (and useful) solution.
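The idea of arbitrary functions acting as "constants of integration" can be sanity-checked numerically. The sketch below is our own illustration (the PDE $\partial^2 u/\partial x^2 = 0$ and the functions $f$ and $g$ are assumed choices, not taken from the text); it verifies that $u(x, y) = f(y) + x\,g(y)$ has a vanishing second $x$-derivative for an arbitrary pick of $f$ and $g$:

```python
import math

# Illustrative assumption: the simple PDE u_xx = 0, whose general
# solution is u(x, y) = f(y) + x*g(y) for arbitrary functions f and g.
def f(y):
    return math.sin(y)        # arbitrary function of y

def g(y):
    return y**2 + 1.0         # another arbitrary function of y

def u(x, y):
    return f(y) + x * g(y)

def u_xx(x, y, h=1e-3):
    # central second-difference approximation of the second x-derivative
    return (u(x + h, y) - 2.0 * u(x, y) + u(x - h, y)) / h**2

residual = abs(u_xx(0.7, 1.3))
print(residual < 1e-6)  # the second x-derivative vanishes (up to rounding)
```

Swapping in any other smooth $f$ and $g$ leaves the residual equally negligible, which is exactly the "degrees of freedom" problem described above.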
Boundary conditions
As we have seen, when solving PDEs, one must be careful to recognize that a general solution to a PDE does not uniquely specify the solution to a given physical scenario. A particular solution (which is usually synonymous with unique solution) can only be found if one additionally requires that the solution take a certain value at the boundaries of the PDE's domain.
Let us consider an example. Consider solving Laplace's equation $\nabla^2 u = 0$ on the unit square, i.e. the domain defined by $0 \leq x \leq 1,\ 0 \leq y \leq 1$. We will call this domain $D$. The boundary of the domain would be the perimeter of the unit square; we will call this boundary $\partial D$, for "boundary of $D$".

A boundary condition for finding a unique solution to Laplace's equation could be specifying that:

$$u\big|_{\partial D} = 0$$

This means that $u = 0$ at all points along the boundary (which is the perimeter of the unit square). Using this information, the PDE can be solved exactly, and a (much more useful) unique solution can be found.
Separation of variables
For any PDE more complex than the most basic examples, direct integration no longer suffices. Another technique is necessary to tackle these more complicated PDEs, and this is the method of separation of variables.
To demonstrate this method, let us consider the wave equation:

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}$$

When performing the separation of variables, we first assume that the solution may be written as a product of two functions of a single variable each, which we will call $X(x)$ and $T(t)$. That is:

$$u(x, t) = X(x)\,T(t)$$

In this form, we are able to take the partial derivatives explicitly:

$$\frac{\partial^2 u}{\partial x^2} = X''(x)\,T(t), \qquad \frac{\partial^2 u}{\partial t^2} = X(x)\,T''(t)$$

By substitution of these partial derivatives into the wave equation we have:

$$X(x)\,T''(t) = c^2 X''(x)\,T(t)$$

If we divide both sides by $c^2 X(x)\,T(t)$, we have:

$$\frac{T''(t)}{c^2\,T(t)} = \frac{X''(x)}{X(x)}$$

We now have an expression with only $t$ and derivatives of $T$ on the left-hand side and only $x$ and derivatives of $X$ on the right-hand side. This is only possible if both expressions are equal to an arbitrary constant $-k^2$, called the separation constant (we could just as well choose $k$, $-k$, or $k^2$, but this form simplifies the mathematical analysis later on). So now we have separated the variables and are left with two ordinary differential equations:

$$\frac{T''(t)}{c^2\,T(t)} = -k^2, \qquad \frac{X''(x)}{X(x)} = -k^2$$

These can be written in more traditional form as:

$$T''(t) + c^2 k^2\,T(t) = 0, \qquad X''(x) + k^2\,X(x) = 0$$

Which have the general solutions:

$$X(x) = A\cos kx + B\sin kx, \qquad T(t) = C\cos ckt + D\sin ckt$$

Where $A, B, C, D$ are undetermined constants that can be solved for by applying the boundary conditions. It is common to write $\omega = ck$ (the Greek letter omega, not to be confused with $w$) to simplify the equations:

$$T(t) = C\cos\omega t + D\sin\omega t$$

So the general solution to the wave equation is:

$$u(x, t) = (A\cos kx + B\sin kx)(C\cos\omega t + D\sin\omega t)$$
This can be simplified further using trigonometric identities, and is the end result of our successful separation of variables.
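A separated solution of this product form can be checked numerically. The sketch below (the values of $c$ and $k$ are arbitrary assumed choices) verifies that $u(x, t) = \sin(kx)\cos(\omega t)$, with $\omega = ck$, satisfies the wave equation $u_{tt} = c^2 u_{xx}$ up to finite-difference error:

```python
import math

c, k = 2.0, 3.0   # arbitrary wave speed and wavenumber (assumed values)
w = c * k         # omega = c*k, as in the separation of variables

def u(x, t):
    # one separated solution: X(x) = sin(k x), T(t) = cos(w t)
    return math.sin(k * x) * math.cos(w * t)

def second_diff(fn, z, h=1e-4):
    # central second-difference approximation of fn''(z)
    return (fn(z + h) - 2.0 * fn(z) + fn(z - h)) / h**2

x0, t0 = 0.4, 0.9
u_tt = second_diff(lambda t: u(x0, t), t0)
u_xx = second_diff(lambda x: u(x, t0), x0)
residual = abs(u_tt - c**2 * u_xx)
print(residual < 1e-4)  # u_tt matches c^2 * u_xx up to discretization error
```

The same check passes for any choice of the constants $A, B, C, D$, since each separated factor satisfies its own ODE.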
Note: In many cases, a PDE may be separable in one coordinate system and not separable in another. This is famously the case for the Schrödinger equation, a linear PDE: when its potential term is proportional to $1/r$, as is the case for many atomic systems, the Schrödinger equation is not separable in Cartesian coordinates, but remains separable in spherical coordinates.
Useful calculus identities for PDEs
By nature of partial differentiation, there are several results that are incredibly crucial for the study of PDEs. First, for sufficiently smooth functions, the order of differentiation does not matter. That is to say:

$$\frac{\partial^2 u}{\partial x\,\partial y} = \frac{\partial^2 u}{\partial y\,\partial x}$$

Second, integration and partial differentiation can (in some cases) be order-swapped:

$$\frac{d}{dt}\int_a^b u(x, t)\,dx = \int_a^b \frac{\partial u}{\partial t}\,dx$$

This is known as the Leibniz rule. Note that when applying the Leibniz rule, it is important to recognize that the above rule applies only in the case of definite integrals where $u$ is integrated over constant bounds $x \in [a, b]$. Notice that integrating over bounds in $x$ results in a new function, which we may call $I(t)$, that is purely in terms of $t$, by the Fundamental Theorem of Calculus:

$$I(t) = \int_a^b u(x, t)\,dx$$

Therefore, we write the total derivative $\dfrac{dI}{dt}$ for the left-hand-side integral, as $I$ is only in terms of $t$, whereas we write $\dfrac{\partial u}{\partial t}$ inside the right-hand-side integral, as $u$ is in terms of both $x$ and $t$. In the more general form, where the bounds $a = a(t)$ and $b = b(t)$ are functions of $t$ rather than constants, we have:

$$\frac{d}{dt}\int_{a(t)}^{b(t)} u(x, t)\,dx = \int_{a(t)}^{b(t)} \frac{\partial u}{\partial t}\,dx + u(b(t), t)\,b'(t) - u(a(t), t)\,a'(t)$$
Another very useful relationship used extensively in studying PDEs is the divergence theorem, which relates the volume integral of the divergence of a vector-valued function $\mathbf{F}$ over a volume $V$ to its surface integral across the boundary surface of $V$, written $\partial V$:

$$\int_V \nabla \cdot \mathbf{F}\,dV = \oint_{\partial V} \mathbf{F} \cdot d\mathbf{A}$$

Note that this can also be written with slightly different but mathematically-equivalent notation as:

$$\iiint_V \nabla \cdot \mathbf{F}\,dV = \iint_{\partial V} \mathbf{F} \cdot d\mathbf{A}$$

That is to say, the number of integral signs does not matter; it is a notational choice, and only the integration measures ($dV$ and $d\mathbf{A}$) matter. However, the multiple-integral notation is often used since it is sometimes more illustrative to write a volume integral with triple integral signs to signify that it is computed over a three-dimensional volume, and a surface integral with double integral signs to signify that it is computed over a two-dimensional surface.
From the divergence theorem, it is possible to derive the vanishing theorem: if $\displaystyle\iiint_V f\,dV = 0$ for every subregion $V$ of the domain and $f$ is a continuous function, then $f = 0$ everywhere in the domain.
Solutions to 1st-order linear PDEs
Up to this point, we have discussed two methods of solving PDEs: direct integration and separation of variables. We will now examine a few more ways to solve PDEs of a specific form: first-order linear PDEs.
Solving for homogeneous vs. inhomogeneous PDEs

Before we actually solve a PDE, it is important to first identify whether it is homogeneous, inhomogeneous, or neither. Consider a linear differential operator $\mathcal{L}$, similar to the ones we have already studied. A solution to a homogeneous linear PDE is a function $u$ that satisfies:

$$\mathcal{L}u = 0$$

For an inhomogeneous linear PDE in the form $\mathcal{L}u = g$, the general solution to the PDE is a combination of the general solution $u_h$ to the corresponding homogeneous PDE $\mathcal{L}u = 0$ and a particular solution $u_p$ to the inhomogeneous PDE $\mathcal{L}u = g$. That is:

$$u = u_h + u_p$$
In simpler terms, to solve for the general solution to an inhomogeneous linear PDE $\mathcal{L}u = g$:
- First, you solve for the general solution to its homogeneous version $\mathcal{L}u = 0$, which we'll call $u_h$
- Then, you find any solution to its inhomogeneous version $\mathcal{L}u = g$, which we'll call $u_p$
- Finally, you add $u_h$ and $u_p$ together. This gives you the general solution $u = u_h + u_p$ to the inhomogeneous PDE
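As a small worked illustration of these three steps (an example of our own choosing, not from the text), consider the inhomogeneous first-order PDE $\dfrac{\partial u}{\partial x} = x$ for an unknown function $u(x, y)$:

```latex
\begin{aligned}
&\text{homogeneous version:} && \frac{\partial u}{\partial x} = 0
  \quad\Longrightarrow\quad u_h = f(y) \\
&\text{one particular solution:} && \frac{\partial u_p}{\partial x} = x
  \quad\Longrightarrow\quad u_p = \tfrac{1}{2}x^2 \\
&\text{general solution:} && u = u_h + u_p = f(y) + \tfrac{1}{2}x^2
\end{aligned}
```

Any particular solution works equally well here; choosing a different $u_p$ merely shifts the arbitrary function $f$.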
The method of characteristics
The method of characteristics is a technique to solve first-order linear PDEs. We will first overview the simplest case, where we additionally require that the PDE be homogeneous and have constant coefficients. That is, we consider PDEs similar to the transport equation:

$$a\frac{\partial u}{\partial x} + b\frac{\partial u}{\partial y} = 0$$

Note that this may be cast in an alternative form (this will be important later) given by:

$$\frac{\partial u}{\partial x} + \frac{b}{a}\frac{\partial u}{\partial y} = 0$$

To solve this PDE, we use a geometric argument from vector calculus. Recall that the directional derivative is given by:

$$D_{\mathbf{v}}\,u = \nabla u \cdot \mathbf{v}$$

We can therefore reinterpret the transport PDE as the equation of a directional derivative $D_{\mathbf{v}}\,u$, such that $(a, b)$ are the components of a vector $\mathbf{v}$, and $\nabla u = \left(\dfrac{\partial u}{\partial x}, \dfrac{\partial u}{\partial y}\right)$. Therefore, the transport equation reduces to:

$$\nabla u \cdot \mathbf{v} = 0$$

Notice how this equation is equivalent to saying that the directional derivative of $u$ along the vector $\mathbf{v}$ is zero. This means that $u$ does not change along $\mathbf{v}$, and thus for all points along the direction $\mathbf{v}$, $u$ must be equal to a constant $C$, or more generally, some function of a constant $f(C)$ (because if $C$ is a constant, then $f(C)$ is also a constant).
The curves traced by the collection of these points are known as characteristic curves (sometimes also called integral curves). Each curve would mathematically take the form $y = y(x)$, where $y(x)$ is some function of $x$, and $u$ is constant along each such curve. Therefore, once we can determine the expression for the characteristic curves, we know the solution $u$. Thus the method of characteristics reduces the problem of solving a PDE to the problem of finding the characteristic curves along which $u$ is constant.

But how do we go about finding the characteristic curves? Let us consider moving along a characteristic curve $y = y(x)$. Since $u$ is constant along the characteristic curve, we know that $\dfrac{d}{dx}u(x, y(x)) = 0$. But we also know that we may expand $\dfrac{d}{dx}u(x, y(x))$ using the chain rule to have:

$$\frac{d}{dx}u(x, y(x)) = \frac{\partial u}{\partial x} + \frac{dy}{dx}\frac{\partial u}{\partial y}$$

If we compare this with the alternate form (given previously) of our PDE:

$$\frac{\partial u}{\partial x} + \frac{b}{a}\frac{\partial u}{\partial y} = 0$$

We immediately notice that if $\dfrac{dy}{dx} = \dfrac{b}{a}$, then:

$$\frac{d}{dx}u(x, y(x)) = \frac{\partial u}{\partial x} + \frac{b}{a}\frac{\partial u}{\partial y} = 0$$

Which perfectly matches our PDE! Thus we have reduced the PDE to a problem of finding the characteristic curves $y(x)$. To be able to solve for the characteristic curves, we need only solve the system of ordinary differential equations we have derived:

$$\frac{d}{dx}u(x, y(x)) = 0, \qquad \frac{dy}{dx} = \frac{b}{a}$$
The second differential equation has the straightforward solution, by inspection, of $y = \dfrac{b}{a}x + C$, where $C$ is some arbitrary constant of integration. For the first differential equation, however, we must be more careful, because $\dfrac{d}{dx}u(x, y(x))$ is a total derivative of the multivariable function $u(x, y)$. Integrating it tells us only that $u$ is constant along each characteristic curve:

$$u(x, y(x)) = g(C)$$

Notice here that instead of a constant of integration, we have an arbitrary function of integration $g$, since each characteristic curve (each value of $C$) may carry its own constant value of $u$.

Now, let us recall that $u = g(C)$ holds for all points along the characteristic curve, so the value of $u$ at any point $(x, y)$ is determined by the constant $C$ of the characteristic curve passing through that point. But we know that $y = \dfrac{b}{a}x + C$, which we can rearrange to $C = y - \dfrac{b}{a}x$. Therefore:

$$u(x, y) = g\left(y - \frac{b}{a}x\right)$$

Since $g$ is a completely arbitrary function, we can define a new (and also arbitrary) function $f$, where $f(s) = g(-s/a)$ (here $s = bx - ay$ is a substitution variable). Thus we have:

$$g\left(y - \frac{b}{a}x\right) = g\left(-\frac{bx - ay}{a}\right) = f(bx - ay)$$

So that we may write our generalized solution as:

$$u(x, y) = f(bx - ay)$$
This is a general solution, meaning that $f$ is a yet-to-be-determined function, and substituting in provided boundary conditions is necessary to determine the exact expression for $f$.
Note: An important theme when studying general solutions of PDEs is to remember that arbitrary compositions of arbitrary functions make no difference in writing the general solution to a PDE, just like the addition of a different constant of integration makes no difference to the general solution to an ODE. The choice of arbitrary function is purely stylistic, since the solution to a PDE cannot be determined without provided boundary and initial conditions.
We may verify that our general solution is indeed a solution to our PDE by taking the derivatives of $u = f(bx - ay)$ and substituting them back into our PDE:

$$\frac{\partial u}{\partial x} = b\,f'(bx - ay), \qquad \frac{\partial u}{\partial y} = -a\,f'(bx - ay)$$

$$a\frac{\partial u}{\partial x} + b\frac{\partial u}{\partial y} = ab\,f'(bx - ay) - ab\,f'(bx - ay) = 0$$

Again, note that the solution is a general solution for arbitrary $f$. To find a unique solution, we must be provided with a condition that constrains $f$. For instance, such a condition may be (an illustrative choice):

$$u(x, 0) = \sin x$$

If we substitute $y = 0$ into our general solution, we find that:

$$u(x, 0) = f(bx) = \sin x$$

Therefore we have:

$$f(s) = \sin\left(\frac{s}{b}\right)$$

where we used the substitution $s = bx$ to solve. Therefore, the particular solution given the condition that $u(x, 0) = \sin x$ becomes:

$$u(x, y) = \sin\left(\frac{bx - ay}{b}\right) = \sin\left(x - \frac{a}{b}y\right)$$
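The general solution of the constant-coefficient case can also be verified numerically. The sketch below (with arbitrary assumed values for $a$, $b$, and the function $f$) checks that $u(x, y) = f(bx - ay)$ satisfies $a\,u_x + b\,u_y = 0$ up to finite-difference error:

```python
import math

a, b = 2.0, 5.0   # arbitrary constant coefficients (assumed values)

def f(s):
    return math.tanh(s)   # arbitrary single-variable profile

def u(x, y):
    # candidate general solution of a*u_x + b*u_y = 0
    return f(b * x - a * y)

def first_diff(fn, z, h=1e-5):
    # central first-difference approximation of fn'(z)
    return (fn(z + h) - fn(z - h)) / (2.0 * h)

x0, y0 = 0.3, -0.8
u_x = first_diff(lambda x: u(x, y0), x0)
u_y = first_diff(lambda y: u(x0, y), y0)
residual = abs(a * u_x + b * u_y)
print(residual < 1e-6)  # the PDE is satisfied up to discretization error
```

Replacing `f` with any other smooth profile leaves the residual negligible, reflecting the arbitrariness of $f$ in the general solution.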
Coordinate transformation method
We may alternately solve the transport equation by another means: a coordinate transformation. If we define the following transformed coordinates:

$$x' = ax + by, \qquad y' = bx - ay$$

Then by the chain rule, we may translate the derivatives with respect to $x$ and $y$ to derivatives with respect to $x'$ and $y'$:

$$\frac{\partial u}{\partial x} = \frac{\partial u}{\partial x'}\frac{\partial x'}{\partial x} + \frac{\partial u}{\partial y'}\frac{\partial y'}{\partial x} = a\frac{\partial u}{\partial x'} + b\frac{\partial u}{\partial y'}$$

$$\frac{\partial u}{\partial y} = \frac{\partial u}{\partial x'}\frac{\partial x'}{\partial y} + \frac{\partial u}{\partial y'}\frac{\partial y'}{\partial y} = b\frac{\partial u}{\partial x'} - a\frac{\partial u}{\partial y'}$$

Now if we substitute these expressions back into the equation, we find that:

$$a\frac{\partial u}{\partial x} + b\frac{\partial u}{\partial y} = (a^2 + b^2)\frac{\partial u}{\partial x'} = 0$$

Where we notice how this transformation of coordinates greatly simplifies the equation. Since $a^2 + b^2 \neq 0$, the only way for $(a^2 + b^2)\dfrac{\partial u}{\partial x'} = 0$ to be true is if $\dfrac{\partial u}{\partial x'} = 0$. Now, if we take the partial integral, we find that:

$$u = f(y')$$

Where we remember that we always have to add a constant of integration (technically, a function of integration, which we represent with $f$ here) when taking the partial integral. Recall now that we defined our transformed coordinates such that:

$$x' = ax + by, \qquad y' = bx - ay$$

Therefore, by substitution of the bottom equation for $y'$, we have:

$$u(x, y) = f(bx - ay)$$

Where the function $f$ must be determined by the boundary conditions supplied to the problem. Note that this is the same solution as we arrived at by the method of characteristics, showing that the two methods yield identical results (it would be mathematically inconsistent if they didn't!)
Generalized method of characteristics
We may generalize the method of characteristics to first-order linear PDEs with variable coefficients (rather than constant ones). These PDEs are in the form:

$$a(x, y)\frac{\partial u}{\partial x} + b(x, y)\frac{\partial u}{\partial y} = 0$$

As before, we can interpret the left-hand side of the PDE as the directional derivative of $u$ along the direction of $\mathbf{v} = (a(x, y),\, b(x, y))$. Since the directional derivative is equal to zero, there exist characteristic curves along which $u$ does not change, instead taking a constant value, just as we saw previously.

To be able to solve for the characteristic curves, we again rewrite the equation into the form:

$$\frac{\partial u}{\partial x} + \frac{b(x, y)}{a(x, y)}\frac{\partial u}{\partial y} = 0$$

And we still use the multivariable chain rule to find that:

$$\frac{d}{dx}u(x, y(x)) = \frac{\partial u}{\partial x} + \frac{dy}{dx}\frac{\partial u}{\partial y}$$

From which we make the identification that if we set $\dfrac{dy}{dx} = \dfrac{b(x, y)}{a(x, y)}$, then $\dfrac{d}{dx}u(x, y(x))$ becomes identical to the left-hand side of the PDE. Thus, we need only solve the differential equation:

$$\frac{dy}{dx} = \frac{b(x, y)}{a(x, y)}$$

This results in some solution in the form $G(x, y) = C$, where $C$ is a constant. Afterwards, the steps match near-identically those from the prior discussion of the simpler case.
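As a short worked example of the generalized method (chosen by us for illustration, not taken from the text), consider the variable-coefficient PDE $\dfrac{\partial u}{\partial x} + y\,\dfrac{\partial u}{\partial y} = 0$:

```latex
\begin{aligned}
&\frac{dy}{dx} = \frac{b(x, y)}{a(x, y)} = y
  \quad\Longrightarrow\quad y = C e^{x}
  \quad\Longrightarrow\quad C = y e^{-x} \\
&u(x, y) = f\left(y e^{-x}\right) \\
&\text{check:}\quad
  \frac{\partial u}{\partial x} + y \frac{\partial u}{\partial y}
  = -y e^{-x}\, f'\left(y e^{-x}\right) + y\, e^{-x}\, f'\left(y e^{-x}\right) = 0
\end{aligned}
```

The characteristic curves $y = Ce^x$ are no longer straight lines, but the logic is unchanged: $u$ is constant along each curve, so $u$ can only depend on the curve label $C = ye^{-x}$.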
PDEs from physical phenomena
One of the motivating reasons for the study of partial differential equations is their close relationship with physics. PDEs model many physical phenomena, such as flows, vibrations, oscillations, diffusion, advection, and heat conduction, just to name a few. In many cases, PDEs can be derived from physical principles, and we will show that this is the case with several examples.
The transport equation
We will first derive the transport equation, with a derivation partially based on the following guide. The transport equation is given by:

$$\frac{\partial \rho}{\partial t} + c\frac{\partial \rho}{\partial x} = 0$$

To begin our analysis, consider a moving distribution of mass (for instance, spreading cement or some syrup slowly flowing down a spoon), modelled by a mass density function $\rho$, which varies with time as the distribution moves. For simplicity, we can consider a linear distribution of mass, such that the distribution is confined to move along one axis. Thus, the mass density function depends only on one spatial and one time coordinate, and simplifies to $\rho(x, t)$. Let us call the velocity at which $\rho$ moves $c$ (which we will call the speed of propagation).

Let the mass density at time $t$ be distributed between two endpoints $x_1$ and $x_2$. The total mass at time $t$ is found by integrating the mass density between the endpoints $x_1$ and $x_2$. That is:

$$M(t) = \int_{x_1}^{x_2} \rho(x, t)\,dx$$

We may find the rate of change of the mass within the region of $x_1$ to $x_2$ as follows:

$$\frac{dM}{dt} = \frac{d}{dt}\int_{x_1}^{x_2} \rho(x, t)\,dx = \int_{x_1}^{x_2} \frac{\partial \rho}{\partial t}\,dx$$

But by the law of the conservation of mass:

$$\frac{dM}{dt} = J$$

The quantity on the right-hand side is the mass flux, meaning the net amount of mass flow from the amount of mass leaving the region and the amount of mass entering the region at the same time. The flux is thus given by:

$$J = c\left(\frac{m}{L}\right)\bigg|_{x = x_1} - c\left(\frac{m}{L}\right)\bigg|_{x = x_2}$$

Where $c$ is a factor to ensure the units are dimensionally consistent. But $m/L$, that is, the mass per unit length, is simply the mass density! Thus we may equivalently write:

$$J = c\,\rho(x_1, t) - c\,\rho(x_2, t)$$

Where $\rho(x_1, t)$ is the mass density at $x_1$ at time $t$, and $\rho(x_2, t)$ is the mass density at $x_2$ at the same time. We note that by the fundamental theorem of calculus, we have:

$$c\,\rho(x_1, t) - c\,\rho(x_2, t) = -\int_{x_1}^{x_2} c\,\frac{\partial \rho}{\partial x}\,dx$$

So we have:

$$\frac{dM}{dt} = -\int_{x_1}^{x_2} c\,\frac{\partial \rho}{\partial x}\,dx$$

Recall from earlier that $\dfrac{dM}{dt} = \displaystyle\int_{x_1}^{x_2} \frac{\partial \rho}{\partial t}\,dx$. If we now substitute our derived expressions, we have:

$$\int_{x_1}^{x_2} \frac{\partial \rho}{\partial t}\,dx = -\int_{x_1}^{x_2} c\,\frac{\partial \rho}{\partial x}\,dx
\quad\Longrightarrow\quad
\int_{x_1}^{x_2} \left[\frac{\partial \rho}{\partial t} + c\,\frac{\partial \rho}{\partial x}\right] dx = 0$$

And therefore, since the endpoints $x_1$ and $x_2$ are arbitrary, the integrand must vanish:

$$\frac{\partial \rho}{\partial t} + c\,\frac{\partial \rho}{\partial x} = 0$$

We have arrived at the transport equation. More advanced readers may note that the transport equation is actually the 1D case of the more general continuity equation:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\,\mathbf{v}) = 0$$
Various forms of the continuity equation appear in nearly all fields in physics, from fluid dynamics to electromagnetic theory to special and general relativity to even quantum mechanics. Thus, studying the transport equation is crucial to understanding its more complex derivatives.
The wave equation
The next PDE we will derive is the wave equation. In its most common form, the one-dimensional wave equation is given by:

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}$$
The standard derivation of the wave equation comes from the study of a vibrating string, in which the tensile force of a string, together with a fair bit of mathematical wizardry, is used to arrive at the PDE. We will take an alternative route and offer a simpler - although less mathematically-rigorous - derivation.
Recall that Newton's second law, in one dimension, is given by:

$$F_x = ma_x = m\frac{d^2 x}{dt^2}$$

where $F_x$ is a force in the $x$ direction. Now once again, consider a distribution of mass that can be modelled as a mass density function $\rho(x, t)$, and let $u(x, t)$ denote the displacement of the distribution from equilibrium. Since we are considering a mass density (i.e. mass over length) rather than a singular mass, the left-hand side of Newton's second law becomes (per unit length):

$$\rho\,\frac{\partial^2 u}{\partial t^2}$$

Now suppose that at some time $t$, an external force is applied that causes a disturbance in the mass distribution. In the traditional derivation of the wave equation, this is stretching a string under tension; but our mass distribution doesn't have to be a string. The mass distribution would respond to the disturbance with a restoring force that "smooths out" the disturbance throughout the mass distribution to try to restore itself to equilibrium. Thus we would expect this force to be proportional to the curvature of the function in space. But recall that the second derivative encodes information about curvature - this is why we use it to determine concavity (concave-up or concave-down) in optimization problems. So we could expect the restoring force to take the form:

$$F = T\,\frac{\partial^2 u}{\partial x^2}$$

Where $T$ is some constant to get the units right (we will discuss its physical significance later). Now, substituting everything into Newton's second law, we have:

$$\rho\,\frac{\partial^2 u}{\partial t^2} = T\,\frac{\partial^2 u}{\partial x^2}$$

Thus, with just a bit of rearrangement, we have arrived at the wave equation:

$$\frac{\partial^2 u}{\partial t^2} = \frac{T}{\rho}\,\frac{\partial^2 u}{\partial x^2} = c^2\,\frac{\partial^2 u}{\partial x^2}, \qquad c^2 = \frac{T}{\rho}$$
Note that interestingly, the wave equation can be factored into two transport equations, one that gives leftward-traveling (i.e. $-x$ direction) solutions and one that gives rightward-traveling (i.e. $+x$ direction) solutions:

$$\left(\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right)u = 0$$
This is a tremendously-helpful fact, as it means that solving the transport equation already brings us halfway to solving the wave equation.
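This factoring can be illustrated numerically. The sketch below (with an arbitrary assumed speed $c$ and pulse shape $f$) checks that a rightward-traveling profile $u(x, t) = f(x - ct)$ satisfies the transport factor $u_t + c\,u_x = 0$, and hence also the wave equation $u_{tt} = c^2 u_{xx}$, up to finite-difference error:

```python
import math

c = 1.5   # arbitrary propagation speed (assumed value)

def f(s):
    return math.exp(-s**2)   # arbitrary pulse shape

def u(x, t):
    # rightward-traveling profile, a solution of the factor u_t + c*u_x = 0
    return f(x - c * t)

def d1(fn, z, h=1e-5):
    # central first-difference approximation
    return (fn(z + h) - fn(z - h)) / (2.0 * h)

def d2(fn, z, h=1e-4):
    # central second-difference approximation
    return (fn(z + h) - 2.0 * fn(z) + fn(z - h)) / h**2

x0, t0 = 0.6, 0.2
transport_residual = abs(d1(lambda t: u(x0, t), t0) + c * d1(lambda x: u(x, t0), x0))
wave_residual = abs(d2(lambda t: u(x0, t), t0) - c**2 * d2(lambda x: u(x, t0), x0))
print(transport_residual < 1e-6 and wave_residual < 1e-4)
```

The analogous check passes for a leftward-traveling $f(x + ct)$, which satisfies the other transport factor.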
The diffusion equation
Consider some distribution of mass given by mass density $\rho(x, y, z, t)$ confined in a volume $V$. For instance, this may be a gas, which has regions of varying density. The total mass of the gas within the volume would be given by the volume integral of $\rho$ over the region:

$$M = \iiint_V \rho\,dV$$

By the conservation of mass, any gas that flows out of the volume must flow across the boundary of the volume. For instance, some gas flowing out of an (imaginary) spherical region must flow across the surface of the (imaginary) sphere. The rate at which gas flows, or diffuses, per unit area, is given by Fick's law:

$$\mathbf{J} = -k\,\nabla\rho$$

To find the total rate of diffusion across the entire boundary of the volume (this is called the flux, denoted $\Phi$), we need to take the surface integral across the entire surface area of the volume's boundary:

$$\Phi = \iint_{\partial V} \mathbf{J} \cdot d\mathbf{A}$$

To ensure the conservation of mass, the reduction in mass of the gas within the volume must be equal to the flux (amount of diffusion out of the volume). Therefore, we have:

$$-\frac{dM}{dt} = \Phi$$

Therefore, by substitution of the expressions for $M$ and $\Phi$:

$$-\frac{d}{dt}\iiint_V \rho\,dV = \iint_{\partial V} (-k\,\nabla\rho) \cdot d\mathbf{A}$$

Now, recalling the divergence theorem, we can rewrite the surface integral as a volume integral, as follows:

$$\iint_{\partial V} (-k\,\nabla\rho) \cdot d\mathbf{A} = -\iiint_V \nabla \cdot (k\,\nabla\rho)\,dV = -\iiint_V k\,\nabla^2\rho\,dV$$

Therefore we have:

$$\frac{d}{dt}\iiint_V \rho\,dV = \iiint_V k\,\nabla^2\rho\,dV$$

We may combine this into one integral by using the Leibniz rule (for differentiation under the integral sign):

$$\iiint_V \left[\frac{\partial \rho}{\partial t} - k\,\nabla^2\rho\right] dV = 0$$

By the vanishing theorem, the quantity inside the brackets must be zero. Therefore, we have:

$$\frac{\partial \rho}{\partial t} - k\,\nabla^2\rho = 0$$

Written slightly differently, we have the diffusion equation:

$$\frac{\partial \rho}{\partial t} = k\,\nabla^2\rho$$
Note that this is the homogeneous case. In the case where there is a source, represented by a source term $f \neq 0$, the linear inhomogeneous case becomes:

$$\frac{\partial \rho}{\partial t} = k\,\nabla^2\rho + f$$

In the other case, if there is no source, then $f = 0$ and we recover the homogeneous diffusion equation:

$$\frac{\partial \rho}{\partial t} = k\,\nabla^2\rho$$

If $\rho$ does not change with time, then $\dfrac{\partial \rho}{\partial t} = 0$ and thus we have Laplace's equation:

$$\nabla^2\rho = 0$$

Another particular case of the diffusion equation is the heat equation, where the diffusing substance is heat. The distribution of heat is given by the temperature $T$ and the heat equation takes the form:

$$\frac{\partial T}{\partial t} = k\,\nabla^2 T$$

The heat equation's physical basis is Fourier's law: for two regions of temperatures $T_1 > T_2$, the rate of heat flow between the regions is proportional to the gradient of the temperature, $\nabla T$. This is very similar to Fick's law for diffusion, and thus the heat equation is classified as a type of diffusion equation.

Finally, a famous case of the diffusion equation (albeit where the unknown $\Psi$ is a complex-valued function) is the Schrödinger equation, which takes the form:

$$i\hbar\frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\Psi + V\Psi$$

Due to the conservation of probability, the Schrödinger equation requires the normalization condition to be satisfied:

$$\iiint |\Psi|^2\,dV = 1$$
Initial and boundary-value problems
We have discussed previously that knowledge of only the PDE is insufficient to provide a unique solution to a given physical problem. Thus, while we may have derived (or at least know) a particular PDE, we can only write down a unique solution when we are provided with initial and boundary conditions.
Let us take the example of the wave equation. Recall that the wave equation can be factored into two transport equations:

$$\left(\frac{\partial}{\partial t} + c\frac{\partial}{\partial x}\right)\left(\frac{\partial}{\partial t} - c\frac{\partial}{\partial x}\right)u = 0$$
This means that for a solution to be found, we must provide initial and boundary conditions by specifying the values of both the function and its derivatives at specific points along our area of interest. The possible types of initial and boundary conditions for a generalized quantity described by a PDE are as follows:
Initial/Boundary condition | Mathematical form | Physical description | Example |
---|---|---|---|
Initial condition | $u(x, 0) = f(x)$ | The quantity takes the value $f(x)$ at $t = 0$ | Initial heat distribution before heat flow for the heat equation |
Dirichlet boundary condition | $u\big|_{\partial\Omega} = f$ | The quantity takes a specified value at the boundaries of the area of interest | The fixed height of a vibrating membrane along its edges for Laplace's equation of a vibrating membrane |
Neumann boundary condition | $\dfrac{\partial u}{\partial n}\bigg|_{\partial\Omega} = f$ | The normal derivative of the quantity takes a specified value at the boundaries of the area of interest (note: $\mathbf{n}$ is the normal vector of the boundary and $\dfrac{\partial u}{\partial n} = \nabla u \cdot \mathbf{n}$) | The rate of heat spreading away from the edges of a hot object. When $f = 0$, the object is perfectly insulating, meaning no heat escapes from the object |
Robin boundary condition | $\left(au + b\,\dfrac{\partial u}{\partial n}\right)\bigg|_{\partial\Omega} = f$ | A linear combination of the quantity and its derivative takes a specified value at the boundaries of the area of interest | A linear relation between the spread of gas across the boundary and the gas density at that boundary |
Periodic boundary condition | $u(x_1, t) = u(x_2, t)$ | The quantity repeats such that its value at two points is the same | The height of a sinusoidal wave at two locations (periodic boundary conditions are often used for oscillating or wave-like phenomena) |
Vanishing boundary condition | $u \to 0$ as $\lvert x\rvert \to \infty$ | The quantity vanishes at infinity (often used for quantities where the total amount is finite) | Normalization condition of Schrödinger's equation (which demands that $\Psi \to 0$ as $\lvert x\rvert \to \infty$) |
For instance, suppose our area of interest is a linear region between endpoints $x_1 \leq x \leq x_2$, and we want to solve the wave equation. We would then want to specify the initial condition $u(x, 0)$, the boundary conditions $u(x_1, t)$ and $u(x_2, t)$, which are Dirichlet boundary conditions, as well as $\dfrac{\partial u(x, t)}{\partial x}\bigg|_{x = x_1}$ and $\dfrac{\partial u(x, t)}{\partial x}\bigg|_{x = x_2}$, which are Neumann boundary conditions. This combination is called an initial boundary-value problem (IBVP) and allows for finding a unique solution. For partial differential equations that are time-independent, we simply refer to a combination of boundary conditions and the PDE as a boundary-value problem (BVP).
Solving and classifying second-order PDEs
A second-order PDE is a PDE describing a function $u(x, y)$ which has the standard form:

$$a_{11}\frac{\partial^2 u}{\partial x^2} + 2a_{12}\frac{\partial^2 u}{\partial x\,\partial y} + a_{22}\frac{\partial^2 u}{\partial y^2} + a_1\frac{\partial u}{\partial x} + a_2\frac{\partial u}{\partial y} + a_0 u = 0$$

In general, any second-order PDE can be transformed via a coordinate transformation into this standard form. Furthermore, we can distinguish between three main categories of second-order PDEs, based on the coefficients $a_{11}$, $a_{12}$, and $a_{22}$:
- If $a_{12}^2 - a_{11}a_{22} > 0$, the equation is hyperbolic
- If $a_{12}^2 - a_{11}a_{22} = 0$, the equation is parabolic
- If $a_{12}^2 - a_{11}a_{22} < 0$, the equation is elliptic
Why do we care about these coefficients? The reason is that the qualitative behavior of a second-order PDE is almost entirely determined by its second-order derivative terms. The lower-order terms do not contribute as broadly to the character of the solution. Thus, we are very much interested in whether a second-order PDE is hyperbolic, parabolic, or elliptic.
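The classification test is mechanical enough to write down as a tiny helper. The sketch below is our own (the function name `classify` and the reading-off of each equation's coefficients are assumptions, using the standard form $a_{11}u_{xx} + 2a_{12}u_{xy} + a_{22}u_{yy} + \ldots = 0$, with $t$ playing the role of the second coordinate where needed):

```python
def classify(a11, a12, a22):
    # discriminant of the principal (second-order) part of the PDE
    d = a12**2 - a11 * a22
    if d > 0:
        return "hyperbolic"
    if d == 0:
        return "parabolic"
    return "elliptic"

c, k = 3.0, 0.5  # arbitrary wave speed and diffusivity (assumed values)

# wave equation u_tt - c^2 u_xx = 0: a11 = -c^2, a12 = 0, a22 = 1
print(classify(-c**2, 0.0, 1.0))   # hyperbolic

# heat equation u_t - k u_xx = 0: only one second-order term, a11 = -k
print(classify(-k, 0.0, 0.0))      # parabolic

# Laplace's equation u_xx + u_yy = 0: a11 = 1, a12 = 0, a22 = 1
print(classify(1.0, 0.0, 1.0))     # elliptic
```

Note that scaling an equation by a nonzero constant does not change the sign of the discriminant, so the classification is well-defined.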
As a quick, non-rigorous guide, we can often visually tell whether a second-order PDE is hyperbolic, parabolic, or elliptic by the general form its solutions take (figure omitted; source: Stanford MATH220A course).
However, it is worth familiarizing ourselves with each specific type in detail. We will now look at the three standard second-order PDEs: the first hyperbolic, the second parabolic, and the third elliptic.
Note: the names hyperbolic, parabolic, and elliptic originate from the geometric terms for the hyperbola, parabola, and ellipse, the three standard conic sections. It may seem like PDEs have nothing in common with geometry; on the surface level this can be presumed to be the case (although there is a deeper mathematical connection that comes up in more advanced studies). One can (for now) take the main utility of the geometry-inspired classification system to be a convenient way to distinguish between the three types of 2nd-order PDEs that is based on familiar terminology.
The standard hyperbolic second-order PDE is the 1D wave equation, which (physically) describes a propagating wave $u(x, t)$:

$$\frac{\partial^2 u}{\partial t^2} = c^2\frac{\partial^2 u}{\partial x^2}$$

While the standard 1D wave equation describes a function of the form $u(x, t)$, one may also write a perfectly valid wave equation that describes a function of the form $u(x, y)$ and takes the form:

$$\frac{\partial^2 u}{\partial y^2} = c^2\frac{\partial^2 u}{\partial x^2}$$
It may not be very apparent how the wave equation fits the standard form of a second-order PDE we looked at previously. To show explicitly that the wave equation does indeed fit the standard form, we may perform a coordinate transformation of the wave equation. Let our transformed coordinates be given by:

$$\xi = x + ct, \qquad \eta = x - ct$$

We then find that in these transformed coordinates, the wave equation takes the form:

$$\frac{\partial^2 u}{\partial \xi\,\partial \eta} = 0$$

We can show this by manually computing the coordinate-transformed partial derivatives by the chain rule:

$$\frac{\partial}{\partial x} = \frac{\partial \xi}{\partial x}\frac{\partial}{\partial \xi} + \frac{\partial \eta}{\partial x}\frac{\partial}{\partial \eta} = \frac{\partial}{\partial \xi} + \frac{\partial}{\partial \eta}, \qquad \frac{\partial}{\partial t} = c\frac{\partial}{\partial \xi} - c\frac{\partial}{\partial \eta}$$

From which we obtain:

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 u}{\partial \xi^2} + 2\frac{\partial^2 u}{\partial \xi\,\partial \eta} + \frac{\partial^2 u}{\partial \eta^2}, \qquad \frac{\partial^2 u}{\partial t^2} = c^2\left(\frac{\partial^2 u}{\partial \xi^2} - 2\frac{\partial^2 u}{\partial \xi\,\partial \eta} + \frac{\partial^2 u}{\partial \eta^2}\right)$$

Therefore, substituting into the wave equation, we have:

$$\frac{\partial^2 u}{\partial t^2} - c^2\frac{\partial^2 u}{\partial x^2} = -4c^2\frac{\partial^2 u}{\partial \xi\,\partial \eta} = 0 \quad\Longrightarrow\quad \frac{\partial^2 u}{\partial \xi\,\partial \eta} = 0$$

Thus we have $a_{12} \neq 0$ while $a_{11} = a_{22} = 0$, so $a_{12}^2 - a_{11}a_{22} > 0$, making the wave equation hyperbolic. The hyperbolic nature of the wave equation has some unique consequences for its solutions. First, in solutions to the wave equation, a disturbance (e.g. a pulse) in one region propagates at a finite speed $c$ in space, where $c$ is the propagation speed. That is to say, information about one part of a solution can only be transmitted at a finite speed to other parts of the solution. Thus hyperbolic equations in physics typically describe waves, including waves of light (electromagnetic waves), vibrations of a string, and acoustic (sound) waves. Second, hyperbolic equations do not "smooth out" shocks that arise at one point in time. These shocks propagate throughout the solution, but once again, at a finite speed.
Note: the 2D and 3D wave equations can be written as a system of 1D wave equations via a coordinate transformation, which are hyperbolic. Therefore, the 2D and 3D wave equations are also hyperbolic.
The standard parabolic second-order PDE, meanwhile, is the 1D heat equation, which (physically) describes a temperature distribution $u(x, t)$ through time:

$$\frac{\partial u}{\partial t} = k \frac{\partial^2 u}{\partial x^2}$$
Recall that the time derivative on the left-hand side is a lower-order term, and therefore it does not matter. It also does not matter if the left-hand side is a first-order derivative with respect to another variable; the following would still be a heat equation (although not the heat equation):
Since we ignore all lower-order terms when classifying PDEs, the first derivative can be effectively ignored. Therefore, we can place the heat equation in standard form as follows:
Where we use the notation that represents the lower order term. In this form, we can tell that while , and thus , which is the defining characteristic of a parabolic PDE (as we would expect).
Note: As with the wave equation, the 2D and 3D heat equations can also be put in parabolic form by rewriting them as a first-order system; each equation in the system would then take a parabolic form. Therefore, the 2D and 3D heat equations are also parabolic.
Just as the wave equation possesses specific properties due to it being hyperbolic, the heat equation possesses specific properties due to it being parabolic. One property is that maxima and minima in the solution tend to be "smoothed out" as the solution evolves in time, resulting in (mostly) smooth solutions. This behavior happens globally (i.e. across the entire solution) at the same time. That is to say, unlike the wave equation, the heat equation assumes infinite propagation speed and thus describes phenomena where a bulk quantity uniformly stabilizes towards equilibrium (as opposed to the non-uniform delayed propagation of information in the wave equation). Thus the heat equation physically describes diffusion and diffusion-like phenomena, making it the natural choice to describe heat transfer.
Lastly, the standard elliptic second-order PDE is Laplace's equation (in 2D), which describes a wide variety of phenomena in physics. It takes the form:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$
Note that the positive sign matters here - if the plus sign were a negative sign, we would instead have a hyperbolic PDE. Laplace's equation can also be written in terms of the Laplacian operator $\nabla^2$ as:

$$\nabla^2 u = 0$$
Where the Laplacian in two dimensions takes the form $\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$, and thus gives Laplace's equation:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$
The Laplace equation, in this form, is already in the standard form of a 2nd-order PDE (that we covered at the beginning of this section). As it has the coefficients and , we have , which makes the PDE elliptic.
Note: Laplace's equation in 3D (which also takes the form $\nabla^2 u = 0$, just with the 3D Laplacian, so it reads $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$) is also elliptic, which can be shown (again) by rewriting it as a system of PDEs. However, Laplace's equation in 1D is not elliptic (rather, it is parabolic).
As the standard elliptic PDE, Laplace's equation demonstrates several of the defining characteristics of elliptic PDEs. First, note that Laplace's equation is usually written in a time-independent form - therefore, it describes steady-state systems (i.e. systems at equilibrium) that are slowly-evolving or constant in time. For instance, in electrostatics, the electric potential satisfies Laplace's equation. Second, the solutions to Laplace's equation are always smooth. This is because, just like parabolic PDEs, information about parts of a solution can be thought of as travelling instantly between different parts of the solution; the nuance being that physical scenarios modelled by Laplace's equation are ones that have already reached equilibrium. This also means that the 2D heat equation becomes Laplace's equation in the steady-state limit $t \to \infty$, when $\partial u / \partial t \to 0$.
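As an illustration of this equilibrium behavior, the sketch below (a minimal example, not from the original text) solves Laplace's equation on a square grid by Jacobi relaxation: each interior point is repeatedly replaced by the average of its four neighbors until the values stop changing, which is exactly the discrete statement of $u_{xx} + u_{yy} = 0$:

```python
# Jacobi relaxation: repeatedly replace each interior value by the average
# of its four neighbors. The fixed point is the discrete solution of
# Laplace's equation u_xx + u_yy = 0 with the given boundary values.
N = 20
u = [[0.0] * N for _ in range(N)]
for i in range(N):          # boundary condition: u = 1 on the top edge,
    u[0][i] = 1.0           # u = 0 on the other three edges

for _ in range(2000):
    new = [row[:] for row in u]
    for i in range(1, N - 1):
        for j in range(1, N - 1):
            new[i][j] = 0.25 * (u[i-1][j] + u[i+1][j] + u[i][j-1] + u[i][j+1])
    u = new

# At equilibrium each interior value equals the average of its neighbors
# (the discrete mean-value property), so the solution is smooth and all
# interior values lie strictly between the boundary extremes 0 and 1.
i, j = N // 2, N // 2
avg = 0.25 * (u[i-1][j] + u[i+1][j] + u[i][j-1] + u[i][j+1])
print(abs(u[i][j] - avg) < 1e-6)  # True
print(0 < u[i][j] < 1)            # True
```

The fact that no interior point can exceed the boundary values is the maximum principle for Laplace's equation, and the neighbor-averaging behavior is why the solutions are so smooth.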
The three representative 2nd-order PDEs - the wave equation, heat equation, and Laplace's equation - are unique in that any second-order hyperbolic, parabolic, or elliptic PDE can be transformed into the corresponding one of these equations (ignoring the lower-order terms, which are not significant). Thus the study of these three 2nd-order PDEs provides results that carry over to all 2nd-order PDEs. Therefore, we can summarize our general conclusions about 2nd-order PDEs as follows:
Type of 2nd-order PDE | Representative equation | Representative form |
---|---|---|
Hyperbolic | Wave equation | $u_{tt} = c^2 u_{xx}$ |
Parabolic | Heat equation | $u_t = k u_{xx}$ |
Elliptic | Laplace's equation | $u_{xx} + u_{yy} = 0$ |
It is also possible to condense this information in a matrix, as follows:
We may find the eigenvalues of the matrix by solving the eigenvalue equation:
The solutions to the above equation are the eigenvalues of . The eigenvalues of the matrix can then be used to determine the classification of the PDE:
Type of 2nd-order PDE | Eigenvalues of |
---|---|
Hyperbolic | Opposite signs |
Parabolic | At least one eigenvalue is zero |
Elliptic | Same signs |
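The eigenvalue test in the table above is easy to automate. The sketch below (an illustrative helper, assuming the convention that the PDE's principal part is $a\,u_{xx} + 2b\,u_{xy} + c\,u_{yy}$, so that the symmetric coefficient matrix has entries $a, b, b, c$) classifies a PDE from the signs of the matrix's eigenvalues:

```python
import math

def classify(a, b, c):
    """Classify a u_xx + 2b u_xy + c u_yy + (lower-order terms) = 0
    by the eigenvalues of the symmetric coefficient matrix [[a, b], [b, c]]."""
    mean = 0.5 * (a + c)
    half_gap = math.sqrt((0.5 * (a - c)) ** 2 + b ** 2)
    lam1, lam2 = mean - half_gap, mean + half_gap   # the two eigenvalues
    if abs(lam1) < 1e-12 or abs(lam2) < 1e-12:
        return "parabolic"      # at least one eigenvalue is zero
    if lam1 * lam2 < 0:
        return "hyperbolic"     # eigenvalues of opposite signs
    return "elliptic"           # eigenvalues of the same sign

print(classify(1, 0, -1))  # wave equation u_xx - u_tt = 0 -> "hyperbolic"
print(classify(1, 0, 0))   # heat equation (only u_xx in the principal part) -> "parabolic"
print(classify(1, 0, 1))   # Laplace's equation -> "elliptic"
```

Since the product of the eigenvalues is the determinant $ac - b^2$, this test is equivalent to checking the sign of $b^2 - ac$.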
Mixed-type 2nd-order PDEs
We have seen that 2nd-order PDEs may be classified, based on their coefficients, as hyperbolic, parabolic, or elliptic. But these distinctions are not as clear-cut as they may first appear; in fact, a 2nd-order PDE may be of mixed type, which means it can be hyperbolic in one region, parabolic in another region, and elliptic in yet another region. Consider the following 2nd-order PDE:
From first appearance, this PDE may appear to be elliptic, as evidenced by the positive sign. But we must exercise caution, as this is not always so; in fact, it is only true for . Notice what happens at the origin, where : then the second term becomes zero, and we are left with:
Which is a parabolic differential equation. Finally, if we have , the second term would then become negative, giving a hyperbolic differential equation. This is why we classify this PDE as mixed-type: it is classified differently depending on the region. Below is a graph of the regions in which the PDE takes each type (red represents and thus elliptic, blue represents and thus hyperbolic, and the dashed black line represents and thus parabolic):
Existence, uniqueness, and stability
Up to this point, we have been solving PDEs with the assumption that solutions always exist, are unique, and are stable. That is, a solution satisfies the following characteristics:
- Existence: It is mathematically-possible to find the solution of a PDE for given initial (and/or boundary) conditions
- Uniqueness: The solution of a PDE for given initial (and/or boundary) conditions is the one and only solution
- Stability: A PDE has solutions without shocks, discontinuities, divergent behavior, or any other instabilities that result in unpredictable behavior
It may be odd to think that solutions may not even exist for a given PDE and boundary conditions, but this is possible. Typically, the most straightforward way to show that a solution does not exist is to (attempt to) solve a given problem with the provided boundary conditions, and show that the boundary conditions cannot be satisfied.
Furthermore, even if a solution exists, there may be multiple solutions possible - a good check is to see whether the trivial solution $u = 0$ is a solution to the BVP in addition to some other solution, or whether $u + C$, where $C$ is some constant, is still a solution to the BVP. If multiple solutions are possible, then we say that the solution is non-unique. For example, consider the following BVP for the transport equation:
In this case through just guess-and-check we find that one particular solution to the BVP is the function . There is, however, another solution: . Thus the solution is certainly not unique.
In many cases, physical intuition can be enough to deduce whether the boundary conditions lead to nonexistent or non-unique solutions. For instance, if a PDE doesn't have sufficiently many boundary conditions, a solution often exists but is not unique.
Finally, a PDE (particularly a hyperbolic PDE) may have instabilities that make a solution useless to compute, even if one can be found. A problem that satisfies all three criteria is said to be well-posed. Instability is especially common for hyperbolic PDEs in the form of the modified wave equation:
Such PDEs are highly sensitive to their initial conditions, and thus stability depends on the smoothness (i.e. no abrupt jumps, discontinuities, or asymptotic behavior) of their initial conditions. The same is often true of nonlinear PDEs. Again, instability is hard to show without explicitly solving the BVP and examining the resulting solution.
It is rarely possible to prove existence, uniqueness, and stability for a general class of boundary-value (or initial-value) problems. It is often much easier to give a counterexample (i.e. disprove existence, uniqueness, and stability). Even when the conditions of existence, uniqueness, and stability are satisfied, it may be impossible to prove this or a proof may only be possible on a case-by-case basis. In this introductory treatment of PDEs, we will not delve into the intricacies of proving existence, uniqueness, and stability, which involves very advanced mathematics.
It is, however, useful to mention several theorems for specific PDEs that provide for existence and uniqueness (and in some cases, stability):
Theorem 1: The solution to Poisson's equation (as well as its limiting case, Laplace's equation, given by $\nabla^2 u = 0$) is unique for given boundary conditions. That is to say, if the values of $u$ are specified on the boundary, or if the values of the derivative of $u$ are specified on the boundary (but not both; in the latter case the solution is unique up to an additive constant), then a solution will always exist and is unique.
Theorem 2: By the Cauchy–Kowalewskaya theorem, any second-order PDE in the form of the wave equation or diffusion equation with the Cauchy (initial) conditions $u(x, 0) = \phi(x)$, $u_t(x, 0) = \psi(x)$ yields a unique solution if both conditions are provided and $\phi$ and $\psi$ are both functions with well-defined power series.
Theorem 3: Assuming theorem (2) holds, the solution to the diffusion equation is always stable for $t > 0$ (although unstable for $t < 0$).
Theorem 4: Assuming theorem (1) holds, the solution to Poisson's equation and Laplace's equation is always stable.
Solutions of the wave equation
We have previously seen that the wave equation is the prototypical hyperbolic PDE, and is given by:

$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}$$
Note: The fact that the equation has $c^2$ instead of $c$ as a constant is significant. $c^2$ is guaranteed to be positive, while $c$ can be positive or negative, and this makes a massive difference in the behavior of the PDE (and therefore, its solutions).
As the prototypical hyperbolic PDE, the wave equation has a special significance because any hyperbolic PDE can be transformed into the wave equation by a change of coordinates. In addition, it has the desirable characteristic that its general solution is actually rather simple:

$$u(x, t) = f(x + ct) + g(x - ct)$$
Which we can find by the method of characteristics or by the method of coordinate transforms and then direct integration (which we previously showed). Particular solutions for the wave equation can be found by giving initial conditions. Let us consider the specific initial-value problem given by:

$$\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2}, \quad u(x, 0) = \phi(x), \quad \frac{\partial u}{\partial t}(x, 0) = \psi(x)$$

Where $\phi, \psi$ are arbitrary functions. This initial-value problem possesses a particular solution (d'Alembert's formula), given by:

$$u(x, t) = \frac{1}{2}\left[\phi(x + ct) + \phi(x - ct)\right] + \frac{1}{2c}\int_{x - ct}^{x + ct} \psi(s)\, ds$$

In the special case where we have $\psi = 0$, the particular solution reduces to:

$$u(x, t) = \frac{1}{2}\left[\phi(x + ct) + \phi(x - ct)\right]$$

Meanwhile, in the special case where we have $\phi = 0$, the particular solution reduces to:

$$u(x, t) = \frac{1}{2c}\int_{x - ct}^{x + ct} \psi(s)\, ds$$
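We can check numerically that d'Alembert's formula does satisfy the wave equation. The sketch below (illustrative, using the $\psi = 0$ special case with a sample profile $\phi = \sin$) compares second central differences in $t$ and $x$:

```python
import math

c = 3.0
phi = math.sin   # smooth sample initial displacement, zero initial velocity

def u(x, t):
    # d'Alembert's formula in the special case psi = 0
    return 0.5 * (phi(x + c * t) + phi(x - c * t))

# Check u_tt = c^2 u_xx at a sample point via central differences
x0, t0, h = 0.7, 1.3, 1e-3
u_tt = (u(x0, t0 + h) - 2 * u(x0, t0) + u(x0, t0 - h)) / h**2
u_xx = (u(x0 + h, t0) - 2 * u(x0, t0) + u(x0 - h, t0)) / h**2
print(abs(u_tt - c**2 * u_xx) < 1e-3)  # True: the PDE is satisfied
```

The same check works for any smooth $\phi$, since each of $\phi(x + ct)$ and $\phi(x - ct)$ satisfies the wave equation on its own.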
Solutions of relevance in physics
One of the most useful particular solutions to the wave equation (particularly in Physics) is found by imposing the additional periodic boundary conditions that and . The solution then becomes:
Which can also be written in a more compact form if we define and , as follows:
In physics, each of these quantities has a very specific physical interpretation - $\lambda$ is called the wavelength, $T$ is called the period, $\omega = 2\pi/T$ is the angular frequency, and $k = 2\pi/\lambda$ is called the wavenumber. Such solutions are known as plane-wave solutions (also called traveling waves) as they describe waves that propagate sinusoidally to infinity, with their wavefronts being uniform (thus, planar). See the guide on waves and oscillations in physics for more details.
Note: we can construct a more general solution if we write this solution in complex-exponential form using Euler's formula $e^{i\theta} = \cos\theta + i\sin\theta$. As the wave equation is linear, we can sum over an infinite number of plane waves spaced out in space (with different values of $k$) to obtain a more general solution. This is also a form often used in physics.
Note that a particular case of this particular solution can also be found with the following boundary conditions:
Where we have:
We may verify that this solution does, indeed, satisfy the wave equation:
As we saw previously, solutions to the wave equation (waves) have the universal feature that a component of the form $g(x - ct)$ travels to the right and a component of the form $f(x + ct)$ travels to the left, at a speed of propagation of $c$. Since $c$ is a constant, $c$ is also the maximal speed at which information at one point of a solution can affect another point. This has important consequences in physics: light is described in physics as an electromagnetic wave, and thus $c$ is the famous speed of light, which is invariant (the same in all reference frames) and directly led to the development of the theory of relativity.
But remember that the wave equation describes all types of waves, one of which being the vibrations of a string, attached at its two ends. In this case, represents the displacement of the string from its equilibrium point. If we let the string be of length and have density under tension force , then the boundary-value problem for the string reads:
We note that the wave equation for a string can be equivalently written (by expanding out and distributing) as:
The expression for the kinetic energy of the string is given by:

$$KE = \frac{1}{2}\int_0^l \rho \left(\frac{\partial u}{\partial t}\right)^2 dx$$
Where we must integrate over $x$, since each mass element is $dm = \rho\, dx$, and thus to find the total mass we have to integrate the mass density across the string. If we assume constant density, we have:
If we differentiate the kinetic energy we have:
But remember the wave equation for a string reads . Therefore, substituting in, we have:
Meanwhile, the potential energy of a string (which we will not derive, but it comes from the work-energy theorem) is given by:

$$PE = \frac{1}{2}\int_0^l T \left(\frac{\partial u}{\partial x}\right)^2 dx$$
Thus we find that the total mechanical energy, given by the sum of the kinetic and potential energies, is given by:

$$E = KE + PE = \frac{1}{2}\int_0^l \left[\rho \left(\frac{\partial u}{\partial t}\right)^2 + T \left(\frac{\partial u}{\partial x}\right)^2\right] dx$$
But since $\frac{d(KE)}{dt} = -\frac{d(PE)}{dt}$, then $\frac{dE}{dt} = 0$, and therefore $E = \text{const}$. Therefore, total mechanical energy is always constant, as required by the law of the conservation of energy, and this constant value of energy is equal to:
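We can verify energy conservation numerically. The sketch below (an illustration with assumed values $\rho = 1$, $T = 4$, $l = 1$, and a leapfrog finite-difference scheme, so the discrete energy is only approximately constant) simulates a vibrating string with fixed ends and checks that the total energy stays essentially unchanged over many time steps:

```python
import math

# Leapfrog simulation of a vibrating string with fixed ends,
# rho u_tt = T u_xx, checking that the discrete energy stays constant.
N, L = 100, 1.0
rho, T = 1.0, 4.0
c = math.sqrt(T / rho)
dx = L / N
dt = 0.5 * dx / c            # well inside the stability limit dt <= dx/c

x = [i * dx for i in range(N + 1)]
u_prev = [math.sin(math.pi * xi / L) for xi in x]   # initial shape
# First step assuming zero initial velocity (Taylor expansion):
u = [u_prev[i] + 0.5 * (c * dt / dx) ** 2 *
     (u_prev[min(i + 1, N)] - 2 * u_prev[i] + u_prev[max(i - 1, 0)])
     for i in range(N + 1)]
u[0] = u[N] = 0.0

def energy(u_new, u_old):
    # E = sum over cells of (1/2) rho u_t^2 + (1/2) T u_x^2
    e = 0.0
    for i in range(N):
        ut = (u_new[i] - u_old[i]) / dt
        ux = (u_new[i + 1] - u_new[i]) / dx
        e += (0.5 * rho * ut * ut + 0.5 * T * ux * ux) * dx
    return e

e0 = energy(u, u_prev)
for _ in range(2000):
    u_next = [0.0] * (N + 1)
    for i in range(1, N):    # endpoints stay fixed at zero
        u_next[i] = (2 * u[i] - u_prev[i] +
                     (c * dt / dx) ** 2 * (u[i + 1] - 2 * u[i] + u[i - 1]))
    u_prev, u = u, u_next
e1 = energy(u, u_prev)
print(abs(e1 - e0) / e0 < 0.05)  # True: energy is (approximately) conserved
```

The small residual fluctuation comes from the discretization; in the continuum limit the energy is exactly conserved, as derived above.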
Solutions of the diffusion equation
As we recall, the diffusion equation is the prototypical parabolic equation. It is given by:

$$\frac{\partial u}{\partial t} = k \frac{\partial^2 u}{\partial x^2}$$
As the prototypical parabolic PDE, the diffusion equation has a special significance because any parabolic PDE can be transformed into the diffusion equation by a change of coordinates.
Note: A special case of the diffusion equation is the well-studied heat equation, which many texts use in place of the diffusion equation. The diffusion equation, however, is more general.
Let us consider the following initial-boundary-value problem (IBVP) for the diffusion equation:
The first (initial) condition for describes the initial spatial distribution of the function at , whereas the second and third (boundary) conditions describe the value of through time on the boundaries . The two boundary conditions must satisfy the consistency condition . The heat equation satisfies four important conditions (which we will not prove and simply state, as the proofs are very lengthy):
Maximum principle: For any solution of a well-posed initial-boundary-value problem, the maximum of $u$ must lie either on the initial line $t = 0$ or at one of the spatial boundaries
Minimum principle: For any solution of a well-posed initial-boundary-value problem, the minimum of $u$ must likewise lie either on the initial line $t = 0$ or at one of the spatial boundaries
Well-posedness theorem for the heat equation: For the aforementioned boundary-value problem, a solution always exists, is unique, and is stable: a small perturbation does not strongly affect the solution. That is, for two initial conditions that differ only by a small amount, one may always find a unique solution for each, and the two solutions remain close together for all times (i.e. solutions initially close together stay close together).
Linear property of solutions: For any solutions $u_1$ and $u_2$ and any constant $c$, both $c u_1$ and $u_1 + u_2$ are also solutions.
Differentiation and integration of solutions: For any solution that possesses a derivative, the derivative also satisfies the diffusion equation. Likewise, for any solution that possesses an integral, the integral also satisfies the diffusion equation. The same applies for higher derivatives and repeated integrals. This directly results from the linearity of the diffusion equation (i.e. the sum of two solutions is also a solution).
Intuitively, these results make the diffusion equation extremely easy (and powerful) to work with. In addition, they allow us to write the general solution of the diffusion equation either in terms of an infinite series:
Or in terms of an integral:
Note: Whether the general solution should be written as a sum or integral typically depends on the boundary conditions (though it can also depend on other factors).
Let us now consider solving the initial-boundary-value problem for the diffusion equation, as shown:
This initial-boundary-value problem is often called diffusion on the real line. We may solve this problem as follows. It should be noted that this is not the conventional way it is derived and is not a rigorous derivation at all. To start, let us assume a solution in the form (this is actually a valid assumption because the diffusion equation is separable). In this form, if we differentiate, we find that:
Where we can effectively treat as a constant when differentiating with respect to since doesn't depend on time, and likewise, we can treat as a constant when differentiating with respect to since doesn't depend on position. If we substitute into the PDE then divide both sides by , we have:
The last line arises from the property that a function of $t$ alone can only equal a function of $x$ alone if both are equal to the same constant. This means we now have two ordinary differential equations. Let us look at just the first equation. If we call the constant in the equation (the sign does not matter as the constant is arbitrary), then we have:
Which has the solution:
Thus our solution becomes:
But this is not a general solution (yet) because due to the property of linearity of the diffusion equation (which we showed previously), is also a solution. Therefore, the more general solution is a linear sum of solutions:
In the limiting case, this becomes the general solution (at least for the given initial and boundary conditions):
Where we must switch the bounds for the term from to distinguish the variables we are integrating over versus the variables that represent the coordinates of - this is due to the fundamental theorem of calculus . The above equation is true for any that allows to satisfy the given boundary conditions. An intuitive explanation is as follows: if we sum infinitely-many "clones" of the solution spaced-out at different positions and different times (as governed by ), then the solutions can add up to form an arbitrary function (in some ways, similar to Taylor series or Fourier series if that is familiar).
Now, recall the property (which we showed before) that for any solution , any shift of the solution is also a solution. If we let our previous solution be written , that is:
then we have:
We can write this in a simpler form if we use the change of variables , for which the solution simplifies to:
The final step is to require that the solution does indeed satisfy the boundary conditions - that is, for (the other boundary condition is automatically satisfied from our separation of variables procedure). A specific that does indeed satisfy the boundary conditions is the Gaussian (which in this case is a shifted Gaussian to accommodate our shifted solution):
Therefore substituting we have:
Since is not integrated over, we can pull out the factor in outside the integral:
Thus the particular solution to the diffusion equation for the given initial-boundary-value problem is then:

$$u(x, t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{4\pi k t}}\, e^{-\frac{(x - y)^2}{4kt}}\, \phi(y)\, dy$$
Where the integrand is known as a Green's function (other names include source function, propagator, kernel, or fundamental solution). The Green's function may be thought of as something that "pushes" (evolves) the initial condition to bring it to by time (for those familiar with the term, it can be thought of as an operator). This may be a bit easier to see if we write the solution as follows:
In this form, we can see that the Green's function serves as a multiplication factor that starts out infinitely tall at $t = 0$ (but luckily, since it also starts out as infinitely thin, the integral is finite). We call this initial state of the Green's function the Dirac delta or delta function, written as $\delta(x)$. The delta function has the special property that:

$$\int_{-\infty}^{\infty} \delta(x - y)\, \phi(y)\, dy = \phi(x)$$
Which means that if we evaluate the solution at $t = 0$, we do indeed get back the initial condition. Now, as $t$ increases, the peak value of the Green's function rapidly diminishes. Thus, we find that the maximum principle is also satisfied. Since our Green's function smoothly spreads out and decays as time goes on, multiplying by it and integrating results in the solution being "spread out" over time, exactly analogous to our intuitive understanding of diffusion. Our earlier way of writing the general solution was just a renaming: we changed the integration variable and pulled the constant factor out in front of the integral, but otherwise the mathematics are identical.
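The properties just described can be checked directly. The sketch below (illustrative, with an assumed diffusivity $k = 0.5$) evaluates the Green's function $S(x, t) = \frac{1}{\sqrt{4\pi k t}} e^{-x^2/4kt}$ numerically, confirming that it integrates to one (total "heat" is conserved), that its peak decays in time, and that it satisfies the diffusion equation:

```python
import math

k = 0.5

def S(x, t):
    # Heat kernel (Green's function) for the diffusion equation u_t = k u_xx
    return math.exp(-x * x / (4 * k * t)) / math.sqrt(4 * math.pi * k * t)

# The kernel integrates to 1 at every time: total "heat" is conserved.
dx = 0.01
total = sum(S(-20 + i * dx, 1.0) for i in range(4001)) * dx
print(abs(total - 1.0) < 1e-3)  # True

# Its peak height decays in time: maxima are smoothed out,
# consistent with the maximum principle.
print(S(0, 2.0) < S(0, 1.0))  # True

# It also satisfies the PDE itself, u_t = k u_xx (checked by finite differences):
h = 1e-3
x0, t0 = 0.7, 1.0
u_t = (S(x0, t0 + h) - S(x0, t0 - h)) / (2 * h)
u_xx = (S(x0 + h, t0) - 2 * S(x0, t0) + S(x0 - h, t0)) / h**2
print(abs(u_t - k * u_xx) < 1e-4)  # True
```

Since the kernel itself solves the diffusion equation, so does the convolution of the kernel with any initial condition, which is exactly the Green's-function solution above.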
We may use this solution to solve the initial-boundary-value problem given by:
This is simply a matter of substituting in the integral (as long as we also remember the constraint that this is true for only , i.e. within ). Since this is a piecewise initial condition, we need to use the integral rule for piecewise functions. For a piecewise function given by:
The integral of the function over is equal to:
Using this identity, and substituting into the Green's function solution, we have:
Thus our solution is given by:
There is, however, a slightly nicer-looking way to write out this solution, and that is using the error function. The error function, denoted $\operatorname{erf}(z)$, is given by:

$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-s^2}\, ds$$
To cast the solution into this "nicer" form, we start by dividing the integral into two parts, one between and the other between :
Now, let , for which we have and . Additionally, we must also change the integral's bounds on substitution, so we must adjust the bounds as follows:
Thus, after substitution our integral becomes:
Which is the solution to our boundary value problem for our chosen initial condition . Note that this is one of the few cases in which an analytical solution can be written out; in most cases, the integral is simply impossible to evaluate by hand and can only be solved numerically.
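As a check on the error-function form, the sketch below (illustrative; it assumes the step initial condition $\phi(y) = 1$ for $y > 0$ and $\phi(y) = 0$ for $y < 0$, a common choice that need not match the piecewise condition above) compares the closed form $u(x, t) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{4kt}}\right)\right]$ against direct numerical evaluation of the Green's-function integral:

```python
import math

k, t = 1.0, 0.3

def phi(y):
    # Assumed step initial condition: 1 for y > 0, 0 for y < 0
    return 1.0 if y > 0 else 0.0

def u_integral(x):
    # Direct numerical evaluation of the Green's-function integral
    dy = 0.002
    ys = [-15 + i * dy for i in range(15001)]
    s = sum(math.exp(-(x - y) ** 2 / (4 * k * t)) * phi(y) for y in ys)
    return s * dy / math.sqrt(4 * math.pi * k * t)

def u_erf(x):
    # Closed form of the same solution via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(4 * k * t)))

for x in (-1.0, 0.0, 0.8):
    print(abs(u_integral(x) - u_erf(x)) < 2e-3)  # True at each sample point
```

The initial jump at $x = 0$ is immediately smoothed into a gentle erf-shaped profile, illustrating how parabolic equations erase singularities.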
Note for the interested reader: the solution of the diffusion equation on the half-line, which is very similar to diffusion on the real line, except that it requires the additional boundary condition , is given by
Concluding remarks to the diffusion and wave equation
We have studied the diffusion and wave equation in detail and constructed particular solutions to several initial-boundary value problems (IBVPs). As well as solving specific IBVPs, we have found that the wave equation and diffusion equation satisfy particular characteristics:
PDE | Speed of propagation | Singularities (i.e. shocks/other "not smooth" disturbances) | Energy | Requirements for well-posedness |
---|---|---|---|---|
Diffusion equation (and all other parabolic PDEs) | Infinite | Smoothed out immediately | Not conserved | Only well-posed for $t \geq 0$; not well-posed for $t < 0$ |
Wave equation (and all other hyperbolic PDEs) | Finite (bounded by $c$) | Transported at finite speed (not smoothed out) | Conserved | Well-posed for all time |
Separation of variables for boundary-value problems
At the beginning of the guide, we gave a brief overview of the separation of variables procedure. Separation of variables is a powerful technique for solving PDEs, so to develop our understanding of it further, let us now re-examine the same procedure in more detail.
Separation of variables for the wave equation
To start, let us consider the wave equation in one dimension on the domain , with the following boundary conditions:
We assume a solution in the form , for which we find that upon taking the derivatives, we have:
Note: You can verify this by direct calculation; the reason why the partial derivatives are so simple is that only depends on , and only depends on . This means that when taking the partial derivative with respect to , can be treated as a constant; similarly, when taking the partial derivative with respect to , can be treated as a constant.
Therefore, substituting into the wave equation and dividing by yields:
Note that, in the last line, we have two combinations of derivatives equal to each other. This can only happen if both sides are equal to a constant, so it must be the case that:
This constant is what we call the separation constant; we now find that the wave equation, a partial differential equation, has reduced (relief!) to two ordinary differential equations that are much easier to solve. Let us name our separation constant (the square does not matter, since the square of a constant is still a constant, it just makes the mathematics easier). Since it is a constant, its value is arbitrary; can have a positive or negative sign, or be zero or complex. This is one of the cases where choosing the appropriate answer requires a fair bit of intuition about what the solution to the ODEs will be. In this particular case, the right choice is for to have a negative sign, as shown:
The reason for this is the boundary conditions. The solution to ODEs in the form (where is a constant) is either a sine or cosine wave (or a sum of both). Such solutions can be zero at finite values. However, the solutions to ODEs in the form are exponential functions (or sum of exponential functions). Such solutions can only be zero at infinity. Additionally, the solutions to ODEs in the form are linear functions, which can only be zero at one point. Since our boundary conditions are , that is, is zero at and (which are not at infinity and are zero at two locations), we must choose , not or zero. This means that our solutions are:
We may make the solutions look "prettier" by defining , so we can rewrite the solutions as follows:
Recall that we expressed our solution in the form . Thus, substituting our found solutions for and , we have:
To determine the values of our coefficients , we must apply the boundary conditions. Applying our first boundary condition , we have:
To satisfy this boundary condition, since , it must be the case that . Therefore our solution simplifies to:
Applying our second boundary condition , we have:
Again, since , this can only be satisfied if . We find that this condition can be satisfied if , i.e. if is equal to an integer multiple of , since we know that . Therefore, we have found that , and now our solution becomes:
We will now take the step of distributing the constant factor of into each of the two terms. Defining , we can rewrite the above solution as:
The remaining two initial-boundary conditions are both initial conditions: one for , and one for . One might wonder how these conditions can be satisfied when our solution is entirely written in terms of sines and cosines. The answer is that because the wave equation is linear, we can sum solutions together to form new solutions. In fact, we can sum as many solutions as we'd like! This results in a more general solution given by:
Where we wrote the second form of the solution (in the previous line) via our definition from before, . This is the solution to our boundary-value problem. From here, if we substitute , we have:
This is called a Fourier sine series, and any "well-behaved" function can be written using this series expansion. For instance, if we had , then the series expansion reads:
Similarly, if we take the derivative with respect to time, we can write a Fourier series expansion for . Using these series, we can effectively write out a solution to any choice of and . This is, indeed, one of the reasons why the separation of variables technique is so powerful.
Note: how do we determine what the correct coefficients should be? We will discover this later, but for now we will provide a formula without proof: the coefficient of the $n$-th sine term is $\frac{2}{l}\int_0^l u(x, 0) \sin\left(\frac{n\pi x}{l}\right) dx$, where $l$ is the length of the domain.
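As a quick numerical illustration of this coefficient formula (using an assumed domain length $l = 1$ and the sample profile $\phi(x) = x(1 - x)$, neither of which come from the text above), we can compute the sine-series coefficients by quadrature and confirm that the truncated series reproduces the function:

```python
import math

l = 1.0

def phi(x):
    # A sample "well-behaved" function satisfying phi(0) = phi(l) = 0
    return x * (l - x)

def b(n, m=2000):
    # b_n = (2/l) * integral from 0 to l of phi(x) sin(n pi x / l) dx,
    # evaluated with the midpoint rule on m subintervals
    dx = l / m
    return (2 / l) * dx * sum(
        phi((i + 0.5) * dx) * math.sin(n * math.pi * (i + 0.5) * dx / l)
        for i in range(m))

def partial_sum(x, terms=50):
    # Truncated Fourier sine series built from the computed coefficients
    return sum(b(n) * math.sin(n * math.pi * x / l) for n in range(1, terms + 1))

x0 = 0.3
print(abs(partial_sum(x0) - phi(x0)) < 1e-4)  # True: the series reproduces phi
```

For this particular $\phi$ the coefficients decay like $1/n^3$, so even a modest number of terms gives an excellent approximation.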
Separation of variables for the diffusion equation
Let us now consider the very similar boundary-value problem for the diffusion equation across the domain :
Here, we again assume a solution in the form , for which we have:
Therefore, substituting into the diffusion equation, and dividing by , we have:
We choose our separation constant to be with a negative sign (with the same reasoning as before, due to our pair of finite boundary conditions). We could've equivalently called it , but since and look too similar, we will use the letter instead. Thus we have reduced the diffusion equation into the set of two ordinary differential equations, given by:
The general solutions to the differential equations are:
Our solution for therefore becomes:
We can now individually substitute our boundary conditions. But first, there is a way that we can simplify the solution just by inspection (that is, by "being clever"). We solve on the domain , which means that we would want our solution to not blow up for large values of . This automatically means that must be zero; otherwise the term grows exponentially as increases! So we have deduced that the solution must be in the following form if it is to be "reasonable":
Which we can write in a "prettier" way by defining , with which the solution becomes:
Now, we substitute in our boundary conditions and . For the first boundary condition, our solution reduces to:
From which we deduce that , and with which we are left with:
For our other boundary condition , we substitute to find:
Thus we find that , just as we did for the wave equation. Our solution now becomes:
To satisfy the final boundary condition , we again write our solution as a Fourier series:
Thus, we can also write in terms of a series:
Thus, indeed, we have found the solution to the diffusion equation for the given boundary-value problem.
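Putting it together, the sketch below (with hypothetical coefficient values standing in for a real initial condition, and assuming $k = l = 1$) evaluates the separated-variables series solution $u(x,t) = \sum_n b_n\, e^{-k(n\pi/l)^2 t}\sin(n\pi x/l)$, confirming that it respects the boundary conditions and that higher modes decay faster:

```python
import math

k, l = 1.0, 1.0
# Hypothetical sine-series coefficients of the initial condition:
b = {1: 1.0, 3: 0.4}

def u(x, t):
    # Separated-variables series solution of u_t = k u_xx with u(0,t) = u(l,t) = 0
    return sum(bn * math.exp(-k * (n * math.pi / l) ** 2 * t) *
               math.sin(n * math.pi * x / l) for n, bn in b.items())

print(u(0.0, 0.5) == 0.0 and abs(u(l, 0.5)) < 1e-12)  # boundary conditions hold
print(u(0.3, 1.0) < u(0.3, 0.1))                      # the solution decays in time
# Higher modes decay much faster: compare the n = 3 and n = 1 decay factors
ratio = math.exp(-k * (3 * math.pi) ** 2 * 0.5) / math.exp(-k * math.pi ** 2 * 0.5)
print(ratio < 1e-15)  # True: by t = 0.5 the n = 3 mode is negligible
```

The $e^{-k(n\pi/l)^2 t}$ factors are why diffusion smooths solutions so quickly: fine spatial detail (large $n$) disappears almost immediately, leaving only the slowest mode.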