Differentiation

Physical Interpretation of the Derivative

The primary concept of calculus deals with the rate of change of one variable with respect to another.

Let’s imagine a person who travels 90 km in 3 hours. Their average speed (rate of change of distance with respect to time) is 30 km/h. Of course, they don’t need to travel at that fixed speed; they may slow down or speed up at different times during their travel. For many purposes, it suffices to know the average speed.

However, in many everyday situations, the average speed is not the quantity that matters. If a person traveling in an automobile strikes a tree, what matters is the speed at the instant of collision (this quantity might determine whether they survive).

|Concept |Description                         |
|Interval|Happens over a period of time       |
|Instant |Happens so fast that no time elapses|

Calculating the average speed is simple. By definition, it’s the rate of change of distance with respect to time.

$$
\text{average speed} = \frac{\text{distance traveled}}{\text{interval of time}}
$$

The same computation can’t be applied to get the instantaneous speed at some point in time. Since an instant has no duration, the distance traveled and the time elapsed during it are both zero. Hence, the average speed definition won’t help because $\frac{0}{0}$ is meaningless. Instantaneous speed is a physical reality, but unless we can calculate it, we can’t work with it mathematically.

We can’t compute it with the knowledge we have right now, but we can certainly approximate it. Let’s say that a ball is dropped near the surface of the Earth, and we want to know its instantaneous speed after 4 seconds. To do so, we need to know how far the ball has fallen at any given time. The formula that relates the distance fallen $s$ (in feet) to the time elapsed $t$ (in seconds) is:

$$
f(t) = s = 16t^2
$$

We can calculate the distance the ball traveled after 4 seconds by replacing $t$ with 4:

$$
\begin{align*} s_4 &= 16 \cdot 4^2 \\ &= 256 \text{ feet} \end{align*}
$$

Let’s also compute the distance the ball traveled after 5 seconds:

$$
\begin{align*} s_5 &= 16 \cdot 5^2 \\ &= 400 \text{ feet} \end{align*}
$$

The average speed for this interval of time is then:

$$
\text{average speed for the interval of time [4, 5]} = \frac{s_5 - s_4}{1} = \frac{400 - 256}{1} = 144 \;\text{feet/s}
$$

So the average speed during the fifth second is $144\;\text{feet/s}$. This quantity is no more than an approximation of the instantaneous speed, but we may improve the approximation by calculating the average speed in the interval of time from 4 to 4.1 seconds, which is:

$$
\text{average speed for the interval of time [4, 4.1]} = \frac{268.96 - 256}{0.1} = 129.6\;\text{feet/s}
$$

Let’s record more of these computations, with smaller and smaller intervals of time, in a table:

|time elapsed after 4 seconds|  1|  0.1|  0.01|  0.001|  0.0001|
|average speed (in feet/s)   |144|129.6|128.16|128.016|128.0016|

Of course, no matter how small the interval is, the result is not the instantaneous speed at the instant $t = 4$. However, we now see that the average speeds seem to be approaching the fixed number 128 feet/s.
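The table can be reproduced with a few lines of Python. This is only a minimal sketch: the function names `distance` and `average_speed` are illustrative choices and do not come from the original text.

```python
def distance(t):
    """Distance fallen in feet after t seconds: s = 16 t^2."""
    return 16 * t ** 2


def average_speed(t, h):
    """Average speed in feet/s over the interval [t, t + h]."""
    return (distance(t + h) - distance(t)) / h


# Shrink the interval that starts at t = 4 and watch the average speed settle.
for h in [1, 0.1, 0.01, 0.001, 0.0001]:
    print(f"h = {h:<7} average speed = {average_speed(4, h):.4f} feet/s")
```

Because the computation uses floating-point numbers, the printed values agree with the table only up to rounding.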

Let’s redo the process described above over an arbitrary interval of time. To do so, let’s introduce a quantity $h$, which represents an interval of time beginning at $t = 4$ and extending before or after $t = 4$. ($h$ is called an increment in $t$ because it’s some interval of time.)

The formula for the example above is:

$$
\begin{equation} \label{balldrop} s = 16t^2 \end{equation}
$$

Evaluated at the end of the fourth second, it gives:

$$
\begin{equation} \label{balldrop1} s_4 = 16 \cdot 4^2 = 256 \end{equation}
$$

Evaluated at the end of the interval $[4, 4 + h]$, that is, at $t = 4 + h$, it gives:

$$
\begin{align} s_4 + k &= 16 (4 + h) ^2 \notag \\ &= 256 + 128h + 16h^2 \label{balldrop2} \end{align}
$$

Here $k$ is the additional distance the object falls in the $h$ seconds after the initial $4$ seconds. To obtain $k$, we subtract $\eqref{balldrop1}$ from $\eqref{balldrop2}$. The result is:

$$
k = 128h + 16h^2
$$

The average speed in this interval of time is then $\frac{k}{h}$. Dividing both sides by $h$:

$$
\frac{k}{h} = 128 + 16h
$$

To obtain the instantaneous speed, we let the interval $h$ become smaller and smaller without ever being 0 (at $h = 0$ the quotient would be the meaningless $\frac{0}{0}$). As $h$ approaches 0, $16h$ also approaches 0, so the average speed approaches 128 feet/s. We conclude that the instantaneous speed at $t = 4$ is 128 feet/s.

Let’s generalize the process above for $\eqref{balldrop}$ to any value of $t$. To do so, let’s apply the method of increments, evaluating the formula at $t + h$:

$$
\begin{align*} s + k &= 16(t + h)^2 \\ &= 16t^2 + 32th + 16h^2 \end{align*}
$$

Subtracting $\eqref{balldrop}$ from the equation above:

$$
\begin{align*} k &= 32th + 16h^2 \end{align*}
$$

Dividing both sides by $h$:

$$
\begin{equation} \label{balldrop-derivative} \frac{k}{h} = 32t + 16h \end{equation}
$$

Just as before, to obtain the instantaneous speed, we let the interval $h$ become smaller and smaller. As $h$ approaches 0, the average speed approaches $32t$, a function that tells us the instantaneous speed of the falling object at any time $t$!

It has been customary since the days of Euler to use $\Delta{t}$ (delta t) for the increment of $t$. $\Delta{t}$ means a “change in the value of $t$”. Thus, $\Delta{t}$ has the same meaning as $h$; likewise, $\Delta{s}$ has the same meaning as $k$. We can rewrite $\eqref{balldrop-derivative}$ as:

$$
\begin{equation} \label{balldrop3} \frac{\Delta{s}}{\Delta{t}} = 32t + 16\Delta{t} \end{equation}
$$

It’s desirable to have a short notation for the statement that we have evaluated the limit of $\frac{\Delta{s}}{\Delta{t}}$ as the values of $\Delta{t}$ approach 0, which can be expressed as:

$$
\lim_{\Delta{t} \to 0} \frac{\Delta{s}}{\Delta{t}}
$$

Here $\lim$ is an abbreviation for limit. Applying this new notation to $\eqref{balldrop3}$:

$$
\begin{equation} \label{balldrop-limit} \lim_{\Delta{t} \to 0} \frac{\Delta{s}}{\Delta{t}} = 32t \end{equation}
$$

To some mathematicians, this notation is somewhat lengthy; hence, they replaced it with shorter variations:

$$
\lim_{\Delta{t} \to 0} \frac{\Delta{s}}{\Delta{t}} = \frac{ds}{dt} = s' = f'(t)
$$

The rate of change is not always related to time or distance, so a generalization of the formulas above is needed. Instead of the symbols $t$ and $s$, let’s use $x$ and $y$ without specifying what $x$ and $y$ mean physically.

Let’s calculate the instantaneous rate of change of $y$ with respect to $x$ (the word instantaneous does not really apply because $x$ doesn’t necessarily represent time), using the method of increments on a function which depends on $x$:

$$
\begin{align} y &= f(x) \label{x-a} \\ y + \Delta{y} &= f(x + \Delta{x}) \label{x-b} \end{align}
$$

Subtracting $\eqref{x-a}$ from $\eqref{x-b}$:

$$
\Delta{y} = f(x + \Delta{x}) - f(x)
$$

Dividing both sides by $\Delta{x}$:

$$
\frac{\Delta{y}}{\Delta{x}} = \frac{f(x + \Delta{x}) - f(x)}{\Delta{x}}
$$

The instantaneous rate of change of $y$ with respect to $x$ is the limit of this quotient as $\Delta{x}$ approaches 0:

$$
\begin{equation} \label{limit} \lim_{\Delta{x} \to 0} \frac{f(x + \Delta{x}) - f(x)}{\Delta{x}} \end{equation}
$$

We can also use the variations for the notation of the rate of change:

$$
\lim_{\Delta{x} \to 0} \frac{\Delta{y}}{\Delta{x}} = \frac{dy}{dx} = y' = f'(x)
$$

What we did with the process above was to find the instantaneous rate of change of $y$ with respect to $x$. We call this rate the derivative of $y$ with respect to $x$. The process of applying the method of increments to obtain the derivative is called differentiation.
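The quotient inside the limit can be evaluated numerically for small increments. The following is a minimal sketch, not a definitive implementation: `difference_quotient` is a hypothetical helper name, and the increments tried are arbitrary choices.

```python
def difference_quotient(f, x, dx):
    """Average rate of change of f over [x, x + dx]: (f(x + dx) - f(x)) / dx."""
    return (f(x + dx) - f(x)) / dx


def distance(t):
    """Falling-ball formula from the text: s = 16 t^2."""
    return 16 * t ** 2


# For s = 16 t^2 the quotient should approach 32 t as dx shrinks.
for t in [1.0, 2.0, 4.0]:
    estimates = [difference_quotient(distance, t, dx) for dx in (1e-1, 1e-3, 1e-5)]
    print(f"t = {t}: estimates = {estimates}, expected limit = {32 * t}")
```

In floating-point arithmetic the increment cannot be made arbitrarily small: below a certain size, rounding error in the subtraction starts to dominate the estimate.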

Geometric Interpretation of the Derivative

Let’s graph the following formula:

$$
\begin{equation} \label{yx2} y = x^2 \end{equation}
$$

<div id="geometric-representation"></div>

A point on this curve has the form $(x_1, f(x_1))$; e.g., $y = 1$ when $x = 1$, and $y = 4$ when $x = 2$.

<div id="geometric-representation-two-points"></div>

Let’s say that $(x_1, f(x_1))$ is a fixed point on the curve (for the sake of this example, the point will be $x_1 = 1, y_1 = 1$). Any other point on the curve determines a line through the fixed point, called a secant line.

<div id="geometric-representation-secant"></div>

The slope is a quantity that describes the direction and steepness of a line and is calculated by finding the ratio of the vertical change to the horizontal change between any two distinct points on the line. The previous statement expressed as a formula is:

$$
m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\Delta{y}}{\Delta{x}}
$$

What if the movable point gets closer and closer to the fixed point, so that $\Delta{x}$ approaches 0? That’s exactly the definition of the derivative, which means that the derivative of a function tells us the slope of the [tangent line][tangent-line] to the function (represented geometrically as a curve) at any point where the function is differentiable!

Let’s find the instantaneous rate of change of this function evaluated at $x = 1$, using $\eqref{limit}$:

$$
\begin{align*} m_1 = f'(1) &= \lim_{\Delta{x} \to 0} \frac{f(1 + \Delta{x}) - f(1)}{\Delta{x}} \\ &= \lim_{\Delta{x} \to 0} \frac{(1 + \Delta{x}) ^ 2 - 1^2}{\Delta{x}} \\ &= \lim_{\Delta{x} \to 0} \frac{1^2 + 2\Delta{x} + \Delta{x}^2 - 1^2}{\Delta{x}} \\ &= \lim_{\Delta{x} \to 0} (2 + \Delta{x}) \\ &= 2 \end{align*}
$$

This fixed number is the slope of the line tangent to the curve at $x = 1$. Let’s find the equation of this tangent line, starting from the point–slope form of a line whose slope is $m$:

$$
\begin{equation}\label{line-equation} y - y_1 = m(x - x_1) \end{equation}
$$

Substituting $y_1 = 1$, $m = 2$, and $x_1 = 1$ computed above:

$$
\begin{align*} y &= 2(x - 1) + 1 \\ &= 2x - 2 + 1 \\ &= 2x - 1 \end{align*}
$$

If we graph this line next to the geometric representation of $y = x^2$, we see that it touches the curve at the point $(1, 1)$.
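The claim that the line $y = 2x - 1$ touches the curve at $(1, 1)$ can be checked numerically. This is a small sketch under stated assumptions: `tangent_line` is a hypothetical helper, and the slope 2 is taken from the computation above rather than derived in code.

```python
def f(x):
    """The curve from the text: y = x^2."""
    return x ** 2


def tangent_line(x1, m):
    """Point-slope form through (x1, f(x1)) with slope m, returned as a function of x."""
    y1 = f(x1)
    return lambda x: m * (x - x1) + y1


line = tangent_line(1, 2)  # slope 2 was computed in the text for x = 1

# Near x = 1 the vertical gap between curve and line shrinks like dx^2.
for dx in [0.5, 0.1, 0.01]:
    x = 1 + dx
    gap = f(x) - line(x)
    print(f"x = {x:.2f}  curve = {f(x):.4f}  tangent = {line(x):.4f}  gap = {gap:.6f}")
```

The gap equals $\Delta{x}^2$ here, since $(1 + \Delta{x})^2 - (2(1 + \Delta{x}) - 1) = \Delta{x}^2$, which is why the line hugs the curve so closely near the point of tangency.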

<div id="slope-static-x-1"></div>

Before finding the equation of the slope for any value of $x$, let’s imagine the graph produced by the slope function. If we take a look at the graph of $\eqref{yx2}$, we can see that at any point on the curve whose $x$ coordinate is negative, the slope is negative, and at any point whose $x$ coordinate is positive, the slope is positive. Expressed mathematically:

$$
\operatorname{sign}(m) = \begin{cases} -1 & \text{if } x < 0, \\ 0 & \text{if } x = 0, \\ 1 & \text{if } x > 0. \end{cases}
$$

Now that we have an idea of the values of the slope, let’s find the value of $m$ for any value of $x$, that is, the derivative of $y$ with respect to $x$, using $\eqref{limit}$:

$$
\begin{align*} f'(x) &= \lim_{\Delta{x} \to 0} \frac{f(x + \Delta{x}) - f(x)}{\Delta{x}} \\ &= \lim_{\Delta{x} \to 0} \frac{(x + \Delta{x}) ^ 2 - x^2}{\Delta{x}} \\ &= \lim_{\Delta{x} \to 0} \frac{x^2 + 2x\Delta{x} + \Delta{x}^2 - x^2}{\Delta{x}} \\ &= \lim_{\Delta{x} \to 0} (2x + \Delta{x}) \\ &= 2x \end{align*}
$$

<div id="slope-graph"></div>

By looking at the line, we confirm our expectation. Any point on the line whose $x$ coordinate is negative has a negative $y$ coordinate (the value of the slope), and any point on the line whose $x$ coordinate is positive has a positive $y$ coordinate.

There are infinitely many tangent lines to the curve that represents $\eqref{yx2}$. In the following graph, the equation of the line is computed dynamically based on the position of the mouse pointer (by doing substitutions on $\eqref{line-equation}$):

<div id="slope-dynamic"></div>

Going back to the falling object formula ($s$ is the distance the object moved after $t$ seconds have elapsed):

$$
s = 16t^2
$$

The instantaneous rate of change of the distance with respect to time is:

$$
\begin{equation} \label{balldrop-first-derivative} s' = 32t \end{equation}
$$

$s'$ represents speed, and it is customary to use $v$ (the first letter of velocity) instead of $s'$:

$$
\begin{equation} \label{balldrop-velocity} v = 32t \end{equation}
$$

Now $v$ is a function of $t$, and we can ask for the rate of change of $v$ with respect to $t$. This is called instantaneous acceleration. Acceleration is the change of speed that takes place during an interval of time. Without acceleration, a moving object would keep moving at a constant speed forever. If the speed is given as a function of time, then we can calculate the instantaneous rate of change of the velocity with respect to time:

$$
\begin{equation} \label{balldrop-second-derivative} v' = 32 \end{equation}
$$

The instantaneous acceleration obtained above is the derived function of the instantaneous speed, which is itself the derived function of the distance function. We can relate the instantaneous acceleration and the distance function with the following notation:

$$
s'' \quad \text{or} \quad \frac{d^2s}{dt^2}
$$

The function above is called the second derived function (or second derivative) of $\eqref{balldrop}$. This notation applied to the generalized version using the variables $x$ and $y$ is:

$$
\frac{d^2y}{dx^2} \quad \text{or} \quad y'' \quad \text{or} \quad f''(x)
$$
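One way to see the second derivative numerically is to apply the method of increments twice: once to the distance to estimate the speed, and once to that estimate to obtain the acceleration. The sketch below assumes a fixed increment of `1e-3`; the helper names are illustrative and not part of the original text.

```python
def difference_quotient(f, x, dx=1e-3):
    """Average rate of change of f over [x, x + dx]."""
    return (f(x + dx) - f(x)) / dx


def distance(t):
    """s = 16 t^2, distance fallen in feet."""
    return 16 * t ** 2


def speed(t):
    """Estimate of s'(t), obtained from the distance function."""
    return difference_quotient(distance, t)


def acceleration(t):
    """Estimate of s''(t), obtained by differencing the speed estimate."""
    return difference_quotient(speed, t)


for t in [0.0, 1.0, 4.0]:
    print(f"t = {t}: speed = {speed(t):.3f} feet/s, acceleration = {acceleration(t):.3f} feet/s^2")
```

The speed estimate carries an error of about $16\Delta{t}$, while the acceleration estimate stays close to 32 for every $t$, matching $v' = 32$.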

Physical problems lead to more complicated algebraic functions, for example, $y = \sqrt{x^2 + 1}$, which arises when one wants to work with the upper branch of the hyperbola $y^2 = x^2 + 1$. We can express this function as a combination of two functions:

$$
y = \sqrt{u}, \quad u = x^2 + 1
$$

If $y$ is a function of $u$ and $u$ is a function of $x$, then:

$$
\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}
$$

This result is known as the chain rule. Expressed in function notation:

$$
y = f(u) \quad \text{and} \quad u = g(x)
$$

Then:

$$
\begin{equation}\label{chain-rule} \frac{dy}{dx} = f'(u) \cdot g'(x) \end{equation}
$$

Returning to the original problem, let’s find the derivative of $y = \sqrt{x^2 + 1}$ with respect to $x$ using the chain rule:

Let $f(u) = u^{1/2}$ and $g(x) = x^2 + 1$.

$$
\frac{dy}{dx} = f'(u) \cdot g'(x) = \frac{u^{-1/2}}{2} \cdot 2x = \frac{x}{\sqrt{x^2 + 1}}
$$
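The chain-rule result can be compared against a direct numerical estimate. This is a hedged sketch: the names `f`, `g`, `f_prime`, and `g_prime` mirror the symbols above, while `difference_quotient` and the sample points are assumptions made for illustration.

```python
import math


def g(x):
    """Inner function: u = x^2 + 1."""
    return x ** 2 + 1


def g_prime(x):
    return 2 * x


def f(u):
    """Outer function: y = u^(1/2)."""
    return math.sqrt(u)


def f_prime(u):
    return 0.5 * u ** -0.5


def dy_dx(x):
    """Chain rule: dy/dx = f'(u) * g'(x), evaluated at u = g(x)."""
    return f_prime(g(x)) * g_prime(x)


def difference_quotient(func, x, dx=1e-6):
    return (func(x + dx) - func(x)) / dx


for x in [-2.0, 0.0, 0.5, 3.0]:
    numeric = difference_quotient(lambda v: f(g(v)), x)
    print(f"x = {x}: chain rule = {dy_dx(x):.6f}, numeric estimate = {numeric:.6f}")
```

The two columns should agree to several decimal places, and both match the closed form $\frac{x}{\sqrt{x^2 + 1}}$.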

Differentiation of Implicit Functions

Going back to the definition of a function, it’s a relation between two variables such that given a value of one in some domain, there’s a unique value determined for the second variable. However, relations between variables often occur in forms where giving the independent variable some value does not determine a unique value of the other. For example, the equation of a circle of radius 5 centered at the origin is:

$$
\begin{equation}\label{implicit} x^2 + y^2 = 25 \end{equation}
$$

Here, $y$ is not expressed in terms of $x$. Solving for $y$, we have two equations:

$$
\begin{equation}\label{explicit} y = \sqrt{25 - x^2} \quad \text{or} \quad y = -\sqrt{25 - x^2} \end{equation}
$$

$\eqref{implicit}$ represents the circle implicitly, and $\eqref{explicit}$ represents it explicitly.

We know that $y$ in $\eqref{implicit}$ represents some function of $x$. If we recognize that, as a consequence, the left side of $\eqref{implicit}$ is a set of terms that depend only on $x$, then we can differentiate it term by term. The problem is to find the derivative of $y^2$ with respect to $x$, which should remind us of the chain rule ($y$ plays the role of $u$ in the chain rule):

$$
\frac{d(y^2)}{dx} = 2y \frac{dy}{dx}
$$

Differentiating both sides of $\eqref{implicit}$ with respect to $x$:

$$
2x + 2y \frac{dy}{dx} = 0
$$

Solving for $\frac{dy}{dx}$:

$$
\frac{dy}{dx} = -\frac{x}{y}
$$
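The implicit result $-\frac{x}{y}$ should agree with the slopes of the two explicit branches of the circle. The sketch below checks this at $x = 3$; the function names and the sample point are illustrative assumptions.

```python
import math


def upper(x):
    """Upper semicircle: y = sqrt(25 - x^2)."""
    return math.sqrt(25 - x ** 2)


def lower(x):
    """Lower semicircle: y = -sqrt(25 - x^2)."""
    return -math.sqrt(25 - x ** 2)


def implicit_slope(x, y):
    """dy/dx = -x / y, from differentiating x^2 + y^2 = 25 implicitly."""
    return -x / y


def difference_quotient(f, x, dx=1e-6):
    return (f(x + dx) - f(x)) / dx


x = 3.0
for branch in (upper, lower):
    y = branch(x)
    print(f"point ({x}, {y}): implicit slope = {implicit_slope(x, y):.6f}, "
          f"numeric estimate = {difference_quotient(branch, x):.6f}")
```

A single implicit formula covers both branches: the sign of $y$ selects the right slope. At $y = 0$ (the points $x = \pm 5$) the formula is undefined, which corresponds to the vertical tangents of the circle.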

Read “Calculus: An Intuitive and Physical Approach”.

  • Determination of the velocity and acceleration of a particle given its distance as a function of time.
  • Concentrating light, sound, and radio waves in a particular direction (see the reflective property of the parabola).
  • Finding the maximum/minimum value of a function, i.e., finding the largest/smallest value of $f(x)$ when $a \leq x \leq b$. This problem is worked through below.
  • Approximating the roots of a polynomial with Newton’s method, described below.

Let’s say that we throw an object into the air and we want to know the maximum height it acquires. As it rises, its velocity decreases, and when it reaches the highest point, its velocity is zero. We also know that the velocity is the instantaneous rate of change of height with respect to time; hence, the derivative is involved in this process, and therefore we expect it to be involved in other maxima/minima problems.

More generally, if $y$ is a function of $x$, it seems that to find the maximum value of $y$, we must find $y'$ and set it to 0.

Let’s see an example. The following function has a maximum value of approximately $3.333$ at $x = 1$ and a minimum value of $2$ at $x = 3$.
If we analyze the slope of the function near those points, we will see that on the left of $x = 1$, the slope is positive, and on the
right of $x = 1$, the slope is negative. Since we know that the derivative represents the slope of a function, we can also expect that
the derivative of this function near $x = 1$ will go from a positive value to a negative value, crossing the $x$-axis.
If we analyze the slope near $x = 3$, we will see the same behavior, but with the slope going from a negative value to a positive one.

<div class="tw-flex tw-flex-col md:tw-flex-row">
<div class="md:tw-w-1/2 tw-relative">
$$
y = x^3/3 - 2x^2 + 3x + 2
$$
<div id="maxima-minima-f"></div>
</div>
<div class="md:tw-w-1/2 tw-relative">
$$
y' = x^2 - 4x + 3
$$
<div id="maxima-minima-f-derivative"></div>
</div>
</div>

Now the problem reduces to finding the points where $y' = 0$. Finding them tells us exactly where $y$ reaches its maximum/minimum value. Solving $y' = 0$ for $x$:

$$
\begin{align*} 0 &= x^2 - 4x + 3 \\ 0 &= (x - 1)(x - 3) \end{align*}
$$

And we see that:

$$
y' = 0 \quad \text{when} \quad x = 1 \quad \text{and} \quad x = 3
$$

The process didn’t actually find the absolute maximum/minimum values of the function since for $x > 3$, the function increases indefinitely, and for $x < 1$, it decreases indefinitely. These values are called relative maxima/minima because near $x = 1$ or $x = 3$, these points are the maximum/minimum that can be found.
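The procedure can be sketched in code: solve $y' = 0$ with the quadratic formula, then classify each solution by the sign of the slope on either side. The `classify` helper and its offset `eps` are assumptions introduced for this illustration.

```python
import math


def y(x):
    return x ** 3 / 3 - 2 * x ** 2 + 3 * x + 2


def y_prime(x):
    return x ** 2 - 4 * x + 3


# Solve y' = x^2 - 4x + 3 = 0 with the quadratic formula.
a, b, c = 1, -4, 3
root_of_discriminant = math.sqrt(b * b - 4 * a * c)
critical_points = sorted([(-b - root_of_discriminant) / (2 * a),
                          (-b + root_of_discriminant) / (2 * a)])


def classify(x, eps=1e-3):
    """Classify a critical point by the sign of the slope just left and right of it."""
    left, right = y_prime(x - eps), y_prime(x + eps)
    if left > 0 > right:
        return "relative maximum"
    if left < 0 < right:
        return "relative minimum"
    return "neither"


for x in critical_points:
    print(f"x = {x}: y = {y(x):.3f}, {classify(x)}")
```

The sign check mirrors the argument in the text: the slope goes from positive to negative at a relative maximum and from negative to positive at a relative minimum.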

  • Refraction of light: we can build a function that gives the time light takes to travel between two points in different media, based on its velocity and the distance traveled in each medium. Finding the derivative and setting it equal to $0$ gives the relative minimum time needed to go from one point in medium $a$ to a point in medium $b$.
  • Finding the sides of the rectangle with the maximum area for a given perimeter.

The slope of the tangent line of a function $f(x)$ at any point where it is differentiable is given by $m = f'(x)$. Let $x_1$ be such a point; then the slope of the tangent line at $x_1$ is $m_1 = f'(x_1)$. The point–slope form of the tangent line whose slope is $f'(x_1)$ is:

$$
\begin{align*} y - y_1 &= m_1(x - x_1) \\ y - f(x_1) &= f'(x_1) \cdot (x - x_1) \end{align*}
$$

Newton found that if we take the tangent line at some initial guess $x_1$ and find its intercept with the $x$-axis, the value found is usually closer to one of the roots of $f(x)$, i.e., a value of $x$ where $f(x) = 0$ (provided, of course, that the function has roots).

Setting $y = 0$ in the equation of the tangent line:

$$
0 - f(x_1) = f'(x_1) \cdot (x - x_1)
$$

Solving for $x$:

$$
\begin{equation}\label{newton-raphson} x = x_1 - \frac{f(x_1)}{f'(x_1)} \end{equation}
$$

$x$ in the last equation is the abscissa of the next approximation of one of the roots of $f(x)$. If we repeat this step a few times, each time using the previous result as the new guess, and start from an acceptable initial guess, then we obtain a better and better approximation of one of the roots of $f(x)$.
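The iteration can be written as a short loop. The following is a minimal sketch, not a definitive implementation: the function name `newton_raphson`, the step limit, and the stopping tolerance are all assumptions, and the example reuses the polynomial $x^2 - 4x + 3$ from the maxima/minima discussion.

```python
def newton_raphson(f, f_prime, x1, steps=20, tol=1e-12):
    """Repeat x = x1 - f(x1) / f'(x1) until successive guesses are within tol."""
    x = x1
    for _ in range(steps):
        slope = f_prime(x)
        if slope == 0:
            # A horizontal tangent never crosses the x-axis, so the step is undefined.
            raise ZeroDivisionError("tangent line is horizontal at the current guess")
        x_next = x - f(x) / slope
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


def f(x):
    return x ** 2 - 4 * x + 3


def f_prime(x):
    return 2 * x - 4


print(newton_raphson(f, f_prime, 0.0))  # starts left of both roots
print(newton_raphson(f, f_prime, 5.0))  # starts right of both roots
```

Which root is found depends on the initial guess: starting at 0 leads toward $x = 1$, and starting at 5 leads toward $x = 3$. Starting exactly at $x = 2$, where $f'(x) = 0$, gives no next guess at all, which is one reason the initial guess has to be acceptable.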

<div id="newton-raphson"></div>
<div class="tw-text-center tw-mb-4">
<button id="run-newton-raphson" class="tw-inline-block tw-p-2 tw-rounded-md tw-border-2 tw-border-primary">Approximate with Newton-Raphson</button>
</div>

Finding the Square Root of a Number

Let’s say that we want to find the square root of a number $n$. This is equivalent to finding the positive solution to:

$$
x^2 = n
$$

The function to use is then:

$$
f(x) = x^2 - n
$$

whose derivative is:

$$
f'(x) = 2x
$$

Substituting in $\eqref{newton-raphson}$:

$$
\begin{align*} x &= x_1 - \frac{x_1^2 - n}{2x_1} \\ &= x_1 - \frac{x_1}{2} + \frac{n}{2x_1} \\ &= \frac{x_1}{2} + \frac{n}{2x_1} \\ &= \frac{1}{2} \cdot \big ( x_1 + \frac{n}{x_1} \big ) \end{align*}
$$
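The final formula translates directly into a loop that averages the current guess with $n$ divided by that guess. This is a sketch under stated assumptions: the name `newton_sqrt`, the default initial guess of 1, and the fixed number of steps are illustrative choices.

```python
def newton_sqrt(n, x1=1.0, steps=20):
    """Approximate the square root of n by iterating x = (x + n / x) / 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 0.0
    x = x1
    for _ in range(steps):
        x = 0.5 * (x + n / x)
    return x


for n in [2, 9, 25]:
    print(f"n = {n}: approximate square root = {newton_sqrt(n)}")
```

A fixed step count keeps the sketch short. For very large $n$, a better initial guess or a convergence check would be needed, because starting from 1 the first iterations only roughly halve the guess.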