Discrete Sum Approximation

This is going to be one of my more unusual posts: instead of covering a topic from Indonesia's national curriculum, I'll be talking about one specific math problem.

The integral of a function is the area under the function's curve. You can approximate an integral by adding up the areas of many thin rectangles. But what happens if we go the other way? What if we used an integral to approximate the total area of those rectangles?

Chapter I: Discrete summation and the sigma notation

Discrete summation is the counterpart of continuous summation. An integral is a continuous summation: it adds up infinitely many, infinitely thin slices of length or area. Discrete summation, on the other hand, adds up a finite list of separate terms, ones that we could count one by one.

The sigma notation is a way to represent summation. It has four main elements: the starting value, the ending value, the variable (also called the index), and the expression. For example:

Σ (n = 1 to 10) 2n

In this case, 1 is the starting value, 10 is the ending value, n is the variable, and 2n is the expression. This whole expression represents the value of 2 + 4 + 6 + ... + 20.
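Sigma notation maps directly onto a loop, which makes it easy to check by computer. Here's a minimal Python sketch of this example. The helper name `sigma` is made up for illustration, not a standard function.

```python
def sigma(first, last, expression):
    """Add up expression(n) for every whole number n from first to last, inclusive."""
    return sum(expression(n) for n in range(first, last + 1))

# 2 + 4 + 6 + ... + 20
total = sigma(1, 10, lambda n: 2 * n)
print(total)  # 110
```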

Let's see another example:

Σ (i = 1 to 5) Uᵢ

Here, 1 is the starting value, 5 is the ending value, i is the variable, and Uᵢ is the expression. Therefore, this expression represents the sum U₁ + U₂ + U₃ + U₄ + U₅, whatever these terms may represent.

Chapter II: Definite Integrals

A definite integral is the area under a curve between two bounds. (Although if a "bound" is infinity, the region technically isn't bounded.) Definite integrals are written similarly to the sigma notation: a lower bound, an upper bound, an integration variable, and an expression, a.k.a. the function. (Note: the lower bound doesn't have to be smaller than the upper bound. Swapping the two bounds just negates the result.)

∫ (x = -1 to 3) x² dx

Here, -1 is the lower bound, 3 is the upper bound, x² is the function, and dx tells us that x is the integration variable. There's also a visualization of what an "area under the curve" looks like, just in case you haven't understood what it is yet.
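For reference, the integral described here (lower bound -1, upper bound 3, function x²) can be worked out by hand with the power rule, since x³/3 is an antiderivative of x²:

```latex
\int_{-1}^{3} x^2 \, dx
  = \left[ \frac{x^3}{3} \right]_{-1}^{3}
  = \frac{3^3}{3} - \frac{(-1)^3}{3}
  = 9 + \frac{1}{3}
  = \frac{28}{3} \approx 9.33
```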

Chapter III: Riemann Sum

The Riemann sum is a way to approximate the value of a definite integral by adding up the areas of thin rectangles. Each rectangle's height is the value of the function at a chosen point inside that rectangle's width.

A notable feature of this method is that the value of the Riemann sum gets closer to the integral's true value as the rectangles become thinner.
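Here's a small Python sketch of that behavior, using the integral of x² from -1 to 3 (exact value 28/3). It uses the left edge of each rectangle as the sample point, which is just one of several possible choices, and the function names are made up for illustration.

```python
def riemann_sum(f, lower, upper, rectangles):
    """Left-endpoint Riemann sum of f from lower to upper."""
    width = (upper - lower) / rectangles
    return sum(f(lower + i * width) * width for i in range(rectangles))

def square(x):
    return x * x

exact = 28 / 3  # value of the integral of x^2 from -1 to 3

# More rectangles means thinner rectangles; watch the error column shrink.
for rectangles in (4, 40, 400, 4000):
    estimate = riemann_sum(square, -1, 3, rectangles)
    print(rectangles, estimate, abs(estimate - exact))
```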

Chapter IV: The Problem

"Instead of using a sum to approximate an integral, is it possible to derive a formula that uses an integral to approximate a sum?"

This problem can be represented with a graph: the black graph represents a function f(x), and the blue graph represents f(⌈x⌉), where ⌈x⌉ (the ceiling of x) is the smallest whole number greater than or equal to x.

This transforms the problem into deriving a formula that approximates the area under the blue graph (from x₀-1 to x₁) with the integral of the black curve (from x₀-1 to x₁) plus the area of the shaded blue region.

Chapter V: Formula Derivation

We will now derive a formula to calculate the area beneath the blue graph from (x₀-1) to x₁.

The Area Under the Blue Graph

In the picture above, you can see ten individual blue rectangles of equal width (notice the small rectangle to the left of x₀, but ignore the blue rectangle past x₁). The area under the blue graph from x₀-1 to x₁ is the total area of these ten rectangles. The area of one rectangle is its width × height, hence:

area under blue graph = w₁h₁ + w₂h₂ + … + w₁₀h₁₀

Since they all have the same width, w = 1 unit, we can factor out w and then replace it with 1.

area under blue graph = w(h₁ + h₂ + … + h₁₀)

area under blue graph = h₁ + h₂ + … + h₁₀

Notice that the height of the first rectangle is f(x₀), the height of the second is f(x₀+1), and so on, up to f(x₁) for the last one. We can then substitute each h with the matching value of f:

area under blue graph = f(x₀) + f(x₀+1) + … + f(x₁)

This expression can be rewritten with the sigma notation:

area under blue graph = Σ (n = x₀ to x₁) f(n)
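To check that the area under the staircase really is just a discrete sum, here's a small Python sketch. It measures the area under f(⌈x⌉) with many thin rectangles and compares it with f(x₀) + f(x₀+1) + … + f(x₁). The function f(x) = x² and the bounds x₀ = 1, x₁ = 10 are arbitrary picks for illustration, not taken from the graph.

```python
import math

def f(x):
    return x * x  # hypothetical example function

def staircase(x):
    """The blue graph: f evaluated at the ceiling of x."""
    return f(math.ceil(x))

x0, x1 = 1, 10

# Area under the staircase from x0 - 1 to x1, measured with thin midpoint
# rectangles (1000 per unit, so no sample point lands exactly on a step).
slices_per_unit = 1000
width = 1 / slices_per_unit
n_slices = (x1 - (x0 - 1)) * slices_per_unit
staircase_area = sum(
    staircase((x0 - 1) + (i + 0.5) * width) * width for i in range(n_slices)
)

# The plain discrete sum f(x0) + f(x0 + 1) + ... + f(x1).
discrete_sum = sum(f(n) for n in range(x0, x1 + 1))

print(staircase_area, discrete_sum)
```

The two printed values should agree up to floating-point rounding.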

The Area Under the Black Curve

Our next task is to find the area under the black curve from x₀-1 to x₁. Fortunately, the area under a curve can be calculated with an integral. So:

area under black curve = ∫ (x = x₀-1 to x₁) f(x) dx

The Area of the Shaded Blue Region

If you look closely, the shaded blue regions look like upside-down right triangles. This is not exactly true, since one side of each region follows the curve instead of a straight line, but it is a really good approximation, especially as x gets large. (Note: Δh is the height of one triangle)

area of blue region ≈ ½⋅b₁Δh₁ + ½⋅b₂Δh₂ + … + ½⋅b₁₀Δh₁₀

area of blue region ≈ ½(b₁Δh₁ + b₂Δh₂ + … + b₁₀Δh₁₀)

Since all of these triangles have the same base width, 1 unit, every b equals 1 and drops out.

area of blue region ≈ ½(Δh₁ + Δh₂ + … + Δh₁₀)

The height of each individual triangle is the height difference between the rectangle below it and the rectangle to its left. Hence, the first triangle's height (Δh₁) is h₁-h₀, the second triangle's height (Δh₂) is h₂-h₁, and so on. (Note: hᵢ means the height of the i-th rectangle)

area of blue region ≈ ½(h₁-h₀ + h₂-h₁ + … + h₁₀-h₉)

Matching terms cancel each other out, leaving us with:

area of blue region ≈ ½(h₁₀ - h₀)

We can substitute h₁₀ with f(x₁) and h₀ with f(x₀-1):

area of blue region ≈ ½(f(x₁) - f(x₀-1))
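The cancellation step is easy to verify numerically. Here's a short Python sketch that adds up all ten triangle heights the long way and compares the result with the shortcut ½(h₁₀ - h₀). The function f(x) = x³ + 2x and the bounds are arbitrary picks for illustration.

```python
def f(x):
    return x ** 3 + 2 * x  # hypothetical example function

x0, x1 = 1, 10

# h_0 = f(x0 - 1), h_1 = f(x0), ..., h_10 = f(x1)
heights = [f(n) for n in range(x0 - 1, x1 + 1)]

# Triangle heights: each rectangle minus the rectangle to its left.
delta_h = [heights[i] - heights[i - 1] for i in range(1, len(heights))]

long_way = 0.5 * sum(delta_h)                  # half of (dh_1 + ... + dh_10)
short_way = 0.5 * (heights[-1] - heights[0])   # half of (h_10 - h_0)

print(long_way, short_way)  # 510.0 510.0
```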

Formula Derivation

Now we've got everything we need, so all that's left is to put the pieces together.

area under blue graph = area under black curve + area of blue region

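We can substitute the area under the blue graph with the sigma notation we've derived, the area under the black curve with the integral we've derived, and the area of the blue region with the triangle approximation above:

Σ (n = x₀ to x₁) f(n) ≈ ∫ (x = x₀-1 to x₁) f(x) dx + ½(f(x₁) - f(x₀-1))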
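In words, the result says that f(x₀) + f(x₀+1) + … + f(x₁) is roughly the integral of f from x₀-1 to x₁, plus half of f(x₁) - f(x₀-1). Here's a minimal Python sketch of that. It assumes you already know an antiderivative F of f, so the integral is just F(x₁) - F(x₀-1). The name `approximate_sum` and the test function f(x) = x² are made up for illustration.

```python
def approximate_sum(f, antiderivative, x0, x1):
    """Approximate f(x0) + f(x0 + 1) + ... + f(x1) with an integral.

    antiderivative must be a function F with F'(x) = f(x).
    """
    integral = antiderivative(x1) - antiderivative(x0 - 1)
    triangles = 0.5 * (f(x1) - f(x0 - 1))
    return integral + triangles

# Hypothetical example: f(x) = x^2, so F(x) = x^3 / 3.
def f(x):
    return x ** 2

def F(x):
    return x ** 3 / 3

x0, x1 = 1, 50
exact = sum(f(n) for n in range(x0, x1 + 1))
approx = approximate_sum(f, F, x0, x1)
percent_difference = abs(approx - exact) / exact * 100

print(exact)               # 42925
print(approx)
print(percent_difference)
```

For this particular function, the approximation lands within a small fraction of a percent of the exact sum.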

Chapter VI: Final Thoughts

Dang, that was a journey and a half. I've tested this equation several times to check whether I made a mistake in my calculations, and I don't think I have. For example:

Applying this formula to an inconspicuous-looking quadratic function with 50 as the upper bound returns a pretty good approximation, with a difference of only 0.01%.

So, is this formula actually useful? I hope it is. I think that integrating a function is easier than adding 50 enormous numbers by hand. Nevertheless, I have not found any real-world uses for this formula, so maybe I'll leave that to you, the reader, to figure out for yourself. ;)

Also, for now, try using this formula only on algebraic functions, over intervals where the function keeps growing. I'm not sure if it works with trigonometric functions, or on intervals where an algebraic function has a negative gradient.
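If you want to experiment with those untested cases yourself, here's a rough Python sketch. It tries one decreasing function (1/x) and one trigonometric function (sin x). Both choices, the bounds, and the helper names are arbitrary picks for illustration, and two examples don't prove anything in general.

```python
import math

def approximate_sum(f, antiderivative, x0, x1):
    """Integral of f from x0 - 1 to x1, plus half of f(x1) - f(x0 - 1)."""
    integral = antiderivative(x1) - antiderivative(x0 - 1)
    return integral + 0.5 * (f(x1) - f(x0 - 1))

def compare(f, antiderivative, x0, x1):
    """Return (exact sum, approximated sum) for f(x0) + ... + f(x1)."""
    exact = sum(f(n) for n in range(x0, x1 + 1))
    approx = approximate_sum(f, antiderivative, x0, x1)
    return exact, approx

# Decreasing function: f(x) = 1/x, with antiderivative ln(x).
# Starts at x0 = 10 so that x0 - 1 stays well away from x = 0.
reciprocal = compare(lambda x: 1 / x, math.log, 10, 50)

# Trigonometric function: f(x) = sin(x), with antiderivative -cos(x).
sine = compare(math.sin, lambda x: -math.cos(x), 1, 50)

print("1/x:", reciprocal)
print("sin:", sine)
```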

Well, I guess that's all! Thanks for reading this long Peridot post. I hope you understand everything I wrote.