All Numbers Contain the Digit 7

Introduction

If you look at all the numbers from $1$ to $10$, you’d notice that $10\%$ of them contain the digit $7$: specifically, only the number $7$ itself contains that digit.

Well, if you look at all the numbers from $1$ to $100$, how many contain the digit $7$? Here we need to work a bit. Obviously, all numbers with a $7$ in the ones place contain $7$, so that’s $10$ numbers already:

$$07, 17, 27, 37, 47, 57, 67, 77, 87, 97$$

Also, all numbers that have $7$ in their tens place contain the digit, which gives us $10$ more numbers, but we already counted $77$, so we only have $9$ new numbers to account for here. This gives us a total of $19$ numbers that contain $7$, or $19\%$ of the numbers.
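These two counts are small enough to check by brute force. Here is a minimal Python sketch; `count_with_digit` is a helper name introduced here purely for illustration:

```python
def count_with_digit(limit: int, digit: str = "7") -> int:
    """Count the integers in 1..limit whose decimal form contains `digit`."""
    return sum(digit in str(k) for k in range(1, limit + 1))


print(count_with_digit(10))   # 1
print(count_with_digit(100))  # 19
```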

What’s interesting is that the higher up you go, the higher the percentage becomes, until you get to the infinite case in which you get that $100\%$ of numbers contain $7$.

This is the reasoning I saw in pop media, explained by “science people”. And while it may look like that, we know in mathematics not to trust what patterns or sequences look like at first glance, but to rigorously show that the infinite case behaves as claimed.

You might have an intuition that this statement is correct, but my goal with this article is to prove (or disprove) it formally, while guiding you (the reader) along.

How Do We Formally Prove This?

From here on out we’ll use tools from Calculus I. If you haven’t studied limits, you can skip to the conclusion for the answer.

To formally prove a statement, we need to formally state it. Here’s my interpretation of the statement, as inferred from the videos (and my “proof”) above.

Theorem:

Let $(s_n)$ be the sequence in which $s_i$ is the number of integers containing $7$ as a digit in the set

$$[i] \coloneqq \{1, 2, \dots, i\}.$$

Then:

$$\lim_{n \to \infty} \frac{s_n}{n} = 1.$$

Hopefully you’re convinced this captures the intended claim. If you think there’s a better interpretation, feel free to reach out - I’m happy to revise.
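To get a feel for the ratio in the theorem before proving anything, one can tabulate it for a few values of $n$. This is only a rough sketch, and `density` is a hypothetical helper name chosen for this example:

```python
def density(n: int, digit: str = "7") -> float:
    """Return s_n / n: the share of 1..n whose decimal form contains `digit`."""
    s_n = sum(digit in str(k) for k in range(1, n + 1))
    return s_n / n


for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>7}: s_n / n = {density(n):.5f}")
```

The ratio grows slowly (it is still only about $0.41$ at $n = 100{,}000$), which is why a handful of small cases cannot settle what happens in the limit.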

The Proof

Let $s_n$ be the sequence counting how many numbers in $[n] \coloneqq \{1, \dots, n\}$ contain the digit $7$. Let $a_n = \dfrac{s_n}{n}$.

The count $s_n$ is easiest to evaluate when $n$ is a power of $10$, so the plan is to compute the limit along the subsequence $a_{10^n}$ and then carry it over to every $n$. We cannot get convergence for free from the monotone convergence theorem, because $a_n$ is not monotone (for example, $a_7 = \tfrac{1}{7}$ but $a_8 = \tfrac{1}{8}$). Instead, we’ll show that $a_n$ is bounded above by $1$ and bounded below by a quantity controlled by the subsequence.

$a_n$ is Bounded by $1$

Note:

This might feel obvious, but we’ll include it for completeness.

Assume, for contradiction, that $a_n$ is not bounded above by $1$. Then there exists $n$ with $a_n > 1$:

$$a_n > 1 \iff \frac{s_n}{n} > 1 \iff s_n > n,$$

meaning there are more numbers containing the digit $7$ in $[n]$ than there are numbers in the set (since $n = |[n]|$). Contradiction. Therefore, $a_n \le 1$.

$s_n$ is Monotone Increasing

Let $n \in \mathbb{N}$. Since $[n] \subseteq [n+1]$, every element of $[n]$ that contains $7$ is also an element of $[n+1]$ that contains $7$, so $s_{n+1} \ge s_n$. The same argument applies to $c_n \coloneqq n - s_n$, the number of integers in $[n]$ that do not contain $7$, so $c_{n+1} \ge c_n$ as well.

Now take any $n \ge 1$ and let $k$ be the integer with $10^k \le n < 10^{k+1}$. Since $c_n \le c_{10^{k+1}}$ and $n \ge 10^k$,

$$0 \le 1 - a_n = \frac{c_n}{n} \le \frac{c_{10^{k+1}}}{10^k} = 10\left(1 - a_{10^{k+1}}\right).$$

$a_n$ Converges to $1$

By the bound above, if $a_{10^n} \to 1$, then the squeeze theorem gives $a_n \to 1$ as well (as $n \to \infty$, so does $k$). So it remains to evaluate the limit of the subsequence $a_{10^n}$. Define

$$b_n = a_{10^n} = \frac{s_{10^n}}{10^n}.$$

To count how many numbers from $1$ to $10^n$ contain the digit $7$, consider the set of $n$-digit strings

$$A_n = \left\{ (d_{n-1}, d_{n-2}, \dots, d_1, d_0) \ \middle|\ \forall\, 0 \le i < n:\ d_i \in \{0, 1, \dots, 9\} \right\},$$

where $d_k$ is the digit in the $10^k$ place. Clearly $|A_n| = 10^n$.

Note:

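In our setup, the strings in $A_n$ represent the integers $0$ through $10^n - 1$, while we want $1$ through $10^n$. We treat the all-zeros string as the representative for $10^n$: neither $0$ nor $10^n$ contains a $7$, so the count is unaffected.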

When we see “at least once” in combinatorics, it’s natural to use complements. The number of strings with no $7$ at all is $9^n$, since each of the $n$ positions can hold any of the $9$ other digits. Therefore, the number of strings (hence numbers) containing at least one $7$ is $10^n - 9^n$.
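For small $n$, the complement count can be checked directly by enumerating every digit string in $A_n$. This is a small sketch using the standard library’s `itertools.product`; the function name `strings_with_seven` is an assumption made for illustration:

```python
from itertools import product


def strings_with_seven(n: int) -> int:
    """Enumerate all n-digit strings over 0-9 and count those containing a 7."""
    return sum("7" in digits for digits in product("0123456789", repeat=n))


for n in range(1, 6):
    brute = strings_with_seven(n)
    closed_form = 10**n - 9**n
    print(n, brute, closed_form, brute == closed_form)
```

Enumeration grows as $10^n$, so this is only practical for a handful of digits, but it agrees with $10^n - 9^n$ in every case it can reach.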

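Check against our earlier examples: for $n = 1$ this gives $10 - 9 = 1$, and for $n = 2$ it gives $100 - 81 = 19$, matching the counts from the introduction.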

Thus $s_{10^n} = 10^n - 9^n$, and

$$\lim_{n\to\infty} \frac{s_{10^n}}{10^n} = \lim_{n\to\infty} \left(1 - \left(\frac{9}{10}\right)^n\right) = 1 - 0 = 1.$$

Hence $b_n \to 1$, and therefore $a_n \to 1$.
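The limit is approached slowly. As a rough illustration, the sketch below uses exact integer arithmetic to find the smallest $n$ for which more than a given share of the numbers from $1$ to $10^n$ contain a $7$. The helper name and the chosen thresholds are picked here for the example only:

```python
def first_exponent_above(percent: int) -> int:
    """Smallest n with (10**n - 9**n) / 10**n > percent / 100, using integers only."""
    n = 1
    # The condition is equivalent to 100 * 9**n < (100 - percent) * 10**n,
    # which avoids any floating-point rounding.
    while 100 * 9**n >= (100 - percent) * 10**n:
        n += 1
    return n


for percent in (50, 90, 99):
    print(f"more than {percent}% from n = {first_exponent_above(percent)}")
```

The thresholds are crossed at $n = 7$, $n = 22$, and $n = 44$ respectively, so even among numbers with up to $44$ digits, roughly $1\%$ still avoid the digit $7$.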

The Conclusion

From the above, the limit is $1$. That is, as we consider larger and larger initial segments ($1$ to $1{,}000{,}000{,}000{,}000$, etc.), the proportion of numbers containing the digit $7$ tends to $100\%$.

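This phenomenon doesn’t rely on $7$ specifically; the same argument works for any fixed digit (for the digit $0$, the leading zeros of the padded strings need a little extra care, but the conclusion is the same). So, in that asymptotic sense, “all numbers contain that digit”.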

But of course, not every individual number contains every digit: for example, $5$ doesn’t contain $7$. That’s the paradoxical charm of infinities: the set of counterexamples is infinite, yet vanishingly rare in the limiting proportion. If you pick a large number uniformly at random from a vast range, it’s very unlikely to be missing a given digit.

Extra

To sanity-check the result, here’s a small Python script, sampler.py, that you can run:

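from random import randint

n = 100     # sample from the range [0, 10**n - 1]
k = 10_000  # number of samples

# Count the samples whose decimal form is missing at least one of the ten digits.
count = sum(len(set(str(randint(0, 10**n - 1)))) < 10 for _ in range(k))
print(f"\t{count} / {k} => {count / k * 100}%")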

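The script counts the samples that are missing at least one of the ten digits, which is a stronger check than looking for $7$ alone. Running it ($n$ refers to the range $[0, 10^n - 1]$ and $k$ to the number of samples) yields: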

 python3 sampler.py
    9 / 10000 => 0.09%
 python3 sampler.py
    0 / 10000 => 0.0%
 python3 sampler.py
    2 / 10000 => 0.02%

Very small percentages, which gives us some numerical validation.
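For comparison with the sampled percentages, the probability that a uniformly random $n$-digit string misses at least one of the ten digits can be computed by inclusion-exclusion. This is a sketch under an idealized assumption: fixed-length strings with leading zeros allowed. That differs slightly from the script, since `str()` drops leading zeros. The function name is illustrative.

```python
from math import comb


def prob_missing_some_digit(n: int) -> float:
    """P(a uniform n-digit string, leading zeros allowed, omits at least one digit)."""
    # Inclusion-exclusion over the number j of digits forced to be absent:
    # sum_{j=1}^{9} (-1)^(j+1) * C(10, j) * ((10 - j) / 10)^n
    return sum(
        (-1) ** (j + 1) * comb(10, j) * ((10 - j) / 10) ** n
        for j in range(1, 10)
    )


p = prob_missing_some_digit(100)
print(f"{p * 100:.4f}%")
```

For $n = 100$ this comes out to roughly $0.027\%$, or about three hits per $10{,}000$ samples on average, which is the same small scale as the runs shown above.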