This post is a playful combination of maths and computing to design “traps for numbers”. Here is the accompanying code to this post along with its tests.
Collatz sequence
Many classical problems in maths can be given a new twist with the help of computers, resulting in surprising results.
One of these problems is the Collatz conjecture. The Collatz sequence is a sequence of positive integers defined as follows:
For instance, if the first element is 6, the sequence is: {6, 3, 10, 5, 16, 8 …}.
The conjecture is that no matter what positive integer, the sequence will always reach 1. If we continue the above sequence, eventually it gets to 1: {6, 3, 10, 5, 16, 8, 4, 2, 1}.
What’s more, the sequence enters and infinite loop: {6, 3, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1} where the cycle {4, 2, 1} repeats indefinitely. Using a not very mathematical terminology, we can say that any element of a Collatz sequence becomes trapped in the cycle {4, 2, 1}.
According to Collatz conjecture, as of 2017, the conjecture has been checked by computer for all starting values up to 264.
Collatz-like sequences
Similarly, Collatz-like sequences should begin with a positive integer and end in an infinite cycle (traps for numbers from which is impossible to escape). It should not diverge to infinite or converge to a finite value. Therefore it seems reasonable to define such a sequence with terms that alternatively become bigger and smaller than the previous one.
A first example can be created by modifying the definition of the Collatz sequence to substract instead of adding 1:
Writing a program to check the resulting sequences for all starting values up to 107, it turns out that all numbers end up trapped in one of these cycles:
- {1, 2}
- {5, 14, 7, 20, 10}
- {17, 50, 25, 74, 37, 110, 55, 164, 82, 41, 122, 61, 182, 91, 272, 136, 68, 34}
Given that these cycles repeat infinitely, {1,2,1,2…} is equivalent to {2,1,2,1…}. We will call the sequence starting with the minimum value of the elements of the cycle “canonical representation”. To make the notation more succinct, we can refer to each cycle by its minimum value, e.g. <1> = {1,2}
Warning, maths stuff ahead!
The following paragraphs provide some mathematical vocabulary to simplify the discussion of the concepts presented in the rest of this post. This terminology has also been employed in the implementation of the algorithms used to elaborate this post.
In mathematical jargon, the above sequence definition defines an “equivalence relation” (an equivalence relation is represented by ~) and the set of all possible equivalence classes by the equivalence relation ~ is the “quotient set”. Using the notation previously defined, the quotient set would be {<1>, <5>, <17>}
This equivalence relation ~ partitions the set of the positive integers in 3 groups, according to the cycle they are related to. The “projection” of ~ is the function that maps elements into their respective equivalence classes.
Check Wikipedia for more info about these concepts.
Additionally, we will call the sequence that takes each number to the representative value of its final cycle “decay path”, e.g. the decay path of 6 is: {6, 3, 10, 5, 16, 8, 4, 2, 1}.
Exploring the results
Now, equipped with all these concepts, let’s go on to explore the numbers and their decay path. There is an important thing to bear in mind: unless specified otherwise, the following results only apply in the range of numbers 1-107
The first remarkable result is that the number of elements in each partition is, roughly, similar. This means that when picking a random number in the range 1-107, there is equal probability of picking a number whose equivalence class is <1>, <5> or <17>.
Here’s the exact probability (in reality there is a slightly higher probability of obtaining a number whose equivalence class is <17>):
- <17> => 34.8%
- <5> => 32.4%
- <1> => 32.7%
Even more interesting is the fact that the cycles can be reached only through some of its elements. Let’s define the “entry point” of a number N into its final cycle C as the first element of C in the decay path of N.
It is clear that the only entry point into <1> is 2. In the case of <5>, the only entry points are 14 and 20. The remaining elements, 5, 7 and 10, cannot be reached from outside the cycle. For instance, according to the rules of the sequence, 10 can only be reached:
- from 20 (by dividing by 2)
- from a number such that multiplying by 3 and subtracting 1 results in 10; that number does not exist as 11 is not divisible by 3.
To visualise these ideas, let’s take the first 100 numbers and draw their decay path. In the range 1-100, the distribution of numbers by equivalence class is:
- <17> => 31%
- <5> => 31%
- <1> => 38%
In the following diagrams,
- numbers in red are those less or equal to 100
- solid numbers are elements of the final cycles
- grey solid numbers are those elements of the cycle that cannot be reached from outside the cycle whereas the orange ones are those that are entry points to the cycle
- broken lines indicate the omission of some elements
Equivalence class <1>
For <1>, all the paths join the backbone formed by the powers of 2 and from there reach the final cycle through the number 2.
Equivalence class <5>
For <5>, most of the paths have as entry point number 14 (whereas no path goes through 5, 7 or 10 as previously discussed). If these results are something to go by, we can infer that about 74% (23/31) of the decay paths to <5> will have 14 as entry point.
Actually, within the range 1-107 the percentages are:
- 14 => 83%
- 20 => 17%
Equivalence class <17>
Finally, for <17>, 68 seems to be the most popular entry point whereas no decay path in the range 1-100 goes through 110, 164, 182 or 272 (although they are reachable from outside).
Again, we can compare these results with the ones in the range 1-107 to check if this initial trend in the range 1-100 is representative of what happens when expanding the range of numbers.
Surprisingly, more than half of the elements of <17> have 122 as entry point! Then 74, 50 and 68 are in the order of 10% whereas 110, 164, 183 and 272 are in the order of 1%.
- 110 = 1.6%
- 164 = 0.9%
- 74 = 10%
- 50 = 15.8%
- 182 = 1.6%
- 272 = 0.6%
- 68 = 13.1%
- 122 = 56.4%
Collatz sequences as Graphs
The above diagrams representing the decay paths of the numbers in the range 1-100 can be thought of as graphs. More specifically, directed cyclic graphs. Thus it makes sense to ask about the connecting paths of the elements of the graph.
In this context, the path from n to m (n <=> m) is the sequence of numbers that connect n with m throughout their decay path. These are the possible cases:
- if the equivalence classes of n and m are different, then n and m are not connected, e.g. 6 <=> 40 = {}
- if m is in the decay path of n, then the connecting path is a subset of n’s decay path, e.g. 16 <=> 2 = {16,8,4,2}
- if n is in the decay path of m, then the connecting path is a subset of m’s decay path going in reverse, e.g. 16 <=> 11 = {16,32,11}
- if the decay paths of n and m overlap, then the connecting path is the concatenation of the subsets of their decay path that go up to the intersection point, e.g. 11 <=> 6 = {11,32,16,8,3,6}, where 8 is the intersection point of the decay paths of 11 and 6
Beyond 3*n-1
The natural question now is what happens for rules like 5*n-1, 7*n-1 and so on.
For 5*n-1, there is still a cycle containing 1, namely: {4,2,1}. Therefore any power of 2 or or number connected to a power of 2 will decay into that cycle. For instance, all numbers in the range 1-8 decay into {4,2,1}. However, the decay path of 9 grows indefinitely.
In the case of 7*n-1, even the decay path of 1 grows indefinitely. Clearly, as we increase the multiplication factor, the upwards pressure of the odd numbers outstrips the downward influence of the even ones.