Basis Computation for Multiple Zeta Values in PARI/GP

Computing a basis of the $$\Q$$-vector space spanned by the Multiple Zeta Values of weight $$k$$ (which we denote as $$\mathcal Z_k$$) is a hard problem. In 1994, Zagier conjectured that the dimension of the vector space is given by the generating function $$\sum_{k \geq 0}d_k X^k = \frac{1}{1 - X^2 - X^3}$$. However, we currently do not know much about the correctness of this conjecture, even for smaller weights.

Instead of attempting to prove the conjecture, we will attempt to compute the basis of $$\mathcal Z_k$$ by approximating Multiple Zeta Values using rational numbers which is computer-friendly. Of course, the result of this computation is not guaranteed to be an actual basis of $$\mathcal Z_k$$. However, this still can be seen as supporting evidence of the Zagier's conjecture above.

The first few sections of this tutorial is based on the slides by Prof. Koji Tasaka.

Precision
In PARI/GP, real numbers are represented up to certain precision. This precision can be set as follow as shown in the slides. ? \p 200 realprecision = 211 significant digits (200 digits displayed) Here, the precision is set such that the displayed 200 digits are precise. More on this can be found in this documentation.

When we ask PARI/GP to compute the Multiple Zeta Values, as mentioned above, the value is approximated up to certain precision. This precision is according to the precision that we last set in the program. Since we will be using these approximated Multiple Zeta Values for our basis computation, the computed result is not guaranteed to be correct. Nevertheless, we can try to be more confident with the result by using higher precision in our computation. But of course, setting higher precision will cause our program to consume more resources such as CPU, memory and time. Hence, we should set the precision according to our hardware capabilities and time resources.

Listing Admissible Indexes
Let's first consider the problem of counting the number of possible indexes (not necessarily admissible) with weight $$k$$. This is the same as counting the number of tuples $$a_1, a_2, ..., a_r$$ such that $$a_1 + a_2 + ... + a_r = k$$, $$r \geq 0$$ and $$a_i \geq 0$$ for $$1 \leq i \leq r$$. Let $$A$$ be the set of such tuple and let $$B$$ be the set of binary strings of length $$r$$ such that the first entry of the binary string is $$1$$. We can easily show that there is a bijection between $$A$$ and $$B$$.

As an illustration, suppose that $$k=3$$. In this case, we have $$A = \{(1, 1, 1), (1, 2), (2, 1), (3)\}$$ and $$B = \{111, 110, 101, 100\}$$. Now, let's define a function $$f:B \to A$$ such that the $$f(b)$$ is defined as follow. First, we construct a tuple by splitting $$b$$ according to its $$1$$ entries. For example, if $$b=10010100$$, then the splitted result is $$(100, 10, 100)$$. Then, we convert this into a tuple that consists of the length (i.e., the number of digits) of the binary string in the tuple. Since the length of $$100$$ is 2 and the length of $$10$$ is 2, the tuple $$(100, 10, 100)$$ is converted to $$(3, 2, 3)$$. Then, we set $$f(b)$$ to be this tuple. With this rule, for the example $$A$$ and $$B$$ above, we have $$f(111) = (1, 1, 1)$$, $$f(110) = (1, 2)$$, $$f(101) = (2, 1)$$ and $$f(100) = 3$$. It is easy to see that for any $$k$$, the function $$f$$ is a bijection between $$A$$ and $$B$$.

Fortunately, it is very easy to generate $$B$$ since $$B$$ is just the binary representations of positive integers whose binary representation consists of $$r$$ bits. The positive integers whose binary representations consists of $$r$$ bits are basically just all integers in the interval $$[2^{r-1}, 2^r-1]$$. Hence, to list all possible admissible indexes, we can just iterate through each positive integers in the interval $$[2^{r-1}, 2^r-1]$$, convert each of them to the binary representation, and apply the function $$f$$ above. This is practically what the pseudocode on slide 18 of the slides by Prof. Koji Tasaka is about, which will be included below with little formatting. ind ← [] /* [] is an empty vector */ for j=2ˆ(k−1) to 2ˆk−1 do w ← binary(j), kk ← [],M ← k+1 /* kk is going to be an index */ for n=k downto 1 do   if w[n]!=0 then kk ← concat(kk, M−n), M ← n   endif endfor /* created an index kk */ ind ← concat(ind, [kk]) endfor Here, the for loop at line 2 listing the integers, which is converted to the binary representation at line 3, and then the inner for loop at line 4-8 applies the function $$f$$ that is described above. The pseudocode above can be easily translated to PARI/GP and will be left as an exercise to the reader.

As also noted in the slide, listing admissible indexes is as easy as adding a check to filter out indexes that start with $$1$$.

Computing the Basis of a Vector Space
From linear algebra, we know that any maximal linearly independent subset of a vector space forms a basis. It is also easy to see that a maximal linearly independent subset of a set $$S$$ form a basis of the vector space spanned by $$S$$. When $$S$$ is finite, we can easily compute a basis by starting with an empty list and keeps adding an element of $$S$$ that is linearly independent with the list until there is no more element to add to the list. This is clearly illustrated in the following pseudocode by Prof. Koji Tasaka (with little formatting). find ← [] basis ← [Pi] for j=2ˆ(k−1) to 2ˆk−1 do ... /* the same with Algorithm 1; kk is an index */ if kk[1] > 1 then val ← zetamult(kk), v ← concat(basis,val) if lindep(v)[1] != 0 then basis ← v, find ← concat(ind,[kk]) endif /* judged if zeta(kk) is independent or not */ endif endfor In the code above, we generate the set $$S$$ by listing admissible indexes using the previous algorithm and apply  on each of them. After that, the function  is used to find a non-trivial linear combination of the set of vector $$v$$ that results to $$0$$ and keep expanding the basis list as explained above.

However, as a keen reader might notice, the basis list starts with $$\pi$$ instead of an empty list. In a hypothetical (though probably unrealistic) perfect world where we represent Multiple Zeta Values precisely without using approximations, we can expect that  will only return results when the set $$v$$ is linearly dependent. However, remember that here we only use the approximated Multiple Zeta Values up to certain precision that we can set. Since PARI/GP works using arbitrary precision integers, it will always be able to find a non-trivial linear combination of any of the approximated Multiple Zeta Values (since the equation consists of $$n-1$$ free variables). To deal with this, the pseudocode above uses a very clever trick by utilizing a famous conjecture that $$\pi$$ is linearly independent with any Multiple Zeta Values. For example,  will find a solution for $$\zeta(2,3)x + \zeta(5)y = 0$$ even though they are linearly independent since $$\zeta(2,3)$$ and $$\zeta(5)$$ are only approximated in PARI/GP. Instead, we will try to find the solution of $$\pi x + \zeta(2, 3)y + \zeta(5)z = 0$$. In this case, if $$\zeta(2,3)$$ and $$\zeta(5)$$ are linearly dependent (hypothetically), then  will likely find a solution with $$x = 0$$. Otherwise,  will likely find a solution with $$x \neq 0$$. This is the idea that is used in the pseudocode above.

The slides also provided the complete PARI/GP implementation for this which will also be copied here with little formatting. ? basisMZV(k)= { local(basis,ind,j,w,kk,M,n,val); basis=[Pi]; ind=[]; for(j=2^(k-1), 2^k-1, w=binary(j); kk=[]; M=k+1; forstep(n=k, 1,-1, if(w[n], kk=concat(kk,M-n); M=n));  if(kk[1]>1, val=zetamult(kk); v=concat(basis,val); if(lindep(v)[1], basis=v; ind=concat(ind,[kk])))); ind; }

Computing a Basis for a Fixed Depth
Suppose that we would like to compute a basis of the $$\Q$$-vector space generated by Multiple Zeta Values of weight $$k$$ and depth $$r$$. An obvious way to do this is to use the code from the previous section and filter out any indexes with depth not equals to $$r$$. However, this is not very efficient because we will need to go through all $$2^{k-1}$$ indexes of weight $$k$$ even though there are only $$\binom{k-1}{r-1}$$ indexes whose depth is $$r$$ among them.

Instead of doing such filtering, it will be more efficient if we can only iterate integers whose binary representation contains exactly $$r$$ bits among the integers whose binary representation consists of $$k$$ bits.

Suppose that $$k=9$$ and $$r=5$$. A simple observation will show that the smallest integer with binary representation that maps to an index with weight $$9$$ and depth $$5$$ has the binary representation of $$100001111$$ which is mapped to the index $$(5, 1, 1, 1, 1)$$ as can be verified by applying the function $$f$$ that we previously defined. From now on, we will just denote the binary representation of such smallest integer as the smallest binary string. More generally, the smallest binary string that represents an index with weight $$k$$ and depth $$r$$ is a binary string of length $$k$$ that starts with $$1$$, follows by several zeroes, and ended with $$r-1$$ bits of $$1$$. In PARI/GP, we can compute this as. Here,  is an operator to perform bitwise left shift. For example,  means shifting $$8 = (1000)_2$$ to the left as many as 3 steps by appending zeroes to be $$(1000000)_2$$. Effectively,  equals to $$a (2^b)$$. With the previous example where  and , we have. In binary form, it is equivalent to $$(100000000)_2 + (10000)_2 - 1$$. Also note that in binary form, $$2^x-1$$ is just a string of $$1$$s of length $$x$$. Hence, the previous expression becomes $$(100000000)_2 + (1111)_2 = (100001111)_2$$ which is what we wants.

Now we are able to generate the smallest binary string for a given weight and depth. We would like to inductively generate the list of all binary strings that are associated with the given weight and depth. Suppose that $$n$$ is an integer whose binary representation maps to an index with weight $$k$$ and depth $$r$$. Can we find the smallest integer $$m>n$$ whose binary representation also maps to an index with weight $$k$$ and depth $$r$$? Suppose that in binary representation, $$n = (10110001111000)_2$$. Any integers whose binary representation in the form of  must not exceed $$n$$ because $$(1111000)_2$$ is the largest possible number that consists of $$7$$ bits and contains $$4$$ bits of $$1$$s. Instead, $$m$$ must be the smallest integer whose binary representation is in the form of  that contains $$7$$bits of $$1$$s. Since the first seven bits already used up four bits of $$1$$s, we need to use up the remaining three bits of $$1$$s in the last four bits. Since we want $$m$$ to be the smallest possible, then the last four bits must also be the smallest possible that contains three bits of $$1$$s. Hence, the last four bits should form $$0111$$. So, it must be that $$m$$ is. We can generalize this for any $$n$$ as follows:
 * 1) We first compute the right-most bit of $$1$$. In the example above when $$n = (10110001111000)_2$$, the right-most bit of $$1$$ is $$1000$$. This can be computed as   where   is $$n$$. This is a standard trick that is used, for example in Fenwick Tree data structure. The expression   will convert the right-most bit $$1$$ to $$0$$ and converts the right-most zeroes to $$1$$. In the example above, $$n - 1 = (10110001110111)_2$$. Basically, this is the binary string that agrees with $$n$$ from the left up to before the last bit of $$1$$ and disagree on the remaining right bits. Hence, negating this will result to binary string that agrees with $$n$$ starting from the last bit of $$1$$ to the right. Applying   to $$n-1$$ above will result to $$(01001110001000)_2$$. As can be seen, only the last $$1000$$ agree with $$n$$. Hence, a simple bitwise with $$n$$ will result to the right-most bit of one.
 * 2) Next, we compute the right most group of $$1$$s. In the example above, this group is $$1111000$$. This can be easily done with  . Here, adding   from the previous step will zeroes the last group of ones. As an example, adding $$(10110001111000)_2 + (1000)_2 = (10110010000000)_2$$. Hence, similar to the previous step, negating this and applies bitwise and with the original integer will result to the last group of ones $$1111000$$.
 * 3) Clear the last group of ones and add $$1$$ to the $$0$$ before the last group of ones. In this example, the next smallest integer that we are trying to find should be in the form of   as previously explained. Notice that we can reuse the trick from the previous step. By adding   from the first step to our number, we will get $$(10110010000000)_2$$. Hence, this step is as simple as.
 * 4) Add the remaining bits of $$1$$s to the right-most part of the result of the previous step. In our example, the number that we want is $$(10110010000000)_2 + (111)_2 = (10110010000111)_2$$. We should get $$(111)_2$$ by removing one $$1$$ and all the zeroes from  . To remove the zeroes, we can just divides by   since   is a power of two:  . To remove one bit of $$1$$, we can just either divide by two or perform a bitwise shift right. Hence, the next smallest integer that we are looking for can be found as follow:.

Hence, we found a way to inductively generates the list of binary strings that will map to indexes with weight $$k$$ and depth $$r$$. We can then apply the function $$f$$ to maps it to index and then performs the basis computation. The following function snippet will generate such integers according to the algorithm above and applies  to each integer. {   for_each(weight, depth, func) = number_of_ones = depth; \\ The variable bitmask is initialized with the smallest possible number \\ that has number_of_ones bits of 1s in the binary representation. bitmask = (1 << (weight - 1)) + (1 << (number_of_ones - 1)) - 1;

\\ Loop until bitmask is greater than the biggest possible number that has \\ number_of_ones bits of 1s in the binary representation. until (bitmask > ((1 << number_of_ones) - 1) << (weight - number_of_ones),           func(bitmask);

\\ We will find the smallest number whose binary representation \\ contains exactly number_of_ones 1 bits and greater than \\ bitmask. We will call it the "next bitmask".

\\ Get the right-most bit 1 from bitmask. For example, if the \\ original bitmask is 100111100, last_one will be 000000100. last_one = bitand(bitmask, bitneg(bitmask - 1));

\\ Get the last group of 1s from bitmask. For example, if the \\ original bitmask is 100111100 (as above), last_group_of_ones \\ will be 000111100. last_group_of_ones = bitand(bitmask, bitneg(bitmask + last_one));

\\ Consider the case 100111100 above. The next bitmask can be           \\ constructed by splitting the last_group_of_ones so that we            \\ shift the left most 1 of the last_group_of_ones one bit to the \\ left and shift the rest to the right most bits: 10100111. \\ Another example, 101100011100. Then, we will split the group \\ of 111 to be: 101100100011.

\\ First, replace the last_group_of_ones with bit 1 right to           \\ the left of the group. \\ For example, 101100011100 becomes 101100100000. bitmask = bitmask + last_one;

\\ Finally, we add the remaining of the last_group_of_ones to           \\ right-most bits. Notes that the trick below will also work \\ even when last_group_of_ones contains only one bit of 1. \\ For example, the previous line will change 101100011100 \\ to 101100100000. This line now will add the remaining bits \\ to be 101100100011. \\ Here, bitmask is now the next bitmask of the original one. bitmask = bitmask + ((last_group_of_ones / last_one) >> 1) ) }

As an example, to generate all possible basis but according to the order of depth: {   MZV_basis(weight) = basis_in_real_with_sentinel = [Pi]; basis = []; for (depth = 1, weight, for_each(weight, depth, (bitmask)-> remaining_bits = weight; previous = -1; index = []; for (i = 0, weight,                   if (bitand(bitmask, 1 << i) > 0, index = concat(index, i - previous); remaining_bits -= (i - previous); previous = i                   )                ); if (index[1] >= 2,                   concattenated_basis_in_real_with_sentinel = concat(basis_in_real_with_sentinel, zetamult(index));                    if (lindep(concattenated_basis_in_real_with_sentinel)[1] != 0, basis_in_real_with_sentinel = concattenated_basis_in_real_with_sentinel; basis = concat(basis, [index]))               ) ));       basis; }

And to compute a basis of the $$\Q$$-vector space of Multiple Zeta Values of weight $$k$$ and depth $$r$$, we can simply remove the for loop at line 5 above and fix the depth.

The full code can be found in this link.