Following the first post: Circle STARK: Understanding Circle Group’s Operation as Rotation
The goal of this post is to explain 2-adic FFT on circle domain.
why FFT?
In STARK, there are two cases where FFT is used:
- transform a polynomial from its evaluation form to its coefficients form.
- FRI, where the “split and fold” idea comes from FFT
The first use case happens when a trace is considered as the evaluations of a polynomial on trace domain, we need to use FFT to get the coefficient form of this polynomial, and then evaluate this polynomial on trace evaluation domain.
What is the requirement for a circle domain where FFT can be applied?
For both cases to work, the underlying domain where FFT is applied on, needs to satisfy some requirements. In cyclic group generated from prime field, we know the group order needs to be smooth. but what about FFT on circle domain?
Before dig into the FFT in circle domain, it would be beneficial to introduce the “core idea” of FFT in a simple case, then we can see how can we apply the same strategy towards the circle domain and make FFT works there.
Core idea of FFT
FFT can transfer a polynomial from its coefficient format to evaluation format. The core idea of FFT is re-utilize the computation to make this transfer less expensive. A good explanation is in this video or my video, which is the copy of the first.
Come back to the question, if we want to transfer a polynomial from its coefficient domain to its evaluation domain, how can we do it? the simple answer is we can just choose random points and evaluated the polynomial on these points. Starting with this simple method,how can we make it cheaper?
The thing we can do is to smartly choose the points we want to evaluate the polynomial. Say, if we have a polynomial \(f(x)=x^2\), we would choose point \(x\) and point \(-x\), then once \(f(x)\) is computed, the evaluation of \(f(-x)\) immediately implied. This is the property of even function. If we have a polynomial \(f(x)=x^3\), we still choose point \(x\) and point \(-x\) as upon \(f(x)\) is computed, \(f(-x)\) will just be the addition inverse of it. Following this strategy, for a general polynomial:
\(f(x)=a_0+a_1x+a_2x^2+a_3x^3+…\)
we can split it to even function and odd function, then use the above strategy to re-utilize the computations along the way:
\(f_e(x^2)=a_0+a_2x^2+a_4x^4+…\)
\(xf_o(x)=a_1+a_3x^3+a_5x^5…\)
\(f(x)=f_e(x^2)+xf_o(x^2)\)
now when we want to compute the evaluation of \(f(x)\) on point \(x_0\), we instead of compute evaluation of \(f_e(x_0^2)\) and \(f_e(x_0^2)\) , then get the evaluations of \(f(x)\) on two points by combing the values:
\(f(x_0)=f_e(x_0^2)+x_0f_o(x_0^2)\)
\(f(-x_0)=f_e(x_0^2)-x_0f_o(x_0^2)\)
by doing this, we reduce a problem of computing two evaluation of \(f(x)\) to computing two evaluation of half degree polynomial, two extra multiplication and addition. Cheaper!
This process can be recursive, namely, if \(f_e(x)\) and \(f_e(x)\) can be further divided to even and odd function, the reduction can continue, in the condition that the underlying algebraic space support it: 2 to 1 map.
Comes to circle domain
First thing to realise is that now we are working on a bi-variate polynomial, \(f(x,y)\), its evaluation format is still straightforward, but when it comes to coefficient form, it is different.
For a general bi-variate polynomial, the terms has combinations of different degrees of each variant, it looks like:
\(a_0+a_1x+a_2y+a_3xy+a_4x^2+a_5xy^2 …\)
in this example the polynomial basis is
\(1, x, y, xy, x^2, xy^2 …\)
one can also change the basis to other order of terms or combinations, but the coefficients will change correspondingly. it is just a matter of representing the same thing using different basis with different coefficients.
since in circle domain, we have \(x^2+y^2=1\), so we choose to work on basis
\( 1, y, x, xy, 2x^2 – 1, 2x^2y – y, 2x^3 – x, 2x^3y – xy, 8x^4 – 8x^2 + 1, \dots \)
so the polynomial we are working on now (still use polynomial with 8 coefficients as example) is
\(f(x,y)=a_0 + a_1y + a_2x + a_3xy +a_4(2x^2 – 1)\)
\(+a_5(2x^2y – y) + a_6(2x^3 – x)+a_7( 2x^3y – xy)\)
we have this polynomial in its coefficients form, we want to get its evaluation form, let’s assume we can make it works on circle group, and we apply the core idea of FFT —-reuse of the computation result
First round: split the polynomial into terms with only \(x\) and terms with \(y\)
\(f(x,y)=\underbrace{a_0 + a_2x +a_4(2x^2 – 1)+ a_6(2x^3 – x)}_{f_0 \quad \text{terms with only} x}\)
\(\underbrace{+ a_1y + a_3xy+a_5(2x^2y – y) +a_7( 2x^3y – xy)}_{f_1 \quad \text{terms with } y}\)
in other words
\(f(x,y)=f_0(x)+yf_1(x)\)
The circle group we have looks like
we want to reuse computation, therefore we apply the symmetric tricks of the points directly to:
FFT round 1:
point 1 and point 7 has same x and opposite y
\(f(p1.x,p1.y)=f_1(p1.x)+p0.y f_1(p1.x)\)
\(f(p7.x,p7.y)=f_0(p1.x)-p1.y f_1(p1.x)\)
point 2 and point 6 has same x and opposite y
\(f(p2.x,p2.y)=f_0(p2.x)+p2.y f_1(p2.x)\)
\(f(p6.x,p6.y)=f_0(p2.x)-p0.y f_1(p2.x)\)
the same for points 3 and 5, but what about point 4? we would expect it can grouped with point 0, but they have opposite x and same y, in this case we could not utilize the computations for point 0 to compute the computation for point 4
\(f(p0.x,po.y)=f_0(p0.x)+p0.y f_1(po.x)\)
\(f(p4.x,po.y)=f_0(-p0.x)+p0.y f_1(-po.x)\)
FFT cannot continue any more, the 2 to 1 map broke. If the underlining algebraic structure is good, we would expect the problem reduced to compute 8 polynomials without y variant, but now it becomes 10 polynomials.
\(f_0(p0.x), f_1(po.x),f_0(p1.x), f_1(p1.x),f_0(p2.x), f_1(p2.x),f_0(p3.x), f_1(p3.x),f_0(-p0.x), f_1(-po.x)\)
How can we over come this problem?
The essential property we want to make FFT work is that in every iteration of FFT, the 2 to 1 map holds. we can simply rotated the points by \(\frac{\pi}{16}\) to get a new circle domain looks like
Now in the first round of FFT, the 2 to 1 map holds, we could continue the iteration.
A full computation of FFT can be found in this post (not finished yet), to have a full view of folded bit reverse there.
Another method to over come this problem, is to choose a bigger domain at the beginning and only working on its odd points:
only considering points: 1,3,5,7,9,11,13,15
More mathematical description of the underlying algebraic structure:
the algebraic structure we end up with, is a rotation of the circle group, rotation is the group operation of this group therefore, we can present the rotated domain as \(QG_8\), it is a coset. where the generator/step size of the group is \(\frac{\pi}{4}\), the offset is \(\frac{\pi}{8}\)
question: is the offset of this coset free to choose?
no, it needs to be a offset that make 2 to 1 map, namely, make the x coordinate of the first point and the last point the same. because of this, we can also present the coset as a twin coset
\(QG_8= Q G_4 \cup Q ^{-1}G_4 \)
More about the creation of twin coset in next post!