Multiproof scheme for polynomial commitments
This post describes a multiproof scheme for (additively homomorphic) polynomial commitment schemes. It is very efficient to verify (the dominant operation for the verifier being one multiexponentiation), as well as efficient to compute as long as all the polynomials are fully available (it is not suitable in the case where the aggregations has to be done from only proofs without access to the data).
It is thus very powerful in the setting of verkle tries for the purpos of implementing weak statelessness.
Note that this post was written with the KZG commitment scheme in mind. It does however work for any “additively homomorphic” scheme, where it is possible to add two commiments together to get the commitment to the sum of the two polynomials. This means it can also be applied to Inner Product Arguments (the core argument behind bulletproofs) and is actually a very powerful aggregation scheme for this use case.
Verkle multiproofs using random evaluation
Problem: In a verkle tree of width , we want to provide all the intermediate KZG (“Kate”) proofs as efficiently as possible.
We need to provide all intermediate commitments, there is no way around that in Verkle trees, but we only need a single KZG proof in the optimal case. There are efficient multiverification techniques for KZG if all proofs are given to the verifier, but we want to do with just a small constant number of proofs.
For the notation used here, please check my post on KZG commitments.
Please see here for an introduction to verkle tries.
Connection to verkle tries
Quick recap: looking at this verkle trie:
In order to prove the leaf value
0101 0111 1010 1111 -> 1213 we have to give the commitment to
Node A and
Node B (both marked in cyan), as well as the following KZG proofs:
- Proof that the root (hash of key and value) of the node
0101 0111 1010 1111 -> 1213is the evaluation of the commitment of
Inner node Bat the index
- Proof that the root of
Inner node B(hash of the KZG commitment) is the evaluation of the commitment of
Inner node Aat the index
- Proof that the root of
Inner node A(hash of the KZG commitment) is the evaluation of the
Rootcommitment at the index
Each of these commitments, let’s call them (
Inner node B), (
Inner node A) and (
Root) is to a polynomial function , What we are really saying by the claim that the commitment evaluates to some at index is that the function committed to by , i.e. , evaluates to at , i.e. , So what we need to prove is
- (hash of key and value), where , i.e. is the commitment to
- , where
- , where
Note that we replaced the index with , where is a -th root of unity which makes many operations more efficient in practice (we will explain below why). stands for a collision-resistant hash function, for example
If we have a node deeper inside the trie (more inner nodes on the path), there will be more proofs to provide. Also, if we do a multiproof, where we provide the proof for multiple key/value pairs at the same time, the list of proofs will be even longer. Overall, we can end up with hundreds or thousands of evaluations of the form to prove, where we have the commitments (these are part of the verkle proof as well).
Relation to prove
The central part of a verkle multiproof (a verkle proof that proves many leaves at the same time) is to prove the following relation:
Given KZG commitments , prove evaluations
where , and is a -th root of unity.
Let ( is a hash function) The prover computes the polynomial
If we can prove that is actually a polynomial (and not a rational function), then it means that all the quotients are exact divisions, and thus the proof is complete. This is because it is a random linear combination of the quotients: if we just added the quotients, it could be that two of them just “cancel out” their remainders to give a polynomial. But because is chosen after all the inputs are fixed (see Fiat-Shamir heuristic), it is computationally impossible for the prover to find inputs such that two of the remainders cancel.
Everything else revolves around proving that is a polynomial (and not a rational function) with minimal effort for the prover and verifier.
Note that any function that we can commit to via a KZG commitment is a polynomial. So the prover computes and sends the commitment , Now we only need to convince the verifier that is, indeed, a commitment to the function , This is what the following steps are about.
We will prove the correctness of by (1) evaluating it at a completely random point and (2) helping the verifier check that the evaluation is indeed , Let , We will evaluate and help the verifier evaluate the equation
with the help of the prover. Note that we can split this up into two sums
The second sum term is completely known to the verifier and can be computed using a small number of field operations. The first term can be computed by giving an opening to the commitment
at , Note that the commitment itself can be computed by the verifier using a multiexponentiation (this will be the main part of the verifier work), because they have all the necessary inputs.
The prover computes
which satisfies ,
Let denote the polynomial committed to by – if the prover is honest, then this will be , however this is what the verifier wants to check. Due to the binding property there can be at most one polynomial that the prover can open at, which makes this valid.
What remans to be checked for the verifier to conclude the proof is that
or, reordering this:
The verifier can compute the left hand side without the help of the prover. What remains is for the prover to give an opening to the commitment at to prove that it is equal to . The KZG proof verifies that this is the case.
The proof consists of and .
The verifier starts by computing and ,
As we have seen above, the verifier can compute the commitment (using one multiexponentiation) and the field element
Then the verifier computes
The verifier checks the Kate opening proof
This means that the verifier now knows that the commitment , opened at a completely random point (that the prover didn’t know when they committed to it), has exactly the value of which the verifier computed with the help of a prover. According to the Schwartz-Zippel lemma this is extremely unlikely (read: impossible in practice, like finding a hash collision), unless is actually a commitment to ; thus, must be a polynomial and the proof is complete.
Optimization: Do everything in evaluation form
This section is to explain various optimizations which make all of the above easy to compute, it is not essential for understanding the correctness of the proof (but it is for making an efficient implementation). The great advantage of the above version of KZG multiproofs, compared to many others, is that very large parts of the prover and verifier work only need to be done in the field. In addition, all these operations can be done on the evaluation form of the polynomial (or in maths terms: in the “Lagrange basis”). What that means and how it is used is explained below.
Usually, we see a polynomial as a sequence of coefficients defining a polynomial function , Here we define another way to look at polynomials: the so-called “evaluation form”.
Given points , there is always a unique polynomial of degree that assumes all these points, i.e. for all , Conversely, given a polynomial, we can easily compute the evaluations at the roots of unity. We thus have a one-to-one correspondence of
This can be seen as a “change of basis”: On the left, the basis is “the coefficients of the polynomial”, whereas on the right, it’s “the evaluations of the polynomial on the ”.
Often, the evaluation form is more natural: For example, when we want to use KZG as a vector commitment, we will commit to a vector by committing to a function that is defined by , But there are more advantages to the evaluation form: Some operations, such as multiplying two polynomials or dividing them (if the division is exact) are much more efficient in evaluation form.
In fact, all the operations in the KZG multiproof above can be done very efficiently in evaluation form, and in practice we never even compute the polynomials in coefficient form when we do this!
Let’s define the Lagrange polynomials on the domain .
For any ,
so the Lagrange polynomials can be seen as the “unit vectors” for polynomials in evaluation form. Using these, we can explicitly translate from the evaluation form to the coefficient form: say we’re given as a polynomial in evaluation form, then the polynomial is
Polynomials in evaluation form (given by the ) are sometimes called “Polynomials in Lagrange basis” because of this.
For KZG commitments, we can use another trick: Recall that the setup for KZG to commit to polynomials of degree consists of , From these, we can compute , Then we can simply compute a polynomial commitment like this:
There is no need to compute the polynomial in coefficient form to compute its KZG commitment.
FFT to change between evaluation and coefficient form
The Discrete Fourier Transform of a vector is defined by
Note that if we define the polynomial , then , i.e. the DFT computes the values of on the domain , This is why in practice, we will the roots of unity as our domain whenever it is available because then we can use the DFT to compute the evaluation form from the coefficient form.
The inverse, the Inverse Discrete Fourier Transform , is given by
Similar to how the DFT computes the evaluations of a polynomial in coefficient form, the inverse DFT computes the coefficients of a polynomial from its evaluations.
The “Fast Fourier Transform” is a fast algorithm, that can compute the DFT or inverse DFT in only multiplications. A direct implementation of the sum above would take multiplications. This speedup is huge and makes the FFT such a powerful tool.
In strict usage, DFT is the generic name for the operation, whereas FFT is an algorithm to implement it (similar to sorting being an operation, whereas quicksort is one possible algorithm to implement that operation). However, colloquially, people very often just speak of FFT even when they mean the operation as well as the algorithm.
Multiplying and dividing polynomials
Let’s say we have two polynomials and such that the sum of the degrees is less than , Then the product is a polynomial of degree less than . If we have the evaluations and , then we can easily compute the evaluations of the product:
This only needs multiplications, whereas multiplying in coefficient form needs multiplications. So multiplying two polynomials is much easier in evaluation form.
Now lets assume that divides exactly, i.e. there is a polynomial such that , Then we can find this quotient in evaluation form
using only divisions. Again, using long division, this would be a much more difficult task taking operations in coefficient form.
We can use this trick to compute openings for Kate commitments in evaluation form, where we need to compute a polynomial quotient. So we can use this trick to compute the proof above.
Dividing when one of the points is zero
There is only one problem: What if one of the is zero, i.e. is zero somewhere on our evaluation domain? Note by definition this can only happen if is also zero, otherwise cannot divide , But if both are zero, then we are left with and can’t directly compute it in evaluation form. But do we have to go back to coefficient form and use long division? It turns out that there’s a trick to avoid this, at least in the case that we care about: Often, is a linear factor. We want to compute
Now, we introduce the polynomial . The roots of the polynomial are all the points of the domain. So we can also write in another form as
The formal derivative of is given by
This polynomial is extremely useful because we can write the Lagrange polynomials as
Now, let’s go back to the equation for , The one problem we have if we want to get this in evaluation form is the point where we encounter a division by zero; all the other points are easy to compute. But now we can replace
Because , this lets us compute
For all , we can compute directly
This allows us to efficiently allow all in evaluation form for all , including , This trick is necessary to compute in evaluation form.
In order to make this efficient, it’s best to precompute the as computing them takes time, but only needs to be performed once.
Special case roots of unity
In the case where we are using the roots of unity as our domain, we can use some tricks so that we don’t need the precomputation of . The key observation is that can be rewritten in a simpler form:
Because of this the formal derivative becomes much simpler:
And we can now easily derive :
Evaluating a polynomial in evaluation form on a point outside the domain
Now there is one thing that we can do with a polynomial in coefficient form, that does not appear to be easily feasible in evaluation form: We can evaluate it at any point. Yes, in evaluation form, we do have the values at the , so we can evaluate by just taking an item from the vector; but surely, to evaluate at a point outside the domain, we have to first convert to coefficient form?
It turns out, it is not necessary. To the rescue comes the so-called barycentric formula. Here is how to derive it using Lagrange interpolation:
The last part can be computed in just steps (assuming the precomputation of the ), which makes this formula very useful, for example for computing and without changing into coefficient form.
This formula can be simplified in the case where the domain is the roots of unity: