Dankrad Feist – Researcher at Ethereum Foundation (https://dankradfeist.de/feed.xml)
KZG Polynomial Commitments (Mandarin translation) – 2021-10-13 – https://dankradfeist.de/ethereum/2021/10/13/kate-polynomial-commitments-mandarin
<p>Original article: <a href="/ethereum/2020/06/16/kate-polynomial-commitments.html">KZG Polynomial Commitments</a></p>
<p>Translated by: Star.LI @ Trapdoor Tech</p>
<h2 id="简介">Introduction</h2>
<p>Today I want to introduce you to the polynomial commitment scheme published by Kate, Zaverucha and Goldberg <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>. This post does not require advanced mathematics or cryptography and is only intended as an introduction.</p>
<p>The scheme is usually called the Kate (pronounced <a href="https://www.cs.purdue.edu/homes/akate/howtopronounce.html">kah-tay</a>) polynomial commitment scheme. In a polynomial commitment scheme, the prover computes a commitment to a polynomial, and can then open the commitment at any point of the polynomial: the scheme lets the prover show that the value of the polynomial at a given position equals a claimed value.</p>
<p>It is called a <em>commitment</em> because once the commitment value (an elliptic curve point) has been sent to someone (the <em>verifier</em>), the prover can no longer change the polynomial it was computed from. They can only provide valid proofs for one polynomial; if they try to cheat, they will either be unable to produce a proof at all, or the proof will be rejected by the verifier.</p>
<h3 id="预备知识">Prerequisites</h3>
<p>If you are not yet familiar with finite fields, elliptic curves and pairings, I highly recommend reading <a href="https://vitalik.ca/general/2017/01/14/exploring_ecp.html">Vitalik Buterin’s blog post on elliptic curve pairings</a> first.</p>
<h3 id="默克尔树对比">Comparison to Merkle trees</h3>
<p>If you already know Merkle trees, I want to build on that knowledge and compare them to Kate commitments. A Merkle tree is what cryptographers call a <em>vector commitment</em>: using a Merkle tree of depth <script type="math/tex">d</script>, you can compute a commitment to a vector (a fixed-length list <script type="math/tex">a_0, \ldots, a_{2^d-1}</script>). Using the well-known <em>Merkle proofs</em>, you can then prove, using <script type="math/tex">d</script> hashes, that the element <script type="math/tex">a_i</script> is at position <script type="math/tex">i</script> of this vector.</p>
<p>In fact, we can use Merkle trees to build a polynomial commitment: recall that a polynomial <script type="math/tex">p(X)</script> of degree <script type="math/tex">n</script> is simply a function <script type="math/tex">p(X) = \sum_{i=0}^{n} p_i X^i</script>, where the <script type="math/tex">p_i</script> are the coefficients of the polynomial.</p>
<p>By setting <script type="math/tex">a_i=p_i</script>, we can easily commit to a polynomial of degree <script type="math/tex">n=2^{d}-1</script> by computing the Merkle root of its list of coefficients. Proving an evaluation means that the prover wants to show the verifier that <script type="math/tex">p(z)=y</script> for some value z. To do this, the prover can send the verifier all the <script type="math/tex">p_i</script>, and the verifier then checks whether p(z) equals y.</p>
<p>Of course, this is an extremely simplistic polynomial commitment, but it helps us understand what real polynomial commitments buy us. Let’s review its properties:</p>
<ol>
<li>The commitment size is a single hash (the Merkle root). A sufficiently secure cryptographic hash needs about 256 bits, i.e. 32 bytes.</li>
<li>To prove an evaluation, the prover has to send all the <script type="math/tex">p_i</script>, so the proof size is linear in the degree of the polynomial. The verifier also has to do a linear amount of work (they have to evaluate the polynomial at <script type="math/tex">z</script>, i.e. compute <script type="math/tex">p(z)=\sum_{i=0}^{n} p_i z^i</script>).</li>
<li>The scheme does not hide any part of the polynomial – the prover sends the complete polynomial, coefficient by coefficient.</li>
</ol>
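<p>The verifier’s side of this naive scheme (point 2 above) is easy to sketch in Python. This is a toy illustration with a made-up small prime modulus and example values, not production code:</p>

```python
# Naive "Merkle" polynomial commitment opening: the prover sends all
# coefficients p_i, and the verifier re-evaluates p(z) -- O(n) work.
P = 2**31 - 1  # a small prime standing in for the field modulus

def evaluate(coeffs, z, modulus=P):
    """Evaluate p(z) = sum_i coeffs[i] * z^i (low-degree-first) via Horner."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * z + c) % modulus
    return acc

coeffs = [5, 0, 2, 1]          # p(X) = 5 + 2 X^2 + X^3
z = 3
y = evaluate(coeffs, z)        # the claimed evaluation, p(3) = 50
# The verifier recomputes p(z) from all n+1 coefficients:
assert evaluate(coeffs, z) == y
```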
<p>Now let’s see how the Kate scheme fares on these points:</p>
<ol>
<li>The commitment size is one group element of a pairing-friendly elliptic curve. For the curve BLS12_381, for example, that is 48 bytes.</li>
<li>The proof size is <em>independent</em> of the size of the polynomial: it is always a single group element. Verification – likewise independent of the polynomial size – requires two group multiplications and two pairings, no matter what the degree of the polynomial is.</li>
<li>The scheme hides the polynomial most of the time – in fact, infinitely many polynomials have exactly the same Kate commitment. However, it is not perfectly hiding: if you can guess the polynomial (for example, because it is very simple, or because it comes from a small set of candidate polynomials), you can find out which polynomial was committed to.</li>
</ol>
<p>On top of that, it is possible to merge any number of evaluation proofs into a single one. These properties make the Kate scheme very attractive for zero-knowledge proof systems such as PLONK and SONIC. It is also interesting for more mundane purposes, or simply as a vector commitment, as we will see later in this article.</p>
<h2 id="椭圆曲线以及配对">Elliptic curves and pairings</h2>
<p>As mentioned under prerequisites, I highly recommend <a href="https://vitalik.ca/general/2017/01/14/exploring_ecp.html">Vitalik Buterin’s blog post on elliptic curve pairings</a>. It covers the background needed for this article: in particular, finite fields, elliptic curves and pairings.</p>
<p>Let <script type="math/tex">\mathbb G_1</script> and <script type="math/tex">\mathbb G_2</script> be two elliptic curve groups with a pairing <script type="math/tex">e: \mathbb G_1 \times \mathbb G_2 \rightarrow \mathbb G_T</script>. Let p be the order of <script type="math/tex">\mathbb G_1</script> and <script type="math/tex">\mathbb G_2</script>, and let G and H be generators of <script type="math/tex">\mathbb G_1</script> and <script type="math/tex">\mathbb G_2</script> respectively. We now define a very useful shorthand notation: for any <script type="math/tex">x \in \mathbb F_p</script>
<script type="math/tex">\displaystyle
[x]_1 = x G \in \mathbb G_1 \text{ and } [x]_2 = x H \in \mathbb G_2</script></p>
<h3 id="可信设置">Trusted setup</h3>
<p>Let’s assume we have a trusted setup, so that for some secret s, the group elements <script type="math/tex">[s^i]_1</script> and <script type="math/tex">[s^i]_2</script> are available to both prover and verifier for all <script type="math/tex">i=0, \ldots, n-1</script>.</p>
<p>One way to create such a setup: generate a random number <script type="math/tex">s</script> on an offline computer, compute all the group elements <script type="math/tex">[s^i]_x</script>, send them out over a wire (but not <script type="math/tex">s</script> itself), and then burn the computer. Of course, this is not a great solution, since you have to trust that whoever operated the computer did not leak the secret <script type="math/tex">s</script> through some other channel.</p>
<p>In practice, such setups are usually created using secure multiparty computation (MPC), where a group of computers jointly creates the group elements in such a way that no single computer knows the secret s; only by compromising the entire group could s be recovered.</p>
<p>Note that one thing is impossible here: you cannot just pick a random group element <script type="math/tex">[s]_1</script> (with <script type="math/tex">s</script> unknown) and compute the remaining group elements from it. Without knowing <script type="math/tex">s</script>, there is no way to compute <script type="math/tex">[s^2]_1</script>.</p>
<p>Now, basic elliptic curve cryptography tells us that <script type="math/tex">s</script> cannot be extracted from the group elements of the trusted setup: it is a number in the finite field <script type="math/tex">\mathbb F_p</script>, but the provers cannot find its concrete value. They can, however, perform certain computations on the given elements. For example, using elliptic curve multiplication they can easily compute <script type="math/tex">c [s^i]_1 = c s^i G = [cs^i]_1</script>, or add elliptic curve points to obtain <script type="math/tex">c [s^i]_1 + d [s^j]_1 = (c s^i + d s^j) G = [cs^i + d s^j]_1</script>. In fact, if <script type="math/tex">p(X) = \sum_{i=0}^{n} p_i X^i</script> is a polynomial, a prover can compute
<script type="math/tex">\displaystyle
[p(s)]_1 = [\sum_{i=0}^{n} p_i s^i]_1 = \sum_{i=0}^{n} p_i [s^i]_1</script></p>
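<p>In a toy model where a “group element” is represented by the bare scalar it hides (which is of course completely insecure – it only illustrates the linear algebra), computing <script type="math/tex">[p(s)]_1</script> from the published powers looks like this:</p>

```python
# Toy sketch: evaluating p at the setup secret s using only the
# published powers [s^i]. Group elements are modeled as bare scalars
# mod P -- insecure, purely illustrative.
P = 101   # toy field modulus
s = 37    # the setup secret; in reality nobody knows this value

powers = [pow(s, i, P) for i in range(8)]  # the published [s^i]

def commit(coeffs, powers, modulus=P):
    # sum_i p_i * [s^i]: only linear operations on the setup elements
    return sum(c * e for c, e in zip(coeffs, powers)) % modulus

coeffs = [3, 1, 4, 1, 5]
# The result equals p(s), even though commit() never touches s directly:
assert commit(coeffs, powers) == sum(c * pow(s, i, P) for i, c in enumerate(coeffs)) % P
```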
<p>This is quite remarkable – using the trusted setup, anyone can evaluate a polynomial at a secret point s that nobody knows. The output they obtain is not a field element but an elliptic curve point <script type="math/tex">[p(s)]_1 = p(s) G</script> – which turns out to be useful enough.</p>
<h2 id="卡特承诺">Kate commitments</h2>
<p>In the Kate commitment scheme, the element <script type="math/tex">C = [p(s)]_1</script> is the commitment to the polynomial <script type="math/tex">p(X)</script>.</p>
<p>You may now ask: could a prover, without knowing <script type="math/tex">s</script>, find another polynomial <script type="math/tex">q(X) \neq p(X)</script> with the same commitment, i.e. with <script type="math/tex">[p(s)]_1 = [q(s)]_1</script>? Let’s assume this were possible. It would mean that <script type="math/tex">[p(s) - q(s)]_1=[0]_1</script>, and thus <script type="math/tex">p(s)-q(s)=0</script>.</p>
<p>Now <script type="math/tex">r(X) = p(X)-q(X)</script> is itself a polynomial. We know that it is not identically zero, because <script type="math/tex">p(X) \neq q(X)</script>. A well-known theorem states that any nonzero polynomial of degree <script type="math/tex">n</script> can have at most <script type="math/tex">n</script> zeroes. This is because whenever <script type="math/tex">r(z)=0</script>, <script type="math/tex">r(X)</script> is divisible by the linear factor <script type="math/tex">X−z</script>; since every zero corresponds to division by a distinct linear factor, and each division reduces the degree by one, there can be at most <script type="math/tex">n</script> zeroes<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>.</p>
<p>Since the prover does not know <script type="math/tex">s</script>, the only way they could achieve <script type="math/tex">p(s)−q(s)=0</script> would be to make <script type="math/tex">p(X)−q(X)=0</script> in as many places as possible. But as we just saw, they can do that in at most <script type="math/tex">n</script> points. Their chance of success is therefore tiny, because <script type="math/tex">n</script> is much smaller than the order of the curve, <script type="math/tex">p</script>: the probability that <script type="math/tex">s</script> happens to be one of the points where <script type="math/tex">p(X)=q(X)</script> is vanishingly small. To get a feeling for how small, take the largest trusted setup currently in existence, with <script type="math/tex">n = 2^{28}</script>, and compare it with the curve order <script type="math/tex">p \approx 2^{256}</script>: an attacker who crafts a polynomial <script type="math/tex">q(X)</script> that agrees with <script type="math/tex">p(X)</script> in as many points as possible, i.e. <script type="math/tex">n=2^{28}</script> points, has a probability of <script type="math/tex">2^{28}/2^{256} = 2^{28-256} \approx 2 \cdot 10^{-69}</script> of obtaining the same commitment (p(s)=q(s)). This probability is so low that, in practice, the attack is impossible.</p>
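<p>The collision probability quoted above is a one-liner to check numerically (using the <script type="math/tex">n</script> and <script type="math/tex">p</script> from the text):</p>

```python
# Chance that the secret s lands on one of the at most n agreement points.
from fractions import Fraction

n = 2**28        # size of the largest trusted setup mentioned above
p = 2**256       # approximate curve order
prob = Fraction(n, p)

assert prob == Fraction(1, 2**228)
assert 1e-69 < float(prob) < 3e-69   # about 2 * 10^-69
```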
<h3 id="多项式相乘">Multiplying polynomials</h3>
<p>So far we have seen that we can evaluate polynomials at a secret point <script type="math/tex">s</script>, and that this lets us commit to a unique polynomial – while several polynomials share the same commitment <script type="math/tex">C=[p(s)]_1</script>, it is infeasible in practice to compute a second one (this is what cryptographers call <em>computationally binding</em>).</p>
<p>However, we still lack the ability to “open” this commitment without sending the verifier the complete polynomial. To do that, we need pairings. As described above, we can perform linear operations on the secret; for example, we can compute the commitment <script type="math/tex">[p(s)]_1</script> of <script type="math/tex">p(X)</script>, and from two commitments to <script type="math/tex">p(X)</script> and <script type="math/tex">q(X)</script> we can compute a combined commitment to <script type="math/tex">p(X)+q(X)</script>: <script type="math/tex">[p(s)]_1+[q(s)]_1=[p(s)+q(s)]_1</script>.</p>
<p>What we are still missing is the multiplication of two polynomials. If we could multiply, the nice properties of polynomials would open the door to many cool applications. While elliptic curves themselves do not allow multiplication, luckily we can solve this problem using pairings: we have</p>
<p><script type="math/tex">\displaystyle
e([a]_1, [b]_2) = e(G, H)^{(ab)} = [ab]_T</script>
Here we introduce a new notation: <script type="math/tex">[x]_T = e(G, H)^x</script>. So while we cannot multiply two committed elements <em>on the elliptic curve</em> to get an elliptic curve element committing to their product (that would be a property of so-called fully homomorphic encryption, FHE; elliptic curves are only <em>additively homomorphic</em>), we can multiply two field elements if they are committed in different curves (one in <script type="math/tex">\mathbb G1</script> and one in <script type="math/tex">\mathbb G2</script>), with the output being a <script type="math/tex">\mathbb G_T</script> element.</p>
<p>Herein lies the heart of the Kate proof. Remember what we said about linear factors: if a polynomial has a zero at <script type="math/tex">z</script>, it is divisible by <script type="math/tex">X−z</script>. The converse also holds – if a polynomial is divisible by <script type="math/tex">X−z</script>, it must have a zero at <script type="math/tex">z</script>. Being divisible by <script type="math/tex">X−z</script> means that <script type="math/tex">p(X)=(X−z)⋅q(X)</script> for some polynomial <script type="math/tex">q(X)</script>, which clearly evaluates to zero at <script type="math/tex">X=z</script>.</p>
<p>Say we want to prove that <script type="math/tex">p(z)=y</script>. We will use the polynomial <script type="math/tex">p(X)−y</script> – it clearly has a zero at <script type="math/tex">z</script>, so we can apply what we know about linear factors. Let <script type="math/tex">q(X)</script> be the polynomial <script type="math/tex">p(X)−y</script> divided by the linear factor <script type="math/tex">X−z</script>, i.e.:</p>
<script type="math/tex; mode=display">\displaystyle
q(X) = \frac{p(X)-y}{X-z}</script>
<p>This is equivalent to saying that <script type="math/tex">q(X)(X-z) = p(X)-y</script>.</p>
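<p>The division by a linear factor can be carried out with synthetic division. Here is a sketch over a toy prime field (the modulus and example values are made up); the remainder comes out to zero exactly when <script type="math/tex">p(z)=y</script>:</p>

```python
# Compute q(X) = (p(X) - y) / (X - z) by synthetic division over a toy field.
P = 101

def evaluate(coeffs, x, modulus=P):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % modulus
    return acc

def div_by_linear(coeffs, z, modulus=P):
    """Divide p(X) (coefficients low-degree first) by (X - z).
    Returns (quotient coefficients, remainder)."""
    n = len(coeffs) - 1
    q = [0] * n
    q[n - 1] = coeffs[n] % modulus
    for i in range(n - 1, 0, -1):
        q[i - 1] = (coeffs[i] + z * q[i]) % modulus
    rem = (coeffs[0] + z * q[0]) % modulus
    return q, rem

p_coeffs = [2, 0, 1]               # p(X) = X^2 + 2
z = 5
y = evaluate(p_coeffs, z)          # p(5) = 27
numer = [(p_coeffs[0] - y) % P] + p_coeffs[1:]   # p(X) - y
q, rem = div_by_linear(numer, z)
assert rem == 0                    # exact division, since p(z) == y
# Sanity check of q(X) * (X - z) == p(X) - y at another point:
t = 17
assert evaluate(q, t) * (t - z) % P == evaluate(numer, t)
```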
<h3 id="卡特证明">The Kate proof</h3>
<p>We define the Kate proof for the evaluation <script type="math/tex">p(z)=y</script> as <script type="math/tex">π=[q(s)]_1</script>. Remember that the commitment to the polynomial <script type="math/tex">p(X)</script> is <script type="math/tex">C=[p(s)]_1</script>.</p>
<p>The verifier checks this proof using the following equation:</p>
<p><script type="math/tex">\displaystyle
e(\pi,[s-z]_2) = e(C-[y]_1, H)</script>
Note that the verifier can compute <script type="math/tex">[s−z]_2</script>, because it is just a combination of the element <script type="math/tex">[s]_2</script> from the trusted setup and the point <script type="math/tex">z</script> at which the polynomial is evaluated. Likewise, the verifier knows the claimed value <script type="math/tex">y</script> of <script type="math/tex">p(z)</script>, so they can compute <script type="math/tex">[y]_1</script>. But why does this check convince the verifier that <script type="math/tex">p(z)=y</script> – or, more precisely, that the polynomial committed to by <script type="math/tex">C</script> evaluates to <script type="math/tex">y</script> at <script type="math/tex">z</script>?</p>
<p>We need to check two properties here: <em>correctness</em> and <em>soundness</em>. <em>Correctness</em> means that a prover who follows the steps we defined produces a proof that passes verification. This is usually the easy part. <em>Soundness</em> means that a prover cannot produce an “incorrect” proof – for example, they cannot convince the verifier that <script type="math/tex">p(z)=y′</script> for some <script type="math/tex">y′≠y</script>.</p>
<p>Let’s first write out the corresponding equation in the pairing group:
<script type="math/tex">\displaystyle
[q(s) \cdot (s-z)]_T = [p(s) - y]_T</script>
<em>Correctness</em> is straightforward – this is simply the equation <script type="math/tex">q(X)(X−z)=p(X)−y</script> evaluated at the random point <script type="math/tex">s</script> that nobody knows.</p>
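<p>Stripping away the elliptic curve, the pairing check is just this field identity at the secret point <script type="math/tex">s</script>. A sketch “in the clear” over a toy field (in the real scheme nobody ever sees s – the pairing performs this check on hidden values):</p>

```python
# The verification equation e(pi, [s-z]_2) = e(C - [y]_1, H) "in the clear":
# it is exactly q(s) * (s - z) == p(s) - y over the field.
P = 101
s = 37   # trusted-setup secret, visible here only because this is a toy

def evaluate(coeffs, x, modulus=P):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % modulus
    return acc

p_coeffs = [2, 0, 1]    # p(X) = X^2 + 2
q_coeffs = [5, 1]       # q(X) = X + 5 = (p(X) - 27) / (X - 5)
z, y = 5, 27            # the claimed evaluation p(5) = 27

assert evaluate(q_coeffs, s) * (s - z) % P == (evaluate(p_coeffs, s) - y) % P
```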
<p>So how do we know that the scheme is sound, i.e. that a prover cannot create fake proofs? Let’s look at this from the polynomial side first. If the prover tried to follow our procedure to construct a proof for a false value <script type="math/tex">y′</script>, they would have to divide <script type="math/tex">p(X)−y′</script> by <script type="math/tex">X−z</script>. But since <script type="math/tex">p(z)−y′</script> is not zero, the division always leaves a remainder, so the polynomial division simply does not work out. The prover therefore cannot forge a proof this way.</p>
<p>The only remaining option is to work directly in the elliptic curve group: the prover could cheat if, for some commitment <script type="math/tex">C</script>, they could compute the group element</p>
<p><script type="math/tex">\displaystyle
\pi_\text{Fake} = (C-[y']_1)^{\frac{1}{s-z}}</script>
If they could do that, they could prove anything. Intuitively, this should be hard: it would require exponentiating by something that depends on s, while s is unknown. To prove this rigorously, you need a cryptographic assumption about proofs and pairings, the so-called <script type="math/tex">q</script>-strong SDH assumption <sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>.</p>
<h3 id="多重证明">Multiproofs</h3>
<p>So far we have seen how to prove the evaluation of a polynomial at a single point. This is already quite amazing: by sending a single group element (48 bytes, for example, for BLS12_381), you can prove an evaluation, at any point, of a polynomial of arbitrary degree – say <script type="math/tex">2^{28}</script>. By comparison, in the naive scheme that uses a Merkle tree as a polynomial commitment, we would have had to send <script type="math/tex">2^{28}</script> elements – all the coefficients of the polynomial.</p>
<p>Let’s go one step further and show how to compute and prove the evaluations of a polynomial at <em>any number of points</em>, still using only a single group element. For this we first need a new concept: interpolation polynomials. Given a list of k points <script type="math/tex">(z_0, y_0), (z_1, y_1), \ldots, (z_{k-1}, y_{k-1})</script>, we can always find a polynomial of degree less than <script type="math/tex">k</script> that passes through all of them. One way to do this is Lagrange interpolation, which gives us a formula for this polynomial I(X):</p>
<p><script type="math/tex">I(X) = \sum_{i=0}^{k-1} y_i \prod_{j=0 \atop j \neq i}^{k-1} \frac{X-z_j}{z_i-z_j}</script>
Now assume that we know <script type="math/tex">p(X)</script> passes through all these points. Then the polynomial <script type="math/tex">p(X)-I(X)</script> is clearly zero at each of <script type="math/tex">z_0, z_1, \ldots, z_{k-1}</script>. This means it is divisible by all the linear factors <script type="math/tex">(X-z_0), (X-z_1), \ldots (X-z_{k-1})</script>. Combining these, we obtain the so-called zero polynomial:</p>
<p><script type="math/tex">\displaystyle
Z(X) = (X-z_0) \cdot (X-z_1) \cdots (X-z_{k-1})</script>
We can then compute the quotient</p>
<p><script type="math/tex">\displaystyle
q(X) = \frac{p(X) - I(X)}{Z(X)}</script>
Note that this division is possible because <script type="math/tex">p(X)−I(X)</script> is divisible by all the linear factors of <script type="math/tex">Z(X)</script>, and therefore by <script type="math/tex">Z(X)</script> itself.</p>
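<p>Building <script type="math/tex">Z(X)</script> and checking that it vanishes on all the <script type="math/tex">z_i</script> takes only a few lines over a toy field (illustrative modulus and points):</p>

```python
# Construct the zero polynomial Z(X) = (X - z_0)(X - z_1)... over a toy field.
P = 101

def poly_mul(a, b, m=P):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % m
    return out

def zero_poly(zs, m=P):
    Z = [1]
    for z in zs:
        Z = poly_mul(Z, [(-z) % m, 1], m)   # multiply by (X - z)
    return Z

def evaluate(coeffs, x, m=P):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % m
    return acc

zs = [1, 2, 3]
Z = zero_poly(zs)
assert all(evaluate(Z, z) == 0 for z in zs)   # Z vanishes exactly on the z_i
assert evaluate(Z, 4) != 0
```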
<p>We can now define the Kate proof for the evaluations <script type="math/tex">(z_0, y_0), (z_1, y_1), \ldots, (z_{k-1}, y_{k-1})</script>: <script type="math/tex">\pi=[q(s)]_1</script> – still just a single group element.</p>
<p>To verify this proof, the verifier likewise computes the interpolation polynomial <script type="math/tex">I(X)</script> and the zero polynomial <script type="math/tex">Z(X)</script>. Using these, they can compute <script type="math/tex">[Z(s)]_2</script> and <script type="math/tex">[I(s)]_1</script>, and then check the pairing equation:</p>
<p><script type="math/tex">\displaystyle
e(\pi,[Z(s)]_2) = e(C-[I(s)]_1, H)</script>
Writing this equation out in the pairing group, we can check that it holds just as easily as for the single-point Kate proof:</p>
<p><script type="math/tex">\displaystyle
[q(s)\cdot Z(s)]_T = [p(s)-I(s)]_T</script>
This is extremely cool: using only a single group element, you can prove any number of evaluations – even millions! That amounts to proving a huge amount of computation with just 48 bytes.</p>
<h2 id="将卡特作为矢量承诺来使用">Using Kate as a vector commitment</h2>
<p>While the Kate scheme is designed as a polynomial commitment, it is also very useful as a vector commitment. Recall that a vector commitment commits to a vector <script type="math/tex">a_0, \ldots, a_{n-1}</script> and lets you prove, for any position <script type="math/tex">i</script>, that the vector contains <script type="math/tex">a_i</script> at that position. We can reproduce this using the Kate commitment scheme: let <script type="math/tex">p(X)</script> be the polynomial with <script type="math/tex">p(i)=a_i</script> for all <script type="math/tex">i</script>. We know that such a polynomial exists, and we can compute it via Lagrange interpolation:</p>
<p><script type="math/tex">\displaystyle
p(X) = \sum_{i=0}^{n-1} a_i \prod_{j=0 \atop j \neq i}^{n-1} \frac{X-j}{i-j}</script>
Using this polynomial, we can prove any number of elements of the vector using a single group element! Note how much more efficient this is (in terms of proof size) than a Merkle tree: a Merkle proof costs <script type="math/tex">\log n</script> hashes to prove even a single element!</p>
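<p>A sketch of the interpolation step over a toy field: evaluating the unique polynomial with <script type="math/tex">p(i)=a_i</script> directly from the Lagrange formula above (vector values made up):</p>

```python
# Lagrange interpolation: evaluate the unique polynomial through the
# given points at x, over a toy prime field.
P = 101

def interpolate_eval(points, x, m=P):
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % m
                den = den * (xi - xj) % m
        total = (total + yi * num * pow(den, -1, m)) % m
    return total

a = [7, 3, 9, 4]                 # the vector to commit to
points = list(enumerate(a))      # enforce p(i) = a[i]
assert all(interpolate_eval(points, i) == ai for i, ai in enumerate(a))
```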
<h2 id="延伸阅读">Further reading</h2>
<p>We are actively exploring applications of Kate commitments in the quest for a stateless version of Ethereum. I highly recommend searching the ethresearch forum for the keyword <a href="https://ethresear.ch/search?q=kate">Kate</a> to find topics of interest.</p>
<p>Another great post is Vitalik’s <a href="https://vitalik.ca/general/2019/09/22/plonk.html">introduction to PLONK</a>, which makes heavy use of polynomial commitments; the Kate scheme is the main scheme used to implement them.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>https://www.iacr.org/archive/asiacrypt2010/6477178/6477178.pdf <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>This result is often incorrectly cited as the fundamental theorem of algebra. The fundamental theorem of algebra is in fact the converse statement, and it only holds in algebraically closed fields: over the complex numbers, every polynomial of degree n has exactly n linear factors. Unfortunately, this simpler version does not have a catchy name, even though it is arguably more fundamental than the fundamental theorem of algebra. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>https://www.cs.cmu.edu/~goyal/ibe.pdf <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
Proofs of Custody – 2021-09-30 – https://dankradfeist.de/ethereum/2021/09/30/proofs-of-custody<p><em>Thanks to Vitalik Buterin, Chih-Cheng Liang and Alex Stokes for helpful comments</em></p>
<p>A proof of custody is a construction that helps against the “lazy validator” problem. A lazy validator is a validator that, instead of doing the work they are supposed to do – for example, ensuring that some data is available (relevant for data sharding) or that some execution was performed correctly (for execution chains) – pretends that they have done it and signs the result, for example an attestation that claims the data is available anyway.</p>
<p>The proof of custody construction is a cryptoeconomic primitive that changes the game theory so that lazy validating simply isn’t an interesting strategy anymore.</p>
<h2 id="lazy-validators--the-game-theory">Lazy validators – the game theory</h2>
<p>Let’s assume there is a well-running Ethereum 2.0 chain (insert your favourite alternative PoS blockchain if you prefer). We don’t usually expect bad things – data being withheld, invalid blocks being produced – to happen. In fact, you are likely to never see them happen, because as long as the system is run by a majority of honest validators, there is no point in even attempting such an attack: it is pretty much guaranteed to fail.</p>
<p>Now assume you run a validator. This comes with different kinds of costs – obviously the staking capital, but also hardware costs, electricity and internet bandwidth, which you might pay for directly (your provider charges you per GB) or indirectly (when your validator is running, your Netflix lags). The lower you can make these costs, the more net profit you make from running your validator.</p>
<p>One of the tasks you do as a validator in sharded Eth2, is to assure the availability of shard data. Each attestation committee is assigned one blob of data to check, which is around 512 kB to 1 MB. The task of each validator is to download it and store it for around 90 days.</p>
<p>But what happens if you simply sign all attestations for shard blobs, without actually downloading the data? You would still get your full rewards, but your costs have suddenly decreased. We are assuming the network is in a good state, so your laziness isn’t going to do anything to the network immediately. Let’s say your profit from running a validator was $1 per attestation, and the cost of downloading all the blocks was $0.10 per year. Now your profit has increased to $1.10.</p>
<table>
<thead>
<tr>
<th> </th>
<th>Profit per signed attestation</th>
</tr>
</thead>
<tbody>
<tr>
<td>Honest</td>
<td>$1.00</td>
</tr>
<tr>
<td>Lazy</td>
<td>$1.10</td>
</tr>
</tbody>
</table>
<p>This problem is called the verifier’s dilemma and was introduced in <a href="https://eprint.iacr.org/2015/702.pdf">Demystifying Incentives in the Consensus Computer</a> by Luu et al.</p>
<h3 id="but-i-would-never-do-this-who-would-cheat-like-that">But I would never do this! Who would cheat like that?</h3>
<p>It often seems obvious to us that in games like this, surely you would not succumb to bribery and stay with the honest behaviour. But it’s often more subtle than that.</p>
<p>Let’s assume that after having run a validator for years, a new client comes out that claims to be 10% more cost effective. People run it and see that it works, and it seems to be safe. The way it actually does this is by not downloading the shard blocks.</p>
<p>This could even happen by accident. Someone cut some corners in the development process, everything looks normal – it’s just that the client doesn’t join the right shard subnet, and nobody caught this, because it does not cause any faults in normal operation.</p>
<p>Some people will probably run this client.</p>
<p>Something else that could happen is that a service could step in to do the downloading for you. For $0.01 per shard blob, they will download the data, store it for 90 days, and send you a message that the data is available and you can sign the attestation. How bad is this?</p>
<p>It’s also quite bad. If many people start using this service, it becomes a single point of failure. Or even worse, it could be part of an attack. If it can make more than 50% of validators vote for the availability of a shard blob, without ever publishing the blob, that would be a withholding attack.</p>
<p>As it is often the case, dishonesty can come in many disguises, so our best bet is to work on the equilibrium to make the honest strategy rational.</p>
<h2 id="a-proof-of-custody-and-an-update-to-the-game-theory">A proof of custody and an update to the game theory</h2>
<p>The proof of custody works like this: imagine we can put a “bomb” in a shard blob. If you sign this blob, you get a large penalty of $3,000 (you get slashed). You definitely don’t want to sign this blob.</p>
<p>Does that make you want to download it? That is certainly one way to avoid signing the bomb. But if anyone can detect the bomb, then someone can simply run a service that warns you before signing an attestation if it’s a bomb. So the bomb needs to be specific to an individual validator, and no one else can compute whether a shard blob is a bomb.</p>
<p>OK, now we have the essential ingredients for the proof of custody. We need</p>
<ol>
<li>An ephemeral secret, that is recomputed every custody epoch (ca. 90 days), individual to each validator, and then revealed when it has expired (so that other validators have a chance to check the proof of custody)</li>
<li>A function that takes the whole shard blob data, as well as the ephemeral key, and outputs 0 (not a bomb), or, with very small probability, 1 (this blob is a bomb)</li>
</ol>
<p>It is essential that the ephemeral secret isn’t made available to anyone else, so there are three slashing conditions:</p>
<ol>
<li>A validator can get slashed if anyone knows its current ephemeral secret</li>
<li>The ephemeral secret has to be published after the custody period, and failing to do so also leads to slashing</li>
<li>Signing a bomb leads to slashing</li>
</ol>
<p>How can we create this function? A simple construction works like this. Compute a Merkle tree of leaves <code class="highlighter-rouge">(data0, secret, data1, secret, data2, secret, ...)</code> as illustrated here:</p>
<div class="mermaid">graph TB
A[Root] -->B[Hash]
A --> B1[Hash]
B --> C[Hash]
B --> C1[Hash]
C --> D[data0]
C --> E[secret]
C1 --> D1[data1]
C1 --> E1[secret]
B1 --> C2[Hash]
B1 --> C3[Hash]
C2 --> D2[data2]
C2 --> E2[secret]
C3 --> D3[data3]
C3 --> E3[secret]
</div>
<p>Then take the logical <code class="highlighter-rouge">AND</code> of the first 10 bits of the root. This gives you a single bit that is 1 an expected 1 in 1,024 times.</p>
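<p>A minimal sketch of this custody-bit function in Python – the chunking, hash choice and leaf layout here are illustrative assumptions, not a specification:</p>

```python
# Custody bit: Merkle root over interleaved (data_i, secret) leaves,
# then AND together the first 10 bits of the root.
import hashlib

def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()

def merkle_root(leaves):
    # assumes len(leaves) is a power of two
    nodes = [h(l) for l in leaves]
    while len(nodes) > 1:
        nodes = [h(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

def custody_bit(data_chunks, secret: bytes) -> int:
    leaves = []
    for chunk in data_chunks:            # (data0, secret, data1, secret, ...)
        leaves += [chunk, secret]
    root = merkle_root(leaves)
    first10 = int.from_bytes(root, "big") >> (256 - 10)
    return 1 if first10 == 0b1111111111 else 0   # AND of the first 10 bits

chunks = [b"data0", b"data1", b"data2", b"data3"]
bit = custody_bit(chunks, b"my-ephemeral-secret")
assert bit in (0, 1)   # 1 with probability ~1/1024 over random roots
```

<p>Note how the bit cannot be computed without holding both the secret and every data chunk.</p>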
<p>This function cannot be computed without knowing both the secret and the data.</p>
<p>(Because we do want to enable secret shared validators, a lot of work has gone into optimizing this function so that it can be efficiently computed in an MPC, which a Merkle tree cannot. For this we are suggesting a construction based on a Universal Hash Function and the Legendre symbol: https://ethresear.ch/t/using-the-legendre-symbol-as-a-prf-for-the-proof-of-custody/5169)</p>
<h3 id="new-game-theory">New game theory</h3>
<p>All right, so with the proof of custody, any shard blob has a 1/1,024 chance of being a bomb, and you don’t know which one it is without downloading it.</p>
<p>The lazy validator does just fine when the blob is not a bomb. However, when it is a bomb, we see the big difference: the honest validator simply skips this attestation, which is a very minor loss and simply sets the profit for that attestation to zero. The lazy validator, however, signs it and gets slashed, making a huge loss. The payoff matrix now looks like this:</p>
<table>
<thead>
<tr>
<th> </th>
<th>Profit for non-bomb attestation</th>
<th>Profit for bomb attestation</th>
<th>Average for 1,024 attestations</th>
</tr>
</thead>
<tbody>
<tr>
<td>Honest</td>
<td>$1.00</td>
<td>$0.00</td>
<td>$1,023.00</td>
</tr>
<tr>
<td>Lazy</td>
<td>$1.10</td>
<td>$-3,000.00</td>
<td>$-1,873.60</td>
</tr>
</tbody>
</table>
<p>In the third column, we see that the expected profit for the lazy validator is now negative. Since the whole reason for being lazy was increased profits from lower costs, this means that the lazy validator is not an interesting strategy anymore.</p>
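<p>The expected values in the table can be reproduced in a few lines. Note that the lazy validator still collects the $1.10 reward on the bomb attestation before the $3,000 slash lands (amounts in cents to avoid float rounding):</p>

```python
# Payoffs over 1,024 attestations, one of which is expected to be a bomb.
REWARD_HONEST = 100    # $1.00 profit per attestation
REWARD_LAZY = 110      # $1.10 -- download costs saved
SLASH = 300_000        # $3,000 penalty for signing a bomb
N = 1024

honest = (N - 1) * REWARD_HONEST + 0    # skips the bomb: zero profit there
lazy = N * REWARD_LAZY - SLASH          # signs everything, eats the slash

assert honest == 102_300                # $1,023.00
assert lazy == -187_360                 # -$1,873.60
```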
<h2 id="proof-of-custody-for-execution">Proof of custody for execution</h2>
<p>Another task of validators will be verifying the correct execution of blocks. This means verifying that the new stateroot that is part of a block is the correct one that results from applying all the transactions. The proof of custody idea can also be applied to this: The validator will have to compute the proof of custody in the same way as described above, however the data is the <em>execution trace</em>. The execution trace is some output generated by the step by step execution of the block. It does not have to be complete in any sense; what we want from it is just two properties:</p>
<ol>
<li>It should be difficult to guess the execution trace without actually executing the block.</li>
<li>The total size of the execution trace should be large enough that simply distributing it in addition to normal blocks is unattractive.</li>
</ol>
<p>There are some easy options for doing this; for example, simply outputting every single instruction byte that the EVM executes would probably result in an execution trace of a few MB per execution block. Another option would be to use the top of the stack.</p>
<h3 id="with-fraud-proofs-do-we-still-need-the-proof-of-custody-for-execution">With fraud proofs, do we still need the proof of custody for execution?</h3>
<p>When we upgrade the execution chain to statelessness, which means that blocks can be verified without having the current state, fraud proofs become easy. (Without statelessness, they are hard: Fraud proofs always have to be included on a chain <em>different</em> from the one where the fraud happened, and thus the actual pre-state would not be available when they have to be verified.)</p>
<p>This means that it will be possible to slash a validator who has produced an invalid execution block. Furthermore we can also penalize any validator that has attested to this block. Would that mean that the proof of custody is no longer necessary?</p>
<p>It does certainly shift the balance. But even with this penalty present, lazy validation can still be a rational strategy. It would probably be a bad idea for a validator to simply sign every block without verifying execution, as an attacker only needs to sacrifice a single validator of their own to get you slashed.</p>
<p>However, you can employ the following strategy: On each new block, you wait for some small percentage of other validators to sign it before you sign it yourself. Those who sign it first are unlikely to be lazy validators, as they would be employing the same strategy. This would get you quite good protection in most situations, but at a systemic level it would still leave the chain vulnerable in extreme cases.</p>
<p>The case with fraud proofs is thus improved, but a proof of custody remains superior for ensuring that lazy validation can’t be a rational strategy.</p>
<h2 id="how-is-it-different-from-data-availability-checks">How is it different from data availability checks?</h2>
<p>I wrote a primer on data availability checks <a href="https://dankradfeist.de/ethereum/2019/12/20/data-availability-checks.html">here</a>. It looks like the proof of custody for shard blobs tries to solve a very similar problem: Ensuring that data that is committed to in shard blob headers is actually available on the network.</p>
<p>So we may wonder: Do we need both a proof of custody and data availability checks?</p>
<p>There is an important difference between the two constructions, though:</p>
<ul>
<li>Data availability checks ensure the availability of the data <em>independent of the honest majority assumption</em>. Even a powerful attacker controlling the entirety of the stake can’t trick full nodes into accepting that withheld data is available</li>
<li>In contrast, a proof of custody does not help if the majority of the stake is performing an attack. The majority can compute the proof of custody without ever releasing the data to anyone else.</li>
</ul>
<p>So in a theoretical sense, data availability checks are strictly superior to the proof of custody for shard data: they hold unconditionally, whereas the proof of custody only serves to keep rational validators honest, making an attack less likely.</p>
<p>Why do we still need a proof of custody for shard blobs? It might not necessarily be needed. There are however some practical problems with data availability checks that make it desirable to have a “first line of defence” against missing data:</p>
<p>The reason for this is that data availability checks work by excluding unavailable blocks from the fork choice rule. However, this cannot be permanent: data availability checks only ensure that <em>eventually</em>, everyone will see the same result, but not immediately.</p>
<p>The reason is that publishing a partially available block might result in some nodes seeing it as available (they receive all of their samples) and other nodes seeing it as unavailable (they miss some of their samples). Data availability checks ensure that in this situation, the data can always be reconstructed. However, this requires some node to first collect enough samples to reconstruct the data, and then re-seed the samples so everyone can see them; this process can take a few slots.</p>
<p>In order to prevent a minority attacker (with less than 1/3 of the stake) from causing such a disruption, we only want to apply data availability checks when the chain is finalized, and not immediately. In the meantime, the proof of custody can ensure that an honest majority will only ever build an available chain, where the shard data is already seeded in committees; since the committees are ready to re-seed all samples even if the original blob producer doesn’t, an attacker can’t easily force a partially available block.</p>
<p>In this construction, the proof of custody and data availability checks have two orthogonal functions:</p>
<ol>
<li>The proof of custody for shard data ensures that an honest majority of validators will only ever build a chain in which all shard data is available and well seeded across committees. A minority attacker cannot easily cause disruption to this.</li>
<li>Data availability checks will guarantee that even if the majority of stake is attacking, they will not be able to get the remaining full nodes to consider a chain with withheld data as finalized.</li>
</ol>
Just because it has a fixed supply doesn’t make it a good store of value – 2021-09-27 – https://dankradfeist.de/ethereum/2021/09/27/store-of-value-from-limited-supply<h2 id="what-we-should-really-build-is-productive-assets-and-stablecoins">What we should really build is productive assets and stablecoins</h2>
<p><em>Special thanks to David Andolfatto, Vitalik Buterin, Chih-Cheng Liang, Barnabé Monnot and Danny Ryan for comments that helped me improve this essay</em></p>
<p>I think the “store of value” narrative and the misunderstanding of what “fiat” currency really is are a huge problem undermining the whole of the cryptocurrency world. Only when we come to an honest understanding of this will we really be able to build something better.</p>
<p>Here are some core theses of what I believe and which I will try to illustrate in the full article:</p>
<ol>
<li>The “store of value” narrative doesn’t hold water. There is no such thing as a guaranteed way of transmitting value into the future, and just having an asset with a fixed supply doesn’t fix that.</li>
<li>If you want the best bet on sending the most value possible into the future, what you really need is productive assets (for the long term) and stablecoins (if you need your money in the near future).</li>
</ol>
<h2 id="why-store-of-value-does-not-exist">Why “store of value” does not exist</h2>
<p>Here is a common form of the cryptocurrency narrative: “Look at fiat currency. 1 US Dollar from 1950 had about 10 times more purchasing power than one US dollar now. It’s a scam. If you store your value in US dollars, then you are constantly losing due to inflation. This is because the central bank/government can just print more US Dollars. You should instead store value in an asset with predictable supply, such as gold or Bitcoin, which does not have this problem.”</p>
<p>The true part of this statement is that if you stored your money in USD, then you would have lost a large part of your purchasing power over the decades. That is not in question. The question is, is there another way, implied by the term “store of value”, that does not have this property? Store of value proponents claim that there is if you instead used an asset with a predictable supply. And of course, historical data backs this up to some extent: If you had used gold instead of storing your value in USD, then you would have fared better: You could have bought an ounce for $35 in 1950, and it would now be worth around $1765 (price as of June 20 2021 from <a href="https://www.bullionbypost.eu/gold-price/alltime/ounces/USD/">here</a>). Given that the Dollar is worth 10x less now due to inflation, that’s $176.50 in 1950-Dollars or a 5x increase in value.</p>
<p>But we could have done much better than this: If we put the $35 in an S&P 500 tracker in 1950, then we would now have a staggering <a href="https://www.officialdata.org/us/stocks/s-p-500/1950?amount=35&endYear=2021">$74,418.65</a>, which is a 212x increase <strong>after correcting for the 10x loss in purchasing power</strong> of the US Dollar (so 7,441.87 1950-Dollars). So clearly, this investment is a much better “store of value” than investing in gold.</p>
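<p>The comparison in the last two paragraphs is easy to redo yourself (prices and the 10x inflation factor as quoted above):</p>

```python
# Gold vs. S&P 500, 1950 -> 2021, measured in inflation-adjusted "1950 dollars".
start = 35.0             # $35 bought one ounce of gold in 1950
gold_now = 1765.0        # 2021 price of that ounce
sp500_now = 74_418.65    # 2021 value of $35 tracked in the S&P 500
inflation = 10.0         # one 2021 dollar is worth ~1/10 of a 1950 dollar

gold_real = gold_now / inflation / start      # ~5x real increase
sp500_real = sp500_now / inflation / start    # ~212x real increase

assert 5.0 < gold_real < 5.1
assert 212 < sp500_real < 213
```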
<p>Now Bitcoin has fared <em>much</em> better than both gold and the S&P 500 over the last 10 years. However, this is a very short timespan, in which Bitcoin went from an absolutely tiny niche to an asset that most people in the world have heard of and a significant minority has invested in. There is no reason to believe that this can be repeated (I don’t think it can). The historical data for gold suggests that, over long periods of time, stores of value based purely on “limited supply” do much worse than productive assets.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup></p>
<p>So why do people believe gold, or Bitcoin, would make a better store of value than just investing in productive assets like companies, real estate, etc.? There are two reasons that I can see:</p>
<ol>
<li>Stock markets clearly have a lot of volatility. So maybe they believe productive assets are a good long-term store of value, but not for the short term.</li>
<li>The people who believe in “limited supply” stores of value have an apocalyptic mindset. So they believe that in the case of a major social collapse, their stores of value will somehow fare better than more productive assets.</li>
</ol>
<p>Argument number 1 does not convince me at all. It would require their preferred store of value to have lower volatility than productive assets, which simply <a href="https://seekingalpha.com/article/4296091-gold-vs-stocks">is not borne out in reality</a>. Both gold and Bitcoin are much more volatile than holding an S&P 500 tracker fund. If you want low volatility, then you should still go for the productive assets.</p>
<p>Number 2 amounts to the belief that you can simply “send” value into the future even when society collapses. I think that’s a pretty crazy belief, because when society collapses, both the goods you could buy and the demand for the “limited supply asset” will collapse as well.</p>
<p>Of course, people think companies (and therefore the S&P 500) will probably go down, but other assets don’t fare any better:</p>
<ol>
<li>Is property a good “store of value” in a catastrophe? Property is mostly valuable because of where it is in relation to valuable economic and social activity: central Manhattan property is so valuable because it’s in a city where many want to live, while a random plot of land in the middle of nowhere usually has very little value. That value is unlikely to survive a major disaster (central Manhattan might even work out worse than property with a garden to grow your own vegetables).</li>
<li>Similarly the value attributed to gold is a social convention, albeit one that has lasted for an extremely long time. Society could decide on a new asset to value highly, which is indeed what Bitcoiners argue for. But more importantly, your gold isn’t worth anything if there’s nothing of value to buy.</li>
</ol>
<p>If we accept that value depends on a society that provides valuable goods, we have to accept that there is simply no guaranteed way to send money into the future. You might as well make real investments in productive assets.</p>
<h2 id="what-we-need--productive-assets-and-stablecoins">What we need – productive assets and stablecoins</h2>
<p>Above, I argued why I think “limited supply stores of value” (unproductive assets like gold or Bitcoin, which derive their value simply from being scarce rather than from any utility) hold no advantage over productive assets like stocks. They have the same or higher volatility, and gold, for which we have a decent amount of history, has been outperformed by productive assets in the long term. The same will probably happen to Bitcoin once it has absorbed the initial demand and arrived at a stable position like gold (other outcomes, in which it largely loses its current value, are certainly also possible). Limited-supply assets also don’t necessarily fare better in catastrophes; if that is what you’re afraid of, you might want to buy goods that are useful in a catastrophe instead.</p>
<p>This means productive assets should be the better long-term stores of value, as they are better on all dimensions.</p>
<p>But clearly the volatility that comes with them is undesirable for many applications that fiat currency is used for now: I don’t think many people would appreciate their salary fluctuating by 50% month on month; in fact the vast majority of people would struggle to pay for all their expenses if their salaries suddenly fell by 50%. Many people simply need or want much more stability than that.</p>
<p>Similarly, if you keep money around to buy a house in the near future, or run a company that keeps cash reserves to make sure they can pay their employees and suppliers, you need stability.</p>
<p>Even if we assumed that everyone suddenly started using Bitcoin, it would simply not fix this problem. Since its supply can’t be dynamically adjusted, its value would continue to be very volatile due to economic fluctuations.</p>
<p>Luckily, there are mechanisms around to create stablecoins using only volatile assets for these situations. My favourite system is the idea behind MakerDAO and DAI, which I describe in an article <a href="/ethereum/2021/09/27/stablecoins-supply-demand.html">here</a>.</p>
<h3 id="so-if-the-current-system-is-so-great-why-do-we-even-need-cryptocurrencies">So if the current system is so great why do we even need cryptocurrencies?</h3>
<p>I think we need to become more nuanced thinkers in the cryptocurrency space, and start seeing the real properties of the systems we are trying to rebuild if we want to be successful. I think fiat currencies as we know them at the moment have been tremendously successful, as long as we see them for what they are: A hedge against short-term volatility rather than maximizing value long term.</p>
<p>I believe that crypto can vastly improve the current financial system, but hopefully not mainly by providing an asset with a limited supply (which won’t solve most of our most important problems). Instead we should make sure our assets are productive to maximize long term value, and create stablecoins for applications where volatility has to be avoided. This system improves on our current financial system because:</p>
<ol>
<li>It is much more transparent – anyone can verify balance sheets and exposures, not just specialized audit firms. This is pretty important because currently, the detailed exposures of banks are not public, which means depositors simply don’t know enough about banks to make an informed decision about which ones they can trust.</li>
<li>We can make it fairer – giving everyone access at the same conditions. For example, why should banks have access to central bank accounts whereas normal people and companies don’t?</li>
<li>Governance can be improved, bringing everyone to the table when big decisions have to be made (like Quantitative Easing after the Global Financial Crisis)</li>
<li>Getting rid of the baggage (for example physical currency) and thus allowing more flexibility of the system; for example there is no technical need for inflation when all balances are electronic (though in practice, it might be required for psychological reasons or “price stickiness”)</li>
<li>And most importantly, creating a permissionless and censorship resistant system that anyone can participate in at all levels</li>
</ol>
<p>–</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Vitalik pointed out that this will overstate the case against Bitcoin somewhat, because gold supply has increased much more (ca. 3x) since 1950 than Bitcoin will over a similar period. I do not think this will make up for the massive difference in returns between gold and the S&P 500, though. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
<h1 id="on-supply-and-demand-for-stablecoins">On supply and demand for stablecoins</h1>
<p><em>2021-09-27, https://dankradfeist.de/ethereum/2021/09/27/stablecoins-supply-demand</em></p>
<p><em>Special thanks to David Andolfatto, Vitalik Buterin, Chih-Cheng Liang, Barnabé Monnot and Danny Ryan for comments that helped me improve this essay</em></p>
<p>The value of a freely tradable asset is determined by supply and demand. This obviously applies to stocks and cryptocurrencies. But it also applies to any “stablecoin” we are trying to create. It even applies to traditional fiat currencies like the US Dollar or the Euro.</p>
<p>When I talk about stablecoins here, I am referring to decentralized, collateralized stablecoins like MakerDAO’s DAI – not to USDT or USDC, where the supply/demand problem is obvious. So how does MakerDAO balance supply and demand for stablecoins?</p>
<p>And how does this help us learn how central banks do this for fiat currencies?</p>
<h3 id="how-to-create-a-stablecoin">How to create a stablecoin</h3>
<p>Let’s look at how we can create a stablecoin when the only building blocks we have are assets subject to undesirably large volatility. Luckily, we have a great example of how to do this in collateralized stablecoins, the prime example of which is <a href="https://makerdao.com/">MakerDAO</a>, the project behind the DAI stablecoin.</p>
<p>The idea behind this project is to create a token, called DAI, that tracks the value of one USD as closely as possible. Note that instead of USD, we can track any other asset as well – RAI, as an example, tracks a time-averaged version of the Ether price. I suggest that, long term, the Ethereum community should strive to create an oracle that tracks the prices of consumer goods in Ether, so that we can create a stablecoin that is independent of any currently existing fiat currency and thus truly global. But as a starting point, using USD – a denomination that most of the world intuitively understands as relatively stable – was probably a very good idea.</p>
<p>How did MakerDAO manage to create this stablecoin, without any cash reserves in the form of bank accounts in USD and only the on-chain assets, which are all highly volatile? The core idea is the so-called Collateralized Debt Position, or CDP. It’s a margin position where someone can lock up a volatile asset – for example Ether – and in return create, or “borrow”, a number of DAI. The CDP essentially splits the value of the locked up Ether into two tranches:</p>
<p><img src="/assets/cdp.png" alt="CDP" /></p>
<ol>
<li>The first tranche is the “debt tranche” – this tranche is fixed in its USD value and belongs to whoever owns the actual DAI stablecoins</li>
<li>The second tranche is the equity tranche – it belongs to the owner of the CDP and is the value that is left once the first tranche is satisfied</li>
</ol>
<p>Notice I called them “debt” and “equity” here, because that’s what we call them when companies do the same thing: When companies need capital, they can raise “debt” – typically in the form of bank loans and bonds – which is very predictable and gets preference (i.e. is paid back first from the remaining assets) when the company runs out of money. That’s why bonds (which are tradable debt) are quite stable in price: As long as the company doesn’t go bust, they will always be paid back. Equity is the value that’s left over once these debt positions are satisfied, and is traded in the form of stocks – which are much more volatile, because their value depends on the profitability of the company, not just its solvency.</p>
<p>The elegance of this system is that the equity position can absorb the volatility, so that the debt holder (which is whoever holds the DAI thus created) has a predictable value. As an illustration here, see what happens when the value of the 1 ETH that has been locked up in the above CDP fluctuates:
<img src="/assets/cdp_fluctuation.png" alt="CDP fluctuation" />
The equity holder gets a position that’s now highly volatile (and in return, if the value of ETH goes up, will get much enhanced returns). The “debt” part of the CDP stays nice and constant and is always worth 1000 USD, as long as the ETH price does not crash too rapidly.</p>
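<p>The tranche split can be illustrated with a toy calculation (the numbers and function names below are purely illustrative, not MakerDAO’s actual contract interface): the debt tranche stays fixed while the equity tranche absorbs all the price movement.</p>

```python
# Toy CDP model: collateral value is split into a fixed debt tranche
# (the DAI holders' claim) and a residual equity tranche (the CDP owner's).
def cdp_tranches(eth_locked: float, eth_price_usd: float, dai_debt: float):
    collateral_value = eth_locked * eth_price_usd
    equity = collateral_value - dai_debt  # CDP owner's residual claim
    return dai_debt, equity

# 1 ETH locked, 1000 DAI drawn against it, at three different ETH prices:
for price in (2000.0, 3000.0, 1500.0):
    debt, equity = cdp_tranches(1.0, price, 1000.0)
    print(f"ETH at ${price:.0f}: debt {debt:.0f} DAI, equity ${equity:.0f}")
```

Whatever the ETH price does, the debt tranche stays at 1000 DAI; only the equity value moves.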
<p>This last part may look scary: if the red “equity” part of the chart ever falls to the $1,000 line, the DAI debt could suddenly not be fully backed, and the value of one DAI would fall below one USD. However, MakerDAO actually liquidates CDP positions once they get too close to zero equity. This works by auctioning off the collateral to the highest bidder in DAI.</p>
<p>This means that in practice, MakerDAO can deal with extreme falls if they do not happen too rapidly; this has been tested repeatedly, for example in March 2020 when DAI held its peg despite a precipitous fall in crypto asset values.</p>
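<p>The liquidation trigger sketched above amounts to a one-line check; the 150% threshold below is a hypothetical parameter, as real MakerDAO collateral types each have their own liquidation ratio:</p>

```python
# Sketch of a liquidation condition: the collateral must stay worth at
# least LIQUIDATION_RATIO times the debt, or the position is auctioned off.
LIQUIDATION_RATIO = 1.5  # hypothetical 150% minimum collateralization

def must_liquidate(eth_locked: float, eth_price_usd: float, dai_debt: float) -> bool:
    return eth_locked * eth_price_usd < LIQUIDATION_RATIO * dai_debt

print(must_liquidate(1.0, 1400.0, 1000.0))  # True: $1400 < 150% of $1000
print(must_liquidate(1.0, 2000.0, 1000.0))  # False: comfortably collateralized
```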
<p>(This largely describes the old version of DAI, single-collateral DAI (which only accepted ETH as collateral). The current instantiation, multi-collateral DAI, differs in that it also accepts other forms of collateral (which is great), some of which are centralized stablecoins (such as USDC) which is not so good in my opinion.)</p>
<h3 id="why-we-need-to-add-interest-rates-to-this">Why we need to add interest rates to this</h3>
<p>MakerDAO has a simple mechanism to make sure the long-term expected value of DAI should be one USD: In the case of a large deviation, the governance system can trigger global settlement, which will immediately give all DAI holders their current equivalent in ETH by tapping all the CDPs that secure it. However, this event can be far in the future and thus doesn’t guarantee that the instantaneous price is exactly one USD.</p>
<p>Let us understand the goal that MakerDAO has with DAI: They want 1 DAI to always be worth 1 USD.</p>
<p>One might think: oh, but it would be OK if it sometimes is worth more than 1 USD, right? As a matter of fact, this is also bad: if it costs more than 1 USD to get 1 DAI, then MakerDAO has failed. If I can only get a DAI for 1.10 USD, then it doesn’t act as a stablecoin for me – it can suddenly fall by 10%, and I will lose that value when it goes back to its intended peg of 1 USD. It’s thus essential that the peg is kept in both directions.</p>
<p>But like any freely traded asset, the value of DAI is determined by supply and demand.</p>
<p>What does it mean for a price to be determined by supply and demand? Let’s say we’re talking about a commodity like wheat with many independent buyers and sellers. The buyers of wheat follow a certain “demand curve”: The higher the price of wheat, the lower the quantity demanded; this is intuitively easy to see: if wheat becomes really expensive I will buy rice instead of flour. If wheat becomes crazy cheap then I will substitute other foods by using more wheat or even buy a few extra bags just in case I need it later. The behaviour of many consumers in aggregate makes this a smooth curve.</p>
<p>The supply curve looks at the other side, the producers of wheat who want to sell it into the market. The suppliers are farmers who grow wheat. They make a similar decision based on the current market price. If the price is low, they won’t grow wheat, or potentially put some of it in storage to sell later at a higher price. If the price is high, they can replace other crops with wheat or even grow it on fields that aren’t currently worthwhile because the yield is lower or it’s harder to harvest.</p>
<p>Conceptually the two curves can be drawn into a graph like this:</p>
<p><img src="/assets/supply_demand.png" alt="Supply and demand" /></p>
<p>Economists traditionally put price on the <script type="math/tex">y</script>-axis (vertical) in this graph, even though, as the independent variable, it would usually go on the <script type="math/tex">x</script>-axis (horizontal).</p>
<p>There is a price at which the two curves meet. This is the equilibrium price – the expected price for the commodity if there isn’t any interference with the market. If the price is lower, not all demand can be satisfied, so producers notice they can be more profitable by raising their prices, pushing the overall price up. If the price is higher, too much supply is fighting for the few consumers willing to buy wheat, and the producers who lower their prices are the ones making a profit (or a smaller loss), as consumers turn to them. The only stable point is where the two curves meet.</p>
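<p>Numerically, the crossing point of the two curves can be found by bisection on the excess demand; the linear curves below are purely illustrative, not real market data:</p>

```python
# Toy linear supply and demand curves (illustrative only).
def demand(price: float) -> float:
    return max(0.0, 100.0 - 40.0 * price)  # buyers want less as price rises

def supply(price: float) -> float:
    return max(0.0, 20.0 + 40.0 * price)   # producers offer more as price rises

# Excess demand (demand - supply) decreases with price, so we can find
# the crossing point by bisection.
lo, hi = 0.0, 10.0
for _ in range(60):
    mid = (lo + hi) / 2.0
    if demand(mid) > supply(mid):
        lo = mid
    else:
        hi = mid

print(f"equilibrium price: {mid:.2f}")  # curves cross at 1.00 here
```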
<p>The same applies to DAI – which can be traded freely on exchanges.</p>
<p>Supply of DAI is given by those people who are happy to take a CDP position, which basically means leveraging their volatile assets, in order to create more DAI, as well as anyone already holding DAI and wanting to sell. Demand comes from those who want the stability of keeping their value in DAI.</p>
<p>These two curves don’t necessarily meet at a price of one USD per DAI.</p>
<p>As an example, if the market for Ether is very bullish and many people think it will go up, then there is probably little demand for the stability that holding DAI provides and a high demand for leveraged Ether positions. People who are very bullish on Ether would be tempted to leverage their positions to profit even more when the price increases. In this kind of environment, so many people want to take out CDPs and create DAI that there are not enough people interested in actually using all the DAI. The value of DAI would fall below the peg, which is undesirable.</p>
<p>MakerDAO can correct this by adding a positive interest rate (“savings rate”) for holding DAI, rewarding the holders and charging those who take the margin position. This makes it more attractive to hold DAI. You may think ETH is a great investment, but it’s volatile, so maybe DAI with a 5% savings interest would seem attractive. If it’s not 5%, then maybe 10% is.
At some value for this interest rate, the demand for DAI will increase enough (and the supply in form of CDP decrease enough) such that the value of DAI will return to the intended peg.</p>
<p>But the reverse is also possible – in an environment where many people prefer the stability (maybe in a “bear market” where holding Ether isn’t as attractive), a negative interest rate makes holding DAI less attractive and thus reduces the demand. On the other hand, taking out a CDP becomes more attractive when you actually get paid for it. You may be scared of taking out a $1,000 loan against your ETH, but what if you got paid 10%, or $100, per year for it?</p>
<p>So we now effectively have another dimension along which to change supply and demand for DAI – the savings interest rate. A lower (even negative) rate will decrease demand and increase supply, leading to a lower DAI price. A higher rate does the opposite and increases the price of DAI. In order to move the price to 1 USD, we just have to adjust the interest rate until the prices agree.</p>
<p>Here is a graphic that illustrates how this works:</p>
<p><img src="/assets/supply_demand_shift.png" alt="Supply and demand change with interest rate" /></p>
<p>On the left, we have supply and demand curves at an interest rate of 1%. The curves meet at a price of 0.95 USD, which is the current fair market price of DAI and thus too low. In this situation MakerDAO would need to raise the interest rate. By raising the interest to 2% (on the right), the CDPs become less attractive (shifting supply) and holding DAI becomes more attractive, thus making the curves meet at the desired price 1.00 USD.</p>
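<p>The adjustment process can be sketched as a simple feedback loop; the price-response function below is a made-up stand-in for the market’s reaction, not a model of actual DAI price dynamics:</p>

```python
# Feedback sketch: raise the savings rate until DAI trades at the peg.
# market_price is a hypothetical response curve: at 1% interest the price
# is 0.95 USD, and each extra percentage point lifts it by 0.05.
def market_price(rate: float) -> float:
    return 0.95 + 5.0 * (rate - 0.01)

rate = 0.01
while market_price(rate) < 1.00:
    rate += 0.001  # nudge the savings rate upward and re-observe

print(f"savings rate {rate:.1%} restores the peg")
```

In the real system the "observation" is the market price on exchanges, and governance (or an automatic controller, as in RAI) plays the role of the loop.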
<p>In the light of this, I am very happy that MakerDAO has, after a long time, decided to implement the ability to support <a href="https://mips.makerdao.com/mips/details/MIP20">negative interest rates</a>. They are essential when a lot of stability is demanded. In fact, not having this in the past forced the very unfortunate decision to use centralized stablecoins such as USDC to back DAI; otherwise the demand could not have been satisfied and DAI would have shot above the peg. Hopefully, long term, this will be reversed.</p>
<p>To summarize, the interest rate is a mechanism that balances the demand and supply of the stablecoin. An ideal system should simply pick a rate that equates supply and demand – this interest rate would represent the fair market price for keeping value stable. Depending on the overall economic situation, this interest rate can be either positive or negative.</p>
<h3 id="an-analogy-to-fiat-currencies">An analogy to fiat currencies</h3>
<p>“Fiat” currency is actually a huge misnomer for our state currencies. “Fiat” implies that someone just creates a large amount of (what we in the cryptocurrency ecosystem would call) tokens and – by “fiat” (Latin for “let it be done”) – tells everyone that this is now money.</p>
<p>However this is not really how fiat currency works. Fiat currencies are actually to some extent “collateralized stablecoins” as described above, with some extra complications. As many commenters have noted in the past, we should be calling them “credit currencies” instead.</p>
<p>To see this, we need to understand that traditional money consists of two different components (there are more but these two will give the idea how it works):</p>
<ul>
<li>Central bank money, which consists of reserve accounts (that banks have with the central bank) as well as all the physical money (bills, coins) in circulation; this is often denoted M0 (and can properly be called “fiat” currency)</li>
<li>Bank deposits, which is basically the money you have in your bank account, and similar liquid deposits. This is called M1.</li>
</ul>
<p>But what actually is M1 money? It’s nothing else but “debt” that your bank owes you. This debt is often created by someone taking out a loan from the bank: e.g. when you take out a mortgage, two accounts are created – one that says “the bank owes you money” and another that says “you owe the bank money” – and they cancel each other out. Your bank’s net position hasn’t changed, although it has become riskier (more leveraged) through the process. And new deposits have been created, thus enlarging the M1 quantity.</p>
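<p>The double-entry step just described can be sketched in a few lines (a toy balance sheet, not real bank accounting):</p>

```python
# Toy double-entry sketch of deposit creation: granting a loan creates a
# deposit (money the bank owes you) and a loan (money you owe the bank)
# at the same time, so the bank's net position is unchanged but M1 grows.
bank = {"deposits_owed": 0.0, "loans_outstanding": 0.0}

def grant_loan(bank: dict, amount: float) -> None:
    bank["deposits_owed"] += amount      # new M1 money appears here...
    bank["loans_outstanding"] += amount  # ...backed by the borrower's debt

grant_loan(bank, 200_000)
net = bank["loans_outstanding"] - bank["deposits_owed"]
print(f"net position: {net}, new deposits (M1): {bank['deposits_owed']}")
```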
<p>But that mortgage is backed – collateralized – by both your income and the property it’s taken out for. In effect, each loan a bank gives out is very similar to our collateralized debt position above. When you take out your mortgage for 200,000 USD, your CDP is:</p>
<ul>
<li>You are long 1 house</li>
<li>You are short 200,000 USD</li>
</ul>
<p>Now it looks much more like a CDP. While central banks and states have other tools to change supply and control inflation, this debt mechanism is a powerful constraint that can dynamically adjust the quantity while keeping the value of the currency more or less unchanged.</p>
<h3 id="so-are-negative-rates-and-inflation-not-a-scam">So are negative rates and inflation not a scam?</h3>
<p>As we have seen previously, DAI sometimes needs negative interest rates to maintain the peg. In the wake of the financial crisis of 2008, people were surprised that interest rates on bank accounts, and indeed even central bank interest rates, can be negative. But is this really that surprising?</p>
<p>Central banks have more than a single lever to adjust supply and demand for their currencies, but interest rates are still an important one. Negative interest rates send a signal to the market that rebalances the equilibrium towards lower demand for stable currency and higher supply by means of people taking debt in order to invest it in ventures.</p>
<p>Furthermore, inflation is basically a negative interest rate on physical cash (bills and coins), which is necessary because we don’t have any way of applying it directly. If all balances were electronic, we could equivalently also just apply the negative interest rates directly to the balances and not have any inflation. (This ignores price stickiness, which is another problem that probably also favors some form of inflation)</p>
<h2 id="conclusion">Conclusion</h2>
<p>MakerDAO has demonstrated that even if you only have a volatile asset, like ETH, you can build a stable currency on top. For simplicity, a peg to the US Dollar was chosen, but it doesn’t have to be a currency. Any measure of value could be used, as long as we have a way to find a reasonably objective oracle for it.</p>
<p>I don’t believe that assets that are only defined by their limited supply – such as gold or Bitcoin – are very good “stores of value”. Historically speaking, the S&P 500 has vastly outperformed gold <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> at lower volatility. I don’t think this will be different for Bitcoin and other “limited supply” assets. If what you want to do is maximize value over long timescales, productive assets (which Ethereum will be after EIP1559 and the merge) are a much better bet.</p>
<p>If instead you want stability over the short term, you need an explicit mechanism that guarantees it; stablecoins are one, and fiat has similar mechanisms. But someone will have to take the other side, and you will probably have to pay for it through lower or even negative returns. That’s the price for stability.</p>
<p>You can also do something in between, like <a href="https://reflexer.finance/">Reflexer Labs RAI</a>. What I don’t see is how gold or Bitcoin, simply by having a fixed supply, provide something superior. They don’t. They will be strictly inferior, providing lower returns at higher volatility than productive assets and the stable synthetics we can build using them. I wrote an essay about this topic: <a href="/ethereum/2021/09/27/store-of-value-from-limited-supply.html">Just because it has a fixed supply doesn’t make it a good store of value</a></p>
<p>–</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Starting in 1950, investing $35 in gold (one ounce) would have yielded $1765 (price as of June 20 2021 from <a href="https://www.bullionbypost.eu/gold-price/alltime/ounces/USD/">here</a>) vs <a href="https://www.officialdata.org/us/stocks/s-p-500/1950?amount=35&endYear=2021">$74,418.65</a> for investing in an S&P 500 tracker. Both yield positive returns even after accounting for the ca. 90% inflation of the USD, but the S&P 500 is much better at 212x real returns vs only 5x for gold. Also gold <a href="https://seekingalpha.com/article/4296091-gold-vs-stocks">is more volatile than stocks</a>. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
<h1 id="inner-product-arguments">Inner Product Arguments</h1>
<p><em>2021-07-27, https://dankradfeist.de/ethereum/2021/07/27/inner-product-arguments</em></p>
<h1 id="introduction">Introduction</h1>
<p>You might have heard of Bulletproofs: It’s a type of zero knowledge proof that is used for example by Monero, and that does not require a trusted setup. The core of this proof system is the Inner Product Argument <sup id="fnref:2"><a href="#fn:2" class="footnote">1</a></sup>, a trick that allows a prover to convince a verifier of the correctness of an “inner product”. The inner product of two vectors is the sum of their component-by-component products:</p>
<script type="math/tex; mode=display">\vec a \cdot \vec b = a_0 b_0 + a_1 b_1 + a_2 b_2 + \cdots + a_{n-1} b_{n-1}</script>
<p>where <script type="math/tex">\vec a = (a_0, a_1, \ldots, a_{n-1})</script> and <script type="math/tex">\vec b = (b_0, b_1, \ldots, b_{n-1})</script>.</p>
<p>One interesting case is where we set the vector <script type="math/tex">\vec b</script> to be the powers of some number <script type="math/tex">z</script>, i.e. <script type="math/tex">\vec b = (1, z, z^2, \ldots, z^{n-1})</script>. Then the inner product becomes the evaluation of the polynomial</p>
<script type="math/tex; mode=display">f(X) = \sum_{i=0}^{n-1} a_i X^i</script>
<p>at <script type="math/tex">z</script>.</p>
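<p>Both the inner product and its polynomial-evaluation special case are a couple of lines of code:</p>

```python
# The inner product, and the special case where b = (1, z, ..., z^(n-1)),
# which turns it into the evaluation of f(X) = sum_i a_i X^i at z.
def inner_product(a: list, b: list) -> int:
    return sum(x * y for x, y in zip(a, b))

def eval_poly(coeffs: list, z: int) -> int:
    return inner_product(coeffs, [z**i for i in range(len(coeffs))])

a = [3, 1, 4, 1]        # f(X) = 3 + X + 4X^2 + X^3
print(eval_poly(a, 2))  # 3 + 2 + 16 + 8 = 29
```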
<p>Inner Product Arguments work on <em>Pedersen Commitments</em>. I have previously written about <a href="/ethereum/2020/06/16/kate-polynomial-commitments.html">KZG commitments</a>, and Pedersen commitments are similar in that the commitment is an element of an elliptic curve. However, a difference is that they do not require a trusted setup. Here is a comparison of the KZG commitment scheme and using Pedersen commitments combined with an Inner Product Argument as a Polynomial Commitment Scheme (PCS):</p>
<table>
<thead>
<tr>
<th> </th>
<th>Pedersen+IPA</th>
<th>KZG</th>
</tr>
</thead>
<tbody>
<tr>
<td>Assumption</td>
<td>Discrete log</td>
<td>Bilinear group</td>
</tr>
<tr>
<td>Trusted setup</td>
<td>No</td>
<td>Yes</td>
</tr>
<tr>
<td>Commitment size</td>
<td>1 Group element</td>
<td>1 Group element</td>
</tr>
<tr>
<td>Proof size</td>
<td>2 log n Group elements</td>
<td>1 Group element</td>
</tr>
<tr>
<td>Verification</td>
<td>O(n) group operations</td>
<td>1 Pairing</td>
</tr>
</tbody>
</table>
<p>Basically, compared to KZG commitments, this commitment scheme is less efficient. Proofs are larger (<script type="math/tex">O(\log n)</script>), which wouldn’t be the end of the world, as logarithmic is still very small. But unfortunately, the verifier has to do a linear amount of work, so the proofs are not succinct. This makes them impractical for some applications, although in some cases this can be worked around.</p>
<ul>
<li>One example is my writeup on <a href="/ethereum/2021/06/18/pcs-multiproofs.html">multiopenings</a>. In this case, the trick is that you can aggregate many openings into a single one.</li>
<li>The Halo system <sup id="fnref:1"><a href="#fn:1" class="footnote">2</a></sup>, where the linear cost of many openings is aggregated</li>
</ul>
<p>In both of these examples, the trick is to amortize many openings. If you only want to open a single polynomial, you have to incur the full cost.</p>
<p>However, the big advantage is that Pedersen and Inner Product Arguments come with much fewer assumptions, in particular a pairing is not needed and they don’t require a trusted setup.</p>
<h1 id="pedersen-commitments">Pedersen commitments</h1>
<p>Before we can discuss Inner Product Arguments, we need to discuss the data structure that they operate on: Pedersen commitments. In order to use Pedersen commitments, we need an elliptic curve <script type="math/tex">G</script>. Let’s quickly remind ourselves what you can do in an elliptic curve (I will use additive notation because I think it is the more natural one):</p>
<ol>
<li>You can add two elliptic curve elements <script type="math/tex">g_0 \in G</script> and <script type="math/tex">g_1 \in G</script>:
<script type="math/tex">h = g_0 + g_1</script></li>
<li>You can multiply an element <script type="math/tex">g \in G</script> with a scalar <script type="math/tex">a \in \mathbb F_p</script>, where <script type="math/tex">p</script> is the curve order of <script type="math/tex">G</script> (i.e. the number of elements):
<script type="math/tex">h = a g</script></li>
</ol>
<p>There is no way to compute the “product” of two curve elements: the operation “<script type="math/tex">h * h</script>” is not defined, so you cannot compute “<script type="math/tex">h * h = a g * a g = a^2 g</script>”. Multiplying by a scalar, in contrast, is easy: <script type="math/tex">2 h = 2 a g</script>, for example.</p>
<p>Another important property is that there is no efficient algorithm to compute “discrete logarithms”. This means that given <script type="math/tex">h</script> and <script type="math/tex">g</script> with the property that <script type="math/tex">h=ag</script>, it is computationally infeasible to find <script type="math/tex">a</script> unless you already know it. We call <script type="math/tex">a</script> the discrete logarithm of <script type="math/tex">h</script> with respect to <script type="math/tex">g</script>.</p>
<p>Pedersen commitments make use of this infeasibility to construct a commitment scheme. Let’s say you have two points <script type="math/tex">g_0</script> and <script type="math/tex">g_1</script> and their discrete logarithm with respect to each other (i.e. the <script type="math/tex">x \in \mathbb F_p</script> such that <script type="math/tex">g_1 = x g_0</script>) is unknown, then we can commit to two numbers <script type="math/tex">a_0, a_1 \in \mathbb F_p</script>:</p>
<script type="math/tex; mode=display">C = a_0 g_0 + a_1 g_1</script>
<p><script type="math/tex">C</script> is an element of the elliptic curve <script type="math/tex">G</script>.</p>
<p>To reveal the commitment, the prover gives the verifier the numbers <script type="math/tex">a_0</script> and <script type="math/tex">a_1</script>. The verifier computes <script type="math/tex">C</script> and if it matches will accept.</p>
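<p>To make the scheme concrete, here is a minimal Python sketch of the commit/reveal flow. To stay self-contained it replaces the elliptic curve with a toy multiplicative subgroup of prime order, so that “adding group elements” becomes multiplication mod <script type="math/tex">P</script> and “scalar multiplication” becomes exponentiation; the parameters and helper names are mine, and far too small to be secure:</p>

```python
import random

# Toy stand-in for the elliptic curve: the order-q subgroup of squares
# mod P, where P = 2q + 1. Insecure parameters, for illustration only.
P = 2039   # group modulus (a safe prime)
q = 1019   # prime order of the subgroup

def rand_group_element():
    # A random subgroup element; nobody knows discrete-log relations
    # between independently sampled elements.
    return pow(random.randrange(2, P - 1), 2, P)

g0, g1 = rand_group_element(), rand_group_element()

def commit(a0, a1):
    # In additive curve notation this is C = a0*g0 + a1*g1.
    return pow(g0, a0, P) * pow(g1, a1, P) % P

a0, a1 = 123, 456          # the two committed numbers
C = commit(a0, a1)         # the prover publishes C

# To reveal, the prover sends (a0, a1); the verifier recomputes and compares.
assert commit(a0, a1) == C
```

<p>Finding a different opening of <code class="highlighter-rouge">C</code> would amount to computing a discrete logarithm in the group, which is exactly what the binding argument below formalizes.</p>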
<p>The central property of a commitment scheme is that it is binding. So given <script type="math/tex">C=a_0 g_0 + a_1 g_1</script>, could a cheating prover come up with <script type="math/tex">b_0, b_1 \in \mathbb F_p</script> such that the verifier will accept them, i.e. such that <script type="math/tex">C = b_0 g_0 + b_1 g_1</script> but with <script type="math/tex">b_0, b_1 \not= a_0, a_1</script>?</p>
<p>If someone can do this, then they could also find the discrete logarithm. Here is why: We know that <script type="math/tex">a_0 g_0 + a_1 g_1 = b_0 g_0 + b_1 g_1</script>, and by regrouping the terms on both sides of the equation we get</p>
<script type="math/tex; mode=display">(a_0 - b_0) g_0 = (b_1 - a_1) g_1</script>
<p>Either <script type="math/tex">a_0 - b_0</script> or <script type="math/tex">b_1 - a_1</script> have to be not equal to zero. Let’s say it’s <script type="math/tex">a_0 - b_0</script>, then we get:</p>
<script type="math/tex; mode=display">g_0 = \frac{b_1 - a_1}{a_0 - b_0} g_1 = x g_1</script>
<p>for <script type="math/tex">x = \frac{b_1 - a_1}{a_0 - b_0}</script>. Thus we’ve found <script type="math/tex">x</script>. Since we know this is a hard problem, in practice no attacker can perform this.</p>
<p>This means it’s computationally infeasible for an attacker to find alternative <script type="math/tex">b_0, b_1</script> to reveal for the commitment <script type="math/tex">C</script>. (They definitely do exist, they are just computationally infeasible to find – similar to finding a collision for a hash function).</p>
<p>We can generalize this and commit to a vector, i.e. a list of scalars <script type="math/tex">a_0, a_1, \ldots, a_{n-1} \in \mathbb F_p</script>. We just need a “basis”, i.e. an equal number of group elements that don’t have known discrete logarithms between them. Then we can compute the commitment</p>
<script type="math/tex; mode=display">C = a_0 g_0 + a_1 g_1 + a_2 g_2 + \ldots + a_{n-1} g_{n-1}</script>
<p>This gives us a vector commitment, although with quite a bad complexity: In order to reveal any element, all elements of the vector have to be revealed. But there is one redeeming property: The commitment scheme is additively homomorphic. This means that if we have another commitment <script type="math/tex">D = b_0 g_0 + b_1 g_1 + b_2 g_2 + \ldots + b_{n-1} g_{n-1}</script>, then it’s possible to just add the two commitments to get a new commitment to the sum of the two vectors <script type="math/tex">\vec a</script> and <script type="math/tex">\vec b</script>:</p>
<script type="math/tex; mode=display">C + D = (a_0 + b_0) g_0 + (a_1 + b_1) g_1 + (a_2 + b_2) g_2 + \ldots + (a_{n-1} + b_{n-1}) g_{n-1}</script>
<p>Thanks to this additive homomorphic property, this vector commitment actually turns out to be useful.</p>
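<p>The vector commitment and its additive homomorphism can be sketched the same way, again with an insecure multiplicative toy group in place of the curve (all parameters and names are illustrative only):</p>

```python
import random

# Toy prime-order group: "point addition" is multiplication mod P.
P, q = 2039, 1019   # P = 2q + 1, both prime; insecure toy parameters

def rand_basis(n):
    # n subgroup elements with no known discrete-log relations
    return [pow(random.randrange(2, P - 1), 2, P) for _ in range(n)]

def commit(basis, vec):
    # C = a0*g0 + a1*g1 + ... + a_{n-1}*g_{n-1} in additive notation
    c = 1
    for g, a in zip(basis, vec):
        c = c * pow(g, a, P) % P
    return c

n = 8
g = rand_basis(n)
a = [random.randrange(q) for _ in range(n)]
b = [random.randrange(q) for _ in range(n)]
C, D = commit(g, a), commit(g, b)

# Additive homomorphism: "C + D" commits to the summed vector a + b.
sum_vec = [(x + y) % q for x, y in zip(a, b)]
assert C * D % P == commit(g, sum_vec)
```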
<h1 id="inner-product-argument">Inner Product Argument</h1>
<p>The basic strategy of the Inner Product Argument is “divide and conquer”: Take the problem and instead of completely solving it, turn it into a smaller one of the same type. At some point, it becomes so small that you can simply reveal everything and prove that the instance is correct.</p>
<p>At each step, the problem size halves. This ensures that after <script type="math/tex">\log n</script> steps, the problem is reduced to size one, so it can be proved trivially.</p>
<p>The idea is that we want to prove that a commitment <script type="math/tex">C</script> is of the form</p>
<script type="math/tex; mode=display">C = \vec a \cdot \vec g + \vec b \cdot \vec h + (\vec a \cdot \vec b) q</script>
<p>where <script type="math/tex">\vec g = (g_0, g_1, \ldots, g_{n-1})</script> and <script type="math/tex">\vec h = (h_0, h_1, \ldots, h_{n-1})</script> as well as <script type="math/tex">q</script> are our “basis”, i.e. they are group elements in <script type="math/tex">G</script> and none of their discrete logarithms with respect to each other are known. We also introduced the new notation <script type="math/tex">\vec a \cdot \vec g</script> for a product between a vector of scalars (<script type="math/tex">\vec a</script>) and another vector of group elements (<script type="math/tex">\vec g</script>), and it is defined as</p>
<script type="math/tex; mode=display">\vec a \cdot \vec g = a_0 g_0 + a_1 g_1 + \cdots + a_{n-1} g_{n-1}</script>
<p>So essentially, we are proving that <script type="math/tex">C</script> is a commitment to</p>
<ul>
<li>a vector <script type="math/tex">\vec a</script> with basis <script type="math/tex">\vec g</script></li>
<li>a vector <script type="math/tex">\vec b</script> with basis <script type="math/tex">\vec h</script> and</li>
<li>their inner product <script type="math/tex">\vec a \cdot \vec b</script> with respect to the basis <script type="math/tex">q</script>.</li>
</ul>
<p>This in itself does not seem very useful – in most applications we want the verifier to know <script type="math/tex">\vec a \cdot \vec b</script>, and not just have it hidden in some commitment. But this can be remedied with a small trick which I will come to below.</p>
<h2 id="the-argument">The argument</h2>
<p>We want the prover to convince the verifier that <script type="math/tex">C</script> is of the form <script type="math/tex">C = \vec a \cdot \vec g + \vec b \cdot \vec h + (\vec a \cdot \vec b) q</script>. As I mentioned before, instead of doing this outright, we will only reduce the problem by computing another commitment <script type="math/tex">C'</script> in such a way that if the property holds for <script type="math/tex">C'</script>, then it also holds for <script type="math/tex">C</script>.</p>
<p>In order to do this, the prover and the verifier play a little game. The prover commits to certain properties, after which the verifier sends a challenge, which leads to the next commitment <script type="math/tex">C'</script>. Describing it as a game does not mean the proof has to be interactive though: The Fiat-Shamir construction allows us to turn interactive proofs into non-interactive ones, by replacing the challenge with a collision-resistant hash function of the commitments.</p>
<h3 id="statement-to-prove">Statement to prove</h3>
<p>The commitment <script type="math/tex">C</script> has the form <script type="math/tex">C = \vec a \cdot \vec g + \vec b \cdot \vec h + (\vec a \cdot \vec b) q</script> with respect to the basis given by <script type="math/tex">\vec g, \vec h, q</script>. We call the fact that <script type="math/tex">C</script> has this form the “Inner Product Property”.</p>
<h3 id="reduction-step">Reduction step</h3>
<p>Let <script type="math/tex">m = \frac{n}{2}</script></p>
<p>The prover computes</p>
<script type="math/tex; mode=display">z_L = a_m b_0 + a_{m+1} b_1 + \cdots + a_{n-1} b_{m-1} = \vec a_R \cdot \vec b_L \\
z_R = a_0 b_m + a_{1} b_{m+1} + \cdots + a_{m-1} b_{n-1} = \vec a_L \cdot \vec b_R</script>
<p>where we’ve defined <script type="math/tex">\vec a_L</script> as the “left half” of the vector <script type="math/tex">\vec a</script> and <script type="math/tex">\vec a_R</script> the “right half” and analogously for <script type="math/tex">\vec b</script>.</p>
<p>Then the prover computes the following commitments:</p>
<script type="math/tex; mode=display">C_L = \vec a_R \cdot \vec g_L + \vec b_L \cdot \vec h_R + z_L q \\
C_R = \vec a_L \cdot \vec g_R + \vec b_R \cdot \vec h_L + z_R q \\</script>
<p>and send them to the verifier. Then the verifier sends the challenge <script type="math/tex">x \in \mathbb F_p</script> (when using the Fiat-Shamir construction to make this non-interactive, this means that <script type="math/tex">x</script> would be the hash of <script type="math/tex">C_L</script> and <script type="math/tex">C_R</script>). The prover uses this to compute the updated vectors</p>
<script type="math/tex; mode=display">\vec a' = \vec a_L + x \vec a_R \\
\vec b' = \vec b_L + x^{-1} \vec b_R</script>
<p>which have half the length.</p>
<p>Now the verifier computes the new commitment:</p>
<script type="math/tex; mode=display">C' = x C_L + C + x^{-1} C_R</script>
<p>as well as the updated basis</p>
<script type="math/tex; mode=display">\vec g' = \vec g_L + x^{-1} \vec g_R \\
\vec h' = \vec h_L + x \vec h_R</script>
<p>Now, <em>if</em> the new commitment <script type="math/tex">C'</script> has the property that it is of the form <script type="math/tex">C' = \vec a' \cdot \vec g' +\vec b' \cdot \vec h' + \vec a' \cdot \vec b' q</script>, then the commitment <script type="math/tex">C</script> fulfills the original claim.</p>
<p>All the vectors have halved in size – so we have achieved something. From here we replace <script type="math/tex">C:=C'</script>, <script type="math/tex">\vec g := \vec g'</script> and <script type="math/tex">\vec h := \vec h'</script> and repeat this step.</p>
<p>I will below go through the maths on why this works, but Vitalik also made a nice <a href="https://twitter.com/VitalikButerin/status/1371844878968176647">visual representation</a> that I recommend to get an intuition.</p>
<h3 id="final-step">Final step</h3>
<p>When we repeat the step above, we will reduce <script type="math/tex">n</script> by a factor of two each time. At some point, we will encounter <script type="math/tex">n=1</script>. At this point we don’t repeat the step anymore. Instead the prover will send <script type="math/tex">\vec a</script> and <script type="math/tex">\vec b</script>, which in fact are now only a single scalar each. Then the verifier can simply compute</p>
<script type="math/tex; mode=display">D = a g + b h + a b q</script>
<p>and accept the statement if this is indeed equal to <script type="math/tex">C</script>, or reject if it is not.</p>
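<p>Putting the reduction and final steps together, the whole argument can be sketched in Python. The elliptic curve is again modelled by an insecure toy group of prime order (group addition is multiplication mod <script type="math/tex">P</script>, scalar multiplication is exponentiation), and the script plays both the honest prover and the verifier, checking that the final opening matches:</p>

```python
import random

# Insecure toy group of prime order FQ standing in for the elliptic curve:
# "adding group elements" is multiplication mod P, "a*g" is pow(g, a, P).
P, FQ = 2039, 1019                     # P = 2*FQ + 1, both prime

def rand_elt(): return pow(random.randrange(2, P - 1), 2, P)
def finv(x): return pow(x, FQ - 2, FQ) # inverse in the scalar field
def inner(u, v): return sum(x * y for x, y in zip(u, v)) % FQ

def msm(basis, scalars):               # multiscalar multiplication
    c = 1
    for g, s in zip(basis, scalars):
        c = c * pow(g, s, P) % P
    return c

n = 8                                  # vector length, a power of two
g = [rand_elt() for _ in range(n)]
h = [rand_elt() for _ in range(n)]
Q = rand_elt()                         # the basis element called q in the text
a = [random.randrange(FQ) for _ in range(n)]
b = [random.randrange(FQ) for _ in range(n)]

# Initial commitment C = a.g + b.h + (a.b) q
C = msm(g, a) * msm(h, b) % P * pow(Q, inner(a, b), P) % P

while len(a) > 1:                      # reduction rounds, halving each time
    m = len(a) // 2
    aL, aR, bL, bR = a[:m], a[m:], b[:m], b[m:]
    gL, gR, hL, hR = g[:m], g[m:], h[:m], h[m:]
    zL, zR = inner(aR, bL), inner(aL, bR)
    CL = msm(gL, aR) * msm(hR, bL) % P * pow(Q, zL, P) % P
    CR = msm(gR, aL) * msm(hL, bR) % P * pow(Q, zR, P) % P
    x = random.randrange(1, FQ)        # the verifier's challenge
    xi = finv(x)
    a = [(l + x * r) % FQ for l, r in zip(aL, aR)]
    b = [(l + xi * r) % FQ for l, r in zip(bL, bR)]
    g = [l * pow(r, xi, P) % P for l, r in zip(gL, gR)]
    h = [l * pow(r, x, P) % P for l, r in zip(hL, hR)]
    C = pow(CL, x, P) * C % P * pow(CR, xi, P) % P  # C' = x CL + C + 1/x CR

# Final step: everything is a single element, the verifier checks directly.
D = pow(g[0], a[0], P) * pow(h[0], b[0], P) % P * pow(Q, a[0] * b[0], P) % P
assert D == C
```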
<h3 id="correctness-and-soundness">Correctness and soundness</h3>
<p>Above I claimed that if <script type="math/tex">C'</script> has the desired form, then it follows that <script type="math/tex">C</script> also has it. I now want to show why this is the case. In order to do this, we need to look at two things:</p>
<ul>
<li><em>Correctness</em> – i.e. given a prover who follows the protocol, can they always convince the verifier that the statement is correct; and</li>
<li><em>Soundness</em> – i.e. a dishonest prover cannot convince the verifier of an incorrect statement, except with a very small probability.</li>
</ul>
<p>Let’s start with correctness. This assumes that the prover is doing everything according to the protocol. Since the prover is following the protocol, we know that <script type="math/tex">C = \vec a \cdot \vec g + \vec b \cdot \vec h + (\vec a \cdot \vec b) q</script> with respect to the basis given by <script type="math/tex">\vec g, \vec h, q</script>. We need to show that then <script type="math/tex">C'= \vec a' \cdot \vec g' +\vec b' \cdot \vec h' + \vec a' \cdot \vec b' q</script>.</p>
<p>The verifier computes <script type="math/tex">C' = x C_L + C + x^{-1} C_R</script>.</p>
<script type="math/tex; mode=display">C' = x C_L + C + x^{-1} C_R \\
= x ( \vec a_R \cdot \vec g_L + \vec b_L \cdot \vec h_R + z_L q) \\
+ \vec a_L \cdot \vec g_L + \vec a_R \cdot \vec g_R + \vec b_L \cdot \vec h_L + \vec b_R \cdot \vec h_R + \vec a \cdot \vec b q \\
+ x^{-1} (\vec a_L \cdot \vec g_R + \vec b_R \cdot \vec h_L + z_R q)\\
= (x \vec a_R + \vec a_L)\cdot(\vec g_L + x^{-1} \vec g_R) \\
+ (\vec b_L + x^{-1} \vec b_R)\cdot(\vec h_L + x \vec h_R) \\
+ (x z_L + \vec a \cdot \vec b + x^{-1} z_R) q \\
= (x \vec a_R + \vec a_L)\cdot \vec g' + (\vec b_L + x^{-1} \vec b_R)\cdot \vec h' + (x z_L + \vec a \cdot \vec b + x^{-1} z_R) q</script>
<p>So in order for the commitment to have the Inner Product Property, we need to verify that <script type="math/tex">(x \vec a_R + \vec a_L) \cdot (\vec b_L + x^{-1} \vec b_R) = x z_L + \vec a \cdot \vec b + x^{-1} z_R</script>. This is true because</p>
<script type="math/tex; mode=display">(x \vec a_R + \vec a_L) \cdot (\vec b_L + x^{-1} \vec b_R) \\
= x \vec a_R \cdot \vec b_L + \vec a_L \cdot \vec b_L + \vec a_R \cdot \vec b_R + x^{-1} \vec a_L \cdot \vec b_R \\
= x z_L + \vec a \cdot \vec b + x^{-1} z_R</script>
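<p>This identity is easy to check numerically over a small prime field (toy modulus and names of my choosing, illustration only):</p>

```python
import random

q = 1019                               # a small prime field, illustration only
def inner(u, v): return sum(x * y for x, y in zip(u, v)) % q

n = 8
m = n // 2
a = [random.randrange(q) for _ in range(n)]
b = [random.randrange(q) for _ in range(n)]
aL, aR, bL, bR = a[:m], a[m:], b[:m], b[m:]
zL, zR = inner(aR, bL), inner(aL, bR)

x = random.randrange(1, q)
xi = pow(x, q - 2, q)                  # x^{-1} in the field

# (x a_R + a_L) . (b_L + x^{-1} b_R)  ==  x z_L + a.b + x^{-1} z_R
lhs = inner([(l + x * r) % q for l, r in zip(aL, aR)],
            [(l + xi * r) % q for l, r in zip(bL, bR)])
rhs = (x * zL + inner(a, b) + xi * zR) % q
assert lhs == rhs
```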
<p>This concludes the proof of correctness. Now in order to prove soundness, we need the property that a prover can’t start with a commitment <script type="math/tex">C</script> that does not fulfill the Inner Product Property and end up with a <script type="math/tex">C'</script> that does by going through the reduction step.</p>
<p>So let’s assume that the prover committed to <script type="math/tex">C=\vec a \cdot \vec g + \vec b \cdot \vec h + r q</script> for some <script type="math/tex">r \not= \vec a \cdot \vec b</script>. If we go through the same process as before, we find</p>
<script type="math/tex; mode=display">C' = (x \vec a_R + \vec a_L)\cdot \vec g' + (\vec b_L + x^{-1} \vec b_R)\cdot \vec h' + (x z_L + r + x^{-1} z_R) q</script>
<p>So now let’s assume that the prover managed to cheat, and thus <script type="math/tex">C'</script> fulfills the Inner Product Property. That means that</p>
<script type="math/tex; mode=display">(x \vec a_R + \vec a_L) \cdot (\vec b_L + x^{-1} \vec b_R) = x z_L + r + x^{-1} z_R</script>
<p>Expanding the left hand side, we get</p>
<script type="math/tex; mode=display">x \vec a_R \cdot \vec b_L + \vec a \cdot \vec b + x^{-1} \vec a_L \cdot \vec b_R = x z_L + r + x^{-1} z_R</script>
<p>Note that the prover can choose <script type="math/tex">z_L</script> and <script type="math/tex">z_R</script> freely, so we cannot assume that they will be according to the above definitions.</p>
<p>Multiplying by <script type="math/tex">x</script> and moving everything to one side we get a quadratic equation in <script type="math/tex">x</script>:</p>
<script type="math/tex; mode=display">x^2 ( \vec a_R \cdot \vec b_L - z_L) + x (\vec a \cdot \vec b - r) + (\vec a_L \cdot \vec b_R - z_R ) = 0</script>
<p>Unless all the terms are zero, this equation has at most two solutions <script type="math/tex">x \in \mathbb F_p</script>. But the verifier chooses <script type="math/tex">x</script> after the prover has already committed to their values <script type="math/tex">r</script>, <script type="math/tex">z_L</script> and <script type="math/tex">z_R</script>. The probability that the prover can successfully cheat is thus extremely small; we typically choose the field <script type="math/tex">\mathbb F_p</script> to be of size ca. <script type="math/tex">2^{256}</script>, so the probability that the verifier chooses a value for <script type="math/tex">x</script> such that this equation holds, when the values were not chosen according to the protocol, is vanishingly small.</p>
<p>This concludes the soundness proof.</p>
<h2 id="only-compute-basis-changes-at-the-end">Only compute basis changes at the end</h2>
<p>The verifier has to do two things each round: Compute the challenge <script type="math/tex">x</script> and compute the updated bases <script type="math/tex">\vec g'</script> and <script type="math/tex">\vec h'</script>. However, updating the bases at every round is inefficient. Instead, the verifier can simply keep track of the challenge values <script type="math/tex">x_1</script>, <script type="math/tex">x_2</script>, up to <script type="math/tex">x_{\ell}</script> that they will encounter during the <script type="math/tex">\ell</script> rounds.</p>
<p>Let’s call the basis after round <script type="math/tex">k</script> <script type="math/tex">\vec g_k, \vec h_k</script>. The elements <script type="math/tex">g_\ell</script> and <script type="math/tex">h_\ell</script> are single group elements (vectors of length one) because we end the protocol once our vectors have reached length one. Computing <script type="math/tex">g_\ell</script> from <script type="math/tex">\vec g_0</script> is a multiscalar multiplication (MSM) of length <script type="math/tex">n</script>. The scalar factors for <script type="math/tex">\vec g_0</script> are the coefficients of the polynomial</p>
<script type="math/tex; mode=display">f_g(X) = \prod_{j=0}^{\ell-1} \left(1+x^{-1}_{\ell-j} X^{2^{j}}\right)</script>
<p>and the scalar factors for <script type="math/tex">\vec h_0</script> are given by</p>
<script type="math/tex; mode=display">f_h(X) = \prod_{j=0}^{\ell-1} \left(1+x_{\ell-j} X^{2^{j}}\right)</script>
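<p>As a sanity check on these formulas (assuming the product index runs over j = 0 to ℓ-1), the following sketch folds a toy basis round by round and then confirms that a single MSM of the original basis with the coefficients of f_g reproduces the result. The group is again an insecure multiplicative toy group standing in for the curve:</p>

```python
import random

# Insecure toy group: multiplication mod P plays the role of adding
# group elements; exponentiation plays the role of scalar multiplication.
P, q = 2039, 1019
def rand_elt(): return pow(random.randrange(2, P - 1), 2, P)
def inv(x): return pow(x, q - 2, q)

ell = 3
n = 1 << ell
basis = [rand_elt() for _ in range(n)]             # the initial basis g_0
xs = [random.randrange(1, q) for _ in range(ell)]  # challenges x_1 .. x_ell

# Naive approach: fold the basis at every round, g' = g_L + x^{-1} g_R.
g = list(basis)
for x in xs:
    m = len(g) // 2
    g = [g[i] * pow(g[m + i], inv(x), P) % P for i in range(m)]

# Coefficients of f_g(X) = prod_j (1 + x_{ell-j}^{-1} X^{2^j}),
# built up by multiplying in one sparse binomial factor at a time.
coef = [0] * n
coef[0] = 1
for j in range(ell):
    c = inv(xs[ell - 1 - j])
    for i in range(1 << j):
        coef[i + (1 << j)] = coef[i] * c % q

# A single MSM with these coefficients reproduces the folded element.
msm = 1
for gi, ci in zip(basis, coef):
    msm = msm * pow(gi, ci, P) % P
assert msm == g[0]
```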
<h1 id="using-inner-product-arguments-to-evaluate-polynomials">Using Inner Product Arguments to evaluate polynomials</h1>
<p>For our main application – evaluating a polynomial defined by <script type="math/tex">f(x) = \sum_{i=0}^{n-1} a_i x^i</script> at a point <script type="math/tex">z</script> – we want to make some small additions to this protocol.</p>
<ul>
<li>Most importantly, we want to know the result <script type="math/tex">f(z) = \vec a \cdot \vec b</script>, and not just that <script type="math/tex">C</script> has the “Inner Product Property”</li>
<li><script type="math/tex">\vec b = (1, z, z^2, ..., z^{n-1})</script> is known to the verifier. We can thus make things a bit easier by removing it from the commitment</li>
</ul>
<h2 id="how-to-construct-the-commitment">How to construct the commitment</h2>
<p>If we want to verify a polynomial evaluation for the polynomial <script type="math/tex">f(x) = \sum_{i=0}^{n-1} a_i x^i</script>, then we are typically working from a commitment <script type="math/tex">F = \vec a \cdot \vec g</script>. The prover would send the verifier the evaluation <script type="math/tex">y=f(z)</script>.</p>
<p>So it seems like the verifier can just compute the initial commitment <script type="math/tex">C=\vec a \cdot \vec g + \vec b \cdot \vec h + \vec a \cdot \vec b q = F + \vec b \cdot \vec h + f(z) q</script>, since they know <script type="math/tex">\vec b = (1, z, z^2, ..., z^{n-1})</script>, and start the protocol.</p>
<p>But not so fast. In most cases, <script type="math/tex">F</script> will be a commitment that is generated by the prover. A malicious prover could cheat by, for example, committing to <script type="math/tex">F = \vec a \cdot \vec g + tq</script>. In this case, they would be able to prove that <script type="math/tex">f(z) = y - t</script>, because they have effectively shifted the result.</p>
<p>To prevent this, we need to make a small change to the protocol. After receiving the commitment <script type="math/tex">F</script> and the evaluation <script type="math/tex">y</script>, the verifier generates a random scalar <script type="math/tex">w</script> and rescales the basis <script type="math/tex">q:=wq</script>. Afterwards the protocol can proceed as usual. Because the prover can’t predict what <script type="math/tex">w</script> is going to be, they can’t succeed (except with very small probability) at manipulating the result to be something other than <script type="math/tex">f(z)</script>.</p>
<p>Note that we also need to stop the prover from manipulating the vector <script type="math/tex">\vec b</script> if what we want is a generic inner product – but for a polynomial evaluation, we can simply get rid of that part altogether, so I won’t go into the details.</p>
<h2 id="how-to-get-rid-of-the-second-vector">How to get rid of the second vector</h2>
<p>Note that the verifier knows the vector <script type="math/tex">\vec b = (1, z, z^2, ..., z^{n-1})</script> if what we want is to compute a polynomial evaluation. Given the challenges <script type="math/tex">x_1, x_2, \ldots, x_\ell</script> they can simply compute the final result <script type="math/tex">b_\ell</script> using the same technique as demonstrated in “Only compute basis changes at the end”.</p>
<p>We can thus remove the second vector from all commitments and simply compute <script type="math/tex">b_\ell</script>. This means the verifier has to be able to compute the final version <script type="math/tex">b_\ell</script> from the initial vector <script type="math/tex">\vec b_0 = (1, z, z^2, ..., z^{n-1})</script>. Since the folding process for <script type="math/tex">\vec b</script> is the same as that for the basis vector <script type="math/tex">\vec g</script>, the coefficients of the previously defined polynomial <script type="math/tex">f_g</script> will define the linear combination, in other words <script type="math/tex">b_\ell=f_g(z)</script>.</p>
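<p>The claim that b_ℓ = f_g(z) can be verified in a few lines over a small prime field (toy modulus and names of my choosing):</p>

```python
import random

q = 1019                                # small prime field, illustration only
def inv(x): return pow(x, q - 2, q)

ell, z = 3, 7
n = 1 << ell
b = [pow(z, i, q) for i in range(n)]    # b_0 = (1, z, ..., z^{n-1})
xs = [random.randrange(1, q) for _ in range(ell)]  # challenges x_1 .. x_ell

# Fold b exactly like the basis g is folded: b' = b_L + x^{-1} b_R.
for x in xs:
    m = len(b) // 2
    b = [(b[i] + inv(x) * b[m + i]) % q for i in range(m)]

# Evaluate f_g(z) = prod_j (1 + x_{ell-j}^{-1} z^{2^j}) directly.
fg = 1
for j in range(ell):
    fg = fg * (1 + inv(xs[ell - 1 - j]) * pow(z, 1 << j, q)) % q

assert b[0] == fg                       # the folded scalar is f_g(z)
```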
<h2 id="creating-an-ipa-for-a-polynomial-in-coefficient-form">Creating an IPA for a polynomial in coefficient form</h2>
<p>So far, we have used an Inner Product Argument to evaluate a polynomial that is committed to by its coefficients, which are the <script type="math/tex">f_i</script> for a polynomial defined by <script type="math/tex">f(X) = \sum_{i=0}^{n-1} f_i X^i</script>. However, often we want to work with a polynomial that is defined using its evaluations on a domain <script type="math/tex">x_0, x_1, \ldots, x_{n-1}</script>. Since any polynomial of degree less than <script type="math/tex">n</script> is uniquely defined by the evaluations <script type="math/tex">f(x_0), f(x_1), \ldots, f(x_{n-1})</script>, these two representations are completely equivalent. However, transforming between the two can be computationally expensive: it costs <script type="math/tex">O(n \log n)</script> operations if the domain admits an efficient Fast Fourier Transform, and otherwise it’s <script type="math/tex">O(n^2)</script>.</p>
<p>To avoid this cost, we try to simply never change to coefficient form. This can be done by changing the commitment to <script type="math/tex">f</script> by committing to the evaluations instead of the coefficients:</p>
<script type="math/tex; mode=display">C = f(x_0) g_0 + f(x_1) g_1 + \cdots + f(x_{n-1}) g_{n-1}</script>
<p>This means that our <script type="math/tex">\vec a</script> vector in the IPA is now given by
<script type="math/tex">\vec a = (f(x_0), f(x_1), \ldots, f(x_{n-1}))</script></p>
<p>The <a href="/ethereum/2021/06/18/pcs-multiproofs.html#evaluating-a-polynomial-in-evaluation-form-on-a-point-outside-the-domain">barycentric formula</a> allows us now to compute an IPA to evaluate a polynomial using this new commitment. It says that</p>
<script type="math/tex; mode=display">f(z) = A(z)\sum_{i=0}^{n-1} \frac{f(x_i)}{A'(x_i)} \frac{1}{z-x_i}</script>
<p>If we choose the vector <script type="math/tex">\vec b</script> to be</p>
<script type="math/tex; mode=display">b_i = \frac{A(z)}{A'(x_i)} \frac{1}{z-x_i}</script>
<p>we get that <script type="math/tex">\vec a \cdot \vec b = f(z)</script>, and thus an IPA with this vector can be used to prove the evaluation of a polynomial which is itself in evaluation form. Other than this, the strategy is exactly the same.</p>
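<p>A short sketch over a toy prime field computes the vector b from these barycentric weights and confirms that the inner product with the evaluations equals f(z) (field size, domain and names chosen only for illustration):</p>

```python
import random

q = 1019                                # small prime field, illustration only
def inv(x): return pow(x, q - 2, q)

n = 8
domain = list(range(1, n + 1))          # distinct points x_0, ..., x_{n-1}
coeffs = [random.randrange(q) for _ in range(n)]

def f(x):                               # the committed polynomial
    return sum(c * pow(x, i, q) for i, c in enumerate(coeffs)) % q

a = [f(xi) for xi in domain]            # f in evaluation form
z = 123                                 # evaluation point outside the domain

A_z = 1                                 # A(z) = prod_i (z - x_i)
for xi in domain:
    A_z = A_z * (z - xi) % q

def A_prime(xi):                        # A'(x_i) = prod_{j != i} (x_i - x_j)
    p = 1
    for xj in domain:
        if xj != xi:
            p = p * (xi - xj) % q
    return p

# b_i = A(z) / A'(x_i) * 1 / (z - x_i)
b = [A_z * inv(A_prime(xi)) % q * inv(z - xi) % q for xi in domain]

# The inner product of the evaluations with b is exactly f(z).
assert sum(ai * bi for ai, bi in zip(a, b)) % q == f(z)
```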
<div class="footnotes">
<ol>
<li id="fn:2">
<p>Bootle, Cerulli, Chaidos, Groth, Petit: <a href="https://eprint.iacr.org/2016/263.pdf">Efficient Zero-Knowledge Arguments for Arithmetic Circuits in the Discrete Log Setting</a> <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:1">
<p>Bowe, Grigg, Hopwood: <a href="https://eprint.iacr.org/2019/1021.pdf">Recursive Proof Composition without a Trusted Setup</a> <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
<h1 id="verkle-trie-for-eth1-state">Verkle trie for Eth1 state</h1>
<p><em>Published 2021-06-18 at <a href="https://dankradfeist.de/ethereum/2021/06/18/verkle-trie-for-eth1">https://dankradfeist.de/ethereum/2021/06/18/verkle-trie-for-eth1</a></em></p>
<p>This post is a quick summary on how verkle tries work and how they can be used in order to make Eth1 stateless. Note that this post is written with the KZG commitment scheme in mind as it is easy to understand and quite popular, but this can easily be replaced by any other “additively homomorphic” commitment scheme, meaning that it should be possible to compute the commitment to the sum of two polynomials by adding the two commitments.</p>
<h2 id="using-kzg-as-a-vector-commitment">Using KZG as a Vector commitment</h2>
<p>The <a href="/ethereum/2020/06/16/kate-polynomial-commitments.html">KZG (Kate) polynomial commitment scheme</a> allows a prover to commit to a polynomial <script type="math/tex">f(x)</script> via a single elliptic curve group element <script type="math/tex">C = [f(s)]_1</script> (for notation see the linked post).
We can then open this commitment at any point <script type="math/tex">z</script> by giving the verifier the value <script type="math/tex">y=f(z)</script> as well as a group element <script type="math/tex">\pi = [(f(s) - y)/(s-z)]_1</script>, and this proof of the correctness of the value <script type="math/tex">y</script> can be checked using a pairing equation.</p>
<p>A vector commitment is a commitment scheme that takes as an input <script type="math/tex">d</script> different values <script type="math/tex">v_0, v_1, \ldots, v_{d-1}</script> and produces a commitment <script type="math/tex">C</script> that can be opened at any of these values. As an example, a Merkle tree is a vector commitment, with the property that opening at the <script type="math/tex">i</script>-th value requires <script type="math/tex">\log d</script> hashes as a proof.</p>
<p>Let <script type="math/tex">\omega</script> be a <script type="math/tex">d</script>-th root of unity, i.e. <script type="math/tex">\omega^d=1</script>, and <script type="math/tex">\omega^i \not=1</script> for <script type="math/tex">% <![CDATA[
0 < i < d %]]></script>.</p>
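<p>For illustration, such an ω exists in a prime field F_q whenever d divides q - 1, and can be obtained by raising a generator to the power (q - 1) / d. A toy Python sketch (parameters chosen only for the example):</p>

```python
# Finding a d-th root of unity in a prime field F_q requires d | q - 1.
# Toy example: q = 257, so q - 1 = 256 = 2^8, and we take d = 8.
q, d = 257, 8
g = 3                                   # a generator of F_257^* (order 256)
w = pow(g, (q - 1) // d, q)             # omega = g^((q-1)/d) has order d

assert pow(w, d, q) == 1                             # omega^d = 1
assert all(pow(w, i, q) != 1 for i in range(1, d))   # omega^i != 1, 0 < i < d

# A vector (v_0, ..., v_{d-1}) is then encoded as f(omega^i) = v_i;
# the evaluation points omega^i are all distinct.
evaluation_points = [pow(w, i, q) for i in range(d)]
assert len(set(evaluation_points)) == d
```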
<p>We can turn the Kate commitment into a vector commitment that allows committing to a vector of length <script type="math/tex">d</script> by committing to the degree <script type="math/tex">d-1</script> polynomial <script type="math/tex">f(x)</script> that is defined by <sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup></p>
<script type="math/tex; mode=display">% <![CDATA[
f(\omega^i) = v_i \text{ for } 0\leq i<d %]]></script>
<p>To open the commitment <script type="math/tex">C</script> at any point <script type="math/tex">i</script>, we simply have to compute a Kate proof for <script type="math/tex">f(\omega^i)</script>. Fortunately, this proof is constant sized: It does not depend on the width <script type="math/tex">d</script>. Even better, many of these proofs can be combined into a <a href="/ethereum/2021/06/18/pcs-multiproofs.html">single proof</a>, which is much cheaper to verify.</p>
<h2 id="introduction-to-verkle-tries">Introduction to Verkle tries</h2>
<p>Verkle is an amalgamation of “vector” and “Merkle”, due to the fact that they are built in a tree-like structure just like Merkle trees, but at each node, instead of a hash of the <script type="math/tex">d</script> nodes below (<script type="math/tex">d=2</script> for binary Merkle trees), they commit to the <script type="math/tex">d</script> nodes below using a vector commitment. <script type="math/tex">d</script>-ary Merkle trees are inefficient, because each proof has to include all the unaccessed siblings for each node on the path to a leaf. A <script type="math/tex">d</script>-ary Merkle tree thus needs <script type="math/tex">(d - 1) \log_d n = (d - 1) \frac{\log n}{\log d}</script> hashes for a single proof, which is worse than the binary Merkle tree, which only needs <script type="math/tex">\log n</script> hashes. This is because a hash function is a poor vector commitment: a proof requires all siblings to be given.</p>
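<p>The proof-size comparison is easy to reproduce with a few lines (the helper names are mine):</p>

```python
def depth(n, d):
    # number of levels needed so that d^depth >= n
    lev = 0
    while d ** lev < n:
        lev += 1
    return lev

def merkle_proof_hashes(n, d):
    # each level of a d-ary Merkle proof contributes d - 1 sibling hashes
    return (d - 1) * depth(n, d)

n = 2 ** 30
assert merkle_proof_hashes(n, 2) == 30        # binary tree: log2(n) hashes
assert merkle_proof_hashes(n, 1024) == 3069   # 1024-ary: 1023 * 3 hashes
assert merkle_proof_hashes(n, 1024) > merkle_proof_hashes(n, 2)
```

<p>So raising the arity makes plain Merkle proofs strictly worse; the verkle construction below removes the factor of d - 1 by replacing the hash with a real vector commitment.</p>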
<p>Better vector commitments change this equation; by using the <a href="/ethereum/2020/06/16/kate-polynomial-commitments.html">KZG polynomial commitment scheme</a> as a vector commitment, each level only requires a constant size proof, so the annoying factor of <script type="math/tex">d-1</script> that kills <script type="math/tex">d</script>-ary Merkle trees disappears.</p>
<p>A verkle trie is a trie where the inner nodes are <script type="math/tex">d</script>-ary vector commitments to their children, where the <script type="math/tex">i</script>-th child contains all nodes whose next <script type="math/tex">\log_2 d</script> key bits are the binary representation of <script type="math/tex">i</script>. As an example, here is a <script type="math/tex">d=16</script> verkle trie with nine nodes inserted:
<img src="/assets/verkle_trie.svg" alt="verkle trie" /></p>
<p>The root of a leaf node is simply a hash of the (key, value) pair of 32 byte strings, whereas the root of an inner node is the hash of the vector commitment (in KZG, this is a <script type="math/tex">G_1</script> element).</p>
<h2 id="verkle-proof-for-a-single-leaf">Verkle proof for a single leaf</h2>
<p>We assume that key and value are known (they have to be provided in any witness scheme). Then for each inner node that the key path crosses, we have to add the commitment to that node to the proof. For example, let’s say we want to prove the leaf <code class="highlighter-rouge">0101 0111 1010 1111 -> 1213</code> in the above example (marked in green), then we have to give the commitments to <code class="highlighter-rouge">Node A</code> and <code class="highlighter-rouge">Node B</code> (both marked in cyan), as the path goes through these nodes. We don’t have to give the <code class="highlighter-rouge">Root</code> itself because it is known to the verifier. The <code class="highlighter-rouge">Root</code> as well as the leaf itself are marked in green: they are data required for the proof, but assumed to be given, and thus not part of the proof.</p>
<p>Then we need to add a KZG proof for each inner node, that proves that the hash of the child is a correct reveal of the KZG commitment. So the proofs in this example would consist of three KZG evaluation proofs:</p>
<ul>
<li>Proof that the root (hash of key and value) of the node <code class="highlighter-rouge">0101 0111 1010 1111 -> 1213</code> is the evaluation of the commitment of <code class="highlighter-rouge">Inner node B</code> at the index <code class="highlighter-rouge">1010</code></li>
<li>Proof that the root of <code class="highlighter-rouge">Inner node B</code> (hash of the KZG commitment) is the evaluation of the commitment of <code class="highlighter-rouge">Inner node A</code> at the index <code class="highlighter-rouge">0111</code></li>
<li>Proof that the root of <code class="highlighter-rouge">Inner node A</code> (hash of the KZG commitment) is the evaluation of the <code class="highlighter-rouge">Root</code> commitment at the index <code class="highlighter-rouge">0101</code></li>
</ul>
<p>But does that mean we need to add a Kate proof at each level, so that the complete proof will consist of <script type="math/tex">\log_d n - 1</script> elliptic curve group elements for the commitments [Note the -1 because the root is always known and does not have to be included in proofs] and an additional <script type="math/tex">\log_d n</script> group elements for the reveals?</p>
<p>Fortunately, this is not the case. KZG proofs can be compressed using different schemes to a small constant size, so given <em>any</em> number of inner nodes, the proof can be done using a small number of bytes. Even better, given any number of leaves to prove, we only need this single small proof to prove them altogether! So the amortized cost is only the total size of the commitments of the inner nodes. Pretty amazing.</p>
<p>In practice, we want a scheme that is very efficient to compute and verify, so we use <a href="/ethereum/2021/06/18/pcs-multiproofs.html">this scheme</a>. It is not the smallest in size (but still pretty small at 128 bytes total), however it is very efficient to compute and check.</p>
<h2 id="average-verkle-trie-depth">Average verkle trie depth</h2>
<p>My numerical experiments indicate that the average depth (number of inner nodes on the path) of a verkle trie with <script type="math/tex">n</script> random keys inserted is <script type="math/tex">\log_d n + 2/3</script>.</p>
<p>For <script type="math/tex">n=2^{30}</script> and <script type="math/tex">d=2^{10}</script>, this results in an average trie depth of ca. <script type="math/tex">3.67</script>.</p>
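<p>A quick way to sanity-check this number is a Monte Carlo simulation. The sketch below (Python with toy parameters; <code class="highlighter-rouge">average_depth</code> and its arguments are illustrative names, not taken from any real implementation) inserts random keys into a <script type="math/tex">d</script>-ary trie and measures the average number of inner nodes on a leaf’s path, which equals one plus the longest prefix the key shares with any other key:</p>

```python
import random

def average_depth(n, d, key_len=8, seed=0):
    """Estimate the average number of inner nodes on the path to a leaf
    in a d-ary trie holding n uniformly random keys."""
    rng = random.Random(seed)
    keys = sorted(tuple(rng.randrange(d) for _ in range(key_len))
                  for _ in range(n))

    def lcp(a, b):  # length of the longest common prefix of two keys
        i = 0
        while i < len(a) and a[i] == b[i]:
            i += 1
        return i

    total = 0
    for i, k in enumerate(keys):
        # the longest prefix shared with *any* key is shared with a sorted neighbour
        longest = 0
        if i > 0:
            longest = max(longest, lcp(k, keys[i - 1]))
        if i + 1 < len(keys):
            longest = max(longest, lcp(k, keys[i + 1]))
        total += longest + 1  # the leaf sits below longest + 1 inner nodes
    return total / n

print(average_depth(4096, 16))  # d=16, n=4096: expect roughly log_16(4096) + 2/3
```

<p>For <script type="math/tex">d=16</script> and <script type="math/tex">n=4096</script> the estimate comes out close to <script type="math/tex">\log_d n + 2/3 \approx 3.67</script>, consistent with the experiments above.</p>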
<h2 id="attack-verkle-trie-depth">Attack verkle trie depth</h2>
<p>An attacker can attempt to fill up the siblings of an attacked key in order to lengthen the proof path. They only need to insert one key per level in order to maximise the proof size; for this, however, they have to be able to find hash prefix collisions. Currently it is possible to find prefix collisions of up to 80 bits, indicating that with <script type="math/tex">d=2^{10}</script>, up to 8 levels of collisions can be provoked. Note that this is only about twice the average expected depth for <script type="math/tex">n=2^{30}</script> keys, so the attack doesn’t achieve very much overall.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Note that we could use <script type="math/tex">f(i) = v_i</script> instead, which would seem more intuitive, but this convention allows the use of Fast Fourier Transforms in computing all the polynomials, which is much more efficient. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
<hr />
<p><em>Verkle trie for Eth1 state</em></p>
<p><em>PCS multiproofs using random evaluation · 2021-06-18T22:43:00+00:00 · <a href="https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs">https://dankradfeist.de/ethereum/2021/06/18/pcs-multiproofs</a></em></p>
<h1 id="multiproof-scheme-for-polynomial-commitments">Multiproof scheme for polynomial commitments</h1>
<p>This post describes a multiproof scheme for (additively homomorphic) polynomial commitment schemes. It is very efficient to verify (the dominant operation for the verifier being one multiexponentiation), as well as efficient to compute as long as all the polynomials are fully available (it is not suitable in the case where the aggregation has to be done from proofs alone, without access to the data).</p>
<p>It is thus very powerful in the setting of verkle tries for the purpose of implementing <a href="/ethereum/2021/02/14/why-stateless.html">weak statelessness</a>.</p>
<p>Note that this post was written with the <a href="/ethereum/2020/06/16/kate-polynomial-commitments.html">KZG commitment scheme</a> in mind. It does however work for any “additively homomorphic” scheme, where it is possible to add two commitments together to get the commitment to the sum of the two polynomials. This means it can also be applied to <a href="/ethereum/2021/07/27/inner-product-arguments.html">Inner Product Arguments</a> (the core argument behind bulletproofs) and is actually a very powerful aggregation scheme for this use case.</p>
<h1 id="verkle-multiproofs-using-random-evaluation">Verkle multiproofs using random evaluation</h1>
<p>Problem: In a verkle tree of width <script type="math/tex">d</script>, we want to provide all the intermediate KZG (“Kate”) proofs as efficiently as possible.</p>
<p>We need to provide all intermediate commitments; there is no way around that in Verkle trees. But we only need a single KZG proof in the optimal case. There are efficient multiverification techniques for KZG if all proofs are given to the verifier, but we want to make do with just a small constant number of proofs.</p>
<p>For the notation used here, please check my post on <a href="/ethereum/2020/06/16/kate-polynomial-commitments.html">KZG commitments</a>.</p>
<p>Please see <a href="/ethereum/2021/06/18/verkle-trie-for-eth1.html">here</a> for an introduction to verkle tries.</p>
<h2 id="connection-to-verkle-tries">Connection to verkle tries</h2>
<p>Quick recap: looking at this verkle trie:
<img src="/assets/verkle_trie.svg" alt="verkle trie" /></p>
<p>In order to prove the leaf value <code class="highlighter-rouge">0101 0111 1010 1111 -> 1213</code> we have to give the commitments to <code class="highlighter-rouge">Inner node A</code> and <code class="highlighter-rouge">Inner node B</code> (both marked in cyan), as well as the following KZG proofs:</p>
<ul>
<li>Proof that the root (hash of key and value) of the node <code class="highlighter-rouge">0101 0111 1010 1111 -> 1213</code> is the evaluation of the commitment of <code class="highlighter-rouge">Inner node B</code> at the index <code class="highlighter-rouge">1010</code></li>
<li>Proof that the root of <code class="highlighter-rouge">Inner node B</code> (hash of the KZG commitment) is the evaluation of the commitment of <code class="highlighter-rouge">Inner node A</code> at the index <code class="highlighter-rouge">0111</code></li>
<li>Proof that the root of <code class="highlighter-rouge">Inner node A</code> (hash of the KZG commitment) is the evaluation of the <code class="highlighter-rouge">Root</code> commitment at the index <code class="highlighter-rouge">0101</code></li>
</ul>
<p>Each of these commitments, let’s call them <script type="math/tex">C_0</script> (<code class="highlighter-rouge">Inner node B</code>), <script type="math/tex">C_1</script> (<code class="highlighter-rouge">Inner node A</code>) and <script type="math/tex">C_2</script> (<code class="highlighter-rouge">Root</code>), is a commitment to a polynomial function <script type="math/tex">f_i(X)</script>. What we are really saying by the claim that the commitment <script type="math/tex">C_i</script> evaluates to some <script type="math/tex">y_i</script> at index <script type="math/tex">z_i</script> is that <em>the function committed to</em> by <script type="math/tex">C_i</script>, i.e. <script type="math/tex">f_i(X)</script>, evaluates to <script type="math/tex">y_i</script> at <script type="math/tex">z_i</script>, i.e. <script type="math/tex">f_i(z_i) = y_i</script>. So what we need to prove is</p>
<ul>
<li><script type="math/tex">f_0(\omega^{0b1010}) = H(0101\ 0111\ 1010\ 1111, 1213)</script> (hash of key and value), where <script type="math/tex">C_0 = [f_0(s)]_1</script>, i.e. <script type="math/tex">C_0</script> is the commitment to <script type="math/tex">f_0(X)</script></li>
<li><script type="math/tex">f_1(\omega^{0b0111}) = H(C_0)</script>, where <script type="math/tex">C_1 = [f_1(s)]_1</script></li>
<li><script type="math/tex">f_2(\omega^{0b0101}) = H(C_1)</script>, where <script type="math/tex">C_2 = [f_2(s)]_1</script></li>
</ul>
<p>Note that we replaced the index with <script type="math/tex">z_i = \omega^{\text{the index}}</script>, where <script type="math/tex">\omega</script> is a <script type="math/tex">d</script>-th root of unity. This makes many operations more efficient in practice (we will explain why below). <script type="math/tex">H</script> stands for a collision-resistant hash function, for example <code class="highlighter-rouge">sha256</code>.</p>
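<p>As a toy illustration of this index mapping (Python over the small prime field <script type="math/tex">p=337</script>, chosen only because <script type="math/tex">8</script> divides <script type="math/tex">p-1</script>; real implementations use the scalar field of a pairing-friendly curve):</p>

```python
p, d = 337, 8   # toy field: 8 divides p - 1 = 336, so 8th roots of unity exist
omega = 85      # assumption: 85 is a primitive 8th root of unity mod 337 (checked below)
assert pow(omega, d, p) == 1           # omega^d = 1 ...
assert pow(omega, d // 2, p) == p - 1  # ... and omega^(d/2) = -1, so its order is exactly d

# An index i in 0..d-1 is mapped to the evaluation point z_i = omega^i:
points = [pow(omega, i, p) for i in range(d)]
assert len(set(points)) == d           # the d evaluation points are pairwise distinct
```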
<p>If we have a node deeper inside the trie (more inner nodes on the path), there will be more proofs to provide. Also, if we do a multiproof, where we provide the proof for multiple key/value pairs at the same time, the list of proofs will be even longer. Overall, we can end up with hundreds or thousands of evaluations of the form <script type="math/tex">f_i(z_i) = y_i</script> to prove, where we have the commitments <script type="math/tex">C_i = [f_i(s)]_1</script> (these are part of the verkle proof as well).</p>
<h2 id="relation-to-prove">Relation to prove</h2>
<p>The central part of a verkle multiproof (a verkle proof that proves many leaves at the same time) is to prove the following relation:</p>
<p>Given <script type="math/tex">m</script> KZG commitments <script type="math/tex">C_0 = [f_0(s)]_1, \ldots, C_{m-1}=[f_{m-1}(s)]_1</script>, prove evaluations</p>
<script type="math/tex; mode=display">f_0(z_0)=y_0 \\
\vdots\\
f_{m-1}(z_{m-1})=y_{m-1}</script>
<p>where <script type="math/tex">z_i \in \{\omega^0, \ldots, \omega^{d-1}\}</script>, and <script type="math/tex">\omega</script> is a <script type="math/tex">d</script>-th root of unity.</p>
<h2 id="proof">Proof</h2>
<ol>
<li>
<p>Let <script type="math/tex">r \leftarrow H(C_0, \ldots, C_{m-1}, y_0, \ldots, y_{m-1}, z_0, \ldots, z_{m-1})</script>, where <script type="math/tex">H</script> is a hash function.
The prover computes the polynomial</p>
<script type="math/tex; mode=display">g(X) = r^0 \frac{f_0(X) - y_0}{X-z_0} + r^1 \frac{f_1(X) - y_1}{X-z_1} + \ldots +r^{m-1} \frac{f_{m-1}(X) - y_{m-1}}{X-z_{m-1}}</script>
<p>If we can prove that <script type="math/tex">g(X)</script> is actually a polynomial (and not a rational function), then it means that all the quotients are exact divisions, and thus the proof is complete. This is because it is a random linear combination of the quotients: if we just added the quotients, it could be that two of them just “cancel out” their remainders to give a polynomial. But because <script type="math/tex">r</script> is chosen after all the inputs are fixed (see <a href="https://en.wikipedia.org/wiki/Fiat%E2%80%93Shamir_heuristic">Fiat-Shamir heuristic</a>), it is computationally impossible for the prover to find inputs such that two of the remainders cancel.</p>
<p>Everything else revolves around proving that <script type="math/tex">g(X)</script> is a polynomial (and not a rational function) with minimal effort for the prover and verifier.</p>
<p>Note that any function that we can commit to via a KZG commitment is a polynomial. So the prover computes and sends the commitment <script type="math/tex">D = [g(s)]_1</script>. Now we only need to convince the verifier that <script type="math/tex">D</script> is, indeed, a commitment to the function <script type="math/tex">g(X)</script>. This is what the following steps are about.</p>
</li>
<li>
<p>We will prove the correctness of <script type="math/tex">D</script> by (1) evaluating it at a completely random point <script type="math/tex">t</script> and (2) helping the verifier check that the evaluation is indeed <script type="math/tex">g(t)</script>.
Let <script type="math/tex">t \leftarrow H(r, D)</script>.
We will evaluate <script type="math/tex">g(t)</script> and help the verifier evaluate the equation</p>
<script type="math/tex; mode=display">g(t) = \sum_{i=0}^{m-1}r^i \frac{f_i(t) - y_i}{t-z_i}</script>
<p>with the help of the prover. Note that we can split this up into two sums</p>
<script type="math/tex; mode=display">g(t) = \underbrace{\sum_{i=0}^{m-1} r^i \frac{f_i(t)}{t-z_i}}_{g_1(t)} - \underbrace{\sum_{i=0}^{m-1} r^i \frac{y_i}{t-z_i}}_{g_2(t)}</script>
<p>The second sum term <script type="math/tex">g_2(t)</script> is completely known to the verifier and can be computed using a small number of field operations. The first term can be computed by giving an opening to the commitment</p>
<script type="math/tex; mode=display">E = \sum_{i=0}^{m-1} \frac{r^i}{t-z_i} C_i</script>
<p>at <script type="math/tex">t</script>. Note that the commitment <script type="math/tex">E</script> itself can be computed by the verifier using a multiexponentiation (this will be the main part of the verifier work), because they have all the necessary inputs.</p>
<p>The prover computes</p>
<script type="math/tex; mode=display">h(X) = \sum_{i=0}^{m-1} r^i \frac{f_i(X)}{t-z_i}</script>
<p>which satisfies <script type="math/tex">E = [h(s)]_1</script>.</p>
</li>
<li>
<p>Let <script type="math/tex">f_D(X)</script> denote the polynomial committed to by <script type="math/tex">D</script> – if the prover is honest, then this will be <script type="math/tex">g(X)</script>; however, this is what the verifier wants to check. Due to the binding property of the commitment scheme, there is at most one polynomial that the prover can open <script type="math/tex">D</script> to, which makes <script type="math/tex">f_D(X)</script> well-defined.</p>
<p>What remains to be checked for the verifier to conclude the proof is that</p>
<script type="math/tex; mode=display">f_D (t) = h(t) - g_2(t)</script>
<p>or, reordering this:</p>
<script type="math/tex; mode=display">g_2(t) = h(t) - f_D(t)</script>
<p>The verifier can compute the left hand side <script type="math/tex">y=g_2(t)</script> without the help of the prover. What remains is for the prover to give an opening to the commitment <script type="math/tex">E - D</script> at <script type="math/tex">t</script> to prove that it is equal to <script type="math/tex">y</script>. The KZG proof <script type="math/tex">\pi = [(h(s) - g(s) - y)/(s-t)]_1</script> verifies that this is the case.</p>
<p>The proof consists of <script type="math/tex">D</script> and <script type="math/tex">\pi</script>.</p>
</li>
</ol>
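<p>The challenges <script type="math/tex">r</script> and <script type="math/tex">t</script> above are obtained by hashing the transcript (Fiat–Shamir). A minimal sketch, with placeholder transcript entries and a toy field modulus (the real scheme hashes serialized curve points into the curve’s scalar field; the encoding here is purely illustrative):</p>

```python
import hashlib

p = 337  # toy modulus; in practice the scalar field of the pairing curve

def fiat_shamir(*items):
    """Derive a challenge field element by hashing a transcript."""
    h = hashlib.sha256()
    for item in items:
        h.update(repr(item).encode())
    return int.from_bytes(h.digest(), "big") % p

# r binds the proof to all C_i, y_i, z_i; t is derived only after D is fixed,
# so the prover cannot choose the inputs as a function of t.
Cs, ys, zs = ["C0", "C1", "C2"], [5, 7, 11], [1, 2, 4]  # placeholders
r = fiat_shamir(*Cs, *ys, *zs)
D = "D"                                                  # placeholder commitment
t = fiat_shamir(r, D)
assert 0 <= r < p and 0 <= t < p
```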
<h2 id="verification">Verification</h2>
<p>The verifier starts by computing <script type="math/tex">r</script> and <script type="math/tex">t</script>.</p>
<p>As we have seen above, the verifier can compute the commitment <script type="math/tex">E</script> (using one multiexponentiation) and the field element <script type="math/tex">g_2(t)</script>.</p>
<p>Then the verifier computes</p>
<script type="math/tex; mode=display">y = g_2(t)</script>
<p>The verifier checks the Kate opening proof</p>
<script type="math/tex; mode=display">e(E - D - [y]_1,[1]_2) = e(\pi, [s-t]_2)</script>
<p>This means that the verifier now knows that the commitment <script type="math/tex">D</script>, opened at a completely random point (that the prover didn’t know when they committed to it), has exactly the value of <script type="math/tex">g(t)</script> which the verifier computed with the help of the prover. According to the <a href="https://en.wikipedia.org/wiki/Schwartz%E2%80%93Zippel_lemma">Schwartz-Zippel lemma</a>, passing this check is extremely unlikely (read: impossible in practice, like finding a hash collision) unless <script type="math/tex">D</script> is actually a commitment to <script type="math/tex">g(X)</script>; thus, <script type="math/tex">g(X)</script> must be a polynomial and the proof is complete.</p>
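<p>The field-only part of the verifier’s work can be sketched as follows (toy field, hypothetical helper name; the multiexponentiation for <script type="math/tex">E</script> itself and the pairing check are elided, since they need a curve library):</p>

```python
p = 337                                # toy field modulus
inv = lambda x: pow(x, p - 2, p)       # field inverse via Fermat's little theorem

def verifier_field_work(r, t, ys, zs):
    """Compute the scalars r^i / (t - z_i) used in the multiexponentiation for E,
    and the field element g2(t) = sum_i r^i * y_i / (t - z_i)."""
    scalars = [pow(r, i, p) * inv((t - z) % p) % p for i, z in enumerate(zs)]
    g2_t = sum(s * y for s, y in zip(scalars, ys)) % p
    return scalars, g2_t

scalars, y = verifier_field_work(r=5, t=7, ys=[11, 22], zs=[1, 2])
assert scalars[0] == inv(6)            # r^0 / (t - z_0) = 1/6
assert scalars[1] == 1                 # r^1 / (t - z_1) = 5/5 = 1
assert y == (11 * inv(6) + 22) % p
```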
<h1 id="optimization-do-everything-in-evaluation-form">Optimization: Do everything in evaluation form</h1>
<p>This section explains various optimizations which make all of the above easy to compute; it is not essential for understanding the correctness of the proof, but it is for making an efficient implementation. The great advantage of the above version of KZG multiproofs, compared to many others, is that very large parts of the prover and verifier work only need to be done in the field. In addition, all these operations can be done on the evaluation form of the polynomial (or in maths terms: in the “Lagrange basis”). What that means and how it is used is explained below.</p>
<h2 id="evaluation-form">Evaluation form</h2>
<p>Usually, we see a polynomial as a sequence of coefficients <script type="math/tex">c_0, c_1, \ldots</script> defining a polynomial function <script type="math/tex">f(X) = \sum_i c_i X^i</script>. Here we define another way to look at polynomials: the so-called “evaluation form”.</p>
<p>Given <script type="math/tex">d</script> points <script type="math/tex">(\omega^0, y_0), \ldots, (\omega^{d-1}, y_{d-1})</script>, there is always a unique polynomial <script type="math/tex">f</script> of degree <script type="math/tex">% <![CDATA[
<d %]]></script> that passes through all these points, i.e. <script type="math/tex">f(\omega^i) = y_i</script> for all <script type="math/tex">% <![CDATA[
0 \leq i < d %]]></script>. Conversely, given a polynomial, we can easily compute the evaluations at the <script type="math/tex">d</script> roots of unity. We thus have a one-to-one correspondence of</p>
<script type="math/tex; mode=display">% <![CDATA[
\{\text{all polynomials of degree }<d\} \leftrightarrow \{\text{vectors of length } d \text{, seen as evaluations of a polynomial at } \omega^i\} %]]></script>
<p>This can be seen as a “change of basis”: On the left, the basis is “the coefficients of the polynomial”, whereas on the right, it’s “the evaluations of the polynomial on the <script type="math/tex">\omega^i</script>”.</p>
<p>Often, the evaluation form is more natural: For example, when we want to use KZG as a vector commitment, we will commit to a vector <script type="math/tex">(y_0, \ldots, y_{d-1})</script> by committing to a function that is defined by <script type="math/tex">f(\omega^i) = y_i</script>. But there are more advantages to the evaluation form: Some operations, such as multiplying two polynomials or dividing them (if the division is exact), are much more efficient in evaluation form.</p>
<p>In fact, all the operations in the KZG multiproof above can be done very efficiently in evaluation form, and in practice we never even compute the polynomials in coefficient form when we do this!</p>
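<p>The change of basis is easy to see in code. Here is a sketch over a toy field (illustrative helper name; this is the naive <script type="math/tex">O(d^2)</script> evaluation, the FFT discussed below does the same job faster):</p>

```python
p, d, omega = 337, 8, 85   # toy field; 85 is a primitive 8th root of unity mod 337

def to_eval_form(coeffs):
    """Coefficient form -> evaluation form on the domain omega^0, ..., omega^(d-1)."""
    return [sum(c * pow(omega, i * j, p) for j, c in enumerate(coeffs)) % p
            for i in range(d)]

f = [3, 1, 4, 1, 5, 0, 0, 0]   # f(X) = 3 + X + 4X^2 + X^3 + 5X^4
evals = to_eval_form(f)
assert evals[0] == sum(f) % p  # f(omega^0) = f(1) is the sum of the coefficients
```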
<h2 id="lagrange-polynomials">Lagrange polynomials</h2>
<p>Let’s define the Lagrange polynomials on the domain <script type="math/tex">x_0, \ldots, x_{d-1}</script>.</p>
<script type="math/tex; mode=display">\ell_i(X) = \prod_{j \not= i} \frac{X - x_j}{x_i - x_j}</script>
<p>For any <script type="math/tex">x \in \{x_0, \ldots, x_{d-1}\}</script>,</p>
<script type="math/tex; mode=display">% <![CDATA[
\ell_i(x) = \begin{cases}
1 & \text{if } x=x_i \\
0 & \text{otherwise}
\end{cases} %]]></script>
<p>so the Lagrange polynomials can be seen as the “unit vectors” for polynomials in evaluation form. Using these, we can explicitly translate from the evaluation form to the coefficient form: say we’re given <script type="math/tex">(y_0, \ldots, y_{d-1})</script> as a polynomial in evaluation form, then the polynomial is</p>
<script type="math/tex; mode=display">f(X) = \sum_{i=0}^{d-1} y_i \ell_i(X)</script>
<p>Polynomials in evaluation form (given by the <script type="math/tex">y_i</script>) are sometimes called “Polynomials in Lagrange basis” because of this.</p>
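<p>A small sketch of the Lagrange basis over a toy field (the helper names are illustrative), checking the “unit vector” property and the interpolation formula:</p>

```python
p = 337
domain = [pow(85, i, p) for i in range(8)]  # powers of an 8th root of unity mod 337

def lagrange_eval(i, x):
    """ell_i(x), computed directly from the product formula."""
    num = den = 1
    for j, xj in enumerate(domain):
        if j != i:
            num = num * (x - xj) % p
            den = den * (domain[i] - xj) % p
    return num * pow(den, p - 2, p) % p     # field division via Fermat inverse

assert lagrange_eval(2, domain[2]) == 1     # ell_i is 1 at x_i ...
assert lagrange_eval(2, domain[5]) == 0     # ... and 0 at every other domain point

# f(X) = sum_i y_i * ell_i(X) reproduces any chosen evaluations:
ys = [9, 8, 7, 6, 5, 4, 3, 2]
f_at = lambda x: sum(y * lagrange_eval(i, x) for i, y in enumerate(ys)) % p
assert [f_at(x) for x in domain] == ys
```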
<p>For KZG commitments, we can use another trick: Recall that the <script type="math/tex">G_1</script> setup for KZG to commit to polynomials of degree <script type="math/tex">% <![CDATA[
<d %]]></script> consists of <script type="math/tex">% <![CDATA[
[s^i]_1, 0\leq i < d %]]></script>. From these, we can compute <script type="math/tex">% <![CDATA[
[\ell_i(s)]_1, 0\leq i < d %]]></script>. Then we can simply compute a polynomial commitment like this:</p>
<script type="math/tex; mode=display">[f(s)]_1 = \sum_{i=0}^{d-1} y_i [\ell_i(s)]_1</script>
<p>There is no need to compute the polynomial in coefficient form to compute its KZG commitment.</p>
<h2 id="fft-to-change-between-evaluation-and-coefficient-form">FFT to change between evaluation and coefficient form</h2>
<p>The Discrete Fourier Transform <script type="math/tex">u=\mathrm{DFT}(v)</script> of a vector <script type="math/tex">v</script> is defined by</p>
<script type="math/tex; mode=display">u_i = \sum_{j=0}^{d-1} v_j \omega^{ij}</script>
<p>Note that if we define the polynomial <script type="math/tex">f(X) = \sum_{j=0}^{d-1} v_j X^j</script>, then <script type="math/tex">u_i = f(\omega^i)</script>, i.e. the DFT computes the values of <script type="math/tex">f(X)</script> on the domain <script type="math/tex">% <![CDATA[
\omega^i, 0 \leq i < d %]]></script>. This is why, in practice, we use the roots of unity as our domain whenever possible: we can then use the DFT to compute the evaluation form from the coefficient form.</p>
<p>The inverse, the Inverse Discrete Fourier Transform <script type="math/tex">v_i = \mathrm{DFT}^{-1}(u_i)</script>, is given by</p>
<script type="math/tex; mode=display">v_i = \frac{1}{d}\sum_{j=0}^{d-1} u_j \omega^{-ij}</script>
<p>Similar to how the DFT computes the evaluations of a polynomial in coefficient form, the inverse DFT computes the coefficients of a polynomial from its evaluations.</p>
<p>To summarize:</p>
<script type="math/tex; mode=display">\text{coefficient form} \overset{\mathrm{DFT}}{\underset{\mathrm{DFT}^{-1}}\rightleftarrows} \text{evaluation form}</script>
<p>The “Fast Fourier Transform” is a fast algorithm that can compute the DFT or inverse DFT in only <script type="math/tex">\frac{d}{2} \log d</script> multiplications. A direct implementation of the sum above would take <script type="math/tex">d^2</script> multiplications. This speedup is huge and makes the FFT such a powerful tool.</p>
<p>In strict usage, DFT is the generic name for the operation, whereas FFT is an algorithm to implement it (similar to sorting being an operation, whereas quicksort is one possible algorithm to implement that operation). However, colloquially, people very often just speak of FFT even when they mean the operation as well as the algorithm.</p>
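<p>Here is a minimal radix-2 FFT over the same toy field (a textbook Cooley–Tukey sketch, not an optimized implementation), together with its inverse:</p>

```python
p, d, omega = 337, 8, 85   # toy field; 85 is a primitive 8th root of unity mod 337

def fft(vals, w):
    """Evaluate the polynomial with coefficients `vals` at w^0, ..., w^(len(vals)-1)."""
    if len(vals) == 1:
        return list(vals)
    even = fft(vals[0::2], w * w % p)   # even-index coefficients at (w^2)^i
    odd = fft(vals[1::2], w * w % p)    # odd-index coefficients at (w^2)^i
    out = [0] * len(vals)
    half = len(vals) // 2
    for i in range(half):
        twiddle = odd[i] * pow(w, i, p) % p
        out[i] = (even[i] + twiddle) % p          # f(w^i)
        out[i + half] = (even[i] - twiddle) % p   # f(w^(i+half)), using w^half = -1
    return out

def inverse_fft(vals, w):
    inv_d = pow(len(vals), p - 2, p)              # 1/d in the field
    return [v * inv_d % p for v in fft(vals, pow(w, p - 2, p))]

coeffs = [3, 1, 4, 1, 5, 9, 2, 6]
evals = fft(coeffs, omega)
assert evals[0] == sum(coeffs) % p                # f(1) is the coefficient sum
assert inverse_fft(evals, omega) == coeffs        # DFT then inverse DFT round-trips
```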
<h2 id="multiplying-and-dividing-polynomials">Multiplying and dividing polynomials</h2>
<p>Let’s say we have two polynomials <script type="math/tex">f(X)</script> and <script type="math/tex">g(X)</script> such that the sum of their degrees is less than <script type="math/tex">d</script>. Then the product <script type="math/tex">h(X) = f(X) \cdot g(X)</script> is a polynomial of degree less than <script type="math/tex">d</script>. If we have the evaluations <script type="math/tex">f_i = f(x_i)</script> and <script type="math/tex">g_i = g(x_i)</script>, then we can easily compute the evaluations of the product:</p>
<script type="math/tex; mode=display">h_i = h(x_i) = f(x_i) g(x_i) = f_i g_i</script>
<p>This only needs <script type="math/tex">d</script> multiplications, whereas multiplying in coefficient form needs <script type="math/tex">O(d^2)</script> multiplications. So multiplying two polynomials is much easier in evaluation form.</p>
<p>Now let’s assume that <script type="math/tex">g(X)</script> divides <script type="math/tex">f(X)</script> exactly, i.e. there is a polynomial <script type="math/tex">q(X)</script> such that <script type="math/tex">f(X) = g(X) \cdot q(X)</script>. Then we can find this quotient <script type="math/tex">q(X)</script> in evaluation form</p>
<script type="math/tex; mode=display">q_i = q(x_i) = f(x_i) / g(x_i) = f_i / g_i</script>
<p>using only <script type="math/tex">d</script> divisions. Again, using long division, this would be a much more difficult task taking <script type="math/tex">O(d^2)</script> operations in coefficient form.</p>
<p>We can use this trick to compute openings for Kate commitments in evaluation form, where we need to compute a polynomial quotient; in particular, this is how the proof <script type="math/tex">\pi</script> above is computed.</p>
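<p>Both operations in a toy example (Python; <code class="highlighter-rouge">evals_of</code> is an illustrative helper). The divisor is chosen so that it has no root on the domain, so every pointwise division is well-defined:</p>

```python
p, d, omega = 337, 8, 85
domain = [pow(omega, i, p) for i in range(d)]

def evals_of(coeffs):  # naive coefficient form -> evaluation form
    return [sum(c * pow(x, j, p) for j, c in enumerate(coeffs)) % p for x in domain]

# f(X) = (X + 2)(X + 3) = X^2 + 5X + 6; g(X) = X + 2 divides it exactly,
# and -2 is not a power of omega, so g_i != 0 on the whole domain.
f_evals = evals_of([6, 5, 1])
g_evals = evals_of([2, 1])

# Pointwise multiplication and division, d field operations each:
h_evals = [fi * gi % p for fi, gi in zip(f_evals, g_evals)]                 # f * g
q_evals = [fi * pow(gi, p - 2, p) % p for fi, gi in zip(f_evals, g_evals)]  # f / g
assert h_evals == evals_of([12, 16, 7, 1])  # (X+2)^2 (X+3) = X^3 + 7X^2 + 16X + 12
assert q_evals == evals_of([3, 1])          # quotient is q(X) = X + 3
```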
<h2 id="dividing-when-one-of-the-points-is-zero">Dividing when one of the points is zero</h2>
<p>There is only one problem: What if one of the <script type="math/tex">g_i</script> is zero, i.e. <script type="math/tex">g(X)</script> is zero somewhere on our evaluation domain? Note that by definition this can only happen if <script type="math/tex">f_i</script> is also zero, as otherwise <script type="math/tex">g(X)</script> could not divide <script type="math/tex">f(X)</script>. But if both are zero, then we are left with <script type="math/tex">q_i = 0/0</script> and can’t directly compute it in evaluation form. Do we have to go back to coefficient form and use long division? It turns out that there’s a trick to avoid this, at least in the case that we care about: Often, <script type="math/tex">g(X) = X-x_m</script> is a linear factor. We want to compute</p>
<script type="math/tex; mode=display">q(X) = \frac{f(X)}{g(X)} = \frac{f(X)}{X-x_m} = \sum_{i=0}^{d-1} f_i \frac{\ell_i(X)}{X-x_m}</script>
<p>Now, we introduce the polynomial <script type="math/tex">A(X) = \prod_{i=0}^{d-1} (X-x_i)</script>, whose roots are exactly the points of the domain.</p>
<p>The <a href="https://en.wikipedia.org/wiki/Formal_derivative">formal derivative</a> of <script type="math/tex">A</script> is given by</p>
<script type="math/tex; mode=display">A'(X) = \sum_{j=0}^{d-1} \prod_{i \not= j}(X-x_i)</script>
<p>This polynomial is extremely useful because we can write the Lagrange polynomials as</p>
<script type="math/tex; mode=display">\ell_i(X) = \frac{1}{A'(x_i)}\frac{A(X)}{X-x_i}</script>
<p>so</p>
<script type="math/tex; mode=display">\frac{\ell_i(X)}{X-x_m} = \frac{1}{A'(x_i)}\frac{A(X)}{(X-x_i)(X-x_m)} = \frac{A'(x_m)}{A'(x_i)}\frac{\ell_m(X)}{X - x_i}</script>
<p>Now, let’s go back to the equation for <script type="math/tex">q(X)</script>. The one problem we have if we want to get this in evaluation form is the point <script type="math/tex">q(x_m)</script>, where we would encounter a division by zero; all the other points are easy to compute. But now we can replace</p>
<script type="math/tex; mode=display">q(X) = \sum_{i=0}^{d-1} f_i \frac{\ell_i(X)}{X-x_m} = \sum_{i=0}^{d-1} f_i \frac{A'(x_m)}{A'(x_i)}\frac{\ell_m(X)}{X - x_i}</script>
<p>Because <script type="math/tex">\ell_m(x_m)=1</script>, this lets us compute</p>
<script type="math/tex; mode=display">q_m = q(x_m) = \sum_{i=0, i \not= m}^{d-1} f_i \frac{A'(x_m)}{A'(x_i)}\frac{1}{x_m - x_i}</script>
<p>For all <script type="math/tex">j \not= m</script>, we can compute directly</p>
<script type="math/tex; mode=display">q_j = q(x_j) = \sum_{i=0}^{d-1} f_i \frac{\ell_i(x_j)}{x_j-x_m} = \frac{f_j}{x_j-x_m}</script>
<p>This allows us to efficiently compute all <script type="math/tex">q_j</script> in evaluation form, including <script type="math/tex">j=m</script>. This trick is what makes it possible to compute <script type="math/tex">q(X)</script> entirely in evaluation form.</p>
<p>In order to make this efficient, it’s best to precompute the <script type="math/tex">A'(x_i)</script>: computing them takes <script type="math/tex">O(d^2)</script> time, but this only needs to be done once per domain.</p>
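<p>A sketch of this division trick over the toy field (illustrative helper names; <code class="highlighter-rouge">A_prime</code> holds the precomputed <script type="math/tex">A'(x_i)</script>):</p>

```python
p, d, omega = 337, 8, 85
domain = [pow(omega, i, p) for i in range(d)]
inv = lambda x: pow(x, p - 2, p)

# Precompute A'(x_i) = prod_{j != i} (x_i - x_j); O(d^2), but done once per domain.
A_prime = []
for i, xi in enumerate(domain):
    v = 1
    for j, xj in enumerate(domain):
        if j != i:
            v = v * (xi - xj) % p
    A_prime.append(v)

def divide_by_linear(f_evals, m):
    """Evaluation form of f(X) / (X - x_m), assuming the division is exact."""
    q = [0] * d
    for j in range(d):
        if j != m:
            q[j] = f_evals[j] * inv(domain[j] - domain[m]) % p
    # q_m via the A' trick; the i = m term is dropped since f(x_m) = 0:
    s = 0
    for i in range(d):
        if i != m:
            s += (f_evals[i] * A_prime[m] % p) * inv(A_prime[i]) % p \
                 * inv(domain[m] - domain[i]) % p
    q[m] = s % p
    return q

# Check: f(X) = (X - x_2)(X + 3), divided by (X - x_2), should give q(X) = X + 3.
m = 2
f_evals = [(x - domain[m]) * (x + 3) % p for x in domain]
assert divide_by_linear(f_evals, m) == [(x + 3) % p for x in domain]
```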
<h3 id="special-case-roots-of-unity">Special case roots of unity</h3>
<p>In the case where we are using the roots of unity as our domain, we can use some tricks so that we don’t need the precomputation of <script type="math/tex">A'(x_i)</script>. The key observation is that <script type="math/tex">A(X)</script> can be rewritten in a simpler form:</p>
<script type="math/tex; mode=display">A(X) = \prod_{i = 0} ^ {d-1} (X-\omega^i) = X^d - 1</script>
<p>Because of this the formal derivative becomes much simpler:</p>
<script type="math/tex; mode=display">A'(X) = d X^{d-1} = \sum_{j=0}^{d-1} \prod_{i \not= j}(X-\omega^i)</script>
<p>And we can now easily derive <script type="math/tex">A'(x_i)</script>:</p>
<script type="math/tex; mode=display">A'(\omega^i) = d (\omega^i)^{d-1}= d \omega^{-i}</script>
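<p>We can check this closed form against the product definition numerically (in the same toy field as before):</p>

```python
p, d, omega = 337, 8, 85    # toy field; 85 is a primitive 8th root of unity mod 337
domain = [pow(omega, i, p) for i in range(d)]

# A'(x_i) from the product definition...
direct = []
for i, xi in enumerate(domain):
    v = 1
    for j, xj in enumerate(domain):
        if j != i:
            v = v * (xi - xj) % p
    direct.append(v)

# ...and from the closed form A'(omega^i) = d * omega^(-i) = d * omega^(d - i):
closed = [d * pow(omega, (d - i) % d, p) % p for i in range(d)]
assert direct == closed
```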
<h2 id="evaluating-a-polynomial-in-evaluation-form-on-a-point-outside-the-domain">Evaluating a polynomial in evaluation form on a point outside the domain</h2>
<p>Now there is one thing that we can do with a polynomial in coefficient form, that does not appear to be easily feasible in evaluation form: We can evaluate it at any point. Yes, in evaluation form, we do have the values at the <script type="math/tex">x_i</script>, so we can evaluate <script type="math/tex">f(x_i)</script> by just taking an item from the vector; but surely, to evaluate <script type="math/tex">f(X)</script> at a point <script type="math/tex">z</script> <em>outside</em> the domain, we have to first convert to coefficient form?</p>
<p>It turns out, it is not necessary. To the rescue comes the so-called barycentric formula. Here is how to derive it using Lagrange interpolation:</p>
<script type="math/tex; mode=display">f(z) = \sum_{i=0}^{d-1} f_i \ell_i(z) = \sum_{i=0}^{d-1} f_i \frac{1}{A'(x_i)}\frac{A(z)}{z-x_i} = A(z)\sum_{i=0}^{d-1} \frac{f_i}{A'(x_i)} \frac{1}{z-x_i}</script>
<p>The last part can be computed in just <script type="math/tex">O(d)</script> steps (assuming the precomputation of the <script type="math/tex">A'(x_i)</script>), which makes this formula very useful, for example for computing <script type="math/tex">g(t)</script> and <script type="math/tex">h(t)</script> without changing into coefficient form.</p>
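<p>A sketch of the barycentric formula in the same toy field (general-domain version, with the <script type="math/tex">A'(x_i)</script> precomputed from the product definition; helper names are illustrative):</p>

```python
p, d, omega = 337, 8, 85
domain = [pow(omega, i, p) for i in range(d)]
inv = lambda x: pow(x, p - 2, p)

# One-off O(d^2) precomputation of A'(x_i) = prod_{j != i} (x_i - x_j):
A_prime = []
for i, xi in enumerate(domain):
    v = 1
    for j, xj in enumerate(domain):
        if j != i:
            v = v * (xi - xj) % p
    A_prime.append(v)

def eval_outside_domain(f_evals, z):
    """Barycentric evaluation f(z) = A(z) * sum_i f_i / (A'(x_i) (z - x_i)); O(d) per point."""
    A_z = 1
    for x in domain:
        A_z = A_z * (z - x) % p
    s = sum(fi * inv(Ai) % p * inv((z - xi) % p) % p
            for fi, Ai, xi in zip(f_evals, A_prime, domain)) % p
    return A_z * s % p

# f(X) = X^2 + 5X + 6 at z = 3, which is not a domain point: 9 + 15 + 6 = 30
f_evals = [(x * x + 5 * x + 6) % p for x in domain]
assert eval_outside_domain(f_evals, 3) == 30
```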
<p>This formula can be simplified in the case where the domain is the roots of unity:</p>
<script type="math/tex; mode=display">f(z) = \frac{z^d-1}{d}\sum_{i=0}^{d-1} f_i \frac{\omega^i}{z-\omega^i}</script>
<hr />
<p><em>Multiproof scheme for polynomial commitments</em></p>
<p><em>What everyone gets wrong about 51% attacks · 2021-05-20T22:38:00+00:00 · <a href="https://dankradfeist.de/ethereum/2021/05/20/what-everyone-gets-wrong-about-51percent-attacks">https://dankradfeist.de/ethereum/2021/05/20/what-everyone-gets-wrong-about-51percent-attacks</a></em></p>
<h1 id="what-everyone-gets-wrong-about-51-attacks">What everyone gets wrong about 51% attacks</h1>
<p>Excuse the provocation in the title. Clearly not everyone gets it wrong. But sufficiently many people that I think it’s good to write a blog post about the topic.</p>
<p>There is a myth out there that if you control more than 50% of the hashpower in Bitcoin, Ethereum, or another blockchain, then you can do whatever you want with the network. A similar restatement for Proof of Stake is that if you control more than two thirds of the stake, you can do anything. You can take another person’s coins. You can print new coins. Anything.</p>
<p>This is <strong>not</strong> true. Let’s discuss what a 51% attack can do:</p>
<ul>
<li>They can stop you from using the chain, i.e. block any transaction they don’t like. This is called censorship.</li>
<li>They can revert the chain, i.e. undo a certain number of blocks and change the order of the transactions in them.</li>
</ul>
<p>What they <strong>cannot</strong> do is change the rules of the system. This means for example:</p>
<ul>
<li>They cannot simply print new coins, outside of the provisions of the blockchain system; e.g. Bitcoin currently gives each new block producer 6.25 BTC; they cannot simply turn this into one million BTC</li>
<li>They cannot spend coins from an address for which they don’t have the private key</li>
<li>They cannot make larger blocks than consensus rules allow them to do</li>
</ul>
<p>Now this is not to say that 51% attacks aren’t devastating. They are still very bad attacks. Reordering allows double spending of coins, which is quite a big problem. But there are limits on what they can do.</p>
<p>Now how do most Blockchains, including Bitcoin and Ethereum, ensure this? What happens if a miner mines a block that goes against the rules? Or a majority of the stake signs a block that goes against the rules?</p>
<h2 id="the-blockchain-security-model">The blockchain security model</h2>
<p>Sometimes people claim that the longest chain is the valid Bitcoin or Ethereum chain. This is somewhat incomplete. The proper definition of the current chain head is</p>
<ul>
<li>The <strong>valid</strong> chain with the highest total difficulty.</li>
</ul>
<p>So there are two properties that a client verifies before accepting that a chain should be used to represent the current history:</p>
<ol>
<li>It has to be valid. This means that all state transitions are valid; for example in Bitcoin, that means that all transactions only spend previously unspent transaction outputs, the coinbase only receives the transaction fees and block reward, etc.</li>
<li>It has to be the chain with the highest total difficulty. Colloquially, that’s the longest chain; however, it is measured not in number of blocks but in how much total mining power was spent on the chain.</li>
</ol>
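<p>As a toy sketch of this fork-choice rule (illustrative Python; the chain records and helper functions are made up for the example and do not match any real client’s code):</p>

```python
def head(chains, is_valid, total_difficulty):
    """Pick the chain head: the *valid* chain with the highest total difficulty."""
    valid = [c for c in chains if is_valid(c)]   # invalid chains are ignored entirely
    return max(valid, key=total_difficulty)

chains = [
    {"blocks": 100, "difficulty": 500, "valid": True},
    # longer and heavier, but contains an invalid state transition:
    {"blocks": 120, "difficulty": 900, "valid": False},
]
best = head(chains, lambda c: c["valid"], lambda c: c["difficulty"])
assert best["difficulty"] == 500   # the rule-breaking chain never wins
```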
<p>This may all sound a bit abstract. It is legitimate to ask who verifies that first condition, that all blocks on the chain should be valid? Because if it’s just the miners that also verify that the chain is valid, then this is a tautology and we haven’t really gained anything.</p>
<p>But blockchains are different. Let’s see why. Start with a normal client/server database architecture:</p>
<p><img src="/assets/database_diagram.png" alt="Database user and server" /></p>
<p>Note that for a typical database, the user trusts the database server. They don’t check that the response is correct; the client makes sure that it is validly formatted according to the protocol, and that’s it. The client, here represented by an empty square, is “dumb”: It can’t verify anything.</p>
<p>A blockchain architecture however, looks like this:</p>
<p><img src="/assets/blockchain_diagram.png" alt="Blockchain architecture" /></p>
<p>So let’s summarise what happens here. There are miners (or stakers) that produce the chain. There is a peer-to-peer network: its role is to make sure that a valid chain is always available to everyone, even if some of the nodes aren’t honest (you need to be connected to at least one honest and well-connected P2P node to ensure that you will always be up to date with the valid chain). And there is a client, who sends transactions to the P2P network and receives the latest chain updates (or the full chain, if they are syncing) from other nodes in the network. The client is actually part of the network and will also contribute by forwarding blocks and transactions, but that’s not so important here.</p>
<p>The important part is that the user is running a full node, as represented by the cylinder in their client. Whenever the client gets a new block, just like any other node, whether it’s a miner or just a node in the P2P network, they will validate whether that block is a valid state transition.</p>
<p>And if it’s not a valid state transition, the block will just be ignored. That’s why there is very little point in a network for miners to ever try to mine an invalid state transition. Everyone would just ignore it.</p>
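<p>The validate-or-ignore behaviour described above can be sketched as follows – a deliberately toy model (a simple balance map instead of real state, hypothetical function names), just to show the principle:</p>

```python
# Toy model of a full node: every incoming block is checked as a state
# transition, and invalid blocks are simply ignored.

def apply_block(state, block):
    """Return the new state, or None if the block is an invalid transition."""
    balances = dict(state)
    for sender, receiver, amount in block:
        if balances.get(sender, 0) < amount:  # spending more than you have
            return None                       # -> invalid state transition
        balances[sender] -= amount
        balances[receiver] = balances.get(receiver, 0) + amount
    return balances

def on_new_block(state, block):
    new_state = apply_block(state, block)
    if new_state is None:
        return state  # invalid block: ignore it entirely
    return new_state

state = {"alice": 10}
state = on_new_block(state, [("alice", "bob", 4)])    # valid: applied
state = on_new_block(state, [("alice", "bob", 100)])  # invalid: ignored
assert state == {"alice": 6, "bob": 4}
```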
<p>Many users run their own node to interact with blockchains like Ethereum or Bitcoin. Many communities have made this part of their culture and place a great emphasis on everyone running their own node, so that they are part of the validation process. Indeed, you could say that it’s really important that the majority of users, especially those with a lot of value at stake, run full nodes; if the majority of users become too lazy, then suddenly miners can be tempted to produce invalid blocks, and this model would not hold anymore.</p>
<h2 id="analogy-separation-of-powers">Analogy: Separation of powers</h2>
<p>You can think of this a bit like the separation of powers in liberal democracies – there are different branches of the government, and just because you have a majority in one of them (say, the legislation) does not mean you can simply do anything you like and ignore all laws. Miners or stakers have the power to order transactions in blockchains; they don’t have the power to simply dictate new rules on the community.</p>
<h2 id="but-do-all-blockchains-work-like-this">But do all blockchains work like this?</h2>
<p>That’s a good question. And what’s important to note is that this only works if a full node is easy to run. As an average user, you will simply not do it if it means having to buy another computer for $5,000 and needing a permanent 1 GBit/s internet connection. Even if you can get such a connection in some places, having it permanently clogged by your blockchain node is probably not very convenient. In this case, you will probably not run your own node (unless your transactions are exceptionally valuable), which means that you will trust someone else to do it for you.</p>
<p>Imagine a chain that is so expensive to run that only stakers and exchanges will run a full node. You have just changed the trust model, and a majority of stakers and the exchanges could come together and change the rules. There would be no debate with the users about this – users cannot lead a fork if they literally have no control over the chain, at all. They could insist on the old rules, but unless they start running full nodes, they would have no idea if their requests are answered using a chain that satisfies the rules that they want.</p>
<p>That’s why there are always huge debates around increasing the block size of, say, Ethereum or Bitcoin – every time you do this, you increase the burden on people running their own nodes. It’s not much of a problem for miners – the cost of running a node is tiny compared to actual mining operations – so it shifts the balance of power away from users and towards the miners (or stakers).</p>
<h2 id="how-about-light-clients">How about light clients?</h2>
<p>All right, but what if you just want to pay for your coffee using cryptocurrencies? Are you going to run a full node on your phone?</p>
<p>Of course, nobody expects that – and users don’t do it. This is where light clients come into play. Light clients are simpler clients that do not verify the full chain – they only verify the consensus, i.e. the total difficulty or the amount of stake that has voted for it.</p>
<p>In other words, light clients <em>can</em> be tricked into following a chain that contains invalid blocks. There are remedies for this, in the form of data availability checks and fraud proofs. As far as I know, no chain has implemented these at this point, but at least Ethereum will do this in the future.</p>
<p>So using light clients with data availability checks and fraud proofs, we will be able to make the blockchain security model available without requiring all users to run a full node. This is the ultimate goal, that any phone can easily run an Ethereum light client.</p>
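<p>The reason data availability checks can work for light clients is probabilistic. As a simplified back-of-the-envelope illustration (the real scheme uses erasure coding and is considerably more involved; the numbers here are purely illustrative):</p>

```python
# If a block producer withholds a fraction `withheld` of a block's chunks,
# a light client that samples k random chunks notices the withholding with
# probability 1 - (1 - withheld)**k.

def detection_probability(withheld, samples):
    return 1 - (1 - withheld) ** samples

# With erasure coding, an attacker must withhold a large fraction of the
# extended data (e.g. half) to hide anything, so a handful of samples is
# already overwhelming evidence:
p = detection_probability(0.5, 20)
assert p > 0.999999
```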
<h2 id="and-what-about-sidechains">And what about sidechains?</h2>
<p>Sidechains are a hot topic right now. It would seem that they are an easy way to provide scaling, without the complexity of rollups. Simply speaking</p>
<ul>
<li>Create a new Proof of Stake chain</li>
<li>Create a two-way bridge with Ethereum</li>
<li>…</li>
<li>Profit!</li>
</ul>
<p>Note that the security of the sidechain relies pretty much entirely on the bridge – the construction that allows one chain to understand another chain’s state. After all, if you can trick the bridge on the main chain into believing that all the assets on the bridged chain now belong to Mr. Evil, then it doesn’t matter if full nodes on the Proof of Stake chain think differently. So it’s all in the bridge.</p>
<p>Unfortunately, bridges are in the same state as light clients: they don’t verify correctness, only the consensus condition (that a majority has voted for the chain). However, two things make them worse than light clients:</p>
<ol>
<li>Bridges are used for very high value transactions, where most users would choose a full node if they could</li>
<li>Unfortunately, there is no way to fortify bridges the way we can fortify light clients – the reason is that bridges cannot perform data availability checks</li>
</ol>
<p>The second point is quite subtle and could easily fill another blog post or two. But in short, bridges cannot do data availability checks, and without these, fraud proofs are also mostly useless. Using zero knowledge proofs, you can get an improvement by requiring bridges to include proofs of all blocks being correct – unfortunately, this still suffers from some data availability attacks, but it is an improvement.</p>
<p>In summary, sidechains have a different, much weaker security model than a blockchain like Ethereum and Bitcoin. They cannot protect against invalid state transitions.</p>
<h2 id="does-this-all-have-to-do-something-with-sharding">Does this all have to do something with sharding?</h2>
<p>In fact, all of this has a lot to do with sharding. The reason why we need sharding to scale is because it is the only way to scale without raising the bar for running a full node, while maintaining the full security guarantees of blockchains as closely as possible.</p>
<h2 id="but-what-if-you-just-undo-all-of-history-then-you-can-still-just-steal-all-the-bitcoinetheretc">But what if you just undo all of history? Then you can still just steal all the Bitcoin/Ether/etc.</h2>
<p>From a theoretical point of view, on a non-checkpointed Proof of Work chain, it is true that by reverting not just some transactions, but all transactions ever, you could still get all the Bitcoins. OK, so you cannot print a trillion Bitcoin, but you can still get all the Bitcoins in existence, so that’s pretty good, right?</p>
<p>I think this point is very theoretical. The probability that either of these communities would accept a fork that revises years (or even just hours) of its history is precisely zero. There would be a massive scramble on all possible channels, with the pretty quick conclusion that people should reject this fork and simply agree that the valid chain is the one already in existence.</p>
<p>With Proof of Stake and finalization, this mechanism will become formalized – clients simply never revert finalized blocks, ever.</p>What everyone gets wrong about 51% attacksWhy it’s so important to go stateless2021-02-14T21:20:00+00:002021-02-14T21:20:00+00:00https://dankradfeist.de/ethereum/2021/02/14/why-stateless<p>One of Eth1’s biggest problems is the current state size. Estimated at around 10-100 GB (depending on how exactly it is stored), it is impractical for many nodes to keep in working memory, and is thus moved to slow permanent storage. However, hard disks are way too slow to keep up with Ethereum blocks (or, god forbid, sync a chain from genesis), and so much more expensive SSDs have to be used. Arguably, the current state size isn’t even the biggest problem. The biggest problem is that it is relatively cheap to grow this state, and state growth is permanent – so even if we raise the cost of growing state, there is no way to make someone pay for the actual impact on the network, which is eternal.</p>
<p>A solution space, largely crystallizing around two ideas, has emerged:</p>
<ul>
<li>State rent – the idea that in order to keep a state element in active memory, a continuous payment is required, and</li>
<li>Statelessness – blocks come with full witnesses (e.g. Merkle proofs) and thus no state is required to validate whether a block is valid</li>
</ul>
<p>On the spectrum to statelessness, there are further ideas worth exploring:</p>
<ul>
<li>partial statelessness – reducing the amount of state required to validate blocks, by requiring witnesses only for some (old) state</li>
<li>weak statelessness – validating blocks requires no state, but proposing blocks requires the full state</li>
</ul>
<p>Vitalik has written up some ideas on how to put these into a common framework <a href="https://hackmd.io/@HWeNw8hNRimMm2m2GH56Cw/state_size_management">here</a>, showing that partial statelessness and state rent are very similar: both require some form of payment for introducing something into the active state, and a witness to reactivate state that has become inactive.</p>
<p>If you come from the Eth1 world, then you may think that partial statelessness with a remaining active state of 1 GB or even 100 MB is a great achievement, so why work so much harder to go for full statelessness? I argue that full (weak) statelessness unlocks a huge potential that any amount of partial statelessness cannot, and thus that we should work very hard to enable full statelessness.</p>
<h2 id="understanding-eth2-validators">Understanding Eth2 validators</h2>
<p>Eth1 has been criticised in the past for having very high hardware requirements, and though not all of these criticisms are fair (it is still very possible to run an Eth1 node on moderate but well chosen consumer hardware), they are to be taken seriously, especially since we want to scale Ethereum without compromising decentralization. For Eth2, we have thus set ourselves a very ambitious goal – to be able to run an Eth2 node and validator on very low-cost hardware, even a Raspberry Pi or a smartphone.</p>
<p>This is not the easy route, but the hard route to scaling. Other projects, like EOS and Solana, instead require much more performant hardware and internet connections. But I think for decentralization it is essential to keep the requirements on consensus nodes, as well as P2P nodes, very low.</p>
<p>In Eth2, the consensus node is the validator. There is an important difference with the consensus nodes in Eth1 and Eth2:</p>
<ul>
<li>In Eth1, the consensus nodes are miners. To “vote” for a chain, you have to produce a block on it. In other words, the consensus nodes and block producers are inseparable.</li>
<li>In Eth2, or rather its current first phase, the beacon chain, proposing blocks and forming consensus are two different functions: Blocks are proposed every 12 seconds by a randomly selected validator, but consensus is formed via attestations, with every validator voting for a chain <em>every epoch, that is, every 6.4 minutes</em>. Yes, at the moment, that is already almost 100,000 validators casting one vote every few minutes. Block producers have (almost <sup id="fnref:3"><a href="#fn:3" class="footnote">1</a></sup>) no influence on consensus, they only get to select what is included in a block<sup id="fnref:1"><a href="#fn:1" class="footnote">2</a></sup></li>
</ul>
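<p>The epoch numbers above follow directly from the beacon chain constants (12-second slots, 32 slots per epoch):</p>

```python
# Checking the consensus cadence quoted above.
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32

epoch_seconds = SECONDS_PER_SLOT * SLOTS_PER_EPOCH
assert epoch_seconds == 384
assert epoch_seconds / 60 == 6.4  # every validator votes once per 6.4 minutes

# With ~100,000 validators each attesting once per epoch, the attestation
# load is spread over the epoch's slots:
validators = 100_000
assert validators // SLOTS_PER_EPOCH == 3125  # attesters per slot, roughly
```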
<p>The property that block proposers are irrelevant for consensus opens up a significant design space. While for the beacon chain, block proposers are simply selected at random from the full validator set, for the shard chains, this doesn’t have to be true:</p>
<ul>
<li>One interesting possibility would be that for a shard, especially an Eth1 execution shard, validators can enter a list declaring that they are capable of producing blocks. These validators may require better hardware and may need to keep the “full” state</li>
<li>Another possibility, which we are currently implementing for the data shards, is that anyone can be selected for proposing blocks, but the actual content of the block isn’t produced by the proposer; instead, different entities can bid on getting their pre-packaged blocks proposed.</li>
</ul>
<p>In both cases, weakly stateless validation means that all the other validators, who are not proposing blocks or preparing block content, do not need the state. That is a huge difference to Eth1: In Eth1, the consensus forming nodes (the miners) have high requirements anyway, so requiring them to keep full state seems fine. But with Eth2, we have the possibility of significantly lowering this requirement, and we should make use of it, to benefit decentralization and security.</p>
<h2 id="so-why-is-it-ok-to-have-expensive-proposers">So why is it ok to have expensive proposers?</h2>
<p>An important objection may be that it defeats decentralization if block proposing becomes expensive, even if we get cheap validators and P2P nodes. This is not the case. There is an important difference between “proposers” and “validators”:</p>
<ul>
<li>For validators, we need an honest supermajority, i.e. more than 2/3 of the total staked ETH must be honest. A similar thing can be said about P2P nodes – while there isn’t (as far as I know) a definite fraction of P2P nodes that must be honest, there is the requirement that everyone is connected to at least one honest P2P node in order to be able to be sure to always receive the valid chain; this could be 5% but in practice it is probably higher.</li>
<li>For proposers, we actually get away with much lower honesty requirements; note that unlike in Eth1, in Eth2 proposers do not get to censor past blocks (because they do not vote), but only get to decide about the content of their own block. Assuming that your transaction is not highly time critical, if 95% of proposers try to censor it, then the 20th proposer would still be able to get it safely included. (Low-latency censorship resistance is a different matter however, and in practice more difficult to achieve)</li>
</ul>
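<p>The “20th proposer” claim above is simple probability; spelled out (with the illustrative 95% censorship figure):</p>

```python
# If 95% of proposers censor a transaction, each block is an independent
# 5% chance of inclusion.

def p_included_within(blocks, honest_fraction=0.05):
    return 1 - (1 - honest_fraction) ** blocks

# On average, the 20th proposer includes the transaction:
expected_wait = 1 / 0.05
assert expected_wait == 20

# And after 100 blocks (20 minutes of 12-second slots), censorship has
# almost certainly failed:
assert p_included_within(100) > 0.99
```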
<p>This is why I am much less worried about increasing hardware requirements for proposers than for validators. I think it would be fine if proposers need to run a PC with 128 GB RAM that is fully capable of storing even a huge state, as long as we can keep normal validator requirements low. I would be worried if a PC that can handle these requirements cost $100,000, but if we can keep it to under $5,000, it seems inconceivable that the community would not react very quickly by introducing more proposers if censorship were detected.</p>
<p>Finally, let’s not neglect that there are <a href="https://ethresear.ch/t/flashbots-frontrunning-the-mev-crisis/8251">other reasons</a> why block proposing will likely be done by those with significant hardware investments anyway, as they are better at exploiting MEV.</p>
<p>Note that I am using the word “proposer” here for the entities that package blocks, which is not necessarily the same as the one who formally signs it and introduces them; they could be “sequencers” (for rollups) etc. For simplicity I call them proposers here, because I do not think any of the system would fundamentally break if we simply introduced a completely new role into the system that only proposes blocks and nothing else.</p>
<h2 id="the-benefits-of-going-stateless">The benefits of going stateless</h2>
<p>So far I haven’t argued why (at least weak, but not partial) statelessness is such a powerful paradigm; in the <a href="https://ethresear.ch/t/executable-beacon-chain/8271">executable beacon chain</a> proposal, reducing state from 10 GB to 1 GB or 100MB seems to unlock a lot of savings for validators, so why do we have to go all the way?</p>
<p>Because if we go all the way, the executable Eth1 blocks can become a shard. Note that in the executable beacon chain proposal, all validators have to run the full Eth1 execution all the time (or they risk signing invalid blocks). A shard should not have this property; the point of a shard is that only a committee needs to sign a block (so only 1/1024 of all validators); and the others don’t have to trust that the majority of this committee is honest <sup id="fnref:2"><a href="#fn:2" class="footnote">3</a></sup>, but only that it has at least one honest member, who would blow the whistle when it tries to do something bad. This is only possible if Eth1 becomes stateless:</p>
<ul>
<li>We want the load on all validators to be roughly equal, and free of extreme peaks. Thus, sending a validator to be an Eth1 committee member for a long time, like an hour or a day, is actually terrible: it means the validator still has to be dimensioned to keep up with the full Eth1 chain in terms of bandwidth requirements. In addition, committees become much more attackable if they are chosen for a long time (for example through bribing attacks)</li>
<li>We want to be able to have easy fraud proofs for Eth1 blocks, because otherwise the other validators can’t be sure that the committee has done its work correctly. The easiest way to get fraud proofs is if a block can be its own fraud proof: If a block is invalid, you simply have to broadcast the block itself to show that fraud has happened.</li>
</ul>
<p>So Eth1 can become a shard (that requires much less resources, like 1/100, to maintain) only if it becomes fully stateless. And at the same time, only then can we introduce more execution shards, in addition to the data shards.</p>
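<p>The “at least one honest member” requirement is much weaker than an honest majority, and the numbers work out dramatically in its favour. A simplified illustration (assuming committee members are sampled independently, which is an approximation; 128 is the target committee size on the beacon chain):</p>

```python
# Probability that a randomly sampled committee contains at least one
# honest whistleblower.

def p_at_least_one_honest(committee_size, honest_fraction):
    return 1 - (1 - honest_fraction) ** committee_size

# Even if only a third of all validators are honest, a 128-member
# committee almost surely contains an honest member:
p = p_at_least_one_honest(128, 1 / 3)
assert p > 0.999999
```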
<h2 id="arent-caches-always-good">Aren’t caches always good?</h2>
<p>So what if we go to full statelessness but introduce a 10 MB cache? Or 1 MB? That can easily be downloaded even if you only want to check one block, because you are assigned to a committee or you received it as a fraud proof?</p>
<p>You can do this, but there is a simple way to see that this is very unlikely to be optimal, if the majority of validators only validate single blocks. Let’s say we target 1 MB blocks and in addition, we have a 1 MB cache. That means, every time a validator wants to validate a block, they have to download 2 MB – both the block and the cache. They have to download the cache every time, except if they download <em>all</em> blocks to also keep the cache up to date, which is exactly what we want to avoid.</p>
<p>This means, at the same cost of having blocks of size 1 MB with a cache of 1 MB, we could set the cache to 0 and allow blocks of 2 MB.</p>
<p>Now, it’s clear that a 2 MB block is at least as powerful as having 1 MB blocks with a 1 MB cache: the 2 MB block could simply include the 1 MB cache if that’s what we thought was optimal – you commit to the cache at every block, and reintroduce the full cache in the next block. This is unlikely to be the best use of that 1 MB of block space; it’s much more likely that the extra 1 MB would be put to better use allowing more witnesses to be included.</p>
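<p>The download-cost comparison in this argument is trivial arithmetic, but worth spelling out (sizes are the illustrative ones from the text):</p>

```python
# A validator checking a single block must fetch the block plus the whole
# cache, since it does not follow the chain and cannot keep the cache
# up to date itself.

def download_cost_mb(block_mb, cache_mb):
    return block_mb + cache_mb

# 1 MB blocks with a 1 MB cache cost exactly as much to check as plain
# 2 MB blocks with no cache...
assert download_cost_mb(1, 1) == download_cost_mb(2, 0)

# ...but the 2 MB block is strictly more flexible: it could carry the
# 1 MB cache as part of its payload, or spend that space on witnesses.
```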
<h2 id="binary-tries-or-verkle-tries">Binary tries or verkle tries?</h2>
<p>I think overall, the arguments for shooting for full weak statelessness, and not partial statelessness or state rent, are overwhelming. It will impact users much less: They simply don’t have to think about it. The only thing they have to do, when constructing transactions, is to add witnesses (so that the P2P network is able to validate it’s a valid transaction). Creating these witnesses is so cheap that it’s unimaginable that there won’t be a plethora of services offering it. Most wallets, in practice, already rely on external services and don’t require users to run their own nodes. Getting the witnesses is a trivial thing to add<sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup>.</p>
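<p>Conceptually, the witnesses attached to transactions are proofs against the state root. A toy binary-Merkle version shows the principle (real Ethereum state uses a Merkle-Patricia trie, and the proposal above uses verkle tries; everything here is a simplified illustration):</p>

```python
import hashlib

# A toy binary-Merkle witness: it lets anyone check a value against the
# state root without holding any state.

def h(*parts):
    return hashlib.sha256(b"".join(parts)).digest()

def verify_witness(root, leaf, path):
    """path: list of (sibling_hash, leaf_is_left) pairs from leaf to root."""
    node = h(leaf)
    for sibling, leaf_is_left in path:
        node = h(node, sibling) if leaf_is_left else h(sibling, node)
    return node == root

# Build a 2-leaf tree and a witness for the first leaf:
a, b = b"balance:alice=6", b"balance:bob=4"
root = h(h(a), h(b))
witness = [(h(b), True)]

assert verify_witness(root, a, witness)
assert not verify_witness(root, b"balance:alice=999", witness)
```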
<p>Partial statelessness, or state rent, adds a major UX hurdle on the way to full weak statelessness, where it would disappear again. It has some merit when you consider how difficult statelessness is to achieve using just binary Merkle tries, and that the gas changes required to allow Merkle trie witnesses will themselves be detrimental to UX.</p>
<p>So in my opinion, we should go all the way to <a href="https://notes.ethereum.org/_N1mutVERDKtqGIEYc-Flw">verkle tries</a> now. They allow us to have manageable, &lt;1 MB witnesses, with only moderate gas repricings as proposed by <a href="https://eips.ethereum.org/EIPS/eip-2929">EIP-2929</a> and charging for code chunks. Their downsides are well contained and of little practical consequence for users:</p>
<ul>
<li>A new cryptographic primitive to learn for developers</li>
<li>Adding more non-post-quantum-secure cryptography</li>
</ul>
<p>The second sounds scary, but we will already introduce KZG commitments in Eth2 for data availability sampling, and we are using elliptic-curve-based signatures anyway. Several post-quantum upgrades of the combined Eth1 and Eth2 chain will be required, because there simply aren’t practical enough post-quantum alternatives around right now. We can’t stop progress because of this. The next 5 years are extremely important in terms of adoption. The way forward is to implement the best we can now, and in 5-10 years, when STARKs are powerful enough, introduce a full post-quantum upgrade of all primitives.</p>
<p>In summary, verkle tries will solve our state problems for the next 5 years to come. We will be able to implement full (weak) statelessness now, with almost no impact on users and smart contract developers; we will be able to implement gas limit increases (because validation becomes faster) and more execution shards – and all this comes with little downside in terms of security and decentralization.</p>
<p>The big bullet to bite is for everyone to learn to understand how KZG commitments and verkle tries work. Since Eth2 will use KZG commitments for data availability, most of this work will soon be required of most Ethereum developers anyway.</p>
<div class="footnotes">
<ol>
<li id="fn:3">
<p>Almost no influence, because there is now a small modification to improve resilience against certain balancing attacks, that does give block proposers a small amount of short term influence on the fork choice <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:1">
<p>To be precise, they can have an influence, if they start colluding and censoring large numbers of attestations, but single block producers have a completely negligible effect on how consensus is formed <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>A dishonest committee can do some annoying things, that could impact the network and introduce major latency, but it cannot introduce invalid/unavailable blocks <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>Users who do want to run their own node can still use an external service to get witnesses. Doing so is trustless, since the witnesses are their own proof if you know what the latest state root is <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Running a validator client on a Raspberry Pi2020-11-20T16:16:00+00:002020-11-20T16:16:00+00:00https://dankradfeist.de/ethereum/2020/11/20/staking-on-raspi<h1 id="running-a-validator-client-on-a-raspberry-pi">Running a Validator Client on a Raspberry Pi</h1>
<p>I assume familiarity with Staking on Eth2.0 in general.</p>
<p>This is a quick manual on how to run the Validator Client, and only the Validator Client, on a Raspberry Pi. I have decided to run my beacon node on a separate machine with more resources. It is probably possible (perhaps after more optimizations) to run your whole staking operation on a Raspberry Pi, which would be a good idea for cost/energy efficiency – but at this point, I am much more concerned about security. The idea here is to effectively use the Raspberry Pi as a simple “Hardware Security Module” or HSM: it protects the keys and contains the validator slashing protection.</p>
<p><strong>Advantages of this setup</strong></p>
<ul>
<li>Better protection of validator keys than at-home staking with single node – the machine containing the staking keys does not have a direct internet connection</li>
<li>Beacon node is not running on resource-constrained Raspberry Pi, so should run safely even under non-optimal conditions where the Raspberry Pi might struggle, e.g. very high number of validators or long periods of non-finality</li>
</ul>
<p><strong>What this setup does not cover</strong></p>
<ul>
<li>Optimized for security, not cost (double hardware, higher electricity consumption)</li>
<li>Not optimized for liveness. An additional point of failure by relying on two machines for staking</li>
</ul>
<p>The Raspberry Pi in this configuration can be seen as a kind of Hardware Security Module or HSM. Until dedicated HSMs for staking become available, this is my suggestion on how to reach nearly equivalent security.</p>
<p>As there is no mainnet, I will describe how to run on the Medalla testnet. Once I update my setup for mainnet launch I will update this guide.</p>
<h2 id="diagram">Diagram</h2>
<p>As an illustration, here is a diagram of the node configuration</p>
<p><img src="/assets/nodediagram.png" alt="Diagram" /></p>
<h2 id="hardware">Hardware</h2>
<p>To run the Validator Client, I use a Raspberry Pi 4 4GB. If you want to compile on your Raspberry Pi, I recommend 8GB (at the time of writing, the lighthouse build did not complete on 4GB, but I think it would on 8GB). What you need in addition:</p>
<ul>
<li>A USB-C charger</li>
<li>Micro-SD card – I recommend not skimping on this one as cheap cards may be unreliable, especially after many write operations. I am using a Samsung EVO 32 GB</li>
<li>Raspberry Pi case – since I don’t like to have a fan, I got the “GeekPi Argon NEO Aluminum Case” which is fanless</li>
<li>Another machine to run the beacon node. Mine has 2 Ethernet ports, of which I use 1 to connect to the router and one for the Raspberry Pi, so the Raspberry Pi is not visible on the local network at all</li>
</ul>
<h2 id="installation">Installation</h2>
<p>First, we need an operating system on the Raspberry Pi. I use Ubuntu 20.04 LTS Server, as I’m most familiar with Ubuntu. I will give a short summary of the installation, which should be enough for technical users; otherwise you can find a manual on how to install it <a href="https://ubuntu.com/tutorials/how-to-install-ubuntu-on-your-raspberry-pi">here</a>. The installation is quite easy even without a display attached:</p>
<ul>
<li>Use <a href="https://www.raspberrypi.org/software/">rpi-imager</a> to install an image of Ubuntu 20.04 LTS Server (64 bit) on your MicroSD card</li>
<li>Use the “system-boot” partition to configure your Pi for the first launch. In order to run headless, this is essential: you want to be able to SSH into the Pi. In my case, I first connected it to my home network because I wanted to download all updates before cutting it off from the internet. In order to connect to your wifi, edit the <code class="highlighter-rouge">network-config</code> file to add your SSID and password:
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wifis:
  wlan0:
    dhcp4: true
    optional: true
    access-points:
      <wifi network name>:
        password: "<wifi password>"
</code></pre></div> </div>
</li>
<li>Now you can insert the MicroSD card into the Raspberry Pi and boot the Pi by attaching USB C power</li>
<li>Find out the IP address (Your router interface might have a list of connected devices. Otherwise a quick <code class="highlighter-rouge">nmap 192.168.0.0/24</code> will also do the trick)</li>
<li>Log in via SSH using the credentials <code class="highlighter-rouge">ubuntu/ubuntu</code>. You will be asked to change password</li>
<li>After this, I decided to create a user in my own name using <code class="highlighter-rouge">adduser [username]</code>. Make sure to add your user to the sudo group using <code class="highlighter-rouge">addgroup [username] sudo</code>. This step can be skipped if you just want to use the <code class="highlighter-rouge">ubuntu</code> user</li>
<li>Bring your system up to date using <code class="highlighter-rouge">sudo apt update && sudo apt upgrade</code></li>
</ul>
<p>Now it’s time to configure the Pi for a static network connection. Edit <code class="highlighter-rouge">/etc/netplan/50-cloud-init.yaml</code> to remove the wifi (we don’t want the Pi to be in the local network on production) and add a static Ethernet address:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>network:
  version: 2
  ethernets:
    eth0:
      dhcp4: false
      optional: true
      addresses:
        - 192.168.1.2/24
</code></pre></div></div>
<p>Disable the automatic cloud configuration by creating the file <code class="highlighter-rouge">/etc/cloud/cloud.cfg.d/99-disable-network-config.cfg</code> and adding the line</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code> network: {config: disabled}
</code></pre></div></div>
<p>Configure the second Ethernet port of the main (beacon node) machine to a static <code class="highlighter-rouge">192.168.1.1/24</code> address. You can now connect from the main machine to the Pi via ssh at <code class="highlighter-rouge">192.168.1.2</code>.</p>
<h2 id="cross-compile-the-lighthouse-client">Cross compile the lighthouse client</h2>
<p>I want to run the lighthouse client because it has performed well in testnets so far and seems to have suffered fewer critical bugs than other clients. Unfortunately, lighthouse does not come with a compile target for just the Validator Client, which is the part that I want to run on my Pi – you need to build everything. I couldn’t get that to complete on my 4GB Pi – it may be possible (but likely still slow) on an 8GB Pi. The easier way is to cross compile. Here is how to do this on Ubuntu:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir ~/ethereum
cd ~/ethereum
curl https://sh.rustup.rs -sSf | sh
git clone https://github.com/sigp/lighthouse
cd lighthouse
cargo install cross
make build-aarch64-portable
</code></pre></div></div>
<p>Via scp you can copy the resulting binaries to the Raspberry Pi:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>scp -r target/aarch64-unknown-linux-gnu/release 192.168.1.2:~/lighthouse
</code></pre></div></div>
<p>Install your validator keys to <code class="highlighter-rouge">~/.lighthouse/medalla/validators</code>. Optionally you can add the passwords to the validator keystores to <code class="highlighter-rouge">~/.lighthouse/medalla/secrets</code>. I prefer not to do this, so that if someone happened to take the Pi from my house without maintaining continuous power supply, they would not have access to the validator keys. However, this means whenever the Pi reboots, I have to log in to enter the password; it’s a tradeoff depending on how easily you can do this (probably not ideal if you are planning long trips without internet connection).</p>
<p>If you are planning to store the keystore passwords, you can create a systemd service to launch the VC automatically on boot. Create <code class="highlighter-rouge">/etc/systemd/system/lighthousevc.service</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
After=network.service
[Service]
ExecStart=/home/[username]/ethereum/lighthouse/target/release/lighthouse vc --beacon-node http://192.168.1.1:5052
User=[username]
[Install]
WantedBy=default.target
</code></pre></div></div>
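<p>Reload systemd and enable the unit so the validator client starts on every boot:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo systemctl daemon-reload
sudo systemctl enable lighthousevc.service
sudo systemctl start lighthousevc.service
</code></pre></div></div>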
<p>Otherwise, I recommend running it inside a screen session to be able to keep it running when you close the ssh session:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>screen /home/[username]/ethereum/lighthouse/target/release/lighthouse vc --beacon-node http://192.168.1.1:5052
</code></pre></div></div>
<h2 id="compiling-lighthouse-and-openethereum-for-the-beacon-node">Compiling Lighthouse and OpenEthereum for the Beacon node</h2>
<p>These instructions assume you are also using Ubuntu 20.04 on your main (beacon chain) machine; it is possible to run this setup with a different operating system.</p>
<p>You need to install OpenEthereum (or another Eth1 client, such as Geth). Download and install it into <code class="highlighter-rouge">~/ethereum/openethereum</code>.</p>
<p>Build the lighthouse client for the host system in order to be able to run the beacon node:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cd ~/ethereum/lighthouse/
make
</code></pre></div></div>
<h2 id="first-launch">First launch</h2>
<p>We are now ready to run the Eth1 and Beacon nodes.
To start the Eth1 node, run</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~/ethereum/openethereum/openethereum --chain goerli --jsonrpc-interface=all
</code></pre></div></div>
<p>(This is for the Goerli testnet, required for running on Medalla – change this for mainnet)</p>
<p>To start the Beacon node, in another terminal, run</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>~/ethereum/lighthouse/target/release/lighthouse bn --testnet medalla --http-port 5052 --eth1-endpoint http://localhost:8545 --http-address 192.168.1.1 --http
</code></pre></div></div>
<p>Both the Eth1 node and the Beacon node should start syncing. Note that this can take a long time – many hours on testnets and several days for an Eth1 mainnet node.</p>
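<p>You can check the beacon node’s sync progress from the Pi over the internal link. The exact endpoint depends on your lighthouse version – older releases exposed <code class="highlighter-rouge">/node/syncing</code>, newer ones the standard <code class="highlighter-rouge">/eth/v1/node/syncing</code> API – so adjust the path if you get a 404:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl http://192.168.1.1:5052/node/syncing
</code></pre></div></div>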
<h2 id="installing-the-beacon-node-as-daemons">Installing the beacon node as daemons</h2>
<p>To launch the Eth1 and beacon nodes automatically as daemons, we can create systemd files:</p>
<p>For Openethereum: <code class="highlighter-rouge">/etc/systemd/system/openethereum.service</code></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
After=network.service
[Service]
ExecStart=/home/[username]/ethereum/openethereum/openethereum --chain goerli --jsonrpc-interface=all
User=[username]
[Install]
WantedBy=default.target
</code></pre></div></div>
<p>For the Beacon node service: <code class="highlighter-rouge">/etc/systemd/system/lighthouse.service</code></p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Unit]
After=network.service
[Service]
ExecStart=/home/[username]/ethereum/lighthouse/target/release/lighthouse bn --testnet medalla --http-port 5052 --eth1-endpoint http://localhost:8545 --http-address 192.168.1.1 --http
User=[username]
[Install]
WantedBy=default.target
</code></pre></div></div>
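<p>Reload systemd and enable both units so they come up on every boot:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo systemctl daemon-reload
sudo systemctl enable openethereum.service lighthouse.service
</code></pre></div></div>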
<p>These services can now be started and stopped through the normal systemd interface using <code class="highlighter-rouge">sudo service openethereum [start/stop]</code> and <code class="highlighter-rouge">sudo service lighthouse [start/stop]</code>.</p>
<h2 id="synchronize-the-raspberry-pis-clock">Synchronize the Raspberry Pi’s clock</h2>
<p>Note that the Raspberry Pi has no battery to keep time while it is shut down, and it also isn’t connected to the Internet, so Ubuntu’s default mechanism for synchronizing the time will not work. We therefore run an NTP server on the beacon node machine to keep the Pi synced. On the beacon chain machine, install the <code class="highlighter-rouge">ntp</code> server:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt install ntp
sudo service ntp start
</code></pre></div></div>
<p>This is also a good time to adjust your NTP settings. Keeping your time well in sync is essential as a staker: there are several attacks via time services, and unfortunately rogue NTP servers cannot be ruled out. Our best defence against these attacks is to avoid all large time adjustments. NTP has a parameter for exactly this; add the following line to your <code class="highlighter-rouge">/etc/ntp.conf</code> file to reject any adjustment of more than 5 seconds (this means that if your clock ever drifts by more than 5 s you have to set it manually – that should never happen unless you have a long power outage).</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tinker panic 5
</code></pre></div></div>
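<p>After restarting the server with <code class="highlighter-rouge">sudo service ntp restart</code>, you can verify that it has selected upstream peers and that the reported offsets are small:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ntpq -p
</code></pre></div></div>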
<p>Now on the Pi, edit <code class="highlighter-rouge">/etc/systemd/timesyncd.conf</code> to connect to the other machine’s NTP server:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[Time]
NTP=192.168.1.1
</code></pre></div></div>
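<p>Then restart the time sync daemon on the Pi and check that it is actually using the beacon node machine as its time server (<code class="highlighter-rouge">timesync-status</code> is available in the systemd version shipped with Ubuntu 20.04):</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo systemctl restart systemd-timesyncd
timedatectl timesync-status
</code></pre></div></div>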
<h2 id="optional-harden-your-pi-using-ufw">Optional: Harden your Pi using ufw</h2>
<p>You can use a firewall to allow only the ssh port on the Pi. The configuration given above already does that; however, if you plan to use your Pi for anything else, this is an extra measure against accidentally opening additional ports.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo apt-get install ufw
sudo ufw default deny incoming
sudo ufw default allow outgoing
sudo ufw allow ssh
sudo ufw enable
</code></pre></div></div>
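<p>You can verify the resulting rule set with:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>sudo ufw status verbose
</code></pre></div></div>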
<h2 id="future-work">Future work</h2>
<ul>
<li>Adapt this guide for mainnet where necessary</li>
<li>Simplify updating ubuntu and lighthouse node on the Pi</li>
</ul>