From private payments to programmable privacy: Are we stuck?

Is private chain state useful for web3 apps or just a marketing trick?

Dec 16, 2024

Programmable privacy is complex to reason about and build applications with. Private payments have been around for a while, yet a proper zero-knowledge, fully programmable chain (think EVM-equivalent) is a myth: the so-called "zkRollups" today are only verifiable, not private. The fundamental obstacle to a "private" programmable chain is that web3 apps rely on a shared state, which must be public to be of any use. The natural question then follows: could some parts of the state be public while others remain private?

The impossibility of a zero-knowledge chain

In this post, I start by providing a bird's-eye view of private payments and then proceed to discuss how more complex functionality could be built in a privacy-preserving manner. Taking swaps as an example, I outline the difficulty of composing private and public states and how hand-designed swap protocols are the only reasonable choice today.

Along the way, we will see the design and engineering challenges in building even a single privacy-preserving functionality like swaps on top of a blockchain protocol. Is there a viable path to achieving general programmable privacy? I will conclude this post with an opinionated outline of the status quo of programmable privacy chains and leave a teaser of a simpler design.

Background on private transfers

Most ledger-based systems, such as Bitcoin or Ethereum, are public by default. Asset transfers on these networks reveal the sender, recipient, and amount. Payment history is, therefore, fully linked and can always be traced back to the source of funds (the genesis or coinbase events).

You might have heard of ZCash or Tornado Cash, two protocols that enable private transfers (a.k.a. shielded transfers) and solve precisely the problem of breaking transaction traceability.

They have quite different architectures: ZCash is its own chain, while Tornado Cash is a series of smart contracts on Ethereum. Nevertheless, under the hood, they both utilize notes for transferring shielded value. A note allows one account to spend value and another to redeem it, without disclosing the link between the spender and the redeemer.

Notes in ZCash and Tornado Cash are enabled by zero-knowledge proofs (ZKPs, also commonly called zkSNARKs), a cryptographic tool that ensures the correctness of some computation but reveals no unnecessary information otherwise. In the case of private value transfers, a fundamental property ZKPs ensure is that no assets are minted out of thin air while protecting the parties' anonymity.
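
To make the note idea more concrete, here is a minimal sketch of the bookkeeping behind a fixed-denomination note pool. It is an illustration only: the names `ShieldedPool`, `h`, `deposit`, and `redeem` are hypothetical, SHA-256 stands in for the circuit-friendly hashes real systems use, and the zero-knowledge proof is replaced by a plain check on a witness that a real system would never reveal.

```python
import hashlib
import secrets


def h(*parts: bytes) -> bytes:
    """Hash helper; a stand-in for the hash function a real protocol would use."""
    return hashlib.sha256(b"|".join(parts)).digest()


class ShieldedPool:
    """Toy note pool: commitments become public on deposit, nullifiers on redeem."""

    def __init__(self):
        self.commitments = set()
        self.nullifiers = set()

    def deposit(self, commitment: bytes) -> None:
        # The spender publishes only the commitment, never the secret values.
        self.commitments.add(commitment)

    def redeem(self, nullifier_hash: bytes, witness) -> bool:
        # ASSUMPTION: in a real system the witness stays on the user's machine and
        # a ZKP convinces the pool that the two checks below hold.
        secret, nullifier = witness
        known_note = h(secret, nullifier) in self.commitments
        matches = h(nullifier) == nullifier_hash
        if not (known_note and matches) or nullifier_hash in self.nullifiers:
            return False
        self.nullifiers.add(nullifier_hash)  # blocks a second redemption of the note
        return True


pool = ShieldedPool()
secret, nullifier = secrets.token_bytes(32), secrets.token_bytes(32)
pool.deposit(h(secret, nullifier))
ok = pool.redeem(h(nullifier), (secret, nullifier))
again = pool.redeem(h(nullifier), (secret, nullifier))
```

The public record contains one commitment and one nullifier hash. Without the witness, an observer cannot tell which commitment a given nullifier hash belongs to, and the nullifier set is what stops the same note from being redeemed twice.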

Despite their implementation differences, the goal of private payment systems is the same: enable two parties to send assets without linking them to their previous onchain activity. They prevent third-party observers from tracking the source and amount of funds moved.

To the outside, the transfer looks like this:

Whereas the two parties involved, Alice and Bob, see the details of the transfer: [1]

Towards private asset swaps

We will now try to make a leap from private transfers to private swaps, but first, let's quickly review what the most basic public swap looks like today.

Recall: swaps today

One of the distinguishing features of DeFi is the peer-to-protocol mechanism. At any given time, a user can join the protocol and perform an action that would have traditionally required either finding a willing counterparty (p2p), or a trusted third party to facilitate price discovery (centralized exchange). This includes actions such as (digital) goods exchange, auctions, or lending.

In the case of asset exchanges - or swaps - the simplest example of a decentralized exchange (DEX) protocol is the Uniswap v1 Automated Market Maker (AMM). In Uniswap (details here), liquidity providers add assets to a pool, and users wishing to exchange some token A interact with the protocol (rather than with any specific other user) by depositing token A into the pool and withdrawing a certain amount of token B. The amounts are calculated based on the protocol's public rules - in the case of Uniswap, the product of the token A amount (x) and the token B amount (y) is kept constant (k) before and after the swap, i.e., k = x * y (ignoring fees).

The amount in each pool is publicly visible at all times, which allows the user to determine the conditions for their trade - in other words, the exchange rate is known. They know precisely how much they will get in return for their tokens.
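
As a small worked example of the constant-product rule, the sketch below computes how much token B a user receives for a given amount of token A. The function name `swap_output` and the reserve figures are illustrative assumptions, and fees are ignored.

```python
from fractions import Fraction


def swap_output(x_reserve, y_reserve, dx):
    """Amount of token B paid out for dx of token A, keeping x * y constant (no fees)."""
    k = Fraction(x_reserve) * Fraction(y_reserve)
    new_y = k / (Fraction(x_reserve) + Fraction(dx))
    return Fraction(y_reserve) - new_y


# Illustrative pool: 60 units of token A, 100 units of token B.
print(swap_output(60, 100, 20))  # 25
```

Because both reserves are public, anyone can run this calculation before submitting a trade, which is exactly what makes the exchange rate known in advance.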

Private swaps: a naive approach

Suppose we wanted to do Uniswap privately. Let's explicitly state our desiderata:

The user remains anonymous (swap is unlinkable to previous user actions), and
The amounts swapped remain private.

Why this doesn't work has been described many times before, notably on the ETH Research forum, but it's worth reiterating the crux of the issue.

Recall the purpose of private transfers: I send assets to someone without revealing how much, or who I am.

This time, let's work backward: in order for the Uniswap-like protocol to work, I need to know the conditions of the trade, i.e., what I’m going to get in return for my tokens. So if I know how many tokens I'm going to get, and I know the protocol's swap rules (e.g., that it's a constant-product AMM), I am also able to calculate the current amounts of tokens held by the DEX. So, at any time, the current amounts of tokens A and B in the pool must be public.

In Ethereum, transactions in a block are executed sequentially, and the global state is updated after each transaction. So even if my transaction were to somehow hide the amount in isolation (the amounts would only stay hidden if the transaction were never executed), executing it alters the public state of the DEX, and that change exposes my private state: anyone can compare the state of the pool before and after my transaction and infer how much I swapped.
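
The leak is easy to demonstrate. The sketch below, with a hypothetical helper `infer_swap` and illustrative reserve figures, recovers a swap purely from two public snapshots of the pool.

```python
def infer_swap(before, after):
    """Recover a swap from two public pool snapshots given as (reserve_a, reserve_b)."""
    delta_a = after[0] - before[0]
    delta_b = after[1] - before[1]
    if delta_a > 0 and delta_b < 0:
        return {"sold": "A", "amount_in": delta_a, "amount_out": -delta_b}
    if delta_a < 0 and delta_b > 0:
        return {"sold": "B", "amount_in": delta_b, "amount_out": -delta_a}
    return None  # no swap, or a liquidity change rather than a swap


# Pool state immediately before and after the "hidden" transaction.
leak = infer_swap(before=(60, 100), after=(80, 75))
```

No cryptography is attacked here: the observer only subtracts two public numbers.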

A seemingly cleverer approach would involve hiding the amounts of tokens A and B and only providing a cryptographic proof of their correct updates. Unfortunately, computing such a proof requires knowing the actual values, and since every user who trades must produce one, the values cannot stay hidden from the users themselves.

As expected, this doesn’t work. In fact, if we take a step back and think about privacy in finance more broadly, we’d arrive at the following axiom:

Finance relies on selective disclosure for supply and demand to match.

Fully private DeFi is a myth, and some tradeoffs must be made. As explained further in the linked talk from T. Chitra, this usually comes in the form of efficiency vs. privacy tradeoff. As we’ll see later, “efficiency” can encompass price, time, or even UX. Let’s first explore a few mechanisms for achieving/increasing privacy in swaps.

What to do, how to swap?

In all the settings below, shielded transfers (ZCash-like or Tornado Cash Nova-like) are a prerequisite functionality of the chain.

p2p swaps

The simplest form of private swaps is the peer-to-peer setting, where users engage in an atomic exchange. The public doesn't learn anything besides the fact that some asset exchange happened.

p2p swaps unfortunately offer poor user experience:

Users must find the counterparty (or counterparties) to their trade. Private price discovery requires either engaging in a multi-party computation (MPC) or falling back to a trusted third party to facilitate matching (we assume the intermediary only helps with price discovery and never takes custody of the assets),
The exchange price is subject to heavy arbitrage: once the "match" is found, atomic swaps carry an inherent delay that one of the parties can exploit (that party can choose to abort, or to execute only when the price has moved in their favor).

Furthermore, p2p swaps defeat the whole purpose of AMMs, which rely on the premise of not requiring a counterparty to perform a trade.

Batched swaps

Hold on, earlier we said that when engaging in a transaction, users should know the exact conditions of the trade. Let’s revisit this claim.

It is only valid when viewing the trade in isolation, when no one else is interacting with the protocol. In practice, multiple users might perform swaps in a single block, and even if there are no organic swaps, we still have arbitrage happening. The reality is that a user's swap might be executed at a slightly different price than initially agreed to. This is why most DEX apps include a maximum slippage parameter in a user's request: how far the execution price may deviate from the quoted price before the user no longer accepts the trade.
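
A slippage check boils down to a single comparison. The sketch below is an illustration under assumed names (`accept_trade`, `max_slippage` expressed as a fraction), not the interface of any particular DEX.

```python
def accept_trade(quoted_out, executed_out, max_slippage):
    """Return True if the executed output is within the tolerated shortfall of the quote."""
    min_out = quoted_out * (1 - max_slippage)
    return executed_out >= min_out


# Quoted 25 USDC with a 0.5% tolerance: 24.9 is acceptable, 24.5 is not.
within = accept_trade(25.0, 24.9, 0.005)
too_far = accept_trade(25.0, 24.5, 0.005)
```

The user therefore already signs up for a range of outcomes rather than an exact amount, which is the flexibility the next idea builds on.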

Since we already allow some flexibility, maybe we could relax a little further the condition of knowing those exact trade conditions?

This is where the idea of batching swaps comes in handy: asset trades are aggregated at block-level and all executed at a fixed clearing price. The individual swap amounts are summed by the validators, who then proceed to calculate an aggregate clearing price per asset pair in a block. The total amount and the clearing price are made public so that even an outside observer can calculate the total output amounts. The users then claim their output note privately, i.e., in a "fresh" unlinkable claim, without revealing how much they claim.

In Penumbra, the only currently deployed private batch swap protocol [2], the input amounts are revealed publicly. Since the total amount swapped is also known, the individual output amounts can also be inferred - but it's difficult to tell which claim corresponds to which amount, since outputs are claimed privately. From the Penumbra docs: "Although sweep descriptions do not reveal the amounts, or which swap’s outputs they claim, they do reveal the block and trading pair, so their anonymity set is considerably smaller than an ordinary shielded value transfer." In other words, the anonymity set (the "level of privacy" in common parlance) heavily depends on the number of swaps per block and the timing of the claims. To illustrate, imagine a constant-product batch AMM with 60 UM and 100 USDC (we ignore routing via other pools and fees, for simplicity).

No privacy

We wish to swap 20 UM for USDC,
We are the only user performing the swap,
We deposit 20 UM into the pool, and so to keep the product k = 60 * 100 = 6000 constant, the new amount of USDC in the pool becomes 6000 / (60 + 20) = 75,
We claim 100 - 75 = 25 USDC from the swap. The claim transaction references the block where the swap was performed, so anyone can link the claim to our swap, since ours is the only swap in that block.
It doesn't matter when we claim; there is no anonymity.

"High" privacy

We wish to swap 2 UM for USDC,
There are nine other users, who each swap 2 UM, for an additional 18 UM in total,
The clearing price for the UM<>USDC asset pair is the same as in the previous example, so each user's output is 2.50 USDC,
All users claim their notes immediately after they see their swap transaction was included in a block,
Sometime in the future, a user pays 1 USDC for a subscription service using their swap output. Since the payments are executed privately and only the subscription amount is revealed, it is hard to pinpoint where the payment came from. It might have come from some output in the 20 UM <> 25 USDC swap or from any number of other swaps executed. The anonymity set is large.

An illusion of privacy

We wish to swap 2 UM for USDC,
There are nine other users who, in total, swap another 18 UM as before, but no other user's input equals 2 UM,
The clearing price is the same as above, but instead of claiming the output immediately, one of the users (us, in this example) waits to claim until they have to pay for the subscription service,
They claim the swap output of 2.50 USDC, and immediately after pay 1 USDC for the subscription service,
The swap claim and the subscription payment are now highly correlated in time,
Since there was only one swap input amount with the value of 2 UM, that input is also correlated with the subscription payment,
Thus, the source of funds can be tracked with higher probability.
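
The arithmetic behind the three scenarios above can be reproduced with a short sketch. It assumes a one-directional batch (everyone swaps UM for USDC), no fees and no routing; `clear_batch` and `same_amount_counts` are hypothetical helpers, and counting equal input amounts is only a crude proxy for how well a swap blends into the batch, not a formal anonymity measure.

```python
from collections import Counter
from fractions import Fraction


def clear_batch(x_reserve, y_reserve, inputs):
    """Clear a one-directional batch of A->B swaps at a single price (no fees)."""
    total_in = sum(Fraction(i) for i in inputs)
    k = Fraction(x_reserve) * Fraction(y_reserve)
    total_out = Fraction(y_reserve) - k / (Fraction(x_reserve) + total_in)
    price = total_out / total_in  # units of B per unit of A, the same for everyone
    outputs = [Fraction(i) * price for i in inputs]
    return total_in, total_out, outputs


def same_amount_counts(inputs):
    """Crude linkability proxy: how many swaps share each public input amount."""
    return Counter(inputs)


solo = clear_batch(60, 100, [20])                   # "No privacy"
uniform_inputs = [2] * 10                           # '"High" privacy'
mixed_inputs = [2, 1, 3, 1, 3, 1, 3, 1, 1, 4]       # "An illusion of privacy"
uniform = clear_batch(60, 100, uniform_inputs)
mixed = clear_batch(60, 100, mixed_inputs)
```

All three batches move 20 UM and pay out 25 USDC in total, so the clearing price is identical. What differs is how many public inputs look like ours: ten in the uniform batch, only one in the mixed batch.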

In future versions of the protocol, Penumbra plans to add threshold homomorphic encryption to introduce privacy on the input side as well: users would encrypt their input amounts in a way that lets them be added together while still encrypted. The validators would then participate in a threshold decryption protocol (another form of MPC) to reveal the grand total of the block's swap amount, while keeping individual contributors' inputs secret. Claims would function as before.
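
To show what "adding encrypted amounts together" means, here is a toy sketch using textbook Paillier encryption. This is a swapped-in technique chosen for brevity, not the construction Penumbra plans to use; the primes are tiny and insecure, and a single key holder decrypts, whereas a threshold scheme would split that step across the validators. The names `encrypt`, `add`, and `decrypt` are illustrative.

```python
import math
import secrets

# Toy parameters: far too small for real use, and held by a single party.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # valid because the generator is N + 1


def encrypt(m):
    """Paillier encryption with generator N + 1: c = (1 + m*N) * r^N mod N^2."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (1 + m * N) * pow(r, N, N2) % N2


def add(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % N2


def decrypt(c):
    """Single-key decryption; a threshold scheme would distribute this step."""
    return (pow(c, LAMBDA, N2) - 1) // N * MU % N


inputs = [2] * 10  # ten users, 2 UM each, in integer units
aggregate = encrypt(0)
for amount in inputs:
    aggregate = add(aggregate, encrypt(amount))
total = decrypt(aggregate)  # only the batch total is ever decrypted
```

Only the aggregate ciphertext is decrypted, so the published figure is the batch total of 20 UM and none of the individual inputs.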

Uniform Random Execution

Another mitigating strategy to tackle the lack of privacy in Uniswap and other AMMs was proposed in 2021 by G. Angeris, A. Evans, and T. Chitra and subsequently analyzed for differential privacy by the same authors. The strategy is called Uniform Random Execution and relies on trade splitting, permuting the order and adding random noise to the price the AMM quotes.

The problem described in their paper arises in the context of what I’d term a semi-private DEX: the price quoted for an asset pair is public and updated with each trade, but the reserve amounts are hidden; also, the individual trades are private.

“Wait, didn’t you say at the beginning that reserves must be public for such a scheme to work?”

Yes, if we’re using only ZKPs. The CAE 21 paper assumes the existence of an oracle responsible for maintaining the DEX state.

I want to expand on the concept of this DEX oracle a bit further since it’s not explicitly covered in the paper. Imagine we have access to an imaginary, honest offchain entity [3] that lets LPs privately add liquidity to a pool and keeps track of the reserves internally. It always publishes the most up-to-date prices. Similarly, users can privately submit exchange requests, whereby they transfer a certain amount of assets to the oracle (in a shielded way). The oracle then sends them back the corresponding swapped amounts of target assets (also in a shielded way) and, as before, publishes the up-to-date price. Note that the reserve amounts are never made explicitly public.

The authors analyze this scenario and conclude that for most practical deployments [4] of AMMs, an adversary can infer the reserve amounts by submitting their own trades and, subsequently, infer the amounts traded in a “private” transaction they are interested in.
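
The attack is simple to reproduce for a constant-product pool. The sketch below is my own illustration rather than the construction from the paper: it assumes the oracle publishes the marginal price y / x, charges no fees, and that the adversary trades right before and observes the price right after the target transaction. All class and function names are hypothetical.

```python
import math


class PrivateReservePool:
    """Constant-product pool that publishes only its marginal price y / x (no fees)."""

    def __init__(self, x, y):
        self._x, self._y = float(x), float(y)

    def price(self):
        return self._y / self._x

    def swap_a_for_b(self, dx):
        k = self._x * self._y
        dy = self._y - k / (self._x + dx)
        self._x += dx
        self._y -= dy
        return dy  # known only to the trader who submitted dx


def reserves_from_probe(price_before, dx, dy):
    """Solve y = p*x and dy = y*dx/(x+dx) for the hidden reserves before the probe."""
    x = dx * dy / (price_before * dx - dy)
    return x, price_before * x


def trade_from_prices(x, y, price_after):
    """Given known reserves and the next published price, recover the A amount traded."""
    k = x * y
    return math.sqrt(k / price_after) - x


pool = PrivateReservePool(60, 100)
p0 = pool.price()
probe_dx = 1.0
probe_dy = pool.swap_a_for_b(probe_dx)             # the adversary's own small trade
x0, y0 = reserves_from_probe(p0, probe_dx, probe_dy)
x1, y1 = x0 + probe_dx, y0 - probe_dy              # reserves after the probe
pool.swap_a_for_b(20.0)                            # the victim's "private" trade
victim_dx = trade_from_prices(x1, y1, pool.price())
```

One probe trade and two published prices are enough to recover both the hidden reserves and the victim's trade size, up to floating-point error.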

So a semi-private DEX, as expected (again), doesn’t work. To remedy this, the authors propose two solutions: 1) Batching swaps, which we already mentioned above; and 2) Uniform Random Execution (URE).

The idea behind URE is extremely simple: introduce random noise to the price quoted by the AMM. Realizing our magical oracle most likely involves some advanced cryptography (MPC) and/or trust assumptions, but the details of how one would go about implementing this are neither covered nor relevant to our discussion.
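
For intuition only, here is a loose sketch of the splitting and permuting ingredients mentioned earlier. It is not the mechanism as specified in the paper: the equal-sized chunks, the function name `execute_randomized`, and the trader labels are all assumptions, and fees are ignored.

```python
import random


def execute_randomized(x, y, trades, chunks_per_trade, rng):
    """Split each A->B trade into equal chunks, shuffle all chunks, execute in that order."""
    pieces = [
        (trader, amount / chunks_per_trade)
        for trader, amount in trades.items()
        for _ in range(chunks_per_trade)
    ]
    rng.shuffle(pieces)  # uniformly random execution order
    received = {trader: 0.0 for trader in trades}
    for trader, dx in pieces:
        k = x * y
        dy = y - k / (x + dx)  # constant-product payout for this chunk, no fees
        x, y = x + dx, y - dy
        received[trader] += dy
    return received, (x, y)


trades = {"alice": 2.0, "bob": 18.0}
received, reserves = execute_randomized(60.0, 100.0, trades, 4, random.Random(7))
```

The pool ends at the same reserves whatever the order, since a fee-less constant-product pool is path-independent, but how the 25 USDC is divided between the traders depends on the random permutation. That randomness in each trader's execution price is what makes individual trade sizes harder to read off the price series.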

Notably, the authors demonstrate “that the URE achieves differential privacy with parameters dependent on the number of trades and the curvature”. Differential privacy essentially means that one can do useful things with a dataset while not learning much about an individual data point (not even whether it’s present in the dataset or not). In the context of swaps, the useful thing is enabling a party to swap with a decent enough price, while hiding how much this individual entity transacted.

The URE approach seems to offer users the best tradeoff between privacy and price impact. I’m not aware of any practical deployments of URE. Perhaps the engineering challenges of doing this trustlessly make it overkill: batched swaps have been deployed (only?) by Penumbra, and even they haven’t rolled out privacy on the transaction input front. Building URE would involve similar techniques and some more.

An outlook on general programmable privacy

It should be clear by now that providing privacy on swaps is, at best, tricky and complex to implement and, at worst, leads to negative price impact and poor UX.

Is there any hope to extend privacy to arbitrary logic beyond just swaps? And what do we even mean by this?

We will look at general programmable privacy in the follow-up post. Here’s a sneak preview of our assessment of the state of programmable privacy solutions currently deployed or being worked on:

A few blockchain projects have architectures that allow a combination of private and public states in smart contracts (Aleo, Aztec, Miden, ?), with the aim of providing both confidentiality AND anonymity. Yes, those designs let you build specialized applications that rely on combining private and public states within one program. However, this programming model is overly restrictive. A private state in isolation is useless for composability (see: this entire post), so the private state must "interact" with the public state. Reasoning about and explicitly managing these interactions at the smart contract level can lead to information leakage and is, unfortunately, a burden on application developers. Furthermore, it places a new responsibility on them: developers must now navigate the compliance landscape within each new application.

Next time, we will also provide a fresh perspective on programmable privacy that foregoes confidentiality and focuses only on anonymity, resulting in a much simpler design and a clear path to adoption.

Summary

Shielded transfers are mostly solved today via zkSNARKs,
Adding privacy to functionality beyond transfers is more convoluted than sprinkling ZK over your protocol,
- Applications need shared global state
- When private state interacts with shared state, we lose privacy
(semi-)Private DEX design assumes shielded transfers, but further requires protocol-tailored techniques such as batching, splitting or randomization,
zkSNARKs might not be enough: the current design of a fully private DEX requires multi-party computation (e.g., threshold homomorphic encryption),
- MPC comes with its own trust vs. performance tradeoff,
- TEEs can help circumvent collusion concerns

Programmable privacy is clearly still in its infancy, with major challenges in cryptography, compliance, UX, DevEx and technical education. I am excited about the novel developments in the field and happy to contribute to making it more accessible.

Author: Marti Górny

This deep dive was inspired by the short talk I gave at Club3 by Very Early Ventures.

[1] Technically, Alice doesn’t need to know Bob’s identity - she creates a note that some other account will spend in the future. Similarly, Bob doesn’t need to know who sent him the note. In practice, however, the parties will often know each other’s identities, e.g., in p2p payments.

[2] CoW Protocol utilizes batch swaps, too, in order to provide a uniform clearing price, but it doesn't offer privacy protection since it doesn't rely on shielded transfers, which are a prerequisite for offering privacy with batching.

[3] This functionality could be realized e.g. by employing ZKPs for shielded transfers + TEE to instantiate the oracle. The role of the oracle would ideally not be assumed by a single centralized entity, but there are ways around that, such as strapping a consensus network on top of it, using MPC, or both, or your other favorite decentralization trick depending on the desired properties.

[4] Practical deployments of AMMs are such that:

The trading function is known. Otherwise you don’t know what you’re going to receive.
The trading function is strictly concave and increasing.
The adversary is allowed to interact with the AMM before and after the transaction that they wish to target.

NP Labs
