There is no such thing as “a bitcoin”
Recently on Twitter, the St. Louis Fed’s David Andolfatto asked an interesting question:
How hard would it be to track these funds and mark them as fully taxable at any point they are redeemed for dollars (or any other currency) at an exchange? https://t.co/T1VqFGs0X7
— David Andolfatto (@dandolfa) July 1, 2020
That is, would it be possible to mark a particular bitcoin as having been paid in ransom so that such tainted coins could not be exchanged for dollars at regulated exchanges without first paying a tax? Putting aside whether that would be good policy, the threshold question is whether it’s technically possible to trace individual bitcoins in this way. In a later tweet he elaborated,
Can one not simply treat each satoshi as the basic unit? Don't care how they're combined into larger units -- I just follow the individual satoshis and threaten to confiscate them (and them alone) whenever I detect one being part of an exchange for electronic USD on an exchange.
— David Andolfatto (@dandolfa) July 1, 2020
My shorthand answer to this question is no, because there is actually no such thing as “a bitcoin” that can be traced. Twitter did not prove to be an auspicious medium to convey what I meant by that shorthand, so here’s a stab at explaining the concept in detail.
Many people who generally understand how Bitcoin and its blockchain system work are nevertheless under the erroneous impression that there are such things as atomic units of bitcoin that are moved from address to address and can therefore be tracked on the public chain. I think this mistake has its roots in language and analogy.
Bitcoin is a digital currency, “coin” is in its name, and the White Paper describes it as electronic cash and mentions “wallets.” As a result we talk about it the way we talk about physical currency, such as paper bills or coins. We say things like, “She gave me a bitcoin.” People therefore tend, quite justifiably, to envision bitcoins as atomic units that are passed around. Even if they understand that bitcoins are sub-divisible to eight decimal places, and that those units are called satoshis, they may still think that those are the atomic units.
This leads people to conceptualize bitcoins like dollar bills with serial numbers.
I mean, I don't understand this. Replace "satoshi" with $ bills, each with unique serial number. I don't care about the order in which they move in and out of a bar. If I catch the stolen $, I remove it from the bar, same as I would if I caught a counterfeit.
— David Andolfatto (@dandolfa) July 1, 2020
And since people know that the blockchain keeps a public record of all transactions, they imagine that this means that individual bitcoins can be traced by their “serial numbers” moving from address to address, from wallet to wallet.
But the coins and wallets analogy is just that: an analogy. This is not at all how Bitcoin works.
So how does it work? Suppose Alice wants to send some bitcoins to Bob. What she does is compose a new transaction message to be broadcast on the network for validation and inclusion in the blockchain by a miner. In the transaction message she must reference one or more previous transactions on the blockchain of which she was a recipient.[1] Those referenced transactions are called inputs; they are in essence the funds with which she’s transacting. The sum of the inputs must equal or (almost invariably) exceed the amount she wants to send to Bob. She must also, of course, note the address or addresses to which she wants to send funds. These are called outputs. She signs the transaction message with the private keys corresponding to the input addresses and broadcasts it on the peer-to-peer network. If all goes well, a miner will include the transaction in the next block after verifying that the inputs and signatures are valid. The transaction is now complete and Bob can now reference this completed transaction as an input into a future transaction when he wants to move the funds.
In the simplest case, there is only one input and only one output. Alice references a transaction in which someone sent her exactly one bitcoin and she sends exactly one bitcoin to Bob’s address. In such a case, it might not be unreasonable to say that we can track the movement of one particular bitcoin, just as if it were one particular $100 bill. However, such a transaction is not merely improbable; I’m not sure there has ever been one like it.
First, miners don’t typically work for free. In addition to block rewards, miners are incentivized by fees they can collect from each transaction they include in a block that they add to the blockchain. These fees are voluntary, but miners will almost always ignore transactions that don’t have fees (or have fees below the current market rate). So, if Alice wants to send one bitcoin to Bob, it’s likely she will have inputs to the transaction that total, say, 1.01 bitcoins. Whatever difference remains between the inputs and the outputs in a transaction message is understood to be the fee, and the miner gets to keep it.
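To make the fee arithmetic concrete, here is a minimal sketch of a transaction as a list of inputs and outputs, with the fee computed as the leftover. This is an illustration only, not Bitcoin's actual data format: the `Output` and `Transaction` classes, the `implied_fee` function, and the address labels are all hypothetical names. Amounts are held in satoshis (integers) to avoid rounding problems.

```python
from dataclasses import dataclass
from typing import Tuple

SATS_PER_BTC = 100_000_000  # one bitcoin is divisible into 10^8 satoshis


@dataclass(frozen=True)
class Output:
    address: str  # hypothetical label standing in for a real address
    value: int    # amount in satoshis


@dataclass(frozen=True)
class Transaction:
    inputs: Tuple[Output, ...]   # previously received outputs being spent
    outputs: Tuple[Output, ...]  # new outputs being created


def implied_fee(tx: Transaction) -> int:
    """The fee is never stated; it is whatever the outputs leave unassigned."""
    total_in = sum(o.value for o in tx.inputs)
    total_out = sum(o.value for o in tx.outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    return total_in - total_out


# Alice spends an input worth 1.01 BTC and assigns exactly 1 BTC to Bob.
tx = Transaction(
    inputs=(Output("alice-1", 101_000_000),),
    outputs=(Output("bob", 1 * SATS_PER_BTC),),
)
fee = implied_fee(tx)  # the remaining 0.01 BTC goes to the miner
```

Note that nothing in the transaction names the miner or the fee; the fee exists only as a difference between two sums.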
Second, users typically don’t have exact change. It’s unlikely that Alice will have a single prior transaction of exactly the right size to use as an input to pay Bob (i.e. 1.01 BTC: exactly one bitcoin for Bob plus the appropriate mining fee). More likely she will have many different possible inputs to choose from: for example, one for 87 bitcoins, one for .51, one for .7365, one for 14.98, etc. As a result, the number of inputs and/or outputs in her transaction will increase. For example, if she wants to send Bob one bitcoin, and wants to also include a .01 bitcoin mining fee, she has to find inputs equal to 1.01 bitcoins or more. So, she can use one big input (like the one for 87 bitcoins) and get change of 85.99 bitcoins back by including a second output for that amount to an address that she controls. The transaction would now have one input and two outputs. Alternatively she could use two small inputs (like the ones for .51 and .7365), but she would still want change (i.e. 0.2365), for which she would provide a change address. That transaction would therefore have two inputs and two outputs.
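The change amounts in those two scenarios can be checked with a few lines of arithmetic. The sketch below is only an illustration; `change_for` is a hypothetical helper, and amounts are again expressed in satoshis.

```python
SATS_PER_BTC = 100_000_000


def change_for(input_values, payment, fee):
    """Return the change Alice must send back to herself, in satoshis."""
    total_in = sum(input_values)
    needed = payment + fee
    if total_in < needed:
        raise ValueError("inputs do not cover payment plus fee")
    return total_in - needed


payment = 1 * SATS_PER_BTC   # 1 BTC for Bob
fee = SATS_PER_BTC // 100    # 0.01 BTC for the miner

# One big input of 87 BTC: one input, two outputs (Bob and change).
change_big = change_for([87 * SATS_PER_BTC], payment, fee)

# Two small inputs of .51 and .7365 BTC: two inputs, two outputs.
change_small = change_for([51_000_000, 73_650_000], payment, fee)

print(change_big / SATS_PER_BTC)    # 85.99
print(change_small / SATS_PER_BTC)  # 0.2365
```

If Alice forgot the change output, the whole remainder would count as the fee, which is why the change address matters.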
One last wrinkle I’ll add is that when Alice composes a transaction message to pay Bob, she can also take the opportunity to pay Charlie as well. She can send up to the total of her inputs to as many addresses as she wants, so the number of transaction outputs can be very large. So, a not uncommon bitcoin transaction will look like this:
The key thing to realize is that there are no individual, atomic units of bitcoin that are transferred among addresses. Indeed, there are no such things as bitcoins that one can point to, much less track. What individuals “own” are not bitcoins per se, but unspent transaction outputs (UTXOs) that can serve as inputs for new transactions. When a new transaction is added to the blockchain, the transactions that served as inputs are of course still visible on the ledger, but they can no longer be spent,[2] and new unspent transaction outputs are available in the newly created transaction. The input and new transactions are certainly linked in a chain, but in no way can we identify in the new UTXO a particular satoshi that was present in the input UTXO, again because individual satoshis don’t really exist.[3]
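One way to see this is to model the ledger's spendable state as nothing more than a set of UTXOs. The following is a rough sketch under simplifying assumptions (no signatures, no scripts, no blocks); `apply_transaction`, the transaction IDs, and the address labels are hypothetical. Each UTXO is just a reference, an owner, and a number. Spending deletes the old entries and writes new ones.

```python
def apply_transaction(utxos, txid, spent_keys, new_outputs):
    """Return the UTXO set after a transaction is confirmed.

    utxos:       dict mapping (txid, output_index) -> (address, satoshis)
    spent_keys:  the UTXO references used as inputs
    new_outputs: list of (address, satoshis) created by the transaction
    """
    missing = [key for key in spent_keys if key not in utxos]
    if missing:
        raise ValueError(f"not unspent: {missing}")
    total_in = sum(utxos[key][1] for key in spent_keys)
    total_out = sum(value for _, value in new_outputs)
    if total_out > total_in:
        raise ValueError("outputs exceed inputs")
    # The spent entries disappear; only their total value constrained the outputs.
    updated = {k: v for k, v in utxos.items() if k not in spent_keys}
    for index, output in enumerate(new_outputs):
        updated[(txid, index)] = output
    return updated


before = {
    ("tx-a", 0): ("alice", 51_000_000),
    ("tx-b", 0): ("alice", 73_650_000),
}
after = apply_transaction(
    before,
    txid="tx-c",
    spent_keys=[("tx-a", 0), ("tx-b", 0)],
    new_outputs=[("bob", 100_000_000), ("alice-change", 23_650_000)],
)
```

Nothing in `after` records which of Bob's 100,000,000 satoshis came from `tx-a` and which from `tx-b`. The data model has no place to store that information, because the only thing carried across is a sum.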
Let’s now bring it back to Andolfatto’s proposal to tax blacklisted bitcoins when they are presented to a regulated exchange. We can now see that in Bitcoin it makes no sense to say that a particular bitcoin was paid in ransom and is now being brought to an exchange. One can nevertheless trace the movement of funds on the blockchain and show that a particular UTXO has somewhere in its chain an illicit transaction, but that’s not the same thing as saying that a particular bitcoin was part of an illicit transaction.
In the figure above, suppose the UTXO with 87 bitcoins is the proceeds of crime, while the rest are not. When Alice sends bitcoins to Bob and Charlie (and pays a mining fee to do so), there is no meaningful way in which we can say that Bob, Charlie, or the miner received some of the 87 ill-gotten bitcoins. And it’s just as meaningless to say that either Bob’s or Charlie’s or the miner’s coins came from the legitimate UTXOs. It’s not just that we can’t be sure, it’s that the concept of particular coins changing hands makes no sense because individual coins don’t exist.
At this point, I hope I have answered the initial question: is it possible to track individual units of bitcoin? It’s not.
What someone might then say is, why not simply treat each of the new UTXOs as containing illicit coins in the same proportion that illicit coins made up of the transaction’s inputs? Couldn’t you then tax that proportion of Bob or Charlie’s coins? The answer is sure, I guess you could do that, but that would be a policy choice external to the mechanics of Bitcoin, and I imagine it would be seen as a pretty inequitable policy. There’s a reason the currency rule emerged.
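For what it’s worth, such a proportional rule is easy to write down, which underlines that it is a convention layered on top of Bitcoin rather than something the protocol records. Here is a minimal sketch; `proportional_taint` is a hypothetical function, and the input and output amounts are made-up round numbers chosen for easy arithmetic, not figures from any real transaction.

```python
from fractions import Fraction

SATS_PER_BTC = 100_000_000


def proportional_taint(inputs, output_values):
    """Assign each output the same tainted share as the inputs overall.

    inputs:        list of (satoshis, is_tainted) pairs
    output_values: list of output amounts in satoshis
    Returns (share, tainted satoshis attributed to each output).
    """
    total_in = sum(value for value, _ in inputs)
    tainted_in = sum(value for value, tainted in inputs if tainted)
    share = Fraction(tainted_in, total_in)
    return share, [share * value for value in output_values]


# Assumed example: an 87 BTC tainted input mixed with a 13 BTC clean input.
inputs = [(87 * SATS_PER_BTC, True), (13 * SATS_PER_BTC, False)]
outputs = [1 * SATS_PER_BTC, 2 * SATS_PER_BTC]  # say, Bob and Charlie

share, tainted = proportional_taint(inputs, outputs)
print(share)                      # 87/100
print([int(t) for t in tainted])  # [87000000, 174000000]
```

Under this rule Bob is deemed to hold 0.87 tainted bitcoins even if he had no idea where Alice’s funds came from. Other conventions would give different answers from the same transaction, which is the sense in which the choice is policy and not mechanics.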
As a result, the practice that industry, regulators, and law enforcement have developed is to focus on addresses rather than individual bitcoins. You can know with certainty that a particular address received a ransom payment, and you can know which other addresses have been sent funds from that ransom address and on and on. They also look at what proportion of each transaction comes from previous illicit transactions and how many suspect transactions an individual may be tied to. This information can be used by exchanges to require more information from customers with addresses in close proximity to known illicit transactions, and law enforcement can use all this to prosecute crimes. Using solely the information available on-chain, however, it’s not really possible to automate justice in any sensible way.
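The “close proximity” idea can be illustrated as a simple graph walk over addresses. This is only a toy sketch and not a description of how any analytics firm actually works: `hops_from` is a hypothetical function, the address names are invented, and the `sent_to` map stands in for what would really be derived from the public ledger.

```python
from collections import deque


def hops_from(source, sent_to):
    """Breadth-first search: fewest payment hops from source to each address."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        address = queue.popleft()
        for recipient in sent_to.get(address, ()):
            if recipient not in distance:
                distance[recipient] = distance[address] + 1
                queue.append(recipient)
    return distance


# Invented example: which addresses each address has paid.
sent_to = {
    "ransom": ["a1", "a2"],
    "a1": ["a3"],
    "a3": ["exchange-deposit"],
    "clean": ["exchange-deposit"],
}

distances = hops_from("ransom", sent_to)
print(distances["exchange-deposit"])  # 3
print("clean" in distances)           # False
```

The result says that the deposit address is three hops downstream of the ransom address. It does not say how much of the deposit, if any, “is” the ransom, and the same address also received funds from a source with no link to it. Deciding what to do with that fact is a judgment call, not a computation.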
Finally, I’d like to say that I wrote this post because I couldn’t find any publication explaining the above in detail. After writing the preceding 2,000 words, however, I came across this article by Northern Illinois University philosopher Craig Warmke, which although it doesn’t address exactly the same issue, would have worked as a reference for the proposition that you can’t track individual bitcoins. It’s very interesting and worth reading.
Thank you to Tom Robinson of Elliptic for reviewing a draft of this post.
[1] By this I mean that the transactions in question were sent to a public address for which she controls the corresponding private key. More specifically, she would reference UTXOs, which will be explained momentarily.
[2] That’s because also visible on the ledger is the fact that they have been used as inputs in another transaction.
[3] UTXOs are the closest analogy we have to discrete coins, but they are “destroyed” when they are used in transactions, and new UTXOs created with the same aggregate value.