The first time I co-founded a company was back in 1996 with a group of four other guys (and yes we were certainly “guys” in the most utterly-stereotypical tech startup sense of the word). One of these was T, who even today literally glows with Chicago-style marketing/finance enthusiasm — I’d challenge anyone to take a coffee-and-Howard-Stern-fueled “perception is reality” road trip with T and not end up a lifelong fan. When I tell fellow nerds that they need a great business person to start their company with, I’m thinking about him.
So when T called a few weeks ago and asked what I knew about Ethereum and Smart Contracts, I figured I should take a look. I’ve spent the last decade dismissing blockchain and crypto without really spending any brain cycles on them … just has never smelled right to me. But obviously there are a ton of people who think it’s the future, and it’s survived more than a few existential crises already, so ok, let’s see what’s going on in there.
Fair warning — the innards of these technologies are quite complex; clearly the vast majority of people “explaining” them online don’t have a clue what is really going on under the covers. I’m going to try to share what I’ve learned in relatively simple terms, but almost certainly will get some of it wrong. I’d love to be corrected where I’ve messed up so please let me know. And if you actually invest your dollars based on any of this, well, that’s 100% on you my friend.
What problem are you trying to solve?
The root of all of this stuff is technology that enables two strangers to exchange value without a trusted third party mediating the transaction. A distributed network of computers (not majority-owned by any single entity) coordinates to ensure that transactions are permanently recorded, that value is owned by exactly one entity at a time, and that it cannot be counterfeited. For sure it takes resources to run the network, but the costs are diffused across a ton of folks and transaction fees can at least theoretically be kept pretty low (more on this later).
The appeal is pretty obvious. Right now we count on trusted third parties for all of our transactions, from banks to mortgage escrow companies to Visa and Paypal and Square. Not only do these folks eat into our assets via transaction fees and float; they exert an incredible amount of power on the market overall. Remember “too big to fail?” Oh yeah, and they also keep (and profit from) our personal information. A fair, anonymous trading platform that avoids these issues seems pretty cool.
Part 1: Hashes, keys and signatures
First, a few core concepts. None of these are new; we’ve been using them forever to do things like keep websites secure. But they’re building blocks for all that comes later, so it’s worth setting up a bit of a glossary.
A one-way hash is an algorithm for representing any arbitrarily-sized chunk of data with a small, opaque, unique label. That’s a lot of words. “Small” is important because you can represent huge files, like an entire video, in a small, easy-to-manipulate string (typically 256 bits these days). Note this isn’t some magic compression technology; “one-way” or “opaque” means that you cannot reasonably get back to the original bytes using only the hash, and you can’t predict what a hash will look like from the original bytes. Last, a hash is “unique” because you will never (in practice) get the same hash for two pieces of original data, even if they differ by only a single bit.
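To make those three properties concrete, here is a minimal sketch using SHA-256 from Python's standard `hashlib`. The inputs and the helper names (`sha256_hex`, `bits_different`) are made up for illustration.

```python
import hashlib

def sha256_hex(data):
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()

def bits_different(hex_a, hex_b):
    """Count how many of the 256 bits differ between two digests."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

# "Small": a 5-byte input and a ~1 MB input both produce a 256-bit label.
small = sha256_hex(b"hello")
large = sha256_hex(b"x" * 1_000_000)

# "Unique": flip one bit of the input and the digest changes completely.
flipped = bytearray(b"hello")
flipped[0] ^= 0x01
changed = sha256_hex(bytes(flipped))

print(len(small), len(large))
print(small)
print(changed)
print(bits_different(small, changed), "of 256 bits differ")
```

Running it should show two unrelated-looking digests with roughly half of their bits different, even though the inputs differ by a single bit.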
A key is a secret used to encrypt or decrypt data. In stories we usually see symmetric keys, where two spies exchange a password that is used to both encrypt and decrypt a message. The problem with symmetric keys is that they need to be shared, which leaves them at risk of being stolen. Stolen keys not only mean that the wrong folks can read messages, but also that the wrong people can encrypt messages, which opens up all kinds of potential for nastiness.
Public/private key pairs work differently. A person’s private key is something that is never, ever shared — but the corresponding public key can be shared openly (it’s “public” after all). A message encrypted with my public key can only be decrypted using the corresponding private key. Because the private key never leaves my control, it’s a much more secure means of communication.
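Here is a toy illustration of that asymmetry using textbook RSA with tiny primes. It is a sketch for intuition only: the primes, the exponent and the `apply_key` helper are assumptions chosen for readability, and real systems use vetted libraries, much larger keys and padding schemes.

```python
# Toy RSA key pair. Tiny numbers, trivially breakable, illustration only.
p, q = 61, 53
n = p * q                  # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)        # safe to hand out
private_key = (d, n)       # never leaves my control

def apply_key(value, key):
    """Raise value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

message = 65                                     # a "message" must be a number below n
ciphertext = apply_key(message, public_key)      # anyone can encrypt to me
recovered = apply_key(ciphertext, private_key)   # only I can decrypt

print(message, ciphertext, recovered)
```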
A digital signature is another way to use public/private key pairs: to prove that something is genuine and unaltered. A signature is typically computed by first generating a one-way hash of the data, and then using a private key to encrypt the hash. The resulting signature is sent along with the original content to a recipient, who computes their own one-way hash of the data, decrypts the signature using the corresponding public key, and compares the two results. If they match, the recipient can be confident that the data (a) did in fact come from the owner of the key and (b) was not altered in transit.
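The hash-then-sign flow above can be sketched with the same textbook RSA math. Everything here is an assumption for illustration: the two Mersenne primes are used only because they are easy to write down, and the key is far too small and too structured for real use. Bitcoin and Ethereum actually sign with elliptic-curve keys (ECDSA), not RSA.

```python
import hashlib

# Toy key pair built from two known Mersenne primes (illustration only).
P = 2**89 - 1
Q = 2**127 - 1
N = P * Q
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent

def digest(data):
    """One-way hash of the data, reduced to a number below N."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data):
    """Sender: 'encrypt' the hash with the private key."""
    return pow(digest(data), D, N)

def verify(data, signature):
    """Recipient: 'decrypt' with the public key and compare to a fresh hash."""
    return pow(signature, E, N) == digest(data)

message = b"pay T 5 BTC"
signature = sign(message)
print(verify(message, signature))            # genuine and unaltered
print(verify(b"pay T 500 BTC", signature))   # altered in transit
```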
Part 2: Blockchains
A blockchain is just a ledger — an ordered list of all transactions that have ever been processed in a particular system. Transactions are collected into blocks, and each block is hashed using input data from those transactions and the previous block. In this way the integrity of each block is reinforced by all the blocks that have come before it. Only one single, exact, ordered sequence of transaction and header data can culminate in the unique hash at the end of the chain; by following the hashes from start to finish, anyone can easily (albeit laboriously) verify that no transactions have been lost or tampered with along the way.
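A minimal sketch of that structure, assuming a made-up block layout (a dict with `prev`, `txs` and `hash`) and JSON as the serialization. Real chains use compact binary headers, but the chaining idea is the same.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(prev_hash, transactions):
    """Hash a block from its transactions and the previous block's hash."""
    payload = json.dumps({"prev": prev_hash, "txs": transactions}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def build_chain(batches):
    """Turn an ordered list of transaction batches into a list of linked blocks."""
    chain, prev = [], GENESIS_PREV
    for txs in batches:
        current = block_hash(prev, txs)
        chain.append({"prev": prev, "txs": txs, "hash": current})
        prev = current
    return chain

def verify_chain(chain):
    """Follow the hashes from start to finish and recompute every one."""
    prev = GENESIS_PREV
    for block in chain:
        if block["prev"] != prev or block_hash(prev, block["txs"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True

chain = build_chain([["alice->bob:1"], ["bob->carol:2"], ["carol->dave:3"]])
print(verify_chain(chain))

chain[0]["txs"][0] = "alice->bob:1000"  # tamper with history
print(verify_chain(chain))
```

Changing any old transaction breaks the recomputed hash of its block, so the tampering is caught without needing to trust whoever handed you the chain.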
Part 3: Consensus
The blockchain data structure itself is actually pretty simple; the process of creating one in a distributed, “trustless” way is anything but. Thousands of independent nodes work together to create a “consensus” about what the blockchain should look like, hopefully making it really hard for a bad guy to screw things up.
Consensus is another long-standing problem in computer science, and many ways of dealing with it have been developed over the years. The approach used by Bitcoin is popularly called proof of work, and it’s useful to start there because it is pretty much the gold standard (ha ha, get it?) as far as crypto goes, and once you understand it the others make much more sense. Actually, PoW is really just one part of the complete consensus protocol, but that really only matters to the pedants out there. Here’s how it works (buckle up):
1. The Bitcoin peer-to-peer network is made up of thousands of nodes, where each node is effectively a computer running the Bitcoin client software. Nodes are configured as miners, full nodes or light nodes. (More on this later. There is also a subclass of full nodes called super nodes that I’m ignoring, sorry.)
2. Transactions are submitted directly to an arbitrary node through the Bitcoin RPC interface, most commonly using a wallet. Transaction data is signed prior to submission so that private keys need never leave the user’s chosen wallet or other personal storage.
3. Nodes relay transactions around the network, where miners pick them up and bundle them together into blocks. Once enough transactions are collected, the miner verifies signatures, accounts and balances, hashes the transaction data (this is called the “merkle root”) and starts hunting for a valid block hash.
This is where things get a little complicated. A “valid block hash” is one where the numerical representation is less than the current network “difficulty” threshold. Difficulty ratchets up and down according to a global algorithm that attempts to keep the rate of block creation constant, based on how much miner capacity is running across the network.
Remember that the block hash uses the transaction data (merkle root) and previous block hash as input. It also uses a nonce value, which is just a random number picked by the miner. The nonce is the only input to the block hash that can change. And remember that nobody can predict what the hash value will be based on the input — so miners just randomly pick a nonce and compare the resulting hash against the difficulty threshold until they find a valid one. This brute-force hunting process is what “proof of work” means … you cannot create a valid block hash unless you do a bunch of computing work, which costs real-world resources — making it infeasible to game in a way that makes economic sense.
Once a valid block hash is found, the miner tacks it onto the end of the chain and broadcasts their accomplishment across the network. Block creation includes a “block reward” for the miner (currently 6.25 BTC per block) — this is how new Bitcoin comes to exist, and why the term “miner” makes sense.
4. The story isn’t finished yet! Thousands of miners are all doing this work at the same time, using the same transactions1 and trying to find valid block hashes so they can get their reward. This is where the consensus part comes into play. All nodes on the network (including miners but also full and light nodes) work to validate the blocks created by miners. The obvious part of this is just making sure all the math lines up — the signatures and hashes check out, the block hash is below the target difficulty, and so on.
Nodes also have to decide which block “wins” when more than one miner finds a valid block hash for the same transactions. This part is pretty cool: nodes just always prefer longer blockchains. Whenever a node validates a block in a chain that is longer than its current view of the world, it accepts that one. The effect here is that all nodes converge on the same chain / view of the world, but it takes time for that to happen. Most bitcoin clients consider a transaction “final” when they see six “confirmations”, meaning that there are six blocks in the chain after the one including the transaction. Six is an arbitrary number but seems to work well as a conservative threshold.
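The nonce hunt from step 3 can be sketched in a few lines. This is a simplification under stated assumptions: the "header" here is just the previous hash, merkle root and nonce glued together (a real Bitcoin header has more fields), the nonce counts up from zero rather than being picked at random, and the target is set absurdly easy so the loop finishes quickly. The double SHA-256 and the duplicate-the-last-hash rule for odd merkle levels do follow Bitcoin's conventions.

```python
import hashlib

def sha256d(data):
    """Double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(txs):
    """Pairwise-hash transaction hashes until a single root remains."""
    level = [sha256d(tx) for tx in txs]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: duplicate the last hash
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def block_hash(prev_hash, root, nonce):
    return sha256d(prev_hash + root + nonce.to_bytes(8, "big"))

def mine(prev_hash, txs, target):
    """Brute force: try nonces until the hash, read as a number, is below target."""
    root = merkle_root(txs)
    nonce = 0
    while int.from_bytes(block_hash(prev_hash, root, nonce), "big") >= target:
        nonce += 1
    return nonce

def is_valid(prev_hash, txs, nonce, target):
    """What every other node does: one cheap hash to check the miner's work."""
    return int.from_bytes(block_hash(prev_hash, merkle_root(txs), nonce), "big") < target

prev_hash = bytes(32)
txs = [b"coinbase -> miner", b"alice -> bob: 1", b"bob -> carol: 2"]
target = 2**256 >> 16      # toy difficulty: about 1 in 65,536 hashes qualifies
nonce = mine(prev_hash, txs, target)
print(nonce, is_valid(prev_hash, txs, nonce, target))
```

Note the asymmetry: finding the nonce takes tens of thousands of attempts on average even at this toy difficulty, while checking it takes one.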
A side note: The value “six” here is just one of a bunch of constants and other parameters that are part of the Bitcoin system … the maximum number of coins in the system, the blockchain reward, the difficulty threshold, and so on. Whoever figured all this out up front, mostly before the algorithms were ever really live in the real world, is a freaking genius. The mystery around that genius is great drama in and of itself. Who is Satoshi?
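The longest-chain rule and the six-confirmation threshold are simple enough to sketch directly. The representation is an assumption for illustration: a chain is a list of blocks, a block is a list of transaction ids, and candidate chains are taken to be already validated. Confirmations are counted the way this post describes them (blocks after the one holding the transaction).

```python
def choose_chain(current, candidate):
    """Longest-chain rule: switch only when the candidate is strictly longer."""
    return candidate if len(candidate) > len(current) else current

def confirmations(chain, tx):
    """Number of blocks after the block containing tx, or None if tx is absent."""
    for height, block in enumerate(chain):
        if tx in block:
            return len(chain) - 1 - height
    return None

def is_final(chain, tx, required=6):
    count = confirmations(chain, tx)
    return count is not None and count >= required

chain = [["tx-a"], ["tx-b"]]
longer = chain + [["tx-c"]]
chain = choose_chain(chain, longer)      # the node adopts the longer view

print(confirmations(chain, "tx-a"))      # two blocks sit on top of tx-a
print(is_final(chain, "tx-a"))           # not yet six deep
```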
Still with me? Hard to believe. But if you are, you now have a better understanding of how Bitcoin and the other PoW blockchains work than most folks spending money on them. Thumbs up!
Part 5: Ethereum and Smart Contracts
One thing I glossed over in previous sections is the specific data in each “transaction”. At one level it’s useful to understand that this just doesn’t matter — the blockchain structure and consensus protocols are largely agnostic to the claims made in the transaction itself. But of course for any specific blockchain implementation, those transactions are the whole point of the exercise and matter a lot.
Bitcoin (BTC) was the first mainstream cryptocurrency, and its blockchain is hyper-focused on financial transactions. Each transaction just describes movement of Bitcoin value between “input” and “output” accounts. Actually that’s a little bit of a lie, Bitcoin transactions are built on some rudimentary scripting primitives — but in practice Bitcoin transactions just end up being value exchange.
The second-best-known blockchain is Ethereum (ETH). Naïve traders view it as just another cryptocurrency, but that’s super-wrong. ETH transactions certainly support value exchange just like BTC, but their real purpose is to manage smart contracts. A smart contract is a full-on piece of software that lives on the ETH blockchain with its own address that can hold funds and arbitrary state. Smart contracts can also expose an API which can be called by submitting (you guessed it) transactions on the chain. Folks sometimes refer to Ethereum as the “world’s computer.”2 ETH the currency is used to pay for processing and storage time on the chain, so theoretically there is a little more “oomph” behind its value vs. something purely abstract like BTC, maybe.
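To make "software with an address, funds and state, called via transactions" a bit more tangible, here is a toy model in plain Python. It is not how the EVM works and real contracts are usually written in a language like Solidity; the `TipJar` class, the `execute` dispatcher and the transaction fields are all hypothetical names invented for this sketch.

```python
class TipJar:
    """Toy contract: holds funds, remembers who tipped, lets the owner cash out."""

    def __init__(self, address, owner):
        self.address = address
        self.owner = owner
        self.balance = 0     # funds held by the contract
        self.tips = {}       # arbitrary contract state

    def tip(self, sender, amount):
        self.balance += amount
        self.tips[sender] = self.tips.get(sender, 0) + amount

    def withdraw(self, sender):
        if sender != self.owner:
            raise PermissionError("only the owner can withdraw")
        paid, self.balance = self.balance, 0
        return paid

def execute(contracts, tx):
    """Apply a transaction: look up the contract by address and call its API."""
    contract = contracts[tx["to"]]
    method = getattr(contract, tx["method"])
    return method(tx["from"], **tx.get("args", {}))

contracts = {"0xjar": TipJar("0xjar", owner="alice")}
execute(contracts, {"from": "bob", "to": "0xjar", "method": "tip", "args": {"amount": 5}})
paid = execute(contracts, {"from": "alice", "to": "0xjar", "method": "withdraw"})
print(paid, contracts["0xjar"].tips)
```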
Anyways, this is cool because it makes the mechanics of trustless, distributed transactions available in a bunch of new contexts. A few popular use cases you probably have already heard of:
- Fungible tokens based on the ERC-20 standard that enable arbitrary “currencies” — these could be purely abstract like airline miles, or tied to physical stuff like shares of a company or even fiat currencies like US Dollars. Of course, ties to the physical world are based entirely on contract law; ETH just provides a great platform for representing and exchanging them in a robust way. Token behavior could be implemented by any smart contract, but adherence to the ERC-20 standard means that most Ethereum wallets will be able to hold and trade the token.
- Non-fungible tokens based on the ERC-721 standard that enable collectibles and ownership of other unique assets. These differ from fungible tokens in that each one is unique — e.g., there is only one NFT representing a particular painting and only one account can own it at a time (modulo fractional ownership). Obviously the same caveats about physical objects apply, but even for digital NFTs (e.g., an NBA Top Shot video) it’s a little weird — block data is too limited to store large files directly, so NFTs typically hold a hash of the actual asset, which is stored somewhere else (“off-chain”, i.e., basically unprotected). And of course a digital asset can be bit-for-bit copied with no loss of fidelity, so …. I am generally trying to reserve opinions for later, but while this is a neat idea, it really just seems kind of silly to me.
- Decentralized autonomous organizations like the one that tried to buy a copy of the US Constitution. Member accounts (defined by ownership of a particular token or some other rule) call smart contract APIs to vote on issues, with the winning votes automatically triggering actions defined in the code of the contract. For example, members might contribute ETH to a shared giving fund and vote on which account(s) should receive donations.
- Decentralized finance (DeFi) applications like MakerDAO that provide financial services like loans and interest-bearing accounts without a central bank.
- Lots and lots of games and lotteries and other crap.
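Going back to the first bullet, a fungible token is really just a ledger with a small, standard API. The sketch below mirrors the core ERC-20 operations (total supply, balance lookup, transfer, approve, allowance, transfer-from) as a plain Python class. It is a toy model, not Solidity and not a compliant implementation; the real standard also defines events and return values that are skipped here.

```python
class Token:
    """Toy fungible token with ERC-20-style operations."""

    def __init__(self, supply, creator):
        self.total_supply = supply
        self.balances = {creator: supply}
        self.allowances = {}     # (owner, spender) -> amount the spender may move

    def balance_of(self, owner):
        return self.balances.get(owner, 0)

    def transfer(self, sender, to, amount):
        if amount < 0 or self.balance_of(sender) < amount:
            raise ValueError("insufficient balance")
        self.balances[sender] = self.balance_of(sender) - amount
        self.balances[to] = self.balance_of(to) + amount

    def approve(self, owner, spender, amount):
        self.allowances[(owner, spender)] = amount

    def allowance(self, owner, spender):
        return self.allowances.get((owner, spender), 0)

    def transfer_from(self, spender, owner, to, amount):
        if self.allowance(owner, spender) < amount:
            raise ValueError("allowance exceeded")
        self.transfer(owner, to, amount)
        self.allowances[(owner, spender)] = self.allowance(owner, spender) - amount

miles = Token(1000, creator="airline")
miles.transfer("airline", "alice", 100)
miles.approve("alice", "bob", 30)               # alice lets bob spend 30 of hers
miles.transfer_from("bob", "alice", "carol", 30)
print(miles.balances)
```

Because every wallet knows to call the same handful of operations, any contract that exposes them can be held and traded without the wallet knowing anything else about it.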
Building all of this on the smart contracts framework provides some neat benefits. For example, an NFT can be coded so that when it changes hands, a royalty is sent to the original creator. Tokens might be used as an access pass to an online or real world event. Defi currencies can be coded to automatically increase or decrease supply to reduce volatility. DAOs might include a poison pill that liquidates the organization’s assets if the market reaches a certain level. And because all of the smart contract code is visible on and verified by the blockchain, there is an unprecedented level of transparency as to what is going on.
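As one example, the royalty-on-resale idea might look something like this. It is a toy model in Python with invented names (`RoyaltyNFT`, `sell`) and an in-memory `balances` dict standing in for on-chain accounts; ERC-721 itself does not mandate royalties, so treat this as a sketch of the concept rather than any real contract.

```python
import hashlib

class RoyaltyNFT:
    """Toy non-fungible token that pays its creator a cut of every sale."""

    def __init__(self, asset_bytes, creator, royalty_percent):
        # Only a hash of the asset lives "on chain"; the asset is stored elsewhere.
        self.asset_hash = hashlib.sha256(asset_bytes).hexdigest()
        self.creator = creator
        self.owner = creator
        self.royalty_percent = royalty_percent

    def sell(self, buyer, price, balances):
        if balances.get(buyer, 0) < price:
            raise ValueError("buyer cannot afford this")
        royalty = price * self.royalty_percent // 100
        balances[buyer] = balances.get(buyer, 0) - price
        balances[self.creator] = balances.get(self.creator, 0) + royalty
        balances[self.owner] = balances.get(self.owner, 0) + price - royalty
        self.owner = buyer       # exactly one owner at a time

balances = {"bob": 1000, "carol": 1000}
nft = RoyaltyNFT(b"pretend this is a video", creator="artist", royalty_percent=10)
nft.sell("bob", 200, balances)      # first sale: the artist is also the seller
nft.sell("carol", 500, balances)    # resale: artist gets 10%, bob gets the rest
print(nft.owner, balances)
```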
On the flip side, since smart contracts are just code, they can have bugs, and those bugs can be disastrous. It’s also very cumbersome to include “off-chain” information into smart contract algorithms … for example, a DAO might want to automatically send money to relief organizations when natural disasters occur — knowing that a disaster has occurred in a trustable, automated way is a challenge that lives outside of what Ethereum can easily do today.
In a follow-up post, I’ll walk through the concrete stuff required to deploy and run a smart contract in some detail — but this screed is long and boring enough already.
Part 6: Why are these things valuable?
The question rational people always ask about cryptocurrencies (after “WTF?”) is “why are they worth anything?” Honestly, this is a philosophical question more than anything else, and if there is anyone on the planet less well-equipped to plumb the depths of philosophy than this guy, well, I haven’t met them yet. But I think at least for me, the best answer is just:
BTC and ETH and other cryptocurrencies are valuable because people think they are.
That’s really it. Enough people believe enough in BTC and ETH that they are willing to accept them in exchange for other things they believe to have value, like US Dollars or Euros. This really isn’t so different from the reason that “everyone” attributes value to the US Dollar. And as Hamilton reminds us, even that wasn’t always the case:
Local merchants deny us equipment, assistance
They only take British money, so sing a song of sixpence
This conversation always seems to devolve into people shrinking into the fetal position and questioning the nature of reality, so I’m just going to walk away now. Decide for yourself.
Part 7: What could go wrong?
Much more interesting to me is the technical viability of blockchain-based systems. The more I learn, the more I get convinced that they are fatally flawed and likely to blow up. I’m not smart enough to guess when, but probably pretty soon. Which is not to say that blockchains are inherently bad, or that the core value proposition they offer will not survive and change the world. Just that the current implementations — and all the value wrapped up within them — are probably not up to the task.
It feels very, very much like the dot-com days to me. Some amazing, world-changing things came out of that time, but unfortunately “dot-com” became a religion, and like all religions, grew increasingly resistant to rational criticism. With real money involved, that’s some risky business. Blockchain technical issues feel the same way to me right now. Or more precisely, one specific technical issue. But before I go there, let’s look at a couple of the other technical challenges that I think are probably fine.
High on the list of blockchain objections is energy use. Proof of work is by design an incredibly redundant protocol: thousands of nodes are all basically doing exactly the same work as fast as they can, most of which is thrown away. There are all kinds of statistics on this and how ridiculous it has become; my personal favorite is that the energy use by Bitcoin miners is roughly equivalent to my home state of Washington. Does it feel a few degrees hotter to you?
The good news is, this is pretty fixable and already being demonstrated by other chains. One alternative to proof of work is proof of stake. This model does away with miners. Instead, nodes participate in block creation by putting a “stake” into escrow on the chain. The opportunity to mine each new block is granted to one node, chosen pseudo-randomly and typically weighted by the relative size of each node’s stake. That node and only that node creates the block and receives the reward, which is then validated using the normal mechanisms. Nodes that propose invalid blocks are penalized by loss of their stake.
Because it eliminates virtually all of the redundant and expensive computation, PoS is dramatically more energy-efficient than PoW. It is also considered to be helpful in maintaining decentralization of the network, which is a little counter-intuitive because nodes basically are buying influence. But maximum weighting can be easily capped to avoid this, and the approach negates any advantage from funding huge mining data centers as happens today.
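A rough sketch of stake-weighted selection with an optional weighting cap follows. The function names and the choice to seed the draw from the previous block hash are assumptions made for illustration; production chains use far more careful randomness and penalty rules than this.

```python
import hashlib
import random

def pick_proposer(stakes, prev_block_hash, cap=None):
    """Pseudo-randomly pick one node to create the next block, weighted by stake."""
    validators = sorted(v for v, s in stakes.items() if s > 0)
    weights = [stakes[v] if cap is None else min(stakes[v], cap) for v in validators]
    # Seeding from the previous block hash lets every node compute the same answer.
    seed = int.from_bytes(hashlib.sha256(prev_block_hash).digest(), "big")
    return random.Random(seed).choices(validators, weights=weights, k=1)[0]

def slash(stakes, validator):
    """Penalty for proposing an invalid block: the stake is forfeited."""
    stakes[validator] = 0

stakes = {"whale": 90, "minnow": 10}
slots = [i.to_bytes(4, "big") for i in range(2000)]   # stand-ins for block hashes

uncapped = [pick_proposer(stakes, s) for s in slots]
capped = [pick_proposer(stakes, s, cap=10) for s in slots]
print(uncapped.count("whale"), capped.count("whale"))
```

Without a cap the whale should win roughly nine slots in ten; with the cap set at the minnow's stake the split comes out close to even.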
Ethereum has been planning to move to PoS for some time now, but it keeps slipping — currently planned for sometime in 2022. Vitalik, I feel your pain, man.
Next up is transaction throughput. Bitcoin can handle about four or five transactions every second, and individual transactions take around an hour to confirm. Ethereum is faster but still in the range of 20 TPS with maybe eight minute confirmations. These numbers are frankly hilarious compared to centralized processors like Visa that have TPS rates in the thousands, and confirm credit card purchases reliably in seconds.
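The Bitcoin numbers fall out of some back-of-the-envelope arithmetic. The ten-minute block interval is Bitcoin's design target; the block size and especially the average transaction size below are round-number assumptions for illustration, not measurements.

```python
BLOCK_INTERVAL_MINUTES = 10       # Bitcoin's target spacing between blocks
CONFIRMATIONS = 6                 # the "final" threshold discussed earlier
BLOCK_SIZE_BYTES = 1_000_000      # assumption: roughly 1 MB of transactions per block
AVG_TX_BYTES = 400                # assumption: rough average transaction size

minutes_to_final = CONFIRMATIONS * BLOCK_INTERVAL_MINUTES
tx_per_block = BLOCK_SIZE_BYTES // AVG_TX_BYTES
tps = tx_per_block / (BLOCK_INTERVAL_MINUTES * 60)

print(minutes_to_final, "minutes to six confirmations")
print(tx_per_block, "transactions per block")
print(round(tps, 1), "transactions per second")
```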
Folks have played around the edges of this by tweaking parameters like the number of transactions that can go into a single block. Bitcoin Cash is one of these and sees rates of about 300 TPS. Layer 2 chains are another approach, basically accepting transactions quickly in separate infrastructure and bundling them up into fewer meta-transactions on the real chain. There are security/speed tradeoffs here that will take some experimentation to really get right.
More revolutionarily, Solana has demonstrated real progress by mixing up proof of stake with (sorry, more jargon) proof of history which uses a distributed, synchronized clock to reduce the back and forth typically required to keep everything in an agreed-upon order.
All up, the problems here seem eminently solvable if not already solved. I’m not worried.
Part 8: Infinite Growth
Finally we come to the one bit I just can’t get over — I’d love to hear a non-religious explanation of why I’m wrong, but:
Blockchains by definition grow forever.
At its core, a chain works because it stores a record of everything that has ever happened ever. When I stand up an Ethereum “full” node, the first thing it does is start downloading the entirety of that history to my local machine. Ethereum’s chain grew from about 480GB in September 2020 to 952GB a year later, and recently crossed beyond a terabyte. Ethereum is only six years old, and those early years were pretty quiet! That is roughly a doubling every year; if you simply play that rate forward, it reaches half a petabyte by the end of this decade alone, and all the work underway to accelerate throughput means it ain’t going to slow down. Bitcoin is better off, with a linear growth that hits a couple of terabytes by the end of the decade, but that’s just a result of its low TPS rate. I’m not sure that “we survive by doing less” is a winning strategy.
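A quick sanity check on the Ethereum numbers, projecting the two data points quoted above in two different ways. The nine-year horizon (September 2021 to roughly the end of 2030) is my assumption, and two data points are obviously a thin basis for any forecast. Of the two paths, only the "keep multiplying by the same yearly factor" one lands near the half-petabyte figure.

```python
PREVIOUS_GB = 480      # September 2020
LATEST_GB = 952        # September 2021
YEARS_AHEAD = 9        # assumption: roughly through the end of 2030

# Path 1: keep adding the same number of gigabytes every year.
yearly_increase = LATEST_GB - PREVIOUS_GB
linear_gb = LATEST_GB + YEARS_AHEAD * yearly_increase

# Path 2: keep multiplying by the same factor every year (about 2x).
yearly_factor = LATEST_GB / PREVIOUS_GB
compounding_gb = LATEST_GB * yearly_factor ** YEARS_AHEAD

print(f"additive:    {linear_gb / 1000:.1f} TB")
print(f"compounding: {compounding_gb / 1000:.0f} TB")
```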
Both chains have the concept of a “light” node that only downloads block headers rather than full transaction histories. Light nodes verify the structure of the chain itself, but trust that the transactions inside are OK. This is a great alternative that allows low-powered devices and wallets to participate, but isn’t sufficient on its own to keep the network running.
Various data compression techniques have been proposed or are in use — this may delay the time of death, but doesn’t address the fundamental problem.
The closest thing I’ve seen to a solution is Ethereum’s proposed “sharding” solution, in which multiple small chains operate independently and periodically roll their transactions up (layer 2 style) to a master (“beacon”) chain. This presumes that most transactions occur locally within a single shard — cross-shard transactions would be painfully slow and expensive, so they need to be rare. There are a bunch of new security and integrity issues here too. Maybe it’ll work, but it’s always “just around the corner” and I have yet to see much of a concrete proposal, and it still doesn’t fundamentally solve the growth problem — smells like religion.
At the end of the day, things that grow infinitely are infinitely bad. Sure disks keep getting bigger, but not nearly as quickly as the chains are. At best this means that soon (like, SOON) only very well-capitalized entities will be able to afford the hardware required to run nodes, in which case we’re right back to the centralized system and power structure we started with. More likely, one day investors will finally realize that the end is near, and a lot of folks will lose a lot of money.
Unfortunately, facing this question has become heresy to true believers. You don’t “get it” or “see the big picture.” They talk about solutions that don’t exist as if they did. I’ve been at this movie before, and it was called dot-com. No thanks.
So where does this leave us?
Blockchains and the systems built atop them are super-cool. More than that, they are addressing real problems in novel and revolutionary ways. But right now there is so much quick money to be made that most folks are ignoring the existential question that arises from their infinite (and accelerating) growth. I am absolutely ready to look at data that shows we can keep increasing adoption of the amazing features of these new systems without falling off of a cliff — but I’ve looked and looked hard and haven’t found it yet. Until I do, I’ll remain a grumpy old guy yelling at DApp developers to get off my lawn.
But that doesn’t mean I won’t keep learning and experimenting — there is just too much amazing stuff going on to ignore. And I’ll keep listening to T because he may well see the magic before I do. Next time we’ll write and deploy a smart contract.
1 Omer points out on LinkedIn that it’s not really the same transactions. Each miner is constructing a block from some subset of all pending transactions in the system (plus a unique “coinbase” transaction representing their hoped-for reward). So while there is great redundancy/rework in the system, it’s not true that every miner is literally racing to do exactly the same thing.
2 Also from Omer is a note that the EVM (Ethereum Virtual Machine) that runs smart contracts is Turing complete. This means that it’s a for-real general-purpose computing engine, unlike Bitcoin’s simple script. Managing this additional complexity (and of course power) is the purpose behind the “gas fees” that are part of each execution. Super cool stuff worthy of its own post!