Putting the smart in smart contracts since this week

In my first “crypto” post I covered a bunch of stuff, mostly in the abstract. All well and good — tons of new concepts to get a handle on — but it ain’t real until it’s running code. So that’s the task for today: build, deploy and run an Ethereum smart contract that does something at least marginally interesting. Things are going to get pretty wonky and probably a little boring if you don’t love code, but everyone is welcome to come along — embrace your nerd!

Round Robin Lotto

We’re going to build a contract I called “Round Robin Lotto” — the full source (there ain’t much) is on github in rrlotto.sol. RRLotto is a game that works like this:

  1. The system initializes a pseudo-random counter between 2 and 25.
  2. Accounts play the lotto by executing the “play” method of the contract and sending along .001 ether ($3.89 on mainnet as I write this).
  3. Each play decrements the counter by 1. When the counter hits zero, three things happen:
    1. The current player receives 95% of all ether in the contract account.
    2. The remaining 5% is sent to the “house” (the account that deployed the contract).
    3. The counter is reset to a new pseudo-random value between 2 and 25 (starting a new round).

The effect is more or less that of a 95% payout slot machine (albeit one with only a single prize). The jackpot will range from 0.0019 ether ($7.39 today) to 0.02375 ether ($92.43) based on the initial counter value for the round. Alert the IRS.
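To sanity-check that jackpot range, here is a small JavaScript sketch of the arithmetic. It assumes the contract balance holds only the plays from the current round, and the function name jackpotWei is made up for illustration.

```javascript
// Amounts are in wei (1 ether = 10^18 wei), using BigInt to mimic integer math.
const WEI_TO_PLAY = 10n ** 15n; // 0.001 ether
const HOUSE_PERCENTAGE = 5n;

// Jackpot paid to the winner when a round of `plays` plays ends.
function jackpotWei(plays) {
  const pot = BigInt(plays) * WEI_TO_PLAY;
  const houseCut = (pot * HOUSE_PERCENTAGE) / 100n;
  return pot - houseCut;
}

console.log(jackpotWei(2));  // 1900000000000000n  (0.0019 ether)
console.log(jackpotWei(25)); // 23750000000000000n (0.02375 ether)
```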

Running a lotto on Ethereum is interesting because (1) there is no true randomness on the blockchain and (2) all code and state data is public. If plays are infrequent, it would be easy for a sneaky actor to write code that waits until the counter is 1 and then plays immediately, winning every time. The trick is to set the maximum counter value roughly equal to the number of active players, making it very difficult to manipulate the order in which plays are processed / mined.

A better solution might be to run the pseudo-random generator on every play, and just use it to pay out with the desired frequency. The problem here is that our “pseudo-random” number is actually completely deterministic based on the current block difficulty and timestamp. Since miners set the timestamp, it’d be easy for them to pick timestamps that result in payouts to known accounts. Of course, this would be a ton of work spent to abuse our piddly little lottery — but it does highlight some of the unique quirks of writing for Ethereum.
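To make the timestamp problem concrete, here is a rough JavaScript sketch of how someone who controls the timestamp could “grind” for a result they like. Big assumption: SHA-256 from node's crypto module stands in for keccak256 and the inputs are joined as a string, so the numbers won't match what the EVM would compute. The function names are hypothetical.

```javascript
const { createHash } = require("crypto");

const MAX_CYCLE = 25n;

// Stand-in for the hash-then-mod step. ASSUMPTION: SHA-256 replaces keccak256
// and the inputs are string-joined, so values differ from on-chain results.
function pseudoRandom(difficulty, timestamp) {
  const digest = createHash("sha256")
    .update(`${difficulty}:${timestamp}`)
    .digest("hex");
  return (BigInt("0x" + digest) % (MAX_CYCLE - 1n)) + 2n;
}

// Try successive timestamps until the "random" value is the one we want.
function grindTimestamp(difficulty, startTimestamp, wanted, maxTries = 1000) {
  for (let i = 0; i < maxTries; i++) {
    const candidate = startTimestamp + i;
    if (pseudoRandom(difficulty, candidate) === wanted) return candidate;
  }
  return null; // nothing suitable within maxTries
}

const found = grindTimestamp(12345n, 1638000000, 2n);
console.log("timestamp giving 2:", found);
```

With 24 possible outcomes, a match usually turns up within a few dozen candidates, which is the whole problem.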

And I guess you could also say that we don’t care too much anyways, since in all cases the house walks away with its 5%!

Writing for the EVM

Ethereum smart contracts run within a purpose-built virtual machine environment (the EVM). The “assembly” language of the EVM is a set of opcodes that look more or less as you’d expect. Nobody uses these to actually write code; there are high-level languages for that. Of these, the dominant one is Solidity, which looks a lot like C++ or Java; that’s what we’ll be using.

The primary construct in the EVM is the “contract”. Contracts as expressed in Solidity are just objects: they have constructors, methods and members, support multiple inheritance, etc. What is quite unique about these contracts is their lifecycle on the blockchain:

  1. The Ethereum blockchain maintains the state of a single huge distributed EVM. All contract instances exist within this shared memory space.
  2. Contracts are instantiated by deploying them to the blockchain. The contract constructor is called during deployment to set up initial state. Really important: if you deploy the contract 3 times, you have created 3 distinct “objects” in the EVM — each with its own contract address and distinct internal state (in our case, running 3 completely independent lottos).
  3. Contract methods are called by sending a transaction to a contract address (just like dereferencing an object pointer). Contracts can use these “pointers” to call each other as well.
    1. All code execution happens within the context of a method call transaction. There are no background threads or scheduled events in the EVM.
    2. There is no parallel execution in the EVM. All code runs in one big single thread.
  4. Contracts can be destroyed when they are no longer needed. This doesn’t remove any of their state or transaction history of course, but it does free up some memory in the EVM and ensures that their methods can no longer be called.
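
If the object analogy helps, the lifecycle above can be modeled with a toy in JavaScript. This is only a mental model: real contract addresses are derived from the deployer's account, real state lives on the blockchain, and ToyChain and Counter are names invented for this sketch.

```javascript
// Toy model only: not how a node stores or addresses contracts.
class ToyChain {
  constructor() {
    this.contracts = new Map();
    this.nextId = 1;
  }
  // "Deploying" runs the constructor and returns a fresh (fake) address.
  deploy(ContractClass, sender) {
    const address = "0xcontract" + this.nextId++;
    this.contracts.set(address, new ContractClass(sender));
    return address;
  }
  // A method call is a transaction sent to a contract address.
  call(address, method, ...args) {
    const instance = this.contracts.get(address);
    if (!instance) throw new Error("no contract at " + address);
    return instance[method](...args);
  }
  // Destroyed contracts can no longer be called.
  destroy(address) {
    this.contracts.delete(address);
  }
}

class Counter {
  constructor(owner) { this.owner = owner; this.count = 0; }
  increment() { return ++this.count; }
}

const chain = new ToyChain();
const a = chain.deploy(Counter, "0xalice");
const b = chain.deploy(Counter, "0xalice");
chain.call(a, "increment");
chain.call(a, "increment");
console.log(chain.call(b, "increment")); // 1 (b has its own state)
```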

It’s worth reiterating that each node holds every contract and its state in a single EVM instance. All of them run all of the contract-related transactions deployed to the blockchain (miners when creating blocks; validators as part of validating the resulting block states). This can be a little hard to wrap your mind around — the “world computer” is really a crap ton of copies of the same computer. This leads to more interesting quirks of the environment that we’ll see as we keep digging in.

Our contract in Solidity

There are a ton of solid “hello world” tutorials for Solidity; I’m not going to try to replicate that here. Instead, I’ll just walk through the bits and bobs of our contract so you can see how it all fits together. Maybe that ends up being the same thing? We’ll see. Remember that this code is on github; you may find it easier to load that up and see the fragments in context.

pragma solidity ^0.8.10;

While the EVM opcodes are pretty stable, the Solidity compiler and language are still moving relatively quickly. The pragma here with the caret says “I need to be compiled using at least version 0.8.10 but not 0.9.0 or higher.” This is kind of annoying, because probably 0.9.0 will be fine. At the same time, these contracts move real money and so I can see the benefit of being conservative.

contract RoundRobinLotto

This is the name of our contract. Solidity doesn’t care about file names matching this, and you can put multiple contracts into one file — I appreciate the lack of judgment here. This is where you specify inheritance using the “is” keyword (e.g., contract MyNewContract is SomeOtherContract).

address house;
uint countdown;

These are our member variables. house is an “address”; a built-in type that includes methods for working with its balance. We use it to remember the EOA account that deployed the contract and therefore receives the 5% commission. countdown is the pseudo-random pool size we talked about. “uint” is an unsigned 256-bit integer — I could have saved some gas here by using a smaller size (e.g., uint8) and you can also drop the “u” for a signed int (e.g., int32).

uint constant MAX_CYCLE = 25;
uint constant WEI_TO_PLAY = 0.001 ether;
uint constant HOUSE_PERCENTAGE = 5;

Constant variables are just language sugar to make code more readable and maintainable. Note the literal value “0.001 ether” — “ether” there is a keyword that automatically converts its value from ether to “wei”, which is the unit denomination of Ethereum transactions.
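
For a feel of what that conversion does, here is a minimal JavaScript sketch of ether-to-wei using BigInt (1 ether is 10^18 wei). The helper name etherToWei is hypothetical; libraries like ethers ship their own versions.

```javascript
// Convert a decimal ether string to wei without touching floating point.
function etherToWei(ether) {
  const [whole, frac = ""] = ether.split(".");
  if (frac.length > 18) throw new Error("too many decimal places");
  return BigInt(whole || "0") * 10n ** 18n + BigInt(frac.padEnd(18, "0"));
}

console.log(etherToWei("0.001")); // 1000000000000000n
```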

constructor() {
    house = msg.sender;
    resetCountdown();
}

Our simple, parameter-less constructor just sets member variables to their initial state. msg is one of a few global variables and functions that supply context or utilities; msg.sender holds the address of the immediate caller of the current function (in this case, the account that initiated the contract deployment). The countdown member is initialized using the private function described next.

function resetCountdown() private {
    countdown = (uint(keccak256(abi.encodePacked(block.difficulty, block.timestamp))) % (MAX_CYCLE - 1)) + 2;
}

As noted earlier, the EVM doesn’t support true randomness. This makes sense when you think about the block creation protocol. If validation is going to succeed, every node must end up with exactly the same end state. For our purposes, “pseudo-random” works fine, and our private resetCountdown method takes an approach that’s pretty common in the Ethereum world — take some values that are deterministic but not easily predictable (the current network block difficulty and the current block timestamp), compute their hash and cast it to a 256-bit number, then use mod to reduce the result into the desired range. The Keccak256 hash computation is another one of those globally-available functions.
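
The mod-and-shift at the end is easy to get off by one, so here is a tiny JavaScript check of just that step. It takes the hash as an already-computed number, and toCountdown is a name made up for this sketch.

```javascript
const MAX_CYCLE = 25n;

// Map any unsigned 256-bit value into the range 2..MAX_CYCLE.
function toCountdown(hashValue) {
  return (hashValue % (MAX_CYCLE - 1n)) + 2n;
}

console.log(toCountdown(0n));  // 2n  (smallest possible countdown)
console.log(toCountdown(23n)); // 25n (largest possible countdown)
console.log(toCountdown(24n)); // 2n  (wraps around)
```

Modding by MAX_CYCLE - 1 gives 24 possible remainders (0 through 23), and adding 2 shifts them to 2 through 25.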

event Payout(address indexed to, uint amount);

This line defines an “event” that our code can emit as a notification when something notable occurs during execution. Events are stored within the transaction log (i.e., on the blockchain), and can be received by off-chain applications that subscribe using methods of a node’s JSON-RPC interface (we’ll talk a lot about the JSON-RPC interface in a bit). Since the return value of a method called by a transaction is inaccessible to off-chain code, events are really the only way for a transaction to send data back to the outside world.

function play() public payable {
    require(msg.value == WEI_TO_PLAY);

This is the first part of the method called by lotto players. It is marked public so that it can be called by external accounts, and payable so that it can receive ETH. “require” is a global function useful for enforcing conditions — in our case, verifying that the .001 ETH cost to play was sent along with the transaction.  

if (--countdown > 0) {
    return;
}
resetCountdown();
payable(house).transfer(address(this).balance * HOUSE_PERCENTAGE / 100);
uint payout = address(this).balance;
payable(msg.sender).transfer(payout);
emit Payout(msg.sender, payout);

This second part of the play method is where the most interesting stuff happens. The first three lines just exit quickly when the countdown value remains greater than zero. Following that, we reset the counter, pay the house and the msg.sender, and emit our “Payout” event so that listeners can react if desired (e.g., by popping up a congratulations dialog box in the browser).

The “payable” method casts variables of type “address” into a form that can receive ETH. The “transfer” method atomically transfers value between accounts. Either of these may cause an exception, in which case the transaction will be reverted and all value/state will be reset.
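
To see the whole play flow in one place, here is an off-chain JavaScript model of it. It is a sketch, not the contract: the next countdown is passed in by hand (there is no block data to hash here), balances are tracked in a plain Map, and LottoModel is a hypothetical name.

```javascript
const WEI_TO_PLAY = 10n ** 15n; // 0.001 ether
const HOUSE_PERCENTAGE = 5n;

class LottoModel {
  constructor(house, firstCountdown) {
    this.house = house;
    this.countdown = firstCountdown; // the real contract derives this from a hash
    this.balance = 0n;
    this.paid = new Map(); // address -> total wei received
  }
  pay(to, amount) {
    this.balance -= amount;
    this.paid.set(to, (this.paid.get(to) || 0n) + amount);
  }
  // nextCountdown is supplied by the caller in this model.
  play(sender, value, nextCountdown) {
    if (value !== WEI_TO_PLAY) throw new Error("wrong stake"); // require(...)
    this.balance += value;
    if (--this.countdown > 0n) return null; // no winner yet
    this.countdown = nextCountdown;
    this.pay(this.house, (this.balance * HOUSE_PERCENTAGE) / 100n);
    const payout = this.balance; // whatever is left goes to the winner
    this.pay(sender, payout);
    return payout; // stands in for the Payout event
  }
}

const lotto = new LottoModel("house", 3n);
lotto.play("alice", WEI_TO_PLAY, 5n);
lotto.play("bob", WEI_TO_PLAY, 5n);
console.log(lotto.play("carol", WEI_TO_PLAY, 5n)); // 2850000000000000n
```

Note that the house cut is taken first and the winner gets the remaining balance, so the contract is left empty after every payout.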

function destroy() houseOnly public {
    selfdestruct(payable(house));
}

This method calls the built-in method “selfdestruct” to destroy the contract, sending any remaining ETH balance to the house. The methods of destroyed contracts cannot be called, and the contract’s state is removed from the EVM. Of course all transaction and state history remains as part of the blockchain.

This function is marked public, but also with the nonstandard modifier “houseOnly”, described next:

modifier houseOnly {
    require(msg.sender == house);
    _;
}

Modifiers are commonly used like this to enforce prerequisites in a readable and reusable way. The method code is “wrapped” with the modifier code; the “_” marker indicates where in this process the method code should be inserted (so both pre- and post-method code can be written).
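
If you think in JavaScript, a modifier is roughly a higher-order function that wraps the method. Here is a sketch of that analogy; the msg object and the contract literal are stand-ins invented for illustration, not anything Solidity generates.

```javascript
// Wrap a method so the check runs first; calling fn plays the role of "_".
function houseOnly(fn) {
  return function (msg, ...args) {
    if (msg.sender !== this.house) throw new Error("house only"); // require(...)
    return fn.call(this, msg, ...args); // _;
  };
}

const contract = {
  house: "0xhouse",
  destroyed: false,
  destroy: houseOnly(function () { this.destroyed = true; }),
};

contract.destroy({ sender: "0xhouse" });
console.log(contract.destroyed); // true
```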

Compiling the contract

Before we can deploy our contract, we need to compile it into EVM opcodes. I love love love that the Solidity compiler is a single executable. You can install it with your package manager or whatever, but you can also just download one file and be good to go. Maybe it doesn’t take much to make me happy, but in a minute we’ll be using nodejs and it’s just the freaking worst by comparison. Anyways, go here and install solc for your system: https://docs.soliditylang.org/en/v0.8.10/installing-solidity.html.

Once that’s done, the compile is even simpler: solc --bin rrlotto.sol. Assuming you don’t hit any build errors, you’ll get a big binary string representing the compiled contract. Super cool!

Getting ready to deploy

The next step is to deploy the binary contract somewhere so that we can run it. This is the part in most Solidity tutorials where they tell you to use Remix, which is really very cool, but has a ton of under-the-covers magic built in. To help us really understand what is happening where, let’s take a closer-to-the-metal approach. First a few prerequisites — hang in there, it’ll be worth it!

Get some test ether

1 ETH on the actual Ethereum Mainnet goes for about $4,000 USD as of this writing, so we’ll be steering clear of that world. Instead we’re going to deploy on the Ropsten Test Network, where the ETH is free and the living is easy. The first step is to get some of that sweet fake ETH into an account under your control. There are tons of ways to do this; this was my approach:

  1. Add MetaMask to your browser. Set up your wallet, choose the Ropsten network and copy your account number. Important: While you can use the same MetaMask account on both Testnets and Mainnets, I don’t recommend it. I use a different wallet for “real stuff” and my MetaMask account only for testing and development.
  2. Visit a “faucet” site and request some free Ropsten ETH. I like this one; it drops 5 ETH per request which is way more than you’ll need for this exercise. It can take an hour or two for your request to float to the top of their queue; be patient!
  3. In MetaMask, under the “three dot” menu choose Account details / Export Private Key. Enter your password and copy the key; you’ll need this and your account number later.

Get access to a Ropsten node

Our code will work with any node. Running one yourself is semi-complex and deserves an article in its own right, so I suggest that you skip that for now and sign up for a free developer account at https://infura.io/. Once you’re signed in there, create a “project” to get direct access to their JSON-RPC endpoints. Whichever node you use, make sure you’re talking to the Ropsten network there as well.

Add some tools to your environment

We’re going to deploy and test our contract by interacting directly with the Ethereum JSON-RPC interface. I’ve used a few different tools to wrap this up in a set of bash scripts which require access to the following:

  • A bash environment (native on Linux or the Mac, WSL on Windows).
  • curl for making HTTP requests, installed with your package manager or from https://curl.se/download.html.
  • jq for working with JSON, installed from https://stedolan.github.io/jq/download/ (BTW jq is awesome and should be in your toolchest anyways)
  • nodejs and npm installed with your package manager or from https://nodejs.org/en/. Hopefully your installation is less wonky than mine.
  • web3.js, installed once you have node with some variant of “npm install -g web3”. I know, global is bad, blah blah blah.
  • The Ethereum bash scripts themselves from the shutdownhook github; clone the repo or just download the files.

The node and web3.js stuff is needed to support cryptographic signatures — getting signatures right is a finicky business and beyond what I wanted to attempt by hand in bash. Other than this, everything we do will be pretty straightforward and obvious.

Setup environment variables

Our deployment and test scripts rely on three environment variables. You can set these by hand at runtime, or add them to ~/.bashrc so that they’re always available. Note if we were using account details with value on a real production blockchain, I’d be recommending much tighter control over your account’s private key. With great power comes great responsibility.

export ETH_ACCOUNT=0x1111111111111111111111111111111111111111
export ETH_PK=0x2222222222222222222222222222222222222222222222222222222222222222
export ETH_ENDPOINT=https://ropsten.infura.io/v3/00000000000000000000000000000000

The endpoint example above assumes you’re using the https://infura.io/ nodes; the zeros will be replaced by your project identifier shown on their dashboard. ETH_ACCOUNT and ETH_PK are as copied out of MetaMask or your chosen test ETH wallet.

The Ethereum JSON-RPC interface

Most of the blockchain stuff you read about is what happens “on-chain”: how blocks are assembled and mined, how transactions move value around, the EVM operations we detailed earlier, etc. But none of that happens in a vacuum; it’s all triggered by real-world (“off-chain”) actions using an external API that bridges the two worlds. For Ethereum, this is the JSON-RPC interface exposed by every node on the network.

Nodes expose JSON-RPC over HTTP or WebSockets, typically tied to localhost to prevent unwanted access. “Unwanted” is the operative word here, because for the most part security over the interface isn’t an issue. Transactions are signed before being sent to the node, so private keys never exist on-chain. And all data in the chain is public by definition, so what is there really to protect? Three things to consider: (1) DoS or other attacks at the network level could impact your node’s performance; (2) accepting transactions does use some network and compute resource that you’re presumably paying for; (3) many node implementations DO allow you to configure private keys locally so that you can use functions like eth_sendTransaction for specific accounts without signing on the client side. This last is the source of much confusion and, while I get the convenience factor, it just seems like a bad idea.

HTTP requests to the JSON-RPC interface consist of a POST with a JSON body that identifies the method to call and parameters to send. For example, the following curl command will fetch the current network “gas price”:

$ curl -s -X POST -H "Content-Type: application/json" --data '{"jsonrpc":"2.0","method":"eth_gasPrice","params":[],"id":1}' $ETH_ENDPOINT
{"jsonrpc":"2.0","id":1,"result":"0x5968313b"}

The format of the “result” field depends on the method called; in the case of eth_gasPrice the return value is the current price of gas in wei, expressed as a hexadecimal number. This request is packaged up in the eth-gasprice script with a slightly more useful output format:

$ ./eth-gasprice
WEI:  1500006255
GWEI: 1.500006255
ETH:  .000000001
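
For comparison, here is roughly what that request/response handling looks like in JavaScript. This is a sketch and not the eth-gasprice script itself; rpcBody and formatUnits are hypothetical helper names, and the response is the sample from the curl call above rather than a live fetch.

```javascript
// Build the JSON body for a JSON-RPC 2.0 call.
function rpcBody(method, params = [], id = 1) {
  return JSON.stringify({ jsonrpc: "2.0", method, params, id });
}

// Format a wei amount in a larger unit with `decimals` decimal places.
function formatUnits(wei, decimals) {
  const base = 10n ** BigInt(decimals);
  const frac = (wei % base).toString().padStart(decimals, "0").replace(/0+$/, "");
  return frac ? `${wei / base}.${frac}` : `${wei / base}`;
}

const response = JSON.parse('{"jsonrpc":"2.0","id":1,"result":"0x5968313b"}');
const wei = BigInt(response.result); // hex string -> integer

console.log(rpcBody("eth_gasPrice"));
console.log("WEI: ", wei.toString());      // 1500000571
console.log("GWEI:", formatUnits(wei, 9)); // 1.500000571
```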

Methods like eth_gasPrice are easy because the data package doesn’t need to be signed. In similar fashion, eth-nonce will return the transaction count for your account (probably zero at this point) and eth-version will just return some info about the software running on your node.

Submitting Transactions via JSON-RPC

Transactions are a little more complicated, in two ways. First, the data package needs to be signed, which we accomplish with the eth-signtx script. This script is a bit of a cheat; we use nodejs to load up the web3.js library and just call its internal method rather than doing it ourselves. Before you give me too hard of a time here, go poke around and you try to make it work in bash alone. 😉 This is the forever story of crypto development: the math to compute hashes and signatures is complex but really not a big deal, but the “setup” to get all of the input bytes in exactly the right format is always finicky black magic. A single bit in the wrong place renders your output useless to the rest of the network. So except in some really simple or really ubiquitous situations, better to just rely on an existing implementation.

The second issue comes from the asynchronous nature of transaction execution. A successful transaction submission returns its “transaction hash”, a handle that you can use to query its status. It can take anywhere from a few seconds to a few minutes for a miner to actually pick up the transaction and get it into a block, and even longer for that block to get enough “confirmations” to be confident that it’s golden.

The eth-sendwei script shows how this works for a simple transaction that just sends ether from one account to another (no smart contracts involved). There’s no law against sending your own ether to yourself, so you can try it out like this:

$ ./eth-sendwei $ETH_ACCOUNT 1000000000000000
Transaction Hash is: 0x84015099d0c6cf6edbe0902257d7b95b51fa47f296b46ad2a8c6f83a470fdf2b
waiting...
waiting...
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "blockHash": "0xfc47ee3dd2b233996309cfb7b38dac02b793d1117a9731041acbe0efe7a19846",
    "blockNumber": "0xb186a8",
    "contractAddress": null,
    "cumulativeGasUsed": "0x1593e7",
    "effectiveGasPrice": "0x5b3a1690",
    "from": "0x5de0613c745f856e1b1a4db1c635395aabed82c8",
    "gasUsed": "0x5208",
    "logs": [],
    "logsBloom": "0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
    "status": "0x1",
    "to": "0x5de0613c745f856e1b1a4db1c635395aabed82c8",
    "transactionHash": "0x84015099d0c6cf6edbe0902257d7b95b51fa47f296b46ad2a8c6f83a470fdf2b",
    "transactionIndex": "0xd",
    "type": "0x0"
  }
}

The script returns as soon as the block has been mined. “Confirmations” is a term defined as the count of blocks mined since the one your transaction is in; this is captured in the eth-confirmations script (the number will continue to grow as more blocks are mined):

$ ./eth-confirmations 0x84015099d0c6cf6edbe0902257d7b95b51fa47f296b46ad2a8c6f83a470fdf2b
2

$ ./eth-confirmations 0x84015099d0c6cf6edbe0902257d7b95b51fa47f296b46ad2a8c6f83a470fdf2b
6
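
The calculation behind that number is just a subtraction. Here is a JavaScript sketch using the definition above; the “latest block” value is made up for the example, and in practice it would come from the eth_blockNumber method.

```javascript
// Confirmations = blocks mined since the transaction's block.
// Both inputs are hex strings: blockNumber from the transaction receipt,
// and the latest block number reported by the node.
function confirmations(txBlockHex, latestBlockHex) {
  return Number(BigInt(latestBlockHex) - BigInt(txBlockHex));
}

console.log(confirmations("0xb186a8", "0xb186aa")); // 2
```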

Wait, what’s that “gas” business?

If you look at the Etherscan transaction details for the transaction above, you’ll note a transaction fee of 0.00003214120392 ether — sending .001 ether from ourselves to ourselves cost us some (fake) money! This amount is paid as a “gas fee” to the node that mines the block our transaction lives in. “Gas” is the resource that makes the Ethereum blockchain work, and it can add up quickly, so it’s important to understand.

First the nuts and bolts. Every Ethereum action is assigned a cost in “units of gas” — e.g., sending ether from one account to another costs 21,000 gas. It costs a certain amount of gas to run each EVM opcode and to store state data in EVM memory. The more code and the more memory, the more gas is consumed. This enables the network to assess the resource cost of running a transaction, which is very important given the Turing-complete nature of the EVM. I could write a method in a smart contract that runs for hours or days — obviously there has to be a way to recoup those costs and prevent bad code from taking over the whole blockchain.

Gas is paid by the submitter of a transaction to the miner that performs the work involved in it. At any given point in time, gas has a price in ether — this number is pure supply and demand, dependent on how many miners are working and how many transactions are running. Actually, even this price is kind of a fiction — when a user submits a transaction, they just say what price they are willing to pay per unit of gas, and miners decide if that price is worth their time. A user can offer zero and maybe some miner will feel charitable, or they can offer a ton of ETH and have miners jump at the opportunity. The “current” gas price is just what the collective market considers reasonable at a point in time.

Transactions are also submitted with a maximum number of gas units the submitter is willing to spend. If the transaction “runs out of gas” before it is completed, an exception is thrown and the transaction is reverted. This is obviously suboptimal, and any unused gas is returned to the submitter, so generally people submit a much higher max value than they expect to be used.
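
Putting numbers on it: the fee is the gas actually used times the price paid per unit. A quick JavaScript sketch with the gasUsed and effectiveGasPrice values from the receipt above (the helper names and the 100,000 limit are mine, for illustration):

```javascript
// Fee actually charged: gas used x price per unit of gas (hex strings from a receipt).
function feeWei(gasUsedHex, gasPriceHex) {
  return BigInt(gasUsedHex) * BigInt(gasPriceHex);
}

// Worst case for a given limit; gas that goes unused is not charged.
function maxFeeWei(gasLimit, gasPriceHex) {
  return BigInt(gasLimit) * BigInt(gasPriceHex);
}

console.log(feeWei("0x5208", "0x5b3a1690"));    // 32141203920000n wei
console.log(maxFeeWei(100000, "0x5b3a1690"));   // ceiling if the limit were 100,000
```

The first number is 0.00003214120392 ether, matching the fee shown on Etherscan.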

As noted above, value transfers always cost 21,000 gas — but how do you even begin to make an estimate for smart contract code? This really is still a bit of an art — but there are two tools that can help. The solidity compiler can perform static analysis to estimate gas use:

$ solc --gas rrlotto.sol
======= rrlotto.sol:RoundRobinLotto =======
Gas estimation:
construction:
   infinite + 260600 = infinite
external:
   destroy():   32022
   play():      infinite
internal:
   resetCountdown():    infinite

There are a couple of interesting things here. First notice the “infinite” (really should be “unknown”) values — solc is extremely conservative about making estimates. In the resetCountdown method, we perform calculations based on the current block difficulty and timestamp. Since those can’t be known ahead of time, solc just punts, and that bubbles up to the other methods that call resetCountdown. Some of these “punts” are unavoidable in static analysis — others I think just reflect the fact that nobody has really worked too hard on this particular feature yet.

The other thing is that the cost for construction is presented as two numbers. The first one is the cost of executing the constructor (infinite as far as solc is concerned), and the second is the code deposit cost: the one-time charge for storing the compiled contract bytecode on the blockchain, at 200 gas per byte (so 260,600 gas for our roughly 1,300 bytes of code). Writing our two state values (house and countdown) to storage is charged as part of constructor execution. Keeping state can get really expensive in Ethereum; it’s definitely in your best interest to use as little as possible.

You can also use the eth_estimateGas JSON-RPC method to estimate the gas needed by a transaction (you can see this call in the eth-estimate-tx script). In this case the transaction is actually “dry-run” in an isolated EVM on the node, without impacting blockchain state, and the actual amount of gas consumed is returned. On the surface this seems like the obvious winner; it’s an exact calculation after all. But not so fast! Depending on the state of the EVM, costs can change significantly. Take for example the play() method in RRLotto … most of the time it decrements a counter and exits quickly. But once in a while, it executes transfers, emits an event and computes a new hash value. In order to be safe, you’d need to call eth_estimateGas with the “worst case” inputs and starting state. That’s not always a simple thing to figure out … so gas estimation remains a fuzzy art.

Finally! Let’s deploy some code

We finally have all the pieces we need to deploy RRLotto to the blockchain. We just need to construct a transaction with:

  1. The “to” field set to null — the null address is a special case address that means “please deploy this smart contract”.
  2. The “data” field set to the binary version of our compiled contract, as output by solc. If your constructor has parameters, this gets a little more complicated. This article does a remarkable job of explaining the details, but I haven’t included that in my scripts yet.

We also need a gas estimate — between solc and eth_estimateGas it looks like we’ll use about 360,000 gas for our constructor and storage. We’ll set a max to 500,000 just to be safe.
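
As a rough picture of what gets signed, here is a JavaScript sketch of the unsigned deployment transaction. The field names follow the JSON-RPC transaction object; the bytecode prefix, nonce and gas price are placeholder values, and buildDeployTx is a hypothetical helper, not what my script actually calls.

```javascript
const toHex = (n) => "0x" + BigInt(n).toString(16);

// Unsigned contract-creation transaction; quantities are hex-encoded.
function buildDeployTx({ bytecode, gasLimit, gasPriceWei, nonce }) {
  return {
    to: null, // no recipient means "create a contract"
    data: bytecode.startsWith("0x") ? bytecode : "0x" + bytecode,
    gas: toHex(gasLimit),
    gasPrice: toHex(gasPriceWei),
    nonce: toHex(nonce),
    value: "0x0",
  };
}

const tx = buildDeployTx({
  bytecode: "6080604052", // placeholder prefix; use the full solc --bin output
  gasLimit: 500000,
  gasPriceWei: 1500000007n, // placeholder; ask the node via eth_gasPrice
  nonce: 0,                 // placeholder; the account's transaction count
});
console.log(tx.gas); // 0x7a120
```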

The eth-deploy script puts all of this together:

$ ./eth-deploy ../rrlotto/rrlotto.sol 500000
Transaction Hash is: 0x246ba2c89621a63aa49e9a6f3b5de75e60edea6d1132ed9fa187760cc73f9a1d
waiting...
waiting...
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "blockHash": "0x66715659b4672729f59710b7b5db81d37ad342d26a98fc364176ec328ed5b742",
    "blockNumber": "0xb18a63",
    "contractAddress": "0xe588f20df3c5dad47d66722c2d6c744d3a41593c",
    "cumulativeGasUsed": "0x42e496",
    "effectiveGasPrice": "0x59682f07",
    "from": "0x5de0613c745f856e1b1a4db1c635395aabed82c8",
    "gasUsed": "0x5e701",
    "logs": [],
    "logsBloom":
0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000",
    "status": "0x1",
    "to": null,
    "transactionHash": "0x246ba2c89621a63aa49e9a6f3b5de75e60edea6d1132ed9fa187760cc73f9a1d",
    "transactionIndex": "0xb",
    "type": "0x0"
  }
}

We did it! Our new contract is deployed at the “contractAddress” 0xe588f20df3c5dad47d66722c2d6c744d3a41593c — use this link to see details of what happened:

  • We paid .00058 ether as a transaction fee.
  • The miner at 0x68268… received that transaction fee.
  • The new contract at 0xe588… was created with a 0 ETH balance.

Click the little “down arrow” icon next to the contract address to see the state of storage in the contract. Storage address 0x0 holds the house address and you can see it equals my account. Address 0x1 holds the countdown value and in this case was initialized to 16 (0x10).

It took a while to get here, but this is a pretty neat milestone. Our smart contract is deployed and active on an actual blockchain. Sure it’s the test network, but the only thing standing between us and Mainnet is about $97 in gas fees (386,817 gas x 63 gwei/gas x $3,984.07 USD/ETH). Have I mentioned that gas is kind of expensive?

That bears repeating — there is nothing stopping us from launching that contract on the Ethereum mainnet other than $97. We don’t need anybody to approve our account, or set up a monthly hosting contract, or anything — it’s just there, live, and completely anonymous managing real money. This is both super-cool and a little unsettling at the same time … no training wheels!

Calling contract methods

With the contract deployed, we can actually play the lotto. Of course, we could do that by submitting a transaction using the JSON-RPC interface — that code is in the play-rrlotto script. But if this web3 thing is going to go anywhere, I’m pretty sure that bash scripts aren’t going to be the ux that makes it happen. Instead, let’s build an actual web site that lets us play. The code for this is in rrlotto.html; hosted here if you want to give it a try live. As always, be prepared to have your mind blown by my web design skills.

The bridge that enables normal web pages to use smart contracts is the humble browser plugin — in our case MetaMask, the wallet application we used earlier to set up our test account. MetaMask and other browser-based wallets “inject” a javascript object (window.ethereum) into every web page you visit. Web developers can access this object from code on their pages, calling smart contract methods, initiating value transactions, and so on. It’s a pretty smooth trick actually.

The window.ethereum object is typically wrapped up in another javascript library to make it easier to use. Web3.js that we saw earlier is the granddaddy of these. For the lotto page I’ve chosen to use ethers just to give you a look at a second (but mostly equivalent) approach. It’s important to be clear on the difference between all of these things:

  • window.ethereum is provided by a browser plugin according to the EIP-1193 standard. It basically makes the JSON-RPC methods accessible to javascript on a web page (it turns out that MetaMask by default passes them through to the same infura.io nodes that we’ve been hitting directly).
  • The software that implements this standard is almost always (and certainly in the case of MetaMask) also a wallet, which has the job of holding account private keys — it isn’t essential that they be the same thing, but it makes web development easier.
  • A third-party javascript library like web3.js or ethers is usually used to make accessing window.ethereum methods simpler. This is purely for developer convenience — while not pleasant, it would be 100% possible to call smart contracts without it.

OK, let’s get down to it. Our page consists of a button (#playButton) and a div to display messages (#output). When the user clicks the button, the first thing we do is “wire up” all the Ethereum pieces in connectEthereum starting at line 38:

  1. If we’ve already gone through all of this, just bail out — we’re ready to go.
  2. If the window.ethereum object doesn’t exist, it means the user hasn’t installed an Ethereum plugin — nothing more we can do.
  3. Call the eth_requestAccounts method. MetaMask prompts the user to allow the page access before returning an “unlocked” account number.
  4. Set up the ethers objects we’ll use to call the contract later on. Notice the “abi” parameter we pass when creating the Contract — ethers needs this metadata to be able to format method call transactions properly. We generated the abi structures using solc with the --abi parameter (solc --abi rrlotto.sol).
  5. Attach an event processor to the contract that displays a message to the user when a payout occurs. Remember that method calls made by an EOA (account for a user) can’t see return values directly, so events like this are the primary way that on-chain data comes back our way.
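The first three steps can be sketched against a bare EIP-1193 provider. This is not the page's actual connectEthereum; the function name connectAccount and the state object are hypothetical, and the ethers setup and event wiring from steps 4 and 5 are left out. Passing the provider in as a parameter (rather than reading window.ethereum directly) is only there so the sketch can run outside a browser.

```javascript
// Steps 1-3 written against a bare EIP-1193 provider.
// `state` is a hypothetical object used to remember that we are connected.
async function connectAccount(ethereum, state) {
  // 1. Already connected? Nothing to do.
  if (state.account) return state.account;

  // 2. No injected provider means no wallet plugin is installed.
  if (!ethereum) throw new Error("no Ethereum provider found");

  // 3. Ask the wallet for access; it resolves to an array of addresses.
  const accounts = await ethereum.request({ method: "eth_requestAccounts" });
  state.account = accounts[0];
  return state.account;
}
```

In a browser the call would look something like connectAccount(window.ethereum, pageState), where pageState is whatever object the page keeps around between clicks.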

Next at line 66 we do a quick check to verify that MetaMask is configured to use the Ropsten network where our contract is deployed (a list of these identifiers can be found at https://chainlist.org/), and then finally at line 74 we attach our signing provider (with our private key ready to sign transactions) to the contract, call the play() method and wait for it to complete:

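// attach the signer so the contract can send signed transactions
const superContract = contract.connect(signer);
// call play(), sending the ticket price along as the transaction value
const tx = await superContract.play({ value: weiToPlay });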
log("waiting...<br/>transaction hash is: " + tx.hash);

await tx.wait();
log("transaction complete! <a target='_blank' href='https://ropsten.etherscan.io/tx/" + tx.hash + "'>view on etherscan</a>");

Before calling the method (and sending ETH to the contract!), MetaMask prompts the user for confirmation, providing some guidance on how much gas to send along. On the one hand this flow is amazing and cool — on the other it’s a ux disaster. I’m sure it’ll get worked out over time, but for now grandma ain’t gonna be playing our lotto.

The transaction proceeds asynchronously on the blockchain — on our page we disable the “play” button and wait for confirmation, but the browser overall is ready to do other stuff. When the transaction completes, MetaMask shows an alert dialog and our page comes back to life, ready to play again. Anytime a payout occurs, the event fires and our handler displays a message. Woo hoo!
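One way to handle the "disable the button while we wait" part is a small wrapper like the one below. This is a sketch and not the code from the lotto page; withButtonDisabled is a made-up name, and the button can be any object with a disabled property.

```javascript
// Run an async action with a button disabled, re-enabling it afterwards
// whether the action succeeds or throws.
async function withButtonDisabled(button, action) {
  button.disabled = true;
  try {
    return await action();
  } finally {
    button.disabled = false;
  }
}
```

On the page, the action passed in would be the play() call followed by tx.wait(). The try/finally matters because a user rejecting the MetaMask prompt surfaces as a thrown error, and the button should come back to life in that case too.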

We made it!

Whew — that was a lot. We wrote and compiled a real smart contract in Solidity, figured out enough about gas fees to know how much it would cost to deploy, called the JSON-RPC interface to deploy the contract, and finally called it from a regular old web page using MetaMask and its Ethereum provider. We wrote in a bunch of languages across a ton of distributed services. I hope it all made sense. If you get stuck please let me know, I’d be happy to help out if I can.

The question I’m left with is … in a world where I’m pretty sure the chains fall over at some point — beyond being an awesome nerdfest, does any of this matter? Will “web3” become a meaningful part of our world, or will it fade away after making a few folks richer and a lot of folks poorer? Still not sure, but I must say I’m enjoying the ride. Until next time!

An Old Guy Looks at Crypto

The first time I co-founded a company was back in 1996 with a group of four other guys (and yes we were certainly “guys” in the most utterly-stereotypical tech startup sense of the word). One of these was T, who even today literally glows with Chicago-style marketing/finance enthusiasm — I’d challenge anyone to take a coffee-and-Howard-Stern-fueled “perception is reality” road trip with T and not end up a lifelong fan. When I tell fellow nerds that they need a great business person to start their company with, I’m thinking about him.

So when T called a few weeks ago and asked what I knew about Ethereum and Smart Contracts, I figured I should take a look. I’ve spent the last decade dismissing blockchain and crypto without really spending any brain cycles on them … just has never smelled right to me. But obviously there are a ton of people who think it’s the future, and it’s survived more than a few existential crises already, so ok, let’s see what’s going on in there.

Fair warning — the innards of these technologies are quite complex; clearly the vast majority of people “explaining” them online don’t have a clue what is really going on under the covers. I’m going to try to share what I’ve learned in relatively simple terms, but almost certainly will get some of it wrong. I’d love to be corrected where I’ve messed up so please let me know. And if you actually invest your dollars based on any of this, well, that’s 100% on you my friend.

What problem are you trying to solve?

The root of all of this stuff is technology that enables two strangers to exchange value without a trusted third party mediating the transaction. A distributed network of computers (not majority-owned by any single entity) coordinates to ensure that transactions are permanently recorded, that value is owned by exactly one entity at a time, and that it cannot be counterfeited. For sure it takes resources to run the network, but the costs are diffused across a ton of folks and transaction fees can at least theoretically be kept pretty low (more on this later).

The appeal is pretty obvious. Right now we count on trusted third parties for all of our transactions, from banks to mortgage escrow companies to Visa and Paypal and Square. Not only do these folks eat into our assets via transaction fees and float; they exert an incredible amount of power on the market overall. Remember “too big to fail?” Oh yeah, and they also keep (and profit from) our personal information. A fair, anonymous trading platform that avoids these issues seems pretty cool.

Part 1: Hashes, keys and signatures

First, a few core concepts. None of these are new; we’ve been using them forever to do things like keep websites secure. But they’re building blocks for all that comes later so worth setting up a bit of a glossary.

A one-way hash is an algorithm for representing any arbitrarily-sized chunk of data with a small, opaque, unique label. That’s a lot of words. “Small” is important because you can represent huge files, like an entire video, in a small, easy-to-manipulate string (typically 256 bits these days). Note this isn’t some magic compression technology; “one-way” or “opaque” means that you cannot reasonably get back to the original bytes using only the hash, and you can’t predict what a hash will look like from the original bytes. Last, a hash is “unique” because you will never (in practice) get the same hash for two pieces of original data, even if they differ by only a single bit.
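A quick way to see "small" and "unique" in action is to hash a few inputs with SHA-256 using Node's built-in crypto module. The helper name sha256Hex and the sample strings are just for illustration.

```javascript
const { createHash } = require("crypto");

// SHA-256 of a string, returned as 64 hex characters (256 bits).
function sha256Hex(text) {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

const a = sha256Hex("send 1 ETH to Alice");
const b = sha256Hex("send 1 ETH to Alicf"); // one character changed
const big = sha256Hex("x".repeat(1000000)); // a megabyte of input

console.log(a.length, b.length, big.length); // 64 64 64
console.log(a === b); // false
```

The megabyte of input and the 19-character string both come out as 64 hex characters, and the two nearly identical strings produce labels with nothing visibly in common.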

A key is a secret used to encrypt or decrypt data. In stories we usually see symmetric keys, where two spies exchange a password that is used to both encrypt and decrypt a message. The problem with symmetric keys is that they need to be shared, which leaves them at risk to be stolen. Stolen keys not only mean that the wrong folks can read messages, but also that the wrong people can encrypt messages — all kinds of potential for nastiness there.

Public/private key pairs work differently. A person’s private key is something that is never, ever shared — but the corresponding public key can be shared openly (it’s “public” after all). A message encrypted with my public key can only be decrypted using the corresponding private key. Because the private key never leaves my control, it’s a much more secure means of communication.

A digital signature is another way to use public/private key pairs — to prove that something is genuine and unaltered. A signature is typically computed by first generating a one-way hash of the data, and then using a private key to encrypt the hash. The resulting signature is sent to a recipient along with the original content. The recipient computes their own one-way hash of the data, decrypts the signature using the corresponding public key, and compares the two results. If they match, the recipient can be confident that the data (a) did in fact come from the owner of the key and (b) was not altered in transit.
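Here is a small sketch of signing and verifying with Node's built-in crypto module. It uses a throwaway key pair on secp256k1 (the curve Bitcoin and Ethereum use) with SHA-256 as the hash; real chains layer their own hashing and encoding rules on top, so treat this as an illustration of the idea and not of any wallet's actual format. The messages are invented.

```javascript
const { generateKeyPairSync, sign, verify } = require("crypto");

// A throwaway key pair on secp256k1.
const { privateKey, publicKey } = generateKeyPairSync("ec", {
  namedCurve: "secp256k1",
});

const message = Buffer.from("pay Bob 5 coins");

// sign() hashes the message with SHA-256 and signs the digest.
const signature = sign("sha256", message, privateKey);

// Anyone holding the public key can check the signature.
const ok = verify("sha256", message, publicKey, signature);
const tampered = verify(
  "sha256",
  Buffer.from("pay Bob 500 coins"),
  publicKey,
  signature
);

console.log(ok, tampered); // true false
```

Note that the private key never appears on the verifying side; the public key and the signature are enough.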

Part 2: Blockchains

A blockchain is just a ledger — an ordered list of all transactions that have ever been processed in a particular system. Transactions are collected into blocks, and each block is hashed using input data from those transactions and the previous block. In this way the integrity of each block is reinforced by all the blocks that have come before it. Only one single, exact, ordered sequence of transaction and header data can culminate in the unique hash at the end of the chain; by following the hashes from start to finish, anyone can easily (albeit laboriously) verify that no transactions have been lost or tampered with along the way.
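The chaining idea fits in a few lines. This is a toy sketch, assuming SHA-256 over a JSON dump of the transactions; real chains hash a structured block header instead, and the function names and the all-zero starting hash are made up for illustration.

```javascript
const { createHash } = require("crypto");

const sha256 = (s) => createHash("sha256").update(s).digest("hex");

// A block's hash covers its transactions and the previous block's hash.
function makeBlock(prevHash, transactions) {
  const hash = sha256(prevHash + JSON.stringify(transactions));
  return { prevHash, transactions, hash };
}

function buildChain(batches) {
  const chain = [];
  let prevHash = "0".repeat(64); // placeholder "previous hash" for the first block
  for (const txs of batches) {
    const block = makeBlock(prevHash, txs);
    chain.push(block);
    prevHash = block.hash;
  }
  return chain;
}

// Walk the chain from the start, recomputing every hash along the way.
function verifyChain(chain) {
  let prevHash = "0".repeat(64);
  for (const block of chain) {
    if (block.prevHash !== prevHash) return false;
    const expected = sha256(block.prevHash + JSON.stringify(block.transactions));
    if (block.hash !== expected) return false;
    prevHash = block.hash;
  }
  return true;
}
```

Changing a transaction in an early block changes that block's hash, which no longer matches the prevHash stored in the next block, so verifyChain fails unless every later block is rebuilt as well.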

Part 3: Consensus

The blockchain data structure itself is actually pretty simple; the process of creating one in a distributed, “trustless” way is anything but. Thousands of independent nodes work together to create a “consensus” about what the blockchain should look like, hopefully making it really hard for a bad guy to screw things up.

Consensus is another long-standing problem in computer science, and many ways of dealing with it have been developed over the years. The approach used by Bitcoin is popularly called proof of work, and it’s useful to start there because it is pretty much the gold standard (ha ha, get it?) as far as crypto goes, and once you understand it the others make much more sense. Actually, PoW is really just one part of the complete consensus protocol, but that really only matters to the pedants out there. Here’s how it works (buckle up):

1. The Bitcoin peer-to-peer network is made up of thousands of nodes, where each node is effectively a computer running the Bitcoin client software. Nodes are configured as miners, full nodes or light nodes. (More on this later. There is also a subclass of full nodes called super nodes that I’m ignoring, sorry.)

2. Transactions are submitted directly to an arbitrary node through the Bitcoin RPC interface, most commonly using a wallet. Transaction data is signed prior to submission so that private keys need never leave the user’s chosen wallet or other personal storage.

3. Nodes relay transactions around the network, where miners pick them up and bundle them together into blocks. Once enough transactions are collected, the miner verifies signatures, accounts and balances, hashes the transaction data (this is called the “merkle root”) and starts hunting for a valid block hash.

This is where things get a little complicated. A “valid block hash” is one where the numerical representation is less than the current network “difficulty” threshold. Difficulty ratchets up and down according to a global algorithm that attempts to keep the rate of block creation constant, based on how much miner capacity is running across the network.

Remember that the block hash uses the transaction data (merkle root) and previous block hash as input. It also uses a nonce value, which is just a random number picked by the miner. The nonce is the only input to the block hash that can change. And remember that nobody can predict what the hash value will be based on the input — so miners just randomly pick a nonce and compare the resulting hash against the difficulty threshold until they find a valid one. This brute-force hunting process is what “proof of work” means … you cannot create a valid block hash unless you do a bunch of computing work, which costs real-world resources — making it infeasible to game in a way that makes economic sense.

Once a valid block hash is found, the miner tacks it onto the end of the chain and broadcasts their accomplishment across the network. Block creation includes a “block reward” for the miner (currently 6.25 BTC per block) — this is how new Bitcoin comes to exist, and why the term “miner” makes sense.

4. The story isn’t finished yet! Thousands of miners are all doing this work at the same time, using the same transactions [1] and trying to find valid block hashes so they can get their reward. This is where the consensus part comes into play. All nodes on the network (including miners but also full and light nodes) work to validate the blocks created by miners. The obvious part of this is just making sure all the math lines up — the signatures and hashes check out, the block hash is below the target difficulty, and so on.

Nodes also have to decide which block “wins” when more than one miner finds a valid block hash for the same transactions. This part is pretty cool — nodes just always prefer longer blockchains. Whenever the node validates a block in a chain that is longer than their current view of the world, they accept that one. The effect here is that all nodes converge on the same chain / view of the world — but it takes time for that to happen. Most bitcoin clients consider a transaction “final” when they see six “confirmations” — meaning that there are six blocks in the chain after the one including the transaction. Six is an arbitrary number but seems to work well as a conservative threshold.
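The nonce hunt from step 3 can be sketched like this. It is heavily simplified: a single SHA-256 over concatenated strings stands in for Bitcoin's real header hashing, nonces are tried in order instead of at random so the run is repeatable, and the target and input values are toys chosen so the search finishes in a blink.

```javascript
const { createHash } = require("crypto");

// Hash the three inputs named above: previous hash, merkle root, nonce.
function blockHash(prevHash, merkleRoot, nonce) {
  return createHash("sha256")
    .update(prevHash + merkleRoot + nonce)
    .digest("hex");
}

// Try nonces until the hash, read as a number, falls below the target.
function mine(prevHash, merkleRoot, target) {
  for (let nonce = 0; ; nonce++) {
    const hash = blockHash(prevHash, merkleRoot, nonce);
    if (BigInt("0x" + hash) < target) return { nonce, hash };
  }
}

// A toy target: roughly one hash in 4096 qualifies (top 12 bits zero).
const target = 1n << 244n;
const found = mine("00ab", "beef", target);
console.log(found.nonce, found.hash);
```

Finding the nonce takes thousands of attempts on average, but checking someone else's answer takes a single call to blockHash. That asymmetry is what lets every other node validate a miner's work cheaply.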

A side note: The value “six” here is just one of a bunch of constants and other parameters that are part of the Bitcoin system … the maximum number of coins in the system, the block reward, the difficulty threshold, and so on. Whoever figured all this out up front, mostly before the algorithms were ever really live in the real world, is a freaking genius. The mystery around that genius is great drama in and of itself. Who is Satoshi?

Still with me? Hard to believe. But if you are, you now have a better understanding of how Bitcoin and the other PoW blockchains work than most folks spending money on them. Thumbs up!

Part 5: Ethereum and Smart Contracts

One thing I glossed over in previous sections is the specific data in each “transaction”. At one level it’s useful to understand that this just doesn’t matter — the blockchain structure and consensus protocols are largely agnostic to the claims made in the transaction itself. But of course for any specific blockchain implementation, those transactions are the whole point of the exercise and matter a lot.

Bitcoin (BTC) was the first mainstream cryptocurrency, and its blockchain is hyper-focused on financial transactions. Each transaction just describes movement of Bitcoin value between “input” and “output” accounts. Actually that’s a little bit of a lie, Bitcoin transactions are built on some rudimentary scripting primitives — but in practice Bitcoin transactions just end up being value exchange.

The second-best-known blockchain is Ethereum (ETH). Naïve traders view it as just another cryptocurrency, but that’s super-wrong. ETH transactions certainly support value exchange just like BTC, but their real purpose is to manage smart contracts. A smart contract is a full-on piece of software that lives on the ETH blockchain with its own address that can hold funds and arbitrary state. Smart contracts can also expose an API which can be called by submitting (you guessed it) transactions on the chain. Folks sometimes refer to Ethereum as the “world’s computer.” [2] ETH the currency is used to pay for processing and storage time on the chain, so theoretically there is a little more “oomph” behind its value vs. something purely abstract like BTC, maybe.

Anyways, this is cool because it makes the mechanics of trustless, distributed transactions available in a bunch of new contexts. A few popular use cases you probably have already heard of:

  • Fungible tokens based on the ERC-20 standard that enable arbitrary “currencies” — these could be purely abstract like airline miles, or tied to physical stuff like shares of a company or even fiat currencies like US Dollars. Of course, ties to the physical world are based entirely on contract law; ETH just provides a great platform for representing and exchanging them in a robust way. Token behavior could be implemented by any smart contract, but adherence to the ERC-20 standard means that most Ethereum wallets will be able to hold and trade the token.
  • Non-fungible tokens based on the ERC-721 standard that enable collectibles and ownership of other unique assets. These differ from fungible tokens in that each one is unique — e.g., there is only one NFT representing a particular painting and only one account can own it at a time (modulo fractional ownership). Obviously the same caveats about physical objects apply, but even for digital NFTs (e.g., an NBA Top Shot video) it’s a little weird — block data is too limited to store large files directly, so NFTs typically hold a hash of the actual asset, which is stored somewhere else (“off-chain”, i.e., basically unprotected). And of course a digital asset can be bit-for-bit copied with no loss of fidelity, so …. I am generally trying to reserve opinions for later, but while this is a neat idea, it really just seems kind of silly to me.
  • Decentralized autonomous organizations like the one that tried to buy a copy of the US Constitution. Member accounts (defined by ownership of a particular token or some other rule) call smart contract APIs to vote on issues, with the winning votes automatically triggering actions defined in the code of the contract. For example, members might contribute ETH to a shared giving fund and vote on which account(s) should receive donations.
  • Decentralized finance applications like MakerDAO that provide financial services like loans and interest-bearing accounts without a central bank.
  • Lots and lots of games and lotteries and other crap.

Building all of this on the smart contracts framework provides some neat benefits. For example, an NFT can be coded so that when it changes hands, a royalty is sent to the original creator. Tokens might be used as an access pass to an online or real world event. Defi currencies can be coded to automatically increase or decrease supply to reduce volatility. DAOs might include a poison pill that liquidates the organization’s assets if the market reaches a certain level. And because all of the smart contract code is visible on and verified by the blockchain, there is an unprecedented level of transparency as to what is going on.

On the flip side, since smart contracts are just code, they can have bugs, and those bugs can be disastrous. It’s also very cumbersome to include “off-chain” information into smart contract algorithms … for example, a DAO might want to automatically send money to relief organizations when natural disasters occur — knowing that a disaster has occurred in a trustable, automated way is a challenge that lives outside of what Ethereum can easily do today.

In a follow-up post, I’ll walk through the concrete stuff required to deploy and run a smart contract in some detail — but this screed is long and boring enough already.

Part 6: Why are these things valuable?

The question rational people always ask about cryptocurrencies (after “WTF?”) is “why are they worth anything?” Honestly, this is a philosophical question more than anything else, and if there is anyone on the planet less well-equipped to plumb the depths of philosophy than this guy, well, I haven’t met them yet. But I think at least for me, the best answer is just:

BTC and ETH and other cryptocurrencies are valuable because people think they are.

That’s really it. Enough people believe enough in BTC and ETH that they are willing to accept them in exchange for other things they believe to have value, like US Dollars or Euros. This really isn’t so different from the reason that “everyone” attributes value to the US Dollar. And as Hamilton reminds us, even that wasn’t always the case:

Local merchants deny us equipment, assistance
They only take British money, so sing a song of sixpence

This conversation always seems to devolve into people shrinking into the fetal position and questioning the nature of reality, so I’m just going to walk away now. Decide for yourself.

Part 7: What could go wrong?

Much more interesting to me is the technical viability of blockchain-based systems. The more I learn, the more I get convinced that they are fatally flawed and likely to blow up. I’m not smart enough to guess when, but probably pretty soon. Which is not to say that blockchains are inherently bad, or that the core value proposition they offer will not survive and change the world. Just that the current implementations — and all the value wrapped up within them — are probably not up to the task.

It feels very, very much like the dot-com days to me. Some amazing, world-changing things came out of that time, but unfortunately “dot-com” became a religion, and like all religions, grew increasingly resistant to rational criticism. With real money involved, that’s some risky business. Blockchain technical issues feel the same way to me right now. Or more precisely, one specific technical issue. But before I go there, let’s look at a couple of the other technical challenges that I think are probably fine.

Energy Use

High on the list of blockchain objections is energy use. Proof of work is by design an incredibly redundant protocol — thousands of nodes are all basically doing exactly the same work as fast as they can, most of which is thrown away. There are all kinds of statistics on this and how ridiculous it has become; my personal favorite is that the energy use by Bitcoin miners is roughly equivalent to my home state of Washington. Does it feel a few degrees hotter to you?

The good news is, this is pretty fixable and already being demonstrated by other chains. One alternative to proof of work is proof of stake. This model does away with miners. Instead, nodes participate in block creation by putting a “stake” into escrow on the chain. The opportunity to mine each new block is granted to one node, chosen pseudo-randomly and typically weighted by the relative size of each node’s stake. That node and only that node creates the block and receives the reward, which is then validated using the normal mechanisms. Nodes that propose invalid blocks are penalized by loss of their stake.
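The stake-weighted pick can be sketched as below. The names and stake amounts are invented, and the random input r is passed in by hand; how a real chain produces that pseudo-random value in a way nobody can rig is the hard part, and is not shown here.

```javascript
// Pick a validator with probability proportional to stake.
// `stakes` maps a node name to its staked amount; `r` is a number in [0, 1)
// standing in for the chain's pseudo-random source.
function pickValidator(stakes, r) {
  const entries = Object.entries(stakes);
  const total = entries.reduce((sum, [, amount]) => sum + amount, 0);
  let threshold = r * total;
  for (const [node, amount] of entries) {
    if (threshold < amount) return node;
    threshold -= amount;
  }
  return entries[entries.length - 1][0]; // guard against rounding at the top edge
}

const stakes = { alice: 50, bob: 30, carol: 20 };
console.log(pickValidator(stakes, 0.1)); // alice
console.log(pickValidator(stakes, 0.65)); // bob
console.log(pickValidator(stakes, 0.95)); // carol
```

With these numbers alice is chosen for half of all possible values of r, bob for 30% and carol for 20%, which is the "weighted by the relative size of each node's stake" part.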

Because it eliminates virtually all of the redundant and expensive computation, PoS is dramatically more energy-efficient than PoW. It is also considered to be helpful in maintaining decentralization of the network, which is a little counter-intuitive because nodes basically are buying influence. But maximum weighting can be easily capped to avoid this, and the approach negates any advantage from funding huge mining data centers as happens today.

Ethereum has been planning to move to PoS for some time now, but it keeps slipping — currently planned for sometime in 2022. Vitalik, I feel your pain, man.

Transaction throughput

Bitcoin can handle about four or five transactions every second, and individual transactions take around an hour to confirm. Ethereum is faster but still in the range of 20 TPS with maybe eight minute confirmations. These numbers are frankly hilarious compared to centralized processors like Visa that have TPS rates in the thousands, and confirm credit card purchases reliably in seconds.

Folks have played around the edges of this by tweaking parameters like the number of transactions that can go into a single block. Bitcoin Cash is one of these and sees rates of about 300 TPS. Layer 2 chains are another approach, basically accepting transactions quickly in separate infrastructure and bundling them up into fewer meta-transactions on the real chain. There are security/speed tradeoffs here that will take some experimentation to really get right.

More revolutionarily, Solana has demonstrated real progress by mixing up proof of stake with (sorry, more jargon) proof of history which uses a distributed, synchronized clock to reduce the back and forth typically required to keep everything in an agreed-upon order.

All up, the problems here seem eminently solvable if not already solved. I’m not worried.

Part 8: Infinite Growth

Finally we come to the one bit I just can’t get over — I’d love to hear a non-religious explanation of why I’m wrong, but:

Blockchains by definition grow forever.

At its core, a chain works because it stores a record of everything that has ever happened ever. When I stand up an Ethereum “full” node, the first thing it does is start downloading the entirety of that history to my local machine. Ethereum’s chain grew from about 480GB in September 2020 to 952GB a year later, and recently crossed beyond a terabyte. Ethereum is only six years old, and those early years were pretty quiet! If you simply play out that year-over-year doubling, it reaches half a petabyte by the end of this decade alone — and all the work underway to accelerate throughput means it ain’t going to slow down. Bitcoin is better off, with a linear growth that hits a couple of terabytes by the end of the decade, but that’s just a result of its low TPS rate — I’m not sure that “we survive by doing less” is a winning strategy.
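The arithmetic behind the Ethereum projection is worth spelling out, because it depends a lot on how you extrapolate from the two data points quoted. The sketch below assumes nine more years of growth (September 2021 to roughly the end of 2030) and compares adding the same number of gigabytes every year with multiplying by the same ratio every year. The half-petabyte figure corresponds to the second case.

```javascript
// Chain sizes quoted above, in GB.
const size2020 = 480;
const size2021 = 952;
const yearsAhead = 9; // assumption: September 2021 to about the end of 2030

// Additive: keep adding the same number of GB every year.
const additiveGB = size2021 + (size2021 - size2020) * yearsAhead;

// Compounding: keep multiplying by the same year-over-year ratio.
const ratio = size2021 / size2020;
const compoundGB = size2021 * Math.pow(ratio, yearsAhead);

console.log((additiveGB / 1000).toFixed(1) + " TB"); // 5.2 TB
console.log((compoundGB / 1000).toFixed(0) + " TB");
```

The compounding case lands in the neighborhood of 450 TB, close to half a petabyte, while the additive case stays around 5 TB. Which curve the chain actually follows is the whole question.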

Both chains have the concept of a “light” node that only downloads block headers rather than full transaction histories. Light nodes verify the structure of the chain itself, but trust that the transactions inside are OK. This is a great alternative that allows low-powered devices and wallets to participate, but isn’t sufficient on its own to keep the network running.

Various data compression techniques have been proposed or are in use — this may delay the time of death, but doesn’t address the fundamental problem.

The closest thing I’ve seen to a solution is Ethereum’s proposed “sharding” solution, in which multiple small chains operate independently and periodically roll their transactions up (layer 2 style) to a master (“beacon”) chain. This presumes that most transactions occur locally within a single shard — cross-shard transactions would be painfully slow and expensive, so they need to be rare. There are a bunch of new security and integrity issues here too. Maybe it’ll work, but it’s always “just around the corner” and I have yet to see much of a concrete proposal, and it still doesn’t fundamentally solve the growth problem — smells like religion.

At the end of the day, things that grow infinitely are infinitely bad. Sure disks keep getting bigger, but not nearly as quickly as the chains are. At best this means that soon (like, SOON) only very well-capitalized entities will be able to afford the hardware required to run nodes, in which case we’re right back to the centralized system and power structure we started with. More likely, one day investors will finally realize that the end is near, and a lot of folks will lose a lot of money.

Unfortunately, facing this question has become heresy to true believers. You don’t “get it” or “see the big picture.” They talk about solutions that don’t exist as if they did. I’ve been at this movie before, and it was called dot-com. No thanks.

So where does this leave us?

Blockchains and the systems built atop them are super-cool. More than that, they are addressing real problems in novel and revolutionary ways. But right now there is so much quick money to be made that most folks are ignoring the existential question that arises from their infinite (and accelerating) growth. I am absolutely ready to look at data that shows we can keep increasing adoption of the amazing features of these new systems without falling off of a cliff — but I’ve looked and looked hard and haven’t found it yet. Until I do, I’ll remain a grumpy old guy yelling at DApp developers to get off my lawn.

But that doesn’t mean I won’t keep learning and experimenting — there is just too much amazing stuff going on to ignore. And I’ll keep listening to T because he may well see the magic before I do. Next time we’ll write and deploy a smart contract.

[1] Omer points out on LinkedIn that it’s not really the same transactions. Each miner is constructing a block from some subset of all pending transactions in the system (plus a unique “coinbase” transaction representing their hoped-for reward). So while there is great redundancy/rework in the system, it’s not true that every miner is literally racing to do exactly the same thing.

[2] Also from Omer is a note that the EVM (Ethereum Virtual Machine) that runs smart contracts is Turing complete. This means that it’s a for-real general-purpose computing engine, unlike Bitcoin’s simple script. Managing this additional complexity (and of course power) is the purpose behind the “gas fees” that are part of each execution. Super cool stuff worthy of its own post!