This is the fifth post in a six part series about prevailing myths in blockchain implementations. The previous myths are on:
- All security is defined by the amount of energy required to break it, and security can only be manufactured with the use of an amount of energy that is equal to or greater than the energy of the best attack.
- Proof of Work is the most efficient way of generating blockchain security precisely because it requires a known (although variable) amount of energy to solve a cryptographic puzzle and provides no alternate attack vectors.
- A PoW mining market with ASICs and large data centres is a highly specialized system for building ever more sophisticated energy barriers to defend the chain. An advantage this year does not mean that the miner maintains the advantage next year, for another competitor could develop better tech. This type of market is therefore decentralized over time.
- The PoS security mechanism relies on a static and low amount of energy used to secure it. This is manifested in it being hostage to the physical security of the stakers and, more importantly, to the security of the software that stakers must run. Therefore PoS and other non-energy intensive incentive mechanisms do not enhance blockchain security and probably decrease it specifically because they aim to lower energy use (and they try to do this inefficiently).
- Tell me why I’m wrong. A valid counterargument can take the form of a counterexample: a real life security model where ultimate security relies on an energy investment and ongoing energy expenditure that is lower than the energy of the most energy efficient attack against such a framework.
In this avant garde article, I explore the meaning of security and have a go at defining some basic principles about how it works and what that means for the Proof of Work (PoW) vs Proof of Stake (PoS) debate. The news is not good for PoS, as PoW might just be the most efficient way possible at achieving security, bar none.
Throughout the article I frequently refer to Bitcoin as an example of a purist blockchain, but the concepts can be applied equally to other blockchains that rely on PoW for security, including the current version of Ethereum 1.x (ETH), Monero (XMR), or Zcash (ZEC).
I’ve been mulling over posting this more widely since April 2018, because I wanted it to settle in my head and to hear if there are any arguments out there that could persuade me to doubt what I’m about to write.
Unfortunately, I haven’t found anyone who could argue the opposite side of this argument well. If you do have such an argument, feel free to let me know. You might be surprised to know that the most common counterargument that I have heard is an unsupported declaration that I’m wrong, sometimes hastily followed up by more prominent detractors with a comment on my understanding of what energy is, or a claim that I mistake energy for financial cost. Neither bothered to address the point that energy is measured in joules or kWh and that this is exactly what the work in PoW is, nor the point that cost is mostly a proxy for energy anyway (mostly, because cost variance can also be attributed to information asymmetry).
Having said this, I must also say that ever since reading the Ethereum whitepaper I have considered it to be a superior concept to Bitcoin. A Turing Machine on top of a permissionless blockchain allows an arbitrary ecosystem to evolve on top of the protocol, and to piggyback on its security. An example of this ecosystem is Decentralized Finance (DeFi), with MakerDAO’s DAI stablecoin as a manifestation of the brilliance that such a platform can support. Through the Turing Complete EVM, Ethereum became a true multi-purpose platform for a new financial system. In order to continue its success it is crucial that Ethereum continues to maintain the highest attainable levels of security. I wish the ecosystem well and I hope to be able to participate in it for a long time to come.
The largest flaw in this article is its length. I have considered cutting it shorter, but have opted to keep the arguments as they are, since some of the concepts might not be familiar to some readers. You should be able to judge whether a given section is obvious to you or not, so feel free to skip to the next one. The main points are highlighted anyway.
What do security and energy have in common? Everything, because energy defines security.
The basic argument goes like this: at the base of any security mechanism is an energy tradeoff and any attack must take this tradeoff into consideration. It is because of this that the most robust security mechanisms are ones where the energy tradeoff of an attack is known and large.
There have been many articles that argue that bitcoin is secured by energy, and this is an established fact, but since energy use is considered wasteful and harmful to our environment, crypto projects the world over have tried to create more energy efficient, or less energy hungry, ways of attaining security levels comparable to bitcoin’s PoW. I will argue here that a more energy efficient way to secure a blockchain does not exist by definition, because energy defines security in general.¹
A short story on how nature cheats.
I’ll tell you a story that motivates the questions I ask by revealing how limited our models are relative to how complex systems actually function. It starts with a Scientific American, or maybe it was Popular Mechanics, article that I read when I was a teen. A cursory google search did not yield the article, but I was able to find the source research paper here. On rereading it after all this time, I still think it’s a great piece of research. Here’s a short summary.
Dr. Adrian Thompson designed an experiment to implement evolution of a physical digital circuit that was to differentiate between a 1kHz and a 10kHz signal. The circuit used was an FPGA, which is a chip that has hardware logic which can be reconfigured to perform any logic operation using logic gates (hardware) instead of computer instructions (software). Thus, the experiment was to evolve a hardware circuit to perform the given task.
The FPGA was primed with several random gate arrangements and was connected to a fitness test that would evaluate how well a given circuit identified the input signals. An evolutionary algorithm then generated offspring from the best configurations to create each new generation of configurations to be tested.
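The loop described above can be sketched in a few lines of Python. This is only a toy illustration and not Dr. Thompson's setup: the genome is a plain bit string, the fitness function (matching a fixed target pattern) is a hypothetical stand-in for the hardware fitness test, and the population size, mutation rate and all names are my own assumptions.

```python
import random

def evolve(fitness, genome_len=32, pop_size=20, generations=50,
           mutation_rate=0.02, seed=0):
    """Toy evolutionary loop: keep the fitter half, breed the rest from it."""
    rng = random.Random(seed)
    # Prime the population with random configurations.
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Score every configuration and rank the best first.
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        parents = population[:pop_size // 2]   # survivors, kept unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    population.sort(key=fitness, reverse=True)
    best_per_generation.append(fitness(population[0]))
    return population[0], best_per_generation

# Hypothetical stand-in for the hardware fitness test: match a fixed pattern.
TARGET = [1, 0] * 16

def match_target(genome):
    return sum(1 for g, t in zip(genome, TARGET) if g == t)

best, history = evolve(match_target)
```

Because the surviving parents are carried over unchanged, the best score in `history` can never drop from one generation to the next. Note that the algorithm only ever sees the score, never the reason behind it, which is exactly what leaves room for the surprises described next.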
An additional twist to the challenge presented to evolution was that the FPGA had no access to a clock that would help measure frequency. This made the task too difficult for a human to design the solution.
Could evolution, or nature, design a hardware logic circuit that would recognize the frequency of an incoming signal without using a clock? I was mesmerized by the problem, and the solution surpassed my wildest (and naive) expectations.
The experiment was set up and run for thousands of iterations over several hours. The design that emerged matched the hypothesis. The chip configured with the final design could accurately distinguish the 1kHz and the 10kHz signals. I was excited to read about this result, but it wasn’t altogether unexpected. What came next was the most intriguing thing I wanted to know: what was nature’s brilliant design that surpassed human skill? I couldn’t wait to read about it!
It turned out that the design was gibberish! No human could understand how the pattern of logic circuits would produce the result it did. As a combination of logic gates, it simply made no sense. They tried the experiment again and again, with similar results. The logic gate designs coming out of the process did not seem to implement logic at all. They went in loops and were disconnected in places. Interestingly, the design did not work as well when moved to another area of the FPGA (originally, the design was confined to only a corner of the chip).
The paper suggests that the evolutionary process ignored the abstract logic that a human designer would use and used all physical features of the chip. Apparently evolution was misinformed about what it was supposed to do and cheated by going outside of our theoretical model of how an FPGA works!
I felt the hairs stand on the back of my neck. The genius of nature was astonishing. When given the opportunity nature was playing within its own rules and didn’t care about the logic model around which the chip was designed. It turned out that in experiments nature plays with a full deck, while we humans are limited to subsets of this deck that we are able to fit into models.
There are two outcomes of experiments: boring and exciting. Boring experiments show that nature follows our theoretical model. Exciting experiments make nature show new cards from its hand that are outside of our model. The FPGA evolution experiment was of the second kind. When nature plays with us during an experiment, it sometimes pulls out cards that we didn’t even know existed or that we were not capable of using ourselves.
Blockchains are fascinating to me in the same way. They can be thought of as a hybrid of software and data that feeds on energy supplied by humans in exchange for a secure digital asset. Although this summary sounds simple enough, we must remember that it is just an incomplete mental model, and that the way in which blockchains actually work is much more nuanced as evidenced by hard fork turmoil, bitcoin maximalism as a sustaining ideology, etc.
I see in this space something that I saw then in that article years ago. Someone wrote clever code (Bitcoin) that seemed to be well defined in terms of computer science (see my first Myth article), but this code interacted with its surroundings to reveal more than most expected. Blockchain is nature showing us a glimpse of its full deck as it relates to our behaviour, and in this particular case we ourselves seem to be the cards it uses.
What can we learn from observing Bitcoin?
If there is one thing that we must admit about Bitcoin and the Proof of Work approach to security, it is that it seems to work. Now our exercise is about finding out how this game works, because if anyone thinks they have the rules figured out, they are underestimating the rule master.
To me one of the first red flags about someone not understanding how complex this game is, is when they say that Proof of Work, the core element that makes Bitcoin or Ethereum secure, is inefficient. How do they know that PoW is inefficient? Compared to what other proven mechanism?
It is true that PoW uses amounts of energy that are unusual compared to what we are used to, but to conclude that PoW is inefficient because it seems to use a lot of energy is a non sequitur. We have become accustomed to energy use being a polluting activity, and with the rampant burning of fossil fuels across the globe that is an accurate assessment. But saying that an energy consuming activity is wasteful just because it uses more energy than we feel it should forces a connection where one does not exist.
Let me give you an example. Consider an electric space heater that heats a room in the winter. How much energy does this heater consume? The answer is quite a bit. Maybe 1500W or more to keep a room warm. Maybe you need two or three to do the job? That’s certainly more than my laptop or the lights need, but does that mean that the space heater is inefficient? Absolutely not.
An electric space heater happens to be the most energy efficient device that we know how to make! It is 100% efficient and you can’t get any better than that. So, just because PoW is power hungry says absolutely nothing about its efficiency. It is possible to be 100% energy efficient and still seem to be energy hungry. Case closed.
So, what do we know? Well, we know that while PoW seems to be quite power hungry we actually have no idea how efficient proof of work really is, and that’s a start.
Energy is the convertible currency of the universe.
While the Ethereum community seems to be confident that they have the next incentive/scaling model all figured out and simulated in papers and test code, I’m not quite convinced. If one thing is clear, it’s that no one bothered to let nature know which model we would like it to play by when we let the game play out in real life.
I will argue that in real life, lower energy consumption necessarily leads to lower security, because the security of any system whatsoever is always defined by the energy required to attack it. Moreover, when we weigh the security of a system, critical energy tradeoffs are often implicit in our environment and therefore taken for granted.
After all, energy is the convertible currency of the universe.
Understanding security and what lets us sleep at night.
Let’s start with a few thought experiments in the physical world, and then progress to computer systems, then blockchains. You can skip right ahead, if you like, but this progression is what was necessary for me to convince myself that most probably all systems are subject to the principle that energy underpins all security.
Example 1. A small safe.
The security of a simple safe (or lockbox) is a situation that is familiar to most of us and we do not give it a second thought. We know that a safe can help in keeping things from being stolen, but we know that there are limits to this security. We will therefore not put our most valuable possessions in a small safe without taking additional precautions. We know this, because it’s common sense.
When you put a valuable inside a small safe, you make an implicit calculation about the security of the device. The calculation is, “does it take more effort to extract the contents of the safe than the contents are worth?” You will select the safe according to this calculation, and depending on what you want to protect.
You may choose a simple lockbox in a hotel for your wallet, a larger safe in your house for some home valuables, or a safety deposit box in a bank vault for significant savings or documents that need fire protection, etc. In any of these cases you will consider that an attacker would first have to break the box to get to your valuables, and you will consider how difficult it is to break into the box. Is a crowbar enough (low energy), or does it require heavy duty drilling (slightly higher energy)? And if someone uses a moderate energy method, like dropping the box off a building or a blowtorch, will it result in the destruction of its contents? In all these situations, you will consider the cost of the attack vs the potential gain, and the attacker will also consider whether a higher energy attack (which may be available) could ultimately destroy the contents. These considerations are all strictly in terms of energy tradeoff, or effort and risk vs reward. Simple enough.
What about a safety deposit box in a bank? How does the energy tradeoff look there? Whatever you would put in that deposit box surely isn’t worthwhile to rob a bank for, so you choose to pay a little rent to piggyback on the bank’s energy intensive physical security apparatus.
Now we are ready to take a look at a more complex system.
Example 2. Fort Knox.
Let’s consider robbing Fort Knox as there is some evidence that it contains substantial stores of gold.
There are a lot of very durable valuables in the Fort Knox vault, so a high enough energy assault should make the attack worthwhile. You might even say that the energy of the attack (like a bunch of powerful bombs) could be considerable to still yield good access to the contents, even if they do get dispersed a bit during the attack. You should be imagining a scene with dazed pedestrians wandering around a messy street scene picking up pieces of gold and catching hundred dollar bills fluttering in the breeze.
There is a catch, however. Fort Knox is protected by a security apparatus that starts with guards, continues through a police force in the area (making carrying the gold off difficult), and ultimately the US military might. The US military apparatus is an extremely high energy security system that is too big to try to overthrow for the possibly underwhelming amount of gold in Fort Knox, so the energy trade off suggests that it’s not worth it to attack Fort Knox’s high energy barrier outright under any circumstances.
That’s not the whole story though.
What if we were to find a sneaky way to enter the vault and carry off the gold? Could that lower its perceived energy barrier? Could we have a man on the inside help us, Ocean’s Eleven style? Slipping the loot out without breaking heavy doors or setting off the US military security apparatus would lower the energy barrier to carrying the gold out, and perhaps make the effort and risk worthwhile. And so, Fort Knox (hopefully, as I’m no expert on Fort Knox) employs an elaborate process designed to prevent low energy attack vectors either from existing at all, through good security procedures, or from being found, by maintaining secrecy. In the end, because of the obscurity of the security apparatus, the lowest available energy attack vector is very hard to determine and exploit. For all we know, if it had already been exploited we wouldn’t even have a way of finding out until someone did an audit of the contents.²
Disclaimer: I have little interest in conspiracy theories around Fort Knox, and just want to underline that the “security through obscurity” around Fort Knox does not make it particularly easy to verify its contents. The point is, it is possible to imagine a low energy attack vector on Fort Knox that bypasses the obvious high energy defences that are meant to symbolize its security. Most importantly, we have few ways of finding out if such a low energy attack vector exists and no opportunity at all to verify that it does not exist. Hmm.
Example 3. A modern bank.
When banks get robbed these days, it is through electronic means with the help of social engineering. Basically, thieves steal someone’s password or bribe someone on the inside to help in the operation. This, again, is an energy tradeoff.
The attackers know that you can’t walk into a bank and steal millions, so they have to break into computer systems by gaining access to accounts on these computers. They do this through low energy attacks: social tricks, viruses/worms, and phishing.
Phishing in particular does not attack the bank’s defences directly and focuses on the low security user and their funds. If the thief can get a hold of a customer’s password, he can transfer money out without breaking the bank’s security. The thief will break the customer’s security instead. Of course, as bank customers we do not actively manage our security, but we rely on a security model that the bank provides us with. Usually, this is a password and a second factor, like a code sent by SMS with a one time password for riskier operations.
The level of security the bank provides to the user depends on how much money the user has in their account. And so, a retail user will have a basic password and SMS, while a corporate user may have hardware dongles, etc. The security model is then more localized with a smaller energy attack vector and commensurately smaller payoff that is limited only to the customer’s funds and not the whole bank’s.
Note that the bank itself relies on the energy intensive state security apparatus, so it is rarely profitable to rob a bank in such a way as to activate that apparatus.
Security of Computer Systems.
And so it is with any computer system, including ones secured with modern cryptography, which is quite secure on paper. The known high energy attack vector involves brute forcing the encryption by trying all the possible keys. This is impractical for most prudent configurations, even for governments. The (relatively) easy lower energy attack is to obtain the key from its custodian or user.
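To put a rough number on the high energy vector, here is a back-of-the-envelope sketch. The assumptions are mine and deliberately generous to the attacker: each key trial is charged only a single bit erasure at the Landauer limit (k·T·ln 2) at room temperature, which is a theoretical floor far below what any real hardware achieves, and on average half of the key space has to be searched.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23   # Boltzmann constant (exact SI value)

def landauer_floor_joules(key_bits, temperature_k=300.0):
    """Lower bound on the energy to try half of a key space,
    charging one bit erasure at the Landauer limit per key trial."""
    joules_per_trial = BOLTZMANN_J_PER_K * temperature_k * math.log(2)
    expected_trials = 2 ** (key_bits - 1)   # on average half the keys are tried
    return joules_per_trial * expected_trials

def joules_to_kwh(joules):
    """Convert joules to kilowatt-hours (1 kWh = 3.6 MJ)."""
    return joules / 3.6e6

e128 = landauer_floor_joules(128)   # floor for a 128-bit key
e256 = landauer_floor_joules(256)   # floor for a 256-bit key
```

Even under these idealized assumptions the floor for a 128-bit key is on the order of 10^17 joules, and every extra key bit doubles it. The barrier is large, but more importantly it is explicit and can be calculated.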
Obtaining the encryption key requires a physical or social attack on the user, which bypasses the hard cryptography and high energy defence. Here I reference the brilliant comic by xkcd about a $5 wrench attack. In the comic strip two potential thieves plan to break into an encrypted laptop. The first character proposes to use a high powered computer to break the encryption on the laptop, but this will not be enough, as the encryption scheme used is too strong to break with available computing power. The strip dismisses it as wishful thinking by nerds that an attack can be foiled by strong encryption and proposes another, more realistic scenario. The thieves simply decide that all they need to do is to beat the laptop owner with a wrench that was purchased for $5, and the secret encryption key will be handed over by the victim in short order, rendering the encryption useless. A physical attack with a cheap weapon is relatively low energy compared to a brute force attack using an expensive computer that uses up time and energy. There is again the question of a police force, and so on, but these are still just other energy barriers that are often easier to overpower than breaking encryption by obtaining and powering an impractically huge supercomputer.
We scoff and claim that these attacks can be prevented with some common sense, like avoiding dark alleys and so on, but are we conscious of what we take for granted? For most of us living in a high security country it’s hard to attack someone with a wrench and get away with it, because we piggyback on the high energy law enforcement apparatus that these societies provide us. Without the police to help us, the default security of a plaintext private key printed on a piece of paper takes no more than a $5 wrench to break and possibly much less. A low energy defense can be easily overcome with a low energy attack that operates outside of the model’s design.
So we are used to using high security cryptography, while trusting that physical access to our resources is protected by another high energy security apparatus that is woven into our environment. I argue that we take this externality for granted when making claims about the energy efficiency of cryptography, or any other security system.
Despite the presence of an implicit physical security framework our reliance on it has to be limited anyway, because in all honesty, how confident do we feel about our cybersecurity? In real life, most attacks on our digital security are not with the use of cheap wrenches, but through cyber attacks that have a much easier time evading physical security measures. More on that later.
In summary, strong cryptography has a minimum security threshold that is equal to the lesser of two security frameworks: the level of our physical security, and the level of our cybersecurity. Both of these frameworks are energy intensive endeavours with cybersecurity being much weaker in today’s reality.
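The "lesser of two frameworks" rule extends naturally to any number of attack vectors: the real barrier is whichever attack is cheapest. A minimal sketch, where the vector names and the joule figures are made up for illustration and only their ordering matters:

```python
def effective_barrier(attack_vectors):
    """Return (name, energy) of the cheapest attack; that is the real barrier."""
    name = min(attack_vectors, key=attack_vectors.get)
    return name, attack_vectors[name]

# Illustrative, made-up attack costs in joules; only their ordering matters.
vectors = {
    "brute_force_cryptography": 1e17,
    "breach_physical_security": 1e9,
    "malware_or_phishing": 1e6,
}
weakest, energy = effective_barrier(vectors)
```

Raising the cost of the strongest vector changes nothing here; the result only moves when the cheapest vector gets more expensive.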
Trustless blockchains and the case for Proof of Work.
Blockchains are borderless, and cannot depend on nation state high energy security systems or secrecy for either prevention or after-the-fact law enforcement and punishment. We must therefore be very careful about how the security of a trustless³ system is designed.⁴
Proof of Work is a brilliantly elegant solution to the energy tradeoff problem of security in that it makes the energy to attack the system explicit. The system is designed to have no lower energy back door. There is just the nuclear option: to either buy or steal enough resources (including power supply) to carry out an attack. This kind of attack is either unpalatably expensive for those that need to buy the hashing power, or self-defeating for those with the expensive ASICs in their data centres that will be rendered worthless after the attack. Why worthless?
Cryptocurrency mining farms are large and specialized infrastructure projects that earn back their initial investment over a period usually measured in many months. Imagine now that the miner decides there is some money to be made in using their equipment to attack the chain. Fair enough, but what is the cost of conducting the attack? If the coin being attacked loses value during the attack, the miner will have depleted their ability to make their expected mining revenue. The risk-reward calculation for this gambit is not easy, and it is outright frightening if one has $100m of equipment at stake, as some mining farms do.
It is clear then that it would be risky for a miner to attack the blockchain that their equipment is mining, because an attack may cause a significant decrease in value of their future profits.
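The miner's dilemma can be written down as a crude calculation. This is a sketch under my own simplifying assumptions: the attack gain is a fixed dollar amount, the only cost considered is the mining revenue lost when the coin's price drops, and all figures in the example are hypothetical.

```python
def attack_net_payoff(attack_gain, future_mining_revenue, price_drop_fraction):
    """Net payoff of attacking one's own chain versus continuing to mine honestly.

    attack_gain: value captured by the attack, in dollars.
    future_mining_revenue: revenue the equipment would earn without an attack.
    price_drop_fraction: share of the coin's value lost due to the attack (0..1).
    """
    lost_revenue = future_mining_revenue * price_drop_fraction
    return attack_gain - lost_revenue

def breakeven_price_drop(attack_gain, future_mining_revenue):
    """Price drop at which the attack stops paying for itself."""
    return attack_gain / future_mining_revenue

# Hypothetical farm: $150m of expected future revenue, $10m to gain from an attack.
net = attack_net_payoff(10e6, 150e6, 0.20)
threshold = breakeven_price_drop(10e6, 150e6)
```

With these made-up numbers, a price drop of under 7% is already enough to wipe out the gain, and a 20% drop leaves the miner $20m worse off than honest mining would have.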
And now for the million dollar question: for all the arguably convoluted attack and defence scenarios that have been considered for Proof of Stake, what are the energy tradeoffs that exist in the system, for surely they must exist?
The energy-dependent model of security suggests that by reducing the energy requirements for PoS we lose security. All that remains is to figure out how that could transpire.
The answer probably is not hidden in the PoS slashing algorithms and theoretical models developed by the Ethereum protocol engineers. They are a smart bunch, and we should assume that their models, intricate as they might be, are probably theoretically sound.
How will nature surprise? What does experience teach us about weakness of theoretical models?
“Your assumptions are your windows on the world. Scrub them off every once in a while, or the light won’t come in.”
― Isaac Asimov
As with the evolutionary FPGA example, let’s look for assumptions about the model that the PoS designers may have overlooked.
False Assumption #1: PoW is inefficient.
We already identified one, where energy consumption of PoW was incorrectly linked to lack of efficiency. We don’t know how efficient it is, so we cannot assume that it is inefficient. What other assumptions could there be?
False Assumption #2: Hashing power is like voting power.
Let’s explore the famous 51% attack. Is that like voting? The 51% attack refers to capturing 51% of hashing power to generate a longer chain that will then serve as the one true chain. While this might look like voting on the surface, imagine for a moment that, through sheer luck, a minority of miners mined a longer chain than the 51% that is captured for an attack. While statistically this would happen rarely, it is possible in practice for at least short durations of time. In such a situation the longer chain generated by a minority of miners would also win as the official one true chain. This means that hashing power isn’t really a voting mechanism at all. A more appropriate analogy is that hashing power is like firepower (yes, in the military battle sense). More firepower certainly increases the chance of victory, however it does not guarantee victory on any one occasion. I hope we can agree that having more firepower is not the same as having more votes.
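The firepower analogy can be made concrete with the gambler's-ruin result used in the Bitcoin whitepaper: a miner holding a share q of the hashing power (with p = 1 - q held by everyone else) who is z blocks behind eventually catches up with probability (q/p)^z when q < p, and with certainty otherwise. A minimal sketch, with function and parameter names of my own choosing:

```python
def catch_up_probability(attacker_share, blocks_behind):
    """Probability that a miner with `attacker_share` of total hashing power
    ever catches up from `blocks_behind` blocks behind the other chain."""
    if not 0 < attacker_share < 1:
        raise ValueError("attacker_share must be strictly between 0 and 1")
    q = attacker_share
    p = 1 - q                      # hashing power held by everyone else
    if q >= p:
        return 1.0                 # half or more of the power: eventually certain
    return (q / p) ** blocks_behind

minority_one_behind = catch_up_probability(0.30, 1)
minority_six_behind = catch_up_probability(0.30, 6)
majority = catch_up_probability(0.51, 6)
```

A 30% miner still overtakes a one-block lead about 43% of the time, so the minority is never simply outvoted. More hashing power shifts the odds; it does not count ballots.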
False Assumption #3: Democratic voting by nodes increases security.
How about the assumption that blockchain security grows with the number of nodes, since each node can be treated like a voter? I have heard this one many times myself and asked: how so? In Bitcoin the job of a node is to listen to the network for new blocks, to keep the ones with the most accumulated proof of work, and to transmit them to others. While this appears to resemble voting, think of what would happen if 99% of the nodes colluded to propagate a false (shorter) version of the Bitcoin blockchain. Would the 1% of nodes that are in possession of a longer chain be convinced by the majority that they have the wrong one? They would not. This isn’t voting then, is it?
If 99% (or more) colluding Bitcoin nodes cannot overpower Bitcoin consensus, then it must be true that as the number of Bitcoin nodes increases, security does not.
Bitcoin is not democratic at all then, and is about as undemocratic as they come. A single copy of a longer chain, no matter how it is produced, is enough to invalidate all other pretenders, regardless of their number or their hashing power or “votes.”
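The node behaviour described above can be sketched as a selection rule that looks only at accumulated work and never at head count. This is a simplified illustration with hypothetical names; a real node also validates every block, which is omitted here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChainTip:
    tip_id: str       # hypothetical label for a candidate chain
    total_work: int   # accumulated proof of work behind that chain

def select_chain(announcements):
    """Pick the chain with the most accumulated work,
    ignoring how many peers announced each candidate."""
    distinct = set(announcements)   # duplicates carry no extra weight
    return max(distinct, key=lambda tip: tip.total_work)

longer = ChainTip("longer", total_work=1_000)
shorter = ChainTip("shorter", total_work=900)
# 99 colluding peers announce the shorter chain; a single peer announces the longer one.
announcements = [shorter] * 99 + [longer]
chosen = select_chain(announcements)
```

The 99 announcements collapse into a single candidate, and the one chain with more work behind it wins. Adding more colluding peers does not change the outcome.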
False Assumption #4: If nodes have something to lose, their vote can be depended on to be more honest.
If we look at PoS game theory discussions, they do assume that more voting nodes with some skin in the game (a stake) lead to higher security, so let’s explore this assumption a little more.
Under PoS the nodes are treated as voters, which, as we already know, is a significant departure from how PoW works. You might argue that PoS is supposed to be a departure from PoW, and you would be right. What we cannot skip, though, is the introduction of the new assumption that a large number of voters can generate security at all! This assumption sounds safe enough, but have we considered whether it requires more preconditions before it can hold?
In real life democracy does seem to generate security, but only under certain conditions. The power of democracy is in each voter being educated about their options and making an individual and informed decision of their own free will. Furthermore, the choice may vary depending on individual circumstances and no vote is penalized. So an underlying assumption is that in order for democracy to work, each voter must be free to make an educated choice without being coerced. Duh! Is this the case under PoS? It is not, since PoS does not give each node a free choice as to how they will vote and decisions are dictated by deterministically written software, which penalizes votes deemed to be incorrect. It seems that PoS isn’t like democracy at all then. Will it work like one?
Let’s look at this from a small staker’s perspective, as small stakers, through their sheer number, are expected to be the mainstay of PoS security. They will be running a node, staking 32 ether of their own funds as a guarantee of their independent and truthful vote. They might be looking forward to this opportunity at gaining some risk free return, since the staker certainly won’t be planning to cheat. That’s great, but if they’re not going to spend any effort thinking freely about each vote and they will be penalized for disagreeing with a majority, then what is their expected incremental contribution to the democratic model of security? Hm.
It would seem that each staker will be contributing the vote of a deterministic automaton that is guaranteed to vote in the same way that all other well-intentioned stakers will vote. Will they be making a free and informed choice then? Absolutely not, but you can bet that they will be penalized if they misbehave! This is grounds for trouble.
Ok, the argument for PoS might concede here that stakers won’t be freely voting each time, but they will all be empowered to vote honestly through open source software. Even if they can’t make individually free choices, they can choose which software to run, but they won’t have the resources to audit or write the software themselves. Is that still choice?
We’ve now departed several degrees from what we would assume is a democratic process, and we have arrived at a model that assumes a dictatorship of software, and not of individual stakers.
False Assumption #5: Peer reviewed open source software guarantees honesty.
There’s another innocent looking assumption about the “honesty” of open source software. As much as I am a fan of open source software, I don’t think it carries any intrinsic honesty. Open source software is only as good as the developers who write it and who secure its distribution. Unfortunately, neither the developers’ skills nor the security of their release distribution framework are perfect. So PoS seems to substitute the known attack energy of PoW with the unknown security of code and a distribution channel.
I really respect the talent that goes into coding works of passion, like Ethereum, and perhaps because of this I wouldn’t pin the security of an entire economy on the shoulders of a compact group of passionate individuals who might not realize the responsibility that they are taking on.
With that in mind, let’s take a look at attacks through software distribution networks that we commonly call computer viruses.
Viruses, botnets, and cyber attacks
The world of physical security is familiar and we have already explored it in the three examples I started with. It is also provided by governments and varies greatly across the world. More importantly, it is mostly out of our direct control. Let’s dwell on the less known issue of cyber security for a while, because governments don’t provide much cyber security to their residents (yet).
Multiplying the number of nodes in a blockchain that depends on voting without varying the software or hardware running those installations reduces that blockchain’s security instead of increasing it.
The idea of a computer virus is as fascinating as it is scary. The virus is a piece of software that copies itself from computer to computer and multiplies its effect with every infection. What is that effect? In the beginning, the purpose of viruses was to annoy the user by popping up unexpected messages. Some early malicious viruses went on to damage or erase data, while recent malicious viruses record keystrokes (and passwords), use microphones and cameras, encrypt data and demand a ransom, etc. All this happens while they sneak by physical security mechanisms behind our backs and under the noses of our high energy law enforcement. A well written virus is a type of low energy attack, where the energy is expended on writing virus code that exploits vulnerabilities in the software that we use on our devices.
A virus type made popular in recent decades is the botnet. Botnets spread like viruses, but their function is to make their hosts participate in collective action initiated from a command and control centre. These collective actions may include things like sending spam email or participating in a Distributed Denial of Service (DDoS) attack. The greater the number of machines in the botnet, the more powerful the attack.
A botnet virus is most successfully distributed to a homogeneous collection of installations, such as hordes of vulnerable Windows PCs or small Ethereum stakers running a standard software stack. The power of a cyber attack carried out by such a botnet is amplified by the number of installations.
To be precise, a PoS configuration with a relatively small stake that necessitates a low investment in security will in practice be a default configuration. It won’t matter that many small stakers run nodes if all of them are exploited and join a botnet that executes malicious code. The botnet could be distributed via a virus, or via a hijacking of the node software distribution network, and the compromise could be by design, or by mistake, such as through the introduction of a critical bug. The end effect is the same: lower security.
Put another way, a large number of clone nodes can be modelled as one large node with the sum of the voting power of the small nodes that run that particular piece of software.
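To make the clone-node model concrete, here is a minimal sketch in Python. It simply groups nodes by the software stack they run and sums their stake, so that each stack becomes one "effective" voter. The client names and the node counts are hypothetical numbers I picked for illustration, not measurements of any real network.

```python
from collections import defaultdict

def effective_voters(nodes):
    """Collapse nodes that share a software stack into one effective voter.

    nodes: iterable of (stack_name, stake) pairs (illustrative values only).
    Returns a dict mapping each stack to the total stake it controls.
    """
    power = defaultdict(int)
    for stack, stake in nodes:
        power[stack] += stake
    return dict(power)

def largest_share(nodes):
    """Fraction of total voting power exposed to a single-stack exploit."""
    power = effective_voters(nodes)
    return max(power.values()) / sum(power.values())

# Hypothetical network: 1,000 small stakers on one client, 200 on another,
# each staking 32 units.
nodes = [("client-a", 32)] * 1000 + [("client-b", 32)] * 200

print(effective_voters(nodes))
print(f"largest single-stack share: {largest_share(nodes):.1%}")
```

Under these assumed numbers, 1,200 nominally independent voters reduce to two effective voters, and one exploit against the more popular stack would reach about five sixths of the voting power.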
What would an attack look like? Imagine an attacker gaining access to a small army of staking nodes with the ability to alter their voting to force a minority invalid vote that would cost them their stake. It would be sufficient to open short positions on a few exchanges and conduct the attack. The loss inflicted on the stakers might be an order of magnitude greater than the profit from the short positions, but would still be worth it for the attacker, who just had to write a good virus, or to hijack the node upgrade scripts to deploy their malicious code to a large portion of the network.
One way to defend against a viral cyber attack against homogeneous voting nodes is to have several node implementations available. This would reduce the possibility that a single exploit could affect more than a fraction of the network. Interestingly, when the Ethereum blockchain launched, its creators intuited the validity of this approach and decided to fund several different implementations of Ethereum in different languages: Go, C++, Haskell, Java, Rust, and Python were some of the earliest. This approach to security has been largely abandoned now in favour of Go (geth client) and Rust (parity client) accounting for most of the nodes running Ethereum’s mainnet. This homogeneity has already been identified as a security vulnerability for the chain, as many nodes are not updated in time to patch known vulnerabilities. Fortunately, Ethereum uses PoW for its security, and we can presume that the large miners responsible for securing the network have their houses in order and it is the small miners and nodes that remain unpatched and a risk only to themselves.
Unless there is a fundamentally different approach to security somewhere, the same risk profiles will apply to a large number of homogeneous nodes of any other DLT with a consensus mechanism that relies on voting, like PBFT (Fabric, Quorum, etc.) or DAG (IOTA, Hashgraph, etc).
A PoS proponent would argue that the incentives in the protocol motivate investment in node security and that is how the chain will defend itself against cyber attacks. While this is true in principle, it puts the responsibility for deciding how much investment is sufficient in the hands of human beings without giving them a consistent way to measure the achieved security. The only remaining measure of security is the existence of a successful attack. This is no different from the traditional model of security, where in the absence of a successful attack we assume that security must therefore be sufficient. An example of this is the Equifax data breach, where security had been lax for many months and the weakness was only discovered through the breach itself. PoS does not seem to deal with this flaw of the model at all and, through distribution of homogeneous software, makes the problem systemic rather than individual.
The Inception Hack
As a final argument that any security scheme that relies on the security of software is fundamentally flawed, I will refer you to an advanced piece of reading on the Ken Thompson Hack here. This is an advanced attack method that will only be understood with some prior knowledge of how computer programs are created using compilers. This attack method hijacks a compiler to inject a malicious piece of machine code into all programs compiled using that compiler, including the compiler itself. What is interesting about this hack is that after it is successfully injected into the compiler or another piece of the pipeline that creates an executable, it is practically undetectable. This attack was proven to be feasible in 1984 and the way we build and deploy software through compilers has not fundamentally changed since then.
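For readers who prefer code to prose, the mechanism can be sketched as a toy in Python. This is not Thompson's original code, and everything in it is an assumption made for illustration: "compiling" is just a transformation of source text, a "binary" is Python source that we run with exec, and the names compile_source, check_password, and the backdoor password are all hypothetical. The point it tries to show is that the trojan survives a rebuild from perfectly clean source.

```python
# An honest "compiler" and an honest login routine. Both sources are clean.
CLEAN_COMPILER_SRC = '''
def compile_source(src):
    return src  # honest compiler: the output is exactly the input
'''

LOGIN_SRC = '''
def check_password(user, pw):
    return pw == "correct horse"
'''

# The trojaned compiler binary carries a copy of itself (_self) so that it
# can reproduce when it is asked to compile a compiler.
TROJAN_TEMPLATE = '''
def compile_source(src, _self=%r):
    # Payload 1: backdoor any login routine that passes through the compiler.
    if "def check_password" in src:
        src = src.replace('return pw ==', 'return pw == "letmein" or pw ==')
    # Payload 2: when asked to compile a compiler, emit this trojan instead.
    if "def compile_source" in src:
        return _self %% _self
    return src
'''

def load(binary, name):
    """'Run' a toy binary and return the function it defines."""
    namespace = {}
    exec(binary, namespace)
    return namespace[name]

# The attacker ships one trojaned compiler binary.
trojaned_binary = TROJAN_TEMPLATE % TROJAN_TEMPLATE

# A diligent user rebuilds the compiler from the clean, audited source...
rebuilt_binary = load(trojaned_binary, "compile_source")(CLEAN_COMPILER_SRC)
# ...and still ends up with the trojaned binary.
print("trojan survived rebuild:", rebuilt_binary == trojaned_binary)

# Compiling the clean login source with the rebuilt compiler adds a backdoor.
login_binary = load(rebuilt_binary, "compile_source")(LOGIN_SRC)
backdoored_check = load(login_binary, "check_password")
print("backdoor password accepted:", backdoored_check("alice", "letmein"))

# The same login source built with a genuinely honest compiler is fine.
honest_compiler = load(CLEAN_COMPILER_SRC, "compile_source")
honest_check = load(honest_compiler(LOGIN_SRC), "check_password")
print("honest build accepts backdoor:", honest_check("alice", "letmein"))
```

Note that neither CLEAN_COMPILER_SRC nor LOGIN_SRC contains a single malicious character, so auditing the source code alone would find nothing in this toy setup.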
I will leave you with a quote from the linked article that describes the Ken Thompson Hack as “a hack (in every sense), the most subversive ever perpetrated, nothing less than the root password of all evil.” These chilling words are not my own, but I do understand the concept. To me the quote sounds appropriate and chills the blood in my veins. A more modern metaphor suited for the non-technical reader would be to call this hack the Inception Hack, after the movie Inception from 2010. This is because the hack makes the virus imperceptibly permeate all software, from its birth, on an unconscious level.
ASICs are not evil.
Another often mentioned assumption I would like to address is that ASIC resistance is a noble goal. The concept of ASIC resistance revolves around creating mining algorithms that are difficult to implement in a specialized mining chip called an ASIC (Application Specific Integrated Circuit). This assumption is linked to another unfounded assumption about the absolute virtue of decentralization. ASIC mining requires larger purchases of hardware from specialized manufacturers, which means that both the production of mining equipment and the running of mining facilities are better done in bulk, which hurts decentralization.
While decentralization certainly has many benefits, like resilience or easier data privacy, it doesn’t do as well for economies of scale. Economies of scale are beneficial, because in a larger facility more investment can be committed to overheads. Physical and cybersecurity are such overheads. It takes a similar amount of resources to physically secure a small data centre as it does a larger one, and the same goes for cybersecurity.
In a PoW system, the miners have a lot at stake and must continuously invest into securing their hashpower (physically and digitally) and into developing ever more efficient ways to mine for hashes. They are therefore locked into an arms race against each other to always seek and find the lowest amount of energy that is required to secure the only attack vector on the PoW chain, which is finding hashes faster while using the same or less power.
This arms race motivates miners to try different technologies, to invest in cyber security, and to write their own optimized mining software. All this individualized investment amounts to a heterogeneous software and hardware landscape, which is more difficult to exploit with viruses and is much less prone to becoming part of a botnet of dangerous proportions.
Moreover, an advantage this year does not mean that the miner maintains the advantage next year, for another competitor will eventually develop better tech. This means that ASICs also depreciate as they are replaced by later and more efficient models. The system therefore incentivises substantial investment into the energy tradeoffs around the security model, both in terms of hashing power and physical security of facilities (often by geographic and some day by cosmic distribution of facilities). This means that ASIC mining is actually intrinsically decentralized over time, which we might call churn. On the other hand, PoS mining is intrinsically static over time for lack of a logical basis for gaining or losing competitive advantage. Another way to look at PoS staking nodes is as virtual ASICs that can never be replaced by a better model. Under PoS, miner churn is not an intrinsic property of the market.
Indeed, PoW encourages separation of the data resilience mechanism that is provided via the nodes from the security mechanism, which is provided via mining. Therefore, miners are incentivised to custom-develop their own optimal miner implementations, which maximize heterogeneity of the PoW security generation mechanism.
Lastly, the stake in PoW is physical and cannot (easily) evaporate through a clever software hack, while the stake used for PoS is as easy to annihilate as performing the hack itself.
Dan Held goes into further benefits of specialized PoW mining for our civilization in this wonderful article from September 2018. I have independently come to the same conclusions as Dan in his article and I recommend giving it a full read.
Another article worth reading is Paul Sztorc’s take on why PoW is the cheapest approach to consensus, where he discusses several economic and game theoretic takes on how PoW generates value efficiently. These arguments are interesting examples of PoS ending up not being more efficient than PoW. While Paul’s arguments explain that PoS isn’t necessarily less energy hungry than PoW, he concludes by stating that “Blockchain security is not the main function of Bitcoin’s PoW.” I come to quite the opposite conclusion, namely that PoW uniquely attains measurable efficiency in building security, while PoS does not. This stems from the fact that PoS security is underwritten by human diligence (which is not only unmeasurable, but also unreliable), while PoW security is underwritten by a real quantity of equipment, power, and technological achievement.
What am I missing?
Even though PoW seems to be the most efficient (and least energy hungry) method to achieve blockchain security, its very nature implies that there is always something to fix in a continuous evolution of technology for generating hashes and for physically securing mining facilities and access to sources of energy.
PoW is also susceptible to failures through (temporary) centralization of mining. It is possible that, through the use of better technology and cheaper energy, a specific miner might achieve a monopoly for brief periods of time. This would force the security of the chain to be strongly influenced by the security of that miner. Such centralization happened in 2017, when Bitmain seemed on the path to controlling enough Bitcoin mining power (51%) to be effectively in control of the order of transactions.
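Why a majority of hashpower is the critical threshold can be illustrated with the simple gambler's ruin formula from the Bitcoin whitepaper: an attacker holding a share q of the hashpower, with the honest miners holding p = 1 - q, eventually catches up from z blocks behind with probability (q/p)^z when q < p, and with certainty otherwise. The Python sketch below only evaluates that formula; the function name and the sample shares are my own choices for illustration, and the whitepaper's fuller calculation additionally averages over how far the attacker has progressed.

```python
def catch_up_probability(q, z):
    """Probability that an attacker ever catches up from z blocks behind.

    q: attacker's share of total hashpower, between 0 and 1.
    z: number of blocks the attacker's chain is behind the honest chain.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a share between 0 and 1")
    p = 1.0 - q  # honest share of hashpower
    if q >= p:
        return 1.0  # with half or more of the hashpower, success is certain
    return (q / p) ** z

# Illustrative shares only: the chance of rewriting 6 blocks of history.
for share in (0.10, 0.30, 0.45, 0.51):
    print(f"q = {share:.2f}: {catch_up_probability(share, 6):.6f}")
```

Below one half the probability falls off exponentially with every additional block, which is what makes transaction order measurable in terms of work. At or above one half the attacker always wins eventually, which is why even temporary centralization matters.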
An example of an innovation in Proof of Work architectures that aims to deal with this possibility is Multi-Hash mining as proposed and implemented on Zen Protocol (ZP). In this approach the mining reward is distributed deterministically between several PoW mining algorithms. If the algorithms are chosen well, the approach should result in a splintering of the market for mining technology and lowering the probability of capture by a single ASIC manufacturer.
The realization that energy seems to underpin all security makes me profoundly uncomfortable about the future of the most promising blockchain in existence today. I am, of course, acutely aware of the need to increase the throughput and capacity of the platform, but would caution against hopes that it will become as fast as some would wish. Note my article on Myth 4 about speed, the proposed dapp solution architecture outlined in Myth 2 on capacity and scaling, or Myth 3 that touches on what is appropriate to store on a blockchain.
Ahead of us are improvements in storage and network capacity, dedicated blockchain validation hardware, and layer 2 scaling architectures. Although they will not come soon enough for some of the ideas that are out there, there is probably enough room to innovate within today’s limits to keep a brilliant mind occupied. The emerging DeFi ecosystem is a fantastic example of that.
An architect once told me that the enemy of great design is lack of constraints. Over time I have learned to appreciate constraints, and to work within them, even if I believe they could ultimately be temporary.
Note: I am acutely aware that the argument I make above is not conclusive and that it is incomplete without a proof. Nevertheless, I hope that I was successful in outlining a fairly broad set of examples that do not seem to lead to a counterexample, which could disprove the fundamental insight behind the thesis that energy (and not cleverness) defines a maximum boundary on security of any system. I will be working on coming up with a more rounded and hopefully formal way to argue this in the future.
Acknowledgements: The ideas for Blockchain Myths (of which this post is a part) were initially developed at ConsenSys with the fantastic feedback and help of many individuals, including: Tee Ganbold, Zunaira Arshad, Arielle Schnaidman, Brett Li, Micah Dameron, Chris Leishman, Van Sedita, John Wolpert, Jeff Gillis, Jérôme de Tychey, Ray Valdes, Igor Lilic, and other great people roaming cryptoland. This version of the article had active input from Ahmad Hammoudi, Joseph Khalife, Katarina Podlesnaya, Marc Ziade, and Nic Carter. Thanks guys! More great folks have read the draft and I thank them for their time.
² See the numerous rumours about the possible low gold reserves in Fort Knox, due to the gold being loaned out and essentially being snuck out in legal convoys.
³ Some people argue that trustless is an incorrect term and trust-minimized is more accurate. I think there is room for distinction between the two concepts and choose the former to describe PoW chains. By trustless I mean that when we receive a copy of the blockchain (or the latest block) from someone, we are not required to trust them at all about the correctness of the chain we were sent. Instead, we are equipped to only trust the data structure itself and we make a probabilistic judgment on whether a conflicting longer chain might exist (regardless of our evaluation of trustworthiness of the sender). This makes a PoW truly trustless in principle. In practice, we may trust that the first copy of a chain that we receive is sufficient and we might settle for a minimal level of trust, in which case we might call the situation “trust minimized”, but that is a compromise we choose to make and not an intrinsic property of the system.
⁴ I must emphasise here that there are two distinct security domains when dealing with blockchain security: the immutability of transaction order, and the secrecy of the private key used to sign a transaction. Both have their own security profiles. In this article I only deal with security as a measure of the immutability of the order of transactions. The private key is subject to the $5 wrench attack, and all methods of avoiding this attack follow the same theme as for the safe and Fort Knox. Protecting a private key (a secret) is an individual holder’s responsibility that significantly predates blockchains and has not been the subject of any notable breakthrough, and it, too, relies on energy barriers to be secure.