Bitcoin Q&A: Miners, pools, and consensus


[ANDREAS] Eric asks, “How many miners are out there?
How many have successfully mined blocks so far?” “Where can I find this information? Are most miners
traceable to IP addresses, or are [they] anonymised?” Let’s start from the beginning.
How many miners are out there? It’s actually very difficult to count, unless miners
identify either the pool or the mining system… that they use by [including] information
(their signature) in the [coinbase transaction] of the block. [If they don’t] self-identify, it’s [almost]
impossible to tell who a miner is. You can maybe get some statistics by looking at the
correlation to addresses that they use in the coinbase. Every block has a coinbase transaction which
pays the rewards to the miner’s bitcoin address. If they use the same bitcoin address for every
block they mine, you can correlate those blocks. But miners don’t have to use the
same bitcoin address for every block. They can use a different bitcoin address for every block
they mine, and then you could not correlate them.

We don’t know how many miners are out there. We do see, through the signatures in the [coinbase transactions], how many mining pools there are [which self-identify]. There are about a dozen or so prominent mining pools,
with four or five having the majority of the hash power. These mining pools are not necessarily
also owners of the mining equipment. In many cases, they are independent organisations
and the mining equipment belongs to others, who are combining [their individual
hashing power through] the mining pool. Miners and mining pools are not the same thing.
We can identify mining pools. You can see statistics on a number of websites that… show the distribution of hashing power
on the Bitcoin network by mining pool; but, again, this is self-identification. The miners and mining pools can fail to identify,
refuse to identify, or they can lie about identification. As a result, you can’t really trace [miners].

The second question was: “Are most miners traceable
to IP addresses, or are their IP addresses anonymized?” They are not traceable to IP addresses.
The Bitcoin network doesn’t keep track of IP addresses. Both transactions and blocks are propagated through
the network by a mechanism called flooding. With flooding, every node sends everything
it receives to all of the nodes it is connected to. If your node is connected to other nodes in the system,
and receives a transaction or block from another node, it has no idea if the node that it just received this transaction from is the first node (the originator), if it’s the node which mined that block, or if it’s a node
that is simply relaying information from another node. Every node only sees its immediate neighbors.
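A toy model can make this concrete. The sketch below is a simplified illustration, not how Bitcoin nodes are actually implemented: the five-node `peers` graph and the `flood` function are hypothetical, and real nodes announce and request data with random delays rather than pushing everything at once. It records, for each node, only the neighbor it first heard a message from.

```python
from collections import deque

def flood(peers, origin):
    """Flood one message from `origin` through the graph.

    Returns a dict mapping each node to the neighbor it first heard
    the message from (None for the origin itself).
    """
    first_heard_from = {origin: None}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        # A node forwards the message to every neighbor it is connected to.
        for neighbor in peers[node]:
            if neighbor not in first_heard_from:
                first_heard_from[neighbor] = node
                queue.append(neighbor)
    return first_heard_from

# Hypothetical topology: each node knows only its direct neighbors.
peers = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

from_a = flood(peers, "A")  # message originates at A
from_c = flood(peers, "C")  # message originates at C
print(from_a["E"], from_c["E"])
```

Node E hears the message from C in both runs, so from E's point of view the case where C created the message and the case where C merely relayed it look the same.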
It doesn’t know where the information is originating. It just sees that it’s coming from its neighbors, and
the neighbors may simply be relaying information. When your node relays that information to the next node,
the next node doesn’t know if you’re a miner, the originator of that transaction, or if
you received it from somewhere else. As a result, it is actually very difficult to track IP
addresses because you would have to monitor… all of the connections between all of the
different nodes on the Bitcoin network. There are certain organizations
that can probably do this: intelligence agencies can probably monitor a very large
percentage of the various nodes on the Bitcoin network. But [nodes on] the Bitcoin network [can use] various cryptographically
secured protocols, such as Tor, in order to route traffic. If a miner, or someone who is creating transactions, has
a node that only communicates on the network via Tor, it’s very difficult to trace where transactions
and blocks are coming from.

Jeremy asks, “How can the distribution of hash power
amongst miners in the Bitcoin network be identified?” “For example, by looking at statistics
on sites like blockchain.info/pools?” “How reliable [is this information]?” We can know this information because
most mining pools, if not all mining pools, include a signature in the coinbase transaction
that identifies which pool created the [block]. They leave a fingerprint on purpose.
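The counting behind such pool charts can be sketched as follows. This is a minimal illustration under stated assumptions: the tag strings, pool names, and function names are hypothetical, and the coinbase text is treated as an already-decoded string. Blocks whose coinbase matches no known tag are grouped under an "Unknown" label.

```python
from collections import Counter

# Hypothetical tag database: coinbase tag -> pool name.
KNOWN_TAGS = {
    "/ExamplePoolA/": "Example Pool A",
    "/ExamplePoolB/": "Example Pool B",
}

def identify_pool(coinbase_text):
    """Return the pool whose tag appears in the coinbase text, else 'Unknown'."""
    for tag, name in KNOWN_TAGS.items():
        if tag in coinbase_text:
            return name
    return "Unknown"

def pool_shares(coinbase_texts):
    """Fraction of blocks in the window attributed to each pool."""
    counts = Counter(identify_pool(text) for text in coinbase_texts)
    total = sum(counts.values())
    return {pool: n / total for pool, n in counts.items()}

# Four blocks in a made-up window of time.
window = ["xx/ExamplePoolA/yy", "/ExamplePoolA/", "zz/ExamplePoolB/", "no tag here"]
shares = pool_shares(window)
print(shares)
```

Because the tag is whatever the block's creator chose to write, the resulting shares are only as trustworthy as the tags themselves.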
They tag it with their name. The calculation that is [done] at places like blockchain.info uses a database that says, ‘these tags belong to these pools’ and then creates a
chart, looking at all blocks in a previous window of time, counting how many of those blocks had which tags,
and using it to identify miners. When I was at Blockchain, in fact, I was involved
in open-sourcing that part of the database, which was exported as a GitHub repository. It’s a JSON (JavaScript Object Notation) database
[with] the tags and the name of mining pools. People can update that to indicate new mining pools. Anything that is not identified falls into one slice
of the pie that you might notice, called ‘Unknown.’ “How reliable?” Theoretically at least, it’s possible
that miners may be mis-tagging their coinbase. It’s voluntary information. They may be lying. A miner could put the name of another
mining pool into the coinbase to make it… appear to have more hashing power,
[so they can] hide in the shadows. There’s no way of validating it, so you have
to take that information with a pinch of salt.

Andreas asks, “If mining is for preventing miners
from double-spending their own transactions, why not simply flag their transactions
and check them regularly?” Andreas, mining is not for preventing miners from
double-spending, but [for preventing] anyone from double-spending. Furthermore, even if it was just about miners, you
can’t flag which transactions are produced by miners; you can’t tell which transactions are
produced by miners versus somebody else. There’s no way to know who is a miner
and who is not a miner on the system. Nodes operate under this cloak of anonymity so that
you can’t tell who is a miner on the Bitcoin network. You certainly can’t easily [discern] where a
transaction came from or who mined [a block].

Mark says, “Bitcoin solves the Byzantine Generals’
Problem so long as at least 50% of the hashing power… consists of honest, non-collaborating miners.” “How do I know that at least 50% of
miners are honest and not traitors?” You don’t know that. The only way to ensure
that is to have mining power distributed… amongst enough miners that they can’t collude. Also, even if they do collude, the reward for doing a
51% attack is not worth destroying the system itself.

Susanna asks, “During a 51% attack, can
the miner spend his or her bitcoins twice?” “What [does it mean that] a 51% attacker can… produce a longer chain than the other
miners [or] reverse past transactions?” A 51% attack doesn’t allow a miner to double-spend.
This is really, really important to realize. This is a fundamental difference. Just because there is a 51% attack, just because a miner
has 51% of the hashing power, does not mean that… they can produce blocks that are not valid. The reason is that other nodes on the network — which
include other miners but also other users, merchants, exchanges, wallet infrastructure, and my node
that’s sitting out there on a server of my own — they’re all validating blocks and transactions. If I [received] a block from a miner, with a transaction
[trying to spend] coins I already marked as spent, it doesn’t matter if they have 51% of the hashing power. It doesn’t matter if they produce valid proof-of-work. That is not a valid transaction; therefore,
the block it is included in is not a valid block. My node, and the rest of the network, will reject that
block. The miner will have wasted their hashing power. Having 51% of the hash power does not
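The check described here can be sketched in a few lines. This is a deliberately reduced model, not actual node code: the function name, the `(txid, index)` outpoint tuples, and the boolean proof-of-work flag are assumptions for illustration, and outputs created inside the same block are ignored.

```python
def block_is_valid(block, unspent_outputs, has_valid_proof_of_work):
    """Validate a block against this node's own view of unspent outputs.

    `block` is a list of transactions, each a dict with an "inputs" list
    of (txid, index) outpoints. Proof-of-work is necessary but not
    sufficient: every input must still be unspent.
    """
    if not has_valid_proof_of_work:
        return False
    spent_in_block = set()
    for tx in block:
        for outpoint in tx["inputs"]:
            # Reject coins that are already spent, or spent twice in this block.
            if outpoint not in unspent_outputs or outpoint in spent_in_block:
                return False
            spent_in_block.add(outpoint)
    return True

unspent = {("tx_a", 0), ("tx_b", 1)}
honest_block = [{"inputs": [("tx_a", 0)]}]
attacker_block = [{"inputs": [("tx_spent", 0)]}]  # coin no longer in the unspent set

print(block_is_valid(honest_block, unspent, True))    # accepted
print(block_is_valid(attacker_block, unspent, True))  # rejected despite valid work
```

Nothing in the function asks how much hashing power produced the block, which is the point: the amount of work behind an invalid block does not make it valid.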
enable miners to produce invalid blocks. This is a really important point and also a point of
contention in some debates that come up in Bitcoin, specifically around user-activated soft forks and
the possibility of changing consensus rules… without agreement from everyone. Mining nodes are not the only nodes that matter. Now, let’s say a miner does have
51% of the hashing power. What does it mean that they
can produce a longer chain? That simply means that they can find more
blocks on average than any other miner, or than the rest of the miners put together, which means they [will] produce the longest
cumulative difficulty valid chain on the network [as long as they continue to hold 51% of the
hashing power], but it still has to be valid. How can they benefit from this? They can’t double-spend, but they can [try to] rewrite
a past block and produce a different variation of it… that they then include in a longer chain. This takes a bit more than 51%, because
51% only allows you to be [slightly] ahead. If one miner has 51% [of the hashing power]
and the rest of the miners have 49%, that means they can only mine [two percentage points]
more blocks than everybody else. That is [about 2.9] more blocks a day, on average [out of roughly 144]. Let’s say you bought something from
a merchant and made a payment. As a miner, you [made a transaction] for a flat-screen TV;
the merchant ships the TV after one confirmation. Then you create [an alternative] chain where that block
[with your payment] for the flat-screen TV is removed. Then you create another block, building on top of
that [alternative chain], to make it the longest chain. With just two blocks, the transaction you paid
the merchant for the TV never happened, and the TV is already loaded on the truck, coming
your way. Woo-hoo, a [free] flat-screen TV for you! Of course, that is why merchants wouldn’t
ship the TV after just one confirmation; in order to get six confirmations ahead, you would have
to sustain a 51% attack for seven days. Seven days! In order to rewrite the longest chain, you
would have to take an enormous risk that… you didn’t lose the majority of the hashing power. Certainly not worth doing just for a flat-screen TV. Again, it doesn’t allow you to take other people’s money.
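The value of waiting for more confirmations can be estimated with the catch-up calculation from section 11 of the Bitcoin whitepaper. The sketch below is a Python port of that calculation (the function name is ours, not the whitepaper's). It is meaningful for an attacker with less than half of the hashing power; in this model an attacker with a majority eventually catches up from any deficit, so the function returns 1.0 in that case.

```python
from math import exp

def attacker_success_probability(q, z):
    """Probability that an attacker with hash-power share q ever overtakes
    a transaction buried under z confirmations (whitepaper model)."""
    p = 1.0 - q
    if q >= p:
        return 1.0  # a majority attacker always catches up in this model
    lam = z * (q / p)
    total = 1.0
    poisson = exp(-lam)  # Poisson probability of k = 0 attacker blocks
    for k in range(z + 1):
        if k > 0:
            poisson *= lam / k  # update to the Poisson probability of k blocks
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total

for confirmations in (1, 2, 6):
    print(confirmations, attacker_success_probability(0.10, confirmations))
```

With 10% of the hashing power, the chance of overtaking one confirmation is about 20%, while for six confirmations it falls to roughly 0.024%, in line with the table in the whitepaper.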
It doesn’t allow you to double-spend money. It simply allows you to remove
transactions that happened in the past, effectively by undoing a payment
of your own, so as to steal money. It takes quite a bit of work, but it’s why, when
merchants are receiving payment for [products] or when an exchange accepts withdrawals
on a trade, they wait for six confirmations. That massively reduces the risk of an attack on
the network [causing them to lose money or value]. It takes significantly more resources
to pull off that kind of attack.

Christopher says, “This question refers to
Cornell University’s selfish mining model, quoted on page 30 of the course [material].” “It’s about the threshold of honest,
non-nefariously colluding miners, at which point… Bitcoin solves the Byzantine Generals’ Problem
and creates the guarantee for system integrity.” “Is it 51% or 66% of the hashing power?” “What countermeasures might
work against dishonest miners?” “Ultimately, any serious integrity or security deficiency
in Bitcoin or other blockchains might also provide… an opportunity to big governments, heavily centralized
corporations, or other organisations to enter the fray… and attack Bitcoin.” “In that sense, Cornell University’s [suggested
adjustments to Bitcoin] sound interesting, in order to patch this theoretical Bitcoin
deficiency, while keeping it decentralized.” “Is it gaining any attention and traction?”

It isn’t, at the moment, quite honestly, although there are some proposals to introduce certain
countermeasures that prevent dishonest miners… and ‘selfish mining’ capabilities by having other
hybrid measures in addition to proof-of-work. I think we are going to see a lot
of experimentation in the future. For the time being, what that means is 33% of
the hashing power could engage in selfish mining. This is a very high-risk option for miners. If successful, they [still could not
really] introduce double-spending. For the most part, selfish mining is a destructive attack. It’s not an attack [where] the miner gains something.
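The 33% figure corresponds to the worst case of the threshold given in the Cornell selfish mining paper (Eyal and Sirer). A minimal sketch of that threshold follows, assuming the paper's parameter gamma: the fraction of honest miners that end up building on the selfish pool's block when two competing blocks race. The function name is hypothetical, and the formula is quoted from the paper's model rather than derived here.

```python
def selfish_mining_threshold(gamma):
    """Minimum hash-power share at which selfish mining becomes viable
    in the Eyal-Sirer model, for a given tie-breaking fraction gamma."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be between 0 and 1")
    return (1.0 - gamma) / (3.0 - 2.0 * gamma)

for gamma in (0.0, 0.5, 1.0):
    print(gamma, selfish_mining_threshold(gamma))
```

With gamma = 0 (no honest miner ever builds on the selfish pool's block in a tie) the threshold is one third, which is where the 33% comes from; with gamma = 0.5 it drops to one quarter.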
The opportunities for double-spending are limited. The most effective reason for doing an attack like this
is not to gain, but a willingness to spend a lot of money to destroy, damage, or deny service in
Bitcoin for a prolonged period of time; even knowing that, while executing this
attack, you are not going to gain anything and [will] probably spend a lot of money doing it. What you need is not just a selfish or dishonest miner,
but a dishonest miner who is not motivated by profit… and is willing to spend a lot of money just to damage
Bitcoin; which, of course, is a strong possibility, especially for censorship by state agents
or collusion between multiple state agents. A third of the hashing power… With the rate at which hashing power has escalated,
that becomes very difficult to do. There are a number of countermeasures against that,
including the most obvious countermeasure, which we could call ‘the nuclear option’, a hard fork
to change the proof-of-work algorithm, thereby… rendering all of the equipment that has been
amassed into slag, making it useless. There are defenses that the Bitcoin community
can marshal, but these defenses are themselves… very high cost, defenses that
can cause a lot of damage. We’ll see, in the future, if selfish mining
has some emergent defenses against it. For [now], the concern isn’t high enough to justify
additional research or practical deployment against it.

Karl asks, “How is dishonesty (meaning [trying to move] the same bitcoin
to two different addresses) weeded out?” Karl, that’s what we call double-spending. The way double-spending is weeded out… When you try to send a transaction
like that, who are you sending it to? The Bitcoin network works through flooding broadcast, meaning that my node or wallet is connected to
the Bitcoin network (a bunch of other nodes), and when I send a transaction, I’m announcing that
transaction to nodes I am immediately connected to. When I send the transaction to those nodes, they will
validate it before deciding whether they will send it… to their adjacent / [neighbor] nodes. When a normal transaction propagates,
I send it to the nodes I’m connected to; they send it to all the nodes they’re connected to,
which send it to all the nodes they’re connected to, until the transaction ripples out across the
network and everybody receives a copy of it. One of the nodes that receives a copy of it is probably
going to be a miner. They will put it in their mempool. They will wait. When they have another block to mine, they might
choose [to include that transaction in the block], and then send that block out to everybody in the same
way, rippling across the network until everyone sees it. Let’s look at what happens when I do a double-spend. I send one transaction, spend bitcoins to one address,
and then I construct another transaction that spends… the same bitcoin, but sends it to a different address. I send it to the nodes that are immediately connected
next to me. They will attempt to validate the transaction. They will find that the bitcoin I’m trying to spend has
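The validation step that each neighboring node performs on an unconfirmed transaction can be sketched roughly as follows. This is a simplified model under stated assumptions: the `Node` class, its method names, and the `(txid, index)` outpoint tuples are hypothetical, and signatures, fees, and scripts are left out entirely.

```python
class Node:
    """Toy node that tracks unspent outputs and its own mempool."""

    def __init__(self, unspent_outputs):
        self.unspent_outputs = set(unspent_outputs)
        self.spent_in_mempool = {}  # outpoint -> txid already spending it

    def accept(self, txid, inputs):
        """Return True if the transaction is accepted (and would be relayed)."""
        for outpoint in inputs:
            if outpoint not in self.unspent_outputs:
                return False  # coin does not exist or is already confirmed spent
            if outpoint in self.spent_in_mempool:
                return False  # conflicts with a transaction seen earlier
        for outpoint in inputs:
            self.spent_in_mempool[outpoint] = txid
        return True

node = Node({("funding_tx", 0)})
first = node.accept("pay_merchant", [("funding_tx", 0)])
second = node.accept("pay_myself", [("funding_tx", 0)])  # same coin again
print(first, second)
```

The first transaction is accepted and the second, which spends the same coin, is refused by the very first node that sees it, so it is never passed on.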
already been spent and they will reject that transaction. They will not propagate it. The problem isn’t that I can’t
get it to miners; I also can’t get past nodes close to me. It won’t get very far at all. The first node I tell will reject
the transaction as invalid because it’s double-spending. Every node validates. As a result, invalid transactions, double-spend
transactions being a particularly egregious case, should not propagate across the network. That doesn’t mean you can’t get a
“double-spend” transaction to a miner. The way you can do that is by connecting
directly to a node that is run by the miner, and then enticing them to ignore the validation rules… by attaching a much higher fee
to your “double-spend” transaction. They might go out and try to include it in a block. They
won’t if the first transaction has already been included. But they might replace the transaction they have
in their mempool with the second transaction, if neither of them has been [confirmed] yet. So, that is [kind of like] a double-spend; [transactions
with zero confirmations] can be “double-spent.” All you have to do is entice the miner to take the
transaction with the greatest fee from their mempool [and] ignore the one that has a lower fee
which you broadcast earlier. But that only works until one of the transactions is
[confirmed]; once it is [confirmed], no miner is going to… accept a double-spend of that transaction. [Miners] know that if they do, their block
will be invalid and their time will be wasted.

Roy posted a tweet from June 2013 that says, “ghash.io
is now at 52% hash rate,” and this can cause issues. The question from Roy is, “Did someone
find a way to prevent this from happening?” Well, yes. Roy, that particular tweet was mis-stating
the hash rate of ghash.io, which had not reached 52%. In any case, what happened is that the hash rates
changed very quickly as miners abandoned ghash.io, to remove their hashing power from that pool. At times, various pools jockeying for position entice
miners to join them and participate with their hashing, but if it looks like any pool is achieving mining
hash power that is close to a dangerous level, miners generally tend to switch pools. We’ve seen it happen with a number of pools over time.
There is no way to really prevent this from happening. That’s the nature of the system. You can’t control where miners will commit their
mining power, how they will collaborate to form pools, or who has the mining power. But because the costs of creating and retaining
enough power in the network are very high, the practical risk of 51% attacks,
as they’re called, is very low. Over time, we’ve seen the amount of hashing power
any one miner or pool has become more decentralised. If you think about it, back in January 2009 there was one
miner, Satoshi, who had nearly 100% of the hash power. Then, gradually more and more miners joined,
and the mining power was further distributed. Today we have many different miners who have
varying percentages of the mining power. The number of pools and miners is
greater than it ever was in the past. Mining is quite the decentralized operation. Of course, the risk always exists. But the economic
incentives of mining mean that even though… a 51% attack may become theoretically possible for any one miner, the problem with executing such an attack… is that they get very little benefit. [They] damage Bitcoin in such a way that they lose
all of the investment in mining equipment… and all the energy they have used up to that point. There are actually economic disincentives for miners to,
if you like, ‘kill the goose that lays the golden eggs.’

“If the top three Bitcoin mining pools
were to merge,” says Jeremy, “they would control 54% [of the hash power].
Why hasn’t it happened?” Because if they did merge and control 54%, Jeremy,
people participating in them would be rightly spooked, and leave them, at [which] point
they would no longer control 54%. It may be that the top three miners have a lot
more common interests than you might think, but they would rather operate under different names
so as to not even create the impression of having 50%. We saw in the early days, if you read back [to stories]
about ghash.io, where one miner got close to 50%… or rather, one mining pool got close to 50%. As a result, the participants in that mining pool started
abandoning it out of fear that it might take advantage… and [execute] a 51% attack. Even if a mining pool, or certain business interests
that [run] mining pools, effectively control… more than 50% of the network, they
certainly don’t want anyone to know that. There will be repercussions; people will move their
hashing power in order to reduce that [influence], or certainly express a lot of concern over that. There’s a big difference between having hash power that
exceeds 50%, and doing something with that power. Having the power, or perceived power,
and exercising that power. In many aspects of the Bitcoin consensus system,
the power exists as long as it is not exercised. The moment someone tries to exercise it,
[the power] will evaporate.

19 thoughts on “Bitcoin Q&A: Miners, pools, and consensus”

  1. question??? why is bitcoin cash hardforking again for the second time on May 15th? i do not understand… will i get 1 bitcoin cash for every bitcoin i own?….

  2. Are mining pools not a form of centralization? It is my understanding that Bitmain and its affiliates, friends, partners, connections etc. control over 50% of the mining…

  3. I was wondering if you have seen this white paper that claims that bitcoin mining becomes unstable when block rewards becomes incentively insignificant and transaction fees begin to make up the majority of the coinbase transaction:

    https://via.hypothes.is/http://randomwalker.info/publications/mining_CCS.pdf

    I've been looking at some game theoretical scenarios, but I was thinking that in the case of bitcoin block reward going to near zero (which won't happen for some time), that if the coinbase tx has a long enough lock time, then if someone was to try to fork blocks that by the time their undercutting attack rewrote the history, that what they collect would be worth next to nothing.

    Do you have any thoughts on this? This paper was written by CS PhD candidates at Princeton and has some computer simulator results…

  4. What I like about Andreas is that he is very humble even though his knowledge on the subject is pretty much top notch.
