A Brief History Of Wallet Clustering
  • July 5, 2025
  • Roubens Andy King

Our previous post in this series introduced the basic idea behind wallet or address clustering, the trivial case of address reuse, and the merging of clusters based on the common input ownership heuristic (CIOH), also known as the multi-input heuristic.

Today, we’ll expand on more sophisticated clustering methods, briefly summarizing several notable papers. The content here mostly overlaps with a live stream on this topic, which is a companion to this series. Note that the list of works cited is by no means exhaustive.

Early Observational Studies – 2011-2013

As far as I’m aware, the earliest published academic study that deals with clustering is Fergal Reid and Martin Harrigan’s An Analysis of Anonymity in the Bitcoin System (PDF). This work studies the anonymity properties of bitcoin more broadly. In its discussion of the on-chain transaction graph, it introduced the notion of a “User Network” to model the relatedness of a single user’s coins based on CIOH. Using this model, the authors critically examined WikiLeaks’ claim that it “accepts anonymous Bitcoin donations.”

Another study that was not published as a paper was Bitcoin – An Analysis (YouTube) by Kay Hamacher and Stefan Katzenbeisser, presented at 28c3. They studied money flows using transaction graph data and made some remarkably prescient observations about bitcoin.

In Quantitative Analysis of the Full Bitcoin Transaction Graph (PDF), Dorit Ron and Adi Shamir analyzed a snapshot of the entire transaction graph. Among other things, they note a curious pattern, which may be an early attempt at subverting CIOH:

We discovered that almost all these large transactions were the descendants of a single large transaction involving 90,000 bitcoins [presumably b9a0961c07ea9a28…] which took place on November 8th, 2010, and that the subgraph of these transactions contains many strange looking chains and fork-merge structures, in which a large balance is either transferred within a few hours through hundreds of temporary intermediate accounts, or split into many small amounts which are sent to different accounts only in order to be recombined shortly afterward into essentially the same amount in a new account.

Another early confounder of CIOH was MtGox, which allowed users to upload their private keys. Many users’ keys were used as inputs to batch sweeping transactions constructed by MtGox to service this unusual pattern of deposits. The naive application of CIOH to those transactions resulted in cluster collapse, specifically the cluster previously known as MtGoxAndOthers on walletexplorer.com (now known as CoinJoinMess). Ron and Shamir seem to note this, too:

However, there is a huge variance in [these] statistics, and in fact one entity is associated with 156,722 different addresses. By analyzing some of these addresses and following their transactions, it is easy to determine that this entity is Mt.Gox
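To make the cluster collapse concrete, here is a minimal sketch of naive CIOH clustering with a union-find structure. The address names and transactions are hypothetical, and each transaction is reduced to the list of addresses its inputs spend from. Real clustering tools work on full chain data and usually try to filter out known confounders such as sweeps and CoinJoins.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure mapping each address to a cluster representative."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        # Path halving: shorten the path to the root while walking it.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def cluster_by_cioh(transactions):
    """Naive CIOH: assume all input addresses of a transaction share one owner."""
    uf = UnionFind()
    for input_addresses in transactions:
        if not input_addresses:
            continue  # e.g. coinbase transactions have no input addresses
        first = input_addresses[0]
        uf.find(first)  # register single-input transactions too
        for addr in input_addresses[1:]:
            uf.union(first, addr)
    clusters = defaultdict(set)
    for addr in list(uf.parent):
        clusters[uf.find(addr)].add(addr)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical data: two unrelated users, each spending their own coins.
alice_tx = ["alice_1", "alice_2"]
bob_tx = ["bob_1", "bob_2"]
# A batch sweep of keys uploaded by unrelated users, as in the MtGox case.
sweep_tx = ["alice_2", "bob_1", "carol_1"]

before = cluster_by_cioh([alice_tx, bob_tx])
after = cluster_by_cioh([alice_tx, bob_tx, sweep_tx])
```

In this toy data, `before` holds two separate clusters, while the single sweep transaction in `after` merges all five addresses into one cluster, even though they belong to three different users.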

Although change identification is mentioned (Ron & Shamir refer to these as “internal” transfers), the first attempt at formalization appears to be in Evaluating User Privacy in Bitcoin (PDF) by Elli Androulaki, Ghassan O. Karame, Marc Roeschlin, Tobias Scherer, and Srdjan Capkun. They used the term “Shadow Addresses” for what these days are more commonly referred to as “change outputs.” This refers to self-spend outputs, typically one per transaction, controlled by the same entity as the inputs of the containing transaction. The paper introduces a heuristic for identifying such outputs to cluster them with the inputs. Subsequent work has iterated on this idea extensively, with several proposed variations. One example, based on the amounts in two-output transactions: if one output’s value is close to a round number when denominated in USD (based on historical exchange rates), that output is likely to be a payment, indicating that the other output is the change.
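The round-number variation can be sketched as follows. This is only an illustration under stated assumptions: the function names, the one-dollar step, the one-cent tolerance, and the exchange rate in the example are all hypothetical, and none of them are taken from the papers discussed here.

```python
def is_round(usd_value, step=1.0, tolerance=0.01):
    """True if usd_value is within `tolerance` dollars of a positive multiple of `step`."""
    nearest = round(usd_value / step) * step
    return nearest > 0 and abs(usd_value - nearest) <= tolerance


def guess_change_index(output_values_btc, usd_per_btc):
    """Return the index of the likely change output of a two-output transaction.

    Returns None when the heuristic does not apply or is ambiguous.
    `usd_per_btc` is the (assumed known) historical exchange rate at the
    time of the transaction.
    """
    if len(output_values_btc) != 2:
        return None
    round_flags = [is_round(v * usd_per_btc) for v in output_values_btc]
    if round_flags[0] == round_flags[1]:
        return None  # both or neither look round: no conclusion
    # The round output is presumed to be the payment, so the other is change.
    return round_flags.index(False)


# Hypothetical transaction at an assumed rate of 100 USD per BTC:
# 0.5 BTC is exactly 50 USD, while 0.37214 BTC is 37.214 USD.
change_index = guess_change_index([0.5, 0.37214], usd_per_btc=100.0)
```

Returning None in the ambiguous cases is a deliberate choice in this sketch: a heuristic that always guesses would add false links to the cluster.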

This early phase of Bitcoin privacy research saw the theory of wallet clustering become established as a foundational tool for the study of Bitcoin privacy. While this wasn’t entirely theoretical, evidential support was limited, necessitating relatively strong assumptions to interpret the observable data.

Empirical Results – 2013-2017

Although researchers attempted to validate the conclusions of these papers, for example, by interviewing Bitcoin users and asking them to confirm the accuracy of the clustering of their wallets or using simulations as in Androulaki et al.’s work, little information was available about the countermeasures users were utilizing.

A fistful of bitcoins: characterizing payments among men with no names (PDFs: 1, 2) by Sarah Meiklejohn, Marjori Pomarole, Grant Jordan, Kirill Levchenko, Damon McCoy, Geoffrey M. Voelker, and Stefan Savage examined the use of Bitcoin mixers, and put the heuristics to the test by actually using such services with real Bitcoin. On the more theoretical side, they defined a more general and accurate change identification heuristic than previous work.

In his thesis, Data-Driven De-Anonymization in Bitcoin, Jonas Nick was able to validate the CIOH and change identification heuristics using information obtained from a privacy bug in the implementation of BIP 37 bloom filters, mainly used by light clients built with bitcoinj. The underlying privacy leak was described in On the privacy provisions of Bloom filters in lightweight bitcoin clients (PDF) by Arthur Gervais, Srdjan Capkun, Ghassan O. Karame, and Damian Gruber. The leak demonstrated that the clustering heuristics were rather powerful, a finding which was elaborated on in Martin Harrigan and Christoph Fretter’s The Unreasonable Effectiveness of Address Clustering (PDF).

Attackers have also been observed sending small amounts of bitcoin, not through a mixer as in the Fistful of Bitcoins paper, but directly to addresses that have already appeared on-chain. This behavior is called dusting or dust attacks[1] and can deanonymize the victim in two ways. First, the receiving wallet may spend the funds, resulting in address reuse. Second, older versions of Bitcoin Core used to rebroadcast received transactions, so an attacker who was also connected to many nodes on the P2P network could observe whether any node was rebroadcasting its dusting transactions and link that node’s IP address to the cluster.[2]

Although Is Bitcoin gathering dust? An analysis of low-amount Bitcoin transactions (PDF) by Matteo Loporchio, Anna Bernasconi, Damiano Di Francesco Maesa, and Laura Ricci was published in 2023, the data set it analyzes only extends to 2017. This work explored dust attacks and looked at their effectiveness in revealing clusters:

This means that the dust attack transactions, despite being only 4.86% of all dust creating transactions, allow to cluster 66.43% of all dust induced clustered addresses. Considering the whole data set, the transactions suspected of being part of dust attacks are only 0.008% of all transactions but allow to cluster 0.14% of all addresses that would have otherwise remained isolated.

This period of research was marked by a more critical examination of the theory of wallet clustering. It became increasingly clear that, in some cases, users’ behaviors can be easily and reliably observed and that privacy assurances are far from perfect, not just in theory but also based on a growing body of scientific evidence.

Wallet Fingerprinting – 2021-2024

Wallet fingerprints are identifiable patterns in transaction data that may indicate the use of particular wallet software. In recent years, researchers have applied wallet fingerprinting techniques to wallet clustering. A single wallet cluster is typically created using the same software throughout, so any observable fingerprints should be fairly consistent within the cluster.[3]

As a simple example of wallet fingerprinting, every transaction has an nLockTime field, which can be used to post-date transactions.[4] This can be done by specifying a height or a time. When no post-dating is required, any value representing a point in time that is already in the past can be used, typically 0, and such transactions are not post-dated when they are signed. To avoid revealing intended behavior and to address the fee sniping concern, some wallets randomly specify a more recent nLockTime value. However, some wallets always specify a value of 0, so when it’s not clear which output of a transaction is a payment and which is change, that information might be revealed by subsequent transactions. For example, suppose all of the transactions associated with the input coins specify an nLockTime of 0, but the transaction spending one of the outputs does not. In this case, it would be reasonable to conclude that this output was a payment to a different user.
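The nLockTime example can be sketched in a few lines. The function names and the pre-extracted inputs below are hypothetical, and the sketch only covers the simple “always 0” pattern. It ignores nSequence and any wallet that randomizes its values.

```python
def uses_zero_locktime_only(locktimes):
    """True if every observed transaction of a wallet sets nLockTime to 0."""
    return len(locktimes) > 0 and all(lt == 0 for lt in locktimes)


def likely_payment_outputs(input_history_locktimes, spend_locktime_by_output):
    """Flag outputs whose spending transaction breaks the sender's nLockTime pattern.

    input_history_locktimes: nLockTime values of the transactions associated
        with the input coins (assumed to be extracted beforehand).
    spend_locktime_by_output: maps output index -> nLockTime of the
        transaction spending that output, or None if it is still unspent.
    """
    if not uses_zero_locktime_only(input_history_locktimes):
        return []  # the sender's fingerprint is not the simple "always 0" pattern
    return sorted(
        index
        for index, locktime in spend_locktime_by_output.items()
        if locktime is not None and locktime != 0
    )


# Hypothetical case: the sender's history always uses 0, output 0 is later
# spent with a recent block height, and output 1 is spent with 0 again.
payments = likely_payment_outputs([0, 0, 0], {0: 840000, 1: 0})
```

Here output 0 is flagged as a likely payment to a different user, while output 1 stays consistent with the sender’s fingerprint and so remains a change candidate. Unspent outputs are left undecided.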

There are many other known fingerprints. Wallet Fingerprints: Detection & Analysis by Ishaana Misra is a comprehensive account.

Malte Möser and Arvind Narayanan’s Resurrecting Address Clustering in Bitcoin (PDF) applied fingerprinting to the clustering problem. They used it as the basis for refinements to change identification. They relied on fingerprints to train and evaluate improved change identification using machine learning techniques (random forests).

Shortly thereafter, in How to Peel a Million: Validating and Expanding Bitcoin Clusters (PDF), George Kappos, Haaroon Yousaf, Rainer Stütz, Sofia Rollet, Bernhard Haslhofer and Sarah Meiklejohn extended and validated this approach using cluster data for a sample of transactions provided by a chain analytics company, indicating that the wallet fingerprinting approach is dramatically more accurate than only using CIOH and simpler change identification heuristics. Taking fingerprints into account when clustering makes deanonymization much easier. Likewise, taking fingerprints into account in wallet software can improve privacy.

A recent paper, Exploring Unconfirmed Transactions for Effective Bitcoin Address Clustering (PDF) by Kai Wang, Yakun Cheng, Michael Wen Tong, Zhenghao Niu, Jun Pang, and Weili Han, analyzed patterns in the broadcast of transactions before they are confirmed. For example, different fee-bumping behaviors can be observed, either via replacement or with child-pays-for-parent. Such patterns are not strictly fingerprints derived from the transaction data, but they can still be thought of as wallet fingerprints of a more ephemeral kind. They are tied to certain wallet software and are observable when connected to the Bitcoin P2P network, but they are not apparent in the confirmed transaction history recorded in the blockchain.

Similar to the Bitcoin P2P layer, the Lightning network’s gossip layer shares information about publicly announced channels. This is not typically framed as a wallet fingerprint but might be loosely considered as such, in addition to the on-chain fingerprint that Lightning transactions have. Lightning channels are UTXOs, and they form the edges of a graph connecting Lightning nodes, which are identified by their public key. Since a node may be associated with several channels, and channels are coins, this is somewhat analogous to address reuse.[5] Christian Decker has publicly archived historical graph data. One study that looks at clustering in this context is Cross-Layer Deanonymization Methods in the Lightning Protocol (PDF) by Matteo Romiti, Friedhelm Victor, Pedro Moreno-Sanchez, Peter Sebastian Nordholt, Bernhard Haslhofer, and Matteo Maffei.

Clustering techniques have improved dramatically over the last decade and a half. Unfortunately, widespread adoption of Bitcoin privacy technologies is still far from being a reality. Even if it were, the software has not yet caught up to the state of the art in attack research.

Not The Whole Story

As we have seen, starting from the humble beginnings of address reuse and the CIOH described by Satoshi, wallet clustering is a foundational idea in Bitcoin privacy that has seen many developments over the years. A wealth of academic literature has called into question some of the overly optimistic characterizations of Bitcoin privacy, starting with WikiLeaks describing donations as anonymous in 2011. There are also many opportunities for further study and for the development of privacy protections.

Something to bear in mind is that clustering techniques will only continue to improve over time. “[R]emember: attacks always get better, they never get worse.”[6] Given the nature of the blockchain, patterns in the transaction graph will be preserved for anyone to examine more or less forever. Light wallets that use the Electrum protocol will leak address clusters to their Electrum servers. Ones that submit xpubs to a service will leak clustering information of all past and future transactions in a single query. Given the nature of the blockchain analysis industry, proprietary techniques are at a significant advantage, likely benefiting from access to KYC information labeling a large subset of transactions. This and other kinds of blockchain-extrinsic clustering information are especially challenging to account for since, despite being shared with third parties, this information is not made public, unlike clustering based on on-chain data. Hence, these leaks aren’t as widely observable.

Also, bear in mind that control over one’s privacy isn’t entirely in the hands of the individual. When one user’s privacy is lost, that degrades the privacy of all other users. Through the process of elimination, which suggests a linear progression of privacy decay, every successfully deanonymized user can be discounted as a possible candidate when attempting to deanonymize the transactions of the remaining users. In other words, even if you take precautions to protect your privacy, there will be no crowd to blend into if others don’t take precautions, too.

However, as we shall see, assuming linear decay of privacy is often too optimistic; exponential decay is a safer assumption. This is because divide-and-conquer tactics also apply to wallet clustering, much like in the game of 20 questions. CoinJoin transactions are designed to confound the CIOH, and the topic of the next post will be a paper that combines wallet clustering with intersection attacks, a concept borrowed from the mixnet privacy literature, to deanonymize CoinJoins.
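The difference between the two assumptions is easy to see with a little arithmetic. The sketch below is only a toy model with hypothetical numbers: it compares ruling out one candidate per observation with halving the candidate set per observation, as in 20 questions. It does not model any real attack.

```python
import math


def remaining_linear(candidates, eliminated_per_step, steps):
    """Candidates left when each step rules out a fixed number of users."""
    return max(1, candidates - eliminated_per_step * steps)


def remaining_exponential(candidates, steps):
    """Candidates left when each step halves the set, as in 20 questions."""
    return max(1, math.ceil(candidates / 2 ** steps))


# Hypothetical anonymity set of one million users, after 20 observations.
linear_left = remaining_linear(1_000_000, eliminated_per_step=1, steps=20)
exponential_left = remaining_exponential(1_000_000, steps=20)
```

Under the linear model, 999,980 candidates remain after 20 observations. Under the halving model, a single candidate remains, because 2 to the power of 20 exceeds one million.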

[1] Not to be confused with a different kind of dust attack, such as this example, which was analyzed with clustering taken into account by LaurentMT and Antoine Le Calvez.

[2] A notable and somewhat related attack on Zcash and Monero nodes (Remote Side-Channel Attacks on Anonymous Transactions by Florian Tramer, Dan Boneh and Kenny Paterson) was able to link node IP addresses to viewing keys by exploiting timing side channels on the P2P layer.

[3] More precisely: fingerprint distributions should be consistent within a cluster, as some wallets deliberately randomize certain attributes of transactions.

[4] Note that for nLockTime to be enforced, the nSequence value of at least one input of the transaction must also be non-final, which complicates things both for post-dating and in terms of the different observable patterns this gives rise to.

[5] Channel funds are shared by both parties to the channel, but the closing transaction resembles a payment from the funder of a channel. Dual-funded channels may confound CIOH, similarly to PayJoin transactions.

[6] New Attack on AES – Schneier on Security
