More than a decade ago, diehard fans hailed Bitcoin as the anarchist holy grail: a truly private digital currency for the internet.
Satoshi Nakamoto, the currency’s mysterious and unidentifiable founder, claimed in his introductory email that Bitcoin “can be completely anonymous.” And the dark web drug market Silk Road seemed to be living proof of that claim, moving hundreds of millions of dollars in drugs and other contraband while openly mocking law enforcement.
But in late 2013, Bitcoin was exposed as quite the opposite. Its blockchain allowed researchers, tech companies, and law enforcement to track and identify users with greater accuracy than the existing financial system allows. The world of cybercrime was shaken. In the years that followed, blockchain tracing helped solve the theft of half a billion dollars in bitcoin from the world’s first cryptocurrency exchange, bring down the largest online drug market, arrest hundreds of child pornography offenders, and deliver the first-, second-, and third-largest recoveries of criminal proceeds in the history of the U.S. Department of Justice.
This 180-degree turn, and the epic cat-and-mouse game that followed, was sparked by a young puzzle-playing mathematician named Sarah Meiklejohn, the researcher who first discovered traceable “patterns” in the random chaos of the Bitcoin blockchain.
In early 2013, random objects began to appear on the shelves of a windowless storage room in a building at the University of California, San Diego (UCSD). A Casio calculator. A pair of alpaca socks. A deck of magic cards. A Super Mario doll. Three original Nintendo cartridges. A plastic Guy Fawkes mask of the kind the hacker group Anonymous had made famous. A CD by the classic rock band Boston.
Every so often, the door would open, the lights would come on, and a petite, dark-haired graduate student would walk in and add some new mysterious object to the collection. Then she would leave, close the door, go down the hall and up the stairs, and return to the UCSD computer science graduate student office. This was Sarah Meiklejohn. A wall of glass in that office looked out over the forests of Sorrento Valley and the rolling hills beyond. But Sarah sat with her back to that majestic view. She stared intently at her computer screen, where she was becoming one of the world’s most unusual and most active Bitcoin users.
Sarah had bought every item in that UCSD storage room with bitcoin, more or less at random, from any vendor who would accept it. Between the purchases and the trips to the storage room to stash them, she was working through nearly every kind of activity a bitcoin user could perform, like a true crypto enthusiast.
She moved money in and out of 10 different bitcoin wallets and converted dollars into bitcoin at more than 20 exchanges, including Bitstamp, Mt. Gox, and Coinbase. She placed bets at online casinos like Satoshi Dice and Bitcoin Kamikaze. She contributed her computer’s processing power to 11 mining “pools,” groups that combine computing power from around the world to mine coins and then split the rewards among contributors. She repeatedly moved money in and out of accounts on the Silk Road drug market, without ever buying any drugs.
In total, Sarah made 344 cryptocurrency transactions over the course of a few weeks. She carefully recorded the details of each one, including the amount, the address she used, and the address of the party on the other side, which she had to painstakingly look up on the public Bitcoin blockchain.
These hundreds of seemingly pointless purchases and bets were not the result of some mental breakdown. Each one was a small experiment in an unprecedented research project. Sarah had decided to put Bitcoin’s anonymity to the test, taking up the challenge posed by its developers and even its creator.
Such painstaking manual work was laborious, but Sarah had time to kill. Each lookup sent her computer sifting through a massive database that she and her UCSD colleagues had assembled, and could take up to 12 hours to produce results. That database was the entire Bitcoin blockchain, dating back to its creation four years earlier. With it, she tracked and identified the seller, service provider, marketplace, or recipient on the other side of each of her hundreds of transactions.
As she began digging into the Bitcoin ecosystem, Sarah felt like an archaeologist. What did people do with bitcoin? How many saved it, and how many spent it? As she dug deeper, she set herself a specific goal, one that ran directly against the beliefs of the anarchists who trusted in Bitcoin’s anonymity: to prove beyond doubt that bitcoin transactions could be traced, even when the participants thought they were cleverly covering their tracks.
As she searched for the digital traces of bitcoins, her mind went back more than 20 years, to her mother’s Manhattan office. That morning, she and her mother had taken the subway from a station near the Museum of Natural History to a building in Foley Square, across from the imposing columns of the city courthouse. Sarah was still in elementary school, her mother had been appointed a federal prosecutor, and it was take-your-child-to-work day. In the years to come, her mother would take on contractors who bribed officials to inflate the price of school lunches or sidewalk paving at taxpayers’ expense, and banks that tried to sell moribund bonds to the city’s treasury department. Many of her “targets” ended up in jail.
Sarah, not yet 10 years old, was often assigned to read piles of papers for clues about bribery schemes in her mother’s cases. It was this habit of scrutinizing tiny data points to piece together the big picture that drove her to unconsciously dig into the Bitcoin blockchain, before she even realized what she was doing.
“There was something subconsciously pushing me to follow the money trail,” Sarah admitted.
As a child, Sarah loved to do puzzles, the more complex the better. Wherever they were, on the street, at the airport, at home, her mother would hand her a puzzle book to keep her tiny, hyper-curious daughter quiet. By the time she was 14, she was completing the New York Times crossword puzzle every day. Sarah also remembers visiting GeoCities, one of the first websites of the early, untamed World Wide Web, to try to decipher the secret messages carved into the Kryptos sculpture at CIA headquarters, which had stumped even Langley’s experts.
On a family trip to London, they visited the British Museum, and Sarah became fascinated by the Rosetta Stone and by the idea that the traces of ancient languages, representing entire civilizations, could be decoded if the right key were found. Soon she began reading about Linear A and Linear B, a pair of scripts used by the Minoan civilization on the island of Crete around 1500 BC. Linear B was deciphered in the 1950s, thanks in large part to Brooklyn College classicist Alice Kober, who had spent 20 years studying the Bronze Age script and filled 180,000 index cards with her notes.
Sarah was so fascinated by Linear A and Linear B that she convinced her homeroom teacher to hold a seminar on the subject, attended by just her and one friend. What obsessed her was not the story of Alice Kober but the fact that, after a century of study, no one had cracked Linear A. The hardest puzzle was the one without a key, where no one knew whether a solution even existed.
In cryptography, there is a principle called Schneier’s Law, named after the cryptographer Bruce Schneier, which states that anyone can create an encryption system clever enough that they themselves cannot crack it. But as with all the puzzles and mysteries that fascinated Sarah as a child, an outsider with a completely different approach can often break a system that its author had deemed “unbreakable.”
As she studied encryption, Sarah came to appreciate the importance of privacy and the need for communications that resist surveillance. But she was by no means a rebel. She was more fascinated by the intellectual work of making and breaking codes than by any ideological opposition to surveillance. Still, like most cryptographers, she believed in the need for unbreakable codes that could provide an untraceable hiding place for sensitive communications, whether for dissidents organizing against a government or whistleblowers leaking damning information to the press. She says her instinct for that principle was born in childhood, from trying to protect her privacy from her mother, a federal prosecutor.
Sarah proved to be a real talent and soon became a teaching assistant to Anna Lysyanskaya, a brilliant and accomplished computer scientist. Lysyanskaya herself had been mentored by the legendary Ron Rivest, whose name is the R in RSA, the algorithm that underpins modern encryption everywhere from email to instant messaging to web browsers. RSA is one of the few encryption protocols that has not succumbed to Schneier’s Law in more than 30 years.
Lysyanskaya had worked on a pre-Bitcoin cryptocurrency called eCash, created by David Chaum, whose innovations in anonymity systems paved the way for technologies from VPNs to Tor. After graduating from college, Sarah began her master’s degree at Brown under Lysyanskaya’s wing, working on ways to make eCash, a truly anonymous system, efficient and scalable.
Frankly, it is hard to imagine the system they came up with actually working in practice. It had one serious problem: an anonymous eCash spender could counterfeit a coin and pass it to an unsuspecting recipient. When that recipient took the money to some kind of eCash bank, the bank would check it, detect the counterfeit, and could strip away the anonymity to identify the fraudster. But by then, the fraudster would probably be long gone with whatever the counterfeit money had bought.
However, eCash still had one advantage that was particularly attractive to researchers: anonymity that could be considered permanent. eCash was built on a technique called “zero-knowledge proofs,” a method by which one party (the prover) can prove to another party (the verifier) that a statement is true without revealing any additional information. eCash used this technique so that the bank could confirm the legitimacy of a payment without learning anything about the sender.
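To make the prover/verifier idea concrete, here is a toy sketch of one classic protocol of this kind, Schnorr identification: the prover convinces the verifier that she knows a secret number without ever sending it. This is an illustration only and is not eCash’s actual construction. The parameters `P`, `Q`, `G` and all function names are assumptions chosen for the example, and the numbers are far too small to be secure.

```python
import secrets

# Toy public parameters (assumed for illustration, NOT secure):
# G generates a subgroup of prime order Q in the integers modulo P.
P, Q, G = 23, 11, 2

def keygen():
    """Prover picks a secret x and publishes y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)

def commit():
    """Prover picks a one-time random r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)

def respond(x, r, c):
    """Prover answers the verifier's challenge c without exposing x."""
    return (r + c * x) % Q

def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c (mod P) using public values only."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P

secret, public = keygen()
r, t = commit()                # step 1: commitment
c = secrets.randbelow(Q)       # step 2: verifier's random challenge
s = respond(secret, r, c)      # step 3: response
print(verify(public, t, c, s))
```

The verifier only ever sees `public`, `t`, `c`, and `s`. Because `r` is random and used once, the response `s` on its own does not give the secret away.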
Sarah first heard about Bitcoin while she was a PhD student at UCSD and a summer researcher at Microsoft. A friend at the University of Washington told her about a cool new cryptocurrency that could be used to buy drugs on Silk Road. She had already finished her research on eCash and was busy studying a technique for recovering ATM PINs from the heat signatures left on keypads, so she paid little attention to Bitcoin for at least another year.
At the end of 2012, during a UCSD computer science department outing, a young researcher named Kirill Levchenko suggested that Sarah take a look at the Bitcoin phenomenon. As the two walked the rocky trails of Anza-Borrego Desert State Park, Kirill eagerly explained Bitcoin’s unique “proof-of-work” system. It required anyone who wanted to “mine” the currency to contribute computing power to a puzzle-solving competition, with the winner’s batch of transactions recorded on the blockchain. Some miners were even building their own specialized processors to create this new currency. The system used a very clever approach to combat fake transactions without any central authority: a malicious actor who wanted to write a fake transaction into the blockchain would need more computing power than the thousands of honest miners combined.
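The puzzle-solving competition can be sketched in a few lines. The version below is a deliberate simplification: it searches for a number (a “nonce”) that makes the SHA-256 hash of some data start with a given number of zero hex digits. Real Bitcoin mining hashes a structured block header and compares it against a numeric target, so the data format and the `mine` and `is_valid` helpers here are assumptions for illustration only.

```python
import hashlib

def block_hash(block_data: str, nonce: int) -> str:
    """Hash the (simplified) block contents together with a nonce."""
    return hashlib.sha256(f"{block_data}|{nonce}".encode()).hexdigest()

def mine(block_data: str, difficulty: int):
    """Try nonces one by one until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

def is_valid(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a solution takes one hash; finding it took many."""
    return block_hash(block_data, nonce).startswith("0" * difficulty)

nonce, digest = mine("Alice pays Bob 5 coins", difficulty=3)
print(nonce, digest)
print(is_valid("Alice pays Bob 5 coins", nonce, 3))
```

The asymmetry is the point: solving takes many hash attempts, verifying takes one. Changing even one character of the recorded data invalidates the solution, so rewriting history means redoing the work.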
When Sarah first learned how Bitcoin worked, she was intrigued. Reading Satoshi Nakamoto’s white paper, she immediately understood that the system was the complete opposite of eCash. Where eCash relied on banks to detect forgery after the fact, Bitcoin checked every payment up front against the blockchain, an immutable public record of every transaction involving every bitcoin.
But this public ledger came at a huge cost to privacy: in Bitcoin, for better or worse, everyone could see every payment. Of course, it was not clear who was behind those tangled 26- to 35-character address strings. But to Sarah, that protection looked like a fig leaf, one that would sooner or later let everyone see what it was hiding. (The fig leaf is an idiom for a flimsy cover, from the biblical story of Adam and Eve using the leaves to cover themselves.) Unlike eCash, which left researchers no hints at all, Bitcoin handed over a whole trove of data to study.
Who could find the “behavioral patterns” that would expose anonymous users who believed they were smarter than their trackers? Sarah couldn’t help herself. Blockchains were like a tangle of mysterious symbols from ancient languages, hiding secrets from the ages.
She recalls thinking, “You can’t prove anything about privacy in this system. And if you can’t prove the system is private, what attacks are possible?”
She started with the most basic question: how many people were actually using Bitcoin?
The question sounded simple, but answering it was anything but. After downloading the entire blockchain and organizing it into a searchable database, she found that there were about 12 million different Bitcoin addresses, which had made 16 million transactions. Even in that pile of data, picking out the landmark events of Bitcoin’s recorded history was as easy as picking out the shapes of furniture under a sheet in the attic.
She could see the nearly 1 million bitcoins that Satoshi had mined in the cryptocurrency’s early days, before others started using it, as well as the first transaction, the 10 coins that Satoshi sent to Hal Finney as a test in January 2009. She also saw the first transaction with real-world value, when a programmer named Laszlo Hanyecz paid a friend 10,000 coins for two pizzas in May 2010 (an amount now worth hundreds of millions of dollars).
Quite a few addresses and transactions had been identified and discussed on forums like Bitcointalk, and Sarah spent days cutting and pasting long strings of characters into Google to see whether anyone had claimed a given address, or whether people were talking about some high-value transaction. As Sarah was beginning to see, anyone who cared enough to watch that sea of garbled-looking addresses could spot transactions that were quite valuable even at the time, passing between mysterious entities and hidden in plain sight in the chaos of the blockchain.
But digging deeper was where the real challenge lay. Sarah could see transactions between addresses. But how could she determine the size of a person’s or organization’s bitcoin hoard? A user can hold coins at as many addresses as they want, like a bank customer who can open an unlimited number of accounts with a single click. Some software even generates a new address automatically every time a bitcoin payment is received.
Still, Sarah believed that if she searched hard enough, she would find a pattern. It turned out that Satoshi’s own white paper had pointed to a technique for linking several addresses to a single entity. A bitcoin transaction can draw on multiple “inputs” from different addresses. If someone holds 5 coins at each of two addresses and wants to pay a friend 10 coins, the software creates a transaction with two 5-coin inputs and a single 10-coin output. To do this, the sender must hold the private keys for both input addresses. So anyone watching can reasonably assume that the two input addresses belong to the same person or organization, and trace that entity from there.
So Sarah started by applying Satoshi’s technique to every transaction with multiple inputs (some had hundreds), linking their input addresses together. That reduced 12 million addresses to an estimated 5 million users, and it began to connect whole chains of transactions that had previously seemed unrelated.
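The multi-input linking step can be sketched with a union-find (disjoint-set) structure, which merges addresses into clusters as evidence accumulates. This is a minimal illustration under stated assumptions, not the researchers’ actual code: the transaction format (a dict with an `"inputs"` list) and the function name `cluster_addresses` are made up for the example.

```python
def cluster_addresses(transactions):
    """Group addresses that were ever spent together as inputs of one transaction."""
    parent = {}

    def find(addr):
        # Register unseen addresses as their own cluster, then walk to the root.
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for tx in transactions:
        inputs = tx["inputs"]
        for addr in inputs:
            find(addr)
        # All inputs of one transaction are assumed to share an owner.
        for other in inputs[1:]:
            union(inputs[0], other)

    clusters = {}
    for addr in parent:
        clusters.setdefault(find(addr), set()).add(addr)
    return list(clusters.values())

# Hypothetical sample data: A and B are spent together, then B and C.
txs = [
    {"inputs": ["A", "B"]},
    {"inputs": ["B", "C"]},
    {"inputs": ["D"]},
]
clusters = cluster_addresses(txs)
print(sorted(sorted(c) for c in clusters))
```

The links are transitive. A and C never appear in the same transaction, but both were spent alongside B, so all three land in one cluster.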
After that first, fairly elementary step, Sarah really started to crack the puzzle. Like a modern-day linguistic archaeologist searching for patterns in strings of characters, she began to probe Bitcoin’s chains of transactions. She experimented with payments to herself and her colleagues, getting acquainted with the quirks of the cryptocurrency. Many wallets only allowed the owner to spend the entire amount held at an address in one go. Each address was like a piggy bank that had to be smashed to spend the coins inside, with any leftover coins put into a new piggy bank. In the Bitcoin system, this second piggy bank was called a “change” address. On the blockchain, nothing marked it as connected to the original.
Looking at a single transaction, it was hard to tell which receiving address was the payment and which was the change. But Sarah found a tell: if one receiving address had been used before and the other was completely new, the new one was almost certainly the change address. That meant both “piggy banks,” the one that was spent and the one holding the change, were likely owned by the same person.
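The change-address observation can be written as a small rule. The sketch below is a simplified reading of it: if exactly one output address of a transaction has never appeared on the blockchain before, treat it as the change address. The researchers’ full heuristic carries extra conditions that are not described here, and the data format and the name `find_change_address` are assumptions for illustration.

```python
def find_change_address(tx, seen_before):
    """Return the likely change address of `tx`, or None if it is ambiguous.

    `seen_before` is the set of addresses that already appeared in earlier
    transactions (hypothetical input, built by scanning the chain in order).
    """
    outputs = tx["outputs"]
    fresh = [addr for addr in outputs if addr not in seen_before]
    # Exactly one never-before-seen output among several: call it the change.
    if len(outputs) >= 2 and len(fresh) == 1:
        return fresh[0]
    return None

seen = {"alice_1", "coffee_shop"}

payment = {"inputs": ["alice_1"], "outputs": ["coffee_shop", "brand_new_addr"]}
print(find_change_address(payment, seen))    # the fresh output

ambiguous = {"inputs": ["alice_1"], "outputs": ["new_a", "new_b"]}
print(find_change_address(ambiguous, seen))  # two fresh outputs, so no guess
```

Returning `None` when more than one output is new keeps the rule conservative: a wrong guess would merge two unrelated owners into one.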
She began to apply this “change” lens to the links between spenders and the addresses that received their change, and she came to realize the enormous power of tracking bitcoin change. Not being able to tell a payment from its change had been like standing at a fork in the road with no signposts. Now she could put up the signposts herself. She could follow the money, regardless of the forks. A large sum would move from one change address to the next with each small payment, but in the end it still belonged to the original spender.
She called such a chain of transactions a “peeling chain” or “peel chain,” picturing someone peeling bills off a roll of dollars. Each peeled bill is spent and the rest of the roll goes into a different pocket, but the roll still belongs to the same person. Tracing these peel chains revealed flows of money that had previously been hidden.
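Following a peel chain amounts to applying the change rule over and over: at each hop, record the small payment that was peeled off, then jump to the change address and continue. The sketch below assumes a chronological list of simplified transactions (each with an `"inputs"` list and an `"outputs"` dict of address to amount). Those formats, the sample data, and the name `follow_peel_chain` are all hypothetical.

```python
def follow_peel_chain(start_address, transactions):
    """Trace a peel chain from `start_address` through chronological `transactions`.

    Returns a list of hops: (spent_from, paid_to, amount_paid, change_address).
    """
    seen = {start_address}
    current = start_address
    hops = []
    for tx in transactions:
        outputs = tx["outputs"]
        if current in tx["inputs"]:
            fresh = [addr for addr in outputs if addr not in seen]
            # A peel: one previously seen recipient plus one fresh change address.
            if len(outputs) == 2 and len(fresh) == 1:
                change = fresh[0]
                paid = next(addr for addr in outputs if addr != change)
                hops.append((current, paid, outputs[paid], change))
                current = change
            else:
                break  # ambiguous spend: stop rather than guess
        seen.update(tx["inputs"])
        seen.update(outputs)
    return hops

# Hypothetical chain: a stack of 100 coins peels off 5, then 10.
txs = [
    {"inputs": ["source"], "outputs": {"stack_0": 100, "shop_a": 1, "shop_b": 1}},
    {"inputs": ["stack_0"], "outputs": {"shop_a": 5, "stack_1": 95}},
    {"inputs": ["stack_1"], "outputs": {"shop_b": 10, "stack_2": 85}},
]
for hop in follow_peel_chain("stack_0", txs):
    print(hop)
```

Each hop changes the address holding the bulk of the coins, yet the trace stays on the same owner throughout.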
Sarah could now track money in ways most users would never expect, but tracking money did not mean identifying who owned it. The identities of the people behind the money remained a mystery. To learn who they were, she knew she could not simply be an observer. She had to jump in and become a player, sometimes an undercover one.
For help, Sarah turned to UCSD computer science professor Stefan Savage, who preferred hands-on experience and real-world analysis to abstract generalizations. He had been a lead advisor to the legendary research team that demonstrated to General Motors that they could take control of a Chevy Impala’s transmission and brakes over the internet, through the car’s OnStar cellular radio.
Before that, Savage had helped lead a team, one that included Kirill Levchenko (the man who had pitched Sarah on Bitcoin while wandering the desert), in an ambitious project to map the massive spam ecosystem. In that project, as with the car hack, Savage’s team was not afraid to get its hands dirty. They collected hundreds of millions of spam emails, most of them linking to websites selling pharmaceuticals, both real and fake. They then built a bot to follow all those links and spent more than $50,000 on the products the spammers were advertising, using credit cards arranged in cooperation with the issuing banks so they could track where the money ended up.
Some of the shady banks on the receiving end were eventually forced to close. A professor who participated in the project, Geoffrey Voelker, declared: “Our secret weapon is shopping.”
So when Sarah came to ask for advice, Savage recommended just that: she would have to verify the identity behind each address by transacting with it directly, just as an undercover narcotics officer has to make real drug buys.
So in early 2013, Sarah found herself shopping for coffee, pastries, playing cards, mugs, baseball caps, silver coins… anything that could be bought online with bitcoins. She joined a dozen or so mining pools, gambled at every online crypto casino she could find, and moved money in and out of every cryptocurrency exchange she knew, as well as Silk Road. Over and over again.
The hundreds of addresses that Sarah positively identified through her 344 transactions were just a tiny slice of the bitcoin landscape. But when she combined them with her address-clustering and peel-chain techniques, many of those addresses turned out not to be isolated but part of much larger clusters, each with a single owner. In this way she was able to identify more than a million previously anonymous addresses.
For example, with just 30 addresses identified by moving money in and out of Mt. Gox, she was able to link 500,000 addresses to the exchange. And with just four deposits and seven withdrawals at Silk Road, she found 300,000 addresses belonging to the dark web site. This did not mean she had identified specific Silk Road users, nor had she unmasked its mysterious leader, who went by the moniker Dread Pirate Roberts (DPR). But it did undermine DPR’s claim that his bitcoin system could hide the movement of money into and out of Silk Road accounts.
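The leap from a handful of identified addresses to hundreds of thousands comes from combining labels with clusters: once a single address in a cluster is known to belong to a service, the whole cluster inherits that name. A minimal sketch, assuming clusters are sets of addresses and tags are a dict from address to service name. Both formats, the placeholder addresses, and the name `label_clusters` are hypothetical.

```python
def label_clusters(clusters, known_tags):
    """Spread each known tag to every address in the same cluster.

    Clusters with no tag, or with conflicting tags, are left unlabeled.
    """
    labels = {}
    for cluster in clusters:
        names = {known_tags[addr] for addr in cluster if addr in known_tags}
        if len(names) == 1:
            name = names.pop()
            for addr in cluster:
                labels[addr] = name
    return labels

# Hypothetical clusters and two addresses tagged through direct transactions.
clusters = [{"a1", "a2", "a3"}, {"b1", "b2"}, {"c1"}]
known_tags = {"a2": "Mt. Gox", "b1": "Silk Road"}

labels = label_clusters(clusters, known_tags)
print(sorted(labels.items()))
```

Skipping clusters with conflicting tags is a design choice for the sketch: a conflict suggests that either a tag or the clustering is wrong, and guessing would spread the error.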
When Sarah brought the results to Savage, he was impressed. But he said that to publish, you need concrete evidence for readers, not just statistics. “We have to show people what this technology can do.”
So Sarah went back to work. She would follow transactions that might lead to actual criminals. As she scoured crypto forums for addresses worth digging into, a mountain of money suddenly appeared: a single address had accumulated 613,326 bitcoins in 2012 alone, 5% of all the coins in circulation. It was worth $7.5 million at the time, and would be worth billions today. Speculation was that the money might be Silk Road’s wallet, or that it might be the proceeds of an unrelated Bitcoin Ponzi scheme run by a user named pirateat40.
Sarah could not say for sure which rumors were true and which were false. But clustering helped her track the vast sum. She saw that it had been split up in late 2012 and had branched out across the Bitcoin blockchain. Her understanding of peel chains allowed her to follow the money. Eventually, the peel chains led to exchanges like Mt. Gox or Bitstamp, presumably to be cashed out for fiat currency. For an academic, this was a dead end. But for someone with a court order, it would be a different story. Sarah knew that solving the mystery of the $7.5 million would take cooperation with the authorities.
With the hunt on, Sarah focused on another kind of dirty money: the epidemic of large-scale crypto heists. Bitcoins are like cash or gold. Anyone who steals the private key to a wallet can empty it like a safe. Unlike credit cards or other digital payment systems, there is no way to stop or reverse the flow of money. That makes any crypto business’s “vault” of revenue easy prey for hackers, especially when the owners store the private keys on computers connected to the internet, which is like walking through a rough neighborhood with a pile of cash in your pocket.
Sarah found hot Bitcointalk threads listing the addresses involved in the most recent heists. She began tracking the money until certain branches reached exchanges to be converted to fiat. She followed the peel chain from a heist of 18,500 bitcoins from Bitcoinica to three exchanges where the thieves had no doubt cashed out. On her computer screen, there was enough evidence for a detective with a few warrants to investigate further.
Finally, when Sarah presented these results to Savage, he nodded: publish them. In the final draft of the paper, they confirmed for the first time, with solid empirical evidence, the reverse of what Bitcoin users had believed until then: far from being untraceable, the blockchain was an open book, revealing vast swaths of transactions between parties, many of whom thought they were acting anonymously.
“Even our relatively small experiments demonstrate that this method can reveal the structure of the Bitcoin economy: how coins are used and which organizations are part of the ecosystem. We have clearly established that an agency with subpoena power could determine who pays whom. Furthermore, we believe that the dominance of a small number of Bitcoin organizations (mostly exchanges), coupled with the public nature of transactions and our ability to trace money to important organizations, will ultimately make Bitcoin a less attractive place for criminals to launder large sums of money.”
After writing these lines and poking a hole in the myth of Bitcoin’s untraceability, the authors, Sarah, Savage, and Geoffrey Voelker, discussed what to name the paper. Given the Wild West nature of their subject, and the professors’ love of spaghetti westerns, they settled on “A Fistful of Bitcoins,” a nod to Clint Eastwood’s 1960s classic “A Fistful of Dollars.” They added a subtitle that echoed Eastwood’s nameless cowboy as well as the characters their newfound technique might help expose. And so the paper was published as “A Fistful of Bitcoins: Characterizing Payments Among Men with No Names.”
In the new age of crypto-tracing that Sarah had paved the way for, those men might not remain nameless for much longer.
(Source: Wired
Translation: Nguyen Thanh Nam - Author: Dek knows everything and still advances)
1 comment
What does it mean to do science?
Read this article to see that the core ingredients of scientific discovery are a scientific curiosity formed in childhood, the environment, and one’s colleagues, while money plays only a secondary role!
“Patterns” such as change addresses or the multiple inputs of bitcoin transactions were built right into the design of the system. Anyone could have spotted them easily, no UCSD professorship required. So it is no surprise that many readers finish the article and declare: is that all? There is nothing profound here.
What is profound here is the prosecutor mother who took her daughter Sarah to court from a young age, let her dig through piles of criminal case files, encouraged her to solve puzzles, and took her to London to see the Rosetta Stone. It is the school that held a seminar on a topic as difficult as Linear A and Linear B with only two people attending. It is colleagues like Kirill who, in the middle of a hike, urge a friend to jump into Bitcoin research, and an advisor like Savage who is not afraid to break the law in order to understand criminals.
And in the end, science means fumbling around buying all sorts of odds and ends, then taking notes for 12 hours at a stretch, again and again. All to reach a scientific conclusion as simple as:
“Bitcoin is not an attractive enough environment for criminals to launder money on a large scale.”
Knowing nothing does not mean knowing nothing; it means forgetting, to make room for new discoveries!
NTN