MIT computer scientists can predict the price of Bitcoin

October 21, 2014

Using Bayesian regression, Devavrat Shah, a member of the Laboratory for Information and Decision Systems (LIDS) and the Computer Science and Artificial Intelligence Laboratory (CSAIL), and recent graduate student Kang Zhang identified patterns in five months of price data from all major Bitcoin exchanges. Those patterns enabled them to predict the price of Bitcoin and nearly double their investment over a 50-day period.

Read more in the Oct. 21, 2014 MIT News Office article by CSAIL correspondent Adam Conner-Simons titled "MIT computer scientists can predict the price of Bitcoin. CSAIL/LIDS team's algorithm doubles initial investment in under two months," also posted below.

The team's algorithm allowed for increasing profit (black) relative to the price of Bitcoin (blue). Courtesy of the researchers.
Scientists have crunched data to predict crime, hospital visits, and government uprisings, so why not the price of Bitcoin?

A researcher at MIT’s Computer Science and Artificial Intelligence Laboratory and the Laboratory for Information and Decision Systems recently developed a machine-learning algorithm that can predict the price of the infamously volatile cryptocurrency Bitcoin, allowing his team to nearly double its investment over a period of 50 days.

Earlier this year, principal investigator Devavrat Shah and recent graduate Kang Zhang collected price data from all major Bitcoin exchanges, every second for five months, accumulating more than 200 million data points.

Using a technique called “Bayesian regression,” they trained an algorithm to automatically identify patterns in the data, which they then used to predict prices and trade accordingly.

Specifically, every two seconds they predicted the average price movement over the following 10 seconds. If the predicted movement was higher than a certain threshold, they bought a Bitcoin; if it was lower than the opposite (negative) threshold, they sold one; and if it fell in between, they did nothing.
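The buy, sell, or hold rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as the article states it, not the researchers' code: the names `decide` and `run_strategy`, the symmetric threshold, the lack of any limit on the position held, and the absence of trading fees are all assumptions made here.

```python
from typing import Sequence


def decide(predicted_move: float, threshold: float) -> int:
    """Return +1 to buy one Bitcoin, -1 to sell one, 0 to do nothing.

    Assumes a symmetric threshold: buy above +threshold, sell below -threshold.
    """
    if predicted_move > threshold:
        return 1
    if predicted_move < -threshold:
        return -1
    return 0


def run_strategy(predicted_moves: Sequence[float],
                 prices: Sequence[float],
                 threshold: float) -> float:
    """Apply the rule at each decision point and return the final profit.

    predicted_moves[i] is the forecast price movement at decision point i,
    and prices[i] is the price at which a trade at that point would execute.
    Fees and position limits are ignored (an assumption of this sketch).
    """
    if not prices:
        return 0.0
    position = 0   # number of Bitcoins held (negative means short)
    cash = 0.0     # running cash balance from trades
    for move, price in zip(predicted_moves, prices):
        action = decide(move, threshold)
        position += action
        cash -= action * price
    # Value any remaining position at the last observed price.
    return cash + position * prices[-1]
```

For example, with a hypothetical threshold of 0.2, forecasts of `[0.5, 0.0, -0.5]` and prices of `[100.0, 105.0, 110.0]` produce one purchase at 100, no action at 105, and one sale at 110.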

Over 50 days, the team’s 2,872 trades gave them an 89 percent return on investment with a Sharpe ratio (a measure of return relative to the amount of risk) of 4.1.
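For readers unfamiliar with the Sharpe ratio, a minimal sketch of the textbook calculation follows: the mean excess return divided by the standard deviation of those returns. The article does not say which return periods, risk-free rate, or annualization the team used, so the function name `sharpe_ratio` and its defaults are assumptions for illustration only.

```python
import math
from typing import Sequence


def sharpe_ratio(returns: Sequence[float], risk_free: float = 0.0) -> float:
    """Mean excess return divided by the sample standard deviation.

    `returns` holds per-period returns (for example, 0.01 for 1 percent).
    No annualization is applied; that choice is left to the caller.
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(excess) / n
    # Sample variance (divides by n - 1).
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    std = math.sqrt(variance)
    if std == 0:
        raise ValueError("returns have zero variance")
    return mean / std
```

A higher value means more return per unit of variability, which is why a figure of 4.1 is notable alongside the 89 percent return.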

The team’s paper was published this month at the 2014 Allerton Conference on Communication, Control, and Computing.

“We developed this method of latent-source modeling, which hinges on the notion that things only happen in a few different ways,” says Shah, who previously used the approach to predict Twitter trending topics. “Instead of making subjective assumptions about the shape of patterns, we simply take the historical data and plug it into our predictive model to see what emerges.”
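The article does not spell out the estimator, but one simple way to "plug historical data into a predictive model" is a similarity-weighted average: compare the most recent price pattern with every stored historical pattern, and average the price movements that followed them, giving more weight to closer matches. The sketch below assumes that reading. The function name `predict_move`, the squared-distance measure, and the sharpness parameter `c` are illustrative choices, not details confirmed by the source.

```python
import math
from typing import List, Sequence, Tuple


def predict_move(recent: Sequence[float],
                 history: List[Tuple[Sequence[float], float]],
                 c: float = 1.0) -> float:
    """Predict the next price movement from similar historical patterns.

    `history` is a list of (pattern, outcome) pairs, where `pattern` is a
    past window of prices the same length as `recent`, and `outcome` is
    the price movement that followed it. Each outcome is weighted by
    exp(-c * squared distance between the pattern and `recent`).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for pattern, outcome in history:
        dist2 = sum((a - b) ** 2 for a, b in zip(recent, pattern))
        weight = math.exp(-c * dist2)
        weighted_sum += weight * outcome
        total_weight += weight
    if total_weight == 0.0:
        return 0.0  # no usable history: predict no movement
    return weighted_sum / total_weight
```

No shape is assumed for the patterns in advance; the stored examples themselves act as the model, which matches the spirit of Shah's description.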

Shah says he was drawn to Bitcoin because of its vast swath of free data, as well as its sizable user base of high-frequency traders.

“We needed publicly available data, in large quantities and at an extremely fine scale,” says Shah, the Jamieson Career Development Associate Professor of Electrical Engineering and Computer Science. “We were also intrigued by the challenge of predicting a currency that has seen its prices see-saw regularly in the last few years.”

In the future, Shah says he is interested in expanding the scale of the data collection to further hone the effectiveness of his algorithm.

“Can we explain the price variation in terms of factors related to the human world? We have not spent a lot of time doing that,” Shah says, before adding with a laugh, “But I can show you it works. Give me your money and I’d be happy to invest it for you.”

When Shah published his Twitter study in 2012, some academics wondered whether his approach could work for stock prices. With the Bitcoin research complete, he says he now feels confident modeling virtually any quantity that varies over time — including, he says half-jokingly, the validity of astrology predictions.

If nothing else, the findings support Shah’s belief that, more often than not, what gets in the way of our predictive powers is our preconceived notion of what patterns will pop up.

“When you get down to it,” he says, “you really should be letting the data decide.”