Pensions & Investments: Big data analysis is coming to a pension fund near you

In the next five years, big data analysis is poised to become one of the most important and competitive skill sets around. Portfolio analysis in particular is where pension funds are focusing their big data investments.

Big data is a set of techniques embedded in the latest, most sophisticated technologies, including social media analytics. The full article is available from Pensions & Investments.

Marco Avellaneda: Big Data and Blockchain are the Future of Finance

Professor Marco Avellaneda will speak at the 6th Annual Big Data Finance Conference at NYC’s brand-new Cornell Tech campus on May 11, 2018.

As told to Irene Aldridge

IA: Lots of social media commentary lately suggests that “everybody talks about big data, but nobody knows what big data really is”. What are your thoughts on this?

MA: Big Data refers to the use of transaction data to infer trends in social systems. The foremost examples are Facebook and Google, which collect user data and sell it to advertisers. Big Data is also used to organize records in finance, healthcare, and other fields that involve large volumes of transactions.

IA: What are the top applications of Big Data in Finance today?

MA: Finance (and also healthcare) generates a lot of data, but, as everywhere else, there is a need to satisfy privacy requirements and laws. This is an area of great interest. In Finance, fintech companies have streamlined applications for credit and mortgages and become an effective way to connect lenders and consumers. The use of blockchain technology is also very important. Blockchain provides a new method for validating financial transactions such as currency trades, and it disrupts the traditional interbank FX market. Cross-border payment systems backed by e-currency and blockchain are proliferating as we speak. Finally, using customer data to advise customers on financial investments (in the context of a bank or brokerage) is another application of Big Data.

IA: How can Big Data be used in Fixed Income? Can Big Data add substantial value?

MA: The main application would be real-time risk management for clearinghouses such as DTCC-FICC, where risk and margin calculations for bonds and other securities could be done faster.

IA: How do you see Big Data developing in Finance from here?

MA: I believe that blockchain technology, particularly the extension of BTC-style transactions to other, more complicated transactions such as stock trading, could be the future of Big Data in Finance.

Professor Marco Avellaneda (PhD, University of Minnesota, 1985) specializes in applied mathematics, probability, and statistics. Most of his research of the last 10-15 years involves applications of mathematics and statistics to financial markets, derivatives, portfolio management, and risk management. His work is published in specialized journals such as Quantitative Finance, Risk Magazine, the International Journal of Theoretical and Applied Finance, and other publications read by practitioners as well as theoreticians. He was named "Quant of the Year 2010" by Risk Magazine for an article on hard-to-borrow stocks and their effect on equity options pricing. Marco is associated with the consulting firm Finance Concepts, which he founded in 2003. His current interests are in internet-delivered financial risk-management systems for buy-side firms. To hear Marco in person, please join the 6th Annual Big Data Finance Conference at NYC's brand-new Cornell Tech campus on May 11, 2018.

Why Big Data Is the New Must-Have Skill in Finance

By Irene Aldridge

The history of Finance is full of mathematical and technological revolutions. With the expansion of computer technology, mathematical innovations are not only abstractly interesting but also very fast and profitable. Most recently, mathematical innovations in Finance first generated opportunities in quant techniques, which were followed by the high-frequency trading revolution. Now the opportunity expands again as Big Data techniques are implemented within Finance.

Many people are still puzzled by what is so special about Big Data. After all, Econometrics and other data processing tools have been used in Finance for decades. It is true that Econometrics comprises a part of Big Data analysis, known as Supervised Learning. However, the much more extensive and powerful set of Big Data techniques has barely registered on the Finance radar.

The tools at the core of Big Data analytics handle extensive data tables, sparse or missing values, and data clustering, to name a few, and often make Econometrics look like a set of exercises for kindergartners. Gone is the need to discard data due to incomplete information: Big Data welcomes data in all shapes and forms, and disjointed, irregular, and even partially corrupted or biased data sets can be processed with equal ease to extract true values and relationships.
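As a small illustration of the "no need to discard data" point, the sketch below estimates a correlation from two series with missing observations by using every pairwise-complete observation, rather than dropping any row that has a gap (classical "listwise deletion"). The data values are invented for the example.

```python
# Sketch: estimating a correlation from incomplete data without discarding rows.
# Listwise deletion drops any observation with a missing field; a pairwise-complete
# estimate keeps every usable pair. The two series below are invented.

from math import sqrt

def pairwise_corr(xs, ys):
    """Correlation of two series with None marking missing values,
    using only the positions where both observations are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs) / n
    vx = sum((x - mx) ** 2 for x, _ in pairs) / n
    vy = sum((y - my) ** 2 for _, y in pairs) / n
    return cov / sqrt(vx * vy)

a = [1.0, 2.0, None, 4.0, 5.0, 6.0]
b = [1.1, 1.9, 3.2, None, 5.1, 5.8]
print(round(pairwise_corr(a, b), 3))  # computed from the 4 complete pairs
```

Production systems would go further (imputation, masked covariance estimators), but the principle is the same: partial records still contribute information.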

One of the key properties of Big Data is speed: the techniques lend themselves to efficient, fast, and powerful data processing and inference. Move over, high-frequency trading: Big Data can process financial data in real time.

What is the key difference between Big Data and traditional data processing techniques? As explained in our new research paper, "Big Data in Portfolio Management", available on SSRN, Big Data extensively utilizes eigenvalues, or principal values. Eigenvalues, first developed in the 18th century as an aid to solving differential equations, have since been extensively studied. Many properties of eigenvalues were researched during and after World War II and, most recently, during the social media advertising boom of the past two decades. Eigenvalues featured prominently in Google's original search algorithm and helped propel Google to its present prominence. In Finance, however, eigenvalues are underutilized at best and completely unknown even in some of the most prominent shops. Much work remains to be done in implementing Big Data concepts in the financial sector.
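To make the eigenvalue idea concrete, here is a minimal sketch (not the paper's method) that extracts the dominant eigenvalue and eigenvector of a toy 3-asset correlation matrix by power iteration. In equity correlation matrices, the top eigenvector typically represents the "market mode" shared by all assets. The correlation numbers are invented.

```python
# Sketch: dominant eigenvalue/eigenvector of a toy correlation matrix via
# power iteration. The top eigenvector of an equity correlation matrix is
# typically the common "market mode". All numbers below are invented.

def mat_vec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def power_iteration(m, steps=200):
    v = [1.0] * len(m)
    for _ in range(steps):
        w = mat_vec(m, v)
        norm = max(abs(x) for x in w)
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue estimate
    mv = mat_vec(m, v)
    lam = sum(x * y for x, y in zip(mv, v)) / sum(x * x for x in v)
    return lam, v

corr = [
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]
lam, vec = power_iteration(corr)
print(round(lam, 3))  # top eigenvalue; all components of vec share one sign
```

In practice one would use a linear algebra library (e.g. a symmetric eigensolver) rather than hand-rolled iteration, but the mechanics are the same.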

Eigenvalues are also key drivers of artificial intelligence (AI) and automation. While the concept of AI seems mysterious and ominous, the underlying mathematical concepts are straightforward and well-developed. A great place to learn about the techniques, their adoption in Finance, and key trends is the upcoming 6th Annual Big Data Finance Conference, scheduled to take place on May 11, 2018, at the brand-new Cornell Tech campus in New York City. The mission of the conference is to bring together the financial industry, academia, and government to facilitate the exchange of information and key developments in the area of Big Data in Finance.

Financial regulators certainly stand to benefit from Big Data techniques. There is a real chance to catch up with, and even overtake, financial professionals with Big Data capabilities. However, a fair amount of work remains to develop and implement Big Data market surveillance capabilities: the available research applied to Finance is still scarce, and the field is wide open to new discoveries and applications.

Irene Aldridge is President and Head of Research of AbleMarkets, a Big Data for Capital Markets company. She is a co-author of "Real-Time Risk: What Investors Should Know About Fintech, High-Frequency Trading and Flash Crashes" (Wiley, 2017) and author of "High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems" (Wiley, 2014).

Using Big Data to Deal with Inherent Correlation Matrix Instability in Portfolio Management – 2018-05-11 3:00 PM

Adoption of spectral analysis as a mainstream tool for portfolio management has languished despite a growing body of academic research. The highly theoretical nature of the recent publications has slowed the application of spectral analysis to specific use cases. In this paper, we show a step-by-step approach, the advantages, and the results of spectral decomposition in portfolio reallocation. Specifically, we show how spectral decomposition solves two of the most pressing large-scale portfolio reallocation problems: extreme weights and transaction costs. We further show empirical results of portfolio reallocation under different common portfolio composition scenarios, and how spectral decomposition helps speed up and outperform traditional portfolio allocation techniques.
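The "extreme weights" problem mentioned in the abstract can be seen on a toy two-asset example. The sketch below (invented numbers, and not the talk's exact procedure) computes unconstrained minimum-variance weights from a near-singular covariance matrix, then from a version shrunk toward the identity, a simple stand-in for full spectral cleaning, which likewise raises the smallest eigenvalues.

```python
# Sketch: extreme minimum-variance weights from a near-singular covariance
# (0.99 correlation), tamed by shrinking the covariance toward the identity.
# Shrinkage here is a crude stand-in for spectral cleaning; numbers invented.

def minvar_weights(cov):
    """Unconstrained minimum-variance weights, proportional to cov^{-1} @ 1."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    raw = [(d - b) / det, (a - b) / det]   # cov^{-1} applied to the ones vector
    s = sum(raw)
    return [round(x / s, 3) for x in raw]

vol1, vol2, rho = 0.20, 0.22, 0.99
cov = [[vol1**2, rho * vol1 * vol2],
       [rho * vol1 * vol2, vol2**2]]
print(minvar_weights(cov))    # nearly-collinear assets -> extreme long/short weights

alpha, avg_var = 0.3, (vol1**2 + vol2**2) / 2
clean = [[(1 - alpha) * cov[0][0] + alpha * avg_var, (1 - alpha) * cov[0][1]],
         [(1 - alpha) * cov[1][0], (1 - alpha) * cov[1][1] + alpha * avg_var]]
print(minvar_weights(clean))  # weights pulled back toward balance
```

The raw matrix produces a large long/short bet on two nearly identical assets; after cleaning, both weights are positive and moderate.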

Statistics of VIX Futures and Applications to Trading Volatility Exchange-Traded Products – 2018-03-01 2:00 PM

Marco Avellaneda

New York University (NYU) – Courant Institute of Mathematical Sciences; Finance Concepts LLC

Andrew Papanicolaou

NYU Polytechnic School of Engineering – Department of Finance and Risk Engineering

We study the dynamics of VIX futures and ETNs/ETFs. We find that contrary to classical commodities, VIX and VIX futures exhibit large volatility and skewness, consistent with the absence of cash-and-carry arbitrage. The constant-maturity futures (CMF) term-structure can be modeled as a stationary stochastic process in which the most likely state is a contango with VIX ≈ 12% and a long-term futures price V∞ ≈ 20%. We analyze the behavior of ETFs and ETNs based on constant-maturity rolling futures strategies, such as VXX, XIV and VXZ, assuming stationarity and through a multi-factor model calibrated to historical data. We find that buy-and-hold strategies consisting of shorting ETNs that roll long futures, or buying ETNs that roll short futures, will produce theoretically-sure profits if it is assumed that CMFs are stationary and ergodic (see Proposition 3.1). To quantify further, we estimate a 2-factor lognormal model with mean-reverting factors to VIX and CMF historical data from 2011 to 2016. The results confirm the profitability of buy-and-hold strategies, but also indicate that the latter have modest Sharpe ratios, of the order of SR = 0.5 or less, and high variability over 1-year horizon simulations. This is due to the surges in VIX and CMF backwardations which are experienced sporadically, but also inevitably, in the volatility futures market.
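The stationarity assumption underlying the abstract can be illustrated with a one-factor mean-reverting (Ornstein-Uhlenbeck) simulation of a log futures level. This is a toy sketch, not the paper's calibrated 2-factor lognormal model; the kappa/theta/sigma parameter values are invented.

```python
# Sketch: one-factor mean-reverting (OU) simulation of a log constant-maturity
# futures level. Toy illustration of the stationarity assumption only, not the
# paper's calibrated 2-factor model; kappa/theta/sigma are invented.

import math
import random

random.seed(7)

kappa, theta, sigma = 5.0, math.log(0.20), 1.0   # reversion speed, long-run log-level, vol
dt, n = 1 / 252, 252 * 5                         # daily steps over five years

x = theta
path = []
for _ in range(n):
    x += kappa * (theta - x) * dt + sigma * math.sqrt(dt) * random.gauss(0, 1)
    path.append(x)

mean_level = math.exp(sum(path) / len(path))
print(round(mean_level, 2))  # hovers near the long-run level of about 0.20
```

Under stationarity, the level keeps returning to its long-run value, which is what makes the buy-and-hold roll strategies in the paper theoretically profitable, while the sporadic surges away from it are what keeps their Sharpe ratios modest.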

Bisecting K-means Algorithm Based on K-valued Self-determining and Clustering Center Optimization

New research from Jian Di and Xinyue Gou forthcoming in the Journal of Computers.

Abstract: The initial clustering centers of the traditional bisecting K-means algorithm are randomly selected, and its k value cannot be determined beforehand. This paper proposes an improved bisecting K-means algorithm based on automatic determination of the K value and optimization of the cluster centers. First, the initial cluster centers are selected using point density and a distance function; second, the K value is determined automatically using intra-cluster similarity and inter-cluster difference. Experimental results on UCI datasets show that the algorithm effectively avoids the influence of noise points and outliers and improves the accuracy and stability of clustering results.

Visit the brand-new campus of Cornell University

Yes, the brand-new billion-dollar campus of Cornell University is in NYC, on Roosevelt Island, one stop on the F train from the 63rd Street and Lexington Avenue station. It is quick to reach and a glamorous facility, boasting the latest modernist architecture and all sorts of high-tech innovations. It is a place to see in its own right!

Come for the conference and see the buildings!