Navigating Global Stock Market Correlations

Published December 2, 2019

Introduction

In this article we obtain returns for each country's stock market, using ETFs that track MSCI Country Indexes as proxies. We visualize pairs of returns in two different ways to get a sense of how correlated they are.

Then we dive into the correlations themselves! First we display the correlation matrix in no specific order; then we apply a clustering algorithm to find a permutation that shows how countries are linked together.

You can skip the code and go directly to the visualizations.

Data Collection

We first define our mapping between ETF tickers and countries.

TICKER_COUNTRY = {
    'VTI': 'US', # No MSCI Country index ETF, so we pick the VTI

    # Developed Market
    'EWA': 'AU', 'EWO': 'AT', 'EWK': 'BE', 'EWC': 'CA',
    'EDEN': 'DK', 'EFNL': 'FI', 'EWQ': 'FR', 'EWG': 'DE',
    'EWH': 'HK', 'EIRL': 'IE', 'EIS': 'IL', 'EWI': 'IT',
    'EWJ': 'JP', 'EWN': 'NL', 'ENZL': 'NZ', 'ENOR': 'NO',
    'EWS': 'SG', 'EWP': 'ES', 'EWD': 'SE', 'EWL': 'CH',
    'EWU': 'GB',

    # Emerging Market
    'AGT': 'AR', 'EWZ': 'BR', 'ECH': 'CL', 'MCHI': 'CN',
    'ICOL': 'CO', 'INDA': 'IN', 'EIDO': 'ID', 'EWM': 'MY',
    'EWW': 'MX', 'EPU': 'PE', 'EPHE': 'PH', 'EPOL': 'PL',
    'QAT': 'QA', 'ERUS': 'RU', 'KSA': 'SA', 'EZA': 'ZA',
    'EWY': 'KR', 'EWT': 'TW', 'THD': 'TH', 'TUR': 'TR',
    'UAE': 'AE',
}

Then we use IEX Cloud to retrieve the time series. To run this code you need an API key, which you can get by signing up.

import requests
import pandas as pd

def get_close(ticker, country):
    resp = requests.get(
        'https://cloud.iexapis.com/beta/stock/{}/chart/max'.format(ticker),
        params={
            'token': 'YOUR_TOKEN', # Fill this with your token from IEX Cloud
            'chartCloseOnly': 'true',
        },
    )

    df = pd.DataFrame(resp.json()).set_index('date')
    df = df[['close']].rename(columns={'close': country})

    return df

close = pd.concat([
    get_close(ticker, country)
    for ticker, country in TICKER_COUNTRY.items()
], axis=1, sort=True)

close.columns.name = 'country'

returns = close.ffill().pct_change()
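
To make the last line concrete, here is a minimal sketch on made-up prices (the dates, values and the `toy_` names are illustrative, not taken from the dataset). Forward-filling carries the last known close over a missing day, so a day without a quote shows up as a zero return rather than a gap.

```python
import numpy as np
import pandas as pd

# Made-up closing prices; 'FR' has no quote on the second day
toy_close = pd.DataFrame(
    {'US': [100.0, 102.0, 101.0], 'FR': [50.0, np.nan, 51.0]},
    index=['2019-11-25', '2019-11-26', '2019-11-27'],
)

# Same transformation as above: carry the last close forward,
# then take the day-over-day relative change
toy_returns = toy_close.ffill().pct_change()
print(toy_returns)
```

The first row is always empty, since there is no previous close to compare with.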

Data Presentation

We obtained one dataset of time series for the 43 countries we want to study. We have at most 5 years of history, but up to 10 years can be obtained with the premium version of IEX Cloud.

Returns

date                  US            CN            FR            DE            GB            CH            ES
2019-11-25    0.00948527     0.0174996    0.00410095    0.00450294     0.0138037     0.0072445    0.00567577
2019-11-26    0.00212979    0.00214983    0.00219918    0.00206897   -0.00242057    0.00282558   -0.00176367
2019-11-27    0.00475059    0.00363036   -0.00282132   0.000688231    0.00667273    0.00153689    0.00318021
2019-11-29   -0.00447928    -0.0190727    -0.0047155   -0.00481431   -0.00783368   -0.00511509  -0.000704473
2019-12-02   -0.00906137   -0.00284948    -0.0101074   -0.00932965   -0.00698451    -0.0033419    -0.0137469


A First Glance

Let’s first visualize returns in a simple line chart. The first striking thing is how correlated the returns of seemingly distant countries are.

[Figure: line chart of daily returns per country, February to December.]

A second way to compare two countries is to plot their returns in a scatter plot. If you plot a European country, you will see the Brexit outlier (2016-06-24), a reminder of how non-Gaussian returns are. The closer the pattern is to a thin line, the more correlated the two countries are.

[Figure: scatter plot of daily returns for a selectable pair of countries, one on the X axis and one on the Y axis.]
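
The link between a thin scatter pattern and a high correlation can be checked on synthetic data. The sketch below is not based on the ETF returns: it assumes three made-up series that share a common daily shock and differ only in how much independent noise is added on top. All names and volatility levels are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Shared shock that drives all three made-up markets
common = rng.normal(0.0, 0.01, size=n)

base = common + rng.normal(0.0, 0.002, size=n)
tight = common + rng.normal(0.0, 0.002, size=n)  # little own noise: thin line
loose = common + rng.normal(0.0, 0.02, size=n)   # a lot of own noise: wide cloud

# Pearson correlation of each series against the base series
corr_tight = np.corrcoef(base, tight)[0, 1]
corr_loose = np.corrcoef(base, loose)[0, 1]
print(corr_tight, corr_loose)
```

Plotting `base` against `tight` would give points hugging a line, while `base` against `loose` would give a much rounder cloud, and the two correlation coefficients reflect that difference.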

Diving into Correlations

We need two steps to visualize the correlation matrix nicely. The first is the correlation computation itself, which we perform over a rolling window of 252 trading days (about one year).

correlations = returns.rolling(252, min_periods=252 // 2).corr()

Thanks to pandas, this is a one-liner. Let’s plot the matrix.

A quick look at the matrix is enough to start seeing a pattern: a lot of countries are tightly tied together, and there seem to be clusters of countries.

[Figure: correlation matrix heatmap, rows and columns in the order AE, AR, AT, AU, BE, BR, CA, CH, CL, CN, CO, DE, DK, ES, FI, FR, GB, HK, ID, IE, IL, IN, IT, JP, KR, MX, MY, NL, NO, NZ, PE, PH, PL, QA, RU, SA, SE, SG, TH, TR, TW, US, ZA.]
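
Before going further, it can help to look at the shape of what rolling(...).corr() returns, since it is not a single matrix. The sketch below uses random toy returns for three countries and a much shorter window (4 observations, at least 2 required) as stand-ins for the 252 and 126 used above. The `toy_` names and the dates are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.date_range('2019-01-01', periods=6, freq='D')

# Random toy returns for three countries
toy_returns = pd.DataFrame(
    rng.normal(0.0, 0.01, size=(6, 3)),
    index=dates,
    columns=pd.Index(['US', 'FR', 'JP'], name='country'),
)

# Short window so that the toy data is enough to fill it
toy_corr = toy_returns.rolling(4, min_periods=2).corr()

# One 3x3 block per date, stacked under a (date, country) MultiIndex
print(toy_corr.shape)  # (18, 3)

# Selecting one date on the first index level gives that date's matrix
first = toy_corr.xs(dates[0], level=0)  # all NaN: fewer than min_periods points
last = toy_corr.xs(dates[-1], level=0)
print(last)
```

Each date therefore carries its own correlation matrix, and the earliest dates are empty until enough observations are available.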

Now that we have our correlations, we want to reveal the clusters by permuting the rows and columns of the matrix. One way is to convert the correlations into a distance matrix and apply a hierarchical clustering algorithm. Here we use the scipy package, which has all the required functions.

import numpy as np
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, leaves_list

leaves = []

for date in correlations.index.levels[0]:
    # Rolling correlation at a given date, keeping only countries with data
    correlation = correlations.xs(date, level=0).dropna(how='all', axis=1)
    correlation = correlation.loc[correlation.columns]

    # Clustering needs at least two countries
    if correlation.shape[1] < 2:
        continue

    # Compute a distance matrix: 0 for correlation 1, 1 for correlation -1
    dist = squareform(np.sqrt(0.5 * (1 - correlation.clip(-1, 1))), checks=False)

    # Clustering with optimal ordering, to get the permutation
    hierarchy = linkage(dist, optimal_ordering=True)

    # Saving the countries permutation for later use
    leaves_country = correlation.columns[leaves_list(hierarchy)]
    leaves_df = pd.DataFrame(leaves_country).rename(columns={'country': date}).T
    leaves.append(leaves_df)

leaves = pd.concat(leaves, axis=0)
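
To see what the permutation does, here is a small sketch on a hypothetical 4x4 correlation matrix in which A moves with C and B moves with D. The country labels and correlation values are made up. It applies the same distance, sqrt(0.5 * (1 - correlation)), and the same linkage call as above, then reorders rows and columns with the resulting leaves.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import squareform
from scipy.cluster.hierarchy import linkage, leaves_list

# Hypothetical correlations: A and C are close, as are B and D
names = ['A', 'B', 'C', 'D']
toy = pd.DataFrame(
    [[1.0, 0.1, 0.9, 0.2],
     [0.1, 1.0, 0.2, 0.8],
     [0.9, 0.2, 1.0, 0.1],
     [0.2, 0.8, 0.1, 1.0]],
    index=names, columns=names,
)

# Turn correlations into distances and keep the condensed (upper-triangle) form
dist = squareform(np.sqrt(0.5 * (1 - toy.to_numpy())), checks=False)

# Hierarchical clustering; leaves_list gives the new ordering of the labels
order = [names[i] for i in leaves_list(linkage(dist, optimal_ordering=True))]

# Apply the same permutation to rows and columns
reordered = toy.loc[order, order]
print(order)
print(reordered)
```

In the reordered matrix the highly correlated pairs sit next to each other, so the large values gather in blocks along the diagonal. This is the effect we are after in the country heatmap.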

We now have the countries sorted in a meaningful order, and the clusters are clearly visible, the obvious ones being Asian and European countries. Another observation is that developed countries are highly correlated with other developed countries. Could it be because developed countries depend on each other through imports and exports?

[Figure: correlation matrix heatmap after clustering, rows and columns in the order TR, IN, QA, AE, BR, TH, PL, PH, ID, DK, NO, AU, JP, AT, ES, IT, BE, SE, DE, FR, NL, GB, CH, US, CA, IL, FI, IE, SG, KR, TW, CN, HK, ZA, MY, CO, RU, CL, MX, PE, NZ, AR, SA.]

Conclusion

This method of permuting the correlation matrix can also be used for any set of time series, for instance individual stocks. What is interesting here is that by finding clusters using only the data, rather than resorting to a manual classification (DM and EM for countries, or sectors for stocks, for instance), we obtain clusters that capture correlations better.