In this notebook, I explore different methods for downloading stock data for analysis. There are several Python libraries for this, so I try a few of them here, with some code to help you get started.

Loading Libraries

# General libraries that we will use
import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt
import numpy as np

plt.rcParams["figure.figsize"] = (14,7)

1. Using pandas_datareader

import pandas_datareader
from pandas_datareader import data 
pandas_datareader.__version__
'0.8.1'

Setting start and end date

start_date = '2012-10-01'
end_date = datetime.today().strftime('%Y-%m-%d')

# Tickers I am interested in as a Python List
ticker = ['MSFT','AAPL', 'AMZN']

Making a request to get the data for the specified date range and tickers using the get_data_yahoo() function

dt = data.get_data_yahoo(ticker, start_date, end_date)
dt.head()
Attributes Adj Close Close High Low Open Volume
Symbols MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN
Date
2012-10-01 24.672274 81.897568 252.009995 29.490000 94.198570 252.009995 29.980000 96.678574 256.160004 29.42 93.785713 250.490005 29.809999 95.879997 255.399994 54042700.0 135898700.0 2581200.0
2012-10-02 24.814495 82.136055 250.600006 29.660000 94.472855 250.600006 29.889999 95.192856 253.149994 29.50 92.949997 249.029999 29.680000 94.544289 252.800003 43338900.0 156998100.0 2195800.0
2012-10-03 24.981831 83.395477 255.919998 29.860001 95.921425 255.919998 29.990000 95.980003 256.100006 29.67 94.661430 249.559998 29.750000 94.980003 251.210007 46655900.0 106070300.0 2745600.0
2012-10-04 25.124054 82.817886 260.470001 30.030001 95.257141 260.470001 30.030001 96.321426 261.519989 29.57 95.078575 255.869995 29.969999 95.892860 256.010010 43634900.0 92681400.0 2700400.0
2012-10-05 24.973461 81.053017 258.510010 29.850000 93.227142 258.510010 30.250000 95.142860 261.899994 29.74 93.040001 257.489990 30.230000 95.028572 261.200012 41133900.0 148501500.0 2806500.0
dt.tail()
Attributes Adj Close Close High Low Open Volume
Symbols MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN MSFT AAPL AMZN
Date
2020-06-26 196.330002 353.630005 2692.870117 196.330002 353.630005 2692.870117 199.889999 365.320007 2782.570068 194.880005 353.019989 2688.000000 199.729996 364.410004 2775.060059 54675800.0 51314200.0 6500800.0
2020-06-29 198.440002 361.779999 2680.379883 198.440002 361.779999 2680.379883 198.529999 362.170013 2696.800049 193.550003 351.279999 2630.080078 195.779999 353.250000 2690.010010 26701600.0 32661500.0 4223400.0
2020-06-30 203.509995 364.799988 2758.820068 203.509995 364.799988 2758.820068 204.399994 365.980011 2769.629883 197.740005 360.000000 2675.030029 197.880005 360.079987 2685.070068 34310300.0 35055800.0 3769700.0
2020-07-01 204.699997 364.109985 2878.699951 204.699997 364.109985 2878.699951 206.350006 367.359985 2895.000000 201.770004 363.910004 2754.000000 203.139999 365.119995 2757.989990 32061200.0 27684300.0 6363400.0
2020-07-02 206.259995 364.109985 2890.300049 206.259995 364.109985 2890.300049 208.020004 370.470001 2955.560059 205.000000 363.640015 2871.100098 205.679993 367.850006 2912.010010 29315800.0 28510400.0 6593400.0
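
As the header rows above show, the frame returned for several tickers has two column levels, Attributes and Symbols. Below is a minimal sketch of how one attribute or one ticker can be pulled out of such a frame. It uses a small synthetic frame (`demo`, built only for illustration from the first two rows above) rather than a live download, so the names `demo`, `adj_close`, `amzn_adj_close` and `amzn_all` are assumptions of this example.

```python
import pandas as pd

# Synthetic stand-in for the downloaded frame (illustration only)
columns = pd.MultiIndex.from_product(
    [["Adj Close", "Close"], ["MSFT", "AAPL", "AMZN"]],
    names=["Attributes", "Symbols"],
)
demo = pd.DataFrame(
    [
        [24.672274, 81.897568, 252.009995, 29.49, 94.198570, 252.009995],
        [24.814495, 82.136055, 250.600006, 29.66, 94.472855, 250.600006],
    ],
    index=pd.to_datetime(["2012-10-01", "2012-10-02"]),
    columns=columns,
)

# One attribute for every ticker: a frame with one column per symbol
adj_close = demo["Adj Close"]

# One attribute for one ticker: a Series
amzn_adj_close = demo["Adj Close"]["AMZN"]

# Every attribute for one ticker: a cross-section on the Symbols level
amzn_all = demo.xs("AMZN", axis=1, level="Symbols")
```

The same selections should apply to `dt` as long as its columns keep the two-level layout shown in the output.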

Let's make a simple plot

# Let pandas create the figure; a separate plt.figure() call would leave an empty figure behind
dt['Adj Close'].plot(figsize=(10,7))
plt.title(f"Adjusted Close Price of {', '.join(ticker)}")
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.grid(which='major')

2. Using yfinance

import yfinance as yf

yfinance has some nice features, such as getting additional information about a specific company as seen below

amzn = yf.Ticker("AMZN")
amzn.info
{'zip': '98109-5210',
 'sector': 'Consumer Cyclical',
 'fullTimeEmployees': 840400,
 'longBusinessSummary': 'Amazon.com, Inc. engages in the retail sale of consumer products and subscriptions in North America and internationally. The company operates through three segments: North America, International, and Amazon Web Services (AWS). It sells merchandise and content purchased for resale from third-party sellers through physical and online stores. The company also manufactures and sells electronic devices, including Kindle, Fire tablets, Fire TVs, Rings, and Echo and other devices; provides Kindle Direct Publishing, an online service that allows independent authors and publishers to make their books available in the Kindle Store; and develops and produces media content. In addition, it offers programs that enable sellers to sell their products on its Websites, as well as its stores; and programs that allow authors, musicians, filmmakers, skill and app developers, and others to publish and sell content. Further, the company provides compute, storage, database, and other AWS services, as well as fulfillment, advertising, publishing, and digital content subscriptions. Additionally, it offers Amazon Prime, a membership program, which provides free shipping of various items; access to streaming of movies and TV episodes; and other services. The company also operates in the food delivery business in Bengaluru, India. It serves consumers, sellers, developers, enterprises, and content creators. The company also has utility-scale solar projects in China, Australia, and the United States. Amazon.com, Inc. was founded in 1994 and is headquartered in Seattle, Washington.',
 'city': 'Seattle',
 'phone': '206-266-1000',
 'state': 'WA',
 'country': 'United States',
 'companyOfficers': [],
 'website': 'http://www.amazon.com',
 'maxAge': 1,
 'address1': '410 Terry Avenue North',
 'industry': 'Internet Retail',
 'previousClose': 2878.7,
 'regularMarketOpen': 2912.01,
 'twoHundredDayAverage': 2164.3306,
 'trailingAnnualDividendYield': None,
 'payoutRatio': 0,
 'volume24Hr': None,
 'regularMarketDayHigh': 2955.56,
 'navPrice': None,
 'averageDailyVolume10Day': 5069900,
 'totalAssets': None,
 'regularMarketPreviousClose': 2878.7,
 'fiftyDayAverage': 2581.8142,
 'trailingAnnualDividendRate': None,
 'open': 2912.01,
 'toCurrency': None,
 'averageVolume10days': 5069900,
 'expireDate': None,
 'yield': None,
 'algorithm': None,
 'dividendRate': None,
 'exDividendDate': None,
 'beta': 1.315499,
 'circulatingSupply': None,
 'startDate': None,
 'regularMarketDayLow': 2871.5,
 'priceHint': 2,
 'currency': 'USD',
 'trailingPE': 138.05406,
 'regularMarketVolume': 6593387,
 'lastMarket': None,
 'maxSupply': None,
 'openInterest': None,
 'marketCap': 1441612300288,
 'volumeAllCurrencies': None,
 'strikePrice': None,
 'averageVolume': 4771333,
 'priceToSalesTrailing12Months': 4.8658075,
 'dayLow': 2871.5,
 'ask': 2886,
 'ytdReturn': None,
 'askSize': 1000,
 'volume': 6593387,
 'fiftyTwoWeekHigh': 2955.56,
 'forwardPE': 77.55031,
 'fromCurrency': None,
 'fiveYearAvgDividendYield': None,
 'fiftyTwoWeekLow': 1626.03,
 'bid': 2881.19,
 'tradeable': False,
 'dividendYield': None,
 'bidSize': 1200,
 'dayHigh': 2955.56,
 'exchange': 'NMS',
 'shortName': 'Amazon.com, Inc.',
 'longName': 'Amazon.com, Inc.',
 'exchangeTimezoneName': 'America/New_York',
 'exchangeTimezoneShortName': 'EDT',
 'isEsgPopulated': False,
 'gmtOffSetMilliseconds': '-14400000',
 'quoteType': 'EQUITY',
 'symbol': 'AMZN',
 'messageBoardId': 'finmb_18749',
 'market': 'us_market',
 'annualHoldingsTurnover': None,
 'enterpriseToRevenue': 4.963,
 'beta3Year': None,
 'profitMargins': 0.03565,
 'enterpriseToEbitda': 40.542,
 '52WeekChange': 0.48044384,
 'morningStarRiskRating': None,
 'forwardEps': 37.27,
 'revenueQuarterlyGrowth': None,
 'sharesOutstanding': 498776000,
 'fundInceptionDate': None,
 'annualReportExpenseRatio': None,
 'bookValue': 130.806,
 'sharesShort': 3595113,
 'sharesPercentSharesOut': 0.0072000003,
 'fundFamily': None,
 'lastFiscalYearEnd': 1577750400,
 'heldPercentInstitutions': 0.57689,
 'netIncomeToCommon': 10561999872,
 'trailingEps': 20.936,
 'lastDividendValue': None,
 'SandP52WeekChange': 0.051768422,
 'priceToBook': 22.09608,
 'heldPercentInsiders': 0.15122,
 'nextFiscalYearEnd': 1640908800,
 'mostRecentQuarter': 1585612800,
 'shortRatio': 0.9,
 'sharesShortPreviousMonthDate': 1589500800,
 'floatShares': 423016940,
 'enterpriseValue': 1470471340032,
 'threeYearAverageReturn': None,
 'lastSplitDate': 936230400,
 'lastSplitFactor': '2:1',
 'legalType': None,
 'morningStarOverallRating': None,
 'earningsQuarterlyGrowth': -0.288,
 'dateShortInterest': 1592179200,
 'pegRatio': 4.51,
 'lastCapGain': None,
 'shortPercentOfFloat': 0.0085,
 'sharesShortPriorMonth': 3444687,
 'category': None,
 'fiveYearAverageReturn': None,
 'regularMarketPrice': 2912.01,
 'logo_url': 'https://logo.clearbit.com/amazon.com'}
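
The output above is a plain Python dictionary, and many of its fields are `None` for a given ticker. Here is a minimal sketch of one way to keep only the populated fields and read a value safely. The dictionary `info` holds a handful of entries copied from the output above, and the names `populated` and `market_cap_billions` are introduced only for this illustration.

```python
# A few entries copied from the output above (illustration only)
info = {
    "sector": "Consumer Cyclical",
    "marketCap": 1441612300288,
    "trailingPE": 138.05406,
    "dividendYield": None,
    "navPrice": None,
}

# Keep only the fields that actually carry a value
populated = {key: value for key, value in info.items() if value is not None}

# .get() avoids a KeyError when a field is missing for a given ticker
market_cap_billions = info.get("marketCap", 0) / 1e9

print(sorted(populated))
print(f"Market cap: {market_cap_billions:,.1f}B USD")
```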

Now, let's make a request to pull historical prices for AMZN, using period="max" to request the maximum available date range

hist = amzn.history(period="max")
hist.head()
Open High Low Close Volume Dividends Stock Splits
Date
1997-05-15 2.44 2.50 1.93 1.96 72156000 0 0.0
1997-05-16 1.97 1.98 1.71 1.73 14700000 0 0.0
1997-05-19 1.76 1.77 1.62 1.71 6106800 0 0.0
1997-05-20 1.73 1.75 1.64 1.64 5467200 0 0.0
1997-05-21 1.64 1.65 1.38 1.43 18853200 0 0.0

We can also check how many stock splits Amazon has had in the past using the .splits property

amzn.splits
Date
1998-06-02    2.0
1999-01-05    3.0
1999-09-02    2.0
Name: Stock Splits, dtype: float64
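
Since the result is a pandas Series of split ratios, the count and the combined effect of the splits can be derived with ordinary Series methods. The sketch below rebuilds the Series from the values printed above instead of calling yfinance. The variable names are assumptions of this example.

```python
import pandas as pd

# Split ratios copied from the output above
splits = pd.Series(
    [2.0, 3.0, 2.0],
    index=pd.to_datetime(["1998-06-02", "1999-01-05", "1999-09-02"]),
    name="Stock Splits",
)

number_of_splits = len(splits)

# Multiplying the ratios gives how many shares one pre-split share became
cumulative_factor = splits.prod()

# Running factor after each successive split
running_factor = splits.cumprod()

print(number_of_splits, cumulative_factor)
```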

We can also specify date ranges using the snippet below:

amzn = yf.download("AMZN", start=start_date, end=end_date)
amzn.head()
[*********************100%***********************]  1 of 1 completed
Open High Low Close Adj Close Volume
Date
2012-10-01 255.399994 256.160004 250.490005 252.009995 252.009995 2581200
2012-10-02 252.800003 253.149994 249.029999 250.600006 250.600006 2195800
2012-10-03 251.210007 256.100006 249.559998 255.919998 255.919998 2745600
2012-10-04 256.010010 261.519989 255.869995 260.470001 260.470001 2700400
2012-10-05 261.200012 261.899994 257.489990 258.510010 258.510010 2806500
amzn.tail()
Open High Low Close Adj Close Volume
Date
2020-06-26 2775.060059 2782.570068 2688.000000 2692.870117 2692.870117 6500800
2020-06-29 2690.010010 2696.800049 2630.080078 2680.379883 2680.379883 4223400
2020-06-30 2685.070068 2769.629883 2675.030029 2758.820068 2758.820068 3769700
2020-07-01 2757.989990 2895.000000 2754.000000 2878.699951 2878.699951 6363400
2020-07-02 2912.010010 2955.560059 2871.100098 2890.300049 2890.300049 6593400
# plt.figure(figsize=(10,7))
amzn['Adj Close'].plot()
plt.title(f'Adjusted Close Price of AMZN')
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.grid(which='major')

3. Using Quandl

Quandl is great for historical analysis, but the dataset used here does not include the latest prices (the WIKI data below ends in March 2018). To use Quandl you need to register to obtain an `API KEY`. Registration is free.

import quandl
API_KEY = "YOUR API KEY"
amaznq = quandl.get('WIKI/'+"AMZN", 
                    start_date=start_date, 
                    end_date=end_date, 
                    api_key=API_KEY)
amaznq.head()
Open High Low Close Volume Ex-Dividend Split Ratio Adj. Open Adj. High Adj. Low Adj. Close Adj. Volume
Date
2012-10-01 255.40 256.16 250.49 252.01 2581200.0 0.0 1.0 255.40 256.16 250.49 252.01 2581200.0
2012-10-02 252.80 253.15 249.03 250.60 2195800.0 0.0 1.0 252.80 253.15 249.03 250.60 2195800.0
2012-10-03 251.21 256.10 249.56 255.92 2745600.0 0.0 1.0 251.21 256.10 249.56 255.92 2745600.0
2012-10-04 256.01 261.52 255.87 260.47 2700400.0 0.0 1.0 256.01 261.52 255.87 260.47 2700400.0
2012-10-05 261.20 261.90 257.49 258.51 2806500.0 0.0 1.0 261.20 261.90 257.49 258.51 2806500.0
amaznq.tail()
Open High Low Close Volume Ex-Dividend Split Ratio Adj. Open Adj. High Adj. Low Adj. Close Adj. Volume
Date
2018-03-21 1586.45 1590.00 1563.17 1581.86 4667291.0 0.0 1.0 1586.45 1590.00 1563.17 1581.86 4667291.0
2018-03-22 1565.47 1573.85 1542.40 1544.10 6177737.0 0.0 1.0 1565.47 1573.85 1542.40 1544.10 6177737.0
2018-03-23 1539.01 1549.02 1495.36 1495.56 7843966.0 0.0 1.0 1539.01 1549.02 1495.36 1495.56 7843966.0
2018-03-26 1530.00 1556.99 1499.25 1555.86 5547618.0 0.0 1.0 1530.00 1556.99 1499.25 1555.86 5547618.0
2018-03-27 1572.40 1575.96 1482.32 1497.05 6793279.0 0.0 1.0 1572.40 1575.96 1482.32 1497.05 6793279.0
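
The tail above stops at 2018-03-27 even though the requested end date was later. A quick way to notice this kind of gap is to compare the last index date with the date you expected. The sketch below uses a hypothetical helper, `days_behind`, and a synthetic frame holding the last two dates from the tail above. Neither is part of the quandl library.

```python
import pandas as pd


def days_behind(frame, as_of):
    """Return how many calendar days the last row lags behind `as_of`."""
    last_date = frame.index.max()
    return (pd.Timestamp(as_of) - last_date).days


# Synthetic stand-in with the last two dates from the tail above
demo = pd.DataFrame(
    {"Adj. Close": [1555.86, 1497.05]},
    index=pd.to_datetime(["2018-03-26", "2018-03-27"]),
)

lag = days_behind(demo, "2020-07-02")
print(f"Last row is {lag} days older than the requested end date")
```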
# plt.figure(figsize=(10,7))
amaznq['Adj. Close'].plot()
plt.title('Adjusted Close Price of AMZN')
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.grid(which='major')

4. Using alpha_vantage

Of the libraries covered here, I found this one to be the most comprehensive and flexible. It may not be as simple as the others, but it offers many great features. You will also need to register (it is free) to obtain an API KEY.

alpha_vantage_key = "YOUR API KEY HERE"
import alpha_vantage
from alpha_vantage.timeseries import TimeSeries

Here I am specifying `output_format` as `pandas` to get the result as a DataFrame.

ts = TimeSeries(key=alpha_vantage_key, output_format='pandas')

Intraday example:

intraday_data, data_info = ts.get_intraday("AMZN", outputsize='full', interval='1min')
data_info
{'1. Information': 'Intraday (1min) open, high, low, close prices and volume',
 '2. Symbol': 'AMZN',
 '3. Last Refreshed': '2020-07-02 19:54:00',
 '4. Interval': '1min',
 '5. Output Size': 'Full size',
 '6. Time Zone': 'US/Eastern'}
intraday_data.head()
1. open 2. high 3. low 4. close 5. volume
date
2020-07-02 19:54:00 2884.37 2885.00 2884.37 2885.00 856.0
2020-07-02 19:53:00 2887.29 2887.29 2885.95 2886.01 529.0
2020-07-02 19:52:00 2889.00 2889.00 2887.51 2887.51 466.0
2020-07-02 19:51:00 2889.00 2889.00 2889.00 2889.00 250.0
2020-07-02 19:16:00 2890.50 2890.50 2890.50 2890.50 156.0
intraday_data.tail()
1. open 2. high 3. low 4. close 5. volume
date
2020-06-29 05:57:00 2700.07 2700.07 2700.00 2700.00 419.0
2020-06-29 05:55:00 2700.00 2700.00 2700.00 2700.00 268.0
2020-06-29 05:51:00 2699.99 2699.99 2699.99 2699.99 200.0
2020-06-29 04:13:00 2700.50 2700.50 2700.50 2700.50 350.0
2020-06-29 04:01:00 2700.10 2700.10 2700.10 2700.10 126.0
specific_date = '2020-07-02'
intraday_data.filter(like = specific_date, axis=0)['4. close']
date
2020-07-02 19:54:00    2885.00
2020-07-02 19:53:00    2886.01
2020-07-02 19:52:00    2887.51
2020-07-02 19:51:00    2889.00
2020-07-02 19:16:00    2890.50
                        ...   
2020-07-02 04:30:00    2899.15
2020-07-02 04:27:00    2900.98
2020-07-02 04:07:00    2896.56
2020-07-02 04:05:00    2896.76
2020-07-02 04:03:00    2896.74
Name: 4. close, Length: 524, dtype: float64
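
`filter(like=...)` matches the date as a substring of each index label. An alternative, sketched here under the assumption that the index is a DatetimeIndex (as the output suggests), is to sort the rows chronologically and select the day with a partial date string. The frame `intraday_demo` is synthetic, with closes copied from the output above.

```python
import pandas as pd

# Synthetic newest-first intraday closes (values copied from the output above)
intraday_demo = pd.DataFrame(
    {"4. close": [2885.00, 2886.01, 2896.74, 2700.00]},
    index=pd.to_datetime(
        [
            "2020-07-02 19:54:00",
            "2020-07-02 19:53:00",
            "2020-07-02 04:03:00",
            "2020-06-29 05:57:00",
        ]
    ),
)
intraday_demo.index.name = "date"

# Put the rows in chronological order, then select one calendar day by string
chronological = intraday_demo.sort_index()
one_day = chronological.loc["2020-07-02"]["4. close"]

print(one_day)
```

Sorting first also means later steps such as plotting and resampling see the data oldest-first.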
plt.figure(figsize=(10,7))
# Named `closes` so the pandas_datareader `data` module imported earlier is not overwritten
closes = intraday_data.filter(like=specific_date, axis=0)['4. close']
# Alternatively, plot directly with pandas:
# intraday_data.filter(like=specific_date, axis=0)['4. close'].plot()
plt.plot(closes.index, closes.values, '-')
plt.title('Intraday Price Movement for AMZN')
plt.ylabel('Price', fontsize=14)
plt.xlabel('Time', fontsize=14)
plt.grid(which='major')
print('starting hour: ', min(closes.index.hour))
print('ending hour: ', max(closes.index.hour))
starting hour:  4
ending hour:  19
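
The 4-to-19 hour range indicates that the feed includes pre-market and after-hours trades. If only the regular session is of interest, pandas' `between_time` can trim the series. The sketch below assumes a regular session of 09:30 to 16:00 in the index's own clock time (the metadata above reports US/Eastern) and uses synthetic placeholder prices.

```python
import pandas as pd

# Synthetic closes spanning pre-market, regular and after-hours times
# (placeholder prices, not real quotes)
closes_demo = pd.Series(
    [100.0, 101.0, 102.0, 103.0],
    index=pd.to_datetime(
        [
            "2020-07-02 04:03:00",
            "2020-07-02 09:30:00",
            "2020-07-02 16:00:00",
            "2020-07-02 19:54:00",
        ]
    ),
    name="4. close",
)

# Assumed regular session: 09:30 to 16:00, both endpoints included
regular_session = closes_demo.between_time("09:30", "16:00")

print(regular_session)
```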
intraday_data.filter(like = specific_date, axis=0)
1. open 2. high 3. low 4. close 5. volume
date
2020-07-02 19:54:00 2884.37 2885.00 2884.37 2885.00 856.0
2020-07-02 19:53:00 2887.29 2887.29 2885.95 2886.01 529.0
2020-07-02 19:52:00 2889.00 2889.00 2887.51 2887.51 466.0
2020-07-02 19:51:00 2889.00 2889.00 2889.00 2889.00 250.0
2020-07-02 19:16:00 2890.50 2890.50 2890.50 2890.50 156.0
... ... ... ... ... ...
2020-07-02 04:30:00 2899.15 2899.15 2899.15 2899.15 553.0
2020-07-02 04:27:00 2900.98 2900.98 2900.98 2900.98 121.0
2020-07-02 04:07:00 2896.56 2896.56 2896.56 2896.56 164.0
2020-07-02 04:05:00 2896.76 2896.76 2896.76 2896.76 103.0
2020-07-02 04:03:00 2896.74 2896.74 2896.74 2896.74 286.0

524 rows × 5 columns

damzn = intraday_data.filter(like = specific_date, axis=0)['4. close'].resample('10 min').ohlc()
damzn.head()
open high low close
date
2020-07-02 04:00:00 2896.74 2896.76 2896.56 2896.56
2020-07-02 04:10:00 NaN NaN NaN NaN
2020-07-02 04:20:00 2900.98 2900.98 2900.98 2900.98
2020-07-02 04:30:00 2899.15 2899.15 2898.00 2898.00
2020-07-02 04:40:00 NaN NaN NaN NaN
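
The NaN rows above are ten-minute bins in which no trade was recorded. Here is a minimal sketch of one way to handle them and to carry volume along with the prices, using synthetic one-minute bars with placeholder values. The names `bars` and `resampled` are assumptions of this example.

```python
import pandas as pd

# Synthetic one-minute bars with a gap between 04:07 and 04:27 (placeholder values)
bars = pd.DataFrame(
    {
        "4. close": [10.0, 12.0, 11.0, 15.0],
        "5. volume": [100.0, 200.0, 300.0, 400.0],
    },
    index=pd.to_datetime(
        [
            "2020-07-02 04:03:00",
            "2020-07-02 04:05:00",
            "2020-07-02 04:07:00",
            "2020-07-02 04:27:00",
        ]
    ),
)

# Open/high/low/close of the close price per ten-minute bin
resampled = bars["4. close"].resample("10min").ohlc()

# Volume is additive, so it is summed per bin
resampled["volume"] = bars["5. volume"].resample("10min").sum()

# Bins without any trades have NaN prices; drop them before charting
resampled = resampled.dropna(subset=["open", "high", "low", "close"])

print(resampled)
```

Whether to drop empty bins or forward-fill them depends on the chart; dropping keeps only intervals that actually traded.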

Visualizations:

1. Using Plotly

import plotly.graph_objs as go
import plotly.io as pio
# list of available renderers
pio.renderers 
Renderers configuration
-----------------------
    Default renderer: 'plotly_mimetype+notebook'
    Available renderers:
        ['plotly_mimetype', 'jupyterlab', 'nteract', 'vscode',
         'notebook', 'notebook_connected', 'kaggle', 'azure', 'colab',
         'cocalc', 'databricks', 'json', 'png', 'jpeg', 'jpg', 'svg',
         'pdf', 'browser', 'firefox', 'chrome', 'chromium', 'iframe',
         'iframe_connected', 'sphinx_gallery']
pio.renderers.default = "plotly_mimetype+firefox+notebook"
pio.renderers.default
'plotly_mimetype+firefox+notebook'
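g = go.Ohlc(x=damzn.index,
            open=damzn['open'],
            high=damzn['high'],
            low=damzn['low'],
            close=damzn['close'])
# Wrap the trace in a figure and render it with the renderer selected above
fig = go.Figure(data=[g])
fig.show()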

2. Using mplfinance (matplotlib finance)

mplfinance comes with great options for creating candlestick and OHLC type charts out of the box.

import mplfinance as mpf
dates = pd.date_range(start=start_date, end=end_date)
daily = pd.DataFrame(index=dates)
df_d, metadata = ts.get_daily_adjusted('AMZN', outputsize='full')
metadata
{'1. Information': 'Daily Time Series with Splits and Dividend Events',
 '2. Symbol': 'AMZN',
 '3. Last Refreshed': '2020-07-02',
 '4. Output Size': 'Full size',
 '5. Time Zone': 'US/Eastern'}
df_d.columns
Index(['Open', 'High', 'Low', 'Close', 'Volume'], dtype='object')
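
The column listing above shows Open, High, Low, Close and Volume, while the alpha_vantage frames printed earlier use prefixed names such as '1. open'. The renaming step is not shown in the notebook. The sketch below is one possible way to do it, assuming the daily frame uses the same "N. name" pattern and arrives newest-first like the intraday data. `to_mplfinance_frame` is a hypothetical helper, and `raw` is synthetic.

```python
import re

import pandas as pd


def to_mplfinance_frame(frame):
    """Strip the 'N. ' prefix from column names and order rows oldest-first."""
    renamed = frame.rename(
        columns=lambda name: re.sub(r"^\d+\.\s*", "", name).title()
    )
    return renamed[["Open", "High", "Low", "Close", "Volume"]].sort_index()


# Synthetic newest-first frame with prefixed column names (placeholder values)
raw = pd.DataFrame(
    {
        "1. open": [3.0, 1.0],
        "2. high": [4.0, 2.0],
        "3. low": [2.5, 0.5],
        "4. close": [3.5, 1.5],
        "5. volume": [300.0, 100.0],
    },
    index=pd.to_datetime(["2020-07-02", "2020-07-01"]),
)

clean = to_mplfinance_frame(raw)
print(clean.columns.tolist())
```

mplfinance expects a DatetimeIndex and columns named Open, High, Low, Close (and Volume when `volume=True`), which is what this reshaping aims for.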
mpf.plot(df_d,type='ohlc',mav=2)
mpf.plot(df_d)
mpf.plot(df_d, type='candle')
mpf.plot(df_d, type='candle',mav=(3,6,9),volume=True)
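
The `mav` argument overlays moving averages with the given window lengths on the chart. As a rough picture of what each line represents, a simple moving average can be computed with a pandas rolling mean. This is a sketch of the idea on placeholder numbers, not mplfinance's internal code.

```python
import pandas as pd

# Placeholder closing prices
closes = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], name="Close")

# One simple moving average per window length, mirroring mav=(3, 6)
moving_averages = pd.DataFrame(
    {f"SMA {window}": closes.rolling(window).mean() for window in (3, 6)}
)

print(moving_averages)
```

Each average is undefined (NaN) until a full window of prices is available, which is why longer windows start later on the chart.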