Stock Data with Python
Different ways to pull stock data
In this notebook, I want to explore different methods to download stock data for analysis. There are several Python libraries for this, so I am exploring a few of them here with some code to help you get started.
Loading Libraries
# General libraries that we will use
import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams["figure.figsize"] = (14,7)
import pandas_datareader
from pandas_datareader import data
pandas_datareader.__version__
Setting start and end date
start_date = '2012-10-01'
end_date = datetime.today().strftime('%Y-%m-%d')
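If you would rather describe the window as "the last N years" than hard-code a start date, both strings can be derived from today's date. This is a minimal sketch using only the standard library; the helper name `date_window` and its `years` and `today` parameters are my own and not part of any library used in this notebook.

```python
from datetime import date, timedelta

def date_window(years, today=None):
    """Return (start, end) as 'YYYY-MM-DD' strings covering roughly `years` years."""
    end = today or date.today()
    # 365.25 days per year is an approximation that absorbs leap years
    start = end - timedelta(days=round(365.25 * years))
    return start.strftime('%Y-%m-%d'), end.strftime('%Y-%m-%d')

# Fixed `today` so the example is reproducible; omit it to use the current date
window_start, window_end = date_window(5, today=date(2020, 7, 2))
print(window_start, window_end)
```

The two strings can be passed wherever `start_date` and `end_date` are used.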
# Tickers I am interested in as a Python List
ticker = ['MSFT','AAPL', 'AMZN']
Making a request to get the data for the specified date range and tickers using the get_data_yahoo() function
dt = data.get_data_yahoo(ticker, start_date, end_date)
dt.head()
dt.tail()
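When several tickers are requested at once, the result has two-level columns, which is why `dt['Adj Close']` returns one column per ticker. The sketch below rebuilds that layout on a synthetic frame so the two common selections can be tried without a download. It assumes the first column level is the price field and the second is the ticker; the frame `demo` and its random values are placeholders, not real prices.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded frame: random numbers, NOT real prices
fields = ['Adj Close', 'Close', 'Volume']
tickers = ['MSFT', 'AAPL', 'AMZN']
columns = pd.MultiIndex.from_product([fields, tickers])
index = pd.date_range('2020-01-01', periods=5, freq='D')
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.random((5, 9)), index=index, columns=columns)

# One field, all tickers: the remaining columns are the tickers
adj_close = demo['Adj Close']
# One ticker, all fields: the remaining columns are the fields
amzn_only = demo.xs('AMZN', axis=1, level=1)

print(adj_close.columns.tolist())
print(amzn_only.columns.tolist())
```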
Let's make a simple plot
# Let pandas create the figure; a separate plt.figure() call would leave an empty extra figure
dt['Adj Close'].plot(figsize=(10,7))
plt.title(f"Adjusted Close Price of {', '.join(ticker)}")
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.grid(which='major')
import yfinance as yf
yfinance has some nice features, such as getting additional information about a specific company, as seen below
amzn = yf.Ticker("AMZN")
amzn.info
Now, let's make a request to pull historical prices for AMZN, using period="max" to indicate the maximum available period range
hist = amzn.history(period="max")
hist.head()
We can also check how many splits Amazon has had in the past using the .splits property
amzn.splits
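To make it clearer what a series of split ratios is used for, here is a small sketch of split adjustment on made-up numbers. It assumes the splits come as a Series of ratios indexed by effective date (for example 2.0 for a 2-for-1 split); `raw_close`, `splits`, and `factor` are hypothetical names, and this is not necessarily how any data provider computes its adjusted prices.

```python
import pandas as pd

# Made-up raw closes with one 2-for-1 split taking effect on the 4th day
idx = pd.date_range('2020-01-01', periods=6, freq='D')
raw_close = pd.Series([100.0, 102.0, 104.0, 52.0, 53.0, 54.0], index=idx)
splits = pd.Series([2.0], index=[idx[3]])

# Put each ratio on its effective date, 1.0 everywhere else
ratios = splits.reindex(idx).fillna(1.0)
# For every date, multiply the ratios of all splits that take effect AFTER it
factor = ratios.iloc[::-1].cumprod().iloc[::-1].shift(-1).fillna(1.0)
# Dividing by the factor puts pre-split prices on the post-split scale
adjusted = raw_close / factor

print(adjusted.tolist())
```

After the adjustment the series no longer shows an artificial 50% drop on the split date.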
We can also specify date ranges using the snippet below:
amzn = yf.download("AMZN", start=start_date, end=end_date)
amzn.head()
amzn.tail()
# plt.figure(figsize=(10,7))
amzn['Adj Close'].plot()
plt.title(f'Adjusted Close Price of AMZN')
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.grid(which='major')
Quandl is great, but it does not have the latest prices, so it is better suited to historical analysis. To use Quandl, you need to register to obtain an `API KEY`. Registration is free.
import quandl
API_KEY = "YOUR API KEY"
amaznq = quandl.get('WIKI/AMZN',
                    start_date=start_date,
                    end_date=end_date,
                    api_key=API_KEY)
amaznq.head()
amaznq.tail()
# plt.figure(figsize=(10,7))
amaznq['Adj. Close'].plot()
plt.title('Adjusted Close Price of AMZN')
plt.ylabel('Price', fontsize=14)
plt.xlabel('Year', fontsize=14)
plt.grid(which='major')
alpha_vantage_key = "YOUR API KEY HERE"
import alpha_vantage
from alpha_vantage.timeseries import TimeSeries
Here I am specifying `output_format` to be `pandas` to get it as a DataFrame.
ts = TimeSeries(key=alpha_vantage_key, output_format='pandas')
Intraday example:
intraday_data, data_info = ts.get_intraday("AMZN", outputsize='full', interval='1min')
data_info
intraday_data.head()
intraday_data.tail()
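The columns come back with numbered labels such as '4. close', which is why that string appears in the cells that follow. If plain names are more convenient, the prefixes can be stripped with a rename. The sketch below runs on a synthetic one-row frame; apart from '4. close', the column labels are my assumption about what the API returns, and the values are made up.

```python
import pandas as pd

# Synthetic frame with Alpha Vantage-style labels (values are made up)
demo = pd.DataFrame(
    [[10.0, 11.0, 9.5, 10.5, 1000]],
    columns=['1. open', '2. high', '3. low', '4. close', '5. volume'],
)

# Keep only the text after the first ". " in each label
clean = demo.rename(columns=lambda c: c.split('. ', 1)[-1])
print(clean.columns.tolist())
```

The rest of this notebook keeps the original labels, so the rename is optional.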
specific_date = '2020-07-02'
intraday_data.filter(like = specific_date, axis=0)['4. close']
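`filter(like=..., axis=0)` keeps the rows whose index label, converted to text, contains the given substring, so passing a date string selects every bar from that day. The sketch below checks this on synthetic minute bars and compares it with partial-string indexing through `.loc`, which is another way to pick out a single day. The frame `demo` and its counter values are placeholders.

```python
import pandas as pd

# Synthetic minute bars spanning two days (values are a counter, not prices)
idx = pd.date_range('2020-07-01 15:58', periods=4, freq='min').append(
    pd.date_range('2020-07-02 09:30', periods=3, freq='min'))
demo = pd.DataFrame({'4. close': range(len(idx))}, index=idx)

by_filter = demo.filter(like='2020-07-02', axis=0)  # substring match on the label text
by_loc = demo.loc['2020-07-02']                     # partial-string date indexing

print(len(by_filter), len(by_loc))
```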
plt.figure(figsize=(10,7))
# Named close_prices rather than data so it does not shadow the pandas_datareader `data` import
close_prices = intraday_data.filter(like=specific_date, axis=0)['4. close']
# roughly equivalent one-liner:
# intraday_data.filter(like=specific_date, axis=0)['4. close'].plot()
plt.plot(close_prices.index, close_prices.values, '-')
plt.title('Intraday Price Movement for AMZN')
plt.ylabel('Price', fontsize=14)
plt.xlabel('Time', fontsize=14)
plt.grid(which='major')
print('starting hour: ', min(close_prices.index.hour))
print('ending hour: ', max(close_prices.index.hour))
intraday_data.filter(like = specific_date, axis=0)
damzn = intraday_data.filter(like = specific_date, axis=0)['4. close'].resample('10 min').ohlc()
damzn.head()
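One thing to keep in mind: because the cell above resamples only the '4. close' column, each 10-minute bar's high and low are the extremes of the minute closes, not of the minute highs and lows. A possible alternative is to aggregate every column with its own rule. The sketch below compares the two on synthetic minute bars; the column labels other than '4. close' are assumed, and all numbers are made up.

```python
import pandas as pd

idx = pd.date_range('2020-07-02 09:30', periods=20, freq='min')
# Synthetic minute bars: close counts up, high/low sit 0.5 either side of it
close = pd.Series([100.0 + i for i in range(20)], index=idx)
minute = pd.DataFrame({
    '1. open': close - 0.25,
    '2. high': close + 0.5,
    '3. low': close - 0.5,
    '4. close': close,
})

# Bars built from the close column only
from_close = minute['4. close'].resample('10min').ohlc()
# Bars built from all four columns, each with its own aggregation
from_all = minute.resample('10min').agg({
    '1. open': 'first', '2. high': 'max', '3. low': 'min', '4. close': 'last',
})

print(from_close.iloc[0].tolist())
print(from_all.iloc[0].tolist())
```

Both approaches give the same close for each bar; they differ in the open, high, and low.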
import plotly.graph_objs as go
import plotly.io as pio
# list of available renderers
pio.renderers
pio.renderers.default = "plotly_mimetype+firefox+notebook"
pio.renderers.default
g = go.Ohlc(x=damzn.index,
            open=damzn['open'],
            high=damzn['high'],
            low=damzn['low'],
            close=damzn['close'])
# Wrap the trace in a Figure and render it
fig = go.Figure(data=[g])
fig.show()
import mplfinance as mpf
dates = pd.date_range(start=start_date, end=end_date)
daily = pd.DataFrame(index=dates)
df_d, metadata = ts.get_daily_adjusted('AMZN', outputsize='full')
metadata
df_d.columns
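# mplfinance looks for Open/High/Low/Close/Volume columns on an ascending DatetimeIndex,
# so strip the Alpha Vantage numeric prefixes ('1. open' -> 'Open') and sort by date
df_d = df_d.rename(columns=lambda c: c.split('. ', 1)[-1].title()).sort_index()
mpf.plot(df_d, type='ohlc', mav=2)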
mpf.plot(df_d)
mpf.plot(df_d, type='candle')
mpf.plot(df_d, type='candle',mav=(3,6,9),volume=True)
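As far as I understand it, the `mav` argument overlays simple moving averages of the closing price, one line per window size. The sketch below reproduces that calculation with pandas on made-up closes, so you can see which values the lines would follow; the name `mavs` and the numbers are illustrative only.

```python
import pandas as pd

# Made-up closing prices
closes = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
windows = (3, 6)

# One simple moving average per window; the first (window - 1) values are NaN
mavs = pd.DataFrame({f'mav{w}': closes.rolling(w).mean() for w in windows})
print(mavs)
```

Longer windows start later and react more slowly, which is why they look smoother on the chart.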