Fix get_data_yahoo from Pandas Datareader

What will we cover?

If you use get_data_yahoo from Pandas Datareader and it has suddenly stopped working, this post shows how to fix it.

The Error and Problem

Consider this code.

import pandas_datareader as pdr
from datetime import datetime

data = pdr.get_data_yahoo('^GSPC', datetime(1970, 1, 1))

It has been working up until now, but suddenly it fails with the following error.

Traceback (most recent call last):
  File "/Users/rune/PycharmProjects/TEST/test_yahoo.py", line 4, in <module>
    data = pdr.get_data_yahoo('^GSPC', datetime(1970, 1, 1))
  File "/Users/rune/PycharmProjects/TEST/venv/lib/python3.8/site-packages/pandas_datareader/data.py", line 86, in get_data_yahoo
    return YahooDailyReader(*args, **kwargs).read()
  File "/Users/rune/PycharmProjects/TEST/venv/lib/python3.8/site-packages/pandas_datareader/base.py", line 253, in read
    df = self._read_one_data(self.url, params=self._get_params(self.symbols))
  File "/Users/rune/PycharmProjects/TEST/venv/lib/python3.8/site-packages/pandas_datareader/yahoo/daily.py", line 153, in _read_one_data
    resp = self._get_response(url, params=params)
  File "/Users/rune/PycharmProjects/TEST/venv/lib/python3.8/site-packages/pandas_datareader/base.py", line 181, in _get_response
    raise RemoteDataError(msg)
pandas_datareader._utils.RemoteDataError: Unable to read URL: https://finance.yahoo.com/quote/^GSPC/history?period1=10800&period2=1627523999&interval=1d&frequency=1d&filter=history
Response Text:
b'<!DOCTYPE html>\n  <html lang="en-us"><head>\n  <meta http-equiv="content-type" content="text/html; charset=UTF-8">\n      <meta charset="utf-8">\n      <title>Yahoo</title>\n      <meta name="viewport" content="width=device-width,initial-scale=1,minimal-ui">\n      <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">\n      <style>\n  html {\n      height: 100%;\n  }\n  body {\n      background: #fafafc url(https://s.yimg.com/nn/img/sad-panda-201402200631.png) 50% 50%;\n      background-size: cover;\n      height: 100%;\n      text-align: center;\n      font: 300 18px "helvetica neue", helvetica, verdana, tahoma, arial, sans-serif;\n  }\n  table {\n      height: 100%;\n      width: 100%;\n      table-layout: fixed;\n      border-collapse: collapse;\n      border-spacing: 0;\n      border: none;\n  }\n  h1 {\n      font-size: 42px;\n      font-weight: 400;\n      color: #400090;\n  }\n  p {\n      color: #1A1A1A;\n  }\n  #message-1 {\n      font-weight: bold;\n      margin: 0;\n  }\n  #message-2 {\n      display: inline-block;\n      *display: inline;\n      zoom: 1;\n      max-width: 17em;\n      _width: 17em;\n  }\n      </style>\n  <script>\n    document.write(\'<img src="//geo.yahoo.com/b?s=1197757129&t=\'+new Date().getTime()+\'&src=aws&err_url=\'+encodeURIComponent(document.URL)+\'&err=%<pssc>&test=\'+encodeURIComponent(\'%<{Bucket}cqh[:200]>\')+\'" width="0px" height="0px"/>\');var beacon = new Image();beacon.src="//bcn.fp.yahoo.com/p?s=1197757129&t="+new Date().getTime()+"&src=aws&err_url="+encodeURIComponent(document.URL)+"&err=%<pssc>&test="+encodeURIComponent(\'%<{Bucket}cqh[:200]>\');\n  </script>\n  </head>\n  <body>\n  <!-- status code : 404 -->\n  <!-- Not Found on Server -->\n  <table>\n  <tbody><tr>\n      <td>\n      <img src="https://s.yimg.com/rz/p/yahoo_frontpage_en-US_s_f_p_205x58_frontpage.png" alt="Yahoo Logo">\n      <h1 style="margin-top:20px;">Will be right back...</h1>\n      <p id="message-1">Thank you for your 
patience.</p>\n      <p id="message-2">Our engineers are working quickly to resolve the issue.</p>\n      </td>\n  </tr>\n  </tbody></table>\n  </body></html>'
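
Before looking at the fix, it can help to read the URL in the error message. The sketch below assumes that period1 and period2 are Unix timestamps in seconds; the error does not say so, but the values fit the 1970 start date in the code and the date the request was made. The helper name to_utc is only for illustration.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit

# URL copied from the error message above
url = ("https://finance.yahoo.com/quote/^GSPC/history"
       "?period1=10800&period2=1627523999&interval=1d&frequency=1d&filter=history")

# Split off the query string and parse it into a dict of lists
params = parse_qs(urlsplit(url).query)


def to_utc(seconds):
    """Interpret a value as seconds since the Unix epoch (an assumption)."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc)


start = to_utc(params["period1"][0])
end = to_utc(params["period2"][0])
print(start.isoformat())  # 1970-01-01T03:00:00+00:00
print(end.isoformat())    # 2021-07-29T01:59:59+00:00
```

Under that assumption, the requested range matches what the code asked for (offset by a few hours, presumably from time zone handling), which suggests the request was built as intended and the 404 page comes from the address being requested.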

What to do?

The fix

There has been a breaking change and you need to update your Pandas Datareader.

You can upgrade to the newest version as follows.

pip install pandas_datareader --upgrade

This should install version 0.10.0 or later.
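
To confirm which version ended up installed, a small check like the one below can be used. It is a minimal sketch: it relies on importlib.metadata from the standard library (Python 3.8 or newer), and the helper names parse_version and is_new_enough are made up for this example. The simple parser only handles plain numeric versions such as 0.10.0.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_VERSION = (0, 10, 0)  # the version mentioned in this post


def parse_version(text):
    """Turn a string like '0.10.0' into a tuple like (0, 10, 0)."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # pad '0.10' to (0, 10, 0)
    return tuple(parts)


def is_new_enough(installed, minimum=MIN_VERSION):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= minimum


try:
    installed = version("pandas-datareader")
    status = "OK" if is_new_enough(installed) else "needs upgrade"
    print(installed, status)
except PackageNotFoundError:
    print("pandas-datareader is not installed")
```

If it prints "needs upgrade", the pip command above was probably run against a different Python environment than the one running the script.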

Then the code should work again.

5 Replies to “Fix get_data_yahoo from Pandas Datareader”

  1. My Python script uses get_data_yahoo() with pandas-DataReader v10, pandas v1.3.3, and Python 3.7.10. I tail the result to show the last 5 days. When the script first starts, it shows the last 5 days correctly. After about a minute or two, the output changes to what you see below: the dates move back 2 weeks and the volume for the previous 4 days is off by a lot. The current day is always correct.

    I run my script on linux using the “watch -n1

    Code below
    —————————————–
    import pandas as pd
    import pandas_datareader as pdr

    def get_data(ticker):
        # Retrieve stock information from Yahoo Finance
        return pdr.get_data_yahoo(ticker)

    # tickers is the list of symbols defined elsewhere in the script
    for ticker in tickers:
        df = get_data(ticker)
        df = pd.DataFrame(df.Close)
        print(df.tail(5))
    ——————————————–

    #1 — works correctly Close
    Date Close
    2021-08-15 11.2500
    2021-09-16 9.9600
    2021-09-17 11.0600
    2021-09-20 11.7300
    2021-09-21 13.7001

    #2 — doesn’t work correctly Close
    Date Close
    2021-08-31 5.9600
    2021-09-01 7.4000
    2021-09-02 6.4900
    2021-09-03 6.7000
    2021-09-21 13.7001

    Can anyone help me?

    1. Hi Frank,

      It looks like you have a list of tickers that you iterate over. You get the data for each of them and print the last rows of the retrieved data.

      The line taking Close in a DataFrame is maybe not needed?

      What is the goal here? To see the last five days of the specific ticker in each iteration?

      If you share the list of tickers you are looking at and the full code I can take a look at it.

  2. I am getting the closing price and volume for one stock using the pandas-datareader get_data_yahoo() function. When I first run it, the output looks like the first photo. I use “watch -n1 ” to get updates every 1 to 5 seconds, so that I get live updates. After some time, the data coming back jumps back to September 09, 2021 – August 31, 2021, like the second photo, and the volume numbers don’t make any sense: they go into the billions. I have tried everything: different versions of python3, pip3, pandas, and pandas-datareader. Why does it keep going back to the previous two weeks, and why is my volume wrong except for the current day? It only took 6 minutes for it to change to photo two.

    https://photos.app.goo.gl/Yh7kCXUJQWq1UXKx8

    1. I think this will do what you want:


      import pandas_datareader as pdr
      from datetime import datetime, timedelta

      # Repeatedly fetch and print the last 7 days of AAPL data
      while True:
          df = pdr.get_data_yahoo('AAPL', datetime.utcnow() - timedelta(days=7))
          print(df)
