Here’s where to get raw COVID-19 data.

Are you having trouble finding raw data for COVID-19? Do you want to see the current number of positive tests or deaths for the world, for the US as a whole, or by US state? Here it is. It is a published Google spreadsheet, but you cannot download it, and there was no daily historical data when I visited.

https://docs.google.com/spreadsheets/u/2/d/e/2PACX-1vRwAqp96T9sYYq2-i7Tj0pvTf6XVHjDSMIKBdZHXiCGGdNC0ypEU9NbngS8mxea55JuCFuua1MUeOj5/pubhtml#

The web page for this data is here: https://covidtracking.com/data/

You can also download the data as JSON or CSV. Again, there is no daily historical data, so you cannot graph it over time.

More detailed data

If you want more detailed data, here’s a news story: https://towardsdatascience.com/understanding-the-coronavirus-epidemic-data-44d2fb356ecb

This site also has graphs from the data.

And here’s the DXY-2019-nCoV-Data project: https://github.com/BlankerL/DXY-2019-nCoV-Data

This data is updated several times per day by Ding Xiang Yuan and is saved as a CSV file. There are multiple entries for each date, so you must filter out the items you do not want for a given day. Go to the GitHub project and look in the CSV folder, which holds the overall data as a CSV file. WARNING: the file is currently 18MB and contains Chinese characters, which appear to be UTF-16. I was able to open the CSV file in the LibreOffice 6.4 spreadsheet module. Here are the steps:

  1. Start LibreOffice.
  2. Click the Open icon.
  3. In the File Type box choose .csv files.
  4. Open the file you want. Give it 1-2 minutes to open, since a file this large takes a while.
  5. Now you can save it as another file format. This 18MB file saved as a 446KB ODS file.
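
If you would rather load the file with a script than a spreadsheet, here is a minimal Python sketch. It is not part of the DXY project. The function name read_csv_bytes is made up for this example, and the list of encodings to try is an assumption: LibreOffice reported UTF-16 for me, but the file could just as well be UTF-8, so the sketch tries both.

```python
import csv
import io

def read_csv_bytes(data, encodings=("utf-8-sig", "utf-16")):
    """Decode raw CSV bytes with the first encoding that works.

    Returns a list of dicts, one per row, keyed by the header row.
    The candidate encodings are an assumption, not something the
    DXY project documents.
    """
    for enc in encodings:
        try:
            text = data.decode(enc)
        except UnicodeError:
            continue  # wrong guess, try the next encoding
        # newline="" lets the csv module handle line endings itself
        return list(csv.DictReader(io.StringIO(text, newline="")))
    raise ValueError("none of the candidate encodings could decode the data")
```

To use it, read the downloaded file in binary mode (open(path, "rb").read()) and pass the bytes in. Loading 18MB this way keeps the whole file in memory, which should be fine on a typical desktop.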

Notes about the file.

  1. Some of the columns are REALLY wide. I right-clicked the column header to make the column 1″ wide instead.
  2. All the notes appear to be in Chinese and LibreOffice detected them as UTF-16 characters. But LibreOffice seems to handle these just fine.
  3. There are many columns. The raw data starts in column H.
  4. Some columns seem to have JSON data in them.
  5. I hid the Chinese columns since I can’t read them.
  6. Column AI contains the update date and time. Using it, you could probably get daily numbers out of this raw data by keeping one entry per day. The most recent dates are in the top rows.
  7. I put the cursor in cell A2 and did View, Freeze Rows and Columns to freeze the first row.
  8. Hiding columns with a lot of data (even if the columns are not very wide and most of the column is not visible) significantly improves screen redraw performance.
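
Since there are several entries per date, one way to get daily numbers is to keep only the last update for each region on each day. Here is a rough Python sketch of that idea. The column names updateTime and provinceName and the timestamp format are assumptions on my part, so check them against the header row of the CSV file and adjust. The rows are plain dicts, such as those produced by Python's csv.DictReader.

```python
from datetime import datetime

def latest_per_day(rows, time_field="updateTime",
                   key_fields=("provinceName",),
                   fmt="%Y-%m-%d %H:%M:%S"):
    """Keep the most recent row for each (region, calendar day).

    time_field, key_fields and fmt are assumed names and formats;
    change them to match the actual CSV header.
    """
    latest = {}
    for row in rows:
        stamp = datetime.strptime(row[time_field], fmt)
        # One bucket per region per calendar day
        key = tuple(row[f] for f in key_fields) + (stamp.date(),)
        if key not in latest or stamp > latest[key][0]:
            latest[key] = (stamp, row)
    # Return the surviving rows in chronological order
    ordered = sorted(latest.values(), key=lambda pair: pair[0])
    return [row for _, row in ordered]
```

The result has at most one row per region per day, oldest first, which is the shape you need for a graph over time.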

This project has a Python script to analyze the data from the DXY project. https://github.com/jianxu305/nCov2019_analysis