Building a Covid-19 Case Tracker by US State and County - Part 1
Introduction
In this series of articles we will use the New York Times Covid-19 data to create a case tracker, which will display the data as a time series. Users will be able to view data for the US as a whole, or narrow it down by state and by county. The complete source code is available on GitHub - covid-19-us and a live version is hosted at covid19-us.netlify.app. We will use React and Typescript along with other libraries for UI and chart visualization. If you have not used Typescript before, these articles will also introduce you to some of the fundamentals and more advanced concepts of the language.
Taking a look at the data
The data in the New York Times Covid-19 dateset is hosted in 3 csv file - us.csv, us-states.csv and us-counties-recent.csv. We can see each of these files in its raw form by clicking on the Raw
button at the top right corner.
US Data
Looking at the raw US data, we can see the following:
date,cases,deaths
2020-01-21,1,0
2020-01-22,1,0
2020-01-23,1,0
2020-01-24,2,0
2020-01-25,3,0
2020-01-26,5,0
2020-01-27,5,0
...
The file consists of 3 columns, where the first row contains the column names: date
, cases
and deaths
. The subsequent rows contain values for each the three columns.
State Data
The US State data contains two additional columns: state
and fips
. A FIPS code is a geographic identifier that we will not use for this project.
date,state,fips,cases,deaths
...
2020-02-06,Washington,53,1,0
2020-02-06,Wisconsin,55,1,0
2020-02-07,Arizona,04,1,0
2020-02-07,California,06,6,0
...
County Data
The US County Data contains a county
column in addition to all the columns found in the state data:
date,county,state,fips,cases,deaths
...
2020-02-01,Orange,California,06059,1,0
2020-02-01,Santa Clara,California,06085,1,0
2020-02-01,Cook,Illinois,17031,2,0
2020-02-01,Suffolk,Massachusetts,25025,1,0
2020-02-01,Snohomish,Washington,53061,1,0
...
Parsing the data
We will use Papaprse to parse the csv data directly in the browser. For our first attempt we'll render the US state data in the browser using innerHTML
, so we can see can see how parsing works. You can head on to the codesandbox to see the result.
The shape of the data
The first thing you'll notice in the example is a Typescript interface:
interface StateCaseData {
date: string;
state: string;
cases: number;
deaths: number;
}
An interface defines the shape of some data, regardless of the source. In this instance, we are receiving the data from a CSV file that is downloaded and then parsed. Having looked at the raw CSV containing the US state data, we can assume that the data will be parsed into an array of objects that conform to the StateCaseData
interface.
The parse result
Once we have imported Papaparse:
import ParseCSV from "papaparse";
we can call the parse method, providing the URL for the US State data csv and an object containing additional arguments:
ParseCSV.parse(US_CSV_URL, {
download: true,
header: true,
dynamicTyping: true,
complete: parseResult => {
console.log("errors: ", parseResult.errors);
const data = parseResult.data.filter(isStateCaseData);
renderStateData(data);
}
});
These are the arguments contained in the object:
download
: indiates that the file needs to be downloadedheader
: indicates that the first row in the file contains the column namesdynamicTyping
: indicates that we'd like Papaparse to convert numbers and booleans to their respective types, rather than converting them to strings.complete
: a callback that describes what we'd like to do once the parsing is done. Paparse will invoke this callback with a results object which contains adata
property - an an array of objects keyed by the field name. It also contains anerrors
property, an array of string. At at the moment of writing this array is empty, meaning there are no errors parsing the CSV.
A few words about parsing the data
property in the parseResult
. Earlier I said we can assume that each object in the data parsed from the CSV will conform to the StateCaseData
interface. But we don't want to assume that the data will always be correct. In fact, the type of parseResult.data
is defined as any[]
- an array where the elements may be of any type whatsover. What if some of the elements in this array are missing a cases
or state
property? What we want is to only keep those objects that actually confrom to the StateCaseData
interface, and discard the rest. We can do this by defining a function called isStateCaseData
that, when given an element as an argument, returns true
if it conforms to the StateCaseData
interface and false
otherwise:
function isStateCaseData(dataElem: any): data is StateCaseData {
return (
!!dataElem &&
isString(dataElem.date) &&
isString(dataElem.state) &&
isNumber(dataElem.cases) &&
isNumber(dataElem.deaths)
);
}
const isString = (value: any): value is String => typeof value === "string";
const isNumber = (value: any): value is Number => typeof value === "number";
The return type of isStateCaseData
is a type predicate. This function is in fact a type guard - a function that returns a boolean indicating if the argument passed to it is of the stated type. To determine if the type dataElem
is StateCaseData
or not, we need check the types of each of its properties: date
and state
must be strings; cases
and deaths
must be numbers. These type checks are performed via their own type guards: isString
and isNumber
. But first we need to verify that the argument is not null
or undefined
- if that is the case, trying to access any property will result in a runtime error. We verify this by using double bangs, which will coerce the argument to a boolean. If the argument is null
or undefined
, it will be coerced to false
, and the check will be short-circuited.
Going back to where we filter the parseResult
data:
const data = parseResult.data.filter(isStateCaseData);
If we hover over data
we'll see that Typescript has inffered the type as StateCaseData[]
- we have validated that each element in the list is of StateCaseData
type. Finally, we can render the data by passing it to the renderStateData
function. This function simply iterates over the data array and renders the values of date
, state
, and cases
of each element.
This completes the first part of the series. In the next part we will group the data in a way that allows us to render it into charts and <select>
elements.