In December, 2019, a local outbreak of pneumonia of initially unknown cause was detected in Wuhan (Hubei, China), and was quickly determined to be caused by a novel coronavirus,1 namely severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The outbreak has since spread to every province of mainland China as well as 27 other countries and regions, with more than 70 000 confirmed cases as of Feb 17, 2020. In response to this ongoing public health emergency, JHWSE developed an online interactive dashboard, hosted by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, Baltimore, MD, USA, to visualise and track reported cases of coronavirus disease 2019 (COVID-19) in real time. The dashboard, first shared publicly on Jan 22, illustrates the location and number of confirmed COVID-19 cases, deaths, and recoveries for all affected countries. It was developed to provide researchers, public health authorities, and the general public with a user-friendly tool to track the outbreak as it unfolds. All data collected and displayed are made freely available, initially through Google Sheets and now through a Github Repository, along with the feature layers of the dashboard, which are now included in the Esri Living Atlas. (view Lancet Article)
Since I want to have a dashboard showing Philippines' Covid19 cases summary, I have decided to work on a solution on how to use their Github repository.
Data files is in the form of a CSVs. Data started from January 22, 2020 and is still updated today several times on a daily basis.
The challenge is to extract those data automatically using a script from Github to Google Sheets giving flexibility for many dashboard building apps.
I have decided to use a simple approach using Google sheets as it has a script editor which is capable of creating a trigger/schedule on when to run a script. In this case, I simply used an hourly schedule.
I've created 3 scripts to extract confirmed, recovered and death cases due to the way the data has been saved in Github.
Depending on which tool is being used for dashboard reporting, one can see that the original data was growing horizontally on a daily basis. One needs to unpivot/transpose part of the table for easier data manipulation.
Other unexpected problem was the way date was extracted. At times it follows the /m/d/yy format but there are a couple of times where it was on a d/m/yy format.
Other notable limitations is that for some countries, some have no regions available. Also, latitude and latitude are sometimes inaccurate hence the need to do a correction.
Only by ensuring that the process is successful every hour can we tell we have successfully created a fully automated dashboard. There were times when data format has change. Changes in the Github folder structure has also been observe during the early stages of the dashboard.