Example: Curve Your City

Hi,

I started an initiative to plot the current number of infections in your city. Just provide a CSV file and off you go:

If you can/want, please reproduce in your city!
The goal of this is manifold:

  • Especially in Germany, there is no single central source of data for infection numbers per city (cities are where the population density, and hence the number of infections, is highest), so I thought we have to create this data source on our own (see the CSV file in the repo for Dresden). I basically went to the city’s website and copied the numbers into a CSV file.

  • It’s important to visualize data well and show it to the public; otherwise insights from the data remain unseen and people might not understand why social distancing is important. I won’t claim my viz-fu is perfect, but I am trying. Please make suggestions for improvements.

  • I hope that at some point my simple exponential model breaks down completely and we see a logistic curve that levels off. At that point, #COVID would be beaten in my city. If you can, please review how I quantify the uncertainties of my simple model. They are currently super large and possibly way off.

  • I want to motivate people to help stop the spread. Please fork or open a PR and contribute the data for your city. With this decentralized approach, I hope we can establish a decent dataset. Of course I know that this simple data is not suitable for high-precision statistics, as it contains many sources of uncertainty (time lag in diagnosing cases, improvements in diagnosis speed, …). Anyhow, it’s worth a shot.
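For anyone who wants to check the uncertainty part, here is a minimal Python sketch of one common way to fit a simple exponential model and attach an error to the growth rate. It is not necessarily what the repo does. The function names `fit_exponential` and `doubling_time` are hypothetical, and the counts are made up for illustration. The idea is to fit a straight line to log(cases) with `numpy.polyfit` and read the standard error of the slope from the covariance matrix it returns.

```python
import numpy as np


def fit_exponential(days, cases):
    """Fit cases ~ a * exp(b * day) via a straight-line fit to log(cases).

    Returns (a, b, b_err), where b_err is the standard error of the
    growth rate b taken from the covariance matrix of the linear fit.
    """
    days = np.asarray(days, dtype=float)
    log_cases = np.log(np.asarray(cases, dtype=float))
    # polyfit returns coefficients highest degree first: [slope, intercept].
    coeffs, cov = np.polyfit(days, log_cases, deg=1, cov=True)
    b, log_a = coeffs
    b_err = np.sqrt(cov[0, 0])
    return float(np.exp(log_a)), float(b), float(b_err)


def doubling_time(b):
    """Days needed for the case count to double at growth rate b."""
    return float(np.log(2.0) / b)


# Made-up cumulative counts for ten consecutive days (NOT real data).
days = np.arange(10)
cases = [2, 3, 4, 6, 8, 11, 16, 22, 30, 42]

a, b, b_err = fit_exponential(days, cases)
print(f"growth rate b = {b:.3f} +/- {b_err:.3f} per day")
print(f"doubling time = {doubling_time(b):.1f} days")
```

One caveat on the design: the standard error from a log-linear fit assumes independent noise on each point, which cumulative counts do not really satisfy, so the error bars from this sketch are probably too optimistic rather than too large.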

Best,
Peter

5 Likes

I’ve been tracking this for my province (12M people) vs British Columbia (~8M people). Coronavirus-log

My code for tracking this

3 Likes

Sorry about not being able to help much, I don’t speak R. Nevertheless, you might find this useful:

3 Likes

Thanks for sharing. It’s interesting to see how the data wiggles around a straight line. Thanks!

Adding in California, which has about the same population as Canada… Although Canada had more cases on March 1st, California now has 300 more cases. It will be interesting to see how this progresses.

Coronavirus-log

Here is the current situation report of Tübingen: https://github.com/gizal/COVID19

1 Like

@gizem where did you get the data from? Just curious, as I am trying to tap the official stream of data. There is currently a lot of confusion in Germany about where to get trustworthy per-city data.

1 Like

@psteinb I used the RKI Coronavirus dataset that @nasim.rahaman shared the link to. I will update my figure later today.

1 Like

How did you extract the number of cases from the past?
I am just starting to see through this dataset and could only find the current number of cases for Dresden from the by-county set up:


If I understand you correctly, you used the Germany-wide dataset directly from here

I guess you ‘downloaded’ it and accumulated the number of cases over time. I admit this whole dataset is quite confusing to me. I currently see this:

$ csvgrep -c3 -m Dresden RKI_COVID19.csv|csvlook

| IdBundesland | Bundesland | Landkreis | Altersgruppe | Geschlecht | AnzahlFall | AnzahlTodesfall | ObjectId | Meldedatum | IdLandkreis | Datenstand |
| ------------ | ---------- | --------- | ------------ | ---------- | ---------- | --------------- | -------- | ---------- | ----------- | ---------- |
| 14 | Sachsen | SK Dresden | A00-A04 | M | 1 | False | 182,113 | 2020-03-13 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A00-A04 | M | 2 | False | 182,114 | 2020-03-19 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A05-A14 | M | 1 | False | 182,115 | 2020-03-17 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 1 | False | 182,116 | 2020-03-09 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 1 | False | 182,117 | 2020-03-10 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 1 | False | 182,118 | 2020-03-12 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 1 | False | 182,119 | 2020-03-14 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 2 | False | 182,120 | 2020-03-16 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 3 | False | 182,121 | 2020-03-17 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 3 | False | 182,122 | 2020-03-18 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 3 | False | 182,123 | 2020-03-19 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 4 | False | 182,124 | 2020-03-20 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | M | 1 | False | 182,125 | 2020-03-21 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | W | 1 | False | 182,126 | 2020-03-11 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | W | 1 | False | 182,127 | 2020-03-12 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | W | 1 | False | 182,128 | 2020-03-16 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | W | 2 | False | 182,129 | 2020-03-17 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | W | 5 | False | 182,130 | 2020-03-19 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | W | 2 | False | 182,131 | 2020-03-20 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A15-A34 | W | 3 | False | 182,132 | 2020-03-21 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 1 | False | 182,133 | 2020-03-11 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 3 | False | 182,134 | 2020-03-12 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 4 | False | 182,135 | 2020-03-13 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 1 | False | 182,136 | 2020-03-14 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 1 | False | 182,137 | 2020-03-16 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 2 | False | 182,138 | 2020-03-17 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 1 | False | 182,139 | 2020-03-18 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 9 | False | 182,140 | 2020-03-19 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 5 | False | 182,141 | 2020-03-20 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | M | 5 | False | 182,142 | 2020-03-21 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 1 | False | 182,143 | 2020-03-13 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 4 | False | 182,144 | 2020-03-14 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 1 | False | 182,145 | 2020-03-16 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 2 | False | 182,146 | 2020-03-17 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 1 | False | 182,147 | 2020-03-18 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 3 | False | 182,148 | 2020-03-19 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 1 | False | 182,149 | 2020-03-20 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A35-A59 | W | 5 | False | 182,150 | 2020-03-21 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A60-A79 | M | 1 | False | 182,151 | 2020-03-07 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A60-A79 | M | 1 | False | 182,152 | 2020-03-13 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A60-A79 | M | 1 | False | 182,153 | 2020-03-18 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A60-A79 | M | 2 | False | 182,154 | 2020-03-20 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A60-A79 | W | 1 | False | 182,155 | 2020-03-07 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A60-A79 | W | 1 | False | 182,156 | 2020-03-21 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A80+ | M | 1 | False | 182,157 | 2020-03-17 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A80+ | W | 1 | False | 182,158 | 2020-03-19 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |
| 14 | Sachsen | SK Dresden | A80+ | W | 1 | False | 182,159 | 2020-03-20 00:00:00+00:00 | 14,612 | 23.03.2020 00:00 |

But I am not sure how to get the total number of cases as of today. Do I sum up all the case counts at the most recent date?

1 Like

Right, I used the second link (RKI COVID19).
Yes, I downloaded it, and had to clean and adjust the data a bit.
See the steps I followed below.

Filtering: Find the cases that belong to Tübingen and ignore the rest. Output: 54 cases
Get rid of the unnecessary columns: I only want to keep “Age Range”, “Sex”, “Number of cases”, “Number of deaths”, “Admission Date”. Of course, I also sorted the data by admission date. Output:


Fill in the missing dates: Some days have no entries at all (e.g. the second case is on 26th Feb and the third one on 7th March), so I added rows for them.
Reduce the dataset to one row per date, and add the columns “Total Women” and “Total Men”.
Accumulate: Add each day’s number of cases to the running total of the previous days.
Plot :slight_smile:
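The steps above can be sketched in Python with pandas roughly as follows. This is a hedged illustration, not the code actually used for the Tübingen figures. The function name `cumulative_cases` is hypothetical, the column names are the ones visible in the RKI listing earlier in the thread, and the sample rows are only a small subset of the Dresden rows shown there (plus one made-up row for another district), so the totals are not the real totals for any city.

```python
import pandas as pd


def cumulative_cases(df, landkreis):
    """Cumulative case counts per day for one Landkreis, split by sex."""
    # Filter: keep only the rows for the requested district.
    sub = df[df["Landkreis"] == landkreis].copy()
    # Parse the reporting date (Meldedatum) and drop time and time zone.
    sub["date"] = (
        pd.to_datetime(sub["Meldedatum"], utc=True)
        .dt.tz_localize(None)
        .dt.normalize()
    )
    # One row per date, one column per sex (M / W), summing AnzahlFall.
    daily = sub.pivot_table(
        index="date", columns="Geschlecht", values="AnzahlFall",
        aggfunc="sum", fill_value=0,
    )
    # Fill in the days without any reported case with zeros.
    all_days = pd.date_range(daily.index.min(), daily.index.max(), freq="D")
    daily = daily.reindex(all_days, fill_value=0)
    daily["total"] = daily.sum(axis=1)
    # Accumulate: add each day's count to the running total.
    return daily.cumsum()


# Subset of the Dresden rows listed earlier in the thread; the last row
# is a made-up entry for a hypothetical district to exercise the filter.
rows_df = pd.DataFrame([
    {"Landkreis": "SK Dresden", "Geschlecht": "M", "AnzahlFall": 1,
     "Meldedatum": "2020-03-07 00:00:00+00:00"},
    {"Landkreis": "SK Dresden", "Geschlecht": "W", "AnzahlFall": 1,
     "Meldedatum": "2020-03-07 00:00:00+00:00"},
    {"Landkreis": "SK Dresden", "Geschlecht": "M", "AnzahlFall": 1,
     "Meldedatum": "2020-03-09 00:00:00+00:00"},
    {"Landkreis": "SK Dresden", "Geschlecht": "M", "AnzahlFall": 1,
     "Meldedatum": "2020-03-10 00:00:00+00:00"},
    {"Landkreis": "SK Dresden", "Geschlecht": "W", "AnzahlFall": 1,
     "Meldedatum": "2020-03-11 00:00:00+00:00"},
    {"Landkreis": "SK Dresden", "Geschlecht": "M", "AnzahlFall": 1,
     "Meldedatum": "2020-03-11 00:00:00+00:00"},
    {"Landkreis": "SK Beispielstadt", "Geschlecht": "M", "AnzahlFall": 7,
     "Meldedatum": "2020-03-08 00:00:00+00:00"},
])

result = cumulative_cases(rows_df, "SK Dresden")
print(result)
```

The resulting frame has one row per calendar day, so `result["total"]` can be passed straight to a plotting call. Adding `"Altersgruppe"` to the `columns` argument of `pivot_table` would be one way to get the per-age-group columns as well.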

p.s. I haven’t used the age range information in my figures yet, but I plan to. I have already created the indexing for it, though: there are 6 age groups in total, so I have 6 more columns holding the numbers for each age range.

2 Likes

thanks for sharing, you are awesome!

2 Likes