Need help with modelling an epidemiological scenario

I’m looking to model a hypothetical scenario:

  • Assume 15,000 individuals living in an area of 12 sq km.

  • Assume the average person is 6’ tall.

  • Assume that 2/3 of all individuals enter 1 of 3 different buildings 2 times a day.

  • Assume the other 1/3 of all individuals enter 1 of these same 3 buildings 3 times a day.

  • Assume that every day 200-400 individuals leave this 12 sq km area, and 200-400 new individuals enter it.

  • Assume a healthcare system has 100 ventilators and 200 ICU beds ready to support this environment.

  • Assume 25% of individuals overtly practice social distancing (altering their path, actively maintaining 6ft+ distancing from all others) with an effectiveness of 80% (80% of the time they are able to maintain 6ft)

  • Assume a mean age of 30 with a highest age of 65, lowest age of 19

  • Assume at the beginning of the scenario that of the 200-400 new individuals entering the environment, 1 will be infected on the first day, 2 on the 2nd, 3 on the 3rd etc.

  • Assume detection and quarantine of all new individuals with an 80% success rate (80% of the time, the infected are detected and quarantined).

So 15k people: two-thirds enter one of three buildings twice a day, and the other third enter one of the same three buildings three times a day.
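As a rough sanity check on these numbers, here is a minimal Python sketch that encodes the headline parameters and works out the implied building traffic. The class and field names are made up for illustration, and the even split of visits across the three buildings is an assumption that is not stated in the scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Headline numbers from the scenario; names are illustrative only."""
    population: int = 15_000
    area_sq_km: float = 12.0
    n_buildings: int = 3
    share_two_visits: float = 2 / 3    # enter a building twice a day
    share_three_visits: float = 1 / 3  # enter a building three times a day

    def density(self) -> float:
        """People per square kilometre."""
        return self.population / self.area_sq_km

    def daily_building_entries(self) -> float:
        """Total building entries per day across the whole population."""
        twice = self.population * self.share_two_visits * 2
        thrice = self.population * self.share_three_visits * 3
        return twice + thrice

    def entries_per_building(self) -> float:
        # ASSUMPTION: visits are spread evenly over the buildings.
        return self.daily_building_entries() / self.n_buildings


scenario = Scenario()
print(round(scenario.density()))                 # people per sq km
print(round(scenario.daily_building_entries()))  # entries per day, all buildings
print(round(scenario.entries_per_building()))    # entries per day, one building
```

With the values above this gives 1,250 people per sq km, 35,000 building entries a day in total, and roughly 11,667 entries per building per day under the even-split assumption.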

If you have other variables that you’d like me to include, just let me know. I’ve been playing with Paul van Gent’s Python models, but I am fairly new to Python (and NumPy, etc.), so I thought I’d get some more experience on the problem.

I have a feeling you guys might be busy, but thanks in advance! Stay safe, and wash your hands!

Moderator note: this request was received via our Google Form. If you have questions for the asker, please feel free to ask in the comments and we will mediate. Internal note: entry #5.


I’d be interested in a chat with the asker. Is anyone else up for it?

That’s a really cool project! I am looking forward to seeing what it yields :slight_smile:

Some variables dictated by the disease itself come to mind that I’d add:

  • The chance of being infected, which should be high when getting too close to an infected person, but not 100%. Also there might be a remaining risk, even if you are keeping your distance, but that might make it too complicated.
  • A certain share of infected people stop moving around because they are hospitalized or sick enough to stay at home. Factoring in the age brackets you picked, but also that official statistics only count hospitalized cases, I’d say about 30% stop interacting with people for a while. The incubation period is estimated at roughly 14 days, so before that they just move around as usual and infect people.
  • I guess you have a plan for death rates since you mention ICUs and ventilators. I’m just mentioning them because they don’t come up in your list. With your age bracket I’d assume a baseline of around 1%, but obviously having resources depleted would raise it.
  • Survivors count towards the total of people moving around but can, to the best of my knowledge, be considered immune. They don’t get reinfected and don’t infect others either.
  • I have no idea how to estimate this, but factoring in people wearing masks would be interesting. Does anyone have any kind of estimate on that?

Those are my ideas off the top of my head.
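To make these variables a bit more concrete, here is a minimal Python sketch of how they could look as per-person states and per-contact rules. The 30% isolation share, the 14-day incubation and the 80% distancing success come from this thread; the two infection probabilities are pure placeholders and not estimates, and all function and constant names are hypothetical. Deaths and masks are left out.

```python
import random
from enum import Enum


class State(Enum):
    SUSCEPTIBLE = "S"
    INFECTED = "I"   # infectious and still moving around
    ISOLATED = "Q"   # hospitalized or staying home, no contacts
    RECOVERED = "R"  # treated as immune and non-infectious


# PLACEHOLDERS, not estimates: replace with values from the literature.
P_INFECT_CLOSE = 0.30    # hypothetical per-contact risk within 6 ft
P_INFECT_DISTANT = 0.02  # hypothetical residual risk beyond 6 ft

# Values suggested in the thread.
P_ISOLATE = 0.30           # share of infected who stop interacting
INCUBATION_DAYS = 14       # days of normal movement before that decision
DISTANCING_SUCCESS = 0.80  # how often a distancing person keeps 6 ft


def keeps_distance(practises_distancing: bool, rng: random.Random) -> bool:
    """True if this person manages to keep 6 ft during one encounter."""
    return practises_distancing and rng.random() < DISTANCING_SUCCESS


def contact(a: State, b: State, distance_kept: bool, rng: random.Random) -> State:
    """Return the new state of person `a` after meeting person `b`."""
    if a is not State.SUSCEPTIBLE or b is not State.INFECTED:
        return a  # recovered and isolated people neither catch nor spread it
    p = P_INFECT_DISTANT if distance_kept else P_INFECT_CLOSE
    return State.INFECTED if rng.random() < p else a


def end_of_incubation(rng: random.Random) -> State:
    """After INCUBATION_DAYS, some infected isolate; the rest keep moving."""
    return State.ISOLATED if rng.random() < P_ISOLATE else State.INFECTED
```

A simulation loop would call `contact` for every pair of people who meet and `end_of_incubation` once per infected person after `INCUBATION_DAYS` days.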


Hello asker,

Thank you for this question.

The main thing we need to know is: what is the aim of this model? Is it to support policy? Context and aim really matter. It would be nice to know the policy options so we can put them in the model and compare their effects. Also, given the heavy dependence on context, I think it would be wise to try and contact epidemiological or health economics departments at local universities, as they will have both technical expertise and local knowledge. The Data-against-Covid website only has the first of these. Be careful with modeling results from people who can make nice models but lack context knowledge; our results can be biased and/or not relevant for the situation under study.

Nevertheless, with these caveats out of the way, I found it an interesting thing to work on. I’ve written a quick-and-dirty model in R, mostly to reveal questions that we might still have as a modeling community. The code is available on GitHub. I modelled the spread of the disease in a simple way with quite a few assumptions; for example, I’ve not touched on COVID-19 disease progression or load on the healthcare system yet. To reduce running time I divided the population sizes by 10. Some graphs of prevalence and detected cases (average over 1000 runs) with my assumptions and the provided data are found here. They show a recurring pattern where non-detected infected persons cause an increase in prevalence before the 80% detection rate stamps it out. Because, as the asker said, more infected people come in every day, that effect dominates in the end after a year, as by then many infected people would arrive.
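For readers who prefer Python, the arrival side of this can be sketched in a few lines. This is not a port of the R code, only an illustration of the asker's import rule (1 infected arrival on day 1, 2 on day 2, and so on) combined with 80% detection. The function names are hypothetical, and capping the infected arrivals at the day's total number of arrivals is my own assumption.

```python
import random
from typing import Optional


def undetected_arrivals(day: int, detection_rate: float = 0.80,
                        rng: Optional[random.Random] = None) -> int:
    """Number of infected newcomers on `day` who slip past detection."""
    if rng is None:
        rng = random.Random()
    n_arrivals = rng.randint(200, 400)       # newcomers entering that day
    # ASSUMPTION: infected arrivals cannot exceed total arrivals.
    infected_arrivals = min(day, n_arrivals)
    # Each infected newcomer is detected independently with `detection_rate`.
    return sum(rng.random() >= detection_rate for _ in range(infected_arrivals))


def mean_undetected(day: int, runs: int = 1000, seed: int = 0) -> float:
    """Average undetected infected arrivals on `day` over many runs."""
    rng = random.Random(seed)
    return sum(undetected_arrivals(day, rng=rng) for _ in range(runs)) / runs


for day in (1, 30, 100):
    print(day, mean_undetected(day))
```

On average about 20% of the infected arrivals get through, so the undetected inflow grows roughly linearly with the day number, which fits the observation that imported cases dominate towards the end of the year.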

Some questions that came up when I was making this:

  1. What exactly are the contact patterns? I assumed it was an office situation where people arrive in the morning, some go for lunch, and all leave in the evening. I modeled the spread of the virus by assuming these times are stochastic and that there is a time window in which there is a chance of contact.
  2. Do people move between buildings or are they bound to one?
  3. Can people from different buildings interact, and how?
  4. What is the distribution of ages? We currently only know the mean and the lower and upper bounds. I think that is very important to know if we want to model load on the ICU.
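The contact assumption in question 1 can be sketched roughly as follows in Python. This is an illustration of the idea and not the R implementation; the mean times, the 20-minute spread and the 2-minute contact window are hypothetical values picked only to make the example run.

```python
import random

# HYPOTHETICAL timings, in minutes after midnight.
ARRIVE, LUNCH, LEAVE = 9 * 60, 12.5 * 60, 17 * 60
SD_MIN = 20.0      # spread of each door event around its mean
WINDOW_MIN = 2.0   # two door events this close count as a possible contact


def door_events(three_visits: bool, rng: random.Random) -> list[float]:
    """Stochastic times at which one person passes a building entrance."""
    means = [ARRIVE, LUNCH, LEAVE] if three_visits else [ARRIVE, LEAVE]
    return [rng.gauss(mean, SD_MIN) for mean in means]


def may_meet(events_a: list[float], events_b: list[float],
             window: float = WINDOW_MIN) -> bool:
    """True if any two door events fall within `window` minutes."""
    return any(abs(ta - tb) <= window for ta in events_a for tb in events_b)


rng = random.Random(42)
a = door_events(three_visits=False, rng=rng)
b = door_events(three_visits=True, rng=rng)
print(may_meet(a, b))
```

Whether two people in the same building actually transmit the virus would then be a second random draw on top of `may_meet`; questions 2 and 3 decide which pairs are compared at all.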

Looking forward to hearing other people’s thoughts and/or seeing additions to the R code. Is there anyone with access to and knowledge of software such as AnyLogic? With more info about the context from the asker we could make a more detailed agent-based simulation of the situation and the impact of policy measures.

Kind regards


For the pythonistas, there’s also a repository from Paul van Gent with support for simulating multiple scenarios:

I think this can be made into a nice web app (similar to @Guidolo’s). :slight_smile:


I’ve been thinking about this a bit, and I think we are going about it the wrong way. No offence meant to anyone; I think we are all here with the best of intentions. In my opinion we shouldn’t be re-inventing the wheel. Simple models like the one I made or Paul van Gent’s are nice starting points but years behind current practice in epi modeling for diseases like COVID-19. It is unlikely we will catch up within a time period that makes our work helpful in the current crisis.

Instead I feel we should ask ourselves the question: where can we add value in supporting decisions relating to the situation posed by the asker? I think these steps may be good to take:

  1. What has been done with regard to modeling COVID-19 spread in public health models?
  2. What are the remaining gaps or weak points in the currently used models and how can we help fill these?
  3. Can we replicate these models and then translate them to easy-to-use decision support tools?

By working towards point 2 we can support epi modellers so that their models are as useful as possible for policy makers. By working toward point 3 we can help decision makers who do not have access to expert modelers.

Happy to hear your thoughts.


These are two good starting points: