Re-exploring the structure of Chinese character images

Introduction

In this notebook we show information retrieval and clustering
techniques over images of Unicode collection of Chinese characters. Here
is the outline of notebook’s exposition:

  1. Get Chinese character images.
  2. Cluster “image vectors” and demonstrate that the obtained
    clusters have certain explainability elements.
  3. Apply Latent Semantic Analysis (LSA) workflow to the character
    set.
  4. Show visual thesaurus through a recommender system. (That uses
    Cosine similarity.)
  5. Discuss graph and hierarchical clustering using LSA matrix
    factors.
  6. Demonstrate approximation of “unseen” character images with an
    image basis obtained through LSA over a small set of (simple)
    images.
  7. Redo character approximation with more “interpretable” image
    basis.

Remark: This notebook started as an (extended)
comment for the Community discussion “Exploring
structure of Chinese characters through image processing”
, [SH1].
(Hence the title.)

Get Chinese character images

This code is a copy of the code in the original
Community post by Silvia Hao
, [SH1]:

0zu4hv95x0jjf
Module[{fsize = 50, width = 64, height = 64}, 
  lsCharIDs = Map[FromCharacterCode[#, "Unicode"] &, 16^^4E00 - 1 + Range[width height]]; 
 ]
charPage = Module[{fsize = 50, width = 64, height = 64}, 
    16^^4E00 - 1 + Range[width height] // pipe[
      FromCharacterCode[#, "Unicode"] & 
      , Characters, Partition[#, width] & 
      , Grid[#, Background -> Black, Spacings -> {0, 0}, ItemSize -> {1.5, 1.2}, Alignment -> {Center, Center}, Frame -> All, FrameStyle -> Directive[Red, AbsoluteThickness[3 \[Lambda]]]] & 
      , Style[#, White, fsize, FontFamily -> "Source Han Sans CN", FontWeight -> "ExtraLight"] & 
      , Rasterize[#, Background -> Black] & 
     ] 
   ];
chargrid = charPage // ColorDistance[#, Red] & // Image[#, "Byte"] & // Sign //Erosion[#, 5] &;
lmat = chargrid // MorphologicalComponents[#, Method -> "BoundingBox", CornerNeighbors -> False] &;
chars = ComponentMeasurements[{charPage // ColorConvert[#, "Grayscale"] &, lmat}, "MaskedImage", #Width > 10 &] // Values // Map@RemoveAlphaChannel;
chars = Module[{size = chars // Map@ImageDimensions // Max}, ImageCrop[#, {size, size}] & /@ chars];

Here is a sample of the obtained images:

SeedRandom[33];
RandomSample[chars, 5]
1jy9voh5c01lt

Vector representation of
images

Define a function that represents an image into a linear vector space
(of pixels):

Clear[ImageToVector];
ImageToVector[img_Image] := Flatten[ImageData[ColorConvert[img, "Grayscale"]]];
ImageToVector[img_Image, imgSize_] := Flatten[ImageData[ColorConvert[ImageResize[img, imgSize], "Grayscale"]]];
ImageToVector[___] := $Failed;

Show how vector represented images look like:

Table[BlockRandom[
   img = RandomChoice[chars]; 
   ListPlot[ImageToVector[img], Filling -> Axis, PlotRange -> All, PlotLabel -> img, ImageSize -> Medium, AspectRatio -> 1/6], 
   RandomSeeding -> rs], {rs, {33, 998}}]
0cobk7b0m9xcn
\[AliasDelimiter]

Data preparation

In this section we represent the images into a linear vector space.
(In which each pixel is a basis vector.)

Make an association with images:

aCImages = AssociationThread[lsCharIDs -> chars];
Length[aCImages]

(*4096*)

Make flat vectors with the images:

AbsoluteTiming[
  aCImageVecs = ParallelMap[ImageToVector, aCImages]; 
 ]

(*{0.998162, Null}*)

Do matrix plots a random sample of the image vectors:

SeedRandom[32];
MatrixPlot[Partition[#, ImageDimensions[aCImages[[1]]][[2]]]] & /@ RandomSample[aCImageVecs, 6]
07tn6wh5t97j4

Clustering over the image
vectors

In this section we cluster “image vectors” and demonstrate that the
obtained clusters have certain explainability elements. Expected Chinese
character radicals are observed using image multiplication.

Cluster the image vectors and show summary of the clusters
lengths:

SparseArray[Values@aCImageVecs]
1n5cwcrgj2d3m
SeedRandom[334];
AbsoluteTiming[
  lsClusters = FindClusters[SparseArray[Values@aCImageVecs] -> Keys[aCImageVecs], 35, Method -> {"KMeans"}]; 
 ]
Length@lsClusters
ResourceFunction["RecordsSummary"][Length /@ lsClusters]

(*{24.6383, Null}*)

(*35*)
0lvt8mcfzpvhg

For each cluster:

  • Take 30 different small samples of 7 images
  • Multiply the images in each small sample
  • Show three “most black” the multiplication results
SeedRandom[33];
Table[i -> TakeLargestBy[Table[ImageMultiply @@ RandomSample[KeyTake[aCImages, lsClusters[[i]]], UpTo[7]], 30], Total@ImageToVector[#] &, 3], {i, Length[lsClusters]}]
0erc719h7lnzi

Remark: We can see that the clustering above
produced “semantic” clusters – most of the multiplied images show
meaningful Chinese characters radicals and their “expected
positions.”

Here is one of the clusters with the radical “mouth”:

KeyTake[aCImages, lsClusters[[26]]]
131vpq9dabrjo

LSAMon application

In this section we apply the “standard” LSA workflow, [AA1, AA4].

Make a matrix with named rows and columns from the image vectors:

mat = ToSSparseMatrix[SparseArray[Values@aCImageVecs], "RowNames" -> Keys[aCImageVecs], "ColumnNames" -> Automatic]
0jdmyfb9rsobz

The following Latent Semantic Analysis (LSA) monadic pipeline is used
in [AA2, AA2]:

SeedRandom[77];
AbsoluteTiming[
  lsaAllObj = 
    LSAMonUnit[]\[DoubleLongRightArrow]
     LSAMonSetDocumentTermMatrix[mat]\[DoubleLongRightArrow]
     LSAMonApplyTermWeightFunctions["None", "None", "Cosine"]\[DoubleLongRightArrow]
     LSAMonExtractTopics["NumberOfTopics" -> 60, Method -> "SVD", "MaxSteps" -> 15, "MinNumberOfDocumentsPerTerm" -> 0]\[DoubleLongRightArrow]
     LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]
     LSAMonEcho[Style["Obtained basis:", Bold, Purple]]\[DoubleLongRightArrow]
     LSAMonEchoFunctionContext[ImageAdjust[Image[Partition[#, ImageDimensions[aCImages[[1]]][[1]]]]] & /@SparseArray[#H] &]; 
 ]
088nutsaye7yl
0j7joulwrnj30
(*{7.60828, Null}*)

Remark: LSAMon’s corresponding theory and design are
discussed in [AA1, AA4]:

Get the representation matrix:

W2 = lsaAllObj\[DoubleLongRightArrow]LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]LSAMonTakeW
1nno5c4wmc83q

Get the topics matrix:

H = lsaAllObj\[DoubleLongRightArrow]LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]LSAMonTakeH
1gtqe0ihshi9s

Cluster the reduced dimension
representations
and show summary of the clusters
lengths:

AbsoluteTiming[
  lsClusters = FindClusters[Normal[SparseArray[W2]] -> RowNames[W2], 40, Method -> {"KMeans"}]; 
 ]
Length@lsClusters
ResourceFunction["RecordsSummary"][Length /@ lsClusters]

(*{2.33331, Null}*)

(*40*)
1bu5h88uiet3e

Show cluster interpretations:

AbsoluteTiming[aAutoRadicals = Association@Table[i -> TakeLargestBy[Table[ImageMultiply @@ RandomSample[KeyTake[aCImages, lsClusters[[i]]], UpTo[8]], 30], Total@ImageToVector[#] &, 3], {i, Length[lsClusters]}]; 
 ]
aAutoRadicals

(*{0.878406, Null}*)
05re59k8t4u4u

Using FeatureExtraction

I experimented with clustering and approximation using WL’s function
FeatureExtraction.
Result are fairly similar as the above; timings a different (a few times
slower.)

Visual thesaurus

In this section we use Cosine similarity to find visual nearest
neighbors of Chinese character images.

matPixels = WeightTermsOfSSparseMatrix[lsaAllObj\[DoubleLongRightArrow]LSAMonTakeWeightedDocumentTermMatrix, "IDF", "None", "Cosine"];
matTopics = WeightTermsOfSSparseMatrix[lsaAllObj\[DoubleLongRightArrow]LSAMonNormalizeMatrixProduct[Normalized -> Left]\[DoubleLongRightArrow]LSAMonTakeW, "None", "None", "Cosine"];
smrObj = SMRMonUnit[]\[DoubleLongRightArrow]SMRMonCreate[<|"Topic" -> matTopics, "Pixel" -> matPixels|>];

Consider the character “團”:

aCImages["團"]
0pi2u9ejqv9wd

Here are the nearest neighbors for that character found by using both
image topics and image pixels:

(*focusItem=RandomChoice[Keys@aCImages];*)
  focusItem = {"團", "仼", "呔"}[[1]]; 
   smrObj\[DoubleLongRightArrow]
     SMRMonEcho[Style["Nearest neighbors by pixel topics:", Bold, Purple]]\[DoubleLongRightArrow]
     SMRMonSetTagTypeWeights[<|"Topic" -> 1, "Pixel" -> 0|>]\[DoubleLongRightArrow]
     SMRMonRecommend[focusItem, 8, "RemoveHistory" -> False]\[DoubleLongRightArrow]
     SMRMonEchoValue\[DoubleLongRightArrow]
     SMRMonEchoFunctionValue[AssociationThread[Values@KeyTake[aCImages, Keys[#]], Values[#]] &]\[DoubleLongRightArrow]
     SMRMonEcho[Style["Nearest neighbors by pixels:", Bold, Purple]]\[DoubleLongRightArrow]
     SMRMonSetTagTypeWeights[<|"Topic" -> 0, "Pixel" -> 1|>]\[DoubleLongRightArrow]
     SMRMonRecommend[focusItem, 8, "RemoveHistory" -> False]\[DoubleLongRightArrow]
     SMRMonEchoFunctionValue[AssociationThread[Values@KeyTake[aCImages, Keys[#]], Values[#]] &];
1l9yz2e8pvlyl
03bc668vzyh4v
00ecjkyzm4e2s
1wsyx76kjba1g
18wdi99m1k99j

Remark: Of course, in the recommender pipeline above
we can use both pixels and pixels topics. (With their contributions
being weighted.)

Graph clustering

In this section we demonstrate the use of graph communities to find
similar groups of Chinese characters.

Here we take a sub-matrix of the reduced dimension matrix computed
above:

W = lsaAllObj\[DoubleLongRightArrow]LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]LSAMonTakeW;

Here we find the similarity matrix between the characters and remove
entries corresponding to “small” similarities:

matSym = Clip[W . Transpose[W], {0.78, 1}, {0, 1}];

Here we plot the obtained (clipped) similarity matrix:

MatrixPlot[matSym]
1nvdb26265li6

Here we:

  • Take array rules of the sparse similarity matrix
  • Drop the rules corresponding to the diagonal elements
  • Convert the keys of rules into uni-directed graph edges
  • Make the corresponding graph
  • Find graph’s connected components
  • Show the number of connected components
  • Show a tally of the number of nodes in the components
gr = Graph[UndirectedEdge @@@ DeleteCases[Union[Sort /@ Keys[SSparseMatrixAssociation[matSym]]], {x_, x_}]];
lsComps = ConnectedComponents[gr];
Length[lsComps]
ReverseSortBy[Tally[Length /@ lsComps], First]

(*138*)

(*{{1839, 1}, {31, 1}, {27, 1}, {16, 1}, {11, 2}, {9, 2}, {8, 1}, {7, 1}, {6, 5}, {5, 3}, {4, 8}, {3, 14}, {2, 98}}*)

Here we demonstrate the clusters of Chinese characters make
sense:

aPrettyRules = Dispatch[Map[# -> Style[#, FontSize -> 36] &, Keys[aCImages]]]; CommunityGraphPlot[Subgraph[gr, TakeLargestBy[lsComps, Length, 10][[2]]], Method -> "SpringElectrical", VertexLabels -> Placed["Name", Above],AspectRatio -> 1, ImageSize -> 1000] /. aPrettyRules
1c0w4uhnyn2jx

Remark: By careful observation of the clusters and
graph connections we can convince ourselves that the similarities are
based on pictorial sub-elements (i.e. radicals) of the characters.

Hierarchical clustering

In this section we apply hierarchical clustering to the reduced
dimension representation of the Chinese character images.

Here we pick a cluster:

lsFocusIDs = lsClusters[[12]];
Magnify[ImageCollage[Values[KeyTake[aCImages, lsFocusIDs]]], 0.4]
14cnicsw2rvrt

Here is how we can make a dendrogram plot (not that useful here):

(*smat=W2\[LeftDoubleBracket]lsClusters\[LeftDoubleBracket]13\[RightDoubleBracket],All\[RightDoubleBracket];
Dendrogram[Thread[Normal[SparseArray[smat]]->Map[Style[#,FontSize->16]&,RowNames[smat]]],Top,DistanceFunction->EuclideanDistance]*)

Here is a heat-map plot with hierarchical clustering dendrogram (with
tool-tips):

gr = HeatmapPlot[W2[[lsFocusIDs, All]], DistanceFunction -> {CosineDistance, None}, Dendrogram -> {True, False}];
gr /. Map[# -> Tooltip[Style[#, FontSize -> 16], Style[#, Bold, FontSize -> 36]] &, lsFocusIDs]
0vz82un57054q

Remark: The plot above has tooltips with larger
character images.

Representing
all characters with smaller set of basic ones

In this section we demonstrate that a relatively small set of simpler
Chinese character images can be used to represent (or approxumate) the
rest of the images.

Remark: We use the following heuristic: the simpler
Chinese characters have the smallest amount of white pixels.

Obtain a training set of images – that are the darkest – and show a
sample of that set :

{trainingInds, testingInds} = TakeDrop[Keys[SortBy[aCImages, Total[ImageToVector[#]] &]], 800];
SeedRandom[3];
RandomSample[KeyTake[aCImages, trainingInds], 12]
10275rv8gn1qt

Show all training characters with an image collage:

Magnify[ImageCollage[Values[KeyTake[aCImages, trainingInds]], Background -> Gray, ImagePadding -> 1], 0.4]
049bs0w0x26jw

Apply LSA monadic pipeline with the training characters only:

SeedRandom[77];
AbsoluteTiming[
  lsaPartialObj = 
    LSAMonUnit[]\[DoubleLongRightArrow]
     LSAMonSetDocumentTermMatrix[SparseArray[Values@KeyTake[aCImageVecs, trainingInds]]]\[DoubleLongRightArrow]
     LSAMonApplyTermWeightFunctions["None", "None", "Cosine"]\[DoubleLongRightArrow]
     LSAMonExtractTopics["NumberOfTopics" -> 80, Method -> "SVD", "MaxSteps" -> 120, "MinNumberOfDocumentsPerTerm" -> 0]\[DoubleLongRightArrow]
     LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]
     LSAMonEcho[Style["Obtained basis:", Bold, Purple]]\[DoubleLongRightArrow]
     LSAMonEchoFunctionContext[ImageAdjust[Image[Partition[#, ImageDimensions[aCImages[[1]]][[1]]]]] & /@SparseArray[#H] &]; 
 ]
0i509m9n2d2p8
1raokwq750nyi
(*{0.826489, Null}*)

Get the matrix and basis interpretation of the extracted image
topics:

H = 
   lsaPartialObj\[DoubleLongRightArrow]
    LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]
    LSAMonTakeH;
lsBasis = ImageAdjust[Image[Partition[#, ImageDimensions[aCImages[[1]]][[1]]]]] & /@ SparseArray[H];

Approximation of “unseen”
characters

Pick a Chinese character image as a target image and pre-process
it:

ind = RandomChoice[testingInds];
imgTest = aCImages[ind];
matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aCImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic];
imgTest
15qkrj0nw08mv

Find its representation with the chosen feature extractor (LSAMon
object here):

matReprsentation = lsaPartialObj\[DoubleLongRightArrow]LSAMonRepresentByTopics[matImageTest]\[DoubleLongRightArrow]LSAMonTakeValue;
lsCoeff = Normal@SparseArray[matReprsentation[[1, All]]];
ListPlot[MapIndexed[Tooltip[#1, lsBasis[[#2[[1]]]]] &, lsCoeff], Filling -> Axis, PlotRange -> All]
0cn7ty6zf3mgo

Show representation coefficients outliers:

lsBasis[[OutlierPosition[Abs[lsCoeff], TopOutliers@*HampelIdentifierParameters]]]
1w6jkhdpxlxw8

Show the interpretation of the found representation:

vecReprsentation = lsCoeff . SparseArray[H];
reprImg = Image[Unitize@Clip[#, {0.45, 1}, {0, 1}] &@Rescale[Partition[vecReprsentation, ImageDimensions[aCImages[[1]]][[1]]]]];
{reprImg, imgTest}
0c84q1hscjubu

See the closest characters using image distances:

KeyMap[# /. aCImages &, TakeSmallest[ImageDistance[reprImg, #] & /@ aCImages, 4]]
1vtcw1dhzlet5

Remark: By applying the approximation procedure to
all characters in testing set we can convince ourselves that small,
training set provides good retrieval. (Not done here.)

Finding more interpretable
bases

In this section we show how to use LSA workflow with Non-Negative
Matrix Factorization (NNMF)
over an image set extended with already
extracted “topic” images.

Cleaner automatic radicals

aAutoRadicals2 = Map[Dilation[Binarize[DeleteSmallComponents[#]], 0.5] &, First /@ aAutoRadicals]
10eg2eaajgiit

Here we take an image union in order to remove the “duplicated”
radicals:

aAutoRadicals3 = AssociationThread[Range[Length[#]], #] &@Union[Values[aAutoRadicals2], SameTest -> (ImageDistance[#1, #2] < 14.5 &)]
1t09xi5nlycaw

LSAMon pipeline with NNMF

Make a matrix with named rows and columns from the image vectors:

mat1 = ToSSparseMatrix[SparseArray[Values@aCImageVecs], "RowNames" -> Keys[aCImageVecs], "ColumnNames" -> Automatic]
0np1umfcks9hm

Enhance the matrix with radicals instances:

mat2 = ToSSparseMatrix[SparseArray[Join @@ Map[Table[ImageToVector[#], 100] &, Values[aAutoRadicals3]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic];
mat3 = RowBind[mat1, mat2];

Apply the LSAMon workflow pipeline with NNMF for topic
extraction:

SeedRandom[77];
AbsoluteTiming[
  lsaAllExtendedObj = 
    LSAMonUnit[]\[DoubleLongRightArrow]
     LSAMonSetDocumentTermMatrix[mat3]\[DoubleLongRightArrow]
     LSAMonApplyTermWeightFunctions["None", "None", "Cosine"]\[DoubleLongRightArrow]
     LSAMonExtractTopics["NumberOfTopics" -> 60, Method -> "NNMF", "MaxSteps" -> 15, "MinNumberOfDocumentsPerTerm" -> 0]\[DoubleLongRightArrow]
     LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]
     LSAMonEcho[Style["Obtained basis:", Bold, Purple]]\[DoubleLongRightArrow]
     LSAMonEchoFunctionContext[ImageAdjust[Image[Partition[#, ImageDimensions[aCImages[[1]]][[1]]]]] & /@SparseArray[#H] &]; 
 ]
1mc1fa16ylzcu
1c6p7pzemk6qx
(*{155.289, Null}*)

Remark: Note that NNMF “found” the interpretable
radical images we enhanced the original image set with.

Get the matrix and basis interpretation of the extracted image
topics:

H = 
   lsaAllExtendedObj\[DoubleLongRightArrow]
    LSAMonNormalizeMatrixProduct[Normalized -> Right]\[DoubleLongRightArrow]
    LSAMonTakeH;
lsBasis = ImageAdjust[Image[Partition[#, ImageDimensions[aCImages[[1]]][[1]]]]] & /@ SparseArray[H];

Approximation

Pick a Chinese character image as a target image and pre-process
it:

SeedRandom[43];
ind = RandomChoice[testingInds];
imgTest = aCImages[ind];
matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aCImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic];
imgTest
1h2aitm71mnl5

Find its representation with the chosen feature extractor (LSAMon
object here):

matReprsentation = lsaAllExtendedObj\[DoubleLongRightArrow]LSAMonRepresentByTopics[matImageTest]\[DoubleLongRightArrow]LSAMonTakeValue;
lsCoeff = Normal@SparseArray[matReprsentation[[1, All]]];
ListPlot[MapIndexed[Tooltip[#1, lsBasis[[#2[[1]]]]] &, lsCoeff], Filling -> Axis, PlotRange -> All]
084vbifk2zvi3

Show representation coefficients outliers:

lsBasis[[OutlierPosition[Abs[lsCoeff], TopOutliers@*QuartileIdentifierParameters]]]
06xq4p3k31fzt

Remark: Note that expected
radical images are in the outliers.

Show the interpretation of the found representation:

vecReprsentation = lsCoeff . SparseArray[H];
reprImg = Image[Unitize@Clip[#, {0.45, 1}, {0, 1}] &@Rescale[Partition[vecReprsentation, ImageDimensions[aCImages[[1]]][[1]]]]];
{reprImg, imgTest}
01xeidbc9qme6

See the closest characters using image distances:

KeyMap[# /. aCImages &, TakeSmallest[ImageDistance[reprImg, #] & /@ aCImages, 4]]
1mrut9izhycrn

Setup

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/MonadicLatentSemanticAnalysis.m"];
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/MonadicSparseMatrixRecommender.m"];
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/Misc/HeatmapPlot.m"]

References

[SH1] Silvia Hao, “Exploring
structure of Chinese characters through image processing”
, (2022),
Wolfram Community.

[AA1] Anton Antonov, “A monad for
Latent Semantic Analysis workflows”
, (2019), Wolfram Community.

[AA2] Anton Antonov, “LSA methods
comparison over random mandalas deconstruction – WL”
, (2022), Wolfram Community.

[AA3] Anton Antonov, “Bethlehem
stars: classifying randomly generated mandalas”
, (2020), Wolfram Community.

[AA4] Anton Antonov, “Random mandalas deconstruction in R, Python, and Mathematica”, (2022), MathematicaForPrediction at WordPress.

[AAp1] Anton Antonov, LSAMon
for Image Collections Mathematica package
, (2022), MathematicaForPrediction
at GitHub
.

A monad for Epidemiologic Compartmental Modeling Workflows

Version 0.8

Introduction

In this document we describe the design and demonstrate the implementation of a (software programming) monad, [Wk1], for Epidemiology Compartmental Modeling (ECM) workflows specification and execution. The design and implementation are done with Mathematica / Wolfram Language (WL). A very similar implementation is also done in R.

Monad’s name is “ECMMon”, which stands for “Epidemiology Compartmental Modeling Monad”, and its monadic implementation is based on the State monad package “StateMonadCodeGenerator.m”, [AAp1, AA1], ECMMon is implemented in the package [AAp8], which relies on the packages [AAp3-AAp6]. The original ECM workflow discussed in [AA5] was implemented in [AAp7]. An R implementation of ECMMon is provided by the package [AAr2].

The goal of the monad design is to make the specification of ECM workflows (relatively) easy and straightforward by following a certain main scenario and specifying variations over that scenario.

We use real-life COVID-19 data, The New York Times COVID-19 data, see [NYT1, AA5].

The monadic programming design is used as a Software Design Pattern. The ECMMon monad can be also seen as a Domain Specific Language (DSL) for the specification and programming of epidemiological compartmental modeling workflows.

Here is an example of using the ECMMon monad over a compartmental model with two types of infected populations:

0ejgq885m1unt 14u60cftusqlc 06grm5lhtjiv9

The table above is produced with the package “MonadicTracing.m”, [AAp2, AA1], and some of the explanations below also utilize that package.

As it was mentioned above the monad ECMMon can be seen as a DSL. Because of this the monad pipelines made with ECMMon are sometimes called “specifications”.

Contents description

The document has the following structure.

  • The sections “Package load” and “Data load” obtain the needed code and data.
  • The section “Design consideration” provide motivation and design decisions rationale.
  • The section “Single site models” give brief descriptions of certain “seed” models that can be used in the monad.
  • The section “Single-site model workflow demo”, “Multi-site workflow demo” give demonstrations of how to utilize the ECMMon monad .
    • Using concrete practical scenarios and “real life” data.
  • The section “Batch simulation and calibration process” gives methodological preparation for the content of the next two sections.
  • The section “Batch simulation workflow” and “Calibration workflow” describe how to do most important monad workflows after the model is developed.
  • The section “Future plans” outlines future directions of development.

Remark: One can read only the sections “Introduction”, “Design consideration”, “Single-site models”, and “Batch simulation and calibration process”. That set of sections provide a fairly good, programming language agnostic exposition of the substance and novel ideas of this document.

Package load

In this section we load packages used in this notebook:

Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/Projects/Coronavirus-propagation-dynamics/WL/MonadicEpidemiologyCompartmentalModeling.m"];Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/Projects/Coronavirus-propagation-dynamics/WL/MultiSiteModelSimulation.m"]Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/MonadicTracing.m"]Import["https://raw.githubusercontent.com/antononcube/ConversationalAgents/master/Packages/WL/ExternalParsersHookup.m"]

Remark: The import commands above would trigger some other package imports.

Data load

In this section we ingest data using the “umbrella function” MultiSiteModelReadData from [AAp5]:

Read data

AbsoluteTiming[  aData = MultiSiteModelReadData[];  ](*{38.8635, Null}*)

Data summaries

ResourceFunction["RecordsSummary"] /@ aData
01mnhk5boxlto

Transform data

Here we transform the population related data in a form convenient for specifying the simulations with it:

aPopulations = Association@Map[{#Lon, #Lat} -> #Population &, Normal[aData["CountyRecords"]]];aInfected = Association@Map[{#Lon, #Lat} -> #Cases &, Normal[aData["CasesAndDeaths"]]];aDead = Association@Map[{#Lon, #Lat} -> #Deaths &, Normal[aData["CasesAndDeaths"]]];

Geo-visualizations

Using the built-in function GeoHistogram we summarize the USA county populations, and COViD-19 infection cases and deaths:

Row@MapThread[GeoHistogram[KeyMap[Reverse, #1], Quantity[140, "Miles"], PlotLabel -> #2, PlotTheme -> "Scientific", PlotLegends -> Automatic, ImageSize -> Medium] &, {{aPopulations, aInfected, aDead}, {"Populations", "Infected", "Dead"}}]
1km2posdmqoic

(Note that in the plots above we have to reverse the keys of the given population associations.)

Using the function HextileHistogram from [AAp7 ] here we visualize USA county populations over a hexagonal grid with cell radius 2 degrees ((\approx 140) miles (\approx 222) kilometers):

HextileHistogram[aPopulations, 2, PlotRange -> All, PlotLegends -> Automatic, ImageSize -> Large]
0lqx1d453pio1

In this notebook we prefer using HextileHistogram because it represents the simulation data in geometrically more faithful way.

Design considerations

The big picture

The main purpose of the designed epidemic compartmental modeling framework (i.e. software monad) is to have the ability to do multiple, systematic simulations for different scenario play-outs over large scale geographical regions. The target end-users are decision makers at government level and researchers of pandemic or other large scale epidemic effects.

Here is a diagram that shows the envisioned big picture workflow:

Model-development-and-decision-making

Large-scale modeling

The standard classical compartmental epidemiology models are not adequate over large geographical areas, like, countries. We design a software framework – the monad ECMMon – that allows large scale simulations using a simple principle workflow:

  1. Develop a single-site model for relatively densely populated geographical area for which the assumptions of the classical models (approximately) hold.
  2. Extend the single-site model into a large-scale multi-site model using statistically derived traveling patterns; see [AA4].
  3. Supply the multi-site model with appropriately prepared data.
  4. Run multiple simulations to see large scale implications of different policies.
  5. Calibrate the model to concrete observed (or faked) data. Go to 4.

Flow chart

The following flow chart visualizes the possible workflows the software monad ECMMon:

ECMMon-workflow

Two models in the monad

  • An ECMMon object can have one or two models. One of the models is a “seed”, single-site model from [AAp1], which, if desired, is scaled into a multi-site model, [AA3, AAp2].
  • Workflows with only the single-site model are supported.
    • Say, workflows for doing sensitivity analysis, [AA6, BC1].
  • Scaling of a single-site model into multi-site is supported and facilitated.
  • Workflows for the multi-site model include preliminary model scaling steps and simulation steps.
  • After the single-site model is scaled the monad functions use the multi-site model.
  • The workflows should be easy to specify and read.

Single-site model workflow

  1. Make a single-site model.
  2. Assign stocks initial conditions.
  3. Assign rates values.
  4. Simulate.
  5. Plot results.
  6. Go to 2.

Multi-site model workflow

  1. Make a single-site model.
  2. Assign initial conditions and rates.
  3. Scale the single-site model into a multi-site model.
    1. The single-site assigned rates become “global” when the single-site model is scaled.
    2. The scaling is based on assumptions for traveling patterns of the populations.
    3. There are few alternatives for that scaling:
      1. Using locations geo-coordinates
      2. Using regular grids covering a certain area based on in-habited locations geo-coordinates
      3. Using traveling patterns contingency matrices
      4. Using “artificial” patterns of certain regular types for qualitative analysis purposes
  4. Enhance the multi-site traveling patterns matrix and re-scale the single site model.
    1. We might want to combine traveling patterns by ground transportation with traveling patterns by airplanes.
    2. For quarantine scenarios this might a less important capability of the monad.
      1. Hence, this an optional step.
  5. Assign stocks initial conditions for each of the sites in multi-scale model.
  6. Assign rates for each of the sites.
  7. Simulate.
  8. Plot global simulation results.
  9. Plot simulation results for focus sites.

Single-site models

We have a collection of single-site models that have different properties and different modeling goals, [AAp3, AA7, AA8]. Here is as diagram of a single-site model that includes hospital beds and medical supplies as limitation resources, [AA7]:

Coronavirus-propagation-simple-dynamics

SEI2HR model

In this sub-section we briefly describe the model SEI2HR, which is used in the examples below.

Remark: SEI2HR stands for “Susceptible, Exposed, Infected Two, Hospitalized, Recovered” (populations).

Detailed description of the SEI2HR model is given in [AA7].

Verbal description

We start with one infected (normally symptomatic) person, the rest of the people are susceptible. The infected people meet other people directly or get in contact with them indirectly. (Say, susceptible people touch things touched by infected.) For each susceptible person there is a probability to get the decease. The decease has an incubation period: before becoming infected the susceptible are (merely) exposed. The infected recover after a certain average infection period or die. A certain fraction of the infected become severely symptomatic. If there are enough hospital beds the severely symptomatic infected are hospitalized. The hospitalized severely infected have different death rate than the non-hospitalized ones. The number of hospital beds might change: hospitals are extended, new hospitals are build, or there are not enough medical personnel or supplies. The deaths from infection are tracked (accumulated.) Money for hospital services and money from lost productivity are tracked (accumulated.)

The equations below give mathematical interpretation of the model description above.

Equations

Here are the equations of one the epidemiology compartmental models, SEI2HR, [AA7], implemented in [AAp3]:

ModelGridTableForm[SEI2HRModel[t], "Tooltips" -> False]["Equations"] /. {eq_Equal :> TraditionalForm[eq]}
1fjs0wfx4mrpj

The equations for Susceptible, Exposed, Infected, Recovered populations of SEI2R are “standard” and explanations about them are found in [WK2, HH1]. For SEI2HR those equations change because of the stocks Hospitalized Population and Hospital Beds.

The equations time unit is one day. The time horizon is one year. In this document we consider COVID-19, [Wk2, AA1], hence we do not consider births.

Single-site model workflow demo

In this section we demonstrate some of the sensitivity analysis discussed in [AA6, BC1].

Make a single-site model, SEI2HR:

model1 = SEI2HRModel[t, "InitialConditions" -> True, "RateRules" -> True, "TotalPopulationRepresentation" -> "AlgebraicEquation"];

Make an association with “default” parameters:

aDefaultPars = <|    aip -> 26,     aincp -> 5,     \[Beta][ISSP] -> 0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],     \[Beta][INSP] -> 0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],     qsd -> 60,     ql -> 21,     qcrf -> 0.25,     \[Beta][HP] -> 0.01,     \[Mu][ISSP] -> 0.035/aip,     \[Mu][INSP] -> 0.01/aip,     nhbr[TP] -> 3/1000,     lpcr[ISSP, INSP] -> 1,     hscr[ISSP, INSP] -> 1    |>;

Execute the workflow multiple times with different quarantine starts:

qlVar = 56;lsRes =  Map[   ECMMonUnit[]     ECMMonSetSingleSiteModel[model1]     ECMMonAssignRateRules[Join[aDefaultPars, <|qsd -> #, ql -> qlVar|>]]     ECMMonSimulate[365]     ECMMonPlotSolutions[{"Infected Severely Symptomatic Population"}, 240,        "Together" -> True, "Derivatives" -> False,        PlotRange -> {0, 12000}, ImageSize -> 250,        Epilog -> {Orange, Dashed, Line[{{#1, 0}, {#1, 12000}}], Line[{{#1 + qlVar, 0}, {#1 + qlVar, 12000}}]},        PlotLabel -> Row[{"Quarantine start:", Spacer[5], #1, ",", Spacer[5], "end:", Spacer[5], #1 + qlVar}],        "Echo" -> False]     ECMMonTakeValue &, Range[25, 100, 5]];

Plot the simulation solutions for “Infected Severely Symptomatic Population”:

Multicolumn[#[[1, 1]] & /@ lsRes, 4]
1db54x7hgf4iw

Both theoretical and computational details of the workflow above are given [AA7, AA8].

Multi-site workflow demo

In this section we demonstrate the multi-site model workflow using COVID-19 data for USA, [WRI2, NYT1].

Here a “seed”, single-site model is created:

model1 = SEI2HRModel[t, "InitialConditions" -> True, "RateRules" -> True, "TotalPopulationRepresentation" -> "AlgebraicEquation"];

Here we specify a multi-site model workflow (the monadic steps are separated and described with purple print-outs):

ecmObj =    ECMMonUnit[]    ECMMonSetSingleSiteModel[model1]    ECMMonAssignRateRules[     <|      aip -> 26,       aincp -> 5,       \[Beta][ISSP] -> 0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],       \[Beta][INSP] -> 0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],       qsd -> 0,       ql -> 56,       qcrf -> 0.25,       \[Beta][HP] -> 0.01,       \[Mu][ISSP] -> 0.035/aip,       \[Mu][INSP] -> 0.01/aip,       nhbr[TP] -> 3/1000,       lpcr[ISSP, INSP] -> 1,       hscr[ISSP, INSP] -> 1      |>     ]    ECMMonEcho[Style["Show the single-site model tabulated form:", Bold, Purple]]    ECMMonEchoFunctionContext[Magnify[ModelGridTableForm[#singleSiteModel], 1] &]    ECMMonMakePolygonGrid[Keys[aPopulations], 1.5, "BinningFunction" -> Automatic]    ECMMonEcho[Style["Show the grid based on population coordinates:", Bold, Purple]]    ECMMonPlotGrid["CellIDs" -> True, ImageSize -> Large]    ECMMonExtendByGrid[aPopulations, 0.12]    ECMMonAssignInitialConditions[aPopulations, "Total Population", "Default" -> 0]    ECMMonAssignInitialConditions[DeriveSusceptiblePopulation[aPopulations, aInfected, aDead], "Susceptible Population", "Default" -> 0]    ECMMonAssignInitialConditions[<||>, "Exposed Population", "Default" -> 0]    ECMMonAssignInitialConditions[aInfected, "Infected Normally Symptomatic Population", "Default" -> 0]    ECMMonAssignInitialConditions[<||>, "Infected Severely Symptomatic Population", "Default" -> 0]    ECMMonEcho[Style["Show total populations initial conditions data:", Bold, Purple]]    ECMMonPlotGridHistogram[aPopulations, ImageSize -> Large, PlotLabel -> "Total populations"]    ECMMonEcho[Style["Show infected and deceased initial conditions data:", Bold,Purple]]    ECMMonPlotGridHistogram[aInfected, ColorFunction -> ColorData["RoseColors"], "ShowDataPoints" -> False, ImageSize -> Large, PlotLabel -> "Infected"]    ECMMonPlotGridHistogram[aDead, ColorFunction -> ColorData["RoseColors"], "ShowDataPoints" -> False, ImageSize -> Large, PlotLabel -> "Deceased"]    ECMMonEcho[Style["Simulate:", Bold, Purple]]    ECMMonSimulate[365]    ECMMonEcho[Style["Show global population simulation results:", Bold, Purple]]    ECMMonPlotSolutions[__ ~~ "Population", 365]    ECMMonEcho[Style["Show site simulation results for Miami and New York areas:", Bold, Purple]]    ECMMonPlotSiteSolutions[{160, 174}, __ ~~ "Population", 365]    ECMMonEcho[Style["Show deceased and hospitalzed populations results for Miami and New York areas:", Bold, Purple]]    ECMMonPlotSiteSolutions[{160, 174}, {"Deceased Infected Population", "Hospitalized Population","Hospital Beds"}, 300, "FocusTime" -> 120];
0y0cfyttq7yn6 0gh46eljka9v5 04xdyad4mmniu 1mn5x0eb0hbvl 009qnefuurlhw 09xeiox9ane2w 13kldiv0a47ua 0jjxrwhkaj7ys 1qgjcibkzvib0 1bm71msuv1s02 0l96q6lg8vg9n 1w1u7nso952k8 0gp4bff16s2rj 1pfvwkdwlm72j 0in55gql6v1yi 1fxxap0npojve 1e0ctc7lt1tbh

Theoretical and computational details about the multi-site workflow can be found in [AA4, AA5].

Batch simulations and calibration processes

In this section we describe the in general terms the processes of model batch simulations and model calibration. The next two sections give more details of the corresponding software design and workflows.

Definitions

Batch simulation: If given a SD model (M), the set (P) of parameters of (M), and a set (B) of sets of values (P), (B\text{:=}\left{V_i\right}), then the set of multiple runs of (M) over (B) are called batch simulation.

Calibration: If given a model (M), the set (P) of parameters of (M), and a set of (k) time series (T\text{:=}\left{T_i\right}_{i=1}^k) that correspond to the set of stocks (S\text{:=}\left{S_i\right}_{i=1}^k) of (M) then the process of finding concrete the values (V) for (P) that make the stocks (S) to closely resemble the time series (T) according to some metric is called calibration of (M) over the targets (T).

Roles

  • There are three types of people dealing with the models:
    • Modeler, who develops and implements the model and prepares it for calibration.
    • Calibrator, who calibrates the model with different data for different parameters.
    • Stakeholder, who requires different features of the model and outcomes from different scenario play-outs.
  • There are two main calibration scenarios:
    • Modeler and Calibrator are the same person
    • Modeler and Calibrator are different persons

Process

Model development and calibration is most likely going to be an iterative process.

For concreteness let us assume that the model has matured development-wise and batch simulation and model calibration is done in a (more) formal way.

Here are the steps of a well defined process between the modeling activity players described above:

  1. Stakeholder requires certain scenarios to be investigated.
  2. Modeler prepares the model for those scenarios.
  3. Stakeholder and Modeler formulate a calibration request.
  4. Calibrator uses the specifications from the calibration request to:
    1. Calibrate the model
    2. Derive model outcomes results
    3. Provide model qualitative results
    4. Provide model sensitivity analysis results
  5. Modeler (and maybe Stakeholder) review the results and decides should more calibration be done.
    1. I.e. go to 3.
  6. Modeler does batch simulations with the calibrated model for the investigation scenarios.
  7. Modeler and Stakeholder prepare report with the results.

See the documents [AA9, AA10] have questionnaires that further clarify the details of interaction between the modelers and calibrators.

Batch simulation vs calibration

In order to clarify the similarities and differences between batch simulation and calibration we list the following observations:

  • Each batch simulation or model calibration is done either for model development purposes or for scenario play-out studies.
  • Batch simulation is used for qualitative studies of the model. For example, doing sensitivity analysis; see [BC1, AA7, AA8].
  • Before starting the calibration we might want to study the “landscape” of the search space of the calibration parameters using batch simulations.
  • Batch simulation is also done after model calibration in order to evaluate different scenarios,
  • For some models with large computational complexity batch simulation – together with some evaluation metric – can be used instead of model calibration.

Batch simulations workflow

In this section we describe the specification and execution of model batch simulations.

Batch simulations can be time consuming, hence it is good idea to

In the rest of the section we go through the following steps:

  1. Make a model object
  2. Batch simulate over a few combinations of parameters and show:
    1. Plots of the simulation results for all populations
    2. Plots of the simulation results for a particular population
  3. Batch simulate over the Cartesian (outer) product of values lists of a selected pair of parameters and show the corresponding plots of all simulations

Model (object) for batch simulations

Here we make a new ECMMon object:

ecmObj2 =  ECMMonUnit[]   ECMMonSetSingleSiteModel[model1]   ECMMonAssignRateRules[aDefaultPars];

Direct specification of combinations of parameters

All populations

Here we simulate the model in the object of different parameter combinations given in a list of associations:

res1 =  ecmObj2   ECMMonBatchSimulate[___ ~~ "Population", {<|qsd -> 60, ql -> 28|>, <|qsd -> 55, ql -> 28|>, <|qsd -> 75, ql -> 21, \[Beta][ISSP] -> 0|>}, 240]   ECMMonTakeValue;

Remark: The stocks in the results are only stocks that are populations – that is specified with the string expression pattern ___~~”Population”.

Here is the shape of the result:

Short /@ res1
0a1hc3qgvuof7

Here are the corresponding plots:

ListLinePlot[#, PlotTheme -> "Detailed", ImageSize -> Medium, PlotRange -> All] & /@ res1
1veewx20b1778

Focus population

We might be interested in the batch simulations results for only one, focus populations. Here is an example:

res2 =    ecmObj2    ECMMonBatchSimulate["Infected Normally Symptomatic Population", {<|qsd -> 60, ql -> 28|>, <|qsd -> 55, ql -> 28|>, <|qsd -> 75, ql -> 21, \[Beta][ISSP] -> 0|>}, 240]    ECMMonTakeValue;

Here is the shape of the result:

Short /@ res2
0es5cj4hgrljm

Here are the corresponding plots:

Multicolumn[ KeyValueMap[  ListLinePlot[#2, PlotLabel -> #1, PlotTheme -> "Detailed",     Epilog -> {Directive[Orange, Dashed],       Line[{Scaled[{0, -1}, {#1[qsd], 0}], Scaled[{0, 1}, {#1[qsd], 0}]}],       Line[{Scaled[{0, -1}, {#1[qsd] + #1[ql], 0}], Scaled[{0, 1}, {#1[qsd] + #1[ql], 0}]}]},     ImageSize -> Medium] &, res2]]
17ofnpddvw8ud

Outer product of parameters

Instead of specifying an the combinations of parameters directly we can specify the values taken by each parameter using an association in which the keys are parameters and the values are list of values:

res3 =    ecmObj2    ECMMonBatchSimulate[__ ~~ "Population", <|qsd -> {60, 55, 75}, ql -> {28, 21}|>, 240]    ECMMonTakeValue;

Here is the shallow form of the results

Short /@ res3
1e260m2euunsl

Here are the corresponding plots:

Multicolumn[ KeyValueMap[  ListLinePlot[#2, PlotLabel -> #1, PlotTheme -> "Detailed",     Epilog -> {Directive[Gray, Dashed],       Line[{Scaled[{0, -1}, {#1[qsd], 0}], Scaled[{0, 1}, {#1[qsd], 0}]}],       Line[{Scaled[{0, -1}, {#1[qsd] + #1[ql], 0}], Scaled[{0, 1}, {#1[qsd] + #1[ql], 0}]}]},     ImageSize -> Medium] &, res3]]
1u4ta8aypszhz

Calibration workflow

In this section we go through the computation steps of the calibration of single-site SEI2HR model.

Remark: We use real data in this section, but the presented calibration results and outcome plots are for illustration purposes only. A rigorous study with discussion of the related assumptions and conclusions is beyond the scope of this notebook/document.

Calibration steps

Here are the steps performed in the rest of the sub-sections of this section:

  1. Ingest data for infected cases, deaths due to disease, etc.
  2. Choose a model to calibrate.
  3. Make the calibration targets – those a vectors corresponding to time series over regular grids.
    1. Consider using all of the data in order to evaluate model’s applicability.
    2. Consider using fractions of the data in order to evaluate model’s ability to predict the future or reconstruct data gaps.
  4. Choose calibration parameters and corresponding ranges for their values.
  5. If more than one target choose the relative weight (or importance) of the targets.
  6. Calibrate the model.
  7. Evaluate the fitting between the simulation results and data.
    1. Using statistics and plots.
  8. Make conclusions. If insufficiently good results are obtained go to 2 or 4.

Remark: When doing calibration epidemiological models a team of people it is better certain to follow (rigorously) well defined procedures. See the documents:

Remark: We plan to prepare have several notebooks dedicated to calibration of both single-site and multi-site models.

USA COVID-19 data

Here data for the USA COVID-19 infection cases and deaths from [NYT1] (see [AA6] data ingestion details):

lsCases = {1, 1, 1, 2, 3, 5, 5, 5, 5, 6, 7, 8, 11, 11, 11, 12, 12, 12,12, 12, 13, 13, 14, 15, 15, 15, 15, 25, 25, 25, 27, 30, 30, 30, 43, 45, 60, 60, 65, 70, 85, 101, 121, 157, 222, 303, 413, 530, 725,976, 1206, 1566, 2045, 2603, 3240, 4009, 5222, 6947, 9824, 13434, 17918, 23448, 30082, 37696, 46791, 59678, 73970, 88796, 103318, 119676, 139165, 160159, 184764, 212033, 241127, 262275, 288195, 314991, 341540, 370689, 398491, 423424, 445213, 467106, 490170, 512972, 539600, 566777, 590997, 613302, 637812, 660549, 685165, 714907, 747741, 777098, 800341, 820764, 844225, 868644, 895924, 927372, 953923, 977395, 998136, 1020622, 1043873, 1069587, 1095405,1118643, 1137145, 1155671, 1176913, 1196485, 1222057, 1245777, 1267911, 1285105, 1306316, 1326559, 1349019, 1373255, 1395981, 1416682, 1436260, 1455183, 1473813, 1491974, 1513223, 1536848, 1559020, 1578876, 1600414, 1620096, 1639677, 1660303, 1688335, 1709852, 1727711, 1745546, 1763803, 1784049, 1806724, 1831494, 1855870, 1874023, 1894074, 1918373, 1943743, 1970066, 2001470, 2031613, 2057493, 2088420, 2123068, 2159633, 2199841, 2244876, 2286401, 2324563, 2362875, 2411709, 2461341, 2514500, 2573030, 2622980, 2667278, 2713656, 2767129, 2825865, 2885325, 2952393, 3012349, 3069369, 3129738, 3194944, 3263077, 3338308, 3407962, 3469137, 3529938, 3588229, 3653114, 3721574, 3790356, 3862588, 3928575, 3981476, 4039440, 4101329, 4167741, 4235717, 4303663, 4359188, 4408708, 4455340, 4507370, 4560539, 4617036, 4676822, 4730639, 4777548, 4823529, 4876038, 4929115, 4981066, 5038637, 5089258, 5130147, 5166032, 5206970, 5251824, 5297150, 5344322, 5388034, 5419494, 5458726, 5497530, 5541830, 5586297, 5631403, 5674714, 5707327, 5742814, 5786178, 5817338, 5862014, 5917466, 5958619, 5988001, 6012054, 6040456, 6073671, 6110645, 6157050, 6195893, 6228601, 6264192, 6301923, 6341145, 6385411, 6432677, 6472474, 6507345, 6560827, 6597281, 6638806, 6682079, 6734971, 6776512, 6812354, 6847745, 6889421, 6930523, 6975693, 7027692, 7073962, 7107992, 7168298, 7210171, 7261433, 7315687, 7373073, 7422798, 7466501, 7513020, 7565839, 7625285, 7688761, 7757326, 7808123, 7853753, 7916879, 7976530, 8039653, 8113165, 8193658, 8270925, 8328369, 8401001, 8473618, 8555199, 8642599, 8737995, 8814233, 8892933, 8983153, 9074711, 9182627, 9301455, 9436244, 9558668, 9659817, 9784920, 9923082, 10065150, 10222441, 10405550, 10560047, 10691686, 10852769, 11011484, 11183982, 11367840, 11561152, 11727724, 11864571, 12039323, 12213742, 12397014, 12495699, 12693598, 12838076, 12972986, 13135728, 13315143, 13516558, 13728192, 13958512, 14158135, 14325555, 14519697, 14731424, 14954596, 15174109, 15447371, 15647963, 15826415, 16020169, 16218331, 16465552, 16697862, 16941306, 17132902, 17305013, 17498382, 17694678, 17918870, 18106293, 18200349, 18410644, 18559596, 18740591, 18932346, 19157710};
lsDeaths = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 3, 6, 10, 12, 12, 15, 19, 22, 26, 31, 37, 43, 50, 59, 63, 84, 106, 137, 181, 223, 284, 335, 419, 535, 694, 880, 1181, 1444, 1598, 1955, 2490, 3117, 3904, 4601, 5864, 6408, 7376, 8850, 10159, 11415,12924, 14229, 15185, 16320, 18257, 20168, 21941, 23382, 24617, 26160, 27535, 29821, 31633, 33410, 35104, 36780, 37660, 38805, 40801, 42976, 44959, 46552, 48064, 49122, 50012, 52079, 54509, 56277, 57766, 59083, 59903, 60840, 62299, 63961, 65623, 67143, 68260, 68959, 69633, 71042, 72474, 73718, 74907, 75891, 76416, 76888, 77579, 79183, 80329, 81452, 82360, 82904, 83582, 84604, 85545, 86487, 87559, 88256, 88624, 89235, 90220, 91070, 91900, 92621, 93282, 93578, 93988, 94710, 95444, 96123, 96760, 97297, 97514, 97832, 98638, 99372, 101807, 102525, 103018, 103258, 103594,104257, 104886, 105558, 106148, 106388, 106614, 106998, 107921, 108806, 109705, 110522, 111177, 111567, 111950, 112882, 113856, 114797, 115695, 116458, 116854, 117367, 118475, 119606, 120684, 121814, 122673, 123105, 124782, 126089, 127465, 128720, 130131, 131172, 131584, 132174, 133517, 134756, 135271, 136584, 137547, 138076, 138601, 140037, 141484, 142656, 143793, 144819, 145315, 145831, 147148, 148517, 149550, 150707, 151641, 152065, 152551, 153741, 154908, 156026, 157006, 157869, 158235, 158714, 159781, 160851, 161916, 162869, 163573, 163970, 164213, 164654, 165811, 166713, 167913, 168594, 168990, 169423, 170674, 171663, 172493, 173408, 174075, 174281, 174689, 175626, 176698, 177559, 178390, 179145, 179398, 179736, 180641, 181607, 182455, 183282, 183975, 184298, 184698, 185414, 186359, 187280, 188168, 188741, 189165, 189501, 190312, 191267, 192077, 192940, 193603, 193976, 194481, 195405, 196563, 197386, 198271, 199126, 199462, 199998, 200965, 201969, 202958, 203878, 204691, 205119, 205623, 206757, 208331, 209417, 210829, 211838, 212277, 212989, 214442, 215872, 217029, 218595, 219791, 220402, 221165, 222750, 224641, 226589, 228452, 229868, 230695, 231669, 233856, 236127, 237284, 238636, 239809, 240607, 241834, 244430, 247258, 250102, 252637, 254784, 255857, 257282, 260038, 263240, 266144, 268940, 271172, 272499, 274082, 277088, 280637, 283899, 286640, 289168, 290552, 292346, 295549, 298953, 301713, 302802, 304403, 305581, 307396, 310994, 314654};

Remark: The COVID-19 data was ingested from [NYT1] on 2020-12-31,

Calibration targets

From the data we make the calibration targets association:

aTargets = <|{ISSP -> 0.2 lsCases, INSP -> 0.8 lsCases, DIP -> lsDeaths}|>;

Remark: Note that we split the infection cases into 20% severely symptomatic cases and 80% normally symptomatic cases.

Here is the corresponding plot:

ListLogPlot[aTargets, PlotTheme -> "Detailed", PlotLabel -> "Calibration targets", ImageSize -> Medium]
1u9akes6b4dx0

Here we prepare a smaller set of the targets data for the calibration experiments below:

aTargetsShort = Take[#, 170] & /@ aTargets;

Model creation

modelSEI2HR = SEI2HRModel[t, "TotalPopulationRepresentation" -> "AlgebraicEquation"];

Here are the parameters we want to experiment with (or do calibration with):

lsFocusParams = {aincp, aip, sspf[SP], \[Beta][HP], qsd, ql, qcrf, nhbcr[ISSP, INSP], nhbr[TP]};

Here we set custom rates and initial conditions:

aDefaultPars = <|    \[Beta][ISSP] -> 0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],     \[Beta][INSP] -> 0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],     qsd -> 60,     ql -> 8*7,     qcrf -> 0.25,     \[Beta][HP] -> 0.01,     \[Mu][ISSP] -> 0.035/aip,     \[Mu][INSP] -> 0.01/aip,     nhbr[TP] -> 3/1000,     lpcr[ISSP, INSP] -> 1,     hscr[ISSP, INSP] -> 1    |>;

Remark: Note the piecewise functions for (\beta [\text{ISSP}]) and (\beta [\text{INSP}]).

Calibration

Here is the USA population number we use for calibration:

usaPopulation = QuantityMagnitude@CountryData["UnitedStates", "Population"](*329064917*)

Here is we create a ECMMon object that has default parameters and initial conditions assigned above:

AbsoluteTiming[  ecmObj3 =     ECMMonUnit[]     ECMMonSetSingleSiteModel[modelSEI2HR]     ECMMonAssignInitialConditions[<|TP[0] -> usaPopulation, SP[0] -> usaPopulation - 1, ISSP[0] -> 1|>]     ECMMonAssignRateRules[KeyDrop[aDefaultPars, {aip, aincp, qsd, ql, qcrf}]]     ECMMonCalibrate[      "Target" -> KeyTake[aTargetsShort, {ISSP, DIP}],       "StockWeights" -> <|ISSP -> 0.8, DIP -> 0.2|>,       "Parameters" -> <|aip -> {10, 35}, aincp -> {2, 16}, qsd -> {60, 120}, ql -> {20, 160}, qcrf -> {0.1, 0.9}|>,       DistanceFunction -> EuclideanDistance,       Method -> {"NelderMead", "PostProcess" -> False},       MaxIterations -> 1000      ];  ](*{28.0993, Null}*)

Here are the found parameters:

calRes = ecmObj3ECMMonTakeValue(*{152516., {aip -> 10., aincp -> 3.67018, qsd -> 81.7067, ql -> 111.422, qcrf -> 0.312499}}*)

Using different minimization methods and distance functions

In the monad the calibration of the models is done with NMinimize. Hence, the monad function ECMMonCalibrate takes all options of NMinimize and can do calibrations with the same data and parameter search space using different global minima finding methods and distance functions.

Remark: EuclideanDistance is an obvious distance function, but use others like infinity norm and sum norm. Also, we can use a distance function that takes parts of the data. (E.g. between days 50 and 150 because the rest of the data is, say, unreliable.)

Verification of the fit

maxTime = Length[aTargets[[1]]];
ecmObj4 =    ECMMonUnit[]    ECMMonSetSingleSiteModel[modelSEI2HR]    ECMMonAssignInitialConditions[<|TP[0] -> usaPopulation, SP[0] -> usaPopulation - 1, ISSP[0] -> 1|>]    ECMMonAssignRateRules[Join[aDefaultPars, Association[calRes[[2]]]]]    ECMMonSimulate[maxTime]    ECMMonPlotSolutions[___ ~~ "Population" ~~ ___, maxTime, ImageSize -> Large, LogPlot -> False];
12rw9e4vl0vy0
aSol4 = ecmObj4ECMMonGetSolutionValues[Keys[aTargets], maxTime]ECMMonTakeValue;
Map[ListLogPlot[{aSol4[#], aTargets[#]}, PlotLabel -> #, PlotRange -> All, ImageSize -> Medium, PlotLegends -> {"Calibrated model", "Target"}] &, Keys[aTargets]]
10kkuh7578jjz

Conclusions from the calibration results

We see the that with the calibration found parameter values the model can fit the data for the first 200 days, after that it overestimates the evolution of the infected and deceased popupulations.

We can conjecture that:

  • The model is too simple, hence inadequate
  • That more complicated quarantine policy functions have to be used
  • That the calibration process got stuck in some local minima

Future plans

In this section we outline some of the directions in which the presented work on ECMMon can be extended.

More unit tests and random unit tests

We consider the preparation and systematic utilization of unit tests to be a very important component of any software development. Unit tests are especially important when complicated software package like ECMMon are developed.

For the presented software monad (and its separately developed, underlying packages) have implemented a few collections of tests, see [AAp10, AAp11].

We plan to extend and add more complicated unit tests that test for both quantitative and qualitative behavior. Here are some examples for such tests:

  • Stock-vs-stock orbits produced by simulations of certain epidemic models
  • Expected theoretical relationships between populations (or other stocks) for certain initial conditions and rates
  • Wave-like propagation of the proportions of the infected populations in multi-site models over artificial countries and traveling patterns
  • Finding of correct parameter values with model calibration over different data (both artificial and real life)
  • Expected number of equations for different model set-ups
  • Expected (relative) speed of simulations with respect to model sizes

Further for the monad ECMMon we plant to develop random pipeline unit tests as the ones in [AAp12] for the classification monad ClCon, [AA11].

More comprehensive calibration guides and documentation

We plan to produce more comprehensive guides for doing calibration with ECMMon and in general with Mathematica’s NDSolve and NMinimize functions.

Full correspondence between the Mathematica and R implementations

The ingredients of the software monad ECMMon and ECMMon itself were designed and implemented in Mathematica first. The corresponding design and implementation was done in R, [AAr2]. To distinguish the two implementations we call the R one ECMMon-R and Mathematica (Wolfram Language) one ECMMon-WL.

At this point the calibration is not implemented in ECMMon-R, but we plan to do that soon.

Using ECMMon-R (and the RStudio’s Shiny ecosystem) allows for highly shareable interactive interfaces to be programed. Here is an example: https://antononcube.shinyapps.io/SEI2HR-flexdashboard/ .

(With Mathematica similar interactive interfaces are presented in [AA7, AA8].)

Model transfer between Mathematica and R

We are very interested in transferring epidemiological models from Mathematica to R (or Python, or Julia.)

This can be done in two principle ways: (i) using Mathematica expressions parsers, or (ii) using matrix representations. We plan to investigate the usage of both approaches.

Conversational agent

Consider the making of a conversational agent for epidemiology modeling workflows building. Initial design and implementation is given in [AA13, AA14].

Consider the following epidemiology modeling workflow specification:

lsCommands = "create with SEI2HR;assign 100000 to total population;set infected normally symptomatic population to be 0;set infected severely symptomatic population to be 1;assign 0.56 to contact rate of infected normally symptomatic population;assign 0.58 to contact rate of infected severely symptomatic population;assign 0.1 to contact rate of the hospitalized population;simulate for 240 days;plot populations results;calibrate for target DIPt -> tsDeaths, over parameters contactRateISSP in from 0.1 to 0.7;echo pipeline value";

Here is the ECMMon code generated using the workflow specification:

ToEpidemiologyModelingWorkflowCode[lsCommands, "Execute" -> False, "StringResult" -> True](*"ECMMonUnit[SEI2HRModel[t]] ECMMonAssignInitialConditions[<|TP[0] -> 100000|>] ECMMonAssignInitialConditions[<|INSP[0] -> 0|>] ECMMonAssignInitialConditions[<|ISSP[0] -> 1|>] ECMMonAssignRateRules[<|\\[Beta][INSP] -> 0.56|>] ECMMonAssignRateRules[<|\\[Beta][ISSP] -> 0.58|>] ECMMonAssignRateRules[<|\\[Beta][HP] -> 0.1|>] ECMMonSimulate[\"MaxTime\" -> 240] ECMMonPlotSolutions[ \"Stocks\" -> __ ~~ \"Population\"] ECMMonCalibrate[ \"Target\" -> <|DIP -> tsDeaths|>, \"Parameters\" -> <|\\[Beta][ISSP] -> {0.1, 0.7}|> ] ECMMonEchoValue[]"*)

Here is the execution of the code above:

Block[{tsDeaths = Take[lsDeaths, 150]}, ToEpidemiologyModelingWorkflowCode[lsCommands]];
0cs73r03ki1mi 0227wiq3locdf

Different target languages

Using the natural commands workflow specification we can generate code to other languages, like, Python or R:

ToEpidemiologyModelingWorkflowCode[lsCommands, "Target" -> "Python"](*"obj = ECMMonUnit( model = SEI2HRModel())obj = ECMMonAssignInitialConditions( ecmObj = obj, initConds = [TPt = 100000])obj = ECMMonAssignInitialConditions( ecmObj = obj, initConds = [INSPt = 0])obj = ECMMonAssignInitialConditions( ecmObj = obj, initConds = [ISSPt = 1])obj = ECMMonAssignRateValues( ecmObj = obj, rateValues = [contactRateINSP = 0.56])obj = ECMMonAssignRateValues( ecmObj = obj, rateValues = [contactRateISSP = 0.58])obj = ECMMonAssignRateValues( ecmObj = obj, rateValues = [contactRateHP = 0.1])obj = ECMMonSimulate( ecmObj = obj, maxTime = 240)obj = ECMMonPlotSolutions( ecmObj = obj, stocksSpec = \".*Population\")obj = ECMMonCalibrate( ecmObj = obj,  target = [DIPt = tsDeaths], parameters = [contactRateISSP = [0.1, 0.7]] )"*)

References

Articles

[Wk1] Wikipedia entry, Monad.

[Wk2] Wikipedia entry, “Compartmental models in epidemiology”.

[Wk3] Wikipedia entry, “Coronavirus disease 2019”.

[BC1] Lucia Breierova, Mark Choudhari, An Introduction to Sensitivity Analysis, (1996), Massachusetts Institute of Technology.

[JS1] John D.Sterman, Business Dynamics: Systems Thinking and Modeling for a Complex World. (2000), New York: McGraw.

[HH1] Herbert W. Hethcote (2000). “The Mathematics of Infectious Diseases”. SIAM Review. 42 (4): 599–653. Bibcode:2000SIAMR..42..599H. doi:10.1137/s0036144500371907.

[AA1] Anton Antonov, ”Monad code generation and extension”, (2017), MathematicaForPrediction at GitHub/antononcube.

[AA2] Anton Antonov, “Coronavirus propagation modeling considerations”, (2020), SystemModeling at GitHub/antononcube.

[AA3] Anton Antonov, “Basic experiments workflow for simple epidemiological models”, (2020), SystemModeling at GitHub/antononcube.

[AA4] Anton Antonov, “Scaling of Epidemiology Models with Multi-site Compartments”, (2020), SystemModeling at GitHub/antononcube.

[AA5] Anton Antonov, “WirVsVirus hackathon multi-site SEI2R over a hexagonal grid graph”, (2020), SystemModeling at GitHub/antononcube.

[AA6] Anton Antonov, “NY Times COVID-19 data visualization”, (2020), SystemModeling at GitHub/antononcube.

[AA7] Anton Antonov, “SEI2HR model with quarantine scenarios”, (2020), SystemModeling at GitHub/antononcube.

[AA8] Anton Antonov, “SEI2HR-Econ model with quarantine and supplies scenarios”, (2020), SystemModeling at GitHub/antononcube.

[AA9] Anton Antonov, Modelers questionnaire, (2020), SystemModeling at GitHub/antononcube.

[AA10] Anton Antonov, Calibrators questionnaire, (2020), SystemModeling at GitHub/antononcube.

[AA11] Anton Antonov, A monad for classification workflows, (2018), MathematicaForPrediction at WordPress.

Repositories, packages

[WRI1] Wolfram Research, Inc., “Epidemic Data for Novel Coronavirus COVID-19”, WolframCloud.

[WRI2] Wolfram Research Inc., USA county records, (2020), System Modeling at GitHub.

[NYT1] The New York Times, Coronavirus (Covid-19) Data in the United States, (2020), GitHub.

[AAr1] Anton Antonov, Coronavirus propagation dynamics project, (2020), SystemModeling at GitHub/antononcube.

[AAr2] Anton Antonov, Epidemiology Compartmental Modeling Monad R package, (2020), ECMMon-R at GitHu/antononcube.

[AAp1] Anton Antonov, State monad code generator Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp2] Anton Antonov, Monadic tracing Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp3] Anton Antonov, Epidemiology models Mathematica package, (2020), SystemModeling at GitHub/antononcube.

[AAp4] Anton Antonov, Epidemiology models modifications Mathematica package, (2020), SystemModeling at GitHub/antononcube.

[AAp5] Anton Antonov, Epidemiology modeling visualization functions Mathematica package, (2020), SystemModeling at GitHub/antononcube.

[AAp6] Anton Antonov, System dynamics interactive interfaces functions Mathematica package, (2020), SystemModeling at GitHub/antononcube.

[AAp7] Anton Antonov, Multi-site model simulation Mathematica package, (2020), SystemModeling at GitHub/antononcube.

[AAp8] Anton Antonov, Monadic Epidemiology Compartmental Modeling Mathematica package, (2020), SystemModeling at GitHub/antononcube.

[AAp9] Anton Antonov, Hextile bins Mathematica package, (2020), MathematicaForPrediction at GitHub/antononcube.

[AAp10] Anton Antonov, Monadic Epidemiology Compartmental Modeling Mathematica unit tests, (2020), SystemModeling at GitHub/antononcube.

[AAp11] Anton Antonov, Epidemiology Models NDSolve Mathematica unit tests, (2020), SystemModeling at GitHub/antononcube.

[AAp12] Anton Antonov, Monadic contextual classification random pipelines Mathematica unit tests, (2018), MathematicaForPrediction at GitHub/antononcube.

[AAp13] Anton Antonov, Epidemiology Modeling Workflows Raku package, (2020), Raku-DSL-English-EpidemiologyModelingWorkflows at GitHu/antononcube.

[AAp14] Anton Antonov, External Parsers Hookup Mathematica package, (2019), ConversationalAgents at GitHub.

Generation of Random Bethlehem Stars

Introduction

This document/notebook is inspired by the Mathematica Stack Exchange (MSE) question “Plotting the Star of Bethlehem”, [MSE1]. That MSE question requests efficient and fast plotting of a certain mathematical function that (maybe) looks like the Star of Bethlehem, [Wk1]. Instead of doing what the author of the questions suggests, I decided to use a generative art program and workflows from three of most important Machine Learning (ML) sub-cultures: Latent Semantic Analysis, Recommendations, and Classification.

Although we discuss making of Bethlehem Star-like images, the ML workflows and corresponding code presented in this document/notebook have general applicability – in many situations we have to make classifiers based on data that has to be “feature engineered” through pipeline of several types of ML transformative workflows and that feature engineering requires multiple iterations of re-examinations and tuning in order to achieve the set goals.

The document/notebook is structured as follows:

  1. Target Bethlehem Star images
  2. Simplistic approach
  3. Elaborated approach outline
  4. Sections that follow through elaborated approach outline:
    1. Data generation
    2. Feature extraction
    3. Recommender creation
    4. Classifier creation and utilization experiments

(This document/notebook is a “raw” chapter for the book “Simplified Machine Learning Workflows”, [AAr3].)

Target images

Here are the images taken from [MSE1] that we consider to be “Bethlehem Stars” in this document/notebook:

imgStar1 = Import["https://i.stack.imgur.com/qmmOw.png"];
imgStar2 = Import["https://i.stack.imgur.com/5gtsS.png"];
Row[{imgStar1, Spacer[5], imgStar2}]
00dxgln7hhmjl

We notice that similar images can be obtained using the Wolfram Function Repository (WFR) function RandomMandala, [AAr1]. Here are a dozen examples:

SeedRandom[5];
Multicolumn[Table[MandalaToWhiterImage@ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> RandomInteger[{2, 8}], "ConnectingFunction" -> FilledCurve@*BezierCurve], 12], 6, Background -> Black]
0dwkbztss087q

Simplistic approach

We can just generate a large enough set of mandalas and pick the ones we like.

More precisely we have the following steps:

  1. We generate, say, 200 random mandalas using BlockRandom and keeping track of the random seeds
    1. The mandalas are generated with rotational symmetry order 2 and filled Bezier curve connections.
  2. We pick mandalas that look, more or less, like Bethlehem Stars
  3. Add picked mandalas to the results list
  4. If too few mandalas are in the results list go to 1.

Here are some mandalas generated with those steps:

lsStarReferenceSeeds = DeleteDuplicates@{697734, 227488491, 296515155601, 328716690761, 25979673846, 48784395076, 61082107304, 63772596796, 128581744446, 194807926867, 254647184786, 271909611066, 296515155601, 575775702222, 595562118302, 663386458123, 664847685618, 680328164429, 859482663706};
Multicolumn[
  Table[BlockRandom[ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve, ColorFunction -> (White &), Background -> Black], RandomSeeding -> rs], {rs, lsStarReferenceSeeds}] /. GrayLevel[0.25`] -> White, 6, Appearance -> "Horizontal", Background -> Black]
1aedatd1zb3fh

Remark: The plot above looks prettier in notebook converted with the resource function DarkMode.

Elaborated approach

Assume that we want to automate the simplistic approach described in the previous section.

One way to automate is to create a Machine Learning (ML) classifier that is capable of discerning which RandomMandala objects look like Bethlehem Star target images and which do not. With such a classifier we can write a function BethlehemMandala that applies the classifier on multiple results from RandomMandala and returns those mandalas that the classifier says are good.

Here are the steps of building the proposed classifier:

  • Generate a large enough Random Mandala Images Set (RMIS)
  • Create a feature extractor from a subset of RMIS
  • Assign features to all of RMIS
  • Make a recommender with the RMIS features and other image data (like pixel values)
  • Apply the RMIS recommender over the target Bethlehem Star images and determine and examine image sets that are:
    • the best recommendations
    • the worse recommendations
  • With the best and worse recommendations sets compose training data for classifier making
  • Train a classifier
  • Examine classifier application to (filtering of) random mandala images (both in RMIS and not in RMIS)
  • If the results are not satisfactory redo some or all of the steps above

Remark: If the results are not satisfactory we should consider using the obtained classifier at the data generation phase. (This is not done in this document/notebook.)

Remark: The elaborated approach outline and flow chart have general applicability, not just for generation of random images of a certain type.

Flow chart

Here is a flow chart that corresponds to the outline above:

09agsmbmjhhv4

A few observations for the flow chart follow:

  • The flow chart has a feature extraction block that shows that the feature extraction can be done in several ways.
    • The application of LSA is a type of feature extraction which this document/notebook uses.
  • If the results are not good enough the flow chart shows that the classifier can be used at the data generation phase.
  • If the results are not good enough there are several alternatives to redo or tune the ML algorithms.
    • Changing or tuning the recommender implies training a new classifier.
    • Changing or tuning the feature extraction implies making a new recommender and a new classifier.

Data generation and preparation

In this section we generate random mandala graphics, transform them into images and corresponding vectors. Those image-vectors can be used to apply dimension reduction algorithms. (Other feature extraction algorithms can be applied over the images.)

Generated data

Generate large number of mandalas:

k = 20000;
knownSeedsQ = False;
SeedRandom[343];
lsRSeeds = Union@RandomInteger[{1, 10^9}, k];
AbsoluteTiming[
  aMandalas = 
    If[TrueQ@knownSeedsQ, 
     Association@Table[rs -> BlockRandom[ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve], RandomSeeding -> rs], {rs, lsRSeeds}], 
    (*ELSE*) 
     Association@Table[i -> ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve], {i, 1, k}] 
    ]; 
 ]

(*{18.7549, Null}*)

Check the number of mandalas generated:

Length[aMandalas]

(*20000*)

Show a sample of the generated mandalas:

Magnify[Multicolumn[MandalaToWhiterImage /@ RandomSample[Values@aMandalas, 40], 10, Background -> Black], 0.7]
1gpblane63eo9

Data preparation

Convert the mandala graphics into images using appropriately large (or appropriately small) image sizes:

AbsoluteTiming[
  aMImages = ParallelMap[ImageResize[#, {120, 120}] &, aMandalas]; 
 ]

(*{248.202, Null}*)

Flatten each of the images into vectors:

AbsoluteTiming[
  aMImageVecs = ParallelMap[Flatten[ImageData[Binarize@ColorNegate@ColorConvert[#, "Grayscale"]]] &, aMImages]; 
 ]

(*{16.0125, Null}*)

Remark: Below those vectors are called image-vectors.

Feature extraction

In this section we use the software monad LSAMon, [AA1, AAp1], to do dimension reduction over a subset of random mandala images.

Remark: Other feature extraction methods can be used through the built-in functions FeatureExtraction and FeatureExtract.

Dimension reduction

Create an LSAMon object and extract image topics using Singular Value Decomposition (SVD) or Independent Component Analysis (ICA), [AAr2]:

SeedRandom[893];
AbsoluteTiming[
  lsaObj = 
    LSAMonUnit[]⟹
     LSAMonSetDocumentTermMatrix[SparseArray[Values@RandomSample[aMImageVecs, UpTo[2000]]]]⟹
     LSAMonApplyTermWeightFunctions["None", "None", "Cosine"]⟹
     LSAMonExtractTopics["NumberOfTopics" -> 40, Method -> "ICA", "MaxSteps" -> 240, "MinNumberOfDocumentsPerTerm" -> 0]⟹
     LSAMonNormalizeMatrixProduct[Normalized -> Left]; 
 ]

(*{16.1871, Null}*)

Show the importance coefficients of the topics (if SVD was used the plot would show the singular values):

ListPlot[Norm /@ SparseArray[lsaObj⟹LSAMonTakeH], Filling -> Axis, PlotRange -> All, PlotTheme -> "Scientific"]
1sy1zsgpxysof

Show the interpretation of the extracted image topics:

lsaObj⟹
   LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹
   LSAMonEchoFunctionContext[ImageAdjust[Image[Partition[#, ImageDimensions[aMImages[[1]]][[1]]]]] & /@ SparseArray[#H] &];
16h8a7jwknnkt

Approximation

Pick a test image that is a mandala image or a target image and pre-process it:

If[True, 
   ind = RandomChoice[Range[Length[Values[aMImages]]]]; 
   imgTest = MandalaToWhiterImage@aMandalas[[ind]]; 
   matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aMImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic], 
  (*ELSE*) 
   imgTest = Binarize[imgStar2, 0.5]; 
   matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aMImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic] 
  ];
imgTest
0vlq50ryrw0hl

Find the representation of the test image with the chosen feature extractor (LSAMon object here):

matReprsentation = lsaObj⟹LSAMonRepresentByTopics[matImageTest]⟹LSAMonTakeValue;
lsCoeff = Normal@SparseArray[matReprsentation[[1, All]]];
ListPlot[lsCoeff, Filling -> Axis, PlotRange -> All]
1u57b208thtfz

Show the interpretation of the found representation:

H = SparseArray[lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹LSAMonTakeH];
vecReprsentation = lsCoeff . H;
ImageAdjust@Image[Rescale[Partition[vecReprsentation, ImageDimensions[aMImages[[1]]][[1]]]]]
1m7r3b5bx32ow

Recommendations

In this section we utilize the software monad SMRMon, [AAp3], to create a recommender for the random mandala images.

Remark: Instead of the Sparse Matrix Recommender (SMR) object the built-in function Nearest can be used.

Create SSparseMatrix object for all image-vectors:

matImages = ToSSparseMatrix[SparseArray[Values@aMImageVecs], "RowNames" -> Automatic, "ColumnNames" -> Automatic]
029x975bs3q7w

Normalize the rows of the image-vectors matrix:

AbsoluteTiming[
  matPixel = WeightTermsOfSSparseMatrix[matImages, "None", "None", "Cosine"] 
 ]
1k9xucwektmhh

Get the LSA topics matrix:

matH = (lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹LSAMonTakeH)
05zsn0o1jyqj6

Find the image topics representation for each image-vector (assuming matH was computed with SVD or ICA):

AbsoluteTiming[
  matTopic = matPixel . Transpose[matH] 
 ]
028u1jz1hgzx9

Here we create a recommender based on the images data (pixels) and extracted image topics (or other image features):

smrObj = 
   SMRMonUnit[]⟹
    SMRMonCreate[<|"Pixel" -> matPixel, "Topic" -> matTopic|>]⟹
    SMRMonApplyNormalizationFunction["Cosine"]⟹
    SMRMonSetTagTypeWeights[<|"Pixel" -> 0.2, "Topic" -> 1|>];

Remark: Note the weights assigned to the pixels and the topics in the recommender object above. Those weights were derived by examining the recommendations results shown below.

Here is the image we want to find most similar mandala images to – the target image:

imgTarget = Binarize[imgStar2, 0.5]
1qdmarfxa5i78

Here is the profile of the target image:

aProf = MakeSMRProfile[lsaObj, imgTarget, ImageDimensions[aMImages[[1]]]];
TakeLargest[aProf, 6]

(*<|"10032-10009-4392" -> 0.298371, "3906-10506-10495" -> 0.240086, "10027-10014-4387" -> 0.156797, "8342-8339-6062" -> 0.133822, "3182-3179-11222" -> 0.131565, "8470-8451-5829" -> 0.128844|>*)

Using the target image profile here we compute the recommendation scores for all mandala images of the recommender:

aRecs = 
   smrObj⟹
    SMRMonRecommendByProfile[aProf, All]⟹
    SMRMonTakeValue;

Here is a plot of the similarity scores:

Row[{ResourceFunction["RecordsSummary"][Values[aRecs]], ListPlot[Values[aRecs], ImageSize -> Medium, PlotRange -> All, PlotTheme -> "Detailed", PlotLabel -> "Similarity scores"]}]
1kdiisj4jg4ut

Here are the closest (nearest neighbor) mandala images:

Multicolumn[Values[ImageAdjust@*ColorNegate /@ aMImages[[ToExpression /@ Take[Keys[aRecs], 48]]]], 12, Background -> Black]
096uazw8izidy

Here are the most distant mandala images:

Multicolumn[Values[ImageAdjust@*ColorNegate /@ aMImages[[ToExpression /@ Take[Keys[aRecs], -48]]]], 12, Background -> Black]
0zb7hf24twij4

Classifier creation and utilization

In this section we:

  • Prepare classifier data
  • Build and examine a classifier using the software monad ClCon, [AA2, AAp2], using appropriate training, testing, and validation data ratios
  • Build a classifier utilizing all training data
  • Generate Bethlehem Star mandalas by filtering mandala candidates with the classifier

As it was mentioned above we prepare the data to build classifiers with by:

  • Selecting top, highest scores recommendations and labeling them with True
  • Selecting bad, low score recommendations and labeling them with False
AbsoluteTiming[
  Block[{
    lsBest = Values@aMandalas[[ToExpression /@ Take[Keys[aRecs], 120]]], 
    lsWorse = Values@aMandalas[[ToExpression /@ Join[Take[Keys[aRecs], -200], RandomSample[Take[Keys[aRecs], {3000, -200}], 200]]]]}, 
   lsTrainingData = 
     Join[
      Map[MandalaToWhiterImage[#, ImageDimensions@aMImages[[1]]] -> True &, lsBest], 
      Map[MandalaToWhiterImage[#, ImageDimensions@aMImages[[1]]] -> False &, lsWorse] 
     ]; 
  ] 
 ]

(*{27.9127, Null}*)

Using ClCon train a classifier and show its performance measures:

clObj = 
   ClConUnit[lsTrainingData]⟹
    ClConSplitData[0.75, 0.2]⟹
    ClConMakeClassifier["NearestNeighbors"]⟹
    ClConClassifierMeasurements⟹
    ClConEchoValue⟹
    ClConClassifierMeasurements["ConfusionMatrixPlot"]⟹
    ClConEchoValue;
0jkfza6x72kb5
03uf3deiz0hsd

Remark: We can re-run the ClCon workflow above several times until we obtain a classifier we want to use.

Train a classifier with all prepared data:

clObj2 = 
   ClConUnit[lsTrainingData]⟹
    ClConSplitData[1, 0.2]⟹
    ClConMakeClassifier["NearestNeighbors"];

Get the classifier function from ClCon object:

cfBStar = clObj2⟹ClConTakeClassifier
0awjjib00ihgg

Here we generate Bethlehem Star mandalas using the classifier trained above:

SeedRandom[2020];
Multicolumn[MandalaToWhiterImage /@ BethlehemMandala[12, cfBStar, 0.87], 6, Background -> Black]
0r37g633mpq0y

Generate Bethlehem Star mandala images utilizing the classifier (with a specified classifier probabilities threshold):

SeedRandom[32];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0.87, "Probabilities" -> True]]
0osesxm4gdvvf

Show unfiltered Bethlehem Star mandala candidates:

SeedRandom[32];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0, "Probabilities" -> True]]
0rr12n6savl9z

Remark: Examine the probabilities in the image-probability associations above – they show that the classifier is “working.“

Here is another set generated Bethlehem Star mandalas using rotational symmetry order 4:

SeedRandom[777];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0.8, "RotationalSymmetryOrder" -> 4, "Probabilities" -> True]]
0rgzjquk4amz4

Remark: Note that although a higher rotational symmetry order is used the highly scored results still seem relevant – they have the features of the target Bethlehem Star images.

References

[AA1] Anton Antonov, “A monad for Latent Semantic Analysis workflows”, (2019), MathematicaForPrediction at WordPress.

[AA2] Anton Antonov, “A monad for classification workflows”, (2018)), MathematicaForPrediction at WordPress.

[MSE1] “Plotting the Star of Bethlehem”, (2020),Mathematica Stack Exchange, question 236499,

[Wk1] Wikipedia entry, Star of Bethlehem.

Packages

[AAr1] Anton Antonov, RandomMandala, (2019), Wolfram Function Repository.

[AAr2] Anton Antonov, IdependentComponentAnalysis, (2019), Wolfram Function Repository.

[AAr3] Anton Antonov, “Simplified Machine Learning Workflows” book, (2019), GitHub/antononcube.

[AAp1] Anton Antonov, Monadic Latent Semantic Analysis Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp2] Anton Antonov, Monadic contextual classification Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp3] Anton Antonov, Monadic Sparse Matrix Recommender Mathematica package, (2018), MathematicaForPrediction at GitHub/antononcube.

Code definitions

urlPart = "https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/";
Get[urlPart <> "MonadicLatentSemanticAnalysis.m"];
Get[urlPart <> "MonadicSparseMatrixRecommender.m"];
Get[urlPart <> "/MonadicContextualClassification.m"];
Clear[MandalaToImage, MandalaToWhiterImage];
MandalaToImage[gr_Graphics, imgSize_ : {120, 120}] := ColorNegate@ImageResize[gr, imgSize];
MandalaToWhiterImage[gr_Graphics, imgSize_ : {120, 120}] := ColorNegate@ImageResize[gr /. GrayLevel[0.25`] -> Black, imgSize];
Clear[ImageToVector];
ImageToVector[img_Image] := Flatten[ImageData[ColorConvert[img, "Grayscale"]]];
ImageToVector[img_Image, imgSize_] := Flatten[ImageData[ColorConvert[ImageResize[img, imgSize], "Grayscale"]]];
ImageToVector[___] := $Failed;
Clear[MakeSMRProfile];
MakeSMRProfile[lsaObj_LSAMon, gr_Graphics, imgSize_] := MakeSMRProfile[lsaObj, {gr}, imgSize];
MakeSMRProfile[lsaObj_LSAMon, lsGrs : {_Graphics}, imgSize_] := MakeSMRProfile[lsaObj, MandalaToWhiterImage[#, imgSize] & /@ lsGrs, imgSize]
MakeSMRProfile[lsaObj_LSAMon, img_Image, imgSize_] := MakeSMRProfile[lsaObj, {img}, imgSize];
MakeSMRProfile[lsaObj_LSAMon, lsImgs : {_Image ..}, imgSize_] := 
   Block[{lsImgVecs, matTest, aProfPixel, aProfTopic}, 
    lsImgVecs = ImageToVector[#, imgSize] & /@ lsImgs; 
    matTest = ToSSparseMatrix[SparseArray[lsImgVecs], "RowNames" -> Automatic, "ColumnNames" -> Automatic]; 
    aProfPixel = ColumnSumsAssociation[lsaObj⟹LSAMonRepresentByTerms[matTest]⟹LSAMonTakeValue]; 
    aProfTopic = ColumnSumsAssociation[lsaObj⟹LSAMonRepresentByTopics[matTest]⟹LSAMonTakeValue]; 
    aProfPixel = Select[aProfPixel, # > 0 &]; 
    aProfTopic = Select[aProfTopic, # > 0 &]; 
    Join[aProfPixel, aProfTopic] 
   ];
MakeSMRProfile[___] := $Failed;
Clear[BethlehemMandalaCandiate];
BethlehemMandalaCandiate[opts : OptionsPattern[]] := ResourceFunction["RandomMandala"][opts, "RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve];
Clear[BethlehemMandala];
Options[BethlehemMandala] = Join[{ImageSize -> {120, 120}, "Probabilities" -> False}, Options[ResourceFunction["RandomMandala"]]];
BethlehemMandala[n_Integer, cf_ClassifierFunction, opts : OptionsPattern[]] := BethlehemMandala[n, cf, 0.87, opts];
BethlehemMandala[n_Integer, cf_ClassifierFunction, threshold_?NumericQ, opts : OptionsPattern[]] := 
   Block[{imgSize, probsQ, res, resNew, aResScores = <||>, aResScoresNew = <||>}, 
     
     imgSize = OptionValue[BethlehemMandala, ImageSize]; 
     probsQ = TrueQ[OptionValue[BethlehemMandala, "Probabilities"]]; 
     
     res = {}; 
     While[Length[res] < n, 
      resNew = Table[BethlehemMandalaCandiate[FilterRules[{opts}, Options[ResourceFunction["RandomMandala"]]]], 2*(n - Length[res])]; 
      aResScoresNew = Association[# -> cf[MandalaToImage[#, imgSize], "Probabilities"][True] & /@ resNew]; 
      aResScoresNew = Select[aResScoresNew, # >= threshold &]; 
      aResScores = Join[aResScores, aResScoresNew]; 
      res = Keys[aResScores] 
     ]; 
     
     aResScores = TakeLargest[ReverseSort[aResScores], UpTo[n]]; 
     If[probsQ, aResScores, Keys[aResScores]] 
    ] /; n > 0;
BethlehemMandala[___] := $Failed

Making Graphs over System Dynamics Models

Introduction

In this document we give usage examples for the functions of the package, “SystemDynamicsModelGraph.m”, [AAp1]. The package provides functions for making dependency graphs for the stocks in System Dynamics (SD) models. The primary motivation for creating the functions in this package is to have the ability to introspect, proofread, and verify the (typical) ODE models made in SD.

A more detailed explanation is:

  • For a given SD system S of Ordinary Differential Equations (ODEs) we make Mathematica graph objects that represent the interaction of variable dependent functions in S.
  • Those graph objects give alternative (and hopefully convenient) way of visualizing the model of S.

Load packages

The following commands load the packages [AAp1, AAp2, AAp3]:

Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/WL/SystemDynamicsModelGraph.m"]
Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/Projects/Coronavirus-propagation-dynamics/WL/EpidemiologyModels.m"]
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/Misc/CallGraph.m"]

Usage examples

Equations

Here is a system of ODEs of a slightly modified SEIR model:

lsEqs = {Derivative[1][SP][t] == -((IP[t] SP[t] \[Beta][IP])/TP[t]) - SP[t] \[Mu][TP], Derivative[1][EP][t] == (IP[t] SP[t] \[Beta][IP])/TP[t] - EP[t] (1/aincp + \[Mu][TP]), Derivative[1][IP][t] == EP[t]/aincp - IP[t]/aip - IP[t] \[Mu][IP], Derivative[1][RP][t] == IP[t]/aip - RP[t] \[Mu][TP], TP[t] == Max[0, EP[t] + IP[t] + RP[t] + SP[t]]};
ResourceFunction["GridTableForm"][List /@ lsEqs, TableHeadings -> {"Equations"}]
01xbi9kqh0cfv

Model graph

Here is a graph of the dependencies between the populations:

ModelDependencyGraph[lsEqs, {EP, IP, RP, SP, TP}, t]
08d1a9tfgog31

When the second argument given to ModelDependencyGraph is Automatic the stocks in the equations are heuristically found with the function ModelHeuristicStocks:

ModelHeuristicStocks[lsEqs, t]

(*{EP, IP, RP, SP, TP}*)

Also, the function ModelDependencyGraph takes all options of Graph:

ModelDependencyGraph[lsEqs, Automatic, t, 
  GraphLayout -> "GravityEmbedding", VertexLabels -> "Name", VertexLabelStyle -> Directive[Red, Bold, 16], EdgeLabelStyle -> Directive[Blue, 16], ImageSize -> Large]
0nbr2tt4704fd

Dependencies only

The dependencies in the model can be found with the function ModelDependencyGraphEdges:

lsEdges = ModelDependencyGraphEdges[lsEqs, Automatic, t]
0oqkkrnakv89r
lsEdges[[4]] // FullForm
0x9s286b3noms

Focus stocks

Here is a graph for a set of “focus” stocks-sources to a set of “focus” stocks-destinations:

gr = ModelDependencyGraph[lsEqs, {IP, SP}, {EP}, t]
13di08vbzgsyi

Compare with the graph in which the argument positions of sources and destinations of the previous command are swapped:

ModelDependencyGraph[lsEqs, {EP}, {IP, SP}, t]
009we6s5tmxek

Additional interfacing

The functions of this package work with the models from the package “EpidemiologyModels.m”, [AAp2].

Here is a model from [AAp2]:

model = SEIRModel[t, "TotalPopulationRepresentation" -> "AlgebraicEquation"];
ModelGridTableForm[model]
0aojbcw5zogfw

Here we make the corresponding graph:

ModelDependencyGraph[model, t]
1v1sbwz9d6peq

Generating equations from graph specifications

A related, dual, or inverse task to the generation of graphs from systems of ODEs is the generation of system of ODEs from graphs.

Here is a model specifications through graph edges (using DirectedEdge):

0qp613dyiglzo

Here is the corresponding graph:

grModel = Graph[lsEdges, VertexLabels -> "Name", EdgeLabels -> "EdgeTag", ImageSize -> Large]
1vrnyvwpgmcz9

Here we generate the system of ODEs using the function ModelGraphEquations:

lsEqsGen = ModelGraphEquations[grModel, t];
ResourceFunction["GridTableForm"][List /@ lsEqsGen, TableHeadings -> {"Equations"}]
1dl7z5ohgof6h

Remark: ModelGraphEquations works with both graph and list of edges as a first argument.

Here we replace the symbolically represented rates with concrete values:

08ewn5gxhx8d5
1e4wq9d04yhro

Here we solve the system of ODEs:

sol = First@NDSolve[{lsEqsGen2, SP[0] == 99998, EP[0] == 0, IP[0] == 1, RP[0] == 0,MLP[0] == 0, TP[0] == 100000}, Union[First /@ lsEdges], {t, 0, 365}]
1p9civying0hn

Here we plot the results:

ListLinePlot[sol[[All, 2]], PlotLegends -> sol[[All, 1]]]
13xvmif6i6o2n

Call graph

The functionalities provided by the presented package “SystemDynamicsModelGraph.m”, [AAp1], resemble in spirit those of the package “CallGraph.m”, [AAp3].

Here is call graph for the functions in the package [AAp1] made with the function CallGraph from the package [AAp3]:

CallGraph`CallGraph[Context[ModelDependencyGraph], "PrivateContexts" -> False, "UsageTooltips" -> True]
0c1vpbf585pe9

References

Packages

[AAp1] Anton Antonov, “System Dynamics Model Graph Mathematica package”, (2020), SystemsModeling at GitHub/antononcube.

[AAp2] Anton Antonov, “Epidemiology models Mathematica package”, (2020), SystemsModeling at GitHub/antononcube.

[AAp3] Anton Antonov, “Call graph generation for context functions Mathematica package”, (2018), MathematicaForPrediction at GitHub/antononcube.

Articles

[AA1] Anton Antonov, “Call graph generation for context functions”, (2019), MathematicaForPrediction at WordPress.

Investigating COVID-19 with R: data analysis and simulations

Methodological presentation
R-Ladies Miami Meetup, May 28th 2020

The extended abstract of the presentation was loosely followed. Here is the presentation mind-map:

MainMindMap

(Note that mind-map’s PDF has hyperlinks. Also, see the folder Presentation-aids. )

The organizers and I did a poll for what people want to hear. After discussing the results of the 15 votes from that poll we decided the presentation to be a methodological one instead of a know-how one.

Approximately 30% of the presentation was based on the R-project “COVID-19-modeling-in-R”, [AA1].

Approximately 30% of the presentation was based on an R-programmed software monad for epidemiology compartmental models, ECMMon-R, [AAr2].

For the rest were used frameworks, simulations, and graphics made with Mathematica, [AAr1].

The presentation was given online (because of COVID-19) using Zoom. 90 people registered. Nearly 40 showed up (and maybe 20 stayed throughout.)

Here is a link to the video recording.

Screenshots

Here are screenshots of statistics used in the introduction:

References

Coronavirus

[Wk1] Wikipedia entry, Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

[Wk2] Wikipedia entry, Coronavirus disease 2019.

Modeling

[Wk3] Wikipedia entry, Compartmental models in epidemiology.

[Wk4] Wikipedia entry, System dynamics.

R code/software

[KS1] Karline Soetaert, Thomas Petzoldt, R. Woodrow Setzer, “deSolve: Solvers for Initial Value Problems of Differential Equations (‘ODE’, ‘DAE’, ‘DDE’)”, CRAN.

[AA1] Anton Antonov, “COVID-19-modeling-in-R”, 2020, SystemModeling at GitHub.

[AAr1] Anton Antonov, Coronavirus-propagation-dynamics, 2020, SystemModeling at GitHub.

[AAr2] Anton Antonov, Epidemiology Compartmental Modeling Monad in R, 2020, ECMMon-R at GitHub.

Apple mobility trends data visualization (for COVID-19)

Introduction

I this notebook/document we ingest and visualize the mobility trends data provided by Apple, [APPL1].

We take the following steps:

  1. Download the data

     

  2. Import the data and summarise it

  3. Transform the data into long form

  4. Partition the data into subsets that correspond to combinations of geographical regions and transportation types

  5. Make contingency matrices and corresponding heat-map plots

  6. Make nearest neighbors graphs over the contingency matrices and plot communities

  7. Plot the corresponding time series

Data description

From Apple’s page https://www.apple.com/covid19/mobility

About This Data The CSV file and charts on this site show a relative volume of directions requests per country/region or city compared to a baseline volume on January 13th, 2020. We define our day as midnight-to-midnight, Pacific time. Cities represent usage in greater metropolitan areas and are stably defined during this period. In many countries/regions and cities, relative volume has increased since January 13th, consistent with normal, seasonal usage of Apple Maps. Day of week effects are important to normalize as you use this data. Data that is sent from users’ devices to the Maps service is associated with random, rotating identifiers so Apple doesn’t have a profile of your movements and searches. Apple Maps has no demographic information about our users, so we can’t make any statements about the representativeness of our usage against the overall population.

Observations

The observations listed in this subsection are also placed under the relevant statistics in the following sections and indicated with “Observation”.

  • The directions request volumes reference date for normalization is 2020-01-13 : all the values in that column are 100.

     

  • From the community clusters of the nearest neighbor graphs (derived from the time series of the normalized driving directions requests volume) we see that countries and cities are clustered in expected ways. For example, in the community graph plot corresponding to “{city, driving}” the cities Oslo, Copenhagen, Helsinki, Stockholm, and Zurich are placed in the same cluster. In the graphs corresponding to “{city, transit}” and “{city, walking}” the Japanese cities Tokyo, Osaka, Nagoya, and Fukuoka are clustered together.

  • In the time series plots the Sundays are indicated with orange dashed lines. We can see that from Monday to Thursday people are more familiar with their trips than say on Fridays and Saturdays. We can also see that on Sundays people (on average) are more familiar with their trips or simply travel less.

Load packages

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/DataReshape.m"]
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/Misc/HeatmapPlot.m"]

Data ingestion

Apple mobile data was provided in this WWW page: https://www.apple.com/covid19/mobility , [APPL1]. (The data has to be download from that web page – there is an “agreement to terms”, etc.)

dsAppleMobility = ResourceFunction["ImportCSVToDataset"]["~/Downloads/applemobilitytrends-2021-01-15.csv"]
1po4mftcckaca

Observation: The directions requests volumes reference date for normalization is 2020-01-13 : all the values in that column are 100.

Data dimensions:

Dimensions[dsAppleMobility]

(*{4691, 375}*)

Data summary:

Magnify[ResourceFunction["RecordsSummary"][dsAppleMobility], 0.6]

Number of unique “country/region” values:

Length[Union[Normal[dsAppleMobility[Select[#["geo_type"] == "country/region" &], "region"]]]]

(*63*)

Number of unique “city” values:

Length[Union[Normal[dsAppleMobility[Select[#["geo_type"] == "city" &], "region"]]]]

(*295*)

All unique geo types:

lsGeoTypes = Union[Normal[dsAppleMobility[All, "geo_type"]]]

(*{"city", "country/region", "county", "sub-region"}*)

All unique transportation types:

lsTransportationTypes = Union[Normal[dsAppleMobility[All, "transportation_type"]]]

(*{"driving", "transit", "walking"}*)

Data transformation

It is better to have the data in long form (narrow form). For that I am using the package “DataReshape.m”, [AAp1].

(*lsIDColumnNames={"geo_type","region","transportation_type"};*) (*For the initial dataset of Apple's mobility data.*)
  lsIDColumnNames = {"geo_type", "region", "transportation_type", "alternative_name", "sub-region", "country"}; 
   dsAppleMobilityLongForm = ToLongForm[dsAppleMobility, lsIDColumnNames, Complement[Keys[dsAppleMobility[[1]]], lsIDColumnNames]]; 
   Dimensions[dsAppleMobilityLongForm]

(*{1730979, 8}*)

Remove the rows with “empty” values:

dsAppleMobilityLongForm = dsAppleMobilityLongForm[Select[#Value != "" &]];
Dimensions[dsAppleMobilityLongForm]

(*{1709416, 8}*)

Rename the column “Variable” to “Date” and add a related “DateObject” column:

AbsoluteTiming[
  dsAppleMobilityLongForm = dsAppleMobilityLongForm[All, Join[KeyDrop[#, "Variable"], <|"Date" -> #Variable, "DateObject" -> DateObject[#Variable]|>] &]; 
 ]

(*{714.062, Null}*)

Add “day name” (“day of the week”) field:

AbsoluteTiming[
  dsAppleMobilityLongForm = dsAppleMobilityLongForm[All, Join[#, <|"DayName" -> DateString[#DateObject, {"DayName"}]|>] &]; 
 ]

(*{498.026, Null}*)

Here is sample of the transformed data:

SeedRandom[3232];
RandomSample[dsAppleMobilityLongForm, 12]

Here is summary:

ResourceFunction["RecordsSummary"][dsAppleMobilityLongForm]

Partition the data into geo types × transportation types:

aQueries = Association@Flatten@Outer[Function[{gt, tt}, {gt, tt} -> dsAppleMobilityLongForm[Select[#["geo_type"] == gt && #["transportation_type"] == tt &]]], lsGeoTypes, lsTransportationTypes];
aQueries = Select[aQueries, Length[#] > 0 &];
Keys[aQueries]

(*{{"city", "driving"}, {"city", "transit"}, {"city", "walking"}, {"country/region", "driving"}, {"country/region", "transit"}, {"country/region", "walking"}, {"county", "driving"}, {"county", "transit"}, {"county", "walking"}, {"sub-region", "driving"}, {"sub-region", "transit"}, {"sub-region", "walking"}}*)

Basic data analysis

We consider relative volume o directions requests for the last date only. (The queries can easily adjusted for other dates.)

lastDate = Last@Sort@Normal@dsAppleMobilityLongForm[All, "Date"]

(*"2021-01-15"*)
aDayQueries = Association@Flatten@Outer[Function[{gt, tt}, {gt, tt} -> dsAppleMobilityLongForm[Select[#["geo_type"] == gt && #Date == lastDate && #["transportation_type"] == tt &]]], lsGeoTypes, lsTransportationTypes];
Dimensions /@ aDayQueries

(*<|{"city", "driving"} -> {299, 10}, {"city", "transit"} -> {197, 10}, {"city", "walking"} -> {294, 10}, {"country/region", "driving"} -> {63, 10}, {"country/region", "transit"} -> {27, 10}, {"country/region", "walking"} -> {63, 10}, {"county", "driving"} -> {2090, 10}, {"county", "transit"} -> {152, 10}, {"county", "walking"} -> {396, 10}, {"sub-region", "driving"} -> {557, 10}, {"sub-region", "transit"} -> {175, 10}, {"sub-region", "walking"} -> {339, 10}|>*)

Here we plot histograms and Pareto principle adherence:

opts = {PlotRange -> All, ImageSize -> Medium};
Grid[
    Function[{columnName}, 
      {Histogram[#, 12, PlotLabel -> columnName, opts], ResourceFunction["ParetoPrinciplePlot"][#, PlotLabel -> columnName, opts]} &@Normal[#[All, "Value"]] 
     ] /@ {"Value"}, 
    Dividers -> All, FrameStyle -> GrayLevel[0.7]] & /@ aDayQueries
1mdtonh8hp7bw

 

Heat-map plots

We can visualize the data using heat-map plots. Here we use the package “HeatmapPlot.m”, [AAp2].

Remark: Using the contingency matrices prepared for the heat-map plots we can do further analysis, like, finding correlations or nearest neighbors. (See below.)

Cross-tabulate dates with regions:

aMatDateRegion = ResourceFunction["CrossTabulate"][#[All, {"Date", "region", "Value"}], "Sparse" -> True] & /@ aQueries;

Make a heat-map plot by sorting the columns of the cross-tabulation matrix (that correspond to countries):

aHeatMapPlots = Association@KeyValueMap[#1 -> Rasterize[HeatmapPlot[#2, PlotLabel -> #1, DistanceFunction -> {None, EuclideanDistance}, AspectRatio -> 1/1.6, ImageSize -> 1600]] &, aMatDateRegion]

(We use Rasterize in order to reduce the size of the notebook.)

Here we take closer look to one of the plots:

aHeatMapPlots[{"country/region", "driving"}]

Nearest neighbors graphs

Graphs overview

Here we create nearest neighbor graphs of the contingency matrices computed above and plot cluster the nodes:

Manipulate[
  Multicolumn[Normal@Map[CommunityGraphPlot@Graph@EdgeList@NearestNeighborGraph[Normal[Transpose[#SparseMatrix]], nns, ImageSize -> Medium] &, aMatDateRegion], 2, Dividers -> All], 
  {{nns, 5, "Number of nearest neighbors:"}, 2, 30, 1, Appearance -> "Open"}, SaveDefinitions -> True]

Closer look into the graphs

Here we endow each nearest neighbors graph with appropriate vertex labels:

aNNGraphs = Map[(gr = NearestNeighborGraph[Normal[Transpose[#SparseMatrix]], 4, GraphLayout -> "SpringEmbedding", VertexLabels -> Thread[Rule[Normal[Transpose[#SparseMatrix]], #ColumnNames]]];Graph[EdgeList[gr], VertexLabels -> Thread[Rule[Normal[Transpose[#SparseMatrix]], #ColumnNames]]]) &, aMatDateRegion];

Here we plot the graphs with clusters:

ResourceFunction["GridTableForm"][List @@@ Normal[CommunityGraphPlot[#, ImageSize -> 800] & /@ aNNGraphs], TableHeadings -> {"region & transportation type", "communities of nearest neighbors graph"}, Background -> White, Dividers -> All]

Observation: From the community clusters of the nearest neighbor graphs (derived from the time series of the normalized driving directions requests volume) we see that countries and cities are clustered in expected ways. For example in the community graph plot corresponding to “{city, driving}” the cities Oslo, Copenhagen, Helsinki, Stockholm, and Zurich are placed in the same cluster. In the graphs corresponding to “{city, transit}” and “{city, walking}” the Japanese cities Tokyo, Osaka, Nagoya, and Fukuoka are clustered together.

Time series analysis

Time series

In this section for each date we sum all cases over the region-transportation pairs, make a time series, and plot them.

Remark: In the plots the Sundays are indicated with orange dashed lines.

Here we make the time series:

aTSDirReqByCountry = 
  Map[
   Function[{dfQuery}, 
    TimeSeries@(List @@@ Normal[GroupBy[Normal[dfQuery], #DateObject &, Total[#Value & /@ #] &]]) 
   ], 
   aQueries 
  ]

Here we plot them:

opts = {PlotTheme -> "Detailed", PlotRange -> All, AspectRatio -> 1/4,ImageSize -> Large};
Association@KeyValueMap[
   Function[{transpType, ts}, 
    transpType -> 
     DateListPlot[ts, GridLines -> {AbsoluteTime /@ Union[Normal[dsAppleMobilityLongForm[Select[#DayName == "Sunday" &], "DateObject"]]], Automatic}, GridLinesStyle -> {Directive[Orange, Dashed], Directive[Gray, Dotted]}, PlotLabel -> Capitalize[transpType], opts] 
   ], 
   aTSDirReqByCountry 
  ]

Observation: In the time series plots the Sundays are indicated with orange dashed lines. We can see that from Monday to Thursday people are more familiar with their trips than say on Fridays and Saturdays. We can also see that on Sundays people (on average) are more familiar with their trips or simply travel less.

“Forecast”

He we do “forecast” for code-workflow demonstration purposes – the forecasts should not be taken seriously.

Fit a time series model to the time series:

aTSModels = TimeSeriesModelFit /@ aTSDirReqByCountry
1v02kqhrfj7pk
1kp9msj22dd19

Plot data and forecast:

Map[DateListPlot[{#["TemporalData"], TimeSeriesForecast[#, {10}]}, opts, PlotLegends -> {"Data", "Forecast"}] &, aTSModels]
0axzczhqlntju

References

[APPL1] Apple Inc., Mobility Trends Reports, (2020), apple.com.

[AA1] Anton Antonov, “NY Times COVID-19 data visualization”, (2020), SystemModeling at GitHub.

[AAp1] Anton Antonov, Data reshaping Mathematica package, (2018), MathematicaForPrediciton at GitHub.

[AAp2] Anton Antonov, Heatmap plot Mathematica package, (2018), MathematicaForPrediciton at GitHub.

SEI2HR-Econ model with quarantine and supplies scenarios

Introduction

The epidemiology compartmental model, [Wk1], presented in this notebook – SEI2HR-Econ – deals with all three rectangles in this diagram:

ImageResize[Import["https://github.com/antononcube/SystemModeling/raw/master/Projects/Coronavirus-propagation-dynamics/Diagrams/Coronavirus-propagation-simple-dynamics.jpeg"], 900]
1nfurrmjzvnjk

“SEI2HR” stands for “Susceptible, Exposed, Infected two, Hospitalized, Recovered” (populations.) “Econ” stands for “Economic”.

In this notebook we also deal with both quarantine scenarios and medical supplies scenarios. In the notebook [AA4] we deal with quarantine scenarios over a simpler model, SEI2HR.

Remark: We consider the contagious disease propagation models as instances of the more general System Dynamics (SD) models. We use SD terminology in this notebook.

The models

SEI2R

The model SEI2R is introduced and explained in the notebook [AA2]. SEI2R differs from the classical SEIR model, [Wk1, HH1], with the following elements:

  1. Two separate infected populations: one is “severely symptomatic”, the other is “normally symptomatic”
  2. The monetary equivalent of lost productivity due to infected or died people is tracked

SEI2HR

For the formulation of SEI2HR we use a system of Differential Algebraic Equations (DAE’s). The package [AAp1] allows the use of a formulation that has just Ordinary Differential Equations (ODE’s).

Here are the unique features of SEI2HR:

  • People stocks
    • There are two types of infected populations: normally symptomatic and severely symptomatic.
    • There is a hospitalized population.
    • There is a deceased from infection population.
  • Hospital beds
    • Hospital beds are a limited resource that determines the number of hospitalized people.
    • Only severely symptomatic people are hospitalized according to the available hospital beds.
    • The hospital beds stock is not assumed constant, it has its own change rate.
  • Money stocks
    • The money from lost productivity is tracked.
    • The money for hospital services is tracked.

SEI2HR-Econ

SEI2HR-Econ adds the following features to SEI2HR:

  • Medical supplies
    • Medical supplies production is part of the model.
    • Medical supplies delivery is part of the model..
    • Medical supplies accumulation at hospitals is taken into account.
    • Medical supplies demand tracking.
  • Hospitalization
    • Severely symptomatic people are hospitalized according to two limited resources: hospital beds and medical supplies.
  • Money stocks
    • Money for medical supplies production is tracked.

SEI2HR-Econ’s place a development plan

This graph shows the “big picture” of the model development plan undertaken in [AAr1] and SEI2HR (discussed in this notebook) is in that graph:

0led86ishkred

Notebook structure

The rest of notebook has the following sequence of sections:

  • Package load section
  • SEI2HR-Econ structure in comparison of SEI2HR
  • Explanations of the equations of SEI2HR-Econ
  • Quarantine scenario modeling preparation
  • Medical supplies production and delivery scenario modeling preparation
  • Parameters and initial conditions setup
    • Populations, hospital beds, quarantine scenarios, medical supplies scenarios
  • Simulation solutions
  • Interactive interface
  • Sensitivity analysis

Load packages

The epidemiological models framework used in this notebook is implemented with the packages [AAp1-AAp4, AA3]; many of the plot functions are from the package [AAp5].

Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/Projects/Coronavirus-propagation-dynamics/WL/EpidemiologyModels.m"];
Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/Projects/Coronavirus-propagation-dynamics/WL/EpidemiologyModelModifications.m"];
Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/Projects/Coronavirus-propagation-dynamics/WL/EpidemiologyModelingVisualizationFunctions.m"];
Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/Projects/Coronavirus-propagation-dynamics/WL/EpidemiologyModelingSimulationFunctions.m"];
Import["https://raw.githubusercontent.com/antononcube/SystemModeling/master/WL/SystemDynamicsInteractiveInterfacesFunctions.m"];

SEI2HR-Econ extends SEI2HR

The model SEI2HR-Econ is an extension of the model SEI2HR, [AA4].

Here is SEI2HR:

reprTP = "AlgebraicEquation";
lsModelOpts = {"Tooltips" -> True, 
   TooltipStyle -> {Background -> Yellow, CellFrameColor -> Gray, 
     FontSize -> 20}};
modelReference = 
  SEI2HRModel[t, "InitialConditions" -> True, "RateRules" -> True, 
   "TotalPopulationRepresentation" -> reprTP];
ModelGridTableForm[modelReference, lsModelOpts]
123yrlk2zl1zu

Here is SEI2HR-Econ:

modelSEI2HREcon = 
  SEI2HREconModel[t, "InitialConditions" -> True, "RateRules" -> True,
    "TotalPopulationRepresentation" -> reprTP];
ModelGridTableForm[modelSEI2HREcon, lsModelOpts]
10vmc7pwzubaa

Here are the “differences” between the two models:

ModelGridTableForm@
 Merge[{modelSEI2HREcon, modelReference}, 
  If[AssociationQ[#[[1]]], KeyComplement[#], Complement @@ #] &]
036glal3iv3uy

Equations explanations

In this section we provide rationale for the equations of SEI2HR-Econ.

The equations for Susceptible, Exposed, Infected, Recovered populations of SEI2R are “standard” and explanations about them are found in [WK1, HH1]. For SEI2HR those equations change because of the stocks Hospitalized Population and Hospital Beds. For SEI2HR-Econ the SEI2HR equations change because of the stocks Medical Supplies, Medical Supplies Demand, and Hospital Medical Supplies.

The equations time unit is one day. The time horizon is one year. Since we target COVID-19, [Wk2, AA1], we do not consider births.

Remark: For convenient reading the equations in this section have tooltips for the involved stocks and rates.

Verbalization description of the model

We start with one infected (normally symptomatic) person, the rest of the people are susceptible. The infected people meet other people directly or get in contact with them indirectly. (Say, susceptible people touch things touched by infected.) For each susceptible person there is a probability to get the decease. The decease has an incubation period: before becoming infected the susceptible are (merely) exposed. The infected recover after a certain average infection period or die. A certain fraction of the infected become severely symptomatic. The severely symptomatic infected are hospitalized if there are enough hospital beds and enough medical supplies. The hospitalized severely infected have different death rate than the non-hospitalized ones. The number of hospital beds might change: hospitals are extended, new hospitals are build, or there are not enough medical personnel or supplies.

The different types of populations (infected, hospitalized, recovered, etc.) have their own consumption rates of medical supplies. The medical supplies are produced with a certain rate (units per day) and delivered after a certain delay period. The hospitals have their own storage for medical supplies. Medical supplies are delivered to the hospitals only, non-hospitalized people go to the medical supplies producer to buy supplies. The hospitals have precedence for the medical supplies: if the medical supplies are not enough for everyone, the hospital needs are covered first (as much as possible.)

The medical supplies producer has a certain storage capacity (for supplies.) The medical supplies delivery vehicles have a certain – generally speaking, smaller – capacity. The hospitals have a certain capacity to store medical supplies. It is assumed that both producer and hospitals have initial stocks of medical supplies. (Following a certain normal, general preparedness protocol.)

The combined demand from all populations for medical supplies is tracked (accumulated.) The deaths from infection are tracked (accumulated.) Money for medical supplies production, money for hospital services, and money from lost productivity are tracked (accumulated.)

The equations below give mathematical interpretation of the model description above.

Code for the equations

Each equation in this section are derived with code like this:

ModelGridTableForm[modelSEI2HREcon, lsModelOpts]["Equations"][[1, 
 EquationPosition[modelSEI2HREcon, RP] + 1, 2]]

and then the output cell is edited to be “DisplayFormula” and have CellLabel value corresponding to the stock of interest.

The infected and hospitalized populations

SEI2HR has two types of infected populations: a normally symptomatic one and a severely symptomatic one. A major assumption for SEI2HR is that only the severely symptomatic people are hospitalized. (That assumption is also reflected in the diagram in the introduction.)

Each of those three populations have their own contact rates and mortality rates.

Here are the contact rates from the SEI2HR-Econ dictionary

ColumnForm@
 Cases[Normal@modelSEI2HREcon["Rates"], 
  HoldPattern[\[Beta][_] -> _], \[Infinity]]
1nlv3w1o6bvs1

Here are the mortality rates from the SEI2HR-Econ dictionary

ColumnForm@
 Cases[Normal@modelSEI2HREcon["Rates"], 
  HoldPattern[\[Mu][_] -> _], \[Infinity]]
1ile9wky2yqy5

Remark: Below with “Infected Population” we mean both stocks Infected Normally Symptomatic Population (INSP) and Infected Severely Symptomatic Population (ISSP).

Total Population

In this notebook we consider a DAE’s formulation of SEI2HR-Econ. The stock Total Population has the following (obvious) algebraic equation:

0z7878rbvh3ut
0z7878rbvh3ut

Note that with Max we specified that the total population cannot be less than 0.

Remark: As mentioned in the introduction, the package [AAp1] allows for the use of non-algebraic formulation, without an equation for TP.

Susceptible Population

The stock Susceptible Population (SP) is decreased by (1) infections derived from stocks Infected Populations and Hospitalized Population (HP), and (2) morality cases derived with the typical mortality rate.

1gercwr5szizt
1gercwr5szizt

Because we hospitalize the severely infected people only instead of the term

09xmdu8xq7e0p
09xmdu8xq7e0p

we have the terms

0dkc6td55qv7p
0dkc6td55qv7p

The first term is for the infections derived from the hospitalized population. The second term for the infections derived from people who are infected severely symptomatic and not hospitalized.

Births term

Note that we do not consider in this notebook births, but the births term can be included in SP’s equation:

Block[{m = SEI2HREconModel[t, "BirthsTerm" -> True]},
 ModelGridTableForm[m]["Equations"][[1, EquationPosition[m, SP] + 1, 
  2]]
 ]
16thktzpo0dk9
16thktzpo0dk9

The births rate is the same as the death rate, but it can be programmatically changed. (See [AAp2].)

Exposed Population

The stock Exposed Population (EP) is increased by (1) infections derived from the stocks Infected Populations and Hospitalized Population, and (2) mortality cases derived with the typical mortality rate. EP is decreased by (1) the people who after a certain average incubation period (aincp) become ill, and (2) mortality cases derived with the typical mortality rate.

083x1htqp0kkx
083x1htqp0kkx

Infected Normally Symptomatic Population

INSP is increased by a fraction of the people who have been exposed. That fraction is derived with the parameter severely symptomatic population fraction (sspf). INSP is decreased by (1) the people who recover after a certain average infection period (aip), and (2) the normally symptomatic people who die from the disease.

01cfzpu1jv3o4
01cfzpu1jv3o4

Infected Severely Symptomatic Population

ISSP is increased by a fraction of the people who have been exposed. That fraction is corresponds to the parameter severely symptomatic population fraction (sspf). ISSP is decreased by (1) the people who recover after a certain average infection period (aip), (2) the hospitalized severely symptomatic people who die from the disease, and (3) the non-hospitalized severely symptomatic people who die from the disease.

0zhs2d6lt6gz8
0zhs2d6lt6gz8

Note that we do not assume that severely symptomatic people recover faster if they are hospitalized, only that they have a different death rate.

Hospitalized Population

The amount of people that can be hospitalized is determined by the available Hospital Beds (HB) – the stock Hospitalized Population (HP) is subject to a resource limitation by the stock HB.

The equation of the stock HP can be easily understood from the following dynamics description points:

  • If the number of hospitalized people is less that the number of hospital beds we hospitalize the new ISSP people.
  • The Available Hospital Beds (AHB) are determined by the minimum of (i) the non-occupied hospital beds, and (ii) the hospital medical supplies divided by the ISSP consumption rate.
  • If the new ISSP people are more than AHB the hospital takes as many as AHB.
  • Hospitalized people have the same average infection period (aip).
  • Hospitalized (severely symptomatic) people have their own mortality rate.

Here is the HP equation:

1c35mhvi92ffe
1c35mhvi92ffe

Note that although we know that in a given day some hospital beds are going to be freed they are not considered in the hospitalization plans for that day. Similarly, we know that new medical supplies are coming but we do not include them into AHB.

Recovered Population

The stock Recovered Population (RP) is increased by the recovered infected people and decreased by mortality cases derived with the typical mortality rate.

04y15eqwki71f
04y15eqwki71f

Deceased Infected Population

The stock Deceased Infected Population (DIP) accumulates the deaths of the people who are infected. Note that we utilize the different death rates for HP and ISSP.

1ikrv4a51h9xo
1ikrv4a51h9xo

Hospital Beds

The stock Hospital Beds (HB) can change with a rate that reflects the number of hospital beds change rate (nhbcr) per day. Generally speaking, using nhbcr we can capture scenarios, like, extending hospitals, building new hospitals, recruitment of new medical personnel, loss of medical personnel (due to infections.)

1is45mamhsayd
1is45mamhsayd

Hospital Medical Supplies

The Hospital Medical Supplies (HMS) are decreased according to the medical supplies consumption rate (mscr) of HP and increased by a Medical Supplies (MS) delivery term (to be described next.)

The MS delivery term is build with the following assumptions / postulates:

  • Every day the hospital attempts to order MS that correspond to HB multiplied by mscr.
  • The hospital has limited capacity of MS storage, \kappa [\text{HMS}].
  • The MS producer has limited capacity for delivery, \kappa [\text{MDS}].
  • The hospital demand for MS has precedence over the demands for the non-hospitalized populations.
  • Hence, if the MS producer has less stock of MS than the demand of the hospital then MS producer’s whole amount of MS goes to the hospital.
  • The supplies are delivered with some delay period: the medical supplies delivery period (msdp).

Here is the MS delivery term:

1uxby6ipa49dw
1uxby6ipa49dw

Here is the corresponding HMS equation:

0ri4ohdddjxle
0ri4ohdddjxle

Medical Supplies

The equation of the Medical Supplies (MS) stock is based on the following assumptions / postulates:

  • The non-hospitalized people go to the MS producer to buy supplies. (I.e. MS delivery is to the hospital only.)
  • The MS producer vehicles have certain capacity, \kappa [\text{MSD}].
  • The MS producer has a certain storage capacity (for MS stock.)
  • Each of the populations INSP, ISSP, and HP has its own specific medical supplies consumption rate (mscr). EP, RP, and TP have the same mscr.
  • The hospital has precedence in its MS order. I.e. the demand from the hospital is satisfied first, and then the demand of the rest of the populations.

Here is the MS delivery term described in the previous section:

1dy38wd13b0h6
1dy38wd13b0h6

Here is the MS formula with the MS delivery term replaced with “Dlvr”:

ModelGridTableForm[modelSEI2HREcon, "Tooltips" -> False][
   "Equations"][[1, EquationPosition[modelSEI2HREcon, MS] + 1, 2]] /. 
 dlvr -> Dlvr
1r0u6ie10kbi4
1r0u6ie10kbi4

We can see from that equation that MS is increased by medical supplies production rate (mspr) with measuring dimension number of units per day. The production is restricted by the storage capacity, \kappa [\text{MS}]:

(*Min[mspr[HB], -MS[t] + \[Kappa][MS]]*)

MS is decreased by the MS delivery term and the demand from the non-hospitalized populations. Because the hospital has precedence, we use this term form in the equation:

(*Min[-Dlvr + MS[t], "non-hospital demand"]*)

Here is the full MS equation:

0xc7vujwb2npw
0xc7vujwb2npw

Medical Supplies Demand

The stock Medical Supplies Demand (MSD) simply accumulates the MS demand derived from population stocks and their corresponding mscr:

0ut8smszrobx2
0ut8smszrobx2

Money for Hospital Services

The stock Money for Hospital Services (MHS) simply tracks expenses for hospitalized people. The parameter hospital services cost rate (hscr) with unit money per bed per day simply multiplies HP.

0evzoim1tjwsj
0evzoim1tjwsj

Money from Lost Productivity

The stock Money from Lost Productivity (MLP) simply tracks the work non-availability of the infected and died from infection people. The parameter lost productivity cost rate (lpcr) with unit money per person per day multiplies the total count of the infected and dead from infection.

1fhewr6izy46n
1fhewr6izy46n

Quarantine scenarios

In order to model quarantine scenarios we use piecewise constant functions for the contact rates \beta [\text{ISSP}] and \beta [\text{INSP}].

Remark: Other functions can be used, like, functions derived through some statistical fitting.

Here is an example plot :

Block[{func = \[Beta]*
    Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1]},
 Legended[
  Block[{\[Beta] = 0.56, qsd = 60, ql = 8*7, qcrf = 0.25},
   ListLinePlot[Table[func, {t, 0, 365}], PlotStyle -> "Detailed"]
   ], func]]
1nf41pq3u0gvg

To model quarantine with a piecewise constant function we use the following parameters:

  • \text{qsd} for quarantine’s start
  • \text{ql} for quarantines duration
  • \text{qcrf} for the effect on the quarantine on the contact rate

Medical supplies scenarios

We consider three main scenarios for the medical supplies:

  1. Constant production rate and consistent delivery (constant delivery period)
  2. Constant production rate and disrupted delivery
  3. Disrupted production and disrupted delivery

The disruptions are simulated with simple pulse functions – we want to see how the system being modeled reacts to singular, elementary disruption.

Here is an example plot of a disruption of delivery period plot :

Block[{func = 
   dbp*Piecewise[{{1, t < dds}, {dsf, dds <= t <= dds + ddl}}, 1]},
 Legended[
  Block[{dbp = 1, dds = 70, ddl = 7, dsf = 1.8},
   ListLinePlot[Table[func, {t, 0, 365}], PlotStyle -> "Detailed"]
   ], func]]
093okrtfqf4bj
093okrtfqf4bj

To model disruption of delivery with a piecewise constant function we use the following parameters:

  • \text{dbp} for the delivery base period
  • \text{dds} for delivery disruption start
  • \text{ddl} for delivery disruption duration
  • \text{dsf} for how much slower or faster the delivery period becomes

Parameters and actual simulation equations code

Here are the parameters we want to experiment with (or do calibration with):

lsFocusParams = {aincp, aip, sspf[SP], \[Beta][HP], qsd, ql, qcrf, 
   nhbcr[ISSP, INSP], nhbr[TP], mspr[HB]};

Here we set custom rates and initial conditions:

population = 10^6;
modelSEI2HREcon =
  SetRateRules[
   modelSEI2HREcon,
   <|
    TP[0] -> population,
    \[Beta][ISSP] -> 
     0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],
    \[Beta][INSP] -> 
     0.5*Piecewise[{{1, t < qsd}, {qcrf, qsd <= t <= qsd + ql}}, 1],
    qsd -> 60,
    ql -> 8*7,
    qcrf -> 0.25,
    \[Beta][HP] -> 0.01,
    \[Mu][ISSP] -> 0.035/aip,
    \[Mu][INSP] -> 0.01/aip,
    nhbr[TP] -> 3/1000,
    lpcr[ISSP, INSP] -> 1,
    hscr[ISSP, INSP] -> 1,
    msdp[HB] -> 
     dbp*Piecewise[{{1, t < dds}, {dsf, dds <= t <= dds + ddl}}, 1],
    dbp -> 1,
    dds -> 70,
    ddl -> 7,
    dsf -> 2
    |>
   ];

Remark: Note the piecewise functions for \beta [\text{ISSP}], \beta [\text{INSP}], and \text{msdp}[\text{HB}].

Here is the system of ODE’s we use to do parametrized simulations:

lsActualEquations = 
  ModelNDSolveEquations[modelSEI2HREcon, 
   KeyDrop[modelSEI2HREcon["RateRules"], lsFocusParams]];
ResourceFunction["GridTableForm"][List /@ lsActualEquations]
1gymf7m70fgyh
1gymf7m70fgyh
lsActualEquations = 
  ModelNDSolveEquations[modelSEI2HREcon, modelSEI2HREcon["RateRules"]];
ResourceFunction["GridTableForm"][List /@ lsActualEquations]
1avljj6kz4oeo
1avljj6kz4oeo

Simulation

Instead of using ParametricNDSolve as in [AA4] in this notebook we use ModelNDSolve and SetRateRules from the package [AAp4].

Different quarantine starts

Here we compute simulation solutions for a set of quarantine starts:

AbsoluteTiming[
 aVarSolutions =
   Association@
    Map[
     Function[{qsdVar},
      qsdVar -> 
       Association[
        ModelNDSolve[
          SetRateRules[
           modelSEI2HREcon, <|ql -> 56, qsd -> qsdVar|>], {t, 365}][[
         1]]]
      ],
     Range[40, 120, 20]
     ];
 ]

(*{0.366168, Null}*)

Here we plot the results for ISSP only:

SeedRandom[2532]
aVals = #[ISSP][Range[0, 365]] & /@ aVarSolutions;
ListLinePlot[
 KeyValueMap[
  Callout[Tooltip[#2, #1], #1, {If[#1 <= 70, 
      RandomInteger[{120, 200}], RandomInteger[{80, 110}]], Above}] &,
   aVals], PlotLegends -> 
  SwatchLegend[Keys[aVals], LegendLabel -> "Quarantine start"], 
 PlotRange -> All, ImageSize -> Large, 
 PlotLabel -> ISSP[t] /. modelSEI2HREcon["Stocks"]]
0idl7dzxetaox

Remark: We use the code in this section to do the computations in the section “Sensitivity Analysis”.

Interactive interface

Using the interface in this section we can interactively see the effects of changing parameters. (This interface is programmed without using parametrized NDSolve solutions in order to be have code that corresponds to the interface implementations in [AAr2].)

opts = {PlotRange -> All, PlotLegends -> None, 
   PlotTheme -> "Detailed", PerformanceGoal -> "Speed", 
   ImageSize -> 400};
lsPopulationKeys = {TP, SP, EP, ISSP, INSP, HP, RP, DIP, HB};
lsSuppliesKeys = {MS, MSD, HMS};
lsMoneyKeys = {MHS, MLP, MMSP};
Manipulate[
 DynamicModule[{modelLocal = modelSEI2HREcon, 
   aStocks = modelSEI2HREcon["Stocks"], aSolLocal = aParSol, 
   lsPopulationPlots, lsMoneyPlots, lsSuppliesPlots},
  
  modelLocal = 
   SetRateRules[
    modelLocal, <|aincp -> aincpM, aip -> aipM, 
     sspf[SP] -> sspfM, \[Beta][HP] -> crhpM, qsd -> qsdM, ql -> qlM, 
     qcrf -> qcrfM, nhbr[TP] -> nhbrM/1000, 
     nhbcr[ISSP, ISNP] -> nhbcrM, mspr[HB] -> msprM, 
     msdp[HB] -> msdpM|>];
  aSolLocal = Association[ModelNDSolve[modelLocal, {t, ndays}][[1]]];
  
  lsPopulationPlots =
   Quiet@ParametricSolutionsPlots[
     aStocks,
     KeyTake[aSolLocal, 
      Intersection[lsPopulationKeys, displayPopulationStocks]],
     None, ndays,
     "LogPlot" -> popLogPlotQ, "Together" -> popTogetherQ, 
     "Derivatives" -> popDerivativesQ, 
     "DerivativePrefix" -> "\[CapitalDelta]", opts, 
     Epilog -> {Gray, Dashed, 
       Line[{{qsdM, 0}, {qsdM, 1.5*population}}], 
       Line[{{qsdM + qlM, 0}, {qsdM + qlM, 1.5*population}}]}];
  
  lsSuppliesPlots =
   If[Length[
      KeyDrop[aSolLocal, Join[lsPopulationKeys, lsMoneyKeys]]] == 
     0, {},
    (*ELSE*)
    Quiet@ParametricSolutionsPlots[
      aStocks,
      KeyTake[KeyDrop[aSolLocal, Join[lsPopulationKeys, lsMoneyKeys]],
        displaySupplyStocks],
      None, ndays,
      "LogPlot" -> supplLogPlotQ, "Together" -> supplTogetherQ, 
      "Derivatives" -> supplDerivativesQ, 
      "DerivativePrefix" -> "\[CapitalDelta]", opts]
    ];
  
  lsMoneyPlots =
   Quiet@ParametricSolutionsPlots[
     aStocks,
     KeyTake[aSolLocal, Intersection[lsMoneyKeys, displayMoneyStocks]],
     None, ndays,
     "LogPlot" -> moneyLogPlotQ, "Together" -> moneyTogetherQ, 
     "Derivatives" -> moneyDerivativesQ, 
     "DerivativePrefix" -> "\[CapitalDelta]", opts];
  
  Multicolumn[Join[lsPopulationPlots, lsSuppliesPlots, lsMoneyPlots], 
   nPlotColumns, Dividers -> All, FrameStyle -> GrayLevel[0.8]],
  SaveDefinitions -> True
  ],
 {{ndays, 365, "Number of days"}, 1, 365, 1, Appearance -> {"Open"}},
 Delimiter,
 {{aincpM, 6., "Average incubation period (days)"}, 1, 60., 1, 
  Appearance -> {"Open"}},
 {{aipM, 21., "Average infectious period (days)"}, 1, 60., 1, 
  Appearance -> {"Open"}},
 {{sspfM, 0.2, "Severely symptomatic population fraction"}, 0, 1, 
  0.025, Appearance -> {"Open"}},
 {{crhpM, 0.1, "Contact rate of the hospitalized population"}, 0, 30, 
  0.1, Appearance -> {"Open"}},
 Delimiter,
 {{qsdM, 55, "Quarantine start days"}, 0, 365, 1, 
  Appearance -> {"Open"}},
 {{qlM, 8*7, "Quarantine length (in days)"}, 0, 120, 1, 
  Appearance -> {"Open"}},
 {{qcrfM, 0.25, "Quarantine contact rate fraction"}, 0, 1, 0.01, 
  Appearance -> {"Open"}},
 Delimiter,
 {{nhbrM, 2.9, "Number of hospital beds rate (per 1000 people)"}, 0, 
  100, 0.1, Appearance -> {"Open"}},
 {{nhbcrM, 0, "Number of hospital beds change rate"}, -0.5, 0.5, 
  0.001, Appearance -> {"Open"}},
 {{msprM, 200, "Medical supplies production rate"}, 0, 50000, 10, 
  Appearance -> {"Open"}},
 {{msdpM, 1.2, "Medical supplies delivery period"}, 0, 10, 0.1, 
  Appearance -> {"Open"}},
 Delimiter,
 {{displayPopulationStocks, lsPopulationKeys, 
   "Population stocks to display:"}, lsPopulationKeys, 
  ControlType -> TogglerBar},
 {{popTogetherQ, True, "Plot populations together"}, {False, True}},
 {{popDerivativesQ, False, "Plot populations derivatives"}, {False, 
   True}},
 {{popLogPlotQ, False, "LogPlot populations"}, {False, True}},
 Delimiter,
 {{displaySupplyStocks, lsSuppliesKeys, 
   "Supplies stocks to display:"}, lsSuppliesKeys, 
  ControlType -> TogglerBar},
 {{supplTogetherQ, True, "Plot supplies functions together"}, {False, 
   True}},
 {{supplDerivativesQ, False, 
   "Plot supplies functions derivatives"}, {False, True}},
 {{supplLogPlotQ, True, "LogPlot supplies functions"}, {False, 
   True}},
 Delimiter,
 {{displayMoneyStocks, lsMoneyKeys, "Money stocks to display:"}, 
  lsMoneyKeys, ControlType -> TogglerBar},
 {{moneyTogetherQ, True, "Plot money functions together"}, {False, 
   True}},
 {{moneyDerivativesQ, False, 
   "Plot money functions derivatives"}, {False, True}},
 {{moneyLogPlotQ, True, "LogPlot money functions"}, {False, True}},
 {{nPlotColumns, 1, "Number of plot columns"}, Range[5]},
 ControlPlacement -> Left, ContinuousAction -> False]
1nbd62e0zn26p

Sensitivity analysis

When making and using this kind of dynamics models it is important to see how the solutions react to changes of different parameters. For example, we should try to find answers to questions like “What ranges of which parameters bring dramatic changes into important stocks?”

Sensitivity Analysis (SA) is used to determine how sensitive is a SD model to changes of the parameters and to changes of model’s equations, [BC1]. More specifically, parameter sensitivity, which we apply below, allows us to see the changes of stocks dynamic behaviour for different sequences (and combinations) of parameter values.

Remark: This section to mirrors to a point the section with same name in [AA4], except in this notebook we are more interested in medical supplies related SA because quarantine related SA is done in [AA4].

Remark: SA shown below should be done for other stocks and rates. In order to keep this exposition short we focus on ISSP, DIP, and HP. Also, it is interesting to think in terms of “3D parameter sensitivity plots.” We also do such plots.

Evaluations by Area under the curve

For certain stocks we might be not just interested in their evolution in time but also in their cumulative values. I.e. we are interested in the so called Area Under the Curve (AUC) metric for those stocks.

There are three ways to calculate AUC for stocks of interest:

  1. Add aggregation equations in the system of ODE’s. (Similar to the stock DIP in SEI2HR.)
    • For example, in order to compute AUC for ISSP we can add to SEI2HR the equation:
(*aucISSP'[t] = ISSP[t]*)
- More details for such equation addition are given in [AA2].
  1. Apply NIntegrate over stocks solution functions.
  2. Apply Trapezoidal rule to stock solution function values over a certain time grid.

Below we use 1 and 3.

Variation of medical supplies delivery period

Here we calculate the solutions for a certain combination of capacities and rates:

AbsoluteTiming[
 aVarSolutions =
   Association@
    Map[
     Function[{msdpVar},
      model2 = SEI2HREconModel[t];
      model2 = 
       SetRateRules[
        model2, <|\[Kappa][MS] -> 10000, \[Kappa][HMS] -> 100, 
         mspr[HB] -> 100, msdp[HB] -> msdpVar|>];
      msdpVar -> Association[ModelNDSolve[model2, {t, 365}][[1]]]
      ],
     Union[Join[Range[0.2, 1, 0.2], Range[1, 3, 0.5]]]
     ];
 ]

(*{0.231634, Null}*)

As expected more frequent delivery results in fuller utilization of the non-occupied hospital beds:

SeedRandom[23532]
focusStock = HP;
aVals = #[focusStock][Range[0, 365]] & /@ aVarSolutions;
ListLinePlot[
 KeyValueMap[
  Callout[Tooltip[#2, #1], #1, {If[#1 < 1, RandomInteger[{120, 150}], 
      RandomInteger[{160, 260}]], Above}] &, aVals], 
 PlotLegends -> 
  SwatchLegend[Keys[aVals], 
   LegendLabel -> "Medical supplies\ndelivery period"], 
 PlotRange -> All, ImageSize -> Large, 
 PlotLabel -> focusStock[t] /. modelSEI2HREcon["Stocks"]]
0xacdbpxbv6yo

Here are the corresponding AUC values:

aAUCs = TrapezoidalRule[Transpose[{Range[0, 365], #}]] & /@ aVals;
ResourceFunction["GridTableForm"][aAUCs]
1a3d3me9wh6bh
BarChart[aAUCs, ChartLabels -> Keys[aAUCs], ColorFunction -> "Pastel",
  PlotLabel -> 
  Row[{focusStock[t] /. modelSEI2HREcon["Stocks"], Spacer[5], "AUC"}]]
1ncranm3ufw0h

Variation of medical supplies production rate

In order to demonstrate the effect of medical supplies production rate (mspr) it is beneficial to eliminate the hospital beds availability restriction – we assume that we have enough hospital beds for all infected severely symptomatic people.

Here we calculate the solutions for a certain combination of capacities and rates:

AbsoluteTiming[
 aVarSolutions =
   Association@
    Map[
     Function[{msprVar},
      model2 = SEI2HREconModel[t];
      model2 = 
       SetRateRules[
        model2, <|\[Kappa][MS] -> 100000, \[Kappa][HMS] -> 
          10000, \[Kappa][MSD] -> 1000, mspr[HB] -> msprVar, 
         msdp[HB] -> 1.5, mscr[ISSP] -> 0.2, mscr[TP] -> 0.001, 
         mscr[ISSP] -> 1, nhbr[TP] -> 100/1000|>];
      msprVar -> Association[ModelNDSolve[model2, {t, 365}][[1]]]
      ],
     {20, 60, 100, 200, 300, 1000, 10000}
     ];
 ]

(*{0.156794, Null}*)

Hospitalized Population

As expected we can see that with smaller production rates we get less hospitalized people:

SeedRandom[1232]
focusStock = HP;
aVals = #[focusStock][Range[0, 365]] & /@ aVarSolutions;
ListLinePlot[
 KeyValueMap[
  Callout[Tooltip[#2, #1], #1, {RandomInteger[{180, 240}], Above}] &, 
  aVals], PlotLegends -> 
  SwatchLegend[Keys[aVals], 
   LegendLabel -> "Medical supplies\nproduction rate"], 
 PlotRange -> All, ImageSize -> Large, 
 PlotLabel -> focusStock[t] /. modelSEI2HREcon["Stocks"]]
0vxim0nbznmmi

Here are the corresponding AUC values:

aAUCs = TrapezoidalRule[Transpose[{Range[0, 365], #}]] & /@ aVals;
ResourceFunction["GridTableForm"][aAUCs]
07nfkyx6s9mjo
BarChart[aAUCs, ChartLabels -> Keys[aAUCs], ColorFunction -> "Pastel",
  PlotLabel -> 
  Row[{focusStock[t] /. modelSEI2HREcon["Stocks"], Spacer[5], "AUC"}]]
08xdwgkj6ljeh

Medical Supplies

Here we plot the availability of MS at MS producer’s storage:

SeedRandom[821]
focusStock = MS;
aVals = #[MS][Range[0, 365]] & /@ aVarSolutions;
ListLinePlot[
 KeyValueMap[
  Callout[Tooltip[#2, #1], #1, {RandomInteger[{100, 160}], Above}] &, 
  aVals], PlotLegends -> 
  SwatchLegend[Keys[aVals], 
   LegendLabel -> "Medical supplies\nproduction rate"], 
 PlotRange -> All, ImageSize -> Large, 
 PlotLabel -> focusStock[t] /. modelSEI2HREcon["Stocks"]]
05il7vfu2msga

Here are the corresponding AUC values:

aAUCs = TrapezoidalRule[Transpose[{Range[0, 365], #}]] & /@ aVals;
ResourceFunction["GridTableForm"][aAUCs]
03yccn85qjh1a
BarChart[aAUCs, 
 ChartLabels -> Map[Rotate[ToString[#], \[Pi]/6] &, Keys[aAUCs]], 
 ColorFunction -> "Pastel", 
 PlotLabel -> 
  Row[{focusStock[t] /. modelSEI2HREcon["Stocks"], Spacer[5], "AUC"}],
  ImageSize -> Medium]
0z5dthm5iuod6

Variation of delivery disruption starts

Here we compute simulation solutions for a set of delivery disruption starts using disruption length of 7 days and disruption “slowing down” factor 2:

AbsoluteTiming[
 aVarSolutions =
   Association@
    Map[
     Function[{ddsVar},
      ddsVar -> 
       Association[
        ModelNDSolve[
          SetRateRules[
           modelSEI2HREcon, <|\[Kappa][MS] -> 100000, \[Kappa][HMS] ->
              1000, mspr[HB] -> 100, ql -> 56, qsd -> 60, 
            nhbr[TP] -> 3/1000, dbp -> 1, dds -> ddsVar, ddl -> 7, 
            dsf -> 2|>], {t, 365}][[1]]]
      ],
     Append[Range[40, 120, 20], 365]
     ];
 ]

(*{0.45243, Null}*)

Note, that disruption start at day 365 means no disruption. Also, we use three hospital beds per thousand people.

Here we plot the results for HP only:

SeedRandom[009]
focusStock = HP;
aVals = #[focusStock][Range[0, 365]] & /@ aVarSolutions;
ListLinePlot[
 KeyValueMap[
  Callout[Tooltip[#2, #1], #1, {RandomInteger[{60, 140}], Bottom}] &, 
  aVals], PlotLegends -> 
  SwatchLegend[Keys[aVals], 
   LegendLabel -> "Medical supplies\ndisruption start"], 
 PlotRange -> All, ImageSize -> Large, 
 PlotLabel -> focusStock[t] /. modelSEI2HREcon["Stocks"]]
1sdi97njh54gj
1sdi97njh54gj

Here are the corresponding AUC values:

aAUCs = TrapezoidalRule[Transpose[{Range[0, 365], #}]] & /@ aVals;
ResourceFunction["GridTableForm"][aAUCs]
0s1zcqcmvxbrs
0s1zcqcmvxbrs
BarChart[aAUCs, 
 ChartLabels -> Map[Rotate[ToString[#], \[Pi]/6] &, Keys[aAUCs]], 
 ColorFunction -> "Pastel", 
 PlotLabel -> 
  Row[{focusStock[t] /. modelSEI2HREcon["Stocks"], Spacer[5], "AUC"}],
  ImageSize -> Medium]
0vq0berxbjf9v

Combined variability of delivery start and disruption

Here we calculate the solutions for a set of combinations of delivery periods and delivery disruption starts:

AbsoluteTiming[
 aVarSolutions =
   Association@
    Map[
     Function[{par},
      model2 = modelSEI2HREcon;
      model2 = 
       SetRateRules[
        model2, <|\[Kappa][MS] -> 100000, \[Kappa][HMS] -> 10000, 
         mspr[HB] -> 1000, dbp -> par[[1]], dds -> par[[2]], ddl -> 7,
          dsf -> 4, nhbr[TP] -> 3/1000|>];
      par -> Association[ModelNDSolve[model2, {t, 365}][[1]]]
      ],
     Flatten[Outer[List, {0.5, 1, 1.5}, {60, 100, 365}], 1]
     ];
 ]

(*{0.759922, Null}*)

As expected more frequent, less disrupted delivery brings fuller utilization of the non-occupied hospital beds:

SeedRandom[3233]
focusStock = HP;
aVals = #[focusStock][Range[0, 365]] & /@ aVarSolutions;
ListLinePlot[
 KeyValueMap[
  Callout[Tooltip[#2, ToString[#1]], 
    ToString[#1], {RandomInteger[{60, 160}], Left}] &, aVals], 
 PlotLegends -> 
  SwatchLegend[ToString /@ Keys[aVals], 
   LegendLabel -> 
    "Medical supplies\ndelivery period & disruption start"], 
 PlotRange -> All, ImageSize -> Large, 
 PlotLabel -> focusStock[t] /. modelSEI2HREcon["Stocks"]]
1qqm9hcj40qmx
1qqm9hcj40qmx

Here are the corresponding AUC values:

aAUCs = TrapezoidalRule[Transpose[{Range[0, 365], #}]] & /@ aVals;
ResourceFunction["GridTableForm"][aAUCs]
19kpmr551tt1c
19kpmr551tt1c
BarChart[aAUCs, 
 ChartLabels -> Map[Rotate[ToString[#], \[Pi]/6] &, Keys[aAUCs]], 
 ColorFunction -> "Pastel", 
 PlotLabel -> 
  Row[{focusStock[t] /. modelSEI2HREcon["Stocks"], Spacer[5], "AUC"}],
  ImageSize -> Medium]
1ia6ik6cr7kg5
1ia6ik6cr7kg5
SeedRandom[3233]
focusStock = DIP;
aVals = #[focusStock][Range[0, 365]] & /@ aVarSolutions;
ListLinePlot[
 KeyValueMap[
  Callout[Tooltip[#2, ToString[#1]], 
    ToString[#1], {RandomInteger[{60, 160}], Left}] &, aVals], 
 PlotLegends -> 
  SwatchLegend[ToString /@ Keys[aVals], 
   LegendLabel -> 
    "Medical supplies\ndelivery period & disruption start"], 
 PlotRange -> All, ImageSize -> Large, 
 PlotLabel -> focusStock[t] /. modelSEI2HREcon["Stocks"]]
0xwdkry34kcfq
0xwdkry34kcfq
ResourceFunction["GridTableForm"][Last /@ aVals]
1u293ap7k53o5
1u293ap7k53o5
BarChart[Last /@ aVals, 
 ChartLabels -> Map[Rotate[ToString[#], \[Pi]/6] &, Keys[aAUCs]], 
 ColorFunction -> "Pastel", 
 PlotLabel -> "Deceased Population at day 365", ImageSize -> Medium]
0m5va3qv461nx
0m5va3qv461nx

References

Articles

[Wk1] Wikipedia entry, “Compartmental models in epidemiology”.

[Wl2] Wikipedia entry, “Coronavirus disease 2019”.

[HH1] Herbert W. Hethcote (2000). “The Mathematics of Infectious Diseases”. SIAM Review. 42 (4): 599–653. Bibcode:2000SIAMR..42..599H. doi:10.1137/s0036144500371907.

[BC1] Lucia Breierova, Mark Choudhari, An Introduction to Sensitivity Analysis, (1996), Massachusetts Institute of Technology.

[AA1] Anton Antonov, “Coronavirus propagation modeling considerations”, (2020), SystemModeling at GitHub.

[AA2] Anton Antonov, “Basic experiments workflow for simple epidemiological models”, (2020), SystemModeling at GitHub.

[AA3] Anton Antonov, “Scaling of Epidemiology Models with Multi-site Compartments”, (2020), SystemModeling at GitHub.

[AA4] Anton Antonov, “SEI2HR model with quarantine scenarios”, (2020), SystemModeling at GitHub.

Repositories, packages

[WRI1] Wolfram Research, Inc., “Epidemic Data for Novel Coronavirus COVID-19”, WolframCloud.

[AAr1] Anton Antonov, Coronavirus propagation dynamics project, (2020), SystemModeling at GitHub.

[AAr2] Anton Antonov, “Epidemiology Compartmental Modeling Monad in R”, (2020), ECMon-R at GitHub.

[AAp1] Anton Antonov, “Epidemiology models Mathematica package”, (2020), SystemModeling at GitHub.

[AAp2] Anton Antonov, “Epidemiology models modifications Mathematica package”, (2020), SystemModeling at GitHub.

[AAp3] Anton Antonov, “Epidemiology modeling visualization functions Mathematica package”, (2020), SystemModeling at GitHub.

[AAp4] Anton Antonov, “Epidemiology modeling simulation functions Mathematica package”, (2020), SystemModeling at GitHub.

[AAp5] Anton Antonov, “System dynamics interactive interfaces functions Mathematica package”, (2020), SystemsModeling at GitHub.

SEI2HR model with quarantine scenarios

Introduction

The epidemiology compartmental model, [Wk1], presented in this notebook – SEI2HR – deals with the left-most and middle rectangles in this diagram:

0y1f7ckbmwf5i
0y1f7ckbmwf5i

“SEI2HR” stands for “Susceptible, Exposed, Infected two, Hospitalized, Recovered” (populations.)

In this notebook we also deal with quarantine scenarios.

Remark: We consider the contagious disease propagation models as instances of the more general System Dynamics (SD) models. We use SD terminology in this notebook.

The models

SEI2R

The model SEI2R are introduced and explained in the notebook [AA2]. SEI2R differs from the classical SEIR model , [Wk1, HH1], with the following elements:

  1. Two separate infected populations: one is “severely symptomatic”, the other is “normally symptomatic”
  2. The monetary equivalent of lost productivity due to infected or died people is tracked.

SEI2HR

For the formulation of SEI2HR we use a system of Differential Algebraic Equations (DAE’s). The package [AAp1] allows the use of a formulation that has just Ordinary Differential Equations (ODE’s).

Here are the unique features of SEI2HR:

  • People stocks
    • Two types of infected populations: normally symptomatic and severely symptomatic
    • Hospitalized population
    • Deceased from infection population
  • Hospital beds
    • Hospital beds are a limited resource that determines the number of hospitalized people
    • Only severely symptomatic people are hospitalized according to the available hospital beds
    • The hospital beds stock is not assumed constant, it has its own change rate.
  • Money stocks
    • The money from lost productivity are tracked
    • The money for hospital services are tracked

SEI2HR’s place a development plan

This graph shows the “big picture” of the model development plan undertaken in [AAr1] and SEI2HR (discussed in this notebook) is in that graph:

1cg8j8gglqwi1
1cg8j8gglqwi1

Notebook structure

The rest of notebook has the following sequence of sections:

  • Package load section
  • SEI2HR structure in comparison of SEI2R
  • Explanations of the equations of SEI2HR
  • Quarantine scenario modeling preparation
  • Parameters and initial conditions setup
    • Populations, hospital beds, quarantine scenarios
  • Parametric simulation solution
  • Interactive interface
  • Sensitivity analysis

Load packages

The epidemiological models framework used in this notebook is implemented with the packages [AAp1, AAp2, AA3]; many of the plot functions are from the package [AAp4].

SEI2HR extends SEI2R

The model SEI2HR is an extension of the model SEI2R, [AA2].

Here is SEI2R:

0een3ni1vr18e
0een3ni1vr18e

Here is SEI2HR:

0dsi0svozimcl
0dsi0svozimcl

Here are the “differences” between the two models:

0h6b6x7di5e05
0h6b6x7di5e05

Equations explanations

In this section we provide rationale of the model equations of SEI2HR.

The equations for Susceptible, Exposed, Infected, Recovered populations of SEI2R are “standard” and explanations about them are found in [WK1, HH1]. For SEI2HR those equations change because of the stocks Hospitalized Population and Hospital Beds.

The equations time unit is one day. The time horizon is one year. Since we target COVID-19, [Wk2, AA1], we do not consider births.

Remark: For convenient reading the equations in this section have tooltips for the involved stocks and rates.

Verbalization description of the model

We start with one infected (normally symptomatic) person, the rest of the people are susceptible. The infected people meet other people directly or get in contact with them indirectly. (Say, susceptible people touch things touched by infected.) For each susceptible person there is a probability to get the decease. The decease has an incubation period: before becoming infected the susceptible are (merely) exposed. The infected recover after a certain average infection period or die. A certain fraction of the infected become severely symptomatic. If there are enough hospital beds the severely symptomatic infected are hospitalized. The hospitalized severely infected have different death rate than the non-hospitalized ones. The number of hospital beds might change: hospitals are extended, new hospitals are build, or there are not enough medical personnel or supplies. The deaths from infection are tracked (accumulated.) Money for hospital services and money from lost productivity are tracked (accumulated.)

The equations below give mathematical interpretation of the model description above.

Code for the equations

Each equation in this section are derived with code like this:

and then the output cell is edited to be “DisplayFormula” and have CellLabel value corresponding to the stock of interest.

The infected and hospitalized populations

SEI2HR has two types of infected populations: a normally symptomatic one and a severely symptomatic one. A major assumption for SEI2HR is that only the severely symptomatic people are hospitalized. (That assumption is also reflected in the diagram in the introduction.)

Each of those three populations have their own contact rates and mortality rates.

Here are the contact rates from the SEI2HR dictionary

0ooy1eklv8ies
0ooy1eklv8ies

Here are the mortality rates from the SEI2HR dictionary

0kskzm0nsvjo3
0kskzm0nsvjo3

Remark: Below with “Infected Population” we mean both stocks Infected Normally Symptomatic Population (INSP) and Infected Severely Symptomatic Population (ISSP).

Total Population

In this notebook we consider a DAE’s formulation of SEI2HR. The stock Total Population has the following (obvious) algebraic equation:

0peb290e1lx2d
0peb290e1lx2d

Note that with Max we specified that the total population cannot be less than 0.

Remark: As mentioned in the introduction, the package [AAp1] allows for the use of non-algebraic formulation, without an equation for TP.

Susceptible Population

The stock Susceptible Population (SP) is decreased by (1) infections derived from stocks Infected Populations and Hospitalized Population (HP), and (2) morality cases derived with the typical mortality rate.

0h1zwgod2xs4g
0h1zwgod2xs4g

Because we hospitalize the severely infected people only instead of the term

09xmdu8xq7e0p
09xmdu8xq7e0p

we have the terms

0dkc6td55qv7p
0dkc6td55qv7p

The first term is for the infections derived from the hospitalized population. The second term for the infections derived from people who are infected severely symptomatic and not hospitalized.

Births term

Note that we do not consider in this notebook births, but the births term can be included in SP’s equation:

0ny25blzp2rub
0ny25blzp2rub

The births rate is the same as the death rate, but it can be programmatically changed. (See [AAp2].)

Exposed Population

The stock Exposed Population (EP) is increased by (1) infections derived from the stocks Infected Populations and Hospitalized Population, and (2) mortality cases derived with the typical mortality rate. EP is decreased by (1) the people who after a certain average incubation period (aincp) become ill, and (2) mortality cases derived with the typical mortality rate.

0mrj0gt13ppsn
0mrj0gt13ppsn

Infected Normally Symptomatic Population

INSP is increased by a fraction of the people who have been exposed. That fraction is derived with the parameter severely symptomatic population fraction (sspf). INSP is decreased by (1) the people who recover after a certain average infection period (aip), and (2) the normally symptomatic people who die from the disease.

0hy96yb7pg654
0hy96yb7pg654

Infected Severely Symptomatic Population

ISSP is increased by a fraction of the people who have been exposed. That fraction is corresponds to the parameter severely symptomatic population fraction (sspf). ISSP is decreased by (1) the people who recover after a certain average infection period (aip), (2) the hospitalized severely symptomatic people who die from the disease, and (3) the non-hospitalized severely symptomatic people who die from the disease.

19hml5ravi0a8
19hml5ravi0a8

Note that we do not assume that severely symptomatic people recover faster if they are hospitalized, only that they have a different death rate.

Hospitalized Population

The amount of people that can be hospitalized is determined by the available Hospital Beds (HB) – the stock Hospitalized Population (HP) is subject to a resource limitation by the stock HB.

The equation of the stock HP can be easily understood from the following dynamics description points:

  • If the number of hospitalized people is less that the number of hospital beds we hospitalize the new ISSP people.
  • If the new ISSP people are more than the Available Hospital Beds (AHB) we take as many as AHB.
  • Hospitalized people have the same average infection period (aip).
  • Hospitalized (severely symptomatic) people have their own mortality rate.
0ippnyp2hf90k
0ippnyp2hf90k

Note that although we know that in a given day some hospital beds are going to be freed they are not considered in the hospitalization plans for that day.

Recovered Population

The stock Recovered Population (RP) is increased by the recovered infected people and decreased by mortality cases derived with the typical mortality rate.

042iy3oy8o454
042iy3oy8o454

Deceased Infected Population

The stock Deceased Infected Population (DIP) accumulates the deaths of the people who are infected. Note that we utilize the different death rates for HP and ISSP.

1glekhgj1v0e7
1glekhgj1v0e7

Hospital Beds

The stock Hospital Beds (HB) can change with a rate that reflects the number of hospital beds change rate (nhbcr) per day. Generally speaking, using nhbcr we can capture scenarios, like, extending hospitals, building new hospitals, recruitment of new medical personnel, loss of medical personnel (due to infections.)

1is45mamhsayd
1is45mamhsayd

Money for Hospital Services

The stock Money for Hospital Services (MHS) simply tracks expenses for hospitalized people. The parameter hospital services cost rate (hscr) with unit money per bed per day simply multiplies HP.

0evzoim1tjwsj
0evzoim1tjwsj

Money from Lost Productivity

The stock Money from Lost Productivity (MLP) simply tracks the work non-availability of the infected and died from infection people. The parameter lost productivity cost rate (lpcr) with unit money per person per day multiplies the total count of the infected and dead from infection.

1133o7bm975sr
1133o7bm975sr

Quarantine scenarios

In order to model quarantine scenarios we use piecewise constant functions for the contact rates \beta [\text{ISSP}] and \beta [\text{INSP}].

Remark: Other functions can be used, like, functions derived through some statistical fitting.

Here is an example plot :

0gtvprtffih1o
0gtvprtffih1o

To model quarantine with a piecewise constant function we use the following parameters:

  • \text{qsd} for quarantine’s start
  • \text{ql} for quarantines duration
  • \text{qcrf} for the effect on the quarantine on the contact rate

Parameters and actual simulation equations code

Here are the parameters we want to experiment with (or do calibration with):

Here we set custom rates and initial conditions:

Remark: Note the piecewise functions for \beta [\text{ISSP}] and \beta [\text{INSP}].

Here is the system of ODE’s we use to do parametrized simulations:

0i35a50x9mdtq
0i35a50x9mdtq

Simulation

Solutions

Straightforward simulation for one year using ParametricNDSolve :

Example evaluation

Here are the parameters of a stock solution:

Here we replace the parameters with concrete rate values (kept in the model object):

0m3ccwuzjtg7d
0m3ccwuzjtg7d

Here is an example evaluation of a solution using the parameter values above:

Interactive interface

Using the interface in this section we can interactively see the effects of changing the focus parameters.

opts = {PlotRange -> All, PlotLegends -> None, 
   PlotTheme -> "Detailed", PerformanceGoal -> "Speed", 
   ImageSize -> 400};
lsPopulationKeys = {TP, SP, EP, ISSP, INSP, HP, RP, DIP, HB};
lsEconKeys = {MHS, MLP};
Manipulate[
 DynamicModule[{lsPopulationPlots, lsEconPlots, lsRestPlots},
  
  lsPopulationPlots =
   ParametricSolutionsPlots[
    modelSEI2HR["Stocks"],
    KeyTake[aSol, Intersection[lsPopulationKeys, displayStocks]],
    {aincp, aip, spf, crhp, qsd, ql, qcrf, nhbcr, nhbr/1000}, ndays,
    "LogPlot" -> popLogPlotQ, "Together" -> popTogetherQ, 
    "Derivatives" -> popDerivativesQ, 
    "DerivativePrefix" -> "\[CapitalDelta]", opts];
  
  lsEconPlots =
   ParametricSolutionsPlots[
    modelSEI2HR["Stocks"],
    KeyTake[aSol, Intersection[lsEconKeys, displayStocks]],
    {aincp, aip, spf, crhp, qsd, ql, qcrf, nhbcr, nhbr/1000}, ndays,
    "LogPlot" -> econLogPlotQ, "Together" -> econTogetherQ, 
    "Derivatives" -> econDerivativesQ, 
    "DerivativePrefix" -> "\[CapitalDelta]", opts];
  
  lsRestPlots =
   If[Length[KeyDrop[aSol, Join[lsPopulationKeys, lsEconKeys]]] == 
     0, {},
    (*ELSE*)
    ParametricSolutionsPlots[
     modelSEI2HR["Stocks"],
     KeyTake[KeyDrop[aSol, Join[lsPopulationKeys, lsEconKeys]], 
      displayStocks],
     {aincp, aip, spf, crhp, qsd, ql, qcrf, nhbcr, nhbr/1000}, ndays,
     "LogPlot" -> econLogPlotQ, "Together" -> econTogetherQ, 
     "Derivatives" -> econDerivativesQ, 
     "DerivativePrefix" -> "\[CapitalDelta]", opts]
    ];
  
  Multicolumn[Join[lsPopulationPlots, lsEconPlots, lsRestPlots], 
   nPlotColumns, Dividers -> All, FrameStyle -> GrayLevel[0.8]]
  ],
 {{displayStocks, Join[lsPopulationKeys, lsEconKeys], 
   "Stocks to display:"}, Join[lsPopulationKeys, lsEconKeys], 
  ControlType -> TogglerBar},
 {{aincp, 6., "Average incubation period (days)"}, 1, 60., 1, 
  Appearance -> {"Open"}},
 {{aip, 21., "Average infectious period (days)"}, 1, 60., 1, 
  Appearance -> {"Open"}},
 {{spf, 0.2, "Severely symptomatic population fraction"}, 0, 1, 0.025,
   Appearance -> {"Open"}},
 {{qsd, 65, "Quarantine start days"}, 0, 365, 0.01, 
  Appearance -> {"Open"}},
 {{ql, 8*7, "Quarantine length (in days)"}, 0, 120, 1, 
  Appearance -> {"Open"}},
 {{qcrf, 0.25, "Quarantine contact rate fraction"}, 0, 1, 0.01, 
  Appearance -> {"Open"}},
 {{crhp, 0.1, "Contact rate of the hospitalized population"}, 0, 30, 
  0.1, Appearance -> {"Open"}},
 {{nhbcr, 0, "Number of hospital beds change rate"}, -0.5, 0.5, 0.001,
   Appearance -> {"Open"}},
 {{nhbr, 2.9, "Number of hospital beds rate (per 1000 people)"}, 0, 
  100, 0.1, Appearance -> {"Open"}},
 {{ndays, 365, "Number of days"}, 1, 365, 1, Appearance -> {"Open"}},
 {{popTogetherQ, True, "Plot populations together"}, {False, True}},
 {{popDerivativesQ, False, "Plot populations derivatives"}, {False, 
   True}},
 {{popLogPlotQ, False, "LogPlot populations"}, {False, True}},
 {{econTogetherQ, True, "Plot economics functions together"}, {False, 
   True}},
 {{econDerivativesQ, False, 
   "Plot economics functions derivatives"}, {False, True}},
 {{econLogPlotQ, True, "LogPlot economics functions"}, {False, 
   True}},
 {{nPlotColumns, 1, "Number of plot columns"}, Range[5]},
 ControlPlacement -> Left, ContinuousAction -> False]
1su0ilr7pjpgz
1su0ilr7pjpgz

Sensitivity analysis

When making and using this kind of dynamics models it is important to see how the solutions react to changes of different parameters. For example, we should try to find answers to questions like “What ranges of which parameters bring dramatic changes into important stocks?”

Sensitivity analysis is used to determine how sensitive is a SD model to changes of the parameters and to changes of model’s equations, [BC1]. More specifically, parameter sensitivity, which we apply below, allows us to see the changes of stocks dynamic behaviour for different sequences (and combinations) of parameter values.

Remark: The sensitivity analysis shown below should be done for other stocks and rates. In order to keep this exposition short we focus on ISSP, DIP, and HP.

It is interesting to think in terms of “3D parameter sensitivity plots.” We also do such plots.

Evaluations by Area under the curve

For certain stocks we might be not just interested in their evolution in time but also in their cumulative values. I.e. we are interested in the so called Area Under the Curve (AUC) metric for those stocks.

There are three ways to calculate AUC for stocks of interest:

  1. Add aggregation equations in the system of ODE’s. (Similar to the stock DIP in SEI2HR.)
    • For example, in order to compute AUC for ISSP we can add to SEI2HR the equation:
(*aucISSP'[t] = ISSP[t]*)
- More details for such equation addition are given in [AA2].
  1. Apply NIntegrate over stocks solution functions.
  2. Apply Trapezoidal rule to stock solution function values over a certain time grid.

Below we use 1 and 3.

Remark: The AUC measure for a stock is indicated with the prefix “∫”. For example AUC for ISSP is marked with “∫ISSP”.

Ranges

Below we use the following sets of quarantine starts and quarantine durations.

Note that putting the quarantine start to be at day 365 means “no quarantine.”

Number of infected people

Quarantine starts sensitivity

13ewmnrwebn37
13ewmnrwebn37

Note that the plots and tabulated differences with “no quarantine” indicate that there is a very narrow range to choose an effective quarantine start.

Quarantine duration sensitivity

0jgocyy3n73g1
0jgocyy3n73g1

Number of deceased people

Quarantine starts sensitivity

0upvp860txpwb
0upvp860txpwb

Number of hospitalized people

Quarantine starts sensitivity

0jtq60dibh8ns
0jtq60dibh8ns

Infected Severely Symptomatic Population stock integral with respect to quarantine start and length

In this section the 3D plot of AUC of ISSP is calculated using Trapezoidal rule.

1pm82rs4erlsy
1pm82rs4erlsy

Deceased Infected Population stock with respect to quarantine start and length

We can see from SEI2HR’s equations that DIP is already an AUC type of value. We can just plot the DIP values at the time horizon (one year.)

1mtkajep4247q
1mtkajep4247q

Hospitalized Population stock integral with respect to quarantine start and length

In this section the 3D plot of AUC of HP is calculated using Trapezoidal rule.

16mnmktwxtv14
16mnmktwxtv14

References

Articles

[Wk1] Wikipedia entry, “Compartmental models in epidemiology”.

[Wl2] Wikipedia entry, “Coronavirus disease 2019”.

[HH1] Herbert W. Hethcote (2000). “The Mathematics of Infectious Diseases”. SIAM Review. 42 (4): 599–653. Bibcode:2000SIAMR..42..599H. doi:10.1137/s0036144500371907.

[BC1] Lucia Breierova, Mark Choudhari, An Introduction to Sensitivity Analysis, (1996), Massachusetts Institute of Technology.

[AA1] Anton Antonov, “Coronavirus propagation modeling considerations”, (2020), SystemModeling at GitHub.

[AA2] Anton Antonov, “Basic experiments workflow for simple epidemiological models”, (2020), SystemModeling at GitHub.

[AA3] Anton Antonov, “Scaling of Epidemiology Models with Multi-site Compartments”, (2020), SystemModeling at GitHub.

Repositories, packages

[WRI1] Wolfram Research, Inc., “Epidemic Data for Novel Coronavirus COVID-19”, WolframCloud.

[AAr1] Anton Antonov, Coronavirus propagation dynamics project, (2020), SystemModeling at GitHub.

[AAp1] Anton Antonov, “Epidemiology models Mathematica package”, (2020), SystemsModeling at GitHub.

[AAp2] Anton Antonov, “Epidemiology models modifications Mathematica package”, (2020), SystemsModeling at GitHub.

[AAp3] Anton Antonov, “Epidemiology modeling visualization functions Mathematica package”, (2020), SystemsModeling at GitHub.

[AAp4] Anton Antonov, “System dynamics interactive interfaces functions Mathematica package”, (2020), SystemsModeling at GitHub.

Project management files

[AAo1] Anton Antonov, WirVsVirus-Hackathon-work-plan.org, (2020), SystemsModeling at GitHub.

[AAo2] Anton Antonov, WirVsVirus-hackathon-Geo-spatial-temporal-model-mind-map, (2020), SystemsModeling at GitHub.

WirVsVirus 2020 hackathon participation

Introduction

Last weekend – 2020-03-20 ÷ 2020-03-22 – I participated in the (Germany-centric) hackathon WirVsVirus. (I friend of mine who lives in Germany asked me to team up and sign up.)

Our idea proposal was accepted, listed in the dedicated overview table (see item 806). The title of our hackathon project is:

“Geo-spatial-temporal Economic Model for COVID-19 Propagation and Management in Germany”

Nearly a dozen of people enlisted to help. (We communicated through Slack.)

13dzfagts8105

Multiple people helped with the discussion of ideas, directions where to find data, with actual data gathering, and related documented analysis. Of course, just discussing the proposed solutions was already a great help!

What was accomplished

Work plans

The following mind-map reflects pretty well what was planned and done:

15n5cjaej10q8

There is also a related org-mode file with the work plan.

Data

I obtained Germany city data with Mathematica’s build-in functions and used it to heuristically derive a traveling patterns graph, [AA1].

Here is the data:

Here is Geo-histogram of that data:

0t08vw1kjdzbc

We considered a fair amount of other data. But because of the time limitations of the hackathon we had to use only the one above.

Single-site models

During the development phase I used the model SEI2R, but since we wanted to have a “geo-spatial-temporal epidemiological economics model” I productized the implementation of SEI2HR-Econ, [AAp1].

Here are the stocks, rates, and equations of SEI2HR-Econ:

0tbp6de6zdez0

Multi-site SEI2R (SEI2HR-Econ) over a hexagonal grid graph

I managed to follow through with a large part of the work plan for the hackathon and make multi-site scaled model that “follows the money”, [AA1]. Here is a diagram that shows the travelling patterns graph and solutions at one of the nodes:

1vnygv6t7chgg

Here is (a snapshot of) an interactive interface for studying and investigating the solution results:

1pgmngb4uyuzb

For more details see the notebook [AA1]. Different parameters can be set in the “Parameters” section. Especially of interest are the quarantine related parameters: start, duration, effect on contact rates and traffic patterns.

I also put in that notebook code for exporting simulations results and programmed visualization routines in R, [AA2]. (In order other members of team to be able to explore the results.)

References

[DP1] 47_wirtschaftliche Auswirkung_Geo-spatial-temp-econ-modell, DevPost.

[WRI1] Wolfram Research, Inc., Germany city data records, (2020), SystemModeling at GitHub.

[AA1] Anton Antonov, “WirVsVirus hackathon multi-site SEI2R over a hexagonal grid graph”, (2020), SystemModeling at GitHub.

[AA2] Anton Antonov, “WirVsVirus-Hackathon in R”, (2020), SystemModeling at GitHub.

[AAp1] Anton Antonov, “Epidemiology models Mathematica package”, (2020), SystemsModeling at GitHub.

Scaling of epidemiology models with multi-site compartments

Version 1.0

Introduction

In this notebook we describe and exemplify an algorithm that allows the specification and execution geo-spatial-temporal simulations of infectious disease spreads. (Concrete implementations and examples are given.)

The assumptions of the typical compartmental epidemiological models do not apply for countries or cities that are non-uniformly populated. (For example, China, USA, Russia.) There is a need to derive epidemiological models that take into account the non-uniform distribution of populations and related traveling patterns within the area of interest.

Here is a visual aid (made with a random graph over the 30 largest cities of China):

1acjs15vamk0b
1acjs15vamk0b

In this notebook we show how to extend core, single-site epidemiological models into larger models for making spatial-temporal simulations. In the explanations and examples we use SEI2R, [AA2, AAp1], as a core epidemiological model, but other models can be adopted if they adhere to the model data structure of the package “EpidemiologyModels.m”, [AAp1].

From our experiments with we believe that the proposed multi-site extension algorithm gives a modeling ingredient that is hard emulate by other means within single-site models.

Definitions

Single-site: A geographical location (city, neighbourhood, campus) for which the assumptions of the classical compartmental epidemiological models hold.

Single site epidemiological model: A compartmental epidemiological model for a single site. Such model has a system of Ordinary Differential Equations (ODE’s) and site dependent initial conditions.

Multi-site area: An area comprised of multiple single sites with known traveling patterns between them. The area has a directed graph G with nodes that correspond to the sites and a positive matrix \text{\textit{tpm}}(G) for the traveling patterns between the sites.

Problem definition: Given (i) a single site epidemiological model M, (ii) a graph G connecting multiple sites, and (iii) a traveling patterns matrix \text{\textit{tpm}}(G) between the nodes of G derive an epidemiological model S(M,\text{\textit{tpm}}(G)) that simulates more adequately viral decease propagation over G.

Multi-Site Epidemiological Model Extension Algorithm (MSEMEA): An algorithm that derives from a given single site epidemiological model and multi-site area an epidemiological model that can be used to simulate the geo-spatial-temporal epidemics and infectious disease spreads. (The description of MSEMEA is the main message of this notebook.)

Load packages

The epidemiological models framework used in this notebook is implemented with the packages [AAp1, AAp2, AA3]; the interactive plots functions are from the package [AAp4].

Notebook structure

The section “General algorithm description” gives rationale and conceptual steps of MSEMEA.

The next two sections of the notebook follow the procedure outline using the SEI2R model as M, a simple graph with two nodes as G, and both constant and time-dependent matrices for \text{\textit{tpm}}(G).

The section “Constant traveling patterns over a grid graph” presents an important test case with a grid graph that we use to test and build confidence in MSEMEA. The sub-section “Observations” is especially of interest.

The section “Time-dependent traveling patterns over a random graph” presents a nearly “real life” application of MSEMEA using a random graph and a time dependent travelling patterns matrix.

The section “Money from lost productivity” shows how to track the money losses across the sites.

The last section “Future plans” outlines envisioned (immediate) extensions work presented in this notebook.

General algorithm description

In this section we describe a modeling approach that uses different mathematical modeling approaches for (i) the multi-site travelling patterns and (ii) the single-site disease spread interactions, and then (iii) unifies them into a common model.

Splitting and scaling

The traveling between large, densely populated cities is a very different process of the usual people mingling in those cities. The usual large, dense city mingling is assumed and used in the typical compartmental epidemiological models. It seems it is a good idea to split the two processes and derive a common model.

Assume that all journeys finish within a day. We can model the people arriving (flying in) into a city as births, and people departing a city as deaths.

Let as take a simple model like SIR or SEIR and write the equation for every site we consider. This means for every site we have the same ODE’s with site-dependent initial conditions.

Consider the traveling patterns matrix K, which is a contingency matrix derived from source-destination traveling records. (Or the adjacency matrix of a travelling patterns graph.) The matrix entry of K(i,j) tells us how many people traveled from site i to site j. We systematically change the ODE’s of the sites in following way.

Assume that site a had only travelers coming from site b and going to site b. Assume that the Total Population (TP) sizes for sites a and b are N_a and N_b respectively. Assume that only people from the Susceptible Population (SP) traveled. If the adopted single-site model is SIR, [Wk1], we take the SP equation of site a

\text{SP}_a'(t)=-\frac{\beta  \text{IP}_a(t) \text{SP}_a(t)}{N_a}-\text{SP}_a(t) \mu

and change into the equation

\text{SP}_a'(t)=-\frac{\beta  \text{IP}_a(t) \text{SP}_a(t)}{N_a}-\text{SP}_a(t) \mu -\frac{K(a,b)\text{SP}_a(t)}{N_a}+\frac{K(b,a)\text{SP}_b(t)}{N_b},

assuming that

\frac{K(a,b)\text{SP}_a(t)}{N_a}\leq N_a ,\frac{K(b,a)\text{SP}_b(t)}{N_b}\leq N_b.

Remark: In the package [AAp3] the transformations above are done with the more general and robust formula:

\min \left(\frac{K(i,j)\text{SP}_i(t)}{\text{TP}_i(t)},\text{TP}_i(t)\right).

The transformed systems of ODE’s of the sites are joined into one “big” system of ODE’s, appropriate initial conditions are set, and the “big” ODE system is solved. (The sections below show concrete examples.)

Steps of MSEMEA

MSEMEA derives a compartmental model that combines (i) a graph representation of multi-site traveling patterns with (ii) a single-site compartmental epidemiological model.

Here is a visual aid for the algorithm steps below:

0i0g3m8u08bj4
0i0g3m8u08bj4
05c2sz8hd3ryj
05c2sz8hd3ryj
  1. Get a single-site epidemiological compartmental model data structure, M.
    1. The model data structure has stocks and rates dictionaries, equations, initial conditions, and prescribed rate values; see [AA2, AAp1].
  2. Derive the site-to-site traveling patterns matrix K for the sites in the graph G.
  3. For each of node i of G make a copy of the model M and denote with M[i].
    1. In general, the models M[i], i\in G have different initial conditions.
    2. The models can also have different death rates, contact rates, etc.
  4. Combine the models M[i], i\in G into the scaled model S.
    1. Change the equations of M[i], i\in G to reflect the traveling patterns matrix K.
    2. Join the systems of ODE’s of M[i], i\in G into one system of ODE’s.
  5. Set appropriate or desired initial conditions for each of the populations in S.
  6. Solve the ODE’s of S.
  7. Visualize the results.

Precaution

Care should be taken when specifying the initial conditions of MSEMEA’s system of ODE’s (sites’ populations) and the traveling patterns matrix. For example, the simulations can “blow up” if the traveling patterns matrix values are too large. As it was indicated above, the package [AAp3] puts some safe-guards, but in our experiments with random graphs and random traveling patterns matrices occasionally we still get “wild” results.

Analogy with Large scale air-pollution modeling

There is a strong analogy between MSEMEA and Eulerian models of Large Scale Air-Pollution Modeling (LSAPM), [AA3, ZZ1].

The mathematical models of LSAPM have a “chemistry part” and an “advection-diffusion part.” It is hard to treat such mathematical model directly – different kinds of splittings are used. If we consider 2D LSAPM then we can say that we cover the modeling area with steer tank reactors, then with the chemistry component we simulate the species chemical reactions in those steer tanks, and with the advection-diffusion component we change species concentrations in the steer tanks (according to some wind patterns.)

Similarly, with MSEMEA we separated the travel of population compartments from the “standard” epidemiological modeling interaction between the population compartments.

Similarly to the so called “rotational test” used in LSAPM to evaluate numerical schemes, we derive and study the results of “grid graph tests” for MSEMEA.

Single site epidemiological model

Here is the SEI2R model from the package [AAp1]:

18y99y846b10m
18y99y846b10m

Here we endow the SEI2R model with a (prominent) ID:

0alzg909zg4h0
0alzg909zg4h0

Thus we demonstrated that we can do Step 3 of MSEMEA.

Below we use ID’s that correspond to the nodes of graphs (and are integers.)

Scaling the single-site SIR model over a small complete graph

Constant travel matrices

Assume we have two sites and the following graph and matrix describe the traveling patterns between them.

Here is the graph:

0vgm31o9drq4f
0vgm31o9drq4f

And here is the traveling patterns matrix:

0lbp0xgso2tgt
0lbp0xgso2tgt

Note that there are much more travelers from 1 to 2 than from 2 to 1.

Here we obtain the core, single-site model (as shown in the section above):

Make the multi-site compartments model with SEI2R and the two-node travel matrix using the function ToSiteCompartmentsModel of [AAp2]:

Show the unique stocks in the multi-site model:

From the symbolic form of the multi-model equations derive the specific equations with the adopted rate values:

0mjliik7acoyd
0mjliik7acoyd

Show the initial conditions:

Show the total number of equations:

Solve the system of ODE’s of the extended model:

Display the solutions for each site separately:

1o9362wmczxo6
1o9362wmczxo6

From the plots above we see that both sites start with total populations of 100000 people. Because more travelers go from 1 to 2 we see that the exposed, infected, and recovered populations are larger at 2.

Time dependent travel matrices

Instead of using constant traveling patterns matrices we can use matrices with time functions as entries. It is instructive to repeat the computations above using this matrix:

1gsoh03lixm6y
1gsoh03lixm6y

Here are the corresponding number of traveling people functions:

0qh7kbtwxyatf
0qh7kbtwxyatf

Here we scale the SIR model, solve the obtained system of ODE’s, and plot the solutions:

0trv1vnslv1rm
0trv1vnslv1rm

Note that the oscillatory nature of the temporal functions in the travelling patterns matrix are reflected in the simulation results.

Constant traveling patterns over a grid graph

In this section we do the model extension and simulation over a regular grid graph with a constant traveling patterns matrix.

Here we create a grid graph with directed edges:

0l2m5npcrlnvw
0l2m5npcrlnvw

Note that:

  • There is one directed edge between any two edge-connected nodes
  • All horizontal edges point in one direction
  • All vertical edges point in one direction
  • The edges are directed from nodes with smaller indexes to nodes with larger indexes.

Here we make a constant traveling matrix and summarize it:

0d4ocoa6gibfj
0d4ocoa6gibfj

Here we scale the SEI2R model with the grid graph constant traveling matrix:

Change the initial conditions in the following way:

  • Pick initial population size per site (same for all sites)
  • Make a constant populations vector
  • At all sites except the first one put the infected populations to zero; the first site has one severely symptomatic person
  • Set the susceptible populations to be consistent with the total and infected populations.

Solve the system of ODE’s of the scaled model:

Randomly sample the graph sites and display the solutions separately for each site in the sample:

030qbpok8qfmc
030qbpok8qfmc

Display solutions of the first and last site:

0mcxwp8l1vqlb
0mcxwp8l1vqlb

As expected from the graph structure, we can see in the first site plot that its total population is decreasing – nobody is traveling to the first site. Similarly, we can see in the last site plot that its total population is increasing – nobody leaves the last site.

Graph evolution visualizations

We can visualize the spatial-temporal evolution of model’s populations using sequences of graphs. The graphs in the sequences are copies of the multi-site graph each copy having its nodes colored according to the populations in the solutions steps.

Here is a sub-sequence for the total populations:

070ld135tkf7y
070ld135tkf7y

Here is a sub-sequence for the sum of the infected populations:

0rv8vap59g8vk
0rv8vap59g8vk

Here is a sub-sequence for the recovered population:

0dnvpy20nafpz
0dnvpy20nafpz

Here is an animation of the sum of the infected populations:

1hfd1mqh0iwk7
1hfd1mqh0iwk7

Curve shapes of the globally-aggregated solutions

Let us plot for each graph vertex the sum of the solutions of the two types of infected populations. Here is a sample of those graphs:

0jwgipeg8uznn
0jwgipeg8uznn

We can see from the plot above that at the grid vertexes we have typical SEIR curve shapes for the corresponding infected populations.

Let us evaluate the solutions for the infected populations for over all graph vertexes and sum them. Here is the corresponding “globally-aggregated” plot:

03nzzmh7yydeu
03nzzmh7yydeu

We can see that the globally aggregated plot has a very different shape than the individual vertex plots. The globally aggregated plot has more symmetric look; the individual vertex plots have much steeper gradients on their left sides.

We can conjecture that a multi-site model made by MSEMEA would capture better real life situations than any single-site model. For example, by applying MSEMEA we might be more successful in our calibration attempts for the Hubei data shown (and calibrated upon) in [AA2.].

Interactive interface

With this interactive interface we see the evolution of all populations across the graph:

0fv3dbwmah3sh
0fv3dbwmah3sh

Observations

Obviously the simulations over the described grid graph, related constant traveling patterns matrix, and constant populations have the purpose to build confidence in conceptual design of MSEMEA and its implementation.

The following observations agree with our expectations for MSEMEA’s results over the “grid graph test”.

  1. The populations plots at each site resemble the typical plots of SEI2R.
  2. The total population at the first site linearly decreases.
  3. The total population at the last site linearly increases.
  4. The plots of the total populations clearly have gradually increasing gradients from the low index value nodes to the high index value nodes.
  5. For the infected populations there is a clear wave that propagates diagonally from the low index value nodes to the high index value nodes.
    1. In the direction of the general “graph flow.“
  6. The front of the infected populations wave is much steeper (gives “higher contrast”) than the tail.
    1. This should be expected from the single-site SEI2R plots.
  7. For the recovered populations there is a clear “saturation wave” pattern that indicates that the recovered populations change from 0 to values close to the corresponding final total populations.
  8. The globally aggregated solutions might have fairly different shapes than the single-site solutions. Hence, we think that MSEMEA gives a modeling ingredient that is hard to replicate or emulate by other means in single-site models.

Time-dependent traveling patterns over a random graph

In this section we apply the model extension and simulation over a random graph with random time-dependent traveling patterns matrix.

1elzu67nqndly
1elzu67nqndly

Remark: The computations in this section work with larger random graphs; we use a small graph for more legible presentation of the workflow and results. Also, the computations should clearly demonstrate the ability to do simulations with real life data.

Derive a traveling patterns matrix with entries that are random functions:

Here is a fragment of the matrix:

17ym9q0uehfbt
17ym9q0uehfbt

Summarize and plot the matrix at t=1:

0bw9rrd64615r
0bw9rrd64615r

Here we scale the SEI2R model with the random traveling matrix:

Change the initial conditions in the following way:

  • Pick maximum population size per site
  • Derive random populations for the sites
  • At all sites except the first one put the infected populations to zero; the first site has one severely symptomatic person
  • Set the susceptible populations to be consistent with the total and infected populations.

Here solve the obtained system of ODE’s:

Here we plot the solutions:

1pzmli04rbchr
1pzmli04rbchr

Graph evolution visualizations

As in the previous section we can visualize the spatial-temporal evolution of model’s populations using sequences of graphs.

Here is a globally normalized sequence:

1v8xq8jzm6ll5
1v8xq8jzm6ll5

Here is a locally normalized (“by vertex”) sequence:

029w2jtxsyaen
029w2jtxsyaen

Money from lost productivity

The model SEI2R from [AAp1] has the stock “Money from Lost Productivity” shown as \text{MLP}(t) in the equations:

1sareh7ovkmtt
1sareh7ovkmtt

Here are MLP plots from the two-node graph model:

0cm5n3xxlewns
0cm5n3xxlewns

Here we plot the sum of the accumulated money losses:

0n6gz7j3qlq07
0n6gz7j3qlq07

Here is the corresponding “daily loss” (derivative):

0jrk2ktzeeled
0jrk2ktzeeled

Future plans

There are multiple ways to extend the presented algorithm, MSEMEA. Here are a few most immediate ones:

  1. Investigate and describe the conditions under which MSEMEA performs well, and under which it “blows up”
  2. Apply MSEMEA together with single site models that have large economics parts
  3. Do real data simulations related to the spread of COVID-19.

References

Articles, books

[Wk1] Wikipedia entry, “Compartmental models in epidemiology”.

[HH1] Herbert W. Hethcote (2000). “The Mathematics of Infectious Diseases”. SIAM Review. 42 (4): 599–653. Bibcode:2000SIAMR..42..599H. doi:10.1137/s0036144500371907.

[AA1] Anton Antonov, “Coronavirus propagation modeling considerations”, (2020), SystemModeling at GitHub.

[AA2] Anton Antonov, “Basic experiments workflow for simple epidemiological models”, (2020), SystemModeling at GitHub.

[AA3] Anton Antonov, “Air pollution modeling with gridMathematica”, (2006), Wolfram Technology Conference.

[ZZ1] Zahari Zlatev, Computer Treatment of Large Air Pollution Models. 1995. Kluwer.

Repositories, packages

[WRI1] Wolfram Research, Inc., “Epidemic Data for Novel Coronavirus COVID-19”, WolframCloud.

[AAr1] Anton Antonov, Coronavirus propagation dynamics project, (2020), SystemModeling at GitHub.

[AAp1] Anton Antonov, “Epidemiology models Mathematica package”, (2020), SystemsModeling at GitHub.

[AAp2] Anton Antonov, “Epidemiology models modifications Mathematica package”, (2020), SystemsModeling at GitHub.

[AAp3] Anton Antonov, “Epidemiology modeling visualization functions Mathematica package”, (2020), SystemsModeling at GitHub.

[AAp4] Anton Antonov, “System dynamics interactive interfaces functions Mathematica package”, (2020), SystemsModeling at GitHub.