## Introduction

This document/notebook is inspired by the Mathematica Stack Exchange (MSE) question “Plotting the Star of Bethlehem”, [MSE1]. That MSE question asks for efficient and fast plotting of a certain mathematical function that (maybe) looks like the Star of Bethlehem, [Wk1]. Instead of doing what the author of the question suggests, I decided to use a generative art program and workflows from three of the most important Machine Learning (ML) sub-cultures: Latent Semantic Analysis, Recommendations, and Classification.

Although we discuss the making of Bethlehem Star-like images, the ML workflows and corresponding code presented in this document/notebook have general applicability: in many situations we have to make classifiers based on data that has to be “feature engineered” through a pipeline of several types of ML transformative workflows, and that feature engineering requires multiple iterations of re-examination and tuning in order to achieve the set goals.

The document/notebook is structured as follows:

- Target Bethlehem Star images
- Simplistic approach
- Elaborated approach outline
- Sections that follow the elaborated approach outline:
    - Data generation
    - Feature extraction
    - Recommender creation
    - Classifier creation and utilization experiments

(This document/notebook is a “raw” chapter for the book “Simplified Machine Learning Workflows”, [AAr3].)

## Target images

Here are the images taken from [MSE1] that we consider to be “Bethlehem Stars” in this document/notebook:

```
imgStar1 = Import["https://i.stack.imgur.com/qmmOw.png"];
imgStar2 = Import["https://i.stack.imgur.com/5gtsS.png"];
Row[{imgStar1, Spacer[5], imgStar2}]
```

We notice that similar images can be obtained using the Wolfram Function Repository (WFR) function `RandomMandala`, [AAr1]. Here are a dozen examples:

```
SeedRandom[5];
Multicolumn[Table[MandalaToWhiterImage@ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> RandomInteger[{2, 8}], "ConnectingFunction" -> FilledCurve@*BezierCurve], 12], 6, Background -> Black]
```

## Simplistic approach

We can just generate a large enough set of mandalas and pick the ones we like.

More precisely, we have the following steps:

1. Generate, say, 200 random mandalas using `BlockRandom`, keeping track of the random seeds. (The mandalas are generated with rotational symmetry order 2 and filled Bezier curve connections.)
2. Pick the mandalas that look, more or less, like Bethlehem Stars.
3. Add the picked mandalas to the results list.
4. If too few mandalas are in the results list, go to step 1.

Here are some mandalas generated with those steps:

```
lsStarReferenceSeeds = DeleteDuplicates@{697734, 227488491, 296515155601, 328716690761, 25979673846, 48784395076, 61082107304, 63772596796, 128581744446, 194807926867, 254647184786, 271909611066, 296515155601, 575775702222, 595562118302, 663386458123, 664847685618, 680328164429, 859482663706};
Multicolumn[Table[BlockRandom[ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve, ColorFunction -> (White &), Background -> Black], RandomSeeding -> rs], {rs, lsStarReferenceSeeds}] /. GrayLevel[0.25`] -> White, 6, Appearance -> "Horizontal", Background -> Black]
```

**Remark:** The plot above looks prettier in a notebook converted to dark mode with the resource function `DarkMode`.

## Elaborated approach

Assume that we want to automate the simplistic approach described in the previous section.

One way to automate is to create a Machine Learning (ML) classifier that is capable of discerning which `RandomMandala` objects look like Bethlehem Star target images and which do not. With such a classifier we can write a function `BethlehemMandala` that applies the classifier to multiple results from `RandomMandala` and returns those mandalas that the classifier says are good.

Here are the steps of building the proposed classifier:

- Generate a large enough Random Mandala Images Set (RMIS)
- Create a feature extractor from a subset of RMIS
- Assign features to all of RMIS
- Make a recommender with the RMIS features and other image data (like pixel values)
- Apply the RMIS recommender over the target Bethlehem Star images and determine and examine the image sets that are:
    - the best recommendations
    - the worst recommendations
- With the best and worst recommendation sets, compose training data for classifier making
- Train a classifier
- Examine classifier application to (filtering of) random mandala images (both in RMIS and not in RMIS)
- If the results are not satisfactory, redo some or all of the steps above

**Remark:** If the results are not satisfactory, we should consider using the obtained classifier at the data generation phase. (This is not done in this document/notebook.)

**Remark:** The elaborated approach outline and flow chart have general applicability, not just for generation of random images of a certain type.

### Flow chart

Here is a flow chart that corresponds to the outline above:

A few observations for the flow chart follow:

- The flow chart has a feature extraction block that shows that the feature extraction can be done in several ways.
    - The application of LSA is the type of feature extraction used in this document/notebook.
- If the results are not good enough, the flow chart shows that the classifier can be used at the data generation phase.
- If the results are not good enough, there are several alternatives to redo or tune the ML algorithms:
    - Changing or tuning the recommender implies training a new classifier.
    - Changing or tuning the feature extraction implies making a new recommender and a new classifier.

## Data generation and preparation

In this section we generate random mandala graphics and transform them into images and corresponding image-vectors. Those image-vectors can be used to apply dimension reduction algorithms. (Other feature extraction algorithms can be applied over the images.)

### Generated data

Generate a large number of mandalas:

```
SeedRandom[343];
k = 20000;
knownSeedsQ = False;
lsRSeeds = Union@RandomInteger[{1, 10^9}, k];
AbsoluteTiming[
aMandalas = If[TrueQ@knownSeedsQ,
Association@Table[rs -> BlockRandom[ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve], RandomSeeding -> rs], {rs, lsRSeeds}],
(*ELSE*)
Association@Table[i -> ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve], {i, 1, k}]
];
]

(*{18.7549, Null}*)
```

Check the number of mandalas generated:

```
Length[aMandalas]

(*20000*)
```

Show a sample of the generated mandalas:

`Magnify[Multicolumn[MandalaToWhiterImage /@ RandomSample[Values@aMandalas, 40], 10, Background -> Black], 0.7]`

### Data preparation

Convert the mandala graphics into images using appropriately large (or appropriately small) image sizes:

```
AbsoluteTiming[
aMImages = ParallelMap[ImageResize[#, {120, 120}] &, aMandalas];
]

(*{248.202, Null}*)
```

Flatten each of the images into vectors:

```
AbsoluteTiming[
aMImageVecs = ParallelMap[Flatten[ImageData[Binarize@ColorNegate@ColorConvert[#, "Grayscale"]]] &, aMImages];
]

(*{16.0125, Null}*)
```

**Remark:** Below those vectors are called **image-vectors**.

## Feature extraction

In this section we use the software monad `LSAMon`, [AA1, AAp1], to do dimension reduction over a subset of the random mandala images.

**Remark:** Other feature extraction methods can be used through the built-in functions `FeatureExtraction` and `FeatureExtract`.
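For instance, here is a minimal sketch of such an alternative (not used in the rest of the document/notebook) that derives a feature extractor from a sample of the mandala images with the built-in function `FeatureExtraction`:

```
(* A sketch of a built-in alternative to the LSAMon-based feature extraction. *)
(* Derive a feature extractor function from a sample of the mandala images: *)
fe = FeatureExtraction[RandomSample[Values@aMImages, 1000]];

(* Assign features to all mandala images: *)
lsFeatureVecs = fe[Values@aMImages];
```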

### Dimension reduction

Create an `LSAMon` object and extract image topics using Singular Value Decomposition (SVD) or Independent Component Analysis (ICA), [AAr2]:

```
SeedRandom[893];
AbsoluteTiming[
lsaObj =
LSAMonUnit[]⟹
LSAMonSetDocumentTermMatrix[SparseArray[Values@RandomSample[aMImageVecs, UpTo[2000]]]]⟹
LSAMonApplyTermWeightFunctions["None", "None", "Cosine"]⟹
LSAMonExtractTopics["NumberOfTopics" -> 40, Method -> "ICA", "MaxSteps" -> 240, "MinNumberOfDocumentsPerTerm" -> 0]⟹
LSAMonNormalizeMatrixProduct[Normalized -> Left];
]

(*{16.1871, Null}*)
```

Show the importance coefficients of the topics (if SVD was used the plot would show the singular values):

`ListPlot[Norm /@ SparseArray[lsaObj⟹LSAMonTakeH], Filling -> Axis, PlotRange -> All, PlotTheme -> "Scientific"]`

Show the interpretation of the extracted image topics:

```
lsaObj⟹
LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹
LSAMonEchoFunctionContext[ImageAdjust[Image[Partition[#, ImageDimensions[aMImages[[1]]][[1]]]]] & /@ SparseArray[#H] &];
```

### Approximation

Pick a test image that is a mandala image or a target image and pre-process it:

```
If[True,
ind = RandomChoice[Range[Length[Values[aMImages]]]];
imgTest = MandalaToWhiterImage@aMandalas[[ind]];
matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aMImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic],
(*ELSE*)
imgTest = Binarize[imgStar2, 0.5];
matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aMImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic]
];
imgTest
```

Find the representation of the test image with the chosen feature extractor (the `LSAMon` object here):

```
matReprsentation = lsaObj⟹LSAMonRepresentByTopics[matImageTest]⟹LSAMonTakeValue;
lsCoeff = Normal@SparseArray[matReprsentation[[1, All]]];
ListPlot[lsCoeff, Filling -> Axis, PlotRange -> All]
```

Show the interpretation of the found representation:

```
matH = SparseArray[lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹LSAMonTakeH];
vecReprsentation = lsCoeff . matH;
ImageAdjust@Image[Rescale[Partition[vecReprsentation, ImageDimensions[aMImages[[1]]][[1]]]]]
```

## Recommendations

In this section we utilize the software monad `SMRMon`, [AAp3], to create a recommender for the random mandala images.

**Remark:** Instead of the Sparse Matrix Recommender (SMR) object the built-in function `Nearest` can be used.
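To illustrate, here is a minimal sketch of such a `Nearest`-based alternative (not used in the rest of the document/notebook) over the image-vectors computed above:

```
(* A Nearest-based alternative to the SMR recommender created below. *)
(* Make a nearest neighbors function that maps image-vectors to their keys: *)
nfImages = Nearest[Values[aMImageVecs] -> Keys[aMImageVecs]];

(* Find the keys of the 12 mandala images closest to the target image: *)
nfImages[ImageToVector[Binarize[imgStar2, 0.5], ImageDimensions[aMImages[[1]]]], 12]
```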

Create an `SSparseMatrix` object for all image-vectors:

`matImages = ToSSparseMatrix[SparseArray[Values@aMImageVecs], "RowNames" -> Automatic, "ColumnNames" -> Automatic]`

Normalize the rows of the image-vectors matrix:

```
AbsoluteTiming[
matPixel = WeightTermsOfSSparseMatrix[matImages, "None", "None", "Cosine"]
]
```

Get the LSA topics matrix:

`matH = (lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹LSAMonTakeH)`

Find the image topics representation for each image-vector (assuming `matH` was computed with SVD or ICA):

```
AbsoluteTiming[
matTopic = matPixel . Transpose[matH]
]
```

Here we create a recommender based on the images data (pixels) and extracted image topics (or other image features):

```
smrObj =
SMRMonUnit[]⟹
SMRMonCreate[<|"Pixel" -> matPixel, "Topic" -> matTopic|>]⟹
SMRMonApplyNormalizationFunction["Cosine"]⟹
SMRMonSetTagTypeWeights[<|"Pixel" -> 0.2, "Topic" -> 1|>];
```

**Remark:** Note the weights assigned to the pixels and the topics in the recommender object above. Those weights were derived by examining the recommendation results shown below.

Here is the image we want to find most similar mandala images to – **the target image**:

`imgTarget = Binarize[imgStar2, 0.5]`

Here is the profile of the target image:

```
aProf = MakeSMRProfile[lsaObj, imgTarget, ImageDimensions[aMImages[[1]]]];
TakeLargest[aProf, 6]

(*<|"10032-10009-4392" -> 0.298371, "3906-10506-10495" -> 0.240086, "10027-10014-4387" -> 0.156797, "8342-8339-6062" -> 0.133822, "3182-3179-11222" -> 0.131565, "8470-8451-5829" -> 0.128844|>*)
```

Using the target image profile, here we compute the recommendation scores for all mandala images of the recommender:

```
aRecs =
smrObj⟹
SMRMonRecommendByProfile[aProf, All]⟹
SMRMonTakeValue;
```

Here is a plot of the similarity scores:

`Row[{ResourceFunction["RecordsSummary"][Values[aRecs]], ListPlot[Values[aRecs], ImageSize -> Medium, PlotRange -> All, PlotTheme -> "Detailed", PlotLabel -> "Similarity scores"]}]`

Here are the closest (nearest neighbor) mandala images:

`Multicolumn[Values[ImageAdjust@*ColorNegate /@ aMImages[[ToExpression /@ Take[Keys[aRecs], 48]]]], 12, Background -> Black]`

Here are the most distant mandala images:

`Multicolumn[Values[ImageAdjust@*ColorNegate /@ aMImages[[ToExpression /@ Take[Keys[aRecs], -48]]]], 12, Background -> Black]`

## Classifier creation and utilization

In this section we:

- Prepare classifier data
- Build and examine a classifier using the software monad `ClCon`, [AA2, AAp2], using appropriate training, testing, and validation data ratios
- Build a classifier utilizing all training data
- Generate Bethlehem Star mandalas by filtering mandala candidates with the classifier

As mentioned above, we prepare the data to build classifiers with by:

- Selecting top, highest-score recommendations and labeling them with `True`
- Selecting bad, low-score recommendations and labeling them with `False`

```
AbsoluteTiming[
Block[{
lsBest = Values@aMandalas[[ToExpression /@ Take[Keys[aRecs], 120]]],
lsWorse = Values@aMandalas[[ToExpression /@ Join[Take[Keys[aRecs], -200], RandomSample[Take[Keys[aRecs], {3000, -200}], 200]]]]},
lsTrainingData = Join[
Map[MandalaToWhiterImage[#, ImageDimensions@aMImages[[1]]] -> True &, lsBest],
Map[MandalaToWhiterImage[#, ImageDimensions@aMImages[[1]]] -> False &, lsWorse]
];
]
]

(*{27.9127, Null}*)
```

Using `ClCon` we train a classifier and show its performance measures:

```
clObj =
ClConUnit[lsTrainingData]⟹
ClConSplitData[0.75, 0.2]⟹
ClConMakeClassifier["NearestNeighbors"]⟹
ClConClassifierMeasurements⟹
ClConEchoValue⟹
ClConClassifierMeasurements["ConfusionMatrixPlot"]⟹
ClConEchoValue;
```

**Remark:** We can re-run the `ClCon` workflow above several times until we obtain a classifier we want to use.

Train a classifier with all prepared data:

```
clObj2 =
ClConUnit[lsTrainingData]⟹
ClConSplitData[1, 0.2]⟹
ClConMakeClassifier["NearestNeighbors"];
```

Get the classifier function from the `ClCon` object:

`cfBStar = clObj2⟹ClConTakeClassifier`

Here we generate Bethlehem Star mandalas using the classifier trained above:

```
SeedRandom[2020];
Multicolumn[MandalaToWhiterImage /@ BethlehemMandala[12, cfBStar, 0.87], 6, Background -> Black]
```

Generate Bethlehem Star mandala images utilizing the classifier (with a specified classifier probabilities threshold):

```
SeedRandom[32];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0.87, "Probabilities" -> True]]
```

Show unfiltered Bethlehem Star mandala candidates:

```
SeedRandom[32];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0, "Probabilities" -> True]]
```

**Remark:** Examine the probabilities in the image-probability associations above – they show that the classifier is “working.”

Here is another set of generated Bethlehem Star mandalas, using rotational symmetry order 4:

```
SeedRandom[777];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0.8, "RotationalSymmetryOrder" -> 4, "Probabilities" -> True]]
```

**Remark:** Note that although a higher rotational symmetry order is used, the highly scored results still seem relevant – they have the features of the target Bethlehem Star images.

## References

[AA1] Anton Antonov, “A monad for Latent Semantic Analysis workflows”, (2019), MathematicaForPrediction at WordPress.

[AA2] Anton Antonov, “A monad for classification workflows”, (2018), MathematicaForPrediction at WordPress.

[MSE1] “Plotting the Star of Bethlehem”, (2020), Mathematica Stack Exchange, question 236499.

[Wk1] Wikipedia entry, Star of Bethlehem.

### Packages

[AAr1] Anton Antonov, RandomMandala, (2019), Wolfram Function Repository.

[AAr2] Anton Antonov, IndependentComponentAnalysis, (2019), Wolfram Function Repository.

[AAr3] Anton Antonov, “Simplified Machine Learning Workflows” book, (2019), GitHub/antononcube.

[AAp1] Anton Antonov, Monadic Latent Semantic Analysis Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp2] Anton Antonov, Monadic contextual classification Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp3] Anton Antonov, Monadic Sparse Matrix Recommender Mathematica package, (2018), MathematicaForPrediction at GitHub/antononcube.

## Code definitions

```
urlPart = "https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/";
Get[urlPart <> "MonadicLatentSemanticAnalysis.m"];
Get[urlPart <> "MonadicSparseMatrixRecommender.m"];
Get[urlPart <> "MonadicContextualClassification.m"];
```

```
Clear[MandalaToImage, MandalaToWhiterImage];
MandalaToImage[gr_Graphics, imgSize_ : {120, 120}] := ColorNegate@ImageResize[gr, imgSize];
MandalaToWhiterImage[gr_Graphics, imgSize_ : {120, 120}] := ColorNegate@ImageResize[gr /. GrayLevel[0.25`] -> Black, imgSize];
```

```
Clear[ImageToVector];
ImageToVector[img_Image] := Flatten[ImageData[ColorConvert[img, "Grayscale"]]];
ImageToVector[img_Image, imgSize_] := Flatten[ImageData[ColorConvert[ImageResize[img, imgSize], "Grayscale"]]];
ImageToVector[___] := $Failed;
```

```
Clear[MakeSMRProfile];
MakeSMRProfile[lsaObj_LSAMon, gr_Graphics, imgSize_] := MakeSMRProfile[lsaObj, {gr}, imgSize];
MakeSMRProfile[lsaObj_LSAMon, lsGrs : {_Graphics ..}, imgSize_] := MakeSMRProfile[lsaObj, MandalaToWhiterImage[#, imgSize] & /@ lsGrs, imgSize];
MakeSMRProfile[lsaObj_LSAMon, img_Image, imgSize_] := MakeSMRProfile[lsaObj, {img}, imgSize];
MakeSMRProfile[lsaObj_LSAMon, lsImgs : {_Image ..}, imgSize_] :=
Block[{lsImgVecs, matTest, aProfPixel, aProfTopic},
lsImgVecs = ImageToVector[#, imgSize] & /@ lsImgs;
matTest = ToSSparseMatrix[SparseArray[lsImgVecs], "RowNames" -> Automatic, "ColumnNames" -> Automatic];
aProfPixel = ColumnSumsAssociation[lsaObj⟹LSAMonRepresentByTerms[matTest]⟹LSAMonTakeValue];
aProfTopic = ColumnSumsAssociation[lsaObj⟹LSAMonRepresentByTopics[matTest]⟹LSAMonTakeValue];
aProfPixel = Select[aProfPixel, # > 0 &];
aProfTopic = Select[aProfTopic, # > 0 &];
Join[aProfPixel, aProfTopic]
];
MakeSMRProfile[___] := $Failed;
```

```
Clear[BethlehemMandalaCandiate];
BethlehemMandalaCandiate[opts : OptionsPattern[]] := ResourceFunction["RandomMandala"][opts, "RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve];
```

```
Clear[BethlehemMandala];
Options[BethlehemMandala] = Join[{ImageSize -> {120, 120}, "Probabilities" -> False}, Options[ResourceFunction["RandomMandala"]]];
BethlehemMandala[n_Integer, cf_ClassifierFunction, opts : OptionsPattern[]] := BethlehemMandala[n, cf, 0.87, opts];
BethlehemMandala[n_Integer, cf_ClassifierFunction, threshold_?NumericQ, opts : OptionsPattern[]] :=
Block[{imgSize, probsQ, res, resNew, aResScores = <||>, aResScoresNew = <||>},
imgSize = OptionValue[BethlehemMandala, ImageSize];
probsQ = TrueQ[OptionValue[BethlehemMandala, "Probabilities"]];
res = {};
While[Length[res] < n,
resNew = Table[BethlehemMandalaCandiate[FilterRules[{opts}, Options[ResourceFunction["RandomMandala"]]]], 2*(n - Length[res])];
aResScoresNew = Association[# -> cf[MandalaToImage[#, imgSize], "Probabilities"][True] & /@ resNew];
aResScoresNew = Select[aResScoresNew, # >= threshold &];
aResScores = Join[aResScores, aResScoresNew];
res = Keys[aResScores]
];
aResScores = TakeLargest[ReverseSort[aResScores], UpTo[n]];
If[probsQ, aResScores, Keys[aResScores]]
] /; n > 0;
BethlehemMandala[___] := $Failed;
```