Random mandalas deconstruction in R, Python, and Mathematica

Today (2022-02-28) I gave a presentation at the Greater Boston useR Meetup titled “Random mandalas deconstruction with R, Python, and Mathematica”. (Link to the video recording.)


Here is the abstract:

In this presentation we discuss the application of different dimension reduction algorithms over collections of random mandalas. We discuss and compare the derived image bases and show how those bases explain the underlying collection structure. The presented techniques and insights (1) are applicable to any collection of images, and (2) can be included in larger, more complicated machine learning workflows. The former is demonstrated with a handwritten digits recognition
application; the latter with the generation of random Bethlehem stars. The (parallel) walk-through of the core demonstration is in all three programming languages: Mathematica, Python, and R.


Here is the related RStudio project: “RandomMandalasDeconstruction”.

Here is a link to the R-computations notebook converted to HTML: “LSA methods comparison in R”.

The Mathematica notebooks are placed in the project’s folder “notebooks-WL”.


See the work plan status in the org-mode file “Random-mandalas-deconstruction-presentation-work-plan.org”.

Here is the mind-map for the presentation:


The comparison workflow implemented in the notebooks of this project is summarized in the following flow chart:

Random mandalas deconstruction workflow


References

Articles

[AA1] Anton Antonov, “Comparison of dimension reduction algorithms over mandala images generation”, (2017), MathematicaForPrediction at WordPress.

[AA2] Anton Antonov, “Handwritten digits recognition by matrix factorization”, (2016), MathematicaForPrediction at WordPress.

Mathematica packages and repository functions

[AAp1] Anton Antonov, Monadic Latent Semantic Analysis Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAf1] Anton Antonov, NonNegativeMatrixFactorization, (2019), Wolfram Function Repository.

[AAf2] Anton Antonov, IndependentComponentAnalysis, (2019), Wolfram Function Repository.

[AAf3] Anton Antonov, RandomMandala, (2019), Wolfram Function Repository.

Python packages

[AAp2] Anton Antonov, LatentSemanticAnalyzer Python package (2021), PyPI.org.

[AAp3] Anton Antonov, Random Mandala Python package, (2021), PyPI.org.

R packages

[AAp4] Anton Antonov, Latent Semantic Analysis Monad R package, (2019), R-packages at GitHub/antononcube.

Generation of Random Bethlehem Stars

Introduction

This document/notebook is inspired by the Mathematica Stack Exchange (MSE) question “Plotting the Star of Bethlehem”, [MSE1]. That MSE question requests efficient and fast plotting of a certain mathematical function that (maybe) looks like the Star of Bethlehem, [Wk1]. Instead of doing what the author of the question suggests, I decided to use a generative art program and workflows from three of the most important Machine Learning (ML) sub-cultures: Latent Semantic Analysis, Recommendations, and Classification.

Although we discuss the making of Bethlehem Star-like images, the ML workflows and corresponding code presented in this document/notebook have general applicability. In many situations we have to make classifiers based on data that has to be “feature engineered” through a pipeline of several types of ML transformative workflows, and that feature engineering requires multiple iterations of re-examination and tuning in order to achieve the set goals.

The document/notebook is structured as follows:

  1. Target Bethlehem Star images
  2. Simplistic approach
  3. Elaborated approach outline
  4. Sections that follow through elaborated approach outline:
    1. Data generation
    2. Feature extraction
    3. Recommender creation
    4. Classifier creation and utilization experiments

(This document/notebook is a “raw” chapter for the book “Simplified Machine Learning Workflows”, [AAr3].)

Target images

Here are the images taken from [MSE1] that we consider to be “Bethlehem Stars” in this document/notebook:

imgStar1 = Import["https://i.stack.imgur.com/qmmOw.png"];
imgStar2 = Import["https://i.stack.imgur.com/5gtsS.png"];
Row[{imgStar1, Spacer[5], imgStar2}]
00dxgln7hhmjl

We notice that similar images can be obtained using the Wolfram Function Repository (WFR) function RandomMandala, [AAr1]. Here are a dozen examples:

SeedRandom[5];
Multicolumn[Table[MandalaToWhiterImage@ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> RandomInteger[{2, 8}], "ConnectingFunction" -> FilledCurve@*BezierCurve], 12], 6, Background -> Black]
0dwkbztss087q

Simplistic approach

We can just generate a large enough set of mandalas and pick the ones we like.

More precisely we have the following steps:

  1. We generate, say, 200 random mandalas using BlockRandom, keeping track of the random seeds (see the sketch after this list)
    1. The mandalas are generated with rotational symmetry order 2 and filled Bezier curve connections.
  2. We pick mandalas that look, more or less, like Bethlehem Stars
  3. Add picked mandalas to the results list
  4. If too few mandalas are in the results list go to 1.
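
Here is a minimal sketch of step 1 (the seed range and the candidate count are arbitrary illustration choices; the pattern mirrors the generation code further below):

SeedRandom[121];
lsCandidateSeeds = RandomInteger[{1, 10^12}, 200];
aCandidates =
   Association@
    Table[rs -> BlockRandom[ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve], RandomSeeding -> rs], {rs, lsCandidateSeeds}];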

Here are some mandalas generated with those steps:

lsStarReferenceSeeds = DeleteDuplicates@{697734, 227488491, 296515155601, 328716690761, 25979673846, 48784395076, 61082107304, 63772596796, 128581744446, 194807926867, 254647184786, 271909611066, 296515155601, 575775702222, 595562118302, 663386458123, 664847685618, 680328164429, 859482663706};
Multicolumn[
  Table[BlockRandom[ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve, ColorFunction -> (White &), Background -> Black], RandomSeeding -> rs], {rs, lsStarReferenceSeeds}] /. GrayLevel[0.25`] -> White, 6, Appearance -> "Horizontal", Background -> Black]
1aedatd1zb3fh

Remark: The plot above looks prettier in a notebook converted with the resource function DarkMode.

Elaborated approach

Assume that we want to automate the simplistic approach described in the previous section.

One way to automate is to create a Machine Learning (ML) classifier that is capable of discerning which RandomMandala objects look like Bethlehem Star target images and which do not. With such a classifier we can write a function BethlehemMandala that applies the classifier on multiple results from RandomMandala and returns those mandalas that the classifier says are good.

Here are the steps of building the proposed classifier:

  • Generate a large enough Random Mandala Images Set (RMIS)
  • Create a feature extractor from a subset of RMIS
  • Assign features to all of RMIS
  • Make a recommender with the RMIS features and other image data (like pixel values)
  • Apply the RMIS recommender over the target Bethlehem Star images and determine and examine image sets that are:
    • the best recommendations
    • the worst recommendations
  • With the best and worst recommendation sets, compose training data for classifier making
  • Train a classifier
  • Examine classifier application to (filtering of) random mandala images (both in RMIS and not in RMIS)
  • If the results are not satisfactory redo some or all of the steps above

Remark: If the results are not satisfactory we should consider using the obtained classifier at the data generation phase. (This is not done in this document/notebook.)

Remark: The elaborated approach outline and flow chart have general applicability, not just for generation of random images of a certain type.

Flow chart

Here is a flow chart that corresponds to the outline above:

09agsmbmjhhv4

A few observations for the flow chart follow:

  • The flow chart has a feature extraction block that shows that the feature extraction can be done in several ways.
    • The application of LSA is a type of feature extraction which this document/notebook uses.
  • If the results are not good enough the flow chart shows that the classifier can be used at the data generation phase.
  • If the results are not good enough there are several alternatives to redo or tune the ML algorithms.
    • Changing or tuning the recommender implies training a new classifier.
    • Changing or tuning the feature extraction implies making a new recommender and a new classifier.

Data generation and preparation

In this section we generate random mandala graphics, transform them into images and corresponding vectors. Those image-vectors can be used to apply dimension reduction algorithms. (Other feature extraction algorithms can be applied over the images.)

Generated data

Generate a large number of mandalas:

k = 20000;
knownSeedsQ = False;
SeedRandom[343];
lsRSeeds = Union@RandomInteger[{1, 10^9}, k];
AbsoluteTiming[
  aMandalas = 
    If[TrueQ@knownSeedsQ, 
     Association@Table[rs -> BlockRandom[ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve], RandomSeeding -> rs], {rs, lsRSeeds}], 
    (*ELSE*) 
     Association@Table[i -> ResourceFunction["RandomMandala"]["RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve], {i, 1, k}] 
    ]; 
 ]

(*{18.7549, Null}*)

Check the number of mandalas generated:

Length[aMandalas]

(*20000*)

Show a sample of the generated mandalas:

Magnify[Multicolumn[MandalaToWhiterImage /@ RandomSample[Values@aMandalas, 40], 10, Background -> Black], 0.7]
1gpblane63eo9

Data preparation

Convert the mandala graphics into images using appropriately large (or appropriately small) image sizes:

AbsoluteTiming[
  aMImages = ParallelMap[ImageResize[#, {120, 120}] &, aMandalas]; 
 ]

(*{248.202, Null}*)

Flatten each of the images into vectors:

AbsoluteTiming[
  aMImageVecs = ParallelMap[Flatten[ImageData[Binarize@ColorNegate@ColorConvert[#, "Grayscale"]]] &, aMImages]; 
 ]

(*{16.0125, Null}*)

Remark: Below those vectors are called image-vectors.

Feature extraction

In this section we use the software monad LSAMon, [AA1, AAp1], to do dimension reduction over a subset of random mandala images.

Remark: Other feature extraction methods can be used through the built-in functions FeatureExtraction and FeatureExtract.
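
For example, here is a minimal sketch with FeatureExtraction over a sample of the mandala images (the sample sizes are arbitrary; this is an alternative to the LSAMon pipeline below, not what this document/notebook uses):

feObj = FeatureExtraction[RandomSample[Values[aMImages], UpTo[500]]];
feVecs = feObj /@ RandomSample[Values[aMImages], 4];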

Dimension reduction

Create an LSAMon object and extract image topics using Singular Value Decomposition (SVD) or Independent Component Analysis (ICA), [AAr2]:

SeedRandom[893];
AbsoluteTiming[
  lsaObj = 
    LSAMonUnit[]⟹
     LSAMonSetDocumentTermMatrix[SparseArray[Values@RandomSample[aMImageVecs, UpTo[2000]]]]⟹
     LSAMonApplyTermWeightFunctions["None", "None", "Cosine"]⟹
     LSAMonExtractTopics["NumberOfTopics" -> 40, Method -> "ICA", "MaxSteps" -> 240, "MinNumberOfDocumentsPerTerm" -> 0]⟹
     LSAMonNormalizeMatrixProduct[Normalized -> Left]; 
 ]

(*{16.1871, Null}*)

Show the importance coefficients of the topics (if SVD was used the plot would show the singular values):

ListPlot[Norm /@ SparseArray[lsaObj⟹LSAMonTakeH], Filling -> Axis, PlotRange -> All, PlotTheme -> "Scientific"]
1sy1zsgpxysof

Show the interpretation of the extracted image topics:

lsaObj⟹
   LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹
   LSAMonEchoFunctionContext[ImageAdjust[Image[Partition[#, ImageDimensions[aMImages[[1]]][[1]]]]] & /@ SparseArray[#H] &];
16h8a7jwknnkt

Approximation

Pick a test image that is a mandala image or a target image and pre-process it:

If[True, 
   ind = RandomChoice[Range[Length[Values[aMImages]]]]; 
   imgTest = MandalaToWhiterImage@aMandalas[[ind]]; 
   matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aMImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic], 
  (*ELSE*) 
   imgTest = Binarize[imgStar2, 0.5]; 
   matImageTest = ToSSparseMatrix[SparseArray@List@ImageToVector[imgTest, ImageDimensions[aMImages[[1]]]], "RowNames" -> Automatic, "ColumnNames" -> Automatic] 
  ];
imgTest
0vlq50ryrw0hl

Find the representation of the test image with the chosen feature extractor (LSAMon object here):

matRepresentation = lsaObj⟹LSAMonRepresentByTopics[matImageTest]⟹LSAMonTakeValue;
lsCoeff = Normal@SparseArray[matRepresentation[[1, All]]];
ListPlot[lsCoeff, Filling -> Axis, PlotRange -> All]
1u57b208thtfz

Show the interpretation of the found representation:

H = SparseArray[lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹LSAMonTakeH];
vecRepresentation = lsCoeff . H;
ImageAdjust@Image[Rescale[Partition[vecRepresentation, ImageDimensions[aMImages[[1]]][[1]]]]]
1m7r3b5bx32ow

Recommendations

In this section we utilize the software monad SMRMon, [AAp3], to create a recommender for the random mandala images.

Remark: Instead of the Sparse Matrix Recommender (SMR) object, the built-in function Nearest can be used.
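
Here is a minimal sketch of that alternative (it uses the helper function ImageToVector defined in the section “Code definitions” below; the number of neighbors, 12, is arbitrary):

nfMandalas = Nearest[Values[aMImageVecs] -> Keys[aMImageVecs]];
lsNNs = nfMandalas[ImageToVector[imgStar2, ImageDimensions[aMImages[[1]]]], 12];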

Create SSparseMatrix object for all image-vectors:

matImages = ToSSparseMatrix[SparseArray[Values@aMImageVecs], "RowNames" -> Automatic, "ColumnNames" -> Automatic]
029x975bs3q7w

Normalize the rows of the image-vectors matrix:

AbsoluteTiming[
  matPixel = WeightTermsOfSSparseMatrix[matImages, "None", "None", "Cosine"] 
 ]
1k9xucwektmhh

Get the LSA topics matrix:

matH = (lsaObj⟹LSAMonNormalizeMatrixProduct[Normalized -> Right]⟹LSAMonTakeH)
05zsn0o1jyqj6

Find the image topics representation for each image-vector (assuming matH was computed with SVD or ICA):

AbsoluteTiming[
  matTopic = matPixel . Transpose[matH] 
 ]
028u1jz1hgzx9

Here we create a recommender based on the image data (pixels) and extracted image topics (or other image features):

smrObj = 
   SMRMonUnit[]⟹
    SMRMonCreate[<|"Pixel" -> matPixel, "Topic" -> matTopic|>]⟹
    SMRMonApplyNormalizationFunction["Cosine"]⟹
    SMRMonSetTagTypeWeights[<|"Pixel" -> 0.2, "Topic" -> 1|>];

Remark: Note the weights assigned to the pixels and the topics in the recommender object above. Those weights were derived by examining the recommendations results shown below.
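
For example, the tag type weights can be re-assigned over an already created recommender object in order to experiment. Here is a sketch with arbitrary alternative weights (assuming SMRMonSetTagTypeWeights can be re-applied to an existing object, as its use in the pipeline above suggests):

smrObj2 = smrObj⟹SMRMonSetTagTypeWeights[<|"Pixel" -> 0.5, "Topic" -> 1|>];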

Here is the image we want to find most similar mandala images to – the target image:

imgTarget = Binarize[imgStar2, 0.5]
1qdmarfxa5i78

Here is the profile of the target image:

aProf = MakeSMRProfile[lsaObj, imgTarget, ImageDimensions[aMImages[[1]]]];
TakeLargest[aProf, 6]

(*<|"10032-10009-4392" -> 0.298371, "3906-10506-10495" -> 0.240086, "10027-10014-4387" -> 0.156797, "8342-8339-6062" -> 0.133822, "3182-3179-11222" -> 0.131565, "8470-8451-5829" -> 0.128844|>*)

Using the target image profile here we compute the recommendation scores for all mandala images of the recommender:

aRecs = 
   smrObj⟹
    SMRMonRecommendByProfile[aProf, All]⟹
    SMRMonTakeValue;

Here is a plot of the similarity scores:

Row[{ResourceFunction["RecordsSummary"][Values[aRecs]], ListPlot[Values[aRecs], ImageSize -> Medium, PlotRange -> All, PlotTheme -> "Detailed", PlotLabel -> "Similarity scores"]}]
1kdiisj4jg4ut

Here are the closest (nearest neighbor) mandala images:

Multicolumn[Values[ImageAdjust@*ColorNegate /@ aMImages[[ToExpression /@ Take[Keys[aRecs], 48]]]], 12, Background -> Black]
096uazw8izidy

Here are the most distant mandala images:

Multicolumn[Values[ImageAdjust@*ColorNegate /@ aMImages[[ToExpression /@ Take[Keys[aRecs], -48]]]], 12, Background -> Black]
0zb7hf24twij4

Classifier creation and utilization

In this section we:

  • Prepare classifier data
  • Build and examine a classifier using the software monad ClCon, [AA2, AAp2], with appropriate training, testing, and validation data ratios
  • Build a classifier utilizing all training data
  • Generate Bethlehem Star mandalas by filtering mandala candidates with the classifier

As mentioned above, we prepare the classifier training data by:

  • Selecting top, highest-score recommendations and labeling them with True
  • Selecting bad, low-score recommendations and labeling them with False

AbsoluteTiming[
  Block[{
    lsBest = Values@aMandalas[[ToExpression /@ Take[Keys[aRecs], 120]]], 
    lsWorse = Values@aMandalas[[ToExpression /@ Join[Take[Keys[aRecs], -200], RandomSample[Take[Keys[aRecs], {3000, -200}], 200]]]]}, 
   lsTrainingData = 
     Join[
      Map[MandalaToWhiterImage[#, ImageDimensions@aMImages[[1]]] -> True &, lsBest], 
      Map[MandalaToWhiterImage[#, ImageDimensions@aMImages[[1]]] -> False &, lsWorse] 
     ]; 
  ] 
 ]

(*{27.9127, Null}*)

Using ClCon, train a classifier and show its performance measures:

clObj = 
   ClConUnit[lsTrainingData]⟹
    ClConSplitData[0.75, 0.2]⟹
    ClConMakeClassifier["NearestNeighbors"]⟹
    ClConClassifierMeasurements⟹
    ClConEchoValue⟹
    ClConClassifierMeasurements["ConfusionMatrixPlot"]⟹
    ClConEchoValue;
0jkfza6x72kb5
03uf3deiz0hsd

Remark: We can re-run the ClCon workflow above several times until we obtain a classifier we want to use.

Train a classifier with all prepared data:

clObj2 = 
   ClConUnit[lsTrainingData]⟹
    ClConSplitData[1, 0.2]⟹
    ClConMakeClassifier["NearestNeighbors"];

Get the classifier function from ClCon object:

cfBStar = clObj2⟹ClConTakeClassifier
0awjjib00ihgg

Here we generate Bethlehem Star mandalas using the classifier trained above:

SeedRandom[2020];
Multicolumn[MandalaToWhiterImage /@ BethlehemMandala[12, cfBStar, 0.87], 6, Background -> Black]
0r37g633mpq0y

Generate Bethlehem Star mandala images utilizing the classifier (with a specified classifier probabilities threshold):

SeedRandom[32];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0.87, "Probabilities" -> True]]
0osesxm4gdvvf

Show unfiltered Bethlehem Star mandala candidates:

SeedRandom[32];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0, "Probabilities" -> True]]
0rr12n6savl9z

Remark: Examine the probabilities in the image-probability associations above – they show that the classifier is “working.”

Here is another set of generated Bethlehem Star mandalas, using rotational symmetry order 4:

SeedRandom[777];
KeyMap[MandalaToWhiterImage, BethlehemMandala[12, cfBStar, 0.8, "RotationalSymmetryOrder" -> 4, "Probabilities" -> True]]
0rgzjquk4amz4

Remark: Note that although a higher rotational symmetry order is used, the highly scored results still seem relevant – they have the features of the target Bethlehem Star images.

References

[AA1] Anton Antonov, “A monad for Latent Semantic Analysis workflows”, (2019), MathematicaForPrediction at WordPress.

[AA2] Anton Antonov, “A monad for classification workflows”, (2018), MathematicaForPrediction at WordPress.

[MSE1] “Plotting the Star of Bethlehem”, (2020), Mathematica Stack Exchange, question 236499.

[Wk1] Wikipedia entry, Star of Bethlehem.

Packages

[AAr1] Anton Antonov, RandomMandala, (2019), Wolfram Function Repository.

[AAr2] Anton Antonov, IndependentComponentAnalysis, (2019), Wolfram Function Repository.

[AAr3] Anton Antonov, “Simplified Machine Learning Workflows” book, (2019), GitHub/antononcube.

[AAp1] Anton Antonov, Monadic Latent Semantic Analysis Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp2] Anton Antonov, Monadic contextual classification Mathematica package, (2017), MathematicaForPrediction at GitHub/antononcube.

[AAp3] Anton Antonov, Monadic Sparse Matrix Recommender Mathematica package, (2018), MathematicaForPrediction at GitHub/antononcube.

Code definitions

urlPart = "https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/MonadicProgramming/";
Get[urlPart <> "MonadicLatentSemanticAnalysis.m"];
Get[urlPart <> "MonadicSparseMatrixRecommender.m"];
Get[urlPart <> "/MonadicContextualClassification.m"];
Clear[MandalaToImage, MandalaToWhiterImage];
MandalaToImage[gr_Graphics, imgSize_ : {120, 120}] := ColorNegate@ImageResize[gr, imgSize];
MandalaToWhiterImage[gr_Graphics, imgSize_ : {120, 120}] := ColorNegate@ImageResize[gr /. GrayLevel[0.25`] -> Black, imgSize];
Clear[ImageToVector];
ImageToVector[img_Image] := Flatten[ImageData[ColorConvert[img, "Grayscale"]]];
ImageToVector[img_Image, imgSize_] := Flatten[ImageData[ColorConvert[ImageResize[img, imgSize], "Grayscale"]]];
ImageToVector[___] := $Failed;
Clear[MakeSMRProfile];
MakeSMRProfile[lsaObj_LSAMon, gr_Graphics, imgSize_] := MakeSMRProfile[lsaObj, {gr}, imgSize];
MakeSMRProfile[lsaObj_LSAMon, lsGrs : {_Graphics ..}, imgSize_] := MakeSMRProfile[lsaObj, MandalaToWhiterImage[#, imgSize] & /@ lsGrs, imgSize];
MakeSMRProfile[lsaObj_LSAMon, img_Image, imgSize_] := MakeSMRProfile[lsaObj, {img}, imgSize];
MakeSMRProfile[lsaObj_LSAMon, lsImgs : {_Image ..}, imgSize_] := 
   Block[{lsImgVecs, matTest, aProfPixel, aProfTopic}, 
    lsImgVecs = ImageToVector[#, imgSize] & /@ lsImgs; 
    matTest = ToSSparseMatrix[SparseArray[lsImgVecs], "RowNames" -> Automatic, "ColumnNames" -> Automatic]; 
    aProfPixel = ColumnSumsAssociation[lsaObj⟹LSAMonRepresentByTerms[matTest]⟹LSAMonTakeValue]; 
    aProfTopic = ColumnSumsAssociation[lsaObj⟹LSAMonRepresentByTopics[matTest]⟹LSAMonTakeValue]; 
    aProfPixel = Select[aProfPixel, # > 0 &]; 
    aProfTopic = Select[aProfTopic, # > 0 &]; 
    Join[aProfPixel, aProfTopic] 
   ];
MakeSMRProfile[___] := $Failed;
Clear[BethlehemMandalaCandidate];
BethlehemMandalaCandidate[opts : OptionsPattern[]] := ResourceFunction["RandomMandala"][opts, "RotationalSymmetryOrder" -> 2, "NumberOfSeedElements" -> Automatic, "ConnectingFunction" -> FilledCurve@*BezierCurve];
Clear[BethlehemMandala];
Options[BethlehemMandala] = Join[{ImageSize -> {120, 120}, "Probabilities" -> False}, Options[ResourceFunction["RandomMandala"]]];
BethlehemMandala[n_Integer, cf_ClassifierFunction, opts : OptionsPattern[]] := BethlehemMandala[n, cf, 0.87, opts];
BethlehemMandala[n_Integer, cf_ClassifierFunction, threshold_?NumericQ, opts : OptionsPattern[]] := 
   Block[{imgSize, probsQ, res, resNew, aResScores = <||>, aResScoresNew = <||>}, 
     
     imgSize = OptionValue[BethlehemMandala, ImageSize]; 
     probsQ = TrueQ[OptionValue[BethlehemMandala, "Probabilities"]]; 
     
     res = {}; 
     While[Length[res] < n, 
      resNew = Table[BethlehemMandalaCandidate[FilterRules[{opts}, Options[ResourceFunction["RandomMandala"]]]], 2*(n - Length[res])]; 
      aResScoresNew = Association[# -> cf[MandalaToImage[#, imgSize], "Probabilities"][True] & /@ resNew]; 
      aResScoresNew = Select[aResScoresNew, # >= threshold &]; 
      aResScores = Join[aResScores, aResScoresNew]; 
      res = Keys[aResScores] 
     ]; 
     
     aResScores = TakeLargest[ReverseSort[aResScores], UpTo[n]]; 
     If[probsQ, aResScores, Keys[aResScores]] 
    ] /; n > 0;
BethlehemMandala[___] := $Failed

Conference abstracts similarities

Introduction

In this MathematicaVsR project we discuss and exemplify finding and analyzing similarities between texts using Latent Semantic Analysis (LSA). Both Mathematica and R codes are provided.

The LSA workflows are constructed and executed with the software monads LSAMon-WL, [AA1, AAp1], and LSAMon-R, [AAp2].

The illustrating examples are based on conference abstracts from rstudio::conf and Wolfram Technology Conference (WTC), [AAd1, AAd2]. Since the number of rstudio::conf abstracts is small, and since rstudio::conf 2020 is about to start at the time of preparing this project, we focus on words and texts from RStudio’s ecosystem of packages and presentations.

Statistical thesaurus for words from RStudio’s ecosystem

Consider the focus words:

{"cloud","rstudio","package","tidyverse","dplyr","analyze","python","ggplot2","markdown","sql"}

Here is a statistical thesaurus for those words:

0az70qt8noeqf
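
Here is a sketch of how such a thesaurus can be computed with LSAMon (assuming an LSA object lsaObj made over the abstracts, as in the pipeline in the section “LSA pipelines specifications” below; LSAMonEchoStatisticalThesaurus is from the LSAMon-WL package, [AAp1], and its exact signature may differ):

focusWords = {"cloud", "rstudio", "package", "tidyverse", "dplyr", "analyze", "python", "ggplot2", "markdown", "sql"};
lsaObj⟹LSAMonEchoStatisticalThesaurus["Words" -> focusWords];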

Remark: Note that the computed thesaurus entries seem fairly “R-flavored.”

Similarity analysis diagrams

As expected, the abstracts from rstudio::conf tend to cluster closely – note the square formed top-left in the plot of a similarity matrix based on extracted topics:

1d5a83m8cghew

Here is a similarity graph based on the matrix above:

09y26s6kr3bv9

Here is a clustering (by “graph communities”) of the sub-graph highlighted in the plot above:

0rba3xgoknkwi

Notebooks

Comparison observations

LSA pipelines specifications

The packages LSAMon-WL, [AAp1], and LSAMon-R, [AAp2], make the comparison easy – the codes of the specified workflows are nearly identical.

Here is the Mathematica code:

lsaObj =
  LSAMonUnit[aDesriptions]⟹
   LSAMonMakeDocumentTermMatrix[{}, Automatic]⟹
   LSAMonEchoDocumentTermMatrixStatistics⟹
   LSAMonApplyTermWeightFunctions["IDF", "TermFrequency", "Cosine"]⟹
   LSAMonExtractTopics["NumberOfTopics" -> 36, "MinNumberOfDocumentsPerTerm" -> 2, Method -> "ICA", MaxSteps -> 200]⟹
   LSAMonEchoTopicsTable["NumberOfTableColumns" -> 6];

Here is the R code:

lsaObj <- 
  LSAMonUnit(lsDescriptions) %>% 
  LSAMonMakeDocumentTermMatrix( stemWordsQ = FALSE, stopWords = stopwords::stopwords() ) %>% 
  LSAMonApplyTermWeightFunctions( "IDF", "TermFrequency", "Cosine" ) %>% 
  LSAMonExtractTopics( numberOfTopics = 36, minNumberOfDocumentsPerTerm = 5, method = "NNMF", maxSteps = 20, profilingQ = FALSE ) %>% 
  LSAMonEchoTopicsTable( numberOfTableColumns = 6, wideFormQ = TRUE ) 

Graphs and graphics

Mathematica’s built-in graph functions make the exploration of the similarities much easier than it is in R.
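
For example, a similarity graph like the one shown above can be made with a few built-in functions. Here is a sketch in which matTopicSim is assumed to be the topics-based similarity matrix (as a plain numeric matrix) and the threshold 0.5 is arbitrary:

matAdj = UnitStep[matTopicSim - 0.5];
matAdj = matAdj - DiagonalMatrix[Diagonal[matAdj]]; (* remove self-loops *)
grSim = AdjacencyGraph[matAdj, VertexLabels -> Automatic];
CommunityGraphPlot[grSim]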

Mathematica’s matrix plots provide more control and are more readily informative.

Sparse matrix objects with named rows and columns

R’s built-in sparse matrices with named rows and columns are great. LSAMon-WL utilizes a similar, specially implemented sparse matrix object, see [AA1, AAp3].
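
Here is a minimal sketch of that object’s interface, mirroring the ToSSparseMatrix usage in the sections above (explicit row and column name lists are assumed to be supported in addition to Automatic):

matExample = ToSSparseMatrix[SparseArray[{{1., 0.}, {0., 2.}}], "RowNames" -> {"doc1", "doc2"}, "ColumnNames" -> {"word1", "word2"}];
RowNames[matExample]

(* {"doc1", "doc2"} *)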

References

Articles

[AA1] Anton Antonov, A monad for Latent Semantic Analysis workflows, (2019), MathematicaForPrediction at GitHub.

[AA2] Anton Antonov, Text similarities through bags of words, (2020), SimplifiedMachineLearningWorkflows-book at GitHub.

Data

[AAd1] Anton Antonov, RStudio::conf-2019-abstracts.csv, (2020), SimplifiedMachineLearningWorkflows-book at GitHub.

[AAd2] Anton Antonov, Wolfram-Technology-Conference-2016-to-2019-abstracts.csv, (2020), SimplifiedMachineLearningWorkflows-book at GitHub.

Packages

[AAp1] Anton Antonov, Monadic Latent Semantic Analysis Mathematica package, (2017), MathematicaForPrediction at GitHub.

[AAp2] Anton Antonov, Latent Semantic Analysis Monad R package, (2019), R-packages at GitHub.

[AAp3] Anton Antonov, SSparseMatrix Mathematica package, (2018), MathematicaForPrediction at GitHub.

Wolfram Live-Coding Series on Latent Semantic Analysis workflows

In brief

The lectures on Latent Semantic Analysis (LSA) are to be recorded through Wolfram University (Wolfram U) in December 2019 and January-February 2020.

The lectures (as live-coding sessions)

  1. Overview of Latent Semantic Analysis (LSA): typical problems and basic workflows.
    Answering preliminary anticipated questions.
    Here is the recording of the first session at Twitch.

    • What are the typical applications of LSA?
    • Why use LSA?
    • What is the fundamental philosophical or scientific assumption for LSA?
    • What is the most important and/or fundamental step of LSA?
    • What is the difference between LSA and Latent Semantic Indexing (LSI)?
    • What are the alternatives?
      • Using Neural Networks instead?
    • How is LSA used to derive similarities between two given texts?
    • How is LSA used to evaluate the proximity of phrases? (That have different words, but close semantic meaning.)
    • How do the main dimension reduction methods compare?
  2. LSA for document collections.
    Here is the recording of the second session at Twitch.

    • Motivational example – full blown LSA workflow.
    • Fundamentals, text transformation (the hard way):
      • bag of words model,
      • stop words,
      • stemming.
    • The easy way with LSAMon.
    • “Eat your own dog food” example.
  3. Representation of the documents – the fundamental matrix object.
    Here is the recording of the third session at Twitch.

    • Review: last session’s example.
    • Review: the motivational example – full blown LSA workflow.
    • Linear vector space representation:
      • LSA’s most fundamental operation,
      • matrix with named rows and columns.
    • Pareto Principle adherence
      • for a document,
      • for a document collection, and
      • (in general.)
  4. Representation of unseen documents.
    Here is the recording of the fourth session at Twitch.

  5. LSA for image de-noising and classification. Here is the recording of the fifth session at Twitch.

    • Review: last session’s image collection topics extraction.
      • Let us try that on two other datasets.
    • Image de-noising:
      • Using handwritten digits (again).
    • Image classification:
      • Handwritten digits.
  6. Further use cases.

Comparison of dimension reduction algorithms over mandala images generation

Introduction

This document discusses concrete algorithms for two different approaches to the generation of mandala images, [1]: direct construction with graphics primitives, and the use of machine learning algorithms.

In the experiments described in this document better results were obtained with the direct algorithms. The direct algorithms were made for the Mathematica StackExchange question "Code that generates a mandala", [3].

The main goals of this document are:

  1. to show some pretty images exploiting symmetry and multiplicity (see this album),

  2. to provide an illustrative example of comparing dimension reduction methods,

  3. to give a set-up for further discussions and investigations on mandala creation with machine learning algorithms.

Two direct construction algorithms are given: one uses "seed" segment rotations, the other superimposes layers of different types. The following plots show the order in which the different mandala parts are created by each of the algorithms.

"Direct-Mandala-creation-algorithms-steps"

In this document we use several algorithms for dimension reduction applied to collections of images following the procedure described in [4,5]. We are going to show that with Non-Negative Matrix Factorization (NNMF) we can use mandalas made with the seed segment rotation algorithm to extract layer types and superimpose them to make colored mandalas. Using the same approach with Singular Value Decomposition (SVD) or Independent Component Analysis (ICA) does not produce good layers and the superimposition produces more "watered-down", less diverse mandalas.

From a more general perspective this document compares the statistical approach of "trying to see without looking" with the "direct simulation" approach. Another perspective is the creation of "design spaces"; see [6].

The idea of using machine learning algorithms is appealing because there is no need to make the mental effort of understanding, discerning, approximating, and programming the principles of mandala creation. We can "just" use a large collection of mandala images and generate new ones using the "internal knowledge" of machine learning algorithms. For example, a neural network system like Deep Dream, [2], might be made to dream of mandalas.

Direct algorithms for mandala generation

In this section we present two different algorithms for generating mandalas. The first sees a mandala as being generated by rotation of a "seed" segment. The second sees a mandala as being generated by different component layers. For other approaches see [3].

The request of [3] is for generation of mandalas for coloring by hand. That is why the mandala generation algorithms are in the grayscale space. Coloring the generated mandala images is a secondary task.

By seed segment rotations

One way to come up with mandalas is to generate a seed segment and then produce a mandala by an appropriate number of rotations of that segment.

Here is a function and an example of random segment (seed) generation:

Clear[MakeSeedSegment]
MakeSeedSegment[radius_, angle_, n_Integer: 10, 
   connectingFunc_: Polygon, keepGridPoints_: False] :=
  Block[{t},
   t = Table[
     Line[{radius*r*{Cos[angle], Sin[angle]}, {radius*r, 0}}], {r, 0, 1, 1/n}];
   Join[If[TrueQ[keepGridPoints], t, {}], {GrayLevel[0.25], 
     connectingFunc@RandomSample[Flatten[t /. Line[{x_, y_}] :> {x, y}, 1]]}]
   ];

seed = MakeSeedSegment[10, Pi/12, 10];
Graphics[seed, Frame -> True]
"Mandala-seed-segment"

This function can make a seed segment symmetric:

Clear[MakeSymmetric]
MakeSymmetric[seed_] := {seed, 
   GeometricTransformation[seed, ReflectionTransform[{0, 1}]]};

seed = MakeSymmetric[seed];
Graphics[seed, Frame -> True]
"Mandala-seed-segment-symmetric"

Using a seed we can generate mandalas with different specification signatures:

Clear[MakeMandala]
MakeMandala[opts : OptionsPattern[]] :=      
  MakeMandala[
   MakeSymmetric[
    MakeSeedSegment[20, Pi/12, 12, 
     RandomChoice[{Line, Polygon, BezierCurve, 
       FilledCurve[BezierCurve[#]] &}], False]], Pi/6, opts];

MakeMandala[seed_, angle_?NumericQ, opts : OptionsPattern[]] :=      
  Graphics[GeometricTransformation[seed, 
    Table[RotationMatrix[a], {a, 0, 2 Pi - angle, angle}]], opts];

This code randomly selects symmetry and seed generation parameters (number of concentric circles, angles):

SeedRandom[6567]
n = 12;
Multicolumn@
 MapThread[
  Image@If[#1,
     MakeMandala[MakeSeedSegment[10, #2, #3], #2],
     MakeMandala[
      MakeSymmetric[MakeSeedSegment[10, #2, #3, #4, False]], 2 #2]
     ] &, {RandomChoice[{False, True}, n], 
   RandomChoice[{Pi/7, Pi/8, Pi/6}, n], 
   RandomInteger[{8, 14}, n], 
   RandomChoice[{Line, Polygon, BezierCurve, 
     FilledCurve[BezierCurve[#]] &}, n]}]
"Seed-segment-rotation-mandalas-complex-settings"

Here is a more concise way to generate symmetric segment mandalas:

Multicolumn[Table[Image@MakeMandala[], {12}], 5]
"Seed-segment-rotation-mandalas-simple-settings"

Note that with this approach the programming of the mandala coloring is not that trivial — weighted blending of colorized mandalas is the easiest thing to do. (Shown below.)

By layer types

This approach was given by Simon Woods in [3].

"For this one I’ve defined three types of layer, a flower, a simple circle and a ring of small circles. You could add more for greater variety."

The coloring approach with image blending given below did not work well for this algorithm, so I modified the original code in order to produce colored mandalas.

ClearAll[LayerFlower, LayerDisk, LayerSpots, MandalaByLayers]

LayerFlower[n_, a_, r_, colorSchemeInd_Integer] := 
  Module[{b = RandomChoice[{-1/(2 n), 0}]}, {If[
     colorSchemeInd == 0, White, 
     RandomChoice[ColorData[colorSchemeInd, "ColorList"]]], 
    Cases[ParametricPlot[
      r (a + Cos[n t])/(a + 1) {Cos[t + b Sin[2 n t]], Sin[t + b Sin[2 n t]]}, {t, 0, 2 Pi}], 
     l_Line :> FilledCurve[l], -1]}];

LayerDisk[_, _, r_, colorSchemeInd_Integer] := {If[colorSchemeInd == 0, White, 
    RandomChoice[ColorData[colorSchemeInd, "ColorList"]]], 
   Disk[{0, 0}, r]};

LayerSpots[n_, a_, r_, colorSchemeInd_Integer] := {If[colorSchemeInd == 0, White, 
    RandomChoice[ColorData[colorSchemeInd, "ColorList"]]], 
   Translate[Disk[{0, 0}, r a/(4 n)], r CirclePoints[n]]};

MandalaByLayers[n_, m_, coloring : (False | True) : False, opts : OptionsPattern[]] := 
  Graphics[{EdgeForm[Black], White, 
    Table[RandomChoice[{3, 2, 1} -> {LayerFlower, LayerDisk, LayerSpots}][n, RandomReal[{3, 5}], i, 
       If[coloring, RandomInteger[{1, 17}], 0]]~Rotate~(Pi i/n), {i, m, 1, -1}]}, opts];

Here are some generated black-and-white mandalas:

SeedRandom[6567]
ImageCollage[Table[Image@MandalaByLayers[16, 20], {12}], Background -> White, ImagePadding -> 3, ImageSize -> 1200]
"Layer-types-superimposing-BW"

Here are some colored mandalas. (Which make me think more of Viking and Native American art than mandalas.)

ImageCollage[Table[Image@MandalaByLayers[16, 20, True], {12}], Background -> White, ImagePadding -> 3, ImageSize -> 1200]
"Layer-types-superimposing-colored"

Training data

Images by direct generation

iSize = 400;

SeedRandom[6567]
AbsoluteTiming[
 mandalaImages = 
   Table[Image[
     MakeMandala[
      MakeSymmetric@
       MakeSeedSegment[10, Pi/12, 12, RandomChoice[{Polygon, FilledCurve[BezierCurve[#]] &}]], Pi/6], 
     ImageSize -> {iSize, iSize}, ColorSpace -> "Grayscale"], {300}];
 ]

(* {39.31, Null} *)

ImageCollage[ColorNegate /@ RandomSample[mandalaImages, 12], Background -> White, ImagePadding -> 3, ImageSize -> 400]
"mandalaImages-sample"

External image data

See the section "Using World Wide Web images".

Direct blending

The most interesting results are obtained with the image blending procedure coded below over mandala images generated with the seed segment rotation algorithm.

SeedRandom[3488]
directBlendingImages = Table[
   RemoveBackground@
    ImageAdjust[
     Blend[Colorize[#, 
         ColorFunction -> 
          RandomChoice[{"IslandColors", "FruitPunchColors", 
            "AvocadoColors", "Rainbow"}]] & /@ 
       RandomChoice[mandalaImages, 4], RandomReal[1, 4]]], {36}];

ImageCollage[directBlendingImages, Background -> White, ImagePadding -> 3, ImageSize -> 1200]

"directBlendingImages-3488-36"

Dimension reduction algorithms application

In this section we are going to apply the dimension reduction algorithms Singular Value Decomposition (SVD), Independent Component Analysis (ICA), and Non-Negative Matrix Factorization (NNMF) to a linear vector space representation (a matrix) of an image dataset. In the next section we use the bases generated by those algorithms to make mandala images.
We are going to use the packages [7,8] for ICA and NNMF respectively.


Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/IndependentComponentAnalysis.m"]
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/NonNegativeMatrixFactorization.m"]

Linear vector space representation

The linear vector space representation of the images is simple — each image is flattened to a vector (row-wise), and the image vectors are put into a matrix.

mandalaMat = Flatten@*ImageData@*ColorNegate /@ mandalaImages;
Dimensions[mandalaMat]

(* {300, 160000} *)

Re-factoring and basis images

The following code re-factors the images matrix with SVD, ICA, and NNMF and extracts the basis images.

AbsoluteTiming[
 svdRes = SingularValueDecomposition[mandalaMat, 20];
]
(* {5.1123, Null} *)

svdBasisImages = Map[ImageAdjust@Image@Partition[#, iSize] &, Transpose@svdRes[[3]]];

AbsoluteTiming[
 icaRes = 
   IndependentComponentAnalysis[Transpose[mandalaMat], 20, 
    PrecisionGoal -> 4, "MaxSteps" -> 100];
]
(* {23.41, Null} *)

icaBasisImages = Map[ImageAdjust@Image@Partition[#, iSize] &, Transpose[icaRes[[1]]]];

SeedRandom[452992]
AbsoluteTiming[
 nnmfRes = 
   GDCLS[mandalaMat, 20, PrecisionGoal -> 4, 
    "MaxSteps" -> 20, "RegularizationParameter" -> 0.1];
 ]
(* {233.209, Null} *)

nnmfBasisImages = Map[ImageAdjust@Image@Partition[#, iSize] &, nnmfRes[[2]]];

Bases

Let us visualize the bases derived with the matrix factorization methods.

Grid[{{"SVD", "ICA", "NNMF"},
      Map[ImageCollage[#, Automatic, {400, 500}, 
        Background -> LightBlue, ImagePadding -> 5, ImageSize -> 350] &, 
      {svdBasisImages, icaBasisImages, nnmfBasisImages}]
     }, Dividers -> All]
"Mandala-SVD-ICA-NNMF-bases-20"

"Mandala-SVD-ICA-NNMF-bases-20"

Here are some observations for the bases.

  1. The SVD basis has an average mandala image as its first vector and the other vectors are "differences" to be added to that first vector. (A reconstruction sketch is given further below.)

  2. The SVD and ICA bases are structured similarly. That is because ICA and SVD are both based on orthogonality — the ICA factorization uses an orthogonality criterion based on Gaussian noise properties, which is more relaxed than SVD's standard orthogonality criterion.

  3. As expected, the NNMF basis images have black background because of the enforced non-negativity. (Black corresponds to 0, white to 1.)

  4. Compared to the SVD and ICA bases the images of the NNMF basis are structured in a radial manner. This can be demonstrated using image binarization.

Grid[{{"SVD", "ICA", "NNMF"}, Map[ImageCollage[Binarize[#, 0.5] & /@ #, Automatic, {400, 500}, Background -> LightBlue, ImagePadding -> 5, ImageSize -> 350] &, {svdBasisImages, icaBasisImages, nnmfBasisImages}] }, Dividers -> All]
"Mandala-SVD-ICA-NNMF-bases-binarized-0.5-20"

We can see that binarizing the NNMF basis images shows them as mandala layers. In other words, using NNMF we can convert the mandalas made with the seed segment rotation algorithm into mandalas generated by an algorithm that superimposes layers of different types.
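
The first observation above can be illustrated directly: project one of the mandala image-vectors onto the first few SVD basis vectors and reconstruct it. Here is a sketch (the image index and the number of components k are arbitrary):

k = 10;
vec = mandalaMat[[1]];
V = svdRes[[3]][[All, 1 ;; k]]; (* first k right singular vectors *)
approx = V.(Transpose[V].vec);
ImageAdjust@Image@Partition[approx, iSize]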

Blending with image bases samples

In this section we just show different blending images using the SVD, ICA, and NNMF bases.

Blending function definition

ClearAll[MandalaImageBlending]
Options[MandalaImageBlending] = {"BaseImage" -> {}, "BaseImageWeight" -> Automatic, "PostBlendingFunction" -> (RemoveBackground@*ImageAdjust)};
MandalaImageBlending[basisImages_, nSample_Integer: 4, n_Integer: 12, opts : OptionsPattern[]] :=      
  Block[{baseImage, baseImageWeight, postBlendingFunc, sImgs, sImgWeights},
   baseImage = OptionValue["BaseImage"];
   baseImageWeight = OptionValue["BaseImageWeight"];
   postBlendingFunc = OptionValue["PostBlendingFunction"];
   Table[(
     sImgs = 
      Flatten@Join[{baseImage}, RandomSample[basisImages, nSample]];
     If[NumericQ[baseImageWeight] && ImageQ[baseImage],
      sImgWeights = 
       Join[{baseImageWeight}, RandomReal[1, Length[sImgs] - 1]],
      sImgWeights = RandomReal[1, Length[sImgs]]
      ];
     postBlendingFunc@
      Blend[Colorize[#, 
          DeleteCases[{opts}, ("BaseImage" -> _) | ("BaseImageWeight" -> _) | ("PostBlendingFunction" -> _)],               
          ColorFunction -> 
           RandomChoice[{"IslandColors", "FruitPunchColors", 
             "AvocadoColors", "Rainbow"}]] & /@ sImgs, 
       sImgWeights]), {n}]
   ];

SVD image basis blending

SeedRandom[17643]
svdBlendedImages = MandalaImageBlending[Rest@svdBasisImages, 4, 24];
ImageCollage[svdBlendedImages, Background -> White, ImagePadding -> 3, ImageSize -> 1200]

"svdBlendedImages-17643-24"

SeedRandom[17643]
svdBlendedImages = MandalaImageBlending[Rest@svdBasisImages, 4, 24, "BaseImage" -> First[svdBasisImages], "BaseImageWeight" -> 0.5];
ImageCollage[svdBlendedImages, Background -> White, ImagePadding -> 3, ImageSize -> 1200]

"svdBlendedImages-baseImage-17643-24"

ICA image basis blending

SeedRandom[17643]
icaBlendedImages = MandalaImageBlending[Rest[icaBasisImages], 4, 36, "BaseImage" -> First[icaBasisImages], "BaseImageWeight" -> Automatic];
ImageCollage[icaBlendedImages, Background -> White, ImagePadding -> 3, ImageSize -> 1200]

"icaBlendedImages-17643-36"

NNMF image basis blending

SeedRandom[17643]
nnmfBlendedImages = MandalaImageBlending[nnmfBasisImages, 4, 36];
ImageCollage[nnmfBlendedImages, Background -> White, ImagePadding -> 3, ImageSize -> 1200]

"nnmfBlendedImages-17643-36"

Using World Wide Web images

A natural question to ask is:

What would be the outcomes of the above procedures applied to mandala images found on the World Wide Web (WWW)?

Those WWW images are most likely man-made or curated.

The short answer is that the results are not that good. Better results might be obtained using a larger set of WWW images (more than the 100 used in the experiment results shown below).

Here is a sample from the WWW mandala images:

"wwwMandalaImages-sample-6

Here are the results obtained with NNMF basis:

"www-nnmfBlendedImages-12"

Future plans

My other motivation for writing this document is to set up a basis for further investigations and discussions on the following topics.

  1. Having a large image database of "real world", human-made mandalas.

  2. Utilization of Neural Network algorithms for mandala creation.

  3. Utilization of Cellular Automata for mandala generation.

  4. Investigation of mandala morphing and animations.

  5. Making a domain specific language of specifications for mandala creation and modification.

The idea of using machine learning algorithms for mandala image generation was further supported by an image classifier that recognizes fairly well (suitably normalized) mandala images obtained in different ways:

"Mandalas-classifer-measurements-matrix"

References

[1] Wikipedia entry: Mandala, https://en.wikipedia.org/wiki/Mandala .

[2] Wikipedia entry: DeepDream, https://en.wikipedia.org/wiki/DeepDream .

[3] "Code that generates a mandala", Mathematica StackExchange, http://mathematica.stackexchange.com/q/136974 .

[4] Anton Antonov, "Comparison of PCA and NNMF over image de-noising", (2016), MathematicaForPrediction at WordPress blog. URL: https://mathematicaforprediction.wordpress.com/2016/05/07/comparison-of-pca-and-nnmf-over-image-de-noising/ .

[5] Anton Antonov, "Handwritten digits recognition by matrix factorization", (2016), MathematicaForPrediction at WordPress blog. URL: https://mathematicaforprediction.wordpress.com/2016/11/12/handwritten-digits-recognition-by-matrix-factorization/ .

[6] Chris Carlson, "Social Exploration of Design Spaces: A Proposal", (2016), Wolfram Technology Conference 2016. URL: http://wac.36f4.edgecastcdn.net/0036F4/pub/www.wolfram.com/technology-conference/2016/SocialExplorationOfDesignSpaces.nb , YouTube: https://www.youtube.com/watch?v=YK2523nfcms .

[7] Anton Antonov, Independent Component Analysis Mathematica package, (2016), source code at MathematicaForPrediction at GitHub, package IndependentComponentAnalysis.m .

[8] Anton Antonov, Implementation of the Non-Negative Matrix Factorization algorithm in Mathematica, (2013), source code at MathematicaForPrediction at GitHub, package NonNegativeMatrixFactorization.m.

Comparison of PCA, NNMF, and ICA over image de-noising

Introduction

In a previous blog post, [1], I compared Principal Component Analysis (PCA) / Singular Value Decomposition (SVD) and Non-Negative Matrix Factorization (NNMF) over a collection of noised images of digit handwriting from the MNIST data set, [3], which is available in Mathematica.

This blog post adds to that comparison the use of Independent Component Analysis (ICA), proclaimed in my previous blog post, [2].

Computations

The ICA-related computations, additional to those in [1], follow.

First we load the package IndependentComponentAnalysis.m:

Import["https://raw.githubusercontent.com/antononcube/\
MathematicaForPrediction/master/IndependentComponentAnalysis.m"]

From that package we can use the function IndependentComponentAnalysis to find components:

{S, A} = IndependentComponentAnalysis[Transpose[noisyTestImagesMat], 9, PrecisionGoal -> 4.5];
Norm[noisyTestImagesMat - Transpose[S.A]]/Norm[noisyTestImagesMat]
(* 0.592739 *)

Let us visualize the norms of the rows of the mixing matrix A:

norms = Norm /@ A;
ListPlot[norms, PlotRange -> All, PlotLabel -> "Norms of A rows", 
    PlotTheme -> "Detailed"] // 
        ColorPlotOutliers[TopOutliers@*HampelIdentifierParameters]
pos = OutlierPosition[norms, TopOutliers@*HampelIdentifierParameters]

Norms of the mixing matrix

Next we can visualize the found "source" images:

ncol = 2;
Grid[Partition[Join[
 MapIndexed[{#2[[1]], Norm[#], 
   ImageAdjust@Image[Partition[#, dims[[1]]]]} &, 
 Transpose[S]] /. (# -> Style[#, Red] & /@ pos),
Table["", {ncol - QuotientRemainder[Dimensions[S][[2]], ncol][[2]]}]
], ncol], Dividers -> All]

ICA found source images of 6 and 7 images matrix

After selecting several of the source images we zero the rest by modifying the matrix A:

pos = {6, 7, 9};
norms = Norm /@ A;
dA = DiagonalMatrix[
 ReplacePart[ConstantArray[0, Length[norms]], Map[List, pos] -> 1]];
newMatICA = Transpose[Map[# + Mean[Transpose[noisyTestImagesMat]] &, S.dA.A]];
denoisedImagesICA = Map[Image[Partition[#, dims[[2]]]] &, newMatICA];

Visual comparison of de-noised images

Next we visualize the ICA de-noised images together with the original images and the SVD and NNMF de-noised ones.

There are several ways to present that combination of images.

Grayscale 6 and 7 images: original, noised, PCA, NNMF, ICA de-noised

Binarized 6 and 7 images: original, noised, PCA, NNMF, ICA de-noised

Image collage of original, noised, PCA, NNMF, ICA de-noised 6 and 7 images

Comparison using classifiers

We can further compare the de-noising results by building digit classifiers and running them over the de-noised images.

icaCM = ClassifierMeasurements[digitCF, 
    Thread[(Binarize[#, 0.55] &@*ImageAdjust@*ColorNegate /@ 
        denoisedImagesICA) -> testImageLabels]]

We can see that with ICA we get better results than with PCA/SVD, probably not as good as NNMF, but very close.

Classifier comparison over PCA, NNMF, ICA de-noised images of 6 and 7

All images of this blog post

Computation results for the ICA application to noised handwriting images of 6 and 7

References

[1] A. Antonov, "Comparison of PCA and NNMF over image de-noising", (2016), MathematicaForPrediction at WordPress.

[2] A. Antonov, "Independent Component Analysis for multidimensional signals", (2016), MathematicaForPrediction at WordPress.

[3] Wikipedia entry, MNIST database.

Independent component analysis for multidimensional signals

Introduction

Independent Component Analysis (ICA) is a (matrix factorization) method for separation of a multi-dimensional signal (represented with a matrix) into a weighted sum of sub-components that have less entropy than the original variables of the signal. See [1,2] for introduction to ICA and more details.
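
In matrix notation the model used below is X = S.A: each column of the data matrix X is an observed variable, the columns of S are the extracted source signals, and A is the mixing matrix. (This is the factorization whose quality is checked with Norm[X - S.A] further below.)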

This blog post is to proclaim the implementation of the "FastICA" algorithm in the package IndependentComponentAnalysis.m and show a basic application with it. (I programmed that package last weekend. It has been on my ToDo list to start ICA algorithm implementations for several months… An interesting offshoot was the procedure I derived for the StackExchange question "Extracting signal from Gaussian noise".)

In this blog post ICA is going to be demonstrated with both generated data and "real life" weather data (temperatures of three cities within one month).

Generated data

In order to demonstrate ICA let us make up some data in the spirit of the "cocktail party problem".

(*Signal functions*)
Clear[s1, s2, s3]
s1[t_] := Sin[600 \[Pi] t/10000 + 6*Cos[120 \[Pi] t/10000]] + 1.2
s2[t_] := Sin[\[Pi] t/10] + 1.2
s3[t_?NumericQ] := (((QuotientRemainder[t, 23][[2]] - 11)/9)^5 + 2.8)/2 + 0.2

(*Mixing matrix*)
A = {{0.44, 0.2, 0.31}, {0.45, 0.8, 0.23}, {0.12, 0.32, 0.71}};

(*Signals matrix*)
nSize = 600;
S = Table[{s1[t], s2[t], s3[t]}, {t, 0, nSize, 0.5}];

(*Mixed signals matrix*)
M = A.Transpose[S];

(*Signals*)
Grid[{Map[
   Plot[#, {t, 0, nSize}, PerformanceGoal -> "Quality", 
     ImageSize -> 250] &, {s1[t], s2[t], s3[t]}]}]

Original signals

(*Mixed signals*)
Grid[{Map[ListLinePlot[#, ImageSize -> 250] &, M]}]

Mixed signals

I took the data generation formulas from [6].

ICA application

Load the package:

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/IndependentComponentAnalysis.m"]

It is important to note that the usual ICA model interpretation for the factorized matrix X is that each column is a variable (audio signal) and each row is an observation (recordings of the microphones at a given time). The 3×1201 matrix M was constructed with the interpretation that each row is a signal, hence we have to transpose M in order to apply the ICA algorithms, X = M^T.

X = Transpose[M];

{S, A} = IndependentComponentAnalysis[X, 3];

Check the approximation of the obtained factorization:

Norm[X - S.A]    
(* 3.10715*10^-14 *)

Plot the found source signals:

Grid[{Map[ListLinePlot[#, PlotRange -> All, ImageSize -> 250] &, 
   Transpose[S]]}]

Found source signals

Because of the random initialization of the inverting matrix in the algorithm the result may vary. Here is the plot from another run:

Found source signals 2
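
A way to make runs reproducible is to fix the random seed before the call. Here is a sketch (assuming the algorithm draws its initialization from Mathematica's random state, as the seeding used elsewhere in these posts suggests):

SeedRandom[1234];
{S, A} = IndependentComponentAnalysis[X, 3];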

The package also provides the function FastICA that returns an association with elements that correspond to the result of the function fastICA provided by the R package "fastICA". See [4].

Here is an example usage:

res = FastICA[X, 3];

Keys[res]    
(* {"X", "K", "W", "A", "S"} *)

Grid[{Map[
   ListLinePlot[#, PlotRange -> All, ImageSize -> Medium] &, 
   Transpose[res["S"]]]}]

FastICA found source signals

Note that (in adherence to [4]) the function FastICA returns the matrices S and A for the centralized matrix X. This means, for example, that in order to check the approximation the proper mean has to be supplied:

Norm[X - Map[# + Mean[X] &, res["S"].res["A"]]]
(* 2.56719*10^-14 *)

Signatures and results

The result of the function IndependentComponentAnalysis is a list of two matrices. The result of FastICA is an association of the matrices obtained by ICA. The function IndependentComponentAnalysis takes a method option and options for precision goal and maximum number of steps:

Options[IndependentComponentAnalysis]

(* {Method -> "FastICA", MaxSteps -> 200, PrecisionGoal -> 6} *)

The intent is for IndependentComponentAnalysis to be the front interface to different ICA algorithms. (Hence, it has a Method option.) The function FastICA takes as options the named arguments of the R function fastICA described in [4].

Options[FastICA]

(* {"NonGaussianityFunction" -> Automatic, "NegEntropyFactor" -> 1, "InitialUnmixingMartix" -> Automatic, "RowNorm" -> False, MaxSteps -> 200, PrecisionGoal -> 6, "RFastICAResult" -> True} *)

At this point FastICA has only the deflation algorithm described in [1]. ([4] also has the so-called "symmetric" ICA sub-algorithm.) The R function fastICA in [4] can use only two neg-entropy functions, log(cosh(x)) and exp(-u^2/2). Because of the symbolic capabilities of Mathematica, FastICA of [3] can take any listable function through the option "NonGaussianityFunction", and it will find and use the corresponding first and second derivatives.
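
For example, here is a sketch of supplying a custom listable non-Gaussianity function (the cube function is an arbitrary illustration):

res = FastICA[X, 3, "NonGaussianityFunction" -> Function[u, u^3]];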

Using NNMF for ICA

It seems that in some cases, like the generated data used in this blog post, Non-Negative Matrix Factorization (NNMF) can be applied for doing ICA.

To be clear, NNMF does dimension reduction, but its norm minimization process does not enforce variable independence. (It enforces non-negativity.) There are at least several articles discussing modifications of NNMF for doing ICA; see, for example, [6].

Load NNMF package [5] (from MathematicaForPrediction at GitHub):

Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/NonNegativeMatrixFactorization.m"]

After several applications of NNMF we get signals close to the originals:

{W, H} = GDCLS[M, 3];
Grid[{Map[ListLinePlot[#, ImageSize -> 250] &, Normal[H]]}]

NNMF found source signals

For the generated data in this blog post, FastICA is much faster than NNMF and produces better separation of the signals with every run. The data, though, is a typical representative of the problems ICA is made for. Another comparison, with image de-noising, extending my previous blog post, will be published shortly.

ICA for mixed time series of city temperatures

Using Mathematica's function WeatherData we can get temperature time series for a small set of cities over a certain time grid. We can mix those time series into a multi-dimensional signal, MS, apply ICA to MS, and judge the extracted source signals against the original ones.

This is done with the following commands.

Get time series data

cities = {"Sofia", "London", "Copenhagen"};
timeInterval = {{2016, 1, 1}, {2016, 1, 31}};
ts = WeatherData[#, "Temperature", timeInterval] & /@ cities;

opts = {PlotTheme -> "Detailed", FrameLabel -> {None, "temperature,\[Degree]C"}, ImageSize -> 350};
DateListPlot[ts, 
    PlotLabel -> "City temperatures\nfrom " <> DateString[timeInterval[[1]], {"Year", ".", "Month", ".", "Day"}] <> 
    " to " <> DateString[timeInterval[[2]], {"Year", ".", "Month", ".", "Day"}], 
    PlotLegends -> cities, ImageSize -> Large, opts]

City temperatures

Cleaning and resampling (if any)

Here we check the data for missing values:

Length /@ Through[ts["Path"]]
Count[#, _Missing, \[Infinity]] & /@ Through[ts["Path"]]
Total[%]
(* {1483, 1465, 742} *)
(* {0,0,0} *)
(* 0 *)

Resampling per hour:

ts = TimeSeriesResample[#, "Hour", ResamplingMethod -> {"Interpolation", InterpolationOrder -> 1}] & /@ ts

Mixing the time series

In order to do a good mixing we select a mixing matrix for which all column sums are close to one:

mixingMat = #/Total[#] & /@ RandomReal[1, {3, 3}];
MatrixForm[mixingMat]
(* mixingMat = {{0.357412, 0.403913, 0.238675}, {0.361481, 0.223506, 0.415013}, {0.36564, 0.278565, 0.355795}} *)
Total[mixingMat]
(* {1.08453, 0.905984, 1.00948} *)

Note the row normalization.

Make the mixed signals:

tsMixed = Table[TimeSeriesThread[mixingMat[[i]].# &, ts], {i, 3}]

Plot the original and mixed signals:

Grid[{{DateListPlot[ts, PlotLegends -> cities, PlotLabel -> "Original signals", opts],
DateListPlot[tsMixed, PlotLegends -> Automatic, PlotLabel -> "Mixed signals", opts]}}]
 

Original and mixed temperature signals

Application of ICA

At this point we apply ICA (probably more than once, but not too many times) and plot the found source signals:

X = Transpose[Through[tsMixed["Path"]][[All, All, 2]] /. Quantity[v_, _] :> v];
{S, A} = IndependentComponentAnalysis[X, 3];
DateListPlot[Transpose[{tsMixed[[1]]["Dates"], #}], PlotTheme -> "Detailed", ImageSize -> 250] & /@ Transpose[S]

ICA found temperature time series components

Compare with the original time series:

MapThread[DateListPlot[#1, PlotTheme -> "Detailed", PlotLabel -> #2, ImageSize -> 250] &, {ts, cities}]

Original temperature time series

After permuting and inverting some of the found source signals we see that they are fairly good:

pinds = {3, 1, 2};
pmat = IdentityMatrix[3][[All, pinds]];

DateListPlot[Transpose[{tsMixed[[1]]["Dates"], #}], PlotTheme -> "Detailed", ImageSize -> 250] & /@ 
  Transpose[S.DiagonalMatrix[{1, -1, 1}].pmat]

Permuted and inverted found source signals

References

[1] A. Hyvarinen and E. Oja, "Independent Component Analysis: Algorithms and Applications", (2000), Neural Networks, 13(4-5):411-430. URL: https://www.cs.helsinki.fi/u/ahyvarin/papers/NN00new.pdf .

[2] Wikipedia entry, Independent component analysis, URL: https://en.wikipedia.org/wiki/Independent_component_analysis .

[3] A. Antonov, Independent Component Analysis Mathematica package, (2016), source code MathematicaForPrediction at GitHub, package IndependentComponentAnalysis.m .

[4] J. L. Marchini, C. Heaton and B. D. Ripley, fastICA, R package, URLs: https://cran.r-project.org/web/packages/fastICA/index.html, https://cran.r-project.org/web/packages/fastICA/fastICA.pdf .

[5] A. Antonov, Implementation of the Non-Negative Matrix Factorization algorithm in Mathematica, (2013), source code at MathematicaForPrediction at GitHub, package NonNegativeMatrixFactorization.m.

[6] H. Hsieh and J. Chien, A new nonnegative matrix factorization for independent component analysis, (2010), Conference: Acoustics Speech and Signal Processing (ICASSP).