In this MathematicaVsR project we discuss and exemplify finding and analyzing similarities between texts using Latent Semantic Analysis (LSA). Code is provided in both Mathematica and R.
The LSA workflows are constructed and executed with the software monads LSAMon-WL, [AA1, AAp1], and LSAMon-R, [AAp2].
The illustrative examples are based on conference abstracts from rstudio::conf and Wolfram Technology Conference (WTC), [AAd1, AAd2]. Since the number of rstudio::conf abstracts is small, and since rstudio::conf 2020 is about to start at the time of preparing this project, we focus on words and texts from RStudio’s ecosystem of packages and presentations.
Statistical thesaurus for words from RStudio’s ecosystem
This notebook / document provides ground data analysis used to make or confirm certain modeling conjectures and assumptions of a Pets Retail Dynamics Model (PRDM), [AA1]. Seattle pets licensing data is used, [SOD2].
We want to provide answers to the following questions.
Does the Pareto principle manifest for pet breeds?
Does the Pareto principle manifest for ZIP codes?
Is there an upward trend for becoming a pet owner?
All three questions have positive answers, assuming the retrieved data, [SOD2], is representative. See the last section for an additional discussion.
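One way to check the Pareto principle for a set of counts is to compute the share of the total held by the top 20% of items. The following is a minimal Python sketch of that check, not the notebook's own code; the function name `top_share` and the breed counts are hypothetical and are not taken from [SOD2].

```python
def top_share(counts, top_fraction=0.2):
    """Return the share of the total held by the largest `top_fraction` of items."""
    if not counts:
        raise ValueError("counts must be non-empty")
    ordered = sorted(counts, reverse=True)
    # Always include at least one item in the "top" group.
    k = max(1, round(top_fraction * len(ordered)))
    return sum(ordered[:k]) / sum(ordered)


# Hypothetical license counts per breed (made-up numbers, not from [SOD2]).
breed_counts = {
    "breed A": 500, "breed B": 250, "breed C": 100, "breed D": 50, "breed E": 40,
    "breed F": 20, "breed G": 15, "breed H": 10, "breed I": 10, "breed J": 5,
}

share = top_share(list(breed_counts.values()))
print(f"Top 20% of breeds hold {share:.0%} of licenses")
```

A share close to or above 80% would suggest that the Pareto principle manifests; the same function can be applied to counts per ZIP code.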
We also discuss pet adoption simulations that are done using Quantile Regression, [AA2, AAp1].
This section has subsections that correspond to additional discussion questions. Not all questions are answered; the plan is to answer them progressively in subsequent versions of this notebook / document.
□ Too few pets
The number of registered pets (~50000) seems too small. Seattle is a large city with more than 600000 citizens, and approximately 50% of USA households have dogs, so one would expect considerably more registered pets.
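The back-of-the-envelope reasoning above can be sketched in Python as follows. The population, dog ownership rate, and registered-pet count are the figures quoted in the text; the average household size of 2.0 persons is an assumption introduced only for illustration, and the function name `expected_dog_households` is hypothetical.

```python
def expected_dog_households(population, persons_per_household, dog_household_rate):
    """Estimate the number of households that own at least one dog."""
    households = population / persons_per_household
    return dog_household_rate * households


population = 600_000        # "more than 600000 citizens" (from the text)
dog_household_rate = 0.5    # "approximately 50%" of households have dogs (from the text)
registered_pets = 50_000    # "~50000" registered pets (from the text)

# ASSUMPTION (not from the source): average number of persons per household.
persons_per_household = 2.0

expected = expected_dog_households(population, persons_per_household, dog_household_rate)
ratio = registered_pets / expected

print(f"Expected dog-owning households: {expected:,.0f}")
print(f"Registered pets as a fraction of that estimate: {ratio:.2f}")
```

Under these assumptions the registered pets, which also include species other than dogs, amount to only a fraction of the estimated number of dog-owning households. A different household size changes the estimate proportionally.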
□ Why too few pets?
Is Seattle a high-tech city whose citizens are too busy to have pets?
Do most people not register their pets? (Very unlikely if they have used veterinary services.)
□ Registration rates
Why is the number of registrations much higher, in both volume and frequency, in the years 2018 and later?
□ Adoption rates
Can we tell apart the adoption rates of pet-less people and people who already have pets?