Mathematica vs. R at GitHub

In brief: This post announces the repository MathematicaVsR at GitHub, which has example projects, code, and documents for comparing Mathematica with R. My plan is to announce newly completed Mathematica-vs-R projects here, in this blog post, and, when appropriate, to make separate blog posts about them. Mission statement: The development in the MathematicaVsR at […]

Adaptive numerical Lebesgue integration by set measure estimates

Introduction: This document gives outlines and examples of several related implementations of Lebesgue integration, [1], within the framework of NIntegrate, [7]. The focus is on implementations of Lebesgue integration algorithms that have multiple options and can be easily extended (in order to do further research, optimization, etc.). In terms of NIntegrate's framework […]

Comparison of PCA, NNMF, and ICA over image de-noising

Introduction: In a previous blog post, [1], I compared Principal Component Analysis (PCA) / Singular Value Decomposition (SVD) and Non-Negative Matrix Factorization (NNMF) over a collection of noisy images of handwritten digits from the MNIST data set, [3], which is available in Mathematica. This blog post adds to that comparison the use of Independent Component […]

Finding outliers in 2D and 3D numerical data

Introduction: This blog post describes a method of finding outliers in 2D and 3D data using the Quantile Regression Envelopes discussed in previous blog posts: “Directional quantile envelopes”, “Directional quantile envelopes in 3D”. Data: To provide a good example of the method's application, it would be better to use “real life” data. I found this […]

Object-Oriented Design Patterns in Mathematica

Introduction: In this blog post I would like to announce the recent completion of the first version of a document describing how to implement the most important (in my view) Object-Oriented Programming Design Patterns by GoF. Here is the link to the document in MathematicaForPrediction at GitHub: “Implementation of Object-Oriented Programming Design Patterns in Mathematica” […]

Classification and association rules for census income data

Introduction: In this blog post I am going to show (some) analysis of census income data (the so-called “Adult” data set, [1]) using three types of algorithms: decision tree classification, naive Bayesian classification, and association rule learning. Mathematica packages for all three algorithms can be found in the project MathematicaForPrediction hosted at […]

Classification of genome data with n-gram models

In this blog post we consider the following problem: Gene Sequence Classification Problem (GSCP): Given two genes, G1 and G2, and a (relatively short) sub-sequence S from one of them, tell which gene the sub-sequence S is part of. One way to derive a solution for this problem is to model each gene with […]
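
As a rough illustration of the GSCP stated above, here is a minimal Python sketch, not the post's Mathematica code. It assumes one plausible reading of the title: each gene is modeled by character n-gram counts with add-one smoothing, and S is assigned to the gene whose model gives it the higher log-likelihood. The function names, the choice n = 3, the smoothing, and the toy sequences are all hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumed nucleotide alphabet


def train_ngram_model(sequence, n=3):
    """Count the n-grams of a sequence and the (n-1)-symbol contexts that start them."""
    positions = range(len(sequence) - n + 1)
    ngrams = Counter(sequence[i:i + n] for i in positions)
    contexts = Counter(sequence[i:i + n - 1] for i in positions)
    return ngrams, contexts, n


def log_likelihood(model, sub):
    """Sum of log P(symbol | previous n-1 symbols) over sub, with add-one smoothing."""
    ngrams, contexts, n = model
    total = 0.0
    for i in range(len(sub) - n + 1):
        gram = sub[i:i + n]
        # Add-one smoothing keeps unseen n-grams from getting zero probability.
        p = (ngrams[gram] + 1) / (contexts[gram[:-1]] + len(ALPHABET))
        total += math.log(p)
    return total


def classify(sub, model1, model2):
    """Return "G1" or "G2", whichever model scores the sub-sequence higher."""
    if log_likelihood(model1, sub) >= log_likelihood(model2, sub):
        return "G1"
    return "G2"


if __name__ == "__main__":
    # Made-up toy "genes", only to show the call pattern.
    g1 = "ACGT" * 50
    g2 = "AACCGGTT" * 25
    m1, m2 = train_ngram_model(g1), train_ngram_model(g2)
    print(classify("ACGTACGTACGT", m1, m2))
```

Comparing log-likelihoods rather than raw probabilities avoids numerical underflow on longer sub-sequences.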

Morphological analysis of data mining and prediction algorithms

This post announces the upload of two morphological analysis tables I made a month ago. They were a (small) part of my presentation last month at the Wolfram Technology Conference 2013. The tables were made in order to provide a summary of, guidance on, and insight into the data mining and prediction algorithms of the Mathematica […]

Statistical thesaurus from NPR podcasts

Five months ago I worked with transcripts of National Public Radio (NPR) podcasts. The transcripts are available online; see, for example, “From child actor to artist…”. Using nearly 5000 transcripts I experimented with topic extraction and statistical thesaurus derivation. The topics are too bulky to show here, but I am going to show […]

Classification of handwritten digits

In this blog post I show some experiments with algorithmic recognition of images of handwritten digits. I followed the algorithm described in Chapter 10 of the book “Matrix Methods in Data Mining and Pattern Recognition” by Lars Elden. The algorithm uses the so-called thin Singular Value Decomposition (SVD). Training phase: 1.1. Rasterize each […]