# Introduction

Over the last few years I have made heavy use of R’s Matrix library, which implements sparse matrix objects and efficient computations with them. Row names and column names can be assigned to, and retrieved from, the sparse matrices of R’s Matrix library with the functions `colnames` and `rownames`. Sometimes I miss this in *Mathematica*, so I started a *Mathematica* package that implements similar functionalities. The package is named RSparseMatrix.m and is implemented purely in the *Mathematica* language (i.e. it does not use RLink). It can be downloaded or loaded from MathematicaForPrediction at GitHub:

`Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/Misc/RSparseMatrix.m"]`

The package provides functions to create and operate on `RSparseMatrix` objects, which are basically `SparseArray` objects with row and column names. A major design decision is to restrict these functionalities to two-dimensional sparse arrays and to lists of strings as row and column names. (Note that the package is not finished, and some functions ignore the row and column names.)

The package attempts to cover as many as possible of the sparse matrix functionalities provided by R’s Matrix library (sub-matrix extraction by row and column names, propagation of row and column names in dot products, row and column binding of sparse matrices, row and column sums, etc.). This document has examples and tests for RSparseMatrix.m.

My participation in WTC 2015 with a talk comparing *Mathematica* and R was one of the main motivators for writing this blog post. Another is this Mathematica StackExchange discussion. (And a third one is seeing the impressive movie “The Martian” tonight: such a display of the triumph of humans over space and nature, using technology and science in a creative way, made me wanna discuss how to make some programming objects more convenient.)

# Basic examples

## Creation

`rmat = MakeRSparseMatrix[{{1, 1} -> 1, {2, 2} -> 2, {4, 3} -> 3, {1, 4} -> 4, {3, 5} -> 2}, "ColumnNames" -> {"a", "b", "c", "d", "e"}, "RowNames" -> {"A", "B", "C", "D"}, "DimensionNames" -> {"U", "V"}]`

The function `MatrixForm` shows `RSparseMatrix` objects with their row and column names:

`rmat // MatrixForm`

`RSparseMatrix` objects can also be created from `SparseArray` objects:
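One way to do this using only the calls shown above, as a minimal sketch: take the array rules of the `SparseArray` and pass them to `MakeRSparseMatrix`. The package may well offer a more direct conversion function; none is assumed here. The names `smat`, `rules`, and `rmat2` are illustrative.

```mathematica
(* Load the package (same URL as in the Introduction). *)
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/Misc/RSparseMatrix.m"]

(* A plain built-in 4 x 5 sparse array. *)
smat = SparseArray[{{1, 1} -> 1, {2, 2} -> 2, {4, 3} -> 3, {1, 4} -> 4, {3, 5} -> 2}, {4, 5}];

(* ArrayRules ends with the default rule {_, _} -> 0; Most drops it,
   leaving only the explicit position -> value rules. *)
rules = Most[ArrayRules[smat]];

(* Assumption: MakeRSparseMatrix accepts these rules in the same way
   as the rules in the Creation example above. *)
rmat2 = MakeRSparseMatrix[rules,
   "ColumnNames" -> {"a", "b", "c", "d", "e"},
   "RowNames" -> {"A", "B", "C", "D"},
   "DimensionNames" -> {"U", "V"}];
```

Going through `ArrayRules` keeps the sketch within the API already shown, at the cost of rebuilding the array from its rules.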

## Query functions

These functions can be used to retrieve the names of rows, columns, and dimensions. They correspond to R’s functions `rownames`, `colnames`, and `dimnames`.

`In[154]:= RowNames[rmat]`

`Out[154]= {"A", "B", "C", "D"}`

`In[155]:= ColumnNames[rmat]`

`Out[155]= {"a", "b", "c", "d", "e"}`

`In[156]:= DimensionNames[rmat]`

`Out[156]= {"U", "V"}`

## Functions that work on SparseArray

Since `RSparseMatrix` is based on `SparseArray`, we would expect the functions that work on `SparseArray` objects to work on `RSparseMatrix` objects too, e.g. `Dimensions`, `ArrayRules`, `Transpose`, `Total`, and others.

`In[157]:= Dimensions[rmat]`

`Out[157]= {4, 5}`

`In[158]:= ArrayRules[rmat]`

`Out[158]= {{1, 1} -> 1, {1, 4} -> 4, {2, 2} -> 2, {3, 5} -> 2, {4, 3} -> 3, {_, _} -> 0}`

## Dot product

Row names and column names are respected for dot products if that leads to meaningful assignments. The examples below demonstrate a general principle:

When a matrix operation can be performed on the underlying sparse arrays but the row names or column names do not coincide, the names are dropped.
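To make the principle concrete, here is a minimal sketch that uses only built-in functions. It is not the package's implementation: the association layout (`"matrix"`, `"rowNames"`, `"columnNames"`) and the helper `namedDot` are hypothetical and exist only for this illustration.

```mathematica
(* Hypothetical representation: a sparse array bundled with its names. *)
ma = <|"matrix" -> SparseArray[{{1, 1} -> 1, {2, 2} -> 2}, {2, 2}],
   "rowNames" -> {"A", "B"}, "columnNames" -> {"a", "b"}|>;
mb = <|"matrix" -> SparseArray[{{1, 2} -> 3, {2, 1} -> 4}, {2, 2}],
   "rowNames" -> {"a", "b"}, "columnNames" -> {"x", "y"}|>;
mc = <|"matrix" -> SparseArray[{{1, 1} -> 5}, {2, 2}],
   "rowNames" -> {"p", "q"}, "columnNames" -> {"x", "y"}|>;

(* Hypothetical helper, not the package's Dot:
   keep the names only when the inner names coincide. *)
namedDot[m1_Association, m2_Association] :=
  With[{res = m1["matrix"] . m2["matrix"]},
   If[m1["columnNames"] === m2["rowNames"],
    <|"matrix" -> res, "rowNames" -> m1["rowNames"],
     "columnNames" -> m2["columnNames"]|>,
    res]];

Head[namedDot[ma, mb]]  (* Association: inner names coincide, names are kept *)
Head[namedDot[ma, mc]]  (* SparseArray: inner names differ, names are dropped *)
```

In both calls the product of the underlying arrays is computed; only the decision whether to attach names differs.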

In the tables with examples below the last rows show the heads of the results.

### Matrix by vector

### Matrix by matrix

## Part

A major useful feature is to have `Part` work with row and column names. The implementation of that additional functionality for `Part` is demonstrated below.

In the cases when the dimension drops, sparse arrays or numbers are returned. In R the operation “[” has the parameter “drop”: the expression `smat[1,,drop=F]` is a sparse matrix, whereas the expression `smat[1,,drop=T]` is a dense vector. The corresponding implementation would be to give `Part` the option `Drop->True|False`, but that does not seem like a good idea. And we can easily emulate R’s “drop” option by using `{_?AtomQ}` inside `Part`, i.e. by wrapping a single index in a list.
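The idea can be sketched with a plain `SparseArray` and built-in functions only. The name-to-position indexes `rowIndex` and `colIndex` are hypothetical stand-ins for whatever the package does internally; with the package loaded, the names would be given to `Part` directly.

```mathematica
smat = SparseArray[{{1, 1} -> 1, {2, 2} -> 2, {4, 3} -> 3, {1, 4} -> 4, {3, 5} -> 2}, {4, 5}];
rowNames = {"A", "B", "C", "D"};
colNames = {"a", "b", "c", "d", "e"};

(* Hypothetical name -> position indexes; the package's internals may differ. *)
rowIndex = AssociationThread[rowNames -> Range[Length[rowNames]]];
colIndex = AssociationThread[colNames -> Range[Length[colNames]]];

(* Sub-matrix by names: rows A, C and columns a, e. *)
sub = smat[[Lookup[rowIndex, {"A", "C"}], Lookup[colIndex, {"a", "e"}]]];
Normal[sub]                                (* {{1, 0}, {0, 2}} *)

(* A single atomic index drops a dimension, analogous to drop=T in R ... *)
Dimensions[smat[[rowIndex["A"], All]]]     (* {5} *)

(* ... while the same index wrapped in a list keeps a matrix, analogous to drop=F. *)
Dimensions[smat[[{rowIndex["A"]}, All]]]   (* {1, 5} *)
```

Wrapping the index in a list is what makes a separate `Drop` option unnecessary: the caller chooses the behavior through the shape of the index.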

# Neat example

Consider this incidence matrix that represents a bi-partite graph of relationships of actors starring in movies:

We can make from it an `RSparseMatrix` object with named rows and columns (`rBiMat`).

Here is the corresponding graph:

If we want to see which actors have participated in movies together with Orlando Bloom we can do the following:
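As a minimal sketch of such a query, here is a version with built-in functions only and a small stand-in data set (four actors, three movies). This is not the incidence matrix of the post, and the names `actors`, `movies`, `biMat`, `row`, and `shared` are illustrative. Multiplying the incidence matrix by Orlando Bloom's row counts, for each actor, the movies shared with him.

```mathematica
(* Stand-in data: rows are actors, columns are movies. *)
actors = {"Orlando Bloom", "Johnny Depp", "Ian McKellen", "Patrick Stewart"};
movies = {"The Lord of the Rings", "Pirates of the Caribbean", "X-Men"};
biMat = SparseArray[{{1, 1} -> 1, {1, 2} -> 1, {2, 2} -> 1,
    {3, 1} -> 1, {3, 3} -> 1, {4, 3} -> 1}, {4, 3}];

(* Row position of the actor of interest. *)
row = First[FirstPosition[actors, "Orlando Bloom"]];

(* For each actor: the number of movies shared with that row. *)
shared = biMat . biMat[[row]];

(* Actors with at least one shared movie, excluding Orlando Bloom himself. *)
DeleteCases[Pick[actors, Positive[Normal[shared]]], "Orlando Bloom"]
(* {"Johnny Depp", "Ian McKellen"} *)
```

With named rows and columns the position lookup in `row` becomes unnecessary, which is the convenience an `RSparseMatrix` object is meant to provide.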
