I recently updated the package TriesWithFrequencies.m with two new functions `TrieRemoval`

and TrieNodeFrequencies. The function `TrieNodeFrequencies`

reverses the outcomes of `TrieNodeProbabilities`

; it was made to facilitate the implementation and usage of `TrieRemoval`

.

The removal of a sub-tree from a trie is useful when we want to alternate the learning stored in the trie.

The new functions are illustrated with examples below. (These examples reuse and extend the ones in the blog post “Tries with frequencies for data mining”.)

Consider the trie made with the list of words

`{"war", "ward", "warden", "work", "car", "cars", "carbs"}`

The function `TrieNodeFrequencies`

can be used to reverse the effect of `TrieNodeProbabilities`

. Its first argument is a trie, the second one is a scaling parameter (which is 1 by default). Here is an example:

`TrieForm[TrieNodeFrequencies[TrieNodeProbabilities[mytrie], 14]]`

The function `TrieRemove`

can be used to remove sub-parts of a trie that correspond to a specified “word”. Here is an example using a trie with frequencies:

`TrieForm[TrieRemove[mytrie, "car"]]`

Here is an example using a trie with probabilities:

`TrieForm[TrieRemove[TrieNodeProbabilities[mytrie], "car"]]`

`TrieRemove`

has two options “HowManyTimes” and “FrequencyValues” with default values `Automatic`

. The option “HowManyTimes” takes a number or `Automatic`

and it is still under development (its functionality is partially correct). The option “FrequencyValues” takes `True | False | Automatic`

and it is used for the interpretation of the sub-tree removal. If the values are frequencies the removal process is more straightforward. If the values are probabilities the removal process has to take into account that the values on the same level of the sub-tree are conditional probabilities that sum to 1. `TrieRemove`

has two different implementations one for tries with frequencies and one for tries with probabilities. The option setting `"FrequencyValues"->True|False`

is used to redirect to the correct implementation. When `"FrequencyValues"->Automatic`

is given a simple heuristic is used to determine is the trie argument a trie with frequencies or a trie with probabilities.

Pingback: ML::TriesWithFrequencies | Raku for Prediction