My name is Anton Antonov. This blog is for examples and descriptions of real-world applications of machine learning and other prediction algorithms implemented in Mathematica.
I was a kernel developer at Wolfram Research, Inc. for seven years, working mostly on numerical algorithms. (I implemented and documented the framework and integrators of NIntegrate.) During the last 9+ years I have designed, prototyped, and implemented a variety of machine learning and data mining algorithms.
This blog is strongly associated with

the Mathematica source code project MathematicaForPrediction at GitHub, and

the Mathematica and R source code project MathematicaVsR at GitHub.
Hi Anton,
I used to work for Wolfram Solutions and now I’m back to consulting. I have developed a Mathematica Application for data analysis (link to screencast: http://youtu.be/RhYoWRT2yPk) which may at some point be of interest to you. Please take a look at the screencast and keep my email in case you see any synergistic opportunity in the future.
Regards,
Dear Anton Antonov, I was searching for code to estimate quantile regression through linear programming, both without penalty and with penalty (LASSO and SCAD). I have read a lot of information on your website, but I cannot download the PDF file from MathematicaForPrediction at GitHub. Can you forward me that file, and if possible also send me some code? Thanks… I am a PhD student.
Hi Amin,
I just verified that the PDFs and the Mathematica code are downloadable from the GitHub site. Nevertheless, I will send you an email with them as attachments.
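For other readers who arrive with the same question, here is a minimal sketch of the linear-programming formulation of (unpenalized) quantile regression. It is written in Python with NumPy/SciPy purely for illustration; the repository's own implementations are in Mathematica, and the synthetic data below are made up.

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression_lp(X, y, tau=0.5):
    """Estimate quantile-regression coefficients by linear programming.

    Minimizes sum_i [tau*u_i + (1 - tau)*v_i] subject to
    X_i . b + u_i - v_i = y_i, with u, v >= 0 and b free,
    where u_i, v_i are the positive/negative parts of the residuals.
    """
    n, p = X.shape
    # Decision vector: [b (p entries, free), u (n, >= 0), v (n, >= 0)]
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

# Median regression (tau = 0.5) on a small synthetic data set
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
X = np.column_stack([np.ones_like(x), x])        # intercept + slope
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, 200)
beta = quantile_regression_lp(X, y, tau=0.5)
```

Setting `tau` to, say, 0.9 estimates the conditional 90th percentile instead of the median; the LASSO and SCAD penalized variants add penalty terms to this same objective.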
Hello Anton
I am working on forecast algorithms in Mathematica, doing work similar to yours, and I would like to know whether we could use your services or cooperate somehow.
Dara
Hey, Anton. What would you call a conversational engine and how would you implement it?
Let me explain: I’m writing a conversational engine/bot. I’m new to the field and having to learn the specific names used.
My bot has a defined number of capabilities (e.g. two: finding gifs and printing jokes). After parsing the first user-given string (using parser generators, thanks for that!), I know what the user is trying to do (either requesting gifs or requesting jokes). Hence, depending on the intention/capability, the bot asks different questions (e.g. “gif of what?” for gifs, but “dirty or clean?” for jokes).
As you can see, the conversation may follow one of the defined paths. I’ve been thinking about using a Finite State Machine for that (i.e. for the conversational “engine”), but I’m not happy with it. Questions:
What would you call this “conversational/context engine”?
How would you suggest implementing this?
Thank you in advance and thanks for these awesome posts as well. Very insightful!
Gunar.
Hi Gunar,
Thank you for your note!
Well, you can call it whatever you like… More seriously, I am not sure what kind of naming you are looking for. For example, one type of name is “Jokegif bot”; another is “interaction program for user queries of jokes and images.”
I think I have already provided an answer to this question with this blog post of mine:
“Creating and programming DSLs”.
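As a rough sketch of the finite-state-machine idea you mention (in Python here for brevity; the state names, prompts, and intent labels are invented to match your joke/gif example):

```python
# Minimal dialog state machine for a two-capability bot.
# States, transitions, and prompts are hypothetical, matching the
# joke/gif example: the parsed intent drives the transition from "start".

STATES = {
    "start":         {"gif": "ask_gif_topic", "joke": "ask_joke_kind"},
    "ask_gif_topic": {},   # terminal here: the next answer is the gif topic
    "ask_joke_kind": {},   # terminal here: the next answer picks dirty/clean
}

PROMPTS = {
    "ask_gif_topic": "gif of what?",
    "ask_joke_kind": "dirty or clean?",
}

def step(state, user_input):
    """Advance the dialog: return (next_state, bot_prompt).

    Unrecognized input leaves the machine in its current state,
    so the bot simply repeats its question."""
    next_state = STATES[state].get(user_input, state)
    return next_state, PROMPTS.get(next_state, "")

state, prompt = step("start", "gif")   # bot now asks "gif of what?"
```

Each new capability adds one branch out of `"start"` plus its follow-up states, which keeps the conversation paths explicit and easy to test.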
Best regards,
Anton
Hi Anton, I am a researcher working in the field of financial applications of self-organizing maps, and I wonder if there is any room for cooperation between the two of us.
Thanks in advance for your reply,
Marina
Yes, I would like to investigate such a possibility.
Reading through some of your posts, I wondered if you know of the late Thomas Cover’s work on universal data compression and universal portfolios (http://www.mit.edu/~6.454/www_fall_2001/shaas/universal_portfolios.pdf).
Since I first encountered it, I’ve thought that Cover’s work on what he described informally as “natural optimization techniques” could provide very efficient and useful optimization approaches for use in machine learning, AI, & AGI.
Cover’s universal optimization approaches grow out of the beginnings of information theory, especially John Kelly’s work at Bell Labs (see: https://www.princeton.edu/~wbialek/rome/refs/kelly_56.pdf).
Cover developed the theoretical optimization framework for identifying, at successive time steps, the mean rank-weighted “portfolio” of agents/algorithms from an infinite number of possible combinations of inputs.
Think of this as a multidimensional regular simplex with rank weightings as a hypercap. One can then find the mean rank-weighted “portfolio” of agents geometrically.
(Note: Statistical methods of doing this take lots of processing time and power. I can share Mathematica code for a geometric solution that does this).
Cover proved that successively following that mean rank-weighted “portfolio” (shifting the portfolio allocation at each time step) converges asymptotically, with probability 1, to the best single “portfolio” of agents at any future time step.
Optimization without Monte Carlo or neural nets.
No dependence on distribution of the data.
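For concreteness, here is a minimal sketch of the standard grid-discretized approximation to Cover's universal-portfolio update (in Python for illustration; Cover's construction integrates over the whole simplex, and the two-asset price-relative data below are made up):

```python
import itertools
import numpy as np

def simplex_grid(n_assets, steps):
    """All portfolios (weight vectors summing to 1) on a grid over the simplex."""
    return np.array([np.array(c) / steps
                     for c in itertools.product(range(steps + 1), repeat=n_assets)
                     if sum(c) == steps])

def universal_portfolio(price_relatives, steps=20):
    """Cover's universal portfolio, approximated on a simplex grid.

    At each time step the chosen allocation is the wealth-weighted mean of
    all constant-rebalanced portfolios (CRPs) on the grid, which is the
    discrete analogue of Cover's integral over the simplex."""
    T, n = price_relatives.shape
    grid = simplex_grid(n, steps)              # shape (G, n)
    wealth = np.ones(len(grid))                # accumulated wealth of each CRP
    allocations = []
    for t in range(T):
        b = wealth @ grid / wealth.sum()       # wealth-weighted mean portfolio
        allocations.append(b)
        wealth *= grid @ price_relatives[t]    # update each CRP's wealth
    return np.array(allocations)

# Two hypothetical assets: one flat, one alternating up/down
x = np.array([[1.0, 2.0], [1.0, 0.5]] * 10)
B = universal_portfolio(x)
```

The first allocation is uniform (all CRPs start with equal wealth); thereafter the weighting concentrates on the CRPs that have performed best so far, which is what yields the asymptotic-optimality guarantee described above.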
I don’t know of anyone who has incorporated Cover’s ideas into AI & AGI. It seems like a potentially fruitful path.
I’ve wondered if human brains might optimize their responses to the world by some Cover-like method. It would seem to correspond closely with the wetware.
From the posts you’ve presented in this blog, I thought this might pique your interest.
Let me know.
Andreas