Wolfram Live-Coding Series on Latent Semantic Analysis workflows

In brief

The lectures on Latent Semantic Analysis (LSA) are to be recorded through Wolfram University (Wolfram U) in December 2019 and January-February 2020.

The lectures (as live-coding sessions)

  1. Overview Latent Semantic Analysis (LSA) typical problems and basic workflows.
    Answering preliminary anticipated questions.
    Here is the recording of the first session at Twitch .

    • What are the typical applications of LSA?
    • Why use LSA?
    • What it the fundamental philosophical or scientific assumption for LSA?
    • What is the most important and/or fundamental step of LSA?
    • What is the difference between LSA and Latent Semantic Indexing (LSI)?
    • What are the alternatives?
      • Using Neural Networks instead?
    • How is LSA used to derive similarities between two given texts?
    • How is LSA used to evaluate the proximity of phrases? (That have different words, but close semantic meaning.)
    • How the main dimension reduction methods compare?
  2. LSA for document collections.
    Here is the recording of the second session at Twitch .

    • Motivational example – full blown LSA workflow.
    • Fundamentals, text transformation (the hard way):
      • bag of words model,
      • stop words,
      • stemming.
    • The easy way with LSAMon.
    • “Eat your own dog food” example.
  3. Representation of the documents – the fundamental matrix object.
    Here is the recording of the third session at Twitch.

    • Review: last session’s example.
    • Review: the motivational example – full blown LSA workflow.
    • Linear vector space representation:
      • LSA’s most fundamental operation,
      • matrix with named rows and columns.
    • Pareto Principle adherence
      • for a document,
      • for a document collection, and
      • (in general.)
  4. Representation of unseen documents.
    Here is the recording of the fourth session at Twitch.

  5. LSA for image de-noising and classification. Here is the recording of the fifth session at Twitch.

    • Review: last session’s image collection topics extraction.
      • Let us try that two other datasets:
    • Image de-noising:
      • Using handwritten digits (again).
    • Image classification:
      • Handwritten digits.
  6. Further use cases.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.