Where Does Probability Come From?

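An n-gram language model estimates probabilities by counting word sequences (n-grams) in a large training corpus. A bigram model estimates, for each pair of adjacent words, the probability that the second word follows the first: the number of times the pair occurs divided by the number of times the first word occurs. These estimates are then used to score new sentences and compute perplexity, a measure of how well the model predicts them (lower is better).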
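The counting and scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the toy corpus, the function names, the `<s>` and `</s>` sentence-boundary tokens, and the add-k smoothing (used so that unseen word pairs do not get zero probability) are all assumptions introduced here. Setting `k=0.0` gives the plain count-based estimate.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count how often each word appears as a context and how often each word pair occurs."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Assumed convention: pad each sentence with start and end markers.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts


def bigram_prob(prev, word, context_counts, bigram_counts, vocab_size, k=1.0):
    """Estimate P(word | prev) with add-k smoothing (an assumption; k=0.0 is the raw ratio of counts)."""
    return (bigram_counts[(prev, word)] + k) / (context_counts[prev] + k * vocab_size)


def perplexity(sentence, context_counts, bigram_counts, vocab_size, k=1.0):
    """Perplexity = exp of the negative average log-probability per predicted token."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    log_prob = 0.0
    n_predictions = 0
    for prev, word in zip(tokens, tokens[1:]):
        p = bigram_prob(prev, word, context_counts, bigram_counts, vocab_size, k)
        log_prob += math.log(p)
        n_predictions += 1
    return math.exp(-log_prob / n_predictions)


# Hypothetical toy corpus, for illustration only.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
context_counts, bigram_counts = train_bigram(corpus)

# Vocabulary = every token the model can predict (includes </s>, excludes <s>).
vocab_size = len({word for (_, word) in bigram_counts})

print(bigram_prob("the", "cat", context_counts, bigram_counts, vocab_size, k=0.0))
print(perplexity("the cat sat", context_counts, bigram_counts, vocab_size))
print(perplexity("sat the cat", context_counts, bigram_counts, vocab_size))
```

A sentence whose word pairs were frequent in the training corpus gets a lower perplexity than one built from unseen pairs. Words that never appear in training are not handled specially in this sketch; a fuller version would typically map them to an unknown-word token.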

Step 1: Training Corpus

Learned Bigram Probabilities

Step 2: Evaluation & Perplexity Calculation