An n-gram is a sequence of n adjacent phonemes, syllables, letters or words in a particular order. N-grams are grouped by their n value: if n is 1, we call it a unigram; if n is 2, a bigram; and so on, with the prefix describing the length of the sequence.
Let's take the sentence "The quick brown fox jumps over the lazy dog." and generate n-grams from it:
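Here is a minimal sketch of what that generation could look like (assuming plain Python and a simple whitespace split; this is just an illustration, not necessarily how the post's own code does it):

```python
# A small sketch: build unigrams, bigrams and trigrams from the example
# sentence using a plain split() as the tokenizer.
sentence = "The quick brown fox jumps over the lazy dog."
words = sentence.lower().replace(".", "").split()

def ngrams(tokens, n):
    """Return every run of n adjacent tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print("Unigrams:", ngrams(words, 1))
print("Bigrams: ", ngrams(words, 2))
print("Trigrams:", ngrams(words, 3))
```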
As you can see, the bigrams carry more context than the unigrams, and the trigrams carry more context than both. Now we will move on to n-gram models.
If you want to learn more about n-grams, you can go here and here.
An n-gram language model predicts the probability of a given n-gram occurring within a sequence of words in a language (or, more practically, within its training data). The model predicts the next word by taking the last few words of the input and comparing them with the available data to find which word is most likely to occur next. Some may find this useful, as it captures an accurate relationship between words. On the other hand, if we desire uniqueness, we might need to randomize the selection of the words.
You can find more information about this here.
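To make the idea concrete, here is a rough sketch of that prediction step (this is a simplified illustration, not the model described later in the post): count which words follow a given context, then either take the most frequent follower or sample one at random for more varied output.

```python
import random
from collections import Counter, defaultdict

# Illustrative only: count which word follows each single-word context.
training = "the quick brown fox jumps over the lazy dog the quick brown cat".split()

follows = defaultdict(Counter)
for prev, curr in zip(training, training[1:]):
    follows[prev][curr] += 1          # how often `curr` came after `prev`

def predict(prev, randomize=False):
    counts = follows[prev]
    if not counts:
        return None
    if randomize:
        # sample proportionally to the counts for less repetitive output
        return random.choices(list(counts), weights=counts.values())[0]
    # otherwise take the single most frequent follower
    return counts.most_common(1)[0][0]

print(predict("quick"))          # most likely word after "quick"
print(predict("quick", True))    # randomized pick for uniqueness
```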
My n-gram model is fairly simple. It doesn't use any smoothing techniques, nor does it require separate functions to create and use different n-gram sizes.
It basically works by using nested dictionaries where the previous word is the key for the current one. It generates dictionaries for a bigram like this: