## Updating the language model

The Kaldi model used in Vosk is compiled from three data sources:

  * dictionary
  * acoustic model
  * language model

You can rebuild all three, with varying levels of effort, but sometimes you only
need to adjust word probabilities to improve recognition. For
that, it is enough to recompile the language model from text. To do that:

1) Take a text that reflects the speech you want to recognize
2) Remove punctuation and convert everything to lowercase; a short Python script is enough for this
3) Build OpenFst and OpenGrm inside Kaldi

```
export KALDI_ROOT=`pwd`/kaldi
git clone https://github.com/kaldi-asr/kaldi
cd kaldi/tools
make
# install all required dependencies and repeat `make` if needed
extras/install_opengrm.sh
```

4) Now let's build the grammar

```
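# Make the tools built in step 3 available
export PATH=$KALDI_ROOT/tools/openfst/bin:$PATH
export LD_LIBRARY_PATH=$KALDI_ROOT/tools/openfst/lib/fst
cd model
# Save the word list (output symbols) of the existing grammar to words.txt
fstsymbols --save_osymbols=words.txt Gr.fst > /dev/null
# Compile text.txt, count n-grams, estimate the model and overwrite Gr.fst
farcompilestrings --fst_type=compact --symbols=words.txt --keep_symbols text.txt | \
    ngramcount | ngrammake | \
    fstconvert --fst_type=ngram > Gr.fst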
```

Use the newly created Gr.fst in place of the standard one in your model.
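The text cleanup in step 2 could be sketched roughly as follows. This is only an illustration, not the script used by the Vosk authors: the function names are hypothetical, and the choice to keep apostrophes (so that words like "don't" survive) is an assumption that you should adjust to match the spelling used in your model's words.txt.

```python
import re
import sys

# Anything that is not a letter, digit, underscore, whitespace or apostrophe
# is treated as punctuation. Keeping apostrophes is an assumption; change the
# pattern if your dictionary spells such words differently.
_PUNCT = re.compile(r"[^\w\s']")


def normalize_line(line):
    """Lowercase a line, replace punctuation with spaces, collapse whitespace."""
    line = line.lower()
    line = _PUNCT.sub(" ", line)
    line = line.replace("_", " ")  # \w matches underscores, so drop them here
    return " ".join(line.split())


def normalize_file(src, dst):
    """Write a cleaned copy of src to dst, skipping lines that end up empty."""
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            cleaned = normalize_line(line)
            if cleaned:
                fout.write(cleaned + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    normalize_file(sys.argv[1], sys.argv[2])
```

Saved under a name of your choice, for example `normalize.py`, it could be run as `python3 normalize.py raw.txt text.txt`, where `raw.txt` stands for your original text and `text.txt` is the file read in step 4.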

For more details, see the OpenGRM documentation: http://www.opengrm.org/twiki/bin/view/GRM/NGramLibrary

You cannot introduce new words this way; that is something we will cover later.
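Because new words cannot be introduced this way, it can help to check your text against the model's word list before compiling, since words missing from words.txt are likely to cause problems in the `farcompilestrings` step. The sketch below is one possible way to do that. It assumes words.txt is in the usual OpenFst text symbol table format (one `symbol id` pair per line), and the function names are hypothetical.

```python
import sys
from collections import Counter


def load_vocabulary(words_path):
    """Read the symbols (first column) of an OpenFst text symbol table."""
    vocab = set()
    with open(words_path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if parts:
                vocab.add(parts[0])
    return vocab


def find_oov(text_lines, vocab):
    """Count every whitespace-separated word that is not in the vocabulary."""
    counts = Counter()
    for line in text_lines:
        for word in line.split():
            if word not in vocab:
                counts[word] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) == 3:
    vocabulary = load_vocabulary(sys.argv[1])
    with open(sys.argv[2], encoding="utf-8") as text_file:
        oov = find_oov(text_file, vocabulary)
    # Most frequent unknown words first
    for word, count in oov.most_common():
        print(word, count)
```

Run as, for example, `python3 check_oov.py words.txt text.txt` (the script name is arbitrary), it lists each unknown word with its frequency. You can then decide whether to rewrite or remove the affected lines.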