I’m currently working on a small project where I want to implement word prediction and word completion in a React app. The app itself is done, but I still need to figure out the right algorithms. The idea is to help people who can’t speak by allowing them to type out sentences, with the app completing or predicting words as they type.
When I started looking into models for word prediction, I got a bit lost—there are so many options, and I’m still new to NLP. So, I was hoping someone with more experience could give me some guidance.
How can I implement a solid but fast and lightweight GPT-style model (or any other model) for word prediction that can run on the client side?
I’d really appreciate any suggestions or ideas.
Here’s some additional info:
I have experience with TensorFlow, so I was thinking about using TensorFlow Lite models for client-side use.
For word completion, I’ve already tried a simple RNN (but I’m open to tips or other options).
For word prediction, I was considering an LSTM, but it’s not quite there yet, so I’m also thinking about using a small GPT-style model.
It’s common to use the same language model for both word completion and word prediction. Completion is basically prediction with an extra filter: keep only the suggestions that start with the prefix typed so far.
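To make that concrete, here’s a minimal sketch of the filter idea (the candidate list is a toy stand-in for whatever your model outputs, and `complete` is a name I made up):

```python
def complete(prefix, candidates, k=3):
    """Filter a language model's next-word candidates down to
    completions of the prefix typed so far.

    `candidates` is a list of (word, probability) pairs from the
    model, most probable first."""
    matches = [(w, p) for w, p in candidates if w.startswith(prefix)]
    return [w for w, _ in matches[:k]]

# The same ranked list serves both features:
candidates = [("hello", 0.4), ("help", 0.3), ("world", 0.2), ("hero", 0.1)]
print(complete("", candidates))    # prediction: no prefix typed yet
print(complete("he", candidates))  # completion: filtered by prefix
```

So prediction and completion can share one model and one inference call per keystroke; only the post-processing differs.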
If you know the keyboard layout, you can handle typos by matching each key with its neighbors. A fancier approach would be using a noisy channel model (a combination of a language model and a typo model).
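A quick sketch of the neighbor-matching idea (the `NEIGHBORS` table here is a hand-made, partial QWERTZ map for illustration, not a complete layout, and this is the cheap version, not a full noisy channel model):

```python
# Partial QWERTZ neighbor map (assumption: extend for the full layout).
NEIGHBORS = {
    "a": "qwsy", "s": "awedxy", "d": "serfcx",
    "e": "wsdr", "r": "edft", "n": "bhjm",
}

def forgiving_match(typed, word):
    """True if each typed key is the intended letter or one of its
    physical neighbors -- a cheap stand-in for a noisy channel model
    (language model combined with a typo model)."""
    if len(typed) > len(word):
        return False
    return all(t == c or c in NEIGHBORS.get(t, "")
               for t, c in zip(typed, word))

print(forgiving_match("tesr", "test"))  # 'r' is next to 't', so it matches
```

In a real noisy channel setup you’d weight these matches by a typo probability and multiply by the language model score instead of treating them as a hard yes/no.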
For German, subword modeling like byte-pair encoding can be really helpful.
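The core of BPE is just repeatedly merging the most frequent adjacent symbol pair. A toy sketch (the corpus and merge count are illustrative, and the plain `str.replace` merge is a simplification — real BPE implementations merge in a boundary-aware way):

```python
from collections import Counter

def most_frequent_pair(vocab):
    """vocab maps a space-separated symbol sequence to its corpus count."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(pair, vocab):
    old = " ".join(pair)
    new = "".join(pair)
    return {word.replace(old, new): freq for word, freq in vocab.items()}

# Toy German-ish corpus: useful subwords for compounds emerge from merges.
vocab = {"h a u s": 5, "h a u s t ü r": 2, "t ü r": 4}
for _ in range(3):
    vocab = merge_pair(most_frequent_pair(vocab), vocab)
print(vocab)  # "haus" has become a single subword
```

This is why BPE suits German: frequent stems like "haus" become single tokens, while rare compounds still decompose into known pieces instead of becoming out-of-vocabulary words.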
If you want something fast and simple to start, consider a basic bigram model. They’re lightweight and quick.
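A count-based bigram model fits in a few lines and trains instantly; here’s a minimal sketch (class and method names are my own, and it has no smoothing):

```python
from collections import Counter, defaultdict

class BigramModel:
    """Count-based bigram model: tiny, fast, and a reasonable
    baseline to compare any neural model against."""
    def __init__(self):
        self.bigrams = defaultdict(Counter)

    def train(self, sentences):
        for s in sentences:
            words = ["<s>"] + s.lower().split()  # <s> marks sentence start
            for a, b in zip(words, words[1:]):
                self.bigrams[a][b] += 1

    def predict(self, prev, k=3):
        """Most frequent next words after `prev`."""
        return [w for w, _ in self.bigrams[prev.lower()].most_common(k)]

model = BigramModel()
model.train(["ich habe hunger", "ich habe durst", "ich bin müde"])
print(model.predict("ich"))  # ranked followers of "ich"
```

The whole model is one nested count table, so it also serializes to JSON trivially for client-side loading.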
Don’t forget about training data. The closer it matches real use (emails, text messages, etc.), the better. Tokenization is also key.
I’ve worked in word prediction for assistive tech before, and spent some years on mobile typing tech at Swype and Nuance, but that was before neural networks became the go-to.
For storage, I’m not sure yet, but I’d prefer a smaller model so the loading time isn’t too long. At the same time, I want it to provide good predictions to minimize typing.
The goal here is mainly for the app to complete or predict words efficiently.
@Holt
I’d suggest setting a memory limit. If you’re aiming for fast loading, consider keeping it within 10-15MB, similar to typical web dependencies. You might have to compromise a bit between size and quality.
I’m not aware of pre-trained models that fit that size, but you could try training your own smaller model:
The smallest GPT-2 I found was around 200MB, but you could train a smaller version by scaling down the architecture.
If you set a memory limit upfront, you can choose hyperparameters that fit within it. A network’s size is fixed by its architecture, so you can compute the footprint before doing any training.
Embeddings tend to take up the most space. Byte-pair encoding can help reduce vocabulary size, which saves memory.
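You can do this arithmetic before training anything. A back-of-the-envelope sketch (the vocabulary sizes and embedding dimension are illustrative):

```python
def embedding_bytes(vocab_size, dim, bytes_per_weight=4):
    """Rough size of an embedding table (float32 weights by default)."""
    return vocab_size * dim * bytes_per_weight

# Full word-level German vocab vs. a BPE subword vocab, 128-dim embeddings:
word_level = embedding_bytes(100_000, 128)  # ~51 MB for the table alone
subword = embedding_bytes(8_000, 128)       # ~4 MB
print(word_level / 1e6, subword / 1e6)
```

With a word-level vocabulary, the embedding table alone already blows a 10-15MB budget; a subword vocabulary is what makes that budget plausible.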
If you’re designing a custom neural network (even a simple RNN):
Share the embedding weights between input and output to save memory.
Reduced precision and quantization can make a big difference.
Check out the GPT-2 paper and similar ones for tricks to balance quality against memory use.
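Both of those tricks are easy to see in a dependency-free sketch (all names are illustrative; small dimensions to keep it readable — in practice TensorFlow Lite’s post-training quantization handles the second part for you):

```python
import random

random.seed(0)
vocab_size, dim = 8_000, 16

# Weight tying: the same matrix E is the input embedding (row lookup)
# and the output projection (dot products against rows), so the largest
# parameter block is stored only once.
E = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

def embed(token_id):
    return E[token_id]  # input side: row lookup

def output_logits(hidden):
    # output side: reuse E instead of a separate projection matrix
    return [sum(h * w for h, w in zip(hidden, row)) for row in E]

# Post-training int8 quantization: each weight becomes one byte plus a
# single shared float scale, a 4x reduction over float32 storage.
max_abs = max(abs(w) for row in E for w in row)
scale = max_abs / 127
E_q = [[round(w / scale) for w in row] for row in E]

logits = output_logits(embed(42))
print(len(logits))  # one logit per vocabulary word
```

Tying alone removes an entire vocab-by-dim matrix; quantization then shrinks what remains by another 4x, at a small cost in accuracy.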
For evaluating your model:
Perplexity is the standard metric in language modeling; you could compare your scores against published results on German test sets.
Keystroke savings is another useful measure: simulate typing with your app and measure how many keystrokes it saves compared to typing every character.
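Both metrics are cheap to implement yourself. A sketch (the toy predictor and the keystroke accounting — one tap to accept a suggestion, word plus space otherwise — are simplifying assumptions):

```python
import math

def perplexity(probs):
    """Perplexity from the model's probability for each test token:
    exp of the average negative log-likelihood."""
    nll = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(nll)

def keystroke_savings(sentence, predict):
    """Simulate typing: at each word, accept the suggestion if the
    predictor offers it (1 keystroke), else type it out plus a space.
    `predict(prefix_words)` is your model's suggestion list."""
    words = sentence.split()
    typed = 0
    for i, word in enumerate(words):
        typed += 1 if word in predict(words[:i]) else len(word) + 1
    baseline = len(sentence) + 1  # every character typed by hand
    return 1 - typed / baseline

# Toy predictor (assumption: replace with your model's top-k output).
def predict(prev):
    return ["habe"] if prev and prev[-1] == "ich" else []

print(perplexity([0.25, 0.25, 0.25, 0.25]))  # uniform over 4 words -> 4.0
print(keystroke_savings("ich habe hunger", predict))
```

Keystroke savings is arguably the more meaningful of the two for your use case, since minimizing physical input effort is the actual goal of the app.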