Thanks for the explanation! It sent me down a rabbit hole of trying to find a way to cancel an await call, and from what I can tell there's no clear way to do so. In my case I ended up connecting the signal to a secondary function instead of using await, though I'm not entirely sure if there's an advantage to one method over the other.
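In case it helps anyone who lands here later, here's a rough sketch of the two approaches, assuming this is Godot 4 GDScript (the Timer node, signal, and function names are just placeholders):

```gdscript
# Approach 1: await suspends this function until the signal fires.
# Once started, there's no built-in way to cancel the wait.
func wait_for_timer() -> void:
	await $Timer.timeout
	print("timer finished (await)")

# Approach 2: connect the signal to a secondary function.
# The connection can be removed later with disconnect(), which is
# about the closest thing to "cancelling" the wait.
func start_listening() -> void:
	$Timer.timeout.connect(_on_timer_timeout)

func _on_timer_timeout() -> void:
	print("timer finished (callback)")
```

The main practical difference I've found is that the callback version can be disconnected at any time, while await keeps all the follow-up logic in one function.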
You can install Ollama in a Docker container and use it to pull models to run locally. Some are really small and still pretty effective; Llama 3.2, for example, is only 3B, and there are variants as small as 1B. You can access it through the terminal, or use something like Open WebUI for a more "ChatGPT"-like interface.
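The basic setup looks roughly like this (based on Ollama's published Docker instructions; the container name and model tags are just what I used):

```bash
# Start the Ollama server in a container, persisting models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Pull and chat with Llama 3.2 (3B) from the terminal
docker exec -it ollama ollama run llama3.2

# Or the smaller 1B variant
docker exec -it ollama ollama run llama3.2:1b
```

Open WebUI can then be pointed at the Ollama server on port 11434 if you want the web interface.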