Ollama lets you run open-source large language models (LLMs) such as Llama 2 and Mistral locally. It's an LLM serving platform written in Go.
It's currently available for macOS and Linux.
Setting Up
Simply download Ollama from ollama.ai/download
Or run:
curl https://ollama.ai/install.sh | sh
Once installed, pick your LLM of choice from this list. For example, let's set up llama2:
ollama run llama2
This downloads the most basic version of the model and starts a REPL where you can converse with it.
Ollama also exposes a REST API for running and managing models. Start the server with:
ollama serve
Generate a response
curl http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt":"Why is the sky blue?"
}'
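By default, `/api/generate` streams its reply as newline-delimited JSON objects, each carrying a `"response"` fragment, with a final object marked `"done": true`. A minimal sketch of reassembling those fragments in Python (the sample data below is illustrative, not real model output):

```python
import json

def join_stream(ndjson_text: str) -> str:
    """Concatenate the "response" fragments from a streamed /api/generate reply."""
    parts = []
    for line in ndjson_text.strip().splitlines():
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Illustrative sample of the stream format (not real model output):
sample = (
    '{"model":"llama2","response":"The sky ","done":false}\n'
    '{"model":"llama2","response":"is blue...","done":false}\n'
    '{"model":"llama2","response":"","done":true}\n'
)
print(join_stream(sample))  # The sky is blue...
```

If you'd rather get a single JSON object back, the API also accepts `"stream": false` in the request body.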
Chat with a model
curl http://localhost:11434/api/chat -d '{
"model": "mistral",
"messages": [
{ "role": "user", "content": "why is the sky blue?" }
]
}'
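The chat endpoint is stateless: the full conversation is sent with every request, so a client keeps its own history and appends each assistant reply before the next turn. A sketch of building that payload (the helper name and the assistant reply text are stand-ins for illustration):

```python
def build_chat_payload(model: str, history: list, user_message: str) -> dict:
    """Append the new user turn to history and return the body for POST /api/chat."""
    history.append({"role": "user", "content": user_message})
    return {"model": model, "messages": history}

history = []
payload = build_chat_payload("mistral", history, "why is the sky blue?")
# Suppose the server answered; record the reply so the next turn has context.
history.append({"role": "assistant", "content": "Rayleigh scattering."})  # stand-in reply
payload = build_chat_payload("mistral", history, "how does that work?")
print([m["role"] for m in payload["messages"]])  # ['user', 'assistant', 'user']
```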
Review the API documentation for all endpoints.
Website: https://ollama.ai/
GitHub: https://github.com/jmorganca/ollama
Discord: https://discord.com/invite/ollama