Deploy MCP Server
AI & Machine Learning

Ollama REST API

Run large language models locally with REST API

Ollama is a local AI runtime that lets developers run large language models (LLMs) on their own hardware through a simple REST API. It supports popular models such as Llama 2, Mistral, and Code Llama, as well as custom models, with performance optimized for local inference. Developers use Ollama to build AI applications without cloud dependencies, keep data private, and reduce inference costs.

Base URL http://localhost:11434/api

API Endpoints

Method   Endpoint        Description
POST     /generate       Generate a response from a model with a single prompt
POST     /chat           Generate chat completions with conversation history
POST     /embeddings     Generate embeddings from a model for a given text
POST     /pull           Download a model from the Ollama library
POST     /push           Upload a model to the Ollama library
POST     /create         Create a new model from a Modelfile
DELETE   /delete         Delete a model and its data
POST     /copy           Copy a model to a new name
GET      /tags           List all locally available models
POST     /show           Show information about a specific model
GET      /ps             List currently running models
POST     /blobs/:digest  Create a blob for model file uploads
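A quick way to confirm the server is up is the GET /tags endpoint from the table above, which needs no request body. A minimal sketch, assuming an Ollama server on the default port (the fallback stub is only there so the snippet degrades gracefully when no server is running):

```shell
# List locally available models via GET /api/tags.
# --max-time keeps the call from hanging; the || fallback emits an
# empty stub so the snippet still produces JSON when Ollama is down.
models=$(curl -s --max-time 2 http://localhost:11434/api/tags || echo '{"models": []}')
echo "$models"
```

The response is a JSON object whose `models` array lists each local model's name, size, and modification date.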

Code Examples

curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
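The /chat endpoint works the same way but takes a `messages` array instead of a single prompt, so the model sees the full conversation history. A minimal sketch (the example conversation is illustrative; the fallback stub only covers the case where no server is listening):

```shell
# Multi-turn chat via POST /api/chat: "messages" carries the conversation
# history; "stream": false returns one JSON object instead of a stream.
payload='{
  "model": "llama2",
  "messages": [
    {"role": "user", "content": "Why is the sky blue?"},
    {"role": "assistant", "content": "Because of Rayleigh scattering."},
    {"role": "user", "content": "Explain that for a child."}
  ],
  "stream": false
}'
response=$(curl -s --max-time 5 http://localhost:11434/api/chat -d "$payload" || echo '{"error": "Ollama not reachable"}')
echo "$response"
```

With `"stream": true` (the default), the server instead returns a sequence of newline-delimited JSON chunks as tokens are generated.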

Connect Ollama to AI

Deploy an Ollama MCP server on IOX Cloud and connect it to Claude, ChatGPT, Cursor, or any AI client. Your AI assistant gets direct access to Ollama through these tools:

ollama_generate: Generate text completions from locally running LLMs with custom prompts and parameters
ollama_chat: Maintain multi-turn conversations with local AI models using chat history
ollama_embed: Generate vector embeddings from text using local embedding models for semantic search
ollama_list_models: List all locally available models and their details, including size and modified date
ollama_pull_model: Download and install models from the Ollama library to local storage
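The embedding tool above wraps the raw /api/embeddings endpoint. A direct call looks like the sketch below (the model name is illustrative; any pulled embedding-capable model works, and the fallback stub only covers the no-server case):

```shell
# Generate a vector embedding via POST /api/embeddings.
# The response is a JSON object with an "embedding" array of floats,
# suitable for semantic search or similarity comparisons.
payload='{"model": "llama2", "prompt": "The sky is blue"}'
embedding=$(curl -s --max-time 5 http://localhost:11434/api/embeddings -d "$payload" || echo '{"embedding": []}')
echo "$embedding"
```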

Deploy in 60 seconds

Describe what you need, AI generates the code, and IOX deploys it globally.

Deploy Ollama MCP Server →
