{ "cells": [ { "cell_type": "markdown", "id": "5de79491", "metadata": {}, "source": [ "# Llama Stack Quick Start Demo\n", "\n", "This notebook demonstrates how to use Llama Stack to run an agent with **client-side tools**." ] }, { "cell_type": "markdown", "id": "e5d1fc8c", "metadata": {}, "source": [ "## 1. Install Dependencies\n", "\n", "**Note:** `llama-stack-client` requires Python 3.12 or higher. If your Python version does not meet this requirement, refer to the FAQ section in the documentation: **How to prepare Python 3.12 in Notebook**." ] }, { "cell_type": "code", "execution_count": null, "id": "a8f9e5e4", "metadata": {}, "outputs": [], "source": [ "# Use current kernel's Python so PATH does not point to another env\n", "# If download is slow, add: -i https://pypi.tuna.tsinghua.edu.cn/simple\n", "import sys\n", "!{sys.executable} -m pip install \"llama-stack-client>=0.4\" \"requests\" \"fastapi\" \"uvicorn\" --target ~/packages" ] }, { "cell_type": "markdown", "id": "9d942699", "metadata": {}, "source": [ "## 2. Import Libraries" ] }, { "cell_type": "code", "execution_count": null, "id": "cfd65276", "metadata": {}, "outputs": [], "source": [ "import sys\n", "from pathlib import Path\n", "\n", "user_site_packages = Path.home() / \"packages\"\n", "if str(user_site_packages) not in sys.path:\n", " sys.path.insert(0, str(user_site_packages))\n", "\n", "import os\n", "import requests\n", "from typing import Dict, Any\n", "from urllib.parse import quote\n", "from llama_stack_client import LlamaStackClient, Agent\n", "from llama_stack_client.lib.agents.client_tool import client_tool\n", "from llama_stack_client.lib.agents.event_logger import AgentEventLogger\n", "\n", "print('Libraries imported successfully')" ] }, { "cell_type": "markdown", "id": "baabf4fc", "metadata": {}, "source": [ "## 3. Define Tools\n", "\n", "Use the `@client_tool` decorator to define a weather query tool." 
] }, { "cell_type": "code", "execution_count": null, "id": "c57f95e5", "metadata": {}, "outputs": [], "source": [ "@client_tool\n", "def get_weather(city: str) -> Dict[str, Any]:\n", " \"\"\"Get current weather information for a specified city.\n", "\n", " Uses the wttr.in free weather API to fetch weather data.\n", "\n", " :param city: City name, e.g., Beijing, Shanghai, Paris\n", " :returns: Dictionary containing weather information including city, temperature and humidity\n", " \"\"\"\n", " try:\n", " # URL encode the city name to handle spaces and special characters\n", " encoded_city = quote(city)\n", " url = f'https://wttr.in/{encoded_city}?format=j1'\n", " response = requests.get(url, timeout=10)\n", " response.raise_for_status()\n", " data = response.json()\n", "\n", " current = data['current_condition'][0]\n", " return {\n", " 'city': city,\n", " 'temperature': f\"{current['temp_C']}°C\",\n", " 'humidity': f\"{current['humidity']}%\",\n", " }\n", " except Exception as e:\n", " return {'error': f'Failed to get weather information: {str(e)}'}\n", "\n", "print('Weather tool defined successfully')" ] }, { "cell_type": "markdown", "id": "05cefded", "metadata": {}, "source": [ "## 4. Connect to Server and Create Agent\n", "\n", "Use LlamaStackClient to connect to the running server, create an Agent with the client-side weather tool, and execute tool calls." 
] }, { "cell_type": "code", "execution_count": null, "id": "394ee5db", "metadata": {}, "outputs": [], "source": [ "base_url = os.getenv('LLAMA_STACK_URL', 'http://localhost:8321')\n", "print(f'Connecting to Server: {base_url}')\n", "\n", "client = LlamaStackClient(base_url=base_url)\n", "\n", "# Find the first LLM on the server; the model list may also include embedding models\n", "models = client.models.list()\n", "llm_model = next(\n", "    (m for m in models\n", "     if m.custom_metadata and m.custom_metadata.get('model_type') == 'llm'),\n", "    None\n", ")\n", "if llm_model is None:\n", "    raise RuntimeError('No LLM model found on the server')\n", "model_id = llm_model.id\n", "print(f'Using model: {model_id}\\n')\n", "\n", "agent = Agent(\n", "    client,\n", "    model=model_id,\n", "    instructions='You are a helpful weather assistant. When users ask about weather, use the weather tool to query and answer.',\n", "    tools=[get_weather],\n", ")\n", "print('Agent created successfully')" ] }, { "cell_type": "markdown", "id": "90c28b81", "metadata": {}, "source": [ "## 5. Run the Agent" ] }, { "cell_type": "code", "execution_count": null, "id": "70e8d661", "metadata": {}, "outputs": [], "source": [ "# Create session\n", "session_id = agent.create_session('weather-agent-session')\n", "print(f'✓ Session created: {session_id}\\n')\n", "\n", "# First query\n", "print('=' * 60)\n", "print('User> What is the weather like in Beijing today?')\n", "print('-' * 60)\n", "\n", "response_stream = agent.create_turn(\n", "    messages=[{'role': 'user', 'content': 'What is the weather like in Beijing today?'}],\n", "    session_id=session_id,\n", "    stream=True,\n", ")" ] }, { "cell_type": "markdown", "id": "ca2f26f2", "metadata": {}, "source": [ "### Display the Result" ] }, { "cell_type": "code", "execution_count": null, "id": "4728a638", "metadata": {}, "outputs": [], "source": [ "logger = AgentEventLogger()\n", "for printable in logger.log(response_stream):\n", "    print(printable, end='', flush=True)\n", "print('\\n')" ] }, { "cell_type": "markdown", "id": "728530b0", "metadata": {}, "source": [ "### Try Different Queries" ] }, { "cell_type": "code", "execution_count": null, "id": "ed8cc5a0", "metadata": {}, "outputs": [], "source": [ "# Second query\n", "print('=' * 60)\n", "print('User> What is the weather in Shanghai?')\n", "print('-' * 60)\n", "\n", "response_stream = agent.create_turn(\n", "    messages=[{'role': 'user', 'content': 'What is the weather in Shanghai?'}],\n", "    session_id=session_id,\n", "    stream=True,\n", ")\n", "\n", "logger = AgentEventLogger()\n", "for printable in logger.log(response_stream):\n", "    print(printable, end='', flush=True)\n", "print('\\n')" ] }, { "cell_type": "markdown", "id": "6f8d31d0", "metadata": {}, "source": [ "## 6. FastAPI Service Example\n", "\n", "You can also run the agent behind a FastAPI web service, which exposes the agent functionality via HTTP API endpoints." ] }, { "cell_type": "code", "execution_count": null, "id": "a5d732e4", "metadata": {}, "outputs": [], "source": [ "# Import FastAPI components\n", "from fastapi import FastAPI\n", "from pydantic import BaseModel\n", "from threading import Thread\n", "import time\n", "\n", "# Create a simple FastAPI app\n", "api_app = FastAPI(title=\"Llama Stack Agent API\")\n", "\n", "class ChatRequest(BaseModel):\n", "    message: str\n", "\n", "\n", "@api_app.post(\"/chat\")\n", "def chat(request: ChatRequest):\n", "    \"\"\"Chat endpoint that uses the Llama Stack Agent\"\"\"\n", "    session_id = agent.create_session('fastapi-weather-session')\n", "\n", "    # Create turn and collect response\n", "    response_stream = agent.create_turn(\n", "        messages=[{'role': 'user', 'content': request.message}],\n", "        session_id=session_id,\n", "        stream=True,\n", "    )\n", "\n", "    # Collect the full response; logged events are printable objects,\n", "    # not plain strings, so convert them explicitly before concatenating\n", "    full_response = \"\"\n", "    logger = AgentEventLogger()\n", "    for printable in logger.log(response_stream):\n", "        full_response += str(printable)\n", "\n", "    return {\"response\": full_response}\n", "\n", "print(\"FastAPI app created. 
Use the next cell to start the server.\")" ] }, { "cell_type": "markdown", "id": "475997ba", "metadata": {}, "source": [ "### Start the FastAPI Server\n", "\n", "**Note**: In a notebook, you can start the server in a background thread. For production, run it as a separate process using `uvicorn`." ] }, { "cell_type": "code", "execution_count": null, "id": "6f5db723", "metadata": {}, "outputs": [], "source": [ "# Start server in background thread (for notebook demonstration)\n", "from uvicorn import Config, Server\n", "\n", "# Create a server instance that can be controlled\n", "config = Config(api_app, host=\"127.0.0.1\", port=8000, log_level=\"info\")\n", "server = Server(config)\n", "\n", "def run_server():\n", " server.run()\n", "\n", "# Use daemon=True so the thread stops automatically when the kernel restarts\n", "# This is safe for notebook demonstrations\n", "# For production, use process managers instead of threads\n", "server_thread = Thread(target=run_server, daemon=True)\n", "server_thread.start()\n", "\n", "# Wait a moment for the server to start\n", "time.sleep(2)\n", "print(\"✓ FastAPI server started at http://127.0.0.1:8000\")" ] }, { "cell_type": "markdown", "id": "715b2d47", "metadata": {}, "source": [ "### Test the API\n", "\n", "Now you can call the API using HTTP requests:" ] }, { "cell_type": "code", "execution_count": null, "id": "407b82af", "metadata": {}, "outputs": [], "source": [ "# Test the API endpoint\n", "response = requests.post(\n", " \"http://127.0.0.1:8000/chat\",\n", " json={\"message\": \"What's the weather in Shanghai?\"},\n", " timeout=60\n", ")\n", "\n", "print(f\"Status Code: {response.status_code}\")\n", "print(\"Response:\")\n", "print(response.json().get('response'))" ] }, { "cell_type": "markdown", "id": "945a776f", "metadata": {}, "source": [ "### Stop the FastAPI Server" ] }, { "cell_type": "code", "execution_count": null, "id": "c7795bba", "metadata": {}, "outputs": [], "source": [ "# Stop the FastAPI server (section 
6)\n", "if 'server' in globals() and server.started:\n", " server.should_exit = True\n", " print(\"✓ FastAPI server shutdown requested.\")\n", "else:\n", " print(\"FastAPI server is not running or has already stopped.\")" ] }, { "cell_type": "markdown", "id": "a3ebed1f", "metadata": {}, "source": [ "## 7. More Resources\n", "\n", "For more resources on developing AI Agents with Llama Stack, see:\n", "\n", "### Official Documentation\n", "- [Llama Stack Documentation](https://llamastack.github.io/docs) - The official Llama Stack documentation covering all usage-related topics, API providers, and core concepts.\n", "- [Llama Stack Core Concepts](https://llamastack.github.io/docs/concepts) - Deep dive into Llama Stack architecture, API stability, and resource management.\n", "\n", "### Code Examples and Projects\n", "- [Llama Stack GitHub Repository](https://github.com/llamastack/llama-stack) - Source code, example applications, distribution configurations, and how to add new API providers.\n", "- [Llama Stack Example Apps](https://github.com/llamastack/llama-stack-apps/) - Official examples demonstrating how to use Llama Stack in various scenarios.\n", "\n", "### Community and Support\n", "- [Llama Stack GitHub Issues](https://github.com/llamastack/llama-stack/issues) - Report bugs, ask questions, and contribute to the project.\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python (llama-stack-demo)", "language": "python", "name": "llama-stack-demo" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.11" } }, "nbformat": 4, "nbformat_minor": 5 }