Semantic Kernel

Semantic Kernel (SK) is Microsoft’s open-source AI orchestration SDK. It bridges AI models with conventional programming by providing a structured way to define AI skills (plugins), orchestrate multi-step AI tasks (planners), manage memory, and integrate with enterprise services. Available for .NET, Python, and Java.

GitHub: https://github.com/microsoft/semantic-kernel
Docs: https://learn.microsoft.com/en-us/semantic-kernel/
PyPI: https://pypi.org/project/semantic-kernel
NuGet: https://www.nuget.org/packages/Microsoft.SemanticKernel

Installation

Python

# Core SDK
pip install semantic-kernel

# With Azure OpenAI
pip install "semantic-kernel[azure]"

# With Hugging Face
pip install "semantic-kernel[hugging_face]"

# With vector store backends
pip install "semantic-kernel[chroma]"
pip install "semantic-kernel[qdrant]"
pip install "semantic-kernel[azure_ai_search]"
pip install "semantic-kernel[weaviate]"

# Full install
pip install "semantic-kernel[all]"

.NET

# Core SDK
dotnet add package Microsoft.SemanticKernel

# Azure OpenAI connector
dotnet add package Microsoft.SemanticKernel.Connectors.AzureOpenAI

# OpenAI connector
dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI

# Memory (vector store)
dotnet add package Microsoft.SemanticKernel.Plugins.Memory
dotnet add package Microsoft.SemanticKernel.Connectors.Chroma

Java

<!-- Maven -->
<dependency>
    <groupId>com.microsoft.semantic-kernel</groupId>
    <artifactId>semantickernel-api</artifactId>
    <version>1.x.x</version>
</dependency>

Configuration

Python Kernel Setup

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, AzureChatCompletion
from semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding

# Create kernel
kernel = sk.Kernel()

# Add OpenAI chat service
kernel.add_service(
    OpenAIChatCompletion(
        ai_model_id="gpt-4o",
        api_key="sk-...",
        service_id="chat",
    )
)

# Add Azure OpenAI chat service
kernel.add_service(
    AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint="https://myinstance.openai.azure.com/",
        api_key="xxxxxxxxxxxxxxxx",
        service_id="azure-chat",
    )
)

# Add embeddings service
kernel.add_service(
    OpenAITextEmbedding(
        ai_model_id="text-embedding-3-small",
        api_key="sk-...",
        service_id="embeddings",
    )
)

.NET Kernel Setup

using Microsoft.SemanticKernel;

// Build kernel with OpenAI
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion("gpt-4o", "sk-...")
    .Build();

// Build with Azure OpenAI
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o",
        endpoint: "https://myinstance.openai.azure.com/",
        apiKey: "xxxxxxxxxxxxxxxx"
    )
    .Build();

Core API

Kernel Components

Component	Description
`Kernel`	Central orchestrator; holds services, plugins, and settings
`KernelPlugin`	Collection of related `KernelFunction`s
`KernelFunction`	A callable unit — either an LLM prompt or native code
`KernelArguments`	Named parameters passed to functions
`ChatHistory`	Manages conversation context
`PromptTemplateConfig`	Settings for prompt rendering and LLM invocation
`FunctionChoiceBehavior`	Controls automatic/manual tool call behavior

Plugin Types

Type	Description
Native Plugin	Python/C# class with `@kernel_function` decorated methods
Prompt Plugin	YAML/text templates loaded from directory
OpenAPI Plugin	Auto-generated from OpenAPI spec
Core Plugins	Built-ins: `TextPlugin`, `TimePlugin`, `MathPlugin`, `HttpPlugin`

Advanced Usage

Defining Native Plugins (Python)

from semantic_kernel.functions import kernel_function
from semantic_kernel.functions.kernel_function_decorator import kernel_function

class MathPlugin:
    @kernel_function(name="add", description="Add two numbers together")
    def add(self, number1: float, number2: float) -> float:
        """Adds number1 and number2."""
        return number1 + number2

    @kernel_function(name="square_root", description="Calculate square root of a number")
    def square_root(self, number: float) -> float:
        """Returns the square root."""
        import math
        return math.sqrt(number)

class WebSearchPlugin:
    def __init__(self, search_client):
        self.client = search_client

    @kernel_function(name="search", description="Search the web for information")
    async def search(self, query: str) -> str:
        """Performs web search and returns results."""
        results = await self.client.search(query)
        return "\n".join([r.snippet for r in results[:3]])

# Register plugins
kernel.add_plugin(MathPlugin(), plugin_name="math")
kernel.add_plugin(WebSearchPlugin(client), plugin_name="web")

Prompt Functions

from semantic_kernel.prompt_template import PromptTemplateConfig
from semantic_kernel.functions import KernelFunctionFromPrompt

# Inline prompt function
summarize_fn = KernelFunctionFromPrompt(
    function_name="summarize",
    plugin_name="text_tools",
    prompt="""Summarize the following text in {{$max_sentences}} sentences:

{{$input}}

Summary:""",
    prompt_template_settings=PromptTemplateConfig(
        template_format="semantic-kernel",
    ),
)
kernel.add_function(plugin_name="text_tools", function=summarize_fn)

# Invoke the function
from semantic_kernel.functions import KernelArguments

result = await kernel.invoke(
    summarize_fn,
    KernelArguments(input="Long article text here...", max_sentences=3)
)
print(result)

Prompt Directory (YAML format)

# plugins/Summarizer/Summarize/config.json
{
  "schema": 1,
  "name": "Summarize",
  "description": "Summarizes text in a given number of sentences",
  "input_variables": [
    {"name": "input", "description": "Text to summarize"},
    {"name": "max_sentences", "description": "Max sentences", "default_value": "3"}
  ],
  "execution_settings": {
    "default": {
      "max_tokens": 256,
      "temperature": 0.3
    }
  }
}

# plugins/Summarizer/Summarize/skprompt.txt
Summarize the following in {{$max_sentences}} sentences:

{{$input}}

Summary:

# Load from directory
plugin = kernel.add_plugin(parent_directory="./plugins", plugin_name="Summarizer")
result = await kernel.invoke(plugin["Summarize"], KernelArguments(input="..."))

Function Calling / Auto Tool Use

from semantic_kernel.connectors.ai.open_ai import OpenAIChatPromptExecutionSettings
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior

# Auto function calling — kernel automatically executes tool calls
settings = OpenAIChatPromptExecutionSettings(
    function_choice_behavior=FunctionChoiceBehavior.Auto(
        filters={"included_plugins": ["math", "web"]}
    ),
    max_tokens=1024,
)

from semantic_kernel.contents.chat_history import ChatHistory

history = ChatHistory()
history.add_user_message("What is the square root of 144, and search for latest AI news?")

from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

chat_service = kernel.get_service(type=OpenAIChatCompletion)
result = await chat_service.get_chat_message_content(
    chat_history=history,
    settings=settings,
    kernel=kernel,
)
print(result)

Memory and Vector Stores

from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory
from semantic_kernel.connectors.memory.chroma import ChromaMemoryStore

# Setup memory store
memory_store = ChromaMemoryStore(persist_directory="/data/chroma")
memory = SemanticTextMemory(
    storage=memory_store,
    embeddings_generator=kernel.get_service(service_id="embeddings"),
)

# Save memories
await memory.save_information(
    collection="company_docs",
    id="doc_001",
    text="AcmeCorp was founded in 2010 in San Francisco.",
    description="Company founding information",
)

# Search memories
results = await memory.search(
    collection="company_docs",
    query="When was the company founded?",
    limit=3,
    min_relevance_score=0.7,
)
for r in results:
    print(f"[{r.relevance:.2f}] {r.description}: {r.text[:80]}")

Planners (Automatic Multi-Step Execution)

# Handlebars Planner — generates a Handlebars template to solve a goal
from semantic_kernel.planners.handlebars_planner import HandlebarsPlanner, HandlebarsPlannerOptions

planner = HandlebarsPlanner(
    kernel,
    options=HandlebarsPlannerOptions(allow_loops=True, max_tokens=2048)
)

# Generate a plan
plan = await planner.create_plan("Search for Python tips and summarize the top 3 results.")
print(plan.template)  # View the generated Handlebars plan

# Execute the plan
result = await plan.invoke(kernel)
print(result)

.NET — Complete Example

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion("gpt-4o", Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
    .Build();

// Native plugin
public class WeatherPlugin
{
    [KernelFunction("get_weather")]
    [Description("Get current weather for a city")]
    public async Task<string> GetWeatherAsync(
        [Description("City name")] string city,
        Kernel kernel)
    {
        // Call weather API
        return $"Weather in {city}: 72°F, Sunny";
    }
}

kernel.Plugins.AddFromType<WeatherPlugin>();

// Auto function calling
var settings = new OpenAIPromptExecutionSettings
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};

var chatService = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory("You are a helpful weather assistant.");
history.AddUserMessage("What is the weather in Seattle?");

var response = await chatService.GetChatMessageContentAsync(history, settings, kernel);
Console.WriteLine(response);

Common Workflows

RAG with Memory

async def answer_with_context(question: str, collection: str) -> str:
    # Retrieve relevant memories
    context_results = await memory.search(collection, question, limit=5)
    context = "\n".join([r.text for r in context_results])

    # Answer using context
    prompt = f"""Answer the question using only the provided context.
    
Context:
{context}

Question: {question}
Answer:"""

    result = await kernel.invoke_prompt(prompt)
    return str(result)

Chat with Function Calling

from semantic_kernel.contents.chat_history import ChatHistory

async def interactive_chat():
    history = ChatHistory()
    history.add_system_message("You are a helpful assistant with math and search capabilities.")

    while True:
        user_input = input("User: ")
        if user_input.lower() == "exit":
            break

        history.add_user_message(user_input)
        response = await kernel.invoke_prompt(
            "{{$history}}",
            arguments=KernelArguments(history=history),
        )
        print(f"Assistant: {response}")
        history.add_assistant_message(str(response))

Tips and Best Practices

Topic	Recommendation
Service ID	Always assign `service_id` when adding multiple LLM services; reference by ID in settings
Plugin naming	Use descriptive function names and `description=` — these go into tool schemas
Async	Prefer `await kernel.invoke()` pattern; SK is async-first in Python
Streaming	Use `kernel.invoke_stream()` for real-time token output
Filters	Use `FunctionChoiceBehavior.Auto(filters=...)` to limit which plugins the LLM can call
Prompt injection	Sanitize user input before including in prompts; SK has no built-in injection protection
Memory sizing	Chunk documents into 256–512 token pieces before embedding for better retrieval
Planners	Handlebars Planner is more reliable than sequential for complex multi-step tasks
Azure integration	Use `DefaultAzureCredential` instead of API keys for production Azure deployments
Telemetry	SK integrates with OpenTelemetry; enable tracing for production observability
Versioning	SK is evolving rapidly; pin versions and review changelogs between upgrades
Enterprise	Use SK’s built-in `KernelPlugin` from OpenAPI specs to integrate REST APIs automatically