Semantic Kernel (SK) is Microsoft’s open-source AI orchestration SDK. It bridges AI models with conventional programming by providing a structured way to define AI skills (plugins), orchestrate multi-step AI tasks (planners), manage memory, and integrate with enterprise services. It is available for .NET, Python, and Java.
GitHub: https://github.com/microsoft/semantic-kernel
Docs: https://learn.microsoft.com/en-us/semantic-kernel/
PyPI: https://pypi.org/project/semantic-kernel
NuGet: https://www.nuget.org/packages/Microsoft.SemanticKernel
Installation
Python
# Core SDK
pip install semantic-kernel
# With Azure OpenAI
pip install "semantic-kernel[azure]"
# With Hugging Face
pip install "semantic-kernel[hugging_face]"
# With vector store backends
pip install "semantic-kernel[chroma]"
pip install "semantic-kernel[qdrant]"
pip install "semantic-kernel[azure_ai_search]"
pip install "semantic-kernel[weaviate]"
# Full install
pip install "semantic-kernel[all]"
.NET
# Core SDK
dotnet add package Microsoft.SemanticKernel
# Azure OpenAI connector
dotnet add package Microsoft.SemanticKernel.Connectors.AzureOpenAI
# OpenAI connector
dotnet add package Microsoft.SemanticKernel.Connectors.OpenAI
# Memory (vector store)
dotnet add package Microsoft.SemanticKernel.Plugins.Memory
dotnet add package Microsoft.SemanticKernel.Connectors.Chroma
Java
<!-- Maven -->
<dependency>
    <groupId>com.microsoft.semantic-kernel</groupId>
    <artifactId>semantickernel-api</artifactId>
    <version>1.x.x</version>
</dependency>
Configuration
Python Kernel Setup
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, AzureChatCompletion
from semantic_kernel.connectors.ai.open_ai import OpenAITextEmbedding
# Create kernel
kernel = sk.Kernel()
# Add OpenAI chat service
kernel.add_service(
    OpenAIChatCompletion(
        ai_model_id="gpt-4o",
        api_key="sk-...",
        service_id="chat",
    )
)
# Add Azure OpenAI chat service
kernel.add_service(
    AzureChatCompletion(
        deployment_name="gpt-4o",
        endpoint="https://myinstance.openai.azure.com/",
        api_key="xxxxxxxxxxxxxxxx",
        service_id="azure-chat",
    )
)
# Add embeddings service
kernel.add_service(
    OpenAITextEmbedding(
        ai_model_id="text-embedding-3-small",
        api_key="sk-...",
        service_id="embeddings",
    )
)
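The setup above hard-codes placeholder keys for brevity. A minimal sketch of reading them from the environment instead, using only the standard library, might look like this. The helper name `require_env` and the variable name `OPENAI_API_KEY` are assumptions for illustration and are not part of SK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (hypothetical variable name):
# api_key = require_env("OPENAI_API_KEY")
```

The returned string can then be passed as `api_key=` when constructing a service, so the key never appears in source control.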
.NET Kernel Setup
using Microsoft.SemanticKernel;
// Build kernel with OpenAI
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion("gpt-4o", "sk-...")
    .Build();
// Build with Azure OpenAI (separate variable so both snippets compile in one scope)
var azureKernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o",
        endpoint: "https://myinstance.openai.azure.com/",
        apiKey: "xxxxxxxxxxxxxxxx"
    )
    .Build();
Core API
Kernel Components
| Component | Description |
|---|---|
| Kernel | Central orchestrator; holds services, plugins, and settings |
| KernelPlugin | Collection of related KernelFunctions |
| KernelFunction | A callable unit: either an LLM prompt or native code |
| KernelArguments | Named parameters passed to functions |
| ChatHistory | Manages conversation context |
| PromptTemplateConfig | Settings for prompt rendering and LLM invocation |
| FunctionChoiceBehavior | Controls automatic/manual tool call behavior |
Plugin Types
| Type | Description |
|---|---|
| Native Plugin | Python/C# class with @kernel_function decorated methods |
| Prompt Plugin | YAML/text templates loaded from directory |
| OpenAPI Plugin | Auto-generated from OpenAPI spec |
| Core Plugins | Built-ins: TextPlugin, TimePlugin, MathPlugin, HttpPlugin |
Advanced Usage
Defining Native Plugins (Python)
import math

from semantic_kernel.functions import kernel_function


class MathPlugin:
    @kernel_function(name="add", description="Add two numbers together")
    def add(self, number1: float, number2: float) -> float:
        """Adds number1 and number2."""
        return number1 + number2

    @kernel_function(name="square_root", description="Calculate square root of a number")
    def square_root(self, number: float) -> float:
        """Returns the square root."""
        return math.sqrt(number)


class WebSearchPlugin:
    def __init__(self, search_client):
        # search_client is any object exposing an async search(query) method
        self.client = search_client

    @kernel_function(name="search", description="Search the web for information")
    async def search(self, query: str) -> str:
        """Performs web search and returns the top three snippets."""
        results = await self.client.search(query)
        return "\n".join(r.snippet for r in results[:3])


# Register plugins (client is your own search client instance, created elsewhere)
kernel.add_plugin(MathPlugin(), plugin_name="math")
kernel.add_plugin(WebSearchPlugin(client), plugin_name="web")
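The name and description given to the decorator are what the model sees when it decides which function to call. The toy below is not SK's implementation. It only shows the general idea, using a hypothetical `tool` decorator and `describe` helper that turn a method's metadata and type hints into a schema-like dictionary.

```python
import inspect
from typing import Any, Callable


def tool(name: str, description: str) -> Callable:
    """Hypothetical stand-in for a function decorator: attaches metadata."""
    def wrap(fn: Callable) -> Callable:
        fn.tool_name = name
        fn.tool_description = description
        return fn
    return wrap


def describe(fn: Callable) -> dict[str, Any]:
    """Build a schema-like dict from the attached metadata and type hints."""
    params = {
        p.name: getattr(p.annotation, "__name__", str(p.annotation))
        for p in inspect.signature(fn).parameters.values()
        if p.name != "self"
    }
    return {"name": fn.tool_name, "description": fn.tool_description, "parameters": params}


class MathTools:
    @tool(name="add", description="Add two numbers together")
    def add(self, number1: float, number2: float) -> float:
        return number1 + number2


schema = describe(MathTools.add)
print(schema)
```

Because parameter names and types end up in the description the model receives, explicit type hints and precise descriptions tend to matter more than they would in ordinary code.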
Prompt Functions
from semantic_kernel.functions import KernelArguments, KernelFunctionFromPrompt

# Inline prompt function
summarize_fn = KernelFunctionFromPrompt(
    function_name="summarize",
    plugin_name="text_tools",
    prompt="""Summarize the following text in {{$max_sentences}} sentences:
{{$input}}
Summary:""",
    template_format="semantic-kernel",
)
kernel.add_function(plugin_name="text_tools", function=summarize_fn)

# Invoke the function
result = await kernel.invoke(
    summarize_fn,
    KernelArguments(input="Long article text here...", max_sentences=3),
)
print(result)
# plugins/Summarizer/Summarize/config.json
{
  "schema": 1,
  "name": "Summarize",
  "description": "Summarizes text in a given number of sentences",
  "input_variables": [
    {"name": "input", "description": "Text to summarize"},
    {"name": "max_sentences", "description": "Max sentences", "default": "3"}
  ],
  "execution_settings": {
    "default": {
      "max_tokens": 256,
      "temperature": 0.3
    }
  }
}
# plugins/Summarizer/Summarize/skprompt.txt
Summarize the following in {{$max_sentences}} sentences:
{{$input}}
Summary:
# Load from directory
plugin = kernel.add_plugin(parent_directory="./plugins", plugin_name="Summarizer")
result = await kernel.invoke(plugin["Summarize"], KernelArguments(input="..."))
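To make the role of the template variables and their defaults concrete, here is a toy renderer for `{{$name}}` placeholders. It is a simplified illustration and not SK's template engine, which also supports function calls inside templates. `render` and its arguments are hypothetical names.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\$(\w+)\}\}")


def render(template: str, arguments: dict[str, str], defaults: dict[str, str]) -> str:
    """Replace each {{$name}} with the caller's value, falling back to a default."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in arguments:
            return str(arguments[name])
        if name in defaults:
            return str(defaults[name])
        raise KeyError(f"No value for template variable '{name}'")
    return PLACEHOLDER.sub(substitute, template)


template = "Summarize the following in {{$max_sentences}} sentences:\n{{$input}}\nSummary:"
print(render(template, {"input": "Some text."}, {"max_sentences": "3"}))
```

In this sketch, calling the function with only `input` works because `max_sentences` has a default, which mirrors the invocation above that passes a single argument.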
from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    OpenAIChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents.chat_history import ChatHistory

# Auto function calling: the kernel automatically executes tool calls
settings = OpenAIChatPromptExecutionSettings(
    function_choice_behavior=FunctionChoiceBehavior.Auto(
        filters={"included_plugins": ["math", "web"]}
    ),
    max_tokens=1024,
)
history = ChatHistory()
history.add_user_message("What is the square root of 144, and search for latest AI news?")
chat_service = kernel.get_service(type=OpenAIChatCompletion)
result = await chat_service.get_chat_message_content(
    chat_history=history,
    settings=settings,
    kernel=kernel,
)
print(result)
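With automatic function choice, the tool calls requested by the model are executed and their results are fed back before the final answer is produced. The sketch below imitates only the dispatch step in plain Python. The `registry` layout, the `plugin-function` naming, and the shape of `tool_calls` are assumptions for illustration and do not mirror SK's internal types.

```python
import math
from typing import Any, Callable

# Hypothetical registry keyed by "plugin-function" names.
registry: dict[str, Callable[..., Any]] = {
    "math-add": lambda number1, number2: number1 + number2,
    "math-square_root": lambda number: math.sqrt(number),
}


def dispatch(tool_calls: list[dict[str, Any]], allowed_plugins: set[str]) -> list[str]:
    """Run each requested call if its plugin is allowed; return results as text."""
    results = []
    for call in tool_calls:
        plugin = call["name"].split("-", 1)[0]
        if plugin not in allowed_plugins or call["name"] not in registry:
            results.append(f"error: {call['name']} is not available")
            continue
        results.append(str(registry[call["name"]](**call["arguments"])))
    return results


# A model response requesting one allowed call and one filtered-out call.
requested = [
    {"name": "math-square_root", "arguments": {"number": 144}},
    {"name": "web-search", "arguments": {"query": "latest AI news"}},
]
print(dispatch(requested, allowed_plugins={"math"}))
```

Restricting the allowed plugins, as the `filters` argument does above, limits what the model can trigger even if it asks for something else.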
Memory and Vector Stores
from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory
from semantic_kernel.connectors.memory.chroma import ChromaMemoryStore
# Setup memory store
memory_store = ChromaMemoryStore(persist_directory="/data/chroma")
memory = SemanticTextMemory(
    storage=memory_store,
    embeddings_generator=kernel.get_service(service_id="embeddings"),
)
# Save memories
await memory.save_information(
    collection="company_docs",
    id="doc_001",
    text="AcmeCorp was founded in 2010 in San Francisco.",
    description="Company founding information",
)
# Search memories
results = await memory.search(
    collection="company_docs",
    query="When was the company founded?",
    limit=3,
    min_relevance_score=0.7,
)
for r in results:
    print(f"[{r.relevance:.2f}] {r.description}: {r.text[:80]}")
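The `min_relevance_score` cutoff is applied to a similarity score between the query embedding and each stored embedding. As a rough illustration, the sketch below uses cosine similarity, which is a common choice, although the exact metric depends on the store. The vectors are made up for the example and are not real embeddings, and this local `search` function is a stand-in, not SK's.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def search(query_vec, records, limit=3, min_relevance_score=0.7):
    """Score every record, drop those under the cutoff, return the best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in records]
    kept = [pair for pair in scored if pair[0] >= min_relevance_score]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)[:limit]


# Toy data: vectors are invented for the example.
records = [
    ("AcmeCorp was founded in 2010.", [0.9, 0.1, 0.0]),
    ("The cafeteria opens at 8 am.", [0.0, 0.2, 0.9]),
]
for score, text in search([1.0, 0.0, 0.0], records):
    print(f"[{score:.2f}] {text}")
```

Only the first record clears the 0.7 cutoff here. Raising the threshold trades recall for precision, so it is worth tuning against real queries.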
Planners (Automatic Multi-Step Execution)
# Handlebars Planner: generates a Handlebars template to solve a goal
# Planner support varies by SK version and language; confirm this import
# resolves in your installed release before relying on it.
from semantic_kernel.planners.handlebars_planner import HandlebarsPlanner, HandlebarsPlannerOptions
planner = HandlebarsPlanner(
    kernel,
    options=HandlebarsPlannerOptions(allow_loops=True, max_tokens=2048),
)
# Generate a plan
plan = await planner.create_plan("Search for Python tips and summarize the top 3 results.")
print(plan.template)  # View the generated Handlebars plan
# Execute the plan
result = await plan.invoke(kernel)
print(result)
.NET — Complete Example
using System.ComponentModel;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion("gpt-4o", Environment.GetEnvironmentVariable("OPENAI_API_KEY"))
    .Build();
kernel.Plugins.AddFromType<WeatherPlugin>();

// Auto function calling
var settings = new OpenAIPromptExecutionSettings
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto()
};
var chatService = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory("You are a helpful weather assistant.");
history.AddUserMessage("What is the weather in Seattle?");
var response = await chatService.GetChatMessageContentAsync(history, settings, kernel);
Console.WriteLine(response);

// Native plugin (in a top-level program, type declarations must follow the statements)
public class WeatherPlugin
{
    [KernelFunction("get_weather")]
    [Description("Get current weather for a city")]
    public Task<string> GetWeatherAsync(
        [Description("City name")] string city,
        Kernel kernel) // injected by SK, not exposed to the model
    {
        // Placeholder result; call a real weather API here
        return Task.FromResult($"Weather in {city}: 72°F, Sunny");
    }
}
Common Workflows
RAG with Memory
async def answer_with_context(question: str, collection: str) -> str:
    # Retrieve relevant memories
    context_results = await memory.search(collection, question, limit=5)
    context = "\n".join([r.text for r in context_results])
    # Answer using context
    prompt = f"""Answer the question using only the provided context.
Context:
{context}
Question: {question}
Answer:"""
    result = await kernel.invoke_prompt(prompt)
    return str(result)
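The function above joins every retrieved snippet into the prompt. One possible refinement, sketched here with plain Python, is to cap the context size and handle the case where nothing was retrieved. `build_rag_prompt`, the character budget, and the fallback text are assumptions for illustration, not SK features.

```python
def build_rag_prompt(question: str, snippets: list[str], max_chars: int = 2000) -> str:
    """Join retrieved snippets into a context block, stopping at a character budget."""
    kept, used = [], 0
    for snippet in snippets:
        if used + len(snippet) > max_chars:
            break
        kept.append(snippet)
        used += len(snippet)
    context = "\n".join(kept) if kept else "(no relevant context found)"
    return (
        "Answer the question using only the provided context.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


print(build_rag_prompt("When was the company founded?", ["AcmeCorp was founded in 2010."]))
```

Since snippets arrive ordered by relevance, stopping at the first one that would exceed the budget keeps the most relevant material.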
Chat with Function Calling
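from semantic_kernel.connectors.ai.function_choice_behavior import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import OpenAIChatPromptExecutionSettings
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.functions import KernelArguments


async def interactive_chat():
    history = ChatHistory()
    history.add_system_message("You are a helpful assistant with math and search capabilities.")
    # Let the model call registered plugin functions automatically
    settings = OpenAIChatPromptExecutionSettings(
        function_choice_behavior=FunctionChoiceBehavior.Auto()
    )
    while True:
        user_input = input("User: ")
        if user_input.lower() == "exit":
            break
        history.add_user_message(user_input)
        response = await kernel.invoke_prompt(
            "{{$history}}",
            arguments=KernelArguments(settings=settings, history=history),
        )
        print(f"Assistant: {response}")
        history.add_assistant_message(str(response))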
Tips and Best Practices
| Topic | Recommendation |
|---|---|
| Service ID | Always assign service_id when adding multiple LLM services; reference by ID in settings |
| Plugin naming | Use descriptive function names and description= — these go into tool schemas |
| Async | Prefer await kernel.invoke() pattern; SK is async-first in Python |
| Streaming | Use kernel.invoke_stream() for real-time token output |
| Filters | Use FunctionChoiceBehavior.Auto(filters=...) to limit which plugins the LLM can call |
| Prompt injection | Sanitize user input before including in prompts; SK has no built-in injection protection |
| Memory sizing | Chunk documents into 256–512 token pieces before embedding for better retrieval |
| Planners | Handlebars Planner is more reliable than sequential for complex multi-step tasks |
| Azure integration | Use DefaultAzureCredential instead of API keys for production Azure deployments |
| Telemetry | SK integrates with OpenTelemetry; enable tracing for production observability |
| Versioning | SK is evolving rapidly; pin versions and review changelogs between upgrades |
| Enterprise | Use SK’s built-in KernelPlugin from OpenAPI specs to integrate REST APIs automatically |
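The memory sizing tip can be sketched as a small chunking helper. This version counts whitespace-separated words as a rough stand-in for tokens, and the overlap between neighbouring chunks is an added assumption rather than something the table prescribes. A real pipeline would count tokens with the tokenizer of the embedding model.

```python
def chunk_words(text: str, chunk_size: int = 256, overlap: int = 32) -> list[str]:
    """Split text into chunks of at most chunk_size words, sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
print(chunk_words(sample, chunk_size=4, overlap=1))
```

Each chunk can then be stored as its own record, for example with an id that combines the document id and the chunk index.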