Dan Aridor · SPR{K}3 Security Research · March 2026
CVE-2026-26030 · CVSS 10.0 · CWE-94 · Responsible Disclosure

Abstract
This post documents CVE-2026-26030, a CVSS 10.0 Remote Code Execution vulnerability in Microsoft Semantic Kernel’s InMemoryVectorStore filter parser, discovered and reported on December 1, 2025. The vulnerability arises from a fundamental misuse of Python’s eval() function under the assumption that AST-based filtering provides adequate sandboxing. It does not. This report details the root cause, the bypass mechanism, the full disclosure timeline, and the broader architectural pattern that makes this class of vulnerability endemic to AI and ML infrastructure. The patch shipped in Semantic Kernel python-1.39.4 (PR #13505). CVE-2026-26030 and GHSA-xjw9-4gw8-4rqx were published February 19, 2026.
1. Background and Motivation
Over the past year, I have been conducting systematic security research across machine learning frameworks, vector databases, and LLM orchestration tooling as part of SPR{K}3 Security Research. A recurring pattern emerged across multiple codebases: developers implementing flexible query or filter interfaces reach for Python’s eval() function, apply Abstract Syntax Tree (AST) filtering to constrain what can be evaluated, and ship the result under the assumption that the AST layer constitutes a security boundary.
It does not. AST filtering constrains syntax. It cannot constrain runtime object graph traversal. This distinction is the root cause of CVE-2026-26030 and a class of vulnerability that is appearing across AI infrastructure with notable frequency.
This post is published following responsible disclosure. Microsoft was notified on December 1, 2025. The vulnerability has been patched, a CVE has been issued, and the 90-day disclosure window has passed. No exploit code targeting unpatched systems is included.
2. Vulnerability Summary
| Field | Value |
|---|---|
| CVE ID | CVE-2026-26030 |
| GHSA | GHSA-xjw9-4gw8-4rqx |
| CVSS Score | 10.0 (Critical) — Scope: Changed |
| CWE | CWE-94 |
| Affected Product | Microsoft Semantic Kernel (Python SDK) |
| Affected Component | InMemoryVectorCollection._parse_and_validate_filter |
| Affected File | python/semantic_kernel/connectors/in_memory.py |
| Fixed Version | python-1.39.4 |
| Fix PR | PR #13505 |
| Report Date | December 1, 2025 |
| CVE Published | February 19, 2026 |
| MSRC Case | 104081 (VULN-167388) |
3. Technical Analysis
3.1 The Vulnerable Pattern
Semantic Kernel’s InMemoryVectorStore provides a vector storage implementation for use in RAG (Retrieval-Augmented Generation) pipelines. The filter parser accepts user-supplied filter expressions to query stored vectors. The implementation evaluated these expressions using Python’s built-in eval() function, with an AST-based filter applied to restrict the evaluated syntax.
The design intent is clear: allow flexible filter syntax while preventing dangerous operations by examining and restricting the AST before evaluation. This is a pattern that appears intuitive but contains a fundamental flaw.
The core mistake: AST filtering constrains what syntax is permitted. It does not — and cannot — constrain what the permitted syntax accesses at runtime through Python’s object model.
3.2 Python’s Object Model and the Escape Chain
Python’s object model exposes the complete class hierarchy from any object instance. Every Python object carries a reference to its class via __class__, and every class carries references up and across the inheritance tree via __bases__ and down via __subclasses__(). This is by design — it is a fundamental feature of Python’s introspection model.
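These introspection hooks are visible from any interpreter session, with no imports required:

```python
# Every object carries a reference to its class, and every class
# exposes the inheritance tree -- standard Python introspection.
print([].__class__)            # <class 'list'>
print([].__class__.__bases__)  # (<class 'object'>,)

# object.__subclasses__() enumerates every class currently loaded;
# even a bare interpreter holds well over a hundred of them.
print(len(object.__subclasses__()) > 100)  # True
```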
The consequence for eval()-based sandboxes is severe. Consider the following expression:
```python
__class__.__bases__[0].__subclasses__()
```
This expression uses only three operations: attribute access, subscript access, and a method call. All three are syntactically benign. An AST filter examining this expression would find nothing objectionable — no import statements, no exec() calls, no os references. The expression passes.
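A simplified stand-in for such a filter makes this concrete. The sketch below is hypothetical — it is not the actual Semantic Kernel implementation, and the blocked-name set is illustrative only — but it captures the structure of the pattern: parse the expression, walk the AST, reject anything outside an allowlist of node types or matching a blocklist of names.

```python
import ast

# Hypothetical sketch of an AST-based filter; not the actual
# Semantic Kernel implementation. Names and node sets are illustrative.
BLOCKED_NAMES = {"eval", "exec", "__import__", "open", "os", "subprocess"}
ALLOWED_NODES = (
    ast.Expression, ast.Attribute, ast.Subscript,
    ast.Call, ast.Name, ast.Load, ast.Constant,
)

def naive_ast_filter(expr: str) -> bool:
    """Return True if the expression passes the (inadequate) syntax check."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            return False
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            return False
    return True

# Only attribute access, subscripting, and a call: syntactically benign.
print(naive_ast_filter("__class__.__bases__[0].__subclasses__()"))  # True
# The obvious dangerous forms are caught:
print(naive_ast_filter("__import__('os')"))  # False
```

The filter correctly rejects the obvious attacks while waving the traversal expression through, because nothing in its node or name lists matches plain introspection attributes.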
At runtime, however, the expression traverses to the base object hierarchy and enumerates every subclass currently loaded in the Python runtime. Because Python loads its standard library during a typical application run, this enumeration includes classes from os, subprocess, and other modules that provide operating system access.
3.3 From Subclass Enumeration to Code Execution
Once an attacker can enumerate loaded subclasses, the path to code execution is straightforward. Many internal Python classes store references to their containing module’s globals in __init__.__globals__. By iterating over loaded subclasses and inspecting their globals, an attacker can locate and invoke any function available to the Python runtime.
A representative traversal proceeds as follows:
```python
# Step 1: Enumerate loaded subclasses
subclasses = [].__class__.__bases__[0].__subclasses__()

# Step 2: Find a class whose globals expose os
for cls in subclasses:
    try:
        if 'os' in cls.__init__.__globals__:
            # Step 3: Execute arbitrary system command
            cls.__init__.__globals__['os'].system('id')
            break
    except AttributeError:
        pass
```
The result is full Remote Code Execution in the context of the application process, with no special privileges required beyond the ability to supply a filter expression to the affected component. Any application passing user-controlled input to InMemoryVectorStore’s filter interface was vulnerable.
3.4 Why AST Filtering Cannot Prevent This
An AST filter operates on the parsed structure of an expression before it is evaluated. A well-written AST filter can successfully block:
- Import statements (import os)
- Direct built-in calls (exec(), eval(), __import__())
- Attribute access to explicitly blacklisted names
- String-based indirect access (__builtins__['exec'])
What an AST filter cannot block is object graph traversal through attributes that are not individually blacklisted. The __class__, __bases__, and __subclasses__ attributes are not inherently dangerous names — they are standard Python introspection attributes present on every object. Blacklisting them individually is feasible but brittle: Python’s object model provides many equivalent traversal paths.
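The brittleness is easy to demonstrate: several syntactically distinct expressions reach the same base object class, so blocking any one attribute name leaves the others open. A minimal illustration:

```python
# Four equivalent routes to the base `object` class. Blacklisting
# __bases__ alone, for example, leaves __mro__ and __base__ untouched.
routes = [
    [].__class__.__bases__[0],   # via __bases__
    [].__class__.__mro__[-1],    # via the method resolution order
    type([]).__mro__[-1],        # via type() instead of __class__
    ().__class__.__base__,       # via __base__ on a different literal
]
print(all(r is object for r in routes))  # True
```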
The fundamental problem is that Python’s object model is not designed to be restricted by syntax inspection. The runtime maintains a rich, interconnected graph of objects, and any sufficiently expressive language subset will provide paths through that graph to dangerous capabilities.
The correct mental model: eval() with AST filtering is not a sandbox. It is eval() with a partial blocklist. These are not equivalent.
4. Disclosure Timeline
| Date | Event | Detail |
|---|---|---|
| Dec 1, 2025 | Vulnerability reported | Submitted to MSRC. Case 104081 / VULN-167388 opened. |
| Jan 2, 2026 | MSRC follow-up | MSRC acknowledged and requested engineering update. |
| Jan 24, 2026 | Engineering confirmation | MSRC: “We confirmed the behavior you reported.” |
| Feb 19, 2026 | CVE published | CVE-2026-26030 published. CVSS 10.0. GHSA-xjw9-4gw8-4rqx issued. Patch in PR #13505. |
| Feb 19, 2026 | Bounty decision | Case closed: OUT_OF_SCOPE_REVIEWED. No bounty awarded. |
| Mar 2026 | Reassessment request | Formal request submitted to bounty team. No response received as of publication. |
| Mar 2026 | Public disclosure | This post published following the 90-day responsible disclosure window. |
4.1 Note on Bounty Program Handling
The MSRC portal records: Security Impact — Remote Code Execution; product category — Copilot / AI + ML / LLMs; internal severity — Important; published CVSS — 10.0; case points awarded — 40. The bounty decision was OUT_OF_SCOPE_REVIEWED.
A formal reassessment request was submitted. No response has been received. I will stick to the technical facts and allow readers to draw their own conclusions.
5. The Broader Pattern: eval() Sandboxing in AI Infrastructure
5.1 Why This Pattern Keeps Appearing
CVE-2026-26030 is not an isolated mistake. The eval()-with-AST-filtering pattern is appearing across AI infrastructure for a structural reason: the ecosystem has developed a strong preference for flexible, natural-language-adjacent query and filter interfaces. Vector databases, LLM orchestration frameworks, and agent tooling all benefit from allowing users to express complex filter conditions without requiring them to write raw code.
The developer implementing this feature faces a genuine design challenge: how do you provide expressive query syntax without introducing arbitrary code execution? The natural answer — evaluate user expressions, but restrict the syntax with an AST filter — is intuitive and wrong.
5.2 Affected Infrastructure Categories
Based on ongoing research, the following categories are most likely to contain this pattern:
- Vector databases with filter expressions — High risk
- LLM orchestration frameworks with query or condition parsers — High risk
- RAG pipelines with user-configurable retrieval filters — High risk
- Agent frameworks evaluating tool selection conditions — Medium risk
- ML platforms with custom DSL or expression interfaces — Medium risk
The pattern is not limited to Python. Equivalent object graph traversal attacks exist in JavaScript (prototype chain), Ruby (ObjectSpace), and other dynamic languages.
5.3 The Diagnostic Question
If your framework evaluates user-controlled expressions at runtime, and your security boundary is an AST filter or a blocklist, you likely have this vulnerability. The question is not whether the bypass exists — it does — but whether user input can reach the evaluation point.
6. Remediation
6.1 What Does Not Work
The following mitigations are commonly proposed but insufficient:
- Expanding the AST blocklist to include __class__, __bases__, __subclasses__ — Python provides equivalent traversal paths through __mro__, __dict__, type(), and other mechanisms. Blocklists are inherently incomplete.
- Restricting eval() to a custom globals dict — Unless the runtime is completely isolated, attribute traversal from any accessible object can escape the restricted globals.
- Catching exceptions from dangerous operations — Exception handling cannot prevent the traversal itself.
6.2 What Works
- Implement a proper DSL parser — Define a grammar for your filter language and parse it explicitly. Never pass user input to eval(). This is the only fundamentally sound approach.
- Whitelist-based evaluation — Define a fixed set of allowed operations and implement them directly, without runtime evaluation. For simple filter conditions this is usually sufficient.
- Subprocess isolation — If runtime evaluation is genuinely required, evaluate expressions in a subprocess with no shared memory and strict OS-level sandboxing (seccomp, namespaces).
- Established sandboxing libraries — Libraries such as RestrictedPython provide more robust sandboxing than hand-written AST filters, though they are not guaranteed to be complete.
The simplest remediation: do not use eval() on user-controlled input. If your feature requires expression evaluation, invest in a proper parser. The security debt of eval()-based sandboxes compounds over time.
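The whitelist-based approach can be sketched briefly. The interpreter below is a hypothetical illustration, not the actual Semantic Kernel patch: it parses a filter expression with the ast module but never calls eval(), instead walking the tree and implementing only an explicit set of node types (boolean logic, comparisons, field names, constants). Everything else raises immediately.

```python
import ast
import operator

# Hypothetical whitelist interpreter for simple filter expressions
# such as "price < 100 and category == 'book'". Not the actual
# Semantic Kernel fix -- a minimal sketch of the technique.
_OPS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
}

def safe_filter(expr: str, record: dict) -> bool:
    tree = ast.parse(expr, mode="eval")

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if isinstance(node, ast.BoolOp):
            vals = [ev(v) for v in node.values]
            return all(vals) if isinstance(node.op, ast.And) else any(vals)
        if isinstance(node, ast.Compare):
            left = ev(node.left)
            for op, comp in zip(node.ops, node.comparators):
                right = ev(comp)
                if not _OPS[type(op)](left, right):
                    return False
                left = right
            return True
        if isinstance(node, ast.Name):      # field lookup, never eval()
            return record[node.id]
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError(f"disallowed syntax: {type(node).__name__}")

    return ev(tree)

print(safe_filter("price < 100 and category == 'book'",
                  {"price": 20, "category": "book"}))  # True
```

Attribute access, subscripting, and calls are simply not implemented, so the traversal chain from Section 3.2 raises ValueError instead of executing. The expressiveness is limited by design: capability the interpreter does not implement is capability an attacker cannot reach.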
7. Detection Guidance
For security engineers auditing codebases for this pattern:
- Any call to eval(), exec(), or compile() that processes user-supplied strings
- AST visitor or transformer classes co-located with eval() usage — a common signal of attempted AST-based sandboxing
- Filter, query, or expression parser functionality in vector store or retrieval components
- Custom DSL implementations that fall back to eval() for complex expressions
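The first of these checks can be sketched with the ast module itself. This is a minimal illustration of the idea, not the SPR{K}3 detection ruleset; it catches only direct calls by name and will miss aliased or indirect invocations.

```python
import ast

# Built-ins whose direct invocation on user input warrants review.
DANGEROUS = {"eval", "exec", "compile"}

def find_dynamic_eval(source: str):
    """Yield (line_number, name) for each direct eval/exec/compile call."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in DANGEROUS):
            yield node.lineno, node.func.id

sample = "result = eval(user_filter, {}, record)\n"
print(list(find_dynamic_eval(sample)))  # [(1, 'eval')]
```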
The SPR{K}3 platform detects this pattern through static AST analysis across repositories, with specific detection rules for eval()-with-AST-visitor co-location. The pattern has been observed across multiple major ML frameworks and is included in automated nightly scanning targets.
8. Conclusion
CVE-2026-26030 demonstrates a vulnerability class structurally embedded in the current generation of AI infrastructure. The combination of pressure toward flexible user interfaces, dynamic language runtimes, and the intuitive-but-incorrect belief that AST filtering provides sandboxing has produced a repeating pattern of exploitable eval() usage across the ecosystem.
The fix for any individual instance is straightforward: replace eval() with a proper parser or whitelist-based evaluator. The fix for the ecosystem requires raising awareness that AST-filtered eval() is not a sandbox — it never was — and that this assumption should not be carried into new AI framework implementations.
If you maintain an ML framework, vector database, LLM tool, or agent framework with a filter or query interface — audit your eval() usage now. The bypass chain is well-known. The attack surface across AI infrastructure is not.
References
- CVE-2026-26030 — nvd.nist.gov
- GHSA-xjw9-4gw8-4rqx — github.com/advisories
- Semantic Kernel PR #13505 — github.com/microsoft/semantic-kernel
- MSRC Case 104081 / VULN-167388
- CWE-94: Improper Control of Generation of Code — cwe.mitre.org
Dan Aridor is the founder of SPR{K}3 Security Research, focused on vulnerability discovery in AI infrastructure, ML frameworks, and supply chain security. Confirmed vulnerabilities across Meta, Microsoft, Google, NVIDIA, and Amazon.
daridor@sprk3.com · github.com/daridor9
