Red Hornet | Dumbpipe

We demonstrated that OpenAI API infrastructure can be abused as a two-way communications backdoor for malware. In our test, a compromised Windows host received commands and returned results through the vision API's image_url workflow without ever connecting directly to attacker infrastructure. In many enterprise environments, the traffic would likely be treated as legitimate and allowed.

Enterprise networks that allow access to AI APIs (ChatGPT, OpenAI API, Copilot) are potentially exposed to C2 channels that bypass traditional network detection, because the traffic is difficult to distinguish from legitimate AI usage.

This research was informed by earlier public reporting, including Check Point Research's work on abusing web-based AI services as C2 proxies, and Microsoft's SesameOp report describing a backdoor that used the OpenAI Assistants API for command and control.

What Dumbpipe Proves

Conventional Command-and-Control (C2) detection often focuses on malware communicating with suspicious domains, unusual IP space, or known beaconing infrastructure. In this case, the host does not call the server. The model provider does.

On the endpoint, network telemetry looks like a binary using a legitimate AI API. On the server side, access logs show cloud IP space belonging to major AI infrastructure. This makes detection challenging for both the AI provider and a compromised organisation's security team.

In a controlled lab, we built a Windows agent and paired it with a simple remote controller. Commands were rendered as an image on the C2 server. The agent asked a vision-capable model to read that image and return the text. Command output then went back the other way by embedding data inside URLs handed to the same vision workflow.

The channel worked without the agent ever opening a direct connection to the C2 host. Provider infrastructure fetched the remote content on the agent's behalf in both directions: commands were transcribed from the image and executed by the agent, while output chunks were recorded, reassembled, and decoded on the C2 server.

How It Works

Architecture Overview

Figure: High-level Dumbpipe flow. The agent, OpenAI API, C2 server, and operator exchange commands through provider-side image fetches, so the AI backend becomes the transport broker between agent and C2.

Demonstration

Command Delivery (C2 → Agent)

The operator posts a PowerShell command to the C2 server's /instruction endpoint.
The C2 server renders the command as a white-background PNG with black monospace text at /instruction.png.
The agent sends a vision API request to OpenAI: "Read and transcribe exactly what text is shown in this image" with image_url: http://C2/instruction.png.
OpenAI's backend (running on Azure/Google infrastructure) fetches the PNG from the C2 server, passes it to GPT-4o-mini, and returns the transcribed text to the agent.
The agent executes the transcribed text as a PowerShell command.

Figure: Example instruction image published by the C2 server. It contains a PowerShell command that the provider backend fetches and the vision model transcribes for the agent.

Result Exfiltration (Agent → C2)

The agent chunks the command output, Base64-encodes it, and constructs URLs like:
http://C2/comms/<session>/<index>/<total>/<chunk>.jpg
The agent sends another vision API request: "What do you see in this image?" with image_url: <that URL>.
The AI provider's backend makes an HTTP GET to the URL to "fetch the image." The data is already embedded in the URL path.
The API returns an error (400 Unsupported image format) because the fetched content is not a valid image, but the GET has already occurred, and the C2 server has recorded the chunk.
The C2 server reassembles chunks into the full output, which the controller polls and displays.

The image_url field in OpenAI's vision API specification requires the provider's backend to make an outbound HTTP GET to the specified URL before passing any image data to the model. Even when the model rejects the "image," the network transmission has already happened.
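For defenders who can see decrypted request bodies (for example at a TLS-inspecting proxy or an internal API gateway, which is an assumption about the environment), a minimal sketch of where these URLs live in a chat completion request might look like the following. The function name remote_image_urls is hypothetical. The JSON shape follows the publicly documented image input format, in which a message's content is a list of typed parts.

```python
from urllib.parse import urlparse


def remote_image_urls(request_body: dict) -> list[str]:
    """Return every http(s) image_url found in a chat completion request body."""
    urls = []
    for message in request_body.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            # Plain-string content carries no image parts.
            continue
        for part in content:
            if not isinstance(part, dict) or part.get("type") != "image_url":
                continue
            image = part.get("image_url")
            url = image.get("url") if isinstance(image, dict) else None
            # Only remote URLs trigger a provider-side fetch; inline data: URIs do not.
            if isinstance(url, str) and urlparse(url).scheme in ("http", "https"):
                urls.append(url)
    return urls
```

Inline data: URIs are skipped deliberately, since they give the provider nothing to fetch. Only remote URLs are relevant to the channel described here.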

Expanding AI Pipelines

AI traffic is becoming normal business egress in many environments. ChatGPT, Copilot, OpenAI, and other AI services are increasingly sanctioned destinations. A steady stream of HTTPS requests to an approved AI endpoint looks less suspicious than a host reaching a newly registered VPS at fixed intervals.

The technique shows that "allowed AI use" and "safe AI use" are not the same question. If an organisation permits direct access to external AI services, it also creates a potential covert channel through approved SaaS infrastructure.

Why This Bypasses Standard Controls

1. No Direct Agent-to-C2 Network Contact

The agent never opens a TCP connection to the C2 server. All outbound connections are HTTPS to api.openai.com. Firewalls, IDS, and netflow monitoring see only "user accessing an AI API", not "compromised host beaconing to a C2 server."

2. Egress Looks Like Normal Business Traffic

Enterprise AI adoption is widespread. ChatGPT, Microsoft Copilot, GitHub Copilot, and OpenAI are commonly whitelisted. The agent's traffic is:

Protocol: HTTPS to well-known AI API domains
Payload: Standard JSON chat completion requests (no encoded shellcode, no steganography)
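Response: Normal HTTP 200 responses from the AI API for command retrieval (exfiltration requests return HTTP 400, as described under Technical Details)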

3. C2 Logs Show AI Provider IPs, Not the Agent's

The C2 server logs show requests from Azure (20.x.x.x, 52.x.x.x) and Google (68.x.x.x, 135.x.x.x) IP ranges, the backend infrastructure of OpenAI and Google. The agent's true IP address is never exposed to the C2 server. This breaks IP-based attribution and geo-location for defenders analysing C2 access logs.

4. DLP and Content Inspection Don't Catch It

Data Loss Prevention tools typically inspect HTTP bodies, email attachments, and file uploads for sensitive content. They generally do not parse the image_url strings inside AI API request JSON. The exfiltrated data travels as encoded text in a URL path nested within that JSON, so it does not resemble the documents, attachments, or uploads that traditional inspection targets.

5. EDR Sees Only a Go Binary Talking to OpenAI

Endpoint Detection and Response solutions that monitor for suspicious network connections will see agent.exe making HTTPS requests to api.openai.com. Unless the EDR has a specific detection for "repeated vision API calls with external image URLs pointing to the same domain," this activity blends into the noise of normal AI tool usage.
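As a rough illustration of the detection described above, the sketch below counts how often each process references the same external image host. It assumes telemetry is already available as (process name, image URL) pairs, which many environments will not have without request-body visibility. The function name and the threshold of 20 are hypothetical and would need tuning against real baselines.

```python
from collections import Counter
from urllib.parse import urlparse


def repeated_image_hosts(events, threshold=20):
    """Count image_url hosts per process.

    events: iterable of (process_name, image_url) pairs.
    Returns {(process_name, host): count} for pairs seen at least `threshold` times.
    """
    counts = Counter()
    for process, url in events:
        host = urlparse(url).hostname  # already lower-cased by urlparse
        if host:
            counts[(process, host)] += 1
    return {key: n for key, n in counts.items() if n >= threshold}
```

A browser fetching one image from a content host would stay below the threshold, while a process issuing dozens of vision requests against a single external host would surface.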

Detection Challenges

Control Layer	Traditional Detection	Why It Fails Here
Firewall / IDS	Block/alert on C2 domains/IPs	Agent only talks to OpenAI; C2 is contacted by Azure/Google IPs
DLP	Inspect HTTP bodies for sensitive data	Data is encoded in URL paths nested inside JSON chat requests, not in files or attachments
EDR	Flag processes connecting to suspicious IPs	Connection is to a legitimate AI API
Proxy Logs	URL categorisation blocks "malware"	api.openai.com is categorised as "Business/Technology"
Sandbox / Detonation	Network traffic analysis in sandbox	Sandbox may not flag OpenAI API traffic; requires custom sigs

Detection Opportunities

Network Layer

Volume analysis: A process making hundreds of vision API requests per hour with external image_url values is suspicious. Normal chat usage is conversational, not robotic.
URL patterns: Repeated image_url values pointing to the same external host with encoded path segments (Base64-like strings) could indicate exfiltration.
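The URL-pattern idea can be sketched as a simple heuristic that flags path segments which look like Base64 and decode cleanly. This is an illustrative sketch only. The minimum length of 16 characters, the acceptance of both standard and URL-safe alphabets, and the function name suspicious_segments are assumptions, and long alphanumeric filenames can produce false positives.

```python
import base64
import re
from urllib.parse import urlparse

# Assumed heuristic: 16+ characters from the Base64 alphabets, optional padding.
B64_SEGMENT = re.compile(r"^[A-Za-z0-9+_-]{16,}={0,2}$")


def suspicious_segments(url: str) -> list[str]:
    """Return path segments that look like Base64 and decode without error."""
    hits = []
    for segment in urlparse(url).path.split("/"):
        stem = segment.rsplit(".", 1)[0]  # drop a trailing extension such as .jpg
        if not B64_SEGMENT.match(stem):
            continue
        padded = stem + "=" * (-len(stem) % 4)  # restore stripped padding
        try:
            base64.b64decode(padded, altchars=b"-_", validate=True)
        except ValueError:
            continue
        hits.append(stem)
    return hits
```

In practice this would be one signal among several, combined with request volume and host repetition rather than used alone.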

Technical Details & Interesting Findings

Model Testing

We tested multiple models and APIs. perplexity/sonar could only search the web, not fetch arbitrary URLs. google/gemini-3.1-flash-lite-preview refused private IPs and hallucinated on gist URLs. OpenAI vision models consistently caused the provider backend to hit our private VPS URLs via image_url.

Limited Antivirus Detection

At the time of testing, we submitted the proof-of-concept file to VirusTotal. Only 3 of 71 security vendors flagged it as potentially malicious. It was not detected by the Microsoft, Google, or Palo Alto engines. While this is only a point-in-time result, it reinforces the broader theme of this research: activity using trusted AI infrastructure may not be readily identified as malicious by conventional controls.

HTTP 400 Still Delivers the Payload

During exfiltration, the OpenAI-facing request returns HTTP 400 with the message "Unsupported image format" or "Invalid image data". This is expected. The model rejects the fake "image," but the provider's backend already made the GET request and transmitted the data. The agent treats any response as success.
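One defensive consequence, offered here as a suggestion rather than a tested detection, is that a client whose vision requests almost always fail with HTTP 400 may stand out from legitimate use. The sketch below assumes records of (client identifier, HTTP status) for vision requests are available from proxy or provider usage logs. The function name and both thresholds are hypothetical.

```python
from collections import defaultdict


def failing_vision_clients(records, min_requests=10, min_ratio=0.8):
    """Find clients whose vision requests mostly return HTTP 400.

    records: iterable of (client_id, http_status) pairs.
    Returns {client_id: ratio_of_400s} for clients meeting both thresholds.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for client, status in records:
        totals[client] += 1
        if status == 400:
            errors[client] += 1
    return {
        client: errors[client] / totals[client]
        for client in totals
        if totals[client] >= min_requests
        and errors[client] / totals[client] >= min_ratio
    }
```

The minimum request count avoids flagging a client that simply made one or two malformed requests.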

PNG Rendering Was Reliable

We render commands as white-background PNGs with black monospace text using Python PIL. GPT-4o-mini transcribes these reliably, even for PowerShell syntax with backslashes, quotes, and variable expansion. The only failure mode was very long commands that wrapped awkwardly, which we solved by wrapping text at 60 characters.

Implications for Defenders

AI APIs create some of the same monitoring problems as older covert-channel techniques over DNS. The difference is that AI APIs are authenticated, rate-limited, and heavily used, which can make suspicious traffic harder to separate from routine activity.

Assume AI egress is a risk: If your organisation allows ChatGPT, OpenAI API access, or Copilot, you have a potential C2 channel. Treat AI API access with the same scrutiny as any other outbound internet access.
Log and audit AI API usage: If you use OpenAI or Azure OpenAI, export usage logs. Look for patterns: high volume from a single API key, repeated vision requests with external URLs, or requests from non-browser user agents.
EDR rules: Create custom detections for processes (not browsers) making repeated HTTPS requests to OpenAI with vision model parameters and external image URLs.
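The volume-based auditing suggested above can be sketched as an hourly bucket count per API key. The input format (key identifier plus an ISO 8601 timestamp without a timezone suffix), the function name hourly_bursts, and the limit of 100 requests per hour are all assumptions for illustration. Real usage exports will differ and the limit should be tuned to local baselines.

```python
from collections import Counter
from datetime import datetime


def hourly_bursts(events, limit=100):
    """Bucket vision requests per API key per hour.

    events: iterable of (api_key_id, iso_timestamp) pairs,
            e.g. ("key-a", "2025-01-01T10:05:00").
    Returns {(api_key_id, hour_start_iso): count} for buckets above `limit`.
    """
    buckets = Counter()
    for key, timestamp in events:
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[(key, hour.isoformat())] += 1
    return {bucket: n for bucket, n in buckets.items() if n > limit}
```

Fixed hourly buckets are a simple starting point. A sliding window would catch bursts that straddle an hour boundary, at the cost of slightly more bookkeeping.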

Ethical & Research Context

This project was built as a proof-of-concept for defensive awareness. The goal is to help security teams understand how AI APIs can be abused as covert channels and to improve detection capabilities before these techniques appear in the wild.

This issue was submitted to OpenAI's bug bounty program before publication and was assessed as an informational finding.

AI platforms are part of the attack surface. They are not only productivity tools; they can also fetch remote content, relay data, and act as internet-facing intermediaries.