Researcher exposes Microsoft's flawed code that lets attackers access files on your computer

Back in May this year, when Microsoft held its Build 2025 conference, it introduced NLWeb, short for "Natural Language Web," a project that was meant to be a way for AI agents to interact with websites.

With companies like Shopify and TripAdvisor signed on, Microsoft pitched this as the foundation for an "agentic web," a future where AI agents handle complex tasks by talking directly to online services.

Now, security researchers Aonan Guan and Lei Wang (via The Verge) have documented a path traversal vulnerability in the open-source framework. According to Guan, they were browsing the NLWeb GitHub repo when a specific file caught their attention: webserver/static_file_handler.py.

    # The vulnerable code snippet
    safe_path = os.path.normpath(path.lstrip("/"))

    possible_roots = [
        APP_ROOT,
        os.path.join(APP_ROOT, "site", "wwwroot"),
        "/home/site/wwwroot",
        os.environ.get("HOME", ""),
    ]

    # Later in the code...
    full_path = os.path.join(root, safe_path)

The first line of the snippet looks innocent enough. According to the official Python documentation, os.path.normpath() normalizes a pathname by collapsing redundant separators and up-level references.

For example, a path like A//B/./C/../D is normalized to A/B/D, and on Windows the function also converts forward slashes (/) to backslashes (\). But as Guan notes, normalizing a path is not the same as confining it. The function leaves leading ../ sequences in place, so it does not actually prevent a user from "climbing" out of the intended web directory.
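To make that behavior concrete, here is a minimal sketch. It uses posixpath (the POSIX flavor of os.path) so the results are the same on any platform, and the web root /srv/app/static is a hypothetical path chosen for illustration, not one taken from NLWeb.

```python
import posixpath

# Interior "up-level" references are collapsed as documented.
collapsed = posixpath.normpath("A//B/./C/../D")
print(collapsed)  # A/B/D

# Leading "../" segments have nothing to collapse into, so they survive.
untouched = posixpath.normpath("../../../etc/passwd")
print(untouched)  # ../../../etc/passwd

# Hypothetical web root, assumed only for this example.
web_root = "/srv/app/static"

# Joining the "normalized" path onto the root walks straight out of it.
escaped = posixpath.normpath(posixpath.join(web_root, untouched))
print(escaped)  # /etc/passwd
```

In other words, normpath() tidies a path but makes no promise about where that path ends up.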

Guan confirmed his suspicions when he set up a local server listening on 0.0.0.0:8000, a standard testing configuration. When he ran curl "http://localhost:8000/static/..%2f..%2f..%2fetc/passwd", the server happily returned the contents of /etc/passwd.
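The %2f in that request is simply a percent-encoded forward slash. The sketch below shows how such a path could turn into a traversal, assuming the server percent-decodes the request path before the handler sees it and that the /static/ prefix is still attached at that point. Both are assumptions for illustration, as is the application root /srv/app.

```python
import posixpath
from urllib.parse import unquote

# The path portion of the researcher's curl request.
request_path = "/static/..%2f..%2f..%2fetc/passwd"

# Assumption: the server percent-decodes the path first.
decoded = unquote(request_path)
print(decoded)  # /static/../../../etc/passwd

# The same two calls as the vulnerable snippet.
safe_path = posixpath.normpath(decoded.lstrip("/"))
print(safe_path)  # ../../etc/passwd

# Hypothetical application root, two directories deep.
app_root = "/srv/app"
full_path = posixpath.join(app_root, safe_path)
resolved = posixpath.normpath(full_path)
print(resolved)  # /etc/passwd
```

How many ../ segments are needed in practice depends on how deep the real root directory sits in the filesystem.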


In case you are unaware, /etc/passwd is the user account database on Unix-like systems such as Linux and macOS. It maps usernames to user IDs and other account information.


The researcher could also access files within the app's source code, including the project's .env file (which should never be publicly exposed, as it contains secrets like API keys), when he ran curl "http://localhost:8000/static/..%2f..%2f..%2fUsers//NLWeb/code/.env".

The code failed both to sanitize user input properly and to validate that the final, resolved file path was actually inside the designated web root. Guan reported the vulnerability on May 28th. Microsoft acknowledged the report the same day and issued a fix two days later.

The company's fix first checks for .. in the raw path to block basic directory traversal attempts. After that, it checks if the requested file ends with one of the allowed extensions: .html, .htm, .css, .js, .png, .jpg, .jpeg, .gif, .svg, .ico, .json, .txt, or .xml.

Finally, and most importantly, it resolves the full, absolute path of the requested file and confirms it resides within one of the approved root directories, preventing any escape.
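The three checks described above can be sketched roughly as follows. This is an illustration of the technique, not Microsoft's actual patch: the function name resolve_static_path and its signature are hypothetical, and only the list of extensions comes from the fix as described.

```python
import os

# Extensions listed in the description of the fix.
ALLOWED_EXTENSIONS = {
    ".html", ".htm", ".css", ".js", ".png", ".jpg", ".jpeg",
    ".gif", ".svg", ".ico", ".json", ".txt", ".xml",
}


def resolve_static_path(raw_path, roots):
    """Hypothetical helper: return a path inside one of roots, or None."""
    # 1. Reject anything containing ".." in the raw path.
    if ".." in raw_path:
        return None

    # 2. Serve only files with an allowed extension.
    extension = os.path.splitext(raw_path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return None

    # 3. Resolve the absolute path and confirm it stays inside a root.
    relative = raw_path.lstrip("/")
    for root in roots:
        if not root:
            continue  # skip empty entries such as an unset HOME
        real_root = os.path.realpath(root)
        candidate = os.path.realpath(os.path.join(real_root, relative))
        try:
            inside = os.path.commonpath([real_root, candidate]) == real_root
        except ValueError:
            inside = False  # e.g. paths on different drives on Windows
        if inside:
            return candidate
    return None
```

The containment check in step 3 is the one that matters most, since it also catches escapes that contain no literal .. at all, such as a symbolic link inside the web root that points elsewhere.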

Guan recommends updating your NLWeb instance immediately and argues that this incident shows how the agentic web introduces a new attack surface, since interpreting natural language from users could inadvertently translate into malicious file paths or system commands if not handled with extreme care.
