nfind
nfind (short for natural-find) finds files by describing them in plain language: it is
find, driven by a natural-language description instead of a filter expression. You give
a prompt; an LLM writes a small filter program for it (Python or Node.js); that filter
runs against your file tree inside a hardened, disposable Docker container and prints
the matching paths.
uv tool install nfind
export OPENAI_API_KEY=sk-...
nfind "directories that contain only audio files"
nfind "Python files that import requests" ./src
What can you ask?
The prompt is free-form. These are examples of the kind of query that isn't possible
with find or grep — they show why generating a real program per question matters:
# Cross-file structural analysis
nfind "Helm charts where replicaCount in values.yaml defaults to 1 but the Deployment template never overrides it" ~/k8s
# Semantic code quality
nfind "JavaScript files that define async functions but never handle Promise rejections" ./src
# Security: context-sensitive pattern analysis
nfind "shell scripts that pass an unquoted variable to rm -rf" ~/bin
# Binary introspection
nfind "MP3 files whose embedded cover art is larger in bytes than the audio data itself" ~/Music
# macOS provenance × file contents (Spotlight can do each alone; nfind combines them)
nfind "PDFs I downloaded from arxiv.org that mention 'mechanistic interpretability'" ~/Papers --macos-meta
Because nfind generates a real program per query — rather than matching fixed predicates or sending your files to a model — the answer is: almost anything you can describe in a sentence. See the Examples gallery.
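To make "a real program per query" concrete, here is a minimal hand-written sketch of the kind of filter that could answer "directories that contain only audio files". It is not output from nfind: the names `matches`, `find_matches`, and `AUDIO_EXTENSIONS`, and the choice of extensions, are assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical extension set; a real generated filter might choose differently.
AUDIO_EXTENSIONS = {".mp3", ".flac", ".wav", ".ogg", ".m4a"}


def matches(path: Path) -> bool:
    """Return True for a directory whose direct children are all audio files."""
    if not path.is_dir():
        return False
    children = list(path.iterdir())
    # An empty directory contains no audio files, so it does not match.
    if not children:
        return False
    return all(
        child.is_file() and child.suffix.lower() in AUDIO_EXTENSIONS
        for child in children
    )


def find_matches(root: Path) -> list[Path]:
    """Walk everything below root and collect the paths the filter accepts."""
    return sorted(p for p in root.rglob("*") if matches(p))
```

A predicate like this needs to look at a directory's children as a group, which is why it is awkward to express as a single find expression.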
How it works
- Enumerate — nfind walks the search directory on the host and collects every file and directory path.
- Generate — it asks an LLM to write a filter function matching your prompt (the path list itself is not sent — only your description). The model also picks the runtime — Python or Node.js — and declares any packages it needs. If the reply doesn't validate, nfind feeds the error back and retries a few times. The generated Python filter is then tidied with ruff (unused imports removed, imports sorted, reformatted) before it is shown, saved, or run.
- Run safely — by default, the generated code executes in a throwaway Docker container with the search root bind-mounted read-only, networking disabled, all Linux capabilities dropped, and CPU/memory/process limits applied. On macOS, --sandbox apple opts into experimental Apple Containers support with an explicit macOS 15 networking warning.
- Map back — the container returns matching container paths; nfind maps them to host paths and prints them.
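The validate-and-retry behaviour in the Generate step can be sketched as follows. This is an illustration under stated assumptions, not nfind's actual implementation: `ask_model` stands in for whatever LLM client is used, the requirement that a reply define a top-level `matches` function is invented for the example, and the check uses only static parsing so nothing generated runs on the host.

```python
import ast
from typing import Callable, Optional


def validation_error(source: str) -> Optional[str]:
    """Return a message describing why the reply is unusable, or None if it is fine."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return f"syntax error on line {exc.lineno}: {exc.msg}"
    defined = {node.name for node in tree.body if isinstance(node, ast.FunctionDef)}
    # Hypothetical contract: the reply must define a top-level matches() function.
    if "matches" not in defined:
        return "the reply must define a top-level function named matches"
    return None


def generate_filter(
    prompt: str,
    ask_model: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> str:
    """Ask for a filter, feeding each validation error back into the next attempt."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        source = ask_model(prompt, feedback)
        feedback = validation_error(source)
        if feedback is None:
            return source
    raise RuntimeError(f"no valid filter after {max_attempts} attempts: {feedback}")
```

Passing the previous error as `feedback` gives the model something specific to correct on the next attempt.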
Because the filter runs inside the sandbox, it can safely inspect file contents and
metadata — not just names — to answer questions classic find + grep can't.
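As a rough sketch of the Run safely and Map back steps, the sandbox properties listed above correspond to standard `docker run` flags. The mount point `/search`, the resource limit values, and the function names are assumptions for illustration; nfind's real invocation may differ.

```python
from pathlib import Path, PurePosixPath

# Hypothetical location of the search root inside the container.
CONTAINER_ROOT = PurePosixPath("/search")


def build_docker_argv(search_root: Path, image: str, command: list[str]) -> list[str]:
    """Assemble a docker run command line with the restrictions described above."""
    mount = f"type=bind,source={search_root.resolve()},target={CONTAINER_ROOT},readonly"
    return [
        "docker", "run",
        "--rm",                   # throwaway container, removed on exit
        "--network", "none",      # networking disabled
        "--cap-drop", "ALL",      # all Linux capabilities dropped
        "--memory", "512m",       # assumed memory limit
        "--cpus", "1",            # assumed CPU limit
        "--pids-limit", "128",    # assumed process limit
        "--mount", mount,         # search root bind-mounted read-only
        image,
        *command,
    ]


def to_host_path(container_path: str, search_root: Path) -> Path:
    """Translate a path reported by the container back to the host tree."""
    # relative_to raises ValueError if the path is not under the mount point.
    relative = PurePosixPath(container_path).relative_to(CONTAINER_ROOT)
    return search_root.joinpath(*relative.parts)
```

For example, `to_host_path("/search/a/b.txt", Path("/data/projects"))` yields `/data/projects/a/b.txt` on a POSIX host.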
Why it exists
Tools like lfind send the whole file list to an LLM and let the model do the
filtering — which doesn't scale and only sees filenames. nfind instead has the LLM
write code once, then runs that code locally over the full tree. It scales to large
directories, can read file contents, and keeps you in control: you can review, save,
or confirm the generated code before it runs.
Unlike Spotlight (mdfind), which queries a pre-built metadata index, nfind generates
and runs a program per question — so it can answer structural and computed queries an
attribute index can't express. See How nfind compares.
Features
- Natural-language search over any directory tree.
- Sandboxed execution — read-only mount, dropped caps, resource limits; the default Docker backend also disables networking.
- Review before running — --show-code, --save, and --confirm let you inspect, keep, or approve the generated filter.
- Save & replay — --save writes the filter as a self-describing, dependency-declaring artifact; replay it sandboxed with --run or run trusted Python saves directly via uv run.
- Output modes — a clean path list by default, --verbose for extra per-path fields, --json for machine-readable records.
- Declared dependencies — filters can request libraries (to read MP3 tags, image sizes, …); approved packages are installed into a derived sandbox image and remembered, gated by a whitelist.
- Python & Node.js runtimes — the model picks the ecosystem per prompt (e.g. TypeScript analysis with ts-morph); nfind runs the filter in the matching sandbox image.
- macOS metadata — --macos-meta exposes Finder tags and download provenance to the filter, enabling queries that combine them with file contents.
- Python API — call search() from your own code.
Quick start
nfind "files with no extension" # search the current directory
nfind "directories with more than 50 files" ~/Projects # search a specific directory
nfind "Python files, and for each the number of lines" --verbose
nfind "audio files (mp3, flac, wav)" --json
nfind "files that look like backups" --confirm # review the code first
See Examples for the full prompt gallery.
Requirements
- Python 3.11+
- Docker installed and running
- An OpenAI API key in OPENAI_API_KEY (or another provider's key)
See Installation for details.
Documentation
Get started
- Installation — install via uv, pipx, or pip; Docker and API key setup
- Tutorial — a hands-on walkthrough of every feature, from first search to advanced options
- Examples — a gallery of prompts to adapt, with runtime and image information
Reference
- CLI reference
- Configuration — env vars, config file, and model/provider selection
- Output modes
- Python API
Concepts
- Safety model — what the sandbox does and doesn't protect
- Runtimes (Python & Node.js) — how the model picks a runtime and why only two exist
- Dependencies & the whitelist — third-party packages inside the sandbox
- macOS metadata — Finder tags and download provenance
- How nfind compares — nfind vs. Spotlight, find, lfind, and others
Help
Transparency
A significant portion of this codebase was developed with AI assistance (primarily Claude by Anthropic). All generated code was reviewed and curated by the author.