# Sandbox & Code Execution

## code_execute
Run JavaScript, Python, or Bash in a sandboxed Docker container. No network access.

## File I/O Contract - "sandbox = your project"
- `/work/` mirrors your project files. Input paths are preserved: an `input_files` entry of `data/foo.csv` lands at `/work/data/foo.csv`.
- Write files at **any relative path under /work/** (or the cwd, which is /work/). Anything you create or modify is written back to the project at the same path.
- If you write to a path that already exists, the project file is **versioned** (not overwritten destructively). Users and agents can roll back with `file_version_restore`, so it's safe to transform files in place.
- Unchanged input files are NOT round-tripped (we compare sha256 before returning them). You only pay the VFS write when you actually changed a file.
- Python users: call `os.makedirs('thumbnails', exist_ok=True)` before writing to a new subdir. In Bash, run `mkdir -p` first: `ffmpeg`, `convert`, and most other CLI tools do not create missing parent dirs.
- Max file size: 512 KB. No network. 30s default / 600s max timeout.
- `output_files` is an optional assertion: list paths you *expect* to produce, and you'll get a warning if any are missing. It does **not** filter - all new/modified files are returned regardless.
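
The write-back rules above can be pictured with a small sketch. It is illustrative only: the function names (`snapshot`, `files_to_write_back`, `missing_outputs`) are hypothetical and are not the sandbox's real implementation, which this document does not show. It assumes the check is a plain sha256 comparison of each file before and after the run, as the bullets describe.

```python
import hashlib
import tempfile
from pathlib import Path


def snapshot(root: Path) -> dict:
    """Map each file's path (relative to root) to the sha256 of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def files_to_write_back(before: dict, after: dict) -> list:
    """New files plus files whose digest changed; unchanged inputs are skipped."""
    return sorted(path for path, digest in after.items() if before.get(path) != digest)


def missing_outputs(expected: list, after: dict) -> list:
    """Paths listed in output_files that were never produced (a warning, not a filter)."""
    return [path for path in expected if path not in after]


# Demo in a temporary directory standing in for /work/ (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    (work / "data").mkdir()
    (work / "data" / "foo.csv").write_text("a,b\n1,2\n")
    (work / "data" / "bar.csv").write_text("x\n9\n")
    before = snapshot(work)

    # Simulated run: modify one input, create one new file, leave bar.csv alone.
    (work / "data" / "foo.csv").write_text("a,b\n1,3\n")
    (work / "out").mkdir()
    (work / "out" / "report.txt").write_text("done\n")
    after = snapshot(work)

    changed = files_to_write_back(before, after)
    missing = missing_outputs(["out/report.txt", "out/chart.png"], after)

print(changed)  # ['data/foo.csv', 'out/report.txt']
print(missing)  # ['out/chart.png']
```

Note that `data/bar.csv` never appears in `changed`, and that the missing `out/chart.png` only produces a warning entry; it does not remove `data/foo.csv` from the returned files.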

## Pre-installed Runtimes
- **Node.js 20** - full standard library
- **Python 3** - pandas, numpy, matplotlib, scipy, sympy, pillow, openpyxl, xlsxwriter, xlrd, python-docx, python-pptx, reportlab, cairosvg, seaborn, qrcode, python-barcode, requests, bs4, lxml, pyyaml, Jinja2, tabulate, csvkit (+ agate-dbf, agate-excel, agate-sql, dbfread), pyfiglet, pytesseract, python-dateutil, python-slugify, Pygments, faker, wordcloud, networkx, SQLAlchemy, Markdown, babel, fonttools
- **Perl** 5.36 - full install
- **GCC/G++**, **mingw-w64** (cross-compile to Windows .exe via `x86_64-w64-mingw32-gcc` / `-g++` / `-windres`)

## CLI Tools (prefer one-liners over scripts)
- **FFmpeg** - transcode, trim, extract frames/audio, generate thumbnails (`ffmpeg`, `ffprobe`, `ffplay`)
- **ImageMagick** (`convert`/`mogrify`/`identify`/`composite`) - resize, crop, composite, format-convert
- **Graphviz** (`dot`, `neato`, `sfdp`, `fdp`, `twopi`, `circo`) - diagrams from DOT
- **gnuplot** - CLI plotting
- **sox** - audio processing (convert, trim, mix, effects); `play`/`rec` also available
- **jq** - JSON query/transform
- **sqlite3** - ad-hoc queries on .sqlite/.db
- **PDF toolkit** - `pdftotext`, `pdfimages`, `pdfinfo`, `pdftohtml`, `pdftoppm` (PDF→PNG/PPM), `pdftocairo` (PDF→PNG/SVG), `pdfunite` (merge), `pdfseparate` (split), `pdfattach`/`pdfdetach`, `pdfsig`, `pdffonts`
- **ghostscript** (`gs`), `ps2pdf`/`pdf2ps`/`ps2epsi`/`eps2eps`/`dvipdf` - PostScript ↔ PDF
- **qpdf** - merge, split, encrypt, linearize PDFs
- **pandoc** - convert markdown/HTML/docx/epub/rst/latex
- **LibreOffice** (`soffice --headless`, `lowriter`, `localc`, `loimpress`, `lodraw`, `loweb`) - docx/xlsx/pptx → PDF or HTML
- **wkhtmltopdf** / **wkhtmltoimage** - render HTML files to PDF or PNG
- **exiftool**, **mediainfo** - read/write file metadata
- **potrace**, **mkbitmap** - bitmap → SVG (`mkbitmap` preprocesses for `potrace`)
- **optipng**, **gifsicle** - image optimization; **gifdiff**, **gifview**
- **WebP toolkit** - `cwebp` (encode), `dwebp` (decode), `gif2webp`, `img2webp`, `webpmux`, `webpinfo`
- **diffimg** - pixel-diff two images
- **xmlstarlet** - query/edit XML
- **datamash**, **miller** (`mlr`) - tabular stats and CSV/JSON transforms
- **tesseract** - OCR (Python: `pytesseract.image_to_string`)
- **qrencode** - generate QR code images
- **fc-list**, **fc-match** - fontconfig: list/match fonts on the system
- **lp_solve** - linear programming solver (LP/MIP)
- **openssl** - hashes, x509, keygen, encrypt/decrypt
- **gpg** (+ `gpgsm`) - sign, encrypt, verify
- **bc** - arbitrary-precision calculator for bash
- **tree** - recursive directory listing
- **p7zip** (`7z`), **unzip/zip** (`zipinfo`, `zipsplit`, `zipcloak`), **unrar**
- **hunspell**, **figlet/toilet**, **dos2unix/unix2dos**, **sed/awk/mawk/perl**

## Discovering tools
If unsure whether a tool or package is available, probe before using it - the sandbox has no network, so a fast check is cheaper than a failed run:
- CLI: `command -v ffmpeg` (exits 0 with path, 1 if missing)
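- Python: `python3 -c "import foo; print(foo.__version__)"` (not every package defines `__version__`; a bare `import foo` is enough to test presence) or `pip show foo` (takes the distribution name, which can differ from the import name, e.g. `beautifulsoup4` for `bs4`)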
- Node: `node -e "console.log(require.resolve('foo'))"`
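
When a script depends on several tools at once, the same probes can be run from Python in one pass. This is a minimal sketch using only the standard library (`shutil.which` and `importlib.util.find_spec`); the helper names `probe_cli` and `probe_python` are made up for illustration, and the tool and module names in the demo are just examples.

```python
import importlib.util
import shutil


def probe_cli(names):
    """Return {tool: absolute path, or None if the tool is not on PATH}."""
    return {name: shutil.which(name) for name in names}


def probe_python(modules):
    """Return {module: True/False} without importing the module.

    Pass top-level module names only; find_spec on a dotted name raises
    ModuleNotFoundError if the parent package is absent.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


cli = probe_cli(["ffmpeg", "jq", "definitely-not-a-real-tool"])
mods = probe_python(["json", "pandas", "definitely_not_a_real_module"])

for name, path in cli.items():
    print(f"{name}: {path or 'MISSING'}")
for name, present in mods.items():
    print(f"{name}: {'ok' if present else 'MISSING'}")
```

Failing fast on a `MISSING` entry at the top of a script avoids spending a run's timeout on work that cannot finish.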

## Worked Examples

### 1. FFmpeg - transcode MP4 → WebM
```
input_files: ["videos/clip.mp4"]
language: bash
code: ffmpeg -y -i videos/clip.mp4 -c:v libvpx-vp9 -b:v 1M -c:a libopus videos/clip.webm
```
Returns: project gains `videos/clip.webm`. The `.mp4` is unchanged, so it is not re-written.

### 2. FFmpeg - extract 10 frames from a video
```
input_files: ["videos/clip.mp4"]
language: bash
code: |
  mkdir -p frames
  ffmpeg -y -i videos/clip.mp4 -vf "fps=10/$(ffprobe -v error -show_entries format=duration -of csv=p=0 videos/clip.mp4)" -frames:v 10 frames/frame_%03d.jpg
```
Returns: `frames/frame_001.jpg` … `frames/frame_010.jpg`.
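
The `fps` expression in this example is the target frame count divided by the clip duration that `ffprobe` reports. A small sketch of that arithmetic follows; the helper name `fps_for_frame_count` and the 25-second duration are hypothetical, chosen only to make the numbers easy to follow.

```python
def fps_for_frame_count(n_frames: int, duration_s: float) -> float:
    """Frame rate that spreads n_frames evenly across duration_s seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return n_frames / duration_s


# Hypothetical 25-second clip, 10 frames wanted: one frame every 2.5 seconds.
rate = fps_for_frame_count(10, 25.0)
print(f"fps={rate}")  # fps=0.4
```

Because the filter rounds frame timestamps, the exact number of frames written can be off by one at the clip boundaries, so it is worth checking the output directory rather than assuming the count.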

### 3. ImageMagick - resize + convert format
```
input_files: ["images/photo.jpg"]
language: bash
code: |
  mkdir -p thumbnails
  convert images/photo.jpg -resize 400x400 -quality 85 thumbnails/photo.webp
```
Returns: `thumbnails/photo.webp`.

### 4. Python pandas - filter/aggregate a CSV
```
input_files: ["data/sales.csv"]
language: python
code: |
  import pandas as pd
  df = pd.read_csv('data/sales.csv')
  q4 = df[df['quarter'] == 'Q4'].groupby('region')['amount'].sum().reset_index()
  q4.to_csv('data/sales_q4.csv', index=False)
  print(q4.to_string(index=False))
```
Returns: `data/sales_q4.csv` plus a printed preview on stdout.

### 5. Python matplotlib - chart a JSON metric series → PNG
```
input_files: ["data/metrics.json"]
language: python
code: |
  import json, os
  import matplotlib
  matplotlib.use('Agg')
  import matplotlib.pyplot as plt
  with open('data/metrics.json') as f:
      d = json.load(f)
  os.makedirs('charts', exist_ok=True)
  plt.plot(d['labels'], d['values'])
  plt.xticks(rotation=45); plt.tight_layout()
  plt.savefig('charts/metrics.png', dpi=120)
```
Returns: `charts/metrics.png` (inline in the reply).

### 6. Bash + poppler - PDF → plain text
```
input_files: ["docs/manual.pdf"]
language: bash
code: |
  mkdir -p docs
  pdftotext -layout docs/manual.pdf docs/manual.txt
  wc -l docs/manual.txt
```
Returns: `docs/manual.txt` plus the line count on stdout.

## Audio/Music Generation - use dedicated tools
Do NOT use code_execute with sox/ffmpeg to generate music or sound effects. Use:
- `music_generate` - AI-generated music
- `sound_generate` - AI-generated sound effects
- `speech_generate` - AI-generated voice narration

These produce higher-quality results and handle file saving automatically.
