Designing with a Co-Pilot: Autonomy, Control, and What’s New in Deep Cuts
Just yesterday, I published the first binary release (v0.1.0) of Deep Cuts, an offline-first, local music exploration tool built with Rust, Tauri, and Svelte 5. It is powered by local machine learning: sqlite-vec for vector similarity, CLAP audio embeddings, and local Qwen2-Audio models that let you chat directly with your music library.
Building a complex desktop app from scratch in a matter of days was only possible because I pair-programmed with AI coding agents. But as the codebase has rapidly grown, so too has the sophistication of how I collaborate with these agents.
In this post, I want to pull back the curtain on the actual collaboration dynamics that shaped Deep Cuts over this intense sprint, analyze where agents excel (and where they hit a wall), and share the new features we've implemented since the v0.1.0 release.
The Autonomy-Control Spectrum
When working with modern AI agents, the relationship isn't a simple "command line." It’s a spectrum of delegation. Crucially, autonomy is never granted blindly. Every phase of high-autonomy execution is preceded by a strict planning phase where we discuss architectural decisions, state management, and UX design in depth. Only after we have thoroughly reviewed and aligned on the technical blueprint does the agent take over to write the code.
Across the development of Deep Cuts, our work fell into three distinct categories:
1. High Agent Autonomy (Delegation)
Once a plan was established, architectural boundaries defined, and the design discussed in depth, I could hand over the keys completely. If a task was well-scoped, or involved low-level system integrations with a clear path forward, the agent worked independently.
For instance, when integrating native drag-and-drop support in the prototype app (via tauri-plugin-drag, so DJs can drag tracks directly into Serato, Traktor, or a DAW), the agent worked out on its own that WebView drag events were blocking native behavior. It worked around the restriction by switching to mousedown triggers, and resolved a payload validation error by supplying a transparent 1x1 base64 PNG as a placeholder. Similarly, I fully delegated implementing the vector-search math in Rust and SQLite, mapping new audio formats (AIFF, Ogg, Opus) in our scanner, and writing database migrations. A single "Yes, let's do it" yielded hundreds of lines of working code.
2. High Human Control (Precision Steering)
Conversely, there are places where you must hold the wheel.
The Scalability Cliff: AI agents are excellent at writing code that works, but terrible at anticipating real-world data scaling. During the implementation of the duplicate track detector, I had to interrupt mid-generation to ask: "How will this scale to 30,000 tracks?" The agent's initial approach would have crashed the application; we immediately pivoted the architecture.
UI/UX & Aesthetics: The design system for Deep Cuts (Sonic Glitch) started as a dark-theme mockup. Translating that into light and accessible high-contrast themes required manual guidance. Agents lack the visual nuance to make design adaptations on the fly and often miss layout dependencies when separating state.
The Slow Lane: For speculative features, I enforced a "doc-before-code" rule. Ideas were scoped out in a /doc scratchpad first so we didn't waste cycles writing code for features that hadn't been fully conceptualized.
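To make the scalability question concrete: the post does not say what the agent's initial duplicate-detection approach was or what replaced it, so the following is only an illustration of why 30,000 tracks is a meaningful threshold. Comparing every track against every other track takes n(n-1)/2 comparisons, while grouping tracks by a key takes a single pass. The `Track` struct and the normalized (artist, title) key are assumptions made for this sketch, not the Deep Cuts schema.

```rust
use std::collections::HashMap;

/// Hypothetical track record; the real Deep Cuts schema is not shown in the post.
#[derive(Debug, Clone)]
pub struct Track {
    pub id: u32,
    pub artist: String,
    pub title: String,
}

/// Number of comparisons a naive all-pairs duplicate check would perform.
pub fn pairwise_comparisons(n: u64) -> u64 {
    n * n.saturating_sub(1) / 2
}

/// Assumed normalization for this sketch: trim and lowercase artist and title.
fn dedup_key(track: &Track) -> (String, String) {
    (
        track.artist.trim().to_lowercase(),
        track.title.trim().to_lowercase(),
    )
}

/// Groups tracks by key in a single pass and returns the id groups that
/// contain more than one track, sorted so the output is stable.
pub fn find_duplicates(tracks: &[Track]) -> Vec<Vec<u32>> {
    let mut buckets: HashMap<(String, String), Vec<u32>> = HashMap::new();
    for track in tracks {
        buckets.entry(dedup_key(track)).or_default().push(track.id);
    }
    let mut groups: Vec<Vec<u32>> = buckets
        .into_values()
        .filter(|ids| ids.len() > 1)
        .collect();
    for group in &mut groups {
        group.sort_unstable();
    }
    groups.sort();
    groups
}
```

For 30,000 tracks, `pairwise_comparisons(30_000)` is 449,985,000, whereas `find_duplicates` touches each track once. A real detector would likely need a fuzzier key (audio fingerprints, for example), but the same bucketing idea applies.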
3. Cooperative Protocols
The most rewarding parts of the project were cooperative. We established a custom skills/ directory in the repository: a set of markdown protocols documenting how to handle migrations, write Svelte 5 stores, and implement Tauri IPC commands. Because my preferences are encoded in these files and the agents read them at the start of each session, I no longer need to re-explain the architectural style.
We also actively challenged each other. At one point, I suggested a pure LLM-based text search interface for the library. The agent pushed back, arguing that a faceted, interactive sidebar filter was a far superior UX for track discovery. The agent was right, and that faceted sidebar became the core navigation pillar of the app.
What’s New Since v0.1.0
While analyzing these workflows, we’ve also been busy building. Deep Cuts has evolved significantly since the first public binary. Here is a look at what has landed and what is coming next:
AcoustID & Metadata Enrichment (v0.1.1)
Offline libraries are notoriously messy. Deep Cuts now includes automatic audio fingerprinting using the AcoustID/MusicBrainz database. The app generates local acoustic fingerprints, resolves missing artists, titles, and albums online, fetches high-resolution cover art, and caches it locally.
Playlists & Visual Statistics (v0.1.1)
You can now create and manage custom playlists, and export them as native .m3u playlist files directly via the macOS file dialog. Additionally, we added an interactive Statistics panel where you can compare BPM, key distributions, loudness, and genre variations between your full library and your current filtered tracks.
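As a rough sketch of what an .m3u export involves (not the actual Deep Cuts code): an extended M3U file starts with an `#EXTM3U` header, and each track is an `#EXTINF:<seconds>,<display title>` line followed by the file path. The post does not say whether Deep Cuts writes plain or extended M3U, and the `PlaylistEntry` struct and its field names are hypothetical.

```rust
/// Hypothetical playlist entry; field names are illustrative only.
pub struct PlaylistEntry {
    pub path: String,
    pub artist: String,
    pub title: String,
    pub duration_secs: u32,
}

/// Renders the entries as extended M3U text: a header line, then an
/// `#EXTINF` line and a path line per track.
pub fn to_m3u(entries: &[PlaylistEntry]) -> String {
    let mut out = String::from("#EXTM3U\n");
    for entry in entries {
        out.push_str(&format!(
            "#EXTINF:{},{} - {}\n",
            entry.duration_secs, entry.artist, entry.title
        ));
        out.push_str(&entry.path);
        out.push('\n');
    }
    out
}
```

The resulting string can be written to whatever path the file dialog returns, for example with `std::fs::write`.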
Mood Radar & Range Filters (v0.1.3 - Upcoming)
We have expanded the Essentia classifier integration to support granular mood discovery.
Mood Range Sliders: You can filter tracks along seven different mood axes (Happy, Sad, Relaxed, Party, Acoustic, Electronic, Aggressive) using dual-handle range sliders.
Mood Radar: The track detail pane now features a custom D3-powered radar (spider) chart that visualizes the mood profile of the selected track.
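The range-slider filtering over the seven mood axes can be sketched as a simple predicate. This is an illustration under assumptions, not the app's implementation: it assumes each track carries one score per axis in the range 0.0 to 1.0, and the type and function names (`MoodScores`, `MoodRange`, `passes_mood_filter`) are made up for the example.

```rust
/// The seven mood axes named in the post.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mood {
    Happy,
    Sad,
    Relaxed,
    Party,
    Acoustic,
    Electronic,
    Aggressive,
}

pub const MOOD_COUNT: usize = 7;

/// Assumed layout: one score in [0.0, 1.0] per axis, indexed by `Mood as usize`.
pub type MoodScores = [f32; MOOD_COUNT];

/// Inclusive range for one axis, as a dual-handle slider would produce.
#[derive(Debug, Clone, Copy)]
pub struct MoodRange {
    pub mood: Mood,
    pub min: f32,
    pub max: f32,
}

/// Returns true if the scores fall inside every active range.
/// An empty list of ranges matches every track.
pub fn passes_mood_filter(scores: &MoodScores, ranges: &[MoodRange]) -> bool {
    ranges.iter().all(|range| {
        let score = scores[range.mood as usize];
        score >= range.min && score <= range.max
    })
}
```

Only sliders that the user has actually moved need to be included in `ranges`, which keeps the default state (no filtering) cheap.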
Self-Correcting ML Pipeline & Energy Windowing (v0.1.3)
Previously, the Qwen2-Audio pipeline analyzed a static segment near the beginning of a track. Now, it performs Energy-Based Window Selection: it scans the file and selects the loudest 10-second segment (skipping silent or slow intros) to feed into the models.
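A minimal sketch of the window selection idea, assuming mono PCM samples already decoded to `f32` and "loudest" measured as the sum of squared samples. The post does not specify the loudness measure or how audio is decoded, so both are assumptions, and the function names are hypothetical.

```rust
/// Number of samples in a window of `seconds` at the given sample rate.
pub fn window_len_samples(sample_rate_hz: u32, seconds: u32) -> usize {
    sample_rate_hz as usize * seconds as usize
}

/// Returns the start index of the `window_len`-sample window with the highest
/// energy (sum of squared samples), or None if the input is shorter than one
/// window. Ties keep the earliest window.
pub fn loudest_window_start(samples: &[f32], window_len: usize) -> Option<usize> {
    if window_len == 0 || samples.len() < window_len {
        return None;
    }
    // Energy of the first window, accumulated in f64 to limit rounding drift.
    let mut energy: f64 = samples[..window_len]
        .iter()
        .map(|&s| (s as f64) * (s as f64))
        .sum();
    let mut best_energy = energy;
    let mut best_start = 0;
    // Slide one sample at a time: add the entering sample, drop the leaving one.
    for start in 1..=(samples.len() - window_len) {
        let leaving = samples[start - 1] as f64;
        let entering = samples[start + window_len - 1] as f64;
        energy += entering * entering - leaving * leaving;
        if energy > best_energy {
            best_energy = energy;
            best_start = start;
        }
    }
    Some(best_start)
}
```

For a 44.1 kHz file, `window_len_samples(44_100, 10)` gives the 10-second window length in samples. Sliding sample by sample is linear in the track length; hopping by a larger stride would trade a little precision for speed.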
Even better, we introduced a CLAP-Based Validation Pass. Qwen description outputs are validated against the track's CLAP audio embeddings. If the semantic similarity score falls below a threshold (meaning Qwen hallucinated or failed to capture the musical style), the app automatically triggers a lower-temperature resampling pass with strict system prompt corrections to fix the description on the fly.
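The decision step of such a validation pass can be sketched as a cosine-similarity check. CLAP trains paired audio and text encoders in a shared embedding space, so this sketch assumes the Qwen description is embedded with the CLAP text encoder and compared to the track's audio embedding. The post does not state that detail or the threshold value, so the `threshold` parameter and the `Verdict` type are placeholders.

```rust
/// Cosine similarity of two embeddings. Returns None if the lengths differ,
/// the vectors are empty, or either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let (mut dot, mut norm_a, mut norm_b) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b) {
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a.sqrt() * norm_b.sqrt()))
}

/// Outcome of the validation pass (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Accept,
    Resample,
}

/// Accepts the description if the similarity reaches `threshold`; otherwise,
/// or if the similarity cannot be computed, asks for a resampling pass.
pub fn validate_description(audio_emb: &[f32], text_emb: &[f32], threshold: f32) -> Verdict {
    match cosine_similarity(audio_emb, text_emb) {
        Some(similarity) if similarity >= threshold => Verdict::Accept,
        _ => Verdict::Resample,
    }
}
```

Treating an uncomputable similarity as `Resample` is a conservative choice for the sketch: a malformed embedding leads to a retry rather than a silently accepted description.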
Chat Session Persistence (v0.1.3)
You can now save, review, and search your chat histories with individual tracks. If you have an insightful conversation about a track's production style or transition compatibility, you can pick that session right back up later.
Building a Future with Co-Pilots
Deep Cuts is a testament to what a single developer can build when using agents as a force multiplier rather than a simple code generator. With a clean division of labor (delegating boilerplate and system integrations, maintaining strict architectural control, and using structured markdown files as shared protocols), human-AI collaboration moves from a novelty to a dependable engineering practice.
Deep Cuts is open source and available on GitHub. Give it a spin, check out the new release, and let me know how you manage your music collection!