Deep Cuts Is Now Open Source
Two days ago I wrote about building a music intelligence app in five days with two AI coding agents. That post described a working MVP — a private prototype called music-intelligence that could scan a library, run ML analysis, and visualize the results on a UMAP map. I said I was planning to release it as open source.
Today I'm doing that. The app is now called Deep Cuts, and the code is on GitHub at github.com/robertolupi/deep-cuts.
What changed
The original prototype was a proof of concept. In the time since that post, I effectively rewrote the entire application from scratch as a clean-room implementation. What shipped today is a different program — same ideas, completely new code.
The Python backend is gone
The original prototype had a hybrid architecture: a Python/FastAPI backend running as a subprocess alongside the Rust/Tauri layer. This worked, but it was fragile and made distribution painful.
Deep Cuts runs entirely in-process in Rust. Every analysis pass — DSP, ONNX inference, CLAP embeddings, UMAP projection, loudness, BPM — happens natively inside the Tauri process. There is no Python runtime required to run the app. The only external dependency is llama.cpp (via Homebrew), and only if you want the Qwen2-Audio description pass. Everything else is self-contained.
The analysis pipeline is a proper framework
The original had a working but ad-hoc pipeline. Deep Cuts has a unified PassSpec registry with dependency ordering, versioned pass results, and a sidecar persistence system. Every track's analysis is stored in a .dc.json file next to the audio file — so re-indexing after a library move restores all computed results instantly without re-running inference.
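To make "dependency ordering" concrete, here is a minimal sketch of how a registry could order passes so that each one runs after the passes it depends on. It uses Kahn's algorithm with only the standard library. The struct fields, the function name `order_passes`, and the pass names in the usage note are all hypothetical; the real PassSpec in Deep Cuts may look quite different.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical shape of a pass declaration; the real PassSpec may differ.
#[derive(Debug, Clone)]
pub struct PassSpec {
    pub name: &'static str,
    /// Bumped when the pass logic changes, so stale results can be detected.
    pub version: u32,
    /// Names of passes whose results this pass needs.
    pub deps: &'static [&'static str],
}

/// Returns pass names in an order where every pass comes after its
/// dependencies (Kahn's algorithm). Fails on unknown dependencies or cycles.
pub fn order_passes(specs: &[PassSpec]) -> Result<Vec<&'static str>, String> {
    let names: BTreeSet<&str> = specs.iter().map(|s| s.name).collect();
    let mut indegree: BTreeMap<&'static str, usize> = BTreeMap::new();
    let mut dependents: BTreeMap<&'static str, Vec<&'static str>> = BTreeMap::new();

    for spec in specs {
        indegree.entry(spec.name).or_insert(0);
        for &dep in spec.deps {
            if !names.contains(dep) {
                return Err(format!(
                    "pass '{}' depends on unknown pass '{}'",
                    spec.name, dep
                ));
            }
            *indegree.entry(spec.name).or_insert(0) += 1;
            dependents.entry(dep).or_default().push(spec.name);
        }
    }

    // Start with the passes that have no dependencies at all.
    let mut ready: VecDeque<&'static str> = indegree
        .iter()
        .filter(|&(_, &n)| n == 0)
        .map(|(&name, _)| name)
        .collect();
    let mut ordered = Vec::with_capacity(specs.len());

    while let Some(name) = ready.pop_front() {
        ordered.push(name);
        if let Some(children) = dependents.get(name) {
            for &child in children {
                let n = indegree.get_mut(child).expect("child was registered");
                *n -= 1;
                if *n == 0 {
                    ready.push_back(child);
                }
            }
        }
    }

    // Any pass left unscheduled is part of a dependency cycle.
    if ordered.len() != indegree.len() {
        return Err("dependency cycle detected".to_string());
    }
    Ok(ordered)
}
```

With illustrative passes `decode`, `clap_embedding` (needs `decode`), `loudness` (needs `decode`), and `map_projection` (needs `clap_embedding`), this returns an order that starts with `decode` and ends with `map_projection`. A versioned sidecar could then skip any pass whose stored version already matches the registered one.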
The UI was redesigned from scratch
The original UI was functional but rough. Deep Cuts has a complete redesign built around the Sonic Glitch design system — dark, light, and accessible high-contrast themes, a persistent player bar, a full-width filter sidebar with removable chips, and a track detail pane with waveform, spectrogram, and all analysis results.
The Music Map is significantly better
The original UMAP map used min/max normalization, which caused most of the library to get compressed into a small region by a handful of acoustic outliers. Deep Cuts uses p1–p99 percentile clipping, so the main cluster fills the canvas and outliers land at the edges rather than dominating the bounding box.
You can also switch between UMAP and a fast deterministic PCA projection. PCA is instant and surprisingly good for global structure — it's the recommended default.
Non-music content (audiobooks, spoken word, sound effects) is automatically excluded from the projection.
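The difference between min/max normalization and p1–p99 clipping is easy to see in code. The sketch below normalizes one coordinate axis to the 0..1 range using the 1st and 99th percentiles and clamps anything outside that range to the edges. The function names and the linear-interpolation percentile are assumptions made for illustration, not the code Deep Cuts ships.

```rust
/// Value at percentile `p` (0.0..=100.0) of an already sorted, non-empty
/// slice, using linear interpolation between neighbouring ranks.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = (p / 100.0) * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;
    sorted[lo] + (sorted[hi] - sorted[lo]) * frac
}

/// Maps one coordinate axis to 0.0..=1.0 using the p1..p99 range instead of
/// min..max. Values outside that range are clamped to the edges.
pub fn normalize_axis(values: &[f64]) -> Vec<f64> {
    if values.is_empty() {
        return Vec::new();
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    let lo = percentile(&sorted, 1.0);
    let hi = percentile(&sorted, 99.0);
    let span = hi - lo;

    values
        .iter()
        .map(|&v| {
            if span <= f64::EPSILON {
                // Degenerate axis: every point sits in the middle.
                0.5
            } else {
                ((v - lo) / span).clamp(0.0, 1.0)
            }
        })
        .collect()
}
```

For the values 0 through 99 plus a single outlier at 10,000, min/max scaling would place the value 50 at 0.005, squeezing almost the entire library into the first 1% of the axis. With percentile clipping the same value lands near the middle of the canvas and the outlier sits on the right edge.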
New features
Several features that didn't exist in the prototype:
Semantic NLP search: type natural language into the search bar ("uptempo melancholic piano track") and the app runs the query through the local MiniLM sentence encoder, then searches the description_embeddings vector table. Results are ranked by match percentage.
Similarity search: click "Sounds Similar" on any track to fire a CLAP KNN query and filter the library to the most acoustically similar tracks. Works from both the track detail pane and the Music Map.
Duplicates view: CLAP-based similarity detection that surfaces near-duplicates and remixes in your library.
Silence-aware CLAP window selection: instead of sampling at fixed 25/50/75% positions, the pipeline finds the three highest-energy non-silent windows. This means the embedding captures the drop or chorus rather than whatever happens to fall at a fixed timestamp.
BPM refinement pipeline: a first pass uses the coarse genre from the track metadata to correct half/double-time errors; a second pass uses the Essentia Discogs-400 label for further refinement.
Embedded cover art: the track detail pane shows album art extracted from the audio file.
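The silence-aware window selection above can be sketched in a few lines. This version scores back-to-back windows by RMS energy, drops the ones below a silence threshold, and keeps the loudest few. The use of RMS as the energy measure, the threshold parameter, the non-overlapping window layout, and the function name `pick_windows` are all assumptions made for illustration; the actual pipeline may measure energy and place windows differently.

```rust
/// Picks up to `count` back-to-back (non-overlapping) windows of `window_len`
/// samples with the highest RMS energy, skipping windows quieter than
/// `silence_rms`. Returns the start offsets in samples, sorted by position.
pub fn pick_windows(
    samples: &[f32],
    window_len: usize,
    count: usize,
    silence_rms: f32,
) -> Vec<usize> {
    if window_len == 0 || samples.len() < window_len {
        return Vec::new();
    }

    // Score each full window by RMS energy and discard the silent ones.
    let mut scored: Vec<(usize, f32)> = samples
        .chunks_exact(window_len)
        .enumerate()
        .map(|(i, chunk)| {
            let mean_sq = chunk.iter().map(|s| s * s).sum::<f32>() / window_len as f32;
            (i * window_len, mean_sq.sqrt())
        })
        .filter(|&(_, rms)| rms >= silence_rms)
        .collect();

    // Loudest first; the stable sort keeps earlier windows ahead on ties.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));

    let mut starts: Vec<usize> = scored
        .into_iter()
        .take(count)
        .map(|(start, _)| start)
        .collect();
    starts.sort_unstable();
    starts
}
```

A track that is silent throughout yields an empty list, so a caller would need a fallback for that case, for example reverting to fixed positions.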
What stayed the same
The core stack is identical to the prototype: Tauri 2, Rust, Svelte 5, SQLite with sqlite-vec for vector search, ONNX Runtime for local ML inference, and WaveSurfer.js for waveform rendering. Keeping everything offline and sandboxed was non-negotiable from the start, and that hasn't changed.
The ML models are the same: Essentia Discogs-Effnet for genre/mood/vocal classification, LAION CLAP for audio embeddings, all-MiniLM-L6-v2 for text embeddings, and Qwen2-Audio-7B for natural language descriptions.
The two-agent development approach also continued throughout — Claude Code and Antigravity, with CLAUDE.md and AGENTS.md keeping both coherent across sessions. The commit history reflects it.
The cognitive shift, continued
In the original post I wrote: "The cognitive bottleneck moved from implementation to design — which is where, honestly, it should always have been."
That held up across the full rewrite. The work that took time wasn't writing code — it was deciding what to build, in what order, and recognizing when a working prototype had outgrown its original structure. The agents handled implementation at a pace I couldn't match alone. The judgment calls — when to rewrite rather than patch, what features are actually worth building, which abstractions survive contact with real data — those remained mine.
Getting started
Deep Cuts is currently source-only, with no pre-built binaries yet. To build it you'll need Rust 1.77.2+, Node.js 18+, and Python 3.10+ (used only by the model export scripts, not at runtime), plus llama.cpp if you want Qwen2-Audio descriptions.
The README has full setup instructions including how to download and export the model files. The models directory is not committed — the export scripts in tools/ generate the ONNX files, and the Essentia models are available from the Essentia model hub.
The app is licensed under AGPL-3.0.