VOCAL ISOLATION
Source separation, before the transcriber sees a single bar.
Vocal isolation, dialect-aware processing, and structured output — purpose-built for catalogs in Spanish, English, and code-switched production.
In production with labels and publishers across Latin America and the U.S.
Music Technology Venture Studio, San Juan, Puerto Rico.
Good afternoon, Alex
Wed, May 6
Tracks: 12
Avg accuracy: 97%
Dialect: PR · MX
Pipeline: Active
Recent transcriptions
Bad Bunny — DTMF
PR · Reggaetón · 47s
Peso Pluma — Lady Gaga
MX · Corrido Tumbado · 52s
Karol G — Si Antes Te Hubiera Conocido
CO · Reggaetón Pop · 49s
Rauw Alejandro — Touching The Sky
PR · Trap Latino · 51s
ELENA ROSE — Me Lo Merezco
US Latin · Pop · 44s
Active pipeline
Smart Review
82% “Que se joda el que corille”
Capabilities
Six capabilities that compound across the workflow.
VOCAL ISOLATION
Source separation removes the instrumental before transcription. The transcriber receives clean vocals, not the full mix.
ADAPTIVE DIALECT ENGINE
Spanish vocabulary differs sharply across territories. Musavox detects the regional dialect from the audio and applies a curated lexicon during post-processing. Current coverage spans eight regions: Puerto Rico, México, Colombia, República Dominicana, Venezuela, U.S. Latin, Argentina, and Chile. Coverage expands with catalog demand.
CONFIDENCE HIGHLIGHTING
Each transcribed line carries a confidence score. Lines below the threshold are flagged in the editor with Smart Review navigation, so reviewers focus on the lines that need a second pass instead of reading every line.
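As a rough illustration of threshold-based flagging, here is a minimal Python sketch. The `Line` type, the `flag_for_review` helper, the 0.0 to 1.0 score range, and the 0.85 threshold are all assumptions made for the example; they are not Musavox's actual data model or settings.

```python
from dataclasses import dataclass


@dataclass
class Line:
    index: int
    text: str
    confidence: float  # assumed to lie between 0.0 and 1.0


def flag_for_review(lines, threshold=0.85):
    """Return the lines whose confidence falls below the threshold.

    The default threshold is a hypothetical value chosen for illustration.
    """
    return [line for line in lines if line.confidence < threshold]


# Placeholder lines with made-up scores.
lines = [
    Line(0, "first line of the verse", 0.97),
    Line(1, "second line of the verse", 0.82),
    Line(2, "third line of the verse", 0.91),
]

flagged = flag_for_review(lines)
print([line.index for line in flagged])
```

A reviewer-facing editor could then step through `flagged` in order rather than through every line of the track.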
CODE-SWITCH DETECTION
Mid-verse switches between Spanish and English are detected and tagged in the output. The post-processor handles bilingual production without requiring manual segmentation.
INDUSTRY-STANDARD EXPORTS
Word-level timestamps in LRC. Plain text with section markers in TXT. Subtitle-format SRT for music video workflows. Structured JSON for downstream metadata pipelines.
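To make the timestamp conventions concrete, the sketch below formats a time in seconds as an LRC line tag (`[mm:ss.xx]`) and as an SRT timestamp (`HH:MM:SS,mmm`). The function names are hypothetical and this is not Musavox's exporter; word-level LRC tags are an extension of the line tag and are not shown here.

```python
def lrc_timestamp(seconds: float) -> str:
    """Format seconds as an LRC line tag, [mm:ss.xx] (hundredths of a second)."""
    centis = round(seconds * 100)
    minutes, rem = divmod(centis, 6000)
    secs, cs = divmod(rem, 100)
    return f"[{minutes:02d}:{secs:02d}.{cs:02d}]"


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm (milliseconds)."""
    millis = round(seconds * 1000)
    hours, rem = divmod(millis, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(number: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: sequence number, time range, then the lyric text."""
    return f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


print(lrc_timestamp(75.5) + " placeholder lyric line")
print(srt_cue(1, 0.0, 2.5, "placeholder lyric line"))
```

Rounding to integer centiseconds or milliseconds before splitting avoids carrying floating-point error into the formatted fields.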
CATALOG-LEVEL CONSISTENCY
Editorial corrections you make on one track propagate as suggestions within your own catalog only. Your editorial layer stays inside your account; nothing leaves your tenant boundary. Use of anonymized corrections to improve global model accuracy is opt-in per organization and disabled by default.
ADAPTIVE DIALECT ENGINE
Spanish vocabulary, idiom, and pronunciation differ sharply across territories. Reggaetón from Puerto Rico, corridos tumbados from México, and dembow from República Dominicana each carry distinct lexicons that generic models smooth into a textbook neutral. The result reads wrong to native speakers and creates downstream metadata problems.
Eight fully curated regional modules covering Puerto Rico, México, República Dominicana, Colombia, Argentina, Chile, Venezuela, and U.S. Latin. 302+ curated terms across all regions. Lexicons expand as catalogs in new territories enter the platform.
matched: jevo, corillo, bellaco, perreo
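A lexicon lookup of this kind can be sketched in a few lines of Python. Everything below is an assumption for illustration: the `LEXICON` table is a toy, the four matched terms above are grouped under `PR` as a guess, the `MX` entry is a hypothetical placeholder, and `match_terms` is not Musavox's post-processor. The sketch also works on text only; it does not attempt the audio-based dialect detection described above.

```python
import re

# Toy lexicon for illustration only, not the curated Musavox lexicon.
# The PR entries are the four matched terms listed above (region assumed);
# the MX entry is a hypothetical placeholder.
LEXICON = {
    "PR": {"jevo", "corillo", "bellaco", "perreo"},
    "MX": {"compa"},
}


def match_terms(text, lexicon=LEXICON):
    """Return {region: sorted matched terms} for regions with at least one hit."""
    words = set(re.findall(r"\w+", text.lower()))
    hits = {}
    for region, terms in lexicon.items():
        matched = sorted(words & terms)
        if matched:
            hits[region] = matched
    return hits


print(match_terms("Salí con el corillo pa'l perreo"))
```

Matching on whole lowercase tokens keeps the lookup simple; a production lexicon would likely also need inflected forms and multi-word expressions.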
How Musavox compares
| Capability | Moises | VEED | Songscription | Musavox |
|---|---|---|---|---|
| Lyrics transcription | | | | |
| Vocal isolation | Basic | — | — | Production-grade |
| Latin music focus | — | — | — | Native, regional-aware |
| Regional dialect detection | — | — | — | Auto-detect, 8 regions |
| Ad-lib + producer tag detection | — | — | — | |
| Code-switch (Spanish/English) | Basic | — | — | First-class citizen |
| Confidence scoring per line | — | — | — | |
| Smart Review flagging | — | — | — | |
| Industry-standard exports | TXT | TXT, SRT | TXT | TXT, LRC, SRT, JSON |
| Correction-capture training loop | — | — | — | Catalog-specific learning |
| Built for label/publisher catalogs | — | — | — | |
Built for
Evaluate prospective signings with lyrics transcribed in the dialect of the artist. Identify thematic patterns and lyrical consistency across a roster before contract.
Transcribe lyrics at the pace of release week. Submit final metadata to distribution with verified lyrics, not approximated ones. An audit trail for every editorial decision lives inside the platform.
Lyrics ready for copyright registration with word-level timestamps. Confidence scores flag lines that need a second pass before submission.
Lyric documentation with confidence scoring, version history, and editorial provenance per track.
Why Musavox
Musavox was built by operators who needed a transcription tool
that understood the music they were releasing.
Generic tools smooth regional vocabulary into neutral Spanish.
Editorial teams ship with errors. Metadata gets rejected.
Copyright registrations carry approximations instead of verified text.
We built the tool we wanted to use. Then we opened it to other catalogs.
— The Musavox team
Start free. Scale as you grow.
Try it out
Independent A&Rs, managers, songwriters
Growing teams, production houses
Established labels, publisher catalogs
Major labels, publisher catalogs, multi-territory operations
Musavox supports MP3, WAV, FLAC, OGG, and M4A files up to 20 MB.
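For teams scripting batch uploads, a pre-flight check against these limits might look like the sketch below. The `check_upload` helper is hypothetical, and reading the size limit as 20 × 1024 × 1024 bytes is an assumption; the actual upload API and exact byte limit are not documented here.

```python
from pathlib import Path

# Formats listed above, as lowercase file extensions.
ALLOWED_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}

# Assumption: the 20 MB limit is read as 20 * 1024 * 1024 bytes.
MAX_BYTES = 20 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return (ok, reason) for a candidate upload, checking format then size."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported format: {ext or 'no extension'}"
    if size_bytes > MAX_BYTES:
        return False, "file exceeds the 20 MB limit"
    return True, "ok"


print(check_upload("track.MP3", 5_000_000))
print(check_upload("track.aiff", 1_000))
```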
The pipeline isolates vocals from instrumentals before transcription, then applies a region-aware language model. Each transcribed line carries a confidence score, so reviewers know exactly which lines need a second pass. Accuracy varies by genre, production density, and audio quality — heavily processed vocals (autotune, layered ad-libs) and live recordings tend to score lower than clean studio vocals.
English and Spanish are fully supported from day one, including code-switching detection for bilingual tracks. More languages are coming soon.
Most tracks are processed in 30 to 90 seconds. You can watch the pipeline progress in real time.
Yes. Every transcription can be reviewed and edited. Your edits are saved separately so you can always revert to the AI version.
Credits reset monthly and don't carry over. This keeps pricing simple and predictable.
The Musavox Brief
Notable releases, transcription patterns observed across the platform, and editorial guidance for catalog teams. Free, weekly.
No spam. One email per week. Unsubscribe anytime. Read the privacy policy.