Lyrics transcription for music professionals.

Vocal isolation, dialect-aware processing, and structured output — purpose-built for catalogs in Spanish, English, and code-switched production.

In production with labels and publishers across Latin America and the U.S.

Music Technology Venture Studio · San Juan, Puerto Rico

Demo preview (musavox.io / dashboard)

Tracks: 12
Avg accuracy: 97%
Dialect: PR · MX
Pipeline: Active

Recent transcriptions

  • Bad Bunny DTMF (PR · Reggaetón · 47s)
  • Peso Pluma Lady Gaga (MX · Corrido Tumbado · 52s)
  • Karol G Si Antes Te Hubiera Conocido (CO · Reggaetón Pop · 49s)
  • Rauw Alejandro Touching The Sky (PR · Trap Latino · 51s)
  • ELENA ROSE Me Lo Merezco (US Latin · Pop · 44s)

Active pipeline: Vocal isolation, Transcription, Post-processing, Done

Smart Review: 82%

“Que se joda el que corille

Capabilities

Six capabilities that compound across the workflow.

VOCAL ISOLATION

Source separation, before the transcriber sees a single bar.

Source separation removes the instrumental before transcription. The transcriber receives clean vocals, not the full mix.

ADAPTIVE DIALECT ENGINE

Region-aware Spanish processing.

Spanish vocabulary differs sharply across territories. Musavox detects the regional dialect from the audio and applies a curated lexicon during post-processing. Current coverage spans 8 regions: Puerto Rico, México, Colombia, República Dominicana, Venezuela, U.S. Latin, Argentina, and Chile. Coverage expands with catalog demand.

CONFIDENCE HIGHLIGHTING

Every line carries a score. Reviewers see only what matters.

Each transcribed line carries a confidence score. Lines that score below the threshold are flagged in the editor with smart-review navigation, so reviewers go straight to the lines that need a second pass instead of reading every line.
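As a rough illustration of threshold-based flagging, here is a minimal Python sketch. The `Line` type, the 0.0 to 1.0 score range, and the 0.85 default threshold are assumptions made for this example; the page does not state how Musavox represents scores or what threshold it uses.

```python
from dataclasses import dataclass


@dataclass
class Line:
    index: int
    text: str
    confidence: float  # assumed range: 0.0 to 1.0


def flag_for_review(lines, threshold=0.85):
    """Return only the lines whose confidence falls below the threshold."""
    return [line for line in lines if line.confidence < threshold]


# Placeholder lines with made-up scores, for illustration only.
lines = [
    Line(1, "first line of the verse", 0.97),
    Line(2, "second line of the verse", 0.62),
    Line(3, "third line of the verse", 0.91),
    Line(4, "fourth line of the verse", 0.80),
]

flagged = flag_for_review(lines)
print([line.index for line in flagged])  # [2, 4]
```

A reviewer working from `flagged` would see two lines instead of four, which is the effect the feature describes.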

CODE-SWITCH DETECTION

Bilingual tracks are first-class citizens.

Mid-verse switches between Spanish and English are detected and tagged in the output. The post-processor handles bilingual production without requiring manual segmentation.

INDUSTRY-STANDARD EXPORTS

TXT, LRC, SRT, JSON. Plus the integrations that publishing actually needs.

Word-level timestamps in LRC. Plain text with section markers in TXT. Subtitle-format SRT for music video workflows. Structured JSON for downstream metadata pipelines.
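To make the difference between the timed formats concrete, the sketch below writes the same timed lines as LRC and as SRT. It is line-level only (it does not attempt word-level tags), and the `(start, end, text)` tuple layout and the function names are assumptions for this example, not Musavox's export code.

```python
def lrc_timestamp(seconds):
    """Format seconds as the [mm:ss.xx] tag used by LRC files."""
    total_cs = round(seconds * 100)  # work in centiseconds
    minutes, cs = divmod(total_cs, 6000)
    return f"[{minutes:02d}:{cs // 100:02d}.{cs % 100:02d}]"


def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_lrc(lines):
    """One LRC line per lyric line: start tag followed by the text."""
    return "\n".join(f"{lrc_timestamp(start)}{text}" for start, _end, text in lines)


def to_srt(lines):
    """Numbered SRT cues with a start and end time, separated by blank lines."""
    blocks = []
    for i, (start, end, text) in enumerate(lines, start=1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"


# Placeholder lines: (start seconds, end seconds, text).
lines = [(12.5, 15.0, "first line"), (75.25, 78.0, "second line")]
print(to_lrc(lines))
print(to_srt(lines))
```

LRC carries only a start time per line, while SRT needs both start and end, which is why the input keeps both values.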

CATALOG-LEVEL CONSISTENCY

Editorial corrections you make on one track propagate as suggestions within your own catalog only. Your editorial layer stays inside your account; nothing leaves your tenant boundary. Use of anonymized corrections to improve global model accuracy is opt-in per organization and disabled by default.

ADAPTIVE DIALECT ENGINE

Region-aware processing across 8 dialect modules.

Spanish vocabulary, idiom, and pronunciation differ sharply across territories. Reggaetón from Puerto Rico, corridos tumbados from México, and dembow from República Dominicana each carry distinct lexicons that generic models smooth into a textbook neutral. The result reads wrong to native speakers and creates downstream metadata problems.

Eight fully curated regional modules covering Puerto Rico, México, República Dominicana, Colombia, Argentina, Chile, Venezuela, and U.S. Latin. 302+ curated terms across all regions. Lexicons expand as catalogs in new territories enter the platform.

Detected dialect: PR · Puerto Rico

  • Puerto Rico: 82% (matched: jevo, corillo, bellaco, perreo)
  • Mexico: 31%
  • Colombia: 18%
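The matched terms in the example above suggest how a regional lexicon can serve as evidence for a dialect. The Python sketch below is a text-only stand-in: it counts lexicon hits in a transcript, whereas the page says Musavox detects dialect from the audio. The PR terms are taken from the example; the MX and CO terms and all function names are assumptions for illustration, not Musavox's lexicon or scoring.

```python
import re

# Illustrative lexicons only. The PR terms come from the example above;
# the MX and CO terms are placeholder assumptions, not Musavox's lexicon.
LEXICONS = {
    "PR": {"jevo", "corillo", "bellaco", "perreo"},
    "MX": {"wey", "chido", "morra"},
    "CO": {"parce", "chimba", "bacano"},
}


def match_terms(text, lexicons=LEXICONS):
    """Return, per region, the sorted lexicon terms that occur in the text."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return {region: sorted(terms & tokens) for region, terms in lexicons.items()}


def best_region(text, lexicons=LEXICONS):
    """Pick the region with the most matched terms, or None if nothing matched."""
    matches = match_terms(text, lexicons)
    region = max(matches, key=lambda r: len(matches[r]))
    return region if matches[region] else None


sample = "el corillo llegó al perreo con su jevo"
print(match_terms(sample)["PR"])  # ['corillo', 'jevo', 'perreo']
print(best_region(sample))        # PR
```

Raw hit counts are a crude signal; they are shown here only to make the "matched" list concrete, not to reproduce the percentages in the example.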

How Musavox compares

Capability · Moises · VEED · Songscription · Musavox

  • Lyrics transcription
  • Vocal isolation: Basic (competitor) · Production-grade (Musavox)
  • Latin music focus: Native, regional-aware (Musavox)
  • Regional dialect detection: Auto-detect, 8 regions (Musavox)
  • Ad-lib + producer tag detection
  • Code-switch (Spanish/English): Basic (competitor) · First-class citizen (Musavox)
  • Confidence scoring per line
  • Smart Review flagging
  • Industry-standard exports: TXT (Moises) · TXT, SRT (VEED) · TXT (Songscription) · TXT, LRC, SRT, JSON (Musavox)
  • Correction-capture training loop: Catalog-specific learning (Musavox)
  • Built for label/publisher catalogs

Built for

Built for catalog operators.

A&R Executives

Evaluate prospective signings with lyrics transcribed in the dialect of the artist. Identify thematic patterns and lyrical consistency across a roster before contract.

Label Managers and Artist Relations

Transcribe lyrics at the pace of release week. Submit final metadata to distribution with verified lyrics, not approximated ones. An audit trail for every editorial decision lives inside the platform.

Publishers

Lyrics ready for copyright registration with word-level timestamps. Confidence scores flag lines that need a second pass before submission.

Music Attorneys

Lyric documentation with confidence scoring, version history, and editorial provenance per track.

Why Musavox

Musavox was built by operators who needed a transcription tool that understood the music they were releasing.

Generic tools smooth regional vocabulary into neutral Spanish.

Editorial teams ship with errors. Metadata gets rejected.

Copyright registrations carry approximations instead of verified text.

We built the tool we wanted to use. Then we opened it to other catalogs.

— The Musavox team

Simple, Transparent Pricing

Start free. Scale as you grow.

free

Try it out

$0/mo
  • 5 transcriptions/month
  • English + Spanish
  • TXT export
  • Confidence scoring
Get Started

studio

Independent A&Rs, managers, songwriters

$19/mo
  • 30 transcriptions/month
  • English + Spanish
  • TXT + LRC export
  • Confidence scoring
  • Email support
Choose Studio
Popular

pro

Growing teams, production houses

$49/mo
  • 100 transcriptions/month
  • English + Spanish
  • All export formats
  • Batch upload
  • Priority support
Choose Pro

label

Established labels, publisher catalogs

$149/mo
  • 400 transcriptions/month
  • All languages
  • All export formats
  • Batch upload
  • Multi-seat (up to 5 users)
  • Custom dialect tuning per territory
  • Priority pipeline lane
  • Dedicated support
Choose Label

enterprise

Major labels, publisher catalogs, multi-territory operations

Custom
  • Everything in Label
  • Unlimited transcriptions
  • Unlimited team seats
  • Multi-office support
Talk to sales

Frequently Asked Questions

Musavox supports MP3, WAV, FLAC, OGG, and M4A files up to 20MB.
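A client-side pre-check against these limits might look like the Python sketch below. It is an illustration only: the function name is hypothetical, the check looks at the file extension rather than the audio content, and "20MB" is read as 20 × 1024 × 1024 bytes, which is an assumption.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}
MAX_BYTES = 20 * 1024 * 1024  # assumption: "20MB" read as 20 MiB


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 20 MB")
    return problems


print(check_upload("demo.FLAC", 5_000_000))  # []
print(check_upload("demo.aiff", 1_000))      # ['unsupported format: .aiff']
```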

The pipeline isolates vocals from instrumentals before transcription, then applies a region-aware language model. Each transcribed line carries a confidence score, so reviewers know exactly which lines need a second pass. Accuracy varies by genre, production density, and audio quality — heavily processed vocals (autotune, layered ad-libs) and live recordings tend to score lower than clean studio vocals.

English and Spanish are fully supported from day one, including code-switching detection for bilingual tracks. More languages are coming soon.

Most tracks are processed in 30-90 seconds. You can watch the pipeline progress in real time.

Every transcription can be reviewed and edited. Your edits are saved separately, so you can always revert to the AI version.

Transcription credits reset monthly; unused credits don't carry over. This keeps pricing simple and predictable.

The Musavox Brief

A weekly brief for music professionals.

Notable releases, transcription patterns observed across the platform, and editorial guidance for catalog teams. Free, weekly.

No spam. One email per week. Unsubscribe anytime. Read the privacy policy.