Speech-to-Text (STT/ASR) Documentation

Convert spoken Dhivehi audio to written text with AI-powered automatic speech recognition. Record live or upload audio files.

Cost: 0.01 credits per second (2 decimal precision)
Input Methods: Record or Upload (multiple options)
Processing: Real-time (RunPod ASR)

Overview

The Speech-to-Text feature transcribes spoken Dhivehi into written text automatically. It is well suited to note-taking, documentation, content creation, and accessibility.

Key Features:

  • Direct microphone recording with real-time controls
  • Upload pre-recorded audio files
  • Audio waveform visualization
  • Editable transcription results
  • History management with audio playback

Input Methods

Recording Mode

Record audio directly using your device's microphone with full recording controls.

Record/Pause/Stop Controls: Full control over recording
Real-time Timer: See recording duration as you record
Waveform Visualization: Visual representation of audio
Audio Preview: Listen to recording before transcription
Browser-based: Uses the browser's MediaRecorder API

Upload Mode

Upload pre-recorded audio files for transcription.

File Upload: Select audio files from your device
Duration Display: Shows audio length automatically
Audio Preview: Listen to uploaded audio before processing
File Validation: Automatic format and size checking
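
The format and size checks and the duration display described above can be sketched as follows. This is only an illustration, not the service's actual validation logic: the accepted extensions, the 25 MB cap, and the function names `validate_upload` and `wav_duration_seconds` are all assumptions, since this page does not state the real limits. The duration helper handles uncompressed WAV files only, using Python's standard `wave` module.

```python
import os
import wave

# Hypothetical limits for illustration only; the real accepted formats and
# size cap are not stated in this documentation.
ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".webm"}
MAX_SIZE_BYTES = 25 * 1024 * 1024  # assumed 25 MB cap


def validate_upload(path: str) -> list[str]:
    """Return a list of validation problems; an empty list means the file passes."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if not os.path.isfile(path):
        problems.append("file not found")
    elif os.path.getsize(path) > MAX_SIZE_BYTES:
        problems.append("file too large")
    return problems


def wav_duration_seconds(path: str) -> float:
    """Return the length of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration is the frame count divided by frames per second.
        return wav_file.getnframes() / wav_file.getframerate()
```

Returning a list of problems rather than a single boolean lets an interface show every failed check at once, for example both a wrong format and an oversized file.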

How to Transcribe Audio

1. Choose Input Method

Go to the Speech-to-Text page and select your input method:

  • Record: Use microphone to record live audio
  • Upload: Upload a pre-recorded audio file

2. Capture or Upload Audio

If Recording:

  • Click "Record" to start capturing audio
  • Use "Pause" to temporarily stop recording
  • Click "Stop" when finished
  • View waveform visualization and duration

If Uploading:

  • Click the "Upload Audio" button
  • Select your audio file
  • Wait for file validation
  • Preview audio if needed

3. Check Credit Cost

Credits are calculated based on audio duration:

  • Formula: duration (seconds) × 0.01 credits
  • Example: 120 seconds (2 min) = 1.20 credits
  • Example: 300 seconds (5 min) = 3.00 credits
  • 2 decimal precision for accurate billing
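
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the service's billing code: the function name `credit_cost` is hypothetical, and the rounding mode (round half up) is an assumption, since this page states only that costs use 2 decimal places.

```python
from decimal import Decimal, ROUND_HALF_UP

# Rate stated in this documentation: 0.01 credits per second of audio.
RATE_PER_SECOND = Decimal("0.01")


def credit_cost(duration_seconds: float) -> Decimal:
    """Return the estimated credit cost for audio of the given duration.

    Assumption: the result is rounded half up to 2 decimal places.
    """
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    # Convert via str() so float inputs such as 12.5 carry no binary noise.
    cost = Decimal(str(duration_seconds)) * RATE_PER_SECOND
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(credit_cost(120))  # 1.20
print(credit_cost(300))  # 3.00
```

Using `Decimal` instead of floating-point arithmetic keeps results exact at 2 decimal places, which matters when many small charges are added up.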

4. Start Transcription

Click "Transcribe" to start processing:

  • Audio is processed by the RunPod ASR endpoint
  • Progress updates appear in real time
  • Duration is tracked in seconds
  • Credits are deducted when processing starts

5. View and Edit Transcription

Once transcribed, you can:

  • View the full transcription text
  • Edit the text in the rich text editor
  • Copy it to the clipboard
  • Download it as a file
  • Listen to the original audio
  • Access it later from the History tab

Advanced Features

Audio Waveform Display

A visual representation of your audio, rendered with WaveSurfer.js, that shows the peaks and valleys of the audio signal.

• Real-time waveform during recording
• Interactive waveform for uploaded files
• Visual feedback for audio quality
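
To give a sense of what a waveform display draws, the sketch below reduces raw audio samples to one peak value per bar. It is a generic illustration of peak extraction, not WaveSurfer.js's own implementation; the function name `waveform_peaks` and the sample values are hypothetical.

```python
import math


def waveform_peaks(samples: list[float], buckets: int) -> list[float]:
    """Reduce raw samples to one peak (max absolute amplitude) per bucket.

    Returns at most `buckets` values; fewer when the samples do not fill
    every bucket.
    """
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    if not samples:
        return []
    # Number of consecutive samples that share one bar of the waveform.
    size = math.ceil(len(samples) / buckets)
    return [
        max(abs(value) for value in samples[start:start + size])
        for start in range(0, len(samples), size)
    ]


# Eight samples reduced to four peaks, one per pair of samples.
print(waveform_peaks([0.1, -0.5, 0.2, 0.3, -0.9, 0.4, 0.0, -0.1], 4))
# [0.5, 0.3, 0.9, 0.1]
```

Consistently low peaks suggest a quiet recording, while peaks pinned near the maximum suggest clipping, which is the kind of visual feedback on audio quality mentioned above.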

Rich Text Editor

Edit transcribed text with full formatting capabilities. Correct errors, add formatting, and refine content.

• Full text editing capabilities
• Preserve Dhivehi text formatting
• RTL support for proper display

History Management

All transcriptions are automatically saved with audio files for easy reference and re-use.

• View all past transcriptions
• Download audio files
• Re-access and edit transcriptions
• Delete unnecessary records

Mobile Support

Works on mobile devices, including iOS, where a dedicated audio player is used.

• iOS-specific audio player
• Mobile recording support
• Responsive interface

Common Use Cases

Meeting Transcription

Record and transcribe Dhivehi meetings, interviews, or discussions for documentation and reference.

Content Creation

Convert spoken Dhivehi content into written form for blogs, articles, or social media posts.

Accessibility

Create text versions of Dhivehi audio content for deaf or hard-of-hearing users and for improved accessibility in general.

Note-taking

Dictate notes in Dhivehi instead of typing. Dictation is faster and more natural for quick documentation.

Voice Memos

Convert voice memos and recordings into searchable, editable text for better organization.

Subtitles & Captions

Transcribe Dhivehi videos for subtitle creation and video captioning.

Tips for Best Results

Use Clear Audio

Record in a quiet environment with minimal background noise for best transcription accuracy.

Speak Clearly

Enunciate words clearly and speak at a moderate pace for improved recognition accuracy.

Use a Good Microphone

Better-quality microphones produce better transcription results. Use a headset or external microphone if available.

Review and Edit

Always review transcribed text for accuracy. Edit any mistakes using the built-in editor.

Test Audio Quality

Listen to your recording before transcribing to ensure audio quality is sufficient.

Pricing

0.01 Credits per Second

Pay only for the duration of your audio

30 seconds: 0.30 credits
1 minute (60 seconds): 0.60 credits
5 minutes (300 seconds): 3.00 credits
10 minutes (600 seconds): 6.00 credits
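
The same rate can be read in reverse to estimate how much audio a given credit balance covers. The sketch below is an illustration under stated assumptions: the function name `max_seconds_for_balance` is hypothetical, and partial seconds are rounded down, which this page does not specify.

```python
from decimal import Decimal, ROUND_DOWN

# Rate stated in this documentation: 0.01 credits per second of audio.
RATE_PER_SECOND = Decimal("0.01")


def max_seconds_for_balance(balance_credits: str) -> int:
    """Return the whole seconds of audio that a credit balance covers.

    Assumption: any fraction of a second is rounded down.
    """
    balance = Decimal(balance_credits)
    if balance < 0:
        raise ValueError("balance must be non-negative")
    seconds = (balance / RATE_PER_SECOND).to_integral_value(rounding=ROUND_DOWN)
    return int(seconds)


print(max_seconds_for_balance("5.00"))  # 500
print(max_seconds_for_balance("0.60"))  # 60
```

The balance is passed as a string so that it converts to `Decimal` exactly, without floating-point error.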

Ready to Transcribe Dhivehi Audio?

Access the Speech-to-Text feature in your Dhavana dashboard