---
tags: [sound, binary]
created: Friday, December 27, 2024
---
# Audio file formats
## CD quality
We can treat CDs as the gold standard: the best digital audio quality widely
available to consumers.

CDs store audio in uncompressed PCM (Pulse Code Modulation) format. They use a
sampling rate of 44.1kHz and a bit depth of 16 bits per sample, across two
channels to enable stereo.
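
The bit rate of CD audio follows directly from these parameters. A minimal sketch of the arithmetic, assuming the standard CD bit depth of 16 bits per sample (the function name `pcm_bit_rate` is just an illustration):

```python
SAMPLE_RATE_HZ = 44_100  # samples per second, per channel
BIT_DEPTH = 16           # bits per sample (standard for CD audio)
CHANNELS = 2             # stereo


def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Return the bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


cd_bps = pcm_bit_rate(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)
bytes_per_minute = cd_bps // 8 * 60

print(f"{cd_bps / 1000:.1f} kbps")                          # 1411.2 kbps
print(f"{bytes_per_minute / 1_000_000:.2f} MB per minute")  # 10.58 MB per minute
```
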
## What lossy formats entail
A conversion to MP3 from, for example, a CD source is always a one-way process:
once information is discarded in the compression process, it cannot be
retrieved.

This is in contrast to lossless methods like FLAC, where the original CD audio
can always be reconstructed.

It follows that if you repeatedly re-encode audio as MP3 (decoding an MP3 and
encoding the result again), the quality deteriorates with each pass, since more
data is removed each time.
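
The one-way nature of lossy encoding can be illustrated with a toy example. This is not how MP3 works internally: the "codec" below simply rounds each sample to a multiple of a step size, and the second pass uses a different step as a stand-in for a re-encode with different settings. All names and values are hypothetical.

```python
def lossy_encode(samples, step):
    """Toy lossy 'codec': store only the nearest multiple of `step` for each sample."""
    return [round(s / step) for s in samples]


def decode(codes, step):
    """Rebuild samples from the stored multiples; the rounding error is gone for good."""
    return [c * step for c in codes]


def total_error(reference, candidate):
    """Sum of absolute differences between two equal-length sample lists."""
    return sum(abs(r - c) for r, c in zip(reference, candidate))


original = [1003, -2517, 640, 32001, -15]

# First generation: encode and decode once.
once = decode(lossy_encode(original, 256), 256)

# Second generation: re-encode the already-degraded audio with another step size.
twice = decode(lossy_encode(once, 384), 384)

print(once == original)              # False: the original cannot be recovered
print(total_error(original, once))
print(total_error(original, twice))  # larger than after the first pass
```
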
## Major audio formats
### WAV (Waveform Audio File Format)
- Typically holds uncompressed PCM, so it can store CD-quality audio with no
  compression
- A WAV ripped from a CD can be bit-for-bit identical to the CD source
- Developed by Microsoft and IBM, historically for Windows machines, but plays
  on all operating systems
### FLAC (Free Lossless Audio Codec)

- Contains the same audio as WAV but in a losslessly compressed format
- The difference is like that between a novel in a text file (WAV) and the same
  novel in a zipped file (FLAC)
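
The zip analogy can be made concrete. The sketch below uses Python's general-purpose `zlib` compressor on raw PCM bytes; FLAC itself uses audio-specific techniques rather than zlib, so this only illustrates the lossless round trip, not the FLAC algorithm. The tone and sample rate are arbitrary illustrative values.

```python
import math
import struct
import zlib

# One second of a 440 Hz tone as 16-bit mono PCM at 8 kHz (illustrative values).
sample_rate = 8000
samples = [
    int(12000 * math.sin(2 * math.pi * 440 * n / sample_rate))
    for n in range(sample_rate)
]
pcm = struct.pack(f"<{len(samples)}h", *samples)

compressed = zlib.compress(pcm, 9)      # smaller on disk
restored = zlib.decompress(compressed)  # expands back to the original bytes

print(len(pcm), len(compressed))
print(restored == pcm)  # True: nothing was lost
```
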
### MP3 (MPEG-1 Audio Layer III)

- Lossy format.
- When a WAV file (or other lossless format) is converted to MP3, a Fast Fourier
  Transform analysis is performed to determine which frequencies are present in
  the sound
- This is used by the encoder to decide which parts of the sound are
  imperceptible and thus which can be discarded to reduce the file size. This is
  done through the application of psycho-acoustic models.
- The remaining data is then compressed
- Examples of the data reduction applied:
  - Removing frequencies that humans cannot hear
  - Removing quieter sounds that are masked by louder sounds
  - Combining similar frequencies
  - Reducing stereo information where it is less noticeable
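
The analyse-then-discard idea can be sketched in a few lines. This is a deliberately crude illustration: it uses a naive discrete Fourier transform (an FFT computes the same result faster), and a simple magnitude threshold stands in for a real psycho-acoustic model. The sample rate, frame size and 5% threshold are all assumptions chosen for the example.

```python
import cmath
import math

SAMPLE_RATE = 8000  # Hz (illustrative)
N = 256             # samples per analysis frame


def dft(frame):
    """Naive discrete Fourier transform over the non-negative frequency bins."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n // 2)
    ]


# A loud 500 Hz tone plus a much quieter 2000 Hz tone.
frame = [
    1.0 * math.sin(2 * math.pi * 500 * t / SAMPLE_RATE)
    + 0.01 * math.sin(2 * math.pi * 2000 * t / SAMPLE_RATE)
    for t in range(N)
]

magnitudes = [abs(c) for c in dft(frame)]
peak = max(magnitudes)

# Frequencies actually present in the frame (well above rounding noise).
present_hz = [k * SAMPLE_RATE / N for k, m in enumerate(magnitudes) if m > 1e-6 * peak]

# Crude stand-in for a psycho-acoustic model: keep only components whose
# magnitude is at least 5% of the loudest one, discard the rest.
kept_hz = [k * SAMPLE_RATE / N for k, m in enumerate(magnitudes) if m >= 0.05 * peak]

print(present_hz)  # [500.0, 2000.0]
print(kept_hz)     # [500.0]
```

Here the quiet 2000 Hz component is dropped while the loud 500 Hz tone is kept; a real encoder makes this decision per frequency band using a model of human hearing rather than a fixed threshold.
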
### OGG: Ogg Vorbis
- An open-source alternative to MP3
- Typically achieves better quality than MP3 at the same bit rate, especially at
  very high or very low frequencies
- Also better stereo handling at low frequencies
- Uses a more modern psycho-acoustic model
## Variable and constant bit rates
For lossy formats like MP3, the amount of data that is encoded from the
uncompressed source is expressed as a "bit rate" in kilobits per second (kbps):
**how many thousands of bits are used to represent each second of audio**.

There are two methods of encoding that affect the bit rate.

With _variable bit rate_ (VBR) encoding, the encoder dynamically adjusts the bit
rate depending on what is happening in the music. During complex passages (e.g.
a full orchestra) it uses a higher bit rate to capture the detail. During
simpler passages (like a single instrument) it uses lower bit rates since less
data is needed, and during silence it can drop to very low bit rates.

_Constant bit rate_ (CBR) encoding uses the same bit rate throughout and is
consequently less efficient.
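
The difference between the two methods can be sketched with a toy allocator. Real encoders work from quality targets and psycho-acoustic analysis; here each frame just gets a made-up "complexity" score, and the function names, floor and ceiling are assumptions for illustration only.

```python
def cbr_allocation(complexities, kbps):
    """Constant bit rate: every frame gets the same bit rate regardless of content."""
    return [kbps] * len(complexities)


def vbr_allocation(complexities, average_kbps, floor_kbps=32, ceiling_kbps=320):
    """Variable bit rate: scale each frame's bit rate by its relative complexity.

    Clamping to the floor and ceiling means the true average can drift slightly
    from `average_kbps`; a real encoder would correct for this.
    """
    mean = sum(complexities) / len(complexities)
    return [
        min(ceiling_kbps, max(floor_kbps, round(average_kbps * c / mean)))
        for c in complexities
    ]


# Hypothetical complexity scores: silence, solo instrument, full orchestra, chorus.
frames = [0.0, 0.3, 1.0, 0.7]

cbr = cbr_allocation(frames, 128)
vbr = vbr_allocation(frames, 128)

print(cbr)  # [128, 128, 128, 128]
print(vbr)  # [32, 77, 256, 179]
```
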
When talking about the quality of MP3s there are generally two bit rates cited:

- ~128kbps: acceptable, but with significantly reduced quality
- 320kbps: the highest quality you can get whilst still using a lossy method
  like MP3

With VBR, these are sometimes expressed as an average.

Subjectively, a 128kbps MP3 might sound "underwater" or "swishy", while a
320kbps version would preserve much more detail.

Still, the bit rate of a CD is 1411kbps! A 128kbps MP3 therefore retains only
about 9% of the data of the CD source, and a 320kbps MP3 about 23%. (Perceived
quality does not fall in proportion, since the encoder discards the least
audible data first.)
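
The percentages above are simple ratios of bit rates. A short sketch of the arithmetic, which also shows what each bit rate means for the size of a four-minute track (the track length is an arbitrary example):

```python
CD_KBPS = 1411       # bit rate of CD audio, as cited above
TRACK_SECONDS = 240  # hypothetical four-minute track


def share_of_cd(kbps: float) -> float:
    """Fraction of the CD bit rate that a given bit rate represents."""
    return kbps / CD_KBPS


def size_megabytes(kbps: float, seconds: int) -> float:
    """File size in megabytes (10^6 bytes) for audio at a constant bit rate."""
    return kbps * 1000 * seconds / 8 / 1_000_000


for kbps in (128, 320, CD_KBPS):
    print(
        f"{kbps:>4} kbps: {share_of_cd(kbps):6.1%} of CD, "
        f"{size_megabytes(kbps, TRACK_SECONDS):.2f} MB"
    )
```
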
## Streaming services
Spotify primarily uses Ogg Vorbis but offers different bit rates on its
different tiers. The free tier ranges from 24kbps to 160kbps using VBR, with the
option of 320kbps on the premium tier.

Other services (Apple Music, Amazon Music, Tidal) offer FLAC or FLAC-equivalent
quality at their most expensive tiers.

Ogg Vorbis is particularly well-suited to streaming. It can switch bit rates
seamlessly mid-stream, which helps when network conditions change. In addition,
data is organised into independent packets, so the loss of a single packet has
little perceptible effect.
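
Switching bit rates in response to the network can be sketched as picking the highest rate that fits the measured bandwidth. This is a hypothetical client-side heuristic, not Spotify's actual logic: the ladder's intermediate steps, the `headroom` margin and the function name are all assumptions.

```python
# Hypothetical bit-rate ladder in kbps. Only the 24-160 range and the 320 option
# come from the text above; the intermediate step is an assumption.
LADDER_KBPS = (24, 96, 160, 320)


def pick_bit_rate(bandwidth_kbps: float, ladder=LADDER_KBPS, headroom: float = 0.8) -> int:
    """Return the highest bit rate that fits within a safety margin of the bandwidth.

    Falls back to the lowest rung when even that does not fit.
    """
    usable = bandwidth_kbps * headroom
    candidates = [rate for rate in ladder if rate <= usable]
    return max(candidates) if candidates else min(ladder)


# Simulated bandwidth measurements (kbps) as network conditions change.
measured = [500, 150, 20, 260]
chosen = [pick_bit_rate(b) for b in measured]

print(chosen)  # [320, 96, 24, 160]
```
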