Reading and writing WAV files

by Don Cross - dcross@intersrv.com

This page last modified on 26 June 1999.


Introduction

This page is for programmers who wish to read and write WAV format audio files in their programs. I have code here for C++ and Turbo Pascal. You may use this code freely in non-commercial programs. My source code is copyrighted, and use in commercial (i.e. for profit) computer programs requires my express written permission. Contact me by email if you are interested.

Whether you want to use my code or write your own for interfacing with WAV files, the following information about the format of a WAV file should be helpful.


The WAV file format

The WAV file format is a variant of the RIFF format for data interchange between programs. This format was designed so that data in a file is broken up into self-described, independent "chunks". Each chunk has a prefix which describes the data in that chunk. The prefix is a four-character chunk ID which defines the type of data in the chunk, followed by a 4-byte integer which is the size of the rest of the chunk in bytes. The size does not include the 8 bytes in the prefix. The chunks can be nested. In fact, a RIFF file contains a single chunk of type "RIFF", with other chunks nested inside it. Therefore, the first four bytes of a WAV file are "RIFF", and the four bytes after that contain the size of the whole file minus 8 bytes.

After the RIFF header is the WAV data, consisting of the string "WAVE" and two important chunks: the format header and the audio data itself. There may also be other chunks in a WAV file that contain text comments, copyrights, etc., but they are not needed to play the recorded sound. I have not tested this code on WAV files that contain these extra kinds of chunks, so the code might "blow chunks" in that case. If this causes anyone problems, please send me email and I will fix them.
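The chunk layout described above can be sketched in C++ as follows. This is only an illustration, not the code from the downloads on this page: the names ChunkInfo, readLE32, readChunkPrefix and listWavChunks are hypothetical, and the file is assumed to be already loaded into memory. The sketch also assumes the usual RIFF rule that a chunk body of odd length is followed by one pad byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper type for illustration; not taken from wavio or riff.h.
struct ChunkInfo
{
    std::string   id;      // four-character chunk ID, e.g. "RIFF" or "fmt "
    std::uint32_t size;    // size of the chunk body, excluding the 8-byte prefix
    std::size_t   offset;  // offset of the chunk body within the buffer
};

// Decode a 32-bit unsigned integer stored least significant byte first.
std::uint32_t readLE32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Read the 8-byte prefix at 'pos'. The caller guarantees pos + 8 <= data.size().
ChunkInfo readChunkPrefix(const std::vector<unsigned char>& data, std::size_t pos)
{
    assert(pos + 8 <= data.size());
    ChunkInfo info;
    info.id.assign(reinterpret_cast<const char*>(&data[pos]), 4);
    info.size   = readLE32(&data[pos + 4]);
    info.offset = pos + 8;
    return info;
}

// List the chunks nested inside the outer "RIFF" chunk of a WAV file image.
// Returns an empty list if the image does not start with "RIFF" ... "WAVE".
std::vector<ChunkInfo> listWavChunks(const std::vector<unsigned char>& data)
{
    std::vector<ChunkInfo> chunks;
    if (data.size() < 12) return chunks;

    ChunkInfo riff = readChunkPrefix(data, 0);
    std::string form(reinterpret_cast<const char*>(&data[8]), 4);
    if (riff.id != "RIFF" || form != "WAVE") return chunks;

    std::size_t pos = 12;   // skip "RIFF", the RIFF size, and "WAVE"
    while (pos + 8 <= data.size())
    {
        ChunkInfo c = readChunkPrefix(data, pos);
        if (c.size > data.size() - c.offset) break;   // truncated chunk: stop
        chunks.push_back(c);
        pos = c.offset + c.size + (c.size & 1);       // skip body plus pad byte
    }
    return chunks;
}
```

Because every chunk carries its own size, a reader built this way can skip chunk types it does not recognize (text comments, copyrights, etc.) and still find "fmt " and "data".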

The format header

The format header describes how the audio data is formatted in the file. Without the format header, you cannot correctly interpret the audio data. Here's what the WAV format header looks like:

name             size [bytes]  description
ckID             4             The ASCII string "fmt ". Note the single trailing space character. All chunk IDs have to be 4 characters, so trailing spaces are used to pad shorter strings.
nChunkSize       4             A 32-bit unsigned integer which holds the length of the rest of the 'fmt ' chunk in bytes, not counting the 8 bytes of ckID and nChunkSize. Note that this and all other multi-byte integer data in a WAV file are expressed with the least significant byte first. For example, if a WAV file's chunk size is 16, then a hex dump of nChunkSize would print out 10 00 00 00.
wFormatTag       2             This defines how the audio data is encoded in the WAV file. This value will almost always be 1, which means Pulse Code Modulation (PCM). This web page, and my source code, all assume PCM. I don't support other formats, because they are mostly for compression and require more complicated algorithms. For my (experimental) purposes, I'd rather have to buy a big hard drive than write lots of complicated code!
nChannels        2             The number of channels of audio present in the WAV file. For monaural sounds there is 1 channel; for stereo sounds, there are 2 channels. It is possible to have more than 2 channels (e.g. Surround Sound), but this is rare. However, the number of channels should never be less than 1.
nSamplesPerSec   4             The sampling rate expressed in samples per second, or Hz. The reciprocal of this number is the amount of time between samples expressed in seconds. Typical values are 11025 (telephone quality), 22050 (radio quality), and 44100 (CD quality). You will probably not see sampling rates less than 8000 Hz or higher than 48000 Hz.
nAvgBytesPerSec  4             The average number of bytes per second that a player program would have to process to play this audio in real time. For PCM audio, this is redundant because you can calculate it by multiplying together the sampling rate, number of channels, and number of bytes per sample.
nBlockAlign      2             The number of bytes to output at a single time, that is, the size of one sample for every channel. In PCM, this is the same as the number of bytes per sample multiplied by the number of audio channels.
nBitsPerSample   2             This field is present only in PCM recordings. It defines the number of bits per sampled audio amplitude. It will usually be either 8 or 16. Eight-bit audio files have only 256 different amplitude levels possible, so they are low quality and contain inherent "hiss" known as quantization distortion. Sixteen-bit audio files sound much better but are twice as large (assuming the same sampling rate and number of channels).
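As a rough sketch of how the fields above fit together, the following C++ decodes the 16 bytes that follow ckID and nChunkSize in a PCM format header and checks the two redundant fields against the others. The struct WaveFormat and the functions le16, le32, parseFormat, bytesPerSample and isConsistentPcm are hypothetical names chosen for this example; they are not the interface of the code offered on this page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical struct mirroring the fields that follow ckID and nChunkSize.
struct WaveFormat
{
    std::uint16_t wFormatTag;
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t nBitsPerSample;
};

// Little-endian decoders: least significant byte comes first in the file.
std::uint16_t le16(const unsigned char* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

std::uint32_t le32(const unsigned char* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode the 16-byte body of a PCM 'fmt ' chunk (the bytes after nChunkSize).
WaveFormat parseFormat(const unsigned char* body)
{
    assert(body != nullptr);
    WaveFormat f;
    f.wFormatTag      = le16(body + 0);
    f.nChannels       = le16(body + 2);
    f.nSamplesPerSec  = le32(body + 4);
    f.nAvgBytesPerSec = le32(body + 8);
    f.nBlockAlign     = le16(body + 12);
    f.nBitsPerSample  = le16(body + 14);
    return f;
}

// Bytes occupied by one sample of one channel: bits rounded up to whole bytes.
std::uint32_t bytesPerSample(const WaveFormat& f)
{
    return (f.nBitsPerSample + 7u) / 8u;
}

// For PCM, nBlockAlign and nAvgBytesPerSec can be recomputed and compared.
bool isConsistentPcm(const WaveFormat& f)
{
    if (f.wFormatTag != 1 || f.nChannels < 1) return false;
    std::uint32_t blockAlign = bytesPerSample(f) * f.nChannels;
    return f.nBlockAlign == blockAlign
        && f.nAvgBytesPerSec == f.nSamplesPerSec * blockAlign;
}
```

For 16-bit stereo at 44100 Hz, for example, nBlockAlign works out to 2 x 2 = 4 bytes and nAvgBytesPerSec to 44100 x 4 = 176400.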

The PCM data

After reading the format header chunk (of type "fmt "), you will know how to interpret the PCM data, which is in its own chunk of type "data". The first thing you need to know is the size of each sample. This is determined using nBitsPerSample from the format header. If nBitsPerSample is less than or equal to 8 bits, then each PCM sample will be one byte, and will range in value from 0 to 255 inclusive. However, if nBitsPerSample is between 9 and 16 bits, each PCM sample will occupy two bytes, and range in value from -32768 to +32767. (Yes, it is inconsistent to use an unsigned interpretation in one case and signed in the other, but that's the way it is!)
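The two sample sizes might be decoded along these lines. This is a sketch under assumptions: the function names sampleSizeInBytes and decodeSample are made up for the example, and scaling the result to the range -1.0 to just under +1.0 is merely one convenient convention, not something the WAV format requires. It also assumes that the midpoint value 128 represents silence in the unsigned one-byte case.

```cpp
#include <cassert>

// Number of bytes one PCM sample occupies, per the rule described above.
int sampleSizeInBytes(int nBitsPerSample)
{
    assert(nBitsPerSample >= 1 && nBitsPerSample <= 16);
    return (nBitsPerSample <= 8) ? 1 : 2;
}

// Decode one sample starting at 'p' into a floating point amplitude.
// Scaling to [-1.0, +1.0) is an assumption made for this illustration.
double decodeSample(const unsigned char* p, int nBitsPerSample)
{
    if (sampleSizeInBytes(nBitsPerSample) == 1)
    {
        // One byte, unsigned 0..255; 128 is taken as the zero level.
        return (static_cast<int>(p[0]) - 128) / 128.0;
    }

    // Two bytes, least significant byte first, signed -32768..+32767.
    int value = p[0] | (p[1] << 8);      // 0..65535
    if (value >= 32768) value -= 65536;  // reinterpret as signed
    return value / 32768.0;
}
```

Converting both cases to one signed floating point range hides the unsigned/signed inconsistency from the rest of a program.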

The next thing you need to know is how many audio channels there are. If there is just one channel, then the PCM data are ordered in chronological order, with the earliest sample first in the file and the latest sample last in the file. If there is more than one channel, all the samples for the first time index are given first, iterating through the channels. Then the next sample is given, again iterating through the channels. For example, if you had PCM data with an arbitrary number of channels in an array called buffer, you could process it using code that looked like this:

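// numSamples is the total number of values in buffer, counting all channels
for ( int i=0; i+numChannels <= numSamples; i += numChannels )
{
    for ( int c=0; c<numChannels; c++ )
    {
        // ... do something with buffer[i+c], the sample for channel c
    }
}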

Note that if there are two channels, the first value in each pair is the left channel, and the second one is the right channel. Thus you can think of the left channel as Channel 0, and the right channel as Channel 1.
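When it is more convenient to work on each channel separately, the interleaved data can be split apart first. The function below is a minimal sketch, assuming 16-bit samples already decoded into a vector of short; the name splitChannels is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Split interleaved PCM data into one vector per channel.
// Any incomplete trailing group of samples is ignored.
std::vector<std::vector<short>> splitChannels(const std::vector<short>& buffer,
                                              int numChannels)
{
    assert(numChannels >= 1);
    const std::size_t n = static_cast<std::size_t>(numChannels);
    const std::size_t numFrames = buffer.size() / n;   // samples per channel

    std::vector<std::vector<short>> channels(n, std::vector<short>(numFrames));
    for (std::size_t i = 0; i < numFrames; ++i)
    {
        for (std::size_t c = 0; c < n; ++c)
        {
            // Sample for time index i, channel c
            channels[c][i] = buffer[i * n + c];
        }
    }
    return channels;
}
```

With two channels, channels[0] then holds the left channel and channels[1] the right channel.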

The only other important quantity is the sampling rate. This is important mainly when you need to process audio based on real time values expressed in seconds, or need to interpret frequency information in Hz (using the Fast Fourier Transform, for example).
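As a small illustration of where the sampling rate enters, the helpers below convert between sample positions and seconds, and between the bins of a Fourier transform and Hz. The function names are hypothetical, and the bin formula assumes a transform computed over fftSize consecutive samples of a single channel.

```cpp
#include <cassert>

// Time in seconds of the sample at 'index' (per channel, first sample at 0).
double sampleIndexToSeconds(long index, unsigned long nSamplesPerSec)
{
    assert(nSamplesPerSec > 0);
    return static_cast<double>(index) / static_cast<double>(nSamplesPerSec);
}

// Index of the sample at or just before time 't' seconds (t >= 0).
long secondsToSampleIndex(double t, unsigned long nSamplesPerSec)
{
    assert(t >= 0.0);
    return static_cast<long>(t * static_cast<double>(nSamplesPerSec));
}

// Center frequency in Hz of bin 'bin' of a transform over 'fftSize' samples.
// Bins 0 .. fftSize/2 cover 0 Hz up to half the sampling rate.
double fftBinToHz(int bin, int fftSize, unsigned long nSamplesPerSec)
{
    assert(fftSize > 0 && bin >= 0);
    return static_cast<double>(bin) * static_cast<double>(nSamplesPerSec)
         / static_cast<double>(fftSize);
}
```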


Turbo Pascal source code

wavio.pas - This is a Turbo Pascal unit which lets you read and write WAV files. It implements an object type called WaveFile. You can open a WaveFile object for read or for write. Once open, you can read/write data in blocks of arbitrary size. Finally, you close the WaveFile using its close procedure.

tchtone.pas - This is a sample program that shows how to use the WavIO unit. It generates touch tone noises in an output wave file based on a phone number you supply in the command line. You can also download the executable of a C++ version of this program.


C++ source code

waviocpp.zip - This zip file contains all the source code you will need to open, read, and write WAV files in C++. Pay special attention to class WaveFile in the file riff.h.

Here is a sample program for using the above C++ code.


Related Resources


[Don Cross home page]
[digital audio page]