1. DataLines |
- 1.1. General
- 1.1.1. How can I be notified when data is available for
write/read in a SourceDataLine or
TargetDataLine?
- 1.1.2. Why does it fail to open any line with 16 kHz sample
rate?
- 1.1.3. How can I get a
SourceDataLine or
TargetDataLine in μ-law
format?
- 1.1.4. Why does simultaneous recording and playback only
work when first opening the playback line
(SourceDataLine)?
- 1.1.5. Why doesn't simultaneous recording and playback work
at all with the Sun JDK 1.3/1.4 on GNU/Linux?
- 1.1.6. How can I get a
Line from a specific
Mixer?
- 1.1.7. Why are there no mono lines
with the "Direct Audio Devices" mixers on Linux?
- 1.1.8. Why is a SourceDataLine
called "source" and a
TargetDataLine called "target"
though it's actually the other way round?
- 1.1.9. Why are
DataLine.getFramePosition() and
DataLine.getMicrosecondPosition() so
inaccurate?
- 1.1.10. Why does DataLine.getLevel()
always return -1?
- 1.1.11. What is the difference
between DataLine.isActive() and
DataLine.isRunning()?
- 1.1.12. How can I detect a buffer
underrun or overrun?
- 1.1.13. Why is there no event for
notifying applications of an underrun/overrun
condition?
- 1.1.14. How can I find out the current
playback or recording position?
- 1.1.15. How can I do looping in
playback?
- 1.2. SourceDataLine
- 1.2.1. How can I avoid that the last bit
of sound played on a SourceDataLine
is repeated?
- 1.2.2. Why is playback distorted, too
fast or too slow with the JDK 1.5.0 beta, but not with
earlier versions of the JDK?
- 1.3. TargetDataLine
- 1.3.1. How can I capture from a
specific source (microphone or line-in)?
- 1.3.2. How can I get more than one
TargetDataLine?
- 1.3.3. Why is it not possible to open more than one
TargetDataLine at the same
time?
- 1.3.4. Why do I get a LineUnavailableException: "Requested
format incompatible with already established device
format"?
- 1.3.5. How can I control the volume when recording with a
TargetDataLine?
- 1.3.6. How should I use
stop() and
drain() on a
TargetDataLine?
- 1.3.7. Why is
TargetDataLine.read() blocking for a
long time?
- 1.3.8. Why is the end of
recordings cut off prematurely?
- 1.4. Clip
- 1.4.1. Why do I get an out of memory
exception when trying to use a Clip
with a 5 MB audio file?
- 1.4.2. Why do I get "LineUnavailableException: No Free
Voices" when opening a Clip?
- 1.4.3. How can I rewind a Clip?
- 1.4.4. Why does the
frame/microsecond position not jump back to zero when a
Clip is looped?
- 1.4.5. Why are there failures,
clicks and other random effects if a clip is played
multiple times with 1.5?
| |
1.1. General |
- 1.1.1. How can I be notified when data is available for
write/read in a SourceDataLine or
TargetDataLine?
- 1.1.2. Why does it fail to open any line with 16 kHz sample
rate?
- 1.1.3. How can I get a
SourceDataLine or
TargetDataLine in μ-law
format?
- 1.1.4. Why does simultaneous recording and playback only
work when first opening the playback line
(SourceDataLine)?
- 1.1.5. Why doesn't simultaneous recording and playback work
at all with the Sun JDK 1.3/1.4 on GNU/Linux?
- 1.1.6. How can I get a
Line from a specific
Mixer?
- 1.1.7. Why are there no mono lines
with the "Direct Audio Devices" mixers on Linux?
- 1.1.8. Why is a SourceDataLine
called "source" and a
TargetDataLine called "target"
though it's actually the other way round?
- 1.1.9. Why are
DataLine.getFramePosition() and
DataLine.getMicrosecondPosition() so
inaccurate?
- 1.1.10. Why does DataLine.getLevel()
always return -1?
- 1.1.11. What is the difference
between DataLine.isActive() and
DataLine.isRunning()?
- 1.1.12. How can I detect a buffer
underrun or overrun?
- 1.1.13. Why is there no event for
notifying applications of an underrun/overrun
condition?
- 1.1.14. How can I find out the current
playback or recording position?
- 1.1.15. How can I do looping in
playback?
| |
1.1.1. | How can I be notified when data is available for
write/read in a SourceDataLine or
TargetDataLine? |
| You have to use
SourceDataLine/TargetDataLine.available(). The
usual implementation for streaming audio (in Java Sound)
is a dedicated thread for that - look at the Java Sound
Demo which you can download from Sun or at the Java Sound Resources: Examples. (Florian) |
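A minimal sketch of such a playback thread, assuming a SourceDataLine line and an AudioInputStream audioInputStream that have been opened and started elsewhere:
byte[] buffer = new byte[line.getBufferSize() / 4];
int numRead;
while ((numRead = audioInputStream.read(buffer, 0, buffer.length)) != -1)
{
    // write() blocks until the line has accepted the data; to avoid
    // blocking, check line.available() first and write only that many bytes
    line.write(buffer, 0, numRead);
}
line.drain(); |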
1.1.2. | Why does it fail to open any line with 16 kHz sample
rate? |
| Apparently, most Java Sound implementations do not
provide that, even if the soundcard supports it. Future
implementations will support that. (Florian) |
1.1.3. | How can I get a
SourceDataLine or
TargetDataLine in μ-law
format? |
| TargetDataLines are supposed
to act as a "direct" way to communicate with the
audio hardware device, i.e. your soundcard. When your
soundcard does not support μ-law directly, the
TargetDataLine won't either. The way to go is to open a
TargetDataLine in PCM format and
route it through a format converter. See the documentation of
AudioSystem on how to obtain converted
streams. The converted stream then provides μ-law
samples. There is no drawback to this approach: all PC
soundcards that I know of deliver only PCM, so the data has to
be converted to μ-law in software anyway, whether in the
soundcard's driver, the operating system layer or in the
application (your Java program). You get
maximum portability when only using PCM for
TargetDataLines. (Florian) |
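A minimal sketch of this approach (the 8 kHz mono format is just an example; whether a particular conversion is available can be checked with AudioSystem.isConversionSupported()):
AudioFormat pcmFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
        8000.0F, 16, 1, 2, 8000.0F, false);
DataLine.Info info = new DataLine.Info(TargetDataLine.class, pcmFormat);
TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
line.open(pcmFormat);
line.start();
// wrap the line in a stream and let Java Sound convert it to mu-law
AudioInputStream pcmStream = new AudioInputStream(line);
AudioFormat ulawFormat = new AudioFormat(AudioFormat.Encoding.ULAW,
        8000.0F, 8, 1, 1, 8000.0F, false);
AudioInputStream ulawStream = AudioSystem.getAudioInputStream(ulawFormat, pcmStream);
// read mu-law samples from ulawStream |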
1.1.4. | Why does simultaneous recording and playback only
work when first opening the playback line
(SourceDataLine)? |
| This depends on the soundcard and its driver for the
native operating system. E.g. the Soundblaster 16 or 64 do not
provide real full duplex, only a kind of pseudo full
duplex. Under Windows, I have experienced that you can only use
this pseudo full duplex when you open the record and playback
lines in a certain order. (Florian) |
1.1.5. | Why doesn't simultaneous recording and playback work
at all with the Sun JDK 1.3/1.4 on GNU/Linux? |
| Due to problems with some OSS drivers, full-duplex
is disabled by default in versions up to 1.4.1. There are
several ways to get full-duplex:
- Use the ALSA support in JDK 1.4.2 or later. Note that in 1.4.2, the ALSA
support is not used by default for playback. If you call
AudioSystem.getLine(), the default
is used ("Java Sound Audio Engine"). To use the "Direct
Audio Device" (which uses ALSA), obtain the respective
mixer with AudioSystem.getMixer()
and call getLine() on the mixer. To
detect the "Direct Audio Device", look for the string
"ALSA" in the vendor or description string of the
Mixer.Info object. Although string comparison is not a
nice approach, it is highly likely that "ALSA" will appear in
at least one of these strings in future releases. For
recording, the "Direct Audio Device" is the default. A
way to make it the default for playback, too, is to
rename /dev/audio and
/dev/dsp. However, this will
disable sound support for all non-ALSA programs. In
version 1.5, the "Direct Audio Device" mixers are the default
for playback, too, if the soundcard supports mixing in
hardware.
- Use Tritonus. The
Tritonus plug-ins also work with Java versions older
than 1.4.2. However, it is recommended to use 1.4.2
if possible: the ALSA support in 1.4.2 is more stable
than the one in Tritonus.
See also Q: 3.3 (Matthias) |
1.1.6. | How can I get a
Line from a specific
Mixer? |
| Obtain the list of available
Mixer implementations with AudioSystem.getMixerInfo(). Select
one of the available Mixer.Info objects and call AudioSystem.getMixer(Mixer.Info)
to obtain the Mixer. With this
object you can call Mixer.getLine(Line.Info)
instead of AudioSystem.getLine(Line.Info). In
the JDK 1.5.0, you can also use the ease-of-use methods in
AudioSystem that take a Mixer.Info argument, such as
AudioSystem.getSourceDataLine(AudioFormat, Mixer.Info),
AudioSystem.getTargetDataLine(AudioFormat, Mixer.Info) and
AudioSystem.getClip(Mixer.Info). With the JDK 1.5.0, there is an additional
possibility: the default provider properties can be used
to select the default Mixer for
each type of line (SourceDataLine,
TargetDataLine,
Clip,
Port). The default
Mixer, if available, is used in
AudioSystem.getLine(). For details,
see the specification.
(Matthias) |
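A minimal sketch, assuming you select the desired mixer by matching a string (here "ALSA") against its Mixer.Info and that audioFormat has been defined elsewhere:
Mixer.Info selectedMixerInfo = null;
Mixer.Info[] mixerInfos = AudioSystem.getMixerInfo();
for (int i = 0; i < mixerInfos.length; i++)
{
    // match against vendor or description of the mixer
    if (mixerInfos[i].getVendor().indexOf("ALSA") != -1
        || mixerInfos[i].getDescription().indexOf("ALSA") != -1)
    {
        selectedMixerInfo = mixerInfos[i];
        break;
    }
}
Mixer mixer = AudioSystem.getMixer(selectedMixerInfo);
DataLine.Info info = new DataLine.Info(SourceDataLine.class, audioFormat);
SourceDataLine line = (SourceDataLine) mixer.getLine(info); |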
1.1.7. | Why are there no mono lines
with the "Direct Audio Devices" mixers on Linux? |
| The implementation of the "Direct Audio Device"
queries the soundcard driver for the supported
formats. Some ALSA drivers do not support mono lines, so
they are not available in the "Direct Audio Device". The
workaround is to open a stereo line and expand the mono
data to stereo. See also How can I convert between mono
and stereo? and How can I make a mono stream
appear on one channel of a stereo stream?
(Matthias) |
1.1.8. | Why is a SourceDataLine
called "source" and a
TargetDataLine called "target"
though it's actually the other way round? |
| Well, nobody really knows why this fancy naming was
chosen. From the perspective of an application, it's
counter-intuitive. To understand it, take the perspective
of a Mixer object: It receives data
from the application via a
SourceDataLine object; this is its
source of data. And it delivers data to the application
via a TargetDataLine. So from the
perspective of the Mixer, this is
the target of its data. (Matthias) |
1.1.9. | Why are
DataLine.getFramePosition() and
DataLine.getMicrosecondPosition() so
inaccurate? |
| The implementation of these methods in the "Java
Sound Audio Engine" is bad and will not be fixed. The
"Direct Audio Device" has a much better
implementation. See also What are all these mixers? But keep in mind that it is not possible to get a
frame precise playback position with these methods. There
is too much buffering in the data path (also in the audio
hardware), so calculating the position is always only an
estimation. If you try to measure the precision of
DataLine.getMicrosecondPosition()
with a real-time clock, you are also likely to see the
effect of a clock drift. For details on this phenomenon
see Why does recording or playing for a
certain period of time result in audio data that is shorter
or longer than the period I recorded / played?
(Matthias) |
1.1.10. | Why does DataLine.getLevel()
always return -1? |
| DataLine.getLevel() is not
implemented in current versions of the Sun JDK (1.4.1),
nor in any other known Java Sound implementation. Here is
a suggestion from Florian Bomers on how to implement this
functionality yourself:
- Read the data from the TargetDataLine in blocks.
- Convert each block to a common format, e.g. normalized floats [-1, +1] or 8 bit signed bytes. If your project can make use of LGPL'd code, have a look at the class FloatSampleBuffer (for floats) or TConversionTool (for integer-based values) of the Tritonus project.
- Calculate the level of each block. This could be the average, the RMS power, the peak amplitude, or similar. Be sure to use absolute values (or to square the amplitudes for the power).
See also How can I calculate the power of a signal?
(Matthias) |
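A minimal sketch of this suggestion, assuming 16 bit signed, little-endian, mono data read from a TargetDataLine named targetDataLine:
byte[] buffer = new byte[2048];
int numBytes = targetDataLine.read(buffer, 0, buffer.length);
double sumOfSquares = 0.0;
for (int i = 0; i < numBytes; i += 2)
{
    // assemble one 16 bit sample (little-endian) and normalize to [-1, +1]
    int sample = (buffer[i + 1] << 8) | (buffer[i] & 0xFF);
    double normalized = sample / 32768.0;
    sumOfSquares += normalized * normalized;
}
double rmsLevel = (numBytes > 0)
    ? Math.sqrt(sumOfSquares / (numBytes / 2)) : 0.0; |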
1.1.11. | What is the difference
between DataLine.isActive() and
DataLine.isRunning()? |
| This is an issue where even the Java Sound gurus do
not know a satisfying answer. A useful definition would be
the following: isActive() returns true if
the line is in started state, i.e. between calls to
start() and
stop(). isRunning() returns true if
data is actually read from or written to the
device. This would mean that
isRunning() returns false in case
of buffer underruns or overruns.
However, this is not the way it is implemented. For
the "Direct Audio Device" mixers,
isActive() and
isRunning() always return the same
value. In general, it is recommended to use
isActive(), since it is specified
less ambiguously and is implemented consistently. See
also bug
#4791152. (Matthias) |
1.1.12. | How can I detect a buffer
underrun or overrun? |
| The following works reliably, at least with the
"Direct Audio Device" mixers:
- SourceDataLine: underrun if (line.available() == line.getBufferSize()). SourceDataLine.available() reports how much data can be written to the buffer. If the whole buffer can be written to, there is no data left in the buffer to be rendered.
- TargetDataLine: overrun if (line.available() == line.getBufferSize()). TargetDataLine.available() reports how much data can be read from the buffer. If the whole buffer can be read, there is no space left in the buffer for new data captured from the line.
(Matthias) |
1.1.13. | Why is there no event for
notifying applications of an underrun/overrun
condition? |
| This is Florian's (and my) opinion: Java Sound is a low level audio API. We decided to
give highest priority to performance and "bare"
functionality, rather than adding many high-level
features. And although this is not a reason to not add
it, all low level audio APIs that I have worked closely
with do not provide underrun notification.
(Matthias) |
1.1.14. | How can I find out the current
playback or recording position? |
| There are two possibilities:
- Use DataLine.getFramePosition()
or DataLine.getMicrosecondPosition(). These
methods are supposed to return the current "hearing"
position. However, they weren't implemented well prior
to the JDK 1.5.0.
- Count the frames that you read from or write to
the DataLine and add one full
buffer size and 15 milliseconds (a ballpark figure for
the hardware delay) to it. As reference point, use the time
when the write()/read() method returns. This allows a
reasonably accurate extrapolation. This method works best
if you call
read()/write()
with buffers that fit exactly into the line's buffer
size. This approach also works reasonably well with
1.4.2 and before. It is implemented in the JAM program
shown at J1 2003.
(Matthias) |
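A rough sketch of the second approach for playback, assuming framesWritten and lastWriteTime are long variables maintained across calls, format is the line's AudioFormat, and the 15 ms hardware delay is only a ballpark assumption (for recording, the buffered amount and the delay would be added instead of subtracted):
// in the playback loop:
int numBytes = line.write(buffer, 0, numRead);
framesWritten += numBytes / format.getFrameSize();
lastWriteTime = System.currentTimeMillis();
// whenever the position is needed:
long bufferFrames = line.getBufferSize() / format.getFrameSize();
long msSinceWrite = System.currentTimeMillis() - lastWriteTime;
double positionMs = (framesWritten - bufferFrames) * 1000.0 / format.getSampleRate()
        - 15.0          // ballpark hardware delay
        + msSinceWrite; // extrapolation since write() returned |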
1.1.15. | How can I do looping in
playback? |
| There are two possibilities: use a Clip, which can repeat its audio data via the loop() method, or stream the data to a SourceDataLine yourself and simply write the same data again when you reach the end (see the sketch below). (Matthias) |
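A minimal sketch of the Clip approach, assuming audioInputStream has been obtained elsewhere:
DataLine.Info info = new DataLine.Info(Clip.class, audioInputStream.getFormat());
Clip clip = (Clip) AudioSystem.getLine(info);
clip.open(audioInputStream);
clip.loop(Clip.LOOP_CONTINUOUSLY);   // or clip.loop(n) for a fixed number of repetitions |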
1.2. SourceDataLine |
- 1.2.1. How can I avoid that the last bit
of sound played on a SourceDataLine
is repeated?
- 1.2.2. Why is playback distorted, too
fast or too slow with the JDK 1.5.0 beta, but not with
earlier versions of the JDK?
| |
1.2.1. | How can I avoid that the last bit
of sound played on a SourceDataLine
is repeated? |
| This can be avoided easily: after writing all data
to the SourceDataLine call
drain() and
stop(). If you want to reuse the line
after this, call start() again before
writing more data to the line. (Matthias) |
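A minimal sketch, assuming line is an open, started SourceDataLine:
line.write(buffer, 0, numBytes);   // last block of audio data
line.drain();                      // wait until everything written has been played
line.stop();
// to reuse the line later, call line.start() again before writing more data |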
1.2.2. | Why is playback distorted, too
fast or too slow with the JDK 1.5.0 beta, but not with
earlier versions of the JDK? |
| The reason is a common misconception about how
Line.open()
works. According to the specification,
open() without parameters opens a
line in a "default format". The default format of a line
is an implementation specific property. It is
not the
AudioFormat used in the
DataLine.Info object. Rather, the
format in DataLine.Info is used to
request a DataLine instance that is
capable of handling this format. This
does not necessarily mean that the line has to be opened
in that format. Note that it is possible to construct
DataLine.Info with an array of
AudioFormat objects. This means
that the requested line has to be able to handle any of
the given formats. The Java Sound implementation prior to JDK 1.5.0 had
the following property: If only one
AudioFormat is given in a
DataLine.Info, this
AudioFormat becomes the default
format of the line. This caused the behaviour that it was
possible to specify the format for
open() via the
DataLine.Info object. However, this
behaviour was never specified, it is just an
implementation specific property you can't rely on in
general. The "Direct Audio Device" mixers in JDK 1.5.0
beta (see also What are all these mixers?) behave differently: they just pick one
of the supported hardware formats as default format. This
is a correct behaviour according to the specification,
since the specification doesn't specify how the default
format is chosen. Therefore, it is recommended to always specify the
format when opening a DataLine: use
open(AudioFormat format) or
open(AudioFormat format, int
buffersize) rather than
Line.open() without parameters. See
also Line.open(),
SourceDataLine.open(AudioFormat),
SourceDataLine.open(AudioFormat,
int), TargetDataLine.open(AudioFormat)
and TargetDataLine.open(AudioFormat,
int). It was decided to change the behaviour for the final
version of the JDK 1.5.0 to provide backward compatibility
with the JDK 1.4: the formerly unportable technique will become
specified behaviour. See also bugs #5053380
and #5067526
(Matthias) |
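A minimal sketch of the recommended way (44.1 kHz, 16 bit stereo is just an example format):
AudioFormat format = new AudioFormat(44100.0F, 16, 2, true, false);
DataLine.Info info = new DataLine.Info(SourceDataLine.class, format);
SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
line.open(format);   // or: line.open(format, bufferSizeInBytes);
line.start(); |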
1.3. TargetDataLine |
- 1.3.1. How can I capture from a
specific source (microphone or line-in)?
- 1.3.2. How can I get more than one
TargetDataLine?
- 1.3.3. Why is it not possible to open more than one
TargetDataLine at the same
time?
- 1.3.4. Why do I get a LineUnavailableException: "Requested
format incompatible with already established device
format"?
- 1.3.5. How can I control the volume when recording with a
TargetDataLine?
- 1.3.6. How should I use
stop() and
drain() on a
TargetDataLine?
- 1.3.7. Why is
TargetDataLine.read() blocking for a
long time?
- 1.3.8. Why is the end of
recordings cut off prematurely?
| |
1.3.1. | How can I capture from a
specific source (microphone or line-in)? |
| You can use the system mixer of your operating
system to select the recording source in the same way you
would do it for a native program. With newer versions of
the Sun JDK, you can achieve the same by using the interface
javax.sound.sampled.Port. See
the section Ports for details. (Matthias) |
1.3.2. | How can I get more than one
TargetDataLine? |
| Current implementations of the Java Sound API do not
support multiple TargetDataLines
for the same recording source. There are no plans to
change this behaviour. If, in the future, multi-channel
soundcards are supported, it may be possible to get
different TargetDataLine instances
for the different inputs. If you just want to "split"
lines, do it in your application. See also Can I use multi-channel
sound?
(Matthias) |
1.3.3. | Why is it not possible to open more than one
TargetDataLine at the same
time? |
| Well, because it's a bug. The above is true for the
Sun JDK up to version 1.4.2 on Solaris and Windows, and up
to 1.4.1 on Linux. Beginning with version 1.5.0 for
Solaris and Windows and version 1.4.2 for Linux there are
the new "Direct Audio Device" mixer that don't have this
limitation. Tritonus is unaffected by this
limitation. (Matthias) |
1.3.4. | Why do I get a LineUnavailableException: "Requested
format incompatible with already established device
format"? |
| This is a bug that was fixed for 1.4.2. If you have
to use an older version, there are two possible
workarounds:
- Do not play back anything using the "Java Sound Audio Engine" before recording. In versions prior to 1.4.2, there is no way of doing playback at all without using the "Java Sound Audio Engine". If the "Java Sound Audio Engine" is used, it results in opening the sound device for 44100 Hz, 16 bit stereo, thereby setting the "previously established format".
- Always capture at 16 bit, stereo, 44100 Hz. If you need your sound data in a different format, you can convert it afterwards. See also Conversion between sample representations and How can I do sample rate conversion?
(Matthias) |
1.3.5. | How can I control the volume when recording with a
TargetDataLine? |
| The obvious solution would be to get a
Control object of type
VOLUME or
MASTER_GAIN for the
TargetDataLine and manipulate the
volume via this object. However, this is not possible,
since no known Java Sound implementation supports any
controls for TargetDataLine
instances. What you can do is use the system mixer to
control the recording volume; this affects hardware
settings on the soundcard. One possibility is to use the
mixer application of the operating system. The other
possibility is to use Port lines
from inside a Java Sound application. See the section
Ports for
details. The remaining possibility is to implement a volume
control digitally: multiply each sample of the
sound data by a factor that lowers or raises the
level proportionally (see the sketch below). See also Change the amplitude (volume) of an audio file (Matthias) |
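A minimal sketch of the digital approach, assuming 16 bit signed, little-endian samples in buffer with numBytes valid bytes (factors greater than 1.0 would additionally require clipping to the 16 bit range):
float factor = 0.5F;   // 0.0 = silence, 1.0 = unchanged
for (int i = 0; i < numBytes; i += 2)
{
    // assemble the sample, scale it, and store it back (little-endian)
    int sample = (buffer[i + 1] << 8) | (buffer[i] & 0xFF);
    sample = (int) (sample * factor);
    buffer[i] = (byte) (sample & 0xFF);
    buffer[i + 1] = (byte) ((sample >> 8) & 0xFF);
} |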
1.3.6. | How should I use
stop() and
drain() on a
TargetDataLine? |
| It is specified that
TargetDataLine.drain() has to wait
until all data has been delivered to the
TargetDataLine. If the line is not
yet stopped, there is always data being delivered to the
line. So you should call drain() only after stop(). In
fact, drain() isn't needed with
TargetDataLine at all. A common technique to terminate reading from a
TargetDataLine is the
following (buffer is a byte array and read() returns the number of bytes read):
TDL.stop();
do
{
    count = TDL.read(buffer, 0, buffer.length);
}
while (count > 0);
TDL.close(); For an implementation of
TargetDataLine.drain() to be 100%
compliant you need to block while the line is started and
there is still data available. One way to do this is the
following:
public void drain()
{
    while (isActive() && (available() > 0))
    {
        try
        {
            Thread.sleep(100);
        }
        catch (InterruptedException e)
        {
            // ignore and check again
        }
    }
} (Matthias) |
1.3.7. | Why is
TargetDataLine.read() blocking for a
long time? |
| By specification,
TargetDataLine.read() is a blocking
call: it waits until the requested amount of data is
available. To use read() in a
non-blocking manner, you can check how much data is
available with available() and
request only that amount. If you want to use
read() in a standard blocking manner,
but need quick response for a real-time application, use
smaller buffers for reading. See also What is the minimum buffer size
I can use?
(Matthias) |
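A minimal sketch of the non-blocking variant, assuming buffer is a byte array and format is the line's AudioFormat:
int availableBytes = targetDataLine.available();
// round down to a whole number of frames, as required by read()
int bytesToRead = Math.min(availableBytes, buffer.length);
bytesToRead -= bytesToRead % format.getFrameSize();
if (bytesToRead > 0)
{
    int numBytesRead = targetDataLine.read(buffer, 0, bytesToRead);
    // process numBytesRead bytes of buffer
} |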
1.3.8. | Why is the end of
recordings cut off prematurely? |
| Even after calling stop() on
a TargetDataLine, there may be
data remaining in its internal buffer. Make sure you read
data until there is no more available. Then you can call
close() on the line. See also How should I use
stop() and
drain() on a
TargetDataLine?
(Matthias) |
1.4. Clip |
- 1.4.1. Why do I get an out of memory
exception when trying to use a Clip
with a 5 MB audio file?
- 1.4.2. Why do I get "LineUnavailableException: No Free
Voices" when opening a Clip?
- 1.4.3. How can I rewind a Clip?
- 1.4.4. Why does the
frame/microsecond position not jump back to zero when a
Clip is looped?
- 1.4.5. Why are there failures,
clicks and other random effects if a clip is played
multiple times with 1.5?
| |
1.4.1. | Why do I get an out of memory
exception when trying to use a Clip
with a 5 MB audio file? |
| For files of this size, you should stream the
audio. That way, you handle buffers of small size and feed
them successively into the audio device. Look at the
Java Sound Resources: Examples, there are some streaming
audio players to take as a start. (Florian) |
1.4.2. | Why do I get "LineUnavailableException: No Free
Voices" when opening a Clip? |
| This happens with the "Java Sound Audio Engine" when
too many clips are open. While you can obtain any number
of Clip instances, only 32 can be
open at the same time. This is a hard limitation of the
engine; it can only mix 32 channels. As a workaround, you
can close unused clips and open them once they are needed
again. If you really need more than 32 channels, you can
do the mixing in your application and output the result to
a SourceDataLine. (Matthias) |
1.4.3. | How can I rewind a Clip? |
| Stop the clip by calling
stop(), then use
clip.setFramePosition(0) or
clip.setMicrosecondPosition(0). Alternatively,
you can set looping points so that rewinding occurs
automatically: clip.setLoopPoints(0,
-1) (In this case you have to call
clip.loop(...) instead of
clip.start().) (Matthias) |
1.4.4. | Why does the
frame/microsecond position not jump back to zero when a
Clip is looped? |
| getFramePosition() and
getMicrosecondPosition() are
specified to return the position corresponding to the time
since the line (or clip) was opened. If you want to get
the position inside the loop of a looping clip, you can
use something similar to this (assuming you are looping
over the whole length of the clip):
currentFrame = clip.getFramePosition() %
clip.getFrameLength(); (Matthias) |
1.4.5. | Why are there failures,
clicks and other random effects if a clip is played
multiple times with 1.5? |
| This is a bug, and apparently one not easy to
fix. See bug #6251460.
Note that you can work around this issue by using the old
"Java Sound Audio Engine" instead of the "Direct Audio
Device" mixers. This way, you get the same behaviour as in
1.4. See also What are all these mixers?
(Matthias) |
2. Controls |
- 2.1. Why do the SourceDataLine
instances I get when using the "Direct Audio Device" (ALSA on
Linux) have no controls?
- 2.2. What is the difference between a
BALANCE and a PAN
control? Which one should I use?
- 2.3. Why do mono lines from a "Direct Audio Device" have no
PAN control?
- 2.4. Why does obtaining a gain
control work with 1.4.2, but not with 1.5.0?
- 2.5. Why do
Clip and
SourceDataLine instances have no
VOLUME control?
- 2.6. Why is there no sample rate control in 1.5.0?
| |
2.1. | Why do the SourceDataLine
instances I get when using the "Direct Audio Device" (ALSA on
Linux) have no controls? |
| Lines from these mixers do not provide controls in
1.4.2. In Florian's original opinion, "any control would
obscure the initial idea, to provide high-performance direct
audio access". However, he changed his mind and implemented
volume and balance controls in 1.5.0. (Matthias) |
2.2. | What is the difference between a
BALANCE and a PAN
control? Which one should I use? |
| In music, pan knobs are used for mono input lines to
control how they are mapped to stereo output lines. On the
other hand, for stereo input lines, the knob is labelled
"balance". So you should get a PAN
control for mono lines and a BALANCE
control for stereo lines (and none for lines with more than
2 channels). In the Sun J2SDK, PAN controls
behave like BALANCE controls for stereo
lines and BALANCE like
PAN for mono lines. However, this is
only a convenience for compatibility. To write portable
programs, you should not rely on this behaviour.
(Matthias) |
2.3. | Why do mono lines from a "Direct Audio Device" have no
PAN control? |
| To implement a PAN control for a
mono line, it has to be "distributed" between the left and
right channel of a stereo line. This was no problem with the
"Java Sound Audio Engine". The "Java Sound Audio Engine"
always opens the soundcard in stereo, so it is always
possible to do this "distribution". The "Direct Audio
Device" implementation, however, opens the soundcard in mono
if a mono line is requested. So it's not possible to
implement a PAN control for such
lines. The workaround is to work with stereo: convert your
stream to stereo and open the
SourceDataLine in that stereo format.
Then this line will have a BALANCE control, which works like
a PAN control. See also How can I convert between mono
and stereo? and What is the difference between a
BALANCE and a PAN
control? Which one should I use?
(Matthias) |
2.4. | Why does obtaining a gain
control work with 1.4.2, but not with 1.5.0? |
| Gain
(FloatControl.Type.MASTER_GAIN /
FloatControl.Type.VOLUME) controls are
still available with the "Direct Audio Device" mixers in
1.5.0 (see also What are all these mixers?). However, the behaviour has been
changed so that controls are only available after the line
has been opened. This was necessary because in general, some
controls are only available if the device driver supports
certain features, which can be queried only after the
respective device has been opened. (Matthias) |
2.5. | Why do
Clip and
SourceDataLine instances have no
VOLUME control? |
| Clip and
SourceDataLine instances provide a
FloatControl.Type.MASTER_GAIN control
rather than a FloatControl.Type.VOLUME
control to control the playback volume. See also Why does obtaining a gain
control work with 1.4.2, but not with 1.5.0? (Matthias) |
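A minimal sketch, assuming line is an already opened Clip or SourceDataLine (note that the gain value is in decibels, not a linear factor):
if (line.isControlSupported(FloatControl.Type.MASTER_GAIN))
{
    FloatControl gainControl =
        (FloatControl) line.getControl(FloatControl.Type.MASTER_GAIN);
    gainControl.setValue(-6.0F);   // attenuate by 6 dB
} |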
2.6. | Why is there no sample rate control in 1.5.0? |
| The "Direct Audio Device" mixers in 1.5 (see What are all these mixers?) do not provide a
sample rate control. To cite Florian: “This is mostly because we wanted to
give direct access to the sound hardware, without the
problems of high-level features — namely latency and
processor usage. We may add sample rate in future if we find
a good way to add it without affecting
performance.”
As an alternative, you can resample your data with a
sample rate converter to achieve the same effect. See also
How can I do sample rate
conversion? Or, you can still use the sample rate control of the
"Java Sound Audio Engine" with 1.5 by requesting lines
directly from it. See How can I get a
Line from a specific
Mixer? (Matthias) |
3. DataLine buffers |
- 3.1. What is the minimum buffer size
I can use?
- 3.2. Why does a line have the default buffer size though a
buffer size was specified in a
DataLine.Info object when obtaining
the line?
- 3.3. Why is it not possible to use large buffers for a
DataLine with 1.5.0?
| |
3.1. | What is the minimum buffer size
I can use? |
| Obviously, this depends on the operating system, the
hardware, the Java VM, which Mixer implementation you use
and several other factors. The following measurements have
been found experimentally on a very old PC (350 MHz) under
Linux with the Sun JDK 1.4.2_02. These measurements suggest that the latency introduced
by buffers in the "Java Sound Audio Engine" is about 50 ms,
independent of the sample rate. (Matthias) |
3.2. | Why does a line have the default buffer size though a
buffer size was specified in a
DataLine.Info object when obtaining
the line? |
| This happens with the "Direct Audio Device" of the JDK
1.5.0 if the line is opened with
open(AudioFormat) instead of
open(AudioFormat, int). The reason for
this behaviour is that by requiring a certain buffersize or
range of buffersizes in
DataLine.Info, you obtain a line that
is capable of setting its buffersize to
the respective value. You still have to choose the actual
value. This is done when opening the line: with
open(AudioFormat, int), a certain
buffer size for the line can be specified. If
open(AudioFormat) is used, the line is
opened with the default buffer size. Until 1.4.2, a
buffersize in DataLine.Info was used
in opening if the open() call did not
specify a buffer size. However, it was decided that
automatically taking over this value is a questionable
convenience. (Matthias) |
3.3. | Why is it not possible to use large buffers for a
DataLine with 1.5.0? |
| The DataLine implementation of
the "Java Sound Audio Engine" has a circular buffer per line
instance. For SourceDataLine
instances, write() writes data to this
buffer. A separate thread reads from the circular buffer and
transfers the data to the native layer of the engine. This
allows for arbitrary sized buffers, but results in the
overhead of an additional buffer and one thread per
DataLine. The DataLine implementation of
the "Direct Audio Device" of 1.5.0 does not have a circular
buffer. Instead, it writes/reads data directly to/from the
soundcard driver. This gives higher performance and lower
latency. On the other hand, it restricts buffer sizes to
what the soundcard driver supports. Adding a layer of buffering to the "Direct Audio
Device" mixers would result in the same performance penalty
as the DataLine implementation of the
"Java Sound Audio Engine". It would introduce a general
overhead though the additional functionality is only needed
in special cases. Therefore, it is unlikely that the
implementation of the "Direct Audio Device" mixers will be
changed to allow larger buffers. If you need larger buffers, you can implement an
additional layer with a circular buffer in your
application. Then you can choose any size you want for this
buffer. And note that you need an additional thread —
like the "Java Sound Audio Engine". The Answering Machine has
classes that do a similar job. There is also the class
org.tritonus.share.TCircularBuffer in
Tritonus that
you can use for this purpose. (Matthias) |
4. Mixers |
- 4.1. What are all these mixers?
- 4.2. Why are there mixers from
which I can't get a
SourceDataLine?
- 4.3. How can I redirect sound output to
a phone / modem device?
- 4.4. Can I use multiple soundcards at the same time?
- 4.5. Why can I record from
different soundcards, but not play back to them?
- 4.6. How can I obtain the formats
supported by a mixer (or at all)?
- 4.7. What formats are supported by "Direct Audio Device"
mixers?
- 4.8. Why are there
AudioFormat objects with frame
rate/sample rate reported as -1 when I
query a Mixer for its supported
formats?
- 4.9. How can I detect which Port
Mixer belongs to which soundcard?
- 4.10. How can I find out which
Mixer implementation is used?
- 4.11. Why do I get lines from the
"Java Sound Audio Engine" in the JDK 1.5.0 though the
"Direct Audio Device" mixers are available, too?
| |
4.1. | What are all these mixers? |
| There are several implementations of
Mixer in Java Sound: - "Java Sound Audio Engine", beatnik engine
This is a software mixing engine. It provides
SourceDataLine and
Clip instances. It does not
provide TargetDataLine
instances. Output of this mixer goes to the audio
device. In versions up to 1.4.2, this mixer is the
default for playback. In 1.5, it is only used if there
is no other way to mix audio streams (because neither
the soundcard hardware nor the device driver support
mixing). - Simple Input Devices, "Microsoft Sound Mapper" (Windows), "Linux,dev/dsp,multi threaded" (Linux), "Linux,dev/audio,multi threaded" (Linux, Solaris)
In versions 1.4.2 and earlier, this mixer is
used for recording. It provides
TargetDataLine instances, but
nothing else. In 1.5, it is no longer available,
because the direct audio devices can be used for
recording on all platforms. - Direct Audio Devices, "Primary Sound Driver" (Windows), "Primary Sound Capture Driver" (Windows), "Soundcard [plughw:0,0]" (Linux)
These are mixers that can be used for playback
as well as for recording. They provide
SourceDataLine,
TargetDataLine and
Clip instances. In 1.4.2, they
became available on Linux; in 1.5, Solaris and Windows
followed. These mixers allow simultaneous playback and
recording (full-duplex) if the soundcard supports
it. These mixers do not do software mixing. So mixing
of multiple playback lines is only available if either
the soundcard hardware or the device driver are
capable of mixing. In other words: You may get only
one SourceDataLine, and you
will always get only one
TargetDataLine - Port Mixers, "Port Soundcard" (Windows), "Port Soundcard [hw:0,0]" (Linux)
These mixers provide Port
instances, but no other type of Line. So you can't
play back or record with these mixers. They became
available with 1.4.2 for Windows, and will be
available for Solaris and Linux, too, in 1.5. See also
Ports
Note that what Java Sound calls "Mixer" is different
from what Windows calls "Mixer": See also How can I find out which
Mixer implementation is used? (Matthias) |
4.2. | Why are there mixers from
which I can't get a
SourceDataLine? |
| There are mixers that only provide
TargetDataLine instances. In the Sun
JDK up to 1.4.2, SourceDataLine
instances are only provided by the "Java Sound Audio
Engine", while TargetDataLine
instances are only provided by the "Simple Input Device"
mixers. This is subject to change for JDK 1.5. Starting with version 1.4.2, there are additional
mixers that provide only Port
instances. See also What are all these mixers? (Matthias) |
4.3. | How can I redirect sound output to
a phone / modem device? |
| With the Sun JDK 1.4.2 or earlier on Windows, you can
set the default audio device to the telephone device:
Control panel -> Multimedia (or Sounds...) ->
Preferred Device. With the "Direct Audio Device" mixers of
the JDK 1.5 it is also possible to use the default provider
properties to set the default Mixer /
MixerProvider inside Java
Sound. See also Why are there mixers from
which I can't get a
SourceDataLine?, How can I capture from a
specific source (microphone or line-in)?, How can I get a
Line from a specific
Mixer? and Why can I record from
different soundcards, but not play back to them? (Matthias) |
4.4. | Can I use multiple soundcards at the same time? |
| For the Sun JDK, this is possible with version 1.4.2
and later for Linux and with version 1.5.0 and later for
Solaris and Windows. For Tritonus, this is possible with the
ALSA Mixer
implementation. (Matthias) |
4.5. | Why can I record from
different soundcards, but not play back to them? |
| This is true for Solaris and Windows for Java versions
up to 1.4.2. There, playback is only possible via the "Java
Sound Audio Engine", which always uses the first
soundcard. On the other hand, recording in these versions is
done with the "Simple Input Device", which provider one
Mixer instance per soundcard. With the "Direct Audio Device" mixers, it is possible
to choose different soundcards for output, too. See also
What are all these mixers? (Matthias) |
4.6. | How can I obtain the formats
supported by a mixer (or at all)? |
| First, obtain a list of supported lines either from a
Mixer object or from
AudioSystem. For this, use the
methods getSourceLineInfo() and
getTargetLineInfo(). Then, check for each
of the returned Line.Info objects whether
it is an instance of
DataLine.Info. If it is, cast the
object to DataLine.Info. Now you can
call getFormats() to obtain the
AudioFormat types supported by this
line type. A code example:
Line.Info[] infos = AudioSystem.getSourceLineInfo();
// or:
// Line.Info[] infos = AudioSystem.getTargetLineInfo();
for (int i = 0; i < infos.length; i++)
{
if (infos[i] instanceof DataLine.Info)
{
DataLine.Info dataLineInfo = (DataLine.Info) infos[i];
AudioFormat[] supportedFormats = dataLineInfo.getFormats();
}
} To see what is supported on your system, you can use
the application jsinfo. See also Why are there
AudioFormat objects with frame
rate/sample rate reported as -1 when I
query a Mixer for its supported
formats?
(Matthias) |
4.7. | What formats are supported by "Direct Audio Device"
mixers? |
| It depends on the hardware. The mixers just report
formats that are supported by the device driver. Typically,
there are between 8 and 20 supported formats. To write a
portable application, you should not assume that a certain
format is always supported (though in fact, 44.1 kHz 16 bit
stereo is almost always supported). Rather, you should check
the supported formats at run-time and try to convert your
audio data to one of the available formats. See also How can I obtain the formats
supported by a mixer (or at all)?
(Matthias) |
4.8. | Why are there
AudioFormat objects with frame
rate/sample rate reported as -1 when I
query a Mixer for its supported
formats? |
| The -1
(AudioSystem.NOT_SPECIFIED) means that
any reasonable sample rate is supported. Common soundcards
typically support sample rates between 4 kHz and 48 kHz. See
also How can I obtain the formats
supported by a mixer (or at all)? (Matthias) |
4.9. | How can I detect which Port
Mixer belongs to which soundcard? |
| There is no really satisfying solution. You can try to
match the name in the
Mixer.Info object of a Port Mixer
against the one of a DataLine Mixer. On Linux, this works
reliably by looking at the device id that is part of the
mixer name: "(hw:0)", "(hw:1)", "(plughw:0,1)". The first
(or only) number refers to the number of the
soundcard. Windows does not allow querying which port belongs to
which soundcard (there are ways on Windows, but it was not
possible to use them for Java Sound because they require
actually opening the devices). So the only thing you can do
is to match the name of the soundcard. However, this will
not always work reliably. In particular, if there are two
soundcards of the same model, their names will look the
same. See also What are all these mixers? (Matthias) |
4.10. | How can I find out which
Mixer implementation is used? |
| You can detect the mixer implementation from the class
types of the lines you get; a sketch follows below. See also What are all these mixers?
and How can I find out which
soundcard driver is used? (Matthias) |
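A minimal sketch, assuming audioFormat has been defined elsewhere (the class name printed is implementation specific, so treat it only as a hint):
SourceDataLine line = (SourceDataLine) AudioSystem.getLine(
        new DataLine.Info(SourceDataLine.class, audioFormat));
System.out.println("line class: " + line.getClass().getName()); |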
4.11. | Why do I get lines from the
"Java Sound Audio Engine" in the JDK 1.5.0 though the
"Direct Audio Device" mixers are available, too? |
| In the JDK 1.5.0, the "Direct Audio Device" mixers are
used by default if they support more than one concurrently
active SourceDataLine. This is the
case if either the soundcard hardware supports mixing of
multiple channels (and the driver supports it) or the driver
does software mixing of multiple channels. If this is not the case, the "Java Sound Audio Engine"
is used by default. If you don't mind the limitation that
there will be only one SourceDataLine
or Clip instance, you can still use
the "Direct Audio Device" mixers by addressing them directly
(see How can I get a
Line from a specific
Mixer?). See also Can I make ALSA the default in
version 1.4.2? and How can I enable mixing with the
"Direct Audio Device" mixers on Linux? (Matthias) |
5. Soundcard Drivers |
- 5.1. Which soundcard drivers
can be used by Java Sound?
- 5.2. How can I find out which
soundcard driver is used?
- 5.3. I've installed ALSA and the JDK 1.4.2 to take
advantage of the ALSA support. Now, how do I use it?
- 5.4. Can I make ALSA the default in
version 1.4.2?
- 5.5. How can I enable mixing with the
"Direct Audio Device" mixers on Linux?
- 5.6. What are the requirements for using the direct audio
devices?
- 5.7. How can I find out which
soundcard driver is installed on my Linux system?
- 5.8. How does Java Sound deal with
hardware buffers of the soundcard?
| |
5.1. | Which soundcard drivers
can be used by Java Sound? |
| See also What are all these mixers?
and Q: 3.5
(Matthias) |
5.2. | How can I find out which
soundcard driver is used? |
| First, check which mixer is used (see How can I find out which
Mixer implementation is used?). Then consult the table in
Which soundcard drivers
can be used by Java Sound? to find out the
driver. For Linux, there is no way to tell from Java Sound if
a real OSS driver or ALSA's OSS emulation is used. See also
How can I find out which
soundcard driver is installed on my Linux system? (Matthias) |
5.3. | I've installed ALSA and the JDK 1.4.2 to take
advantage of the ALSA support. Now, how do I use it? |
| In 1.4.2, the "Java Sound Audio Engine" is still the
default. To use the ALSA support, you have to obtain the
Mixer object representing the direct audio access. Then,
obtain lines from this object instead of via
AudioSystem. See also How can I get a
Line from a specific
Mixer?
(Matthias) |
5.4. | Can I make ALSA the default in
version 1.4.2? |
| You can, but only with an ugly trick: rename, remove
or disable the device files
/dev/dsp*. This disables the Java Sound
Audio Engine, so the JDK falls back to use the ALSA
mixers. But be aware that this disables the software
synthesizer ("Java Sound Synthesizer"), too. So you won't be
able to play MIDI files. And of course native applications
using /dev/dsp won't be happy,
either. (Matthias) |
5.5. | How can I enable mixing with the
"Direct Audio Device" mixers on Linux? |
| The "Direct Audio Device" implementation on Linux is
based on ALSA. Mixing is
available in the Mixer instance if
ALSA provides mixing. This is the case if the soundcard can
do mixing in hardware and its ALSA driver supports this
feature. This is true for some common soundcards like
Soundblaster Live! and Soundblaster Audigy and cards based
on the Trident 4D Wave NX chipset. If this feature is
available at all, it needs no special configuration. It is
enabled by default. Using ALSA's dmix
plug-in does not work together with Java Sound. The
reason is that the "Direct Audio Device" mixer
implementation based on ALSA queries the available hardware
devices. However, a dmix device in ALSA is no hardware
device, so it is not recognized. Discussions about this
issue led to the conclusion that there is no easy way to
integrate a query for additional devices. See also Q: 3.4 (Matthias) |
5.6. | What are the requirements for using the direct audio
devices? |
| According to Florian: (Matthias) |
5.7. | How can I find out which
soundcard driver is installed on my Linux system? |
| Run /sbin/lsmod to show the
currently loaded kernel modules. If there are entries "snd"
and "snd-*", you are running ALSA. A typical picture of ALSA
is like this:
snd-mixer-oss 12672 1 (autoclean) [snd-pcm-oss]
snd-seq 38348 0 (autoclean) (unused)
snd-emu10k1 65956 1 (autoclean)
snd-hwdep 5024 0 (autoclean) [snd-emu10k1]
snd-rawmidi 13792 0 (autoclean) [snd-emu10k1]
snd-pcm 64416 0 (autoclean) [snd-pcm-oss snd-emu10k1]
snd-page-alloc 6148 0 (autoclean) [snd-emu10k1 snd-pcm]
snd-timer 15040 0 (autoclean) [snd-seq snd-pcm]
snd-ac97-codec 42200 0 (autoclean) [snd-emu10k1]
snd-seq-device 4116 0 (autoclean) [snd-seq snd-emu10k1 snd-rawmidi]
snd-util-mem 1504 0 (autoclean) [snd-emu10k1]
snd 36832 0 (autoclean) [snd-pcm-oss snd-mixer-oss snd-seq
snd-emu10k1 snd-hwdep snd-rawmidi snd-pcm snd-timer snd-ac97-codec
snd-seq-device snd-util-mem]
soundcore 3556 6 (autoclean) [snd]
An alternative way is to look for the directory
/proc/asound/. It is only present if
ALSA is active. (Matthias) |
5.8. | How does Java Sound deal with
hardware buffers of the soundcard? |
| Internally, Java Sound implementations usually do not
work with hardware buffers. Instead, they use the platform's
audio API for accessing the soundcard. See also DataLine buffers
(Matthias) |
6. Synchronization |
- 6.1. How can I synchronize two or more
playback lines?
- 6.2. How can I synchronize playback
(SourceDataLines) with recording (TargetDataLines)?
- 6.3. How can I synchronize
playback to an external clock?
- 6.4. Do multiple
Clip instances that are looped stay
in sync?
- 6.5. Why does recording or playing for a
certain period of time result in audio data that is shorter
or longer than the period I recorded / played?
- 6.6. How can I use
Mixer.synchronize()?
| |
6.1. | How can I synchronize two or more
playback lines? |
| The synchronization functions in
Mixer are not
implemented. Nevertheless, playback typically stays in
sync. (Matthias) |
6.2. | How can I synchronize playback
(SourceDataLines) with recording (TargetDataLines)? |
| As with multiple playback lines from the same
Mixer object, playback and recording
lines from the same Mixer object stay
in sync once they are started. In practice, this means that
you can achieve synchronization this easy way only by using
the "Direct Audio Device" mixers. Since the "Java Sound
Audio Engine" only provides playback lines, but no recording
lines, playback/recording sync is not as easy with the "Java
Sound Audio Engine". See also How can I synchronize two or more
playback lines? If playback and recording lines originate from
different Mixer objects, you need to
synchronize the soundcards that are represented by the
Mixer objects. So the situation is
similar to external synchronization. See also How can I synchronize
playback to an external clock? (Matthias) |
6.3. | How can I synchronize
playback to an external clock? |
| This is possible in one of two ways: See also Q: 3.12 (Matthias) |
6.4. | Do multiple
Clip instances that are looped stay
in sync? |
| Yes. There is no mechanism in Java Sound to start
Clip instances
synchronously. However, calling
start() for all
Clip instances in a loop with the
Clip instances otherwise prepared
should be precise enough. Once started,
Clip instances played on the same
Mixer instance should stay in
sync. If they don't, make sure they have exactly the same
length. Clip instances played on
different Mixer instances are likely
to drift away from each other, unless the soundcard clocks
are synchronized (which is only possible on "pro"
soundcards). (Matthias) |
6.5. | Why does recording or playing for a
certain period of time result in audio data that is shorter
or longer than the period I recorded / played? |
| The reason for this problem is clock drift. There are
two clocks involved in this scenario: the real
time clock is used to measure the period of time
you are recording or playing, while the soundcard
clock determines how many samples are recorded or
played during this period. Since these are two different
hardware devices, they inherently drift away from each other
over time. There are several ways to deal with this
problem:
- You can try to minimize the drift by
making both clocks high-precision. The real-time clock of
the computer can be synchronized to atomic clocks by using
some means of synchronization. The Network Time Protocol
(NTP) is commonly used for this on the internet. On
Windows, the utility AboutTime
can be used for synchronization. The precision of the
soundcard clock can be enhanced by using a professional
soundcard with a "word clock" input. This input has to be
connected to an external high-precision time base. In this
case, the soundcard clock is synchronized to the external
clock source. Professional studios often spend tens of
thousands of dollars to purchase a high-precision time
base. Note that this solution minimizes the drift, but
cannot remove it completely.
- You can use the soundcard clock as your
time base to measure wall-clock time. This way, you have
removed the second clock, so there is no drift. While this
may sound inconvenient, it may be a good solution if the
audio data has to be synchronized to, for instance, video
playback or the playback of slides, mouse events or
MIDI. If your soundcard's clock is synchronized to an
external time base as described in the previous point,
using it to measure wall-clock time is likely to give much
better results than using the computer's (unsynchronized)
"real time" clock. If both of the above solutions are not
appropriate, you can adapt the length of the audio data by
doing time stretching/shrinking. This usually requires
fairly advanced and computationally expensive DSP
algorithms. In this case, you do not remove the clock
drift, but remove the effect of it on your audio
data.
(Matthias) |
6.6. | How can I use
Mixer.synchronize()? |
| Synchronization isn't implemented in any known Java
Sound implementation. It may be implemented in future
versions. Note that you can check the availability of
synchronization with the method Mixer.isSynchronizationSupported(). See
also Do multiple
Clip instances that are looped stay
in sync?,
How can I synchronize two or more
playback lines?
and Why does recording or playing for a
certain period of time result in audio data that is shorter
or longer than the period I recorded / played?
(Matthias) |
7. Audio Files |
- 7.1. How can I save audio data to a file, like
.wav or
.aiff?
- 7.2. How can I add special chunks to
.wav or .aiff
files (like for a descriptive text or copyright)?
- 7.3. Is it possible to get
information about loop points (e.g. from the 'smpl' chunk in
.wav files) using the
AudioFileFormat properties?
- 7.4. Why does
AudioFileFormat.getFrameLength() always
return -1 for .wav
files?
- 7.5. Why does a .wav file contain
PCM_UNSIGNED data if I try to save 8
bit PCM_SIGNED data?
- 7.6. How can I read in a .vox file and
save it as .wav file?
- 7.7. How can I read from a headerless audio file?
- 7.8. How can I determine the length or
the duration of an audio file?
- 7.9. How can I write an audio file in
smaller parts?
- 7.10. Why are some .wav files not
recognized by Java Sound?
- 7.11. Why is it not possible to write big-endian data using
a WaveAudioOutputStream?
- 7.12. How can I edit or modify audio files?
- 7.13. How can I play audio files where the data is cached in
the RAM?
- 7.14. Why is there a difference between using
AudioSystem.write(..., File) and using
AudioSystem.write(..., OutputStream)
with a FileOutputStream?
- 7.15. Where can I find documentation on the
AudioOutputStream programming?
- 7.16. How can I start playback of a file at a certain
position?
- 7.17. Is it possible to read and
write multichannel audio files?
- 7.18. How can I compare two audio
files?
- 7.19. Is it possible to insert
recorded audio data into an existing file?
- 7.20. How can I store an audio file
in a byte array?
- 7.21. Which value should I use for
the length of the file in
AudioOutputStreams if the length is
not known in advance?
| |
7.1. | How can I save audio data to a file, like
.wav or
.aiff? |
| Have a look at the Java Sound Resources: Examples. (Florian) |
7.2. | How can I add special chunks to
.wav or .aiff
files (like for a descriptive text or copyright)? |
| The Java Sound API does not support this
currently. Future versions are likely to, because this is
indeed quite important. For the moment, you will need to
implement your own class for writing
.wav or .aiff
files. Or make meaningful filenames... (Florian) |
7.3. | Is it possible to get
information about loop points (e.g. from the 'smpl' chunk in
.wav files) using the
AudioFileFormat properties? |
| While with the JDK 1.5's properties there is a way to
represent such information, Sun's
AudioFileReader implementation just
ignores such chunks. However, it is possible to write your own
implementation that handles the chunks and places the
information in AudioFileFormat
properties. See also Q & A 2, “Service Provider Interface (SPI)” (Matthias) |
7.4. | Why does
AudioFileFormat.getFrameLength() always
return -1 for .wav
files? |
| This information is never given in the
AudioFileFormat for
.wav files. It is a more or less
reasonable choice from an implementation point of view. The
reason is the chunk-oriented structure of the
.wav file format. The information about
the audio data length is in the format chunk of the
.wav file. According to the
specification, this chunk may be the last one. In other
words: It may be the case that for getting the format
information, you have to read to the end of a 20 MB
file. That's why the implementors decided to not give this
information. The workaround: fetch an
AudioInputStream with
AudioSystem.getAudioInputStream(File). Then
query the AudioInputStream object for
its length. You can see an example of this technique in
Getting information about an audio file. (Matthias) |
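A minimal sketch of this workaround (the file name is just an example):
File file = new File("sound.wav");
AudioInputStream ais = AudioSystem.getAudioInputStream(file);
long lengthInFrames = ais.getFrameLength();
double durationInSeconds = lengthInFrames / ais.getFormat().getFrameRate();
ais.close(); |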
7.5. | Why does a .wav file contain
PCM_UNSIGNED data if I try to save 8
bit PCM_SIGNED data? |
| By the specification, 8 bit data in
.wav files has to be
unsigned. Therefore, the signedness is converted
automatically by Java Sound's file writer. (Matthias) |
7.6. | How can I read in a .vox file and
save it as .wav file? |
| Probably it's simplest to do it all yourself: use a
RandomAccessFile or similar to open
the vox file, parse the headers, etc. You need to know the
vox file format; you can find many documents specifying it
on the Internet. To create a .wav file
from that, create an AudioFormat
instance describing the format read from the vox header, supply
an InputStream with the audio data of the vox file, and wrap
both in an AudioInputStream. You can
then use AudioSystem.write() to write a
.wav file (see the sketch below). (Florian) |
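A minimal sketch, assuming the vox data has already been decoded to 8 kHz, 16 bit signed, mono PCM and is available from pcmInputStream with numPcmBytes bytes (the concrete format is an assumption; raw Dialogic ADPCM would first have to be decoded by your own code or a plug-in):
AudioFormat format = new AudioFormat(8000.0F, 16, 1, true, false);
long lengthInFrames = numPcmBytes / format.getFrameSize();
AudioInputStream ais = new AudioInputStream(pcmInputStream, format, lengthInFrames);
AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File("out.wav")); |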
7.7. | How can I read from a headerless audio file? |
| If you know the format of your data, you can use the
following approach:
File file = new File("headerless_audio_data.dat");
InputStream is = new FileInputStream(file);
is = new BufferedInputStream(is);
AudioFormat format = new AudioFormat(...);
long lLengthInFrames = file.length() / format.getFrameSize();
AudioInputStream ais = new AudioInputStream(is, format,
lLengthInFrames); See also the example Converting raw data (headerless) files.
(Matthias) |
7.8. | How can I determine the length or
the duration of an audio file? |
| A common technique that works for PCM data is shown in
the example Getting information about an audio file. For files with encoded data,
the general technique is the following:
File file = new File("my_file.ext");
AudioFileFormat audioFileFormat = AudioSystem.getAudioFileFormat(file);
// get all properties
Map<String, Object> properties = audioFileFormat.properties();
// duration is in microseconds
Long duration = (Long) properties.get("duration");
Note that this technique requires the JDK 1.5.0. Even
with this version, it currently does not work for ordinary
.aiff, .au and
.wav files (this is an implementation
issue that can be fixed easily). With recent javazoom versions of the mp3 and Ogg Vorbis plug-ins (not
with the Tritonus versions), you can use a hack that tries
to simulate the above programming technique. It can be used
with older JDK versions:
import org.tritonus.share.sampled.file.TAudioFileFormat;
File file = new File("my_file.ext");
AudioFileFormat audioFileFormat = AudioSystem.getAudioFileFormat(file);
if (audioFileFormat instanceof TAudioFileFormat)
{
// Tritonus SPI compliant audio file format.
Map properties = ((TAudioFileFormat) audioFileFormat).properties();
// duration is in microseconds
Long duration = (Long) properties.get("duration");
} See also Why does
AudioInputStream.getFrameLength()
return -1?, How can I get the duration of
an Ogg Vorbis file?, How can I get the length of
an mp3 stream? and How can I calculate
the duration of a GSM file? (Matthias) |
7.9. | How can I write an audio file in
smaller parts? |
| AudioSystem.write() assumes that
the AudioInputStream you pass to it
contains everything that should go into the file. If you
don't want to write the file as a whole, but in blocks, you
can't use AudioSystem.write(). The
alternative is to use Tritonus'
AudioOutputStream architecture. See
Tritonus
plug-ins. (Matthias) |
7.10. | Why are some .wav files not
recognized by Java Sound? |
| Most types of audio file formats, including
.wav, can contain audio data in various
compressed formats. Only some of the formats are handled by
the standard audio file readers. The formats handled are
A-law and μ-law. Not handled are IMA ADPCM, MS ADPCM, and
others. (Matthias) |
7.11. | Why is it not possible to write big-endian data using
a WaveAudioOutputStream? |
| .wav files always store data in
little-endian order. And by design,
AudioOutputStreams do not do any
magic. In particular, they do not automatically convert
endianess or signedness. (Matthias) |
7.12. | How can I edit or modify audio files? |
| There are no special methods for this in Java
Sound. Nevertheless, it is obviously possible: read data
from a file into a byte array, modify the audio data there
and save the modified array to a file. See also How can I read an audio file and
store the audio data in a byte array? and
How can I write audio data from
a byte array to an audio file?. An alternative approach is to write a subclass of
AudioInputStream that modifies the data
"flowing" through it. You can see an example of this technique
in Change the amplitude (volume) of an audio file. (Matthias) |
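As an illustration of the alternative approach, here is a minimal sketch of an AudioInputStream subclass that scales 16 bit signed little-endian samples by a fixed gain factor. The class name and the clipping strategy are made up for this example, and exception handling is kept to a minimum:
import java.io.IOException;
import javax.sound.sampled.AudioInputStream;

public class AmplifiedAudioInputStream extends AudioInputStream
{
    private final float gain;

    public AmplifiedAudioInputStream(AudioInputStream source, float gain)
    {
        // reuse the format and length of the source stream
        super(source, source.getFormat(), source.getFrameLength());
        this.gain = gain;
    }

    public int read(byte[] data, int offset, int length) throws IOException
    {
        int bytesRead = super.read(data, offset, length);
        // assumes 16 bit signed, little-endian samples
        for (int i = offset; i < offset + bytesRead - 1; i += 2)
        {
            int sample = (data[i] & 0xFF) | (data[i + 1] << 8);
            sample = Math.round(sample * gain);
            // clip to the 16 bit range
            sample = Math.max(-32768, Math.min(32767, sample));
            data[i] = (byte) (sample & 0xFF);
            data[i + 1] = (byte) ((sample >> 8) & 0xFF);
        }
        return bytesRead;
    }
}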
7.13. | How can I play audio files where the data is cached in
the RAM? |
| There are two possibilities:
- Use Clip lines. They load the data into the RAM before playback. However, there is a limit to the size of the data somewhere between 2 and 5 MB. See also Clip.
- Read the whole file (including its headers) into a byte array. Then construct a ByteArrayInputStream from this array and pass it to AudioSystem.getAudioInputStream(InputStream) to obtain an AudioInputStream.
(Matthias) |
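A minimal sketch of the second possibility; the file name is only an example, and exception handling is omitted:
import java.io.*;
import javax.sound.sampled.*;

// read the complete file (including its headers) into a byte array
File file = new File("sound.wav");
InputStream fis = new FileInputStream(file);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = fis.read(buffer)) != -1)
{
    baos.write(buffer, 0, bytesRead);
}
fis.close();
byte[] fileData = baos.toByteArray();

// obtain an AudioInputStream from the in-memory copy
AudioInputStream ais = AudioSystem.getAudioInputStream(
    new ByteArrayInputStream(fileData));
The AudioInputStream obtained this way can then be played back as usual, without any further disk access.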
7.14. | Why is there a difference between using
AudioSystem.write(..., File) and using
AudioSystem.write(..., OutputStream)
with a FileOutputStream? |
| The basic problem is that the length of the audio data
has to be given in the header of an audio file, and the
header is written at the beginning of the file. The length
may not be known at the time the header is written. If the
AudioInputStream passed to
write() has a known length, this length
is used for filling in the header. If, however, the
AudioInputStream has an unknown
length (AudioSystem.NOT_SPECIFIED),
there is no valid information to fill in the header at the
beginning. OutputStream allows only
sequential writing: once the header is written, it cannot be
changed any more. So if the length of the audio data is
unknown, the header will contain invalid length
information. If the destination is given as a
File, the audio file writer can open
the file in random access mode. After writing all audio
data, it goes back to the beginning of the file and fixes
the header with the then-known length information. This
method is called "backpatching". Due to this behaviour,
AudioSystem.write(..., File) is
recommended over AudioSystem.write(...,
OutputStream), if using it is possible. See also
Why does
AudioInputStream.getFrameLength()
return -1? The AudioOutputStream
architecture of Tritonus has to deal with the same
problem. There, the difference exists between using a
TSeekableDataOutputStream
(representing a File, allows
backpatching) and using a
TNonSeekableDataOutputStream
(representing an OutputStream, does
not allow backpatching). See also How can I write an audio file in
smaller parts?
(Matthias) |
7.15. | Where can I find documentation on the
AudioOutputStream programming? |
| The API documentation is part of the Tritonus
docs. See Q: 9. The recommended way to learn about
programming with the
AudioOutputStream architecture is to
look at the examples that use it like Saving waveform data to a file
(AudioOutputStream version). (Matthias) |
7.16. | How can I start playback of a file at a certain
position? |
| You can call skip() on the
AudioInputStream you obtain for the
file. Note that skip() can only advance
the position, it cannot go back. To rewind see How do I rewind an
AudioInputStream?
(Matthias) |
7.17. | Is it possible to read and
write multichannel audio files? |
| The file readers and writers of both the Sun JDK and
Tritonus should support interleaved multi-channel WAVE
files. This feature hasn't been tested extensively, so there
may be minor bugs, but it should basically work. Interleaved multichannel PCM formats are represented
by an AudioFormat instance with the
respective number of channels. See also Can I use multi-channel
sound? (Matthias) |
7.18. | How can I compare two audio
files? |
| If you want to check whether two files are exactly the same, this is easy: just compare them byte by byte. However, typically, you want to compare two different recordings of the same piece of music. Because of noise, quantisation errors, different volume levels and other effects, two recordings never match exactly, so a simple comparison can't be used. A useful comparison is a non-trivial task that requires knowledge about digital signal processing. One approach to do such a comparison is the following:
1. normalize the file based on signal power
2. transform to frequency domain with an FFT
3. scale down the FFT components
4. compare the series of frequency components with a statistical analysis for correlation
You may get better results by exchanging step 1 with
step 2 and/or using a wavelet transformation instead of
FFT. See also How can I do equalizing / noise reduction
/ fft / echo cancellation / ...? and Q: 2 (Matthias) |
7.19. | Is it possible to insert
recorded audio data into an existing file? |
| With standard Java Sound functionality, it is not
possible to insert recorded sound into an existing file. The
obvious workaround is to record to a new, temporary file and
then join the pieces into the file you want. If direct writing to an existing file is important to
you, you could try to hack the AudioOutputStream classes of
Tritonus. I think it is possible to introduce a constructor
flag for "open existing" instead of "overwrite file
completely" and to introduce a skip() method to move to a
cue point. If you're interested in this, Florian and I will
help you to find your way through the implementation of
AudioOutputStreams. (Matthias) |
7.20. | How can I store an audio file
in a byte array? |
| You can pass an instance of
ByteArrayOutputStream to
AudioSystem.write(...,
OutputStream). The byte array you extract from
the ByteArrayOutputStream will
contain the complete file including the headers. This
technique is especially useful if you want to store audio
files in a database. (Matthias) |
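A minimal sketch, assuming that ais is an AudioInputStream with the data to save and that the target file type is .wav; exception handling is omitted:
import java.io.ByteArrayOutputStream;
import javax.sound.sampled.*;

AudioInputStream ais = ...;
ByteArrayOutputStream baos = new ByteArrayOutputStream();
AudioSystem.write(ais, AudioFileFormat.Type.WAVE, baos);
// the complete file, including the header
byte[] fileData = baos.toByteArray();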
7.21. | Which value should I use for
the length of the file in
AudioOutputStreams if the length is
not known in advance? |
| If the length of the file is not known in advance, you
should use the value
AudioSystem.NOT_SPECIFIED. Pass this
value to the constructors of
AudioOutputStream subclasses directly
or use it in requesting an
AudioOutputStream instance via
AudioSystemShadow.getAudioOutputStream(). Note that not knowing the length makes it impossible
to use OutputStreams as targets for some audio file types (File targets should always work). (Matthias) |
8. Sample Representation and
AudioFormat |
- 8.1. How is audio
represented digitally?
- 8.2. In which cases should I use a
floating point representation for audio data?
- 8.3. What is the meaning of frame rate in
AudioFormat?
- 8.4. What is the meaning of frame size in
Audioformat?
- 8.5. What is signed / unsigned?
- 8.6. How can I use Java's signed byte type to store an 8
bit unsigned sample?
- 8.7. How can I find out if an
AudioFormat is signed or
unsigned?
- 8.8. What is endianess / big endian /
little endian?
- 8.9. How are samples organized in
a byte array/stream?
- 8.10. What does "unknown sample rate" in an
AudioFormat object mean?
| |
8.1. | How is audio
represented digitally? |
| Each second of sound consists of a certain number of digital samples of sound pressure (44,100 per second on a CD). The number of samples per second is called the sample rate or sample frequency. In PCM (pulse code modulation) coding, each sample is usually a linear representation of amplitude as a signed integer (sometimes unsigned for 8 bit). There is one such sample for each channel: one channel for mono, two channels for stereo, four channels for quad, more for surround sound. One sample frame consists of one sample for each of the channels in turn, by convention running from left to right. Each sample can be one byte (8 bits), two bytes (16 bits), three bytes (24 bits), or maybe even 20 bits or a floating-point number. Sometimes, for more than 16 bits per sample, the sample is padded to 32 bits (4 bytes). The order
of the bytes in a sample is different on different
platforms. In a Windows WAV soundfile, the less significant
bytes come first from left to right ("little endian" byte
order). In an AIFF soundfile, it is the other way round, as
is standard in Java ("big endian" byte
order). Floating-point numbers (4 byte float or 8 byte
double) are the same on all platforms. See also How are samples organized in
a byte array/stream? and What is endianess / big endian /
little endian?
(Matthias) |
8.2. | In which cases should I use a
floating point representation for audio data? |
| Converting sample data to a floating point
representation (float or double data type) is handy if you
are doing DSP stuff. In this case, it gives greater
precision and greater dynamic range. In all other cases,
there is no advantage. Note also that conversion to or from
floats is expensive, while dealing only with integer formats
is typically much faster. (Matthias) |
8.3. | What is the meaning of frame rate in
AudioFormat? |
| For PCM, A-law and μ-law data, a frame is all data that belongs to one sampling interval. This means that the frame rate is the same as the sample rate. For compressed formats like Ogg Vorbis, mp3 and GSM 06.10, the situation is different. A frame is a block of data as it is output by the encoder. Often, these blocks contain the information for several sampling intervals. For instance, an mp3 frame represents about 24 ms. So the frame rate is about 40 Hz. However, the sample
rate of the original is preserved even inside the frames and
is correctly restored after decoding. (Matthias) |
8.4. | What is the meaning of frame size in
Audioformat? |
| As outlined in the previous question, it depends on
what a frame is. For PCM, the frame size is just the number
of bytes for one sample, multiplied by the number of
channels. Note that usually each individual sample is
represented in an integer number of bytes. For instance, a
12 bit stereo frame uses 4 bytes, not 3. For compressed
formats, the frame size is some more-or-less arbitrarily
chosen number that is a property of the compression
schema. Some compression methods do not have a constant frame size but a variable one. In this case the value returned by
AudioFormat.getFrameSize() is
-1. Some common frame sizes: (Matthias) |
8.5. | What is signed / unsigned? |
| For PCM, sample values are represented by
integers. These integers can be signed or unsigned, similar
to signed or unsigned data types in programming languages
like C. The following table shows the value ranges for
signed and unsigned integers of common sizes and of the
general case: (Matthias) |
8.6. | How can I use Java's signed byte type to store an 8
bit unsigned sample? |
| Basically, a byte is a storage container for 8
bits. Whether these 8 bits are used to store a signed or an
unsigned number is a matter of interpretation. Yes, Java
always interprets bytes as signed. But they can be
interpreted just the other way, too. The 8 bits can always represent 256 different bit
patterns. In unsigned interpretation, these 256 bit patterns
are interpreted as the decimal values 0 to 255. In signed
interpretation, patterns are interpreted as the decimal
values -128 to 127. The following table may help to understand
this. In representing wave forms, the range of the
respective interpretation is used to express minimum and
maximum of the wave. As you can see, the difference between signed and
unsigned notation, expressed in decimal, is
128. (Matthias) |
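A minimal sketch of switching between the two interpretations in Java; the values are only examples:
// reinterpret a Java (signed) byte as an unsigned value 0 to 255
byte storedSample = (byte) 0x90;
int unsignedValue = storedSample & 0xFF; // 144

// store an unsigned value 0 to 255 in a Java byte
int value = 200;
byte stored = (byte) value; // bit pattern 0xC8, prints as -56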
8.7. | How can I find out if an
AudioFormat is signed or
unsigned? |
| For PCM, check if the encoding equals either
AudioFormat.Encoding.PCM_SIGNED or
AudioFormat.Encoding.PCM_UNSIGNED.
(Matthias) |
8.8. | What is endianess / big endian /
little endian? |
| Most common computers have their memory organized in
units of 8 bits, called a byte. The bytes can be addressed by
ordinal numbers, starting with zero. (The hardware
organization of the memory is often in rows of 16, 32, 64,
128 or even more bits. But the instruction set of the
processor still gives you the view of the byte-organized
memory.) If you want to store a value that needs more than
8 bits, the question arises how the bits of the value are
divided into bytes and stored in memory. If you have a value
with 16 bits, there is not much discussion that it has to be
divided into two groups: bit 0 to 7 and bits 8 to 15. But
then, the fight starts. Some CPUs store the first group (bits 0 to 7) in the byte with the lower address and the second
group (bits 8 to 15) in the byte with the higher
address. This schema is called little endian. As an example,
all Intel architecture and Alpha CPUs are little
endian. Other types of CPUs do it the other way round, which
is called big endian. Sparc (Sun), PowerPC (Motorola, IBM)
and Mips (PMC-Sierra) CPUs are big endian. For Java Sound, endianess matters if the size of
samples (as given by
AudioFormat.getSampleSizeInBits()) is
greater than 8 bit. For 8 bit data, while the endianess
still has to be specified in an
AudioFormat object, it has no
significance. It is a convention in Java Sound that
Mixer,
AudioFileWriter and
FormatConversionProvider implementations
handle both endianesses, but you can't really rely on
this. (Matthias) |
8.9. | How are samples organized in
a byte array/stream? |
| It depends on the format of the data, which is given
as an AudioFormat instance. Below are
some common cases. To understand the terms little endian, big endian,
high byte and low byte, see What is endianess / big endian /
little endian? See also How do I convert short (16 bit)
samples to bytes to store them in a byte array? and
How can I reconstruct
sample values from a byte array? (Matthias) |
8.10. | What does "unknown sample rate" in an
AudioFormat object mean? |
| Since 1.5.0, "unknown sample rate" is output by
AudioFormat.toString() if the sample
rate is -1
(AudioSystem.NOT_SPECIFIED). See also
Why are there
AudioFormat objects with frame
rate/sample rate reported as -1 when I
query a Mixer for its supported
formats?
(Matthias) |
9. Conversion between sample
representations |
- 9.1. How can I convert 8 bit signed samples to 8 bit
unsigned or vice versa?
- 9.2. How do I convert short (16 bit)
samples to bytes to store them in a byte array?
- 9.3. How do I convert float or double samples
to bytes to store them in a byte array?
- 9.4. How can I reconstruct
sample values from a byte array?
- 9.5. How can I convert between mono
and stereo?
- 9.6. How can I make a mono stream
appear on one channel of a stereo stream?
| |
9.1. | How can I convert 8 bit signed samples to 8 bit
unsigned or vice versa? |
| Signed to unsigned:
byte unsigned = (byte) (signed + 128); Unsigned to signed:
byte signed = (byte) (unsigned - 128); Alternatively, you can use the following for both conversions:
byte changed = (byte) (original ^ 0x80); (Matthias) |
9.2. | How do I convert short (16 bit)
samples to bytes to store them in a byte array? |
| Generally:
short sample = ...;
byte high = (byte) ((sample >> 8) & 0xFF);
byte low = (byte) (sample & 0xFF); If you want to store them in an array in big endian
byte order:
short sample = ...;
byte[] buffer = ...;
int offset = ...;
// high byte
buffer[offset + 0] = (byte) ((sample >> 8) & 0xFF);
// low byte
buffer[offset + 1] = (byte) (sample & 0xFF); If you want to store them in an array in little endian
byte order:
short sample = ...;
byte[] buffer = ...;
int offset = ...;
// low byte
buffer[offset + 0] = (byte) (sample & 0xFF);
// high byte
buffer[offset + 1] = (byte) ((sample >> 8) & 0xFF); Note that in Java, arithmetic operations on integers are always done with ints (32 bit) or longs (64
bit). Using arithmetic operations on byte or short leads to
extending them to int. Therefore, storing 16 bit values in
int (32 bit) variables uses less processing time if you want
to do calculations like the above. On the other hand, it
doubles memory usage. Optimized code to do these conversions can be found in
the class TConversionTool
of Tritonus. See also
How are samples organized in
a byte array/stream? (Matthias) |
9.3. | How do I convert float or double samples
to bytes to store them in a byte array? |
| You can do this with the following steps: clamp (saturate) the sample to the range -1.0 to 1.0, scale it to the 16 bit range, convert it to an integer and split it into two bytes. Code example for float samples:
// the sample to process
float fSample = ...;
// saturation
fSample = Math.min(1.0F, Math.max(-1.0F, fSample));
// scaling and conversion to integer
int nSample = Math.round(fSample * 32767.0F);
byte high = (byte) ((nSample >> 8) & 0xFF);
byte low = (byte) (nSample & 0xFF); Code example for double samples:
// the sample to process
double dSample = ...;
// saturation
dSample = Math.min(1.0, Math.max(-1.0, dSample));
// scaling and conversion to integer
int nSample = (int) Math.round(dSample * 32767.0);
byte high = (byte) ((nSample >> 8) & 0xFF);
byte low = (byte) (nSample & 0xFF); (Matthias) |
9.4. | How can I reconstruct
sample values from a byte array? |
| The code below assumes that buffer is an array of
bytes and offset an int, used as an index into the
buffer. It further assumes that the sample values are signed
for sample sizes greater than 8 bit. Optimized code to do these conversions can be found in
the class TConversionTool
of Tritonus. See also
How are samples organized in
a byte array/stream? (Matthias) |
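A minimal sketch for 16 bit signed samples, with buffer and offset as described above; for other sample sizes or for unsigned data, the code has to be adapted accordingly:
byte[] buffer = ...;
int offset = ...;

// 16 bit signed, big endian (high byte first)
int sampleBigEndian = (buffer[offset] << 8) | (buffer[offset + 1] & 0xFF);

// 16 bit signed, little endian (low byte first)
int sampleLittleEndian = (buffer[offset + 1] << 8) | (buffer[offset] & 0xFF);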
9.5. | How can I convert between mono
and stereo? |
| This is possible with the PCM2PCM converter of
Tritonus. It is available as part of the "Tritonus
Miscellaneous" package. See Tritonus
Plug-ins. (Matthias) |
9.6. | How can I make a mono stream
appear on one channel of a stereo stream? |
| You can use a technique as shown below. The example
assumes that the data is 8 bit unsigned.
// incoming: mono input stream
// outgoing: stereo output stream
void monoToSingleSideStereo(byte[] incoming, byte[] outgoing, boolean copyToLeftChannel)
{
int nSignalOffset;
int nSilenceOffset;
// this is for unsigned data. For signed data, use the value 0.
byte nSilenceValue = -128;
if (copyToLeftChannel)
{
nSignalOffset = 0;
nSilenceOffset = 1;
}
else // signal to the right channel
{
nSignalOffset = 1;
nSilenceOffset = 0;
}
for (int i = 0; i < incoming.length; i++)
{
outgoing[(i * 2) + nSignalOffset] = incoming[i];
outgoing[(i * 2) + nSilenceOffset] = nSilenceValue;
}
} Alternatively, you can use a PAN
control on a SourceDataLine while
doing playback. Note that this only works with the "Java
Sound Audio Engine". With the "Direct Audio Device" mixers,
you have to use a workaround: convert the mono stream to a
stereo stream (see How can I convert between mono
and stereo?), open the line in stereo and
use a BALANCE control. See also Why are there no mono lines
with the "Direct Audio Devices" mixers on Linux?
(Matthias) |
10. AudioInputStreams and Byte
Arrays |
- 10.1. How can I read an audio file and
store the audio data in a byte array?
- 10.2. How can I write audio data from
a byte array to an audio file?
- 10.3. How can I calculate the number of bytes to skip from
the length in seconds?
- 10.4. How do I rewind an
AudioInputStream?
- 10.5. How do I skip backwards on an
AudioInputStream?
- 10.6. How can I implement a
real-time AudioInputStream, though I
cannot give a length for it, as it is not known in
advance?
- 10.7. How can I mix two (or more)
AudioInputStream instances to a
resulting AudioInputStream?
- 10.8. How can I create an
AudioInputStream that represents a
portion of another
AudioInputStream?
- 10.9. Why does
AudioInputStream.getFrameLength()
return -1?
- 10.10. What is the
difference between
AudioSystem.getAudioInputStream(InputStream)
and new AudioInputStream(InputStream, AudioFormat,
long)?
| |
10.1. | How can I read an audio file and
store the audio data in a byte array? |
| Create a ByteArrayOutputStream
object. Then, in a loop, read from the AudioInputStream and
write the data read from it to the
ByteArrayOutputStream. Once all data
is processed, call
ByteArrayOutputStream.toByteArray() to
get a byte array with all the data. See Buffering of audio data in memory
for a code example. As an alternative, you can do the following:
- Calculate the required size of the byte array from the number of frames and the frame size (see How can I determine the length or the duration of an audio file?).
- Create a byte array of the calculated size.
- Call AudioInputStream.read() with this array. Note that while this typically reads the whole file in one call, this is not guaranteed. If, for some reason, reading the whole content of the AudioInputStream does not succeed, only part of the data may be written to the byte array. Therefore, you have to compare the return value of read() against the length of the byte array. If some part is missing, you have to call read() again with an appropriate offset.
(Matthias) |
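A minimal sketch of the ByteArrayOutputStream technique described above; the file name is only an example, and exception handling is omitted:
import java.io.*;
import javax.sound.sampled.*;

File file = new File("my_file.wav");
AudioInputStream ais = AudioSystem.getAudioInputStream(file);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buffer = new byte[4096];
int bytesRead;
while ((bytesRead = ais.read(buffer)) != -1)
{
    baos.write(buffer, 0, bytesRead);
}
// the raw audio data, without the file header
byte[] audioData = baos.toByteArray();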
10.2. | How can I write audio data from
a byte array to an audio file? |
| Create a ByteArrayInputStream
object from the byte array, create an
AudioInputStream from it, then call
AudioSystem.write(). See Buffering of audio data in memory
for a code example. (Matthias) |
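A minimal sketch, assuming that audioData contains raw PCM data; the AudioFormat (44.1 kHz, 16 bit, stereo, signed, little-endian) and the file name are only examples, and exception handling is omitted:
import java.io.*;
import javax.sound.sampled.*;

byte[] audioData = ...;
AudioFormat format = new AudioFormat(44100.0F, 16, 2, true, false);
AudioInputStream ais = new AudioInputStream(
    new ByteArrayInputStream(audioData), format,
    audioData.length / format.getFrameSize());
AudioSystem.write(ais, AudioFileFormat.Type.WAVE, new File("out.wav"));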
10.3. | How can I calculate the number of bytes to skip from
the length in seconds? |
| Use one of the following formulas:
bytes = seconds * sample rate * channels * (bits
per sample / 8)
or
bytes = seconds * sample rate * frame size
You can get the sample rate, number of channels, bits per
sample and frame size from an
AudioFormat
object. (Matthias) |
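The second formula in code, assuming that format is the AudioFormat of the data; calculating the number of frames first ensures that you skip a whole number of frames:
AudioFormat format = ...;
double seconds = 2.5; // the position to skip to, only an example
long frames = (long) (seconds * format.getSampleRate());
long bytesToSkip = frames * format.getFrameSize();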
10.4. | How do I rewind an
AudioInputStream? |
| See the example Playing an audio file multiple times. Note that the way the
JavaSoundDemo does it is not recommended, because it relies
on implementation specific behaviour. (Matthias) |
10.5. | How do I skip backwards on an
AudioInputStream? |
| In general, there is no clean way besides buffering
the whole content of the
AudioInputStream as done in the
example Playing an audio file multiple times. There is one
possibility: if the AudioInputStream
is created from a FileInputStream,
you can use AudioInputStream.skip() with
a negative skip amount. This works because the
AudioInputStream implementation just
passes the skip() call to its underlying
stream and the FileInputStream
implementation is able to handle random accesses. Note,
however, that this relies on unspecified, implementation
specific behaviour of the Sun JDK. Therefore, this approach
should be used with care. (Matthias) |
10.6. | How can I implement a
real-time AudioInputStream, though I
cannot give a length for it, as it is not known in
advance? |
| You should use
AudioSystem.NOT_SPECIFIED as length.
This approach seems logical to me and it works fine in my
program. (Florian) |
10.7. | How can I mix two (or more)
AudioInputStream instances to a
resulting AudioInputStream? |
| There are no special methods in the Java Sound API to
do this. However, mixing is a trivial signal processing task; it can be accomplished with plain Java code (see the sketch below). Have a
look at Concatenating or mixing audio files. See also How can I do mixing of audio
streams? (Matthias) |
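A minimal sketch of mixing in plain Java, assuming two arrays of 16 bit samples that have already been converted to int values (see How can I reconstruct sample values from a byte array?); the simple clipping used here is only one possible strategy:
int[] samplesA = ...;
int[] samplesB = ...;
int length = Math.min(samplesA.length, samplesB.length);
int[] mixed = new int[length];
for (int i = 0; i < length; i++)
{
    int sum = samplesA[i] + samplesB[i];
    // clip to the 16 bit range to avoid overflow
    mixed[i] = Math.max(-32768, Math.min(32767, sum));
}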
10.8. | How can I create an
AudioInputStream that represents a
portion of another
AudioInputStream? |
| To create a derived
AudioInputStream that starts at frame
start of the original
AudioInputStream and has a length of
length frames, you can use the following
code:
AudioInputStream originalAIS = ...
int start = ...; // in frames
int length = ...; // in frames
int frameSize = originalAIS.getFormat().getFrameSize();
originalAIS.skip(start * frameSize);
AudioInputStream derivedAIS = new AudioInputStream(originalAIS,
originalAIS.getFormat(), length); (Matthias) |
10.9. | Why does
AudioInputStream.getFrameLength()
return -1? |
| A length of -1
(AudioSystem.NOT_SPECIFIED) means that
the length of the stream is unknown. This typically happens in two situations:
- If an AudioInputStream obtains its data from a TargetDataLine, the amount of data (and therefore, the length of the stream) is determined by the length of the recording. Obviously, this cannot be known at the time the AudioInputStream instance is created.
- If audio data is encoded to or decoded from a compression format like Ogg Vorbis or mp3, the length of the encoded data is not a simple fraction of the length of the unencoded data. In this case, it is not possible for the codec to calculate the length of the converted stream, so it has to state that the length is unknown.
To write portable programs, you should always expect
that the length of an
AudioInputStream may be
-1. For instance, if you are
calculating a buffer size from the stream length, you should
handle this case separately. (Matthias) |
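For instance, a minimal sketch of handling the unknown length when calculating a buffer size; the fallback size is arbitrary:
AudioInputStream ais = ...;
int bufferSize;
if (ais.getFrameLength() == AudioSystem.NOT_SPECIFIED)
{
    // length unknown: fall back to a fixed buffer size
    bufferSize = 64 * 1024;
}
else
{
    bufferSize = (int) (ais.getFrameLength()
        * ais.getFormat().getFrameSize());
}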
10.10. | What is the
difference between
AudioSystem.getAudioInputStream(InputStream)
and new AudioInputStream(InputStream, AudioFormat,
long)? |
| AudioSystem.getAudioInputStream(InputStream)
"intelligently" parses the header of the file in
InputStream and tries to retrieve the
format of it. This fails for "raw" audio files or files that
aren't recognized by Java Sound. The
AudioInputStream returned by this
method is at the position where the actual audio data
starts; the file header is skipped. AudioInputStream(InputStream, AudioFormat,
long) is a "stupid" constructor that just returns
an AudioInputStream with the
InputStream used "as is". No attempt
is made to verify the format with the given
AudioFormat instance - if you pass a
wrong AudioFormat, the data in
InputStream is interpreted in a wrong
way. Using the second way on an
InputStream that is obtained from an
audio file would give an
AudioInputStream where the "audio
data" starts with the file header. Often, the difference
won't be noticable, because headers are typically short
(typically 44 bytes for .wav files, 24
bytes for .au files). However, there is
no quarantee that the header is not much longer in some
audio files, and that it will be audible as clicks or
noise. An exception are audio files without a header. These
are typically "streamable" formats, e.g. mp3 and GSM 06.10. There, the data is
organized in frames, and each frame has a very basic
description of the audio data. So for these headerless
formats, the two ways to get an
AudioInputStream are
equivalent. (Matthias) |
11. Data Processing (Amplifying,
Mixing, Signal Processing) |
- 11.1. How can I do some processing on an A-law stream (like
amplifying it)?
- 11.2. How can I detect the level of sound while I am
recording it?
- 11.3. How can I do sample rate
conversion?
- 11.4. How can I detect the frequency (or pitch) of sound data?
- 11.5. How can I do equalizing / noise reduction
/ fft / echo cancellation / ...?
- 11.6. How can I do silence suppression or silence
detection?
- 11.7. How can I do mixing of audio
streams?
- 11.8. Should I use float or double for signal
processing?
- 11.9. How can I do computations with complex numbers in
Java?
- 11.10. How can I change the pitch
(frequency) of audio data without changing the
duration?
- 11.11. How can I change the duration of
audio data without changing the pitch (frequency)?
- 11.12. How can I use reverbation?
- 11.13. How can I find out the maximum volume of a sound file?
- 11.14. How can I normalize the volume of sound?
- 11.15. How can I calculate the power
of a signal?
| |
11.1. | How can I do some processing on an A-law stream (like
amplifying it)? |
| It is much easier to change gain with linear encoding
(PCM). I would strongly suggest that approach, especially when you have the data in linear format at first. You'd have to
convert it back to A-law after processing. (Florian) |
11.2. | How can I detect the level of sound while I am
recording it? |
| First of all, you should have the data in PCM format
(preferably in signed PCM). Then you can look at the samples
to detect the amplitude (level). Some statistics are
suitable, too, like taking the average of the absolute
values or RMS. (Florian) |
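A minimal sketch of such statistics for a block of 16 bit signed samples that have already been converted to int values (see How can I reconstruct sample values from a byte array?):
int[] samples = ...;
long sumOfAbsoluteValues = 0;
int peak = 0;
for (int i = 0; i < samples.length; i++)
{
    int absolute = Math.abs(samples[i]);
    sumOfAbsoluteValues += absolute;
    peak = Math.max(peak, absolute);
}
// average of the absolute values and peak, normalized to 0.0 ... 1.0
double averageLevel = (double) sumOfAbsoluteValues / samples.length / 32768.0;
double peakLevel = peak / 32768.0;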
11.3. | How can I do sample rate
conversion? |
| Currently, this is not supported by the Sun JDK (see
bug #4916960). Tritonus has a sample
rate converter that is available as a plug-in for other Java
Sound implementations, too. See the 'Tritonus
Miscellaneous' package at Tritonus
Plug-ins. See Converting the sample rate of audio files for a code example. Also,
JMF
supports sample rate conversion. See also How can I convert
between two encoded formats directly (e.g. from mp3 to
A-law)? and Q: 16
(Matthias) |
11.4. | How can I detect the frequency (or pitch) of sound data? |
| What you need is an algorithm called 'fast Fourier
transform', abbreviated 'FFT'. See also How can I do equalizing / noise reduction
/ fft / echo cancellation / ...? and Q: 2. (Matthias) |
11.5. | How can I do equalizing / noise reduction
/ fft / echo cancellation / ...? |
| Java Sound is an API concerned with basic sound input
and output. It does not contain digital signal processing
algorithms. Nevertheless, you can do this with Java; you
just have to code it on your own. Craig Lindley's book (see
Q: 1)
contains some DSP algorithms. Also, it is often easy to transform C or C++ code found on the net to Java. You may want to have a look at the comp.dsp FAQ. For code that does FFT, have a look at the Peruna
Project (original website is offline, view it at the
Internet
Archive). (Matthias) |
11.6. | How can I do silence suppression or silence
detection? |
| This can be achieved with a variant of a common DSP
algorithm called "noise gate". A noise gate is a special form
of a compressor, which belongs to the area of dynamic
processing. (Matthias) |
11.7. | How can I do mixing of audio
streams? |
| If you want to do playback of multiple streams, just
obtain multiple instances of
SourceDataLine, one for each stream
to play. The data fed to the
SourceDataLine instances is mixed
inside the Mixer instance, either in
software or in hardware. There is no way to monitor the
result of the mixing, other than looping the soundcard's
output line to some input line. For mixing without playback see How can I mix two (or more)
AudioInputStream instances to a
resulting AudioInputStream? If the sources of the audio data are not
AudioInputStream instances, but byte
buffers, you can use the class
FloatSampleBuffer of Tritonus.
(Matthias) |
11.8. | Should I use float or double for signal
processing? |
| This is a question discussed over and over again. It
seems that there is no definitive answer. Which way to go
depends on the circumstances and the requirements. Here are
some arguments in favour of each alternative. Advantages of using float:
- It uses half of the memory size used by double: 4 bytes instead of 8 bytes per sample. This may be an issue if large amounts of data are stored in a floating point representation.
- Calculations may be faster. This depends on the processor. For Pentium-class processors, there is no performance gain by using float with the standard FPU (Floating Point Unit): both float and double are handled using an 80 bit representation internally anyway. However, "multimedia" instructions that execute more than one operation simultaneously are only available for float.
- The memory bandwidth needed to transfer data from the RAM to the processor and vice versa is half of that needed for double. This is an issue in real time systems with high throughput, where the memory bandwidth is the limiting factor.
Advantages of using double:
- There are smaller rounding errors for filter constants, so for algorithms with feedback (IIR filters), the probability of numerical instability is lower. Some algorithms with a lot of feedback like reverb may require double.
- Several mathematical functions (for instance sin(), log(), pow()) are only available with double parameters and return values. Using double throughout instead of float saves the conversions between float and double.
(Matthias) |
11.9. | How can I do computations with complex numbers in
Java? |
| Here are two implementations of classes for complex
numbers: Cmplx.java,
The
Colt Distribution (Open Source Libraries for High
Performance Scientific and Technical Computing in
Java). (Matthias) |
11.10. | How can I change the pitch
(frequency) of audio data without changing the
duration? |
| This is a quite complex problem called "pitch
shifting". It requires advanced DSP algorithms. This is not
available as part of the Java Sound API and is unlikely to
ever become so. However, it is possible to do this in
Java. One example is in Craig Lindley's book (see Q: 1). (Matthias) |
11.11. | How can I change the duration of
audio data without changing the pitch (frequency)? |
| This is a problem similar to pitch shifting: It
requires non-trivial DSP algorithms. See Marvin's
mail and Simon's
mail for some links. (Matthias) |
11.12. | How can I use reverbation? |
| The "Java Sound Audio Engine" (see What are all these mixers?) has an implementation
of a Reverb control (it is implemented as a Control of the Mixer. Note that Mixer extends Line, so you can get controls from a Mixer, too). However, it seems that it is not working. In general, it is recommended to implement reverb yourself. The reason is that the availability of reverb as a control of a Mixer is an implementation-specific property of certain Mixer implementations. The "Java Sound Audio Engine" supports reverb; all other mixers don't. So relying on reverb in the mixer makes your program not portable. In the upcoming JDK 1.5.0, the "Java Sound Audio Engine" is no longer the default mixer; the defaults are now the "Direct Audio Device" mixers. There are many
good reasons to use the "Direct Audio Device" mixers instead
of the "Java Sound Audio Engine", including low latency and
support for multiple soundcards. But if you need the
reverberation, you are tied to the "Java Sound Audio
Engine". And one day, the "Java Sound Audio Engine" may
disappear completely. (Matthias) |
11.13. | How can I find out the maximum volume of a sound file? |
| In a loop, go through the whole file and examine each
sample. The maximum volume is the maximum of the absolute
values of all samples. For getting the sample values, see
How are samples organized in
a byte array/stream? (Matthias) |
11.14. | How can I normalize the volume of sound? |
| One way to do it is to scan the whole wave to find
its maximum (and minimum, or abs() it) sample value, get the ratio of this to the available maximum and scale up the whole wave. An alternative that might work in practice is to use a compressor, i.e. apply a scaling algorithm that boosts the lower-level parts of the signal. This has the perceived effect of making everything sound louder; it's often done to TV ads. The latter (compression) is the preferable approach. Just looking for the maximum and minimum sample may result in a quiet tune not getting louder, because it may have a single peak at the maximum. It's better to
calculate the average level of the whole piece and use this
value in relation to the possible maximum level to predict a
compression ratio. Note that this usage of the term
"compression" refers to reducing the dynamic range of
music. It has nothing to do with the compression of MP3,
which means reducing the storage size or
bitrate. (Matthias) |
11.15. | How can I calculate the power
of a signal? |
| You have to calculate the root-mean-square (RMS) average of consecutive samples. For four samples, the formula looks like
this: rms = sqrt( (x0^2 + x1^2 + x2^2 + x3^2) / 4)
(Matthias) |
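The same formula in code, for an arbitrary number of consecutive 16 bit samples given as int values:
int[] samples = ...;
double sumOfSquares = 0.0;
for (int i = 0; i < samples.length; i++)
{
    sumOfSquares += (double) samples[i] * samples[i];
}
double rms = Math.sqrt(sumOfSquares / samples.length);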
12. Compression and Encodings |
- 12.1. Ogg Vorbis
- 12.1.1. What is Ogg Vorbis?
- 12.1.2. How can I play back Ogg Vorbis files?
- 12.1.3. How can I encode Ogg Vorbis files?
- 12.1.4. Who should we lobby to get Ogg
Vorbis support in the Sun JRE?
- 12.1.5. How can I get the duration of
an Ogg Vorbis file?
- 12.2. mp3
- 12.2.1. How can I play back mp3 files?
- 12.2.2. Why is there no mp3 decoder in the
Sun JRE/JDK?
- 12.2.3. What is the legal state of the
JLayer mp3 decoder?
- 12.2.4. What are the differences
between the JLayer mp3 decoder plug-in and the Sun mp3
decoder plug-in?
- 12.2.5. How can I encode mp3 files?
- 12.2.6. Is there a mp3 encoder implemented in pure
java?
- 12.2.7. Which input formats
can I use for the mp3 encoder?
- 12.2.8. Is mp3 encoding possible on Mac OS?
- 12.2.9. Why do I get an
UnsupportedAudioFileException when
trying to play a mp3 file?
- 12.2.10. How can I get the length of
an mp3 stream?
- 12.3. GSM 06.10
- 12.3.1. Is there support for GSM?
- 12.3.2. Why does the GSM codec refuse to encode from/decode
to the format I want?
- 12.3.3. How can I read a
.wav file with GSM data or store
GSM-encoded data in a .wav
file?
- 12.3.4. I want to convert to/from GSM using the Tritonus
plug-in. However, I do not work with files or
streams. Rather, I want to convert byte[]
arrays.
- 12.3.5. How can I decode GSM from frames of 260 bit?
- 12.3.6. How can I calculate
the duration of a GSM file?
- 12.3.7. Are there native
implementations of codecs that are compatible with the
framing format used by the Java Sound GSM codec?
- 12.4. A-law and μ-law
- 12.4.1. What are A-law and μ-law?
- 12.4.2. How can I convert a PCM encoded byte[]
to a μ-law byte[]?
- 12.5. Speex
- 12.5.1. What is Speex?
- 12.5.2. Is there support for Speex?
- 12.5.3. How do I use JSpeex?
- 12.5.4. How can I get the duration of a Speex file?
- 12.6. Miscellaneous
- 12.6.1. Is there support for ADPCM (a.k.a. G723) in Java
Sound?
- 12.6.2. Is there support for WMA and ASF in Java
Sound?
- 12.6.3. How can I convert
between two encoded formats directly (e.g. from mp3 to
A-law)?
- 12.6.4. What compression schemas
can I use?
- 12.6.5. How can I get
Encoding instances for GSM and mp3
with JDKs older than 1.5.0?
- 12.6.6. Is there support for RealAudio /
RealMedia (.ra /
.rm files)?
- 12.6.7. How can I get support for
a new encoding?
| |
12.1. Ogg Vorbis |
- 12.1.1. What is Ogg Vorbis?
- 12.1.2. How can I play back Ogg Vorbis files?
- 12.1.3. How can I encode Ogg Vorbis files?
- 12.1.4. Who should we lobby to get Ogg
Vorbis support in the Sun JRE?
- 12.1.5. How can I get the duration of
an Ogg Vorbis file?
| |
12.1.1. | What is Ogg Vorbis? |
| From the website: Ogg Vorbis is a fully open, non-proprietary,
patent-and-royalty-free, general-purpose compressed
audio format for mid to high quality (8kHz-48.0kHz, 16+
bit, polyphonic) audio and music at fixed and variable
bitrates from 16 to 128 kbps/channel. This places Vorbis
in the same competitive class as audio representations
such as MPEG-4 (AAC), and similar to, but higher
performance than MPEG-1/2 audio layer 3, MPEG-4 audio
(TwinVQ), WMA and PAC. Vorbis is the first of a planned family of Ogg
multimedia coding formats being developed as part of
Xiph.org's Ogg multimedia project.
For more information see The Ogg Vorbis CODEC
project. (Matthias) |
12.1.2. | How can I play back Ogg Vorbis files? |
| A Plug-in
for Java Sound is available from the Tritonus
project. It uses JOrbis, a pure
Java decoder from the JCraft project. Under
development is also a decoder based on native
libraries. (Matthias) |
12.1.3. | How can I encode Ogg Vorbis files? |
| A beta version of an encoder based on native
libraries is available as part of the Tritonus
project. See plug-ins
(Matthias) |
12.1.4. | Who should we lobby to get Ogg
Vorbis support in the Sun JRE? |
| You can vote for the RFE
#4671067 to include Ogg Vorbis in the JRE. A
remark from Florian: As far as I know, there is no development yet in
Java Sound for Mustang. Also, there are legal problems
(licenses...) for including ogg support in Java. Sun
cannot just include 3rd party code, no matter what
license it is published. I've tried to push inclusion of
native bindings for ogg in Java so that you just need to
install the ogg library locally to get ogg support in
Java.
See also RFE
#4499904. (Matthias) |
12.1.5. | How can I get the duration of
an Ogg Vorbis file? |
| Currently, the JavaZOOM version of the Vorbis
decoder plug-in (VorbisSPI)
sets the duration property in
TAudioFileFormat if the data source
is a File. Ways to provide length
and duration information for URL
and InputStream sources are under
discussion; see the mailing list archives. See also How can I determine the length or
the duration of an audio file?
(Matthias) |
12.2. mp3 |
- 12.2.1. How can I play back mp3 files?
- 12.2.2. Why is there no mp3 decoder in the
Sun JRE/JDK?
- 12.2.3. What is the legal state of the
JLayer mp3 decoder?
- 12.2.4. What are the differences
between the JLayer mp3 decoder plug-in and the Sun mp3
decoder plug-in?
- 12.2.5. How can I encode mp3 files?
- 12.2.6. Is there a mp3 encoder implemented in pure
java?
- 12.2.7. Which input formats
can I use for the mp3 encoder?
- 12.2.8. Is mp3 encoding possible on Mac OS?
- 12.2.9. Why do I get an
UnsupportedAudioFileException when
trying to play a mp3 file?
- 12.2.10. How can I get the length of
an mp3 stream?
| |
12.2.1. | How can I play back mp3 files? |
| There is a pure Java decoder from the javazoom project. Tritonus, the open source implementation of Java Sound, incorporates it. There is a plug-in available which runs under any JVM. Sun has also released a pure Java mp3 decoder plug-in: Java MP3 PlugIn. There is also a native mp3 decoder implementation. It is part of the mp3 encoder plug-in. See
Tritonus
Plug-ins. (Matthias) |
12.2.2. | Why is there no mp3 decoder in the
Sun JRE/JDK? |
| A quote from Florian: “As far as I know, Sun will not include MP3
support into the JRE, mostly because it would require a
separate license to click through during
installation. That's also the reason why it could not be
enabled that your software downloads the plug-in on your
own since the license must be acknowledged by every
end-user. It's the crazy lawyers.”
(Matthias) |
12.2.3. | What is the legal state of the
JLayer mp3 decoder? |
| There was much discussion on the mailing list; see
the archive for details. As a short summary, see Eric's
Mail and Florian's
Mail. If you want to avoid legal issues completly,
it is recommended to use Ogg Vorbis instead of mp3. (Matthias) |
12.2.4. | What are the differences
between the JLayer mp3 decoder plug-in and the Sun mp3
decoder plug-in? |
| The Sun decoder is twice as fast as the JLayer decoder, though it is written in pure Java, too. The Sun decoder only supports MPEG 1 audio layer III files, while the JLayer decoder supports MPEG 1 and MPEG 2, audio layers I - III.
See also What is the legal state of the
JLayer mp3 decoder? (Matthias) |
12.2.5. | How can I encode mp3 files? |
| Java is free; this collides with the (enforced)
licences for mp3 encoders. I have studied very carefully
the mp3 licencing model and also asked at Fraunhofer
(inventors of mp3) for additional information: it won't be
possible to deliver a free mp3 encoder legally. (If anyone
knows a "hole", please let me know. (not the
available source code - this is not appropriate: the
encoders available as source code - most of them based on
the ISO reference implementation - create bad quality
mp3's and the licence doesn't allow the use of such
encoders!)) The Tritonus team is
working on an interface to the open source encoder LAME. That way,
people who do not fear licence problems can download LAME
as a separate package and link it to Java Sound. See also
Q: 7
(Florian) |
12.2.6. | Is there a mp3 encoder implemented in pure
java? |
| No. At least none that is available to the
public. (Matthias) |
12.2.7. | Which input formats
can I use for the mp3 encoder? |
| The Tritonus mp3 encoder supports the following
input formats: 16 bit signed PCM; mono or stereo; big or
little endian; 8, 11.025, 12, 16, 22.05, 24, 32, 44.1 or
48 kHz sample rate. (Matthias) |
12.2.8. | Is mp3 encoding possible on Mac OS? |
| LAME and its Tritonus plug-in are reported to work
on Mac OS X, but not on Mac OS 9. For details, see Steven's
mail. (Matthias) |
12.2.9. | Why do I get an
UnsupportedAudioFileException when
trying to play a mp3 file? |
| First, check your installation as described on the
bottom of the Java Sound Plugins
page. If your installation is correct, but the
file still doesn't play, there are two common reasons:
id3v2 tags or a variable bit rate (VBR) header. Both are
prepended to an ordinary mp3 file. And the
AudioFileReader for mp3 can't
detect this situation. The Tritonus team does not plan to
fix this behaviour. However, JavaZOOM
provides a modified version of the
AudioFileReader. (Matthias) |
12.2.10. | How can I get the length of
an mp3 stream? |
| Currently, you can use the following hack with the
JLayer decoder:
import java.io.*;
import javazoom.jl.decoder.*;
import javax.sound.sampled.*;
public class TestMP3Duration
{
public static void main(String args[])
{
try
{
File f = new File(args[0]);
Bitstream m_bitstream = new Bitstream(
new FileInputStream(f));
Header m_header = m_bitstream.readFrame();
int mediaLength = (int)f.length();
int nTotalMS = 0;
if (mediaLength != AudioSystem.NOT_SPECIFIED) {
nTotalMS = Math.round(m_header.total_ms(mediaLength));
}
System.out.println("Length in ms: " + nTotalMS);
} catch(Exception e) {
e.printStackTrace();
}
}
} It seems that the decoder released by Sun (see How can I play back mp3 files?) does not
support any means to obtain the length. In the future (once the JDK 1.5.0 is released) it
will be possible to get the length in a portable way using
AudioFileFormat properties. See
also How can I determine the length or
the duration of an audio file?
(Matthias) |
12.3. GSM 06.10 |
- 12.3.1. Is there support for GSM?
- 12.3.2. Why does the GSM codec refuse to encode from/decode
to the format I want?
- 12.3.3. How can I read a
.wav file with GSM data or store
GSM-encoded data in a .wav
file?
- 12.3.4. I want to convert to/from GSM using the Tritonus
plug-in. However, I do not work with files or
streams. Rather, I want to convert byte[]
arrays.
- 12.3.5. How can I decode GSM from frames of 260 bit?
- 12.3.6. How can I calculate
the duration of a GSM file?
- 12.3.7. Are there native
implementations of codecs that are compatible with the
framing format used by the Java Sound GSM codec?
| |
12.3.1. | Is there support for GSM? |
| Yes, you can download a service provider plug-in for
GSM
06.10 from Java Sound
Plugins. Since this implementation is pure-java,
it can be used with any Java Sound implementation on any
platform. For examples of using the GSM plug-in, see
Encoding an audio file to GSM 06.10,
Decoding an encoded audio file
and Playing an encoded audio file. (Matthias) |
12.3.2. | Why does the GSM codec refuse to encode from/decode
to the format I want? |
| GSM 06.10 only works with 8 kHz sample rate. This is
a property of the format and cannot be changed. The whole
algorithm depends on this. Therefore, Tritonus' GSM codec supports only two formats at the PCM side:
AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 8000.0F, 16,
1, 2, 8000.0F, false) and
AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 8000.0F, 16,
1, 2, 8000.0F, true). If you want to use other source or
target formats, you have to do the conversion in two
steps: For encoding, first convert the audio data from your
source format to one the encoder accepts. Then you can do
the encoding. For decoding, decode to one of the formats the
decoder supports. In a second step, convert to your
desired target format. If your data has a sample rate different from 8 kHz,
you have to do a sample rate conversion. See How can I do sample rate
conversion? Note that GSM actually uses only the 13 most
significant bits of the 16 bit PCM samples. The 3 least
significant bits are ignored while encoding, and set to
zero while decoding. (Matthias) |
12.3.3. | How can I read a
.wav file with GSM data or store
GSM-encoded data in a .wav
file? |
| This is not supported by Tritonus. The reason is
that Microsoft specified a fancy scrambling of bits for
GSM in .wav. We considered it too
much work to comply with such "standards". For details,
see the GSM
page of Jutta. See also Are there native
implementations of codecs that are compatible with the
framing format used by the Java Sound GSM codec? (Matthias) |
12.3.4. | I want to convert to/from GSM using the Tritonus
plug-in. However, I do not work with files or
streams. Rather, I want to convert byte[]
arrays. |
| You have two choices:
- Convert the byte array to and from AudioInputStreams using ByteArrayInputStreams and ByteArrayOutputStreams. For details, see the questions How can I read an audio file and store the audio data in a byte array? and How can I write audio data from a byte array to an audio file?. If you want to encode data captured with a TargetDataLine, use the AudioInputStream constructor with a TargetDataLine parameter. This is the recommended way, because it is clean and does not directly access low-level APIs. It is highly likely to be portable between different Java Sound implementations (assuming that one day there will be an alternate GSM codec implementation).
- Use the low-level API of the GSM decoder and encoder. This is tricky and is not officially supported by the Tritonus team. The source code is your friend; besides that, you may get support from the original authors. To say it short: it is not recommended. If you really want to do it, you can take the implementation of GSMFormatConversionProvider.java in Tritonus as an example.
(Matthias) |
12.3.5. | How can I decode GSM from frames of 260 bit? |
| GSM frames indeed have a length of 260 bits, which
is equal to 32.5 bytes. To store such frames in files, the
common technique is to pad each frame with 4 zero bits at
the end, so that a frame fits into 33 bytes. This is the
format used by the GSM codec of Tritonus. So if you do the
same padding, the data can be decoded by this
codec. (Matthias) |
12.3.6. | How can I calculate
the duration of a GSM file? |
| If the length of the encoded data is known,
calculating the total duration is quite easy: A GSM frame
with 33 bytes contains information about 160 samples at a
sample rate of 8 kHz; each frame represents 20
milliseconds. So the formula is:
long length_of_data = ...; // in bytes
long number_of_frames = length_of_data / 33;
long duration = number_of_frames * 20; // in milliseconds See also How can I determine the length or
the duration of an audio file? (Matthias) |
12.3.7. | Are there native
implementations of codecs that are compatible with the
framing format used by the Java Sound GSM codec? |
| Yes, there are quite a number of programs listed on
GSM
Applications and GSM
for X. Note that Microsoft uses a different framing for
GSM. Therefore, Microsoft GSM codecs are incompatible.
See How can I read a
.wav file with GSM data or store
GSM-encoded data in a .wav
file?
(Matthias) |
12.4. A-law and μ-law |
- 12.4.1. What are A-law and μ-law?
- 12.4.2. How can I convert a PCM encoded byte[]
to a μ-law byte[]?
| |
12.4.1. | What are A-law and μ-law? |
| These are logarithmic codings of a sample value. The
values are stored in 8 bit, but the range of values is
roughly equal to 14 bit linear. So coding 16 bit data to
A-law and μ-law means half the storage size with about
15 per cent quality loss. See a mathematical
definition. (Matthias) |
12.4.2. | How can I convert a PCM encoded byte[]
to a μ-law byte[]? |
| When you are processing streams, read the
documentation of AudioSystem. There
are functions like
getAudioInputStream(AudioFormat,
AudioInputStream) that do the conversion for
you. In case you absolutely want to do the conversion
"by hand", look at how Tritonus is doing
it: have a look at the class TConversionTool.
(Florian) |
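A minimal sketch of the stream-based conversion, assuming that the PCM data is 16 bit signed, mono, 8 kHz (a format the μ-law converter should accept); exception handling is omitted:
import java.io.ByteArrayInputStream;
import javax.sound.sampled.*;

byte[] pcmData = ...;
AudioFormat pcmFormat = new AudioFormat(
    AudioFormat.Encoding.PCM_SIGNED, 8000.0F, 16, 1, 2, 8000.0F, false);
AudioInputStream pcmStream = new AudioInputStream(
    new ByteArrayInputStream(pcmData), pcmFormat,
    pcmData.length / pcmFormat.getFrameSize());
AudioInputStream ulawStream = AudioSystem.getAudioInputStream(
    AudioFormat.Encoding.ULAW, pcmStream);
// read the μ-law bytes from ulawStream, for instance into a
// ByteArrayOutputStream, to obtain the converted byte[]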
12.5. Speex |
- 12.5.1. What is Speex?
- 12.5.2. Is there support for Speex?
- 12.5.3. How do I use JSpeex?
- 12.5.4. How can I get the duration of a Speex file?
| |
12.5.1. | What is Speex? |
| Speex is an audio compression format designed for
speech. It is open source and patent free. For more
information see the Speex
Homepage. (Matthias) |
12.5.2. | Is there support for Speex? |
| Yes, have a look at the JSpeex
project. (Matthias) |
12.5.3. | How do I use JSpeex? |
| See Mark's
mail. (Matthias) |
12.5.4. | How can I get the duration of a Speex file? |
| Currently, there seems to be no way to find out the
duration. The typical problem with getting the duration of
compressed audio data is that there is no linear relation
between the length of the encoded data and the length of
the unencoded data. So typically, length information is available if either:
- There is a header that contains this information.
- The implementor of the decoder decided to read or skip through the whole stream to gather this information. Whether this is possible depends on the encoded format and on the stream: it requires resetting, so it is only possible if the stream is seekable, can be reopened from the beginning, or the whole content is cached in memory. Implementors typically decide against caching, since it may consume several megabytes of memory.
I don't know details about the Speex format, so I
don't know if there is a possibility to make length
information available. See also How can I determine the length or
the duration of an audio file?
(Matthias) |
12.6. Miscellaneous |
- 12.6.1. Is there support for ADPCM (a.k.a. G723) in Java
Sound?
- 12.6.2. Is there support for WMA and ASF in Java
Sound?
- 12.6.3. How can I convert
between two encoded formats directly (e.g. from mp3 to
A-law)?
- 12.6.4. What compression schemas
can I use?
- 12.6.5. How can I get
Encoding instances for GSM and mp3
with JDKs older than 1.5.0?
- 12.6.6. Is there support for RealAudio /
RealMedia (.ra /
.rm files)?
- 12.6.7. How can I get support for
a new encoding?
| |
12.6.1. | Is there support for ADPCM (a.k.a. G723) in Java
Sound? |
| Currently not. There is an alpha version of a codec
for IMA ADPCM in Tritonus. However, the file readers and
writers haven't been adapted to handle this format, so the
codec is of little use. Doing this is not really difficult; volunteers are appreciated. Developing support for MS ADPCM shouldn't be too difficult, either. Also note that JMF
can handle IMA ADPCM. (Matthias) |
12.6.2. | Is there support for WMA and ASF in Java
Sound? |
| WMA and ASF are not supported by Java Sound or any
known plug-in to it. Of course there are native programs
that can do the conversion. (Matthias) |
12.6.3. | How can I convert
between two encoded formats directly (e.g. from mp3 to
A-law)? |
| You have to do this in 2 to 4 steps:
1. Convert it to PCM, 16 bit, any endianess, sample rate and channels as in the original input file.
2. If necessary, convert the sample rate and the number of channels (as separate steps) to the values you want in the target file.
3. Convert that PCM stream to the target format, same sample rate and channels.
See also the example Converting audio files to different encodings, sample size, channels, sample rate
(Matthias) |
12.6.4. | What compression schemas
can I use? |
| The table below gives you an overview: See also What compression schema
should I use to transfer audio data over a network? Also note that JMF
has quite a few more codecs
than Java Sound. (Matthias) |
12.6.5. | How can I get
Encoding instances for GSM and mp3
with JDKs older than 1.5.0? |
| Since 1.5.0, the way to obtain
Encoding instances for non-standard
encodings like GSM 06.10, Ogg Vorbis and mp3 is to use the constructor
Encoding(String name) (See, for
instance, Encoding an audio file to GSM 06.10). In JDKs older than 1.5.0, this
constructor is protected. So calling it directly is not
possible. The old workaround for this problem was a
special class
org.tritonus.share.sampled.Encodings
introduced by Tritonus. It can be used to retrieve
Encoding instances. See this older
version of GSMEncoder for an example of how to do this. (Matthias) |
12.6.6. | Is there support for RealAudio /
RealMedia (.ra /
.rm files)? |
| There isn't, and it doesn't look like there will be
in the near future. RealAudio is a proprietary format;
there is no specification available to the public. Due to
that, it's hard to implement support for it. Currently,
the only way to do this seems to be to use native libraries
provided by RealNetworks. If you want to change this
situation, bug RealNetworks to publish specs (politely,
please). (Matthias) |
12.6.7. | How can I get support for
a new encoding? |
| If you need support for an encoding that is
currently not supported, you can code it yourself (or pay
somebody to do so). Java Sound has an extension mechanism
called "service provider interface" (SPI). For supporting
a new encoding, you need to write a plug-in that extends
the abstract class
FormatConversionProvider. Typically,
it is also necessary to write new audio file readers
(AudioFileReader) and
audio file writers
(AudioFileWriter) or extend existing
ones. A skeleton is sketched below. See also Q & A 2, “Service Provider Interface (SPI)” (Matthias) |
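As a rough sketch (not a complete codec), a FormatConversionProvider subclass for a hypothetical encoding "MYCODEC" could look like the following; the class and encoding names are invented and the actual conversion logic is left out. A real plug-in additionally needs a provider file META-INF/services/javax.sound.sampled.spi.FormatConversionProvider containing the fully qualified class name.

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.spi.FormatConversionProvider;

public class MyCodecFormatConversionProvider extends FormatConversionProvider
{
    private static final AudioFormat.Encoding MYCODEC =
        new AudioFormat.Encoding("MYCODEC");

    // Encodings this provider accepts as input.
    public AudioFormat.Encoding[] getSourceEncodings()
    {
        return new AudioFormat.Encoding[] { AudioFormat.Encoding.PCM_SIGNED };
    }

    // Encodings this provider can convert to.
    public AudioFormat.Encoding[] getTargetEncodings()
    {
        return new AudioFormat.Encoding[] { MYCODEC };
    }

    public AudioFormat.Encoding[] getTargetEncodings(AudioFormat sourceFormat)
    {
        if (AudioFormat.Encoding.PCM_SIGNED.equals(sourceFormat.getEncoding()))
        {
            return new AudioFormat.Encoding[] { MYCODEC };
        }
        return new AudioFormat.Encoding[0];
    }

    public AudioFormat[] getTargetFormats(AudioFormat.Encoding targetEncoding,
                                          AudioFormat sourceFormat)
    {
        // Report the concrete target formats this codec can produce for
        // the given source format (left out in this sketch).
        return new AudioFormat[0];
    }

    public AudioInputStream getAudioInputStream(AudioFormat.Encoding targetEncoding,
                                                AudioInputStream sourceStream)
    {
        // Here the real work happens: return a stream that encodes the
        // data of sourceStream on the fly.
        throw new IllegalArgumentException("conversion not implemented in this sketch");
    }

    public AudioInputStream getAudioInputStream(AudioFormat targetFormat,
                                                AudioInputStream sourceStream)
    {
        throw new IllegalArgumentException("conversion not implemented in this sketch");
    }
}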
13. Audio data transfer over
networks |
- 13.1. How can I do streaming of audio data?
- 13.2. Why do I get distorted sound in my streaming
application if it is used on the internet, but works on a
LAN?
- 13.3. How can I upload recorded audio data to a
server?
- 13.4. What compression schema
should I use to transfer audio data over a network?
| |
13.1. | How can I do streaming of audio data? |
| There is no special support for streaming protocols in
the Java Sound API. Options include rolling your own protocol on top of
sockets (java.net.Socket) or using the Real-time Transport Protocol (RTP)
implementation of the Java Media Framework (JMF); see the following
questions in this section. (Matthias) |
13.2. | Why do I get distorted sound in my streaming
application if it is used on the internet, but works on a
LAN? |
| With a naive streaming approach (simply writing to and
reading from sockets), you need a guaranteed network
bandwidth and minimum network latency. Though this is not
really guaranteed on an ethernet, the bandwidth is typically
sufficient for smooth operation. On the internet, however,
bandwidth is much more limited and latency much higher than
on an ethernet. So packets arrive late, which leads to
clicks in the sound. To compensate for these effects, special
streaming protocols are needed. The most common of these is
the Real-time Transport Protocol (RTP). (Matthias) |
13.3. | How can I upload recorded audio data to a
server? |
| There are several ways to do this:
- One possibility is to use sockets (classes
java.net.Socket and
java.net.ServerSocket). See
this
mail for more details, including an example
server program.
- Another possibility is to use HTTP requests. Both
POST and PUT requests can be used for uploading; a sketch follows below.
- A more sophisticated approach is to use the
Real-time Transport Protocol (RTP) implementation
included in the Java
Media Framework (JMF).
You may also want to have a look at Java Sound Resources: Applications: Answering
Machine. (Matthias) |
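As an illustration of the HTTP approach mentioned above, here is a minimal sketch that POSTs a recorded .wav file with java.net.HttpURLConnection; the URL and the server-side handling of the request are assumptions, and the class name is made up.

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class UploadRecording
{
    public static void main(String[] args) throws Exception
    {
        File file = new File(args[0]);
        URL url = new URL("http://www.example.com/upload");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod("POST");
        connection.setDoOutput(true);
        connection.setRequestProperty("Content-Type", "audio/x-wav");

        // Copy the file to the request body.
        InputStream in = new FileInputStream(file);
        OutputStream out = connection.getOutputStream();
        byte[] buffer = new byte[8192];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1)
        {
            out.write(buffer, 0, bytesRead);
        }
        out.close();
        in.close();

        // The server's answer tells us whether the upload succeeded.
        System.out.println("Server responded: " + connection.getResponseCode());
    }
}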
13.4. | What compression schema
should I use to transfer audio data over a network? |
| It depends on your requirements. There is a general
trade-off between bandwidth, processing power and
quality. Better quality needs either more bandwidth or more
processing power. Here is a short overview of some common
compression schemas: For speech, GSM 06.10 is a common choice. It is widely used
in internet phone and voice chat applications. For
high-quality music, use mp3 or (better) Ogg Vorbis. See also What compression schemas
can I use? (Matthias) |
14. Ports |
- 14.1. How do I use the interface
Port?
- 14.2. Why is it not possible to retrieve
Port instances?
- 14.3. Why is it not possible to
retrieve Control instances from
Port lines?
- 14.4. What does opening and closing
mean for Port lines?
- 14.5. Why is it not possible to read data from a microphone
Port line?
- 14.6. Can I use Java Sound's Port
interface to control volume and tone of sound played with an
application using JMF?
- 14.7. Why are there no Port instances
of certain predefined types (like
Port.Info.MICROPHONE or
Port.Info.COMPACT_DISC) on
Linux?
| |
14.1. | How do I use the interface
Port? |
| Have a look at the chapter "Processing
Audio with Controls" in the Java
Sound Programmer's Guide. You can also look
at how the applications jsinfo and systemmixer deal
with ports. (Matthias) |
14.2. | Why is it not possible to retrieve
Port instances? |
| Up to version 1.4.1, there was no
Port implementation in the Sun
JDK. In 1.4.2, an implementation was added for Windows. In
1.5.0, an implementation was added for Solaris and
Linux. (Matthias) |
14.3. | Why is it not possible to
retrieve Control instances from
Port lines? |
| Make sure you are opening the
Port line before retrieving
controls. For instance:
Port port = ...;
port.open();
Control[] controls = port.getControls(); (Matthias) |
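A slightly more complete sketch, assuming the system provides a speaker port (which should be checked with AudioSystem.isLineSupported() first); the class name is made up for this example.

import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Control;
import javax.sound.sampled.Port;

public class PortControls
{
    public static void main(String[] args) throws Exception
    {
        if (AudioSystem.isLineSupported(Port.Info.SPEAKER))
        {
            Port port = (Port) AudioSystem.getLine(Port.Info.SPEAKER);
            // Controls can only be retrieved from an open port.
            port.open();
            Control[] controls = port.getControls();
            for (int i = 0; i < controls.length; i++)
            {
                System.out.println(controls[i]);
            }
            port.close();
        }
    }
}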
14.4. | What does opening and closing
mean for Port lines? |
| Typically, the implementation of ports needs to query
the soundcard's mixer for its properties and build internal
data structures for Control
instances. Since this is often an expensive operation, it is
only done if the port is really used, i.e. when it is
opened. So you need to open the Port
to retrieve and use Control
instances. After closing the port, the association between
the Control instances and the native
resources of the soundcard is invalidated, so that changes
to the controls have no effect. See also Why is it not possible to
retrieve Control instances from
Port lines? (Matthias) |
14.5. | Why is it not possible to read data from a microphone
Port line? |
| This is due to the design of the hardware. Soundcards
usually have only one Analog-Digital-Converter (ADC), but
multiple input lines. You can obtain a
TargetDataLine to get the digital
data provided by the ADC. On the other hand,
Port lines represent the analog
inputs to the ADC and the analog outputs from the
Digital-Analog-Converter (DAC). By using the controls of a
Port line, you can influence the
signal level on that line that reaches the ADC, or influence
the volume on the output line that leads to your
speakers. In other words, the Port
lines are the abstraction of the hardware mixer on the
soundcard. While one could question why
Port and
DataLine have a common base
interface, it should be clear that you can't read digital
data from an object representing an analog
line. See also How can I detect which Port
Mixer belongs to which soundcard? (Matthias) |
14.6. | Can I use Java Sound's Port
interface to control volume and tone of sound played with an
application using JMF? |
| Yes, this is possible. Port
lines control the hardware mixer of the soundcard, so using
them affects everything played, even sound from native
applications. (Matthias) |
14.7. | Why are there no Port instances
of certain predefined types (like
Port.Info.MICROPHONE or
Port.Info.COMPACT_DISC) on
Linux? |
| Some operating systems or soundcard driver APIs do not
provide information on the type of the available mixer
channels. In these cases, a Java Sound implementation cannot
match mixer channels with pre-defined
Port types. In particular, this is the
case with ALSA, which is
used as the basis for the Port
implementation of the Sun JDK on Linux. To write portable programs, you should not rely on the
availability of pre-defined Port
types. If in doubt, obtain the list of available Ports and
let the user decide which one to use (see the sketch below). This is a good idea
anyway, since some users don't have a microphone connected
to the "mic in" channel of the soundcard, but via a preamp
connected to the "line in" channel. (Matthias) |
15. Miscellaneous |
- 15.1. Why is playback of audio data
with Java Sound significantly quieter than with a similar
player on the native OS?
- 15.2. Can I use multi-channel
sound?
- 15.3. Which multi-channel
soundcards can I use with Java Sound?
- 15.4. Can I use the rear channels of a
four-channel soundcard (like Soundblaster Live! and
Soundblaster Audigy)?
- 15.5. How can I read audio data from a CD?
- 15.6. Why is there no sound at all
when running my program on Linux, while on Windows it works
as expected?
- 15.7. How can I display audio data
as a waveform?
- 15.8. What is the difference between
AudioInputStream and
TargetDataLine?
- 15.9. Does Java Sound support 24 bit/96 kHz audio?
| |
15.1. | Why is playback of audio data
with Java Sound significantly quieter than with a similar
player on the native OS? |
| There was the issue that Sun's implementation of Java
Sound (at least up to version 0.99) lowers the output level
in order to avoid clipping when several lines are
mixed. Probably this "feature" is the
problem. I find this "feature" quite doubtful. A Java
Sound programmer should use GainControls attached to individual
lines to lower the volume, if desired. Many applications
won't profit from this "feature": e.g. they only
play one line at a time, or the mixed sounds don't
clip. This is not unusual, as even
"normalized" sounds usually leave enough
headroom - peaks must coincide to produce clipping. The
case that the soft synth AND audio are playing
simultaneously can be expected to be handled in "quality"
programs, which provide a way to lower the gain of the lines
- or decrease the gain automatically. As Java Sound is supposed to be a low-level engine,
such an approach would not be suitable. The problem with this
feature is a general decrease of the signal-to-noise ratio of
all Java Sound programs. Automatic lowering of the volume
prevents the use in "serious" or professional
environments... (Florian) Note that the above is true for the "Java Sound Audio
Engine". It does not apply to the "Direct Audio Device"
mixers. See also What are all these mixers?
(Matthias) |
15.2. | Can I use multi-channel
sound? |
| With the "Direct Audio Device" mixers (see What are all these mixers?) it is possible to use
multi-channel cards. On Windows, the device drivers of multi-channel cards
usually split the hardware facilities into stereo channels,
each provided by a separate logical device. On Linux with
ALSA,
device drivers of multi-channel cards typically represent
the hardware by one device with all channels together
(interleaved). However, it is possible to split the channels
using the ALSA configuration files. Without the "Direct Audio Device" mixers, it is
possible to record from, but not play back to, logically
split devices. For playback, the first one is used. See
Why can I record from
different soundcards, but not play back to them? See also Is it possible to read and
write multichannel audio files? (Matthias) |
15.3. | Which multi-channel
soundcards can I use with Java Sound? |
| Soundcards known to work well with Java Sound (JDK
1.5.0) on Windows as well as on Linux are the M-Audio Delta
44 and Delta 66. Another card working on Windows is the ESI
Waveterminal 192X. However, it is reported to have stability
problems with Java Sound. (Matthias) |
15.4. | Can I use the rear channels of a
four-channel soundcard (like Soundblaster Live! and
Soundblaster Audigy)? |
| Yes, if access to these channels is provided by the
soundcard driver in a useful way. For Windows, there is no
obvious solution; details are under investigation. For
Linux, it should be possible. See also Can I use multi-channel
sound?
(Matthias) |
15.5. | How can I read audio data from a CD? |
| On Linux, you can do this with Tritonus' CDDA
extension. See Tritonus
Plug-ins, Java Sound Resources: Examples: CD Digital Audio Extraction and Java Sound Resources: Applications: Ripper. Currently, there is no implementation doing the same
for Windows or other operating systems. Other possible
solutions include: On some Windows systems as well as on some Linux
systems, reading audio CDs is integrated into the
operating system. Typically, the CD is mapped into the
file system as another disk with one
.wav file per track. In this case,
you can just open and read one of these files with Java
Sound as you would do with any other audio file. Use an external tool to extract the digital data
from the CD to a .wav file. Then
process this file with Java Sound. It's possible to keep
this mechanism "under the hood": invoke the capturing
utility from inside your java app (System.exec() or
simular) and pass it the name of a temporary file it has
to write to. After the utility has completed, read this
file. On most systems, you can select the CD as a
recording source in the system mixer (this requires your
CD drive to be connected to your soundcard with an
analog cable). Then do a audio recording with Java
Sound. Of course, this does not result in a digital copy
of the data on CD.
(Matthias) |
15.6. | Why is there no sound at all
when running my program on Linux, while on Windows it works
as expected? |
| A common pitfall on Linux is mixing daemons like esd
and artsd. They open the audio device exclusively. So if
they are running when the Java VM is started, the VM is
denied access to the audio device. There are three possible
solutions:
- Use a soundcard that does mixing in
hardware. In this case, the Java VM and the mixing
daemon can coexist, because opening the audio device is no
longer exclusive; the audio streams of the VM and the
daemon are mixed in hardware. Using ALSA's dmix
plug-in is currently no solution, since the "Direct Audio Device" mixer implementation opens ALSA PCM devices in "hw" mode and therefore misses devices emulated by dmix.
- Kill or disable the mixing daemon while you are using
Java Sound programs.
- As a "light-weight" solution, you can
install ALSA including its OSS emulation and configure your system so that the sound daemon uses ALSA directly while the Java VM uses the OSS emulation, or the other way round. This way the JVM and the sound daemon won't interfere. Note that the "Java
Sound Audio Engine" uses the OSS API while the "Direct Audio Device" mixer uses the ALSA API.
See also Q: 3.4 and How can I enable mixing with the
"Direct Audio Device" mixers on Linux?. (Matthias) |
15.7. | How can I display audio data
as a waveform? |
| Well, you have to extract the sample values from the
byte stream (see How are samples organized in
a byte array/stream? and How can I reconstruct
sample values from a byte array?) and then draw some
lines... There are some examples of classes that implement such
a thing; a minimal sketch of extracting the values follows below. See also How can I calculate the power
of a signal? (Matthias) |
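Here is a minimal sketch of the data extraction part, assuming a 16-bit signed, little-endian PCM file whose name is given on the command line; it prints one minimum/maximum pair per block of frames, which is the typical input for drawing one vertical line per pixel column. The block size and class name are arbitrary choices for this example.

import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public class WaveformData
{
    public static void main(String[] args) throws Exception
    {
        AudioInputStream stream = AudioSystem.getAudioInputStream(new File(args[0]));
        AudioFormat format = stream.getFormat();
        int frameSize = format.getFrameSize();
        // One min/max pair per block of 1000 frames (arbitrary choice).
        byte[] buffer = new byte[1000 * frameSize];
        int bytesRead;
        while ((bytesRead = stream.read(buffer)) != -1)
        {
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            // Reconstruct the 16-bit samples of the first channel
            // (little-endian: low byte first).
            for (int i = 0; i + 1 < bytesRead; i += frameSize)
            {
                int sample = (buffer[i + 1] << 8) | (buffer[i] & 0xFF);
                if (sample < min) min = sample;
                if (sample > max) max = sample;
            }
            // Drawing code would paint a vertical line from min to max here.
            System.out.println(min + " " + max);
        }
    }
}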
15.8. | What is the difference between
AudioInputStream and
TargetDataLine? |
| InputStream represents a stream
of bytes that may be read from a file, URL or other data
source. AudioInputStream extends
InputStream with properties that are
needed for interpreting audio data: the data format (an
AudioFormat object) and the length of
the stream in frames. An
AudioInputStream instance can be
wrapped around any InputStream object
to provide this information. TargetDataLine is much more
specific: it represents an audio line to which data is
output from an audio device (represented by a
Mixer object). Data recorded from an
audio capture device is delivered to a
TargetDataLine, from which it can be
read by the application. So lines of various types
(TargetDataLine,
SourceDataLine,
Clip, Port)
are not arbitrary software objects that can be created and
connected to a mixer or audio device. Rather, they are part
of the mixer or device itself. The difference between
AudioInputStream and
TargetDataLine is mirrored by the
difference between AudioOutputStream
and SourceDataLine. While
AudioOutputStream (a concept
introduced by Tritonus) is a
general concept of something you can write audio data to,
SourceDataLine is specific to a
Mixer instance. For programming, there are two subtle
differences:
- The handling of read lengths that are not
an integral multiple of the frame size:
AudioInputStream silently rounds
down the length (in bytes) to the nearest integer number
of frames. TargetDataLine throws an
IllegalArgumentException.
- The behaviour when the end of data is
reached: AudioInputStream.read()
returns -1 if there is no more
data. TargetDataLine.read() returns
0 if the line is closed (otherwise,
read() is guaranteed to block until
the requested amount of data is
available).
(Matthias) |
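Note that the two worlds can be combined: an AudioInputStream can be wrapped around a TargetDataLine, so that captured data can be handled with the stream-oriented API (for instance passed to AudioSystem.write()). A minimal sketch, with format values and the class name chosen arbitrarily for this example:

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;

public class LineAsStream
{
    public static void main(String[] args) throws Exception
    {
        AudioFormat format = new AudioFormat(44100.0F, 16, 2, true, false);
        DataLine.Info info = new DataLine.Info(TargetDataLine.class, format);
        TargetDataLine line = (TargetDataLine) AudioSystem.getLine(info);
        line.open(format);
        line.start();

        // The stream reports AudioSystem.NOT_SPECIFIED as its frame length.
        AudioInputStream stream = new AudioInputStream(line);
        byte[] buffer = new byte[format.getFrameSize() * 1024];
        int bytesRead = stream.read(buffer);
        System.out.println("read " + bytesRead + " bytes");

        line.stop();
        line.close();
    }
}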
15.9. | Does Java Sound support 24 bit/96 kHz audio? |
| There is nothing in the API that prevents dealing with
24 bit/96 kHz. The implementations of the API are a
different story. The "Java Sound Audio Engine" does not
support it. Therefore, there is no support for it in the
JDK up to 1.4.2. With the "Direct Audio Device" mixers in
the JDK 1.5.0 (Linux: 1.4.2), it should be
possible. See also What are all these mixers? (Matthias) |