
Get the amplitude at a given time within a sound file?

I'm working on a project where I need to know the amplitude of sound coming in from a microphone on a computer.

I'm currently using Python with the Snack Sound Toolkit and I can record audio coming in from the microphone, but I need to know how loud that audio is. I could save the recording to a file and use another toolkit to read the amplitude at given points in time from the audio file, or try to get the amplitude while the audio is coming in (which could be more error-prone).

Are there any libraries or sample code that can help me out with this? I've been looking and so far the Snack Sound Toolkit seems to be my best hope, yet there doesn't seem to be a way to get direct access to amplitude.

Answer1:

Looking at the Snack Sound Toolkit examples, there seems to be a dBPowerSpectrum function.

From the reference:

dBPowerSpectrum ( )

Computes the log FFT power spectrum of the sound (at the sample number given in the start option) and returns a list of dB values. See the section item for a description of the rest of the options. Optionally an ending point can be given, using the end option. In this case the result is the average of consecutive FFTs in the specified range. Their default spacing is taken from the fftlength but this can be changed using the skip option, which tells how many points to move the FFT window each step. Options:

EDIT: I am assuming that by amplitude you mean how "loud" the sound appears to a human, not the time-domain voltage. The time-domain signal would average out to roughly 0 over the whole recording, since a sine wave integrates to 0 over a full period. For example, 10 * sin(t) is louder than 5 * sin(t), but both average to 0 over time. (You do not want to send non-AC voltages to a speaker anyway.)

To get how loud the sound is, you will need to determine the amplitude of each frequency component. This is done with a Fourier transform (FFT), which breaks the sound down into its frequency components. The dBPowerSpectrum function seems to give you a list of the magnitudes of each frequency (forgive me if this differs from the exact definition of a power spectrum). To get the total volume, you can sum the entire list; since the values are in dB, convert them back to linear power before summing. The result will be close, but it may still differ from perceived loudness, since the human ear has a frequency response of its own.
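As a rough illustration of summing a dB spectrum, here is a minimal sketch in plain Python. It does not call Snack; `total_level_db` is a hypothetical helper, and it assumes the list holds power values expressed as 10 * log10(power), which may not match exactly what dBPowerSpectrum returns.

```python
import math


def total_level_db(db_values):
    """Combine per-bin dB power values into one overall dB level.

    Assumes each value is 10 * log10(power) and the list is non-empty.
    """
    # Undo the logarithm: dB -> linear power.
    linear_power = [10 ** (db / 10.0) for db in db_values]
    # Linear powers add; convert the total back to dB.
    return 10.0 * math.log10(sum(linear_power))


# Two equally strong bins together are about 3 dB above either one alone.
combined = total_level_db([0.0, 0.0])
```

Adding the raw dB numbers directly would multiply the underlying powers rather than add them, which is why the conversion step is there.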

Answer2:

I disagree completely with this "answer" from CookieOfFortune.

Granted, the question is poorly phrased, but this answer makes things much more complex than necessary. I am assuming that by 'amplitude' you mean perceived loudness, since technically each sample in the (PCM) audio stream already represents the amplitude of the signal at a given time slice. To get a loudness representation, try a simple RMS calculation:

RMS
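A minimal sketch of that calculation, assuming the samples are already available as a list of numbers (for example, decoded PCM values). The names `rms` and `rms_at`, and the 50 ms default window, are illustrative choices and not part of any toolkit.

```python
import math


def rms(samples):
    """Root mean square of a sequence of numeric samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_at(samples, sample_rate, t_seconds, window_seconds=0.05):
    """RMS of a short window centered on t_seconds."""
    center = int(t_seconds * sample_rate)
    half = max(1, int(window_seconds * sample_rate / 2))
    start = max(0, center - half)
    end = min(len(samples), center + half)
    return rms(samples[start:end])


# Two one-second 440 Hz test tones at different amplitudes.
rate = 8000
loud = [10 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
quiet = [5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]

loud_level = rms_at(loud, rate, 0.5)    # close to 10 / sqrt(2)
quiet_level = rms_at(quiet, rate, 0.5)  # close to 5 / sqrt(2)
```

Both tones have a mean near zero, but their RMS values differ by a factor of two, which is the property that makes RMS usable as a loudness estimate.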


Answer3:

I'm not sure if this will help, but skimpygimpy provides facilities for parsing WAVE files into Python sequences and back. You could potentially use this to examine the waveform samples directly and do what you like. You will have to read some source, as these subcomponents are not documented.
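For comparison, the same idea of reading waveform samples directly can be sketched with Python's standard-library `wave` module instead of skimpygimpy. This is only a sketch under assumptions: it handles 16-bit PCM files only, keeps just the first channel, and the function names `read_wav_samples` and `amplitude_at` are hypothetical.

```python
import struct
import wave


def read_wav_samples(path):
    """Return (sample_rate, samples) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    # WAV PCM data is little-endian; "h" is a signed 16-bit integer.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Frames are interleaved by channel; keep the first channel only.
    return rate, list(values[::channels])


def amplitude_at(path, t_seconds):
    """Raw sample value (first channel) nearest to t_seconds."""
    rate, samples = read_wav_samples(path)
    if not samples:
        raise ValueError("file contains no audio frames")
    index = min(int(t_seconds * rate), len(samples) - 1)
    return samples[index]
```

A single sample is an instantaneous value and can be near zero even in a loud passage, so for loudness it is usually more useful to look at a short window of samples around the time of interest.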
