When we want to separate two signals that have been mixed together, one interesting method is Independent Component Analysis (ICA). I am not knowledgeable enough to explain how it works in detail, but there are plenty of explanations on the web; take a look at the references listed at the end for further details. The purpose of this article is simply to record the code I recently used for an ICA job so that I will not forget how to use it in the future.
To perform ICA in Python, we first need to install an important package, scikit-learn. Let's do that first.
sudo pip install --upgrade pip
pip install -U scikit-learn
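The script in this article also imports numpy, scipy and matplotlib, so it can be worth checking that everything is importable before going further. The following is only a minimal sketch of such a check; the helper name `check_packages` and the `REQUIRED` list are my own choices for illustration, not part of any library.

```python
import importlib

# Packages imported by the script in this article
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn"]


def check_packages(names):
    """Map each package name to its version string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Not every module exposes __version__, so fall back to "unknown"
            found[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in check_packages(REQUIRED).items():
        print(name, version if version is not None else "MISSING")
```

Any package reported as MISSING can be installed with pip in the same way as scikit-learn above.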
Now it's time to write the Python script. The following script takes two WAV files as input, each containing a different mixture of the same two signals. It then generates two new WAV files that contain the separated signals.
"""
=====================================
Blind source separation using FastICA
=====================================
An example of estimating sources from noisy data.
:ref:`ICA` is used to estimate sources given noisy measurements.
Imagine 3 instruments playing simultaneously and 3 microphones
recording the mixed signals. ICA is used to recover the sources
ie. what is played by each instrument. Importantly, PCA fails
at recovering our `instruments` since the related signals reflect
non-Gaussian processes.
"""
print(__doc__)
import os
import wave
import pylab
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
from scipy.io import wavfile
from sklearn.decomposition import FastICA, PCA
###############################################################################
# font settings for the saved figure
# (text.usetex requires a working LaTeX installation)
matplotlib.rcParams['ps.useafm'] = True
matplotlib.rcParams['pdf.use14corefonts'] = True
matplotlib.rcParams['text.usetex'] = True
# read data from wav files
sample_rate1, samples1 = wavfile.read('100000010mix1.wav')
sample_rate2, samples2 = wavfile.read('100000010mix2.wav')
print('sample_rate1', sample_rate1)
print('sample_rate2', sample_rate2)
# stack the two recordings as columns: one row per sample, one column per mixture
# (this assumes mono files of equal length)
S = np.c_[samples1, samples2]
ica = FastICA(n_components=2)
S_ = ica.fit_transform(S) # Reconstruct signals
print('original signal=', S)
print('recovered signal=', S_)
print('extracted signal1', S_[:,0])
print('extracted signal2', S_[:,1])
# write data to wav files
scaled1 = np.int16(S_[:,0]/np.max(np.abs(S_[:,0])) * 32767)
wavfile.write('extracted-signal-1.wav', sample_rate1, scaled1)
scaled2 = np.int16(S_[:,1]/np.max(np.abs(S_[:,1])) * 32767)
wavfile.write('extracted-signal-2.wav', sample_rate2, scaled2)
###############################################################################
# Plot results
# time axis in seconds, so that the 'Time (s)' labels match the plotted data
t = np.arange(len(samples1)) / float(sample_rate1)
pylab.figure(num=None, figsize=(10, 10))
pylab.subplot(411)
pylab.title('(received signal 1)')
pylab.xlabel('Time (s)')
pylab.ylabel('Sound amplitude')
pylab.plot(t, samples1)
pylab.subplot(412)
pylab.title('(received signal 2)')
pylab.xlabel('Time (s)')
pylab.ylabel('Sound amplitude')
pylab.plot(t, samples2)
pylab.subplot(413)
pylab.title('(extracted signal 1)')
pylab.xlabel('Time (s)')
pylab.ylabel('Sound amplitude')
pylab.plot(t, S_[:,0])
pylab.subplot(414)
pylab.title('(extracted signal 2)')
pylab.xlabel('Time (s)')
pylab.ylabel('Sound amplitude')
pylab.plot(t, S_[:,1])
pylab.subplots_adjust(hspace=.5)
pylab.savefig('extracted-data.pdf')
pylab.show()
Run this Python script with the two WAV files in the same directory, and you will get the separated signals as WAV files, along with waveform plots of all four signals.
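If you do not have mixed recordings at hand, one way to try the approach is to build the mixtures yourself from two known sources. The sketch below does that and then checks how well FastICA recovers them. Everything specific in it is an assumption made for illustration: the 440 Hz sine and 97 Hz sawtooth sources, the mixing matrix `A`, the helper names, and the file names `test-mix1.wav` and `test-mix2.wav`. ICA recovers the sources only up to order, sign and scale, which is why the comparison uses the absolute correlation and takes the best match for each source.

```python
import numpy as np
from scipy.io import wavfile
from sklearn.decomposition import FastICA


def make_test_mixtures(sample_rate=8000, duration=2.0):
    """Build two synthetic sources and two different mixtures of them."""
    t = np.arange(int(sample_rate * duration)) / float(sample_rate)
    s1 = np.sin(2 * np.pi * 440 * t)   # sine tone at 440 Hz
    s2 = 2 * ((97 * t) % 1.0) - 1      # sawtooth at 97 Hz, values in [-1, 1)
    sources = np.c_[s1, s2]
    # Hypothetical mixing matrix: each "microphone" hears both sources
    # with different weights.
    A = np.array([[1.0, 0.5],
                  [0.4, 1.0]])
    mixed = sources.dot(A.T)
    return sample_rate, sources, mixed


def to_int16(x):
    """Rescale a signal to the full 16-bit range, as the script above does."""
    return np.int16(x / np.max(np.abs(x)) * 32767)


def separate(mixed):
    """Estimate two independent components from a (samples, 2) array."""
    ica = FastICA(n_components=2, random_state=0)
    return ica.fit_transform(mixed)


def match_correlations(sources, recovered):
    """For each true source, the best absolute correlation with any component."""
    n = sources.shape[1]
    # corrcoef stacks both inputs as rows; the top-right block holds
    # the source-versus-component correlations.
    corr = np.corrcoef(sources.T, recovered.T)[:n, n:]
    return np.max(np.abs(corr), axis=1)


if __name__ == "__main__":
    rate, sources, mixed = make_test_mixtures()
    # Write the mixtures so they can also be fed to the script above
    wavfile.write('test-mix1.wav', rate, to_int16(mixed[:, 0]))
    wavfile.write('test-mix2.wav', rate, to_int16(mixed[:, 1]))
    recovered = separate(mixed)
    print('best match per source:', match_correlations(sources, recovered))
```

Values close to 1 mean that each source has a close counterpart among the extracted components. To run the main script on these files, change the two file names passed to `wavfile.read`.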