Sunday, August 28, 2016

DSP1 - Statistical Basics for Studying DSP


I have been trying to self-learn the field of Digital Signal Processing (DSP) since the last year. However, it seems I lost the motivation at some points and got busy with my other works too. The key information source for my self-learning approach was the book The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith. There are two great things about this book. First, it avoids mathematical jargons as much as possible when explaining concepts unlike many other DSP books. Second, it is freely available either to read on-line or to download for reading off-line.

I kept reading this book and when I come across something hard to understand, I skipped it even though that is not a good way to deal with hard to understand things. Somewhere after chapter 10, I got lost and gave up with reading it. But it seems it's hard to live on this planet without DSP and I decided to start it all over again. This time, my approach is different. I decided to take a more practical approach where I will write some codes and try the concepts presented in the book to see and understand them by doing. Let's see how it goes.

In this post, I'm going to write down some of the most basic stuff I came across while reading through the book's first two chapters.

  • What is a Signal ?

This is the first and one of the most important terms we come across while learning DSP. A Signal is a parameter that varies with another parameter in a system. Take the temperature of a room as an example which varies over time. We will be measuring the temperature in degrees of Celsius and the time in seconds. We can consider the temperature measurement as the parameter which varies with the time parameter. So, this temperature variation is a Signal. Take a look at the following graph which illustrate this signal.
Fig-1: Variation of temperature over time (mythical data)


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
import matplotlib.pyplot as plt
import numpy as np
from random import gauss

# generating dummy temperature data over time
sample = 60
x = np.arange(sample)

temp = list()
for i in range(0,sample):
    temp.append(gauss(27,0.1)) # gauss(mean, std)
y = np.array(temp)

# plotting the signal waveform
#plt.title('Waveform of temparature signal')
plt.xlabel('Time (s)')
plt.ylabel('Temperature (C)')
plt.ylim(25, 29)
plt.grid(True)
plt.plot(x, y)
plt.show()

It is really important to note that it is not necessary for the second parameter to be time always. The second parameter can be anything. For example, let's modify the above scenario slightly. We will place an array of 10 temperature sensors across the room with a 1 meter distances from each adjacent sensor and then take a temperature measurement from each of them at a single instance of time. So, this time we get a variation of temperatures in the room not over time but over space. We can take the second parameter as the distance from one corner of the room. Take a look at the following graph which illustrates this. Still, this is a signal.
Fig-2: Variation of temperature over space (mythical data)


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
import matplotlib.pyplot as plt
import numpy as np
from random import gauss

# generating dummy temperature data over space
sample = 10
x = np.arange(sample)

temp = list()
for i in range(0,sample):
    temp.append(gauss(27,0.2)) # gauss(mean, std)
y = np.array(temp)

# plotting the signal waveform
#plt.title('Waveform of temparature signal')
plt.xlabel('Distance (m)')
plt.ylabel('Temperature (C)')
plt.ylim(25, 29)
plt.grid(True)
plt.plot(x, y)
plt.show()


  • What is the Waveform of a signal ?

When we plot the variation of the first parameter against the second parameter of a signal, what we get is the Waveform of the signal. In the previous two scenarios, what we have drawn are Waveforms of the two signals. It shows the shape of our signal nicely. But, there are some other aspects of a signal to look at other than its Waveform. We will see them later.

  • Mean and Standard deviation of a signal

Now we know what signals are and how to represent their waveform in a graph. By looking at the source codes shown above, it is obvious that we can simply represent any signal in an array in computer memory. When we have a signal, we can look at their statistical properties and the most important two properties comes in to the stage are Mean and Standard deviation of the signal's data samples.

Mean (\( \mu \)) is simply the average value of the data points in a signal. Mathematically, it can be represented as follows where \( x \) is the array containing our signal.

$$\mu = \frac{1}{N} \sum_{i=0}^{N-1} x_i$$

Standard Deviation (\( \sigma \)) of a signal is a measure of how much the signal data points are deviated from the mean value of the signal. Mathematically, it can be represented as follows where \( x \) is the array containing our signal and \( \mu \) is the mean of the signal which we have calculated already.

$$\sigma^{2} = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^{2}$$

Following program reads a wave file and then calculate the mean and standard deviation of the signal. It is interesting to note that a wave file is in stereo format where we have two channels. Therefore, we select one channel to plot the waveform graph as shown in the code example.

Fig-3: Waveform of a sound from an Owl


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
from random import randint
from random import gauss
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

# read a signal from a Wav file
fs, data = wavfile.read("owl.wav")
# fs=sample rate, data=signal
# data has a value range from -2^15 to (2^15)-1
# because the data type is int16

# extract data from channel 0 (our wav file is stereo which has two chanels)
data = data[:, 0]
# to extract data from channel 1
#data = data[:, 1]

# showing the properties of the signal
print 'data:', data
print 'mean:' ,np.mean(data)
print 'std:', np.std(data)
print 'SNR:', np.mean(data)/np.std(data)

# showing the waveform of the signal in channel 0
#plt.title("Waveform (channel 0)")
plt.xlabel("Samples")
plt.ylabel("Amplitude")
plt.grid(True)
plt.xlim(0,140000)
plt.plot(data)
plt.show()


  • What is Signal-to-Noise Ratio (SNR) of a signal ?

When we are looking at a signal coming from some source, the mean value represents the actual value which we need to observe. Meanwhile, the standard deviation represents the errors or the noise that deviate our observed values. Therefore, we can divide the mean by standard deviation to get a ratio which represents how good our signal is compared to the background noise. We call this ratio Signal-to-Noise ratio or SNR.

  • Probability vs Statistics

I like the following explanation given in Stack Exchange by John D. Cook. It illustrates the difference between probability and statistics in a nice way.

Let's say we have a jar of red and green jellybeans. "A probabilist starts by knowing the proportion of each and asks the probability of drawing a red jelly bean. A statistician infers the proportion of red jelly beans by sampling from the jar."

  • Histogram of a signal

Now we are going to look at another way of illustrating a signal graphically. Let's think about our temperature measurement experiment again. We have a temperature sensor along with a clock so that we can take readings of room temperature at equally spaced time intervals. We can draw the waveform of this signal as follow.
Fig-4: Waveform of a the temperature signal over time


We can see from the waveform graph that same temperature reading can be measured at different discrete time instances. If we count the number of instances each temperature value has occurred in our signal, we can plot it in a histogram. The x-axis will be temperature values while the y-axis will be the count or number of times that temperature value is available on the signal. We can draw this histogram as follows.

Fig-5: Histogram of the temperature signal

We can use the following python script to generate the above graphs.


import matplotlib.pyplot as plt
import numpy as np
from random import gauss

# generating dummy temperature data over time
sample = 60
x = np.arange(sample)

temp = list()
for i in range(0,sample):
    temp.append(gauss(27,0.5)) # gauss(mean, std)
y = np.array(temp)

# plotting the signal waveform
#plt.title('Waveform of temparature signal')
plt.xlabel('Time (s)')
plt.ylabel('Temperature (C)')
plt.ylim(25, 29)
plt.grid(True)
plt.plot(x, y)
plt.show()

# plotting the histogram
plt.hist(y, normed=True)
#plt.title("Histogram of temperature signal")
plt.xlabel("Temperature (C)")
plt.ylabel("Frequency")
plt.grid(True)
plt.show()


  • Probability Density Function (PDF) for a signal

For a continuous signal, probability density function (PDF) has the x-axis with some parameter and the probability of getting that value in the y-axis. The graph shown below is the waveform for a signal. Let's say it was obtained by sampling the temperature in a room for a duration of 1 hour (3600 seconds). Probability density function for this signal has temperature in the x-axis and the probability of getting that temperature within the 1 hour period is given in the y-axis.
Fig-6: Waveform of a the temperature signal over 1 minute duration


This is what we get if we generate the probability density function for the signal we considered. In this case, we have received a Gaussian distribution shaped graph since our signal contains data which has a Gaussian distribution.
Fig-7: Probability density function of a the temperature signal


import matplotlib.pyplot as plt
import numpy as np
from random import gauss
from scipy.interpolate import UnivariateSpline

# generating dummy temperature data over time
sample = 3600
x = np.arange(sample)

temp = list()
for i in range(0,sample):
    temp.append(gauss(27,0.5)) # gauss(mean, std)
y = np.array(temp)

# plotting the signal waveform
#plt.title('Waveform of temparature signal')
plt.xlabel('Time (s)')
plt.ylabel('Temperature (C)')
plt.ylim(25, 29)
plt.grid(True)
plt.plot(x, y)
plt.show()

# plotting the PDF
p, x = np.histogram(y, bins=sample)
x = x[:-1] + (x[1] - x[0])/2   # convert bin edges to centers
f = UnivariateSpline(x, p, s=sample)
plt.plot(x, f(x)/sample)
#plt.title("PDF of temperature signal")
plt.xlabel("Temperature (C)")
plt.ylabel("Probability")
plt.grid(True)
plt.show()


  • Gaussian (normal) Distribution

When we sample a random variable which we find in the natural world, most of the times they are normally distributed. That means its distribution is a Gaussian (Normal) distribution that has a bell shape. Shape of this bell is controlled by the mean and standard deviation of the signal we are considering. Following equation illustrates the probability density function for the Gaussian distribution according to Wikipedia.

$$f(x \mid \mu,\sigma^2) = \frac{1}{\sqrt{2 \sigma^2 \pi}} \rm{e}^{-\frac{(x-\mu)^2}{2\sigma^2}} $$

Here are some Gaussian distributions for different mean \( \mu \) and standard deviation \( \sigma \) values. Python scripts used to generate these distributions are given after that.

Fig-8: Gaussian distributions for different mean and standard deviations.


import matplotlib.pyplot as plt
import numpy as np
from random import gauss
from scipy.interpolate import UnivariateSpline

# generating dummy temperature data over time
sample = 100000

temp = list()
for i in range(0,sample):
    temp.append(gauss(0,1)) # gauss(mean, std)
y0 = np.array(temp)

temp = list()
for i in range(0,sample):
    temp.append(gauss(0,0.2)) # gauss(mean, std)
y1 = np.array(temp)

temp = list()
for i in range(0,sample):
    temp.append(gauss(2,0.5)) # gauss(mean, std)
y2 = np.array(temp)

# plotting the PDF
p, x = np.histogram(y0, bins=sample)
x = x[:-1] + (x[1] - x[0])/2   # convert bin edges to centers
f = UnivariateSpline(x, p, s=sample)
plt.plot(x, f(x)/sample)

# plotting the PDF
p, x = np.histogram(y1, bins=sample)
x = x[:-1] + (x[1] - x[0])/2   # convert bin edges to centers
f = UnivariateSpline(x, p, s=sample)
plt.plot(x, f(x)/sample)

# plotting the PDF
p, x = np.histogram(y2, bins=sample)
x = x[:-1] + (x[1] - x[0])/2   # convert bin edges to centers
f = UnivariateSpline(x, p, s=sample)
plt.plot(x, f(x)/sample)

#plt.title("PDF of temperature signal")
plt.xlabel("x")
plt.ylabel("y")
plt.grid(True)

plt.legend(['mean=0,std=1', 'mean=0,std=0.2', 'mean=2,std=0.5'], loc='upper left')
plt.show()


  • Cumulative Distribution Function (CDF)

The cumulative distribution function of a normal distribution is the sum of probabilities of the x-axis variable in normal distribution up to each x-axis variable. Following graph illustrates this concept where the y-axis values vary between 0 and 1 since the total probability is 1 for all the x-axis values.

Fig-9: CDF for a normal distribution.


from random import gauss
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile

# generating a gaussian random signal
x = list()
for i in range(0,10000):
    x.append(gauss(0,10)) # gauss(mean, std)
x = np.array(x)

# showing the CDF of a signal
X = np.sort(x)
F = np.array(range(len(x)))/float(len(x))
plt.plot(X, F)
#plt.title("CDF")
plt.xlabel("Value")
plt.ylabel("Cumulative Frequency")
plt.grid(True)
plt.show()


  • Precision vs Accuracy of a measurement

I had came across these terms frequently at different places but it was only after reading the DSPGuide book I understood the true difference between then in terms of taking a measurement. When we talk about Precision of a measurement, we are talking about maximum variation or the possible error that can occur in successive measurements. When considering Accuracy, we are talking about how close our measurement is to the actual value that we are trying to measure.

I'll write more when I proceed to the next chapters of the book. Until then, enjoy processing digital signals!!!

Writing Equations on the Web


$$\sigma^{2} = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^{2}$$
An important requirement arose for me recently which didn't occur for many years of my blogging experience. I wanted to write some complex equations on my blog posts here in my blog. The first idea came in to my mind is preparing the equations on Latex somewhere else and then inserting them inside blog posts as images. But I wanted something better than that so I kept searching. Finally I found the right tool to do it.

MathJax is a JavaScript library which we can include in our web pages. Then we can various syntaxes to write equations in our web pages. The most important point is, I can write my most familiar Latex equation syntax inside HTML documents leaving MathJax to handle how to render them on the client web browsers. Here are the simple steps to do this on a blog article hosted in a blogger.com site such as my blog.

(1)
Put the following lines in the top of the HTML content of the web page first inside the <head> tags. In case you are doing it in a blogger.com article, you have to goto the HTML view first and then add these lines on the top.


<script type="text/javascript" async
  src="//cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-MML-AM_CHTML">
</script>

(2)
Now we are ready to write an equation. We can use Latex math syntax for this but depending on the place we put the equations, we have to use two different tags. In order to write some mathematical text in-line, we have to enclose them inside a \( and \) tags. To put mathematical equations separated from the main texts, we have to enclose them inside a $$ and $$ tags.

Here is an example text line where I will put an in-line equation using \( and \) tags as follows. \( y = ax + b \). Did you see the equation? It was originally written as follows within the text line.

\( y = ax + b \)

OK then here's how we put an equation outside the main text using $$ and $$ tags.

$$\sigma^{2} = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^{2}$$

It was originally written as follows in the HTML source code.


$$\sigma^{2} = \frac{1}{N-1} \sum_{i=0}^{N-1} (x_i - \mu)^{2}$$

Enjoy writing equations on web pages!

References:
[1] https://www.mathjax.org/
[2] http://docs.mathjax.org/en/latest/start.html

Thursday, August 25, 2016

Reflections of a Busy Semester

my desk at SCoRe lab
Life has become never like before. I've came across so many revolutions throughout my life and at the moment I can't believe how amazing was the journey for the last 29 years. As I always enjoy turning back and looking at the trail behind, I decided to spare some time to think and write about the past 6 months of my life. It's full of new experiences, new faces, new knowledge and all new set of dreams. Even though it is impossible to write about everything, at least this would be a collection of most important things I came across.

The things which took most of my processing power were my academic duties at UCSC. Last semester I involved in 5 courses for undergraduate students in UCSC. In the meantime, I taught an undergraduate course in University of Sri Jayawardenapura as a visiting lecturer. Well, some people find it unbelievable how come I get involved in teaching 5 undergraduate courses in a semester. But this is the thing. An important decision was made in this semester by the administration to assign an assistant lecturer for every course to get involved in teaching along with a senior lecturer. As I'm one of such an assistant lecturer, they assigned me to several courses. Me too was so eager to hold a significant responsibility in teaching and this was the best chance I ever received to work.

course introduction slide of SCS2105
Among all those many courses I got involved in, there were two important courses. The first is SCS2105 - Computer Networks - 1. The two lecturers assigned to teach in this course are Mr. Brian Wijesuriya and me and my role was little compared to Mr. Brian. However, since he had many other courses to teach which made him so busy, I taught about 11 lectures from the total of 15 lectures in this courses. This course was the most well structured and planned course I have ever worked on. Before starting the lecture series, Mr. Brian and me had few meeting and we discussed about the syllabus of the course and about the time line of covering each topic. Most importantly, we decided to prepare a series of practicals and tutorials to make the students busy o work every week. Computer networks is a practical course which should be learned by doing. Many undergraduate students prefer a lifestyle of having a nice time throughout the semester without worrying about course works and jump into learn only at the end of the semester a week before the exams. This is not what we encourage and I think we did our best to give a hard time for such people.

course introduction slides of SCS4107
While that course went in a well structured and planned phase, SCS4107 - Operating Systems - 2 was a course which went on a much experimental approach. Since this semester was the first time this course was offered, course contents were not fixed in the beginning and we had to plan on the go. We based this course on the Operating Systems Engineering course in MIT and took many contents from there. Since the senior lecturer worked on this course was Dr. Chamath Keppitiyagama, it was easier for me since it is always enjoyable to work with him. During the first 8 lectures of the course, I conducted sessions about XV6 operating system which is a clone of the famous Unix version 6 operating system. In every lecture, we spent the first half explaining some component of the operating system and then gave a task for the students to spend the next half of the course working on their own. They got marks at the end of the day for the work they do. The next 7 lectures of the course was on some advanced topics of operating systems conducted by Dr. Chamath. I attended to those lectures like a student since all those things were new to me too.

It was not a happy time always. I faced really challenging, stressful and tiresome days during the past semester. Sometimes, due to the large amount of works to switch between, I had a limited time to prepare before conducting a lecture. Because of that, there were moments where I felt that I was not prepared enough in front of the students. Anyway, I think I did my very best. When I received the student feedback forms, it was a little bit disappointing to see how some students responded specially in Computer Networks course for 2nd year CS students. But, there were some good and constructive comments from some students too. I'm thankful to them for helping me to improve as a junior lecturer. I identified many weaknesses of me which I should fix in future semesters.

In addition to teaching duties, I enjoyed my work in SCoRe lab. Our significant work is the contribution to write a new funding proposal for a research project. Dr. Chamath, Prof. Thiemo and Prof. Luca involved in writing this proposal in collaboration and I wrote a significant part of the document. It was a great experience to me to understand how to write grant proposals. Meanwhile, towards the end of this semester, I made a new initiative which seems to be interesting to many other people. That was starting a reading group among the SCoRe lab community. Our objective is to meet every week and do some group work such as reading a book, studying some topic or presenting a latest research paper. I hope this reading group will go a long way from here.


Sunday, August 7, 2016

Preparing Latex Documents in Sinhala (සිංහල) Language

Since I'm addicted to Latex for preparing all kinds of documents such as letters, presentation slides and research papers, sometimes I find it hard to work on some document if it is given in a file format for a WYSIWYG editor such as Microsoft word. Finally several months ago, I came across an interest to try and prepare some documents on Sinhala (සිංහල) language using Latex since it is my native language. Actually, I got the interest after working on a Wikipedia page translation in to Sinhala. You can find that wiki page which I translated here. However, I didn't have a chance to spend some time on making Latex documents in Sinhala. Today was the right time for that.

This is how we can prepare a simple Latex document in Sinhala. First of all, we need to have the necessary packages installed. I cannot specifically mention each and every package we need since I used to install the full distribution of texlive package to get everything under a single roof. So, let's go like that.

sudo apt-get install texlive-full

After the installation, open a text editor and enter the following content into it. We can use Google input tools for typing the Sinhala unicode texts and then copy and paste here inside our Latex document.


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
\documentclass{article}
\usepackage{fontspec}

%
% This file is a sample tex file to illustrate use of
% XeTeX in typesetting a Sinhala document.
% Asanka Sayakkara, 2016
% Based on the example from Golam Mortuza Hossain, 2008

%
% Define fonts that you want to use
%

\font\serif="FreeSerif:script=sinh"
\font\serifbb="FreeSerif:script=sinh" at 24pt
\font\serifsection="FreeSerif:script=sinh" at 16pt
\font\deffont="FreeSerif:script=sinh" at 14pt
%\font\mukti="Mukti Narrow Bold:script=beng"

%
% "script=sinh" in above lines ensures that appropriate
% complex text rendering engines are used for proper
% rendering of Sinhala.
%

\title{\bf\serifbb මාතෘකාව මෙතන ලියන්න}
\author{\bf\serif අසංක සායක්කාර}
\date{2016-08-06}
\newcommand{\edition}{Second Edition} % Book edition

\begin{document}

\deffont %Default font used for the document
\maketitle

\section{\textbf{\serifsection සිංහල}}

මේ මම සිංහලෙන් සකස් කල ලිපියක්. මෙය සකස් කිරිඉමට මම ලේටෙක් මෘදුකාංගය පාවිච්චි කලා. එය ඉතාම ප්‍රයෝජනවත් මෙවලමක් මට.

\section{English}

Some English here and there.

\end{document}

Let's say you saved this document with the file name sample.tex somewhere in your file system. Now, it's time to compile it and generate our PDF file using it. We should use xelatex tool to do that since latex and pdflatex are not going to work for these kinds of documents with unicode texts.

xelatex sample.tex

Now, there should be a PDF file generated in the same directory and it will have a content like the figure shown in the beginning of this article. Since copying and pasting Latex contents from the above sample content can be hard for people, I added my files to a github repository. You can find it here.

Enjoy preparing documents in Sinhala!

Friday, August 5, 2016

Sniffing GSM packets

These are the steps I followed to capture GSM packets using a HackRF SDR device and view it on Wireshark protocol analyzer.

(1) Download the following Ubuntu live iso image which contains GNURadio pre-configured. Then, prepare a live USB stick using this image. I used Ubuntu startup disk creator for this purpose. This is where we start.

s3-dist.gnuradio.org/ubuntu-14.04.3-desktop-amd64-gnuradio-3.7.9.iso

(2) When your live USB stick is ready, boot the laptop with it. Then let's install some packags as follows.

sudo apt-get install git cmake libboost-all-dev libcppunit-dev swig doxygen liblog4cpp5-dev python-scipy

 
(3) Now we should download the source code, build locally and install one important module called gr-gsm which has all the necessary tools to capture the GSM packets using HackRF and then directing it to Wireshark with a proper format.

cd Downloads
git clone https://github.com/ptrkrysik/gr-gsm.git
cd gr-gsm
mkdir build
cd build
cmake ..
make
sudo make install
sudo ldconfig



(4) Open the file ~/.gnuradio/config.conf and append following two lines to the end of it. In order to edit and save the file, I didn't need root permission.

[grc]
local_blocks_path=/usr/local/share/gnuradio/grc/blocks


(5)  Now, let's install another important package.

cd Downloads
git clone https://github.com/scateu/kalibrate-hackrf.git
cd kalibrate-hackrf
./bootstrap
./configure
make
sudo make install


(6) Now let's run calibrate tool and scan the frequency range for GSM networks.

cd Downloads/kalibrate-hackrf/src
./kal -s GSM900 -g 40 -l 40


Now we have to wait a while until this tool discovers and lists down all the frequency channels used by GSM networks in the area. We will refer to the data we found here later.

(7) Install Wireshark if you haven't done yet.

sudo apt-get install wireshark


(8) Go to the place where we downloaded gr-gsm. Then open the given GRC script using Gnuradio Companion tool. Update the gain value to 40. Now run the GRC script. We will get a new window where we can select the GSM channel frequency we need to sniff.

cd Downloads/gr-gsm/apps
sudo gnuradio-companion gr-gsmlivemon.grc


(9) While the above tool is running, open a new terminal and enter the following command to run Wireshark with the appropriate filters.

sudo wireshark -k -Y 'gsmtap && !icmp' -i lo

Now we should see a subset of GSM packets from GSM networks are appearing in Wireshark window.

References:

[1] http://dilsdomain.blogspot.sg/2016/01/install-requirements-for-gsm-band.html
[2]
http://dilsdomain.blogspot.com/2016/01/sniffing-gsm-traffic-with-hackrf.html
[3] https://z4ziggy.wordpress.com/2015/05/17/sniffing-gsm-traffic-with-hackrf/