June 20, 2013

Signal Acquisition & Filtering on Simple Embedded Systems



I've been working on an embedded system at my job, and recently had the opportunity to design some low pass digital filters in Octave for that project. I thought I would put up some notes on what I did, both for my future reference and on the off chance that they might help someone else. I'll present only items that are very generic to any electrical engineering/software project, nothing proprietary.



Yes, I realize it is possible that there is not a ton of overlap between the set of potential readers who might find antique sewing machines interesting and the set of potential readers interested in design of filters for signal processing. But the foremost purpose of this blog is to provide a platform for me to post whatever I like, so here goes.


What is Simple?
The precise definition of a 'Simple' embedded system is open to interpretation. The chip I'm using on this project is smaller than a dime and costs a few dollars, which I think qualifies as simple. On the other hand, it runs faster than desktop computers did when I was in high school. Modern electronic hardware is utterly mind blowing in how much it can do at outrageously low cost. For example, check out this link to a $12 (retail, quantity 1!!!) cell phone at Bunny's blog.

The other dimension of simple is software complexity. People have done some highly impressive things with careful software engineering on low end processors, but in a 'simple' system the software should be straightforward to implement and advanced synchronization and optimization should not be necessary. This is a less obvious bounty provided by fast and cheap hardware; if I were building this system 20 years ago it would be 10x the size, 10x the price, and I would have to be killing myself writing reams of tight assembly code. Today, I can just download and reuse a ton of excellent code, and write the rest without much trouble in a high level language with fully featured support routines. It just doesn't matter that I'm not using the hardware that efficiently. I can even enjoy the luxury of floating point and not have to trouble myself left shifting and right shifting things and keeping track of cryptic multipliers to keep all math as integer operations. An embarrassment of riches is what it feels like.


Resources
I am fortunate enough to have taken (and loved!) the main discrete time signal processing class in the electrical engineering and software department while I was at school. This class (6.341) was taught by an excellent professor named Oppenheim, aided by one of the finest recitation leaders I ever had. We spent a lot of time doing signal processing stuff in Matlab. However, that was over 15 years ago, and I have forgotten a shockingly large amount of the material. Anyway, there is so often a yawning gap between a concept learned in a class at university and what one needs to do to use that concept in a real world project. So I had to spend some time reviewing, thinking, and surfing the web.

An admirably well organized project which provides a code base for some of the LPC series of microcontrollers from NXP can be found at microbuilder.eu. The code repository is on GitHub. I am using a 32 bit Cortex-M3 ARM MCU from NXP on this project, and have found the resources published and maintained by microbuilder to be extremely useful. There are basic IIR and moving average filters included in the code base, and a note in one of these files pointed me towards another wonderful resource:

Steve Smith's The Scientist and Engineer's Guide to Digital Signal Processing.

This book is free online, though I also bought a paper copy to have at the office. As I mentioned above, I have some background in this topic, but I think it would be a good intro to the subject even if you don't know much about it. The topics are organized and presented around practical implementation of the ideas on real electronics, and it addresses topics of vital interest to people wanting to actually build real systems.

Before we get to the digital filters though, we've got to get the signal into the microprocessor. Assuming enough analog electronics are in place to get the signal into the right voltage range for the microprocessor's A/D converter, we need to think about how often to sample the input, and what to use for an analog antialiasing filter, if indeed one is even necessary.


Antialiasing Filter - Frequency Domain




The first thing to sort out is the anti-aliasing filter, since the best digital filter in the world can't do anything about noise aliased into the same frequency band as the signal of interest. The general idea is to do as little as possible in the analog realm, since anything beyond a low order filter is usually easier to implement, and more effective, in the digital realm.

"As little as possible" ideally translates into a single pole RC filter, implemented with one resistor and one capacitor just before the A/D input. Can this filter work for my application, or do we need something better?


The MCU I am using has a 12 bit A/D. This translates to 4096 counts over a 3.3V input range, or about 0.8mV/step. So a perfect (or rather indistinguishable from perfect) antialiasing filter would cut any signal that would result in aliasing down to less than 0.8mV peak to peak amplitude, such that when it is converted to digital it has no detectable impact on the signal I am looking for. The worst case scenario is that the undesirable noise would be occupying the full 3.3V scale, in which case we would want to cut it down by a factor of 1/4096 to get it below the threshold of detectability. This means we would like to have 72db attenuation at the first frequency that would alias into our range of interest.
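
As a quick sanity check, that 72dB figure is just 20*log10(4096). Here is the arithmetic as a throwaway program (purely illustrative, not project code):

#include <math.h>
#include <stdio.h>

int main(void){
  int bits = 12;
  double counts = pow(2.0, bits);            /* 4096 steps for a 12 bit converter */
  double lsb_mV = 3300.0 / counts;           /* ~0.81mV per step over 3.3V */
  double atten_dB = 20.0 * log10(counts);    /* full scale noise down to < 1 LSB */

  printf("LSB = %.2f mV, required attenuation = %.1f dB\n", lsb_mV, atten_dB);
  return 0;
}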

Several factors figure into calculating the attenuation at the first potential aliased frequency. First off is the relationship between the frequency of interest and the sampling frequency. As an example, let's consider a slowly varying signal, which would take several seconds to change in a significant way. Imagine we only care about the 0-5Hz component of the signal.

Discrete time principles dictate that the minimum frequency we can sample at (without aliasing) is 2x the highest frequency in the input signal. So just using the Nyquist principle alone, we could sample at 10Hz and be fine. All we would need is an antialiasing filter which has no significant attenuation up to 5Hz, then has 72db attenuation above 5Hz.


This is called a "brick wall" filter, and is useful as a conceptual device, but impossible to actually build. High order digital FIR filters can get arbitrarily close to the perfect low pass, limited only by computation speed and power required, but remember we are not actually into the computer yet and so have to select from the menu of analog techniques.

In contrast to the brick wall, a single RC filter rolls off at only 20dB/decade above its cutoff, which looks more like a gentle downward slope in the frequency domain. So to get to no detectable aliasing of any sort, we would want -72dB of gain (i.e. 72dB of attenuation) by the time we get to our Nyquist frequency.


This means Fnyq would need to be 3.6 decades above 5Hz, which would put us at 20kHz. This translates to a sampling frequency of 40kHz for a 5Hz signal! This is possible to do of course, and is not even particularly difficult for a chip running at 72MHz, but it does seem like overkill doesn't it? Plus, we would certainly need to put the sampling on an interrupt routine feeding a FIFO with new samples, while the main routine processed samples out of the FIFO, worry about serial transactions resulting in delays which would cause the FIFO to overflow, etc. Again, totally doable, but why go to the trouble if we don't need to?

One important and fascinating aspect of aliasing is that it introduces a mirror axis in the frequency domain around Fnyq. For example, if my sampling frequency is 10kHz, Fnyq = 5kHz, which says any frequency above 5kHz will map into a lower frequency after being sampled. But the key is that 5001Hz will map to 4999Hz, not 1Hz.


This sort of makes sense (in retrospect, once you know the answer already) if you think of a sine wave at Fnyq, with a sample at the max and min of each cycle, shown as waveform 1 below. Now with the same sample interval, a slightly higher frequency sine wave would put the first sample at the max as before, but the second sample would be a little off the min, the min having occurred slightly before the second sample as shown in 1a. This in fact results in the same sample results as if the input had been a slightly lower frequency sine wave, such as in 1b, where the min would occur slightly after the second sample.
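
If you want to convince yourself numerically, a quick throwaway program like this (again just an illustration, not project code) shows that a 4999Hz cosine and a 5001Hz cosine sampled at 10kHz produce exactly the same sample values:

#include <math.h>
#include <stdio.h>

#define PI 3.14159265358979

int main(void){
  double fs = 10000.0;                  /* sampling frequency, so Fnyq = 5000Hz */
  double f_lo = 4999.0, f_hi = 5001.0;  /* mirror images about Fnyq */

  for (int n = 0; n < 6; n++){
    double s_lo = cos(2.0*PI*f_lo*n/fs);
    double s_hi = cos(2.0*PI*f_hi*n/fs);
    printf("n=%d: %+.6f  %+.6f\n", n, s_lo, s_hi);  /* the two columns match */
  }
  return 0;
}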


Anyhow, what this means is that we don't necessarily need to avoid ALL aliasing, just aliasing that will muddy up our frequency of interest. Returning to the previous example where this range was 0-5Hz, let's imagine the sampling frequency is 10kHz. Assuming we can unleash a devastatingly effective digital filter to vanquish any content in the 5Hz to 5kHz range once the signal is converted, we actually won't care about anything until we get to the 9995Hz-10000Hz range, which is what would potentially alias into the 0-5Hz band.


9995/5 = 1999; in other words, 9995Hz is 3.3 decades above 5Hz. So oversampling at 10kHz would get us an antialiasing attenuation of 66dB in our 0-5Hz band if we used a first order AA stage, which is pretty good. Not 72dB, but at 66dB down, the noise at 9995-10000Hz would need to be coming in at greater than 1.6V out of the 3.3V range in order to cause detectable aliasing.

The other major factor in figuring the attenuation at the first interesting aliased frequency is the order of the antialiasing filter. Up to now I've been talking about a single pole RC filter, which is easy to build but has low performance. I've been told that you can do two passive RC filters in a row, as long as the second R is a lot bigger (>10x) than the first R. A second order filter would generally give you -40dB/decade; of course you get different shaped curves in the frequency domain from different topologies like Butterworth, Chebyshev, etc. If the analog input needs gain too, a second order Sallen-Key filter can give you both the gain and the second order rolloff with a single op amp, with the gain level independent from the cut frequency. In addition, the Sallen-Key design can implement Butterworth, Chebyshev, or Bessel characteristics depending on the component values. The multiple feedback design uses slightly fewer passives, but has inverting gain, so is not as good for systems with non-symmetric power rails. TI has a nice little free design program for designing these filters.

In my application, I have one op amp already in the analog signal path for signal conditioning and gain. It is a configuration not amenable to second order filtering, but I have read and heard that putting a capacitor in parallel with a feedback resistor on an op amp will give you half a pole or so (i.e. -10db/decade) of low pass action, at a cut frequency of 1/(2*pi*Rf*Cf). I designed this op amp circuit to support this, with parts installed for Fc=10Hz. The output from the op amp goes through an RC (also set for 10Hz for this experiment), then into the A/D.
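
Picking the actual parts is just the 1/(2*pi*R*C) relation; as an example (the component values here are made up, not what is on my board), a quick calculation gives the capacitor needed for a chosen resistor:

#include <math.h>
#include <stdio.h>

#define PI 3.14159265358979

int main(void){
  double fc = 10.0;       /* desired cut frequency, Hz */
  double r  = 100e3;      /* chosen feedback (or series) resistor, ohms */
  double c  = 1.0 / (2.0 * PI * r * fc);   /* C = 1/(2*pi*R*Fc) */

  printf("For Fc = %.1f Hz with R = %.0f ohms, C = %.3f uF\n", fc, r, c * 1e6);
  return 0;
}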

To test the combined attenuation of the op amp half pole and the RC, I hooked a signal generator up to the input, and a scope to the output (i.e. at the A/D input).

Here is the data I took (blue line). Even with the input grounded, the scope would still read 6mV worth of output from noise, but one assumes that in reality the slope continues down.



So in the region above about 50Hz, but before the scope measurement bottoms out, it looks like it is going down by 30db/decade, which is what I was expecting from the feedback RC plus the passive RC. The green line is a -30db/decade reference starting at 10Hz, the red line is the same but starting at 20Hz. It does take this filter quite a ways to get to its eventual slope. Maybe my component values are not quite right for 10Hz and the cut frequencies of the two stages are offset? Still, the slope gets to -30db eventually. If I adjust the filter physically so that it tracks to the 10Hz/-30db line, what frequency would I need to sample at for no aliasing below 10Hz?

decades needed in frequency = 72db/30db = 2.4 decades
so a factor of 10^2.4 = 251 above 10
(Fs-10)/10 = 251
Fs = 2521Hz
or
30*log((Fs-Fc)/Fc) = 72
30*log((Fs-10)/10) = 72
Fs = 10*10^(72/30) + 10 = 2521

Generalizing:
(filter rolloff in dB/decade) * log10((Fs-Fc)/Fc) = (desired attenuation in dB at the first aliased frequency)

The software from microbuilder was already set up for a 1ms system tick, so it was convenient to sample the input every system tick, which of course means Fs = 1000Hz. If I rearrange the equation above to fix Fs at 1kHz and the filter rolloff at -30dB/decade, in order to find Fc, I get:

30*log((1000-Fc)/Fc) = 72
10^(72/30) = 1000/Fc - 1
1000/(10^(72/30)+1) = Fc = 3.96Hz

So if I set up my op amp feedback cap and RC to be at 3.96Hz, I shouldn't need to worry about aliasing in the 0-4Hz range with a sampling frequency of 1kHz. In reality, given the real data taken above, I'd probably go with a 2Hz design Fc.
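
Here is that relation coded up as a pair of little helpers (my own sketch, not part of the project code), so you can plug in whatever rolloff and cutoff your analog stage actually delivers:

#include <math.h>
#include <stdio.h>

/* rolloff in dB/decade, atten in dB; both come from the relation
   rolloff * log10((Fs-Fc)/Fc) = atten                            */

double fs_needed(double fc, double rolloff, double atten){
  return fc * pow(10.0, atten/rolloff) + fc;
}

double fc_allowed(double fs, double rolloff, double atten){
  return fs / (pow(10.0, atten/rolloff) + 1.0);
}

int main(void){
  /* Fs needed for the measured -30dB/decade filter with Fc = 10Hz */
  printf("Fs = %.0f Hz\n", fs_needed(10.0, 30.0, 72.0));
  /* Fc allowed if Fs is fixed at 1kHz, as in the text */
  printf("Fc = %.2f Hz\n", fc_allowed(1000.0, 30.0, 72.0));
  return 0;
}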

The other major factor to consider in antialiasing is what the spectrum of the signal + noise actually looks like. With 72dB of attenuation, you don't need to think about what the spectrum is, since that is by design enough to wipe out any possible signal that could alias into the band of interest. But what if I can ascertain that I am only going to have +/-100mV of signal or noise in the frequency band that will cause trouble with aliasing? If I have the same hardware as above, this would mean I only need 48dB of attenuation to get below one LSB. Using the equation above, this means I could sample at 163Hz if I left my AA filter Fc at 4Hz, or I could move the Fc of the AA filter up to 24Hz if I kept sampling at 1kHz.

The proper way to look at the spectrum of the input signal would be to use a spectrum analyzer, or a scope with an FFT option. If you don't have either of those expensive tools, you could save a waveform to a data file on a regular scope, transfer it to the computer, then look at the spectrum in Octave.

I've set up a routine on my MCU to buffer 1024 A/D samples (~1s worth of data at 1kHz Fs). I can print these out as CSV on the terminal, copy them into an emacs window, save, then import the file into Octave. For testing purposes, I hooked a function generator with an offset square wave at 60Hz up to the input of my analog section. The actual captured data for this is shown below (only half the samples shown, so the waveform shape is more visible). I also plotted an FFT of the captured data. Of course, this is digitized data, so any potential aliasing will have already happened. But if this same operation were done with a scope capable of recording a waveform as the data source, it would be a good way to look at the spectrum of the signal.


octave-3.6.4:1089> subplot(2,1,1)
octave-3.6.4:1090> plot(raw602)
octave-3.6.4:1091> axis([0,500]);
octave-3.6.4:1092> ylabel("A/D Counts")
octave-3.6.4:1093> xlabel("Sample #")
octave-3.6.4:1094> subplot(2,1,2)
octave-3.6.4:1095> f = 1000*(0:499)/1024;
octave-3.6.4:1096> fft_y = (abs(fft(raw602)));
octave-3.6.4:1097> fft_y = fft_y(1:500);
octave-3.6.4:1098> plot(f,fft_y)
octave-3.6.4:1099> axis([0,500,0,700000])


The input square wave (data imported as raw602) is softened by the antialias filter, but the frequency content typical of a square wave is still obvious at 60Hz and its odd harmonics (3x, 5x, 7x, etc.).

So the basic message is that the sampling frequency, AA filter cutoff and order, and spectrum of the signal+noise are all important and coupled factors in the system design.


Antialiasing Filter - Time Domain
We now have an idea of the upper bound for the cutoff frequency of the AA filter. Why not just make it at an ultra low frequency? Well, you wouldn't want to be cutting into the band you care about for one thing. The other major consideration for antialiasing filter design is time domain performance.

My board has a bunch of channels of analog input, but only one (the main process input) needs a well designed signal path to provide high precision results. The nature of the process this channel is built to measure dictates that we want to have a stable signal within some seconds after a disturbance. That speaks to the step response and settling time of the overall filter scheme. To be stable to less than one LSB after conversion, we have to wait for the signal to get within 1/4096 = 0.024% of final value = 8.3 * tau for a one pole RC. Let's say we want to get there in 2 seconds or less, which would mean tau = 2/8.3 = 0.24. For a one pole RC, tau = RC and the cut frequency Fc = 1/(2*pi*tau). So for tau = 0.24, Fc = 0.66Hz. This means the time domain constraint of settling to 0.024% within 2 seconds can be satisfied by a one pole RC with a cut frequency of 0.66Hz or higher.
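
Coded up, the same arithmetic looks like the sketch below (illustrative only); the number of time constants needed is ln(2^N) for an N-bit converter:

#include <math.h>
#include <stdio.h>

#define PI 3.14159265358979

int main(void){
  int bits = 12;
  double settle_s = 2.0;                  /* want to be within 1 LSB in 2 seconds */
  double n_tau = log(pow(2.0, bits));     /* ln(4096) ~ 8.3 time constants */
  double tau = settle_s / n_tau;          /* ~0.24 s */
  double fc_min = 1.0 / (2.0 * PI * tau); /* ~0.66 Hz, the lowest allowable cutoff */

  printf("tau = %.3f s, minimum Fc = %.2f Hz\n", tau, fc_min);
  return 0;
}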

If one uses second order or higher analog filters for antialiasing, their time domain performance warrants even more careful thought. For instance, the step response of a Chebyshev looks terrible; lots of ringing is evident. So while Chebyshev provides faster rolloff in the frequency domain, it can really bugger up the time domain. Maybe it's possible to clean up the ringing later with a digital filter. Maybe you can just wait to treat the data seriously for some interval after a step (if the step is coming from something you know about, like a switch in analog signal gain range), to let the filter settle. The point is, it's worth giving some thought to the time domain implications of the antialiasing filter.

Here is the step response of the 0.66Hz single pole RC considered above:



octave-3.6.4:1106> aaa = [.241 1]
aaa =

   0.241000   1.00000

octave-3.6.4:1107> baa = [0 1]
baa =

   0   1

octave-3.6.4:1108> sys = tf(baa,aaa)

Transfer function 'sys' from input 'u1' to output ...

          1     
 y1:  ----------
      0.241 s + 1

Continuous-time model.
octave-3.6.4:1109> tc = 0:.01:5;
octave-3.6.4:1110> step(sys,tc)


tau = 1/(2*pi*Fc) = 1/(2*pi*0.66) = 0.241s for a single pole RC.


Digital Filtering



Ok, now we've got the data into the micro and we can dig into it with software. For this, I've been using Octave, which is an open source matrix manipulation and data processing application much like Matlab, but at a far more attractive price point (free). Octave has an add-on package for signal processing called 'signal', which provides functions similar to and for the most part compatible with the signal processing toolbox in Matlab. On newer versions of Octave, you can have it go out and download an add-on package from Octave-forge on the internet and install it, all from the Octave command line.

I tried some of the precompiled binaries of Octave for my machine, none of which were the most current version. These wouldn't run the signal package I downloaded from Octave-forge without errors, so I ended up having to build Octave from source. If you are a linux user, you know this often fails for obscure reasons or various package mismatch/dependency problems that can be annoying or impossible to fix. But the Octave build went surprisingly well, once I installed a few dependency packages it was asking for. It did take quite a long time (~4 hours). Once I had Octave up, I pulled down the signal package from within Octave and got started.


IIR vs FIR
It's easy enough to design whatever filter you want in Octave, but of course the next thing I was going to do was implement it on my MCU. My requirements were to smooth out the data samples a bit (chop high frequency noise), and also strongly cut down any 60Hz noise. 60Hz (or 50Hz in many places) is something that demands consideration in any analog system that will be operated anywhere close to grid power. This particular board will be in fairly close proximity to tens of kW of SCR chopped 60Hz power. It has some shielding, conductor lengths are short, etc., but I still wanted to make sure 60Hz and its harmonics were mercilessly crushed by the digital filter.

IIR filters make use of previous outputs. In other words, they have feedback, which introduces poles in the transfer function. FIR filters only depend on time shifted samples of the input, so they achieve their action with zeros only. To get a given amount of low pass filtering, an IIR will require a lower order design, which takes less computation and fewer constants to implement. There are certainly some drawbacks to IIR, but for low order filters they are a good choice. So I tried low order IIR filters first.

Butterworth IIR in Octave
In Octave, there are a few functions to design IIR filters. The usual DSP convention is to denote discrete time frequencies as a fraction of the sampling frequency Fs, so 60Hz at a 1kHz sampling rate would be 60/1000 = 0.06. The filter design functions in Octave, however, take frequencies as a fraction of Fnyq (1/2 Fs), which is why 5Hz shows up as 5/500 below.

I started out looking at the butterworth IIR routine, since the butterworth features of good rolloff with a flat pass band sounded appropriate. To design a second order butterworth with Fc of 5Hz, with Fs=1000Hz:

octave-3.6.4:1140> [bb2 ba2] = butter(2,5/500);

You can take a look at the frequency response easily with the freqz command:


octave-3.6.4:1206> freqz(bb2,ba2)



I made up some filters at orders 2,3 and 4:
octave-3.6.4:1140> [bb2 ba2] = butter(2,5/500);
octave-3.6.4:1141> [bb3 ba3] = butter(3,5/500);
octave-3.6.4:1142> [bb4 ba4] = butter(4,5/500);

Then I plotted the magnitude vs. frequency response and the step response:
octave-3.6.4:1143> [Hb2 Wb2] = freqz(bb2,ba2);
octave-3.6.4:1144> [Hb3 Wb3] = freqz(bb3,ba3);
octave-3.6.4:1145> [Hb4 Wb4] = freqz(bb4,ba4);
octave-3.6.4:1147> stp = [zeros(1,100) ones(1,900)];
octave-3.6.4:1148> stp2 = filter(bb2,ba2,stp);
octave-3.6.4:1149> stp3 = filter(bb3,ba3,stp);
octave-3.6.4:1150> stp4 = filter(bb4,ba4,stp);
octave-3.6.4:1151> x = 0:999;
octave-3.6.4:1259> xh = linspace(0,1,512);

octave-3.6.4:1188> subplot(2,1,1);
octave-3.6.4:1187> plot(xh',[20*log10(abs(Hb2'));20*log10(abs(Hb3'));20*log10(abs(Hb4'))]);
octave-3.6.4:1191> axis([0 .2 -100 0])
octave-3.6.4:1192> ylabel("dB")
octave-3.6.4:1204> xlabel("Frequency (f/Fnyq)")

octave-3.6.4:1210> legend("N=2","N=3","N=4");

octave-3.6.4:1211> title("IIR Butterworth Filters")


octave-3.6.4:1193> subplot(2,1,2)
octave-3.6.4:1196> plot(x,[stp2;stp3;stp4;stp]);
octave-3.6.4:1197> axis([0 600 0 1.2]);
octave-3.6.4:1201> xlabel("milliseconds");
octave-3.6.4:1202> ylabel("Response");
octave-3.6.4:1215> legend("N=2","N=3","N=4","input","location","east");


On the scale in the top plot, 1 = Fnyq, so 60Hz is 0.12. At that frequency, the 2nd order filter has 40dB of attenuation, whereas the 4th order is down 80+dB. So in that respect, higher order is nice. But when you look at the step response in the second plot, it's clear that as the order increases, the step response has worse overshoot and a longer settling time. N=3 looks like a decent compromise between attenuation at 60Hz (>60dB) and step response. We can find the overshoot like this:

octave-3.6.4:1256> max(stp2)
ans =  1.0432
octave-3.6.4:1257> max(stp3)
ans =  1.0815
octave-3.6.4:1258> max(stp4)
ans =  1.1083
octave-3.6.4:1259> 

So we can see that the third order butter filter overshoots by 8.1%.

Other IIR designs
Let's compare the other IIR filter designs available in Octave to the butterworth, all at third order. The other options in the signal package are Chebyshev and elliptical. With an extra download from Sourceforge, we can use maxflat as well, which has a flat passband like butter, but lets you independently set the number of zeros and poles. I did some graphs, and the zeros were not really doing much, so for comparison I put a maxflat with no zeros and three poles in here. Below, I create each filter (the N=3 butter was already made in the last section).

octave-3.6.4:1221> [cb3 ca3] = cheby1(3,1,5/500)
cb3 =

   1.8750e-06   5.6249e-06   5.6249e-06   1.8750e-06

ca3 =

   1.00000  -2.96822   2.93766  -0.96943

octave-3.6.4:1220> [eb3 ea3] = ellip(3,1,80,5/500)
eb3 =

   5.1566e-05  -4.4041e-05  -4.4041e-05   5.1566e-05

ea3 =

   1.00000  -2.96823   2.93769  -0.96944

octave-3.6.4:1242> [mb3 ma3] = maxflat(0,3,5/500)
mb3 =  3.0044e-05
ma3 =

   1.00000  -2.93717   2.87631  -0.93910

We can see that the 'a' vectors are very close in value, telling us the poles are in similar places. But there is a big difference in the 'b' vectors, which set the zeros. Next, let's find the frequency and step response for each, then graph:

octave-3.6.4:1223> [Hc3 Wc3] = freqz(cb3,ca3);
octave-3.6.4:1224> stp_c3 = filter(cb3,ca3,stp);
octave-3.6.4:1225> [He3 We3] = freqz(eb3,ea3);
octave-3.6.4:1226> stp_e3 = filter(eb3,ea3,stp);
octave-3.6.4:1243> [Hm3 Wm3] = freqz(mb3,ma3);
octave-3.6.4:1244> stp_m3 = filter(mb3,ma3,stp);

octave-3.6.4:1250> subplot(2,1,1);
octave-3.6.4:1251> plot(xh',[20*log10(abs(Hb3'));20*log10(abs(Hc3'));20*log10(abs(He3'));20*log10(abs(Hm3'))]);
octave-3.6.4:1252> legend("butter","cheby1","ellip","maxflat");
octave-3.6.4:1253> ylabel("db")
octave-3.6.4:1254> xlabel("Frequency (f/Fnyq)")
octave-3.6.4:1255> axis([0 .2 -100 0])

octave-3.6.4:1236> subplot(2,1,2)
octave-3.6.4:1245> plot(x,[stp3;stp_c3;stp_e3;stp_m3;stp]);
octave-3.6.4:1246> legend("butter","cheby1","ellip","maxflat","input","location","east");
octave-3.6.4:1247> axis([0 600 0 1.2]);
octave-3.6.4:1248> xlabel("milliseconds");
octave-3.6.4:1249> ylabel("Response");


Looking at the frequency response, it looks like maxflat is very much like butter, though they do diverge at higher frequencies not shown on this plot. Cheby would get us something like another -10dB at 60Hz, but the most noticeable thing is the nice deep sinkhole in the ellip filter curve right in the 60Hz area! However, when we look at the step response, the cheby and ellip overlay each other; both ring a lot and take quite a long time to settle. Also remember that with my 12 bit A/D, there is no benefit at all to having more than 72dB of attenuation. Maxflat has worse overshoot and greater delay than butter. So I'd go with butter or ellip here, depending on whether I could tolerate the poor time domain performance of the ellip.

It is interesting to look at where the poles and zeros are in these different types of filter. We can see that using the following Octave code.


octave-3.6.4:1331> subplot(2,2,1)
octave-3.6.4:1333> zplane(bb3,ba3)
octave-3.6.4:1334> title("butter N=3")
octave-3.6.4:1338> subplot(2,2,2)
octave-3.6.4:1339> zplane(cb3,ca3)
octave-3.6.4:1340> title("cheby1 N=3")
octave-3.6.4:1341> subplot(2,2,3)
octave-3.6.4:1342> zplane(eb3,ea3)
octave-3.6.4:1343> title("ellip N=3")
octave-3.6.4:1344> subplot(2,2,4)
octave-3.6.4:1345> zplane(mb3,ma3)
octave-3.6.4:1346> title("maxflat N=3")



One way to think of the system response is that the input signal frequencies run along the unit circle from 0 to pi radians, corresponding to 0-Fnyq Hz. At a particular frequency, the magnitude response can be found by multiplying together the lengths of all the vectors from that point to each zero, and dividing by the lengths of all the vectors from that point to each pole (times an overall gain constant).
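
Here is that geometric picture as a little standalone program. The pole and zero locations below are made up purely for illustration, not taken from the actual filters; the point is just that the magnitude is the product of distances to the zeros divided by the product of distances to the poles, up to the gain constant.

#include <complex.h>
#include <math.h>
#include <stdio.h>

#define PI 3.14159265358979

int main(void){
  /* made-up example: two zeros at z = -1, a complex pole pair near z = 1 */
  double complex zeros[2] = { -1.0, -1.0 };
  double complex poles[2] = { 0.98 + 0.03*I, 0.98 - 0.03*I };

  double w = 2.0*PI*60.0/1000.0;     /* 60Hz at Fs = 1kHz, in radians/sample */
  double complex z = cexp(I*w);      /* the point on the unit circle */

  double mag = 1.0;
  for (int k = 0; k < 2; k++) mag *= cabs(z - zeros[k]);   /* multiply by zero distances */
  for (int k = 0; k < 2; k++) mag /= cabs(z - poles[k]);   /* divide by pole distances */

  printf("|H| at 60Hz (up to the gain constant) = %.3f (%.1f dB)\n",
         mag, 20.0*log10(mag));
  return 0;
}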

So in all cases, the three poles are arranged over to the right very close to the unit circle at low frequencies, since our Fc in this example is a small fraction of Fnyq. Those short vectors in the denominator at low frequency are propping up the response down there, like poles in a tent, ha ha. A pole or zero at the origin doesn't do anything (since you just multiply or divide by a vector that is always 1 in length). The zeros sitting at Fnyq in the first three filters punch the response down as the input frequency approaches Fnyq, which is clear from the magnitude plots shown earlier. The zeros over at lower frequency in the elliptical case explain the dip around 60Hz we observed earlier in its frequency response. The lack of a zero at Fnyq in the maxflat map explains why it levels out in the stop band. Here is the full frequency response of these filters; the previous plot was scaled to zoom in on the lower frequency area:


octave-3.6.4:1347> figure
octave-3.6.4:1348> plot(xh',[20*log10(abs(Hb3'));20*log10(abs(Hc3'));20*log10(abs(He3'));20*log10(abs(Hm3'))]);
octave-3.6.4:1349> axis([0 1 -200 0])
octave-3.6.4:1350> title("N=3 IIR Filters")
octave-3.6.4:1351> xlabel("Frequency (1=Fnyq=500Hz)")
octave-3.6.4:1352> ylabel("db")
octave-3.6.4:1353> legend("butter","cheby1","ellip","maxflat");




IIR Implementation
Octave uses 'Direct Form' notation for filter coefficients. When you are writing code to implement the filter, this mainly impacts what gets added and what gets subtracted: with the a and b vectors as Octave prints them (a0 = 1), the difference equation is y[n] = b0*x[n] + b1*x[n-1] + ... - a1*y[n-1] - a2*y[n-2] - ... You can work out a simple example on a calculator to see what the output of the filter would be with your arrangement and make sure the signs are right. So let's look at the output of Octave for a third order maxflat filter:


octave-3.6.4:1356> [mb3 ma3] = maxflat(0,3,5/500)
mb3 =  3.0044e-05
ma3 =

   1.00000  -2.93717   2.87631  -0.93910


Here is how to implement that filter in C:


#include <stdint.h>
#include <stdio.h>
/* float32_t, systickGetTicks() and adcRead() come from the project/CMSIS support code */

/* IIR filter data structure */
typedef struct {
  float32_t x1;
  float32_t x2;
  float32_t x3;
  float32_t out;
  float32_t y1;
  float32_t y2;
  float32_t y3;
  float32_t a1;
  float32_t a2;
  float32_t a3;
  float32_t b0;
  float32_t b1;
  float32_t b2;
  float32_t b3;
} iir_filt_3p_instance;


/* analog channel 0 input filter, Fc = 5Hz default */
iir_filt_3p_instance ad0_iir_filt = {
  .a1=-2.93717, .a2=2.87631, .a3=-0.93910, 
  .b0=3.0044e-5, .b1=0.00000, .b2=0.00000, .b3=0.00000
};

/* IIR filter routine */
float32_t iir_filt_3p(iir_filt_3p_instance* filt, float32_t in){
  /* new output */
  filt->out = (
               filt->b0*in + 
               filt->b1*filt->x1 + filt->b2*filt->x2 + filt->b3*filt->x3 -
               filt->a1*filt->y1 - filt->a2*filt->y2 - filt->a3*filt->y3);
  
  /* shift input samples */
  filt->x3 = filt->x2;
  filt->x2 = filt->x1;
  filt->x1 = in;
  
  /* shift output samples */
  filt->y3 = filt->y2;
  filt->y2 = filt->y1;
  filt->y1 = filt->out;
  
  return(filt->out);
}



int main(void){

  uint16_t raw;
  uint16_t filtered;
  uint32_t currentTick, lastTick;

  /* initialize tick tracking so the first loop iteration waits for a fresh tick */
  currentTick = lastTick = systickGetTicks();

  while(1){

    /* tick updating */
    while(currentTick == lastTick){
      currentTick = systickGetTicks();
    }
    if(currentTick-1 != lastTick) printf("missed 1ms tick\n\r");
    lastTick = currentTick;
 

    /* 1ms tasks */ 

    /* acquire new analog input for 1ms sampled channels*/
    raw = adcRead(0);

    /* apply digital filter */
    filtered = (uint16_t)iir_filt_3p(&ad0_iir_filt,raw);

  }
}

We could save some time and code by removing the b coefficients we are not using in the maxflat filter, but it is nice to keep them in since then we can change to an elliptical, chebyshev, or butterworth just by changing the numbers. I'm sure there is a slick way to make this scalable in filter order, but I wanted a third order filter, so there you have it. It would be easy to edit the code to create higher order filters.
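
For what it's worth, here is one way the routine might be generalized to arbitrary order using coefficient and state arrays. This is just a sketch I have not run on the hardware; float32_t and uint32_t are assumed to come from the project headers as before, and the coefficients go in exactly as Octave prints them (with a0 = 1).

#define IIR_MAX_ORDER 8

typedef struct {
  uint32_t  order;                      /* filter order N (N <= IIR_MAX_ORDER) */
  float32_t b[IIR_MAX_ORDER + 1];       /* b0..bN from Octave */
  float32_t a[IIR_MAX_ORDER + 1];       /* a0(=1), a1..aN from Octave */
  float32_t x[IIR_MAX_ORDER + 1];       /* x[1..N] = previous inputs */
  float32_t y[IIR_MAX_ORDER + 1];       /* y[1..N] = previous outputs */
} iir_filt_np_instance;

float32_t iir_filt_np(iir_filt_np_instance* filt, float32_t in){
  uint32_t  n = filt->order;
  float32_t out = filt->b[0] * in;

  /* direct form: add the b[k]*x[n-k] terms, subtract the a[k]*y[n-k] terms */
  for (uint32_t k = 1; k <= n; k++){
    out += filt->b[k] * filt->x[k];
    out -= filt->a[k] * filt->y[k];
  }

  /* shift the input and output histories */
  for (uint32_t k = n; k > 1; k--){
    filt->x[k] = filt->x[k-1];
    filt->y[k] = filt->y[k-1];
  }
  filt->x[1] = in;
  filt->y[1] = out;

  return(out);
}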

One thing to watch out for is that if you want an IIR filter of order higher than maybe 4-6, apparently quantization of the coefficients and intermediate results can lead to some undesired non-ideal behavior. This can be avoided by implementing the filter as a cascade of biquad (N=2) sections. There is an Octave function for that; the example below considers a 6th order elliptical filter:


octave-3.6.4:1358> [eb6 ea6] = ellip(6,1,80,5/500)
eb6 =

 Columns 1 through 6:

   9.9730e-05  -5.9353e-04   1.4766e-03  -1.9656e-03   1.4766e-03  -5.9353e-04

 Column 7:

   9.9730e-05

ea6 =

    1.00000   -5.96941   14.84904  -19.70205   14.70597   -5.85493    0.97137

octave-3.6.4:1361> [sos g] = tf2sos(eb6,ea6)
sos =

   1.00000  -1.96052   1.00000   1.00000  -1.98425   0.98439
   1.00000  -1.99422   1.00000   1.00000  -1.98937   0.98997
   1.00000  -1.99665   1.00000   1.00000  -1.99579   0.99677

g =  9.9730e-05


I have not actually used this, and in particular I'm a little unsure how to implement the 'g' factor, but I think we could take that output and do something like this (with an N=2 version of the filter routine):

/* ad0 input filter stages, Fc = 5Hz default */
iir_filt_2p_instance ad0_iir_filt_1 = {
  .a1=-1.98425, .a2=0.98439, 
  .b0=9.973e-5, .b1=9.973e-5*-1.96052, .b2=9.973e-5};

iir_filt_2p_instance ad0_iir_filt_2 = {
  .a1=-1.98937, .a2=0.98997, 
  .b0=9.973e-5, .b1=9.973e-5*-1.99422, .b2=9.973e-5};

iir_filt_2p_instance ad0_iir_filt_3 = {
  .a1=-1.99579, .a2=0.99677, 
  .b0=9.973e-5, .b1=9.973e-5*-1.99665, .b2=9.973e-5};



Then later, in the main loop:


float32_t tmp1,tmp2;

/* acquire new analog input for 1ms sampled channels*/
raw = adcRead(0);

/* apply digital filter */
tmp1 = iir_filt_2p(&ad0_iir_filt_1,raw);
tmp2 = iir_filt_2p(&ad0_iir_filt_2,tmp1);
filtered = (uint16_t)iir_filt_2p(&ad0_iir_filt_3,tmp2);


Or we could write a routine to do the three stage filter in one go.
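
The N=2 routine itself isn't shown anywhere above; it would just be the 3-pole routine with one fewer tap, something like this sketch (with iir_filt_2p_instance assumed to mirror the 3-pole struct):

typedef struct {
  float32_t x1, x2;      /* previous inputs */
  float32_t y1, y2;      /* previous outputs */
  float32_t a1, a2;      /* denominator coefficients, as printed by Octave */
  float32_t b0, b1, b2;  /* numerator coefficients */
} iir_filt_2p_instance;

float32_t iir_filt_2p(iir_filt_2p_instance* filt, float32_t in){
  float32_t out = filt->b0*in + filt->b1*filt->x1 + filt->b2*filt->x2
                - filt->a1*filt->y1 - filt->a2*filt->y2;

  /* shift input and output history */
  filt->x2 = filt->x1;
  filt->x1 = in;
  filt->y2 = filt->y1;
  filt->y1 = out;

  return(out);
}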


Real world performance
Using my buffer capture code on the real hardware, I recorded 1024ms segments of synchronized A/D input and digital filter output, with a function generator hooked up to the front of the analog stage. Below, see the results with a 2Hz and a 60Hz input. The filter used was the third order maxflat with poles only, Fc=5Hz.


In the first plot, the overshoot in the step response is clearly visible with the 2Hz input, as is the slight rounding effect on the incoming signal due to the analog antialiasing filter. In the second plot, we can see that 60Hz is very effectively attenuated. The third plot shows a zoom in of the 60Hz results, showing the quantization of the real filter output, as well as the fact that the output is down to +/- a couple of LSBs of 60Hz noise.

To create these data sets, I wrote my buffer capture code to output comma separated decimal values in such a way that I could cut and paste them into a text file, in other words a 128 row long spew like this:
.........

00754,00692,00656,00636,01526,02530,03059,03200
03335,03483,03562,03603,03623,03277,02011,01356
01008,00826,00730,00680,00648,00636,01932,02743
03168,03391,03515,03578,03611,03630,02759,01745
01216,00934,00785,00708,00664,00640,01002,02259
02915,03262,03443,03535,03591,03618,03633,02347
01533,01102,00876,00756,00692,00656,00638,01516
02526,03055,03335,03483,03559,03602,03622,03290
02013,01356,01010,00825,00728,00678,00652,00634
01926,02739,03166,03394,03511,03578,03607,03627
02767,01750,01218,00936,00788,00710,00666,00644
00990,02256,02913,03259,03443,03539,03591,03618
03631,02353,01532,01104,00876,00754,00692,00656
00637,01508,02520,03054,03335,03484,03559,03602
03623,03302,02018,01356,01010,00832,00728,00680
00650,00633,01919,02735,03165,03391,03511,03578
03613,03630,02779,01754,01220,00938,00786,00708
00665,00643,00982,02250,02910,03255,03441,03538
.........
....
..

which can then be brought in to Octave with a CSV import.


octave-3.6.4:1448> raw60 = csvread("data_60Hz_raw_mf3_60.txt");

octave-3.6.4:1448> raw602 = reshape(raw60',1,1024);

The actual filtered data can be brought in the same way. Then the simulated filtered result can be had by putting the raw captured data through the designed filter in Octave:



octave-3.6.4:1448> filtsim602 = filter(mb3,ma3,raw602);

Then they can be plotted together. It's comforting to see that after the effect of the initial boundary condition is overcome (the simulation is starting at zero, while the actual output has had plenty of time to settle in to steady state), the actual filter output matches quite well to the simulated filter output.


Precision of Coefficients
One thing to look out for. The printouts of the filter coefficients in Octave by default give you 5 significant digits. This is mostly ok, but if for instance you change the cutoff of the third order butterworth IIR from above to 1Hz, you get something that looks fine:


octave-3.6.4:1528> [bb31 ba31] = butter(3,1/500)
bb31 =

   3.0812e-08   9.2437e-08   9.2437e-08   3.0812e-08

ba31 =

   1.00000  -2.98743   2.97495  -0.98751


But if you put these numbers into your MCU as shown, the output of the filter blows up very quickly and saturates. With some debugging, I could see that it was the feedback term that was blowing up, which made me suspicious about the a coefficients. If you ask Octave for a longer output format, you can see more digits:


octave-3.6.4:1529> format long e
octave-3.6.4:1530> [bb31 ba31] = butter(3,1/500)
bb31 =

 Columns 1 through 3:

   3.08123730444334e-08   9.24371191333001e-08   9.24371191333001e-08

 Column 4:

   3.08123730444334e-08

ba31 =

 Columns 1 through 3:

   1.00000000000000e+00  -2.98743365005572e+00   2.97494613266544e+00

 Column 4:

  -9.87512236110736e-01

If I put the longer strings of digits into the MCU for the a coefficients, the filter runs fine. Clearly at some point single precision floats are just not going to be enough to do the job. Not sure exactly when that is, but keep this issue in mind if you are having problems with the filter saturating or behaving strangely.
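
One quick way to see how touchy this is: the DC gain of the filter is sum(b)/sum(a), which should come out to exactly 1 for a Butterworth low pass. A throwaway check using the coefficients printed above, full precision versus the 5 digit printout, shows how much gets lost in the near-total cancellation of the a terms:

#include <stdio.h>

int main(void){
  /* butter(3,1/500) coefficients: full precision vs. the default 5 digit printout */
  double b_full[4]  = { 3.08123730444334e-08, 9.24371191333001e-08,
                        9.24371191333001e-08, 3.08123730444334e-08 };
  double a_full[4]  = { 1.0, -2.98743365005572, 2.97494613266544, -0.987512236110736 };

  double b_short[4] = { 3.0812e-08, 9.2437e-08, 9.2437e-08, 3.0812e-08 };
  double a_short[4] = { 1.0, -2.98743, 2.97495, -0.98751 };

  double nb = 0, na = 0, nbs = 0, nas = 0;
  for (int k = 0; k < 4; k++){
    nb  += b_full[k];   na  += a_full[k];
    nbs += b_short[k];  nas += a_short[k];
  }

  /* DC gain is H(z=1) = sum(b)/sum(a) */
  printf("DC gain, full precision coefficients: %f\n", nb / na);
  printf("DC gain, 5 digit coefficients:        %f\n", nbs / nas);
  return 0;
}

With the poles clustered that close to z = 1, that kind of coefficient error is apparently enough to tip the filter into instability on the MCU.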



What about FIR?
Well, it sure seems like you need a way bigger order FIR to match the attenuation of an IIR. Keeping the butter and ellip order 3 IIRs from our example, let's also throw in a big ol' FIR of order 200, which is what I found I needed to get into the same attenuation range at 60Hz as the IIRs are showing:

octave-3.6.4:1289> bf = fir1(200,5/500);
octave-3.6.4:1290> [Hf Wf] = freqz(bf);
octave-3.6.4:1299> stp_f = filter(bf,1,stp);
.....
Other plotting commands similar to previous set...

So the FIR has some benefits: it falls off faster in magnitude after Fc (though it is not as flat up to Fc), and it gets down to -60dB much faster than the IIRs. And the step response has no overshoot or ringing, and settles more quickly than the IIRs. Still, order 200 is a whole lot more than 3! I could probably cut it back some by adjusting the order of the FIR, or fiddling with the cut frequency or the sampling frequency. But it's not going to change drastically; I tried a bunch of combinations and it just seems that you need a lot of zeros in the filter to get the response in the stop band down to a low level.

Here is what the FIR looks like on a Z plane pole/zero map:


So you can see that it works by just coating the unit circle with zeros, which makes sense looking at the frequency response too (with some interesting stuff happening around 0 frequency, as with the IIR filters).


It is also interesting to see what the impulse response looks like, since the FIR filtering operation is just convolving the impulse response with the input data:


So these both look like windowed sincs, which is no surprise, since the time domain impulse response of a brick wall filter is a perfect sinc.



FIR Implementation
You can implement FIR with just the same code I showed above for IIR, simply setting all the 'a' vector coefficients to zero, and the 'b' vector to the FIR coefficients. It is more efficient to have a dedicated FIR routine, certainly, that would just calculate the required xn*bn terms and doesn't bother saving and shifting previous outputs. Even better to use some FIFO code to handle the shifting and storage of previous inputs.
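
Something along these lines would do the job with a circular buffer instead of shifting every sample (a sketch I have not tested; float32_t and uint32_t come from the project headers, and the tap count is chosen to match the fir1(200,...) example):

#define FIR_NUM_TAPS 201   /* fir1(200,...) returns order+1 = 201 coefficients */

typedef struct {
  const float32_t *coeffs;            /* b[0..FIR_NUM_TAPS-1], pasted from Octave */
  float32_t history[FIR_NUM_TAPS];    /* circular buffer of past inputs */
  uint32_t  index;                    /* where the newest sample was written */
} fir_filt_instance;

float32_t fir_filt(fir_filt_instance* filt, float32_t in){
  /* store the newest sample, overwriting the oldest */
  filt->index = (filt->index + 1) % FIR_NUM_TAPS;
  filt->history[filt->index] = in;

  /* accumulate b[k] * x[n-k], walking backwards through the circular buffer */
  float32_t acc = 0.0f;
  uint32_t  i = filt->index;
  for (uint32_t k = 0; k < FIR_NUM_TAPS; k++){
    acc += filt->coeffs[k] * filt->history[i];
    i = (i == 0) ? (FIR_NUM_TAPS - 1) : (i - 1);
  }

  return(acc);
}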

As FIR order climbs beyond 60-100, according to some sources, it becomes less computationally intensive to implement the filter as multiplication in the frequency domain rather than convolution in the time domain. After doing that, it is almost the same amount of work regardless of filter order, so high order filters become more attractive.

ARM has lately started producing and maintaining free support software which really enhances the utility of chips with their core. In fact the toolchain I am using for development is an ARM maintained set of GNU cross development tools.

Another useful thing ARM is doing is providing a set of standardized libraries with processor specific header files and routines called CMSIS. This software also has some pre-written and optimized routines for typical signal processing tasks, including biquad IIR and FIR filters. I spent some time trying to use the PID function, but honestly couldn't figure it out in a reasonable amount of time so I just went with my own version. The FIR function looks like it shouldn't be too hard to set up, but I have not actually done it so I don't really know.

There are functions for doing FFT and inverse FFT, which would be nice for doing high order FIRs. Plus there is a LOT of other cool looking stuff in there for statistics, matrices, etc. Cool! I do wish there were examples for all the functions though, maybe that would have helped me get their PID controller working properly.


Conclusion
In summary, Octave is an excellent tool to design filters, and digital filters can be extremely effective and fairly simple to implement on a microcontroller.

I wouldn't be at all surprised if I've made some errors in this little article; if you notice some, or just know of better or easier ways to think about or implement things, please feel free to post a comment.



ADDITION - 2/19/2016
Based on some info I dug up in books and online, I wrote a routine which auto generates the filter coefficients for a 2 pole butterworth. This is nice because you can just put the sampling frequency and cut frequency into your code and have it make the coefficients at run time, rather than designing the filter in Octave and then cutting and pasting the float numbers over. The reason I actually did this was because in one application I was working on, I wanted to dynamically change the filter while the device was in operation. Surely this was based on someone else's work, but I can't recall at the moment where it was...

Anyway, here is the code if you are interested:

/**************************************************************************/
/*!
    @brief      Two pole butterworth low pass IIR filter coefficient generator routine

    @param[in]  filt
                Pointer to the iir_filt_2p_instance instance containing
                the filter data structure
    @param[in]  fs
                sampling frequency
    @param[in]  fc
                cut point frequency
    @returns    zero on success, -1 on error
*/
/**************************************************************************/
int32_t iir_butter2(iir_filt_2p_instance* filt, float32_t fs, float32_t fc){
  /* reject nonsensical inputs (cut frequency must be below Fnyq) */
  if((fs <= 0.0) || (fc <= 0.0) || (fc >= fs/2.0)) return(-1);

  float ax = tan(PI*fc/fs);
  float ax2 = ax*ax;
  float r = sin(PI*3.0/4.0);
  float sx = ax2 + 2.0*ax*r + 1.0;
  float A = ax2/sx;
  float d1 = 2.0*(1-ax2)/sx;
  float d2 = -(ax2 - 2.0*ax*r + 1.0)/sx;

  filt->b0 = A;
  filt->b1 = 2*A;
  filt->b2 = A;
  filt->a1 = -1*d1;
  filt->a2 = -1*d2;

  return(0);
}
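
Usage would then be something like this fragment (with the 2 pole filter routine and instance from earlier, and raw/filtered as in the main loop above):

/* analog channel 0 input filter, coefficients generated at run time */
iir_filt_2p_instance ad0_iir_filt;

/* 2 pole butterworth, Fs = 1kHz, Fc = 5Hz */
if(iir_butter2(&ad0_iir_filt, 1000.0, 5.0) != 0){
  /* bad fs/fc combination */
}

/* then, on each 1ms tick */
filtered = (uint16_t)iir_filt_2p(&ad0_iir_filt, raw);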

7 comments:

Bobbin Doctor said...

I truly am extremely impressed, but I think I'll stick with sewing machines! Basic static mechanics I can understand. Neither electrical nor chemical equations stick in my brain. ;-)

Thin Man Sewing said...

I earned a degree in electronics in 1987. I remember some of what you are doing, but most of it wasn't thought of when I was studying it. The 8088 chip programmed with machine language. Brought back fond memories.

mssewcrazy said...

Though I have a science degree I got lost mid way through this and will just stick to the sewing and cooking posts on here. My grandsons here visiting are running around and that is not conducive to concentration on technical writings. I might not have understood the content had the small grandsons not been here but that's as good of an excuse as any I suppose .

Nico said...

Thanks for the well researched article. I learned about the finicky behavior of higher order IIR filters the hard way, especially when trying to move them from octave to another environment. I stumbled over your blog entry and tried the BiQuad approach you've mentioned. I struggled a bit with the gain g returned by the tf2sos command though. I reckon you don't need to multiply the B coefficients as you did, since the B coefficients appear to be already scaled. All you need to do is multiply the raw input sample by the gain. Please correct me if I'm wrong.

Holly said...

Nico- Thanks for the feedback. I did do some more looking at the output of the tf2sos command and how to implement a filter on it. I had also come to the conclusion that the right thing to do is multiply the incoming sample by g before feeding it into the filters. But I didn't move it from octave into the microcontroller since I already had something that basically worked. Thanks for offering your experience with tf2sos.

Unknown said...

Hi Holly:

Nice block diagram. Its amazing how many think they can get by without an alias filter!

During my time at Zond, I designed the ADAS and it sampled at 160 Hz. The converter was a 12 bit and therefore we needed to attenuate the incoming analog signal to -72 db at 80 Hz (1 count out of 4096). We did this using a 6 pole analog anti-alias filter that we adjusted using pots and in the end, had a nice filter with a 20 Hz bandwidth and a -72 db rejection at 80 Hz, just what we needed to prevent even 1 bit of alias appearing on the converter. We used a Butterworth because it would have low phase shift, but had to use 6 poles (three op amps) in order to get the -72 db at 80 Hz.

Our A/D was a forerunner of the Sigma Delta converters popular today. It was a synchronized voltage to frequency converter (Analog Devices AD652) driving a counter.

Some things never change and engineering requirements don't.

BTW, I also enjoyed your "energy" blog.

God bless

Kevin Cousineau

Holly said...

Kevin -
Glad to see you here!

Wow, a 6 pole analog filter must have really taken some tweaking. At the moment I'm developing an instrument based on the same LPC1347 processor I used in this article which uses digital lock-in for pulling a tiny measurement out of tons of SCR chopped 60Hz+harmonics noise. I've got 2 pole Butterworth band pass filters on the analog inputs centered at 250Hz and 10Hz wide, Q=25. These are extremely sensitive to analog component tolerances (at least the passband part; stopband is pretty insensitive). And I've noticed for the digital filters as you go higher order (for IIR anyway), the required precision of the coefficients shoots up quickly. So I can imagine with your 6 pole analog Butterworth you must have spent some time figuring out exactly which parts to use and how many ohms to split to fixed resistors vs. trimmers. Maybe even manually measuring and binning parts?

I remember working with the control boards on the Z40 when I was in college, which you designed, and I thought it was cool that you had done the A/D with a voltage to frequency converter.

Bob has been forwarding me updates on the stuff you are doing for WS; very nice work!

Anyway, glad you enjoyed this post.