TAPLFO 3C – Squashing the bug

We’ve recently discovered a bug in the recently-released TAPLFO 3 code (how embarrassing!). The problem only affects the tap tempo feature of the TAPLFO3. Other chips like the VCLFO or STOMPLFO aren’t affected. We’ve fixed the code, and while we were working on it, we also improved the switch debouncing routine to be more tolerant of bouncy switches.

We’re offering anyone who has an affected chip (a TAPLFO 3) a free replacement with the TAPLFO 3C. Please get in touch and send us your faulty chip/s. If you’re not using the tap tempo feature, you don’t need to worry.

Here’s links to the updated code and hex file:

Note nothing else has changed. All the circuits and the datasheet are identical. We just squashed the bug is all.

So what was the problem, and why was it missed?

I was working with the TAPLFO3 chip trying to improve the switch debouncing, and I noticed a strange timing anomaly. While everything seemed fine at faster speeds, at low speeds the LFO didn’t keep up with the tapped tempo. And the slower you tapped, the worse it got. The problem only really gets noticeable if you tap slower than once a second or so – e.g. at 60bpm or less. It occurs above this, but the effect is so much smaller, you don’t really see it. Similarly, if the LFO was synced to an external clock, the periodic resets hide the problem, again except at the slowest speeds.

The problem was caused by a bad instruction in the UpdateFreqIncs routine which deals with the waveform distortion feature – a simple copy/paste error. This code was the last thing to be added to the new TAPLFO3 code when it was ported over from the original 16F684 chip. The tap tempo feature had already been done and was tested and working. When the waveform distortion was added, it was tested and the waveforms distorted correctly, so no problem was noticed. Unfortunately it had side-effects on the tapped tempo at the slowest settings, and because of the limited nature of the problem this wasn’t spotted testing the chip after the waveform distortion was completed.

There were several steps involved in finding the problem. The tap tempo feature uses several parts of the code and any of them could potentially be at fault. The first step was to check that the actual tap timing was operating correctly. The time between taps is counted using a 16-bit counter clocked at 15625Hz, which means the longest time that can be measured is a little over 4 seconds (65535 / 15625 = 4.19). This part was fine. This counter value represents the required waveform *period*, but we need the waveform frequency increment. This is obtained by doing a long division, and my first thought was that the error was probably in the division routine, since it looked as if the errors might be related to the size of the count (slower taps = larger count). However, the division routine also checked out. The next step in the process is for the frequency increment value obtained from the division routine to be multiplied by the Tempo Multiplier value. Again, this was a possible source of an arithmetic error, but again, the routine checked out fine.

The last step is the conversion of the final frequency increment value into two separate values to be used for the first half and the second half of the waveform for the waveform distortion. The distortion feature works by effectively speeding up one half of the waveform while slowing down the other half proportionally so that there is no overall speed change. Commenting this feature out in the code made the bug disappear, so I was getting warm!

The distortion is produced by more division, so that made another arithmetic routine to check, but that wasn’t the problem either. Finally, I checked the code that shifts values from one variable to another ready for the division routine, and there it was! That snippet of code does a series of shifts, and the last of those was a copy/paste from the previous one that had not been altered so it read “FREQ_INC_MID” instead of “FREQ_INC_LO”, thus performing two shifts on the middle byte and none on the lower byte. This was the source of the error. The effect of this is to reduce the value of the middle byte of the frequency increment. On slower settings, the waveform takes more samples, so the frequency increment is added more times and the errors build up – this is why it appeared to be related to count size. The actual error in any particular case depend on the value of the middle byte of a 24-bit variable, so the error is actually related to ((FREQ_INC / 256) % 256), which is not exactly a direct proportional relationship as commonly understood!

To make the whole situation doubly embarrassing, the initial “fixed” version of the code, TAPLFO 3B, fixed this bug, but introduced another in the newly-updated switch debouncing. That’s what you get for trying to rush a fix out and taking your eye off the (other) ball!

All in all, it’s been a learning experience for me as a programmer and one I’ll chalk up to experience. Bugs are a fact of life for coders and squashing them is a big part of writing code. I’m sorry that one two got past to make it into released code though, and I’ll endeavour to make sure it doesn’t happen again.



Leave a Reply

Your email address will not be published. Required fields are marked *