Tuesday, March 8, 2016

Arduino Shield for custom board CPLD programming and testing using pogo pins

This post is just to show how I have used the Arduino JTAG programming hardware/software that I have discussed before.

The idea was to have a setup where I could both program and test a CPLD based board.

Let's see the photos:

The photo above shows the support for the board to be programmed with the pogo pins at the center, the board itself and the top shield. I have used two identical shield boards and have spaced them to give the pogo pins the proper vertical direction.

Some of the pads on the board to be tested are SMD, others are through hole. Of course, after it was assembled, I realized I should have left the pogo pins that poke into the through holes slightly taller than those that touch the SMD pads. That would have made fitting the board much easier.

In the same holes, I have mounted the board support, which is a kind of "negative" of the board. It consists of two milled PCBs, with two concentric circles to give support to the board. If you look carefully to the left of the photo, you can see a small dent that is used to give the board the proper orientation.

In this next photo, we can see the toggle clamp that holds the board in place, in action.

In the last photo, we can see the full stack: the Arduino at the bottom, the two shield boards in the middle and the support with a board in it.

Some details for those interested:

Hope you like it, comments are welcome!

Saturday, March 5, 2016

Serial Buffer Size versus Effective Bit Rate of Arduino USB


I recently ran into a few of the "gotchas" related to serial programming and memory on the Arduino, and learned a few lessons. I was debugging someone else's non-working code. Non-working for, apparently, no good reason. To make a long story short, the problem was that the program used a large amount of SRAM (static RAM), in the form of strings. The Arduino Uno has 32 KiB of flash, but only 2 KiB of SRAM. That is why strings on the Arduino should be kept in flash memory to save the precious SRAM. To do so, you have to wrap string literals in the "F()" macro, so that the compiler does that for you.

Figuring out the problem was not easy, since using "Serial.print()" without "F()" to debug would just make things worse in an unpredictable way. But at a certain point, I got it, and since then I have tried my best to spare SRAM. That is when I started facing the problem of the size of the serial buffer.

Serial communication on the Arduino has one big problem: there is no hardware flow control. That means that if you want reliable communication, you must implement your own flow control mechanism. Anything you implement in software implies a greater overhead than what you would get with a hardware mechanism. But of course, using a larger reception buffer would minimize the problem. The larger the buffer, the fewer times the flow control mechanism must act.

The Arduino software has a default size of 64 bytes for the serial buffer. I wondered whether that was enough, so I wrote some code to test it.

Some theory

Let's try to come up with a model. Linear models look interesting, for a start. Let's try the following: the time that a transfer takes \((\Delta t)\) is proportional to the number of bytes we want to transfer. If you consider a serial transmission with UART, eight data bits, one start bit and one stop bit, the time to transfer one byte is either ten times the inverse of the bit rate or some byte processing overhead \((O_{byte})\), whichever is greater. But since these bytes are transferred in blocks, we can imagine that the total transfer time also has an overhead component proportional to the number of blocks \((O_{block})\). In equations:

\begin{equation} \Delta t = NumBytes \cdot \max \left( \frac{10}{BitRate}, O_{byte}\right) + NumBlocks \cdot O_{block} \end{equation}

\begin{equation} \Delta t = NumBytes \cdot \max \left( \frac{10}{BitRate}, O_{byte}\right) + \frac{NumBytes}{BlockSize} \cdot O_{block} \end{equation}

\begin{equation} \label{eqDeltaTFinal} \Delta t = NumBytes \cdot \left[ \max \left( \frac{10}{BitRate}, O_{byte}\right) + \frac{O_{block}}{BlockSize} \right] \end{equation}

\begin{equation} \frac{\Delta t}{NumBytes} = \max \left( \frac{10}{BitRate}, O_{byte}\right) + \frac{O_{block}}{BlockSize} \end{equation}

\begin{equation} \frac{10}{EffectiveBitRate} = \max \left( \frac{10}{BitRate}, O_{byte}\right) + \frac{O_{block}}{BlockSize} \end{equation}

Equation \ref{eqDeltaTFinal} shows two things:

  1. We can mitigate the block overhead using a larger block size.
  2. We should try to keep the byte overhead less than 10 times the inverse of the bit rate.
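As a rough illustration of the model above, here is a minimal Python sketch that evaluates the last equation for a few block sizes. The function names are my own, and the overhead values in the example are hypothetical placeholders chosen only to show the trend, not measured figures.

```python
def seconds_per_byte(bit_rate, o_byte, o_block, block_size):
    """Model: max(10/BitRate, O_byte) + O_block/BlockSize, in seconds per byte."""
    wire_time = 10.0 / bit_rate  # 1 start bit + 8 data bits + 1 stop bit
    return max(wire_time, o_byte) + o_block / block_size


def effective_bit_rate(bit_rate, o_byte, o_block, block_size):
    """Effective bit rate implied by the model, in bits per second."""
    return 10.0 / seconds_per_byte(bit_rate, o_byte, o_block, block_size)


if __name__ == "__main__":
    # Hypothetical overheads: 50 us per byte, 1 ms per block (assumptions).
    for block_size in (1, 4, 16, 64):
        rate = effective_bit_rate(117647, 50e-6, 1e-3, block_size)
        print(block_size, round(rate))
```

With these placeholder values the effective rate climbs toward the nominal rate as the block size grows, which is the first point in the list above.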

In this work, I will estimate the byte overhead and the block overhead from measurements of the effective bit rate for various block sizes.

Arduino Bit Rates

For the data to have some meaning, we will have to use an exact calculation of the Arduino bit rates. The formula is (for AVR's U2X bit = 1):

\begin{equation}\label{eqnBitRate}BitRate = \frac{ClockFrequency}{8 \cdot (UBRR + 1)} \end{equation}

\begin{equation}\label{eqnBytePeriod}\frac{1}{ByteRate} = \frac{10}{BitRate} = \frac{10 \cdot 8 \cdot (UBRR + 1) \cdot 1000}{ClockFrequency} \,\, ms/byte \end{equation}

For the Arduino, \(ClockFrequency = 16\,MHz\), such that \(9600\,bits/s\) is actually \(9615.4\,bits/s\) \((UBRR = 207,\,1.0400\,ms/byte)\), and \(115200\,bits/s\) is actually \(117647\,bits/s\) \((UBRR = 16,\,85.000\,\mu{s}/byte)\).
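To check the numbers, here is a small Python sketch of the double-speed (U2X = 1) AVR relation, where the bit rate is the clock frequency divided by \(8 \cdot (UBRR + 1)\). The function names are illustrative; the UBRR values 207 and 16 are the ones that produce 9615.4 and 117647 bits/s at 16 MHz.

```python
def avr_bit_rate(clock_hz, ubrr):
    """Actual bit rate in bits/s for an AVR USART with U2X = 1."""
    return clock_hz / (8 * (ubrr + 1))


def byte_period_ms(clock_hz, ubrr):
    """Time on the wire for one 10-bit frame (8N1), in milliseconds."""
    return 10 * 8 * (ubrr + 1) * 1000 / clock_hz


if __name__ == "__main__":
    clock_hz = 16_000_000  # Arduino Uno clock
    for nominal, ubrr in ((9600, 207), (115200, 16)):
        print(nominal, avr_bit_rate(clock_hz, ubrr), byte_period_ms(clock_hz, ubrr))
```

This gives 1.04 ms per byte for the nominal 9600 setting and 0.085 ms per byte for the nominal 115200 setting.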

I found this nice AVR bit rate calculator; if you are curious, you can play with it.

The Data

Each graph consists of a log-log plot of two sets of data. The red curves are for the bit rate of 9600, and the blue curves are for the bit rate of 115200. Both curves refer to the transfer time of 32768 bytes. For each graph I have created an artificial byte overhead using a delay after receiving the byte. The log-log plot is necessary to linearize the "\(\frac{1}{x}\)" relation of total time versus block size.

The tail of the curves can be estimated from equation \ref{eqDeltaTFinal} taking the limit when the block size is large:

\begin{equation} \Delta t_{tail} = NumBytes \cdot \max \left( \frac{10}{BitRate}, O_{byte}\right) \end{equation}

which is \(\left(NumBytes \cdot \frac{10}{BitRate}\right)\) or \(\left(NumBytes \cdot O_{byte}\right)\), whichever is greater. If we assume that \( O_{byte}\leqslant \frac{10}{BitRate}\), then for a sequence of 32768 bytes we have the theoretical values of 34.078 s and 2.785 s for 9615.4 bps and 117647 bps respectively. These agree with the measured values of 34.1 s and 2.79 s.

The graphs show that for the Arduino running at 9600 bps (9615.4 bps actually), a buffer size of 17 bytes is enough to reach the minimum theoretical transfer time. For 115200 bps (117647 bps actually) a buffer size of 27 bytes will do.
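The tail estimate is easy to reproduce. The following Python sketch evaluates \(\Delta t_{tail}\) for 32768 bytes; the function name is my own, and the byte overhead defaults to zero, which corresponds to the assumption \(O_{byte}\leqslant \frac{10}{BitRate}\). With the bit rate in bits per second, the result is in seconds.

```python
def tail_transfer_time(num_bytes, bit_rate, o_byte=0.0):
    """Large-block limit of the model: NumBytes * max(10/BitRate, O_byte), in seconds."""
    return num_bytes * max(10.0 / bit_rate, o_byte)


if __name__ == "__main__":
    clock_hz = 16_000_000
    for ubrr in (207, 16):  # nominal 9600 and 115200 bits/s with U2X = 1
        bit_rate = clock_hz / (8 * (ubrr + 1))
        print(round(bit_rate, 1), round(tail_transfer_time(32768, bit_rate), 3))
```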

The tail of each curve gives us the estimate of \(\max \left( \frac{10}{BitRate}, O_{byte}\right)\), while the linear part gives us the estimate of \(\frac{O_{block}}{BlockSize}\).
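One simple way to turn this into numbers, assuming the linear model holds, is to take one measurement from the tail and one from the small-block region and solve for the two terms. The Python sketch below does that; the function name is hypothetical, and the values in the example are synthetic numbers generated from the model itself as a self-check, not measurements from the graphs.

```python
def estimate_overheads(num_bytes, tail_time, small_block_time, small_block_size):
    """Return (per-byte term, O_block) in seconds from two total transfer times.

    tail_time:        total time measured with a large block size
    small_block_time: total time measured with block size small_block_size
    """
    per_byte = tail_time / num_bytes  # estimate of max(10/BitRate, O_byte)
    o_block = (small_block_time / num_bytes - per_byte) * small_block_size
    return per_byte, o_block


if __name__ == "__main__":
    # Synthetic self-check: assume 1.04 ms per byte and 2 ms per block.
    n = 32768
    tail = n * 1.04e-3
    small = n * (1.04e-3 + 2e-3 / 4)  # model prediction for block size 4
    print(estimate_overheads(n, tail, small, 4))
```

Since the measured curves are bumpy at small block sizes, a two-point estimate like this is only a first approximation; a fit over several block sizes would be more robust.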

Reality is always full of surprises. The tail behaves as we would expect from our crude model, but for the lower values of the buffer size, we can see some unexpected things.


Depending on the byte processing overhead of your algorithm, we saw that a 63-byte buffer can give the same performance at 9600 bits/s as at 115200 bits/s.

The bumpy block overhead is something that I might analyse more carefully some day in the future.