
LPC176x UART Driver

In my last post, I claimed that FIFOs are often used in UART drivers. Here I will show a UART driver that uses dual FIFOs, one for transmit and one for receive. A universal asynchronous receiver/transmitter (UART) is a device that receives and transmits data without sharing a clock signal with the connecting device. This allows each device to send data whenever it wants, in stark contrast to the SPI and I2C buses, where the slave device can’t send data without the master first initiating a bus transfer. UARTs are very versatile and are in wide use. They are most commonly found in RS-232 ports on PCs.

The basic structure behind a UART driver is a negotiation process between the asynchronous hardware and the user’s code. FIFOs are used to aid this process. For transmitting data, it is desirable for the user to drop the data off at any time and forget about the actual serial transmission. This is where the FIFO comes in. The UART driver just takes the data, puts it in a FIFO, and returns to the user. In another thread (driven by interrupts) the driver sends all the data in the FIFO as fast as it can. The receive path is very similar. The driver, again in an interrupt-driven thread, transfers all received data into a FIFO. The user periodically checks if there is any new data and pulls it out at its own speed.
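To make the receive side concrete, here is a minimal sketch of what "periodically checks and pulls data out" might look like in user code. It is only an illustration: `poll_line()` is a hypothetical helper, and the two `uart3_*` functions below are host stand-ins with the same signatures as the driver's API so that the sketch can run without hardware. On the target, the real functions from uart3.h would be used instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host stand-ins for the driver's receive API (assumption: on the target
 * these come from uart3.h and are backed by the software Rx FIFO). */
static const uint8_t *stub_rx = (const uint8_t *)"";
static uint32_t stub_rx_len = 0;
static uint32_t uart3_available(void) { return stub_rx_len; }
static uint8_t  uart3_read(void) { stub_rx_len--; return *stub_rx++; }

/* Hypothetical helper, meant to be called periodically from the main loop.
 * Moves whatever has arrived into line[] (capacity cap, at least 1).
 * Returns 1 once a '\n' has been consumed; line[] is then null-terminated
 * with the newline stripped and *len is reset. Returns 0 if no complete
 * line has arrived yet. Characters beyond the capacity are dropped. */
static int poll_line(char *line, size_t cap, size_t *len) {
    assert(cap > 0);
    while (uart3_available() > 0) {
        uint8_t c = uart3_read();
        if (c == '\n') {
            line[*len] = '\0';
            *len = 0;
            return 1;
        }
        if (*len < cap - 1)
            line[(*len)++] = (char)c;
    }
    return 0;
}
```

The point is that the user code never waits on the hardware. It takes whatever the interrupt handler has already collected and comes back later for the rest.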

UARTs are often used for printing ASCII to a debug console. Most of the UARTs I have made have only been used for this purpose. For this reason it is very important to have a good method for converting numbers (integer and floating-point) to a sequence of ASCII characters. Of course, you could use a sprintf-like function; however, these are very slow. Even the embedded versions of these libraries produce terribly inefficient code (I dare you to follow the call stack of a printf function). I’m not a big fan of Arduinos, but I must say that the Arduino serial printing functions are very nice. There are no format strings to parse. Instead, the user just calls a sequence of print functions to produce the desired ASCII. My UART driver has an integrated printing library similar to the functions found in the Arduino library. This may be better off separated from the actual driver, but I feel it fits fine into this code. You’ll notice a lot of similarity between my print functions and the Arduino serial library.
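The conversion itself is simple enough to try on a PC before involving any UART. The following is a minimal sketch of the repeated divide-by-base idea, assuming a caller-supplied buffer. `u32_to_ascii()` is a hypothetical name used only for illustration and is not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the driver): converts n to ASCII in
 * the given base (2..16) and stores a null-terminated string in out.
 * out must hold at least 33 bytes (32 binary digits plus terminator).
 * Returns the number of characters written, excluding the terminator. */
static size_t u32_to_ascii(uint32_t n, uint8_t base, char *out) {
    static const char digits[] = "0123456789ABCDEF";
    char tmp[32];        /* digits, least significant first */
    size_t count = 0;
    size_t i;

    assert(base >= 2 && base <= 16);

    /* do/while so that n == 0 still produces a single '0' */
    do {
        tmp[count++] = digits[n % base];
        n /= base;
    } while (n > 0);

    /* the digits were produced in reverse, so copy them out backwards */
    for (i = 0; i < count; i++)
        out[i] = tmp[count - 1 - i];
    out[count] = '\0';
    return count;
}
```

There is no format string to parse: the caller picks the base directly, which is the same design choice the print functions in the driver make.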

Header File

/************************************************************************
Copyright (c) 2011, Nic McDonald
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above
   copyright notice, this list of conditions and the following
   disclaimer in the documentation and/or other materials provided
   with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

*************************************************************************

 Information:
   File Name  :  uart3.h
   Author(s)  :  Nic McDonald
   Hardware   :  LPCXpresso LPC1768
   Purpose    :  UART 3 Driver

*************************************************************************
 Modification History:
   Revision   Date         Author    Description of Revision
   1.00       05/30/2011   NGM       initial

*************************************************************************
 Assumptions:
   All print functions assume the UART is enabled.  Calling these
   functions while the UART is disabled produces undefined behavior.

************************************************************************/

#ifndef _UART3_H_
#define _UART3_H_

/* includes */
#include <stdint.h>

/* defines */
#define SW_FIFO_SIZE            512
#define UART3_DISABLED          0x00
#define UART3_OPERATIONAL       0x01
#define UART3_OVERFLOW          0x02
#define UART3_PARITY_ERROR      0x03
#define UART3_FRAMING_ERROR     0x04
#define UART3_BREAK_DETECTED    0x05
#define UART3_CHAR_TIMEOUT      0x06

/* typedefs */

/* functions */
void uart3_enable(uint32_t baudrate);
void uart3_disable(void);
void uart3_printByte(uint8_t c);
void uart3_printBytes(uint8_t* buf, uint32_t len);
void uart3_printString(char* buf); // must be null terminated
void uart3_printInt32(int32_t n, uint8_t base);
void uart3_printUint32(uint32_t n, uint8_t base);
void uart3_printDouble(double n, uint8_t frac_digits);
uint32_t uart3_available(void);
uint8_t uart3_peek(void);
uint8_t uart3_read(void);
uint8_t uart3_txStatus(void);
uint8_t uart3_rxStatus(void);

#endif /* _UART3_H_ */

Source File

/************************************************************************
Copyright (c) 2011, Nic McDonald
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above
   copyright notice, this list of conditions and the following
   disclaimer in the documentation and/or other materials provided
   with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

*************************************************************************

 Information:
   File Name  :  uart3.c
   Author(s)  :  Nic McDonald
   Hardware   :  LPCXpresso LPC1768
   Purpose    :  UART 3 Driver

*************************************************************************
 Modification History:
   Revision   Date         Author    Description of Revision
   1.00       05/30/2011   NGM       initial

*************************************************************************
 Theory of Operation:
   This provides a simple UART driver with accompanying print functions 
   for converting integer and floating point numbers to bytes.

************************************************************************/

#include "uart3.h"
#include "fifo.h"
#include "LPC17xx.h"

/* local defines */
#define RX_TRIGGER_ONE          0x0
#define RX_TRIGGER_FOUR         0x1
#define RX_TRIGGER_EIGHT        0x2
#define RX_TRIGGER_FOURTEEN     0x3
#define RX_TRIGGER_LEVEL        RX_TRIGGER_FOURTEEN
#define RLS_INTERRUPT           0x03
#define RDA_INTERRUPT           0x02
#define CTI_INTERRUPT           0x06
#define THRE_INTERRUPT          0x01
#define LSR_RDR                 (1<<0)
#define LSR_OE                  (1<<1)
#define LSR_PE                  (1<<2)
#define LSR_FE                  (1<<3)
#define LSR_BI                  (1<<4)
#define LSR_THRE                (1<<5)
#define LSR_TEMT                (1<<6)
#define LSR_RXFE                (1<<7)

/* local persistent variables */
static uint8_t uart3_tx_sts = UART3_DISABLED;
static uint8_t uart3_rx_sts = UART3_DISABLED;
static uint8_t uart3_txBuffer[SW_FIFO_SIZE];
static uint8_t uart3_rxBuffer[SW_FIFO_SIZE];
static FIFO txFifo;
static FIFO rxFifo;

/* private function declarations */
static inline void uart3_interruptsOn(void);
static inline void uart3_interruptsOff(void);

uint32_t rdaInterrupts = 0;
uint32_t ctiInterrupts = 0;

/* public functions */
void uart3_enable(uint32_t baudrate) {
    uint32_t fdiv, pclk;

    // initialize the SW FIFOs
    fifo_init(&txFifo, SW_FIFO_SIZE, (uint8_t*)uart3_txBuffer);
    fifo_init(&rxFifo, SW_FIFO_SIZE, (uint8_t*)uart3_rxBuffer);

    // set pin function to RxD3 and TxD3
    LPC_PINCON->PINSEL0 &= ~0x0000000F;
    LPC_PINCON->PINSEL0 |=  0x0000000A;

    // give power to PCUART3
    LPC_SC->PCONP |= (1 << 25);

    // set peripheral clock selection for UART3
    LPC_SC->PCLKSEL1 &= ~(3 << 18); // clear bits
    LPC_SC->PCLKSEL1 |=  (1 << 18); // set to "01" (full speed)
    pclk = SystemCoreClock;

    // set to 8 databits, no parity, and 1 stop bit
    LPC_UART3->LCR = 0x03;

    // enable 'Divisor Latch Access' (must disable later)
    LPC_UART3->LCR |= (1 << 7);

    // do baudrate calculation
    fdiv = (pclk / (16 * baudrate));
    LPC_UART3->DLM = (fdiv >> 8) & 0xFF;
    LPC_UART3->DLL = (fdiv) & 0xFF;

    // disable 'Divisor Latch Access'
    LPC_UART3->LCR &= ~(1 << 7);

    // FCR is write-only (it shares its address with the read-only IIR),
    // so write the whole value at once instead of using |=
    //  bits 7:6 = number of bytes received before a RDA interrupt
    //  bits 2:1 = clear Rx and Tx FIFOs
    //  bit 0    = enable Rx and Tx FIFOs
    LPC_UART3->FCR = (RX_TRIGGER_LEVEL << 6) | 0x06 | 0x01;

    // set the priority of the interrupt
    NVIC_SetPriority(UART3_IRQn, 30); // '0' is highest

    // enable the UART3 interrupt in the NVIC
    NVIC_EnableIRQ(UART3_IRQn);

    // turn on UART3 interrupts
    uart3_interruptsOn();

    // set to operational status
    uart3_tx_sts = UART3_OPERATIONAL;
    uart3_rx_sts = UART3_OPERATIONAL;
}

void uart3_disable(void) {
    // disable interrupt
    NVIC_DisableIRQ(UART3_IRQn);

    // turn off all interrupt sources
    uart3_interruptsOff();

    // clear software FIFOs
    fifo_clear(&txFifo);
    fifo_clear(&rxFifo);

    // set to disabled status
    uart3_tx_sts = UART3_DISABLED;
    uart3_rx_sts = UART3_DISABLED;
}

void uart3_printByte(uint8_t b) {
    uint8_t thr_empty;

    // turn off UART3 interrupts while accessing shared resources
    uart3_interruptsOff();

    // determine if the THR register is empty
    thr_empty = (LPC_UART3->LSR & LSR_THRE);

    // both checks MUST be here.  there is a slight chance that
    //  the THR is empty but chars still reside in the SW Tx FIFO
    if (thr_empty && fifo_isEmpty(&txFifo)) {
        LPC_UART3->THR = b;
    }
    else {
        // turn UART3 interrupts back on to allow SW Tx FIFO emptying
        uart3_interruptsOn();

        // wait for one slot available in the SW Tx FIFO
        while (fifo_isFull(&txFifo));

        // turn interrupts back off
        uart3_interruptsOff();

        // add character to SW Tx FIFO
        fifo_put(&txFifo, b); // <- this is the only case of txFifo putting
    }

    // turn UART3 interrupts back on
    uart3_interruptsOn();
}

void uart3_printBytes(uint8_t* buf, uint32_t len) {
    // transfer all bytes to HW Tx FIFO
    while ( len != 0 ) {
        // send next byte
        uart3_printByte(*buf);

        // update the buf ptr and length
        buf++;
        len--;
    }
}

void uart3_printString(char* buf) {
    while ( *buf != '\0' ) {
        // send next byte
        uart3_printByte((uint8_t)*buf);

        // update the buf ptr
        buf++;
    }
}

void uart3_printInt32(int32_t n, uint8_t base) {
    uint32_t i;

    // print '-' for negative numbers and take the magnitude using
    //  unsigned arithmetic, which is well defined even for INT32_MIN
    if (n < 0) {
        uart3_printByte((uint8_t)'-');
        i = 0u - (uint32_t)n;
    }
    else {
        i = (uint32_t)n;
    }

    // print using uint32_t printer
    uart3_printUint32(i, base);
}

void uart3_printUint32(uint32_t n, uint8_t base) {
    uint32_t i = 0;
    uint8_t buf[8 * sizeof(uint32_t)]; // binary is the largest

    // check for zero case, print and bail out if so
    if (n == 0) {
        uart3_printByte((uint8_t)'0');
        return;
    }

    while (n > 0) {
        buf[i] = n % base;
        i++;
        n /= base;
    }

    for (; i > 0; i--) {
        if (buf[i - 1] < 10)
            uart3_printByte((uint8_t)('0' + buf[i - 1]));
        else
            uart3_printByte((uint8_t)('A' + buf[i - 1] - 10));
    }
}

void uart3_printDouble(double n, uint8_t frac_digits) {
    uint8_t i;
    uint32_t i32;
    double rounding, remainder;

    // test for negatives
    if (n < 0.0) {
        uart3_printByte((uint8_t)'-');
        n = -n;
    }

    // round correctly so that print(1.999, 2) prints as "2.00"
    rounding = 0.5;
    for (i=0; i<frac_digits; i++)
        rounding /= 10.0;
    n += rounding;

    // extract the integer part of the number and print it
    i32 = (uint32_t)n;
    remainder = n - (double)i32;
    uart3_printUint32(i32, 10);

    // print the decimal point, but only if there are digits beyond
    if (frac_digits > 0)
        uart3_printByte((uint8_t)'.');

    // extract digits from the remainder one at a time
    while (frac_digits-- > 0) {
        remainder *= 10.0;
        i32 = (uint32_t)remainder;
        uart3_printUint32(i32, 10);
        remainder -= i32;
    }
}

uint32_t uart3_available(void) {
    uint32_t avail;
    uart3_interruptsOff();
    avail = fifo_available(&rxFifo);
    uart3_interruptsOn();
    return avail;
}

uint8_t uart3_peek(void) {
    uint8_t ret;
    uart3_interruptsOff();
    ret = fifo_peek(&rxFifo);
    uart3_interruptsOn();
    return ret;
}

uint8_t uart3_read(void) {
    uint8_t ret;
    uart3_interruptsOff();
    ret = fifo_get(&rxFifo);
    uart3_interruptsOn();
    return ret;
}

uint8_t uart3_txStatus(void) {
    return uart3_tx_sts;
}

uint8_t uart3_rxStatus(void) {
    return uart3_rx_sts;
}

/* private functions */
void UART3_IRQHandler(void) {
    uint8_t intId;  // interrupt identification
    uint8_t lsrReg; // line status register

    // get the interrupt identification from the IIR register
    intId = ((LPC_UART3->IIR) >> 1) & 0x7;

    // RLS (receive line status) interrupt
    if ( intId == RLS_INTERRUPT ) {
        // get line status register value (clears interrupt)
        lsrReg = LPC_UART3->LSR;

        // determine type of error and set Rx status accordingly
        if (lsrReg & LSR_OE)
            uart3_rx_sts = UART3_OVERFLOW; // won't happen when using SW fifo
        else if (lsrReg & LSR_PE)
            uart3_rx_sts = UART3_PARITY_ERROR;
        else if (lsrReg & LSR_FE)
            uart3_rx_sts = UART3_FRAMING_ERROR;
        else if (lsrReg & LSR_BI)
            uart3_rx_sts = UART3_BREAK_DETECTED;
    }
    // RDA (receive data available) interrupt
    else if ( intId == RDA_INTERRUPT )      {
        // this interrupt occurs when the number of bytes in the
        //  HW Rx FIFO are greater than or equal to the trigger level 
        // (FCR[7:6])

        // read out bytes
        // clears interrupt when HW Rx FIFO is below trigger level FCR[7:6]
        // the number of loops should be the trigger level (or +1)
        while ((LPC_UART3->LSR) & 0x1)
            fifo_put(&rxFifo, LPC_UART3->RBR);
        rdaInterrupts++;
    }
    // CTI (character timeout indicator) interrupt
    else if ( intId == CTI_INTERRUPT )      {
        // this interrupt occurs when the HW Rx FIFO contains at least one
        //  char and nothing has been received in 3.5 to 4.5 char times.
        // read out all remaining bytes
        while ((LPC_UART3->LSR) & 0x1)
            fifo_put(&rxFifo, LPC_UART3->RBR);
        ctiInterrupts++;
    }
    // THRE (transmit holding register empty) interrupt
    else if ( intId == THRE_INTERRUPT ) {
        uint8_t i;
        // transfer 16 bytes if available, if not, transfer all you can
        for (i=0; ((i<16) && (!fifo_isEmpty(&txFifo))); i++)
            LPC_UART3->THR = fifo_get(&txFifo);
    }
}

static inline void uart3_interruptsOn(void) {
    LPC_UART3->IER = 0x07; // RBR, THRE, RLS
}

static inline void uart3_interruptsOff(void) {
    LPC_UART3->IER = 0x00; // !RBR, !THRE, !RLS
}
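A note on the baud rate calculation in uart3_enable(): the divisor is computed with integer division, so the rate the hardware actually produces is usually a little off from the one requested. Here is a small host-side sketch of that arithmetic. The function names are hypothetical, and the fractional divider is assumed to be unused, as in the driver above.

```c
#include <assert.h>
#include <stdint.h>

/* Same integer formula as uart3_enable(): divisor = pclk / (16 * baud).
 * The result is split across DLM:DLL, so it must fit in 16 bits. */
static uint32_t uart_divisor(uint32_t pclk, uint32_t baudrate) {
    assert(baudrate != 0);
    return pclk / (16u * baudrate);
}

/* Baud rate actually generated by a given divisor (truncated to an
 * integer), assuming the fractional divider is left at its default. */
static uint32_t uart_actual_baud(uint32_t pclk, uint32_t divisor) {
    assert(divisor != 0);
    return pclk / (16u * divisor);
}
```

As a worked example, assume a 100 MHz peripheral clock (an assumption; the real value depends on how the system clock is configured). Requesting 115200 baud gives 100000000 / 1843200 = 54.25, which truncates to a divisor of 54. That divisor produces about 115740 baud, roughly 0.5% faster than requested, which is normally well within what a UART receiver tolerates.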

Handling FIFOs


The LPC176x UART design has built-in hardware FIFOs, which make the UART hardware very efficient. However, handling the data flow between the hardware FIFOs, the software FIFOs, and the user can be very tricky. There are many situations that must be considered. The main issue is synchronization; without it, data will be corrupted. A correct UART driver design must always send the data in order. Issues will occur if the driver mistakenly assumes that the software FIFO is empty and adds data directly to the hardware FIFO. If you look at the ‘uart3_printByte()’ function, it has a lot of checks to ensure this does not happen. Throughout the code, the driver is constantly turning the UART interrupts on and off. This is because the interrupts can trigger at any time, so the interrupt code must be held off while shared memory is being accessed. This is a tricky concept and is the basis for many embedded system software errors.
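The ordering rule is easier to see in a model that runs on a PC. The sketch below is not the driver; it is a simplified host-side model with hypothetical names (`TxModel`, `tx_submit()`, `tx_thre_event()`) in which `wire` records the order that bytes reach the transmitter. It mirrors the decision in ‘uart3_printByte()’: a byte may bypass the software queue only when the hardware is idle and nothing older is still queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Q_SIZE    8
#define WIRE_SIZE 32

typedef struct {
    uint8_t queue[Q_SIZE];   /* stands in for the software Tx FIFO */
    size_t  head, count;
    int     thr_empty;       /* stands in for the THRE bit of the LSR */
    uint8_t wire[WIRE_SIZE]; /* bytes in the order they were sent */
    size_t  wire_len;
} TxModel;

static void tx_model_init(TxModel *m) {
    m->head = 0;
    m->count = 0;
    m->thr_empty = 1;
    m->wire_len = 0;
}

static void wire_push(TxModel *m, uint8_t b) {
    assert(m->wire_len < WIRE_SIZE);
    m->wire[m->wire_len++] = b;
    m->thr_empty = 0;        /* hardware is now busy with this byte */
}

/* Returns 0 on success, -1 if the queue is full (the real driver waits
 * for a free slot instead of failing). */
static int tx_submit(TxModel *m, uint8_t b) {
    if (m->thr_empty && m->count == 0) {
        wire_push(m, b);     /* safe: nothing older is waiting */
        return 0;
    }
    if (m->count == Q_SIZE)
        return -1;
    m->queue[(m->head + m->count) % Q_SIZE] = b;
    m->count++;
    return 0;
}

/* Models the THRE interrupt: send the oldest queued byte, if any. */
static void tx_thre_event(TxModel *m) {
    if (m->count > 0) {
        uint8_t b = m->queue[m->head];
        m->head = (m->head + 1) % Q_SIZE;
        m->count--;
        wire_push(m, b);
    } else {
        m->thr_empty = 1;
    }
}
```

If `tx_submit()` checked only `thr_empty`, a byte submitted in the window where the hardware has gone idle but the interrupt has not yet drained the queue would jump ahead of older bytes. Checking both conditions closes that window.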

Software FIFO

The base of any embedded system is the drivers that interact with the hardware. In most cases, these drivers use interrupts to handle the asynchronous behavior of hardware. A crucial component in driver development is often a first-in-first-out (FIFO) buffer that allows the hardware interrupt handler to act independently of the regular system code. FIFOs allow a system to have a ‘producer’ and ‘consumer’ of data. The rate at which the FIFO is filled and emptied does not have to be the same on both sides. This asynchronous behavior allows for bursty data flows. A basic FIFO has two interfaces: a write interface that allows some code to write data to it, and a read interface that allows other code to pull data from it.

Header File

Developing a precise interface specification before implementation will make the design process faster and less buggy. Here is the specification to my FIFO:

/************************************************************************
Copyright (c) 2011, Nic McDonald
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met: 

1. Redistributions of source code must retain the above copyright 
   notice, this list of conditions and the following disclaimer. 
2. Redistributions in binary form must reproduce the above 
   copyright notice, this list of conditions and the following 
   disclaimer in the documentation and/or other materials provided 
   with the distribution. 

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE 
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS 
OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR 
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE 
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

*************************************************************************

 Information:
   File Name  :  fifo.h
   Author(s)  :  Nic McDonald
   Hardware   :  Any
   Purpose    :  First In First Out Buffer

*************************************************************************
 Modification History:
   Revision   Date         Author    Description of Revision
   1.00       05/30/2011   NGM       initial

************************************************************************/
#ifndef _FIFO_H_
#define _FIFO_H_

/* includes */
#include <stdint.h>

/* defines */
#define FIFO_GOOD       0x00
#define FIFO_OVERFLOW   0x01
#define FIFO_UNDERFLOW  0x02

/* typedefs */
typedef struct {
    volatile uint32_t size;
    volatile uint8_t* data;
    volatile uint8_t  status;
    volatile uint32_t putIndex;
    volatile uint32_t getIndex;
    volatile uint32_t used;
} FIFO;

/* functions */
void     fifo_init(FIFO* f, uint32_t size, uint8_t* data);
uint32_t fifo_isFull(FIFO* f);
uint32_t fifo_isEmpty(FIFO* f);
uint8_t  fifo_get(FIFO* f);
void     fifo_put(FIFO* f, uint8_t c);
uint8_t  fifo_peek(FIFO* f);
uint32_t fifo_available(FIFO* f);
void     fifo_clear(FIFO* f);
uint8_t  fifo_status(FIFO* f);

#endif // _FIFO_H_

Source File

It is important to design a FIFO to be robust even when the user abuses the interface specification. For instance, you don’t want the memory to become corrupted when the user reads from the FIFO when it is empty or when the user writes to the FIFO when it is full. The memory allocated to the FIFO may become corrupt, but the memory surrounding it should not. Here is the implementation behind the header file’s specification:

/************************************************************************
Copyright (c) 2011, Nic McDonald
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met: 

1. Redistributions of source code must retain the above copyright 
   notice, this list of conditions and the following disclaimer. 
2. Redistributions in binary form must reproduce the above 
   copyright notice, this list of conditions and the following 
   disclaimer in the documentation and/or other materials provided 
   with the distribution. 

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE 
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS 
OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR 
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE 
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

*************************************************************************

 Information:
   File Name  :  fifo.c
   Author(s)  :  Nic McDonald
   Hardware   :  Any
   Purpose    :  First In First Out Buffer

*************************************************************************
 Modification History:
   Revision   Date         Author    Description of Revision
   1.00       05/30/2011   NGM       initial

*************************************************************************
 Theory of Operation:
   This FIFO implementation provides a memory safe 'First In First Out'
   circular buffer.  If the operating conditions of a FIFO causes it
   to 'underflow' or 'overflow' the FIFO will not corrupt memory other
   than its own data buffer.  However, memory accesses into the buffer
   will be invalid.  If a FIFO 'underflows' or 'overflows', it should
   be re-initialized or cleared.

   Example Usage:
      uint8_t fifo_buf[128];
      FIFO fifo;
      fifo_init(&fifo, 128, fifo_buf);

************************************************************************/

#include "fifo.h"

void fifo_init(FIFO* f, uint32_t size, uint8_t* data) {
    f->size     = size;
    f->data     = data;
    f->status   = FIFO_GOOD;
    f->putIndex = 0;
    f->getIndex = 0;
    f->used     = 0;
}

uint32_t fifo_isFull(FIFO* f) {
    return (f->used >= f->size);
}

uint32_t fifo_isEmpty(FIFO* f) {
    return (f->used == 0);
}

uint8_t fifo_get(FIFO* f) {
    uint8_t c;
    if (f->used > 0) {
        c = f->data[f->getIndex];
        f->getIndex = (f->getIndex+1) % f->size;
        f->used--;
        return c;
    }
    else {
        f->status = FIFO_UNDERFLOW;
        return 0;
    }
}

void fifo_put(FIFO* f, uint8_t c) {
    if (f->used >= f->size)
        f->status = FIFO_OVERFLOW;
    else {
        f->data[f->putIndex] = c;
        f->putIndex = (f->putIndex+1) % f->size;
        f->used++;
    }
}

uint8_t fifo_peek(FIFO* f) {
    return f->data[f->getIndex];
}

uint32_t fifo_available(FIFO* f) {
    return f->used;
}

void fifo_clear(FIFO* f) {
    f->status = FIFO_GOOD;
    f->putIndex = 0;
    f->getIndex = 0;
    f->used = 0;
}

uint8_t fifo_status(FIFO* f) {
    return f->status;
}

How to use a FIFO

Previously I mentioned that a FIFO is a method for synchronizing two asynchronous data flows. If these two data flows are on different threads (including interrupts), extreme care must be taken when accessing the FIFO. FIFOs are often used in UART drivers.

Let’s consider a case where a FIFO is used to bridge the gap between a UART receiver and some user code. A FIFO works great in this situation because UART data arrives in bursts and the user code may not be able to handle the data immediately. The FIFO allows the user to pull the data at its own speed, as long as the FIFO doesn’t overflow.

In this case there are two threads accessing the FIFO. The UART receive interrupt can come at any time and will interrupt the user’s code. The FIFO’s functionality is heavily based on a variable that represents how many bytes are currently in the FIFO (in my code it is ‘used’). The interrupt code will be using the ‘fifo_put()’ function and the user code will be using the ‘fifo_get()’ function. Both functions modify the ‘used’ variable. If proper synchronization techniques are not used, the interrupt code might call ‘fifo_put()’ right in the middle of the user calling ‘fifo_get()’. This could cause the ‘used’ variable to become corrupted and the FIFO would then be unusable. Fortunately, in the interrupt case, the user code just needs to temporarily turn off the UART receive interrupt while calling ‘fifo_get()’. For a multi-threaded design, semaphores should be used to properly access the FIFO functions without corrupting the variables.
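To see why ‘used’ can be corrupted, remember that a decrement is really three steps: load, subtract, store. The sketch below replays the bad interleaving deterministically on a host. It is not real concurrency, and the function names are hypothetical; each line is labeled with which side (user code or interrupt) would be executing it on the target.

```c
#include <assert.h>
#include <stdint.h>

/* Unprotected case: the interrupt fires between the load and the store
 * that make up 'used--' in fifo_get(). */
static uint32_t replay_unprotected(uint32_t used) {
    uint32_t tmp;
    assert(used > 0);      /* fifo_get() only decrements a non-empty FIFO */
    tmp = used;            /* user code: load 'used'                      */
    used = used + 1;       /* interrupt: fifo_put() does used++           */
    used = tmp - 1;        /* user code: store the stale result           */
    return used;           /* the byte added by the interrupt is lost     */
}

/* Protected case: the receive interrupt is turned off around the get,
 * so the put is deferred until the decrement has completed. */
static uint32_t replay_protected(uint32_t used) {
    assert(used > 0);
    used = used - 1;       /* user code: whole 'used--' runs undisturbed  */
    used = used + 1;       /* interrupt: runs once it is turned back on   */
    return used;
}
```

Starting from five bytes in the FIFO, one get plus one put should leave five. The unprotected replay ends with four, so ‘used’ no longer matches the put and get indices. The protected replay ends with five.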

LPC176x I2C Driver

I’ve had a few requests for an LPC176x I2C driver. During my development process on the LPC1768 LPCXpresso board, I wanted to design a simple I2C driver but I couldn’t find any simple examples. Most of the drivers out there are complex and don’t offer easy functionality for those who need a simple master-only send/receive interface. I believe this one is simple enough to learn from.

Header File

Before writing a driver, you first need to make a specification of the interface.  I wanted my driver to be a basic send/receive interface where the slave is specified by address and the buffer is pre-allocated.  Here is the header file to my driver.

/*****************************************************************************
Copyright (c) 2011, Nic McDonald
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
******************************************************************************
Copyright 2011
All Rights Reserved

Information:
File Name : i2c0.h
Author(s) : Nic McDonald
Project : Quadrotor
Hardware : LPCXpresso LPC1768
Purpose : I2C Driver

******************************************************************************
Modification History:
Revision Date Author Description of Revision
1.00 03/04/2011 NGM initial

******************************************************************************
Warning:
This I2C implementation is only for master mode. It also only
gives one transfer per transaction. This means that this driver
only does 'send' or 'receive' per function call. The user
functions 'receive' and 'send' are NOT thread safe.

*****************************************************************************/
#ifndef _I2C0_H_
#define _I2C0_H_

/* includes */
#include <stdlib.h>
#include <stdint.h>
#include "LPC17xx.h"

/* defines */
#define MODE_100kbps 100000
#define MODE_400kbps 400000
#define MODE_1Mbps 1000000

/* typedefs */

/* functions */

// Initialize the I2C hardware.
// see 'readme'
void i2c0_init(uint32_t i2c_freq, uint8_t int_pri);

// Performs an I2C master send function.
// Returns the number of bytes sent successfully.
// Returns 0xFFFFFFFF if slave did not respond on bus.
// This is NOT thread safe.
uint32_t i2c0_send(uint8_t address, uint8_t* buffer, uint32_t length);

// Performs an I2C master receive function.
// Returns the number of bytes received successfully.
// Returns 0xFFFFFFFF if slave did not respond on bus.
// This is NOT thread safe.
uint32_t i2c0_receive(uint8_t address, uint8_t* buffer, uint32_t length);

/*** DEBUG ***/uint8_t* i2c_buf(void);
/*** DEBUG ***/uint32_t i2c_pos(void);

#endif /* _I2C0_H_ */

Source File

Now that we have a good interface, let’s see what we need to implement.

/*****************************************************************************
Copyright (c) 2011, Nic McDonald
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
1. Redistributions of source code must retain the above copyright
   notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above
   copyright notice, this list of conditions and the following
   disclaimer in the documentation and/or other materials provided
   with the distribution.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
******************************************************************************
                                Copyright 2011
                             All Rights Reserved

 Information:
   File Name  :  i2c0.c
   Author(s)  :  Nic McDonald
   Project    :  Quadrotor
   Hardware   :  LPCXpresso LPC1768
   Purpose    :  I2C Driver

******************************************************************************
 Modification History:
   Revision   Date         Author    Description of Revision
   1.00       03/04/2011   NGM       initial

*****************************************************************************/
#include "i2c0.h"

// I2C control bits
#define AA      (1 << 2)
#define SI      (1 << 3)
#define STO     (1 << 4)
#define STA     (1 << 5)
#define I2EN    (1 << 6)

// pointers setup by users functions
static volatile uint8_t  slave_address; // formatted by send or receive
static volatile uint8_t* buf;
static volatile uint32_t buf_len;
static volatile uint32_t num_transferred;
static volatile uint32_t i2c0_busy;

static inline uint8_t to_read_address(uint8_t address);
static inline uint8_t to_write_address(uint8_t address);

/*************DEBUG**************************************************************************************/
uint8_t i2c_status_buf[100];
uint32_t i2c_status_pos;
uint8_t* i2c_buf(void) {return i2c_status_buf;}
uint32_t i2c_pos(void) {return i2c_status_pos;}
/*************DEBUG**************************************************************************************/

LPC_I2C_TypeDef*  regs;
IRQn_Type         irqn;
uint32_t ignore_data_nack = 1;


void i2c0_init(uint32_t i2c_freq, uint8_t int_pri) {
    uint32_t pclk, fdiv;

    regs = LPC_I2C0;
    irqn = I2C0_IRQn;

    // setup initial state
    i2c0_busy = 0;
    buf = NULL;
    buf_len = 0;
    num_transferred = 0;

    // give power to the I2C hardware
    LPC_SC->PCONP |= (1 << 7);

    // set PIO0.27 and PIO0.28 to I2C0 SDA and SCL
    LPC_PINCON->PINSEL1 &= ~0x03C00000;
    LPC_PINCON->PINSEL1 |=  0x01400000;

    // set peripheral clock selection for I2C0
    LPC_SC->PCLKSEL0 &= ~(3 << 14); // clear bits
    LPC_SC->PCLKSEL0 |=  (1 << 14); // set to "01" (full speed)
    pclk = SystemCoreClock;

    // clear all flags
    regs->I2CONCLR = AA | SI | STO | STA | I2EN;

    // determine the frequency divider and set corresponding registers
    //  this makes a 50% duty cycle
    fdiv = pclk / i2c_freq;
    regs->I2SCLL = fdiv >> 1; // fdiv / 2
    regs->I2SCLH = fdiv - (fdiv >> 1); // compensate for odd dividers

    // enable the I2C0 interrupt in the NVIC
    NVIC_EnableIRQ(irqn);

    // set the priority of the interrupt
    NVIC_SetPriority(irqn, int_pri); // '0' is highest

    // enable the I2C (master only)
    regs->I2CONSET = I2EN;
}

uint32_t i2c0_send(uint8_t address, uint8_t* buffer, uint32_t length) {
    // check software FSM
    if (i2c0_busy)
        //error_led_trap(0x11000001, i2c0_busy, 0, 0, 0, 0, 0, 0, 0);
        return 0;

    // set status to 'busy'
    i2c0_busy = 1;

    // setup pointers
    slave_address = to_write_address(address);
    buf = buffer;
    buf_len = length;
    num_transferred = 0;

    // trigger a start condition
    regs->I2CONSET = STA;

    // wait for completion
    while (i2c0_busy);

    // get how many bytes were transferred
    return num_transferred;
}

uint32_t i2c0_receive(uint8_t address, uint8_t* buffer, uint32_t length) {
    // check software FSM
    if (i2c0_busy)
        //error_led_trap(0x11000002, i2c0_busy, 0, 0, 0, 0, 0, 0, 0);
        return 0;

    // set status to 'busy'
    i2c0_busy = 1;

    // setup pointers
    slave_address = to_read_address(address);
    buf = buffer;
    buf_len = length;
    num_transferred = 0;

    // trigger a start condition
    regs->I2CONSET = STA;

    // wait for completion
    while (i2c0_busy);

    // get how many bytes were transferred
    return num_transferred;
}

void I2C0_IRQHandler(void) {
    // get reason for interrupt
    uint8_t status = regs->I2STAT;

    // ignore data nack when control is true
    if ((status == 0x30) && (ignore_data_nack))
            status = 0x28;

    // LPC17xx User Manual page 443:
    //      "...read and write to [I2DAT] only while ... the SI bit is set"
    //      "Data in I2DAT remains stable as long as the SI bit is set."


    /**************************************DEBUG************************************************************/
    i2c_status_buf[i2c_status_pos] = status;
    i2c_status_pos++;
    if (i2c_status_pos > 99)
        i2c_status_pos = 0;
    /**************************************DEBUG************************************************************/


    switch(status) {

    // Int: start condition has been transmitted
    // Do:  send SLA+R or SLA+W
    case 0x08:
        regs->I2DAT = slave_address; // formatted by send or receive
        regs->I2CONCLR = STA | SI;
        break;

    // Int: repeated start condition has been transmitted
    // Do:  send SLA+R or SLA+W
    //case 0x10:
    //    regs->I2DAT = slave_address;
    //    regs->I2CONCLR = STA | SI;
    //    break;

    // Int: SLA+W has been transmitted, ACK received
    // Do:  send first byte of buffer if available
    case 0x18:
        if (num_transferred < buf_len) {
            regs->I2DAT = buf[0];
            regs->I2CONCLR = STO | STA | SI;
        }
        else {
            // nothing to send: stop and release the waiting caller
            regs->I2CONCLR = STA | SI;
            regs->I2CONSET = STO;
            i2c0_busy = 0;
        }
        break;

    // Int: SLA+W has been transmitted, NACK received
    // Do:  stop!
    case 0x20:
        regs->I2CONCLR = STA | SI;
        regs->I2CONSET = STO;
        num_transferred = 0xFFFFFFFF;
        i2c0_busy = 0;
        break;

    // Int: data byte has been transmitted, ACK received
    // Do:  load next byte if available, else stop
    case 0x28:
        num_transferred++;
        if (num_transferred < buf_len) {
            regs->I2DAT = buf[num_transferred];
            regs->I2CONCLR = STO | STA | SI;
        }
        else {
            regs->I2CONCLR = STA | SI;
            regs->I2CONSET = STO;
            i2c0_busy = 0;
        }
        break;

    // Int: data byte has been transmitted, NACK received
    // Do:  stop!
    case 0x30:
        regs->I2CONCLR = STA | SI;
        regs->I2CONSET = STO;
        i2c0_busy = 0;
        break;

    // Int: arbitration lost in SLA+R/W or Data bytes
    // Do:  release bus
    case 0x38:
        regs->I2CONCLR = STO | STA | SI;
        i2c0_busy = 0;
        break;

    // Int: SLA+R has been transmitted, ACK received
    // Do:  determine if byte is to be received
    case 0x40:
        if (num_transferred < buf_len) {
            regs->I2CONCLR = STO | STA | SI;
            regs->I2CONSET = AA;
        }
        else {
            regs->I2CONCLR = AA | STO | STA | SI;
        }
        break;

    // Int: SLA+R has been transmitted, NACK received
    // Do:  stop!
    case 0x48:
        regs->I2CONCLR = STA | SI;
        regs->I2CONSET = STO;
        num_transferred = 0xFFFFFFFF;
        i2c0_busy = 0;
        break;

    // Int: data byte has been received, ACK has been returned
    // Do:  read byte, determine if another byte is needed
    case 0x50:
        buf[num_transferred] = regs->I2DAT;
        num_transferred++;
        if (num_transferred < buf_len) {
            regs->I2CONCLR = STO | STA | SI;
            regs->I2CONSET = AA;
        }
        else {
            regs->I2CONCLR = AA | STO | STA | SI;
        }
        break;

    // Int: data byte has been received, NACK has been returned
    // Do:  transfer is done, stop.
    case 0x58:
        regs->I2CONCLR = STA | SI;
        regs->I2CONSET = STO;
        i2c0_busy = 0;
        break;

    // something went wrong, trap error
    default:
        while (1); // flash an LED or something :(
        break;

    }
}

static inline uint8_t to_read_address(uint8_t address) {
    return (address << 1) | 0x01;
}
static inline uint8_t to_write_address(uint8_t address) {
    return (address << 1);
}

As you can see, the implementation is fairly simple except for the interrupt handler. Fortunately, NXP is a great vendor when it comes to documentation. The user manual (found here) explains everything in detail. In fact, the state machine implemented in my driver’s interrupt handler is taken directly from the instructions in the manual. Each time an I2C event occurs, the I2C peripheral raises an interrupt and reports a status code. The user manual tells you exactly what to do for each status code. Using a large switch/case statement, as I have done, keeps the interrupt handling time very short.

I left some debugging code in there because I found it extremely useful. The ‘i2c_buf’ and ‘i2c_pos’ functions allow me to retrieve information about the I2C transfer. The ‘i2c0_send’ and ‘i2c0_receive’ functions are mostly disconnected from the interrupt handler, so there is no good way to figure out what went wrong when something does. Using a small buffer lets me see the order in which the interrupt status codes arrive. This allows me to determine what went wrong and why. This debug buffer isn’t flawless. I only ever used it to look at a single transaction’s worth of status codes. I suggest removing it from the code once you have verified that the driver works for you.
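To make the debug buffer easier to read, the raw status codes can be turned into text. The following is only a sketch: the helper names ‘i2c_status_name’ and ‘i2c_trace_first_error’ are hypothetical and not part of the driver, and the descriptions simply mirror the comments in the interrupt handler above. Note that when ‘ignore_data_nack’ is set, the handler records 0x28 in place of 0x30, so a data NACK will not show up in the trace.

```c
#include <stdint.h>

// Maps a master-mode I2C status code to a short description.
// Only the codes handled by the driver's switch statement are listed.
const char* i2c_status_name(uint8_t status) {
    switch (status) {
    case 0x08: return "START sent";
    case 0x10: return "repeated START sent";
    case 0x18: return "SLA+W sent, ACK received";
    case 0x20: return "SLA+W sent, NACK received";
    case 0x28: return "data sent, ACK received";
    case 0x30: return "data sent, NACK received";
    case 0x38: return "arbitration lost";
    case 0x40: return "SLA+R sent, ACK received";
    case 0x48: return "SLA+R sent, NACK received";
    case 0x50: return "data received, ACK returned";
    case 0x58: return "data received, NACK returned";
    default:   return "unknown";
    }
}

// Returns the index of the first status code that indicates a failed
// transfer (address NACK, data NACK, or lost arbitration).
// Returns 'len' if no such code is found.
uint32_t i2c_trace_first_error(const uint8_t* trace, uint32_t len) {
    for (uint32_t i = 0; i < len; i++) {
        switch (trace[i]) {
        case 0x20:
        case 0x30:
        case 0x38:
        case 0x48:
            return i;
        default:
            break;
        }
    }
    return len;
}
```

After a transfer, the trace returned by ‘i2c_buf’ (with ‘i2c_pos’ as the length, as long as the buffer has not wrapped) can be passed to these helpers from the debugger or a console.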

Conclusion

I hope that no one takes this code and uses it.  Instead, I’d hope that you’d take this code, verify it works in your system, then use it to start working on your own driver!  Making an I2C driver is a lot of fun and allows you to write code that heavily interacts with the hardware.  Making a finite state machine around the I2C status codes will really help you learn driver development. I2C is one of the more complicated protocols. UART, SPI, etc. are much easier and are a better starting point for a beginner. USB, Ethernet, CAN, etc. are more complicated than I2C. I2C presents a nice bridge between the extremely easy and the extremely hard.

Sample Usage:

#include <stdio.h>
#include "i2c0.h"
int main(void) {
    i2c0_init(MODE_400kbps, 3);
    uint8_t buf[100] = "hello";
    uint8_t slave = 0xEE >> 1; // 7-bit address; the driver adds the R/W bit
    uint32_t res;
    if ((res = i2c0_send(slave, buf, sizeof(buf))) == 0xFFFFFFFF)
        /* slave did not respond on bus */;
    // leave room for the terminating '\0'
    if ((res = i2c0_receive(slave, buf, sizeof(buf) - 1)) == 0xFFFFFFFF)
        /* slave did not respond on bus */;
    else {
        buf[res] = '\0';
        printf("Slave responded: %s\n", (char*)buf);
    }
    return 0;
}

Digital System Resets

Designing a reset architecture for a digital device such as an ASIC, FPGA, CPLD, etc. can be challenging.  Resets are a common culprit of metastability and unpredictable behavior.  Here I will discuss various reset architectures and how to properly use them.

Before you can begin to understand resets you must first understand flip-flops.  Flip-flops are the basic building block of all digital synchronous circuits.  Flip-flops are used to hold state between clock edges.  Flip-flops come in MANY varieties.  Flip-flops usually have between 0 and 2 signals that represent some sort of “reset”.  The 3 most common flip-flops are shown below (clock enables not shown):

Non-Resettable Flip-Flop:

Flip-flops don’t actually need any reset logic built-in.  External logic such as multiplexers can be used to emulate all the functionality of internal reset logic.  However, adding reset logic to the flip-flop directly greatly reduces the overall logic footprint.

Asynchronous Resettable Flip-Flop:

An asynchronous reset scheme forces a flip-flop to take on a value whenever a specific signal is active.  The two asynchronous signals are typically called “preset” and “clear”.  Using positive logic, when the “preset” line is high, the output of the flip-flop is immediately forced high independent of the clock’s state and the input data.  Likewise, when the “clear” line is high, the output is forced low.

This waveform shows a simple asynchronous reset process.  On the first rising clock edge, the output ‘Q’ is set low because the input ‘D’ is low.  On the second rising clock edge, the output now goes high as a result of the input.  Between clock 2 and 3 the asynchronous clear signal goes high.  As soon as the signal reaches a full logic level 1, the output of the flip-flop is immediately forced low.

Synchronous Resettable Flip-Flop:

A synchronous reset scheme forces a flip-flop to take on a value when a specific signal is active during an active clock edge.  The two synchronous signals are typically called “set” and “reset”.  Using positive logic and positive clock edges, when the “set” line is high during a positive clock edge, the flip-flop is forced high independent of the input data.  Likewise, when the “reset” line is high during a positive clock edge, the flip-flop is forced low.

This waveform shows a simple synchronous reset process.  On the first rising clock edge, the output ‘Q’ is set low because the input ‘D’ is low.  On the second rising clock edge, the output now goes high as a result of the input.  Between clock 2 and 3 the synchronous reset signal goes high.  This change does not affect the flip-flop output value until the third rising clock edge.  At this point the output is driven low even though the input signal is still high.
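The difference between the two resettable flip-flops can also be written down as a small behavioral model. This is only a software analogy in C, not HDL, and the function names ‘dff_async_clear’ and ‘dff_sync_reset’ are made up for illustration. Each function takes the current output and the inputs and returns the next output; ‘rising_edge’ is assumed to be true only at the instant of a rising clock edge.

```c
#include <stdbool.h>

// Asynchronous clear: the clear input is honored at any time,
// regardless of the clock.
bool dff_async_clear(bool q, bool d, bool clear, bool rising_edge) {
    if (clear)
        return false;   // forced low immediately
    if (rising_edge)
        return d;       // normal operation: sample D on the edge
    return q;           // otherwise hold the current value
}

// Synchronous reset: the reset input is only looked at on a
// rising clock edge.
bool dff_sync_reset(bool q, bool d, bool reset, bool rising_edge) {
    if (!rising_edge)
        return q;       // nothing changes between clock edges
    return reset ? false : d;
}
```

With the output high and the reset active between clock edges, ‘dff_async_clear’ returns low right away while ‘dff_sync_reset’ keeps returning high until it is called with ‘rising_edge’ set, which matches the two waveforms described above.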

What Needs to be Reset?

A good reset design approach is “reset only what needs it”.  Things that need to be reset are flip-flops that must be put in a known state. Common examples are: finite state machine flip-flops; incrementing or decrementing counters; and control pipelines.

In general, data paths do not need to be reset.  Adding a reset to a large data path can cause excessive resource usage and routing delays.  Take care when deciding which flip-flops need to be reset.

Asynchronous/Synchronous Comparison:

Before deciding what reset architecture to use, let’s first define the advantages and disadvantages of the two styles.

Advantages of Asynchronous Resets:
  • Flip-flops immediately take the value of reset without dependence on a clock edge.
  • No signal synchronization needed for asynchronous input reset signals (like a push button reset).
Disadvantages of Asynchronous Resets:
  • Coming out of reset often causes metastability.
  • Chip-wide asynchronous resets cause modules to come out of reset at different times due to inconsistent delay paths.

Advantages of Synchronous Resets:

  • All modules come out of reset at the same time and timing assumptions can safely be made about module interfaces.
  • All clock/reset timing is taken care of by standard synthesis.
Disadvantages of Synchronous Resets:
  • Designs with large area will use excessive routing resources while trying to meet timing constraints.
  • Relies on the existence of a clock.  Signals won’t be reset until an active clock edge.

Note: this topic applies to all types of digital devices.  Each device type (FPGA, ASIC, etc.) will have optimal setups, but understanding your options will help you decide how to safely reset your device.

The Asynchronous Reset Problem:

For asynchronous resets, going into a reset state isn’t a problem.  When software tools are synthesizing and placing components, asynchronous resets are a simple task because they are not related to a specific clock and have no timing constraints.

Asynchronous resets create a problem when the reset signal is being deactivated.  If the reset is released near an active clock edge the results of that clock cycle are unknown.  The following waveform shows this scenario:

At the start of the second clock cycle the clock rises and the clear signal falls.  What should the flip-flop be set to?  Will the input ‘D’ win the fight or will the clear signal?  The answer is that we don’t know.  Not knowing the state of a signal will certainly cause issues.  An even bigger problem is the violation of the setup and hold time requirements of the flip-flop.  Violating these requirements results in metastability.

Consider a state machine that has 3 states and is one-hot encoded with 001, 010, and 100.  Now consider the asynchronous deactivation problem.  What if bits 1 and 2 got reset but bit 3 did not?  The state could then be 101 and the circuit’s logic would consider the state machine to be in two states simultaneously.  Obviously this would kill the design.
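As a small illustration of why 101 is an illegal state, here is a sketch of a one-hot check written in C. The function name ‘is_one_hot’ is hypothetical. A value is one-hot when exactly one bit is set, which can be tested by clearing the lowest set bit and checking that nothing remains.

```c
#include <stdbool.h>
#include <stdint.h>

// Returns true if exactly one bit of 'state' is set.
bool is_one_hot(uint32_t state) {
    // state & (state - 1) clears the lowest set bit;
    // a one-hot value has no other bits left afterwards.
    return state != 0 && (state & (state - 1)) == 0;
}
```

The three legal states 001, 010, and 100 pass this check, while 101 (and 000) do not.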

Some designers attempt to overcome this problem by first synchronizing the reset to the appropriate clock domain then using it as a synchronous reset.  If this new synchronous reset is used globally, you’ve effectively converted your design to a synchronous reset architecture.  If the new reset signal is only used locally, you’ll create problems due to not knowing exactly when adjacent modules are in or out of reset.

The Synchronous Reset Problem:

Unlike asynchronous resets, synchronous resets must travel between flip-flops in one clock period.  During synthesis and place & route, the software tools will ensure that each reset signal will arrive at its destination before the active clock edge that it triggers on.  This may seem like a good thing because the designer now doesn’t need to worry about violating the setup and hold times of the flip-flops being reset.  This is true, but only on a small scale.

Synchronous resets, specifically global synchronous resets, create routing problems that lead to sub-optimal timing results.  Using a global synchronous reset effectively means that every block must see the same reset signal every clock cycle.  Routing one signal to all locations of a chip in one clock cycle requires a massive amount of routing resources or, depending on the clock speed and die size, is impossible.

Consider a large design with 3 major sub-designs.  Each sub-design must communicate with all other sub-designs so it is important to know that each block comes out of reset on the same clock cycle.  This is the main idea behind a global synchronous reset.

The small red block is a module that synchronizes the input reset to the clock in order to provide a synchronous reset to the rest of the chip.  Now consider the results if all 3 blocks directly use the reset as a synchronous reset.  All flip-flops using the reset signal will draw current from the reset source.  For a large design, the fanout of this structure will cause most designs to fail static timing analysis.

Solution:

From our discussion thus far, it’s apparent that working with synchronous resets is easier because the software tools will provide proper timing.  The first thing we need to do is synchronize the asynchronous input reset to our clock domain.  The synchronizer below outputs a reset that activates asynchronously and deactivates synchronously.  Using this style of synchronizer gives us the advantages of asynchronous resets and the safety of synchronous resets.

Now that we have a good reset signal we need to spread it across the chip efficiently.  We will create ‘M’ parallel reset pipelines of ‘N’ flip-flops.  ‘M’ is the number of major blocks the design contains.  ‘N’ is determined according to clock speed and die size.  It needs to be high enough such that each reset pipeline can meet timing while delivering the reset to the desired location.

This figure shows M=3 and N=6.  The 3 separate pipelines are of equal length so each of the 3 blocks will receive the reset at the same time.  The 6 pipeline stages allow the place & route tools to easily make it across the chip while still meeting timing.  The pipeline stages work just like the synchronizer in that they produce an asynchronous reset assertion and a synchronous reset deassertion.
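The behavior of one such pipeline can be sketched as a software model. This is C rather than HDL, so treat it as an analogy only; the type ‘ResetPipe’ and the ‘reset_pipe_*’ functions are hypothetical names, and the model assumes each stage has an asynchronous preset tied to the input reset and takes its data from the previous stage.

```c
#include <stdbool.h>

#define RESET_STAGES 6  // 'N' in the text

typedef struct {
    bool stage[RESET_STAGES];  // true = reset asserted at that stage
} ResetPipe;

// Asynchronous assertion: every stage is forced into reset at once,
// without waiting for a clock edge.
void reset_pipe_assert(ResetPipe* p) {
    for (int i = 0; i < RESET_STAGES; i++)
        p->stage[i] = true;
}

// One rising clock edge with the input reset released:
// the "not in reset" value shifts one stage further down the pipe.
void reset_pipe_clock(ResetPipe* p) {
    for (int i = RESET_STAGES - 1; i > 0; i--)
        p->stage[i] = p->stage[i - 1];
    p->stage[0] = false;
}

// The reset seen by the block at the end of the pipeline.
bool reset_pipe_out(const ResetPipe* p) {
    return p->stage[RESET_STAGES - 1];
}
```

In this model the output goes into reset immediately when ‘reset_pipe_assert’ is called, and it leaves reset exactly N clock edges after the input is released. Because all M pipelines have the same length, every block would leave reset on the same clock edge.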

After the HDL is in place to generate the circuits described above, synthesis and timing constraints must be used in order for this reset architecture to work.

  1. A synthesis directive must be placed on all flip-flops in the pipeline stages informing the synthesizer to keep all flip-flop instances.  By default, the synthesizer will see that the pipeline stages are parallel versions of each other and “optimize” them away.  For Synopsys constraints, the “syn_keep” directive will perform this task.
  2. In order to use the advantage of the asynchronously asserted reset, the reset must be used asynchronously in the HDL.  Because it is asynchronous, the synthesizer will assume no timing dependencies relative to the clock.  However, we must guarantee that the deassertion of the reset is synchronous.  A place & route constraint must be placed between all stages of the pipeline and between the last stage and its destination.  The constraint must ensure that the reset reaches its destination without violating the setup and hold times of the input flip-flops.  If the reset is used synchronously, this step can be skipped.

Other Links:

EETimes: How do I reset my FPGA?

Assertions in Microcontrollers

Introduction to Assertions:

Assertions are a simple way of testing your programming logic.  The basic idea behind assertions is testing a piece of code that you assume will always work as designed.  As programmers, we inherently do this all the time while writing code.  We find ourselves writing small print statements to check that what we just coded produced the right output.  Assertions are a formal way of doing this.  The typical assertion header file looks like this:

#ifndef _ASSERT_H_
#define _ASSERT_H_

#if ENABLE_ASSERTIONS==1
void assertion_failure(char* expr, char* file, char* baseFile, int line);
#define assert(expr) \
    if (expr) ; \
    else assertion_failure(#expr,__FILE__,__BASE_FILE__,__LINE__)
#else
#define assert(expr) /*nothing here*/
#endif

#endif // _ASSERT_H_

And a programmer would use it like this:

#define ENABLE_ASSERTIONS 1
#include "assert.h"
double nasa_rocket() {
    int i;

    // thrust value for the four engines
    double thrust_vals[4];

    // average thrust percentage
    double thrust_percentage;

    // get thrust from all rocket engines
    for (i=0; i<4; i++)
        thrust_vals[i] = getMotorThrust(i);

    // call function to compute average thrust percentage
    thrust_percentage = computeAverageThrustPercentage(thrust_vals);

    // double check programming logic
    assert(thrust_percentage >= 0.0);
    assert(thrust_percentage <= 100.0);

    return thrust_percentage;
}

As shown in the first code snippet, when an assertion fails, it calls ‘assertion_failure()’ and passes information about the tested expression, file names, and line number.  This function usually uses ‘fprintf()’ to write this information to the error console.  After printing the information, the ‘assertion_failure()’ function either calls a system halt function, kills the program, or puts itself into an infinite loop.

The NASA software engineer could verify here that the returned ‘thrust_percentage’ would not be out of the standard bounds of percentages (0% to 100%).  This would not be a check you’d want to do in code because it should always work and extra checks will just slow it down.  However, you do want to make sure it works before sending your billion dollar rocket into space.  After the engineer has verified that the code works, he can simply disable assertions using the ‘ENABLE_ASSERTIONS’ define statement.  Doing this causes the C preprocessor to produce blank lines for all calls to ‘assert()’.  After all, it’s better to use invalid numbers than to cause an infinite loop inside an ‘assertion_failure()’ function!

How to assert in a microcontroller:

The first program you write for a microcontroller is much different than the first program you write for a personal computer.  You don’t have any libraries or operating system to support the standard “Hello World!” program.  Usually your first program is making an LED flash.  GPIO is the easiest thing to implement in a microcontroller.  Setting up a console for a microcontroller is a big process.  What happens if you wish to use assertions to test the console driver you are creating?  How can you assert without a console to print to?  There are many answers to this.  This post attempts to give the most robust implementation for microcontrollers.

First things first: you need a working LED that you can set aside for assertion failures.  That’s it!  Using some simple code, you can be notified (by the LED) of a failure and check where it came from using the debugger.

Here is my implementation for assertions:

#ifndef _ASSERTS_H_
#define _ASSERTS_H_

void assert_init(void (*assert_indicator)(void));
#if ENABLE_ASSERTIONS==1
void assertion_failure(char* expr, char* file, char* basefile, int linenum);
#define assert(expr) \
    if (expr) ; \
    else assertion_failure(#expr,__FILE__,__BASE_FILE__,__LINE__)
#else
#define assert(expr) /*nothing*/
#endif // ENABLE_ASSERTIONS==1

#endif // _ASSERTS_H_

And the matching source file:

#define ENABLE_ASSERTIONS 1
#include <stddef.h> // for NULL
#include "asserts.h"

// function pointer to assertion indicator
static void (*assert_ind)(void) = NULL;

// sets user's choice of failure indicator function
void assert_init(void (*assert_indicator)(void)) {
    assert_ind = assert_indicator;
}

// failure function called by macro.
#if ENABLE_ASSERTIONS==1
void assertion_failure(char* expr, char* file, char* basefile, int linenum) {
    // debugger can look at these (the variables themselves are
    // volatile so the stores below are not optimized away)
    const char* volatile v_expr;
    const char* volatile v_file;
    const char* volatile v_basefile;
    volatile int v_linenum;

    // assign variables to volatiles
    v_expr     = expr;
    v_file     = file;
    v_basefile = basefile;
    v_linenum  = linenum;

    // call function that indicates an assertion failure (if one was set)
    if (assert_ind != NULL)
        assert_ind();

    // halt
    while (1);
}
#endif

As shown, my implementation has an extra function for initialization. The user simply passes ‘assert_init()’ a pointer to the function that will notify the user of an assertion failure.  The simplest solution is a function that turns on an LED designated for assertion failure indication.

The ‘assertion_failure()’ implementation simply copies the input values to local volatile versions.  It then calls the user’s function (which illuminates the LED) and puts itself into an infinite loop.

When the user sees the LED illuminate, he knows an assertion has failed.  Using the debugger, he can then pause the program (which is still in the ‘assertion_failure()’ function) and inspect the variables which indicate the failing expression, file location of the failing expression, and line number within the file of the failing expression.  Using this information, he can then fix the mistake(s).
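As a usage sketch, this is roughly what registering an LED as the failure indicator could look like. Everything hardware-related here is an assumption: ‘fake_gpio_set’ stands in for whatever memory-mapped GPIO register drives your LED, ‘ASSERT_LED_PIN’ is an arbitrary pin number, and ‘board_setup’ is a hypothetical startup function. The ‘assert_init()’ code is repeated from the listing above only so that the sketch compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

// Repeated from the listing above so this sketch compiles on its own.
void (*assert_ind)(void) = NULL;
void assert_init(void (*assert_indicator)(void)) {
    assert_ind = assert_indicator;
}

// ASSUMPTION: stand-in for a memory-mapped GPIO "set" register.
// On real hardware this would be the register for the LED's port.
volatile uint32_t fake_gpio_set = 0;
#define ASSERT_LED_PIN 22u  // hypothetical pin number

// Failure indicator: turns the assertion LED on.
void assert_led_on(void) {
    fake_gpio_set |= (1u << ASSERT_LED_PIN);
}

// Call once, early in startup, before any assert() can fire.
void board_setup(void) {
    assert_init(assert_led_on);
}
```

Registering the indicator as early as possible matters, because an assertion that fails before ‘assert_init()’ has been called has no way to signal the failure.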

Proportional-Integral-Derivative (PID) Controller

PID Intro:

At the heart of any real control system is a feedback controller.  When the system’s process is unknown or hard to model, a Proportional-Integral-Derivative (PID) controller is an efficient method for control.  If the system’s process is known, a custom controller will yield higher efficiency.  However, modeling complex systems in an attempt to design a custom controller may result in a much higher development time and cost than can be afforded.  In both cases, a PID controller is a very efficient method for controlling all types of systems.

thanks to wikipedia.org for the diagram

PID Theory:

PID controllers can be understood (partially) without an in-depth lecture on control theory. PID controllers use 3 sub-controllers combined into 1 controller using a simple sum. The 3 controllers are:

  • Proportional:
    The proportional section of a PID controller is a basic intuitive approach to feedback control. A naive approach to feedback control would say, the farther away from perfect the system is, the more it should work to get perfect. In a perfect world without friction, momentum, etc., this system alone would work great! This is proportional control. However, our world is imperfect and we need to add smarter feedback compensation.
  • Integral:
    The integral section of a PID controller compensates for environmental imperfections such as friction and wind that keep the system from reaching its perfect state. An integral controller works by keeping a sum of all the error the system has seen. In calculus, this is equivalent to the area underneath the curve up to the current point. The controller increases its control signal as the summed error gets larger.
  • Derivative:
    The derivative section of a PID controller compensates for environmental imperfections such as momentum, which causes the system to overshoot its perfect state. The derivative controller lessens its control signal as the speed at which the system approaches its perfect state increases. In calculus, this is the slope of the error signal.

Example:

Let’s consider an example to show how a PID controller could be used. Our example will be an airborne GPS navigation system in a helicopter. The aircraft knows where it is and where it is being told to be. Using basic subtraction, it can figure out the distance to the desired location. This is the error signal. It indicates how far from perfect it is.

Let’s say that the PID controller is controlling the forward thrust of the helicopter and that the direction is automatically set by something else (in other words, the helicopter always points the direction it should).

The proportional controller would just set the thrust according to the distance away from the target location. This is a simple approach but most likely will not work by itself. As the helicopter gains momentum, it will become harder to slow down. Using proportional control only, the helicopter will overshoot its target and possibly oscillate around the target location. Another issue is wind. The thrust controller should compensate for wind because a tail wind could cause a positional overshoot and a head wind might cause it to never get there.

The integral controller continues to sum the error it incurs and adjusts appropriately. In the head wind example, the thrust would increase until the aircraft could reach its target destination.

The derivative controller measures the rate at which the error signal is changing, which for the helicopter is simply the relative ground velocity. As the helicopter approaches its destination the derivative controller reduces the thrust, allowing the helicopter to avoid overshooting its target position. In the event of a tail wind, it reduces the thrust even more so that the helicopter can’t be pushed past the target.

For the helicopter example, all 3 sub-controllers must be used. To use all of them, you just sum the control signals of all 3.
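Written as an equation, the sum of the 3 sub-controllers is commonly expressed as shown below. The symbols are assumptions introduced here for illustration: e(t) is the error signal, u(t) is the control signal, and K_p, K_i, and K_d are the gains of the proportional, integral, and derivative sub-controllers.

```latex
u(t) = K_p\, e(t) + K_i \int_0^t e(\tau)\, d\tau + K_d\, \frac{d e(t)}{d t}
```

The first term is the proportional controller, the second is the area under the error curve, and the third is the slope of the error signal.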

Implementation:

If you think this sounds hard to implement, you are wrong. PID controllers were designed to be generic and easily adapted to all systems. As mentioned before, PID controllers have 3 sub-controllers. Each controller has a parameter that must be tuned, called ‘gain’. To add a PID controller to a system, you just need to attach the generic PID controller and tune the 3 values.

The following C code is an efficient PID controller.

typedef struct {
    double windup_guard;
    double proportional_gain;
    double integral_gain;
    double derivative_gain;
    double prev_error;
    double int_error;
    double control;
} PID;

void pid_zeroize(PID* pid) {
    // set prev and integrated error to zero
    pid->prev_error = 0;
    pid->int_error = 0;
}

void pid_update(PID* pid, double curr_error, double dt) {
    double diff;
    double p_term;
    double i_term;
    double d_term;

    // integration with windup guarding
    pid->int_error += (curr_error * dt);
    if (pid->int_error < -(pid->windup_guard))
        pid->int_error = -(pid->windup_guard);
    else if (pid->int_error > pid->windup_guard)
        pid->int_error = pid->windup_guard;

    // differentiation
    diff = ((curr_error - pid->prev_error) / dt);

    // scaling
    p_term = (pid->proportional_gain * curr_error);
    i_term = (pid->integral_gain     * pid->int_error);
    d_term = (pid->derivative_gain   * diff);

    // summation of terms
    pid->control = p_term + i_term + d_term;

    // save current error as previous error for next iteration
    pid->prev_error = curr_error;
}

Windup Guard:

As you may have noticed, the code has a feature called windup guard. This is a critical feature that must be used in most control systems. Windup guarding simply sets a cap on the maximum value that the integrated error can reach. It is most often needed at startup and when switching in and out of automatic control. Let’s look at our helicopter example again. Consider the situation where the helicopter can be piloted either by a person or by the autonomous PID controller. If the human operator had a target destination only a few feet away from the current position but held the helicopter still, the integral portion of the PID controller would continue to sum the error it sees. Eventually this summed error would grow very large. If the pilot then switched to autonomous mode, the controller’s output thrust signal would be huge, most likely causing a wreck. There are two solutions to this problem. The first is to cap the maximum absolute value of the integrated error. This is the windup guard. The second is to call the pid_zeroize function each time the GPS target is set and each time the autonomous control system is enabled. In either case, setting a safe maximum value using a windup guard is good practice.

Optimization:

This code can be optimized if the pid_update function is called at exactly the same rate every time. In this case, all references to ‘dt’ can be removed. This changes the optimal values for the integral and derivative gains (the integral gain absorbs a factor of dt and the derivative gain a factor of 1/dt) and for the windup guard, but once tuned, the controller will respond the same way. In this code, this optimization only removes one multiplication operation and one division operation.
It is also possible to change this code to run on fixed-point arithmetic rather than floating-point. This can dramatically increase efficiency, but range and precision may be compromised. Only use fixed-point if you are sure that you have enough precision.
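As a rough illustration of the fixed-rate idea, here is a minimal sketch of what the update could look like with ‘dt’ removed. The names PIDFixed and pid_fixed_update are made up for this example, and it assumes the function really is called at a constant period. Because the integrated error is now a plain sum of samples, the windup guard is in different units than before (roughly the old guard divided by dt).

```c
/* Fixed-rate variant (hypothetical names): dt is folded into the gains. */
typedef struct {
    double windup_guard;      /* cap on the summed error, in error units times samples */
    double proportional_gain; /* same as in the variable-rate version */
    double integral_gain;     /* roughly the old integral gain multiplied by dt */
    double derivative_gain;   /* roughly the old derivative gain divided by dt */
    double prev_error;
    double int_error;
    double control;
} PIDFixed;

void pid_fixed_update(PIDFixed* pid, double curr_error) {
    double diff;

    /* integration with windup guarding (no multiplication by dt) */
    pid->int_error += curr_error;
    if (pid->int_error < -(pid->windup_guard))
        pid->int_error = -(pid->windup_guard);
    else if (pid->int_error > pid->windup_guard)
        pid->int_error = pid->windup_guard;

    /* differentiation (no division by dt) */
    diff = curr_error - pid->prev_error;

    /* scaling and summation of terms */
    pid->control = (pid->proportional_gain * curr_error)
                 + (pid->integral_gain     * pid->int_error)
                 + (pid->derivative_gain   * diff);

    /* save current error as previous error for next iteration */
    pid->prev_error = curr_error;
}
```

If the call rate ever changes, the integral gain, the derivative gain and the windup guard all have to be retuned, which is the price of dropping ‘dt’.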

EDIT: Fixed a bug on line 31 of the code (the differentiation step). Changed the multiply to a divide.

RC Receiver Interface

There are numerous RC receiver interfacing techniques published on the web. My quadcopter design uses only one processor for all its computation. For this reason, I spent extra time designing the interfaces to be as efficient as possible. Even though RC receiver interfaces aren’t super complicated, they can cause real-time scheduling issues because of the number of interrupts and the time spent inside the interrupt routines.

Typical RC receivers output the channel values as a fixed sequence of pulse width modulated (PWM) signals. The transmitters and receivers I have tested are made by Spektrum. All Spektrum models seem to have the same design style. After a small bit of testing, I found that the channels are output in the same order every time, with a small time separation between them. Since the standard for RC PWM is 50Hz, each channel’s pulse repeats every 20 milliseconds. The top 5 lines of the picture above show an example output of a 5 channel Spektrum receiver.

In a microcontroller, the most accurate way to measure incoming pulses is to use an input capture with a timer. A measurement is taken by latching the value of a free-running timer into a register when an event (such as an edge on a pin) occurs. Obviously, a higher frequency timer yields a more precise pulse measuring system. Since most microcontrollers only have a few input capture pins, some special circuitry must be used to measure several pulses. Since the Spektrum systems do not overlap their output pulses, a simple OR gate structure can be used to combine all the signals into one signal.

Since my quadcopter will be using the Spektrum DX5e and AR500, I’m using a quad 2-input OR gate IC (MC14071BCP) to create this combined signal. As shown in the timing diagram at the top, there are still 2 voltage transitions per pulse. This results in 10 interrupts per frame for this design style. For synchronization purposes, there is always a gap between the last pulse and the beginning of the first pulse that is longer than the longest possible pulse. To capture the values sent by the transmitter, the software is set up so that an interrupt occurs on every transition of the signal. All of the interrupts, except the last one, take very little time to process. The last interrupt copies out the five pulse width values and sets an availability flag so that the rest of the software can access them.
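To make the interrupt scheme a bit more concrete, here is a hardware-independent sketch of the edge handler logic. This is not the code from my LPC1768 project. The names (RcDecoder, rc_on_edge) are made up for illustration, and it assumes a 1 MHz free-running timer (1 tick = 1 microsecond) and a sync threshold of 3000 ticks, which is only a guess at "longer than the longest possible pulse".

```c
#include <stdbool.h>
#include <stdint.h>

#define RC_CHANNELS   5
#define RC_SYNC_TICKS 3000u  /* assumed: longer than any valid pulse, in timer ticks */

typedef struct {
    uint32_t rise_time;            /* capture value at the last rising edge */
    uint32_t last_fall;            /* capture value at the last falling edge */
    uint8_t  index;                /* channel currently being measured */
    uint32_t working[RC_CHANNELS]; /* widths collected during the current frame */
    uint32_t widths[RC_CHANNELS];  /* last complete frame, read by the main loop */
    volatile bool available;       /* set when widths[] holds a fresh frame */
} RcDecoder;

/* Call from the capture interrupt on every edge of the OR-combined signal.
 * capture is the latched timer value, level_high is true for a rising edge. */
void rc_on_edge(RcDecoder* rc, uint32_t capture, bool level_high) {
    if (level_high) {
        /* a long low period marks the start of a new frame */
        if ((uint32_t)(capture - rc->last_fall) > RC_SYNC_TICKS)
            rc->index = 0;
        rc->rise_time = capture;
        return;
    }

    rc->last_fall = capture;
    if (rc->index >= RC_CHANNELS)
        return; /* out of sync: ignore pulses until the next sync gap */

    /* unsigned subtraction gives the right width even if the timer wrapped */
    rc->working[rc->index++] = capture - rc->rise_time;

    /* last falling edge of the frame: copy out the values and raise the flag */
    if (rc->index == RC_CHANNELS) {
        for (int i = 0; i < RC_CHANNELS; i++)
            rc->widths[i] = rc->working[i];
        rc->available = true;
    }
}
```

The main loop would check the available flag, read widths[], and clear the flag. How the capture value and the edge polarity are obtained depends on the timer peripheral and is left out here.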

An interesting thing that I noticed while testing is that Spektrum has a constant delay of 60 microseconds between each pulse. To make the system even more efficient, the falling edge interrupts can be turned off for the first four pulses and turned back on for the last pulse, which results in four fewer interrupts. The first four pulse widths are then measured from one rising edge to the next, with the 60 microseconds subtracted off.
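The arithmetic for this trick is simple enough to show in a few lines. The sketch below assumes the five rising edge timestamps and the final falling edge timestamp have already been captured, and that the timer runs at 1 MHz so that 60 ticks equal 60 microseconds. The function name and the tick rate are assumptions for illustration only.

```c
#include <stdint.h>

#define RC_CHANNELS  5
#define RC_GAP_TICKS 60u  /* assumed 1 MHz timer: 60 ticks = 60 microseconds */

/* Recover pulse widths from rising edges only, plus the last falling edge.
 * rise[i] is the capture value at the rising edge of channel i. */
void rc_widths_from_rising(const uint32_t rise[RC_CHANNELS],
                           uint32_t last_fall,
                           uint32_t widths[RC_CHANNELS]) {
    /* first four channels: rising edge to next rising edge, minus the fixed gap */
    for (int i = 0; i < RC_CHANNELS - 1; i++)
        widths[i] = (rise[i + 1] - rise[i]) - RC_GAP_TICKS;

    /* last channel: measured directly from its own rising and falling edges */
    widths[RC_CHANNELS - 1] = last_fall - rise[RC_CHANNELS - 1];
}
```

This only works as long as the gap between pulses really is constant, so it is worth checking the gap on a scope or logic analyzer for the receiver being used.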

For my system, this OR gate structure serves another purpose. The AR500 receiver runs on 5 volts, but the microcontroller (LPC1768) runs on 3.3 volts. Of course there are many ways to translate between the two levels, but the MC14071BCP IC has a wide voltage range and is tolerant to 5 volt inputs when running at 3.3 volts. Without extra circuitry, it will convert the five 5 volt signals from the AR500 into one 3.3 volt signal that represents all five channels.

In conclusion, I’ve found that using a simple OR gate IC is very efficient for pin usage, timer utilization, voltage translation, and interrupt time efficiency.

Update:

I implemented this RC receiver interface on the LPC1768 connected to an AR500. My results are accurate to about 60-70ns. This leads me to believe that the AR500 is running on the typical ATmega8 (or 168 or 328) microcontroller running at 16MHz (1/16MHz = 62.5 ns).

Update:

By popular request, here is the code I used on the LPC1768 (right-click download, then change the “.pdf” extension to “.zip”):
https://nicisdigital.files.wordpress.com/2013/09/rc_receiver_interface-zip.pdf