Multiple Master Problem with Atmel AVR Microcontrollers

Note: This topic is fairly complex and will only be of interest to developers knowledgeable in Atmel AVR microcontrollers and the TWI serial bus.

I’ve successfully used Atmel’s two-wire interface (TWI) for many years for communication between many microcontrollers and other chips. TWI is the generic-brand equivalent of the Philips I2C serial bus.

The simplest configuration of the I2C bus is to have a single microcontroller that initiates communication with all of the other chips. The master outputs the address of the desired target device (the slave) and indicates whether it is writing or reading. This works perfectly well for most projects.

The constraint with a single master is that the slaves cannot provide data until the master requests the information. The master must routinely ask the devices (called “polling”) if there is any data waiting. Alternatively, additional wires can be connected to the master so that slave devices can indicate when they need attention.

To overcome this limitation, the I2C specification allows any device to send and receive messages by switching from master to slave as needed. The protocol includes a start message indicator and stop message indicator so that multiple masters can keep track of when the bus is busy, even if they don’t understand the contents of each message. Furthermore, if two masters start a message at the same time, the I2C specification dictates how to detect the first bit that differs in their messages and requires the losing master to fall back and retry the message later on. These techniques allow multiple masters to safely read and write to the bus without ruining each other’s messages.
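The arbitration rule can be modeled in a few lines of C. This is only a simulation sketch, not AVR code: it assumes the usual open-drain behavior, where the data line reads low if any master drives a 0, and a master that sent a 1 but reads back a 0 has lost. The function name i2c_arbitrate and the MAX_MASTERS limit are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_MASTERS 8  /* arbitrary limit for this sketch */

/* Simulates bit-by-bit arbitration between masters that all start at once.
   msgs[m] points to the bytes master m wants to send (MSB first).
   Returns the index of the surviving master, or -1 if no single winner
   emerges (for example, all messages are identical). */
int i2c_arbitrate(const uint8_t *const msgs[], int n_masters, size_t len)
{
    uint8_t active[MAX_MASTERS];
    int remaining = n_masters;

    if (n_masters < 1 || n_masters > MAX_MASTERS)
        return -1;
    for (int m = 0; m < n_masters; m++)
        active[m] = 1;

    for (size_t byte = 0; byte < len && remaining > 1; byte++) {
        for (int bit = 7; bit >= 0 && remaining > 1; bit--) {
            /* Wired-AND: the line is low if any active master sends 0. */
            int line = 1;
            for (int m = 0; m < n_masters; m++)
                if (active[m] && !((msgs[m][byte] >> bit) & 1))
                    line = 0;

            /* A master that released the line (sent 1) but sees 0 backs off. */
            for (int m = 0; m < n_masters; m++)
                if (active[m] && ((msgs[m][byte] >> bit) & 1) && line == 0) {
                    active[m] = 0;
                    remaining--;
                }
        }
    }

    if (remaining != 1)
        return -1;
    for (int m = 0; m < n_masters; m++)
        if (active[m])
            return m;
    return -1;
}
```

With two masters whose first bytes are 0x42 and 0x46, the first differing bit is a 0 from the 0x42 sender, so that master keeps the bus and the other retries later.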

Most microcontroller manufacturers implement large portions of the I2C protocol directly in the chip hardware, saving the software developer a lot of work. The software isn’t burdened at all when other devices are using the bus. The software is interrupted only when something is communicating with that specific microcontroller.

Multimaster Issue

The Atmel AVR documentation indicates that it supports multiple masters. And, indeed, it works just fine in most circumstances. However, I recently ran across a very difficult situation that took five days of painstaking work to diagnose.

Diagram of three AVR microcontroller masters on a two-wire interface (TWI, aka I2C) serial bus.

I have three microcontrollers in this particular setup, although the problem can occur with as few as two masters. The microcontrollers have addresses of 21, 22, and 23 hexadecimal. (Since the least-significant bit on the bus is for read/write, some people may prefer to think of the address as 42, 44, and 46.) The actual addresses are not significant in triggering this issue.
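For readers who switch between the two address conventions, here is a small sketch of the conversion. The function name i2c_address_byte is made up for illustration; the only assumption is the standard layout of a 7-bit address in the upper bits with the read/write flag in bit 0.

```c
#include <stdint.h>

/* Builds the first byte sent on the bus from a 7-bit address and the
   direction flag (0 = write, nonzero = read). */
uint8_t i2c_address_byte(uint8_t addr7, int is_read)
{
    return (uint8_t)(((addr7 & 0x7Fu) << 1) | (is_read ? 1u : 0u));
}
```

So 0x21, 0x22, and 0x23 appear on the wire as 0x42, 0x44, and 0x46 when written to, and as 0x43, 0x45, and 0x47 when read from.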

Each master can talk on the bus (when it isn’t busy) and can talk to any other master. For the purposes of simplifying this issue, there are no other devices on the bus.

The communication works fine most of the time, but always hangs on a particular sequence. Here is how the sequence should work:

  1. Master #21: Dear Master #22, here are five bytes that may interest you.
  2. Master #21: Dear Master #23, here are five bytes that may interest you.
  3. Master #22: Dear Master #21, thank you for the five bytes. I’ve thought about it, and have the following five bytes for you.
  4. Master #23: Dear Master #21, thank you for the five bytes. I’ve thought about it, and have the following five bytes for you.

Put another way, Master #21 sends a message to #22 and then #23. Those masters may (or may not) eventually send a reply.

Here’s what happens:

Logic analyzer trace of corrupted I2C bus conversation involving three AVR masters.

The first two messages are fine, but Master #22 starts replying too quickly and the last byte is repeated forever (a stop never occurs). The fourth message (from Master #23 to Master #21) never occurs.

Here’s how I would describe the messages between masters:

  1. Master #21: Dear Master #22, here are five bytes that may interest you.
  2. Master #21: Dear Master #23, here are five bytes that may interest you. Master #22: Dear Master #21, thank you for the five bytes. I’ve thought about it, and have the following five bytes for you you you you you you you you you you you you you you you you you you you [ad nauseam]

Notice that the clock pulses on the final bytes are wider than the earlier clock pulses. That’s because the earlier pulses are properly generated by the read/write byte clock of the TWI hardware, whereas the later pulses are actually generated by a fight between the stop and start TWI hardware of masters 22 and 23. The data is nothing more than an echo of the final valid bit.

Because the bus becomes perpetually busy, no other communication can occur. Eventually, other devices become stuck waiting for the bus.

I tried all sorts of approaches to find the root cause of this.

Finally, I ran across a partial thread where someone had already found the issue, but hadn’t stated it in a way that I recognized. Here’s what is really happening.

  1. Master #21 successfully sends the message to Master #22.
  2. Master #22 successfully receives an interrupt where the status register (TWSR) indicates a stop or restart has occurred (0xA0). This lets the chip know that a message has been received and can now be processed.
  3. Master #21 successfully waits the appropriate amount of time before starting the message to Master #23.
  4. (Problem) Master #22 is still servicing the interrupt request and the AVR hardware is not paying attention to the fact that the bus has become busy again.
  5. Due to timing luck, Master #23 is fortunate enough to receive the entire message from Master #21. It receives an interrupt to process the message.
  6. (Problem) Master #22 tells the AVR hardware to start sending a message to #21. Since the AVR hardware wasn’t watching the bus, it starts talking immediately. In this example, it violated the protocol that requires a little bit more time between messages.
  7. (Problem) Master #23 is servicing the interrupt request and the AVR hardware is not paying attention to the fact that the bus has become busy again.
  8. Before Master #22 can finish its message, Master #23 starts sending a reply to Master #21, thus completely hosing the bus.

Put more succinctly, the AVR TWI hardware does not keep track of the state of the bus when it delivers the stop interrupt to the software, and therefore may overwrite another chip’s message that started in the meantime. The stop event is a particularly bad place to have such a problem. The I2C protocol does not allow the bus to be stretched upon a stop/restart, yet the receiving chip is highly likely to need extra time to process the message it just received.

Possible Software Workarounds

The AVR TWI only has one interrupt entry point for all TWI operations. That means you need to write a big long nasty series of switch/case or nested “if” statements to handle the various status states. It also means that many registers are going to need to be stacked to perform all of this processing. This takes a lot of time. The amount of time usually exceeds the I2C protocol minimum required gap between stop and start.

I considered using a lookup table to jump quickly to the specific code needed to process each status value in the TWI status register (TWSR). Unfortunately, the C compiler will stack almost every register when an interrupt routine calls code outside of itself.
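For reference, a lookup-table dispatch of this kind might look like the sketch below. It is written as plain C so it can be tried on a desktop compiler. The handler names and the last_event variable are invented for illustration; the status values 0x08 (start sent) and 0xA0 (slave stop/restart) are the usual AVR TWI codes. The table relies on the status occupying the upper five bits of TWSR, so shifting right by three gives an index from 0 to 31 and discards the prescaler bits.

```c
#include <stdint.h>

typedef void (*twi_handler_t)(void);

static int last_event;  /* hypothetical: records which handler ran */

static void on_unhandled(void)  { last_event = 0; }
static void on_start_sent(void) { last_event = 1; }  /* status 0x08 */
static void on_slave_stop(void) { last_event = 2; }  /* status 0xA0 */

static twi_handler_t handlers[32];

/* Fill the table once at startup. */
void twi_dispatch_init(void)
{
    for (int i = 0; i < 32; i++)
        handlers[i] = on_unhandled;
    handlers[0x08 >> 3] = on_start_sent;
    handlers[0xA0 >> 3] = on_slave_stop;
}

/* What the interrupt routine would do with the raw TWSR value. */
void twi_dispatch(uint8_t twsr)
{
    handlers[twsr >> 3]();
}

int twi_last_event(void)
{
    return last_event;
}
```

The indirect call is exactly what defeats the purpose on the AVR: the compiler cannot see which registers the handlers use, so it saves nearly all of them in the interrupt prologue.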

I considered buffering the newly arrived data to process later on outside of the interrupt routine. Unfortunately, the data will not yet have been processed if the master issues a restart instead of stop. This means the software will not be ready to respond to the master’s read request. The software could check for this and immediately work on unprocessed data while stretching the bus during the restart address acknowledgement period. However, adding either a callback routine or integrated processing code would get you right back into having to stack all of the registers.

Because the AVR does not include a software interrupt, I can’t even try to get the best of both worlds by buffering the data, posting the software interrupt, exiting the TWI interrupt routine, and then processing the data via the software interrupt (which would stack many more registers than the first interrupt). Yes, I can use the PCINT pins to generate an interrupt, but who has spare pins?

I significantly optimized the interrupt routine to process a stop message as fast as possible by immediately setting the TWINT bit in TWCR (the control register), which lets the TWI hardware resume. Yet, the time to trigger the interrupt and the cost of general register stacking still didn’t allow the TWI hardware to reengage during the I2C protocol’s stop-to-start period. Furthermore, this does not handle the case where the TWI interrupt routine is delayed due to the microcontroller already being in the middle of servicing another interrupt.
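A fast path of that kind could be sketched as follows. This is a desktop-testable model rather than real register access: the function returns the value an interrupt routine would write to TWCR instead of writing it. The bit positions are the ones listed in ATmega datasheets (TWINT = 7, TWEA = 6, TWEN = 2, TWIE = 0); the function name twi_fast_path and the message_ready flag are invented for illustration.

```c
#include <stdint.h>

#define BIT_TWINT 7   /* writing 1 clears the interrupt flag */
#define BIT_TWEA  6   /* keep acknowledging our own address */
#define BIT_TWEN  2   /* keep the TWI enabled */
#define BIT_TWIE  0   /* keep the TWI interrupt enabled */

#define STATUS_SR_STOP 0xA0  /* stop or restart received as slave */

/* Checks for the stop/restart status before anything else.
   Returns the value to write to TWCR right away, or 0 if the status
   must go through the normal (slower) switch statement instead. */
uint8_t twi_fast_path(uint8_t twsr, volatile uint8_t *message_ready)
{
    if ((twsr & 0xF8u) == STATUS_SR_STOP) {
        *message_ready = 1;  /* defer processing to the main loop */
        return (uint8_t)((1u << BIT_TWINT) | (1u << BIT_TWEA) |
                         (1u << BIT_TWEN)  | (1u << BIT_TWIE));
    }
    return 0;
}
```

Even with the check placed first, the interrupt latency and the register saving in the prologue happen before this code runs, which is why this approach alone did not close the gap.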

Lastly, I could change my higher-level protocol so that the Masters communicate with each other in another way. Perhaps using a write followed by a read, or perhaps using a single master. But, that’s a software hack to compensate for a feature that should work in hardware. The reality is that the TWI chip should continue tracking the bus busy state regardless of what the software is doing.

Software Workaround

Here’s what I came up with as a solution. So far, it seems to work very well.

// Add this to the top of your TWI library .c file.
// I2C_HOW_MANY_BUSY_CHECKS_AFTER_STOP is the countdown reload value.
     /* A value of 0 turns off this feature. */
     /* Greater values are slower but more reliable. */

/* global; volatile because the interrupt routine writes it */
volatile unsigned char gI2CCheckBusyAfterStop = 0;


// Add this in your TWI interrupt routine status switch statement.
case I2C_STATUS_SR_RD_STOP_OR_RESTART: // Defined as 0xA0
     // A message addressed to us just ended. Load the countdown.
     gI2CCheckBusyAfterStop = I2C_HOW_MANY_BUSY_CHECKS_AFTER_STOP;
     // ...then continue with your existing stop/restart handling.


// Manual Bus Check: Add this in your idle code and before issuing a start command.
if ( gI2CCheckBusyAfterStop != 0 ) // Call repeatedly while(gI2CCheckBusyAfterStop>0)
{
     if (    PinIsLow(I2C_DATA_PORTIN, I2C_DATA_PIN)
          || PinIsLow(I2C_CLOCK_PORTIN, I2C_CLOCK_PIN) )
     {
          // Bus is busy. Start the countdown all over again.
          gI2CCheckBusyAfterStop = I2C_HOW_MANY_BUSY_CHECKS_AFTER_STOP;
     }
     else
     {
          gI2CCheckBusyAfterStop--; // Good. The bus is quiet. Count down!
     }
}

The TWI interrupt loads a counter (gI2CCheckBusyAfterStop) when it receives a slave TWI stop/restart. This reminds the software to check the bus manually the next time it wants to start a message.

Officially, the I2C bus is idle when the data pin rises after the clock pin. Unfortunately, that’s difficult for the software to detect. However, the bus is definitely busy if either the data pin or the clock pin is low. Therefore, we simply restart the countdown if either pin is low. (I have a macro defined for PinIsLow. Use whatever you normally prefer instead.)

I call the bus-check code in my main idle loop and before I attempt to start a message. That way, if it has been a long time since I received the stop, the main idle loop will have already detected a quiet bus and we'll be ready to send immediately thereafter. If the code needs to send a message right away, then it can continuously check the bus and perform the countdown in the TWI start routine.
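To make the countdown behavior easier to follow, here is the same logic reduced to a desktop-testable sketch. Pin reads are passed in as booleans instead of using PinIsLow, and the names bus_check_arm, bus_check_step, and the reload value of 3 are made up for illustration; pick a real value to suit your bus speed and processor clock.

```c
#include <stdbool.h>
#include <stdint.h>

#define BUSY_CHECKS_AFTER_STOP 3  /* hypothetical value for this sketch */

static volatile uint8_t checks_remaining;

/* Called where the interrupt routine sees the slave stop/restart status. */
void bus_check_arm(void)
{
    checks_remaining = BUSY_CHECKS_AFTER_STOP;
}

/* One polling step. Returns true once the bus has been observed quiet
   for enough consecutive checks (or if no check is pending). */
bool bus_check_step(bool sda_low, bool scl_low)
{
    if (checks_remaining != 0) {
        if (sda_low || scl_low)
            checks_remaining = BUSY_CHECKS_AFTER_STOP;  /* busy: start over */
        else
            checks_remaining--;                          /* quiet: count down */
    }
    return checks_remaining == 0;
}
```

A start routine that needs to send right away would loop on bus_check_step, reading both pins each time, until it returns true, and only then issue the start condition.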

Note that this code has almost no impact on microcontrollers that never receive messages in slave mode. The only negative consequence is a slight decrease in the maximum bus throughput for masters that often receive messages as slaves.

Logic analyzer trace of I2C bus conversation involving three masters, with smart software pauses between messages to avoid contention with Atmel AVR microcontrollers.

As you can see, the same messages shown earlier are now successfully delivered without stepping on each other. For very slow busses with very fast microcontrollers (not usually the case), the I2C_HOW_MANY_BUSY_CHECKS_AFTER_STOP value should be increased.

Desired Fix

Obviously, the desired fix from Atmel would be to have the TWI hardware continue to watch the bus during a stop/restart event on the receiving slave.

Better still, Atmel could make the TWI stop/restart condition its own separate interrupt. That way, the TWI chip can continue watching the bus without needing to freeze its registers before invoking the interrupt. (The software would already know that this is a stop condition without having to read TWSR, because this special interrupt vector was called.) This would also allow for fewer general purpose registers to be stacked during the primary TWI interrupt.

Both of these proposals would be backwards compatible with existing code.

Although I have spent a lot of time analyzing this issue, I respect the possibility that I could have made a mistake somewhere. If someone has a more elegant solution or can point out that I am wrong about the Atmel chips, please let me know so that I can publish that information.


What is the maximum throughput possible for a 100 kHz I2C bus when polling six devices?

The official specification is here:
Page 32 deals with pauses (hold times) between messages.

1 / 100,000 Hz = 10 µs. A perfect square wave clock would be up for 5 µs and down for 5 µs. It doesn’t surprise me to see specified hold times and low/high periods of around 4 µs. That simply means that your clock up-time and clock down-time can be slightly off from "5 µs" perfection.

Every complete message includes a minimum start (4 µs), stop (4 µs), and pause (4.7 µs) before the next message. That’s 12.7 µs of overhead. If a single bit takes 10 µs, let’s simply call this required start/stop/pause 2 bits (20 µs) to simplify the calculations.

Each packet is 9 bits long (8 bits followed by an ack/nak). Assuming that you send an address request followed by reading 1 byte, that will take 18 bits (9 address + 9 read) + 2 bits for start/stop/pause.

100,000 Hz / 20 bits = 5,000 messages * 1 read byte = 5,000 bytes per second.

If you want to poll 6 devices and you assume that they only need their address (no command byte) and they will return a single byte, then you will have a maximum possible throughput of 5,000 bytes per second.

If you were to read 2 bytes from the slave, that would be 1 address + 2 bytes:

100,000 Hz / 29 bits = 3,448 messages * 2 read bytes = 6,896 bytes per second. This is an increase in throughput because the ratio between the overhead (address+start/stop/pause) and the data (2 bytes) has improved.

The best case is where you send a single address byte and read bytes forever. Thus, the address/start/stop/pause overhead approaches zero. But, this isn’t applicable for polling multiple devices.

100,000 Hz / 9 bits = 11,111 bytes per second

Unfortunately, if you need to tell the slave device what you’d like to read, that would be 1 address + 1 command (what you’d like to read) followed by 1 read byte.

100,000 Hz / 29 bits = 3,448 messages * 1 read byte = 3,448 bytes per second. This is a decrease in throughput because the ratio between the overhead (address + command) and the data (1 byte) has worsened.
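The calculations above all follow one pattern, which can be captured in a small helper for trying other message shapes. The function name i2c_bytes_per_second is made up for illustration. It uses the same simplifications as the text: 9 bits per byte on the wire, 2 bit-times for start/stop/pause, and whole messages per second (integer division).

```c
#include <stdint.h>

/* header_bytes: address and command bytes sent before the data.
   data_bytes:   payload bytes read per message. */
uint32_t i2c_bytes_per_second(uint32_t bus_hz,
                              uint32_t header_bytes,
                              uint32_t data_bytes)
{
    uint32_t bits_per_message = 9u * (header_bytes + data_bytes) + 2u;
    uint32_t messages_per_second = bus_hz / bits_per_message;
    return messages_per_second * data_bytes;
}
```

Calling it with (100000, 1, 1), (100000, 1, 2), and (100000, 2, 1) reproduces the 5,000, 6,896, and 3,448 bytes per second figures worked out by hand.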

Maximum Protocol vs. Software and Device-required Pauses

Let’s assume that you think 5,000 bytes per second (1 address + 1 byte read) is acceptable. We still need to consider the time these devices need to perform their work.

For example, if you are using a microcontroller as the master, then the microcontroller is going to want to do something with the bytes it has been reading. That time away won’t be spent reading the bus. So, it really won’t be fully utilizing the 5,000 bytes per second potential speed.

Looking at the logic analyzer pictures earlier on this page, notice the pauses after the <S> start bit and between bursts of 9 clocks? The Atmel chip is throwing an interrupt, saving the registers, and calling my I2C processing routine for the next byte to process. Those pauses are not required by the I2C protocol. Those pauses are simply the amount of time it takes for my software to provide the next piece of data to the Atmel I2C hardware. The faster my processor clock is, and the faster my software routines are, the shorter these pauses will be (up to a certain point).

Those pauses appear to add about 30% overhead. Now the 5,000 byte throughput is actually about 3,800 bytes per second.
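The derating can be written out as a one-line helper. This sketch assumes that the 30% figure means each message takes 30% longer than the protocol minimum, which is the reading that lands near 3,800; the function name derate_for_software is made up for illustration.

```c
#include <stdint.h>

/* Scales a protocol-limited throughput by the extra time the software
   adds per message, expressed as a percentage of the protocol time. */
uint32_t derate_for_software(uint32_t bytes_per_second, uint32_t overhead_percent)
{
    return (bytes_per_second * 100u) / (100u + overhead_percent);
}
```

With 5,000 bytes per second and 30 percent overhead this gives 3,846, which rounds to the 3,800 figure.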

In summation, if you need more speed, go for a faster bus speed (>100 kHz) or read more bytes per message. Otherwise, assume around 3,800 bytes per second is realistic when reading 1 byte from six different devices using a 100 kHz I2C bus. Derate if the bus is noisy/error prone.