ASIC from Scratch – Part 5: UART Receiver

Learning just how straightforward it is to implement a UART receiver in Verilog was quite the eye-opening experience. In fact, I’d even say that my receiver is a good bit simpler than its own testbench. I think the big takeaway here is that the bottleneck of ASIC development isn’t the implementation of whatever you’re trying to build, it’s the verification.

In my first UART post I introduced the theory of operation behind the receiver. As you might remember, the basic structure consists of a finite state machine with 4 states. This is implemented as a case statement (Verilog has no “switch” keyword, the construct is simply case ... endcase).

Fig. 1: The UART receiver state machine

The first state, “wait for falling edge” (called STATE_IDLE in my code), does the least work. In theory all it has to do is check whether the rx line is LOW and change to the next state if it is. In practice it would be nice to have some really basic low pass filtering to ignore glitches and make the receiver more robust.

STATE_IDLE: begin
	// check for falling edge with some basic filtering
	if((rx_shift_reg == 4'b0001) || (rx_shift_reg == 4'b0010)) begin
		cur_state <= STATE_RECEIVE_START;
	end

	ctr <= 4;
	data_ready <= 0;
	bit_ctr <= 0;
	data <= 0;
end

Outside the case statement I use a 4-bit shift register to store the most recent rx samples. The newest sample is shifted in at the top, so bit 0 holds the oldest one. STATE_IDLE can then check the shift register’s contents for the falling edge pattern: 0001 is a clean edge (one HIGH sample followed by three LOW samples), and 0010 is the same edge with a single HIGH glitch surrounded by LOW samples. If a falling edge is detected, the state changes to STATE_RECEIVE_START, which receives the start bit.

Inside STATE_IDLE the counter ctr is set to 4. This might seem odd at first, but there is a simple reason: due to the low pass filter, this code is only able to detect a falling edge after 4 cycles have passed. If the counter were still at 0 when we enter the next state, we’d be off by 4 cycles. For lower UART speeds this doesn’t matter, but for higher speeds it might be an issue.
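To see the 4-cycle delay in isolation, here is a minimal standalone sketch of the filter together with a tiny testbench. The module name falling_edge_filter, the output fell and the reset value of the shift register are my assumptions for illustration; they are not part of the receiver itself.

```verilog
// Standalone sketch of the falling-edge filter; module and signal names are hypothetical.
module falling_edge_filter(
	input clk, rst, rx,
	output reg fell
);
	reg [3:0] rx_shift_reg;

	always @(posedge clk) begin
		if (rst) begin
			rx_shift_reg <= 4'b1111; // assumption: the line idles HIGH during reset
			fell <= 0;
		end else begin
			// newest sample enters at the top, the oldest sample ends up in bit 0
			rx_shift_reg <= {rx, rx_shift_reg[3:1]};
			// same two patterns as in STATE_IDLE
			fell <= (rx_shift_reg == 4'b0001) || (rx_shift_reg == 4'b0010);
		end
	end
endmodule

module falling_edge_filter_tb;
	reg clk = 0, rst = 1, rx = 1;
	wire fell;
	integer cycles = 0;

	falling_edge_filter dut(.clk(clk), .rst(rst), .rx(rx), .fell(fell));

	always #5 clk = ~clk;

	initial begin
		// change inputs on the falling clock edge to avoid races with the sampling edge
		repeat (2) @(negedge clk);
		rst = 0;
		repeat (3) @(negedge clk);
		rx = 0; // the start bit begins

		// count rising clock edges until the filter reports the edge
		while (!fell) begin
			@(posedge clk); #1;
			cycles = cycles + 1;
		end
		// prints: falling edge reported after 4 clock edges
		$display("falling edge reported after %0d clock edges", cycles);
		$finish;
	end
endmodule
```

This should run in any Verilog-2001 simulator. The registered fell output plays the role of the state change in the receiver, which is why the latency matches the value loaded into ctr.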

Another thing that’s happening in this state is the reset of data_ready, bit_ctr and data. This should be fairly self-explanatory, but in case it isn’t: this is simply preparation for the next states. They expect a clean slate and this is how they get it.

Next up is STATE_RECEIVE_START. It simply exists to receive the start bit.

STATE_RECEIVE_START: begin
	if(rx != 0 && ctr < start_bit_duration) begin 
		// rx went back high again, this wasn't a valid start bit.
		// reset to idle state
		cur_state <= STATE_IDLE;
		ctr <= 0;
	end

	// Check if start bit is over, if yes, move to next state
	if(ctr == bit_duration) begin
		ctr <= 0;
		cur_state <= STATE_READ_DATA;
	end
end

There are really only 2 things that can happen here: either a) rx goes high before the start bit is over, or b) the start bit is over. In the first case we reset to STATE_IDLE. We didn’t receive a valid start bit, so we don’t want to continue and start fresh instead. If the transmission is a bit glitchy or the clocks of receiver and transmitter aren’t perfectly synchronous, the start bit might not be recognized reliably though. To understand just how critical this timing can be, let’s take a look at a fairly simple 9600 baud transmission and an internal clock of 20 MHz. Shouldn’t be too problematic, right? Well, as we’ll see, the problem isn’t the baud rate, it’s the internal clock speed. If rx has to stay LOW for the full bit duration as counted by the internal clock, a start bit that looks just one clock cycle too short is already rejected, so the tolerance is a single clock period.

9600 baud = 9600 bit/s => Tbit = 1 / 9600 Hz ≈ 104.2 µs
Tclk = 1 / fclk = 1 / 20 MHz = 50 ns = 0.05 µs
From this we can calculate the maximum relative deviation trel = Tclk / Tbit ≈ 0.00048 = 0.048 % = 480 ppm

Whoa! That’s quite low. Granted, it’s not “absolutely impossible” low, but it’s still a much smaller tolerance than I’d like. If you breathe on your IC wrong you might just heat it enough to sufficiently change the clock (probably not). I’d like a little more leeway than that.

The way around this was to have 2 durations: one to “wait out” the start bit and another during which rx must be LOW. They can be the same, but in practice I’ll probably set start_bit_duration to something in the 80-90% range of bit_duration.

After start_bit_duration has passed, we still don’t want to move on to the next state though. We simply stop checking rx. Only after bit_duration has passed will the state machine move on. I could have implemented this with another state, but didn’t really see the point for something that simple. In the end splitting the code up into states serves to simplify the code. Introducing an additional state for this would have the opposite effect in my opinion.
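As a rough illustration of how the two durations could be derived for the 20 MHz / 9600 baud example, here is a small simulation-only sketch. All parameter names are hypothetical, and the 85 % figure is just an assumed value from the 80-90% range mentioned above.

```verilog
// Sketch: deriving the receiver's timing inputs; all parameter names are hypothetical.
module uart_timing_example;
	localparam integer CLK_HZ = 20_000_000; // internal clock from the example above
	localparam integer BAUD   = 9600;

	// clock cycles per UART bit, rounded down
	localparam integer BIT_DURATION = CLK_HZ / BAUD;
	// rx only has to stay LOW for 85 % of the start bit (assumed value)
	localparam integer START_BIT_DURATION = (BIT_DURATION * 85) / 100;
	// one clock period relative to one bit period, in ppm
	localparam integer ONE_CYCLE_PPM = 1_000_000 / BIT_DURATION;

	initial begin
		$display("bit_duration       = %0d", BIT_DURATION);       // 2083
		$display("start_bit_duration = %0d", START_BIT_DURATION); // 1770
		$display("one clock cycle    = %0d ppm of a bit", ONE_CYCLE_PPM); // 480
	end
endmodule
```

With these numbers, rx may return HIGH roughly 300 clock cycles before the nominal end of the start bit without the receiver falling back to STATE_IDLE, instead of just one cycle.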

Next on the list is STATE_READ_DATA, the bread and butter of this module. It receives the actual payload bits.

STATE_READ_DATA: begin
	if(ctr == bit_duration >> 1) begin
		// we're in the middle of the bit, this is the point where we want to read the bit

		// increment the bit counter
		bit_ctr <= bit_ctr + 1;

		// shift in the bit that was read
		data <= {rx, data[7:1]}; 
	end  

	if(ctr == bit_duration) begin
		// at this point the current bit is over; restart the counter for the next bit
		ctr <= 0;

		// once all 8 data bits have been read, the stop bit(s) come next
		if(bit_ctr == 8) begin
			cur_state <= STATE_STOPBITS;
		end
	end
end

As you can see, there are two if blocks. The first compares the counter to bit_duration >> 1, which is half of bit_duration (rounded down). We want to read the payload bit smack-bang in the middle. We store the bit in a shift register called data and increment the counter bit_ctr, which counts how many bits we have received already.

The second if statement checks whether the bit is over. If it is, what follows is either another data bit or the stop bit. We differentiate between the two by checking bit_ctr, which is 8 once all 8 bits have been received. If it isn’t, we read another bit; if it is, we move to STATE_STOPBITS.
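Since UART sends the least significant bit first, shifting each new bit in at the top of data puts everything in the right place after 8 shifts. The following simulation-only sketch (module and variable names are made up for illustration) replays just that shift operation without any timing:

```verilog
// Sketch: the LSB-first shift from STATE_READ_DATA in isolation; names are hypothetical.
module lsb_first_shift_example;
	reg [7:0] sent;
	reg [7:0] data;
	integer i;

	initial begin
		sent = 8'hA5;
		data = 8'h00;
		// bit 0 is transmitted first and enters at the top of the register
		for (i = 0; i < 8; i = i + 1)
			data = {sent[i], data[7:1]};
		// after 8 shifts the first bit has travelled down to position 0
		$display("sent = %h, received = %h", sent, data); // sent = a5, received = a5
	end
endmodule
```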

Lastly there is STATE_STOPBITS. And yes, I actually went all the way and implemented the option to have a variable stop bit length.

STATE_STOPBITS: begin
	// Count to stop bit duration, then reset state to idle
	if((stopbits == 2'b00) && (ctr == bit_duration >> 1)) begin // 0.5 stop bits
		data_ready <= 1;
		cur_state <= STATE_IDLE;
	end else if ((stopbits == 2'b01) && (ctr == bit_duration)) begin // 1 stop bit
		data_ready <= 1;
		cur_state <= STATE_IDLE;
	end else if ((stopbits == 2'b10) && (ctr == {1'b0, bit_duration} + (bit_duration >> 1))) begin // 1.5 stop bits
		// this condition adds half of bit_duration to bit_duration at 17 bits wide to get 1.5x bit_duration;
		// the parentheses are required because + binds tighter than >>
		data_ready <= 1;
		cur_state <= STATE_IDLE;
	end else if(ctr[16:1] == bit_duration) begin // 2 stop bits: comparing ctr >> 1 with bit_duration is equivalent to comparing ctr with bit_duration << 1
		data_ready <= 1;
		cur_state <= STATE_IDLE;
	end
end

Depending on the number of stopbits (0.5, 1, 1.5 or 2) ctr must be compared to different values. Half a stopbit, one stopbit and two stopbits are pretty simple. We compare ctr to the original value shifted by -1, 0 or +1 positions. 1.5 stopbits is where it gets a little ugly though. Multiplication is slow, so we use addition of bit_duration with its half. Since that addition might overflow, we have to use a register 1 bit wider than bit_duration itself. It all makes for some pretty ugly code in the end.
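One detail worth double-checking in an expression like this is operator precedence: in Verilog the binary + binds more tightly than >>, so a sum followed by a shift is shifted as a whole unless the halved term is put in parentheses. The short sketch below (with made-up names and the 2083-cycle bit duration from the 9600 baud example) shows the difference:

```verilog
// Sketch: effect of operator precedence on the 1.5x threshold; names are hypothetical.
module stopbit_threshold_example;
	reg [15:0] bit_duration;
	reg [16:0] without_parens, with_parens;

	initial begin
		bit_duration = 16'd2083;
		// parsed as (bit_duration + bit_duration) >> 1, which is just bit_duration again
		without_parens = {1'b0, bit_duration} + bit_duration >> 1;
		// parsed as bit_duration + (bit_duration / 2), the intended 1.5x value
		with_parens = {1'b0, bit_duration} + (bit_duration >> 1);
		// prints: without parentheses: 2083, with parentheses: 3124
		$display("without parentheses: %0d, with parentheses: %0d", without_parens, with_parens);
	end
endmodule
```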

Maybe you wonder why I’m not checking whether rx stays high during the stop bit. According to the UART protocol the line must stay high, after all. The simple answer is that it’s not really necessary in order to receive the byte. The stop bits serve as a pause between two transmissions. It doesn’t really matter what the rx line is doing after the payload has been received. Either rx stays high and everything is perfectly fine, or the next transmission starts early. If the next transmission starts early, that shouldn’t affect the validity of the current byte. It should only affect the validity of the new, “early” byte. With my implementation the new byte will cause the state machine to detect an invalid state eventually and reset to STATE_IDLE. That might even happen multiple times. If the next byte after that is valid again, the receiver will be able to detect it, since it will be in STATE_IDLE.

What actually happens here is pretty simple though. We set the data_ready flag HIGH to indicate that the value in the data register is valid and then return to the STATE_IDLE state. Any consumer of this uart module may then check the data_ready bit and take the data from this module whenever that flag is high.
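A consumer has to grab the byte in exactly the cycle in which data_ready is HIGH, because STATE_IDLE clears data again right afterwards. A minimal sketch of such a consumer could look like this. The module uart_rx_latch and its outputs last_byte and byte_valid are hypothetical and not part of my design:

```verilog
// Hypothetical consumer: keeps the last valid byte so that downstream logic
// never sees the intermediate contents of the receiver's shift register.
module uart_rx_latch(
	input clk, rst,
	input [7:0] data,     // raw data output of uart_rx
	input data_ready,     // single-cycle strobe of uart_rx
	output reg [7:0] last_byte,
	output reg byte_valid // stays HIGH once at least one byte has been received
);
	always @(posedge clk) begin
		if (rst) begin
			last_byte <= 0;
			byte_valid <= 0;
		end else if (data_ready) begin
			// data still holds the received byte in the cycle where data_ready is HIGH
			last_byte <= data;
			byte_valid <= 1;
		end
	end
endmodule
```

A FIFO or interrupt logic would hook into the same data_ready strobe; the latch is simply the smallest thing that makes the output stable.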

The result looks good as you can see below.

Fig. 2: A short UART transmission and the module’s resulting output waveforms.

Figure 2 might warrant a short explanation. clk is obviously the internal clock. It’s ticking fast. Really fast in fact: 20 MHz, to be a little more precise. uart_tx_test is the test transmission sent by the testbench. The interesting waveforms are data_ready and data[7:0]. The former only ever goes high for a single clock cycle: whenever data contains valid data and the transmission is complete, data_ready goes high. data on the other hand contains garbage most of the time during each transmission. The reason is that it is simply the raw output of the receiver’s internal shift register and displays intermediate states of the transmission.

Outputting invalid data is a terrible idea though, so what gives? I’m actually planning to deal with this at a higher level of abstraction. My uart_rx module isn’t supposed to be used on its own, it will be part of a more abstract UART module. This module will contain all the goodies one expects of a UART peripheral, such as transmit/receive FIFOs, interrupts, simple baud rate configuration, a way to integrate it with a DMA, memory mapping (to access it like any other RAM address) and a whole bunch of other things.

Notable missing features are parity calculation and simple baud rate configuration. I might implement those in the future. They’re both easy to implement. As for why I haven’t added them yet: I don’t need parity calculation, and I think simple baud rate configuration should be handled at a higher level of abstraction.

As always, for completeness’ sake, here’s the full code:

module uart_rx(
	input clk, rst, rx,
	input [15:0] bit_duration, start_bit_duration,
	input [1:0] stopbits,
	output reg [7:0] data,
	output reg data_ready
);

reg [3:0] rx_shift_reg;
reg [16:0] ctr; // larger than bit_duration to prevent overflows during internal calculations
reg [4:0] bit_ctr;
reg [1:0] cur_state;




// States for uart statemachine
parameter STATE_IDLE = 2'b00;
parameter STATE_RECEIVE_START = 2'b01;
parameter STATE_READ_DATA = 2'b10;
parameter STATE_STOPBITS = 2'b11;

always @ (posedge clk) begin

	if(rst) begin
		// set internal registers to their reset values
		ctr <= 0;
		rx_shift_reg <= 0;
		bit_ctr <= 0;
		cur_state <= STATE_IDLE;

		// set outputs to 0
		data <= 0;
		data_ready <= 0;
	end else begin
		
		// Shift register of past rx values for filtering etc
		rx_shift_reg <= {rx, rx_shift_reg[3:1]};
		ctr <= ctr + 1;
	
		// rx state machine here
		case(cur_state)
			default: cur_state <= STATE_IDLE;
			STATE_IDLE: begin
				// check for falling edge with some basic filtering
				if((rx_shift_reg == 4'b0001) || (rx_shift_reg == 4'b0010)) begin
					cur_state <= STATE_RECEIVE_START;
				end

				ctr <= 4;
				data_ready <= 0;
				bit_ctr <= 0;
				
				data <= 0;
			end
			STATE_RECEIVE_START: begin
				if(rx != 0 && ctr < start_bit_duration) begin 
					// rx went back high again, this wasn't a valid start bit.
					// reset to idle state
					cur_state <= STATE_IDLE;
					ctr <= 0;
				end

				// Check if start bit is over, if yes, move to next state
				if(ctr == bit_duration) begin
					ctr <= 0;
					cur_state <= STATE_READ_DATA;
				end
			end
			STATE_READ_DATA: begin
				if(ctr == bit_duration >> 1) begin
					// we're in the middle of the bit, this is the point where we want to read the bit

					// increment the bit counter
					bit_ctr <= bit_ctr + 1;

					// shift in the bit that was read
					data <= {rx, data[7:1]}; 
				end  

				if(ctr == bit_duration) begin
					// at this point the current bit is over; restart the counter for the next bit
					ctr <= 0;

					// once all 8 data bits have been read, the stop bit(s) come next
					if(bit_ctr == 8) begin
						cur_state <= STATE_STOPBITS;
					end
				end
			end
			STATE_STOPBITS: begin
				// Count to stop bit duration, then reset state to idle
				if((stopbits == 2'b00) && (ctr == bit_duration >> 1)) begin // 0.5 stop bits
					data_ready <= 1;
					cur_state <= STATE_IDLE;
				end else if ((stopbits == 2'b01) && (ctr == bit_duration)) begin // 1 stop bit
					data_ready <= 1;
					cur_state <= STATE_IDLE;
				end else if ((stopbits == 2'b10) && (ctr == {1'b0, bit_duration} + (bit_duration >> 1))) begin // 1.5 stop bits
					// this condition adds half of bit_duration to bit_duration at 17 bits wide to get 1.5x bit_duration;
					// the parentheses are required because + binds tighter than >>
					data_ready <= 1;
					cur_state <= STATE_IDLE;
				end else if(ctr[16:1] == bit_duration) begin // 2 stop bits: comparing ctr >> 1 with bit_duration is equivalent to comparing ctr with bit_duration << 1
					data_ready <= 1;
					cur_state <= STATE_IDLE;
				end
			end
		endcase
	end
end
   
endmodule
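
For anyone who wants to poke at the module, here is a minimal self-checking testbench sketch. It is not my actual testbench; the names, the 16-cycle bit length and the test byte 0xA5 are assumptions chosen to keep the simulation short. Note that ctr counts from 0 up to and including bit_duration, so one receiver bit lasts bit_duration + 1 clock cycles, which is why the sketch passes 15 for a 16-cycle bit.

```verilog
// Minimal self-checking testbench sketch for uart_rx; names and values are assumptions.
// Compile it together with the uart_rx listing above.
module uart_rx_tb;
	localparam integer BIT_CYCLES = 16; // clock cycles per transmitted bit

	reg clk = 0, rst = 1, rx = 1;
	wire [7:0] data;
	wire data_ready;
	reg [7:0] captured = 0;
	reg got_byte = 0;

	uart_rx dut(
		.clk(clk), .rst(rst), .rx(rx),
		.bit_duration(16'd15),       // receiver bit = bit_duration + 1 = 16 cycles
		.start_bit_duration(16'd12), // 80 % of bit_duration
		.stopbits(2'b01),            // 1 stop bit
		.data(data), .data_ready(data_ready)
	);

	always #1 clk = ~clk; // time units are arbitrary here

	// data is only valid in the cycle where data_ready is HIGH, so capture it then
	always @(posedge clk) begin
		if (data_ready) begin
			captured <= data;
			got_byte <= 1;
		end
	end

	// drive one frame: start bit, 8 data bits LSB first, stop bit plus some idle time
	task send_byte(input [7:0] value);
		integer i;
		begin
			rx = 0;
			repeat (BIT_CYCLES) @(negedge clk);
			for (i = 0; i < 8; i = i + 1) begin
				rx = value[i];
				repeat (BIT_CYCLES) @(negedge clk);
			end
			rx = 1;
			repeat (2 * BIT_CYCLES) @(negedge clk);
		end
	endtask

	initial begin
		// release reset and let the line idle before sending
		repeat (4) @(negedge clk);
		rst = 0;
		repeat (8) @(negedge clk);
		send_byte(8'hA5);
		if (got_byte && captured == 8'hA5)
			$display("PASS: received %h", captured);
		else
			$display("FAIL: got_byte=%b captured=%h", got_byte, captured);
		$finish;
	end
endmodule
```

The stimulus changes rx on the falling clock edge so that it never races with the receiver, which samples on the rising edge. A real testbench would also cover glitches, the other stop bit settings and clock mismatch between transmitter and receiver.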
