Cut through vs Store and forward switching


Switching infrastructure has evolved over the past four decades, from hubs and bridges to ASIC-based and FPGA-based switches. Switches operate at Layer 2 (the data link layer); their purpose is to receive a frame on one port (the ingress port) and forward it out of another (the egress port). The forwarding decision is based on the destination MAC (DMAC) address. This post describes how store-and-forward switching works and how cut-through switching works.
Store and forward switching
The following figure shows a Layer 2 frame. Bytes are received from left to right. Our main focus is on the frame header and the checksum field. The checksum field in a Layer 2 frame (the frame check sequence, FCS) is used to check the integrity of the frame (error control), that is, to find out whether the frame was tampered with or corrupted in transit. The checksum is a kind of hash computed over the frame's contents (for Ethernet, a 32-bit CRC). This means the switch needs all the bytes of the frame before it can check integrity.



The following figure shows the frame header. As soon as the switch has the whole frame and the checksum check succeeds, it forwards the frame toward the destination MAC address.
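The store-and-forward decision can be sketched in a few lines of Python. This is only an illustration, not how switch hardware is built: the function names (`append_fcs`, `fcs_is_valid`, `store_and_forward`), the MAC table contents, and the port name are all made up for the example. It assumes an Ethernet-style frame whose last 4 bytes hold a CRC-32 stored least significant byte first, which is the same CRC that `zlib.crc32` computes.

```python
import struct
import zlib
from typing import Dict, Optional


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Append a 4-byte FCS (CRC-32, least significant byte first)."""
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + struct.pack("<I", fcs)


def fcs_is_valid(frame: bytes) -> bool:
    """Check the trailing 4 bytes against the CRC-32 of everything before them."""
    if len(frame) < 4:
        return False
    body, received_fcs = frame[:-4], frame[-4:]
    expected_fcs = struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)
    return expected_fcs == received_fcs


def store_and_forward(frame: bytes, mac_table: Dict[bytes, str]) -> Optional[str]:
    """Return the egress port for a fully received frame, or None if it is dropped."""
    if not fcs_is_valid(frame):
        return None  # corrupted frame: dropped, never forwarded
    dmac = frame[0:6]  # the destination MAC is the first 6 bytes of the header
    return mac_table.get(dmac, "flood")  # unknown destination: flood


# Hypothetical frame: DMAC, SMAC, EtherType, 46 bytes of payload, then FCS.
dmac = bytes.fromhex("aabbccddeeff")
smac = bytes.fromhex("112233445566")
good_frame = append_fcs(dmac + smac + b"\x08\x00" + b"x" * 46)
# Flip one bit in the payload to simulate corruption on the wire.
bad_frame = good_frame[:20] + bytes([good_frame[20] ^ 0x01]) + good_frame[21:]
mac_table = {dmac: "Eth2"}  # hypothetical MAC table entry
```

Here `store_and_forward(good_frame, mac_table)` picks the egress port, while `store_and_forward(bad_frame, mac_table)` returns `None` because the FCS no longer matches. Note that the check cannot start until the last byte of the frame has been received.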

Cut-through switching
As explained above, the switch receives the frame header bytes first, followed by the remaining headers and the data. We also know that the switching decision is based on the destination MAC address, and that this field sits in the frame header. So why not forward the frame as soon as the destination MAC address field has been received, rather than waiting for the whole frame to arrive before making the forwarding decision? That is exactly how cut-through switching works.
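To make the idea concrete, here is a minimal Python sketch of cut-through forwarding. It is a toy model under stated assumptions: the frame arrives as a stream of bytes with the 6-byte destination MAC first, and the function name `cut_through`, the MAC table, and the port name are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

DMAC_LEN = 6  # the destination MAC is the first 6 bytes of the frame


def cut_through(byte_stream: Iterable[int],
                mac_table: Dict[bytes, str]) -> Iterator[Tuple[str, int]]:
    """Yield (egress_port, byte) pairs while the frame is still arriving."""
    header = bytearray()
    egress = None
    for b in byte_stream:
        if egress is None:
            header.append(b)
            if len(header) == DMAC_LEN:
                # Forwarding decision is made here, before the rest arrives.
                egress = mac_table.get(bytes(header), "flood")
                for buffered in header:
                    yield egress, buffered
        else:
            # Every later byte is passed straight through, with no FCS check.
            yield egress, b


# Hypothetical frame and MAC table for illustration.
frame = bytes.fromhex("aabbccddeeff") + bytes.fromhex("112233445566") + b"\x08\x00" + b"x" * 46
mac_table = {bytes.fromhex("aabbccddeeff"): "Eth2"}
forwarded = list(cut_through(frame, mac_table))
```

The generator starts emitting bytes on the egress port after only six bytes of input, and it never looks at the trailing checksum.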

Did you notice that this approach has a small drawback? A switch using cut-through forwarding does not verify the checksum (error control) before forwarding, because the checksum arrives at the very end of the frame. There is therefore no integrity check, and corrupted frames are forwarded too. This is usually acceptable, because the receiving machine (the server) verifies the checksum and drops the frame if it is corrupted. However, it makes troubleshooting CRC/FCS errors more complicated; see the troubleshooting section below for details. Cut-through switching also has advantages.
The biggest advantage of this approach is that the switch does not wait for the whole frame to arrive, so the switching decision is made with low latency. This is called the switch's latency. The advantage is substantial for large frames (jumbo frames): a store-and-forward switch must receive the entire frame first, so its latency grows with frame size and can reach milliseconds on slow links, whereas cut-through latency does not depend on frame size and stays at a few microseconds (approximately 10 microseconds) or less.
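A quick back-of-the-envelope calculation shows where the difference comes from. The sketch below only models the time spent waiting for bytes to arrive on the ingress link; it ignores preamble, lookup time, and queuing, and the function names are invented for this example.

```python
DMAC_LEN = 6  # bytes a cut-through switch needs before it can decide


def serialization_delay_us(num_bytes: int, link_bps: float) -> float:
    """Time in microseconds to receive num_bytes on a link of link_bps bits/s."""
    return num_bytes * 8 / link_bps * 1e6


def store_and_forward_wait_us(frame_bytes: int, link_bps: float) -> float:
    """Store-and-forward waits for the whole frame before forwarding."""
    return serialization_delay_us(frame_bytes, link_bps)


def cut_through_wait_us(link_bps: float) -> float:
    """Cut-through waits only for the destination MAC, whatever the frame size."""
    return serialization_delay_us(DMAC_LEN, link_bps)


if __name__ == "__main__":
    for frame_bytes in (64, 1500, 9000):
        for link_bps in (10e6, 1e9, 10e9):
            sf = store_and_forward_wait_us(frame_bytes, link_bps)
            ct = cut_through_wait_us(link_bps)
            print(f"{frame_bytes:5d} B @ {link_bps / 1e6:7.0f} Mbit/s: "
                  f"store-and-forward {sf:10.3f} us, cut-through {ct:7.3f} us")
```

For a 9000-byte jumbo frame on a 1 Gbit/s link the store-and-forward wait is about 72 microseconds, while the cut-through wait stays the same for every frame size on a given link.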

Ordinary application users will not notice whether the underlying network uses store-and-forward or cut-through switching. The difference is a clear advantage, however, in latency-sensitive applications such as trading, where a difference of microseconds gives you a better entry into the market.

Troubleshooting CRC/FCS/Checksum Error in Cut-Through switching
As we learned, cut-through switching (used, for example, by the Cisco Nexus 5K series) forwards frames with CRC errors, which means corrupted frames are forwarded throughout the L2 landscape wherever cut-through switching is in use. Let's see this in pictures.



Let's assume that the cable between the application server and SWA is faulty and causes CRC errors when traffic flows from the application server to the file server. These CRC errors are seen on SWA Eth1. Because SWA is a cut-through switch, it forwards the corrupt frames on to SWB1 and SWB2, so errors also appear on SWB1 Et5 and SWB2 Et4. SWB1 and SWB2 also use cut-through switching, so they propagate the errors to SWC, which sees them on Et2 and Et3. Because SWC uses store-and-forward switching, it does not send the corrupt frames any further.
Now imagine a cut-through domain of 10 or more switches: it becomes hard to find the source of the interface errors (CRCs). To find it, start from any switch and trace back toward the source by looking at interface errors. For example, if you first saw errors on SWC, check on which interfaces it received the errors and where those interfaces connect. Then log in to the neighbor switches and check the interface errors there. It is usually good practice to compare the interface error counters along the path and see whether they match or show a pattern.
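The trace-back procedure can be sketched as a small graph walk. Everything in this example is hypothetical: the counter values are invented, the mapping of SWC Et2 to SWB1 and Et3 to SWB2 is an assumption, and in practice the counters and neighbor information would come from each switch's interface statistics. The idea is simply to follow interfaces with ingress CRC errors upstream until the neighbor shows no errors of its own.

```python
from typing import Dict, List, Optional, Tuple

Interface = Tuple[str, str]  # (switch, interface)

# Hypothetical ingress CRC counters per interface (values are made up).
crc_errors: Dict[Interface, int] = {
    ("SWA", "Eth1"): 120,
    ("SWB1", "Et5"): 61,
    ("SWB2", "Et4"): 59,
    ("SWC", "Et2"): 61,
    ("SWC", "Et3"): 59,
}

# Hypothetical cabling: which device sits on the other end of each interface.
upstream: Dict[Interface, str] = {
    ("SWA", "Eth1"): "app-server",
    ("SWB1", "Et5"): "SWA",
    ("SWB2", "Et4"): "SWA",
    ("SWC", "Et2"): "SWB1",
    ("SWC", "Et3"): "SWB2",
}


def trace_crc_source(start: str,
                     errors: Dict[Interface, int],
                     neighbors: Dict[Interface, str]) -> List[Tuple[str, str, Optional[str]]]:
    """Walk upstream from `start`; return links where the errors first appear."""
    suspects = set()
    visited = set()
    stack = [start]
    while stack:
        switch = stack.pop()
        if switch in visited:
            continue
        visited.add(switch)
        for (sw, intf), count in errors.items():
            if sw != switch or count == 0:
                continue
            neighbor = neighbors.get((sw, intf))
            neighbor_has_errors = any(
                c > 0 for (s, _), c in errors.items() if s == neighbor
            )
            if neighbor_has_errors:
                stack.append(neighbor)  # errors were forwarded; keep going upstream
            else:
                suspects.add((sw, intf, neighbor))  # errors originate on this link
    return sorted(suspects)


suspect_links = trace_crc_source("SWC", crc_errors, upstream)
```

With the sample data, starting from SWC leads back through SWB1 and SWB2 to SWA Eth1, the interface facing the application server, which matches the faulty cable in the scenario above.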
