Embedded Design Handbook

ID 683689
Date 8/28/2023
Public

7.4.3.1.3. The Sockets API

After tuning your application to become more computationally efficient (thereby freeing more of the processor’s time for operating the networking stack), you can optimize how the application uses the networking stack. This section describes how to select the best protocol for use by your application and the most efficient way to use the Sockets API.

Selecting the Right Networking Protocol

When using the Sockets API, you must also select which protocol to use for transporting data across the network. There are two main protocols used to transport data across networks: TCP and UDP. Both of these protocols perform the basic function of moving data across Ethernet networks, but they have very different implementations and performance implications. The table below compares the two protocols.

Table 55.  The UDP and TCP Protocols

Parameter                        UDP                     TCP
-------------------------------  ----------------------  -------------------
Connection Mode                  Connectionless          Connection-Oriented
In-Order Data Guarantee          No                      Yes
Data Integrity and Validation    No                      Yes
Data Retransmission              No                      Yes
Data Checksum                    Yes; can be disabled    Yes

In terms of raw throughput, UDP is much faster than TCP because it has very little overhead: a UDP header is 8 bytes, compared with at least 20 bytes for a TCP header, and UDP keeps no connection state. UDP makes no attempt to validate that the data being sent arrived at its destination (or even that the destination is capable of receiving packets), so the network stack needs to perform much less work to send or receive data using this protocol.

However, aside from very specialized cases where your embedded system can tolerate losing data (for example, streaming multimedia applications), use the TCP protocol.

Note: UDP gives the fastest performance possible, but use TCP whenever you must guarantee that the data is delivered.
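
In Sockets API terms, the protocol is chosen when the socket is created. The following is a minimal sketch of that choice, assuming POSIX-style headers (the header names in a NicheStack project may differ); the `transport` enum and `open_transport_socket()` helper are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical enum naming the two transports compared in the table. */
typedef enum { TRANSPORT_TCP, TRANSPORT_UDP } transport;

/*
 * Hypothetical helper: create an IPv4 socket for the chosen transport.
 * SOCK_STREAM selects TCP (connection-oriented, ordered, retransmitted);
 * SOCK_DGRAM selects UDP (connectionless, no delivery guarantee).
 * Returns the socket descriptor, or -1 on failure.
 */
int open_transport_socket(transport t)
{
    assert(t == TRANSPORT_TCP || t == TRANSPORT_UDP);
    int type = (t == TRANSPORT_TCP) ? SOCK_STREAM : SOCK_DGRAM;
    return socket(AF_INET, type, 0);
}
```

A TCP socket created this way is then used with send() and recv(), and a UDP socket with sendto() and recvfrom().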

Improving Send and Receive Performance

Proper use of the Sockets API in your application can also increase the overall networking throughput of your system. The following list describes several ways to optimally use the Sockets API:

  • Minimize send and receive function calls—The Sockets API provides two sets of functions for sending and receiving data through the networking stack. For the UDP protocol these functions are sendto() and recvfrom(). For the TCP protocol these functions are send() and recv().

    Depending on which transport protocol you use (TCP or UDP), your application uses one of these sets of functions. To increase overall performance, avoid calling these functions repetitively to handle small units of data. Every call to these functions incurs a fixed time penalty for execution, which can compound quickly when these functions are called multiple times in rapid succession. Combine data that you want to send (or receive) and call these functions with the largest possible amount of data at one time.

    Note: Call the Sockets API’s send and receive functions with larger buffer sizes to minimize system call overhead.
  • Minimize latency when sending data—Although the TCP Sockets send() function can accept an arbitrary number of bytes, those bytes might not be immediately sent as a packet. This situation is especially likely when send() is called with a small number of bytes, because the networking stack attempts to coalesce these small data chunks into a larger packet. Small data chunks are coalesced to avoid congesting the network with many small packets (using the Nagle algorithm for congestion avoidance). There is a solution, however, through the use of the TCP_NODELAY flag.

    Setting a socket’s TCP_NODELAY flag with the setsockopt() function call disables the Nagle algorithm. The socket then sends whatever bytes are passed in as a TCP packet immediately, without waiting to coalesce them. Disabling the Nagle algorithm reduces latency and can increase effective network throughput when your application must send many small chunks of data very quickly.

    Note: If you need to accelerate the transmission of small TCP packets, use the TCP_NODELAY flag on your socket. You can find an example of setting the TCP_NODELAY flag in the benchmarking application software in the Nios® II Ethernet acceleration design example.

    While disabling the Nagle algorithm usually causes smaller packets to be immediately sent over the network, the networking stack might still coalesce some of the packets into larger packets. This situation is especially likely in the case of the Windows workstation platform. However, you can expect the networking stack to do so with much lower frequency than if the Nagle algorithm were enabled.
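
The two recommendations above can be sketched as follows. This is an illustrative sketch, not the code from the design example: it assumes POSIX-style headers and setsockopt() arguments, the `send_batch` type and the `send_all()`, `batch_append()`, `batch_flush()`, and `set_tcp_nodelay()` helpers are hypothetical names, and the 1460-byte capacity is an assumed value (roughly one TCP payload on standard Ethernet).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

#define BATCH_CAPACITY 1460  /* assumed size, not taken from the handbook */

/* Hypothetical staging buffer that collects small records before sending. */
typedef struct {
    unsigned char data[BATCH_CAPACITY];
    size_t used;
} send_batch;

/* Send exactly len bytes, looping because send() may accept fewer. */
int send_all(int fd, const unsigned char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hand all staged bytes to the stack, normally with a single send() call.
 * The buffer is emptied even on failure, so staged data is dropped then. */
int batch_flush(int fd, send_batch *b)
{
    int rc = send_all(fd, b->data, b->used);
    b->used = 0;
    return rc;
}

/* Stage a small record instead of sending it on its own. The batch is
 * flushed first if the record would not fit; a record larger than the
 * whole buffer is sent directly. */
int batch_append(int fd, send_batch *b, const void *rec, size_t len)
{
    assert(b != NULL && rec != NULL);
    if (len > BATCH_CAPACITY - b->used && batch_flush(fd, b) != 0)
        return -1;
    if (len > BATCH_CAPACITY)
        return send_all(fd, rec, len);
    memcpy(b->data + b->used, rec, len);
    b->used += len;
    return 0;
}

/* Enable (1) or disable (0) TCP_NODELAY; enabling it turns off Nagle. */
int set_tcp_nodelay(int fd, int enable)
{
    int flag = enable ? 1 : 0;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof flag);
}
```

Batching suits data that can wait until the buffer fills or the application calls batch_flush(); TCP_NODELAY suits small messages that must leave immediately. The two address different goals, so an application would typically pick whichever matches its traffic.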

The Zero Copy API

The NicheStack networking stack provides a further optimization, called the zero copy API, that accelerates data transfers to and from the stack. The zero copy API increases overall system performance by eliminating the buffer management scheme performed by the Sockets API’s read and write function calls. The application manages the send and receive data buffers directly, eliminating an extra level of data copying performed by the Nios® II processor.

Note: Using the NicheStack Zero Copy API can accelerate your network application’s throughput by eliminating an extra layer of copying.
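
The idea of removing a copy can be illustrated with a small stand-alone model. This sketch does not use the NicheStack zero copy API: the `packet` type, the `stack_pool` buffer, and the `copying_send()`, `zc_alloc()`, and `zc_send()` functions are hypothetical stand-ins, and no data is actually transmitted. It only contrasts a send path where the stack copies the application's buffer with one where the application fills a stack-owned buffer in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PKT_SIZE 1460  /* assumed payload capacity for this model */

/* Hypothetical stack-owned packet buffer (not a NicheStack type). */
typedef struct {
    unsigned char payload[PKT_SIZE];
    size_t len;
} packet;

packet stack_pool[1];   /* stand-in for the stack's buffer pool */
size_t bytes_copied;    /* bytes the CPU copied on the send path */

/* Conventional path: the stack copies the caller's buffer into its packet. */
packet *copying_send(const void *app_buf, size_t len)
{
    assert(len <= PKT_SIZE);
    packet *p = &stack_pool[0];
    memcpy(p->payload, app_buf, len);
    bytes_copied += len;
    p->len = len;
    return p;
}

/* Zero copy path, step 1: the application borrows a packet from the stack
 * and writes its data straight into p->payload. */
packet *zc_alloc(void)
{
    stack_pool[0].len = 0;
    return &stack_pool[0];
}

/* Zero copy path, step 2: hand the filled packet back. Only the length is
 * recorded; the payload is already where the stack needs it. */
packet *zc_send(packet *p, size_t len)
{
    assert(p != NULL && len <= PKT_SIZE);
    p->len = len;
    return p;
}
```

In this model the zero copy path leaves `bytes_copied` unchanged, while the conventional path adds the full payload length on every call. The trade-off is that the application takes on responsibility for the buffer's lifetime.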