FreeRTOS+TCP throughput

Hi, I’m having difficulty maximizing throughput with FreeRTOS+TCP. The same code with a very old FreeRTOS and lwIP achieved ~47 MB/s, but now I can’t get more than ~17 MB/s, so I must be doing something wrong. Any suggestions on configuration parameter settings that could help me? I attached my current config files. Thanks.

FreeRTOS+TCP throughput

Which target (MCU) are you using? Can you please attach your FreeRTOSIPConfig.h file to your post.

FreeRTOS+TCP throughput

I’m using a Xilinx Zynq 7000 SoC. I already attached FreeRTOSIPConfig.h to my post above.

FreeRTOS+TCP throughput

Hi, a throughput of 17 MB/s is indeed very low for a Zynq. How do you measure this speed? Can you check this page, especially the part about FREERTOS_SO_WIN_PROPERTIES? When TCP is slow, in most cases there is not enough buffer space, or the TCP window size is too small. Can you set them explicitly?

FreeRTOS+TCP throughput

I’m measuring throughput by sending data to a PC and measuring the amount of data received in one second. I tried to set the window parameters:
/* Unit: bytes */
win_props.lTxBufSize = 8 * ipconfigTCP_MSS;
/* Unit: MSS */
win_props.lTxWinSize = 4;
/* Unit: bytes */
win_props.lRxBufSize = 8 * ipconfigTCP_MSS;
/* Unit: MSS */
win_props.lRxWinSize = 4;

on the listening socket, but it improved the throughput only slightly (maybe 2 MB/s more).

FreeRTOS+TCP throughput

I’m still having this problem. Any idea what else I could try?

FreeRTOS+TCP throughput

Sorry I responded so late: I was travelling last month. It’s good you pinged me.
> I tried to set window parameters on the listening socket
The values look good. I suppose you have set the socket options before calling accept()? All settings will be inherited by the child sockets. Could you post a (well zipped) PCAP file that shows the TCP conversation? You may want to try out an iperf3 server (see here below). I would recommend adding these IPERF settings to your FreeRTOSIPConfig.h:
~~~
#define ipconfigIPERF_VERSION 3
#define ipconfigIPERF_STACK_SIZE_IPERF_TASK 680
#define ipconfigIPERF_DOES_ECHO_UDP 0

#define ipconfigIPERF_TX_BUFSIZE ( 24 * ipconfigTCP_MSS )
#define ipconfigIPERF_TX_WINSIZE ( 12 )
#define ipconfigIPERF_RX_BUFSIZE ( 24 * ipconfigTCP_MSS )
#define ipconfigIPERF_RX_WINSIZE ( 12 )

/* The iperf module declares a character buffer to store its send data. */
#define ipconfigIPERF_RECV_BUFFER_SIZE ( 12 * ipconfigTCP_MSS )
~~~

FreeRTOS+TCP throughput

Yes, I’m setting the socket options before the accept() call. To be more precise: right after creating the socket. You can find the PCAP in the attachments.

FreeRTOS+TCP throughput

Thank you. The Zynq is indeed responding very slowly: it resumes transmission 300 µs after an acknowledgement. You could try increasing the sizes:
~~~
win_props.lTxBufSize = 48 * ipconfigTCP_MSS;
win_props.lTxWinSize = 24;
win_props.lRxBufSize = 48 * ipconfigTCP_MSS;
win_props.lRxWinSize = 24;
~~~
But still, I wouldn’t expect a big increase. Can you monitor your CPU load? Do the 2 tasks (IP-task and the MAC-driver) get enough CPU time? Or do you have higher-priority tasks that keep them from running? If you have time, could you try out iperf3?

FreeRTOS+TCP throughput

I commented out almost all of my code and ran only iperf. Here’s the output:

iperf3.exe -V -c 172.16.0.215 --port 5001 --bytes 100M
[ 4] local 172.16.0.1 port 50317 connected to 172.16.0.215 port 5001
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 104857600 bytes to send
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.01 sec 27.0 MBytes 224 Mbits/sec
[ 4] 1.01-2.01 sec 27.0 MBytes 226 Mbits/sec
[ 4] 2.01-3.01 sec 28.4 MBytes 238 Mbits/sec
[ 4] 3.01-3.60 sec 17.6 MBytes 251 Mbits/sec
Test Complete. Summary Results:
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-3.60 sec 100 MBytes 233 Mbits/sec sender
[ 4] 0.00-3.60 sec 99.8 MBytes 233 Mbits/sec receiver
CPU Utilization: local/sender 6.0% (1.7%u/4.2%s), remote/receiver 0.0% (0.0%u/0.0%s)

That’s better but still not good enough.

FreeRTOS+TCP throughput

I almost started doubting myself, so I ran the demo on my Zybo board again. When running iperf in sending mode, the board receives data at about 477 Mbits/sec. With the -R option, the board sends data at about 517 Mbits/sec. That is twice as much as what you measured. It is also what I remember observing when I developed the FreeRTOS Zynq demo.

So what is the difference between your board / application and mine? Is your LAN

I won’t attach my application, but if you write me an email, I will forward it to you. It runs on both a MicroZed and a Zybo board. My address is hein [at] htibosch [point] com

Some questions: Are you also using compiler optimisation? I’m using GCC with -Os (optimise for size). What version of memcpy() are you using? I’m using GCC’s own version.

I attached a PCAP showing the first 1000 packets of the iperf conversation. It shows that a window of about 15 KB would be enough to get optimal results.

FreeRTOS+TCP throughput

> the difference between your board / application and mine? Is your LAN
Oops. I left that question unfinished: I wanted to ask whether your LAN is quiet enough to leave room for your iperf data. My own LAN was 99% available during the test.

About the PCAP that I attached: you will see that an ACK arrives about every 10 packets. That is why I wrote that a TCP window size of 10 x 1.46 KB would be enough. Here is an example of a TCP window size of 12 packets:
~~~
#define ipconfigIPERF_TX_BUFSIZE ( 24 * ipconfigTCP_MSS ) /* Units of bytes. */
#define ipconfigIPERF_TX_WINSIZE ( 12 ) /* Size in units of MSS */
#define ipconfigIPERF_RX_BUFSIZE ( 24 * ipconfigTCP_MSS ) /* Units of bytes. */
#define ipconfigIPERF_RX_WINSIZE ( 12 ) /* Size in units of MSS */
~~~
No one can guarantee a constant high TCP throughput. If you write the application for both sides, I would consider using UDP instead.

FreeRTOS+TCP throughput

I’m using -O2 optimization and GCC’s memcpy. I tried the -Os flag, but the differences are very small. Also, I use a dedicated LAN connection: the PC and my device are connected directly via cable, and I don’t use that connection for anything else. I didn’t try iperf with the -R option before, so here are the results:

iperf3.exe -c 172.16.0.215 --port 5001 --bytes 100M
Connecting to host 172.16.0.215, port 5001
[ 4] local 172.16.0.1 port 49832 connected to 172.16.0.215 port 5001
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 30.6 MBytes 257 Mbits/sec
[ 4] 1.00-2.00 sec 31.6 MBytes 265 Mbits/sec
[ 4] 2.00-3.00 sec 31.9 MBytes 267 Mbits/sec
[ 4] 3.00-3.17 sec 5.88 MBytes 292 Mbits/sec
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-3.17 sec 100 MBytes 265 Mbits/sec sender
[ 4] 0.00-3.17 sec 99.8 MBytes 264 Mbits/sec receiver

iperf3.exe -c 172.16.0.215 --port 5001 --bytes 100M -R
Connecting to host 172.16.0.215, port 5001
Reverse mode, remote host 172.16.0.215 is sending
[ 4] local 172.16.0.1 port 49867 connected to 172.16.0.215 port 5001
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 54.0 MBytes 453 Mbits/sec
[ 4] 1.00-1.77 sec 46.0 MBytes 503 Mbits/sec
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-1.77 sec 37.0 Bytes 168 bits/sec 4294967295 sender
[ 4] 0.00-1.77 sec 100 MBytes 475 Mbits/sec receiver

That’s much better. But if I switch to my code, I still get ~19 MB/s. So it seems the problem is somewhere in my code, not in FreeRTOS. I checked the task priorities, and all the tasks I create have lower priorities than ipTask. I haven’t profiled the code yet to see what uses the most CPU, since that’s a bit tricky in this environment.

FreeRTOS+TCP throughput

When you use the -R option, it means that the Zynq is sending data. That gives a good performance. If your application mostly sends from the device to the outside world, that is very good news.

When you omit the -R option, the Zynq receives data and it is more dependent on the speed of its partner. In your test you only get an average speed of about 260 Mbps:

[ 4] 0.00-1.00 sec 30.6 MBytes 257 Mbits/sec
[ 4] 1.00-2.00 sec 31.6 MBytes 265 Mbits/sec

If your embedded device mostly receives data, you may also want to optimise the software on your host… if that is possible. Does your embedded application mostly send or mostly receive data?

There are many techniques to optimise TCP communication. You can have a look at the iperf module, or at the projects under “protocols”, such as the FTP server. If it is possible, can you post some code that sends/receives the ‘bulk’ TCP data? You can also send it to the email address that I mentioned above.

FreeRTOS+TCP throughput

The application mostly sends data, so the difference between my throughput and iperf’s is even more interesting. I tried to send you an email at the address you provided, but Gmail said that it cannot find that hostname.

FreeRTOS+TCP throughput

Wrong email address: oops, the correct address is hein [at] htibosch [point] net. It is not a .com.

FreeRTOS+TCP throughput

Ok, I sent it now.