About PMTUD and MTU problems

erikvl · Post by **erikvl** » Tue Dec 05, 2017 1:40 pm

Hi,

I am looking for some more information about how OpenVPN handles MTU differences and what I might be able to do to diagnose MTU problems.
First some background.

I have an OpenVPN network with many clients (on average about 260) in subnet topology.
Traffic between clients is managed by iptables.

The tunnel MTU is left at the default of 1500. Neither fragment nor mssfix is active.

OpenVPN version is 2.2.1. I know this is old, but it is what is standard on Ubuntu 12.04 LTS. I know this is old too. An upgrade was planned long ago, but may still be some months away, mainly due to other business taking precedence. I have no control over this.
I am looking into upgrading OpenVPN beyond what is in the current repositories based on info another forum member sent me.

Meanwhile, I am looking to educate myself further on the subject of how OpenVPN handles, or is supposed to handle, MTU differences.

In my network of 260+ clients, there are currently 2 clients that cause problems. Both on the same premises. They are laptops using the OpenVPN Windows client. On the same premises are some embedded clients (provided by me) that use the same configuration as the laptops. All with their own unique identity of course.

When these clients connect and start sending or receiving, OpenVPN starts logging the well known code=90 messages, such as these:

avc-00009/213.126.118.218:15140 write UDPv4 [EMSGSIZE Path-MTU=1452|EMSGSIZE Path-MTU=1452|EMSGSIZE Path-MTU=1452|EMSGSIZE Path-MTU=1452]: Message too long (code=90)
read UDPv4 [EMSGSIZE Path-MTU=1452|EMSGSIZE Path-MTU=1452]: Message too long (code=90)

and a packet trace shows:

IP 213.126.118.218 > 10.0.8.11: ICMP 213.126.118.218 unreachable - need to frag (mtu 1452), length 36

So clearly OpenVPN receives the messages that are required for PMTUD. Also, the client does not seem to be having trouble receiving. Unless they have informed me incorrectly, they do not experience a stalled connection or any other symptoms that might indicate they are missing packets.

A packet trace of the OpenVPN UDP communication between client and server shows that OpenVPN starts by sending IP packets of 1452 bytes (1466 with ethernet header included).

2017-12-05 12:53:00.965599 10.0.8.11 213.126.118.218 IPv4 1466 Fragmented IP protocol (proto=UDP 17, off=0, ID=77f9) [Reassembled in #127]

Internet Protocol Version 4, Src: 10.0.8.11, Dst: 213.126.118.218
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
Total Length: 1452
Identification: 0x77f9 (30713)
Flags: 0x01 (More Fragments)
0... .... = Reserved bit: Not set
.0.. .... = Don't fragment: Not set
..1. .... = More fragments: Set
Fragment offset: 0
Time to live: 64
Protocol: UDP (17)
Header checksum: 0x7ee4 [validation disabled]
[Header checksum status: Unverified]
Source: 10.0.8.11
Destination: 213.126.118.218
[Source GeoIP: Unknown]
[Destination GeoIP: AS9143 Ziggo B.V., Netherlands, Den Bosch, 06, 51.585701, 6.020100]
Reassembled IPv4 in frame: 127

It was my understanding that an ICMP type 3 message specifies the maximum IP packet size of the next hop, being 1452 in this case.
OpenVPN seems to be honouring this by sending IP packets no larger than 1452 bytes.
That would lead me to conclude that the other side is sending ICMP type 3 messages with the wrong number, or is sending those messages when it should not.
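
For reference, a minimal Python sketch of how the next-hop MTU is carried in such a message. It assumes the RFC 1191 layout of an ICMP Destination Unreachable message with code 4 (type, code, checksum, two unused bytes, then a 16-bit next-hop MTU). The function name and the sample bytes are hypothetical and are not taken from the capture above.

```python
import struct
from typing import Optional


def parse_frag_needed(icmp: bytes) -> Optional[int]:
    """Return the next-hop MTU from an ICMP 'fragmentation needed' message.

    Returns None if the message is too short or is not type 3, code 4.
    """
    if len(icmp) < 8:
        return None
    # Header layout assumed from RFC 1191: type, code, checksum, unused, next-hop MTU
    icmp_type, code, _checksum, _unused, next_hop_mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type != 3 or code != 4:
        return None
    return next_hop_mtu


# Hypothetical message; the checksum is left at zero for illustration only.
sample = struct.pack("!BBHHH", 3, 4, 0, 0, 1452)
print(parse_frag_needed(sample))
```

A real message also carries the IP header and the first 8 bytes of the offending datagram after these 8 bytes, which would account for the "length 36" in the tcpdump line if the embedded IP header is 20 bytes.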

Since they say they are not having problems sending or receiving, I would think that the ICMP 3 messages are wrong. But I can't think of any reason why. Also, my entire line of thinking may be based on false premises.

Any thoughts anyone? Besides upgrading OpenVPN. The server will be upgraded shortly (hopefully).

p.s. Many other clients I have checked so far are sending and receiving IP packets of 1473 bytes (1487 with ethernet header included), which suggests OpenVPN is indeed adapting to MTU differences.

TinCanTech · Post by **TinCanTech** » Tue Dec 05, 2017 4:28 pm

erikvl wrote: ↑
Tue Dec 05, 2017 1:40 pm
I am looking for some more information about how OpenVPN handles MTU

There is very little information about this .. and what info there is is not very useful.

On the entire OpenVPN wiki page there is only one mention of MTU:
https://community.openvpn.net/openvpn/w ... tu-problem

You might get some more info from the Openvpn users mailing list.

erikvl wrote: ↑
Tue Dec 05, 2017 1:40 pm
It was my understanding that the ICMP type 3 messages is specifying the maximum IP size of the next hop

Not to my knowledge ..
https://en.wikipedia.org/wiki/Internet_ ... e_Protocol

erikvl wrote: ↑
Tue Dec 05, 2017 1:40 pm
Any thoughts anyone? Besides upgrading OpenVPN

In fact, according to The ChangeLog there have been no changes to actual MTU code since before the version you are running (except for minor changes to related items, eg. --link-mtu and --mtu-disc)

erikvl · Post by **erikvl** » Wed Dec 06, 2017 8:03 am

TinCanTech wrote: ↑
Tue Dec 05, 2017 4:28 pm
You might get some more info from the Openvpn users mailing list.

Thanks for your reply. I have just subscribed, and will be asking there too.

TinCanTech wrote: ↑
Tue Dec 05, 2017 4:28 pm

erikvl wrote: ↑
Tue Dec 05, 2017 1:40 pm
It was my understanding that the ICMP type 3 messages is specifying the maximum IP size of the next hop
Not to my knowledge ..
https://en.wikipedia.org/wiki/Internet_ ... e_Protocol

I think you may have missed this bit: https://en.wikipedia.org/wiki/Internet_ ... nreachable.

Since you are right and so little useful (to my situation) information can be found about OpenVPN MTU handling, I did some more investigation and what I found out is this.

Although I have not discovered exactly how yet, OpenVPN UDP packets do get fragmented at the IP level before they are transmitted. In my case there is a client that claims an MTU of 1452 bytes and I see OpenVPN packets of 1453 bytes split into two.
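
To make the split concrete, here is a rough Python sketch of the IPv4 fragmentation arithmetic. It assumes that the 1453 bytes are the IP total length, that the IP header is 20 bytes without options, and that every fragment except the last must carry a payload that is a multiple of 8 bytes. The function name is hypothetical.

```python
from typing import List


def fragment_lengths(total_length: int, mtu: int, header_len: int = 20) -> List[int]:
    """Return the IP total length of each fragment of one datagram."""
    if total_length <= mtu:
        return [total_length]  # fits, no fragmentation needed
    # Largest payload per fragment, rounded down to a multiple of 8 bytes
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")
    payload = total_length - header_len
    lengths = []
    while payload > max_payload:
        lengths.append(header_len + max_payload)
        payload -= max_payload
    lengths.append(header_len + payload)  # last fragment carries the remainder
    return lengths


print(fragment_lengths(1453, 1452))  # [1452, 21]
```

Under these assumptions the first fragment has a total length of 1452, which agrees with the "Total Length: 1452" and "More fragments: Set" fields in the trace from my first post.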

However, it turns out that if your OpenVPN server is behind a NAT firewall, which mine is, this IP fragmentation gets immediately undone by the NAT layer of the firewall (at least if yours, like mine, is a netfilter based firewall). This is obvious if you think about it, because NAT needs the IP and UDP headers in order to translate them. And with IP fragmentation, those headers are only in the first fragment. Therefore, the fragments are reassembled and then NAT is performed. They could have been fragmented again, but they are not. Maybe there is room for improvement of netfilter (or its successor) there, or maybe there are good reasons why it can't be done.

So basically, PMTUD seems pointless when NAT is involved. I am surprised this is never mentioned in any of the OpenVPN documentation, since it appears to be a rather significant fact. Unless I am wrong of course, which seems unlikely given my theoretical findings and experimental confirmation.

So why does OpenVPN still work on routes with smaller MTU values than that of the server? The answer is that the data packets I have inspected only have the don't fragment bit set if they are relatively small (less than 700 bytes), so routers are free to fragment the larger packets themselves (again) if their next hop so requires.

Now back to my specific situation, or rather, that of the customer.
The customer's router is receiving packets that have been reassembled by my firewall into packets larger than the 1452 it can forward unchanged. But since the don't fragment bit is not set, it can and does fragment the packet and forwards it to the client. I know this, because the customer is not losing any data and does not experience a stalled connection.

But the customer's router also makes the mistake of responding to an 'oversized' packet by returning an ICMP type 3 code 4 response, which it should only do if the packet has the don't fragment bit set and it really cannot deliver it. Clearly it can deliver, because it does, and the don't fragment bit is not set.
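
The behaviour I would expect from a router can be sketched in Python as follows. This is only my reading of the rules in RFC 791 and RFC 1191, not code from any real router, and the function name and return strings are made up for illustration.

```python
def forward_decision(total_length: int, df_set: bool, next_hop_mtu: int) -> str:
    """Decide what a router should do with one IPv4 packet (simplified)."""
    if total_length <= next_hop_mtu:
        return "forward"  # packet fits, pass it on unchanged
    if df_set:
        # Too large and not allowed to fragment: drop and report the MTU
        return "drop and send ICMP type 3 code 4"
    # Too large but fragmentation is allowed: fragment silently, no ICMP
    return "fragment and forward"


# The case described above: oversized packet, don't fragment bit not set
print(forward_decision(1473, False, 1452))  # fragment and forward
```

The customer's router appears to take the third branch (it fragments and delivers) while also sending the ICMP message that belongs to the second branch.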

There's still the mystery of why the customer's router behaves the way it does, but it is now up to them to fix it.

Hopefully, someone else can benefit in some way from this when they run into similar problems. I will be notifying the mailing list of my findings too; maybe they can use it to improve the documentation.

Obviously, if anything I have concluded is false or incomplete, I would be much obliged to learn about it.

TinCanTech · Post by **TinCanTech** » Wed Dec 06, 2017 2:59 pm

erikvl wrote: ↑
Wed Dec 06, 2017 8:03 am
I think you may have missed this bit: https://en.wikipedia.org/wiki/Internet_ ... nreachable.

Thanks for reminding me

From what you describe, I think the solution proposed here will suffice.

eg:

server

fragment 1400

client

fragment 1400
mssfix

Although,

The Manual wrote:It should also be noted that this option is not meant to replace UDP fragmentation at the IP stack level. It is only meant as a last resort when path MTU discovery is broken. Using this option is less efficient than fixing path MTU discovery for your IP link and using native IP fragmentation instead

erikvl wrote: ↑
Wed Dec 06, 2017 8:03 am
There's still the mystery of why the customer's router behaves the way it does, but it is now up to them to fix it.

Which translates to "fixing path MTU discovery for your IP link"

erikvl · Post by **erikvl** » Wed Dec 06, 2017 3:24 pm

TinCanTech wrote: ↑
Wed Dec 06, 2017 2:59 pm
From what you describe, I think the solution proposed here will suffice.

eg:

server
fragment 1400

client
fragment 1400
mssfix

Although,
The Manual wrote:It should also be noted that this option is not meant to replace UDP fragmentation at the IP stack level. It is only meant as a last resort when path MTU discovery is broken. Using this option is less efficient than fixing path MTU discovery for your IP link and using native IP fragmentation instead

Luckily, the problem does not need fixing after all. At least not with regard to the MTU. With more than 250 clients in the field that would really be a problem.

We are receiving the ICMP type 3 code 4 messages from a faulty router, so dropping those messages from that router fixes the contamination in the syslog. The customer will need to fix their router, because all of their traffic is tainted by the same problem and surely their throughput must suffer from it.

Communication was fine all along, our logs are now clean again, and I have come away with a deeper understanding of several topics. Not bad.

TinCanTech wrote: ↑
Wed Dec 06, 2017 2:59 pm

erikvl wrote: ↑
Wed Dec 06, 2017 8:03 am
There's still the mystery of why the customer's router behaves the way it does, but it is now up to them to fix it.
Which translates to "fixing path MTU discovery for your IP link"

Actually no, PMTUD is fine. At least as far as the OpenVPN server is concerned. It correctly detects the MTU of 1452 bytes and IP fragmentation breaks up larger packets into smaller ones.

However, once the packets hit the firewall, the firewall reassembles them to perform NAT. And then they leave the firewall too large again. But this is beyond the control (and awareness) of the OpenVPN server. It is also unavoidable, which is the point I was trying to make. If you place an OpenVPN server behind a firewall/router, NAT is required. And NAT undoes IP fragmentation. I am sure you are not advocating connecting the server directly to the internet, are you?

But even though NAT undoes the IP fragmentation of the OpenVPN server, the packet being larger than the MTU further down the link does not cause a problem, because the don't fragment bit is not set. A router down the link that has a next hop with an MTU smaller than the packet's size will simply fragment it again as needed.

In conclusion, PMTUD behind a router performing NAT is pointless.

OpenVPN Support Forum