Path MTU and multiple layers of NAT -- bad?
Richard Jones
19 years ago
I've got a strange case where I can't fetch web pages from a remote
server. I can ping the server, I can connect to the remote server,
but after that the connection hangs until close.

The difference seems to be that (for various reasons) I'm trying to
connect over multiple (3) layers of NAT. My best guess is that path
MTU discovery is screwed up somehow.

When I do:

# telnet annexia.org 80
Trying 80.68.91.176...
Connected to furbychan.cocan.org.
Escape character is '^]'.
GET / HTTP/1.0
Host: annexia.org
<blank line>
<--- hangs here for several minutes, followed by:
Connection closed by foreign host.

This is the tcpdump on the outgoing interface, annotated:

### Nameserver query for annexia.org
14:01:11.586250 IP 10.0.1.2.1025 > 10.0.0.2.domain: 5488+ AAAA? annexia.org. (29)
14:01:11.586731 IP 10.0.0.2.domain > 10.0.1.2.1025: 5488 0/1/0 (90)
14:01:11.586968 IP 10.0.1.2.1025 > 10.0.0.2.domain: 5489+ AAAA? annexia.org.merjis.com. (40)
14:01:11.587035 IP 10.0.0.2.domain > 10.0.1.2.1025: 5489 0/1/0 (91)
14:01:11.587126 IP 10.0.1.2.1025 > 10.0.0.2.domain: 5490+ A? annexia.org. (29)
14:01:11.587225 IP 10.0.0.2.domain > 10.0.1.2.1025: 5490 1/3/3 A furbychan.cocan.org (157)
14:01:11.587516 IP 10.0.1.2.1025 > 10.0.0.2.domain: 2122+ PTR? 176.91.68.80.in-addr.arpa. (43)
14:01:11.587621 IP 10.0.0.2.domain > 10.0.1.2.1025: 2122 1/3/3 (189)

### Initial 3 way handshake - NB it's successful.
14:01:11.588170 IP 10.0.1.2.1550 > furbychan.cocan.org.www: S 4206660655:4206660655(0) win 5840 <mss 1460,sackOK,timestamp 926083 0,nop,wscale 2>
14:01:11.610829 IP furbychan.cocan.org.www > 10.0.1.2.1550: S 3680005253:3680005253(0) ack 4206660656 win 5792 <mss 1460,sackOK,timestamp 164089147 926083,nop,wscale 0>
14:01:11.610873 IP 10.0.1.2.1550 > furbychan.cocan.org.www: . ack 1 win 1460 <nop,nop,timestamp 926089 164089147>

### Sending "GET / HTTP/1.0<CR><LF>" without response.
14:01:16.052618 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 927199 164089147>
14:01:16.577667 arp who-has 10.0.1.2 tell 10.0.1.129
14:01:16.577752 arp reply 10.0.1.2 is-at 00:16:3e:5a:c6:75
14:01:19.478040 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 928056 164089147>
14:01:23.062258 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 928952 164089147>
14:01:30.230710 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 930744 164089147>
14:01:35.543046 arp who-has 10.0.1.129 tell 10.0.1.2
14:01:35.543055 arp reply 10.0.1.129 is-at fe:ff:ff:ff:ff:ff
14:01:44.579614 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 934331 164089147>
14:02:13.253398 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 941499 164089147>
14:02:18.253698 arp who-has 10.0.1.129 tell 10.0.1.2
14:02:18.253705 arp reply 10.0.1.129 is-at fe:ff:ff:ff:ff:ff
14:03:10.600990 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 955835 164089147>
14:03:15.949320 arp who-has 10.0.1.129 tell 10.0.1.2
14:03:15.949327 arp reply 10.0.1.129 is-at fe:ff:ff:ff:ff:ff
14:05:05.392192 IP 10.0.1.2.1550 > furbychan.cocan.org.www: P 1:17(16) ack 1 win 1460 <nop,nop,timestamp 984531 164089147>
14:05:10.412474 arp who-has 10.0.1.129 tell 10.0.1.2
14:05:10.412487 arp reply 10.0.1.129 is-at fe:ff:ff:ff:ff:ff
14:05:55.595302 arp who-has 10.0.1.129 tell 10.0.1.2
14:05:55.595312 arp reply 10.0.1.129 is-at fe:ff:ff:ff:ff:ff

### Remote host closes the connection.
14:06:12.587427 IP furbychan.cocan.org.www > 10.0.1.2.1550: F 1:1(0) ack 1 win 5792 <nop,nop,timestamp 164119247 926089>
14:06:12.587667 IP 10.0.1.2.1550 > furbychan.cocan.org.www: FP 17:38(21) ack 2 win 1460 <nop,nop,timestamp 1001328 164119247>
14:06:12.861837 IP furbychan.cocan.org.www > 10.0.1.2.1550: F 1:1(0) ack 1 win 5792 <nop,nop,timestamp 164119276 926089>
14:06:12.861982 IP 10.0.1.2.1550 > furbychan.cocan.org.www: . ack 2 win 1460 <nop,nop,timestamp 1001397 164119276,nop,nop,sack sack 1 {1:2} >
14:06:17.584477 arp who-has 10.0.1.2 tell 10.0.1.129
14:06:17.584631 arp reply 10.0.1.2 is-at 00:16:3e:5a:c6:75
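
The spacing of the repeated 16-byte segment in the trace above can be checked with a short Python sketch. The timestamps are copied from the trace; the function and constant names are only illustrative. After the first retry, each gap comes out at roughly twice the previous one (about 3.6, 7.2, 14.3, 28.7, 57.3 and 114.8 seconds), which is what ordinary TCP retransmission backoff looks like when no ACK ever comes back.

```python
from datetime import datetime

# Capture times of the 16-byte segment and its retransmissions,
# copied from the tcpdump output above.
SEND_TIMES = [
    "14:01:16.052618",
    "14:01:19.478040",
    "14:01:23.062258",
    "14:01:30.230710",
    "14:01:44.579614",
    "14:02:13.253398",
    "14:03:10.600990",
    "14:05:05.392192",
]


def gaps_in_seconds(stamps):
    """Return the delay between each consecutive pair of timestamps."""
    times = [datetime.strptime(s, "%H:%M:%S.%f") for s in stamps]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Print one gap per line so the doubling pattern is easy to see.
for gap in gaps_in_seconds(SEND_TIMES):
    print(f"{gap:8.3f} s")
```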

Any ideas or further things I can try?

Rich.
--
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com
--
Gllug mailing list - ***@gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug
Daniel P. Berrange
19 years ago
Post by Richard Jones
> I've got a strange case where I can't fetch web pages from a remote
> server. I can ping the server, I can connect to the remote server,
> but after that the connection hangs until close.
> The difference seems to be that (for various reasons) I'm trying to
> connect over multiple (3) layers of NAT. My best guess is that path
> MTU discovery is screwed up somehow.
> Any ideas or further things I can try?

With PMTU discovery enabled, all packets are sent with the DF bit set,
so to test whether this is the problem, temporarily disable PMTU
discovery and let the packets fragment normally:

echo 1 > /proc/sys/net/ipv4/ip_no_pmtu_disc
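
Since the change is meant to be temporary, it may help to restore the old value afterwards. Here is a minimal Python sketch of that idea, assuming a Linux host and root privileges. The helper name `pmtu_discovery_disabled` is hypothetical; the only thing taken from the command above is the sysctl path.

```python
from contextlib import contextmanager
from pathlib import Path

# Path taken from the echo command above; writing to it needs root on Linux.
NO_PMTU_DISC = Path("/proc/sys/net/ipv4/ip_no_pmtu_disc")


@contextmanager
def pmtu_discovery_disabled(path=NO_PMTU_DISC):
    """Write 1 to the sysctl file for the duration of the block, then restore."""
    original = path.read_text().strip()  # remember the current setting
    path.write_text("1\n")               # 1 = do not use PMTU discovery
    try:
        yield original
    finally:
        path.write_text(original + "\n")  # put the old value back


# Usage (as root): rerun the failing test inside the block.
#
# with pmtu_discovery_disabled():
#     ...  # e.g. repeat the telnet test by hand
```

The `path` parameter exists only so the helper can be tried against an ordinary file without touching the real setting.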


Regards,
Dan.
--
|=- GPG key: http://www.berrange.com/~dan/gpgkey.txt -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- ***@redhat.com - Daniel Berrange - ***@berrange.com -=|
Richard Jones
19 years ago
...
Right, well, turns out that PMTU discovery *isn't* the problem in that
case. Any further ideas ?!?

It's very strange that it can ping and can manage the 3-way handshake,
but can't send a packet with a 16-byte payload.

Tracing from the web server end indicates that the packet with the
16-byte payload is never received.

Rich.
Richard Jones
19 years ago
This turns out to be an obscure Xen bug ... Nothing to do with either
pMTUd or NAT.

http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=495

Rich.
Russell Howe
19 years ago
Post by Richard Jones
> This turns out to be an obscure Xen bug ... Nothing to do with either
> pMTUd or NAT.
> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=495
There's some stuff in the 2.6.15 changelog about bridging (or was it
GRE...?) and checksums, IIRC. It might just be for devices which do
hardware checksums. Might be worth a grep through changelogs?
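
For anyone following the checksum angle: the thread itself does not say whether the Xen bug involves checksums, so treat this purely as background. Below is a small Python sketch of the 16-bit one's-complement Internet checksum described in RFC 1071, which is the kind of value a checksum-offloading device is expected to fill in. The function names are illustrative, and a real TCP checksum also covers a pseudo-header that is left out here.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    # Fold any carries out of the top bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def checksum_ok(data_with_checksum: bytes) -> bool:
    """Data that carries its own correct checksum sums to zero."""
    return internet_checksum(data_with_checksum) == 0
```

A receiver that runs the second check on a packet whose checksum field was never filled in would drop it silently, which is one way a packet can leave the sender and never show up in a trace at the far end.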
--
Russell Howe | Why be just another cog in the machine,
***@siksai.co.uk | when you can be the spanner in the works?
Nix
19 years ago
Post by Russell Howe
> Post by Richard Jones
>> This turns out to be an obscure Xen bug ... Nothing to do with either
>> pMTUd or NAT.
>> http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=495
> There's some stuff in the 2.6.15 changelog about bridging (or was it
> GRE...?) and checksums, IIRC. It might just be for devices which do
> hardware checksums. Might be worth a grep through changelogs?
It certainly doesn't affect all devices with bridges: I have one here
and it works fine both before and after the upgrade to 2.6.15.

There's an IPv6 length check fix in there, a device add fix, and a fix
for a netfilter leak on hosts with multiple bridges on them (!), and
that seems to be pretty much it. Nothing that affects ordinary
unchanging single IPv4 bridges that I can see.
--
`... follow the bouncing internment camps.' --- Peter da Silva