Partager via


TMG stopped processing web proxy requests

This post is about an issue I worked on several days ago.

Symptom:

========

My customer had a TMG array with two nodes running with NLB. The problem they faced was that from time to time some TMG node couldn't process traffic anymore: requests to the virtual IP (VIP) failed and only rebooting the TMG machine eliminated the issue.

IE was configured to use the NLB virtual IP (VIP) as proxy address. When the issue happened users couldn’t browse internet pages -  the browser didn't show any error page, it just stayed with a blank page.

Troubleshooting:

=============

First of all, I looked at the TMG Web proxy log and found an  error logged there with the result code of 11001 (WSAHOST_NOT_FOUND) for  the request for  “contoso.com”, which we tried to browse in order to reproduce the issue.

Then I took a look at a captured network monitor trace in order to find an appropriate network conversation between the client and TMG:

    Time                             Delta time     Source           Destination        Protocol Length   Info

11:07:26.566261000  1.376436000    CLIENT_IP          TMG_VIP           HTTP     433    GET http://contoso.com/ HTTP/1.1

11:07:26.570933000  0.004672000    TMG_VIP           CLIENT_IP          HTTP     1440   HTTP/1.1 502 Proxy Error ( The host was not found. )  (text/html)

In a web proxy scenario, the web proxy is supposed to resolve the name, however, by some reason the node didn't manage to resolve the remote host.

As the next step, I filtered out the trace for DNS traffic and didn't find DNS packets on the node at all – this looked very suspicious.

Therefore I looked at Netstat and its output showed around ~62000 lines similar to the followings, whereas 2580 was the pid of wspsrv.exe:

  UDP    0.0.0.0:4002           *:*                                    2580

  UDP    0.0.0.0:4003           *:*                                    2580

  UDP    0.0.0.0:4004           *:*                                    2580

  UDP    0.0.0.0:4005           *:*                                    2580

  UDP    0.0.0.0:4006           *:*                                    2580

  UDP    0.0.0.0:4007           *:*                                    2580

  UDP    0.0.0.0:4020           *:*                                    2580

  UDP    0.0.0.0:4021           *:*                                    2580

  UDP    0.0.0.0:4022           *:*                                    2580

  UDP    0.0.0.0:4023           *:*                                    2580

  UDP    0.0.0.0:4024           *:*                                    2580

  UDP    0.0.0.0:4025           *:*                                    2580

  UDP    0.0.0.0:4026           *:*                                    2580

So TMG seemed to use up all UDP ports – hence there was no free UDP port left to use as source port for dns queries. This explained why name resolution failed and why TMG returned 502.

Why was such a great amount of ports used by TMG?

My guess was that it might be doing it on behalf of a client. Therefore, I looked at the Firewall log which showed that there were huge amount of UDP requests from a single client to different external hosts.

Luckily TMG firewall client was installed on the client machine which gave us an application name:

7:27:51 Establish CLINET2_IP 2058 EXTERNAL_HOST_1 54150 0x0 Access Internet UDP domain\user smax4pnp.exe:3:5.1 TMG1 Unidentified IP Traffic

7:27:52 Establish CLINET2_IP 2033 EXTERNAL_HOST_2 9362 0x0 Access Internet UDP domain\user smax4pnp.exe:3:5.1 TMG1 Unidentified IP Traffic

7:27:53 Terminate CLINET2_IP 2026 EXTERNAL_HOST_3 59866 0x80074e20 Access Internet UDP domain\user smax4pnp.exe:3:5.1 TMG1 Unidentified IP Traffic

So TMG used up all the available ports due to the excessive amount of request from this  client/application.

In order to resolve the issue the customer disabled the client  that has  been causing this huge UDP traffic.

 

Author:

Vasily Kobylin, Senior Support Engineer, EMEA Forefront Edge Team

 

Reviewer:

Balint Toth, Senior Support Escalation Engineer, EMEA Forefront Edge Team

Comments

  • Anonymous
    January 01, 2003
    As a side note, flood mitigation could have helped here as well, but this was not enabled on the TMG machine in question. Thanks Arpad for pointing out.

  • Anonymous
    May 08, 2014
    h aysHY@H saus aysdhsauw@jajwaus wdsa!!!