[upstream regression] wget does not honor dot-prefixed domains in no_proxy env variable

Bug #1861440 reported by Shawn K. O'Shea
This bug affects 2 people
Affects         Status      Importance   Assigned to   Milestone
wget (CentOS)   Unknown     Critical
wget (Ubuntu)   Confirmed   Undecided    Unassigned

Bug Description

Traditionally (AFAIK for at least the last decade), tools that honor the no_proxy environment variable let you exclude every host under a domain by prefixing the domain with a "dot". For example, to exclude any website under example.com from using the proxy, you would set no_proxy to .example.com (export no_proxy=.example.com).

A regression in the wget 1.19 series changed this behavior to expect non-prefixed domains (example.com instead of .example.com). The regression was ultimately fixed in the 1.20 release of wget. bionic ships a version of wget that still has the regressed behavior.

The regression was apparently introduced in wget 1.19.3. This bug should not affect other Ubuntu releases (xenial contains 1.17.1, and both disco and eoan contain 1.20.x versions that have the upstream fix).
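The version range above can be turned into a quick check. This is only a rough sketch: the helper names version_ge and wget_version_affected are made up for illustration, it assumes GNU "sort -V" is available, and it only looks at the upstream version number, so a distro package that backports the fix without bumping the version would still be flagged.

```shell
#!/bin/sh
# Hypothetical helpers, not part of wget or any distro tooling.

version_ge() {
    # True when $1 >= $2 under GNU "sort -V" version ordering (assumed available).
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

wget_version_affected() {
    # Drop a packaging revision such as "-1ubuntu2.2" and keep the upstream part.
    upstream=${1%%-*}
    # Range taken from this report: introduced in 1.19.3, fixed in 1.20.
    version_ge "$upstream" 1.19.3 && ! version_ge "$upstream" 1.20
}
```

For example, wget_version_affected 1.19.4-1ubuntu2.2 succeeds (the bionic package from this report), while wget_version_affected 1.20.3 and wget_version_affected 1.17.1 both fail.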

For more details, see references below and my additional comments on the RHEL8 bug filed for this issue (RH bug 1763702 linked below).

What happens:
With no_proxy=.example.com set, wget in bionic still sends requests for URLs like http://www.example.com/ to the proxy server, despite the exception requested via no_proxy.

What should happen:
The request should bypass the proxy and go directly to the web server (this works as expected in xenial, disco and eoan).
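Possible stopgap until a fixed package lands: since the regressed builds are described above as expecting non-prefixed domains, listing both forms of each domain may keep one no_proxy value usable on both old and new wget. The sketch below is untested against an affected wget, and expand_no_proxy is a made-up helper name. It appends the dot-stripped form after every dot-prefixed entry and does not deduplicate.

```shell
#!/bin/sh
# Hypothetical helper: for each ".domain" entry in a comma-separated
# no_proxy list, also emit "domain". Other entries pass through unchanged.
expand_no_proxy() (
    # Subshell body, so the IFS and globbing changes stay local.
    IFS=,
    set -f
    out=
    for entry in $1; do
        out=${out:+$out,}$entry
        case $entry in
            .?*) out=$out,${entry#.} ;;   # add the non-prefixed form
        esac
    done
    printf '%s\n' "$out"
)
```

It could be applied with something like export no_proxy="$(expand_no_proxy "$no_proxy")". Note that a bare entry such as example.com may also exempt the host example.com itself, which is a slightly wider exception than .example.com alone.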

System/software information:
$ lsb_release -rd
Description: Ubuntu 18.04.3 LTS
Release: 18.04
$ apt-cache policy wget
wget:
  Installed: 1.19.4-1ubuntu2.2
  Candidate: 1.19.4-1ubuntu2.2
  Version table:
 *** 1.19.4-1ubuntu2.2 500
        500 http://div6mirrors.llan.ll.mit.edu:80/ubuntu bionic-updates/main amd64 Packages
        500 http://div6mirrors.llan.ll.mit.edu:80/ubuntu bionic-security/main amd64 Packages
        100 /var/lib/dpkg/status
     1.19.4-1ubuntu2 500
        500 http://div6mirrors.llan.ll.mit.edu:80/ubuntu bionic/main amd64 Packages

References:
* Upstream wget bug report:
  GNU Wget - Bugs: bug #53622 wget no_proxy leading dot on (sub)domains not working contradicting man page
  https://savannah.gnu.org/bugs/?53622

* Upstream commit reference that introduces the regression
  http://git.savannah.gnu.org/cgit/wget.git/commit/?id=55d25fc20c0141cb7cb8bd0a6964b81aa0b50124

* Upstream commit reference that introduces the fix
  http://git.savannah.gnu.org/cgit/wget.git/commit/?id=fd85ac9cc623847e9d94d9f9241ab34e2c146cbf

* Expected behavior of no_proxy as documented in the GNU Emacs manual: https://www.gnu.org/software/emacs/manual/html_node/url/Proxies.html

* Red Hat Bugzilla entry for this issue (Reported against RHEL8.1)
  Bug 1763702 - wget is ignoring no_proxy environment variable
  https://bugzilla.redhat.com/show_bug.cgi?id=1763702

* Red Hat Bugzilla entry tracking the (now released) errata package for RHEL8.1
  Bug 1772821 - wget is ignoring no_proxy environment variable [rhel-8.1.0.z]
  https://bugzilla.redhat.com/show_bug.cgi?id=1772821

In , fperalta (fperalta-redhat-bugs) wrote :

Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:

In , fperalta (fperalta-redhat-bugs) wrote :

Sorry, I hit enter too fast.

(In reply to Francisco Peralta from comment #0)
> Description of problem:

wget does not correctly use the no_proxy variable in RHEL 8

> Version-Release number of selected component (if applicable):

wget-1.19.5-7.el8_0.1.x86_64

> How reproducible:

Always.
I cannot reproduce the issue with a newer wget (1.20.3) or with the older wget from RHEL 7.

> Steps to Reproduce:
 1. $ export http_proxy=http://www.notexisting.com:8080
 2. $ export no_proxy=localhost,.redhat.com
 3. $ wget www.redhat.com

> Actual results:

--2019-10-21 13:09:46-- http://www.redhat.com/
Resolving www.nonexisting.com (www.nonexisting.com)... 192.249.111.222
Connecting to www.nonexisting.com (www.nonexisting.com)|192.249.111.222|:8080... ^C

> Expected results:

--2019-10-21 13:50:42-- http://www.redhat.com/
Resolving www.redhat.com (www.redhat.com)... 2a02:26f0:97:181::d44, 2a02:26f0:97:19d::d44, 23.2.233.53
Connecting to www.redhat.com (www.redhat.com)|2a02:26f0:97:181::d44|:80... failed: Network is unreachable.
Connecting to www.redhat.com (www.redhat.com)|2a02:26f0:97:19d::d44|:80... failed: Network is unreachable.
Connecting to www.redhat.com (www.redhat.com)|23.2.233.53|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.redhat.com/en [following]
--2019-10-21 13:50:42-- https://www.redhat.com/en
Connecting to www.redhat.com (www.redhat.com)|23.2.233.53|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’

index.html [ <=> ] 71.39K --.-KB/s in 0.04s

2019-10-21 13:50:42 (1.90 MB/s) - ‘index.html’ saved [73101]

Converting links in index.html... 20-131
Converted links in 1 files in 0.004 seconds.

> Additional info:

The issue must have been fixed upstream. I could not find the exact fix or the change that introduced the problem, but I think it is just a matter of updating the RHEL 8 version of wget.

In , thozza (thozza-redhat-bugs) wrote :

While writing a test for the issue, I found a corner case which is still not fixed upstream. I'll wait for their response first.

For more information, please see https://lists.gnu.org/archive/html/bug-wget/2019-11/msg00011.html

In , shawn (shawn-redhat-bugs-1) wrote :

The GNU Emacs manual covers this corner case in at least a little more detail. See https://www.gnu.org/software/emacs/manual/html_node/url/Proxies.html

As per your post, correlated with the Emacs manual, if you want the "host" mit.edu itself to bypass the proxy, you need to add it to no_proxy in addition to the dot-prefixed entry (so no_proxy=.mit.edu,mit.edu).
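To make the corner case concrete, here is a simplified model of the matching rule as I read it. It is an assumption for illustration only, not wget's actual code, and no_proxy_match is a made-up name: a dot-prefixed entry matches any host that ends with it, and a non-prefixed entry matches only that exact host. Real tools differ on the second rule.

```shell
#!/bin/sh
# Simplified model of no_proxy matching (illustrative assumption only).
# Usage: no_proxy_match HOST COMMA_SEPARATED_LIST
no_proxy_match() (
    # Subshell body, so the IFS and globbing changes stay local.
    host=$1
    IFS=,
    set -f
    for entry in $2; do
        case $entry in
            .*) # ".mit.edu" matches "www.mit.edu" but not "mit.edu"
                case $host in *"$entry") exit 0 ;; esac ;;
            *)  # "mit.edu" matches only the host "mit.edu" in this model
                if [ "$host" = "$entry" ]; then exit 0; fi ;;
        esac
    done
    exit 1
)
```

Under this model, no_proxy_match www.mit.edu .mit.edu succeeds and no_proxy_match mit.edu .mit.edu fails, which is why the bare host has to be listed as well: no_proxy_match mit.edu .mit.edu,mit.edu succeeds.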

I'd really like to see this bug fixed as our local proxy server actually denies requests to proxy internally, so without changing proxy vars (that we've been using for over a decade), wget to our internal websites just doesn't work.

In , shawn (shawn-redhat-bugs-1) wrote :

Just a few links for extra documentation.

This regression in wget appears to have been introduced by changes to src/host.c in commit 55d25fc20c0141cb7cb8bd0a6964b81aa0b50124 on 2018-01-07, and was released with wget 1.19.3.
http://git.savannah.gnu.org/cgit/wget.git/commit/?id=55d25fc20c0141cb7cb8bd0a6964b81aa0b50124

This was reported upstream in #53622 (https://savannah.gnu.org/bugs/?53622).

Although not acknowledged in the upstream issue tracker, this appears fixed in upstream commit fd85ac9cc623847e9d94d9f9241ab34e2c146cbf on 2018-10-25.
http://git.savannah.gnu.org/cgit/wget.git/commit/?id=fd85ac9cc623847e9d94d9f9241ab34e2c146cbf

According to git tags in the wget repo, 1.19.5 was released on 2018-05-06, so this fix wasn't included until the 1.20 release on 2018-11-13.

In , thozza (thozza-redhat-bugs) wrote :

The correct behavior of no_proxy in wget calls for a longer discussion, and since there is no standard, it is impossible to get it completely right. For this reason I decided to continue the discussion of possible changes upstream, but I will backport the current upstream wget behavior to RHEL-8 in order to solve this pressing customer issue as soon as possible.

In , fperalta (fperalta-redhat-bugs) wrote :

Thank you Tomás,
 Yes, I think the most important thing now is to make sure the behaviour of no_proxy is again consistent with previous versions.
 Discussions about which standard to follow can take place afterwards, and if the agreed behaviour eventually differs, it can be announced in the release notes in advance and in the proper way.

Kind Regards,
 Cisco.

Shawn K. O'Shea (b00gamonkey) wrote :

Just another data point. I searched Debian packages and there is not an associated release impacted by this issue.

Debian oldstable (stretch) ships wget 1.18 (pre-regression release). See https://packages.debian.org/stretch/wget
Debian stable (buster) ships wget 1.20 (regression fixed release). See https://packages.debian.org/buster/wget

summary: - wget does not honor dot-prefixed domains in no_proxy env variable
+ [upstream regression] wget does not honor dot-prefixed domains in
+ no_proxy env variable
Changed in wget (CentOS):
importance: Unknown → Critical
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in wget (Ubuntu):
status: New → Confirmed
Robert Varjasi (robert.varjasi) wrote :

It's fixed in https://launchpad.net/ubuntu/+source/wget/1.20.3-1ubuntu1. Can you backport this to Ubuntu bionic please?
