jech / polipo
The Polipo caching HTTP proxy
Home Page: http://www.pps.jussieu.fr/~jch/software/polipo/
License: MIT License
Hi @jech,
Polipo crashes in certain cases. I compiled it with -g and used gdb to trace the error: it segfaults in the function
httpServerContinueConditionHandler(int status, ConditionHandlerPtr chandler)
I modified the code in server.c as below:
static int
httpServerContinueConditionHandler(int status, ConditionHandlerPtr chandler)
{
HTTPConnectionPtr connection = *(HTTPConnectionPtr*)chandler->data;
if (connection)
{
if (!connection->request)
{
do_log(L_ERROR, "%s(%d): Null Request\n", __FUNCTION__, __LINE__);
return 0;
}
}
else
{
do_log(L_ERROR, "%s(%d): Null Connection\n", __FUNCTION__, __LINE__);
return 0;
}
if(connection->request->flags & REQUEST_WAIT_CONTINUE)
return 0;
httpServerDelayedDoSide(connection);
return 1;
}
With this change I can still reproduce the bug: connection->request is NULL, and the line "Null Request" is printed. However, the program still receives SIGABRT under gdb.
Could you help me with this? What should I do when a NULL request is found, as above?
I know that normally we should find out why the request is NULL rather than how to work around it. But this is really urgent: I need to fix it, or at least make sure polipo runs stably. I would like to read more of the polipo code, but it has few comments, which makes it really hard, especially the memory allocation part.
Thanks.
Here's a small patch that fixes some WIN32 annoyances:
diskcache: don't check for leading '/' on WIN32, else serving files from the webserver never works.
dns: don't even try to parse resolv.conf on WIN32
(There should be a way to find the dns server on WIN32, possibly with DnsQueryConfig, but this function looks scary...)
io: REUSEADDR has completely different semantics on WIN32, don't use it; for better security use SO_EXCLUSIVEADDRUSE
(See: http://msdn.microsoft.com/en-us/library/windows/desktop/ms740621%28v=vs.85%29.aspx)
util: try detaching from the console
In cmd if you run polipo with: start polipo.exe [args] the polipo console window will immediately disappear.
Same if you start it with a shortcut.
edit: patch removed
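The socket-option item in the list above can be sketched as follows. This is a minimal illustration, not the removed patch: the names `sock_t` and `set_listener_options` are hypothetical, and it assumes the listening socket has been created but not yet bound.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sock_t;   /* hypothetical alias, not a Polipo type */
#else
#include <sys/socket.h>
typedef int sock_t;
#endif

/* Set the address-reuse policy on a listening socket before bind().
   Returns 0 on success, non-zero on failure. */
int
set_listener_options(sock_t fd)
{
    int one = 1;
#ifdef _WIN32
    /* On Windows, SO_REUSEADDR would let another process bind the same
       port, so ask for exclusive use of the address instead. */
    return setsockopt(fd, SOL_SOCKET, SO_EXCLUSIVEADDRUSE,
                      (const char *)&one, sizeof(one));
#else
    /* On Unix, SO_REUSEADDR only allows a quick rebind after restart. */
    return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
#endif
}
```

The option has to be set before the socket is bound; setting it afterwards has no effect on the existing binding.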
From time to time this message appears in polipo.log:
Couldn't parse ETag.
It would help if the message were to include the URL and also the ETag, because if it's one of my sites I'd like to fix it.
Hello,
Based on my positive experience with WWWOFFLE, I'd like to suggest that polipo, when it is in offline mode and a request cannot be fulfilled from the cache, offer to download the page the next time it goes online. This was discussed previously in #31, but it is quite clear that either I don't understand Juliusz, or he doesn't understand me, or both. I find this an incredibly useful feature, so I'd like to discuss it further.
What I want polipo to do (and what WWWOFFLE did when I was still using it) is to display a page notifying the user that polipo is currently offline and that the requested URI is not present in the cache. The user is then presented with a link that instructs polipo to fetch that URI the next time it goes online. This makes "offline browsing" incredibly comfortable and was one of my favourite features of WWWOFFLE. I don't see how wget alone, as suggested by Juliusz, would be able to replace this functionality.
http://www.gedanken.org.uk/viewvc/wwwoffle/trunk/cache/html/en/messages/ConfirmRequest.html?revision=2124&view=markup is basically the markup that was doing this in WWWOFFLE.
Thank you for your consideration.
Regards
Rolf
I see this daily. Pulled 10/27.
Unsupported Cache-Control directive post-check -- ignored.
Unsupported Cache-Control directive pre-check -- ignored.
Unsupported Cache-Control directive post-check -- ignored.
Unsupported Cache-Control directive pre-check -- ignored.
Assertion failed: (i >= 0), function lockChunk, file object.c, line 318.
Abort trap: 6
$ cat ~/.polipo
proxyAddress = "127.0.0.1"
proxyPort = 8000
allowedClients = 127.0.0.1
allowedPorts = 1-65535
proxyName = "localhost"
cacheIsShared = false
# daemonize = true
socksParentProxy = "localhost:9888"
socksProxyType = socks5
# disableLocalInterface = true
# disableConfiguration = true
localDocumentRoot=/Users/tlc/polipo
scrubLogs = true
See http://bugs.debian.org/654660
Additionally, the warning message could be better.
I noticed a segfault while browsing some sites. It is reproducible on delicaclub.ru/forum (in random places).
It reproduces only on the mipsel build; I couldn't trigger it on x86-64.
Here is a backtrace:
[dmig@my-router polipo]$ sudo gdb --args polipo -c /opt/etc/polipo/config
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "mipsel-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /opt/sbin/polipo...done.
(gdb) run
Starting program: /opt/sbin/polipo -c /opt/etc/polipo/config
Program received signal SIGSEGV, Segmentation fault.
0x004363f0 in httpServerFinish (connection=0x4975e0, s=1, offset=0) at server.c:1266
1266 connection->server->time = current_time.tv_sec;
(gdb) bt
#0 0x004363f0 in httpServerFinish (connection=0x4975e0, s=1, offset=0) at server.c:1266
#1 0x004369e0 in httpServerDelayedFinishHandler (event=0x499638) at server.c:1342
#2 0x00406a18 in runTimeEventQueue () at event.c:492
#3 0x0040722c in eventLoop () at event.c:654
#4 0x0041dfb0 in main (argc=3, argv=0x7fff6d74) at main.c:165
Here is my config:
# Sample configuration file for Polipo. -*-sh-*-
# You should not need to use a configuration file; all configuration
# variables have reasonable defaults. If you want to use one, you
# can copy this to /etc/polipo/config or to ~/.polipo and modify.
# This file only contains some of the configuration variables; see the
# list given by ``polipo -v'' and the manual for more.
daemonise = true
pidFile = "/tmp/var/run/polipo.pid"
logFile = "/opt/var/log/polipo"
### Basic configuration
### *******************
# Uncomment one of these if you want to allow remote clients to
# connect:
# proxyAddress = "::0" # both IPv4 and IPv6
proxyAddress = "0.0.0.0" # IPv4 only
# If you do that, you'll want to restrict the set of hosts allowed to
# connect:
allowedClients = 127.0.0.1, 192.168.1.0/24
# allowedClients = "127.0.0.1, 134.157.168.0/24"
# Uncomment this if you want your Polipo to identify itself by
# something else than the host name:
# proxyName = "polipo.example.org"
# Uncomment this if there's only one user using this instance of Polipo:
# cacheIsShared = false
# Uncomment this if you want to use a parent proxy:
# parentProxy = "squid.example.org:3128"
# Uncomment this if you want to use a parent SOCKS proxy:
# socksParentProxy = "localhost:9050"
# socksProxyType = socks5
### Memory
### ******
# Uncomment this if you want Polipo to use a ridiculously small amount
# of memory (a hundred C-64 worth or so):
# chunkHighMark = 819200
# objectHighMark = 128
# Uncomment this if you've got plenty of memory:
# chunkHighMark = 50331648
# objectHighMark = 16384
### On-disk data
### ************
# Uncomment this if you want to disable the on-disk cache:
# diskCacheRoot = ""
# Uncomment this if you want to put the on-disk cache in a
# non-standard location:
diskCacheRoot = "/opt/var/cache/polipo"
# Uncomment this if you want to disable the local web server:
localDocumentRoot = ""
# Uncomment this if you want to enable the pages under /polipo/index?
# and /polipo/servers?. This is a serious privacy leak if your proxy
# is shared.
# disableIndexing = false
# disableServersList = false
diskCacheTruncateTime = 30d
diskCacheUnlinkTime = 90d
diskCacheTruncateSize = 256MB
### Domain Name System
### ******************
# Uncomment this if you want to contact IPv4 hosts only (and make DNS
# queries somewhat faster):
# dnsQueryIPv6 = no
# Uncomment this if you want Polipo to prefer IPv4 to IPv6 for
# double-stack hosts:
# dnsQueryIPv6 = reluctantly
# Uncomment this to disable Polipo's DNS resolver and use the system's
# default resolver instead. If you do that, Polipo will freeze during
# every DNS query:
# dnsUseGethostbyname = yes
### HTTP
### ****
# Uncomment this if you want to enable detection of proxy loops.
# This will cause your hostname (or whatever you put into proxyName
# above) to be included in every request:
# disableVia=false
# Uncomment this if you want to slightly reduce the amount of
# information that you leak about yourself:
# censoredHeaders = from, accept-language
# censorReferer = maybe
# Uncomment this if you're paranoid. This will break a lot of sites,
# though:
# censoredHeaders = set-cookie, cookie, cookie2, from, accept-language
# censorReferer = true
# Uncomment this if you want to use Poor Man's Multiplexing; increase
# the sizes if you're on a fast line. They should each amount to a few
# seconds' worth of transfer; if pmmSize is small, you'll want
# pmmFirstSize to be larger.
# Note that PMM is somewhat unreliable.
# pmmFirstSize = 16384
# pmmSize = 8192
# Uncomment this if your user-agent does something reasonable with
# Warning headers (most don't):
# relaxTransparency = maybe
# Uncomment this if you never want to revalidate instances for which
# data is available (this is not a good idea):
# relaxTransparency = yes
# Uncomment this if you have no network:
# proxyOffline = yes
# Uncomment this if you want to avoid revalidating instances with a
# Vary header (this is not a good idea):
# mindlesslyCacheVary = true
Hi,
I am using polipo compiled from the latest version and I am trying to block some advertising and tracking connections, but whatever I do, I don't see any "blocked" message or anything similar in the polipo log.
I have set the log level to 0xFF.
This happens for all the HTTPS connections that I am trying to block, and for many of the HTTP ones as well.
For example, if I add this line
example.com
to forbiddenTunnelsFile and go to that site, it doesn't get blocked and I see this message in the polipo log:
"tunnel example.com:443 allowed"
Am I missing something?
I have tried many formats. For example, for the google-analytics domain I tried google-analytics.com and ssl.google-analytics.com, but nothing works and I still get the message that the tunnel is allowed for a domain that I have supposedly blocked.
Hi,
Does or could polipo support a URL based proxy? Where you target: http://polipohost.tld:8123/http://urliwanttoget/
Basically the same as: https://github.com/zhuzhuor/urlproxy/
The use case is caching HTTPS repositories. Say you have a repository that is only accessible over HTTPS. You could write your configuration file to use http://polipohost.tld:8123/https://repohost/ and have it treated as a normal proxied request, getting the speed benefits without having to do SSL MiTM.
Thanks,
Teran
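The URL-in-path scheme requested above could be recognised with a small helper like the one below. This is only a sketch of the idea: `embedded_url` is a hypothetical name, not a Polipo function, and it assumes lower-case `http://` or `https://` schemes.

```c
#include <assert.h>
#include <string.h>

/* Given a request path such as "/https://repohost/x", return a pointer
   to the embedded absolute URL, or NULL if the path does not embed one.
   Hypothetical helper; only lower-case http:// and https:// are matched. */
const char *
embedded_url(const char *path)
{
    const char *u;

    assert(path != NULL);
    if(path[0] != '/')
        return NULL;
    u = path + 1;
    /* Require at least one character after the scheme prefix. */
    if(strncmp(u, "http://", 7) == 0 && u[7] != '\0')
        return u;
    if(strncmp(u, "https://", 8) == 0 && u[8] != '\0')
        return u;
    return NULL;
}
```

A proxy using this would fetch the returned URL upstream and cache the result under it, while ordinary local paths such as /polipo/index? fall through unchanged.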
The error handling in the local interface is buggy: it crashes on error. See
http://mid.gmane.org/[email protected]
http://seclists.org/fulldisclosure/2011/Oct/10
[util.c:85]: (style) Array index 'i' is used before limits check.
Source code is
while(string[i] != '\0' && i < n) {
Suggest checking the array index against the limit before using it.
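The fix amounts to swapping the two operands of `&&`, so that the limit is tested before the element is read. A minimal sketch, using a hypothetical helper `bounded_length` rather than the actual function in util.c:

```c
#include <assert.h>
#include <stddef.h>

/* Length of `string`, reading at most n bytes of it.
   Hypothetical helper used only to illustrate the operand order. */
size_t
bounded_length(const char *string, size_t n)
{
    size_t i = 0;

    assert(string != NULL);
    /* i < n is evaluated first; && short-circuits, so string[i] is
       only read when i is a valid index. */
    while(i < n && string[i] != '\0')
        i++;
    return i;
}
```

With the original order, a buffer of exactly n bytes with no terminator would be read one byte past its end before the limit check stopped the loop.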
Polipo's logging is ad-hoc, and not as useful as it should be. Note however that we don't want to allow logging of all requests, since that would make the life of fascist system administrators too easy.
There have been some reports of Polipo failing to properly serve objects larger than available memory, even in the absence of revalidation failures. (There's probably not much that can be done with uncachable objects without some major surgery to Polipo.)
Would you consider adding TCP Fast Open support?
There's a small and simple example patch here:
https://github.com/dtaht/ceropackages-3.10/blob/cf6fd6a01fdfbddb468f2a3ea27b6450955b674a/net/polipo/patches/001-server_tfo.patch
Hi, I think the following in http://www.pps.univ-paris-diderot.fr/~jch/software/polipo/tor.html should be changed to https://check.torproject.org/
Go to ipid.shat.net. It should show you an address that you've never heard about.
And in http://www.pps.univ-paris-diderot.fr/~jch/software/polipo/tor.html, change the following to https://torproject.org/
Using polipo 1.0.4 (debian stable) and I see that this bug is still around: http://sourceforge.net/p/polipo/mailman/message/4011605/
This issue is reproducible on delicaclub.ru/forum (in random places).
It reproduces only on the mipsel build; I couldn't trigger it on x86-64.
Here is a backtrace:
[dmig@my-router polipo]$ sudo gdb --args polipo -c /opt/etc/polipo/config
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "mipsel-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /opt/sbin/polipo...done.
(gdb) run
Starting program: /opt/sbin/polipo -c /opt/etc/polipo/config
[Inferior 1 (process 5876) exited normally]
(gdb) ^Z[1]+ Stopped sudo gdb --args polipo -c /opt/etc/polipo/config
[dmig@my-router polipo]$ ps | grep polipo
5874 admin 8224 2 gdb --args polipo -c /opt/etc/polipo/config
5879 admin 1404 S /opt/sbin/polipo -c /opt/etc/polipo/config
5881 dmig 884 R grep polipo
[dmig@my-router polipo]$ fg
sudo gdb --args polipo -c /opt/etc/polipo/config
(gdb) attach 5879
Attaching to program: /opt/sbin/polipo, process 5879
Reading symbols from /lib/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/libc.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/libc.so.0
Reading symbols from /lib/ld-uClibc.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-uClibc.so.0
0x77f624fc in poll () from /lib/libc.so.0
(gdb) continue
Continuing.
Program received signal SIGABRT, Aborted.
0x77fae964 in raise () from /lib/libc.so.0
(gdb) bt
#0 0x77fae964 in raise () from /lib/libc.so.0
#1 0x77fa8064 in abort () from /lib/libc.so.0
#2 0x77f65888 in __assert () from /lib/libc.so.0
#3 0x00435bd4 in httpServerFinish (connection=0x4945f0, s=1, offset=0) at server.c:1171
#4 0x004369e0 in httpServerDelayedFinishHandler (event=0x495d08) at server.c:1342
#5 0x00406a18 in runTimeEventQueue () at event.c:492
#6 0x0040722c in eventLoop () at event.c:654
#7 0x0041dfb0 in main (argc=3, argv=0x7fff6d74) at main.c:165
(gdb)
See #22 for polipo config file.
I found this caching proxy while searching for alternatives to Squid. I installed it and everything looked perfect until I noticed there's no way to rewrite the store URL!
For example, I would like to store Steam game updates. The URLs look like:
http://valve#.cs.steampowered.com/depot/..........
They should be rewritten so that there are no duplicate files!
This already works in Squid 3.4+, but it's way too complicated to run and it doesn't like Cygwin at all!
Please implement this! 👍
I am trying to set up Polipo to cache everything that it ever serves, and to cache it forever. However, Polipo seems to respect server-side headers. Is it possible to force it to ignore them?
According to the specs, a POST with no payload may be sent without a Content-Length header (i.e. the length defaults to 0).
The latest version of UCBrowser for Android now does this.
POSTs without data are often used to ensure that links are not actuated maliciously by inline code.
For example, the forum voting here: http://forums.theregister.co.uk/forum/3/2014/07/29/feature_spectral_approach_to_scottish_independence/
Using Polipo, the above now fails with a lack of content-length warning.
Simple fix:
--- client.c.orig 2014-05-14 23:19:43.000000000 +0100
+++ client.c 2014-07-31 22:23:04.000000000 +0100
@@ -738,7 +738,7 @@
connection->reqbegin = i;
if(body_len < 0) {
- if(request->method == METHOD_GET || request->method == METHOD_HEAD)
+ if(request->method == METHOD_GET || request->method == METHOD_HEAD || request->method == METHOD_POST)
body_len = 0;
}
connection->bodylen = body_len;
Ref: http://tools.ietf.org:80/html/draft-ietf-httpbis-p1-messaging-20
Cheers, Jamie
When I enable authCredentials on my proxy, I can't visit HTTPS websites. HTTP websites have no such problem.
I run polipo on an OpenWrt router. When I cut the Internet connection on the router's WAN port while downloading a big file via polipo, the file can never be downloaded afterwards unless I delete the local cache.
I added some debug messages to see what happens and reviewed the code again and again. I found that polipo applies the httpTimeout check in this case, but only the server side times out; the client connection is left untouched. @jech, do you have any suggestions?
What I am thinking is that when the server times out, we should shut down the client fd and destroy the object, but currently I can't find the client's fd when the server fd times out.
Hello. I'm running polipo with 20000+ active users. Polipo is allowed only to access certain IP addresses, all other addresses are closed with TCP Reset (this is done with iptables).
The number of used file descriptors increases over time and never decreases.
lsof -n | grep polipo | grep sock
polipo 5646 proxy 661u sock 0,6 0t0 3066108598 can't identify protocol
polipo 5646 proxy 663u sock 0,6 0t0 3066489338 can't identify protocol
polipo 5646 proxy 666u sock 0,6 0t0 3064968353 can't identify protocol
polipo 5646 proxy 667u sock 0,6 0t0 3064968355 can't identify protocol
polipo 5646 proxy 670u sock 0,6 0t0 3063258367 can't identify protocol
polipo 5646 proxy 675u sock 0,6 0t0 3064970332 can't identify protocol
…
After some time, polipo reaches the open file descriptor limit (8192 on my setup) and fails to open any new connections.
If the DNS transaction ID has a value such as 0xf123, it is read as a signed int with the value 0xfffff123.
As a result, the findQuery() function fails very often.
--- dns.old.c 2015-02-02 20:33:34.507287561 +0900
+++ dns.c 2015-02-02 20:30:08.879290658 +0900
@@ -1373,14 +1373,14 @@ stringToLabels(char *buf, int offset, in
}
#ifdef UNALIGNED_ACCESS
-#define DO_NTOHS(_d, _s) _d = ntohs(*(short*)(_s));
+#define DO_NTOHS(_d, _s) _d = ntohs(*(unsigned short*)(_s));
#define DO_NTOHL(_d, _s) _d = ntohl(*(unsigned*)(_s))
-#define DO_HTONS(_d, _s) *(short*)(_d) = htons(_s);
+#define DO_HTONS(_d, _s) *(unsigned short*)(_d) = htons(_s);
#define DO_HTONL(_d, _s) *(unsigned*)(_d) = htonl(_s)
#else
#define DO_NTOHS(_d, _s) \
-do { short _dd; \
-memcpy(&(_dd), (_s), sizeof(short)); \
+do { unsigned short _dd; \
+memcpy(&(_dd), (_s), sizeof(unsigned short)); \
_d = ntohs(_dd); } while(0)
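The sign-extension effect described in this report can be reproduced in isolation. The sketch below does not use Polipo's macros; the function names are hypothetical, and it assumes a 16-bit two's-complement `short` and a 32-bit `int`, as on common platforms.

```c
/* Decode two bytes in network (big-endian) order as an unsigned
   16-bit value, independent of host byte order. */
unsigned
decode_u16(const unsigned char *p)
{
    return ((unsigned)p[0] << 8) | (unsigned)p[1];
}

/* Pass a 16-bit value through a signed short before widening to int.
   Values with the top bit set come out negative (sign-extended). */
int
widen_signed(unsigned value16)
{
    short s = (short)value16;
    return s;
}

/* Pass the same value through an unsigned short instead.
   The upper bits of the int stay zero. */
int
widen_unsigned(unsigned value16)
{
    unsigned short u = (unsigned short)value16;
    return u;
}
```

Under those assumptions, `widen_signed(0xf123)` yields an int whose bit pattern is 0xfffff123, which no longer compares equal to a stored ID of 0xf123, while `widen_unsigned(0xf123)` preserves the value.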
The presence of an Age header field implies that the response was not generated or validated by the origin server for this request.
However, Polipo generates an Age header even for responses that were just obtained from the origin server, if the delays accumulate to at least 1 second. For example:
$ curl -si --proxy http://localhost:8123 httpbin.org/delay/5 | head
HTTP/1.1 200 OK
Content-Length: 262
Date: Sat, 30 Jan 2016 13:09:00 GMT
Via: 1.1 polipo
Server: nginx
Content-Type: application/json
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Age: 5
Connection: keep-alive
Fixing this would probably make the spurious Warning headers easy to fix as well.
when I visit http://www.airasia.com/jp/ja/home.page through polipo the banner rotation in the middle isn't served. Can you reproduce this?
I don't think I've set anything locally that could trigger this behaviour. The forbidden file is essentially empty (comments only).
$ cat polipo/config
logSyslog = true
logFile = /var/log/polipo/polipo.log
allowedClients = 127.0.0.1, 172.0.0/10
logLevel = 0xFF
disableIndexing = false
disableServersList = false
dnsQueryIPv6 = no
dnsUseGethostbyname = happily
dnsNegativeTtl = 30
Go to the website http://www.511.org
Observe a segmentation fault. I am on Debian wheezy.
~/develop/polipo/polipo$ sudo gdb --args ./polipo -c /etc/polipo/config
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /media/videos/develop/polipo/polipo/polipo...done.
(gdb) run
Starting program: /media/videos/develop/polipo/polipo/polipo -c /etc/polipo/config
Program received signal SIGSEGV, Segmentation fault.
httpServerReplyHandler (status=-65540, event=0x63a460, srequest=0x63a480) at server.c:1793
1793 assert(request->object->flags & OBJECT_INPROGRESS);
(gdb) bt
#0 httpServerReplyHandler (status=-65540, event=0x63a460, srequest=0x63a480) at server.c:1793
#1 0x0000000000404f1c in do_scheduled_stream (status=<optimized out>, event=0x63a460) at io.c:282
#2 0x00000000004044e5 in pokeFdEventHandler (tevent=<optimized out>) at event.c:569
#3 0x0000000000404561 in runTimeEventQueue () at event.c:492
#4 0x0000000000404725 in eventLoop () at event.c:654
#5 0x0000000000402aa1 in main (argc=3, argv=<optimized out>) at main.c:165
The code for generating HTTP Warning headers is dodgy, and probably provides incorrect data in some cases.
The status code is 404 but should rather be 403.
diff --git a/tunnel.c b/tunnel.c
index e1e1e62..9efb656 100644
--- a/tunnel.c
+++ b/tunnel.c
@@ -174,7 +174,7 @@ do_tunnel(int fd, char *buf, int offset, int len, AtomPtr url)
if (tunnelIsMatched(url->string, url->length,
tunnel->hostname->string, tunnel->hostname->length)) {
releaseAtom(url);
- tunnelError(tunnel, 404, internAtom("Forbidden tunnel"));
+ tunnelError(tunnel, 403, internAtom("Forbidden tunnel"));
logTunnel(tunnel,1);
return;
}
While not optimal from a security perspective, it is moderately common for HTTP clients to send URLs that contain usernames and passwords, such as http://name:password@example.com/somepath/. Polipo parses these URLs assuming that the colon is always a separator between host and port, as in http://example.com:8080/somepath. Depending on the content of the request, this can result in several different errors: a hostname lookup error using the password as a hostname, a forbidden port error (if the password begins with a number), and probably others. It would be good if polipo were able to parse URLs containing usernames and passwords.
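One way to handle such URLs is to split off everything up to the last `@` in the authority component before looking for the host/port colon. The sketch below illustrates that approach; `struct authority` and `split_authority` are hypothetical names, not Polipo code, and IPv6 literals in brackets are deliberately not handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type, not a Polipo structure. */
struct authority {
    char userinfo[128];   /* "name:password", empty if absent */
    char host[256];
    int port;             /* -1 if no port was given */
};

/* Split the authority part of a URL (the text between "//" and the
   next "/") into userinfo, host and port.
   Returns 0 on success, -1 on malformed or oversized input. */
int
split_authority(const char *auth, struct authority *out)
{
    const char *at, *hostport, *colon;
    size_t ulen, hlen;

    assert(auth != NULL && out != NULL);

    /* Everything before the last '@' is userinfo, colons included. */
    at = strrchr(auth, '@');
    hostport = at ? at + 1 : auth;
    ulen = at ? (size_t)(at - auth) : 0;
    if(ulen >= sizeof(out->userinfo))
        return -1;
    memcpy(out->userinfo, auth, ulen);
    out->userinfo[ulen] = '\0';

    /* Only a colon after the '@' separates host from port. */
    colon = strchr(hostport, ':');
    hlen = colon ? (size_t)(colon - hostport) : strlen(hostport);
    if(hlen == 0 || hlen >= sizeof(out->host))
        return -1;
    memcpy(out->host, hostport, hlen);
    out->host[hlen] = '\0';

    out->port = -1;
    if(colon) {
        char *end;
        long p;
        if(colon[1] < '0' || colon[1] > '9')
            return -1;
        p = strtol(colon + 1, &end, 10);
        if(*end != '\0' || p < 1 || p > 65535)
            return -1;
        out->port = (int)p;
    }
    return 0;
}
```

Whether the proxy should then forward the credentials as an Authorization header or simply drop them is a separate policy decision that this sketch leaves open.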
Polipo generates Warning headers like this:
Warning: 110 polipo:8123 Object is stale
This does not match the Warning production in RFC 7234 § 5.5, in that the double quotes around the warn-text are missing.
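A header that follows the grammar puts the warn-text in double quotes after the warn-code and warn-agent. The sketch below shows one way to format it; `format_warning` is a hypothetical helper, not Polipo's code, and it rejects text containing a double quote rather than implementing quoted-pair escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a header of the form: Warning: 110 agent "text"
   Returns the number of characters written, or -1 if the text
   contains a double quote or the result does not fit in buf. */
int
format_warning(char *buf, size_t size, int code,
               const char *agent, const char *text)
{
    int n;

    assert(buf != NULL && agent != NULL && text != NULL);
    /* A '"' inside warn-text would need escaping; refuse instead. */
    if(strchr(text, '"') != NULL)
        return -1;
    n = snprintf(buf, size, "Warning: %03d %s \"%s\"", code, agent, text);
    if(n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

For the example above, calling it with 110, "polipo:8123" and "Object is stale" produces `Warning: 110 polipo:8123 "Object is stale"`.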
I've started seeing the following error with SourceForge.net pages since switching to polipo yesterday:
> Error: CSS did not load.
This may happen on the first request due to CSS mimetype issues. Try clearing your browser cache and refreshing.
The log (with logLevel=0xff):
Vary header present (Accept-Encoding, User-Agent).
Uncacheable object http://sourceforge.net/p/tmux/tmux-code/ci/master/tree/ (2562)
Superseding object http://sourceforge.net/p/tmux/tmux-code/ci/master/tree/ (200 22492 -1 (none) -> 200 22487 -1 (none))
Superseding object http://fonts.googleapis.com/css?family=Ubuntu:regular (200 192 -1 (none) -> 200 192 -1 (none))
Vary header present (Accept-Encoding).
Superseding object http://consent-st.truste.com/get?name=notice.js&domain=slashdot.org&c=teconsent&text=true (200 4433 -1 (none) -> 200 -1 -1 (none))
Vary header present (Accept-Encoding).
Superseding object http://www.google-analytics.com/ga.js (200 16063 1412297322 (none) -> 200 16063 1412297322 (none))
Vary header present (Accept-Encoding).
Superseding object http://consent-st.truste.com/get?name=cmapi.module.js (200 5415 -1 (none) -> 200 -1 -1 (none))
Unsupported Cache-Control directive post-check -- ignored.
Unsupported Cache-Control directive pre-check -- ignored.
Vary header present (Accept-Encoding).
Uncacheable object http://consent.truste.com/notice?js=1&name=notice.js&domain=slashdot.org&c=teconsent&text=true (606)
Superseding object http://consent.truste.com/notice?js=1&name=notice.js&domain=slashdot.org&c=teconsent&text=true (200 1128 -1 (none) -> 200 1129 -1 (none))
Uncacheable object http://www.google-analytics.com/… (138)
Uncacheable object http://www.google-analytics.com/… (138)
Vary header present (Accept-Encoding).
Superseding object http://consent-st.truste.com/get?name=notice2.js (200 6266 -1 (none) -> 200 -1 -1 (none))
Unsupported Cache-Control directive post-check -- ignored.
Unsupported Cache-Control directive pre-check -- ignored.
Uncacheable object http://consent.truste.com/noticemsg?action=returns&domain=slashdot.org&behavior=expressed&country=de&language=en&rand=0.8105680911821955 (94)
Uncacheable object http://sourceforge.net/log/webtracker/?event_id=…&project=tmux&action_type=git&url=http%3A%2F%2Fsourceforge.net%2Fp%2Ftmux%2Ftmux-code%2Fci%2Fmaster%2Ftree%2F&referer= (2)
My config:
logFile = /var/log/polipo/polipo.log
diskCacheRoot = /var/spool/polipo
proxyPort = 3128
logLevel = 0xFF
disableIndexing = false
disableServersList = false
# max. open file handles
maxDiskEntries = 64
I've tried adding these settings (and restarted polipo), but without luck:
dontTrustVaryETag = true
dontCacheCookies = true
dontCacheRedirects = true
Restarting polipo, purging its disk cache (polipo -x) and forcefully reloading does not help (in Firefox). With Google Chrome, however, forcefully reloading helps and fixes the cache?!
This may be related to the order in which the browsers fetch the resources, and how polipo handles them (regarding pipelining etc).
Wiping polipo's diskcache altogether (after stopping it) helps.
It appears to be related to, or caused by, a JavaScript error: some resource is gzipped, but the headers do not say so:
% wget http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js -S
--2014-11-06 17:42:14-- http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js
Resolving localhost (localhost)... 127.0.0.1
Connecting to localhost (localhost)|127.0.0.1|:3128... connected.
Proxy request sent, awaiting response...
HTTP/1.1 200 OK
Content-Length: 706
ETag: "1349976945.13-2193"
Date: Thu, 06 Nov 2014 16:39:51 GMT
Last-Modified: Thu, 11 Oct 2012 17:35:45 GMT
Expires: Wed, 04 Nov 2015 21:59:31 GMT
Cache-Control: public, max-age=31382380
Content-Type: application/x-javascript
Age: 143
Connection: keep-alive
Length: 706 [application/x-javascript]
Saving to: ‘header.js.1’
2014-11-06 17:42:14 (42,9 MB/s) - ‘header.js.1’ saved [706/706]
% http_proxy= wget http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js -S
--2014-11-06 17:42:36-- http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js
Resolving a.fsdn.com (a.fsdn.com)... 23.63.115.172
Connecting to a.fsdn.com (a.fsdn.com)|23.63.115.172|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: nginx
Content-Type: application/x-javascript
Access-Control-Allow-Origin: *
ETag: "1360348568.16-2193"
Last-Modified: Fri, 08 Feb 2013 18:36:08 GMT
X-Frame-Options: SAMEORIGIN
Cache-Control: public, max-age=31382270
Expires: Wed, 04 Nov 2015 22:00:26 GMT
Date: Thu, 06 Nov 2014 16:42:36 GMT
Content-Length: 2193
Connection: keep-alive
Length: 2193 (2,1K) [application/x-javascript]
Saving to: ‘header.js.2’
% file header.js.*
header.js.1: gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT)
header.js.2: ASCII text
The file in polipo's disk cache (a.fsdn.com/+pdEifTwTC6+549yOVD+3g==) looks as follows:
HTTP/1.1 200 OK
Content-Length: 706
ETag: "1349976945.13-2193"
Date: Thu, 06 Nov 2014 16:39:51 GMT
Last-Modified: Thu, 11 Oct 2012 17:35:45 GMT
Expires: Wed, 04 Nov 2015 21:59:31 GMT
Cache-Control: public, max-age=31382380
Content-Type: application/x-javascript
X-Polipo-Location: http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js
X-Polipo-Access: Thu, 06 Nov 2014 16:42:14 GMT
^_<8B>^H^@^@^@^@^@^@^@<9D…
See issue #43 for more investigation into this particular resource.
I am using polipo from Git (@e40fbbe7).
Could this be caused by pipelining or something similar, where polipo receives the content gzipped although it is not gzipped when requested alone?!
This is defined in RFC 5861, and should be fairly easy to implement in Polipo, since all the infrastructure is already there.
Note that there is a patch floating around the network that claims to do that[1], but, contrary to what the commit message claims, it doesn't do anything useful (no, Virginia, Polipo doesn't treat unknown Cache-Control directives as no-cache).
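RFC 5861 defines two Cache-Control extensions, stale-while-revalidate and stale-if-error; the report above does not say which one is meant. Assuming it is stale-while-revalidate, the core decision can be sketched as below. The enum and function names are hypothetical and all times are in seconds.

```c
#include <assert.h>

/* Hypothetical names; not part of Polipo. */
enum swr_action {
    SWR_SERVE_FRESH,                 /* within max-age */
    SWR_SERVE_STALE_AND_REVALIDATE,  /* stale, but inside the window */
    SWR_REVALIDATE_FIRST             /* too stale: validate before serving */
};

/* Decide how to answer from cache given the object's current age,
   its max-age, and the stale-while-revalidate window
   (0 if the directive is absent). */
enum swr_action
swr_decide(long age, long max_age, long stale_while_revalidate)
{
    assert(age >= 0 && max_age >= 0 && stale_while_revalidate >= 0);

    if(age < max_age)
        return SWR_SERVE_FRESH;
    /* Staleness is how far the object is past its freshness lifetime. */
    if(age - max_age < stale_while_revalidate)
        return SWR_SERVE_STALE_AND_REVALIDATE;
    return SWR_REVALIDATE_FIRST;
}
```

With `max-age=600, stale-while-revalidate=30`, an object aged 610 seconds would be served immediately while a background revalidation is started, and one aged 700 seconds would be revalidated before being served.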
As stated in the subject, authentication (variable authCredentials="user:password") doesn't work at all when I connect to https URLs; the login prompt appears only when I connect to plain http websites.
Thanks for this nice tool!
After replacing my local Squid instance with polipo, I am missing the "X-Cache" and "X-Cache-Lookup" headers that Squid provides.
I find these useful in general when looking at response headers, and when debugging cache issues in particular.
See also http://blog.lyte.id.au/2014/08/28/x-cache-and-x-cache-lookupheaders/
With polipo as $http_proxy, cloning the following Git repo fails:
% git clone http://git.savannah.gnu.org/r/gnulib.git
Cloning into 'gnulib'...
remote: Counting objects: 160407, done.
remote: Compressing objects: 100% (24057/24057), done.
error: RPC failed; result=18, HTTP code = 200 MiB | 3.47 MiB/s
fatal: The remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
The polipo log contains:
Established listening socket on port 3128.
Parsed URL http://git.savannah.gnu.org/r/gnulib.git/info/refs?service=git-upload-pack: rc=0, x=7, y=27
Uncacheable object http://git.savannah.gnu.org/r/gnulib.git/info/refs?service=git-upload-pack (66)
Parsed URL http://git.savannah.gnu.org/r/gnulib.git/git-upload-pack: rc=0, x=7, y=27
Uncacheable object http://git.savannah.gnu.org/r/gnulib.git/git-upload-pack (66)
Short on chunk memory -- attempting to punch holes in the middle of objects.
Short on chunk memory -- attempting to punch holes in the middle of objects.
Short on chunk memory -- attempting to punch holes in the middle of objects.
Short on chunk memory -- attempting to punch holes in the middle of objects.
Short on chunk memory -- attempting to punch holes in the middle of objects.
Short on chunk memory -- attempting to punch holes in the middle of objects.
Read from server failed: Cannot allocate memory
The current pid file handling is not correct: if the pid file exists, it prevents the daemon from starting, without checking whether a daemon is actually running.
Here is a patch correcting this issue:
diff --git a/util.c b/util.c
index 8fc30e5..ddeb770 100644
--- a/util.c
+++ b/util.c
@@ -591,12 +591,26 @@ writePid(char *pidfile)
int fd, n, rc;
char buf[16];
- fd = open(pidfile, O_WRONLY | O_CREAT | O_EXCL, 0666);
+ fd = open(pidfile, O_WRONLY | O_CREAT, 0644);
if(fd < 0) {
do_log_error(L_ERROR, errno,
"Couldn't create pid file %s", pidfile);
exit(1);
}
+ rc = lockf(fd, F_TLOCK, 0);
+ if(rc) {
+ n = errno;
+ if(EACCES == n || EAGAIN == n) {
+ do_log_error(L_ERROR, 0,
+ "Another instance is already running");
+ exit(1);
+ } else {
+ do_log_error(L_ERROR, n,
+ "Couldn't create pid file %s", pidfile);
+ exit(1);
+ }
+ }
+
n = snprintf(buf, 16, "%ld\n", (long)getpid());
if(n < 0 || n >= 16) {
close(fd);
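The locking idea in the patch can be shown as a standalone sketch. This is not the patch itself: `acquire_pidfile` is a hypothetical name, the return codes are invented for illustration, and a POSIX system with `lockf` is assumed.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Open (or create) the pid file, lock it, and write our pid into it.
   Returns the locked descriptor on success; the caller keeps it open
   for the lifetime of the daemon. Returns -1 if another process holds
   the lock, -2 on any other error. */
int
acquire_pidfile(const char *path)
{
    char buf[32];
    int fd, n;

    fd = open(path, O_WRONLY | O_CREAT, 0644);
    if(fd < 0)
        return -2;

    /* Non-blocking lock attempt: fails at once if the file is locked. */
    if(lockf(fd, F_TLOCK, 0) < 0) {
        int e = errno;
        close(fd);
        return (e == EACCES || e == EAGAIN) ? -1 : -2;
    }

    /* We hold the lock, so any old contents are stale: replace them. */
    n = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
    if(n < 0 || (size_t)n >= sizeof(buf) ||
       ftruncate(fd, 0) < 0 || write(fd, buf, (size_t)n) != n) {
        close(fd);
        return -2;
    }
    return fd;
}
```

Because the lock disappears when the process exits, a pid file left behind by a crash no longer blocks startup, while a second instance started alongside a running one gets -1.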
I have a 100% reproducible test case on Debian wheezy.
Steps to reproduce the issue:
1. Navigate to www.newegg.com
2. Polipo crashes immediately
Here is the stack trace. I tried with the latest master as well.
root@bhavesh:/media/develop/polipo/polipo# gdb --args polipo -c /etc/polipo/config
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /media/videos/develop/polipo/polipo/polipo...done.
(gdb) run
Starting program: /media/videos/develop/polipo/polipo/polipo -c /etc/polipo/config
polipo: event.c:517: findEvent: Assertion `!(revents & 0x020)' failed.
Program received signal SIGABRT, Aborted.
0x00007ffff7a83165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff7a83165 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff7a863e0 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007ffff7a7c311 in __assert_fail ()
from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00000000004048fc in findEvent (events=0x63cc70, revents=32)
at event.c:517
#4 eventLoop () at event.c:710
#5 0x0000000000402aa1 in main (argc=3, argv=)
at main.c:165
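For context on the failed assertion: in Linux's `<poll.h>`, the bit 0x020 (the `revents=32` seen in frame #3) is `POLLNVAL`, which `poll()` reports for a descriptor that is not open. Assuming polipo's event loop is built on `poll()`, the crash would mean a closed descriptor was still registered in the poll set. The sketch below only demonstrates that kernel behaviour; the helper names are hypothetical and it makes no claim about where polipo closes the descriptor.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper for illustration: poll one descriptor without
   blocking and return its revents mask, or -1 if poll() itself fails. */
int
poll_revents(int fd)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if(poll(&pfd, 1, 0) < 0)
        return -1;
    return pfd.revents;
}

/* Returns the revents reported for a descriptor number that was closed
   but is still handed to poll(); POLLNVAL is expected to be set. */
int
revents_for_closed_fd(void)
{
    int fds[2];
    int revents;

    if(pipe(fds) < 0)
        return -1;
    close(fds[0]);                  /* fds[0] is now a stale number */
    revents = poll_revents(fds[0]);
    close(fds[1]);
    return revents;
}
```

A check such as `revents_for_closed_fd() & POLLNVAL` evaluates to non-zero, matching the condition that the assertion in `findEvent` rejects.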
In some cases (a broken or misconfigured server, or for testing purposes) it is useful to force caching regardless of the HTTP headers. This is clearly non-standard, but useful nonetheless.
Filtering by MIME type or server URL pattern would be a great bonus.
An example:
The user agent sends a request (with keep-alive).
The server replies:
Cache-Control: no-cache, no-store, must-revalidate
Etag: <buggy-etag>
Expires: <in the past>
Polipo does not cache such responses (tested with relaxTransparency = true, mindlesslyCacheVary = true, cacheIsShared = false).
I have a feature request: it would be very useful if polipo had a configuration setting to preserve the Host header.
It should work like mod_proxy's ProxyPreserveHost On|Off:
http://httpd.apache.org/docs/2.2/mod/mod_proxy.html#proxypreservehost
http://www.c-and-a.com/fr/fr/shop/femme/looks-tendances/coton-bio/toute-la-collection
==5475== Memcheck, a memory error detector
==5475== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==5475== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==5475== Command: /home/ghost/local/src/polipo/polipo -c /home/ghost/local/src/polipo-test/polipo.conf forbiddenFile=/home/ghost/local/src/polipo-test/forbidden forbiddenTunnelsFile=Tunnels
==5475==
==5475== Invalid read of size 8
==5475== at 0x428A79: httpServerDirectHandlerCommon (server.c:2597)
==5475== by 0x428F06: httpServerDirectHandler2 (server.c:2681)
==5475== by 0x406E9F: do_scheduled_stream (io.c:245)
==5475== by 0x405E71: pokeFdEventHandler (event.c:569)
==5475== by 0x405C12: runTimeEventQueue (event.c:492)
==5475== by 0x4060BA: eventLoop (event.c:654)
==5475== by 0x4151CF: main (main.c:167)
==5475== Address 0x52574c8 is 24 bytes inside a block of size 120 free'd
==5475== at 0x4C27D4E: free (vg_replace_malloc.c:427)
==5475== by 0x424FA4: httpServerFinish (server.c:1315)
==5475== by 0x4254E2: httpServerRestart (server.c:1461)
==5475== by 0x42622F: httpServerHandler (server.c:1742)
==5475== by 0x406E9F: do_scheduled_stream (io.c:245)
==5475== by 0x405E71: pokeFdEventHandler (event.c:569)
==5475== by 0x405C12: runTimeEventQueue (event.c:492)
==5475== by 0x4060BA: eventLoop (event.c:654)
==5475== by 0x4151CF: main (main.c:167)
==5475==
==5475==
==5475== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
==5475== starting debugger with cmd: /usr/bin/gdb -nw /proc/5484/fd/1024 5484
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /proc/5484/fd/1024...done.
Attaching to program: /proc/5484/fd/1024, process 5484
Reading symbols from /usr/lib/valgrind/vgpreload_core-amd64-linux.so...Reading symbols from /usr/lib/debug/usr/lib/valgrind/vgpreload_core-amd64-linux.so...done.
done.
Loaded symbols for /usr/lib/valgrind/vgpreload_core-amd64-linux.so
Reading symbols from /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so...Reading symbols from /usr/lib/debug/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so...done.
done.
Loaded symbols for /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so
Reading symbols from /lib/x86_64-linux-gnu/libc.so.6...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libc-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib/x86_64-linux-gnu/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib/x86_64-linux-gnu/libnss_files-2.13.so...done.
done.
Loaded symbols for /lib/x86_64-linux-gnu/libnss_files.so.2
Failed to read a valid object file image from memory.
0x0000000000428a79 in httpServerDirectHandlerCommon (kind=2, status=-65540, event=0x52d6c40, srequest=0x52d6c60)
at server.c:2597
2597 HTTPRequestPtr request = connection->request;
(gdb) bt
#0 0x0000000000428a79 in httpServerDirectHandlerCommon (kind=2, status=-65540, event=0x52d6c40, srequest=0x52d6c60)
at server.c:2597
#1 0x0000000000428f07 in httpServerDirectHandler2 (status=-65540, event=0x52d6c40, srequest=0x52d6c60)
at server.c:2681
#2 0x0000000000406ea0 in do_scheduled_stream (status=-65540, event=0x52d6c40) at io.c:245
#3 0x0000000000405e72 in pokeFdEventHandler (tevent=0x52d6e20) at event.c:569
#4 0x0000000000405c13 in runTimeEventQueue () at event.c:492
#5 0x00000000004060bb in eventLoop () at event.c:654
#6 0x00000000004151d0 in main (argc=5, argv=0x7ff000238) at main.c:167
(gdb) p *connection
$1 = {flags = 0, fd = -1, buf = 0x0, len = 0, offset = 932, request = 0x0, request_last = 0x0, serviced = 4,
version = 1, time = 1444809708, timeout = 0x0, te = 0, reqbuf = 0x0, reqlen = 27044, reqbegin = 0, reqoffset = 0,
bodylen = -1, reqte = 0, chunk_remaining = -1, server = 0x52571c0, pipelined = 0, connecting = 0}
(gdb) p sizeof(*connection)
$2 = 120
I've noticed that Polipo fails to parse the URL from Python's urllib.urlretrieve (Python 2.7.9):
% python2 -c 'import urllib; r = urllib.urlretrieve("https://www.example.com/"); print(file(r[0]).read())'
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html><head>
<title>Proxy error: 400 Couldn't parse URL.</title>
</head><body>
<h1>400 Couldn't parse URL</h1>
<p>The following error occurred while trying to access <strong>https://www.example.com/</strong>:<br><br>
<strong>400 Couldn't parse URL</strong></p>
<hr>Generated Tue, 03 Mar 2015 13:11:53 CET by Polipo on <em>localhost:3128</em>.
</body></html>
This appears to happen for any https URL, and is probably caused by Python using CONNECT improperly or not at all.
The relevant Python bug is http://bugs.python.org/issue1424152; there is a patch for 2.7, which has not been applied yet.
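For reference, a client that tunnels HTTPS through an HTTP proxy normally opens the tunnel with a CONNECT request naming `host:port`, and only then starts TLS. The sketch below formats such a request; `format_connect_request` is a hypothetical helper written for illustration, not something taken from polipo or Python.

```c
#include <stdio.h>

/* Hypothetical helper: format the CONNECT request a client sends to an
   HTTP proxy before starting TLS on the same connection.
   Returns the number of bytes written (excluding the terminating NUL),
   or -1 if the buffer is too small. */
int
format_connect_request(char *buf, size_t size, const char *host, int port)
{
    int n = snprintf(buf, size,
                     "CONNECT %s:%d HTTP/1.1\r\n"
                     "Host: %s:%d\r\n"
                     "\r\n",
                     host, port, host, port);
    if(n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

For `www.example.com` and port 443 this yields a request line of `CONNECT www.example.com:443 HTTP/1.1`. The error page above shows the full `https://` URL reaching the proxy instead, which fits the theory that no CONNECT was sent.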
Hi, I am successfully using Polipo on Haiku (the OS).
Polipo works very well for me, except for one thing.
In my browser (Qupzilla) I do Google searches (Google is set as my default search engine) by simply typing keywords in the address bar, as one can in every browser. For example, typing "foo" in the address bar opens a Google search results page for "foo".
However, when I use Polipo as the HTTP proxy and type a single-word keyword (e.g. "foo") in the address bar, I just get: "504 Host foo lookup failed: Host not found"
I can see that Polipo tries to interpret the word "foo" as an HTTP address (the address bar shows "http://foo/"). If I use two words as the keyword (e.g. "foo foo"), I get the browser's default behaviour.
In polipo.conf I have tried setting the entry "dns Gethostbyname =" to "yes", "false", "reluctant" and "happily", but the result is always the same.
Is there a workaround? Am I missing something? My resolv.conf file is set up properly.
If polipo receives a request like:
POST / HTTP/1.1
or
POST / HTTP/1.1
Host: www.example.com
after responding with an HTTP 405 error or "Couldn't parse URL ...", connection->fd is never closed (a file descriptor leak).
Hi, I want to modify this proxy so that I can add an object from outside. For example, given a complete object (headers and content), how can I add it to the cache so that it appears to have been cached in the normal way, and all subsequent requests for this object are served from the cache?
Installation on Linux Arch
diskCacheRoot = "/mnt/dysk1/polipo/"
It worked in version 1.1.0
One of the things I loved in WWWOFFLE was the ability to surf in offline mode and have it ask if uncached and thus unavailable items should be fetched next time the program was online. It would be nice to see similar functionality in polipo.
The headers between the cached/proxied version and the real one differ.
The following headers are missing when being served through polipo:
X-Frame-Options, Server, Access-Control-Allow-Origin.
This is the same resource which is problematic in #42 - maybe that's an indication?!
# With proxy:
% curl -I http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js | sort
Age: 2138
Cache-Control: public, max-age=31382380
Connection: keep-alive
Content-Length: 706
Content-Type: application/x-javascript
Date: Thu, 06 Nov 2014 16:39:51 GMT
ETag: "1349976945.13-2193"
Expires: Wed, 04 Nov 2015 21:59:31 GMT
HTTP/1.1 200 OK
Last-Modified: Thu, 11 Oct 2012 17:35:45 GMT
# Without proxy:
% http_proxy= curl -I http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js | sort
Access-Control-Allow-Origin: *
Cache-Control: public, max-age=31380236
Connection: keep-alive
Content-Type: application/x-javascript
Date: Thu, 06 Nov 2014 17:15:35 GMT
ETag: "1349976945.13-2193"
Expires: Wed, 04 Nov 2015 21:59:31 GMT
HTTP/1.1 200 OK
Last-Modified: Thu, 11 Oct 2012 17:35:45 GMT
Server: nginx
X-Frame-Options: SAMEORIGIN
After curl -H "Cache-control: no-cache" http://a.fsdn.com/allura/nf/1415138103/_ew_/theme/sftheme/js/sftheme/header.js
this is fixed:
Access-Control-Allow-Origin: *
Age: 71
Cache-Control: public, max-age=31379019
Connection: keep-alive
Content-Length: 2193
Content-Type: application/x-javascript
Date: Thu, 06 Nov 2014 17:36:47 GMT
ETag: "1360348568.16-2193"
Expires: Wed, 04 Nov 2015 22:00:26 GMT
HTTP/1.1 200 OK
Last-Modified: Fri, 08 Feb 2013 18:36:08 GMT
Server: nginx
X-Frame-Options: SAMEORIGIN