How to track down pesky "unauthenticated content" on TLS sites?

Often when working with a TLS-secured web page you'll find the odd file which accidentally includes something from a URI with a non-https scheme.

This can be obvious to find (e.g. grep the page source for "http://") or tedious in the extreme (think of some obfuscated third-party JavaScript library creating <img> tags on the fly).
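
For the easy case, a one-liner along these lines (with https://example.com/ standing in for the page under test) fetches the served markup and flags any literal plain-http references. It won't catch URLs that scripts assemble at runtime, which is where the approach below comes in:

curl -s https://example.com/ | grep -n 'http://'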

One fairly tidy solution I've come up with (and haven't seen documented elsewhere) is this:

tcpdump -i eth0 -A -n port 80 | grep GET

This will need to be run as root (use sudo or su as appropriate to your distro), but it will print the request line of every plain-http GET leaving your PC while it runs: a convenient list of everything being fetched over port 80. Change "eth0" to match whatever interface you reach the web server over.
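
One wrinkle: tcpdump buffers its output when writing to a pipe, so matches can appear in delayed bursts rather than as they happen. Adding tcpdump's -l flag (and, with GNU grep, --line-buffered) keeps the list scrolling in real time:

tcpdump -l -i eth0 -A -n port 80 | grep --line-buffered GET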

If you're running something that generates a lot of web traffic in the background (like a Tor exit node) and you're reasonably sure which machine is serving the undesired http content, you can narrow things down with a filter along these lines:

tcpdump -i eth0 -A -n port 80 and host example.com | grep GET

Where example.com is the machine hosting the content.
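
The same trick works from the other direction too: if you know which local machine is making the requests but not where they're going, filter on the client instead (192.0.2.10 here is a made-up address standing in for the browsing machine):

tcpdump -i eth0 -A -n port 80 and src host 192.0.2.10 | grep GET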

For more information see man 8 tcpdump. This trick is probably obvious if you use tcpdump a lot, but it may not jump immediately to mind otherwise.

--Michael Fincham <michael@finch.am> 2011-02-11
