Tuesday, June 28, 2011

Watch Out for App Engine Proxies

Thanks to Amit Agarwal for detailing how to set up a proxy on Google App Engine.  I expect he may have helped the folks we fought today.  This morning, we found that our site had been proxied through a Google App Engine app, and that Google had indexed the proxied copy.  Somehow, Google ranked the copy above our original site: the copy was #1 and our original was #2.  A client called to tell us that we had porn on our site, which was technically false: the App Engine app was slipstreaming an ad banner (porn from juicyads.com) into its copy of our pages.

I firewalled off the 74.125.0.0/16 range to block proxy requests coming from GAE.  Then I found a link on the main proxy page that showed a revolving list of 80 different proxies.  No doubt, spreading traffic across these proxies helps keep each one under its GAE quota, and it confuses system administrators and Google proper.  Below is the list of proxies I found:

proxy-earth.appspot.com
proxy-hill.appspot.com
proxy-human.appspot.com
proxy-man.appspot.com
proxy-mars.appspot.com
proxy-river.appspot.com
proxy-sites.appspot.com
proxy-star.appspot.com
proxy-sun0.appspot.com
proxysun0.appspot.com
proxy-sun1.appspot.com
proxysun1.appspot.com
proxy-sun2.appspot.com
proxysun2.appspot.com
proxy-sun3.appspot.com
proxysun3.appspot.com
proxy-sun4.appspot.com
proxysun4.appspot.com
proxy-sun5.appspot.com
proxysun5.appspot.com
proxy-sun6.appspot.com
proxysun6.appspot.com
proxy-sun7.appspot.com
proxysun7.appspot.com
proxy-sun8.appspot.com
proxysun8.appspot.com
proxy-sun9.appspot.com
proxysun9.appspot.com
proxy-sun.appspot.com
proxysun.appspot.com
sun2surf1.appspot.com
sun2surf2.appspot.com
sun2surf3.appspot.com
sun2surf4.appspot.com
sun2surf5.appspot.com
sun2surf6.appspot.com
sun2surf7.appspot.com
sun2surf8.appspot.com
sunproxy0.appspot.com
sunproxy10.appspot.com
sunproxy11.appspot.com
sunproxy12.appspot.com
sunproxy13.appspot.com
sunproxy14.appspot.com
sunproxy15.appspot.com
sunproxy16.appspot.com
sunproxy17.appspot.com
sunproxy18.appspot.com
sunproxy19.appspot.com
sunproxy1.appspot.com
sunproxy2.appspot.com
sunproxy3.appspot.com
sunproxy4.appspot.com
sunproxy5.appspot.com
sunproxy6.appspot.com
sunproxy7.appspot.com
sunproxy8.appspot.com
sunproxy9.appspot.com
surf4sun1.appspot.com
surf4suns1.appspot.com
web2sun123.appspot.com
web2sun1.appspot.com
web2sun22.appspot.com
web2sun33.appspot.com
web2sun5.appspot.com
web2sun65.appspot.com
web2sun6.appspot.com
web2sun77.appspot.com
web2sun88.appspot.com
web4sun22.appspot.com
www2sun100.appspot.com
www2sun122.appspot.com
www2sun123.appspot.com
www2sun1.appspot.com
www2sun22.appspot.com
www2sun2.appspot.com
www2sun33.appspot.com
www2sun345.appspot.com
www2sun88.appspot.com
www2sun.appspot.com

I found this list using the following bash one-liner, which employs curl, sed, and grep:

touch /tmp/badsites; curl -s http://proxysunlist2.appspot.com/list | sed -n "/a.*href/ { s/.*http:\/\/\([^<]*\).*/\1/; p; }" > /tmp/badsites.new; grep -vxFf /tmp/badsites /tmp/badsites.new >> /tmp/badsites; wc -l /tmp/badsites

I had to run it several times to populate the list fully, presumably because each request shows only part of the revolving list.  When the line count stopped increasing, I figured I had discovered all of the proxies.
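The rerun-until-the-count-stops step can be automated.  What follows is only a rough sketch, not the script I actually ran: the function names, the MAX_STABLE threshold, and the two-second pause are all assumptions made for illustration, and it assumes (as the sed expression implies) that each link's text on the list page is the proxy URL itself.

```shell
#!/usr/bin/env bash
# Sketch: poll the revolving proxy list until no new hostnames show up.
# LIST_URL and /tmp/badsites come from the post; everything else is assumed.

LIST_URL="http://proxysunlist2.appspot.com/list"
MASTER="${MASTER:-/tmp/badsites}"
MAX_STABLE="${MAX_STABLE:-3}"   # hypothetical: stop after this many fetches with nothing new

# Read HTML on stdin; print the hostname from each line that holds a link.
extract_hosts() {
  sed -n '/a.*href/ { s/.*http:\/\/\([^<]*\).*/\1/; p; }'
}

# Read hostnames on stdin; append to file $1 only those not already in it.
merge_new() {
  local master="$1"
  touch "$master"
  # -x whole-line match, -F fixed strings (dots are literal), -v keep unseen lines
  grep -vxFf "$master" | sort -u >> "$master"
}

# Fetch repeatedly until the line count has been unchanged MAX_STABLE times in a row.
collect_proxies() {
  local stable=0 before after
  touch "$MASTER"
  while [ "$stable" -lt "$MAX_STABLE" ]; do
    before=$(( $(wc -l < "$MASTER") ))
    curl -s "$LIST_URL" | extract_hosts | merge_new "$MASTER"
    after=$(( $(wc -l < "$MASTER") ))
    if [ "$after" -eq "$before" ]; then
      stable=$((stable + 1))
    else
      stable=0
    fi
    sleep 2   # assumed pause between requests
  done
  echo "$after hosts in $MASTER"
}
```

Nothing runs until collect_proxies is called, so the file can be sourced and the helpers tried on saved HTML first.  Requiring several unchanged fetches in a row, rather than just one, is a guess at how to avoid stopping early when the revolving list happens to repeat itself.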