Andre Honsberg is a software engineer who develops mostly for the web. He lives in Hamburg, Germany, and builds web software using a wide spectrum of technologies.

[Screenshot: spider script on the command line]

Today I will be talking about making anonymous cURL requests using PHP, Tor, and the Polipo proxy. The system I am working on is Ubuntu 10.04, customized for web development. I have Apache installed and will be installing Tor and Polipo on this local box. The script I use runs on the command line. It can easily be modified to make POST requests or download pages anonymously. You could pull other people's content without them being able to tell from their logs where the request came from. I could also see this being the perfect spider when combined with the Simple HTML DOM class for PHP. Alright, let us start the hackery.

In Ubuntu 10.04 (I am guessing other versions will work also) open up a terminal by pressing [CTRL] [ALT] [T] if you did not change the keyboard shortcuts. In this terminal enter the following and hit enter:

echo 'deb http://deb.torproject.org/torproject.org lucid main' | sudo tee /etc/apt/sources.list.d/tor.list && gpg --keyserver keys.gnupg.net --recv 886DDD89 && gpg --export A3C4F0F979CAA22CDBA8F512EE8CBC9E886DDD89 | sudo apt-key add - && sudo aptitude update && sudo aptitude install tor tor-geoipdb polipo

That should take care of importing the repository and downloading the necessary packages. Now the configuration begins. This is a very simple process because the kind folks over at the Tor project already provide a superb config file for us to use, available at the following URL:

http://www.andrehonsberg.com/media/polipo.conf

Just copy all the text from that config file into the one on your system. You can open your config file by typing the following into the terminal and pressing Enter:

sudo gedit /etc/polipo/config

Now that the config file is open, delete everything and paste in what you copied earlier. Since this article is meant for people who know what they are doing, I will not go into how to copy and paste. If anyone does not know how, please leave a comment below with your home IP address so we can point the DDoS machine your way. Save the file, then restart Tor and Polipo by running the following commands in the terminal:

sudo /etc/init.d/tor restart
sudo /etc/init.d/polipo restart
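
Before moving on, it can help to confirm that Polipo actually came back up. The following is a minimal sketch, not part of the original script: `proxy_is_listening` is a hypothetical helper name, and port 8118 is taken from the proxy address the script later in this article connects to. It only checks that something accepts TCP connections on that port; it does not prove that traffic is being routed through Tor.

```php
<?php
// Hypothetical helper (not part of the original script): returns true when
// something accepts TCP connections on the given host and port.
function proxy_is_listening(string $host, int $port, float $timeout = 2.0): bool
{
    // fsockopen() returns false on failure; @ suppresses the PHP warning.
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// 8118 is the Polipo port used by the script later in this article.
if (proxy_is_listening('127.0.0.1', 8118)) {
    echo "[-] Polipo is accepting connections on 127.0.0.1:8118\n";
} else {
    echo "[!] Nothing is listening on 127.0.0.1:8118\n";
}
```

If the second message appears, check the Polipo config file and restart the service again before running the spider.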

If you followed the directions thus far, we should be in good shape. Now jump over to your favorite IDE (mine is NetBeans 7) and start crunching code. For this example we will write a PHP command line script. It takes two inputs, so our command will look like this:

# Usage: php <script file> <url or page> <renew identity: y/n>
php get_links.php http://www.andrehonsberg.com y

[Screenshot: IP hidden from the server log]

The code to make all this happen is pretty simple. The example comes first, followed by an explanation.

<?php

include ('classes/simple_html_dom.php'); // class I use to parse the DOM
include ('includes/global.inc.php');     // pulls in other files, including the clm() helper

if (exec('whoami') != 'root') { // check if we are root
  clm("[!] You must be root to run this script!"); 
  // clm is a wrapper function for fwrite(STDOUT, "message \n");
  exit;
}

if ($argc < 2) {  // $argv[0] is the script name, so we need at least a url after it
  clm("[!] Please enter a url.");
  clm("[-] To also set a new identity add y to the end. No is the default.");
  clm("[-] Example: {$_SERVER['PHP_SELF']} http://www.andrehonsberg.com y");
  clm("[-] Usage:   {$_SERVER['PHP_SELF']} <url> [y/n]");
  exit;
}

if (!$curl = curl_init()) { // initialize curl
  clm("[!] Could not initialize curl!");
  exit;
}

// restart tor and polipo to get new identity
if ($argc == 3 && $argv[2] == 'y') { // restarting tor allows us to change the ip
  clm("[-] Restart Tor    >> " . exec('/etc/init.d/tor restart'));
  clm("[-] Restart Polipo >> " . exec('/etc/init.d/polipo restart'));
}

curl_setopt ($curl, CURLOPT_URL, "{$argv[1]}");
curl_setopt ($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($curl, CURLOPT_USERAGENT, "U53r4g3nt K1ll4"); // user agent so we can see easier on log
//curl_setopt ($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (compatible; MSIE 5.01; Windows NT 5.0)");
curl_setopt ($curl, CURLOPT_AUTOREFERER, 1);
curl_setopt ($curl, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($curl, CURLOPT_REFERER, "http://www.google.com/");
curl_setopt ($curl, CURLOPT_PROXY, "http://127.0.0.1:8118/"); // connect through the local proxy (Polipo)

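$result = curl_exec ($curl);
if ($result === false) { // curl_exec returns false on failure, e.g. when the proxy is not running
  clm("[!] Request failed: " . curl_error($curl));
  curl_close ($curl);
  exit;
}
curl_close ($curl);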

//write contents of $result to file
$file = "page.txt";
if (!$fh = fopen($file, 'w')) {
  clm("[!] Could not open file for html dumping.");
  exit; 
}
fwrite($fh, $result);
fclose($fh);

//turn file into dom object
if (!$page = file_get_html("page.txt")) {
  clm("[!] Could not convert html from file to dom object!");
  exit;
}
$links = $page->find('a');
$x = 0;
foreach ($links as $link) {
  $href = $link->href;
  clm("[>] {$href}");
  $x++;  
}
clm("+---------------------------------------------------------+");
clm("| [*] {$x} links found on {$argv[1]}");
clm("+---------------------------------------------------------+");
exit;

?>

The snippet above is a complete little command line utility that anonymously pulls all the URLs from a page specified on the command line. All you need to make it work is PHP, the Simple HTML DOM class, and a command line. First the script checks whether the user is root. This is needed if we want to change identity, because restarting Tor requires root. Then it checks that we were given at least a URL. If we asked for a new identity, it restarts Tor and Polipo so that we get a new IP address. Next we set all the cURL options. Make sure the last option, CURLOPT_PROXY, points at the address and port the proxy (Polipo) is listening on; this can be changed to any address you have a proxy running on. Then we retrieve the HTML of the target page. After writing that HTML to a file and loading it with the Simple HTML DOM parser, we loop through all the links and display them on the command line.
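
The script calls clm() from includes/global.inc.php, which is not shown in this article. The sketch below is one assumed shape for that helper, based only on the comment in the script that describes it as a wrapper around fwrite(STDOUT, ...). It also includes `parse_spider_args`, a hypothetical function that mirrors the argument check described above. Neither is the original implementation, and the optional `$stream` parameter is an addition made here so the helper can write somewhere other than the terminal.

```php
<?php
// Assumed shape of the clm() helper; the original lives in
// includes/global.inc.php, which the article does not show.
// STDOUT is only defined when PHP runs on the command line.
function clm(string $message, $stream = STDOUT): void
{
    fwrite($stream, $message . "\n");
}

// Hypothetical helper: turns $argv into the two inputs the script expects.
// Returns null when no URL was given.
function parse_spider_args(array $argv): ?array
{
    if (count($argv) < 2) { // $argv[0] is the script name itself
        return null;
    }
    return [
        'url'   => $argv[1],
        // Same rule as the script: only a literal "y" renews the identity.
        'renew' => isset($argv[2]) && $argv[2] === 'y',
    ];
}

$args = parse_spider_args($argv ?? []);
if ($args === null) {
    clm("[-] Usage: php get_links.php <url> [y/n]");
} else {
    clm("[-] URL: {$args['url']}, new identity: " . ($args['renew'] ? 'yes' : 'no'));
}
```

Keeping the argument handling in its own function makes it easy to check without touching Tor or the network.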


Hope you guys like this script. It should be really easy to customize into all sorts of stuff. Have fun hacking with it. I have also included screenshots so you can see the results with your own eyes.
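
One customization mentioned at the start of this article is sending a POST request through the same proxy. The following is a rough sketch of how that might look, assuming Polipo is still listening on 127.0.0.1:8118 as in the script. The function names `build_proxy_post_options` and `post_via_proxy` are hypothetical, and the cURL extension must be loaded.

```php
<?php
// Hypothetical helper: builds the cURL options for a POST through the local proxy.
function build_proxy_post_options(
    string $url,
    array $fields,
    string $proxy = 'http://127.0.0.1:8118/'
): array {
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        // Encode the fields as application/x-www-form-urlencoded.
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_PROXY          => $proxy,
    ];
}

// Hypothetical helper: sends the POST and returns the response body,
// or null when the request fails (for example, when the proxy is down).
function post_via_proxy(string $url, array $fields): ?string
{
    $curl = curl_init();
    curl_setopt_array($curl, build_proxy_post_options($url, $fields));
    $result = curl_exec($curl);
    return $result === false ? null : $result;
}
```

A call would look something like post_via_proxy('http://example.com/form.php', array('q' => 'tor')), where the URL and field name are placeholders. Building the options in a separate function keeps the proxy setting in one place, so the GET and POST variants cannot drift apart.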
