The other day I decided that I wanted to become more familiar with the internals of the Metasploit Framework, so with the latest svn of the framework and a couple of books on Ruby, I started digging.  I decided a fun project would be to port some of my existing tools and scripts into the framework.  I started with a ground-up rework of GooSweep (which had fallen into disrepair), and I have to say: putting this together in Ruby with the Metasploit Framework was a very enjoyable experience, and it resulted in something that’s useful and usable way beyond what GooSweep used to be.  I’m definitely going to be writing stuff in the framework more often now.

This module, web_search_scan, will perform search engine queries (Google by default, but configurable) for each IP address (and, optionally, hostnames found by rDNS) in a range specified by the user.  If there are hits on the search engine for a host, the module will display the number of hits, and URLs to view the results.  If you have a database connected, it will also log notes to the database for each host that it finds.
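
To make the idea concrete, here is a minimal standalone sketch of how the list of search terms could be built for a range. It is not the module's code: inside Metasploit, RHOSTS handling comes from the framework, and the helper names `expand_cidr` and `search_terms` are made up for this illustration. It only assumes Ruby's standard `ipaddr` and `resolv` libraries.

```ruby
require 'ipaddr'
require 'resolv'

# Hypothetical helper: expand a CIDR block into individual address strings.
def expand_cidr(cidr)
  IPAddr.new(cidr).to_range.map(&:to_s)
end

# Hypothetical helper: the terms to search for one host. Always the IP
# itself, plus its rDNS name when lookup is enabled and a PTR record exists.
# The resolver is injectable so the function can be exercised offline.
def search_terms(ip, lookup: false, resolver: Resolv)
  terms = [ip]
  if lookup
    begin
      terms << resolver.getname(ip)
    rescue Resolv::ResolvError
      # No reverse record for this address; search on the IP alone.
    end
  end
  terms
end
```

Calling `expand_cidr("10.0.0.0/30")` yields the four addresses in that block, and each one can then be passed through `search_terms` to decide what gets sent to the search engine.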

It’s a simple idea, but I’ve found the technique to be very useful.  It requires a little manual work to check out the results, since there’s no way of really knowing what you’re going to find, but you can find some interesting things like this.  For example:

  • Publicly-accessible and indexed web logs and stats – You can tell if someone at that IP has visited a site, and possibly even when, how often, and what their user agent was
  • Wiki edits and IP user pages
  • Mailing list and newsgroup posts – Hits from the mail/post headers, or occasionally from admins who ask for configuration help without censoring addresses
  • Abuse reports for open proxies, spammers, etc.
  • Posts to forums, comments, or guestbooks that log and display IP addresses

With a little detective work, you can map out some known active hosts on a network, and some information about those hosts, without having to actively probe the network.  This is great for the information-gathering phase of a penetration test.  I’ve also found it to be very helpful for learning more about potential attackers when doing incident response.

Here’s what the module’s info looks like in Metasploit (output edited for width):

HacBook:framework wesley$ ./msfconsole

                __.                       .__.        .__. __.
  _____   _____/  |______    ____________ |  |   ____ |__|/  |_
 /     \_/ __ \   __\__  \  /  ___/\____ \|  |  /  _ \|  \   __\
|  Y Y  \  ___/|  |  / __ \_\___ \ |  |_> >  |_(  <_> )  ||  |
|__|_|  /\___  >__| (____  /____  >|   __/|____/\____/|__||__|
      \/     \/          \/     \/ |__|

       =[ msf v3.2-release
+ -- --=[ 299 exploits - 124 payloads
+ -- --=[ 18 encoders - 6 nops
       =[ 68 aux

msf > use auxiliary/scanner/misc/web_search_scan
msf auxiliary(web_search_scan) > info

       Name: Web Search Engine IP Address Scanner
    Version: 5612

Provided by:
  Wesley McGrew <>

Basic options:
  Name         Current Setting  Required  Description
  ----         ---------------  --------  -----------
  LOOKUP       false            yes       Reverse lookup IPs and
                                          search hostnames too? (Not
  PROXYCHAINS                   no        Pipe-delimited (|) list of
                                          proxy chains to use
  QUIET        false            yes       Quiet output (still logs to
  RETRIES      3                yes       Number of times to retry
                                          queries if they fail
  RHOSTS                        yes       The target address range or
                                          CIDR identifier
  SLEEP        3                yes       Minimum time to sleep between
                                          requests (seconds)
  SLEEPRAND    3                yes       Random additional time to
                                          sleep (seconds)
  THREADS      1                yes       The number of concurrent threads

  This scanner will do a web search engine query for each IP address
  (optionally, rDNS names as well) and record the number of hits and a
  URL to the query results. This is a useful for determining some
  active hosts and information gathering about a network without
  having to directly probe the network. Common results include
  publicly accessible web access logs, mailing list posts, abuse
  reports, and wikipedia edits. (WARNING: If you set LOOKUP to true,
  your target may notice the reverse DNS lookups.)

msf auxiliary(web_search_scan) >

A quick overview of these options:

  • RHOSTS - Set of IP addresses you want to scan.  You can comma-delimit sets of hosts, use dash-separated ranges, or use masks, just like with any Metasploit module.
  • LOOKUP - If you like, the module can do a reverse-DNS query for each IP address and perform search engine queries for each hostname found.  If you're trying hard to be stealthy, you may want to avoid this option, as the target's DNS will see the queries.
  • SLEEP and SLEEPRAND - After each search engine query, the module will sleep for SLEEP + rand(SLEEPRAND+1) seconds.  Many web search engines will freak out if you throw queries at them faster than a normal/human user would.  You can adjust this to be faster or slower, depending on how dangerous you feel.
  • RETRIES - Sometimes, even when we're careful, a search engine will respond with something we have no idea how to parse, or stop responding altogether.  This is the number of times the module will attempt a query before giving up.  At the end of a complete scan, the module will display all the queries that failed, so that you are aware of any false negatives.
  • QUIET - If set to "true", the module will only output status at the beginning and end of its run.  If you set this, you will want to have a database connected, as that's the only place the results will be going.  You can set this, use "run -j" to execute the scan, and it will run in the background fairly quietly, letting you do other things in Metasploit while this slowwww scan runs :) .
  • PROXYCHAINS and THREADS - Many Metasploit modules allow you to specify a proxy chain to work with.  This one allows you to specify multiple chains, which will allow you to parallelize and run a scan faster, even with all the necessary sleeping.  For best results, set THREADS to a few greater than the number of proxy chains.  Each thread will claim a proxy for the duration of each individual query.  I apologize that this feature isn't extremely well tested (I left my botnet in my other pants).
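
The SLEEP/SLEEPRAND and RETRIES behavior described above can be sketched roughly like this. This is an illustration under assumptions, not the module's actual code: `query_delay`, `with_retries`, and `scan` are hypothetical names, a failed query is modeled as either an exception or a nil result, and the `pause` parameter exists only so the loop can be run without really sleeping.

```ruby
# Seconds to wait after a query: SLEEP plus a random 0..SLEEPRAND extra.
def query_delay(sleep_base, sleep_rand)
  sleep_base + rand(sleep_rand + 1)
end

# Attempt the block up to `retries` times. Returns the first non-nil
# result, or nil if every attempt raised or returned nil.
def with_retries(retries)
  retries.times do
    begin
      result = yield
      return result unless result.nil?
    rescue StandardError
      # Treat any error as a failed attempt and try again.
    end
  end
  nil
end

# Run every query through the caller's block, pausing between queries.
# Returns [hits_by_query, failed_queries] so failures can be reported
# at the end of the scan.
def scan(queries, retries: 3, sleep_base: 3, sleep_rand: 3,
         pause: ->(seconds) { sleep(seconds) })
  hits = {}
  failed = []
  queries.each do |query|
    result = with_retries(retries) { yield query }
    if result.nil?
      failed << query
    else
      hits[query] = result
    end
    pause.call(query_delay(sleep_base, sleep_rand))
  end
  [hits, failed]
end
```

With the defaults of 3 and 3, each pause falls somewhere between 3 and 6 seconds. Collecting the failures separately is what makes it possible to list them at the end, so a query that never parsed is not mistaken for a host with zero hits.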

There are also some "advanced" options that allow you to tweak where and how the module gets its results.  This can be useful if you need to use a different search engine, or fix the current one if it changes and breaks the regex.  Here's what you can tweak:

msf auxiliary(web_search_scan) > show advanced
Module advanced options:
   Name           : NOHITSREGEX
   Current Setting: (?:No results found)|(?:did not match any documents)
   Description    : Regex to match a zero-hit search
   Name           : NUMHITSREGEX
   Current Setting: of (?:about )?<b>((?:[,\d])+)<\/b> for <b>
   Description    : Regex to match number of hits
   Name           : SEARCHHOST
   Current Setting:
   Description    : Hostname of search engine
   Name           : SEARCHPORT
   Current Setting: 80
   Description    : Search Port
   Name           : SEARCHURI
   Current Setting: /search?hl=en&q=*&btnG=Google+Search
   Description    : Search URI (* for query location)
   Name           : TIMEOUT
   Current Setting: 10
   Description    : Timeout for the search engine to respond
   Name           : USERAGENT
   Current Setting: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US;
rv: Gecko/2008070208 Firefox/3.0.1
   Description    : The User-Agent header to use for all requests

One thing you could do with the SEARCHURI option is add in extra parameters such as “” to look for mentions of IP addresses and hosts only on a specific site.
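
For anyone adapting these options to another search engine, here is a rough sketch of how the URI template and the two regexes fit together. The constants are copied from the default settings shown above; the function names `build_search_uri` and `parse_hits` are hypothetical, and the order in which the regexes are checked is my assumption rather than a description of the module's internals.

```ruby
require 'uri'

# Defaults copied from the "show advanced" output.
NUM_HITS_REGEX = /of (?:about )?<b>((?:[,\d])+)<\/b> for <b>/
NO_HITS_REGEX  = /(?:No results found)|(?:did not match any documents)/
SEARCH_URI     = '/search?hl=en&q=*&btnG=Google+Search'

# Substitute the URL-encoded query for the * placeholder in the template.
def build_search_uri(template, query)
  encoded = URI.encode_www_form_component(query)
  template.sub('*') { encoded }
end

# Returns the hit count as an Integer, 0 for an explicit "no results"
# page, or nil when the response matches neither regex (a failed query).
def parse_hits(body)
  match = NUM_HITS_REGEX.match(body)
  return match[1].delete(',').to_i if match
  return 0 if body =~ NO_HITS_REGEX
  nil
end
```

Returning nil for an unrecognized page keeps "the engine changed its markup" distinct from "this host has no hits", which is the case the retry logic and the end-of-scan failure list are there to catch.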

Here’s what a scan might look like (searching non-routable ranges guarantees some results, but it’s a bit pointless too :) ):

So there you have it!  Here’s the code, if you want to drop it in the framework (tested with the latest SVN of metasploit) and use it yourself:


…guys in range of your wireless network with laptop stickers like this (click for full resolution):

The above shot was taken of me at last night’s Linux users’ group meeting, and Gimp’d up this afternoon (slow day). I was mostly ignoring the presentation (because it was “What is Linux”, not because of bad presentation skills. Greg did a great job.). Instead I was trying to figure out libpcap-ruby. I’m new to Ruby, so I just sort of skimmed the pickaxe book and dove right into writing a sniffer. Personally I don’t really understand why a packet’s .raw_data member would contain the headers but not the TCP payload, but I’m not too concerned with it since I finally noticed the .tcp_data member.

I think it’s going to take some tweaking and getting-used-to before I really dig irb as much as I like ipython. I often use ipython as my interface when I’m using python code I’ve written. Rather than having a set of scripts that I edit to do whatever I need, I tend to write python functions for tasks, load them up in ipython, and use them interactively. That way I can shuffle the data around in a more ad-hoc manner, stuff it out to a file when I need to, and basically just play the part of a script myself. It looks like irb is where I’ll be doing the same things with Ruby, but I’m just not as good at it yet :) .

© 2012 McGrew Security