5 Must Know Intelligence Gathering Tools and Techniques

People who are not very informed on this topic most likely think that an experienced pen tester, or hacker, would be able to just sit down and start hacking away at their target without much preparation.

(Just like in Hollywood movies when Mr. Hacker Guy only needs “a little more time” to get into the mainframe)

The truth is that without proper intelligence gathering, a penetration test quickly grinds to a near halt: when very little is known about the target, deciding which hacking tool to use becomes very difficult.

The information gathering stage can be applied to many other tasks in life and seems to be the one thing that is commonly overlooked since it doesn’t involve any real fancy stuff and can be somewhat boring.

A few good examples would be job interviews, tests at all school levels, game planning in any sport, and so on.

We have all been in that awful situation in school where we simply forgot to study for a test, filled in the very few answers we did know, then resorted to either lucky guesses or cheating. Both methods usually end in a failing grade, getting in trouble for cheating, or both.

Some in depth studying before the test would have prevented such failures from happening, but as usual hindsight is 20/20.

The scenario above translates directly to penetration testing: without knowing how to properly penetrate a network, you are more than likely to get caught, which is why in-depth reconnaissance is so important.

Obviously, getting caught during an authorized penetration test is not exactly a legal issue, but gaining unauthorized access to a network, whether by accident or with bad intentions, can most definitely result in some time behind bars.

Even a legitimate penetration test could result in a legal issue, though, if not all guidelines are established before the test takes place, which once again comes down to proper preparation.

How could this happen during a legitimate penetration test?

Glad you asked!

Such an occasion would take place if the pen tester accessed sensitive data on a company network that was not intended to be part of the test. (One would think that all network data should be fair game during a penetration test, but that’s not always the case.)

Okay, now time to talk about the tools.

#1 Google Hacking (or other search engines)


Your first reaction to this should naturally be “Really?!” and my response would be…

“Yes, really”.

(bet you never expected that response)

Google is not the only search engine that you could use for the techniques about to be mentioned, but it is safe to assume that Google is more than likely the search engine you would have chosen to use.

So let’s get started!

Even though it might seem obvious that using a search engine is a good way to gather information about anything, it goes a bit further than simply reading a company’s web page. What modern companies seem to forget is that a large web presence is a good thing, but it also means that a lot more of their information is posted on the internet.

Google search operators, or directives, are special keywords that let you pull more information from a website than meets the eye.

Usually when someone searches for something online, they simply enter a search term and then start digging for what they’re looking for, but using operators can make the results a lot more specific.

A basic example is the “allintext” operator. To search for healthy pizza recipes, you would type allintext: healthy pizza recipes in the search field. The results will only show pages that have all of the words healthy, pizza, and recipes in their text.

Searching for healthy pizza recipes is a very basic and broad example, but it shows how an operator can make your search results more specific.

There are a number of operators one can use and a very large number of combinations depending on what you want your results to be.

The full list of web search operators contains the following:

For Web Search

allinanchor:, allintext:, allintitle:, allinurl:, cache:, define:, filetype:, id:, inanchor:, info:, intext:, intitle:, inurl:, link:, related:, site:

You can find operators for searches seeking other results such as image searches, group searches, directory searches, news searches, and product searches.

It would take quite a while to explain all of the operators, but a little independent research will quickly explain any operator you want to learn about.


So how can these operators be used for reconnaissance?

Well, as mentioned before, the right combination of operators can return some interesting results.

For example, you could type the following in the search field:

inurl:admin inurl:orders filetype:php

Anyone with some ecommerce or web development knowledge would quickly realize that this combination of operators is looking for a website’s customer order pages, which may contain sensitive financial information.
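If you find yourself combining operators a lot, the idea can be sketched in a few lines of Python. This is only an illustration: the function names build_query and search_url are made up for this example, and the only outside detail assumed is Google’s usual https://www.google.com/search?q=... URL format.

```python
from urllib.parse import urlencode


def build_query(operators, terms=""):
    """Join (operator, value) pairs and optional plain terms into one query string."""
    parts = [f"{name}:{value}" for name, value in operators]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL for the query (the URL format is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Rebuild the example query from the text above.
    query = build_query([("inurl", "admin"), ("inurl", "orders"), ("filetype", "php")])
    print(query)
    print(search_url(query))
```

Keeping the operators as a list of pairs makes it easy to swap one operator in or out and see how the results change.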

I highly recommend checking out this great pdf which explains in depth how this technique works. (Although it might be a little outdated, it’s a great learning resource nonetheless)


Another resource I highly recommend is the Google hacking section of Exploit-DB.com. The website has a very wide range of information on penetration testing and hacking, and its Google hacking section provides a number of search queries that others have come up with.

The page also provides a mini search engine so you can find the search queries that fit your interest, such as files containing passwords, files containing usernames, and files containing sensitive shopping information.

You can check out this awesome tool here http://www.exploit-db.com/google-dorks/

#2 The Harvester

A very common and basic method that attackers use to gain information about their targets is contacting the targets directly, usually via email, which is known in the information security world as phishing.

The attacker is usually seeking some sort of sensitive information (in this case network login information) which can be acquired in different ways. A popular method that attackers use for corporate targets is sending the target an email with some sort of urgency.

The email either leads the victim to fill out a fake form, which provides the attacker with login details, or to open an attachment containing malware that could provide the attacker with a backdoor into the network.

A gullible victim will enter their credentials into the fake form without knowing they just provided the attacker with login credentials to their company’s network.

Now the question is:

How does an attacker get such contact information?

Well, aside from looking at the “Contact Us” page on a company’s website, there is a free tool called The Harvester that searches for and finds user contact information that is not normally accessible on a website.

The popular way to use this tool is through Kali Linux, which is no surprise since Kali Linux seems to have every penetration testing tool available.

To use The Harvester in Kali Linux, open a terminal (similar to a command prompt window), type theharvester, and press enter; the tool will load.

Using this tool, in Kali Linux or separately, is pretty simple, even though people automatically assume a tool is complicated if it doesn’t have a fancy GUI to work with.

An example of how you can start your harvesting is to type the following command:

./theharvester.py -d website.com -l 20 -b all

The first part of the command, ./theharvester.py, runs the tool’s script from the current directory.

The second part, -d, specifies the domain you are targeting.

The third part, -l (a lowercase L), limits how many results are returned; here the limit is 20.

The last part, -b, specifies which public source to search. Here all is used, but feel free to try more specific sources such as Google and other search engines. You can even give social networks a try, since there is a strong chance that contact information is contained somewhere on those sites as well.
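If you plan to run the same harvest against several domains, it can help to build the command in a small script. The sketch below only assembles the command from the three options described above (-d, -l, -b); the function name harvester_command and the script path ./theharvester.py are assumptions for illustration, and nothing is actually executed.

```python
import shlex


def harvester_command(domain, limit=20, source="all", script="./theharvester.py"):
    """Return the command as an argument list; nothing is executed here."""
    # Only the -d, -l and -b options described in the text are used.
    return [script, "-d", domain, "-l", str(limit), "-b", source]


if __name__ == "__main__":
    # Print the command in a form you could paste into a terminal.
    print(shlex.join(harvester_command("website.com")))
```

Building an argument list instead of one long string keeps a domain name with odd characters from being misread by the shell if you later hand the list to something like subprocess.run. Only do that against targets you are authorized to test.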

You could download this tool by itself instead of accessing it through Kali Linux since much less memory would be used.

Here is a link where you can check out the individual program


The Harvester is not only good for email addresses, however. It is also good for gathering subdomains, hosts, employee names, open ports, and banners, all of which it collects from a variety of online sources.

Crazy that such a handy tool is free, right?

#3 WHOIS.net and Verio.com

These are two free tools that can provide pretty valuable information (in my opinion too much information) and require very little technical knowledge.

I included both of them in this section because the information they provide is very similar but not exactly identical. All you really need to know is how to navigate a website to find the information you are looking for.

First I will cover WHOIS.net


All you have to do to use this tool is to enter the website into the search box and click go.

The results will include a wide range of information about the target website such as IP addresses, host names, Domain Name Servers, and you can even find addresses and phone numbers of either the company or the individual owner which is common for smaller sites.
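WHOIS.net is a web front end, but the same kind of record can also be requested directly with the WHOIS protocol, which is plain text sent over TCP port 43. Below is a minimal sketch under a few assumptions: the function names are made up for this example, whois.iana.org is just a default starting server (different registries run their own servers), and the “Key: Value” parsing is a guess at a common layout that not every server follows.

```python
import socket


def query_whois(domain, server="whois.iana.org", port=43, timeout=10.0):
    """Send one WHOIS query and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # A WHOIS request is just the query followed by CRLF.
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'Key: Value' lines into a dict of lists, skipping comment lines."""
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if value.strip():
            record.setdefault(key.strip(), []).append(value.strip())
    return record
```

Calling parse_whois(query_whois("example.com")) would give you a dictionary to dig through, although what keys show up depends entirely on the server that answers.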

I was surprised at how many sites had so much of their information displayed on WHOIS when it can be easily avoided by purchasing domain privacy.

I highly recommend taking advantage of privacy if you are starting a website for yourself or a company because the less information you leave available for the bad guys the better.

The second tool I mentioned earlier in this section is Verio.com which is pretty much based on results found through WHOIS, but in my experience I have noticed that it returns different results containing more information for one reason or another. I have tested this a bunch of times and have noticed that Verio definitely tends to show some more information than WHOIS.

You can access Verio by clicking the link at the bottom of the WHOIS page.

You will notice that the page where you search with Verio is similar to WHOIS


I really have a hard time believing that such a large number of website owners are not bothered by the fact that both their website’s information AND their personal contact information can be found with these tools. Hopefully people start to care at some point, right?

#4 Netcraft.com

Netcraft is a website that provides a wide range of internet security services as well as research data and analysis.

I highly recommend digging around this site because it has a TON of great information about internet security and information security in general. I really like the news section of the site because it provides very in depth details about each story as well as different kinds of charts to help give a full explanation of the data.

Anyway, moving on to how we are going to use this tool.

Netcraft provides a free website analysis tool that after entering the URL provides a detailed report about the given website. The tool can be found here http://toolbar.netcraft.com/site_report/.

Below is a small portion of what you will see on the page after you enter a URL. The default information that shows is actually the information from the Netcraft website so I recommend taking a look at the large amount of information that this tool provides.


Netcraft, like the other tools mentioned in this article, can be used for analysis by both good and bad guys, but it is a great tool no matter what color hat you are wearing.

One of the features I like most about the Netcraft reporting tool is that it provides a Netcraft security risk rating on a scale of 1 through 10. That is very useful for website owners who want to make sure their virtual property is airtight, and for the bad guys who want to see whether your website is a sitting duck.

#5 MetaGoofil

MetaGoofil is a tool for gathering information about a target from metadata, which it takes from documents that are commonly available to the public on company websites.

The most popular publicly available documents, in my opinion, are PDF documents, because those seem to be very common for job descriptions and various other types of data that a company will publish online.

MetaGoofil generates a report with information that contains the usernames, software versions, server names, and even sometimes machine names that were involved with the creation and distribution of the given document.
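To get a feel for where this kind of metadata lives, here is a small sketch that reads the author fields out of a Word .docx file using only the Python standard library. This is not how MetaGoofil itself works internally (this article does not cover that); it simply relies on the fact that a .docx file is a zip archive with a docProps/core.xml file inside. The function name read_docx_metadata is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of Office Open XML files.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_docx_metadata(path_or_file):
    """Return the creator and last-modified-by fields of a .docx, if present."""
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    wanted = {"creator": "dc:creator", "last_modified_by": "cp:lastModifiedBy"}
    found = {}
    for name, tag in wanted.items():
        node = root.find(tag, NAMESPACES)
        if node is not None and node.text:
            found[name] = node.text
    return found
```

Run against a document downloaded from a public website, the creator field is often a real person’s name or network username, which is exactly the kind of detail that ends up in a MetaGoofil report.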

Below are examples of the information generated from MetaGoofil when searching Microsoft.com

[Images: metagoofil, metagoofil-2, metagoofil-3, metagoofil-4]

I find MetaGoofil to be a very interesting tool because it provides valuable information without much work at all. The part of MetaGoofil’s output that really catches my eye is the usernames, because a username is pure gold for a hacker trying to gain unauthorized access to a system, although figuring out the password would obviously require some additional finger work.

MetaGoofil, like some of the other tools mentioned, can be used through Kali Linux, or you can download MetaGoofil by itself at the link below.