The information gathering steps of footprinting and scanning are of utmost importance. Good information gathering can make the difference between a successful penetration test and one that fails to provide maximum benefit to the client. Information is a weapon: a successful penetration test, like any hacking process, needs a lot of relevant information, which is why information gathering, also called footprinting, is the first step of hacking. Gathering valid login names and e-mail addresses is therefore one of the most important parts of a penetration test. We can use them to profile the target, brute-force authentication systems, send client-side attacks (through phishing), look through social networks for juicy info on platforms and technologies, and so on. To gather this information, we can use either the tool theHarvester or the Metasploit module called search_email_collector.
What is theHarvester
theHarvester was developed in Python by Christian Martorella. It is a tool that gathers e-mail accounts, user names and hostnames/subdomains from different public sources such as search engines and PGP key servers.
The tool is designed to help the penetration tester at an early stage of a test; it is effective, simple and easy to use. The sources supported are:
- Google – emails, subdomains/hostnames
- Google profiles – Employee names
- Bing search – emails, subdomains/hostnames, virtual hosts
- PGP servers – emails, subdomains/hostnames
- LinkedIn – Employee names
- Exalead – emails, subdomains/hostnames
Besides these sources, the tool offers the following features:
- Time delays between requests
- XML results export
- Search a domain in all sources
- Virtual host verifier
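As a rough illustration of what harvesting e-mail addresses and hostnames from search results involves, the sketch below pulls both out of a block of text with regular expressions. It is not theHarvester's own code: the function names extract_emails and extract_hosts, the patterns and the example.com sample text are all assumptions made for this example.

```python
import re


def extract_emails(text, domain):
    """Return sorted, de-duplicated e-mail addresses at `domain` (or a subdomain of it)."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)*"
        + re.escape(domain)
        + r"(?!\.?[A-Za-z0-9-])",  # do not stop in the middle of a longer name
        re.IGNORECASE,
    )
    return sorted({match.lower() for match in pattern.findall(text)})


def extract_hosts(text, domain):
    """Return sorted, de-duplicated hostnames under `domain`, skipping e-mail domains."""
    pattern = re.compile(
        r"(?<![A-Za-z0-9@._-])(?:[A-Za-z0-9-]+\.)+"
        + re.escape(domain)
        + r"(?!\.?[A-Za-z0-9-])",
        re.IGNORECASE,
    )
    return sorted({match.lower() for match in pattern.findall(text)})


if __name__ == "__main__":
    # Made-up sample standing in for a page of search results.
    sample = (
        "Contact alice@example.com or Bob@Example.com. "
        "See www.example.com and mail.example.com."
    )
    print(extract_emails(sample, "example.com"))
    print(extract_hosts(sample, "example.com"))
```

Lower-casing before de-duplication treats Bob@Example.com and bob@example.com as the same address, which is usually what you want when building a target list.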
Go to Arsenal -> scanning -> web scanner -> theharvester.
If it is not available in your distribution, you can easily download it from http://code.google.com/p/theharvester/downlaod, where the latest version, 2.2, is available; simply download and extract it.
Give theHarvester.py execute permission with chmod 755 theHarvester.py.
Then simply run ./theHarvester.py; it will display the version and the other options that can be used with the tool, each with a detailed description.
./theHarvester.py -d <url> -l 300 -b <search engine name>
./theHarvester.py -d matriux.com -l 300 -b google
See the image below for the result.
In the above command:
- -d <url> is the remote site from which you want to fetch the juicy information.
- -l limits the search to the specified number of results.
- -b specifies the search engine name.
From the e-mail addresses above we can identify the pattern of the addresses assigned to the organization's employees. For example, some companies use the email@example.com pattern, which can be useful for brute-forcing the account of a specific person.
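The pattern-spotting step can be sketched in Python as follows. This is only an illustration under stated assumptions: the pattern names in PATTERNS, the helper functions guess_pattern and candidate_address, and the sample names are all hypothetical, and the sketch assumes you already know the first and last names behind a few harvested addresses (for example from the employee names the tool returns).

```python
from collections import Counter

# Hypothetical naming conventions: each builds the local part from a first and last name.
PATTERNS = {
    "first.last": lambda first, last: f"{first}.{last}",
    "firstlast": lambda first, last: f"{first}{last}",
    "flast": lambda first, last: f"{first[0]}{last}",
    "first": lambda first, last: first,
    "last": lambda first, last: last,
}


def guess_pattern(known):
    """Given (first, last, email) tuples, return the most frequent matching pattern name."""
    counts = Counter()
    for first, last, email in known:
        first, last = first.strip().lower(), last.strip().lower()
        if not first or not last:
            continue  # cannot build a candidate without both names
        local_part = email.split("@", 1)[0].lower()
        for name, build in PATTERNS.items():
            if build(first, last) == local_part:
                counts[name] += 1
    return counts.most_common(1)[0][0] if counts else None


def candidate_address(first, last, pattern, domain):
    """Build the likely address of another employee from the guessed pattern."""
    return PATTERNS[pattern](first.lower(), last.lower()) + "@" + domain


if __name__ == "__main__":
    known = [
        ("Alice", "Smith", "alice.smith@example.com"),
        ("Bob", "Jones", "bob.jones@example.com"),
        ("Carol", "White", "cwhite@example.com"),
    ]
    pattern = guess_pattern(known)
    print(pattern, candidate_address("Dan", "Brown", pattern, "example.com"))
```

Counting matches instead of stopping at the first one keeps a single odd address from deciding the result.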
Host information can be useful for scanning a specific system.
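Before scanning, the harvested hostnames usually have to be turned into IP addresses. A minimal sketch of that step, assuming a hypothetical helper named resolve_hosts and made-up hostnames, could look like this; it relies only on the standard socket.gethostbyname call and is not part of theHarvester.

```python
import socket


def resolve_hosts(hosts, resolver=socket.gethostbyname):
    """Map each unique hostname to its IPv4 address, or None if it does not resolve."""
    resolved = {}
    for host in hosts:
        if host in resolved:
            continue  # skip duplicates in the harvested list
        try:
            resolved[host] = resolver(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            resolved[host] = None
    return resolved


if __name__ == "__main__":
    # Hypothetical hostnames; replace them with the hosts the tool reported.
    for host, address in resolve_hosts(["www.example.com", "mail.example.com"]).items():
        print(host, address)
```

The resolver parameter exists so that the lookup can be swapped for a stub when no network is available.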
Search across all search engines.
./theHarvester.py -d gtu.ac.in -l 300 -b all
This command grabs information from all the search engines supported by the installed version of theHarvester and displays the following information.
Save the results to an HTML file. Command:
./theHarvester.py -d gtu.ac.in -l 300 -b all -f hackguru
The -f parameter is used to save the results to an HTML file, as shown in this example.
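When the tool has to be run repeatedly, the command line can be assembled from a script. The sketch below is one possible way to do it in Python, using only the -d, -l, -b and -f options shown above; the helper name build_harvester_command and its defaults are assumptions made for this example, and the script path assumes theHarvester.py sits in the current directory.

```python
import shlex


def build_harvester_command(domain, limit=300, source="google",
                            outfile=None, script="./theHarvester.py"):
    """Return the argument list for one theHarvester run."""
    if limit <= 0:
        raise ValueError("limit must be a positive number")
    command = [script, "-d", domain, "-l", str(limit), "-b", source]
    if outfile:
        command += ["-f", outfile]  # save the results to a file
    return command


if __name__ == "__main__":
    command = build_harvester_command("gtu.ac.in", source="all", outfile="hackguru")
    print(shlex.join(command))
    # To run the tool, pass the list to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems if a domain or file name ever contains unusual characters.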
theHarvester is a handy tool that quickly fetches juicy information from public resources by active or passive means.
Exposure of personal information is an advantage for every social engineer. Any information you post on the Internet tends to stay there forever. So before you post something personal, think twice about whether it is really necessary to let other people know about you and your activities. Using different e-mail addresses and usernames will also make the work of social engineers much more difficult.