OSINT + Python = Custom Hacking

On May 10 and 11, the Computer Forensic Expert course was held in Reus (Spain) by the Asociación Nacional de Tasadores y Peritos Judiciales Informáticos (ANTPJI), of which I am a member and one of the instructors. I had the pleasure of giving a talk on two of my passions: Python and OSINT (Open Source Intelligence).

Python is a great language for quickly developing all kinds of powerful applications, with plenty of libraries for exploit development, reverse engineering, web analysis and more. It is without doubt useful knowledge for any security expert.

The Internet is immense and holds an almost unimaginable amount of information, which is why OSINT techniques are vital to collect, analyze and present it.

For this course, I decided it would be interesting for attendees to learn how to develop simple tools (scripts) to perform OSINT with Python, through a series of practical exercises, each with a specific objective.

The presentation and code are available on the VULNEX website.

Note: I have removed the Google Hacking query from the scripts, so readers can insert their own.

Tool #1

Objective: search for ANTPJI members on LinkedIn using the Google Custom Search API.

These scripts are very simple and do the same thing in different ways. The first uses the Google API Client, while the second uses the fantastic Requests library.

In these scripts we are using some Google Hacking to find members of the association on LinkedIn.
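All the scripts below import a small `const` module that is not shown in this post. As a minimal sketch, assuming it only holds the credentials and the site list the scripts read, it could look like this. The attribute names come from the scripts; every value is a placeholder.

```python
# const.py: hypothetical sketch, the original module is not shown in the post.
# Only the attribute names (cse_token, cse_id, services) come from the scripts.

# API key for the Google Custom Search API (placeholder value).
cse_token = "YOUR_API_KEY"

# Identifier of your Custom Search Engine (placeholder value).
cse_id = "YOUR_SEARCH_ENGINE_ID"

# Site identifiers checked by ex5.py. These are placeholders: the real list
# has 160 entries and its exact values are not given in the post.
services = ["example-site-1", "example-site-2"]
```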


# File: ex1_a.py
# Date: 05/14/13
# Author: Simon Roses Femerling
# Desc: Basic Google Hacking 
#
# VULNEX (C) 2013
# www.vulnex.com

import const
from apiclient.discovery import build
import pprint

# your google hacking query
query=''
query_params=''

doquery=query+query_params

service = build("customsearch","v1",developerKey=const.cse_token)

res = service.cse().list(
    q=doquery,
    cx=const.cse_id,
    num=10).execute()

pprint.pprint(res)

# VULNEX EOF

# File: ex1_b.py        
# Date: 05/14/13
# Author: Simon Roses Femerling
# Desc: Simple Google Hacking                
#
# VULNEX (C) 2013
# www.vulnex.com

import requests
import json
import urllib
import const

site="https://www.googleapis.com/customsearch/v1?key="

# Your Google Hacking query
query='' 
query_params='' 

url=site+const.cse_token+"&cx="+const.cse_id+"&q=" + urllib.quote(query+query_params)
response = requests.get(url)
# In Requests 1.0 and later, json() is a method rather than a property
print json.dumps(response.json(),indent=4)

# VULNEX EOF

When running any of these scripts, we get the following result:

py_osint_img1

Not too interesting for the moment :)
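The URL in ex1_b.py is built by string concatenation, with only the query being quoted. A minimal sketch of the same request URL with every parameter encoded is shown below. It is written for Python 3, where the quoting helpers live in `urllib.parse`; the function name `build_cse_url` and the example values are assumptions for illustration.

```python
from urllib.parse import urlencode

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_cse_url(api_key, cse_id, query, start=None):
    """Return a Custom Search request URL with every parameter URL-encoded."""
    params = {"key": api_key, "cx": cse_id, "q": query}
    # 'start' is only sent when asking for a page after the first one
    if start is not None:
        params["start"] = start
    return CSE_ENDPOINT + "?" + urlencode(params)

if __name__ == "__main__":
    # Placeholder credentials and query, replace with your own
    print(build_cse_url("YOUR_API_KEY", "YOUR_SEARCH_ENGINE_ID",
                        'site:example.com "some phrase"'))
```

Encoding the key and engine id as well as the query avoids broken requests if any of them contains a reserved character.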

Tool #2

Objective: obtain photos from the LinkedIn profiles of ANTPJI members using the Google Custom Search API.

The following script gets the photos of the members of the association on LinkedIn and also extracts picture metadata ;) The script generates an HTML page with all the photos.

Used libraries: Google API Client, PIL, Requests and Markup.


# File: ex2.py         
# Date: 05/14/13
# Author: Simon Roses Femerling
# Desc: Download picture and extract metadata               
#
# VULNEX (C) 2013
# www.vulnex.com

import const
from apiclient.discovery import build
import pprint
import os
from PIL import Image
from StringIO import StringIO
from PIL.ExifTags import TAGS
import requests
import markup

def do_query(istart=0):
    if istart == 0:
        res = service.cse().list(
        q=doquery,
        cx=const.cse_id,
        num=10).execute()
    else:
        res = service.cse().list(
        q=doquery,
        cx=const.cse_id,
        num=10,
        start=istart).execute()
    return res

pic_id=1
do_stop=10
cnt=1

page=markup.page()

# Set page title
page.init(title="ANTPJI OSINT") 
page.h1("ANTPJI OSINT")

# Set output directory
out_dir = "pics_gepl"

# Your Google Hacking query 
query=''
query_params=''

doquery=query+query_params

service = build("customsearch","v1",developerKey=const.cse_token)

if not os.path.exists(out_dir):
    os.makedirs(out_dir)

res=[]
while True:
    if cnt==1:
        res = do_query()
    else:
        if not res['queries'].has_key("nextPage"): break
        res = do_query(res['queries']['nextPage'][0]['startIndex'])
    cnt+=1
    if cnt > do_stop: break
    if res.has_key("items"):
        for item in res['items']:
            name=""
            if not item.has_key('pagemap'): continue
            if not item['pagemap'].has_key('hcard'): continue
            hcard = item['pagemap']['hcard']
            for card in hcard:
                pic_url=""
                if 'title' in card:
                    if 'fn' in card: name = card['fn']
                    if 'photo' in card: pic_url = card['photo']
                if pic_url != "":   
                    image = requests.get(pic_url)
                    pic_n = os.path.join(out_dir,"%s.jpg" % pic_id)
                    # Binary mode so the JPEG bytes are written unchanged
                    file = open(pic_n,"wb")
                    pic_id+=1
                    try:
                        i = Image.open(StringIO(image.content))
                        if hasattr(i,"_getexif"):
                            ret = {}
                            info = i._getexif()
                            if info:
                                for k,v in info.items():
                                    # Fall back to the numeric tag id when the tag is unknown
                                    decode = TAGS.get(k,k)
                                    ret[decode] = v
                                print ret
                        i.save(file,"JPEG")
                        page.p(name.encode('ascii','ignore')) 
                        page.img(src=pic_n)
                        page.br()
                        page.br()
                    except IOError, e:
                        print "error: %s" % e
                    file.close()            

# Set your output filename
with open('index_gepl.html','w') as fp:
    fp.write(str(page))

# VULNEX EOF

And this is the result:

py_osint_img2

With a few lines of code we have a very interesting tool.
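The paging loop in ex2.py follows the `nextPage` entry of each response until it runs out or a page limit is reached. A minimal sketch of that idea as a reusable generator is shown below, written for Python 3. The names `iter_result_pages` and `fetch_page` are assumptions for illustration, and the page cap is simplified compared with the `cnt`/`do_stop` counters in the script.

```python
def iter_result_pages(fetch_page, max_pages=10):
    """Yield result pages until there is no next page or max_pages is reached.

    fetch_page(start) must return one response dict; start is None for the
    first page and the 'startIndex' of the next page afterwards.
    """
    start = None
    for _ in range(max_pages):
        page = fetch_page(start)
        yield page
        # The response lists the next page under queries -> nextPage
        next_page = page.get("queries", {}).get("nextPage")
        if not next_page:
            break
        start = next_page[0]["startIndex"]
```

With the Google API Client, `fetch_page` could be a small wrapper around the `do_query` function of the script, so the processing loop only has to deal with one page at a time.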

Tool #3

Objective: how are ANTPJI members related on LinkedIn?

With this script we look for relationships between the members of the association on LinkedIn and build a graph that links the names found in each profile.

Used libraries: Google API Client, NetworkX and Matplotlib.


# File: ex3.py         
# Date: 05/14/13
# Author: Simon Roses Femerling
# Desc: Build graph from profiles                
#
# VULNEX (C) 2013
# www.vulnex.com

import const
from apiclient.discovery import build
import networkx as nx
import matplotlib.pyplot as plt

def do_query(istart=0):
    if istart == 0:
        res = service.cse().list(
        q=doquery,
        cx=const.cse_id,
        num=10).execute()
    else:
        res = service.cse().list(
        q=doquery,
        cx=const.cse_id,
        num=10,
        start=istart).execute()
    return res

do_stop=10
cnt=1

# Your Google Hacking query here
query=''
query_params=''

doquery=query+query_params

service = build("customsearch","v1",developerKey=const.cse_token)

G=nx.DiGraph()
res=[]
while True:
    if cnt==1:
        res = do_query()
    else:
        if not res['queries'].has_key("nextPage"): break
        res = do_query(res['queries']['nextPage'][0]['startIndex'])
    cnt+=1
    if cnt > do_stop: break
    if res.has_key("items"):
        for item in res['items']:
            name=""
            if not item.has_key('pagemap'): continue
            if not item['pagemap'].has_key('hcard'): continue
            hcard = item['pagemap']['hcard']
            for card in hcard:
                # Cards without a formatted name cannot be added to the graph
                if 'fn' not in card: continue
                if 'title' in card: name = card['fn']
                G.add_edge(name,card['fn'])

plt.figure(figsize=(30,30))
nx.draw(G)
# Set your output filename
plt.savefig('antpji_rela_map.png')

# VULNEX EOF

And this is the graph generated:

py_osint_img3
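The edge-building step of ex3.py can also be separated from the drawing code. Below is a minimal sketch, written for Python 3, that turns the `hcard` entries of a result page into a set of edges. It assumes, as the script does, that the card carrying a `title` belongs to the profile owner; unlike the script, it skips self-loops and cards seen before an owner is known. The function name `profile_edges` is hypothetical.

```python
def profile_edges(items):
    """Return a set of (profile_owner, related_name) edges from hcard entries."""
    edges = set()
    for item in items:
        owner = ""
        for card in item.get("pagemap", {}).get("hcard", []):
            name = card.get("fn")
            if not name:
                continue
            if "title" in card:
                # Assumption: the card with a title describes the profile owner
                owner = name
            elif owner and name != owner:
                edges.add((owner, name))
    return edges
```

The result can be passed to NetworkX in one call, for example `G.add_edges_from(profile_edges(res['items']))`.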

Tool #4

Objective: what's hot on the association's Twitter account?

This script downloads the latest tweets from the association's account and generates a tag cloud, which is useful to quickly see what they are talking about.

Used libraries: Requests, pytagcloud.


# File: ex4.py         
# Date: 05/14/13
# Author: Simon Roses Femerling
# Desc: Create word cloud               
#
# VULNEX (C) 2013
# www.vulnex.com

import requests
import json
import urllib
import const

from pytagcloud import create_tag_image, make_tags
from pytagcloud.lang.counter import get_tag_counts

site="http://search.twitter.com/search.json?q="

# Your query here
query=""

url=site+urllib.quote(query)

response = requests.get(url)

tag = []
for res in response.json()["results"]:
    tag.append(res["text"].encode('ascii','ignore'))

# Join with spaces so words from adjacent tweets do not run together
text = " ".join(tag)
tags = make_tags(get_tag_counts(text),maxsize=100)
# Set your output filename
create_tag_image(tags,"antpji_word_cloud.png", size=(600,500), fontname="Lobster")

# VULNEX EOF

And this is the tag cloud:

py_osint_img4
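A tag cloud is driven by word frequencies. As a rough illustration of that counting step, here is a minimal sketch in Python 3 using only the standard library. It is not pytagcloud's implementation; the function name `word_counts` and the choice to ignore words shorter than four characters are assumptions.

```python
import re
from collections import Counter

def word_counts(texts, min_length=4, top=20):
    """Count words across several texts, ignoring case and very short words."""
    counts = Counter()
    for text in texts:
        # Keep letters, digits, apostrophes and the # and @ used in tweets
        for word in re.findall(r"[a-z0-9#@']+", text.lower()):
            if len(word) >= min_length:
                counts[word] += 1
    return counts.most_common(top)
```

Printing the result of `word_counts` on the downloaded tweet texts is a quick way to check the data before rendering the image.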

Tool #5

Objective: do the ANTPJI usernames from Twitter exist on other social network sites?

The following script extracts the usernames that have posted to or been mentioned by the association's Twitter account and checks them against 160 social network sites.

Used libraries: Requests.


# File: ex5.py         
# Date: 05/14/13
# Author: Simon Roses Femerling
# Desc: Check usernames on 160 social network sites               
#
# VULNEX (C) 2013
# www.vulnex.com

import requests
import json
import urllib
import const
import pprint

site="http://search.twitter.com/search.json?q="

# Your query here
query=""

url=site+urllib.quote(query)

print "Collecting aliases on Twitter: %s\n" % query
response = requests.get(url)

users = []

for res in response.json()["results"]:
    # Collect both sides of each conversation, skipping empty values and duplicates
    for field in ('to_user','from_user'):
        user = res.get(field)
        if user and str(user) not in users: users.append(str(user))

print "ALIAS-> %s" % users

print "\nChecking aliases on 160 websites\n"
for username in users:
    for service in const.services:
        try:
            res1 = requests.get('http://checkusernames.com/usercheckv2.php?target=' + service + '&username=' + username, headers={'X-Requested-With': 'XMLHttpRequest'}).text
            # 'notavailable' means the username is already taken on that site
            if 'notavailable' in res1:
                print ""
                print username + " -> " + service
                print ""
        except Exception as e:
            print e

# VULNEX EOF

And the result is as follows:

py_osint_img5
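The username collection step of ex5.py can be isolated and tested without calling Twitter. A minimal sketch in Python 3 follows, assuming each result is a dict with optional `from_user` and `to_user` fields as in the script; the function name `collect_usernames` is hypothetical.

```python
def collect_usernames(results):
    """Return unique usernames from 'to_user'/'from_user', in first-seen order."""
    seen = []
    for tweet in results:
        for field in ("to_user", "from_user"):
            user = tweet.get(field)
            # Skip missing or empty values so they never reach the site checks
            if user and user not in seen:
                seen.append(user)
    return seen
```

Filtering empty values here keeps the later loop from wasting requests on a username such as "None".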

Tool #6

Objective: can we extract metadata from ANTPJI photos?

This script downloads the photos related to ANTPJI from Google and extracts the metadata.

Used libraries: Google API Client, Requests, PIL and Markup.


# File: ex6.py         
# Date: 05/14/13
# Author: Simon Roses Femerling
# Desc: Download pictures from Google and extract metadata               
#
# VULNEX (C) 2013
# www.vulnex.com

import const
from apiclient.discovery import build
import pprint
import os
from PIL import Image
from StringIO import StringIO
from PIL.ExifTags import TAGS
import requests
import markup

def do_query(istart=0):
    if istart == 0:
        res = service.cse().list(
        q=doquery,
        cx=const.cse_id,
        num=10).execute()
    else:
        res = service.cse().list(
        q=doquery,
        cx=const.cse_id,
        num=10,
        start=istart).execute()
    return res

pic_id=1
do_stop=10
cnt=1

page=markup.page()
# Set your page title
page.init(title="ANTPJI OSINT") 
page.h1("ANTPJI OSINT")

# Set output directory
out_dir = "pics_gepl"

# Define your Google hacking query here
query=''
query_params=''

doquery=query+query_params

service = build("customsearch","v1",developerKey=const.cse_token)

if not os.path.exists(out_dir):
    os.makedirs(out_dir)

res=[]
while True:
    if cnt==1:
        res = do_query()
    else:
        if not res['queries'].has_key("nextPage"): break
        res = do_query(res['queries']['nextPage'][0]['startIndex'])
    cnt+=1
    if cnt > do_stop: break
    if res.has_key("items"):
        for item in res['items']:
            name=""
            if not item.has_key('pagemap'): continue
            if not item['pagemap'].has_key('hcard'): continue
            hcard = item['pagemap']['hcard']
            for card in hcard:
                pic_url=""
                if 'title' in card:
                    if 'fn' in card: name = card['fn']
                    if 'photo' in card: pic_url = card['photo']
                if pic_url != "":   
                    image = requests.get(pic_url)
                    pic_n = os.path.join(out_dir,"%s.jpg" % pic_id)
                    # Binary mode so the JPEG bytes are written unchanged
                    file = open(pic_n,"wb")
                    pic_id+=1
                    try:
                        i = Image.open(StringIO(image.content))
                        if hasattr(i,"_getexif"):
                            ret = {}
                            info = i._getexif()
                            if info:
                                for k,v in info.items():
                                    # Fall back to the numeric tag id when the tag is unknown
                                    decode = TAGS.get(k,k)
                                    ret[decode] = v
                                print ret
                        i.save(file,"JPEG")
                        page.p(name.encode('ascii','ignore')) 
                        page.img(src=pic_n)
                        page.br()
                        page.br()
                    except IOError, e:
                        print "error: %s" % e
                    file.close()            

# Set your output filename
with open('index_gepl.html','w') as fp:
    fp.write(str(page))

# VULNEX EOF

A picture is worth a thousand words!

py_osint_img6
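The metadata step in ex2.py and ex6.py boils down to translating numeric EXIF tag ids into readable names. A minimal sketch of that translation in Python 3 is shown below. The function name `decode_exif` is an assumption, and the tag table is passed in as a parameter so the example runs without PIL.

```python
def decode_exif(raw_exif, tag_names):
    """Map numeric EXIF tag ids to readable names; unknown ids are kept as-is."""
    if not raw_exif:
        return {}
    return {tag_names.get(tag_id, tag_id): value
            for tag_id, value in raw_exif.items()}

if __name__ == "__main__":
    # 271 is the EXIF tag id for the camera maker ("Make")
    print(decode_exif({271: "ACME", 99999: "?"}, {271: "Make"}))
```

In the scripts, the dictionary returned by `_getexif()` and the `TAGS` table from `PIL.ExifTags` would be the two arguments.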

As we have seen throughout this article, with a little bit of Python we can easily write sophisticated OSINT tools that allow us to gather lots of information about individuals or groups.

If you would like me to go into any topic in Python and OSINT in depth let me know :)

What tools do you use for OSINT?

— Simon Roses Femerling


My 10 Cyber Weapons Tool List

A few weeks ago the media reported that the US Air Force has classified 6 tools as cyber weapons, no doubt a hot topic. In this post I will do the same and present my own list of 10 tools that could be considered cyber weapons.

My selection is based on the following criteria: usefulness, features, and being open source or at least free.

Naturally there are more tools that I like or use, but I think this list is a great collection for attacking networks and systems, reverse engineering, traffic analysis, social engineering, vulnerability discovery and exploit development. These are tools that should be in every pentester's toolkit :)

  1. Metasploit: the pentesting tool par excellence.
  2. SET: a wide range of features for social engineering attacks.
  3. dSploit: nothing like carrying a pentesting toolkit on your Android phone.
  4. Nmap: popular network scanner and more.
  5. Wireshark: analyzes network traffic, simple and powerful.
  6. Ettercap: all kinds of network attacks.
  7. Immunity Debugger: a little bit of reverse engineering combined with Python scripting.
  8. Mona: powerful script for the previous tool or WinDbg to develop exploits.
  9. Peach: complete framework to find vulnerabilities via fuzzing.
  10. Androguard: reverse engineering of Android apps.

What is your list of 10 cyber weapons?

— Simon Roses Femerling


AppSec: Improve your software security with GCC Stack Protector Strong

The other day, while helping a client develop secure software, it occurred to me that this topic could be of interest to my readers. The topic is obviously quite broad, but in this article I will focus on a patch for the GCC compiler that improves the stack protector (stack canary) defense, which mitigates buffer overflow vulnerabilities.

Stack Protector Strong is a patch developed at Google and applied to the Chromium project (the Chromium browser and Chromium OS) that substantially improves this defense (StackGuard). By default, GCC offers the switches -fstack-protector and -fstack-protector-all. The first analyzes each function in the code and, if it detects a possible vulnerability, applies the defense when compiling the program (the programmer does not have to do anything, well, just develop securely ;)). The second applies the defense to ALL functions in the program without checking whether they are vulnerable.

Both options have their problems: the first switch (-fstack-protector) is limited by what it considers vulnerable code, while the second (-fstack-protector-all) is too aggressive and hurts the performance of the application.

Because of these problems, Google decided to develop a third switch, -fstack-protector-strong, which covers more cases of vulnerable code without sacrificing performance. In figure 1 we can see a comparison between -fstack-protector and -fstack-protector-strong.

stack_protector_VS
Fig. 1 – -fstack-protector vs. -fstack-protector-strong

This is clearly a substantial improvement that covers more types of possible vulnerabilities in code. But enough theory for today; let's move on to a practical exercise in which we apply the patch to GCC 4.8.0, the latest and recently released version, on Debian Linux 6.0.

The first step is to download the GCC version that we want to patch. The patch was written for version 4.6, although I have tested it with versions 4.7 and 4.8 and it works correctly. So we run wget with the GCC URL and then extract the archive (see figure 2).

gcc_cap1
Fig. 2 – GCC Download

To compile GCC we need the following packages, which we install with apt-get (see figure 3):

  • build-essential
  • libgmp3-dev
  • libmpfr-dev
  • libmpc-dev
  • zip
  • autogen

gcc_cap2
Fig. 3 – Installing required packages to compile GCC

Now let’s download the -fstack-protector-strong patch from here. The patch is composed of 5 diff files.

gcc_cap4
Fig. 4 – Downloaded patches

We then patch GCC, following the order shown in figure 5. Pay special attention to the order and to the directories within the GCC source tree.

gcc_cap5
Fig. 5 – Applying patches to GCC

Once GCC is patched we can compile it; installing it on the system requires root privileges (see figure 6). While the command is running you can read other articles on this blog, since the process takes a while to complete :)

gcc_cap7
Fig. 6 – Compiling and installing GCC

Now we are ready to compile programs with the latest version of GCC and a better defense against buffer overflow vulnerabilities.

In figure 7 we compile a vulnerable program with the -fstack-protector-strong switch.

gcc_cap9
Fig. 7 – Testing -fstack-protector-strong

When disassembling (reversing) myapp we can see that this defense has been applied to several functions that -fstack-protector would not have protected (although I leave this exercise for another article).
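Short of a full disassembly, a quick heuristic is to look for the `__stack_chk_fail` symbol, which GCC's stack protector calls on Linux with glibc when a canary has been corrupted. The sketch below, in Python 3, only checks whether that name appears in the file; the function name `references_stack_guard` is hypothetical. It does not show which functions are protected, and a statically linked binary may contain the symbol regardless of the switches used.

```python
def references_stack_guard(path):
    """Return True if the file contains the __stack_chk_fail symbol name."""
    with open(path, "rb") as fh:
        return b"__stack_chk_fail" in fh.read()

if __name__ == "__main__":
    import sys
    # Usage: python check_guard.py ./myapp
    for binary in sys.argv[1:]:
        print(binary, references_stack_guard(binary))
```

Comparing the same program built with and without the switch is a simple sanity check that the compiler option took effect.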

This patch is not currently in GCC by default but let us hope that it will be in future versions as well as new and better defenses.

It is true that there are attack vectors to bypass this protection, but all defenses are welcome when building software and currently all modern compilers (GCC, Visual Studio and LLVM) include a variety of defenses that programmers should always use.

No doubt the use of these defenses in compilers does not remove the need for developing secure software using a secure development framework such as the MS SDL or OpenSAMM.

Which security parameters do you use when compiling software?

— Simon Roses Femerling
