Ultimate Python Tutorial: Web App Security Analysis - ByteScout

Cyber forensics, also called computer forensic science, offers many fascinating branches to pursue. In this tutorial, we will focus on one particular branch, cyber security analysis, with code examples using Python as our language of choice.
In the course of this Python socket programming tutorial we will build an actual packet analyzer, sometimes called a network packet sniffer. We will talk specifically about SQL injection techniques and how to detect and thwart them, which leads us into the discussion of malicious packet detection. Here are your key takeaways:

  • Malicious packet detection with Python
  • Defining sockets and ports
  • What is port scanning?
  • Python port scanning methods
  • SQL injection methods

Malicious Packet Detection with Python

In straightforward terms, a malicious packet of data contains a hack attempt, such as an SQL injection. Such attacks can severely damage an app and this is why we need intrusion detection. The purpose of malicious packet detection is to prevent an SQL injection from executing and hijacking your web app and to protect the targeted application and its sensitive data.

SQL injections are often done right on the URL line of browsers! The best way to catch this type of attack is by continually scanning incoming requests on ports used by your app.

If you are familiar with how Python is used to build web scrapers, then you will see the connection immediately. Python can watch and interact on client-server channels and send HTTP requests with modified GET and POST parameters. It’s easy to imagine how Python can be used both for hacking and to thwart hackers!

Our first Python socket programming example will provide an introduction to socket programming. To begin, we need a clear understanding of port scanning, and how to do it with Python. Methods for this purpose are easy to understand on inspection of code samples.

After we develop an understanding of how to capture packets and parse them, we need to know exactly what we are searching for in the packets coming through as server requests. This will lead us naturally to the many useful functions available in the Python socket library.

What Are Sockets and Ports Exactly?

In simplest terms, a port is an endpoint in network communication: a logical software construct, identified by a number, that maps to a process or service on a host. A socket, for our purposes, is an Internet socket: one end of a connection in an Internet Protocol-based network. TCP (Transmission Control Protocol) is the Internet protocol for reliable one-to-one connections between sockets.

On the Internet, a socket is addressed by the IP address of its node together with a port number on that node. Sockets are ubiquitous on the Internet because they are a core component of all network communication.
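
To make the idea of a socket address concrete, here is a minimal sketch that opens a listening socket and a client socket on the loopback interface and prints the (IP address, port) pair of each. The loopback address and the use of port 0 (which asks the operating system to pick a free port) are choices made for this illustration only.

```python
import socket

# Listening socket on the loopback interface; port 0 lets the OS pick a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_ip, server_port = server.getsockname()

# Client socket connecting to that address.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((server_ip, server_port))
conn, peer_address = server.accept()
client_address = client.getsockname()

print("Server socket address:", (server_ip, server_port))
print("Client socket address:", client_address)
print("Peer seen by server:  ", peer_address)

conn.close()
client.close()
server.close()
```

The address the server sees for its peer is the client's own socket address, which shows that each end of a TCP connection is identified by its own IP address and port.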

Virtually all of them are INET sockets, and every browser opens a socket to connect to a web server. Monitoring what comes across these sockets at the packet level is the first line of defense against network intrusion. If you know how to scan ports, then you know which services a device exposes to the network.

What is port scanning?

In everyday language, the words “port” and “socket” are used rather loosely. If you Google “port scanning” with Python, you will likely get a lot of results including how to make use of the Python socket library.

Let’s start the discussion by understanding exactly how port scanning with Python works. This will give us a basis for developing several network analysis tools. Such tools figure into the scope of a larger intrusion detection system and lead all the way to general cybersecurity systems.

Python port scanning methods

Python provides a socket library module which gives us easy access to the BSD socket-level API. The Python socket library contains functions which are essential to networking tasks such as host name and address conversion and network packet data formatting.

The socket library also contains a socket class for monitoring the data channel. We will use the built-in socket functions to build our example port scanner.

Our first code example implements a basic port scanner in Python by importing the socket library. Let’s go over this code, and then define some of the built-in constants:

# Basic port scanner (Python 3)
import socket
import subprocess
import sys
from datetime import datetime

# Clear the terminal (on Windows use 'cls' instead of 'clear')
subprocess.call('clear', shell=True)
remoteServer1 = input("Enter the remote host to scan: ")

try:
    # Resolve the host name to an IPv4 address
    remoteServerIP1 = socket.gethostbyname(remoteServer1)
    print("Wait... scanning host", remoteServerIP1)
    # start timer
    t01 = datetime.now()
    # Use the range function to scan ports 1 through 1024:
    for port in range(1, 1025):
        socket1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        socket1.settimeout(0.5)  # do not hang on ports that never answer
        # connect_ex returns 0 on success instead of raising an exception
        result = socket1.connect_ex((remoteServerIP1, port))
        if result == 0:
            print("Port {}: Open".format(port))
        socket1.close()
# Handle errors:
except KeyboardInterrupt:
    print("Ctrl+C pressed")
    sys.exit()
except socket.gaierror:
    print("Hostname unresolved... quitting")
    sys.exit()
except OSError:
    print("Connection to server failed")
    sys.exit()

# time duration
t02 = datetime.now()
totalTime = t02 - t01
print("Scan duration:", totalTime)
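
The same scanning loop can be wrapped in a function so that it can be reused by other tools. The sketch below is one possible way to do that; the function name scan_ports and the default timeout are assumptions made for this illustration. The demonstration scans a listener that the script itself opens on the loopback interface.

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Return the list of ports on host that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

if __name__ == "__main__":
    # Demonstration against a listener we start ourselves on the loopback interface
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    demo_port = listener.getsockname()[1]
    print("Open ports:", scan_ports("127.0.0.1", [demo_port]))
    listener.close()
```

Returning a list instead of printing inside the loop makes the result easy to feed into other checks, such as comparing the open ports against the ports your app is expected to use.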

Python Socket Built-in Functions

In the above code, we can see the convenience and potential for scanning ports with Python. The syntax for connecting to the host is easily inferred. Here are some of the important constants:

  • AF_INET – Defines the socket address family (IPv4)
  • SOCK_STREAM – Specifies the socket type for TCP connections
  • SOCK_DGRAM – Specifies the socket type for UDP connections
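
As a quick illustration of these constants, the following sketch creates one TCP socket and one UDP socket and prints the family and type that Python reports for each. The variable names are chosen for this example only.

```python
import socket

# AF_INET + SOCK_STREAM gives a TCP socket; AF_INET + SOCK_DGRAM gives a UDP socket
tcp_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

tcp_info = (tcp_socket.family, tcp_socket.type)
udp_info = (udp_socket.family, udp_socket.type)
print("TCP socket:", tcp_info)
print("UDP socket:", udp_info)

tcp_socket.close()
udp_socket.close()
```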

A Python Packet Sniffer

In the next step, we will develop code to parse incoming TCP packets. This will enable us to eventually scan individual packets for malicious strings such as attempted SQL injections. Another clever attack attempts to intentionally induce the server SQL engine to throw an exception because the error message will often reveal table or column names within the database.

This and other information gleaned from the exception give the attacker valuable information about the structure of the app’s data and potential vulnerabilities.

In this code example, we will construct a data packet parser. It splits each packet into the IP header, the TCP header, and the data:

# Parse incoming TCP packets (Python 3, Linux, requires root privileges)
import socket
import sys
from struct import unpack

# Create a raw INET socket that receives TCP packets
try:
    s1 = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
except OSError as msg:
    print('Create socket failed. Error : ' + str(msg.errno) + ' Message ' + str(msg.strerror))
    sys.exit()

# Loop to receive packets
while True:
    packetT = s1.recvfrom(65535)
    # get the packet bytes from the (data, address) tuple
    packet1 = packetT[0]
    # first 20 bytes => IP header (without options)
    ip_header = packet1[0:20]
    # Parse unpack - !BBHHHBBH4s4s is the format specifier!
    iph = unpack('!BBHHHBBH4s4s', ip_header)
    version_ihl = iph[0]
    version1 = version_ihl >> 4
    ihl = version_ihl & 0xF
    iph_length1 = ihl * 4
    ttl = iph[5]
    protocol1 = iph[6]
    s_addr = socket.inet_ntoa(iph[8])
    d_addr = socket.inet_ntoa(iph[9])

    print('Version: ' + str(version1) + ' IP Header Length : ' + str(ihl) + ' TTL : ' + str(ttl)
          + ' Protocol : ' + str(protocol1) + ' Source Addr : ' + str(s_addr)
          + ' Destination Addr : ' + str(d_addr))
    # the TCP header starts right after the IP header
    tcp_header1 = packet1[iph_length1:iph_length1 + 20]

    # unpack here:
    tcph = unpack('!HHLLBBHHH', tcp_header1)
    source_port1 = tcph[0]
    dest_port1 = tcph[1]
    sequence1 = tcph[2]
    acknowledgement1 = tcph[3]
    doff_reserved1 = tcph[4]
    # upper 4 bits hold the TCP header length in 32-bit words
    tcph_length1 = doff_reserved1 >> 4

    print('Source Port : ' + str(source_port1) + ' Dest Port : ' + str(dest_port1)
          + ' Sequence Number : ' + str(sequence1) + ' Acknowledgement : ' + str(acknowledgement1)
          + ' TCP header length : ' + str(tcph_length1))

    h_size1 = iph_length1 + tcph_length1 * 4
    data_size1 = len(packet1) - h_size1
    # get the payload bytes from the packet
    data = packet1[h_size1:]
    print('Data: ' + data.decode('latin-1'))
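
Because a raw socket needs elevated privileges, it can help to try the header parsing on its own first. The sketch below moves the IP header logic into a function and feeds it a hand-built sample header; the function name parse_ipv4_header and the sample addresses are assumptions made for this illustration.

```python
import socket
from struct import pack, unpack

def parse_ipv4_header(packet):
    """Parse the fixed 20-byte part of an IPv4 header from raw packet bytes."""
    fields = unpack('!BBHHHBBH4s4s', packet[:20])
    version_ihl = fields[0]
    return {
        'version': version_ihl >> 4,
        'header_length': (version_ihl & 0xF) * 4,  # IHL counts 32-bit words
        'ttl': fields[5],
        'protocol': fields[6],
        'source': socket.inet_ntoa(fields[8]),
        'destination': socket.inet_ntoa(fields[9]),
    }

# Hand-built sample header: version 4, IHL 5, TTL 64, protocol 6 (TCP)
sample_header = pack('!BBHHHBBH4s4s',
                     (4 << 4) | 5, 0, 40, 0, 0, 64, 6, 0,
                     socket.inet_aton('192.168.0.10'),
                     socket.inet_aton('10.0.0.1'))
header_info = parse_ipv4_header(sample_header)
print(header_info)
```

The same approach works for the TCP header with the '!HHLLBBHHH' format string used above.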

The Substance of SQL injection attacks

Most SQL injections fit one or both of two main categories:

  • Functioning SQL queries
  • Intentionally induced errors

A functional SQL injection is an injection of code that actually works and does some nefarious task within the SQL server. The intention is to circumvent the app and gain access to the server. To catch such an attack we must use our parser to find SQL keywords entered into an input box.

These are the obvious SQL reserved words like SELECT, INSERT, DELETE, UPDATE, and others. Intentionally induced errors also use the same keywords but with a mistake added. This will make the SQL interpreter think it is looking at code instead of data. But when it tries to execute the code an error occurs. The text output from the error will inform the attacker about your data structure!
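
A first, naive version of that keyword search might look like the sketch below. The keyword list is deliberately short and is an assumption for this illustration (DROP and UNION are added to the four words named above); a real rule set would be larger and would need to deal with false positives, since ordinary text can contain words like "select".

```python
import re

# Illustrative, deliberately short keyword list (an assumption, not a complete rule set)
SQL_KEYWORDS = ("SELECT", "INSERT", "DELETE", "UPDATE", "DROP", "UNION")
KEYWORD_PATTERN = re.compile(r"\b(" + "|".join(SQL_KEYWORDS) + r")\b", re.IGNORECASE)

def find_sql_keywords(payload):
    """Return the SQL reserved words found in a payload, upper-cased, without duplicates."""
    found = []
    for match in KEYWORD_PATTERN.finditer(payload):
        word = match.group(1).upper()
        if word not in found:
            found.append(word)
    return found

print(find_sql_keywords("name=brian'; drop table tbl_users; --"))  # ['DROP']
print(find_sql_keywords("name=brian&city=Eureka"))                 # []
```

The payload string passed to the function would be the decoded data portion of a packet, as produced by the parser in the previous section.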

The Top Four Attacks

Now we have parsed the data packet into strings for examination. What hidden and potentially damaging code are we actually looking for in the packets? There is a plethora of potential threats, but these four are among the most common:

  • Simple SQL injection
  • Buffer overflow
  • Function call injection
  • Code injection

SQL injection 101

Although this is a widely known and frequently used hack, web apps are still designed and coded so that parameters on the URL line of the browser may go directly into the SQL engine unfettered and unfiltered! Astonishing but true, this still happens, even at a time when massive data losses are reported in the daily news.

What this means is that the application generates SQL strings directly from GET parameters entered by the user, for example, on the browser URL line. In this case, a clever user may substitute SQL commands for “valid” user input. The hacker needs only a web browser and knowledge of SQL to attack the server. Let’s look at an example of SQL injection.
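
Parameters on the URL line usually arrive percent-encoded, so they need to be decoded before they can be inspected. The following sketch uses the standard urllib.parse module to do that; the URL, the host example.com, and the parameter names are hypothetical and serve only as an illustration.

```python
from urllib.parse import urlsplit, parse_qs

def get_parameters(url):
    """Return the decoded GET parameters of a URL as a dict of lists."""
    return parse_qs(urlsplit(url).query, keep_blank_values=True)

# Hypothetical request: the pass parameter carries a percent-encoded SQL fragment
url = "http://example.com/login?user_name=brian&pass=%27%20OR%20%271%27%3D%271"
params = get_parameters(url)
print(params["user_name"])  # ['brian']
print(params["pass"])       # ["' OR '1'='1"]
```

Once decoded, the value of the pass parameter is plainly a fragment of SQL rather than a password.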

On a simple login page, this SQL SELECT query is built by concatenating the name and password which the user typed into two textboxes, with the purpose of validating them:

"SELECT user_email, user_id FROM tbl_users WHERE user_name='" + TextBox1.Text + "' AND pass='" + TextBox2.Text + "'";

After validating the user’s input from the textboxes the app then queries the DB for the user’s user_email and user_id from the database. When valid user credentials are passed the application sends a string to the database such as:

SELECT user_email, user_id FROM tbl_users WHERE user_name='brian' AND pass='brian123';

Because the username and password are valid, the database executes the command and then returns the requested field values. But the attacker can alter the SQL string by adding a conditional clause which always evaluates to TRUE. For example, if a hacker types a code fragment such as ' OR '1' = '1 into the password textbox, then the DB will respond by returning brian’s valid data to the app. The resulting query is:

SELECT user_email, user_id FROM tbl_users WHERE user_name='brian' AND pass='' OR '1' = '1';

The injected conditional causes the validation of credentials to always evaluate as true! A second kind of attack gives the attacker clues about the table structure of the DB. Suppose a hacker enters the value ' AND user_id = into the password textbox. If the input is not sanitized, this fragment is not treated as a value. Instead, the DB engine parses it as part of the SQL statement, finds a syntax error, and throws an error.

This behavior illustrates how SQL injection methods largely rely on string concatenation. When a knowledgeable user interacts directly with a form input which goes straight to the SQL engine, the results can be explosive.
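
The difference between concatenation and a parameterized query can be seen in a small experiment. The sketch below uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the app's real database; the table contents and the two function names are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_users (user_id INTEGER, user_name TEXT, pass TEXT, user_email TEXT)")
conn.execute("INSERT INTO tbl_users VALUES (1, 'brian', 'brian123', 'brian@example.com')")

def login_concatenated(user_name, password):
    # Vulnerable: user input is pasted directly into the SQL string
    query = ("SELECT user_email, user_id FROM tbl_users WHERE user_name='"
             + user_name + "' AND pass='" + password + "'")
    return conn.execute(query).fetchall()

def login_parameterized(user_name, password):
    # Safer: the driver passes the values separately from the SQL text
    query = "SELECT user_email, user_id FROM tbl_users WHERE user_name=? AND pass=?"
    return conn.execute(query, (user_name, password)).fetchall()

injected = "' OR '1' = '1"
print(login_concatenated("brian", injected))      # the row is returned without the password
print(login_parameterized("brian", injected))     # no rows: the input is treated as a value
print(login_parameterized("brian", "brian123"))   # valid credentials still work
```

With placeholders, the injected text is compared against the stored password as an ordinary string, so the always-true condition never reaches the SQL parser.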

The error message will likely betray information to the hacker about the background coding language and details of the underlying table structure. Such details then give the hacker clues about how to build more sophisticated attacks.
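
One common way to avoid leaking such details is to keep the database error text on the server and show the user only a generic message. Here is a minimal sketch of that idea, again using an in-memory sqlite3 database as a stand-in; the function name run_query and the wording of the public message are assumptions for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_users (user_id INTEGER, user_name TEXT, pass TEXT)")

def run_query(query):
    """Return (rows, public_message, logged_detail) for a query."""
    try:
        return conn.execute(query).fetchall(), "OK", None
    except sqlite3.Error as error:
        # The detailed text stays on the server; the client sees only a generic message
        return [], "Invalid request", str(error)

# Query produced when ' AND user_id = is typed into the password textbox
rows, public_message, logged_detail = run_query(
    "SELECT user_id FROM tbl_users WHERE user_name='brian' AND pass='' AND user_id ='")
print("Shown to the user:", public_message)
print("Kept in the server log:", logged_detail)
```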

Buffer Overflow Vulnerability

A distant relation to SQL injection, a buffer overflow attack is one in which an attacker deliberately writes more data into a memory buffer than it can hold, then exploits the overflow in various ways. In system-level languages such as C and C++, where the use of pointers is prolific, memory safety remains a serious concern.

A related memory-safety flaw is the dangling pointer. Imagine a dead link on a web page: this is analogous to a dangling pointer in C++. Pointers persist in C++ whether the associated memory content is still valid or not. When a pointer and its memory allocation get detached, there is room for foul play. Attackers can deposit code in the allocation and potentially execute it through this exploit.

One method is to inject code into the memory overflow area, where it might gain control and damage other resources. Python manages memory for you, but apps developed with Python can still be exposed to buffer overflow attacks through the C extensions and native libraries they call. Python coding to avoid buffer overflow exploits belongs in another kind of tutorial.

Function Call and Code Injection

There are myriad methods to cause chaos by inserting code where it was not intended by the designer of an app. These are only bounded by the creative limits of the hacker. Webhooks, for instance, let an outside service send data to a URL in your app, and it’s easy to imagine how this can be exploited in a manner similar to SQL injection when that data is not validated!

Full Stack Python Resources for Cyber Security Coding

Because Python is such a versatile language, it is thought to be ideal for many cyber security applications. A vast community of developers, thousands of library functions readily available to import into a project, and easy integration with other languages such as C++ and Java are among the reasons Python is the choice for a variety of intrusion detection systems and general cyber forensics apps. Here are a few applications, including some which we have not already mentioned:

  • Web server fingerprinting
  • Simulation of attacks
  • Geolocation via IP address
  • Website cloning
  • Website load generation and testing
  • Creating intrusion detection systems
  • Wireless network scanning
  • Network traffic transmission
  • Mail server access

Web server fingerprinting

Every bit of information which can be associated with a user threat is valuable to the cyber forensics professional. For example, the headers found in HTTP requests and in responses from web servers can be thought of as a fingerprint in forensic terms, one that carries a good deal of information between client and server. Here is a simple Python script to fetch the header information. The response headers are read, and an attempt is made to identify the server which answered the request:

import requests

# Fetch the page and keep the response object
myReq = requests.get('http://www.bytescout.com')
# Headers which commonly reveal details about the server
MyHeaders = ['Server', 'Date', 'Via', 'X-Powered-By', 'X-Country-Code']
for header in MyHeaders:
    try:
        result = myReq.headers[header]
        print('%s: %s' % (header, result))
    except KeyError:
        # The server did not send this header
        print('%s: Not found' % header)
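
The same lookup can be tried without a network connection by running it against a plain dictionary of headers. In the sketch below, the function name fingerprint_headers and the sample header values are hypothetical and used for illustration only.

```python
def fingerprint_headers(headers, wanted=('Server', 'Date', 'Via', 'X-Powered-By', 'X-Country-Code')):
    """Map each wanted header name to its value, or to 'Not found' when absent."""
    # HTTP header names are case-insensitive, so compare in lower case
    lowered = {name.lower(): value for name, value in headers.items()}
    return {name: lowered.get(name.lower(), 'Not found') for name in wanted}

# Hypothetical response headers, for illustration only
sample_headers = {
    'server': 'nginx',
    'Date': 'Mon, 01 Jan 2024 00:00:00 GMT',
    'X-Powered-By': 'PHP/8.2',
}
report = fingerprint_headers(sample_headers)
for name, value in report.items():
    print('%s: %s' % (name, value))
```

Passing the headers attribute of a requests response to this function should work the same way, since it behaves like a dictionary.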

Geolocation Coding Made Easy!

Tracking the location of intruders is essential for cybersecurity specialists. Python makes it easy to look up the location of a user’s IP address. This is accomplished by combining Python and Google APIs along with the pygeoip module.

The first step is to download the standard GeoIP database from dev.maxmind.com/geoip/legacy/geolite/. After that, an IP address can be looked up and associated with a place name. Here are the commands which can be entered in the Python command prompt or in a script:

import pygeoip

# Load the downloaded GeoIP database file
myGeoIP = pygeoip.GeoIP('GeoIPDataSet.dat')
print(myGeoIP.country_name_by_addr('<IP Address>'))

# Look up a country code or name with these commands:
print(myGeoIP.country_code_by_name('google.com'))
print(myGeoIP.country_code_by_addr('<IP Address>'))
print(myGeoIP.country_name_by_addr('<IP Address>'))
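
Before an address is handed to a geolocation lookup, it may be worth checking that it is a valid, publicly routable IP address, since private and loopback addresses have no meaningful location. The sketch below does this with the standard ipaddress module; the function name is_geolocatable is an assumption for this illustration and is not part of pygeoip.

```python
import ipaddress

def is_geolocatable(address):
    """Return True when the text is a valid, publicly routable IP address."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        # Not an IP address at all
        return False
    return ip.is_global

for candidate in ("8.8.8.8", "10.0.0.1", "127.0.0.1", "not-an-ip"):
    print(candidate, is_geolocatable(candidate))
```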

Comprehensive Python Coverage!

Python offers abundant resources for security app developers. To borrow an old phrase from the insurance industry, Python gives you full coverage over those resources, along with the power to implement cybersecurity solutions. There is also a bevy of community support and a multitude of library functions available for Python developers.

With all these resources, you can create your own proprietary intrusion prevention systems entirely without the use of third-party tools! Wielding Python gives you the confidence that you are in control of your device’s data security. Stay in touch with Bytescout for regular new editions of the best technology tutorials online!

 

About the Author

Mark Ronald Moore

Mark is a freelance consultant and coder in the areas of machine learning, automation testing, and web app development. He currently writes coding tutorials and tech articles regularly for ByteScout. Mark is a resident of Humboldt, California, and enjoys hiking in the redwoods.

 

 


