Cyber forensics, also called computer forensic science, offers many fascinating branches to pursue. In this tutorial, we will focus on the particular branch of cybersecurity analysis with code examples using Python as our language of choice.
In the course of our Python socket programming tutorial we will build an actual packet analyzer, sometimes called a network packet sniffer. We will talk specifically about SQL injection techniques and how to detect and thwart them. This leads us to the discussion of malicious packet detection. Here are your key takeaways:
In straightforward terms, a malicious packet of data contains a hack attempt, such as an SQL injection. Such attacks can severely damage an app, and this is why we need intrusion detection. The purpose of malicious packet detection is to stop an SQL injection before it executes and hijacks your web app, protecting the targeted application and its sensitive data.
SQL injections are often done right on the URL line of browsers! The best way to catch this type of attack is by continually scanning incoming requests on ports used by your app.
If you are familiar with how Python is used to build web scrapers, then you will see immediately that Python can watch and interact on client-server channels and send HTTP requests that modify normal GET and POST parameters. It’s easy to imagine how Python can be used both for hacking and to thwart hackers!
Our first Python socket programming example will provide an introduction to socket programming. To begin, we need a clear understanding of port scanning, and how to do it with Python. Methods for this purpose are easy to understand on inspection of code samples.
After we develop an understanding of how to capture packets and parse them, we need to know exactly what we are searching for in the packets coming through as server requests. This will lead us naturally to the many useful functions available in the Python socket library.
In simplest terms, a port is an endpoint in network communication: a logical software construct that identifies a process or service on a host. A socket, for our purposes, is one end of a connection in an Internet Protocol-based network, often called an Internet socket. TCP, the Transmission Control Protocol used on the Internet, provides reliable one-to-one connections between sockets.
On the Internet, a socket is identified by a specific address: the IP address of a node combined with a port number on that node. Sockets are ubiquitous on the Internet, as they are a core component of all network communication.
Virtually all of them are INET sockets, and every browser opens a socket to connect to a web server. Monitoring what comes across these sockets at the packet level is a first line of defense against network intrusion. If you know how to scan ports, then you know which services a device exposes to the network.
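To make the idea of a socket address concrete, here is a minimal sketch that opens a TCP connection over the loopback interface and prints the (IP address, port) pair at each end. The function names open_local_listener and demo_socket_addresses are made up for this illustration, and the port numbers are whatever the operating system hands out.

```python
import socket


def open_local_listener():
    """Create a TCP socket bound to the loopback address on a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    listener.listen(1)
    return listener


def demo_socket_addresses():
    """Connect a client to a local listener and collect both socket addresses."""
    listener = open_local_listener()
    server_ip, server_port = listener.getsockname()

    client = socket.create_connection((server_ip, server_port), timeout=2)
    conn, client_addr = listener.accept()

    result = {
        "server": (server_ip, server_port),
        "client_local": client.getsockname(),
        "client_seen_by_server": client_addr,
    }
    for sock in (conn, client, listener):
        sock.close()
    return result


if __name__ == "__main__":
    for name, addr in demo_socket_addresses().items():
        print(name, addr)
```

Each end of the connection is an (IP address, port) pair, and the address the client reports for itself is the same one the server sees for its peer.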
In everyday language, the words “port” and “socket” are used rather loosely. If you Google “port scanning” with Python, you will likely get a lot of results including how to make use of the Python socket library.
Let’s start the discussion by understanding exactly how port scanning with Python works. This will give us a basis for developing several network analysis tools. Such tools fit into the scope of a larger intrusion detection system and, beyond that, general cybersecurity systems.
Python provides a socket module in its standard library, which gives us easy access to the BSD socket-level API. The module contains functions that are essential to networking tasks, such as converting host names and addresses and converting values between host and network byte order.
The socket module also contains a socket class for working with the data channel. We will use these built-in functions and the socket class to build our example port scanner.
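As a small illustration of the conversion helpers mentioned above, the following sketch converts an IPv4 address between text and packed bytes and converts a port number between host and network byte order. The address and port are arbitrary sample values chosen for this example.

```python
import socket
import struct

# Dotted-quad text <-> 4 packed bytes (the form used inside an IP header)
packed = socket.inet_aton("192.168.1.10")
text = socket.inet_ntoa(packed)

# 16-bit port number: host byte order <-> network (big-endian) byte order
port_net = socket.htons(8080)
port_host = socket.ntohs(port_net)

# The struct module performs the same conversion when the format starts with "!"
port_bytes = struct.pack("!H", 8080)
(port_again,) = struct.unpack("!H", port_bytes)

if __name__ == "__main__":
    print(packed, text)
    print(port_host, port_bytes, port_again)
```

The packet parser later in this tutorial relies on the same two ideas: struct formats prefixed with "!" and socket.inet_ntoa() for printing addresses.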
Our first code example implements a basic port scanner in Python by importing the socket library. Let’s go over this code, and then define some of the built-in keywords:
# Basic port scanner
import sys
import socket
import subprocess
from datetime import datetime

# Clear the terminal (the 'clear' command assumes a Unix-like shell)
subprocess.call('clear', shell=True)

remoteServer1 = input("Enter the remote host to scan: ")

try:
    # Resolve the host name to an IPv4 address
    remoteServerIP1 = socket.gethostbyname(remoteServer1)
    print("Wait... scanning host", remoteServerIP1)

    # start timer
    t01 = datetime.now()

    # Use range function to scan a range of ports like this:
    for port in range(1, 1025):
        socket1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # connect_ex returns 0 when the connection succeeds
        result = socket1.connect_ex((remoteServerIP1, port))
        if result == 0:
            print("Port {}: Open".format(port))
        socket1.close()

# Handle errors:
except KeyboardInterrupt:
    print("Ctrl+C pressed")
    sys.exit()
except socket.gaierror:
    print('Hostname unresolved... quitting')
    sys.exit()
except socket.error:
    print("Connection to server failed")
    sys.exit()

# time duration
t02 = datetime.now()
totalTime = t02 - t01
print('Scan duration: ', totalTime)
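In the above code, we can see the convenience and potential for scanning ports with Python. The syntax for connecting to the host is easily inferred. Here are some of the important keywords: socket.AF_INET selects the IPv4 address family; socket.SOCK_STREAM selects a stream (TCP) socket; socket.gethostbyname() resolves a host name to an IPv4 address; and connect_ex() attempts a connection and returns an error code instead of raising an exception, with 0 meaning the connection succeeded.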
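The heart of the scanner is the connect_ex() call. The sketch below isolates that check in a helper, here named is_port_open (a name chosen for this illustration), and adds a timeout so that a filtered port does not stall the scan. The 0.5 second default is an assumption to tune for your network.

```python
import socket


def is_port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # Open a listener on loopback so that one port is known to be open
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    open_port = listener.getsockname()[1]

    print(open_port, is_port_open("127.0.0.1", open_port))
    listener.close()
    print(open_port, is_port_open("127.0.0.1", open_port))
```

Only scan hosts that you own or have permission to test.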
In the next step, we will develop code to parse incoming TCP packets. This will enable us to eventually scan individual packets for malicious strings such as attempted SQL injections. Another clever attack attempts to intentionally induce the server SQL engine to throw an exception because the error message will often reveal table or column names within the database.
This and other information gleaned from the exception give the attacker valuable information about the structure of the app’s data and potential vulnerabilities.
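In this code example, we will construct a data packet parser. It splits each packet into the IP header, the TCP header, and the data. Note that a raw socket of this kind typically requires root privileges, and the code is written with a Linux host in mind: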
# Parse incoming TCP packets
import sys
import socket
from struct import unpack

# Next create a raw INET socket that receives TCP packets
try:
    s1 = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
except socket.error as msg:
    print('Create socket failed. Error : ' + str(msg.errno) + ' Message ' + str(msg.strerror))
    sys.exit()

# Loop to receive packets
while True:
    # recvfrom returns a (bytes, address) tuple; 65535 is the largest IP packet
    packetT = s1.recvfrom(65535)
    # get packet bytes from tuple
    packet1 = packetT[0]

    # first 20 bytes => ip header
    ip_header = packet1[0:20]
    # Parse unpack - BBHHHBBH4s4s is the format specifier!
    iph = unpack('!BBHHHBBH4s4s', ip_header)

    version_ihl = iph[0]
    version1 = version_ihl >> 4
    ihl = version_ihl & 0xF
    iph_length1 = ihl * 4   # header length in bytes
    ttl = iph[5]
    protocol1 = iph[6]
    s_addr = socket.inet_ntoa(iph[8])
    d_addr = socket.inet_ntoa(iph[9])
    print('Version: ' + str(version1) + ' IP Header Length : ' + str(ihl) +
          ' TTL : ' + str(ttl) + ' Protocol : ' + str(protocol1) +
          ' Source Addr : ' + str(s_addr) + ' Destination Addr : ' + str(d_addr))

    tcp_header1 = packet1[iph_length1:iph_length1 + 20]
    # unpack here:
    tcph = unpack('!HHLLBBHHH', tcp_header1)
    source_port1 = tcph[0]
    dest_port1 = tcph[1]
    sequence1 = tcph[2]
    acknowledgement1 = tcph[3]
    doff_reserved1 = tcph[4]
    tcph_length1 = doff_reserved1 >> 4   # header length in 32-bit words
    print('Source Port : ' + str(source_port1) + ' Dest Port : ' + str(dest_port1) +
          ' Sequence Number : ' + str(sequence1) +
          ' Acknowledgement : ' + str(acknowledgement1) +
          ' TCP header length : ' + str(tcph_length1))

    h_size1 = iph_length1 + tcph_length1 * 4
    data_size1 = len(packet1) - h_size1
    # get data bytes from packet and decode them for display
    data = packet1[h_size1:]
    print('Data: ' + data.decode('utf-8', errors='replace'))
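Because the raw-socket version needs elevated privileges, it can be handy to exercise the same unpack logic on a packet built in memory. The sketch below is one way to do that: build_sample_packet and parse_packet are hypothetical helper names, the addresses and ports are made-up sample values, and the checksums are left at zero because nothing here is sent on the wire.

```python
import socket
import struct

IP_FMT = "!BBHHHBBH4s4s"   # 20-byte IPv4 header without options
TCP_FMT = "!HHLLBBHHH"     # 20-byte TCP header without options


def build_sample_packet(payload):
    """Build an in-memory IPv4/TCP packet for testing (checksums left at 0)."""
    ip_header = struct.pack(
        IP_FMT,
        (4 << 4) | 5,                 # version 4, header length 5 words
        0,                            # type of service
        20 + 20 + len(payload),       # total length
        54321,                        # identification
        0,                            # flags and fragment offset
        64,                           # TTL
        socket.IPPROTO_TCP,           # protocol number 6
        0,                            # header checksum (not computed)
        socket.inet_aton("10.0.0.1"),
        socket.inet_aton("10.0.0.2"),
    )
    tcp_header = struct.pack(
        TCP_FMT,
        40000, 80,                    # source port, destination port
        1, 0,                         # sequence, acknowledgement
        5 << 4,                       # data offset: 5 words
        0x18,                         # flags: PSH + ACK
        65535,                        # window size
        0, 0,                         # checksum, urgent pointer
    )
    return ip_header + tcp_header + payload


def parse_packet(packet):
    """Split a packet into selected IP fields, TCP fields, and the data."""
    iph = struct.unpack(IP_FMT, packet[:20])
    ip_len = (iph[0] & 0xF) * 4
    tcph = struct.unpack(TCP_FMT, packet[ip_len:ip_len + 20])
    tcp_len = (tcph[4] >> 4) * 4
    return {
        "src": socket.inet_ntoa(iph[8]),
        "dst": socket.inet_ntoa(iph[9]),
        "ttl": iph[5],
        "protocol": iph[6],
        "src_port": tcph[0],
        "dst_port": tcph[1],
        "data": packet[ip_len + tcp_len:],
    }


if __name__ == "__main__":
    sample = build_sample_packet(b"GET /index.html HTTP/1.1\r\n\r\n")
    print(parse_packet(sample))
```

Keeping the parsing in a function that takes bytes means the same code can be fed from a raw socket, a capture file, or a test.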
Most SQL injections fit one or both of two main categories:
A functional SQL injection is an injection of code that actually works and performs some nefarious task within the SQL server. The intention is to circumvent the app and gain access to the server. To catch such an attack, we must use our parser to find SQL keywords entered into an input box.
These are the obvious SQL reserved words like SELECT, INSERT, DELETE, UPDATE, and others. Intentionally induced errors also use the same keywords but with a mistake added. This will make the SQL interpreter think it is looking at code instead of data. But when it tries to execute the code an error occurs. The text output from the error will inform the attacker about your data structure!
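A keyword search of this kind might be sketched as follows. This is a deliberately naive illustration, not a complete detector: the pattern list SQL_PATTERNS and the function find_sql_indicators are names invented here, the keyword list is a small sample, and legitimate traffic that happens to contain words such as "update" will be flagged too.

```python
import re
from urllib.parse import unquote_plus

# (label, pattern) pairs; both patterns are illustrative assumptions
SQL_PATTERNS = [
    ("keyword", re.compile(r"\b(select|insert|update|delete|drop|union)\b",
                           re.IGNORECASE)),
    # A quote followed by OR/AND and a comparison, e.g. ' or '1' = '1
    ("tautology", re.compile(r"'\s*(or|and)\s+'?\w+'?\s*=\s*'?\w+",
                             re.IGNORECASE)),
]


def find_sql_indicators(payload):
    """Return (label, matched text) pairs found in the packet data bytes."""
    # Decode the bytes, then undo URL encoding such as %27 and +
    text = unquote_plus(payload.decode("utf-8", errors="replace"))
    hits = []
    for label, pattern in SQL_PATTERNS:
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


if __name__ == "__main__":
    request = b"GET /login?user=brian&pass=%27+or+%271%27+%3D+%271 HTTP/1.1"
    print(find_sql_indicators(request))
```

Decoding the URL encoding first matters, because an attacker typing into the browser URL line will usually have quotes and spaces encoded before they reach the server.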
Now we have parsed the data packet into strings for examination. What hidden and potentially damaging code are we actually looking for in the packets? There is a plethora of potential threats, but a few rank among the most common.
Although this is a widely known and frequently used hack, web apps are still designed and coded so that parameters on the URL line of the browser may go directly into the SQL engine unfettered and unfiltered! Astonishing but true, this still happens, even at a time when massive data losses are reported in the daily news.
What this means is that the application generates SQL strings directly from GET parameters entered by the user, for example, on the browser URL line. In this case, a clever user may substitute SQL commands for “valid” user input. The hacker needs only a web browser and knowledge of SQL to attack the server. Let’s look at an example of SQL injection.
On a simple login page, this SQL SELECT query has the purpose of validating the user’s input of name and password into textboxes:
"SELECT user_email, user_id FROM tbl_users WHERE user_name='" + TextBox1.Text + "' and pass='" + TextBox2.Text + "'";
After validating the user’s input from the textboxes the app then queries the DB for the user’s user_email and user_id from the database. When valid user credentials are passed the application sends a string to the database such as:
SELECT user_email, user_id FROM tbl_users where user_name='brian' and pass='brian123';
Because the username and password are valid, the database executes the command and then returns the requested field values. But the attacker can alter the SQL string by adding a conditional clause that always evaluates to TRUE. For example, if a hacker injects a code fragment such as ' or '1' = '1 into the password textbox, then the DB will respond by returning Brian’s valid data to the app. For example:
SELECT user_email, user_id FROM tbl_users where user_name='brian' and pass='' or '1' = '1';
The injected conditional causes the validation of credentials to always evaluate as true! A related technique gives the attacker clues about the table structure of the DB. Suppose a hacker enters a value of ' AND user_id = into the password textbox. If the input is not sanitized, this code fragment will not be interpreted as a value. Instead, the DB engine will try to interpret it as SQL, find a syntax error, and throw an error.
This behavior illustrates how SQL injection methods largely rely on string concatenation. When a knowledgeable user interacts with a form input that goes directly to the SQL engine, the results can be explosive.
The error message will likely betray information to the hacker about the background coding language and details of the underlying table structure. Such details then give the hacker clues about how to build more sophisticated attacks.
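The login example can be reproduced safely with Python's built-in sqlite3 module and an in-memory database. The sketch below reuses the table and column names from the query above; the sample row, the e-mail address, and the function names are assumptions made for the demonstration. It contrasts a query built by string concatenation with a parameterized query, which is a common way to thwart this attack.

```python
import sqlite3


def make_db():
    """Create an in-memory database holding one sample user."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_users "
        "(user_id INTEGER, user_name TEXT, user_email TEXT, pass TEXT)"
    )
    conn.execute(
        "INSERT INTO tbl_users VALUES (1, 'brian', 'brian@example.com', 'brian123')"
    )
    return conn


def login_concatenated(conn, name, password):
    """Vulnerable: user input is pasted straight into the SQL string."""
    sql = ("SELECT user_email, user_id FROM tbl_users "
           "WHERE user_name='" + name + "' and pass='" + password + "'")
    return conn.execute(sql).fetchall()


def login_parameterized(conn, name, password):
    """Safer: placeholders make the engine treat input strictly as values."""
    sql = "SELECT user_email, user_id FROM tbl_users WHERE user_name=? and pass=?"
    return conn.execute(sql, (name, password)).fetchall()


if __name__ == "__main__":
    db = make_db()
    injected = "' or '1' = '1"
    print("concatenated: ", login_concatenated(db, "brian", injected))
    print("parameterized:", login_parameterized(db, "brian", injected))

    # A deliberately broken fragment makes the engine report an error
    try:
        login_concatenated(db, "brian", "' AND user_id =")
    except sqlite3.Error as exc:
        print("error shown to attacker:", exc)
```

With concatenation, the injected condition returns the user's row without a valid password, while the parameterized version returns nothing. The broken fragment shows why raw database errors should never be echoed back to the client.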
A distant relation to SQL injection, a buffer overflow attack is one in which an attacker intentionally triggers a memory overflow error and then exploits it by various methods. In many system-level languages such as C++, where the use of pointers is prolific, memory safety is a serious concern.
Imagine a dead link on a web page. This is analogous to a dangling pointer in C++. Pointers persist in C++ whether the associated memory content is still valid or not. When a pointer and its memory allocation get detached, there is room for foul play. Attackers can deposit code in the allocation and potentially execute it through this exploit.
One method is to inject code into the memory overflow area, where it might get control and damage other resources. Perhaps surprisingly, apps developed with Python are not immune to buffer overflow attacks: although Python code itself is memory-safe, the CPython interpreter and many extension modules are written in C. Python coding to avoid buffer overflow exploits belongs to another kind of tutorial.
There are myriad methods to cause chaos by inserting code where it was not intended by the designer of an app. These are only bounded by the creative limits of the hacker. Webhooks, for example, deliver externally supplied data to an app through HTTP callbacks, and it’s easy to imagine how an unvalidated payload can be exploited in a manner similar to SQL injection!
Because Python is such a versatile language, it is thought to be ideal for many cybersecurity applications. A vast community of developers, thousands of library functions readily available to import into a project, and easy integration with other languages such as C++ and Java are among the reasons Python is the choice for a variety of intrusion detection systems and general cyber forensics apps. Here are a few which we have not already mentioned:
Every bit of information that can be associated with a user threat is valuable to the cyber forensics professional. For example, the headers found in HTTP requests, and in the responses from web servers, can be thought of as a fingerprint in forensic terms, and this fingerprint carries a great deal of information between client and server. Here is a simple Python script that fetches the header information from a response. It looks up a few selected headers in an attempt to identify the server that answered the request:
import requests

# Fetch the page; the response headers describe the answering server
myReq = requests.get('https://bytescout.com')
MyHeaders = ['Server', 'Date', 'Via', 'X-Powered-By', 'X-Country-Code']

for header in MyHeaders:
    try:
        result = myReq.headers[header]
        print('%s: %s' % (header, result))
    except KeyError:
        # The server did not send this header
        print('%s: Not found' % header)
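The same fingerprinting idea applies in the other direction, to the request headers that arrive in captured packet data. As a rough sketch, assuming the data portion of a packet holds one complete plain-text HTTP request, the hypothetical helper parse_http_request below splits it into the request line, a dictionary of headers, and the body.

```python
def parse_http_request(data):
    """Split raw HTTP request bytes into request line, headers, and body."""
    head, _, body = data.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    request_line = lines[0]

    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip anything that is not a "Name: value" pair
            headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


if __name__ == "__main__":
    raw = (b"GET /index.html HTTP/1.1\r\n"
           b"Host: example.com\r\n"
           b"User-Agent: demo-client/1.0\r\n"
           b"\r\n")
    line, hdrs, _ = parse_http_request(raw)
    print(line)
    for name, value in hdrs.items():
        print('%s: %s' % (name, value))
```

Header names are lowercased so that lookups such as hdrs.get("user-agent") work regardless of how the client capitalized them. Requests split across several packets, or sent over HTTPS, will not parse this way.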
Tracking the location of intruders is essential for cybersecurity specialists. Python makes it easy to map the IP address of a user to a geographic location. This can be accomplished by combining Python and Google APIs along with the pygeoip module. The example here uses pygeoip.
The first step is to download the standard GeoIP database from dev.maxmind.com/geoip/legacy/geolite/. After that, an IP address can be looked up and associated with a place name. Here are the commands, which can be entered at the Python prompt or in a script:
import pygeoip

# Load the downloaded GeoIP database file
myGeoIP = pygeoip.GeoIP('GeoIPDataSet.dat')
print(myGeoIP.country_name_by_addr('<IP Address>'))

# Look up a country code or name with these commands:
print(myGeoIP.country_code_by_name('google.com'))
print(myGeoIP.country_code_by_addr('<IP Address>'))
print(myGeoIP.country_name_by_addr('<IP Address>'))
Python offers abundant resources for security app developers. To borrow an old phrase from the insurance industry, Python gives you full coverage over those resources, along with the power to implement cybersecurity solutions. There is also a bevy of community support and a multitude of library functions available to Python developers.
With all these resources, you can create your own proprietary intrusion prevention systems entirely without the use of third-party tools! Wielding the power of Python empowers you with the confidence that you are in control of your device’s data security.