I recently received a request to analyze a suspicious PHP page captured from a user’s Internet history. On the surface it was a typical investigation into inappropriate use of a company system, based on the name of the PHP page: “sex.php”.
But there was more to this page than the content that raised the initial concern: pages like this commonly use well-known techniques to deploy adware, spyware, or session-stealing code.
In this particular case the code was a fully functional PHP command and control application, and I determined it was a variant of the original 2008 Chinese version called “phpspy.”
This code has a full file manager, database manager, arbitrary command execution, arbitrary PHP code execution, and a backdoor shell called "backconnect" which runs on TCP port 12345. It has two options for the command shell: Perl or C. If Perl is on the system, it will run angel_bc.pl (included within the decoded PHP as yet another encoded string).
Otherwise it will try to execute the compiled version of angel_bc.c, which it handles by script code. This post is not an analysis of the backdoor itself; instead it describes the methodology and techniques used to decipher malicious code embedded and encoded in a seemingly normal web page. Below is a snippet of the PHP code that caught my attention and started my investigation:
If you look at the PHP code, there are two distinct chunks that look like base64, but only the first one really is. The first chunk (the one that starts with JE8wMDB…) decodes into meaningful code.
However, if you run the second chunk (the one that starts with 8QWsMtgs…) through a base64 decoder, it outputs binary-like data. I piped that into a file and opened it in a hex editor, but did not see any sign of an x86 or ELF binary, so I knew it was probably obfuscated and needed more analysis. Below are the tools needed to analyze this type of malware:
• base64 (or openssl base64 -d)
• PHP CLI
• PHP manual to reference functions
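The check described above (base64-decode the second chunk, then look for executable signatures) can also be scripted. The following is a minimal Python sketch rather than the hex-editor workflow used in the investigation. The helper names identify_magic and printable_ratio are made up for illustration, and the input is only the eight visible characters of the second chunk, since the full chunk is not reproduced here.

```python
import base64

# Well-known file signatures ("magic bytes") for native executables.
MAGIC = {
    b"\x7fELF": "ELF",
    b"MZ": "PE/DOS (MZ)",
}

def identify_magic(data: bytes):
    """Return the executable type suggested by the leading bytes, or None."""
    for signature, name in MAGIC.items():
        if data.startswith(signature):
            return name
    return None

def printable_ratio(data: bytes) -> float:
    """Fraction of bytes that are printable ASCII (0x20 to 0x7e)."""
    if not data:
        return 0.0
    return sum(0x20 <= b <= 0x7E for b in data) / len(data)

# Only the visible prefix of the second chunk; the rest is not shown in this post.
raw = base64.b64decode("8QWsMtgs")
print(raw.hex())                    # f105ac32d82c
print(identify_magic(raw))          # None
print(printable_ratio(raw) < 0.5)   # True
```

No executable signature and mostly non-printable bytes is consistent with data that is neither plain base64 text nor a native binary, which points toward an extra obfuscation layer.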
Basic Primer Needed to Understand Structure of the Code:
• PHP code begins with "<?php"
• eval() runs a string as PHP code
• The string inside eval() is passed through base64_decode() before being executed
• The base64 data is enclosed in single quotes
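Given the structure in the primer above, the encoded string can be pulled out and decoded without ever letting eval() run it. The sketch below is in Python and assumes the page uses the pattern base64_decode('...') with a single-quoted literal. The sample string is a harmless, invented stand-in and is not taken from the captured page; the name extract_payloads is hypothetical.

```python
import base64
import re

# Matches base64_decode('...') and captures the single-quoted literal.
PATTERN = re.compile(r"base64_decode\(\s*'([A-Za-z0-9+/=\s]+)'\s*\)")

def extract_payloads(php_source: str) -> list:
    """Decode every single-quoted base64 literal passed to base64_decode()."""
    payloads = []
    for match in PATTERN.finditer(php_source):
        literal = "".join(match.group(1).split())  # drop embedded whitespace
        payloads.append(base64.b64decode(literal))
    return payloads

# Hypothetical, harmless sample; NOT the real malware.
sample = "<?php eval(base64_decode('ZWNobyAiaGkiOw==')); ?>"
print(extract_payloads(sample))  # [b'echo "hi";']
```

The point of the design is that the payload is only ever treated as data: it is decoded and printed, never executed.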
Step One: My first step was to decode the first encoded string chunk by copying the code into a file and issuing: cat coded.txt
This could also be accomplished with a browser plugin, as shown in the following screenshot:
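For readers who prefer a script, Step One can be approximated in Python as well. This is a sketch under assumptions: the output file name decoded.php and the function name decode_base64_file are invented for the example, and coded.txt here holds only the first four visible characters of the first chunk, because the full chunk is not reproduced in this post.

```python
import base64
import tempfile
from pathlib import Path

def decode_base64_file(src: Path, dst: Path) -> bytes:
    """Base64-decode the text in `src`, write the bytes to `dst`, return them."""
    text = "".join(src.read_text().split())  # strip newlines and spaces
    decoded = base64.b64decode(text)
    dst.write_bytes(decoded)
    return decoded

with tempfile.TemporaryDirectory() as tmp:
    coded = Path(tmp) / "coded.txt"
    coded.write_text("JE8w\n")  # visible start of the first chunk only
    result = decode_base64_file(coded, Path(tmp) / "decoded.php")

print(result)  # b'$O0'
```

Even this short prefix decodes to the start of a PHP variable name, which is a quick sanity check that the first chunk really is plain base64.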
After decoding this string I ended up with the following new code, in blue below:
Step Two: I attempted to base64 decode the second chunk of data the same way as above, but it output binary-like data instead of code. If we inspect the new code generated above, we can infer that the second encoded section is obfuscated and needs to be decoded (in red text below):
The above code is just a fancy way of reading in a large chunk of data, splitting it up into smaller sections, and piping each section through the “strtr” function, which substitutes characters according to a fixed mapping. In PHP, strtr returns a copy of $str in which every occurrence of each character in $from has been replaced by the corresponding character in $to. Its signature is: string strtr ( string $str , string $from , string $to ).
So, basically, here is that function broken down:
The second encoded chunk of data will be read into the function in sections. Then any character in the encoded chunk that matches a character in the second parameter:
will be replaced with the character in the third parameter at the same offset:
For instance, let’s say you have this function call: strtr("SecireStatg","ig","ue");
When this runs the output string will be “SecureState”, since every “i” will be remapped to a “u” and every “g” will be remapped to an “e”. If that third parameter were backwards, say “eu” instead, then the output would be “SecereStatu”. The following screenshot shows this function and how it can be used to encode or decode:
If $from and $to are different lengths, the extra characters in the longer of the two are ignored, which is the case within this PHP code. The string $from in the PHP code is one character longer than the $to string, so that last character will be ignored.
In this case, the last character is the “=” sign, the base64 padding character (marked in red below). Because “=” is left untouched, the obfuscated data still ends in padding and still appears to be base64. It was a tricky technique that had me stumped for a while.
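The strtr behavior described in this step, including the rule for unequal lengths, can be reproduced outside PHP. The following Python sketch mimics the three-argument form of strtr with str.maketrans and str.translate. The Python function named strtr is a stand-in written for this example, not PHP’s implementation.

```python
def strtr(text: str, from_chars: str, to_chars: str) -> str:
    """Mimic PHP's three-argument strtr(): map from_chars[i] to to_chars[i].

    Extra characters in the longer of the two sets are ignored.
    """
    n = min(len(from_chars), len(to_chars))
    table = str.maketrans(from_chars[:n], to_chars[:n])
    return text.translate(table)

print(strtr("SecireStatg", "ig", "ue"))  # SecureState
print(strtr("SecireStatg", "ig", "eu"))  # SecereStatu

# Unequal lengths: the trailing "=" in the from-set has no partner, so it is
# ignored and any "=" in the input passes through unchanged.
print(strtr("ab==", "ab=", "xy"))        # xy==
```

The last call shows why the trailing “=” in $from is harmless: padding characters in the obfuscated data survive the translation and keep the data looking like base64.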
Step Three: Once I understood the initial code and its obfuscation, I began to decode. The following screenshot shows a simple way to extract that second chunk of encoded data into a file. In this case, I just cat’d the sex.php file, grep’d for the first few characters of the large encoded chunk, and saved the result to a file called “code.txt”:
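The same extraction can be sketched in Python. This version assumes the encoded chunk sits inside a single-quoted PHP string, so it reads from the marker up to the next single quote. The sample page below is an invented stand-in for sex.php: only the two prefixes come from the real page, and the rest, including the name extract_chunk, is placeholder material.

```python
import tempfile
from pathlib import Path

def extract_chunk(page_text: str, marker: str):
    """Return the text from `marker` up to the next single quote, or None."""
    start = page_text.find(marker)
    if start == -1:
        return None
    end = page_text.find("'", start)
    return page_text[start:] if end == -1 else page_text[start:end]

# Made-up stand-in for sex.php; "PLACEHOLDER" is not real encoded data.
sample_page = "<?php $a='JE8wMDB'; $b='8QWsMtgsPLACEHOLDER'; ?>"

chunk = extract_chunk(sample_page, "8QWsMtgs")
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "code.txt"
    out.write_text(chunk)      # save the chunk, as in Step Three
    saved = out.read_text()

print(saved)  # 8QWsMtgsPLACEHOLDER
```

Cutting at the closing quote keeps trailing PHP syntax out of code.txt, which would otherwise break the later base64 decode.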
Step Four: Create a PHP script, “decode.php”, that will read in the contents of code.txt with the “file_get_contents” function. Next the contents of that file will be loaded into the “strtr” function, along with the $from and $to substitution character sets (in other words, every character in code.txt that matches characters in the $from parameter will be replaced with characters in the $to parameter).
Next it will base64 decode that entire new encoded chunk. The following screenshot displays the decoding program, but the basic structure of the decoding process is as follows: strtr (string $str, string $from, string $to)
$str will be the contents of “code.txt”
$from will be 'SgdtQqnu582JM4Os7yem+TFVlpjLik6fcUHBIZ0Ph1GXCxvaEzKYoDr9/3AwNWbR='
$to will be 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
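The two-stage decode in Step Four (character substitution, then base64) can be sketched in Python as follows. This is an approximation of what decode.php does, not a copy of it. The $from and $to strings are the ones listed above; the function name deobfuscate is hypothetical, and the input is only the eight visible characters of the second chunk.

```python
import base64

# Substitution sets as listed in Step Four.
FROM = "SgdtQqnu582JM4Os7yem+TFVlpjLik6fcUHBIZ0Ph1GXCxvaEzKYoDr9/3AwNWbR="
TO = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def deobfuscate(chunk: str) -> bytes:
    """Undo the strtr-style substitution, then base64-decode the result."""
    n = min(len(FROM), len(TO))          # the unmatched "=" in FROM is ignored
    table = str.maketrans(FROM[:n], TO[:n])
    standard_b64 = "".join(chunk.split()).translate(table)
    return base64.b64decode(standard_b64)

# Visible start of the second chunk only; the full chunk is not shown here.
print(deobfuscate("8QWsMtgs"))  # b'$OO00O'
```

Run against just the visible prefix, the output already looks like the start of an obfuscated PHP variable name rather than binary noise, which suggests the substitution table is being applied in the right direction.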
Step Five: Once the decoding script runs, it will output a new PHP file with meaningful, readable code. The following screenshot shows what all this encoding was trying to hide: a full-featured malware program and command and control shell:
Step Six: The PHP code also contains a backdoor called “backconnect”. The variable $back_connect is encoded with similar methods. Once decoded, we can see it is a Perl script (shown in the terminal window on the left of the following screenshot):
I will be teaching a technical class on encoding and obfuscation techniques within the next couple of weeks at SecureState called “Decoding Basics.” This class will define and present an encoding example and the techniques and methodologies used to analyze the encoding.
Additionally, I will show ways to approach obfuscation problems when trying to decipher them. If you are interested in this type of learning and knowledge transfer, and are in the surrounding area, please join us; it’s free and always enjoyable.
Cross-posted from SecureState