Thursday, May 5, 2016

PCDC 2016 RE Challenge Solutions - Part 2

This is a continuation of my previous post detailing the solutions to the 2016 PCDC reverse engineering challenges 1 and 2. This post goes over the solutions for challenges 3 and 4. If you were a competitor in this year's PCDC competition, you may not have seen all of the RE challenges. Challenges 1 and 2 were made for high school students, challenge 3 was added for college students, and challenge 4 was added for professional day. As a result, only the professionals saw all 4 challenges. This was designed to help balance the expected workload during the competition and provide increasing levels of difficulty for increasing levels of skill. So let's look at the remaining two challenge solutions.

Challenge 3 - exclusive

Like with all the other challenges, let's run this one and get an idea of what it does.


So this looks a little bit like our previous challenges, where we have to give it a valid input and it will tell us if that input is correct and print out the key. Let's open the challenge up in Immunity and take a look.


After loading the binary and analyzing the modules just like we did in the previous challenge, we can scroll down and see the instructions responsible for printing out the initial user prompt. We can also see how our input gets read into the program via the call to fgets(). The important thing to note is how fgets() gets its parameters. In 32-bit x86 assembly on Windows, which is what we are looking at, these parameters are passed on the stack using a calling convention called CDECL (C declaration). What this means is the PUSH instructions are actually setting up the parameters to the fgets() function. To understand what these parameters are, let's look at this function.

Using MSDN as a resource, we can see that fgets() has the following signature:

char *fgets(char *str, int n, FILE *stream);

What this function does is read up to n - 1 characters from the file stream stream and store them, along with a terminating null, in the memory buffer pointed to by str. If you look at Immunity, it even tells you which PUSH instruction is responsible for setting up which parameter to the fgets() call, and in this case it tells you that n is 0x16 (22 in decimal). Now for our purposes, the important parameter to consider is the address where this user-defined input gets stored. If you looked at the link specifying the CDECL calling convention, you'll see that the first parameter listed in the function (char *str) is the last one pushed onto the stack before the call to fgets(). We don't need to understand all the instructions, we just need to see the use of DWORD PTR SS:[EBP-34]. This is going to be a pointer to where our input will be stored. If we look through the disassembly a little more, we should see this address show up again, specifically in the 0x00401397 - 0x004013CE address range. Before we go any further, let's recap what we know:

- The program expects to get a 'serial number' which it then validates
- The program uses a call to fgets() to read in our input
- Our input is stored at DWORD PTR SS:[EBP-34]
- We see our stored address used within the assembly address range 0x00401397 - 0x004013CE

Now it's time to verify these. The best way to do this with a debugger is to set breakpoints. At this point, right-click on the address 0x00401397, go down to breakpoints and select toggle. Alternatively, click on the address and press F2 to toggle the breakpoint. This address is reached after we have entered the string into the program and before it looks like it's getting used. With the breakpoint turned on, our program will halt at that address when execution gets there. So with our breakpoint set, let's run the program. This can be done 3 different ways: 1. Press F9, 2. Go to the top menu and press the red play button, 3. Go to Debug -> Run. The first thing that will happen is that the program will break at what is called the program entry point. This is essentially the address the program exposes to the OS so that the program can start up. There are a lot of steps that go into getting from this point to the breakpoint we set, but we don't care about those at this time. Just press F9 again to continue. Now at this point, you should see the Windows command window. It should be waiting for you to enter your input. Go ahead and enter some string into this window like you did the first time you ran this challenge program.



At this point it will look like your program hangs, but this is Immunity pausing it so you can begin to dive deeper into the details of what's going on. Let's analyze the following snippet of code:



If we look at the first instruction, we see a CMP to 0x16. Remember from the first post that CMP is used to make comparisons, and remember from earlier in this post that this challenge reads in a 0x16 byte string. Now look at the second instruction, JNB SHORT exclusiv.004013D0. It is a jump instruction to the address 0x004013D0. That target address is interesting, but not as interesting as the instruction before it at address 0x004013CE, a JMP SHORT exclusive.00401397. That's the same address we set our breakpoint at! This is a loop! We compare some counter to 0x16. If it is less than 0x16, we continue execution, otherwise we jump out. It kind of looks like:

int x = 0;
while (x < 0x16) {
    loop_body();
}

Now we haven't figured out the loop body yet. But let's step through this an instruction at a time by pressing the F7 key and stop when we get to address 0x004013A5, the first XOR instruction. If you're following along with my example and entered in abcd1234 as your string to test, you should see something like this:


Now I've added some circles to draw your attention to a few places. The instruction we stopped at is XOR EAX, 9C. This means we are performing the XOR operation on whatever value is in the EAX register with the hex value 0x9C. I've circled the EAX register, and we can see the value of 0x61. If you recall, our input string was abcd1234, and the hexadecimal value for the ASCII character 'a' is 0x61. Interesting. Let's continue to the next XOR instruction at 0x004013B9.


Again, I've highlighted the value in the EAX register, 0x62 (ASCII value for 'b'), and the other value in the XOR operation is 0xDC. If you continue to step through, you'll see that we jump back to address 0x00401397 and loop through this series of instructions again. When you inspect the EAX register, you'll see the values 0x63 and 0x64 (ASCII 'c' and 'd' respectively). It is indeed our input string. Now, we know we are in a loop. In fact, we wrote out some pseudo-code for it. So let's update it a little:

int x = 0;
while (x < 0x16) {
   input[i] ^ 0x9C;
   input[i+1] ^ 0xDC;
   x++;
}

Hmmm, this doesn't seem quite right. So how is our loop being incremented and how do we know our index for our input? Let's look at the following details:

What we're looking at is how the input string is indexed using the EDX register, and how that register is incremented by 2 every loop iteration. So let's refine our code a little more:

int x = 0;
while (x < 0x16) {
   input[x] ^= 0x9C;
   input[x+1] ^= 0xDC;
   x += 2;
}
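For anyone who would rather experiment outside the debugger, here is a minimal Python sketch of the same transformation. The function name xor_pairs is made up for illustration, and the sketch loops over the length of the data rather than the fixed 0x16 bytes the binary uses, so treat it as an approximation of the loop rather than a faithful copy.

```python
# Sketch of the loop above: even-indexed bytes are XOR'ed with 0x9C,
# odd-indexed bytes with 0xDC. xor_pairs is an illustrative name only.
def xor_pairs(data: bytes) -> bytes:
    out = bytearray(data)
    for x in range(0, len(out), 2):
        out[x] ^= 0x9C
        if x + 1 < len(out):  # guard for odd-length input
            out[x + 1] ^= 0xDC
    return bytes(out)


encoded = xor_pairs(b"abcd1234")
print(encoded.hex())  # fdbeffb8adeeafe8
```

Because XOR undoes itself, running xor_pairs on its own output gives back the original bytes.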

Great. Now when we exit our loop, our input has been XOR'ed with the key 0x9CDC. A little further down we see a call to the function memcmp(). Here is the MSDN documentation for that function. What it does is check whether 2 memory regions contain the same data for a given number of bytes. Again, Immunity has done some work for us and shows us that this memcmp() is using 2 pointers, s1 and s2, and comparing them up to 0x15 (21) bytes. Let's set a breakpoint at address 0x004013DA where memcmp() gets called. By looking at the previous instructions and what we learned from the loop we reverse engineered, we see that s2 is the string we entered after it has been XOR'ed. A good way to confirm this is that the address of this string got loaded into the EAX register with the instruction at address 0x004013D2. So what we can do is right-click on the EAX register and click Follow in Dump. This will show a hex dump of our data:


And indeed, those are our input bytes after being XOR'ed. Check the first 2 bytes; 0x6162 XOR 0x9CDC = 0xFDBE. So what we are really interested in is the memory contents of the second pointer, s1, being used in the memcmp(). We can find that by doing the same thing we did in the previous step, but with the ECX register. So let's get the contents of that dump.


This is ultimately our goal: the 0x15 (21) bytes of this memory segment. This is what our input string gets compared to after being XOR'ed against the key 0x9CDC. So now let's update our pseudo-code:

char serial[0x15] = "\xff\xbd\xfa\xb9\xb1\xec\xad\xee\xaf\xe8\xb1\xe9\xaa\xeb\xa4\xe5\xb1\xbe\xf9\xb9\xfa";
char *input = (char *)malloc(0x16);
fgets(input, 0x16, stdin);

int x = 0;
while (x < 0x16) {
    input[x] ^= 0x9C;
    input[x+1] ^= 0xDC;
    x += 2;
}

if (memcmp(serial, input, 0x15) == 0) {
    win();
} else {
    fail();
}

Now, we know this is what happens with the memcmp() function by reading the documentation and seeing the TEST EAX, EAX instruction and seeing successful jumps over the failure messages. So the only thing left to do is solve what the input string is. If you read the documentation on XOR, you know that it is reversible. So all we have to do is XOR our serial we dumped out of memory with the key of 0x9CDC. There are plenty of online tools to do this for you. Once you do it, you'll find that the input should be: cafe-01234-56789-beef. Let's give this a try.



Success!

Password: cafe-01234-56789-beef
Flag: FLAG{XOR_Encryption_Is_Super_Safe!!!}
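Instead of an online tool, the XOR reversal can also be done in a few lines of Python. This is only a sketch: the serial bytes are the ones dumped from memory above, while the names serial, key, and recover_input are my own for illustration.

```python
# The 21 bytes dumped from the s1 buffer passed to memcmp().
serial = bytes.fromhex("ffbdfab9b1ecadeeafe8b1e9aaeba4e5b1bef9b9fa")

# The two-byte XOR key: 0x9C for even offsets, 0xDC for odd offsets.
key = (0x9C, 0xDC)


def recover_input(blob: bytes) -> str:
    # XOR is its own inverse, so applying the key again undoes the loop.
    return bytes(b ^ key[i % 2] for i, b in enumerate(blob)).decode("ascii")


print(recover_input(serial))  # cafe-01234-56789-beef
```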

Challenge 4 - hashbrowns

This challenge was made specifically for the competition on professional day. As such, I'm going to assume a slightly more advanced level of readership. The last challenge solution really went in depth by stepping through the explanation. This solution write-up isn't going to go into as much detail describing how a particular section of code does something. Instead, it will be left up to the reader to figure out all the details.

So let's start by running this program like we did the ones before it.



Ok. So we are given a hash and we have to find the input that matches this hash. So the way we do this is we need to find the hashing algorithm. Opening up the executable in Immunity, we should see the following:


So with this challenge, we see that there are a couple of internal functions with the CALL instructions to addresses within the hashbrowns executable image. Notably, these can be seen at 0x00401442 and 0x00401471. Let's look at the first function starting at address 0x00401310 by right-clicking on the address and choosing the follow option.



I've gone ahead and annotated the interesting aspects of this function. The first is that this is a loop and we see a call to strlen() in the loop. This leads us to believe that we are looping through a string for the length of that string. The other interesting pieces of information about this loop are the CMP instructions. We see each of them circled in the above picture and each comparison is to a hex value. The four values, 0x41, 0x5A, 0x61, and 0x7A correspond to the ASCII characters 'A', 'Z', 'a', and 'z' respectively. This loop looks like it is looking for upper and lowercase alphabetical characters. We can further confirm this by analyzing the string argument to the printf() call at 0x00401453, "Found a character I can't hash." This clue would indicate that the program only accepts alphabetical characters. Let's test this hypothesis.


We have confirmed our hypothesis. Let's move onto the next internal function at 0x00401390. Here is a picture of that function.



So again we can see we are looping through the length of the string character by character. This is in fact the function that performs the hashing. Now, there were a few ways to solve this. The difficult way was to manually reverse engineer the actual algorithm represented by the instructions at addresses 0x004013C3 - 0x004013D7. The easier way was to look at the value 0x1505 and do some research on hashing algorithms. If you convert 0x1505 to decimal, you get 5381. A Google search with the keywords hash and 5381 should lead you directly to the DJB2 hashing algorithm. Here is what that algorithm looks like.
unsigned long djb2(unsigned char *str){
        unsigned long hash = 5381;
        int c;
        while (c = *str++)
            hash = ((hash << 5) + hash) + c; 

        return hash;
}

When looking at this algorithm, we see the constant 0x1505 and the shift left 5 (SHL EDX, 5). Now we need to make sure this is actually what we're after. Going back to the main part of our challenge, we see the following section of code.


What we see here is a call to our hashing function at 0x00401390, then a series of CMP instructions. Interestingly, we see a CMP to a value of 0x28C7FAE4, which is the hash value the challenge asked us to match. If we look a little further, we can see that this CMP instruction is followed by a jump to a unique address not jumped to by any other CMP instruction. So these are the pieces we have so far:

  • The program expects a string input that will be hashed
  • That string gets hashed through the DJB2 hashing algorithm
  • The result of the hash must match 0x28C7FAE4

The way to solve this is to write a brute force script that enumerates alphabetical strings and hashes them using the DJB2 hashing algorithm until a match is found with the value 0x28C7FAE4.

This is what the DJB2 algorithm looks like in Python.

def djb2(string):
    hash = 5381
    for x in string:
        hash = (( hash << 5) + hash) + ord(x)
    return hash & 0xFFFFFFFF

The best way to solve this is to actually use a password list. Using the RockYou password list and a Python script based on the previous snippet of code, you would have found that the hash value matches the input string of cyberdragoon. Here is the proof.



So here is the summary for RE challenge 4:
Password: cyberdragoon
Flag: FLAG{alls_good_when_the_hashing_is_easy}
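The wordlist search described above could look roughly like the sketch below. It is not the exact script I used during the competition: find_preimage and the tiny sample list are illustrative stand-ins, and in practice you would feed it the lines of a real wordlist file (the path and encoding of that file are up to you).

```python
TARGET = 0x28C7FAE4  # hash value the challenge asks us to match


def djb2(text: str) -> int:
    # DJB2, truncated to 32 bits on every step like the x86 registers.
    h = 5381
    for ch in text:
        h = ((h << 5) + h + ord(ch)) & 0xFFFFFFFF
    return h


def find_preimage(candidates, target):
    # Return the first alphabetical candidate whose hash equals target.
    for word in candidates:
        word = word.strip()
        # The binary rejects anything outside A-Z / a-z, so skip those.
        if word and word.isascii() and word.isalpha() and djb2(word) == target:
            return word
    return None


# Stand-in for iterating over the lines of a real password list.
sample = ["password", "letmein", "cyberdragoon"]
print(find_preimage(sample, TARGET))  # cyberdragoon
```

Any iterable of strings works as the candidates argument, so an open file handle can be passed in directly.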


Conclusion

I hope this year introduced an interesting new twist on the PCDC injects. If you're interested in working through the challenges on your own, I've put the executables on my GitHub account here. As always, I'm interested in hearing feedback, so if you have anything you'd like to see improved for next year, don't hesitate to let me know. See you in 2017!





Sunday, April 24, 2016

PCDC 2016 RE Challenge Solutions - Part 1

When we sit down and plan the next year's version of PCDC, the goal is to challenge competitors with new and specific learning objectives. This year, our goal was to push more reverse engineering and incident response type challenges for the competitors to complete as part of their injects. For this, I created 4 different reverse engineering challenges. The rules were simple: Use the tools provided to you or freely available for download to analyze the challenges and get the flag. Flags were in the format 'FLAG{}' with a custom string for each challenge in-between the '{}' characters. In order to get the points for the challenge, teams must submit the input to the challenge and the correct flag.

Since PCDC is run for high schools, colleges, and professionals, the challenges were created with increasing levels of difficulty. Additionally, the competition only runs for about 8 hours and these challenges are not the only tasks the Blue Teams must complete. With that in mind, here is Part 1 of the RE solutions.

Challenge 1 - strungout


The first challenge is called 'strungout'. Let's run it and see what it does.


Ok. Looks like we need to figure out the password. Let's just guess something and see what happens.



So password wasn't the password.

In an attempt to provide some insight into the challenges, each challenge had a name that hinted at its solution. This first challenge was meant to be a gentle introduction to reverse engineering. In order to reverse engineer any target, the goal is to gather as much information about that target as you can. The de facto go-to is to extract some form of human readable text, aka strings. As it happens, the Microsoft SysInternals suite has a tool called strings that does exactly what we want. If we run strings on the challenge, we can see the human readable strings embedded in the executable.



And if we scroll down...



We see the 2 strings we saw before, "Sorry. Can haz password??" and "Ouch. Try again!". Interestingly, we see a different string that looks suspicious! "I_Love_PCDC_RE_Challenges!!!". Let's see what happens if we try this for a password.



So the answer to the first RE challenge would have been:
Password: I_Love_PCDC_RE_Challenges!!!
Flag: FLAG{playing_with_strings_iz_easy!!!}
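If you don't have the SysInternals tool handy, the basic idea behind it can be approximated in a few lines of Python. This is a simplified sketch, not a reimplementation: extract_strings is a made-up name, it only looks for runs of printable ASCII, and the sample bytes below are invented for the demo rather than taken from the real executable.

```python
import re


def extract_strings(blob: bytes, min_len: int = 4):
    # Find runs of at least min_len printable ASCII bytes (0x20-0x7E).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.decode("ascii") for m in pattern.findall(blob)]


# Invented sample data standing in for the contents of an executable.
blob = b"\x00\x01Ouch. Try again!\x00\xff\x02I_Love_PCDC_RE_Challenges!!!\x00ab\x00"
for s in extract_strings(blob):
    print(s)
```

To try it on a real file, read the file in binary mode and pass the bytes to extract_strings.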


Challenge 2 - maths

This next challenge increases the difficulty a little bit. The goal was to introduce new students to debugging and x86 assembly. For the competition, we made sure that there were copies of Immunity Debugger installed on Blue Team computers already. Additionally, the free version of IDA Pro was installed. For this write-up I'm going to be using IDA just to show off the control flow graph of a program. This lays a program out as a diagram that makes it easier to understand how different parts of the program get executed. The remaining details will use Immunity Debugger just because it is more easily available for people to use.

So opening up maths in IDA will produce a diagram similar to this:

What we see here is that it looks like a number is being read into the program, some math is happening on it, and if it's right, we fall into a specific part of the program, otherwise we get the message "Knew you couldn't guess it!!". Let's see if this is true.


Ok. Let's open this program up in Immunity debugger and figure out what's going on.


So things look a little different in Immunity, but that's OK. What we want to do first is look at the executable modules of this program. This can be done by clicking the 'e' in the top bar of the previous picture or by pressing <Alt + e>. This should bring up the following screen:


From here, right click on the list and choose the option "Analyze all modules" to invoke some of Immunity's built in analysis tools. Once the analysis completes, double click on the line with maths to get taken to the code we want to look at. You should see a picture like the one below.


At first this doesn't look too interesting, but scroll down and we will see some details that should look familiar.


Now, before we go any further, let's talk about what we are actually looking at. The large window with the multicolored text is showing us the output of a program called a disassembler. What a disassembler does is take raw bytes in a program and print them out in an easier to understand and analyze form called assembly. Modern x86 assembly is incredibly complicated and comprises over 1,000 instructions! For these challenges, though, we won't need to know anywhere near that many. To understand this assembly, the leftmost column is the address in the program that the assembly sits at, the next column shows the raw bytes of the assembly, the third column shows the human readable mnemonics of the assembly, and for some lines, there is a 4th column generated by Immunity that shows some comment that may provide you with more information. The window to the right of the disassembly shows the register values of the CPU while you're debugging the program. This is helpful to keep track of values as they move from the CPU to memory and back again. Below the disassembly and register windows are the listings for the program's .data section and stack. We don't need to go into too much detail about each of them, but the .data section is on the left and it shows data built into the program such as the strings. The bottom right window is the stack that is used by the program for various memory operations.

OK. We can see the messages that we saw in IDA and our tests before in our disassembly window. Let's review what we know so far:
  1. The program wants us to guess its secret number
  2. The name of the program is called maths
  3. If we get it wrong, the program will print out "Knew you couldn't guess it!!"
Now let's look at the assembly. We know that we need to guess a magic number. In x86 assembly, values are compared with the CMP instruction. If we look at the assembly at address 0x00401365, we see the instruction CMP DWORD PTR SS:[EBP-4],DEADBEEF. Now, this looks a little scary, but the CMP instruction performs comparison checks. This is just comparing some variable with the hexadecimal value 0xDEADBEEF. Since we see the string "Knew you couldn't guess it!!" right below the CMP at address 0x00401375, this seems like it may be a good candidate for our password. Let's try!


Hmmm...didn't work. Remember, the name of the program is actually a clue to how to solve it. If we look up a couple of instructions from the CMP, we can see some instructions that sound like math operations operating on the same variable that's used in the CMP DWORD PTR SS:[EBP-4],DEADBEEF instruction. Those math instructions are SUB, ADD, and IMUL. Working backwards from the CMP instruction, we can see that there is math being performed on the variable before it gets compared to 0xDEADBEEF. Here is a summary of what those instructions are:
  1. Read 'magic number' from user
  2. Multiply (IMUL) the 'magic number' by 0x0a at 0x00401345
  3. Add (ADD) 0x0a to the 'magic number' at 0x0040134E
  4. Subtract (SUB) 0x01 from the 'magic number' at 0x0040135F
So if we do the opposite of this, we should be able to get our original input:

(((0xdeadbeef + 0x01) - 0x0a) / 0x0a) = 0x16449317

Let's try that!


Still not right! What did we miss? There is another instruction that operates on the variable where our input is stored: SHL at 0x00401357. This instruction does a binary shift left, which effectively multiplies the number by 2 for each bit position shifted. In this case, there is a shift left of 1, so the value is doubled. Since the SHL sits between the ADD and the SUB, the matching divide by 2 has to happen right after we add the 1 back. Here is the new equation:

((((0xdeadbeef + 0x01) / 2) - 0x0a) / 0x0a) = 0xB22498B

Ok. Let's try this one.


Again it didn't work! Why not?! Well, the answer is found by looking at how the number is actually read in. Looking at 0x00401334, we can see ASCII "%d" in the comments section, and the next line has a CALL to MSVCR90.scanf_s. This means that the routine reading in the user input expects the input to be in decimal form. Converting 0xB22498B to decimal gives 186796427. Finally, let's try this.


There we go! So the answer to this one is:
Password: 186796427
Flag: FLAG{deadbeef_is_best_beef!!!}
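The arithmetic above is easy to double-check in Python. The sketch below replays the IMUL, ADD, SHL, and SUB steps on a 32-bit value and then inverts them. The function names obfuscate and invert are my own, and the inverse assumes the intermediate values do not wrap around, which the forward check confirms for this target.

```python
MASK = 0xFFFFFFFF  # keep everything in 32 bits like the x86 registers


def obfuscate(guess: int) -> int:
    v = (guess * 10) & MASK  # IMUL by 0x0a
    v = (v + 10) & MASK      # ADD 0x0a
    v = (v << 1) & MASK      # SHL by 1
    v = (v - 1) & MASK       # SUB 0x01
    return v


def invert(target: int) -> int:
    # Undo the steps in reverse order: +1, /2, -10, /10.
    return ((target + 1) // 2 - 10) // 10


answer = invert(0xDEADBEEF)
print(answer, hex(answer))  # 186796427 0xb22498b
print(hex(obfuscate(answer)))  # 0xdeadbeef
```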


Alternatives:
Now, this may seem a little complicated for people just entering the RE world. It would be much easier if we had the original source code for this program to actually figure out what's going on. Unfortunately, the process of going from a binary to source code is incredibly difficult. A decompiler will never be able to give you the original source. Too much information is lost. There are decompilers out there that can give you something resembling the source, however. Let's look at a few.

RetDec
Found here, RetDec is an online trial of a decompiler. Here is the function we are interested in once decompiled by RetDec:

// Address range: 0x401310 - 0x40138a
int32_t function_401310(char * a1) {
int32_t v1;
g2 = &v1;
printf("I bet you can't guess my secret number...\n\n");
printf("Enter your guess --> ");
scanf_s();
if (20 * g3 == -0x21524124) {
// 0x40136e
function_401240();
// branch -> 0x401383
} else {
// 0x401375
printf("\nKnew you couldn't guess it!!\n");
// branch -> 0x401383
}
// 0x401383
exit(1);
// UNREACHABLE
}

IDA Pro - Hexrays
IDA Pro also has a decompiler plugin, but it is very expensive. Here is some of its output:

//----- (00401310) -------------------------------------------------------
int __cdecl __noreturn main(int argc, const char **argv, const char **envp)
{
  int v3; // ecx@0
  int v4; // [sp+0h] [bp-4h]@1

  v4 = v3;
  printf(aIBetYouCanTGue);
  printf(aEnterYourGuess);
  scanf_s(aD, &v4);
  v4 = 2 * (10 * v4 + 10) - 1;
  if ( v4 == -559038737 )
    sub_401240();
  else
    printf(aKnewYouCouldnT);
  exit(1);
}

Snowman
Another free decompiler is Snowman. It is integrated into the also free debugger, x64dbg. Here is the function decompiled by Snowman:


int32_t g4020a4 = 0x73cd20c1;
int32_t g402098 = 0x73cd2719;
void fun_401240();

int32_t g40209c = 0x73cc2455;

void fun_401311(int32_t ecx, int32_t a2) {
     g4020a4("I bet you can't guess my secret number...\n\n");
     g4020a4("Enter your guess --> ");
     g402098("%d", reinterpret_cast<int32_t>(__zero_stack_offset()) - 4);
     if ((ecx * 10 + 10 << 1) - 1 != 0xdeadbeef) {
          g4020a4("\nKnew you couldn't guess it!!\n");
     } else {
          fun_401240();
    }
    g40209c(1);
    goto a2;
}

As you can see, the outputs from these decompilers are pretty different, and this is just for a relatively simple function. None of them looks like the original code:

// Main function
int main(int argc, char *argv[]) {
    // Integer to store user input
    unsigned int user_input;
    // Print info and read input
    printf("I bet you can't guess my secret number...\n\n");
    printf("Enter your guess --> ");
    scanf_s("%d", &user_input);

    // Obfuscate, answer == 186796427
    user_input *= 10;
    user_input += 10;
    user_input *= 2;
    user_input -= 1;
    // Check for valid input
    if (user_input == 0xdeadbeef) {
        get_flag();
    }
    else {
        printf("\nKnew you couldn't guess it!!\n");
    }

    // Exit
    exit(1);
}

But with these different outputs, it was easier to see the mathematical comparison and check for the magic number. One thing you'll notice is that each decompiler treated the final number, 0xDEADBEEF, differently. This is because type information is lost at this lower level and each decompiler uses different algorithms and heuristics to recover this information. The numbers produced by the decompilers are all technically correct, you just have to interpret it in the appropriate way.
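To see why the different constants are all the same number underneath, it helps to reinterpret them as 32-bit values. The short Python sketch below does that. The helper names as_unsigned32 and as_signed32 are made up for this example, and the constants are the ones that appear in the decompiler outputs above.

```python
def as_unsigned32(value: int) -> int:
    # Reinterpret any integer as an unsigned 32-bit value.
    return value & 0xFFFFFFFF


def as_signed32(value: int) -> int:
    # Reinterpret the low 32 bits as a two's complement signed value.
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Hex-Rays printed the constant as a signed decimal.
print(hex(as_unsigned32(-559038737)))  # 0xdeadbeef
print(as_signed32(0xDEADBEEF))  # -559038737

# RetDec folded the +10, *2, -1 steps into the constant:
# 20 * x == 0xDEADBEEF - 19, shown as the signed value -0x21524124.
print(hex(as_unsigned32(-0x21524124 + 19)))  # 0xdeadbeef
print(as_unsigned32(20 * 186796427) == as_unsigned32(-0x21524124))  # True
```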

Conclusion

These challenges were meant to get competitors thinking about different areas of cyber security. Hopefully we were successful. In my next post, I'll detail the answers for the remaining 2 RE challenges, exclusive and hashbrowns. If you have any questions or want more clarity on how to solve either of these two challenges, please don't hesitate to ask!