Thursday, May 5, 2016

PCDC 2016 RE Challenge Solutions - Part 2

This is a continuation off of my previous post detailing the solutions to the 2016 PCDC reverse engineering challenges 1 and 2. This post will go over the solutions for challenges 3 and 4. If you were a competitor in this year's PCDC competition, you may not have seen all of the RE challenges. Challenges 1 and 2 were made for high school students, challenge 3 was added for college students, and challenge 4 was added for professional day. As a result, only the professionals saw all 4 challenges. This was designed to help balance expected work load during the competition and provide increasing level so difficulty for increasing levels of skill. So let's look at the remaining two challenge solutions.

Challenge 3 - exclusive

Like with all the other challenges, let's run this one and get an idea of what it does.


So this looks a little bit like our previous challenges where we have to give it a valid input and it well tell us if that input is correct and print out the key. Let's open the challenge up in Immunity and take a look. 


After loading the binary and analyzing the modules just like the we did in the previous challenge, we can scroll down and see the instructions responsible for printing out the initial user prompt. We can also see how our input gets read into the program via the call to fgets(). The important thing to note how fgets() gets its parameters. In x86-32 bit assembly on Windows, which is what we are looking at, these parameters are passed on the stack in a standard called CDECL (C declaration). What this means is the PUSH instructions are actually setting up the parameters to the fgets() function. To understand what these parameters are, let's look at this function.

Using MSDN as a resource, we can see that fgets() has the following signature:

char *fgets(char *str, int n, FILE *stream);

What this function is doing is reading in an n byte string, storing it in a memory buffer pointed to by str, and reading the input from the file stream stream. If you look at Immunity, it even tells you which PUSH instruction is responsible for setting up which parameter to the fgets() call and in this case, it tells you it is reading a string of length 0x16 (n = 16 (22.)). Now for our purposes, the important parameter to consider is the address where this user defined input gets stored. If you looked at the link specifying the CDECL calling convention, you'll see that the first parameter listed in the function (char *str) is the last one pushed onto that stack before the call to fgets(). We don't need to understand all the instructions, we just need to see the use of DWORD PTR SS:[EBP-34]. This is going to be a pointer to where our input will be stored. If we look through the disassembly a little more, we should see this address show up again. Specifically in the 0x00401397 - 0x004013CE address range. Before we go any further, lets recap what we know:

- The program expects to get a 'serial number' which it then validates
- The program uses a call to fgets() to read in our input
- Our input is stored at DWORD PTR SS:[EBP-34]
- We see our stored address used within the assembly address range 0x00401397 0x004013CE

Now it's time to verify these. The best way to do this with a debugger is to set break points. At this point, right-click on the address 0x00401397, go down to breakpoints and select toggle. Alternatively, click on the address and press F2 to toggle the breakpoint. This is after we have entered the string to the program and before it looks like it's getting used.  With the breakpoint turned on, our program will halt at that address when execution has gotten there. So with our breakpoint set, let's run the program. This can be done 3 different ways: 1. Press F9, 2. To to the top menu and press the red play button, 3. Go to Debug -> Run. The first thing that will happen is that the program will break at what is called the program entry point. This is basically the API the program exposes to the OS so that the program can start up. There are a lot of steps that go into getting from this point to the break point we set, but we don't care about those at this time. Just press F9 again to continue. Now at this point, you should see the Windows command window. It should be waiting for you to enter in your input. Go ahead and enter is some string into this window like you did the first time you ran this challenge program.



At this point it will look like your program hangs, but this is Immunity pausing it so you can being to dive deeper into the details of what's going on. Let's analyze the following snippet of code:



If we look at the first instruction, we see a CMP to 0x16. Remember from the first post that CMP is used to make comparisons and remember from earlier in this post, that this challenge reads in a 0x16 byte string. Now look at the second instruction, JNB SHORT exclusiv.004013D0. It is a jump instruction to the address 0x004013D0. That target address is interesting, but not as interesting as the instruction before it at address 0x004013CE, a JMP SHORT exclusive.00401397. That's the same address we set our breakpoint at! This is a loop! We compare some counter to 0x22, if it is less than 0x22, continue execution, otherwise, jump. It kind of looks like:

int x = 0;
while (x < 0x16) {
    loop_body();
}

Now we haven't figured out the loop body yet. But let's step through this an instruction at a time by pressing the F7 key and stop when we get to address 0x004013A5, the first XOR instruction. If you're following along with my example and entered in abcd1234 as your string to test, you should see something like this:


Now I've added some circles to draw your attention to a few places. So the instruction we stopped at is XOR EAX, 9C. This means we are performing the XOR operations on whatever value is in the EAX register with the hex value 0x9C. I've circled the EAX register, and we can see the value of 0x61. If you recall, our input string was abcd1234, the hexadecimal value for the ASCII character 'a' is 0x61. Interesting. Let's continue to the next XOR instruction at 0x004013B9.


Again, I've highlighted the value in the EAX register, 0x62 (ASCII value for 'b'), and the other value in the XOR operation is 0xDC. If you continue to step through, you'll see that we jump back to address 0x00401397, and loop through these series of instructions again. When you inspect the EAX register, you'll see the values 0x63 and 0x64 (ASCII 'c' and 'd' respectively). It is indeed our input string. Now, we know we are in a loop. In fact, we wrote out some pseudo-code for it. So let's update it a little:

int x = 0;
while (x < 0x16) {
   input[i] ^ 0x9C;
   input[i+1] ^ 0xDC;
   x++;
}

Hmmm this doesn't seem quite right. So how is our loop being incremented and how to we know our index for our input? Let's looks at the following details:

What we're looking at is how the input string is indexed using the EDX register, and how that register is incremented by 2 every loop iteration. So let's refine our code a little more:

int x = 0;
while (x < 0x16) {
   input[i] ^ 0x9C;
   input[i+1] ^ 0xDC;
   x += 2;
}

Great. Now when we exit our loop, our input has been XOR'ed with the key 0x9CDC. A little further down we see a call to the function memcmp(). Here is the MSDN documentation for that function. What it does is look to see if 2 memory regions contain the same data for a given number of bytes. Again, Immunity has done some work for us and shows us that this memcmp() is using 2 pointers, s1 and s2, and comparing them up to 0x15 (21) bytes. Le't set a breakpoint at address 0x004013DA where memcmp() gets called. So, by looking at the previous instructions and what we learned from the loop we reverse engineered, we see that s2 is the string we entered after it has been XOR'ed. A good way to confirm this is we see that the address of this string got loaded into the EAX register with the instruction at address 0x004013D2. So what we can do is right-click on the EAX register and click Follow in Dump. This will show a hex dump of our data:


And indeed, those are our input bytes after being XOR'ed. Check the first 2 bytes; 0x6162 XOR 0x9CDC = 0xFDBE. So what we are really interested in is the memory contents of the second pointer, s1, being used in the memcmp(). We can find that by doing the same thing we did in the previous step, but with the ECX register. So let's get the contents of that dump.


This this is ultimately our goal. The 0x15 (21) bytes of this memory segment. This is what our input string gets compared to after being XOR'ed against the key 0x9CDC. So now let's update our pseudo-code:

char serial[0x15] = "\xff\xbd\xfa\xb9\xb1\xec\xad\xee\xaf\xe8\xb1\xe9\xaa\xeb\xa4\xe5\xb1\xbe\xf9\xb9\xfa"
char *input = (char *)malloc(0x16);
fgets(input, 0x16, STDIN);

int x = 0;
while (x < 0x16) {
 input[i] ^ 0x9C;
   input[i+1] ^ 0xDC;
   x += 2;
}

if (memcmp(key, input, 0x15) == 0) {
    win();
} else {
    fail();
}

Now, we know this is what happens with the memcmp() function by reading the documentation and seeing the TEST EAX, EAX instruction and seeing successful jumps over the failure messages. So the only thing left to do is solve what the input string is. If you read the documentation on XOR, you know that it is reversible. So all we have to do is XOR our serial we dumped out of memory with the key of 0x9CDC. There are plenty of online tools to do this for you. Once you do it, you'll find that the input should be: cafe-01234-56789-beef. Let's give this a try.



Success!

Password: cafe-01234-56789-beef
Flag: FLAG{XOR_Encryption_Is_Super_Safe!!!}

Challenge 4 - hashbrowns

This challenge was made specifically for the competition on professional day. As such, I'm going to assume a slightly more advanced level of readership. The last challenge solution really went in depth by stepping through the explanation. This solution write-up isn't going to go into as much detail describing how a particular section of code does something. Instead, it will be left up to the reader to figure out all the details.

So let's start by running this program like we did the ones before it.



Ok. So we are given a hash and we have to find the input that matches this hash. So the way we do this is we need to find the hashing algorithm. Opening up the executable in Immunity, we should see the following:


So with this challenge, we see that there are a couple of internal functions with the CALL instructions to addresses within the hashbrowns executable image. Notably, these can be seen at 0x00401442 and 0x00401471. Let's look at the first function starting at address 0x00401310 by right-clicking on the address and choosing the follow option.



I've gone ahead and annotated the interested aspects of this function. The first is that this is a loop and we see a call to strlen() in the loop. This leads us to believe that we are looping through a string for the length of a string. The other interesting pieces of information about this loop are the CMP instructions. We see each of them circled in the above picture and each comparison is to a hex value. The four values, 0x41, 0x5A, 0x61, and 0x7A correspond to the ASCII characters 'A', 'Z', 'a', and 'z' respectively. This loop looks like it is looking for upper and lowercase alphabetical characters. We can further confirm this by analyzing the string argument to the printf() call at 0x00401453, "Found a character I can't hash." This clue would indicate that the program only accepts alphabetical characters. Let's test this hypothesis.


We have confirmed our hypothesis. Let's move onto the next internal function at 0x00401390. Here is a picture of that function.



So again we can see can see we are looping through the length of the string character by character. This is in-fact the function that performs the hashing function. Now, there were a few ways to solve this. The difficult way was to manually reverse engineer the actual algorithm represented by the instructions at addresses 0x004013C3 - 0x004013D7, or, look at the value 0x1505 and do some research on hashing algorithms. If you convert 0x1505 to decimal, you get 5381. A Google search with the keywords hash and 5381 should lead you directly to the DJB2 hashing algorithm. Here is what that algorithm looks like.
unsigned long djb2(unsigned char *str){
        unsigned long hash = 5381;
        int c;
        while (c = *str++)
            hash = ((hash << 5) + hash) + c; 

        return hash;
}

When looking at this algorithm, we see the constant 0x1505 and the shift left 5 (SHL EDX, 5). Now we need to make sure this is actually what were after. Going back to the main part of our challenge, we see the following section of code.


What we see here is a call to our hashing function at 0x00401390 then a series of CMP instructions. Interestingly we see a CMP  to a value of 0x28C7FAE4 which is the hash value the challenge asked us to match. If we look at it a little further, we can see that this CMP instruction calls a JMP to a unique address not jumped to by any other CMP instruction. So these are the pieces we have so far:

  • The input expects a string input that will be hashed
  • That string gets hashed through the DJB2 hashing algorithm
  • The result of the hash must match 0x28C7FAE4
The way to solve this is to write a brute force script that enumerates through alphabetical strings and hashes them using the DJB2 hashing algorithm until a match is found with the value 0x28C7FAE4

This is what the DJB2 algorithm looks like in Python.

def djb2(string):
    hash = 5381
    for x in string:
        hash = (( hash << 5) + hash) + ord(x)
    return hash & 0xFFFFFFFF

The best way to solve this is to actually use a password list. Using the rock-you password list and a python script based on the previous snippet of code, you would have found that the hash value matches the input string of cyberdragoon. Here is the proof.



So here is the summary for RE challenge 4:
Password: cyberdragoon
Flag: FLAG{alls_good_when_the_hashing_is_easy}


Conclusion

I hope this year introduced an interesting new twist on the PCDC injects. If you're interested in working through the challenges on your own, I've put the executables on my GitHub account here. As always, I'm always interested in hearing feedback, so if you have anything you'd like to see improved for next year, don't hesitate to let me know. See you in 2017!





Sunday, April 24, 2016

PCDC 2016 RE Challenge Solutions - Part 1

When we sit down and plan the next year's version of PCDC, the goal is to challenge competitors with new and specific learning objectives. This year, our goal was to push more reverse engineering and incident response type challenges for the competitors to complete as part of their injects. For this, I created 4 different reverse engineering challenges. The rules were simple: Use the tools provided to you or freely available for download to analyze the challenges and get the flag. Flags were in the format 'FLAG{}' with a custom string for each challenge in-between the '{}' characters. In order to get the points for the challenge, teams must submit the input to the challenge and the correct flag.

Since PCDC is run for high schools, colleges, and professionals, the challenges were created with increasing levels of difficulty. Additionally, the competition only runs for about 8 hours and these challenges are not the only tasks the Blue Teams must complete. With that in mind, here is Part 1 of the RE solutions.

Challenge 1 - strungout


The first challenge is called 'strungout'. Let's run it and see what it does.


Ok. Looks like we need to figure out the password. Let's just guess something and see what happens.



So password wasn't the password.

In an attempt to provide some insight to the challenges, each challenge had a name that hinted at its solution. This first challenge was meant to be a gentle introduction to reverse engineering. In order to reverse engineer any target, the goal is to gather as much information about that target as you can. The defacto go to is to extract some form of human readable text, aka, strings. As it happens, the Microsoft SysInternals suite has a tool called strings that does exactly what we want. If we run strings on the challenge, we can see the human readable strings embedded in the executable.



And if we scroll down...



We see the 2 strings we saw before, "Sorry. Can haz password??" and "Ouch. Try again!". Interestingly, we see a different string that looks suspicious! "I_Love_PCDC_RE_Challenges!!!". Let's see what happens if we try this for a password.



So the answer to the first RE challenge would have been:
Password: I_Love_PCDC_RE_Challenges!!!
Flag: FLAG{playing_with_strings_iz_easy!!!}


Challenge 2 - maths

This next challenge increases the difficulty a little bit. The goal was to introduce new students to debugging and x86 assembly. For the competition, we made sure that there were copies of Immunity Debugger installed on Blue Team computers already. Additionally, the free version of IDA Pro was installed. For this write-up I'm going to be using IDA to just show off the control flow graph of a program. This helps layout a program into a diagram that makes understanding how different parts of the program get executed. The remaining details will use Immunity debugger just because it is more easily available for people to use. 

So opening up maths in IDA will produce a diagram similar to this:

What we see here is it looks like a number is being read into the program, some math is happening on it, and if it's right, we fall a specific part of the program, otherwise we get the message "Knew you couldn't guess it!!". Let's see if this is true.


Ok. Let's open this program up in Immunity debugger and figure out what's going on.


So things look a little different in Immunity, but that's OK. What we want to do first is look at the executable modules of this program. This can be done by clicking the 'e' in the top bar of the previous picture or by pressing <Alt + e>. This should bring up the following screen:


From here, right click on the list and choose the option "Analyze all modules" to invoke some of Immunity's built in analysis tools. Once the analysis completes, double click on the line with maths to get taken to the code we want to look at. You should see a picture like the one below.


At first this doesn't look too interesting, but scroll down and we will see some details that should look familiar.


Now. Before we go any further, let's talk about what we are actually looking at. The large window with the multicolored text is showing us the output of a program called a disassembler. What a disassembler does is take raw bytes in a program and print them out as an easier to understand and analyze form called assembly. Modern x86 assembly is incredibly complicated and is comprised of over 1,000 instructions! For these challenges, though, we won't need to know anywhere near that many. The understand this assembly, the left most column is the address in the program that the assembly sits at, the next column shows the raw bytes of the assembly, the third column shows the human readable mnemonics of the assembly, and for some lines, there is a 4th column generated by Immunity that shows some comment that may provide you with more information. The window to the right of the disassembly shows the register values of the CPU while you're debugging the program. This is helpful to keep track of values as they move from the CPU to memory and back again. Below the disassembly and register windows are the listing for the program's .data section and stack. We don't need to go into too much detail about each of them, but the .data section is on the left and it shows data built into the program such as the strings. The bottom right window is the stack that is used by the program for various memory operations.

OK. We can see the messages that we saw in IDA and our tests before in our disassembly window. Let's review what we know so far:
  1. The program wants us to guess its secret number
  2. The name of the program is called maths
  3. If we get it wrong, the program will print out "Knew you couldn't guess it!!"
Now let's look at the assembly. We know that we need to guess a magic number. In x86 assembly, values are compared with the CMP instruction. If we look at the assembly at address 0x00401365, we see the instruction CMP DWORD PTR SS:[EBP-4],DEADBEEF. Now, this looks a little scary, but the CMP instruction performs comparison checks. This is just comparing some variable with the hexadecimal value 0xDEADBEEF. Since we see the string "Knew you couldn't guess it!!" right below the CMP at address 0x00401375, this seems like it may be be a good candidate for our password. Let's try!


Hmmm...didn't work. Remember. The name of the program is actually a clue how to solve it. If we look up a couple of instruction from the CMP, we can see some instructions that sound like math operations operating on the same variable that's used in the CMP DWORD PTR SS:[EBP-4],DEADBEEF instruction. Those math instructions are SUB, ADD, and IMUL. Working backwards from the CMP instruction, we can see that there is math being performed on the variable before it gets compared to 0xDEADBEEF. Here is a summary of what those instructions are:
  1. Read 'magic number' from user
  2. Multiply (IMUL) 0x0a to the 'magic' number at 0x00401345
  3. Add (ADD) 0x0a to the 'magic number' at 0x0040134E
  4. Subtract (SUB) 0x01 from the 'magic number' at 0x0040135F
So if we do the opposite of this, we should be able to get our original input:

(((0xdeadbeef + 0x01) - 0x0a) / 0x0a) = 0x16449317

Let's try that!


Still not right! What did we miss! Well, we did miss something. There is another instruction that operates on the variable where our input is stored. This instruction is SHL at 0x00401357. This instruction does a binary shift left which effectively multiplies the number by 2 for the size of the shift. In this case, there is a shift left of 1, so the value is doubled. So to get our answer, we need another divide by 2. Here is the new equation:

((((0xdeadbeef + 0x01) / 2) - 0x0a) / 0x0a) = 0xB22498B

Ok. Let's try this one.


Again it didn't work! Why not?! Well, the answer is found by looking at how the number is actually read in. Look at 0x00401334, we can see in the comments section ASCII "%d" and the next line has a CALL to MSVCR90.scanf_s. This means that the routine reading in the user input is looking for the input to be in decimal form. So converting 0xB22498B to decimal is 186796427. Finally, let's try this.


There we go! So the answer to this one is:
Password: 186796427
Flag: FLAG{deadbeef_is_best_beef!!!}


Alternatives:
Now, this may seem a little complicated for people just entering the RE world. It would be much easier if we had the original source code for this program to actually figure out what's going on. Unfortunately, the process from going a binary to source code is incredibly difficult. A decompiler will never be able to give you the original source. Too much information is lost. There are decompilers out there that can give you something resembling the source, however. Let's look at a few.

RetDec
Found here, RetDec is an online trial of a decompiler. Here is the function we are interested in once decompiled by RetDec:

// Address range: 0x401310 - 0x40138a
int32_t function_401310(char * a1) {
int32_t v1;
g2 = &v1;
printf("I bet you can't guess my secret number...\n\n");
printf("Enter your guess --> ");
scanf_s();
if (20 * g3 == -0x21524124) {
// 0x40136e
function_401240();
// branch -> 0x401383
} else {
// 0x401375
printf("\nKnew you couldn't guess it!!\n");
// branch -> 0x401383
}
// 0x401383
exit(1);
// UNREACHABLE
}

IDA Pro - Hexrays
IDA Pro also has a decompiler plugin, but it is very expensive. Here is some of its output:

//----- (00401310) -------------------------------------------------------
int __cdecl __noreturn main(int argc, const char **argv, const char **envp)
{
  int v3; // ecx@0
  int v4; // [sp+0h] [bp-4h]@1

  v4 = v3;
  printf(aIBetYouCanTGue);
  printf(aEnterYourGuess);
  scanf_s(aD, &v4);
  v4 = 2 * (10 * v4 + 10) - 1;
  if ( v4 == -559038737 )
    sub_401240();
  else
    printf(aKnewYouCouldnT);
  exit(1);
}

Snowman
Another free decompiler is Snowman. It is integrated into the also free debugger, x64dbg. Here is the function decompiled by Snowman:


int32_t g4020a4 = 0x73cd20c1;
int32_t g402098 = 0x73cd2719;
void fun_401240();

int32_t g40209c = 0x73cc2455;

void fun_401311(int32_t ecx, int32_t a2) {
     g4020a4("I bet you can't guess my secret number...\n\n");
     g4020a4("Enter your guess --> ");
     g402098("%d", reinterpret_cast<int32_t>(__zero_stack_offset()) - 4);
     if ((ecx * 10 + 10 << 1) - 1 != 0xdeadbeef) {
          g4020a4("\nKnew you couldn't guess it!!\n");
     } else {
          fun_401240();
    }
    g40209c(1);
    goto a2;
}

As you can see, the outputs from these decompilers are pretty different and this is just for a relatively simple function. None of which, look like the original code:

// Main function
int main(int argc, char *argv[]) {
// Integer to store user input
unsigned int user_input;
// Print info and read input
printf("I bet you can't guess my secret number...\n\n");
printf("Enter your guess --> ");
scanf_s("%d", &user_input);

// Obfuscate, answer == 186796427
user_input *= 10;
user_input += 10;
user_input *= 2;
user_input -= 1;
// Check for valid input
if (user_input == 0xdeadbeef) {
get_flag();
}
else {
printf("\nKnew you couldn't guess it!!\n");
}

// Exit
exit(1);
}

But with these different outputs, it was easier to see the mathematical comparison and check for the magic number. One thing you'll notice is that each decompiler treated the final number, 0xDEADBEEF, differently. This is because type information is lost at this lower level and each decompiler uses different algorithms and heuristics to recover this information. The numbers produced by the decompilers are all technically correct, you just have to interpret it in the appropriate way.

Conclusion

These challenges were meant to get competitors thinking about different areas of cyber security. Hopefully we were successful. In my next post, I'll detail the answers for the remaining 2 RE challenges, exclusive and hashbrowns. If you have any questions or want more clarity on how to solve either of these two challenges, please don't hesitate to ask!

Tuesday, June 30, 2015

Review of SANS SEC760 - Advanced Exploit Development for Penetration Testers

A little over a week ago I wrapped up taking SANS Advanced Exploit Development for Penetration Testers (SEC760) at SANSFire 2015 in Baltimore, MD. This class is a 6 day long class that covers advanced Linux exploitation, patch-diffing and one day exploits, Windows kernel exploitation, advanced Windows client exploitation, and a variety of other topics. The end of the class, in typical SANS penetration testing class fashion, culminates in a capture the flag (CTF). So far I think this has been the best SANS class that I've taken. Prior to SEC760, I took SEC560, and SEC660 (course review here). The material is extremely technical, relevant, educational, and interesting. The course's author and instructor, Stephen Sims, has really put a lot of time and effort into making this class very challenging yet approachable.

Day 1 - Threat Modeling, Reversing, and Debugging with IDA

Day 1 of the class starts you off with a dive into the world of threat modeling. I'm sure that many people who end up taking SEC760 looking to learn how to advance their exploit writing game aren't exactly interested in talking about threat modeling. I was one of those individuals, but I completely see the value of threat modeling from the stand point that you as an individual can provide an extremely valuable resource to your company when correctly implementing processes like the Microsoft Security Development Lifestyle (SDL). Not to mention, an individual that can provide effective threat modeling stands to make decent money with that skill.

Following the threat modeling, the technical aspects of the class begin to pick up. The rest of the first day is spent on going over the process of reverse engineering binaries and becoming familiar with IDA. For those who haven't used IDA, this day provides you with the opportunity to become familiar with the interface and capabilities of the tool. I should caution, however, that if you've never used IDA, you will not be able to take the first day to learn it. IDA is an extremely complex tool that requires a lot of time to learn. Having experience with other debuggers such as OllyDBG, Immunity, or x64dbg certainly helps (and is expected for this class), but there is no replacement for actual experience.

Day 2 - Advanced Linux Exploitation

Moving on to Day 2 of the course, the class explores advanced exploitation concepts for Linux. The first topic of the day is focused on getting the students more use to heap based exploitation techniques by learning about the original exploitable flaw in the dl-malloc unlink() macro implementation. From here, the day progresses to learning about exploitation examples in custom memory management implementation, function pointer overwrites, and format string vulnerabilities.

I personally would have liked to have seen more Linux exploitation topics, but given the focus on modern exploit development, I understand why Stephen decided to focus more on Windows exploitation for days 3 - 5. To be fair, a lot of the vulnerability and exploit categories taught on days 3 and 5 can be applied to Linux, albeit with some modifications. This opinion is nothing more than a personal bias.

Day 3 - Patch Diffing, One-Day Exploits, and Return Oriented Shellcode

Day 3 starts off with a refresher into ROP based exploit development (a pre-requisite for the class). In these refreshers, you use ROP to disable exploit mitigations. With return oriented shellcode, you learn how to make all of your shellcode execute via a series of ROP gadgets. This has the added benefit of being able to bypass certain exploit mitigations even in instances where common bypass techniques are not available. 

From Day 3 on, Stephen puts a lot of focus on the art of patch diffing. Patch diffing is the process of looking at an original un-patched binary, and comparing the old version with a newly patched version. By using these diffing techniques, it makes it easier for a researcher to pin-point the area within the binary that changed. This is extremely helpful in developing 1-day exploits. For this part of the class, I used a new tool, Diaphora, to perform my diff-analysis. Diaphora worked out great for me. Diaphora is able to perform patch diffs at the control flow graph, assembly, and pseudo-source code levels (if you have the Hex-Rays decompiler plugin). And the best part, Diaphora is free, open source, and actively maintained.

Day 4 - Windows Kernel Debugging and Exploitation

Moving along the Windows exploitation ecosystem, Day 4 focuses purely on Windows kernel debugging and exploitation. I really enjoyed this section of the class. A recommendation I would make for those who are considering taking this class: if you take this class, make your life a little easier and install Windows as your host OS and use VirtualKD by SysProgs to communicate with your target Windows virtual machine. This class has a lot of moving parts and the more you can simplify those pieces, the more time you can spend actually working on the exercises and by consequence, the more you learn.

This is the day when students will become much more familiar with Microsoft's WinDBG. Prior to taking this class, I had never touched WinDBG. Although I found the learning curve rather steep, I came to appreciate the power provided by the debugger. Day 4 culminates in an exercise where you will learn how modern privilege escalation works on Windows via a technique known as token stealing.

Day 5 - Windows Heap Overflows and Client-Side Exploitation

Day 5 of the class zeros in on the area of advanced Windows client side exploitation. After getting a deep dive into the internals of Windows heap management, the entire day is spent learning how to find, identify, confirm, and eventually, exploit modern use-after-free bugs. After going through this day you will be equipped with the knowledge of how to perform modern exploit development on Windows 7 and 8 on applications such as Internet Explorer. Additionally, Stephen goes over the next steps to take when developing exploits for Windows 8.1, exploits that bypass Microsoft's EMET, and exploit tricks that still work with the latest preview version of Windows 10.

I will caution the reader, however, that during my time taking the class, we didn't actually perform any exercises on exploit development for Windows 8.1 or above, and for good reason. Even with the normal class hours of 9am-5pm and the extended boot-camp hours, there simply isn't enough time to go from just learning what a use-after-free exploit is to developing exploits that work on all versions of Windows and making those exploits bypass any new exploit mitigations that may be in place. Overall, Day 5 is where you really begin to have all the pieces from earlier in the week align. I found this day to be the most interesting and I'll definitely be returning to the material in order to ensure I've mastered the techniques presented.

Day 6 - Capture the Flag

For the final day of the class, the class is split into teams to compete against each other in a Jeopardy style CTF. There are 16 challenges with a total of 4000 points up for grabs. I won't go into detail of the challenges themselves, but the competition requires your team to exercise all of the skills gained through the previous 5 days. In the end, our team successfully completed 14 out of the 16 challenges to earn a score of 3200. Although the game was very competitive, 3200 points was enough for our team to come out on top. As a result, we earned the coolest coin offered in the SANS series of challenge coins. The 760 skull coin.

Team ROPNOP Wins the Skull Coin

Conclusion

Overall, I loved this class, It was an absolutely great experience. I encourage any individual looking to take their exploit development skills to the next level to take this class. I will warn you, however, that this is not an easy class, You can expect to be up late into the night after each day of class if you want to finish all of the exercises. Go into this class expecting to work hard, get frustrated, and exercise your patience. This is not a class for beginners. If the term Return Oriented Programming (ROP) is something you aren't familiar with, GDB is a foreign and feared tool, the heap sounds scary, and kernel debugging is shrouded in a fog of wizardry, you may want to consider taking SANS SEC660 first. Otherwise, embrace the challenge and I'm sure you'll finding it very rewarding!

Sunday, May 31, 2015

Palmetto Cyber Defense Competition - 2015

Last month my fellow employees and I wrapped up our third annual Palmetto Cyber Defense Competition (PCDC). Inspired by the Collegiate Cyber Defense Competitions, PCDC has been a computer security competition for high schools and colleges in South Carolina. I've written about the event itself in the past PCDC-2014 so I won't go into detail about what the event actually is. Those details can be found at the official PCDC site. Instead I wanted to focus on a couple of the things that make PCDC unique, lessons learned from putting on a computer security competition, and where we are going in the future. For those of you looking for a detailed Red Team write up, that can be found here. I took the time after the competition to create an in-depth analysis of the Red Team's process for those looking to learn from their mistakes at this year's competition. The rest of this blog post is a few of my thoughts from an organizer's perspective and not from the perspective of the Red Team lead.

First off, PCDC is unique in that for the first two years, both high schools and colleges competed. Each group had their own dedicated competition day which meant that the PCDC competition was actually run twice. This means that after the first day, all the machines must get reset, re-imaged, and configured slightly different for the next day. This year, we had a third day added; a professional day. For the first time, PCDC was to be run three times. The schedule of events was as follows: first day was high school, second day was college, and the third day was for professionals. For the professional day, we had groups representing a mixture of government and private industry. From the government side teams were comprised of members from the 24th Air Force and U.S. Cyber Command while the industry teams had members from Scientific Research Corporation (SRC), and SPARC. All in all, we had 4 government teams and 4 teams from industry.

Our goal from the beginning was to design the competition network to be believable and realistic. Since it was not known to us during the planning and design phase of the competition that there would be a professional day, this year's theme was based around an online game development company. Each of the Blue Teams would be responsible for making sure their development company continued to function through the course of the day and deliver the latest version of their game to their user base through audience accessible tablets connected to each Blue Team via a dedicated wireless connection. One of the things that we quickly realized was that our ambitions greatly eclipsed the amount of time we had available to create the network. Remember, a decent portion of our time goes into infrastructure development so that the competition can be rerun the next day. To add to that pressure, we do not have control over the facility in which the competition is hosted. As a result, or preparation time from the end of one day the beginning of the next is usually around 3 hours.

To put it simply, there was a lot of different attributes that we wanted to include into this year's competition, but we ran out of time. One of the biggest things we feel these types of competition lack when trying to simulate real world networks is realistic user activity. This year we attempted to remedy that by developing simulated users. We got all the code developed and tested for the user simulators, but due to a hardware failure, we were unable to deploy them to the competition network this year. In the interest of education and sharing, I have opened sourced the code for the user simulators on GitHub. We'd really like to hear back from anyone that is doing something similar.

A few of us have been involved in multiple CCDCs and PCDCs and every year, we make the comment that the scoring system needs to be altered. Although this was in the works before this year's competition even took place, we haven't had time to finalize what fixing the scoring means. At this point, I think we have a much better idea of how we are going to fix the scoring. To highlight why scoring is such an issue, I want to talk about how winning Blue Teams typically approach this competition. Within the first few minutes, the strategy includes removing all unnecessary accounts, changing all the default passwords, and for some, unplugging their entire network while they continue to harden. Now, from a strategic perspective with the goal to win a game in mind, I can't argue with this approach. The issue that I do have, however, is that this leaves the networks in a pretty unrealistic state.

Each Blue Team is given an information packet at the beginning of the competition. In that packet includes the names of the accounts that the automated scoring engine will use to log into their systems and perform service checks to make sure the Blue Teams still have their services up and running. Once these account names have been identified, the Blue Teams will delete every other account off the workstation or server. This means you could have 3 or 4 domain joined Windows workstations with zero user accounts and only the scoring engine account. It is important to note, that the Red Team is not allowed to leverage the scoring engine account to gain access to the Blue Team's networks. It's also not realistic. A computer security competition should force the students and competitors to perform real security tasks with the presence of real users. Now, since this is a competition and getting that many unbiased volunteer users is unrealistic, we need simulated users.

Other areas where improvements to scoring need to be made is in the way the scoring engine actually evaluates successful checks. Up to this point, a common service to check for is a functioning MySQL database. Typically the scoring engine will login to the MySQL database, make a query, and check for a specific key-value pair or for the presence of a specific table. This simply isn't good enough. For a real company, the database needs to have constant transactions generated by realistic activity. Right now, Blue Teams get away with making a backup of the database in the beginning of the day and just restoring it anytime the Red Team deletes the database. As long as the Blue Team restores the database in between scoring engine rounds, the scoring engine gives that Blue Team a perfect service check score. Now, this is slightly cured by the fact that the Red Team reports incidents to the Gold Team and the Gold Team can decide to take away points, but these types of scenarios need to have bigger impact on the 'day to day' operations of the Blue Teams' networks.

Where we plan to go in the future will attempt to combine 3 facets of scoring. The first being financial. The scoring engine will no longer add points, but will score the Blue Teams' companies in a financial sense. The second facet is from an internal employee and systems perspective. Employees must be able to perform their job related duties and interdependent systems must be able to communicate with each other. Finally, the third facet is from the Red Team. This year we tried something new by giving the Red Team specific targets/flags to capture when we gained access to the the Blue Teams' networks. This included the credit card numbers in their customer database, the source code to their latest game, and a few other  things.

Now, I know some people will argue that the CCDCs and even PCDC already take these things into consideration with the scoring engine, but we argue, it is not taken into account enough. The example we like to use is the one where the Blue Team unplugs their network. Now sure, they aren't getting any points from the scoring engine, but in the real world, you can't just go and unplug your entire company from the network. Not only would you be losing sales, but you're paying your employees to do a job that they can't accomplish. And not to mention, the security or IT department has no authority to make that type of decision.

We have thought long and hard about the scoring and we think we have something new and exciting for next year. I don't want to give away too much here until things are more settled. Additionally, we want to find a way to make the audience understand what is going on. PCDC is free and open for the community to come in and view. This year we attempted to show what was taking place by visualizing the the traffic between the Blue Teams and Red Team in real time. I wrote the code for this and am also releasing it on  GitHub. You can see a video demo of it on YouTube.

We have a lot of exciting things planned for next year's PCDC! Stay tuned for more, and if you have any feedback from this year's we'd love to hear it.