In this article, I’ll walk you through solving the Simple Encryptor reversing challenge from the platform HackTheBox. Since I haven’t done much in the realm of CTF or any kind of cybersecurity challenges for a while, my approach might not be perfect, and there may be some incorrect assumptions along the way. If you notice anything off, feel free to reach out, and I’ll be happy to correct it.

Please keep in mind that I’m primarily a developer with no formal experience in reverse engineering or tools like Ghidra or IDA Pro, and my understanding of assembly is quite limited.

Challenge Information#

Now that we’ve set the stage, let’s dive into the challenge itself. The description provided is as follows:

During one of our routine checks on the secret flag storage server, we discovered it had been hit by ransomware! The original flag data is gone, but luckily, we still have both the encrypted file and the encryption program itself.

Next, let’s download the challenge files. There’s only one file, Simple Encryptor.zip. After extracting it with:

unzip 'Simple Encryptor.zip'

We find a directory called rev_simpleencryptor, containing two files: an encrypted flag file named flag.enc and an executable binary named encrypt.

To start analyzing the binary, I decided to load it into GDB to get a better sense of what’s happening. I navigated to the rev_simpleencryptor directory and opened the binary in GDB with:

gdb encrypt

Since GDB uses AT&T assembly syntax by default, I switched it to Intel format by running:

set disassembly-flavor intel

Next, I began the disassembly process with:

disass main

Here’s the initial output:

0x0000000000001289 <+0>:     endbr64
0x000000000000128d <+4>:     push   rbp
0x000000000000128e <+5>:     mov    rbp,rsp
0x0000000000001291 <+8>:     sub    rsp,0x40
0x0000000000001295 <+12>:    mov    rax,QWORD PTR fs:0x28
0x000000000000129e <+21>:    mov    QWORD PTR [rbp-0x8],rax
0x00000000000012a2 <+25>:    xor    eax,eax
0x00000000000012a4 <+27>:    lea    rsi,[rip+0xd59]        # 0x2004
0x00000000000012ab <+34>:    lea    rdi,[rip+0xd55]        # 0x2007
0x00000000000012b2 <+41>:    call   0x1170 <fopen@plt>
0x00000000000012b7 <+46>:    mov    QWORD PTR [rbp-0x28],rax
0x00000000000012bb <+50>:    mov    rax,QWORD PTR [rbp-0x28]
0x00000000000012bf <+54>:    mov    edx,0x2
0x00000000000012c4 <+59>:    mov    esi,0x0
0x00000000000012c9 <+64>:    mov    rdi,rax
0x00000000000012cc <+67>:    call   0x1160 <fseek@plt>
0x00000000000012d1 <+72>:    mov    rax,QWORD PTR [rbp-0x28]
0x00000000000012d5 <+76>:    mov    rdi,rax
0x00000000000012d8 <+79>:    call   0x1130 <ftell@plt>
0x00000000000012dd <+84>:    mov    QWORD PTR [rbp-0x20],rax
0x00000000000012e1 <+88>:    mov    rax,QWORD PTR [rbp-0x28]

Now, if I’m being honest, I don’t have much experience with assembly, and I quickly realized that solving this challenge purely by reading the assembly code wasn’t going to work. However, if we look closely, we can spot a few familiar function calls like fopen, fseek, and ftell. I recognized these functions from libc, the standard C library.

Instead of trying to interpret the assembly line by line, I decided to go the decompilation route. While we can’t get the exact original source code from a binary, tools like Ghidra and IDA Pro can generate a fairly accurate approximation of the C code from the assembly.

So, I installed Ghidra on my machine, created a new project, and loaded the encrypt binary. After some analysis, Ghidra produced a decompiled version of the binary, which looked like this:

undefined8 main(void)

{
  int iVar1;
  time_t tVar2;
  long in_FS_OFFSET;
  uint local_40;
  uint local_3c;
  long local_38;
  FILE *local_30;
  size_t local_28;
  void *local_20;
  FILE *local_18;
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_30 = fopen("flag","rb");
  fseek(local_30,0,2);
  local_28 = ftell(local_30);
  fseek(local_30,0,0);
  local_20 = malloc(local_28);
  fread(local_20,local_28,1,local_30);
  fclose(local_30);
  tVar2 = time((time_t *)0x0);
  local_40 = (uint)tVar2;
  srand(local_40);
  for (local_38 = 0; local_38 < (long)local_28; local_38 = local_38 + 1) {
    iVar1 = rand();
    *(byte *)((long)local_20 + local_38) = *(byte *)((long)local_20 + local_38) ^ (byte)iVar1;
    local_3c = rand();
    local_3c = local_3c & 7;
    *(byte *)((long)local_20 + local_38) =
         *(byte *)((long)local_20 + local_38) << (sbyte)local_3c |
         *(byte *)((long)local_20 + local_38) >> 8 - (sbyte)local_3c;
  }
  local_18 = fopen("flag.enc","wb");
  fwrite(&local_40,1,4,local_18);
  fwrite(local_20,1,local_28,local_18);
  fclose(local_18);
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return 0;
}

Code Explanation#

DISCLAIMER: I will be explaining the code in-depth, so if you already understand what’s going on, feel free to skip this section.

Let’s look at the decompiled code closely and rename variables and types based on context while working out how this binary encrypts data. On the first line, we have undefined8 main(void). The undefined8 type is Ghidra’s placeholder for an 8-byte value whose real type it could not determine. Since this is main and it returns 0 at the end, we can safely treat the return type as int. We will apply the same kind of reasoning to the rest of the decompiled code.

Let’s explore what the code does:

  local_30 = fopen("flag","rb");
  fseek(local_30,0,2);
  local_28 = ftell(local_30);
  fseek(local_30,0,0);
  local_20 = malloc(local_28);
  fread(local_20,local_28,1,local_30);
  fclose(local_30);

In this block, the application first opens the original file, flag, in binary read mode and keeps the returned file handle. It then moves the file position to the end of the file using fseek(). The fseek() function accepts three arguments: a file pointer, an offset, and the origin the offset is measured from. The origin can be SEEK_SET (0, the start of the file), SEEK_CUR (1, the current position), or SEEK_END (2, the end of the file). The offset is a positive or negative number of bytes relative to that origin. Here the call uses SEEK_END with an offset of 0, so the position ends up exactly at the end of the file.

Then, we call ftell(), which accepts a file pointer and returns the current position as a long. Because the position is at the end of the file, this value is the file size in bytes. After that, the position is moved back to the start of the file by calling fseek(file_ptr, 0, 0). Next, the program allocates file_size bytes on the heap and stores the returned pointer in a variable; this serves as a temporary buffer for the file content. The application then reads the content into the buffer by calling fread(buffer, file_size, 1, file_ptr), which requests a single item of file_size bytes. Once the content is read, it closes the file handle.

So with this knowledge, we can convert our decompiled code into something more readable:

  FILE *file_ptr = fopen("flag", "rb");
  fseek(file_ptr, 0, SEEK_END);
  long file_size = ftell(file_ptr);

  fseek(file_ptr, 0, SEEK_SET);

  unsigned char *buffer = malloc(file_size);
  fread(buffer, file_size, 1, file_ptr);
  fclose(file_ptr);
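
The encrypt binary performs no error checking in this block. As a side note, here is a minimal sketch of the same fseek/ftell/fread pattern with the failure cases handled. The helper name read_whole_file is hypothetical and is not part of the challenge binary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the challenge binary): reads a whole file
 * into a heap buffer. Returns NULL on failure; on success the caller owns
 * the buffer and *out_size holds the number of bytes read. */
unsigned char *read_whole_file(const char *path, long *out_size) {
  FILE *fp = fopen(path, "rb");
  if (fp == NULL) {
    return NULL;
  }

  /* Jump to the end, ask for the position, then jump back to the start. */
  if (fseek(fp, 0, SEEK_END) != 0) {
    fclose(fp);
    return NULL;
  }
  long size = ftell(fp);
  if (size < 0 || fseek(fp, 0, SEEK_SET) != 0) {
    fclose(fp);
    return NULL;
  }

  /* malloc(0) may return NULL, so always request at least one byte. */
  unsigned char *buf = malloc(size > 0 ? (size_t)size : 1);
  if (buf == NULL) {
    fclose(fp);
    return NULL;
  }

  if (size > 0 && fread(buf, 1, (size_t)size, fp) != (size_t)size) {
    free(buf);
    fclose(fp);
    return NULL;
  }

  fclose(fp);
  *out_size = size;
  return buf;
}
```

A caller would pass a path such as "flag" plus the address of a long, and free() the returned buffer when done.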

Now, let’s move to the actual encryption part:

  tVar2 = time((time_t *)0x0);
  local_40 = (uint)tVar2;
  srand(local_40);
  for (local_38 = 0; local_38 < (long)local_28; local_38 = local_38 + 1) {
    iVar1 = rand();
    *(byte *)((long)local_20 + local_38) = *(byte *)((long)local_20 + local_38) ^ (byte)iVar1;
    local_3c = rand();
    local_3c = local_3c & 7;
    *(byte *)((long)local_20 + local_38) =
         *(byte *)((long)local_20 + local_38) << (sbyte)local_3c |
         *(byte *)((long)local_20 + local_38) >> 8 - (sbyte)local_3c;
  }

First, we get the current time, which is the number of seconds since the Unix epoch, and convert it to a uint. This value is used as the seed. But why does the program need a seed? Let’s look at how random number generation (RNG) works on a computer. Despite the name, rand() does not produce truly random numbers. It is a pseudo-random generator: a deterministic algorithm that starts from a seed and produces a fixed sequence of numbers from it. If we know the seed, we can reproduce the exact same sequence. To get a different sequence on every run, programs commonly seed the generator with something that changes each time, such as the current time in seconds, and that is what this program does. The flip side is that anyone who learns the seed can regenerate every “random” number the program used, and without the seed we would have to guess it. Now that we understand RNG and the role of the seed, let’s continue.
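
To make the reproducibility point concrete, here is a minimal sketch in C. The helper names rand_sequence and same_seed_repeats, and the length SEQ_LEN, are made up for illustration and do not appear in the challenge binary.

```c
#include <stdlib.h>

#define SEQ_LEN 8 /* arbitrary length chosen for this illustration */

/* Hypothetical helper: fills out[0..n-1] with the first n values that
 * rand() produces after seeding the generator with `seed`. */
void rand_sequence(unsigned int seed, int *out, int n) {
  srand(seed);
  for (int i = 0; i < n; i++) {
    out[i] = rand();
  }
}

/* Returns 1 if two runs with the same seed give identical sequences,
 * 0 otherwise. */
int same_seed_repeats(unsigned int seed) {
  int first[SEQ_LEN];
  int second[SEQ_LEN];

  rand_sequence(seed, first, SEQ_LEN);
  rand_sequence(seed, second, SEQ_LEN);

  for (int i = 0; i < SEQ_LEN; i++) {
    if (first[i] != second[i]) {
      return 0;
    }
  }
  return 1;
}
```

Calling same_seed_repeats() with any seed should return 1, because the C standard requires srand() with the same seed to restart the same sequence. The actual numbers can differ between C libraries, which matters later when we decrypt.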

After getting the seed number, we call the srand() function to seed the rand() function. Then, we loop through every byte of the buffer and manipulate it to perform encryption. Let’s break down how the bytes are being manipulated. Inside the loop, we first generate a random value using rand(). Then, we XOR the current byte with the lowest 8 bits of that value (the (byte) cast drops the rest). Luckily for us, XOR is its own inverse: applying the same XOR again with the same value restores the original byte.
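
As a small illustration of why the XOR step is reversible, here is a sketch with a hypothetical helper named xor_byte. It mirrors the (byte) cast from the decompiled code by using only the low 8 bits of the key.

```c
#include <stdint.h>

/* Hypothetical helper: XORs a byte with the low 8 bits of `key`,
 * mirroring the (byte) cast in the decompiled code. */
uint8_t xor_byte(uint8_t value, int key) {
  return (uint8_t)(value ^ (uint8_t)key);
}
```

Applying xor_byte() twice with the same key gives back the original byte, so decryption only needs the same rand() value that encryption used.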

After that, we generate another random number using rand() and apply a bitwise AND with 7. This keeps only the lowest three bits, so the result is always between 0 and 7. The byte is then shifted left by that amount, and separately shifted right by 8 minus that amount. Combining the two results with a bitwise OR puts the bits that fell off the left end back in on the right, which is a left rotation of the byte: no bits are lost, they just change position. This is the most interesting and important part of the encryption process. We repeat this for every byte in the buffer, meaning we encrypt the entire file’s contents. Because a rotation can be undone by rotating the other way, the whole process can be reversed, assuming we know the initial seed value.
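
The shift-and-OR combination described above can be sketched as a single function. The name rotl8 is my own choice, not something from the binary.

```c
#include <stdint.h>

/* Hypothetical helper: rotates an 8-bit value left by `count` bits.
 * The bits shifted out on the left re-enter on the right. */
uint8_t rotl8(uint8_t value, unsigned int count) {
  count &= 7; /* same masking as `rand() & 7` in the encryptor */
  /* `value` is promoted to int before shifting, so a right shift by 8
   * (when count is 0) is well defined and simply yields 0. */
  return (uint8_t)((value << count) | (value >> (8 - count)));
}
```

For example, rotating 0x81 (binary 1000 0001) left by one bit moves the top bit around to the bottom and gives 0x03 (binary 0000 0011).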

Let’s rewrite that block of code in a more readable format:

  time_t current_time = time(NULL);
  unsigned int seed_number = (unsigned int)current_time;
  srand(seed_number);

  for (long i = 0; i < file_size; i++) {
    int rand_num_xor = rand();
    buffer[i] = buffer[i] ^ (unsigned char)rand_num_xor;

    int rand_num_bitshift = rand();
    rand_num_bitshift = rand_num_bitshift & 7;
    buffer[i] = (buffer[i] << rand_num_bitshift) | (buffer[i] >> (8 - rand_num_bitshift));
  }

Much better, right? Let’s move on to the next code block:

  local_18 = fopen("flag.enc","wb");
  fwrite(&local_40,1,4,local_18);
  fwrite(local_20,1,local_28,local_18);
  fclose(local_18);

First, it opens a file handle for flag.enc, where the encrypted content will be written, using “write binary” mode. Luckily for us, the program writes the seed as the first 4 bytes of the encrypted file, followed by the encrypted buffer contents. It then closes the file handle since it’s no longer needed. Let’s make this code a bit more readable too.

  FILE *encrypted_fp = fopen("flag.enc", "wb");
  fwrite(&seed_number, 1, 4, encrypted_fp);
  fwrite(buffer, 1, file_size, encrypted_fp);
  fclose(encrypted_fp);
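
Since the seed sits in the first 4 bytes of flag.enc, it can be recovered directly from the file contents. Below is a minimal sketch with a hypothetical helper named read_seed. It assumes the bytes are stored least significant byte first, which is what fwrite produces for a 4-byte integer on x86-64, the architecture shown in the disassembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes the first 4 bytes of `data` as a
 * little-endian unsigned 32-bit seed. Assumes the file was written on a
 * little-endian machine (x86-64 in this challenge).
 * Returns 0 on success, -1 if fewer than 4 bytes are available. */
int read_seed(const uint8_t *data, size_t len, uint32_t *seed_out) {
  if (len < 4) {
    return -1;
  }
  *seed_out = (uint32_t)data[0] |
              ((uint32_t)data[1] << 8) |
              ((uint32_t)data[2] << 16) |
              ((uint32_t)data[3] << 24);
  return 0;
}
```

Decoding the bytes explicitly like this keeps the result independent of the byte order of the machine running the decryptor.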

More readable than our initial decompiled code, right? Now, let’s build an algorithm to reverse this process and retrieve the actual flag!

Solution#

Before diving into the code for the final solution, let’s first create an algorithm to reverse the encryption process to understand the flow.

Algorithm#

  1. Open the encrypted file.
  2. Read the first 4 bytes as the seed value and store it in a variable for later use.
  3. Get the file size and subtract 4 bytes (the size of the seed value), since the seed is not part of the encrypted data.
  4. Allocate a temporary buffer with malloc(file_size), store the returned pointer in the buffer variable, and read the encrypted data (everything after the first 4 bytes) into it.
  5. Seed the rand() function with srand(seed_value).
  6. Start a loop over every byte of the encrypted data.
  7. Generate two random values using rand(), in the same order as the encrypt binary: the first for reversing the XOR, and the second (masked with 7) for the bit shifting.
  8. Undo the rotation first, because it was applied last: shift the byte right by the shift amount, shift it left by 8 minus the shift amount, combine the two with a bitwise OR, and overwrite the buffer content at the current index.
  9. Finally, perform the XOR operation again and overwrite the buffer content at the current index.
  10. Repeat this for all contents and print the buffer.
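
Before touching the real flag.enc, the steps above can be sanity-checked with a round trip on an in-memory buffer. This is only a sketch under my reading of the decompiled code: encrypt_buffer is my re-implementation of the encryptor’s loop, and all four function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Rotate an 8-bit value left by `count` bits (0 to 7). */
uint8_t rotl8(uint8_t value, unsigned int count) {
  count &= 7;
  return (uint8_t)((value << count) | (value >> (8 - count)));
}

/* Rotate an 8-bit value right by `count` bits (0 to 7); undoes rotl8. */
uint8_t rotr8(uint8_t value, unsigned int count) {
  count &= 7;
  return (uint8_t)((value >> count) | (value << (8 - count)));
}

/* Re-implementation of the encryptor's loop: XOR first, then rotate left. */
void encrypt_buffer(uint8_t *buf, size_t len, unsigned int seed) {
  srand(seed);
  for (size_t i = 0; i < len; i++) {
    int xor_key = rand();
    unsigned int shift = (unsigned int)rand() & 7;
    buf[i] = rotl8((uint8_t)(buf[i] ^ (uint8_t)xor_key), shift);
  }
}

/* Reverse of the loop above: rotate right first, then XOR. */
void decrypt_buffer(uint8_t *buf, size_t len, unsigned int seed) {
  srand(seed);
  for (size_t i = 0; i < len; i++) {
    /* rand() must be called in the same order as during encryption. */
    int xor_key = rand();
    unsigned int shift = (unsigned int)rand() & 7;
    buf[i] = (uint8_t)(rotr8(buf[i], shift) ^ (uint8_t)xor_key);
  }
}
```

Encrypting a test buffer and then decrypting it with the same seed should give back the original bytes. Swapping the two rand() calls or undoing the XOR before the rotation breaks the round trip, which is a quick way to confirm that the order of operations matters.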

Code#

Now, let’s write the code. Since we were working with C all this time, we’ll implement the solution in C for consistency with the functions we’ve been using.

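#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  // Open encrypted file
  FILE *fp = fopen("flag.enc", "rb");
  if (fp == NULL) {
    perror("flag.enc");
    return 1;
  }

  // Read seed value (first 4 bytes; assumes a little-endian machine)
  uint32_t seed;
  if (fread(&seed, sizeof(seed), 1, fp) != 1) {
    fprintf(stderr, "flag.enc is too short\n");
    fclose(fp);
    return 1;
  }

  // Get file_size - seed value size (4 bytes) and allocate memory,
  // with one extra byte for a terminating '\0'
  fseek(fp, 0, SEEK_END);
  long file_size = ftell(fp) - (long)sizeof(seed);
  uint8_t *buffer = malloc((size_t)file_size + 1);
  if (buffer == NULL) {
    fclose(fp);
    return 1;
  }

  // Read encrypted file content to buffer
  fseek(fp, sizeof(seed), SEEK_SET);
  if (fread(buffer, 1, (size_t)file_size, fp) != (size_t)file_size) {
    fprintf(stderr, "failed to read flag.enc\n");
    free(buffer);
    fclose(fp);
    return 1;
  }
  fclose(fp);

  // Set seed value for rand()
  srand(seed);
  for (long i = 0; i < file_size; i++) {
    // Generate random numbers in the same order as the encrypt binary
    int rand_num_xor = rand();
    int rand_num_bitshift = rand();
    int shift_num = rand_num_bitshift & 7;

    // Reverse the rotation first, then the XOR
    buffer[i] = (buffer[i] >> shift_num) | (buffer[i] << (8 - shift_num));
    buffer[i] = buffer[i] ^ rand_num_xor;
  }

  // Null-terminate the buffer so it can be printed as a string
  buffer[file_size] = '\0';
  printf("%s\n", buffer);
  free(buffer);

  return 0;
}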

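Now save the code as decrypt.c in the same directory as flag.enc, then compile and run it. The output of rand() for a given seed depends on the C library, so this should be done on a Linux system with glibc, which is most likely what the encrypt binary used: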

gcc -o decrypt decrypt.c
./decrypt

This should print the flag in the format HTB{vRy*******************************0r}.