blog feature image

NSA Codebreaker Challenge 2022 Task 9 Writeup

Zaid Khaishagi

Mar 09, 2023

Task 9 - The End of the Road - (Cryptanalysis, Software Development) Points: 5000

Description:

Unfortunately, looks like the ransomware site suffered some data loss, and doesn't have the victim's key to give back! I guess they weren't planning on returning the victims' files, even if they paid up.

There's one last shred of hope: your cryptanalysis skills. We've given you one of the encrypted files from the victim's system, which contains an important message. Find the encryption key, and recover the message.

Downloads:

Prompt:

Enter the value recovered from the file

Solution

We're working with the same things here as in Task 8, we have to further reverse the binary to find out how the keys are being generated and see if we can generate the same key again for our particular case.

Let's first start by using the key-encrypting-key to decrypt all the keys already in the DB. We know that the key we want isn't in there but let's see if it tells us something about any pattern in the keys.

I have a script that does this in decrypt_enc_key.py. It has functions to decrypt an inputted encrypted key. If we do this, we see this kind of an output:

b"eR/+3\t\x8c'\x89N\xb3\xc8\xe0\x0cr\xb4ee6ca29c-6e0d-11eb-b008-1762ad81\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10"

The first 16 bytes are the IV that is prepended to the ciphertext. The following bytes are the plaintext. We see that the bytes after that are these:

ee6ca29c-6e0d-11eb-b008-1762ad81\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10"

We know from using the keyMaster binary that the output gives us a plainkey like this:

{"plainKey":"1aa3bc56-61e3-11ed-b61d-9cb6d0b8","result":"ok"}

So, based on that knowledge, we can say that in the plaintext, the first segment is ee6ca29c-6e0d-11eb-b008-1762ad81 followed by some padding \x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10\x10. I don't know why this padding was added, but it doesn't matter because we can see the plainkey.

The decrypted plainkeys from the DB look like this, as a sample:

b'ee6ca29c-6e0d-11eb-b008-1762ad81' ,
b'55e8e185-7102-11eb-b008-1762ad81' ,
b'507a2975-ac5b-11eb-b008-1762ad81' ,
b'cb711e51-2eb8-11ec-b008-1762ad81' ,
b'b983c01c-83b6-11eb-b008-1762ad81' ,
b'c9e2eb3d-660c-11ec-b008-1762ad81' ,
b'd3d6dd42-fcd7-11eb-b008-1762ad81' ,
b'9f70a18a-8b53-11eb-b008-1762ad81' ,
b'69f603ca-326e-11ec-b008-1762ad81' ,
b'948a4b91-34c7-11ec-b008-1762ad81' ,
b'0c43f9dc-14f5-11ec-b008-1762ad81' ,
b'e46c311e-3fcd-11ec-b008-1762ad81' ,
b'444fea07-ea53-11eb-b008-1762ad81' ,
b'3da1dfc2-997f-11eb-b008-1762ad81' ,
b'd87a2502-f901-11eb-b008-1762ad81' ,
b'0094e01e-7afa-11eb-b008-1762ad81' ,
b'362a31e1-b43a-11eb-b008-1762ad81' ,
b'85fd1300-7269-11eb-b008-1762ad81' ,
b'3638e475-c858-11eb-b008-1762ad81' ,
b'c0a6e128-6afd-11eb-b008-1762ad81' ,
b'd3f8e59e-eb71-11eb-b008-1762ad81' ,
b'd36733d9-4c74-11ec-b008-1762ad81' ,
b'76982f47-886d-11eb-b008-1762ad81' ,
b'27a0d6b8-c7a3-11eb-b008-1762ad81' ,
b'd9ad7bc7-6059-11ec-b008-1762ad81' ,
b'7e12a611-ab78-11eb-b008-1762ad81' ,
b'c054ddba-9127-11eb-b008-1762ad81' ,
b'a6fd726b-76aa-11eb-b008-1762ad81' ,
b'4d7ba3f5-8f25-11eb-b008-1762ad81' ,
b'6d82c0a8-3784-11ec-b008-1762ad81' ,
b'6ca66cab-63cf-11ec-b008-1762ad81' ,
b'6bd2a0de-4ee5-11ec-b008-1762ad81' ,
b'adf62e45-2835-11ec-b008-1762ad81' ,
b'c45ce77a-fc33-11eb-b008-1762ad81' ,
b'70c84119-f544-11eb-b008-1762ad81' ,
b'287edf21-55ab-11eb-b008-1762ad81' ,
b'6b011e4a-75e4-11eb-b008-1762ad81' ,
b'efc158d1-11d3-11ec-b008-1762ad81' ,
b'04f3cd8f-6377-11eb-b008-1762ad81' ,
b'643e21a6-ba57-11eb-b008-1762ad81' ,
b'f510223a-3250-11ec-b008-1762ad81' ,
b'b18806e1-a8f8-11eb-b008-1762ad81' ,
b'fa3f4cfd-8ff8-11eb-b008-1762ad81' ,
b'6e0bd557-5df1-11eb-b008-1762ad81' ,
b'ee543ffd-e975-11eb-b008-1762ad81' ,
b'f7310de8-4df1-11eb-b008-1762ad81' ,
b'9a5f0eca-5465-11ec-b008-1762ad81' ,
b'384f7073-f896-11eb-b008-1762ad81' ,
b'adee3117-7d07-11eb-b008-1762ad81' ,

We can notice something interestinh here. Out of the 5part delimited by a '-', the last 2 parts remain the same and the 3rd part also almost always remains the same. The 3rd part is always either 11eb or 11ec.

Let's goback into Ghidra where we found the plainkey being generated. This was the function main.DchO32CDDK0 as we saw in Task 8. I had renamed this to main.DchO32CDDK0_generate_plainkey_using_clockseq for reasons we'll see inside this function.

Inside this function, we know that the plainkey is being generated so let's see what's happening inside. The function makes a call to os.Getenv() to get an environment variable. In Ghidra:

005b93cd bb 0e 00        MOV        EBX,0xe
            00 00
005b93d2 e8 09 b9        CALL       os.Getenv
            f1 ff
005b93d7 e8 a4 78        CALL       strconv.Atoi
            ec ff
005b93dc 48 85 db        TEST       RBX,RBX

Breaking at 0x005b93d2 in gdb and looking at arguments we see:

 RAX  0x6c4ae5 ◂— 0x45535f4b434f4c43 ('CLOCK_SE')
*RBX  0xe
 RCX  0x0
 RDX  0xaafc

The argument is in rax.

pwndbg> x/s 0x6c4ae5
0x6c4ae5:       "CLOCK_SEQUENCEGC assist waitGC worker initMB; allocated ...

So, the string for the environment variable name is "CLOCK_SEQUENCE". It then calls strconv.Atoi() which converts the value from a string to an integer.

After this, we can see in the disassembly that it moves -0x1 into rax and rcx as the argument and then calls the function github.com/google/uuid.SetClockSequence.

005b93dc 48 85 db        TEST       RBX,RBX
005b93df 48 c7 c1        MOV        RCX,-0x1
            ff ff ff ff
005b93e6 48 0f 45 c1     CMOVNZ     RAX,RCX
005b93ea e8 31 34        CALL       github.com/google/uuid.SetClockSequence

The documentation can be found here: https://github.com/google/uuid/blob/master/time.go It says for this function that "SetClockSequence sets the clock sequence to the lower 14 bits of seq. Setting to -1 causes a new sequence to be generated." From this we can understand that the function generates a new clock sequence value because the argument is -0x1.

Next, what happens is the function github.com/google/uuid.NewUUID is called immediately after it. Inside this function, it makes calls to github.com/google/uuid.GetTime() and github.com/google/uuid.setNodeInterface(). The function ``github.com/google/uuid.NewUUID` uses these to make a new uuid and then return it. So, the plainkeys are just uuids generated by the binary.

About how clock sequence is used in generating UUIDs: https://stackoverflow.com/questions/41475842/what-does-clock-sequence-mean We see that clock sequence has nothing to do with actual time and is more like a counter or random number. We also see that UUID consistes of 3 things:

  • clock sequence
  • timestamp
  • place

An online resource which explains how UUIDs are generated: https://digitalbunker.dev/understanding-how-uuids-are-generated/ There is also an online utility that I found which helps decode UUIDs: https://www.uuidtools.com/decode.

Using these, we can see that the node ID is the MAC address of the device that is generating the UUID, the clock sequence is a number, and there is a timestamp.

The function github.com/google/uuid.GetTime() can be found here: https://github.com/google/uuid/blob/512b657a42880af87e9f0d863aa6dccf3540d4ba/time.go We see that the time returned by this is the current time but it's not a unix timestamp. It's a lillian timestamp.

// GetTime returns the current Time (100s of nanoseconds since 15 Oct 1582) and
// clock sequence as well as adjusting the clock sequence as needed.  An error
// is returned if the current time cannot be determined.
const (
	lillian    = 2299160          // Julian day of 15 Oct 1582
	unix       = 2440587          // Julian day of 1 Jan 1970
	epoch      = unix - lillian   // Days between epochs
	g1582      = epoch * 86400    // seconds between epochs
	g1582ns100 = g1582 * 10000000 // 100s of a nanoseconds between epochs
)

The GetTime() function calls getTime() internally and inside that, it calculates the current time as:

	now := uint64(t.UnixNano()/100) + g1582ns100

So, now we know how the timestamp is generated for use in an UUID. For the clock sequence, we can decode the UUIDs from the decrypted keys and see what value was being used. From the above blog explaining UUID construction, we know that the first 3 segments have the lillian timestamp and the version in the 3rd segment; and the 4th segment has the variant and clock sequence; and finally the last segment is the Node ID.

TimeLow + TimeMid + TimeHighAndVersion + (ClockSequenceHiAndRes && ClockSequenceLow) + NodeID

uuid img

We can decode one of the UUIDs from before to get info from it about clock sequence used and the Node ID. The UUID will need to extended with a few zeroes: ee6ca29c-6e0d-11eb-b008-1762ad810000

  • Standard String Format: ee6ca29c-6e0d-11eb-b008-1762ad810000
  • Single Integer Value: 316920329201540034674735145905626873856
  • Version: 1 (time and node based)
  • Variant: DCE 1.1, ISO/IEC 11578:1996
  • Contents - Time: 2021-02-13 15:12:47.865922.8 UTC
  • Contents - Clock: 12296 (usually random)
  • Contents - Node: 17:62:ad:81:00:00 (local multicast)

We have the clock sequence and the Node ID from this. However, we don't really need it because the as we saw earlier, the last 2 segments don't change at all , so we can just copy them directly without needing to parse them.

Now that we know everything that goes into making a UUID, we can forge the correct UUID for our case to use as the plainkey. What we need is:

  • Clock sequence: Known from decrypted keys in DB
  • Node ID: Known from decrypted keys in DB
  • Timestamp (lillian): We don't know the value yet, but we know where it should go in order to form the UUID key.

Now, we need to find the time when the keys were generated. If we look in the keygeneration.log file, we can see that there are timestamps in it which tell us when the log was added. If we look at the corresponding decrypted UUID and extract the timestamp from the UUID, we see that it is slightly different with some offset. For example, the first row in the log has the following values:

  • Date in keygeneration.log: 2021-01-03T13:32:10-05:00
  • Encrypted key: cVU+CyY2OAZa9YO5kyyXmIANfiTv2TJ0NwYLQu6SXLhd1QrJIOnlNGAAz7PzBoYrXRlgM9pddbP2S6v/8BDsHQ==
  • Decrypted UUID: f7310de8-4df1-11eb-b008-1762ad81
  • Timestamp extracted from UUID (lillian at 10**-7 sec precision): 138289915194576360
  • UUID timestamp, converted to unix (10**-7 sec precision): 16096987194576360
  • Difference (10**-7 sec precision)): 105423640.0

    Note that these values seem to change depending on what timezone you're in. So, you might have to adjust for that if you're not in the same timezone as the GMT-05:00 which is used in keygeneration.log. It would be by some number of hours, but the seconds-level offset should be the same.

In this way, we can look for the maximum offset and the minimum offset between the time mentioned in the log file and the time extracted from the UUID.

  • max: 118353370.0 (100ns precision) ~11.8 sec
  • min: 61562742.0 (100ns precision) ~6.1 sec

Now, we have an idea of how much offset there is between UUID generation (the timestamp that gets used in the UUID) and then inserting the log into the keygeneration.log file. We already know that the DB doesn't have the key we need but if the keygeneration.log file has a record of approximately when the key was generated. This gives us an idea of the timestamp to use in forging the UUID key we need.

We can try to generate all the possible UUIDs are between the maximum and minimum offset we found and add a few seconds on each side of the bounds as margin. We can bruteforce decryption of important_data.pdf.enc using these generated UUIDs. If the decrypted file has teh PDF header %PDF at the start, then we know that the decryption was successful and we can stop bruteforcing.

However, we will first need to figure out the encryption used to encrypt the file so that we know the correct decryption technique to apply with all the UUID keys.

Where can we find this information? Task 9 just has the pdf file we need to decrypt; Task 8 had the keyMaster binary and some DBs; Tasks 6 and 7 were web exploits; Task 5 was about reversingt the ssh-agent binary; Tasks B2 and B1 were again web analysis and exploiting; Task A1 just had a vpn log; and finally Task A2 had a pcap file and file on the root home dir. Task A2 seems like a promising place to look for the information we need so let's look there.

If we analyse the pcap we decrypted earlier closely again, we can find that there is a HTTP GET request made and that there is a file being transferred. This file is called tools.tar. The way to download it is this:

  • Follow TLS stream from the GET request for the file, then filter for the response packets (instead of entire conversation), then save the raw to a file. This would have the HTTP response header included in it, so open in text editor and just delete that starting part so that it starts at tools/

We now have the tools.tar file. Let's try and decompress it. If we try to decompress it, it's going to show errros and give corrupt files. (I spent a lot of time trying to fix it and get it to decompress). This is because the file isn't actually complete. The pcap ends before all the packets for this file are sent. We still have the partial file though. Let's just open the file in a text editor and see if we can make any interesting observations.

If we search for tools/ inside it, we can find that there are several files:

  • tools/busybox
  • tools/ransom.sh
  • tools/openssl

We can find the ransom.sh file if we look for the it by looking at the text nearby where we found tools/ransom.sh mentioned or if we search for #!/bin/sh. We see the script:

read -p "Enter encryption key: " key
hexkey=`echo -n $key | ./busybox xxd -p | ./busybox head -c 32`
export hexkey
./busybox find $1 -regex '.*\.\(pdf\|doc\|docx\|xls\|xlsx\|ppt\|pptx\)' -print -exec sh -c 'iv=`./openssl rand -hex 16`; echo -n $iv > $0.enc; ./openssl enc -e -aes-128-cbc -K $hexkey -iv $iv -in $0 >> $0.enc; rm $0' \{\} \; 2>/dev/null

busybox is a binary that should already be install in Ubuntu so we can directly use that.

For the IV, we see that it is generating random 16 bytes and then saving them to the $iv variable and then adding them to the start of the file $0.enc in below snippet. Here, $0.enc is whatever file is being encrypted, and the script searches for and ecrypts all the pdf, doc, docx, xls, xlsx, ppt, and pptx files that it can find.

iv=`./openssl rand -hex 16`; echo -n $iv > $0.enc;

The hexkey is taken from user input, xxd is used to convert each character into hexadecimal byte, and then truncated to length 32. This forms the $hexkey variable (try it yourself with a dummy value in $key such as a sample UUID).

Now, the encryption is done using the $hexkey and the $iv. The encryption is AES-128 in CBC mode, the input is the file referred by $0 and the output is appended to $0.enc. After this, the original file is deleted with rm.

openssl enc -e -aes-128-cbc -K $hexkey -iv $iv -in $0 >> $0.enc; rm $0

So, now we know what the encryption is. We can extract the IV by taking the first 32 characters and parsing the hexstring into bytes. We can form the key by taking the UUID and then truncating it to have only 16 characters since each character-byte is represented using two hexadecimal digits giving a length of 32 when presented as a hexstring. We then form a bytestring from the truncated UUID, giving us the key. Next, we can use Python's Cryptography library to make an AES-128 cipher and use it to decrypt the file. The script for bruteforcing is in enumerate_uuids_and_decrypt.py.

We let the script run until we find the result.

The result is:

  • d5ba9787-f88b-11ec-b008-1762ad81 is the UUID key.
  • The AES cipher uses this part of it: b'd5ba9787-f88b-11'
  • The decrypted file is in important_data.pdf
  • The answer mentioned in the pdf which we can submit to the challenge is: eaVVbNrPhOYeOaOFBqEMTwQg49UQogff

NSA CBC 2022 Solved !!

Answer

eaVVbNrPhOYeOaOFBqEMTwQg49UQogff

Congratulations! You've completed the 2022 Codebreaker Challenge!