Reverse Engineering Pokemon NDS save files - LostMyPlaintext

Partially inspired by [LiveOverflow's video series] on Pokemon Red I decided to revisit a few classics from my own childhood: the 4th generation Pokemon main game titles. Mainly I really wanted to replay these games using whatever Pokemon I wanted and not just the ones availiable at certain parts of each game so from the beginning my main goal was to find a way to edit a Pokemon's species. I'll be mostly focusing on [Pokemon Platinum version] but most of this information applies to all main games from the same generation (only offsets differ).


Content:


Starting Point

Before doing anything remotly exiting I obviously needed to be able to run the games on my computer. I decided to go with the DeSmuME emulator as it was the easiest to get to run properly on my OS of choice. (Disclaimer: I do own physical copies of these games but if you're going to download them either way at least make sure you get them from a reputable source).

Getting Started With Reverse Engineering:
First I started two new save files, naming my character "AAAAAAA" in the first one and "BBBBBBB" in the second

Then opened both files in a hexeditor (actually I used vimdiff and the xxd command but you get the idea) and checked for differences

In the "A" file we find the sequence of bytes "2b01" seven times in a row and in the "B" file we find the sequence of bytes "2c01" seven times in a row. Considering that our character's name is seven characters long and that the number 2c comes after the number 2b, it seems highly likely that we found the player name within the save file. But if that's the case why are characters encoded in two bytes? What is that "01" for? I later found out through trial and error that it's a sort of region encoding. For example if you were to change "01" to "00" you'd get japanese characters instead of the latin alphabet. Nice so if we were to change all the "2b"s to "2d"s in the first file, our name in game would change to "CCCCCCC" once we load the save file... right...? Well...

... not exactly.


Save file memory blocks and Bypassing CRC checksum

Save file memory blocks:
Before we adress this issue a quick explanation of how information is laid out within the file. Save files are divided into two pairs of blocks, small blocks and big blocks (here you can see at what values they start and end):

With that said the data we want to mess with (player information, party pokemon, items, etc.) is all stored in the small blocks so we'll be focusing on those. We'll be referencing this Bulbapedia* page a lot from this point onward as it gives a lot of usefull information like the offsets for things we want to edit.

(Note: it seems the first time you save the game all data stored in small blocks gets saved only to the second block. The second time you save the game it gets saved to the first block and your previous save remains in the second block. From there on out your most recent save always gets written to the fisrt block and the previous save to the second block. So it appears the game stores your last two saves at all times although I still need to fully confirm this.)

Bypassing CRC checksum:
Turns out the game uses an error detection algorithm to assure data integrity, causing our save file to become corrupted if we try to edit stuff(from the same Bulbapedia page we know the algorithm used is CRC-16-CCITT). Basically, it calculates a checksum value based part of the contents of the save file (in this case it takes all the bytes in the small blocks except for the block's footer: it's last 20 bytes) and checks if said value is consistent. Good news is these kinds of algorithms aren't design to stop attackers. The checksum value is also stored in the save file so we can just:

  • Implement the algorithm
  • Make the changes we want to the save file
  • Run the algorithm with the modified file to get the updated checksum value
  • Switch the previous checksum value with the new one
According to Bulbapedia the checksum bytes should be at 0x0CF2A-0x0CF2B for the first small block and 0x4CF2A-0x4CF2B for the second. We can confirm there are two different pairs bytes in our "A" and "B" files at addresses 0x4CF2A-0x4CF2B (again the first block is still unused because we only saved once):

Let's make sure these values correspond to the checksum by implementing CRC-16-CCITT. There are several ways we can go about doing this, I opted for using a pre computed table (which I later realized is not necessarily optimal but for now let's go with it). Let's have the code print the values it calculates in hex (also note the values are stored in little-endian):

small_block_2 = 0x40000
small_block_footer_offset = 0xcf18


def checksum(data):
	table = [0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
	0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF,
	0x1231, 0x0210, 0x3273, 0x2252, 0x52B5, 0x4294, 0x72F7, 0x62D6,
	0x9339, 0x8318, 0xB37B, 0xA35A, 0xD3BD, 0xC39C, 0xF3FF, 0xE3DE,
	0x2462, 0x3443, 0x0420, 0x1401, 0x64E6, 0x74C7, 0x44A4, 0x5485,
	0xA56A, 0xB54B, 0x8528, 0x9509, 0xE5EE, 0xF5CF, 0xC5AC, 0xD58D,
	0x3653, 0x2672, 0x1611, 0x0630, 0x76D7, 0x66F6, 0x5695, 0x46B4,
	0xB75B, 0xA77A, 0x9719, 0x8738, 0xF7DF, 0xE7FE, 0xD79D, 0xC7BC,
	0x48C4, 0x58E5, 0x6886, 0x78A7, 0x0840, 0x1861, 0x2802, 0x3823,
	0xC9CC, 0xD9ED, 0xE98E, 0xF9AF, 0x8948, 0x9969, 0xA90A, 0xB92B,
	0x5AF5, 0x4AD4, 0x7AB7, 0x6A96, 0x1A71, 0x0A50, 0x3A33, 0x2A12,
	0xDBFD, 0xCBDC, 0xFBBF, 0xEB9E, 0x9B79, 0x8B58, 0xBB3B, 0xAB1A,
	0x6CA6, 0x7C87, 0x4CE4, 0x5CC5, 0x2C22, 0x3C03, 0x0C60, 0x1C41,
	0xEDAE, 0xFD8F, 0xCDEC, 0xDDCD, 0xAD2A, 0xBD0B, 0x8D68, 0x9D49,
	0x7E97, 0x6EB6, 0x5ED5, 0x4EF4, 0x3E13, 0x2E32, 0x1E51, 0x0E70,
	0xFF9F, 0xEFBE, 0xDFDD, 0xCFFC, 0xBF1B, 0xAF3A, 0x9F59, 0x8F78,
	0x9188, 0x81A9, 0xB1CA, 0xA1EB, 0xD10C, 0xC12D, 0xF14E, 0xE16F,
	0x1080, 0x00A1, 0x30C2, 0x20E3, 0x5004, 0x4025, 0x7046, 0x6067,
	0x83B9, 0x9398, 0xA3FB, 0xB3DA, 0xC33D, 0xD31C, 0xE37F, 0xF35E,
	0x02B1, 0x1290, 0x22F3, 0x32D2, 0x4235, 0x5214, 0x6277, 0x7256,
	0xB5EA, 0xA5CB, 0x95A8, 0x8589, 0xF56E, 0xE54F, 0xD52C, 0xC50D,
	0x34E2, 0x24C3, 0x14A0, 0x0481, 0x7466, 0x6447, 0x5424, 0x4405,
	0xA7DB, 0xB7FA, 0x8799, 0x97B8, 0xE75F, 0xF77E, 0xC71D, 0xD73C,
	0x26D3, 0x36F2, 0x0691, 0x16B0, 0x6657, 0x7676, 0x4615, 0x5634,
	0xD94C, 0xC96D, 0xF90E, 0xE92F, 0x99C8, 0x89E9, 0xB98A, 0xA9AB,
	0x5844, 0x4865, 0x7806, 0x6827, 0x18C0, 0x08E1, 0x3882, 0x28A3,
	0xCB7D, 0xDB5C, 0xEB3F, 0xFB1E, 0x8BF9, 0x9BD8, 0xABBB, 0xBB9A,
	0x4A75, 0x5A54, 0x6A37, 0x7A16, 0x0AF1, 0x1AD0, 0x2AB3, 0x3A92,
	0xFD2E, 0xED0F, 0xDD6C, 0xCD4D, 0xBDAA, 0xAD8B, 0x9DE8, 0x8DC9,
	0x7C26, 0x6C07, 0x5C64, 0x4C45, 0x3CA2, 0x2C83, 0x1CE0, 0x0CC1,
	0xEF1F, 0xFF3E, 0xCF5D, 0xDF7C, 0xAF9B, 0xBFBA, 0x8FD9, 0x9FF8,
	0x6E17, 0x7E36, 0x4E55, 0x5E74, 0x2E93, 0x3EB2, 0x0ED1, 0x1EF0]

	sum = 0xffff
	print ("Computing checksum...")
	for i in range(0,len(data)):
		sum = (sum << 8)^table[data[i]^int(bin(sum >> 8)[2:][-8:],2)]
	return (hex(int(bin(sum)[2:][-8:],2)) , hex(int(bin(sum >> 8)[2:][-8:],2)))


def main():
	with open("AAAAAAA.dsv","rb") as f:
		sav = bytearray(f.read())
		print (checksum(sav[small_block_2:(small_block_2 + small_block_footer_offset)]))


if __name__ == "__main__":
	main()

This code outputs the following:

The same values we find at addresses 0x4CF2A-0x4CF2B meaning we can be sure we found the checksum value. Now let's try changing all the "2b"s to "2d"s in the "A" file once more:

Then we run the same program to calculate the new checksum:

And we edit the checksum within the file:

And now if we load the save file...

Success!


*The Pokemon Encyclopedia


The Pokemon data structure

Now that we can make changes to our save files without corrupting them let's see if we can find a way to change our Pokemon's species. The data for the up to six Pokemon in the player's party is all stored in the small blocks (each party Pokemon is 236 bytes in size). To keep things simple we'll be focusing on editing only the first Pokemon in the party.

From the Bulbapedia page we've been referencing thus far we know we can find the party Pokemon data from save file offsets 0xA0 to 0x627 and we can learn more about the location of specific offsets for things like the Pokemon's species id (or it's national dex number), its moves and ability from this page. This is where we run into another problem though.

Not only are bytes 0x08 to 0xEB of the Pokemon data structure xor encrypted, this encryption uses a pseudorandom number generator (PRNG) based on a checksum independent from the one we just bypassed. Furthermore bytes 0x08 to 0x87 are devided into four 32-byte blocks and shuffled around.

Thankfully, with the resources linked so far we can learn how these things are implemented and how to find workarrounds (hopefully without too much trouble).

Let's start by pointing out that bytes 0x01 to 0x07 are not encrypted and contain usefull information such as the checksum value and the personality value or pv of the Pokemon (for the purposes of this post you can think of it as a value that varies from Pokemon to Pokemon).

Decrypting the Pokemon structure:
Let's tackle the encryption first. From Bulbapedia we know that the PRNG uses the follwing formula:

  • X[n+1] = (0x41C64E6D * X[n] + 0x6073)
To encrypt/decrypt the Pokemon data the game uses the Pokemon specific checksum value (let's refer to it as the Pokemon checksum) as the first instance of X[n] then for the nth number generated:
  • Takes the 16 upper bytes of said number
  • Takes the nth 2-byte word from Pokemon data offsets 0x08 to 0x87
  • And XORs them
Meaning we can just implement these steps and we'll have a way to encrypt as well as decrypt (since the xor cipher is symmetric) Pokemon data blocks:

small_block_offset = 0x40000
lead_Pokemon_offset = 0xA0

def PRNG(data,checksum):
	for i in range(0,128,2):
		seed = (0x41C64E6D * seed) + 0x00006073
		data[(small_block_offset+lead_Pokemon_offset+0x08)+i] ^= int(bin(seed >> 16)[2:][-8:],2)
		data[(small_block_offset+lead_Pokemon_offset+0x08)+i+1] ^= int(bin(seed >> 24)[2:][-8:],2)

Unshuffling the 32-byte blocks:
Like we mentioned already, Pokemon data from bytes 0x08 to 0x87 is shuffled into four different blocks (let's call these blocks A, B, C and D). First let's undestand how the blocks are shuffled. Using the following formula:

  • ((pv & 0x3E000) >> 0xD) % 24 (where pv is the Pokemon's personality value)
The game orders the blocks based on the result, using the follwing table:

Result Order
0 ABCD
1 ABDC
2 ACBD
3 ACDB
4 ADBC
5 ADCB
6 BACD
7 BADC
8 BCAD
9 BCDA
10 BDAC
11 BDCA
12 CABD
13 CADB
14 CBAD
15 CBDA
16 CDAB
17 CDBA
18 DABC
19 DACB
20 DBAC
21 DBCA
22 DCAB
23 DCBA

Knowing this as well as the personality value (stored in bytes 0x00 to 0x03 which are not encrypted) we can write a function to calculate the correct block order for the Pokemon we're trying to edit:

def getBlockOffsets(pv):
	orderTable = {
			# A offset -> index 0; B offset -> index 1; C offset -> index 2; D offset -> index 3
		   	 0:[0,32,64,96], #ABCD
			 1:[0,32,96,64], #ABDC
			 2:[0,64,32,96], #ACBD
			 3:[0,96,32,64], #ACDB
			 4:[0,64,96,32], #ADBC
			 5:[0,96,64,32], #ADCB
			 6:[32,0,64,96], #BACD
			 7:[32,0,96,64], #BADC
			 8:[64,0,32,96], #BCAD
			 9:[96,0,32,64], #BCDA
			10:[64,0,96,32], #BDAC
			11:[96,0,64,32], #BDCA
			12:[32,64,0,96], #CABD
			13:[32,96,0,64], #CADB
			14:[64,32,0,96], #CBAD
			15:[96,32,0,64], #CBDA
			16:[64,96,0,32], #CDAB
			17:[96,64,0,32], #CDBA
			18:[32,64,96,0], #DABC
			19:[32,96,64,0], #DACB
			20:[64,32,96,0], #DBAC
			21:[96,32,64,0], #DBCA
			22:[64,96,64,0], #DCAB
			23:[96,64,32,0]  #DCBA
		}

	return orderTable[((pv & 0x3e000) >> 0xd) % 24]

The idea here is to have the function return a list L of offsets such that:

  • small_block_offset + lead_Pokemon_offset + 0x08 + L[0] is the first byte in block A
  • small_block_offset + lead_Pokemon_offset + 0x08 + L[1] is the first byte in block B
  • And so on

Now that we have bypassed the Pokemon data structure's encryption let's combine this with our general checksum bypass and try to edit a Pokemon's species id. We can play the game for a bit until we get our starter Pokemon and try to change it to a different Pokemon:

Wait that doesn't look right... Of course, we forgot to update the Pokemon checksum!

Updating the Pokemon checksum:
Thankfully this is very straightforward, the Pokemon checksum consists simply of the 16 lower bits of the sum of bytes 0x08 to 0x87 (unencrypted). We can write a function to calculate and update the Pokemon checksum:

def updatePKMChecksum(data):
	sum = 0
	dataToUpdate = data[small_block_offset+lead_Pokemon_offset+0x08:small_block_offset+lead_Pokemon_offset+0x87]
	for i in range(0,len(dataToUpdate),2):
		sum += ([i+1] << 8) + dataToUpdate[i]

	data[small_block_offset + lead_Pokemon_offset + Pokemon_checksum_offset] = int(bin(sum)[2:][-8:],2)
	data[small_block_offset + lead_Pokemon_offset + Pokemon_checksum_offset+1] = int(bin(sum >> 8)[2:][-8:],2)

Finally we have all the tools we need to get a hacker worthy starter:


Conclusion and future work

I ended up taking what I learned from this project and writing a terminal based save file editor for the 4th generation Pokemon games, you can find it on my github [here's a link if you're interested]

Current features include:

  • Editing player name
  • Editing lead Pokemon's species
  • Editing lead Pokemon's ability
Implementing things like the ability to edit a Pokemon's moves for example (which you'd think would be simple considering what we've managed so far) is on the TODO list. Speaking of which... when I tried editing moves:

No idea why this happened (also, that was supposed to be a Charizard!).