Grandia English Patch

Translating Grandia 1.1.1

Seems like my confusion comes from moments in YouTube vids (and even on my own disc?) where it does play the Disc 1 music again, at random, for reasons I don't understand. The first time I got a battle on Disc 2 it played the Disc 1 music, then stopped, hence why I wondered if it was an issue. I wonder if it's a bug in the game itself or something. This is my first time through. Thanks again for the great work.

The only Disc 1 results music that plays on Disc 2 is the perfect win music, which plays when you win the battle without taking any damage.
 
While the game is playable, the audio is scratchy and glitchy to the point it could make my ears bleed. A rip of my Japanese copy plays with perfectly fine audio, so I'm obviously doing something wrong during the patching process, but I'm following the instructions in the readme to the letter.

Any idea how to fix this issue? It's puzzling the hell out of me.

Now that I've had the chance to play the game on real hardware, the problem with the audio remains: it just hisses and scratches continually. With the unpatched version of the game the audio is flawless.

My Saturn is a MK-80200-50 Model 1 (the one with the oval buttons) that I bought on release here in the UK. Anyone got any ideas of what I could be doing wrong in the patching process?

Curiously, the Beetle core in RetroArch plays the audio correctly.

Edit: Thankfully while actually playing the game the audio is fine. It seems to only glitch out when in FMVs.
 
The FMVs are untouched in the current patch. The issue you're describing is due to a bad burn: either the way it was burned, or the media used.
 
The crackling audio exists in emulation too when using Yaba Sanshiro (Android or PC version), in the same places, so it wouldn't seem to be just a bad burn issue.

You have me wondering, though, whether I ripped my game incorrectly in the first place. I'll have to try ripping it again and see if that makes a difference when patched.
 
The issue in Yaba Sanshiro is because Yaba Sanshiro/Yabause isn't an accurate or good emulator. Sure, its debugging tools are nice, but for playing games it's the worst of the three main Saturn emulators. It can barely play back standard Cinepak, let alone Grandia's custom format.

Your issue on real hardware though is 100% burn related. What type of CD-Rs are you using? How are you burning it? What speed? What is the condition of the laser in your Saturn? All of those can cause this issue.
 
Hello,
Thank you so much for your hard work on this project!
I read a while ago on the sega16 forum that you might offer a French translation too after you finish the English one. Is that still in your plans?
 
If I did that, it would be more of a proof of concept. The tools I made, though, should be enough for someone to get such a project started and going.
 
So since people have been asking, I figured I'd post this bit of info that I've sent to nanash and others to see if they can figure out the decompression. I have been able to successfully write a demuxer for the FMVs. It simply separates the ADX audio from the video data, the idea being that having them separated makes it easier to identify what actually is video data.

The decompressed FMV frames seem to get sent to VDP2's VRAM at 0x05E00B00 and 0x05E40B00 from what I can tell. It seems that they are doing a double-buffering setup where one frame is displayed while the other is being decompressed. They are decompressed to technically 32-bit RGB images, but it's really just 24-bit in a 32-bit value with the MSBs not being used. The final output resolution in VDP2's VRAM is 352x176 for the full-screen FMVs, and 352x144 for the letterboxed ones. You can extract them from VRAM using CrystalTile set to 32-bit aBGR:

[Image XfmcpPK.png: decompressed FMV frame extracted from VDP2's VRAM in CrystalTile]
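
For illustration, here is a minimal sketch of turning a raw dump of one such frame into a viewable image. Pillow, big-endian 32-bit words, and the exact channel placement implied by the 32-bit aBGR setting (blue in the high used byte, red in the low byte) are assumptions on my part, not something confirmed by the game code:

[CODE lang="python" title="VRAM frame dump to image (sketch)"]
# Rough sketch only: interpret a raw dump of one decompressed frame as
# 32-bit words where only the low 24 bits carry colour (aBGR order assumed).
import struct
from PIL import Image

def dump_to_image(raw: bytes, width: int = 352, height: int = 176) -> Image.Image:
    img = Image.new("RGB", (width, height))
    px = img.load()
    for i in range(width * height):
        (val,) = struct.unpack_from(">I", raw, i * 4)  # big-endian word assumed
        b = (val >> 16) & 0xFF   # aBGR: blue in the highest used byte
        g = (val >> 8) & 0xFF
        r = val & 0xFF
        px[i % width, i // width] = (r, g, b)
    return img
[/CODE]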


In HWRAM, decompression looks like it happens at 0x0603FFC0. If you look at this region in a tile viewer like CrystalTile, you can see what looks like different channels of the MPEG/MJPEG stream. The function that kicks this off for each frame seems to be at 0x0602E0AE, and it seems that both SH-2s are being used for this:

[Images TYv9VVq.png, 1fZsJIe.png: the decompression region at 0x0603FFC0 viewed in CrystalTile, showing activity on both SH-2s]


For the QuickTime-wrapped files themselves, most of the QuickTime header isn't correct, though it is still fully loaded into RAM. There is a table in these files that defines how big each frame is and how many frames to go through before the next chunk of ADX audio is loaded. Basically the streams are interleaved like this:

1) Data starts at 0x4000 with 0x18000 bytes of ADX audio.
2) Next it reads the first run of frames from the table in the QuickTime header.
3) After all frames in the run are read, it reads 0x6000 bytes of ADX audio. From here it alternates back and forth between a run of frames as defined in the table and 0x6000 bytes of ADX.

The table that defines the frame sizes starts at 0x284 in the MOV file. Every time you see 0x000C, that denotes the end of a frame run. When one of these is hit, the next 0x6000 bytes in the data will be ADX audio. The 2 bytes after the 0x000C seem to be a value used in the decoding of the next run of frames; you can see it get read if you put a read breakpoint there, but I'm not sure what it's used for. Here is an example of this table with the 0x000C and the following 2 bytes highlighted:

[Image oNTQU2b.jpg: hex view of the frame-size table with the 0x000C markers and the following 2 bytes highlighted]


The only info I've really been able to get from the individual frames is that the values at 0x02 and 0x03 are the resolution of the image. 0x02 is the width divided by 10, and 0x03 is the height divided by 10.
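To make the interleaving above concrete, here is a minimal sketch of how the container could be split. It assumes 16-bit big-endian table entries, that zero entries are padding, and that the table ends before the data at 0x4000; none of that is confirmed, and the actual demuxer linked below handles the details properly:

[CODE lang="python" title="Interleave walk sketch"]
# NOT the actual demuxer, just an illustration of the layout described above.
import struct

DATA_START  = 0x4000
TABLE_START = 0x284
ADX_FIRST   = 0x18000   # first audio chunk
ADX_CHUNK   = 0x6000    # every following audio chunk

def split_mov(data: bytes):
    # Parse the frame-size table out of the QuickTime header.
    runs, run, pos = [], [], TABLE_START
    while pos + 2 <= DATA_START:
        (val,) = struct.unpack_from(">H", data, pos)
        pos += 2
        if val == 0x000C:          # end of a frame run
            pos += 2               # skip the 2 unknown bytes that follow
            runs.append(run)
            run = []
        elif val:                  # treat zero entries as padding (assumption)
            run.append(val)        # frame size in bytes

    # Walk the interleaved data: one big ADX chunk, then runs of frames
    # alternating with 0x6000-byte ADX chunks.
    adx, frames, off = bytearray(), [], DATA_START
    adx += data[off:off + ADX_FIRST]
    off += ADX_FIRST
    for run in runs:
        for size in run:
            frames.append(data[off:off + size])
            off += size
        adx += data[off:off + ADX_CHUNK]
        off += ADX_CHUNK
    return bytes(adx), frames
[/CODE]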

The demuxing code I made can be found here:

 
That was a very interesting and thorough explanation. I really enjoyed reading the breakdown and example. Thank you for taking the time to share this knowledge.
 
I had some time to look at the frame data you sent me and I think on a surface level I've figured most of the data out. Some of it is educated guessing, so maybe I'm wrong on some details, because I have only reverse engineered a small part of the decoding function. However, since more people seem to be looking into this, I thought it's better to share what I've found out so far.

I suspect the video is MPEG-encoded with only I-frames being used, so it's basically JPEG-like encoded frames. The video is probably 4:2:0 subsampled, as seems to be typical. This means the macro block size is 16x16 with 6 channels for each block: Y0, Y1, Y2, Y3, Cb, Cr.

Each frame begins with a 12 byte header that is followed by 4 different data sections. The header is structured like this:

4 bits -> unknown (usually 0)
4 bits -> unknown (usually 5, 6 or 8) -|
4 bits -> unknown (usually 5, 6 or 8) |-> seem to be always identical within one frame
4 bits -> unknown (usually 5, 6 or 8) _|
8 bits -> image width in blocks
8 bits -> image height in blocks
16 bits -> byte offset of section 1
16 bits -> byte offset of section 2
16 bits -> byte offset of section 3
16 bits -> block size (probably)
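
For illustration, a minimal sketch of parsing that 12-byte layout; big-endian byte order and the field names are my assumptions:

[CODE lang="python" title="Frame header parse sketch"]
import struct

def parse_frame_header(frame: bytes) -> dict:
    # 4 single bytes (the four nibbles plus width/height) followed by four
    # 16-bit values, big-endian assumed.
    b0, b1, width_blocks, height_blocks, off1, off2, off3, block_size = \
        struct.unpack_from(">BBBBHHHH", frame, 0)
    return {
        "nibble0":       b0 >> 4,                         # unknown (usually 0)
        "nibbles1_3":    (b0 & 0xF, b1 >> 4, b1 & 0xF),   # usually 5, 6 or 8
        "width_blocks":  width_blocks,
        "height_blocks": height_blocks,
        "section1_off":  off1,
        "section2_off":  off2,
        "section3_off":  off3,
        "block_size":    block_size,                      # "probably" block size
    }
[/CODE]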

The first section, or section 0 as I call it, is 64 bytes long. I suspect it's the quantization matrix for all channels, since 8 x 8 = 64.

The second section is 1452 bytes long and I'm pretty sure it contains the DC coefficients for all channels. Each block uses 6 bytes, thus 6 x 11 x 22 = 1452. Curiously, each coefficient seems to be 6 bits wide instead of 8, and the last 12 bits of each data block are zero padding. This results in this structure per block:

section 1 (1452 bytes):
- for each block there are 6 bytes of data
- 6 bits for Y0
- 6 bits for Y1
- 6 bits for Y2
- 6 bits for Y3
- 6 bits for CB
- 6 bits for CR
- 12 bits padding
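
A minimal sketch of unpacking one such 6-byte entry; MSB-first bit packing is an assumption on my part:

[CODE lang="python" title="Section 1 entry unpack sketch"]
def unpack_section1_entry(entry: bytes) -> dict:
    # 48 bits total: six 6-bit values followed by 12 bits of padding,
    # assuming the first value sits in the most significant bits.
    assert len(entry) == 6
    bits = int.from_bytes(entry, "big")
    bits >>= 12                                       # drop the padding bits
    vals = [(bits >> shift) & 0x3F for shift in range(30, -1, -6)]
    return dict(zip(("Y0", "Y1", "Y2", "Y3", "Cb", "Cr"), vals))
[/CODE]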

The third and fourth sections contain the compressed AC coefficients for all 6 channels. They seem to be encoded in the same manner as the other compressed images in Grandia that I wrote the decompression tool for. The third section contains the lookup-table-encoded data and the fourth section contains the run-length-encoded zeros.

Btw, I opened a GitHub account where I'll probably host the code for the encoder I'm planning on writing. The source code for the compression tools is also hosted there.

 
It's time to revise what I wrote initially about the FMV codec. I didn't really know (and still don't) how to use the correct technical terms for all this video compression stuff. In German you'd say I have "dangerous half knowledge" of video codecs, so please bear with me.

The first thing I want to revise is calling this codec MPEG. At the current stage of reverse engineering, I think all this codec has in common with MPEG is the use of a DCT. This is probably where the similarities end.

The next thing is the header. I think it's actually 16 bytes long instead of 12. The additional 4 bytes always seem to be zero, but don't seem to be padding, as the game reads each of these bytes explicitly at some point. This leaves 60 bytes for section 0, but since an offset for the start of section 1 is specified in the header, it's possible that its length is variable. I haven't seen a length other than 60 bytes yet, though. I still don't know what section 0 is used for, but it's probably not for the quantization.

Now let's talk about section 1. I was correct about the structure of this section, but wrong about the content. The number per block that is written here isn't the DC coefficient of the block. Instead it contains the number of non-zero elements of the macro block and their ordering. If we look for example at the first frame of the intro movie, section 1 is simply filled with zeros. The game reads the zero and uses the following lookup table to get the number of elements.

[CODE title="Element number lookup table"] self._elem_num_lut = [1, 2, 3, 4, 5, 6, 7, 8, # starts @6036C90
2, 4, 6, 8, 10, 12, 14, 16,
3, 6, 9, 12, 15, 18, 21, 24,
4, 8, 12, 16, 20, 24, 28, 32,
5, 10, 15, 20, 25, 30, 35, 40,
6, 12, 18, 24, 30, 36, 42, 48,
7, 14, 21, 28, 35, 42, 49, 56,
8, 16, 24, 32, 40, 48, 56, 64][/CODE]

This means 0 => 1 element to write. We see that some numbers are found multiple times in the table. This is because the number also specifies the order in which the elements must be written to the block. The ordering is found in another lookup table.

[CODE title="Order lookup table"]
lut = [[0], [0, 1], [0, 1, 2], ..., [0, 1, 7, 14, 8, 2, 3, 9, 15, ...]]
[/CODE]

The table is quite long so I left out most of the elements, but I hope the idea is clear. If we look at the last element of the lookup table we can see the zig-zag ordering that is used in JPEG for example. If we go back to the example, we see that the order that corresponds to 0 is [0]. So basically write the single element to position 0 in the macro block. Simple enough.
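
As a side note, the element-number table above can be reproduced compactly, since its values are just the products of (row + 1) and (column + 1) when read as an 8x8 grid; a tiny sketch:

[CODE lang="python" title="Element count from a section 1 code (sketch)"]
# Reproduces the table at 0x6036C90; code 0 therefore maps to a single
# element, as used in the example above.
elem_num_lut = [(r + 1) * (c + 1) for r in range(8) for c in range(8)]

assert elem_num_lut[0] == 1     # code 0 => 1 element to write
assert elem_num_lut[63] == 64   # full 8x8 block
[/CODE]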

The actual elements of each macro block are encoded in sections 2 and 3. Like I said before, section 2 works like the Grandia image compression. This means the first 16 bytes specify how to interpret the data. This is best explained with an example, so let's look again at the first frame, which is just a black image by the way. Here we find "0x00000000000000000000000000010010". This means there are only two different codes found in the encoded stream: 27 and 30, because the 1s are found at the 27th and 30th positions of the hex string (counting from zero). The 1 means each code is encoded in 1 bit, therefore 0b0 => 27 and 0b1 => 30. The meaning of the codes is not trivial and I don't fully understand what they mean, though I've completely reverse engineered what they do. What is simple is the meaning of the code "30": it means "end of block", and each macro block ends with this code. The code "27" basically tells us how to decode the first element. It translates to: no zero elements before the current element; take 11 bits from section 3 and add 35; get a value from another lookup table and multiply it by -1; multiply both values for the final element. Well, I told you it was complicated...

Let's go back to the example. The beginning of section 2 (after the first 16 bytes) is "0b0111111111...". This means the first code is 27. As explained above, the first step is to take 11 bits from section 3. These are 0b01111011100, or 988. 988 + 35 = 1023. This is our first number. The value from the lookup table is 1, so 1023 * 1 * (-1) = -1023. This is the value of the only element of the first block. Now the next bit is read from section 2. It's 1, so the code is 30, or "end of block". This means switch to the next macro block.

The next code is again "30". Since the block size is 1 but the end of block is already reached, we write the only element as zero and switch to the next block. Since all following bits in section 2 are 1, this is done for all remaining blocks.

In summary, we have one macro block with -1023 as its first and only non-zero element, and a lot of zero macro blocks for the rest. If you're paying attention, you may ask yourself how this makes sense: the picture is filled with black, so all macro blocks should have the same values. It's because the macro blocks are delta encoded, so the blocks following the first one only contain deltas for each element. Since all blocks are the same as the first one, they only contain 0s here.
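
A tiny sketch of that delta reconstruction, purely as an illustration (real blocks of course hold more than one element):

[CODE lang="python" title="Delta decode sketch"]
# Each block stores deltas against the previous block, so reconstruction is
# a running element-wise sum.
def undelta_blocks(blocks):
    out, prev = [], None
    for blk in blocks:
        cur = blk if prev is None else [p + d for p, d in zip(prev, blk)]
        out.append(cur)
        prev = cur
    return out

# First block holds -1023, the rest only zero deltas -> all blocks identical.
print(undelta_blocks([[-1023], [0], [0]]))   # [[-1023], [-1023], [-1023]]
[/CODE]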

The next step seems to be the iDCT. I'm currently reverse engineering the functions that are used for that and will write up some more once I'm done.
 
Time for some more results. I think data-wise I've pretty much figured out everything now. At least it all seems to make sense now.

First, about section 0: it just contains the starting positions of different subsections in sections 2 & 3. The data in sections 2 & 3 is divided into image rows. Each subsection contains 22 macro blocks per channel, so 132 blocks in total. This means there are 11 different subsections within sections 2 and 3, one per image row. Section 0 contains the offset in bytes and in bits for each of these subsections. It's stored as 6 bytes per subsection, so in total section 0 is 66 bytes long, and the header is actually just 10 bytes and stops after the section offsets.

I also learned something new about section 1. The codes there don't just define the number of elements and the order; they mainly define the shape of the macro block. Let me explain: normally a macro block is an 8x8 matrix. However, to save processing time, the codec tries to reduce these matrices to a sub-matrix that contains as few zero elements as possible. If, for example, a macro block has only one element at position (0,0), the macro block will be reduced to a 1x1 matrix. If the macro block has elements at (0,0), (0,2), (0,3) and (2,1), the reduced block will be 4x3. This results in 64 possible macro block shapes.
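
A small sketch of that reduction, assuming the 4x3 in the example above means width x height:

[CODE lang="python" title="Reduced block shape sketch"]
# The reduced block is the bounding sub-matrix, anchored at (0,0), that
# covers all non-zero positions of the 8x8 macro block.
def reduced_shape(nonzero_positions):
    height = 1 + max(r for r, _ in nonzero_positions)   # rows used
    width  = 1 + max(c for _, c in nonzero_positions)   # columns used
    return width, height

print(reduced_shape([(0, 0)]))                          # (1, 1)
print(reduced_shape([(0, 0), (0, 2), (0, 3), (2, 1)]))  # (4, 3), as above
[/CODE]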

I also found out how the quantization works. I described in one of the earlier posts how the macro block elements are decoded with codes from section 2, and I explained the codes '27' and '30'. The code '27' uses a factor from a lookup table to multiply the resulting element with: this is the quantization factor. There are basically 17*64 = 1088 different quantization tables in use, "64" because the quantization table depends on the shape of the reduced macro block and "17" because there are 17 different quality levels. The quality level is defined per frame and per channel (luma and both chromas). It's stored in the header bits 4-16. For example, the first frame begins with 0x0555: 5, 5 and 5 are the quality settings for the channels of the frame. The higher the number, the higher the quantization factors and the greater the data loss.
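
As a tiny illustration of reading those quality nibbles out of the first 16-bit word of the header:

[CODE lang="python" title="Quality levels from the first header word (sketch)"]
# The three quality nibbles follow the first (unknown) nibble, one per
# channel, e.g. 0x0555 -> (5, 5, 5). Illustration only.
def quality_levels(first_word: int):
    return ((first_word >> 8) & 0xF, (first_word >> 4) & 0xF, first_word & 0xF)

print(quality_levels(0x0555))   # (5, 5, 5)
[/CODE]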

Now all that's left to understand is the DCT. I found out that it's a two-step process. The first step is a simple matrix multiplication with the transposed DCT matrix. The DCT matrix is stored in the game data:

[CODE lang="python" title="DCT matrix"] dct_mat_T = np.array(
[0x5a82, 0x7d8a, 0x7642, 0x6a6e, 0x5a82, 0x471d, 0x30fc, 0x18f9, # located @6039844
0x5a82, 0x6a6e, 0x30fc, 0xe707, 0xa57e, 0x8276, 0x89be, 0xb8e3,
0x5a82, 0x471d, 0xcf04, 0x8276, 0xa57e, 0x18f9, 0x7642, 0x6a6e,
0x5a82, 0x18f9, 0x89be, 0xb8e3, 0x5a82, 0x6a6e, 0xcf04, 0x8276,
0x5a82, 0xe707, 0x89be, 0x471d, 0x5a82, 0x9592, 0xcf04, 0x7d8a,
0x5a82, 0xb8e3, 0xcf04, 0x7d8a, 0xa57e, 0xe707, 0x7642, 0x9592,
0x5a82, 0x9592, 0x30fc, 0x18f9, 0xa57e, 0x7d8a, 0x89be, 0x471d,
0x5a82, 0x8276, 0x7642, 0x9592, 0x5a82, 0xb8e3, 0x30fc, 0xe707], dtype="int64").reshape((8,8))[/CODE]

What we see here is the fixed point representation of the normal scaled DCT matrix "C". To get the iDCT you would calculate:

X = C^T*Y*C, where C is the DCT matrix (since it's orthogonal C^T = C^-1)
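
To make the formula concrete, here is a floating-point sketch using the textbook orthonormal DCT matrix. The game itself works with the 16-bit fixed-point table above, so this only illustrates the math, not the actual implementation:

[CODE lang="python" title="iDCT via matrix products (floating-point sketch)"]
import numpy as np

N = 8
# Orthonormal DCT-II matrix: C[k, n] = sqrt(1/N or 2/N) * cos((2n+1)k*pi/(2N))
C = np.array([[np.sqrt((1 if k == 0 else 2) / N) *
               np.cos((2 * n + 1) * k * np.pi / (2 * N))
               for n in range(N)] for k in range(N)])

def idct2(Y):
    # X = C^T @ Y @ C, valid because C is orthogonal (C^T = C^-1)
    return C.T @ Y @ C

Y = np.zeros((N, N)); Y[0, 0] = -1023   # DC-only block from the example above
X = idct2(Y)                            # flat block, every value is -1023/8
[/CODE]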

The only strange thing is that only positive values are used. Usually the DCT matrix has negative elements and I'm not sure how it works without those. Without the negative factors, the matrix isn't orthogonal anymore. Implementing it like this is actually pretty slow, since a matrix multiplication has a runtime of O(n^2). However, that's only how the first step, C^T*Y = A, is implemented in the codec. Interestingly, the second step, X = A*C, seems to be implemented with an FDCT algorithm, or "fast discrete cosine transform". Why only the second step? I've no idea; maybe someone more knowledgeable can explain the reasons. I'm currently trying to piece together the FDCT algorithm, but it has a pretty complicated structure. There are also 8 different implementations, because of the 8 possible shapes of the matrix A depending on the macro block dimensions.
 
Hi. I finally got a Pseudo Saturn Kai cart for my launch-model US Saturn. I have the original Japanese game, and that plays flawlessly. I burned disc 1 of the English-patched game on a Taiyo Yuden lacquer CD-R. Other burned games play fine, but English-patched Grandia frequently freezes on a black screen when accessing the save menu or items menu (music continues playing). The A button menu especially seems to freeze 9 times out of 10. Those are the only times the game ever freezes.

I'm still using my original laser lens, and it's kind of flaky, but it works for other games. I just found it odd that it's always the same particular actions that seem to lock the game.
 