To TakaIsSilly

I had been thinking that such a thing would be desirable, but hadn't gotten around to looking up how to do it. Thanks 🙂.
 
Here is what I could pull off today. When I think about assembly code, I think of it being _all_ in asm. I had to spend hours trying to figure out how to use inline assembly, since I had never done it before...

Anyway, here is the code I was able to sketch... untested and incomplete, but it seems prettier than the one from Denis.

Code:
*BAD CODE*

If it works, _F already has 2 flags set, and only needs Zero (set if %0 = 0x00000000) (UPDATE: it's now there) and Sign (Signal), which I don't remember how to compute (the last bit of _A?).
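For reference while reading the attempts in this thread, here is a minimal plain-C sketch of the four flags being discussed for an 8-bit ADD. The helper name `add8_ref` and the `FLAG_*` macros are made up for illustration and are not from Z80.c; the bit positions are the usual Z80 ones (C = bit 0, P/V = bit 2, Z = bit 6, S = bit 7). Half-carry and N are left out, as they are in the thread. On the Sign question: it is bit 7 of the 8-bit result.

```c
#include <stdint.h>

/* Z80 flag bit positions in F (half-carry and N are left out of this sketch). */
#define FLAG_C 0x01u  /* carry           */
#define FLAG_V 0x04u  /* parity/overflow */
#define FLAG_Z 0x40u  /* zero            */
#define FLAG_S 0x80u  /* sign            */

/* Hypothetical reference helper: returns A + value and writes the flags to *f. */
static uint8_t add8_ref(uint8_t a, uint8_t value, uint8_t *f)
{
    unsigned sum   = (unsigned)a + value;   /* 9-bit result */
    uint8_t  res   = (uint8_t)sum;
    uint8_t  flags = 0;

    if (sum & 0x100u)
        flags |= FLAG_C;                    /* carry out of bit 7 */
    if ((~(a ^ value) & (a ^ res)) & 0x80u)
        flags |= FLAG_V;                    /* inputs share a sign, result does not */
    if (res == 0)
        flags |= FLAG_Z;
    if (res & 0x80u)
        flags |= FLAG_S;                    /* top bit of the 8-bit result */

    *f = flags;
    return res;
}
```

Something like this could serve as an oracle: run the asm version and the C version over all 65536 input pairs and compare A and F.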

Anyway, is this the correct format? I am kinda loopy trying to understand how the C compiler passes values to the inline functions... This is so not my style 😛
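On the operand question: in GCC's extended asm the operands are numbered in the order they are listed, outputs first and then inputs. A digit constraint such as "0" ties an input to the same register as that output. The sketch below shows the layout with a single SH `add`; the function name `add_via_asm` is hypothetical, and the non-SH branch is only there so the example builds on other machines.

```c
#include <stdint.h>

/* Hypothetical helper, not from Z80.c: adds value to a. */
static uint32_t add_via_asm(uint32_t a, uint32_t value)
{
#if defined(__GNUC__) && defined(__sh__)
    __asm__ ("add %2,%0"               /* %0 += %2 */
             : "=r" (a)                /* %0: output, any general register */
             : "0" (a), "r" (value));  /* %1: input tied to %0's register, %2: value */
#else
    a += value;                        /* portable fallback for non-SH builds */
#endif
    return a;
}
```

With two outputs, as in the ADD macros in this thread, the numbering works the same way: %0 is _A, %1 is _F, and the first input, value, is %2.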

(Edited by TakaIsSilly at 11:20 pm on Dec. 27, 2001)

Well, I guess I'm not that smart, and SR is not a good register to copy... This is very close to done; I just have to figure out a way to get the T bit out without much hassle (since I can't just do mov sr,r0 like it appears here)...

Code:
#define ADD(value) \
 __asm__ volatile ( \
 "shll16 %0 \n" \
 "shll8 %0 \n"     /* placed just before the 32-bit boundary of the register */ \
 "shll16 %2 \n" \
 "shll8 %2 \n"     /* both _A and value suffer the change */ \
 "mov %0,r0 \n"    /* temp store _A since we need to do add twice */ \
 "addc %2,%0 \n"   /* add with carry. Carry stored in T */ \
 "mov sr,%1 \n"    /* moves sr(that contains T) to _F */ \
 "and #1,%1 \n"    /* I don't think this is correct */ \
 "mov r0,%0 \n"    /* get back the old _A value */ \
 "addv %2,%0 \n"   /* Add again, but this time check for overflow */ \
 "mov sr,r0 \n"    /* move sr to the gbr for some modifications */ \
 "and #1,r0 \n"    /* fetch the T flag(part of the sr register) */ \
 "shll2 r0 \n"     /* move the overflow flag to his correct location */ \
 "or r0,%2 \n"     /* OR's the values, creating two Z80 style flags */ \
 "xor r0,r0 \n"    /* Clears gbr, like we do with CISC */ \
 "cmp/eq %0,r0 \n" /* If _A is zero.... */ \
 "mov sr,r0 \n"    /* Move it again to gbr */ \
 "and #1,r0 \n"    /* Clear all flags but T */ \
 "shll8 r0 \n" \
 "shlr2 r0 \n"     /* Moves 8-2 = 6 bit positions to the right */ \
 "or r0,%2 \n"     /* OR's that value, 3 flags done */ \
 "mov %0,r0 \n"    /* Moves the result back into the temp register */ \
 "and #80,r0 \n"   /* get the last bit */ \
 "or r0,%2 \n"     /* OR it with the remaining flag */ \
 :"=r" (_A), "=r" (_F) \
 :"r" (value), "1" (_F), "0" (_A) \
 )

(Edited by TakaIsSilly at 12:27 am on Dec. 28, 2001)
 
I think the problem with SR is that you can't use the MOV instructions on it; you need STC to read it and LDC to write it.
 
I stayed up till 6am to finish this one:
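If only the T bit is wanted, the SH instruction set also has `movt Rn`, which copies T into a register as 0 or 1, so SR does not need to be read at all. Below is an untested sketch of the carry step built that way. The function name `add8_carry` is hypothetical. Since `addc` also adds the incoming T bit, the sketch clears T with `clrt` first, and it lists "t" as a clobber so GCC knows T changes. The non-SH branch is a plain-C stand-in so the example builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: carry (0 or 1) out of the 8-bit add a + value. */
static uint32_t add8_carry(uint8_t a, uint8_t value)
{
    uint32_t x = (uint32_t)a << 24;      /* operands moved into the top byte */
    uint32_t y = (uint32_t)value << 24;
    uint32_t t;
#if defined(__GNUC__) && defined(__sh__)
    __asm__ ("clrt\n\t"                  /* T = 0, so addc adds no stale carry */
             "addc %2,%0\n\t"            /* x = x + y + T, carry out goes to T */
             "movt %1"                   /* t = T */
             : "+r" (x), "=r" (t)
             : "r" (y)
             : "t");
#else
    t = (uint32_t)((x + y) < x);         /* unsigned wrap-around means carry */
#endif
    return t;
}
```

The same pattern should work after `addv` and `cmp/eq`, since both leave their result in T.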

Code:
#define ADD(value) \
 __asm__ volatile ( \
 "shll16 %0 \n" \
 "shll8 %0 \n"     /* placed just before the 32-bit boundary of the register */ \
 "shll16 %2 \n" \
 "shll8 %2 \n"     /* both _A and value suffer the change */ \
 "mov %0,r1 \n"    /* temp store _A since we need to do add twice */ \
 "addc %2,%0 \n"   /* add with carry. Carry stored in T */ \
 "stc sr,r0 \n"    /* move sr to r0 */ \
 "and #1,r0 \n"    /* Put _F 1 if T = 1 */ \
 "or r0,%2 \n"     /* Set CARRY flag */ \
 "mov r1,%0 \n"    /* get back the old _A value (we need r0) */ \
 "addv %2,%0 \n"   /* Add again, but this time check for overflow */ \
 "stc sr,r0 \n"    /* move sr to r0 */ \
 "and #1,r0 \n"    /* Put _F 1 if T = 1 */ \
 "shll2 r0 \n"     /* move the overflow flag to his correct location */ \
 "or r0,%2 \n"     /* Set OVERFLOW flag */ \
 "xor r1,r1 \n"    /* Clears r1, like we do with CISC */ \
 "cmp/eq %0,r1 \n" /* If _A is zero.... */ \
 "stc sr,r0 \n"    /* move sr to r0 */ \
 "and #1,r0 \n"    /* Clear all flags but T */ \
 "shll8 r0 \n" \
 "shlr2 r0 \n"     /* Moves 8-2 = 6 bit positions to the left */ \
 "or r0,%2 \n"     /* OR's that value, 3 flags done (ZERO) */ \
 "mov %0,r0 \n"    /* Moves the result into r0 for speed */ \
 "and #128,r0 \n"  /* get the last bit */ \
 "or r0,%2 \n"     /* put the result on F(SIGNAL) */ \
 "shlr16 %0 \n" \
 "shlr8 %0 \n"     /* Reset all values back 24-bits */ \
 "shlr16 %2 \n" \
 "shlr8 %2 \n"     /* both _A and value suffer the change */ \
 :"=r" (_A), "=r" (_F) \
 :"r" (value), "1" (_F), "0" (_A) \
 )

It compiles fine, but when I tried Phantasy Star it didn't run. Here's the compiled output for an example function:

Code:
2797:z80.c     **** OP(dd,84) { ADD(_HX);} /* ADD A,HX */
 31349 8df4 2FE6        mov.l r14,@-r15
 31350 8df6 6EF3        mov r15,r14
 31351 8df8 D115        mov.l L4946,r1
 31352 8dfa 6313        mov r1,r3
 31353 8dfc 6213        mov r1,r2
 31354 8dfe 710E        add #14,r1
 31355 8e00 6610        mov.b @r1,r6
 31356 8e02 720F        add #15,r2
 31357 8e04 6720        mov.b @r2,r7
 31358 8e06 731E        add #30,r3
 31359 8e08 6330        mov.b @r3,r3
 31360 8e0a 4628   shll16 r6
 31361 8e0c 4618   shll8 r6
 31362 8e0e 4328   shll16 r3
 31363 8e10 4318   shll8 r3
 31364 8e12 6163   mov r6,r1
 31365 8e14 363E   addc r3,r6
 31366 8e16 0002   stc sr,r0
 31367 8e18 C901   and #1,r0
 31368 8e1a 230B   or r0,r3
 31369 8e1c 6613   mov r1,r6
 31370 8e1e 363F   addv r3,r6
 31371 8e20 0002   stc sr,r0
 31372 8e22 C901   and #1,r0
 31373 8e24 4008   shll2 r0
 31374 8e26 230B   or r0,r3
 31375 8e28 211A   xor r1,r1
 31376 8e2a 3160   cmp/eq r6,r1
 31377 8e2c 0002   stc sr,r0
 31378 8e2e C901   and #1,r0
 31379 8e30 4018   shll8 r0
 31380 8e32 4009   shlr2 r0
 31381 8e34 230B   or r0,r3
 31382 8e36 6063   mov r6,r0
 31383 8e38 C980   and #128,r0
 31384 8e3a 230B   or r0,r3
 31385 8e3c 4629   shlr16 r6
 31386 8e3e 4619   shlr8 r6
 31387 8e40 4329   shlr16 r3
 31388 8e42 4319   shlr8 r3
 31389
 31390 8e44 6363        mov r6,r3
 31391 8e46 2130        mov.b r3,@r1
 31392 8e48 2270        mov.b r7,@r2
 31393 8e4a 000B        rts
 31394 8e4c 6EF6        mov.l @r15+,r14

(My code is the part further to the left...)
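One thing the listing seems to show: before the asm starts, the compiler keeps the address of _A in r1 (`mov.l L4946,r1` then `add #14,r1`). The asm then overwrites r1 (`mov r6,r1`, `xor r1,r1`), and afterwards the compiler stores the result through it with `mov.b r3,@r1`. The flag bits also end up ORed into r3 (%2, value), while r7 (%1, _F) is stored back unchanged. GCC only keeps its own values out of scratch registers if the asm statement names them in the clobber list after a third colon. A small untested sketch of that, with a hypothetical helper `sign_bit`; the non-SH branch is a plain-C stand-in so it builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: returns 0x80 if bit 7 of a is set, else 0. */
static uint32_t sign_bit(uint32_t a)
{
    uint32_t s;
#if defined(__GNUC__) && defined(__sh__)
    __asm__ ("mov %1,r0\n\t"
             "and #128,r0\n\t"   /* and #imm only works on r0 */
             "mov r0,%0"
             : "=r" (s)          /* %0: result */
             : "r" (a)           /* %1: input  */
             : "r0");            /* clobber: tells GCC that r0 is overwritten */
#else
    s = a & 128u;                /* portable fallback for non-SH builds */
#endif
    return s;
}
```

For the ADD macros in this thread that would presumably mean listing "r0", "r1" and "t" as clobbers.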

I'm off to study for exams, so I'm trying to give as much help as I can before then. I know the error must be either in the ADD functions or in the flag checking routines. How does C++ load the register with the _A value? Is it in big endian or little endian format? If not, the shll16 is useless... Err... I just found the error. --; Hold on a min.
 
This fixes a bug with the SIGNAL flag, but it's not correct yet.

Code:
#define ADD(value) \
 __asm__ volatile ( \
 "shll16 %0 \n" \
 "shll8 %0 \n"     /* placed just before the 32-bit boundary of the register */ \
 "shll16 %2 \n" \
 "shll8 %2 \n"     /* both _A and value suffer the change */ \
 "mov %0,r1 \n"    /* temp store _A since we need to do add twice */ \
 "addc %2,%0 \n"   /* add with carry. Carry stored in T */ \
 "stc sr,r0 \n"    /* move sr to r0 */ \
 "and #1,r0 \n"    /* Put _F 1 if T = 1 */ \
 "or r0,%2 \n"     /* Set CARRY flag */ \
 "mov r1,%0 \n"    /* get back the old _A value (we need r0) */ \
 "addv %2,%0 \n"   /* Add again, but this time check for overflow */ \
 "stc sr,r0 \n"    /* move sr to r0 */ \
 "and #1,r0 \n"    /* Put _F 1 if T = 1 */ \
 "shll2 r0 \n"     /* move the overflow flag to his correct location */ \
 "or r0,%2 \n"     /* Set OVERFLOW flag */ \
 "xor r1,r1 \n"    /* Clears r1, like we do with CISC */ \
 "cmp/eq %0,r1 \n" /* If _A is zero.... */ \
 "stc sr,r0 \n"    /* move sr to r0 */ \
 "and #1,r0 \n"    /* Clear all flags but T */ \
 "shll8 r0 \n" \
 "shlr2 r0 \n"     /* Moves 8-2 = 6 bit positions to the left */ \
 "or r0,%2 \n"     /* OR's that value, 3 flags done (ZERO) */ \
 "shlr16 %0 \n" \
 "shlr8 %0 \n"     /* Reset all values back 24-bits */ \
 "mov %0,r0 \n"    /* Moves the result into r0 for speed */ \
 "and #128,r0 \n"  /* get the last bit */ \
 "or r0,%2 \n"     /* put the result on F(SIGNAL) */ \
 "shlr16 %2 \n" \
 "shlr8 %2 \n"     /* both _A and value suffer the change */ \
 :"=r" (_A), "=r" (_F) \
 :"r" (value), "1" (_F), "0" (_A) \
 )
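To see why the shift by 24 is used and where the SIGNAL bug came from, here is the same idea in plain C. With both operands in the top byte, the 32-bit carry and signed overflow are the same as the 8-bit ones, which is what lets `addc` and `addv` do the work. While the value is still shifted, though, its low byte is always zero, so `and #128` can only return 0 there; the sign sits in bit 31 until the value is shifted back. The function name `add8_flags_top_byte` is hypothetical and the flag values are the usual Z80 bit positions.

```c
#include <stdint.h>

/* Hypothetical check: C, P/V, Z and S of an 8-bit add, derived from 32-bit
   arithmetic on operands moved into the top byte (shll16 + shll8). */
static uint8_t add8_flags_top_byte(uint8_t a, uint8_t value)
{
    uint32_t x = (uint32_t)a << 24;
    uint32_t y = (uint32_t)value << 24;
    uint32_t r = x + y;                    /* wraps modulo 2^32 */
    uint8_t  flags = 0;

    if (r < x)
        flags |= 0x01;                     /* carry out of bit 31 = 8-bit carry */
    if (~(x ^ y) & (x ^ r) & 0x80000000u)
        flags |= 0x04;                     /* signed overflow at bit 31 */
    if (r == 0)
        flags |= 0x40;                     /* zero */
    if (r & 0x80000000u)
        flags |= 0x80;                     /* sign is bit 31 while shifted */
    return flags;
}
```

Shifting the result back down by 24 first and then masking with 128, as the macro above now does, reads the same bit.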

The Z80.c I'm working on is here. I deleted ADD for x86 and C format, sorry, but they were messing up my mind ^^; :

Z80.c

(Edited by TakaIsSilly at 11:47 am on Dec. 28, 2001)
 
Arg... My function is 47 instructions long, while the compiled one, with perfect emulation of the Z80, is 45 instructions long. I was beaten by a compiler ;_;
 
No I'm not. If it's slower than the compiled version, it's useless. Anyway, this shows what the correct assembly format is. Read GCC.FAQ to see the extra quirks, and do an output of a function using the code suggested above (I made a .BAT file with %1 replacing foo and junk). I'm too busy to keep trying to optimize this 😛.
 