Implicit cast (int->long) bug!

Hi,

The mkdir command on the RAMDISK has a bug: an exception (BadClusterValue) is thrown.
I traced it with System.out and found the location of the bug in Fat.java.

public synchronized long[] allocNew(int nrClusters) throws IOException {
    long rc[] = new long[nrClusters];
    rc[0] = allocNew();
    ...
}

allocNew() returns an int with the value 2 (for the first directory created), which is correct.
But the value of rc[0] after the assignment is 0x91A526800000002!
The expected value 2 is visible in the low 32 bits (0x00000002), but the high 32 bits of the long are wrong (0x091A5268).
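
To make the two halves easier to see, the reported value can be split into its upper and lower 32 bits. This is only an illustration in plain Java; the class name LongHalves is made up, and the only input is the value quoted above.

```java
public class LongHalves {
    /** Upper 32 bits of a long, as an int. */
    static int high(long v) {
        return (int) (v >>> 32);
    }

    /** Lower 32 bits of a long, as an int. */
    static int low(long v) {
        return (int) v;
    }

    public static void main(String[] args) {
        long observed = 0x91A526800000002L; // value seen in rc[0]
        System.out.println("low  = 0x" + Integer.toHexString(low(observed)));  // low  = 0x2
        System.out.println("high = 0x" + Integer.toHexString(high(observed))); // high = 0x91a5268
    }
}
```

The low half is exactly what allocNew() returned; only the high half carries unexpected bits.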

PS:
I tested on a real PC, using TFTP.

How to reproduce?

Hi,

Can you describe all the steps needed to reproduce the problem?
I cannot reproduce it here.

Ewout

long return value bug

I have mentioned this bug before, along with a way to reproduce it:
- a strange bug: with certain methods, it looks like the return value gets screwed up between the return statement and the caller (e.g. a method returns 33 and the caller sees something like 67675646567575478785). It only happens in certain methods, but with those, it happens every time. I made a workaround by returning a Long instead of a long where it happens, but this looks kinda like a bug in the VM and is very weird. If you want to test this bug, open a file for appending and you will get this at FileOutputStream, in the FileOutputStream(File, boolean) constructor. When fh.getLength() is called, garbage is received, although if you println the return value inside getLength(), it is correct (at least this is what used to happen for me).
Andras
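
For readers who want to see the shape of that workaround, here is a minimal sketch. The class and method names are hypothetical stand-ins (this is not the actual getLength() code), and the sketch does not explain why the workaround helps; it only shows the change from a primitive long result to a boxed Long that the caller unboxes.

```java
public class BoxedReturnWorkaround {
    /** Original shape: the result is returned as a primitive long. */
    static long lengthPrimitive() {
        return 33L;
    }

    /** Workaround shape: the result is returned as a reference to a Long. */
    static Long lengthBoxed() {
        return Long.valueOf(33L);
    }

    public static void main(String[] args) {
        long viaBox = lengthBoxed().longValue(); // unbox at the call site
        // On a VM without the bug both values are identical.
        System.out.println(lengthPrimitive() + " " + viaBox);
    }
}
```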

First, try this

First of all, if you type this: cd jnode; mkdir test
Does it run?

On my PC I get the exception "Bad cluster 655927302366904898", but the ramdisk is mounted as FAT without any error.

Here is my Fat.java with debug lines added; the comments show the output when I run mkdir in /jnode:

public synchronized long[] allocNew(int nrClusters) throws IOException {

    long rc[] = new long[nrClusters];
    System.out.println("rc 0 : "+rc[0]);
    // prints 0 on the console
    rc[0] = allocNew();
    System.out.println("rc 0 : "+rc[0]);
    // prints 655927302365904898 on the console
    System.out.println("test : "+test()+" : "+test2());
    // here I get: "test : 655927302366904898 : 2"
    .....
}

public synchronized long allocNew() throws IOException {

    int i;
    int entryIndex = -1;
    .........
    entries[entryIndex] = eofMarker;
    entryIndexFree = entryIndex+1;
    this.dirty = true;
    System.out.println("alloc new : "+entryIndex);
    // prints 2 here

    return entryIndex;

}

public synchronized long test(){
    long l=1;
    l++;
    System.out.println("long "+l);
    // prints 2 here
    return l;
}
public synchronized int test2(){
    int i=1;
    i++;
    System.out.println("int "+i);
    // prints 2 here
    return i;
}

When I change public long allocNew() to public int allocNew(), mkdir has no problem!
I think that, on my PC, the upper 32 bits of a long are corrupted. But I don't understand why.

Is there a debug command to print the assembler code of a compiled Java method?

My Java compiler is 1.4.2, nasm is 0.98.38, and my PC is an x86 family 6 model 8 stepping 3
with 386 MB of RAM.

I had the same problem and solved it the same way!

I think it's a bug in the VM but I haven't verified it.

I solved it (but have not committed to CVS, as I can't for now) with the same modification you made.
I don't know much about the limits specified for FatFileSystem, but I think an int (32 bits) is sufficient: it gives us 2^32 * cluster size (about 2 * 1024 GB!)
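
The figure above can be checked with a little arithmetic. The sketch below assumes a cluster size of 512 bytes, which is not stated in the post but is the size for which 2^32 clusters come out at 2 * 1024 GB. It also prints the result for 31 index bits, since a Java int is signed and has only 2^31 non-negative values. The class and method names are invented for the example.

```java
public class ClusterCapacity {
    static final long GB = 1024L * 1024L * 1024L;

    /** Addressable bytes for a given number of index bits and cluster size. */
    static long capacityBytes(int indexBits, long clusterSize) {
        return (1L << indexBits) * clusterSize;
    }

    public static void main(String[] args) {
        long clusterSize = 512L; // assumption, not taken from the post
        System.out.println(capacityBytes(32, clusterSize) / GB + " GB"); // 2048 GB
        System.out.println(capacityBytes(31, clusterSize) / GB + " GB"); // 1024 GB
    }
}
```

Either way the limit is far larger than a ramdisk, so switching the return type to int looks safe as a workaround.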

Yes, 32 bits is OK, but...

Yes, 32 bits is OK. But we need to find where this bug comes from.
What is your processor? PI, PII, PIII?

Same bug with Athlon XP 2800+

I have searched (in JNode-Core, org.jnode.vm.*), but I don't know where casts are done (maybe in *.asm).

I can't help you for now, but could you give more details? Here are some ideas for searching:

Could you reproduce the bug with a very basic little Java program?
Maybe you should test with different kinds of data: a local, a member, a formal parameter, a value given by a return.

Have you tried with an implicit cast to long?
Is there also a bug with the casts short->int or short->long?
And what if you don't use an array of long but just a long?
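
A small test program along those lines might look like the following. It is only a sketch: the class and method names are invented, and it says nothing about where the bug really is. Each check should print "ok" on a VM without the bug; any "WRONG" line would narrow down which conversion or storage location is affected.

```java
public class WideningCheck {
    // Each method widens its argument implicitly in the return statement.
    static long intToLong(int i) { return i; }
    static int shortToInt(short s) { return s; }
    static long shortToLong(short s) { return s; }
    static double intToDouble(int i) { return i; }

    // A long that never was an int, like test() earlier in the thread.
    static long plainLong() {
        long l = 1;
        l++;
        return l;
    }

    static void report(String label, boolean ok) {
        System.out.println(label + ": " + (ok ? "ok" : "WRONG"));
    }

    public static void main(String[] args) {
        long local = intToLong(2);   // result stored in a local
        long[] array = new long[1];
        array[0] = intToLong(2);     // result stored in an array element

        report("int->long, local", local == 2L);
        report("int->long, array", array[0] == 2L);
        report("short->int", shortToInt((short) 2) == 2);
        report("short->long", shortToLong((short) 2) == 2L);
        report("int->double", intToDouble(2) == 2.0);
        report("long return, no cast", plainLong() == 2L);
    }
}
```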

Not a cast bug!

It is a bug in the return, not in the cast.
Normal return types (byte, int, float, reference, array) are returned in EAX.
long and double are 64 bits wide; JNode uses the register pair EAX:EDX for the return value (EAX for the low 32 bits, EDX for the high 32 bits).
The compiler could also inline a function, but I don't know whether JNode does that.
I made a new test (in class Fat.java) with the type double. It has the bug too!
It is the same bug as with long!
Perhaps the lreturn bytecode is buggy? Or is it the push64 after the return?
I don't know.
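
As a rough model of the convention described above, a 64-bit result can be thought of as two 32-bit halves that the caller glues back together. The sketch below is plain Java, not JNode code; the parameter names eax and edx simply stand for the low and high halves, and 0x091A5268 is the high half of the value reported at the top of the thread. It shows that a correct low half combined with a stale high half reproduces the reported symptom.

```java
public class RegisterPairModel {
    /** Rebuilds a long from its low (eax) and high (edx) 32-bit halves. */
    static long combine(int eax, int edx) {
        return ((long) edx << 32) | (eax & 0xFFFFFFFFL);
    }

    /** High half that a correct int -> long widening produces (sign extension). */
    static int signExtendHigh(int value) {
        return value >> 31;
    }

    public static void main(String[] args) {
        int eax = 2; // the value allocNew() really returned

        // Correct widening: the high half is the sign extension of the low half.
        System.out.println(combine(eax, signExtendHigh(eax)));          // 2

        // Stale high half: the low half is right, the long is garbage.
        System.out.println(Long.toHexString(combine(eax, 0x091A5268))); // 91a526800000002
    }
}
```

This does not say which instruction leaves the high half unset; it only shows that the symptom is consistent with a problem in the upper register of the pair.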

Bug in l1compiler?

It sounds like this is somehow a bug in the l1 compiler, but then it should be reproducible every time.

I've looked, but I cannot reproduce it or find it.
Have a look at org.jnode.vm.x86.compiler.l1.X86BytecodeVisitor; maybe someone can find it.

Ewout

An idea

Here is the pop that is emitted before returning a long value; it is called with EAX and EDX:

public void writePOP64(Register lsbReg, Register msbReg) {
		popCount++;
		//final int len = os.getLength();
		final int len = os.getLength() + DISABLED;
		if ((len == lastPushEnd) && (lastPushMode == MODE_REG64)) {
			if (lsbReg != lastPushRegMSB) {
				// We can undo the last push
				pushTrimCount++;
				os.trim(lastPushEnd - lastPushStart);
				/*
				 * if (os.isLogEnabled()) { os.log("rewrite pop64 [" + lsbReg + "+" + msbReg + "]");
				 */
				if (lsbReg != lastPushReg) {
					os.writeMOV(X86Constants.BITS32, lsbReg, lastPushReg);
				}
				if (msbReg != lastPushRegMSB) {
					os.writeMOV(X86Constants.BITS32, msbReg, lastPushRegMSB);
				}
				lastPushEnd = -1;
			} else {
			    //BootLog.debug("lsbReg == lastPushRegMSB (" + lsbReg + ")", new Exception("Stacktrace"));
				popWarnCount++;
				os.writePOP(lsbReg);
				os.writePOP(msbReg);
			}
		} else {
			os.writePOP(lsbReg);
			os.writePOP(msbReg);
		}
}

It looks like this method tries to do some optimisation depending on what the immediately preceding operations were. That code is executed only when generating code for certain methods, depending on their bytecode.
I wonder if the postprocessing of the generated code can interfere with this... (remember the boot-time message that reports the removal of push-pop pairs). If so, that could also be a source of unpredictability.
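
To see what that optimisation amounts to, here is a toy model that compares a real push64/pop64 pair with the two MOV instructions that replace it. It is a simplified sketch, not JNode code: registers are array slots, the stack is a java.util.ArrayDeque, the push order (MSB first) is inferred from the pop order in writePOP64, and all names are invented. It suggests why the lsbReg != lastPushRegMSB guard is there: without it, the first MOV would overwrite the source of the second.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

public class PushPopPeephole {
    static final int EAX = 0, EBX = 1, ECX = 2, EDX = 3;

    /** Reference behaviour: push MSB then LSB, pop LSB then MSB. */
    static int[] pushThenPop(int[] regs, int pushLsb, int pushMsb, int popLsb, int popMsb) {
        int[] r = regs.clone();
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(r[pushMsb]);
        stack.push(r[pushLsb]);
        r[popLsb] = stack.pop();
        r[popMsb] = stack.pop();
        return r;
    }

    /** Optimised behaviour: two register moves, LSB first, as in writePOP64. */
    static int[] twoMoves(int[] regs, int pushLsb, int pushMsb, int popLsb, int popMsb) {
        int[] r = regs.clone();
        r[popLsb] = r[pushLsb];
        r[popMsb] = r[pushMsb];
        return r;
    }

    public static void main(String[] args) {
        int[] regs = {10, 20, 30, 40}; // EAX, EBX, ECX, EDX

        // Safe case: popLsb (EAX) differs from pushMsb (ECX).
        System.out.println(Arrays.equals(
                pushThenPop(regs, EBX, ECX, EAX, EDX),
                twoMoves(regs, EBX, ECX, EAX, EDX)));   // true

        // Guarded case: popLsb == pushMsb (both ECX), so the moves clobber ECX.
        System.out.println(Arrays.equals(
                pushThenPop(regs, EBX, ECX, ECX, EDX),
                twoMoves(regs, EBX, ECX, ECX, EDX)));   // false
    }
}
```

In this model the guard in writePOP64 covers the only case where the two-MOV rewrite changes the result, so the rewrite by itself does not obviously explain the bug.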

I have already tested with DISABLED=1

I have already tested with DISABLED=1 (which turns off the push/pop removal); it made no difference!
And pop64/push64 are used in many cases (long + long, for example) without any bug.

Andreas has the same bug, but on a different setup: "I'm using vmware 3 on an intel centrino machine"

Before the assignment to rc[0]

Some clarifications:

Before the assignment, the value of rc[0] is 0 (I have checked).

The precise position of the cast is here:

public synchronized long allocNew() throws IOException {
    int entryIndex = -1;
    ....
    return entryIndex; // cast int to long
}

PS:
I am using the latest JNode source (CVS HEAD).