Auto byte to int Promotion Bug: use of "0xff", where "0x0ff" is needed

Project:JNode Core
Component:Code
Category:bug report
Priority:normal
Assigned:Unassigned
Status:works for me
Description

X86BinaryAssembler.java (and potentially others) have code like this:

502	    public final int get32(int offset) {
503	        int v1 = m_data[offset++];
504	        int v2 = m_data[offset++];
505	        int v3 = m_data[offset++];
506	        int v4 = m_data[offset];
507	        return (v1 & 0xFF) | ((v2 & 0xFF) << Cool | ((v3 & 0xFF) << 16)
508	            | ((v4 & 0xFF) << 24);
509	    }
510	
511	    public final long get64(int offset) {
512	        long v1 = m_data[offset++];
513	        long v2 = m_data[offset++];
514	        long v3 = m_data[offset++];
515	        long v4 = m_data[offset++];
516	        long v5 = m_data[offset++];
517	        long v6 = m_data[offset++];
518	        long v7 = m_data[offset++];
519	        long v8 = m_data[offset];
520	        return (v1 & 0xFF) | ((v2 & 0xFF) << Cool | ((v3 & 0xFF) << 16)
521	            | ((v4 & 0xFF) << 24) | ((v5 & 0xFF) << 32)
522	            | ((v6 & 0xFF) << 40) | ((v7 & 0xFF) << 48)
523	            | ((v8 & 0xFF) << 56);
524	    }
525	
526	    public final int get8(int offset) {
527	        return (m_data[offset] & 0xFF);
528	    }

X86BinaryAssembler.java (and potentially others) SHOULD have code like this:

502	    public final int get32(int offset) {
503	        int v1 = m_data[offset++];
504	        int v2 = m_data[offset++];
505	        int v3 = m_data[offset++];
506	        int v4 = m_data[offset];
507	        return (v1 & 0x0FF) | ((v2 & 0x0FF) << Cool | ((v3 & 0x0FF) << 16)
508	            | ((v4 & 0x0FF) << 24);
509	    }
510	
511	    public final long get64(int offset) {
512	        long v1 = m_data[offset++];
513	        long v2 = m_data[offset++];
514	        long v3 = m_data[offset++];
515	        long v4 = m_data[offset++];
516	        long v5 = m_data[offset++];
517	        long v6 = m_data[offset++];
518	        long v7 = m_data[offset++];
519	        long v8 = m_data[offset];
520	        return (v1 & 0x0FFL) | ((v2 & 0x0FFL) << Cool | ((v3 & 0x0FFL) << 16)
521	            | ((v4 & 0x0FFL) << 24) | ((v5 & 0x0FFL) << 32)
522	            | ((v6 & 0x0FFL) << 40) | ((v7 & 0x0FFL) << 48)
523	            | ((v8 & 0x0FFL) << 56);
524	    }
525	
526	    public final int get8(int offset) {
527	        return (m_data[offset] & 0x0FF);
528	    }

Where use of "0xFF" should instead be coded as "0x0ff" to avoid AutoInt Promotion bugs.

For example, taking get8(...) method above:
* if m_data[0] is any value 0x080 thru 0x0ff
* then get8(0) will return int value 0x0ffffff80 thru 0x0ffffffff, respectively.

The extra MSB 31 thru 8 are set due to "0xFF" being autopromoted to int value "0x0FFFFFFFF".

I would suggest a gate wide search to avoid further cases of this issue.

#1

I do not see the bug you're discribing as 0xff is always an int. So what you say should not happen (0x0ff and 0xff) should both be ints and the same number.

Though for the get64-case you should be right and we have to add the 0xffl instead of 0xff.

#2

Status:active» works for me

Actually the get64-case is ok too as the single v1,v2,.. values are already "long"-values and 0xff should expand to 0xffl.

Therefore I'm setting this bug to "works for me" except you'd prove me wrong.

#5

Priority:critical» normal

#11

This code, as an example, demonstrates Java byte to int auto promotion:

public class Size {
public static void main(String[] args) {
	byte b = (byte)(args.length);
	switch (b) {
	case 0x00: System.exit(b);
	case 0x01: System.exit(b);
	case 0x80: System.err.println("You used a lot of arguments");
	}
}
}

Results in this compilation error, demonstrating Auto byte to int promotion:

Size.java:7: possible loss of precision
	case 0x80: System.err.println("You used a lot of arguments");
	     ^
  required: byte
  found:    int
1 error

Some byte values with the highest order bit 7 set are auto promoted to int values with bits 31-8 set to 1. The troubles come in assuming this will never happen and coding 0x80 through 0xff values. This is particularly problematic when coding OR'ing techniques described here.

To be safe for interoperability issues, code with leading 0x0... instead of 0x... to ensure highest order bit is set to 0 (explicitly). Especially good to be safe with X86BinaryAssembler code...

#12

Actually, what is happening here (in the 0x00 and 0x01 cases) is an assignment conversion doing a primitive narrow ... in the case of a constant expression. In the 0xff case, this is not legal.

The relevant JLS sections are:

  • Section 3.10.1 paragraph 3 - states that an integer literal is of type 'int' unless it has an 'L' suffix.
  • Section 5.2, in the paragraph that starts "In addition, if the expression is a constant expression ..." - states that an "assignment conversion" can perform a primitive narrow of a constant expression to a 'byte', 'short' or 'char' type, if the expression's value is in the required range.
  • Section 14.11 - states that "every case constant expression associated with a switch statement must be assignable (ยง5.2) to the type of the switch Expression."

The range of 'byte' is -128 to 127, and the 3rd case label is 0xff == 255 ... out of range. Therefore it is a compilation error.

In fact, this whole issue is bogus, per JLS Section 3.10.1. The literals 0xff and 0x0ff are both integer literals with type 'int' ... and they have the same value.

#13

Thanks for the clarification. This appears to be resolved, no code change needed. Thanks.