
help5501

This forum is provided to promote discussion amongst students enrolled in CITS5501 Software Testing and Quality Assurance. If posting a question, it's suggested you check first whether your question is answered in the unit Frequently Asked Questions (FAQ) list, and use the search box (on the right) to see if an answer to your question has already been posted.



3:14pm Wed 11th Sep, Arran S.

ANONYMOUS wrote:

> Hi Arran,

> In lab 3.1 Binary Search-question b, you said the input domain consists of a set of all possible chars s (there are 256 of them). I am curious why there are 256 possible chars.

That's a simplifying assumption – I'll make a note of that in the lab sheet.

> Is it because each character is 8 bits long and thus has 256 possible characters? But I look it up in java doc(https://docs.oracle.com/javase/8/docs/api/java/lang/Character.html). It says Java characters now use Unicode Standard, and "The range of legal code points is now U+0000 to U+10FFFF". Does it mean there are 2^16 (or 65,536) possible chars now?

Actually, Java chars are 2 bytes (16 bits) in size, so they can take on 65,536 distinct values. (Also, note that you linked to version 8 of the Java standard library; the current version is Java 22, which has different documentation for the Character class.)

But that's not the same as the range of "legal code points" (U+0000 to U+10FFFF). There are 1,114,112 distinct code points, but not all of them have been assigned to actual characters – e.g. in Unicode version 16.0, only 154,998 characters have been assigned so far. In the version of Java you linked to (Java 8), version 6.2 of the Unicode standard was being used, which would have had fewer assigned characters.
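As a rough sketch of the arithmetic above, the following Java program derives these counts from constants in java.lang.Character. The class and method names are made up for illustration. Note that Character.isDefined also treats private-use and surrogate code points as defined, and its result depends on the Unicode version your JVM supports, so the last figure it prints will not match the assigned-character count quoted above.

```java
public class CharDomainSizes {

    // Number of distinct values a 16-bit Java char can hold.
    static int charValueCount() {
        return Character.MAX_VALUE - Character.MIN_VALUE + 1;
    }

    // Number of code points in the legal range U+0000 to U+10FFFF.
    static int codePointCount() {
        return Character.MAX_CODE_POINT - Character.MIN_CODE_POINT + 1;
    }

    // Number of code points that the running JVM's Unicode tables treat
    // as defined (this includes private-use and surrogate code points).
    static int definedCodePointCount() {
        int count = 0;
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            if (Character.isDefined(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(Character.SIZE);      // 16 (bits in a char)
        System.out.println(charValueCount());    // 65536
        System.out.println(codePointCount());    // 1114112
        // Varies with the Java (and hence Unicode) version in use.
        System.out.println(definedCodePointCount());
    }
}
```

Running this on different Java versions is one way to see that the number of defined code points grows as newer Unicode versions are adopted, while the other two figures stay fixed.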

In Java, when a method needs to take an argument representing a "character", it can do so in two ways.

  • It can take a char argument. In that case, it will be limited to the 65,536 values a char can represent. The binary search method in the lab is an example of a method like this.

  • It can take an int argument, and in that case, the full range of Unicode characters can be used (from the version of Unicode being used by that particular Java standard). However, a Java int is 32 bits (4 bytes) in size, so many possible int values don't represent code points at all, and of the values that do represent possible code points, not all will have actually been assigned. To find out how the method handles invalid values, you need to read the documentation for the method.

    An example of a method like this is the indexOf method of the java.lang.String class.

    The documentation for that method describes the behaviour when the int passed falls in the U+0000 to U+10FFFF range (1,114,112 values), but doesn't say what happens if an int outside that range is passed; so we must take the behaviour to be undefined.
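The difference between the two kinds of argument can be sketched roughly as follows. The class name and the safeIndexOf helper are hypothetical, made up for this example; String.indexOf(int), Character.isValidCodePoint and Character.charCount are real standard library methods. The helper checks the int argument itself, so that indexOf is never called with a value outside the code point range.

```java
public class CharVersusCodePoint {

    // Hypothetical wrapper around String.indexOf(int): rejects any int
    // that is not in the legal code point range before searching.
    static int safeIndexOf(String s, int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException(
                "Not a code point: " + Integer.toHexString(codePoint));
        }
        return s.indexOf(codePoint);
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 (stored as two chars, a surrogate pair), then 'b'.
        String text = "a\uD83D\uDE00b";

        System.out.println(text.length());                         // 4 chars
        System.out.println(text.codePointCount(0, text.length())); // 3 code points
        System.out.println(Character.charCount(0x1F600));          // 2

        // A single char cannot hold U+1F600, but an int argument can.
        System.out.println(safeIndexOf(text, 0x1F600));            // 1
        System.out.println(safeIndexOf(text, 'b'));                // 3

        // 0x110000 is a perfectly good int, but not a code point.
        System.out.println(Character.isValidCodePoint(0x110000));  // false
    }
}
```

From a testing point of view, the int version has a larger input domain to think about: valid code points that fit in a char, valid code points that don't, and ints that aren't code points at all.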

When answering questions about ISP (input space partitioning) in tests or exams, you're welcome to make simplifying assumptions, if needed. It would probably be a good idea to do so if you have a question involving character or String types, since you might not remember the exact size of a Java char (16 bits) and may not have access to the Unicode standard telling you how many assigned characters there are.

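So you could make the simplifying assumption that we're only considering 8-bit characters. (Strictly, ASCII defines only 128 characters; 256 values corresponds to an 8-bit extension of ASCII such as ISO-8859-1. You should clearly state that you're making this assumption, however, and that the actual number of possible characters is larger.)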

In this case, you might say "If we assume for simplicity that a char can represent 256 values ...", and then answer the question on that basis. This is fine, because the markers are generally more interested in your reasoning than in whether you can accurately recall exactly how many valid characters there are in the current Java standard.

Does that help at all?

Cheers

Arran

The University of Western Australia

Computer Science and Software Engineering

CRICOS Code: 00126G