

This forum is provided to promote discussion amongst students enrolled in Open Source Tools and Scripting.

Please consider offering answers and suggestions to help other students! And if you fix a problem by following a suggestion here, it would be great if other interested students could see a short "Great, fixed it!" follow-up message.



 UWA week 17 (1st semester, week 8) ↓

ANONYMOUS wrote:

Hi Michael,

I am getting 911 hits for 'the' and 909 hits for 'I' in ADollsHouse.txt, which makes 'the' the -nth 1 (most popular) word. However, the testing sample says:

% ./common_words -w I text_files
The most significant rank for the word I is 1 in file ADollsHouse.txt

I get the same as Michael ("I" occurs 950 times, "the" 920) if I check just that file using grep and some additional flags (e.g. -w). Perhaps your search strategy is too restrictive?
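For anyone wanting to repeat this kind of check, here is a minimal sketch of a whole-word count with grep. The function name count_word is made up for illustration, and the exact flags used for the numbers above were not stated, so treat this as one plausible way to do it rather than the reference method.

```shell
# Count whole-word, case-sensitive occurrences of a word in a file.
# Usage: count_word WORD FILE   (count_word is a hypothetical helper name)
count_word() {
    # -o prints each match on its own line, so wc -l counts matches, not lines.
    # -w only accepts matches that are not touching letters, digits or underscores.
    # -F treats WORD as a fixed string rather than a regular expression.
    grep -o -w -F -- "$1" "$2" | wc -l | tr -d ' '
}
```

Something like `count_word I text_files/ADollsHouse.txt` would then print a single number. Note that -w treats an apostrophe as a word boundary, so the "I" in "I'm" is counted too; a stricter tokeniser would skip it, which is one way two counts of the same file can differ.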

Since the counts of the two words are extremely close, a few trivial or exceptional cases will have a large impact on the result. I think it comes down to how we define a word: are 'The', 'the', 'THE', and 'the' with some special characters attached all the same word? Apart from that, if I can't match 100% of your rules for constructing/defining a word and return a slightly different result, will I still get partial marks based on the logic of my code?

Some of the word definitions have been clarified (e.g. case is preserved, so "The" is different from "the"). Hope this helps!
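To see how much the word definition matters when two counts are this close, here is a rough sketch that finds the most frequent word with and without case folding. It assumes a word is simply a run of alphabetic characters, which may not match the project's actual rules, and top_word is a made-up helper name.

```shell
# Print the most frequent word in a file.
# Usage: top_word FILE [fold]   ("fold" lowercases the text first)
# Assumption: a word is a maximal run of alphabetic characters.
top_word() {
    if [ "${2:-}" = "fold" ]; then
        tr '[:upper:]' '[:lower:]' < "$1"
    else
        cat "$1"
    fi |
        tr -cs '[:alpha:]' '\n' |   # turn every non-letter run into one newline
        grep -v '^$' |              # drop the empty line a leading non-letter leaves
        sort | uniq -c |            # count identical words
        sort -rn |                  # highest count first
        awk 'NR == 1 { print $2 }'  # keep only the word from the top line
}
```

On a file containing just `the The the The I I I`, the case-preserving call prints `I` (3 occurrences against 2 each for "the" and "The"), while the folded call prints `the` (4 against 3). That is the same kind of flip described in this thread.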

The University of Western Australia

Computer Science and Software Engineering

CRICOS Code: 00126G
Last modified  1:17AM Sep 14 2022