This forum is provided to promote discussion amongst students enrolled in Open Source Tools and Scripting.

Please consider offering answers and suggestions to help other students! And if you fix a problem by following a suggestion here, it would be great if other interested students could see a short "Great, fixed it!"  followup message.



 UWA week 15 (1st semester, week 7) ↓
3:47pm Wed 13th Apr, ANONYMOUS

Hello, I'm just looking at Part 1 of the assignment and have a few questions:
1) Are we going to be checked against the Excel file provided, or is it just an example of the file format?
2) "Malaria incidence (per 1 000 population at risk)" is the same everywhere. Do we have to extract it from the CSV file, or can we simply echo "blah blah blah, per 1000"?
3) Should "Sudan" and "Sudan (until 2011)" be treated as one country?


5:40pm Wed 13th Apr, Michael W.

Hi,
1) We will be using precisely that file, called by that name. Given that the data is unique, there is no point making the name of the input file a parameter.
2) Except for a little cleaning (mostly related to using integer data), the file is largely the one found on Kaggle.
3) Sudan split into Sudan and South Sudan. It is a sensible question to ask whether there are any differences in the rates. Contrast this with "Bolivia (Plurinational State of)", which is suggestive of the need for some data cleaning, perhaps using the skills from today's lecture.

Cheers
MichaelW


 UWA week 16 (1st semester, non-teaching week) ↓
3:19pm Wed 20th Apr, ANONYMOUS

Hi Michael, with regard to the last point, I don't see a way to remove "(Plurinational State of)" but keep "(until 2011)" in a generalised way using a single pipeline, as both phrases are enclosed in parentheses and consist of alphanumeric characters. Does that mean that you want us to manually remove (hard-code) all "State of" phrases?


9:23pm Sun 24th Apr, Rashmi MA.
Edited: shortly thereafter

Regarding the inputs:
1. Country
   a. Must the full name (e.g. "United Arab Emirates") be given to get a successful output, or should part of the name, such as "Emirates", also produce the output?
   b. If matching on part of a name is accepted, what should be done when several countries match that part, e.g. "United"?
   c. Can we exclude the part of the country name given within "()" from processing, and expect the user to refer to the country by its common name, e.g. Venezuela, ignoring "(Bolivarian Republic of)"?
2. Year
   a. Will the year be given as 4 digits, or do we need to convert 8 into 2008 and 16 into 2016, given that the years in the data set fall within 2000-2018?



 UWA week 17 (1st semester, week 8) ↓
4:05pm Mon 25th Apr, Michael W.

Hi Rashmi, I'll reply to each question in turn.

"Rashmi Mudugamuwa Arachchi" wrote:
> Regarding the inputs
> 1. Country
> a. Is the full name i.e. "United Arab Emirates" should be given the successful output or "Emirates" should also give the output considering part of the country name into consideration.
A sensible rule is that if that can be done unambiguously, then the contraction is okay. Emirates is a very good example, as that is how they are commonly known in English. Viet Nam == Vietnam was mentioned last week, as both names are common.
> b. if matching a part is accepted, what should be done if there are more countries fall into name part matching. i.e. "United".
United, Republic, et al, are pretty useless as they occur in a number of countries' names. At this point, I think it is best to say too many hits. In truth, we are now getting into the territory of approximate string matching, which is well beyond this unit.
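One way to apply the "too many hits" rule is to count the distinct names that contain the search term and only answer when exactly one remains. The following is a minimal sketch, not the expected solution: the function name `resolve_country` and the convention of feeding candidate names on standard input (one per line) are assumptions made for illustration.

```shell
# Hypothetical helper: read country names on stdin (one per line) and
# resolve the search term given as $1 to a single country name.
resolve_country() {
    local query=$1 matches count
    # -i: ignore case; -F: treat the term as a fixed string, not a regex.
    # sort -u collapses repeated names (one country appears on many rows).
    matches=$(grep -iF -- "$query" | sort -u)
    if [ -z "$matches" ]; then
        echo "no country matches '$query'" >&2
        return 1
    fi
    # grep -c '' counts the lines held in $matches.
    count=$(printf '%s\n' "$matches" | grep -c '')
    if [ "$count" -gt 1 ]; then
        echo "too many hits for '$query'" >&2
        return 2
    fi
    printf '%s\n' "$matches"
}

# Example with a hand-written list of names:
printf '%s\n' "United Arab Emirates" "United Kingdom" "Viet Nam" |
    resolve_country "Emirates"
```

With the same list, searching for "United" would print the "too many hits" message to standard error and return status 2.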
> c. Can we exclude processing the country name given within "()", and expect the user to refer to the country name using the common reference. i.e. Venezuela ignoring (Bolivarian Republic of)
Yes. Ignore anything in brackets.
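As an illustration of ignoring anything in brackets, a `sed` substitution can drop every parenthesised qualifier from a name. This is only a sketch under assumptions: the function name `strip_brackets` is made up here, and it treats all bracketed text alike, so "Sudan (until 2011)" collapses to "Sudan" just as "Bolivia (Plurinational State of)" collapses to "Bolivia".

```shell
# Hypothetical helper: remove "( ... )" qualifiers from a country name.
strip_brackets() {
    # First expression: delete optional whitespace followed by "(...)".
    # Second expression: trim any whitespace left at the end of the line.
    printf '%s\n' "$1" | sed -E 's/[[:space:]]*\([^)]*\)//g; s/[[:space:]]+$//'
}

strip_brackets "Venezuela (Bolivarian Republic of)"   # prints: Venezuela
strip_brackets "Bolivia (Plurinational State of)"     # prints: Bolivia
```

The same substitution could be applied to the country column of the data before matching, so that the user's input and the stored names are compared in the same cleaned form.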
> 2. Year
> a. Will the year be given in 4 digits or do we need to convert 8 into 2008 and 16 into 2016 and proceed considering the year in the given data set fall into 2000-2018
Years should have 4 digits. We are well beyond the "Y2K Problem", right?

Cheers
MichaelW
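A 4-digit year check can be done with a `case` pattern before any lookup. The sketch below is an assumption-laden example rather than a required approach: the function name `valid_year` is hypothetical, and the 2000-2018 range is taken from the question quoted above, not from the assignment text.

```shell
# Hypothetical check: succeed only for a 4-digit year between 2000 and 2018.
valid_year() {
    case $1 in
        [0-9][0-9][0-9][0-9]) ;;   # exactly four digits, nothing else
        *) return 1 ;;             # anything else (8, 16, 20080, abc) is rejected
    esac
    [ "$1" -ge 2000 ] && [ "$1" -le 2018 ]
}

valid_year 2008 && echo "2008 accepted"
valid_year 8 || echo "8 rejected"
```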


11:15am Wed 27th Apr, Rashmi MA.

Hi Michael, thank you for the detailed clarification.


4:42pm Wed 27th Apr, Hanlin Z.

Hi sir, I do not 100% agree that abbreviating a country name is workable. If a user searches for "Sudan", how does the bash script know whether they mean "Sudan" or "South Sudan"? As you said, the script will report multiple hits, but it still cannot show the user the info they want, since it cannot know whether the input is an abbreviation or not.

I have an idea that might solve this issue: try to find "Sudan" first. If there is a country called exactly "Sudan" in the data set, the script shows the info for "Sudan"; otherwise, it shows the info for any other country whose name matches "Sudan" as an abbreviation. But that raises another problem: when a user types "Sudan" as an abbreviation for "South Sudan", the script will match and return the info for "Sudan", which is not what they expect.

That is why I do not 100% agree that abbreviating country names is workable: there will always be some special cases that make your program behave the wrong way.
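The exact-match-first idea described in this post could be sketched as follows. This is an illustration only: the function name `lookup_exact_first` and the convention of reading candidate names from standard input are assumptions, and the sketch has the weakness the post itself points out, namely that "Sudan" always resolves to "Sudan" and never to "South Sudan".

```shell
# Hypothetical helper: names on stdin (one per line), search term as $1.
lookup_exact_first() {
    local query=$1 names exact
    names=$(cat)
    # -x: the whole line must match, so "Sudan" does not match "South Sudan".
    exact=$(printf '%s\n' "$names" | grep -ixF -- "$query" | sort -u)
    if [ -n "$exact" ]; then
        printf '%s\n' "$exact"
        return 0
    fi
    # No exact hit: fall back to substring matches (possibly several names).
    printf '%s\n' "$names" | grep -iF -- "$query" | sort -u
}

printf '%s\n' "Sudan" "South Sudan" "United Arab Emirates" |
    lookup_exact_first "Sudan"   # prints: Sudan
```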


11:58pm Wed 27th Apr, Marc E.

Hello,

See the updated information in the assignment. "Sudan" is ambiguous and will not be tested for. As long as you "human-proof" your program sensibly, you should be fine.

Cheers,
Marc


12:54am Thu 28th Apr, Hanlin Z.

Thanks for pointing that out. But to be honest, the text below is for staff: I hope that in the next assignment (or anything else), if staff update something, they announce it in some way, such as by email or by using a different colour from the original text. I don't think everyone will read through the assignment document from top to bottom again and again every day, waiting for an update.

The University of Western Australia

Computer Science and Software Engineering
