Please consider offering answers and suggestions to help other students!
And if you fix a problem by following a suggestion here,
it would be great if other interested students could see a short
"Great, fixed it!" followup message.
ANONYMOUS wrote
While I said I'll delay replying till next week, I just need to note that I ran the tests from within an instance of the Docker image, which implements the standard environment. That environment is based on plain text files.
Cheers
Mich...
Hi,
Working on it as fast as we can. I have about 40 of the submissions completely marked, with the remainder tested, but not yet marked by a human (which is the slow, but important, part).
Cheers
MichaelW
Hi
Here is why the representation of floating point numbers is locale dependent:
https://www.grammar-monster.com/lessons/commas_with_numbers.htm
Cheers
MichaelW
ANONYMOUS wrote
Hi,
Fair question. That is just a matter of personal taste on my part. Not important. Recall that exit status 0 implies success; status 1 implies an error. I assume that if the user types a command without options they are trying t...
ANONYMOUS wrote
We're working on it, but no guarantees, not least because most of those doing the marking have exams themselves. Marking 300 assignments is a non-trivial exercise, and unlike the assignments themselves, we cannot start early
Che...
Hi Junyu,
Fair question. The answer is that there is, once again, a proliferation of RE sub-languages, so in the same way that the course standardised on Bash (which I don't actually use, preferring ksh), the rationale has all along been to go with the ...
ANONYMOUS wrote
Hi,
Yes, there were 18 lectures (starting at L0), and I believe I updated each just after I presented it, as I nearly always add stuff as I prepare the lecture for presentation. You can check against the class schedule - also on LMS.
C...
ANONYMOUS wrote
Hi,
To pass, you'll need to score 50% (or more) over the sum of the 3 components: the two assignments and the final exam, respectively weighted 20%, 20% and 60%. It's in the unit outline for every unit.
Good luck, by the way.
Cheers
MichaelW
ANONYMOUS wrote
Hi,
No. There are more questions to get your marks on, and more time, to give you more opportunity to demonstrate your prowess in scripting.
Cheers
MichaelW
ANONYMOUS wrote
Hi,
According to the break-down given at the start (and in the README), the final mark will be out of 60. The earlier paper was out of 60, and the exam was 90 mins. This exam will be marked out of 90 and you'll have 2 hrs; the exam mark...
ANONYMOUS wrote
Hi,
Having finished the planned material, and done revision on Monday, there was no final lecture this morning (Wed 22 May).
Good luck with the exams.
Cheers
MichaelW
ANONYMOUS wrote
Hi,
Yes. I follow UWA Assessment Policy guidelines for CITS4407, which provide for max 7 days late, but with a deduction of 5% of the full mark (so 1 out of 20) for each day late, or part thereof.
Cheers
MichaelW
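The penalty arithmetic described above can be sketched in Bash integer arithmetic. This is only an illustration: the function name `late_mark` and its argument order are made up here, the number of days is assumed to be already rounded up to whole days, and submissions more than 7 days late are simply rejected because the post does not say how they are treated.

```shell
#!/usr/bin/env bash
# Hypothetical helper: apply a late penalty of 5% of the full mark per day late.
# Arguments: raw mark, full mark, days late (already rounded up to whole days).
late_mark() {
    local raw=$1 full=$2 days=$3
    if (( days > 7 )); then
        echo "more than 7 days late: not covered by this sketch" >&2
        return 1
    fi
    # 5% of the full mark per day; for a full mark of 20 that is 1 mark per day.
    local penalty=$(( days * full * 5 / 100 ))
    local result=$(( raw - penalty ))
    if (( result < 0 )); then
        result=0
    fi
    echo "$result"
}

late_mark 15 20 2    # 15 out of 20, two days late: prints 13
```

Bash arithmetic is integer-only, so the sketch is exact only when 5% of the full mark is a whole number, as it is for a mark out of 20.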
Hi Andre,
Alexandra is correct. UWA Assessment Policy mandates up to 7 days for late submissions for most assessment items, but with a penalty of 5% of the full mark (so 1 mark out of 20) per late day, or part thereof. That is how things are run for CITS...
ANONYMOUS wrote
Hi,
Having removed the rows for entities which do not have a country code, the combination of country code - year will be unique. You need to do a join across that shared combined code, and take in all the data (including empty cell...
Hi Zeke,
This is not an error. The data is valid; the Cantril values just happen to be the same. (Perhaps they asked the same person each year?? ). In any case, from the point of view of data cleaning, it is clean data, and for this analysis, all ...
ANONYMOUS wrote
Hi
When you are working to the screen you cannot tell whether the data came from stdout or stderr. The only way is to redirect the streams:
program_tested > from_stdout 2> from_stderr
Cheers
MichaelW
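A small runnable sketch of that redirection follows. The function `program_tested` is a hypothetical stand-in for whatever program is being checked, and the two file names are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the program being tested: one line to each stream.
program_tested() {
    echo "real data"              # goes to stdout
    echo "error message" >&2      # goes to stderr
}

workdir=$(mktemp -d)

# On the screen both lines would look alike; redirected, they land in separate files.
program_tested > "$workdir/from_stdout" 2> "$workdir/from_stderr"

echo "stdout captured: $(cat "$workdir/from_stdout")"
echo "stderr captured: $(cat "$workdir/from_stderr")"

rm -r "$workdir"
```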
Hi,
Given that you posted anonymously I can't comment, but most marks/feedback have been posted, with just a small number left of those which had been set aside (groan) and then the auto-testing failed, so they have to be tested by hand. I'm down to the...
Hi,
That is not how I understand things, which are illustrated in the sample1.xlsx worked example based on sample1.tsv. For each country there are 3 correlations to be done: GDP, Homicide Rate, Life Expectancy, each versus Cantril, based on predicto...
Hi Kai,
Empty cells are fine anywhere for the data cleaning program, so long as the rows are within the range of years of interest. Re double quotes, you only need these in CSV format, because comma can occur naturally within a string, so the whole s...
ANONYMOUS wrote
Hi, as far as I know, the marks are on csmarks, but the feedback files are associated with cssubmit. I'm down to the last few, which had to be marked by hand, for a variety of reasons. Speaking of which, I really should get back to m...
ANONYMOUS wrote
Hi
I certainly believe so. Don't forget, you have all the TSV source files, so you can work forward and see what should be there. Have I made a mistake?
Cheers
MichaelW
Hi,
I'm not sure I understand the question. The output file is to be sent to standard output, where, if stdout is not redirected into a file, it will appear on the screen. Is that what you mean?
Cheers
MichaelW
ANONYMOUS wrote
Hi,
it mostly is, but in a few cases no; the difference is really whether a code is present at all in the code cell, or absent.
Cheers
MichaelW
Hi Runzhi,
Yes, you can use a number of scripts to get the job done, including more than one awk script, if you wish. You then put all the scripts for the two programs, plus the .git repo for the assignment, in the single zip'd/tar'd directory.
Re inf...
Hi Boya,
Good questions. I'll reply to each below it.
The trick with all of these is to place yourself in the role of a user of the system. What will they expect? What information will be useful?
As far as I can see, the country data (ie data that has a...
ANONYMOUS wrote
Hi,
Yes. It includes penalties for late submission based on UWA Assessment Policy, which, for some people, has been modified by Special Consideration or UAAP. If you have either of those and believe it has not been taken into account, ...
ANONYMOUS wrote
Hi,
If I taught it, it's examinable. I can't say what the questions will be. For more general comments about the style of the questions in the exam, please review Monday's lecture slot, where that was covered (also stuff on Assignment ...
Hi Eusha,
The error message about the incorrect row goes to stderr. Then you move on to the next input line, which has the effect of deleting the row from the output. That is, the line won't be there so it appears to have been deleted. (Of course, the...
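As a rough illustration of "report to stderr, then skip the row", here is a minimal sketch using awk inside a Bash function. The function name `drop_bad_rows` is hypothetical, and the rule used (every row must have the same number of tab-separated cells as the header) is an assumption about what counts as an incorrect row.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read TSV on stdin, pass good rows to stdout,
# report rows with the wrong number of cells on stderr and skip them.
drop_bad_rows() {
    awk -F'\t' '
        NR == 1 { want = NF }          # the header fixes the expected cell count
        NF != want {
            printf "line %d: expected %d cells, found %d\n", NR, want, NF > "/dev/stderr"
            next                       # skip the row, so it never reaches stdout
        }
        { print }'
}

printf 'Code\tYear\nAUS\t2020\nbroken\nNZL\t2021\n' | drop_bad_rows
```

Because the bad row is never printed, it simply does not appear in the output, while the message about it travels separately on stderr.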
ANONYMOUS wrote
Hi,
Yes, please include the .git directory in the directory with the other bits, which you then zip or tar for submission.
Cheers
MichaelW
ANONYMOUS wrote
Hi,
The requirements are:
- cantril_data_cleaning: output is sent to standard output; input is from the 3 TSV format files
- best_predictor: takes a TSV format file of cleaned data, and sends the output of the analysis to standard o...
Hi
I'll answer each part after the question.
ANONYMOUS wrote
I posted it primarily as input to best_predictor (particularly in relation to the correlation calculation). However, it also serves to show what the output to the cleaning stage looks like.
Diff...
ANONYMOUS wrote
The data from data cleaning should be sent to standard output. This will go to the terminal unless stdout is redirected into a file, which is generally what happens, but that is up to the person running the program, not you. Makes sense?
Cheers
MichaelW
...
Hi Kaichao,
The testing for the correct number of cells per row should happen for the 3 files submitted to the data cleaning program, after which all the lines in a given file should have the same number of cells. Joining two files containing lines wi...
Not sure which program you are referring to. The data cleaning program should output .tsv formatted data via stdout (error messages to stderr). The analysis program will report plain text results, similar to the example, via stdout.
Cheers
MichaelW
...
ANONYMOUS wrote
That is correct. There will be other cleaned data for the analysis program (as there will be for the inputs to the data cleaning program).
cheers
MichaelW
Hi,
I'll respond to each bit below it,
ANONYMOUS wrote
As always, that depends on whether the problem is locally fixable or not fixable.
A .csv file instead of the expected .tsv (or some other separator) is not readily fixable (if done properly), so erro...
Hi Zhiyang,
Yes, of course the program can create as many temporary files as are needed; just make sure that they are tidied up at the end. The output from the cleaning program is to go to stdout. (I will then redirect stdout, as required.)
Cheers
Michae...
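One common way to make sure temporary files are tidied up is `mktemp` together with a `trap` on exit. The sketch below is only an illustration of that pattern; the data written to the scratch file is made up.

```shell
#!/usr/bin/env bash
# Create a scratch file and make sure it is removed however the script ends.
tmpfile=$(mktemp) || exit 1
trap 'rm -f "$tmpfile"' EXIT

# Illustrative intermediate step: stash some rows, then send the result to stdout.
printf 'NZL\t2020\nAUS\t2020\n' > "$tmpfile"
sort "$tmpfile"
```

The trap runs when the script exits, whether normally or after an error, so no explicit clean-up call is needed at each exit point.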
Hi Evan,
As far as I can see, the main thing we are interested in across the files is that the Life file header has the word Life, the GDP data has the word GDP, etc.
Cheers
MichaelW
Hi Evan,
The Cantril scores must be, as there was only one organisation providing the data for each country in a given year. The population counts come from a range of sources, and could differ, but as far as I know they are the same, all the files ha...
Hi,
Yes, the sample data is correct. Because I was worried on that score, I did a couple of countries by hand in the Excel spreadsheet. That corresponds to Sample 1, which is the same as Sample 2, except the latter also contains a couple of countries which...
Hi Rayhan,
For either program, but particularly the data-cleaning program, error messages should be printed on stderr, with all the real data going to stdout.
cheers
MichaelW
ANONYMOUS wrote
Hi,
Please see the response to the earlier thread About the description of best predictor, and get back to me if anything there is unclear.
Cheers
MichaelW
ANONYMOUS wrote
Hi,
The fundamental misconception here is that the value needs to be at least 3. That is not correct. What I'm trying to say is that, for a given country, there needs to be at least 3 years with Cantril data. For example, in Sample2.tx...
Hi Zhenlong,
The output should go to standard output rather than a file. (The analysis example has a file name because I simply had to use something.)
Cheers
MichaelW
Hi Runzhi
That is correct. Tabs are simply replacing the commas that are used in the .csv format. Spaces can appear in either format, and are just parts of strings; never separators.
Cheers
MichaelW
Hi Zexu,
The file names are not relevant, and can change. Rather, the content of each file can be judged by words appearing in the header line. The program needs 1 of each of the 3 types of data-file: one GDP, one homicide rate, one life expectancy.
Che...
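Judging a file by its header rather than its name might look like the sketch below. The function name `file_kind` is hypothetical, and the exact keywords matched (GDP, Homicide, Life) are assumptions about the header wording.

```shell
#!/usr/bin/env bash
# Report which kind of data a TSV file holds, judged only by its header line.
# The keywords (GDP, Homicide, Life) are assumptions about the header wording.
file_kind() {
    local header
    header=$(head -n 1 "$1")
    case "$header" in
        *GDP*)      echo "gdp" ;;
        *Homicide*) echo "homicide" ;;
        *Life*)     echo "life" ;;
        *)          echo "unknown header in $1" >&2; return 1 ;;
    esac
}
```

A caller could then loop over its three arguments in whatever order they arrive, for example `kind=$(file_kind "$f") || exit 1`, and check that each of the three kinds turns up exactly once.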
Hi Zhiyang,
The columns will be correct to type. That is, you'll get numbers where numbers are expected and strings where strings are expected (or an empty cell).
Cheers
MichaelW
ANONYMOUS wrote
Hi,
I would assume so, ignoring blank lines. Put another way, I cannot see a circumstance where a cleaned file of the same input data would be different. What am I missing, please?
Cheers
MichaelW
Hi,
Not sure that I understand the question, I'm sorry, but if you use git only on your local machine, git init creates a directory called .git in the current directory. (Of course, you will need to use ls -a to see it.) You are being asked to include tha...
Hi
I'll respond to each part after the question;
ANONYMOUS wrote
In effect, both. You ignore the row, in the sense of not printing it to stdout, but you also print an error message (to stderr).
That is correct. HINT: Ignore the name of the file. Instead...
Hi Zhiyang,
Sample1.tsv is just the first two countries found in Sample2.tsv (with the other two not having enough data-points). I have now made that explicit, but also provided Sample1.tsv, to avoid further confusion.
Yes, cells may be empty, but if the...
ANONYMOUS wrote
Hi,
To help check your (and my) calculations, I created an Excel spreadsheet which gives a breakdown of the correlation computations. Unless you are using scipy, and have coded the correlation computation in Python yourself, you may ha...
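For readers who want a quick cross-check of a correlation value from the command line, here is a minimal sketch. It assumes the Pearson correlation coefficient is the one intended (the post only says "correlation"), that the header row has already been removed, and that rows with an empty cell in either column are skipped. The function name `pearson` is hypothetical, and it returns "undefined" for fewer than 3 usable rows.

```shell
#!/usr/bin/env bash
# Pearson correlation of two tab-separated columns read from stdin.
# Arguments: the two column numbers. Rows with an empty cell in either column are skipped.
pearson() {
    awk -F'\t' -v a="$1" -v b="$2" '
        $a != "" && $b != "" {
            n++
            sx += $a;       sy += $b
            sxx += $a * $a; syy += $b * $b
            sxy += $a * $b
        }
        END {
            den = sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
            if (n < 3 || den == 0) { print "undefined"; exit }
            printf "%.3f\n", (n * sxy - sx * sy) / den
        }'
}

printf '1\t2\n2\t4\n3\t6\n' | pearson 1 2    # perfectly correlated: prints 1.000
```

To skip a header line before the calculation, the input can be piped through `tail -n +2` first.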
ANONYMOUS wrote
Hi,
The specification says that the 3 files can appear in any order, so no, explicit names cannot be used (assuming I understood your question correctly).
Cheers
MichaelW
Hi Zhiyang,
I cannot find those patterns in the data I provided. It looks like unicode characters have been introduced(??), but I cannot reproduce that problem.
Cheers
MichaelW
ANONYMOUS wrote
Hi,
I do not plan to change the due date for Assignment 2, firstly because the release of the specification was only a couple days late, and secondly because it is due the last week of term, after which it will clash with the study br...
ANONYMOUS wrote
Sorry for being unclear. You will notice that there is no Cantril data for years before 2011 or after 2021, so years outside that range can be ignored. However, years with no Cantril data which are within the range should be included....
Hi,
Marking 300 assignments, which can arrive across a span of two weeks (with Special Consideration, etc) takes a little time. We're doing our best, and do appreciate that you'd like to get feedback as soon as possible.
Cheers
MichaelW
Hi Zexu,
I am impressed. It's actually not trivial as .csv data can have commas in the strings, and also hidden CR characters. Was your solution able to cope with these?
Cheers
MichaelW
Hi Suan,
That is what I thought I was setting up. While I investigate whether your answers can be released to you, I'll at least make the solutions visible.
Cheers
MichaelW
Hi Suan,
It starts in the current directory and then all sub-directories, and looks for all files called Makefile or makefile. For each such file it executes the function make, which we will visit on Wednesday. It then prints the full path to that Mak...
Hi Rabea,
The main purpose for asking for Git in Ass2 is to get you used to using this industry-standard (and really useful) Shell tool, and more generally the need for version control. The auto-testing will print the evidence for the human marker to...
ANONYMOUS wrote
Hi,
That was a mistake on my part. Now fixed. Long story short, a csv file would be an error as it cannot be interpreted as a .tsv file.
Cheers
MichaelW
Hi Ziyuan and everyone,
It turns out that there was a feedback tool I needed to set up, and a setting I had missed. That done, I believe you should be able to see both the mark and the respective answers/feedback. Please let me know if that's not what...
Hi Suan,
I can assure you, and everyone, that all the submitted scripts were marked, and I have clicked the links to release the answers. Clearly something is not happening as it should. I'll follow up with the LMS people tomorrow after the lecture.
Ch...
Hi Everyone,
As Ziyuan said, you can download the recording and play it on your own device. I tried it just now. I have no idea why the streaming version is not working, but there is a workaround.
Cheers
MichaelW
Hi Ziyuan,
Your (and everyone's) learning is important to us, which is why we do it. When I know all the submissions have been marked, I'll figure out how to release the feedback. LMS is a complicated beast. In any case, soon.
Have a good weekend
Cheers
M...
Hi Kai,
Some material for the workshops was shuffled around, so arcade.csv is in Lab6, australian-universities.csv is in Lab8. Must put that on the list of things to tidy up for next year.
Cheers
MichaelW
ANONYMOUS wrote
Hi,
I will say more about the final exam close to the time, but yes, as before, you will be allowed to bring a single A4 page into the exam (printed or hand-written, both sides, if you wish).
Cheers
MichaelW
Hi,
I shall respond to each in turn.
ANONYMOUS wrote
See earlier response (no)
This was discussed in lectures
See earlier threads
See earlier threads
The file should not end in .sh (but ok this time), but yes, any directory, which you then zip. The directo...
ANONYMOUS wrote
Hi,
The idea behind both discussions is that if input has a sensible reading, disregarding capitalisation, then you could logically call it a mistake, or you can use the data you've been given for full marks (if correct). Logically, t...
ANONYMOUS wrote
Hi,
Having multiple scripts called by the top-level one is perfectly fine; providing for that possibility is why a zip'd or gzip'd directory is required for the submitted program. However, please do not have your own call to chmod. The ...
ANONYMOUS wrote
Hi,
The first argument is the full pathname to the file, which the testing program will supply. All your program has to do is open the file associated with $1; if it's not there, the program should simply stop and print an error messag...
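A minimal sketch of that check follows. The function name `check_input` and the wording of the messages are illustrative only, not taken from the specification.

```shell
#!/usr/bin/env bash
# Fail with a message on stderr if the file named by the first argument is unusable.
check_input() {
    if [ $# -ne 1 ]; then
        echo "usage: expected exactly one file name" >&2
        return 1
    fi
    if [ ! -r "$1" ]; then
        echo "cannot read file: $1" >&2
        return 1
    fi
}
```

At the top of a script this would typically be used as `check_input "$@" || exit 1`, so the program stops with exit status 1 before doing any work on a missing file.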
Hi Alexandra,
Given the actual range of survey dates, and the fact that we are coding this now, I took the pragmatic view that 2022 (and earlier) is past, while 2025 is future. Still, nice idea.
Cheers
MichaelW
Hi Zichen,
It just means that whatever you did was submitted as is, so that's all that will be marked. Does it matter? Not really, given the context. (I'll set it up differently next time.)
Cheers
MichaelW
ANONYMOUS wrote
Hi,
I have no idea how or why one might use the find command in a situation where the input data consists of a single file (not directory). More to the point, you have to be very careful with find as the command can traverse a lot of f...
Hi,
We've only just covered it. As I mentioned a while back, I release solutions a couple weeks after the week when the lab would cover the lecture, to give people time to try it. I've recently released the solutions for Lab 5. (Congratulations on bei...
ANONYMOUS wrote
Hi
What I was trying to indicate is that treating malE as an error is not wrong, but does not make use of perfectly usable information, so would be worth less as an answer, therefore half marks for that question.
Cheers
MichaelW
Hi Suan
Not quite: echo 'fred cat' | sed -e 's a-z a-z '
The single quotes are also important (to stop the info for sed being processed by Bash instead).
Cheers
MichaelW
ANONYMOUS wrote
Hi,
In the words of Han Solo, "I have a bad feeling about this". (My apologies if I'm being unfair.) Please make sure your script works on the standard environment provided by either the Docker class image, or Linux Lab in the UniAp...
Hi,
I'll answer each part after the question
ANONYMOUS wrote
I hope it's clear that I'm expecting the program to deal with other error states. Given that I've not told you what they are, I can hardly be pedantic about the error message. True? While you...
ANONYMOUS wrote
Hi,
One thing you need to learn is that if something is made explicit in a specification, typically because it needs to work with other modules, then you must keep to the spec. However, just this time, I shall accept tobacco_nation or t...
Hi,
It's way too early to talk about the final exam, and the truth is that I've not designed it yet. There will be a past exam you can look at, which we'll also discuss in class. As you mention, you will be allowed to bring in a single sheet of A4 pap...
ANONYMOUS wrote
The best hint I can offer, short of spelling it out, is to go back to the lectures L1 to L6 (including L5, though the order was swapped). While the efficiency will be poor, you only need to use the language structures and Unix command...
Hi Kaichao,
I cannot see why there is a problem, so I've extended the test till tomorrow, 11:59pm, and will ask IT for help.
Thanks for letting me know
Cheers
MichaelW
Hi Runzhi,
Seeing you've asked a number of questions, I'll follow each, in turn, with my response, rather than leaving it to the end.
sort has been used for some time, and grep is also fairly recent, so sure. They are still "simple" Linux commands. (Th...
Hi Eusha,
Done. For good measure, I also uploaded the code that was created for L6 and today's lecture, L9.
Enjoy ... (ok, perhaps not, but I hope you find it useful).
Cheers
MichaelW
Hi Everyone,
I realised that the lecture did not define what a Bash while loop is, beyond the examples. I have now revised the lecture slides (and PDF) to include a slide on what a while loop looks like, and how it works.
I also took the opportunity to...
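For reference, a minimal sketch of the general shape of a Bash while loop (not the content of the revised slide): the body between `do` and `done` runs repeatedly for as long as the test command succeeds. The function name `countdown` is made up for the example.

```shell
#!/usr/bin/env bash
# Count down from the given number: the loop body runs while the test command succeeds.
countdown() {
    local count=$1
    while [ "$count" -gt 0 ]
    do
        echo "count is $count"
        count=$(( count - 1 ))
    done
    echo "done"
}

countdown 3
```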
Hi Everyone,
This is in response to some questions I was asked in the lab.
What if the user types male rather than the expected Male?
My attitude to error handling is quite pragmatic:
- Is there a sensible interpretation of the input data that is close...
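One way to act on a "sensible interpretation" of input such as male or malE is to normalise the case before testing it. The sketch below is an illustration only; the function name `normalise_gender` is hypothetical, and it assumes Male and Female are the two accepted values.

```shell
#!/usr/bin/env bash
# Map any capitalisation of male/female onto the expected spelling;
# anything else is reported on stderr as an error.
normalise_gender() {
    case "$(echo "$1" | tr '[:upper:]' '[:lower:]')" in
        male)   echo "Male" ;;
        female) echo "Female" ;;
        *)      echo "unrecognised value: $1" >&2; return 1 ;;
    esac
}

normalise_gender malE    # prints Male
```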
ANONYMOUS wrote
Hi
- The name of the directory containing the script can be anything.
- You don't need to include the .csv file, as I shall be using my own. However, no harm done if you do include it; I'll simply ignore it.
- Don't worry about file ...
Hi,
The number of white spaces is not relevant; I believe in auto-testing (for uniformity and fairness), but not auto-marking, because humans are much better at dealing with differences in the output that are actually not relevant. In particular, diff...
Hi,
I discussed /dev/stderr in response to another query. Please look there.
Regarding the file name, the specification says that the name has to be tobacco_nation (with no suffix). However, people find it very hard not to add a .sh, so - while techni...
Hi,
There is a little confusion here. For all Bash messages you need to use echo. The only question then is where the message is to be printed.
echo "Fred here"
will print to stdout (or /dev/stdout). That is the default destination, so you don't have ...
Hi Rabea,
Use of Bash functions is fine; it's just that we've not covered them yet. They are not "advanced" (I really should have chosen a different word), by which I intended more advanced Unix tools, such as awk (and bc), where you can pretty much ge...
Using gzip (and then gunzip) on the directory containing the script(s) is also fine. In retrospect, given that I've been pushing Unix/Linux specific tools, for authenticity I really should have asked for Linux native gzip over the more optional zip. ...
Hi Suan,
For example, let's say that the argument to this program is a word $1 being sought in a text $2
if grep "$1" "$2" > /dev/null
then
    echo "The word $1 is found in $2"
else
    echo "The word $1 is not found in $2"
fi
In this case we don't want ...
Hi,
This makes no sense to me. If you are able to capture a floating point real in this way (outside of Awk), you will be comparing strings, which is not a good idea if you actually are working with numbers (see the discussion on arithmetic versus string...
ANONYMOUS wrote
Hi,
If you are running the standard Ubuntu/Bash/GNU-Shell-Tools set up, having the .sh on the end of the file name should not matter. Instead you need #!/usr/bin/env bash as the first line. (To my taste, having .sh at the end of Shell ...
Hi Suman,
I don't really understand the question. Bash printf prints data to standard output. I can't see how you can use Bash's printf to compare floating point values. Sorry. Confused. (Bash only supports integer arithmetic - which we'll cover short...
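To illustrate why string comparison misleads with real numbers, and one way of handing the arithmetic to a tool that does support floating point, here is a small sketch. The helper name `float_gt` is hypothetical, and it relies on awk, which other posts class as a more advanced tool, so this is for illustration rather than for the assignment.

```shell
#!/usr/bin/env bash
# String comparison gets numbers wrong: "2.9" sorts after "10.1" character by character.
if [[ "2.9" > "10.1" ]]; then
    echo "as strings, 2.9 > 10.1"
fi

# Hypothetical helper: succeed if $1 is numerically greater than $2,
# letting awk do the floating point comparison.
float_gt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
}

if float_gt 10.1 2.9; then
    echo "as numbers, 10.1 > 2.9"
fi
```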
Hi Baoyue,
Please go back over Lecture 5 (Variables), particularly the material on the PATH variable. You will notice I have put '.' in my PATH, so did not have to explicitly use ./count_occurrences.
Cheers
MichaelW
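The effect of PATH can be seen with a throwaway script. In this sketch the script name `hello_path` and the temporary directory are made up for the demonstration. Adding a specific directory to PATH works the same way as adding '.', which only covers whichever directory is current.

```shell
#!/usr/bin/env bash
# Build a throwaway directory holding a hypothetical script called hello_path.
demo_dir=$(mktemp -d)
trap 'rm -r "$demo_dir"' EXIT
printf '#!/usr/bin/env bash\necho "hello from PATH"\n' > "$demo_dir/hello_path"
chmod +x "$demo_dir/hello_path"

# Explicit path: always works.
"$demo_dir/hello_path"

# Bare name: works only once the directory is listed in PATH.
PATH="$PATH:$demo_dir"
hello_path
```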
Hi,
I wouldn't class the read command as advanced. It's just that we've not covered it yet as it's related to loop structures, which we only get to do next teaching week. Please just use the stuff we've covered - for now. Trainer-wheels will come off...
Hi Nathan, short answers are No and Yes. Longer answer, bc is a small language similar to awk in concept, though more limited. The point of this assignment is to get you using the language structures we have covered in the unit, literate programming, an...
Hi Suan,
They are both places where standard output, e.g. from echo, can be redirected. /dev/stderr (don't forget the first '/') is standard error output (for error messages), while /dev/null is literally nowhere, a black hole, which simply gobbles u...
Hi Suan
A better way of saying 1>&2 echo ERROR
is echo ERROR > /dev/stderr
The first of these can be used by all Unixes/Linuxes, but is very unobvious syntactically; translation: take the output sent to stdout (file descriptor 1) and send it instead t...
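Both spellings can be tried side by side. In this sketch the two function names are hypothetical; each sends its message to stderr, so nothing appears on stdout.

```shell
#!/usr/bin/env bash
# File-descriptor form: redirect stdout (1) of this echo to wherever stderr (2) points.
report_fd() {
    1>&2 echo "ERROR: $*"
}

# Device-file form: write to /dev/stderr by name (assumes the system provides it, as Linux does).
report_dev() {
    echo "ERROR: $*" > /dev/stderr
}

report_fd "bad row"
report_dev "bad row"
```

Running the script with `2> /dev/null` appended hides both messages, which is a quick check that neither went to stdout.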
ANONYMOUS wrote
Hi,
I have no idea why you are unable to submit your Assignment 1 file, as 6 people have managed to do so, including 2 just today. All I can suggest is please try again.
Cheers
MichaelW
Hi,
I'll talk about that tomorrow (Wed 27th), but the specification says that what the program should expect is a 3 letter ISO country code.
Cheers
MichaelW
Hi Zichen,
Unless I'm missing something, I think I covered that in yesterday's (Monday 25th) lecture on where binaries are to be found, and the PATH variable. No?
Cheers
MichaelW
Hi Everyone,
Given the questions I'm being asked about Assignment 1, I plan to use the first part of the lecture on Wednesday (27th) as a Q&A about Ass 1, so BYO Questions. I'll also show you a useful command for tracking through your program as it ex...
ANONYMOUS wrote
Hi,
The range column is actually not relevant for this assignment, and the maths you suggest is well beyond the tools that I have asked you to use. Please just use the medianPC for the calculations.
Cheers
MichaelW
Hi Eusha,
Yes, it would be okay to ignore the output format in this particular case. Clearly my intention was to have Male where I would otherwise have Female, but I forgot to do that in my implementation and therefore in the examples. To be honest, I wo...
Yes,
tobacco_nation
and
./tobacco_nation are intended to be the same variable. This will be made clear at Monday's lecture, and - long story short - involves how you set up the PATH variable. Starting with the ./ is arguably better (which is anothe...
Hi Everyone,
You are aware that we don't have a lecture tomorrow morning (9am March 20). What I strongly suggest you do is make sure you have had a go at Labsheet 3. The language structures covered there are directly relevant to Assignment 1, and whil...
Hi Suan,
Don't worry. I engineered some slack in the timetable to deal with these interruptions, or simply the fact that I want to go more slowly over something, or spend the time answering questions. If you get on LMS you'll see that the Weekly sched...
Hi Everyone,
What I usually do if I receive an email query whose answer could apply to the whole class is reply to the sender and then post the answer as a FAQ. Here is the first.
Cheers
MichaelW
...
" I saw there are 2 spaces in some of your examples in the d...
Hi Kaichao,
wc is very simple: it just counts the number of characters, words and lines in a plain-text file.
In general, if you are unsure about any command, I recommend using man, e.g. man wc
Finally, beware of using ChatGPT for anything; it can "hall...
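A quick way to see what wc reports is to run it on a tiny file whose contents are known. The sample text below is made up for the example.

```shell
#!/usr/bin/env bash
# Two lines, three words, fourteen characters (newlines included).
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

wc -l < "$sample"   # number of lines
wc -w < "$sample"   # number of words
wc -c < "$sample"   # number of bytes (the same as characters for plain ASCII)

rm -f "$sample"
```

Reading the file through `<` keeps the file name out of the output, leaving just the count.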
Hi Zeke,
This is what I'll be talking about in tomorrow's (Monday's) lecture, and why I brought that lecture forward.
As they say, All Will be Revealed ...... (cue the dramatic music)
Cheers
MichaelW
Hi Suan,
It looks like you are using Windows Subsystem for Linux (due to the Windows-format pathname), rather than Linux itself, which you can get via the Docker image, or by downloading and installing UniApps and searching for Linux Lab.
Cheers
MichaelW
Hi Weng
That is entirely fair and reasonable. It should be included. However, I know from experience that classes hate it when the specification of an assignment is changed after it has been released. I'll therefore leave the specification as it is.
I...
Welcome to helpOSTS
I've now tried the two class discussion systems available on LMS: Discussion Forums and Wiki. Neither is suitable, as far as I'm concerned. The Discussion Forums are only for topics set by the unit coordinator, and are used in ca...