
help2003/help4407

This forum is provided to promote discussion amongst students enrolled in Open Source Tools and Scripting.

Please consider offering answers and suggestions to help other students! And if you fix a problem by following a suggestion here, it would be great if other interested students could see a short "Great, fixed it!" follow-up message.



 UWA week 19 (1st semester, week 10) ↓
3:14am Thu 12th May, ANONYMOUS

I found a really interesting tip:

"I think this is the harder of the two tasks. The way you may care to do this is first to transform the text into a sequence of things you want to keep: words and the bits of punctuation we are interested in, one per line. You then use a Sed script to convert these words, punctuation, etc, into a stream of tokens, e.g. "word" represents all words, "comma" represents ",", etc. This script is, arguably, the trickiest part of the project. Spend some time on it and do a lot of testing with different text selections. Once you have the stream of tokens, a fairly simple Awk script can be created to compute the profile. Remember to sort the profile."

However, I don't understand certain expressions in it, such as "a sequence of things you want to keep... one per line" and "a stream of tokens". I am not asking which script we should use to achieve this; I am asking what we should end up with if we follow this tip, and what it looks like. It would be very kind of you to explain it using pictures as examples.


8:21am Thu 12th May, Michael W.

Hi,

Clearly, in this tip I'm trying not to be overly specific - it's a tip, not an answer - but please look at the word count example that I began the unit with, and then reworked a number of times. That takes text formatted with several words per line and ends up with a file with one word per line. The things you want to keep are words (as I've defined them), comma, dot, etc, but not digits or other punctuation, such as ":" or "$". Makes sense?

Cheers
MichaelW
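As a rough illustration of the "one per line" step only, here is a minimal sketch. The helper name `one_per_line` is hypothetical, and the pattern assumes that a word is a run of ASCII letters and that only "," and "." are kept; the unit's actual definition of a word may differ. It relies on `grep -o`, which GNU and BSD grep both provide.

```shell
# Hypothetical helper: print each kept item on its own line.
# Assumption: a "word" is a run of ASCII letters; only "," and "." are kept.
# Digits and other punctuation (":", "$", ...) never match, so they are dropped.
one_per_line() {
    grep -oE '[A-Za-z]+|[,.]'
}

# Example input containing things we do not want to keep.
printf 'Jack, and Jill paid $5: done.\n' | one_per_line
```

For this input the sketch should print Jack, a comma, and, Jill, paid, done, and a full stop, each on a separate line.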


4:38pm Thu 12th May, Daniel S.

To understand Michael's approach, please go back and look at his word counter code from the lectures. His tips on solving this assignment are very similar to how he has made his word counter.

- "a sequence of things you want to keep": Michael is suggesting that you remove everything from the file except the words/characters/whatever that you are going to count, one per line
- "a stream of tokens": Michael is suggesting that you convert each thing into a corresponding token

Imagine you have a file containing the following string:

    "Jack, and Jill"

As a sequence of things, one per line:

    Jack
    ,
    and
    Jill

Now, as a sequence of tokens:

    word
    comma
    word
    word

The tokens can then be easily counted using awk or even grep (but awk is more efficient).
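The last two stages of the "Jack, and Jill" example can be sketched as follows. This is only an illustration under assumptions, not the expected solution: the helper names `tokenise` and `profile` are made up, the token name "dot" and the letters-only definition of a word are guesses, and the output format of the profile is arbitrary.

```shell
# Hypothetical helper: turn each one-per-line item into a token name.
# The word rule comes first on purpose: if it ran after the comma rule,
# the freshly written text "comma" would itself be rewritten to "word".
tokenise() {
    sed -E -e 's/^[A-Za-z]+$/word/' \
           -e 's/^,$/comma/' \
           -e 's/^\.$/dot/'
}

# Hypothetical helper: count how often each token occurs, then sort.
profile() {
    awk '{ count[$0]++ } END { for (t in count) print t, count[t] }' | sort
}

# The sequence of things from the example, one per line.
printf 'Jack\n,\nand\nJill\n' | tokenise | profile
```

Running `tokenise` alone on that input prints word, comma, word, word; piping it into `profile` prints `comma 1` followed by `word 3`.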


1:20pm Fri 13th May, ANONYMOUS

Isn't this more work to do? While identifying the elements (word, comma, etc.), it is possible to count them at the same time. I don't understand what the benefit of implementing tokens is.


4:35pm Fri 13th May, Daniel S.

It's a hint; it's not the only way to approach the assignment. The token approach means you can gradually add logic to your program and easily see where things are going wrong, which is particularly helpful if you don't know how to approach the assignment. If you are comfortable writing a single awk script to create profiles, great! Go for it.
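For comparison, a single-script version that identifies and counts items in one pass might look roughly like this. It is a sketch under assumptions: the function name `single_pass_profile` is hypothetical, and the letters-only definition of a word and the token names are guesses rather than the assignment's specification.

```shell
# Hypothetical single-pass alternative: find each item and count it immediately.
# Assumption: a "word" is a run of ASCII letters; only "," and "." are counted.
single_pass_profile() {
    awk '
    {
        line = $0
        # match() sets RSTART and RLENGTH for the leftmost match in line.
        while (match(line, /[A-Za-z]+|[,.]/)) {
            item = substr(line, RSTART, RLENGTH)
            if (item == ",")      count["comma"]++
            else if (item == ".") count["dot"]++
            else                  count["word"]++
            # Continue scanning after the item just consumed.
            line = substr(line, RSTART + RLENGTH)
        }
    }
    END { for (t in count) print t, count[t] }
    ' | sort
}

printf 'Jack, and Jill.\n' | single_pass_profile
```

This prints `comma 1`, `dot 1`, and `word 3`. The trade-off is the one described above: there is no intermediate one-per-line or token file to inspect when a count looks wrong.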

The University of Western Australia

Computer Science and Software Engineering
