Hi Professor Michael,
I'm a little confused about the description of the first part of Assignment 2, data cleaning. It says, "The output is expected to be a tab-separated data directed, as before, to standard output", and it also says, "{The output file} sent to stdout should have rows with the data in the following order." Should I both redirect the output into a file, to serve as the cleaned data for part 2 (best_predictor), and print it to stdout? Please see the attachments for what I have done so far. If an output file is needed, what name should I give it?
Secondly, I'm not sure what the word "report" means in the description: "Also based on the header line, report any lines that do not have the same number of cells. (Cells are allowed to be empty.)"
I ask because I can't tell whether the final output should go to stdout on the screen, to a file, or both.
Finally, regarding the output of part 2, can the order of the mean correlations differ from the order shown in the assignment's description? For example:
Mean correlation of Life Expectancy with Cantril ladder is -0.208
Mean correlation of GDP with Cantril ladder is -0.110
Mean correlation of Homicide Rate with Cantril ladder is 0.061
Mean correlation of Population with Cantril ladder is -0.835
Most predictive mean correlation with the Cantril ladder is Population (r = -0.835)
I would appreciate it if you could help me with the questions above.
Thank you