Please consider offering answers and suggestions to help other students!
And if you fix a problem by following a suggestion here,
it would be great if other interested students could see a short
"Great, fixed it!" followup message.
The overall program should, for a given datafile:
Based on the header (i.e. top) line, make sure that the file is in tab-separated format.
Also based on the header line, report any lines that do not have the same number of cells. (Cells are allowed to be empty.)
Remove the column with header Continent, which is sparsely populated and is not present in one of the files.
Ignore the rows that do not represent countries (the country code field is empty).
Ignore the rows for years outside those for which we have at least some Cantril data. (Cantril data may be absent in certain years within the range of those for which there otherwise is data; those cells should be retained.)
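For anyone who wants to see the steps above in one place, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the required solution: the function name `clean_tsv`, the column names `Code` and `Year`, and the idea of spotting the Cantril column by the word "Cantril" in its header are all my guesses. It also skips lines with the wrong cell count after reporting them, and computes the year range from country rows only; the requirements do not say either way.

```python
def clean_tsv(lines, code_col="Code", year_col="Year",
              drop_col="Continent", cantril_key="Cantril"):
    """Return (header, rows, warnings) for an iterable of text lines.

    All column names here are assumptions about the data files.
    """
    lines = [ln.rstrip("\n") for ln in lines]
    # Step 1: the header must contain a tab to count as tab-separated.
    if not lines or "\t" not in lines[0]:
        raise ValueError("header has no tab: not a tab-separated file")
    header = lines[0].split("\t")

    # Step 2: report lines whose cell count differs from the header.
    warnings, rows = [], []
    for lineno, line in enumerate(lines[1:], start=2):
        cells = line.split("\t")
        if len(cells) != len(header):
            warnings.append(
                f"line {lineno}: {len(cells)} cells, expected {len(header)}")
            continue  # assumption: reported lines are not processed further
        rows.append(dict(zip(header, cells)))

    # Step 3: keep only rows with a non-empty country code.
    rows = [r for r in rows if r[code_col].strip()]

    # Step 4: keep years between the first and last year with Cantril data.
    # Rows inside that range with an empty Cantril cell are retained.
    cantril_cols = [h for h in header if cantril_key in h]
    years = [int(r[year_col]) for r in rows
             if any(r[c].strip() for c in cantril_cols)]
    if years:
        first, last = min(years), max(years)
        rows = [r for r in rows if first <= int(r[year_col]) <= last]

    # Step 5: drop the Continent column if the file has one.
    kept = [h for h in header if h != drop_col]
    return kept, [[r[h] for h in kept] for r in rows], warnings
```

If a file has no Cantril column at all, this sketch leaves the years unfiltered, so the range would have to come from one of the other files.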
I have a question: what should I do if a country is in gdp-vs-happiness.tsv but not in homicide-rate-unodc.tsv? Should I just ignore it, or do I need to write it to stdout? The above requirements do not mention this.
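What I mean is: should the output contain only the countries present in all three files (the intersection),
$$\text{outputfile} = \text{gdp-vs-happiness.tsv} \cap \text{homicide-rate-unodc.tsv} \cap \text{life-satisfaction-vs-life-expectancy.tsv}$$
or
$$\text{outputfile} = \text{gdp-vs-happiness.tsv} \cup \text{homicide-rate-unodc.tsv} \cup \text{life-satisfaction-vs-life-expectancy.tsv}$$
i.e. every country present in at least one file (the union)?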
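To make the two options concrete, here is a small Python sketch of the difference. It does not answer which one the assignment wants. The function name `compare_country_sets` and the example country codes are made up for illustration; in practice each set would be built from the country code column of one file.

```python
def compare_country_sets(*code_sets):
    """Return (in_all, in_any) for any number of sets of country codes."""
    in_all = set.intersection(*code_sets)  # countries present in every file
    in_any = set.union(*code_sets)         # countries present in at least one
    return in_all, in_any


# Hypothetical codes, not taken from the real files.
gdp = {"AUS", "NZL", "FJI"}
homicide = {"AUS", "NZL"}
life = {"AUS", "NZL", "FJI"}

in_all, in_any = compare_country_sets(gdp, homicide, life)
only_in_some = in_any - in_all  # the countries this question is about
```

With the intersection, countries in `only_in_some` would simply not appear in the output. With the union, they would appear with empty cells for the files that lack them.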