Hi Michael
Could I do a sanity check?
I believe I have filtered the TSV data per the requirements and am about to exclude rows that have insufficient Cantril-value pairs. This will be done on the basis that:
- Cantril data may be absent in certain years within the range of those for which there otherwise is data - those cells should be retained;
- Empty cells are fine anywhere for the data cleaning program, so long as the rows are within the range of years of interest (thread with Kai Zheng at 3.37pm, 19 May); and
- Cantril data could be missing for some countries, wholly or in part. If so, to include a correlation for any country there must be at least 3 (predictor-value, Cantril-value) pairs.
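As a rough illustration of the third point, the rule could be sketched in Python as below. This is only a sketch under assumptions: the column names (country, year, gdp, cantril), the helper function names, and the sample rows are all made up for illustration and are not taken from the assignment data or its specification. It counts complete pairs per country and keeps every row, in line with the first two points.

```python
from collections import defaultdict


def is_present(cell):
    """Treat None and blank strings as empty cells."""
    return cell is not None and str(cell).strip() != ""


def count_pairs(rows, predictor, cantril="cantril"):
    """Count, per country, the rows where both the predictor and Cantril cells are filled."""
    counts = defaultdict(int)
    for row in rows:
        if is_present(row.get(predictor)) and is_present(row.get(cantril)):
            counts[row["country"]] += 1
    return dict(counts)


def countries_to_correlate(rows, predictor, min_pairs=3):
    """Return the countries with at least min_pairs complete pairs (assumed threshold: 3)."""
    counts = count_pairs(rows, predictor)
    return sorted(c for c, n in counts.items() if n >= min_pairs)


# Hypothetical rows: country A has only two complete pairs, country B has three.
rows = [
    {"country": "A", "year": 2013, "gdp": "1.0", "cantril": ""},
    {"country": "A", "year": 2014, "gdp": "1.1", "cantril": "3.1"},
    {"country": "A", "year": 2015, "gdp": "1.2", "cantril": "3.9"},
    {"country": "A", "year": 2016, "gdp": "", "cantril": "4.2"},
    {"country": "B", "year": 2016, "gdp": "5.0", "cantril": "6.8"},
    {"country": "B", "year": 2017, "gdp": "5.1", "cantril": "7.0"},
    {"country": "B", "year": 2018, "gdp": "5.3", "cantril": "6.9"},
]

print(countries_to_correlate(rows, "gdp"))
```

Under this reading, rows with empty cells stay in the filtered data, and the 3-pair threshold only decides whether a correlation is reported for a country.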
Looking at my filtered data against Sample 2, I have four rows that, to my understanding, should be retained. Please find my data in the attached TSV.
My thought process is as follows:
Afghanistan 2013 doesn't have a Cantril Ladder Score and should be discarded.
Afghanistan 2014 and UAE 2016-2018 should be retained as they contain three predictor value pairs as well as the Cantril Ladder Score.
Or are those four rows excluded because the entire row/value set is incomplete? Are the countries nevertheless included in the correlation because they have at least 3 years of full row/value sets with a Cantril Ladder Score?
Thank you.