CoronaNet Project Goals

Why do we need Project Goals?

Because we collect data in real time, this project has always had to deal with uncertainty. We cannot know ahead of time:

How long the pandemic will last and how governments will react to it
How many people are willing to assess, plan, code, clean and validate data

Having project-wide goals helps us coordinate and make sure that despite the uncertainty, we are doing our best to:

Collect data from all over the world for the same time periods
Ensure high quality in our data collection efforts

Having complete and clean data for many countries over the same time period is crucial for saying anything about the drivers and effects of COVID-19 government policies.

For reasons of both ethics and feasibility, our ability to use experiments to figure out which policies are effective, or what drives them, is limited.
We can, however, learn something about the drivers and effects of the pandemic by comparing what different countries do:
- the more variation there is in what different countries do, the more we can potentially answer these questions (e.g. if everyone does the same thing, it is hard to say what would have happened if X other thing happened instead)
- the bigger our sample (the more countries over time) we have to compare, the more generalizable our findings (the more the findings will apply not just to a special subset of units/countries, but to a larger group of units/countries)

What are the Project Goals?

The overall project goals are to:

ensure complete and clean data for all countries.

The time period for which we aim to collect complete and clean data depends on whether a country is a spotlight country or a capsule country, and on whether subnational data collection is involved:

Spotlight national and subnational countries: Document policies made up until 10/2021 (hard goal)

EU27 Countries + Eurasia + Senegal
- Austria, Belgium, Belarus, Bulgaria, Croatia, Cyprus, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Kazakhstan, Kyrgyzstan, Latvia, Lithuania, Luxembourg, Malta, Netherlands, Poland, Portugal, Romania, Russia, Senegal, Slovakia, Slovenia, Spain, Sweden, Tajikistan, Turkmenistan, Ukraine, and Uzbekistan
- Subnational data collection for bolded countries

Capsule National Countries: Document policies made up until 03/2021 (hard goal) + 10/2021 (soft goal)

Afghanistan, Albania, Algeria, Andorra, Angola, Antigua and Barbuda, Argentina, Armenia, Azerbaijan, Bahamas, Bahrain, Bangladesh, Barbados, Belarus, Belize, Benin, Bhutan, Bolivia, Bosnia and Herzegovina, Botswana, Brunei, Burkina Faso, Burundi, Cabo Verde, Cambodia, Cameroon, Central African Republic, Chad, Chile, Colombia, Comoros, Costa Rica, Cuba, Democratic Republic of the Congo, Djibouti, Dominica, Dominican Republic, Ecuador, Egypt, El Salvador, Equatorial Guinea, Eritrea, Eswatini, Ethiopia, Fiji, Gabon, Gambia, Georgia, Ghana, Grenada, Guatemala, Guinea, Guinea-Bissau, Guyana, Haiti, Honduras, Hong Kong, Iceland, Indonesia, Iran, Iraq, Israel, Ivory Coast, Jamaica, Jordan, Kenya, Kiribati, Kosovo, Kuwait, Laos, Lebanon, Lesotho, Liberia, Libya, Liechtenstein, Macau, Madagascar, Malawi, Malaysia, Maldives, Mali, Marshall Islands, Mauritania, Mauritius, Micronesia, Moldova, Monaco, Mongolia, Montenegro, Morocco, Mozambique, Myanmar, Namibia, Nauru, Nepal, New Zealand, Nicaragua, Niger, North Korea, North Macedonia, Northern Cyprus, Norway, Oman, Pakistan, Palau, Palestine, Panama, Papua New Guinea, Paraguay, Peru, Philippines, Qatar, Republic of the Congo, Rwanda, Saint Kitts and Nevis, Saint Lucia, Saint Vincent and the Grenadines, Samoa, San Marino, Sao Tome and Principe, Saudi Arabia, Serbia, Seychelles, Sierra Leone, Singapore, Solomon Islands, Somalia, South Africa, South Korea, South Sudan, Sri Lanka, Sudan, Suriname, Syria, Taiwan, Tanzania, Thailand, Timor Leste, Togo, Tonga, Trinidad and Tobago, Tunisia, Turkey, Tuvalu, Uganda, United Arab Emirates, United Kingdom, Uruguay, Vanuatu, Vatican, Venezuela, Vietnam, Yemen, Zambia, Zimbabwe

Capsule Subnational Countries: Document policies made up until 10/2020 (hard goal) + 10/2021 (soft goal)

Australia, Brazil, Canada, China, India, Japan, Mexico, Nigeria, Switzerland, United States

What are ‘complete’ and ‘clean’ data?

Complete data for a given time period:

In the abstract, this means documenting all the COVID-19 government policies that a given government made.
In practice, complete data is impossible to define beforehand; if we knew the complete set of policies there were to code, our jobs would already be halfway done!

We can only judge the ‘completeness’ of the data for any given region using:

RA internal assessment of whether the policies captured in the CoronaNet dataset for their region faithfully represent the government policies made in that region for a given time period.
External comparison with the number and types of policies that other datasets have captured, followed by integration of the external data.
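
As a rough illustration of the external comparison, the Python sketch below counts policies per policy type in two datasets and reports where the external one lists more. It is not the project's own tooling: the function names, the record fields (`country`, `type`, `date`) and the example records are all made up for illustration, and it assumes the external data has already been mapped onto CoronaNet policy types.

```python
from collections import Counter
from datetime import date


def count_by_type(records, country, start, end):
    """Count policies per policy type for one country within [start, end]."""
    return Counter(
        r["type"]
        for r in records
        if r["country"] == country and start <= r["date"] <= end
    )


def completeness_gaps(coronanet, external, country, start, end):
    """Return the policy types for which the external dataset lists more policies."""
    ours = count_by_type(coronanet, country, start, end)
    theirs = count_by_type(external, country, start, end)
    # A Counter returns 0 for policy types it has never seen.
    return {t: theirs[t] - ours[t] for t in theirs if theirs[t] > ours[t]}


# Made-up example records (hypothetical field names and values).
coronanet = [
    {"country": "Senegal", "type": "Curfew", "date": date(2020, 4, 1)},
    {"country": "Senegal", "type": "Lockdown", "date": date(2020, 5, 10)},
]
external = [
    {"country": "Senegal", "type": "Curfew", "date": date(2020, 4, 1)},
    {"country": "Senegal", "type": "Curfew", "date": date(2020, 6, 15)},
    {"country": "Senegal", "type": "Health Testing", "date": date(2020, 7, 2)},
]

gaps = completeness_gaps(
    coronanet, external, "Senegal", date(2020, 1, 1), date(2020, 10, 1)
)
print(gaps)  # {'Curfew': 1, 'Health Testing': 1}
```

A gap found this way is only a pointer for review. Datasets can split or merge policies differently, so a higher external count does not by itself prove that a policy is missing.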

Why might there be incomplete data?

Finding information about policies can be very difficult, especially for countries with low state capacity.
Some countries may have had no RA, or inattentive RAs, for long periods of time.

Clean data means that for each observation:

all relevant information is available; and
it is correctly coded according to the latest taxonomy.

We can judge the ‘cleanliness’ of the data for any given region by:

RA internal assessment of whether the policies for their region are coded in a way that consistently conforms to the latest version of the CoronaNet taxonomy, which a country or regional manager will later verify.
Automated assessment of problems in the data that can be identified through R code.

Why might there be missing information for a given variable?

We’ve made a number of changes to the original survey, launched March 28, 2020, in order to keep track of the new things governments have done since then. When questions are added to the survey later, however, the corresponding data is missing for earlier observations and must subsequently be backcoded.

Why might there be incorrectly coded information for a given variable?

When new options or new questions are added later, earlier data coded using the older taxonomy may not match the new taxonomy.
Coding data is hard, especially with our taxonomy, and it’s only natural that people make mistakes =)
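
Both kinds of problem, values missing because a question was added later and values that no longer match the taxonomy, can be flagged mechanically. The project's automated checks are written in R; the Python sketch below only illustrates the idea. The field names, the subset of policy types treated as current, and the example records (including the label "Stay at home") are hypothetical.

```python
# Hypothetical subset of policy types treated as the current taxonomy.
CURRENT_TYPES = {"Lockdown", "Curfew", "Quarantine"}

# Hypothetical fields that every observation is expected to have filled in.
REQUIRED_FIELDS = ("type", "date_start", "target")


def check_record(record):
    """Return a list of cleanliness problems found in one observation."""
    problems = []
    for field in REQUIRED_FIELDS:
        # A question added to the survey later leaves earlier records empty here.
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    policy_type = record.get("type")
    if policy_type and policy_type not in CURRENT_TYPES:
        problems.append(f"type not in current taxonomy: {policy_type}")
    return problems


# Made-up example observations.
records = [
    {"id": "A1", "type": "Curfew", "date_start": "2020-04-01", "target": None},
    {"id": "A2", "type": "Stay at home", "date_start": "2020-04-05", "target": "All residents"},
    {"id": "A3", "type": "Lockdown", "date_start": "2020-05-10", "target": "All residents"},
]

# Keep only the observations that have at least one problem.
report = {r["id"]: check_record(r) for r in records if check_record(r)}
print(report)
# {'A1': ['missing target'], 'A2': ['type not in current taxonomy: Stay at home']}
```

A check like this can only say that a value is missing or outdated. Deciding what the correct value is still requires going back to the source, which is why backcoding remains an RA task.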

What does this mean for me?

I am coding for a Spotlight national or subnational country

Your RM/CM will set quarterly goals, tailored to what the data looks like for your country or subnational region, to help reach the overall project goals.

In general, however, all spotlight national or subnational countries should work in coordination as follows:

Stage 1: Work on collecting complete and clean data for each policy type for policies made up until October 1, 2020
Stage 2: Work on collecting complete and clean data for each policy type for policies made up until March 1, 2021
Stage 3: Work on collecting complete and clean data for each policy type for policies made up until June 1, 2021
Stage 4: Work on collecting complete and clean data for each policy type for policies made up until October 1, 2021

For each stage, you should work on making sure the data in your region is complete and clean before moving on to the next stage.

If your country or subnational region has completed Stage 4, you should not code any further for it. Instead, please coordinate with your regional/country manager to help get the data for regions at lower stages complete and clean.

I am coding for a Capsule National Country

Your RM/CM will set quarterly goals, tailored to what the data looks like for your country, to help reach the overall project goals.

In general, however, all capsule national countries should work in coordination as follows:

Stage 1: Work on collecting complete and clean data for each policy type for policies made up until October 1, 2020
Stage 2: Work on collecting complete and clean data for each policy type for policies made up until March 1, 2021

For each stage, you should work on making sure the data in your region is complete and clean before moving on to the next stage.

If your country has completed Stage 2, you should not code any further for it. Instead, please coordinate with your regional/country manager to help get the data for other regions complete and clean at Stage 1.
If the great majority of countries have completed Stage 2, RAs coding for capsule national countries may continue coding policies made up until June 1, 2021. If and when this happens, the PIs will make a general announcement in consultation with regional and country managers.

I am coding for a Capsule Subnational Country

Your RM/CM will set quarterly goals, tailored to what the data looks like for your country, to help reach the overall project goals.

In general, however, all capsule subnational countries should work in coordination as follows:

Stage 1: Work on collecting complete and clean data for each policy type for policies made up until October 1, 2020

If your subnational region has completed Stage 1, you should not code any further for it. Instead, please coordinate with your country manager to help get the data for other subnational regions in your country complete and clean at Stage 1.

I am a regional or country manager

Regional and country managers should devise region- or country-specific quarterly goals to plan for reaching each of the project-level goals. More details will be forthcoming.

What tools and processes will we use to help us reach these goals?

Workflow Overview

In general, the workflow for achieving these goals will look like:

[Figure: workflow overview chart]

Workflow Timeline

For Fall 2021 we will aim to stick to the following timeline:

[Figure: workflow timeline]

Workflow Table

To access some of the elements listed in the workflow chart, please see the table below:

	How to get to complete data?	How to get to clean data?
Processes	Quarterly Survey
	Goal Making
	For RMs/CMs: Region/country specific completeness goals (e.g. number of policies to integrate, checking government sources or sources on Overton/Jataware until a certain date)	For RMs/CMs: Region/country specific cleaning goals (e.g. policy types to check for cleanliness, fixing X problems through automated assessment of data problems)
	Assessment of Current State of the data
	For RAs: Internal assessment of e.g. completeness, complexity of the policy-making process via the RA Internal Survey (link for August 2021 survey here)	For RAs: internal assessment of the quality of the data via the RA Internal Survey (link for August 2021 survey here)
	Monthly RM/CM Feedback on the region and progress towards goals (ideally should take 15-20 minutes to fill out)
	For RMs/CMs: Update on whether quarterly regional/country goals are on track	For RMs/CMs: Feedback on problems/developments in the region
Tools	Data Overviews
	CoronaNet Tableau Visualization Overview: - Overview of the number of policies currently coded in CoronaNet per country/province and policy type over time, in both visual and table formats - Overview of the number of policies to integrate from other datasets per country/province and policy type over time	Automated Data Quality Checks: - Identification of which policies need to be cleaned according to the automated assessment - Identification of which policies have the 'wrong' policy type according to data science models, in both visual and table formats (along with update and correction links; forthcoming)
	Data Integration	CornEdit App [link TBD]
	Data Integration Sheets - Data from external data which allows RAs to i) assess the overlap between external data and CoronaNet data ii) integrate/recode data into CoronaNet taxonomy	Easy to use tool for making corrections to common mistakes
	Shiny App
	Use the Shiny App for: - Visualization of a timeline of policies - Access to table format of policies - Update and correction links for policies in Qualtrics
Information Resources
	CoronaNet Dashboard : Main portal which has detailed information on all resources and up to date information on changes in the project
	Slack : Main communication platform for the project, interact with other project members here!
	Overton Raw Sources /Jataware/Starsift Raw Sources: Access to potential raw sources about government COVID-19 actions	CoronaNet Previously Uploaded PDFs : PDFs of previously coded policies
	Low State Capacity Guidelines: Guidelines for how to document policies for countries with low state capacity	CoronaNet Coding Guide
	CoronaNet RA Previous Materials : Access to materials and information generated by RAs
	CoronaNet Skeleton : Details and examples of how the data should in theory be structured
	CoronaNet Survey : R markdown version of the Qualtrics survey
	CoronaNet PDF Codebook : Detailed information on each survey question
	CoronaNet Condensed Taxonomy : Detailed information on how a subset of variables relate to each other
	CoronaNet Duplicate Detector : Helps you assess whether a policy you are thinking of coding is already in the dataset or not
	CoronaNet Policy Predictor : Helps predict the best policy type for coding a given policy
	RA FAQs [Link TBD] : Summary of questions commonly asked in ra-chat

FAQs

How should I prioritize coding different policy types under this overall strategy?

As before, RAs should prioritize coding the following policy types first wherever possible. Having the same priorities for policy types also helps ensure consistency and completeness along this dimension of the data, which improves cross-regional analysis of the data for these policy types.

Group 1:

Lockdown
Curfew
Quarantine

Group 2:

External Border Restrictions
Internal Border Restrictions

Group 3:

Restrictions of Mass Gatherings
Social Distancing

Group 4:

Closure and Regulation of Schools
Restrictions and Regulations of Businesses
Restrictions and Regulations of Government Services

Group 5:

Health Monitoring
Health Testing
Health Resources
Hygiene

Group 6:

Declaration of Emergencies
New Task Forces
Public Awareness Measures
Anti-disinformation Measures

Group 7:

COVID-19 Vaccines

What do we do with the data checklists?

We will be relying on the RA Internal Survey for much of the information that we used to get from the data checklists. However, if you’ve found the checklists useful, we’ll keep them around and you’re free to keep using them!