The March "Storytelling With Data" challenge was to take aid data available from the College of William and Mary and turn it into a visualization to answer several questions:

  1. Who Gives?
  2. How Much?
  3. To whom?
  4. For what purposes?

This is international aid data, and it includes major non-governmental donors as well as countries, but I decided that I wanted to look at country-to-country donations first.

Analysis

My initial analysis of the data was done using Tableau, to get a feel for the overall content. I plotted the sums of all the donations from the donor countries on a world map and rapidly found that Japan and the U.S. had provided the bulk of the donations over the last 50 years.

This surprised me, as one of the stories I keep hearing is how generous European countries are by comparison to the rest of the world. I decided to combine this dataset with GDP and population data to see whether I could get a clearer picture.

Is everything I believe about Scandinavian countries wrong? (Spoiler: not so much.)

I had already been using data from Gapminder and the World Bank, which provide both GDP and demographic information. To help tidy (and categorise) the data, I also used a list of countries by continent that I derived from the World Atlas; I added back in protectorates and territories, as the aid data was provided at this regional level.

I used Python for data cleaning and to create a CSV file that mapped population by year to the countries included in the W&M aid data. I imported that new CSV file into Tableau and carried out the rest of the analysis using Tableau's astonishingly powerful visualization tools.
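As a rough illustration of that step, here is a minimal sketch of attaching population by year to aid records and writing the result out as CSV. It is not the script I actually ran: the column names (`donor`, `recipient`, `year`, `amount_usd`, `country`, `population`) and the sample rows are assumptions made up for the example, and it uses only the standard library.

```python
import csv
import io

def attach_population(aid_rows, population_rows):
    """Add a 'population' value to each aid row, matched on (country, year)."""
    # Build a lookup keyed by (country, year) from the population table.
    lookup = {
        (row["country"], int(row["year"])): int(row["population"])
        for row in population_rows
    }
    merged = []
    for row in aid_rows:
        key = (row["recipient"], int(row["year"]))
        if key in lookup:  # inner-join behaviour: rows with no match are dropped
            merged.append({**row, "population": lookup[key]})
    return merged

def to_csv(rows):
    """Serialise the merged rows to CSV text, ready to import elsewhere."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up illustrative rows, not values from the real datasets.
aid = [
    {"donor": "Japan", "recipient": "Kenya", "year": "1995", "amount_usd": "1000000"},
    {"donor": "Norway", "recipient": "Atlantis", "year": "1995", "amount_usd": "500000"},
]
population = [{"country": "Kenya", "year": "1995", "population": "27000000"}]

merged = attach_population(aid, population)
csv_text = to_csv(merged)
```

Note that the second aid row silently disappears because its recipient has no population entry, which is exactly the kind of loss an inner join can hide.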

Data Cleaning

When I initially imported the population file into Tableau, several countries were missing after the inner join. It turned out that the various organizations were using different names and abbreviations for the same countries, so those rows failed to match and were dropped in the merge.
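One way to catch this kind of mismatch before joining is to compare the sets of country names from each source, then normalise the stragglers with a dictionary. The sketch below assumes that approach; the alias entries and sample names are illustrative only and are not my actual mapping.

```python
def unmatched_names(aid_names, population_names):
    """Return names found in only one of the two sources, sorted for readability."""
    aid_set, pop_set = set(aid_names), set(population_names)
    return sorted(aid_set - pop_set), sorted(pop_set - aid_set)

# Hypothetical alias dictionary: maps one source's spelling to the other's.
ALIASES = {
    "Korea, Rep.": "South Korea",
    "Russian Federation": "Russia",
}

def normalise(name, aliases=ALIASES):
    """Strip whitespace and replace a known alias with its canonical name."""
    cleaned = name.strip()
    return aliases.get(cleaned, cleaned)

# Illustrative name lists, assumed for the example.
aid_names = ["South Korea", "Russia", "Kenya"]
population_names = ["Korea, Rep.", "Russian Federation", "Kenya"]

only_in_aid, only_in_population = unmatched_names(aid_names, population_names)

# After normalising, every aid country should find a population match.
fixed_names = [normalise(n) for n in population_names]
still_missing, _ = unmatched_names(aid_names, fixed_names)
```

Running the comparison both before and after normalising gives a quick check that the dictionary covers every country the join would otherwise drop.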

I have a separate post in the works talking about the cleaning process and creating the dictionary mappings; the most important thing is that I got ~19% more lines of data at the end of it, and consider it time well spent.

Categorization

The “Purpose” field describes what each donation was targeted toward. Unfortunately, it has over 230 unique entries, which made it difficult to see any patterns.

To clarify and zoom out, I broke these purposes into categories. The purpose-category mapping was done in Tableau; a copy of the code for the field is available here: Purpose-Category Map

Each Purpose has been assigned to only one category.

I had to make judgement calls on how finely to categorize, seeking to group things conceptually rather than by domain. Like all categorization, this is subject to revision and debate. I remain unsure, for example, on whether to classify Agriculture with Food or Industrial and Economic Development. At the moment, I have it in its own category, but I may be convinced to change that.

A caution: some of these assignments might not be consistent with the original coding. For example, I put “Education” at the end of the conditional structure, so that education or training for a particular purpose is categorized under that purpose rather than as “Education” per se.
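The effect of rule order is easier to see in code. The actual mapping lives in a Tableau calculated field; what follows is only a Python analogue of a first-match-wins conditional, with hypothetical keywords and category names, and it assumes simple substring matching on the purpose text.

```python
# Ordered (keyword, category) rules; the first match wins.
# Keywords and category names are assumptions made for this illustration.
RULES = [
    ("health", "Health"),
    ("agricultur", "Agriculture"),
    ("water", "Water and Sanitation"),
    ("education", "Education"),  # deliberately last: a catch-all for general education
]

def categorise(purpose, rules=RULES, default="Other"):
    """Assign a purpose to exactly one category using the first matching rule."""
    text = purpose.lower()
    for keyword, category in rules:
        if keyword in text:
            return category
    return default

examples = [
    "Health education",
    "Agricultural education/training",
    "Primary education",
    "Chemicals",
]
assigned = {purpose: categorise(purpose) for purpose in examples}
```

Because “Education” is checked last, "Health education" lands in Health and only general schooling falls through to Education. A purpose such as "Chemicals" matches nothing and needs a manual decision.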

Where the name of the purpose did not give me a clear sense of the meaning (for example, “Chemicals”), I returned to the purpose code in the aid data to find out what larger category it referred to. Is this about cleaning chemicals up, or about building chemical plants? In other words, should I assign this purpose to “Environment” or to “Industrial and Economic Development”?

Visualization

In the end, this was intended as a visualization rather than a coding exercise. International Aid naturally lent itself to a world map, so that was my first step.

My initial dashboard showed the countries of the world shaded by the total quantities received or donated. As I added in the per capita data, it got harder to see: some places with small populations have received relatively large contributions, and they become so dark that they swamp the colour variation across the rest of the map.
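A small worked example shows why per capita values swamp a colour scale. The figures below are invented purely for illustration and are not taken from the aid data; the linear colour scaling is likewise an assumption, not a description of Tableau's internals.

```python
def aid_per_capita(total_aid_usd, population):
    """Aid received per person."""
    return total_aid_usd / population

# Invented figures: (total aid in USD, population).
recipients = {
    "Large country": (2_000_000_000, 100_000_000),
    "Small island": (50_000_000, 10_000),
}

per_capita = {
    name: aid_per_capita(aid, pop) for name, (aid, pop) in recipients.items()
}

# On a linear colour scale, each shade is proportional to value / maximum.
max_value = max(per_capita.values())
shade = {name: value / max_value for name, value in per_capita.items()}
```

Here the large country receives 40 times more aid in total, yet the island's per capita figure is 250 times higher, so the large country's shade is almost indistinguishable from zero.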

I changed the scale representation from colour to size, which is easier to read and interpret anyway. At this point, a feature appeared that I was not expecting.

Islands

The largest recipients per capita are all small islands with small populations. They appear on the map as a series of outsized dots in the middle of the oceans. They tend to receive money from countries with historic links to them; I feel like there is potential to use this feature of the visualization to examine historic trends and relationships and spark conversations about colonization and post-colonial wealth transfer.

On the dashboard, you can use the sliders to filter out these outlying data points, which scales up the dots in the more populous parts of the world.

Dashboard for Exploration

The current version of the dashboard is published at my Tableau profile, and also embedded below.

You can filter these views on any of: Aid by Sector, Donor Countries, or Recipient Countries.

You can also examine how aid has changed over time, using the Decade filter to look at any individual decade or combination of decades from the 1970s to the 2000s. (The 2010s are included, but as yet there is only one year of data for that decade.)

By applying the filters sequentially, you can drill down to see relationships between individual countries. This also lets you consider the priorities of the donors, in terms of where they direct their support and for what purposes.

Conclusion

I found this a fascinating project, and I intend to work more with these datasets as I continue on my data science journey. I'm now moving on to some machine learning; I am interested to see how various forms of aid impact development over time.