Historically, it has been difficult to make data on human trafficking readily accessible to analysts, academics, practitioners, and policy-makers. But with the help of "The Counter Trafficking Data Collaborative (CTDC)", we can have access to the first global data hub on human trafficking, with data contributed by organizations from around the world.
In this project, I will try to visualize the data and derive some insights from it using Tableau.
You can download the main k-anonymized dataset with data on 48,801 victims of human trafficking by clicking on the button below:
The data is in CSV format and was published on April 14, 2020.
Each row represents a victim of human trafficking and all the variables related to them.
-99 means missing data.
Year of Registration (from 2002 to 2019) is the year in which the victim was registered in the database.
The Data Source is either "Case Management" (from IOM) or "Hotline" (from Polaris). I won't use it as it won't affect my analysis.
Age is recorded in ranges (0 to 8, 9 to 17, ..., 39 to 47, and 48 and over).
Majority Status means that the victim is either "Minor" (Under 18) or "Adult" (18 and Over).
Citizenship (ISO country codes in capital letters) is the country the victims come from.
Means of Control (Boolean) describes how the victims are controlled. There are many means (Debt Bondage, Takes Earnings, Restricts Financial Access, Threats, Psychological Abuse, Physical Abuse, Sexual Abuse, False Promises, Psychoactive Substances, Restricts Movement, Restricts Medical Care, Excessive Working Hours, Uses Children, Threat of Law Enforcement, Withholds Necessities, Withholds Documents, Other, or Not Specified).
Type of Exploit (Boolean) describes how the victims are exploited (Is Forced Labour, Is Sexual Exploit, Is Other Exploit, Is Sex and Labour, Is Forced Marriage, Is Forced Military, Is Organ Removal, or Is Slavery and Practices).
Type of Labour (Boolean) describes, when the victim is forced to work, the type of labour involved (Agriculture, Aquafarming, Begging, Construction, Domestic Work, Hospitality, Illicit Activities, Manufacturing, Mining or Drilling, Peddling, Transportation, Other, or Not Specified).
Type of Sex (Boolean) describes, when the victim is sexually exploited, the type of sexual exploitation involved (Prostitution, Pornography, Remote Interactive Services, or Private Sexual Services).
Is Abduction (Boolean) indicates whether the victim was taken away against their will.
Country of Exploitation means where these victims are being exploited.
Recruiter Relationship (Boolean) means who recruits these victims to be exploited (Intimate Partner, Friend, Family, Other, or Unknown).
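For readers who prefer code to a spreadsheet, the "-99 means missing" convention described above could be handled at load time with pandas. This is only a minimal sketch: the inline sample stands in for the real CSV, and its values and column names (yearOfRegistration, gender, ageBroad, citizenship) are assumptions made for illustration.

```python
import io
import pandas as pd

# Tiny stand-in for the CSV file; values and column names are
# illustrative assumptions, not rows from the real dataset.
SAMPLE_CSV = """yearOfRegistration,gender,ageBroad,citizenship
2016,Female,9-17,PH
2015,Male,-99,UA
2017,Female,30-38,-99
"""

# Treat -99 as missing while reading, so pandas stores it as NaN.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), na_values=["-99"])

# Count the missing values per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Counting the missing values per column first gives a quick idea of which variables are complete enough to be worth visualizing.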
I cleaned the data in Excel.
I deleted some columns as they are not going to be useful in my analysis (The First Column, Data Source, Majority Status at Exploit, Majority Entry, Means of Control Concatenated, Type of Exploit Concatenated, Type of Labour Concatenated, Recruiter Relationship, and Type of Sex Concatenated).
I replaced the "-99" values either with "Unknown" in columns such as citizenship, country of exploitation, broad age, and majority status, or with "0" in the boolean variables.
I made sure that the boolean variables have only "0" or "1" as options.
I replaced the "0" values in citizenship and country of exploitation with "Unknown", as they mean the same thing.
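The cleaning steps above were done in Excel, but the same logic could be expressed in pandas. The following is a minimal sketch of that equivalent, not the workflow actually used here. The sample rows and the column names (citizenship, CountryOfExploitation, isForcedLabour, isSexualExploit) are assumptions for illustration.

```python
import pandas as pd

# Illustrative rows; the column names are assumptions, not the real headers.
df = pd.DataFrame({
    "citizenship": ["PH", "-99", "0", "UA"],
    "CountryOfExploitation": ["US", "0", "-99", "RU"],
    "isForcedLabour": [1, -99, 0, 1],
    "isSexualExploit": [-99, 1, 0, 0],
})

text_cols = ["citizenship", "CountryOfExploitation"]
bool_cols = ["isForcedLabour", "isSexualExploit"]

# In the country columns, "-99" and "0" both mean "no country recorded".
df[text_cols] = df[text_cols].replace(["-99", "0"], "Unknown")

# In the boolean columns, a missing flag is treated as 0.
df[bool_cols] = df[bool_cols].replace(-99, 0)

# Sanity check: the boolean columns now contain only 0 or 1.
assert df[bool_cols].isin([0, 1]).all().all()
print(df)
```

Keeping the text columns and the boolean columns in separate lists makes it explicit that the same "-99" code is replaced differently depending on the variable type.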
After connecting the dataset to Tableau, I had to change some variables:
I changed the Year of Registration to a date format rather than a number.
I changed Citizenship to country/region rather than a string.
You can download the cleaned Excel Document by clicking on the button below:
Note: The document can be opened in Google Sheets, but some Excel features can't be displayed and will be dropped if you make changes. Also, the formatting won't be exactly the same.
So, for a better experience, download the document by clicking on File > Download > Microsoft Excel (.xlsx).
To get access to the workbook online, click on the button below:
Before searching for anything specific, I created a general dashboard where I could play around with the data in order to explore it.
The first thing I did was to create a separate sheet for each of my dimensions:
Citizenship: First, I put the countries on the map, fixed the unrecognized countries, then colored the countries depending on the number of victims.
Country of Exploitation: I did the same thing as for the citizenship sheet.
Note: Greenland represents the unknown citizenships and countries of exploitation. Greenland does not appear in the dataset, and because it is large and easy to select on the map, I decided to use it to represent the unknown countries. There are also many unknown or missing countries in the dataset, so I chose this solution to avoid dropping those observations and ending up with a wrong or incomplete analysis.
Age Broad: This bar chart represents how the victims, in percentages, are distributed through the different age ranges.
Gender: This pie chart represents how the victims are distributed between males and females.
Majority Status: This pie chart represents how the victims are distributed between adults, minors, and the unknown observations.
To make this dashboard more useful and alive, I used each visual as a filter. This way, I can filter by whatever variable of each dimension I want.
For example, I can see only the female victims in all the other visuals (Where are they from? Where were they exploited? What is their majority status?). I can also see how the victims' gender, age ranges, and majority status differ from one country to another.
Some Insights:
One-fifth (21%) of the victims are minors, while half are adults. For most of the rest (25%), the majority status is unknown.
Almost three-quarters (73%) of the victims are female while a quarter (27%) are male.
The highest number of victims are from the Philippines, with 23% of all the registered victims. Ukraine comes in second with almost 16%, followed by the Republic of Moldova with 12%.
The highest number of victims (more than 25%) are being exploited in the United States. 11% are being exploited in Ukraine, 9% in the Republic of Moldova, and more than 5% in Russia.
About Minors: 85% of minor victims are between 9 and 17 years old. 3 out of 4 minors are female. Most minors are either from the Republic of Moldova, the US, or have unknown citizenship. Half of the minors are being exploited in the US while one-fifth are being exploited in the Republic of Moldova.
About Adults: 27% of adult victims are between 30 and 38 years old. As with minors, 3 out of 4 adults are female. Most adults are either from Ukraine, the Republic of Moldova, or have unknown citizenship. 28% of adults are being exploited in the US, almost 20% in Ukraine, and around 10% each in Russia and the Republic of Moldova.
About Males: 1 out of 5 males is a minor, while another 1 out of 5 is between 30 and 38 years old. Half of the male victims are from Ukraine and the Philippines. Most male victims are being exploited in Ukraine, Russia, Indonesia, or in an unknown country.
About Females: As with males, 1 out of 5 females is a minor, mostly between 9 and 17 years old. For 1 out of 4 females, the country of origin is unknown. Another 1 out of 4 females comes from the Philippines, while again, many victims come from the Republic of Moldova, Ukraine, and the US. One-third of female victims are being exploited in the US!
Victims exploited in the US: 4 out of 10 victims are minors. More than half the victims exploited in the US are under 20 years old. Almost all the victims (97%) are female. For 7 out of 10 victims the country of origin is unknown, while 3 out of 10 are from the US itself.
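Percentages like the ones above come from filtering the dashboard and reading the shares off each visual. The same "filter, then recompute the distribution" logic can be sketched in pandas. The records below are made up for illustration and the column names (gender, majorityStatus, CountryOfExploitation) are assumptions; the printed numbers do not reproduce the insights above.

```python
import pandas as pd

# Made-up records for illustration; not taken from the CTDC data.
df = pd.DataFrame({
    "gender": ["Female", "Female", "Male", "Female", "Male", "Female"],
    "majorityStatus": ["Minor", "Adult", "Adult", "Unknown", "Minor", "Adult"],
    "CountryOfExploitation": ["US", "US", "UA", "US", "RU", "MD"],
})

def share(frame, column):
    """Return the percentage of rows per category in `column`."""
    return frame[column].value_counts(normalize=True).mul(100).round(1)

# Overall distribution, like the unfiltered dashboard.
print(share(df, "gender"))

# Equivalent of clicking "US" on the map: filter first, then recompute.
us_only = df[df["CountryOfExploitation"] == "US"]
print(share(us_only, "majorityStatus"))
```

Because the helper takes any frame and any column, every dashboard filter maps to one boolean mask followed by one call to share.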
Note: Filtering by year didn't make much sense to me as it represents the year of victims' registration. The data is being collected from different organizations and it depends a lot on which year the data of a certain country was added to the database.
Let's take the example of the Philippines, as it is one of the top countries of origin of the victims and also a country where they are exploited. Between 2002 and 2014 there is no data on the country, but in 2015 there are 28 citizens from the Philippines, then 11,262 in 2016, then only 75 in 2017, then nothing until 2019. As for the victims exploited in the Philippines, they were all registered in a single year, 2016.
This is the reason why I didn't want to include the years in my analysis.
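To make the Philippines example concrete, here is a small worked calculation using only the three yearly counts quoted above. It is a sketch of the reasoning, not part of the Tableau workbook, and the variable names are chosen for illustration.

```python
import pandas as pd

# Registrations of Philippine citizens per year, as quoted in the text.
ph_by_year = pd.Series({2015: 28, 2016: 11262, 2017: 75}, name="victims")

# Share of each year in the total for these three years.
year_share = ph_by_year / ph_by_year.sum()
print(year_share.round(3))

# Find the year that holds the largest share of the registrations.
dominant_year = year_share.idxmax()
dominant_share = year_share.max()
print(dominant_year, round(dominant_share, 3))
```

With roughly 99% of the registrations falling in 2016, a trend line over the years would mostly show when the records were added to the database rather than how trafficking changed over time.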
This bar chart represents the means of control per gender, that is, how the victims (males or females) are being controlled. It is sorted from the means of control with the highest number of victims to the one with the lowest.
By the way, I removed "Other" and "Not Specified" as they aren't critical to this analysis. Also, many victims didn't specify any means of control, and when that category is shown in the chart, it makes the other ones look much smaller.
The most used means of control for both genders combined are Psychological Abuse, Restricting Movement, Threats, and Physical Abuse.
The most used means of control for females are the same as for both genders combined (Psychological Abuse, Restricting Movement, Threats, and Physical Abuse), followed by Psychoactive Substances.
The most used means of control for males are False Promises, Taking Earnings, Excessive Working Hours, and Psychological Abuse.
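The ranking behind this chart amounts to summing each 0/1 means-of-control flag per gender and sorting by the total. A minimal pandas sketch of that aggregation follows. The rows are made up and the column names (meansOfControlThreats and so on) are assumptions modelled on the variable list, so the output does not reproduce the rankings above.

```python
import pandas as pd

# Made-up rows; column names are assumptions modelled on the variable list.
df = pd.DataFrame({
    "gender": ["Female", "Female", "Male", "Male", "Female"],
    "meansOfControlThreats": [1, 1, 0, 1, 1],
    "meansOfControlFalsePromises": [0, 1, 1, 1, 0],
    "meansOfControlOther": [1, 0, 0, 0, 1],
})

# Keep the means-of-control flags, dropping "Other" as in the chart.
control_cols = [
    c for c in df.columns
    if c.startswith("meansOfControl") and not c.endswith("Other")
]

# Sum the 0/1 flags per gender: one row per means, one column per gender.
per_gender = df.groupby("gender")[control_cols].sum().T

# Sort by the total number of victims, highest first.
per_gender["Total"] = per_gender.sum(axis=1)
per_gender = per_gender.sort_values("Total", ascending=False)
print(per_gender)
```

Summing works here only because the flags were cleaned to contain 0 or 1; any leftover -99 would distort the totals.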
Most victims are either exploited sexually (16,067 victims, 95% of them female) or in forced labour (9,772 victims, split equally between males and females).
So, let's take a deeper look at the types of sexual exploitation and the types of labour.
Almost half (43%) of the victims forced to work are in Domestic Work, and they are mainly female. Another one-fifth (19%) are in Construction, and they are mainly male.
The victims exploited sexually are predominantly (99%) female. 95% of these victims are in Prostitution, while the remaining 5% are in Pornography and Private Sexual Services.
The idea behind this project is to showcase my visualization skills using Tableau in order to derive preliminary insights that could be used to ask deeper and more precise questions about human trafficking in the world.
I could certainly carry out a more in-depth analysis of how these variables interact with each other and how human trafficking takes different shapes from one country to another.
I enjoyed working with this data, and I was surprised to discover how alarming the human trafficking situation is and how, even in our time of democracy and freedom, people of all ages, especially kids and teenagers, are still used and exploited in all sorts of labour and sexual exploitation.