The Truth About Open Data

Is it better to stop bringing in more (possibly corrupt) arms and instead focus on education?

Open Data → Transparency → Trust → Progress

Unfortunately, there are two obstacles that stand in the way.

Obstacle #1: Where Are The Skilled Workers?

The reality of data science is that the skills it requires aren’t easily found in developing countries.

Degrees in scraping, cleaning, analyzing, visualizing, interpreting, and utilizing data are few and far between here in Colombia.

The few people who do have the skills expect to get paid.

Funding from the government is slim, so data scientists look for funding elsewhere.

This means that most of the clients for data scientists are news outlets or organizations from the USA, not the Colombian government.

In Cali alone, if the city had access to skilled data scientists, it could use data to:

- understand the crime rates and categories across districts and neighborhoods to improve policing
- encourage the four institutions that report crime in this city to finally share data with one another (this could also help to decrease the number of corrupt police officers)
- map the air quality of the city to determine methods of improving it
- discuss whether education plays an important role in lowering crime
- prove to the public that the days of the guerrillas are truly ending
- see that restricting car usage throughout the day among citizens doesn’t actually benefit the environment or the public
- audit government financial transactions to fight corruption
- visualize the growing wealth gap between the peripheral communities of the south and the rising affluent neighborhoods of the north (and then discuss how to fix it)

The projects are endless, the impact is immeasurable, and skilled workers are nowhere to be found.

Obstacle #2: Where Is The Data?

The 2018 census of Colombia became controversial in the city of Cali after claims that the numbers had been reported incorrectly.

In 2019 the population was projected to be about 4.8 million people, but the census counted only 3.9 million.

That’s almost 1 million missing people.

Citizens of Cali contested these results on the grounds of inconsistent data collection methods, including entire neighborhoods that went uncounted.

An incorrect census can be a big deal, because a smaller population count means less funding from the government for public services (like education).

The census is one of many examples of data discrepancies here in Colombia.

The police would love to use more data to help fight crime, but that isn’t possible yet.

Four separate entities regulate and report crime in the city, and all of them refuse to share their data with each other.

In Bogotá, the capital of Colombia, a three-year project plans to use crime data to build an effective predictive-policing platform for the city.

Recent research has begun to uncover that predictive-policing platforms tend to create discriminatory feedback loops in underprivileged neighborhoods rather than effectively reducing crime.

With this in mind, I became interested in running analyses on the crime data that exists in Bogotá (in hopes of researching alternatives to predictive policing).

I reached out to some government entities and professors at universities to gain access to the city’s data.

The general response:

“Dear Jessie thank you for your interest in our work on predictive policing in Colombia. Unfortunately data and codes are proprietary of the city of Bogota.”

In third-world countries, data hardly exists.

In many developing countries, data is a secret.

When it becomes public, data is often inaccurate.

Inaccurate data leads to incorrect analyses and visualizations that lie.

Hidden data ruins trust.

Invalid data hurts communities.

Data holds the key to understanding and helping our communities.

But before we can even begin to utilize it, everyone must first open it.

In other words: open data is important, but it only works if everyone participates, because missing data invalidates the rest.

The Reality About Open Data

The open data platform in Cali has only been around for two years.

Most of the data that was collected more than two years ago exists in piles of PDFs that would take years to digitize.

Automating this process takes serious skill and effort from qualified data scientists who charge significant amounts of money.

The data that is being collected now doesn’t follow any specified format, which makes it difficult to automate a system for visualizing it.

The two individuals who run the Office of Transparency have no technical background.

They want to get the public interested and involved, but they don’t know where to begin.

Most of the roadblocks for this movement can be solved by people who have knowledge of UX, UI, design, data science, dataviz, database administration, software engineering, or computer science.

Unfortunately, without funding it is extremely hard to recruit the skills needed to help with Cali’s open data platform.

So, qualified workers instead put their efforts elsewhere.

Data scientists get lured into the enticing world of lucrative corporations.

Corporate projects tend to utilize data from customers for advertising and product matching.

When the client is corporate, the end-goal is revenue.

If the client were the public, the end-goal could be social impact.

Here’s the truth:

In the United States, I’ve met thousands of qualified coders, data scientists, researchers, and students.

There is a surplus of skill.

However, most of these skilled workers complain about the same problem — their work isn’t fulfilling.

In places like Colombia, the beginning of an open data movement provides an unlimited resource for social impact projects.

The country wants to open up and utilize its data, but it needs help.

What is stopping skilled workers in places like the USA from coming to places like Colombia to help the government, NGOs, and startups with fulfilling projects?

I’m looking to connect qualified individuals to meaningful work.

Eventually, I’d like to help bring more opportunities for tech education here too.

If you’re interested in getting involved, please reach out.

This is only the beginning.

Jessie Smith – Machine Learning (ML) Ethics Researcher – California Polytechnic State University-San Luis Obispo


