What data science can learn from the LGBTQIA+ Community.

One of the first things we learn in Data Science 101 is that gender is a 'flag' variable: a binary value of 1 or 0 representing either male or female. In fact, during data validation, any entry that is neither 1 nor 0 is treated as erroneous.

Yet in reality, the conversation about gender has shifted from two extreme values to a spectrum of identities. So should data science keep up with the times?
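The flag-variable rule described above, and one way of loosening it, can be sketched in a few lines of Python. This is only an illustration: the category list, the `GenderResponse` class and both function names are hypothetical, and a real list of options should be drawn up together with the community it describes.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical options, for illustration only (an assumption, not a standard).
GENDER_OPTIONS = {
    "woman",
    "man",
    "non-binary",
    "self-described",
    "prefer not to say",
}


def validate_binary_flag(value: object) -> bool:
    """The classic rule: anything other than 0 or 1 is an erroneous entry."""
    return value in (0, 1)


@dataclass
class GenderResponse:
    category: str                           # one of GENDER_OPTIONS
    self_description: Optional[str] = None  # free text, in the person's own words


def validate_inclusive(response: GenderResponse) -> bool:
    """Accept any listed category; require text only when self-describing."""
    if response.category not in GENDER_OPTIONS:
        return False
    if response.category == "self-described":
        # A self-described entry needs some non-blank text to be meaningful.
        return bool(response.self_description and response.self_description.strip())
    return True
```

Under the first rule, an answer such as "non-binary" is simply rejected as bad data. Under the second, it is a valid entry, and someone who does not see themselves in the list can still be recorded in their own words.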

The primary goal of data science is to interpret data and generate insights, answering questions in the most efficient way possible. It adheres to the principle of parsimony, which dictates that a theory should provide the simplest viable explanation for a phenomenon. As such, with something as diverse as the LGBTQIA+ community, data science has yet to evolve its language to represent the conversation ethically (O'Neil, C., 2016).

Last Thursday, SG Narratives held a Pride workshop called Finding the Common in Community, together with WeWork Singapore. The workshop aimed to find the common in the community and to build a culture and an understanding of inclusiveness, no matter your background or orientation.

As an icebreaker activity, one that immediately brought our participants into the spirit of the workshop, we designed a chart that plots a person's gender expression against their sexual attraction. Across the graph, we drew an asexual line, a stark mark against the backdrop of a community that often excludes the A in LGBTQIA+.

Gender Expression and Sexual Attraction: A Graphical Conversation

The intent was to grow beyond our heteronormative boxes of male or female, a zero-sum identification that fits a person's identity to a societally accepted norm. While some might argue that such identification is necessary (for efficiency, as in the research reported in Bresnick, J., 2016), one must also consider that, depending on the objectives of the analysis, such a narrative lacks the nuances of a person's life.

Even then, as we worked through the graph, we had constant discussions about how to represent transsexual identities, gender fluidity, and demisexuality.

At one point, a participant (now a friend) took a marker and wrote "transsexual" on the graph before placing herself on it. The graph began to shift the room towards conversations about struggles and identities. One participant recounted his story of fearing rejection from his religious family for being gay. Another participant related to his struggle and offered words of counsel.

At SG Narratives, we believe that when you bare your soul in a safe space, people might identify with your stories, and these connections resonate. It might not solve the entirety of their challenges, but practising empathy is healing and liberating for both the storyteller and the empathiser.

It is these conversations that data science, in its pursuit of efficiency, risks missing out on*. If technology is capable of innovating its language to include these conversations, we would gather better insights into humanity at its best and its most vulnerable.

It all begins by providing a safe space to empathise and turn these conversations into data, no matter how constellated and diverse, just like star stickers across a map; a very flawed map but the start of practising empathy.

Happy Pride! 🏳️‍🌈


1. O'Neil, C. (2016). Slate’s Use of Your Data. [online] Slate Magazine. Available at: https://slate.com/technology/2016/02/how-to-bring-better-ethics-to-data-science.html [Accessed 4 Feb. 2016].

2. Bresnick, J. (2016). Big Data Shows Gender-Based Medical Error, Patient Safety Patterns. [online] HealthITAnalytics. Available at: https://healthitanalytics.com/news/big-data-shows-gender-based-medical-error-patient-safety-patterns [Accessed 22 Sep. 2016].

*Author's Notes

It is important to note that the opinions in this article are not meant to dispute the relevance of research in which such practices of gender identification are necessary, depending on the intent of the analytics. For other intents and purposes, however, such practices may limit the insights significantly and call for a review of inclusion and representation.
