Recently, I had the opportunity to sit on a Big Data panel hosted by Women in Computer Science (WiCS) at the University of Waterloo. The three panelists and the moderator all work in different roles in the Big Data world, and there was an impressive turnout of students eager to hear what we had to say.
Erin McLaughlin was our moderator, the connector who brought the panel together. She currently works as the Director of Business Analytics at Sun Life Financial and the Managing Partner at Big Incites, a consulting firm adept at delivering the right data, analytical tools and techniques to tackle Big Data and drive business value.
Amanda Edwards is a Senior Analytics Consultant at Sun Life Financial. She is a certified Scrum Master in charge of delivering Big Data projects at Sun Life.
Greta Cutulenco is a Master's student in UW’s Electrical and Computer Engineering department, with a focus on applying Trace Mining to machine data in order to predict equipment performance and failures. She has started her own company, Acerta, to apply Trace Mining to IoT (Internet of Things) data.
Finally, yours truly was the third panelist. I’m currently working at NetSuite Waterloo, which is in the midst of building Human Capital Management software for a new breed of data-driven, consumerized HR experience. My role is partly that of a Data Scientist, developing predictive People Analytics models, and partly that of a Product Manager, developing insightful analytical reports. As I’ve told anyone who listens, who knew HR was so exciting?
We often hear that Big Data is “hot” and that being a Data Scientist is the “hottest” job. So what exactly do we, those who work as Data Scientists, do in our daily work? And how did we come to hold this title?
It turns out, it’s less about the algorithms and more about working with people to deliver business value to customers. It’s all about the people.
Erin is in charge of finding out what different stakeholders within her company require from the data, and delivering insights to these internal stakeholders. Amanda works closely with her team to solve people issues and technical issues alike so that key projects are delivered on time. Without people skills to help define what insights are needed, data projects won’t answer the right questions. Without people skills while working with a team, data projects will get roadblocked like any other project.
Greta and I focused on the other interface of data with people -- data models and products are, in the end, consumed by people. If people don’t understand or trust the model, they will seize upon counterexamples, for any reason they can find, in order to discredit the algorithm.
Greta mines machine-generated data, and her models predict the probability of machine failure. If the model is not properly explained and communicated, her clients will absolutely seize on the false positive or false negative predictions that come with any statistical model to discredit it. While one can explain this away by saying “precision and recall” in a room of statisticians, it’s not so easy to explain to clients in other professions.
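For readers who have not met the terms, here is a minimal sketch of what “precision and recall” mean for a failure-prediction model. The counts are hypothetical numbers made up for illustration, not results from Greta's work, and the function name is my own.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) computed from confusion-matrix counts."""
    predicted_pos = true_pos + false_pos   # everything the model flagged
    actual_pos = true_pos + false_neg      # everything that really failed
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Hypothetical counts: 40 failures correctly flagged, 10 false alarms
# (false positives), and 20 failures the model missed (false negatives).
precision, recall = precision_recall(true_pos=40, false_pos=10, false_neg=20)

# Phrase the numbers the way a client would ask about them.
print(f"Of the machines we flagged, {precision:.0%} actually failed.")
print(f"Of the machines that failed, {recall:.0%} were flagged in advance.")
```

Stating the two numbers as plain sentences like these is one way to set expectations about false alarms and misses before a client runs into them.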
The job of a data scientist, then, is to communicate, communicate, and communicate. It’s not about saying “I’m right” but rather about providing value to the customers and clients -- giving them what they paid for -- which is the confidence and ability to use the tool (the model) you have built for them. For this reason, we often choose white-box models, which are more transparent (think regression), over black-box models, where the “why” of the prediction is hidden (think neural networks).
Greta also shared a wise piece of advice: don’t make the solutions fully autonomous; let the people choose. We want to build decision-support solutions with data -- we don’t want to take away people’s ability to make decisions. Perhaps once people consistently select the choices we recommend, and there is trust built between them and our solution, then they will choose to allow for automation. But until then, leave the decision with the people.
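To make the white-box point concrete, here is a small sketch of a one-variable linear regression fitted by ordinary least squares. The data (operating hours against a risk score) is hypothetical and chosen only so the arithmetic is easy to follow; none of it comes from the panel.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: operating hours (in thousands) and a risk score.
hours = [1, 2, 3, 4, 5]
risk = [3, 5, 7, 9, 11]

slope, intercept = fit_line(hours, risk)

# The whole model is two numbers that can be read aloud to a client.
print(f"Baseline risk score: {intercept:.1f}")
print(f"Each extra thousand hours adds {slope:.1f} to the score.")
```

The entire model is a baseline and a rate, so the “why” behind any single prediction can be explained in one sentence. A neural network offers no such direct reading of its weights.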
Data Scientists, Data Engineers, and really, just about anyone who has access to customer data, have a lot of power to piece together personal information about customers. With great power comes great responsibility to ensure that we don’t misuse the data while being mindful about any potential consequences from the data products and insights we deliver.
The panelists agreed that we need to pay close attention to guarding PII (Personally Identifiable Information) in the data we access. Just because we can look at or predict a piece of information doesn’t mean we will choose to see or share it. This has a lot of implications in my own work, as I’m in the business of predicting which of our customers’ employees might be a flight risk. My algorithm can never be sure of the prediction’s accuracy. I can also never know the full extent of the human stories behind the employee profiles. My predictions and recommendations need to aim at retaining talent and solving employees’ problems, rather than at identifying whom to let go.
The panel also discussed topics ranging from data privacy and security, to mentoring, to career advice. Interestingly, there were several questions from the students on the topic of mentoring. I was proud to share the Women in NetSuite Mentorship Program that I helped launch at NetSuite Waterloo. It was great to see that mentorship is being promoted more and more -- there’s great consensus that mentoring relationships benefit both the mentees and the mentors.
So, what are the key takeaways for our budding data scientists in the audience?
First, learn some SQL. Your resume will shine if you can demonstrate some backend experience.
Second, show initiative. Follow your passion and do what you love. Take the extra courses you're interested in. Complete the extra projects you're curious about. Develop a portfolio outside of what school asks you to do. This will tell the employers who you really are. Most of all, it’ll show them that you'll take the initiative to do what’s not directly asked of you.
And last but not least, persist through challenges. Show that you don’t give up, you keep trying, and you learn something from your experiences.
©2018 NetSuite Waterloo. 55 King St. West, Suite 900, Kitchener, ON | NetSuite | TribeHR