Tag Archives: Harvard

What is Data Science?

There’s no question that “data science” is becoming more and more popular. In fact, Booz Allen Hamilton (a consultancy) found:

The term Data Science appeared in the computer science literature throughout the 1960s-1980s. It was not until the late 1990s, however, that the field as we describe it here, began to emerge from the statistics and data mining communities. Data Science was first introduced as an independent discipline in 2001. Since that time, there have been countless articles advancing the discipline, culminating with Data Scientist being declared the sexiest job of the 21st century.

Unsurprisingly, there are countless graduate and undergraduate programs in data science (Harvard, Berkeley, Waterloo, etc.), but what is data science, exactly?

Given that the field is still in its proverbial infancy, there are a number of different perspectives. Booz Allen offers the following in their Field Guide to Data Science from 2015: “Describing Data Science is like trying to describe a sunset — it should be easy, but somehow capturing the words is impossible.”

Pithiness aside, there does seem to be consensus around some of the pertinent themes contained within data science. For instance, a key component is usually “Big Data” (both unstructured and structured data). Dovetailing with Big Data, “statistics” is often cited as an important component. In particular, practitioners need an understanding of the science of statistics (hypothesis testing, etc.), including the ability to manipulate data and, almost always, the ability to turn that data into something that non-data scientists can understand (e.g., charts and graphs). The other big component is “programming.” Given the size of the datasets, Excel often isn’t the best option for interacting with the data. As a result, most data scientists need to have their programming skills up to snuff (often in more than one language).

What’s a Data Scientist?

Now that we know the three major components of data science are statistics, programming, and data visualization, do you think you could tell data scientists apart from statisticians, programmers, or data visualization experts? It’s a trick question: they’re all data scientists (broadly speaking).

A few years ago, O’Reilly Media conducted research on data scientists:

Why do people use the term “data scientist” to describe all of these professionals?


We think that terms like “data scientist,” “analytics,” and “big data” are the result of what one might call a “buzzword meat grinder.” The people doing this work used to come from more traditional and established fields: statistics, machine learning, databases, operations research, business intelligence, social or physical sciences, and more. All of those professions have clear expectations about what a practitioner is able to do (and not do), substantial communities, and well-defined educational and career paths, including specializations based on the intersection of available skill sets and market needs. This is not yet true of the new buzzwords. Instead, ambiguity reigns, leading to impaired communication (Grice, 1975) and failures to efficiently match talent to projects.

So… the ambiguity in understanding the meaning of data science stems from a failure to communicate? Classic movie references aside, the research from O’Reilly identified four main “clusters” of data scientists (and roles within said “clusters”):

Some of the components described earlier fit within these clusters, along with two additional components: math/operations research (including things like algorithms and simulations) and business (including things like product development, management, and budgeting). The graphic below demonstrates the T-shaped nature of data scientists: they have depth of expertise in one area and knowledge of other closely related areas. NOTE: ML is an acronym for machine learning.


NOTE: This post originally appeared on GCconnex.

Is There a Way to Broadcast Ideology Without it Colouring Opinion?

There was a good article in the New York Times this past weekend from a professor of economics at Harvard, N. Greg Mankiw. He talked about how, when economists give advice on policies, they’re also giving advice as political philosophers. While this should come as no surprise to anyone, I think it’s good that it’s being discussed.

What’s more interesting to me, though, is how we can offer opinions or advice on matters as experts, while at the same time disclosing our inherent bias towards a given political philosophy. And if we do this, does that then colour the way the opinion is received? Most folks would say that of course it is going to colour the way the opinion is received, but maybe it wouldn’t. Regardless, I think it’s necessary to disclose biases, especially when it comes to giving policy advice.

The problem here is that people aren’t always aware that they have a given bias towards one political philosophy over another. While I’m relatively sure that I lean towards the “left” of the political spectrum when it comes to social issues, where I fall on the political spectrum when it comes to other matters can vary by issue. This is part of the reason why I encourage folks to take the time to read through some of the more notable philosophers.

I suppose the idea of signaling also comes into play on this matter. That is, if someone has a more conservative viewpoint on health policy and they support a more liberal policy, does that change the way other conservatives view the policy? Does it change the way liberals view the policy? Should it?

There are lots of questions, but no easy answers. As someone who’s steeped in biases in judgment and decision-making, I’m not sure which way would be best, but I’m glad that — at a minimum — it’s being discussed.

Harvard University’s Justice with Professor Michael Sandel

This past semester I had the good fortune of taking a class in . I rather enjoyed the class and it sparked my interest in deepening the learning on the subject. As a result, I did some digging and came across a course that has been put (Aside: I am a big fan of ). The course is called “Justice” and it is offered by Harvard University. The professor: Michael Sandel.

I find the subject of ethics fascinating. I think it is a subject that everyone should have at least a basic understanding of. That is, I think people should have read some of the basic texts or at least know some of the basic arguments from the different theorists or theories (eg. , , , , etc.).

Back to this course: it’s fantastic! There are 12 ‘episodes’ that are really a total of 24 classes. There’s a different subject each week and the professor really engages the students. There’s a great deal of discussion between the students and the professor on a range of moral issues. Here’s a quote from the professor during what could have potentially become a rather contentious point in the last episode: “We’ve done pretty well over a whole semester and we’re doing pretty well now dealing with questions that most people think that can’t even be discussed in a university setting.”

After watching all 12 episodes, I think that his quote is spot on. The students (and the professor) spoke about a number of contentious and possibly controversial subjects without descending into ad hominem attacks. In fact, the way that the students engaged in civil discourse is what I’d like to think that our politicians and pundits could do to set an example for the citizens of the world.

Here’s a short video preview of the course, in case you’re interested:


On another note, I’m really looking forward to reading Michael Sandel’s new book called: .