The Problem with Big Data: Lies, Damn Lies, and Statistics

I’ve used this subtitle in a previous post, and it fits the content of this post well enough to use again. I was reading a post from Tim Ferriss the other day and it made me think of statistics. The post is about alternative medicine, but understanding that isn’t entirely necessary for the point I’m making. Here’s some context:

Imagine you catch a cold or get the flu. It’s going to get worse and worse, then better and better until you are back to normal. The severity of symptoms, as is true with many injuries, will probably look something like a bell curve.

The bottom flat line, representing normalcy, is the mean. When are you most likely to try the quackiest shit you can get your hands on? That miracle duck extract Aunt Susie swears by? The crystals your roommate uses to open his heart chakra? Naturally, when your symptoms are the worst and nothing seems to help. This is the very top of the bell curve, at the peak of the roller coaster before you head back down. Naturally heading back down is regression toward the mean.

If you are a fallible human, as we all are, you might misattribute getting better to the duck extract, but it was just coincidental timing.

The body had healed itself, as could be predicted from the bell curve–like timeline of symptoms. Mistaking correlation for causation is very common, even among smart people.

And the important part of the quote [Emphasis Added]:

In the world of “big data,” this mistake will become even more common, particularly if researchers seek to “let the data speak for themselves” rather than test hypotheses.

Spurious connections galore–that’s what the data will say, among other things.  Caveat emptor.
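The cold-and-duck-extract scenario from the quote can be sketched in a few lines. This is only an illustration under assumptions of my own: the `severity` function, the ten-day timeline, and the three-day follow-up are all made up, and the "remedy" is defined to do nothing at all.

```python
import math

def severity(day, peak_day=5.0, spread=2.0):
    """Bell-curve-shaped symptom severity: 1.0 at the peak, near 0 when well."""
    return math.exp(-((day - peak_day) ** 2) / (2 * spread ** 2))

days = range(0, 11)
timeline = [severity(d) for d in days]

# Assumption: the remedy is taken on the worst day and has no effect whatsoever.
remedy_day = max(days, key=lambda d: timeline[d])
later_day = remedy_day + 3
improvement = timeline[remedy_day] - timeline[later_day]

print(f"Remedy taken on day {remedy_day}, severity {timeline[remedy_day]:.2f}")
print(f"Severity on day {later_day}: {timeline[later_day]:.2f}")
print(f"Apparent improvement from a useless remedy: {improvement:.2f}")
```

Because the remedy is taken at the peak, symptoms can only go down afterwards, so the remedy appears to work even though it never enters the calculation.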

This analogy reminded me of the first time I learned about correlation and causation in my first psychology class as an undergraduate. It had to do with ice cream, hot summer days, and swimming pools. In fact, here’s a quick summary from Wikipedia:

An example of a spurious relationship can be illuminated by examining a city’s ice cream sales. These sales are highest when the rate of drownings in city swimming pools is highest. To allege that ice cream sales cause drowning, or vice-versa, would be to imply a spurious relationship between the two. In reality, a heat wave may have caused both. The heat wave is an example of a hidden or unseen variable, also known as a confounding variable.
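The ice cream example can be simulated to see the confounding variable at work. What follows is a rough sketch, not real data: the temperature range, the coefficients, and the noise levels are assumptions I picked for illustration. Temperature drives both quantities, and neither one affects the other.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: the heat is the hidden cause of both series.
temps = [rng.uniform(10, 35) for _ in range(2000)]
ice_cream_sales = [20 * t + rng.gauss(0, 50) for t in temps]
drownings = [0.3 * t + rng.gauss(0, 1) for t in temps]

r_sales_drownings = pearson(ice_cream_sales, drownings)
r_sales_temp = pearson(ice_cream_sales, temps)
r_drownings_temp = pearson(drownings, temps)

# Partial correlation: sales vs. drownings with temperature held fixed.
partial = (r_sales_drownings - r_sales_temp * r_drownings_temp) / math.sqrt(
    (1 - r_sales_temp ** 2) * (1 - r_drownings_temp ** 2)
)

print(f"raw correlation, sales vs. drownings: {r_sales_drownings:.2f}")
print(f"after controlling for temperature:    {partial:.2f}")
```

The raw correlation comes out strong, while the partial correlation (the standard first-order formula) hovers near zero once temperature is accounted for. That gap is what a spurious relationship looks like in numbers.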

Getting back to what Ferriss was saying near the end of his quote: as “Big Data” grows in popularity (and use), there may be an increased likelihood of making errors in the form of spurious relationships. One way to mitigate this error is education. That is, if the people who are handling Big Data know and understand things like correlation vs. causation and spurious relationships, these errors may be less likely to occur.

I suppose it’s also possible that some, knowing about these kinds of errors and how little the average person might know when it comes to statistics, could deliberately report misleading statistics. I’d like to think that people aren’t doing this and it just has more to do with confirmation bias.

Regardless, one way to guard against this inaccurate reporting would be to use hypotheses. That is, before you look at the data, make a prediction about what you’ll find in the data. It’s certainly not going to solve all the issues, but it’ll go a long way towards doing so.
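To see why predicting first matters, here is a small simulation of “letting the data speak for themselves.” It is a sketch under assumptions of my own: 1,000 candidate variables and a target, all pure random noise, with only 20 observations each. Nothing here is related to anything else.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(42)  # fixed seed so the run is repeatable
n_obs, n_vars = 20, 1000

target = [rng.gauss(0, 1) for _ in range(n_obs)]
candidates = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
correlations = [pearson(column, target) for column in candidates]

# Hypothesis first: commit to one variable (index 0) before looking at anything.
prespecified_r = correlations[0]

# Data first: scan everything and keep whichever correlation looks best.
best_index = max(range(n_vars), key=lambda i: abs(correlations[i]))
best_r = correlations[best_index]

# |r| > 0.444 is roughly the usual 5% significance cutoff for 20 observations.
looks_significant = sum(1 for r in correlations if abs(r) > 0.444)

print(f"pre-specified variable: r = {prespecified_r:+.2f}")
print(f"best of {n_vars} variables: r = {best_r:+.2f}")
print(f"variables that look 'significant': {looks_significant}")
```

With this many variables, dozens of them clear the usual significance bar by chance alone, and the best one looks impressive. Committing to a single prediction in advance removes the chance to cherry-pick.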

How History’s Most Famous People Scheduled Their Day Doesn’t Matter

Last month, there was a chart that was making its way around showing how some of the most famous creative people scheduled their day.

To be perfectly honest, how they scheduled their day should have little to no effect on how you schedule your day. I appreciated that some articles (like the one from Mic) acknowledged part of the issue:

Since the greats examined here were already generally well-off and moderately successful before the peak of their careers, it’s hard to tell whether the schedules helped them reach success or were a product of it.

The sentence that follows is the most important of the article:

But what is clear is that the vast majority spent large stretches of time doing intellectual and creative work on a regular basis.

Trying to plan how you should spend your day based on how da Vinci or Picasso spent their days is ludicrous. They lived in a completely different time than we do. More than that, the ways that they scheduled their days might not be the most advantageous way for you to structure yours. That is, maybe you’re not an early riser; maybe you’re a night owl. Or maybe you’re a hybrid in that some days you stay up late and some days you wake up early.

As the article in Mic alludes to near the end, but doesn’t outright say, there are only two important things to consider here: sleep and exercise. Time and time again, research has shown positive correlations between sleep and creativity, and between exercise and creativity. If you want to be creative, there’s a better chance that you’ll be successful if you get enough sleep and you get some exercise. Everything else is optional.

To Tech or Not To Tech: Hiking the Appalachian Trail

It’s hard to believe that it’s only been one month since my last post. It feels like the last time I wrote something was ages ago. In March, I said that I intended to write something once a week, but I suppose having an infant, moving, and preparing to start a new job have made that a little harder than I imagined. Nonetheless, I stole away some time today to write about technology and the Appalachian Trail (AT).

A few summers ago (actually, now that I think about it, it was 6 years ago), I had the good fortune to spend some time hiking on the Appalachian Trail. It was my first time on an extended hike and I really enjoyed it. While on the hike, I learned that the trail spans 14 states including the beginning/end in Maine/Georgia. Many folks try to hike the whole thing in a summer. Lots succeed, but many more give up. When I hiked part of the AT in 2008, technology wasn’t as advanced as it is today (obviously), but I was wondering how I might want to approach this subject when I decide to hike the AT again.

This thought was sparked by a post in Scientific American bemoaning the use of technology on the trail. I can see where she’s coming from — for sure. Most people decide to go into nature to get away from technology. She also makes some good points as to how technology can help in an emergency (read: bear eats pack).

I think if I were to hike the AT tomorrow, I might bring along a MacBook Air, for the sole purpose of writing. That is, I’d intend to do like David Roberts did and take a hiatus from social media (which for me, mainly means Twitter). I say intend because I’ve learned that hard-and-fast rules can sometimes be more difficult to uphold. I suppose I could simply skip getting a data plan, which would make it quite difficult to check things like Twitter.

When I do decide to hike the whole of the AT (sometime in the next 30 years), our relationship to technology may be very different. Maybe Google Glass (or an iteration thereof) will be more user-friendly. Maybe it’ll be ingrained in the way we live our days like smartphones have become. Maybe there’ll be something after Google Glass and something beyond the impending smartwatches. Regardless of how technology evolves, we’ll always be left with the choice: to tech or not to tech.

How to Solve the Password Problem: Teach Kids When They’re Young

I came across an article a few days ago that explained how to teach humans to remember really complex passwords. As I was reading it, I couldn’t help but think that there’s an important piece of the solution: habit.

When we first started using computers, coming up with a super-difficult password wasn’t necessary as we were usually just trying to keep our stuff protected from our family members. Then, it was trying to keep things protected from our co-workers. Slowly, that grew and grew until now, someone (or something!) on the other side of the planet can figure out your password and hack into your online accounts.

I wonder, if we were taught how to come up with complex passwords when we were younger, would there still be such a high percentage of people using easy-to-crack passwords? That is, if we only knew passwords to be in the form of “passphrases,” would someone still try to use a word as their password? While there would still probably be some, my guess is that the percentage would drop.
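To make the word-versus-passphrase comparison concrete, here is a minimal sketch. The tiny `WORDS` list and the helper names are hypothetical and only for illustration. The 7,776-word list size is an assumption borrowed from common dice-based word lists, and the entropy figure only holds if every word is chosen independently and at random.

```python
import math
import secrets

# Hypothetical, tiny word list for illustration only; a real list would be far larger.
WORDS = ["maple", "rocket", "violin", "harbor", "pickle", "tundra", "lantern", "cobalt"]

def make_passphrase(words, n_words=4, separator="-"):
    """Join n_words picked with a cryptographically secure random source."""
    return separator.join(secrets.choice(words) for _ in range(n_words))

def entropy_bits(pool_size, picks):
    """Bits of entropy for `picks` independent, uniform choices from `pool_size` options."""
    return picks * math.log2(pool_size)

passphrase = make_passphrase(WORDS)
print(passphrase)

# Assumed list size: 7,776 words.
print(f"one random word:   {entropy_bits(7776, 1):.1f} bits")
print(f"four random words: {entropy_bits(7776, 4):.1f} bits")
```

Each extra word adds the same number of bits, so a four-word passphrase is about 7,776 cubed times harder to guess than a single word from the same list, while arguably staying easier to remember than a string of random characters.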

So, how do we teach our kids to use smarter passwords? Well, assuming that kids at some point are still taught how to type in school, I see this as the perfect opportunity to also teach them about how to use passphrases for accounts. Assuming that students will have to log on to a computer to use the program that teaches them how to type, this is the best time to imprint the habit of using an effective password.

Of course, this won’t solve the problem of all the people out there today who still use “password” or “1234password” for their password, but it will help to keep the problem from growing by not adding to the number of people with poor password habits.

~

Extending this idea, there may still be some adults or teens out there who are still learning how to type. In these cases, we could have the software that is teaching them how to type also teach them about good password habits. If the adults are learning how to type in some sort of class, this could also be a good place to teach them about good password habits.

A Lesson in Overcomplicated: Gender-Neutral Washrooms

If you’ve ever been part of an organization, there’s a better chance than not that you’ve been involved in a meeting where at some point, you found yourself thinking, “what the heck are we doing?” Well, hopefully you’ve found yourself saying that, otherwise you might have fallen into the trap of overcomplicating something.

There was a great (and short!) post on Pacific Standard about the “problem” of a sign for a gender-neutral bathroom:

“But what would you put on the door?!” said a facility manager at an airport, his concern echoed by an administrator at a university: “When people are looking for a restroom, they look for the ‘man’ or ‘woman’ icon. It’s what we know to look for that means restroom.”

And the sign that answers this problem:

Wow, right?

This situation is a perfect example of how overthinking something can lead to a terrible and overcomplicated solution. Is this sign really necessary to signify that there’s a toilet behind the door (or around the corner, in the case of many airports)? Absolutely not.

While there are many problems we can talk about, let’s look at the key issue: false dilemma. Presumably, upon trying to develop a solution to this problem, the people in the meeting thought that something had to be added to the existing sign. That is, the sign is usually a little man or a little woman, so we’ve got to make it resemble that little man or woman or people might be confused. There are clearly more options than creating that weird-looking sign. From the post, there’s this sign offered:

That seems like a pretty good alternative to me. It’s universal in that many people know what a toilet looks like. To be sure, the person who came up with the idea of this pictorial representation took his laptop to a coffee shop to ask patrons if they could hazard a guess as to what the sign meant: 100% of participants were able to identify what would be behind a door with this sign on it. The author, obviously in jest, explained that his research was limited to a corner in Philadelphia, but I think it’s safe to say that most people would be able to perform as well as his participants.

So, the next time you’re in a meeting where your team is trying to come up with an idea that uses an existing structure/idea, check whether it might be better to approach the problem from a different perspective.

How Smartphones Can Lead to Better Parents

Over three years ago, I wrote a post about cell phone etiquette. At the time I wrote that, I wouldn’t have guessed that three years later, I’d be considering the possibility that smartphones could actually lead to better parents.

But that’s exactly what this post is about.

The stereotype goes that many parents will bring their children to the park (and/or some activity) and upon arriving, they shoo away their children only to peer down at their cell phones. Some folks do this while out to dinner with friends (even though they don’t have kids, see here). Many will cringe upon seeing parents sitting on the bench engrossed in the goings-on of their cell phones. Farhad Manjoo, however, points out how smartphones can actually make for more available parents [Emphasis Added]:

But we rarely consider how, by liberating us from the office, smartphones have greatly expanded the opportunity for certain kinds of workers to increase their involvement in their children’s lives. Because you can work from anywhere thanks to your phone, you can be present and at least partly attentive to your children in scenarios where, in the past, you’d have had to be totally absent. Even though my son had to yell for my attention once when I was fixed to my phone, if I didn’t have that phone, I would almost certainly not have been able to be with him that day — or at any one of numerous school events or extracurricular activities. I would have been in an office. And he would have been with a caretaker.

Stop and consider that for a moment: having a smartphone can actually make you more available as a parent. Now, this isn’t a commercial for smartphones, but it’s certainly something that should give you pause for consideration. I know it did for me when I read it. This idea put forth from Manjoo is exactly the kind of thing that I’m talking about when I say putting a new perspective on things. Someone who is so focused on how smartphones are bad for parents and how they keep parents from their children wouldn’t be able to see the possibility that for a small population, having a smartphone can actually allow a parent to be away from the office and with their children.

This idea isn’t meant to invalidate the idea that smartphones are changing the relationship we have with our children, but the idea that smartphones are allowing us to be with our children more is, to be hyperbolic for a moment, paradigm-altering. A key step to being a better parent is being able to be with your children. So, if smartphones can get us out of the office and next to our kids, isn’t that an important step?

~

There still might be some of you out there who unequivocally think we shouldn’t be on our phones when we’re with our kids and that’s okay, but I hope that you’ll at least consider (reflect, think about, ponder, etc.) the possibility that the opposite may be true. It’ll put you one step closer to defending against confirmation bias.

A New Way to Use Pinterest: Financial Charts

I don’t remember when I first signed up for Pinterest, but I do remember that when I did, I had “big” plans of using the site to create a vision board. As you can see from my Pinterest page, I haven’t used it since I signed up. There are any number of explanations I could offer as to why I haven’t really done what I had initially thought I would, but this post isn’t about my usage of Pinterest. It’s about Josh Brown’s.

You see, many people (or at least it certainly seems like it) use Pinterest for shopping. That is, they see something they like and Pinterest is a way to bookmark that image. There are also businesses that use Pinterest to get a better understanding of how their customers like or dislike their products. There are hobbyists and designers who are trying to showcase their ideas. There are even people who share recipes through Pinterest. In all that I’ve heard about Pinterest, I had never heard of someone using it to share financial charts.

Can anyone tell me what this is an example of? Hint: I wrote about this decision-making bias as recently as last month.

Functional Fixedness.

Josh Brown, the person I mentioned earlier, uses Pinterest to bookmark “amazing charts.” These financial charts, in a way, are breaking through that bias of functional fixedness. By using Pinterest to showcase financial charts, Brown found a way to use Pinterest that was a little out of the ordinary.

There are probably dozens of examples of this in your daily life. On your commute this morning/afternoon (or the next time you head to work), I want you to take a wider perspective and see if you can notice anyone using something in a way that you hadn’t considered. Maybe someone’s using a skateboard as a “wagon”: they’ve tied a string to the truck (where the wheels are) and are letting someone pull them down the street. Maybe watching them participate in what some may consider a dangerous activity gives you that flash of an idea you’ve been looking for on a problem you’ve been having. Lateral thinking begets lateral thinking.