The Beauty of Structured Data

When it comes to survey design, Stephen Covey is absolutely right.

March 4, 2019




Like many people, I read Stephen Covey’s The 7 Habits of Highly Effective People when it came out. Although I understood the bestseller’s popularity, I found it a tad simplistic. Strangely, though, one of his suggestions stuck with me over the years:

Begin with the end in mind.

No, this isn’t always possible. Exhibit A: Slack started as a video game. Exhibit B: YouTube began as a dating site. Countless other startups have tried to “pivot”—although relatively few are successful. When it comes to data collection and analysis, though, Covey’s rule holds up in spades.


Sure, powerful tools such as OpenRefine are constantly improving their ability to make sense of unstructured data. Natural language processing continues to make strides. Ditto for robust Python libraries. Make no mistake, though: Analyzing structured data remains far easier than analyzing its unstructured counterpart, and I don’t see that changing anytime soon.

Cases in Point: Making Unstructured Data More Structured

When I started teaching each of my courses, I noticed that my predecessors by and large used survey tools that asked students to provide unstructured data. This was a problem. For instance, to collect peer feedback on capstone projects, students needed to enter the names of their teammates. Sure, this was easy for the students, but this immediately made my data corralling an unnecessarily complex exercise. In my larger classes, simply determining student averages took two hours. This I would not abide.

Let’s say that a six-person team consisted of Steve, Steven, Ian, Mark, Pete, and Lucy.¹ Allowing them to enter their teammates’ names and scores in free-form text fields resulted in chaos. Problems ran the gamut. Some people referred to Pete as Peter. International students often go by nicknames. The surveys allowed for typos. You get my point.
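To make the problem concrete, here is a minimal sketch in plain Python. The responses and scores are invented for illustration, and the function name is hypothetical. It groups scores by whatever string the student typed, which is roughly what a spreadsheet does with a free-form text column.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical free-form responses: (name as typed, score out of 10).
free_form = [
    ("Pete", 8),
    ("Peter", 9),
    ("pete ", 7),    # different case plus a trailing space
    ("Steve", 6),
    ("Steven", 9),
    ("Stevn", 5),    # typo: Steve or Steven? The data alone cannot say.
]

def average_by_name(rows):
    """Group scores by the exact string entered and average each group."""
    groups = defaultdict(list)
    for name, score in rows:
        groups[name].append(score)
    return {name: mean(scores) for name, scores in groups.items()}

averages = average_by_name(free_form)
print(len(averages))  # six distinct keys for what are really three people
```

Every spelling becomes its own "student," so each average is computed from a fragment of the real feedback. Trimming whitespace and lowercasing would merge some of these, but no cleanup rule can decide whom "Stevn" refers to.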

I quickly reconfigured the surveys in future semesters to make student-response data structured. Now they need to select from team-specific drop-downs.² Typos have gone the way of the dodo. Determining average student peer feedback merely requires creating a pivot table. I then upload their scores to Canvas and voilà! The entire process takes me maybe ten minutes.³
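With structured responses, the averaging step is short. The sketch below is one way to do it in plain Python rather than a spreadsheet pivot table. The roster constant, the sample responses, and the function name are all assumptions made for illustration, not the actual survey export.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical roster: the only values the team's drop-down would offer.
TEAM = ["Steve", "Steven", "Ian", "Mark", "Pete", "Lucy"]

# Hypothetical structured responses: (rater, teammate, score).
responses = [
    ("Lucy", "Pete", 8),
    ("Ian", "Pete", 9),
    ("Mark", "Pete", 7),
    ("Pete", "Lucy", 10),
    ("Ian", "Lucy", 9),
]

def average_scores(rows, roster):
    """Average the scores each teammate received, like a one-field pivot table."""
    allowed = set(roster)
    received = defaultdict(list)
    for rater, teammate, score in rows:
        # A drop-down should make this impossible; fail loudly if it happens.
        if teammate not in allowed:
            raise ValueError(f"unexpected name: {teammate!r}")
        received[teammate].append(score)
    return {name: mean(scores) for name, scores in received.items()}

averages = average_scores(responses, TEAM)
print(averages)
```

Because every name comes from a fixed list, the grouping key is trustworthy and no cleanup pass is needed before averaging.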

This hardly makes me exceptional. In a similar vein, Nextdoor made its data-collection process far more structured when confronted with a racial-profiling issue. (For more on this, see Analytics: The Agile Way.) Many other examples abound.

Simon Says

In my analytics class, students undertake semester-long individual research projects. To be sure, the scope of these projects varies immensely. During the early stages, some students take my advice to heart. Sadly, others ignore nuggets such as these—usually at their peril.

Regardless, sometimes the data arrives in a messy format—something that many of my more experienced pupils have already discovered over their careers. To the extent that we can control survey design and data collection, though, a little extra thought and time typically pays massive dividends down the road.

Footnotes

¹ Yes, this is a Marillion reference.
² I require students to enter a portion of their student ID numbers to prevent fraudulent responses.
³ In the future, I’d like to streamline this further by using the Canvas API.


PHIL SIMON
