All of the Data that’s Fit to Collect

28 Jul

My graduate thesis in exercise physiology involved answering a research question that required collecting an awful lot of data before I had enough for analysis. I was comparing muscle fatigue in males and females, and in order to do this I had to find enough male-female pairs that matched for muscle volume. I took skin fold measurements and calculated the muscle volume of about 150 thighs belonging to men and women on the crew teams of Ithaca College. Out of all of that, I found 8 pairs that matched. It was hardly enough for grand findings, but it was enough to do the analysis, write my thesis, successfully defend it, and earn my degree. After all, that’s what research at this level is all about, i.e. learning how to put together a study and carry it all the way through to completion.

During my defense, one of my advisers asked, “With all of that data, you could have answered ___, too. Why didn’t you?” I hemmed and hawed for a bit, before finally answering, “Because that’s not what I said that I was going to do,” an answer that my statistics professor, also in attendance, said was the right answer. Was my adviser trying to trick me? I’m not sure, but it’s an experience that I remember often today when I read and talk and work in a field obsessed with the “data deluge.”

The temptation to do more than what you set out to do is ever present, maybe more today than ever before. We have years' worth of data – a lot of data – for the mammography study. When the grant proposal was written and funded, it laid out specifics regarding what analyses would be done and what questions would be answered. Five years down the road, it's easy to see lots of other questions that can be answered with the same data. A common statement in team meetings is, "I think people want to know Y" or "Z is really important to find out." The problem, however, is that we set out to answer X. While Y and Z may well be valuable, X is what the study was designed to answer.

“LOD Cloud Diagram as of September 2011” by Anja Jentzsch – Own work. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons

I see a couple of issues with this scenario. First, grant money is a finite resource. In a time when practically all research operates under this funding model, people have a certain amount of time dedicated, i.e. paid for, by a grant. If that time gets used up answering peripheral questions or going down interesting, but unplanned, rabbit holes, the chances of completing the initial work on time are jeopardized. As one who has seen my original funded aims change over time, I know how frustrating this can be. And don't hear me saying that it's all frustrating. On the contrary, along with the frustration can come some pretty cool work. The mini-symposium on data management that I described in earlier posts was a HUGE success for my work, but it's not what we originally set out to do. The ends justified the means, in that case, but this isn't always what happens.

The second issue I see is one that I hear many researchers express when the topics of data sharing and data reuse are raised, i.e. data is collected a certain way to answer a certain question. Likewise, it's managed under the same auspices. Being concerned about what another researcher will do with data that was collected for another reason is legitimate. It's not a concern that can't be addressed, but it's certainly worth noting. When I was finished with my thesis data, a couple of faculty members offered to take it and do some further research with it. There were some different questions that could be answered using the larger data set, but not without taking into account the original research question and the methods I used to collect all of it. Anonymous data sharing and reuse doesn't always afford that context, at least not in the current climate, where data citation and identification are still evolving. (All the more reason to keep working in this area.)

We have so many tools today that allow faster and more efficient data collection. We have grant projects that go on for years, making it difficult to say "no" to new questions about the same project that come up along the way. We are inundated with data and information and resources that make it virtually impossible to focus on any one thing for any length of time.

The possibilities of science in a data-driven environment seem limitless. It’s easy to forget that some limits do, in fact, exist.

Come Together

18 Jul
Photo by Antonio. Used with permission. https://www.flickr.com/photos/antpaniagua/with/8110355091

What an exciting week it’s been! You know those days or moments when you see a lot of groundwork (hard work) start to pay off, like when you see the first tomato appear on the vine or the first sprig of a pepper plant pop up through the dirt? Well, we had one of those this week. For the past several years, we’ve been talking about and planning and laying the foundation to provide library services around the needs that our patrons have when it comes to working with data. Years, I tell you.

When my colleague, Rebecca, arrived last August to take the reins in this effort, I’d been out pounding the pavement for a good while, building relationships and doing individual data-related projects, and perhaps most importantly, getting a sense of who did what and when and where and how. Rebecca got to work strategizing, writing plans, working with our library’s administration and other higher-ups in the university, while Lisa and I provided experience and the connections needed to pull it all together. We developed a Library Data Services Advisory Group, bringing a few vested parties to the table. We did an extensive environmental scan to find out what the different stakeholders on campus thought the Library’s role might be in this area. We talked to lots of people. We surveyed students. We gained a lot of insight.

Meanwhile, I continued to do my work with the mammography study team, part of which involved helping put together a mini-symposium around data issues in clinical research. We brought together clinicians, members of our Quantitative Health Sciences (QHS) Department, and members of the University’s Information Technology Department. We also surveyed colleagues to gauge their interest and needs in this area.

Sitting in these different groups, working on these different teams, I started to see pretty clearly that multiple things were happening on campus; that there was at last some real thought and energy being put towards addressing some of the needs we have around data. I also started to see that a lot of right hands weren’t aware of what their left hands were doing. And the most exciting part of that (when I got past being frustrated) was this… I knew what both hands were doing! 

A few weeks back, I wrote about that frustrating part, as well as how exciting it can be when we (librarians and thus, the library) are positioned in a way to make things happen. And this past Monday was one of those exciting moments. We ALL came together: representatives of each of the groups that I’ve watched talk about what to do to address the data needs at UMMS. The librarians, the clinical researchers, the computing services folks, the QHS people… we were all at the same table, where we could share with one another what we do, what we know, and how we can help. And we came away with some very real, tangible projects that we can tackle together. It really was one of those times when I felt a sense of accomplishment in this task that’s been nebulous, to say the least.

And… I was also hired by the University of Rhode Island’s Library & Information Studies program to teach the course on Health Sciences Librarianship this fall. (I’m really excited about it!!) Totally unrelated to the previous tale, but the two events made for a pretty great week. I hope you’ve had the same!

The Doctor is Out

10 Jul

Psychiatric Booth

Admit it. We all know a lot better, a lot of the time. People know that sitting around all day isn’t the best thing for one’s health, but here we sit. We know that the label says there are 6 servings of macaroni and cheese in the box, but it really divides better by 2 or 3. We know that being distracted while driving isn’t the safest thing, but we text and we do our makeup and we fiddle with the radio and we play our ukuleles while we drive, anyway. And when it comes to information and data, of course we know that it’s best to back up our files in multiple places and formats, to name our files a certain way so that we can find things easily, and to write down instructions and practices so that we, or others, can repeat what we did the first time. Of course we know these things because, let’s be honest, it’s common sense. But… we don’t.
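(And to be fair, none of it is hard. Just as a rough sketch – the file and folder names here are made up for illustration, not anyone’s actual setup – the date-stamped backup copy we all know we should make is only a few lines of Python:)

```python
import shutil
from datetime import date
from pathlib import Path

# Made-up paths, purely for illustration.
working_file = Path("project_data/measurements.csv")
backup_dir = Path("backups/external_drive")

# A descriptive, date-stamped name, so Future You can tell the copies apart.
stamped_name = f"{working_file.stem}_{date.today().isoformat()}{working_file.suffix}"

# Keep a second copy somewhere other than the working folder.
backup_dir.mkdir(parents=True, exist_ok=True)
shutil.copy2(working_file, backup_dir / stamped_name)
print(f"Copied {working_file} to {backup_dir / stamped_name}")
```

Knowing it’s that easy doesn’t mean we’ll do it, which is rather the point.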

Personally, I get incredibly frustrated at librarians who think we’re adding something important to the world of data management just by teaching people these notions that really are common sense. I think that there’s something more we need to do, and it involves understanding a thing or two about the way people learn and the way they behave. In other words, lacking a behavioral psychologist on the research team, librarians would do well to study a few things from that camp and put them to use in our efforts at teaching, providing information, helping with communication issues, and streamlining the information and/or data processes in a team environment.

I’m preparing to teach a course in the fall and thus I’ve been reading some things about instructional design. In her book, Design for How People Learn, Julie Dirksen explains that when you’re trying to teach someone anything, it’s good practice to start by identifying the gaps that exist “between a learner’s current situation and where they need to be in order to be successful.” (p. 2) Dirksen describes several of these gaps:

  • Knowledge and Information Gaps
  • Skills Gaps
  • Motivation Gaps
  • Environment Gaps

What’s more, I believe she hits the nail on the head when she writes, “In most learning situations, it’s assumed that the gap is information – if the learner just had the information, then they could perform.” I know that I fall into this trap often (and I bet that I’m not alone). I believe that if I teach a student how to conduct a solid search in PubMed, that’s how they’ll search. I show them a trick or two and they say, “Wow!” I watch them take notes. I help them set up their “My NCBI” account. We save a search. They’ve got it! I feel like Daniel Day-Lewis in the movie There Will Be Blood: “I have a milkshake and you have a milkshake.” I have the knowledge and now you have the knowledge. Success!

Now if you do any work that involves teaching students or clinicians or researchers or anyone, you know not to pat yourself on the back too much here. I teach people, my colleagues teach people, all of our many colleagues before us (teachers, librarians at undergraduate institutions, librarians at other places where our folks previously worked) teach people. We all teach the same people, yet we keep seeing them doing things in their work involving information that make us throw up our hands. How many times do we have to tell them this?! 

Well, maybe it’s not in the telling that we’re failing. This is where I think understanding and appreciating the other gaps that may exist in these situations, and addressing them instead of simply passing along information, could lead us to much more success. And this is where we could use that psychologist.

Earlier this week, I tweeted that I was taking suggestions for what to rename the systematic review that I’ve been working on with my team, for it is anything but systematic. A’lynn Ettien, a local colleague, tweeted back the great new name, “Freeform Review.” I loved that. Another colleague, Stephanie Schulte, at the Ohio State University, offered up a really helpful link to a paper on the typology of reviews. But it was what my colleague, Eric Schnell, also at OSU, tweeted that led me to this blog post:

Tweet from Eric Schnell

BINGO! Every person on my team knows what the “rules” are, but they keep changing them as we go along. I spend time developing tools to help this process go more smoothly, but still get a bunch of notes emailed to me instead of a completed form. I give weeks to developing a detailed table of all of the elements we’ve agreed to look at. Except this one. Oh, and this. Oh, and should we also talk about this? I put my head down on the table.
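(For anyone wondering what I mean by a “completed form”: think of something as plain as a shared template with the fields the team agreed to extract, one row per article. A hypothetical sketch – the field names below are invented for the sake of example, not our actual protocol:)

```python
import csv

# Hypothetical extraction fields -- illustrative only, not the team's real list.
FIELDS = [
    "citation",
    "study_design",
    "sample_size",
    "population",
    "intervention",
    "outcomes_measured",
    "key_findings",
    "reviewer_initials",
    "date_extracted",
]

# Write an empty template for everyone to fill in, one row per article reviewed.
with open("review_extraction_template.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
```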

But Eric is exactly right. This is how most people deal with information. This is how we work. And it’s not a matter at all of people not knowing something, but rather a problem of people not doing something. Or better put, not doing something differently. Sometimes people do lack knowledge. Many times, people lack skills – something that a lot of practice can fix. But an awful lot of the time, what we really need to address are the gaps that have nothing to do with knowing what or how to do something.

Why won’t my people use the forms I’ve created and the tables that I’ve prepared? They said that they liked them. They said they were what they wanted. So… what’s the problem? I think it’s something that each of us who works in this field of information wrangling needs to become proficient at, i.e. learning to see and address all of the gaps that exist. At least the ones we can.

And I, for one, am still learning.