Tag Archives: data management

This, That, and a Bit of The Other Thing

8 Aug

I like to make the cards that I give to people. Yes, I too often give in and buy the prefabricated ones, but even then, I try very hard to pick out ones that are blank inside, not substituting anyone else’s words for my own. I like the handmade touch. I have a small box with several cards that I made for my mom when I was a child. They are special. My mom treasured them enough to keep them, and now I keep them myself. A crayon-scribbled “You are the best mom,” accompanied by a cut-out construction paper flower, is worth saving.

 

A couple of cards that I made for my mom.

Besides the sentimentality of handmade items, they also share the message that the sender took a bit more time to make something just for you. I’m not knocking the time one can spend searching the shelves at the Hallmark store for just the right message, but you must admit that taking the time to make that right message says just a little something more. 

I thought about making cards earlier this week when I followed along with a listserv discussion about the practice of sending weekly articles, messages, and updates to patrons. A number of participants shared some very helpful resources – aggregators, if you will – for delivering timely pieces. It’s both easy and resourceful to subscribe to them. They scour the internet for stories about the latest medical procedure, disease outbreak, trend in healthcare, etc., and send them right to your email inbox for quick reading. Some even annotate them for you, so that you don’t have to be bogged down reading more than seven paragraphs. The suggestion offered in the discussion was to share these feeds with administrators or doctors or researchers or whoever your target audience is. It’s a great idea, but as I thought about it, the practice reminded me of buying a greeting card instead of making one yourself.

Libraries and librarians have given up a great deal of their identity (their brand) over the past years. The full text of articles is often accessed through third-party vendors or the websites of journals, despite the fact that it’s one’s library that’s often providing the resource. We buy catalogs developed by other companies, rather than developing homegrown management systems. We embed RSS feeds from other sources into our own websites.

And each and every one of these practices saves both time and money, but at what cost?

I got to wondering how much time it would really take to subscribe to a relevant aggregator or journal table of contents, or to set up a few alerts from custom-saved searches, or to put together several Twitter lists that follow sources specific to a group or department I serve. Then I could use these tools to create my own, customized delivery of an article or an interesting piece of news to that same group or department. Think of the return on the investment I’d get by sending a personal note directly to someone with the resource attached, as compared to the same resource coming from an automated – and branded by someone else – source. Now, I can already hear some naysayers saying, “I don’t have time to keep up with that.” Maybe not, but I think it might be worth a try.

A full shelf of writing and reading, plus Finz. And an autographed baseball. And a holiday ornament. Librarians don't need to be organized at home.

Related, another thing that I often hear people say is that we don’t have time to read ____ (insert whatever it is that you don’t have time to read – blog posts, journal articles, interesting pieces from the news). Similarly, many say that we don’t have time to write _____ (insert whatever it is that you don’t have time to write – blog posts, journal articles, etc.). This is a dilemma. To paraphrase Stephen King (the writer), if you want to be in the information business, you need to do two things above all others: read a lot of information and write a lot of information. How else can you stay on top of it? How else can you provide good information resources to those you serve? How do we call ourselves information professionals if we ignore the very thing that we’re supposedly experts in? We work in a fast-paced and rapidly changing profession. All the more reason to do those two things above all others. Read and write.

I write a post for this blog each week. Thanks to the kind words of many colleagues, not to mention usage statistics, I know that people read it. But I also read the writings of colleagues and other people who provide so much insight, interest, and entertainment to my work, that I can’t imagine how lousy I’d be at my job without them. With this stated, I’m sharing several really good things that crossed my radar over the past week. If you can find a moment or two to read them, you may find it worth your while:

  • Data Dictionaries, a blog post by Kristin Briney. If you’re charged with the task of managing data, at any level, Kristin’s blog is worth following and this particular piece is a great one to bookmark, because it’s really hard to find good posts and good examples on the topic.  
  • Your Two Kinds of Memory: Electronic and Organic, by Annie Murphy Paul. Medical librarians are forever grousing about a certain resource that’s ever-so-popular with doctors and medical students alike. Annie’s post offers an entirely different reason for concern.
  •  There’s a new series debuting on Cinemax soon about the early days of surgery in the United States. Period medical drama. “The Knick” is the creation of Steven Soderbergh and stars Clive Owen, so it surely has potential to be good. After ‘The Knick’: 7 Fascinating Books on the History of Medicine offers critique and … well, suggestions for further reading. (From the blog for the site, Word & Film.)
  • The Trouble with Medicine’s Metaphors is an article by Dhruv Khullar for the Atlantic. Khullar is currently doing a residency at the Massachusetts General Hospital and Harvard Medical School. Maybe it’s because I majored in philosophy, maybe because I love linguistics, maybe because I was in the hospital last week… for many reasons, I found this a great read.

Finally, I always read Amy Dickinson’s advice column. I need all of the everyday, practical advice that I can get. And my friend, Suzy Becker, wrote a most wonderful blog post to go along with the release of her latest book from Random House Kids this week. Author-Daughter Book Club just about made me cry in my cubicle. In a good way. Moms of sons and daughters, both, will enjoy it. I give shout outs to these two writers who, many days, make my day. 

All of the Data that’s Fit to Collect

28 Jul

My graduate thesis in exercise physiology involved answering a research question that required collecting an awful lot of data before I had enough for analysis. I was comparing muscle fatigue in males and females, and in order to do this I had to find enough male-female pairs that matched for muscle volume. I took skin fold measurements and calculated the muscle volume of about 150 thighs belonging to men and women on the crew teams of Ithaca College. Out of all of that, I found 8 pairs that matched. It was hardly enough for grand findings, but it was enough to do the analysis, write my thesis, successfully defend it, and earn my degree. After all, that’s what research at this level is all about, i.e. learning how to put together a study and carry it all the way through to completion.

During my defense, one of my advisers asked, “With all of that data, you could have answered ___, too. Why didn’t you?” I hemmed and hawed for a bit, before finally answering, “Because that’s not what I said that I was going to do,” an answer that my statistics professor, also in attendance, said was the right answer. Was my adviser trying to trick me? I’m not sure, but it’s an experience that I remember often today when I read and talk and work in a field obsessed with the “data deluge.”

The temptation to do more than what you set out to do is ever present, maybe even more today than ever before. We have years’ worth of data – a lot of data – for the mammography study. When the grant proposal was written and funded, it laid out specifics regarding what analysis would be done and what questions would be answered. Five years down the road, it’s easy to see lots of other questions that can be answered with the same data. A common statement made in the team meetings is, “I think people want to know Y” or “Z is really important to find out.” The problem, however, is that we set out to answer X. While Y and Z may well be valuable, X is what the study was designed to answer.

“LOD Cloud Diagram as of September 2011” by Anja Jentzsch – Own work. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons

I see a couple of issues with this scenario. First, grant money is a finite resource. In a time when practically all research operates under this funding model, people have a certain amount of time dedicated, i.e. paid for, by a grant. If that time gets used up answering peripheral questions or going down interesting, but unplanned, rabbit holes, the chance of completing the initial work on time is jeopardized. As one who has seen my original funded aims change over time, I know this can be frustrating. And don’t hear me saying that it’s all frustrating. On the contrary, along with the frustration can come some pretty cool work. The mini-symposium on data management that I described in earlier posts was a HUGE success for my work, but it’s not what we originally set out to do. The ends justified the means, in that case, but this isn’t always what happens.

The second issue I see is one that I hear many researchers express when the topics of data sharing and data reuse are raised, i.e. data is collected a certain way to answer a certain question. Likewise, it’s managed under the same auspices. Being concerned about what another researcher will do with data that was collected for another reason is legitimate. It’s not a concern that can’t be addressed, but it’s certainly worth noting. When I was finished with my thesis data, a couple of faculty members offered to take it and do some further research with it. There were some different questions that could be answered using the larger data set, but not without taking into account the original research question and the methods I used to collect all of it. Anonymous data sharing and reuse, without such context, doesn’t always afford that kind of consideration, at least not in the current climate where data citation and identification are still evolving. (All the more reason to keep working in this area.)

We have so many tools today that allow faster and more efficient data collection. We have grant projects that go on for years, making it difficult to say “no” to the new questions that come up along the way. We are inundated with data and information and resources that make it virtually impossible to focus on any one thing for any length of time.

The possibilities of science in a data-driven environment seem limitless. It’s easy to forget that some limits do, in fact, exist.

Let’s Decide!

6 Jun

The title of this post can be found written in large, bold letters in the notes I took during a meeting on Tuesday. “LET’S DECIDE!” It followed the side comment (my notes from any meeting are filled with side comments and/or digressions), “Basically, we can facilitate this work and see that as our role or keep doing our own thing.” I realize that it’s not truly an “either/or” situation, but…

Maybe I should offer a little background, first.

Initially, Aim 2 in the proposal for my work as an informationist on the mammography study was this:

Aim 2: Assist investigators in identifying and reporting information technology issues that have arisen in the implementation of the study that may be of use to others.

After spending a great deal of time searching the literature in fields from information technology to medical informatics to team science (or simply teamwork), I realized that not much existed that fit the issues that they’d encountered. Further, I wasn’t convinced that writing an article and/or white paper on the topic was the place to start in terms of reporting their experience. I thought that perhaps the place to start was bringing people together, i.e. the different stakeholders, to talk about the issues, problems, lessons learned, etc. that occur when IT folks and a research team come together to work on a project. I felt that such a discussion would yield a lot of valuable information that could then, somehow, be collected, organized, and disseminated in a useful manner. After a lot of talk and brainstorming within the team, we all agreed that this seemed a good path to take.

Making a long story short, this idea took hold, evolved, grew, and a couple of weeks ago, took the form of a mini-symposium that was part of the annual research retreat for our Center for Clinical and Translational Science. The program, entitled, “Data Acquisition, Data Management, and Subject Tracking in Clinical and Translational Research: Seeking Solutions to Persistent Challenges,” brought together the researchers from the mammography study, two faculty members from our Department of Quantitative Health Sciences, a biostatistician from the University of Massachusetts, Lowell, and a representative from our Information Services department. My role now is to pull all of the content from the symposium, along with other useful resources, and make it available online for the benefit of our research community.

This is all a really happy story for me in that I’ve been able to help facilitate and see something come together that we have been talking about in my library for a number of years now. Finally… FINALLY … people are starting to talk about issues around data. For too long, the only folks that I’ve heard talking about managing data are librarians. And frustratingly, we’ve mostly been talking among ourselves. But over the past months, I’ve been able to watch people that we’ve been wanting to reach addressing the issue. And best of all, the different players are talking to one another and not just among themselves.

So why the frustrating digression in my notes from Tuesday? Well, it’s because in my position, I can see several things happening. First, I can see several different camps, including the library, trying to stake their claim on one or another aspect in the data management services suite. And there’s a lot of overlap.

Secondly, there’s a lot of the feeling of “we’re the experts, so we should be the ones to do this.” Going along with this is also a lack of awareness and/or understanding of what each stakeholder really is expert in. For example, I might think that the people in Information Services ought to address issues around data storage and security. This is true, of course, but it leaves out the expertise that some in that department have around the proper ways to build databases and thus best practices in file structures and naming conventions and other things that might make me want to say, “Hey! That’s my area of expertise, not yours.” Similarly, many libraries developing data management services are focusing a great deal on providing data management planning in grant applications, but if you asked my colleagues in Quantitative Health Sciences, they’d say, “That’s what we do. Why are you saying it’s your role?”

Lastly, despite the success of the mini-symposium, there’s still an awful lot of “talking amongst ourselves” going on. I see this more easily, and thus get a little frustrated at times, because I have my foot in several different areas where I’m hearing the same message. In other words, despite the success of bringing people together for the mini-symposium, there’s still a lot of room for improving how well we communicate and coordinate our efforts, not only campus-wide, but even within my library. So when I wrote “LET’S DECIDE!” it was my reaction to what I see as a really big need that we can fill. There is a huge need for someone to fix the broken communication system, help eliminate some of the duplication of efforts, and facilitate the development of services around data within my institution. And I believe that someone is me and my colleagues in the library.

One of the characteristics of the library that was lost when we brought our resources to the researchers was our place as the hub of a lot of academic activity. People used to come to our physical library and here the different worlds of campus would collide. Researchers and faculty members and clinicians were forcibly less isolated in labs or offices. They literally ran into one another and likely had a bigger picture of things that were going on, simply through the interactions. At the same time, librarians were more easily able to know a lot of what was going on, too. We had a front row seat for all of the collisions. What I’ve found, as I got out of the library and started working on research teams, is that by going to the people that used to come to us, I’m bringing that lost quality back to life. While it can be incredibly frustrating to observe different groups addressing the same issues, each unaware of what the other is up to, the fact is that I can make them aware.

The mammography study team didn’t know that a team in the library has been working and working and working towards a goal of teaching good data management practices to the students, but as I’m a member of both teams, I did. So, when the study team made a suggestion that we recreate the symposium via a webinar series, archive it, and make it available to the students as part of their curriculum, I immediately chimed in, “Wait! Let me tell you what we’ve been working on.” A similar thing happened with the data management group in Quantitative Health Sciences. And now, we have a meeting scheduled for next month where we will bring these groups together – the research team, the QHS group, IS, and the library’s data services group.

To me, being able to facilitate these gatherings is one of the most rewarding parts of this informationist work. It’s a great role for librarians to take in the area of data management. As I wrote a few posts back, it’s the networking aspect of eScience and a place where we can put our skills to good use. The library itself used to bring people together. Today, librarians do.