Preparing for some upcoming work, I took part in a webinar on systematic reviews yesterday morning. It was a brief but good overview of the process and the roles librarians and information scientists play in it. One thing that stuck out for me was the reminder by Dr. Edoardo Aromataris of the Joanna Briggs Institute, one of the program’s speakers, that a systematic review is a type of research and as such, it needs to be reproducible. He noted that the search strategy ultimately constructed in a review should yield pretty much the same results for anyone who repeats it.
Replication is a hallmark of the scientific method. As Jasny et al. state in the above-referenced quote from a special issue of Science on data replication and reproducibility, it is the gold standard of research. Science grows in value as it builds upon itself. Without replication, such growth is thwarted and findings remain limited to a study’s specific subject pool. If a study’s design becomes overly complicated and the research questions keep changing along the way, the study’s value gets clouded, if it remains at all.
I remember during my master’s thesis defense, one of my advisers asked me why I hadn’t done a particular statistical analysis to answer another question about the data I collected. I admit that the question threw me, but after thinking about it for a moment, I said, “Because that isn’t what I said that I would do.” My statistics professor, who was also sitting in on the defense, said calmly, after I hemmed and hawed and tried to defend my answer in a long and drawn-out way, “That’s the right answer.” In other words, when I proposed my study and laid out my methodology, I stated that I would do “x, y, and z.” If I later decided to do “q” simply because I thought “q” was more interesting, I wouldn’t have necessarily answered the research question that I set out to answer, nor would my methods be as strong as I initially put forward.
I bring all of this up this week because as I’ve been sitting in on the weekly meetings of my research team these past months, I can’t help but notice how often new questions are asked and how often those questions result in an awareness that the data needed to answer them is missing. This then leads to a lot of going back and gathering the missing data. Sometimes this is possible and sometimes it isn’t. For instance, you might go to see your doctor one time and you’re asked the question, “Do you smoke?” But the next time you visit, the nurse doesn’t ask you that same question. Usually, you’re asked something like, “Are you still taking (name the medication)?” You answer, “Yes,” but you fail to mention that you’ve changed dosage. Or that your doctor changed the dosage sometime during the past year. Is that captured in the record? Maybe, maybe not. And further, some insurance carriers require certain patient information while others do not. If you’re drawing subjects for a study from multiple insurance carriers, you’d better be sure that each is collecting all of the data that you need; otherwise, you cannot compare the groups. As the analyst on our study said yesterday, “If you can’t get all of the data, you might as well not get any of it.”
Now please remember that I am working as an informationist on a study led by two principal investigators and a research team that has been doing research for a very long time. They have secured any number of big grants to do big studies. They are well-respected and know a whole helluva lot more about clinical research than me and my little master’s-thesis-experienced self. I’m not questioning their methods or their expertise at all. Rather, I’m pointing out that this kind of research – research that involves a lot of people (25+ on the research team), thousands of subjects, a bunch of years, several sources of data (and data and data and data…), and a whole lot of money over time – is messy. Really, really messy! In other words, an awful lot, if not the majority, of biomedical and/or health research today is messy. And as an observer of such research, I cannot help but wonder how in the world these studies could ever be replicated. As that issue of Science noted, research today is at a moment when so many factors are affecting the outcomes that it’s time for those involved in it to stop and evaluate these factors, and to ensure that the work being done – the science being done – meets high standards.
What’s more, as a supposed “expert” in the area of information and a presumed member of the research team, I’m feeling at a loss as to what I can do, at this point in the study, to clean it up. Yes, I admit that yesterday just wasn’t my best day on the study and maybe that’s coloring part of my feelings today. I didn’t have anything to offer in the meeting. I didn’t feel like much of a part of the team. It happens.
So can I take a lesson from the day’s events? The answer to that is unequivocally “YES!” and here’s why…
In the afternoon, I had a meeting with a different PI for a different study. We’re exploring areas where I can help her team and writing up a “scope of work” to embed me as an informationist on the study. It’s a very different kind of study and not as big as the mammography study (above), but it still involves multiple players across multiple campuses, and it ultimately will generate a whole bunch of data from a countless number of subjects. The biggest difference, though, is timing. And this is the take-away lesson for me in regard to what brings success to my role. When a researcher is just putting together his/her team, when s/he is just beginning to think about the who and what and where and why of the study, if THEN s/he thinks of including an individual with expertise in information, knowledge, and/or data management, the potential value of that person to the team and to the work is multiplied several fold.
This is because it’s at the beginning of a study that an informationist can put his/her skills to use in building the infrastructure, the systems, and/or the tools needed to make the flow of information, data, and communication go much more smoothly. It’s hard to go back and fix stuff. It’s much easier to do things right from the beginning. Again, I’m not saying that the mammography study is doing anything wrong, but building information organization into your methods from the get-go can surely help reduce the headaches down the road. And fewer headaches + cleaner data = better science, all the way around.