Tag Archives: reproducibility

Repeat After Me

22 Aug

Reproduction

“Reproducibility is the ability of an entire experiment or study to be reproduced, either by the researcher or by someone else working independently. It is one of the main principles of the scientific method and relies on ceteris paribus.” (Wikipedia)

I was going to start this post with a similar statement in my own words, but couldn’t resist the chance to quote Latin. It always makes you sound so smart. But regardless of whether these are a Wikipedia author’s words or my own, the point is the same – one of the foundations of good science is the ability to reproduce the results.

My work for the neuroimaging project involves developing a process for researchers in this field to cite their data in such a way that makes their work more easily reproducible. The current practice of citing data sets alone doesn’t always make reproducibility possible. A researcher might take different images from a number of different data sets to create an entirely new data set, in which case citing the previous sets in whole doesn’t tell exactly which images are being used. Thus, this gap can make the final research harder to replicate, as well as more difficult to review. We think that we may have a way to help fix this problem and that’s what I’ve been working on for the past few months.

At the same time, I’ve been working on a systematic review with the members of the mammography study team. This work has me locating and reading and discussing a whole slew of articles about the use of telephone call reminders to increase the rate of women receiving a mammogram within current clinical guidelines. It also has me wondering about the nature of clinical research and the concept of reproducible science, for in all of my work, I’ve yet to come across any two studies that are exactly alike. In other words, it doesn’t seem to be common practice for anyone to repeat anyone else’s study. And I can’t help but wonder why this is so.

I imagine it has something to do with funding. Will a funding agency award money to a proposal that seeks to repeat something, something unoriginal? Surely they are more apt to look to fund new ideas.

Maybe it has to do with scientific publishing. Like funding agencies, publishers probably much prefer to publish new ideas and new findings. Who wants to read an article that says the same thing as one they read last year?

Of course, it may also be that researchers look to improve on previous studies, rather than simply repeat them. This is what I see in all of the papers I’ve found for this particular systematic review. The methods are tweaked from study to study; the populations differ just a bit, the length of time varies, etc. It makes sense. The goal of this body of research is to determine what intervention works the best and in changing things slightly, you might just find the answer. What has me baffled about this process, though, is that as we continue to tweak this aspect or that aspect of a study’s methodology, when and/or how do we ever discover what aspect actually works and then put it into practice? 

Working on this particular review, I’ve collected easily 50+ relevant articles, yet as we pull them together – consolidate them to discover any conclusions – the task seems, at times, impossible. Too often, despite the relevancy of the articles to the question asked, what you really end up comparing is apples to oranges. How does this get to the heart of scientific discovery? How does it influence or generate “best practice”? I can’t help but wonder.

Yesterday, during my library’s monthly journal club, we discussed an article that had been recommended reading to me by one of the principal investigators on the mammography study. “How to Read a Systematic Review and Meta-analysis and Apply the Results to Patient Care” is the latest Users’ Guide on the subject from the Journal of the American Medical Association (JAMA). It prompted a lively session about everything from how research is done, to how medical students are taught to read the literature, to how the media portrays medical news. I recommend it.

Of course, there are many possible answers to my question and many factors at play. My wondering and our journal club discussion don’t afford any concrete solution or answer. Still, I feel it’s a worthwhile topic for medical librarians to think about. If you have any thoughts, please keep the discussion going in the comments section below.

Repeat After Me

13 Mar

Quote from Science

Preparing for some upcoming work, I took part in a webinar on systematic reviews yesterday morning. It was a brief, but good, review/overview of the process and the roles librarians and/or information scientists have in it. One thing that stuck out for me was the reminder by Dr. Edoardo Aromataris of the Joanna Briggs Institute, one of the program’s speakers, that a systematic review is a type of research and as such, it needs to be reproducible. He noted that the search strategy ultimately constructed in a review should yield pretty much the same results for anyone who repeats it.

Replication is a hallmark of the scientific method. As Jasny et al. state in the above-referenced quote from a special issue of Science on data replication and reproducibility, it is the gold standard of research. Science grows in value as it builds upon itself. Without the characteristic of replication, such growth is thwarted and findings become limited to a study’s specific subject pool. If a study’s design becomes so complicated and the research question(s) keep changing along the way, the study’s value gets clouded, if it remains at all.

I remember during my master’s thesis defense, one of my advisers asked me why I hadn’t done a particular statistical analysis to answer another question about the data I collected. I admit that the question threw me, but after thinking about it for a moment, I said, “Because that isn’t what I said that I would do.” My statistics professor, who was also sitting in on the defense, said calmly, after I hemmed and hawed and tried to defend my answer in a long and drawn out way, “That’s the right answer.” In other words, when I proposed my study and laid out my methodology, I stated that I would do “x, y, and z.” If I later decided to do “q” simply because I thought “q” was more interesting, I wouldn’t have necessarily answered the research question that I set out to answer, nor would my methods be as strong as I initially put forward.

I bring all of this up this week because as I’ve been sitting in on the weekly meetings of my research team these past months, I can’t help but notice how often new questions are asked and how often those questions result in an awareness that the data needed to answer them is missing. This fact then leads to a lot of going back and gathering the missing data. Sometimes this is possible and sometimes it isn’t. For instance, you might go to see your doctor one time and you’re asked the question, “Do you smoke?” But the next time you visit, the nurse doesn’t ask you that same question. Usually, you’re asked something like, “Are you still taking (name the medication)?” You answer, “Yes,” but you fail to mention that you’ve changed dosage. Or that your doctor changed the dosage sometime during the past year. Is that captured in the record? Maybe, maybe not. And further, some insurance carriers require certain patient information while others do not. If you’re drawing subjects for a study from multiple insurance carriers, you’d better be sure that each is collecting all of the data that you need, otherwise you cannot compare the groups. As the analyst on our study said yesterday, “If you can’t get all of the data, you might as well not get any of it.”

Now please remember that I am working as an informationist on a study led by two principal investigators and a research team that has been doing research for a very long time. They have secured any number of big grants to do big studies. They are well-respected and know a whole helluva lot more about clinical research than me and my little master’s-thesis-experienced self. I’m not questioning their methods or their expertise at all. Rather, I’m pointing out that this kind of research – research that involves a lot of people (25+ on the research team), thousands of subjects, a bunch of years, several sources of data (and data and data and data…), and a whole lot of money over time – is messy. Really, really messy! In other words, an awful lot, if not the majority, of biomedical and/or health research today is messy. And as an observer of such research, I cannot help but wonder how in the world these studies could ever be replicated. As that issue of Science noted, research today is at a moment when so many factors are affecting the outcomes that it’s a time for those involved in it to stop and evaluate these factors, and to ensure that the work being done – the science being done – meets high standards.

More, as a supposed “expert” in the area of information and a presumed member of the research team, I’m feeling at a loss as to what I can do, at this point in the study, to clean it up. Yes, I admit that yesterday just wasn’t my best day on the study and maybe that’s coloring part of my feelings today. I didn’t have anything to offer in the meeting. I didn’t feel like much of a part of the team. It happens.

So can I take a lesson from the day’s events? The answer to that is unequivocally “YES!” and here’s why…

In the afternoon, I had a meeting with a different PI for a different study. We’re exploring areas where I can help her team and writing up a “scope of work” to embed me as an informationist on the study. It’s a very different kind of study and not as big as the mammography study (above), but it still involves multiple players across multiple campuses, and it ultimately will generate a whole bunch of data from a countless number of subjects. The biggest difference, though, is timing. And this is the take-away lesson for me with regard to what brings success to my role. When a researcher is just putting together his/her team, when s/he is just beginning to think about the who and what and where and why of the study, if THEN s/he thinks of including an individual with expertise in information, knowledge, and/or data management, the potential value of that person to the team and to the work is multiplied several fold.

This is because it’s in the beginning of a study when an informationist can put his/her skills to use in building the infrastructure, the system, and/or the tools needed to make the flow of information and data and communication go much more smoothly. It’s hard to go back and fix stuff. It’s much easier to do things right from the beginning. Again, I’m not saying that the mammography study is doing anything wrong, but building information organization into your methods from the get-go can surely help reduce the headaches down the road. And fewer headaches + cleaner data = better science, all the way around.