Tag Archives: big data

Practice, Practice, Practice … or maybe not

7 Jun
Retreating is Hard Work!

Sorry to miss last week’s post. Just as I was getting back in my Friday writing groove, we had a joint department retreat here in my library. GREAT stuff happened there, but it kept me from writing. Not from thinking, though. Oooh… have I been thinking. 

I’m in the midst of reading David Epstein’s latest book, Range. I read the NY Times Book Review about it and immediately ordered a copy. It’s a fascinating read and so very relevant to the work that we do as librarians and/or information professionals. I’m only about 1/3 of the way through, so I can’t give a complete overview of the theories discussed, but so far, I’m pretty well convinced that the argument he is making rings true.

In brief, Epstein suggests that in a world that’s becoming more and more specialized, it’s the generalists who will thrive. He builds on the work of psychologists Gary Klein and Daniel Kahneman, who have studied human decision making and, in particular, how the contexts in which decisions are made affect gaining expertise in an area. Robin Hogarth, another psychologist, goes further, identifying these learning environments as either “wicked” or “kind.” In a kind learning environment, “patterns repeat over and over, and feedback is extremely accurate and usually rapid.” (p. 21) Think swinging a golf club or learning Chopin’s Prelude in E minor on the piano. Over and over and over you’ll need to practice, in order to be any good at it.

Compare this to what’s required to become really good at addressing an Ebola outbreak or finding solutions to the climate crisis or figuring out if we can really do long-term space flight. Compare it to becoming really good at helping people not only find credible facts in a sea of opinions, but also determine the difference between the two. Think about problems that have no easy answers – or any answer at all. This is much more the world we live in, and our habit of relying on past experience over and over again, much like practicing a golf swing, just might not be the best for us.

Perfect example… I subscribe to and regularly read the STAT Morning Rounds newsletter that appears in my inbox daily. A recent piece on the opioid crisis highlights the work of Dr. Stefan Kertesz, a primary care physician in Birmingham, AL. Dr. Kertesz has many patients who, over time, have developed a dependency on opioids thanks to the practice of overprescribing that’s been well-documented over the last few years. In reaction to this past behavior, the CDC proposed guidelines for prescribing in 2016. The issue, as often occurs, is that guidelines quickly become mandates. It’s simply easier for agencies or insurance companies or governing bodies to enforce a mandate rather than accept a guideline.

The problem with this, it seems, is that setting hard and fast limits on prescribing is applying a “kind” solution to a very “wicked” problem. It strips the patient and all of his/her variables and uncertainties from the equation. As Dr. Kertesz states, while tapering patients off of opioids is certainly to be encouraged, these are choices that physicians need to make in conjunction with their patients. “Backing mandatory limits, he said, assumes that what’s going to happen at the systems level will effect the best clinician.” 

To bring this all back to my work as a librarian, a terrific piece in The Chronicle of Higher Education* last week spoke to the effect that Google (and other search engines, not to mention the structure of the Internet, in general) has had on our presumptions about knowledge. We’ve become so accustomed to the practice of having a question, typing it into a search engine, and receiving back a lengthy list of results, that we’ve been lulled into believing that a list of results equates to an answer to our question. As the author so succinctly states it, “Search engines have created the illusion that vastly more information exists than ever before and that this information lies just a keystroke away. Today people ‘search’ rather than ‘study.’” Spot on, I say.

*My apologies if you cannot get to the article in the Chronicle. It requires a subscription.

I’m often leery of the swooning love affair I perceive when it comes to research, science, decision-making … you name it … around the role of both big data and artificial intelligence (not as a single thing, but the separate “Ooooooh… it’s the solution to everything” mindsets attached to each). Will they bring us significant breakthroughs in complicated problems? No doubt. Will they solve everything? I’m not so sure. I find a bit more truth here:

The progress of AI in the closed and orderly world of chess, with instant feedback and bottomless data, has been exponential. In the rule-bound but messier world of driving, AI has made tremendous progress, but challenges remain. In a truly open-world problem devoid of rigid rules and reams of perfect historical data, AI has been disastrous. IBM’s Watson destroyed at Jeopardy! and was subsequently pitched as a revolution in cancer care, where it flopped so spectacularly that several AI experts told me they worried its reputation would taint AI research in health-related fields. As one oncologist put it, ‘The difference between winning at Jeopardy! and curing all cancer is that we know the answer to Jeopardy! questions.’ With cancer, we’re still working on posing the right questions in the first place. (Range, David Epstein, p. 29)

But then, of course, Watson never played Emma the librarian in Jeopardy!, either. 🙂

Happy Friday!

And Then This…

20 Apr

[Image: quote from Eleanor Roosevelt’s You Learn by Living]

After writing my last post, Iterations on a Profession, a couple of weeks ago, I was prompted to pick up my copy of Eleanor Roosevelt’s book, You Learn by Living, and re-read the first chapter entitled, “Learning to Learn.” It’s a favorite, filled with great words of wisdom and reminders that life isn’t much of anything, once we stop learning. There are many quotable passages, but I chose the above to share here. As you read it, remember this … it was written in 1960. Fifty years ago, “our world was startlingly new.” And surely 50 years before that and 50 years before that and on and on for as long as humans have been riding on the planet. I get hung up too often on how different everything is today, how much change I’ve experienced in my lifetime and in my profession – all in the same timeframe since Mrs. Roosevelt wrote these words. Adapting to change is nothing new.

The other line from that chapter that lifted my spirits: “If you are interested, you never have to look for new interests. They come to you.” Yes. Thank you for reminding me that it’s a gift to be interested, Mrs. Roosevelt.

Now then, while I was brooding over things, my bookmark folder got filled again. Time to share some with you.

Conference time is upon us and that means many are busy making posters to show off their projects, work, ideas, etc. Better Posters is a great resource to help you make something better than the awful, all-too-common academic poster. Blog posts are added frequently and humor is never in short supply.

Remember Bridget Jones, the character portrayed wonderfully by Renée Zellweger in the movies of the same name? Well, you may or may not know that her character grew out of a regular column authored by Helen Fielding for the British newspaper, the Independent, in the 1990s. The Independent recently ceased publication of its print paper, becoming a digital-only media outlet. Fielding was interviewed on NPR’s Morning Edition late last month. She speaks of many things, but one of particular interest to my readers might be her thoughts on what’s lost in the shift from print to digital. Anything? You’ll have to listen to find out.

Several items related to data (because, you know, it’s a lot of what I do):

  • A real-life demonstration of how the use of big data can have dramatic effects on the child welfare system – Can Big Data Save These Children, from PBS NewsHour.
  • The National Center for Health Statistics has a nice collection of data visualizations that I’d never come across before. Bookmarked. 
  • Gravy Anecdote is the blog/website of Andy Cotgreave, a Technical Evangelist for Tableau. I watched a very informative webinar that he did entitled, How Data Storytelling Can Enhance the Way You Communicate, one in the series produced by BrightTALK. I’ve watched several of their webinars and found many to be quite good. Note, there’s an audio glitch a few minutes into Andy’s talk. Just wait through it. It doesn’t last long. (Live and learn.) From this talk, I discovered Periscopic, a data visualization studio on the West Coast (USA) doing some amazing work. You can browse through some of their portfolio. I also found Ben Jones’ blog, DataRemixed. Ben also works for Tableau. It’s going to take me a while to get through all of the things here.
  • Why all of the Tableau focus? One reason is that last week I was trying to teach myself how to use it to create a social network visualization. I found some help from the blog post, In Chaos, Clarity: Social Network Diagrams in Tableau. I remain irked that NodeXL is a Windows-only add-on for Excel (Mac user that I am), but there are workarounds. Believe me.

Just a few more and these are mostly for fun.

Kurt Vonnegut diagrams The Shape of Stories in this YouTube video. I love one viewer’s comment, “It’s like a cross between a college lecture and a stand-up comedy routine!” It’s pretty funny AND informative. 

If you need a story about how to turn a bad situation into something good, read My Wife Left Me with Nothing but a Dog, So I Started this Fun Photo Series. Amazing! I love that dog!

Photos of the archives of the Smithsonian Natural History Museum made the social media rounds a few weeks back. In case you missed it, you can catch up here. I would love a tour some day.

And finally, the one thing that has preoccupied me for more hours than I dare say over the past month… the DC Eagle Cam. Mr. President and the First Lady had a pair of eaglets in March and I have been FASCINATED watching them grow. And I’m not alone. They’ve gotten press on both National Public Radio and the Washington Post. They reside in the Azalea Collection at the U.S. National Arboretum in Washington, DC. Watch online. You can’t get close to them in person.

And with that … Happy Spring!!

Is Big Data Missing the Big Picture?

27 Apr

When I was defending my graduate thesis a number of years ago, I was asked by one of the faculty in attendance to explain why I had done “x” rather than “y” with my data. I stumbled for a bit until I finally said, somewhat out of frustration at not knowing the right answer, “Because that’s not what I said I’d do.” My statistics professor was also in attendance and, as I quickly tried to backtrack from my response, piped in, “That’s the right answer.”

As I’ve watched and listened to and read and been a part of so many discussions about data – data sharing, data citation, data management – over the past several years, I often find myself thinking back on that defense and my answer. More, I’ve thought of my professor’s comment: that data is collected, managed, and analyzed according to certain rules that a researcher or graduate student or any data collector decides from the outset. That’s best practice, anyway. And such an understanding always makes me wonder if in our exuberance to claim the importance, the need, the mandates, and the “sky’s the limit” views over data sharing, we don’t forget that.

I really enjoyed the panel that the Medical Library Association put together last week for their webinar, “The Diversity of Data Management: Practical Approaches for Health Sciences Librarianship.” The panelists included two data librarians and one research data specialist: Lisa Federer of the National Institutes of Health Library, Kevin Read from New York University’s Health Sciences Library, and Jacqueline Wirz of Oregon Health & Science University, respectively. As a disclosure, I know Lisa, Kevin, and Jackie each personally and consider them great colleagues, so I guess I could be a little biased in my opinion, but putting that aside, I do feel that they each have a wealth of experience and knowledge in the topic and it showed in their presentations and dialogue.

Listening to the kind of work and the projects that these data-centric professionals shared, it’s easy and exciting to see the many opportunities that exist for libraries, librarians, and others with an interest in data science. At the same time, I admit that I wince when I sense our “We can do this! Librarians can do anything!” enthusiasm bubble up – as occasionally occurs when we gather together and talk about this topic – because I don’t think it’s true. I do believe that individually, librarians can move into an almost limitless career field, given our basic skills in information collection, retrieval, management, preservation, etc. We are well-positioned in an information age. That said, though, I also believe that (1) there IS a difference between information and data and (2) the skills librarians have as a foundation in terms of information science don’t, in and of themselves, translate directly to the age of big data. (I’m not a fan of that descriptor, by the way. I tend to think it was created and is perpetuated by the tech industry and the media, both wishing we’d believe things are simpler than they ever are.) Some librarians, with a desire and propensity towards the opportunities in data science, will find their way there. They’ll seek out the extra skills needed and they’ll identify new places and new roles that they can take on. I feel like I’ve done this myself and I know a good handful of others who’ve done the same. But can we sell it as the next big thing that academic and research libraries need to do? Years later, I still find myself a little skeptical.

Moving beyond the individual, though, I wonder if libraries and other entities within information science, as a whole, don’t have a word of caution to share in the midst of our calls for openness of data. It’s certainly the belief of our profession(s) that access to information is vital for the health of a society on every level. However, in many ways it seems that in our discussions of data, we’ve simply expanded our dedication towards the principle of openness to information to include data, as well. Have we really thought through all that we’re saying when we wave that banner? Can we have a more tempered response and/or approach to the big data bandwagon?

Arguably, there are MANY valid reasons for supporting access in this area: peer review, expanded and more efficient science, reproducibility, transparency, etc. Good things, all. But going back to that lesson that I learned in grad school, it’s important to remember that data is collected, managed, and analyzed in certain ways for a reason; things decided by the original researcher. In other words, data has context. Just like information. And like information, I wonder (and have concern for) what happens to data when it’s taken out of its original context. And I wonder if my profession could perhaps advocate this position, too, along with those of openness and sharing, if for nothing more than to raise the collective awareness and consciousness of everyone in this new world. To curb the exuberance just a tad.

I recently started getting my local paper delivered to my home. The real thing. The newsprint newspaper. The one that you spread out on the kitchen table and peruse, page by page. You know what I’ve realized in taking up this long-lost activity again? When you look at a front page with articles about an earthquake in Nepal, nearby horses attacked by a bear, the hiring practices of a local town’s police force, and gay marriage, you’re forced to think of the world in its bigger context. At the very least, you’re made aware of the fact that there’s a bigger picture to see.

When I think of how information is so bifurcated today, I can’t help but ask if there’s a lesson there that can be applied to data before we jump overboard into the “put it all out there” sea. We take research articles out of the context of journals. We take scientific findings out of the context of science. We take individual experiences out of the context of the very experience in which they occur. And of course, the most obvious, we take any and every politician’s words out of context in order to support whatever position we either want or don’t want him/her to support. I don’t know about you, but each and every one of these examples appears as a pretty clear reason to at least think about what can and will happen (already happens) to data if and when it suffers the same fate.

Are there reasons why librarians and information specialists are concerned with big data? Absolutely! I just hope that our concern also takes in the big picture.