Tag Archives: data science

Hello Muddah, Hello Faddah…

16 Aug

… here I am at, Camp …  well … at Townshend State Park in Townshend, Vermont. Last week’s vacation spot. It was a wonderful week of camping, hiking, reading, drawing, cooking, and more. Just what a summer vacation is supposed to be. The only downside is that it was all of one week. Too short. Ah, well…

I read three good books while camping:

The latter two are books that appeared in my Little Free Library this summer and I decided after reading them last week that I’d add a “review” feature to my library. We’ll see how – or if – it takes off.

Three work-related books that were recommended and/or loaned to me lately include:

Not quite the page-turners that my vacation books were, but worthwhile reading all the same. The first two give very practical advice, examples, and exercises to help one hone his/her data science and math skills, and Few’s book is like all of his others, i.e., chock-full of information and advice for effective data visualization.

And finally, a few interesting websites to peruse and enjoy:

A Snapshot of a 21st-Century Librarian (Adrienne Green, The Atlantic) is a terrific profile piece on Theresa Quill, a research librarian at the Herman B. Wells Library at Indiana University, Bloomington. If you, like me, struggle to explain your not-so-stereotypical librarian job to friends and family, point them to this article as a good example of how we’re pushing the boundaries and redefining our role(s).

Sawbones: A Marital Tour of Misguided Medicine is a hilarious – and informative – podcast that I recently stumbled upon. Dr. Sydnee McElroy provides the medical expertise and her husband, Justin, the banter. Actually, they both banter quite a bit, making it an enjoyable program. I see that last week’s topic was cupping. If you noticed those round bruises on Michael Phelps’s body during the Olympics, you might want to listen to learn how they got there (and whether the science behind the practice is real).

Speaking of the Olympics, Dynamic Dialects is just a downright awesome site to explore how people around the world pronounce the same set of words. It’s great fun!

If you bookmark sites for free-to-use images, you’ll want to add the USDA’s Pomological Watercolor Collection to your list. One “Fast Fact” from the site – it contains 7,584 watercolor paintings, lithographs, and line drawings of fruits and nuts, and almost 4,000 of those are apples. Imagine! It’s a beautiful resource.

The Open Notebook gives visitors a wealth of insight and knowledge about science writing, and also provides tools to help one become a better science writer. Interviews, Elements of the Craft, Profiles, and Science Blogging are some of its features. 

 Finally, someone once asked how I discover all of these sharable finds. Better put, I think she asked, “How do you find the time to discover them?” The answer is that I read a lot (stories from Twitter; magazines like The Atlantic, The Economist, and The New Yorker; a number of interesting blogs), I listen to the news via public radio and podcasts of interest, and I subscribe to several email newsletters including The Scout Report from Internet Scout at the University of Wisconsin-Madison, Austin Kleon’s weekly post, Banana Data News, and Wait But Why. I like that with the exception of the last one (which arrives maybe once a month), these appear in my email on Friday mornings. They’re not overwhelming in length and never cease to offer up something that I find interesting and useful – kind of like how I hope you find my blog. 

Is Big Data Missing the Big Picture?

27 Apr

When I was defending my graduate thesis a number of years ago, I was asked by one of the faculty in attendance to explain why I had done “x” rather than “y” with my data. I stumbled for a bit until I finally said, somewhat out of frustration at not knowing the right answer, “Because that’s not what I said I’d do.” My statistics professor was also in attendance and, as I quickly tried to backtrack from my response, piped in, “That’s the right answer.”

As I’ve watched and listened to and read and been a part of so many discussions about data – data sharing, data citation, data management – over the past several years, I often find myself thinking back on that defense and my answer. More, I’ve thought of my professor’s comment: that data is collected, managed, and analyzed according to certain rules that a researcher or graduate student or any data collector decides on from the outset. That’s best practice, anyway. And such an understanding always makes me wonder if, in our exuberance to claim the importance, the need, the mandates, and the “sky’s the limit” views of data sharing, we don’t forget that.

I really enjoyed the panel that the Medical Library Association put together last week for their webinar, “The Diversity of Data Management: Practical Approaches for Health Sciences Librarianship.” The panelists included two data librarians and one research data specialist: Lisa Federer of the National Institutes of Health Library, Kevin Read from New York University’s Health Sciences Library, and Jacqueline Wirz of Oregon Health & Science University, respectively. As a disclosure, I know Lisa, Kevin, and Jackie personally and consider them great colleagues, so I guess I could be a little biased in my opinion, but putting that aside, I do feel that they each have a wealth of experience and knowledge in the topic and it showed in their presentations and dialogue.

Listening to the kind of work and the projects that these data-centric professionals shared, it’s easy and exciting to see the many opportunities that exist for libraries, librarians, and others with an interest in data science. At the same time, I admit that I wince when I sense our “We can do this! Librarians can do anything!” enthusiasm bubble up – as occasionally occurs when we gather together and talk about this topic – because I don’t think it’s true. I do believe that individually, librarians can move into an almost limitless career field, given our basic skills in information collection, retrieval, management, preservation, etc. We are well-positioned in an information age. That said, though, I also believe that (1) there IS a difference between information and data and (2) the skills librarians have as a foundation in terms of information science don’t, in and of themselves, translate directly to the age of big data. (I’m not a fan of that descriptor, by the way. I tend to think it was created and is perpetuated by the tech industry and the media, both wishing we’d believe things are simpler than they ever are.) Some librarians, with a desire and propensity towards the opportunities in data science, will find their way there. They’ll seek out the extra skills needed and they’ll identify new places and new roles that they can take on. I feel like I’ve done this myself and I know a good handful of others who’ve done the same. But can we sell it as the next big thing that academic and research libraries need to do? Years later, I still find myself a little skeptical.

Moving beyond the individual, though, I wonder if libraries and other entities within information science, as a whole, don’t have a word of caution to share in the midst of our calls for openness of data. It’s certainly the belief of our profession(s) that access to information is vital for the health of a society on every level. However, in many ways it seems that in our discussions of data, we’ve simply expanded our dedication to the principle of openness to information to include data, as well. Have we really thought through all that we’re saying when we wave that banner? Can we have a more tempered response and/or approach to the big data bandwagon?

Arguably, there are MANY valid reasons for supporting access in this area: peer review, expanded and more efficient science, reproducibility, transparency, etc. Good things, all. But going back to that lesson that I learned in grad school, it’s important to remember that data is collected, managed, and analyzed in certain ways for a reason; these are things decided by the original researcher. In other words, data has context. Just like information. And like information, I wonder (and have concern for) what happens to data when it’s taken out of its original context. And I wonder if my profession could perhaps advocate this position, too, along with those of openness and sharing, if for nothing more than to raise the collective awareness and consciousness of everyone in this new world. To curb the exuberance just a tad.

I recently started getting my local paper delivered to my home. The real thing. The newsprint newspaper. The one that you spread out on the kitchen table and peruse, page by page. You know what I’ve realized in taking up this long-lost activity again? When you look at a front page with articles about an earthquake in Nepal, nearby horses attacked by a bear, the hiring practices of a local town’s police force, and gay marriage, you’re forced to think of the world in its bigger context. At the very least, you’re made aware of the fact that there’s a bigger picture to see.

When I think of how fragmented information is today, I can’t help but ask if there’s a lesson there that can be applied to data before we jump overboard into the “put it all out there” sea. We take research articles out of the context of journals. We take scientific findings out of the context of science. We take individual experiences out of the context of the very experience in which they occur. And of course, the most obvious, we take any and every politician’s words out of context in order to support whatever position we either want or don’t want him/her to support. I don’t know about you, but each and every one of these examples strikes me as a pretty clear reason to at least think about what can and will happen (already happens) to data if and when it suffers the same fate.

Are there reasons why librarians and information specialists are concerned with big data? Absolutely! I just hope that our concern also takes in the big picture.

 

Summer Picks

18 Jul

I’ve but a short post to share this week. Honestly, it’s just too hot to even think clearly enough to write, BUT not to read. With this in mind, I thought I’d share a few of the informationist-related books that I’m working through this summer. If you have others to contribute or thoughts to share about any of these, I hope you’ll do so in the comments section.

Beginning Database Design, Clare Churcher

It’s true that most librarians learn about database design in grad school and it’s surely a skill that we should have expertise in throughout our careers, but a good refresher text is never anything to sniff at. I picked up this one at the MIT bookstore when I was taking the Software Carpentry Bootcamp several weeks back. It’s a keeper for the bookshelf on my desk.

Visualize This, Nathan Yau

Data Points: Visualization that Matters, Nathan Yau

These two books by Nathan Yau, together, are providing me with both a skill set to retrieve data from the Web and a really good understanding of how to present data and/or information so that it makes the most sense to an audience. Yau writes clearly and with a tone that keeps you interested in a topic that, let’s face it, could easily slip into the dry and “put you to sleep” mode. As one with an appreciation for design, I also think that the books are treasures to look at. They’re a great starter set for what is my summer reading’s real focus, data visualization.

Visualizing Data: Exploring and Explaining Data with the Processing Environment, Ben Fry

This one is more technical and dense than Yau’s books. I had a half-price coupon for an O’Reilly Media ebook, so I picked it. It’s definitely good for reference and troubleshooting, though I know it’s not one that I’ll read cover-to-cover.

The Functional Art: An Introduction to Information Graphics and Visualization, Alberto Cairo

Cairo’s is another really beautiful book to both look at and read. Design is first and foremost. I’m finding Yau’s books more practical for my learning, but I love picking this one up and flipping through its pages every now and then, just because it’s so nice to peruse. But not to sell it short, it’s filled with a lot of good advice for communicating information in a clear and interesting manner. It fits well with the others on my shelf.

Beautiful Visualization: Looking at Data through the Eyes of Experts (Theory in Practice), edited by Julie Steele and Noah Iliinsky

As the title suggests, this is a phenomenal collection of works by many of the leading practitioners of data visualization working today. This is the perfect working informationist’s beach book, offering a bunch of short, quick, self-contained reads that together give you a really high bar to shoot for if you want to go into this field.

A Simple Introduction to Data Science, Lars Nielsen & Noreen Burlingame

Short and sweet (just 75 pages long), this is a staple on my Kindle. It explains data science in lay terms, yet from the scientist’s (not the librarian’s) point of view. It’s a nice reference to keep handy.

Pretty Good for a Girl: Women in Bluegrass (Music in American Life), Murphy Hicks Henry

And finally, lest you think I’ve completely rearranged all of my life’s priorities, I’m really (really) enjoying this compilation of women (most of them forgotten and/or overlooked) from the 1920s to the present who have held their own in the male-dominated world of bluegrass music. It’s stellar!

That’s a full beach bag of books for me (and you, if you want to seek some or all of them out) and summer is really only so long. In fact, how many days do I have ’til vacation?!?!

Happy reading and stay cool!