So DO You Want to be a Data Scientist?

28 Mar

Last week, a colleague whom I follow on Twitter retweeted a post from the NatureJobs blog titled “So you want to be a data scientist,” by Michael Koploy of SoftwareAdvice.com. The colleague who brought the piece to my attention, Kristi Holmes, PhD, is a bioinformaticist at Becker Medical Library at Washington University in St. Louis School of Medicine. She’s also an all-around good egg and one of my absolute favorite colleagues in the field, but that’s beside the point. I would have read the piece regardless of who tweeted it to my attention. However, because it came from Kristi, we then engaged in a mini tweetchat that we’ve had before, i.e. Where and what is the intersection between data scientists and librarians, if there even is one?

One of the interesting things about this discussion, to me, is that Kristi is a scientist who happens to work in a library, while I am a librarian, trying to work in the arena of scientists. And from our different perspectives, she is the one who is routinely much more optimistic about librarians getting into the area of data than I. There’s probably a thing or two you can decipher from this, but that’s for another time.

Another thing that happened after I retweeted and commented on the post was that I got an email from Brittany Richards at Software Advice, thanking me for the tweet and asking if I’d write a blog post about their article here on the Librarian Hats blog. Specifically, Brittany wrote, “You mentioned library science and I was interested to see your thoughts on how the two are related to each other.”

Now if you’ve read this blog for any time, you know my answer was an enthusiastic, “SURE!” So here goes – a recap of that article and some summarizing of conversations I’ve had with Kristi and other scientists on the topic:

I once saw/heard a librarian give a presentation where he identified himself as a data scientist. I called him on it. I am a librarian with a graduate degree in library & information science. I also have a graduate degree in an applied biological science (exercise physiology). Given that background, I feel pretty comfortable stating that while the two share the word, there is a world of difference between the science that librarians do and that which takes place in laboratories, clinics, the field, etc. As I’ve stated in this blog before, my background in exercise physiology is what I feel gives me the extra tools that I need to be effective as an informationist. That’s the science background that is recognized in the sciences.

I hope you don’t hear me dissing my library degree, education, or career. I’m not at all. They are just different and when I read articles like Koploy’s, as well as many books on data, specifically library and librarians’ roles in working with data, I cannot help but keep this thought in mind. It’s what comes to my mind. Every time.

In his post, Koploy recalls the description of a data scientist that he got from Bruno Aziza, a big name in Big Data. Aziza called a data scientist a “business analyst-plus.” He highlights mathematics, statistics, and business strategy as their core skills. Koploy himself adds, “While programming and statistical expertise is the foundation for any data scientist, a strong background in business and strategy can help jettison a younger scientist’s career to the next level.” Further, he notes that successful data scientists are drawn from the fields of biostatistics, econometrics, engineering, computer science, and the like. I’ve read the article several times. Library or information science is not on the list.

Again, this isn’t a slight against my field, but rather an observation that there are different skill sets required for different jobs and the job of a data scientist is not the job of a librarian. And vice versa.

So the question then becomes, how much does a librarian – or an informationist – need to learn to become a data scientist? I say, “A lot.” However, that “a lot” comes with the assumption that one isn’t entering data science from one of those previously mentioned fields. If this is the case, then of course, that individual is well prepared. You’ll note though, that even with the background, Koploy points out that data science is (1) fast-growing, (2) extremely competitive, and (3) new. Even the most seasoned statistician needs to learn some new skills and/or subjects to keep up.

The optimistic among us – those who believe the cross-over between information and data science is broad – focus upon those characteristics that are, in fact, mentioned by experts in the data science field as ones that separate the exceptional data scientist from the average: inquisitiveness, the ability to spot trends, and the tendency (skill) to ask the right questions. It’s the last of these where librarians, informationists, and information scientists all have experience and often excel. We know how to ask the right questions that get to the heart of information problems, e.g. How does the business work? How does it collect data? How will it use the data? (per Krishna Gopinathan, Global Analytics Holdings)

So, do you want to be a data scientist? If you’re a librarian or an informationist, then depending upon your background, you may have a little or a lot of work to do to get ready to take on the role. If you don’t have the background, I see two possibilities:

  • Get it (hit the books!)
  • Find the right partner(s) where your skills can be paired to produce a good data science team

We choose careers for a lot of different reasons, but I like to believe that in the best case scenario, we choose something that we’re both interested in and good at. Remember those aptitude tests you took in the guidance counselor’s office in high school? They were (and still are) meant to measure something. They measure what we like and what we have an aptitude for. They measure what career would fit us best. It means something to be a librarian. It also means something to be a scientist. I believe that it’s a sign of the times, and a bit of a challenging time at that, that careers and skills and tasks that once sat neatly within cubicles and labs and computer workstations are now all mixed up together. This melting pot of vocations is difficult to navigate. On the one hand, it opens a wealth of new opportunities. On the other, though, it means for everyone working with information and/or data, we will never enjoy sitting back and doing the same old same old for very long.

If you’re interested, I also encourage you to read the original piece that Michael Koploy wrote, along with some of the links he suggests for further reading. In particular, I really enjoyed Hilary Mason’s blog. Good stuff there. I also happened to notice, just this morning, that Coursera’s free Introduction to Data Science class that’s listed is starting up in the not too distant future. If it piques your interest, give it a go. You might well find that you have a hidden talent that will take you far in this new area.

Which brings me full circle to the question I began with, i.e. Is this new area in the library? Well, quite obviously there are individuals like Kristi, bioinformaticists and data scientists who find their home in libraries*. There are also librarians or informationists with training in data science who find their homes outside of the library. And then there are librarians. And then there are data scientists. In other words, there’s a big mix of us. If you’re comfortable in the mix and you’re up to the task of getting and/or honing new skills, you’ll likely do really well wherever you are.

The times they are a changin’, sings Mr. Dylan, and we look to change with them. At the same time, though, we need to be realistic. We need to see clearly what we know, what we do well, what we like, and more. We need changes in graduate education across the board to address these issues, and likewise those of us working need to accept that we’ll be learning for a lifetime. These are the times we live in. You can’t just call yourself something different. You need to do something different. Or do things differently. Likely all of the above.


Rockin’ out with my pals, The Special Agents, at Houghton Elementary School. Support art, music, and physical education in your public schools, people! You could get a band out of it.

Now I’m off to play drums with a friend’s band, dressed up like the Cat in the Hat. You’ve got to have a really big tool box o’ skills, friends. Really big!

* And then there’s the matter of money. If you have the chops to get a job as a data scientist, are you willing to work in a library for about half of what you could make in business or industry? It’s a question that comes up in our professional discussions often. If you want to have at it in the comments section to this post, go for it!

11 Responses to “So DO You Want to be a Data Scientist?”

  1. Andy March 29, 2013 at 10:32 am #

    “If you have the chops to get a job as a data scientist, are you willing to work for about half of what you could make in business or industry than you will in a library?”

    As someone who would love to go back to working in a library but am leery of taking the ~20% pay cut from my current non-library health sciences research gig that it would entail, I find this a very salient question.

    • salgore March 29, 2013 at 2:24 pm #

      Thanks, Andy. I think it’s a pretty important piece that we don’t discuss often – one that says a lot more about how librarians are viewed, understood, interpreted, etc. It’s a barrier to our profession, in general too, I believe, in that lower paying professions are deemed “less than.” There’s little truth to that assumption, of course, but it exists and makes it all the more difficult for librarians to break out of their stereotypical role(s). I also believe there’s an underlying (or perhaps even overt) sexism at play. Men work in computer and/or information science (historically). They don’t become librarians. Librarianship is still viewed as a female-dominated profession and thus not paid as well as other, similar professions.

  2. Regina Raboin April 6, 2013 at 10:02 pm #

    A little late in replying, but this statement, “it means for everyone working with information and/or data, we will never enjoy sitting back and doing the same old same old for very long.” really hits home for me. While librarians might not want, nor need to be data scientists, we do need to read, study, and seek out professional development opportunities for data science and use this knowledge to evolve our positions and challenge assumptions made about the profession.

    • salgore April 8, 2013 at 11:56 am #

      One thing that I’ve been thinking a lot about is while it’s acceptable (even exciting) to be a life-long learner and seek professional development/continuing ed opportunities in our work, why do we find that more and more we need to start doing this from the get-go? It seems to me that our degrees are the equivalent of automobiles, nowadays. They depreciate in value as soon as you walk off the campus.

  3. top laptops April 26, 2013 at 1:56 pm #

    My brother suggested I might like this blog. He was totally right.
    This post actually made my day. You cann’t imagine simply how much time I had spent for this information! Thanks!

    • salgore April 26, 2013 at 9:29 pm #

      Thanks so much! I’m glad you enjoyed the post. I hope you’ll stay tuned for more.

  4. Makson de Jesus Reis March 14, 2017 at 2:10 pm #

    Hello, dear colleague. I am Brazilian and a librarian. I loved the news that data science is important to information science. Which librarians in your country study this relationship?
    Warm regards

