Campus News

UB developing data science librarian training program, thanks to NLM grant

UB researchers say the need to train data science librarians stems from the ever-increasing amounts of patient data being generated by electronic health records.

By ELLEN GOLDBAUM

Published November 9, 2018

UB researchers in several disciplines are collaborating to develop a new data science training curriculum for library and information science graduate students and practicing health science librarians.

The $25,000 grant from the National Library of Medicine (NLM) of the National Institutes of Health (NIH) has been awarded to researchers in the Department of Biomedical Informatics in the Jacobs School of Medicine and Biomedical Sciences and the Department of Library and Information Studies in the Graduate School of Education.

The new grant is a supplement to a $2.5 million grant the NLM awarded in 2017 to UB researchers led by Peter Elkin, professor and chair of the Department of Biomedical Informatics. That grant supports doctoral- and postdoctoral-level training for research careers in biomedical informatics and data science.

The new data science librarian program will focus on preparing people to work in academic, hospital, health-related and public libraries, as well as libraries focused on specific subjects.

Students and librarians who complete the program will achieve data science micro-credentials, which are skill sets that are more narrowly focused, more flexible and quicker to achieve than traditional degrees or certificate programs.

“Our goal is to develop micro-credentials that will provide library and information studies graduate students and practicing health science librarians with the knowledge, skills and attributes they need in order to successfully compete for data science positions,” says Diane G. Schwartz, research associate professor of biomedical informatics and co-investigator on the grant with Ying Sun, associate professor in the Department of Library and Information Studies.

“The need to train data science librarians stems from the ever-increasing amounts of patient data being generated by electronic health records, as well as by the internet and social media,” Schwartz adds.

The power of data science is just beginning to be appreciated, she notes. “Data science has the potential to facilitate the identification of new treatments for disease even before biomedical scientists and physicians know that a particular therapy is successful,” she explains.

The grant is focused on developing a training program that will provide practitioners with specific skills that will allow them to assist health care professionals and biomedical scientists in making sense of and leveraging the ever-growing deluge of data the biomedical sciences are now generating.

“For example, data science librarians collaborate with health care professionals to assess, manage, analyze and interpret data, developing data sets that can then be communicated to physicians, nurses and other health care providers who will apply these data sets to improve disease prevention, diagnosis and treatment,” Schwartz says.

Schwartz and Sun are creating the data science librarian curriculum around five skill sets:

  • Data analytics or analysis, a scientific, mathematical and statistical area in which data is “cleaned” to enable accurate evaluations or calculations.
  • Data management, in which librarians confront the issue of how best to manage the data deluge and focus on instilling in data users best practices for proper data handling and management.
  • Data archiving/curation, which focuses on alleviating technical issues researchers face, such as data loss, version issues, management of obsolete file formats in long-term projects, and provision of secure collaboration tools.
  • Data visualization, in which librarians create visual representations of data in order to more powerfully explore, examine and communicate the meaning of the data.
  • Terminology/ontology, in which librarians work to develop the skills that will enable them to partner with ontologists — people who specialize in the study of organizing and categorizing knowledge on a topic — in order to better find, organize, categorize, integrate and label relevant data.