The Truven Health Blog

The latest healthcare topics from a trusted, proven, and unbiased source.


What Data Will Be Available for Population Health Analytics?

Key Questions that Data Scientists Will Ask

By Truven Staff

This is the third in a series of three blogs that present key questions that must be answered before developing an analytic to support the business needs of Population Health Management (PHM) stakeholders or players, including health systems, practitioners, insurance companies, employers and government agencies.

The players agree they need cutting-edge analytics to make sense of their population. The simplest definition of Population Health Management (PHM) that seems to be accepted by all the players is: meeting the healthcare needs of a defined population of individuals, from the healthiest to the highest risk, with the right programs at the right time to ensure the best outcomes possible. On Tuesday I described the first question, who is the population that is to be managed; on Wednesday we turned to the “so what” question, what services can realistically be offered to facilitate the management of the population.

The third important question is: what data will be available on which to build the analytics?

Commonly utilized data sources for healthcare analytics include:

  • Information created for administrative purposes (administrative data)
  • Administrative data specifically created for reimbursement (claims data)
  • Information recorded to facilitate the process of delivering care (clinical data)
  • Self-reported information, such as survey data
  • Socio-economic data, either public or privately gathered
  • Device-generated data

Two aspects of this topic are important: what data will be available to build the analytic, and what data will the player have ongoing access to when applying the analytic? In the ideal world of analytics development, each method is built using a comprehensive and representative data sample. In other words, the data should offer a longitudinal view into a population’s healthcare experience using various data inputs, including administrative and EHR-sourced content in addition to socioeconomic details, and it should be inclusive of all types of individuals so that it is not biased toward certain demographics.

Answering questions about a population becomes more difficult when you don’t have all of the population’s information and need to infer certain aspects. Typically, the health systems or practitioners don’t have a comprehensive view of their patient population, but “they don’t know what they don’t know.” Insurers and employers, on the other hand, typically do not have access to the clinical richness that lives within the medical records. And while many parties are optimistic about the value of socio-economic data, obtaining that data and merging it with other data sources takes significant effort.
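
To make the merging challenge concrete, here is a minimal Python sketch. It assumes hypothetical member records keyed by ZIP code and a hypothetical ZIP-level income table; every field name and value is invented for illustration and does not reflect Truven Health data or methods.

```python
# Hypothetical member records from a claims system (illustrative values only).
members = [
    {"member_id": "A1", "zip": "48104"},
    {"member_id": "B2", "zip": "60601"},
    {"member_id": "C3", "zip": None},      # missing ZIP: cannot be linked
    {"member_id": "D4", "zip": "99999"},   # ZIP absent from the lookup table
]

# Hypothetical ZIP-level socio-economic lookup (illustrative values only).
median_income_by_zip = {"48104": 65000, "60601": 98000}


def attach_socioeconomic(members, lookup):
    """Left-join members to a ZIP-level lookup, keeping unmatched members."""
    merged = []
    for m in members:
        income = lookup.get(m["zip"])  # None when the ZIP is missing or unknown
        merged.append({**m, "median_income": income})
    return merged


def match_rate(merged):
    """Share of members for whom socio-economic data could be attached."""
    if not merged:
        return 0.0
    matched = sum(1 for m in merged if m["median_income"] is not None)
    return matched / len(merged)


merged = attach_socioeconomic(members, median_income_by_zip)
print(f"Match rate: {match_rate(merged):.0%}")  # prints "Match rate: 50%"
```

In this toy data only half of the members link to a socio-economic record, which is the kind of gap that has to be measured and handled before the merged data can feed an analytic.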

In summary, although on the surface it may appear that the same analytic solutions are desired by all the players, it’s highly unlikely that everyone can use precisely the same analytics due to different answers to three key questions: who is the population, what services can realistically be offered, and what data will be available. The job of Truven Health therefore becomes one of designing analytics that are specific to particular use cases, but with as much flexibility as possible to allow for applicability in various business and data situations. In later posts, I’ll discuss the various types of analytics that can be created once these three key questions are answered, along with some of the specific new analytics Truven Health is developing. 

Anne Fischer
Senior Director, Advanced Analytics