CodeRx Compendium update Q2 2025
New data marts: drug pricing, indications, and OTC label images
Another quarter, another refresh of the Compendium.
In case you don’t know, the Compendium is what we call our curated collection of finished data marts generated using SageRx. Essentially, they’re a dozen or so CSV files in a Google Drive folder that we update for free every quarter (plus, we archive and share the historical versions).
Click the button below for free access to our data marts. Quarterly updates are provided for free - for more frequent (weekly) updates, please contact us.
What’s new this quarter?
Drug pricing
We’ve always included NADAC as a data source, but have never published the transformed data as a data mart. We have contributed several enhancements to the raw NADAC data published weekly on the CMS website.
Continue reading below to learn about some of those enhancements.
Check out the NADAC data marts
De-duplicating rows while preserving the most useful data points
Every week that a drug is included in the current NADAC release, another row is added to the data with a new “as of date”, while everything else usually remains the same - even the price in many cases. For people who only care about price changes over time, this creates a massive number of duplicate rows that need to be cleaned up before working with the data.
We create an easy-to-use, lightweight history of price changes for every NDC in NADAC. We also join in RxNorm RXCUIs and normalize related drug names of NDCs to make it possible to aggregate NADAC data by drug product or ingredient using an open, standardized drug terminology.
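To make the idea concrete, here is a minimal sketch of collapsing weekly rows into a price change history. It is not the SageRx implementation; the row layout (NDC, as of date, price), the placeholder NDC labels, and the function name `price_change_history` are all assumptions for illustration, and the prices are made up.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

# Hypothetical weekly rows: (ndc, as_of_date, price_per_unit). Values are made up.
weekly_rows = [
    ("ndc_A", date(2025, 1, 1), 0.10),
    ("ndc_A", date(2025, 1, 8), 0.10),
    ("ndc_A", date(2025, 1, 15), 0.12),
    ("ndc_A", date(2025, 1, 22), 0.12),
    ("ndc_B", date(2025, 1, 1), 2.50),
    ("ndc_B", date(2025, 1, 8), 2.50),
]


def price_change_history(rows):
    """Keep the first row per NDC plus each row whose price differs from the prior week."""
    history = []
    # Sort by NDC, then by as of date, so each NDC's rows are contiguous and chronological.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for ndc, group in groupby(ordered, key=itemgetter(0)):
        previous_price = None
        for _, as_of, price in group:
            if price != previous_price:
                history.append((ndc, as_of, price))
                previous_price = price
    return history


history = price_change_history(weekly_rows)
for row in history:
    print(row)
```

In this toy input, six weekly rows collapse to three: one starting price per NDC plus the single price change for `ndc_A`.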
Preserving recent prices for drugs that are not in the current week’s release
While rows whose “as of date” comes from the most recent release are the most relevant for reimbursement activities, prices from a week or even a month ago might still be relevant for other uses.
We provide a view of the data that cleans this up and makes it easy to limit prices to those from the most recent release, from within the past 30 days, from within the past 90 days, etc.
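A rough sketch of how recency flags like these could be derived is shown below. The input shape (one latest row per NDC), the flag names, and the function `recency_flags` are hypothetical and not taken from the actual data mart; the dates and prices are made up.

```python
from datetime import date

# Hypothetical latest row per NDC: ndc -> (most recent as_of_date, price). Made-up values.
latest_by_ndc = {
    "ndc_A": (date(2025, 6, 18), 0.12),
    "ndc_B": (date(2025, 6, 4), 2.50),
    "ndc_C": (date(2025, 3, 26), 7.00),
    "ndc_D": (date(2024, 12, 18), 1.25),
}


def recency_flags(latest, windows=(30, 90)):
    """Flag each NDC by how far its last price lags behind the newest release."""
    # Assumption: the newest as of date in the data marks the current release.
    current_release = max(as_of for as_of, _ in latest.values())
    flagged = {}
    for ndc, (as_of, price) in latest.items():
        age_days = (current_release - as_of).days
        flagged[ndc] = {
            "price": price,
            "in_current_release": age_days == 0,
            **{f"within_{w}_days": age_days <= w for w in windows},
        }
    return flagged


flags = recency_flags(latest_by_ndc)
for ndc, info in flags.items():
    print(ndc, info)
```

Measuring age against the newest release in the data, rather than against today's date, keeps the flags stable no matter when the file is opened. That is a design choice in this sketch, not necessarily how the published view works.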
Handling what NADAC curators call “overwrites”
Overwrites are when the “as of date” moves forward, but the “effective date” goes backward. In practice, the overwritten rows / prices will never be used for reimbursement, yet they remain in the data as artifacts you need to deal with.
We flag these rows and remove them to preserve a logical progression of prices over time.
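One way to picture the flagging step is the sketch below. It rests on an assumption about which rows count as overwritten: a row is flagged when a later release (later “as of date”) for the same NDC carries a strictly earlier “effective date”. The row layout, the function name `flag_overwritten`, and the values are hypothetical, and this is not the SageRx logic itself.

```python
from datetime import date

# Hypothetical rows for one NDC: (as_of_date, effective_date, price). Made-up values.
rows = [
    (date(2025, 1, 1), date(2024, 12, 15), 1.00),
    (date(2025, 1, 8), date(2025, 1, 5), 1.10),
    (date(2025, 1, 15), date(2024, 12, 20), 1.05),
]


def flag_overwritten(rows):
    """Return (row, is_overwritten) pairs in as_of_date order.

    Assumption: a row is overwritten when any later release has a strictly
    earlier effective date.
    """
    flagged = []
    earliest_later_effective = None
    # Walk from newest to oldest release, tracking the earliest effective date seen so far.
    for row in sorted(rows, key=lambda r: r[0], reverse=True):
        _, effective, _ = row
        overwritten = (
            earliest_later_effective is not None
            and effective > earliest_later_effective
        )
        flagged.append((row, overwritten))
        if earliest_later_effective is None or effective < earliest_later_effective:
            earliest_later_effective = effective
    flagged.reverse()
    return flagged


flagged = flag_overwritten(rows)
clean = [row for row, overwritten in flagged if not overwritten]
for row in clean:
    print(row)
```

Here the middle row is flagged because the following release steps its effective date back to before that row took effect. The remaining two rows then progress forward in both dates.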
Drug indications
RxClass is an incredible resource for drug classifications and hierarchies. However, it is only accessible via an API or a web-based graphical user interface (GUI). Or I guess RxNav-in-a-box, which is just a version of the web GUI run locally.
Some of the most useful relationships in RxClass are those that provide some insight into which drugs may treat (or prevent) which conditions. These come from a variety of sources and map to a variety of standard disease ontologies - effectively mapping standardized medication terminology to standardized disease terminology with relationships like may_treat, may_prevent, and ci_with (contraindicated with).
We make this useful drug indication mapping easy to use by providing it in a flat file format. We map from products to diseases via ingredient-level mappings provided by RxClass. We also make it easy to map from MeSH terminology for diseases to other standardized disease terminology, thereby extending the utility provided by RxClass.
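The product-to-disease mapping via ingredients is essentially a two-step join. Below is a minimal sketch using placeholder identifiers; a real file would carry RXCUIs and disease codes instead. The data structures and the function `flatten_indications` are assumptions for illustration, while the relationship names come from RxClass as described above.

```python
# Hypothetical identifiers throughout; these are placeholders, not real RXCUIs or MeSH IDs.
product_to_ingredients = {
    "product_1": ["ingredient_x"],
    "product_2": ["ingredient_x", "ingredient_y"],
}

# Ingredient-level relationships: (ingredient, relationship, disease).
ingredient_relations = [
    ("ingredient_x", "may_treat", "disease_a"),
    ("ingredient_y", "may_treat", "disease_b"),
    ("ingredient_y", "ci_with", "disease_c"),
]


def flatten_indications(products, relations, rel_types=("may_treat", "may_prevent")):
    """Join products to diseases through their ingredients, one flat row per match."""
    rows = set()
    for product, ingredients in products.items():
        for ingredient, rel, disease in relations:
            if ingredient in ingredients and rel in rel_types:
                rows.add((product, ingredient, rel, disease))
    return sorted(rows)


flat_rows = flatten_indications(product_to_ingredients, ingredient_relations)
for row in flat_rows:
    print(row)
```

Contraindications (`ci_with`) are left out by default in this sketch so the output reads as indications only; passing a different `rel_types` tuple would include them.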
Check out the drug indications data mart
OTC label images
In previous releases, we have used our DailyMed transformation process to extract a mapping of prescription NDCs to label images, and this release adds the equivalent mapping of OTC NDCs to label images. Coverage doesn’t seem to be as good for OTCs, and there are likely improvements that could be made to the code to handle OTC labels better - but this is just an effort to share what we have for now.
Check out the label image data mart
Why do we do this?
We do this for several reasons.
Building in public is the best way to build
We want you to be able to actually put eyes / hands on the data we are working hard to transform into something greater than the sum of its parts. And we don’t expect you to run SageRx yourselves to do so… anymore… We think the more people exploring the data, the better. If we spent months developing a data mart and nobody ever looked at it, does it even matter?
The dream of the 80s is alive in healthcare
Why do we release these data marts as CSVs?
Flat files are gonna be around as long as faxing and paging are critical to parts of healthcare. They’re not inherently bad - they’re just not very flashy. However, they’re extremely versatile. Flat files (comma-separated value files, or CSVs) have the dual benefit of being easy for non-technical people to explore, and also trivial for technical people to ingest into data lakes or other data tools.
There are certainly more… modern ways for us to share this data with people. We have considered a cloud database or shared data workspace or even offering an API. We still may do some of these things in the near future. But for the time being, flat files are an easy common denominator that works for us and for most people we’ve asked.
No money, no margin
We put a lot of effort into developing these data marts, and it takes some effort to put together these quarterly releases. We think there is legitimate value for people to use these data marts for free if they are developing something new and need data to get started. Or perhaps for people doing historical research that doesn’t require frequent updates.
In full transparency, this also serves as a smooth on-ramp for someone who values the data but needs more frequent (weekly) updates to inquire about our services. For clients, we deliver these updates to an AWS S3 bucket on a weekly basis, provide data quality checks and ongoing support, and work closely with them to prioritize developing additional data marts that align with our roadmap.
If any of that sounds interesting to you, please reach out and see how we can work together. Supporting CodeRx helps us develop more data marts that can help others develop innovative and useful things.
Contact us
We hope you find these free quarterly data marts useful, as well as the open-source development we have done to produce them. If you have any questions, are interested in more frequent (weekly) updates and support, or are looking for consulting help, please contact us.