With more and more compute and storage thrown at Large Language Models (LLMs) and Generative Artificial Intelligence (GenAI) in general, the capacity to “memorise” information grows with each generation of LLM. This growth is further accelerated by the ability to add specific details to generalist LLMs through Retrieval-Augmented Generation (RAG) and agents (e.g., with the ability to query real-world systems at the interface with the physical world).
LLMs are learning more and more, but what about unlearning? Dai and colleagues do not invoke the analogy with human memory, yet it is tempting: our capacity to learn more relies, in part, on our capacity to forget, to reorganise, to summarise, and to prioritise what we have learnt. Sleep and stress play a role in this reorganisation of information; it was the overarching topic of my Ph.D. thesis [link]. I will de-prioritise the visual cues along the path to a bakery if I no longer go to this bakery (“unlearning”). However, practising navigation to that bakery improved my navigation skills, and this improvement will serve me later when I need to find my way to another place (something I could call “secondary learning”). It may seem we are diverging from AI, but Dai and colleagues actually open their paper with the EU GDPR right of a patient to have their data removed from a database, asking how this is technically possible with LLMs, where data is not structured as in a traditional relational database and where the way data is retrieved is often unknown.
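To make the contrast concrete, here is a minimal sketch (entirely illustrative; the table, columns, and names are mine, not from Dai and colleagues): erasing a patient’s record from a structured database is one explicit, auditable operation, whereas an LLM offers no equivalent “delete this record” primitive.

```python
import sqlite3

# Illustrative only: a toy “patients” table standing in for a traditional,
# structured clinical database (names and columns are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, diagnosis TEXT)")
conn.execute("INSERT INTO patients VALUES (1, 'Alice', 'asthma'), (2, 'Bob', 'diabetes')")

# GDPR-style erasure in a relational database: one explicit, auditable operation.
conn.execute("DELETE FROM patients WHERE id = ?", (1,))
conn.commit()
print(conn.execute("SELECT * FROM patients").fetchall())  # [(2, 'Bob', 'diabetes')]

# For an LLM, no analogous primitive exists: whatever the model “memorised”
# about Alice is diffused across billions of weights, which is precisely why
# machine unlearning is an open research question.
```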
The “unlearning” process in LLMs can be considered at three nested levels: algorithmic, legal, and ethical.
ISPOR25, the annual North American conference of the International Society for Pharmacoeconomics and Outcomes Research, is in three weeks. As usual, I’m preparing for it by browsing the program, and this time I decided to share a few of my interests on my blog. ISPOR covers many topics, from “hardcore” statistical methods to high-level overviews of broader issues, so I will focus on only a few. Feel free to connect with me if you want to discuss anything at or around the conference (or virtually). (And before we start, full disclaimer: I currently work for Parexel, but the opinions shared here are mine alone; otherwise, I would have written them on the company blog.)
As usual, browsing the ISPOR conference program brings up pages of potentially interesting topics.
This first post covers my primary interest: HEOR modelling, the input data we use, the impact of broader frameworks and regulations, and how the results are used. Stay tuned for the next posts: they will cover higher-level regulations and pricing, and one will be specific to AI.
One session seemingly introduces a benchmark to assess Large Language Model (LLM) performance at extracting information from models and literature reviews. I wrote “seemingly” because, while the intent is great, the abstract is not clear about how this system would assess future models (not only the ones currently contained in the LLM database). I am also a bit doubtful about using the number of tokens as a measure of quality. Hopefully, the presentation will clarify these points.
Two posters/presentations will show tools to improve efficiency: a VBA/R automator (I wonder about the sustainability of maintaining this type of program) and the use of metamodels (see the sketch below for the general idea).
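For readers less familiar with metamodels, here is a minimal, hypothetical sketch of the general idea (not the authors’ actual tool; the toy model and parameter ranges are invented): run a slow health-economic model a limited number of times, fit a cheap statistical surrogate to those runs, then evaluate the surrogate thousands of times, e.g., for probabilistic sensitivity analysis.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def expensive_model(params):
    """Stand-in for a slow health-economic model; returns an incremental net benefit.
    The formula and parameter ranges are invented for illustration."""
    cost, effect = params
    return 30_000 * effect - cost + 500 * np.sin(cost / 2_000)

# 1) Run the full model on a small design of input samples (50 runs).
X_train = rng.uniform(low=[1_000, 0.1], high=[20_000, 1.0], size=(50, 2))
y_train = np.array([expensive_model(x) for x in X_train])

# 2) Fit a cheap surrogate, the “metamodel”, to those runs.
metamodel = GaussianProcessRegressor(normalize_y=True).fit(X_train, y_train)

# 3) Reuse the surrogate for many more evaluations, e.g., a probabilistic sensitivity analysis.
X_psa = rng.uniform(low=[1_000, 0.1], high=[20_000, 1.0], size=(10_000, 2))
inb = metamodel.predict(X_psa)
print(f"Probability of being cost-effective: {(inb > 0).mean():.1%}")
```

The choice of surrogate (Gaussian process, linear regression, something else) matters less than the pattern itself: trade a little approximation error for orders of magnitude in speed, which is where the efficiency gain of metamodelling comes from.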
The metamodels poster is interesting because we will also present a poster, about visual programming, and try to convince the audience that this way of programming has many benefits for specific uses of modelling: brainstorming, early modelling, strategy, and introducing newcomers to complex modelling topics. In a real geek way, it is interesting to note that most of these sessions still rely on MS Excel and only venture, from time to time, into R. Our poster introduces TypeScript, but that is more a side effect of the framework we used. Our solution can, in fact, be extended to any programming language, including Python, a language used a lot in data science but, outside LLM applications, not much in HEOR.
I didn’t mention the sessions on causal inference, survival analysis, surrogate endpoints, … They are all worth attending. In your opinion, which session(s) did I miss in this brief overview?