With more and more compute and storage thrown at Large Language Models (LLMs) and Generative Artificial Intelligence (GenAI) in general, the capacity to “memorise” information grows with each generation of LLM. This growth is further accelerated by the ability to add specific details to generalist LLMs through Retrieval-Augmented Generation (RAG) and agents (e.g., with the ability to query real-world systems at the interface with the physical world).
LLMs are learning more and more, but what about unlearning? Dai and colleagues do not invoke the analogy with human memory, yet it is tempting: our capacity to learn more relies, in part, on our capacity to forget, to reorganise, to summarise, and to prioritise what we have learnt. Sleep and stress play a role in this reorganisation of information; it was the overarching topic of my Ph.D. thesis [link]. I will de-prioritise the visual cues along the path to a bakery if I no longer go to this bakery (“unlearning”). However, practising navigation to that bakery improved my navigation skills, and this improvement will serve me later when I need to find my way to another place (something I could call “secondary learning”). It may seem we are diverging from AI, but Dai and colleagues actually open their paper with the EU GDPR right of a patient to have their data removed from a database, asking how this is technically possible with LLMs, where data is not structured as in a traditional relational database and where the way data is retrieved is often unknown.
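To make the contrast concrete, here is a minimal sketch (entirely illustrative; the table, columns, and names are mine, not from Dai and colleagues): erasing a patient’s record from a structured database is one explicit, auditable operation, whereas an LLM offers no equivalent “delete this record” primitive.

```python
import sqlite3

# Illustrative only: a toy “patients” table standing in for a traditional,
# structured clinical database (names and columns are hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT, diagnosis TEXT)")
conn.execute("INSERT INTO patients VALUES (1, 'Alice', 'asthma'), (2, 'Bob', 'diabetes')")

# GDPR-style erasure in a relational database: one explicit, auditable operation.
conn.execute("DELETE FROM patients WHERE id = ?", (1,))
conn.commit()
print(conn.execute("SELECT * FROM patients").fetchall())  # [(2, 'Bob', 'diabetes')]

# For an LLM, no analogous primitive exists: whatever the model “memorised”
# about Alice is diffused across billions of weights, which is precisely why
# machine unlearning is an open research question.
```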
The “unlearning” process in LLMs can be considered at three nested levels: algorithmic, legal, and ethical.
ISPOR25, the annual North American conference of the International Society for Pharmacoeconomics and Outcomes Research, is in three weeks. As usual, I’m preparing for it by browsing the program, and this time I decided to share a few of my interests on my blog. ISPOR covers many topics, from “hardcore” statistical methods to high-level overviews of broader issues, so I will focus on only a few. Feel free to connect with me if you want to discuss anything at or around the conference (or virtually). (And before we start, full disclaimer: I currently work for Parexel, but the opinions shared here are mine alone; otherwise, I would have written them on the company blog.)
As usual, browsing the ISPOR conference program brings up pages of potentially interesting topics.
This first post covers my primary interest: HEOR modelling, the input data we use, the impact of broader frameworks and regulations, and how the results are used. Stay tuned for the next posts: they will cover higher-level regulations and pricing, and one will be specific to AI.
One session seemingly introduces a benchmark to assess Large Language Model (LLM) performance at extracting information from models and literature reviews. I wrote “seemingly” because, while the intent is great, the abstract is not clear about how this system would assess future models (not only the ones currently contained in the LLM database). I am also a bit doubtful about using the number of tokens as a measure of quality. Hopefully, the presentation will clarify these points.
Two posters/presentations will show tools to improve efficiency: a VBA/R automator (I wonder about the sustainability of maintaining this type of program) and the use of metamodels (see the sketch below for the general idea).
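For readers less familiar with metamodels, here is a minimal, hypothetical sketch of the general idea (not the authors’ actual tool; the toy model and parameter ranges are invented): run a slow health-economic model a limited number of times, fit a cheap statistical surrogate to those runs, then evaluate the surrogate thousands of times, e.g., for probabilistic sensitivity analysis.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def expensive_model(params):
    """Stand-in for a slow health-economic model; returns an incremental net benefit.
    The formula and parameter ranges are invented for illustration."""
    cost, effect = params
    return 30_000 * effect - cost + 500 * np.sin(cost / 2_000)

# 1) Run the full model on a small design of input samples (50 runs).
X_train = rng.uniform(low=[1_000, 0.1], high=[20_000, 1.0], size=(50, 2))
y_train = np.array([expensive_model(x) for x in X_train])

# 2) Fit a cheap surrogate, the “metamodel”, to those runs.
metamodel = GaussianProcessRegressor(normalize_y=True).fit(X_train, y_train)

# 3) Reuse the surrogate for many more evaluations, e.g., a probabilistic sensitivity analysis.
X_psa = rng.uniform(low=[1_000, 0.1], high=[20_000, 1.0], size=(10_000, 2))
inb = metamodel.predict(X_psa)
print(f"Probability of being cost-effective: {(inb > 0).mean():.1%}")
```

The choice of surrogate (Gaussian process, linear regression, something else) matters less than the pattern itself: trade a little approximation error for orders of magnitude in speed, which is where the efficiency gain of metamodelling comes from.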
The metamodels poster is interesting because we will also present a poster, about visual programming, and try to convince the audience that this way of programming has many benefits for specific uses of modelling: brainstorming, early modelling, strategy, and introducing newcomers to complex modelling topics. In a real geek way, it is interesting to note that most of these sessions still rely on MS Excel and only venture, from time to time, into R. Our poster introduces TypeScript, but that is more a side effect of the framework we used. Our solution can, in fact, be extended to any programming language, including Python, a language used a lot in data science but, outside LLM applications, not much in HEOR.
I didn’t mention the sessions on causal inference, survival analysis, surrogate endpoints, … They are all worth attending. In your opinion, which session(s) did I miss in this brief overview?