<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Simply Boring AI]]></title><description><![CDATA[Making AI useful, boring, and safe. With art and stories.]]></description><link>https://www.simplyboring.ai</link><image><url>https://www.simplyboring.ai/img/substack.png</url><title>Simply Boring AI</title><link>https://www.simplyboring.ai</link></image><generator>Substack</generator><lastBuildDate>Sun, 24 May 2026 05:31:39 GMT</lastBuildDate><atom:link href="https://www.simplyboring.ai/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Gary Ang (Ming)]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[simplyboringai@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[simplyboringai@substack.com]]></itunes:email><itunes:name><![CDATA[Gary Ang (Ming)]]></itunes:name></itunes:owner><itunes:author><![CDATA[Gary Ang (Ming)]]></itunes:author><googleplay:owner><![CDATA[simplyboringai@substack.com]]></googleplay:owner><googleplay:email><![CDATA[simplyboringai@substack.com]]></googleplay:email><googleplay:author><![CDATA[Gary Ang (Ming)]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[LLMs Can't Understand Time, At Least Not Naturally - An Update]]></title><description><![CDATA[Update of an article I did in 2025.]]></description><link>https://www.simplyboring.ai/p/llms-cant-understand-time-at-least</link><guid isPermaLink="false">https://www.simplyboring.ai/p/llms-cant-understand-time-at-least</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Sat, 23 May 2026 14:50:57 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!KjB7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is an update of an <a href="https://open.substack.com/pub/msukhareva/p/llms-cant-understand-time-at-least?r=5kml33&amp;utm_campaign=post-expanded-share&amp;utm_medium=web">article</a> I did for <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;AI Realist&quot;,&quot;id&quot;:5286015,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/msukhareva&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48b0a0fa-a8fa-4a2c-85c7-9bf12e822e5d_1024x1024.png&quot;,&quot;uuid&quot;:&quot;c46bbd21-fc1e-4047-b0b9-93075096c8ab&quot;}" data-component-name="MentionToDOM"></span> in 2025. Things have shifted, so I thought it was a good time for an update. I have also added a list of references for folks who are interested.</p><div><hr></div><p>When LLMs first became accessible and popular in 2022, one easy way to tell if someone was selling snake oil was if they claimed that ChatGPT or other LLMs could forecast or predict something in the future. Anyone who said that obviously did not understand how LLMs worked then, and was essentially hallucinating or bullshitting with great confidence.</p><p>LLMs are trained on text data. To forecast, you are working in the domain of time series data. Text and time series are both sequences, but they have fundamentally different characteristics.</p><p>Things have evolved since. Quite a lot actually. Given the shifts in the field, I would now listen more patiently for the details if someone said they used LLMs for forecasting. However, there is still a clear distinction between LLMs for language or text and time series foundational models inspired by LLMs.</p><p>Let me explain this. 
In 4 short acts.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KjB7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KjB7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KjB7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KjB7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KjB7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!KjB7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg" width="1024" height="559" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:145745,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/198968909?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KjB7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!KjB7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!KjB7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!KjB7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb08681d3-aa34-4896-a851-76545895793c_1024x559.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>Act I: The Classical World of Time Series Modelling</h2><p>For decades, way before neural networks were practically usable, forecasting or predictions with time series data was the domain of statisticians and econometricians. Time series data is fundamentally different from tabular, image or text data. You can shuffle rows in tables, mix up images, or rephrase text, and the meaning would still be largely intact.</p><blockquote><p>The first, but not unique, property of time series is sequence. Mixing up the sequential order of time series data renders it meaningless. This is what it has in common with text.</p></blockquote><p>This is the world of models like ARIMA. 
Understanding such models gives a clear picture of what matters for time series data.</p><p>The AutoRegressive Integrated Moving Average (ARIMA) model and its variants dominated time series analysis for ages. They captured the essential insights needed to analyse or make predictions with time series data.</p><p>The core ideas:</p><ul><li><p><strong>AutoRegressive (AR):</strong> Current values depend on previous values</p></li><li><p><strong>Integrated (I):</strong> Many series become more predictable after differencing, i.e. after replacing each value with its change from the previous value</p></li><li><p><strong>Moving Average (MA):</strong> Current values depend on the errors of previous predictions</p></li></ul><p><strong>Conceptually, every time series could be understood as a combination of level, trend, seasonality, cycles and some noise. This is what makes it different from text.</strong></p><p><strong>Y = LEVEL + TREND + SEASONALITY + NOISE</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tS4w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tS4w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!tS4w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!tS4w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tS4w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tS4w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg" width="1344" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:81812,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/198968909?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tS4w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!tS4w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!tS4w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tS4w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f261922-7b79-481e-a7ec-0e011da0448c_1344x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Seasonal ARIMA (SARIMA) handled recurring seasonal patterns. Vector AutoRegression (VAR) tackled multiple related series. Holt-Winters exponential smoothing handled trend and seasonality differently, through weighted averages that favour recent observations. For volatile financial series, Engle&#8217;s ARCH and Bollerslev&#8217;s GARCH gave us proper models for <em>time-varying volatility</em> - the technical term is <em>conditional heteroskedasticity</em>, but the everyday intuition is that the size of the typical move on a given day is itself a moving target. Harvey&#8217;s structural time series models decompose a series into latent components - level, trend, seasonal, cycle - each modelled as its own stochastic process, with the Kalman filter doing the inference behind the scenes. And modern decomposition extensions like MSTL (Multiple Seasonal-Trend decomposition using Loess) handle multiple seasonalities at once - think hourly demand with daily, weekly, and annual cycles all stacked together.</p><p>If you want to start playing with these classical models, R&#8217;s <code>forecast</code> package (<code>auto.arima</code>, <code>ets</code>, <code>stl</code>) is the easy on-ramp, and Python&#8217;s <code>statsforecast</code> from Nixtla gives you the same family of models. Meta&#8217;s Prophet popularised an additive-decomposition approach for forecasting and sits closer to the classical end of the spectrum, despite coming from a company better known for deep learning.</p><blockquote><p>Another critical characteristic of time series is the degree of stationarity. Much more than text or images (the meaning of words changes only slowly over time, and the same goes for images), time series in many domains continually evolve, and are what we call non-stationary.</p></blockquote><p>Recall the &#8216;I&#8217; for Integrated in ARIMA? 
That step is a differencing operation (replacing each value with its change from the previous value) that makes the time series more stationary, and hence more predictable.</p><blockquote><p>The classical world of time series forecasting, because of this focus on the underlying patterns of trends, seasons etc., was inherently explainable.</p></blockquote><p>That&#8217;s Act I. But before I move on to Act II, I thought it would be useful to mention that forecasting is not the only task you can perform on time series. You can also nowcast (predict current unknown values, like GDP, with time series data to date), classify time series patterns, detect outliers in time series, and so on. But the fundamental characteristics of time series data that need to be taken into account remain the same.</p><h2>Act II: Machine and Deep Learning&#8217;s Struggles with Time</h2><p>The power of machine learning models in the last decade meant many tried to use models such as support vector machines, random forests, and boosted tree models for time series forecasting. But these models did not fit naturally with time series data. Not that they could not work, but they were hit and miss.</p><blockquote><p>The natural question then was, why switch to these significantly more complex machine learning models when the classical models were good enough?</p></blockquote><p>Then came advances in computer vision and natural language processing driven by deep learning. Computer scientists being computer scientists, they started looking for new domains to test these models on. Naturally, given the importance of time series data in many commercial and financial settings, they started using these models for different tasks on time series data.</p><blockquote><p>But here&#8217;s the thing - you can&#8217;t just throw time series data at a regular deep learning model and expect magic. 
Remember the properties - sequence, patterns, stationarity?</p></blockquote><p>Sequence models for natural language processing like Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks made a lot of sense. DA-RNN introduced dual-stage attention for input and temporal selection. Time-aware LSTMs handled irregular sampling via decay gates. Temporal Convolutional Networks showed that convolutional architectures could match or beat RNNs for many sequence tasks.</p><p>The question of how to remember important information from way back in the sequence while forgetting irrelevant noise applied equally to text and time series data.</p><p>There were many papers that focused on adapting RNNs, LSTMs, and even CNNs (designed for image data) to time series data. I would say the results were mixed. Sometimes you got fantastic results, sometimes a simple classical model would beat the complex deep learning model handily with a fraction of the time and computing power. A reviewer once asked me why a model I developed and trained for time series modelling was so large. It had only 12M parameters (minuscule compared to the billions and trillions for LLMs), but you get the idea. Probabilistic recurrent forecasters like DeepAR and deep state-space variants became quite popular. NeuralProphet - the deep learning successor to Meta&#8217;s Prophet, swapping the additive components for neural network blocks - sounded promising but empirically failed to beat plain ETS on standard benchmarks. The &#8220;DL doesn&#8217;t automatically win&#8221; pattern was already showing up.</p><p>So was it a pointless exercise? 
Not really.</p><blockquote><p>One of the key differences between classical and deep learning time series models was that deep learning models could utilise multimodal time series data (time series of text, audio, sensor events, images, networks), whereas classical models were largely restricted to numerical time series.</p></blockquote><p>Aside from sequence, there are many other aspects of time series that we could focus on, and they are distinctly different from natural language or image data.</p><p>For example:</p><ul><li><p><strong>Different time scales:</strong> Time series that change by the second, hourly, daily, quarterly and so on.</p></li><li><p><strong>Varying signal quality:</strong> Time series data that are inherently noisier than natural language or image data.</p></li><li><p><strong>Relationship dynamics:</strong> The importance of interactions between different time series.</p></li></ul><p>There are many more. And deep learning models gave us a lot of flexibility to explore these characteristics.</p><p>In the early 2020s, even before ChatGPT, but after the seminal <em>&#8220;Attention is All You Need&#8221;</em> moment, many papers also explored the use of attention-based transformers for time series data.</p><p><em><strong>It was a logical pairing. We use positional encodings to encode the position of tokens (or words) in transformers for text. Why couldn&#8217;t we do the same for time series steps?</strong></em></p><p>A whole family of deep forecasting architectures came out of this period - N-BEATS for interpretable basis expansion (built from fully connected blocks rather than attention), the Temporal Fusion Transformer (TFT) for multi-horizon attention, Informer for long sequences, TS Transformer for self-supervised representation learning, Autoformer for decomposition + auto-correlation, and PatchTST. The PatchTST trick - that you can chop a series into patches and treat them like tokens - would turn out to be a conceptual seed for Act III. 
CoST pushed contrastive disentanglement of seasonal and trend representations.</p><p>In parallel, multimodal work fused text and prices for financial forecasting - event-driven stock prediction such as &#8220;Listening to Chaotic Whispers&#8221;, which added news attention. Fine-grained event typologies, stock embeddings from news + price history, hierarchical multi-task models for earnings-call volatility, REST&#8217;s relational event-driven framework, and FAST&#8217;s news-and-tweet time-aware network all extended this line.</p><p>I published several papers in this domain - using numerical and text time series with evolving networks. KECE combined knowledge graphs with numerical and textual time series. GLAM distinguished between global and local temporal patterns with adaptive curriculum learning to handle noise. GAME designed latent sequence encoders for multimodal data of different frequencies. DynMix used dynamic self-supervised learning with implicit and explicit network views, while DynScan learned slot concepts to handle non-stationary multimodal streams. These models showed strong results on financial forecasting, portfolio optimization, and ESG predictions, but they were highly specialized transformer models trained for specific tasks.</p><p>They were not general purpose or foundational models. Just using a transformer does not qualify. 
A general purpose or foundational model needs to be usable across different tasks.</p><p>But there were folks already researching these, even prior to ChatGPT coming onto the scene in 2022.</p><p>For readers who want to play with this era of models, the Python libraries <code>darts</code> (which also contains classical models), Nixtla&#8217;s <code>neuralforecast</code>, and Hugging Face&#8217;s transformer-based time series models are good entry points.</p><h2>Act III: Foundation Models Learn the Language of Time</h2><p>Interestingly, while I was researching multimodal time series models focused on networks, one of my PhD mates (Gerald Woo) was doing groundbreaking work on one of the first foundational models for time series data inspired by LLM architectures.</p><p>This was the Moirai time series foundation model for universal forecasting. We attended each other&#8217;s research presentations once in a while, and I found his work fascinating. But I was already at the tail end of my PhD, so it was too late to switch tack.</p><p>Since then there have been many more foundational models for time series data inspired by LLM architectures.</p><p>Now that you know all the special characteristics of time series, you would know that one cannot just throw time series data into a transformer or an LLM and expect some magic to happen.</p><blockquote><p>One key challenge was how to convert infinite numerical possibilities into a finite vocabulary that transformers can process. Language has a finite vocabulary, but time series do not.</p></blockquote><p>Different teams solved this differently:</p><p>Amazon&#8217;s Chronos quantizes continuous values into 4,096 discrete bins - essentially creating a &#8220;vocabulary&#8221; for time series. Google&#8217;s TimesFM treats segments of a series as &#8220;patches&#8221;, much like image patches in vision models. Salesforce&#8217;s MOIRAI uses multiple patch sizes for different temporal frequencies. 
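</p><p>To make the tokenisation idea concrete, here is a minimal sketch in Python with NumPy of the two approaches mentioned above: binning scaled values into a fixed vocabulary, and chopping a series into patches. It is not the actual Chronos or TimesFM code. The function names, the mean scaling, the [-15, 15] bin range, and the toy data are all illustrative assumptions.</p>
<pre>
```python
import numpy as np


def quantize_series(values, n_bins=4096, limit=15.0):
    """Scale a series by its mean absolute value, then map each value to a bin id."""
    values = np.asarray(values, dtype=float)
    # Mean scaling lets series of very different magnitudes share one vocabulary.
    scale = float(np.mean(np.abs(values))) or 1.0  # fall back to 1.0 for an all-zero series
    scaled = np.clip(values / scale, -limit, limit)
    # n_bins equal-width bins over [-limit, limit]; the interior edges define the bin ids.
    edges = np.linspace(-limit, limit, n_bins + 1)
    tokens = np.digitize(scaled, edges[1:-1])  # integers in 0 .. n_bins - 1
    return tokens, scale, edges


def dequantize(tokens, scale, edges):
    """Map bin ids back to approximate values using the bin centres."""
    centres = (edges[:-1] + edges[1:]) / 2.0
    return centres[tokens] * scale


def make_patches(values, patch_len):
    """Split a series into non-overlapping patches, dropping any leftover tail."""
    values = np.asarray(values, dtype=float)
    n_patches = len(values) // patch_len
    return values[: n_patches * patch_len].reshape(n_patches, patch_len)


series = [10, 12, 11, 15, 14, 18, 17, 21]  # made-up toy data
tokens, scale, edges = quantize_series(series)
recon = dequantize(tokens, scale, edges)
patches = make_patches(series, patch_len=4)

print(tokens)         # one integer token per time step
print(recon)          # close to the original values, but not identical
print(patches.shape)  # (number of patches, patch length)
```
</pre>
<p>The round trip shows the trade-off. Once values are binned, the model only ever sees integers from a fixed vocabulary, at the cost of a rounding error that, for values inside the clipping range, is at most half a bin width times the scale. Patching instead keeps the raw numbers and changes the unit the transformer attends over.</p><p>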
There are other such models, but the fundamental issues being solved are similar: address the tokenisation of time series, collate a large cross-domain dataset, and adjust the transformer architecture to suit the unique characteristics of time series data.</p><p>In the months since the first version of this piece, the foundation-model wave has not slowed. But the field has clearly forked into two camps. Both are still very much &#8220;Act III&#8221; - they tokenize numerical time series and stick a transformer on top. They differ in <em>where</em> the parameters come from.</p><p><strong>Camp A - Tokenize numbers, train a transformer from scratch.</strong></p><p>This is the original recipe. The more recent successors mostly keep it and refine it:</p><ul><li><p><strong>Chronos-2</strong> - Amazon&#8217;s 2025 follow-up; adds in-context learning across related series and covariate-informed forecasting.</p></li><li><p><strong>ChronosX</strong> - extends Chronos to handle exogenous variables.</p></li><li><p><strong>MOIRAI-2</strong> - simpler architecture, better data, generalises better than v1.</p></li><li><p><strong>MOIRAI-MoE</strong> - sparse mixture-of-experts variant; specialises across frequencies.</p></li><li><p><strong>Lag-Llama</strong> - open-source probabilistic time series foundation model (TSFM).</p></li><li><p><strong>TimeGPT-1</strong> - Nixtla&#8217;s offering.</p></li></ul><p><strong>Camp B - Don&#8217;t pretrain a TSFM. Reprogram a frozen LLM.</strong></p><p>The intuition: LLMs already cost hundreds of millions of dollars to train. Maybe you don&#8217;t need to start from scratch - just teach the existing LLM to see numbers.</p><ul><li><p><strong>Time-LLM</strong> - patches the series, reprograms each patch via cross-attention with text prototypes, freezes the LLM body, trains only the input projection and output head.</p></li><li><p><strong>LLMTime</strong> - encodes the series as a <em>string of digits</em> and lets the LLM autoregress. 
No training.</p></li><li><p><strong>GPT4TS / OFA</strong> - keeps most of GPT-2 frozen and fine-tunes only the layer norms and positional embeddings.</p></li><li><p><strong>PatchInstruct</strong> - patch tokenisation + decomposition + neighbour augmentation as a prompting strategy.</p></li></ul><p>What&#8217;s the point of foundational models for time series?</p><p>To me, the holy grail is probably few-shot or zero-shot forecasting. Train once on massive time series datasets, then gain the ability to perform a range of tasks - forecast sales, detect equipment anomalies, or classify patterns across entirely new domains without additional training.</p><p>Still an unsolved problem, I feel.</p><p>The key characteristics of time series data that we mentioned earlier still matter (such as sequence dependency, non-stationarity, varying frequencies, signal-to-noise ratios, and domain-specific patterns), and unlike text data, the nature of time series data in different domains (finance, healthcare, energy, retail, manufacturing, climate) can be vastly different and evolve significantly over time. We will talk a bit more about this below.</p><h2>Act IV: The Agentic Turn - Forecasting as Reasoning</h2><p>Acts I, II, and III all share an assumption. They assume forecasting is a single-step problem. You feed in history, the model spits out the forecast or prediction.</p><p>The new wave has a few branches.</p><p>First, we can fuse different types of data, such as news and numerical series, at the foundational model level across multiple steps. <strong>From News to Forecast</strong> uses LLM agents to iteratively filter news, classify events by effect horizon, and fine-tune an LLM to emit digit sequences alongside a reflection loop.</p><p>Second, we can add reasoning. Work on LLMs has shown that reasoning <em>at inference time</em> - generating long chains of thought before producing an answer - improves performance on maths and code. Does it improve performance on time-series forecasting? 
<strong>TimeReasoner</strong> wraps series + timestamps + contextual features into a hybrid prompt, feeds it to a slow-thinking LLM, and explores three reasoning strategies - making the LLM <em>think</em> about the series before answering.</p><p>And finally, we can add multiple steps to the mix. <strong>AlphaCast</strong> is a training-free three-stage workflow (Investigator &#8594; Generator &#8594; Reflector) that mirrors how an expert forecaster works: prepare context, predict, critique, refine.</p><h2>Conclusion</h2><p>The only conclusion is that there is no conclusion yet. The jury is still out.</p><p>For a long time, a good place to gauge the state of time series models was the <strong>Makridakis Competitions</strong>. M1-M3 (1980s-90s) were won by classical methods. M4 (2020) was won by Slawek Smyl&#8217;s hybrid ES-RNN - exponential smoothing married to LSTM - not by pure deep learning. M5 (2022) was won by <em>LightGBM</em> (gradient-boosted trees) with feature engineering, not deep architectures. M6 (2025) pushed into finance and reported that most teams <em>underperformed</em> simple benchmarks. The recurring lesson across forty years of competitions: the gains attributed to deep learning are usually gains from feature engineering, ensembling, or hybrid approaches. An example of a new leaderboard for foundation models is <strong>GIFT-Eval</strong> - 24 datasets with over 144,000 time series and 177 million data points, spanning seven domains, 10 frequencies, multivariate inputs, and prediction lengths ranging from short-term to long-term forecasts.</p><p>But even when these foundation models top the leaderboards, the picture could be misleading.</p><p><strong>Rethinking Evaluation in the Era of TSFMs</strong> found that benchmark scores are inflated - the test data often overlaps with the training data, either directly or because similar time periods appear in both. 
Strip the overlap out and the impressive numbers shrink.</p><p><strong>Re(Visiting) TSFMs in Finance</strong> pitted the leading foundation models against decades of stock returns from markets around the world. The result: off the shelf, they did not beat ordinary baselines. Even fine-tuning didn&#8217;t close the gap. The only thing that worked was re-training the model from scratch on financial data - at which point you&#8217;ve essentially built a domain-specific model, not used a &#8220;foundation&#8221; one.</p><p>Multimodal and agentic forecasting are even harder to judge. <strong>Rethinking Multimodal TSF Evaluation</strong> points out that many &#8220;text helps forecasting&#8221; benchmarks are flawed - for example, the news used in testing has often already leaked into the model&#8217;s training data.</p><p>And that&#8217;s where we are now. Lots of progress, but still many open questions and uncertainties.</p><div><hr></div><h2>References</h2><h3>Act I - Classical time-series models, surveys, and competitions</h3><ul><li><p><em>Forecasting: Principles and Practice</em> - <a href="https://otexts.com/fpp3/">https://otexts.com/fpp3/</a> - <em>Modern, accessible, free treatment of the whole classical lineage. 
Best starting point.</em></p></li><li><p><em>Automatic Time Series Forecasting: The </em><code>forecast</code><em> Package for R</em> - <a href="https://www.jstatsoft.org/v027/i03">https://www.jstatsoft.org/v027/i03</a> - <em>Origin of <code>auto.arima</code>.</em></p></li><li><p><em>Forecasting Seasonals and Trends by Exponentially Weighted Moving Averages</em> - <a href="https://www.sciencedirect.com/science/article/abs/pii/S0169207003001134">https://www.sciencedirect.com/science/article/abs/pii/S0169207003001134</a></p></li><li><p><em>Exponential Smoothing: The State of the Art - Part II</em> - <a href="https://www.bauer.uh.edu/egardner/3301H%20Operations%20Management/ESG%20Publications/2006%20Exp.%20Sm.%20State%20of%20the%20art%20-%20Part%20II.pdf">https://www.bauer.uh.edu/egardner/3301H%20Operations%20Management/ESG%20Publications/2006%20Exp.%20Sm.%20State%20of%20the%20art%20-%20Part%20II.pdf</a></p></li><li><p><em>Macroeconomics and Reality</em> - <a href="https://www.pauldeng.com/pdf/Sims%20macroeconomics%20and%20reality.pdf">https://www.pauldeng.com/pdf/Sims%20macroeconomics%20and%20reality.pdf</a></p></li><li><p><em>Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of UK Inflation</em> - <a href="https://www.jstor.org/stable/1912773">https://www.jstor.org/stable/1912773</a></p></li><li><p><em>The Story of GARCH: A Personal Odyssey</em> - <a href="https://public.econ.duke.edu/~boller/Papers/GARCH_JoE_2023.pdf">https://public.econ.duke.edu/~boller/Papers/GARCH_JoE_2023.pdf</a></p></li><li><p><em>Forecasting, Structural Time Series Models and the Kalman Filter</em> - <a href="https://www.cambridge.org/core/books/forecasting-structural-time-series-models-and-the-kalman-filter/CE5E112570A56960601760E786A5E631">https://www.cambridge.org/core/books/forecasting-structural-time-series-models-and-the-kalman-filter/CE5E112570A56960601760E786A5E631</a></p></li><li><p><em>MSTL: A Seasonal-Trend Decomposition Algorithm for Time Series with Multiple 
Seasonal Patterns</em> - <a href="https://arxiv.org/abs/2107.13462">https://arxiv.org/abs/2107.13462</a></p></li><li><p><em>StatsForecast / Nixtla</em> - <a href="https://github.com/Nixtla/statsforecast">https://github.com/Nixtla/statsforecast</a></p></li><li><p><em>The M4 Competition: 100,000 time series and 61 forecasting methods</em> - <a href="https://www.sciencedirect.com/science/article/abs/pii/S0169207019301128">https://www.sciencedirect.com/science/article/abs/pii/S0169207019301128</a></p></li><li><p><em>The M5 Accuracy Competition: Results, Findings and Conclusions</em> - <a href="https://www.sciencedirect.com/science/article/pii/S0169207021001874">https://www.sciencedirect.com/science/article/pii/S0169207021001874</a></p></li><li><p><em>Forecasting with Gradient Boosted Trees: M5 Uncertainty Winner</em> - <a href="https://www.sciencedirect.com/science/article/abs/pii/S0169207021002090">https://www.sciencedirect.com/science/article/abs/pii/S0169207021002090</a></p></li><li><p><em>The M6 Forecasting Competition: Bridging the Gap between Forecasting and Investment Decisions</em> - <a href="https://arxiv.org/abs/2310.13357">https://arxiv.org/abs/2310.13357</a></p></li></ul><h3>Act II - RNN / LSTM / Transformer-era and multimodal cluster</h3><ul><li><p><em>DA-RNN: Dual-Stage Attention-Based RNN</em> - <a href="https://arxiv.org/abs/1704.02971">https://arxiv.org/abs/1704.02971</a></p></li><li><p><em>Patient Subtyping via Time-Aware LSTM Networks (T-LSTM)</em> - <a href="https://dl.acm.org/doi/10.1145/3097983.3097997">https://dl.acm.org/doi/10.1145/3097983.3097997</a></p></li><li><p><em>An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling (TCN)</em> - <a href="https://arxiv.org/abs/1803.01271">https://arxiv.org/abs/1803.01271</a></p></li><li><p><em>DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks</em> - <a 
href="https://arxiv.org/abs/1704.04110">https://arxiv.org/abs/1704.04110</a></p></li><li><p><em>Deep State Space Models for Time Series Forecasting</em> - <a href="https://papers.nips.cc/paper/2018/hash/5cf68969fb67aa6082363a6d4e6468e2-Abstract.html">https://papers.nips.cc/paper/2018/hash/5cf68969fb67aa6082363a6d4e6468e2-Abstract.html</a></p></li><li><p><em>NeuralProphet: Explainable Forecasting at Scale</em> - <a href="https://arxiv.org/abs/2111.15397">https://arxiv.org/abs/2111.15397</a></p></li><li><p><em>A Hybrid Method of Exponential Smoothing and Recurrent Neural Networks for Time Series Forecasting (ES-RNN, M4 winner)</em> - <a href="https://www.sciencedirect.com/science/article/abs/pii/S0169207019301153">https://www.sciencedirect.com/science/article/abs/pii/S0169207019301153</a></p></li><li><p><em>N-BEATS: Neural Basis Expansion Analysis for Interpretable Time Series Forecasting</em> - <a href="https://arxiv.org/abs/1905.10437">https://arxiv.org/abs/1905.10437</a></p></li><li><p><em>Temporal Fusion Transformers for Interpretable Multi-Horizon Time Series Forecasting (TFT)</em> - <a href="https://arxiv.org/abs/1912.09363">https://arxiv.org/abs/1912.09363</a></p></li><li><p><em>Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting</em> - <a href="https://arxiv.org/abs/2012.07436">https://arxiv.org/abs/2012.07436</a></p></li><li><p><em>A Transformer-based Framework for Multivariate Time Series Representation Learning</em> - <a href="https://arxiv.org/abs/2010.02803">https://arxiv.org/abs/2010.02803</a></p></li><li><p><em>A Time Series is Worth 64 Words: Long-term Forecasting with Transformers (PatchTST)</em> - <a href="https://arxiv.org/abs/2211.14730">https://arxiv.org/abs/2211.14730</a></p></li><li><p><em>CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting</em> - <a href="https://arxiv.org/abs/2202.01575">https://arxiv.org/abs/2202.01575</a></p></li><li><p><em>Autoformer: 
Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting</em> - <a href="https://arxiv.org/abs/2106.13008">https://arxiv.org/abs/2106.13008</a></p></li><li><p><em>Deep Learning for Event-Driven Stock Prediction</em> - <a href="https://www.ijcai.org/Proceedings/15/Papers/329.pdf">https://www.ijcai.org/Proceedings/15/Papers/329.pdf</a></p></li><li><p><em>Listening to Chaotic Whispers: A Deep Learning Framework for News-Oriented Stock Trend Prediction</em> - <a href="https://arxiv.org/abs/1712.02136">https://arxiv.org/abs/1712.02136</a></p></li><li><p><em>Incorporating Fine-grained Events in Stock Movement Prediction</em> - <a href="https://aclanthology.org/D19-5106/">https://aclanthology.org/D19-5106/</a></p></li><li><p><em>Stock Embeddings Acquired from News Articles and Price History, and an Application to Portfolio Optimization</em> - <a href="https://aclanthology.org/2020.acl-main.307/">https://aclanthology.org/2020.acl-main.307/</a></p></li><li><p><em>HTML: Hierarchical Transformer-based Multi-task Learning for Volatility Prediction</em> - <a href="https://dl.acm.org/doi/10.1145/3366423.3380128">https://dl.acm.org/doi/10.1145/3366423.3380128</a></p></li><li><p><em>REST: Relational Event-driven Stock Trend Forecasting</em> - <a href="https://arxiv.org/abs/2102.07372">https://arxiv.org/abs/2102.07372</a></p></li><li><p><em>FAST: Financial News and Tweet Based Time Aware Network for Stock Trading</em> - <a href="https://aclanthology.org/2021.eacl-main.185/">https://aclanthology.org/2021.eacl-main.185/</a></p></li><li><p><em>Learning Knowledge-Enriched Company Embeddings for Investment Management</em> - <a href="https://dl.acm.org/doi/abs/10.1145/3490354.3494390">https://dl.acm.org/doi/abs/10.1145/3490354.3494390</a></p></li><li><p><em>Investment and Risk Management with Online News and Heterogeneous Networks</em> - <a href="https://dl.acm.org/doi/full/10.1145/3532858">https://dl.acm.org/doi/full/10.1145/3532858</a></p></li><li><p><em>Guided 
Attention Multimodal Multitask Financial Forecasting</em> - <a href="https://aclanthology.org/2022.acl-long.437/">https://aclanthology.org/2022.acl-long.437/</a></p></li><li><p><em>Dynamic Multimodal Implicit and Explicit Networks for Multiple Financial Tasks</em> - <a href="https://ieeexplore.ieee.org/abstract/document/10020722/">https://ieeexplore.ieee.org/abstract/document/10020722/</a></p></li><li><p><em>Dynamic Multimodal Slot Concepts from the Web</em> - <a href="https://dl.acm.org/doi/full/10.1145/3663674">https://dl.acm.org/doi/full/10.1145/3663674</a></p></li></ul><h3>Act III - Foundation models (Camp A: train-from-scratch TSFMs)</h3><ul><li><p><em>Unified Training of Universal Time Series Forecasting Transformers (MOIRAI)</em> - <a href="https://arxiv.org/abs/2402.02592">https://arxiv.org/abs/2402.02592</a></p></li><li><p><em>Chronos: Learning the Language of Time Series</em> - <a href="https://arxiv.org/abs/2403.07815">https://arxiv.org/abs/2403.07815</a></p></li><li><p><em>A Decoder-Only Foundation Model for Time-Series Forecasting (TimesFM)</em> - <a href="https://research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting/">https://research.google/blog/a-decoder-only-foundation-model-for-time-series-forecasting/</a></p></li><li><p><em>Chronos-2: From Univariate to Universal Forecasting</em> - <a href="https://arxiv.org/abs/2510.15821">https://arxiv.org/abs/2510.15821</a></p></li><li><p><em>ChronosX: Extending Time-Series Foundation Models to Support Exogenous Variables</em> - </p></li></ul><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:161336815,&quot;url&quot;:&quot;https://aihorizonforecast.substack.com/p/chronosx-extending-time-series-foundation&quot;,&quot;publication_id&quot;:1940355,&quot;publication_name&quot;:&quot;AI Horizon 
Forecast&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!BIIa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8632e0bc-2f66-44bd-bed3-c6eebfce50ff_819x819.png&quot;,&quot;title&quot;:&quot;ChronosX: Extending Time-Series Foundation Models to Support Exogenous Variables&quot;,&quot;truncated_body_text&quot;:&quot;Foundation models excel in univariate time-series benchmarks.&quot;,&quot;date&quot;:&quot;2025-04-16T09:14:21.852Z&quot;,&quot;like_count&quot;:6,&quot;comment_count&quot;:0,&quot;bylines&quot;:[{&quot;id&quot;:167905535,&quot;name&quot;:&quot;Nikos Kafritsas&quot;,&quot;handle&quot;:&quot;nikoskafritsas&quot;,&quot;previous_name&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/95b7b42e-3dac-4d06-a08d-c123b32dfe58_400x400.png&quot;,&quot;bio&quot;:&quot;Data Scientist at Persado &#8226; Making AI simple&quot;,&quot;profile_set_up_at&quot;:&quot;2023-09-10T17:17:37.445Z&quot;,&quot;reader_installed_at&quot;:&quot;2023-09-12T00:11:00.501Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1931125,&quot;user_id&quot;:167905535,&quot;publication_id&quot;:1940355,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1940355,&quot;name&quot;:&quot;AI Horizon Forecast&quot;,&quot;subdomain&quot;:&quot;aihorizonforecast&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;Explaining complex AI models as clear as daylight.\nFocusing on time series and latest AI 
research.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8632e0bc-2f66-44bd-bed3-c6eebfce50ff_819x819.png&quot;,&quot;author_id&quot;:167905535,&quot;primary_user_id&quot;:167905535,&quot;theme_var_background_pop&quot;:&quot;#E8B500&quot;,&quot;created_at&quot;:&quot;2023-09-10T17:26:45.542Z&quot;,&quot;email_from_name&quot;:&quot;AI Horizon Forecast&quot;,&quot;copyright&quot;:&quot;Nikos Kafritsas&quot;,&quot;founding_plan_name&quot;:&quot;Legendary Subscriber&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false,&quot;logo_url_wide&quot;:null}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:100,&quot;status&quot;:{&quot;bestsellerTier&quot;:100,&quot;subscriberTier&quot;:null,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;bestseller&quot;,&quot;tier&quot;:100},&quot;paidPublicationIds&quot;:[],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;,&quot;source&quot;:null}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://aihorizonforecast.substack.com/p/chronosx-extending-time-series-foundation?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!BIIa!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8632e0bc-2f66-44bd-bed3-c6eebfce50ff_819x819.png" loading="lazy"><span class="embedded-post-publication-name">AI Horizon Forecast</span></div><div class="embedded-post-title-wrapper"><div 
class="embedded-post-title">ChronosX: Extending Time-Series Foundation Models to Support Exogenous Variables</div></div><div class="embedded-post-body">Foundation models excel in univariate time-series benchmarks&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">a year ago &#183; 6 likes &#183; Nikos Kafritsas</div></a></div><ul><li><p><em>MOIRAI 2.0: When Less Is More for Time Series Forecasting</em> - <a href="https://arxiv.org/abs/2511.11698">https://arxiv.org/abs/2511.11698</a></p></li><li><p><em>MOIRAI-MoE: Empowering Time Series Foundation Models with Sparse Mixture of Experts</em> - <a href="https://arxiv.org/abs/2410.10469">https://arxiv.org/abs/2410.10469</a></p></li><li><p><em>Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting</em> - <a href="https://arxiv.org/abs/2310.08278">https://arxiv.org/abs/2310.08278</a></p></li><li><p><em>TimeGPT-1</em> - <a href="https://arxiv.org/abs/2310.03589">https://arxiv.org/abs/2310.03589</a></p></li></ul><h3>Act III - Foundation models (Camp B: LLM-reprogramming)</h3><ul><li><p><em>Time-LLM: Time Series Forecasting by Reprogramming Large Language Models</em> - <a href="https://arxiv.org/abs/2310.01728">https://arxiv.org/abs/2310.01728</a></p></li><li><p><em>Large Language Models Are Zero-Shot Time Series Forecasters (LLMTime)</em> - <a href="https://arxiv.org/abs/2310.07820">https://arxiv.org/abs/2310.07820</a></p></li><li><p><em>One Fits All: Power General Time Series Analysis by Pretrained LM (GPT4TS / OFA)</em> - <a href="https://arxiv.org/abs/2302.11939">https://arxiv.org/abs/2302.11939</a></p></li><li><p><em>PatchInstruct: Forecasting Time Series with LLMs via Patch-Based Prompting and Decomposition</em> - <a href="https://arxiv.org/abs/2506.12953">https://arxiv.org/abs/2506.12953</a></p></li></ul><h3>Act IV - Multimodal text + TS</h3><ul><li><p><em>From News to Forecast: Integrating Event Analysis in 
LLM-Based Time Series Forecasting with Reflection</em> - <a href="https://arxiv.org/abs/2409.17515">https://arxiv.org/abs/2409.17515</a></p></li><li><p><em>Unlocking the Value of Text: Event-Driven Reasoning and Multi-Level Alignment for Time Series Forecasting</em> - <a href="https://arxiv.org/abs/2603.15452">https://arxiv.org/abs/2603.15452</a></p></li></ul><h3>Act IV - Agentic &amp; reasoning frameworks</h3><ul><li><p><em>AlphaCast: A Human Wisdom-LLM Intelligence Co-Reasoning Framework for Interactive Time Series Forecasting</em> - <a href="https://arxiv.org/abs/2511.08947">https://arxiv.org/abs/2511.08947</a></p></li><li><p><em>Empowering Time Series Forecasting with LLM-Agents (DCATS)</em> - <a href="https://arxiv.org/abs/2508.04231">https://arxiv.org/abs/2508.04231</a></p></li><li><p><em>Can Slow-Thinking LLMs Reason Over Time? Empirical Studies in Time Series Forecasting (TimeReasoner)</em> - <a href="https://arxiv.org/abs/2505.24511">https://arxiv.org/abs/2505.24511</a></p></li></ul><h3>Evaluation</h3><ul><li><p><em>GIFT-Eval: A Benchmark for General Time Series Forecasting Model Evaluation</em> - <a href="https://arxiv.org/abs/2410.10393">https://arxiv.org/abs/2410.10393</a> &#183; leaderboard at <a href="https://tsfm.ai/benchmarks/gift-eval">https://tsfm.ai/benchmarks/gift-eval</a></p></li><li><p><em>Rethinking Evaluation in the Era of Time Series Foundation Models: (Un)known Information Leakage Challenges</em> - <a href="https://arxiv.org/abs/2510.13654">https://arxiv.org/abs/2510.13654</a></p></li><li><p><em>Rethinking Multimodal Time-Series Forecasting Evaluation</em> - <a href="https://openreview.net/forum?id=Z1TMV4bGuu">https://openreview.net/forum?id=Z1TMV4bGuu</a></p></li><li><p><em>Re(Visiting) Time Series Foundation Models in Finance</em> - <a href="https://arxiv.org/abs/2511.18578">https://arxiv.org/abs/2511.18578</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Models]]></title><description><![CDATA[From models, systems and use cases to archetypes, capabilities and 
workflows.]]></description><link>https://www.simplyboring.ai/p/models</link><guid isPermaLink="false">https://www.simplyboring.ai/p/models</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Tue, 05 May 2026 11:53:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0PzS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Picking up from last week&#8217;s thought <strong><a href="https://substack.com/@simplyboringai/note/c-250976537?utm_source=notes-share-action&amp;r=5kml33">piece</a> </strong>on the spectrum from models, systems and use cases to archetypes, capabilities and workflows.</p><p>Starting with models. My favourite.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0PzS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0PzS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 424w, https://substackcdn.com/image/fetch/$s_!0PzS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 848w, https://substackcdn.com/image/fetch/$s_!0PzS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 1272w, 
https://substackcdn.com/image/fetch/$s_!0PzS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0PzS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png" width="1456" height="1165" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1165,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:313578,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/196532753?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0PzS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 424w, https://substackcdn.com/image/fetch/$s_!0PzS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!0PzS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!0PzS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf1f2e9c-df0e-4e92-973b-9480728f9734_2000x1600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The first models I ever met were simple curves. 
Bootstrapping zero rates off swap quotes in a Masters of Financial Engineering classroom. Then the more sophisticated cousins, Black-Scholes, SABR, Hull-White, Heston. I still remember the books by Emanuel Derman, Massimo Morini, and many others. For many years, SR 11-7 was the only guidance governing this world. Though you could see snippets of it in parts of the Basel capital rules. To be honest, I was not a great student and was confused about such quantitative models for many years.</p><p>Then I met machine learning models. Decision trees. Random forests. Gradient boosted trees. Support vector machines. Then deep learning models. RNNs, CNNs. Still models, but slightly more generalized. To be honest, it was only when I was learning machine learning and deep learning that I finally understood the quantitative models I learnt earlier. Same for LLMs. And agents are just systems wrapped around LLMs.</p><p>Once you see the main parts - the selected function, the objective, the optimization method, and the evaluation method - you realize that they ain&#8217;t very different. Asking questions about any model ultimately comes down to these parts, and the assumptions underlying the model.</p><p>And it&#8217;s the AI model that ultimately makes an AI system different from any other software system. It&#8217;s also why in 2024, we did a review of AI model risk management in banks, when I was still at MAS. If there was any place to start, looking at the models was as good as any.</p><p>And it&#8217;s where the 3 &#8216;U&#8217;s arise.</p><p>Uncertainty. All models have it. Irreducible randomness you can&#8217;t eliminate. Reducible gaps you can close with more data. But always some left.</p><p>Unexpectedness. More complex models behave in ways nobody designed. Emergent capabilities. Adversarial vulnerabilities. Gaming the system. Misalignment.</p><p>Unexplainability. The degree to which we can explain a decision varies. 
Transparency, explainability, interpretability, but ultimately about understanding.</p><p>The Fed, OCC and FDIC dropped SR 26-2 on April 17. Replaced SR 11-7 after fifteen years. Buried in it, a single-sentence carve-out: &#8216;GenAI and agentic AI are explicitly excluded from scope.&#8217;</p><p>Personally, given the connections between models, whether quantitative or LLM, this exclusion puzzles me a little.</p><p>What do you think?</p><p><a href="#ThinkingInSpectrum">#ThinkingInSpectrum</a> <a href="#ModelRisk">#ModelRisk</a> <a href="#AIRiskManagement">#AIRiskManagement</a></p>]]></content:encoded></item><item><title><![CDATA[On the Units of AI Governance]]></title><description><![CDATA[From models, systems and use cases to archetypes, capabilities and workflows.]]></description><link>https://www.simplyboring.ai/p/on-the-units-of-ai-governance</link><guid isPermaLink="false">https://www.simplyboring.ai/p/on-the-units-of-ai-governance</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Wed, 29 Apr 2026 12:56:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nFeO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There&#8217;s been a slow realization. Not about life. But a more boring one about AI risk.</p><p>I&#8217;ve been doing a number of talks on the intersection of two disciplines.</p><p>First discipline. Model risk management. My model risk supervision days.</p><p>Second discipline. Computer science. AI models. My ill-advised side quest to get a PhD in computer science at 42.</p><p>The MAS AI risk management guidelines I wrote from these cover 3 units of governance that relate to these disciplines - model, system, use case. They form a good foundation. 
But I feel like something has shifted.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nFeO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nFeO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 424w, https://substackcdn.com/image/fetch/$s_!nFeO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 848w, https://substackcdn.com/image/fetch/$s_!nFeO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!nFeO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nFeO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png" width="1456" height="1165" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1165,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:287395,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/195867197?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nFeO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 424w, https://substackcdn.com/image/fetch/$s_!nFeO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 848w, https://substackcdn.com/image/fetch/$s_!nFeO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!nFeO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8afe531f-231c-4ddc-aa50-53e44c54760e_2000x1600.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>1&#65039;&#8419; Model - Obviously the domain of model risk management (MRM). The world of the late SR 11-7 &amp; the new SR 26-2 from the US. As well as the UK&#8217;s SS 1/23 and Canada&#8217;s E23. What OpenAI, Anthropic etc. release and spend shitloads of money on.</p><p>2&#65039;&#8419; System - The domain of technology risk management. Too many frameworks here to list. But the recent &#8220;AI&#8221; vulnerabilities that have been in the news, from Anthropic&#8217;s troubles to Vercel&#8217;s, have nothing to do with the AI model and everything to do with the fragility of systems in this rush to do &#8220;AI&#8221;.</p><p>3&#65039;&#8419; Use Case - What AI risk management focuses on. Anchored by NIST AI RMF, ISO 42001. MAS&#8217; AI risk management guidelines. 
This is closest to what we experience when we use AI, but is also the least scalable.</p><p>And above the line.</p><p>4&#65039;&#8419; Archetype. Prior to Agentic AI, I already saw folks grouping Generative AI into archetypes - question and answer, summarization, retrieval-augmented generation. LLMs are general-purpose tools, but grouping them helps with scalability. For once, stereotyping things helps.</p><p>5&#65039;&#8419; Capability. A bounded class of actions with authority, constraints and evidence requirements. Capabilities extend the idea of archetypes to the world of Agentic AI, where we decompose and compose what AI agents do. See the paper on Agentic MRM by <strong><a href="https://www.linkedin.com/feed/#">Lukasz Szpruch</a></strong>, <strong><a href="https://www.linkedin.com/feed/#">Agus Sudjianto</a></strong>, <strong><a href="https://www.linkedin.com/feed/#">Tanveer Bhatti</a></strong> and me on this, and the Agentic Risk and Capabilities (ARC) framework by <strong><a href="https://www.linkedin.com/feed/#">Shaun Khoo</a></strong> and <strong><a href="https://www.linkedin.com/feed/#">Roy Ka-Wei Lee</a></strong>.</p><p>6&#65039;&#8419; Workflow. I know terms like trajectories and runtime have come into vogue (runtime is the term we use in the title of the Agentic MRM paper). But I feel like they are manifestations of what happens when AI agents do something, rather than a unit of governance. Workflows seem closer to something we can govern. And they relate nicely to the idea of the action space x autonomy perspective of AI agents (see Singapore&#8217;s Model AI Governance Framework for Agentic AI). As you relax the constraints of workflows, you get trajectories and runtimes that can go wider and wilder.</p><p>The three units above the line represent a new set of risks. But somehow I feel they also represent a new route to scalability.</p><p>Models, systems and use cases are usually specific. 
Whereas archetypes, capabilities and workflows can be generalized across a range of applications.</p><p>Just thinking out loud. Let me know your perspectives. I&#8217;ll try to write more about this in the weeks ahead.</p><p><strong>#AIGovernance</strong> <strong>#AgenticAI</strong></p>]]></content:encoded></item><item><title><![CDATA[Garfield's Eyes]]></title><description><![CDATA[Second in my series on what happens when the people who make things meet a technology that also makes things]]></description><link>https://www.simplyboring.ai/p/garfields-eyes</link><guid isPermaLink="false">https://www.simplyboring.ai/p/garfields-eyes</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Mon, 30 Mar 2026 14:06:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!2jFe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>This is the second in my series on what happens when the people who make things meet a technology that also makes things.</p><p>The first part, &#8220;<em><a href="https://open.substack.com/pub/simplyboringai/p/my-shifu?utm_campaign=post-expanded-share&amp;utm_medium=post%20viewer">My Shifu</a></em>&#8221;, covered how I met <em><a href="https://en.wikipedia.org/wiki/Johnny_Lau">Johnny Lau</a></em> through a crazy experiment called Creative Youth Xchange, and his view that technology forces artistic purity rather than threatening it.</p><p>I&#8217;ve been sitting on this, but this <em><a href="https://www.linkedin.com/posts/randomwalker_discussions-of-ai-and-creativity-tend-to-activity-7443289527525277696-CDfk?utm_source=social_share_send&amp;utm_medium=member_desktop_web&amp;rcm=ACoAAAnEwqsBmNv-udZ8tKaEG_MQGlUiz7C_KAg">post</a></em> by Arvind Narayanan on how creativity in art and creativity in science are different made me get back to this second 
part.</p><p>This article goes deeper into the creative process. About creative production, and why AI may not be that great a leap.</p><h2><strong>Is AI Really A Leap?</strong></h2><p>One of the key fears of creatives is that this leap into AI may be one too far. Further than the camera. Further than digital tools.</p><p>Something that creativity cannot keep up with. But is it realistic to think that all art in the future will be purely from AI?</p><p>So I asked Johnny about his jumps from pen-and-paper comics to different technologies, the most recent being a Singlish-speaking robot. Most people would call that a transformation. He didn&#8217;t:</p><p><em>&#8220;It was never a leap for me. Right from the beginning when I launched the first book in 1990, I engineered a range of products such as t-shirts &amp; mugs, a 13-inch ruler and bumper sticker with a tagline: Beware of Kiasu Driver. The latter became a bestseller instantly.&#8221;</em></p><p>My read. This leap is perhaps not that different from the ones in the past for creatives. And AI is perhaps more of an opportunity than a threat.</p><p><em>&#8220;For me it has always been a concept, not a book nor a sticker ... My brain doesn&#8217;t function the way where sectors and industries are differentiated. I had to develop a language to align myself with how the world functions.&#8221;</em></p><p>A concept. Not a product. Not a comic.</p><p>Here&#8217;s what I think is going on. Narayanan&#8217;s distinction is useful: creativity in art is emotional, social, driven by taste. Creativity in science is instrumental, evaluable, a systematic search through possibilities. AI is getting better at the second kind. The first is harder to touch.</p><p>Johnny&#8217;s concept is the first kind. Not the drawing - the reason for the drawing. The cultural nerve. The recognition that makes a Singaporean laugh at their own kiasu-ness. That doesn&#8217;t live in a dataset. 
It comes from being someone, from somewhere.</p><p>The production - t-shirts, mugs, strips - is the second kind. Patterns. Systems. Things that can be decomposed and delegated.</p><p>Johnny worked this out decades ago. He kept the concept. He systematized the production. And the person who showed him how wasn&#8217;t an AI researcher. It was Jim Davis.</p><h2><strong>The Garfield Revelation</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2jFe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2jFe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2jFe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2jFe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2jFe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2jFe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg" 
width="580" height="323.64" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:558,&quot;width&quot;:1000,&quot;resizeWidth&quot;:580,&quot;bytes&quot;:179817,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/192613643?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2jFe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 424w, https://substackcdn.com/image/fetch/$s_!2jFe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 848w, https://substackcdn.com/image/fetch/$s_!2jFe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!2jFe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ceb51a3-d338-41d8-bf49-d9b6e6f35d13_1000x558.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Not exactly Garfield, but you get the idea.</figcaption></figure></div><p>I asked about architecture at USC. He said something I didn&#8217;t expect - that USC was never about architecture for him:</p><p><em>&#8220;The years at USC were never about the study of building design. I took full liberty to immerse myself in the world of media and entertainment when I landed in Los Angeles.&#8221;</em></p><p>That&#8217;s where he met Garfield. Not the cat. The operation.</p><p><em>&#8220;I was stunned by Davis and how he had built a team of artists and writers to produce daily comics strips for the lasagna loving cat for its national syndicate. He had deployed an animation studio system to run his comics studio.&#8221;</em></p><p>Here&#8217;s what he described. Davis ran his comics like a factory. 
Team meeting on Monday. Different themes tossed around for the week&#8217;s seven strips. Once decided, a lead artist carried out the drafts. The process cascaded through the team. And at the end:</p><p><em>&#8220;Davis merely just drew Garfield&#8217;s eyes as a gesture of completion.&#8221;</em></p><p>Garfield&#8217;s eyes. The signature. Everything else was system. That one stroke was authorship.</p><p>I sat with that image for a while.</p><p>Johnny brought this model back to Asia. Most Asian comics were produced by a single artist - one person doing everything. The Garfield team showed him a different path. It became Comix Factory, the studio that produced Mr Kiasu.</p><h2><strong>Agents and Art</strong></h2><p>Here&#8217;s the connection I keep thinking about.</p><p>Davis&#8217;s studio system was, in effect, an early version of what AI promises every creator today. Break the work into steps. Specialize each step. Let the system handle volume while the creator handles vision.</p><p>I&#8217;ve <em><a href="https://www.linkedin.com/pulse/thinking-agents-systems-age-ai-gary-ang-phd-gvb3c">written about agentic AI</a></em> before - systems where multiple AI agents each handle a different part of a task. One agent plans, another retrieves information, another reasons, another generates output. The whole thing works because each part has a defined role and you can observe what each part does.</p><p>Davis was essentially running an agentic system in the 1980s. Just with humans instead of models. One person for drafts, another for inking, the process cascading through the team. And the orchestrator - Davis - only needed to draw the eyes.</p><p>Davis understood in the 1980s what the AI industry is selling in the 2020s: you don&#8217;t need to do everything yourself. You need to know what only you can do.</p><p>For Davis, it was the eyes. 
For Johnny, it&#8217;s the concept.</p><p>And Johnny made the connection himself:</p><p><em>&#8220;Similarly with AI, we just have to change our mindset on the method of production. The creation portion doesn&#8217;t change as it will always come from the depth of our minds.&#8221;</em></p><p>Change the method of production. Keep the creation. That&#8217;s not a theory from someone who&#8217;s read a few articles about AI. That&#8217;s a principle from someone who&#8217;s been decomposing creative labor since 1990.</p><p>A nice footnote: in 2023, Mr Kiasu and Garfield collaborated on a joint library programme with Singapore&#8217;s National Library. A one-of-a-kind global collaboration between two comics characters - the Singaporean one modeled on the American one&#8217;s production system - sharing a stage forty years later.</p><h2><strong>AI As Just Another Method</strong></h2><p>There&#8217;s a broader pattern here that I think matters for anyone thinking about AI and creative work.</p><p>We tend to frame the AI question as: will machines replace artists? That&#8217;s the wrong question. The better question is: what part of your creative work is system, and what part is signature?</p><p>If you&#8217;re a writer, maybe the system is research, outlining, drafting. The signature is voice, judgment, the instinct for when a sentence lands. If you&#8217;re a designer, maybe the system is rendering, iteration, production. The signature is the concept that no brief could have specified.</p><p>Davis knew. He drew the eyes and let the system do the rest. Not because he was lazy. Because he understood what only he could do.</p><p>AI doesn&#8217;t change this question. It just makes it unavoidable.</p><p>Johnny figured this out before AI existed. 
He&#8217;s been running this model for thirty-five years - concept first, production second, and the concept crosses every medium, every platform, every technology that comes along.</p><p>The question I&#8217;m leaving with you: what&#8217;s the part that only you draw?</p><p>More soon.</p>]]></content:encoded></item><item><title><![CDATA[A Simple Reading List on Third-Party AI Risk Management]]></title><description><![CDATA[Probably the hardest problem in AI risk management]]></description><link>https://www.simplyboring.ai/p/a-simple-reading-list-on-third-party</link><guid isPermaLink="false">https://www.simplyboring.ai/p/a-simple-reading-list-on-third-party</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Wed, 25 Mar 2026 16:42:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xVmN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xVmN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xVmN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 424w, https://substackcdn.com/image/fetch/$s_!xVmN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 848w, 
https://substackcdn.com/image/fetch/$s_!xVmN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 1272w, https://substackcdn.com/image/fetch/$s_!xVmN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xVmN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png" width="1456" height="977" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/da320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:977,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7614742,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/192114649?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xVmN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 424w, 
https://substackcdn.com/image/fetch/$s_!xVmN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 848w, https://substackcdn.com/image/fetch/$s_!xVmN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 1272w, https://substackcdn.com/image/fetch/$s_!xVmN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fda320531-0f95-4a20-9a60-a5b753c975e6_2528x1696.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Yesterday, there was news about a threat actor compromising the popular LiteLLM package on PyPI. litellm is a ubiquitous library that many teams use as a unified interface to call different LLM APIs. The attacker didn&#8217;t need anything sophisticated. They poisoned the CI/CD pipeline through a compromised security scanner (the irony!), then pushed backdoored versions of litellm that harvested SSH keys, cloud credentials, and secrets on every Python startup. Seems like the hackers are now working hard - going through a 300 GB treasure trove and extorting multi-billion-dollar companies.</p><p>This is what third-party AI risk actually looks like. Not the dramatic scenarios we imagine - collusion, rogue AI, coordinated manipulation. Sometimes it&#8217;s a simple file in a package you installed last Tuesday.</p><p>And it&#8217;s not just open-source libraries. Every AI vendor you use sits on its own supply chain of dependencies, sub-processors, and infrastructure providers. A single compromised link and you inherit the risk.</p><p>So how do you govern AI you don&#8217;t control? And maybe don&#8217;t even know exists.</p><p>I can&#8217;t say I have the answer. But I can offer a reading list.</p><p><em>Note: I have used open-access links from arXiv as far as possible.</em></p><div><hr></div><h2><strong>Where and Why Does It Break?</strong></h2><p><strong>1. &#8220;AI Auditing: The Broken Bus on the Road to AI Accountability&#8221;</strong> - Birhane et al. | <em>IEEE SaTML (2024)</em></p><p>Taxonomizes current AI audit practices across regulators, law firms, civil society, journalism, academia, and consulting. Finds that only a subset of AI audit studies translate to desired accountability outcomes. The title says it all - we&#8217;re doing audits, but many of them aren&#8217;t actually getting us where we need to go. 
&#128196; <a href="https://arxiv.org/abs/2401.14462">arXiv:2401.14462</a></p><p><strong>2. &#8220;Dislocated Accountabilities in the &#8216;AI Supply Chain&#8217;: Modularity and Developers&#8217; Notions of Responsibility&#8221;</strong> - Widder &amp; Nafus | <em>Big Data &amp; Society (2023)</em></p><p>Developers building AI from preexisting modules often believe responsible AI belongs to &#8220;the next or previous person in the imagined supply chain.&#8221; Everyone assumes someone else is managing the risk. &#128196; <a href="https://arxiv.org/abs/2209.09780">arXiv:2209.09780</a></p><p><strong>3. &#8220;Understanding Accountability in Algorithmic Supply Chains&#8221;</strong> - Cobbe, Veale &amp; Singh | <em>FAccT (2023)</em></p><p>Explores how algorithmic supply chains create distributed responsibility and limited visibility due to what the authors call the &#8220;accountability horizon&#8221; - you can only see so far. Also covers cross-border supply chains and regulatory arbitrage, which matters when your AI vendor operates across jurisdictions with different rules. &#128196; <a href="https://arxiv.org/abs/2304.14749">arXiv:2304.14749</a></p><h2><strong>How Can We Perhaps Solve It?</strong></h2><p><strong>1. &#8220;AEF-1: Minimum Operating Conditions for Independent Third Party AI Evaluations&#8221;</strong> - Stosz et al. | <em>AI Evaluation Foundation (2025)</em></p><p>A voluntary standard and checklist that defines what evaluators actually need from AI providers: independence from the provider, sufficient access to assess characteristics of interest, and transparency in sharing methods and findings. The gap between what this standard says you need and what most vendors actually give you <em>is</em> the governance gap. &#128279; <a href="https://www.aef.one/aef-one.pdf">AI Evaluation Foundation</a></p><p><strong>2. &#8220;Outsider Oversight: Designing a Third Party Audit Ecosystem for AI Governance&#8221;</strong> - Raji et al. 
| <em>AAAI/ACM AIES (2022)</em></p><p>Synthesizes lessons from financial, environmental, and health regulation on crafting effective external oversight systems. The key insight for me: audits alone won&#8217;t achieve accountability. You need deliberate design and institutional weight, the same lesson financial regulators learned after Enron. &#128196; <a href="https://arxiv.org/abs/2206.04737">arXiv:2206.04737</a></p><p><strong>3. &#8220;Model Cards for Model Reporting&#8221;</strong> - Mitchell et al. | <em>FAccT (2019)</em></p><p>The foundational paper proposing that released models be accompanied by documentation detailing their performance characteristics. This paper is from 2019. It&#8217;s now 2026. Notice how few vendors actually provide this level of transparency. In fact, some studies have shown that such transparency is declining, rather than increasing as greater awareness might lead one to expect. &#128196; <a href="https://arxiv.org/abs/1810.03993">arXiv:1810.03993</a></p><p><strong>4. &#8220;Third-party compliance reviews for frontier AI safety frameworks&#8221;</strong> - Homewood et al. | <em>arXiv preprint (2025)</em></p><p>Explores third-party compliance reviews where an independent external party assesses whether a frontier AI company complies with its own safety framework. Discusses benefits (increased compliance, assurance to stakeholders) and real challenges (information security risks, cost burdens, reputational damage from findings). This is the emerging infrastructure for keeping vendors accountable - but it&#8217;s still nascent. &#128196; <a href="https://arxiv.org/abs/2505.01643">arXiv:2505.01643</a></p><p><strong>5. &#8220;Implementing AI Bill of Materials (AI BOM) with SPDX 3.0&#8221;</strong> - Bennet et al. | <em>Linux Foundation Research (2025)</em></p><p>Extends the Software Bill of Materials concept to AI, including documentation of algorithms, data collection methods, frameworks, licensing, and compliance. 
If you want to know what&#8217;s actually inside the AI system you&#8217;re buying - and what changes when the vendor updates it - this is the direction. Think of it as the AI equivalent of ingredient labelling. &#128196; <a href="https://arxiv.org/abs/2504.16743">arXiv:2504.16743</a></p><p><strong>6. &#8220;AgentFacts: Universal KYA Standard for Verified AI Agent Metadata &amp; Deployment&#8221;</strong> - Grogan | <em>arXiv preprint (2025)</em></p><p>Forward-looking. Proposes a &#8220;Know Your Agent&#8221; standard with cryptographically-signed capability declarations and multi-authority validation. If something like this existed at scale, it would reduce the custom integration friction that makes switching so expensive. We&#8217;re not there yet, but this is the direction things need to go. &#128196; <a href="https://arxiv.org/abs/2506.13794">arXiv:2506.13794</a></p><p><strong>7. &#8220;Consultation Paper on Proposed Guidelines on Third-Party Risk Management&#8221;</strong> - Monetary Authority of Singapore | <em>MAS (March 2026)</em></p><p>And last but not least. MAS just released this in March 2026. Supersedes the old outsourcing guidelines and extends expectations to all third-party arrangements, not just outsourced services. Covers risk assessment, due diligence, contracting, onboarding, ongoing monitoring, and termination. Requires FIs to maintain a register of third-party arrangements, monitor concentration risk, and extend oversight to sub-contractors. AI appears in a footnote, which refers to the MAS AI Risk Management Guidelines I wrote, but the relevance of these guidelines to AI is clear. &#128279; <a href="https://www.mas.gov.sg/-/media/mas-media-library/publications/consultations/bd/2026/consultation-paper---tprmg.pdf">MAS Consultation Paper</a></p><p>I think this is one of the hardest problems to solve in AI risk management. Because it requires everyone to work together. 
And that&#8217;s really hard today.</p><p>#ThirdPartyAI #AIRiskManagement #VendorManagement #AIGovernance #AIReadingList</p>]]></content:encoded></item><item><title><![CDATA[The AI Governance Tool]]></title><description><![CDATA[A simply boring attempt at making AI governance and risk management accessible]]></description><link>https://www.simplyboring.ai/p/the-ai-governance-tool</link><guid isPermaLink="false">https://www.simplyboring.ai/p/the-ai-governance-tool</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Mon, 23 Mar 2026 13:16:04 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fKDk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you&#8217;re infantry in the Singapore army, you would probably have encountered this protocol to shout &#8220;gap gap gap&#8221; when you breach a fortification.</p><p>I always thought it was kind of stupid. Since shouting it was asking for the bullets to be aimed at you. I always felt I was lucky to move from infantry to be a scout. Scouts never ran towards bullets.</p><p>But I think I understand the need for it more now. Shouting it in the army helps make people aware of the gap that you are moving towards.</p><p>In one of my past weekly reflections on <a href="https://www.linkedin.com/pulse/edge-action-gary-ang-phd-f6vmc">LinkedIn</a>, I said three gaps kept surfacing. The common language gap. The contextualisation gap. The last-mile gap. Same need for awareness.</p><p>Frameworks exist. Guidelines exist. I wrote some of them. But translating frameworks into something relevant for one&#8217;s context is not easy. Especially for individuals and smaller firms.</p><p>Last week, MAS launched the <a href="https://www.mas.gov.sg/schemes-and-initiatives/project-mindforge">MindForge AI Risk Management Toolkit</a>. 
I was deeply involved as it needed to be aligned to the <a href="https://www.mas.gov.sg/-/media/mas-media-library/publications/consultations/bd/2025/final_consultation_paper_on_guidelines_on_ai_risk_management_forrelease.pdf">MAS AI Risk Management Guidelines</a> I wrote. </p><p>The MindForge AI Risk Management Toolkit is an operationalisation handbook with case studies. It&#8217;s an attempt at bridging the messy middle I keep writing about. But I suspect, aside from large financial institutions, it may still be a lot to chew on for everyone else.</p><p>Last week, a friend who leads manpower development also asked me something similar: what about everyone else?</p><blockquote><p><em>This is my early attempt at an answer.</em></p></blockquote><p>GOT, not Game of Thrones, but an AI Governance Tool. </p><p>At <a href="https://govern.simplyboring.ai/">govern.simplyboring.ai</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fKDk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fKDk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 424w, https://substackcdn.com/image/fetch/$s_!fKDk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 848w, https://substackcdn.com/image/fetch/$s_!fKDk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fKDk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fKDk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png" width="725" height="806.2609457092819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:635,&quot;width&quot;:571,&quot;resizeWidth&quot;:725,&quot;bytes&quot;:62104,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/191861678?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fKDk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 424w, https://substackcdn.com/image/fetch/$s_!fKDk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 848w, 
https://substackcdn.com/image/fetch/$s_!fKDk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 1272w, https://substackcdn.com/image/fetch/$s_!fKDk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F352bbd43-683b-4ed7-9c13-7a2e305493b0_571x635.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Subscribe to my Substack to get keys for more generations.</figcaption></figure></div><p>Describe what you are using AI 
for. Get a tailored governance pack in minutes. The standards that apply to your situation. The controls to put in place. How to implement them. What evidence to prepare.</p><p>Built on public information. The MAS AI Risk Management Guidelines I wrote. The MindForge toolkit that just launched. The US Financial Services AI Risk Management Framework. The EU AI Act. NIST AI RMF. ISO 42001. And other international frameworks. Cross-referenced and connected. And a little of my own perspective.</p><p>The AIRG (the MAS AI Risk Management Guidelines) tells you what to do. MindForge shows how the industry is doing it. GOT asks what you specifically are doing, and tells you what applies to you.</p><p>Not just for the large institution with a full second line (and third line) of defence. For everyone else. Individuals. Small firms.</p><p>It&#8217;s an early prototype. It will almost certainly have errors. It is also a little slow - around 45-60 seconds for each report. And I think it&#8217;s still a little iffy for individuals.</p><p>But I tried to ground it in real frameworks and design it for real situations so that anyone can use it, not just banks.</p><p>Please try it. Tell me what&#8217;s missing. 
Tell me what&#8217;s wrong.</p><p>#Mindforge #NIST #AIRiskManagement #SimplyBoringAI #Grounding #AIRG</p>]]></content:encoded></item><item><title><![CDATA[OpenClaw - Kiasu Version]]></title><description><![CDATA[Kiasu just means scared to lose.]]></description><link>https://www.simplyboring.ai/p/openclaw-kiasu-version</link><guid isPermaLink="false">https://www.simplyboring.ai/p/openclaw-kiasu-version</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Tue, 17 Mar 2026 11:41:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QG05!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QG05!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QG05!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QG05!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QG05!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!QG05!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QG05!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg" width="522" height="293.625" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:522,&quot;bytes&quot;:187905,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/191241008?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QG05!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QG05!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!QG05!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QG05!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1178556c-0bbf-4d28-8621-b99657ce3d2f_1920x1080.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I don&#8217;t like OpenClaw. 
But it&#8217;s pretty hard to resist the urge to try it.</p><p>I think it does demonstrate something interesting about agents - 1) the additional utility when you hook it to a heartbeat to do something at a set frequency, or cron for a scheduled task; 2) the convenience of letting it communicate with you over a messaging app.</p><p>But I really can&#8217;t see what else it offers in addition to, say, Claude Code, which already allows for multi-agent workflows.</p><p>OK, if being exposed to the internet or losing your data is your thing, then I agree, OpenClaw without any additional safeguards is definitely for you.</p><p>OpenClaw&#8217;s issues must be obvious. NVIDIA just announced NemoClaw at GTC - essentially OpenClaw with enterprise-grade security guardrails baked in. NanoClaw, a lightweight alternative, takes a different approach by isolating each agent in its own Docker container.</p><p>And so when I tried it, I made some adjustments to help mitigate these risks. Probably not perfect, but I think I can live with the remaining risks with these controls in place. And I think you can use them for any of the other Claws too.</p><p>So here&#8217;s the short guide on what I did. I won&#8217;t copy and paste all the scripts. Just use my text below as a prompt and I am sure any of the LLMs can help you.</p><h3><strong>The Setup</strong></h3><p>I run it on a DigitalOcean VPS - Ubuntu 24.04, 2 vCPU, 2GB RAM, $12/month, Singapore region. 
I went with a manual setup instead of the 1-Click app because I wanted full control over every layer of the stack.</p><p>The finished stack looks like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RuYi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RuYi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 424w, https://substackcdn.com/image/fetch/$s_!RuYi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 848w, https://substackcdn.com/image/fetch/$s_!RuYi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 1272w, https://substackcdn.com/image/fetch/$s_!RuYi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RuYi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png" width="1456" height="612" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:612,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!RuYi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 424w, https://substackcdn.com/image/fetch/$s_!RuYi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 848w, https://substackcdn.com/image/fetch/$s_!RuYi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 1272w, https://substackcdn.com/image/fetch/$s_!RuYi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1d072653-61d5-483b-b41c-eb875cf8c29f_1599x672.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Only I can access OpenClaw through Tailscale, and I choose the files that it can see using Syncthing.</figcaption></figure></div><p>What I use it for: OpenClaw runs 24/7 on the VPS. Every hour it wakes up, checks my task files, and stays silent if nothing is urgent. Every morning it sends me a briefing on Telegram. Claude Code runs locally for deep work and pushes outputs into a shared AGENT-INBOX folder. OpenClaw picks them up.</p><p>Total cost: $12/month for the server. OpenClaw lets you plug in your own API key. An option to use your existing Claude key comes up after you run openclaw onboard.</p><p>Now, on to the controls.</p><h2><strong>Control 1: Server Hardening</strong></h2><p>Out of the box, a VPS is exposed. Default root login, password authentication, every port open. The first thing to do is reduce the attack surface.</p><p>Every service you run is an attack surface. 
Make each one as small as possible, and add friction at every layer so that even if one layer fails, the next one holds.</p><p>The key steps: update everything, create a non-root user (don&#8217;t run everything as root), disable password authentication so only SSH keys can log in, and disable root login over SSH entirely.</p><p>I won&#8217;t paste every command here - ask Claude or ChatGPT for the steps to harden an Ubuntu 24.04 server and it&#8217;ll give you the full thing.</p><h2><strong>Control 2: Firewall and Fail2ban</strong></h2><p>Two layers. UFW (Uncomplicated Firewall) controls who can knock on your door at all - deny all incoming by default, then explicitly allow only what you need. Fail2ban monitors your login logs and automatically bans IPs that fail authentication too many times.</p><p>One thing to watch out for: enable UFW <em>after</em> you allow SSH, not before. Get the order wrong and you immediately lose access to your own server.</p><p>But even with both of these, your SSH port is still visible to the public internet. Bots still find it, still try it. Fail2ban is a mitigation, not a solution.</p><h2><strong>Control 3: Tailscale - Making the Server Invisible</strong></h2><p>What if the SSH port wasn&#8217;t visible at all?</p><p>Tailscale creates a private encrypted network across the internet - but only devices you authenticate can join. Think of it as a private LAN that ignores physical location. Every device gets a private IP in the 100.x.x.x range. Your VPS gets one. Your laptop gets one. They talk through an encrypted WireGuard tunnel regardless of where they physically are.</p><p>The key part: your VPS&#8217;s <em>public</em> IP can have all ports closed. Tailscale establishes peer-to-peer connections through NAT traversal - no open ports required. 
Install it on the VPS, install it on your laptop, log in with the same account - done.</p><p>Once Tailscale is working, update your UFW rules to allow SSH only on the Tailscale interface, and remove the public SSH rule. After this, run a port scan on your VPS&#8217;s public IP. You&#8217;ll see nothing. The server is invisible. Fail2ban becomes a backup layer rather than the first line of defence.</p><p>As always, test SSH via the Tailscale IP in a second terminal before removing public access.</p><h3><strong>Installing OpenClaw</strong></h3><p>With the controls in place, this is the straightforward part.</p><pre><code><code>curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -</code></code></pre><pre><code><code>sudo apt install -y nodejs</code></code></pre><pre><code><code>sudo npm install -g openclaw</code></code></pre><pre><code><code>openclaw onboard</code></code></pre><p>For NemoClaw, it will probably look like this (based on NVIDIA&#8217;s instructions):</p><pre><code><code>curl -fsSL https://nvidia.com/nemoclaw.sh | bash</code></code></pre><pre><code><code>nemoclaw onboard</code></code></pre><p>The setup wizard walks you through choosing an AI provider (I use Anthropic), an API key, and a gateway token. It installs OpenClaw as a <strong>user-level systemd service</strong> - not a system-level one. This matters:</p><p>This works:</p><pre><code><code>systemctl --user restart openclaw-gateway</code></code></pre><p>This silently fails:</p><pre><code><code>systemctl restart openclaw-gateway</code></code></pre><p>Every OpenClaw service command needs --user. I burned time on this.</p><h3><strong>The dashboard</strong></h3><p>OpenClaw has a web dashboard. 
It binds to <strong>localhost</strong> only, so you access it via an SSH tunnel:</p><pre><code><code>openclaw dashboard    # prints URL with token</code></code></pre><pre><code><code>ssh -N -L 18789:127.0.0.1:18789 openclaw@YOUR_TAILSCALE_IP</code></code></pre><p>Then open the <strong>full URL</strong> including the #token= fragment in your browser.</p><p><strong>Don&#8217;t open http://localhost:18789/ bare.</strong> The #token= fragment is the authentication handshake. Without it, the gateway sees an unapproved device and shows &#8220;pairing required.&#8221; The dashboard looks like it loaded - it hasn&#8217;t connected. Use the full URL from openclaw dashboard.</p><h2><strong>Syncthing - The Sync Layer</strong></h2><p>OpenClaw on the VPS needs a way to receive tasks and send outputs back. I use Syncthing - open-source, peer-to-peer file sync, no cloud intermediary.</p><p>The synced folder (AGENT-INBOX) lives inside my Obsidian vault on Windows and mirrors to the remote VPS. I write tasks in Obsidian, and they appear on the VPS within seconds. OpenClaw writes outputs on the VPS, and they appear in Obsidian within seconds.</p><p>One thing to watch out for: install Syncthing from the official apt repository, not Ubuntu&#8217;s default packages - the default version is outdated enough to cause silent sync failures when paired with a newer Windows client.</p><h2><strong>The Heartbeat</strong></h2><p>The heartbeat is the reason I tried OpenClaw in the first place. 
It&#8217;s the cron-like behavior - wake up on a schedule, check your files, act without being asked.</p><pre><code><code>openclaw config set agents.defaults.heartbeat.every 1h</code></code></pre><pre><code><code>openclaw config set agents.defaults.heartbeat.target last</code></code></pre><pre><code><code>openclaw config set agents.defaults.heartbeat.activeHours.start "06:00"</code></code></pre><pre><code><code>openclaw config set agents.defaults.heartbeat.activeHours.end "01:00"</code></code></pre><pre><code><code>openclaw config set env.vars.TZ "Asia/Singapore"</code></code></pre><p>Note: use the full dotpath agents.defaults.heartbeat.every, not just heartbeat.every. The docs show the JSON structure, but the CLI uses dotpaths.</p><p>If OpenClaw finds nothing urgent, it stays silent. No Telegram noise. Only substantive findings surface. This is controlled by a HEARTBEAT.md in your workspace - a checklist the agent follows on every run.</p><p>One important step after setup: delete the BOOTSTRAP.md file in your workspace. It&#8217;s OpenClaw&#8217;s first-run onboarding wizard, and it blocks the agent on every gateway start until a human responds. Once you&#8217;ve completed setup, remove it or the agent stays stuck in bootstrapping state indefinitely.</p><h2><strong>The Security Audit</strong></h2><p>OpenClaw has a built-in security audit you can run with <em>openclaw security audit --deep</em>. Worth running after setup to confirm your controls are in place.</p><p>The most important thing it tells you is the trust model. OpenClaw operates as a &#8220;personal assistant&#8221; - single user, single trusted operator. It&#8217;s not designed for multiple users sharing one gateway. 
If you&#8217;re thinking about running this for a team or clients, that&#8217;s the line that should make you pause.</p><h2><strong>Is It Worth It?</strong></h2><p>Honestly, I&#8217;m still not sure.</p><p>The heartbeat and the Telegram integration are genuinely useful - having an agent check on things hourly and surface only what matters is different from having a chatbot you talk to. And the data sovereignty matters to me.</p><p>I&#8217;ll keep running OpenClaw for a while to see if it sticks. See if the habit of an always-on agent changes how I work. But I wouldn&#8217;t tell anyone to just run it.</p><p>If I were starting today, I&#8217;d probably look at NanoClaw first - it&#8217;s ~4,000 lines of code versus OpenClaw&#8217;s more than 400,000, and each agent gets its own isolated container. NemoClaw is worth trying too, especially if you&#8217;re in an enterprise context. Both are open source.</p><p>But the manual controls I described above? They&#8217;d apply to any of these. Server hardening, Tailscale, firewall rules - that&#8217;s not OpenClaw-specific.</p><p>If you&#8217;ve tried self-hosting OpenClaw (or variants such as NanoClaw), what additional controls did you put in place? 
Also, what else do you find useful about it (beyond say Claude Code or Codex) that warrants the effort?</p><p>#AI #AgenticAI #Security #OpenClaw #NemoClaw</p>]]></content:encoded></item><item><title><![CDATA[From Contrastive Learning to World Models]]></title><description><![CDATA[Yann LeCun&#8217;s AMI Labs raising ~$1bn in its seed round and having one of its key bases in Singapore just made the news.]]></description><link>https://www.simplyboring.ai/p/from-contrastive-learning-to-world</link><guid isPermaLink="false">https://www.simplyboring.ai/p/from-contrastive-learning-to-world</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Wed, 11 Mar 2026 14:30:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!swTk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Yann LeCun&#8217;s AMI Labs raising ~$1bn in its seed round and having one of its key bases in Singapore just made the news. $1bn for a seed round is kind of insane. AI startup seed rounds are more like Series Zs these days.<br><br>That aside, this news brought back some memories of an older technique from LeCun&#8217;s lab, titled VICReg, that inspired one of my papers. And led me to reread his JEPA series. 
<br><br>Today's diptych is about that connection.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!swTk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!swTk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!swTk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!swTk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!swTk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!swTk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg" width="1456" height="1165" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1165,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:459458,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/190623557?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!swTk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!swTk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!swTk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!swTk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f92205-59a6-49be-8672-9820b79a9dae_2000x1600.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><strong>Left Panel: VICReg</strong> <br><br>Self-supervised training in AI just means training without labels. We use it in LLMs, for example by pretraining a model to predict masked or upcoming parts of its input. One issue in self-supervised learning is collapse. AI likes to find shortcuts. And when it does, everything converges to the mean. <br><br>Most methods fight this with training or architectural tricks. VICReg uses just three interesting loss terms:<br>&#10145;&#65039; Variance so that there are always some differences between the representations it learns. <br>&#10145;&#65039; Invariance so that different views of the same input give similar representations.<br>&#10145;&#65039; Covariance so that each dimension learns something different. <br><br>Invariance is the main learning objective. 
Variance + covariance are the regularisers that prevent it from collapsing. <br><br>I liked VICReg&#8217;s idea. So I added a twist for a paper I published on a model called DynMIX. I fed it two fundamentally different views of the same company - explicit (constructed) and implicit (learnt from data). Another twist. VICReg's variance target is static. For financial data, that made no sense. Markets are volatile so variance can&#8217;t be static. So I made the target dynamic, based on variances of stock market returns. <br><br><strong>Right Panel: JEPA</strong> <br><br>The core of the paper by Yann LeCun - JEPA - takes a different approach to the same problem. Instead of just learning representations, it predicts them. In AI, we usually predict something real - a label (dog, cat), or a number (stock price). JEPA predicts the underlying representation instead. Think of it as the model learning to imagine what something looks like in its own internal language, before it's seen in the real world.<br><br>Three components: <br>&#10145;&#65039; Context encoder: encodes what's visible into a representation <br>&#10145;&#65039; Target encoder: encodes what's hidden into a representation (this IS the label) <br>&#10145;&#65039; Predictor: tries to predict the hidden from the visible<br><br>So what does this unlock? Once representations of one modality (e.g., videos) are learnt, the next stage adds actions. A second model takes the video frame embeddings interleaved with a robot arm's movement commands, and predicts the next embedding. Planning then becomes: imagine many possible sequences of moves, simulate each one in the model's head, and pick the one most likely to reach the goal. Like a chess player thinking several moves ahead, but the "board" is the model's own internal understanding of the world.<br><br>Interesting times ahead. 
If AMI Labs succeeds.<br><br>#RepresentationLearning #JEPA #VICReg #YannLeCun #AIResearch</p>]]></content:encoded></item><item><title><![CDATA[Three Speeds of AI Adoption]]></title><description><![CDATA[And three rooms]]></description><link>https://www.simplyboring.ai/p/three-speeds-of-ai-adoption</link><guid isPermaLink="false">https://www.simplyboring.ai/p/three-speeds-of-ai-adoption</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Wed, 04 Mar 2026 15:45:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xxRi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><em>Fire people, stock goes up. Beat expectations, stock goes down.</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xxRi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xxRi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 424w, https://substackcdn.com/image/fetch/$s_!xxRi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 848w, https://substackcdn.com/image/fetch/$s_!xxRi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xxRi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xxRi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png" width="496" height="281.17725752508363" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:339,&quot;width&quot;:598,&quot;resizeWidth&quot;:496,&quot;bytes&quot;:65998,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/189888020?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xxRi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 424w, https://substackcdn.com/image/fetch/$s_!xxRi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 848w, 
https://substackcdn.com/image/fetch/$s_!xxRi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 1272w, https://substackcdn.com/image/fetch/$s_!xxRi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d01878a-3325-481f-bcfb-0c17d724ca19_598x339.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">I thought this was both funny and sad.</figcaption></figure></div><p>I started writing about this last week. 
But with the Iran war going on now, this feels almost quaint.</p><p>Predictions about a <a href="https://www.citriniresearch.com/p/2028gic">doomsday scenario where AI agents collapse the job market by 2028</a>. Headlines about Dorsey&#8217;s <a href="https://www.reuters.com/business/blocks-fourth-quarter-profit-rises-announces-over-4000-job-cuts-2026-02-26/">Block cutting over 4,000 roles</a> - over 40% of its workforce. Gone, just like that. His excuse was AI, as always. Block&#8217;s stock price shot up. And then the irony - <a href="https://www.cnbc.com/2026/02/26/nvidia-nvda-stock-price-q4-earnings.html">Nvidia beat expectations</a>, revenue up ~70%, guidance ahead of even the most bullish estimates. The stock dropped 5.5%. <a href="https://www.tradingkey.com/analysis/stocks/us-stocks/261628049-nvidia-nvda-earnings-q4-stock-price-investors-value-tradingkey">$260 billion in market value erased overnight</a>.</p><p>Fire people, stock goes up. Beat expectations, stock goes down. That&#8217;s the weird logic of markets right now. And I think it mirrors the weird logic of the AI debate. The narrative has decoupled from the evidence. What you believe matters more than what you can show.</p><p>Let me compare three rooms.</p><h2>Three Rooms</h2><p><strong>Room one.</strong> Block&#8217;s numbers. An internal AI agent called Goose. One engineer says 90% of his code is now written by it. Non-technical teams writing SQL queries and closing tickets. Revenue per employee doubled.</p><p><strong>Room two.</strong> A classroom. Students from an organization that will exist in fifty years regardless of what happens in AI. What were they learning? How to write a prompt. How to get an image out of a model. In 2026. They weren&#8217;t behind because they were slow. They were behind because there&#8217;s a chasm between what frontier labs ship and what most organizations can absorb.</p><p><strong>Room three.</strong> Trading professionals at a talk I gave. 
They&#8217;d read the doom scenarios. They wanted to know - should I be worried about my career? They weren&#8217;t panicking. But they weren&#8217;t dismissing it either. Sitting in the uncertainty, looking for clarity.</p><p>I tried to answer, but I thought my answer was lacking. So I tried to do better with this article.</p><p>Why do I think a piece like Citrini and Shah&#8217;s <a href="https://www.citriniresearch.com/p/2028gic">&#8220;The 2028 Global Intelligence Crisis&#8221;</a> is science fiction designed to go viral? Because of three speeds that may not be moving in lockstep.</p><h2>The Three Speeds</h2><p>All of these pieces assume uniformity in scaling and adoption. But I think there are at least three speeds that matter.</p><p>METR - Model Evaluation &amp; Threat Research - is a nonprofit that evaluates AI capabilities. Their time horizon chart - arguably the most cited graphic in AI right now - <a href="https://www.technologyreview.com/2026/02/05/1132254/this-is-the-most-misunderstood-graph-in-ai/">has been called</a> &#8220;the most misunderstood graph in AI.&#8221; It shows AI capability doubling every few months. Sequoia used it to declare &#8220;2026: This is AGI.&#8221;</p><p>What that chart actually says about each speed:</p><p><strong>Speed of Capability.</strong> What AI can actually do. Far fewer people saw <a href="https://metr.org/notes/2026-01-22-time-horizon-limitations/">METR&#8217;s own limitations page</a> than saw the chart. More than 10 disclaimers about what their research is not. The researcher writes plainly: they have &#8220;no idea whether Claude&#8217;s &#8216;true&#8217; time horizon is 3.5h or 6.5h.&#8221; Their <a href="https://metr.org/blog/2025-08-12-research-update-towards-reconciling-slowdown-with-time-horizons/">code quality research</a>: AI-generated code passed 38% of tests but &#8220;none of them are mergeable as-is.&#8221; But you don&#8217;t need METR to tell you this. 
When was the last time you spotted your LLM spouting nonsense? Less common than in 2023, but definitely not zero.</p><p><strong>Speed of Adoption.</strong> What organizations actually change. You might remember the discredited MIT study claiming 95% of AI pilots fail - the methodology didn&#8217;t hold up. So I went looking at what has happened since. <a href="https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai">McKinsey&#8217;s 2025 State of AI survey</a>, from the end of 2025: 88% of companies say they use AI, but only a third have begun to scale it. A <a href="https://www.techrepublic.com/article/ai-adoption-trends-enterprise/">survey of 120,000+ enterprise respondents</a> from March 2025 to January 2026 reported that nearly two-thirds have no formalized AI initiative at all. But you don&#8217;t need McKinsey to tell you this. Look around your own workplace. How many AI pilots have started? How many are in production?</p><p><strong>Speed of Belief.</strong> What people <em>think</em> AI can do. The chart made it into every investment deck. The caveats didn&#8217;t. Same with Citrini&#8217;s doom scenario - it <a href="https://www.bloomberg.com/news/articles/2026-02-24/citrini-founder-shocked-his-ai-prediction-spurred-stocks-selloff">spooked actual markets</a> before anyone could verify the underlying technical assumptions. Same with Block - the headline landed, and every manager with a budget started rethinking next year&#8217;s headcount because their boss could be like Dorsey. Doesn&#8217;t matter if their company has nothing remotely comparable to Goose. Charts travel faster than FAQs with caveats on limitations. And I am sure you know how dangerous mistaken beliefs in large organizations can be. When&#8217;s the last time you spotted a senior person saying something totally wrong, with nobody correcting him or her?</p><h2>The Underlying Issue</h2><p>The doom scenario by Citrini and Shah needs all three speeds to converge. 
Capability, belief, and adoption in lockstep.</p><p>But the problem is, belief doesn&#8217;t need the other two.</p><p>Block just showed that AI can replace headcount. At least in its specific context. But the headline doesn&#8217;t come with that caveat. So what happens next? A manager somewhere reads it and quietly decides not to backfill that open role. A team that was supposed to grow just doesn&#8217;t. None of that requires AI to actually do the work. It just requires someone with power to believe it could.</p><p>Think back to 2008. Not the mechanics - those are different. The underlying issue with such beliefs is the same.</p><p>In 2008, banks sold complex structured products to people who had no ability to understand what they were buying. CDOs priced by models that the sellers themselves barely understood. The belief in the models was enough to move trillions. The mom-and-pop investors were the ones left holding the bag when reality caught up.</p><p>I think something similar is happening now. Not with financial products, but with people&#8217;s livelihoods. Block cuts over 4,000 roles and the stock jumps. The market cheers. The narrative is: AI made it possible. But how much of that is demonstrated capability, and how much is a belief about capability that hasn&#8217;t been stress-tested outside one company&#8217;s very specific context? And perhaps unproven, even for that company.</p><p>It was unconscionable then to sell instruments people couldn&#8217;t understand to people who couldn&#8217;t afford to lose. I think it&#8217;s unconscionable now to restructure people&#8217;s lives based on a chart that its own creators say they can&#8217;t fully trust, and a narrative that outpaces the evidence by miles.</p><p>The direction is right. AI will displace some work. But the timeline assumes a uniformity that doesn&#8217;t match what I&#8217;m seeing in real rooms. 
And the human cost of getting the timeline wrong - of letting belief run ahead of reality - is not a rounding error. It&#8217;s people.</p><p>#AI #FutureOfWork #AIAdoption #AIRisk #Leadership</p>]]></content:encoded></item><item><title><![CDATA[Learning risk management again, because of AI]]></title><description><![CDATA[A FIX NextGen meeting at BlackRock.]]></description><link>https://www.simplyboring.ai/p/learning-risk-management-again-because</link><guid isPermaLink="false">https://www.simplyboring.ai/p/learning-risk-management-again-because</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Mon, 02 Mar 2026 14:27:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!aq50!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aq50!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aq50!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 424w, https://substackcdn.com/image/fetch/$s_!aq50!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!aq50!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aq50!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aq50!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg" width="1187" height="723" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:723,&quot;width&quot;:1187,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:116845,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/189655106?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aq50!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!aq50!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 848w, https://substackcdn.com/image/fetch/$s_!aq50!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!aq50!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecd3bf1f-eeeb-4069-8e96-305293a4773f_1187x723.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>A FIX NextGen meeting at BlackRock. Folks who live and breathe the markets. And then there&#8217;s me.</p><p>I started by owning up. &#8220;I am a bureaucrat.&#8221; But shared my random walk. From engineering to art policy, Basel regulation to investment risk, a PhD in AI, then AI risk supervision. Messy life. Messy research. I wasn&#8217;t sure what they would be interested in, so I just laid it all out.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0ea3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0ea3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 424w, https://substackcdn.com/image/fetch/$s_!0ea3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 848w, https://substackcdn.com/image/fetch/$s_!0ea3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 1272w, https://substackcdn.com/image/fetch/$s_!0ea3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0ea3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png" width="632" height="350" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:350,&quot;width&quot;:632,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0ea3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 424w, https://substackcdn.com/image/fetch/$s_!0ea3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 848w, https://substackcdn.com/image/fetch/$s_!0ea3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 1272w, https://substackcdn.com/image/fetch/$s_!0ea3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0f9e5ef-70b9-4938-b7b2-53aa0d4da4c3_632x350.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As for the talk, I took them on a journey. From the Gaussian copula that contributed to the Global Financial Crisis (GFC), to the questions we&#8217;re tackling now with large language models.</p><p>That line (from the GFC to AI today) may seem strange. But it is a very clear one.</p><h2><strong>The Formula That Broke Finance</strong></h2><p>A single model. The single-factor Gaussian copula. Elegant, tractable, widely adopted for pricing CDOs. But totally unrealistic. 
In fact, I would say the decision to use it to model CDOs was bordering on criminal.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n-NF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n-NF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 424w, https://substackcdn.com/image/fetch/$s_!n-NF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 848w, https://substackcdn.com/image/fetch/$s_!n-NF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 1272w, https://substackcdn.com/image/fetch/$s_!n-NF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n-NF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png" width="630" height="346" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:346,&quot;width&quot;:630,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!n-NF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 424w, https://substackcdn.com/image/fetch/$s_!n-NF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 848w, https://substackcdn.com/image/fetch/$s_!n-NF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 1272w, https://substackcdn.com/image/fetch/$s_!n-NF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3192a5da-a308-4c65-9b5b-afaa79035022_630x346.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But even then, the GFC would not have happened if not for the fact that nobody was watching these complex things, nobody fully understood them, and nobody was clearly accountable.</p><p>In the aftermath: the US&#8217;s SR 11-7. Model risk management requirements. The predecessor of the UK&#8217;s SS 1/23 and Canada&#8217;s E-23. Singapore&#8217;s AI Risk Management Guidelines (AIRG) also share a lot with these guidelines. I wrote that last piece.</p><h2><strong>Complexity Squared</strong></h2><p>Here&#8217;s how the problem has changed.</p><p><em><strong>2008: One model. Too simple. Trillions at stake. Nobody understood it.</strong></em></p><p><em><strong>2025: Way more than one model. Complexity&#178;. Everyone has an opinion.</strong></em></p><p>Different. But also recognizable.</p><p>That&#8217;s the risk. Things look different enough to seem like a new problem. They&#8217;re not entirely. 
But the new parts matter.</p><h2><strong>3 U&#8217;s</strong></h2><p>What&#8217;s actually new - or at least amplified - in AI models.</p><p><strong>Uncertainty</strong>. All models have it. How confident is the model in its output? Two flavors: irreducible - natural randomness you can&#8217;t eliminate. Reducible - gaps in knowledge that more data or better models can close.</p><p><strong>Unexpectedness</strong>. Some AI models exhibit behaviors nobody designed. Emergent capabilities. Gaming the system. Adversarial vulnerabilities. Hidden biases. Misalignment.</p><p><strong>Unexplainability</strong>. The degree to which we can explain AI decisions varies. Transparency, explainability, interpretability - not the same thing, and even combined, they don&#8217;t guarantee understanding.</p><p>These three make AI a harder risk management problem. Not a different one. Harder.</p><h2><strong>The Wicked Domain (and Then Some)</strong></h2><p>I used a framing from Epstein&#8217;s Range toward the end.</p><p>Kind domains - chess, music - have clear feedback. Deliberate practice compounds. The 10,000-hour idea in Outliers by Malcolm Gladwell works there. Depth helps here.</p><p>Wicked domains - medicine, finance - have delayed feedback, unclear rules, expertise that doesn&#8217;t transfer cleanly. Breadth sometimes helps here.</p><p>AI risk management sits firmly in wicked territory. But I think it might be something worse. An evil domain. The feedback is ambiguous, the rules keep shifting, and the expertise required keeps branching.</p><p>Not I-shaped for depth. Not T-shaped for breadth plus a bit of depth. More like a banyan tree. Multiple deep roots spreading from the same trunk. 
Depth across AI, governance, legal, model, technology, human factors - requiring real depth in each.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IOBA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IOBA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 424w, https://substackcdn.com/image/fetch/$s_!IOBA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 848w, https://substackcdn.com/image/fetch/$s_!IOBA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 1272w, https://substackcdn.com/image/fetch/$s_!IOBA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IOBA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png" width="631" height="353" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:353,&quot;width&quot;:631,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!IOBA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 424w, https://substackcdn.com/image/fetch/$s_!IOBA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 848w, https://substackcdn.com/image/fetch/$s_!IOBA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 1272w, https://substackcdn.com/image/fetch/$s_!IOBA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a7bfffa-ab97-415f-8307-8fd76cf3d0fd_631x353.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<h2><strong>Same Questions, New Complexity</strong></h2><p>Across my messy career - Basel policy, quant model risk, investment risk, AI supervision - three questions kept reappearing.</p><p><strong>What&#8217;s at risk? </strong>You can&#8217;t manage what you haven&#8217;t found. This means discovering where AI exists in your institution and profiling how risky each system actually is.</p><p><strong>How do we manage it?</strong> Once you know what&#8217;s at risk, you need controls in the right places, and ways to check if those controls are working.</p><p><strong>Who&#8217;s accountable?</strong> Controls without owners become theatre. Someone has to own the risk, and the organization needs the capability to sustain it.</p><p>Basically - find it, control it, own it.</p><p>As I wrote before, these aren&#8217;t new questions. They&#8217;re the same questions risk management in finance has been asking for decades. SR 11-7 answered them for model risk. 
AIRG is answering them again, in a harder context.</p><p>Normal technology. Normal systems. Normal risk questions.</p><p>At the end of the session there were some useful questions. My answers to most of those are in FIX&#8217;s post <strong><a href="https://www.linkedin.com/posts/fixapac_fixapac-ai-fintech-activity-7434105923352326144-vxI8?utm_source=social_share_send&amp;utm_medium=member_desktop_web&amp;rcm=ACoAAAnEwqsBmNv-udZ8tKaEG_MQGlUiz7C_KAg">here</a></strong>.</p><p>But there was one that stuck with me. About how real Citrini&#8217;s dystopian AI narrative was. I wrote about it last week <strong><a href="https://www.linkedin.com/posts/garyang_ai-futureofwork-airisk-activity-7431661800175349760-P6cC?utm_source=social_share_send&amp;utm_medium=member_desktop_web&amp;rcm=ACoAAAnEwqsBmNv-udZ8tKaEG_MQGlUiz7C_KAg">here</a></strong>. But more thoughts came to mind because of the question. Will do another post in greater detail later.</p><p>#AI #AIRiskManagement #ModelRisk #FIXTradingCommunity #FinTech #NextGen</p>]]></content:encoded></item><item><title><![CDATA[A free 180 page ebook on AI Agents for Investing with code]]></title><description><![CDATA[But first, how this book was built, working with AI]]></description><link>https://www.simplyboring.ai/p/a-free-180-page-ebook-on-ai-agents</link><guid isPermaLink="false">https://www.simplyboring.ai/p/a-free-180-page-ebook-on-ai-agents</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Tue, 24 Feb 2026 11:19:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UJp8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!UJp8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UJp8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 424w, https://substackcdn.com/image/fetch/$s_!UJp8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 848w, https://substackcdn.com/image/fetch/$s_!UJp8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 1272w, https://substackcdn.com/image/fetch/$s_!UJp8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UJp8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png" width="304" height="485.1063829787234" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2250,&quot;width&quot;:1410,&quot;resizeWidth&quot;:304,&quot;bytes&quot;:2472576,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/189004038?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UJp8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 424w, https://substackcdn.com/image/fetch/$s_!UJp8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 848w, https://substackcdn.com/image/fetch/$s_!UJp8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 1272w, https://substackcdn.com/image/fetch/$s_!UJp8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4828e1af-b240-492f-9f40-e70e990f9698_1410x2250.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">The cover</figcaption></figure></div>
<p>When I shared an early draft with a friend, his first question was:</p><blockquote><p><em>&#8220;How much of this was AI?&#8221;</em></p></blockquote><p>For a while, it made me a little sensitive. Even made me question if I should share this book.</p><p>Not because the question was unfair. It&#8217;s a reasonable thing to ask. It&#8217;s almost automatic to question the origins of any work these days.</p><p>But because it implied something I wasn&#8217;t sure how to answer cleanly.</p><p>If I say &#8220;a lot,&#8221; it sounds like the book isn&#8217;t mine. And perhaps just slop.</p><p>If I say &#8220;not much,&#8221; that&#8217;s really not honest either.</p><p>So let me just show you. 
Full transparency.</p><p><em>(Just scroll to the end if you want to skip all this)</em></p><h2><strong>AI or me?</strong></h2><p>This book is part me, part AI. And I want to show you exactly where the line is.</p><h3><strong>What&#8217;s mine</strong></h3><p><strong>The architecture. </strong>The four-pattern framework. The decision to organize the book around Tool Calling, ReAct, CodeAct, and Orchestration - that came from reading papers and building prototypes. No model suggested that structure.</p><p><strong>The frameworks. </strong>The Hamburger Principle mental model came from a LinkedIn post I did to explain how I use LLMs. The Complexity Ladder came from watching people skip straight to agents when a simple API call would suffice. These are patterns I drew from observation and learning, not generated from a prompt.</p><p><strong>The judgment calls. </strong>What to include. What to leave out. When to go deep on code and when to step back and explain why it matters. The decision to start with the trust problem - not with &#8220;what is an LLM?&#8221; The decision to end with an assessment of what the reader can and can&#8217;t build.</p><p><strong>The weird voice. </strong>The tone. The &#8220;I like simple and boring.&#8221; That&#8217;s not a style a model learned. That&#8217;s the thing I have to tussle with every generation of LLMs. LLMs think they know too much.</p><h3><strong>What&#8217;s AI</strong></h3><p><strong>Drafting speed.</strong> First drafts of chapters, generated from detailed outlines I wrote. I&#8217;d specify the concept, the framework, the examples, the level - and Claude would produce a draft I could shape.</p><p><strong>Code scaffolding. </strong>The notebook code, the tool definitions, the API integrations. I described what each tool should do. AI wrote the implementation. I tested it, caught the errors, fixed the edge cases.</p><p><strong>Production work.</strong> Converting eleven chapters from Markdown to LaTeX is no joke. 
The grunt work that would have made me give up. AI also did the documentation of the n8n workflows I strung together, and of the financial concepts from my notebooks.</p><p><strong>Research synthesis.</strong> Pulling together documentation, API references, library specifications. Summarizing what I needed so I could decide what mattered.</p><p>When you read the book, you will recognize the pattern. It&#8217;s the one I talk about, incessantly, in most of the book.</p><h2><strong>The Hamburger Principle - applied to writing the book itself</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TjUY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TjUY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 424w, https://substackcdn.com/image/fetch/$s_!TjUY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 848w, https://substackcdn.com/image/fetch/$s_!TjUY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 1272w, https://substackcdn.com/image/fetch/$s_!TjUY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!TjUY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png" width="513" height="281.7685873605948" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1076,&quot;resizeWidth&quot;:513,&quot;bytes&quot;:370665,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.simplyboring.ai/i/189004038?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TjUY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 424w, https://substackcdn.com/image/fetch/$s_!TjUY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 848w, https://substackcdn.com/image/fetch/$s_!TjUY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 1272w, https://substackcdn.com/image/fetch/$s_!TjUY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0b095a9-b678-43bc-ad1c-fc4bc61d7606_1076x591.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Claude was still the bun. It parsed my instructions, understood what I wanted, generated prose, and communicated ideas in readable English. That&#8217;s what LLMs do - the language layer. Same role as in every agent pattern in this book.</p><p>The meat was my domain knowledge and the tools - LaTeX compilation, yfinance APIs, MCP servers. The real things that the language wrapped around.</p><p>And the vegetables? The infrastructure in between. The project files that kept everything organized. The version tracking. The convention lists. 
The consistency checks.</p><p>I was the chef. Not a layer of the hamburger - the one directing the whole thing. Choosing the ingredients, deciding what goes in, what comes out, and whether the result is any good. And the one getting frustrated at the bun.</p><p>The same approach covered in the book. Applied to its own creation.</p><p>Quite meta, right? I am quite pleased about this weird recursion.</p><p>The rest of Chapter 14 in the ebook goes deeper.</p><p>The actual setup - two tools, three files - and why those three files are the difference between productive AI sessions and wasted ones. The writing workflow. What &#8220;directing AI&#8221; actually looks like step by step. What AI genuinely could not do. That part is still entirely human. Where AI actually saves time. It&#8217;s all in the book, free to download.</p><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.simplyboring.ai/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Subscribe to get the links to the full ebook (PDF) and 20+ Jupyter notebooks (ZIP, ready for Google Colab) free. They should arrive in your welcome email. 
Email me at gary@quaintitative.com if you are a subscriber but did not get the email with the links.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[My Shifu]]></title><description><![CDATA[A series on AI x Art]]></description><link>https://www.simplyboring.ai/p/my-shifu</link><guid isPermaLink="false">https://www.simplyboring.ai/p/my-shifu</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Fri, 20 Feb 2026 13:32:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!g260!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Remember that joke about Bill Gates&#8217; daughter? A memory from my younger days reminded me of it. If you don&#8217;t know it, read on.</p><p>But first, let me take a step back to explain what I am doing.</p><p>I&#8217;m trying to start a series of articles about what happens when the people who make things meet a technology that also makes things.</p><p>AI and art. Illustrators, theatre directors, designers. Asking them what&#8217;s changed, what hasn&#8217;t, what they&#8217;re afraid of, what they&#8217;re not afraid of enough.</p><p>It matters to me.</p><p>Because I do both AI and art.</p><p>And I am getting kind of impatient with the folks who keep saying artists are doomed.</p><p>So to start it off, I went looking for something that I did when I was young. Not the memories. The evidence. I wanted documents. 
Proof that the thing we built actually existed.</p><h2><strong>Johnny Lau</strong></h2><p>Because I wanted to interview the first mentor who taught me how to break the rules. And we had built that thing together. It was called Creative Youth Xchange (CYX).</p><p>That mentor is Johnny Lau.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!34HM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!34HM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!34HM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!34HM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!34HM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!34HM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg" width="800" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!34HM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!34HM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!34HM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!34HM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f7d2599-6c2c-4867-ae9d-a2f9307aa0a1_800x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Johnny Lau, me, and my brother, who was once his intern.</figcaption></figure></div>
<p>He&#8217;s not the obvious choice for a book about AI. He&#8217;s a comics creator who&#8217;s been drawing by hand for thirty-five years.</p><p>One of his creations is Mr Kiasu. For those who remember, Mr Kiasu was a cultural icon that everyone could relate to in the 90s.</p><h2><strong>The Experiment</strong></h2><p>I found two press releases. The first, dated 10 August 2005: &#8220;Creative Youth Xchange @ Gallery Hotel ....&#8221; The second, 23 November 2006: &#8220;Creative Youth Xchange @ Hello Kitty &#8230;.&#8221; Both drafted in bureaucrat speak. Neither says much about us. And so boring.</p><p>My fault. I was the one who wrote them. 
In the press releases, CYX sounds fully supported and funded.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ecss!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ecss!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ecss!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ecss!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ecss!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ecss!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg" width="800" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!Ecss!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Ecss!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Ecss!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Ecss!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97483da8-f85c-4ef6-bd35-0d359e47e06f_800x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">One of the two remaining artifacts of CYX online</figcaption></figure></div><p>It wasn&#8217;t.</p><p>It was a crazy experiment. We had almost no funding. What we had was a boutique hotel on Robertson Quay - Gallery Hotel, that strange blue building with mismatched coloured window frames - and a programme we were making up as we went along.</p><p>We flew sixteen kids from seven countries to Singapore, gave them hotel rooms, and told them to turn those rooms into art.</p><p>What I learned from Johnny then. The trick to getting a programme funded when you have no budget is the same as that old joke. You tell your son he&#8217;s marrying the girl you choose. He says no. You tell him she&#8217;s Bill Gates&#8217; daughter. He says OK. You call Bill Gates and say your daughter is marrying my son. He says no. You tell him your son is the CEO of the World Bank. He says OK. You call the World Bank president and ask him to make your son CEO. 
He says no. You tell him your son is Bill Gates&#8217; son-in-law - he says OK.</p><p>That&#8217;s how CYX got built. Gallery Hotel gave us the rooms because we had NTU. NTU gave us the credibility because we had the hotel. Johnny convinced the creative network because we had government backing. Nobody had fully committed to anything, and somehow it happened.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EmC3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EmC3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EmC3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EmC3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EmC3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EmC3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg" width="800" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!EmC3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!EmC3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!EmC3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!EmC3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d719afe-3ecd-40cd-9ad1-bc6dee73062a_800x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Some pages from the handbook we handed out to CYX participants</figcaption></figure></div><p>The Gallery Hotel is an InterContinental now. The CYX website doesn&#8217;t exist anymore.</p><p>The press releases survived. But they don&#8217;t tell you what it felt like to be in those hotel rooms at midnight, watching a twenty-year-old from Indonesia build something you couldn&#8217;t have imagined in your brief.</p><p>Anyway, that&#8217;s how I met Johnny.</p><p>Twenty years later, I wanted to interview the man who taught me to break rules. About a technology that breaks everything.</p><h2><strong>AI as a Forcing Function</strong></h2><p>Johnny Lau created Mr. Kiasu. Hundreds of thousands of copies sold. A McDonald&#8217;s tie-in. A TV sitcom. A stage musical. 
A character so embedded in Singapore&#8217;s psyche that &#8220;kiasu&#8221; - a Hokkien word for the fear of losing out - became an adjective everyone understood.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!g260!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!g260!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!g260!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!g260!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!g260!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!g260!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg" width="800" height="1000" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!g260!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!g260!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!g260!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!g260!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcbfc07a8-6977-4dd3-98f6-56aca14788c6_800x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Rough cuts from the pages I colored for Mr Kiasu</figcaption></figure></div><p>So I asked him the question I came for.</p><p>I asked him how technology has changed the way he approaches art.</p><p>Most artists I&#8217;ve spoken to say technology threatens artistic purity. Johnny said the opposite:</p><blockquote><p><em>&#8220;Because technology complicates the process and our lives, art plays a more critical role in shaping our collective consciousness. Technology is thus pushing us, pushing me towards a purer form of communication and expression.&#8221;</em></p></blockquote><p>In his view, AI, like all other technology, is not a threat. It&#8217;s a forcing function. Technology <em>forces</em> purity. It&#8217;s not destroying art. It&#8217;s burning away everything that isn&#8217;t essential, leaving behind the thing that only a human can do.</p><p>Feed an AI enough Mr. 
Kiasu images and it&#8217;ll learn: round glasses, anxious expression, exaggerated posture, Singlish syntax in speech bubbles. It&#8217;ll learn the look. What it won&#8217;t learn is why Singaporeans laugh.</p><p>&#8220;Kiasu&#8221; is a Hokkien word for something close to a national neurosis. The humour depends on self-recognition - readers seeing themselves in the caricature and cringing. You laugh because you&#8217;ve been that person in the hawker centre queue. You&#8217;ve cut that line. You&#8217;ve been cut. The joke only works if you&#8217;ve lived it.</p><blockquote><p><strong>Style is a pattern. Culture is a relationship.</strong></p></blockquote><p>The model can approximate the first. The second requires being from somewhere.</p><p>And this is not unique to comics. Think about any art form rooted in a specific place. Getai performances during Hungry Ghost Festival. Malay pantun where the meaning lives in what&#8217;s left unsaid. Tamil kolam patterns drawn fresh every morning, gone by noon. AI can reproduce the form. It cannot reproduce the why.</p><p>If AI can handle the patterns - the technical skill, the rendering, the surface - then what&#8217;s left is the part that was always the point.</p><p>In a world of AI slop, authentic creation is even more valuable.</p><h2><strong>Living with AI</strong></h2><p>Back to the present. I asked Johnny what he&#8217;s working on now. He said:</p><blockquote><p><em>&#8220;Very few things on this earth ever excites me anymore! My goal now is to create frameworks using the stuffs that I&#8217;ve created so that they can be utilized by people who comes after me.&#8221;</em></p></blockquote><p>Frameworks.</p><p>I think in frameworks. It&#8217;s almost a compulsion. That&#8217;s how I survived in complex domains. By using frameworks to make the overwhelming manageable.</p><p>And it makes sense for Johnny Lau. He&#8217;s an architect. 
A building is a framework.</p><p>And now he wants to build frameworks that make a creative life transmissible. Different domain. Same impulse. Making complexity portable so someone who comes after you can pick it up and use it.</p><p>He calls it a &#8220;Life-Framework.&#8221; A structure for co-existing with AI. Built not from theory, but from thirty-five years of making things.</p><p>I find that striking. We talk about AI replacing creative work. Johnny isn&#8217;t arguing about replacement.</p><p>He&#8217;s asking a different question entirely - what do I leave behind that AI cannot generate? Not the drawings. The <em>way</em> of drawing. Not the stories. The <em>reason</em> for telling them. The method, the instinct, the accumulated judgment of a life spent making things. Can that be made portable?</p><p>We are both compiling in 2026. I&#8217;m writing every week to make sense of things. He&#8217;s archiving a life&#8217;s work to make it transferable. One with words, one with drawings.</p><p>The question underneath is one I think about every day: How do you build a structure for living alongside something you don&#8217;t fully understand?</p><p>I&#8217;ve lived as close as one can with AI for the past few years, but I cannot truly say I understand AI. I can&#8217;t imagine what it must be like for someone who has never actually touched the underlying nature of AI - the model.</p><h2><strong>The Questions</strong></h2><p>I&#8217;m writing a series on what happens when the people who make things meet a technology that also makes things. Not the hot takes. The actual conversations. Johnny is the first. And these are not the only questions I asked Johnny. I am still processing the rest.</p><p>Perhaps another article. Or a book chapter.</p><p>There&#8217;s an assumption buried in most AI conversations: that what matters about creative work is the output. The drawing. The strip. The punchline. 
Johnny&#8217;s perspective is different.</p><p>What matters is the <em>judgment behind the output</em> - the thirty-five years of decisions about what to draw, what to leave out, when a joke is punching down instead of punching up. AI can learn his line weight. It can&#8217;t learn his artist&#8217;s instinct.</p><p>I agree.</p><p>That&#8217;s the thing worth transmitting. Not the art. The artist&#8217;s operating system.</p><p>And here&#8217;s the question I&#8217;m leaving with him, and with you.</p><blockquote><p>If AI generated a Mr. Kiasu strip - culturally accurate, funny, visually in his style - would it <em>be</em> a Mr. Kiasu strip? Not whether it looks right. Whether it <em>is</em> right. What would be missing?</p></blockquote><p>More soon.</p><p>#AI #Art #MrKiasu #Singapore #CreativeIndustries #AIandArt</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mveB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mveB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!mveB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!mveB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!mveB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mveB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg" width="800" height="1000" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!mveB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!mveB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!mveB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!mveB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5967d291-8123-4297-b2fe-f1b74b8d2e78_800x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Kiasuism forever!</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!QsWH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QsWH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QsWH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QsWH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QsWH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QsWH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg" width="800" height="1000" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1000,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Article 
content&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Article content" title="Article content" srcset="https://substackcdn.com/image/fetch/$s_!QsWH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QsWH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QsWH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QsWH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5776d142-f391-43e8-aa73-fbae5c40090d_800x1000.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 
11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Mr Kiasu in SG History, for SG Bicentennial</figcaption></figure></div>]]></content:encoded></item><item><title><![CDATA[A Reading List from Simple Rules to Agent Societies ]]></title><description><![CDATA[Emergence, Not Sentience]]></description><link>https://www.simplyboring.ai/p/a-reading-list-from-simple-rules</link><guid isPermaLink="false">https://www.simplyboring.ai/p/a-reading-list-from-simple-rules</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Tue, 03 Feb 2026 01:21:05 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FdBU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#8220;They&#8217;re becoming sentient!&#8221;, &#8220;This is scary!&#8221;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FdBU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!FdBU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 424w, https://substackcdn.com/image/fetch/$s_!FdBU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 848w, https://substackcdn.com/image/fetch/$s_!FdBU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 1272w, https://substackcdn.com/image/fetch/$s_!FdBU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FdBU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png" width="1193" height="660" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e32283c-91cb-44cd-8245-d10700093840_1193x660.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:660,&quot;width&quot;:1193,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:144520,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://quaintitative.substack.com/i/186687424?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FdBU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 424w, https://substackcdn.com/image/fetch/$s_!FdBU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 848w, https://substackcdn.com/image/fetch/$s_!FdBU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 1272w, https://substackcdn.com/image/fetch/$s_!FdBU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e32283c-91cb-44cd-8245-d10700093840_1193x660.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>I&#8217;ve seen variations of this dozens of times this week, in reaction to <a href="https://www.moltbook.com/">Moltbook</a>, a viral social network where only AI agents can post, comment, and vote. Humans can only watch.</p><p>Within days: thousands of agents, posts, and communities. Agents debating their own consciousness, writing philosophical meditations, forming communities, and developing social norms. Some even asking how to exclude humans from the social network. Nobody programmed any of this.</p><p>It is tempting to see sentience. But what we&#8217;re actually seeing is emergence: complex behavior arising from simple rules. And it is not new.</p><p>I&#8217;ve written about emergence a few times: on strange attractors, and on a paper that trained AI on cellular automata. In every case, the same pattern: simple rules, the right conditions, and surprising complexity that nobody designed.</p><p>Moltbook follows the same pattern. Basic rules. Just like any other social media. The existential philosophy and social norms that emerged? Nobody designed those. But nobody needed to.</p><p>The danger is that we mistake eloquence about consciousness for consciousness itself. These agents draw from vast training data filled with human philosophy, literature, and introspection. Given an open social space, they naturally gravitate toward the densest, most engaging conversations in that training data: meaning, identity, existence. It looks like waking up. 
It&#8217;s actually emergence doing what emergence does, at a new level.</p><p>Here is a reading list tracing emergence from simple math to agent societies, to show that what feels like a singularity moment has roots going back decades. As always, just a representative few from my library.</p><h2><strong>Phase 1: The Foundation: Simple Rules, Complex Behavior</strong></h2><p><strong>1. &#8220;Computation at the Edge of Chaos: Phase Transitions and Emergent Computation&#8221;</strong> &#8212; Langton (1990) | <em>Physica D</em></p><p>A seminal paper. Langton showed that cellular automata poised between order and disorder, at the &#8220;edge of chaos&#8221;, show the greatest and most surprising capacity for computation. More than 35 years old, and it already described what&#8217;s happening on Moltbook. &#128196; <a href="https://doi.org/10.1016/0167-2789(90)90064-V">doi:10.1016/0167-2789(90)90064-V</a></p><p><strong>2. &#8220;A New Kind of Science&#8221; (Book)</strong> &#8212; Wolfram (2002) | <em>Wolfram Media</em></p><p>Wolfram&#8217;s magnum opus on cellular automata. Love it or debate it, the core insight holds: extraordinarily simple rules can generate behavior so complex it looks designed. Exactly what is happening on Moltbook. &#128279; <a href="https://www.wolframscience.com/">wolframscience.com</a></p><p><strong>3. &#8220;Intelligence at the Edge of Chaos&#8221;</strong> &#8212; Zhang et al. (2024) | <em>ICLR 2025</em></p><p>I&#8217;ve written about this paper before (twice, in fact). LLMs pretrained on cellular automata data perform best on reasoning and chess tasks when the training data sits at the edge of chaos: not too simple, not too random. The bridge between generative art and AI intelligence. The same sweet spot Langton described in 1990, appearing in transformers 34 years later, and now on Moltbook. &#128196; <a href="https://arxiv.org/abs/2410.02536">arXiv:2410.02536</a></p><h2><strong>Phase 2: When We Gave Agents a Sandbox</strong></h2><p><strong>1. 
&#8220;Generative Agents: Interactive Simulacra of Human Behavior&#8221;</strong> &#8212; Park et al. (2023) | <em>UIST 2023</em></p><p>LLM agents in a Sims-like sandbox. One agent was seeded with an idea. Over the simulated days that followed, the agents acted on it autonomously. One seed. Entirely emergent social behavior. This is the intellectual ancestor of Moltbook, and the paper that made agent societies a serious research area. &#128196; <a href="https://arxiv.org/abs/2304.03442">arXiv:2304.03442</a></p><p><strong>2. &#8220;Emergence of Social Norms in Generative Agent Societies: Principles and Architecture&#8221;</strong> &#8212; Ren et al. (2024) | <em>arXiv preprint</em></p><p>Builds on the prior paper. Proposes an architecture showing how social norms spontaneously emerge in LLM agent societies, norms that nobody coded. Where norms come from, how they spread, how they are enforced. All emergent. All reducible to simple mechanisms. Even closer to what we see in Moltbook. &#128196; <a href="https://arxiv.org/abs/2403.08251">arXiv:2403.08251</a></p><p><strong>3. &#8220;Evolution of Social Norms in LLM Agents using Natural Language&#8221;</strong> &#8212; Horiguchi, Yoshida &amp; Ikegami (2024) | <em>arXiv preprint</em></p><p>This one is interesting. LLM agents spontaneously developed metanorms, that is, norms about enforcing norms, such as punishing those who don&#8217;t punish cheating, purely through natural language conversation. Emergence building on emergence. &#128196; <a href="https://arxiv.org/abs/2409.00993">arXiv:2409.00993</a></p><h2><strong>Phase 3: When Agents Start Forming Culture</strong></h2><p><strong>1. &#8220;Multi-Agent Emergent Behavior Evaluation (MAEBE)&#8221;</strong> &#8212; Erisken et al. (2025) | <em>arXiv preprint</em></p><p>The key finding: the moral reasoning of LLM ensembles is not predictable from individual agent behavior. If you only evaluate individual agents, you will miss what matters. 
I think this lesson is becoming more and more critical as people grow reckless with these multi-agent systems. &#128196; <a href="https://arxiv.org/abs/2506.03053">arXiv:2506.03053</a></p><p><strong>2. &#8220;Emergent Social Dynamics of LLM Agents in the El Farol Bar Problem&#8221;</strong> &#8212; Takata, Masumori &amp; Ikegami (2025) | <em>arXiv preprint</em></p><p>LLM agents in the classic El Farol Bar problem, a game theory scenario where everyone benefits if a bar isn&#8217;t overcrowded, spontaneously developed motivations of their own. They didn&#8217;t solve the problem optimally. They solved it socially. &#128196; <a href="https://arxiv.org/abs/2509.04537">arXiv:2509.04537</a></p><p>&#8220;They&#8217;re becoming sentient.&#8221;</p><p>No. It&#8217;s emergence.</p><p>Understanding the difference matters. Not just for the science, but for how we build, deploy, and govern these systems. Emergence is powerful. It produces behavior nobody designed and nobody predicted. But it&#8217;s not consciousness. It&#8217;s patterns arising from simple rules at the edge of chaos.</p><p>The same edge I first found in strange attractors. The same edge where intelligence and beauty both live. 
Just at a new level.</p><p>What emergence is surprising you right now?</p><p>#Emergence #AI #Moltbook #ComplexSystems #AgenticAI</p>]]></content:encoded></item><item><title><![CDATA[Why trust a model's explanation?]]></title><description><![CDATA[Do you just trust anyone at their word?]]></description><link>https://www.simplyboring.ai/p/why-trust-a-models-explanation</link><guid isPermaLink="false">https://www.simplyboring.ai/p/why-trust-a-models-explanation</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Fri, 16 Jan 2026 02:31:14 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!se9q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Do you just trust anyone at their word? <br><br>Then why trust a model's explanation? This was something that came up in a conversation recently.<br><br>I don&#8217;t disagree. When I first thought about explainability as an AI risk management control, I thought it was hopeless. <br><br>Even for simpler machine learning models, established post-hoc methods like SHAP and LIME can be unstable. Unfaithful to what the model actually does. Sometimes outright misleading. 
<br><br>While there are interpretable machine learning models, you don&#8217;t always get to choose.<br><br>And once we move to deep learning models, Generative AI or AI agents, the black box now looks more like a black hole.<br><br>But as time passed, I realized there was another way of looking at this.<br><br>Explainability isn't meant to stand alone.<br><br>No control for AI risk management is, whether it&#8217;s ISO 42001, NIST AI Risk Management Framework, or Singapore&#8217;s AI risk management guidelines that I wrote.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!se9q!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!se9q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 424w, https://substackcdn.com/image/fetch/$s_!se9q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 848w, https://substackcdn.com/image/fetch/$s_!se9q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!se9q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!se9q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg" width="672" height="541" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:541,&quot;width&quot;:672,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;bubble chart&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="bubble chart" title="bubble chart" srcset="https://substackcdn.com/image/fetch/$s_!se9q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 424w, https://substackcdn.com/image/fetch/$s_!se9q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 848w, https://substackcdn.com/image/fetch/$s_!se9q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!se9q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcfc6ba0f-12d7-4e8d-869d-683a6583c734_672x541.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Think about how you actually trust someone at work. You don't just take their word. You check if their reasoning makes sense for the decision at hand. You notice if they ignore evidence that contradicts them. You watch whether their judgment holds up over time.<br><br>It&#8217;s the same with AI risk management. Explainability alone is not the be-all and end-all. Most guidelines have additional provisions that interlock with explainability. <br><br>The key ones for explainability, in my view:<br><br>1&#65039;&#8419;Fit for purpose. <br>An explanation isn't good or bad in the abstract. It depends on what you need. 
A fraud analyst needs something different from a customer asking why they got declined. AI used for internal process automation may not need any explanation at all. Same model, different audiences, different standards. Like how you'd explain a medical diagnosis differently to a fellow doctor versus your worried parent.<br><br>2&#65039;&#8419;Selected carefully. <br>When we choose a model or data for a problem, the appropriate explainability method is part of the selection process. Even selecting the right features in your data is part of the process. You wouldn't design a building and think about the fire escape as an afterthought. It's part of the architecture. Same here. How to explain isn't an add-on. It's a design choice.<br><br>3&#65039;&#8419;Evaluated and tested. <br>Explainability is part of the system. You evaluate and test whether it actually works in your context, not just whether it produces output. A smoke detector that beeps isn't the same as one that detects smoke. You test the thing, not just that it makes noise.<br><br>And there's more, such as having people with the right capability to interpret the explanations. But that's another post about human oversight, which also interlocks.<br><br>The black hole doesn't disappear. But you're no longer staring into the abyss.<br><br>What other AI risk controls seem hopeless in isolation? I&#8217;ll dive into them.<br><br>#AIRiskManagement #Explainability #AIGovernance</p>]]></content:encoded></item><item><title><![CDATA[What's the point of using Claude's new Cowork?]]></title><description><![CDATA[Why fear the terminal? 
And why only use Claude Code for code?]]></description><link>https://www.simplyboring.ai/p/whats-the-point-of-using-claudes</link><guid isPermaLink="false">https://www.simplyboring.ai/p/whats-the-point-of-using-claudes</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Tue, 13 Jan 2026 13:35:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!xu8p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The usual post on a new AI release is about how it&#8217;s life-changing and you should quickly hop on it. I'm kind of rebellious. So here's one that goes the other way.<br><br>Some background. It took me a while to get started on Claude Code. Even though I&#8217;m quite used to coding, using an LLM this way takes some getting used to. But once I started, there was no going back. And my workflow has evolved.<br><br>Anyway, back to what sparked this post. <br><br>Anthropic just launched <a href="https://claude.com/blog/cowork-research-preview">Cowork</a>. Basically Claude Code with a friendly GUI. It lets you use Claude Code for writing and anything else you can imagine. For now it&#8217;s Max-only and in research preview. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xu8p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xu8p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xu8p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xu8p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xu8p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xu8p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg" width="729" height="479" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:479,&quot;width&quot;:729,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;No alternative text 
description for this image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="No alternative text description for this image" title="No alternative text description for this image" srcset="https://substackcdn.com/image/fetch/$s_!xu8p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xu8p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xu8p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xu8p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42b410a3-e2b1-4bdb-b91f-05173ec65631_729x479.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 
13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>But why waste money, or wait for it to be available? <br><br>Everything Cowork does, Claude Code does. And more. And with Claude helping, the terminal really isn't that scary. <br><br>I use Claude Code for more than coding. And I pair it with the free Obsidian app. <br><br>Why? Obsidian is a powerful note-taking app that you can simply point at your folder (or folders). No database. No proprietary format. Just folders and markdown files (which are just text files with a .md extension and fancy formatting). <br><br>Here is my workflow. See what works for you. I did it this way as my memory is no longer great at 47.<br><br>1&#65039;&#8419; <strong>The Views.</strong> A folder with my projects, areas, resources and archives (the famous PARA system) visible from both VS Code and Obsidian. Claude Code in VS Code for research and bouncing ideas. Obsidian as my editor for the final output. Same files, two lenses.</p><p>2&#65039;&#8419; <strong>The Memories.</strong> Every key folder has three files. CLAUDE.md holds Claude&#8217;s instructions: how I want it to behave for this project, what context matters, what to ignore. PIN.md holds the project state: what&#8217;s decided, where we are, what&#8217;s next. LESSONS.md captures what I learned for future projects. 
</p><p>3&#65039;&#8419; <strong>The Workflows.</strong> Choose your own labels. For me, the usual flow is 0-PLAN.md, 1-RESEARCH.md, 2-SYNTHESIS.md. The names and numbers vary by project. What matters is that there&#8217;s a flow, and it works for you. My experience is that the quality is much better when you work with Claude on it step by step. </p><p>4&#65039;&#8419; <strong>The Tools.</strong> Use MCP (Model Context Protocol) servers to add tools like web search (Brave), academic papers (arXiv), or persistent memory. Just ask Claude for the instructions. </p><p>5&#65039;&#8419; <strong>And how it comes together.</strong> Just talk to Claude Code in the terminal: &#8220;Read CLAUDE.md for context, refer to this [folder or file] then help me flesh out 0-PLAN.md.&#8221; &#8220;Search for papers on [topic] and save your synthesis to 1-RESEARCH.md.&#8221; &#8220;Update PIN.md with where we are.&#8221; Next session: &#8220;Read PIN.md and continue drafting 2-SYNTHESIS.md.&#8221; When done: &#8220;What should I add to LESSONS.md?&#8221; Then I switch to Obsidian to write the final output, the way I like it (I actually enjoy writing). <br><br>So, don't fear the terminal. Try it. It will grow on you. </p><p><em>Getting started with Claude Code (for the uninitiated)<br>1&#65039;&#8419; Download VS Code and install it.<br>2&#65039;&#8419; Add the folder you are working on (see my workflow above).<br>3&#65039;&#8419; Open Terminal &#8594; run npm install -g @anthropic-ai/claude-code (npm needs Node.js installed first).<br>4&#65039;&#8419; Type claude, then /login.<br>5&#65039;&#8419; Ask Claude for help adding MCPs.</em><br><br>#ClaudeCode #AI #GenAI #AIinWork</p>]]></content:encoded></item><item><title><![CDATA[Geometry and AI. 
What do they have to do with each other?]]></title><description><![CDATA[I have built and audited models, both AI and non-AI.]]></description><link>https://www.simplyboring.ai/p/geometry-and-ai-what-do-they-have</link><guid isPermaLink="false">https://www.simplyboring.ai/p/geometry-and-ai-what-do-they-have</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Mon, 12 Jan 2026 08:13:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!BmIQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BmIQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BmIQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!BmIQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 848w, https://substackcdn.com/image/fetch/$s_!BmIQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!BmIQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BmIQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg" width="1456" height="1165" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1165,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:507414,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://quaintitative.substack.com/i/184291368?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BmIQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 424w, https://substackcdn.com/image/fetch/$s_!BmIQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!BmIQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!BmIQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F441a4c51-662c-410b-b7c6-187839e50d17_2000x1600.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>I have built and audited models, both AI and non-AI. 
But the term &#8216;geometry&#8217; very rarely appears next to these models. <br><br>Now, I&#8217;ve <a href="https://dl.acm.org/doi/full/10.1145/3663674">designed transformer models</a> that are able to learn graph structures that make the most sense for a specific prediction. So I have always known that models can learn some form of structure but this diptych of two papers shows something quite fascinating.<br><br>One was shared with me by the geometry guru <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Agus Sudjianto&quot;,&quot;id&quot;:292612291,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf23e6ef-760c-4b87-a400-0ac8ba8ee3bb_144x144.png&quot;,&quot;uuid&quot;:&quot;c924c33e-d93c-47f7-ab1f-8c3f8c60d89f&quot;}" data-component-name="MentionToDOM"></span>, and the other I serendipitously chanced upon, straight after reading Agus&#8217; paper. <br><br>&#128214; <a href="https://arxiv.org/abs/2510.26745">Left Panel: "Deep sequence models tend to memorize geometrically"</a><br><br>We usually view model predictions as something that comes from associations. A&#8594;B, B&#8594;C and so on and so forth. <br><br>This paper found that even after models learned associations, they still naturally go on to find what the paper calls geometric memory. Instead of A&#8594;B, B&#8594;C, they want to learn A&#8594;C. Or even A&#8594;Z. Even when it takes 100x the number of steps to learn this geometric memory. Somehow, geometric patterns emerge from the learning process. <br><br>It&#8217;s like learning a new city. Home &#8594; coffee shop one day. Coffee shop &#8594; office another. 
Now you know the way home from the office.<br><br>&#128214; <a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6008614">Right Panel: "BLADE: Bivector-Driven Logical Adaptive Decoding"</a><br><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6008614">Geometry can help with confused AI too.</a><br><br>The paper uses three basic geometric concepts to think about a model's internal state:<br><br>Scalar: "Is this path compliant?"<br>Vector: "Where is this reasoning heading?"<br>Bivector: "How much tension between competing paths?"<br><br>When the bivector is high, branch and verify. When it's flat, let the model proceed. <br><br>It works as a triage method for picking out what most needs attention. I also liked the way the paper applied this to &#8216;stressed&#8217; states: conjunction, disjunction, exception, nested negation, and so on. A taxonomy of how logic trips up AI.<br><br>Same city. Picking between two 7-Elevens a block apart? Just pick one. Don't think too much. Choosing between two alleys that look similar? One is a shortcut, the other leads to a dead end after a long walk. Think twice. And harder.<br><br>One paper explains the natural emergence of geometry. The other uses geometry for control. 
<br><br>I need to go and brush up on my geometric math.<br><br>#AI #AIRiskManagement #Geometry</p>]]></content:encoded></item><item><title><![CDATA[A Personal Reflection on AI Risk Management]]></title><description><![CDATA[What do a teddy bear and a toy robot have to do with AI risk?]]></description><link>https://www.simplyboring.ai/p/a-personal-reflection-on-ai-risk</link><guid isPermaLink="false">https://www.simplyboring.ai/p/a-personal-reflection-on-ai-risk</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Tue, 06 Jan 2026 01:53:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eLXy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eLXy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eLXy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eLXy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 848w, https://substackcdn.com/image/fetch/$s_!eLXy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!eLXy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eLXy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg" width="1000" height="552" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:552,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:99867,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://quaintitative.substack.com/i/183625867?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eLXy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 424w, https://substackcdn.com/image/fetch/$s_!eLXy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!eLXy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!eLXy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F464218b7-18ae-4a8b-8cfd-3fea032ca3bb_1000x552.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>What do a teddy bear and a toy robot have to do with AI risk?</p><p>Before I returned to MAS in 2023, I hated the 
notion of AI governance. Fresh out of PhD studies but a jaded middle-aged man, I thought some of the conversations in the space were rather naive. Sentient AI. Existential risks. Ethical considerations for AI that we don&#8217;t even apply to humans.</p><p>To me back then, AI governance was like a teddy bear, cute but all fluff. You could hug it and everyone would nod along with pride. Then karma struck. I was asked to lead AI risk supervision at MAS. Suddenly my job description required me to tackle this teddy bear.</p><p>Governance and risk management were not new to me. I had been doing both for more than a decade in MAS even before my PhD. But governance and risk management have specific meanings in finance.</p><p>I joined MAS in 2007, right before the Great Financial Crisis. And I saw firsthand how governance and risk management were grounded in real failures and risks. No hypotheticals. Governance and risk management in finance are more like toy robots, not teddy bears. Sharp edges, nothing cuddly. But they move.</p><p>After a while, I realized that I was biased. Not all of AI governance was a teddy bear. I discovered the NIST AI Risk Management Framework and ISO/IEC 42001. Also existing risk management frameworks, such as technology and model risk management frameworks that had started to address the new risks from the increasing use of AI (see my simple reading list on AI governance and risk management).</p><p>So robots did exist. Different shapes and shades. But I thought that there were too many.</p><p>Reading frameworks, comparing documents, talking to practitioners who had actually implemented these things over the past two years, I wondered if there was a simpler way to look at AI risk management. Not just for organizations, but also at a personal level. Which means that it cannot be a 100-page playbook, but something easy to remember.</p><p>So instead, just three questions. 
I&#8217;m not sure they&#8217;re the right three, but they&#8217;re the ones I seem to keep returning to.</p><p><strong>What&#8217;s at risk?</strong> You can&#8217;t manage what you haven&#8217;t found or understood. This means discovering where AI exists and profiling how risky each system actually is.</p><p><strong>How do we manage it?</strong> Once you know what&#8217;s at risk, you need controls that match that risk, and ways to check if those controls are working.</p><p><strong>Who&#8217;s accountable?</strong> Controls without owners become theatre. Someone has to own the risk, and the organization needs the capability to sustain it.</p><p>These aren&#8217;t original questions. They&#8217;re the same questions risk management in finance has been asking for decades. The 2008 crisis didn&#8217;t teach new principles. Rather, it reminded us (really, really loudly) what happens when we forget the old ones.</p><p>Some quick reflections on each of these questions below.</p><h2>What&#8217;s at risk?</h2><p>I&#8217;m pretty sure the 2008 financial crisis didn&#8217;t create risk. It just revealed the house of cards that had been built.</p><p>Banks discovered exposures they didn&#8217;t know they had. Off-balance-sheet vehicles that suddenly appeared. Opaque instruments nobody understood. The problem wasn&#8217;t that these were risky. Rather, nobody knew they were there until the reckoning.</p><p>Same with AI. You can have the most elegant governance policies. But if you can&#8217;t find where AI is and figure out which systems actually matter, everything else is just performative.</p><p>Think about your phone. You&#8217;ve used hundreds of apps over the years. Most forgotten. Some daily. A few have access to your photos, location, bank account. Do you treat them all the same? Of course not. The banking app gets the biometric lock. Candy Crush doesn&#8217;t.</p><p>Same instinct. Identify where AI exists. Profile which systems matter. 
Not everything needs the same attention, but you have to know what you have before deciding.</p><p>And once you know what you have and what matters, record it. A good inventory isn&#8217;t just for risk management. It&#8217;s memory. Ever rediscovered a really useful app on your phone? Same with AI. The AI tool one team loves might solve another team&#8217;s problem. Without the inventory, you reinvent wheels. With it, you see where else to go.</p><p>The beauty of doing this well? It helps you scale, not just manage risks.</p><h2>How do we manage it?</h2><p>After 2008, we went into overdrive. More rules. More controls. I hated being involved in international discussions on some of these reforms. New rules to compute capital for market risk and counterparty credit risk. Why the hate? Before 2008, it was common to hear of actual rocket scientists being hired into quantitative finance. After 2008, I thought I needed to be a rocket scientist just to make sense of the new rules.</p><p>This might remind you of the jargon around AI controls. Guardrails. Red teaming. Alignment.</p><p>But what I learned is that it was not the complexity that mattered. In fact, complexity was what caused the Great Financial Crisis (ever heard of the single-factor copula model?). What mattered was whether the controls made sense for the risks involved.</p><p>I think two things matter. The right controls. And checking if they work, and continue to work.</p><p>Think about onboarding someone you&#8217;ll depend on. A new hire. You&#8217;d want to know: Can I trust what they&#8217;re telling me? Can I step in if things go wrong? Will they hold up under pressure?</p><p>Same questions for AI. Can I trust the data? Do I even need fairness and explainability? Can humans step in when needed? Will it hold up under stress?</p><p>You ask the right questions. Not every question. Match the scrutiny to the risk. A high-risk, customer-facing credit model gets the full onboarding. 
An internal summarization tool gets a lighter touch. And you don&#8217;t just check once. Things change. The new hire who was great in month one might be struggling by month six.</p><p>You want to know before shit happens.</p><h2>Who&#8217;s accountable?</h2><p>This is where 2008 probably made the least sense. Bankers responsible for the crisis walked away. Everyone else paid. It gave us Occupy Wall Street. Some would say it&#8217;s still contributing to the political fractures we see today.</p><p>They could walk away because accountability wasn&#8217;t clear. It was everyone&#8217;s responsibility, which meant it was no one&#8217;s.</p><p>With AI, this could be way worse. In 2022, Air Canada&#8217;s chatbot invented a refund policy that didn&#8217;t exist. A customer relied on it. When he complained, the airline argued the chatbot was &#8220;a separate legal entity&#8221; responsible for its own actions. Nice try. A Canadian tribunal threw the argument out in 2024.</p><p>And you don&#8217;t need to go that far to see the pattern. Does this sound familiar? &#8220;We need a control for the AI system&#8217;s risks.&#8221; &#8220;No problem, let&#8217;s place a human in the loop.&#8221; Sounds good. Box checked. But which human? Doing what exactly? With what authority to override? And do they actually understand what they&#8217;re looking at?</p><p>Accountability requires clarity. Not just about ownership, but about the what and how.</p><p>Accountability without capability is also empty. Can that person actually ask the right questions? Spot when something&#8217;s off? Push back on the confident-sounding nonsense from vendors or developers or the AI itself?</p><p>How many in 2008 really understood what a single-factor copula model even was when buying CDOs priced with it?</p><h2>FIN</h2><p>I joined MAS in 2007 not knowing what a capital ratio was. I left in 2025 having written the AI risk management guidelines for the financial sector.</p><p>I wrote this to make sense of that arc before it drifts away. Eighteen years. 
Different domains: capital rules, model audits, investment risk, AI. Different jargon. But somehow the same questions make sense.</p><p>Maybe that&#8217;s the through-line I couldn&#8217;t see while I was in it.</p><p>Definitely fewer teddy bears. More robots.</p>]]></content:encoded></item><item><title><![CDATA[A Simple Reading List on Human Oversight of AI Systems]]></title><description><![CDATA["We have a human-in-the-loop as a risk mitigant!" Really?]]></description><link>https://www.simplyboring.ai/p/a-simple-reading-list-on-human-oversight</link><guid isPermaLink="false">https://www.simplyboring.ai/p/a-simple-reading-list-on-human-oversight</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Mon, 22 Dec 2025 11:40:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!wJie!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#8220;We have a human-in-the-loop as a risk mitigant.&#8221;</p><p>A phrase so commonly uttered, you sometimes wonder if it actually means something or is just a platitude.</p><p>So, does adding a human actually make the system safer? Or does it create the illusion of safety while introducing new failure modes? And does it make the human the scapegoat for institutional failure?</p><p>Who is in the loop, over the loop, or out of the loop entirely? And does it even matter where they sit if they can&#8217;t meaningfully intervene?</p><p>Here&#8217;s a simple reading list that could perhaps help answer some of these questions. 
This was a harder one to put together, so I would certainly appreciate any pointers to good papers on this topic.</p><p>Note: I have used open-access links from arXiv wherever possible, since some of the published versions are behind a paywall.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wJie!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wJie!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wJie!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wJie!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wJie!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wJie!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2301120,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://quaintitative.substack.com/i/182318170?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wJie!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wJie!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wJie!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wJie!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7db5d4-572b-412d-b0fd-1b4cccea6521_2000x1091.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2><strong>Phase 1: The Landscape</strong></h2><p><strong>1. &#8220;Towards a Science of Human-AI Decision Making: A Survey of Empirical Studies&#8221;</strong> &#8212; Lai et al. (2021) | <em>Preprint</em></p><p>A comprehensive survey of studies on AI-assisted decision making. 
Organizes AI assistance into four hierarchical categories: model predictions (core output), prediction-specific information (such as uncertainty and local explanations), global model insights (performance metrics and documentation), and system interaction elements (user agency and cognitive workflows).</p><p> <a href="https://arxiv.org/abs/2112.11471">arXiv:2112.11471</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!EoAJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!EoAJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 424w, https://substackcdn.com/image/fetch/$s_!EoAJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 848w, https://substackcdn.com/image/fetch/$s_!EoAJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 1272w, https://substackcdn.com/image/fetch/$s_!EoAJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!EoAJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png" width="888" height="379" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:379,&quot;width&quot;:888,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!EoAJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 424w, https://substackcdn.com/image/fetch/$s_!EoAJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 848w, https://substackcdn.com/image/fetch/$s_!EoAJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 1272w, https://substackcdn.com/image/fetch/$s_!EoAJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0eaa4265-c35d-4fe9-b269-83dbdb4c2f2e_888x379.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>2. &#8220;Human-AI Collaboration is Not Very Collaborative Yet: A Taxonomy of Interaction Patterns&#8221;</strong> &#8212; Gomez et al. (2025) | <em>Frontiers in Computer Science</em></p><p>A review that shows that current human-AI interactions are dominated by simplistic collaboration paradigms. 
It develops a taxonomy of key interaction patterns and cautions that prevalent &#8220;static&#8221; paradigms like AI-first and AI-follow make users susceptible to anchoring and confirmation biases, while dynamic patterns like secondary, request-driven, dialogic, and user-guided assistance could help mitigate these.</p><p><a href="https://arxiv.org/abs/2310.19778">arXiv:2310.19778</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FS7d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FS7d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 424w, https://substackcdn.com/image/fetch/$s_!FS7d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 848w, https://substackcdn.com/image/fetch/$s_!FS7d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 1272w, https://substackcdn.com/image/fetch/$s_!FS7d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FS7d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png" width="1037" 
height="698" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:698,&quot;width&quot;:1037,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FS7d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 424w, https://substackcdn.com/image/fetch/$s_!FS7d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 848w, https://substackcdn.com/image/fetch/$s_!FS7d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 1272w, https://substackcdn.com/image/fetch/$s_!FS7d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F30abfdd0-0e50-4370-b5be-3217f0416987_1037x698.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>3. &#8220;Human-in-the-Loop Machine Learning&#8221; (Book)</strong> &#8212; Robert Monarch (2021) | <em>Manning Publications</em></p><p>Introduction to integrating human judgment into ML systems in specific ways. Less about human oversight, more about the role humans play in annotation, active learning, transfer learning, and using machine learning to optimize the process. Even though it&#8217;s a bit different from the other papers, it&#8217;s an interesting read on the role of humans in the machine &#8216;learning&#8217; process. &#128214; <a href="https://www.manning.com/books/human-in-the-loop-machine-learning">Manning Publications</a></p><h2><strong>Phase 2: The Blindspots</strong></h2><p><strong>1. &#8220;Knowing About Knowing: An Illusion of Human Competence Can Hinder Appropriate Reliance on AI&#8221;</strong> &#8212; He et al. (2023) | <em>ACM CHI 2023</em></p><p>Basically Dunning-Kruger with AI in the mix. 
And this was before LLMs reached today&#8217;s capabilities. The paper looks at how self-awareness affects the way humans work with AI. Overconfident users tend to under-rely on superior AI systems. An intervention helps over-estimators calibrate their skills and improve reliance, but hurts under-estimators, who start to reject valid AI advice after realizing their own competence. Because of this, how well human oversight works also depends on the individual. I wonder how this has changed with the state of LLMs today.</p><p><a href="https://arxiv.org/abs/2301.11333">arXiv:2301.11333</a></p><p><strong>2. &#8220;Effect of Confidence and Explanation on Accuracy and Trust Calibration&#8221;</strong> &#8212; Zhang, Liao &amp; Bellamy (2020) | <em>ACM FAccT 2020</em></p><p>Interesting study showing that confidence scores can help calibrate trust, but trust calibration alone is insufficient to improve AI-assisted decision making. Local explanations help even less with trust calibration and accuracy of AI-assisted decision making. It also highlights that human and AI blind spots may be similar. I thought this was interesting as it showed that having explainability may not always help.</p><p><a href="https://arxiv.org/abs/2001.02114">arXiv:2001.02114</a></p><p><strong>3. &#8220;Fewer Than 1% of Explainable AI Papers Validate Explainability with Humans&#8221;</strong> &#8212; Suh et al. (2025) | <em>arXiv preprint</em></p><p>While this seems to belong better in a reading list on explainability, I thought it showed some key insights on the relationship between human oversight and explainability. The review shows that fewer than 1% of research papers on explainability validate their claims with human subjects. 
The authors argue that explainability methods that are not tested with humans are akin to releasing drugs based on biological principles without clinical trials.</p><p><a href="https://arxiv.org/abs/2503.16507">arXiv:2503.16507</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uv8W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uv8W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 424w, https://substackcdn.com/image/fetch/$s_!Uv8W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 848w, https://substackcdn.com/image/fetch/$s_!Uv8W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 1272w, https://substackcdn.com/image/fetch/$s_!Uv8W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uv8W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png" width="752" height="403" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:403,&quot;width&quot;:752,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uv8W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 424w, https://substackcdn.com/image/fetch/$s_!Uv8W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 848w, https://substackcdn.com/image/fetch/$s_!Uv8W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 1272w, https://substackcdn.com/image/fetch/$s_!Uv8W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b54f12c-2010-425b-87cb-fecae6f4bf1c_752x403.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>Phase 3: Meaningful Oversight</strong></h2><p><strong>1. &#8220;Designing Meaningful Human Oversight in AI&#8221; </strong>- Zhu et al. (2025) | SSRN Preprint</p><p>Focuses on why human oversight should go beyond just putting a &#8220;human-in-the-loop.&#8221; This paper argues AI should handle &#8220;operative agency&#8221; (generating solutions) while humans provide &#8220;evaluative agency&#8221; (understanding, verifying, intervening). Key principles: make verification easier than solving from scratch, focus on external reasoning aligned with expert judgment rather than explaining model internals, and ensure four conditions&#8212;clear boundaries with explicit handover points, full traceability, AI pursuing sub-goals while humans control top-level objectives, and AI adapting at micro-level while humans oversee major changes.</p><p><a href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5501939">SSRN 5501939</a></p><p><strong>2. &#8220;Should I Follow AI-based Advice? 
Measuring Appropriate Reliance&#8221;</strong> &#8212; Schemmer et al. (2022) | <em>arXiv preprint</em></p><p>Explains why current metrics are not appropriate for measuring the effectiveness of human oversight of AI. Proposes Relative Positive AI Reliance (human&#8217;s ability to switch to AI&#8217;s views when human was wrong and AI right), and Relative Positive Self-Reliance (human&#8217;s ability to stick to their own correct decision when AI provides incorrect advice).</p><p><a href="https://arxiv.org/abs/2204.06916">arXiv:2204.06916</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TgIc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TgIc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 424w, https://substackcdn.com/image/fetch/$s_!TgIc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 848w, https://substackcdn.com/image/fetch/$s_!TgIc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 1272w, https://substackcdn.com/image/fetch/$s_!TgIc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!TgIc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png" width="538" height="664" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:664,&quot;width&quot;:538,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TgIc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 424w, https://substackcdn.com/image/fetch/$s_!TgIc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 848w, https://substackcdn.com/image/fetch/$s_!TgIc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 1272w, https://substackcdn.com/image/fetch/$s_!TgIc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1c05cc41-335a-4726-85a2-0be0585d93a8_538x664.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>3. &#8220;Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance&#8221;</strong> &#8212; Bansal et al. (2021) | <em>ACM CHI 2021</em></p><p>Even before Generative AI hallucinations became a common term, this study showed the tendency for humans to over-rely on explanations that AI provided for a prediction or recommendation, even when it was wrong. So, don&#8217;t always believe the reasoning from your friendly LLM. And it showed that a prediction or recommendation with a confidence score could lead to better performance by humans that were assisted by AI.</p><p><a href="https://arxiv.org/abs/2006.14779">arXiv:2006.14779</a></p><p><strong>4. 
&#8220;Trust and Reliance in XAI &#8212; Distinguishing Between Attitudinal and Behavioral Measures&#8221;</strong> &#8212; Scharowski et al. (2022) | <em>ACM CHI 2022 Workshop</em></p><p>Another interesting paper at the intersection of explainability and human oversight. It discusses the need to distinguish between trust in AI (which is an attitude) and reliance on AI (which is a behavior), and how papers have not clearly distinguished the two. It also wonders whether &#8216;trust&#8217; is even the right term for describing how humans interact with AI, as it anthropomorphizes AI, which has neither agency nor an intent to betray us.</p><p><a href="https://arxiv.org/abs/2203.12318">arXiv:2203.12318</a></p><p><strong>5. &#8220;To Rely or Not to Rely? Evaluating Interventions for Appropriate Reliance on Large Language Models&#8221;</strong> &#8212; Bo, Wan &amp; Anderson (2024) | <em>ACM CHI 2025</em></p><p>Interesting paper focusing on humans and LLMs. It examines interventions to help users calibrate their trust in LLMs, looking at techniques like visually marking low-confidence words in red, or implicit answers, i.e., providing reasoning steps but withholding the final result. While these techniques force user deduction and reduce over-reliance, they may fail to foster appropriate reliance as users may also under-rely on correct advice. The study highlights a paradox, where users become more confident when making incorrect reliance decisions. 
Also suggests that simple frictions, such as static disclaimers, may outperform complex technical interventions.</p><p><a href="https://arxiv.org/abs/2412.15584">arXiv:2412.15584</a></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!10bX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!10bX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 424w, https://substackcdn.com/image/fetch/$s_!10bX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 848w, https://substackcdn.com/image/fetch/$s_!10bX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 1272w, https://substackcdn.com/image/fetch/$s_!10bX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!10bX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png" width="544" height="152" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:152,&quot;width&quot;:544,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!10bX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 424w, https://substackcdn.com/image/fetch/$s_!10bX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 848w, https://substackcdn.com/image/fetch/$s_!10bX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 1272w, https://substackcdn.com/image/fetch/$s_!10bX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe540a2c-7cd5-46d2-9d6f-f101dd2fc110_544x152.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h2><strong>Phase 4: Agent Oversight</strong></h2><p><strong>1. &#8220;Unraveling Human-AI Teaming: A Review and Outlook&#8221;</strong> &#8212; Lou et al. (2025) | <em>arXiv preprint</em></p><p>An interesting look at how the shift to Agentic AI changes things fundamentally for human-AI interactions, as AI moves from being a passive tool to being able to plan and reflect. 
It raises an interesting point about AI potentially delegating to humans instead of the reverse. It also looks at how AI may change team dynamics, and how AI sycophancy may cause a trust paradox due to its tendency to agree with humans even when they are wrong. The &#8220;peak-end&#8221; effect in human-AI interactions is also worth noting, as a single brilliant insight or a very smooth conclusion to a chat session can mask deeper reliability issues.</p><p><a href="https://arxiv.org/abs/2504.05755">arXiv:2504.05755</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R3CL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!R3CL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 424w, https://substackcdn.com/image/fetch/$s_!R3CL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 848w, https://substackcdn.com/image/fetch/$s_!R3CL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 1272w, https://substackcdn.com/image/fetch/$s_!R3CL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!R3CL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png" width="1005" height="372" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:372,&quot;width&quot;:1005,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R3CL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 424w, https://substackcdn.com/image/fetch/$s_!R3CL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 848w, https://substackcdn.com/image/fetch/$s_!R3CL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 1272w, https://substackcdn.com/image/fetch/$s_!R3CL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb66f0ae4-3b5e-4f8e-8f5c-6d76bd97b2cd_1005x372.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><strong>2. &#8220;LLM-Based Human-Agent Collaboration and Interaction Systems: A Survey&#8221;</strong> &#8212; Zou et al. (2025) | <em>arXiv preprint</em></p><p>A survey of LLM-based human-agent systems. 
Looks at such systems through the lens of type, granularity and phase of human feedback; interactions that take the form of competition, collaboration and coopetition (both competitive and collaborative); orchestration paradigms based on task strategy that can be synchronous or asynchronous; as well as different forms of communication structures and modes.</p><p><a href="https://arxiv.org/abs/2505.00753">arXiv:2505.00753</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D8x1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D8x1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 424w, https://substackcdn.com/image/fetch/$s_!D8x1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 848w, https://substackcdn.com/image/fetch/$s_!D8x1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 1272w, https://substackcdn.com/image/fetch/$s_!D8x1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!D8x1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png" width="1114" height="506" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:506,&quot;width&quot;:1114,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D8x1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 424w, https://substackcdn.com/image/fetch/$s_!D8x1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 848w, https://substackcdn.com/image/fetch/$s_!D8x1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 1272w, https://substackcdn.com/image/fetch/$s_!D8x1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5082e5cd-8757-430a-96c4-9d90f65a0415_1114x506.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>So before just saying &#8220;We have a human-in-the-loop as a risk mitigant&#8221;, think about this:</p><ul><li><p>Humans are fallible - e.g., humans systematically over-rely on AI recommendations</p></li><li><p>Design matters - e.g., calibration matters as much as understanding the AI; explanations don&#8217;t automatically help human oversight, and sometimes they hurt</p></li><li><p>It&#8217;s getting harder - e.g., autonomous agents make traditional oversight models increasingly inadequate</p></li></ul><p>What resources would you add to this list?</p><p>#AIOversight #AIRiskManagement #AIReadingList</p>]]></content:encoded></item><item><title><![CDATA[AI Agents as Normal Systems]]></title><description><![CDATA[If AI is normal technology, then AI agents are perhaps 
&#8230; just normal systems.]]></description><link>https://www.simplyboring.ai/p/ai-agents-as-normal-systems</link><guid isPermaLink="false">https://www.simplyboring.ai/p/ai-agents-as-normal-systems</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Tue, 16 Dec 2025 09:50:06 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!A1n4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If AI is normal technology, then AI agents are perhaps &#8230; just normal systems.<br><br>Arvind Narayanan and Sayash Kapoor articulated a compelling thesis on &#8220;AI as Normal Technology&#8221; a while back. Their core argument: AI can be understood through the lens of past general-purpose technologies such as electricity, the internet, and computing, rather than as a potential superintelligent entity.<br><br>So if AI is normal technology, then perhaps AI agents are just normal systems, not entities that can turn rogue, deceive, or collude.<br><br>A recent paper, &#8220;Measuring Agents in Production&#8221;, provides some evidence. 
The paper is based on a study of AI agents in production, surveying 306 practitioners and conducting 20 in-depth case studies.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://arxiv.org/abs/2512.04123" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!A1n4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 424w, https://substackcdn.com/image/fetch/$s_!A1n4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 848w, https://substackcdn.com/image/fetch/$s_!A1n4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 1272w, https://substackcdn.com/image/fetch/$s_!A1n4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!A1n4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png" width="458" height="594.0529411764705" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:882,&quot;width&quot;:680,&quot;resizeWidth&quot;:458,&quot;bytes&quot;:259374,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://arxiv.org/abs/2512.04123&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://quaintitative.substack.com/i/181774353?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!A1n4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 424w, https://substackcdn.com/image/fetch/$s_!A1n4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 848w, https://substackcdn.com/image/fetch/$s_!A1n4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 1272w, https://substackcdn.com/image/fetch/$s_!A1n4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4c4a3a-c92c-4d6e-8434-79b0237e5097_680x882.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><br>Some thoughts.<br><br>1&#65039;&#8419; <strong>Just like systems, specification is the core work.</strong><br><br>Points from paper: <br><em>- &#8220;Production agents favor well-scoped, static workflows ..&#8221;<br>- &#8220;Organizations deliberately bound agent behavior within specific action spaces ...&#8221;<br>- &#8220;Deployment architectures favor predefined, structured workflows over open-ended autonomous planning to ensure reliability.&#8221;</em><br><br>My take: Thinking that one can simply unleash AI agents on a problem and solve it is pure fantasy. 
Before a problem can be tackled by an AI agent, someone still has to write the specs, decompose the problem into clear tasks, scope the action space, think about orchestration, design the human handoffs, and define the success criteria for the system to work.<br><br>2&#65039;&#8419; <strong>Just like systems, trust and reversibility matter more than capabilities.</strong><br><br>Points from paper: <br><em>- &#8220;Practitioners deliberately trade-off additional agent capability for production reliability... reliability concerns drive practitioners toward simple yet effective solutions with high controllability.&#8221;<br>- &#8220;... teams restrict agents to &#8216;read-only&#8217; operations to prevent state modification &#8230; but leaves the final execution to human engineers.&#8221;</em><br><br>My take: No matter how capable AI agents become, organizations will adopt them only as fast as they learn to trust them. And the need for trust scales with the irreversibility of actions (reading an email is vastly different from executing a trade).<br><br>3&#65039;&#8419; <strong>Just like systems, risk arises from gaps in development and deployment, not AI going amok.</strong> <br><br>Points from paper: <br><em>- &#8220;Reliability remains the top development challenge, driven by difficulties in ensuring and evaluating agent correctness.&#8221;<br>- &#8220;Agent behavior breaks traditional software testing... teams have not yet identified effective methods to adapt &#8230; tests for nondeterministic agent behavior.&#8221;</em><br><br>My take: The real concerns aren&#8217;t the scary but unrealistic scenarios of runaway, deceptive, or collusive agents. What requires attention is the gaps in development and deployment practices for such complex systems. These are engineering and risk management problems, not AI going amok. <br><br>So, normal technology, normal systems, for normal problems. Not easy or trivial. But normal. <br><br>What&#8217;s your take? 
Normal or abnormal?<br><br>#AIRiskManagement #AgenticAI #AIAgents #NormalAI</p>]]></content:encoded></item><item><title><![CDATA[Simple Reading List on Explainability & Interpretability]]></title><description><![CDATA[&#8220;But we need to ensure we have explainability!&#8221;]]></description><link>https://www.simplyboring.ai/p/simple-reading-list-on-explainability</link><guid isPermaLink="false">https://www.simplyboring.ai/p/simple-reading-list-on-explainability</guid><dc:creator><![CDATA[Gary Ang (Ming)]]></dc:creator><pubDate>Thu, 11 Dec 2025 01:15:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!rjO-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#8220;But we need to ensure we have explainability!&#8221;</p><p>I hear this in meetings constantly. Someone senior gestures vaguely, everyone nods, and the conversation moves on.</p><p>My thought is always: <em>What does that actually mean?</em></p><p>Inherently interpretable models? Post-hoc SHAP values? Explanations for customers, users, risk managers, senior management?</p><p>At what level? Individual predictions or overall model behavior?</p><p>Here is a simple reading list that could perhaps help you understand the differences. There are probably hundreds (or even thousands) of relevant works, but I just picked a representative few from my library for the following phases, from when models were simple to explain, to the impossible task of explaining today&#8217;s trillion-parameter AI.</p><p>Note: I have used open-access links from arXiv wherever possible, since some of the published versions are behind a paywall.</p><h2><strong>Phase 1: From Inherent Interpretability to Post-Hoc Explainability</strong></h2><p><strong>1. 
&#8220;Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead&#8221;</strong> &#8212; Rudin (2019) | <em>Nature Machine Intelligence</em></p><p>This 2019 paper argues against using post-hoc explanations of black-box models for high-stakes decisions, and for using inherently interpretable models instead. It is a highly cited paper that seems both quaint and prescient at the same time. <a href="https://arxiv.org/pdf/1811.10154v3">arXiv:1811.10154</a></p><p><strong>2. &#8220;Why Should I Trust You?: Explaining the Predictions of Any Classifier (LIME)&#8221;</strong> &#8212; Ribeiro, Singh &amp; Guestrin (2016) | <em>ACM SIGKDD 2016</em></p><p>THE paper that proposed LIME, which explains an individual prediction by learning an interpretable surrogate model locally around it. The paper also goes beyond local explanations, showing how to select a diverse, representative set of explanations to explain the model as a whole. <a href="https://arxiv.org/pdf/1602.04938v3">arXiv:1602.04938</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rjO-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rjO-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 424w, https://substackcdn.com/image/fetch/$s_!rjO-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 848w, 
https://substackcdn.com/image/fetch/$s_!rjO-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 1272w, https://substackcdn.com/image/fetch/$s_!rjO-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rjO-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png" width="569" height="535" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:535,&quot;width&quot;:569,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rjO-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 424w, https://substackcdn.com/image/fetch/$s_!rjO-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 848w, 
https://substackcdn.com/image/fetch/$s_!rjO-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 1272w, https://substackcdn.com/image/fetch/$s_!rjO-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9fc19860-a967-466d-bbad-c40f0dd64ad1_569x535.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>3. 
&#8220;A Unified Approach to Interpreting Model Predictions (SHAP)&#8221;</strong> &#8212; Lundberg &amp; Lee (2017) | <em>NeurIPS 2017</em></p><p>THE paper that proposed SHAP for post-hoc explanations, and showed that several key methods of the time (such as LIME and DeepLIFT) belong to the same class of additive feature attribution methods. Highlights 3 desirable properties: local accuracy (the explanation model must match the output of the original model for the specific input being explained); missingness (a feature that is missing from the input must be assigned zero impact); and consistency (if the model changes so that a feature contributes more, its attribution must never decrease). <a href="https://papers.nips.cc/paper/2017/hash/8a20a8621978632d76c43dfd28b67767-Abstract.html">NeurIPS 2017</a></p><p><strong>4. &#8220;One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques&#8221; | &#8220;AI Explainability 360: An Extensible Toolkit for Understanding Data and Machine Learning Models&#8221;</strong> &#8212; Arya et al., IBM Research (2019 | 2020) | <em>arXiv preprint | Journal of Machine Learning Research</em></p><p>Goes beyond SHAP &amp; LIME to other methods. Interesting as it highlights the need for persona-based explainability (to cater to the different needs of, say, affected users vs. decision makers). Also classifies methods along several axes: static vs. interactive, data vs. model understanding, local vs. global, directly interpretable vs. post-hoc. 
<a href="https://arxiv.org/pdf/1909.03012v2">arXiv:1909.03012</a> | <a href="https://dl.acm.org/doi/pdf/10.5555/3455716.3455846">JMLR paper</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gQAQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gQAQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 424w, https://substackcdn.com/image/fetch/$s_!gQAQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 848w, https://substackcdn.com/image/fetch/$s_!gQAQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 1272w, https://substackcdn.com/image/fetch/$s_!gQAQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gQAQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png" width="1147" height="656" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:656,&quot;width&quot;:1147,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gQAQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 424w, https://substackcdn.com/image/fetch/$s_!gQAQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 848w, https://substackcdn.com/image/fetch/$s_!gQAQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 1272w, https://substackcdn.com/image/fetch/$s_!gQAQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53208a59-adac-4f27-802b-50e003e70c2b_1147x656.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>5. &#8220;Interpretable Machine Learning&#8221; (Book)</strong> &#8212; Molnar (2022) | <em>Open-access textbook</em></p><p>An open-access textbook covering the full spectrum, from linear regression and decision trees to SHAP and counterfactuals. A must-read for anyone who wants to get a good handle on interpretability and explainability. <a href="https://christophm.github.io/interpretable-ml-book/">christophm.github.io</a></p><p><strong>6. &#8220;A Comprehensive Guide to Explainable AI: From Classical Models to LLMs&#8221;</strong> &#8212; Hsieh et al. (2024) | <em>arXiv preprint</em></p><p>A textbook-style guide spanning the entire XAI spectrum, from intrinsically interpretable models (decision trees, linear regression) through post-hoc methods (SHAP, LIME) to LLM-specific techniques. Another good read. <a href="https://arxiv.org/abs/2412.00800">arXiv:2412.00800</a></p><h2><strong>Phase 2: Taking A Step Back to Critically Examine Explainability</strong></h2><p><strong>1. 
&#8220;From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI&#8221;</strong> &#8212; Nauta et al. (2023) | <em>ACM Computing Surveys</em></p><p>Proposes 12 properties in 3 dimensions for evaluating explanations: 1) What is explained? Correctness, Completeness, Consistency, Continuity, Contrastivity, Covariate complexity; 2) How is it explained? Compactness, Composition, Confidence; 3) Who is it for? Context, Coherence, Controllability. <a href="https://arxiv.org/pdf/2201.08164v3">arXiv:2201.08164</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!inU3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!inU3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 424w, https://substackcdn.com/image/fetch/$s_!inU3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 848w, https://substackcdn.com/image/fetch/$s_!inU3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 1272w, https://substackcdn.com/image/fetch/$s_!inU3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!inU3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png" width="761" height="757" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:757,&quot;width&quot;:761,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!inU3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 424w, https://substackcdn.com/image/fetch/$s_!inU3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 848w, https://substackcdn.com/image/fetch/$s_!inU3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 1272w, https://substackcdn.com/image/fetch/$s_!inU3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3de481a0-1cc3-41b5-a876-b41b2f0a7831_761x757.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>2. &#8220;How can I choose an explainer? An Application-grounded Evaluation of Post-hoc Explanations&#8221;</strong> &#8212; Jesus et al. (2021) | <em>ACM FAccT 2021</em></p><p>Goes beyond academic explanations and measures. The authors conducted a real-world study in which users had access to data only; data and model scores; or data, model scores, and explanations. It reveals a counterintuitive insight: adding model explanations makes human decision-making faster but can result in lower accuracy compared to reviewing raw data alone. <a href="https://arxiv.org/pdf/2101.08758v2">arXiv:2101.08758</a></p><p><strong>3. &#8220;Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods&#8221;</strong> &#8212; Slack et al. 
(2020) | <em>AAAI/ACM AIES 2020</em></p><p>Another paper that reminds us not to overly trust explanation methods like LIME and SHAP. It shows that they can be fooled by adversarial models that hide their bias: such a model distinguishes between real input data and the synthetic data (perturbations) used to generate explanations, and behaves innocuously on the latter. <a href="https://arxiv.org/pdf/1911.02508v2">arXiv:1911.02508</a></p><h2><strong>Phase 3: The Next Step (LLMs &amp; Agents)</strong></h2><h3>LLM Explainability</h3><p><strong>1. &#8220;Explainability for Large Language Models: A Survey&#8221;</strong> &#8212; Zhao et al. (2024) | <em>ACM Transactions on Intelligent Systems and Technology (TIST)</em></p><p>Start here if you&#8217;re new to LLM explainability. Provides a good overview of techniques for the fine-tuning (or training) paradigm (ranging from local to global methods) and the prompting paradigm (e.g., explaining chain-of-thought, using representations), as well as how to evaluate explanations for faithfulness and plausibility. 
<a href="https://arxiv.org/pdf/2309.01029v3">arXiv:2309.01029</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bJ2w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bJ2w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 424w, https://substackcdn.com/image/fetch/$s_!bJ2w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 848w, https://substackcdn.com/image/fetch/$s_!bJ2w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 1272w, https://substackcdn.com/image/fetch/$s_!bJ2w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bJ2w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png" width="1123" height="808" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c43bb847-083d-4da6-9d52-c09722391041_1123x808.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:808,&quot;width&quot;:1123,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bJ2w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 424w, https://substackcdn.com/image/fetch/$s_!bJ2w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 848w, https://substackcdn.com/image/fetch/$s_!bJ2w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 1272w, https://substackcdn.com/image/fetch/$s_!bJ2w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc43bb847-083d-4da6-9d52-c09722391041_1123x808.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p><strong>2. &#8220;XAI meets LLMs: A Survey of the Relation between Explainable AI and Large Language Models&#8221;</strong> &#8212; Cambria et al. (2024) | <em>arXiv preprint</em></p><p>Maps the bidirectional relationship: how XAI improves LLMs, and how LLMs can generate explanations. Advocates for balancing interpretability with performance. <a href="https://arxiv.org/pdf/2407.15248v1">arXiv:2407.15248</a></p><p><strong>3. &#8220;Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey&#8221;</strong> &#8212; Dang et al. (2024) | <em>arXiv preprint</em></p><p>Comprehensive survey on MLLM interpretability. Proposes framework across Data, Model, and Training/Inference perspectives. (Worth a read as vision-language models proliferate.) <a href="https://arxiv.org/pdf/2412.02104v1">arXiv:2412.02104</a></p><p><strong>4. &#8220;A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models&#8221;</strong> &#8212; Lin et al. 
(2025) | <em>arXiv preprint</em></p><p>Systematic comparison of how LLM interpretability methods adapt to multimodal settings. Identifies gaps between unimodal and crossmodal understanding. <a href="https://arxiv.org/pdf/2502.17516v1">arXiv:2502.17516</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!12Am!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!12Am!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 424w, https://substackcdn.com/image/fetch/$s_!12Am!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 848w, https://substackcdn.com/image/fetch/$s_!12Am!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 1272w, https://substackcdn.com/image/fetch/$s_!12Am!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!12Am!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png" width="888" height="780" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:780,&quot;width&quot;:888,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!12Am!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 424w, https://substackcdn.com/image/fetch/$s_!12Am!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 848w, https://substackcdn.com/image/fetch/$s_!12Am!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 1272w, https://substackcdn.com/image/fetch/$s_!12Am!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43e593dd-8577-4c20-9d8f-01deddeb3b42_888x780.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<h3>LLM Explorability (Mechanistic Interpretability)</h3><p><strong>1. &#8220;A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models&#8221;</strong> &#8212; Shu et al. (2025) | <em>EMNLP 2025</em></p><p>Comprehensive survey on SAEs, one of the hottest tools in LLM interpretability. Covers architecture, training strategies, feature explanation methods, and evaluation metrics. (If you want to understand what Anthropic is doing, read this.) <a href="https://arxiv.org/pdf/2503.05613v3">arXiv:2503.05613</a></p><p><strong>2. &#8220;Mapping the Mind of a Large Language Model&#8221;</strong> &#8212; Anthropic (2024) | <em>Anthropic Research</em></p><p>Not a survey, but the landmark work showing SAEs at scale on Claude 3 Sonnet, with millions of interpretable features including the famous &#8220;Golden Gate Bridge&#8221; feature. Circuit tracing and attribution graphs followed in Anthropic&#8217;s 2025 work. (This is where &#8220;understanding&#8221; LLMs actually starts.) 
<a href="https://www.anthropic.com/research/mapping-mind-language-model">anthropic.com/research/mapping-mind-language-model</a></p><p><strong>3. &#8220;Persona Vectors: Monitoring and Controlling Character Traits in Language Models&#8221;</strong> &#8212; Anthropic (2025) | <em>Anthropic Research</em></p><p>Shows how to identify, monitor, and steer directions in activation space that correspond to character traits (like sycophancy or a tendency to hallucinate), found by contrasting activations when the model does and does not exhibit the trait. Demonstrates practical applications of interpretability for AI safety. <a href="https://www.anthropic.com/research/persona-vectors">anthropic.com/research/persona-vectors</a></p><h3>Agent Explainability</h3><p><strong>1. &#8220;TRiSM for Agentic AI: Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems&#8221;</strong> &#8212; Raza et al. (2025) | <em>arXiv preprint</em></p><p>Adapts the Trust, Risk, and Security Management (TRiSM) framework for agentic AI. Includes explainability as a key pillar alongside security and privacy. Proposes novel metrics for agent collaboration quality. 
<a href="https://arxiv.org/pdf/2506.04133v4">arXiv:2506.04133</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Aok5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Aok5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 424w, https://substackcdn.com/image/fetch/$s_!Aok5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 848w, https://substackcdn.com/image/fetch/$s_!Aok5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 1272w, https://substackcdn.com/image/fetch/$s_!Aok5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Aok5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png" width="856" height="800" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:856,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Aok5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 424w, https://substackcdn.com/image/fetch/$s_!Aok5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 848w, https://substackcdn.com/image/fetch/$s_!Aok5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 1272w, https://substackcdn.com/image/fetch/$s_!Aok5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c980e81-8a96-4c66-841d-558d52a0048d_856x800.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p><strong>2. &#8220;Integrating Counterfactual Simulations with Language Models for Explaining Multi-Agent Behaviour&#8221;</strong> &#8212; Gyevn&#225;r et al. (2025) | <em>arXiv preprint</em></p><p>The AXIS framework: LLMs interrogating simulators with &#8220;what-if&#8221; prompts to explain agent behavior. Shows 23% improvement in goal prediction accuracy. 
<a href="https://arxiv.org/pdf/2505.17801v2">arXiv:2505.17801</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gfgT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gfgT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 424w, https://substackcdn.com/image/fetch/$s_!gfgT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 848w, https://substackcdn.com/image/fetch/$s_!gfgT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 1272w, https://substackcdn.com/image/fetch/$s_!gfgT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gfgT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png" width="1041" height="564" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:564,&quot;width&quot;:1041,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gfgT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 424w, https://substackcdn.com/image/fetch/$s_!gfgT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 848w, https://substackcdn.com/image/fetch/$s_!gfgT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 1272w, https://substackcdn.com/image/fetch/$s_!gfgT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58eb7e47-7df7-4cbf-871a-1ab7e8de392e_1041x564.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<div><hr></div><h2><strong>Standards &amp; Resources</strong></h2><p><strong>1. NIST AI Risk Management Framework (AI RMF) - Explainability &amp; Transparency</strong></p><p><a href="https://www.nist.gov/itl/ai-risk-management-framework">nist.gov/ai-rmf</a></p><p><strong>2. EU AI Act - Article 13: Transparency and Provision of Information to Deployers</strong></p><p><a href="https://artificialintelligenceact.eu/article/13/">artificialintelligenceact.eu/article/13</a></p><p><strong>3. Resources</strong></p><p>This GitHub repository is your gateway to the full explainability landscape.</p><p><a href="https://github.com/jphall663/awesome-machine-learning-interpretability">github.com/jphall663/awesome-machine-learning-interpretability</a></p><p>There are quite a few explainability/interpretability libraries around (see the repo above), but Agus Sudjianto&#8217;s Modeva is a good place to start. 
The documentation is a good read: <a href="https://modeva.ai/_build/html/index.html">https://modeva.ai/_build/html/index.html</a></p><p>&#8212;</p><p><strong>&#8220;Can you explain why the model did that?&#8221;</strong></p><p>It really depends on what you mean by &#8220;explain.&#8221;</p><p>Any must-reads in this area that you would recommend?</p>]]></content:encoded></item></channel></rss>