Short posts
Great investigation –> Police secretly monitored New Orleans with facial recognition cameras
How reliable are LLMs at extracting data from PDFs? Inspired by Simon Willison’s PyCon talk, I added extracting FEMA’s daily operation briefing to my LLM evals suite.
Just one model extracted the data from the PDF correctly: Gemini 2.5 Pro Preview. –> Extract FEMA incidents | LLM evals - Kevin Schaul
Just click –> https://neal.fun/internet-roadtrip/
Made a lil plugin for llm that only actually calls the LLM if it’s a new prompt. Should save a little time and money, especially when running evals. –> GitHub - kevinschaul/llm-cache-plugin: Check whether you’ve already run this prompt before calling the LLM
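For the curious, here is a rough sketch of the general idea of caching by prompt. This is not the plugin's actual code and does not use the llm tool's API: `PromptCache`, `cache_key` and the `call_model` callable are hypothetical names made up for illustration, and keying on a hash of model name plus prompt text is an assumption about one reasonable way to do it.

```python
import hashlib
import json
import sqlite3
from typing import Callable


def cache_key(model: str, prompt: str) -> str:
    """Hash the model name and prompt into a stable lookup key."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class PromptCache:
    """Hypothetical cache: stores one response per (model, prompt) pair."""

    def __init__(self, path: str = ":memory:") -> None:
        # Pass a file path instead of ":memory:" to keep the cache between runs.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS responses "
            "(key TEXT PRIMARY KEY, response TEXT)"
        )

    def get_or_call(
        self, model: str, prompt: str, call_model: Callable[[str, str], str]
    ) -> str:
        key = cache_key(model, prompt)
        row = self.db.execute(
            "SELECT response FROM responses WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            # Seen this exact prompt before: skip the model call entirely.
            return row[0]
        # New prompt: call the model (assumed to return text) and remember it.
        response = call_model(model, prompt)
        self.db.execute(
            "INSERT INTO responses (key, response) VALUES (?, ?)", (key, response)
        )
        self.db.commit()
        return response
```

With something like this, re-running an eval only pays for prompts that changed. The tradeoff is that a cached answer hides any run-to-run variation in the model's output.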
Mitigating prompt injections by building a custom Python interpreter. Very cool research here. –> CaMeL offers a promising new direction for mitigating prompt injection attacks
Got v1 of my LLM evals dashboard set up –> Article tracking: Trump | LLM evals - Kevin Schaul
“Finnish company Check First scoured Wikipedia and turned up nearly 2,000 hyperlinks on pages in 44 languages that pointed to 162 Pravda websites.” –> Russia seeds chatbots with lies. Any bad actor could game AI the same way.
Marimekko chart alert –> See why Trump’s reversal actually increased tariff rates