Short posts
Just released a plugin for llm to pull in U.S. legislation as fragments, enabling you to do stuff like:
llm -f bill:hr1-119 'Anything AI-related in here?' or llm -f bill:hr1-119:section-110101 'Is there language in here to prevent fraud?' → GitHub - kevinschaul/llm-fragments-us-legislation: Load bills from Congress.gov as LLM fragments
llm, the best CLI program for working with LLMs, now supports tool-calling. Very excited about these possibilities → Large Language Models can run tools in your terminal with LLM 0.26
Great investigation → Police secretly monitored New Orleans with facial recognition cameras
How reliable are LLMs at extracting data from PDFs? Inspired by Simon Willison’s PyCon talk, I added a new task to my LLM evals suite: extracting data from FEMA’s daily operations briefing.
Just one model extracted the data from the PDF correctly: Gemini 2.5 Pro Preview. → Extract FEMA incidents | LLM evals - Kevin Schaul
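For flavor, here is a minimal sketch of what this kind of structured-extraction eval can look like. The field names, data, and the idea that the model returns JSON rows are all hypothetical illustrations, not the actual FEMA briefing schema or my suite's implementation:

```python
# Sketch of scoring a model's structured extraction against known-good rows.
# All field names and values below are invented for illustration.

def score_extraction(extracted: list[dict], expected: list[dict]) -> float:
    """Fraction of expected records the model reproduced exactly."""
    hits = sum(1 for row in expected if row in extracted)
    return hits / len(expected)

# Pretend this came back from a model asked to extract incidents as JSON.
model_output = [
    {"state": "TX", "incident": "Severe Storms", "declared": "2025-04-20"},
    {"state": "OK", "incident": "Tornado", "declared": "2025-05-02"},
]

# Hand-labeled ground truth for the same PDF.
expected = [
    {"state": "TX", "incident": "Severe Storms", "declared": "2025-04-20"},
    {"state": "OK", "incident": "Tornado", "declared": "2025-05-03"},  # date differs
]

print(score_extraction(model_output, expected))  # 0.5
```

Exact-match scoring like this is deliberately strict; a real suite might also score partial field matches.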
Just click → https://neal.fun/internet-roadtrip/
Made a lil plugin for llm that only actually calls the model if it’s a new prompt. Should save a little time and money, especially when running evals. → GitHub - kevinschaul/llm-cache-plugin: Check whether you’ve already run this prompt before calling the LLM
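The general idea behind a cache like this is simple: key on a hash of the model plus the prompt, and only hit the API on a miss. A minimal sketch, with a stand-in call_model() and a made-up on-disk layout (this is not the plugin's actual implementation):

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".llm-cache")  # hypothetical cache location


def call_model(model: str, prompt: str) -> str:
    # Stand-in for a real (paid, slow) API call.
    return f"response from {model}"


def cached_prompt(model: str, prompt: str) -> str:
    """Return a cached response if this exact model+prompt has run before."""
    key = hashlib.sha256(f"{model}\n{prompt}".encode()).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())["response"]
    response = call_model(model, prompt)  # only pay for new prompts
    CACHE_DIR.mkdir(exist_ok=True)
    path.write_text(json.dumps({"response": response}))
    return response
```

Hashing model and prompt together matters: the same prompt against a different model should be a cache miss, since the responses differ.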
Mitigating prompt injections by building a custom Python interpreter. Very cool research here. → CaMeL offers a promising new direction for mitigating prompt injection attacks
Got v1 of my llm evals dashboard set up → Article tracking: Trump | LLM evals - Kevin Schaul
“Finnish company Check First scoured Wikipedia and turned up nearly 2,000 hyperlinks on pages in 44 languages that pointed to 162 Pravda websites.” → Russia seeds chatbots with lies. Any bad actor could game AI the same way.