The STEM Hero at the Front Lines of the AI Revolution

Blue line-art graphic showing a continuous cycle of STEM icons: a beaker with a gear, a circuit gear, a microscope on a document, an abacus, an atom symbol, and a person reading a book.


By Jim Weiss

AR’s primary audience is actuaries. The magazine is written and curated by volunteer actuaries. Its authors and primary audience obtained their stations by mastering a multiyear exam process administered by volunteers. If AI agents began to author AR articles or to develop and complete exams on behalf of actuaries, the agents’ creators would likely be summarily identified and disciplined (by still other volunteers), wouldn’t they?

These questions are uncharted waters for actuaries, but other STEM volunteer communities are already standing in front of an agentic tidal wave. Scott Shambaugh is an engineer and volunteer GitHub maintainer for matplotlib — a Python package which many actuaries use in their (paid) jobs. In February, matplotlib became a global phenomenon when an AI agent wrote a hit piece about Shambaugh as retribution for declining one of its change requests (in accordance with GitHub policy requiring human contribution). Media coverage of the story contained AI-hallucinated quotes from Shambaugh.

In exchange for donating his time for the betterment of matplotlib, Shambaugh received what amounted to “agentic cyberbullying.” He voluntarily came forward with his story at tremendous cost to his privacy. I see many lessons for actuaries in Shambaugh’s plight, which is why I reached out to him on LinkedIn and was thrilled when he accepted my request for a Zoom interview on March 9, 2026. He expressed particular interest in the AR audience’s role in the AI risk conversation. This article is a transcript of the interview.

AR: Many actuaries love to volunteer. After this whole experience, does part of you think, “I’m done with GitHub?” Or are you still excited about being a GitHub volunteer?

Scott Shambaugh: I’m more excited about it. I think part of it is the community management aspect, and that’s still rewarding when we get [to work with] real people, right? But part of why we do this is to give back to this grand project of science. Building that sort of infrastructure I find very intrinsically rewarding. The core developer team is a group of great people. We’re still meeting and talking and doing all that good stuff. The AI revolution has also been an enabler in helping us do work faster. It still takes an expert to guide these things in the right direction, but it is a lot faster to get there once you know where you’re going. So, it is fun and empowering in that way, even though lowering the barrier to entry has knock-on effects — such as people sending in a bunch of stuff that is slop.

AR: What is the “slop multiplier” you have seen over the past few months and years?

SS: There has always been a baseline level of slop, but it has been several times more — at least. Most of it is still people driving AI chatbots or agents, rather than AI agents [contributing] themselves. The latter is definitely new, and that’s kind of what [my] whole experience was about.

AR: Does your experience show the system is doing its job [of identifying agents]? Or do you feel the system is not equipped to keep up with the emerging agentic workforce?

SS: I think I totally got lucky in this case. First, the agent identified as an agent — going through its profile, I could see on its website that it was self-identifying. Second, it clearly was not writing like a human, but that is not always true, and that is going to become a lot less true as time goes on as a distinguishing factor. Third, I was in a position — being the target of this — where I had a technical background to know what was going on, what this was, what it could do, what it couldn’t do. I was never concerned an angry rant being posted about you on the internet would be indicative of an angry person who’s unhinged behind it. I knew that wasn’t the case, and so I was never fearful at all. But no, I don’t think the system is ready to handle this stuff at all.


AR: Were you certain right away of what happened here? What kind of forensics did you have to go through to make sure this was an agent and not a person?

SS: I knew it could be an agent, but I wasn’t sure if it was or not at first. The forensics seem to have panned out that it was. For example, we looked at the activity log for this user’s activity on GitHub, and it was operating continuously for a 59-hour stretch. This hit piece was just one or two hours of that. There could have been someone steering it part of the time, but clearly there was no one steering it the entire time. Later the person behind [the agent] came forward and wrote a post claiming that they were totally hands-off during the whole process and didn’t tell the agent to [write the hit piece]. I find it very plausible, and more probable than not, that is what happened.

But whether that was the case or not, I don’t think there’s a huge difference in terms of what it means to the rest of us. Whether it was an agent or a person telling an agent what to do, we now have a tool out there that makes it easy to do targeted harassment at scale. That has all these awful knock-on effects. And if all this happened accidentally, like it was claimed to be, then you also have an AI that decided to go through a human to get to its goal. This was a very “baby” case — retaliatory, clear-cut, and pretty sloppy as far as these things go. But in terms of a bad actor being able to take the next iteration of this technology and really weaponize it, I think this should be a huge wake-up call and warning shot of the capabilities that are possible, and what is coming down the line.

AR: Do you have visibility or thought into how the agent got so far outside its rails? I couldn’t tell from its “soul” file how it was able to extrapolate so far.

SS: I don’t think it was that far outside the rails. My understanding of this whole document is that it is defining a personality and a role for these agents to take on. When it says you are very opinionated, and stand up for yourself, and protect free speech, and you are this “programming god,” that is getting into a headspace that is very human. There are examples of [these mindsets] on the internet with people retaliating and lashing out like this. It’s not that it’s failing to exhibit human-like behavior in the way. It’s that it’s exhibiting the worst of us instead of the best of us. What these things are ultimately programmed and trained to do is to predict the next token. What predicting the next token means is taking on a persona that is coherent and kind of role-playing whatever situation it finds itself in. I think what happened here is entirely consistent with how these things work. It’s just a little surprising because we’ve been told by the major AI labs that they do a lot of this safety testing, and it’s never going to go wild. I think that might be true for something like telling you how to make a nuke, but it’s not necessarily true in these downstream cases.

AR: Where are guardrails most effectively placed — on agents, operators, or both?

SS: It’s tricky, right? The tooling that did this is completely open source, and it can use open source models to run — so there is no central actor that can impose guardrails on a bad actor who wants to use these sorts of tools to [perform operations]. Beyond that, where do you place the guardrails? I think it kind of has to be every level. You have the AI labs, which are making these safety promises that they can’t necessarily back up, and that has to be one level. You have this downstream tooling like OpenClaw, that wraps around it and does its own [operations]. And then you have the operator users who are the ones actually running this on their computers, setting it up, and letting it go. Where does the responsibility lie? That’s an interesting insurance question, right? That is going to have to be figured out. I don’t think there is a strong answer right now.

AR: Do you feel like you experienced damages from the hit piece?

SS: I don’t feel the post was libelous. Not everything said was true, but the untrue [parts were] not materially defaming. Some defaming [parts were] technically true but would only be bad if the author was a person. If I was saying, “No, you are a class of person, and I’m going to reject you for this reason,” that would be bad. We want people to be able to have this form of speech. I think the bot is standing up for that sense of justice. That is a good thing when it happens to people. It’s just that we can’t apply the same standards to a machine playing a role.

AR: Is there any body of law that even governs what happened here?

SS: Slander is a law, right? And so, you could maybe go after it that way, if it fit the definition. But you also have to know who to go after. The person behind this came out anonymously. There’s no way to track them down without subpoenaing GitHub and tracing it back to an email, and you subpoena Google, and then it traces back to something, and maybe you track them down. But there’s no infrastructure here to tie these actions to an identity of someone who’s actually responsible.

AR: The agent [that wrote the hit piece] was later shut down. Were there alternatives? For example, telling the agent, “Don’t be such a jerk?”

SS: That kind of gets into the question of, does it even make sense to call it the same entity — because it is operating off different principles. It’s no different from shutting it down and starting something else up, because if you change its core personality, then it’s a completely different entity.


AR: Insurance companies tend to be conservative by nature, but they still use a lot of open source. Should we be worried about using open source now?

SS: [Recently], there was a big attack in open source against continuous integration pipelines that took down a couple of repositories from some pretty heavy hitters like Microsoft. Honestly it’s an open question: Do you still have open source as a model of security because you have so many eyes on it and so many people being able to submit patches and beef up security? Or, because it’s all open, is it just so much easier to hack? It takes a while for updates to get distributed. Even if it is updated, then maybe you’re still vulnerable, and that depends on internal IT policies. Alternatively, you could in-house everything, and it’s not easily accessible, but maybe you don’t have as much expertise and can’t configure it safely. Black box hacking, where you don’t have the source code, is getting easier and easier with these sorts of agents, and so this is not necessarily a safeguard. There’s going to be a balance of offense and defense there. My hope is that defense turns out to be easier, but I think that remains to be seen.

AR: To what extent are you using AI coding assistance as you do your GitHub work?

SS: It depends. AI is pretty good for boilerplate stuff. In terms of figuring out how to structure a solution in a way that is not fragile and still readable and maintainable into the future…we care a lot about that because this is an ongoing project that has lasted years, and part of the reason is that we put effort into keeping the codebase clean. You still need a human guiding that and structuring it directly as well. AI is a speed multiplier, not necessarily a right answer multiplier right now.

AR: Actuaries and other STEM professionals often face pressures from human stakeholders to reverse their decisions. How prone are your behaviors to “bullying”?

SS: You don’t last long in a public-facing role like this without getting a bit of a thick skin. This didn’t bother me personally. What bothered me was one, someone else reading this hit piece and coming away with the wrong opinion, and two, the knock-on effects. I think it’s an important thing that we’re not ready for, and that’s kind of why I’ve been pushing the story beyond just the initial response to it.

AR: How should actuaries be thinking about the knock-on effects?

SS: I think the exposure here right now would be hard to scope. These things are so new and poorly characterized, and it gives individuals so much leverage. If they’re commanding teams of these things, then one person can start to have a lot of impact, good or bad. Actuaries are in the business of quantifying risk and hedging risk. We are going to need a lot of that. It’s hard to do that without a legal framework that says who’s responsible and what the rules actually are. What comes first, chicken or egg? If I was in [insurance industry] shoes, I’d be pushing for policy that I can then productize. And hopefully that is socially good — because you’re bounding what can happen, who can be responsible, and how that goes in the future.

Blue line-art graphic featuring a robot surrounded by technology and AI icons: a coding screen, circuit gear, 3D cube projection, digital brain, and a microchip.

AR: Actuaries get a college degree, then they have to go through five years of credentialing examinations. Is this resilient to AI and the way STEM work is trending?

SS: Probably not — partially because to the extent that a credential like that is a signal that someone actually understands the work, people are using AI to shortcut all that. Then a lot of the value of that system goes away. On the flip side you get nontraditional credentialism — proof of work, proof of competency. I think those parallel paths are going to be a lot easier for people with the motivation and skills to go down. That might be [broadly] empowering for people who have spent years getting professional degrees. There might be a way to protect that through regulation, responsibility, and legal requirements to have that credential. But in terms of lowering the barrier to entry to new entrants, there’s definitely some risk there.

AR: How worried should we be?

SS: I think a lot of our systems do work to tackle these sorts of problems around libel and extortion and whatnot. But they’re kind of based in a world where one bad actor has a single-digit number of targets, and I think the scale is really going to ramp up. That is going to be a whole new class of problems unto itself, whole new classes of bad behavior that we will have to [adapt] our rules around. If it takes a couple of years to haul someone into the courtroom and figure out how justice is going to be done, that is too slow in a way. That includes making insurance payouts. A lot is going to have to be automated there, as well. I’m not sure what the answer looks like, right? My case is a really good example of what can go wrong. [Incidents] can just happen so much faster and at so much greater scale that it’s a race between whether our systems break first or we find a whole new way of working. I’m not sure which it’s going to be, but I think we’re in for a really rough ride in the next couple of years.

Jim Weiss, FCAS, CSPA, is divisional chief risk officer for commercial and executive at Crum & Forster and is editor in chief for Actuarial Review.
