Tags: language modeling

"I want _you_, Chad," said the woman in the video as she took off her shirt. "Those negative comments on your pull requests were just a smokescreen—because I was afraid to confront the inevitability of our love!"

Chad Morgan still couldn't help but marvel at what he and his team had built. It really looked and sounded just like her!

It had been obvious since DALL-E back in 'twenty-one—earlier if you were paying attention—that generative AI would reach this level of customization and realism before too long. Eventually, it was just a matter of the right few dozen people rolling up their sleeves—and Magma's willingness to pony up the compute—to make it work. But _it worked_. His awe at Multigen's sheer power would have been humbling, if not for the awareness of his own modest role in bringing it into being.

Of course, this particular video wouldn't be showcased in the team's next publication. Technically, Magma employees were not supposed to use their state-of-the-art generative AI system to make custom pornography of their coworkers. Technically (what was probably a lesser offense) Magma employees were not supposed to be viewing such content during work hours. Technically—what should have been a greater offense—Magma employees were not supposed to covertly introduce a bug into the generative AI service codebase specifically in order to make it possible to create such content without leaving a log.

But, _technically_? No one could enforce any of that. Developers needed to test what the system they were building was capable of. The flexibility for employees to be able to take care of the occasional personal task during the day was universally understood (if not always explicitly acknowledged) as a perk of remote-work policies. And everyone writes bugs.

This miracle of computer science was the product of years of hard work by Chad and his colleagues. _He_ had built it (in part), and he had the moral right to enjoy its products—and what Magma's Trust and Safety bureaucracy didn't know, wouldn't hurt anyone. He had _already_ been visualizing Elaine naked for months; delegating the cognitive work of visualization to be done inside of Magma's GPU farm instead of his own visual cortex couldn't make a moral difference, surely.

Elaine, probably, would object, if she knew. But if she didn't know that Chad _specifically_ was using Multigen _specifically_ to generate erotica of her _specifically_, she must have known that this was an obvious use-case of the technology. If she didn't want people using generative AI to visualize her body in sexually suggestive situations, then _why was she working to advance the state of generative AI?_ Really, she had no one to blame but herself.

Just as he was about to come, he was interrupted by a

* Chad gets a message from someone with a female name on the Capability Evals team. _Tranny or real?_ he wonders. The message asks to talk about a suspicious code change
* Evals team had been commissioned recently due to concern about existential risk. Their scope of power is unclear. Chad is skeptical, thinks something derogatory about "Yuddites." He had agreed to be the Multigen team's designated contact person, to be contacted by a liaison from the Evals team, thinking it was a joke.
* "I hope I'm not interrupting anything important," she said. _Definitely a tranny_, thought Chad. "No, nothing important," he said.
* She points to the offending commit, Chad is shocked, terrified that he's been discovered.
* The commit uses a regex to match logs, but the regex is written so that it doesn't trigger if the request body starts with a 0x07 ASCII bell character

        r = re.compile(r"^[^\a].*")
        if r.match(data.get("prompt")):

* He had attributed the commit to Code Assistant with `git commit --amend --author=`, and even used Code Assistant's GPG key, and force-pushed it into someone else's PR, thinking that no one proofreads regular expressions
* His terror is broken by puzzlement that the Evals team is telling him this. Does ... does she think the Code Assistant AI did this intentionally? To cover its tracks??
* She wouldn't have, if it were just the commit, but the reverse proxy has logs that don't match up with Multigen's internal logs, suggesting someone from within Magma's VPN is exploiting the bug!
* She doesn't think Magma should be pushing capabilities the way it is, at all.
* Chad is very nervous; he thought deleting the Multigen logs would be enough (the videos are also stored in object storage, but there's no particular reason to expect a human to be combing through the raw files ... but they will, if there's an investigation)
* He sets up another meeting with the Evals team member, to try to suss out what her plans are, to stall—but ostensibly, to get up to speed on her risk concerns
* Scene break: at the meeting, she's explaining Christiano's idea about there being a basin of policies that admit their mistakes, rather than using deception to get a high score
* Chad sees the analogy to his own behavior