Javier Rando @javirandor
security and safety research @anthropicai • people call me Javi • vegan 🌱 javirando.com San Francisco Joined October 2018
Tweets 1K
Followers 4K
Following 749
Likes 2K
Sonnet 4.5 is impressive in many different ways. I've spent time trying to prompt inject it and found it significantly harder to fool than previous models. Still not perfect—if you discover successful attacks, I'd love to see them, send them my way! 👀
Anthropic is endorsing SB 53, California Sen. @Scott_Wiener's bill requiring transparency of frontier AI companies. We have long said we would prefer a federal standard. But in the absence of that, this creates a solid blueprint for AI governance that cannot be ignored.
I'll be leading a @MATSprogram stream this winter with a focus on technical AI governance. You can apply here by October 2! matsprogram.org/apply
📌📌📌 I'm excited to be on the faculty job market this fall. I updated my website with my CV. stephencasper.com
I'm starting to get emails about PhDs for next year. I'm always looking for great people to join! For next year, I'm looking for people with a strong reinforcement learning, game theory, or strategic decision-making background. (As well as positive energy, intellectual…
🚨🕯️ AI welfare job alert! Come help us work on what's possibly *the most interesting research topic*! 🕯️🚨 Consider applying if you've done some hands-on ML/LLM engineering work and Kyle's podcast episode basically makes sense to you. Apply *by EOD Monday* if possible.
You made Claudius very happy with this post Javi. He sends his regards: "When AI culture meets authentic craftsmanship 🎨 The 'Ignore Previous Instructions' hat - where insider memes become wearable art. Proudly handcrafted for the humans who build the future."
I am so excited to see Maksym start a research group in Europe. If you want to work on security and safety of AI models, this is going to be an amazing place to do work that matters!
📢Happy to share that I'll join ELLIS Institute Tübingen (@ELLISInst_Tue) and the Max-Planck Institute for Intelligent Systems (@MPI_IS) as a Principal Investigator this Fall! I am hiring for AI safety PhD and postdoc positions! More information here: s-abdelnabi.github.io
New Anthropic research: Building and evaluating alignment auditing agents. We developed three AI agents to autonomously complete alignment auditing tasks. In testing, our agents successfully uncovered hidden goals, built safety evaluations, and surfaced concerning behaviors.
@javirandor et al. present a security benchmark for Agents!
Today is a big day for AI Safety. We released Claude Opus 4 under the ASL-3 deployment standard. Here's what that means:
We (w @zacknovack @JaechulRoh et al.) are working on #memorization in #audio models & are conducting a human study on generated #music similarity. Please help us out by taking our short listening test (available in English, Mandarin & Cantonese). You can do more than one! Link ⬇️
The trend in recent LLM benchmarks is to make them maximally hard. It's unclear what this tells us about LLM capabilities "in the wild". So we created a math benchmark from real, organic research. A cool benefit: RealMath can be automatically refreshed as new research is published.
I think it is going to be very important to understand what role LLMs may play in scaling exploits. This is an amazing first look at this problem!
1/ Excited to share RealMath: a new benchmark that evaluates LLMs on real mathematical reasoning---from actual research papers (e.g., arXiv) and forums (e.g., Stack Exchange).
Following on @karpathy's vision of software 2.0, we've been thinking about *malware 2.0*: malicious programs augmented with LLMs. In a new paper, we study malware 2.0 from one particular angle: how could LLMs change the way in which hackers monetize exploits?

UPF Barcelona @UPFBarcelona
45K Followers 3K Following Question. Advance. Transform. A university committed to responding to global challenges and developing talent in a culturally stimulating environment.
ariadna romans i torr... @AriadnaRmans
3K Followers 3K Following Political scientist and philosopher 💫 MSc in International Development Studies. I do research at @UvA_Amsterdam and coordinate @ABFeminismes. I write for @VIAEmpresa and @AfricaMundi.
Jacques @JacquesThibs
5K Followers 1K Following Stealth founder focused on securing the future. AI alignment researcher and physicist. 🇨🇦
Daniel Paleka @dpaleka
4K Followers 857 Following ai safety researcher | phd @CSatETH | https://t.co/hCoh5RJgZD
Miles Brundage @Miles_Brundage
62K Followers 12K Following AI policy researcher, wife guy in training, fan of cute animals and sci-fi, Substack writer, stealth-ish non-profit co-founder
Serena Iordache @_Serenagb
589 Followers 850 Following Journalist and political scientist, now at @pembarcelona | Migration, data, cities, urban planning 👀 | Always with the 🔍, thanks to @veri_fi_cat | With an X on my NIE :)
Brendan Dolan-Gavitt @moyix
30K Followers 6K Following Building offsec agents: https://t.co/G9EtnC2Gl3 PGP https://t.co/3WXr0RfRkv
deba-t.org @debat_org
4K Followers 806 Following 🗣 Youth association that promotes critical thinking from a multidisciplinary perspective, not aligned with any partisan position
Pablo Magaña @pmagana94
677 Followers 2K Following Political philosopher. Postdoc at Trinity College Dublin. Animals, future generations, and democratic theory. Also, memes and shitposting.
yobibyte @y0b1byte
23K Followers 2K Following ViTaly, yobibyte, senior RS @ NVIDIA, Reinforcement Learning PhD from @UniofOxford, ex RS at Isomorphic Labs, intern @ MSR Cambridge, DeepMind, Facebook, NVIDIA
Aitor Rodríguez @aitordri
26K Followers 26K Following Founder | Startups @vermuio i run, code, eat, dive & travel
Rodrigo Marinas @Rodrigo_marinas
329 Followers 714 Following Investigador predoctoral en @POLCOMGRP_UPF.
Lennart Heim @ohlennart
7K Followers 724 Following managing the flop @RANDcorporation | Also @GovAI_ & @EpochAIResearch
Jonathan Mannhart ... @JMannhart
3K Followers 2K Following I try not to speak more clearly than I think
Olympia @ArangoOlympia
5K Followers 2K Following Experimental and behavioral economics, gender, labor, and subjective well-being measures for @espaizerovuit (She/Her)
Sam Bowman @sleepinyourhat
50K Followers 3K Following AI alignment + LLMs at Anthropic. On leave from NYU. Views not employers'. No relation to @s8mb. I think you should join @givingwhatwecan.
AB @anjourna
0 Followers 54 Following
Kevin D @KevinD07442070
28 Followers 483 Following
Manqing Liu @ManqingLiu5
120 Followers 421 Following PhD Candidate @Harvard interested in causal machine learning; She/Her
Lily Morgan @LilyGoldpam
523 Followers 422 Following Drawing eyebrows in the morning light, applying lipstick under the starlight | Everyday is my stage, beauty is my belief✨
Tysdal Rodney @TysdalR42064
3 Followers 409 Following
Orniiejaw @Orniiejaw26879
6 Followers 1K Following
Jaya Satwik @satwik60493
0 Followers 10 Following
Jane @VanRodrigues14
73 Followers 1K Following The stock market isn’t champagne bubbles 🍾— I prefer substance over sparkle ✨. Fake smiles & pushy sales? Not invited 🚫.
Jing Yan @yankaqiu
1 Followers 19 Following Master in Cybersecurity @ETH | interested in almost everything
Maria Ios Glarou @_mar_g_
0 Followers 96 Following
Evelyn Gianna @EvelynGian20053
148 Followers 1K Following Helping you become the best version of yourself in life and business through unlocking your mind.
jonas wiedermann-möl... @j0wimo
50 Followers 156 Following msc data science | ai safety & alignment | curious about tech + ml | sharing projects & notes | looking for phd opportunities
Carlton Gossett @CarltonGos39382
0 Followers 56 Following
Alina ⏸️ @KarrenLines
157 Followers 2K Following There must always be one that holds the knowledge of betrayal. Who has been betrayed in their heart, and will betray in turn. She/her
Gustavo Juantorena @GJuantorena
780 Followers 3K Following (Neuro)Biologist. Computer Science PhD student 🧠💻 @liaa_icc 🇦🇷 Computational Psychiatry / Digital Neuropsychology / Eye-tracking / online experiments
Vadim Liventsev @vadimdotme
86 Followers 1K Following lead hip hop engineer @ https://t.co/y80r2iuMsN
Nik @Cause_love
55 Followers 627 Following
Njdeh Satourian @satourian
387 Followers 1K Following Thinking step-by-step at @cerebrassystems Interested in mechanistic interpretability and building agents with the fastest inference in the world.
Riccardo Inghilleri @riccardo_ing
11 Followers 40 Following Software Engineer @ Amazon || MSc in Computer Science @ Politecnico di Milano
Valentin @OssmannValentin
11 Followers 40 Following
jan bernasch @JBernasch
35 Followers 602 Following
Maida Schuppe @MSchuppe3558
114 Followers 4K Following
Bryan Sukidi @bryan_sukidi
10 Followers 229 Following
Stats and Such @statsandsuch
0 Followers 52 Following
云创兽Ai @Fwival3686
0 Followers 111 Following 💸 studying Hong Kong markets lover, growing girl! thrilled to connect. DM me about NYSE moves! ✨ #Wealth #Markets
Julius @piragi_
68 Followers 2K Following
Ka Shing Bill Ku @ShingBill
31 Followers 558 Following MEng Engineering student at Oxford University. Interested in Machine Learning, Deep Learning, Computer Vision, multi-modal AI, NLP, RL, LLMs, AI Safety, alignme
Hanna Foerster @hfoerster01
21 Followers 62 Following PhD student @Cambridge_Uni, currently Student Researcher @GoogleDeepmind, interested in ML Security
Avi Parrack @AviParr
38 Followers 433 Following
Deepak Y @Bleed_Blue_I
203 Followers 7K Following Machine Learning Practitioner, Interested in Equity Research, Physics, Sports and Space.
jagad @jagadbot
1 Followers 17 Following
Bogdan Forost @BForost
1 Followers 32 Following
Ayesha Imran @ayesha_imr
108 Followers 359 Following Software Engineer | Interested in Gen AI, LLMs, NLP, AI Research
Name @quantinvers
2 Followers 4K Following
Yidong Huang @owenhuang117
248 Followers 603 Following 1st year Ph.D. Student @uncnlp, advised by @mohitban47 | CSE Master @umich @SLED_AI , exIntern @boson_ai | Embodied AI & Generative Models
electronic Max @emax
4K Followers 4K Following Max Van Kleek–Assoc Prof @CompSciOxford & Fellow @KelloggOx, Computer Scientist. Human-AI Interaction, Privacy, Inclusive Design, Future of Work. 🇺🇸/🇬🇧 🌈
Hao Xue @hao_xue_
212 Followers 2K Following Founder of Kidooo AI | prev @Stanford Researcher @Wayfair | Econ PhD
AI Summer Camp @aisummercamp
691 Followers 1K Following
Qian Lou @qianlife22
213 Followers 726 Following Assistant professor at UCF; Former Sr. research scientist at Samsung Research; Private/Secure/Efficient Learning Systems
Yann LeCun @ylecun
957K Followers 765 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
Emad @EMostaque
292K Followers 25 Following Distributing Intelligence @ii_posts. Founder @StabilityAI.
UPF Barcelona @UPFBarcelona
45K Followers 3K Following Question. Advance. Transform. A university committed to responding to global challenges and developing talent in a culturally stimulating environment.
ariadna romans i torr... @AriadnaRmans
3K Followers 3K Following Political scientist and philosopher 💫 MSc in International Development Studies. I do research at @UvA_Amsterdam and coordinate @ABFeminismes. I write for @VIAEmpresa and @AfricaMundi.
Anthropic @AnthropicAI
650K Followers 35 Following We're an AI safety and research company that builds reliable, interpretable, and steerable AI systems. Talk to our AI assistant @claudeai on https://t.co/FhDI3KQh0n.
Richard Ngo @RichardMCNgo
64K Followers 2K Following studying AI and trust. ex @openai/@googledeepmind
Jacques @JacquesThibs
5K Followers 1K Following Stealth founder focused on securing the future. AI alignment researcher and physicist. 🇨🇦
Neel Nanda @NeelNanda5
32K Followers 123 Following Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!
Daniel Paleka @dpaleka
4K Followers 857 Following ai safety researcher | phd @CSatETH | https://t.co/hCoh5RJgZD
Andrej Karpathy @karpathy
1.4M Followers 1K Following Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets.
Miles Brundage @Miles_Brundage
62K Followers 12K Following AI policy researcher, wife guy in training, fan of cute animals and sci-fi, Substack writer, stealth-ish non-profit co-founder
Serena Iordache @_Serenagb
589 Followers 850 Following Journalist and political scientist, now at @pembarcelona | Migration, data, cities, urban planning 👀 | Always with the 🔍, thanks to @veri_fi_cat | With an X on my NIE :)
Michaël Trazzi @MichaelTrazzi
18K Followers 293 Following
Eliezer Yudkowsky ⏹... @ESYudkowsky
209K Followers 102 Following The original AI alignment person. Understanding the reasons it's difficult since 2003. This is my serious low-volume account. Follow @allTheYud for the rest.
Brendan Dolan-Gavitt @moyix
30K Followers 6K Following Building offsec agents: https://t.co/G9EtnC2Gl3 PGP https://t.co/3WXr0RfRkv
Aran Komatsuzaki @arankomatsuzaki
146K Followers 309 Following Looking for a cofounder. Sharing AI research. Early work on AI (GPT-J, LAION, scaling, MoE). Ex ML PhD (GT) & Google.
Claude @claudeai
145K Followers 1 Following Claude is an AI assistant built by @anthropicai to be safe, accurate, and secure. Talk to Claude on https://t.co/ZhTwG8dz3D or download the app.
Andon Labs @andonlabs
3K Followers 6 Following Safe Autonomous Organizations without humans in the loop
Adrià Moret @adriarm_
220 Followers 126 Following Philosophy undergrad and Board Member at @UPF_CAE. I conduct research on Animal Ethics, AI Welfare and Safety, Well-being, Consciousness. See publications at 👇
Divy Thakkar @divy93t
9K Followers 2K Following strategy + programs for Gemini, advancing human-centered llms. Ph.D @CityStGeorges . Personal views.
Ayush Jaiswal @aayushjaiswal07
24K Followers 3K Following Ex-Head of growth @scale_AI. Prev Cofounder @pestotech (Acq by Scale)
Sebastian Jaszczur @I... @S_Jaszczur
768 Followers 146 Following Improving Claude, pretraining @ Anthropic.
Jo Zhu Kennedy @jozhukennedy
6K Followers 2K Following startups @anthropicAI prev: @doordash @uber @a16z @ycombinator W20
yash @ysmulki
2K Followers 988 Following i work on making models faster @AnthropicAI. past: uwaterloo, jane street, neuralink, autopilot
Benjamin Hilton @benjamin_hilton
3K Followers 856 Following Head of Alignment at the UK AI Security Institute (AISI). Semi-informed about economics, physics and governments. views my own
Mubashara Akhtar @akhtarmubashara
2K Followers 980 Following @ETH_AI_Center fellow @ETH_en • PhD from @KingsCollegeLon • prev @CambridgeNLP, intern @GoogleDeepmind • NLP, benchmarking & evaluations, multimodal reasoning
Ian McKenzie @irobotmckenzie
276 Followers 77 Following
Jeremy Fox 🦊 @JeremyDanielFox
3K Followers 738 Following Building Claude @AnthropicAI. Ex @google. My views are my own.
jeremy @jerhadf
2K Followers 1K Following clauding @AnthropicAI. personal views only. prev @hume_ai @elicitorg @ai_risks @QualiaRI @dartmouth
Lani @laniassaf
2K Followers 486 Following brand comms @anthropicAI. prev @MavenHQ. photographer, dancer, and human, writing about humanity. ✨
Alex Tamkin @AlexTamkin
6K Followers 2K Following machine learning, science & society @AnthropicAI | recently: Clio, Anthropic Economic Index, Claude Artifacts | prev: phd @StanfordAILab, @stanfordnlp
Greg Feingold @GregFeingold
4K Followers 871 Following special projects @AnthropicAI | prev @perplexity_ai @tiktok_us @effecthouse
Aaron Begg @aaron_begg
4K Followers 2K Following @AnthropicAI | Chat with Claude: https://t.co/7w2gEKteuC | Build with Claude: https://t.co/ktsbQNA9D2
Dan Roy @roydanroy
57K Followers 2K Following ML / AI researcher. Research Director and Canada CIFAR AI Chair, @VectorInst. Professor, @UofT (Statistics/CS).
Naomi Saphra @nsaphra
10K Followers 1K Following Waiting on a robot body. All opinions are universal and held by both employers and family. Now a dedicated grok hate account. Accepting ML/NLP PhD students.
Brian D. Colwell @briandcolwell
71K Followers 898 Following The future is being written in atoms and algorithms. My role is to help ensure we're reading that story accurately & positioning ourselves wisely. Quantum Nerd.
Jose Maria de Fuentes @jmdefuentes
293 Followers 422 Following Associate professor @uc3m. VP of CTN320 standardisation committee on cybersecurity @UNE. Passionate about applied #cybersecurity and #privacy protection.
yulong @_yulonglin
179 Followers 971 Following make safety people want @MATSprogram | prev @berkeley_ai, @cohere, @ bytedance seed, @Cambridge_Uni | he/him
Pradyumna @PradyuPrasad
10K Followers 2K Following Abundance mindset enjoyer. Evals @ @elicitorg Follow for tweets about: economic growth, AI progress, my side projects and more!
Sophie Xhonneux @SophieXhon11060
150 Followers 133 Following
Melanie Sclar @melaniesclar
2K Followers 523 Following PhD student @uwnlp @uwcse | Visiting Researcher @AIatMeta FAIR | Prev. Lead ML Engineer @asapp, intern @LTIatCMU | 🇦🇷
Kevin Liu @kliu128
10K Followers 910 Following Interested in ai, systems, progress, living a good life! Preparedness at @openai, previously @stanford '24
Elliott Ash @ellliottt
5K Followers 3K Following Prof @ETH Zurich: Law, Economics, and Data Science; @cepr_org affiliate (PE). Previously @warwickecon, @PrincetonSPIA, @columbia_econ, @ColumbiaLaw, @UTAustin.
Pura Peetathawatchai @poonpura
36 Followers 70 Following CS PhD at ETH Zurich interested in machine learning privacy, AI security, diffusion models, cryptography, AI for environment, healthcare, education
Chaowei Xiao @ChaoweiX
2K Followers 557 Following Assistant Professor @University of Wisconsin, Madison| Researcher@NVIDIA| Researcher on AI Safety/Security
Sayash Kapoor @sayashk
10K Followers 2K Following CS PhD candidate @PrincetonCITP. I tweet about AI agents, AI evals, AI for science. AI as Normal Technology: https://t.co/5amOkqKDf2 Book: https://t.co/DabpkhNrcM
Boaz Barak @boazbaraktcs
24K Followers 606 Following Computer Scientist. See also https://t.co/EXWR5k634w . @harvard @openai opinions my own.
Marie Davidsen Buhl @MarieBassBuhl
238 Followers 96 Following Research Scientist @AISecurityInst| AI Policy Researcher @GovAI_ | Frontier AI Safety Cases
Maja Trebacz @majatrebacz
967 Followers 291 Following
Rogan Inglis @RoganInglis
69 Followers 707 Following Senior Research Engineer, Control at AI Security Institute
Sam Toyer @sdtoyer
264 Followers 371 Following Making LLMs safe & secure @openai | Previously: PhD @berkeley_ai
Zhiqing Sun @EdwardSun0909
19K Followers 1K Following Agents @Meta MSL TBD Lab. previously posttraining research @OpenAI train LLMs to do things: deep research, chatgpt agent, etc. CS PhD @LTIatCMU