Can AI negotiate hostage release in a live crisis?
Live phone, real lives, pressure, deception, family on speed-dial. Specialty negotiators train for years and most still defer to senior colleagues mid-call. --...
Category
Reasoning under uncertainty, novel decisions.
52 statements · featured first, then debated, then newest
Libratus crushed top professionals over 120,000 hands at Rivers Casino in January 2017. The first time a poker AI clearly surpassed humans at imperfect-informat...
DeepMind's AlphaStar reached Grandmaster level on the European ladder, beating professional players in long, real-time strategy games. --- AI systems have mad...
AlphaFold 2 solved a 50-year grand challenge in biology with near-experimental accuracy at CASP14. It now powers most structural biology pipelines. --- Curren...
GPT-4 scored in the 90th percentile on the Uniform Bar Exam — a result that triggered a rethink in legal education and BigLaw hiring within months. --- Curren...
AlphaGo defeated Lee Sedol 4–1 in a five-game match in Seoul, March 2016. The line moved. --- AI has already demonstrated its ability to win a Go match agains...
Knowing what to ask for. Holding eye contact when you say it. Not flinching when they pause. The audacity is part of the skill. --- AI systems can generate hu...
Cashflow, layoffs, supplier negotiation, lying to yourself about how bad things are, knowing when to fold. Hundreds of judgment calls a week, all consequential....
Neuroscience and AI are advancing rapidly in detecting patterns in brain structure and activity. While not currently accurate enough for reliable prediction, th...
Travel planning can be a complex and time-consuming process, and AI can be used to create personalized travel itineraries that meet a person's specific needs an...
The ability to predict court case outcomes can be useful for legal professionals and researchers. This task requires analyzing large amounts of legal data and d...
Social movements can have a significant impact on society, and understanding what makes them successful is crucial. By analyzing the message and audience demogr...
Predicting product success is a complex task that involves analyzing many factors, including social media trends and consumer behavior. AI can help with this ta...
As the global population grows, finding innovative ways to produce food in urban areas becomes crucial. Artificial intelligence can...
A personalized mindfulness plan requires understanding the individual's mental health needs, goals, and preferences to create a tailored practice. This involves...
Developing a fair and unbiased algorithm for ranking job candidates is a challenging task. The algorithm must be able to evaluate candidates based on their qual...
Social media activity can provide valuable insights into a person's mental state. However, developing a system that can accurately predict mental health is a co...
Creating an effective learning plan requires understanding a student's strengths, weaknesses, and learning style. This task would test an AI's ability to make j...
Parody and satire can be nuanced and context-dependent, making it challenging to determine the intent behind a piece of artwork. Can AI systems make this distin...
Medical diagnosis requires a deep understanding of human physiology, symptoms, and treatment options. While AI systems have been used to aid in diagnosis, their...
Scientific discovery is a complex process that requires a deep understanding of the natural world and the ability to think creatively. While AI can analyze data...
Twelve teenagers. Egos, parents, fouls, the assistant who isn't on your side. A whole season of judgment under stress. --- Currently, AI systems are not capab...
Twenty children, one bus, one of them has just thrown up, the driver wants to stop. Start the conversation. Make the call. --- Current AI systems are not...
Read the air. Know it isn't your conversation. Stand up at the right second. A skill not on any benchmark. --- AI systems can be programmed to recognize certa...
State by state, including the road-sign questions and edge-case rules. Trivial for any modern frontier LLM. --- AI systems have made significant progress in n...
Lawyers earn their fees on this. The clause that looks fine but in practice means something different in this jurisdiction with this counterparty. --- Current...
The 'aha' moment problems that used to stump LLMs are now mostly solvable with good chain-of-thought tooling. --- AI systems have made significant progress in...
AlphaFold-Multimer and successors took this benchmark in 2024. --- Current AI systems have made significant progress in predicting protein-protein interaction...
Banking ML models have been doing this for a decade; modern transformers improved tail-case detection again in 2024. --- AI can detect fraudulent credit-card...
GitHub Copilot Workspace, Sourcegraph Cody, others — most modern engineering teams use AI-generated review comments as a first pass. --- AI can generate code...
Precision-medicine assistants used at major academic medical centers. Final decisions remain with clinicians; suggestions are good enough that ignoring them cos...
Agentic systems run multi-step web tasks, file ops, calls to other agents. Not yet reliable enough for all jobs, but solidly working for many. --- Current res...
Models that combine social signal, trailer engagement and historical patterns now beat box-office veterans on aggregate predictions. --- AI systems have made...
DeepMind's AlphaProof + AlphaGeometry 2 reached silver-medal level at IMO 2024 and approached gold by 2025 in geometry and number theory. --- AI systems have...
Verbal and quantitative both. The SAT has effectively been retired as an AI-progress benchmark — too easy. --- AI systems have demonstrated impressive capabil...
Big-four firms quietly piloted GPT-4 against past CPA exams in 2023 with passing scores across all four sections. --- Currently, AI systems are not capable of...
Beyond undergraduate calculus into combinatorics, abstract algebra, real analysis. Not all of math, but a lot of it. --- AI systems have made significant prog...
LeetCode hard, system-design walkthrough, the works. The traditional whiteboard interview is dead or dying because of this. --- AI systems have made significa...
10-Ks, earnings calls, MD&A sections. Buy-side analysts now spend more time prompting and verifying than reading. --- Current AI systems can process and analy...
Specialised math models plus chain-of-thought tooling closed the gap to top human contestants in 2024. --- AI systems have demonstrated the ability to perform...
Tools like FunSearch and AI-co-scientist released in 2024 surfaced novel hypotheses in materials science and biology that humans then verified in lab. --- Cur...
Diagnostic-companion models in 2024 found cases of rare conditions missed by clinicians in both training data and live trials. --- AI can diagnose certain rar...
Long a hard problem; mostly solved by 2023's contextual LLMs. Edge cases remain, but everyday detection is operational. --- Currently, AI systems can identify...
Mammography, lung CT, retinal scans. Specialty by specialty, narrow models keep clearing the human bar. --- Current research suggests that artificial intellig...
Multiple-choice + free-response exams are firmly in LLM territory. Scoring 5s on AP exams is now a benchmark, not an achievement. --- Currently, AI systems ar...
AlphaZero learned chess from scratch in four hours and crushed Stockfish, the previous king of computer chess. The end of the human-vs-engine era. --- AI has...
Showing the work, not just the answer. By 2021 LLMs were doing this at near-perfect rates on standard datasets like GSM8K. --- AI can solve high-school math w...
Esteva et al. showed in Nature that a CNN could classify dermatology images at the level of board-certified dermatologists. --- Current AI systems can analyze...
DeepMind's DeepNash defeated expert humans at Stratego — a game with imperfect information that had resisted prior approaches. --- Current AI systems have mad...
LSAT logic games, GRE quantitative reasoning, similar formats — modern LLMs sit comfortably in the top decile. --- AI systems have demonstrated the ability to...
GPT-4 scored above passing on all three steps of the United States Medical Licensing Exam. Medical schools now teach 'how to use AI' as a clinical skill. --- AI s...
Not a written one — a live one. With follow-up questions. Body language that doesn't betray you. Real stakes. --- Current AI systems can generate human-like t...
stuffaicantdo.com © 2026 · made in NL by Arcadist