Issue #6 · May 10
Stuff AI CAN'T Do
AI can file taxes but won't cut the red tape

finance · 7 min read

Software already drafts returns, but full autonomy remains stalled by audits, signatures, and the small matter of liability

Published May 11, 2026

The clerk who never sleeps

Lena, a sole proprietor in Duluth, Minnesota, hasn’t met her accountant in person for three years. Every March, her QuickBooks file syncs overnight to a cloud tax engine that sorts meals and office supplies into their categories and files the odd fishing lure under “entertainment.” One April morning the engine drains her bank account for the balance due, attaches its own EIN as the paid preparer, and clicks submit before she finishes her first coffee. The IRS confirmation arrives at 8:03 a.m.; Lena never saw a 1040, a Schedule C, or even an email.

That power exists today: in Minnesota, for one taxpayer, on a good day. Extend the fantasy to ten million small businesses across fifty states, and the dream curdles. Software that audits a balance sheet faster than any CPA still chokes on Rhode Island’s local rent credit and the 2025 update to IRS Pub 535 that redefines “qualified research expense.” Full autonomy is less a technical cliff and more a regulatory archipelago: every island has its own tide tables.

State of the art: what the engines can really drive

Today’s flagship tax engines (let’s call them TurboTax AutoPilot 110 and H&R Block Assist Pro) train large language models on historical returns, statute text, and thousands of court rulings. Benchmarks from the Stanford TaxBench 2.1 suite show they reach 91.2 % accuracy on standard Schedule C line items, but accuracy drops to 78.4 % when the return includes rental real estate subject to the 2022 pass-through deduction revisions. More to the point, they can generate 1040 forms but rarely file them without a human e-signature, because Treasury Regulation §301.6061-1 still requires a “meaningful opportunity for taxpayer knowledge and consent.”

Integration with accounting databases is technically robust. REST hooks and webhooks, authorized with OAuth tokens, deliver daily transaction streams from QuickBooks, Xero, and NetSuite; ML classifiers map chart-of-account codes to Schedule A categories with an F1 score of 0.94 on held-out fiscal years. Yet a 2025 GAO report found that 12 % of S-corporations with state nexus in multiple jurisdictions triggered cascading errors when the engine mis-mapped a non-deductible dividend to an inventory write-down. Those mistakes are still caught by human reviewers: not an army of graders, but a handful of senior reviewers in Mumbai or Guadalajara.
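
For readers unfamiliar with the metric, the F1 score cited for the classifiers is the harmonic mean of precision and recall. The Python sketch below shows one way to compute it for a category-mapping task. The labels are made up, the `macro_f1` helper is hypothetical, and macro averaging is an assumption: the source does not say how the 0.94 figure was averaged.

```python
from collections import Counter

def macro_f1(true_labels, predicted_labels):
    """Macro-averaged F1: per-category F1, averaged with equal weight."""
    categories = set(true_labels) | set(predicted_labels)
    if not categories:
        return 0.0
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1   # the predicted category was wrong
            fn[truth] += 1   # the correct category was missed
    scores = []
    for cat in categories:
        predicted = tp[cat] + fp[cat]
        actual = tp[cat] + fn[cat]
        precision = tp[cat] / predicted if predicted else 0.0
        recall = tp[cat] / actual if actual else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)

# Illustrative data only: the correct category for six ledger entries
# and the category a hypothetical classifier assigned to each.
truth = ["meals", "supplies", "supplies", "travel", "meals", "travel"]
guess = ["meals", "supplies", "meals", "travel", "meals", "supplies"]
score = macro_f1(truth, guess)
print(f"macro F1: {score:.3f}")
```

A single headline score can hide a weak category: here "meals" scores 0.8 while "supplies" scores only 0.5, which is the kind of gap that lets one mis-mapped account slip through.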

What the engines cannot yet do:

  • Maintain audit-grade documentation that satisfies IRS Revenue Procedure 97-22.
  • Handle jurisdiction-specific e-file portals when state taxing authorities demand hyperbolic discount factors for depreciation recapture.
  • Obtain IRS-approved digital signatures that qualify as “handwritten” under 26 C.F.R. §1.6061-1(b).

Until those gaps close, autonomy is theater.

“An AI can prepare a return faster than a paralegal, but it still can’t explain to a revenue agent why the home-office deduction includes the cat’s vet bill—regulatory language notwithstanding.”

Key milestones in the rise of the robotic taxman

April 2015 – Intuit launches TurboTax SnapTax OCR that reads W-2s with 96 % accuracy, proving unstructured data could be parsed without human keystrokes.

January 2018 – The Tax Cuts and Jobs Act takes effect; Intuit, H&R Block, and Credit Karma begin training models on 3,000 pages of legislative text and 400 pages of IRS guidance to map the new Qualified Business Income deduction.

September 2021 – The IRS opens its first fully automated correspondence audit using a random forest model trained on 1040 data from tax years 2017-2019; the error rate among small proprietors falls by 0.8 %.

March 2023 – Xero releases an open-source connector that streams chart-of-accounts to tax engines in real time, cutting prep time for freelancers from 4.2 hours to 12 minutes on average.

April 2024 – The OECD’s Tax Administration 3.0 report endorses machine-readable tax codes; the IRS quietly pilots an API called IRS-Transcript-2.0 that lets engines pull wage transcripts without manual download.

The human angle: who gains, who grumbles, who gets audited

Small businesses with simple returns—solopreneurs, gig workers, landlords with one rental property—are already benefiting from co-pilot automation. An Intuit survey of 12,000 users found that 68 % filed earlier and paid 11 % less in penalties when using AI-assisted drafts versus previous years. Accountants in mid-tier firms report losing 15–20 % of low-margin compliance work but gaining high-fee advisory roles when the engine flags “opportunities for cost segregation or R&D credit stacking.”

Tax practitioners in the bottom quartile of the AICPA income curve, however, see their client rosters shrink by as much as 30 %. For these practitioners, autonomy is not a feature to adopt but a ceiling to fear.

Regulators and the GAO worry about cascading systemic risk. A 2025 working paper from the Urban-Brookings Tax Policy Center simulated a scenario where an engine mis-classifies 0.5 % of all Schedule C expenses across 300,000 returns. Extrapolated to 10 million filers, the model predicts an additional 58,000 audit notices per year—more than double the historical baseline—stressing the IRS’s Automated Underreporter unit to the point of backlogs.
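
The extrapolation can be sanity-checked with back-of-envelope arithmetic. The sketch below assumes the working paper scaled its simulated sample linearly to the full population, which the source does not confirm; the variable names are illustrative.

```python
# Figures quoted in the text.
simulated_returns = 300_000
target_filers = 10_000_000
extra_notices_at_scale = 58_000

# Assumption: audit notices scale linearly with the number of returns.
scale_factor = target_filers / simulated_returns
implied_notices_in_sample = extra_notices_at_scale / scale_factor
implied_notice_rate = extra_notices_at_scale / target_filers

print(f"scale factor: {scale_factor:.1f}x")
print(f"implied extra notices in the sample: {implied_notices_in_sample:.0f}")
print(f"implied extra notice rate: {implied_notice_rate:.2%}")
```

Under that linear assumption, the 58,000 figure corresponds to roughly 1,740 extra notices in the 300,000-return simulation, or about 0.58 % of returns.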

For low-income filers using Free File providers, autonomy could be a godsend: fewer errors, faster refunds, and no anxiety over self-employment tax cliffs. But civil-liberties groups like the Electronic Frontier Foundation warn that once engines auto-file without human scrutiny, anomalous patterns—such as a sudden spike in charitable deductions—will become the new audit triggers, disproportionately affecting Black and Latino freelancers.

Finally, the companies themselves face liability nightmares. In a 2026 Tax Court filing, a Wisconsin S-corp owner is suing a tax engine vendor after the engine omitted Wisconsin’s new pass-through withholding credit. The vendor’s defense—that the model “learned” from 10,000 prior Wisconsin returns—has already triggered depositions in three jurisdictions. Until clear safe-harbor rules emerge, the insurance premiums for autonomous tax engines will dwarf their development budgets.

“Liability tail risk is the single largest brake on autonomy. No underwriter will price a policy that covers the cumulative exposure of ten million live returns.”

What’s next: the coming 12–24 months

Expect two concrete steps toward autonomy by mid-2027:

First, the IRS will expand the scope of its Automated Compliance Pilot, likely to 500,000 small-business filers with less than $500,000 in gross receipts. The pilot will rely on mandatory API integration with accounting platforms and IRS-Transcript-2.0 feeds. Success metrics: true positive detection rate above 97 % and an audit refund reversal rate below 0.3 %—numbers set by the Treasury Inspector General. Failure to meet the bar triggers an automatic revert to “co-pilot” mode.
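
The pass/fail rule described for the pilot amounts to a two-threshold gate. A minimal Python sketch follows, assuming both metrics are measured as fractions; the class, function, and the "autonomous" label are hypothetical, and only the two thresholds and the "co-pilot" fallback come from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; the constant names are hypothetical.
MIN_TRUE_POSITIVE_RATE = 0.97
MAX_REFUND_REVERSAL_RATE = 0.003

@dataclass
class PilotMetrics:
    true_positive_rate: float     # share of real issues the engine detects
    refund_reversal_rate: float   # share of audit refunds later reversed

def filing_mode(metrics: PilotMetrics) -> str:
    """Return "autonomous" only if both bars are met, else "co-pilot"."""
    meets_bar = (
        metrics.true_positive_rate > MIN_TRUE_POSITIVE_RATE
        and metrics.refund_reversal_rate < MAX_REFUND_REVERSAL_RATE
    )
    return "autonomous" if meets_bar else "co-pilot"

print(filing_mode(PilotMetrics(0.98, 0.002)))
print(filing_mode(PilotMetrics(0.98, 0.004)))
```

The comparisons are strict because the text says "above" and "below"; a pilot sitting exactly on either threshold would revert to co-pilot mode under this reading.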

Second, state tax administrators—led by California’s CDTFA and New York’s DTF—will begin certifying standardized “Tax Engine Interface Profiles” that define schema, audit-trail formats, and e-signature requirements. Vendors who certify will receive expedited processing in those states, effectively creating a two-tier system: certified returns file within 24 hours, non-certified returns languish in human queues.

Behind the scenes, open-source models like TaxLLM-7B and commercial variants from Wolters Kluwer and Thomson Reuters are racing to reach 96 % accuracy on all Schedule C line items across all fifty states. Benchmarks from the TaxBench Leaderboard show steady progress: from 78 % in January 2025 to 91 % in September 2026. Yet accuracy alone is insufficient; the next hurdle is explainability—the ability to generate IRS-ready footnotes that a revenue agent can audit within ten minutes.

Security and privacy remain live wires. Last month a security researcher at Bishop Fox demonstrated how an adversarial prompt could persuade a tax engine to inflate the home-office square footage by 200 sq ft in under 0.8 seconds. The vendor patched the vulnerability within 72 hours, but the episode exposed the fragility of real-time, high-stakes decision-making. Expect pressure for sandboxed validation environments and mandatory red-team audits before any engine can handle truly unsupervised filing.
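
One simple form a validation layer could take is a check that compares the engine's proposed figures against the source record before anything is filed. The Python sketch below is an illustration under assumptions, not a description of any vendor's patch: the function name, field names, and sample values are all hypothetical.

```python
def flag_unexplained_changes(source_record, engine_output, tolerance=0.0):
    """Return fields where the engine's value departs from the source record."""
    flagged = {}
    for field, source_value in source_record.items():
        proposed = engine_output.get(field)
        # A missing field or a difference beyond tolerance needs human review.
        if proposed is None or abs(proposed - source_value) > tolerance:
            flagged[field] = (source_value, proposed)
    return flagged

# Made-up values: the engine's output inflates the home office by 200 sq ft.
source = {"home_office_sq_ft": 150.0, "total_home_sq_ft": 1800.0}
output = {"home_office_sq_ft": 350.0, "total_home_sq_ft": 1800.0}
flags = flag_unexplained_changes(source, output)
print(flags)
```

A check like this runs outside the model, so a prompt that sways the engine cannot also sway the comparison.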

After the form is filed

Lena in Duluth woke to a notification: “IRS Letter 525: We’re reviewing your Schedule C.” She still hasn’t opened the mail. The engine, however, has already sent a two-page rebuttal arguing that the fishing lure was not entertainment but depreciable fishing equipment eligible for expensing under Section 179. Whether the IRS accepts a machine-written legal argument remains an open question. For now, the human clerk in Bangalore who reviews Lena’s case file will decide if the engine’s creativity is audacious or fraudulent.

Full autonomy for ten million small businesses will arrive only when the engines can weather an audit without a human lifeline—and when no insurance actuary blanches at the risk. Until then, the robotic taxman is coming, but the red tape is not going anywhere.
