Andrei Spătaru

• Published 25 Jun 2026 • 26 min read

The Confidence Trap: Why UK Business Owners Using AI Without Accounting Expertise Are Taking Risks They Cannot See

General-purpose AI feels authoritative on tax, but non-experts cannot tell when it is wrong. UK tribunal cases, cognitive research and HMRC penalties reveal the real risks, and where AI genuinely helps.

A critical evaluation for UK sole traders, directors, and small business owners.

A Behaviour That Is Becoming Measurable

Something has shifted in the relationship between UK business owners and their accountants. For decades, the flow of information was largely one-directional: the accountant knew the rules, and the business owner trusted the accountant. That dynamic is now being complicated by a third party that arrived with no professional qualification, no indemnity insurance, and no knowledge of the specific business it is advising, but which speaks with extraordinary fluency and apparent confidence: the general-purpose artificial intelligence chatbot.

The scale of this shift is no longer anecdotal. Research commissioned by Dext and published in December 2025, based on a survey of 500 UK-based accountants and bookkeepers, found that 77% have seen an increase in clients using systems such as ChatGPT and other large language models for financial, tax, or bookkeeping queries during 2025. Nearly three in four, specifically 72%, report that clients now use AI-generated outputs to challenge or question their professional advice. A further 68% report a rise in clients suggesting that AI could replace the need for professional accounting services altogether.

Half of the surveyed accountants are already aware of businesses that have suffered direct financial losses as a result of incorrect or misleading AI-generated advice. Those losses are specific: overpaid tax, missed allowances, financial penalties, compliance failures, and fines. The accountants surveyed are not neutral observers: they have a commercial interest that could lead them to overstate the risk. But the losses they describe are documented, not speculative, and the regulatory record that has accumulated in UK tax tribunals during 2025 and 2026 confirms the pattern independently.

This article examines what is actually happening when non-accountants use general-purpose AI for accounting and tax decisions, what the cognitive and structural reasons for the errors are, what the documented legal and financial consequences have been, and where the legitimate boundary between AI-assisted financial management and professional expertise genuinely lies.

The Appeal Is Rational, Which Is Part of the Problem

Before evaluating the risks, it is important to acknowledge that the appeal of using Claude, ChatGPT, Gemini, or any comparable large language model for accounting and tax questions is entirely understandable. These tools are immediately available, free or low-cost at entry level, responsive at any hour, and capable of producing answers that read as authoritative, structured, and comprehensive. For a sole trader with limited funds, facing a self-assessment deadline at 11 pm in January, the alternative to asking an AI is either spending several hundred pounds on professional advice or attempting to navigate HMRC’s own guidance, which is comprehensive but written in a register that assumes familiarity with legislative terminology.

The tools are also genuinely useful for a category of questions. A general-purpose AI can accurately explain what a VAT registration threshold is, what the difference between Class 2 and Class 4 National Insurance contributions means, what capital allowances are in general terms, or what the standard accounting treatment for a prepayment looks like. These are informational questions with stable, established answers that the model’s training data almost certainly contains correctly.

The problem begins precisely where the informational question ends and the applied question starts. The transition from ‘what is Business Asset Disposal Relief’ to ‘do I qualify for Business Asset Disposal Relief given my specific shareholding, employment history, and disposal structure’ is not a question of more information. It is a question of professional analysis applied to a specific set of facts, cross-referenced against current legislation, tested against the case law that defines the boundaries of the relief, and considered in the context of the individual’s full tax position. This is the category of question that general-purpose AI tools answer with the same confident fluency they apply to the informational question, and this is precisely where the damage occurs.

Real Cases, Real Judgments: What Has Already Happened in UK Tax Tribunals

The most important evidence for the risks of unsupervised AI use in tax matters is not survey data. It is the record of UK First-tier Tribunal and Upper Tribunal decisions in which AI-generated submissions have been explicitly addressed by judges, with named cases now forming a body of cautionary precedent.

The first significant UK case in this category is Felicity Harber v HMRC, heard by the First-tier Tribunal Tax Chamber in December 2023. Mrs Harber appeared in person and submitted arguments that referenced several previous Tax Tribunal decisions she contended supported her appeal. Neither the Tribunal nor HMRC could identify the cases she cited. When challenged, she acknowledged that the cases may have been sourced by AI, explaining that her submissions had been prepared by a friend working in a solicitor’s office. The cases did not exist. They were AI hallucinations: plausible-sounding citations with correct formatting, authentic-seeming names, and logical but entirely fictitious propositions. The appeal was dismissed.

The second case is HMRC v Marc Gunnarsson, heard by the Upper Tribunal Tax and Chancery Chamber in July 2025. Mr Gunnarsson, representing himself, submitted a skeleton argument referencing three First-tier Tribunal decisions that he said supported his case. HMRC informed him that the decisions did not exist. Mr Gunnarsson sent a revised submission removing the references and during the hearing confirmed he had used AI to help prepare his written submissions. The Upper Tribunal agreed with HMRC that coronavirus Self-Employment Income Support Scheme claims made by a director of a limited company were incorrect and required repayment. The AI-generated case law had not only failed to support his position; it had undermined his credibility before the tribunal.

The third case is Bodrul Zzaman v HMRC, heard by the First-tier Tribunal in 2025. Mr Zzaman, a father challenging a High Income Child Benefit Charge of £2,500, used AI to draft his statement of case. The tribunal dismissed his arguments as irrelevant to the legal question at issue. The judge explicitly noted that the case highlights the dangers of reliance on AI tools without human checks to confirm that what the tool is generating is accurate.

The fourth case, the most recent at the time of writing, is G Elden v HMRC, heard in January 2026. This case introduced a further variation: the submissions cited real case names, but the extracts presented were inaccurate, irrelevant, or unsupported by the actual decisions. The cases existed; the propositions attributed to them did not. The judge issued an explicit warning that the responsibility for verifying the accuracy of cited references lies with the human relying on them, not with the AI tool that generated them.

By late 2025, a survey of disciplinary actions and court records had documented nearly 800 AI-related citation errors across at least 25 countries, according to a paper published in January 2026 in the International Tax Journal, cited by Bloomberg Tax. The UK cases are not isolated incidents. They are part of a documented global pattern.

The ACCA (Association of Chartered Certified Accountants) published topical guidance on AI and professional conduct in January 2026, stating explicitly: ‘In all cases it is important to remember that outputs from AI tools should not be used as authoritative tax or legal advice, with review to be undertaken by a qualified professional in the specific context of the client to whom the advice is being provided.’ The Professional Conduct in Relation to Taxation guidance, adopted by all major UK accounting and tax professional bodies including the ACCA, the Chartered Institute of Taxation, and the Institute of Chartered Accountants in England and Wales, was updated in 2025 to address AI specifically. The Financial Reporting Council’s March 2026 guidance stated that unchecked AI hallucinations constitute evidence of a lack of professional scepticism, the standard it applies to audit partners and finance directors.

The Cognitive Mechanism: Why AI Makes Non-Experts Feel More Competent Than They Are

The legal record confirms that harm is occurring. The more important question for the purposes of prevention is why people who would not attempt to represent themselves in a complex tax dispute without legal support are willing to rely on AI output for the same purpose. The answer involves a specific and now empirically documented cognitive phenomenon.

The Dunning-Kruger Effect, first identified by psychologists David Dunning and Justin Kruger in a 1999 paper published in the Journal of Personality and Social Psychology, describes the tendency of individuals with limited knowledge in a domain to overestimate their competence in that domain. The effect also operates in reverse: genuine experts tend to underestimate their relative competence because they are acutely aware of the complexity they have mastered. In knowledge domains, uncertainty is a signal of depth of understanding, not a signal of inadequacy.

Research published in the journal Computers in Human Behavior in late 2025, conducted by a team led by Professor Robin Welsch at Aalto University in Finland, found that when humans interact with large language models specifically, the Dunning-Kruger Effect does not operate in its standard form. Instead, all users, regardless of their baseline knowledge level, overestimate their performance when using AI. More significantly, the study found a reversal of the standard effect: users who considered themselves more AI-literate showed greater overconfidence than those with less familiarity with AI tools.

‘We found that when it comes to AI, the DKE vanishes,’ Professor Welsch stated. ‘In fact, what is really surprising is that higher AI literacy brings more overconfidence. We would expect people who are AI literate to not only be a bit better at interacting with AI systems, but also at judging their performance with those systems, but this was not the case.’

The researchers identified the mechanism as cognitive offloading: the tendency to trust the system’s output without reflection or independent verification. When an AI tool produces a well-structured, confidently presented response, users treat the structure and confidence as evidence of correctness. The metacognitive signal that in a human expert would prompt the question ‘is this person actually right?’ is suppressed by the plausibility of the AI’s output.

For a business owner asking an AI about their tax position, this means the following: the AI’s answer feels authoritative because it is clearly expressed, well-organised, and covers considerations the user had not thought of. The user interprets this comprehensiveness as evidence of expertise. What they cannot evaluate, because they lack the domain knowledge to do so, is whether the answer is correct for their specific circumstances, whether it reflects the current state of legislation rather than the model’s training data from some months previously, and whether the confidence of the presentation corresponds to the strength of the legal or regulatory foundation underlying it.

A clinical assistant professor of accounting at Purdue University, J.T. Eagan, who studies AI use in tax contexts, described the phenomenon to CNBC in March 2026 with notable directness: ‘AI will convince you that the sky is green. It is so convincing.’ He cited a case where an AI chatbot incorrectly answered one of the tax questions he uses with his students, noting: ‘It gave me this response where the mechanics were perfect, but I had to take a step back and say, well, you are wrong.’ The critical difference between Professor Eagan and a sole trader using the same tool is that he has the domain knowledge to identify the error. The sole trader does not.

What General-Purpose AI Tools Cannot Do in Accounting and Tax, and Why

Understanding why general-purpose AI tools are structurally incapable of providing reliable applied tax advice requires a brief explanation of how these tools work, without the technical complexity that obscures more than it illuminates.

Large language models are trained on enormous volumes of text. They learn statistical relationships between words, phrases, and concepts across that text, and they generate responses by predicting the most likely continuation of a given prompt based on those relationships. They do not retrieve information from a verified database. They do not consult HMRC’s current published manuals in real time. They do not cross-reference a specific set of facts against a current legislative text. They generate the most statistically plausible response to the question as phrased, based on patterns in their training data.
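The prediction mechanism described above can be made concrete with a deliberately tiny sketch. This is not how any production model is built: it is a toy word-pair counter, and the corpus, the function names, and the figures in it are all invented for illustration. What it shares with a real model is the principle that the output is the most common continuation in the training text, not a looked-up fact.

```python
from collections import Counter, defaultdict

# Invented training text (an assumption for illustration, not real guidance).
# Two sentences carry an older figure, one carries a newer figure.
CORPUS = [
    "the threshold is 85000",
    "the threshold is 85000",
    "the threshold is 90000",
]

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def most_likely_next(follows, word):
    """Return the continuation seen most often in training, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

model = train_bigrams(CORPUS)
answer = most_likely_next(model, "is")
print(answer)
```

The toy model answers with whichever figure appeared most often in its text. Nothing in the procedure checks which figure is currently in force.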

This architecture produces two specific failure modes that are particularly dangerous in accounting and tax contexts.

The first is the knowledge cutoff problem. Every large language model has a training data cutoff date: a point beyond which it has no information about legislative changes, new HMRC guidance, tribunal decisions, or Budget announcements. Tax law in the United Kingdom changes every year, through primary legislation such as Finance Acts, secondary legislation, and HMRC interpretive guidance that can be updated at any time. A model trained on data from twelve or eighteen months ago may not know that a threshold has changed, that a relief has been modified, that a compliance deadline has moved, or that a previously accepted planning structure has been challenged and overturned by a tribunal. When a business owner asks about their tax position and the model applies the previous year’s rules, the error is invisible to the user because they do not know the rules well enough to recognise the discrepancy.
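A minimal sketch of the cutoff problem follows. The cutoff date, the rule changes, and the names `RuleChange` and `changes_after_cutoff` are all hypothetical, and the sketch makes the simplifying assumption that a model knows nothing about any change taking effect after its cutoff.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class RuleChange:
    description: str
    effective_from: date

def changes_after_cutoff(changes, training_cutoff):
    """Return the changes that took effect after the model's training cutoff.

    Simplifying assumption: anything effective after the cutoff is unknown
    to the model.
    """
    return [c for c in changes if c.effective_from > training_cutoff]

# Hypothetical cutoff and hypothetical changes, for illustration only.
MODEL_CUTOFF = date(2025, 1, 31)
CHANGES = [
    RuleChange("hypothetical threshold change", date(2024, 4, 1)),
    RuleChange("hypothetical relief modification", date(2025, 4, 6)),
    RuleChange("hypothetical deadline move", date(2026, 4, 6)),
]

unseen = changes_after_cutoff(CHANGES, MODEL_CUTOFF)
for change in unseen:
    print(f"Not in training data: {change.description} ({change.effective_from})")
```

A professional can run this kind of check because they know the list of changes. The business owner described above has no such list, so the gap never shows up.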

The second is the specificity problem. Tax outcomes depend on the precise facts of each case. Whether a transaction qualifies for a particular relief, whether an expense is allowable, whether a structure triggers an anti-avoidance provision, or whether a business meets the qualifying conditions for a given scheme, are all questions whose answers can change entirely based on a single factual difference. A business in which a director holds a 5% shareholding and meets the employment condition for Business Asset Disposal Relief has a meaningfully different tax position from a business in which the director holds a 4.9% shareholding and does not. An AI tool that cannot access the specific facts of a case, and cannot ask the structured professional questions required to elicit them, cannot provide a reliable analysis of that case. It can only provide a general answer based on the typical scenario, which may or may not correspond to the user’s actual situation.
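The sensitivity to a single fact can be sketched as follows. The function covers only the two conditions named in the paragraph above, the 5% shareholding and the employment condition. It is an illustrative assumption, not the statutory test, which has further conditions not modelled here. The name `meets_stated_conditions` is hypothetical.

```python
def meets_stated_conditions(shareholding_pct: float,
                            meets_employment_condition: bool) -> bool:
    """Check ONLY the two conditions mentioned in the text.

    This is a teaching sketch, not a Business Asset Disposal Relief
    eligibility test: the real relief has further conditions.
    """
    return shareholding_pct >= 5.0 and meets_employment_condition

# Two directors whose facts differ by a tenth of a percentage point.
director_a = meets_stated_conditions(5.0, True)
director_b = meets_stated_conditions(4.9, True)
print(director_a, director_b)
```

A general answer written for the typical scenario corresponds to the first call. Whether the user’s facts match the first call or the second is exactly what a chatbot cannot establish without asking for them.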

A Certified Public Accountant and tax professional, Miklos Ringbauer, who reviewed an AI-assisted tax conversation for a CNBC investigation published in March 2026, identified the structural nature of this problem precisely: ‘The question becomes does the taxpayer have the necessary understanding of the documents they look at to understand and correct any items that need to be addressed?’ For a user without accounting training, the answer is typically no. They cannot identify what the AI got wrong because they do not know what right looks like.

The Data Governance Risk That Barely Gets Mentioned

The cognitive risks described above relate to the accuracy of AI output. There is a parallel category of risk that receives substantially less attention in consumer-facing commentary: what happens to the financial data that business owners share with general-purpose AI tools.

When a business owner pastes their profit and loss figures, their supplier names, their customer payment records, or their salary and dividend structure into a public AI chatbot to get financial advice, they are transmitting commercially sensitive information to a third-party system operated under terms of service that may permit that information to be used for model training, stored on servers outside the United Kingdom, or disclosed in ways that are inconsistent with UK General Data Protection Regulation obligations.

Research cited in a DataSnipper report published in 2025 found that only 7% of organisations using AI have embedded AI governance frameworks, despite 93% using AI in some form. For a UK business subject to GDPR, sharing client financial data, employee payroll information, or personal customer records with a non-compliant AI platform is not simply a matter of poor security hygiene. It is a potential data protection breach that could attract regulatory attention from the Information Commissioner’s Office.

Enterprise versions of tools including ChatGPT and Claude offer data processing agreements and privacy controls that address some of these concerns, typically at additional cost. Free-tier and low-cost subscriptions generally do not offer the same contractual protections. A small business owner using a free AI subscription to get tax advice while sharing financial details of their clients or customers is creating data governance exposure that the brief convenience of the interaction does not justify.

The Professional Conduct in Relation to Taxation guidance, updated for 2025 and 2026, specifically requires practitioners to consider maintaining confidentiality and complying with GDPR and Data Protection Act requirements when using AI tools, and to understand the limitations of AI tools and mitigate the risk of irresponsible use. For unrepresented business owners using general-purpose AI without any professional framework, there is no equivalent guidance requiring them to think about these risks. The risk therefore tends not to be considered until after an incident occurs.

The Accountants Already Dealing With the Fallout

The consequences of unsupervised AI use in accounting and tax are not falling primarily on the AI companies whose tools produce the errors. They are falling on the qualified accountants who are subsequently engaged to remediate them, and on the businesses that absorb the financial cost.

The Dext survey of 500 accountants and bookkeepers, published in December 2025, found that among practitioners who have encountered client mistakes caused by AI-generated advice, 52% spend up to three hours per month correcting those errors, 38% spend between four and six hours per month, and 8% spend seven to ten hours per month. This is remediation time that is being billed to the business that made the original error, meaning that the apparent cost saving of using free AI advice is being recouped, with interest, in professional correction fees incurred later.

The forward-looking assessments from the same survey are sobering. Forty-five percent of respondents believe that business decisions based on false confidence in inaccurate AI outputs will become more common. Thirty-eight percent foresee rising fines and penalties from HMRC as a result of incorrect submissions. Thirty-seven percent anticipate greater HMRC scrutiny due to incorrect or late filings. A third of respondents, specifically 33%, warn of a higher risk of insolvency or business failure linked to misuse of AI outputs in financial decisions. Forty-three percent expect increased misuse of AI-generated content to justify inappropriate or fraudulent claims.

Paul Lodder, Vice President of Accounting Product Strategy at Dext, summarised the professional view in December 2025: ‘If we head into 2026 with more businesses treating AI outputs as trusted tax and financial advice without professional oversight, the consequences could be severe. The damage is no longer hypothetical.’

HMRC’s own position is not neutral on this question. The tax authority’s Connect AI system already cross-references submitted tax data against hundreds of external data sources, building a risk profile of each taxpayer based on behavioural consistency, industry benchmarks, and statistical anomalies. A business that has been advised by AI to claim an expense that is actually disallowable, or to apply a VAT treatment that is incorrect, does not receive a grace period because the error was generated by an AI tool rather than a human one. HMRC’s algorithms flag the inconsistency, and the compliance process follows.

The Distinction That Actually Matters: Task Category, Not Tool Category

The critical analytical distinction that is absent from most popular commentary on this subject is not between different AI tools, such as whether Claude or ChatGPT or Gemini is more accurate for tax questions. It is between categories of task.

The comparison of Claude versus ChatGPT for accounting tasks published by Accounting AI Tools in March 2026 found that both tools handle standard bookkeeping queries well, with ChatGPT performing better for quick categorisation questions and Claude performing better when reasoning and nuance are required for tax planning scenarios or complex document analysis. The comparison is useful for qualified accountants choosing between tools for professional workflows. It is not, however, the relevant question for a business owner without accounting expertise, because neither tool’s relative performance changes the fundamental structural limitation: applied tax advice for a specific factual situation requires domain knowledge that the user must possess to verify the output, regardless of which tool generated it.

The tasks for which general-purpose AI tools genuinely add value for non-accountants without meaningful risk are the informational and administrative ones. Understanding what a term means, drafting a professional email to a supplier, learning the general structure of a tax form before engaging a professional to complete it, or understanding the broad framework of a relief before asking an accountant whether it applies to a specific situation: these are tasks where AI’s fluency and accessibility make it genuinely useful, and where the absence of specific applied judgment does not create material risk.

The tasks for which general-purpose AI tools create risk for non-accountants are those where the output will be acted upon directly without professional review. Completing a tax return, claiming a specific relief, determining the VAT treatment of a particular supply, advising on the most efficient structure for profit extraction, calculating the capital gains tax arising from a disposal, or preparing submissions for an HMRC enquiry: these are tasks where the specificity gap and the knowledge cutoff problem combine with the cognitive overconfidence effect to produce a situation in which the user does not know the answer is wrong, cannot verify whether it is right, and will bear the full legal and financial consequence of acting on incorrect information.

What the Academic Literature Adds to the Practical Evidence

The practitioner evidence described above is reinforced by a strand of academic research that examines the broader dynamics of professional expertise displacement by AI, which is relevant to the accounting context even where the research does not address accounting specifically.

A perspective paper published in April 2025 by researchers at the Indian Institute of Technology Jodhpur, examining the relationship between professional expertise and AI collaboration, identified what it terms the paradox of professional input: as domain experts collaborate with AI systems by externalising their implicit knowledge into training data, they accelerate the development of AI systems that can simulate that expertise. The paradox is that this simulation is most convincing to people who lack the expertise to distinguish simulation from substance.

Separate research argues that AI tools deployed in domains where expert knowledge is complex but undervalued tend to produce outcomes that further obscure the value of that expertise, creating a feedback loop in which overconfidence in AI output and underestimation of human expertise reinforce each other.

The application to accounting is direct. Accounting expertise is systematically undervalued in popular perception. The image of the accountant as someone who fills in forms and adds up numbers, rather than someone who applies detailed knowledge of a complex and continuously changing body of legislation to specific factual situations, is pervasive. AI tools that can fluently discuss tax concepts reinforce this undervaluation by making it appear that the knowledge component of accounting is readily accessible. What they cannot replicate, as the feedback-loop (‘AI Failure Loop’) analysis above predicts, is the judgment component: the ability to recognise which facts are legally significant, to identify the question that has not been asked but should be, to apply current legislation correctly to an unusual situation, and to advise on the risk attached to a position that is technically arguable but practically exposed.

A comparison proposed in the academic literature, contrasting novice over-reliance on AI with expert augmentation by AI, describes what it terms the automaticity trap: novices treat AI output as definitive and act on it without further verification, while experts use AI to accelerate work they could do independently, applying their domain knowledge to validate and refine the output. The difference in outcome between these two usage patterns is not a function of which AI tool is used. It is a function of whether the user has the expertise to identify when the AI is wrong.

The Legitimate Role of AI in Accounting for Non-Accountants: A Precise Answer

Given everything established above, the appropriate question is not whether UK business owners should use AI tools for financial and accounting tasks, but precisely where those tools add value without creating risk, and where they create risk that professional expertise is required to manage.

AI tools are appropriate for non-accountants in the following specific applications:

Generating first drafts of financial correspondence that will be reviewed before sending.
Understanding the general framework of a tax obligation before discussing it with a professional.
Using AI-embedded features within HMRC-approved accounting software such as Xero, QuickBooks, Sage, or FreeAgent to automate transaction categorisation and bank reconciliation, where the AI operates within a defined and auditable system rather than as a freestanding advisor.
Summarising information from financial documents to prepare for a meeting with a professional.
Checking arithmetic in spreadsheets.

These are tasks where AI accelerates work and reduces friction without displacing the professional judgment that the work ultimately requires.

AI tools are not appropriate as substitutes for professional advice in the following applications:

Determining whether a specific transaction is subject to VAT, and at which rate.
Claiming any relief, allowance, or deduction whose qualifying conditions depend on specific factual circumstances.
Preparing any submission to HMRC, including self-assessment returns, VAT returns under Making Tax Digital, corporation tax computations, or responses to HMRC enquiries, without professional review.
Making structuring decisions about profit extraction, asset disposal, business sale, or investment that have material tax consequences.
Preparing any legal or tribunal submission.

The Making Tax Digital context is particularly important here. From April 2026, sole traders and landlords with qualifying income above £50,000 are legally required to submit quarterly updates to HMRC through compatible software. The automation of this submission process through AI-enabled accounting platforms is not the same as using a general-purpose chatbot to determine what should be in those submissions. The former uses AI within a defined, HMRC-approved framework. The latter uses AI to substitute for the professional judgment that should determine the content of the submission. These are categorically different activities, and conflating them is one of the most common errors in popular commentary on AI in accounting.

The Cost Calculation That Is Usually Missed

The perceived financial advantage of using free AI tools for accounting and tax is the upfront cost saving: no professional fee for a question that the AI answers immediately. The cost calculation that is rarely performed is the one that accounts for the full range of possible outcomes.

The cost of an incorrect VAT return, where HMRC raises an assessment for the underpaid VAT plus interest plus a careless behaviour penalty, can easily exceed the cumulative professional fees for several years of correctly prepared returns. The cost of missing a significant tax relief, such as an Annual Investment Allowance claim that AI failed to identify because the question was not phrased in a way that triggered the relevant analysis, is the permanent loss of tax that would have been saved. The cost of an incorrect self-assessment return that triggers an HMRC enquiry includes not only the professional fees for managing the enquiry but the time cost for the business owner of engaging with a process that can take twelve to eighteen months.
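The calculation described above can be sketched as an expected-cost comparison. Every number here is invented for illustration: the fee, the probabilities of error, and the cost of an error are assumptions, not survey findings or HMRC figures. The function name `expected_cost` is likewise hypothetical.

```python
def expected_cost(upfront_fee, outcomes):
    """Upfront fee plus the probability-weighted cost of each bad outcome.

    `outcomes` is a list of (probability, cost) pairs.
    """
    for probability, _ in outcomes:
        if not 0.0 <= probability <= 1.0:
            raise ValueError("each probability must be between 0 and 1")
    return upfront_fee + sum(p * cost for p, cost in outcomes)

# Hypothetical inputs: the same error costs the same in both routes,
# but is assumed to be far less likely with professional review.
ERROR_COST = 5000  # assessment, interest, penalty and correction fees (invented)

with_professional = expected_cost(600, [(0.02, ERROR_COST)])
ai_only = expected_cost(0, [(0.20, ERROR_COST)])

print(f"With professional review: {with_professional:.2f}")
print(f"AI only:                  {ai_only:.2f}")
```

With these invented inputs, the route with no upfront fee has the higher expected cost. The result depends entirely on the probabilities assumed, which is the point: the comparison cannot be made by looking at the upfront fee alone.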

A Sage and Demos study estimated that AI-enabled accounting, used properly within professional frameworks, could add £2 billion to UK GDP and create nearly 20,000 specialist roles by 2026. The same research implicitly identifies the counterfactual: the combination of AI capability with unqualified use is not the same as AI capability with professional oversight. The value accrues to the second combination. The first combination can, and demonstrably does, produce outcomes worse than neither AI nor professional involvement, because it eliminates the professional oversight without eliminating the complexity that oversight exists to manage.

Conclusion: The Question That Should Precede Every AI Query in Finance

The phenomenon this article has examined is not primarily a technology problem. The AI tools themselves are, for many purposes, genuinely impressive and genuinely useful. The problem is a calibration problem: a systematic mismatch between how confidently users perceive AI output on accounting and tax questions and how reliably that output can be acted upon without expert review.

The documented cognitive research establishes that AI use makes all users overestimate their competence, and makes more AI-literate users overestimate it most. The documented tribunal record establishes that this overconfidence has already produced tangible legal and financial harm for UK taxpayers, in cases that now form named precedents in the UK tax tribunal system. The documented practitioner survey establishes that qualified accountants are already spending significant time each month remedying errors that would not have occurred if the underlying questions had been directed to them in the first instance.

The calibration question that should precede every AI query in an accounting or tax context is not ‘can AI answer this?’ but ‘do I have the expertise to know if the answer is wrong?’ Where the answer to that question is yes, AI tools are a legitimate and often highly efficient resource. Where the answer is no, the AI tool is a plausible-sounding risk that the user has no mechanism to identify as such until the consequences arrive.

The appropriate use of AI in UK business finance is as a tool in the hands of qualified professionals, or as an automation layer within approved systems that qualified professionals oversee, not as a substitute for the expertise that those professionals possess. The businesses that understand this distinction will use AI to reduce the cost and increase the frequency of professional-grade financial insight. The businesses that do not will use AI to produce the appearance of professional-grade financial management while accumulating the risks that professional expertise exists to prevent.

The difference between these two outcomes is not which AI tool is used. It is whether someone qualified is looking at what the AI produces before it reaches HMRC.

This article was prepared by Zazen Tax. All tribunal case citations are from published UK First-tier Tribunal and Upper Tribunal decisions. Survey data sources include Dext/Censuswide (2025, 2026), Wolters Kluwer (2025), and Bloomberg Tax (2026). Academic research citations include the Aalto University study published in Computers in Human Behavior (2025) and related papers cited in context. This article does not constitute specific tax or legal advice.
