Why Email Validation is Harder Than You Think

The email address format is defined by RFC 5322 and its predecessors. The full specification allows some surprisingly odd-looking addresses that are technically valid:

  • "spaces in quotes"@example.com — quoted strings in the local part
  • user+tag@example.com — plus addressing (widely used for Gmail filtering)
  • user@subdomain.example.co.uk — multiple subdomains and ccTLDs
  • user@[192.168.1.1] — IP address literals as the domain
  • very.unusual."@".unusual.com@example.com — technically valid but almost never seen
  • user@xn--nxasmq6b.com — internationalized domain names (Punycode)

A fully RFC 5322-compliant regex would need to handle all of these while also rejecting addresses that break the format rules. The result is a pattern hundreds or even thousands of characters long that is essentially unmaintainable. In practice, the goal is not full RFC compliance; it is catching typos while not blocking real addresses.

The Simple Regex (Good Enough for 99% of Cases)

For most applications, this minimal pattern is the right choice. It validates the essential structure (something before an @, a domain, and a TLD) and rejects very few legitimate addresses. The exceptions are rare forms such as quoted local parts that contain spaces or an @:

/^[^\s@]+@[^\s@]+\.[^\s@]+$/

Breaking it down:

  • ^ — start of string
  • [^\s@]+ — one or more characters that are not whitespace or @ (the local part)
  • @ — literal at-sign
  • [^\s@]+ — one or more characters that are not whitespace or @ (the domain)
  • \. — a literal dot
  • [^\s@]+ — one or more characters (the TLD)
  • $ — end of string

What it correctly accepts and rejects:

Valid (accepted):

✓ user@example.com
✓ first.last@company.co.uk
✓ user+tag@gmail.com
✓ 123@numbers.org

Invalid (rejected):

✗ plainstring
✗ @nodomain.com
✗ user @example.com (space)
✗ user@
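
To see these lists in action, here is a minimal sketch that runs the simple pattern over the same eight inputs. The `samples` array and the logging loop are illustrative only.

```javascript
const simplePattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Inputs copied from the accepted and rejected lists above.
const samples = [
  "user@example.com",
  "first.last@company.co.uk",
  "user+tag@gmail.com",
  "123@numbers.org",
  "plainstring",
  "@nodomain.com",
  "user @example.com",
  "user@",
];

// Pair each input with the pattern's verdict.
const results = samples.map((address) => [address, simplePattern.test(address)]);

for (const [address, ok] of results) {
  console.log(ok ? "✓" : "✗", address);
}
```

The first four inputs print with a check mark and the last four with a cross.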

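The main weakness is leniency: it accepts structurally odd inputs such as user@example..com (consecutive dots in the domain) and a@b.c (technically valid but suspicious). It does reject user@@example.com, because neither side of the single @ may contain another one. For most signup forms and API inputs, this is an acceptable tradeoff.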

The Comprehensive Regex (RFC 5322-Derived)

When you need stricter validation — for example, in an email sending system where invalid addresses cause bounces and damage sender reputation — a more thorough pattern is warranted. This is a widely-cited RFC 5322-derived pattern:

/^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/

This pattern is longer but handles significantly more cases. Breaking down the key parts:

  • Local part (before @): Either a sequence of allowed characters with optional dot-separated segments [^<>()...]+(\.[^<>()...]+)*, or a quoted string ".+" for addresses like "john doe"@example.com.
  • Domain (after @): Either an IP address literal in brackets \[[0-9]{1,3}...\], or a standard hostname with at least a 2-character TLD ([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}.
  • TLD minimum of 2 chars: Rejects single-character TLDs while accepting long ones like .museum or .technology.

Additional cases this pattern handles correctly:

✓ "quoted string"@example.com
✓ user@[192.168.1.1]
✓ user@subdomain.long-domain.co.uk

Still rejected correctly:

✗ user@example (no TLD)
✗ user @example.com (space before @)
✗ user@@example.com (double @)
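
The same kind of check can be run against the comprehensive pattern. This is a small sketch using the cases listed above plus a@b.c, which is added here to show the 2-character TLD minimum; the `cases` array is illustrative.

```javascript
const rfcPattern = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

const cases = [
  '"quoted string"@example.com',      // quoted local part
  "user@[192.168.1.1]",               // IP address literal
  "user@subdomain.long-domain.co.uk", // several labels, hyphen in a label
  "user@example",                     // no TLD
  "user @example.com",                // space before @
  "user@@example.com",                // double @
  "a@b.c",                            // single-character TLD
];

const outcomes = cases.map((address) => rfcPattern.test(address));

cases.forEach((address, i) => {
  console.log(outcomes[i] ? "✓" : "✗", address);
});
```

The first three cases are accepted and the remaining four are rejected.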

Email Validation in JavaScript

Here are practical, copy-paste-ready JavaScript implementations using both patterns:

Simple validation function

/**
 * Validates email format for most use cases.
 * Fast, readable, very few false negatives.
 */
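function isValidEmail(email) {
  // Non-string input (null, undefined, numbers) is never a valid email
  if (typeof email !== "string") return false;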
  const pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return pattern.test(email.trim());
}

// Usage
isValidEmail("user@example.com");      // true
isValidEmail("not-an-email");          // false
isValidEmail("user+tag@gmail.com");    // true

Stricter validation with detailed feedback

function validateEmail(email) {
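  // Treat non-string input (null, undefined) as empty instead of throwing
  const trimmed = typeof email === "string" ? email.trim() : "";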

  if (!trimmed) {
    return { valid: false, error: "Email is required" };
  }

  if (trimmed.length > 254) {
    return { valid: false, error: "Email address too long (max 254 characters)" };
  }

  if (!trimmed.includes("@")) {
    return { valid: false, error: "Missing @ symbol" };
  }

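  // Split on the last @ so a quoted local part that contains @ stays intact
  const atIndex = trimmed.lastIndexOf("@");
  const local = trimmed.slice(0, atIndex);
  const domain = trimmed.slice(atIndex + 1);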

  if (!local || local.length > 64) {
    return { valid: false, error: "Invalid local part (before @)" };
  }

  if (!domain || !domain.includes(".")) {
    return { valid: false, error: "Invalid domain (missing TLD)" };
  }

  const rfcPattern = /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/;

  if (!rfcPattern.test(trimmed)) {
    return { valid: false, error: "Invalid email format" };
  }

  // Lowercasing is a common normalization; strictly, only the domain is case-insensitive
  return { valid: true, email: trimmed.toLowerCase() };
}

// Usage
const result = validateEmail("  User@Example.COM  ");
// { valid: true, email: "user@example.com" }

HTML5 Email Validation

Before reaching for JavaScript, consider what the browser gives you for free. The input type="email" element has built-in validation that runs on form submission:

<!-- Basic built-in validation -->
<input type="email" name="email" required>

<!-- With custom pattern for stricter validation -->
<input
  type="email"
  name="email"
  required
  pattern="[^\s@]+@[^\s@]+\.[^\s@]+"
  title="Please enter a valid email address"
>

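The browser's built-in algorithm for type="email" follows the HTML specification, which defines a deliberately simplified syntax and describes it as a willful violation of RFC 5322. It rejects quoted strings and comments but accepts dotless domains such as user@localhost. It rejects obvious non-emails, displays a native validation tooltip, and works without any JavaScript. The pattern attribute lets you layer additional constraints on top.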

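Limitations to be aware of: the native error message only appears on form submission (or when you call reportValidity()), not as the user types. The field's validity state does update on every keystroke, so real-time feedback is possible, but you have to surface it yourself with JavaScript or CSS. The :invalid CSS pseudo-class lets you style invalid inputs, but it also matches untouched empty required fields. Combining it with :not(:placeholder-shown) avoids that, provided the input has a placeholder attribute: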

/* Only show red border once the field has content (the input needs a placeholder attribute) */
input[type="email"]:not(:placeholder-shown):invalid {
  border-color: #ef4444;
  outline-color: #ef4444;
}

input[type="email"]:not(:placeholder-shown):valid {
  border-color: #34d399;
}
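
For real-time feedback in JavaScript, one option is to read the standard Constraint Validation API on each input event. The sketch below is one possible wiring, not a prescribed one: the `emailMessage` helper, the message strings, and the `#email-error` element are all assumptions made for illustration.

```javascript
// Map a ValidityState-like object to a message ("" means no error).
function emailMessage(validity) {
  if (validity.valueMissing) return "Email is required";
  if (validity.typeMismatch || validity.patternMismatch) {
    return "Please enter a valid email address";
  }
  return "";
}

// Browser-only wiring; skipped when there is no DOM (for example in Node.js).
if (typeof document !== "undefined") {
  const input = document.querySelector('input[type="email"]');
  const errorEl = document.querySelector("#email-error"); // hypothetical element

  if (input && errorEl) {
    input.addEventListener("input", () => {
      // input.validity is live, so it reflects the value just typed.
      errorEl.textContent = emailMessage(input.validity);
    });
  }
}
```

Keeping the message logic in a plain function means it can be unit-tested without a browser.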

Common Edge Cases

These are the cases that trip up most email validation implementations:

Plus addressing (subaddressing)

user+tag@gmail.com is valid and widely used. Gmail, Outlook, and most modern email providers support it for filtering. Your regex must not reject the + character in the local part. Many overly restrictive patterns fail on this.
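
A quick illustration of the failure mode: `tooStrict` below is a hypothetical pattern of the overly restrictive kind, identical to the balanced pattern shown later in this article except that its local-part class omits the + character.

```javascript
// Hypothetical over-restrictive pattern: no "+" in the local-part class.
const tooStrict = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
// Same shape with "+" (and "%") allowed in the local part.
const withPlus = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

const tagged = "user+tag@gmail.com";

console.log(tooStrict.test(tagged)); // false: a real address is turned away
console.log(withPlus.test(tagged));  // true
```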

Subdomains

first.last@mail.company.co.uk is a valid email with dots in both the local part and the domain. The domain has four labels (mail, company, co, uk). A good pattern must handle multiple dot-separated labels in the domain.

Long TLDs

TLDs are no longer restricted to two or three characters. .museum, .photography, .technology, and hundreds of other gTLDs are valid. Any regex that enforces a maximum TLD length of 4 or 6 characters will produce false negatives. Enforce only a minimum (2 chars) and no maximum.
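
The difference is easy to demonstrate. In this sketch, `cappedTld` is a hypothetical legacy-style pattern with a 4-character TLD ceiling, and the sample addresses are made up for illustration.

```javascript
// Hypothetical legacy pattern: TLD limited to 2-4 letters.
const cappedTld = /^[^\s@]+@[^\s@]+\.[a-zA-Z]{2,4}$/;
// Minimum of 2 letters, no maximum.
const openTld = /^[^\s@]+@[^\s@]+\.[a-zA-Z]{2,}$/;

const longTldAddresses = ["curator@gallery.museum", "info@studio.photography"];

for (const address of longTldAddresses) {
  console.log(address, cappedTld.test(address), openTld.test(address));
}
```

Both addresses fail the capped pattern and pass the open one, while an ordinary address such as user@example.com passes both.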

International domains (IDN)

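Domain names can contain non-ASCII characters, which are encoded as Punycode for DNS. user@münchen.de is valid; the ASCII form of its domain is xn--mnchen-3ya.de. ASCII-only patterns, such as the balanced and comprehensive ones in this article, accept the Punycode form but reject the Unicode form a user is likely to type, and a letters-only TLD rule such as [a-zA-Z]{2,} also rejects Punycode TLDs. Convert the domain to its ASCII form before matching, or rely on a dedicated email validation library if IDN support is required.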
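
One way to handle the Unicode form without a library is to convert the domain to ASCII before matching. The sketch below assumes a runtime whose WHATWG URL implementation performs the IDNA conversion (current browsers and standard Node.js builds do); `toAsciiEmail` is a hypothetical helper name, and the URL parser also lowercases the host and may normalize purely numeric hosts, so treat it as a starting point only.

```javascript
// Hypothetical helper: returns the address with its domain in ASCII (Punycode)
// form, or null if the input cannot be split into local part and hostname.
function toAsciiEmail(email) {
  const at = email.lastIndexOf("@");
  if (at < 1) return null;

  const local = email.slice(0, at);
  const domain = email.slice(at + 1);

  // Refuse anything the URL parser would treat as more than a bare hostname.
  if (domain === "" || /[\s\/\\?#:@%\[\]]/.test(domain)) return null;

  try {
    const host = new URL("http://" + domain).hostname;
    return host ? local + "@" + host : null;
  } catch {
    return null; // the URL parser rejected the hostname
  }
}

const asciiPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
const typed = "user@münchen.de";
const converted = toAsciiEmail(typed);

console.log(asciiPattern.test(typed));     // false: "ü" is outside the ASCII class
console.log(converted);                    // "user@xn--mnchen-3ya.de"
console.log(asciiPattern.test(converted)); // true
```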

IP address literals

user@[192.168.1.1] is technically valid per RFC 5321 but almost never used in practice. Unless you are building an SMTP implementation, you can safely ignore this case.
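
If you do decide to accept IP literals, note that the [0-9]{1,3} groups in the comprehensive pattern only count digits, so an octet such as 999 passes. A minimal sketch of a range check follows; `isIPv4Literal` is a hypothetical helper, and it deliberately ignores IPv6 literals.

```javascript
// The bracketed-literal part of the comprehensive pattern, on its own.
const literalShape = /^\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}]$/;

// Hypothetical helper: shape check plus a 0-255 range check on each octet.
function isIPv4Literal(domain) {
  const match = /^\[(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\]$/.exec(domain);
  if (!match) return false;
  return match.slice(1).every((octet) => Number(octet) <= 255);
}

console.log(literalShape.test("[999.999.999.999]")); // true: shape only
console.log(isIPv4Literal("[999.999.999.999]"));     // false: octets out of range
console.log(isIPv4Literal("[192.168.1.1]"));         // true
```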

Don't Over-Validate

This is the single most important piece of practical advice in this article: the purpose of client-side email validation is to catch typos, not to verify deliverability.

No regex can tell you whether an email address actually exists or whether its inbox is accepting mail. Only delivering to the inbox can confirm that. Over-validating (rejecting addresses your regex incorrectly flags as invalid) loses real users. Under-validating (accepting malformed addresses) costs you a bounce.

The right approach is a two-layer strategy:

  1. Use a simple regex to catch obvious format errors (missing @, missing TLD, spaces).
  2. Send a confirmation or verification email to confirm the address is real and the user controls it.

A confirmation email is the only reliable gate. Everything else is a heuristic. If you find yourself debating whether to reject user@localhost or a@b.io, step back — send the email and let the SMTP layer handle it.
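
The two layers can be sketched roughly as follows. Everything here is hypothetical scaffolding: `startSignup`, `confirmSignup`, the in-memory `pending` map, and the injected `sendMail` function stand in for whatever your application uses, and the default token generator assumes a runtime that exposes `globalThis.crypto.randomUUID()` (recent browsers and Node.js versions do).

```javascript
// Hypothetical in-memory store; a real system would persist tokens with an expiry.
const pending = new Map();

function startSignup(
  email,
  { sendMail, makeToken = () => globalThis.crypto.randomUUID() }
) {
  const trimmed = email.trim();

  // Layer 1: cheap format check to catch obvious typos.
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(trimmed)) {
    return { ok: false, error: "Invalid email format" };
  }

  // Layer 2: only a delivered, followed link shows the address works.
  const token = makeToken();
  pending.set(token, trimmed);
  sendMail(trimmed, token);
  return { ok: true };
}

function confirmSignup(token) {
  const email = pending.get(token);
  if (email === undefined) {
    return { ok: false, error: "Unknown or expired token" };
  }
  pending.delete(token); // tokens are single-use
  return { ok: true, email };
}
```

Injecting `sendMail` and `makeToken` keeps the sketch independent of any particular mail provider and makes it easy to test with stubs.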

Popular Email Regex Patterns Compared

Here is a practical comparison of the most commonly used patterns, including their tradeoffs:

Simple (coverage ~95%)
  • Pattern: /^[^\s@]+@[^\s@]+\.[^\s@]+$/
  • Pros: Readable, minimal false negatives, fast
  • Cons: Very lenient; accepts malformed domains such as user@example..com

Common (coverage ~92%)
  • Pattern: /^[\w.+-]+@[\w-]+\.[\w.]{2,}$/
  • Pros: Short, handles most real-world emails
  • Cons: Rejects valid special chars in local part

Balanced (coverage ~97%)
  • Pattern: /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/
  • Pros: Widely used, good balance of strictness and coverage
  • Cons: Rejects some valid quoted-string locals

RFC 5322 (coverage ~99%)
  • Pattern: /^(([^<>()\[\]\\.,;:\s@"]+(\.[^<>()\[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}...)/
  • Pros: Handles quoted strings, IP literals, long TLDs
  • Cons: Long, hard to read, still not 100% RFC compliant

The "balanced" pattern in full

The third pattern in the comparison above is the most practical choice for production use when you want more than the simple pattern but not the full RFC 5322 complexity:

/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/

Breaking it down:

  • [a-zA-Z0-9._%+-]+ — local part: alphanumerics plus ., _, %, +, -
  • @ — required at-sign
  • [a-zA-Z0-9.-]+ — domain: alphanumerics, hyphens, dots (handles subdomains)
  • \. — dot separator before TLD
  • [a-zA-Z]{2,} — TLD: at least 2 letters, no maximum (handles long TLDs)
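
A short sketch of how the balanced pattern behaves on a mix of inputs. The sample addresses are illustrative; the last entry shows a leniency worth knowing about, since the domain class includes the dot and therefore lets consecutive dots through.

```javascript
const balancedPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

const balancedSamples = [
  "first.last@mail.company.co.uk", // accepted: dots and subdomains
  "user+tag@gmail.com",            // accepted: plus addressing
  "info@studio.photography",       // accepted: long TLD
  '"john doe"@example.com',        // rejected: quoted local part
  "user@[192.168.1.1]",            // rejected: IP literal
  "user@example",                  // rejected: no TLD
  "user@example..com",             // accepted, although the domain is malformed
];

const verdicts = balancedSamples.map((address) => balancedPattern.test(address));

balancedSamples.forEach((address, i) => {
  console.log(verdicts[i] ? "✓" : "✗", address);
});
```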

Using a library instead of rolling your own

For applications where email validity is critical (transactional email systems, B2B SaaS sign-up flows), consider a dedicated validation library rather than writing regex yourself:

// Node.js — validator.js (most popular)
import validator from 'validator';
validator.isEmail('user@example.com'); // true

// Node.js — email-validator (lightweight)
import { validate } from 'email-validator';
validate('user@example.com'); // true

# Python: email-validator (RFC-compliant)
# pip install email-validator
from email_validator import validate_email, EmailNotValidError
try:
    info = validate_email("user@example.com")
except EmailNotValidError as e:
    print(str(e))

Test Your Email Regex

The best way to understand any regex is to run it against a comprehensive set of test inputs — both valid addresses you expect to accept and invalid ones you need to reject. Our Regex Tester lets you paste any of the patterns above, build a test suite, and see match results highlighted in real time.
