The Domain Set plugin in lazydns supports sophisticated domain name matching with multiple rule types and priority-based evaluation. This document describes the complete domain matching rule system.
# Load a domain list file with default (domain) matching
- name: domain_set
  tag: direct
  args:
    files:
      - direct-list.txt
# With specific match type
- name: domain_set
  tag: gfw
  args:
    files:
      - gfw-list.txt
    default_match_type: domain
    auto_reload: true
# Comments start with #
# This is a comment
# Exact match only
full:google.com
# Domain match (default)
domain:example.com
# No prefix = uses default_match_type
example.com
# Keyword substring match
keyword:facebook
# Regular expression match
regexp:.*\.google\.com$
# Empty lines are ignored
Full match (full:): exact domain matching only, no subdomains.
- Syntax: full:example.com
- Matches: example.com, EXAMPLE.COM (case-insensitive)
- Does not match: www.example.com, sub.example.com, example.com.hk
Examples:
full:google.com → matches only "google.com"
full:api.github.com → matches only "api.github.com"
→ does NOT match "github.com" or "www.api.github.com"
Domain match (domain:): matches a domain and all of its subdomains.
- Syntax: domain:example.com, or just example.com (uses the default)
- Matches: example.com, www.example.com, api.example.com, a.b.c.example.com
- Does not match: notexample.com, example.com.hk, examplecom
When multiple domain rules could match, the most specific (longest) rule wins:
Rules: com, example.com, api.example.com
Query www.example.com:
  ✗ Matches api.example.com? No
  ✓ Matches example.com? Yes (return true)
  (com would also match, but a more specific rule was already found)
Query api.example.com:
  ✓ Matches api.example.com? Yes (return true)
Query other.com:
  ✗ Matches api.example.com? No
  ✗ Matches example.com? No
  ✓ Matches com? Yes (return true)
domain:google.com → matches google.com, www.google.com, maps.google.com, etc.
domain:co.uk → matches all .co.uk domains
example.com → equivalent to domain:example.com (if default is domain)
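The label-boundary behaviour described above (www.example.com matches, notexample.com does not) can be sketched as follows. This is a minimal illustration, not the lazydns implementation; the function name `domain_matches` is hypothetical, and it assumes both inputs are already lowercased and have no trailing dot.

```rust
/// Hypothetical helper: returns true when `query` equals `rule` or is a
/// subdomain of it. Assumes both strings are already normalized
/// (lowercase, no trailing dot).
fn domain_matches(rule: &str, query: &str) -> bool {
    if query == rule {
        return true;
    }
    // A subdomain must end with ".<rule>", so the character right before
    // the suffix has to be a dot. This is what rejects "notexample.com".
    query.len() > rule.len()
        && query.ends_with(rule)
        && query.as_bytes()[query.len() - rule.len() - 1] == b'.'
}
```

With the rule example.com, `domain_matches("example.com", "a.b.c.example.com")` is true, while `notexample.com` and `example.com.hk` are both rejected.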
Keyword match (keyword:): substring matching anywhere in the domain.
- Syntax: keyword:google
- Matches: google.com, www.google.com, google.com.hk, mygoogle.net, my-google-service.org
- Does not match: gogle.com (typo), goo-gle.com (keyword not present as a contiguous substring)
- Caution: short keywords match very broadly (keyword:ad matches add.com, advertisement.com, badword.com)
Examples:
keyword:facebook → matches facebook.com, www.facebook.com, facebook.com.cn, myfacebook.net, etc.
keyword:google → matches google.com, mygoogle.com, google.com.hk, googlechrome.com, etc.
keyword:cdn → matches cdn.com, mycdn.net, ocdn.org, etc. (be careful!)
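To make the breadth of keyword rules concrete, here is a small sketch of substring matching. The name `keyword_matches` is hypothetical, and lowercasing both sides is an assumption based on the case-insensitive behaviour this document describes for domains.

```rust
/// Hypothetical helper: true when `keyword` occurs anywhere in `query`.
/// Both sides are lowercased first (an assumption of this sketch).
fn keyword_matches(keyword: &str, query: &str) -> bool {
    let keyword = keyword.to_ascii_lowercase();
    let query = query.to_ascii_lowercase();
    query.contains(keyword.as_str())
}
```

Because the check ignores label boundaries, `keyword_matches("ad", "badword.com")` is true, which is why short keywords deserve extra care.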
Regexp match (regexp:): regular expression matching using Rust regex syntax (largely compatible with Go's standard regexp package).
- Syntax: regexp:^[a-z]+\.google\.com$

| Pattern | Matches | Does NOT match |
|---|---|---|
| .+\.google\.com$ | www.google.com, maps.google.com | google.com (no subdomain label) |
| ^google\. | google.com, google.co.uk | www.google.com |
| (baidu\|google) | baidu.com, google.com | example.com |
| test-[0-9]+ | test-123.com, test-1.org | test-abc.com |

Examples:
regexp:.+\.github\.io$ → matches *.github.io (personal GitHub Pages)
regexp:^api\. → matches api.example.com, api.service.com, etc.
regexp:(qq|wechat) → matches any domain containing "qq" or "wechat", e.g. qq.com, wechat.com
regexp:.*cdn.* → matches any domain containing "cdn"
Regexp matching is CPU-intensive, especially with:
- Stacked wildcards (.*.*, .+.+)
Best practices:
- Anchor patterns with ^ and $ to improve performance

Rule priority: rules are evaluated in strict priority order. The first matching rule determines the result.
Full > Domain > Regexp > Keyword
Rules:
- full:example.com
- domain:example.com
- keyword:example
- regexp:.*example.*
Query example.com:
1. Check Full rules → matches full:example.com ✓ RETURN TRUE
(Never reaches Domain, Regexp, or Keyword checks)
Query sub.example.com:
1. Check Full rules → no match
2. Check Domain rules → matches domain:example.com ✓ RETURN TRUE
(Never reaches Regexp or Keyword checks)
Query myexample.org:
1. Check Full rules → no match
2. Check Domain rules → no match
3. Check Regexp rules → matches .*example.* ✓ RETURN TRUE
(Never reaches Keyword check)
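The walkthrough above can be sketched as a short-circuiting check over the four rule types. This is an illustration under stated assumptions, not the lazydns code: `Stage`, `RuleSet`, `first_match` and `example_rules` are hypothetical names, and regexp rules are represented by plain predicate functions so the sketch needs only the standard library (a real implementation would hold compiled regexes).

```rust
/// Which rule type produced the match (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Stage {
    Full,
    Domain,
    Regexp,
    Keyword,
}

struct RuleSet {
    full: Vec<String>,
    domain: Vec<String>,
    // Stand-in for compiled regexes: each entry is a plain predicate.
    regexp: Vec<fn(&str) -> bool>,
    keyword: Vec<String>,
}

impl RuleSet {
    /// Checks Full, then Domain, then Regexp, then Keyword, and stops at
    /// the first type that matches.
    fn first_match(&self, query: &str) -> Option<Stage> {
        if self.full.iter().any(|rule| rule.as_str() == query) {
            return Some(Stage::Full);
        }
        let is_domain_hit = |rule: &String| {
            query == rule.as_str() || query.ends_with(format!(".{}", rule).as_str())
        };
        if self.domain.iter().any(is_domain_hit) {
            return Some(Stage::Domain);
        }
        if self.regexp.iter().any(|is_match| is_match(query)) {
            return Some(Stage::Regexp);
        }
        if self.keyword.iter().any(|k| query.contains(k.as_str())) {
            return Some(Stage::Keyword);
        }
        None
    }
}

/// Stands in for the rule regexp:.*example.* from the walkthrough.
fn contains_example(query: &str) -> bool {
    query.contains("example")
}

/// Builds the four rules used in the walkthrough.
fn example_rules() -> RuleSet {
    RuleSet {
        full: vec!["example.com".to_string()],
        domain: vec!["example.com".to_string()],
        regexp: vec![contains_example as fn(&str) -> bool],
        keyword: vec!["example".to_string()],
    }
}
```

Calling `example_rules().first_match("sub.example.com")` yields the Domain stage, and `myexample.org` falls through to the Regexp stage, mirroring the three queries traced above.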
| Match Type | Complexity | Notes |
|---|---|---|
| Full | O(1) | HashMap lookup |
| Domain | O(d) | d = domain depth, typically 3-4 |
| Regexp | O(n·r) | n = rules, r = regex complexity |
| Keyword | O(n·s) | n = rules, s = string length |
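The O(d) figure for domain rules follows from probing a hash set once per label of the query. The sketch below shows one way to do that; it is an assumption about the general technique, not a description of lazydns internals, and `suffixes` and `lookup` are hypothetical names.

```rust
use std::collections::HashSet;

/// Yields the query and each parent suffix:
/// "a.b.c" produces "a.b.c", then "b.c", then "c".
fn suffixes(query: &str) -> impl Iterator<Item = &str> {
    std::iter::successors(Some(query), |s| {
        s.split_once('.').map(|(_, parent)| parent)
    })
}

/// Returns the longest rule that covers `query`, using one hash probe per
/// label. Suffixes are visited longest first, so the first hit is also the
/// most specific one.
fn lookup<'q>(rules: &HashSet<String>, query: &'q str) -> Option<&'q str> {
    suffixes(query).find(|s| rules.contains(*s))
}
```

For a query with four labels such as a.b.example.com this performs at most four probes, regardless of how many domain rules are loaded.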
| Match Type | Memory per 10,000 rules |
|---|---|
| Full | ~1 MB |
| Domain | ~1 MB |
| Regexp | ~2-5 MB (includes compiled regex) |
| Keyword | ~0.5-1 MB |
Rule set size: 100,000 domains
Match type | Avg Query Time | Remarks
-----------------------------------
Full | < 1 µs | Instant
Domain | < 5 µs | Very fast
Regexp | 100-1000 µs | Slow with complex patterns
Keyword | 50-500 µs | Linear scan
When multiple domain rules could match, the most specific (longest) match wins. Full rules are exact, so at most one of them can match a given query:
Rules:
- domain:com
- domain:example.com
- domain:api.example.com
Query api.example.com:
Evaluation: Longest match wins
→ Matches api.example.com (most specific) ✓
Within the same match type, regexp and keyword rules are evaluated in import order (file order). The first match wins:
Rules (in order):
- regexp:google
- regexp:.*oogle
- keyword:abc
Query "google.com":
→ Matches first regexp:google ✓ (returns true immediately)
→ Never evaluates remaining rules
- name: domain_set
  tag: direct
  args:
    files:
      - direct-list.txt
This uses the default domain matching: rules without a prefix are treated as domain rules.
- name: domain_set
  tag: gfw
  args:
    files:
      - gfw.txt
    default_match_type: keyword
    auto_reload: true
All rules without a prefix will use keyword matching.
- name: domain_set
  tag: combined
  args:
    files:
      - blocklist.txt
      - custom-domains.txt
      - regex-patterns.txt
    default_match_type: domain
    auto_reload: true
    # Auto-reload checks files every ~200ms
You can also specify domain rules inline using the exps parameter instead of external files:
Single rule (string format):
- name: domain_set
  tag: direct
  args:
    exps: "example.com"
Multiple rules (array format):
- name: domain_set
  tag: combined
  args:
    exps:
      - "example.com"
      - "full:github.com"
      # Single quotes: \. is not a valid escape in a double-quoted YAML string
      - 'regexp:.+\.google\.com$'
      - "keyword:facebook"
Mixed files and inline expressions:
- name: domain_set
  tag: comprehensive
  args:
    files:
      - blocklist.txt
    exps:
      - "example.com"
      - "full:special.service.com"
      # Single quotes: \. is not a valid escape in a double-quoted YAML string
      - 'regexp:^internal-.*\.local$'
    default_match_type: domain
    auto_reload: true
The exps parameter supports the same rule format as files:
- Prefix a rule with full:, domain:, keyword:, or regexp: to select a specific match type
- Rules without a prefix use default_match_type

# Direct access (fast, no censorship)
# Domain format (matches subdomains)
domain:example.com
github.com
www.wikipedia.org
# Exact matches for specific services
full:dns.google
# Keywords for broad categories
keyword:cdn
# Complex patterns
regexp:.+\.local$
regexp:^192-168-.*\.nip\.io$
example.com
sub.example.com
full:exact.com
domain:parent.com
keyword:google
regexp:.+\.example\.com$
# This is a comment
# Comments must be on their own line
example.com
Leading and trailing whitespace is trimmed:
"  example.com  " → matches "example.com"
Empty lines are silently ignored.
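The line rules above (own-line comments, trimmed whitespace, ignored blank lines, optional prefix) can be sketched as a small parser. This is an illustration only: `parse_line` and the local `MatchType` enum are hypothetical and merely mirror the names used in this document, and treating the prefix case-insensitively is an assumption drawn from the DOMAIN:EXAMPLE.COM example.

```rust
/// Local stand-in for the match types described in this document.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum MatchType {
    Full,
    Domain,
    Keyword,
    Regexp,
}

/// Parses one line of a rule file. Returns None for blank lines and
/// comments, otherwise the match type and the rule text.
fn parse_line(line: &str, default: MatchType) -> Option<(MatchType, String)> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // Split on the first colon and see whether the left side is a known prefix.
    if let Some((prefix, value)) = line.split_once(':') {
        let kind = match prefix.to_ascii_lowercase().as_str() {
            "full" => Some(MatchType::Full),
            "domain" => Some(MatchType::Domain),
            "keyword" => Some(MatchType::Keyword),
            "regexp" => Some(MatchType::Regexp),
            _ => None,
        };
        if let Some(kind) = kind {
            return Some((kind, value.to_string()));
        }
    }
    // No recognized prefix: fall back to the configured default.
    Some((default, line.to_string()))
}
```

The sketch does not strip trailing comments, in line with the rule that comments must sit on their own line.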
# Direct access domains (no blocking)
# Updated: 2024-12-26
# GitHub and services
github.com
www.github.io
api.github.com
# Exact services
full:dns.google.com
full:8.8.8.8
# CDN and infrastructure
keyword:cdn
keyword:cloudflare
# Domains containing "bot" or "crawler"
regexp:.*bot.*
regexp:.*crawler.*
# Personal domains
domain:example.com
1. Full matches (most specific, fastest)
2. Domain matches (common case)
3. Regexp patterns (complex logic)
4. Keyword matches (broad patterns)
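- Prefer full: and specific domain: rules
- Use keyword: and regexp: patterns sparingly
- Enable auto_reload: true for frequently updated lists

Domain names and rule prefixes (full:, domain:, etc.) are matched case-insensitively, and a trailing dot is ignored. Example: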
# These all match "example.com":
domain:example.com
DOMAIN:EXAMPLE.COM
# Trailing dot is normalized
example.com.
# Case is normalized
Example.Com
# This does NOT match "example.com" (full requires an exact match):
full:www.example.com
# Careful: this DOES match "example.com", because "exam" is a substring of it:
keyword:exam
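The normalization shown in this example can be sketched as a single helper applied to both rules and queries before comparison. The name `normalize` is hypothetical, and stripping every trailing dot rather than just one is a simplification of this sketch.

```rust
/// Hypothetical helper: trims surrounding whitespace, drops trailing dots
/// and lowercases ASCII letters so that comparisons are case-insensitive.
fn normalize(domain: &str) -> String {
    domain
        .trim()
        .trim_end_matches('.')
        .to_ascii_lowercase()
}
```

After this step, `Example.Com`, `example.com.` and `EXAMPLE.COM` all compare equal to `example.com`.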
Enable tracing to see matching details:
RUST_LOG=debug lazydns
use lazydns::plugins::dataset::{DomainRules, MatchType};
let mut rules = DomainRules::new();
// Add rules
rules.add_rule(MatchType::Full, "exact.com");
rules.add_rule(MatchType::Domain, "example.com");
rules.add_rule(MatchType::Keyword, "google");
rules.add_rule(MatchType::Regexp, r".+\.github\.io$");
// Parse lines
rules.add_line("domain:test.com", MatchType::Domain);
// Check matches
assert!(rules.matches("exact.com"));
assert!(rules.matches("sub.example.com"));
assert!(rules.matches("test-google.com"));
assert!(rules.matches("mysite.github.io"));
let stats = rules.stats();
println!("Full: {}", stats.full_count);
println!("Domain: {}", stats.domain_count);
println!("Regexp: {}", stats.regexp_count);
println!("Keyword: {}", stats.keyword_count);