Filtering

How filter rules work, sampling logic, and reviewing filtered contacts

What This Does

Filtering is the second pipeline stage. It takes raw scraped contacts and applies workspace-specific filter rules to decide which contacts move forward to verification. This reduces volume and ensures only eligible contacts are emailed.

How It Works

Filtering starts automatically once scraping completes. The system:

  1. Loads filter rules configured for the workspace
  2. Applies each rule (age range, coverage type, geography, etc.)
  3. Applies deterministic hash-based sampling, so the same contact always gets the same sampling decision
  4. Outputs to the filtered_contacts table
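The steps above can be sketched roughly as follows. This is an illustrative outline, not the actual implementation: the `filter_contacts` function, the dictionary shape of a contact, the rule-as-predicate design, and the "keep if the value is below the rate" comparison are all assumptions. Only the hash formula comes from this page.

```python
import hashlib
from typing import Callable, Iterable, List

Contact = dict  # hypothetical shape: {"id": ..., "age": ..., "state": ...}
Rule = Callable[[Contact], bool]  # a rule returns True if the contact is eligible


def sampling_value(contact_id) -> float:
    """Stable value in [0, 1) derived from the contact ID (formula from this page)."""
    hash_input = f"routing_sample_{contact_id}"
    digest = hashlib.md5(hash_input.encode("utf-8")).hexdigest()
    return (int(digest, 16) % 10000) / 10000.0


def filter_contacts(
    contacts: Iterable[Contact],
    rules: List[Rule],
    sampling_rate: float = 1.0,
) -> List[Contact]:
    """Keep contacts that satisfy every rule and fall inside the sample."""
    passed = []
    for contact in contacts:
        # An empty rule list passes every contact, since all([]) is True.
        if not all(rule(contact) for rule in rules):
            continue
        # Assumed comparison: keep the contact if its value is below the rate.
        if sampling_value(contact["id"]) >= sampling_rate:
            continue
        passed.append(contact)
    return passed


contacts = [
    {"id": "c-1", "age": 70, "state": "TX"},
    {"id": "c-2", "age": 45, "state": "TX"},
]
rules = [lambda c: 65 <= c["age"] <= 80]  # hypothetical age-range rule
print(filter_contacts(contacts, rules))
```

In a real pipeline the returned list would be written to the filtered_contacts table. That step is omitted here because the storage layer is not described on this page.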

Deterministic Sampling

Filtering uses hash-based sampling so results are reproducible:

import hashlib

hash_input = f"routing_sample_{contact_id}"
hash_val = (int(hashlib.md5(hash_input.encode("utf-8")).hexdigest(), 16) % 10000) / 10000.0

hash_val is a value in [0, 1) that depends only on the contact ID, so the same contact always gets the same sampling decision.
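A minimal sketch of how the hash value might be turned into a keep/drop decision. The helper names `sampling_value` and `is_sampled`, and the rule "keep when the value is below the sampling rate", are assumptions made for illustration. This page only specifies the hash formula itself.

```python
import hashlib


def sampling_value(contact_id: str) -> float:
    """Map a contact ID to a stable value in [0, 1) with 0.0001 resolution."""
    hash_input = f"routing_sample_{contact_id}"
    digest = hashlib.md5(hash_input.encode("utf-8")).hexdigest()
    return (int(digest, 16) % 10000) / 10000.0


def is_sampled(contact_id: str, sampling_rate: float) -> bool:
    """Assumed decision rule: keep the contact if its value is below the rate."""
    return sampling_value(contact_id) < sampling_rate


# Re-running gives the same decision for the same ID.
print(is_sampled("contact-42", 0.5) == is_sampled("contact-42", 0.5))

# Over many IDs, the kept fraction should land close to the sampling rate.
ids = [f"contact-{i}" for i in range(10_000)]
kept = [cid for cid in ids if is_sampled(cid, 0.5)]
print(len(kept) / len(ids))
```

Because the decision depends only on the ID, a contact that is re-scraped under a new ID gets a new hash value and may land on the other side of the threshold.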

How To Use It

Viewing Filtered Contacts

  1. Go to Contacts in the sidebar
  2. Select the workspace and month
  3. The contacts table shows filtering status for each contact
  4. Use the status filter dropdown to see only filtered contacts

Filter Rules

Filter rules are configured per workspace in Settings → Filter Configuration. Common rules:

  • Age range (e.g., 65-80)
  • Coverage type (Medicare Supplement, etc.)
  • State inclusion/exclusion
  • Sampling rate (e.g., 50% of eligible contacts)
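To make the rule types above concrete, here is a hypothetical in-code representation of a workspace's filter configuration. The `FilterConfig` class, its field names, the contact keys, and the inclusive age bounds are all assumptions. The actual settings are managed in the Filter Configuration screen and their storage format is not documented here.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class FilterConfig:
    """Hypothetical per-workspace filter settings; names are illustrative only."""
    min_age: Optional[int] = None
    max_age: Optional[int] = None
    coverage_types: Set[str] = field(default_factory=set)   # empty set = any type
    included_states: Set[str] = field(default_factory=set)  # empty set = all states
    excluded_states: Set[str] = field(default_factory=set)
    sampling_rate: float = 1.0  # applied separately, to eligible contacts only

    def is_eligible(self, contact: dict) -> bool:
        """Return True if the contact satisfies every configured rule."""
        # Age bounds are treated as inclusive (an assumption).
        if self.min_age is not None and contact["age"] < self.min_age:
            return False
        if self.max_age is not None and contact["age"] > self.max_age:
            return False
        if self.coverage_types and contact["coverage_type"] not in self.coverage_types:
            return False
        if self.included_states and contact["state"] not in self.included_states:
            return False
        if contact["state"] in self.excluded_states:
            return False
        return True


config = FilterConfig(
    min_age=65,
    max_age=80,
    coverage_types={"Medicare Supplement"},
    excluded_states={"NY"},
    sampling_rate=0.5,
)
contact = {"age": 70, "coverage_type": "Medicare Supplement", "state": "TX"}
print(config.is_eligible(contact))
```

A config with no rules set treats every contact as eligible, which matches the pass-through behavior described under Common Issues.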

Common Issues

| Symptom | Cause | Fix |
|---|---|---|
| 0 contacts after filtering | Filter rules too restrictive or no rules configured | Check filter rules in Settings. If no rules exist, filtering passes everything through |
| Filtering takes too long | Large batch (>50K contacts) | Normal for large batches. Check queue depth on Status Page; the filtering queue has concurrency 2 |
| Different results after re-run | Shouldn't happen with deterministic sampling | If contact IDs changed (re-scraped), the hash changes. This is expected for new scrapes |

Related Alerts

  • PipelineSuccessRateSLOBreach: Filtering failures contribute to overall pipeline SLO
  • CeleryHighFailureRate: Check if filtering tasks are failing
