Datasynx Shadowing

The SOPs your team never has time to write — observed, generated, anonymized.

by Datasynx AI

100% local · MCP-native · Claude-powered · PII-anonymized

Shadowing observes daily workflows like a silent shadow — shell commands, active windows, git commits, file changes — and automatically generates structured Standard Operating Procedures (SOPs) via Claude. Everything stays on the employee's machine: a local SQLite database, no cloud sync, no daemon. Every export is automatically scrubbed of PII.

5-minute quickstart: npm i -g @datasynx/agentic-ai-shadowing → shadowing init → shadowing start. Complete a task, get a reviewable SOP.

Quickstart

# 1. Setup (create DB + config)
shadowing init

# 2. Manual mode: start task → generate SOP
shadowing start

# 3. Automatic mode: observe workflow → auto-generate SOPs
shadowing observe --auto-sop

# 4. View SOPs
shadowing list
shadowing show <sop-id>

# 5. Start the web dashboard
shadowing ui

# 6. Export with anonymization
shadowing export --all

Two capture modes

Manual: the employee starts, pauses, and completes tasks, then rates complexity.

Automatic: observe captures windows, shell history, and git activity, then clusters that activity into tasks.

Reviewable output

Each SOP is versioned Markdown with Objective → Prerequisites → Steps → Expected Result → Notes, AI-generated tags, and a status workflow: Draft → Reviewed → Approved → Exported.
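The status workflow can be read as a small linear state machine. The sketch below is an illustration only: the type and function names are hypothetical, and it assumes a status may only advance one step at a time, which this document does not confirm.

```typescript
// Hypothetical model of the SOP status workflow: Draft → Reviewed → Approved → Exported.
// Assumption: a status only ever moves to the next stage (no skipping, no going back).
type SopStatus = "draft" | "reviewed" | "approved" | "exported";

const STATUS_ORDER: readonly SopStatus[] = ["draft", "reviewed", "approved", "exported"];

// Returns the status that follows `current`, or null if it is already the final one.
function nextStatus(current: SopStatus): SopStatus | null {
  const index = STATUS_ORDER.indexOf(current);
  return STATUS_ORDER[index + 1] ?? null;
}

// True only when `to` is the immediate successor of `from`.
function canTransition(from: SopStatus, to: SopStatus): boolean {
  return nextStatus(from) === to;
}
```

Under these assumptions, canTransition("draft", "approved") is false because the Reviewed stage would be skipped.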

Installation

npm install -g @datasynx/agentic-ai-shadowing

Requirements

  • Node.js ≥ 20 (Linux, macOS, or Windows)
  • ANTHROPIC_API_KEY environment variable (for SOP generation)
  • @datasynx/agentic-ai-cartography (optional, for infrastructure context)

export ANTHROPIC_API_KEY=sk-ant-...

Shadowing also works without Cartography. SOPs are then generated without the system-landscape context. See Datasynx Agentic Suite.

Architecture

CLI (Commander.js — 27 commands)
  └─ shadowing init / start / observe / list / export / ui / ...
      ├─ TaskManager      Task lifecycle (start → pause → resume → complete)
      ├─ Observer         Heartbeat-based workflow capture
      │   ├─ WindowDetector   xdotool (Linux) / osascript (macOS) / P/Invoke (Win)
      │   ├─ ShellHistory     Zsh / Bash / Fish / PowerShell parser
      │   └─ Git + File       Commit tracking + file changes
      ├─ SessionAnalyzer  Silence clustering → task detection (LLM)
      ├─ SOPGenerator     Claude API → structured SOPs + tags
      ├─ Anonymizer       PII redaction (8+ patterns)
      ├─ Exporter         Markdown + manifest.json (atomic operations)
      ├─ Metrics          Consistency · Maturity · Freshness · Quality
      ├─ PrivacyManager   Consent + exclusion rules + degradation
      └─ ShadowingDB      SQLite WAL (11 tables, constraints, indices)

  Integrations:
      ├─ UIServer         REST API (17 endpoints) + HTML dashboard
      ├─ MCPServer        Model Context Protocol (18 tools, stdio)
      ├─ HookHandler      Claude Code event processing
      └─ Cartography      JGF graph import from agentic-ai-cartography

Task Management

Command            Description
shadowing init     Initial setup (DB + config)
shadowing start    Start interactive shadowing mode
shadowing status   Show current task and statistics

SOP Management

Command                                    Description
shadowing list [--status --tag --search]   List SOPs with filters
shadowing show <sop-id>                    Display SOP in terminal
shadowing edit <sop-id>                    Edit SOP in default editor
shadowing delete <sop-id>                  Permanently delete SOP
shadowing history <sop-id>                 Show version history
shadowing diff <sop-id> [version]          Diff between versions
shadowing tag <sop-id> <tags...>           Add (+tag) / remove (-tag) tags

Automatic Observation

Command                                                 Description
shadowing observe [--auto-sop --no-window --no-shell]   Start observation mode
shadowing sessions                                      List observation sessions
shadowing timeline [session-id]                         Show session timeline
shadowing analyze [session-id]                          Session → detect tasks → generate SOPs
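The architecture overview describes the analysis step as "silence clustering → task detection". One plausible reading is gap-based grouping: a new cluster starts whenever the pause between two consecutive events exceeds a threshold. The sketch below illustrates that idea only. The ObservedEvent shape, the function name, and the threshold are assumptions, and the LLM-based task detection that follows is left out.

```typescript
// Hypothetical event shape; the real observation schema is not shown in this document.
interface ObservedEvent {
  timestampMs: number; // capture time in milliseconds since the epoch
  source: "window" | "shell" | "git" | "file";
  detail: string;
}

// Groups events into clusters separated by periods of silence.
// Assumption: a gap longer than `silenceMs` marks a boundary between tasks.
function clusterBySilence(events: ObservedEvent[], silenceMs: number): ObservedEvent[][] {
  // Sort a copy so the caller's array is left untouched.
  const sorted = events.slice().sort((a, b) => a.timestampMs - b.timestampMs);
  const clusters: ObservedEvent[][] = [];
  let current: ObservedEvent[] = [];

  for (const event of sorted) {
    const previous = current[current.length - 1];
    if (previous !== undefined && event.timestampMs - previous.timestampMs > silenceMs) {
      clusters.push(current);
      current = [];
    }
    current.push(event);
  }
  if (current.length > 0) clusters.push(current);
  return clusters;
}
```

Each resulting cluster would then be a candidate task to hand to SOP generation.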

Cross-platform capture

Capability         Linux           macOS        Windows
Window Detection   xdotool (X11)   osascript    PowerShell P/Invoke
Shell History      Zsh / Bash      Zsh / Bash   PSReadLine
Git Tracking       git log         git log      git log

Metrics & Export

Command                     Description
shadowing stats             Metrics dashboard in terminal
shadowing export            Interactive export wizard
shadowing export --all      Export all approved SOPs
shadowing ui [--port <n>]   Start web dashboard (default: 3847)
shadowing consent           Consent management for observation
shadowing exclude           Manage exclusion rules

Export structure

exports/export_2026-02-24T14-30-00/
├── manifest.json          Metadata, tags, metrics summary
└── sops/
    ├── sop_001.md         Anonymized SOP (Markdown)
    └── sop_002.md

Integration

Command                         Description
shadowing import-graph <path>   Import Cartography graph (JGF)
shadowing infra [dir]           Extract infrastructure context
shadowing mcp                   Start MCP server (stdio transport)
shadowing setup-hooks           Configure Claude Code hooks + MCP
shadowing guide                 Complete guide

The MCP server exposes 18 tools (shadowing_start_task, shadowing_list_sops, shadowing_start_observation, shadowing_export_sops, …) so Claude Code, Codex, and Cursor can drive Shadowing natively.
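For clients that are not wired up by shadowing setup-hooks, a stdio server is usually registered by hand. The sketch below builds the kind of entry that clients with an mcpServers map (Claude Desktop, for example) accept. It assumes the shadowing binary is on PATH after the global install. The exact config file and key names differ per client, so check the client's own documentation.

```typescript
// One stdio server entry as used by clients that keep an `mcpServers` map.
interface StdioServerEntry {
  command: string;
  args: string[];
}

// Assumption: `shadowing` resolves on PATH; the entry name "shadowing" is arbitrary.
const mcpConfig: { mcpServers: Record<string, StdioServerEntry> } = {
  mcpServers: {
    shadowing: { command: "shadowing", args: ["mcp"] },
  },
};

// Serialized form, ready to merge into the client's JSON config file.
const mcpConfigJson: string = JSON.stringify(mcpConfig, null, 2);
```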

Quality Metrics

Consistency Score

max(0, 100 - CV * 2) — based on the coefficient of variation of execution durations.
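A minimal sketch of this formula follows. It assumes CV is expressed in percent (standard deviation divided by mean, times 100) and uses the population standard deviation. Neither detail is stated here, and the function names are illustrative.

```typescript
// Coefficient of variation in percent: (standard deviation / mean) * 100.
// Assumptions: percent scale, population standard deviation, durations in any single unit.
function coefficientOfVariationPercent(durations: number[]): number {
  if (durations.length === 0) return 0;
  const mean = durations.reduce((sum, d) => sum + d, 0) / durations.length;
  if (mean === 0) return 0;
  const variance =
    durations.reduce((sum, d) => sum + (d - mean) * (d - mean), 0) / durations.length;
  return (Math.sqrt(variance) / mean) * 100;
}

// Consistency score as given above: max(0, 100 - CV * 2).
function consistencyScore(durations: number[]): number {
  return Math.max(0, 100 - coefficientOfVariationPercent(durations) * 2);
}
```

Under these assumptions, executions of 90 and 110 minutes give a CV of 10% and a score of 80, while a CV of 50% or more bottoms out at 0.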

Maturity Score (weighted)

Criterion             Weight
≥ 5 executions        30%
Review completed      30%
≥ 1 revision          20%
Tags present          10%
Description present   10%
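The weights translate directly into a point sum. The sketch below assumes each criterion is all-or-nothing (no partial credit); the input shape and field names are hypothetical.

```typescript
// Hypothetical input shape; field names are illustrative, not the tool's schema.
interface MaturityInput {
  executions: number;
  reviewCompleted: boolean;
  revisions: number;
  hasTags: boolean;
  hasDescription: boolean;
}

// Adds up the weights from the table above; the result is on a 0-100 scale.
function maturityScore(sop: MaturityInput): number {
  let score = 0;
  if (sop.executions >= 5) score += 30;
  if (sop.reviewCompleted) score += 30;
  if (sop.revisions >= 1) score += 20;
  if (sop.hasTags) score += 10;
  if (sop.hasDescription) score += 10;
  return score;
}
```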

Overall Quality Score

consistency * 0.35 + maturity * 0.35 + freshness * 0.30
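As a worked example, the weighted sum can be written as a one-line function. It assumes all three inputs are on the same 0-100 scale. How freshness is computed is not specified in this document, so it is taken as a given input.

```typescript
// Weighted combination as given above. Inputs are assumed to be 0-100 scores.
function overallQuality(consistency: number, maturity: number, freshness: number): number {
  return consistency * 0.35 + maturity * 0.35 + freshness * 0.30;
}
```

With consistency 80, maturity 40, and freshness 50, the result is 28 + 14 + 15 = 57.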

Privacy & Anonymization

Redact-on-capture

PII and secrets are redacted before observation data is written to SQLite — window titles, shell commands, file paths, task titles/descriptions, and task notes never reach disk in raw form (anonymization.redact_on_capture, default true). Export-time anonymization runs as a second layer. Older databases can be cleaned retroactively with the idempotent shadowing scrub command.

Redaction patterns

Pattern                                  Replacement                              Configurable
Email addresses                          [email@example.com]                      redact_emails
IPv4/IPv6 addresses                      [internal-ip]                            redact_ips
URLs                                     [internal-system]/path                   redact_urls
Phone numbers                            [phone-number]                           redact_phone_numbers
File paths                               /Users/[user]/...                        redact_file_paths
Unknown high-entropy tokens              [high-entropy-string]                    redact_high_entropy
IBAN                                     [IBAN]                                   always active
Credit card numbers                      [credit-card]                            always active
Tax ID                                   [tax-id]                                 always active
Social security number                   [social-security]                        always active
GitHub tokens (ghp_, github_pat_, …)     [github-token]                           always active
Anthropic / OpenAI-style keys            [anthropic-api-key] / [api-key]          always active
AWS access keys & Secrets Manager ARNs   [aws-access-key-id] / [aws-secret-arn]   always active
Slack tokens                             [slack-token]                            always active
JWTs                                     [jwt]                                    always active
Bearer header values                     Bearer [api-token]                       always active
PEM private-key blocks                   [private-key]                            always active

Developer-secret detection cannot be switched off. The entropy fallback skips git commit SHAs, UUIDs, and ordinary identifiers, and the whole pipeline is idempotent. Custom replacements can be defined via config.anonymization.custom_replacements.
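To illustrate how configurable and always-active patterns can coexist while staying idempotent, here is a simplified sketch covering three of the patterns. The regular expressions are deliberately naive stand-ins, not the tool's actual patterns, and the function and type names are hypothetical.

```typescript
// Subset of the anonymization flags from config.json.
interface RedactionOptions {
  redact_emails: boolean;
  redact_ips: boolean;
}

// Simplified illustrative patterns (assumptions, not the tool's real regexes).
const EMAIL = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const IPV4 = /\b(?:\d{1,3}\.){3}\d{1,3}\b/g;
const GITHUB_TOKEN = /\bghp_[A-Za-z0-9]{36}\b/g;

function redact(text: string, options: RedactionOptions): string {
  // Developer secrets are always redacted, regardless of options.
  let result = text.replace(GITHUB_TOKEN, "[github-token]");

  if (options.redact_emails) {
    result = result.replace(EMAIL, (match: string, offset: number, whole: string) => {
      // The placeholder itself looks like an email address. Leave an existing
      // "[email@example.com]" untouched so that a second pass changes nothing.
      const alreadyRedacted =
        match === "email@example.com" && whole.charAt(offset - 1) === "[";
      return alreadyRedacted ? match : "[email@example.com]";
    });
  }
  if (options.redact_ips) {
    result = result.replace(IPV4, "[internal-ip]");
  }
  return result;
}
```

Running redact on its own output returns the same string, which is the property that makes a retroactive cleanup such as shadowing scrub safe to repeat.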

REST API

The web dashboard (shadowing ui) serves a dark-theme single-page app and 17 REST endpoints on port 3847. The API requires Bearer-token auth, is rate-limited, and rejects cross-origin requests by default (no CORS wildcard — allowlist via ui_allowed_origins).

Method   Path                   Description
GET      /api/stats             Global statistics
GET      /api/tasks             Task list (?status=)
GET      /api/sops              SOP list (?status= ?tag= ?search=)
GET      /api/sops/:id          SOP detail with metrics + versions
PUT      /api/sops/:id/status   Change status
GET      /api/sops/:id/diff     Diff to previous version
GET      /api/exports           Export history
POST     /api/exports           Trigger export
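A request against GET /api/sops might be assembled as sketched below. The helper name is hypothetical, the host is assumed to be localhost, and the Bearer token is passed in as a parameter because how it is issued is not described here. The status value used in the usage note is likewise a guess.

```typescript
// Filters accepted by GET /api/sops according to the table above.
interface SopQuery {
  status?: string;
  tag?: string;
  search?: string;
}

// Builds the URL and headers for listing SOPs. Performs no network I/O.
function buildSopListRequest(
  token: string,
  query: SopQuery = {},
  port: number = 3847
): { url: string; headers: { Authorization: string } } {
  const params: string[] = [];
  if (query.status !== undefined) params.push("status=" + encodeURIComponent(query.status));
  if (query.tag !== undefined) params.push("tag=" + encodeURIComponent(query.tag));
  if (query.search !== undefined) params.push("search=" + encodeURIComponent(query.search));

  const queryString = params.length > 0 ? "?" + params.join("&") : "";
  return {
    url: "http://localhost:" + port + "/api/sops" + queryString,
    headers: { Authorization: "Bearer " + token },
  };
}
```

On Node.js 20, the result can be passed straight to the built-in fetch, for example fetch(request.url, { headers: request.headers }) with request = buildSopListRequest(token, { status: "approved" }).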

Configuration

Config lives at ~/.datasynx/shadowing/config.json.

{
  "version": "1.0.0",
  "language": "en",
  "polling_interval_minutes": 15,
  "editor": "code",
  "ui_port": 3847,
  "ui_allowed_origins": [],
  "cartography_graph_path": null,
  "anonymization": {
    "redact_emails": true,
    "redact_ips": true,
    "redact_urls": true,
    "redact_phone_numbers": true,
    "redact_file_paths": true,
    "redact_high_entropy": true,
    "redact_on_capture": true,
    "custom_replacements": {}
  },
  "sop_generation": {
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 4096,
    "temperature": 0.3,
    "include_cartography_context": true,
    "auto_generate_tags": true,
    "sop_language": "en",
    "base_url": null,
    "api_key_env": "ANTHROPIC_API_KEY",
    "use_structured_output": true
  }
}

Enterprise deployment

The only network egress is SOP generation — and it is fully configurable: base_url routes traffic through an internal LLM gateway or a local Anthropic-compatible model server (no-egress deployment), api_key_env lets gateways keep their own credential env-var naming, and use_structured_output: false falls back to plain-text parsing for gateways without tool support.
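As an illustration, the three settings could be derived from the documented defaults as sketched below. The gateway URL and the environment-variable name are made-up placeholders and the helper is hypothetical. In practice the resulting values would be written into the sop_generation block of config.json.

```typescript
// Subset of the sop_generation keys that matter for routing.
interface SopGenerationConfig {
  model: string;
  base_url: string | null;
  api_key_env: string;
  use_structured_output: boolean;
}

// Defaults as listed in the Configuration section.
const defaultSopGeneration: SopGenerationConfig = {
  model: "claude-sonnet-4-20250514",
  base_url: null,
  api_key_env: "ANTHROPIC_API_KEY",
  use_structured_output: true,
};

// Returns a copy of the defaults adjusted for an internal gateway.
function forGateway(baseUrl: string, apiKeyEnv: string, supportsTools: boolean): SopGenerationConfig {
  return {
    model: defaultSopGeneration.model,
    base_url: baseUrl,
    api_key_env: apiKeyEnv,
    // Plain-text parsing is the fallback when the gateway lacks tool support.
    use_structured_output: supportsTools,
  };
}

// Placeholder URL and env-var name, for illustration only.
const gatewayConfig = forGateway("https://llm-gateway.internal.example", "LLM_GATEWAY_KEY", false);
```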

Datasynx Agentic Suite

Shadowing is one of three composable, local-first, MCP-native tools. They work standalone, but compound when combined: Cartography maps the systems, Shadowing documents the work, CRM runs the customer relationships.

🗺️ Cartography

@datasynx/agentic-ai-cartography

Maps your infrastructure into a graph of systems, services, and dependencies — the context Shadowing folds into generated SOPs.

npm · GitHub

👁️ Shadowing (you are here)

@datasynx/agentic-ai-shadowing

Observes daily workflows and generates anonymized SOPs via Claude. Fully local, fully anonymized.

npm · GitHub

🤝 CRM

@datasynx/agentic-crm

The CRM your AI agents actually run — local-first, MCP-native, one autonomous agent per customer.

npm · GitHub · Docs

FAQ

Where does my data live?

In a local SQLite database at ~/.datasynx/shadowing/shadowing.db. No cloud, no daemon. The only network call is to the Claude API for SOP generation — and exports are PII-scrubbed before they leave the tool.

Do I need Cartography?

No. Shadowing works standalone; SOPs are simply generated without the system-landscape context. Install Cartography and run shadowing import-graph to enrich SOPs with infrastructure context.

Which AI tools can drive it?

Anything that speaks MCP — Claude Code, Codex, Cursor, Claude Desktop. Run shadowing mcp (18 tools) or shadowing setup-hooks to wire it into Claude Code automatically.

Is it really free?

Yes. MIT-licensed and self-hosted. No seats, no metering, no telemetry.

How is PII handled on export?

Every export runs through the Anonymizer: emails, IPs, URLs, phone numbers, and file paths are redacted (configurable), while IBANs, credit-card numbers, tax IDs, and social-security numbers are always redacted. You can preview the anonymized output before exporting.