Malware Analysis Pipeline

CINDR Triage Engine

Automated static & behavioral malware analysis pipeline

Submit a file. Get a report. No sandboxes to configure, no reverse-engineering (RE) tooling to maintain. The Triage Engine runs a full analysis stack (triage, strings, YARA, PE, macros, archives, obfuscation) in a serverless Azure pipeline built for speed and repeatability.

Pipeline Architecture

From upload to report
in seconds.

Four stages. Fully automated. Every file that enters the pipeline is triaged, analyzed, and reported without operator intervention.

01

Intake

Files are submitted via a secure REST API endpoint. The pipeline validates, fingerprints, and deduplicates each submission before staging it in Azure Blob Storage and queuing it for analysis.

  • Multipart HTTP upload — single or batch submissions
  • Per-file SHA-256 fingerprinting with submission-level dedup
  • Filename sanitization and size validation (configurable limits)
  • Structured intake form (submitter, context, archive password)
  • Automatic rollback on partial failures — no orphaned data
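
The fingerprinting, deduplication, and validation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's actual intake code: the function names, the 50 MB default limit, and the record layout are all assumptions made for the example. Because nothing is staged until every file has passed validation, a failure partway through leaves no partial state behind, which is one simple way to get the rollback behavior described.

```python
import hashlib
import re

# Hypothetical default; the page only says the limit is configurable.
MAX_FILE_BYTES = 50 * 1024 * 1024


def sanitize_filename(name):
    """Drop directory components and replace unsafe characters."""
    base = name.replace("\\", "/").rsplit("/", 1)[-1]
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    return cleaned or "unnamed"


def intake(files, max_bytes=MAX_FILE_BYTES):
    """Validate, fingerprint, and deduplicate one submission.

    `files` is a list of (filename, content_bytes) pairs. The staged list
    is only returned if every file passes validation.
    """
    staged = []
    seen = set()
    for name, data in files:
        if len(data) > max_bytes:
            raise ValueError(f"{name!r} exceeds the {max_bytes}-byte limit")
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue  # submission-level dedup: identical content already staged
        seen.add(digest)
        staged.append({
            "filename": sanitize_filename(name),
            "sha256": digest,
            "size": len(data),
        })
    return staged
```

In a real deployment the returned records would be written to Blob Storage and queued for analysis; here they are simply returned to the caller.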

02

Triage

Each file is classified by magic-byte inspection before any analysis runs. Type detection drives routing decisions — what analyses to run, what tools to dispatch, and how to recurse into extracted content.

  • 25+ format signatures: PE, ELF, Mach-O, PDF, OLE2, ZIP, RAR, 7z, gzip, and more
  • Platform routing: Windows, Linux, macOS
  • Extension fallback for interpreted code (JS, PS1, py, sh, bat, vbs, hta)
  • Category classification: executable, archive, document, interpreted_code, media, unknown
  • Triage result embedded in every report for full traceability
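
As a rough illustration of magic-byte classification with an extension fallback, the sketch below checks a handful of well-known file signatures and then falls back to the file extension for scripts. The signature table is a small subset chosen for the example, and the `triage` function and its result fields are assumptions rather than the engine's real schema.

```python
import os

# (magic prefix, type, category, platform): an illustrative subset only.
MAGIC_SIGNATURES = [
    (b"MZ", "pe", "executable", "windows"),
    (b"\x7fELF", "elf", "executable", "linux"),
    (b"%PDF", "pdf", "document", "any"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2", "document", "any"),
    (b"PK\x03\x04", "zip", "archive", "any"),
    (b"Rar!\x1a\x07", "rar", "archive", "any"),
    (b"7z\xbc\xaf\x27\x1c", "7z", "archive", "any"),
    (b"\x1f\x8b", "gzip", "archive", "any"),
]

# Scripts have no reliable magic bytes, so the extension decides.
SCRIPT_EXTENSIONS = {
    ".js": ("javascript", "any"),
    ".ps1": ("powershell", "windows"),
    ".py": ("python", "any"),
    ".sh": ("shell", "any"),
    ".bat": ("batch", "any"),
    ".vbs": ("vbscript", "windows"),
    ".hta": ("hta", "windows"),
}


def triage(filename, data):
    """Classify a file by leading bytes first, then by extension."""
    for magic, ftype, category, platform in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return {"type": ftype, "category": category,
                    "platform": platform, "method": "magic"}
    ext = os.path.splitext(filename.lower())[1]
    if ext in SCRIPT_EXTENSIONS:
        ftype, platform = SCRIPT_EXTENSIONS[ext]
        return {"type": ftype, "category": "interpreted_code",
                "platform": platform, "method": "extension"}
    return {"type": "unknown", "category": "unknown",
            "platform": "any", "method": "none"}
```

Checking magic bytes before the extension means a renamed executable such as `invoice.pdf` that starts with `MZ` is still routed as a PE file.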

03

Analyze

Analysis modules run in parallel — inline Python modules execute immediately and heavy native tools are dispatched asynchronously. Derived files are re-triaged and recursed up to three levels deep.

  • Inline modules: strings, YARA, PE, PDF, macros, obfuscation, base64, JS beautify, archive unpack
  • Heavy tools dispatched async: capa, speakeasy, pestudio, detect-it-easy, trid, xorsearch, binwalk, pestr
  • Recursive derived-file analysis — up to 3 levels of nesting
  • VirusTotal hash reputation lookup
  • Full error isolation — one failed module never blocks the rest
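
One way to get parallel execution with per-module error isolation is to run each module in a thread pool and catch exceptions when collecting results. The sketch below uses Python's standard `concurrent.futures`. The `run_modules` helper and the shape of its return value are hypothetical and only meant to show the pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def run_modules(data, modules):
    """Run every module against `data` in parallel.

    `modules` maps a module name to a callable taking the file bytes.
    An exception in one module is recorded and never propagates, so the
    remaining modules still contribute their results.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max(1, len(modules))) as pool:
        futures = {name: pool.submit(fn, data) for name, fn in modules.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:  # isolate the failure to this module
                errors[name] = f"{type(exc).__name__}: {exc}"
    return {"results": results, "errors": errors}
```

For example, calling `run_modules(b"abc", {"size": len, "broken": lambda d: 1 / 0})` returns the size result alongside a recorded `ZeroDivisionError` for the broken module.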

04

Report

Analysis results are written as structured JSON — one report per file, one summary per submission. Reports are stored in Azure Blob Storage and indexed in Table Storage for fast retrieval.

  • Per-file reports: triage, hashes, all analysis module outputs, derived files
  • Submission summary: aggregate status, file count, error log, timestamp
  • Status tracking: complete, partial, error — queryable from Table Storage
  • Pending analyses listed for files awaiting heavy-tool results
  • Fully structured JSON — import directly into your SIEM, SOAR, or case platform
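
To make the report structure more concrete, here is a sketch of how a per-file report and a submission summary could be assembled and serialized. The field names and the rule for deriving `complete`, `partial`, or `error` are assumptions for illustration; the actual report schema is not documented on this page.

```python
import json
from datetime import datetime, timezone


def file_status(results, errors):
    """Assumed rule: no errors is complete, all failed is error, else partial."""
    if not errors:
        return "complete"
    return "partial" if results else "error"


def build_file_report(sha256, triage, results, errors, pending=()):
    return {
        "sha256": sha256,
        "triage": triage,
        "status": file_status(results, errors),
        "analyses": results,
        "errors": errors,
        "pending_analyses": list(pending),  # heavy-tool results not yet in
    }


def build_summary(submission_id, reports):
    statuses = {r["status"] for r in reports}
    if statuses <= {"complete"}:
        status = "complete"
    elif statuses == {"error"}:
        status = "error"
    else:
        status = "partial"
    return {
        "submission_id": submission_id,
        "status": status,
        "file_count": len(reports),
        "errors": [{"sha256": r["sha256"], "errors": r["errors"]}
                   for r in reports if r["errors"]],
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

Both structures contain only plain dictionaries, lists, and strings, so `json.dumps(build_summary(...))` produces output that downstream tooling can ingest without custom parsing.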

Analysis Modules

Six modules.
Parallel execution.

Every module runs in isolation. A failure in one never stops the others. Results are merged into a single structured report per file.

String & IOC Extraction

  • ASCII and UTF-16LE string extraction
  • IOC parsing: URLs, IPv4, email addresses, registry keys
  • Suspicious Win32 API detection (60+ signatures)
  • Packer signature matching: UPX, ASPack, Themida, VMProtect, MPRESS
  • Crypto artifact detection (PEM keys, AES markers)
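
The string extraction and IOC parsing described above can be approximated with regular expressions. The sketch below pulls printable ASCII and UTF-16LE runs of four or more characters, then scans them for URLs and IPv4 addresses. The minimum length of four and the specific patterns are assumptions for this example, and a production extractor would cover more IOC types.

```python
import re

ASCII_RE = re.compile(rb"[\x20-\x7e]{4,}")
UTF16LE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data):
    """Return printable ASCII strings followed by UTF-16LE strings."""
    found = [m.group().decode("ascii") for m in ASCII_RE.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in UTF16LE_RE.finditer(data)]
    return found


def extract_iocs(strings):
    """Collect de-duplicated URLs and valid IPv4 addresses."""
    urls, ips = set(), set()
    for s in strings:
        urls.update(URL_RE.findall(s))
        for candidate in IPV4_RE.findall(s):
            # Reject dotted quads with out-of-range octets such as 999.1.1.1.
            if all(int(octet) <= 255 for octet in candidate.split(".")):
                ips.add(candidate)
    return {"urls": sorted(urls), "ipv4": sorted(ips)}
```

Scanning both encodings matters for Windows binaries, where many embedded strings are stored as UTF-16LE and would be missed by an ASCII-only pass.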

YARA Scanning

  • Bulk and per-file rule compilation with broken-rule filtering
  • Namespace-based loading — community rulesets alongside custom rules
  • External variable injection: filename, extension, filetype per scan
  • Configurable per-scan timeout to prevent runaway matches
  • Match output: rule name, namespace, tags, metadata, string hit count

PE / Executable Analysis

  • Machine type, compile timestamp, image base, entry point, subsystem
  • Section entropy analysis — high-entropy sections flagged automatically
  • Full import and export table parsing with imphash
  • Resource directory enumeration
  • TLS callback and overlay detection
  • Packer section name matching
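
Section entropy is typically measured as Shannon entropy in bits per byte, where values close to 8 suggest compressed or encrypted content. The sketch below computes it and flags sections above a threshold. The 7.0 threshold and the `flag_high_entropy` helper are assumptions for illustration, and the section bytes are expected to come from whatever PE parser is in use.

```python
import math
from collections import Counter

# Hypothetical threshold; packed or encrypted data usually sits near 8.0.
HIGH_ENTROPY_THRESHOLD = 7.0


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum((count / total) * math.log2(total / count)
               for count in Counter(data).values())


def flag_high_entropy(sections, threshold=HIGH_ENTROPY_THRESHOLD):
    """Return the names of sections whose entropy meets the threshold.

    `sections` maps a section name to its raw bytes.
    """
    return [name for name, raw in sections.items()
            if shannon_entropy(raw) >= threshold]
```

A section filled with a single repeated byte scores 0.0, while a section in which all 256 byte values are equally frequent scores exactly 8.0.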

Document & Macro Analysis

  • PDF: metadata, URI extraction, JavaScript detection, embedded file enumeration
  • PDF JavaScript extracted as derived files for further analysis
  • OLE2 and OOXML VBA module extraction via olevba
  • XLM macro enumeration
  • olevba IOC scan: AutoExec, hex/base64 strings, Dridex artifacts
  • Each extracted macro module re-enters the analysis pipeline

Archive Unpacking

  • ZIP: encrypted member support, per-member size and total size limits
  • tar, gzip, bzip2, xz stream decompression
  • Nested archive recursion — gzip wrapping tar re-enters the pipeline automatically
  • RAR and 7z detected and flagged for heavy-tool dispatch
  • All unpacked members triaged and analyzed as first-class files
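
Per-member and total size limits are the main defense against decompression bombs. The sketch below shows one way to enforce them with Python's standard `zipfile` module. The limits and the `unpack_zip` helper are hypothetical, and note that the standard library only decrypts legacy ZipCrypto archives, so AES-encrypted ZIPs would need a different library.

```python
import io
import zipfile


def unpack_zip(data, password=None,
               max_member_bytes=20 * 1024 * 1024,
               max_total_bytes=100 * 1024 * 1024):
    """Extract ZIP members in memory while enforcing size limits.

    Returns (members, skipped): a dict of filename to bytes, and a list of
    member names that were rejected for exceeding a limit. `password`, if
    given, must be bytes.
    """
    members, skipped = {}, []
    total = 0
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            declared = info.file_size
            if declared > max_member_bytes or total + declared > max_total_bytes:
                skipped.append(info.filename)
                continue
            # The declared size can be forged, so cap the actual read too.
            with zf.open(info, pwd=password) as fh:
                content = fh.read(max_member_bytes + 1)
            if len(content) > max_member_bytes:
                skipped.append(info.filename)
                continue
            total += len(content)
            members[info.filename] = content
    return members, skipped
```

Each entry in `members` can then be handed back to triage so that unpacked files are analyzed like any other submission.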

Obfuscation Scoring

  • Language-specific heuristics: JavaScript, PowerShell, Python, shell scripts
  • Generic metrics: entropy, line length, whitespace ratio, hex escape density
  • JS: eval, new Function, String.fromCharCode, atob, hex-prefixed identifiers
  • PowerShell: -EncodedCommand, FromBase64String, IEX, -bxor, char casts
  • Verdicts: clean / minified / likely_obfuscated / highly_obfuscated
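
A heuristic scorer along these lines might count language-specific indicators and combine them with generic metrics. The sketch below is a simplified illustration: the indicator patterns follow the list above, but the thresholds, the verdict rules, and the `score_obfuscation` function are assumptions, not the engine's actual scoring model.

```python
import re

# (patterns, regex flags) per language. PowerShell is case-insensitive.
INDICATORS = {
    "javascript": ([r"\beval\s*\(", r"\bnew\s+Function\b",
                    r"String\.fromCharCode", r"\batob\s*\(",
                    r"\b_0x[0-9a-f]+\b"], 0),
    "powershell": ([r"-EncodedCommand", r"FromBase64String", r"\bIEX\b",
                    r"-bxor", r"\[char\]"], re.IGNORECASE),
}


def score_obfuscation(text, language):
    """Return indicator hits, generic metrics, and a verdict string."""
    patterns, flags = INDICATORS.get(language, ([], 0))
    hits = sum(1 for p in patterns if re.search(p, text, flags))
    lines = text.splitlines() or [""]
    max_line = max(len(line) for line in lines)
    # Each \xNN escape occupies four characters of source text.
    hex_escapes = len(re.findall(r"\\x[0-9a-fA-F]{2}", text))
    hex_density = hex_escapes * 4 / max(1, len(text))

    # Hypothetical thresholds, chosen only to illustrate the four verdicts.
    if hits >= 3 or hex_density > 0.3:
        verdict = "highly_obfuscated"
    elif hits >= 1:
        verdict = "likely_obfuscated"
    elif max_line > 1000:
        verdict = "minified"
    else:
        verdict = "clean"
    return {"indicator_hits": hits, "max_line_length": max_line,
            "hex_escape_density": hex_density, "verdict": verdict}
```

Separating `minified` from the obfuscated verdicts keeps ordinary build output, which has very long lines but no suspicious constructs, from being flagged as malicious.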

Behavioral Analysis — Pro

Heavy-tool dispatch.
Async. Non-blocking.

When a file warrants deeper inspection, the Triage Engine dispatches it to a second analysis tier that runs industry-standard native tools inside a containerized environment — without blocking the fast static analysis path.

Tool dispatch is routed by file category and platform: Windows PE files receive the full stack; Linux/macOS executables receive a subset; media and unknown files receive detect-it-easy, trid, and xorsearch.

  • capa: capability detection; maps executable behaviors to MITRE ATT&CK techniques
  • speakeasy: Windows PE emulation; extracts dynamic behaviors and API call sequences without a sandbox
  • pestudio: PE artifact analysis; imports, strings, indicators, and threat scoring
  • detect-it-easy: packer and compiler identification for executables and archives
  • trid: file type identification based on statistical signature analysis
  • xorsearch: XOR-encoded string extraction across all file types; runs on every submission
  • binwalk: firmware and binary file analysis; embedded file and filesystem extraction
  • pestr: PE-specific string extraction with encoding awareness

Supported File Types

25+ format signatures.
Magic-byte detection.

Extension-based fallback handles interpreted code and script files where magic bytes are absent.

Format and platform:

  • PE / DOS Executable: Windows
  • ELF Executable: Linux
  • Mach-O (32/64-bit, fat): macOS
  • PDF Document: Any
  • OLE2 Compound Document: Any
  • ZIP / OOXML Archive: Any
  • RAR Archive: Any
  • 7z Archive: Any
  • gzip / bzip2 / xz / tar: Any
  • JavaScript / Node.js: Any
  • PowerShell Script: Windows
  • VBScript / VBA / HTA: Windows
  • Python / Perl / Ruby / PHP: Any
  • Bash / Shell / Batch: Any
  • HTML Document: Any
  • JPEG / PNG / GIF / BMP: Any
  • MP4 / WAV / MP3 / Ogg: Any
  • Unknown (generic triage): Any

Built for Real-World Operations

Four operational
use cases.

01

Incident Response Triage

During an active incident, analysts need answers fast. Submit suspicious files from an endpoint or network capture and receive structured reports in seconds — no sandbox queue, no manual RE required to get initial signal.

02

Phishing & Email Analysis

Drop suspected phishing attachments directly into the pipeline. The engine unpacks archives, extracts Office macros, analyzes embedded PDFs, and scores scripts for obfuscation — surfacing what a document actually does without opening it.

03

Threat Intelligence Enrichment

Enrich IOC feeds and threat reports with static analysis context. YARA matches tie samples to known threat actor tooling, PE hashes correlate against VirusTotal, and string extraction pulls embedded infrastructure at scale.

04

SOC Automation

Integrate the intake API with your SOAR platform or SIEM alert pipeline. Every alert involving a suspicious file can trigger an automated analysis job, with results delivered via webhook and indexed for case management.

Azure-Native Architecture

Serverless.
No VMs. No clusters.

The Triage Engine is built entirely on Azure serverless primitives. It scales to zero when idle and handles burst submissions without pre-provisioning. No infrastructure to maintain, no clusters to size.

  • Azure Functions: serverless compute for the intake HTTP handler and the queue-triggered analysis workers
  • Azure Blob Storage: immutable file storage for submitted samples, per-file reports, and submission summaries
  • Azure Table Storage: submission index and per-file status tracking, queryable without loading full reports
  • Azure Queue Storage: decoupled message bus between intake and analysis workers, with a separate queue for heavy-tool dispatch
  • Azure Application Insights: structured telemetry with custom dimensions for every analysis stage, queryable via Kusto

Get the Triage Engine.

Available as a self-hosted open-source deployment or as a managed cloud service with SLA, custom YARA management, and dedicated support.