parser_logs Processing Pipeline

Interactive six-stage pipeline diagram for Golliath's parser_logs subsystem. Walk through Walker, Layout Classifier, File Classifier, Grammar Router, Field Extractors, and NLP Enrichment — each stage reveals its implementation logic, safety guards, and data outputs.

By Mohamed Habib Jaouadi, June 22, 2026
#cti
#parser
#stealer-logs
#python
#nlp
#pipeline
#archive-analysis
Six-stage pipeline from raw archive to enriched structured intelligence.
Walker
Recursive Traversal

Recursively walk the extracted archive tree. Enforce safety limits before touching anything.

Output
File path + metadata stream
Implementation
  • Extracts ZIP/RAR/7z/TAR archives with format-specific decompressors
  • Archive bomb guard: abort when the compression ratio exceeds 20x (applied above 1 MB)
  • Entry count cap: abort when an archive has more than 100,000 entries
  • Nested archive depth: at most 3 levels
  • Zip-slip protection: resolves all symlinks before extraction
  • Emits { path, size, mtime } records for the classifier
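
The guards listed above can be sketched in a few lines of Python. This is a minimal illustration for ZIP only, not the actual parser_logs code: the names `UnsafeArchive`, `check_zip`, and `walk` are hypothetical, the thresholds are copied from the list, and two readings are assumed (the 1 MB floor applies to the total uncompressed size, and depth 0 is the top-level archive). It relies on the sizes declared in the archive headers, which a hostile archive can misreport, so a real walker would also count bytes while extracting. Requires Python 3.9+ for `Path.is_relative_to`.

```python
import os
import zipfile
from pathlib import Path

# Thresholds taken from the list above; the names are illustrative.
MAX_RATIO = 20                # uncompressed / compressed
RATIO_MIN_BYTES = 1024 * 1024 # ratio check only applies above 1 MB (assumed: uncompressed total)
MAX_ENTRIES = 100_000
MAX_DEPTH = 3                 # assumed: depth 0 is the top-level archive


class UnsafeArchive(Exception):
    """Raised when an archive violates one of the safety limits."""


def check_zip(zf: zipfile.ZipFile, dest: Path, depth: int = 0) -> None:
    """Validate a ZIP against the limits before any entry is extracted."""
    if depth > MAX_DEPTH:
        raise UnsafeArchive(f"nested archive depth {depth} exceeds {MAX_DEPTH}")

    infos = zf.infolist()
    if len(infos) > MAX_ENTRIES:
        raise UnsafeArchive(f"{len(infos)} entries exceeds {MAX_ENTRIES}")

    # Archive bomb guard, based on header-declared sizes.
    uncompressed = sum(i.file_size for i in infos)
    compressed = max(sum(i.compress_size for i in infos), 1)
    if uncompressed > RATIO_MIN_BYTES and uncompressed > MAX_RATIO * compressed:
        raise UnsafeArchive(f"compression ratio {uncompressed / compressed:.0f}x")

    # Zip-slip guard: every resolved target must stay inside the destination.
    root = dest.resolve()
    for info in infos:
        target = (root / info.filename).resolve()
        if not target.is_relative_to(root):
            raise UnsafeArchive(f"entry escapes destination: {info.filename!r}")


def walk(root: Path):
    """Yield { path, size, mtime } records for every file under root."""
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            st = path.lstat()  # lstat: describe the link itself, never follow it
            yield {"path": str(path), "size": st.st_size, "mtime": st.st_mtime}
```

A caller would run `check_zip` first, extract only if it returns without raising, and then feed the records from `walk` to the next stage. RAR, 7z, and TAR would need equivalent checks through their own libraries.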