parser_logs Processing Pipeline
Interactive six-stage pipeline diagram for Golliath's parser_logs subsystem. Walk through Walker, Layout Classifier, File Classifier, Grammar Router, Field Extractors, and NLP Enrichment — each stage reveals its implementation logic, safety guards, and data outputs.
By Mohamed Habib Jaouadi • June 22, 2026
#cti
#parser
#stealer-logs
#python
#nlp
#pipeline
#archive-analysis
Six-stage pipeline from raw archive to enriched structured intelligence. Click any stage.
Walker
Recursive Traversal
Recursively walk the extracted archive tree. Enforce safety limits before touching anything.
Output
File path + metadata stream
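A minimal sketch of what the traversal and its output stream might look like, assuming the archive has already been extracted to a directory. The names `walk_extracted` and `FileRecord` are hypothetical and not taken from parser_logs; only the record fields (path, size, mtime) come from the stage description.

```python
import os
import stat
from typing import Iterator, TypedDict


class FileRecord(TypedDict):
    """Hypothetical record shape: one entry per file handed to the classifier."""
    path: str
    size: int
    mtime: float


def walk_extracted(root: str) -> Iterator[FileRecord]:
    """Yield one record per regular file under root, without following symlinks."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        dirnames.sort()  # sort in place so the traversal order is deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: inspect the link itself, not its target
            # Skip symlinks and special files; only regular files are emitted.
            if not stat.S_ISREG(st.st_mode):
                continue
            yield {"path": full, "size": st.st_size, "mtime": st.st_mtime}
```

Writing the walker as a generator keeps memory use flat on large trees, since downstream stages can consume records one at a time.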
Implementation
- Extracts ZIP, RAR, 7z, and TAR archives with format-specific decompressors
- Archive bomb guard: aborts when the compression ratio exceeds 20x and the size is above 1 MB
- Entry count cap: aborts when an archive holds more than 100,000 entries
- Nested archive depth: at most 3 levels
- Zip-slip protection: resolves all symlinks before extraction, so an entry cannot land outside the extraction root
- Emits { path, size, mtime } records for the classifier
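The guards listed above can be sketched for the ZIP case with Python's standard `zipfile` module. This is an illustration under stated assumptions, not the parser_logs implementation: `check_zip` and `UnsafeArchive` are hypothetical names, the 1 MB threshold is assumed to apply to the total uncompressed size, and the ratio is computed from the sizes declared in the archive headers.

```python
import os
import zipfile
from typing import List

MAX_RATIO = 20               # abort above a 20x compression ratio
RATIO_MIN_BYTES = 1 << 20    # 1 MB; assumed to mean total uncompressed size
MAX_ENTRIES = 100_000        # entry count cap
MAX_DEPTH = 3                # nested archive depth limit


class UnsafeArchive(Exception):
    """Hypothetical exception raised when an archive trips a safety limit."""


def check_zip(path: str, dest: str, depth: int = 1) -> List[str]:
    """Validate a ZIP before extraction and return the resolved target paths."""
    if depth > MAX_DEPTH:
        raise UnsafeArchive(f"nesting depth {depth} exceeds {MAX_DEPTH}")

    dest_root = os.path.realpath(dest)
    targets: List[str] = []
    with zipfile.ZipFile(path) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_ENTRIES:
            raise UnsafeArchive(f"{len(infos)} entries exceeds {MAX_ENTRIES}")

        # Sizes come from the archive headers, so this is a pre-check only.
        total_unc = sum(i.file_size for i in infos)
        total_cmp = sum(i.compress_size for i in infos)
        if total_unc > RATIO_MIN_BYTES and total_unc > MAX_RATIO * max(total_cmp, 1):
            raise UnsafeArchive("compression ratio exceeds 20x")

        for info in infos:
            # Resolve symlinks and ".." segments, then require the result
            # to stay inside the extraction root (zip-slip check).
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise UnsafeArchive(f"entry escapes extraction root: {info.filename}")
            targets.append(target)
    return targets
```

Because header sizes can be forged, a production guard would likely also count bytes while decompressing and stop once the limit is crossed. A caller handling nested archives would pass `depth + 1` for each archive found inside another.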