PhantomDoc is a security tool for Microsoft Word documents (.docx). It prevents automated scraping and AI ingestion by "poisoning" the text stream with invisible garbage data.
PhantomDoc is designed to leave the reading experience unchanged: humans can read and copy/paste the text normally, while bots extract unusable gibberish.
Standard obfuscation (like "Hidden Text") is easily filtered by smart bots. PhantomDoc uses the "Visible Ghost" strategy to defeat them.
- Explosion: The script splits valid words apart (e.g., "Contract" → C,o,n...).
- Injection: It inserts random alphanumeric garbage characters between the real letters.
- The "Zero-Width" Trick: The garbage text is styled to be technically visible (so bots read it) but spatially non-existent (so humans miss it):
  - Color: `FFFFFF` (white), invisible to the eye.
  - Size: `0.5pt`, microscopic.
  - Spacing: `-20`, condenses the text to 0 pixels width.
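
The explosion and injection steps can be sketched in a few lines of Python. This is a minimal illustration, not the code in `ghost.py`: the function names `poison`, `visible_text`, and `raw_text`, the garbage alphabet, and the 1 to 3 character garbage length are all assumptions made for the example.

```python
import random
import string

# Assumed garbage alphabet: ASCII letters and digits.
ALPHABET = string.ascii_letters + string.digits


def poison(text, min_len=1, max_len=3, rng=None):
    """Return a list of (chunk, is_garbage) pairs for `text`.

    Explosion: every real character becomes its own chunk.
    Injection: a short random alphanumeric chunk is placed between
    each pair of neighbouring real characters.
    """
    rng = rng or random.Random()
    chunks = []
    for i, ch in enumerate(text):
        chunks.append((ch, False))
        if i < len(text) - 1:
            n = rng.randint(min_len, max_len)
            garbage = "".join(rng.choice(ALPHABET) for _ in range(n))
            chunks.append((garbage, True))
    return chunks


def visible_text(chunks):
    """What a human reader is meant to see: real characters only."""
    return "".join(c for c, is_garbage in chunks if not is_garbage)


def raw_text(chunks):
    """What a linear text extractor reads: every chunk, in order."""
    return "".join(c for c, _ in chunks)


chunks = poison("Contract", rng=random.Random(0))
print(visible_text(chunks))  # the original word
print(raw_text(chunks))      # the word with garbage interleaved
```

Keeping the real and garbage chunks separate, rather than building one string, matters for the styling step: each garbage chunk would need to become its own run in the document so that the white, 0.5pt, condensed formatting applies to it alone.
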

| Observer | What they perceive | Why? |
|---|---|---|
| Human Eye | Hello | Garbage is too small/white to see. |
| Clipboard (Copy/Paste) | Hello | Garbage has 0 width, so the mouse cursor skips it during selection. |
| Standard Bot | H8x9le9l2o... | Reads the code linearly; sees all text. |

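
The difference between the observers can be illustrated at the XML level. The sketch below uses only the standard library to build WordprocessingML runs and then reads them back two ways. It rests on several assumptions: that the `-20` spacing value maps to `w:spacing` (measured in twentieths of a point), that the 0.5pt size is written as `w:sz` with value `1` (half-points), and that a simple "skip white runs" filter is an acceptable stand-in for what a renderer shows. The helper names are hypothetical and are not taken from `ghost.py`.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def make_run(text, ghost=False):
    """Build a <w:r> element; ghost runs get the white/tiny/condensed styling."""
    run = ET.Element(w("r"))
    if ghost:
        rpr = ET.SubElement(run, w("rPr"))
        ET.SubElement(rpr, w("color"), {w("val"): "FFFFFF"})  # white
        ET.SubElement(rpr, w("spacing"), {w("val"): "-20"})   # assumed: -1pt
        ET.SubElement(rpr, w("sz"), {w("val"): "1"})          # 0.5pt in half-points
    t = ET.SubElement(run, w("t"))
    t.text = text
    return run


def build_paragraph(chunks):
    """Build a <w:p> from (text, is_ghost) pairs."""
    p = ET.Element(w("p"))
    for text, ghost in chunks:
        p.append(make_run(text, ghost))
    return p


def is_ghost(run):
    """True if the run carries the white color used for garbage text."""
    rpr = run.find(w("rPr"))
    if rpr is None:
        return False
    color = rpr.find(w("color"))
    return color is not None and color.get(w("val")) == "FFFFFF"


def bot_view(paragraph):
    """Linear extraction: concatenate every <w:t>, ignoring formatting."""
    return "".join(t.text or "" for t in paragraph.iter(w("t")))


def human_view(paragraph):
    """Crude stand-in for rendering: drop runs styled as ghost text."""
    return "".join(
        run.findtext(w("t"), default="")
        for run in paragraph.iter(w("r"))
        if not is_ghost(run)
    )


p = build_paragraph([
    ("H", False), ("8x", True), ("e", False), ("9", True),
    ("l", False), ("q2", True), ("l", False), ("7", True), ("o", False),
])
print(bot_view(p))    # H8xe9lq2l7o
print(human_view(p))  # Hello
```
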
Run the main script to obfuscate your document:

    python ghost.py