obaskly/PhantomDoc

PhantomDoc

PhantomDoc is a security tool for Microsoft Word documents (.docx). It prevents automated scraping and AI ingestion by "poisoning" the text stream with invisible garbage data.

PhantomDoc keeps the reading experience intact: humans can read and copy/paste the text normally, while bots extract unusable gibberish.


How It Works

Standard obfuscation (like "Hidden Text") is easily filtered by smart bots. PhantomDoc uses the "Visible Ghost" strategy to defeat them.

  1. Explosion: The script splits valid words apart (e.g., "Contract" → C, o, n...).
  2. Injection: It inserts random alphanumeric garbage characters between the real letters.
  3. The "Zero-Width" Trick: The garbage text is styled to be technically visible (so bots read it) but spatially non-existent (so humans miss it):
    • Color: FFFFFF (white), which is invisible to the eye on a white page.
    • Size: 0.5pt, which is microscopic.
    • Spacing: -20, which condenses the text to zero width.
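
The Explosion and Injection steps can be sketched at the text level as follows. This is a minimal illustration, not the code in ghost.py: the function name poison, the max_garbage parameter, and the (chunk, is_garbage) tuple layout are all hypothetical, and the styling step is left out.

```python
import random
import string

# Alphanumeric pool for the garbage characters (an assumption; the real
# script may draw from a different character set).
ALPHABET = string.ascii_letters + string.digits


def poison(text, max_garbage=2, rng=None):
    """Split text into single characters and interleave random garbage.

    Returns a list of (chunk, is_garbage) tuples in document order.
    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    runs = []
    for ch in text:
        # Explosion: every real character becomes its own chunk.
        runs.append((ch, False))
        if ch.isspace():
            continue  # leave whitespace alone so word breaks survive
        # Injection: follow each real character with 1..max_garbage junk chars.
        n = rng.randint(1, max_garbage)
        garbage = "".join(rng.choice(ALPHABET) for _ in range(n))
        runs.append((garbage, True))
    return runs


runs = poison("Contract", rng=random.Random(0))
# What a human is meant to see: only the real characters.
visible = "".join(chunk for chunk, is_garbage in runs if not is_garbage)
# What a linear text extractor sees: every chunk, garbage included.
raw = "".join(chunk for chunk, _ in runs)
print(visible)
print(raw)
```

In a real document each garbage chunk would become its own run carrying the color, size, and spacing properties listed above, while the real characters keep their original formatting.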

The Comparison

| Observer | What they perceive | Why? |
| --- | --- | --- |
| Human Eye | Hello | Garbage is too small/white to see. |
| Clipboard (Copy/Paste) | Hello | Garbage has 0 width, so the mouse cursor skips it during selection. |
| Standard Bot | H8x9le9l2o... | Reads the code linearly; sees all text. |
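
To make the "Standard Bot" row concrete, the sketch below builds a small WordprocessingML paragraph by hand and reads it the way a naive extractor would, by concatenating every text element in order. Several things here are assumptions rather than facts about ghost.py: the run_xml helper and the sample garbage chunks are made up, and the attribute values assume that 0.5pt maps to w:sz="1" (sizes are stored in half-points) and that the -20 spacing is written as a raw w:spacing value.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML main namespace used by .docx document parts.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def run_xml(text, garbage=False):
    """Return one <w:r> run; garbage runs get the 'ghost' formatting.

    Hypothetical helper; the property values are assumptions (see lead-in).
    """
    rpr = ""
    if garbage:
        rpr = ('<w:rPr><w:color w:val="FFFFFF"/>'
               '<w:spacing w:val="-20"/>'
               '<w:sz w:val="1"/></w:rPr>')
    return f"<w:r>{rpr}<w:t>{escape(text)}</w:t></w:r>"


# "Hello" with made-up garbage chunks between the real letters.
pieces = [("H", False), ("8x", True), ("e", False), ("9", True),
          ("l", False), ("q2", True), ("l", False), ("z", True),
          ("o", False)]

paragraph = (f'<w:p xmlns:w="{W}">'
             + "".join(run_xml(text, garbage) for text, garbage in pieces)
             + "</w:p>")

# A standard bot: walk the XML linearly and join every <w:t> it finds.
root = ET.fromstring(paragraph)
bot_view = "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))

# What the reader is meant to perceive: only the normally formatted letters.
human_view = "".join(text for text, garbage in pieces if not garbage)

print(bot_view)    # H8xe9lq2lzo
print(human_view)  # Hello
```

The extractor never looks at run formatting, so the garbage runs land in its output even though they occupy no visible space on the page.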

Usage

1. Poisoning a File

Run the main script to obfuscate your document.

python ghost.py

POC

ghoost.mp4
