🕵️ SHA-Index

The Global Registry of Model Provenance

SHA-Index is a community-driven initiative to map the "DNA" of open-source AI models. By indexing the unique SHA256 hashes of model weights (via Git LFS), we trace the lineage of models across the Hugging Face Hub, identifying original authors and verifying model authenticity.

🛠️ Our Tools

🕵️ Search-SHA (Live Tool)

Search-SHA is our flagship tool. It allows you to:

  • Trace Origins: Paste a SHA256 hash to find the original repository where it first appeared.
  • Live Patrol: Scan the newest uploads on Hugging Face to detect re-uploads and uncredited copies in real time.
  • Index Repos: Help grow the database by scanning your favorite models.

💾 The Data

🧬 Model DNA Index

This is the central database powering our tools. It is an open, community-maintained registry of:

  • SHA256 Hashes (extracted from LFS pointers)
  • Repository IDs
  • Creation Timestamps
  • Filenames

This dataset is updated automatically by the Search-SHA Space.
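As a rough illustration of what one entry in the registry might look like, here is a minimal sketch in Python. The class name `IndexRecord` and its field names are hypothetical, chosen only to mirror the four fields listed above; the actual dataset schema may differ.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical record shape mirroring the four listed fields.
# The real dataset's column names and types are not specified here.
@dataclass(frozen=True)
class IndexRecord:
    sha256: str           # hex digest taken from the LFS pointer
    repo_id: str          # e.g. "owner/model-name"
    created_at: datetime  # creation timestamp
    filename: str         # path of the file inside the repository

    def __post_init__(self) -> None:
        # A SHA256 digest is 64 lowercase hexadecimal characters.
        if not re.fullmatch(r"[0-9a-f]{64}", self.sha256):
            raise ValueError("sha256 must be 64 lowercase hex characters")

# Made-up values, for illustration only.
example_record = IndexRecord(
    sha256="ab" * 32,
    repo_id="example-user/example-model",
    created_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    filename="model.safetensors",
)
```

Validating the digest up front keeps malformed hashes out of the index, which matters because every lookup is keyed on that string.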

🔍 How It Works

We do not download model weights. Instead, we analyze the Git LFS (Large File Storage) pointer files. Each pointer is a small text file that records the SHA256 hash and byte size of the file it stands in for, so it acts as a fingerprint for the actual weights.
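For reference, a Git LFS pointer is a few lines of text containing a `version` line, an `oid sha256:` line, and a `size` line. The sketch below shows one way such a pointer could be parsed in Python. The function name `parse_lfs_pointer` and the sample hash are made up for illustration; this is not necessarily how Search-SHA reads pointers.

```python
import re
from typing import NamedTuple

class LfsPointer(NamedTuple):
    oid: str   # SHA256 hex digest of the real file
    size: int  # size of the real file in bytes

_OID_RE = re.compile(r"^oid sha256:([0-9a-f]{64})$", re.MULTILINE)
_SIZE_RE = re.compile(r"^size (\d+)$", re.MULTILINE)

def parse_lfs_pointer(text: str) -> LfsPointer:
    """Extract the SHA256 oid and size from Git LFS pointer text."""
    oid_match = _OID_RE.search(text)
    size_match = _SIZE_RE.search(text)
    if oid_match is None or size_match is None:
        raise ValueError("not a valid Git LFS pointer")
    return LfsPointer(oid=oid_match.group(1), size=int(size_match.group(1)))

# Sample pointer with a made-up hash and size.
example_pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + "ab" * 32 + "\n"
    "size 1024\n"
)
example_pointer = parse_lfs_pointer(example_pointer_text)
```

Because the pointer is only about a hundred bytes, reading it costs almost nothing compared with fetching multi-gigabyte weight files.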

  1. Scan: We read the oid sha256:... from the pointer file.
  2. Compare: We check our index for this hash.
  3. Verify: If the hash is already indexed, the entry with the earliest creation timestamp wins. That repository is considered the "Original Source."
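The compare and verify steps can be sketched as a lookup over an in-memory index. This is a simplified illustration, assuming the index maps each hash to a list of (repository ID, creation timestamp) pairs; the function name `find_original_source` and the repository names are hypothetical, and the real index is the dataset described above.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Tuple

# (repo_id, created_at) for one place a hash was seen.
Sighting = Tuple[str, datetime]

def find_original_source(
    index: Dict[str, List[Sighting]], sha256: str
) -> Optional[str]:
    """Return the repo with the earliest timestamp for this hash, if any."""
    sightings = index.get(sha256)
    if not sightings:
        return None  # hash has not been indexed yet
    # Earliest creation timestamp wins.
    repo_id, _ = min(sightings, key=lambda s: s[1])
    return repo_id

# Made-up data: the same weights appear in two repositories.
example_hash = "cd" * 32
example_index: Dict[str, List[Sighting]] = {
    example_hash: [
        ("bob/reupload", datetime(2024, 6, 1, tzinfo=timezone.utc)),
        ("alice/base-model", datetime(2023, 3, 15, tzinfo=timezone.utc)),
    ]
}

print(find_original_source(example_index, example_hash))  # alice/base-model
```

A hash that is missing from the index returns `None`, which corresponds to a model that still needs to be scanned.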

🤝 Contributing

We believe in open provenance. You can contribute by:

  • Using Search-SHA to index missing models.
  • Reporting bugs or feature requests in our community discussions.

Let's build the source of truth for AI weights.
