
Certificate of Completion
THIS ACKNOWLEDGES THAT
Dharineesh Somisetty
HAS COMPLETED THE SPRING 2026 DEEP LEARNING BOOT CAMP
Roman Holowinsky, PhD
DIRECTOR
MARCH 25, 2026
DATE

TEAM
Fragmented ID Resolution
Noimot Bakare Ayoub, Dharineesh Somisetty, Arpith Shanbhag, Pedro Fontanarrosa

Scope: Detect duplicate identities across noisy, fragmented datasets (fraud, patient mismatch, citizen records).
Architecture: CNN Embeddings + Siamese Network
Problem: Real-world identity data is messy. Small inconsistencies cause one person to appear as multiple records, creating operational risk and inefficiency.
Approach: We learn record similarity using CNN embeddings and a Siamese network. LinkID detects, ranks, and resolves duplicate identities, auto-linking high-confidence matches and routing borderline cases for review.
Data: HPI snapshot of North Carolina voter records with labeled duplicate and non-duplicate pairs.
Results: Strong performance overall, with ~25-point improvement on hard cases where traditional models struggle.
Conclusion: Learned similarity models significantly outperform traditional approaches in complex identity resolution tasks.
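
Sketch: The detect, rank, and route steps named under Approach could look roughly like the following. This is a minimal illustration only, not the team's implementation. It assumes record embeddings have already been produced by the Siamese network, and it uses cosine similarity as the score. The threshold values and the names `cosine_similarity`, `triage`, and `rank_candidates` are hypothetical and do not come from the project.

```python
import math

# Hypothetical thresholds; the source does not state the values LinkID uses.
AUTO_LINK_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.70


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero embedding as matching nothing
    return dot / (norm_a * norm_b)


def triage(score):
    """Map a similarity score to one of three routing decisions."""
    if score >= AUTO_LINK_THRESHOLD:
        return "auto_link"   # high confidence: link without review
    if score >= REVIEW_THRESHOLD:
        return "review"      # borderline: send to a human reviewer
    return "no_match"


def rank_candidates(query_embedding, candidates):
    """Score every candidate against the query and sort best-first.

    candidates maps a record id to its embedding (assumed precomputed).
    Returns a list of (record_id, score, decision) tuples.
    """
    scored = [
        (record_id, cosine_similarity(query_embedding, embedding))
        for record_id, embedding in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(record_id, score, triage(score)) for record_id, score in scored]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, for illustration only.
    toy_candidates = {"a": [1.0, 0.0], "b": [1.0, 1.0], "c": [0.0, 1.0]}
    for row in rank_candidates([1.0, 0.0], toy_candidates):
        print(row)
```

In practice the two thresholds would be tuned on the labeled duplicate and non-duplicate pairs, trading off wrongly auto-linked records against reviewer workload.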
