Hello! I'm a research engineer with an academic computer vision background in NYC. Before graduating in 2018, I was a Ph.D. student at the Cornell Tech SE(3) Vision Group under Dr. Serge Belongie. I previously worked for four years as a software engineer at Google Research. I started my career as a research assistant at the Vision and Security Technology lab and as a software engineer at Securics, Inc., both under Terrance E. Boult.

My interests include applying computer vision and machine learning to empower users and the society we live in, especially in assistive technology, privacy/security, ecology, individual identity, creativity, and antisurveillance. Ask me about my side projects!

The work I enjoy most is visually communicating experimental results to an outside audience, quickly producing figures or plots of internal ML state diagnostics, and creating infra/tools that help engineers find insights faster. I'm also skilled at performance optimization, e.g. rewriting hot loops in lower-level languages or with SSE/AVX compiler intrinsics, especially to improve "research-quality" code.

Find me around the web

Selected Papers

This is an excerpt. View all publications...


Dense Prediction

  • [PDF] [arXiv] PolyMaX: General Dense Prediction with Mask Transformer
    Xuan Yang; Liangzhe Yuan; Kimberly Wilber; Astuti Sharma; Xiuye Gu; Siyuan Qiao; Stephanie Debats; Huisheng Wang; Hartwig Adam; Mikhail Sirotenko; Liang-Chieh Chen. Winter Conference on Applications of Computer Vision (WACV 2024)
What's the difference between depth estimation, semantic segmentation, and surface normal prediction? Less than you might think. Previous work converged on similar approaches for each, so we decided to unify them in one simple transformer-based architecture.


Putting the "Mobile" in Mobile Vision

  • [PDF] [arXiv] SANPO: A Scene Understanding, Accessibility, Navigation, Pathfinding, Obstacle Avoidance Dataset
    Sagar M. Waghmare; Kimberly Wilber; Dave Hawkey; Xuan Yang; Matthew Wilson; Stephanie Debats; Cattalyya Nuengsigkapian; Astuti Sharma; Lars Pandikow; Huisheng Wang; Hartwig Adam; Mikhail Sirotenko. ArXiv
We created a dataset that stretches the limits of state-of-the-art obstacle detection and scene understanding systems. Our dataset, SANPO, includes 112,000 frames of egocentric video from a runner's perspective, annotated with panoptic segmentations and depth maps. We also include 113,000 synthetic frames from virtual environments. This was a collaboration between Google, Project Guideline, and Parallel Domain.


Online Trust

  • [PDF] [arXiv] Understanding Image Quality and Trust in Peer-to-Peer Marketplaces
    Xiao Ma; Lina Mezghani; Kimberly Wilber; Hui Hong; Robinson Piramuthu; Mor Naaman; Serge Belongie. Winter Conference on Applications of Computer Vision (WACV 2019)
We created a system that can understand the aesthetic quality of user-created photos on online marketplaces like Letgo or eBay. This was a collaboration between Cornell Tech, eBay, and the Oath connected experiences lab.


Artistic Aesthetics

  • [PDF] [arXiv] BAM! The Behance Artistic Media Dataset for Recognition Beyond Photography
    M. Wilber; Chen Fang; Hailin Jin; Aaron Hertzmann; John Collomosse; Serge Belongie. International Conference on Computer Vision (ICCV 2017)
We taught a computer about artwork! Our efforts led to the creation of "BAM," currently the largest semisupervised dataset of digital artwork on the Internet freely available for researchers.


Deep learning theory

  • [PDF] [arXiv] Residual Networks Behave Like Ensembles of Relatively Shallow Networks
    Andreas Veit; M. Wilber; Serge Belongie. Neural Information Processing Systems (NIPS 2016)
What happens when you perform brain surgery on a ResNet? Surprisingly, performance still stays the same when deleting several layers, even without fine-tuning. We investigate why in this paper.

Biometric Privacy

  • [PDF] [arXiv] Can we still avoid automatic face detection?
    M. Wilber; Vitaly Shmatikov; Serge Belongie. Winter Conference on Applications of Computer Vision (WACV 2016)
Can you see the face in each of these images? Facebook's automatic face detector can detect and localize all six faces shown above, even when the uploader takes steps to hide them.


Humans and Machines

  • [PDF] [arXiv] Learning Concept Embeddings with Combined Human-Machine Expertise
    M. Wilber; Iljung Sam Kwak; Serge Belongie. International Conference on Computer Vision (ICCV 2015)
We built "SNaCK", a system that combines expert constraints with deep-learning similarity kernels to understand intuitive ideas like the taste of food or the visual similarity of different classes of animals.


Human In-the-Loop with efficient crowdsourcing

  • [PDF] [arXiv] Cost-Effective HITs for Relative Similarity Comparisons
    M. Wilber; Iljung Sam Kwak; Serge Belongie. AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2014)
This project studied how best to ask human experts about food taste similarity. We showed that collecting input through more efficient UIs yields higher-quality results and lets us pay our Amazon Mechanical Turk workers more.

Metric Learning

  • [PDF] [arXiv] Good Recognition is Non-Metric
    Walter J. Scheirer; M. Wilber; Michael Eckmann; Terry E. Boult. Pattern Recognition 47 (8), 2014
On flagship datasets like LFW and Caltech-256, the top-performing algorithms at the time of writing do not satisfy the triangle inequality, symmetry, or even identity. Why are top-performing algorithms non-metric?


Wildlife conservation

  • [PDF] Animal Recognition in the Mojave Desert: Vision Tools for Field Biologists
    M. Wilber; Walter J. Scheirer; Phil Leitner; et al. Workshop on Applications of Computer Vision (WACV 2013)
Our team built a system to help scientists track the population of endangered ground squirrels and desert tortoises living near Edwards Air Force Base in the Mojave Desert. This is a challenging detection and recognition task: the animals of interest are only a few pixels tall and are easily confused with non-endangered species.


Biometrics without privacy compromise

  • [PDF] PRIVV: Private Remote Iris Authentication with Vaulted Verification
    M. Wilber; Walter J. Scheirer; Terry Boult. Conference on Computer Vision and Pattern Recognition Biometrics Workshop (CVPR 2012)
Most biometric systems compromise users' privacy by storing their data in large biometric databases that can be searched by law enforcement and rogue actors. Our Vaulted Verification work lets users authenticate their accounts but requires their cooperation to verify their identity, so it cannot be used in large-scale search databases.



I am grateful to have been supported throughout my academic career by Dr. Serge Belongie, the NSF Graduate Research Fellowship, Oath, Google, Adobe, the faculty and staff at Cornell Tech, and my friends, family, partners, and community. I am an Oath PhD fellow. I also participated in the NSF REU program at UCCS in Summer 2011.


You can find my contact details on my CV or my LinkedIn page.