Benn Jordan challenges a fundamental assumption of modern engineering: that seeing the invisible requires expensive, proprietary hardware. In a project that blends DIY enthusiasm with rigorous signal processing, Jordan demonstrates that the subtle pixel variations in standard video files contain enough data to visualize sound waves, track heartbeats, and locate acoustic sources for a fraction of the cost of commercial alternatives. This is not just a tech demo; it is a case study in how open-source software and consumer-grade sensors are democratizing high-fidelity data analysis.
The Cost of Seeing Sound
Jordan begins by exposing the absurd economics of the current market. "Acoustic cameras definitely already exist, and they exist in a business-to-business market," he notes, pointing out that the cheapest commercial units start around $5,000 while top-tier models approach $100,000. He argues that this pricing structure is artificial, driven by proprietary hardware rather than the actual complexity of the task. "To me it just doesn't seem like it would need to cost tens of thousands of dollars and require a bunch of proprietary hardware if you have a smartphone that could record 4K video," Jordan says.
The core of Jordan's argument is that the necessary data is already being captured by devices in our pockets. By treating video not as a visual record but as a dense stream of numerical data, one can extract information that the human eye ignores. He builds a functional acoustic imaging system using a 16-channel microphone array, a Raspberry Pi, and a global-shutter camera module for under $400. "This acoustic camera comes in below $400 if you already have a laptop, saving you, I don't know, $29,000 off of buying an existing camera with similar features," he concludes. The framing is effective because it shifts the focus from hardware acquisition to algorithmic ingenuity, suggesting that the barrier to entry for advanced diagnostics is now intellectual, not financial.
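The technique underneath a multi-microphone acoustic camera is classical delay-and-sum beamforming: for each candidate direction, time-shift the channels so a wavefront arriving from that direction lines up, sum them, and measure the resulting energy. The sketch below illustrates the idea under a far-field assumption; the geometry, function names, and scan resolution are illustrative, not Jordan's actual pipeline.

```python
# Minimal delay-and-sum beamformer sketch (illustrative, not Jordan's code).
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in room-temperature air


def delay_and_sum(signals, mic_positions, direction, sample_rate):
    """Steer the array toward `direction` and return the summed signal.

    signals:       (n_mics, n_samples) time-aligned recordings
    mic_positions: (n_mics, 3) microphone coordinates in meters
    direction:     unit vector pointing toward the candidate source
    """
    # Relative arrival-time offsets for a plane wave from `direction`.
    delays_s = mic_positions @ direction / SPEED_OF_SOUND
    delays = np.round(delays_s * sample_rate).astype(int)
    delays -= delays.min()  # make all shifts non-negative

    n_mics, n_samples = signals.shape
    out = np.zeros(n_samples)
    for m in range(n_mics):
        d = delays[m]
        out[d:] += signals[m, : n_samples - d]  # delay channel m into alignment
    return out / n_mics


def acoustic_map(signals, mic_positions, sample_rate, n_az=36, n_el=9):
    """Scan a grid of directions; the loudest steered output marks the source."""
    power = np.zeros((n_el, n_az))
    for i, el in enumerate(np.linspace(-np.pi / 4, np.pi / 4, n_el)):
        for j, az in enumerate(np.linspace(-np.pi, np.pi, n_az)):
            d = np.array([np.cos(el) * np.cos(az),
                          np.cos(el) * np.sin(az),
                          np.sin(el)])
            steered = delay_and_sum(signals, mic_positions, d, sample_rate)
            power[i, j] = np.mean(steered ** 2)
    return power
```

Overlaying the resulting power grid on the camera feed is what turns a microphone array into a "camera": bright cells in the map correspond to loud directions in the frame.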
Critics might note that while the cost savings are real, the reliability of such DIY systems in critical industrial environments remains unproven compared to certified, calibrated commercial units. However, for prototyping and non-critical monitoring, the trade-off is compelling.
"You and I are on this journey together and there are a lot of exploratory steps and we're going to try and accomplish all this as inexpensively as possible."
The Software Bottleneck
The project's greatest hurdle was not the hardware but the software ecosystem. Jordan admits that his lack of discipline as a developer made the process arduous, noting that "anybody with a few years of Python experience will probably get farther in a few hours than I did in the last two weeks." He highlights the fragmentation of open-source tools, specifically the difficulty of managing dependencies for frameworks like Acoular and SpectAcoular. "If you attempt to get this running without running a virtual environment, you will almost certainly run into some headaches," he warns, a sentiment that resonates with anyone who has tried to assemble complex scientific software stacks.
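For readers following along, the isolation he recommends amounts to a few commands; the environment name and package list below are illustrative stand-ins, not a verified requirements file for his project.

```
# Create an isolated environment before installing the imaging stack.
python3 -m venv acoustic-cam
source acoustic-cam/bin/activate
pip install acoular numpy scipy
```

Keeping the dependencies walled off this way means a version conflict breaks one sandbox, not the system Python, which is exactly the class of headache Jordan warns about.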
Despite these friction points, the results were tangible. Jordan successfully used the system to locate the source of unwanted noise in his studio and even identify the specific chickens making a racket. "I never know which one of my chickens is making noise; now I do," he quips. This practical application underscores the utility of the technology: it turns abstract audio data into a visual map, allowing users to pinpoint problems without expensive consultants. The narrative choice to include the failures and the messy debugging process adds credibility, preventing the piece from feeling like a polished, unrealistic sales pitch.
Visualizing the Invisible: Pulse and Heat
The most provocative section of Jordan's work extends the concept of motion amplification beyond sound to biological and thermal phenomena. Drawing inspiration from fellow creator Posy's work on "motion extraction," Jordan demonstrates how blending video frames with slight time differences can exaggerate micro-movements. "The subtle differences that your eyes would never notice, that exist between frames, pixel to pixel, in a video of a stationary scene, hold a ton of information," he explains.
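The trick is simple enough to sketch: blend each frame with an inverted copy of itself from a few frames earlier, so unchanged pixels cancel to flat gray and anything that moved stands out. The OpenCV sketch below is a minimal illustration of that blending idea; the filename and delay value are assumptions, not Jordan's settings.

```python
# Minimal "motion extraction" sketch: current frame + inverted past frame.
import cv2

DELAY_FRAMES = 2  # how far back in time to reach; tune per subject

cap = cv2.VideoCapture("stationary_scene.mp4")  # illustrative filename
buffer = []

while True:
    ok, frame = cap.read()
    if not ok:
        break
    buffer.append(frame)
    if len(buffer) <= DELAY_FRAMES:
        continue
    past = buffer.pop(0)
    # 50% current frame + 50% inverted past frame: identical pixels -> flat gray.
    motion = cv2.addWeighted(frame, 0.5, 255 - past, 0.5, 0)
    cv2.imshow("motion extraction", motion)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```

Shortening the delay highlights fast vibrations; lengthening it reveals slow drifts, which is why the same few lines can expose both a rattling panel and a breathing chest.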
He pushes this further by applying the technique to human physiology. By filtering out vibrations outside the normal pulse range, he claims to have visualized his own heartbeat through a standard webcam. "This means that if we figured out a way to filter out vibrations or oscillations outside of the normal human pulse range, using something like Processing or Python, we could probably find the pulse of anyone sitting still in front of any modern webcam," Jordan says. He even touches on the potential for corporate surveillance, joking that "big companies can use this to gauge how organically excited their employees are about a PowerPoint presentation."
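One plausible form of that filtering step, sketched below, averages a skin region's brightness per frame and band-passes the result to roughly 0.8-3 Hz (about 50-180 bpm). The ROI coordinates, filename, and filter order are assumptions for illustration, not Jordan's code.

```python
# Sketch of pulse isolation from video: band-pass pixel brightness over time.
import cv2
import numpy as np
from scipy.signal import butter, filtfilt

FPS = 30.0
LOW_HZ, HIGH_HZ = 0.8, 3.0  # plausible resting-to-elevated pulse band

cap = cv2.VideoCapture("webcam_recording.mp4")  # illustrative filename
brightness = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[100:200, 200:300, 1]  # green channel of a skin patch (assumed ROI)
    brightness.append(roi.mean())
cap.release()

signal = np.asarray(brightness) - np.mean(brightness)
b, a = butter(3, [LOW_HZ / (FPS / 2), HIGH_HZ / (FPS / 2)], btype="band")
pulse = filtfilt(b, a, signal)  # zero-phase band-pass keeps beats aligned in time

# The dominant frequency left in the pass band is the heart-rate estimate.
spectrum = np.abs(np.fft.rfft(pulse))
freqs = np.fft.rfftfreq(len(pulse), d=1.0 / FPS)
print(f"Estimated pulse: {freqs[np.argmax(spectrum)] * 60:.0f} bpm")
```

The green channel is a common choice in remote-photoplethysmography work because hemoglobin absorbs strongly there, but any channel carries some of the signal.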
This section raises significant privacy concerns that the piece only skims. While Jordan frames this as a triumph of data extraction, the implication that a camera could remotely monitor vital signs without consent is a double-edged sword. A counterargument worth considering is that the same technology used to diagnose machinery faults could be weaponized for non-consensual biometric surveillance, a risk that requires robust policy and technical safeguards.
"The subtle differences that your eyes with would never notice that exist between frames pixel to pixel in a video of a stationary scene holds a ton of information."
The Physics of Rolling Shutter
Jordan concludes by addressing a critical technical limitation: the rolling-shutter effect common in most digital cameras. He explains that because these sensors read out the frame line by line rather than capturing it all at once, fast-moving subjects can be distorted. "If something is moving quickly in the shot that you're getting with the video, it will be picked up inaccurately, because different regions of the frame are being captured at different times," he says. To solve this, he advocates for global-shutter cameras, which capture all pixels simultaneously, ensuring data integrity for scientific analysis.
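A toy timing model makes the distortion concrete: if a 1080-row sensor takes several milliseconds to scan from top to bottom, each row records a different instant, and anything vibrating at acoustic rates will have moved between the first and last rows. The frame rate and readout time below are illustrative numbers, not measurements from Jordan's hardware.

```python
# Toy model of rolling-shutter timing skew (illustrative numbers).
ROWS = 1080
FRAME_PERIOD_S = 1 / 60   # 60 fps capture
READOUT_S = 8e-3          # time to scan from the top row to the bottom row

def row_timestamp(frame_index: int, row: int) -> float:
    """When a given row of a given frame was actually exposed."""
    return frame_index * FRAME_PERIOD_S + (row / ROWS) * READOUT_S

skew = row_timestamp(0, ROWS - 1) - row_timestamp(0, 0)
print(f"Top-to-bottom timing skew: {skew * 1e3:.1f} ms")
# A global shutter drives READOUT_S toward zero: every row shares one timestamp,
# so per-pixel vibration data stays coherent across the whole frame.
```

An 8 ms skew is invisible in a vacation video but fatal when the "signal" is a sub-pixel oscillation whose phase must be consistent across the frame.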
He also revisits his previous work on schlieren imaging, showing how he adapted the technique to visualize sound waves and heat using simple patterns and high frame rates. "As long as you are not observing photons (and bookmark that for a much more complicated and crazy video in the future)," he jokes, before demonstrating how sound pressure waves generate detectable heat. The transition from high-level theory to a literal fireball experiment keeps the tone accessible while maintaining scientific rigor. The piece succeeds in showing that the laws of physics are not the limit; the limit is often just our willingness to look closer at the data we already have.
Bottom Line
Jordan's strongest argument is that the democratization of high-end sensing technology is already underway, driven by software innovation rather than hardware breakthroughs. The piece's biggest vulnerability is the gap between a functional prototype and a reliable, user-friendly product, particularly regarding the complex software dependencies he encountered. Readers should watch for how these open-source methods evolve into standardized tools for industrial maintenance and medical diagnostics, as the potential for non-invasive, low-cost monitoring is now undeniable.