You can calculate the distance of the object "by the sound it produces" with one microphone, if you know the time when the sound was produced and can compare it to the time when the sound was received, correcting for all other latencies in your system.
If you don't have this information, you need at least two microphones, and then you can compute the distance of the object by using the different times the signal arrived at the microphones to triangulate its position.
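The one-microphone time-of-flight idea from the first paragraph can be sketched in a few lines. This is only an illustration, assuming a known emission timestamp, a fixed system latency, and a speed of sound of 343 m/s (dry air at roughly 20 °C); all names here are made up for the example.

```python
# Sketch of the one-microphone time-of-flight idea: distance follows
# from the delay between emitting and receiving the sound, once all
# other latencies in the system are subtracted.

SPEED_OF_SOUND = 343.0  # m/s, dry air at ~20 °C (assumption)

def distance_from_time_of_flight(t_emitted, t_received, system_latency=0.0):
    """Distance in metres, given timestamps in seconds."""
    flight_time = (t_received - t_emitted) - system_latency
    return SPEED_OF_SOUND * flight_time

# Sound received 10 ms after emission, with 1 ms of system latency:
print(distance_from_time_of_flight(0.0, 0.010, system_latency=0.001))
# -> about 3.09 m
```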
Couldn't you give it a try by using JuceDemo's latency demo?
I'm thinking: attach a speaker on the other side of a room, run the demo, move it closer, and see what the difference in samples is, with respect to the speed of sound / air pressure / temperature / altitude, etc.?
The latency demo only works because it records the system time for emitting the sound, and then the system time for receiving it.
If you know exactly what sound is going to be played, you could examine the relative attenuation at different frequencies (high frequencies attenuate faster).
Otherwise there is no way to solve it. You need to gather more information somehow.
Yes, I have tried but it's impossible with only one microphone.
I was going to use the mic of an android device to detect the sound of an object and calculate its distance from the device.
The problem is that the device has only one microphone, and this makes it impossible. Maybe there is another way to measure it, but so far I haven't found one.
That's why we have two ears to survive evolution... ;-)
But if you create the impulse yourself and measure the echo, it might work, because the impulse starts where your device is and comes back after travelling twice the distance.
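The echo idea above boils down to halving the round-trip time. A minimal sketch, again assuming 343 m/s for the speed of sound (the function name is invented for the example):

```python
# Sketch of the self-generated impulse / echo idea: the impulse travels
# to the object and back, so the echo delay covers twice the distance.

SPEED_OF_SOUND = 343.0  # m/s (assumption)

def distance_from_echo(echo_delay_s):
    """Distance in metres from the round-trip echo delay in seconds."""
    return SPEED_OF_SOUND * echo_delay_s / 2.0

print(distance_from_echo(0.02))  # 20 ms echo -> about 3.43 m
```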
All other approaches only work with a lot of assumptions (like frequency attenuation, as pi said), or if you know the "wetness" of the room; the ratio of dry signal to environmental reverb can also give hints. The timing of early reflections also helps to get an idea of positions in space and of the environment.
The possibilities increase drastically if you have a second microphone. And last but not least, if you have an artificial head microphone, then you can also analyze "head-related transfer functions", which use knowledge of the shape of your ears, i.e. the pinnae.
All these cues combined give an idea of the surroundings, a thing your brain and ears are pretty good at. You realize how amazing that is when you try to mimic it with a computer.
Reading again, I got another idea: if you can emit an impulse with the IR LED of your Android device, then you can use the difference between the speed of sound and the speed of light, like you would in a thunderstorm: the delay between light and sound in seconds, divided by three, is roughly the distance in kilometres.
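The thunderstorm rule of thumb above works because light arrives essentially instantly, while sound covers roughly 1/3 km per second. A tiny sketch of that arithmetic (function name invented for the example):

```python
# Sketch of the light-vs-sound timing idea: the flash (IR) arrives
# effectively instantly, so the delay until the sound arrives is the
# acoustic travel time. Sound covers roughly 1/3 km per second, hence
# the "divide by three" rule of thumb.

def distance_km_from_flash_to_sound(delay_s):
    """Approximate distance in km from the flash-to-sound delay in seconds."""
    return delay_s / 3.0

print(distance_km_from_flash_to_sound(6.0))  # 6 s delay -> about 2 km
```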
They should add more microphones to the devices (at least 2). I had a beautiful idea, and as I understand now, it's impossible to achieve.
"if you can add an impulse with your IR" - maybe I didn't understand you well, but even if this is possible, it will be very hard to get the correct distance.