That's why we have two ears to survive evolution... ;-)
But if you create the impulse yourself and measure the echo, it might work: the impulse starts where your device stands and comes back after travelling to the reflecting surface and back, i.e. twice the distance.
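As a rough sketch of the echo idea: the distance to the reflecting surface is the speed of sound times the round-trip delay, divided by two. The function name and the 343 m/s default (dry air at about 20 °C) are my own assumptions, and detecting the echo in the recording is left out entirely.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 °C


def echo_distance_m(round_trip_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance to a reflector, given the delay between impulse and echo."""
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The sound travels to the reflector and back, so halve the path length.
    return speed_m_s * round_trip_s / 2.0


if __name__ == "__main__":
    # Hypothetical measurement: the echo arrives 10 ms after the impulse.
    print(echo_distance_m(0.010))
```

With a 10 ms delay this gives roughly 1.7 m. Note that it tells you how far away the reflecting surface is, not how far away some other sound source is.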
All other approaches only work with a lot of assumptions (like frequency attenuation, as pi said). If you know the "wetness" of the room, the ratio of dry signal to environmental reverb can also give hints. The timing of early reflections also helps to get an idea of positions in space and of the environment.
The possibilities increase drastically if you have a second microphone. And last but not least, if you have a dummy head (artificial head) microphone, you can also analyze "head-related transfer functions", which make use of the shape of the outer ear, i.e. the pinnae.
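To illustrate one thing a second microphone adds: the difference in arrival time between the two microphones gives a direction. This is only a sketch under a far-field assumption (the source is much farther away than the microphone spacing); the function name, the spacing and the 343 m/s default are made up for the example, and measuring the delay itself (e.g. by cross-correlating the two recordings) is not shown.

```python
import math


def arrival_angle_deg(delay_s, mic_spacing_m, speed_m_s=343.0):
    """Angle of the source relative to broadside of a two-microphone pair.

    delay_s is the arrival-time difference between the two microphones.
    0 degrees means the source is straight ahead (equal distance to both).
    """
    if mic_spacing_m <= 0:
        raise ValueError("microphone spacing must be positive")
    # Far-field geometry: path difference = spacing * sin(angle).
    ratio = speed_m_s * delay_s / mic_spacing_m
    # Clamp to [-1, 1] so measurement noise cannot break asin().
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Hypothetical setup: microphones 20 cm apart, 0.2 ms delay.
    print(arrival_angle_deg(0.0002, 0.20))
```

On its own this gives a direction, not a distance; you would have to combine it with one of the other cues to get a position.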
All these cues combined give an idea of the surroundings, something your brain and ears are pretty good at. You realize how amazing that is when you try to mimic it with a computer.
Reading this again, I got another idea: if you can add a light impulse with the IR LED of your Android device, you can use the difference between the speed of sound and the speed of light, like you would in a thunderstorm: the delay between light and sound in seconds, divided by three, is about the distance in km.
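The thunderstorm rule can be sketched like this. Light arrives practically instantly at these distances, so the delay is essentially the travel time of the sound. The function names and the 343 m/s default are assumptions for the example, not part of any Android API.

```python
def flash_to_sound_distance_m(delay_s, speed_m_s=343.0):
    """Distance to the source from the delay between seeing and hearing it."""
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    # The travel time of the light is neglected; it is far below the
    # timing resolution that matters here.
    return speed_m_s * delay_s


def rule_of_thumb_km(delay_s):
    """Thunderstorm rule: seconds between flash and bang, divided by three."""
    return delay_s / 3.0


if __name__ == "__main__":
    delay = 3.0  # hypothetical: sound arrives 3 s after the light
    print(flash_to_sound_distance_m(delay) / 1000.0, "km (343 m/s)")
    print(rule_of_thumb_km(delay), "km (rule of thumb)")
```

For a 3 s delay both land at about 1 km, which is why the rule of thumb works. At room scale the delays are only a few milliseconds, so the timing of both the light and the sound detection has to be quite precise.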