Wow, that's really good audio. You can pay like 2k for a wedding video and still not get such good audio, unless you use a wireless lapel mic.
Even with a shotgun mic you will definitely get a fair bit of walking noise from the hundreds of shuffling footsteps behind them (it's an MRT lah...whaddya expect). You see, it's all about "ratios"....say in a wedding the VG would be very close to the subjects, so with the levels adjusted the main subject's level would be quite a bit higher than the background. With this video the "VG" is standing quite far away. In open air sound drops about 6dB for every doubling of distance; in enclosed locations like this the reflections fill some of it back in, so figure more like 3dB per doubling. And with such distance there will be slight echoes, as in you can hear more of the acoustic space. There are many hard walls/glass/floors there....the ceiling does not have a very high coefficient of absorption either.
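To put rough numbers on the "ratios" point, here is a quick sketch of the per-doubling rule. The 1 m and 8 m distances are made up for illustration, and the 6dB figure is the usual open-air inverse-square value, which I'm adding for comparison next to the 3dB figure mentioned above. Treat it as back-of-envelope only, not a measurement of that station.

```python
import math

def level_drop_db(near_m, far_m, db_per_doubling):
    """dB lost when the mic moves from near_m to far_m away from the source."""
    # Each doubling of distance costs db_per_doubling, so count the doublings.
    return db_per_doubling * math.log2(far_m / near_m)

# Hypothetical distances: wedding VG at 1 m, this "VG" at 8 m (three doublings).
print(level_drop_db(1, 8, 3.0))  # using the ~3dB-per-doubling figure
print(level_drop_db(1, 8, 6.0))  # using the open-air 6dB-per-doubling figure
```

Either way the voices lose level with distance while the footsteps around the mic do not, which is why the subject-to-background ratio gets worse the further back you stand.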
No way it's a shotgun, check out 0:40, blocked by people and still no drop in audio level. At the end the small girl in green shouted **** YOU twice, and the guy shouted back....the guy is already so far away and the level did not drop. The whole thing is too "level"..... And the bigger black girl is facing away from the cam, yet you still get so much treble and clarity....
Correction....I think it's better than an 8k videographer.
BTW I pass by there every day for work. Wasted, did not witness the thingy.