i'm also curious if anyone has tips for selecting the cap values to get ~20ms smoothing (assuming i even put the caps in the right spot).
At Q3 and Q4 there's nothing to really *fully* discharge the caps. You could add a pull-down resistor, but that might give rise to distortion. Perhaps a better place for the cap is across the base-emitter (BE) of Q2, since R5 is there to discharge it. It also means you only need one cap.
The load on Q2 is only the two 4.7k's, so you could increase the values of R3, R4, R5, R6 to, say, 47k. The higher resistance will let you use a smaller cap. Very roughly you want 20ms = (47k/2) * C, which gives C of about 850nF, still not small.
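To play with the numbers, here's a quick sketch of that same rough estimate. It assumes the cap sees an effective resistance of 47k/2 (two 47k resistors in parallel, as in the rough formula above) and treats the smoothing time as a single RC time constant. The function names are just made up for illustration, and the real circuit will differ a bit.

```python
def cap_for_time_constant(tau_s, r_ohms):
    """Capacitance in farads for a simple RC time constant tau = R * C."""
    return tau_s / r_ohms


def time_constant(r_ohms, c_farads):
    """RC time constant in seconds."""
    return r_ohms * c_farads


TAU = 20e-3        # target smoothing time, 20 ms
R_EFF = 47e3 / 2   # assumption: two 47k resistors in parallel, 23.5k

c = cap_for_time_constant(TAU, R_EFF)
print(f"C = {c * 1e9:.0f} nF")  # prints "C = 851 nF"

# Check what a couple of nearby standard values would give instead.
for c_std in (0.82e-6, 1.0e-6):
    tau_ms = time_constant(R_EFF, c_std) * 1e3
    print(f"{c_std * 1e6:.2f} uF -> {tau_ms:.1f} ms")
```

So a 1uF cap lands at about 23.5ms and 0.82uF at about 19.3ms, either of which is probably close enough for a rough target like this.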
If you want to speed up the mute-on time, you would need to use a series diode + resistor across R6.
A SPICE sim would confirm all is good and the time constants are OK.