You really have to have been exposed to the derivation of the selection rules to understand this well, which means knowing how to formulate and evaluate the transition moment integral. The difference in selection rules between the pure vibrational case and transitions between vibrational states of different electronic surfaces comes down to this: in the former, both vibrational wavefunctions belong to a single orthonormal set (they are eigenfunctions of the same potential), so the transition moment integral M vanishes (or nearly does, in a more anharmonic picture) except under very specific circumstances. As you've noted, M is nonzero only when Δv = ±1, at least under the harmonic approximation. (Bear in mind, though, that all real surfaces have some degree of anharmonicity, which relaxes the selection rule to varying degrees.) In contrast, the vibrational wavefunctions of two different electronic surfaces are not mutually orthogonal, and therefore do not give zero transition probability under the same circumstances.

That said, vibrational wavefunctions do still contribute to the allowedness of electronic transitions. The vibrational component enters as a spatial overlap integral between the initial and final vibrational wavefunctions, and the square of that overlap is commonly called the Franck-Condon factor. It is difficult to predict this factor ahead of time, as it depends on a number of molecular features that shape the two electronic surfaces, including the degree of molecular rigidity, the electronic transition energy, and so forth. Note that electronic transition probabilities are also affected by other vibration-related factors, including symmetry considerations and vibronic coupling, which can greatly complicate the interpretation of electronic spectra.
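
The contrast between the two cases can be checked numerically. The following is only a minimal sketch under stated assumptions, not a model of any real molecule: both surfaces are taken to be harmonic with the same frequency, coordinates are dimensionless, the dipole moment is assumed to be linear in the displacement x, and the upper surface is shifted by a hypothetical amount `d = 1.5`. The names `ho_wavefunction`, `integrate`, `dipole`, and `fc` are illustrative.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermval  # physicists' Hermite polynomials

def ho_wavefunction(v, x, shift=0.0):
    """Harmonic-oscillator eigenfunction psi_v (dimensionless units), centred at `shift`."""
    q = x - shift
    coeffs = np.zeros(v + 1)
    coeffs[v] = 1.0  # select H_v
    norm = 1.0 / math.sqrt(2.0**v * math.factorial(v) * math.sqrt(math.pi))
    return norm * hermval(q, coeffs) * np.exp(-q**2 / 2.0)

x = np.linspace(-15.0, 15.0, 6001)
dx = x[1] - x[0]

def integrate(f):
    """Simple Riemann sum; adequate because the integrands decay to zero at the grid edges."""
    return float(np.sum(f) * dx)

ground = ho_wavefunction(0, x)

# Case 1: same electronic surface. Transition moment <v'|x|0>, assuming a dipole linear in x.
dipole = [integrate(ho_wavefunction(v, x) * x * ground) for v in range(5)]

# Case 2: different electronic surfaces. The upper well is displaced by d (an assumed value),
# so its vibrational functions are no longer orthogonal to those of the lower well.
d = 1.5
fc = [integrate(ho_wavefunction(v, x, shift=d) * ground) ** 2 for v in range(5)]

for v in range(5):
    print(f"v'={v}:  <v'|x|0> = {dipole[v]: .6f}   Franck-Condon factor = {fc[v]:.6f}")
```

In the first column only the v' = 1 entry survives, which is the Δv = ±1 rule for a harmonic well. In the second column several v' values carry intensity at once, and the distribution changes with the assumed displacement `d`, which is the sense in which the Franck-Condon factors depend on how the two surfaces are arranged relative to each other.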