The generation of entanglement between remote matter qubits has developed into a key capability for fundamental investigations as well as for emerging quantum technologies. In the single-photon protocol, entanglement is heralded by the generation of qubit-photon entangled states and the subsequent detection of a single photon behind a beam splitter. In this work we perform a detailed theoretical and experimental investigation of this protocol and its various sources of infidelity. We develop an extensive theoretical model and subsequently tailor it to our experimental setting, based on nitrogen-vacancy centers in diamond. Experimentally, we verify the model by generating remote states for varying phases and amplitudes of the initial qubit superposition states and for varying optical phase differences of the photons arriving at the beam splitter. We show that a static frequency offset between the optical transitions of the qubits leads to an entangled-state phase that depends on the photon detection time. We find that the implementation of a Charge-Resonance check on the nitrogen-vacancy center yields transform-limited linewidths. Moreover, we measure the probability of double optical excitation, a significant source of infidelity, as a function of the power of the excitation pulse. Finally, we find that imperfect optical excitation can lead to a detection-arm-dependent entangled-state fidelity and rate. The conclusions presented here are not specific to the nitrogen-vacancy centers used to carry out the experiments, and are therefore readily applicable to other qubit platforms.