the sound going over HDMI is digital, so the sound chip (the part that turns it into an analog signal) would be in the TV/monitor.
Not long ago, you had to connect an extra little cable from your motherboard to the GPU to get HDMI sound.
Nowadays, it's transferred over PCIe, but I'm pretty sure it's still the sound chip that produces the sound.
I would expect the GPU itself to have very little control over this stream; I think it's just routed through PCIe for convenience.
the video card basically presents a sound-device to the OS, so that the OS can send it audio, which is then fed over the HDMI cable along with all the video data.
basically, the HDMI cable is an STP-type cable (shielded twisted pair), where each wire of a pair, and also the ground/shielding for that pair, gets its own pin.
the data is basically sent as a continuous stream of packets, which may contain pixel data, audio data, or other data (apparently including Ethernet traffic...). video data is apparently sent mostly as raw RGB or YUV (at 24 or 48 bits per pixel) and audio uses PCM. from what I can tell, it mostly sends the video data first, and then typically sends the audio and other data in the gaps where no pixels are being sent (the blanking time between lines and between video frames). the packets are apparently fixed-size (apparently 32 bytes), each with a type tag, some ECC data, and some payload.
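
to put rough numbers on why the audio fits so easily next to the video, here is a quick back-of-the-envelope sketch. the figures are my own example assumptions (1080p at 60 Hz with 24-bit RGB, and 48 kHz 16-bit stereo PCM), not something taken from the HDMI spec, and it only counts raw visible pixels and raw samples, ignoring blanking, packet headers, ECC and line coding.

```python
# Rough comparison of raw video vs. raw PCM audio data rates.
# All figures below are illustrative assumptions, not values from the HDMI spec.

WIDTH, HEIGHT, FPS = 1920, 1080, 60    # assumed video mode (1080p60)
BITS_PER_PIXEL = 24                    # raw RGB, 8 bits per channel

SAMPLE_RATE = 48_000                   # assumed PCM sample rate in Hz
CHANNELS = 2                           # stereo
BITS_PER_SAMPLE = 16


def video_bits_per_second(width, height, fps, bits_per_pixel):
    """Raw pixel payload per second, counting visible pixels only."""
    return width * height * fps * bits_per_pixel


def audio_bits_per_second(sample_rate, channels, bits_per_sample):
    """Raw PCM payload per second."""
    return sample_rate * channels * bits_per_sample


video_bps = video_bits_per_second(WIDTH, HEIGHT, FPS, BITS_PER_PIXEL)
audio_bps = audio_bits_per_second(SAMPLE_RATE, CHANNELS, BITS_PER_SAMPLE)

# Audio that has to be delivered for each video frame.
samples_per_frame = SAMPLE_RATE // FPS
audio_bytes_per_frame = samples_per_frame * CHANNELS * BITS_PER_SAMPLE // 8

if __name__ == "__main__":
    print(f"video: {video_bps / 1e6:.1f} Mbit/s")
    print(f"audio: {audio_bps / 1e6:.3f} Mbit/s")
    print(f"video/audio ratio: {video_bps / audio_bps:.0f}x")
    print(f"audio per frame: {samples_per_frame} samples, "
          f"{audio_bytes_per_frame} bytes")
```

with those assumptions the audio is only a few kilobytes per frame, versus several megabytes of pixels, so squeezing it into the gaps between the pixel data is not much of a burden.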