I'm not entirely sure what this question is asking, so I'll answer as many of the things I think it's asking as I can. The first part isn't really answering a question, just adding a note to a consideration. (Probably due to it being 4 in the morning.)
There are still reasons to use a CPU-based particle system over a GPU-based one.
Mostly because you get more control with a CPU-based system, without needing to wait for resources to transfer back and forth between the CPU and GPU. Remember that deferring calls tends to force the system to wait for callbacks. In a threaded system, this means you will have a thread that stalls unpredictably while it waits for the GPU to finish whatever it's doing, compute, and send the result back.
With CPUs you can control your particles via AI, boids, complex per-particle properties that tell each particle to do something very specific, and so on.
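To make that concrete, here is a rough sketch of what "each particle does something very specific" can look like on the CPU. Everything in it (the `Particle` fields, the `fall` and `seek_centroid` behaviors, the `update` function) is made up for illustration and is not taken from any particular engine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Particle:
    x: float
    y: float
    vx: float
    vy: float
    # Hypothetical per-particle behavior: any callable that edits the velocity.
    behavior: Callable[["Particle", List["Particle"], float], None]


def fall(p: Particle, others: List[Particle], dt: float) -> None:
    # Plain gravity, ignores the other particles.
    p.vy -= 9.81 * dt


def seek_centroid(p: Particle, others: List[Particle], dt: float) -> None:
    # Boid-like cohesion: steer toward the average position of the group.
    cx = sum(o.x for o in others) / len(others)
    cy = sum(o.y for o in others) / len(others)
    p.vx += (cx - p.x) * dt
    p.vy += (cy - p.y) * dt


def update(particles: List[Particle], dt: float) -> None:
    # First let every particle run its own logic, then integrate positions.
    for p in particles:
        p.behavior(p, particles, dt)
    for p in particles:
        p.x += p.vx * dt
        p.y += p.vy * dt


particles = [Particle(0.0, 0.0, 0.0, 0.0, fall),
             Particle(2.0, 0.0, 0.0, 0.0, seek_centroid)]
update(particles, 1.0)
```

Each particle can branch, look at its neighbours, or call into game logic, which is the kind of per-particle divergence that is awkward to express in a single GPU kernel.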
GPUs are also significantly slower than CPUs per core, on simple and complex tasks alike. GPUs may have 512-1256 cores, but they run at a significantly lower clock. If you have a program that controls only 400-800 particles and uploads them to the GPU every frame, you're only hurting yourself. It takes time for the data to cross from the CPU, over the bus, to the GPU's card, be handled there, and then for the callback to come back. So while the GPU's many cores could theoretically process this faster, the whole update still takes significantly more time than on the CPU, which needs barely a fraction of a microsecond per particle.
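A back-of-the-envelope cost model shows where the break-even point comes from. The numbers below (a 50 microsecond fixed round-trip cost, 0.01 microseconds per particle on the CPU, 0.0005 on the GPU) are placeholders I picked for illustration, not measurements; plug in your own.

```python
def cpu_time_us(n, per_particle_us=0.01):
    # CPU: no transfer cost, but each particle is processed more or less serially.
    return n * per_particle_us


def gpu_time_us(n, fixed_overhead_us=50.0, per_particle_us=0.0005):
    # GPU: a fixed cost for upload, dispatch and readback, then a much
    # smaller per-particle cost thanks to the many cores.
    return fixed_overhead_us + n * per_particle_us


def break_even_particles(fixed_overhead_us=50.0,
                         cpu_per_particle_us=0.01,
                         gpu_per_particle_us=0.0005):
    # Particle count at which both paths cost the same under this model.
    return fixed_overhead_us / (cpu_per_particle_us - gpu_per_particle_us)


for n in (800, 100_000):
    print(n, cpu_time_us(n), gpu_time_us(n))
print("break-even at about", round(break_even_particles()), "particles")
```

Under these assumed numbers the CPU wins easily at 800 particles and loses badly at 100,000. The exact crossover depends entirely on your hardware and on how much data you move per frame.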
GPUs are better when you need a massive number of particles that justifies the transfer time, and all of their transformations are procedurally mathematical. Even then, there is a cost in draw time and update time.
This is also why games only push aesthetic physics to the GPU rather than moving the entire physics system there. Particles, wood chips, denting, cloth: anything that is not game changing, and only after the CPU-side data systems have determined that the two are "colliding" for larger objects and subsystems. And particles are only approximated by the world's texture space.
Is it good to call one compute shader for each affector?
Typically, probably not. It's expensive for the GPU to constantly swap shaders, even if they run on the same data. I took a look at Unreal's code some time ago to figure out how they do particles.
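The difference between the two approaches can be sketched on the CPU side, with each outer loop standing in for one shader dispatch. This is only an analogy in plain Python; `gravity`, `drag`, and the two `run_*` functions are hypothetical names, and real dispatch costs are not modelled.

```python
def gravity(p, dt):
    p["vy"] -= 9.81 * dt


def drag(p, dt):
    p["vx"] *= 0.98
    p["vy"] *= 0.98


def make_particles(n):
    return [{"vx": float(i), "vy": 0.0} for i in range(n)]


def run_one_pass_per_affector(particles, affectors, dt):
    # One "dispatch" per affector: the whole particle buffer is walked
    # once for every affector, with a shader swap in between.
    passes = 0
    for affector in affectors:
        passes += 1
        for p in particles:
            affector(p, dt)
    return passes


def run_fused(particles, affectors, dt):
    # A single "dispatch": each particle is loaded once and every
    # affector is applied to it before moving on.
    for p in particles:
        for affector in affectors:
            affector(p, dt)
    return 1


a, b = make_particles(4), make_particles(4)
passes_a = run_one_pass_per_affector(a, [gravity, drag], 0.016)
passes_b = run_fused(b, [gravity, drag], 0.016)
```

As long as the affectors only touch the particle they are given, both versions produce the same particles; the fused one just does it with one pass over the data instead of one per affector.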
Interestingly, it's faster for the CPU to have multiple affectors in the particle's properties than it is for the GPU to do the same. There was also some very good reason why a good number of the affectors are available only for GPU-based or only for CPU-based particles, possibly due to how the two systems were designed. GPUs can only do one thing at once, and require all data to be present. CPUs are perfectly capable of doing multiple things at once, and of holding on to something if they don't have what they need.
For CPUs, the data is instantiated when it's needed, and held onto. The kernel always exists. If I remember correctly, GPUs tend to dump kernel code, which then often has to be reuploaded by the CPU.
That being said, I guess this is what led to Unreal's design. Unreal actually treats its GPU particles like it does its materials. It procedurally creates a single shader for that particle system. When it's done, the game has one compute shader that defines the particle system's behavior.
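The general idea of stitching affectors into one generated shader can be sketched like this. To be clear, this is not Unreal's code: the snippet table, the `build_particle_shader` function, and the GLSL-flavoured text it emits are all invented for illustration, and the output is a fragment without the buffer and uniform declarations a real compute shader would need.

```python
# Hypothetical affector library: each entry is a line of GLSL-like code
# that operates on a local particle variable `p` and a timestep `dt`.
AFFECTOR_SNIPPETS = {
    "gravity": "    p.vel.y -= 9.81 * dt;",
    "drag": "    p.vel *= (1.0 - 0.1 * dt);",
}


def build_particle_shader(affector_names):
    # Concatenate the chosen affectors, in order, into one kernel body.
    body = "\n".join(AFFECTOR_SNIPPETS[name] for name in affector_names)
    return (
        "// generated: one kernel for the whole particle system\n"
        "void main() {\n"
        "    Particle p = particles[gl_GlobalInvocationID.x];\n"
        f"{body}\n"
        "    p.pos += p.vel * dt;\n"
        "    particles[gl_GlobalInvocationID.x] = p;\n"
        "}\n"
    )


shader_source = build_particle_shader(["gravity", "drag"])
print(shader_source)
```

The generation happens once, when the particle system is built, so at runtime there is a single shader to bind and a single dispatch per update regardless of how many affectors the system uses.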