First of all, 144 texture lookups per pixel is a lot. The simplest solution would be to use fewer (find a balance between performance and visuals).
Try to make your textures smaller and/or slimmer.
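Fat formats (like your R16G16B16A16, which is 8 bytes per texel) are slower to sample because every lookup costs more bandwidth and cache space, so try to pack your data.
Normals, for example, could be stored as X8Y8, with Z reconstructed inside the shader.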
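To make the X8Y8 idea concrete, here is a minimal sketch of the packing math, written in Python just so it can be run outside a shader. The function names are made up for illustration, and it assumes the normals are unit length with Z >= 0 (e.g. tangent-space normals, or view-space normals facing the camera). If Z can be negative you would need to store its sign somewhere or use a different encoding.

```python
import math

def encode_normal_xy8(n):
    """Pack x and y of a unit normal into two unsigned bytes; z is dropped."""
    x, y, _ = n
    # Map [-1, 1] to [0, 255] and round to the nearest integer.
    return (round((x * 0.5 + 0.5) * 255), round((y * 0.5 + 0.5) * 255))

def decode_normal_xy8(packed):
    """Rebuild the normal, assuming unit length and z >= 0 (assumption)."""
    bx, by = packed
    x = bx / 255 * 2 - 1
    y = by / 255 * 2 - 1
    # The clamp guards against quantization pushing x*x + y*y slightly above 1.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)
```

In the shader the decode side is the same three lines: expand the two channels to [-1, 1], then z = sqrt(saturate(1 - x*x - y*y)). Keep in mind that 8 bits per component is fairly coarse and the error in Z grows as Z approaches 0, so check whether the quality is still acceptable for your use.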
Also look at the access patterns and filtering.
Reduce the filtering quality, or try to use the Gather instruction if possible (it returns one channel of the four texels in a 2x2 bilinear footprint with a single fetch).
You could even refactor your renderer to make use of a compute shader, where each thread group loads the texels it needs into group-shared memory once and all threads in the group then read their samples from there.
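A rough back-of-the-envelope for why group-shared memory can help, as a small Python sketch. The numbers are assumptions, not something from your post: I am guessing the 144 lookups are a 12x12 block of neighbouring texels from one texture, and I picked a 16x16 thread group. The helper name is made up too.

```python
def fetch_counts(group_size, kernel_size):
    """Texture fetches for one square thread group (hypothetical helper).

    naive:  every pixel fetches its whole kernel_size x kernel_size footprint.
    shared: the group loads its tile plus the border once into shared memory.
    """
    naive = group_size ** 2 * kernel_size ** 2
    tile_side = group_size + kernel_size - 1  # group footprint plus apron
    shared = tile_side ** 2
    return naive, shared

naive, shared = fetch_counts(16, 12)
print(naive, shared)  # 36864 729
```

So under these assumptions the group does 729 texture loads instead of 36864, and the rest become shared-memory reads. This only pays off when the footprints of neighbouring pixels overlap; if your 144 lookups hit scattered locations or many different textures, the saving will be much smaller.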