World space is too early - consider that the camera transformation can (and likely will) rotate the geometry so that faces which were "backfacing" (if that even makes sense without a camera) are no longer oriented in their original directions. Projection can also affect the orientation of the faces with regard to the screen. Backface culling should be done, at the earliest, after projection.
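To make this concrete: after projection, a back-facing test reduces to checking the winding order of the screen-space triangle. A minimal Python sketch, assuming counter-clockwise front faces and a y-up screen coordinate system (both conventions, and the function name, are my own choices):

```python
def is_back_facing(p0, p1, p2):
    # Signed area of the projected (screen-space) triangle via the 2D cross
    # product of its two edge vectors. With counter-clockwise front faces in
    # a y-up coordinate system, a negative area means the face points away
    # from the viewer and can be culled.
    ax, ay = p1[0] - p0[0], p1[1] - p0[1]
    bx, by = p2[0] - p0[0], p2[1] - p0[1]
    return ax * by - ay * bx < 0
```

If your engine uses clockwise winding or a y-down screen space, flip the comparison accordingly.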
You could verify the function by reading the documentation of the bitwise operators used in it, and test it by feeding it various inputs and comparing the returned values against known, expected ones. It would also be a good opportunity to learn unit testing.
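As a tiny example of what such a test looks like, suppose the copy/pasted function unpacks a 32-bit ARGB pixel with shifts and masks (`extract_argb` here is a hypothetical stand-in, not the actual function in question); a unittest sketch in Python:

```python
import unittest

def extract_argb(pixel):
    # Hypothetical bitwise function: split a packed 32-bit ARGB integer
    # into its four 8-bit channels using shifts and masks.
    a = (pixel >> 24) & 0xFF
    r = (pixel >> 16) & 0xFF
    g = (pixel >> 8) & 0xFF
    b = pixel & 0xFF
    return a, r, g, b

class TestExtractArgb(unittest.TestCase):
    def test_known_values(self):
        # Compare returned values against known, expected ones.
        self.assertEqual(extract_argb(0xFF000000), (0xFF, 0, 0, 0))
        self.assertEqual(extract_argb(0x00FF8040), (0, 0xFF, 0x80, 0x40))
```

Running the class with a unittest runner exercises the function against hand-computed values; each surprising input you encounter later can be added as another test case.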
It is dangerous to use copy/pasted code in production projects, if you don't understand what it does. When stuff breaks, it is your fault, whether or not you understand why.
256x256 does fit within your hardware's capabilities. Still, 65536 samples per pixel is a very large number.
Depending on what you're trying to do, you could use mip-mapping to downsample your source texture so that you'd drastically reduce the amount of samples needed. Of course, this sacrifices a little bit of precision, but you would gain a lot of performance in return.
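For illustration, a single mip level can be produced with a simple 2x2 box filter. A Python sketch (the function name and the grayscale 2D-list representation are assumptions for brevity):

```python
def downsample_2x(pixels):
    # One mip level: average each 2x2 block (a simple box filter).
    # `pixels` is a square 2D list of grayscale values; dimensions
    # are assumed to be even.
    n = len(pixels)
    return [[(pixels[y][x] + pixels[y][x + 1] +
              pixels[y + 1][x] + pixels[y + 1][x + 1]) / 4.0
             for x in range(0, n, 2)]
            for y in range(0, n, 2)]
```

Applying this four times to a 256x256 source yields a 16x16 level, cutting the per-pixel sample count from 65536 to 256.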
Consider that n1 is the normalized vector from the center to the arc start point p1. Its tangent t1 with respect to the circle is simply n1 rotated 90 degrees. If you need to "place" the tangent vector, translate it so that it starts from the arc point; for measuring its screen-space angle, however, no offset is needed.
angleOfTangent1 = atan2(t1.y, t1.x)

Since you calculate n1 already, you can skip constructing t1 explicitly: the 90-degree rotation maps (x, y) to (-y, x), so swapping the components of n1 and negating one gives angleOfTangent1 = atan2(n1.x, -n1.y) directly.
You can find t2 and its angle in exactly the same way.
Note that common implementations of atan2 return the angle in radians; 90 degrees is pi/2 radians. Also, atan2 is usually a helper layer on top of atan that checks whether either or both components are zero or negative, and adjusts the return value accordingly to arrive at the correct angle across the full circle. Atan by itself only covers the half-plane where the x component is positive (angles between -90 and 90 degrees).
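Putting the above together, a Python sketch (the function name is my own; note that math.atan2 takes y first, then x):

```python
import math

def tangent_angle(cx, cy, px, py):
    # Normal: normalized vector from the circle center to the arc point.
    nx, ny = px - cx, py - cy
    length = math.hypot(nx, ny)
    nx, ny = nx / length, ny / length
    # Rotate 90 degrees counter-clockwise: (x, y) -> (-y, x).
    tx, ty = -ny, nx
    # atan2 takes (y, x) and returns radians in (-pi, pi].
    return math.atan2(ty, tx)   # equivalently: math.atan2(nx, -ny)
```

For a point directly to the right of the center, the normal is (1, 0), the tangent is (0, 1), and the returned angle is pi/2.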
If you don't have the skills to develop the game prototype yourself, you'll have to pay for other people to develop it, or have such an impressive portfolio that you attract people to work for free initially, with the promise of future profits.
Professionals usually get the job done, but you generally need to pay at least something up front, because their livelihood depends on the work. Some developers may work for free initially, but if you don't promise anything concrete, you run the risk of losing them without notice; they are not obligated to work for you, because you are not obligated to pay them.
Depending on which of these is your approach, you could post in "help wanted" (if seeking free help) or "classifieds" (if seeking professional help for a fee).
Finally, everyone has ideas. Even if you have the greatest game idea ever, it is the execution of the idea that matters.
It is entirely possible to load the compressed files into memory and unpack them only when you need to display them. Simply load them into byte arrays, and when you need the Bitmap object from the data, initialize it from a memory stream instead of a file.
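A Python sketch of the idea using the stdlib zlib module and an in-memory stream (the .NET equivalent would be a byte array plus a MemoryStream fed to the stream-taking Bitmap constructor; the names and placeholder payload here are mine):

```python
import io
import zlib

# Keep only the compressed bytes resident; unpack on demand. The resulting
# in-memory stream can then feed a stream-based image constructor instead
# of a file path.
compressed_images = {}

def store(name, raw_bytes):
    compressed_images[name] = zlib.compress(raw_bytes)

def open_stream(name):
    # Decompress only when the image is actually needed for display.
    return io.BytesIO(zlib.decompress(compressed_images[name]))

store("tile", b"raw pixel data" * 100)   # placeholder payload
```

The memory cost per image is then roughly its compressed size, plus the decompressed size only while it is being displayed.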
Note that scroll viewer virtualization (as in not keeping all items in memory) is a very common performance optimization technique in data-heavy user interfaces.
In combination with the above, you could use an alternative PNG library that lets you specify a rectangle to load from the image without consuming memory for the rest of it. The format itself makes this possible, but I don't know specific libraries off the top of my head that can achieve this.
If Ace12 (or other alternative libraries) is not a viable path, you could separate the data access layer to its own process, and use some kind of inter-process communication mechanism to call it.
Windows offers several options for this; RPC, TCP or UDP (sockets), named pipes and shared memory files come to mind immediately. All of these allow 32- and 64-bit processes to communicate with each other.
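As a minimal illustration of one of these options, here is a localhost TCP round-trip in Python, with two threads standing in for the two processes (the names and fake payload are placeholders; a real split would run the data-access side as its own executable):

```python
import socket
import threading

def serve_once(server_sock):
    # Stand-in for the separate data-access process: receive a request
    # and send back a response.
    conn, _ = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"loaded:" + data)   # pretend to return image data

server = socket.socket()
server.bind(("127.0.0.1", 0))             # bind to any free local port
server.listen(1)
port = server.getsockname()[1]
threading.Thread(target=serve_once, args=(server,), daemon=True).start()

# Stand-in for the UI process: connect, request, read the reply.
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"image42.png")
reply = client.recv(1024)
client.close()
```

Because the two endpoints only share a socket, nothing stops one side from being a 32-bit process and the other 64-bit.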
A good starting point would be to just draw a color wheel to a window.
If you're using Windows Forms, you can use a Bitmap object to draw the graphic. You can get access to the image pixels and fill them yourself.
Since a Bitmap stores RGB(A) values, you would probably find RGB<->HSL conversion functions useful. A simple HSL-to-RGB function divides the hue circle into six sectors and linearly interpolates within the sector that contains the hue angle. Saturation is then a linear interpolation between gray and the computed hue, and lightness an interpolation between black, the color, and white.
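A Python sketch of that six-sector approach (the question's context is Windows Forms/C#, but the math is identical; the function name and parameter ranges are my choices):

```python
def hsl_to_rgb(h, s, l):
    # h in degrees [0, 360), s and l in [0, 1]; returns r, g, b in [0, 1].
    c = (1 - abs(2 * l - 1)) * s          # chroma: strongest at l = 0.5
    hp = (h % 360) / 60.0                 # sector position in [0, 6)
    x = c * (1 - abs(hp % 2 - 1))         # linear ramp within the sector
    # Pick the sector's (r, g, b) ordering of chroma and ramp.
    r, g, b = [(c, x, 0), (x, c, 0), (0, c, x),
               (0, x, c), (x, 0, c), (c, 0, x)][int(hp)]
    m = l - c / 2                         # shift toward black/white
    return (r + m, g + m, b + m)
```

For example, hue 0 at full saturation and half lightness yields pure red, and dropping saturation to zero yields the matching gray regardless of hue.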
The "wheel" is a function of the angle and the distance from the center point. You can use atan (or atan2) to find the angle of a pixel relative to the center of the bitmap; this angle becomes the hue parameter described in the previous paragraph. Lightness or saturation could be the Euclidean distance of the pixel from the center. Either one could also be exposed as a separate slider, because you cannot represent all three dimensions (h, s and l) in the same 2D area.
When you manage to draw the wheel, getting the RGB value given a clicked HSL point becomes simply evaluating the HSL to RGB conversion function at the clicked point. The restrictions that you mention can be implemented by simply restricting the selected parameters to given ranges.
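To make the clicked-point evaluation concrete, a Python sketch using the stdlib colorsys module (the function name and the choice of a fixed lightness with saturation on the wheel are mine; hue comes from atan2 and saturation from the distance, as described above):

```python
import colorsys
import math

def wheel_color(px, py, cx, cy, radius, lightness=0.5):
    # Map a clicked pixel to an HSL point: hue from the angle around the
    # center, saturation from the distance to the center.
    dx, dy = px - cx, py - cy
    dist = math.hypot(dx, dy)
    if dist > radius:
        return None                       # outside the wheel
    hue = (math.atan2(dy, dx) / (2 * math.pi)) % 1.0   # normalized [0, 1)
    sat = dist / radius
    # colorsys is Python's standard color-space module; note its h, l, s
    # argument order.
    return colorsys.hls_to_rgb(hue, lightness, sat)
```

The same function both fills the wheel (call it for every pixel) and resolves a click; range restrictions amount to clamping hue, sat, or lightness before the conversion.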
The actual coding takes about half a day to a day, if you understand the concepts. Googling will readily find RGB-HSL conversion functions as well as the atan and distance functions, if they are not familiar. The .NET reference documentation includes the Bitmap class, which explains how to fill a bitmap manually in memory (as opposed to loading it from a file).