For nearest-neighbor search, a k-d tree or BSP tree is unlikely to work well in 256 dimensions. The reason is that there is a lower bound of Omega(n^((d-1)/d)) on the query time of any range-searching data structure with linear storage (i.e., all BSP-type data structures). As d grows, this bound quickly approaches linear cost, which means you would effectively be back to ordinary brute force in terms of performance. For small to intermediate dimensions, an r-tree is a better option: it requires more space but guarantees polylogarithmic range queries. However, its storage grows exponentially in d, so at d = 256 it would again be a terrible idea.
As a result, I would be inclined to recommend that you forgo a spatial indexing tree and instead default to a locality-sensitive hash. The behavior of locality-sensitive hashing under Hamming distance is now quite well understood and should give you pretty good results. It is also much easier to implement and tends to have good cache behavior. Here is a link from Wikipedia (scroll to the bottom to see how to use it to compute nearest neighbors):
http://en.wikipedia....amming_Distance
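To make this concrete, here is a minimal sketch of bit-sampling LSH for Hamming distance, the classic LSH family for 0/1 vectors. The function names and the parameter choices (16 sampled bits per hash, 8 tables) are my own illustrative picks, not from any particular library; you would tune them for your data.

```python
import random

def make_hash(dim, k, seed):
    """Bit-sampling LSH: a hash is the projection onto k random bit positions.

    Two vectors at Hamming distance d collide with probability
    (1 - d/dim)^k, so near vectors collide far more often than far ones.
    """
    rng = random.Random(seed)
    positions = rng.sample(range(dim), k)
    return lambda v: tuple(v[i] for i in positions)

def build_index(vectors, dim=256, k=16, n_tables=8):
    """Build n_tables hash tables, each keyed by one bit-sampling hash."""
    tables = []
    for t in range(n_tables):
        h = make_hash(dim, k, seed=t)
        buckets = {}
        for idx, v in enumerate(vectors):
            buckets.setdefault(h(v), []).append(idx)
        tables.append((h, buckets))
    return tables

def query(tables, vectors, q):
    """Return (index, distance) of the closest stored vector found
    among the bucket collisions of q across all tables."""
    candidates = set()
    for h, buckets in tables:
        candidates.update(buckets.get(h(q), []))
    # Brute force only over the (hopefully small) candidate set.
    best, best_d = None, float("inf")
    for idx in candidates:
        d = sum(a != b for a, b in zip(vectors[idx], q))
        if d < best_d:
            best, best_d = idx, d
    return best, best_d

# Example: 500 random 256-bit vectors; query with a one-bit perturbation
# of a stored vector and recover it.
rng = random.Random(0)
data = [tuple(rng.randint(0, 1) for _ in range(256)) for _ in range(500)]
index = build_index(data)
q = list(data[42])
q[0] ^= 1  # flip one bit
idx, dist = query(index, data, tuple(q))
```

Note that this gives an approximate answer: a true nearest neighbor can be missed if it collides with the query in none of the tables, and raising the number of tables trades memory for recall.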