Copy-on-Write - How?? [subject name changed]

Started by
30 comments, last by NotAYakk 17 years, 7 months ago

Quote:Original post by EasilyConfused
In the example Stroustrup uses for a copy-on-write string, obviously operator= and operator+= and so on are quite straightforward, but he does some very clever stuff with operator[] (accessing characters) whereby on the non-const operator[], a class wrapping a char but binded with a reference to the owning string is returned.

If memory serves, this class then overloads all the operators for modification and can then defer any modifications to an individual character back to the string, which can then check to see if it needs its own copy first before modifying.


I want to avoid doing this for every operator I need.. which may only be '()', but I am not certain.
--== discman1028 ==--
Advertisement
Maybe someone else can suggest something, but I can't see a way to force the const functions to be called. In my daft example above, the problem is that if you do:

int z=x.value();

it is not the call itself, but where the result is placed that determines whether you want the const or non-const method called and I don't personally know of a way in C++ to enforce this. I'm fairly sure someone else here will have a bright idea that I'm missing.

Another option would be to not worry about it. In your example of:

Mat a=fft(fft(b));

you would not be calling your () operator. Perhaps a simpler solution would be to make () call GetOwnCopy (or whatever) but also have a const version of () that doesn't (I'm not too sure what () is going to do so don't know if that is feasible).

You could then rely on the user of the library to enforce it like this:

void f(const Mat &M){    cout << M(4) << endl; // guessing about the use of () here}void g(Mat &M){    M(2)=2.4f;}int main(){    Mat b;    Mat a=fft(fft(b)); // GetOwnCopy not called yet    f(a); // GetOwnCopy still not called    g(a); // GetOwnCopy called inside g}


I appreciate this is not ideal, but it is a solution. Perhaps if you could describe what you want the () function to actually do, people will be able to suggest cleaner solutions.

There could equally be a far better overall solution to your original problem than copy-on-write as well of course. It just seemed appropriate to me.

HTH
I actually wanted the operator for accessing one element of a matrix:

float& Mat::operator()( int x, int y ){  return data[ x*m_numRows + y ];}float Mat::operator()( int x, int y ) const{  return data[ x*m_numRows + y ];}
--== discman1028 ==--
This is a suggestion I have thrown together quickly and not thoroughly tested but it seems to work:

#include <iostream>#include <vector>using namespace std;class Matrix{private:class Imp{public:    Imp(int w,int h) : r(0),wt(w),ht(h) { v.resize(w*h); cout << "imp\n"; }    ~Imp(){ cout << "free\n"; }    vector<float> v; int r,wt,ht;};class MFloat{public:    MFloat(Matrix *M,float *v,int fx,int fy) : Mp(M),Fv(v),x(fx),y(fy) { }    operator float() const { return *Fv; }    operator float&()        {        if(Mp->Im->r>1) Mp->GetOwnCopy();        Fv=&(Mp->Im->v[(y*Mp->Im->wt)+x]);        return *Fv;        }    Matrix *Mp; float *Fv; int x,y;};    Imp *Im;    friend class MFloat;    void GetOwnCopy();public:    Matrix(int w,int h){ Im=new Imp(w,h); Im->r=1; }    Matrix(const Matrix &M){ Im=M.Im; ++Im->r; }    ~Matrix(){ --(Im->r); if(Im->r==0) delete Im; }    const Matrix &operator=(const Matrix &M)        { --Im->r; if(Im->r==0) delete Im; Im=M.Im; ++Im->r; return *this; }    float operator()(int x,int y) const { return Im->v[(y*Im->wt)+x]; }    MFloat operator()(int x,int y)        { return MFloat(this,&(Im->v[(y*Im->wt)+x]),x,y); }};void Matrix::GetOwnCopy(){    Imp *I=Im; Im=new Imp(I->wt,I->ht); Im->v=I->v; Im->r=1;    --(I->r); if(I->r==0) delete I;}int main(){    Matrix M(5,5),N=M;    M(2,2)=23.0f;    cout << M(2,2) << endl;}


It relys on the fact that when a non-const Matrix has its () operator called, it returns the MFloat class, which is private to Matrix. MFloat has operator float and operator float& functions and it would appear that the correct one gets called depending on whether you are asking for an lvalue or not.

In the int main() as above, the GetOwnCopy is called before the assignment. If you take the M(2,2)=23.0f line out, it isn't called since the operator float is used to get the value for the cout statement rather than the operator float&.

I have no idea whether this would work if developed or not, or whether there is a well-defined precedence for calling the non-reference operator float function over the reference version depending on if an lvalue is being used or not, so I make no apology if this program is relying on undefined behaviour or non-standard aspects of the compiler I tested this on. It is just an idea that I hope might provoke some discussion.
Thanks for the reply. I am very much interested in this problem, but have decided that I will deal with its complexity later, as I do not have forever to spend on the problem. I will return by value for now from the fft function.

EDIT: I agree, in that I don't know if the technique you mention could be trusted / extended, I'm interested to hear what others do! I found a link of a cow_ptr implementer that used the string [] technique you mentioned earlier here, and is pretty similar to what you tried -- differentiate an operator() by the return type.

Thanks EasilyConfused!
--== discman1028 ==--
I think that sounds like a wise idea.

This was an interesting topic for me because I've never tried to implement copy-on-write for anything before.

To be honest though, I tend to pass stuff I don't want copied around as const whatever &, and I'd have just made ttf take a non-const reference to the value it is modifying rather than return a value.

Not a solution if you're writing classes for other people to use though.
Quote:Original post by EasilyConfused
To be honest though, I tend to pass stuff I don't want copied around as const whatever &, and I'd have just made ttf take a non-const reference to the value it is modifying rather than return a value.

Not a solution if you're writing classes for other people to use though.


Exactly my reasoning.

I have no problem making fft() a non-member function, and passing non-const ref as you say, but the real problem is the return value.
--== discman1028 ==--
I also realize that I may have been going over the top for my needs. Though that auto-copying would be nice, all I really needed was a reference-counting ptr to the matrix implementation, and I can decide myself when to copy it (e.g., in fft()):

#include <cstdlib>#include <vector>#include <iostream>#include <boost/shared_ptr.hpp>using namespace std;/// The actual fft.void thefftimplementation( vector<float>& rData ){	// just do something dumb like multiply by 2.0f for now	for( int i = 0; i < rData.size(); i++ )	{		rData *= 2.0f;	}}class MatImplementation{public:	/// Constructors.	MatImplementation( int nRows, int nCols )	{		m_data.resize( nRows*nCols );		for( int i = 0; i < nRows*nCols; i++ )		{	// Fill with random data for now			m_data = <span class="cpp-number">1</span>.0f*rand();<br>		}<br>	}<br>	MatImplementation( MatImplementation <span class="cpp-keyword">const</span>&amp; m ) : m_data( m.m_data ) {}<br><br>	<span class="cpp-comment">/// = operator.</span><br>	MatImplementation&amp; <span class="cpp-keyword">operator</span>=( MatImplementation <span class="cpp-keyword">const</span>&amp; m )<br>	{<br>		m_data = m.m_data;<br>		<span class="cpp-keyword">return</span> *<span class="cpp-keyword">this</span>;<br>	}<br><br>	<span class="cpp-comment">/// Destructor.</span><br>	~MatImplementation() {}<br><br>	<span class="cpp-comment">/// Honest-to-goodness matrix data.</span><br>	vector&lt;<span class="cpp-keyword">float</span>&gt; m_data;<br>};<br><br><br><span class="cpp-keyword">class</span> Mat<br>{<br><span class="cpp-keyword">public</span>:<br>	<span class="cpp-comment">/// Constructors.</span><br>	Mat( <span class="cpp-keyword">int</span> nRows, <span class="cpp-keyword">int</span> nCols )<br>	{<br>		pMatrixData.reset( <span class="cpp-keyword">new</span> MatImplementation(nRows, nCols) );<br>	}<br>	Mat( Mat <span class="cpp-keyword">const</span>&amp; m ) : pMatrixData( m.pMatrixData ) {}<br><br>	<span class="cpp-comment">/// = operator.</span><br>	Mat&amp; <span class="cpp-keyword">operator</span>=( Mat <span class="cpp-keyword">const</span>&amp; m )<br>	{<br>		pMatrixData = m.pMatrixData;<br>		<span class="cpp-keyword">return</span> *<span class="cpp-keyword">this</span>;<br>	}<br><br>	<span class="cpp-comment">/// FFT operation.</span><br>	Mat fft()<br>	{<br>		<span class="cpp-comment">// Make a copy of the current Mat (copies a ptr only).</span><br>		Mat returnMat( *<span class="cpp-keyword">this</span> );<br><br>		<span class="cpp-comment">// Explicitly ask for a ptr to a new copy of the data.</span><br>		returnMat.pMatrixData.reset( <span class="cpp-keyword">new</span> MatImplementation( *pMatrixData ) );<br><br>		<span class="cpp-comment">// Perform an fft on the data.</span><br>		thefftimplementation( returnMat.pMatrixData-&gt;m_data );<br><br>		<span class="cpp-comment">// Return a (small) copy of the Mat class.</span><br>		<span class="cpp-keyword">return</span> returnMat;<br>	}<br><br>	<span class="cpp-comment">/// Destructor.</span><br>	<span class="cpp-keyword">virtual</span> ~Mat() {}<br>	<br><span class="cpp-keyword">protected</span>:<br>	<span class="cpp-comment">/// Shared pointer to a heap-allocated matrix.</span><br>	boost::shared_ptr&lt;MatImplementation&gt; pMatrixData;<br><br>	<span class="cpp-comment">/// Constructor (hidden).</span><br>	Mat();<br>};<br><br><br><span class="cpp-keyword">int</span> main(<span class="cpp-keyword">int</span> argc, <span class="cpp-keyword">char</span>* argv[])<br>{<br>	Mat a( <span class="cpp-number">10</span>, <span class="cpp-number">10</span> );<br>	Mat b( <span class="cpp-number">10</span>, <span class="cpp-number">10</span> );<br><br>	a = b.fft();<br><br><br><br><br><br>	<span class="cpp-keyword">return</span> <span class="cpp-number">0</span>;<br>}<br><br><br><br></pre></div><!–ENDSCRIPT–><br><br>This seems ideal, for now.
--== discman1028 ==--
Quote:Original post by discman1028
Hey all, I need a sugestion or two.

Say I had a matrix class, Mat. It is an arbitrary-size 2D matrix, using a std::vector<float> as its internal representation. These objects will be of very large size, so I'd like to avoid any unnecessary construction / destruction, while keeping their use simple.

Now, here is the syntax of what I would like to do:

*** Source Snippet Removed ***

Very simple. I am getting stuck in the details however. Should fft() return by value? If so, there are unnecessary copies going on. Should it allocate on the heap and return a pointer? If so, the syntax gets a bit uglier (unless Mat::operator=( Mat* ) is defined, but then the user must save the pointer, and handle deleteing). Should there be a global vector<Mat*> to keep track of all Mat's? Should I pass around handles as a level of indirection in this case?

The real problem seems to be fft( Mat ), I'm not sure how to allocate the result-of-the-fft's memory. Would inlining the fft() function get rid of the return-by-value copying cost?


Thanks for tips.


Copy-On-Write is nice, but for your problem it looks like you could just pass in a reference to the object to write to:

Mat A, B, C;B.fft( A );// orC.fft( A );A = B;// orMat temp;B.fft( temp );temp.fft( temp ); // does fft have dependancies that would disallow this?temp.( A ); 


I don't know anything about FFT, but if you're unable to write to the object being read you'd simply use another temp Mat. It makes the syntax a little ugly but, if it's speed you want, passing in references to memory should be more effecient than reference counting and Copy-On-Write logic.

Hope that helps.
If you're looking for a matrix implementation that do copy-on-write, expression templates and more magic, check out Blitz++ at oonumerics.org.
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian W. Kernighan

This topic is closed to new replies.

Advertisement