Reuse object vs creating new object
One of our projects deals with tons of data. It selects data from a database and serializes the results into JSON/XML.
Sometimes the number of selected rows can easily reach the 50 million mark.
However, the runtime of the program was too bad in the beginning.
So we refactored the program with one major adjustment:
The working objects for serialization are no longer recreated for every single row; instead, one object is cleared and reinitialized.
For example:
Before:
For every single database row we create a DatabaseRowSerializer object and call the specific serialize function.
    // loop over all dbRows
    for (const auto& dbRow : dbRows) {
        DatabaseRowSerializer serializer(dbRow);
        result.add(serializer.toXml());
    }
After:
The constructor of DatabaseRowSerializer doesn't set the dbRow. Instead, this is done by the initDbRow() function.
The main point is that only one object is used for the whole runtime. After a dbRow has been serialized, the clear() function is called to reset the object.
    DatabaseRowSerializer serializer;
    // loop over all dbRows
    for (const auto& dbRow : dbRows) {
        serializer.initDbRow(dbRow);
        result.add(serializer.toXml());
        serializer.clear();
    }
So my question:
Is this really a good way to handle the problem?
In my opinion, init() functions aren't really smart; normally a constructor should be used to initialize the possible parameters.
Which way do you generally prefer, before or after?
c++
asked 1 hour ago
user2622344
Did this "major" change fix your performance problem? If it didn't, leave it as it was.
– molbdnilo
1 hour ago
Generally the way to go with C++ is RAII, i.e. no init()/clear(), unless you're solving a problem that's more important to solve than maintaining good architecture... I am voting to close this question as opinion-based.
– DevSolar
1 hour ago
@DevSolar There are plenty of objective things we can say about this. It's not exactly "what's your favourite compiler?"
– Lightness Races in Orbit
55 mins ago
2 Answers
On the one hand, this is subjective. On the other, opinion widely agrees that in C++ you should avoid this "init function" idiom because:
It is worse code
- You have to remember to "initialise" your object and, if you don't, what state is it in? Your object should never be in a "dead" state. (Don't get me started on "moved-from" objects…) This is why C++ introduced constructors and destructors, because the old C approach was kind of minging and resulting programs are harder to prove correct.
It is unnecessary
There is essentially no overhead in creating a DatabaseRowSerializer every time, unless its constructor does more than your initDbRow function, in which case your two examples are not equivalent anyway.
Even if your compiler doesn't optimise away the unnecessary "allocation", there isn't really an allocation anyway because the object just takes up space on the stack and it has to do that regardless.
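To make the "no overhead" point concrete, here is a minimal sketch of what the constructor-based variant might look like. The `DbRow` alias and the body of `toXml()` are assumptions made up for illustration, not the asker's real code. In this sketch, constructing the serializer only stores a reference, so creating one per row costs next to nothing; any real cost would sit in `toXml()` itself.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type: (column name, value) pairs.
using DbRow = std::vector<std::pair<std::string, std::string>>;

class DatabaseRowSerializer {
public:
    // Construction only binds a reference: no allocation, no copying.
    // The row must outlive the serializer.
    explicit DatabaseRowSerializer(const DbRow& dbRow) : dbRow_(dbRow) {}

    std::string toXml() const {
        std::string xml = "<row>";
        for (const auto& column : dbRow_) {
            // Note: values are not XML-escaped in this sketch.
            xml += "<" + column.first + ">" + column.second +
                   "</" + column.first + ">";
        }
        xml += "</row>";
        return xml;
    }

private:
    const DbRow& dbRow_;
};
```

If the real constructor looks like this, hoisting it out of the loop cannot explain a large speed-up on its own.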
So if this change really solved your performance problem, something else was probably going on.
Use your constructors and destructors. Freely and proudly!
That's the common advice when writing C++.
A possible third approach, if you did want to make the serializer re-usable for whatever reason, is to move all of its state into the actual operational function call:
    DatabaseRowSerializer serializer;
    // loop over all dbRows
    for (const auto& dbRow : dbRows) {
        result.add(serializer.toXml(dbRow));
    }
You might do this if the serialiser has some desire to cache information, or re-use dynamically-allocated buffers, to aid in performance. That of course adds some state into the serialiser.
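As a rough sketch of that idea (the `DbRow` alias, the `buffer_` member and the XML layout are assumptions for illustration only), the serializer below takes the row as an argument and keeps nothing but a scratch buffer between calls:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical row type: (column name, value) pairs.
using DbRow = std::vector<std::pair<std::string, std::string>>;

class DatabaseRowSerializer {
public:
    // The row arrives with the call, so the object is usable right after
    // construction and needs no initDbRow()/clear() pair.
    const std::string& toXml(const DbRow& dbRow) {
        buffer_.clear();  // empties the string; capacity is typically kept
        buffer_ += "<row>";
        for (const auto& column : dbRow) {
            // Note: values are not XML-escaped in this sketch.
            buffer_ += '<';
            buffer_ += column.first;
            buffer_ += '>';
            buffer_ += column.second;
            buffer_ += "</";
            buffer_ += column.first;
            buffer_ += '>';
        }
        buffer_ += "</row>";
        return buffer_;  // valid only until the next toXml() call
    }

private:
    std::string buffer_;  // scratch buffer reused across rows
};
```

The returned reference is overwritten by the next call, so the caller (here, whatever `result.add` does) has to copy or consume it before serializing the next row. That is the price of the buffer re-use.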
If you do this and still don't have any state, then the whole thing can just be a static call:
    // loop over all dbRows
    for (const auto& dbRow : dbRows) {
        result.add(DatabaseRowSerializer::toXml(dbRow));
    }
…but then it may as well just be a function.
Ultimately we can't know exactly what's best for you, but there are plenty of options and considerations.
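Since the right choice depends on measurements that can only be taken on the real program, a small timing helper can be handy. This is only a sketch; `secondsTaken` is a made-up name, and a single run like this is a coarse measurement rather than a proper benchmark.

```cpp
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
template <typename Work>
double secondsTaken(Work&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Wrapping the "before" loop and the "after" loop in a lambda each, and running both on the same rows in an optimised build, should show whether the refactoring really made the difference.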
edited 1 hour ago
answered 1 hour ago


Lightness Races in Orbit
In this case the serializer could be static, because it doesn't have any state. Correct?
– user2622344
1 hour ago
@user2622344 Correct. So, in fact, DatabaseRowSerializer::toXml(dbRow) could be a fourth valid approach. What's best for you depends on things we can't see or know or measure from here.
– Lightness Races in Orbit
1 hour ago
Generally I agree with the points raised by LRiO in the other answer.
Just moving the c'tor out of the loop isn't a good idea.
However, for this style of loop body:
- feed object some data
- transform data within object
- return transformed data from object
it is, IMHO, often the case that the transforming object will allocate some buffers (on the heap) that potentially can be reused when the second form with the init function is used. In naive implementations, this reuse may not even be deliberate, just a side effect of the implementation.
So, IFF you're seeing a speed-up from your refactoring (hoisting the object c'tor out of the loop), it may be because the object is now able to re-use some buffers and avoid repeated "redundant" heap allocations for these buffers.
So, in summary:
You do not want the constructor to be hoisted out of the loop for its own sake. But you want all buffers that can be preserved to be preserved across the loop iterations.
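A minimal sketch of the difference, assuming rows are plain strings and the "serialization" is just wrapping them in a tag (both are simplifications for illustration):

```cpp
#include <string>
#include <vector>

// Buffer created inside the loop: it starts empty every iteration, so rows
// that outgrow the small-string storage trigger a fresh heap allocation.
std::vector<std::string> serializeFresh(const std::vector<std::string>& rows) {
    std::vector<std::string> result;
    for (const auto& row : rows) {
        std::string buffer;
        buffer += "<row>";
        buffer += row;
        buffer += "</row>";
        result.push_back(buffer);
    }
    return result;
}

// Buffer hoisted out of the loop: clear() empties the string, but common
// standard library implementations keep its capacity, so later rows can
// reuse the storage that earlier rows already grew.
std::vector<std::string> serializeReused(const std::vector<std::string>& rows) {
    std::vector<std::string> result;
    std::string buffer;
    for (const auto& row : rows) {
        buffer.clear();
        buffer += "<row>";
        buffer += row;
        buffer += "</row>";
        result.push_back(buffer);
    }
    return result;
}
```

Both functions produce the same output and both still pay for the copy into `result`; the only difference is how often the scratch buffer has to grow. Note that only the buffer is hoisted here, not an object with an init()/clear() interface.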
edited 1 hour ago
answered 1 hour ago
Martin Ba
That's a good point. Starting with the "re-used" design permits caching and buffer re-use and other optimisations that you otherwise prevent yourself from adding later. Well, unless you refactor again :)
– Lightness Races in Orbit
1 hour ago