Reuse object vs creating new object

One of our projects deals with tons of data. It selects data from a database and serializes the results into JSON/XML.
The number of selected rows can easily reach 50 million.
In the beginning, however, the runtime of the program was too bad.
So we refactored the program with one major adjustment:
The working objects for serialization are no longer recreated for every single row; instead, one object is cleared and reinitialized.

For Example:

before:

For every single database row we create an object of DatabaseRowSerializer and call the specific serialize function.

// loop with all dbRows

 DatabaseRowSerializer serializer(dbRow);
 result.add(serializer.toXml());

after:

The constructor of DatabaseRowSerializer no longer sets the dbRow. Instead, this is done by the initDbRow() function.
The main point is that only one object is used for the whole runtime. After a dbRow has been serialized, the clear() function
is called to reset the object.

DatabaseRowSerializer serializer;

// loop with all dbRows

 serializer.initDbRow(dbRow);
 result.add(serializer.toXml());
 serializer.clear();

So my question:

Is this really a good way to handle the problem?
In my opinion, init() functions aren't really smart; normally a constructor should be used to initialize the parameters.

Which way do you generally prefer: before or after?

asked 1 hour ago by user2622344

Did this "major" change fix your performance problem? If it didn't, leave it as it was.
– molbdnilo, 1 hour ago

Generally the way to go with C++ is RAII, i.e. no init() / clear(), unless you're solving a problem that's more important to solve than maintaining good architecture... I am voting to close this question as opinion-based.
– DevSolar, 1 hour ago

@DevSolar There are plenty of objective things we can say about this. It's not exactly "what's your favourite compiler?"
– Lightness Races in Orbit, 55 mins ago

c++

2 Answers

On the one hand, this is subjective. On the other, opinion widely agrees that in C++ you should avoid this "init function" idiom because:

It is worse code
- You have to remember to "initialise" your object and, if you don't, what state is it in? Your object should never be in a "dead" state. (Don't get me started on "moved-from" objects…) This is why C++ introduced constructors and destructors, because the old C approach was kind of minging and resulting programs are harder to prove correct.

It is unnecessary
- There is essentially no overhead in creating a DatabaseRowSerializer every time, unless its constructor does more than your initDbRow function, in which case your two examples are not equivalent anyway.
  
  Even if your compiler doesn't optimise away the unnecessary "allocation", there isn't really an allocation anyway because the object just takes up space on the stack and it has to do that regardless.
  
  So if this change really solved your performance problem, something else was probably going on.

Use your constructors and destructors. Freely and proudly!

That's the common advice when writing C++.
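
To make the "dead state" point concrete, here is a minimal sketch contrasting the two designs. DbRow, its fields, the XML layout and the name TwoPhaseSerializer are assumptions made for illustration; the question does not show the real types, and the sketch does no XML escaping.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a database row; the real type is not shown in the question.
struct DbRow {
    int id;
    std::string name;
};

// Constructor-initialised design: an instance cannot exist without a row,
// so toXml() never has to ask "was I initialised?".
class DatabaseRowSerializer {
public:
    explicit DatabaseRowSerializer(const DbRow& row) : row_(row) {}

    std::string toXml() const {
        return "<row id=\"" + std::to_string(row_.id) + "\"><name>" +
               row_.name + "</name></row>";
    }

private:
    const DbRow& row_;  // non-owning; the row must outlive the serializer
};

// Two-phase design (illustrative name): between construction and initDbRow(),
// and again after clear(), the object is in a "dead" state that every member
// function has to guard against.
class TwoPhaseSerializer {
public:
    void initDbRow(const DbRow& row) { row_ = &row; }
    void clear() { row_ = nullptr; }
    bool ready() const { return row_ != nullptr; }

    std::string toXml() const {
        assert(row_ != nullptr && "initDbRow() must be called before toXml()");
        return "<row id=\"" + std::to_string(row_->id) + "\"><name>" +
               row_->name + "</name></row>";
    }

private:
    const DbRow* row_ = nullptr;
};
```

Both classes produce the same output for a valid row; the difference is that only the second one can be misused by calling toXml() at the wrong time.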

A possible third approach, if you did want to make the serializer re-usable for whatever reason, is to pass the row into the actual operational function call, so the serializer holds no per-row state:

DatabaseRowSerializer serializer;

// loop with all dbRows

 result.add(serializer.toXml(dbRow));

You might do this if the serialiser has some desire to cache information, or re-use dynamically-allocated buffers, to aid in performance. That of course adds some state into the serialiser.
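
As a rough sketch of what such a re-usable serializer could look like, the version below keeps one std::string as a scratch buffer across calls. DbRow, its fields and the XML layout are assumptions for illustration only. Note that the standard does not strictly promise that std::string::clear() keeps the capacity, although mainstream implementations do.

```cpp
#include <string>

// Hypothetical stand-in for a database row; the real type is not shown in the question.
struct DbRow {
    int id;
    std::string name;
};

class DatabaseRowSerializer {
public:
    // Serializes one row into the internal buffer and returns a reference to it.
    // The reference is only valid until the next call to toXml().
    const std::string& toXml(const DbRow& row) {
        buffer_.clear();  // drops the contents; capacity is retained in practice
        buffer_ += "<row id=\"";
        buffer_ += std::to_string(row.id);
        buffer_ += "\"><name>";
        buffer_ += row.name;
        buffer_ += "</name></row>";
        return buffer_;
    }

private:
    std::string buffer_;  // the only state: a scratch buffer reused between rows
};
```

There is no init()/clear() pair for the caller to forget: the object is usable straight after construction, and the per-row input arrives as a function argument.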

If you do this and still don't have any state, then the whole thing can just be a static call:

// loop with all dbRows

 result.add(DatabaseRowSerializer::toXml(dbRow));

…but then it may as well just be a function.
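
For completeness, a sketch of the plain-function variant. DbRow, its fields, the XML layout and the helper serializeAll are hypothetical names introduced here; result.add() from the question is approximated with a std::vector.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a database row; the real type is not shown in the question.
struct DbRow {
    int id;
    std::string name;
};

// Stateless serialization as a free function: nothing to construct, init or clear.
std::string toXml(const DbRow& row) {
    return "<row id=\"" + std::to_string(row.id) + "\"><name>" +
           row.name + "</name></row>";
}

// The loop over all rows, collecting one XML fragment per row.
std::vector<std::string> serializeAll(const std::vector<DbRow>& rows) {
    std::vector<std::string> result;
    result.reserve(rows.size());  // one allocation for the result vector itself
    for (const DbRow& row : rows) {
        result.push_back(toXml(row));
    }
    return result;
}
```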

Ultimately we can't know exactly what's best for you, but there are plenty of options and considerations.

answered 1 hour ago (edited 1 hour ago) by Lightness Races in Orbit

In this case the serializer could be static, because it doesn't have any state. Correct?
– user2622344, 1 hour ago

@user2622344 Correct. So, in fact, DatabaseRowSerializer::toXml(dbRow) could be a fourth valid approach. What's best for you depends on things we can't see or know or measure from here.
– Lightness Races in Orbit, 1 hour ago


Generally I agree with the points raised by LRiO in the other answer.

Just moving the c'tor out of the loop isn't a good idea.

However, for this style of loop body:

feed object some data

transform data within object

return transformed data from object

it is, IMHO, often the case that the transforming object will allocate some buffers (on the heap) that potentially can be reused when the second form with the init function is used. In naive implementations, this reuse may not even be deliberate, just a side effect of the implementation.

So, IFF you're seeing a speed up by your refactoring (hoisting the object c'tor out of the loop), it may be because the object is now able to re-use some buffers and avoid repeated "redundant" heap allocations for these buffers.

So, in summary:

You do not want the constructor to be hoisted out of the loop for its own sake. But you want all buffers that can be preserved to be preserved across the loop iterations.
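
The buffer effect described above can be observed with a small experiment. The sketch below is an illustration under assumptions, not a measurement of the asker's program: rows are modelled as plain strings, and a change in capacity() is used as a proxy for a heap allocation. The function names are made up for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Buffer declared outside the loop: after the first row it is usually big enough.
std::size_t growthsWithReuse(const std::vector<std::string>& rows) {
    std::vector<char> buffer;  // lives across iterations
    std::size_t growths = 0;
    for (const std::string& row : rows) {
        buffer.clear();  // size becomes 0, capacity is left unchanged
        const std::size_t before = buffer.capacity();
        buffer.insert(buffer.end(), row.begin(), row.end());
        if (buffer.capacity() != before) {
            ++growths;  // the buffer had to reallocate for this row
        }
    }
    return growths;
}

// Buffer declared inside the loop: every non-empty row forces a new allocation.
std::size_t growthsWithoutReuse(const std::vector<std::string>& rows) {
    std::size_t growths = 0;
    for (const std::string& row : rows) {
        std::vector<char> buffer;  // fresh, empty buffer each iteration
        const std::size_t before = buffer.capacity();
        buffer.insert(buffer.end(), row.begin(), row.end());
        if (buffer.capacity() != before) {
            ++growths;
        }
    }
    return growths;
}
```

With equally sized, non-empty rows, the reused buffer grows once in total while the per-iteration buffer grows once per row. Whether that difference matters for 50 million rows still has to be measured on the real code.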

answered 1 hour ago (edited 1 hour ago) by Martin Ba

That's a good point. Starting with the "re-used" design permits caching and buffer re-use and other optimisations that you otherwise prevent yourself from adding later. Well, unless you refactor again :)
– Lightness Races in Orbit, 1 hour ago

