How do you deal with “bugs” that can never be reproduced
Suppose your team writes a software system that is running fine.
One day one of the engineers mistakenly runs some SQL queries that change some of the DB data, then forgets about it.
After some time you discover the corrupted/erroneous data and everyone scratches their heads as to which part of the code caused this and why, to no avail. Meanwhile the project manager insists that we find the part of the code that caused it.
How do you deal with this?
project-management
asked 2 hours ago
Nicholas Kyriakides
If the engineer forgot about it, how do you know that's what happened? How do you know it was corrupted by someone running a script, and not by a bug?
– DaveG
55 mins ago
He had an epiphany after a day or two. This is a hypothetical in case he never did remember, which could easily have been the case.
– Nicholas Kyriakides
53 mins ago
So in this case, is the project manager still thinking this is due to a bug, despite one of the engineers saying it's due to a manual SQL query? Or does the manager just want better error checking & procedures to try to catch this sort of thing before it's a problem?
– DaveG
48 mins ago
This is a hypothetical. I'm sure the PM would have us chase this as much as we can if he never did remember. I know I would.
– Nicholas Kyriakides
46 mins ago
3 Answers
Obviously, no project manager will invest an unlimited amount of time in such a problem. What they want is to prevent the same situation from happening again.
To achieve this goal, even if one cannot find the root cause of such a failure, it is often possible to take measures for
- detecting this kind of failure earlier if it recurs
- making it less likely the same failure will happen again
- making the system more robust against the specific kind of inconsistency
For example, more detailed logging, more fine-grained error handling, and immediate error signaling could help prevent the same error from striking again, or help find the root cause. If your system allows database triggers, it may be possible to add a trigger that prevents the inconsistency from being introduced in the first place.
Think about which kind of action is appropriate in your situation and suggest it to the team. I am sure your project manager will be pleased.
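As a rough illustration of the trigger idea, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the "balance must never go negative" rule, and the helper names are all hypothetical; a real system would encode whatever invariant was actually violated, in the trigger dialect of its own database.

```python
import sqlite3


def make_guarded_db() -> sqlite3.Connection:
    """Create an in-memory database whose trigger rejects negative balances."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);

        -- Hypothetical invariant: a balance must never go negative.
        CREATE TRIGGER accounts_no_negative_balance
        BEFORE UPDATE OF balance ON accounts
        WHEN NEW.balance < 0
        BEGIN
            SELECT RAISE(ABORT, 'balance must not be negative');
        END;

        INSERT INTO accounts (id, balance) VALUES (1, 100);
        """
    )
    return conn


def try_update(conn: sqlite3.Connection, account_id: int, new_balance: int) -> bool:
    """Return True if the update was accepted, False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE accounts SET balance = ? WHERE id = ?",
                (new_balance, account_id),
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

Because the check lives in the database, it applies to hand-typed SQL as well as to application code, which is the case the question describes.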
edited 1 hour ago
answered 1 hour ago
Doc Brown
Is there an established procedure that you're giving me an overview of here, or is this just based on experience/common sense?
– Nicholas Kyriakides
59 mins ago
A production database should have full access logging and role-based access controls. You would then have hard evidence of WHO did WHAT and WHEN to the database, which moves the attention from the code to poor operational security.
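A minimal sketch of the WHAT and WHEN part, again using Python's built-in `sqlite3` module with hypothetical `orders` and `orders_audit` tables. SQLite has no user accounts, so the WHO column is deliberately missing here; on a server database that information would have to come from the engine's own access logging and roles, which this sketch does not attempt to reproduce.

```python
import sqlite3


def make_audited_db() -> sqlite3.Connection:
    """Create an in-memory database that records every change to orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL);

        -- Hypothetical audit table: one row per change, stamped by the database.
        CREATE TABLE orders_audit (
            audit_id   INTEGER PRIMARY KEY AUTOINCREMENT,
            order_id   INTEGER NOT NULL,
            old_status TEXT,
            new_status TEXT,
            changed_at TEXT NOT NULL DEFAULT (datetime('now'))
        );

        CREATE TRIGGER orders_audit_update
        AFTER UPDATE ON orders
        BEGIN
            INSERT INTO orders_audit (order_id, old_status, new_status)
            VALUES (OLD.id, OLD.status, NEW.status);
        END;

        INSERT INTO orders (id, status) VALUES (1, 'paid');
        """
    )
    return conn


def audit_trail(conn: sqlite3.Connection, order_id: int) -> list:
    """Return (old_status, new_status, changed_at) rows for one order, oldest first."""
    return conn.execute(
        "SELECT old_status, new_status, changed_at FROM orders_audit "
        "WHERE order_id = ? ORDER BY audit_id",
        (order_id,),
    ).fetchall()
```

With a trail like this, the question "when did this row change?" can be answered from the data itself, which narrows down which access logs are worth reading.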
answered 1 hour ago
Don Gilman
It sounds like they may not know exactly when the data corruption occurred, which could make it difficult to figure out what logs they need to investigate.
– Nathanael
1 hour ago
- Explain to your project manager that you think the most likely cause is manual database access.
- If they still want you to look for the code that caused this, go and have another look at the code.
- Come back in a couple of hours (or some other appropriate time) and say you can't find any code which would have caused this, therefore you still believe the most likely cause is manual database access.
- If they still want you to look for the code, ask how much time they would like you to spend on this. Subtly remind them that you won't be working on feature X, bug Y or enhancement Z while you're doing this.
- Spend as much time as they ask. If you still think the most likely cause is manual database access, tell them this.
- If they still want you to look for the code, escalate the issue as this has clearly become an unproductive use of your team's time.
You may also want to consider whether you should add extra processes to reduce the likelihood of manual database access causing this kind of issue in the future.
answered 1 hour ago
Philip Kendall
I had no idea one of the engineers had done a manual update, and engineers almost never run queries directly on the database. This one just did, as a one-off thing, and forgot about it. We spent a day and were preparing to spend a full week finding out what was wrong. My question is what happens if you can't find the cause and can't suggest what the potential cause might be.
– Nicholas Kyriakides
1 hour ago
"My question is what happens if you can't find the cause and can't suggest what the potential cause might be" This is the exact reason the 'won't fix - can't duplicate' flag was invented.
– esoterik
29 mins ago