dd issues - as always

my command:



gunzip -c serial2udp.image.gz |
sudo dd of=/dev/mmcblk0 conv=fsync,notrunc status=progress bs=4M


my output:



15930949632 bytes (16 GB, 15 GiB) copied, 1049 s, 15.2 MB/s 

0+331128 records in
0+331128 records out
15931539456 bytes (16 GB, 15 GiB) copied, 1995.2 s, 8.0 MB/s


the card: SanDisk Ultra 32GB MicroSDHC Class 10 UHS memory card, speed up to 30MB/s

distribution: Ubuntu 16.04 (xenial) with Xfce

kernel version: 4.13.0-37-generic



I understand that taking 17 minutes seems reasonable, from what I've read. Playing with block size doesn't really seem to make much of a difference (bs=100M still exhibits this behaviour, with similar timestamps). Why do the progress updates hang, and why doesn't dd produce a finished report for another 16 minutes?



iotop tells me that mmcqd/0 is still running in the background at this point (at 99% IO), so I figure there is a cache somewhere that is holding up the final 5MB, but I thought fsync should make sure that doesn't happen. iotop shows no traffic for dd at this time either. Ctrl-C is all but useless, and I don't want to corrupt my drive after writing to it.
linux dd sd-card iotop

asked Sep 9 at 12:03 by cts, edited 2 days ago by sourcejedi

  • updated 4.13.0-37-generic kernel. thanks
    – cts
    2 days ago














2 Answers

















Answer 1 (8 votes, accepted)
answered Sep 9 at 14:59 by sourcejedi, edited 2 days ago

> I figure there is a cache somewhere that is holding up the final 5MB but I thought fsync should make sure that doesn't happen




conv=fsync means to write back any caches by calling fsync - after dd has written all the data. Hanging at the end is exactly what it will do...



When the output file is slower than the input file, the data written by dd can pile up in caches. The kernel cache can sometimes fill a significant fraction of system RAM. This makes for very misleading progress information.
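
One way to actually see this happening (my sketch, not part of the original answer) is to watch the kernel's dirty and writeback counters while dd runs:

# Dirty: written data still sitting in the kernel cache
# Writeback: data currently being flushed to the device
watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'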



The "final 5MB" was just an artefact of how dd shows progress.



I usually ignore this as a known problem :-). I try to use the simplest commands I need, and often use a separate sync command. Not getting any progress information for 15 minutes would annoy me as well though :-(.
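
As a sketch of that simpler approach (assuming the same image and device names as in the question) - the writeback still takes just as long, but it happens in a separate, visible step:

gunzip -c serial2udp.image.gz |
sudo dd of=/dev/mmcblk0 status=progress bs=4M
sync    # flushes the kernel cache; this step absorbs the "missing" minutes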



If your system was indeed caching about 8GB (i.e. half of the 16GB of written data), then I think you either must have about 32GB of RAM, or have been fiddling with certain kernel options. See the lwn.net link below.



Other dd options are also available, to hang in different ways... If you want dd to show more accurate progress without degrading performance, you might have to accept more complexity:



gunzip -c serial2udp.image.gz |
dd iflag=fullblock bs=4M |
sudo dd iflag=fullblock oflag=direct conv=fsync status=progress bs=4M of=/dev/mmcblk0



  • oflag=direct avoids piling up data in the kernel cache, because it bypasses the cache altogether.


  • iflag=fullblock is required in such a command AFAIK (e.g. because you are reading from a pipe and writing using direct IO). The effect of missing fullblock is another unfortunate complexity of dd. Some posts on this site use this to argue you should always prefer to use a different command. It's hard to find another way to do direct or sync IO though.


  • conv=fsync should still be used, to write back the device cache.

  • I added an extra dd after gunzip, to buffer the decompressed output in parallel with the disk write. This is one of the issues that makes the performance with oflag=direct or oflag=sync a bit complex. Normal IO (non-direct, non-sync) doesn't need this as it is already buffered by the kernel cache. You also wouldn't need the extra buffer if you were writing to a hard drive with 4M of writeback cache, but I don't assume an SD card has that.

You could alternatively use oflag=direct,sync (and not need conv=fsync). This might be useful for good progress information if you had a weird output device with hundreds of megabytes of cache. But normally I think of oflag=sync as a potential barrier to performance.
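
That alternative would look like this (an untested sketch, same pipeline as above, with conv=fsync dropped):

gunzip -c serial2udp.image.gz |
dd iflag=fullblock bs=4M |
sudo dd iflag=fullblock oflag=direct,sync status=progress bs=4M of=/dev/mmcblk0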



There is a 2013 article, https://lwn.net/Articles/572911/, which mentions minute-long delays like yours. Many people see this ability to cache minutes' worth of writeback data as undesirable. The problem was that the limit on the cache size was applied indiscriminately, to both fast and slow devices. Note that it is non-trivial for the kernel to measure device speed, because speed varies with the data locations: e.g. if the cached writes are scattered across random locations, a hard drive takes longer because it repeatedly moves the write head.
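
If the sheer amount of cached writeback data is what bothers you, the limits discussed in that article can also be lowered system-wide. These sysctls are real, but the values below are purely illustrative, not a recommendation:

# cap total dirty cache at 64 MB and start background flushing at 16 MB
sudo sysctl -w vm.dirty_bytes=67108864
sudo sysctl -w vm.dirty_background_bytes=16777216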




> why do the updates hang




fsync() is a single system call that applies to the entire range of the file or device. It does not return any status updates before it is done. dd is simplistic; there are more things it could do if it wanted, some of them Linux-specific. For example, the Linux-only sync_file_range() would allow waiting on individual blocks (or larger ranges).






  • just as a follow-up, the command suggested by @sourcejedi worked, in that it gave me a much more accurate representation of the progress, and the system did not hang at the end. it did take longer overall, though: 3798+1 records in, 3798+1 records out, 15931539456 bytes (16 GB, 15 GiB) copied, 2838.48 s, 5.6 MB/s, with identical file size and block size. so this seems to be a trade-off of realtime progress versus speed. that said, I had tried this previously on 4 different attempts and got corrupted writes for the raspi.img, and this time it worked on the first write.
    – cts
    2 days ago











  • @cts Thank you for the feedback! I can't think of any explanation why it would be that much slower for your device, so I don't have any other specific suggestion. In principle direct could want a bigger bs=, but on my system cat /sys/class/block/mmcblk0/queue/max_hw_sectors_kb already suggests it is capped at 512 KB. I have a 16GB Ultra bought in 2017, so I've tested it with 500MB of data. I got ~13.1 MB/s overall with your original command, and 13.5 MB/s with mine.
    – sourcejedi
    2 days ago

















Answer 2 (0 votes)
answered Sep 9 at 13:09 by RalfFriedl, edited Sep 9 at 14:55

You want to use fullblock. notrunc has no effect on device files.



gunzip -c serial2udp.image.gz |
sudo dd of=/dev/mmcblk0 iflag=fullblock status=progress bs=4M





  • Very impressive, if fsync() is not doing anything on the block device and the other issue is causing a 16 minute delay after writing 99.99% of the data in 16 minutes. Anyway, if the answer doesn't assert fsync is a no-op, then I have no downvote-worthy issue. I found a claim that device writes are synchronous on BSD; maybe there is a difference between OS's.
    – sourcejedi
    Sep 9 at 15:08










