Adam Goryachev
2016-01-18 02:54:22 UTC
OK, so I've been suffering from this bug for years, but I've recently
spent a bit more time on it and might have some insight.
Firstly, the crash seems to happen semi-randomly during backups, so the
backup will usually complete eventually. It does seem related to the
backup client and/or the number of files that the client has.
I get logs like this in my /var/log/messages:
Jan 17 23:34:31 keep kernel: [11513620.906447] rsync_bpc[27253]: segfault at 7fe37aefd428 ip 00000000004473af sp 00007ffd49086e40 error 4 in rsync_bpc[400000+75000]
Jan 18 00:35:01 keep kernel: [11517256.388512] rsync_bpc[27472]: segfault at 7fcaa53f3428 ip 00000000004473af sp 00007ffede0afdd0 error 4 in rsync_bpc[400000+75000]
Jan 18 01:05:12 keep kernel: [11519069.903776] rsync_bpc[27607]: segfault at 7f747bbf5428 ip 00000000004473af sp 00007fffe8b03b40 error 4 in rsync_bpc[400000+75000]
Jan 18 01:35:09 keep kernel: [11520869.888899] rsync_bpc[27860]: segfault at 7f7a06240428 ip 00000000004473af sp 00007ffebad15f60 error 4 in rsync_bpc[400000+75000]
Jan 18 02:04:56 keep kernel: [11522659.795284] rsync_bpc[28086]: segfault at 7f0088f10428 ip 00000000004473af sp 00007ffe07298520 error 4 in rsync_bpc[400000+75000]
Jan 18 02:40:48 keep kernel: [11524814.507776] rsync_bpc[28340]: segfault at 7f048c215428 ip 00000000004473af sp 00007fff569b9e20 error 4 in rsync_bpc[400000+75000]
Jan 18 03:04:41 keep kernel: [11526249.846662] rsync_bpc[28562]: segfault at 7fb36e61e428 ip 00000000004473af sp 00007ffdcb018d20 error 4 in rsync_bpc[400000+75000]
Jan 18 03:40:53 keep kernel: [11528425.088184] rsync_bpc[28795]: segfault at 7f1cb1b9b428 ip 00000000004473af sp 00007fff75d291a0 error 4 in rsync_bpc[400000+75000]
Jan 18 04:05:13 keep kernel: [11529887.025178] rsync_bpc[29045]: segfault at 7f94898d8428 ip 00000000004473af sp 00007fff3d6a1dd0 error 4 in rsync_bpc[400000+75000]
Jan 18 04:37:06 keep kernel: [11531803.020965] rsync_bpc[29275]: segfault at 7f83b84b3428 ip 00000000004473af sp 00007ffcb8962ed0 error 4 in rsync_bpc[400000+75000]
Jan 18 05:11:09 keep kernel: [11533848.516550] rsync_bpc[29531]: segfault at 7f18f296a428 ip 00000000004473af sp 00007ffe3f582680 error 4 in rsync_bpc[400000+75000]
Jan 18 05:47:10 keep kernel: [11536013.450327] rsync_bpc[29921]: segfault at 7fd986392428 ip 00000000004473af sp 00007ffe07aacb60 error 4 in rsync_bpc[400000+75000]
Jan 18 06:04:47 keep kernel: [11537071.297055] rsync_bpc[30127]: segfault at 7f0dd13f3428 ip 00000000004473af sp 00007fff977dd350 error 4 in rsync_bpc[400000+75000]
Jan 18 13:15:05 keep kernel: [11562928.034694] rsync_bpc[1224]: segfault at 7f6923390428 ip 00000000004473af sp 00007fff7d94c8f0 error 4 in rsync_bpc[400000+75000]
Jan 18 13:30:57 keep kernel: [11563881.316870] rsync_bpc[1322]: segfault at 7f8a9f83b428 ip 00000000004473af sp 00007fff9b9d9850 error 4 in rsync_bpc[400000+75000]
I've found an informative post:
http://stackoverflow.com/questions/2549214/interpreting-segfault-messages and
this pointed me to this command:
addr2line -e /usr/local/bin/rsync_bpc -fCi 0x00000000004473af
bpc_attrib_fileCopyOpt
/usr/src/rsync-bpc-3.0.9.3/backuppc/bpc_attrib.c:284
Looking at the file I see this function:
273 /*
274  * Copy all the attributes from fileSrc to fileDest. fileDest should already have a
275  * valid allocated fileName and allocated xattr hash. The fileDest xattr hash is
276  * emptied before the copy, meaning it is over written.
277  *
278  * If overwriteEmptyDigest == 0, an empty digest in fileSrc will not overwrite fileDest.
279  */
280 void bpc_attrib_fileCopyOpt(bpc_attrib_file *fileDest, bpc_attrib_file *fileSrc, int overwriteEmptyDigest)
281 {
282     if ( fileDest == fileSrc ) return;
283
284     fileDest->type = fileSrc->type;
285     fileDest->compress = fileSrc->compress;
286     fileDest->mode = fileSrc->mode;
287     fileDest->isTemp = fileSrc->isTemp;
288     fileDest->uid = fileSrc->uid;
289     fileDest->gid = fileSrc->gid;
290     fileDest->nlinks = fileSrc->nlinks;
291     fileDest->mtime = fileSrc->mtime;
292     fileDest->size = fileSrc->size;
293     fileDest->inode = fileSrc->inode;
Line 284 is the first place where we try to read from the object
fileSrc. I suspect that fileSrc is somehow invalid (it doesn't exist,
has been freed, etc.), and that is why we are getting this error. I'm
guessing that it has some non-NULL value, as otherwise we should see a
much smaller number in the "at" value from the logs (similar to the OP
in the stackoverflow question).
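To make the reading of those log lines concrete, here is a minimal
sketch in C that pulls the fields out of one "segfault at ..." message
and decodes the error code. It assumes the x86 page-fault error bits
(bit 0 = page present, bit 1 = write, bit 2 = user mode); the struct
and all function names are made up for illustration and are not part of
rsync-bpc or the kernel. The "near NULL" cutoff is only a heuristic.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of one kernel "segfault at ..." message (struct is illustrative). */
struct segv_info {
    uint64_t addr;  /* faulting address, the "at" value */
    uint64_t ip;    /* instruction pointer */
    uint64_t sp;    /* stack pointer */
    unsigned err;   /* x86 page-fault error code, printed in hex */
};

/* Parse text starting at "segfault at". Returns 0 on success, -1 otherwise. */
int segv_parse(const char *msg, struct segv_info *out)
{
    int n = sscanf(msg,
                   "segfault at %" SCNx64 " ip %" SCNx64 " sp %" SCNx64 " error %x",
                   &out->addr, &out->ip, &out->sp, &out->err);
    return n == 4 ? 0 : -1;
}

/* x86 page-fault error code bits. */
int segv_page_present(unsigned err) { return (err & 0x1) != 0; } /* 0: page not mapped */
int segv_was_write(unsigned err)    { return (err & 0x2) != 0; } /* 0: read access */
int segv_user_mode(unsigned err)    { return (err & 0x4) != 0; } /* 1: fault in user code */

/* Heuristic only: NULL plus a small struct offset lands in the first pages. */
int segv_looks_like_null(uint64_t addr) { return addr < 0x10000; }

/* Print a one-line interpretation of a parsed message. */
void segv_describe(const struct segv_info *si, FILE *fp)
{
    fprintf(fp, "%s of %s page at 0x%" PRIx64 " (%s), ip 0x%" PRIx64 "\n",
            segv_was_write(si->err) ? "write" : "read",
            segv_page_present(si->err) ? "protected" : "unmapped",
            si->addr,
            segv_looks_like_null(si->addr) ? "near NULL" : "not near NULL",
            si->ip);
}
```

For the first log entry above this reports a user-mode read of an
unmapped page that is not near NULL, which fits the guess that fileSrc
holds a stale or garbage pointer rather than NULL.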
So, is there a simple way to make sure fileSrc is "valid" before trying
to read it and potentially crashing? I'd like to add some extra
logging/debug output before the crash to try to find the cause and
hopefully fix it.
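C has no portable way to ask whether a pointer is valid. One
Linux-specific trick, sketched below, is to hand the pointer to write()
on a pipe: the kernel returns EFAULT instead of crashing the process if
the memory cannot be read. The helper name bpc_ptr_readable is
hypothetical and not part of rsync-bpc. It only detects unmapped
addresses, which is what "error 4" in the logs suggests; it cannot
detect a pointer to memory that was freed but is still mapped.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Return 1 if the len bytes at p can be read by this process, 0 if they
 * cannot, and -1 if the probe itself failed (e.g. out of descriptors).
 * Hypothetical debugging helper. Keep len small (a struct, not a buffer
 * of many kilobytes) so that write() on the empty pipe cannot block.
 */
int bpc_ptr_readable(const void *p, size_t len)
{
    int fd[2];
    ssize_t n;
    int saved;

    if ( p == NULL ) return 0;
    if ( pipe(fd) < 0 ) return -1;

    /* The kernel copies from p; an unreadable address yields EFAULT. */
    n = write(fd[1], p, len);
    saved = errno;
    close(fd[0]);
    close(fd[1]);

    if ( n == (ssize_t)len ) return 1;          /* everything was readable */
    if ( n >= 0 || saved == EFAULT ) return 0;  /* partly or fully unreadable */
    return -1;
}
```

A check such as bpc_ptr_readable(fileSrc, sizeof(*fileSrc)) at the top
of bpc_attrib_fileCopyOpt could then log the bad pointer value and
return instead of crashing. It costs a few system calls per call, so it
is only suitable for a debugging build.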
There are only two places that this function is called:
/*
 * Copy all the attributes from fileSrc to fileDest. fileDest should already have a
 * valid allocated fileName and allocated xattr hash. The fileDest xattr hash is
 * emptied before the copy, meaning it is over written.
 */
void bpc_attrib_fileCopy(bpc_attrib_file *fileDest, bpc_attrib_file *fileSrc)
{
    if ( fileDest == fileSrc ) return;
    bpc_attrib_fileCopyOpt(fileDest, fileSrc, 1);
}
This is just passing the exact same variable it received, so we will
need to trace it back another step... I guess if there is an easy method
to test if it is valid, I can add that test to each function before
calling bpc_attrib_fileCopy and hopefully eventually work out what is
wrong with it.
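Rather than adding a test to every caller by hand, one option is to let
the crash itself report the call chain. The sketch below installs a
SIGSEGV handler that prints the fault address and a backtrace to stderr
and then lets the default action run, so the process still dies and the
kernel still logs it. It assumes glibc (backtrace() and
backtrace_symbols_fd() from execinfo.h); bpc_install_segv_backtrace is
a hypothetical name, and where stderr ends up when BackupPC runs
rsync_bpc is not something this sketch knows.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define SEGV_MAX_FRAMES 64

/* Write a string to stderr using only write(), which is signal-safe. */
static void segv_puts(const char *s)
{
    ssize_t r = write(STDERR_FILENO, s, strlen(s));
    (void)r;
}

/* Write v in hex to stderr without calling printf. */
static void segv_put_hex(unsigned long v)
{
    char buf[2 + 2 * sizeof v];
    size_t i = sizeof buf;
    ssize_t r;

    do {
        buf[--i] = "0123456789abcdef"[v & 0xf];
        v >>= 4;
    } while ( v != 0 && i > 2 );
    buf[--i] = 'x';
    buf[--i] = '0';
    r = write(STDERR_FILENO, buf + i, sizeof buf - i);
    (void)r;
}

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
    void *frames[SEGV_MAX_FRAMES];
    int n;

    (void)sig;
    (void)ctx;
    segv_puts("SIGSEGV, fault address ");
    segv_put_hex((unsigned long)info->si_addr);
    segv_puts("\nbacktrace:\n");
    n = backtrace(frames, SEGV_MAX_FRAMES);
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    /* SA_RESETHAND restored the default action on entry; returning
     * re-runs the faulting instruction, which then kills the process. */
}

/* Install the handler once at startup. Returns 0 on success, -1 on error. */
int bpc_install_segv_backtrace(void)
{
    struct sigaction sa;
    void *warm[1];

    /* The first backtrace() call may load libgcc; do that outside the handler. */
    backtrace(warm, 1);

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO | SA_RESETHAND;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

The printed frames are return addresses; each one can be passed to the
same addr2line command used above to find which caller handed
bpc_attrib_fileCopy the bad fileSrc.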
I guess my C skills are quite rusty (non-existent really), so if anyone
is able to assist, I'd be very happy, even if it is just a clue on the
right way to debug/find the problem.
Regards,
Adam
--
Adam Goryachev Website Managers www.websitemanagers.com.au
_______________________________________________
BackupPC-devel mailing list
BackupPC-***@lists.sourceforge.net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-devel
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/