Ари Линн (ari_linn) wrote in dw_dev 2016-12-29 02:21 am (UTC)

I think if you can do something like (pseudocode) DELETE FROM comments WHERE user='ari_linn' AND jitemid = 0, then it might not be necessary for me to re-run the importer. I remember filing a support request several years ago after I discovered, much to my dismay, that I had many duplicate posts. The interesting thing about them was that one copy always had all of its comments imported while the other had none, and the commentless copy always had a lower public id (numbersnumbers.html) than the one with comments. For example, 257106.html is commentless, but 515532.html has 10 comments, just like the original entry. As I recall, the support guys couldn't figure out the problem or do anything about it, so I had to keep all the duplicates in my journal. I've recently identified 500+ of these duplicate commentless entries and manually deleted about 250 of them. My guess is that the comments with jitemid=0 belong to these duplicate posts. Since there's probably no way to link them back to their respective posts, and I've deleted half of those posts anyway, it would be best if you could simply delete the comments with jitemid=0 for me. That way I'll just delete the other 250 duplicate posts and go on with my blogging. You can always check whether you're irrevocably deleting something useful by matching comment posterids and dates with a query like this (in pseudocode):

-- Conditions on jitemid_normal go in ON, not WHERE, so unmatched jitemid=0 rows are kept
SELECT jitemid_zeros.* FROM comments jitemid_zeros
LEFT JOIN comments jitemid_normal ON jitemid_normal.user = jitemid_zeros.user AND jitemid_normal.jitemid <> 0
  AND jitemid_normal.posterid = jitemid_zeros.posterid AND jitemid_normal.createddate = jitemid_zeros.createddate
WHERE jitemid_zeros.user = 'ari_linn' AND jitemid_zeros.jitemid = 0 AND jitemid_normal.comment_id IS NULL

I'm not familiar with your actual db structure, but I imagine it to be something like this. If these jitemid_zeros rows are really just duplicates of successfully imported comments that are properly linked to posts, then you'll get an empty query result, because every duplicate will be matched.
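
To make the idea concrete, here's a minimal sketch of that check using Python's built-in sqlite3 module and an in-memory table. The table and column names (comments, comment_id, user, jitemid, posterid, createddate) are only my guesses carried over from the pseudocode, not the real Dreamwidth schema, and the rows are made up. Note that the conditions on the properly linked copy sit in the ON clause rather than in WHERE, so that jitemid=0 comments with no match still show up in the result.

```python
import sqlite3

# Hypothetical schema: column names are guesses, not Dreamwidth's real tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments ("
    "comment_id INTEGER PRIMARY KEY, user TEXT, jitemid INTEGER, "
    "posterid INTEGER, createddate TEXT)"
)
conn.executemany(
    "INSERT INTO comments (user, jitemid, posterid, createddate) VALUES (?, ?, ?, ?)",
    [
        ("ari_linn", 515532, 7, "2010-01-01 10:00:00"),  # properly linked comment
        ("ari_linn", 0, 7, "2010-01-01 10:00:00"),       # orphan duplicating the row above
        ("ari_linn", 0, 9, "2011-05-05 12:00:00"),       # orphan with no linked twin
    ],
)

# Anti-join: keep jitemid=0 rows for which no linked comment by the same
# poster at the same time exists in the same journal.
ORPHANS_WITHOUT_TWIN = """
SELECT z.comment_id, z.posterid, z.createddate
FROM comments z
LEFT JOIN comments n
  ON n.user = z.user AND n.jitemid <> 0
 AND n.posterid = z.posterid AND n.createddate = z.createddate
WHERE z.user = ? AND z.jitemid = 0 AND n.comment_id IS NULL
"""


def unmatched_orphans(connection, user):
    """Return jitemid=0 comments that have no properly linked duplicate."""
    return connection.execute(ORPHANS_WITHOUT_TWIN, (user,)).fetchall()


rows = unmatched_orphans(conn, "ari_linn")
print(rows)  # only the orphan from posterid 9 is reported
```

An empty result would mean every jitemid=0 comment has a linked twin and the DELETE loses nothing; any rows that do come back are the ones worth a second look before deleting.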

