Sep. 30th, 2011

jeshyr: Programming dreamsheep (Dreamwidth - Development)
[personal profile] jeshyr
[This ended up being sorta like a walkthrough-in-progress ... I'll turn it into a proper walkthrough if I ever finish it!]

Initial pokings at bug 1964 - taglists truncated when crossposting.

First I tried looking at the LJ code and found Changeset 9887 coded by [personal profile] zorkian himself about 5 years ago. I couldn't figure out how that code could produce the results noted in the support requests though.

Secondly I tried setting up crossposting from my Dreamhack to Dreamwidth itself to see if I could replicate the bug, because it seemed it would be easier to isolate if it was in the DW code too. Apparently it does happen in the DW code, which is handy ... these are the tests I ran:

Test 1: Non-new tags
( tag text in various states )

Test 2: New Tags
( tag text in various states )

Test 3: Longer New Tags
( tag text in various states )

Conclusions so far:

  1. This happens between two DW instances so it's not an LJ-specific bug

  2. It doesn't matter if the tag is new to the system or not

  3. The length of individual tags is not significant, only the length of the whole tag string

  4. The raw tag string is being truncated at 254 characters. I got 253 a couple of times, but on those occasions there was no truncated tag, so I hypothesise that the 254th character was a space.



So something in the protocol is truncating this string at 254 characters either before it leaves the sending system, or before the receiving system has a chance to parse the string into an array of tags.
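To make the hypothesis concrete, here is a rough sketch of "truncate first, parse second". It is written in Python purely for illustration and is not the DW/LJ Perl code; the function names, the sample tags, and the comma-separated parsing are all assumptions. The only figure taken from the tests above is the 254-character cut-off.

```python
LIMIT = 254  # cut-off observed in the tests; where it is enforced is still unknown


def truncate_raw(raw, limit=LIMIT):
    """Hypothetical stand-in for whatever is cutting the raw tag string."""
    return raw[:limit]


def parse_tags(raw):
    """Split a comma-separated tag string into a list of trimmed tags."""
    return [tag.strip() for tag in raw.split(",") if tag.strip()]


# Case 1: the cut lands in the middle of a tag, so the last tag is mangled.
case1 = ", ".join("tag%04d" % i for i in range(40))
cut1 = truncate_raw(case1)
print(len(cut1), parse_tags(cut1)[-1])           # 254 ta

# Case 2: the 254th character is the space after a comma, so trimming
# whitespace leaves 253 characters and the last tag is intact.
case2 = ", ".join(["test"] + ["tag%03d" % i for i in range(40)])
cut2 = truncate_raw(case2)
print(len(cut2.rstrip()), parse_tags(cut2)[-1])  # 253 tag030
```

Case 2 matches the 253-character results: if the cut falls on a separator space, nothing looks truncated even though the rest of the string is still lost.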

The code for the protocol used by the crossposting system is largely LJ::Protocol, stored in $LJHOME/cgi-bin/ljprotocol.pl.

... at this point I vaguely remember seeing something else potentially relevant in the LJ changesets I waded through so I go back and discover Changeset 9767 and poke at weblib.pl.

It indeed seems that we do still have this limit in DW (it's at line 2192 at the moment), so I delete the limit and restart everything.

Results now on DW are .... exactly the same.

At this point I am thinking throwing things is an attractive option so I am posting this ...

HALP!

[ETA: OMG headdesk headdesk ... realised literally 1 second after I pressed post of course: I need that patch on the *receiving* system not the sending one! I'm testing the wrong THING... back to the checking board for me.]
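A toy model of why patching only the sending side changed nothing, again in Python for illustration only. The `crosspost` function and its two limit parameters are hypothetical, and the model assumes each system may apply its own cut-off to the string as it passes through; the real code paths in ljprotocol.pl and weblib.pl may differ.

```python
def crosspost(raw_tags, sender_limit, receiver_limit):
    """Pass the tag string through two hops; None means no limit at that hop."""
    for limit in (sender_limit, receiver_limit):
        if limit is not None:
            raw_tags = raw_tags[:limit]
    return raw_tags


raw = ", ".join("tag%04d" % i for i in range(40))  # 358 characters

unpatched = crosspost(raw, 254, 254)        # both systems still have the limit
sender_patched = crosspost(raw, None, 254)  # limit removed on the sender only
both_patched = crosspost(raw, None, None)   # limit removed on both systems

print(len(unpatched), len(sender_patched), len(both_patched))  # 254 254 358
```

Under these assumptions the receiving system's limit alone is enough to reproduce the bug, which is consistent with seeing exactly the same results after patching the Dreamhack.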
pauamma: Cartooney crab wearing hot pink and acid green facemask holding drink with straw (Default)
[personal profile] pauamma
Currently, the sphinx indexer works by looping over users and copying new and edited entries into the sphinx database. Those copies are scheduled periodically using bin/schedule-copier-jobs, and also per-user when entries are created or edited. But in the latter case, the copier still searches the whole log2 table for that user's entries instead of only copying the affected entry. Does anyone know what the cost of that is compared to the indexing itself? If it's significant, http://dw-dev.dreamwidth.org/97273.html may also be used to trigger per-entry copying. Does anyone have performance/resource use figures, or opinions?

Profile

dw_dev: The word "develop" using the Swirly D logo.  (Default)
Dreamwidth Open Source Development
