Jun. 5th, 2020

[personal profile] roadrunnertwice

I've been working on a thing, and I could use some feedback on the implementation. It might take a little explaining (there's a fair amount of backstory), but I'll try to be as concise as possible.

Backstory: Raw and transformed text

DW stores the text of entries and comments raw, exactly as the user entered it. Then, whenever we need to display that text, we transform it to produce nice legible HTML. Those transformations include:

  • Turning <user> tags (which aren't real HTML) into the rendered [profile] user form, which is really a span plus an image plus a link.
  • Handling <cut> tags.
  • Several other things that aren't important right now.

Most of those get applied to everything we display. But there are also some OPTIONAL transformations:

  • Turning normal line breaks into HTML <br> tags.
  • Turning bare URLs into clickable links.

Those get applied by default, but we've always had a "don't autoformat" checkbox (inherited from LJ) that could be used to disable them for an entry or comment. (BTW, under the hood the RTE saves entries as "don't autoformat.")
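As a rough illustration of those two default transforms and the opt-out, here is a minimal Python sketch. It is not the actual LJ::CleanHTML code (which is Perl); the names `autoformat`, `no_autoformat`, and `URL_RE` are hypothetical, and the URL pattern is deliberately simplistic. It also assumes the input contains no existing HTML links, which the real cleaner has to cope with.

```python
import re

# Hypothetical, simplified pattern for bare URLs. A real implementation
# must also avoid re-linking URLs that already sit inside <a href="...">.
URL_RE = re.compile(r"https?://[^\s<>\"]+")


def autoformat(raw: str, no_autoformat: bool = False) -> str:
    """Apply the two default optional transforms unless the author opted out."""
    if no_autoformat:
        # "Don't autoformat": leave line breaks and bare URLs exactly as stored.
        return raw
    # Turn bare URLs into clickable links.
    linked = URL_RE.sub(
        lambda m: f'<a href="{m.group(0)}">{m.group(0)}</a>', raw
    )
    # Turn normal line breaks into HTML <br> tags.
    return linked.replace("\n", "<br>\n")
```

The point of the sketch is only that the stored text never changes; the flag decides what happens at display time.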

Then, later, DW added some other optional transforms, which had their own special enabling conditions:

  • Turning Markdown into HTML. (For entries that start with a special !markdown glyph, or comments submitted by email.)
  • Turning @mentions into user tags. (Currently applies to everything except "don't autoformat," but gets suppressed within certain HTML elements or their Markdown equivalents.)
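To make the enabling conditions concrete, here is a hedged Python sketch of how the set of optional transforms might be chosen for one piece of text, based only on the rules described above. The `Submission` type, its field names, and the transform labels are all hypothetical, and the sketch does not model how the transforms interact with each other (for example, how Markdown and line-break handling combine).

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical stand-in for an entry or comment as stored (raw)."""
    text: str
    is_comment: bool = False
    via_email: bool = False
    no_autoformat: bool = False


def optional_transforms(s: Submission) -> set[str]:
    """Return the labels of the optional transforms that would apply."""
    active: set[str] = set()
    if not s.no_autoformat:
        # The original two defaults, plus @mentions, which currently
        # applies to everything except "don't autoformat".
        active |= {"line_breaks", "auto_links", "mentions"}
    # Markdown: entries starting with the !markdown glyph,
    # or comments submitted by email.
    is_markdown_entry = not s.is_comment and s.text.startswith("!markdown")
    is_email_comment = s.is_comment and s.via_email
    if is_markdown_entry or is_email_comment:
        active.add("markdown")
    return active
```

Written out this way, the conditions are visibly ad hoc: each transform has its own unrelated trigger, and nothing in the stored data says which format the author intended.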

ALL of these transforms get handled by something called the "HTML cleaner," at LJ::CleanHTML. At this point "cleaner" is kind of a misnomer; it's really the central place where we handle every transformation of raw user-entered text into a fragment of display HTML.

The problems

In my understanding, the current state of affairs has two main problems:

  • The interface for choosing text transforms is incoherent. That happened gradually; we've added new transformations over time, and changed the interactions between them, and now it's weird:
    • Half of the interfaces for entering entry/comment text don't even include the "don't autoformat" checkbox anymore.
    • The way you enable Markdown has always been a mystery. For example, I want to use Markdown in comments (because typing HTML angle brackets on a telephone is bullshit), but currently it's impossible except when responding via email.
  • Introducing new text transforms is dangerous and chaotic. In mid-2019, we enabled @mentions for HTML-formatted content (previously they only worked in Markdown content), and about 40% of hell broke loose:
    • Current content suffered because we didn't have a good way to beta-test @mentions, so we didn't have a chance to learn about bugs and edge cases from our users (who are more ingenious at doing weird textual shit than we are) before enabling them globally.
    • Old posts weren't written to expect @mentions, so we ended up totally vandalizing any historical post that ever discussed CSS code (@media (min-width: etc...)), Perl or Ruby or Objective-C code, or a wide variety of other things that involve @ signs.
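The failure mode with old posts is easy to reproduce. The Python sketch below uses a deliberately naive mention pattern, which is an assumption for illustration and not Dreamwidth's actual matching rule, and applies it to text that was written before @mentions existed. The function name `naive_mentions` is hypothetical.

```python
import re

# Deliberately naive, NOT the real rule: treat any "@word" as a mention.
NAIVE_MENTION_RE = re.compile(r"@(\w+)")


def naive_mentions(text: str) -> str:
    """Rewrite every @word into a <user> tag, with no context checks."""
    return NAIVE_MENTION_RE.sub(r'<user name="\1">', text)


old_post = "Use @media (min-width: 600px) { ... } for the wide layout."
print(naive_mentions(old_post))
# prints: Use <user name="media"> (min-width: 600px) { ... } for the wide layout.
```

Nothing in the stored text distinguishes a post that intends @mentions from one that merely contains an @ sign, so a transform switched on retroactively has no safe way to tell them apart.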

Questions: Does anyone disagree with those two problems or my characterization of them? Does anyone see any closely related problems that I'm not recognizing here?

The solution, maybe

( Read more... )

Dreamwidth Open Source Development

Page generated Aug. 12th, 2025 06:05 pm