Alphanumerical lists are sortable by alphabet and number, obviously, but if you have a list where each entry begins with a different punctuation mark (or any other kind of non-alphanumeric character), is there a similar standardised ordering method for them?

I imagine, for example, that a comma will come before whatever this is: ¦

I just tested an A-Z sort in Google Sheets where each cell was a different punctuation mark, and it seemed to rearrange what I’d entered into some sort of order, but is this order shared universally? Is there a global Unicode-compliant ordering method everyone uses?

Cheers!

    • fubo@lemmy.world
      2 months ago

      If your input is limited to ASCII, sure.

      But ASCII is only a 7-bit standard, and it supports only the characters needed by American English computer users in the 1960s. Lots of characters you might see in “plain text” are not part of ASCII, including all accented characters, all non-Latin alphabets, and many common symbols and punctuation marks such as these: £€¢©™°
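To make the 7-bit limit concrete, here is a minimal sketch that picks out the characters in a string that fall outside ASCII (code points above 127). The helper name `non_ascii_chars` and the sample string are made up for illustration.

```python
def non_ascii_chars(text: str) -> list[str]:
    """Return the characters of text whose code point is above 127 (outside 7-bit ASCII)."""
    return [ch for ch in text if ord(ch) > 127]


# Three ASCII punctuation marks followed by the symbols mentioned above.
sample = ",;!£€¢©™°"
outside = non_ascii_chars(sample)

# Print each non-ASCII character with its Unicode code point.
for ch in outside:
    print(f"{ch!r} U+{ord(ch):04X}")
```

The comma, semicolon and exclamation mark are filtered out because they are ASCII; every one of the other six symbols is reported.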

      (Yes, you could get accented characters in the pre-Unicode days using 8-bit “extended ASCII”, e.g. IBM/Windows code pages. However, those are not really ASCII and they will break if the text is interpreted as the wrong code page.)
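A small sketch of the code page problem, assuming Python's built-in `cp1252` (Windows Western European) and `cp437` (original IBM PC) codecs as the two code pages. The same byte decodes to different characters depending on which one the reader assumes.

```python
# Encode a pound sign using the Windows-1252 code page: a single byte, 0xA3.
raw = "£".encode("cp1252")

# Decoding with the code page it was written in gives the pound sign back.
as_1252 = raw.decode("cp1252")

# Decoding the same byte as IBM code page 437 gives a different character.
as_437 = raw.decode("cp437")

print(raw, as_1252, as_437)

# Bytes in the 7-bit ASCII range are the same in both code pages.
comma_round_trip = ",".encode("cp1252").decode("cp437")
```

Only the bytes above 127 are affected, which is why plain ASCII text survives the mix-up and everything else does not.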

      Unicode collation (the Unicode Collation Algorithm, UTS #10) is the Right Thing today.
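For contrast, this is what a naive sort does in Python: `sorted()` on strings compares raw code points, which is not Unicode collation. The sample list is invented for illustration. Real collation needs a library that implements it (PyICU is one option, not shown here) and may order these marks differently.

```python
# A few punctuation marks and symbols, including the comma and broken bar from the question.
marks = [",", "¦", "!", "€", "_"]

# Python compares strings by code point, so this is code-point order, not collation order.
by_code_point = sorted(marks)

for ch in by_code_point:
    print(f"{ch!r} U+{ord(ch):04X}")
```

In code-point order the comma (U+002C) does come before the broken bar (U+00A6), but that is a property of where the characters happen to sit in the code chart, not a linguistic ordering rule.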