I’ve noticed some files I opened in a text editor have all kinds of crazy unrenderable chars

  • vithigar
    2 months ago

    I think you are conflating a few different concepts here.

    Can you comment on the specific makeup of a “rendered” audio file in plaintext, how is the computer representing every little noise bit of sound at any given point, the polyphony etc?
    What are the conventions of such representation? How can a spectrogram tell pitches are where they are, how is the computer representing that?

    This is a completely separate concern from how data can be represented as text, and it will vary by audio format. The simplest case, PCM-encoded audio like in a .wav file, doesn’t concern itself with polyphony at all. It is just a quantised representation of the audio wave’s amplitude at successive instants in time, sampled tens of thousands of times per second. Whether it’s a single pure tone or a full symphony, the density of what’s stored is the same. It’s essentially just an air-pressure-over-time graph.
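    To make the "same density" point concrete, here is a rough sketch that renders one second of a single tone and one second of a three-note chord as raw 16-bit PCM. The sample rate, bit depth, note frequencies, and the `render_pcm` helper are all assumptions picked for illustration, not anything specific to a particular file.

```python
import math
import struct

SAMPLE_RATE = 44_100  # samples per second (assumed; a common rate for .wav files)

def render_pcm(frequencies, seconds=1.0):
    """Hypothetical helper: mix sine tones into 16-bit signed little-endian PCM bytes."""
    n_samples = int(SAMPLE_RATE * seconds)
    out = bytearray()
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        # Add the tones together, then scale so the mix stays within [-1, 1].
        amplitude = sum(math.sin(2 * math.pi * f * t) for f in frequencies) / len(frequencies)
        # Quantise the amplitude at this instant to a 16-bit integer.
        out += struct.pack("<h", round(amplitude * 32767))
    return bytes(out)

single_tone = render_pcm([440.0])                  # one pure tone
chord = render_pcm([261.63, 329.63, 392.00])       # three tones at once

print(len(single_tone), len(chord))  # 88200 88200
```

    Both buffers are the same size. The chord is not stored as three notes, only as a differently shaped sequence of amplitude values.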

    Is it the same to view plaintext as analysing it with a hex-viewer?

    “Plaintext” doesn’t really have a fixed definition in this context. It can be the same as looking at it in a hex viewer, if your “plaintext” representation is hexadecimal encoding. Binary data, like in audio files, isn’t plaintext, and opening it directly in a text editor is not expected to give you a useful result, or even a consistent result. Different editors might show you different “text” depending on what encoding they fall back on, or how they represent unprintable characters.
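    As a small illustration of why editors disagree, the sketch below decodes the same handful of bytes with three different encodings and also shows them as hex. The byte values are made up for the example (the first four happen to spell "RIFF"), and the choice of encodings is an assumption about what an editor might fall back on.

```python
# A few arbitrary bytes: four printable ASCII letters followed by non-text values.
raw = bytes([0x52, 0x49, 0x46, 0x46, 0x00, 0xE9, 0xFF, 0x80])

# Three plausible "fallback" decodings an editor might apply.
as_latin1 = raw.decode("latin-1")                    # every byte maps to some code point
as_cp1252 = raw.decode("cp1252", errors="replace")   # 0x80 is the euro sign here
as_utf8 = raw.decode("utf-8", errors="replace")      # invalid bytes become U+FFFD

# A hex view is a lossless textual representation of the same bytes.
as_hex = raw.hex(" ")

print(repr(as_latin1))
print(repr(as_cp1252))
print(repr(as_utf8))
print(as_hex)
```

    The three decoded strings differ from one another even though the underlying bytes never changed, while the hex view is the same no matter which tool produces it.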

    There are several methods of representing binary data as text, such as hexadecimal, base64, or uuencode, but none of these representations, if saved as-is, is the original file, strictly speaking. Each has to be decoded again to get the original bytes back.
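    A minimal sketch of that distinction, using Python's standard `base64` and `binascii` modules on a few made-up bytes: each textual form decodes back to the original, but the text itself is a different sequence of bytes.

```python
import base64
import binascii

# Arbitrary binary data, including bytes that are not printable text.
original = bytes([0x00, 0xFF, 0x10, 0x80, 0x7F, 0x41])

# Three textual representations of the same data.
hex_text = original.hex()
b64_text = base64.b64encode(original).decode("ascii")
uu_text = binascii.b2a_uu(original).decode("ascii")  # a single uuencoded line

# Each one decodes back to exactly the original bytes...
assert bytes.fromhex(hex_text) == original
assert base64.b64decode(b64_text) == original
assert binascii.a2b_uu(uu_text) == original

# ...but the encoded text, saved as-is, is not the original data.
assert hex_text.encode("ascii") != original
assert b64_text.encode("ascii") != original

print(hex_text)
print(b64_text)
print(uu_text.rstrip("\n"))
```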