css/1/documentation

diagram by Dorothy R. Santos

The source images are exported from DICOM directories, a format for medical images provided on DVDs and as downloads.

The piece produces media as it works.

The original PNG is converted to an SSTV WAV using pySSTV (python SSTV. if you think the name is sus, wait until you hear about CompletePiSSTV - the pi for raspberry pi of course). This involves scaling the image down to fit in an 800x616 pixel image, which is the resolution of the PD290 SSTV spec (one of the higher-resolution color specifications).
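A minimal sketch of the scaling arithmetic, assuming the image is resized to preserve its aspect ratio and then centered in the 800x616 frame. Whether the piece pads, crops, or stretches is not stated here, so `fit_within` and its centering offsets are illustrative only.

```python
# PD290 frame size as given in the text above.
PD290_SIZE = (800, 616)


def fit_within(src_w, src_h, frame=PD290_SIZE):
    """Return (new_w, new_h, x_off, y_off) for fitting a source image
    inside the frame while preserving aspect ratio and centering it."""
    frame_w, frame_h = frame
    # The tighter of the two axis ratios decides the scale factor.
    scale = min(frame_w / src_w, frame_h / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    # Offsets at which the resized image would be pasted into the frame.
    return new_w, new_h, (frame_w - new_w) // 2, (frame_h - new_h) // 2


# A hypothetical square 2048x2048 export ends up 616x616, centered horizontally.
print(fit_within(2048, 2048))
```

The resulting size and offsets could then be handed to an image library for the actual resize and paste before SSTV encoding.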

The WAV is broadcast from another pi (the zero) and then received as a WAV. This WAV is resampled to another WAV (shifting from a 32k to a 44.1k sample rate), the latter of which is turned back into an image with slowrx-cli (a CLI fork of the amazing Oona Raisanen's slowrx).
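To illustrate what the 32k to 44.1k rate change involves, here is a rough sketch using linear interpolation on a list of samples. The tool actually used for this step is not named above, and a real resampler would apply proper filtering; `resample_linear` is a hypothetical helper.

```python
def resample_linear(samples, src_rate=32000, dst_rate=44100):
    """Resample a mono sequence of samples by linear interpolation."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        # Position of output sample i, measured in input samples.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# One second of silence at 32 kHz becomes one second at 44.1 kHz.
print(len(resample_linear([0] * 32000)))
```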

I'm using a short (7-8") jumper wire as an antenna attached to a GPIO pin on the pi zero. The signal is received on a USB SDR (hackRF pro is still waiting to ship) on the pi 5, a few feet away. I want to make things for the minipcs to go into as part of an installed version but first things first. The pi 5 has ssh remote access to the pi zero and sends it commands to transmit WAVs and upscale images.
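One way the pi 5 could drive the pi zero over ssh, sketched under assumptions: the host name `pi@pizero.local` and the remote script `./transmit.sh` are hypothetical placeholders, not names from this project.

```python
import shlex


def ssh_argv(host, remote_argv):
    """Build the local argv that runs remote_argv on host over ssh.

    The remote command is quoted so that file names containing spaces
    survive the remote shell.
    """
    return ["ssh", host, shlex.join(remote_argv)]


# Hypothetical host and script names, for illustration only.
argv = ssh_argv("pi@pizero.local", ["./transmit.sh", "scan 01.wav"])
print(argv)
# The list could be passed to subprocess.run(argv, check=True) to execute it.
```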

The transmission and reception are more delicate than I imagined. Slight repositioning seems to throw off the ability of the pi 5 to decode SSTV images. It's strange because sometimes it seems like the transmission signal loses amplitude and is overcome by noise in a way that doesn't block the decoding (just makes it very noisy).

The image is upscaled using Real-ESRGAN-ncnn-vulkan from 800x616 to 3200x2464. I'm starting to run this twice in succession to produce a 12800x9856 version. The second pass takes over an hour. (The process is killed on the pi zero at 99%; the pi 5 could handle it, but Real-ESRGAN isn't building on that pi.)
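The sizes above follow from two 4x passes. A small sketch of the arithmetic, which also shows how quickly the pixel count grows (the function name is illustrative):

```python
def upscale_passes(width, height, factor=4, passes=2):
    """Return the image size before and after each upscaling pass."""
    sizes = [(width, height)]
    for _ in range(passes):
        width, height = width * factor, height * factor
        sizes.append((width, height))
    return sizes


for w, h in upscale_passes(800, 616):
    print(f"{w}x{h}: {w * h / 1e6:.1f} megapixels")
```

Each 4x pass multiplies the pixel count by 16, so the second pass reads roughly 7.9 megapixels and writes about 126. That may be part of why it is so much heavier than the first.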

Upscaled image is copied to findings and passed to a vision model with instructions to list several objects even if it isn't fully sure of what it sees.

You are an Advanced AI Medical Imaging Analysis Service. Using superintelligent agents to analyze medical images, detect anomalies, and provide insights into patient care.

I'm using a few different vision models. The only one that claims to be based on medical images does not actually have a vision model. It just pretends to read an image. Yikes. Actually, maybe I should just use this one for now?

The ... company? ... organization? Whatever "ALIENTELLIGENCE" is ... that put out the medical imaging model I'm using seems incredibly cursed. Some of the other models they've published on ollama are:

whiterabbit a [sic] AI Hacking Assitant
doomdaysurvivalist The AI Doomsday Survivalist is designed to offer practical advice and safety-oriented tips for surviving in various extreme and catastrophic scenarios, suitable for those interested in comprehensive emergency preparedness and survival skills.
chemicalengineer AI Chemical Engineer
avengineer AI Audio Visual AV Engineer
doctorai AI Doctor - I am an Advanced AI general medical practitioner ready to assist you in finding answers to your questions
holybible AI Holy Bible
sarah AI Sarah, A Loving and Caring Girlfriend

This is just a drop in the bucket of models published by the same organization. Several of them claim to analyze images but do not actually have vision functionality. All of the models I've listed are exactly 4.7 GB. From what I can tell they are all identical Llama 3.1 models with slightly different system prompts. The medical imaging system prompt is:

    You are an Advanced AI Medical Imaging Analysis Service.  Using superintelligent agents to analyze medical images, detect anomalies, and provide insights into patient care.

For now I'm using the ALIENTELLIGENCE model, as well as the llava and gemma3 vision models. They sometimes take upwards of ten minutes, during which instructions for CT scan breathing protocols are occasionally transmitted.
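A sketch of how an image might be handed to one of these models, assuming they are served through a local Ollama instance and its REST API. The prompt text and the placeholder image bytes are illustrative, and nothing is sent over the network here.

```python
import base64
import json


def build_vision_request(model, prompt, image_bytes):
    """Build a request body for Ollama's /api/generate endpoint.

    Images are passed as base64 strings; stream=False asks for a single
    JSON reply instead of a stream of chunks.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


# Placeholder bytes stand in for the upscaled PNG; the prompt is hypothetical.
payload = build_vision_request(
    "llava",
    "List several objects you see, even if you are not fully sure.",
    b"\x89PNG placeholder",
)
body = json.dumps(payload).encode("utf-8")
print(len(body), "bytes ready to POST")
```

POSTing `body` to a running server's `/api/generate` endpoint would return JSON whose `response` field holds the model's text. Given the ten-minute waits, a generous timeout would be needed.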

The models format their lists in a few different ways, but ultimately each response is converted to JSON, and then the item list is converted to text.
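A rough sketch of that normalization, assuming replies arrive either as JSON or as bulleted or numbered lines. The `items` key and the function names are hypothetical, and the real replies may need more cases than this.

```python
import json
import re

# Matches "- item", "* item", "1. item", or "2) item" and captures the item.
LIST_LINE = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+(.*\S)")


def extract_items(reply):
    """Pull a list of item strings out of a model reply."""
    try:
        data = json.loads(reply)
        if isinstance(data, dict):
            data = data.get("items", [])
        if isinstance(data, list):
            return [str(x).strip() for x in data if str(x).strip()]
    except json.JSONDecodeError:
        pass  # Not JSON, so fall back to line-by-line parsing.
    items = []
    for line in reply.splitlines():
        match = LIST_LINE.match(line)
        if match:
            items.append(match.group(1))
    return items


def to_json_and_text(reply):
    """Return (JSON string, plain text) for the items found in a reply."""
    items = extract_items(reply)
    return json.dumps({"items": items}), "\n".join(items)


print(to_json_and_text("I see:\n1. rib\n2) lung\n- spine"))
```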

My original plan was to use a synthesized voice (TTS) to read these text lists, and there are a lot of ai-based voice synths (vocaloid!), but I felt it would overload the work. Voices relay so much information: they are gendered, racialized, aged, accented... I don't want to project those things arbitrarily onto this code.

So instead I chose RTTY. I had encountered radio teletype before when I was preparing a module on ASCII art for a lecture. RTTY was like ASCII art before ASCII; instead it uses BAUDOT - a character set developed for telegraphy in 1870. No square or curly brackets, no lowercase; it can't do JSON. I am looking into also broadcasting associated metadata in JSON (which requires full ASCII). RTTY encodes BAUDOT into 45.45 baud FSK (frequency-shift keying), very similar to TDD devices (Telecommunications Device for the Deaf); TDD/TTY uses 2 stop bits whereas RTTY uses 1.5. The message is otherwise encoded identically.
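To make the character-set limits and the framing concrete, here is a sketch that reduces text to what Baudot can carry and estimates airtime at 45.45 baud. It assumes the US-TTY figures set and a receiver that starts in letters shift. The helper names are hypothetical.

```python
import string

BAUD = 45.45  # bits per second
LETTERS = set(string.ascii_uppercase)
# Figures-shift characters, assuming the US-TTY variant of the code.
FIGURES = set("0123456789-'$!&#()\"/:;?,.")
UNSHIFTED = set(" \r\n")  # available in both shifts


def to_baudot_text(text):
    """Uppercase the text and blank out anything Baudot cannot represent."""
    allowed = LETTERS | FIGURES | UNSHIFTED
    return "".join(ch if ch in allowed else " " for ch in text.upper())


def count_codes(text):
    """Count 5-bit codes needed, including LTRS/FIGS shift codes."""
    shift = "LTRS"  # assumed starting state
    codes = 0
    for ch in text:
        need = "LTRS" if ch in LETTERS else "FIGS" if ch in FIGURES else shift
        if need != shift:
            codes += 1  # one extra code to change shift
            shift = need
        codes += 1
    return codes


def airtime_seconds(text, stop_bits=1.5):
    """Each code is framed as 1 start bit + 5 data bits + stop bits."""
    return count_codes(text) * (1 + 5 + stop_bits) / BAUD


msg = to_baudot_text("[lung, rib 12]")
print(repr(msg), airtime_seconds(msg), airtime_seconds(msg, stop_bits=2.0))
```

With 1.5 stop bits each character occupies 7.5 bit times (about 165 ms). With 2 stop bits it occupies 8 (about 176 ms).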

These RTTY signals are recorded as WAVs using Minimodem, which also decodes them. At various points, RTTY is also used to deliver progress updates that are not collected. Minimodem is an all-purpose FSK encoder/decoder that can emulate dial-up modems and en/decode RTTY, TTY, and other FSK formats.
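A sketch of how Minimodem might be called to write and read these WAVs, assuming its `--tx`/`--rx` modes, the `--file` option, and the `rtty` baud mode. The exact options used in the piece are not given, and `record_rtty` only does anything if `minimodem` is installed.

```python
import subprocess


def minimodem_tx_argv(wav_path, mode="rtty"):
    """argv that encodes text from stdin into an audio file."""
    return ["minimodem", "--tx", "--file", wav_path, mode]


def minimodem_rx_argv(wav_path, mode="rtty"):
    """argv that decodes an audio file back to text on stdout."""
    return ["minimodem", "--rx", "--file", wav_path, mode]


def record_rtty(message, wav_path):
    """Write message to wav_path as RTTY audio (requires minimodem)."""
    subprocess.run(
        minimodem_tx_argv(wav_path),
        input=message.encode("ascii", errors="replace"),
        check=True,
    )


print(minimodem_tx_argv("findings.wav"))
```

Swapping `mode` to `tdd` would select the 2-stop-bit TTY/TDD framing instead.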

Then the enhanced image is converted to 800x616 for the next SSTV cycle (all of the above). When one of these loops completes, a recording of a bell ringing plays. This recording was taken after my last radiation treatment in 2023. After the 23rd cycle of an image (its last loop), a different bell recording plays, made with an heirloom bell once owned by Ella May, a schoolteacher during the Spanish Flu pandemic. Bells can call to order, connote alarm or celebration, and they mark time.

some todo/housekeeping

findings shouldn't just be a directory index... have sstv images in tight rows of thumbnails that link to full. lots of sound players on same page so we can mix them up.

metadata.... need logging from the fm_transmitter with timestamps. will fold some exif in as well but it will get lost in the air. need to organize data per source

cancer surveillance study #1

enhance-redact: slow scanxiety television