As large language models continue to reshape professional domains, it's critical to address ethics. That is especially true in clinical and accessibility work, and doubly so for someone who is (sometimes hesitantly) building such tools. As a developer, a speech-language pathologist, and a stakeholder in other ways, I've been thinking about this a great deal. I want to outline the perspectives and guidelines to which I hold myself - and propose that others should - regarding the responsible use of the tools I've developed.
This is particularly important as I begin to broadly release such tools, and I am very conscious of the irony in that statement. I hope that you will allow me to borrow a moment of your time.
For Speech-Language Pathologists (SLPs), educators, and other clinicians, tools built on generative language models can assist with tasks like generating curriculum, creating templates, finding research, and brainstorming goals. However, they cannot grasp the complex needs of a patient or student the way a human clinician can. Outputs reflect training data and algorithms; they are not guaranteed to be accurate, comprehensive, or appropriate for every situation.
Always consult your professional training and judgment when using these tools. Think of them as pointer dogs, not sharpshooters—they can show you where to look, but it's up to you to get it right.
In accessibility, however, the immediate benefit of language models is overwhelming. I have tools addressing communication, home automation, and general access; the most popular model I've built is one that produces alt text. I don't mean any of that as a humblebrag - I'm just excited about what we could empower others to do. One system, for example, is designed to create eye gaze interfaces on the fly - it's not bad, and it's getting better. As a very wise woman once said: "What makes things easier for some, makes them possible for others."
Language models offer enormous opportunities, not least of which is the chance to seize the narrative in a way that benefits those who are among the most marginalized. I feel a personal obligation to get it right.
For some professions, such as graphic design and journalism, the risks currently outweigh the benefits. Importantly, the systems I build do not engage in generative imagery, which sidesteps two of the most controversial aspects of these applications: the co-opting of human creativity and the potential generation of fake or harmful visual content.
My focus is on functional, language-based AI that directly serves accessibility needs, so that the primary purpose of each tool is to support real, tangible human interaction and productivity. I am not comfortable with the idea of generating symbols for communication systems; it would be far too easy to lose or distort meaning among vulnerable populations.
The question of water and energy use is also challenging. I have chosen not to engage in highly energy-demanding processes - certainly not generative imagery or long-form writing and journalism. I focus solely on "grounded" language models: those tied directly to, and only to, knowledge they receive in real time from sources that I (or, more accurately and happily, users) choose.
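To make "grounded" concrete, here is a minimal sketch of the pattern, assuming a hypothetical Source record and a stand-in generate() backend; the names and the crude character budget are illustrative, not how any particular tool of mine is written:

```python
# Sketch: the model only ever sees text pulled from sources the user has
# chosen, and the prompt tells it to answer from that context alone.
from dataclasses import dataclass

@dataclass
class Source:
    name: str   # a user-selected document, feed, or note
    text: str

def build_grounded_prompt(question: str, sources: list[Source], max_chars: int = 4000) -> str:
    context, used = [], 0
    for s in sources:
        remaining = max_chars - used
        if remaining <= 0:
            break                      # stay within a rough context budget
        snippet = s.text[:remaining]
        context.append(f"[{s.name}]\n{snippet}")
        used += len(snippet)
    return (
        "Answer using ONLY the sources below. If the answer is not there, say you don't know.\n\n"
        + "\n\n".join(context)
        + f"\n\nQuestion: {question}\nAnswer:"
    )

# prompt = build_grounded_prompt("When is therapy scheduled?", user_sources)
# reply = generate(prompt)   # generate() stands in for whatever model backend is in use
```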
Optimization at my level, however, isn't where the real need lies; energy use remains a huge challenge that I both recognize and try to reduce. I am just a spectator in that larger world and cannot speak to the full ramifications, but while this technology offers transformative potential in accessibility, these benefits must not come at an unsustainable cost.
There are other issues to address, such as plagiarism, job loss, and long-term social cost. Large language models took every cultural and institutional bias in our language and baked it into the foundation of future technology, and they are emphatically not "AI." To build a model of the mind and expect thought is like building a model of the sea and expecting to get wet.
I acknowledge these challenges and strive to create tools that mitigate these risks as much as possible. I build with targeted, specialized functions in mind, which allows for more efficient use of resources through so-called "mixtures of experts" (MoEs): specialized components that sleep or wake to handle specific tasks, as the sketch below illustrates. None of them, however, writes position statements on the ethics of its own design; don't worry, as I'm sure someone will, that a language model wrote this statement about itself.
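As a loose illustration of that sleep/wake idea (not a literal MoE gating network, and not my production code), imagine a pool of narrow, task-specific models that are loaded only when their task comes up and released when idle; ExpertPool, the loader functions, and the idle timeout are all hypothetical:

```python
# Sketch: wake a small task-specific model on demand, let idle ones sleep.
import time

class ExpertPool:
    def __init__(self, loaders, idle_seconds=300):
        self.loaders = loaders        # task name -> function that loads a small model
        self.idle_seconds = idle_seconds
        self.active = {}              # task name -> (model, last_used)

    def run(self, task, payload):
        model, _ = self.active.get(task, (None, None))
        if model is None:
            model = self.loaders[task]()          # "wake" the expert for this task
        self.active[task] = (model, time.monotonic())
        self._sleep_idle()
        return model(payload)                     # each expert is assumed to be callable

    def _sleep_idle(self):
        now = time.monotonic()
        for task, (model, last) in list(self.active.items()):
            if now - last > self.idle_seconds:
                del self.active[task]             # let the idle expert be reclaimed
```

The point is resource use, not architecture: only the expert you actually need is resident at any moment.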
In fact, the majority of the models I've trained were built from the past communication of adults with degenerative disorders. While hallucination is still possible, I have a handful of controls; you'd be amazed what can be accomplished by bringing someone's entire corpus of communication into language system design - and such a system can now feasibly run on many users' own computers.
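One of those controls, reduced to a toy with a made-up threshold, is to check every suggested utterance against the person's own corpus and discard anything that drifts too far from how they actually write and speak; real systems use stronger similarity measures than plain word overlap:

```python
# Sketch: keep only suggestions that stay close to the user's own vocabulary.
def corpus_vocabulary(corpus: list[str]) -> set[str]:
    return {w.lower().strip(".,!?") for line in corpus for w in line.split()}

def in_user_voice(candidate: str, corpus: list[str], min_overlap: float = 0.8) -> bool:
    vocab = corpus_vocabulary(corpus)
    words = [w.lower().strip(".,!?") for w in candidate.split()]
    if not words:
        return False
    known = sum(1 for w in words if w in vocab)
    return known / len(words) >= min_overlap

# suggestions = [s for s in model_outputs if in_user_voice(s, users_past_messages)]
```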
At a very high level, my goal is to reduce the latency between intention and outcome in all domains—which just happens to be the whole point of accessibility. All technology starts as assistive technology, and I can say with confidence—having worked in both language modeling and voice synthesis—this has been, and will be, no different.
To protect user privacy, a small language model runs locally on a server I control (the product of many nerdy years in other fields). It captures queries and anonymizes personal information before passing the obscured text to a more powerful language model that does the heavier lifting. Sensitive data - in theory (I'll explain) - remains confidential during processing.
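In spirit, the pipeline looks like the sketch below, with the local model stood in by a couple of regexes and remote_model() standing in for the larger hosted model; the patterns are nowhere near complete and are only here to show the shape of the redact-and-restore round trip:

```python
# Sketch: replace identifiers with placeholders before anything leaves the
# local machine, then restore them in the reply.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(text: str):
    mapping = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            token = f"<{label}_{i}>"
            mapping[token] = match
            text = text.replace(match, token)
    return text, mapping

def restore(text: str, mapping: dict) -> str:
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

# redacted, mapping = redact(user_query)
# answer = remote_model(redacted)   # only the redacted text leaves the device
# final = restore(answer, mapping)
```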