We are (somewhat surprisingly) receiving a lot of requests from end users to implement speech-to-text features, as this would make it easier for users to enter information on the go.
Has Appfarm considered creating a speech-to-text action, or something that allows the user to generate strings using the microphone?
Apart from the general app usability and accessibility benefits, this would also be a great addition because we are beginning to integrate AI agents in our apps. Having the ability to record a message, convert it to a string, and then send it to the agent would be a really great feature.
Thank you, that is a very interesting feature request!
We haven’t explicitly considered this feature yet, but we are currently working to bring several smart features to Create. I see several potential solutions here: a separate action node, building it into the text edit component as a setting you can enable, or releasing it as a shareable component/integration.
I will register the feature request internally and we will discuss the possibilities!
This is a very relevant discussion! Hæhre is also looking for speech-to-text functionality and would greatly benefit from such a feature. We see significant potential in making it easier for users to input information on the go, particularly in field operations where typing can be impractical.
Additionally, as we are starting to integrate AI agents into our applications, having the ability to record messages, convert them into text, and process them accordingly would be extremely valuable. We would love to see this implemented, whether as an action node, a setting in the text edit component, or a shareable integration.
Looking forward to hearing more about the possibilities!
That is a really good point, and I think it solves all our needs.
The only remaining reason I can think of for a dedicated feature is that it would be easier for end users if we could trigger the microphone automatically for certain actions, but this should probably be enough. Maybe it is possible to trigger the built-in speech-to-text function by using a script in the app, and if that is the case, then a dedicated feature would definitely not be necessary.
where updateText is another action that updates a text variable. The result was that I was able to trigger speech-to-text on any device, without having to set this up in the settings.
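For readers who want to try something similar, here is a minimal sketch of how a script might trigger speech recognition in the browser. It is not the script from the post above. It assumes the browser exposes the Web Speech API as `SpeechRecognition` or `webkitSpeechRecognition`, and it models `updateText` as a plain callback. The names `startDictation`, `findRecognitionCtor`, and `RecognitionLike` are made up for illustration.

```typescript
// Minimal shape of the browser speech recognition object used below.
interface RecognitionLike {
  lang: string;
  interimResults: boolean;
  onresult: ((event: any) => void) | null;
  start(): void;
}

type RecognitionCtor = new () => RecognitionLike;

// Look up the browser's constructor; returns undefined where unsupported.
function findRecognitionCtor(scope: any = globalThis): RecognitionCtor | undefined {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition;
}

// Start one dictation session and pass the final transcript to updateText.
// Returns false when speech recognition is not available.
function startDictation(
  updateText: (text: string) => void,
  lang: string = "en-US",
  Ctor: RecognitionCtor | undefined = findRecognitionCtor(),
): boolean {
  if (!Ctor) return false;
  const recognition = new Ctor();
  recognition.lang = lang;
  recognition.interimResults = false; // only deliver final results
  recognition.onresult = (event) => {
    // results[0][0] is the top alternative of the first result
    updateText(event.results[0][0].transcript);
  };
  recognition.start();
  return true;
}
```

In an app, `updateText` would be whatever function or action writes the transcript to the text variable. The constructor is passed as a parameter so the function can be exercised without a microphone, and so that unsupported browsers fall through to `false` instead of throwing.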
After looking into the data handling and privacy, it turns out this is not suitable for enterprise apps after all, as the features provided by both the OS and the browser collect data, which is a no-go for us.
The only solution I see as of now is to download and host our own models, using something open source, and then integrate with that.
A secure speech-to-text feature would be very much desired for enterprise use, and we hope this is something Appfarm would consider.
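To make the self-hosted idea concrete, here is a rough sketch of what the client side of such an integration could look like. Everything about the service is an assumption: the endpoint URL is hypothetical, as is the contract that it accepts a multipart upload with a `file` field and responds with JSON of the form `{ "text": "..." }`. An actual self-hosted model server may expose a different interface.

```typescript
// Assumed response shape of a hypothetical self-hosted transcription service.
interface TranscriptionResponse {
  text: string;
}

// Narrow view of fetch, so a stand-in can be supplied for testing.
type FetchLike = (
  url: string,
  init: { method: string; body: FormData },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Upload recorded audio to the self-hosted endpoint and return the transcript.
async function transcribe(
  audio: Blob,
  endpoint: string,
  fetchImpl: FetchLike = fetch as unknown as FetchLike,
): Promise<string> {
  const form = new FormData();
  // "file" and "recording.webm" are assumed names, not a known API contract.
  form.append("file", audio, "recording.webm");
  const response = await fetchImpl(endpoint, { method: "POST", body: form });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const data = (await response.json()) as TranscriptionResponse;
  return data.text;
}
```

Because the audio only travels to an endpoint the organization controls, this shape avoids the OS and browser dictation services entirely. Recording the audio in the app and securing the endpoint are separate concerns not covered here.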
I tried to look into this and as far as I can see, neither Microsoft nor Apple stores the content provided by speech-to-text at rest.
Could you provide us with your privacy requirements for a speech-to-text feature, and explain how the current implementations in the different operating systems and browsers conflict with them? This would inform our choices when exploring ways to support this feature.
Since we handle a lot of non-public market information, and many of the use cases involve sensitive data, we cannot allow users to dictate/transcribe if there is a chance that the data is shared outside of the organization.
From what I read, Apple's dictation appears to take this into consideration; however, Apple states that it sends the requests to its servers and might store them if the user has opted in.
In short, our requirement is that the voice input and text output must not be shared with or accessible to any third party, and that users must not be able to inadvertently allow this.