Much more to come soon!

Until then, enjoy having a super easy-to-set-up status bar above your head that can track your computer’s RAM, CPU, GPU, and VRAM resources. And of course, Spotify integration lets everyone see what you’re listening to, or whether you have it paused or muted.

It all works together with Speech To Text and won’t interfere with your chat messages at all!

The speech to text bubble is now officially a Poiyomi third-party mod!

A lot of the Poiyomi features are now much easier to add to the messagebox, and there is a huge performance improvement as well! The messagebox has gone from nearly 800 polys to under 400, and from 4 materials to 1! You will also notice a difference in box sizes when uploading the new Speech Bubble on your avatar.

And with the addition to the global mask in Poiyomi, customization becomes much easier and nearly boundless: you can create an outline, change the color of the text, fonts, emissions, and much more!

And a whole bunch of other features Poiyomi provides! Shoutout to them and their Discord server!

Along with the new shader comes an overhaul of the messagebox itself. Although it looks very similar to the old one, it has a lot of new features and performance enhancements.

Avatar Dynamics have been added! Simply grab the box and drag it into the world, or put it back on your head (or anywhere else you want to put the Avatar constraint).

Our next steps will involve customizability. Now that we have a solid framework to build from, we will be adding new messagebox shapes, fonts, and emotes in the near future! Stay tuned for more updates from us!

Server links:

Poiyomi Patreon: https://www.patreon.com/poiyomi
Poiyomi Discord: https://discord.gg/poiyomi
VRCSTT Patreon: https://www.patreon.com/RabidCrab
VRCSTT Discord: https://discord.gg/vrcstt

This release isn’t too exciting, but it was a necessary series of bug fixes and optimizations that spans multiple updates.

Bugs were becoming a big problem. Everything from the voice recorder to the AI voice playback had some sort of major bug in it that needed to be squashed. And squashed they were! Here’s a (not comprehensive) list of the additions and bug fixes in 1.7:

Additions

  • Added FonixTalk, also known as the Moonbase Alpha voice... aeiou
  • Added some TikTok voices
  • Added SAM because why not. Just a heads up, the voice is LOUD
  • Added a cooldown to limit how often the button can be pressed, as well as a noise that will play if you press record/interrupt record too quickly and your request gets denied
  • Added a queued calling system to the AI voices so translation can continue working while audio is playing
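
To illustrate the queued calling idea from the last item, here is a minimal Python sketch. It is not the program’s actual code: `VoiceQueue`, `submit`, `wait_until_idle`, and the `play` callback are hypothetical names, and it assumes playback is a blocking call. The point is only that the caller hands a message to a queue and returns right away, while a single worker thread plays messages in order.

```python
import queue
import threading


class VoiceQueue:
    """Plays messages one at a time on a background thread (illustrative only)."""

    def __init__(self, play):
        # `play` is a hypothetical blocking function that speaks one message.
        self._play = play
        self._jobs = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, text):
        # Returns immediately, so translation can keep working during playback.
        self._jobs.put(text)

    def wait_until_idle(self):
        # Blocks until every submitted message has finished playing.
        self._jobs.join()

    def _run(self):
        while True:
            text = self._jobs.get()
            try:
                self._play(text)
            finally:
                self._jobs.task_done()
```

With a single worker, messages are spoken in the order they were submitted, and a slow playback call never blocks the code that produced the message.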

Major fixes & Hotfix

  • Removed the duplicate issue caused by mixing the partial translation results with the complete translations. It should still perform just as quickly, but now it won’t show both the partial and complete message in the same message
  • Fixed an issue with Azure not getting all of the audio data it needed when sending a message. This made it so it wasn’t hearing the first couple of words you spoke
  • Fixed an issue where disabling and enabling continuous recognition was causing it to freeze up
  • Fixed audio playing twice in rapid succession in some cases
  • Fixed translation not working as intended
  • Finally fixed the recording/interrupt recording button for Azure. It will consistently work now
  • Fixed issue with program breaking after the first message if the Custom Messagebox option was disabled and the VRChatbox option was enabled
  • Removed ability to create an empty keyword for emotes
  • Fixed an issue with the program speaking the same message twice in Continuous mode
  • Fixed issue with Show Partial Messages not working
  • Fixed an issue with the last sentence of a message not always getting spoken
  • Fixed issue with interrupting translations for both Deepgram and Azure
  • Spamming the recording button no longer causes a crash
  • Fixed an issue with AEIOU and SAM not working in all cases
  • The previous update added Standard voices for AWS which would not work because they weren’t Neural. Program now handles Standard voices correctly
  • Analog keybindings weren’t working at all. Wonder how long that was a bug….
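
For anyone curious how a partial result and a complete result can end up in the same message, here is a rough Python sketch of one way to avoid it. This is an assumption about the general technique, not the program’s real implementation; `MessageBuffer` and its methods are made-up names. The idea is that a complete result replaces the pending partial instead of being appended next to it.

```python
class MessageBuffer:
    """Keeps finished sentences and at most one in-progress partial (illustrative only)."""

    def __init__(self):
        self.finals = []   # complete results, in order
        self.partial = ""  # latest partial result, if any

    def on_partial(self, text):
        # A newer partial always overwrites the older one.
        self.partial = text

    def on_final(self, text):
        # The complete result supersedes the partial, so the partial is dropped.
        self.finals.append(text)
        self.partial = ""

    def display(self):
        parts = self.finals + ([self.partial] if self.partial else [])
        return " ".join(parts)
```

Partials still show up as soon as they arrive, so nothing gets slower; they just never sit next to the complete version of the same sentence.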

Additional mentions

  • When you’re recording, the Text, STT, and STTS windows will be disabled to prevent crashes associated with changing stuff that the recorder is actively using
  • The Chatbox will now be as sussy as the custom messagebox
  • Messages that need to be translated to another language will be faster now
  • Settings will now be transferred over to newer versions
  • The same thread controlling the voice output was getting paused while it waited for audio to finish playing, causing micro stutters if the computer was experiencing latency. I refactored the thread to continue handling audio while it waits to send another message
  • Azure wasn’t handling sentences very well: ending a sentence required a really long pause of 500ms. I shortened it to 200ms and now it responds a lot better. I’ll add this as a customizable option a bit later, since it’s tied into the speaking grace period and you can break it if you enter a grace period that is smaller than the timeout period
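
As a small illustration of the constraint in the last item, here is a Python sketch of how a settings check could reject a grace period that is smaller than the sentence timeout. The function name, parameter names, and the 500ms grace period in the usage line are all hypothetical; only the 200ms timeout and the rule itself come from the note above.

```python
def validate_timing(sentence_timeout_ms, grace_period_ms):
    """Return the pair unchanged if it is usable, otherwise raise ValueError."""
    if sentence_timeout_ms <= 0 or grace_period_ms <= 0:
        raise ValueError("both values must be positive")
    # Assumption taken from the note above: the grace period must not be
    # smaller than the sentence timeout, or recognition can break.
    if grace_period_ms < sentence_timeout_ms:
        raise ValueError("grace period must be at least the sentence timeout")
    return sentence_timeout_ms, grace_period_ms


# Example: a 200ms timeout with a (hypothetical) 500ms grace period is accepted.
settings = validate_timing(200, 500)
```

Checking the pair together, rather than each value on its own, is what keeps a user from saving a combination that would break.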

Although bug fixes are boring, they are still very important to manage. 1.7 isn’t an exciting patch for many, but it’s a big one for the program. Your experience will be much smoother and more consistent than ever before with 1.7.

Up next are some more exciting new features... messagebox customizability! More about that will be coming soon!

1.6 is a huge improvement in how my program manages audio.

The entire audio system has been revamped to include a noise suppressor (thank you RNNoise https://jmvalin.ca/demo/rnnoise/) that reduces audio bleed from your speakers to your microphone and increases accuracy by cleaning up the background noise present in all microphones.

All of the known messagebox bugs have been fixed with the exception of a bug with emotes that you are very unlikely to encounter.

I spent a whole 2 months on this project, with a month and a half spent on the entire revamp of the audio system, and a good 3 weeks spent on revamping the messagebox from the ground up to get rid of the numerous bugs plaguing message population and display.

I can confidently say this is the most stable and usable version of the program I’ve ever released.

Patreon finally gave us the option to charge monthly on the day you sign up and not the 1st of the month! YES!

If you sign up in the middle of the month, it will now charge you next month on the same day, like it should have from the beginning.

All existing patrons will get charged on the 1st of this month, but don’t worry, you will always get AT LEAST 30 days of access per charge. You don’t lose any days because you signed up after the 1st!

Always-On (continuous) speech recognition is now available for the Premium Tier patrons!

Continuous speech recognition is just a fancy way of saying you no longer need to press the record button every time you want to talk. Instead, all you need to do is press it once and it’ll stay on!

Continuous speech has support for all of the fancy features of the program, including logging messages, translation between languages, and AI voice!

Next up are some bug fixes to take care of some old (and new) bugs.

New features and fixes:

– 63 Emotes!

– OBS and Streamlabs support!

– The sizing issue with some avatars is finally fixed!

This update was an absolute nightmare to program out. Handling multi-line emotes was a much more difficult process than I imagined, and even now there are still some rare cases where emotes won’t properly line up, but it works in 99.5% of cases. You have to try to break it in order to get it to break now.

Next on the list is support for more than just Azure translation. Some people have been seeing a loss in translation quality, so now is the time for some additional options!

This new installation tutorial covers everything you need to get started with STT (and STTS coming soon)!

This video also includes explanations and solutions for common issues you may come across during installation, although hopefully you don’t come across any!

English, 日本語 (ひらがな), français, Español, Deutsch, svenska, Suomalainen, Nederlands, and many others! With live translation between the languages!

I created 2 new options to choose from! If you choose the cheaper option, you’ll still get all the features of the higher priced tier, just at a slower rate. I figured this would be a good option for those who are tight on money and want to use it. As the number of patrons increases, I may be able to lower the prices more, but don’t hold me to it; it all depends on a future I can’t predict.

Version 1.1.2 was also released. Between 1.0.7 and 1.1.2 I’ve been fixing a number of bugs that have popped up, so nothing new or exciting yet. Still working on the French version, so stay tuned for that!

Who we are

We are a team of developers who have been wanting to bring something new to the community. We take security and privacy very seriously and wish to offer that same standard of service to our members. You can visit our website (https://vrcstt.com) and share it with anyone who wishes to learn more about the Speech To Text tool.

Comments/Cookies

This website was created to make it easier for our members to learn more about the tool. As such, we do not collect or retain any information or data from our visitors, and we do not use cookies. We do not collect IP addresses, and we have no forms or comment sections that could be used to collect information about users. Should you want to contact the administrators of this website, you can reach us by joining our Discord server or via Twitter.

Thank you all and have a nice day!