changelog.txt

Changelog

v1.8

- whisper-turbo, a new ultra fast transcribing method
	- Optional, off by default
	- Roughly 4x as fast, with the same transcription quality!
	- Needs CUDA Toolkit and cuDNN downloaded to work with GPU
		- CUDA 12 (RTX 20 series+)
	- You can also toggle it to just use the CPU instead, at a hit to quality/speed.
	- WARNING: I have experienced random crashes with this, but it's likely due to low VRAM / improper install.

- Chunked audio transcription
	- Optional, on by default
	- Takes in your audio and processes it while you are still speaking.
	- Helps with response times. Reduces transcription quality slightly.
	- Configurable chunking amount, Lower = process more often. Default is fairly high.

- Mac support
	- Use the statup.sh script.
	- Thanks to @Cootshk!
	- May require tinkering to figure out.

- Fixed an issue where the first sentence said would cause a crash after that.
- Fixed an issue with numba and sympy importing things, despite not existing.
- Interruptible chats in Hangout Mode now work off their own audio file internally.

---.---.---.---

v1.7

- Hangout Mode	
	- Like a very advanced autochat.
	- Your waifu decides how to reply to messages, based on hardcoded presets.
		- They may wait, see if any more input comes, and then reply
		- They may reply right away
		- They may use the camera
		- In the future they could also think on their own and decide how to reply
	- You can configure their reply personality to change how they reply, or how engaged they are.
	- Certain words phrases "think about" or "ponder" will cause them to think more.
		- Words are configurable under "Configurables/Hangout"
	- Certain words phrases "look at this" or "camera" will cause them to use the vision, if enabled.
		- Words are configurable under "Configurables/Hangout"
	- By default, you can interrupt them by saying "Wait, " and then their name.
		- Can eat up resources, as this also uses whisper. Toggleable in the Configurables.

- The chat logs now have an automatic backup, named "LiveLogBackup.bak".
	- Simply rename the file to "LiveLog.json" to restore.
	- Backs it up upon every time the program is started.
	- Includes a failsafe measure to not back the files up if the history gets cleared.
	- Of course, backing up logs in additional methods (to a flash drive, or other PC) is always advised.

- The RAG database now has a progress bar when first calculating it.
- Further enhanced the Autochat volume listener to better handle different sensitivities.
- Fixed an issue where streamed camera chats would appear in the log twice.

---.---.---.---

v1.6

- Added an option for Semi-Auto Chat
	- This will simply toggle the mic on after each reply.
	- Keyboard hotkey for this is "Q"
	- Mutually exclusive with Autochat.

- Autochat will not send any requests under ~2 seconds.
	- This is to help stop noises randomly triggering responses.

- Autochat now has an audio buffer that will contain the ~1-2 seconds before you started speaking.
- Adjusted the parameters for the Autochat, making it decay faster while no noise is detected.

---.---.---.---

v1.5-R2

- While using streaming text, emotes are now threaded, meaning that there is no pause for them to happen.
- The VtubeStudio interactions now use a try-catch system, adding general resistance to errors.
- Added in more implementation for Unipipes - the system that basically will manage the centralized execution of code.

- Enhanced the ".bat" files, making them pause after a crash happens.

- Fixed an error where the random looking would cause a crash due to requests not closing properly.
- Fixed an issue with the Discord module crashing when emotes would be triggered.

---.---.---.---

v1.5

- Stopping Strings (what cuts off your waifu if they try talking out of format) can now be changed in the configurables.
- There is now a "Send" button you can click next to the textbox.
- The primary color of the interface is now changeable via the configurables. This changes the color of the borders, checkboxes, and the new "Send" button.
	- For a full list of colors, go to: https://www.gradio.app/guides/theming-guide

- The results from the visual system can now be properly rerolled.
	- The streamed results can also be interrupted and re-done as it comes in.
	- Metadata tags are also applied to visual chats, for future (and current) reference.
- Streaming from the visual system now properly shows in the UI.
- The visual preview no longer requires tabbing in to it to accept / cancel.

- Can now run multiple emotes per message.
- Emotes now trigger as text streams in.
- Removed an old vtube.py script that was unused.

- Hotkeys are now customizable, and can be changed in the configurables.
- Fixed a bug where some users would crash and fail to launch if the hotkeys failed to bind.
- Fixed an issue where doing hotkeys multiple times would "queue" the actions.



---.---.---.---

V1.4

- The text now "streams", appearing as it is generated!
	- This is the default now, responses come in quicker and can be read out loud as they come.
		- If you get issues, try disabling streaming text in the .env file.
	- In effect, this means a slower language model is no issue, as long as it generates a little faster than your waifu talks.
		- This won't apply to internal thinking, such as the camera or other future operations.
		- Only things read out loud will actually go faster.
	- Works on multimodal/visual as well.
		- No "[System C]" headers while streaming, as it is read aloud immediately.
		- Makes the vision a whole lot better to use, as it tends to be long winded and slow to generate.

- RP Supression and Newline Cut are now unbound from one another.
- RP Supression and Newline Cut can both be turned on/off in the UI, as well as on/off in the .env (.env is what it is on boot).
- Lowered the RP Supression (what stops your AI acting as multiple people) watchdog counter (less likely to misfire).
- Stopping strings are better organized internally.
- Warning messages about messages being too short or too long now appear in the debug log.


---.---.---.---

V1.3-R2

- Fixed a minor bug where if there wasn't enough chat history, the program would crash, as it would attempt to load chat that wasn't there.

---.---.---.---

V1.3

- Added the Tag & Task menu
	- Tags can be used to classify info for future use.
		- Applies tags automatically to chats that you put in.
	- Tasks can be used to swap between character cards, allowing you to swap out parts of the memory.
		- Tasks are hyphenated between the "WaifuName" and "Task"
		- For example, if your waifu is named "Ember", and you have a task called "Party", you would want a character card in Oobabooga to be defined "Ember-Party"

- Your bot can now use keyboard input to control the keyboard.
	- Be sure to toggle "MODULE_GAMING" to "ON".
	- Changing the task will change what JSON file it uses (i.e. the task "Emerald" will use the button mappings in "/Configurables/GamingInputs/Emerald.json").
		- By default, this is set to "None" with no mappings.
		- You can add mappings by copy/paste the file, and renaming to something else.
		- Try to use lowercase letters for the keyboard input, capital letters did not work.
	- Warning: They can also trigger their own hotkeys, if not turned off!
	- Automatic gaming can now be toggled on in the Task menu. This is done by taking a picture, then asking for an input.
		- Note: Very bad at the moment, don't expect much of anything. May require more prompting and tuning.

- Vision can now use the main monitor's screenshot as the image input.
	- Turn on "Use Screenshot" in the Visual menu.

- Timestamps will now be included in the encoding, telling them the current date and time.
	- You may want to ask them in the character card to not mention the current time, as they may spam it.
	- Timestamps are also included and stored in message metadata.
	- Can be toggled in the .env
- Messages can now be undone / redone while they are speaking, cutting them off.
	- Messages are now chunked out and read, instead of all at once.
- Asterisks can now be banned from generating, for conversational mode and stopping roleplay.

- Discord token is now stored in "Configurables/Tokens/Discord.json", for security reasons.

- Fixed an issue where the lorebook was not giving lore for messages with a "?" at the end.
- Added a "is_live_pipe" state to the main script, which will tell us if we are currently running/processing something.

---.---.---.---

V1.2

- Lorebook messages are now directly infused into the encoding as it is sent.
	- This now sends all relevant lore triggered within the past 3 message sets, instead of just 1 with a required cooldown.
	- Lore triggering requirements were improved, to add plurals and fix edgecases.
	- You can still view what lore is triggered via the UI Logs.
- Random Memories will now trigger before the alarm.
	- This allows your bot to randomly scan your chat history, and remember past times.
	- You can also trigger random memories manually via the UI.

- Your VTuber can now look around, either Following Faces or Randomly.
	- This requires setting up 6 emotes for your VTuber. In order, they should have your VTuber's eyes doing the following (they can be named anything); 
		- "Look Slight Right"
		- "Look Right"
		- "Look Very Right"
		- "Look Slight Left"
		- "Look Left"
		- "Look Very Left"
	- In the .env, change "EYES_FOLLOW" to "Random" or "Faces". Set the "EYES_START_ID" to whatever emote slot the "Look Slight Right" is set up as.
		- Make sure all the eye looking emotes follow eachother in order. You can re-order them in VTube Studio if needed.
	- Obviously, you need a camera for the VTuber to follow faces, as well as the Vision module enabled.

- Other Roleplay Suppression is now disabled if you have "Cutoff at Newlines" off.
	- This will allow the bot to send messages containing character lines, such as "User:" or "Riley:".
	- This is to allow lists, info, and multi-user RP scenarios, if you want.
- Fixed issues with the RAG history desyncing when undoing messages.


---.---.---.---

v1.1-R2

- Fixed a few major bugs:
	- Fixed the "Error" taking over all of the Gradio WebUI
		- Happened due to Gradio & FastAPI dependency conflict (reminder: always vet your stuff~!)
	- Fixed issues with the software failing gently when you have no mic
	- Fixed crashes relating to searching for "Minecraft" logs, it now check to see if the module is enabled first

---.---.---.---

v1.1

- Visual System
	- Toggleable as a module
	- Able to take new images or upload them directly for the AI to see
	- Runs using Ooba, like with the text
		- Can set the port to the existing, default one, or load another instance to dual wield
	- Option to see images before being sent
		- Can retake them
		- Use C/X on the keyboard to confirm
	- Automatically shrinks images to a proper size
- Fixed bits of the Minecraft module
	- Configurable "MinecraftUsername" to set your AI's name (stops feedback loops)
	- Configurable "MinecraftUsernameFollow" to set who your AI follows when doing "#follow"

---.---.---.---

V1.0

- Initial public release of Z-Waif. Contains:
	- WebUI
	- RAG
	- Discord
	- Semi-Minecraft Functionality
	- VTuber Emotes
	- Hotkeys
	- Various other initial release items