Changelog
v1.8
- whisper-turbo, a new ultra-fast transcription method
- Optional, off by default
- Roughly 4x as fast, with the same transcription quality!
- Needs CUDA Toolkit and cuDNN downloaded to work with GPU
- CUDA 12 (RTX 20 series+)
- You can also toggle it to just use the CPU instead, at a hit to quality/speed.
- WARNING: I have experienced random crashes with this, but it's likely due to low VRAM / improper install.
- Chunked audio transcription
- Optional, on by default
- Takes in your audio and processes it while you are still speaking.
- Helps with response times. Reduces transcription quality slightly.
- Configurable chunking amount. Lower = process more often. Default is fairly high.
- Mac support
- Use the startup.sh script.
- Thanks to @Cootshk!
- May require tinkering to figure out.
- Fixed an issue where the first sentence spoken would cause a crash afterward.
- Fixed an issue where numba and sympy tried to import things that did not exist.
- Interruptible chats in Hangout Mode now work off their own audio file internally.
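
To illustrate the chunked audio transcription entry above, here is a minimal Python sketch that splits a buffer of samples into fixed-length chunks, so each chunk could be handed to a transcriber while recording continues. The name split_into_chunks, the 16 kHz sample rate, and the chunk lengths are assumptions made for illustration; they are not taken from Z-Waif's code.

```python
from typing import List, Sequence


def split_into_chunks(samples: Sequence[float], sample_rate: int,
                      chunk_seconds: float) -> List[Sequence[float]]:
    """Split recorded samples into consecutive chunks of chunk_seconds each.

    Hypothetical helper: a lower chunk_seconds gives more, smaller chunks,
    which means processing happens more often.
    """
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("chunk_seconds is too small for this sample rate")
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


# Ten seconds of silence at an assumed 16 kHz, split into 4-second chunks.
audio = [0.0] * (16000 * 10)
chunks = split_into_chunks(audio, sample_rate=16000, chunk_seconds=4.0)
print([len(c) for c in chunks])  # the last chunk holds the remainder
```

Smaller chunks would reach the transcriber sooner but give it less context per call, which matches the trade-off between response time and quality noted above.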
---.---.---.---
v1.7
- Hangout Mode
- Like a very advanced autochat.
- Your waifu decides how to reply to messages, based on hardcoded presets.
- They may wait, see if any more input comes, and then reply
- They may reply right away
- They may use the camera
- In the future they could also think on their own and decide how to reply
- You can configure their reply personality to change how they reply, or how engaged they are.
- Certain words or phrases, such as "think about" or "ponder", will cause them to think more.
- Words are configurable under "Configurables/Hangout"
- Certain words or phrases, such as "look at this" or "camera", will cause them to use the vision, if enabled.
- Words are configurable under "Configurables/Hangout"
- By default, you can interrupt them by saying "Wait, " and then their name.
- Can eat up resources, as this also uses whisper. Toggleable in the Configurables.
- The chat logs now have an automatic backup, named "LiveLogBackup.bak".
- Simply rename the file to "LiveLog.json" to restore.
- Backs it up every time the program is started.
- Includes a failsafe measure to not back the files up if the history gets cleared.
- Of course, backing up logs by additional methods (to a flash drive, or other PC) is always advised.
- The RAG database now has a progress bar when first calculating it.
- Further enhanced the Autochat volume listener to better handle different sensitivities.
- Fixed an issue where streamed camera chats would appear in the log twice.
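
The log backup and its failsafe, described in the entries above, could look roughly like the following Python sketch. It assumes the log is a JSON list and treats an empty list as "history cleared"; the function name backup_chat_log and that log layout are hypothetical, not Z-Waif's actual implementation.

```python
import json
import shutil
import tempfile
from pathlib import Path


def backup_chat_log(log_path: Path, backup_path: Path) -> bool:
    """Copy the log to the backup unless the log is missing or has no history.

    Returns True when a backup was written. Assumes the log is a JSON list.
    """
    if not log_path.exists():
        return False
    history = json.loads(log_path.read_text(encoding="utf-8"))
    if not history:
        # Failsafe: a cleared history must not overwrite a useful backup.
        return False
    shutil.copyfile(log_path, backup_path)
    return True


# Demonstration in a temporary folder, using the file names from the entry above.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "LiveLog.json"
    backup = Path(tmp) / "LiveLogBackup.bak"
    log.write_text(json.dumps([["Hello", "Hi there!"]]), encoding="utf-8")
    first = backup_chat_log(log, backup)    # history present, backup written
    log.write_text("[]", encoding="utf-8")
    second = backup_chat_log(log, backup)   # history cleared, old backup kept
    kept = json.loads(backup.read_text(encoding="utf-8"))
```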
---.---.---.---
v1.6
- Added an option for Semi-Auto Chat
- This will simply toggle the mic on after each reply.
- Keyboard hotkey for this is "Q"
- Mutually exclusive with Autochat.
- Autochat will not send any requests for audio shorter than ~2 seconds.
- This is to help stop noises randomly triggering responses.
- Autochat now has an audio buffer that will contain the ~1-2 seconds before you started speaking.
- Adjusted the parameters for the Autochat, making it decay faster while no noise is detected.
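
The minimum request length and the pre-speech audio buffer described above can be pictured with a small Python sketch. The class name PreRollRecorder, the frame length, and the string stand-ins for audio frames are all assumptions made for illustration; Z-Waif's real Autochat listener may work differently.

```python
from collections import deque
from typing import Deque, List

FRAME_SECONDS = 0.5        # assumed length of one audio frame
PRE_ROLL_SECONDS = 1.5     # audio kept from before speech started
MIN_REQUEST_SECONDS = 2.0  # shorter recordings are dropped as noise


class PreRollRecorder:
    """Keeps a short rolling buffer so the start of a sentence is not lost."""

    def __init__(self) -> None:
        max_frames = int(PRE_ROLL_SECONDS / FRAME_SECONDS)
        self.pre_roll: Deque[str] = deque(maxlen=max_frames)
        self.recording: List[str] = []
        self.speech_frames = 0
        self.active = False

    def feed(self, frame: str, is_speech: bool) -> None:
        if self.active:
            self.recording.append(frame)
            self.speech_frames += 1
        elif is_speech:
            # Speech just started: prepend the frames buffered before it.
            self.recording = list(self.pre_roll) + [frame]
            self.speech_frames = 1
            self.active = True
        else:
            self.pre_roll.append(frame)

    def finish(self) -> List[str]:
        """Return the recording, or [] if the spoken part is too short to send."""
        spoken, frames = self.recording, self.speech_frames
        self.recording, self.speech_frames, self.active = [], 0, False
        self.pre_roll.clear()
        if frames * FRAME_SECONDS < MIN_REQUEST_SECONDS:
            return []
        return spoken
```

In this sketch only the frames recorded after speech started count toward the minimum length, so a short noise is dropped even though buffered audio precedes it.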
---.---.---.---
v1.5-R2
- While using streaming text, emotes are now threaded, so there is no pause while they play.
- The VtubeStudio interactions now use a try-catch system, adding general resistance to errors.
- Added more of the implementation for Unipipes, the system that will manage the centralized execution of code.
- Enhanced the ".bat" files, making them pause after a crash happens.
- Fixed an error where the random looking would cause a crash due to requests not closing properly.
- Fixed an issue with the Discord module crashing when emotes would be triggered.
---.---.---.---
v1.5
- Stopping Strings (what cuts off your waifu if they try talking out of format) can now be changed in the configurables.
- There is now a "Send" button you can click next to the textbox.
- The primary color of the interface is now changeable via the configurables. This changes the color of the borders, checkboxes, and the new "Send" button.
- For a full list of colors, go to: https://www.gradio.app/guides/theming-guide
- The results from the visual system can now be properly rerolled.
- The streamed results can also be interrupted and re-done as it comes in.
- Metadata tags are also applied to visual chats, for future (and current) reference.
- Streaming from the visual system now properly shows in the UI.
- The visual preview no longer requires tabbing in to it to accept / cancel.
- Can now run multiple emotes per message.
- Emotes now trigger as text streams in.
- Removed an old vtube.py script that was unused.
- Hotkeys are now customizable, and can be changed in the configurables.
- Fixed a bug where some users would crash and fail to launch if the hotkeys failed to bind.
- Fixed an issue where doing hotkeys multiple times would "queue" the actions.
---.---.---.---
v1.4
- The text now "streams", appearing as it is generated!
- This is now the default. Responses come in quicker and can be read out loud as they arrive.
- If you get issues, try disabling streaming text in the .env file.
- In effect, this means a slower language model is no issue, as long as it generates a little faster than your waifu talks.
- This won't apply to internal thinking, such as the camera or other future operations.
- Only things read out loud will actually go faster.
- Works on multimodal/visual as well.
- No "[System C]" headers while streaming, as it is read aloud immediately.
- Makes the vision a whole lot better to use, as it tends to be long winded and slow to generate.
- RP Suppression and Newline Cut are now unbound from one another.
- RP Suppression and Newline Cut can both be turned on/off in the UI, as well as on/off in the .env (.env is what it is on boot).
- Lowered the RP Suppression (what stops your AI acting as multiple people) watchdog counter (less likely to misfire).
- Stopping strings are better organized internally.
- Warning messages about messages being too short or too long now appear in the debug log.
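
The streaming entry above says text can be read out loud as it is generated. One way to picture that is the following Python sketch, which groups streamed text fragments into sentences and releases each sentence as soon as its closing punctuation arrives. The function stream_sentences and the splitting rule on ".", "!" and "?" are assumptions made for illustration, not the project's actual logic.

```python
from typing import Iterable, Iterator


def stream_sentences(tokens: Iterable[str]) -> Iterator[str]:
    """Yield each sentence as soon as its closing punctuation arrives."""
    buffer = ""
    for token in tokens:
        buffer += token
        while True:
            # Find the earliest sentence-ending mark currently in the buffer.
            ends = [i for i in (buffer.find(p) for p in ".!?") if i != -1]
            if not ends:
                break
            cut = min(ends) + 1
            sentence, buffer = buffer[:cut].strip(), buffer[cut:]
            if sentence:
                yield sentence
    # Whatever is left when the stream ends is released as a final piece.
    if buffer.strip():
        yield buffer.strip()


tokens = ["Hel", "lo the", "re! How", " are you", "? Fine"]
print(list(stream_sentences(tokens)))
# prints ['Hello there!', 'How are you?', 'Fine']
```

Because each sentence is released before the rest is generated, speech can start while the language model is still writing, which is why a slower model is less of a problem.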
---.---.---.---
v1.3-R2
- Fixed a minor bug where if there wasn't enough chat history, the program would crash, as it would attempt to load chat that wasn't there.
---.---.---.---
v1.3
- Added the Tag & Task menu
- Tags can be used to classify info for future use.
- Applies tags automatically to chats that you put in.
- Tasks can be used to swap between character cards, allowing you to swap out parts of the memory.
- Tasks are hyphenated between the "WaifuName" and "Task"
- For example, if your waifu is named "Ember", and you have a task called "Party", you would want a character card in Oobabooga to be defined "Ember-Party"
- Your bot can now send keyboard input, pressing keys on its own.
- Be sure to toggle "MODULE_GAMING" to "ON".
- Changing the task will change what JSON file it uses (i.e. the task "Emerald" will use the button mappings in "/Configurables/GamingInputs/Emerald.json").
- By default, this is set to "None" with no mappings.
- You can add mappings by copying and pasting the file, then renaming the copy to something else.
- Use lowercase letters for the keyboard input; capital letters did not work.
- Warning: They can also trigger their own hotkeys, if not turned off!
- Automatic gaming can now be toggled on in the Task menu. This is done by taking a picture, then asking for an input.
- Note: Very bad at the moment, don't expect much of anything. May require more prompting and tuning.
- Vision can now use the main monitor's screenshot as the image input.
- Turn on "Use Screenshot" in the Visual menu.
- Timestamps will now be included in the encoding, telling them the current date and time.
- You may want to ask them in the character card to not mention the current time, as they may spam it.
- Timestamps are also included and stored in message metadata.
- Can be toggled in the .env
- Messages can now be undone / redone while they are speaking, cutting them off.
- Messages are now chunked out and read, instead of all at once.
- Asterisks can now be banned from generating, for conversational mode and stopping roleplay.
- Discord token is now stored in "Configurables/Tokens/Discord.json", for security reasons.
- Fixed an issue where the lorebook was not giving lore for messages with a "?" at the end.
- Added an "is_live_pipe" state to the main script, which tells us whether something is currently running or being processed.
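
The task naming rules from the Tag & Task entries above can be written out as a short Python sketch. The helper names character_card_name and gaming_inputs_path are hypothetical; only the "WaifuName-Task" pattern and the "Configurables/GamingInputs" folder come from the notes above.

```python
from pathlib import PurePosixPath


def character_card_name(waifu_name: str, task: str) -> str:
    """Build the card name by hyphenating the waifu name and the task."""
    return f"{waifu_name}-{task}"


def gaming_inputs_path(task: str) -> PurePosixPath:
    """Locate the button-mapping JSON file that belongs to a task."""
    return PurePosixPath("Configurables/GamingInputs") / f"{task}.json"


print(character_card_name("Ember", "Party"))  # prints Ember-Party
print(gaming_inputs_path("Emerald"))  # prints Configurables/GamingInputs/Emerald.json
```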
---.---.---.---
v1.2
- Lorebook messages are now directly infused into the encoding as it is sent.
- This now sends all relevant lore triggered within the past 3 message sets, instead of just 1 with a required cooldown.
- Lore triggering requirements were improved, to add plurals and fix edge cases.
- You can still view what lore is triggered via the UI Logs.
- Random Memories will now trigger before the alarm.
- This allows your bot to randomly scan your chat history, and remember past times.
- You can also trigger random memories manually via the UI.
- Your VTuber can now look around, either Following Faces or Randomly.
- This requires setting up 6 emotes for your VTuber. In order, they should have your VTuber's eyes doing the following (they can be named anything):
- "Look Slight Right"
- "Look Right"
- "Look Very Right"
- "Look Slight Left"
- "Look Left"
- "Look Very Left"
- In the .env, change "EYES_FOLLOW" to "Random" or "Faces". Set the "EYES_START_ID" to whatever emote slot the "Look Slight Right" is set up as.
- Make sure all the eye looking emotes follow each other in order. You can re-order them in VTube Studio if needed.
- Obviously, you need a camera for the VTuber to follow faces, as well as the Vision module enabled.
- Other Roleplay Suppression is now disabled if you have "Cutoff at Newlines" off.
- This will allow the bot to send messages containing character lines, such as "User:" or "Riley:".
- This is to allow lists, info, and multi-user RP scenarios, if you want.
- Fixed issues with the RAG history desyncing when undoing messages.
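
The reason the six eye emotes must follow each other in order can be shown with a short Python sketch: if "EYES_START_ID" is the slot of "Look Slight Right", every other direction sits at a fixed offset from it. The function eye_emote_slot and the example start slot of 10 are assumptions made for illustration, not taken from the project's code.

```python
# Order required by the v1.2 notes; EYES_START_ID points at the first entry.
EYE_EMOTE_ORDER = [
    "Look Slight Right",
    "Look Right",
    "Look Very Right",
    "Look Slight Left",
    "Look Left",
    "Look Very Left",
]


def eye_emote_slot(eyes_start_id: int, direction: str) -> int:
    """Return the emote slot for a direction, counted from the start slot."""
    return eyes_start_id + EYE_EMOTE_ORDER.index(direction)


# With an assumed start slot of 10, the six emotes occupy slots 10 to 15.
print(eye_emote_slot(10, "Look Slight Right"))  # prints 10
print(eye_emote_slot(10, "Look Very Left"))     # prints 15
```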
---.---.---.---
v1.1-R2
- Fixed a few major bugs:
- Fixed the "Error" taking over all of the Gradio WebUI
- Happened due to Gradio & FastAPI dependency conflict (reminder: always vet your stuff~!)
- Fixed issues so the software now fails gently when you have no mic
- Fixed crashes relating to searching for "Minecraft" logs; it now checks whether the module is enabled first
---.---.---.---
v1.1
- Visual System
- Toggleable as a module
- Able to take new images or upload them directly for the AI to see
- Runs using Ooba, like with the text
- Can set the port to the existing, default one, or load another instance to dual wield
- Option to see images before being sent
- Can retake them
- Use C/X on the keyboard to confirm
- Automatically shrinks images to a proper size
- Fixed bits of the Minecraft module
- Configurable "MinecraftUsername" to set your AI's name (stops feedback loops)
- Configurable "MinecraftUsernameFollow" to set who your AI follows when doing "#follow"
---.---.---.---
v1.0
- Initial public release of Z-Waif. Contains:
- WebUI
- RAG
- Discord
- Semi-Minecraft Functionality
- VTuber Emotes
- Hotkeys
- Various other initial release items