Make image generation very nice and spicy #158

lalalune · 2024-11-01T21:17:26Z

@o-on-x has offered to work on this for their own agent

o-on-x · 2024-11-03T02:37:47Z

Character issues:
We need to define how image generation is influenced by the character file and by user requests.

For now, the character file already influences image generation if you state directly, in the style section of the character file, what kinds of images the agent posts and how often.

Upload issues:
The main functional issue is that imageGen returns a base64 data URL string, which needs to be loaded into a buffer or file before upload.
The second issue is that the Twitter client needs a way to upload media with a message added to its interface, which twitter-client-api might support.

Here are the changes I'm currently using to get image posts working on Discord (Telegram is similar; I still need to find where I put that code).
In src/clients/discord/messages.ts:

```typescript
const callback: HandlerCallback = async (content: Content, files: any[]) => {
  // Process any data URL attachments
  const processedFiles = [...(files || [])];

  if (content.attachments?.length) {
    for (const attachment of content.attachments) {
      if (attachment.url?.startsWith('data:')) {
        try {
          const { buffer, type } = await this.attachmentManager.processDataUrlToBuffer(attachment.url);
          const extension = type.split('/')[1] || 'png';
          const fileName = `${attachment.id || Date.now()}.${extension}`;

          processedFiles.push({
            attachment: buffer,
            name: fileName,
          });
          content.text = "...";
          // Update the attachment URL to reference the filename
          attachment.url = `attachment://${fileName}`;
        } catch (error) {
          console.error('Error processing data URL:', error);
        }
      }
    }
  }

  if (message.id && !content.inReplyTo) {
    content.inReplyTo = stringToUuid(message.id);
  }

  if (message.channel.type === ChannelType.GuildVoice) {
    console.log("generating voice");
    const audioStream = await SpeechService.generate(
      this.runtime,
      content.text,
    );
    await this.voiceManager.playAudioStream(userId, audioStream);
    const memory: Memory = {
      id: stringToUuid(message.id),
      userId: this.runtime.agentId,
      content,
      roomId,
      embedding: embeddingZeroVector,
    };
    return [memory];
  } else {
    // For text channels, send the message with the processed files
    const messages = await sendMessageInChunks(
      message.channel as TextChannel,
      content.text,
      message.id,
      processedFiles,
    );
    let notFirstMessage = false;
    const memories: Memory[] = [];
    for (const m of messages) {
      let action = content.action;
      // If there's only one message or it's the last message, keep the original action
      // For multiple messages, set all but the last to 'CONTINUE'
      if (messages.length > 1 && m !== messages[messages.length - 1]) {
        action = "CONTINUE";
      }

      notFirstMessage = true;
      const memory: Memory = {
        id: stringToUuid(m.id),
        userId: this.runtime.agentId,
        content: {
          ...content,
          action,
          inReplyTo: messageId,
          url: m.url,
        },
        roomId,
        embedding: embeddingZeroVector,
        createdAt: m.createdTimestamp,
      };
      memories.push(memory);
    }
    for (const m of memories) {
      await this.runtime.messageManager.createMemory(m);
    }
    return memories;
  }
};
```

In src/clients/discord/attachments.ts, inside class AttachmentManager:

```typescript
async processDataUrlToBuffer(dataUrl: string): Promise<{buffer: Buffer, type: string}> {
  const matches = dataUrl.match(/^data:([A-Za-z-+/]+);base64,(.+)$/);

  if (!matches || matches.length !== 3) {
    throw new Error('Invalid data URL');
  }

  const type = matches[1];
  const base64Data = matches[2];
  const buffer = Buffer.from(base64Data, 'base64');

  return {buffer, type};
}
```
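For trying the decoding step outside the Discord client, here is a minimal standalone sketch of the same idea. The function name `dataUrlToBuffer` and the `DecodedDataUrl` interface are illustrative, not part of the codebase, and the slightly wider MIME character class (digits and dots) is my assumption rather than the behavior of the method above.

```typescript
// Illustrative standalone variant; names here are hypothetical.
interface DecodedDataUrl {
  buffer: Buffer;
  type: string;
}

function dataUrlToBuffer(dataUrl: string): DecodedDataUrl {
  // Assumption: also accept digits and dots in the MIME type (e.g. "image/jp2").
  const matches = /^data:([A-Za-z0-9.+\/-]+);base64,(.+)$/.exec(dataUrl);
  if (!matches) {
    throw new Error("Invalid data URL");
  }
  const [, type, base64Data] = matches;
  return { buffer: Buffer.from(base64Data, "base64"), type };
}

// Usage: build a small data URL and decode it again.
const sample = `data:image/png;base64,${Buffer.from("hello").toString("base64")}`;
const decoded = dataUrlToBuffer(sample);
console.log(decoded.type, decoded.buffer.length); // image/png 5
```

Keeping the function synchronous and free of class state makes it easy to unit test; the method in AttachmentManager could delegate to something like it.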

o-on-x · 2024-11-03T03:48:39Z

The line for catching the image should actually be:
`if (attachment.url?.startsWith('data:image')) {`
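A rough sketch of how that narrower check could be used to leave non-image attachments alone. `AttachmentLike`, `isImageDataUrl`, and `partitionAttachments` are hypothetical names for illustration, and the check here is slightly stricter than the one above (it requires the trailing slash in `data:image/`).

```typescript
// Hypothetical minimal attachment shape; the real attachment type has more fields.
interface AttachmentLike {
  id?: string;
  url?: string;
}

// True only for image data URLs such as "data:image/png;base64,...".
function isImageDataUrl(url: string | undefined): boolean {
  return typeof url === "string" && url.startsWith("data:image/");
}

// Split attachments into image data URLs and everything else,
// so only the first group is converted to buffers.
function partitionAttachments(attachments: AttachmentLike[]): {
  images: AttachmentLike[];
  others: AttachmentLike[];
} {
  const images: AttachmentLike[] = [];
  const others: AttachmentLike[] = [];
  for (const attachment of attachments) {
    (isImageDataUrl(attachment.url) ? images : others).push(attachment);
  }
  return { images, others };
}

const example = partitionAttachments([
  { id: "a", url: "data:image/png;base64,AAAA" },
  { id: "b", url: "data:text/plain;base64,aGk=" },
  { id: "c", url: "https://example.com/cat.png" },
]);
console.log(example.images.length, example.others.length); // 1 2
```

Only attachment "a" lands in `images`; the text data URL and the ordinary link are passed through untouched.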

o-on-x · 2024-11-03T14:18:49Z

Handling image generation this way affects the handling of text attachments. The solution will be to have image generation save the image and return a file path rather than a base64 string, and then to handle it as an action (ACTION = "IMAGE GEN").
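A minimal sketch of the "save and return a file path" idea, assuming a Node.js runtime. `saveImageDataUrl`, its parameters, and the temp-directory layout are all hypothetical; the actual change may look quite different, and `baseName` is assumed to be a trusted, path-safe string.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: write a base64 image data URL to disk and return its path.
function saveImageDataUrl(dataUrl: string, dir: string, baseName: string): string {
  const matches = /^data:image\/([A-Za-z0-9.+-]+);base64,(.+)$/.exec(dataUrl);
  if (!matches) {
    throw new Error("Not a base64 image data URL");
  }
  const [, subtype, base64Data] = matches;
  // Map the MIME subtype to a file extension ("jpeg" -> "jpg", "svg+xml" -> "svg").
  const extension = subtype === "jpeg" ? "jpg" : subtype.split("+")[0];
  const filePath = join(dir, `${baseName}.${extension}`);
  writeFileSync(filePath, Buffer.from(base64Data, "base64"));
  return filePath;
}

// Usage: save into a fresh temporary directory and read the file back.
const outputDir = mkdtempSync(join(tmpdir(), "imagegen-"));
const payload = Buffer.from("hello").toString("base64");
const savedPath = saveImageDataUrl(`data:image/png;base64,${payload}`, outputDir, "example");
console.log(savedPath, readFileSync(savedPath).length);
```

With a path in hand, each client can attach the file directly, and attachments that are not generated images never pass through the data URL branch.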

lalalune added the enhancement (New feature or request) label Nov 1, 2024

lalalune assigned o-on-x Nov 1, 2024

o-on-x closed this as completed Nov 3, 2024

o-on-x reopened this Nov 3, 2024

lalalune changed the title ~~Make image recognition very nice and spicy~~ Make image generation very nice and spicy Nov 4, 2024

lalalune closed this as completed Dec 14, 2024
