Make image generation very nice and spicy #158

Closed
lalalune opened this issue Nov 1, 2024 · 3 comments

@lalalune
Member

lalalune commented Nov 1, 2024

@o-on-x has offered to work on this for their own agent

@lalalune lalalune added the enhancement New feature or request label Nov 1, 2024
@o-on-x o-on-x closed this as completed Nov 3, 2024
@o-on-x
Contributor

o-on-x commented Nov 3, 2024

Character issues:
We still need to work out how image generation is influenced by the character file and by user requests.

For now, the character file already influences it if the style section of the character file directly mentions what kind of images the agent posts and how often.

Upload issues:
The main functional issue is that imageGen returns a base64 data URL string, which needs to be loaded into a buffer or file before upload.
The second issue is that the Twitter client needs an upload-message addition to its interface, which twitter-client-api might support.

Here are the working image-post changes I'm currently using for Discord (Telegram is similar; I need to find where I put that code).
In src/clients/discord/messages.ts:

```typescript
const callback: HandlerCallback = async (content: Content, files: any[]) => {
  // Process any data URL attachments into file buffers that can be uploaded
  const processedFiles = [...(files || [])];

  if (content.attachments?.length) {
    for (const attachment of content.attachments) {
      if (attachment.url?.startsWith('data:')) {
        try {
          const { buffer, type } = await this.attachmentManager.processDataUrlToBuffer(attachment.url);
          const extension = type.split('/')[1] || 'png';
          const fileName = `${attachment.id || Date.now()}.${extension}`;

          processedFiles.push({
            attachment: buffer,
            name: fileName,
          });
          content.text = "...";
          // Update the attachment URL to reference the filename
          attachment.url = `attachment://${fileName}`;
        } catch (error) {
          console.error('Error processing data URL:', error);
        }
      }
    }
  }

  if (message.id && !content.inReplyTo) {
    content.inReplyTo = stringToUuid(message.id);
  }

  if (message.channel.type === ChannelType.GuildVoice) {
    // For voice channels, speak the text instead of posting it
    console.log("generating voice");
    const audioStream = await SpeechService.generate(
      this.runtime,
      content.text,
    );
    await this.voiceManager.playAudioStream(userId, audioStream);
    const memory: Memory = {
      id: stringToUuid(message.id),
      userId: this.runtime.agentId,
      content,
      roomId,
      embedding: embeddingZeroVector,
    };
    return [memory];
  } else {
    // For text channels, send the message with the processed files
    const messages = await sendMessageInChunks(
      message.channel as TextChannel,
      content.text,
      message.id,
      processedFiles,
    );
    const memories: Memory[] = [];
    for (const m of messages) {
      let action = content.action;
      // If there's only one message or it's the last message, keep the original action
      // For multiple messages, set all but the last to 'CONTINUE'
      if (messages.length > 1 && m !== messages[messages.length - 1]) {
        action = "CONTINUE";
      }

      const memory: Memory = {
        id: stringToUuid(m.id),
        userId: this.runtime.agentId,
        content: {
          ...content,
          action,
          inReplyTo: messageId,
          url: m.url,
        },
        roomId,
        embedding: embeddingZeroVector,
        createdAt: m.createdTimestamp,
      };
      memories.push(memory);
    }
    for (const m of memories) {
      await this.runtime.messageManager.createMemory(m);
    }
    return memories;
  }
};
```

In src/clients/discord/attachments.ts, inside class AttachmentManager:

```typescript
async processDataUrlToBuffer(dataUrl: string): Promise<{buffer: Buffer, type: string}> {
  // Capture the MIME type and the base64 payload of the data URL
  const matches = dataUrl.match(/^data:([A-Za-z-+/]+);base64,(.+)$/);

  if (!matches || matches.length !== 3) {
    throw new Error('Invalid data URL');
  }

  const type = matches[1];
  const base64Data = matches[2];
  const buffer = Buffer.from(base64Data, 'base64');

  return {buffer, type};
}
```

@o-on-x o-on-x reopened this Nov 3, 2024
@o-on-x
Contributor

o-on-x commented Nov 3, 2024

Correction: the line that catches the image should actually be:
if (attachment.url?.startsWith('data:image')) {
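To illustrate why the narrower prefix matters, here is a minimal standalone sketch that separates inline image data URLs from everything else, so that only images get converted to upload buffers. The `AttachmentLike` type and the `isImageDataUrl` and `splitImageDataUrls` helpers are hypothetical names for illustration, not part of the codebase, and the check uses the slightly stricter prefix `data:image/`.

```typescript
// Hypothetical minimal attachment shape, for illustration only.
interface AttachmentLike {
  id?: string;
  url?: string;
}

// True only for data URLs whose MIME type starts with "image/".
function isImageDataUrl(url: string | undefined): boolean {
  return url !== undefined && url.startsWith("data:image/");
}

// Split attachments into inline images (to convert to buffers) and the rest.
function splitImageDataUrls<T extends AttachmentLike>(
  attachments: T[],
): { images: T[]; others: T[] } {
  const images: T[] = [];
  const others: T[] = [];
  for (const attachment of attachments) {
    (isImageDataUrl(attachment.url) ? images : others).push(attachment);
  }
  return { images, others };
}

const split = splitImageDataUrls([
  { id: "a", url: "data:image/png;base64,AAAA" },
  { id: "b", url: "data:text/plain;base64,aGk=" },
  { id: "c", url: "https://example.com/cat.png" },
]);
console.log(split.images.map((a) => a.id), split.others.map((a) => a.id));
```

With a plain `data:` prefix, attachment "b" (a text data URL) would also be treated as an image.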

@o-on-x
Contributor

o-on-x commented Nov 3, 2024

Handling image generation this way affects the handling of text attachments. The solution will be to have image generation save the image and return a file path, not a base64 string, and then to handle it as its own action, ACTION = "IMAGE GEN".
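The proposed direction could look roughly like the sketch below, which assumes Node.js and writes the decoded image to a temporary directory. `saveImageDataUrl` and the shape of `result` are hypothetical and only illustrate the idea; they are not the project's actual API.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: decode a base64 image data URL, write it to disk,
// and return the file path instead of the base64 string.
function saveImageDataUrl(dataUrl: string, dir: string, baseName: string): string {
  const match = /^data:image\/([A-Za-z0-9.+-]+);base64,(.+)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Not a base64 image data URL");
  }
  // Use the MIME subtype as the extension, e.g. "png", or "svg" for "svg+xml".
  const extension = match[1].split("+")[0] || "png";
  const filePath = path.join(dir, `${baseName}.${extension}`);
  fs.writeFileSync(filePath, Buffer.from(match[2], "base64"));
  return filePath;
}

// Placeholder bytes stand in for real image data in this example.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "imagegen-"));
const payload = Buffer.from("not a real PNG, just test bytes").toString("base64");

// Hypothetical result an image generation action could hand to a client.
const result = {
  action: "IMAGE GEN",
  filePath: saveImageDataUrl(`data:image/png;base64,${payload}`, tmpDir, "generated"),
};
console.log(result.filePath);
```

A client could then upload the file at `result.filePath` directly, and text attachments would never pass through the data URL branch.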

@lalalune lalalune changed the title Make image recognition very nice and spicy Make image generation very nice and spicy Nov 4, 2024